Class DataGenerator

java.lang.Object
weka.datagenerators.DataGenerator
All Implemented Interfaces:
Serializable, OptionHandler, Randomizable, RevisionHandler
Direct Known Subclasses:
ClassificationGenerator, ClusterGenerator, RegressionGenerator

public abstract class DataGenerator extends Object implements OptionHandler, Randomizable, Serializable, RevisionHandler
Abstract superclass for data generators that generate data for classifiers and clusterers.
Version:
$Revision: 1.8 $
Author:
FracPete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • DataGenerator

      public DataGenerator()
      initializes with default settings.
      Note: default values are set via a default<name> method. These default methods are also used by the listOptions and setOptions methods. The reason: derived generators can override the return value of these default methods, which avoids exceptions.
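The note above can be illustrated with a minimal sketch. The classes BaseGenerator and SmallGenerator, the option flag -n, and the method defaultNumExamples are hypothetical names chosen for this example; they are not taken from the Weka source.

```java
// Hypothetical illustration of the default<name> pattern; not Weka source code.
class BaseGenerator {
    private int numExamples;

    BaseGenerator() {
        // The constructor asks the overridable default method for its value.
        numExamples = defaultNumExamples();
    }

    /** Default used by the constructor and by option parsing. */
    protected int defaultNumExamples() {
        return 100;
    }

    /** Parses "-n <int>" (assumed flag); falls back to the default when absent. */
    void setOptions(String[] options) {
        numExamples = defaultNumExamples();
        for (int i = 0; i + 1 < options.length; i++) {
            if (options[i].equals("-n")) {
                numExamples = Integer.parseInt(options[i + 1]);
            }
        }
    }

    int getNumExamples() {
        return numExamples;
    }
}

/** A derived generator changes the default by overriding only the default method. */
class SmallGenerator extends BaseGenerator {
    @Override
    protected int defaultNumExamples() {
        return 10;
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        System.out.println(new BaseGenerator().getNumExamples());  // prints 100
        System.out.println(new SmallGenerator().getNumExamples()); // prints 10
    }
}
```

Because the constructor and the option parser both go through the same default method, a subclass needs a single override to change the value everywhere it is used.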
  • Method Details

    • listOptions

      public Enumeration listOptions()
      Returns an enumeration describing the available options.
      Specified by:
      listOptions in interface OptionHandler
      Returns:
      an enumeration of all the available options
    • setOptions

      public void setOptions(String[] options) throws Exception
      Parses a list of options for this object.

      For a list of valid options, see the class description.

      Specified by:
      setOptions in interface OptionHandler
      Parameters:
      options - the list of options as an array of strings
      Throws:
      Exception - if an option is not supported
    • getOptions

      public String[] getOptions()
      Gets the current settings of the data generator. Removal of blacklisted options has to be done in the derived class that defines the blacklist entry.
      Specified by:
      getOptions in interface OptionHandler
      Returns:
      an array of strings suitable for passing to setOptions
      See Also:
      • removeBlacklist(String[])
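The contract that getOptions returns an array suitable for passing to setOptions can be sketched as a round trip. SketchOptions is a hypothetical stand-in, and the flags -r and -S are assumptions of this example, not a statement of the real option list. The sketch throws an unchecked exception to stay short, whereas the documented method declares Exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an option handler; not the Weka implementation.
class SketchOptions {
    private String relationName = "";
    private int seed = 1;

    /** Returns the current settings in a form that setOptions accepts. */
    String[] getOptions() {
        List<String> result = new ArrayList<>();
        if (!relationName.isEmpty()) {
            result.add("-r");
            result.add(relationName);
        }
        result.add("-S");
        result.add(Integer.toString(seed));
        return result.toArray(new String[0]);
    }

    /** Parses flag/value pairs and rejects anything it does not know. */
    void setOptions(String[] options) {
        for (int i = 0; i < options.length; i += 2) {
            if (i + 1 >= options.length) {
                throw new IllegalArgumentException("Missing value for " + options[i]);
            }
            String value = options[i + 1];
            switch (options[i]) {
                case "-r":
                    relationName = value;
                    break;
                case "-S":
                    seed = Integer.parseInt(value);
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
    }
}
```

Feeding the result of getOptions from one instance into setOptions of a fresh instance should leave both with identical settings.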
    • defineDataFormat

      public Instances defineDataFormat() throws Exception
      Initializes the format for the dataset produced. Must be called before the generateExample or generateExamples methods are used. Also sets a default relation name in case the current relation name is empty.
      Returns:
      the format for the dataset
      Throws:
      Exception - if generating the format failed
      See Also:
      • defaultRelationName()
    • generateExample

      public abstract Instance generateExample() throws Exception
      Generates one example of the dataset.
      Returns:
      the generated example
      Throws:
      Exception - if the format of the dataset is not yet defined
      Exception - if the generator only works with generateExamples, i.e. not in single mode
    • generateExamples

      public abstract Instances generateExamples() throws Exception
      Generates all examples of the dataset.
      Returns:
      the generated dataset
      Throws:
      Exception - if the format of the dataset is not yet defined
      Exception - if the generator only works with generateExample, i.e. only in single mode
    • generateStart

      public abstract String generateStart() throws Exception
      Generates a comment string that documents the data generator. By default, this string is added at the beginning of the output produced in ARFF format, directly after the options.
      Returns:
      a string containing information about the generated rules
      Throws:
      Exception - if generating the documentation fails
    • generateFinished

      public abstract String generateFinished() throws Exception
      Generates a comment string that documents the data generator. By default, this string is added at the end of the output produced in ARFF format.
      Returns:
      a string containing information about the generated rules
      Throws:
      Exception - if generating the documentation fails
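The placement of the two comment strings described for generateStart and generateFinished can be sketched as follows. ArffCommentSketch and its methods are hypothetical. ARFF comment lines begin with %, but whether the generator or the caller adds that prefix is not stated here, so this sketch assumes the caller does.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.List;

// Hypothetical sketch of the output layout; not the Weka implementation.
class ArffCommentSketch {
    /** Prefixes every line of the text with the ARFF comment marker '%'. */
    static String asComment(String text) {
        StringBuilder sb = new StringBuilder();
        for (String line : text.split("\n")) {
            sb.append("% ").append(line).append('\n');
        }
        return sb.toString();
    }

    /** Layout: options, start comment, data lines, finished comment. */
    static String assemble(String options, String start,
                           List<String> dataLines, String finished) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        out.print(asComment(options));
        out.print(asComment(start));
        for (String line : dataLines) {
            out.print(line + "\n");
        }
        out.print(asComment(finished));
        out.flush();
        return buffer.toString();
    }
}
```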
    • getSingleModeFlag

      public abstract boolean getSingleModeFlag() throws Exception
      Returns whether single mode is set for the given data generator. The mode depends on the option settings and/or the generator type.
      Returns:
      single mode flag
      Throws:
      Exception - if mode is not set yet
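The calling order implied by defineDataFormat, generateExample, generateExamples, and getSingleModeFlag can be sketched with a miniature generator. All classes below are hypothetical, rows are plain strings rather than Weka Instance objects, and the driver loop is an assumption about how a caller might use the flag, not a copy of the library's logic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the generator contract; not the Weka classes.
abstract class SketchGenerator {
    private boolean formatDefined = false;

    /** Must be called before any example is generated. */
    void defineDataFormat() {
        formatDefined = true;
    }

    protected void requireFormat() {
        if (!formatDefined) {
            throw new IllegalStateException("Dataset format not defined yet");
        }
    }

    abstract boolean getSingleModeFlag();
    abstract int getNumExamplesAct();
    abstract String generateExample();
    abstract List<String> generateExamples();
}

/** A generator that only supports single mode. */
class CountingGenerator extends SketchGenerator {
    private int next = 0;

    boolean getSingleModeFlag() { return true; }

    int getNumExamplesAct() { return 3; }

    String generateExample() {
        requireFormat();
        return "row" + next++;
    }

    List<String> generateExamples() {
        throw new UnsupportedOperationException("single mode only");
    }
}

class SketchDriver {
    /** Defines the format first, then picks the generation method by mode. */
    static List<String> run(SketchGenerator g) {
        g.defineDataFormat();
        List<String> rows = new ArrayList<>();
        if (g.getSingleModeFlag()) {
            for (int i = 0; i < g.getNumExamplesAct(); i++) {
                rows.add(g.generateExample());
            }
        } else {
            rows.addAll(g.generateExamples());
        }
        return rows;
    }
}
```

In single mode the caller produces examples one at a time, which suits incremental output; otherwise it requests the whole dataset in one call.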
    • setDebug

      public void setDebug(boolean debug)
      Sets the debug flag.
      Parameters:
      debug - the new debug flag
    • getDebug

      public boolean getDebug()
      Gets the debug flag.
      Returns:
      the debug flag
    • debugTipText

      public String debugTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setRelationName

      public void setRelationName(String relationName)
      Sets the relation name the dataset should have.
      Parameters:
      relationName - the new relation name
    • getRelationName

      public String getRelationName()
      Gets the relation name the dataset should have.
      Returns:
      the relation name the dataset should have
    • relationNameTipText

      public String relationNameTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getNumExamplesAct

      public int getNumExamplesAct()
      Gets the number of examples the dataset should have.
      Returns:
      the number of examples the dataset should have
    • setOutput

      public void setOutput(PrintWriter newOutput)
      Sets the print writer.
      Parameters:
      newOutput - the new print writer
    • getOutput

      public PrintWriter getOutput()
      Gets the print writer.
      Returns:
      print writer object
    • defaultOutput

      public PrintWriter defaultOutput()
      Gets the writer that is used for output to stdout. This is a workaround for the problem that closing the associated PrintWriter also closes stdout.
      Returns:
      writer object
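The underlying problem is general Java behavior: closing a PrintWriter that wraps System.out also closes System.out. One possible workaround, shown here as a sketch and not necessarily the approach this class takes, is to wrap the stream in a filter whose close() only flushes. NonClosingStream and StdoutWriterDemo are hypothetical names.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

// Hypothetical helper: forwards writes but never closes the wrapped stream.
class NonClosingStream extends FilterOutputStream {
    NonClosingStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Forward whole chunks; the inherited version writes byte by byte.
        out.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        // Flush pending output but leave the wrapped stream open.
        flush();
    }
}

class StdoutWriterDemo {
    /** Returns a writer on stdout that is safe to close. */
    static PrintWriter stdoutWriter() {
        return new PrintWriter(new NonClosingStream(System.out), true);
    }

    public static void main(String[] args) {
        PrintWriter writer = stdoutWriter();
        writer.println("generated data");
        writer.close();
        // stdout was not closed, so this line still appears.
        System.out.println("stdout still usable");
    }
}
```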
    • outputTipText

      public String outputTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • setDatasetFormat

      public void setDatasetFormat(Instances newFormat)
      Sets the format of the dataset that is to be generated.
      Parameters:
      newFormat - the new format of the dataset
    • getDatasetFormat

      public Instances getDatasetFormat()
      Gets the format of the dataset that is to be generated.
      Returns:
      the format of the dataset
    • formatTipText

      public String formatTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getSeed

      public int getSeed()
      Gets the random number seed.
      Specified by:
      getSeed in interface Randomizable
      Returns:
      the random number seed.
    • setSeed

      public void setSeed(int newSeed)
      Sets the random number seed.
      Specified by:
      setSeed in interface Randomizable
      Parameters:
      newSeed - the new random number seed.
    • seedTipText

      public String seedTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • getRandom

      public Random getRandom()
      Gets the random generator.
      Returns:
      the random generator
    • setRandom

      public void setRandom(Random newRandom)
      Sets the random generator.
      Parameters:
      newRandom - the new random generator
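The point of exposing a seed is reproducibility. The sketch below uses only java.util.Random to show that the same seed yields the same sequence. How this class derives its Random from the seed is not stated here, so seeding the generator directly from the int value is an assumption, and SeedDemo is a hypothetical name.

```java
import java.util.Random;

// Hypothetical demo of seed-based reproducibility with java.util.Random.
class SeedDemo {
    /** Draws n pseudo-random ints in [0, 100) from a generator with the given seed. */
    static int[] draw(int seed, int n) {
        Random random = new Random(seed);
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = random.nextInt(100);
        }
        return values;
    }
}
```

Two calls such as SeedDemo.draw(1, 5) return equal arrays, which is what makes a generated dataset repeatable for a fixed seed and fixed options.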
    • randomTipText

      public String randomTipText()
      Returns the tip text for this property
      Returns:
      tip text for this property suitable for displaying in the explorer/experimenter gui
    • makeData

      public static void makeData(DataGenerator generator, String[] options) throws Exception
      Calls the data generator.
      Parameters:
      generator - one of the data generators
      options - options of the data generator
      Throws:
      Exception - if there was an error in the option list