Class Agrawal

  • All Implemented Interfaces:
    java.io.Serializable, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler

    public class Agrawal
    extends ClassificationGenerator
    implements TechnicalInformationHandler
    Generates a people database and is based on the paper by Agrawal et al.:
    R. Agrawal, T. Imielinski, A. Swami (1993). Database Mining: A Performance Perspective. IEEE Transactions on Knowledge and Data Engineering. 5(6):914-925. URL http://www.almaden.ibm.com/software/quest/Publications/ByDate.html.

    BibTeX:

     @article{Agrawal1993,
        author = {R. Agrawal and T. Imielinski and A. Swami},
        journal = {IEEE Transactions on Knowledge and Data Engineering},
        note = {Special issue on Learning and Discovery in Knowledge-Based Databases},
        number = {6},
        pages = {914-925},
        title = {Database Mining: A Performance Perspective},
        volume = {5},
        year = {1993},
        URL = {http://www.almaden.ibm.com/software/quest/Publications/ByDate.html},
        PDF = {http://www.almaden.ibm.com/software/quest/Publications/papers/tkde93.pdf}
     }
     

    Valid options are:

     -h
      Prints this help.
     -o <file>
      The name of the output file, otherwise the generated data is
      printed to stdout.
     -r <name>
      The name of the relation.
     -d
      Whether to print debug information.
     -S <num>
      The seed for random function (default 1)
     -n <num>
      The number of examples to generate (default 100)
     -F <num>
      The function to use for generating the data. (default 1)
     -B
      Whether to balance the class.
     -P <num>
      The perturbation factor. (default 0.05)
    Version:
    $Revision: 1.6 $
    Author:
    Richard Kirkby (rkirkby at cs dot waikato dot ac dot nz), FracPete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Constructor Detail

      • Agrawal

        public Agrawal()
        Initializes the generator with default values.
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this data generator.
        Returns:
        a description of the data generator suitable for displaying in the explorer/experimenter gui
      • getTechnicalInformation

        public TechnicalInformation getTechnicalInformation()
        Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
        Specified by:
        getTechnicalInformation in interface TechnicalInformationHandler
        Returns:
        the technical information about this class
      • setOptions

        public void setOptions(java.lang.String[] options)
                        throws java.lang.Exception
        Parses a list of options for this object.

        Valid options are:

         -h
          Prints this help.
         -o <file>
          The name of the output file, otherwise the generated data is
          printed to stdout.
         -r <name>
          The name of the relation.
         -d
          Whether to print debug information.
         -S <num>
          The seed for random function (default 1)
         -n <num>
          The number of examples to generate (default 100)
         -F <num>
          The function to use for generating the data. (default 1)
         -B
          Whether to balance the class.
         -P <num>
          The perturbation factor. (default 0.05)
        Specified by:
        setOptions in interface OptionHandler
        Overrides:
        setOptions in class ClassificationGenerator
        Parameters:
        options - the list of options as an array of strings
        Throws:
        java.lang.Exception - if an option is not supported
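
The option flags listed above are passed as a flat string array, with each flag followed by its value where one is required. The sketch below assembles such an array in plain Java. `OptionArraySketch` and `buildOptions` are hypothetical helpers, not part of the library, and the sketch assumes that `-S` takes a numeric argument like the other valued flags.

```java
import java.util.ArrayList;
import java.util.List;

class OptionArraySketch {

    // Hypothetical helper: builds the kind of array that setOptions(String[]) expects.
    static String[] buildOptions(int numExamples, int function,
                                 boolean balanceClass, double perturbation, int seed) {
        List<String> opts = new ArrayList<>();
        opts.add("-n");
        opts.add(Integer.toString(numExamples));
        opts.add("-F");
        opts.add(Integer.toString(function));
        if (balanceClass) {
            opts.add("-B"); // a flag without a value
        }
        opts.add("-P");
        opts.add(Double.toString(perturbation));
        opts.add("-S"); // assumption: the seed flag takes a numeric value
        opts.add(Integer.toString(seed));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(1000, 2, true, 0.1, 42);
        // prints "-n 1000 -F 2 -B -P 0.1 -S 42"
        System.out.println(String.join(" ", options));
    }
}
```

An array built this way could then be handed to `setOptions(String[])` on a generator instance.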
      • getFunction

        public SelectedTag getFunction()
        Gets the function for generating the data.
        Returns:
        the function.
        See Also:
        FUNCTION_TAGS
      • setFunction

        public void setFunction(SelectedTag value)
        Sets the function for generating the data.
        Parameters:
        value - the function.
        See Also:
        FUNCTION_TAGS
      • functionTipText

        public java.lang.String functionTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getBalanceClass

        public boolean getBalanceClass()
        Gets whether the class is balanced.
        Returns:
        whether the class is balanced.
      • setBalanceClass

        public void setBalanceClass(boolean value)
        Sets whether the class is balanced.
        Parameters:
        value - whether to balance the class.
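
This page does not describe how balancing is achieved. One plausible approach, sketched below purely as an assumption, is to alternate the class wanted next and discard candidates of the other class. `BalanceSketch`, its 80/20 labelling rule, and the alternation scheme are illustrative and are not taken from the library.

```java
import java.util.Random;

class BalanceSketch {

    // Returns {count of class 0, count of class 1} for n accepted examples.
    static int[] generate(int n, long seed, boolean balance) {
        Random rand = new Random(seed);
        int[] counts = new int[2];
        boolean nextShouldBeZero = true;
        int produced = 0;
        while (produced < n) {
            // Stand-in labelling rule: class 0 roughly 80% of the time.
            int label = rand.nextDouble() < 0.8 ? 0 : 1;
            if (balance) {
                // Discard candidates whose class is not the one wanted next.
                if ((label == 0) != nextShouldBeZero) {
                    continue;
                }
                nextShouldBeZero = !nextShouldBeZero;
            }
            counts[label]++;
            produced++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] skewed = generate(100, 1L, false);
        int[] balanced = generate(100, 1L, true);
        System.out.println("unbalanced: " + skewed[0] + " / " + skewed[1]);
        System.out.println("balanced:   " + balanced[0] + " / " + balanced[1]);
    }
}
```

With strict alternation, an even number of examples splits exactly in half, at the cost of generating and discarding extra candidates.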
      • balanceClassTipText

        public java.lang.String balanceClassTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getPerturbationFraction

        public double getPerturbationFraction()
        Gets the perturbation fraction.
        Returns:
        the perturbation fraction.
      • setPerturbationFraction

        public void setPerturbationFraction(double value)
        Sets the perturbation fraction.
        Parameters:
        value - the perturbation fraction.
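
The page does not say how the perturbation fraction is applied. A common reading, assumed here, is that each numeric attribute receives uniform noise proportional to its value range, and the result is clipped back into that range. `PerturbationSketch`, `perturb`, and the attribute range used in `main` are hypothetical.

```java
import java.util.Random;

class PerturbationSketch {

    // Adds uniform noise in [-fraction * range, +fraction * range) and clips to [min, max].
    static double perturb(Random rand, double value, double min, double max, double fraction) {
        double range = max - min;
        double noise = range * fraction * (2.0 * rand.nextDouble() - 1.0);
        return Math.max(min, Math.min(max, value + noise));
    }

    public static void main(String[] args) {
        Random rand = new Random(1);
        // Illustrative attribute: a value of 50000 within the range [20000, 150000].
        double perturbed = perturb(rand, 50000.0, 20000.0, 150000.0, 0.05);
        System.out.println(perturbed);
    }
}
```

Under this reading, a fraction of 0 leaves values unchanged, and the default of 0.05 moves a value by at most 5% of the attribute's range.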
      • perturbationFractionTipText

        public java.lang.String perturbationFractionTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getSingleModeFlag

        public boolean getSingleModeFlag()
                                  throws java.lang.Exception
        Returns whether single mode is set for this data generator. The mode depends on the option settings and/or the generator type.
        Specified by:
        getSingleModeFlag in class DataGenerator
        Returns:
        single mode flag
        Throws:
        java.lang.Exception - if mode is not set yet
      • defineDataFormat

        public Instances defineDataFormat()
                                   throws java.lang.Exception
        Initializes the format for the dataset produced. Must be called before the generateExample or generateExamples methods are used. Re-initializes the random number generator with the given seed.
        Overrides:
        defineDataFormat in class DataGenerator
        Returns:
        the format for the dataset
        Throws:
        java.lang.Exception - if generating the format fails
        See Also:
        DataGenerator.getSeed()
      • generateExample

        public Instance generateExample()
                                 throws java.lang.Exception
        Generates one example of the dataset.
        Specified by:
        generateExample in class DataGenerator
        Returns:
        the generated example
        Throws:
        java.lang.Exception - if the format of the dataset is not yet defined
        java.lang.Exception - if the generator only works with generateExamples, that is, in non-single mode
      • generateExamples

        public Instances generateExamples()
                                   throws java.lang.Exception
        Generates all examples of the dataset. Re-initializes the random number generator with the given seed, before generating instances.
        Specified by:
        generateExamples in class DataGenerator
        Returns:
        the generated dataset
        Throws:
        java.lang.Exception - if the format of the dataset is not yet defined
        java.lang.Exception - if the generator only works with generateExample, that is, in single mode
        See Also:
        DataGenerator.getSeed()
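
Because the random number generator is re-initialized from the seed, repeated batch generation with unchanged settings would be expected to yield identical data. The sketch below illustrates that behaviour with `java.util.Random`. `ReseedSketch` is a hypothetical stand-in, not the library's implementation.

```java
import java.util.Arrays;
import java.util.Random;

class ReseedSketch {

    private final int seed;
    private Random rand;

    ReseedSketch(int seed) {
        this.seed = seed;
        this.rand = new Random(seed);
    }

    // Single-example mode: continues the current random stream.
    double generateExample() {
        return rand.nextDouble();
    }

    // Batch mode: re-seeds first, so every call yields the same data.
    double[] generateExamples(int n) {
        rand = new Random(seed);
        double[] data = new double[n];
        for (int i = 0; i < n; i++) {
            data[i] = generateExample();
        }
        return data;
    }

    public static void main(String[] args) {
        ReseedSketch gen = new ReseedSketch(1);
        double[] first = gen.generateExamples(5);
        double[] second = gen.generateExamples(5);
        System.out.println(Arrays.equals(first, second)); // prints "true"
    }
}
```

Changing the seed is therefore the way to obtain a different dataset from otherwise identical settings.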
      • generateStart

        public java.lang.String generateStart()
        Generates a comment string that documents the data generator. By default this string is added at the beginning of the produced output in ARFF format, directly after the options.
        Specified by:
        generateStart in class DataGenerator
        Returns:
        a string containing information about the generated rules
      • generateFinished

        public java.lang.String generateFinished()
                                          throws java.lang.Exception
        Generates a comment string that documents the data generator. By default this string is added at the end of the produced output in ARFF format.
        Specified by:
        generateFinished in class DataGenerator
        Returns:
        a string containing information about the generated rules
        Throws:
        java.lang.Exception - if generating the documentation fails
      • getRevision

        public java.lang.String getRevision()
        Returns the revision string.
        Specified by:
        getRevision in interface RevisionHandler
        Returns:
        the revision
      • main

        public static void main(java.lang.String[] args)
        Main method for executing this class.
        Parameters:
        args - should contain arguments for the data producer: