Class SubsetByExpression

  • All Implemented Interfaces:
    java.io.Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler

    public class SubsetByExpression
    extends SimpleBatchFilter
    Filters instances according to a user-specified expression.

    Grammar:

    boolexpr_list ::= boolexpr_list boolexpr_part | boolexpr_part;

    boolexpr_part ::= boolexpr:e {: parser.setResult(e); :} ;

    boolexpr ::= BOOLEAN
    | true
    | false
    | expr < expr
    | expr <= expr
    | expr > expr
    | expr >= expr
    | expr = expr
    | ( boolexpr )
    | not boolexpr
    | boolexpr and boolexpr
    | boolexpr or boolexpr
    | ATTRIBUTE is STRING
    ;

    expr ::= NUMBER
    | ATTRIBUTE
    | ( expr )
    | opexpr
    | funcexpr
    ;

    opexpr ::= expr + expr
    | expr - expr
    | expr * expr
    | expr / expr
    ;

    funcexpr ::= abs ( expr )
    | sqrt ( expr )
    | log ( expr )
    | exp ( expr )
    | sin ( expr )
    | cos ( expr )
    | tan ( expr )
    | rint ( expr )
    | floor ( expr )
    | pow ( expr for base , expr for exponent )
    | ceil ( expr )
    ;
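
    The function names in the funcexpr rule coincide with methods of java.lang.Math. The sketch below assumes (this page does not confirm it) that each function behaves like the Math method of the same name, so that, for example, log is the natural logarithm and rint rounds to the nearest integer value. FuncExprSketch and its methods are hypothetical names used only for illustration; this is not the filter's parser.

```java
final class FuncExprSketch {

    // Evaluates a one-argument function from the funcexpr rule, assuming it
    // matches the java.lang.Math method with the same name.
    static double apply(String name, double x) {
        switch (name) {
            case "abs":   return Math.abs(x);
            case "sqrt":  return Math.sqrt(x);
            case "log":   return Math.log(x);   // natural logarithm
            case "exp":   return Math.exp(x);
            case "sin":   return Math.sin(x);
            case "cos":   return Math.cos(x);
            case "tan":   return Math.tan(x);
            case "rint":  return Math.rint(x);  // nearest integer value, ties to even
            case "floor": return Math.floor(x);
            case "ceil":  return Math.ceil(x);
            default:
                throw new IllegalArgumentException("unknown function: " + name);
        }
    }

    // pow takes two arguments: the base first, then the exponent.
    static double pow(double base, double exponent) {
        return Math.pow(base, exponent);
    }

    public static void main(String[] args) {
        System.out.println(apply("floor", -1.5));
        System.out.println(apply("rint", 2.5));
        System.out.println(pow(2, 10));
    }
}
```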

    Notes:
    - NUMBER
    any integer or floating point number
    (but not in scientific notation!)
    - STRING
    any string surrounded by single quotes;
    the string may not contain a single quote though.
    - ATTRIBUTE
    the following placeholders are recognized for
    attribute values:
    - CLASS for the class value, if a class attribute is set.
    - ATTxyz, where xyz is a number from 1 to the number of attributes
    in the dataset, for the value of the attribute at that index.

    Examples:
    - extracting only mammals and birds from the 'zoo' UCI dataset:
    (CLASS is 'mammal') or (CLASS is 'bird')
    - extracting only animals with at least 2 legs from the 'zoo' UCI dataset:
    (ATT14 >= 2)
    - extracting only instances with non-missing 'wage-increase-second-year'
    from the 'labor' UCI dataset:
    not ismissing(ATT3)
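
    To make the meaning of such expressions concrete, the following sketch restates two of them as plain Java predicates over rows of numeric values. It is an illustration only, not the filter's implementation: ExpressionSemanticsSketch, att, isMissing and subset are hypothetical names, rows are plain double arrays, and a missing value is assumed to be stored as NaN. Note that ATT indices are 1-based, as stated in the notes above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class ExpressionSemanticsSketch {

    // Value of ATTn for a row; n is 1-based, as in the expression grammar.
    static double att(double[] row, int n) {
        return row[n - 1];
    }

    // Counterpart of ismissing(ATTn), assuming missing values are stored as NaN.
    static boolean isMissing(double[] row, int n) {
        return Double.isNaN(att(row, n));
    }

    // Keeps only the rows for which the predicate holds.
    static List<double[]> subset(List<double[]> rows, Predicate<double[]> keep) {
        List<double[]> kept = new ArrayList<>();
        for (double[] row : rows) {
            if (keep.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<double[]> rows = List.of(
            new double[] {4.0, 1.5},
            new double[] {0.0, Double.NaN},
            new double[] {2.0, 3.0},
            new double[] {1.0, 2.0});

        // Analogous to the expression "(ATT1 >= 2)"
        System.out.println(subset(rows, r -> att(r, 1) >= 2).size());

        // Analogous to the expression "not ismissing(ATT2)"
        System.out.println(subset(rows, r -> !isMissing(r, 2)).size());
    }
}
```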

    Valid options are:

     -E <expr>
      The expression to use for filtering
      (default: true).
     -F
      Apply the filter to instances that arrive after the first
      (training) batch. By default the filter is not applied to them
      (i.e. each instance is always returned).
    Version:
    $Revision: 9804 $
    Author:
    fracpete (fracpete at waikato dot ac dot nz)
    See Also:
    Serialized Form
    • Constructor Detail

      • SubsetByExpression

        public SubsetByExpression()
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this filter.
        Specified by:
        globalInfo in class SimpleFilter
        Returns:
        a description of the filter suitable for displaying in the Explorer/Experimenter GUI
      • input

        public boolean input(Instance instance)
                      throws java.lang.Exception
        Inputs an instance for filtering. The filter requires all training instances to be read before it produces output; calling batchFinished() makes the data available. If this instance is part of a new batch, m_NewBatch is set to false.
        Overrides:
        input in class SimpleBatchFilter
        Parameters:
        instance - the input instance
        Returns:
        true if the filtered instance may now be collected with output().
        Throws:
        java.lang.IllegalStateException - if no input structure has been defined
        java.lang.Exception - if something goes wrong
        See Also:
        SimpleBatchFilter.batchFinished()
      • listOptions

        public java.util.Enumeration listOptions()
        Returns an enumeration describing the available options.
        Specified by:
        listOptions in interface OptionHandler
        Overrides:
        listOptions in class SimpleFilter
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions(java.lang.String[] options)
                        throws java.lang.Exception
        Parses a given list of options.

        Valid options are:

         -E <expr>
          The expression to use for filtering
          (default: true).
         -F
          Apply the filter to instances that arrive after the first
          (training) batch. By default the filter is not applied to them
          (i.e. each instance is always returned).
        Specified by:
        setOptions in interface OptionHandler
        Overrides:
        setOptions in class SimpleFilter
        Parameters:
        options - the list of options as an array of strings
        Throws:
        java.lang.Exception - if an option is not supported
        See Also:
        SimpleFilter.reset()
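
        As a rough sketch of the array form that setOptions accepts, the helper below assembles the -E and -F options listed above. SubsetOptionsSketch and buildOptions are hypothetical names introduced for illustration. The sketch assumes that the whole expression is passed as a single array element following -E, even when it contains spaces.

```java
import java.util.ArrayList;
import java.util.List;

final class SubsetOptionsSketch {

    // Builds an option array of the form {"-E", <expr>[, "-F"]}.
    static String[] buildOptions(String expression, boolean filterAfterFirstBatch) {
        List<String> options = new ArrayList<>();
        options.add("-E");
        options.add(expression); // the entire expression is one element
        if (filterAfterFirstBatch) {
            options.add("-F");   // flag option, takes no value
        }
        return options.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions("(ATT14 >= 2)", true);
        // With the filter class available, this array would be handed to
        // setOptions(options); here it is only printed.
        System.out.println(String.join(" | ", options));
    }
}
```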
      • getOptions

        public java.lang.String[] getOptions()
        Gets the current settings of the filter.
        Specified by:
        getOptions in interface OptionHandler
        Overrides:
        getOptions in class SimpleFilter
        Returns:
        an array of strings suitable for passing to setOptions
      • setExpression

        public void setExpression(java.lang.String value)
        Sets the expression used for filtering.
        Parameters:
        value - the expression
      • getExpression

        public java.lang.String getExpression()
        Returns the expression used for filtering.
        Returns:
        the expression
      • expressionTipText

        public java.lang.String expressionTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the Explorer/Experimenter GUI
      • setFilterAfterFirstBatch

        public void setFilterAfterFirstBatch(boolean b)
        Sets whether to apply the filter to instances that arrive once the first (training) batch has been seen. By default the filter is not applied and each input instance is simply returned. This ensures that, when the filter is used in the FilteredClassifier, a test instance does not get "consumed" by the filter and a prediction is always generated.
        Parameters:
        b - true if the filter should be applied to instances that arrive after the first (training) batch has been processed.
      • getFilterAfterFirstBatch

        public boolean getFilterAfterFirstBatch()
        Gets whether the filter is applied to instances that arrive once the first (training) batch has been seen. By default the filter is not applied and each input instance is simply returned. This ensures that, when the filter is used in the FilteredClassifier, a test instance does not get "consumed" by the filter and a prediction is always generated.
        Returns:
        true if the filter should be applied to instances that arrive after the first (training) batch has been processed.
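
        The effect of this property can be pictured with a small simulation. The sketch below is not the filter's code: FirstBatchSketch and processBatch are hypothetical names, instances are reduced to single numbers, and the expression is replaced by a plain predicate. It only models the documented behaviour that the first batch is always filtered, while later batches are filtered only when the property is true.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoublePredicate;

final class FirstBatchSketch {

    private final DoublePredicate keep;          // stands in for the expression
    private final boolean filterAfterFirstBatch; // models the -F option
    private boolean firstBatchDone = false;

    FirstBatchSketch(DoublePredicate keep, boolean filterAfterFirstBatch) {
        this.keep = keep;
        this.filterAfterFirstBatch = filterAfterFirstBatch;
    }

    // Processes one batch: the first batch is always filtered; later batches
    // are passed through unchanged unless filterAfterFirstBatch is true.
    List<Double> processBatch(List<Double> batch) {
        boolean applyFilter = !firstBatchDone || filterAfterFirstBatch;
        List<Double> result = new ArrayList<>();
        for (double value : batch) {
            if (!applyFilter || keep.test(value)) {
                result.add(value);
            }
        }
        firstBatchDone = true;
        return result;
    }

    public static void main(String[] args) {
        FirstBatchSketch sketch = new FirstBatchSketch(v -> v >= 2, false);
        System.out.println(sketch.processBatch(List.of(1.0, 5.0))); // training batch, filtered
        System.out.println(sketch.processBatch(List.of(1.0)));      // later batch, passed through
    }
}
```

        With the default setting, the value 1.0 in the second batch is kept even though it fails the predicate, which mirrors why a test instance still receives a prediction.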
      • filterAfterFirstBatchTipText

        public java.lang.String filterAfterFirstBatchTipText()
        Returns the tip text for this property.
        Returns:
        tip text for this property suitable for displaying in the Explorer/Experimenter GUI
      • main

        public static void main(java.lang.String[] args)
        Main method for running this filter.
        Parameters:
        args - arguments for the filter: use -h for help