Package weka.classifiers.meta
Class MetaCost
- java.lang.Object
  - weka.classifiers.Classifier
    - weka.classifiers.SingleClassifierEnhancer
      - weka.classifiers.RandomizableSingleClassifierEnhancer
        - weka.classifiers.meta.MetaCost
- All Implemented Interfaces:
- java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
public class MetaCost extends RandomizableSingleClassifierEnhancer implements TechnicalInformationHandler
This metaclassifier makes its base classifier cost-sensitive using the method specified in
Pedro Domingos: MetaCost: A general method for making classifiers cost-sensitive. In: Fifth International Conference on Knowledge Discovery and Data Mining, 155-164, 1999.
This classifier should produce similar results to one created by passing the base learner to Bagging, which is in turn passed to a CostSensitiveClassifier operating on minimum expected cost. The difference is that MetaCost produces a single cost-sensitive classifier of the base learner, giving the benefits of fast classification and interpretable output (if the base learner itself is interpretable). This implementation uses all bagging iterations when reclassifying training data (the MetaCost paper reports a marginal improvement when only those iterations containing each training instance are used in reclassifying that instance).
BibTeX:
@inproceedings{Domingos1999,
  author = {Pedro Domingos},
  booktitle = {Fifth International Conference on Knowledge Discovery and Data Mining},
  pages = {155-164},
  title = {MetaCost: A general method for making classifiers cost-sensitive},
  year = {1999}
}
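The "minimum expected cost" decision mentioned in the description can be sketched in plain Java as follows. This is an illustration only, not Weka code: the class and method names are hypothetical, and it assumes that cost[i][j] holds the cost of predicting class j when the true class is i.

```java
import java.util.Arrays;

/** Illustrative sketch only; not part of Weka. */
final class MinExpectedCostSketch {

    /**
     * Expected cost of predicting each class j, i.e. the sum over i of
     * P(i | x) * cost[i][j]. Assumes rows are true classes and columns
     * are predicted classes.
     */
    static double[] expectedCosts(double[] classProbs, double[][] cost) {
        double[] expected = new double[cost[0].length];
        for (int i = 0; i < classProbs.length; i++) {
            for (int j = 0; j < expected.length; j++) {
                expected[j] += classProbs[i] * cost[i][j];
            }
        }
        return expected;
    }

    /** Index of the class with the lowest expected cost (first one on ties). */
    static int minCostClass(double[] classProbs, double[][] cost) {
        double[] expected = expectedCosts(classProbs, cost);
        int best = 0;
        for (int j = 1; j < expected.length; j++) {
            if (expected[j] < expected[best]) {
                best = j;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Misclassifying a true class 1 as class 0 costs five times more
        // than the opposite mistake.
        double[][] cost = {{0, 1}, {5, 0}};
        double[] probs = {0.7, 0.3};
        System.out.println(Arrays.toString(expectedCosts(probs, cost)));
        System.out.println(minCostClass(probs, cost));
    }
}
```

With these example numbers, class 0 is the more probable class, yet class 1 has the lower expected cost, so a cost-sensitive decision picks class 1.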
Valid options are:
-I <num> Number of bagging iterations. (default 10)
-C <cost file name> File name of a cost matrix to use. If this is not supplied, a cost matrix will be loaded on demand. The name of the on-demand file is the relation name of the training data plus ".cost", and the path to the on-demand file is specified with the -N option.
-N <directory> Name of a directory to search for cost files when loading costs on demand (default current directory).
-cost-matrix <matrix> The cost matrix in Matlab single line format.
-P Size of each bag, as a percentage of the training set size. (default 100)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
Options specific to classifier weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated classifier.
- Version:
- $Revision: 1.24 $
- Author:
- Len Trigg (len@reeltwo.com)
- See Also:
- Serialized Form
-
Field Summary
Modifier and Type | Field | Description
static int | MATRIX_ON_DEMAND | Load cost matrix on demand.
static int | MATRIX_SUPPLIED | Use explicit matrix.
static Tag[] | TAGS_MATRIX_SOURCE | Specify possible sources of the cost matrix.
-
Constructor Summary
Constructor | Description
MetaCost() |
-
Method Summary
Modifier and Type | Method | Description
java.lang.String | bagSizePercentTipText() | Returns the tip text for this property.
void | buildClassifier(Instances data) | Builds the model of the base learner.
java.lang.String | costMatrixSourceTipText() | Returns the tip text for this property.
java.lang.String | costMatrixTipText() | Returns the tip text for this property.
double[] | distributionForInstance(Instance instance) | Classifies a given instance after filtering.
int | getBagSizePercent() | Gets the size of each bag, as a percentage of the training set size.
Capabilities | getCapabilities() | Returns default capabilities of the classifier.
CostMatrix | getCostMatrix() | Gets the misclassification cost matrix.
SelectedTag | getCostMatrixSource() | Gets the source location method of the cost matrix.
int | getNumIterations() | Gets the number of bagging iterations.
java.io.File | getOnDemandDirectory() | Returns the directory that will be searched for cost files when loading on demand.
java.lang.String[] | getOptions() | Gets the current settings of the Classifier.
java.lang.String | getRevision() | Returns the revision string.
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
java.lang.String | globalInfo() | Returns a string describing the classifier.
java.util.Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(java.lang.String[] argv) | Main method for testing this class.
java.lang.String | numIterationsTipText() | Returns the tip text for this property.
java.lang.String | onDemandDirectoryTipText() | Returns the tip text for this property.
void | setBagSizePercent(int newBagSizePercent) | Sets the size of each bag, as a percentage of the training set size.
void | setCostMatrix(CostMatrix newCostMatrix) | Sets the misclassification cost matrix.
void | setCostMatrixSource(SelectedTag newMethod) | Sets the source location of the cost matrix.
void | setNumIterations(int numIterations) | Sets the number of bagging iterations.
void | setOnDemandDirectory(java.io.File newDir) | Sets the directory that will be searched for cost files when loading on demand.
void | setOptions(java.lang.String[] options) | Parses a given list of options.
java.lang.String | toString() | Outputs a representation of this classifier.
-
Methods inherited from class weka.classifiers.RandomizableSingleClassifierEnhancer
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, getClassifier, setClassifier
-
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
-
Field Detail
-
MATRIX_ON_DEMAND
public static final int MATRIX_ON_DEMAND
Load cost matrix on demand.
- See Also:
- Constant Field Values
-
MATRIX_SUPPLIED
public static final int MATRIX_SUPPLIED
Use explicit matrix.
- See Also:
- Constant Field Values
-
TAGS_MATRIX_SOURCE
public static final Tag[] TAGS_MATRIX_SOURCE
Specify possible sources of the cost matrix
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the explorer/experimenter GUI
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class RandomizableSingleClassifierEnhancer
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options.
Valid options are:
-I <num> Number of bagging iterations. (default 10)
-C <cost file name> File name of a cost matrix to use. If this is not supplied, a cost matrix will be loaded on demand. The name of the on-demand file is the relation name of the training data plus ".cost", and the path to the on-demand file is specified with the -N option.
-N <directory> Name of a directory to search for cost files when loading costs on demand (default current directory).
-cost-matrix <matrix> The cost matrix in Matlab single line format.
-P Size of each bag, as a percentage of the training set size. (default 100)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
Options specific to classifier weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated classifier.
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class RandomizableSingleClassifierEnhancer
- Parameters:
- options - the list of options as an array of strings
- Throws:
- java.lang.Exception - if an option is not supported
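To make the -cost-matrix option more concrete, here is a minimal plain-Java sketch of reading a matrix written in a Matlab-style single line such as "[0 1; 5 0]". It assumes that rows are separated by semicolons and cells by whitespace. The class name is hypothetical and this is not Weka's own parser, which may accept a wider syntax.

```java
/** Illustrative sketch only; not Weka's cost matrix parser. */
final class MatlabMatrixSketch {

    /** Parses text such as "[0 1; 5 0]" into a square matrix of doubles. */
    static double[][] parse(String text) {
        String s = text.trim();
        if (!s.startsWith("[") || !s.endsWith("]")) {
            throw new IllegalArgumentException("Expected [ ... ] but got: " + text);
        }
        // Strip the brackets, then split rows on ';' (ignoring spaces around it).
        String[] rows = s.substring(1, s.length() - 1).trim().split("\\s*;\\s*");
        double[][] matrix = new double[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            String[] cells = rows[r].trim().split("\\s+");
            // A cost matrix needs one row and one column per class.
            if (cells.length != rows.length) {
                throw new IllegalArgumentException("Cost matrix must be square: " + text);
            }
            matrix[r] = new double[cells.length];
            for (int c = 0; c < cells.length; c++) {
                matrix[r][c] = Double.parseDouble(cells[c]);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[][] m = parse("[0 1; 5 0]");
        System.out.println(m.length + " x " + m[0].length);
    }
}
```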
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the Classifier.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class RandomizableSingleClassifierEnhancer
- Returns:
- an array of strings suitable for passing to setOptions
-
costMatrixSourceTipText
public java.lang.String costMatrixSourceTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getCostMatrixSource
public SelectedTag getCostMatrixSource()
Gets the source location method of the cost matrix. Will be one of MATRIX_ON_DEMAND or MATRIX_SUPPLIED.
- Returns:
- the cost matrix source.
-
setCostMatrixSource
public void setCostMatrixSource(SelectedTag newMethod)
Sets the source location of the cost matrix. Values other than MATRIX_ON_DEMAND or MATRIX_SUPPLIED will be ignored.
- Parameters:
- newMethod - the cost matrix location method.
-
onDemandDirectoryTipText
public java.lang.String onDemandDirectoryTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getOnDemandDirectory
public java.io.File getOnDemandDirectory()
Returns the directory that will be searched for cost files when loading on demand.
- Returns:
- The cost file search directory.
-
setOnDemandDirectory
public void setOnDemandDirectory(java.io.File newDir)
Sets the directory that will be searched for cost files when loading on demand.
- Parameters:
- newDir - The cost file search directory.
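The -N option description says the on-demand cost file is named after the relation name of the training data plus ".cost" and is looked up in the on-demand directory. A minimal sketch of that naming rule in plain Java follows; the class, method, directory and relation names are hypothetical and this is not Weka's own lookup code.

```java
import java.io.File;

/** Illustrative sketch only; not part of Weka. */
final class OnDemandCostFileSketch {

    /** Builds the expected cost file location: <directory>/<relation name>.cost */
    static File costFileFor(File onDemandDirectory, String relationName) {
        return new File(onDemandDirectory, relationName + ".cost");
    }

    public static void main(String[] args) {
        // "costs" and "credit-g" are made-up example values.
        File costFile = costFileFor(new File("costs"), "credit-g");
        System.out.println(costFile.getPath());
    }
}
```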
-
bagSizePercentTipText
public java.lang.String bagSizePercentTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getBagSizePercent
public int getBagSizePercent()
Gets the size of each bag, as a percentage of the training set size.
- Returns:
- the bag size, as a percentage.
-
setBagSizePercent
public void setBagSizePercent(int newBagSizePercent)
Sets the size of each bag, as a percentage of the training set size.
- Parameters:
- newBagSizePercent - the bag size, as a percentage.
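As a rough illustration of how the bag size percentage and the random seed might interact, the sketch below draws one bootstrap sample of instance indices in plain Java. It assumes the bag size is the training set size times the percentage, truncated to an integer, and that sampling is done with replacement. The names are hypothetical, and Weka's actual resampling is handled by its own bagging code and may differ in detail.

```java
import java.util.Random;

/** Illustrative sketch only; not part of Weka. */
final class BagSizeSketch {

    /** Bag size as a percentage of the training set size (integer truncation assumed). */
    static int bagSize(int numInstances, int bagSizePercent) {
        return numInstances * bagSizePercent / 100;
    }

    /** Draws instance indices with replacement; the same seed gives the same bag. */
    static int[] bootstrapIndices(int numInstances, int bagSizePercent, long seed) {
        Random random = new Random(seed);
        int[] indices = new int[bagSize(numInstances, bagSizePercent)];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = random.nextInt(numInstances);
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(bagSize(150, 50));
        System.out.println(bootstrapIndices(150, 50, 1L).length);
    }
}
```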
-
numIterationsTipText
public java.lang.String numIterationsTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setNumIterations
public void setNumIterations(int numIterations)
Sets the number of bagging iterations.
- Parameters:
- numIterations - the number of iterations to use
-
getNumIterations
public int getNumIterations()
Gets the number of bagging iterations.
- Returns:
- the maximum number of bagging iterations
-
costMatrixTipText
public java.lang.String costMatrixTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getCostMatrix
public CostMatrix getCostMatrix()
Gets the misclassification cost matrix.
- Returns:
- the cost matrix
-
setCostMatrix
public void setCostMatrix(CostMatrix newCostMatrix)
Sets the misclassification cost matrix.
- Parameters:
- newCostMatrix - the cost matrix
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the classifier.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Overrides:
- getCapabilities in class SingleClassifierEnhancer
- Returns:
- the capabilities of this classifier
- See Also:
- Capabilities
-
buildClassifier
public void buildClassifier(Instances data) throws java.lang.Exception
Builds the model of the base learner.
- Specified by:
- buildClassifier in class Classifier
- Parameters:
- data - the training data
- Throws:
- java.lang.Exception - if the classifier could not be built successfully
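The relabeling step of the MetaCost procedure, as outlined in the class description (all bagging iterations are used when reclassifying the training data), can be sketched in plain Java as below. This is a simplified illustration under stated assumptions, not Weka's implementation: each bagged model is represented by a hypothetical function from an instance's feature values to a class distribution, and cost[i][j] is assumed to be the cost of predicting class j when the true class is i.

```java
import java.util.List;
import java.util.function.Function;

/** Illustrative sketch only; not Weka's MetaCost implementation. */
final class MetaCostRelabelSketch {

    /**
     * For every training instance, averages the class distributions of all
     * bagged models and returns the class with the lowest expected cost.
     */
    static int[] relabel(List<Function<double[], double[]>> baggedModels,
                         double[][] instances, double[][] cost) {
        int numClasses = cost.length;
        int[] newLabels = new int[instances.length];
        for (int n = 0; n < instances.length; n++) {
            // Average the votes of all bagging iterations.
            double[] average = new double[numClasses];
            for (Function<double[], double[]> model : baggedModels) {
                double[] dist = model.apply(instances[n]);
                for (int i = 0; i < numClasses; i++) {
                    average[i] += dist[i] / baggedModels.size();
                }
            }
            // Pick the predicted class j that minimizes sum_i P(i) * cost[i][j].
            int best = 0;
            double bestCost = Double.POSITIVE_INFINITY;
            for (int j = 0; j < numClasses; j++) {
                double expected = 0;
                for (int i = 0; i < numClasses; i++) {
                    expected += average[i] * cost[i][j];
                }
                if (expected < bestCost) {
                    bestCost = expected;
                    best = j;
                }
            }
            newLabels[n] = best;
        }
        return newLabels;
    }
}
```

In the full procedure, a single copy of the base learner would then be trained on the original instances paired with these new labels, which is what yields one cost-sensitive model instead of an ensemble.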
-
distributionForInstance
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Classifies a given instance after filtering.
- Overrides:
- distributionForInstance in class Classifier
- Parameters:
- instance - the instance to be classified
- Returns:
- the class distribution for the given instance
- Throws:
- java.lang.Exception - if instance could not be classified successfully
-
toString
public java.lang.String toString()
Outputs a representation of this classifier.
- Overrides:
- toString in class java.lang.Object
- Returns:
- a string representation of the classifier
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class Classifier
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
Main method for testing this class.
- Parameters:
- argv - should contain the following arguments: -t training file [-T test file] [-c class index]
-
-