Package edu.msu.cme.rdp.classifier
Class Classifier
- java.lang.Object
-
- edu.msu.cme.rdp.classifier.Classifier
-
public class Classifier extends java.lang.Object
This class performs the classification of query sequences.
-
-
Field Summary
Fields (modifier and type, field, description):
static int MAX_SEQ_LEN
static int MIN_BOOTSTRSP_WORDS
static int MIN_GOOD_WORDS
static int MIN_SEQ_LEN
The minimum number of bases per sequence.
-
Method Summary
Methods (modifier and type, method, description):
void addConfidence(HierarchyTree node, java.util.HashMap map)
Increases the count of each RankAssignment in the map that matches the given node or any ancestor of that node.
ClassificationResult classify(ClassifierSequence seq)
ClassificationResult classify(ClassifierSequence seq, int min_bootstrap_words)
Takes a query sequence and returns the classification result.
ClassificationResult classify(edu.msu.cme.rdp.readseq.readers.Sequence seq)
Takes a query sequence and returns the classification result.
java.lang.String getTrainRank()
-
-
-
Field Detail
-
MIN_SEQ_LEN
public static final int MIN_SEQ_LEN
The minimum number of bases per sequence. Initially set to 200.
- See Also:
- Constant Field Values
-
MAX_SEQ_LEN
public static final int MAX_SEQ_LEN
- See Also:
- Constant Field Values
-
MIN_GOOD_WORDS
public static final int MIN_GOOD_WORDS
- See Also:
- Constant Field Values
-
MIN_BOOTSTRSP_WORDS
public static final int MIN_BOOTSTRSP_WORDS
- See Also:
- Constant Field Values
-
-
Method Detail
-
getTrainRank
public java.lang.String getTrainRank()
-
classify
public ClassificationResult classify(edu.msu.cme.rdp.readseq.readers.Sequence seq) throws java.io.IOException
Takes a query sequence and returns the classification result. Each query sequence is first assigned to a genus node using all of its words. Then, in each bootstrap trial, one-eighth of the overlapping words in the query are chosen at random and used to calculate the joint probability. The number of times a genus is selected out of the total number of bootstrap trials is used as an estimate of confidence in the assignment to that genus.
- Throws:
ShortSequenceException
- if the sequence length is less than the minimum sequence length.
java.io.IOException
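The bootstrap procedure described for classify can be illustrated with a minimal sketch. It is not the RDP Classifier implementation: the class BootstrapSketch, the WordScorer interface, and the choice to sample words with replacement are all assumptions made for illustration, and the word list is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of bootstrap confidence estimation; not RDP code.
class BootstrapSketch {

    /** Hypothetical scorer: log-probability of seeing a word in a genus. */
    interface WordScorer {
        double logProb(String genus, String word);
    }

    /** Returns the genus with the highest joint log-probability over the words. */
    static String bestGenus(List<String> genera, List<String> words, WordScorer scorer) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String genus : genera) {
            double score = 0.0;
            for (String word : words) {
                score += scorer.logProb(genus, word); // sum of logs = log of joint probability
            }
            if (score > bestScore) {
                bestScore = score;
                best = genus;
            }
        }
        return best;
    }

    /**
     * Assigns a genus using all words, then repeats the assignment on random
     * subsets of one-eighth of the words. Returns the fraction of trials that
     * agree with the full assignment.
     */
    static double confidence(List<String> genera, List<String> words,
                             WordScorer scorer, int trials, Random rng) {
        String assigned = bestGenus(genera, words, scorer);
        int subsetSize = Math.max(1, words.size() / 8);
        int hits = 0;
        for (int t = 0; t < trials; t++) {
            List<String> sample = new ArrayList<>(subsetSize);
            for (int i = 0; i < subsetSize; i++) {
                // Assumption: words are drawn with replacement.
                sample.add(words.get(rng.nextInt(words.size())));
            }
            if (assigned.equals(bestGenus(genera, sample, scorer))) {
                hits++;
            }
        }
        return (double) hits / trials;
    }

    public static void main(String[] args) {
        List<String> genera = Arrays.asList("GenusA", "GenusB");
        List<String> words = Arrays.asList("w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8",
                                           "w9", "w10", "w11", "w12", "w13", "w14", "w15", "w16");
        // Toy scorer in which GenusA is always the more likely genus.
        WordScorer scorer = (genus, word) -> genus.equals("GenusA") ? -1.0 : -2.0;
        System.out.println(confidence(genera, words, scorer, 100, new Random(42)));
    }
}
```

In this toy setup every subset favours the same genus, so all trials agree with the full assignment. With a realistic scorer, subsets can disagree, and the returned fraction falls below 1.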
-
classify
public ClassificationResult classify(ClassifierSequence seq)
-
classify
public ClassificationResult classify(ClassifierSequence seq, int min_bootstrap_words)
Takes a query sequence and returns the classification result. Each query sequence is first assigned to a genus node using all of its words. Then, in each bootstrap trial, one-eighth of the overlapping words in the query are chosen at random and used to calculate the joint probability. The number of times a genus is selected out of the total number of bootstrap trials is used as an estimate of confidence in the assignment to that genus.
- Throws:
ShortSequenceException
- if the sequence length is less than the minimum sequence length.
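This page does not say how min_bootstrap_words enters the calculation. One plausible reading, offered here only as an assumption, is that it acts as a lower bound on the number of words sampled per bootstrap trial. The sketch below shows that arithmetic. The class name, the method names, and the word size of 8 used in the example are hypothetical.

```java
// Hypothetical sketch; the rule below is an assumption, not documented behaviour.
class BootstrapWordCount {

    /** Number of overlapping words of length wordSize in a sequence of seqLength bases. */
    static int overlappingWordCount(int seqLength, int wordSize) {
        return Math.max(0, seqLength - wordSize + 1);
    }

    /**
     * Assumed rule: sample one-eighth of the overlapping words per trial,
     * but never fewer than minBootstrapWords.
     */
    static int wordsPerTrial(int overlappingWords, int minBootstrapWords) {
        return Math.max(overlappingWords / 8, minBootstrapWords);
    }

    public static void main(String[] args) {
        int words = overlappingWordCount(200, 8); // assumed word size of 8
        System.out.println(wordsPerTrial(words, 5));
    }
}
```

Under these assumptions a 200-base query yields 193 overlapping words and 24 words per trial, while a very short query falls back to the floor value.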
-
addConfidence
public void addConfidence(HierarchyTree node, java.util.HashMap map)
Increases the count of each RankAssignment in the map that matches the given node or any ancestor of that node.
- Parameters:
node -
map -
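The ancestor walk described for addConfidence can be sketched as follows. The Node class is a hypothetical stand-in for HierarchyTree, and a plain integer count stands in for RankAssignment. Neither reflects the actual RDP types, whose members are not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of counting a node and its ancestors; not RDP code.
class AncestorCountSketch {

    /** Hypothetical taxonomy node with a link to its parent (null for the root). */
    static final class Node {
        final String name;
        final Node parent;

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /**
     * Walks from the node up to the root and increments the count of every
     * node on that path that already has an entry in the map.
     */
    static void addConfidence(Node node, Map<Node, Integer> counts) {
        for (Node current = node; current != null; current = current.parent) {
            if (counts.containsKey(current)) {
                counts.merge(current, 1, Integer::sum);
            }
        }
    }

    public static void main(String[] args) {
        Node root = new Node("Root", null);
        Node genus = new Node("GenusA", root);
        Map<Node, Integer> counts = new HashMap<>();
        counts.put(root, 0);
        counts.put(genus, 0);
        addConfidence(genus, counts);
        System.out.println(counts.get(root) + " " + counts.get(genus));
    }
}
```

Calling the method once per bootstrap trial with the genus chosen in that trial would accumulate, for each rank on the assigned lineage, the number of trials that agree with it.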
-
-
-