Class TrainingInfo


  • public class TrainingInfo
    extends java.lang.Object
    TrainingInfo holds all the training information and the taxonomy hierarchy information.
    • Constructor Summary

      Constructors 
      Constructor Description
      TrainingInfo()
      Creates new TrainingInfo.
    • Method Summary

      Modifier and Type Method Description
      Classifier createClassifier()
      Creates a new Classifier if all the training information has been loaded; throws an exception otherwise.
      void createGenusWordProbList​(java.io.Reader reader)
      Reads in the index of the genus treenode and conditional probability that genus contains a word.
      void createLogWordPriorArr​(java.io.Reader reader)
      Reads in the log value of the word prior probability and saves to an array LogWordPriorArr.
      void createProbIndexArr​(java.io.Reader reader)
      Reads in start index of the conditional probability of each genus, saves to an array wordConditionalProbIndexArr.
      void createTree​(java.io.Reader reader)
      Reads in the tree information from a reader and creates all the HierarchyTrees.
      void generateWordPairDiffArr​(int[] word, int beginIndex)
      For a given word w1 and the reverse complement word w2, calculates the difference between the log word prior of w1 and w2 and saves to an array.
      HierarchyTree getGenusNodebyIndex​(int i)
      Returns a genus node from the genusNodeList at the specified position.
      int getGenusNodeListSize()
      Returns the number of the genus nodes.
      HierarchyVersion getHierarchyInfo()
      Returns the taxonomy hierarchy information from the training file.
      java.lang.String getHierarchyVersion()
      Returns the version of the taxonomical hierarchy.
      float getLogLeaveCount​(int i)
      Returns the log value of (number of leaves + 1) of a genus.
      float getLogWordPrior​(int wordIndex)
      Returns the log value of the prior probability of a word.
      HierarchyTree getRootTree()
      Returns the root of the trees.
      int getStartIndex​(int wordIndex)
      Returns the start index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
      int getStopIndex​(int wordIndex)
      Returns the stop index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
      java.lang.String getTrainRank()
      Returns the rank the classifier was trained on.
      GenusWordConditionalProb getWordConditionalProbObject​(int posIndex)
      Returns a GenusIndexWordConditionalProb from the genusIndex_wordConditionalProbList at the specified position in the list.
      float getWordPairPriorDiff​(int wordIndex)
      Returns the difference between the log word prior of the given word and that of its reverse complement word.
      boolean isSeqReversed​(int[] wordIndexArr, int wordCount)  
      boolean isSeqReversed​(ClassifierSequence seq)
      Returns true if the sequence is in reverse orientation.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • TrainingInfo

        public TrainingInfo()
        Creates new TrainingInfo.
    • Method Detail

      • createTree

        public void createTree​(java.io.Reader reader)
                        throws java.io.IOException,
                               TrainingDataException
        Reads in the tree information from a reader and creates all the HierarchyTrees. Note: the tree information has to be read after at least one of the other three files, because the version information must be set first.
        Throws:
        java.io.IOException
        TrainingDataException
      • createLogWordPriorArr

        public void createLogWordPriorArr​(java.io.Reader reader)
                                   throws java.io.IOException,
                                          TrainingDataException
        Reads in the log value of the word prior probability and saves to an array LogWordPriorArr.
        Throws:
        java.io.IOException
        TrainingDataException
      • generateWordPairDiffArr

        public void generateWordPairDiffArr​(int[] word,
                                            int beginIndex)
        For a given word w1 and the reverse complement word w2, calculates the difference between the log word prior of w1 and w2 and saves to an array. Repeats for every possible word of size 8.
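        The calculation described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the 2-bit base encoding (A=0, C=1, G=2, T=3, first base in the highest bits) and the names WordPairDiffSketch, reverseComplement and wordPairDiffs are assumptions made for the example. It also uses a flat loop over all word indexes, whereas the real method takes an int[] word and a beginIndex.

```java
final class WordPairDiffSketch {
    static final int WORD_SIZE = 8;                      // word size stated in the documentation
    static final int WORD_COUNT = 1 << (2 * WORD_SIZE);  // 4^8 = 65536 possible words

    // Assumed encoding: A=0, C=1, G=2, T=3, so the complement of a base is (3 - base).
    static int reverseComplement(int wordIndex) {
        int rc = 0;
        for (int i = 0; i < WORD_SIZE; i++) {
            int base = wordIndex & 3;      // take the last remaining base
            rc = (rc << 2) | (3 - base);   // append its complement to the result
            wordIndex >>= 2;
        }
        return rc;
    }

    // diff[w] = logWordPrior[w] - logWordPrior[reverse complement of w], for every word w.
    static float[] wordPairDiffs(float[] logWordPrior) {
        float[] diff = new float[WORD_COUNT];
        for (int w = 0; w < WORD_COUNT; w++) {
            diff[w] = logWordPrior[w] - logWordPrior[reverseComplement(w)];
        }
        return diff;
    }
}
```

        Under this encoding a word that equals its own reverse complement (for example ACGTACGT) always gets a difference of zero.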
      • createGenusWordProbList

        public void createGenusWordProbList​(java.io.Reader reader)
                                     throws java.io.IOException,
                                            TrainingDataException
        Reads in the index of the genus treenode and conditional probability that genus contains a word. Saves the data into a list genus_wordConditionalProbList.
        Throws:
        java.io.IOException
        TrainingDataException
      • createProbIndexArr

        public void createProbIndexArr​(java.io.Reader reader)
                                throws java.io.IOException,
                                       TrainingDataException
        Reads in start index of the conditional probability of each genus, saves to an array wordConditionalProbIndexArr.
        Throws:
        java.io.IOException
        TrainingDataException
      • createClassifier

        public Classifier createClassifier()
        Creates a new Classifier if all the training information has been loaded; throws an exception otherwise.
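        The completeness check can be pictured with a small stand-in. This sketch does not use the real TrainingInfo or Classifier types: TrainingLoadSketch, Part, markLoaded and create are hypothetical names, the four parts simply mirror the four create... methods documented above, and IllegalStateException is an assumed exception type, since the documentation does not name the one actually thrown.

```java
import java.util.EnumSet;

final class TrainingLoadSketch {
    // One entry per training input documented above (createTree, createLogWordPriorArr,
    // createGenusWordProbList, createProbIndexArr).
    enum Part { TREE, LOG_WORD_PRIOR, GENUS_WORD_PROB, PROB_INDEX }

    private final EnumSet<Part> loaded = EnumSet.noneOf(Part.class);

    void markLoaded(Part part) {
        loaded.add(part);
    }

    boolean isComplete() {
        return loaded.containsAll(EnumSet.allOf(Part.class));
    }

    // Stand-in for createClassifier(): returns a placeholder string instead of a Classifier.
    String create() {
        if (!isComplete()) {
            // Assumed exception type; the real method may throw something else.
            throw new IllegalStateException(
                "Training data incomplete, missing: " + EnumSet.complementOf(loaded));
        }
        return "classifier ready";
    }
}
```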
      • getRootTree

        public HierarchyTree getRootTree()
        Returns the root of the trees.
      • getTrainRank

        public java.lang.String getTrainRank()
        Returns:
        the rank the classifier was trained on
      • getGenusNodeListSize

        public int getGenusNodeListSize()
        Returns the number of the genus nodes.
      • getGenusNodebyIndex

        public HierarchyTree getGenusNodebyIndex​(int i)
        Returns a genus node from the genusNodeList at the specified position.
      • getLogWordPrior

        public float getLogWordPrior​(int wordIndex)
        Returns the log value of the prior probability of a word.
      • getWordPairPriorDiff

        public float getWordPairPriorDiff​(int wordIndex)
        Returns the difference between the log word prior of the given word and that of its reverse complement word.
      • getLogLeaveCount

        public float getLogLeaveCount​(int i)
        Returns the log value of (number of leaves + 1) of a genus.
      • getStartIndex

        public int getStartIndex​(int wordIndex)
        Returns the start index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
      • getStopIndex

        public int getStopIndex​(int wordIndex)
        Returns the stop index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
      • getWordConditionalProbObject

        public GenusWordConditionalProb getWordConditionalProbObject​(int posIndex)
        Returns a GenusIndexWordConditionalProb from the genusIndex_wordConditionalProbList at the specified position in the list.
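        The three accessors above suggest a flat list of per-genus entries that is sliced by word through an index array. The sketch below shows one way such a layout can be walked. It is an illustration under assumptions: WordProbIndexSketch, Entry and probFor are hypothetical names, and the stop index is assumed to be exclusive and equal to the start index of the next word, which the documentation does not state.

```java
import java.util.List;

final class WordProbIndexSketch {
    // Hypothetical stand-in for one (genus index, conditional probability) entry.
    static final class Entry {
        final int genusIndex;
        final float prob;

        Entry(int genusIndex, float prob) {
            this.genusIndex = genusIndex;
            this.prob = prob;
        }
    }

    private final int[] startIndex;     // startIndex[w] = position of the first entry for word w
    private final List<Entry> entries;  // all entries, grouped by word

    // startIndex must have one more element than there are words.
    WordProbIndexSketch(int[] startIndex, List<Entry> entries) {
        this.startIndex = startIndex;
        this.entries = entries;
    }

    int getStartIndex(int wordIndex) {
        return startIndex[wordIndex];
    }

    // Assumption: exclusive stop, i.e. the start of the next word's entries.
    int getStopIndex(int wordIndex) {
        return startIndex[wordIndex + 1];
    }

    // Scans the entries of one word for a genus; returns fallback if the genus has no entry.
    float probFor(int wordIndex, int genusIndex, float fallback) {
        for (int i = getStartIndex(wordIndex); i < getStopIndex(wordIndex); i++) {
            Entry e = entries.get(i);
            if (e.genusIndex == genusIndex) {
                return e.prob;
            }
        }
        return fallback;
    }
}
```

        With this layout a word that occurs in no genus has equal start and stop indexes, so the loop body never runs.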
      • getHierarchyVersion

        public java.lang.String getHierarchyVersion()
        Returns the version of the taxonomical hierarchy.
      • getHierarchyInfo

        public HierarchyVersion getHierarchyInfo()
        Returns the taxonomy hierarchy information from the training file.
      • isSeqReversed

        public boolean isSeqReversed​(ClassifierSequence seq)
                              throws java.io.IOException
        Returns true if the sequence is in reverse orientation. Sums the differences between all the overlapping words from the query sequence and the reverse complements of those words. If the sum is less than zero, the query sequence is in reverse orientation.
        Throws:
        java.io.IOException
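        The orientation test described above reduces to a sum over precomputed word-pair differences. The sketch below is a minimal illustration, not the class's code: OrientationSketch is a hypothetical name, the difference array is passed in explicitly rather than read from TrainingInfo, and wordCount is assumed to be the number of valid entries at the front of wordIndexArr.

```java
final class OrientationSketch {
    // wordPairPriorDiff[w] is assumed to hold
    // (log prior of word w) - (log prior of its reverse complement).
    static boolean isSeqReversed(int[] wordIndexArr, int wordCount, float[] wordPairPriorDiff) {
        float sum = 0f;
        for (int i = 0; i < wordCount; i++) {
            sum += wordPairPriorDiff[wordIndexArr[i]];
        }
        // A negative sum means the reverse complement words are, in total,
        // more likely than the words as given.
        return sum < 0f;
    }
}
```

        A sum of exactly zero is treated as forward orientation here, which matches the "less than zero" wording of the description.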
      • isSeqReversed

        public boolean isSeqReversed​(int[] wordIndexArr,
                                     int wordCount)