Class ClassifierTraineeMaker


  • public class ClassifierTraineeMaker
    extends java.lang.Object
    A command-line class that creates training information from the raw data.
    • Constructor Summary

      Constructors 
      Constructor Description
      ClassifierTraineeMaker(java.lang.String taxFile, java.lang.String seqFile, java.lang.String cnFile, int trainset_no, java.lang.String version, java.lang.String modification, java.lang.String outdir)
      Creates a new ClassifierTraineeMaker.
    • Method Summary

      Modifier and Type Method Description
      static void main(java.lang.String[] args)
      This is the main method to create training files from raw taxonomic information.
      static void printLicense()
      Prints the license information to standard error.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • ClassifierTraineeMaker

        public ClassifierTraineeMaker(java.lang.String taxFile,
                                      java.lang.String seqFile,
                                      java.lang.String cnFile,
                                      int trainset_no,
                                      java.lang.String version,
                                      java.lang.String modification,
                                      java.lang.String outdir)
                               throws java.io.FileNotFoundException,
                                      java.io.IOException
        Creates a new ClassifierTraineeMaker.
        Parameters:
        taxFile - contains the hierarchical taxonomy information in the following format: "taxid*taxon name*parent taxid*depth*rank". The taxid, the parent taxid and the depth must be integers. The depth is the distance of the taxon from the root taxon.
        seqFile - contains the raw training sequences in FASTA format. Each header line starts with ">", followed by the sequence name, whitespace, and a list of taxon names separated by ';', with the highest-rank taxon first. For example: >seq1 ROOT;Ph1;Fam1;G1;
        Note: a sequence can only be assigned to the lowest-rank taxon.
        trainset_no - is used to mark the training files generated.
        version - indicates the version of the hierarchical taxonomy.
        modification - holds the modification information of the taxonomy if any.
        outdir - specifies the output directory. The parsed training information will be saved into four files in the given output directory.
        Throws:
        java.io.FileNotFoundException
        java.io.IOException
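
        The two input formats described above can be illustrated with a minimal sketch. It is not the parser used by ClassifierTraineeMaker; the class TrainingInputFormats, its nested types, its methods and the sample taxonomy line "1*Ph1*0*1*phylum" are all hypothetical and exist only to show how a taxonomy line and a FASTA header of the documented shape break down into fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the documented API.
public class TrainingInputFormats {

    /** Fields of one taxonomy line: taxid*taxon name*parent taxid*depth*rank. */
    static final class TaxonLine {
        final int taxid;
        final String name;
        final int parentTaxid;
        final int depth;
        final String rank;

        TaxonLine(int taxid, String name, int parentTaxid, int depth, String rank) {
            this.taxid = taxid;
            this.name = name;
            this.parentTaxid = parentTaxid;
            this.depth = depth;
            this.rank = rank;
        }
    }

    /** Sequence name and lineage (highest rank first) from one FASTA header. */
    static final class SeqHeader {
        final String seqName;
        final List<String> lineage;

        SeqHeader(String seqName, List<String> lineage) {
            this.seqName = seqName;
            this.lineage = lineage;
        }
    }

    static TaxonLine parseTaxonLine(String line) {
        // '*' is a regex metacharacter, so it is escaped; -1 keeps empty trailing fields.
        String[] fields = line.trim().split("\\*", -1);
        if (fields.length != 5) {
            throw new IllegalArgumentException("Expected 5 '*'-separated fields: " + line);
        }
        // taxid, parent taxid and depth are documented as integers.
        return new TaxonLine(
                Integer.parseInt(fields[0].trim()),
                fields[1].trim(),
                Integer.parseInt(fields[2].trim()),
                Integer.parseInt(fields[3].trim()),
                fields[4].trim());
    }

    static SeqHeader parseFastaHeader(String header) {
        if (!header.startsWith(">")) {
            throw new IllegalArgumentException("FASTA header must start with '>': " + header);
        }
        // Split the sequence name from the lineage at the first run of whitespace.
        String[] parts = header.substring(1).trim().split("\\s+", 2);
        List<String> lineage = new ArrayList<>();
        if (parts.length == 2) {
            for (String taxon : parts[1].split(";")) {
                if (!taxon.trim().isEmpty()) {
                    lineage.add(taxon.trim());
                }
            }
        }
        return new SeqHeader(parts[0], lineage);
    }

    public static void main(String[] args) {
        TaxonLine t = parseTaxonLine("1*Ph1*0*1*phylum"); // made-up sample line
        System.out.println(t.taxid + " " + t.name + " " + t.parentTaxid + " " + t.depth + " " + t.rank);

        SeqHeader h = parseFastaHeader(">seq1 ROOT;Ph1;Fam1;G1;"); // header from the docs
        System.out.println(h.seqName + " " + h.lineage);
    }
}
```

        In this sketch the last entry of the lineage list is the lowest-rank taxon, which is the one the sequence is assigned to according to the note above.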
    • Method Detail

      • printLicense

        public static void printLicense()
        Prints the license information to standard error.
      • main

        public static void main(java.lang.String[] args)
                         throws java.io.FileNotFoundException,
                                java.io.IOException
        This is the main method to create training files from raw taxonomic information.

        Usage: java ClassifierTraineeMaker tax_file rawseq.fa trainsetNo version version_modification output_directory

        See the ClassifierTraineeMaker constructor for more detail.

        Parameters:
        args - the command-line arguments, in the order shown in the usage line above
        Throws:
        java.io.FileNotFoundException
        java.io.IOException