Package edu.msu.cme.rdp.classifier.train
Class ClassifierTraineeMaker
java.lang.Object
    edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker
public class ClassifierTraineeMaker extends java.lang.Object
A command line class to create training information from the raw data.
Constructor Summary
ClassifierTraineeMaker(java.lang.String taxFile, java.lang.String seqFile, java.lang.String cnFile, int trainset_no, java.lang.String version, java.lang.String modification, java.lang.String outdir)
Creates a new ClassifierTraineeMaker.
Method Summary
static void main(java.lang.String[] args)
This is the main method to create training files from raw taxonomic information.
static void printLicense()
Prints the license information to standard error.
Constructor Detail
ClassifierTraineeMaker
public ClassifierTraineeMaker(java.lang.String taxFile, java.lang.String seqFile, java.lang.String cnFile, int trainset_no, java.lang.String version, java.lang.String modification, java.lang.String outdir) throws java.io.FileNotFoundException, java.io.IOException
Creates a new ClassifierTraineeMaker.
Parameters:
taxFile - contains the hierarchical taxonomy information, one taxon per line, in the following format: taxid*taxon name*parent taxid*depth*rank. The taxid, parent taxid and depth must be integers. The depth is the distance from the root taxon.
seqFile - contains the raw training sequences in FASTA format. Each header line starts with ">", followed by the sequence name, one or more whitespace characters, and a list of taxon names separated by ';' with the highest-rank taxon first. For example: >seq1 ROOT;Ph1;Fam1;G1;
Note: a sequence can only be assigned to the lowest-rank taxon.
trainset_no - is used to mark the generated training files.
version - indicates the version of the hierarchical taxonomy.
modification - holds the modification information of the taxonomy, if any.
outdir - specifies the output directory. The parsed training information is saved into four files in this directory.
Throws:
java.io.FileNotFoundException
java.io.IOException
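The two input formats described for taxFile and seqFile can be illustrated with a small standalone sketch. It does not use the RDP classifier code and is not how ClassifierTraineeMaker parses its input; the class TrainingFormatSketch, its methods, and the sample taxonomy line "1*Ph1*0*1*phylum" are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the RDP classifier API.
class TrainingFormatSketch {

    /** One line of the taxonomy file: taxid*taxon name*parent taxid*depth*rank. */
    static final class TaxonRow {
        final int taxid;
        final String name;
        final int parentTaxid;
        final int depth;
        final String rank;

        TaxonRow(int taxid, String name, int parentTaxid, int depth, String rank) {
            this.taxid = taxid;
            this.name = name;
            this.parentTaxid = parentTaxid;
            this.depth = depth;
            this.rank = rank;
        }
    }

    /** Splits a taxonomy line on '*' and converts the three integer fields. */
    static TaxonRow parseTaxonLine(String line) {
        String[] f = line.trim().split("\\*");
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 '*'-separated fields: " + line);
        }
        return new TaxonRow(
                Integer.parseInt(f[0].trim()),
                f[1].trim(),
                Integer.parseInt(f[2].trim()),
                Integer.parseInt(f[3].trim()),
                f[4].trim());
    }

    /** Returns the sequence name: the text between '>' and the first whitespace. */
    static String parseSequenceName(String header) {
        return splitHeader(header)[0];
    }

    /** Returns the taxon names from a FASTA header, highest rank first. */
    static List<String> parseLineage(String header) {
        String[] parts = splitHeader(header);
        List<String> lineage = new ArrayList<>();
        if (parts.length < 2) {
            return lineage; // header carries no lineage
        }
        for (String taxon : parts[1].split(";")) {
            String t = taxon.trim();
            if (!t.isEmpty()) {
                lineage.add(t);
            }
        }
        return lineage;
    }

    // Splits ">name lineage" into at most two parts: name and the rest.
    private static String[] splitHeader(String header) {
        if (!header.startsWith(">")) {
            throw new IllegalArgumentException("FASTA header must start with '>': " + header);
        }
        return header.substring(1).trim().split("\\s+", 2);
    }

    public static void main(String[] args) {
        TaxonRow row = parseTaxonLine("1*Ph1*0*1*phylum"); // made-up sample line
        System.out.println(row.taxid + " " + row.name + " depth=" + row.depth);

        String header = ">seq1 ROOT;Ph1;Fam1;G1;";
        System.out.println(parseSequenceName(header) + " -> " + parseLineage(header));
    }
}
```

In this sketch the last element of the returned lineage (G1 in the example header) is the lowest-rank taxon, which is the one the sequence is assigned to according to the note above.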
Method Detail
printLicense
public static void printLicense()
Prints the license information to standard error.
main
public static void main(java.lang.String[] args) throws java.io.FileNotFoundException, java.io.IOException
This is the main method to create training files from raw taxonomic information.
Usage: java ClassifierTraineeMaker tax_file rawseq.fa trainsetNo version version_modification output_directory
See the ClassifierTraineeMaker constructor for more detail.
Parameters:
args - the command-line arguments, in the order given in the usage line
Throws:
java.io.FileNotFoundException
java.io.IOException
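As a rough illustration of the argument order in the usage line, the sketch below assembles a six-element argument array and prints the equivalent command line. The class TraineeMakerArgsSketch, the helper buildArgs, and all file names and values are hypothetical. The sketch only builds the array and does not call ClassifierTraineeMaker.main, so it runs without the classifier on the classpath.

```java
// Hypothetical helper, not part of the RDP classifier API.
class TraineeMakerArgsSketch {

    /** Orders the values as in the usage line:
     *  tax_file rawseq.fa trainsetNo version version_modification output_directory */
    static String[] buildArgs(String taxFile, String seqFile, int trainsetNo,
                              String version, String modification, String outDir) {
        return new String[] {
            taxFile, seqFile, Integer.toString(trainsetNo), version, modification, outDir
        };
    }

    /** Renders the argument array as the command line from the usage text. */
    static String toCommandLine(String[] args) {
        return "java ClassifierTraineeMaker " + String.join(" ", args);
    }

    public static void main(String[] unused) {
        // All values below are made up for illustration.
        String[] args = buildArgs("tax.txt", "rawseq.fa", 1, "1.0", "none", "out");
        System.out.println(toCommandLine(args));
        // prints "java ClassifierTraineeMaker tax.txt rawseq.fa 1 1.0 none out"
    }
}
```

With the classifier classes available, the same array could presumably be passed to ClassifierTraineeMaker.main(args), which declares FileNotFoundException and IOException.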