Class StockholmFileParser


  • public class StockholmFileParser
    extends java.lang.Object
    Stockholm file parser.
    For more information about the format, refer to the field descriptions below.
     Pfam DESCRIPTION OF FIELDS
    
        Compulsory fields:
        ------------------
    
        AC   Accession number:           Accession number in form PFxxxxx.version or PBxxxxxx.
        ID   Identification:             One word name for family.
        DE   Definition:                 Short description of family.
        AU   Author:                     Authors of the entry.
        SE   Source of seed:             The source suggesting the seed members belong to one family.
        GA   Gathering method:           Search threshold to build the full alignment.
        TC   Trusted Cutoff:             Lowest sequence score and domain score of match in the full alignment.
        NC   Noise Cutoff:               Highest sequence score and domain score of match not in full alignment.
        TP   Type:                       Type of family -- presently Family, Domain, Motif or Repeat.
        SQ   Sequence:                   Number of sequences in alignment.
        //                               End of alignment.
    
        Optional fields:
        ----------------
    
        DC   Database Comment:           Comment about database reference.
        DR   Database Reference:         Reference to external database.
        RC   Reference Comment:          Comment about literature reference.
        RN   Reference Number:           Reference Number.
        RM   Reference Medline:          Eight digit medline UI number.
        RT   Reference Title:            Reference Title.
        RA   Reference Author:           Reference Author.
        RL   Reference Location:         Journal location.
        PI   Previous identifier:        Record of all previous ID lines.
        KW   Keywords:                   Keywords.
        CC   Comment:                    Comments.
        NE   Pfam accession:             Indicates a nested domain.
        NL   Location:                   Location of nested domains - sequence ID, start and end of insert.
        WK   Wikipedia Reference:        Reference to wikipedia.
    
        Obsolete fields:
        ----------------
    
        AL   Alignment method of seed:   The method used to align the seed members.
        AM   Alignment Method:           The order ls and fs hits are aligned to the model to build the full align.
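
As a rough illustration of where the two-letter codes above appear in a file: in Stockholm format, per-file annotations are written as `#=GF <code> <text>` lines, and `//` ends an alignment. The sketch below is not this parser's implementation. The class `GfFieldSketch`, the method `readGfFields`, and the sample accession are made up for the example, and joining repeated codes with a space is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GfFieldSketch {

    /** Collects the "#=GF <code> <text>" lines of the first alignment into a map (hypothetical helper). */
    public static Map<String, String> readGfFields(String stockholmText) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String rawLine : stockholmText.split("\\R")) {
            String line = rawLine.trim();
            if (line.equals("//")) {
                break; // end of the first alignment
            }
            if (!line.startsWith("#=GF ")) {
                continue; // header, sequence rows and other annotation types are ignored here
            }
            // Split into at most three parts: "#=GF", the field code, the free text.
            String[] parts = line.split("\\s+", 3);
            if (parts.length < 3) {
                continue; // a code without any text
            }
            // Repeated codes (e.g. several CC lines) are joined with a single space.
            fields.merge(parts[1], parts[2], (oldText, newText) -> oldText + " " + newText);
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "# STOCKHOLM 1.0",
                "#=GF ID   EXAMPLE",
                "#=GF AC   PF00000.1",
                "#=GF CC   First comment line",
                "#=GF CC   second comment line.",
                "seq1/1-4  ACDE",
                "//");
        readGfFields(sample).forEach((code, text) -> System.out.println(code + " = " + text));
    }
}
```

The map keeps insertion order, so the codes print in the order they first occur in the file.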
    
     
    Since:
    3.0.5
    Author:
    Amr AL-Hossary, Marko Vaz
    • Field Detail

      • INFINITY

        public static final int INFINITY
        Indicates that as many structures as possible should be read, without limit.
        See Also:
        Constant Field Values
    • Constructor Detail

      • StockholmFileParser

        public StockholmFileParser()
    • Method Detail

      • parse

        public StockholmStructure parse​(java.lang.String filename)
                                 throws java.io.IOException,
                                        ParserException
        Parses a Stockholm file and returns a StockholmStructure object with its content.
        This method is meant for one-off access to a specific file, and it closes the file when it is done. Any subsequent call to parseNext(int) will either throw an exception or behave unpredictably.
        Parameters:
        filename - path to the file from which to read the content
        Returns:
        the content of the Stockholm file
        Throws:
        java.io.IOException - if an error occurs while opening, reading, or closing the file
        ParserException - if an unexpected format is encountered
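
The close-after-use contract described above can be illustrated with plain `java.io` and `java.nio.file`, without this parser. In the sketch below, `SingleAccessSketch` and `readFirstRecord` are hypothetical names. It reads the lines of the first `//`-terminated record inside a try-with-resources block, so the file is closed on return and no later "next" call could continue from it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingleAccessSketch {

    /** Reads the lines of the first record (up to, not including, "//") and closes the file. */
    public static List<String> readFirstRecord(Path file) throws IOException {
        List<String> record = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().equals("//")) {
                    break; // end of the first record
                }
                record.add(line);
            }
        } // the reader is closed here, even if an IOException was thrown
        return record;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sketch", ".sth");
        try {
            Files.write(tmp, Arrays.asList(
                    "# STOCKHOLM 1.0", "seq1 ACDE", "//",
                    "# STOCKHOLM 1.0", "seq2 FGHI", "//"));
            // Only the first record is returned; the second one is never reached.
            System.out.println(readFirstRecord(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```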
      • parse

        public java.util.List<StockholmStructure> parse​(java.lang.String filename,
                                                        int max)
                                                 throws java.io.IOException,
                                                        ParserException
        Parses a Stockholm file and returns up to max StockholmStructure objects with its content.
        This method does not close the file when it is done, to allow for further calls of parseNext(int).
        Parameters:
        filename - the file from which to read the content; see InputStreamProvider for more details.
        max - maximum number of structures to read, INFINITY for all.
        Returns:
        a list of StockholmStructure objects containing the parsed structures.
        Throws:
        java.io.IOException - if an error occurs while opening, reading, or closing the file.
        ParserException - if an unexpected format is encountered
        See Also:
        parseNext(int)
      • parse

        public StockholmStructure parse​(java.io.InputStream inStream)
                                 throws ParserException,
                                        java.io.IOException
        Parses an InputStream and returns the first alignment it contains as a StockholmStructure object. Used mainly when multiple files are stored in the same input stream (e.g. when reading from Pfam flat files).
        This method leaves the stream open for further calls of parseNext(int).
        Parameters:
        inStream - the InputStream containing the file to read.
        Returns:
        a StockholmStructure object representing the file contents.
        Throws:
        java.io.IOException
        ParserException
        See Also:
        parseNext(int)
      • parse

        public java.util.List<StockholmStructure> parse​(java.io.InputStream inStream,
                                                        int max)
                                                 throws java.io.IOException
        Parses an InputStream and returns at most max of the structures it contains.
        This method leaves the stream open for further calls of parse(InputStream, int) (this same method) or parseNext(int).
        Parameters:
        inStream - the stream to parse
        max - maximum number of structures to try to parse, INFINITY to try to obtain as many as possible.
        Returns:
        a List of StockholmStructure objects. If there are no more structures, an empty list is returned.
        Throws:
        java.io.IOException - if an I/O error occurs.
        See Also:
        parseNext(int)
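
The batching contract above (read at most max structures, leave the stream open, return an empty list once nothing is left) can be sketched without this parser. Everything named here is hypothetical: `RecordBatchSketch`, `readNext`, and the choice of -1 as the "no limit" sentinel are assumptions made for illustration, and a "record" is simply the raw lines before each `//` terminator.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RecordBatchSketch {

    /** Assumed sentinel meaning "no limit"; the real constant's value is not stated above. */
    public static final int INFINITY = -1;

    private final BufferedReader reader;

    public RecordBatchSketch(InputStream in) {
        // One reader is kept for the object's lifetime so buffered data is not lost between calls.
        this.reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    /** Reads up to max records; each record is the list of lines before a "//" terminator. */
    public List<List<String>> readNext(int max) throws IOException {
        List<List<String>> records = new ArrayList<>();
        List<String> current = new ArrayList<>();
        String line;
        // Stop as soon as the limit is reached, before consuming any further line.
        while ((max == INFINITY || records.size() < max) && (line = reader.readLine()) != null) {
            if (line.trim().equals("//")) {
                records.add(current);
                current = new ArrayList<>();
            } else {
                current.add(line);
            }
        }
        // The stream is deliberately left open; an unterminated tail at end of input is dropped.
        return records;
    }

    public static void main(String[] args) throws IOException {
        String text = "a1\na2\n//\nb1\n//\nc1\n//\n";
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        RecordBatchSketch sketch = new RecordBatchSketch(in);
        System.out.println(sketch.readNext(2));        // the first two records
        System.out.println(sketch.readNext(INFINITY)); // whatever is left
        System.out.println(sketch.readNext(INFINITY)); // nothing left: an empty list
    }
}
```

Holding a single buffered reader across calls is the design point here: creating a new reader per call could discard bytes that the previous reader had already buffered from the stream.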