Class BitSequenceReader.BitArrayWorker<C extends Compound>

    • Constructor Detail

      • BitArrayWorker

        public BitArrayWorker​(Sequence<C> sequence)
      • BitArrayWorker

        public BitArrayWorker​(java.lang.String sequence,
                              CompoundSet<C> compoundSet)
      • BitArrayWorker

        public BitArrayWorker​(CompoundSet<C> compoundSet,
                              int length)
      • BitArrayWorker

        public BitArrayWorker​(CompoundSet<C> compoundSet,
                              int[] sequence)
    • Method Detail

      • bitMask

        protected abstract byte bitMask()
        This method should return the bit mask used to extract the bits you are interested in working with. See the concrete implementations for examples of how to create one.
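        As a rough illustration of what such a mask can look like, the sketch below builds a mask with the lowest bits set and uses it to pull one compound out of a packed int. The class and method names (BitMaskSketch, maskFor, extract) are hypothetical and not part of this API, and the concrete workers may build their masks differently. It assumes fewer than 8 bits per compound so that the mask fits in a positive byte.

```java
// Hypothetical sketch; names are illustrative and not part of BitArrayWorker.
class BitMaskSketch {

    // Mask with the lowest bitsPerCompound bits set, e.g. 2 gives binary 11.
    // Assumes bitsPerCompound is below 8 so the byte stays non-negative.
    static byte maskFor(int bitsPerCompound) {
        return (byte) ((1 << bitsPerCompound) - 1);
    }

    // Shifts the wanted slot down to the low bits, then masks off everything else.
    static int extract(int packed, int slot, int bitsPerCompound) {
        int shift = slot * bitsPerCompound;
        return (packed >>> shift) & maskFor(bitsPerCompound);
    }

    public static void main(String[] args) {
        int packed = 0b1001; // slot 0 holds binary 01, slot 1 holds binary 10
        System.out.println(extract(packed, 0, 2)); // prints 1
        System.out.println(extract(packed, 1, 2)); // prints 2
    }
}
```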
      • compoundsPerDatatype

        protected abstract int compoundsPerDatatype()
        Should return the maximum number of compounds that can be encoded in a single int.
      • generateIndexToCompounds

        protected abstract java.util.List<C> generateIndexToCompounds()
        Should return the inverse of the information returned by generateCompoundsToIndex(), i.e. if a compound C maps to 1 in the compounds-to-index map, then C should be found at position 1 of this list.
      • generateCompoundsToIndex

        protected abstract java.util.Map<C,​java.lang.Integer> generateCompoundsToIndex()
        Returns the value each compound has in the backing bit storage, e.g. in 2bit storage the value 0 is encoded as 00 (in binary).
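        The relationship between the two lookups can be sketched as follows. This is a minimal standalone example, not the library's implementation: Character stands in for the Compound type, the class name LookupSketch is hypothetical, and the A, C, G, T ordering is chosen arbitrarily for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; Character is a stand-in for a Compound implementation.
class LookupSketch {

    // Position in the list is the value stored in the bit array.
    static List<Character> indexToCompounds() {
        return Arrays.asList('A', 'C', 'G', 'T'); // ordering is an arbitrary assumption
    }

    // Built from the list so the two lookups are inverses by construction.
    static Map<Character, Integer> compoundsToIndex(List<Character> indexToCompounds) {
        Map<Character, Integer> map = new HashMap<>();
        for (int i = 0; i < indexToCompounds.size(); i++) {
            map.put(indexToCompounds.get(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        List<Character> list = indexToCompounds();
        Map<Character, Integer> map = compoundsToIndex(list);
        // Round trip: compound -> index -> compound.
        System.out.println(list.get(map.get('C'))); // prints C
    }
}
```

        Deriving one lookup from the other, as done here, is one way to keep them consistent; a subclass could equally define both by hand as long as the round trip holds.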
      • bitsPerCompound

        protected int bitsPerCompound()
        Returns how many bits are used to represent a compound e.g. 2 if using 2bit encoding.
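        The arithmetic linking bitsPerCompound() and compoundsPerDatatype() can be sketched as below, assuming every bit of a 32-bit Java int is available for storage. The class PackingCapacitySketch and its methods are hypothetical; in particular, intsNeeded is only a guess at how a backing array could be sized and is not documented behaviour of this class.

```java
// Hypothetical sketch; not taken from the BitArrayWorker source.
class PackingCapacitySketch {

    // How many compounds fit in one int if all 32 bits are used.
    static int compoundsPerInt(int bitsPerCompound) {
        return Integer.SIZE / bitsPerCompound;
    }

    // Number of ints needed to hold 'length' compounds, rounding up.
    static int intsNeeded(int length, int bitsPerCompound) {
        int perInt = compoundsPerInt(bitsPerCompound);
        return (length + perInt - 1) / perInt;
    }

    public static void main(String[] args) {
        System.out.println(compoundsPerInt(2)); // prints 16
        System.out.println(compoundsPerInt(4)); // prints 8
        System.out.println(intsNeeded(17, 2));  // prints 2
    }
}
```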
      • seqArraySize

        public int seqArraySize​(int length)
      • populate

        public void populate​(java.lang.String sequence)
        Loops through the chars in a String and passes each one on to setCompoundAt(char, int)
      • setCompoundAt

        public void setCompoundAt​(char base,
                                  int position)
        Converts from char to Compound and sets it at the given biological index
      • setCompoundAt

        public void setCompoundAt​(C compound,
                                  int position)
        Sets the compound at the specified biological index
      • getCompoundAt

        public C getCompoundAt​(int position)
        Returns the compound at the specified biological index
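        To make the populate, set and get operations concrete, here is a minimal standalone sketch of 2-bit packing into an int array. It is not the library's code: the class PackedStoreSketch is hypothetical, char stands in for a compound, the "ACGT" ordering is arbitrary, and "biological index" is assumed to mean 1-based positions.

```java
// Hypothetical sketch of 2-bit packing; not the BitArrayWorker implementation.
class PackedStoreSketch {
    private static final int BITS = 2;                       // bits per compound
    private static final int PER_INT = Integer.SIZE / BITS;  // 16 compounds per int
    private static final int MASK = 0b11;                    // lowest 2 bits set
    private static final String ALPHABET = "ACGT";           // arbitrary ordering

    private final int[] store;

    PackedStoreSketch(int length) {
        store = new int[(length + PER_INT - 1) / PER_INT];   // round up
    }

    // Walks the string and stores each char at its 1-based position.
    void populate(String sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            setCompoundAt(sequence.charAt(i), i + 1);
        }
    }

    void setCompoundAt(char base, int position) {
        int code = ALPHABET.indexOf(base);
        if (code < 0) {
            throw new IllegalStateException("Cannot encode " + base + " at position " + position);
        }
        int zeroBased = position - 1;                         // assumed 1-based input
        int word = zeroBased / PER_INT;
        int shift = (zeroBased % PER_INT) * BITS;
        // Clear the slot, then write the new code into it.
        store[word] = (store[word] & ~(MASK << shift)) | (code << shift);
    }

    char getCompoundAt(int position) {
        int zeroBased = position - 1;
        int word = zeroBased / PER_INT;
        int shift = (zeroBased % PER_INT) * BITS;
        return ALPHABET.charAt((store[word] >>> shift) & MASK);
    }

    public static void main(String[] args) {
        PackedStoreSketch s = new PackedStoreSketch(4);
        s.populate("GATC");
        System.out.println(s.getCompoundAt(1)); // prints G
    }
}
```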
      • processUnknownCompound

        protected byte processUnknownCompound​(C compound,
                                              int position)
                                       throws java.lang.IllegalStateException
        Since bit encoding only supports a finite number of bases, it is more than likely that when processing a sequence you will encounter a compound which is not covered by the encoding, e.g. N in a 2bit sequence. You can override this method to convert the unknown base into one you can process, or to store the locations of unknown bases for post-processing in your subclass.
        Parameters:
        compound - Compound to process
        Returns:
        Byte representation of the compound
        Throws:
        java.lang.IllegalStateException - Thrown whenever this method is invoked
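        The override pattern described above can be sketched with a pair of standalone stand-in classes. Nothing here is library code: StrictEncoder mimics a default that always throws, LenientEncoder is a hypothetical subclass that records the position and substitutes the compound with code 0, and char stands in for the Compound type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the override pattern; not BitArrayWorker itself.
class UnknownCompoundSketch {

    // Stand-in for the default behaviour: unknown compounds always throw.
    static class StrictEncoder {
        protected byte processUnknownCompound(char compound, int position) {
            throw new IllegalStateException(
                "Unable to encode " + compound + " at position " + position);
        }

        byte encode(char compound, int position) {
            int code = "ACGT".indexOf(compound); // arbitrary 2-bit ordering
            return code >= 0 ? (byte) code : processUnknownCompound(compound, position);
        }
    }

    // Records where unknown compounds occurred and stores them as code 0.
    static class LenientEncoder extends StrictEncoder {
        final List<Integer> unknownPositions = new ArrayList<>();

        @Override
        protected byte processUnknownCompound(char compound, int position) {
            unknownPositions.add(position); // kept for post-processing
            return 0;                       // substitute a compound we can store
        }
    }

    public static void main(String[] args) {
        LenientEncoder encoder = new LenientEncoder();
        System.out.println(encoder.encode('N', 5));    // prints 0
        System.out.println(encoder.unknownPositions);  // prints [5]
    }
}
```

        Substituting a fixed compound loses information unless the recorded positions are consulted later, so whether to substitute, record, or keep throwing depends on what the subclass needs.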
      • getIndexToCompoundsLookup

        protected java.util.List<C> getIndexToCompoundsLookup()
        Returns a list of compounds; the index position of each compound is used to translate from the byte representation back into a compound.
      • getCompoundsToIndexLookup

        protected java.util.Map<C,​java.lang.Integer> getCompoundsToIndexLookup()
        Returns a map which converts from compound to an integer representation
      • getCompoundSet

        public CompoundSet<C> getCompoundSet()
        Returns the compound set backing this store
      • getLength

        public int getLength()
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object
      • equals

        public boolean equals​(java.lang.Object o)
        Overrides:
        equals in class java.lang.Object