Class TagNode

    • Field Detail

      • mDefaultScanner

        protected static final Scanner mDefaultScanner
        The default scanner for non-composite tags.
      • mAttributes

        protected java.util.Vector mAttributes
        The tag attributes. Objects of type Attribute. The first element is the tag name, subsequent elements being either whitespace or real attributes.
      • breakTags

        protected static java.util.Hashtable breakTags
        Set of tags that break the flow.
    • Constructor Detail

      • TagNode

        public TagNode()
        Create an empty tag.
      • TagNode

        public TagNode​(Page page,
                       int start,
                       int end,
                       java.util.Vector attributes)
        Create a tag with the location and attributes provided.
        Parameters:
        page - The page this tag was read from.
        start - The starting offset of this node within the page.
        end - The ending offset of this node within the page.
        attributes - The list of attributes that were parsed in this tag.
        See Also:
        Attribute
      • TagNode

        public TagNode​(TagNode tag,
                       TagScanner scanner)
        Create a tag like the one provided.
        Parameters:
        tag - The tag to emulate.
        scanner - The scanner for this tag.
    • Method Detail

      • getAttribute

        public java.lang.String getAttribute​(java.lang.String name)
        Returns the value of an attribute.
        Specified by:
        getAttribute in interface Tag
        Parameters:
        name - Name of attribute, case insensitive.
        Returns:
        The value associated with the attribute, or null if it does not exist or is a stand-alone attribute.
        See Also:
        Tag.setAttribute(java.lang.String, java.lang.String)
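        The lookup rules described for getAttribute (case-insensitive name match, null when the attribute is missing or has no value) can be sketched without the library. AttributeLookupSketch and its methods are hypothetical stand-ins, not the TagNode implementation, and modeling each attribute as a name/value pair is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tag's attribute list; not the htmlparser implementation.
class AttributeLookupSketch {
    // Each entry is {name, value}; index 0 models the tag name and carries a null value.
    private final List<String[]> attributes = new ArrayList<>();

    AttributeLookupSketch(String tagName) {
        attributes.add(new String[] {tagName, null});
    }

    // Append an attribute; a null value models a stand-alone attribute such as "checked".
    AttributeLookupSketch add(String name, String value) {
        attributes.add(new String[] {name, value});
        return this;
    }

    // Case-insensitive lookup over all entries.
    // Returns null when no entry matches or when the matching entry has no value.
    String getAttribute(String name) {
        for (String[] entry : attributes) {
            if (entry[0] != null && entry[0].equalsIgnoreCase(name)) {
                return entry[1];
            }
        }
        return null;
    }
}
```

        In this model a null result does not tell a missing attribute apart from a stand-alone one, which matches the return description above; a caller that needs the distinction would have to inspect the attribute list itself.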
      • removeAttribute

        public void removeAttribute​(java.lang.String key)
        Remove the attribute with the given key, if it exists.
        Specified by:
        removeAttribute in interface Tag
        Parameters:
        key - The name of the attribute.
      • setAttribute

        public void setAttribute​(java.lang.String key,
                                 java.lang.String value,
                                 char quote)
        Set attribute with given key, value pair where the value is quoted by quote.
        Specified by:
        setAttribute in interface Tag
        Parameters:
        key - The name of the attribute.
        value - The value of the attribute.
        quote - The quote character to be used around value. If zero, it is an unquoted value.
        See Also:
        Tag.getAttribute(java.lang.String)
      • setAttribute

        public void setAttribute​(Attribute attribute)
        Set an attribute. This replaces an attribute of the same name. To set the zeroth attribute (the tag name), use setTagName().
        Parameters:
        attribute - The attribute to set.
      • getAttributesEx

        public java.util.Vector getAttributesEx()
        Gets the attributes in the tag.
        Specified by:
        getAttributesEx in interface Tag
        Returns:
        Returns the list of Attributes in the tag. The first element is the tag name, subsequent elements being either whitespace or real attributes.
        See Also:
        Tag.setAttributesEx(java.util.Vector)
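        The layout of the returned list (tag name first, then a mix of whitespace and real attributes) means callers usually skip index 0 and filter out whitespace. The sketch below illustrates that traversal with a hypothetical Entry type in which a null name marks whitespace; that convention is an assumption of this example and not a statement about the Attribute class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the attribute list layout; not the htmlparser classes.
class AttributeLayoutSketch {
    // Assumed convention for this sketch: a null name marks a whitespace entry.
    static final class Entry {
        final String name;
        final String value;

        Entry(String name, String value) {
            this.name = name;
            this.value = value;
        }

        boolean isWhitespace() {
            return name == null;
        }
    }

    // Collect the names of real attributes, skipping the tag name at index 0
    // and any whitespace entries.
    static List<String> realAttributeNames(List<Entry> entries) {
        List<String> names = new ArrayList<>();
        for (int i = 1; i < entries.size(); i++) {
            Entry entry = entries.get(i);
            if (!entry.isWhitespace()) {
                names.add(entry.name);
            }
        }
        return names;
    }
}
```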
      • getTagName

        public java.lang.String getTagName()
        Return the name of this tag.

        Note: This value is converted to uppercase and does not begin with "/" if it is an end tag. Nor does it end with a slash in the case of an XML type tag. To get at the original text of the tag name use getRawTagName(). The conversion to uppercase is performed with an ENGLISH locale.

        Specified by:
        getTagName in interface Tag
        Returns:
        The tag name.
        See Also:
        Tag.setTagName(java.lang.String)
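        The normalization described in the note (drop a leading "/", drop a trailing "/", uppercase with the ENGLISH locale) can be sketched as a small helper. TagNameSketch.normalize is a hypothetical name, and the order of the steps is an assumption; it illustrates the documented result and is not the library's code.

```java
import java.util.Locale;

// Hypothetical helper mirroring the documented normalization; not the library's code.
class TagNameSketch {
    // Strip a leading "/" (end tag) and a trailing "/" (XML-style empty tag),
    // then uppercase using the ENGLISH locale.
    static String normalize(String rawName) {
        if (rawName == null) {
            return null;
        }
        String name = rawName;
        if (name.startsWith("/")) {
            name = name.substring(1);
        }
        if (name.endsWith("/")) {
            name = name.substring(0, name.length() - 1);
        }
        return name.toUpperCase(Locale.ENGLISH);
    }
}
```

        Passing an explicit locale keeps the result independent of the JVM default. Under a Turkish default locale, for instance, a plain toUpperCase() would map "i" to a dotted capital I, so "title" would no longer compare equal to "TITLE".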
      • getRawTagName

        public java.lang.String getRawTagName()
        Return the name of this tag as it appears in the original text, without the uppercase conversion applied by getTagName().
        Specified by:
        getRawTagName in interface Tag
        Returns:
        The tag name or null if this tag contains nothing or only whitespace.
      • setTagName

        public void setTagName​(java.lang.String name)
        Set the name of this tag. This creates or replaces the first attribute of the tag (the zeroth element of the attribute vector).
        Specified by:
        setTagName in interface Tag
        Parameters:
        name - The tag name.
        See Also:
        Tag.getTagName()
      • setAttributesEx

        public void setAttributesEx​(java.util.Vector attribs)
        Sets the attributes. NOTE: Values of the extended hashtable are two element arrays of String, with the first element being the original name (not uppercased), and the second element being the value.
        Specified by:
        setAttributesEx in interface Tag
        Parameters:
        attribs - The attribute collection to set.
        See Also:
        Tag.getAttributesEx()
      • setTagBegin

        public void setTagBegin​(int tagBegin)
        Sets the nodeBegin.
        Parameters:
        tagBegin - The nodeBegin to set
      • getTagBegin

        public int getTagBegin()
        Gets the nodeBegin.
        Returns:
        The nodeBegin value.
      • setTagEnd

        public void setTagEnd​(int tagEnd)
        Sets the nodeEnd.
        Parameters:
        tagEnd - The nodeEnd to set
      • getTagEnd

        public int getTagEnd()
        Gets the nodeEnd.
        Returns:
        The nodeEnd value.
      • setText

        public void setText​(java.lang.String text)
        Parses the given text to create the tag contents.
        Specified by:
        setText in interface Node
        Overrides:
        setText in class AbstractNode
        Parameters:
        text - A string of the form <TAGNAME xx="yy">.
        See Also:
        Node.getText()
      • toPlainTextString

        public java.lang.String toPlainTextString()
        Get the plain text from this node.
        Specified by:
        toPlainTextString in interface Node
        Specified by:
        toPlainTextString in class AbstractNode
        Returns:
        An empty string (tag contents do not display in a browser). If you want this tag's HTML equivalent, use toHtml().
      • toHtml

        public java.lang.String toHtml​(boolean verbatim)
        Render the tag as HTML.
        Specified by:
        toHtml in interface Node
        Specified by:
        toHtml in class AbstractNode
        Parameters:
        verbatim - If true return as close to the original page text as possible.
        Returns:
        The tag as an HTML fragment.
        See Also:
        Node.toHtml()
      • toString

        public java.lang.String toString()
        Print the contents of the tag.
        Specified by:
        toString in interface Node
        Specified by:
        toString in class AbstractNode
        Returns:
        A string describing the tag. For text that looks like HTML use toHtml().
      • breaksFlow

        public boolean breaksFlow()
        Determines if this tag breaks the flow of text.
        Specified by:
        breaksFlow in interface Tag
        Returns:
        true if following text would start on a new line, false otherwise.
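        One plausible reading of breaksFlow, given the breakTags field above, is a membership test on the uppercased tag name. The sketch below assumes exactly that; BreakTagSketch is a hypothetical class, and the four names in its set are an illustrative subset chosen for the example, since this page does not list the actual contents of breakTags.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical membership test; not the TagNode implementation.
class BreakTagSketch {
    // Illustrative subset only: the real contents of breakTags are not given in this page.
    private static final Set<String> BREAK_TAGS =
        new HashSet<>(Arrays.asList("BR", "P", "DIV", "LI"));

    // True if the (case-insensitive) tag name is in the illustrative set.
    static boolean breaksFlow(String tagName) {
        return tagName != null
            && BREAK_TAGS.contains(tagName.toUpperCase(Locale.ENGLISH));
    }
}
```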
      • accept

        public void accept​(NodeVisitor visitor)
        Default tag visiting code. Based on isEndTag(), calls either visitTag() or visitEndTag().
        Specified by:
        accept in interface Node
        Specified by:
        accept in class AbstractNode
        Parameters:
        visitor - The visitor that is visiting this node.
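        The dispatch described for accept is a small instance of the visitor pattern: the tag inspects isEndTag() and calls one of two callbacks. The types below are hypothetical, reduced stand-ins (the callbacks take a plain tag name here, which is an assumption of this sketch) and are not the NodeVisitor or TagNode classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal visitor with only the two callbacks named above.
interface TagVisitorSketch {
    void visitTag(String tagName);
    void visitEndTag(String tagName);
}

// Hypothetical tag that dispatches on isEndTag(); not the TagNode implementation.
class VisitableTagSketch {
    private final String name;
    private final boolean endTag;

    VisitableTagSketch(String name, boolean endTag) {
        this.name = name;
        this.endTag = endTag;
    }

    boolean isEndTag() {
        return endTag;
    }

    // End tags go to visitEndTag, everything else to visitTag.
    void accept(TagVisitorSketch visitor) {
        if (isEndTag()) {
            visitor.visitEndTag(name);
        } else {
            visitor.visitTag(name);
        }
    }
}

// Records callbacks in the order they arrive, for inspection by the caller.
class RecordingVisitorSketch implements TagVisitorSketch {
    final List<String> events = new ArrayList<>();

    @Override
    public void visitTag(String tagName) {
        events.add("start:" + tagName);
    }

    @Override
    public void visitEndTag(String tagName) {
        events.add("end:" + tagName);
    }
}
```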
      • isEmptyXmlTag

        public boolean isEmptyXmlTag()
        Is this an empty XML tag of the form <tag/>?
        Specified by:
        isEmptyXmlTag in interface Tag
        Returns:
        true if the last character of the last attribute is a '/'.
      • setEmptyXmlTag

        public void setEmptyXmlTag​(boolean emptyXmlTag)
        Set this tag to be an empty XML tag, or not. Adds or removes an ending slash on the tag.
        Specified by:
        setEmptyXmlTag in interface Tag
        Parameters:
        emptyXmlTag - If true, ensures there is an ending slash in the node, i.e. <tag/>, otherwise removes it.
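        The rule stated for isEmptyXmlTag (the last character of the last attribute is a '/') and the add-or-remove behavior of setEmptyXmlTag can be sketched over a plain list of attribute texts. EmptyXmlTagSketch is hypothetical, and appending the slash as its own element is an assumption of this example; the library may store it differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the trailing-slash rule; not the TagNode implementation.
class EmptyXmlTagSketch {
    // Attribute texts in order; index 0 is the tag name.
    private final List<String> parts;

    EmptyXmlTagSketch(String... parts) {
        this.parts = new ArrayList<>(Arrays.asList(parts));
    }

    // True if the last character of the last element is '/'.
    boolean isEmptyXmlTag() {
        if (parts.isEmpty()) {
            return false;
        }
        return parts.get(parts.size() - 1).endsWith("/");
    }

    // Ensure the trailing slash is present (true) or absent (false).
    void setEmptyXmlTag(boolean empty) {
        if (empty == isEmptyXmlTag()) {
            return; // already in the requested state
        }
        if (empty) {
            parts.add("/");
            return;
        }
        int lastIndex = parts.size() - 1;
        String last = parts.get(lastIndex);
        String trimmed = last.substring(0, last.length() - 1);
        if (trimmed.isEmpty()) {
            parts.remove(lastIndex); // the element was a lone "/"
        } else {
            parts.set(lastIndex, trimmed); // the slash was fused, e.g. "img/"
        }
    }

    List<String> parts() {
        return parts;
    }
}
```

        Because the check looks only at the final character, an unquoted attribute value that happens to end in a slash would also satisfy it in this model.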
      • isEndTag

        public boolean isEndTag()
        Predicate to determine if this tag is an end tag (e.g. </HTML>).
        Specified by:
        isEndTag in interface Tag
        Returns:
        true if this tag is an end tag.
      • getStartingLineNumber

        public int getStartingLineNumber()
        Get the line number where this tag starts.
        Specified by:
        getStartingLineNumber in interface Tag
        Returns:
        The (zero based) line number in the page where this tag starts.
      • getEndingLineNumber

        public int getEndingLineNumber()
        Get the line number where this tag ends.
        Specified by:
        getEndingLineNumber in interface Tag
        Returns:
        The (zero based) line number in the page where this tag ends.
      • getIds

        public java.lang.String[] getIds()
        Return the set of names handled by this tag. Since this is a generic tag, it has no ids.
        Specified by:
        getIds in interface Tag
        Returns:
        The names to be matched that create tags of this type.
      • getEnders

        public java.lang.String[] getEnders()
        Return the set of tag names that cause this tag to finish. These are the normal (non-end) tags that, if encountered while scanning a composite tag, will cause the generation of a virtual tag. Since this is a non-composite tag, the default is no enders.
        Specified by:
        getEnders in interface Tag
        Returns:
        The names of following tags that stop further scanning.
      • getEndTagEnders

        public java.lang.String[] getEndTagEnders()
        Return the set of end tag names that cause this tag to finish. These are the end tags that, if encountered while scanning a composite tag, will cause the generation of a virtual tag. Since this is a non-composite tag, it has no end tag enders.
        Specified by:
        getEndTagEnders in interface Tag
        Returns:
        The names of following end tags that stop further scanning.
      • setThisScanner

        public void setThisScanner​(Scanner scanner)
        Set the scanner associated with this tag.
        Specified by:
        setThisScanner in interface Tag
        Parameters:
        scanner - The scanner for this tag.
        See Also:
        Tag.getThisScanner()
      • getEndTag

        public Tag getEndTag()
        Get the end tag for this (composite) tag. For a non-composite tag this always returns null.
        Specified by:
        getEndTag in interface Tag
        Returns:
        The tag that terminates this composite tag, e.g. </HTML>.
        See Also:
        Tag.setEndTag(org.htmlparser.Tag)
      • setEndTag

        public void setEndTag​(Tag end)
        Set the end tag for this (composite) tag. For a non-composite tag this is a no-op.
        Specified by:
        setEndTag in interface Tag
        Parameters:
        end - The tag that terminates this composite tag, e.g. </HTML>.
        See Also:
        Tag.getEndTag()