Package Torello.HTML

Class TextNode

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.CharSequence, java.lang.Cloneable, java.lang.Comparable<TextNode>

    public final class TextNode
    extends HTMLNode
    implements java.lang.CharSequence, java.io.Serializable, java.lang.Cloneable, java.lang.Comparable<TextNode>
    TextNode - Documentation.

    This class is intended to keep the HTMLNode type-hierarchy ultra-simple. The HTMLNode class is abstract, and there are only three child-classes in the type-hierarchy. TextNode is a subclass of HTMLNode, and does not add any instance fields of its own. The other two subclasses are TagNode, which adds two fields and a method, and CommentNode, which adds one field.

    IMPORTANT: Text found between HTML <SCRIPT> ... </SCRIPT> tags, and between HTML <STYLE> ... </STYLE> tags, is simply saved as instances of class 'TextNode'. The rationale is that these HTML Parse, Search, Update and Scrape classes are not yet capable of executing script. As such, no parse is performed on CSS, JavaScript, jQuery, or any of the other languages often found inside HTML documents. Rather than attempting any number of "partial parses," this text is left as an instance of class 'TextNode'; the HTML <SCRIPT> or <STYLE> elements surrounding the text node can still be found easily.

    The three subclasses of abstract class HTMLNode are very light-weight: they contain a number of public methods, but do not have heavy internal state (either static or non-static). Below is a list of the fields that each of the three subclasses adds to the ancestor HTMLNode class:

    • class TagNode adds a field public final boolean isClosing, which tells a user whether this tag has a forward-slash immediately following the '<' (less-than symbol), at character position 2. This is how one identifies the 'closing version' of an element; for instance, '</DIV>' and '</SPAN>' would both have their isClosing fields set to TRUE. There is also a public final String tok field in instances of TagNode that identifies which HTML element the TagNode represents. For example, an HTML element such as <A HREF="http://My.URL.com" TARGET=_blank> would have its String 'tok' field set to 'a'.

    • class TextNode does not add any internal state at all. It has exactly the same internally-maintained fields as its parent class HTMLNode. The inherited public final String str field simply holds the text that this text-node represents.

    • class CommentNode, the third and final class to inherit from HTMLNode, keeps one extra internal field for searching purposes and ease of use: public final String body. This field duplicates part of the string public final String str, which is inherited from the HTMLNode class. The difference is that str includes the HTML comment delimiters <!-- and -->, while body leaves them off, so that the body of a comment can be searched quickly.
    See Also:
    HTMLNode, TagNode, CommentNode, Serialized Form
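
    The field layout described above can be sketched with a few stand-in classes. Everything below is hypothetical and written only for illustration: the names NodeFieldsSketch, Node, Tag, Text and Comment are not part of the library, and the way 'tok' and 'body' are derived from 'str' is an assumption based solely on the descriptions in this page.

```java
import java.util.Locale;

// Hypothetical stand-ins for illustration only; this is NOT the Torello.HTML source.
public class NodeFieldsSketch {

    // Models the abstract parent: a single shared field holding the raw text.
    public static abstract class Node {
        public final String str;
        protected Node(String str) { this.str = str; }
    }

    // Models TagNode: adds 'tok' and 'isClosing' on top of 'str'.
    public static final class Tag extends Node {
        public final String tok;
        public final boolean isClosing;

        public Tag(String str) {
            super(str);
            // A closing tag has '/' immediately after the '<'.
            this.isClosing = str.startsWith("</");
            // Assumption: the element name is the leading run of letters and digits, lower-cased.
            int start = isClosing ? 2 : 1;
            int end = start;
            while (end < str.length() && Character.isLetterOrDigit(str.charAt(end))) end++;
            this.tok = str.substring(start, end).toLowerCase(Locale.ROOT);
        }
    }

    // Models TextNode: no extra state beyond the inherited 'str'.
    public static final class Text extends Node {
        public Text(String str) { super(str); }
    }

    // Models CommentNode: 'body' is 'str' without the "<!--" and "-->" delimiters.
    public static final class Comment extends Node {
        public final String body;

        public Comment(String str) {
            super(str);
            this.body = str.substring("<!--".length(), str.length() - "-->".length());
        }
    }

    public static void main(String[] args) {
        Tag open = new Tag("<A HREF=\"http://My.URL.com\" TARGET=_blank>");
        Tag close = new Tag("</DIV>");
        Comment c = new Comment("<!-- note -->");
        System.out.println(open.tok + " " + open.isClosing);   // a false
        System.out.println(close.tok + " " + close.isClosing); // div true
        System.out.println("[" + c.body + "]");                // [ note ]
    }
}
```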



    • Field Detail

      • serialVersionUID

        public static final long serialVersionUID
        This declares the serialVersionUID that Java strongly recommends for all classes that implement the interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data.
        See Also:
        Constant Field Values
        Code:
        Exact Field Declaration Expression:
        public static final long serialVersionUID = 1;
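
        As a rough illustration of saving state through Java serialization, the sketch below writes an object to a byte array and reads it back. The class Text is a hypothetical stand-in for TextNode (it only mirrors the 'str' field and the serialVersionUID declaration shown above), and the helper roundTrip is likewise invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch {

    // Hypothetical stand-in for TextNode; not the library class.
    public static final class Text implements Serializable {
        public static final long serialVersionUID = 1;
        public final String str;
        public Text(String str) { this.str = str; }
    }

    // Serializes the object into memory, then deserializes a fresh copy from those bytes.
    public static Text roundTrip(Text original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Text) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Text copy = roundTrip(new Text("Hello"));
        System.out.println(copy.str); // Hello
    }
}
```

        Writing to a file instead of a byte array would only require swapping the in-memory streams for FileOutputStream and FileInputStream.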
        
      • comp2

        public static final java.util.Comparator<TextNode> comp2
        This is an "alternative Comparator" that can be used for sorting instances of this class. It should work with the Collections.sort(List, Comparator) method in the standard JDK package java.util.

        NOTE: This version utilizes the standard JDK String.compareToIgnoreCase(String) method.
        See Also:
        HTMLNode.str
        Code:
        Exact Field Declaration Expression:
        public static final Comparator<TextNode> comp2 =
                (TextNode txn1, TextNode txn2) -> txn1.str.compareToIgnoreCase(txn2.str);
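
        A minimal sketch of sorting with such a comparator follows. The class Text is a hypothetical stand-in for TextNode, and the comparator is copied from the declaration above with the types renamed; the helper sortIgnoringCase is invented for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ComparatorSketch {

    // Hypothetical stand-in for TextNode; only the 'str' field is modeled.
    public static final class Text {
        public final String str;
        public Text(String str) { this.str = str; }
    }

    // Same expression as comp2 above, applied to the stand-in class.
    public static final Comparator<Text> comp2 =
        (Text txn1, Text txn2) -> txn1.str.compareToIgnoreCase(txn2.str);

    // Builds nodes from the given strings, sorts them with comp2, and returns the sorted text.
    public static List<String> sortIgnoringCase(String... values) {
        List<Text> nodes = new ArrayList<>();
        for (String v : values) nodes.add(new Text(v));
        Collections.sort(nodes, comp2);

        List<String> result = new ArrayList<>();
        for (Text t : nodes) result.add(t.str);
        return result;
    }

    public static void main(String[] args) {
        // Upper-case 'B' no longer sorts ahead of lower-case 'a'.
        System.out.println(sortIgnoringCase("cherry", "Banana", "apple"));
        // [apple, Banana, cherry]
    }
}
```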
        
    • Constructor Detail

      • TextNode

        public TextNode(java.lang.String s)
        Constructs a new TextNode with internal field String str equal to parameter 's'
        Parameters:
        s - Any valid Java String may be passed here.
    • Method Detail

      • isTextNode

        public boolean isTextNode()
        This method identifies that 'this' instance of 'HTMLNode' is, indeed, actually an instance of the (sub-class) TextNode.
        Overrides:
        isTextNode in class HTMLNode
        Returns:
        This method shall always return TRUE. It overrides the parent-class HTMLNode method isTextNode(), which always returns FALSE.
        See Also:
        isTextNode()
        Code:
        Exact Method Body:
         return true;
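
        The override pattern can be sketched as follows: the parent answers FALSE, the text subclass answers TRUE, and calling code can filter a mixed list without instanceof checks. The classes Node, Text and Tag are hypothetical stand-ins, and the helper joinText is invented for this example.

```java
import java.util.Arrays;
import java.util.List;

public class IsTextNodeSketch {

    // Hypothetical stand-in for HTMLNode: the default answer is false.
    public static abstract class Node {
        public final String str;
        protected Node(String str) { this.str = str; }
        public boolean isTextNode() { return false; }
    }

    // Hypothetical stand-in for TextNode: overrides the answer to true.
    public static final class Text extends Node {
        public Text(String str) { super(str); }
        @Override public boolean isTextNode() { return true; }
    }

    // Hypothetical stand-in for TagNode: inherits the default (false).
    public static final class Tag extends Node {
        public Tag(String str) { super(str); }
    }

    // Concatenates the text of every node that reports itself as a text node.
    public static String joinText(List<Node> nodes) {
        StringBuilder sb = new StringBuilder();
        for (Node n : nodes) {
            if (n.isTextNode()) sb.append(n.str);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Node> page = Arrays.asList(
            new Tag("<B>"), new Text("Hello, "), new Tag("</B>"), new Text("world"));
        System.out.println(joinText(page)); // Hello, world
    }
}
```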
        
      • clone

        public TextNode clone()
        Fulfils the requirements of Java's interface Cloneable. This instantiates a new TextNode with an identical String str field.
        Specified by:
        clone in class HTMLNode
        Returns:
        A new TextNode whose internal fields are identical to this one.
        Code:
        Exact Method Body:
         return new TextNode(str);
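
        A short sketch of what this clone behaviour implies: the copy is a distinct object, but it carries the same 'str' value. The class Text is a hypothetical stand-in for TextNode that mirrors the method body shown above.

```java
public class CloneSketch {

    // Hypothetical stand-in for TextNode; clone() mirrors the body shown above.
    public static final class Text implements Cloneable {
        public final String str;
        public Text(String str) { this.str = str; }

        @Override
        public Text clone() {
            // Java Strings are immutable, so handing the same String to the copy is safe.
            return new Text(str);
        }
    }

    public static void main(String[] args) {
        Text original = new Text("Some text");
        Text copy = original.clone();
        System.out.println(copy != original);             // true: a separate object
        System.out.println(copy.str.equals(original.str)); // true: same text
    }
}
```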
        
      • compareTo

        public int compareTo(TextNode tn)
        Fulfils the requirements of Java's interface Comparable<T>. This does a very simple comparison using the underlying field final String str that all TextNode instances contain.
        Specified by:
        compareTo in interface java.lang.Comparable<TextNode>
        Parameters:
        tn - Any other TextNode to be compared to 'this' TextNode
        Returns:
        An integer that fulfils the requirements of the public int compareTo(T t) method of Java's interface Comparable<T>.
        Code:
        Exact Method Body:
         return this.str.compareTo(tn.str);
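
        Because this comparison delegates to String.compareTo, it is case-sensitive and returns a negative, zero, or positive int. The sketch below illustrates that with a hypothetical stand-in class Text; the helper sign is invented for this example and only reduces the result to -1, 0 or 1.

```java
public class CompareToSketch {

    // Hypothetical stand-in for TextNode; compareTo mirrors the body shown above.
    public static final class Text implements Comparable<Text> {
        public final String str;
        public Text(String str) { this.str = str; }

        @Override
        public int compareTo(Text tn) {
            return this.str.compareTo(tn.str);
        }
    }

    // Reduces the comparison result to -1, 0 or 1 so it is easy to read.
    public static int sign(String a, String b) {
        return Integer.signum(new Text(a).compareTo(new Text(b)));
    }

    public static void main(String[] args) {
        System.out.println(sign("apple", "banana")); // -1
        System.out.println(sign("same", "same"));    // 0
        // Case-sensitive: lower-case 'a' (97) sorts after upper-case 'B' (66).
        System.out.println(sign("apple", "Banana")); // 1
    }
}
```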