Package Torello.HTML

Class Elements


  • public class Elements
    extends java.lang.Object
    Elements - Documentation.

    The reason for including this class may not be obvious. Yes, it is useful for traversing HTML tables and other common elements in Java. However, its main audience is the novice user who does not yet understand how the words "Find" and "Get" relate to HTMLNode Vectors; for such a user, these higher-level search functions may make things easier. Even if the term "TagNode" (roughly the counterpart of a "TextNode") does not yet make much sense, all a programmer needs to do is download an HTML page into a page-Vector of type Vector<HTMLNode>, and then try searching that page for any of the commonly found HTML elements.

    The actual purpose of this class is to show how to use the classes in Node-Search with ease. There are only a few methods (about 10), and each one demonstrates a node-search operation by listing its method body alongside its declaration on this Javadoc page. Think of this as a "work-book."

    JavaScript:
    // NOTE: Mostly, if you are familiar with JavaScript, this will make sense:
    // Java-Script for obtaining the HTML-Content of a divider "<DIV>" element.
    var html    = document.getElementById("main-content").innerHTML; // for-example
    var nodes   = document.getElementsByClassName("article-footer");
    

    A script like the one above essentially translates to calls such as:
    // Java-HTML Scrape Package means of doing the same thing (almost, but not identical)
    Vector<HTMLNode>    subPage = InnerTagGetInclusive.first(some_page, "id", TextComparitor.EQ_CI_TRM, "main-content");
    Vector<TagNode>     tn      = InnerTagGet.all(some_page, "class", TextComparitor.C, "article-footer");
    

    ALWAYS: Node-search methods that use the term "Find" return the node's integer position inside the page Vector, while methods that use the term "Get" return the node itself. There is no CSS-selector counterpart to this difference, primarily because JavaScript's Document Object Model (a.k.a. "the DOM-Tree") is, well, a tree! This package uses array-like Java Vectors instead of trees. Vectors provide a great deal of simplicity when dealing with web-pages that contain readable text: because humans generally think in terms of "sentences" rather than "trees," searching, parsing and even translating content is much easier this way.
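
    The Find / Get distinction can be sketched with nothing but the JDK. In this minimal sketch the page is modeled as a Vector<String> of tag and text tokens; the class FindVsGetSketch and both of its methods are hypothetical stand-ins written for illustration, and are not part of this package.

```java
import java.util.Arrays;
import java.util.Vector;

class FindVsGetSketch
{
    // "Find" style: report WHERE the first match sits in the Vector (-1 if absent).
    public static int findFirst(Vector<String> page, String token)
    {
        for (int i = 0; i < page.size(); i++)
            if (page.get(i).equalsIgnoreCase(token)) return i;

        return -1;
    }

    // "Get" style: report the matching element itself (null if absent).
    public static String getFirst(Vector<String> page, String token)
    {
        int pos = findFirst(page, token);
        return (pos == -1) ? null : page.get(pos);
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList(
            "<html>", "<body>", "Hello", "</body>", "</html>"));

        System.out.println(findFirst(page, "<BODY>"));  // prints 1
        System.out.println(getFirst(page, "<BODY>"));   // prints <body>
    }
}
```

    An index is useful when the page-Vector will be modified in place, while the element itself is more convenient when it only needs to be read.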

    FURTHERMORE: Node-search methods that use the term "Inclusive" retrieve the entire list of nodes (or integer node-pointers) between the opening and closing versions of the tag and attributes that you are searching for. They are analogous to JavaScript's "someElement.innerHTML". If the term "Inclusive" is not present, only the opening TagNode itself, or the opening TagNode's index in the HTML page-Vector, will be returned.
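
    A minimal sketch of what an "Inclusive" search does, again using only the JDK and a Vector<String> of tokens. The class InclusiveSketch, its methods, and the choice to treat both ends of the range as inclusive are assumptions of this sketch; it is not this package's implementation.

```java
import java.util.Arrays;
import java.util.Vector;

class InclusiveSketch
{
    // Returns {start, end} (both inclusive) of the first <tag> ... </tag> span,
    // or null if no complete element exists. Nested elements of the same tag
    // are handled by counting depth.
    public static int[] findInclusive(Vector<String> page, String tag)
    {
        String open = "<" + tag + ">", close = "</" + tag + ">";
        int start = -1, depth = 0;

        for (int i = 0; i < page.size(); i++)
        {
            String node = page.get(i);

            if (node.equalsIgnoreCase(open))
            {
                if (depth == 0) start = i;
                depth++;
            }
            else if (node.equalsIgnoreCase(close) && depth > 0)
            {
                depth--;
                if (depth == 0) return new int[] { start, i };
            }
        }

        return null;
    }

    // "Get" counterpart: copy the nodes inside that range into a new Vector.
    public static Vector<String> getInclusive(Vector<String> page, String tag)
    {
        int[] r = findInclusive(page, tag);
        return (r == null) ? null : new Vector<>(page.subList(r[0], r[1] + 1));
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList(
            "<div>", "<ul>", "<li>", "one", "</li>", "</ul>", "</div>"));

        System.out.println(Arrays.toString(findInclusive(page, "ul"))); // prints [1, 5]
        System.out.println(getInclusive(page, "ul"));
    }
}
```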

    If one calls DotPair dp = Elements.findTable(someHTMLPage); the DotPair that is returned will demarcate the starting and ending positions, within the Vector<HTMLNode>, of the first HTML 'TABLE' structure found on the web-page.

    If one calls Vector<HTMLNode> list = Elements.getOL(someHTMLPage); the Vector that is returned will contain the HTMLNodes, copied from the original Vector someHTMLPage, that comprise the very first HTML 'OL' (Ordered List) element found on the page.

    Static (Functional) API: The methods in this class are all (100%) declared with the Java key-word 'static'. Furthermore, there is no way to obtain an instance of this class, because there are no public (nor private) constructors. Spring Boot's MVC features are *not* utilized because they fly directly in the face of the light-weight data-classes philosophy. The static approach has many advantages over the rather ornate Component-Annotation syntax (@Component, @Service, @Autowired, etc... 'Java Beans'):

    • The methods here use the key-word 'static' which means (by implication) that there is no internal-state. Without any 'internal state' there is no need for constructors in the first place! (This is often the complaint by MVC Programmers).
    • A 'Static' (Functional-Programming) API expects to use fewer data-classes, and light-weight data-classes, making it easier to understand and to program.
    • The Vectorized HTML data-model allows more user-control over HTML parse, search, update & scrape. Also, memory management, memory leakage, and the Java Garbage Collector ought to be intelligible through the 'reuse' of the standard JDK class Vector for storing HTML Web-Page data.

    The power that object-oriented programming extends to a user is (mostly) limited to data-representation. Thinking of "Services" as "Objects" (Spring-MVC, 'Java Beans') is somewhat 'over-applying' the Object Oriented Programming Model. Like most classes in the Java-HTML JAR Library, this class backtracks to a more C-Styled Functional Programming Model (no Objects) by using the key-word static (quite profusely) on all of its methods, and by sticking to Java's well-understood class Vector.

    Internal-State: A user may click on this class' source code (see link below) to view any and all fields defined in this class. A cursory inspection of the code shows that this class has precisely zero internally defined global fields (Spaghetti). All variables used by the methods in this class are local variables only, and therefore this class ought to be thought of as 'state-less'.



    • Method Detail

      • findBody

        public static DotPair findBody​(java.util.Vector<? extends HTMLNode> html)
        Retrieves the start and end points of the web-page body in the underlying HTML page-Vector. All nodes between <BODY> ... </BODY> will be included.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        InnerTagFindInclusive
        Code:
        Exact Method Body:
         return InnerTagFindInclusive.first(html, "body");
        
      • getBody

        public static java.util.Vector<HTMLNode> getBody
                    (java.util.Vector<? extends HTMLNode> html)
        
        Gets the nodes of the web-page body. All nodes between <BODY> ... </BODY> will be included.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        InnerTagGetInclusive
        Code:
        Exact Method Body:
         return InnerTagGetInclusive.first(html, "body");
        
      • findHead

        public static DotPair findHead​(java.util.Vector<? extends HTMLNode> html)
        Retrieves the start and end points of the web-page header in the underlying HTML page-Vector. All nodes between <HEAD> ... </HEAD> will be included.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        InnerTagFindInclusive
        Code:
        Exact Method Body:
         return InnerTagFindInclusive.first(html, "head");
        
      • getHead

        public static java.util.Vector<HTMLNode> getHead
                    (java.util.Vector<? extends HTMLNode> html)
        
        Gets the nodes of the web-page header. All nodes between <HEAD> ... </HEAD> will be included.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        InnerTagGetInclusive
        Code:
        Exact Method Body:
         return InnerTagGetInclusive.first(html, "head");
        
      • findMeta

        public static int[] findMeta​(java.util.Vector<? extends HTMLNode> html)
        Gets all <META NAME="..." CONTENT="..."> (or <META CHARSET="..."> and <META HTTP-EQUIV="...">) elements in a web-page header - returned via their position in the page-Vector.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML Elements, as an integer-array list of index-pointers to the underlying Vector.
        See Also:
        TagNodeFind
        Code:
        Exact Method Body:
         return TagNodeFind.all(html, TC.OpeningTags, "meta");
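
        The idea of returning every match as an int[] of Vector positions can be sketched as follows. This is only an illustration over a Vector<String> of tokens: the class FindAllSketch and its methods are hypothetical, and the simple prefix test used to recognize an opening tag is an assumption of the sketch, not the way TagNodeFind works internally.

```java
import java.util.Arrays;
import java.util.Vector;
import java.util.stream.IntStream;

class FindAllSketch
{
    // True if 'node' is an opening tag named 'tag', e.g. <meta ...> for "meta".
    public static boolean isOpeningTag(String node, String tag)
    {
        String lower  = node.toLowerCase();
        String prefix = "<" + tag.toLowerCase();

        if (!lower.startsWith(prefix) || lower.length() == prefix.length()) return false;

        // The tag name must end here, so that "<metadata>" does not match "meta"
        char next = lower.charAt(prefix.length());
        return next == '>' || next == '/' || Character.isWhitespace(next);
    }

    // Collects the Vector index of every matching opening tag, in page order.
    public static int[] findAll(Vector<String> page, String tag)
    {
        return IntStream.range(0, page.size())
            .filter(i -> isOpeningTag(page.get(i), tag))
            .toArray();
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList(
            "<head>", "<meta charset=\"utf-8\">", "<title>", "T", "</title>",
            "<META name=\"author\" content=\"x\">", "</head>"));

        System.out.println(Arrays.toString(findAll(page, "meta"))); // prints [1, 5]
    }
}
```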
        
      • getMeta

        public static java.util.Vector<TagNode> getMeta
                    (java.util.Vector<? extends HTMLNode> html)
        
        Gets all <META NAME="..." CONTENT="..."> (or <META CHARSET="..."> and <META HTTP-EQUIV="...">) elements in a web-page header.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML Elements, as TagNodes, in the returned Vector.
        See Also:
        TagNodeGet
        Code:
        Exact Method Body:
         return TagNodeGet.all(html, TC.OpeningTags, "meta");
        
      • findLink

        public static int[] findLink​(java.util.Vector<? extends HTMLNode> html)
        Gets all <LINK REL="..." HREF="..."> elements in a web-page header - returned via their position in the page-Vector.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML Elements, as an integer-array list of index-pointers to the underlying Vector.
        See Also:
        TagNodeFind
        Code:
        Exact Method Body:
         return TagNodeFind.all(html, TC.OpeningTags, "link");
        
      • getLink

        public static java.util.Vector<TagNode> getLink
                    (java.util.Vector<? extends HTMLNode> html)
        
        Gets all <LINK REL="..." HREF="..."> elements in a web-page header.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML Elements, as TagNodes, in the returned Vector.
        See Also:
        TagNodeGet
        Code:
        Exact Method Body:
         return TagNodeGet.all(html, TC.OpeningTags, "link");
        
      • findTitle

        public static DotPair findTitle​(java.util.Vector<? extends HTMLNode> html)
        Returns the start and end positions in the page-Vector of the HTML <TITLE>...</TITLE> elements.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "title");
        
      • getTitle

        public static java.util.Vector<HTMLNode> getTitle
                    (java.util.Vector<? extends HTMLNode> html)
        
        Returns the <TITLE>...</TITLE> elements sub-list from the HTML page-Vector.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "title");
        
      • titleString

        public static java.lang.String titleString​
                    (java.util.Vector<? extends HTMLNode> html)
        
        Retrieves the 'TITLE' of an HTML page by getting the String-text between the opening and closing 'TITLE' tags. Returns the String encapsulated by the HTML 'HEAD'-section's "<TITLE>...</TITLE>" element, if there is such an element. If there is no such element, null is returned. If there is a 'TITLE' element, but it contains the empty-String (zero-length String), an empty String is returned.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The title string, or null if the page has no 'TITLE' element.
        Code:
        Exact Method Body:
         Vector<HTMLNode> title = getTitle(html);
        
         if (title == null)      return null;
                
         return Util.textNodesString(title);
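
        The three documented outcomes (a title, an empty String, or null) can be sketched with plain JDK types. The class TitleStringSketch below is a hypothetical stand-in that works on a Vector<String> of tokens; it mirrors the behavior described above under that assumption and is not the code of Util.textNodesString.

```java
import java.util.Arrays;
import java.util.Vector;

class TitleStringSketch
{
    // Returns the text between <title> and </title>; null when there is no such
    // element, and "" when the element is present but empty.
    public static String titleString(Vector<String> page)
    {
        int start = -1;

        for (int i = 0; i < page.size(); i++)
        {
            String node = page.get(i);

            if (node.equalsIgnoreCase("<title>")) start = i;

            else if (node.equalsIgnoreCase("</title>") && start != -1)
            {
                // Concatenate every non-tag token between the two tags
                StringBuilder sb = new StringBuilder();

                for (String s : page.subList(start + 1, i))
                    if (!s.startsWith("<")) sb.append(s);

                return sb.toString();
            }
        }

        return null; // no complete TITLE element on this page
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>(Arrays.asList(
            "<head>", "<title>", "My Page", "</title>", "</head>"));

        System.out.println(titleString(page)); // prints My Page
    }
}
```

        Callers should therefore test the result against null before using it, since a missing 'TITLE' element and an empty one are reported differently.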
        
      • findTable

        public static DotPair findTable​(java.util.Vector<? extends HTMLNode> html)
        This method will find the very first HTML 'TABLE' (<TABLE> <TR> <TH>...</TH> </TR> <TR> <TD>..</TD> ... </TR> ... </TABLE>) element set. It returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than references to the nodes themselves. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "table");
        
      • findTable

        public static DotPair findTable​(java.util.Vector<? extends HTMLNode> html,
                                        int sPos,
                                        int ePos)
        This method will find the very first HTML 'TABLE' (<TABLE> <TR> <TH>...</TH> </TR> <TR> <TD>..</TD> ... </TR> ... </TABLE>) element set located between Vector positions sPos (inclusive) and ePos (exclusive). It returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than references to the nodes themselves. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or greater than or equal to the size of the input Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, sPos, ePos, "table");
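
        The sPos / ePos rules listed under "Throws" can be sketched as a small range check. The class RangeCheckSketch and its method checkRange are hypothetical names introduced for illustration; the sketch only restates the documented rules and is not the check this package actually runs.

```java
class RangeCheckSketch
{
    // Validates (sPos, ePos) against a Vector of the given size and returns the
    // effective, exclusive end position.
    public static int checkRange(int size, int sPos, int ePos)
    {
        // A negative ePos is first reset to the size of the Vector
        if (ePos < 0) ePos = size;

        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos out of range: " + sPos);

        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos out of range: " + ePos);

        if (sPos > ePos)
            throw new IndexOutOfBoundsException(
                "sPos " + sPos + " is larger than ePos " + ePos);

        return ePos;
    }

    public static void main(String[] args)
    {
        System.out.println(checkRange(10, 2, -1)); // prints 10
        System.out.println(checkRange(10, 2, 5));  // prints 5

        try { checkRange(10, 10, -1); }
        catch (IndexOutOfBoundsException e)
        { System.out.println("rejected: " + e.getMessage()); }
    }
}
```

        Passing -1 for ePos is thus a convenient way to say "search from sPos to the end of the page-Vector."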
        
      • getTable

        public static java.util.Vector<HTMLNode> getTable
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will get the very first HTML 'TABLE' (<TABLE> <TR> <TH>...</TH> </TR> <TR> <TD>..</TD> ... </TR> ... </TABLE>) element set. This returns a sub-Vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "table");
        
      • getTable

        public static java.util.Vector<HTMLNode> getTable
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will get the very first HTML 'TABLE' (<TABLE> <TR> <TH>...</TH> </TR> <TR> <TD>..</TD> ... </TR> ... </TABLE>) element set located between Vector positions sPos (inclusive) and ePos (exclusive). This returns a sub-Vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or greater than or equal to the size of the input Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The requested HTML sublist, as a Vector.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, sPos, ePos, "table");
        
      • findSelect

        public static DotPair findSelect​
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will find the very first HTML 'SELECT-OPTION' element set (<SELECT> ... <OPTION> ... </OPTION> ... </SELECT>). It returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than references to the nodes themselves. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "select");
        
      • findSelect

        public static DotPair findSelect​
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will find the very first HTML 'SELECT-OPTION' element set (<SELECT> ... <OPTION> ... </OPTION> ... </SELECT>) located between Vector positions sPos (inclusive) and ePos (exclusive). It returns the starting and ending Vector positions (DotPair.start, DotPair.end) rather than references to the nodes themselves. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or greater than or equal to the size of the input Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, sPos, ePos, "select");
        
      • getSelect

        public static java.util.Vector<HTMLNode> getSelect
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will get the very first HTML 'SELECT-OPTION' element set (<SELECT> ... <OPTION> ... </OPTION> ... </SELECT>). This returns a sub-Vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "select");
        
      • getSelect

        public static java.util.Vector<HTMLNode> getSelect
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will get the very first HTML 'SELECT-OPTION' element set (<SELECT> ... <OPTION> ... </OPTION> ... </SELECT>) located between Vector positions sPos (inclusive) and ePos (exclusive). This returns a sub-Vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or greater than or equal to the size of the input Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The requested HTML sublist, as a Vector.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, sPos, ePos, "select");
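
        The three conditions listed under "Throws" for the 'sPos' and 'ePos' parameters can be sketched in plain Java. This is only an illustration of the documented rules, not the package's own code: the class name RangeRulesSketch and the method normalize are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical helper (not part of Torello.HTML): it only mirrors the documented index rules.
final class RangeRulesSketch
{
    // Returns the effective, exclusive end position after applying the documented rules.
    static int normalize(List<?> html, int sPos, int ePos)
    {
        int size = html.size();

        // A negative 'ePos' is first reset to the size of the Vector.
        if (ePos < 0) ePos = size;

        // 'sPos' must be a valid index into the Vector.
        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos out of range: " + sPos);

        // 'ePos' may not be zero, nor greater than the size of the Vector.
        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos out of range: " + ePos);

        // 'sPos' may not be a larger integer than 'ePos'.
        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos " + sPos + " is larger than ePos " + ePos);

        return ePos;
    }

    public static void main(String[] args)
    {
        // Plain strings stand in for HTMLNode objects here.
        Vector<String> page = new Vector<String>(Arrays.asList(
            "<select>", "<option>", "A", "</option>", "</select>"
        ));

        System.out.println(normalize(page, 0, -1)); // negative ePos is reset to page.size()
        System.out.println(normalize(page, 1, 4));  // an in-range pair is returned unchanged
    }
}
```

        Passing -1 for 'ePos' is therefore a convenient way of saying "search to the end of the Vector."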
        
      • findUL

        public static DotPair findUL​(java.util.Vector<? extends HTMLNode> html)
        This method will find the very first HTML Un-Ordered List (<UL> ..<LI>...</LI> ... </UL>) element set. This returns the Vector Position starting and ending boundaries DotPair.start, DotPair.end rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "ul");
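
        To make the 'FIND' versus 'GET' distinction concrete, here is a self-contained sketch that uses plain strings in place of HTMLNode objects. IndexPair, find and get are hypothetical stand-ins for DotPair, TagNodeFindInclusive.first and TagNodeGetInclusive.first; the real classes operate on parsed nodes and may differ in detail. In particular, treating the 'end' index as inclusive is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical illustration only; not the package's implementation.
final class FindVersusGetSketch
{
    // Stand-in for DotPair: a start index and an (assumed inclusive) end index.
    static final class IndexPair
    {
        final int start, end;
        IndexPair(int start, int end) { this.start = start; this.end = end; }
    }

    // "FIND": locate the first <tag> ... </tag> range, counting nested tags of the same name.
    static IndexPair find(Vector<String> page, String tag)
    {
        String open = "<" + tag + ">", close = "</" + tag + ">";
        int start = -1, depth = 0;

        for (int i = 0; i < page.size(); i++)
        {
            String s = page.elementAt(i);

            if (s.equalsIgnoreCase(open))
            {
                if (depth == 0) start = i;
                depth++;
            }
            else if (s.equalsIgnoreCase(close) && depth > 0)
            {
                if (--depth == 0) return new IndexPair(start, i);
            }
        }

        return null; // no complete element was found
    }

    // "GET": the same search, but the nodes themselves are copied into a new Vector.
    static Vector<String> get(Vector<String> page, String tag)
    {
        IndexPair dp = find(page, tag);
        if (dp == null) return null;
        return new Vector<String>(page.subList(dp.start, dp.end + 1));
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<String>(Arrays.asList(
            "<p>", "intro", "</p>", "<ul>", "<li>", "one", "</li>", "</ul>", "<p>", "end", "</p>"
        ));

        IndexPair dp = find(page, "ul");
        System.out.println(dp.start + " .. " + dp.end); // positions of <ul> and </ul>
        System.out.println(get(page, "ul"));            // the nodes between them, inclusive
    }
}
```

        A 'FIND' result stays valid only while the original Vector is unmodified, whereas a 'GET' result is an independent copy of the matched nodes.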
        
      • findUL

        public static DotPair findUL​(java.util.Vector<? extends HTMLNode> html,
                                     int sPos,
                                     int ePos)
        This method will find the very first HTML Un-Ordered List (<UL> ..<LI>...</LI> ... </UL>) element set. This returns the Vector Position starting and ending boundaries DotPair.start, DotPair.end rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, sPos, ePos, "ul");
        
      • getUL

        public static java.util.Vector<HTMLNode> getUL
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will find the very first HTML Un-Ordered List (<UL> ..<LI>...</LI> ... </UL>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "ul");
        
      • getUL

        public static java.util.Vector<HTMLNode> getUL
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will find the very first HTML Un-Ordered List (<UL> ..<LI>...</LI> ... </UL>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a Vector / array starting and ending indices pair). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The requested HTML sublist, as a Vector.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, sPos, ePos, "ul");
        
      • findOL

        public static DotPair findOL​(java.util.Vector<? extends HTMLNode> html)
        This method will find the very first HTML Ordered List (<OL> ... <LI> ... </LI> ... </OL>) element set. This returns the Vector Position starting and ending boundaries DotPair.start, DotPair.end rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, "ol");
        
      • findOL

        public static DotPair findOL​(java.util.Vector<? extends HTMLNode> html,
                                     int sPos,
                                     int ePos)
        This method will find the very first HTML Ordered List (<OL> ... <LI> ... </LI> ... </OL>) element set. This returns the Vector Position starting and ending boundaries DotPair.start, DotPair.end rather than pointer-references to the nodes. This is what the 'FIND' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The start and end index pointers, as a DotPair, of the requested HTML sublist.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeFindInclusive
        Code:
        Exact Method Body:
         return TagNodeFindInclusive.first(html, sPos, ePos, "ol");
        
      • getOL

        public static java.util.Vector<HTMLNode> getOL
                    (java.util.Vector<? extends HTMLNode> html)
        
        This method will find the very first HTML Ordered List (<OL> ... <LI> ... </LI> ... </OL>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a pair of starting and ending Vector indices). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        Returns:
        The requested HTML sublist, as a Vector.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, "ol");
        
      • getOL

        public static java.util.Vector<HTMLNode> getOL
                    (java.util.Vector<? extends HTMLNode> html,
                     int sPos,
                     int ePos)
        
        This method will find the very first HTML Ordered List (<OL> ... <LI> ... </LI> ... </OL>) element set. This returns a sub-vector (an actual Vector<HTMLNode> object, not a pair of starting and ending Vector indices). This is what the 'GET' keyword usually means in this HTML-Scrape package.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
        Returns:
        The requested HTML sublist, as a Vector.
        Throws:
        java.lang.IndexOutOfBoundsException - This exception shall be thrown if any of the following are true:

        • If 'sPos' is negative, or if sPos is greater-than-or-equal-to the size of the Vector
        • If 'ePos' is zero, or greater than the size of the Vector
        • If the value of 'sPos' is a larger integer than 'ePos'. If 'ePos' was negative, it is first reset to Vector.size(), before this check is done.
        See Also:
        TagNodeGetInclusive
        Code:
        Exact Method Body:
         return TagNodeGetInclusive.first(html, sPos, ePos, "ol");
        
      • findAllOption

        public static java.util.Vector<DotPair> findAllOption
                    (java.util.Vector<? extends HTMLNode> selectList)
                throws MalformedHTMLException
        
        This will use the "L1 Inclusive" concept defined in this HTML package to provide a list (returned using the type: java.util.Vector<DotPair>) of each element that fits the <OPTION> ... </OPTION> HTML "select-option element" structure.
        Parameters:
        selectList - An HTML list of TagNode's and TextNode's that constitute a select-option drop-down menu. This list cannot contain extraneous TagNode's or TextNode's, but rather, must begin and end with the open and close "select" HTML drop-down menu Tags.
        Returns:
        A "list of lists" - specifically, a list of Torello.HTML.DotPair, each of which delineates a complete <OPTION> ... </OPTION> sub-list present within this HTML "select" drop-down-menu structure.
        Throws:
        MalformedHTMLException - This method in no way performs a complete evaluation of the HTML structure provided by the user in the Vector<? extends HTMLNode> list parameter that is passed. However rules that are related to the HTML elements "Select Option" <SELECT>...<OPTION> ... </OPTION> ... </SELECT> are inspected.

        • If the passed list parameter does not start and end with the exact HTML elements - <SELECT>, </SELECT> , then this exception is thrown.
        • If the passed list parameter contains "extraneous HTML tags" or "extraneous text" in between the <OPTION> ... </OPTION> or <SELECT> ... </SELECT> list-start and list-end demarcated HTML TagNodes, then the Torello.HTML.MalformedHTMLException will, again, be thrown
        See Also:
        checkEndPoints(Vector, String[]), checkL1(Vector, Vector), TagNodeFindL1Inclusive
        Code:
        Exact Method Body:
         checkEndPoints(selectList, "select");
         Vector<DotPair> ret = TagNodeFindL1Inclusive.all(selectList, "option");
         checkL1(selectList, ret);
         return ret;
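
        The "L1 Inclusive" idea is not spelled out in this section, so the following sketch rests on an assumption: that it means collecting every complete <OPTION> ... </OPTION> range sitting one level beneath the enclosing <SELECT>. Plain strings stand in for HTMLNode objects, int[] { start, end } stands in for DotPair (with 'end' assumed inclusive), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical illustration only; not the package's TagNodeFindL1Inclusive.
final class LevelOneSketch
{
    // Returns { start, end } pairs (end inclusive) for each <tag> ... </tag> range that lies
    // between the enclosing tags at positions 0 and size()-1 and is not nested inside
    // another element of the same name.
    static Vector<int[]> findAll(Vector<String> list, String tag)
    {
        String open = "<" + tag + ">", close = "</" + tag + ">";
        Vector<int[]> ret = new Vector<int[]>();
        int start = -1, depth = 0;

        // Skip the enclosing open and close tags themselves.
        for (int i = 1; i < list.size() - 1; i++)
        {
            String s = list.elementAt(i);

            if (s.equalsIgnoreCase(open))
            {
                if (depth++ == 0) start = i;
            }
            else if (s.equalsIgnoreCase(close) && depth > 0)
            {
                if (--depth == 0) ret.add(new int[] { start, i });
            }
        }

        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> selectList = new Vector<String>(Arrays.asList(
            "<select>",
            "<option>", "Red", "</option>",
            "<option>", "Blue", "</option>",
            "</select>"
        ));

        for (int[] pair : findAll(selectList, "option"))
            System.out.println(pair[0] + " .. " + pair[1]);
    }
}
```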
        
      • getAllOption

        public static java.util.Vector<java.util.Vector<HTMLNode>> getAllOption​
                    (java.util.Vector<? extends HTMLNode> selectList)
                throws MalformedHTMLException
        
        This does the exact same thing as findAllOption(Vector) but the returned value is converted from "sublist endpoints" (a vector of start/end pairs), and into a "List of Sub-Lists", which is specifically a list (java.util.Vector<>) containing sub-lists (also: java.util.Vector<HTMLNode>)

        NOTE: All of the rules and conditions explained in the comments for method findAllOption(Vector) apply to this method as well.
        Parameters:
        selectList - An HTML list of TagNode's and TextNode's that constitute a select-option drop-down menu. This list cannot contain extraneous TagNode's or TextNode's, but rather, must begin and end with the open and close "select" HTML drop-down menu Tags.
        Returns:
        A "list of lists" - specifically, a list of java.util.Vector<HTMLNode> (sublists), each of which holds a complete <OPTION> ... </OPTION> sub-list present within this HTML "select" drop-down-menu structure.
        Throws:
        MalformedHTMLException - This method in no way performs a complete evaluation of the HTML structure provided by the user in the Vector<? extends HTMLNode> list parameter that is passed. However rules that are related to the HTML elements "Select Option" <SELECT>...<OPTION> ... </OPTION> ... </SELECT> are inspected.

        • If the passed list parameter does not start and end with the exact HTML elements - <SELECT>, </SELECT>, then this exception is thrown.
        • If the passed list parameter contains "extraneous HTML tags" or "extraneous text" in between the <OPTION> ... </OPTION> or <SELECT> ... </SELECT> list-start and list-end demarcated HTML TagNodes, then the Torello.HTML.MalformedHTMLException will, again, be thrown
        See Also:
        DotPair.toVectors(Vector, Vector)
        Code:
        Exact Method Body:
         return DotPair.toVectors(selectList, findAllOption(selectList));
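
        The conversion from "sublist endpoints" to a "List of Sub-Lists" can be pictured with the short sketch below. It is not the code of DotPair.toVectors; the class ToVectorsSketch is hypothetical, int[] { start, end } stands in for DotPair, and the 'end' index is assumed to be inclusive.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical illustration only; the real conversion is DotPair.toVectors(Vector, Vector).
final class ToVectorsSketch
{
    // Copies each { start, end } range (end inclusive) out of 'list' into its own Vector.
    static Vector<Vector<String>> toVectors(Vector<String> list, Vector<int[]> pairs)
    {
        Vector<Vector<String>> ret = new Vector<Vector<String>>();

        for (int[] p : pairs)
            ret.add(new Vector<String>(list.subList(p[0], p[1] + 1)));

        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> selectList = new Vector<String>(Arrays.asList(
            "<select>",
            "<option>", "Red", "</option>",
            "<option>", "Blue", "</option>",
            "</select>"
        ));

        Vector<int[]> pairs = new Vector<int[]>();
        pairs.add(new int[] { 1, 3 });
        pairs.add(new int[] { 4, 6 });

        // Each printed line is one <option> ... </option> sub-list.
        for (Vector<String> sub : toVectors(selectList, pairs))
            System.out.println(sub);
    }
}
```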
        
      • findAllLI

        public static java.util.Vector<DotPair> findAllLI
                    (java.util.Vector<? extends HTMLNode> list)
                throws MalformedHTMLException
        
        This will use the "L1 Inclusive" concept defined in this HTML package to provide a list (returned using the type: java.util.Vector<DotPair>) of each element that fits the <LI> ... </LI> HTML "list element" structure.
        Parameters:
        list - An HTML list of TagNode's and TextNode's that constitute an ordered or unordered list. This list cannot contain extraneous TagNode's or TextNode's, but rather, must begin and end with the open and close list Tags.
        Returns:
        A "list of lists" - specifically, a list of Torello.HTML.DotPair, each of which delineates a complete <LI> ... </LI> sub-list present within this HTML list structure.
        Throws:
        MalformedHTMLException - This method in no way performs a complete evaluation of the HTML structure provided by the user in the Vector<? extends HTMLNode> list parameter that is passed. However rules that are related to the HTML elements "Ordered List" <OL>...</OL> and "unordered list" <UL>...</UL> are inspected.

        • If the passed list parameter does not start and end with the same HTML element - specifically <OL> ... </OL> or <UL> ... </UL> - then this exception is thrown.
        • If the passed list parameter contains "extraneous HTML tags" or "extraneous text" in between the <OL> or <UL> ... </OL> or </UL> list-start and list-end demarcated HTML TagNodes, then the Torello.HTML.MalformedHTMLException will, again, be thrown
        See Also:
        checkEndPoints(Vector, String[]), checkL1(Vector, Vector), TagNodeFindL1Inclusive
        Code:
        Exact Method Body:
         checkEndPoints(list, "ol", "ul");
         Vector<DotPair> ret = TagNodeFindL1Inclusive.all(list, "li");
         checkL1(list, ret);
         return ret;
        
      • getAllLI

        public static java.util.Vector<java.util.Vector<HTMLNode>> getAllLI​
                    (java.util.Vector<? extends HTMLNode> list)
                throws MalformedHTMLException
        
        This does the exact same thing as findAllLI(Vector) but the returned value is converted from "sublist endpoints" (a vector of start/end pairs), and into a "List of Sub-Lists", which is specifically a list (java.util.Vector<>) containing sub-lists (also: java.util.Vector<HTMLNode>)

        NOTE: All of the rules and conditions explained in the comments for method findAllLI(Vector) apply to this method as well.
        Parameters:
        list - An HTML list of TagNode's and TextNode's that constitute an ordered or unordered list. This list cannot contain extraneous TagNode's or TextNode's, but rather, must begin and end with the open and close list Tags.
        Returns:
        A "list of lists" - specifically, a list of java.util.Vector<HTMLNode> (sublists), each of which holds a complete <LI> ... </LI> sub-list present within this HTML list structure.
        Throws:
        MalformedHTMLException - This method in no way performs a complete evaluation of the HTML structure provided by the user in the Vector<? extends HTMLNode> list parameter that is passed. However rules that are related to the HTML elements "Ordered List" (<OL>...</OL>) and "unordered list" (<UL>...</UL>) are inspected.

        • If the passed list parameter does not start and end with the same HTML element - specifically <OL> ... </OL> or <UL> ... </UL> - then this exception is thrown.
        • If the passed list parameter contains "extraneous HTML tags" or "extraneous text" in between the <OL> or <UL> ... </OL> or </UL> list-start and list-end demarcated HTML TagNode's, then the Torello.HTML.MalformedHTMLException will, again, be thrown.
        See Also:
        DotPair.toVectors(Vector, Vector)
        Code:
        Exact Method Body:
         return DotPair.toVectors(list, findAllLI(list));
        
      • checkEndPoints

        protected static java.lang.String checkEndPoints​
                    (java.util.Vector<? extends HTMLNode> list,
                     java.lang.String... tokList)
                throws MalformedHTMLException
        
        This method is used to guarantee precisely two conditions to the passed HTML Tag list.

        • Condition 1: The Vector<HTMLNode> list parameter begins and ends with the exact same HTML Tag, (for instance: <H1> ... </H1>, or perhaps <LI> ... </LI> )
        • Condition 2: The HTML-Tag that is found at the start and end of this list is one contained within the 'tokList' variable-length String-array parameter. (If the 'tokList' parameter were java.lang.String[] tokList = { "th", "tr" }, then the passed "HTMLNode list" (Vector) parameter would have to begin and end with either <TH> ... </TH> or <TR> ... </TR>.)

        Much of the java code in this method is used to provide some explanatory Exception message information.
        Parameters:
        list - This is supposed to be a typical "open" and "close" HTML TagNode structure. It may be anything including: <DIV ID="..."> ... </DIV> , or <TABLE ...> ... </TABLE> , or even <BODY> ... </BODY>
        tokList - This is expected to be the set of possible tokens with which this HTML list may begin and end.
        Returns:
        If the passed list parameter passes both the conditions specified above, then the token from the list of tokens that were provided is returned.

        NOTE: If the list does not meet these conditions, a Torello.HTML.MalformedHTMLException will be thrown with an explanatory exception-message (and, obviously, the method will not return anything!)
        Throws:
        MalformedHTMLException - Some explanatory information is provided to the coder for what has failed with the input list.
        Code:
        Exact Method Body:
         return checkEndPoints(list, 0, list.size()-1, tokList);
        
      • checkEndPoints

        protected static java.lang.String checkEndPoints​
                    (java.util.Vector<? extends HTMLNode> list,
                     int sPos,
                     int ePos,
                     java.lang.String... tokList)
                throws MalformedHTMLException
        
        This method does, functionally, the exact same thing as checkEndPoints(Vector, String[]), but with the endpoints specified explicitly. It is kept with protected access since it might be unclear which endpoints are being checked. The exception-message strings are typed out once, in this method's body, and the previous method simply delegates here rather than repeating them. The only difference is that this method does not assume list.elementAt(0) and list.elementAt(list.size()-1) as the starting and ending points.
        Parameters:
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index of the right-most Vector-position to inspect inside the input Vector-parameter. For this method the value is 'inclusive': the method body reads list.elementAt(ePos) directly, so the HTMLNode at this Vector-index is expected to be the closing TagNode.

        NOTE: If this value is negative, or greater-than-or-equal-to the size of the input Vector-parameter, the call to list.elementAt(ePos) will throw an exception.
        tokList - The list of valid HTML Element names (tokens).
        Throws:
        MalformedHTMLException
        See Also:
        checkEndPoints(Vector, String[])
        Code:
        Exact Method Body:
         HTMLNode n = null;		String tok = null;
                
         if ((n = list.elementAt(sPos)).isTagNode())
             tok = ((TagNode) n).tok;
         else throw new MalformedHTMLException(
             "This list does not begin an HTML TagNode, but rather a: " +
             n.getClass().getName() + "\n" + n.str
         );
                
         if (! (n = list.elementAt(ePos)).isTagNode())
             throw new MalformedHTMLException(
                 "This list does not end with an HTML TagNode, but rather a : " +
                 n.getClass().getName() + "\n" + n.str
             );
        
         if (! ((TagNode) n).tok.equals(tok))
             throw new MalformedHTMLException(
                 "This list does not begin and end with the same HTML TagNode:\n" +
                 "[OpeningTag: " + tok + "]\t[ClosingTag: " + ((TagNode) n).tok + "]"
             );
        
         for (String t : tokList) if (t.equals(tok)) return tok;
        
         String expectedTokList = "";
         for (String t: tokList) expectedTokList += " " + t;
        
         throw new MalformedHTMLException(
             "The opening and closing HTML Tag tokens for this list are not members of the " +
             "tokList parameter set...\n" +
             "Expected HTML Tag List: " + expectedTokList + "\nFound Tag: " + tok
         );
        
      • checkL1

        protected static void checkL1​(java.util.Vector<? extends HTMLNode> list,
                                      java.util.Vector<DotPair> sublists)
                               throws MalformedHTMLException
        This checks that the sublists demarcated by the Vector<DotPair> 'sublists' parameter are properly formatted HTML. It is easier to show an example of "proper HTML formatting" and "improper HTML formatting" here than to explain the rule in English.

        PROPER HTML:

        HTML Elements:
         <UL>
         	<LI> This is a list element.</LI>
         	<LI> This is another list element.</LI>
         	<LI> This list element contains <B><I> extra-tags</I></B> like "bold", "italics", and
               even a <A HREF="http://Torello.Directory">link!</A></LI>
         </UL>
         
        

        IMPROPER HTML:

        HTML Elements:
         <UL>
         This text should not be here, and constitutes "malformed HTML"
         <LI> This LI element is just fine.</LI>
         <A HREF="http://ChineseNewsBoard.com">This link</A> should be between LI elements
         <LI> This LI element is also just fine!</LI>
         </UL> 
         
        

        In the above two lists, the latter would generate a MalformedHTMLException
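
        The difference between the two lists can be expressed as a small gap check. The sketch below is an assumption about the spirit of the rule, not the body of checkL1: it treats whitespace-only text between list items as harmless (the indentation in the "proper" example suggests this), uses plain strings in place of HTMLNode objects, uses int[] { start, end } in place of DotPair with an inclusive 'end', and expects the sublists in increasing order. The class GapCheckSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical illustration only; not the package's checkL1.
final class GapCheckSketch
{
    // Returns true when every node that lies between the enclosing tags, but outside all of
    // the given sublists, is whitespace-only text.
    static boolean isClean(Vector<String> list, Vector<int[]> sublists)
    {
        int next = 1; // first index after the opening tag

        for (int[] p : sublists)
        {
            // Inspect the gap that precedes this sublist.
            for (int i = next; i < p[0]; i++)
                if (!list.elementAt(i).trim().isEmpty()) return false;

            next = p[1] + 1;
        }

        // Inspect the gap between the last sublist and the closing tag.
        for (int i = next; i < list.size() - 1; i++)
            if (!list.elementAt(i).trim().isEmpty()) return false;

        return true;
    }

    public static void main(String[] args)
    {
        Vector<String> proper = new Vector<String>(Arrays.asList(
            "<ul>", "\n", "<li>", "one", "</li>", "\n", "<li>", "two", "</li>", "\n", "</ul>"
        ));
        Vector<int[]> properItems = new Vector<int[]>();
        properItems.add(new int[] { 2, 4 });
        properItems.add(new int[] { 6, 8 });

        Vector<String> improper = new Vector<String>(Arrays.asList(
            "<ul>", "This text should not be here", "<li>", "one", "</li>", "</ul>"
        ));
        Vector<int[]> improperItems = new Vector<int[]>();
        improperItems.add(new int[] { 2, 4 });

        System.out.println(isClean(proper, properItems));     // true
        System.out.println(isClean(improper, improperItems)); // false
    }
}
```

        Where this sketch returns false, the method documented here throws a MalformedHTMLException instead.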
        Throws:
        MalformedHTMLException - whenever improper HTML is presented to this function
        Code:
        Exact Method Body:
         checkL1(list, 0, list.size()-1, sublists);
        
      • checkL1

        protected static void checkL1​(java.util.Vector<? extends HTMLNode> list,
                                      int sPos,
                                      int ePos,
                                      java.util.Vector<DotPair> sublists)
                               throws MalformedHTMLException
        This method does, functionally, the exact same thing as checkL1(Vector, Vector), but with the endpoints specified explicitly. It is kept with protected access since it might be unclear which endpoints are being checked. The previous method delegates to this one, so the exception-message String's are typed out only once. The only difference is that this method does not assume list.elementAt(0) and list.elementAt(list.size()-1) as the starting and ending points.
        Parameters:
        sPos - This is the (integer) Vector-index that sets a limit for the left-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'inclusive' meaning that the HTMLNode at this Vector-index will be visited by this method.

        NOTE: If this value is negative, or larger than the length of the input-Vector, an exception will be thrown.
        ePos - This is the (integer) Vector-index that sets a limit for the right-most Vector-position to inspect/search inside the input Vector-parameter. This value is considered 'exclusive' meaning that the 'HTMLNode' at this Vector-index will not be visited by this method.

        NOTE: If this value is larger than the size of the input Vector-parameter, an exception will be thrown.

        ALSO: Passing a negative value to this parameter, 'ePos', will cause its value to be reset to the size of the input Vector-parameter.
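        The endpoint rules in the notes above can be sketched as a small helper. This is an illustration only: the class EndpointSketch and the method resolveEnd are hypothetical, the checks follow the wording of the notes literally, and the exception type (IndexOutOfBoundsException) is an assumption, since the notes do not name one.

```java
// Hypothetical helper, not part of the Torello.HTML package.
final class EndpointSketch
{
    // Returns the exclusive end index to use for a range that starts at sPos
    // (inclusive) inside a Vector of the given size.
    static int resolveEnd(int size, int sPos, int ePos)
    {
        // sPos: negative, or larger than the Vector length, is rejected.
        if (sPos < 0 || sPos > size)
            throw new IndexOutOfBoundsException("sPos out of range: " + sPos);

        // ePos: larger than the Vector size is rejected.
        if (ePos > size)
            throw new IndexOutOfBoundsException("ePos out of range: " + ePos);

        // ePos: a negative value is reset to the size of the Vector.
        return (ePos < 0) ? size : ePos;
    }

    public static void main(String[] args)
    {
        System.out.println(resolveEnd(10, 0, -1)); // 10
        System.out.println(resolveEnd(10, 2, 5));  // 5
    }
}
```

        With these rules, a loop of the form for (int i = sPos; i < end; i++) visits the node at sPos but never the node at the resolved end index.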
        Throws:
        MalformedHTMLException
        See Also:
        checkL1(Vector, Vector)
        Code:
        Exact Method Body:
         int last=sPos;		HTMLNode n;		int t=ePos - 1;
         for (DotPair sublist : sublists)
             if (sublist.start == (last+1)) last = sublist.end;
             else
             {
                 if ((sublist.start < (last+1)) || (sublist.start >= t))
                     throw new IllegalArgumentException(
                         "The provided subLists parameter does not contain subLists that are in " +
                         "order of the original list.  The 'list of sublists' must contain " +
                         "sublists that are in increasing sorted order.\n" +
                         "Specifically, each sublist must contain start and end points that are " +
                         "sequentially increasing.  Also, they may not overlap."
                     );
                 else
                 {
                     for (int i=(last+1); i < sublist.start; i++)
                         if ((n = list.elementAt(i)).isTagNode())
                             throw new MalformedHTMLException(
                                 "There is a spurious HTML-Tag element at Vector position: " + i +
                                 "\n=>\t" + n.str
                             );
                         else if (n.isTextNode() && (n.str.trim().length() > 0))
                             throw new MalformedHTMLException(
                                 "There is a spurious Text-Node element at Vector position: " + i +
                                 "\n=>\t" + n.str
                             );
                 }
             }