Package Torello.HTML

Class Util.Inclusive

  • Enclosing class:
    Util

    public static class Util.Inclusive
    extends java.lang.Object
    Util.Inclusive Documentation

    The methods in this class search for an inclusive match to an input, opening TagNode. The user must provide the HTML-Vector containing the opening TagNode. Of the six search variants (Count, Find, Get, Peek, Poll, and Remove), Find, Get, Peek, Poll, and Remove each have a method in this class that returns the requested type.




Stateless Class: This class neither contains any program-state, nor can it be instantiated.
The @StaticFunctional Annotation may also be called 'The Spaghetti Report'
  • 1 Constructor(s), 1 declared private, zero-argument constructor
  • 11 Method(s), 11 declared static
  • 0 Field(s)


    • Method Detail

      • find

        public static int find​(java.util.Vector<? extends HTMLNode> html,
                               int nodeIndex)
        This finds the closing HTML 'TagNode' match for a given opening 'TagNode' in a given-input html page or sub-section.
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        nodeIndex - An index into that Vector. This index must point to an HTMLNode element that is:

        1. An instance of TagNode
        2. A TagNode whose 'isClosing' field is FALSE
        3. Not a 'singleton' HTML element-token (e.g. <IMG>, <BR>, <HR>)
        Returns:
        An "inclusive search" finds OpeningTag and ClosingTag pairs; depending on the search variant, the elements they enclose are reported either as a return-Vector or as DotPair end-point values. This method takes a particular node of a Vector and, as long as it has a match, finds its closing TagNode match. The integer returned is the index into this page of the closing, matching TagNode, or -1 if no match is found.
        Throws:
        TagNodeExpectedException - If the node in the Vector-parameter 'html' contained at index 'nodeIndex' is not an instance of TagNode, then this exception is thrown.
        OpeningTagNodeExpectedException - If the node in the Vector-parameter 'html' at index 'nodeIndex' is a closing version of the HTML element, then this exception is thrown.
        InclusiveException - If the node in Vector-parameter 'html', pointed-to by index 'nodeIndex' is an HTML 'Singleton' / Self-Closing Tag, then this exception will be thrown.
        See Also:
        TagNode, TagNode.tok, TagNode.isClosing, HTMLNode
        Code:
        Exact Method Body:
         TagNode     tn          = null;
         HTMLNode    n           = null;
         String      tok         = null;
        
         if (! html.elementAt(nodeIndex).isTagNode())
             throw new TagNodeExpectedException (
                 "You have attempted to find a closing tag to match an opening one, " +
                 "but the 'nodeIndex' (" + nodeIndex + ") you have passed doesn't contain " +
                 "an instance of TagNode."
             );
         else tn = (TagNode) html.elementAt(nodeIndex);
        
         if (tn.isClosing) throw new OpeningTagNodeExpectedException(
             "The TagNode indicated by 'nodeIndex' = " + nodeIndex + " has its 'isClosing' " +
             "boolean as TRUE - this is not an opening TagNode, but it must be to continue."
         );
        
         // Checks to ensure this token is not a 'self-closing' or 'singleton' tag.
         // If it is, an exception is thrown.
         tok = tn.tok;
         InclusiveException.check(tok);
        
         int         end         = html.size();
         int         openCount   = 1;
        
         // Start just past the opening tag, which 'openCount = 1' has already counted.
         for (int pos = nodeIndex + 1; pos < end; pos++)
             if ((n = html.elementAt(pos)).isTagNode())
                 if ((tn = ((TagNode) n)).tok.equals(tok))
                 {
                     openCount += tn.isClosing ? -1 : 1;
                     if (openCount == 0) return pos;
                 }
        
         return -1;
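
        The depth-counting search described above can be illustrated with a minimal, self-contained sketch. The classes 'Node', 'Text' and 'Tag', and the method 'findClosing', are hypothetical stand-ins invented for this illustration; they are not the library's HTMLNode, TextNode, TagNode or find. The singleton check is omitted, and a plain IllegalArgumentException stands in for the library's exceptions.

```java
import java.util.Vector;

// Hypothetical, simplified model of a vectorized-html page (NOT the library's classes).
class InclusiveFindSketch
{
    static class Node { }

    static class Text extends Node
    {
        final String str;
        Text(String str) { this.str = str; }
    }

    static class Tag extends Node
    {
        final String tok;          // element name, e.g. "div"
        final boolean isClosing;   // true for </div>, false for <div>
        Tag(String tok, boolean isClosing) { this.tok = tok; this.isClosing = isClosing; }
    }

    // Returns the index of the closing tag that matches the opening tag at 'nodeIndex',
    // or -1 if there is none.
    static int findClosing(Vector<? extends Node> html, int nodeIndex)
    {
        Node first = html.elementAt(nodeIndex);
        if (! (first instanceof Tag))
            throw new IllegalArgumentException("Node at " + nodeIndex + " is not a tag");

        Tag open = (Tag) first;
        if (open.isClosing)
            throw new IllegalArgumentException("Node at " + nodeIndex + " is a closing tag");

        int openCount = 1; // the opening tag itself

        for (int pos = nodeIndex + 1; pos < html.size(); pos++)
        {
            Node n = html.elementAt(pos);
            if (! (n instanceof Tag)) continue;

            Tag t = (Tag) n;
            if (! t.tok.equals(open.tok)) continue;

            // Nested openers of the same element raise the depth, closers lower it.
            openCount += t.isClosing ? -1 : 1;
            if (openCount == 0) return pos;
        }

        return -1;
    }

    // <div> outer <div> inner </div> </div> <p>
    static Vector<Node> sample()
    {
        Vector<Node> v = new Vector<>();
        v.add(new Tag("div", false));   // 0
        v.add(new Text("outer"));       // 1
        v.add(new Tag("div", false));   // 2
        v.add(new Text("inner"));       // 3
        v.add(new Tag("div", true));    // 4
        v.add(new Tag("div", true));    // 5
        v.add(new Tag("p", false));     // 6 (never closed)
        return v;
    }

    public static void main(String[] args)
    {
        Vector<Node> page = sample();
        System.out.println(findClosing(page, 0)); // 5
        System.out.println(findClosing(page, 2)); // 4
        System.out.println(findClosing(page, 6)); // -1
    }
}
```

        Because the counter only reacts to tags with the same token as the opening tag, nested elements of other kinds do not affect the result.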
        
      • get

        public static java.util.Vector<HTMLNode> get(java.util.Vector<? extends HTMLNode> html,
                                                     int nodeIndex)
        
        Convenience Method. Invokes find(Vector, int).

        Converts output to 'GET' format (Vector-sublist), using Util.cloneRange(Vector, int, int)
        Code:
        Exact Method Body:
         int endPos = find(html, nodeIndex);
         return (endPos == -1) ? null : cloneRange(html, nodeIndex, endPos + 1);
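
        The 'endPos + 1' in the call above suggests that the end argument of the range is exclusive, so the closing tag at 'endPos' is part of the copy. The sketch below illustrates that convention with plain Strings standing in for nodes. The 'cloneRange' defined here is a hypothetical stand-in written for this example, assuming a half-open range [start, end); it is not the library's Util.cloneRange.

```java
import java.util.Arrays;
import java.util.Vector;

class CloneRangeSketch
{
    // Hypothetical stand-in: copies the half-open range [start, end) into a new Vector.
    static <T> Vector<T> cloneRange(Vector<? extends T> v, int start, int end)
    {
        return new Vector<T>(v.subList(start, end));
    }

    // Strings play the role of nodes in this illustration.
    static Vector<String> samplePage()
    {
        return new Vector<>(Arrays.asList("<ul>", "<li>", "item", "</li>", "</ul>", "tail"));
    }

    public static void main(String[] args)
    {
        Vector<String> page = samplePage();

        int nodeIndex = 1; // the opening <li>
        int endPos    = 3; // the matching </li>, as a 'find' would report it

        // endPos + 1 makes the copy include the closing tag itself.
        Vector<String> match = cloneRange(page, nodeIndex, endPos + 1);

        System.out.println(match);       // [<li>, item, </li>]
        System.out.println(page.size()); // 6 (the original page is untouched)
    }
}
```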
        
      • peek

        public static SubSection peek​(java.util.Vector<? extends HTMLNode> html,
                                      int nodeIndex)
        Convenience Method. Invokes find(Vector, int).

        Converts output to 'PEEK' format (SubSection), using Util.cloneRange(Vector, int, int)
        Code:
        Exact Method Body:
         int endPos = find(html, nodeIndex);
        
         return (endPos == -1) ? null : new SubSection(
             new DotPair(nodeIndex, endPos),
             cloneRange(html, nodeIndex, endPos + 1)
         );
        
      • poll

        public static java.util.Vector<HTMLNode> poll(java.util.Vector<? extends HTMLNode> html,
                                                      int nodeIndex)
        
        Convenience Method. Invokes find(Vector, int).

        Converts output to 'POLL' format (Vector-sublist), using Util.pollRange(Vector, int, int). Removes Sub-List.
        Code:
        Exact Method Body:
         int endPos = find(html, nodeIndex);
         return (endPos == -1) ? null : pollRange(html, nodeIndex, endPos + 1);
        
      • remove

        public static int remove​(java.util.Vector<? extends HTMLNode> html,
                                 int nodeIndex)
        Convenience Method. Invokes find(Vector, int).

        Converts output to 'REMOVE' format (int - number of nodes removed), using Util.removeRange(Vector, int, int). Removes Sub-List.
        Code:
        Exact Method Body:
         int endPos = find(html, nodeIndex);
         return (endPos == -1) ? 0 : removeRange(html, nodeIndex, endPos + 1);
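
        To contrast the 'REMOVE' format with the 'POLL' format, here is a small sketch in which both operations delete the same range but report it differently. The methods 'pollRange' and 'removeRange' below are hypothetical stand-ins written for this example, assuming a half-open range [start, end); they are not the library's Util.pollRange or Util.removeRange, and Strings stand in for nodes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class PollRemoveSketch
{
    // Hypothetical stand-in: removes [start, end) from 'v' and returns the removed elements.
    static <T> Vector<T> pollRange(Vector<T> v, int start, int end)
    {
        List<T> view = v.subList(start, end);
        Vector<T> removed = new Vector<>(view); // copy first
        view.clear();                           // clearing the view deletes the range from 'v'
        return removed;
    }

    // Hypothetical stand-in: removes [start, end) from 'v' and returns how many were removed.
    static <T> int removeRange(Vector<T> v, int start, int end)
    {
        v.subList(start, end).clear();
        return end - start;
    }

    static Vector<String> samplePage()
    {
        return new Vector<>(Arrays.asList("<ul>", "<li>", "item", "</li>", "</ul>", "tail"));
    }

    public static void main(String[] args)
    {
        Vector<String> a = samplePage();
        System.out.println(pollRange(a, 1, 4));   // [<li>, item, </li>]
        System.out.println(a);                    // [<ul>, </ul>, tail]

        Vector<String> b = samplePage();
        System.out.println(removeRange(b, 1, 4)); // 3
        System.out.println(b);                    // [<ul>, </ul>, tail]
    }
}
```

        In both cases the input Vector is modified in place, so any indices computed before the call may no longer point at the same nodes afterwards.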
        
      • vectorOPT

        public static java.util.Vector<HTMLNode> vectorOPT(java.util.Vector<? extends HTMLNode> html,
                                                           int tagPos)
        
        Convenience Method. Invokes dotPairOPT(Vector, int).

        Converts output to Vector<HTMLNode>.
        Code:
        Exact Method Body:
         DotPair dp = dotPairOPT(html, tagPos);
         if (dp == null) return null;
         else            return Util.cloneRange(html, dp.start, dp.end + 1);
        
      • subSectionOPT

        public static SubSection subSectionOPT​
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos)
        
        Convenience Method. Invokes dotPairOPT(Vector, int).

        Converts output to SubSection.
        Code:
        Exact Method Body:
         DotPair dp = dotPairOPT(html, tagPos);
         if (dp == null) return null;
         else            return new SubSection(dp, Util.cloneRange(html, dp.start, dp.end + 1));
        
      • dotPairOPT

        public static DotPair dotPairOPT​
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos)
        
        OPT: Optimized, which means this method expects that any parameter-error checking has already been performed.

        There are no error-checks, nor validity-checks performed on the input to this method. This is a heavily-used, internally-used method for this package. Originally, this was included in the internal-helper set of classes for the Node-Search package.

        PURPOSE AND USE: This method expects to receive a vectorized-html page, or sub-page, along with a valid index into that page pointing to an instance of a TagNode. The TagNode instance is expected to be BOTH an OpeningTag, and a non-singleton (non-self-closing) HTML Element. This method finds the corresponding "closing, matching, paired" TagNode HTML Element. For instance, a "<DIV ...>" HTML element is matched to its corresponding "</DIV>" element, and an "<A ...>" element to its closing "</A>" element.

        This method is heavily used in any class in the Node-Search Package that contains or uses the word 'inclusive.' This is because 'inclusive' is closely similar to the JavaScript property '.innerHTML'. All three of the following optimization methods perform identical tasks, but have different return types (of similar / identical data):

        • public static DotPair dotPairOPT(Vector, int) - which returns the matching 'innerHTML' as an index-pointer pair.
        • public static Vector<HTMLNode> vectorOPT(Vector, int) - which returns the matching 'innerHTML' as a cloned copy of the Vector-sublist, in a new instance of 'Vector<HTMLNode>'.
        • public static SubSection subSectionOPT(Vector, int) - which returns the matching 'innerHTML' as a cloned copy of the Vector-sublist combined with its DotPair (both the 'Vector' clone and the 'DotPair' index-pointers are returned, together, as an instance of SubSection).
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        tagPos - This may be any valid position within this html-Vector; it must be non-negative and less than the size of the Vector. It must also point to a valid reference to an instance of class TagNode.
        Returns:
        A 'DotPair' version of an inclusive, end-to-end HTML tag-element.

        Again, there is a strong similarity between the term "inclusive-match" and the JavaScript object property 'innerHTML'. Both of these terms essentially refer to a block of HTML code that begins with a non-singleton HTML element (like a <DIV> - divider) that has an opening tag <DIV> and a closing tag </DIV>, and that includes all HTMLNodes between these. The range returned here also contains the opening and closing tags themselves.
        See Also:
        TagNode, TagNode.isClosing, TagNode.tok, DotPair
        Code:
        Exact Method Body:
         // Temp Variables
         HTMLNode    n;
         TagNode     tn;
         int         openCount = 1;
        
         int len = html.size();
        
         // This is the name (token) of the "Opening HTML Element", we are searching for
         // the matching, closing element
         String tok = ((TagNode) html.elementAt(tagPos)).tok;
        
         for (int i = (tagPos+1); i < len; i++)
             if ((n = html.elementAt(i)).isTagNode())
                 if ((tn = (TagNode) n).tok.equals(tok))
                 {
                     // This keeps a "Depth Count" - where "depth" is just the number of 
                     // opened tags, for which a matching, closing tag hasn't been found yet.
                     openCount += (tn.isClosing ? -1 : 1);
        
                     // When all open-tags of the specified HTML Element 'tok' have been
                     // found, search has finished.
                     if (openCount == 0) return new DotPair(tagPos, i);
                 }
        
         // Was not found
         return null;
        
      • vectorOPT

        public static java.util.Vector<HTMLNode> vectorOPT(java.util.Vector<? extends HTMLNode> html,
                                                           int tagPos,
                                                           int end)
        
        Convenience Method. Invokes dotPairOPT(Vector, int, int).

        Converts output to Vector<HTMLNode>.
        Code:
        Exact Method Body:
         DotPair dp = dotPairOPT(html, tagPos, end);
         if (dp == null) return null;
         else            return Util.cloneRange(html, dp.start, dp.end + 1);
        
      • subSectionOPT

        public static SubSection subSectionOPT​
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos,
                     int end)
        
        Convenience Method. Invokes dotPairOPT(Vector, int, int).

        Converts output to SubSection.
        Code:
        Exact Method Body:
         DotPair dp = dotPairOPT(html, tagPos, end);
         if (dp == null) return null;
         else            return new SubSection(dp, Util.cloneRange(html, dp.start, dp.end + 1));
        
      • dotPairOPT

        public static DotPair dotPairOPT​
                    (java.util.Vector<? extends HTMLNode> html,
                     int tagPos,
                     int end)
        
        OPT: Optimized, which means this method expects that any parameter-error checking has already been performed.

        There are no error-checks, nor validity-checks performed on the input to this method. This is a heavily-used, internally-used method for this package. Originally, this was included in the internal-helper set of classes for the Node-Search package.

        PURPOSE AND USE: This method expects to receive a vectorized-html page, or sub-page, along with a valid index into that page pointing to an instance of a TagNode. The TagNode instance is expected to be BOTH an OpeningTag, and a non-singleton (non-self-closing) HTML Element. This method finds the corresponding "closing, matching, paired" TagNode HTML Element. For instance, a "<DIV ...>" HTML element is matched to its corresponding "</DIV>" element, and an "<A ...>" element to its closing "</A>" element.

        This method is heavily used in any class in the Node-Search Package that contains or uses the word 'inclusive.' This is because 'inclusive' is closely similar to the JavaScript property '.innerHTML'. All three of the following optimization methods perform identical tasks, but have different return types (of similar / identical data):

        • public static DotPair dotPairOPT(Vector, int, int) - which returns the matching 'innerHTML' as an index-pointer pair.
        • public static Vector<HTMLNode> vectorOPT(Vector, int, int) - which returns the matching 'innerHTML' as a cloned copy of the Vector-sublist, in a new instance of 'Vector<HTMLNode>'.
        • public static SubSection subSectionOPT(Vector, int, int) - which returns the matching 'innerHTML' as a cloned copy of the Vector-sublist combined with its DotPair (both the 'Vector' clone and the 'DotPair' index-pointers are returned, together, as an instance of SubSection).
        Parameters:
        html - This may be any vectorized-html web-page, or an html sub-section / partial-page. All that the variable-type wild-card '? extends HTMLNode' means is this method can receive a Vector<TagNode>, Vector<TextNode> or a Vector<CommentNode>, without throwing an exception, or producing erroneous results. These 'sub-type' Vectors are very often returned as search results from the classes in the 'NodeSearch' package. The most common vector-type used is Vector<HTMLNode>.
        tagPos - This may be any valid position within this html-Vector; it must be non-negative and less than the size of the Vector. It must also point to a valid reference to an instance of class TagNode.
        end - This is a "loop-variable" bound that establishes an ending perimeter around the search location for finding an inclusive match; the search examines indices up to, but not including, 'end'. (As an aside, it essentially maps to 'int ePos' in all of the node-search methods.) If a complete end-to-end open-and-close "inclusive-match" is not found within the perimeter of 'tagPos' and 'end', then 'null' is returned.
        Returns:
        A 'DotPair' version of an inclusive, end-to-end HTML tag-element.

        Again, there is a strong similarity between the term "inclusive-match" and the JavaScript object property 'innerHTML'. Both of these terms essentially refer to a block of HTML code that begins with a non-singleton HTML element (like a <DIV> - divider) that has an opening tag <DIV> and a closing tag </DIV>, and that includes all HTMLNodes between these. The range returned here also contains the opening and closing tags themselves.
        See Also:
        TagNode, TagNode.isClosing, TagNode.tok, DotPair
        Code:
        Exact Method Body:
         // Temp Variables
         HTMLNode    n;
         TagNode     tn;
         int         openCount = 1;
         int         endPos;
        
         // This is the name (token) of the "Opening HTML Element", we are searching for
         // the matching, closing element
         String tok = ((TagNode) html.elementAt(tagPos)).tok;
        
         for (endPos = (tagPos+1); endPos < end; endPos++)
             if ((n = html.elementAt(endPos)).isTagNode())
                 if ((tn = (TagNode) n).tok.equals(tok))
                 {
                     // This keeps a "Depth Count" - where "depth" is just the number of
                     // opened tags, for which a matching, closing tag hasn't been found yet.
                     openCount += (tn.isClosing ? -1 : 1);
        
                     // System.out.print(".");
        
                     // When all open-tags of the specified HTML Element 'tok' have been
                     // found, search has finished.
                     if (openCount == 0) return new DotPair(tagPos, endPos);
                 }
        
         // The search bound 'end' was reached (which need not be the end of the
         // vectorized-html page), but the matching-closing element was not found.
         return null; // here, endPos >= end
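
        The effect of the 'end' bound can be seen in a short, self-contained sketch. Everything in it is hypothetical and written only for this illustration: the 'Tag' class stands in for TagNode, any non-Tag element plays the role of a text node, and an int[] { start, end } pair stands in for DotPair. As in the method above, no input validation is performed.

```java
import java.util.Arrays;
import java.util.Vector;

class BoundedMatchSketch
{
    // Hypothetical stand-in for a tag node.
    static final class Tag
    {
        final String tok;
        final boolean isClosing;
        Tag(String tok, boolean isClosing) { this.tok = tok; this.isClosing = isClosing; }
    }

    // Searches indices tagPos+1 .. end-1 for the closing tag that matches the opener at
    // 'tagPos'. Returns { tagPos, closingIndex }, or null if no match lies before 'end'.
    static int[] boundedMatch(Vector<Object> html, int tagPos, int end)
    {
        String tok = ((Tag) html.elementAt(tagPos)).tok;
        int openCount = 1;

        for (int pos = tagPos + 1; pos < end; pos++)
        {
            Object n = html.elementAt(pos);
            if (! (n instanceof Tag)) continue;

            Tag t = (Tag) n;
            if (! t.tok.equals(tok)) continue;

            openCount += t.isClosing ? -1 : 1;
            if (openCount == 0) return new int[] { tagPos, pos };
        }

        return null;
    }

    // <span> hello </span> tail
    static Vector<Object> sample()
    {
        Vector<Object> v = new Vector<>();
        v.add(new Tag("span", false)); // 0
        v.add("hello");                // 1
        v.add(new Tag("span", true));  // 2
        v.add("tail");                 // 3
        return v;
    }

    public static void main(String[] args)
    {
        Vector<Object> page = sample();

        // The bound covers the closing tag at index 2.
        System.out.println(Arrays.toString(boundedMatch(page, 0, page.size()))); // [0, 2]

        // The bound is exclusive, so end = 2 stops just before the closing tag.
        System.out.println(boundedMatch(page, 0, 2) == null); // true
    }
}
```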