Class TagNodeFindInclusive

  • public class TagNodeFindInclusive
    extends java.lang.Object
    TagNode Find Inclusive - Documentation.

    TagNodeFindInclusive =>

    1. TagNode: Only HTML TagNodes are searched. The TagNode.tok field is used as the search criterion. This public, final String field contains the name of the HTML Element, for instance 'div', 'p', 'span', 'img', etc.
      Inner-tags (a.k.a. 'attributes') are not part of the search.
    2. Find: This implies that integer values are returned by these methods. These integers are intended to serve as pointers into the underlying input Java Vector.
    3. Inclusive: The word "Inclusive" indicates that all HTMLNodes between an opening and a closing HTML-tag are requested. The concept is very similar to the JavaScript property '.innerHTML', although in this (JavaHTML) JAR Library no DOM Trees are ever constructed. These methods locate all nodes between the first matching TagNode element and its closing TagNode element pair.
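
    The "Inclusive" idea can be sketched without a DOM tree by counting nesting depth over a flat list. The block below is only an illustration: SimpleNode, InclusiveSketch and firstInclusive are hypothetical stand-ins invented for this example, not classes or methods of the library, and the handling of an opener that is never closed is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; this is NOT the library's HTMLNode/TagNode.
final class SimpleNode {
    final String tok;        // element name such as "div"; null for a text node
    final boolean isClosing; // true for </div>, false for <div>

    private SimpleNode(String tok, boolean isClosing) {
        this.tok = tok;
        this.isClosing = isClosing;
    }

    static SimpleNode open(String tok)  { return new SimpleNode(tok, false); }
    static SimpleNode close(String tok) { return new SimpleNode(tok, true); }
    static SimpleNode text()            { return new SimpleNode(null, false); }
}

final class InclusiveSketch {
    // Returns {openIndex, closeIndex} for the first opening tag named 'tok' that has a
    // matching closing tag, or null when there is none.
    static int[] firstInclusive(List<SimpleNode> nodes, String tok) {
        for (int i = 0; i < nodes.size(); i++) {
            SimpleNode opener = nodes.get(i);
            if (opener.tok == null || opener.isClosing || !opener.tok.equalsIgnoreCase(tok))
                continue;

            int depth = 1; // one unclosed opener so far
            for (int j = i + 1; j < nodes.size(); j++) {
                SimpleNode n = nodes.get(j);
                if (n.tok == null || !n.tok.equalsIgnoreCase(tok)) continue;
                depth += n.isClosing ? -1 : 1; // nested same-name tags raise the depth
                if (depth == 0) return new int[] { i, j };
            }
            // Assumption: an opener that is never closed is skipped and the next one is tried.
        }
        return null;
    }

    public static void main(String[] args) {
        List<SimpleNode> page = Arrays.asList(
            SimpleNode.open("div"), SimpleNode.text(), SimpleNode.open("div"),
            SimpleNode.text(), SimpleNode.close("div"), SimpleNode.close("div"));
        System.out.println(Arrays.toString(firstInclusive(page, "div"))); // prints [0, 5]
    }
}
```

    The outer 'div' is paired with the last closing tag rather than the first one encountered, which is the point of tracking depth.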

    Methods Available

    Method Explanation
    first (...) Obtain the first integer-value node-pointer, and its closing-tag pair from the HTML Vector as a DotPair that meets the criteria.
    nth (...) Obtain the nth integer-value node-pointer, and its closing-tag pair from the HTML Vector as a DotPair that meets the criteria.
    last (...) Obtain the last integer-value node-pointer, and its closing-tag pair from the HTML Vector as a DotPair that meets the criteria.
    nthFromEnd (...) Obtain the nth-from-last integer-value node-pointer, and its closing-tag pair from the HTML Vector as a DotPair that meets the criteria.
    all (...) Obtain all integer-value node-pointer DotPair open-and-closing tag-pairs from the HTML Vector that meet the criteria.
    allExcept (...) Obtain all integer-value node-pointer DotPair open-and-closing tag-pairs from the HTML Vector that do not meet the criteria.
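
    One way to read the table above is as different selections from the ordered list of all matching open/close pairs. The sketch below models that reading only; SelectionSketch and its methods are hypothetical names, an int[] of {start, end} stands in for DotPair, IllegalArgumentException stands in for NException, and treating nthFromEnd(1) as the last match is an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class SelectionSketch {
    // 'matches' holds {start, end} index pairs in document order (a stand-in for DotPair).

    static int[] first(List<int[]> matches) {
        return matches.isEmpty() ? null : matches.get(0);
    }

    static int[] last(List<int[]> matches) {
        return matches.isEmpty() ? null : matches.get(matches.size() - 1);
    }

    // 'n' is 1-based: the first n-1 matches are skipped.
    static int[] nth(List<int[]> matches, int n) {
        requirePositive(n);
        return (n <= matches.size()) ? matches.get(n - 1) : null;
    }

    // Assumption: n == 1 selects the last match.
    static int[] nthFromEnd(List<int[]> matches, int n) {
        requirePositive(n);
        return (n <= matches.size()) ? matches.get(matches.size() - n) : null;
    }

    // IllegalArgumentException is a stand-in for the library's NException.
    private static void requirePositive(int n) {
        if (n < 1) throw new IllegalArgumentException("nth must be 1 or greater: " + n);
    }

    public static void main(String[] args) {
        List<int[]> matches = Arrays.asList(
            new int[] { 0, 7 }, new int[] { 2, 4 }, new int[] { 8, 9 });
        System.out.println(Arrays.toString(nth(matches, 2)));        // prints [2, 4]
        System.out.println(Arrays.toString(nthFromEnd(matches, 1))); // prints [8, 9]
    }
}
```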

    Method Parameters

    Parameter Explanation
    Vector<? extends HTMLNode> html This represents any vectorized HTML page, sub-page, or list of partial-elements.
    int nth This represents the 'nth' match of a comparison for-loop. When the method signature includes the parameter 'nth', the first n-1 matches found are skipped, and the 'nth' match is returned instead.

    EXCEPTIONS: An NException is thrown if the value of parameter 'nth' is zero, negative, or larger than the size of the input html-Vector.
    int sPos, int ePos When these parameters are present, only HTMLNode's that are found between the specified Vector indices will be considered for matching with the search criteria.

    NOTE: Wherever the parameters int sPos, int ePos are used, 'ePos' will accept a negative value, but 'sPos' will not. When 'ePos' is negative, the internal LV ('Loop Variable Counter') has its 'public final int end' field set to html.size(), the length of the vectorized-html page passed as parameter 'html'.

    EXCEPTIONS: An IndexOutOfBoundsException will be thrown if:

    • sPos is negative, or sPos is greater than or equal to the size of the input Vector.
    • ePos is zero, or greater than the size of the input Vector.
    • sPos is larger than ePos.
    TC tagCriteria The three values of enumeration TC are: TC.OpeningTags, TC.ClosingTags and TC.Both. These values specify a search-criteria result set for an HTML TagNode. There are two types of HTML Elements:

    • "opening versions" of the HTML-tag such as: <A HREF="...">
    • "closing versions" of the element such as: </A>.

    NOTE: If parameter 'tagCriteria' is passed TC.Both, then both the 'opening' and the 'closing' versions of the tag are considered to match the search criteria.
    String... htmlTags When this parameter is present, only HTMLNode's which are both instances of class TagNode *and* whose TagNode.tok field String-value matches (is equal to) at least one of the elements in this VarArgs String parameter-set will be considered for a match.

    COMMON EXAMPLES: Some common examples of valid htmlTags are: a, div, img, table, tr, meta as well as all other valid HTML element-tokens.

    NOTE: This comparison is performed using a case-insensitive compare-method

    EXCEPTIONS: If even one of the elements in this parameter-set is an invalid HTML token, an HTMLTokException will be thrown.
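
    The parameter rules above can be restated as a few small checks. This is a sketch under stated assumptions, not the library's code: RangeSketch, effectiveEnd, checkNth and tagMatches are hypothetical names, 'ePos' is treated as an exclusive bound (inferred from the negative-value rule mapping to html.size()), and IllegalArgumentException stands in for NException. The HTMLTokException check on invalid tag names is not modelled.

```java
final class RangeSketch {
    // Validates sPos/ePos against a Vector of the given size and returns the end bound
    // a scan loop would use. Assumption: ePos is an exclusive bound.
    static int effectiveEnd(int size, int sPos, int ePos) {
        if (sPos < 0 || sPos >= size)
            throw new IndexOutOfBoundsException("sPos out of range: " + sPos);
        if (ePos < 0) return size; // negative ePos means "scan to the end of the Vector"
        if (ePos == 0 || ePos > size)
            throw new IndexOutOfBoundsException("ePos out of range: " + ePos);
        if (sPos > ePos)
            throw new IndexOutOfBoundsException("sPos " + sPos + " exceeds ePos " + ePos);
        return ePos;
    }

    // IllegalArgumentException is a stand-in for the library's NException.
    static void checkNth(int size, int nth) {
        if (nth < 1 || nth > size)
            throw new IllegalArgumentException("nth out of range: " + nth);
    }

    // Case-insensitive comparison of a tag name against the htmlTags var-args list.
    static boolean tagMatches(String tok, String... htmlTags) {
        for (String tag : htmlTags)
            if (tag.equalsIgnoreCase(tok)) return true;
        return false;
    }

    public static void main(String[] args) {
        System.out.println(effectiveEnd(10, 0, -1));       // prints 10
        System.out.println(effectiveEnd(10, 2, 5));        // prints 5
        System.out.println(tagMatches("DIV", "div", "p")); // prints true
    }
}
```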

    Return Values:

    1. DotPair The public class DotPair is just a 2-integer set that identifies the start and ending of a sub-list or "sub-array" inside the html vectorized-page parameter 'html'
    2. A return value of null implies no matching sub-lists or sub-pages were found.
    3. Vector<DotPair> A "list of sub-lists" or an "array of sub-arrays", used when multiple results (multiple sub-lists) must be returned to the calling procedure. Such a Vector<DotPair> represents a list of sub-list-pointers into the vectorized-page parameter 'html', with each DotPair marking the position of a matching TagNode and of its closing tag.
    4. A zero-length Vector<DotPair> means no matches were found on the page or sub-page. Any method that can return multiple matches returns a zero-length vector, rather than null, when nothing matches.
    5. Iterator<DotPair> Returns, one at a time, the DotPair index-pointers of sub-lists or sub-pages within the vectorized-HTML page parameter 'html'.
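
    A caller typically has to handle both the null case and the zero-length case described above. The following is a minimal sketch of that handling; IndexPair is a hypothetical stand-in for DotPair, and the assumption that its 'end' index is inclusive should be checked against the actual DotPair documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in for DotPair; the real class may differ.
final class IndexPair {
    final int start;
    final int end; // assumed inclusive in this sketch

    IndexPair(int start, int end) {
        this.start = start;
        this.end = end;
    }
}

final class ReturnValueSketch {
    // Copies the nodes a single pair points at; a null pair means "no match".
    static <T> List<T> slice(Vector<T> html, IndexPair dp) {
        if (dp == null) return new ArrayList<>();
        return new ArrayList<>(html.subList(dp.start, dp.end + 1));
    }

    // Copies one sub-list per pair; a zero-length input yields a zero-length result.
    static <T> List<List<T>> sliceAll(Vector<T> html, Vector<IndexPair> pairs) {
        List<List<T>> result = new ArrayList<>();
        for (IndexPair dp : pairs) result.add(slice(html, dp));
        return result;
    }

    public static void main(String[] args) {
        Vector<String> html = new Vector<>(
            Arrays.asList("<div>", "hello", "<p>", "world", "</p>", "</div>"));
        System.out.println(slice(html, new IndexPair(2, 4))); // prints [<p>, world, </p>]
    }
}
```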

    Static (Functional) API: The methods in this class are all (100%) defined with the Java key-word 'static'. Furthermore, there is no way to obtain an instance of this class, because there are no public (nor private) constructors. Java's Spring-Boot MVC features are *not* utilized because they run counter to the light-weight data-classes philosophy. A static API has several advantages over the rather ornate Component Annotations (@Component, @Service, @Autowired, etc., 'Java Beans') syntax:

    • The methods here use the key-word 'static', which means (by implication) that there is no internal state. Without any internal state there is no need for constructors in the first place! (Constructors are a common complaint among MVC Programmers.)
    • A 'Static' (Functional-Programming) API expects to use fewer data-classes, and light-weight data-classes, making it easier to understand and to program.
    • The Vectorized HTML data-model allows more user-control over HTML parse, search, update & scrape. Also, memory management, memory leakage, and the Java Garbage Collector ought to be intelligible through the 'reuse' of the standard JDK class Vector for storing HTML Web-Page data.

    The power that object-oriented programming extends to a user is (mostly) limited to data-representation. Thinking of "Services" as "Objects" (Spring-MVC, 'Java Beans') is somewhat 'over-applying' the Object Oriented Programming Model. Like most classes in the Java-HTML JAR Library, this class backtracks to a more C-Styled Functional Programming Model (no Objects) by re-using (quite profusely) the key-word static with all of its methods, and by sticking to Java's well-understood class Vector.

    Internal-State: A user may click on this class' source code (see link below) to view any and all fields defined in this class. A cursory inspection of the code would show that this class has precisely zero internally defined global fields (Spaghetti). All variables used by the methods in this class are local variables only, and therefore this class ought to be thought of as 'state-less'.
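
    For readers unfamiliar with the pattern, a state-less, all-static class can be sketched as follows. StatelessTagCounter is a hypothetical example class, not part of the library, and the private constructor is just one common Java idiom for preventing instantiation; it is an assumption here, not a statement about how this class achieves it.

```java
import java.util.Arrays;
import java.util.List;

final class StatelessTagCounter {
    // No fields are declared, so there is no internal state to initialize or share.
    private StatelessTagCounter() { }

    // Everything the method needs arrives as a parameter or lives in a local variable.
    static int count(List<String> toks, String tok) {
        int total = 0;
        for (String t : toks)
            if (t.equalsIgnoreCase(tok)) total++;
        return total;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("div", "p", "DIV"), "div")); // prints 2
        // Reflection confirms that the class declares no fields at all.
        System.out.println(StatelessTagCounter.class.getDeclaredFields().length); // prints 0
    }
}
```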

    View Actual Hi-Lited Code Files: