Package Torello.HTML.NodeSearch
The purpose of these classes is to allow a programmer to "search" through web pages that have been downloaded and vectorized into a Java Vector<HTMLNode>.
The following keywords are important to understand when choosing an appropriate search class and search method:
- InnerTag: The attributes inside an HTML TagNode element are used to search for TagNode's.
- TagNode: Only the HTML element's final String '.tok' field may be used to specify search criteria. InnerTag's (a.k.a. 'attributes') are not part of the search criteria.
- TextNode: TagNode elements are ignored completely in this search; instead, the "text", represented as instances of TextNode, is searched.
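The three kinds of search criteria above can be sketched in plain Java. This is only an illustration, not the library's API: the `Node`, `Tag` and `Text` classes and the three `count...` helpers are hypothetical stand-ins invented here, assuming a tag carries an element name (`tok`) and an attribute map.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Vector;

class SearchKindsSketch {
    // Hypothetical stand-ins for the library's node types (NOT the real classes).
    static abstract class Node { }

    static final class Tag extends Node {
        final String tok;                      // element name, e.g. "img"
        final Map<String, String> attributes;  // inner-tags, e.g. class="logo"
        Tag(String tok, Map<String, String> attributes) {
            this.tok = tok;
            this.attributes = attributes;
        }
    }

    static final class Text extends Node {
        final String str;
        Text(String str) { this.str = str; }
    }

    // "TagNode" style: only the element name is tested, attributes are ignored.
    static int countByTok(Vector<Node> page, String tok) {
        int n = 0;
        for (Node node : page)
            if (node instanceof Tag && ((Tag) node).tok.equalsIgnoreCase(tok)) n++;
        return n;
    }

    // "InnerTag" style: an attribute key/value pair selects the tags.
    static int countByAttribute(Vector<Node> page, String key, String value) {
        int n = 0;
        for (Node node : page)
            if (node instanceof Tag && value.equals(((Tag) node).attributes.get(key))) n++;
        return n;
    }

    // "TextNode" style: tags are skipped entirely, only text is tested.
    static int countTextContaining(Vector<Node> page, String needle) {
        int n = 0;
        for (Node node : page)
            if (node instanceof Text && ((Text) node).str.contains(needle)) n++;
        return n;
    }

    // A tiny hand-built "page": <img class="logo">Welcome<br><img class="banner">Welcome back
    static Vector<Node> samplePage() {
        Vector<Node> page = new Vector<>();
        page.add(new Tag("img", Collections.singletonMap("class", "logo")));
        page.add(new Text("Welcome"));
        page.add(new Tag("br", Collections.<String, String>emptyMap()));
        page.add(new Tag("img", Collections.singletonMap("class", "banner")));
        page.add(new Text("Welcome back"));
        return page;
    }

    public static void main(String[] args) {
        Vector<Node> page = samplePage();
        System.out.println(countByTok(page, "img"));                 // 2
        System.out.println(countByAttribute(page, "class", "logo")); // 1
        System.out.println(countTextContaining(page, "Welcome"));    // 2
    }
}
```

The same page yields different matches depending on which part of a node the criteria are allowed to inspect, which is the distinction the three keywords draw.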
The following keywords are also important, and explain some nuances of the HTML search methods:
- Count: A count of the number of nodes that match the specified search criteria is computed. Methods in 'Count' classes always return a simple integer representing this count.
- Find: Integer arrays or simple integers are returned by the methods in any class with the word 'Find' in its name. These integers are intended to function as pointers (indices) into the underlying Java Vector<HTMLNode>.
- Get: The HTMLNode's themselves (TagNode, TextNode, etc.) are returned by the methods in these classes. Integer pointers (a.k.a. the integer index into the underlying Vector<HTMLNode>) are not returned.
- Peek: BOTH the Vector index AND the HTMLNode found at that index are returned SIMULTANEOUSLY by the methods in a class having the word 'Peek' in its name. This is where the simple data classes 'TagNodeIndex', 'TextNodeIndex', etc. are used; they serve as the return values of the 'Peek' methods.
- Poll: This refers to the operation of BOTH removing a node from the vectorized HTML web page AND returning the node (or nodes) that were removed to the programmer as a return value. Remember, for all methods in classes that have the word 'Poll' in their name, the Vector<HTMLNode> will contain fewer elements after the method has finished.
- Remove: Neither nodes nor node pointers are returned; the nodes are simply removed from the page. An integer stating exactly how many nodes were removed is returned to the caller. Remember, after a 'remove' operation, the original vectorized HTML will contain fewer elements.
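The six return conventions can be compared side by side with a small generic sketch. It does not use the library: the class `ReturnConventionsSketch`, its methods, and the `IndexedNode` pair (a hypothetical analogue of 'TagNodeIndex') are assumptions made for illustration, and the "page" is simply a `Vector<String>`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Vector;
import java.util.function.Predicate;

class ReturnConventionsSketch {
    // Hypothetical index-plus-node pair, playing the role of a 'Peek' return value.
    static final class IndexedNode<T> {
        final int index;
        final T node;
        IndexedNode(int index, T node) { this.index = index; this.node = node; }
    }

    // Count: how many nodes match.
    static <T> int count(Vector<T> page, Predicate<T> test) {
        int n = 0;
        for (T node : page) if (test.test(node)) n++;
        return n;
    }

    // Find: the Vector indices of the matches.
    static <T> int[] find(Vector<T> page, Predicate<T> test) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < page.size(); i++)
            if (test.test(page.get(i))) hits.add(i);
        int[] ret = new int[hits.size()];
        for (int i = 0; i < ret.length; i++) ret[i] = hits.get(i);
        return ret;
    }

    // Get: the matching nodes themselves; the page is left untouched.
    static <T> Vector<T> get(Vector<T> page, Predicate<T> test) {
        Vector<T> ret = new Vector<>();
        for (T node : page) if (test.test(node)) ret.add(node);
        return ret;
    }

    // Peek: index and node together.
    static <T> List<IndexedNode<T>> peek(Vector<T> page, Predicate<T> test) {
        List<IndexedNode<T>> ret = new ArrayList<>();
        for (int i = 0; i < page.size(); i++)
            if (test.test(page.get(i))) ret.add(new IndexedNode<>(i, page.get(i)));
        return ret;
    }

    // Poll: remove the matches from the page AND hand them back.
    static <T> Vector<T> poll(Vector<T> page, Predicate<T> test) {
        Vector<T> removed = get(page, test);
        page.removeIf(test);
        return removed;
    }

    // Remove: remove the matches and report only how many were removed.
    static <T> int remove(Vector<T> page, Predicate<T> test) {
        int before = page.size();
        page.removeIf(test);
        return before - page.size();
    }

    static final Predicate<String> IS_TAG = s -> s.startsWith("<");

    static Vector<String> samplePage() {
        return new Vector<>(Arrays.asList("<p>", "Hello", "</p>", "<br>", "World"));
    }

    public static void main(String[] args) {
        Vector<String> page = samplePage();
        System.out.println(count(page, IS_TAG));                  // 3
        System.out.println(Arrays.toString(find(page, IS_TAG)));  // [0, 2, 3]
        System.out.println(get(page, IS_TAG));                    // [<p>, </p>, <br>]
        System.out.println(poll(page, IS_TAG) + " -> " + page);   // [<p>, </p>, <br>] -> [Hello, World]
    }
}
```

Count, Find, Get and Peek leave the page as it was, while Poll and Remove shrink it; the two groups differ only in what is handed back to the caller.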
Finally, the keyword "inclusive" should be explained here. It is very similar to the JavaScript concept of '.innerHTML', an object field found in most of the classes in a JavaScript DOM tree. It implies that every node between the opening element ('<DIV ..>', for example) and the matching closing element ('</DIV>', for example) is used / returned.
When an HTMLNode is searched for using the inclusive variant of either an 'InnerTag-Search' (attribute key-value pair) or a simple 'TagNode-Search' method, then the opening tag, the closing tag, and every HTMLNode between these two are returned by that method!
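One way to picture an "inclusive" search is as a scan for the first opening tag, followed by a walk forward that tracks nesting depth until the matching closing tag is reached. The sketch below assumes that approach; the `Node` class and the `findInclusive` / `getInclusive` helpers are hypothetical names for illustration and are not taken from the library.

```java
import java.util.Vector;

class InclusiveSketch {
    // Hypothetical flat node: either a tag (tok != null) or text (tok == null).
    static final class Node {
        final String tok;         // element name, or null for a text node
        final boolean isClosing;  // true for </div>, false for <div>
        final String text;        // text content, or null for a tag
        private Node(String tok, boolean isClosing, String text) {
            this.tok = tok;
            this.isClosing = isClosing;
            this.text = text;
        }
        static Node open(String tok)  { return new Node(tok, false, null); }
        static Node close(String tok) { return new Node(tok, true, null); }
        static Node text(String s)    { return new Node(null, false, s); }
    }

    // Returns {openIndex, closeIndex} for the first 'tok' element,
    // or null if there is no such element or it is never closed.
    static int[] findInclusive(Vector<Node> page, String tok) {
        for (int i = 0; i < page.size(); i++) {
            Node n = page.get(i);
            if (!tok.equals(n.tok) || n.isClosing) continue;
            int depth = 1;  // one 'tok' element is currently open
            for (int j = i + 1; j < page.size(); j++) {
                Node m = page.get(j);
                if (!tok.equals(m.tok)) continue;  // other tags and text do not affect depth
                depth += m.isClosing ? -1 : 1;
                if (depth == 0) return new int[] { i, j };
            }
            return null;  // the opening tag has no matching close
        }
        return null;
    }

    // Returns the opening tag, the closing tag, and every node in between.
    static Vector<Node> getInclusive(Vector<Node> page, String tok) {
        int[] range = findInclusive(page, tok);
        if (range == null) return new Vector<>();
        return new Vector<>(page.subList(range[0], range[1] + 1));
    }

    // Intro<div>outer<div>inner</div></div>Outro
    static Vector<Node> samplePage() {
        Vector<Node> page = new Vector<>();
        page.add(Node.text("Intro"));
        page.add(Node.open("div"));
        page.add(Node.text("outer"));
        page.add(Node.open("div"));
        page.add(Node.text("inner"));
        page.add(Node.close("div"));
        page.add(Node.close("div"));
        page.add(Node.text("Outro"));
        return page;
    }

    public static void main(String[] args) {
        Vector<Node> page = samplePage();
        int[] range = findInclusive(page, "div");
        System.out.println(range[0] + ".." + range[1]);        // 1..6
        System.out.println(getInclusive(page, "div").size());  // 6
    }
}
```

Because the depth counter only reacts to tags with the same name, the nested inner `div` does not end the match early; the range runs to the outer element's own closing tag.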
Interfaces
Java Entity                         Description
AVT                                 AVT - Documentation
TextComparitor                      TextComparitor - Documentation

HTML Iterators
Java Entity                         Description
AbstractHNLI<E extends HTMLNode,F>  AbstractHNLI: Abstract HTML Node List Iterator - Documentation
HNLI<E extends HTMLNode>            HNLI: HTML Node List Iterator - Documentation
HNLIInclusive                       HNLIInclusive: HTMLNode List Iterator, Inclusive - Documentation
CommentNodeIterator                 CommentNode Iterator - Documentation
TagNodeIterator                     TagNode Iterator - Documentation
TextNodeIterator                    TextNode Iterator - Documentation
InnerTagIterator                    InnerTag Iterator - Documentation
InnerTagInclusiveIterator           InnerTag Inclusive Iterator - Documentation
TagNodeInclusiveIterator            TagNode Inclusive Iterator - Documentation

HTML Search Classes
Java Entity                         Description
ARGCHECK                            ARGSCHECK - Documentation
JS                                  Java-Script "Similar-Functions" - Documentation
TagNodeCount                        TagNode Count - Documentation
TagNodeFind                         TagNode Find - Documentation
TagNodeGet                          TagNode Get - Documentation
TagNodePeek                         TagNode Peek - Documentation
TagNodePoll                         TagNode Poll - Documentation
TagNodeRemove                       TagNode Remove - Documentation
TextNodeCount                       TextNode Count - Documentation
TextNodeFind                        TextNode Find - Documentation
TextNodeGet                         TextNode Get - Documentation
TextNodePeek                        TextNode Peek - Documentation
TextNodePoll                        TextNode Poll - Documentation
TextNodeRemove                      TextNode Remove - Documentation
CommentNodeCount                    CommentNode Count - Documentation
CommentNodeFind                     CommentNode Find - Documentation
CommentNodeGet                      CommentNode Get - Documentation
CommentNodePeek                     CommentNode Peek - Documentation
CommentNodePoll                     CommentNode Poll - Documentation
CommentNodeRemove                   CommentNode Remove - Documentation
InnerTagCount                       InnerTag Count - Documentation
InnerTagFind                        InnerTag Find - Documentation
InnerTagGet                         InnerTag Get - Documentation
InnerTagPeek                        InnerTag Peek - Documentation
InnerTagPoll                        InnerTag Poll - Documentation
InnerTagRemove                      InnerTag Remove - Documentation
InnerTagFindInclusive               InnerTag Find Inclusive - Documentation
InnerTagGetInclusive                InnerTag Get Inclusive - Documentation
InnerTagPeekInclusive               InnerTag Peek Inclusive - Documentation
InnerTagPollInclusive               InnerTag Poll Inclusive - Documentation
InnerTagRemoveInclusive             InnerTag Remove Inclusive - Documentation
TagNodeFindInclusive                TagNode Find Inclusive - Documentation
TagNodeGetInclusive                 TagNode Get Inclusive - Documentation
TagNodePeekInclusive                TagNode Peek Inclusive - Documentation
TagNodePollInclusive                TagNode Poll Inclusive - Documentation
TagNodeRemoveInclusive              TagNode Remove Inclusive - Documentation
TagNodePeekL1Inclusive              TagNode Peek L1 (Sibling) Inclusive - Documentation
TagNodeGetL1Inclusive               TagNode Get L1 (Sibling) Inclusive - Documentation
TagNodeFindL1Inclusive              TagNode Find L1 (Sibling) Inclusive - Documentation

Exceptions
Java Entity                         Description
CSSStrException                     CSSStrException - Documentation
CursorException                     CursorException - Documentation
HTMLNotFoundException               HTMLNotFoundException - Documentation
InclusiveException                  InclusiveException - Documentation
IteratorOutOfBoundsException        IteratorOutOfBoundsException - Documentation
SecondModificationException         SecondModificationException - Documentation
TCCompareStrException               TCCompareStrException - Documentation