Package Torello.HTML
Class DotPair
- java.lang.Object
-
- Torello.HTML.DotPair
-
- All Implemented Interfaces: java.io.Serializable, java.lang.Cloneable, java.lang.Comparable<DotPair>, java.lang.Iterable<java.lang.Integer>
public final class DotPair extends java.lang.Object implements java.io.Serializable, java.lang.Comparable<DotPair>, java.lang.Cloneable, java.lang.Iterable<java.lang.Integer>
DotPair - Documentation.
The purpose of this class is to keep the starting and ending points of an array sub-list together. In a much older computer language (LISP/Scheme), a 'dotted pair' is simply two values that are glued to each other. Here, the two numbers are intended to represent the Array-Start and Array-End position values for a sub-list of a Vector.
NOTE: Calling this class "Array Sub-List End Points" would be a lot more descriptive, but that name would be tedious to type, so it is called 'DotPair' instead.
IMPORTANT NOTE: For every one of the Find, Get and Remove node methods, the input parameters sPos, ePos are designed such that:
- "sPos" is inclusive, meaning that the Vector index denoted by the value of this parameter is included in the sub-list.
- "ePos" is exclusive, meaning that the Vector index denoted by the value of this parameter is NOT included in the sub-list.
HOWEVER, HERE: in class DotPair
- "start" is inclusive, meaning that the Vector index denoted by the value of this class field is included in the sub-list.
- "end" is ALSO inclusive, meaning that the Vector index denoted by the value of this class field is ALSO included in the sub-list.
Generally, the "sPos, ePos" method parameters and the DotPair.start and DotPair.end fields have identical meanings, EXCEPT for the difference noted above.
- See Also: NodeIndex, SubSection, Serialized Form
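The inclusive/exclusive distinction described above can be sketched as follows. This is a minimal illustration and not the library's code: InclusiveRange, toEPos and slice are hypothetical names introduced only for this example, and a Vector<String> stands in for a vectorized-HTML page.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in for this example only; it is NOT Torello.HTML.DotPair.
final class InclusiveRange
{
    final int start, end;   // both INCLUSIVE, as described for DotPair

    InclusiveRange(int start, int end) { this.start = start; this.end = end; }

    // Number of indices covered: 'end' itself counts, hence the + 1
    int size() { return end - start + 1; }

    // The equivalent EXCLUSIVE end-position, in the style of an 'ePos' parameter
    int toEPos() { return end + 1; }
}

class InclusiveRangeDemo
{
    // List.subList is half-open (toIndex is exclusive), so the inclusive end is converted
    static List<String> slice(Vector<String> v, InclusiveRange r)
    { return v.subList(r.start, r.toEPos()); }

    public static void main(String[] args)
    {
        Vector<String> v = new Vector<>(Arrays.asList("a", "b", "c", "d", "e"));
        InclusiveRange r = new InclusiveRange(1, 3);

        System.out.println(r.size());     // 3
        System.out.println(slice(v, r));  // [b, c, d]
    }
}
```

The only conversion needed between the two conventions is the "+ 1" on the ending index; the starting index has the same meaning in both.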
Source-Code: Torello/HTML/DotPair.java
-
-
Field Summary
Serializable ID
  Modifier and Type                 Field
  static long                       serialVersionUID
Start & End Field
  Modifier and Type                 Field
  int                               end
  int                               start
Alternate Sort Comparator
  Modifier and Type                 Field
  static Comparator<DotPair>        comp2
-
Constructor Summary
Constructors
  Constructor
  DotPair(int start, int end)
-
Method Summary
Basic Methods
  Modifier and Type                 Method
  boolean                           enclosedBy(DotPair other)
  boolean                           encloses(DotPair other)
  boolean                           isInside(int index)
  boolean                           overlaps(DotPair other)
  int                               size()
Collate Multiple DotPair's to a Single Index-List
  Modifier and Type                 Method
  static PrimitiveIterator.OfInt    iterator(Iterable<DotPair> dpi, boolean leastToGreatest)
  static int[]                      toPosArray(Iterable<DotPair> dpi, boolean leastToGreatest)
  static IntStream                  toStream(Iterable<DotPair> dpi, boolean leastToGreatest)
Collate Multiple DotPair's to a Single Index-List, Include End-Points Only
  Modifier and Type                 Method
  static PrimitiveIterator.OfInt    endPointsIterator(Iterable<DotPair> dpi, boolean leastToGreatest)
  static int[]                      endPointsToPosArray(Iterable<DotPair> dpi, boolean leastToGreatest)
  static IntStream                  endPointsToStream(Iterable<DotPair> dpi, boolean leastToGreatest)
Retrieve HTMLNode's from an HTML-Vector, using one or more DotPair's
  Modifier and Type                 Method
  static Vector<SubSection>         toSubSections(Vector<? extends HTMLNode> html, Vector<DotPair> sublists)
  static Vector<HTMLNode>           toVector(Vector<? extends HTMLNode> html, DotPair dp)
  static Vector<Vector<HTMLNode>>   toVectors(Vector<? extends HTMLNode> html, Vector<DotPair> sublists)
Methods: Class java.lang.Object
  Modifier and Type                 Method
  DotPair                           clone()
  boolean                           equals(Object o)
  int                               hashCode()
  String                            toString()
Methods: Interface java.lang.Iterable
  Modifier and Type                 Method
  PrimitiveIterator.OfInt           iterator()
Methods: Interface java.lang.Comparable
  Modifier and Type                 Method
  int                               compareTo(DotPair other)
-
-
-
Field Detail
-
serialVersionUID
public static final long serialVersionUID
This fulfils the SerialVersion UID requirement for all classes that implement Java's interface java.io.Serializable. Using the Serializable implementation offered by Java is very easy, and can make saving program state when debugging a lot easier. It can also be used in place of more complicated systems like "hibernate" to store data.
- See Also:
- Constant Field Values
- Code:
- Exact Field Declaration Expression:
public static final long serialVersionUID = 1;
-
start
public final int start
This is intended to be the "starting index" into a sub-array of an HTML Vector of HTMLNode elements.
- Code:
- Exact Field Declaration Expression:
public final int start;
-
end
public final int end
This is intended to be the "ending index" into a sub-array of an HTML Vector of HTMLNode elements.
- Code:
- Exact Field Declaration Expression:
public final int end;
-
comp2
public static java.util.Comparator<DotPair> comp2
This is an "alternative Comparator" that can be used for sorting instances of this class. It should work with the Collections.sort(List, Comparator) method in the standard JDK package java.util.
NOTE: This simply compares the size of one DotPair to a second. The smaller shall be sorted first, and the larger (longer-in-length) DotPair shall be sorted later. If they are of equal size, whichever of the two has an earlier 'start' position in the Vector is considered first.
- See Also:
CommentNode.body
- Code:
- Exact Field Declaration Expression:
public static Comparator<DotPair> comp2 = (DotPair dp1, DotPair dp2) ->
{
    int ret = dp1.size() - dp2.size();
    return (ret != 0) ? ret : (dp1.start - dp2.start);
};
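The difference between the two sort orders described on this page (the size-first 'comp2' order and the start-first natural order) can be seen in a small sketch. The class Span below is a hypothetical stand-in that copies only the comparison logic shown in this documentation; it is not the library class, and BY_SIZE is a name invented for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in mirroring only the fields and size() shown on this page
final class Span implements Comparable<Span>
{
    final int start, end;   // both inclusive

    Span(int start, int end) { this.start = start; this.end = end; }

    int size() { return end - start + 1; }

    // Natural order: earlier start first; on ties, the shorter one first
    public int compareTo(Span other)
    {
        int ret = this.start - other.start;
        return (ret != 0) ? ret : (this.size() - other.size());
    }

    // Alternate order: shorter first; on ties, the earlier start first
    static final Comparator<Span> BY_SIZE = (a, b) ->
    {
        int ret = a.size() - b.size();
        return (ret != 0) ? ret : (a.start - b.start);
    };

    public String toString() { return "[" + start + ", " + end + "]"; }
}

class SpanSortDemo
{
    static List<Span> sample()
    {
        List<Span> l = new ArrayList<>();
        l.add(new Span(5, 6));
        l.add(new Span(0, 9));
        l.add(new Span(0, 2));
        return l;
    }

    static String naturalOrder()
    { List<Span> l = sample(); Collections.sort(l); return l.toString(); }

    static String sizeOrder()
    { List<Span> l = sample(); Collections.sort(l, Span.BY_SIZE); return l.toString(); }

    public static void main(String[] args)
    {
        System.out.println(naturalOrder());  // [[0, 2], [0, 9], [5, 6]]
        System.out.println(sizeOrder());     // [[5, 6], [0, 2], [0, 9]]
    }
}
```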
-
-
Constructor Detail
-
DotPair
public DotPair(int start, int end)
This constructor takes two integers and saves them into the public member fields.
- Parameters:
start - This is intended to store the starting position of a vectorized-webpage sub-list or subpage.
end - This will store the ending position of a vectorized-html webpage sub-list or subpage.
- Throws:
java.lang.IndexOutOfBoundsException - A negative 'start' or 'end' parameter-value will cause this exception to be thrown.
java.lang.IllegalArgumentException - A 'start' parameter-value that is larger than the 'end' parameter will cause this exception to be thrown.
- See Also: NodeIndex, SubSection
- Code:
- Exact Constructor Body:
if (start < 0) throw new IndexOutOfBoundsException
    ("Negative start value passed to DotPair constructor: start = " + start);

if (end < 0) throw new IndexOutOfBoundsException
    ("Negative ending value passed to DotPair constructor: end = " + end);

if (end < start) throw new IllegalArgumentException(
    "Start-parameter value passed to constructor is greater than ending-parameter: " +
    "start: [" + start + "], end: [" + end + ']'
);

this.start = start;
this.end = end;
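The validation rules above can be exercised with a short sketch. CheckedRange is a hypothetical stand-in that repeats the same three checks (with shortened messages); it is not the library class. Note that a pair whose 'start' equals its 'end' is accepted, since it describes a sub-list of exactly one element.

```java
// Hypothetical stand-in repeating the three constructor checks described above
final class CheckedRange
{
    final int start, end;

    CheckedRange(int start, int end)
    {
        if (start < 0) throw new IndexOutOfBoundsException("start = " + start);
        if (end < 0) throw new IndexOutOfBoundsException("end = " + end);

        if (end < start) throw new IllegalArgumentException
            ("start: [" + start + "], end: [" + end + "]");

        this.start = start;
        this.end = end;
    }
}

class CheckedRangeDemo
{
    // Reports which exception (if any) the constructor throws for the given values
    static String attempt(int start, int end)
    {
        try { new CheckedRange(start, end); return "ok"; }
        catch (IndexOutOfBoundsException e) { return "IndexOutOfBoundsException"; }
        catch (IllegalArgumentException e) { return "IllegalArgumentException"; }
    }

    public static void main(String[] args)
    {
        System.out.println(attempt(3, 3));   // ok
        System.out.println(attempt(-1, 4));  // IndexOutOfBoundsException
        System.out.println(attempt(5, 2));   // IllegalArgumentException
    }
}
```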
-
-
Method Detail
-
hashCode
public int hashCode()
Implements the standard Java 'hashCode()' method. This will provide a hash-code that is likely to avoid collisions.
- Overrides:
hashCode in class java.lang.Object
- Returns:
- A hash-code that may be used for inserting 'this' instance into a hashed table, map or list.
- Code:
- Exact Method Body:
return this.start + (1000 * this.end);
-
size
public int size()
The purpose of this is to remind the user that the array bounds are inclusive at BOTH ends of the sub-list. In many java.lang.String operations, the start-position is included in the results, but the end-position is not.
NOTICE: For an instance of 'DotPair', the start and ending positions are both INCLUSIVE, meaning they are both included in the sub-list.
- Returns:
- The length of a sub-array that would be indicated by this dotted pair.
- Code:
- Exact Method Body:
return this.end - this.start + 1;
-
toString
public java.lang.String toString()
Java's toString() requirement.
- Overrides:
toString in class java.lang.Object
- Returns:
- A string representing 'this' instance of DotPair.
- Code:
- Exact Method Body:
return "[" + start + ", " + end + "]";
-
equals
public boolean equals(java.lang.Object o)
Java's public boolean equals(Object o) requirements.
- Overrides:
equals in class java.lang.Object
- Parameters:
o - This may be any Java Object, but only ones of 'this' type whose internal-values are identical will cause this method to return TRUE.
- Returns:
- TRUE if (and only if) parameter 'o' is an instanceof DotPair and, also, both have equal start and ending field values.
- Code:
- Exact Method Body:
if (o instanceof DotPair)
{
    DotPair dp = (DotPair) o;
    return (this.start == dp.start) && (this.end == dp.end);
}
else return false;
-
clone
public DotPair clone()
Java's interface Cloneable requirements. This instantiates a new DotPair with identical 'start', 'end' fields.
- Overrides:
clone in class java.lang.Object
- Returns:
- A new DotPair whose internal fields are identical to this one.
- Code:
- Exact Method Body:
return new DotPair(this.start, this.end);
-
compareTo
public int compareTo(DotPair other)
Java's interface Comparable<T> requirements. This is not the only comparison operation possible, but it does satisfy one reasonable requirement, SPECIFICALLY: which of two separate instances of DotPair starts first.
NOTE: If two DotPair instances begin at the same Vector-index, then the shorter of the two shall come first.
- Specified by:
compareTo in interface java.lang.Comparable<DotPair>
- Parameters:
other - Any other DotPair to be compared to 'this' DotPair
- Returns:
- An integer that fulfils the requirements of the public int compareTo(T t) method of Java's interface Comparable<T>.
- Code:
- Exact Method Body:
int ret = this.start - other.start;
return (ret != 0) ? ret : (this.size() - other.size());
-
iterator
public java.util.PrimitiveIterator.OfInt iterator()
This shall return an int Iterator (which is properly named class java.util.PrimitiveIterator.OfInt) that iterates integers beginning with the value in this.start and ending with the value in this.end.
- Specified by:
iterator in interface java.lang.Iterable<java.lang.Integer>
- Returns:
- An Iterator that iterates 'this' instance of DotPair from the beginning of the range, to the end of the range. The Iterator returned will produce Java's primitive type int.
- Code:
- Exact Method Body:
return new PrimitiveIterator.OfInt()
{
    private int cursor = start;

    public boolean hasNext() { return this.cursor <= end; }

    public int nextInt()
    {
        // The range is inclusive, so the value 'end' itself must still be returned.
        // Only a cursor that has moved past 'end' is an error.
        if (cursor > end) throw new NoSuchElementException
            ("Cursor has passed the value stored in 'end' [" + end + "]");

        return cursor++;
    }
};
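As an illustration of how an inclusive-range PrimitiveIterator.OfInt behaves when consumed, here is a minimal self-contained sketch. The names InclusiveIteratorDemo, inclusive and sum are hypothetical and exist only in this example; the point is that the value stored in 'end' is produced before hasNext() reports false.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;

class InclusiveIteratorDemo
{
    // Iterates start, start + 1, ... end (with 'end' INCLUDED)
    static PrimitiveIterator.OfInt inclusive(final int start, final int end)
    {
        return new PrimitiveIterator.OfInt()
        {
            private int cursor = start;

            public boolean hasNext() { return cursor <= end; }

            public int nextInt()
            {
                if (cursor > end) throw new NoSuchElementException
                    ("Cursor has passed 'end' [" + end + "]");

                return cursor++;
            }
        };
    }

    // Adds up every index produced by the iterator
    static int sum(int start, int end)
    {
        int total = 0;
        PrimitiveIterator.OfInt it = inclusive(start, end);
        while (it.hasNext()) total += it.nextInt();   // nextInt() avoids boxing
        return total;
    }

    public static void main(String[] args)
    {
        System.out.println(sum(3, 6));   // 18  (3 + 4 + 5 + 6)
    }
}
```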
-
isInside
public boolean isInside(int index)
This will test whether a specific index is contained between dp.start and dp.end, inclusively.
- Parameters:
index - This is any integer index value. It must not be negative.
- Returns:
- TRUE if the value of index is greater-than-or-equal-to the value stored in field 'start' and furthermore is less-than-or-equal-to the value of field 'end'
- Throws:
java.lang.IndexOutOfBoundsException - If the value is negative, this exception will be thrown.
- Code:
- Exact Method Body:
if (index < 0) throw new IndexOutOfBoundsException
    ("You have passed a negative index [" + index + "] here, but this is not allowed.");

return (index >= start) && (index <= end);
-
enclosedBy
public boolean enclosedBy(DotPair other)
This will test whether 'this' DotPair is completely enclosed by parameter DotPair 'other'.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
- TRUE if (and only if) parameter 'other' encloses 'this'.
- Code:
- Exact Method Body:
return (other.start <= this.start) && (other.end >= this.end);
-
encloses
public boolean encloses(DotPair other)
This will test whether 'this' DotPair completely encloses parameter DotPair 'other'.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
- TRUE if (and only if) parameter 'other' is enclosed completely by 'this'.
- Code:
- Exact Method Body:
return (this.start <= other.start) && (this.end >= other.end);
-
overlaps
public boolean overlaps(DotPair other)
This will test whether parameter 'other' has any overlapping Vector-indices with 'this' DotPair.
- Parameters:
other - Another DotPair. This parameter is expected to be a descriptor of the same vectorized-webpage as 'this' DotPair is. It is not mandatory, but if not, the comparison is likely meaningless.
- Returns:
- TRUE if (and only if) parameter 'other' and 'this' have any overlap.
- Code:
- Exact Method Body:
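// Two inclusive ranges overlap exactly when each one starts no later than the other ends.
// This form also covers the case where 'this' completely encloses 'other'.
return (this.start <= other.end) && (other.start <= this.end);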
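One common way to phrase an overlap test for two inclusive ranges is the symmetric comparison sketched below. This is a general interval technique offered for illustration, not a statement about the library's implementation; OverlapDemo and its parameter names are hypothetical. The cases worth checking are a range that lies strictly inside another, two ranges that share a single end-point, and two adjacent ranges that share nothing.

```java
class OverlapDemo
{
    // Ranges [s1, e1] and [s2, e2] are both inclusive.  They share at least one
    // index exactly when each range starts no later than the other range ends.
    static boolean overlaps(int s1, int e1, int s2, int e2)
    { return (s1 <= e2) && (s2 <= e1); }

    public static void main(String[] args)
    {
        System.out.println(overlaps(0, 10, 3, 5));  // true:  [3, 5] lies inside [0, 10]
        System.out.println(overlaps(3, 5, 0, 10));  // true:  same pair, arguments swapped
        System.out.println(overlaps(0, 4, 4, 8));   // true:  index 4 is shared
        System.out.println(overlaps(0, 3, 4, 8));   // false: adjacent, nothing shared
    }
}
```

Because the test is symmetric, swapping the two ranges never changes the answer.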
-
toVector
public static java.util.Vector<HTMLNode> toVector (java.util.Vector<? extends HTMLNode> html, DotPair dp)
This method converts a sublist, represented by a "dotted pair", into a Vector of HTMLNode.
NOTE: The DotPair dp parameter contains fields start, end, which simply represent the starting and ending indices into the HTML page Vector. This method cycles through that Vector, beginning with the dp.start field, and ending with the dp.end field. Each HTMLNode reference within the sublist is inserted into the returned Vector.
- Parameters:
html - Any Vectorized-HTML Web-Page, or sub-page
dp - Any sublist within that HTML page.
- Returns:
- A Vector version of the original sublist that was represented by passed parameter 'dp'
- Code:
- Exact Method Body:
Vector<HTMLNode> ret = new Vector<>();
LV l = new LV(html, dp.start, dp.end + 1);

for (int i=l.start; i < l.end; i++) ret.addElement(html.elementAt(i));

return ret;
-
toVectors
public static java.util.Vector<java.util.Vector<HTMLNode>> toVectors (java.util.Vector<? extends HTMLNode> html, java.util.Vector<DotPair> sublists)
This will cycle through a "list of sublists" and call the method toVector(Vector<? extends HTMLNode> html, DotPair dp) on each sublist in the input parameter 'sublists'. Those sublists will be collected into another Vector and returned.
- Parameters:
html - Any Vectorized-HTML Web-Page, or sub-page
sublists - A "List of sublists" within that HTML page.
- Returns:
- This method shall return a Vector containing vectors as sublists.
- Code:
- Exact Method Body:
Vector<Vector<HTMLNode>> ret = new Vector<>();

for (DotPair sublist : sublists) ret.addElement(toVector(html, sublist));

return ret;
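The "list of sublists" idea can be sketched without the library as follows. In this illustration a Vector<String> stands in for a vectorized-HTML page and each int[] { start, end } stands in for one dotted pair; MultiSliceDemo, slice and slices are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.Vector;

class MultiSliceDemo
{
    // Copies indices start..end (BOTH inclusive) into a new Vector
    static Vector<String> slice(Vector<String> page, int start, int end)
    {
        Vector<String> ret = new Vector<>();
        for (int i = start; i <= end; i++) ret.addElement(page.elementAt(i));
        return ret;
    }

    // Each ranges[i] is an inclusive { start, end } pair; one Vector is produced per pair
    static Vector<Vector<String>> slices(Vector<String> page, int[][] ranges)
    {
        Vector<Vector<String>> ret = new Vector<>();
        for (int[] r : ranges) ret.addElement(slice(page, r[0], r[1]));
        return ret;
    }

    public static void main(String[] args)
    {
        Vector<String> page = new Vector<>
            (Arrays.asList("<p>", "one", "</p>", "<p>", "two", "</p>"));

        System.out.println(slices(page, new int[][] { {0, 2}, {3, 5} }));
        // [[<p>, one, </p>], [<p>, two, </p>]]
    }
}
```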
-
toSubSections
public static java.util.Vector<SubSection> toSubSections (java.util.Vector<? extends HTMLNode> html, java.util.Vector<DotPair> sublists)
This will cycle through a "list of sublists" and call the method toVector(Vector<? extends HTMLNode> html, DotPair dp) on each sublist in the input parameter 'sublists'. Each DotPair, together with the nodes extracted for it, is wrapped in a new SubSection, and those SubSection instances are collected into another Vector and returned.
- Parameters:
html - Any Vectorized-HTML Web-Page, or sub-page
sublists - A "List of sublists" within that HTML page.
- Returns:
- This method shall return a Vector containing one SubSection for each sublist.
- Code:
- Exact Method Body:
Vector<SubSection> ret = new Vector<>();

for (DotPair sublist : sublists)
    ret.addElement(new SubSection(sublist, toVector(html, sublist)));

return ret;
-
iterator
public static java.util.PrimitiveIterator.OfInt iterator (java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
- Code:
- Exact Method Body:
return toStream(dpi, leastToGreatest).iterator();
-
toPosArray
public static int[] toPosArray(java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
- Code:
- Exact Method Body:
return toStream(dpi, leastToGreatest).toArray();
-
toStream
public static java.util.stream.IntStream toStream (java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
This method will convert a list of DotPair instances to a Java java.util.stream.IntStream. The generated IntStream shall contain all Vector-indices (integers) that are within the bounds of any of the DotPair's listed by parameter 'dpi'.
Stated another way, this position-index list (IntStream) is built out of the contents of the 'dpi' parameter. The returned index-list will have all indices that are "inside" (as in isInside(int)) any of the DotPair's within parameter 'dpi'.
HINT: Many of the "Find" Methods available in the HTML.NodeSearch package return instances of Vector<DotPair>. These Vectors of DotPair are to be thought of as "lists of sub-lists" of a vectorized-html web-page. This method can help identify each and every integer-index that is "inside any of these passed sublists."
SUBTLE POINT: The sublists (the DotPair's of input-parameter 'dpi') might overlap. Furthermore, others might have gaps between them. This method shall return an IntStream of integer-indices, each appearing only once, all of which are guaranteed to be members of at least one (but possibly many) of the 'dpi' DotPair sublists.
NOTE ABOUT STALE-DATA: Always keep in mind that when writing code that modifies vectorized-HTML, the moment any node is inserted or deleted, the Vector indices held in your variables and data-structures may become stale or "invalid."
There are myriad ways to handle this issue, many of which are beyond the scope of this Documentation Entry. Generally, the best suggestion is that if you are modifying a vectorized-html page, perform your updates or removals in reverse order (greatest index first), and the Vector indices you have yet to process will not become stale.
- Parameters:
dpi - This may be any source of DotPair instances which implements the public interface java.lang.Iterable<DotPair>.
leastToGreatest - When this parameter receives a TRUE value, the results returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results sorted from greatest to least, pass FALSE to this parameter.
- Returns:
- A Java java.util.stream.IntStream of the integers that are members of this Iterable<DotPair>
- Code:
- Exact Method Body:
Iterator<DotPair> iter = dpi.iterator();
TreeSet<DotPair> ts = new TreeSet<>();

while (iter.hasNext()) ts.add(iter.next());
// The tree-set will add the "DotPair" to the tree - and keep them sorted,
// since that's what "TreeSet" does.

Iterator<DotPair> tsIter = leastToGreatest ? ts.iterator() : ts.descendingIterator();
IntStream.Builder builder = IntStream.builder();
DotPair dp = null;

if (leastToGreatest)
    while (tsIter.hasNext())
        for (int i=(dp=tsIter.next()).start; i <= dp.end; i++) builder.add(i);
        // We are building a "forward-index" stream... DO AS MUCH SORTING... AS POSSIBLE!
else
    while (tsIter.hasNext())
        for (int i=(dp=tsIter.next()).end; i >= dp.start; i--) builder.add(i);
        // we are building a "reverse-index" stream... Make sure to add the sub-lists in
        // reverse-order.

if (leastToGreatest)
    return builder.build().sorted().distinct();
    // We have added them in order (mostly!!) - VERY-TRICKY, and this is the whole point...
    // MULTIPLE, OVERLAPPING DOTPAIRS
    // We need to sort because the DotPair sublists have been added in "sorted order" but
    // the overall list is not (necessarily, but possibly) sorted!
else
    return builder.build().map(i -> -i).sorted().map(i -> -i).distinct();
    // Here, the exact same argument holds, but also, when "re-sorting" we have to futz
    // around with the fact that Java's 'IntStream' class does not have a specialized
    // reverse-sort() (or alternate-sort()) method... (Kind of another JDK bug).
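The collation described above, and the advice about removing nodes in reverse order, can be sketched together. This is an independent illustration under stated assumptions: each int[] { start, end } stands in for one inclusive dotted pair, a Vector<String> stands in for the page, and CollateDemo, collate and removeAll are hypothetical names that do not exist in the library.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Vector;
import java.util.stream.IntStream;

class CollateDemo
{
    // Every index covered by at least one inclusive { start, end } pair, without duplicates
    static int[] collate(int[][] ranges, boolean leastToGreatest)
    {
        IntStream all = Arrays.stream(ranges)
            .flatMapToInt(r -> IntStream.rangeClosed(r[0], r[1]))
            .distinct();

        if (leastToGreatest) return all.sorted().toArray();

        // IntStream has no reverse sort, so box, sort in reverse, and unbox
        return all.boxed()
            .sorted(Comparator.reverseOrder())
            .mapToInt(Integer::intValue)
            .toArray();
    }

    // Removing from the highest index down keeps the remaining (lower) indices valid
    static Vector<String> removeAll(Vector<String> page, int[][] ranges)
    {
        Vector<String> copy = new Vector<>(page);
        for (int i : collate(ranges, false)) copy.removeElementAt(i);
        return copy;
    }

    public static void main(String[] args)
    {
        int[][] ranges = { {4, 6}, {1, 2}, {2, 5} };   // overlapping and unsorted

        System.out.println(Arrays.toString(collate(ranges, true)));   // [1, 2, 3, 4, 5, 6]
        System.out.println(Arrays.toString(collate(ranges, false)));  // [6, 5, 4, 3, 2, 1]

        Vector<String> page = new Vector<>
            (Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h"));

        System.out.println(removeAll(page, ranges));                  // [a, h]
    }
}
```

Had the removals run from least to greatest instead, each removeElementAt call would shift the later elements down by one, and the remaining indices would no longer point at the intended nodes.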
-
endPointsIterator
public static java.util.PrimitiveIterator.OfInt endPointsIterator (java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
- Code:
- Exact Method Body:
return endPointsToStream(dpi, leastToGreatest).iterator();
-
endPointsToPosArray
public static int[] endPointsToPosArray(java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
Convenience Method.
Invokes: endPointsToStream(Iterable, boolean)
Converts: output to an int[] array.
- Code:
- Exact Method Body:
return endPointsToStream(dpi, leastToGreatest).toArray();
-
endPointsToStream
public static java.util.stream.IntStream endPointsToStream (java.lang.Iterable<DotPair> dpi, boolean leastToGreatest)
Collates a list of dotted-pairs into an IntStream. Here, only the starting and ending values of the DotPair's are inserted into the returned IntStream. Any indices that lie between DotPair.start and DotPair.end are not placed into the output IntStream.
All other behaviors of this method are the same as toStream(Iterable, boolean).
- Parameters:
dpi - This may be any source of DotPair instances which implements the public interface java.lang.Iterable<DotPair>.
leastToGreatest - When this parameter receives a TRUE value, the results returned from this IntStream will be sorted least to greatest. To generate an IntStream that produces results sorted from greatest to least, pass FALSE to this parameter.
- Returns:
- A Java java.util.stream.IntStream containing only the values DotPair.start and DotPair.end of each member of this Iterable<DotPair>. This is unlike the method toStream(Iterable, boolean), in which the start-index, end-index and all indices in between them are placed into the returned Stream.
- See Also:
toStream(java.lang.Iterable<Torello.HTML.DotPair>,boolean)
- Code:
- Exact Method Body:
Iterator<DotPair> iter = dpi.iterator();
TreeSet<DotPair> ts = new TreeSet<>();

while (iter.hasNext()) ts.add(iter.next());
// The tree-set will add the "DotPair" to the tree - and keep them sorted,
// since that's what "TreeSet" does.

Iterator<DotPair> tsIter = leastToGreatest ? ts.iterator() : ts.descendingIterator();
IntStream.Builder builder = IntStream.builder();
DotPair dp = null;

if (leastToGreatest)
    while (tsIter.hasNext())
    {
        dp = tsIter.next();
        builder.add(dp.start);  // In this method, only start/end are placed into the IntStream
        builder.add(dp.end);    // The indices BETWEEN start/end ARE NOT appended to the IntStream
    }
    // We are building a "forward-index" stream... DO AS MUCH SORTING... AS POSSIBLE!
else
    while (tsIter.hasNext())
    {
        dp = tsIter.next();
        builder.add(dp.end);    // Only start/end are appended.
        builder.add(dp.start);  // NOTE: This is a "reverse order" IntStream
    }
    // we are building a "reverse-index" stream... Make sure to add the sub-lists in
    // reverse-order.

if (leastToGreatest)
    return builder.build().sorted().distinct();
    // We have added them in order (mostly!!) - VERY-TRICKY, and this is the whole point...
    // MULTIPLE, OVERLAPPING DOTPAIRS
    // We need to sort because the DotPair sublists have been added in "sorted order" but
    // the overall list is not (necessarily, but possibly) sorted!
else
    return builder.build().map(i -> -i).sorted().map(i -> -i).distinct();
    // Here, the exact same argument holds, but also, when "re-sorting" we have to futz
    // around with the fact that Java's 'IntStream' class does not have a specialized
    // reverse-sort() (or alternate-sort()) method... (Kind of another JDK bug).
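For comparison with the full collation, an end-points-only variant can be sketched as below. As before, this is a hypothetical stand-alone illustration in which each int[] { start, end } plays the role of one inclusive dotted pair; EndPointsDemo and endPoints are names invented for the example.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class EndPointsDemo
{
    // Only the 'start' and 'end' value of each pair is kept; indices in between are skipped
    static int[] endPoints(int[][] ranges, boolean leastToGreatest)
    {
        int[] asc = Arrays.stream(ranges)
            .flatMapToInt(r -> IntStream.of(r[0], r[1]))
            .distinct()
            .sorted()
            .toArray();

        if (leastToGreatest) return asc;

        // Reverse the ascending array to obtain greatest-to-least order
        int[] desc = new int[asc.length];
        for (int i = 0; i < asc.length; i++) desc[i] = asc[asc.length - 1 - i];
        return desc;
    }

    public static void main(String[] args)
    {
        // {40, 40} is a one-element pair: its start and end are the same index
        int[][] ranges = { {10, 20}, {15, 30}, {40, 40} };

        System.out.println(Arrays.toString(endPoints(ranges, true)));   // [10, 15, 20, 30, 40]
        System.out.println(Arrays.toString(endPoints(ranges, false)));  // [40, 30, 20, 15, 10]
    }
}
```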
-
-