Interface HTMLModifier

  • Functional Interface:
    This is a functional interface and can therefore be used as the assignment target for a lambda expression or method reference.

    @FunctionalInterface
    public interface HTMLModifier
    HTMLModifier - Documentation.

    This interface lets a user supply a method or a lambda expression that receives an HTML page (as vectorized HTML) and performs any or all of the following:

    1. Modify the HTML Elements in the news Article
    2. Clean or remove unnecessary portions of the Article
    3. Update URLs before downloading the pictures
    4. Insert HTML into portions of the page
    5. Extract salient information for internal processing
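
As a rough illustration of item 2, here is a minimal sketch of a lambda that fits the method shape documented on this page. The `HTMLNode` class below is a hypothetical stand-in with a single `str` field (the real class is not shown here), the interface is redeclared locally so the example compiles on its own, and the URL is a placeholder.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

class HTMLModifierSketch
{
    // Hypothetical stand-in for the library's HTMLNode; only here so the sketch compiles alone.
    static class HTMLNode
    {
        final String str;
        HTMLNode(String str) { this.str = str; }
    }

    // Local copy of the method shape documented on this page.
    @FunctionalInterface
    interface HTMLModifier
    {
        void modifyOrRetrieve(Vector<HTMLNode> html, URL originalPageURL, int sectionNum, int articleNum);
    }

    // A "clean up" modifier: drops every node whose text is empty or whitespace-only.
    static final HTMLModifier REMOVE_BLANK_NODES =
        (html, url, sectionNum, articleNum) -> html.removeIf(n -> n.str.trim().isEmpty());

    // Builds a tiny page, applies the modifier, and returns the text of the remaining nodes.
    static List<String> demo()
    {
        Vector<HTMLNode> page = new Vector<>();
        page.add(new HTMLNode("<p>"));
        page.add(new HTMLNode("   "));
        page.add(new HTMLNode("Headline text"));
        page.add(new HTMLNode("</p>"));

        URL url;
        try
            { url = URI.create("http://example.com/news/article.html").toURL(); } // placeholder URL
        catch (MalformedURLException e)
            { throw new IllegalStateException(e); }

        // Section 0, article 1: arbitrary values, unused by this particular modifier.
        REMOVE_BLANK_NODES.modifyOrRetrieve(page, url, 0, 1);

        List<String> remaining = new ArrayList<>();
        for (HTMLNode n : page) remaining.add(n.str);
        return remaining;
    }

    public static void main(String[] args)
    {
        System.out.println(demo());
    }
}
```

The modifier changes the Vector in place, which matches the `void` return type of the documented method.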



    • Field Detail

      • serialVersionUID

        static final long serialVersionUID
        This fulfils the serialVersionUID requirement for all classes that implement Java's interface java.io.Serializable. Java's built-in serialization is easy to use, and it can make saving program state while debugging much easier. It can also be used to store data in place of more complicated systems such as Hibernate.

        Functional Interfaces are usually not thought of as Data Objects that need to be saved, stored and retrieved; however, having the ability to store intermediate results along with the lambda-functions that helped get those results can make debugging easier.
        See Also:
        Constant Field Values
        Code:
        Exact Field Declaration Expression:
        public static final long serialVersionUID = 1;
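
The remark above about storing lambdas alongside intermediate results can be sketched as follows. This is only an illustration under stated assumptions: `Modifier` is a hypothetical, simplified interface that extends `java.io.Serializable` (it is not the interface documented on this page), and the lambda is written to an in-memory byte array rather than to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

class SerializableLambdaSketch
{
    // Hypothetical, simplified functional interface. Because it extends Serializable,
    // lambdas assigned to it can be written with ObjectOutputStream.
    @FunctionalInterface
    interface Modifier extends Serializable
    {
        long serialVersionUID = 1; // implicitly public static final inside an interface
        void modify(Vector<String> html);
    }

    // Serializes the lambda to bytes, reads it back, runs the restored copy.
    static List<String> roundTrip()
    {
        Modifier original = v -> v.removeIf(s -> s.trim().isEmpty());

        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes))
                { out.writeObject(original); }

            Modifier restored;
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())))
                { restored = (Modifier) in.readObject(); }

            Vector<String> html = new Vector<>(List.of("a", " ", "b"));
            restored.modify(html); // the restored lambda behaves like the original
            return new ArrayList<>(html);
        }
        catch (IOException | ClassNotFoundException e)
            { throw new IllegalStateException(e); }
    }

    public static void main(String[] args)
    {
        System.out.println(roundTrip());
    }
}
```

A serialized lambda can only be read back by a program that still contains the class in which the lambda was defined, so this approach suits debugging sessions better than long-term storage.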
        
    • Method Detail

      • modifyOrRetrieve

        void modifyOrRetrieve​(java.util.Vector<HTMLNode> html,
                              java.net.URL originalPageURL,
                              int sectionNum,
                              int articleNum)
        FUNCTIONAL-INTERFACE METHOD: This is the single abstract method that a lambda expression, method reference, or implementing class must provide to satisfy this functional interface.
        Parameters:
        html - This is the vectorized-HTML news-article
        originalPageURL - This is the original URL from which the page was downloaded. It is provided to this method merely for convenience.
        sectionNum - Each downloaded article belongs to a particular news section. News sections are listed by their position in a Java Vector, and the index into this Vector is treated as the 'ID', or the 'number', of the section.

        This value is provided to this method just for convenience. When '.html' files are saved (as '.dat' files), their save location has the form 'directoryName/fileName.dat', where the String 'directoryName' is the section number provided here.
        articleNum - Each downloaded article is given a number that is simply its position in the download order. The 5th article downloaded in a particular section would have the filename '005.dat'.

        This value is provided here to this method just for convenience.
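
The naming scheme described for sectionNum and articleNum can be sketched as a small helper. Two assumptions are made that this page does not confirm: the directory name is the plain decimal section number, and the file name is the article number zero-padded to three digits (inferred from the '005.dat' example). `savePath` is a hypothetical name, not a method of this interface.

```java
import java.util.Locale;

class SavePathSketch
{
    // Assumption: directory = section number as plain decimal,
    // file = article number zero-padded to three digits, plus ".dat".
    static String savePath(int sectionNum, int articleNum)
    {
        return sectionNum + "/" + String.format(Locale.ROOT, "%03d", articleNum) + ".dat";
    }

    public static void main(String[] args)
    {
        System.out.println(savePath(2, 5));
    }
}
```

A modifier could use such a helper to log which saved file a given call corresponds to.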