Package Torello.Java

Class GREP


  • public class GREP
    extends java.lang.Object
    GREP - Documentation.

    GREP: Global Regular-Expressions Print



    GREP is a long-standing UNIX command-line program that searches the contents of files on the file-system. It was written many years ago, in the 1970's. The command-line version remains very useful, and this Java version of {@code GREP} adds several conveniences of its own. Because the class FileNode has been through extensive testing, searching through the files of a directory-tree that FileNode maintains becomes much easier when these two classes 'work together.'

    POINT OF ORDER: class FileNode is intended to be used as a "General Purpose Tool" for many operations on the File-System. The original use for the UNIX / BASH-Shell Command "GREP" was to search through files on the file-system (either a single file, or recursively through an entire directory-tree), and to print out information about which files' contents matched certain "match criteria." These "match criteria" would be passed to the {@code grep} command-line tool as regular-expressions, or as simple text-Strings to be matched explicitly. Here in Java, the FunctionalInterface SearchAndPrint that is passed to this tool ought to do its own version of text-matching and text-printing. Generally, this should be easier in Java than on the command line, not harder.

    EXTENDING GREP, SED, AWK: Once Regular-Expression matches are returned, it would likely take a single line to re-use Java's regular-expression matching and replacement tools from package java.util.regex to do quick String-replacement operations on large sets of text-files. Writing Java Build Scripts, for instance, occasionally requires fixing the same line of code in numerous locations. Finding that line of code using this class GREP (in conjunction with class FileNode) can be quick, and invoking the method String.replace(...) (literal text) or String.replaceAll(...) (regular-expression) would provide features similar to those of two other UNIX tools from the 1970's - SED & AWK.
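
    The replacement step described above can be sketched with the JDK alone. This is only an illustration: the class name SedLikeReplace, the method replaceInLines and the sample build-script lines are hypothetical, and no part of this library is used. Note that the replacement text passed to replaceAll treats '$' and '\' specially; Matcher.quoteReplacement(...) can be used when the replacement must be taken literally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public final class SedLikeReplace
{
    // Applies the regex replacement to every line and returns the rewritten lines.
    // Lines without a match are copied through unchanged.
    public static List<String> replaceInLines(List<String> lines, Pattern regex, String replacement)
    {
        List<String> out = new ArrayList<>(lines.size());
        for (String line : lines) out.add(regex.matcher(line).replaceAll(replacement));
        return out;
    }

    public static void main(String[] args)
    {
        // Hypothetical build-script lines that share one setting to be fixed
        List<String> buildScript = Arrays.asList(
            "javac -source 8 -d out src/A.java",
            "echo done",
            "javac -source 8 -d out src/B.java");

        Pattern p = Pattern.compile("-source\\s+8\\b");

        for (String line : replaceInLines(buildScript, p, "-source 11"))
            System.out.println(line);
    }
}
```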

    Example:
    public static void main(String[] argv) throws IOException
    {
        // This "builder" or "hamburger stack" of method invocations loads the HTML Documentation Files into a FileNode tree.
        // The resulting tree is then "flattened" from a tree-structure into a Vector.
        // If there happen to be any non-HTML files in the named directory (which there aren't), they will be "pruned" (filtered)
        Vector<FileNode> htmlFiles = FileNode
            .createRoot("Torello/Build/DocHTML/NodeSearch/")
            .loadTree(true, false)
            .prune(fileNode -> fileNode.getFullPathName().endsWith(".html"), false)
            .flatten();
    
        // This will search through all "HTML" (Java Code Documentation Files) for the RegEx String in the "Pattern.compile" method
        // The command will use the "Simple" IOExceptionHandler (only if there are any IOExceptions, which here, there won't be)
        // The command will use the "ALL" (SearchAndPrint class) to print all matches.
        GREP.search(htmlFiles, SearchAndPrint.ALL(Pattern.compile
            ("<td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>\\*and\\*</i>"),
            true, System.out), IOExceptionHandler.SIMPLE);
    
        // ALSO NOTE: In the above line, the return results ARE NOT retained, but they easily could be with the line:
        // Vector<FileNode> matchResults = GREP.search(...)
    }
    

    If the above method were run, it would print the following text to a UNIX terminal (with colors):

    File: Torello/Build/DocHTML/NodeSearch/TagNodeFindInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagPoll.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagPoll.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeCount.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagRemove.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagRemove.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeGetL1Inclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagGetInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagGetInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagPollInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagPollInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagCount.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagCount.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodePeekL1Inclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeFindL1Inclusive.html, Line:     <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeGet.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeIterator.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagIterator.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagIterator.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagRemoveInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagRemoveInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeFind.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagPeekInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagPeekInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagFind.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagFind.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeRemoveInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodePeekInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagPeek.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagPeek.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeInclusiveIterator.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagGet.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagGet.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeGetInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodeRemove.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagFindInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagFindInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodePollInclusive.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagInclusiveIterator.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/InnerTagInclusiveIterator.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodePeek.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    File: Torello/Build/DocHTML/NodeSearch/TagNodePoll.html, Line:      <td>When this parameter is present, only HTMLNode's which are both <code>instanceof class TagNode</code> <i>*and*</i> have
    


    Static (Functional) API: The methods in this class are all (100%) defined with the Java Key-Word / Key-Concept 'static'. Furthermore, there is no way to obtain an instance of this class, because there are no public (nor private) constructors. Java's Spring-Boot / MVC approach is *not* utilized because it flies directly in the face of the light-weight data-classes philosophy. A static API has several advantages over the rather ornate Component Annotations (@Component, @Service, @Autowired, etc... 'Java Beans') syntax:

    • The methods here use the key-word 'static' which means (by implication) that there is no internal-state. Without any 'internal state' there is no need for constructors in the first place! (This is often the complaint by MVC Programmers).
    • A 'Static' (Functional-Programming) API expects to use fewer data-classes, and light-weight data-classes, making it easier to understand and to program.
    • The Vectorized HTML data-model allows more user-control over HTML parse, search, update & scrape. Also, memory management, memory leakage, and the Java Garbage Collector ought to be intelligible through the 'reuse' of the standard JDK class Vector for storing HTML Web-Page data.

    The power that object-oriented programming extends to a user is (mostly) limited to data-representation. Thinking of "Services" as "Objects" (Spring-MVC, 'Java Beans') is somewhat 'over-applying' the Object Oriented Programming Model. Like most classes in the Java-HTML JAR Library, this class backtracks to a more C-Styled Functional Programming Model (no Objects) - by re-using (quite profusely) the key-word static with all of its methods, and by sticking to Java's well-understood class Vector.

    Internal-State: A user may click on this class' source code (see link below) to view any and all internally defined fields of this class. A cursory inspection of the code shows that this class has precisely zero internally defined global fields (Spaghetti). All variables used by the methods in this class are local variables only, and therefore this class ought to be thought of as 'state-less'.



    • Method Detail

      • search

        public static <T> T search​(VarList<T,​FileNode> listChoice,
                                   java.util.Iterator<FileNode> iter,
                                   SearchAndPrint sp,
                                   IOExceptionHandler ioeh)
        This version of GREP shall search every file in a java Iterator<FileNode> data-structure. Any 'directories' (which are not files) returned by this Iterator shall be skipped / ignored.
        Parameters:
        listChoice - This allows the user to choose a particular Container or List, such as Vector, Stream, or array. Furthermore, the ability to specify a sort-type, and a data-out choice including File-Name, Full-Path Name, or File-Node, is provided by a long list of options in class RetTypeChoice.

        See: FileNode.RetTypeChoice.
        iter - Any Iterator of type Iterator<FileNode>. Each file returned by this Iterator shall be searched for the described matches using 'SearchAndPrint' parameter 'sp'.
        sp - The 'sp' parameter is the implementation of the file-grep operation that the user is requesting. The actions that this SearchAndPrint method-pointer should perform include:

        • Test whether the file contains a match, using whatever string-testing procedure is required. Usually either a simple String token or a regular-expression is being searched for.
        • Print some kind of output, reporting these String-match results, to a writable output destination.
        • Return either TRUE or FALSE to indicate whether or not a particular file contained matches.


        IMPORTANT: This parameter cannot be null. It is the core of the grep-search algorithm. If the programmer wishes to rely on the standard 'token-search' or 'regular-expression-search' methods defined in class 'SearchAndPrint' - then the programmer should simply use one of the standard factory methods provided in 'SearchAndPrint' to build an instance of this class.
        ioeh - When performing these UNIX-GREP styled text-file searches, it becomes imperative to catch any potential IOException's, so that if one file fails to load, the rest of the search is not sacrificed or halted. Using this parameter allows a programmer to log exceptions, or take any user-defined action, when an IOException occurs while reading the files in a directory tree. Using this exception-handler allows the GREP-Search to continue, even if reading one particular file fails.

        NOTE: This parameter may be null, and if it is, it will be ignored - and all exceptions shall be suppressed. This means that no exception information will be reported back to the user. If this parameter is null, and if an IOException is generated while reading a particular file for GREP-search, that file will merely be skipped (gracefully), and its contents will not be searched.
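
        The skip-and-continue behavior described for 'ioeh' can be sketched with the JDK alone. The sketch below is an assumption-laden stand-in, not this library's code: java.nio.file.Path takes the place of FileNode, a BiConsumer<Path, IOException> takes the place of IOExceptionHandler, and the names SkipOnIOException and filesContaining are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;

public final class SkipOnIOException
{
    // Returns the paths whose contents contain 'token'. A file that cannot be read is
    // reported to 'handler' (when it is non-null) and skipped; the loop then continues.
    public static List<Path> filesContaining
        (List<Path> files, String token, BiConsumer<Path, IOException> handler)
    {
        List<Path> matches = new ArrayList<>();

        for (Path p : files)
        {
            try
            {
                String contents = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
                if (contents.contains(token)) matches.add(p);
            }
            catch (IOException e)
            {
                // A null handler means the failure is silently suppressed
                if (handler != null) handler.accept(p, e);
            }
        }

        return matches;
    }

    public static void main(String[] args) throws IOException
    {
        Path good = Files.createTempFile("grep-sketch", ".txt");
        Files.write(good, "hello TagNode".getBytes(StandardCharsets.UTF_8));

        // A sibling path that is assumed not to exist, to force an IOException
        Path missing = good.resolveSibling("grep-sketch-does-not-exist.txt");

        List<Path> hits = filesContaining(
            Arrays.asList(missing, good), "TagNode",
            (p, e) -> System.err.println("Skipped " + p + ": " + e));

        System.out.println(hits);
        Files.deleteIfExists(good);
    }
}
```

        With a non-null handler the unreadable path is reported once and the readable file is still searched; with a null handler the result is the same, but nothing is reported.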
        Returns:
        The return-list shall contain a reference to every instance of FileNode whose contents on the file-system contained a match - as determined by the SearchAndPrint instance that is passed to parameter 'sp'.

        The implementation of VarList<T, FileNode> used inside of class FileNode.RetTypeChoice allows a user of this API to specify the return type of this method. There are dozens of options for what file-system information about each file being returned should include, and how these returned instances of FileNode should be sorted. Note that this return-value allows for converting the files to their name as Java String's - among other operations.

        Furthermore, the static member fields in class RetTypeChoice also allow a programmer to choose a container for holding these instances of FileNode (or String-converted FileNode's).

        See: FileNode.RetTypeChoice.
        See Also:
        VarList, SearchAndPrint, FileRW.loadFileToString(String), FileNode.isDirectory, FileNode.toString(), SearchAndPrint.test(String, String), IOExceptionHandler.accept(FileNode, IOException)
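
        The three duties listed for parameter 'sp' (test, print, return a boolean) can be sketched as follows. The interface FileTest below is a hypothetical stand-in that only copies the shape of SearchAndPrint.test(String, String); the factory printMatchingLines and the Consumer<String> output sink are assumptions made for this sketch, not the factory methods that class SearchAndPrint actually provides.

```java
import java.util.function.Consumer;
import java.util.regex.Pattern;

public final class SearchAndPrintSketch
{
    // Hypothetical stand-in with the same shape as SearchAndPrint.test(String, String)
    @FunctionalInterface
    public interface FileTest
    {
        boolean test(String fileName, String fileContents);
    }

    // Builds a FileTest that sends every line containing a regex match to 'out',
    // and reports whether at least one line of the file matched.
    public static FileTest printMatchingLines(Pattern regex, Consumer<String> out)
    {
        return (fileName, fileContents) ->
        {
            boolean found = false;

            // "\\R" matches any line-break sequence
            for (String line : fileContents.split("\\R"))
            {
                if (regex.matcher(line).find())
                {
                    out.accept("File: " + fileName + ", Line: " + line);
                    found = true;
                }
            }

            return found;
        };
    }

    public static void main(String[] args)
    {
        FileTest sp = printMatchingLines(Pattern.compile("TagNode\\b"), System.out::println);
        boolean matched = sp.test("Example.html", "<td>TagNode</td>\n<td>TextNode</td>");
        System.out.println(matched);
    }
}
```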
        Code:
        Exact Method Body:
         VarList<T, FileNode> ret = listChoice.create();
        
         while (iter.hasNext())
         {
             FileNode fn = iter.next();
             if (fn.isDirectory) continue;
             try
             {
                 String  fileContents    = FileRW.loadFileToString(fn.toString());
                 boolean wasMatch        = sp.test(fn.toString(), fileContents);
        
                 if (wasMatch) ret.insert(fn);
        
             } catch (IOException e) { if (ioeh != null) ioeh.accept(fn, e); }
         }
         return ret.retrieve();
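
        The method body above touches the list-choice object through only three calls: create(), insert(...) and retrieve(). The sketch below shows how such an object lets the caller fix the return type T. It is an illustration under stated assumptions: ResultBuilder, toVector and keepIf are hypothetical names, plain Strings stand in for FileNode, and the real VarList and RetTypeChoice may be designed differently.

```java
import java.util.Arrays;
import java.util.Vector;
import java.util.function.Function;
import java.util.function.Predicate;

public final class ListChoiceSketch
{
    // Hypothetical stand-in exposing the three calls used by the search loop
    public interface ResultBuilder<T, E>
    {
        ResultBuilder<T, E> create();   // a fresh, empty builder of the same kind
        void insert(E element);         // record one matching element
        T retrieve();                   // hand back the finished container
    }

    // Collects elements into a Vector, converting each one with 'convert' first
    public static <E, R> ResultBuilder<Vector<R>, E> toVector(Function<E, R> convert)
    {
        return new ResultBuilder<Vector<R>, E>()
        {
            private final Vector<R> v = new Vector<>();

            public ResultBuilder<Vector<R>, E> create() { return toVector(convert); }
            public void insert(E element)               { v.add(convert.apply(element)); }
            public Vector<R> retrieve()                 { return v; }
        };
    }

    // Same shape as the loop above: the builder passed in decides the return type T
    public static <T> T keepIf
        (ResultBuilder<T, String> choice, Iterable<String> items, Predicate<String> keep)
    {
        ResultBuilder<T, String> ret = choice.create();
        for (String s : items) if (keep.test(s)) ret.insert(s);
        return ret.retrieve();
    }

    public static void main(String[] args)
    {
        // Keep the ".html" names, but return their lengths rather than the names
        Vector<Integer> lengths = keepIf(
            ListChoiceSketch.<String, Integer>toVector(String::length),
            Arrays.asList("a.html", "b.txt", "cc.html"),
            name -> name.endsWith(".html"));

        System.out.println(lengths);
    }
}
```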
        
      • searchFile

        public static boolean searchFile​(FileNode file,
                                         SearchAndPrint sp,
                                         IOExceptionHandler ioeh)
        This version of GREP shall search a single file only. The passed parameter 'file' must be a FileNode instance that represents a UNIX or MS-DOS file, not a directory.
        Parameters:
        file - This is the file to be searched or "GREPPED."
        sp - The 'sp' parameter is the implementation of the file-grep operation that the user is requesting. The actions that this SearchAndPrint method-pointer should perform include:

        • Test whether the file contains a match, using whatever string-testing procedure is required. Usually either a simple String token or a regular-expression is being searched for.
        • Print some kind of output, reporting these String-match results, to a writable output destination.
        • Return either TRUE or FALSE to indicate whether or not a particular file contained matches.


        IMPORTANT: This parameter cannot be null. It is the core of the grep-search algorithm. If the programmer wishes to rely on the standard 'token-search' or 'regular-expression-search' methods defined in class 'SearchAndPrint' - then the programmer should simply use one of the standard factory methods provided in 'SearchAndPrint' to build an instance of this class.
        ioeh - This parameter may be used, or it may be left null. It is less emphasized here, because only a single file is being searched. That file is specified in the parameters to this method. Since there is only one file to search, catching the IOException's makes little difference because GREP-search on other files will not be hindered, since there are no other files being searched.
        Returns:
        A value of TRUE shall indicate that there was a match found.
        Throws:
        FileExpectedException - The parameter 'file' must actually be a 'file' instance of FileNode. If 'file' is not actually a file, but rather a directory, then this exception shall be thrown.
        See Also:
        FileExpectedException.check(FileNode), FileRW.loadFileToString(String), FileNode.toString(), SearchAndPrint.test(String, String), IOExceptionHandler.accept(FileNode, IOException)
        Code:
        Exact Method Body:
         // Can only test files, not directories...
         FileExpectedException.check(file);
        
         // Test to see if there is a match
         try
             { return sp.test(file.toString(), FileRW.loadFileToString(file.toString())); }
         catch (IOException e)
             { if (ioeh != null) ioeh.accept(file, e); return false; }
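
        One consequence of the method body above is that a return value of FALSE covers two cases: the file was read and did not match, or the file could not be read at all. Only the handler can tell them apart. The JDK-only sketch below mirrors that control flow under stated assumptions: Path stands in for FileNode, IllegalArgumentException stands in for FileExpectedException, and the functional types BiPredicate and BiConsumer stand in for SearchAndPrint and IOExceptionHandler. The class name SingleFileSearchSketch is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.BiConsumer;
import java.util.function.BiPredicate;

public final class SingleFileSearchSketch
{
    public static boolean searchFile
        (Path file, BiPredicate<String, String> test, BiConsumer<Path, IOException> handler)
    {
        // Mirrors the directory check: only files may be searched
        if (Files.isDirectory(file))
            throw new IllegalArgumentException("Expected a file, not a directory: " + file);

        try
        {
            String contents = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return test.test(file.toString(), contents);
        }
        catch (IOException e)
        {
            // The read failure is visible only through the handler (if one was given)
            if (handler != null) handler.accept(file, e);
            return false;
        }
    }

    public static void main(String[] args)
    {
        // A path that is assumed not to exist, so that the read fails
        Path missing = Paths.get(System.getProperty("java.io.tmpdir"), "grep-sketch-no-such-file.txt");

        boolean matched = searchFile(
            missing,
            (name, contents) -> contents.contains("TagNode"),
            (p, e) -> System.err.println("Could not read " + p + ": " + e));

        System.out.println(matched);
    }
}
```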