Class Translate


  • public class Translate
    extends java.lang.Object
    Translate (普通话, Simplified & Traditional Chinese) - Documentation.

    This provides a marginally simpler way to access Google Cloud Server's Translate API, with the specific requirement that the source language is Mandarin. It automatically converts the response into Java Vectors of Strings. Translations of Mandarin can be provided in English, Spanish, and even the "Traditional Character Set" of Mandarin.

    Static (Functional) API: Every method in this class is declared with the Java keyword 'static'. Furthermore, there is no way to obtain an instance of this class, because it exposes no public constructors. The Spring Boot / Spring MVC framework is *not* used, because it runs counter to the light-weight data-classes philosophy. The static approach has several advantages over the rather ornate component-annotation syntax (@Component, @Service, @Autowired, etc.; 'Java Beans'):

    • The methods here are all 'static' and the class declares no fields, so there is no internal state. Without any internal state there is no need for constructors in the first place! (This is a common complaint among MVC programmers.)
    • A 'Static' (Functional-Programming) API tends to use fewer, lighter-weight data-classes, making it easier to understand and to program against.
    • The Vectorized HTML data-model gives the user more control over HTML parsing, searching, updating and scraping. Memory management, memory leaks, and the behavior of the Java Garbage Collector should also be easier to reason about, because HTML Web-Page data is stored in the standard JDK class Vector.

    The power that object-oriented programming extends to a user is (mostly) limited to data-representation. Thinking of "Services" as "Objects" (Spring-MVC, 'Java Beans') somewhat over-applies the Object-Oriented Programming Model. Like most classes in the Java-HTML JAR Library, this class falls back on a more C-Styled Functional Programming Model (no Objects): all of its methods are declared static, and it sticks to Java's well-understood class Vector.

    Internal-State: A user may click on this class' source code (see link below) to view any and all internally defined fields. A cursory inspection of the code shows that this class has precisely zero internally defined global fields (Spaghetti). All variables used by the methods in this class are local variables only, and therefore this class ought to be thought of as 'state-less'.
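
The static, state-less style described above can be sketched roughly as follows. This is a minimal illustration only: the class `PinYinUtil` and its method `splitWords` are hypothetical names that do not exist in the Java-HTML library, and the private constructor is a common Java idiom for preventing instantiation, not necessarily how `Translate` itself is written.

```java
import java.util.Vector;

// Hypothetical example class; not part of the Java-HTML library.
final class PinYinUtil
{
    // Private constructor: a common idiom that prevents instantiation.
    private PinYinUtil() { }

    // No fields are declared, so the result depends only on the argument.
    // Splits a whitespace-separated pronunciation string into single words.
    static Vector<String> splitWords(String spaced)
    {
        Vector<String> ret = new Vector<>();

        for (String s : spaced.trim().split("\\s+"))
            if (!s.isEmpty()) ret.add(s);

        return ret;
    }
}
```

A caller would simply write `PinYinUtil.splitWords("ni hao")`; there is no object to construct, configure, or inject.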



    • Method Summary

      Modifier and Type    Method
      static void          article(Vector<String> simpSentencesIN, Vector<Vector<String>> sentencesOUT, Vector<Vector<Vector<String>>> wordTablesOUT, Vector<String> DOUTArr, Vector<java.lang.Boolean> DOUTErrorBoolArr)
      static boolean       block(String simpSentenceIN, Vector<String> sentencesOUT, Vector<Vector<String>> wordTableOUT, Appendable DOUT)
      static String        getEnglish(String chineseWord)
      static String        getPinYin(String chineseWord)
      static String[]      sentenceZH(String chinese)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • article

        public static void article​
                    (java.util.Vector<java.lang.String> simpSentencesIN,
                     java.util.Vector<java.util.Vector<java.lang.String>> sentencesOUT,
                     java.util.Vector<java.util.Vector<java.util.Vector<java.lang.String>>> wordTablesOUT,
                     java.util.Vector<java.lang.String> DOUTArr,
                     java.util.Vector<java.lang.Boolean> DOUTErrorBoolArr)
                throws java.io.IOException
        
        Throws:
        java.io.IOException
        Code:
        Exact Method Body:
         for (String simpSentence : simpSentencesIN)
         {
             Vector<String>          sentences   = new Vector<String>();
             Vector<Vector<String>>  wordTable   = new Vector<Vector<String>>();
             StringBuilder           DOUT        = new StringBuilder();
             boolean                 error       = block(simpSentence, sentences, wordTable, DOUT);
        
             sentencesOUT.add(sentences);
             wordTablesOUT.add(wordTable);
             DOUTArr.add(DOUT.toString());
             DOUTErrorBoolArr.add(Boolean.valueOf(error));
         }
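
The loop above fills four parallel output Vectors, so index i in each of them refers to the i-th input sentence. The sketch below reproduces that calling pattern without any network access. It is an illustration under stated assumptions: `ArticleSketch` and `stubBlock` are hypothetical names, and the stub only writes placeholder strings where the real `block` would write scraped data.

```java
import java.io.IOException;
import java.util.Vector;

// Hypothetical stand-in for illustration; not part of the Java-HTML library.
final class ArticleSketch
{
    private ArticleSketch() { }

    // Stub replacing Translate.block: writes placeholder data, no network calls.
    static boolean stubBlock(String simpSentenceIN, Vector<String> sentencesOUT,
        Vector<Vector<String>> wordTableOUT, Appendable DOUT) throws IOException
    {
        sentencesOUT.add(simpSentenceIN);
        sentencesOUT.add("<pinyin>");
        sentencesOUT.add("<english>");
        DOUT.append("parsed: ").append(simpSentenceIN);
        return false; // placeholder error flag
    }

    // Same loop shape as the method body above, but calling the stub.
    static void article(Vector<String> simpSentencesIN,
        Vector<Vector<String>> sentencesOUT,
        Vector<Vector<Vector<String>>> wordTablesOUT,
        Vector<String> DOUTArr, Vector<Boolean> DOUTErrorBoolArr) throws IOException
    {
        for (String simpSentence : simpSentencesIN)
        {
            Vector<String>          sentences = new Vector<>();
            Vector<Vector<String>>  wordTable = new Vector<>();
            StringBuilder           DOUT      = new StringBuilder();
            boolean                 error     = stubBlock(simpSentence, sentences, wordTable, DOUT);

            // One entry per input sentence is appended to every output Vector.
            sentencesOUT.add(sentences);
            wordTablesOUT.add(wordTable);
            DOUTArr.add(DOUT.toString());
            DOUTErrorBoolArr.add(error);
        }
    }
}
```

Because the method appends to the Vectors it is given rather than clearing them, a caller would normally pass in freshly created, empty Vectors.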
        
      • block

        public static boolean block​
                    (java.lang.String simpSentenceIN,
                     java.util.Vector<java.lang.String> sentencesOUT,
                     java.util.Vector<java.util.Vector<java.lang.String>> wordTableOUT,
                     java.lang.Appendable DOUT)
                throws java.io.IOException
        
        Throws:
        java.io.IOException - The interface java.lang.Appendable mandates that IOException be treated as a checked exception for all output operations. Therefore IOException is a required exception in this method's throws clause.
        Code:
        Exact Method Body:
         String[]    gtScrape        = sentenceZH(simpSentenceIN);
         String      pronSentence    = gtScrape[0];
         String      englSentence    = gtScrape[1];
        
         sentencesOUT.removeAllElements();
         sentencesOUT.add(simpSentenceIN);
         sentencesOUT.add(pronSentence);
         sentencesOUT.add(englSentence);
        
         Vector<String>  simpWords   = new Vector<String>();
         Vector<String>  pronWords   = new Vector<String>();
         boolean         errorParse  = PinYinParse.parse
                             (DOUT, simpSentenceIN, pronSentence, simpWords, pronWords);
        
         if (pronWords.size() != simpWords.size()) throw new IllegalStateException(
             "The pronunciation and the character vector's should be the exact same length.\n" +
             "pronWords.size() = " + pronWords.size() + " and simpWords.size() = " + 
             simpWords.size()
         );
        
         int len = pronWords.size();
         for (int i=0; i < len; i++)
         {
             Vector<String> vocabEntryRow = new Vector<String>();
        
             vocabEntryRow.add(simpWords.elementAt(i));
             vocabEntryRow.add(pronWords.elementAt(i));
             vocabEntryRow.add(""); //Dictionary.lookupTrad(simp, pron));
             vocabEntryRow.add(""); //Dictionary.lookupEngl(simp, pron));
        
             wordTableOUT.add(vocabEntryRow);
         }
         return errorParse;
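
The word table built above has one row per word and four columns: the simplified characters, the pronunciation, and two columns that the method body currently leaves as empty strings (the dictionary lookups are commented out). Note also that the body clears `sentencesOUT` but only appends to `wordTableOUT`. The sketch below isolates the row-building step; `WordTableSketch`, `build`, and `format` are hypothetical names used for illustration, and the parsing done by `PinYinParse.parse` is not reproduced.

```java
import java.util.Vector;

// Hypothetical helper for illustration; not part of the Java-HTML library.
final class WordTableSketch
{
    private WordTableSketch() { }

    // Pairs up two equally long word lists into four-column rows.
    static Vector<Vector<String>> build(Vector<String> simpWords, Vector<String> pronWords)
    {
        if (pronWords.size() != simpWords.size()) throw new IllegalStateException(
            "pronWords.size() = " + pronWords.size() +
            " and simpWords.size() = " + simpWords.size()
        );

        Vector<Vector<String>> table = new Vector<>();

        for (int i=0; i < simpWords.size(); i++)
        {
            Vector<String> row = new Vector<>();

            row.add(simpWords.elementAt(i)); // column 0: simplified characters
            row.add(pronWords.elementAt(i)); // column 1: pronunciation
            row.add("");                     // column 2: left empty, as in the body above
            row.add("");                     // column 3: left empty, as in the body above

            table.add(row);
        }

        return table;
    }

    // Renders one row as tab-separated text, e.g. for printing.
    static String format(Vector<String> row)
    { return String.join("\t", row); }
}
```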
        
      • sentenceZH

        public static java.lang.String[] sentenceZH​(java.lang.String chinese)
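        This receives as input a sentence in Simplified Mandarin Chinese. The input is split on the Chinese full stop (。), and each resulting piece is sent to Google Translate as a separate query, with up to six attempts per piece. The scraped results are concatenated in order. Input that contains a newline is rejected with an IllegalArgumentException.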
        Parameters:
        chinese - Any sentence, paragraph, phrase or word in Simplified-Mandarin
        Returns:
        Two separate Strings, returned in a String array that is two elements long.
        1. ret[0] - The pronunciation (罗马拼音, romanized pinyin) String scraped from a call to Google Translate
        2. ret[1] - The English - also scraped from a call to http://translate.google.com
        Code:
        Exact Method Body:
         if (chinese.indexOf('\n') != -1) throw new IllegalArgumentException("CHINESE:\t" + chinese + "\nContains a newline!");
        
         String[]        cArr            = chinese.trim().split("。");
         StringBuilder   completePron    = new StringBuilder();
         StringBuilder   completeEngl    = new StringBuilder();
        
         for (int i=0; i < cArr.length; i++)
         {
             // Prepare the queries and scrape http://translate.google.com/query web-page.
             Vector<HTMLNode>    page        = null;
             int                 retryCount  = 0;
        
             while ((page == null) && (retryCount < 6))
                 try {
                     String          chineseQ    = URLs.toProperURLV2(cArr[i] + "。");
                     BufferedReader  br          = Scrape.openConn_iso_8859_1("https://translate.google.com/?q=" + chineseQ);
                     page                        = HTMLPage.getPageTokens(br, false);
                 } catch (Exception e) {
                     retryCount++;
                     System.out.println("RETRY-SCRAPE Google Translate:\n" + "Attempt #" + retryCount + "\n" + e.getMessage());
                 }
        
             // Get Chinese PinYin as Sentence
             StringBuilder       pron    = new StringBuilder();
             Vector<HTMLNode>    partial = InnerTagGetInclusive.first(page, "div", "id", TextComparitor.EQ_CASE_INSENSITIVE, "src-translit");
        
             Util.removeAllTagNodes(partial);
             for (HTMLNode n : partial) pron.append(((TextNode) n).str);
             completePron.append(Escape.replace(pron.toString()).trim() + "  ");
        
        
             // Get English from Translate Website as a Sentence
             StringBuilder   engl    = new StringBuilder();
             partial                 = InnerTagGetInclusive.first(page, "span", "id", TextComparitor.EQ_CASE_INSENSITIVE, "result_box");
             Util.removeAllTagNodes(partial);
             for (HTMLNode n : partial) engl.append(((TextNode) n).str);
             completeEngl.append(Escape.replace(engl.toString().replaceAll("\\\\u200b", "")).trim() + "  ");
         }
         String [] retArr = { completePron.toString(), completeEngl.toString() };
         return retArr;
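
The splitting step at the top of the method body can be examined on its own, without any network access. The sketch below is a minimal illustration; `SplitSketch` and `queryPieces` are hypothetical names, and only the newline check and the split / re-append of the full stop are reproduced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Java-HTML library.
final class SplitSketch
{
    private SplitSketch() { }

    static final String FULL_STOP = "\u3002"; // the ideographic full stop

    // Returns the pieces that would each be sent as one query.
    static List<String> queryPieces(String chinese)
    {
        if (chinese.indexOf('\n') != -1)
            throw new IllegalArgumentException("Input contains a newline");

        List<String> ret = new ArrayList<>();

        // String.split drops trailing empty strings, so a final full stop
        // does not produce an extra, empty piece.
        for (String piece : chinese.trim().split(FULL_STOP))
            ret.add(piece + FULL_STOP);

        return ret;
    }
}
```

One consequence of re-appending the full stop to every piece is that input without a terminating full stop is still queried with one added.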
        
      • getPinYin

        public static java.lang.String getPinYin​(java.lang.String chineseWord)
                                          throws java.io.IOException
        Retrieves the PinYin pronunciation from Google Translate Servers for a single Chinese Word.
        Parameters:
        chineseWord - Any single word in simplified Mandarin Chinese
        Returns:
        The Pinyin Pronunciation of that word, as scraped from Google Translate Servers.
        Throws:
        java.io.IOException
        Code:
        Exact Method Body:
         BufferedReader      br      = Scrape.openConn_iso_8859_1("https://translate.google.com/?q=" + chineseWord + "&source=zh-CN");
         Vector<HTMLNode>    page    = HTMLPage.getPageTokens(br, false);
         Vector<HTMLNode>    partial = InnerTagGetInclusive.first(page, "div", "id", TextComparitor.EQ_CASE_INSENSITIVE, "src-translit");
         String              pron    = "";
        
         Util.removeAllTagNodes(partial);
         for (HTMLNode n : partial) pron += ((TextNode) n).str;
         return Escape.replace(pron);
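
The method body above concatenates `chineseWord` directly into the query string, whereas `sentenceZH` first passes its text through `URLs.toProperURLV2`. As a hedged sketch of what percent-encoding the query would look like, the example below uses the JDK's own `java.net.URLEncoder` (the `Charset` overload requires Java 10 or later) as a stand-in. `QuerySketch` and `translateURL` are hypothetical names, and this is not a claim about how the library or Google Translate handles unencoded input.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the Java-HTML library.
final class QuerySketch
{
    private QuerySketch() { }

    // Builds the same URL shape as the method body above, with the word
    // percent-encoded as UTF-8 before it is placed in the query string.
    static String translateURL(String chineseWord)
    {
        String q = URLEncoder.encode(chineseWord, StandardCharsets.UTF_8);
        return "https://translate.google.com/?q=" + q + "&source=zh-CN";
    }
}
```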
        
      • getEnglish

        public static java.lang.String getEnglish​(java.lang.String chineseWord)
                                           throws java.io.IOException
        Retrieves the Google Translate (English) text-box definition for a particular Mandarin Chinese Word.
        NOTE: This is not the information listed under the primary/main translation text-box; it is the word shown in the translation text-box itself.
        Parameters:
        chineseWord - Any single word in simplified Mandarin Chinese
        Returns:
        The Google Translate Server's best attempt at a Translation.
        Throws:
        java.io.IOException
        Code:
        Exact Method Body:
         BufferedReader      br      = Scrape.openConn_iso_8859_1("https://translate.google.com/?q=" + chineseWord);
         Vector<HTMLNode>    page    = HTMLPage.getPageTokens(br, false, null, "matches.txt", null);
         Vector<HTMLNode>    partial = InnerTagGetInclusive.first(page, "span", "id", TextComparitor.EQ_CASE_INSENSITIVE, "result_box");
         String              engl    = "";
        
         Util.removeAllTagNodes(partial);
         for (HTMLNode n : partial) engl += ((TextNode) n).str;
         return Escape.replace(engl);