Class CharacterUtils


  • public class CharacterUtils
    extends Object
    Collection of utilities for character handling. Contains utilities for semi-automatically creating lexer rules.
    • Constructor Detail

      • CharacterUtils

        public CharacterUtils()
        Constructor for CharacterUtils.
    • Method Detail

      • toUnicodeChar

        public static String toUnicodeChar(char c)
        Create a hex representation of the UTF-16 encoding of a Java char. This is the representation that's understood by Java when reading source code.
        Parameters:
        c - The char to be encoded.
        Returns:
        String Hex representation of character. For example, the result of encoding 'A' would be "\u0041".
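As an illustration of the escape form described for toUnicodeChar, here is a minimal sketch. The class and method names are hypothetical stand-ins, not the library's code, and the choice of uppercase hex digits is an assumption; the actual implementation may format the digits differently.

```java
final class UnicodeEscapeSketch {

    // Hypothetical stand-in for CharacterUtils.toUnicodeChar (not the real implementation).
    static String toUnicodeEscape(char c) {
        // A backslash, the letter u, then the UTF-16 code unit as four
        // zero-padded hex digits. Uppercase digits are an assumption.
        return String.format("\\u%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(toUnicodeEscape('A'));
        System.out.println(toUnicodeEscape('~'));
    }
}
```

Under these assumptions, running main prints \u0041 followed by \u007E.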
      • toHexString

        public static String toHexString(char c)
        Create a hex representation of the UTF-16 encoding of a Java char. This is the representation that's understood by the JavaCC lexer.
        Parameters:
        c - The char to be encoded.
        Returns:
        String Hex representation of character. For example, the result of encoding 'A' would be "0x0041".
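The 0x-prefixed form described for toHexString could be produced along the following lines. This is a sketch only: the class and method names are hypothetical, and the uppercase, four-digit padding is inferred from the "0x0041" example rather than confirmed against the implementation.

```java
final class HexStringSketch {

    // Hypothetical stand-in for CharacterUtils.toHexString (not the real implementation).
    static String toHex(char c) {
        // "0x" followed by the UTF-16 code unit as four zero-padded hex digits.
        return String.format("0x%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(toHex('A'));
        System.out.println(toHex('0'));
    }
}
```

Under these assumptions, running main prints 0x0041 followed by 0x0030.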
      • getLetterRange

        public static ArrayList<org.apache.uima.internal.util.CharacterUtils.CharRange> getLetterRange()
        Generate an ArrayList of CharRanges for what Java considers to be a letter. This is useful as input to Unicode-agnostic lexers like ANTLR.
        Returns:
        ArrayList A list of character ranges.
      • getDigitRange

        public static ArrayList<org.apache.uima.internal.util.CharacterUtils.CharRange> getDigitRange()
        Generate an ArrayList of CharRanges for what Java considers to be a digit. This is useful as input to Unicode-agnostic lexers like ANTLR.
        Returns:
        ArrayList A list of character ranges.
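One plausible way to build such range lists is to scan every char value and collapse consecutive matches into inclusive ranges. The sketch below assumes that "what Java considers" a letter or digit means Character.isLetter and Character.isDigit. The Range class is a hypothetical substitute for CharRange, whose fields are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

final class CharRangeSketch {

    // Hypothetical range type with inclusive bounds (the real CharRange may differ).
    static final class Range {
        final char start;
        final char end;

        Range(char start, char end) {
            this.start = start;
            this.end = end;
        }
    }

    // Collapse all char values accepted by the predicate into contiguous ranges.
    static List<Range> ranges(IntPredicate accepts) {
        List<Range> result = new ArrayList<>();
        int runStart = -1; // -1 means no run is currently open
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            boolean hit = accepts.test(c);
            if (hit && runStart < 0) {
                runStart = c; // open a new run
            } else if (!hit && runStart >= 0) {
                result.add(new Range((char) runStart, (char) (c - 1))); // close the run
                runStart = -1;
            }
        }
        if (runStart >= 0) {
            // A run that reaches the last char value is closed here.
            result.add(new Range((char) runStart, Character.MAX_VALUE));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Range> letters = ranges(Character::isLetter);
        List<Range> digits = ranges(Character::isDigit);
        System.out.println(letters.size() + " letter ranges, " + digits.size() + " digit ranges");
    }
}
```

With these predicates the first letter range is 'A' to 'Z' and the first digit range is '0' to '9'. The total number of ranges depends on the Unicode version shipped with the running JDK.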
      • printAntlrLexRule

        public static void printAntlrLexRule(String name,
                                             ArrayList<org.apache.uima.internal.util.CharacterUtils.CharRange> charRanges)
      • printJavaCCLexRule

        public static void printJavaCCLexRule(String name,
                                              ArrayList<org.apache.uima.internal.util.CharacterUtils.CharRange> charRanges)
      • main

        public static void main(String[] args)