Class CategoryPath

  • All Implemented Interfaces:
    Serializable, Cloneable, Comparable<CategoryPath>

    public class CategoryPath
    extends Object
    implements Serializable, Cloneable, Comparable<CategoryPath>
    A CategoryPath holds a sequence of string components, specifying the hierarchical name of a category.

    CategoryPath is designed to reduce the number of object allocations in two ways. First, it keeps the components internally in two arrays, rather than as individual strings. Second, it allows reuse both of the CategoryPath object itself (which can be clear()ed and have new components add()ed again) and of add()'s parameter (which can be any reusable CharSequence, not just a String).

    See Also:
    Serialized Form
    WARNING: This API is experimental and might change in incompatible ways in the next release.
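
    The two-array layout described above can be pictured with a minimal sketch. This is not the Lucene implementation: the class name FlatPathSketch, the growth policy, and the method bodies are assumptions made for illustration. Only the field names chars, ends and ncomponents come from the field list below.

```java
import java.util.Arrays;

// Hypothetical illustration of a two-array path layout; not the real CategoryPath.
class FlatPathSketch {
    private char[] chars = new char[0];   // characters of all components, concatenated
    private short[] ends = new short[0];  // ends[i] = index in chars just past component i
    private short ncomponents = 0;        // number of components currently in use

    void add(CharSequence component) {
        int start = (ncomponents == 0) ? 0 : ends[ncomponents - 1];
        int end = start + component.length();
        // Grow the buffers only when needed (doubling is an assumed policy).
        if (end > chars.length) {
            chars = Arrays.copyOf(chars, Math.max(end, chars.length * 2));
        }
        if (ncomponents == ends.length) {
            ends = Arrays.copyOf(ends, Math.max(ncomponents + 1, ends.length * 2));
        }
        // Copy the content; no reference to 'component' is kept.
        for (int i = 0; i < component.length(); i++) {
            chars[start + i] = component.charAt(i);
        }
        ends[ncomponents++] = (short) end; // sketch ignores overflow of short
    }

    // Forget the components but keep the buffers, so the object can be reused.
    void clear() {
        ncomponents = 0;
    }

    int length() {
        return ncomponents;
    }

    int capacityChars() {
        return chars.length;
    }

    String getComponent(int i) {
        if (i < 0 || i >= ncomponents) {
            return null;
        }
        int start = (i == 0) ? 0 : ends[i - 1];
        return new String(chars, start, ends[i] - start);
    }
}
```

    In this sketch, clear() followed by add() reuses the existing buffers, which is the allocation saving the class description refers to.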
    • Field Detail

      • chars

        protected char[] chars
      • ends

        protected short[] ends
      • ncomponents

        protected short ncomponents
    • Constructor Detail

      • CategoryPath

        public CategoryPath​(int capacityChars,
                            int capacityComponents)
        Construct a new empty CategoryPath object. CategoryPath objects are meant to be reused, by add()ing components, and later clear()ing, and add()ing components again. The CategoryPath object is created with a buffer pre-allocated for a given number of characters and components, but the buffer will grow as necessary (see capacityChars() and capacityComponents()).
      • CategoryPath

        public CategoryPath()
        Create an empty CategoryPath object. Equivalent to the constructor CategoryPath(int, int) with the two initial-capacity arguments set to zero.
      • CategoryPath

        public CategoryPath​(String pathString,
                            char delimiter)
        Construct a new CategoryPath object, given a single string with components separated by a given delimiter character.

        The initial capacity of the constructed object will be exactly what is needed to hold the given path. This fact is convenient when creating a temporary object that will not be reused later.

      • CategoryPath

        public CategoryPath​(CharSequence... components)
        Construct a new CategoryPath object, copying an existing path given as an array of character sequences.

        The new object occupies exactly the space it needs, without any spare capacity. This is the expected behavior in the typical use case, where this constructor is used to create a temporary object which is never reused.

      • CategoryPath

        public CategoryPath​(CategoryPath existing)
        Construct a new CategoryPath object, copying the path given in an existing CategoryPath object.

        This copy-constructor is handy when you need to save a reference to a CategoryPath (e.g., when it serves as a key to a hash-table), but cannot save a reference to the original object because its contents can be changed later by the user. Copying the contents into a new object is a solution.

        This constructor does not copy the capacity (spare buffer size) of the existing CategoryPath. Rather, the new object occupies exactly the space it needs, without any spare. This is the expected behavior in the typical use case outlined in the previous paragraph.

      • CategoryPath

        public CategoryPath​(CategoryPath existing,
                            int prefixLen)
        Construct a new CategoryPath object, copying a prefix with the given number of components of the path given in an existing CategoryPath object.

        If the given length is negative or bigger than the given path's actual length, the full path is taken.

        This constructor is often convenient for creating a temporary object with a path's prefix, but this practice is wasteful, and therefore inadvisable. Rather, the application should be written in a way that allows considering only a prefix of a given path, without needing to make a copy of that path.

    • Method Detail

      • length

        public short length()
        Return the number of components in the path. Note that this is the number of components, not the number of characters.
      • trim

        public void trim​(int nTrim)
        Trim the last components from the path.
        Parameters:
        nTrim - Number of components to trim. If larger than the number of components this path has, the entire path will be cleared.
      • capacityChars

        public int capacityChars()
        Returns the current character capacity of the CategoryPath. The character capacity is the size of the internal buffer used to hold the characters of all the path's components. When a component is added and the capacity is not big enough, the buffer is automatically grown, and capacityChars() increases.
      • capacityComponents

        public int capacityComponents()
        Returns the current component capacity of the CategoryPath. The component capacity is the maximum number of components that the internal buffer can currently hold. When a component is added beyond this capacity, the buffer is automatically grown, and capacityComponents() increases.
      • add

        public void add​(CharSequence component)
        Add the given component to the end of the path.

        Note that when a String object is passed to this method, its content is copied and no reference to it is saved, so the String object typically becomes garbage afterwards. To reduce the number of garbage objects, you can pass a reusable mutable CharSequence, such as a CharBuffer, instead of an immutable String.

      • clear

        public void clear()
        Empty the CategoryPath object, so that it has zero components. The capacity of the object (see capacityChars() and capacityComponents()) is not reduced, so that the object can be reused without frequent reallocations.
      • appendTo

        public void appendTo​(Appendable out,
                             char delimiter)
                      throws IOException
        Build a string representation of the path, with its components separated by the given delimiter character. The resulting string is appended to a given Appendable, e.g., a StringBuilder, CharBuffer or Writer.

        Note that the two cases of zero components and one component with zero length produce indistinguishable results (both of them append nothing). This is normally not a problem, because components should not have zero length.

        An IOException can be thrown if the given Appendable's append() throws this exception.

        Throws:
        IOException
      • appendTo

        public void appendTo​(Appendable out,
                             char delimiter,
                             int prefixLen)
                      throws IOException
        Like appendTo(Appendable, char), but takes only a prefix of the path, rather than the whole path.

        If the given prefix length is negative or bigger than the path's actual length, the whole path is taken.

        Throws:
        IOException
      • appendTo

        public void appendTo​(Appendable out,
                             char delimiter,
                             int start,
                             int end)
                      throws IOException
        Like appendTo(Appendable, char), but takes only a part of the path, rather than the whole path.

        start specifies the first component in the subpath, and end is one past the last component. If start is negative, 0 is assumed, and if end is negative or past the end of the path, the path is taken until the end. Otherwise, if end<=start, nothing is appended. Nothing is appended also in the case that the path is empty.

        Throws:
        IOException
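
        The start/end rules above can be sketched as follows. This is a stand-alone illustration over a List<String>, not the CategoryPath code: the class SubpathSketch and its render helper are hypothetical names, and the normalization order simply follows the prose above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical illustration of the subpath rules; not the real CategoryPath.
class SubpathSketch {
    static void appendTo(List<String> components, Appendable out, char delimiter,
                         int start, int end) throws IOException {
        if (start < 0) {
            start = 0;                       // negative start: assume 0
        }
        if (end < 0 || end > components.size()) {
            end = components.size();         // negative or too large: take until the end
        }
        // If end <= start (or the path is empty) the loop body never runs.
        for (int i = start; i < end; i++) {
            if (i > start) {
                out.append(delimiter);
            }
            out.append(components.get(i));
        }
    }

    // Convenience wrapper that collects the output into a String.
    static String render(List<String> components, char delimiter, int start, int end) {
        StringBuilder sb = new StringBuilder();
        try {
            appendTo(components, sb, delimiter, start, end);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // StringBuilder does not throw in practice
        }
        return sb.toString();
    }
}
```

        Under the same assumptions, the prefix variant behaves like a call with start set to 0 and end set to the prefix length.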
      • toString

        public String toString​(char delimiter)
        Build a string representation of the path, with its components separated by the given delimiter character. The resulting string is returned as a new String object. To avoid this temporary object creation, consider using appendTo(Appendable, char) instead.

        Note that the two cases of zero components and one component with zero length produce indistinguishable results (both of them return an empty string). This is normally not a problem, because components should not have zero length.

      • toString

        public String toString()
        This method, an override of Object.toString(), allows simple printing of a CategoryPath for debugging purposes. When possible, it is recommended to avoid using it; if you want to output the path with its components separated by a delimiter character, specify the delimiter explicitly with toString(char).
        Overrides:
        toString in class Object
      • toString

        public String toString​(char delimiter,
                               int prefixLen)
        Like toString(char), but takes only a prefix with a given number of components, rather than the whole path.

        If the given length is negative or bigger than the path's actual length, the whole path is taken.

      • toString

        public String toString​(char delimiter,
                               int start,
                               int end)
        Like toString(char), but takes only a part of the path, rather than the whole path.

        start specifies the first component in the subpath, and end is one past the last component. If start is negative, 0 is assumed, and if end is negative or past the end of the path, the path is taken until the end. Otherwise, if end<=start, an empty string is returned. An empty string is also returned if the path is empty.

      • getComponent

        public String getComponent​(int i)
        Return the i'th component of the path, in a new String object. If there is no i'th component, a null is returned.
      • lastComponent

        public String lastComponent()
        Return the last component of the path, in a new String object. If the path is empty, a null is returned.
      • copyToCharArray

        public int copyToCharArray​(char[] outputBuffer,
                                   int outputBufferStart,
                                   int numberOfComponentsToCopy,
                                   char separatorChar)
        Copies the specified number of components from this category path to the specified character array, with the components separated by a given delimiter character. The array must be large enough to hold the components and separators - the amount of needed space can be calculated with charsNeededForFullPath().

        This method returns the number of characters written to the array.

        Parameters:
        outputBuffer - The destination character array.
        outputBufferStart - The first location to write in the output array.
        numberOfComponentsToCopy - The number of path components to write to the destination buffer.
        separatorChar - The separator inserted between every pair of path components in the output buffer.
        See Also:
        charsNeededForFullPath()
      • charsNeededForFullPath

        public int charsNeededForFullPath()
        Returns the number of characters required to represent this entire category path, if written using copyToCharArray(char[], int, int, char) or appendTo(Appendable, char). This includes the number of characters in all the components, plus the number of separators between them (each one character in the aforementioned methods).
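
        The relationship between charsNeededForFullPath() and copyToCharArray() can be sketched over a plain String array. This is an illustration only: CopyToArraySketch is a hypothetical class, and clamping the component count to the path length is an assumption of the sketch, not documented behavior.

```java
// Hypothetical illustration of sizing and filling a char buffer; not the real CategoryPath.
class CopyToArraySketch {
    // Characters of all components plus one separator between each adjacent pair.
    static int charsNeededForFullPath(String[] components) {
        if (components.length == 0) {
            return 0;
        }
        int total = components.length - 1; // number of separators
        for (String c : components) {
            total += c.length();
        }
        return total;
    }

    // Writes the first components into outputBuffer and returns the number of chars written.
    static int copyToCharArray(String[] components, char[] outputBuffer, int outputBufferStart,
                               int numberOfComponentsToCopy, char separatorChar) {
        // Assumption of this sketch: never copy more components than exist.
        int n = Math.min(numberOfComponentsToCopy, components.length);
        int pos = outputBufferStart;
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                outputBuffer[pos++] = separatorChar;
            }
            String c = components[i];
            c.getChars(0, c.length(), outputBuffer, pos);
            pos += c.length();
        }
        return pos - outputBufferStart;
    }
}
```

        A caller would typically allocate outputBufferStart + charsNeededForFullPath() characters before copying the full path.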
      • add

        public void add​(CharSequence pathString,
                        char delimiter)
        Add the given components to the end of the path. The components are given in a single string, separated by a given delimiter character. If the given string is empty, it is assumed to refer to the root (empty) category, and nothing is added to the path (rather than adding a single empty component).

        Note that when a String object is passed to this method, its content is copied and no reference to it is saved, so the String object typically becomes garbage afterwards. To reduce the number of garbage objects, you can pass a reusable mutable CharSequence, such as a CharBuffer, instead of an immutable String.
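
        A minimal sketch of the splitting rule described above, written against a List<String> rather than the real class. DelimitedAddSketch is a hypothetical name, and how consecutive delimiters are treated is not covered by the description, so the sketch makes no claim about that case.

```java
import java.util.List;

// Hypothetical illustration of adding delimiter-separated components; not the real CategoryPath.
class DelimitedAddSketch {
    static void add(List<String> path, CharSequence pathString, char delimiter) {
        if (pathString.length() == 0) {
            return; // empty string means the root category: nothing is added
        }
        int start = 0;
        for (int i = 0; i <= pathString.length(); i++) {
            // A component ends at each delimiter and at the end of the input.
            if (i == pathString.length() || pathString.charAt(i) == delimiter) {
                path.add(pathString.subSequence(start, i).toString());
                start = i + 1;
            }
        }
    }
}
```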

      • equals

        public boolean equals​(Object obj)
        Compare this CategoryPath to another one. For two category paths to be considered equal, only the paths they contain need to be identical. The unused capacity of the objects is not considered in the comparison.
        Overrides:
        equals in class Object
      • isDescendantOf

        public boolean isDescendantOf​(CategoryPath other)
        Test whether this object is a descendant of another CategoryPath. This is true if the other CategoryPath is a prefix of this one.
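
        The prefix test can be sketched component by component. DescendantSketch is a hypothetical class over List<String>. In this sketch a path counts as a descendant of itself and of the empty path; whether the real method agrees on those edge cases is not stated above.

```java
import java.util.List;

// Hypothetical illustration of a component-wise prefix test; not the real CategoryPath.
class DescendantSketch {
    static boolean isDescendantOf(List<String> path, List<String> other) {
        if (other.size() > path.size()) {
            return false; // a longer path cannot be a prefix of a shorter one
        }
        // Compare whole components, so ["ab", "c"] is not under ["a"].
        return path.subList(0, other.size()).equals(other);
    }
}
```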
      • hashCode

        public int hashCode()
        Calculate a hashCode for this path, used when a CategoryPath serves as a hash-table key. If two objects are equal according to equals(Object), their hash codes need to be equal, so, as in equals(Object), hashCode() does not consider unused portions of the internal buffers in its calculation.

        The hash function used is modeled after Java's String.hashCode() - a simple multiplicative hash function with the multiplier 31. The same hash function also appeared in Kernighan & Ritchie's second edition of "The C Programming Language" (1988).

        Overrides:
        hashCode in class Object
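
        A multiplicative hash with multiplier 31 can be sketched as below. hash31 reproduces the recurrence used by String.hashCode(). hashPath is a hypothetical way of extending it to a prefix of a path (mixing in the component count and lengths); how the real class combines components is an assumption here, not something the description specifies.

```java
import java.util.List;

// Hypothetical illustration of a multiplier-31 hash; not the real CategoryPath.
class MultiplicativeHashSketch {
    // Same recurrence as String.hashCode(): h = 31 * h + next char.
    static int hash31(CharSequence s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Hash of the first prefixLen components (clamped to the path length).
    static int hashPath(List<String> components, int prefixLen) {
        int n = Math.min(prefixLen, components.size());
        int h = n;
        for (int i = 0; i < n; i++) {
            String c = components.get(i);
            h = 31 * h + c.length();      // keeps ["ab"] and ["a", "b"] apart
            for (int j = 0; j < c.length(); j++) {
                h = 31 * h + c.charAt(j);
            }
        }
        return h;
    }
}
```

        Only the components in use feed the hash, so two equal paths hash alike regardless of any spare capacity.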
      • hashCode

        public int hashCode​(int prefixLen)
        Like hashCode(), but computes the hash of a prefix with the given number of components, rather than of the entire path.
      • longHashCode

        public long longHashCode()
        Calculate a 64-bit hash code for this path. Unlike hashCode(), this method is not part of the Java standard, and is only used if explicitly called by the user.

        If two objects are equal according to equals(Object), their hash codes need to be equal, so, as in equals(Object), longHashCode() does not consider unused portions of the internal buffers in its calculation.

        The hash function used is a simple multiplicative hash function, with the multiplier 65599. While Java's standard multiplier 31 (used in hashCode()) gives a good distribution for ASCII strings, for non-ASCII strings (with 16-bit characters) it gives too many collisions, and a bigger multiplier produces fewer collisions in this case.
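
        The effect of the multiplier can be seen in a small sketch. LongHashSketch is a hypothetical helper that applies the same multiplicative recurrence with a caller-supplied multiplier. It does not reproduce the claim about 16-bit characters; it only shows one well-known pair of strings that collides under 31 and separates under 65599.

```java
// Hypothetical illustration of a 64-bit multiplicative hash; not the real CategoryPath.
class LongHashSketch {
    static long hash(CharSequence s, long multiplier) {
        long h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * multiplier + s.charAt(i);
        }
        return h;
    }
}
```

        For example, "Aa" and "BB" both hash to 2112 with multiplier 31, but to 4264032 and 4329600 respectively with multiplier 65599.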

      • longHashCode

        public long longHashCode​(int prefixLen)
        Like longHashCode(), but computes the hash of a prefix with the given number of components, rather than of the entire path.
      • serializeAppendTo

        public void serializeAppendTo​(Appendable out)
                               throws IOException
        Write out a serialized (as a character sequence) representation of the path to a given Appendable (e.g., a StringBuilder, CharBuffer, Writer, or something similar).

        This method may throw an IOException if the given Appendable throws this exception while appending.

        Throws:
        IOException
      • setFromSerialized

        public int setFromSerialized​(CharSequence buffer,
                                     int offset)
        Set a CategoryPath from a character-sequence representation written by serializeAppendTo(Appendable).

        Reading starts at the given offset into the given character sequence, and the offset right after the end of this path is returned.
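
        Returning the offset just past the path makes it possible to read several paths back to back from one buffer. The sketch below shows that pattern only. The serialized layout used here (a count character, then a length character and the characters of each component) is invented for illustration, and SerializedOffsetSketch is a hypothetical class; the actual layout written by serializeAppendTo(Appendable) is not described in this document.

```java
import java.util.List;

// Hypothetical illustration of offset chaining with an invented layout; not the real CategoryPath.
class SerializedOffsetSketch {
    // Invented layout: one char holding the component count, then for each
    // component one char holding its length followed by its characters.
    static void serializeAppendTo(List<String> path, StringBuilder out) {
        out.append((char) path.size());
        for (String c : path) {
            out.append((char) c.length()).append(c);
        }
    }

    // Fills 'path' from 'buffer' starting at 'offset' and returns the offset
    // immediately after the path that was read.
    static int setFromSerialized(List<String> path, CharSequence buffer, int offset) {
        path.clear();
        int n = buffer.charAt(offset++);
        for (int i = 0; i < n; i++) {
            int len = buffer.charAt(offset++);
            path.add(buffer.subSequence(offset, offset + len).toString());
            offset += len;
        }
        return offset;
    }
}
```

        A reader can feed each returned offset into the next call until the offset reaches the end of the buffer.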

      • equalsToSerialized

        public boolean equalsToSerialized​(CharSequence buffer,
                                          int offset)
        Check whether the current path is identical to the one serialized (with serializeAppendTo(Appendable)) in the given buffer, at the given offset.
      • hashCodeOfSerialized

        public static int hashCodeOfSerialized​(CharSequence buffer,
                                               int offset)
        This method calculates a hash function of a path that has been written to (using serializeAppendTo(Appendable)) a character buffer. It is guaranteed that the value returned is identical to that which hashCode() would have produced for the original object before it was serialized.
      • serializeToStreamWriter

        public void serializeToStreamWriter​(OutputStreamWriter osw)
                                     throws IOException
        Serializes the content of this CategoryPath to a byte stream, using UTF-8 encoding to convert characters to bytes, and treating the ends as 16-bit characters.
        Parameters:
        osw - The output byte stream.
        Throws:
        IOException - If there are encoding errors.
      • deserializeFromStreamReader

        public void deserializeFromStreamReader​(InputStreamReader isr)
                                         throws IOException
        Deserializes the content of this CategoryPath from a byte stream, using UTF-8 encoding to convert bytes to characters, and treating the ends as 16-bit characters.
        Parameters:
        isr - The input stream.
        Throws:
        IOException - If there are encoding errors.
      • compareTo

        public int compareTo​(CategoryPath other)
        Compares this CategoryPath with the other CategoryPath for lexicographic order. Returns a negative integer, zero, or a positive integer as this CategoryPath lexicographically precedes, equals, or follows the other CategoryPath.
        Specified by:
        compareTo in interface Comparable<CategoryPath>
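
        One plausible reading of lexicographic order for paths is a component-by-component comparison, with a shorter path sorting before any path it is a prefix of. The sketch below implements that reading over List<String>. CompareSketch is a hypothetical class, and the exact ordering used by the real compareTo (for instance, how component boundaries interact with character order) is an assumption.

```java
import java.util.List;

// Hypothetical illustration of component-wise lexicographic order; not the real CategoryPath.
class CompareSketch {
    static int compare(List<String> a, List<String> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) {
                return c; // first differing component decides
            }
        }
        // All shared components are equal: the shorter path comes first.
        return Integer.compare(a.size(), b.size());
    }
}
```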