Class CategoryPath
- java.lang.Object
-
- org.apache.lucene.facet.taxonomy.CategoryPath
-
- All Implemented Interfaces:
Serializable, Cloneable, Comparable<CategoryPath>
public class CategoryPath extends Object implements Serializable, Cloneable, Comparable<CategoryPath>
A CategoryPath holds a sequence of string components, specifying the hierarchical name of a category.
CategoryPath is designed to reduce the number of object allocations in two ways. First, it keeps the components internally in two arrays, rather than keeping individual strings. Second, it allows reuse, both of the CategoryPath object itself (which can be clear()ed and have new components add()ed again) and of add()'s parameter (which can be a reusable object, not just a string).
- See Also:
- Serialized Form
- WARNING: This API is experimental and might change in incompatible ways in the next release.
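The two-array layout described above can be sketched as follows. This is a minimal illustration, not the Lucene implementation: the class name TwoArrayPathSketch is hypothetical, the field names follow the Field Summary (chars, ends, ncomponents), and the doubling growth policy is an assumption.

```java
import java.util.Arrays;

/** Hypothetical sketch of two-array path storage; not the Lucene class. */
final class TwoArrayPathSketch {
    private char[] chars = new char[0];   // characters of all components, back to back
    private short[] ends = new short[0];  // ends[i] = index in chars just past component i
    private short ncomponents = 0;

    /** Copies the component's characters; no reference to the argument is kept. */
    void add(CharSequence component) {
        int start = (ncomponents == 0) ? 0 : ends[ncomponents - 1];
        int end = start + component.length();
        if (end > Short.MAX_VALUE || ncomponents == Short.MAX_VALUE) {
            throw new IllegalArgumentException("path too long for short offsets");
        }
        if (end > chars.length) {
            // Assumed growth policy: at least double, and at least what is needed.
            chars = Arrays.copyOf(chars, Math.max(end, chars.length * 2));
        }
        if (ncomponents == ends.length) {
            ends = Arrays.copyOf(ends, Math.max(ncomponents + 1, ends.length * 2));
        }
        for (int i = 0; i < component.length(); i++) {
            chars[start + i] = component.charAt(i);
        }
        ends[ncomponents++] = (short) end;
    }

    /** Forgets the components but keeps both buffers, so the object can be reused. */
    void clear() {
        ncomponents = 0;
    }

    short length() {
        return ncomponents;
    }

    int capacityChars() {
        return chars.length;
    }

    /** Returns component i as a new String, or null if there is no such component. */
    String getComponent(int i) {
        if (i < 0 || i >= ncomponents) {
            return null;
        }
        int start = (i == 0) ? 0 : ends[i - 1];
        return new String(chars, start, ends[i] - start);
    }
}
```

Because add() accepts any CharSequence and copies its characters, a caller can pass a reused StringBuilder or CharBuffer, and clear() followed by add() reuses the existing buffers instead of allocating new ones.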
-
-
Field Summary
Fields
Modifier and Type, Field
protected char[] chars
protected short[] ends
protected short ncomponents
-
Constructor Summary
Constructors
Constructor, Description
CategoryPath()
Create an empty CategoryPath object.
CategoryPath(int capacityChars, int capacityComponents)
Construct a new empty CategoryPath object.
CategoryPath(CharSequence... components)
Construct a new CategoryPath object, copying an existing path given as an array of strings.
CategoryPath(String pathString, char delimiter)
Construct a new CategoryPath object, given a single string with components separated by a given delimiter character.
CategoryPath(CategoryPath existing)
Construct a new CategoryPath object, copying the path given in an existing CategoryPath object.
CategoryPath(CategoryPath existing, int prefixLen)
Construct a new CategoryPath object, copying a prefix with the given number of components of the path given in an existing CategoryPath object.
-
Method Summary
All Methods
Modifier and Type, Method, Description
void add(CharSequence component)
Add the given component to the end of the path.
void add(CharSequence pathString, char delimiter)
Add the given components to the end of the path.
void appendTo(Appendable out, char delimiter)
Build a string representation of the path, with its components separated by the given delimiter character.
void appendTo(Appendable out, char delimiter, int prefixLen)
Like appendTo(Appendable, char), but takes only a prefix of the path, rather than the whole path.
void appendTo(Appendable out, char delimiter, int start, int end)
Like appendTo(Appendable, char), but takes only a part of the path, rather than the whole path.
int capacityChars()
Returns the current character capacity of the CategoryPath.
int capacityComponents()
Returns the current component capacity of the CategoryPath.
int charsNeededForFullPath()
Returns the number of characters required to represent this entire category path, if written using copyToCharArray(char[], int, int, char) or appendTo(Appendable, char).
void clear()
Empty the CategoryPath object, so that it has zero components.
Object clone()
int compareTo(CategoryPath other)
Compares this CategoryPath with the other CategoryPath for lexicographic order.
int copyToCharArray(char[] outputBuffer, int outputBufferStart, int numberOfComponentsToCopy, char separatorChar)
Copies the specified number of components from this category path to the specified character array, with the components separated by a given delimiter character.
void deserializeFromStreamReader(InputStreamReader isr)
Deserializes the content of this CategoryPath from a byte stream, using UTF-8 encoding to convert bytes to characters, and treating the ends as 16-bit characters.
boolean equals(Object obj)
Compare the given CategoryPath to another one.
boolean equalsToSerialized(int prefixLen, CharSequence buffer, int offset)
Just like equalsToSerialized(CharSequence, int), but compare to a prefix of the CategoryPath, instead of the whole CategoryPath.
boolean equalsToSerialized(CharSequence buffer, int offset)
Check whether the current path is identical to the one serialized (with serializeAppendTo(Appendable)) in the given buffer, at the given offset.
String getComponent(int i)
Return the i'th component of the path, in a new String object.
int hashCode()
Calculate a hashCode for this path, used when a CategoryPath serves as a hash-table key.
int hashCode(int prefixLen)
Like hashCode(), but find the hash function of a prefix with the given number of components, rather than of the entire path.
static int hashCodeOfSerialized(CharSequence buffer, int offset)
This method calculates a hash function of a path that has been written to (using serializeAppendTo(Appendable)) a character buffer.
boolean isDescendantOf(CategoryPath other)
Test whether this object is a descendant of another CategoryPath.
String lastComponent()
Return the last component of the path, in a new String object.
short length()
Return the number of components in the facet path.
long longHashCode()
Calculate a 64-bit hash function for this path.
long longHashCode(int prefixLen)
Like longHashCode(), but find the hash function of a prefix with the given number of components, rather than of the entire path.
void serializeAppendTo(int prefixLen, Appendable out)
Just like serializeAppendTo(Appendable), but writes only a prefix of the CategoryPath.
void serializeAppendTo(Appendable out)
Write out a serialized (as a character sequence) representation of the path to a given Appendable (e.g., a StringBuilder, CharBuffer, Writer, or something similar).
void serializeToStreamWriter(OutputStreamWriter osw)
Serializes the content of this CategoryPath to a byte stream, using UTF-8 encoding to convert characters to bytes, and treating the ends as 16-bit characters.
int setFromSerialized(CharSequence buffer, int offset)
Set a CategoryPath from a character-sequence representation written by serializeAppendTo(Appendable).
String toString()
This method, an implementation of the Object.toString() interface, is to allow simple printing of a CategoryPath, for debugging purposes.
String toString(char delimiter)
Build a string representation of the path, with its components separated by the given delimiter character.
String toString(char delimiter, int prefixLen)
Like toString(char), but takes only a prefix with a given number of components, rather than the whole path.
String toString(char delimiter, int start, int end)
Like toString(char), but takes only a part of the path, rather than the whole path.
void trim(int nTrim)
Trim the last components from the path.
-
-
-
Constructor Detail
-
CategoryPath
public CategoryPath(int capacityChars, int capacityComponents)
Construct a new empty CategoryPath object. CategoryPath objects are meant to be reused, by add()ing components, and later clear()ing, and add()ing components again. The CategoryPath object is created with a buffer pre-allocated for a given number of characters and components, but the buffer will grow as necessary (see capacityChars() and capacityComponents()).
-
CategoryPath
public CategoryPath()
Create an empty CategoryPath object. Equivalent to the constructor CategoryPath(int, int) with the two initial-capacity arguments set to zero.
-
CategoryPath
public CategoryPath(String pathString, char delimiter)
Construct a new CategoryPath object, given a single string with components separated by a given delimiter character.
The initial capacity of the constructed object will be exactly what is needed to hold the given path. This fact is convenient when creating a temporary object that will not be reused later.
-
CategoryPath
public CategoryPath(CharSequence... components)
Construct a new CategoryPath object, copying an existing path given as an array of strings.
The new object occupies exactly the space it needs, without any spare capacity. This is the expected behavior in the typical use case, where this constructor is used to create a temporary object which is never reused.
-
CategoryPath
public CategoryPath(CategoryPath existing)
Construct a new CategoryPath object, copying the path given in an existing CategoryPath object.
This copy-constructor is handy when you need to save a reference to a CategoryPath (e.g., when it serves as a key to a hash-table), but cannot save a reference to the original object because its contents can be changed later by the user. Copying the contents into a new object is a solution.
This constructor does not copy the capacity (spare buffer size) of the existing CategoryPath. Rather, the new object occupies exactly the space it needs, without any spare. This is the expected behavior in the typical use case outlined in the previous paragraph.
-
CategoryPath
public CategoryPath(CategoryPath existing, int prefixLen)
Construct a new CategoryPath object, copying a prefix with the given number of components of the path given in an existing CategoryPath object.
If the given length is negative or bigger than the given path's actual length, the full path is taken.
This constructor is often convenient for creating a temporary object with a path's prefix, but this practice is wasteful, and therefore inadvisable. Rather, the application should be written in a way that allows considering only a prefix of a given path, without needing to make a copy of that path.
-
-
Method Detail
-
length
public short length()
Return the number of components in the facet path. Note that this is not the number of characters, but the number of components.
-
trim
public void trim(int nTrim)
Trim the last components from the path.
- Parameters:
nTrim - Number of components to trim. If larger than the number of components this path has, the entire path will be cleared.
-
capacityChars
public int capacityChars()
Returns the current character capacity of the CategoryPath. The character capacity is the size of the internal buffer used to hold the characters of all the path's components. When a component is added and the capacity is not big enough, the buffer is automatically grown, and capacityChars() increases.
-
capacityComponents
public int capacityComponents()
Returns the current component capacity of the CategoryPath. The component capacity is the maximum number of components that the internal buffer can currently hold. When a component is added beyond this capacity, the buffer is automatically grown, and capacityComponents() increases.
-
add
public void add(CharSequence component)
Add the given component to the end of the path.
Note that when a String object is passed to this method, a reference to it is not saved (rather, its content is copied), so the String object becomes garbage once the caller drops it. To reduce the number of garbage objects, you can pass a mutable CharBuffer instead of an immutable String to this method.
-
clear
public void clear()
Empty the CategoryPath object, so that it has zero components. The capacity of the object (see capacityChars() and capacityComponents()) is not reduced, so that the object can be reused without frequent reallocations.
-
appendTo
public void appendTo(Appendable out, char delimiter) throws IOException
Build a string representation of the path, with its components separated by the given delimiter character. The resulting string is appended to a given Appendable, e.g., a StringBuilder, CharBuffer or Writer.
Note that the two cases of zero components and one component with zero length produce indistinguishable results (both of them append nothing). This is normally not a problem, because components should not normally have zero lengths.
An IOException can be thrown if the given Appendable's append() throws this exception.
- Throws:
IOException
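A minimal sketch of the delimiter-joining behavior described above, using a plain List<String> as a stand-in for the path's components. The class AppendToSketch and its helper join are hypothetical names for illustration and are not part of CategoryPath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

/** Hypothetical stand-in showing how components are joined into an Appendable. */
final class AppendToSketch {
    /** Appends the components to out, with the delimiter between each adjacent pair. */
    static void appendTo(List<String> components, Appendable out, char delimiter)
            throws IOException {
        for (int i = 0; i < components.size(); i++) {
            if (i > 0) {
                out.append(delimiter);
            }
            out.append(components.get(i));
        }
    }

    /** Convenience wrapper that builds a new String; a StringBuilder never throws IOException. */
    static String join(List<String> components, char delimiter) {
        StringBuilder sb = new StringBuilder();
        try {
            appendTo(components, sb, delimiter);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

With this sketch, an empty list and a list holding a single empty string both append nothing, which is the ambiguity the note above warns about.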
-
appendTo
public void appendTo(Appendable out, char delimiter, int prefixLen) throws IOException
Like appendTo(Appendable, char), but takes only a prefix of the path, rather than the whole path.
If the given prefix length is negative or bigger than the path's actual length, the whole path is taken.
- Throws:
IOException
-
appendTo
public void appendTo(Appendable out, char delimiter, int start, int end) throws IOException
Like appendTo(Appendable, char), but takes only a part of the path, rather than the whole path.
start specifies the first component in the subpath, and end is one past the last component. If start is negative, 0 is assumed, and if end is negative or past the end of the path, the path is taken until the end. Otherwise, if end<=start, nothing is appended. Nothing is appended also in the case that the path is empty.
- Throws:
IOException
-
toString
public String toString(char delimiter)
Build a string representation of the path, with its components separated by the given delimiter character. The resulting string is returned as a new String object. To avoid this temporary object creation, consider using appendTo(Appendable, char) instead.
Note that the two cases of zero components and one component with zero length produce indistinguishable results (both of them return an empty string). This is normally not a problem, because components should not normally have zero lengths.
-
toString
public String toString()
This method, an implementation of the Object.toString() interface, is to allow simple printing of a CategoryPath, for debugging purposes. When possible, it is recommended to avoid using it. If you want to output the path with its components separated by a delimiter character, specify the delimiter explicitly with toString(char).
-
toString
public String toString(char delimiter, int prefixLen)
Like toString(char), but takes only a prefix with a given number of components, rather than the whole path.
If the given length is negative or bigger than the path's actual length, the whole path is taken.
-
toString
public String toString(char delimiter, int start, int end)
Like toString(char), but takes only a part of the path, rather than the whole path.
start specifies the first component in the subpath, and end is one past the last component. If start is negative, 0 is assumed, and if end is negative or past the end of the path, the path is taken until the end. Otherwise, if end<=start, an empty string is returned. An empty string is also returned if the path is empty.
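The start/end rules above can be sketched as a small clamping routine. This is an illustration over a List<String> stand-in, with hypothetical names (SubPathSketch, subPath); it follows the order stated in the text: clamp start, clamp end, then treat end<=start as empty.

```java
import java.util.List;

/** Hypothetical sketch of the start/end clamping rules for a sub-path. */
final class SubPathSketch {
    static String subPath(List<String> components, char delimiter, int start, int end) {
        int n = components.size();
        if (start < 0) {
            start = 0;              // negative start: begin at the first component
        }
        if (end < 0 || end > n) {
            end = n;                // negative or too-large end: run to the end of the path
        }
        StringBuilder sb = new StringBuilder();
        // If end <= start (including the empty-path case), the loop body never runs.
        for (int i = start; i < end; i++) {
            if (i > start) {
                sb.append(delimiter);
            }
            sb.append(components.get(i));
        }
        return sb.toString();
    }
}
```

For a path with components a, b, c and delimiter '/', subPath(path, '/', 1, 3) and subPath(path, '/', 1, -1) both give "b/c", while subPath(path, '/', 2, 1) gives an empty string.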
-
getComponent
public String getComponent(int i)
Return the i'th component of the path, in a new String object. If there is no i'th component, a null is returned.
-
lastComponent
public String lastComponent()
Return the last component of the path, in a new String object. If the path is empty, a null is returned.
-
copyToCharArray
public int copyToCharArray(char[] outputBuffer, int outputBufferStart, int numberOfComponentsToCopy, char separatorChar)
Copies the specified number of components from this category path to the specified character array, with the components separated by a given delimiter character. The array must be large enough to hold the components and separators - the amount of needed space can be calculated with charsNeededForFullPath().
This method returns the number of characters written to the array.
- Parameters:
outputBuffer - The destination character array.
outputBufferStart - The first location to write in the output array.
numberOfComponentsToCopy - The number of path components to write to the destination buffer.
separatorChar - The separator inserted between every pair of path components in the output buffer.
- See Also:
charsNeededForFullPath()
-
charsNeededForFullPath
public int charsNeededForFullPath()
Returns the number of characters required to represent this entire category path, if written using copyToCharArray(char[], int, int, char) or appendTo(Appendable, char). This includes the number of characters in all the components, plus the number of separators between them (each one character in the aforementioned methods).
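As a rough illustration of how the size calculation and the copy fit together, the sketch below works on a String[] stand-in. The names CopyToCharArraySketch, charsNeeded and copy are hypothetical, and clamping the component count to the range [0, length] is an assumption, since the text does not say how out-of-range counts are handled.

```java
/** Hypothetical sketch: sizing a buffer and copying delimited components into it. */
final class CopyToCharArraySketch {
    /** Characters in all components plus one separator between each adjacent pair. */
    static int charsNeeded(String[] components) {
        if (components.length == 0) {
            return 0;
        }
        int total = components.length - 1; // one separator per gap
        for (String c : components) {
            total += c.length();
        }
        return total;
    }

    /** Copies up to numToCopy components into out, returning the number of chars written. */
    static int copy(String[] components, char[] out, int outStart, int numToCopy, char sep) {
        // Assumption: counts outside [0, length] are clamped.
        int n = Math.max(0, Math.min(numToCopy, components.length));
        int pos = outStart;
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                out[pos++] = sep;
            }
            String c = components[i];
            c.getChars(0, c.length(), out, pos);
            pos += c.length();
        }
        return pos - outStart;
    }
}
```

A caller would typically allocate new char[charsNeeded(components)] once, then pass it to copy, so no intermediate String is created.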
-
add
public void add(CharSequence pathString, char delimiter)
Add the given components to the end of the path. The components are given in a single string, separated by a given delimiter character. If the given string is empty, it is assumed to refer to the root (empty) category, and nothing is added to the path (rather than adding a single empty component).
Note that when a String object is passed to this method, a reference to it is not saved (rather, its content is copied), so the String object becomes garbage once the caller drops it. To reduce the number of garbage objects, you can pass a mutable CharBuffer instead of an immutable String to this method.
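The splitting rule above, including the special case for an empty string, might look like the following sketch. It appends to a List<String> stand-in; the class name DelimitedAddSketch is hypothetical, and keeping empty components between consecutive delimiters is an assumption that the text does not address.

```java
import java.util.List;

/** Hypothetical sketch of adding delimiter-separated components to a path. */
final class DelimitedAddSketch {
    static void add(List<String> path, CharSequence pathString, char delimiter) {
        if (pathString.length() == 0) {
            return; // the empty string denotes the root category: add nothing
        }
        int start = 0;
        for (int i = 0; i < pathString.length(); i++) {
            if (pathString.charAt(i) == delimiter) {
                // Assumption: consecutive delimiters yield an empty component.
                path.add(pathString.subSequence(start, i).toString());
                start = i + 1;
            }
        }
        path.add(pathString.subSequence(start, pathString.length()).toString());
    }
}
```

The manual scan works on any CharSequence, so a reusable CharBuffer or StringBuilder can be passed instead of a String.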
-
equals
public boolean equals(Object obj)
Compare the given CategoryPath to another one. For two category paths to be considered equal, only the path they contain needs to be identical. The unused capacity of the objects is not considered in the comparison.
-
isDescendantOf
public boolean isDescendantOf(CategoryPath other)
Test whether this object is a descendant of another CategoryPath. This is true if the other CategoryPath is the prefix of this.
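The prefix test can be sketched over List<String> stand-ins as below. The name DescendantSketch is hypothetical, and treating a path as a descendant of an identical path is an assumption (a path is trivially a prefix of itself); the text does not spell out that case.

```java
import java.util.List;

/** Hypothetical sketch of the prefix-based descendant test. */
final class DescendantSketch {
    /** True if other's components are a prefix of path's components. */
    static boolean isDescendantOf(List<String> path, List<String> other) {
        if (other.size() > path.size()) {
            return false; // a longer path cannot be a prefix of a shorter one
        }
        return path.subList(0, other.size()).equals(other);
    }
}
```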
-
hashCode
public int hashCode()
Calculate a hashCode for this path, used when a CategoryPath serves as a hash-table key. If two objects are equals(), their hashCodes need to be equal, so like in equals(), hashCode does not consider unused portions of the internal buffers in its calculation.
The hash function used is modeled after Java's String.hashCode() - a simple multiplicative hash function with the multiplier 31. The same hash function also appeared in Kernighan & Ritchie's second edition of "The C Programming Language" (1988).
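For reference, a multiplicative hash with multiplier 31 over a character sequence looks like the sketch below. It hashes one flat sequence only; how CategoryPath folds component boundaries into the value is not stated here, so this is not a reimplementation of its hashCode(). The class name Hash31Sketch is hypothetical.

```java
/** Hypothetical sketch of a multiplier-31 hash over a character sequence. */
final class Hash31Sketch {
    static int hash(CharSequence s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // int overflow wraps, as in String.hashCode()
        }
        return h;
    }
}
```

For any String s, Hash31Sketch.hash(s) equals s.hashCode(), since String.hashCode() is specified as the same polynomial in 31.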
-
hashCode
public int hashCode(int prefixLen)
Like hashCode(), but find the hash function of a prefix with the given number of components, rather than of the entire path.
-
longHashCode
public long longHashCode()
Calculate a 64-bit hash function for this path. Unlike hashCode(), this method is not part of the Java standard, and is only used if explicitly called by the user.
If two objects are equals(), their hash codes need to be equal, so like in equals(Object), longHashCode does not consider unused portions of the internal buffers in its calculation.
The hash function used is a simple multiplicative hash function, with the multiplier 65599. While Java's standard multiplier 31 (used in hashCode()) gives a good distribution for ASCII strings, it turns out that for foreign-language strings (with 16-bit characters) it gives too many collisions, and a bigger multiplier produces fewer collisions in this case.
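A 64-bit multiplicative hash with multiplier 65599 can be sketched as follows. As with the 31-based sketch, this hashes a single flat character sequence and is only an illustration of the technique; the class name LongHashSketch is hypothetical and the treatment of component boundaries in the real method is not described here.

```java
/** Hypothetical sketch of a 64-bit multiplier-65599 hash over a character sequence. */
final class LongHashSketch {
    static long longHash(CharSequence s) {
        long h = 0L;
        for (int i = 0; i < s.length(); i++) {
            h = h * 65599L + s.charAt(i); // long overflow wraps silently
        }
        return h;
    }
}
```

With multiplier 31, a change of +1 in one character is cancelled by a change of -31 in the next, which is why "Aa" and "BB" share the same String.hashCode(). With 65599 the compensating change would have to be 65599, which does not fit in a 16-bit char, so that kind of adjacent-character collision cannot occur.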
-
longHashCode
public long longHashCode(int prefixLen)
Like longHashCode(), but find the hash function of a prefix with the given number of components, rather than of the entire path.
-
serializeAppendTo
public void serializeAppendTo(Appendable out) throws IOException
Write out a serialized (as a character sequence) representation of the path to a given Appendable (e.g., a StringBuilder, CharBuffer, Writer, or something similar).
This method may throw an IOException if the given Appendable threw this exception while appending.
- Throws:
IOException
-
serializeAppendTo
public void serializeAppendTo(int prefixLen, Appendable out) throws IOException
Just like serializeAppendTo(Appendable), but writes only a prefix of the CategoryPath.
- Throws:
IOException
-
setFromSerialized
public int setFromSerialized(CharSequence buffer, int offset)
Set a CategoryPath from a character-sequence representation written by serializeAppendTo(Appendable).
Reading starts at the given offset into the given character sequence, and the offset right after the end of this path is returned.
-
equalsToSerialized
public boolean equalsToSerialized(CharSequence buffer, int offset)
Check whether the current path is identical to the one serialized (with serializeAppendTo(Appendable)) in the given buffer, at the given offset.
-
equalsToSerialized
public boolean equalsToSerialized(int prefixLen, CharSequence buffer, int offset)
Just like equalsToSerialized(CharSequence, int), but compare to a prefix of the CategoryPath, instead of the whole CategoryPath.
-
hashCodeOfSerialized
public static int hashCodeOfSerialized(CharSequence buffer, int offset)
This method calculates a hash function of a path that has been written to (using serializeAppendTo(Appendable)) a character buffer. It is guaranteed that the value returned is identical to that which hashCode() would have produced for the original object before it was serialized.
-
serializeToStreamWriter
public void serializeToStreamWriter(OutputStreamWriter osw) throws IOException
Serializes the content of this CategoryPath to a byte stream, using UTF-8 encoding to convert characters to bytes, and treating the ends as 16-bit characters.
- Parameters:
osw - The output byte stream.
- Throws:
IOException - If there are encoding errors.
-
deserializeFromStreamReader
public void deserializeFromStreamReader(InputStreamReader isr) throws IOException
Deserializes the content of this CategoryPath from a byte stream, using UTF-8 encoding to convert bytes to characters, and treating the ends as 16-bit characters.
- Parameters:
isr - The input stream.
- Throws:
IOException - If there are encoding errors.
-
compareTo
public int compareTo(CategoryPath other)
Compares this CategoryPath with the other CategoryPath for lexicographic order. Returns a negative integer, zero, or a positive integer as this CategoryPath lexicographically precedes, equals, or lexicographically follows the other CategoryPath.
- Specified by:
compareTo in interface Comparable<CategoryPath>
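One way to read "lexicographic order" for paths is component by component, with a shorter path sorting before any path it is a prefix of. The sketch below implements that reading over List<String> stand-ins. Both the class name CompareSketch and the component-wise interpretation are assumptions, since the text does not say whether the real comparison works on components or on raw characters.

```java
import java.util.List;

/** Hypothetical sketch of a component-wise lexicographic comparison of two paths. */
final class CompareSketch {
    static int compare(List<String> a, List<String> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) {
                return c; // first differing component decides the order
            }
        }
        // All shared components are equal: the shorter path comes first.
        return Integer.compare(a.size(), b.size());
    }
}
```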
-
-