Package org.apache.uima.cas.impl
Class CasSerializerSupport.CasDocSerializer
- java.lang.Object
-
- org.apache.uima.cas.impl.CasSerializerSupport.CasDocSerializer
-
- Enclosing class:
- CasSerializerSupport
public class CasSerializerSupport.CasDocSerializer extends Object
Use an inner class to hold the data for serializing a CAS. Each call to serialize() creates its own instance. Package private, to allow a test case to access it; not static, in order to share the logger and the initializing values (could be changed).
-
-
Field Summary
Fields: Modifier and Type, Field, Description
CASImpl cas
TypeSystemImpl filterTypeSystem
IntVector[] indexedFSs
boolean isDelta
Whether the serializer needs to serialize only the deltas, that is, new FSs created after mark represented by Marker object and preexisting FSs and Views that have been modified.
boolean isDynamicMultiRef
Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists
boolean isFiltering
Whether the serializer needs to check for filtered-out types/features.
boolean isFormattedOutput
ListUtils listUtils
MarkerImpl marker
Used to tell if a FS was created before or after mark.
IntVector modifiedEmbeddedValueFSs
PositiveIntSet multiRefFSs
Set of FSs that have multiple references. Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation. This is for JSON, which is computing the multi-refs, not depending on the setting in a feature.
boolean needNameSpaces
Set<String> nsPrefixesUsed
the set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in
Map<String,String> nsUriToPrefixMap
map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string
IntVector previouslySerializedFSs
XmiSerializationSharedData sharedData
for Delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization
Comparator<Integer> sortFssByType
sort a view, by type and then by begin/end asc/des for subtypes of Annotation, then by id
TypeSystemImpl tsi
XmlElementName[] typeCode2namespaceNames
PositiveIntSet_impl visited_not_yet_written
set of FSs that have been visited and enqueued to be serialized - exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
-
Constructor Summary
Constructors: Constructor, Description
CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
Method Summary
Methods: Modifier and Type, Method, Description
int classifyType(int type)
Classifies a type.
void encodeFS(int addr)
Encode an individual FS.
void encodeIndexed()
void encodeQueued()
String getNameSpacePrefix(String uimaTypeName, String nsUri, int lastDotIndex)
int getSofaAddr(int sofaNum)
TypeImpl[] getSortedUsedTypes()
String getTypeNameFromXmlElementName(XmlElementName xe)
String getUniqueString(String s)
String getXmiId(int addr)
Get the XMI ID to use for an FS.
int getXmiIdAsInt(int addr)
boolean isStaticMultiRef(int featCode)
void serialize()
Starts serialization
void writeViewsCommons()
-
-
-
Field Detail
-
cas
public final CASImpl cas
-
tsi
public final TypeSystemImpl tsi
-
visited_not_yet_written
public final PositiveIntSet_impl visited_not_yet_written
Set of FSs that have been visited and enqueued to be serialized. Exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
FSs are added to this set during the "enqueue" phase, prior to encoding.
Uses:
- for Arrays and Lists, used to detect multi-refs
- for Lists, used to detect loops
- during enqueuing phase, to prevent multiple enqueuings
- during encoding phase, to prevent multiple encodings
Public for use by JsonCasSerializer.
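The bookkeeping described for this field can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the UIMA implementation: the class `EnqueueSketch`, the method `enqueue`, its `isInlineArrayOrList` parameter, and the use of `HashSet<Integer>` in place of `PositiveIntSet_impl` are all hypothetical, and FS addresses are modeled as plain ints.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; not part of UIMA. */
final class EnqueueSketch {
  // Stand-in for visited_not_yet_written.
  final Set<Integer> visitedNotYetWritten = new HashSet<>();
  // Stand-in for multiRefFSs.
  final Set<Integer> multiRefFSs = new HashSet<>();
  // FSs waiting to be encoded.
  final Deque<Integer> queue = new ArrayDeque<>();

  /** Returns true if addr is seen for the first time. */
  boolean enqueue(int addr, boolean isInlineArrayOrList) {
    if (!visitedNotYetWritten.add(addr)) {
      // Already visited: never enqueue a second time. For an inline array
      // or list, a second sighting means it is referenced more than once.
      if (isInlineArrayOrList && multiRefFSs.add(addr)) {
        queue.add(addr); // assumption: a newly found multi-ref is queued exactly once
      }
      return false;
    }
    if (!isInlineArrayOrList) {
      queue.add(addr); // inline arrays and lists are tracked but not enqueued
    }
    return true;
  }
}
```

The same membership test that blocks a second enqueue is what would stop a walk over a looping list, since the walk revisits an address already in the set.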
-
multiRefFSs
public final PositiveIntSet multiRefFSs
Set of FSs that have multiple references. Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation.
This is for JSON, which is computing the multi-refs, not depending on the setting in a feature. This is also for xmi, to enable adding to "queue" (once) for each FS of this kind.
Used to:
- limit the number of times this is put onto the queue to 1.
- skip encoding of items on "queue" if not in this Set (maybe not needed? 8/2017 mis)
- serialize if not in indexed set, dynamic ref == true, and in this set (otherwise serialize only from ref)
-
isDynamicMultiRef
public final boolean isDynamicMultiRef
Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists
-
previouslySerializedFSs
public IntVector previouslySerializedFSs
-
modifiedEmbeddedValueFSs
public IntVector modifiedEmbeddedValueFSs
-
indexedFSs
public final IntVector[] indexedFSs
-
listUtils
public final ListUtils listUtils
-
typeCode2namespaceNames
public XmlElementName[] typeCode2namespaceNames
-
needNameSpaces
public boolean needNameSpaces
-
nsUriToPrefixMap
public final Map<String,String> nsUriToPrefixMap
map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string
-
nsPrefixesUsed
public final Set<String> nsPrefixesUsed
the set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in
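How the two namespace fields could cooperate is sketched below. This is an illustration only: `NamespacePrefixSketch`, `prefixFor`, and the numeric-suffix scheme for resolving collisions are assumptions made for the example and are not taken from the UIMA source.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; not part of UIMA. */
final class NamespacePrefixSketch {
  // Stand-in for nsUriToPrefixMap: namespace URI -> prefix already assigned.
  final Map<String, String> nsUriToPrefixMap = new HashMap<>();
  // Stand-in for nsPrefixesUsed: every prefix that must not be handed out again.
  final Set<String> nsPrefixesUsed = new HashSet<>();

  /** Returns the prefix for nsUri, choosing an unused one on first request. */
  String prefixFor(String nsUri, String preferredPrefix) {
    String existing = nsUriToPrefixMap.get(nsUri);
    if (existing != null) {
      return existing; // same URI always maps to the same prefix
    }
    String candidate = preferredPrefix;
    int suffix = 2;
    // Set.add returns false when the prefix is taken, e.g. by set-aside data.
    while (!nsPrefixesUsed.add(candidate)) {
      candidate = preferredPrefix + suffix++;
    }
    nsUriToPrefixMap.put(nsUri, candidate);
    return candidate;
  }
}
```

Pre-loading `nsPrefixesUsed` with prefixes found in set-aside data is what keeps newly generated prefixes from colliding with them.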
-
marker
public final MarkerImpl marker
Used to tell if a FS was created before or after mark.
-
sharedData
public final XmiSerializationSharedData sharedData
for Delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization
-
isDelta
public final boolean isDelta
Whether the serializer needs to serialize only the deltas, that is, new FSs created after the mark represented by the Marker object, and preexisting FSs and Views that have been modified. Set to true if the Marker object is not null and the CASImpl object of this serializer matches the CASImpl in the Marker object.
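The condition stated for isDelta can be written out as a small sketch. The types `Cas` and `Marker` below are hypothetical stand-ins for CASImpl and MarkerImpl, and the sketch only mirrors the sentence above; it does not show what the real constructor does when the marker belongs to a different CAS.

```java
/** Hypothetical sketch; not part of UIMA. */
final class DeltaFlagSketch {
  /** Stand-in for a CAS; identity is all that matters here. */
  static final class Cas {}

  /** Stand-in for a marker that remembers the CAS it was created on. */
  static final class Marker {
    final Cas cas;
    Marker(Cas cas) { this.cas = cas; }
  }

  /** True only if a marker is present and it belongs to the CAS being serialized. */
  static boolean isDelta(Cas serializerCas, Marker marker) {
    return marker != null && marker.cas == serializerCas;
  }
}
```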
-
isFiltering
public final boolean isFiltering
Whether the serializer needs to check for filtered-out types/features. Set to true if type system of CAS does not match type system that was passed to constructor of serializer.
-
filterTypeSystem
public TypeSystemImpl filterTypeSystem
-
isFormattedOutput
public final boolean isFormattedOutput
-
sortFssByType
public final Comparator<Integer> sortFssByType
Sorts a view by type, then (for subtypes of Annotation) by begin ascending and end descending, then by id.
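A comparator with this ordering might look like the sketch below. It is an assumption-laden illustration: the real field is a `Comparator<Integer>` over FS addresses, whereas the example compares hypothetical `Fs` records, orders types by an int code, and reads "asc/des" as begin ascending and end descending.

```java
import java.util.Comparator;

/** Hypothetical sketch; not part of UIMA. */
final class SortByTypeSketch {
  /** Minimal stand-in for a feature structure. */
  static final class Fs {
    final int id;
    final int typeCode;
    final int begin;
    final int end;
    final boolean isAnnotation;

    Fs(int id, int typeCode, int begin, int end, boolean isAnnotation) {
      this.id = id;
      this.typeCode = typeCode;
      this.begin = begin;
      this.end = end;
      this.isAnnotation = isAnnotation;
    }
  }

  static final Comparator<Fs> SORT_FSS_BY_TYPE = (a, b) -> {
    int c = Integer.compare(a.typeCode, b.typeCode);
    if (c != 0) {
      return c;
    }
    // Same type from here on, so both are annotations or neither is.
    if (a.isAnnotation && b.isAnnotation) {
      c = Integer.compare(a.begin, b.begin); // begin ascending
      if (c != 0) {
        return c;
      }
      c = Integer.compare(b.end, a.end); // end descending
      if (c != 0) {
        return c;
      }
    }
    return Integer.compare(a.id, b.id); // final tie-break by id
  };
}
```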
-
-
Constructor Detail
-
CasDocSerializer
public CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
- Parameters:
ch
cas
sharedData
marker
csss
-
-
CasDocSerializer
public CasDocSerializer(ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
-
Method Detail
-
getSofaAddr
public int getSofaAddr(int sofaNum)
- Parameters:
sofaNum - starts at 1
- Returns:
the addr of the sofa FS, or 0
-
getSortedUsedTypes
public TypeImpl[] getSortedUsedTypes()
-
encodeFS
public void encodeFS(int addr) throws Exception
Encode an individual FS. JSON has 2 encodings.
For type:
"typeName" : [ { "@id" : 123, feat : value .... }, { "@id" : 456, feat : value .... }, ... ], ...
For id:
"nnnn" : {"@type" : typeName ; feat : value ...}
For cases where the top level type is an array or list, there is a generated feature name, "@collection", whose value is the list or array of values associated with that type.
- Parameters:
addr - The address to be encoded.
- Throws:
SAXException - passthru
Exception
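To make the two JSON layouts concrete, the sketch below renders one FS in each form. It is only an illustration of the shapes described above: `JsonLayoutSketch`, `elementByType`, and `entryById` are hypothetical names, feature values are limited to ints, and the real serializer writes through a JSON generator rather than building strings.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch; not part of UIMA. */
final class JsonLayoutSketch {
  /** Appends each feature as "name" : value. */
  private static void addFeatures(StringJoiner sj, Map<String, Integer> feats) {
    feats.forEach((name, value) -> sj.add("\"" + name + "\" : " + value));
  }

  /** By-type layout: one element of the array stored under "typeName". */
  static String elementByType(int id, Map<String, Integer> feats) {
    StringJoiner sj = new StringJoiner(", ", "{ ", " }");
    sj.add("\"@id\" : " + id);
    addFeatures(sj, feats);
    return sj.toString();
  }

  /** By-id layout: the id is the key and the type moves inside as "@type". */
  static String entryById(int id, String typeName, Map<String, Integer> feats) {
    StringJoiner sj = new StringJoiner(", ", "{ ", " }");
    sj.add("\"@type\" : \"" + typeName + "\"");
    addFeatures(sj, feats);
    return "\"" + id + "\" : " + sj;
  }
}
```

In the by-type layout the type name is stated once for a whole array of FSs, while in the by-id layout every entry carries its own "@type".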
-
classifyType
public final int classifyType(int type)
Classifies a type. This returns an integer code identifying the type as one of the primitive types, one of the array types, one of the list types, or a generic FS type (anything else).
The LowLevelCAS.ll_getTypeClass(int) method classifies primitives and array types, but does not have a special classification for list types, which we need for XMI serialization. Therefore, in addition to the type codes defined on LowLevelCAS, this method can return one of the type codes TYPE_CLASS_INTLIST, TYPE_CLASS_FLOATLIST, TYPE_CLASS_STRINGLIST, or TYPE_CLASS_FSLIST.
- Parameters:
type - the type to classify
- Returns:
one of the TYPE_CLASS codes defined on LowLevelCAS or on this interface.
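The layering described here, a base classifier extended with list codes, can be sketched as follows. Everything in the example is hypothetical: the numeric constant values, the `baseTypeClass` stand-in for LowLevelCAS.ll_getTypeClass(int), and the detection of list types by name suffix (the real method takes an int type code).

```java
/** Hypothetical sketch; not part of UIMA. */
final class ClassifyTypeSketch {
  // Illustrative values only; the real constants are defined elsewhere.
  static final int TYPE_CLASS_INT = 1;
  static final int TYPE_CLASS_FS = 8;
  static final int TYPE_CLASS_INTLIST = 101;
  static final int TYPE_CLASS_FLOATLIST = 102;
  static final int TYPE_CLASS_STRINGLIST = 103;
  static final int TYPE_CLASS_FSLIST = 104;

  /** Stand-in for the base classifier: it has no notion of list types. */
  static int baseTypeClass(String typeName) {
    return typeName.equals("uima.cas.Integer") ? TYPE_CLASS_INT : TYPE_CLASS_FS;
  }

  /** Checks the list types first, then falls back to the base classifier. */
  static int classifyType(String typeName) {
    if (typeName.endsWith("IntegerList")) {
      return TYPE_CLASS_INTLIST;
    }
    if (typeName.endsWith("FloatList")) {
      return TYPE_CLASS_FLOATLIST;
    }
    if (typeName.endsWith("StringList")) {
      return TYPE_CLASS_STRINGLIST;
    }
    if (typeName.endsWith("FSList")) {
      return TYPE_CLASS_FSLIST;
    }
    return baseTypeClass(typeName);
  }
}
```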
-
getXmiId
public String getXmiId(int addr)
Get the XMI ID to use for an FS.
- Parameters:
addr - address of FS
- Returns:
XMI ID. If addr == CASImpl.NULL, returns null
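The null contract of this method is easy to miss, so here is a minimal sketch of it. The constant `NULL` is a hypothetical stand-in for CASImpl.NULL, and using the address itself as the id is an assumption made for brevity; how the real serializer derives the id is not described on this page.

```java
/** Hypothetical sketch; not part of UIMA. */
final class XmiIdSketch {
  static final int NULL = 0; // stand-in for CASImpl.NULL (value assumed)

  /** Returns null for the null address, otherwise an id string. */
  static String getXmiId(int addr) {
    if (addr == NULL) {
      return null; // callers must handle a missing reference
    }
    return Integer.toString(addr); // assumption: the address doubles as the id
  }
}
```

A caller writing an attribute would typically skip it when the returned id is null rather than emit an empty reference.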
-
getXmiIdAsInt
public int getXmiIdAsInt(int addr)
-
getNameSpacePrefix
public String getNameSpacePrefix(String uimaTypeName, String nsUri, int lastDotIndex)
-
getTypeNameFromXmlElementName
public String getTypeNameFromXmlElementName(XmlElementName xe)
-
isStaticMultiRef
public boolean isStaticMultiRef(int featCode)
-
-