FAQs
The GJXDM element j:PersonModusOperandi is of type j:ActivityType, and can be used to describe "A methodology of action, particularly a criminal action, known to be routinely associated with a person's crimes."
Content Elements:
Content elements enclose data. The following is an example:
<Person s:id="A">
...
<PersonName>
<PersonFullName>Adam Smith</PersonFullName>
</PersonName>
...
</Person>
In this example, there is a person object. The person contains an element called PersonName. The PersonName element contains an element called PersonFullName. The PersonFullName element contains a string Adam Smith. The PersonFullName element is obviously a content-containing element. It has the person’s name (a literal string) as its content.
PersonName is also a content-containing element: its content represents the person's name as a structured object. It contains the element PersonFullName, and could contain additional elements.
Reference Elements:
Reference elements do not enclose content. Instead, they reference content as external objects:
<Incident>
<ActivityDate>2003-10-02</ActivityDate>
...
<IncidentSeizedPropertyRef s:ref="C"/>
...
</Incident>
In the above example, the property that was seized as part of the incident is referenced out to another object, an XML object in the same XML instance, with the identifier C.
<Property s:id="C">
<PropertyDescriptionText>
White microwave oven
</PropertyDescriptionText>
<PropertyTypeCode>HOVEN</PropertyTypeCode>
<PropertyMakeName>Kenmore</PropertyMakeName>
<PropertyModelName>63292</PropertyModelName>
</Property>
The object that has the identifier C is an instance of Property, specifically representing a microwave oven. The reasons for representing the microwave oven outside of the incident should be quite evident: it is its own object, independent of the incident. It has its own life cycle. If the incident did not exist, the microwave oven would still exist.
The seized property is an element of the incident because it is a fixed part of the incident. The incident involved the seizing of the property, and that will not change. However, the element should be a reference element, because the property has its own life cycle outside of the incident.
Abstract elements are elements that are defined in an XML schema but cannot appear in an XML instance; this is an XML Schema mechanism for forcing substitution. Substitution groups must be used whenever abstract elements are defined: an element that is a member of the abstract element's substitution group must appear in place of the abstract element in an XML instance.
NIEM uses abstract elements as the head element for all substitution groups. Abstract elements are used throughout NIEM wherever there is a concept that can be represented in multiple ways. An abstract element serves as a placeholder in the reference model, but it must be substituted when creating an exchange specification.
Below is a snippet of NIEM schema to illustrate this.
<xsd:complexType name="PersonType">
<xsd:complexContent>
<xsd:extension base="u:SuperType">
<xsd:sequence>
<xsd:element ref="u:PersonSex" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
<xsd:element name="PersonSex" abstract="true"/>
<xsd:element name="PersonSexText" type="u:TextType" substitutionGroup="u:PersonSex"/>
<xsd:element name="PersonSexCode" type="ncic:SEXCodeType" substitutionGroup="u:PersonSex"/>
In the example above, both PersonSexText and PersonSexCode belong to the PersonSex substitution group. This means either PersonSexText or PersonSexCode must replace PersonSex wherever it appears in an instance of PersonType; because PersonSex is abstract, it can never appear itself.
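As a rough sketch of what this substitution looks like in an instance, consider the fragment below. The element name u:Person, the placeholder namespace URI, and the values M and Male are assumptions made for illustration; only PersonSexText and PersonSexCode come from the schema snippet.

```xml
<!-- Hypothetical instance; urn:example:u stands in for the real namespace bound to u: -->
<u:Person xmlns:u="urn:example:u">
  <!-- PersonSex is abstract, so a substitution group member appears in its place -->
  <u:PersonSexCode>M</u:PersonSexCode>
  <!-- maxOccurs="unbounded" on the head element permits a further member -->
  <u:PersonSexText>Male</u:PersonSexText>
</u:Person>
```

An instance containing a literal u:PersonSex element would be rejected, since an abstract element cannot appear in an instance.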
The answer is that technically you can use a code list as an element or an attribute or both. However, rule 2 of the NIEM Conformance Rules (https://www.niem.gov/aboutniem/grant-funding/Pages/implementation-guide…) states the following: “If the appropriate component (type, element, attribute, etc.) required for an IEPD exists in the NIEM, use that component. Do not create a duplicate component of one that already exists.” By defining both an attribute and an element you are creating two components that mean the same thing. One would need a very strong business justification for using the code list as both an element and an attribute.
Members of a substitution group are not mutually exclusive. However, cardinality constraints on the head element apply. If the head element has maxOccurs="1" (or maxOccurs is not specified, as the default value is 1), only one (but any one) of the substitution group members may be used. If the head element has maxOccurs="unbounded", then any number of substitution group members may be used in any combination and in any order. For example, if elements A, B, and C are members of the substitution group, you could have such combinations as AA, AAB, ABC, CBBA, and so on (although only certain combinations are likely to make business sense).
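The cardinality behavior can be sketched with a small standalone schema. Everything here is hypothetical: the namespace urn:example:subst and the names Head, ExactlyOne, and AnyNumber are invented for illustration, and A, B, and C simply mirror the letters used in the text.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ex="urn:example:subst"
            targetNamespace="urn:example:subst"
            elementFormDefault="qualified">
  <!-- Abstract head element plus three substitution group members -->
  <xsd:element name="Head" abstract="true"/>
  <xsd:element name="A" type="xsd:string" substitutionGroup="ex:Head"/>
  <xsd:element name="B" type="xsd:string" substitutionGroup="ex:Head"/>
  <xsd:element name="C" type="xsd:string" substitutionGroup="ex:Head"/>

  <!-- minOccurs and maxOccurs omitted (both default to 1): exactly one of A, B, or C -->
  <xsd:element name="ExactlyOne">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="ex:Head"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <!-- maxOccurs="unbounded": any mix in any order, such as A A B or C B B A -->
  <xsd:element name="AnyNumber">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="ex:Head" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```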
Consider the following scenario:
An IEPD has an existing extension element that is a code for widget capabilities (a widget can have multiple capabilities), and the information exchange XML needs to include not only the widget capability codes but also a description of each code used. There are two options.
The name should be WidgetCapabilityText, and assuming you can have more than one widget capability in an instance, the example below is a common way to do it:
<my-ext:Widget>
<my-ext:WidgetCapabilityCode>WD</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityCode>SY</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityText>Washes the dishes</my-ext:WidgetCapabilityText>
<my-ext:WidgetCapabilityText>Sweeps the yard</my-ext:WidgetCapabilityText>
</my-ext:Widget>
Another way to get this done is to use the metadata element nc:DescriptionText to hold the literal and apply that metadata element to WidgetCapabilityCode.
<nc:Metadata s:id="WD">
<nc:DescriptionText>Washes the dishes</nc:DescriptionText>
</nc:Metadata>
<nc:Metadata s:id="SY">
<nc:DescriptionText>Sweeps the yard</nc:DescriptionText>
</nc:Metadata>
…
<my-ext:Widget>
<my-ext:WidgetCapabilityCode s:metadata="WD">WD</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityCode s:metadata="SY">SY</my-ext:WidgetCapabilityCode>
</my-ext:Widget>
(Note that the id values are arbitrarily chosen to be the same as the code values.)
The GJXDM and NIEM schemas (and subsets thereof) contain global element declarations. If there are multiple global element declarations in a schema, a validating parser is not capable of determining which of those global elements is intended to be the root of a valid instance. (Some parsers consider this an error, others a warning; the XML Schema specification is ambiguous on the proper parser behavior.)
For this reason, it is a best practice for every IEPD to include a document schema that identifies the single element that is the root of a valid instance. The document schema defines an IEPD-specific namespace that contains only a single element declaration. The element's type should be in either the GJXDM or NIEM namespace or an IEPD-specific extension namespace.
So, in summary answer to the question: Always use a document schema when it is important to you to identify the root element of a valid instance. Since this is almost always desired, it is a best practice always to define a document schema.
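A document schema along these lines might look like the sketch below. The namespaces, the file name extension.xsd, and the names IncidentReport and IncidentReportType are all hypothetical; the point is only that the document namespace declares exactly one global element.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:ext="urn:example:my-iepd:extension"
            targetNamespace="urn:example:my-iepd:document"
            elementFormDefault="qualified">
  <!-- Import the namespace that defines the root element's type -->
  <xsd:import namespace="urn:example:my-iepd:extension"
              schemaLocation="extension.xsd"/>
  <!-- The only element declared in this namespace: the root of a valid instance -->
  <xsd:element name="IncidentReport" type="ext:IncidentReportType"/>
</xsd:schema>
```

Because the document namespace contains a single element declaration, a reader or tool can identify the intended root of an instance without guessing among the many global elements of the reference schemas.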
An extension schema defines an IEPD-specific namespace that contains types and elements that are particular to that IEPD. Types in an extension schema should extend types in GJXDM or NIEM; elements in an extension schema should be either of types in the extension schema, or types in GJXDM or NIEM.
It is typical for an IEPD to have at least a few extensions. Extension schemas play a valuable role in the concept of conformance. While it is important for IEPDs to leverage types and elements in GJXDM and NIEM if they fit the semantics of the exchange, it is equally important not to use a type or element from them if its definition does not fit the semantics. It is very important not to use a GJXDM or NIEM element solely on the basis of the name "sounding like" what is intended in the exchange semantics; the definition must be consistent with those semantics as well.
So, in summary answer to the question: Use an extension schema whenever the exchange involves semantics that do not exist in GJXDM or NIEM.
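A minimal extension schema might be sketched as follows. All names here are assumptions: urn:example:reference-model stands in for the GJXDM or NIEM namespace, subset.xsd for the imported subset, and WidgetOwnerType, WidgetOwner, and WidgetOwnerSinceDate are invented. Plain xsd:date is used only to keep the sketch short.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:base="urn:example:reference-model"
            xmlns:ext="urn:example:my-iepd:extension"
            targetNamespace="urn:example:my-iepd:extension"
            elementFormDefault="qualified">
  <xsd:import namespace="urn:example:reference-model"
              schemaLocation="subset.xsd"/>

  <!-- Extend a reference-model type with a property the model lacks -->
  <xsd:complexType name="WidgetOwnerType">
    <xsd:complexContent>
      <xsd:extension base="base:PersonType">
        <xsd:sequence>
          <xsd:element ref="ext:WidgetOwnerSinceDate" minOccurs="0"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Elements in the extension namespace, typed by extension or built-in types -->
  <xsd:element name="WidgetOwnerSinceDate" type="xsd:date"/>
  <xsd:element name="WidgetOwner" type="ext:WidgetOwnerType"/>
</xsd:schema>
```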
Alternatives to schema subsets include the use of restriction, modularizing the full JXDD schema, or just copying over only the elements and definitions one needs.
The XML Schema restriction mechanism allows users to take a type and restrict away the elements they don't need and to modify the occurrence restrictions of other elements. While this seems like an acceptable approach, it presents some problems that may not be obvious. The first problem is that if users create a schema subset by restricting the full JXDD schema, the full JXDD schema would still be imported. No benefits would be gained in loading or validation time.
The second problem is that restrictions cannot be enforced. To create a restriction, a new local type would be created based on the original JXDD type. Elements could be dropped or their number of occurrences reduced. The local schema would still have to import the full JXDD schema to do this, but using fast validation tools would make this possible. The real problem is in usage. Elements defined to be of the original JXDD type would be able to use the local restricted type in the XML instance through type substitution; however, there is no way to enforce this type substitution to occur. It would still be entirely possible, and in fact easier, for the original unrestricted JXDD type to be used. Validation would not recognize that the local restricted type should be used instead of the original JXDD type. The only way to work around this would be to create a new local element of the new locally restricted type. Validation would then enforce that the local type be used, but the element would have no connection to the JXDD and would not be understood by others to whom the schema is sent. This loses much of the benefit gained by using the JXDD - understandability.
The use of restriction is not prohibited, but it offers much less in terms of performance benefits and validation support than schema subsets provide. Furthermore, in most cases restriction is not a sufficient alternative to schema subsets.
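The enforcement problem with restriction can be illustrated with a sketch. The namespaces, the file name jxdd-full.xsd, and the type name SlimPersonType are hypothetical, and the sketch assumes that j:PersonType and j:PersonName exist in the imported schema and that every other element in the base type is optional, so that it may legally be dropped.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:j="urn:example:jxdd"
            xmlns:local="urn:example:local"
            targetNamespace="urn:example:local">
  <!-- The full schema must still be imported in order to restrict one of its types -->
  <xsd:import namespace="urn:example:jxdd" schemaLocation="jxdd-full.xsd"/>

  <!-- Local restricted type: keeps only an optional, non-repeating PersonName -->
  <xsd:complexType name="SlimPersonType">
    <xsd:complexContent>
      <xsd:restriction base="j:PersonType">
        <xsd:sequence>
          <xsd:element ref="j:PersonName" minOccurs="0" maxOccurs="1"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- In an instance, an element declared as j:PersonType uses the restricted type
       only when the author adds xsi:type="local:SlimPersonType". Without that
       attribute the element still validates against the unrestricted type. -->
</xsd:schema>
```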
Another alternative would be to modularize the full JXDD schema into different components, as has been suggested. The problem with this is that there is no set of lines over which modularization would work and provide the benefits desired. If the full JXDD schema were divided into smaller components, the smaller components would still need to import each other because they are all interrelated. A person module would need reference to a location module, a contact information module, an organization module, a miscellaneous module for common types, and some subset of an activity module for the person subtypes. Little performance gain is achieved and complexity is increased.
A seemingly simpler alternative to building schema subsets would be for users to copy over only those element and type definitions that they need from the full JXDD schema into their own document schema. This approach has problems as well. If users copy over JXDD components into their document schema without putting them into the Justice namespace, then other users would not be able to recognize that those components come from the JXDD. The namespace is what identifies the common source of the components; without this, recognition is lost.
Instead, if users copy over JXDD components into their own document schema and put them under the Justice namespace, then this tie to the JXDD is in name only. Usually, the structural definitions of JXDD components that are used in a local schema are imported from a definition schema in the Justice namespace. The full JXDD schema is one such definition schema - it is an official definition of the JXDD elements and types. The JXDD schema subset is another. A local copy of JXDD components in a document schema is just that - a local copy. There is no official structural definition schema against which to validate and ensure that the components appear as they should. The local document only validates against itself. There is no guarantee that components are actually from the JXDD; at best all you have is a claim. In either case, local copies with or without Justice namespace references, it would not be possible to reference and identify appropriate components as valid Justice elements and types.
Many factors need to be taken into account before deciding to use the SSGT, including the end goal, the alternatives, and short-term and long-term concerns.
In order to produce a set of schema subsets, the tool needs the following properties.
The tool must allow a user to search and navigate through the full Justice dictionary. This is necessary because users will need to see what is available before they can choose which parts they want to use.
The tool must give users the ability to create schema subsets by adding constraints.
The tool should allow users to create extension and document schemas by making customizations. Notice that this is not required functionality - a base tool could be built without it but would not be capable of providing the complete set of schemas.
The tool must be able to generate the customized schema subsets from the user input. This requires knowledge of the dictionary, data model, and the rules for creating valid schema subsets.
Standard Commercial Registries
There is no purpose in recreating an existing product that could meet our needs. Therefore it is important to take a look at what a commercial, off-the-shelf, ebXML-compliant registry could offer us. A commercial registry could catalogue the Justice dictionary and store metadata about it, either at a component level or a document level. A commercial registry could also give users some manner of searching and retrieving data through a user interface.
These are important and necessary functionalities, but they are not adequate to support the construction of customized schemas. To start with, only one class of registry could be used: a registry with component-level granularity. Any other type of registry would be useless for our purposes. Document-level granularity would mean that the registry could only store and retrieve the dictionary as a full JXDD schema. This gives users no support in accessing and customizing individual components and defeats our purpose. Suppose we then choose a registry that has component-level granularity. It would be able to store the dictionary (a list of elements and types with definitions) piece by piece rather than lumped together in a single document. However, there is no way for any off-the-shelf registry to have knowledge of the Justice data model that the dictionary is based upon. This data model is very important: it has relationships built into it that give the JXDD its power and flexibility.
Off the shelf, no registry would be able to utilize the JXDD to its full potential. Additionally, the registry would have no mechanism to build the schema subsets or any knowledge of how to do so. It is apparent from the volume of comments received that the need for a customized schema subset generation tool is immediate. Because there is no product right now that is capable of this, it must be built.
This tool should have the capabilities outlined in the requirements section above. The tool should provide a graphical user interface to allow users to search through the dictionary components, add constraints and customizations, and define customized schemas. The schema subset generation tool should take in user input and, from that input, generate a valid set of customized schema subsets, carefully formed to maintain their integrity and interoperability. The tool should then return the set of schemas to the user, who then becomes the owner of those files.
Future work: Although no off-the-shelf registry product is ready to meet our current needs, it might be possible for an existing registry to be modified so that it supports the full Justice data model and all of the requirements for building customized schema subsets. To start with, this would involve some research and comparison of different registry products and analysis of potential candidates to determine whether making such modifications is feasible. If so, awareness of the Justice data model and the capacity to build schema subsets could then be added. If it is not possible to make the necessary enhancements to a commercial registry, it becomes necessary to build a custom registry to fit the Justice data model.
After a registry is either modified or built, the back end of the schema subset generation tool will need to be changed to communicate with the registry. This allows code maintenance to be performed on the registry side and new versions of the JXDD to be handled automatically rather than forcing tool upgrades.
Will this tool be the only way to create schema subsets?
No. There are other ways this could be done. One step for the schema generation tool will be to translate the user input specifying how to build the customized schemas into an XML request file or wantlist. This will happen in the background, transparent to the user. The wantlist would be sent to the registry. The registry would process the file and then generate and return the customized schema subsets. The format of the request file should be publicly available, so that others can create their own front-ends and still use the registry to produce the actual schemas.
Another way to generate schema subsets would be to create and distribute a library that could perform the same functionality as the registry tool. A third way would be for users to go through the set of full schemas making restrictions and creating extension and document schemas by hand. Another might be through the use of the Extensible Stylesheet Language (XSL). There are probably many different ways that this work could be done. The benefit of using a JXDD schema subset generation tool is that if a user specifies valid input, an appropriately and consistently formed set of customized schema subsets will be returned. Without a thorough understanding of the Justice data model, it could be very easy to unintentionally break conformance.
SSGT: http://niem.gtri.gatech.edu/niemtools/ssgt/SSGT-Search.iepd
Constraint schemas allow one to place extra conditions on elements in addition to those already provided by GJXDM or NIEM. Using a constraint schema avoids having to define a valid subset schema for every situation where data needs vary depending on the context. Constraint schemas add the constraints that a particular organization requires. In other words, a constraint schema is a second layer of validation that an organization can use to verify that its data exchanges conform to its own needs when those needs are more specific than what the GJXDM or NIEM provide. Constraint schemas allow one to sidestep the natural side effect of global definitions, that is, having data represented one and only one way.
What one cannot do in a constraint schema is add new elements to a type, change the order of the elements as they are defined in the full GJXDM or NIEM schemas, or change an element's base type to a type that is not a valid subset of the original base type. For instance, a constraint schema could redefine an element originally defined as xsd:string to be xsd:decimal, but not the reverse: every decimal value is also a valid string, whereas not every string is a valid decimal. Any of the prohibited adjustments would require an extension schema.
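The string-to-decimal case might be sketched as follows. The namespace urn:example:reference-model and the element name WidgetWeightValue are hypothetical; the sketch assumes the reference or subset schema declares the same element, in the same namespace, with type xsd:string.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:example:reference-model"
            elementFormDefault="qualified">
  <!-- Reference or subset schema (assumed): the element is declared as xsd:string. -->
  <!-- Constraint schema: same name and namespace, narrower type. Any value that
       passes here, such as 12.5, is also a valid string, so the instance still
       validates against the original declaration. -->
  <xsd:element name="WidgetWeightValue" type="xsd:decimal"/>
</xsd:schema>
```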
When validating an instance document that was generated to meet a specification that includes only a constraint schema, simply change the schema location in the instance document from the constraint schema to the full GJXDM schema for the second validation pass.
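One way this switch might look, assuming the instance carries an xsi:schemaLocation hint, is sketched below. The namespace, the root element name, and the paths constraint/document.xsd and full/document.xsd are hypothetical.

```xml
<!-- Pass 1 (constraint validation): the hint points at the constraint schema set. -->
<!-- Pass 2 (conformance validation): change the second token of xsi:schemaLocation
     to full/document.xsd and validate again. -->
<doc:IncidentReport
    xmlns:doc="urn:example:my-iepd:document"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:my-iepd:document constraint/document.xsd">
  <!-- payload omitted -->
</doc:IncidentReport>
```

The instance content itself does not change between the two passes; only the schema it is checked against does. Many validators also accept the schema as a separate argument, in which case the instance need not be edited at all.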
https://www.niem.gov/about-niem/news/niem-naming-and-design-rules-50-be…
Guidelines for XML Schemas: http://xml.coverpages.org/schemas.html
Since the full GJXDM and NIEM schemas include components that are optional and over-inclusive, users have the ability to retrieve only those components from the data dictionaries that they need. Many users will not want every element to be able to occur repeatedly. Furthermore, it is unlikely that a user will need to use the entire contents of the full schemas. This is the basic idea behind schema subsets—to provide smaller schemas that define only those components from the dictionary that the user wants to include.
The full GJXDM or NIEM schemas can be used, but it is not necessary to do so. Smaller schema subsets can be more manageable than the full schemas and will usually permit more rapid validation of document instances. The overriding rule for using GJXDM or NIEM schema subsets is as follows: Document instances that validate to the schema subset will validate to the full GJXDM or NIEM.
The Schema Subset Generation Tool (SSGT) produces compliant schema subsets. It is available at the following location:
NIEM SSGT: http://niem.gtri.gatech.edu/niemtools/ssgt/SSGT-Search.iepd
The GJXDM and NIEM schemas represent reference models that are intentionally optional and over-inclusive. The purpose of the reference model is to focus structure and semantics through well-defined types and properties, not to dictate which components to use or how to fine-tune their content.
Constraint schemas do this in two different ways:
Encapsulating Cardinality
Constraint schemas can be used solely to constrain cardinality. This allows for cardinality validation via XSD. In this case, the constraint schema is simply a copy of the subset schema with "minOccurs" and "maxOccurs" attributes set. When validating, it takes the place of the subset schema.
There are utilities to help create constraint schemas from a subset schema. Constraint schemas are optional. Cardinality can also be enforced solely within the subset schema, at the cost of easy reuse.
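A cardinality-only constraint might be sketched as follows. The namespace and the simplified PersonType are hypothetical; the sketch assumes the subset schema declares PersonName inside PersonType with minOccurs="0" and maxOccurs="unbounded".

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:m="urn:example:reference-model"
            targetNamespace="urn:example:reference-model"
            elementFormDefault="qualified">
  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <!-- Subset schema (assumed): minOccurs="0" maxOccurs="unbounded" -->
      <!-- Constraint schema: this exchange requires exactly one name -->
      <xsd:element ref="m:PersonName" minOccurs="1" maxOccurs="1"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="PersonName" type="xsd:string"/>
</xsd:schema>
```

Because exactly one occurrence falls within the range of zero to unbounded, any instance that passes this constraint schema also validates against the looser subset declaration.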
Enforcing Business Rules
Constraint schemas can also let you make non-GJXDM or non-NIEM modifications to subset schemas in order to enforce business rules. Practical exchanges of actual XML instances usually require much tighter constraints on elements and attributes than the schema representing the reference model allows. In a constraint schema, the user can restrict element occurrences and employ facets. An instance must pass both the conformance validation path and the constraint validation path. In other words, a two-path, parallel validation is required.
Conformance validation ensures proper uses of the GJXDM and NIEM, and constraint validation (more restrictive) confirms that an instance adheres to rules specific to the user’s application. In many cases (though not a requirement), a constraint schema will be a copy of the GJXDM or NIEM schema subset (or full) with appropriate constraints applied.
The Department of Justice funded the Georgia Tech Research Institute (GTRI) to build a Subset Schema Generation Tool (SSGT) that would allow searching the GJXDM and creating customized subsets. A similar tool exists for NIEM.
These tools present the full set of GJXDM and NIEM types and elements for the user to select into a subset. Components can be added to the subset by performing searches for the specific types or elements that are required. Once selected, types and elements are included in the subset. Types and elements erroneously included in the subset can be removed as desired.
Once the desired subset is built, the user can generate the schema subset (which is in fact often a set of related schemas).
NIEM SSGT: https://staging.niem.gtri.gatech.edu/niemtools/ssgt/help.iepd;jsessioni…
SSGT Tutorial: http://vimeo.com/109940669
Consequently, the Department of Justice funded the Georgia Tech Research Institute (GTRI) to build a Subset Schema Generation Tool (SSGT). This tool presents the full set of GJXDM types and elements for the user to select into a subset. Types and elements can either be selected as they are encountered, or the user can search the GJXDM using the GJXDM Model Viewer (which is contained in the SSGT.) Once selected, types and elements are included in the subset. Types and elements erroneously included in the subset can be removed as desired.
Once the desired subset is built, the user can generate the schema subset (which is in fact often a set of related schemas, as is the full GJXDM).
Schema Subset Generation Tool: https://www.niem.gov/tools-catalog
Wantlist Specification for the Schema Subset Generation Tool
The Load/Save function in the Schema Subset Generation Tool (SSGT) uses a Wantlist to preserve the details of a schema subset. The Wantlist format is XML. The current Wantlist schema specification is located at:
Wantlist Schema
Please note that this schema will change as capabilities are added to SSGT (such as global constraints).