FAQs
The answer is that technically you can use a code list as an element or an attribute or both. However, rule 2 of the NIEM Conformance Rules (https://www.niem.gov/aboutniem/grant-funding/Pages/implementation-guide…) states the following: “If the appropriate component (type, element, attribute, etc.) required for an IEPD exists in the NIEM, use that component. Do not create a duplicate component of one that already exists.” By defining both an attribute and an element you are creating two components that mean the same thing. One would need a very strong business justification for using the code list as both an element and an attribute.
Members of a substitution group are not mutually exclusive. However, cardinality constraints on the head element apply. So if the head element has maxOccurs="1" (or maxOccurs is not specified, as the default value is 1), only one (but any one) of the substitution group members may be used. If the head element has maxOccurs="unbounded", then any number of substitution group members may be used in any combination and in any order. So, for example, if elements A, B, and C are members of the substitution group, you could have such combinations as AA, AAB, ABC, CBBA, and so on (although only certain combinations likely make business sense).
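The cardinality rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the schema fragment, the element names Head, A, B, and C, and the helper functions are hypothetical, and the check ignores minOccurs and everything else a real validating parser would enforce.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: Head is the substitution group head,
# and A, B, and C are its members.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Head" type="xs:string" abstract="true"/>
  <xs:element name="A" type="xs:string" substitutionGroup="Head"/>
  <xs:element name="B" type="xs:string" substitutionGroup="Head"/>
  <xs:element name="C" type="xs:string" substitutionGroup="Head"/>
</xs:schema>
"""


def substitution_members(schema_text, head):
    """Collect the names of global elements that substitute for `head`."""
    root = ET.fromstring(schema_text)
    return {
        el.get("name")
        for el in root.findall(f"{{{XS}}}element")
        if el.get("substitutionGroup") == head
    }


def is_allowed(sequence, members, max_occurs):
    """Check a sequence of element names against the head's maxOccurs.

    max_occurs is an int or the string "unbounded".
    """
    if any(name not in members for name in sequence):
        return False
    if max_occurs == "unbounded":
        return True
    return len(sequence) <= max_occurs


members = substitution_members(SCHEMA, "Head")
print(is_allowed(["C", "B", "B", "A"], members, "unbounded"))  # True
print(is_allowed(["A", "B"], members, 1))                      # False
```

With maxOccurs left at 1, any single member passes the check, but a second element of any kind does not.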
Consider the following scenario:
An IEPD has an existing extension element that holds a code for widget capabilities (a widget can have multiple capabilities), and the information exchange XML must include not only the widget capability codes but also a description of each code used. There are two options.
The name should be WidgetCapabilityText, and assuming you can have more than one widget capability in an instance, the example below is a common way to do it:
<my-ext:Widget>
<my-ext:WidgetCapabilityCode>WD</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityCode>SY</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityText>Washes the dishes</my-ext:WidgetCapabilityText>
<my-ext:WidgetCapabilityText>Sweeps the yard</my-ext:WidgetCapabilityText>
</my-ext:Widget>
Another way to get this done is to use the metadata element nc:DescriptionText to hold the literal and apply that metadata element to WidgetCapabilityCode.
<nc:Metadata s:id="WD">
<nc:DescriptionText>Washes the dishes</nc:DescriptionText>
</nc:Metadata>
<nc:Metadata s:id="SY">
<nc:DescriptionText>Sweeps the yard</nc:DescriptionText>
</nc:Metadata>
…
<my-ext:Widget>
<my-ext:WidgetCapabilityCode s:metadata="WD">WD</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityCode s:metadata="SY">SY</my-ext:WidgetCapabilityCode>
</my-ext:Widget>
(Note that the id values were arbitrarily chosen to match the code values.)
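A receiving application has to follow each s:metadata reference back to the matching nc:Metadata block. The sketch below shows one way to do that with the Python standard library. The namespace URIs, the my-ext:Exchange wrapper element, and the function name are assumptions made for illustration, and each s:metadata attribute is assumed to hold a single id.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions; substitute the ones your IEPD declares.
NC = "http://niem.gov/niem/niem-core/2.0"
S = "http://niem.gov/niem/structures/2.0"
EXT = "http://example.com/my-ext/1.0"  # hypothetical extension namespace

# my-ext:Exchange is a hypothetical wrapper so the fragment is well-formed.
INSTANCE = f"""
<my-ext:Exchange xmlns:nc="{NC}" xmlns:s="{S}" xmlns:my-ext="{EXT}">
  <nc:Metadata s:id="WD">
    <nc:DescriptionText>Washes the dishes</nc:DescriptionText>
  </nc:Metadata>
  <nc:Metadata s:id="SY">
    <nc:DescriptionText>Sweeps the yard</nc:DescriptionText>
  </nc:Metadata>
  <my-ext:Widget>
    <my-ext:WidgetCapabilityCode s:metadata="WD">WD</my-ext:WidgetCapabilityCode>
    <my-ext:WidgetCapabilityCode s:metadata="SY">SY</my-ext:WidgetCapabilityCode>
  </my-ext:Widget>
</my-ext:Exchange>
"""


def capability_descriptions(instance_text):
    """Return (code, description) pairs by resolving s:metadata references."""
    root = ET.fromstring(instance_text)
    # Index each metadata block by its s:id value.
    descriptions = {
        meta.get(f"{{{S}}}id"): meta.findtext(f"{{{NC}}}DescriptionText")
        for meta in root.iter(f"{{{NC}}}Metadata")
    }
    # Follow each code's s:metadata reference to its description.
    return [
        (code.text, descriptions.get(code.get(f"{{{S}}}metadata")))
        for code in root.iter(f"{{{EXT}}}WidgetCapabilityCode")
    ]


for code, text in capability_descriptions(INSTANCE):
    print(code, "-", text)
```

Unlike the first option, this pairing does not depend on the order in which the code and text elements happen to appear.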
The GJXDM and NIEM schemas (and subsets thereof) contain global element declarations. If there are multiple global element declarations in a schema, a validating parser is not capable of determining which of those global elements is intended to be the root of a valid instance. (Some parsers consider this an error, others a warning; the XML Schema specification is ambiguous on the proper parser behavior.)
For this reason, it is a best practice for every IEPD to include a document schema that identifies the single element that is the root of a valid instance. The document schema defines an IEPD-specific namespace that contains only a single element declaration. The element's type should be in either the GJXDM or NIEM namespace or an IEPD-specific extension namespace.
So, in summary answer to the question: Always use a document schema when it is important to you to identify the root element of a valid instance. Since this is almost always desired, it is a best practice always to define a document schema.
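As a rough illustration of why a single global element declaration removes the ambiguity, the Python sketch below reads a document schema, lists its global elements, and compares an instance's root against the only candidate. The schema, the namespaces, and the names WidgetReport and WidgetReportType are hypothetical. The helper is a simplified stand-in for what a validating parser does, not a replacement for schema validation.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical document schema: a single global element declaration
# in an IEPD-specific namespace, typed from an extension namespace.
DOCUMENT_SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:my-ext="http://example.com/my-ext/1.0"
           targetNamespace="http://example.com/my-doc/1.0"
           elementFormDefault="qualified">
  <xs:import namespace="http://example.com/my-ext/1.0"/>
  <xs:element name="WidgetReport" type="my-ext:WidgetReportType"/>
</xs:schema>
"""


def root_candidates(schema_text):
    """List the qualified names of the schema's global element declarations."""
    schema = ET.fromstring(schema_text)
    ns = schema.get("targetNamespace")
    # Only direct children of xs:schema are global declarations.
    return [
        f"{{{ns}}}{el.get('name')}"
        for el in schema.findall(f"{{{XS}}}element")
    ]


def has_expected_root(instance_text, schema_text):
    """True if the schema names exactly one root and the instance uses it."""
    candidates = root_candidates(schema_text)
    root_tag = ET.fromstring(instance_text).tag
    return len(candidates) == 1 and root_tag == candidates[0]


good = '<d:WidgetReport xmlns:d="http://example.com/my-doc/1.0"/>'
bad = '<d:Widget xmlns:d="http://example.com/my-doc/1.0"/>'
print(has_expected_root(good, DOCUMENT_SCHEMA))  # True
print(has_expected_root(bad, DOCUMENT_SCHEMA))   # False
```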
An extension schema defines an IEPD-specific namespace that contains types and elements that are particular to that IEPD. Types in an extension schema should extend types in GJXDM or NIEM; elements in an extension schema should be either of types in the extension schema, or types in GJXDM or NIEM.
It is typical for an IEPD to have at least a few extensions. Extension schemas play a valuable role in the concept of conformance. While it is important for IEPDs to leverage types and elements in GJXDM and NIEM if they fit the semantics of the exchange, it is equally important not to use a type or element from them if its definition does not fit the semantics. It is very important not to use a GJXDM or NIEM element solely on the basis of the name "sounding like" what is intended in the exchange semantics; the definition must be consistent with those semantics as well.
So, in summary answer to the question: Use an extension schema whenever the exchange involves semantics that do not exist in GJXDM or NIEM.
Alternatives to schema subsets include the use of restriction, modularizing the full JXDD schema, or just copying over only the elements and definitions one needs.
The XML Schema restriction mechanism allows users to take a type and restrict away the elements they don't need and to modify the occurrence restrictions of other elements. While this seems like an acceptable approach, it presents some problems that may not be obvious.
The first problem is that if users create a schema subset by restricting the full JXDD schema, the full JXDD schema would still be imported. No benefits would be gained in loading or validation time.
The second problem is that restrictions cannot be enforced. To create a restriction, a new local type would be created based on the original JXDD type. Elements could be dropped or their number of occurrences reduced. The local schema would still have to import the full JXDD schema to do this, but using fast validation tools would make this possible. The real problem is in usage. Elements defined to be of the original JXDD type would be able to use the local restricted type in the XML instance through type substitution; however, there is no way to enforce this type substitution to occur. It would still be entirely possible, and in fact easier, for the original unrestricted JXDD type to be used. Validation would not recognize that the local restricted type should be used instead of the original JXDD type. The only way to work around this would be to create a new local element of the new locally restricted type. Validation would then enforce that the local type be used, but the element would have no connection to the JXDD and would not be understood by others to whom the schema is sent. This loses much of the benefit gained by using the JXDD - understandability.
The use of restriction is not prohibited, but it offers much less in terms of performance benefits and validation support than schema subsets provide. Furthermore, in most cases restriction is not a sufficient alternative to schema subsets.
Another alternative would be to modularize the full JXDD schema into different components, as has been suggested. The problem with this is that there is no set of lines over which modularization would work and provide the benefits desired. If the full JXDD schema were divided into smaller components, the smaller components would still need to import each other because they are all interrelated. A person module would need to reference a location module, a contact information module, an organization module, a miscellaneous module for common types, and some subset of an activity module for the person subtypes. Little performance is gained, and complexity is increased.
A seemingly simpler alternative to building schema subsets would be for users to copy over only those element and type definitions that they need from the full JXDD schema into their own document schema. This approach has problems as well. If users copy over JXDD components into their document schema without putting them into the Justice namespace, then other users would not be able to recognize that those components come from the JXDD. The namespace is what identifies the common source of the components; without this, recognition is lost.
Instead, if users copy over JXDD components into their own document schema and put them under the Justice namespace, then this tie to the JXDD is in name only. Usually, the structural definitions of JXDD components that are used in a local schema are imported from a definition schema in the Justice namespace. The full JXDD schema is one such definition schema - it is an official definition of the JXDD elements and types. The JXDD schema subset is another. A local copy of JXDD components in a document schema is just that - a local copy. There is no official structural definition schema against which to validate and ensure that the components appear as they should. The local document only validates against itself. There is no guarantee that components are actually from the JXDD; at best all you have is a claim. In either case, local copies with or without Justice namespace references, it would not be possible to reference and identify appropriate components as valid Justice elements and types.
Many factors need to be taken into account before deciding to use the SSGT, including the end goal, the alternatives, and short-term and long-term concerns.
To produce a set of schema subsets, the tool needs the following properties.
The tool must allow a user to search and navigate through the full Justice dictionary. This is necessary because users will need to see what is available before they can choose which parts they want to use.
The tool must give users the ability to create schema subsets by adding constraints.
The tool should allow users to create extension and document schemas by making customizations. Notice that this is not required functionality - a base tool could be built without it but would not be capable of providing the complete set of schemas.
The tool must be able to generate the customized schema subsets from the user input. This requires knowledge of the dictionary, data model, and the rules for creating valid schema subsets.
Standard Commercial Registries
There is no purpose in recreating an existing product that could meet our needs. Therefore, it is important to look at what a commercial, off-the-shelf, ebXML-compliant registry could offer us. A commercial registry could catalogue the Justice dictionary and store metadata about it, either at a component level or a document level. A commercial registry could also give users some manner of searching and retrieving data through a user interface.
These are important and necessary functionalities, but they are often not adequate to support the construction of customized schemas. To start with, only one class of registry could be used: a registry with component-level granularity. Any other type of registry would be useless for our purposes. Document-level granularity would mean that the registry could only store and retrieve the dictionary as a full JXDD schema. This gives users no support in accessing and customizing individual components and defeats our purpose. Suppose we then choose a registry that has component-level granularity. It would be able to store the dictionary (a list of elements and types with definitions) piece by piece rather than lumped together in a single document. However, there is no way for any off-the-shelf registry to have knowledge of the Justice data model that the dictionary is based upon. This data model is very important - it has some relationships built into it that give the JXDD its power and flexibility.
Off the shelf, no registry would be able to utilize the JXDD to its full potential. Additionally, the registry would have no mechanism to build the schema subsets or any knowledge of how to do so. It is apparent from the volume of comments received that the need for a customized schema subset generation tool is immediate. Because there is no product right now that is capable of this, it must be built.
This tool should have the capabilities outlined in the requirements section above. The tool should provide a graphical user interface to allow users to search through the dictionary components, add constraints and customizations, and define customized schemas. The schema subset generation tool should take in user input and, from that input, generate a valid set of customized schema subsets, carefully formed to maintain its integrity and interoperability. The tool should then return the set of schemas to the user, who then becomes the owner of those files.
Future work: Although no off-the-shelf registry product is ready to meet our current needs, it might be possible to modify an existing registry so that it supports the full Justice data model and all of the requirements for building customized schema subsets. To start with, this would involve some research and comparison of different registry products and analysis of potential candidates to determine whether making such modifications is feasible. If so, awareness of the Justice data model and the capacity to build schema subsets could then be added. If it is not possible to make the necessary enhancements to a commercial registry, it becomes necessary to build a custom registry to fit the Justice data model.
After a registry is either modified or built, the back end of the schema subset generation tool will need to be changed to communicate with the registry. This allows code maintenance to be performed on the registry side and new versions of the JXDD to be handled automatically rather than forcing tool upgrades.
Will this tool be the only way to create schema subsets?
No. There are other ways this could be done. One step for the schema generation tool will be to translate the user input specifying how to build the customized schemas into an XML request file or wantlist. This will happen in the background, transparent to the user. The wantlist would be sent to the registry. The registry would process the file and then generate and return the customized schema subsets. The format of the request file should be publicly available, so that others can create their own front-ends and still use the registry to produce the actual schemas.
Another way to generate schema subsets would be to create and distribute a library that could perform the same functionality as the registry tool. A third way would be for users to go through the set of full schemas making restrictions and creating extension and document schemas by hand. Another might be through the use of Extensible Stylesheet Language (XSL). There are probably many different ways that this work could be done. The benefit of using a JXDD schema subset generation tool is that if a user specifies valid input, an appropriately and consistently formed set of customized schema subsets will be returned. Without a thorough understanding of the Justice data model, it could be very easy to unintentionally break conformance.
SSGT: http://niem.gtri.gatech.edu/niemtools/ssgt/SSGT-Search.iepd
Constraint schemas allow one to place extra conditions on elements in addition to those already provided by GJXDM or NIEM. Using a constraint schema avoids having to define a valid subset schema for every situation where data needs vary depending on the context. Constraint schemas add the constraints that a particular organization requires. In other words, a constraint schema is basically a second layer of validation that an organization can use to verify that its data exchanges conform to its own needs when those needs are more specific than what GJXDM or NIEM provide. Constraint schemas allow one to sidestep the natural side effect of global definitions, that is, having data represented one and only one way.
What one cannot do in a constraint schema is add new elements to a type, change the order of the elements as they are defined in the full GJXDM or NIEM schemas, or change an element's base type to a type that is not a valid subset of the original base type. For instance, a constraint schema could redefine an element originally declared as xsd:string to be xsd:decimal, but not the reverse: every decimal value is also a valid string, whereas not every string is a valid decimal. Any of these types of adjustments would require an extension schema.
When validating an instance document that was generated to meet a specification that only includes a constraint schema, change the schema location in the instance document so that it points to the full GJXDM schema instead of the constraint schema, and then validate again to complete the second part of the validation.
https://www.niem.gov/about-niem/news/niem-naming-and-design-rules-50-be…
Guidelines for XML Schemas: http://xml.coverpages.org/schemas.html
Since the full GJXDM and NIEM schemas include components that are optional and over-inclusive, users have the ability to retrieve only those components from the data dictionaries that they need. Many users will not want every element to be able to occur repeatedly. Furthermore, it is unlikely that a user will need to use the entire contents of the full schemas. This is the basic idea behind schema subsets—to provide smaller schemas that define only those components from the dictionary that the user wants to include.
The full GJXDM or NIEM schemas can be used, but it is not necessary to do so. Smaller schema subsets can be more manageable than the full schemas and will usually permit more rapid validation of document instances. The overriding rule for using GJXDM or NIEM schema subsets is as follows: Document instances that validate to the schema subset will validate to the full GJXDM or NIEM.
The Schema Subset Generation Tool (SSGT) produces compliant schema subsets. It is available at the following locations:
NIEM SSGT: http://niem.gtri.gatech.edu/niemtools/ssgt/SSGT-Search.iepd
The GJXDM and NIEM schemas represent reference models that are intentionally optional and over-inclusive. The purpose of the reference model is to focus structure and semantics through well-defined types and properties, not to dictate which components to use or how to fine-tune their content.
Constraint schemas do this in two different ways:
Encapsulating Cardinality
Constraint schemas can be used solely to constrain cardinality. This allows for cardinality validation via XSD. In this case, the constraint schema is simply a copy of the subset schema with "minOccurs" and "maxOccurs" attributes set. When validating, it takes the place of the subset schema.
There are utilities to help create constraint schemas from a subset schema. Constraint schemas are optional. Cardinality can also be enforced solely within the subset schema, at the cost of easy reuse.
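When a constraint schema only tightens cardinality, one useful sanity check is that each occurrence range in the constraint schema stays within the range the subset schema allows. The Python sketch below illustrates that check on hand-written occurrence tables. The element names and the (minOccurs, maxOccurs) values are illustrative assumptions, and the function is not one of the utilities mentioned above.

```python
import math


def parse_max(value):
    """Convert a maxOccurs value to a number; "unbounded" becomes infinity."""
    return math.inf if value == "unbounded" else int(value)


def narrows(subset, constraint):
    """True if every constraint range lies within the subset's range.

    Both arguments map an element name to a (minOccurs, maxOccurs) pair
    of strings, as the attributes would appear in a schema.
    """
    for name, (c_min, c_max) in constraint.items():
        if name not in subset:
            return False  # a constraint schema cannot add new elements
        s_min, s_max = subset[name]
        if int(c_min) < int(s_min) or parse_max(c_max) > parse_max(s_max):
            return False
    return True


# Illustrative occurrence tables (assumed values, not taken from a real IEPD).
subset_occurs = {
    "PersonName": ("0", "unbounded"),
    "PersonBirthDate": ("0", "unbounded"),
}
constraint_occurs = {
    "PersonName": ("1", "1"),
    "PersonBirthDate": ("0", "1"),
}

print(narrows(subset_occurs, constraint_occurs))  # True
print(narrows(constraint_occurs, subset_occurs))  # False
```

The second call fails because widening a range from exactly one occurrence to any number would admit instances that the original schema rejects.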
Enforcing Business Rules
Constraint schemas can also let you do non-GJXDM or non-NIEM modifications to subset schemas in order to enforce business rules. Practical exchanges of actual XML instances usually require much tighter constraints on elements and attributes than the schema representing the reference model allows. In a constraint schema, the user can restrict element occurrences and employ facets. An instance must pass both the conformance validation path and the constraint validation path. In other words, a 2-path, parallel validation is required.
Conformance validation ensures proper uses of the GJXDM and NIEM, and constraint validation (more restrictive) confirms that an instance adheres to rules specific to the user’s application. In many cases (though not a requirement), a constraint schema will be a copy of the GJXDM or NIEM schema subset (or full) with appropriate constraints applied.
The Department of Justice funded the Georgia Tech Research Institute (GTRI) to build a Schema Subset Generation Tool (SSGT) that would allow searching the GJXDM and creating customized subsets. A similar tool exists for NIEM.
These tools present the full set of GJXDM and NIEM types and elements for the user to select into a subset. Components can be added to the subset by performing searches for the specific types or elements that are required. Once selected, types and elements are included in the subset. Types and elements erroneously included in the subset can be removed as desired.
Once the desired subset is built, the user can generate the schema subset (which is in fact often a set of related schemas).
NIEM SSGT: https://staging.niem.gtri.gatech.edu/niemtools/ssgt/help.iepd;jsessioni…
SSGT Tutorial: http://vimeo.com/109940669
Consequently, the Department of Justice funded the Georgia Tech Research Institute (GTRI) to build a Subset Schema Generation Tool (SSGT). This tool presents the full set of GJXDM types and elements for the user to select into a subset. Types and elements can either be selected as they are encountered, or the user can search the GJXDM using the GJXDM Model Viewer (which is contained in the SSGT.) Once selected, types and elements are included in the subset. Types and elements erroneously included in the subset can be removed as desired.
Once the desired subset is built, the user can generate the schema subset (which is in fact often a set of related schemas, as is the full GJXDM).
Schema Subset Generation Tool: https://www.niem.gov/tools-catalog
Wantlist Specification for the Schema Subset Generation Tool
The Load/Save function in the Schema Subset Generation Tool (SSGT) uses a Wantlist to preserve the details of a schema subset. The Wantlist format is XML. The current Wantlist schema specification is located at:
Wantlist Schema
Please note that this schema will change as capabilities are added to SSGT (such as global constraints).
The process of converting the contents of an XML document to objects, such as instances of a class, is called XML data binding. A tool called a data binder does this by mapping XML schema components to classes that reside in a computer's memory. This allows applications to work with XML data through the object rather than directly from the XML file, which can be preferable for a number of reasons.
There are several utilities available that can perform the data binding process. The choice is primarily dependent on the development environment. Below is a list of utilities that are available for common programming platforms. Please note that this list is not all-inclusive.
Java
Castor
Eclipse Modeling Framework (EMF)
http://www.eclipse.org/modeling/emf/?project=emf
Java Architecture for XML Binding (JAXB)
http://java.sun.com/developer/technicalArticles/WebServices/jaxb/
XMLBeans
C++
Code Synthesis XSD
http://codesynthesis.com/projects/xsd/
xmlbeansxx
.NET
Data binding functionality is built into the .NET platform. The System.Xml.Serialization namespace contains several classes that can be used to bind XML data and access XML data.
Xsd.exe
The XML Schema Definition Tool is included as a part of the .NET framework. This tool can be used to generate runtime classes from XML schema files.
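To make the idea of data binding concrete, here is a minimal hand-written sketch in Python of the kind of class a data binder might generate. It does not use any of the tools listed above. The Widget class, its methods, and the namespace URI are hypothetical and reuse the widget example purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET

EXT = "http://example.com/my-ext/1.0"  # hypothetical extension namespace


@dataclass
class Widget:
    """In-memory object bound to a my-ext:Widget element."""

    capability_codes: List[str] = field(default_factory=list)

    @classmethod
    def from_xml(cls, text):
        """Unmarshal: build a Widget object from XML text."""
        root = ET.fromstring(text)
        codes = [el.text for el in root.findall(f"{{{EXT}}}WidgetCapabilityCode")]
        return cls(codes)

    def to_xml(self):
        """Marshal: serialize the Widget object back to XML text."""
        root = ET.Element(f"{{{EXT}}}Widget")
        for code in self.capability_codes:
            ET.SubElement(root, f"{{{EXT}}}WidgetCapabilityCode").text = code
        return ET.tostring(root, encoding="unicode")


XML_TEXT = f"""
<my-ext:Widget xmlns:my-ext="{EXT}">
  <my-ext:WidgetCapabilityCode>WD</my-ext:WidgetCapabilityCode>
  <my-ext:WidgetCapabilityCode>SY</my-ext:WidgetCapabilityCode>
</my-ext:Widget>
"""

widget = Widget.from_xml(XML_TEXT)
print(widget.capability_codes)  # ['WD', 'SY']
```

Application code then works with widget.capability_codes as an ordinary list, and to_xml produces XML again when the data has to leave the application. A real data binder generates such classes from the schema instead of relying on hand-written ones.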
The Global Justice Extensible Markup Language Data Model (GJXDM) is an XML standard designed to create a uniform method for law enforcement and judicial agencies to exchange criminal justice information in a timely manner.
GJXDM removes the burden from agencies to independently create exchange standards. Its extensibility allows for flexibility to deal with unique agency requirements and changes. GJXDM is an object-oriented data model for organizing the content of a data dictionary, the Global Justice XML Data Dictionary (GJXDD), in a database. From this database, an XML schema specification can be generated that consistently represents the semantics and structure of common data elements and types required for information exchange within the justice and public safety communities. There are three primary parts to the GJXDM: the Data Dictionary (identifying content and meaning), the Data Model (defining structure and organization), and the Component Reuse Repository (a database).
The work accomplished to date, based on participation by practitioners from the justice and public safety communities, has resulted in the creation of a data model that can be used to generate data schema which will facilitate information sharing among the various jurisdictions of those communities. This was done in a manner that reduced the cost of developing the technical solutions required, simplified the process and associated products, and enhanced the interoperability quotient of the end product. The approach combines successful practices in data modeling with recent technology standards for XML schema.
Proactive information sharing promotes public safety by allowing users to make accurate and rapid decisions. GJXDM lets users share critical information better and faster, and reduce costs and overcome delays in case management. GJXDM provides for electronic data sharing in a simple automatic process between previously incompatible systems.
The automation revolution led to piecemeal interoperability: individual software applications were programmed to serve only a single agency or a group of agencies. This approach did not take into account the need to share information between agencies with different applications. Yet the need to share information is now as important as ever.
Exchange of information can be accomplished between different applications by using a uniform standard, regardless of computer system or platform. Luckily, this is exactly what XML allows. The Global Justice XML Data Model is an XML markup language that serves this purpose for law enforcement and related agencies. GJXDM has a standard justice-specific vocabulary and agreed-upon objects that all agencies can use to describe information. Below is a scenario that illustrates how the use of GJXDM yields the best possible result for law enforcement.
A local police officer conducts a routine traffic stop for speeding. When the police officer approaches the car, he discovers that there is a young female passenger alongside the male driver. The police officer then enters the driver's information into his mobile data computer. Immediately the system queries all participating neighboring jurisdictions and agencies, in addition to his local district, for any information concerning the driver. Information is returned from a state Sex Offender Registry that the driver is a registered sex offender who is prohibited from being in the company of any child under the age of 16. Further, the officer also receives information from the Department of Corrections that there is a warrant for the driver's arrest because he is wanted for not registering with them. It turns out that the passenger is actually a 12-year-old girl, so the driver is arrested based on both pieces of information obtained using GJXDM. Without GJXDM, the driver might have just received a speeding ticket. Luckily, GJXDM allowed for the quick and easy transfer of information between the local police officer's computer and the systems of all the other appropriate agencies.
The scenario provided is only one of many examples of how the exchange of information is crucial in ensuring justice. XML-enhanced data sharing provides a common framework to improve data and information sharing. And as XML is increasingly being adopted as the IT standard globally, more tools will become available to work with XML. Also, XML is very adaptable, since it works on top of most existing servers (nothing needs to be replaced to work with XML).
US Dept. of Justice GJXDM Website: https://bja.ojp.gov/program/it/national-initiatives/gjxdm
US Dept. of Justice Information Sharing Initiative: https://bja.ojp.gov/program/it/national-initiatives