FAQs
Since the initiation of justice systems integration in the United States, practitioners have generally worked with vendors to develop unique and proprietary solutions to their individual information-sharing needs, either within one agency or within a specific jurisdiction. These technical solutions, while meeting immediate information-sharing objectives, created many independent systems with limited ability to share information with other systems throughout the nation.
In 2001, a formal effort was undertaken to reconcile several of the XML specifications developed by justice practitioners. The U.S. Department of Justice (DOJ) funded meetings of the following organizational representatives:
• The Joint Task Force on Rap Sheet Standardization, which developed an Interstate Criminal History Transmission Specification using XML.
• The Institute for Intergovernmental Research (IIR), which developed the Regional Information Sharing System.
• The LegalXML Court Filing Workgroup, which developed an Electronic Court Filing Standard using XML.
• The American Association of Motor Vehicle Administrators, which was working to develop Driver and Vehicle transactions in XML.
These meetings resulted in a reconciliation of these various XML data standards into a common Justice XML Data Dictionary (JXDD). The first JXDD (Version 1.0) depicted the elements in a Microsoft Access database. Subsequent versions (2.0 and 2.1) of JXDD represented the 1.0 elements as XML schema.
These early efforts were critical to the development of the current GJXDM, an effort undertaken as part of the DOJ’s Global Justice Information Sharing Initiative, supported by the XML Structure Task Force.
The development of the Global Justice XML Data Model (GJXDM), which incorporates a comprehensive Global Justice XML Data Dictionary, represents a significant change in the way practitioners will develop their information-sharing systems. GJXDM provides a common language with which justice entities can describe, structure, and share information on criminal justice matters and offenders within a locality, the state, among the states, or with federal or tribal entities.
In early 2002, DOJ formed a group called the XML Structure Task Force (XSTF) under Global's Infrastructure/Standards Working Group to develop an object-oriented XML data model for justice information sharing.
XSTF comprises justice practitioners and industry representatives from various justice communities of interest. Its members include representatives from local, state, and federal law enforcement, courts, corrections, probation and parole, and transportation agencies; the Federal Bureau of Investigation; SEARCH, The National Consortium for Justice Information and Statistics; the U.S. Chief Information Officers Council; and the Integrated Justice Information Systems (IJIS) Institute (a consortium of private-sector companies involved in justice and public safety).
XSTF's contribution has been supported by development staff, notably research scientists from Georgia Tech Research Institute (GTRI) and the National Telecommunications and Information Administration (NTIA). GTRI developed the technical concepts, using XML best practices and standards in the design and implementation of GJXDM.
This XSTF effort has provided a framework within which a productive relationship has developed among practitioners, industry, and development staff.
The primary goal of XSTF in designing the Data Model is to develop a common set of reusable, extensible XML data components that can be combined in justice documents, transactions, and messages that are consistently structured to support interoperability among justice and public safety systems nationwide.
In GJXDM and NIEM, the composition of data entities and the relationships between them have been defined precisely and flexibly using two kinds of components: types and properties. An understanding of the concept of types and properties is the baseline for developing an understanding of the models. Types and properties are maintained as distinct entities.
A type is a structure that carries values associated with a real-world entity. Types represent real-world entities, such as persons or vehicles, and can also be defined as independently instantiable collections of properties. When an object of a type is created, it is said to be instantiated.
A property associates specific characteristics with an instance of a type. Every property in the data dictionary is given a definition which outlines how the property is to be used, and what it means.
There are three components to a property:
1. The property name is a unique label applied to the property (property names are unique within the data dictionary).
2. The subject type is the type to which the property applies. For example, with the property “Name”, if a person has a name, then the subject type of the property “Name” would be “PersonType”.
3. The object type is the type of the value of the property. For example, if a name is a string, then the object type of the property “Name” would be “StringType”.
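The three components of a property can be sketched in code. The following is a minimal illustration in Python, not the GJXDM implementation: the `Property` and `DataDictionary` classes and the sample definition text are hypothetical, and only the “Name”, “PersonType”, and “StringType” labels come from the example above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Property:
    """Hypothetical model of a data dictionary property."""
    name: str          # unique label within the data dictionary
    subject_type: str  # the type to which the property applies
    object_type: str   # the type of the property's value
    definition: str    # what the property means and how it is used


class DataDictionary:
    """Toy registry that enforces unique property names."""

    def __init__(self) -> None:
        self._properties: Dict[str, Property] = {}

    def register(self, prop: Property) -> None:
        # Property names must be unique within the dictionary.
        if prop.name in self._properties:
            raise ValueError(f"duplicate property name: {prop.name}")
        self._properties[prop.name] = prop

    def properties_of(self, subject_type: str) -> List[Property]:
        # All properties whose subject type matches the given type.
        return [p for p in self._properties.values()
                if p.subject_type == subject_type]


dictionary = DataDictionary()
dictionary.register(
    Property("Name", "PersonType", "StringType", "A name of a person.")
)
```

Keeping types and properties as separate records mirrors the statement that they are maintained as distinct entities: a type is referred to only by its label, and each property records which type it applies to and which type its value has.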
The Global Infrastructure/Standards Working Group (GISWG) recommended that OJP form the XML Structure Task Force (XSTF) to identify data requirements, explore XML concepts, and apply XML best practices to the design and implementation of the Global Justice XML Data Model (GJXDM). This recommendation was accepted and today the XSTF is composed of government and industry domain experts (law enforcement, courts, and corrections), technical managers, and engineers.
XSTF works closely with Georgia Tech Research Institute (GTRI) on each GJXDM version. Its vision is to significantly advance justice information sharing by providing a common language and vocabulary that reduces cost and technical barriers. XSTF has developed a consistent, extensible, and maintainable XML schema reference specification for data elements and types that represent the data requirements of the general justice and public safety communities.
XSTF conducts evaluations, solicits feedback from technical experts and practitioners, and authorizes changes based on this feedback. XSTF is heavily involved in the GJXDM release process and approves all fixes, additions, deletions, and modifications to each implementation, all of which are applied to future releases with a cumulative change log published along with each release. When a reasonable number of updates are approved by the XSTF, a new version is released.
XSTF is also responsible for GJXDM guidance, review, and issue resolution.
800 Megahertz (MHz). 800 MHz refers to public safety radio systems using channels located in or near the 800 MHz band. Approximately 300 channels located in the 800 MHz spectrum band have been assigned for use by state and local public safety entities. The disadvantage is that this higher frequency has less range and so a greater infrastructure is needed to cover the same range as lower frequencies.
Access. To interact with a system entity in order to manipulate, use, gain knowledge of, and/or obtain a representation of some or all of a system entity’s resources.
Access control. Protection of resources against unauthorized access; a process by which use of resources is regulated according to a security policy and is permitted by only authorized system entities according to that policy.
Access control information. Any information used for access control purposes, including contextual information. Contextual information might include source Internet Protocol (IP) address, encryption strength, the type of operation being requested, time of day, etc. Portions of access control information may be specific to the request itself, some may be associated with the connection via which the request is transmitted, and others (for example, time of day) may be “environmental.”
Access rights. A description of the type of authorized interactions a person or system can have with a resource. Examples include read, write, execute, add, modify, and delete.
AFIS (Automated Fingerprint Identification System). AFIS is a database of digitized offender fingerprint files. A user can enter a fingerprint and a computer will generate a list of possible matches within minutes. The matches are then examined and verified by a fingerprint expert.
Architecture. Architecture refers to the design of a system. It may refer to either hardware or software, or a combination of both. The software architecture of a program or computing system is the structure or structures of the system. This structure includes software components, the externally visible properties of those components, the relationships among them, and the constraints on their use.
Artifact. A piece of digital information. An artifact may be any size and may be composed of other artifacts. Examples of artifacts: a message, a URI, an XML document, a Portable Network Graphics (PNG) image.
Asynchronous. An interaction is said to be asynchronous when the associated messages are chronologically and procedurally decoupled. For example, in a request-response interaction, the client agent can process the response at some indeterminate point in the future when its existence is discovered. Mechanisms to do this include polling, notification by receipt of another message, etc.
Attribute. A characteristic of an object or entity. An object’s attributes are said to describe the object. Objects’ attributes are often specified in terms of their physical traits, such as size, shape, weight, and color, etc., for real-world objects. Objects in cyberspace might have attributes describing size, type of encoding, network address, etc.
Authentication. Authentication is the process of verifying that a potential partner in a conversation (or data exchange) is capable of representing a person or organization.
Authorization. The process of determining, by evaluating applicable access control information, whether a subject is allowed to have the specified types of access to a particular resource. Usually, authorization is in the context of authentication. Once a subject is authenticated, it may be authorized to perform different types of access.
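As an illustration of authorization, the sketch below evaluates access control information (the subject, the resource, the requested access type, and the source IP address as contextual information) against a small policy table. It assumes the subject has already been authenticated. The policy contents, subject names, and the trusted address prefix are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class AccessRequest:
    subject: str      # identifier of an already-authenticated subject
    resource: str     # the resource being requested
    access_type: str  # e.g. "read", "write", "delete"
    source_ip: str    # contextual access control information


# Hypothetical policy: (subject, resource) -> permitted access types.
POLICY: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("dispatcher-7", "incident-reports"): frozenset({"read", "write"}),
    ("clerk-2", "incident-reports"): frozenset({"read"}),
}

# Hypothetical rule: anything other than "read" must come from this range.
TRUSTED_PREFIX = "10."


def is_authorized(request: AccessRequest) -> bool:
    """Decide whether the request is permitted under POLICY."""
    allowed = POLICY.get((request.subject, request.resource), frozenset())
    if request.access_type not in allowed:
        return False
    if request.access_type != "read" and not request.source_ip.startswith(TRUSTED_PREFIX):
        return False
    return True
```

For example, `is_authorized(AccessRequest("clerk-2", "incident-reports", "write", "10.0.0.4"))` returns `False`, because the policy grants that subject read access only.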
AVL (Automatic Vehicle Locator). AVL uses Global Positioning System technology to locate the position of patrol cars on a digital map. This information assists the dispatcher in knowing which calls should be assigned to which officers.
Binding. An association between an interface, a transmission protocol, and a data format. A binding specifies the protocol and data format to be used in transmitting messages defined by the associated interface. See also SOAP binding.
BIOS (Basic Input/Output System). BIOS controls the startup of the machines or computers and other functions, such as the keyboard, display, and disk drive. BIOS is stored on read-only memory and is not erased when the computer is turned off. BIOS on newer machines is stored on flash read-only memory, allowing it to be erased and rewritten to update BIOS.
CAD (Computer Aided Dispatch). A computer system that assists 911 operators and dispatch personnel in handling and prioritizing calls. Enhanced 911 will send the location of the call to the CAD system, which will automatically display the address of the 911 caller on a screen in front of the operator. Complaint information is then entered into the computer and is easily retrievable. The system may be linked to mobile data terminals (MDTs) in patrol cars, thereby allowing dispatchers and officers to communicate without using voice. The system may also be interfaced with NCIC, AVL, or a number of other programs.
Cardinality. The number of instances of an entity in relation to another entity, e.g., one-to-one, one-to-many, many-to-many.
CDPD (Cellular Digital Packet Data). A data transmission technology that uses unused cellular phone channels to transmit data in packets.
Class. A description of a set of objects that share the same attributes, operations, methods, relationships, and semantics.
Client/Server architecture. A network model in which a computer or process (the server) provides services to the workstations (clients) connected to it. This architecture allows a client to share resources such as files, printers, and processing power with other clients.
Community of interest. A group of professionals informally bound to one another through exposure to a common class of problems, and common pursuit of solutions, and thereby embodying a store of knowledge. The justice and public safety domain is considered a community of interest.
Compliant. Hardware and software capable of satisfying a particular requirement, such as manipulation of four-digit dates, is deemed “compliant.”
Component. A component is a software object, meant to interact with other components, encapsulating certain functionality or a set of functionalities. A component has a clearly defined interface and conforms to a prescribed behavior common to all components within an architecture.
Computer crime mapping. Computer crime mapping allows a department to display calls for service on a computerized pin map that aids in crime analysis efforts.
Conceptual data model (CDM). A data model that defines the real-world entities, and the relationships between these entities, in a business context. A CDM is typically constructed as an Entity Relationship Diagram (ERD), e.g., Unified Modeling Language (UML) class diagram.
Confidentiality. Assuring information will be kept secret, with access limited to appropriate persons.
Connection. A transport layer virtual circuit established between two programs for the purpose of communication.
Conversion. Conversion is the translation of valid values into another format on a permanent basis; for example, translating two-digit years to four-digit year values.
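The two-digit year example can be sketched as follows. The fixed-window rule used here (values below a pivot are placed in the 2000s, all others in the 1900s) is an assumption made for illustration; it is one common approach, and the function name and default pivot of 50 are hypothetical.

```python
def expand_two_digit_year(two_digit_year: int, pivot: int = 50) -> int:
    """Convert a two-digit year to a four-digit year using a fixed window.

    Assumption: values below `pivot` belong to the 2000s and all other
    values belong to the 1900s.
    """
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a value from 0 to 99")
    century = 2000 if two_digit_year < pivot else 1900
    return century + two_digit_year
```

With the default pivot, `expand_two_digit_year(99)` returns 1999 and `expand_two_digit_year(4)` returns 2004. Because conversion is permanent, the converted four-digit value would replace the stored two-digit value rather than being computed on each read.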
Core data type. Basic business data items that describe common concepts used in general business activities.
Data. Facts represented in a readable language (such as numbers, characters, images, or other methods of recording) on a durable medium. Data on its own carries no meaning. Empirical data are facts originating in or based on observations or experiences. A database is a store of data concerning a particular domain. Data in a database may be less structured or have weaker semantics (built-in meaning) than knowledge in a knowledge base. Compare data with Information.
Data architecture. A component of the design architecture, the data architecture consists of, among other things, data entities, which have attributes and relationships with other data entities. These entities are related to the business functions.
Data class. A set of data objects that share a common structure and a common behavior. The terms “class” and “type” are usually (but not always) interchangeable; a class is a slightly different concept than a type, in that it emphasizes the classifications of structure and behavior.
Data dictionary. A file that defines the basic organization of a database. It will contain a list of all files in the database, the number of records in each file, and the names and types of each field.
Data element. A basic unit of data having a meaning and distinct units and values. A uniquely named and defined component of data definition; a data “cell” into which data items (actual values) can be placed; the lowest level of physical representation of data.
Data element [Federal Enterprise Architecture (FEA) Data Reference Model]. Physical description of the data used within an Information Exchange Package. A representation of a data object, a data property, and a data representation.
Data mart. A collection of data that is organized to support a specific application. The data is sometimes optimized for this application.
Data model. A graphical and/or lexical representation of data, specifying their properties, structure, and inter-relationships.
Data object. A basic definition of the data element. Anything that exists in storage and on which operations can be performed, such as files, programs, or arrays. A collection of data elements that are aggregated for or by a specific application.
Data property. Description of the data element in context of the data object.
Data Reference Model (DRM). One of the five models in the FEA Reference Model framework, to aid in describing the types of interaction and exchanges that occur between the federal government and its various customers, constituencies, and business partners.
Data representation. Describes how data is described within the property and object layers.
Data standards. Data standards are agreed-upon terms for defining and sharing data.
Data type. A specification of the permissible content for a class of objects, where the content can be composed of one or more literal values, e.g., a positive integer, or any complex data structure, e.g., a hierarchy of child elements within an XML core component.
Data warehouse. An implementation of an informational database used to store sharable data sourced from an operational database-of-record. It is typically a subject database that allows users to tap into a company’s vast store of operational data to track and respond to business trends and facilitate forecasting and planning efforts.
Database. A data structure that stores metadata, i.e., data about data. In general, it is an organized collection of information.
Digital signature. A value computed with a cryptographic algorithm and appended to a data object in such a way that any recipient of the data can use the signature to verify the data’s origin and integrity.
Discovery. The act of locating a machine-processable description of a web service-related resource that may have been previously unknown and that meets certain functional criteria. It involves matching a set of functional and other criteria with a set of resource descriptions. The goal is to find an appropriate Web service-related resource.
Discovery service. A service that enables agents to retrieve web services-related resource description.
Document. Any data that can be represented in a digital form.
Document Type Definition (DTD). World Wide Web Consortium (W3C) Recommendation. A document-centric XML schema language.
Dublin Core Metadata Initiative (DCMI). Dublin Core is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. DCMI’s activities include consensus-driven working groups, global conferences and workshops, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices.
Electronic Data Interchange (EDI). The automated exchange of any predefined and structured data for business among information systems of two or more organizations.
Element [XML]. The fundamental building block of an XML document. XML elements can contain other elements and/or text data. XML elements are composed of a start tag, content, and end tag.
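A short Python sketch using the standard library's `xml.etree.ElementTree` module shows elements that contain other elements and elements that contain text. The element names (`Person`, `Name`, `Age`) are hypothetical and are not taken from GJXDM.

```python
import xml.etree.ElementTree as ET

# Each element has a start tag, content, and an end tag.
# <Person> contains two child elements; <Name> and <Age> contain text.
document = "<Person><Name>Jane Doe</Name><Age>34</Age></Person>"

root = ET.fromstring(document)

# Iterating over an element yields its child elements in document order.
child_tags = [child.tag for child in root]

# find() returns the first matching child; .text is its text content.
name_text = root.find("Name").text
```

Here `child_tags` is `["Name", "Age"]` and `name_text` is `"Jane Doe"`.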
Encryption. A process that translates plain text into a code. The reader of an encrypted file must have a key to decrypt the file. This involves cryptographic transformation of data (called “plaintext”) into a form (called “ciphertext”) that conceals the data’s original meaning to prevent it from being known or used. If the transformation is reversible, the corresponding reversal process is called “decryption,” which is a transformation that restores encrypted data to its original state.
End point. An association between a binding and a network address, specified by a URI, that may be used to communicate with an instance of a service. An end point indicates a specific location for accessing a service using a specific protocol and data format.
Entity [XML]. An information-sharing unit. All agencies are entities; so are courts and legislative bodies. Private organizations that share governmental information are also entities, as are private persons.
Firewall. A system designed to prevent unauthorized access to or from a private network. Firewalls are often used to prevent Internet users from accessing private networks connected to the Internet.
Functional specifications. Formal descriptions of a software system used as a blueprint for implementation. Specifications should state the purposes of the program, provide implementation details, and describe the specific functions of the software from the user’s perspective.
Gap analysis. An assessment of the difference between projected outcomes and desired outcomes.
Gateway. An agent that terminates a message on an inbound interface with the intent of presenting it through an outbound interface as a new message. Unlike a proxy, a gateway receives messages as if it were the final receiver for the message. Due to possible mismatches between the inbound and outbound interfaces, a message may be modified and may have some or all of its meaning lost during the conversion process. For example, an HTTP PUT has no equivalent in SMTP.
GPS (Global Positioning System). A satellite navigation system operated by the U.S. Department of Defense. It provides coded satellite signals that can be processed by a GPS receiver, enabling the receiver to compute position, velocity, and time.
GUI (Graphical User Interface). GUI (often pronounced “gooey”) is a program interface that uses a computer’s graphic systems to make a program more user-friendly. GUI may include standard formats for representing text and graphics, making it easier to share data between programs running on the same GUI.
Hardware. Objects used to store and run software, such as a computer, monitor, keyboard, disk, and printer.
HTML (Hypertext Markup Language). A language that allows one to tag a document, primarily with markup used for presentation, for example, font size, typeface, headings, paragraphs, tables, etc.
IAFIS (Integrated Automated Fingerprint Identification System). A new (July 1999) national online fingerprint and criminal history database run by the FBI. Justice agencies that submit urgent electronic requests for identification will receive a response within two hours.
Identifier. An identifier is an unambiguous name for a resource.
III (Interstate Identification Index). Designed and run by the FBI, III is part of IAFIS and contains criminal history records for almost 30 million offenders and can be queried using a name, birth date, and other information.
Information. Contextual meaning associated with, or derived from, data.
Information Exchange Package (IEP). An IEP represents a set of data that is transmitted for a specific business purpose. It is the actual XML instance that delivers the payload or information. (The word “package” as used here refers to a package of the actual data, not a package of artifacts documenting the structure and content of the data.) An IEP can be prefixed with “GJXDM” to indicate or highlight that the IEP is conformant to the Global Justice XML Data Model, as in “GJXDM Information Exchange Package.” The fact that an IEP is GJXDM-conformant may be readily apparent from the context, so it is not absolutely necessary to use the word “GJXDM” even if the IEP is GJXDM-conformant. (See also Reference.)
Information Exchange Package documentation. A collection of artifacts that describe the structure and content of an IEP. It does not specify other interface layers (such as Web services). It can optionally be prefixed with “GJXDM” to indicate or highlight that a resulting IEP is GJXDM-conformant. This term replaces “Exchange Document.” (See also Reference.)
Instance [XML]. Representation of the values of all the XML items.
Integrity. Assuring information will not be accidentally or maliciously altered or destroyed.
Interface. A program or device that connects programs and/or devices.
Internet. A decentralized global network connecting millions of computers.
Interoperable [Data]. Interoperable components are functionally equivalent or interchangeable within the system or process in which they are used.
Intranet. A secure private network that uses TCP/IP protocols.
LAN (Local Area Network). A computer network that connects workstations and personal computers and allows them to access data and devices anywhere on the LAN. A LAN is usually contained within one building.
Laptop. A computer that has capabilities beyond that of the mobile data computer. It may contain report writing and accident reconstruction programs.
LAWN (Local Area Wireless Network). A LAN that uses high-frequency radio waves rather than wires to communicate between nodes.
Legacy system. Older software and hardware systems still in use and generally proprietary.
Lexicon. Provides a glossary and cross-reference for words that may have multiple meanings. The purpose is to create common definitions to allow for clearer understanding.
Live scan. A machine that replaces ink-and-roll fingerprints. Fingers are rolled across a platen, scanned into a computer, and converted to a digital form of storage. Fingerprint cards are then printed out on a laser printer. The machine will immediately reject low-quality prints.
Logical data model. A model of the logical representation of objects about which the enterprise records information, in either automated or nonautomated form. It would be represented as a fully attributed, keyed, normalized entity relationship model reflecting the intent of the semantic model.
Loose coupling. Coupling is the dependency between interacting systems. This dependency can be decomposed into real dependency and artificial dependency:
• Real dependency is the set of features or services that a system consumes from other systems. The real dependency always exists and cannot be reduced.
• Artificial dependency is the set of factors that a system has to comply with in order to consume the features or services provided by other systems. Typical artificial dependency factors are language dependency, platform dependency, application programming interface (API) dependency, etc. Artificial dependency always exists, but it or its cost can be reduced.
Loose coupling describes the configuration in which artificial dependency has been reduced to the minimum.
MDC (Mobile Data Computer). A microcomputer used by public safety agencies to access databases for information on persons and property. The MDC uses wireless communication and allows an officer to exchange information with the dispatcher and other officers without using voice channels.
Message. The basic unit of communication between a requester and a provider. In the context of a web service, the message contains the data to be communicated to or from a web service as a single logical transmission. See also SOAP message.
Message correlation. The association of a message with a context. Message correlation ensures that the requester can match the reply with the request, especially when multiple replies may be possible.
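One common way to achieve message correlation is to attach a unique identifier to each request and have every reply echo it. The sketch below assumes that approach; the `Requester` class and the `correlation_id` field name are hypothetical and do not come from any particular messaging standard.

```python
import uuid
from typing import Dict, Optional


class Requester:
    """Tracks outstanding requests so that replies can be matched to them."""

    def __init__(self) -> None:
        # correlation identifier -> original request body
        self._pending: Dict[str, str] = {}

    def send(self, body: str) -> dict:
        """Build a request message carrying a fresh correlation identifier."""
        correlation_id = str(uuid.uuid4())
        self._pending[correlation_id] = body
        return {"correlation_id": correlation_id, "body": body}

    def match(self, reply: dict) -> Optional[str]:
        """Return the request body a reply belongs to, or None if unknown.

        The entry is kept after a match so that several replies to the
        same request can all be correlated.
        """
        return self._pending.get(reply.get("correlation_id", ""))
```

A provider would copy the `correlation_id` from the request into each reply, which lets the requester pair replies with requests even when they arrive out of order.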
Message Exchange Pattern (MEP). A template, devoid of application semantics, that describes a generic pattern for the exchange of messages between exchange partners. It describes the relationships (e.g., temporal, causal, sequential, etc.) of multiple messages exchanged in conformance with the pattern, as well as the normal and abnormal termination of any message exchange conforming to the pattern.
Message receiver. An exchange partner that receives a message.
Message reliability. The degree of certainty that a message will be delivered and that sender and receiver will both have the same understanding of the delivery status.
Message sender. The exchange partner that transmits a message.
Message transport. A mechanism that may be used by exchange partners to deliver messages.
Metadata. Represents information about the data and could include value constraints, naming rule, etc.
Metadata registry. An information system for registering metadata.
Namespace. Namespaces are the solution to naming conflicts in XML. Using XML namespaces can help alleviate issues that arise where XML elements and attributes use identical names. XML namespaces help to identify and resolve conflicts between elements that have the same name but mean different things. A namespace is a domain that contains a set of XML element names.
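The following Python sketch shows two elements that share the local name `ID` but belong to different namespaces. The namespace URIs, prefixes, and element names are hypothetical examples and are not GJXDM namespaces.

```python
import xml.etree.ElementTree as ET

# Two elements are both named "ID", but each is bound to its own namespace.
document = """
<Report xmlns:p="http://example.org/person"
        xmlns:v="http://example.org/vehicle">
  <p:ID>P-1001</p:ID>
  <v:ID>V-2002</v:ID>
</Report>
""".strip()

root = ET.fromstring(document)

# Prefix-to-URI mapping used only for lookups in this script.
namespaces = {
    "p": "http://example.org/person",
    "v": "http://example.org/vehicle",
}

person_id = root.find("p:ID", namespaces).text
vehicle_id = root.find("v:ID", namespaces).text
```

Internally, `ElementTree` stores each tag as `{namespace-uri}localname`, so the two `ID` elements remain distinct even though their local names are identical.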
NCIC or NCIC 2000 (National Crime Information Center). NCIC is a computer system maintained by the FBI that can be queried by local agencies via state computer systems known as “control terminal agencies.” NCIC contains 17 files with over 10 million records, plus 24 million criminal history records contained within the Interstate Identification Index (one of the 17 files). Files include the III, the Missing Persons File, the Unidentified Persons File, the U.S. Secret Service Protective File, and the Violent Gang/Terrorist File.
Network. A network is created when two or more computers are joined by some type of transmission media allowing them to communicate directly, or to share storage devices and peripherals. Transmission media can include cable lines, telephone lines, or satellite systems.
NIBRS (National Incident-Based Reporting System). An incident-based crime reporting system, run by the FBI, through which data is collected on each single crime occurrence. NIBRS data is designed to be generated as a byproduct of local, state, and federal automated records systems. NIBRS collects data on each single incident and arrest within 22 offense categories made up of 46 specific crimes called Group A offenses. Specific facts are collected for each of the offenses coming to the attention of public safety agencies. In addition to Group A offenses, there are 11 Group B offense categories for which only arrest data is reported. NIBRS is expected to eventually replace UCR.
NLETS. NLETS, the International Justice and Public Safety Information Sharing Network (formerly known as the National Law Enforcement Telecommunications System), is a high-speed communications network and message switch that connects almost every public safety agency in the country. It allows local agencies to make inquiries into state databases to access criminal history records, vehicle registration records, driver’s license files, etc. NLETS also interfaces with NCIC and other national files and allows states to exchange information with each other.
Node. A node can be a computer or some other device such as a printer. Every node has a unique network address.
Nonrepudiation. Method by which the sender of data is provided with proof of delivery and the recipient is assured of the sender’s identity, so that neither can later deny having processed the data.
OASIS (Organization for the Advancement of Structured Information Standards). A not-for-profit consortium that advances electronic business by promoting open, collaborative development of interoperability specifications.
Object. Anything perceivable or conceivable; a real-world entity.
Object-oriented programming (OOP). OOP combines data structures and functions (computer directions) to create “objects,” making it easier to maintain and modify software.
OMG (Object Management Group). The industry group dedicated to promoting object-oriented technology and its standardization.
Ontology. An explicit formal specification of how to represent the objects, concepts, and other entities that are assumed to exist in some area of interest and the relationships that hold among them. In computer science, an ontology is the attempt to formulate an exhaustive and rigorous conceptual schema within a given domain, a typically hierarchical data structure containing all the relevant entities and their relationships and rules (theorems, regulations) within that domain.
Open architecture. Open architecture systems are designed to allow system components to be easily connected to devices and programs made by other manufacturers.
Operating system. The basic program used by a computer to run other programs. An operating system recognizes input from the keyboard, sends output to the display screen, keeps track of files and directories on the disk, and controls peripheral devices such as disk drives and printers. It provides a platform for other software applications.
OWL (Web Ontology Language). OWL is intended to be used when the information contained in documents needs to be processed by applications, as opposed to situations where the content only needs to be presented to humans. OWL can be used to explicitly represent the meaning of terms in vocabularies and the relationships between those terms. This representation of terms and their interrelationships is called an ontology. OWL has more facilities for expressing meaning and semantics than XML, RDF, and RDF-S, and thus OWL goes beyond these languages in its ability to represent machine-interpretable content on the Web.
Permission. A policy that prescribes the allowed actions of an agent and/or resource.
Person or organization. A person or organization may be the owner of agents that provide or request Web services.
Physical data model. A physical representation of the objects of the enterprise. The representation style of this model would depend on the technology chosen for implementation. If relational technology is chosen, this would be a model of the table structure required to support the logical data model in a relational-style model. In an object-oriented notation, this would be a class-hierarchy/association-style model.
Platform. The underlying hardware or software for a system. The term is often used as a synonym for operating system.
Policy. A constraint on the behavior of agents, persons, or organizations.
Primitive types. Primitive types, as distinct from composite types, are datatypes provided by a programming language as basic building blocks. Typical primitive types include:
• Character.
• Integer (with a variety of precisions).
• Floating-point number with binary representation usually conforming to the Institute of Electrical and Electronics Engineers (IEEE) standards for floating point representation.
• Fixed-point with a variety of precisions and a programmer-selected scale.
• Boolean, having the values “true” and “false.”
• String, a sequence of characters.
• Reference (also called a “pointer” or “handle”), a small value referring to another object, possibly a much larger one.
Privacy policy. A set of rules and practices that specify or regulate how a person or organization collects, processes (uses), and discloses another party’s personal data as a result of an interaction.
Property. A characteristic common to all members of an object class.
Proprietary. The term “proprietary” generally refers to a system whose manufacturer will not divulge specifications that would allow other companies to duplicate the product. It is also known as a closed architecture.
Protocol. A set of formal rules describing how to transmit data, especially across a network. Low-level protocols define the electrical and physical standards to be observed, bit- and byte-ordering, and the transmission and error detection and correction of the bit stream. High-level protocols deal with the data formatting, including the syntax of messages, the terminal-to-computer dialogue, character sets, sequencing of messages, etc.
Proxy. An agent that relays a message between a requester agent and a provider agent, appearing to the Web service to be the requester.
Quality of service. An obligation accepted and advertised by a provider entity to service consumers.
RMS (Records Management System). An RMS stores computerized records of crime incident reports and other data. It may automatically compile information for UCR or NIBRS reporting, and it can perform more functions when integrated with other systems such as CAD and GPS.
Reference. Information Exchange Package Documentation may have the word “Reference” in its title if it has been mandated, approved, endorsed, recommended, or acknowledged by a cognizant organization, e.g., “GJXDM Information Exchange Package Documentation for a Reference Incident Report.” Reference IEP Documentation often may be used as the basis for IEP Documentation meeting the specific business needs of an information-sharing enterprise. This term replaces “Reference Exchange Document” and “Reference Document.”
Reference architecture. The generalized architecture of several end systems that share one or more common domains. The reference architecture defines the infrastructure common to the end systems and the interfaces of components that will be included in the end systems. The reference architecture is then instantiated to create a software architecture of a specific system. The definition of the reference architecture facilitates deriving and extending new software architectures for classes of systems. A reference architecture, therefore, plays a dual role with regard to specific target software architectures. First, it generalizes and extracts common functions and configurations. Second, it provides a base for instantiating target systems that use that common base more reliably and cost-effectively.
Regression test. A regression test is performed before production to identify and prevent errors and verify that unchanged software will continue to function as designed.
Relational Database Management System. A type of database management system that stores data in related tables. New types of data can more easily be added, and the user can view the data in multiple ways.
Repository. An information system used to store and access architectural information, relationships among the information elements, and work products.
Resource Description Framework (RDF). A Semantic web standard that provides a framework for asset management, enterprise integration, and the sharing and reuse of data on the Web.
Scalable. A term that describes how well a system can be adapted and expanded to meet increased demands.
Schema. Specification of the characteristics and relationships of a class of objects.
Schema [XML]. A specification to define the structure of XML documents and to specify datatypes for attribute values and element content. In addition to the DTD, there are several XML schema languages, including XML Schema (W3C), Schematron, and RELAX NG.
Scope creep. The slow and continuous expansion of the scope of a project, such as data type or routine, resulting in a broad, unfocused, and unmanageable scope and usually leading to cost overruns, missed deadlines, and loss of original goals.
Security architecture. A plan and set of principles for an administrative domain and its security domains that describe the security services that a system is required to provide to meet the needs of its users, the system elements required to implement the services, and the performance levels required in the elements to deal with the threat environment. A complete security architecture for a system addresses administrative security, communication security, computer security, emanations security, personnel security, and physical security, and prescribes security policies for each. A complete security architecture needs to deal with both intentional, intelligent threats and accidental threats. A security architecture should explicitly evolve over time as an integral part of the evolution of its administrative domain.
Security policy. A set of rules and practices that specify or regulate how a system or organization provides security services to protect resources. Security policies are components of security architectures. Significant portions of security policies are implemented via security services, using security policy expressions.
Security service. A processing or communication service that is provided by a system to give a specific kind of protection to resources, where said resources may reside with said system or reside with other systems, for example, an authentication service or a public key infrastructure (PKI)-based document attribution and authentication service. A security service is a superset of AAA (authentication, authorization, accounting) services. Security services typically implement portions of security policies and are implemented via security mechanisms.
Semantic model. A model of the actual enterprise objects (i.e., things, assets) that is significant to the enterprise. Typically, the semantic model would be represented as an entity/relationship model and would be at a level of definition expressing concepts (i.e., terms and facts) used in the significant business objectives/strategies implemented later as business rules.
Semantic web. The Semantic web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is based on the RDF, which integrates a variety of applications using XML for syntax and URIs for naming.
Server. The program in the client/server architecture that answers clients’ requests. The term “server” is also used to designate the computer that makes resources available to the workstations (clients) on the network.
Service. An abstract resource that represents a capability of performing tasks that form a coherent functionality from the point of view of data providers and requesters.
Service description. A set of documents that describe the interface to and semantics of a service.
Service interface. The abstract boundary that a service exposes. It defines the types of messages and the message exchange patterns that are involved in interacting with the service, together with any conditions implied by those messages.
Service semantics. The semantics of a service is the behavior expected when interacting with the service. The semantics expresses a contract (not necessarily a legal contract) between the provider entity and the requester entity. It expresses the effect of invoking the service. Service semantics may be formally described in a machine-readable form; identified but not formally defined; or informally defined via an agreement between the provider and the requester.
Service-oriented Architecture (SOA). An architectural style whose goal is to achieve loose coupling among interacting software agents. A service is a unit of work done by a service provider to achieve desired end results for a service consumer. Both service provider and service consumer are roles played by software agents/brokers on behalf of their owners. The communication can involve either simple data exchange or it could involve two or more services coordinating some activity. Some means of connecting services to each other is needed.
SOAP (Simple Object Access Protocol). The formal set of conventions governing the format and processing rules of a SOAP message.
SOAP application. A software entity that produces, consumes, or otherwise acts upon SOAP messages in a manner conforming to the SOAP processing model.
SOAP binding. The formal set of rules for carrying a SOAP message within or on top of another protocol (underlying protocol) for the purpose of exchange. Examples of SOAP bindings include carrying a SOAP message within an HTTP entity-body, or over a TCP stream.
SOAP body. A collection of zero or more element information items targeted at an ultimate SOAP receiver in the SOAP message path.
SOAP envelope. The outermost element information item of a SOAP message.
SOAP feature. An extension of the SOAP messaging framework typically associated with the exchange of messages between communicating SOAP nodes. Examples of features include “reliability,” “security,” “correlation,” “routing,” and the concept of message exchange patterns.
SOAP header. A collection of zero or more SOAP header blocks, each of which might be targeted at any SOAP receiver within the SOAP message path.
SOAP header block. An element information item used to delimit data that logically constitutes a single computational unit within the SOAP header. The type of a SOAP header block is identified by the fully qualified name of the header block element information item.
SOAP intermediary. A SOAP intermediary is both a SOAP receiver and a SOAP sender and is targetable from within a SOAP message. It processes the SOAP header blocks targeted at it and acts to forward a SOAP message toward an ultimate SOAP receiver.
SOAP message. The basic unit of communication between SOAP nodes.
SOAP message exchange pattern (MEP). A template for the exchange of SOAP messages between SOAP nodes enabled by one or more underlying SOAP protocol bindings. A SOAP MEP is an example of a SOAP feature.
SOAP message path. The set of SOAP nodes through which a single SOAP message passes. This includes the initial SOAP sender, zero or more SOAP intermediaries, and an ultimate SOAP receiver.
SOAP node. The embodiment of the processing logic necessary to transmit, receive, process, and/or relay a SOAP message, according to the set of conventions defined by the SOAP recommendation. A SOAP node is responsible for enforcing the rules that govern the exchange of SOAP messages. It accesses the services provided by the underlying protocols through one or more SOAP bindings.
SOAP receiver. A SOAP node that accepts a SOAP message.
SOAP role. A SOAP node’s expected function in processing a message. A SOAP node can act in multiple roles.
SOAP sender. A SOAP node that transmits a SOAP message.
Software. A set of computer instructions or data stored in an electronic format.
Spectrum. The array of channels, like the channels on a television, available for communications transmissions. Commonly referred to as a spectrum, these channels are a finite natural resource; they cannot be created, purchased, or discovered.
SQL (Structured Query Language). A language used specifically by a relational database to query, modify, and manage information.
State. A set of attributes representing the properties of a component at some point in time.
Synchronous. An interaction is said to be synchronous when the participating agents must be available to receive and process the associated messages from the time the interaction is initiated until all messages are actually received or some failure condition is determined. The exact meaning of “available to receive the message” depends on the characteristics of the participating agents (including the transfer protocol it uses); it may, but does not necessarily, imply tight time synchronization.
Systems software. Systems software consists of the operating system and all utilities that enable the computer to function.
Taxonomy. A hierarchical classification or categorization of a set of things.
TCP/IP (Transmission Control Protocol/Internet Protocol). TCP/IP is the standard transmission protocol used to connect hosts on the Internet.
Transaction. Transaction is a feature of the architecture that supports the coordination of results or operations in a multiple-step interaction. The fundamental characteristic of a transaction is the ability to join multiple actions into the same unit of work, such that the actions either succeed or fail as a unit.
Type. A description of a class of objects that share the same operations, abstract attributes and relationships, and semantics.
UCR (Uniform Crime Reports). UCR is a city, county, and state public safety program operated by the FBI that provides a nationwide view of crime based on the submission of statistics by public safety agencies throughout the country. The following offenses are recorded: murder and nonnegligent manslaughter; forcible rape; robbery; aggravated assault; burglary; larceny theft; motor vehicle theft; arson; and hate crimes.
Uniform Resource Identifier (URI). A URI points to an external file that defines the namespace. The URI may be either a URL (Uniform Resource Locator) that points to a web server or a URN (Uniform Resource Name) that names a resource but does not specify a de-referenceable network object.
Validation. The evaluation of a system during development or at the time of completion to determine if it satisfies all the requirements.
WAN (Wide Area Network). A WAN consists of two or more LANs connected via telephone lines or radio waves.
Web browser. A software application used to locate and display web pages. It may be able to display graphics, sound, and video in addition to text.
Web service. A web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other web-related standards.
Web Services Description Language (WSDL). An XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information. The operations and messages are described abstractly, and then bound to a concrete network protocol and message format to define an endpoint. Related concrete endpoints are combined into abstract endpoints (services). WSDL is extensible to allow description of endpoints and their messages regardless of what message formats or network protocols are used to communicate.
World Wide Web. A system of Internet servers that support HTML-formatted documents.
World Wide Web Consortium (W3C). The World Wide Web Consortium is the international standards body for interoperable technologies (specifications, guidelines, software, and tools) that support the development of the World Wide Web to its full potential, including XML. W3C is a collaborative forum for information, commerce, communication, and collective understanding.
XML (eXtensible Markup Language). XML is a structured language for describing information being sent electronically by one entity to another. XML Schema defines the rules and constraints for the characteristics of the data, such as structure, relationships, allowable values, and data types.
XML Namespace. A simple method for qualifying element and attribute names used in XML documents by associating them with namespaces identified by URI references.
XML Schema Language (XSD). W3C Recommendation. Specification for defining the structure, content, and semantics of XML documents. Defines a richer set of datatypes than the DTD. XML schemas support namespaces.
Source: https://bja.ojp.gov/sites/g/files/xyckuh186/files/media/document/GJXDMU…
Checkboxes are most often used in justice agency forms to select either “Yes” or “No.” In information technology, a checkbox is a graphical user interface element that lets the user make multiple selections from a number of options. A checkbox normally appears on the screen as a square box that is either empty (false) or contains a tick mark or X (true), with an adjacent caption describing its meaning. The initial value of a checkbox can be defined as checked or unchecked.
In XML, a checkbox is represented as a Boolean type, one of the primitive types that programming languages provide as basic building blocks. Typical primitive types include:
· Character.
· Integer (with a variety of precisions).
· Floating-point number with binary representation usually conforming to the IEEE standards for floating-point representation.
· Fixed-point with a variety of precisions and a programmer-selected scale.
· Boolean, having the values true and false.
· String, a sequence of characters.
· Reference (also called a "pointer" or "handle"), a small value referring to another object, possibly a much larger one.
The XML Schema language allows annotations to be added to schema components through an annotation element (<xsd:annotation>), which may contain documentation elements (<xsd:documentation>) and appinfo elements (<xsd:appinfo>). A source attribute may be added to either element to provide a URI reference to the source of the annotation. Annotations supply documentation and application information that applications can parse and access through an application programming interface (API).
The appinfo element can be used to provide information for tools, stylesheets, and other applications. Both documentation and appinfo appear as sub-elements of annotation, which is usually placed at the beginning of a schema construct.
The appinfo element is subject to the same rules for appearing in an XML Schema as the documentation element, as they are both contained in the annotation element. This means that it can be used within most schema constructs. In the following example, some script is nested inside the appinfo element to indicate to an application what action to take, depending upon which of a choice of two elements a document instance contains:
<xs:group name="CreditOrDebitGroup">
  <xs:annotation>
    <xs:appinfo>
      if (currentNode.firstChild != "Credit")
        docParser.load(debitURL);
      else
        document.write("Your account will be credited within 24 hours.");
    </xs:appinfo>
  </xs:annotation>
  <xs:choice>
    <xs:element name="Credit" type="CreditType" />
    <xs:element name="Debit" type="DebitType" />
  </xs:choice>
</xs:group>
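As a rough illustration of how an application might read annotations through an API, the following Python sketch uses the standard library's xml.etree.ElementTree to collect the documentation and appinfo content of a group. It is only a sketch under stated assumptions: the schema text is a simplified variant of the example above (the element types are replaced with xs:string, and the appinfo text, the documentation text, and the source URI are invented for illustration), and the function name read_annotations is hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Simplified, illustrative schema; the annotation contents are made up.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:group name="CreditOrDebitGroup">
    <xs:annotation>
      <xs:documentation>Either a credit or a debit.</xs:documentation>
      <xs:appinfo source="http://example.org/hints">show-credit-notice</xs:appinfo>
    </xs:annotation>
    <xs:choice>
      <xs:element name="Credit" type="xs:string"/>
      <xs:element name="Debit" type="xs:string"/>
    </xs:choice>
  </xs:group>
</xs:schema>
"""


def read_annotations(schema_text):
    """Map each named group to its documentation and appinfo entries.

    Each entry is a (source attribute, text) pair; source is None when absent.
    """
    root = ET.fromstring(schema_text)
    result = {}
    for group in root.iter(f"{{{XS}}}group"):
        name = group.get("name")
        if name is None:
            continue
        entry = {"documentation": [], "appinfo": []}
        for kind in ("documentation", "appinfo"):
            path = f"{{{XS}}}annotation/{{{XS}}}{kind}"
            for node in group.findall(path):
                entry[kind].append((node.get("source"), (node.text or "").strip()))
        result[name] = entry
    return result


annotations = read_annotations(SCHEMA)
print(annotations["CreditOrDebitGroup"]["appinfo"])
```

What an application does with the appinfo text is entirely up to that application; the schema processor itself does not act on it.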
Some useful XML resources:
Useful Websites
World Wide Web Consortium
http://www.w3.org
XML.com
http://www.xml.com
W3 Schools
http://www.w3schools.com/xml/default.asp
XML Cover Pages
http://xml.coverpages.org/
Organization for Advancement of Structured Information Standards (OASIS)
http://www.oasis-open.org/
Introductions to XML
The following list of introductory and tutorial articles on XML is extracted from the complete chronological listing of articles collected for the XML Cover Pages.
"Extending Your Markup: An XML Tutorial." By André Bergholz (Stanford University). From IEEE Computer. July/August 2000.
"Media-Independent Publishing: Four Myths About XML. [The W3C's Working Group Chair Dispels Some Myths About XML.]" By Jon Bosak. First appeared in IEEE Computer Volume 31, Number 10 (October 1998), pages 120-122.
"XML, Java, and the Future of the Web." By Jon Bosak.
"XML - Questions & Answers." By Jon Bosak (Sun Microsystems).
"Microsoft's Vision for XML." By Adam Bosworth (General Manager, Microsoft Corporation).
[Tutorial Introduction]: "Declaring Elements and Attributes in an XML DTD." By Ronald Bourret.
"SGML and Meta-information: From SGML DTDs to XML-DATA." By François Chahuneau.
"The Evolution of Web Documents: The Ascent of XML." By Dan Connolly, Rohit Khare, and Adam Rifkin.
"Introduction to XML." By Lars Marius Garshol.
"Markup and Core Concepts." By Erik T. Ray. Chapter 2 in Learning XML: Creating Self-Describing Data (O'Reilly, 2001; ISBN: 0-596-00046-4).
"SGML and XML Concepts." From the Society of American Archivists and the Library of Congress. Chapter 6 from the EAD Application Guidelines for Version 1.0. EAD Overview, useful generally as an SGML/XML tutorial.
"A Gentle Introduction to XML" from the Text Encoding Initiative (TEI) Guidelines [XML version] Chapter 2.
"A Technical Introduction to XML." By Norm Walsh. 1998.
XML Basics Quick Start From ZVON.org: The Guide to the XML Galaxy
Any enumerated type with more than ~2000 values fails in XMLBeans. The warnings generally received are:
[java] warning: SchemaType Enumeration found with too many enumeration values to create a Java enumeration. The base SchemaType T=VMACodeSimpleType@http://niem.gov/niem/fbi/2.0" will be used instead
The user must break each oversize table into parts of a size the tool can handle. These partial tables are then "Unioned" together through XML Schema. The "union" technique is essentially a work-around that requires modifying the type definition (in the reference schemas) for any enumerated type that exceeds the limit.
Example:
VMACodeSimpleType and/or VMOCodeSimpleType are usually the tables that cause these warnings to be generated in XMLBeans.
Here is the technique applied to VMA:
- Split VMACodeSimpleType into four even pieces. Four pieces worked; three will also work, and two might also work. Keep the number of parts to the absolute minimum that works.
- Then union the pieces in the VMACodeSimpleType definition as follows:
<xsd:simpleType name="VMACodeSimpleType">
<xsd:union memberTypes="ncic:VMA1CodeSimpleType ncic:VMA2CodeSimpleType ncic:VMA3CodeSimpleType ncic:VMA4CodeSimpleType"/>
</xsd:simpleType>
Note:
While this technique modifies the NIEM simple type structure, it is effectively a form of constraint to accommodate tool limitations. Because the simple type name is not affected by the workaround, an IEPD using the modified code table will still validate against the NIEM and will therefore be conformant.
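The split-and-union steps can be scripted rather than done by hand. The Python sketch below is one possible approach, not part of any NIEM tooling: it divides a list of code values into nearly equal parts and renders one simple type per part plus the union type. The function names, the xsd:token base type, and the sample code values are assumptions made for illustration; a real code table would be read from the reference schema.

```python
from xml.sax.saxutils import quoteattr


def split_enumeration(values, parts):
    """Split values into `parts` consecutive chunks whose sizes differ by at most one."""
    base, extra = divmod(len(values), parts)
    chunks, start = [], 0
    for index in range(parts):
        end = start + base + (1 if index < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def union_schema(type_name, prefix, chunks):
    """Render one xsd:simpleType per chunk, then the xsd:union that joins them."""
    stem = type_name.replace("CodeSimpleType", "")
    lines, member_names = [], []
    for number, chunk in enumerate(chunks, start=1):
        member = f"{stem}{number}CodeSimpleType"
        member_names.append(f"{prefix}:{member}")
        lines.append(f'<xsd:simpleType name="{member}">')
        lines.append('  <xsd:restriction base="xsd:token">')  # base type is an assumption
        for value in chunk:
            lines.append(f"    <xsd:enumeration value={quoteattr(value)}/>")
        lines.append("  </xsd:restriction>")
        lines.append("</xsd:simpleType>")
    members = " ".join(member_names)
    lines.append(f'<xsd:simpleType name="{type_name}">')
    lines.append(f'  <xsd:union memberTypes="{members}"/>')
    lines.append("</xsd:simpleType>")
    return "\n".join(lines)


# Made-up code values standing in for a real, much larger code table.
codes = [f"CODE{i:04d}" for i in range(10)]
chunks = split_enumeration(codes, 4)
schema = union_schema("VMACodeSimpleType", "ncic", chunks)
print(schema)
```

Because the union type keeps the original name, the rest of the schema does not need to change when the number of parts is adjusted.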
The National Information Exchange Model (NIEM) is rapidly becoming the most important XML exchange standard for the U.S. government and its information partners. This four-part series of articles by Priscilla Walmsley provides an overview of the process for defining a NIEM information exchange. It takes the reader from the first step, modeling an exchange using UML, through the last step, assembling the schemas, documentation, and all the other artifacts of an exchange into a complete NIEM-conformant IEPD. The last article in the series also describes the process of validating and publishing an IEPD.
For more details on this, please click on the links below:
Part 1 - Model your NIEM exchange
Part 2 - Map and subset NIEM
Part 3 - Extend NIEM
Part 4 - Assemble the IEPD
Source: Creating a NIEM IEPD
Context Data Elements are components that have common names because they are all attributes of the same object through inheritance. It is crucial to first understand the concept of inheritance of properties from more general types to more specific types to understand context data elements.
The easiest way to understand the concept of inheritance in GJXDM and NIEM is through the following example. One very general object (high level type) is ActivityType. There are a number of more specific objects derived from ActivityType, such as ArrestType and BookingType. All three aforementioned complex types have a number of properties (XML elements) that compose them. One property of ActivityType is ActivityDate (the date an activity occurred). Since ArrestType and BookingType are derived from ActivityType, they are considered more specific Activity objects with a number of specialized properties of their own. However, they each inherit all the properties of their parent, ActivityType. One property inherited by both ArrestType and BookingType from ActivityType is ActivityDate.
Now suppose you want to determine how to tag (in XML) an ArrestDate. If you search the GJXDM or NIEM schema for ArrestDate, you will not find one. The reason is that the value of an arrest date would be tagged as ActivityDate inside of the ArrestType. Analogously, a BookingDate would also be tagged as an ActivityDate as a property of BookingType. This is because the date of any specific activity derived from ActivityType inherits ActivityDate.
GJXDM and NIEM are object-oriented models that take advantage of property inheritance to reuse components. This helps to keep the total number of elements smaller. However, this also means that some specific data element names do not exist in the dictionary because more generically named properties are inherited down to the more specific subtypes.
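The inheritance idea can be mimicked with ordinary classes. The Python sketch below is only an analogy: the class names follow the types discussed above, but the specialized properties (ArrestLocation, BookingFacility) and the helper property_names are chosen for illustration and are not taken from the GJXDM or NIEM schemas.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ActivityType:
    # General property shared by every kind of activity.
    ActivityDate: Optional[str] = None


@dataclass
class ArrestType(ActivityType):
    # Illustrative specialized property; ActivityDate is inherited.
    ArrestLocation: Optional[str] = None


@dataclass
class BookingType(ActivityType):
    # Illustrative specialized property; ActivityDate is inherited.
    BookingFacility: Optional[str] = None


def property_names(cls):
    """List every property of a type, inherited ones first."""
    return [f.name for f in fields(cls)]


arrest = ArrestType(ActivityDate="2005-03-14", ArrestLocation="Main Street")
print(property_names(ArrestType))
print(property_names(BookingType))
```

Neither subtype has an ArrestDate or BookingDate property; the date of the arrest or booking is carried by the inherited ActivityDate.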
During analysis of the data requirements sources, the XML Structure Task Force (XSTF) found no evidence of the necessity for mandatory use of date and time together. Many sources had defined both date and time components, but there were no components of the type DateTime.
The XML Schema dateTime type (xsd:dateTime) uses ISO 8601 format and does not permit the option to use either date or time without the other. This means that if a component is of type dateTime, then both date and time are significant and must be present. As a result, there is no way to indicate that time is insignificant or unknown. The XSTF believes that this could lead to potentially confusing or misleading data. Therefore, it was decided that xsd:date and xsd:time should be used optionally and independently.
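The practical effect of keeping date and time separate can be sketched in Python. In this hypothetical example, a record holds an optional date and an optional time and writes each one only when it is known. The element names Activity, ActivityDate, and ActivityTime and the helper to_xml are used here purely for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import date, time
from typing import Optional


@dataclass
class ActivityWhen:
    # Date and time are independent: either one may be unknown.
    activity_date: Optional[date] = None
    activity_time: Optional[time] = None


def to_xml(when):
    """Serialize only the parts that are known, in ISO 8601 form."""
    root = ET.Element("Activity")
    if when.activity_date is not None:
        ET.SubElement(root, "ActivityDate").text = when.activity_date.isoformat()
    if when.activity_time is not None:
        ET.SubElement(root, "ActivityTime").text = when.activity_time.isoformat()
    return ET.tostring(root, encoding="unicode")


date_only = to_xml(ActivityWhen(activity_date=date(2005, 3, 14)))
date_and_time = to_xml(ActivityWhen(date(2005, 3, 14), time(14, 30)))
print(date_only)
print(date_and_time)
```

With a single dateTime value, the first record could only be written by inventing a time such as midnight, which is exactly the kind of misleading data the XSTF wanted to avoid.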
Bail and court-ordered fines are not child properties of Case (they are not part of CaseType or inherited from ActivityType). However, there is a Bail property (of BailType, which extends ActivityType) and a FinancialObligation property (of FinancialObligationType, which extends ObligationType) that could be used for the court-ordered fines.
In the XML Schema for this exchange, one should consider making the root an extension of the DocumentType specific to this exchange. For example, one could define an element SummaryCase and an associated complex type SummaryCaseType (which extends DocumentType). This helps indicate SummaryCase is the root of the XML document and allows the Document metadata (in DocumentType) to be used.
If there is only one case per document instance, then one can simply define SummaryCaseType to include j:Case, j:Bail, and j:FinancialObligation in their entirety. The fact that they are all together in the same SummaryCase indicates they are related. It is also possible to use a GJXDM subset schema to exclude the elements in j:Case, j:Bail, and j:FinancialObligation that are not needed.
Or, instead of including j:Bail and j:FinancialObligation in their entirety, it is possible to include just selected properties (if there are only a few of them). Alternatively (and especially if there can be more than one case per document instance), one could define a local CaseType as an extension of j:CaseType. In the local CaseType, j:Bail and j:FinancialObligation can be included in their entirety, or just the necessary properties selected from them. SummaryCase would contain only j:Case. In the XML document instance, one would use type substitution to substitute the local CaseType. The Amber Alert example in the Developer Workshop materials shows how this is done: some missing person and contact properties were included in a local IncidentType that extends j:IncidentType.
It is important to note that type substitution can be more difficult to handle on the receiving end in some architectures. Instead of type substitution, it is possible to define a local Case property with the local CaseType as described above. Finally, one could also consider using references if the case can have multiple defendants and bail amounts or fine amounts need to be related to particular defendants. Let's say defendant 'x' information is specified in Case. In CaseDefendantParty.Person, it would be necessary to specify j:id='x'. In Bail, BailSubjectReference (instead of BailSubject) with j:ref='x' would be used.
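The reference approach can be illustrated with a small Python sketch that resolves j:ref values against j:id values on the receiving side. Everything specific here is an assumption for illustration: the namespace URI is a placeholder rather than the real GJXDM namespace, PersonName is flattened to plain text, BailSetAmount and the sample values are invented, and the function bail_by_defendant is hypothetical.

```python
import xml.etree.ElementTree as ET

J = "http://example.org/jxdm-placeholder"  # placeholder, not the real GJXDM namespace

# Illustrative instance: two defendants, one bail record pointing at defendant 'x'.
INSTANCE = f"""\
<SummaryCase xmlns:j="{J}">
  <j:Case>
    <j:CaseDefendantParty>
      <j:Person j:id="x"><j:PersonName>Defendant X</j:PersonName></j:Person>
    </j:CaseDefendantParty>
    <j:CaseDefendantParty>
      <j:Person j:id="y"><j:PersonName>Defendant Y</j:PersonName></j:Person>
    </j:CaseDefendantParty>
  </j:Case>
  <j:Bail>
    <j:BailSubjectReference j:ref="x"/>
    <j:BailSetAmount>5000</j:BailSetAmount>
  </j:Bail>
</SummaryCase>
"""


def bail_by_defendant(xml_text):
    """Return {defendant name: bail amount} by following j:ref to the matching j:id."""
    root = ET.fromstring(xml_text)
    people = {
        person.get(f"{{{J}}}id"): person.findtext(f"{{{J}}}PersonName")
        for person in root.iter(f"{{{J}}}Person")
    }
    result = {}
    for bail in root.iter(f"{{{J}}}Bail"):
        ref = bail.find(f"{{{J}}}BailSubjectReference").get(f"{{{J}}}ref")
        result[people[ref]] = bail.findtext(f"{{{J}}}BailSetAmount")
    return result


print(bail_by_defendant(INSTANCE))
```

The defendant's details appear once, inside Case, and the Bail element only points to them, so a second bail record for defendant 'y' would not need to repeat any person data.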
Several tools are available in the justice community to search NIEM / GJXDM. These include the National Center for State Courts (NCSC) Wayfarer tool, the Model Viewer tool built into the Subset Schema Generation Tool (SSGT), and the Excel spreadsheet included in the NIEM / GJXDM distribution.
In Wayfarer and the SSGT Model Viewer, to locate an element like Charge, simply type the element's name in the "search" text box, and click search. Both tools will show the elements that are contained within this element's type. For example, within ChargeType, one of the included elements is ChargeVictim. Clicking on the link for ChargeVictim takes the user to VictimType.
To find elements in the spreadsheet, simply use Excel's "Find..." feature. (Be sure to click the option that searches across worksheets, as the spreadsheet splits up the GJXDM into separate worksheets.) Hyperlinks embedded within the spreadsheet take the user to relevant types.
These elements are defined by the types derived from DocumentType in the GJXDM. Within the GJXDM class hierarchy, a generic DocumentType is defined. This type contains a number of metadata-like properties identified by Dublin Core, the intelligence community, and the records management community. The "outermost tags in instance documents" are the elements defined by the types derived from DocumentType in the JXDM.
The consistent baseline model and dictionary of reusable building blocks to create them is being designed and built by the justice community. Furthermore, note that there are already a number of groups in existence that provide "official blessing," for example, the Joint Task Force on Rap Sheet Standardization. This is also why each is allowed to reside in its own namespace (outside of JXDM): undoubtedly, some customization (extensions to JXDM) will be required in many of these documents. Again unofficially, examples would be:
<element name="RapSheet" type="RapSheetType"></element>
<element name="DriverHistory" type="DriverHistoryType"></element>
<element name="ProtectionOrder" type="ProtectionOrderType"></element>
<element name="CourtOrder" type="CourtOrderType"></element>
... where each "__________Type" above is derived from and therefore inherits metadata from the JXDM DocumentType.
Another more specialized example:
<element name="MNCourtOrder" type="MNCourtOrderType"></element>
... a specialization of CourtOrder for Minnesota which might be derived from the more generic CourtOrderType.
Many complex-type elements with an ID element also have a corresponding text element and/or a code element. The content of any element of type IDType is dependent on the context. Since ID is of type TextType, the actual identifier could be a name or a number. One can use the ID element, the text element, or both, depending on the business need. For example, consider the case where one would like to use JurisdictionID with ID = 10 and send “Maricopa” as the textual translation for that ID. So, one could decide to use for either a name or a number, depending on how one identifies the jurisdiction districts in the information sharing enterprise. If the problem is that one wants to include both the name and the number of the district in the jurisdiction, then could be used for the number and for the name (instead of adding an element to IDType for the name).
is defined as "A district in a jurisdiction." Another alternative is to use a code. The use case referred to here would often be solved with a code; in the enumeration the value would be "10" and the documentation (presumably available to sender and receiver) would be "Maricopa". For this example, adding a local extension something like (with a corresponding enumeration) would work. The advantage of this method is that it ties the code to its description in the schema (so you won't have a conflict if is "Maricopa" but is "9"). The disadvantage of this method is that you are not actually sending the name (the receiving system must refer to the enumeration schema), and in this case you have to extend the GJXDM.
GJXDM and NIEM add a great deal of redundancy to element names to provide a set of data element names that are relatively complete, semantically precise, and globally understood. The meaning of any given tagname should be determinable within a variety of circumstances, ranging from well-structured documents with rich context to transaction or message-oriented formats that may be weak in context.
To do this, the XML Structure Task Force prohibits synonyms and uses ISO/IEC Standard 11179 (Specification and Standardization of Data Elements) syntax for naming elements consistently and precisely. The resulting data dictionary will have potentially wide applicability and a larger scope, at the expense of longer names. Furthermore, we used the Draft Federal XML Developer's Guide (2002), which states that ISO 11179 naming syntax should be used for XML element names.
XML Developer's Guide: http://xml.fido.gov/documents/in_progress/developersguide.pdf
The Global Justice XML Data Dictionary schema employs the following rules for rendering elements vs. attributes:
(1) Employ elements whenever possible. Attributes are used as the exception, and only with reasonable justification based on XML limitations or significant avoidance of complexity. Justification: Elements are much more flexible than attributes. Attributes cannot be complex and cannot occur multiple times. Federal guidelines and best practices suggest the avoidance of attribute use.
(2) SuperType properties are always attributes. Justification: The SuperType contains properties that are applicable to all components of the GJXDM. Therefore, all fields will include the properties of SuperType. All GJXDM components are derived from SuperType, so that all components inherit all the SuperType properties. At the same time, SuperType has neither complex nor simple content. In fact, it has no content; it's empty. Thus, objects with simple content can still be derived from SuperType, because it cannot contain any subordinate elements. And finally, one can consider SuperType properties generic enough to be metadata (such as, probabilityNumeric, distributionText, reportedDate, expirationDate, etc.) on all GJXDM components.
(3) DocumentType properties are always elements. Reason: In view of rule (1), a justification is unnecessary. However, we provide the following explanation because of the special nature of DocumentType as the root of all reference document types. DocumentType is derived from SuperType. In the GJXDM, reference document types derived from DocumentType are the primary basis for information exchange transactions. As such, DocumentType has a set of metadata properties that are common to all documents derived from it. We render these properties as elements for several reasons: • There is no need for DocumentType to be empty (as there is for SuperType). • Metadata defined for DocumentType is fairly complex and cannot easily be rendered as attributes. • The common properties of many documents and transactions will likely be extended and evolve. • What may be metadata in a library or relational table sense is relevant document data to somebody.
(4) Use attributes for metadata that simply qualify the format or representation of a data value, but are not a required part of that data value, and when such use will avoid complexity. Justification: Sometimes metadata qualifiers are not an essential part of the data itself. The data can stand alone and still be understood, or in other words, the qualifiers can be ignored without harm to understanding. The qualifier simply clarifies or focuses the meaning of the data or its representation. Furthermore, use of attributes for such qualifiers avoids unnecessary content complexity. Example:
<PersonGivenName namevariant="initial">M</PersonGivenName>
(5) Properties of simple numeric types are attributes. Justification: Type Measure, Numeric, Quantity, Rate, and several other types that require qualifiers such as unit of measure or tolerance will be handled with attributes to maintain simple content. These qualifiers are simple and have stable values. The data value and qualifiers cannot be separated without almost total loss of meaning. Furthermore, use of attributes for such qualifiers avoids unnecessary content complexity. Example:
<VehicleSpeedMeasure speedunit="mph" tolerance="5">37</VehicleSpeedMeasure>
(6) Mixed content is NOT supported. Justification: Mixed content models are confusing and extremely difficult to implement and parse. Example of unsupported mixed content:
<VehicleSpeedMeasure>37
<SpeedUnit>mph</SpeedUnit>
</VehicleSpeedMeasure>
The capability to store common names (context data elements) in the dictionary database that are not used in the Global Justice XML Data Dictionary (GJXDD) schema will be added. These context elements will be defined and will refer (link) to the Global Justice XML Data Model (GJXDM) elements that represent them in a given context. For example, ArrestDate will be defined in the dictionary, and its entry will reference (link to) ActivityDate and explain the Arrest context. This capability will assist users who must map their local dictionaries to GJXDM.
There are a number of ways to identify appropriate context data elements. The XSTF is evaluating them. The capability to have and use context data elements for navigation of GJXDM is available with the first release of the GJXDM Search Tool. However, identifying what context data elements should be contained in the GJXDM database will be evolutionary and based upon input from the field and methods for collecting such. Also, by design, synonymous elements are not permitted in the GJXDD. However, data element definitions may contain synonyms that will not appear as tags in the GJXDD schema. Thus, within the documentation it is easy to discover synonyms by searching. A data registry for the GJXDM would also enable this.
In addition, the National Center for State Courts Wayfarer GJXDM search tool ("Wayfarer") has a "contextual search" feature that can often identify context data elements.
Wayfarer: http://apps.ncsc.org/niem/
GJXDM and NIEM are reference models. Elements are designed for maximum initial flexibility.
They are:
Overly inclusive: there are more elements than necessary for many applications.
Optional: minOccurs="0"
Unrestricted: maxOccurs="unbounded"
These rules are not absolute. There may be cases in which these rules are not practical or realistic, or where the XML Schema specifications must take precedence. In addition, the reference architectures define a constraint schema in which more specific constraints can be applied to elements.
In dynamic type substitution, the structure that contains the new element remains unmodified. When constructing an instance, the structure of a derived type may be substituted for an expected type structure. The original remains the same. The only requirements to using type substitution are that the name of the type to substitute must be specified as the value of attribute "xsi:type," an attribute of the original element, and only derived types may be substituted. (For example, SubjectType may be substituted for PersonType, but VehicleType may not be substituted.)
In static element replacement, the element substitution is done in the schema, rather than the instance. This requires extending types all the way up the containment hierarchy to the root of the schema. (While this does result in a larger number of derived types in the extension schema, it has the advantage of making the element replacements explicit in the schema.) To make the element replacement explicit, it is recommended that the schema designer include documentation in the schema indicating the name of the element being replaced. The "appinfo" namespace included in GJXDM and NIEM provides an element suited for this purpose.
Both dynamic type substitution and static element replacement have advantages and disadvantages.
The dynamic type substitution has distinct advantages.
o The main advantage of dynamic type substitution is that it achieves the desired element replacement without requiring definition of as many derived types in the extension schema.
o Another advantage of dynamic type substitution is keeping the original element name and path. Even if other users don't understand your local “CourtOrderNarrative” extension, they can still process the original "CourtActivityCourtOrder" because it has the same name and same path as expected.
However, dynamic type substitution also has several disadvantages.
o The complete definition of the containing types is unknown until instances are created.
o The schema designer has to describe where type substitution is required in a non-schema artifact, which makes it difficult to use the schema as the basis for exchange.
o Many common XML tools and messaging infrastructure do not support type substitution in some instances.
The advantages and disadvantages of static element replacement are the opposite of those of dynamic type substitution. Static element replacement results in more derived types in the extension schema. However, it enjoys broader infrastructure support, and the schema can be used as a more complete definition of the IEP structure, making the semantics clear and explicit.
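As a minimal sketch of what a receiver faces with dynamic type substitution, the Python below scans an instance for elements that carry an xsi:type attribute. The element names CourtActivityCourtOrder and CourtOrderNarrative come from the discussion above. The wrapper element j:CourtActivity, the type name CourtOrderNarrativeType, the narrative text, and the j and ext namespace URIs are hypothetical placeholders, not taken from GJXDM or NIEM. Only the xsi namespace URI is the real one defined by XML Schema.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"  # real XML Schema instance namespace
J = "http://example.org/justice"                   # placeholder URI (assumption)
EXT = "http://example.org/local-extension"         # placeholder URI (assumption)

# Hypothetical instance: the original element name is kept, and a locally
# derived type is substituted through xsi:type.
INSTANCE = f"""
<j:CourtActivity xmlns:j="{J}" xmlns:ext="{EXT}" xmlns:xsi="{XSI}">
  <j:CourtActivityCourtOrder xsi:type="ext:CourtOrderNarrativeType">
    <ext:CourtOrderNarrative>Defendant shall appear as ordered.</ext:CourtOrderNarrative>
  </j:CourtActivityCourtOrder>
</j:CourtActivity>
"""


def substituted_types(root):
    """Return (local element name, raw xsi:type value) for each substituted element."""
    found = []
    for el in root.iter():
        declared = el.get(f"{{{XSI}}}type")
        if declared is not None:
            local_name = el.tag.split("}")[-1]
            found.append((local_name, declared))
    return found


root = ET.fromstring(INSTANCE)
for name, type_name in substituted_types(root):
    print(name, "uses substituted type", type_name)
```

The element name and path are unchanged, so a receiver that ignores the extension can still locate CourtActivityCourtOrder. A receiver that needs the extension content has to inspect xsi:type at run time, which is the tooling burden noted in the disadvantages above.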
What are the guidelines for using elements vs. attributes?
The following guidelines are extracted or paraphrased from the Draft Federal XML Developer's Guide (April 2002).
Use elements that extend easily for maximum flexibility, and use attributes to reduce complexities or when forced to maintain simple content.
One of the key schema design decisions is whether to represent an information element as an XML element or attribute. Once an information item has been made an attribute, it cannot be extended further. There cannot be multiple uses of it within the same element (i.e. 0 or 1, but no more). Also, if enumerated, an attribute cannot be extended to add values to its enumeration list.
Unfortunately, some of these guidelines can be very subjective. For example, what constitutes metadata and what constitutes data? Often this depends upon the user perspective and the application. For a model with very large applicability and a requirement for maximum flexibility, it is probably safer to use elements that can be more easily extended. Attributes may be used to avoid unnecessary complexity or when forced to maintain simple content.
Attributes SHOULD:
Only be used to convey metadata that will not be parsed. In other words, attribute values should be single words or numeric, not lists of words or numbers that may require further parsing.
Provide metadata that describes the entire content of an element. If the element has children, any attributes should be generally applicable to all the children.
Provide extra metadata required to better understand the business value of an element.
Be short, preferably numbers or conforming to the XML Name Token convention.
Only be used to describe information units that cannot or will not be further extended or subdivided.
Attributes SHOULD NOT contain long string values.
For these reasons and to promote uniformity, Federal guidelines discourage the use of attributes. But if you must use them, keep them simple.
“DocumentType” is derived from “SuperType”. In the GJXDM, reference document types derived from “DocumentType” are the primary basis for information exchange transactions. As such, “DocumentType” has a set of metadata properties that are common to all documents derived from it.
We render these properties as elements for several reasons:
There is no need for “DocumentType” to be empty (as there is for “SuperType”).
Metadata defined for “DocumentType” is fairly complex and cannot easily be rendered as attributes.
The common properties of many documents and transactions will likely be extended and evolve.
What may be metadata in a library or relational table sense is relevant document data to others.
A complete list that details all of the deprecated elements and other changes can be found in the change log at the following page:
The GJXDM element j:PersonModusOperandi is of type j:ActivityType, and can be used to describe "A methodology of action, particularly a criminal action, known to be routinely associated with a persons crimes."
Content Elements:
Content elements enclose data. The following is an example:
<Person s:id="A">
...
<PersonName>
<PersonFullName>Adam Smith</PersonFullName>
</PersonName>
...
</Person>
In this example, there is a person object. The person contains an element called PersonName. The PersonName element contains an element called PersonFullName. The PersonFullName element contains a string Adam Smith. The PersonFullName element is obviously a content-containing element. It has the person’s name (a literal string) as its content.
The PersonName is also a content-containing element, as its content represents the person name, as a structured object. It contains the element PersonFullName, and could contain additional elements.
Reference Elements:
Reference elements do not enclose content. Instead, they reference content as external objects:
<Incident>
<ActivityDate>2003-10-02</ActivityDate>
...
<IncidentSeizedPropertyRef s:ref="C"/>
...
</Incident>
In the above example, the property that was seized as part of the incident is referenced out to another object, an XML object in the same XML instance, with the identifier C.
<Property s:id="C">
<PropertyDescriptionText>
White microwave oven
</PropertyDescriptionText>
<PropertyTypeCode>HOVEN</PropertyTypeCode>
<PropertyMakeName>Kenmore</PropertyMakeName>
<PropertyModelName>63292</PropertyModelName>
</Property>
The object that has the identifier C is an instance of Property, specifically representing a microwave oven. The reasons for representing the microwave oven outside of the incident should be quite evident: it is its own object, independent of the incident. It has its own life cycle. If the incident did not exist, the microwave oven would still exist.
The seized property is an element of the incident because it is a fixed part of the incident. The incident involved the seizing of the property, and that will not change. However, it should be represented by a reference element, as the property has its own life cycle, outside of the incident.
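The way a receiver might follow a reference element to the object it points at can be sketched in Python. This is an illustration only. The wrapper element Report and the URI bound to the s prefix are hypothetical placeholders, since the actual structures namespace depends on the GJXDM or NIEM release in use.

```python
import xml.etree.ElementTree as ET

# Placeholder URI for the "s" prefix (assumption, not the official namespace).
S_NS = "http://example.org/structures"

# Hypothetical wrapper <Report> around the Incident and Property shown above.
INSTANCE = f"""
<Report xmlns:s="{S_NS}">
  <Incident>
    <ActivityDate>2003-10-02</ActivityDate>
    <IncidentSeizedPropertyRef s:ref="C"/>
  </Incident>
  <Property s:id="C">
    <PropertyDescriptionText>White microwave oven</PropertyDescriptionText>
    <PropertyMakeName>Kenmore</PropertyMakeName>
  </Property>
</Report>
"""


def build_id_index(root):
    """Map every s:id value in the instance to the element that carries it."""
    id_attr = f"{{{S_NS}}}id"
    return {el.get(id_attr): el for el in root.iter() if el.get(id_attr) is not None}


def resolve(ref_element, index):
    """Return the object a reference element points to, or None if unresolved."""
    return index.get(ref_element.get(f"{{{S_NS}}}ref"))


root = ET.fromstring(INSTANCE)
index = build_id_index(root)
ref = root.find("./Incident/IncidentSeizedPropertyRef")
seized = resolve(ref, index)
print(seized.tag, "-", seized.findtext("PropertyDescriptionText"))
```

Because the Property is indexed once by its identifier, several incidents could reference the same object without duplicating its content.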
Abstract elements are elements that are defined in an XML schema but cannot appear in an XML instance; this is an XML Schema mechanism for forcing substitution. Substitution groups must be used whenever abstract elements are defined; an element that is a member of the abstract element’s substitution group must appear in place of the abstract element in an XML instance.
NIEM uses abstract elements as the head-element for all substitution groups. Abstract elements are used throughout the NIEM wherever there is a concept that can be represented multiple ways. An abstract element serves as a placeholder in the reference model, but it must be substituted when creating an exchange specification.
Below is a snippet of NIEM schema to illustrate this.
<xsd:complexType name="PersonType">
<xsd:complexContent>
<xsd:extension base="u:SuperType">
<xsd:sequence>
<xsd:element ref="u:PersonSex" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
<xsd:element name="PersonSex" abstract="true"/>
<xsd:element name="PersonSexText" type="u:TextType" substitutionGroup="u:PersonSex"/>
<xsd:element name="PersonSexCode" type="ncic:SEXCodeType" substitutionGroup="u:PersonSex"/>
In the example above both PersonSexText and PersonSexCode belong to the PersonSex substitution group. This means either PersonSexText or PersonSexCode must replace PersonSex whenever PersonType is instantiated.
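To see which elements may stand in for an abstract head, one could scan the schema for substitutionGroup attributes. The Python below is a rough sketch under simplifying assumptions: it compares only the local part of the substitutionGroup value instead of resolving prefixes to namespaces, and the URIs bound to the u and ncic prefixes are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"  # real XML Schema namespace

# Element declarations from the snippet above; the u and ncic URIs are
# placeholders (assumptions) so that the fragment parses on its own.
SCHEMA = f"""
<xsd:schema xmlns:xsd="{XSD_NS}"
            xmlns:u="http://example.org/universal"
            xmlns:ncic="http://example.org/ncic">
  <xsd:element name="PersonSex" abstract="true"/>
  <xsd:element name="PersonSexText" type="u:TextType" substitutionGroup="u:PersonSex"/>
  <xsd:element name="PersonSexCode" type="ncic:SEXCodeType" substitutionGroup="u:PersonSex"/>
</xsd:schema>
"""


def abstract_heads(schema_root):
    """Names of global elements declared with abstract="true"."""
    return [el.get("name")
            for el in schema_root.findall(f"{{{XSD_NS}}}element")
            if el.get("abstract") == "true"]


def substitution_members(schema_root, head_name):
    """Names of global elements whose substitutionGroup names the given head."""
    members = []
    for el in schema_root.findall(f"{{{XSD_NS}}}element"):
        group = el.get("substitutionGroup")
        # Simplification: match on the local name only, ignoring the prefix.
        if group is not None and group.split(":")[-1] == head_name:
            members.append(el.get("name"))
    return members


schema_root = ET.fromstring(SCHEMA)
for head in abstract_heads(schema_root):
    print(head, "->", substitution_members(schema_root, head))
```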
The answer is that technically you can use a code list as an element or an attribute or both. However, rule 2 of the NIEM Conformance Rules (https://www.niem.gov/aboutniem/grant-funding/Pages/implementation-guide…) states the following: “If the appropriate component (type, element, attribute, etc.) required for an IEPD exists in the NIEM, use that component. Do not create a duplicate component of one that already exists.” By defining both an attribute and an element you are creating two components that mean the same thing. One would need a very strong business justification for using the code list as both an element and an attribute.
Members of a substitution group are not mutually exclusive. However, cardinality constraints on the head element apply. So if the head element has maxOccurs="1" (or maxOccurs is not specified, as the default value is 1), only one (but any one) of the substitution group members may be used. If the head element has maxOccurs="unbounded", then any number of substitution group members may be used in any combination and in any order. So, for example, if elements A, B, and C are members of the substitution group, you could have such combinations as AA, AAB, ABC, CBBA, and so on (although only certain combinations likely make business sense).
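The cardinality rule can be restated as a small check in Python. This is a sketch of the rule as described above, not a schema validator: the function name and arguments are hypothetical, and minOccurs and element order are deliberately ignored.

```python
def within_max_occurs(children, group_members, max_occurs=1):
    """Check the head element's maxOccurs against substitution group usage.

    children      -- element names appearing under one parent, in order
    group_members -- names of the elements in the substitution group
    max_occurs    -- an integer, or the string "unbounded" (default 1, as in XML Schema)
    """
    used = [name for name in children if name in group_members]
    if max_occurs == "unbounded":
        return True
    return len(used) <= max_occurs


MEMBERS = {"A", "B", "C"}

# With the default maxOccurs of 1, any single member is allowed but not two.
print(within_max_occurs(["A"], MEMBERS))
print(within_max_occurs(["A", "B"], MEMBERS))

# With maxOccurs="unbounded", combinations such as CBBA are allowed.
print(within_max_occurs(["C", "B", "B", "A"], MEMBERS, max_occurs="unbounded"))
```

All members count against the same limit because they occupy the position of the single head element in the containing type.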
Consider the following scenario:
An IEPD has an existing extension element that is a code for widget capabilities (a widget can have multiple capabilities), and the information exchange XML must include not only the widget capability codes but also a description of each code used. There are two options.
The name should be WidgetCapabilityText, and assuming you can have more than one widget capability in an instance, the example below is a common way to do it:
<my-ext:Widget>
<my-ext:WidgetCapabilityCode>WD</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityCode>SY</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityText>Washes the dishes</my-ext:WidgetCapabilityText>
<my-ext:WidgetCapabilityText>Sweeps the yard</my-ext:WidgetCapabilityText>
</my-ext:Widget>
Another way to get this done is to use the metadata element nc:DescriptionText to hold the literal and apply that metadata element to WidgetCapabilityCode.
<nc:Metadata s:id="WD">
<nc:DescriptionText>Washes the dishes</nc:DescriptionText>
</nc:Metadata>
<nc:Metadata s:id="SY">
<nc:DescriptionText>Sweeps the yard</nc:DescriptionText>
</nc:Metadata>
…
<my-ext:Widget>
<my-ext:WidgetCapabilityCode s:metadata="WD">WD</my-ext:WidgetCapabilityCode>
<my-ext:WidgetCapabilityCode s:metadata="SY">SY</my-ext:WidgetCapabilityCode>
</my-ext:Widget>
(Note that the id values are arbitrarily chosen to be the same as the code values.)
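On the receiving side, the metadata approach means each code must be matched to the Metadata object whose identifier it names. A minimal Python sketch follows. The wrapper element my-ext:Exchange and all three namespace URIs are hypothetical placeholders, and the sketch assumes each s:metadata attribute names exactly one Metadata object.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions, not the official NIEM namespaces).
NC = "http://example.org/niem-core"
S = "http://example.org/structures"
EXT = "http://example.org/my-ext"

# Hypothetical wrapper <my-ext:Exchange> around the fragments shown above.
INSTANCE = f"""
<my-ext:Exchange xmlns:nc="{NC}" xmlns:s="{S}" xmlns:my-ext="{EXT}">
  <nc:Metadata s:id="WD">
    <nc:DescriptionText>Washes the dishes</nc:DescriptionText>
  </nc:Metadata>
  <nc:Metadata s:id="SY">
    <nc:DescriptionText>Sweeps the yard</nc:DescriptionText>
  </nc:Metadata>
  <my-ext:Widget>
    <my-ext:WidgetCapabilityCode s:metadata="WD">WD</my-ext:WidgetCapabilityCode>
    <my-ext:WidgetCapabilityCode s:metadata="SY">SY</my-ext:WidgetCapabilityCode>
  </my-ext:Widget>
</my-ext:Exchange>
"""


def qname(namespace, local):
    """Build the {namespace}local form that ElementTree uses for qualified names."""
    return f"{{{namespace}}}{local}"


def capability_descriptions(root):
    """Pair each capability code with the description held in its metadata object."""
    descriptions = {
        md.get(qname(S, "id")): md.findtext(qname(NC, "DescriptionText"))
        for md in root.iter(qname(NC, "Metadata"))
    }
    return [
        (code.text, descriptions.get(code.get(qname(S, "metadata"))))
        for code in root.iter(qname(EXT, "WidgetCapabilityCode"))
    ]


root = ET.fromstring(INSTANCE)
for code, text in capability_descriptions(root):
    print(code, "=", text)
```

Compared with the first option, the pairing of code and description is explicit here and does not depend on the two lists appearing in the same order.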
The GJXDM and NIEM schemas (and subsets thereof) contain global element declarations. If there are multiple global element declarations in a schema, a validating parser is not capable of determining which of those global elements is intended to be the root of a valid instance. (Some parsers consider this an error, others a warning; the XML Schema specification is ambiguous on the proper parser behavior.)
For this reason, it is a best practice for every IEPD to include a document schema that identifies the single element that is the root of a valid instance. The document schema defines an IEPD-specific namespace that contains only a single element declaration. The element's type should be in either the GJXDM or NIEM namespace or an IEPD-specific extension namespace.
So, in summary answer to the question: Always use a document schema when it is important to you to identify the root element of a valid instance. Since this is almost always desired, it is a best practice always to define a document schema.
An extension schema defines an IEPD-specific namespace that contains types and elements that are particular to that IEPD. Types in an extension schema should extend types in GJXDM or NIEM; elements in an extension schema should be either of types in the extension schema, or types in GJXDM or NIEM.
It is typical for an IEPD to have at least a few extensions. Extension schemas play a valuable role in the concept of conformance. While it is important for IEPDs to leverage types and elements in GJXDM and NIEM if they fit the semantics of the exchange, it is equally important not to use a type or element from them if its definition does not fit the semantics. It is very important not to use a GJXDM or NIEM element solely on the basis of the name "sounding like" what is intended in the exchange semantics; the definition must be consistent with those semantics as well.
So, in summary answer to the question: Use an extension schema whenever the exchange involves semantics that do not exist in GJXDM or NIEM.
Alternatives to schema subsets include the use of restriction, modularizing the full JXDD schema, or just copying over only the elements and definitions one needs.
The XML Schema restriction mechanism allows users to take a type and restrict away the elements they don't need and to modify the occurrence restrictions of other elements. While this seems like an acceptable approach, it presents some problems that may not be obvious. The first problem is that if users create a schema subset by restricting the full JXDD schema, the full JXDD schema would still be imported. No benefits would be gained in loading or validation time. The second problem is that restrictions cannot be enforced. To create a restriction, a new local type would be created based on the original JXDD type. Elements could be dropped or their number of occurrences reduced. The local schema would still have to import the full JXDD schema to do this, but using fast validation tools would make this possible. The real problem is in usage. Elements defined to be of the original JXDD type would be able to use the local restricted type in the XML instance through type substitution; however, there is no way to enforce this type substitution to occur. It would still be entirely possible, and in fact easier, for the original unrestricted JXDD type to be used. Validation would not recognize that the local restricted type should be used instead of the original JXDD type. The only way to work around this would be to create a new local element of the new locally restricted type. Validation would then enforce that the local type be used, but the element would have no connection to the JXDD and would not be understood by others to whom the schema is sent. This loses much of the benefit gained by using the JXDD - understandability. The use of restriction is not prohibited, but it offers much less in terms of performance benefits and validation support than schema subsets provide. Furthermore, in most cases restriction is not a sufficient alternative to schema subsets.
Another alternative would be to modularize the full JXDD schema into different components, as has been suggested. The problem with this is that there is no set of lines over which modularization would work and provide the benefits desired. If the full JXDD schema were divided into smaller components, the smaller components would still need to import each other because they are all interrelated. A person module would need reference to a location module, a contact information module, an organization module, a miscellaneous module for common types, and some subset of an activity module for the person subtypes. Little performance gain is made and complexity is increased.
A seemingly simpler alternative to building schema subsets would be for users to copy over only those element and type definitions that they need from the full JXDD schema into their own document schema. This approach has problems as well. If users copy over JXDD components into their document schema without putting them into the Justice namespace, then other users would not be able to recognize that those components come from the JXDD. The namespace is what identifies the common source of the components; without this, recognition is lost.
Instead, if users copy over JXDD components into their own document schema and put them under the Justice namespace, then this tie to the JXDD is in name only. Usually, the structural definitions of JXDD components that are used in a local schema are imported from a definition schema in the Justice namespace. The full JXDD schema is one such definition schema - it is an official definition of the JXDD elements and types. The JXDD schema subset is another. A local copy of JXDD components in a document schema is just that - a local copy. There is no official structural definition schema against which to validate and ensure that the components appear as they should. The local document only validates against itself. There is no guarantee that components are actually from the JXDD; at best all you have is a claim. In either case, local copies with or without Justice namespace references, it would not be possible to reference and identify appropriate components as valid Justice elements and types.
Many factors need to be taken into account before deciding to use the SSGT, including the end goal, the alternatives, and short-term and long-term concerns.
To produce a set of schema subsets, the tool needs the following properties.
The tool must allow a user to search and navigate through the full Justice dictionary. This is necessary because users will need to see what is available before they can choose which parts they want to use.
The tool must give users the ability to create schema subsets by adding constraints.
The tool should allow users to create extension and document schemas by making customizations. Notice that this is not required functionality - a base tool could be built without it but would not be capable of providing the complete set of schemas.
The tool must be able to generate the customized schema subsets from the user input. This requires knowledge of the dictionary, data model, and the rules for creating valid schema subsets.
Standard Commercial Registries
There is no purpose in recreating an existing product that could meet our needs. Therefore it is important to take a look at what a commercial, off-the-shelf, ebXML-compliant registry could offer us. A commercial registry could catalogue the Justice dictionary and store metadata about it, either at a component level or a document level. A commercial registry could also give users some manner of searching and retrieving data through a user interface.
These are important and necessary functionalities, but they are often not adequate to support the construction of customized schemas. To start with, only one class of registry could be used. This would be a registry with component level granularity. Any other type of registry would be useless for our purposes. A document level granularity would mean that the registry could only store and retrieve the dictionary as a full JXDD schema. This gives users no support in accessing and customizing individual components and defeats our purpose. Suppose we then choose a registry that has a component level granularity. It would be able to store the dictionary (a list of elements and types with definitions) piece by piece rather than lumped together in a single document. However, there is no way for any off-the-shelf registry to have knowledge of the Justice data model that the dictionary is based upon. This data model is very important - it has some relationships built into it that give the JXDD its power and flexibility.
Off the shelf, no registry would be able to utilize the JXDD to its full potential. Additionally, the registry would have no mechanism to build the schema subsets or any knowledge of how to do so. It is apparent from the volume of comments received that the need for a customized schema subset generation tool is immediate. Because there is no product right now that is capable of this, it must be built.
This tool should have the capabilities outlined in the requirements section above. The tool should provide a graphical user interface to allow users to search through the dictionary components, add constraints and customizations, and define customized schemas. The schema subset generation tool should take in user input and, from that input, generate a valid set of customized schema subsets, carefully formed to maintain its integrity and interoperability. The tool should then return the set of schemas to the user, who then becomes the owner of those files.
Future work: Although no off-the-shelf registry product is ready to meet our current needs, it might be possible for an existing registry to be modified so that it supports the full Justice data model and all of the requirements for building customized schema subsets. To start with, this would involve some research and comparison of different registry products and analysis of potential candidates to determine whether making such modifications is feasible. If so, awareness of the Justice data model and the capacity to build schema subsets could then be added. If it is not possible to make the necessary enhancements to a commercial registry, it becomes necessary to build a custom registry to fit the Justice data model.
After a registry is either modified or built, the back end of the schema subset generation tool will need to be changed to communicate with the registry. This allows code maintenance to be performed on the registry side and new versions of the JXDD to be handled automatically rather than forcing tool upgrades.
Will this tool be the only way to create schema subsets?
No. There are other ways this could be done. One step for the schema generation tool will be to translate the user input specifying how to build the customized schemas into an XML request file or wantlist. This will happen in the background, transparent to the user. The wantlist would be sent to the registry. The registry would process the file and then generate and return the customized schema subsets. The format of the request file should be publicly available, so that others can create their own front-ends and still use the registry to produce the actual schemas.
Another way to generate schema subsets would be to create and distribute a library that provides the same functionality as the registry tool. A third way would be for users to go through the set of full schemas, making restrictions and creating extension and document schemas by hand. A fourth might use the Extensible Stylesheet Language (XSL). There are probably many other ways that this work could be done. The benefit of using a JXDD schema subset generation tool is that if a user specifies valid input, an appropriately and consistently formed set of customized schema subsets will be returned. Without a thorough understanding of the Justice data model, it could be very easy to unintentionally break conformance.
SSGT: http://niem.gtri.gatech.edu/niemtools/ssgt/SSGT-Search.iepd
Constraint schemas allow users to place extra conditions on elements in addition to those already provided by GJXDM or NIEM. Using a constraint schema avoids having to define a valid subset schema for every situation where data needs vary depending on the context. Constraint schemas add the constraints that a particular organization requires. In other words, a constraint schema is a second layer of validation that an organization can use to verify that its data exchanges conform to its own needs when those needs are more specific than what the GJXDM or NIEM provide. Constraint schemas allow one to sidestep the natural side effect of global definitions, that is, having data represented one and only one way.
What one cannot do in a constraint schema is add new elements to a type, change the order of the elements as they are defined in the full GJXDM or NIEM schemas, or change an element's base type to a type that is not a valid subset of the original base type. For instance, a constraint schema could redefine an element originally defined as xsd:string to be xsd:decimal, but not the reverse: every decimal value can be written as a string, but not every string is a decimal. Any of the disallowed adjustments would require an extension schema.
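The string-to-decimal example can be made concrete with a short sketch. It assumes a simplified reading of the xsd:decimal lexical form (optional sign, digits, optional decimal point, no exponent); the helper name `is_decimal_lexical` is hypothetical.

```python
import re

# Simplified pattern for the xsd:decimal lexical form (an assumption of this
# sketch): optional sign, digits, optional fractional part, no exponent.
DECIMAL_PATTERN = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_decimal_lexical(text):
    """Return True if text would be acceptable under the tightened type."""
    return DECIMAL_PATTERN.fullmatch(text.strip()) is not None


# Any Python str stands in for a value of the original xsd:string type.
values = ["12.50", "-3", "twelve", "1e5"]
accepted = [v for v in values if is_decimal_lexical(v)]
```

Every value the tightened type accepts was already acceptable under the original string type, which is why the change is a valid narrowing; the reverse change would admit values such as "twelve" that the original decimal type rejects.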
When validating an instance document that was generated to meet a specification that includes only a constraint schema, simply change the schema location in the instance document so that it points to the full GJXDM schema rather than the constraint schema to perform the second part of the validation.
https://www.niem.gov/about-niem/news/niem-naming-and-design-rules-50-be…
Guidelines for XML Schemas: http://xml.coverpages.org/schemas.html
Since the full GJXDM and NIEM schemas include components that are optional and over-inclusive, users have the ability to retrieve only those components from the data dictionaries that they need. Many users will not want every element to be able to occur repeatedly. Furthermore, it is unlikely that a user will need to use the entire contents of the full schemas. This is the basic idea behind schema subsets—to provide smaller schemas that define only those components from the dictionary that the user wants to include.
The full GJXDM or NIEM schemas can be used, but it is not necessary to do so. Smaller schema subsets can be more manageable than the full schemas and will usually permit more rapid validation of document instances. The overriding rule for using GJXDM or NIEM schema subsets is as follows: Document instances that validate to the schema subset will validate to the full GJXDM or NIEM.
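The overriding rule can be illustrated with a toy model. This is a sketch under stated assumptions, not how a real validator works: a schema is reduced to a mapping from element name to an occurrence range, the element names are illustrative, and the function names are hypothetical. Real checks would be performed by an XSD validator.

```python
def is_valid(instance_counts, schema):
    """instance_counts: name -> occurrences; schema: name -> (min, max or None)."""
    if any(name not in schema for name in instance_counts):
        return False
    for name, (lo, hi) in schema.items():
        n = instance_counts.get(name, 0)
        if n < lo or (hi is not None and n > hi):
            return False
    return True


def is_conformant_subset(subset, full):
    """True if every instance valid against subset is also valid against full."""
    # Anything the full schema requires must still be present in the subset.
    for name, (full_lo, _) in full.items():
        if full_lo > 0 and name not in subset:
            return False
    for name, (lo, hi) in subset.items():
        if name not in full:
            return False  # new components belong in an extension schema
        full_lo, full_hi = full[name]
        if lo < full_lo:
            return False
        if full_hi is not None and (hi is None or hi > full_hi):
            return False
    return True


# The full model is optional and over-inclusive: everything is 0..unbounded.
full = {"PersonName": (0, None), "PersonBirthDate": (0, None), "PersonSSNID": (0, None)}
subset = {"PersonName": (1, 1), "PersonBirthDate": (0, 1)}

instance = {"PersonName": 1, "PersonBirthDate": 1}
subset_ok = is_conformant_subset(subset, full)
valid_in_both = is_valid(instance, subset) and is_valid(instance, full)
```

In this model the subset only removes components or narrows occurrence ranges, so an instance accepted by the subset cannot be rejected by the full schema.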
The Schema Subset Generation Tool (SSGT) produces compliant schema subsets. It is available at the following location:
NIEM SSGT: http://niem.gtri.gatech.edu/niemtools/ssgt/SSGT-Search.iepd
The GJXDM and NIEM schemas represent reference models that are intentionally optional and over-inclusive. The purpose of the reference model is to focus structure and semantics through well-defined types and properties, not to dictate which components to use or how to fine-tune their content.
Constraint schemas provide this fine-tuning in two different ways:
Encapsulating Cardinality
Constraint schemas can be used solely to constrain cardinality. This allows for cardinality validation via XSD. In this case, the constraint schema is simply a copy of the subset schema with "minOccurs" and "maxOccurs" attributes set. When validating, it takes the place of the subset schema.
There are utilities to help create constraint schemas from a subset schema. Constraint schemas are optional. Cardinality can also be enforced solely within the subset schema, at the cost of easy reuse.
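A constraint schema that only tightens cardinality can be derived mechanically from a subset schema. The sketch below uses a deliberately simplified schema fragment with illustrative element names; the function `constrain_cardinality` and its `rules` argument are hypothetical and do not describe any existing utility.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

# Simplified subset schema fragment (illustrative, not a complete schema).
SUBSET = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="PersonType">
    <xsd:sequence>
      <xsd:element ref="PersonName" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element ref="PersonBirthDate" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""


def constrain_cardinality(schema_text, rules):
    """Copy the schema and set minOccurs/maxOccurs for the referenced elements.

    rules maps an element reference to a (minOccurs, maxOccurs) pair.
    """
    root = ET.fromstring(schema_text)
    for el in root.iter(f"{{{XSD}}}element"):
        ref = el.get("ref")
        if ref in rules:
            lo, hi = rules[ref]
            el.set("minOccurs", str(lo))
            el.set("maxOccurs", str(hi))
    return root


# Require exactly one PersonName; leave PersonBirthDate as it was.
constraint = constrain_cardinality(SUBSET, {"PersonName": (1, 1)})
occurs = {
    el.get("ref"): (el.get("minOccurs"), el.get("maxOccurs"))
    for el in constraint.iter(f"{{{XSD}}}element")
}
```

The subset schema text is left unchanged, so it can still be reused in exchanges with different cardinality needs.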
Enforcing Business Rules
Constraint schemas can also be used to make modifications to subset schemas that go beyond what GJXDM or NIEM define, in order to enforce business rules. Practical exchanges of actual XML instances usually require much tighter constraints on elements and attributes than the schema representing the reference model allows. In a constraint schema, the user can restrict element occurrences and employ facets. An instance must pass both the conformance validation path and the constraint validation path. In other words, a two-path, parallel validation is required.
Conformance validation ensures proper use of the GJXDM and NIEM, and constraint validation (more restrictive) confirms that an instance adheres to rules specific to the user’s application. In many cases (though this is not a requirement), a constraint schema will be a copy of the GJXDM or NIEM schema subset (or of the full schema) with appropriate constraints applied.
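The two-path validation can be sketched as follows. The two checks are stand-ins written for this illustration; in practice each path would be an XSD validation run, one against the subset or full schema and one against the constraint schema. The function and variable names are hypothetical.

```python
def validate_two_paths(instance, conformance_check, constraint_check):
    """Run both validation paths independently; accept only if both pass."""
    results = {
        "conformance": conformance_check(instance),
        "constraint": constraint_check(instance),
    }
    results["accepted"] = results["conformance"] and results["constraint"]
    return results


# Stand-in checks over a toy instance (element name -> occurrence count).
KNOWN_ELEMENTS = {"PersonName", "PersonBirthDate"}


def conforms_to_model(instance):
    # Conformance path: only components defined by the model may appear.
    return set(instance) <= KNOWN_ELEMENTS


def meets_business_rules(instance):
    # Constraint path: this exchange requires exactly one PersonName.
    return instance.get("PersonName", 0) == 1


good = validate_two_paths({"PersonName": 1}, conforms_to_model, meets_business_rules)
bad = validate_two_paths({"PersonBirthDate": 1}, conforms_to_model, meets_business_rules)
```

The second instance uses only known components, so it passes the conformance path, but it is still rejected because it fails the stricter constraint path.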
The Department of Justice funded the Georgia Tech Research Institute (GTRI) to build a Subset Schema Generation Tool (SSGT) that would allow searching the GJXDM and creating customized subsets. A similar tool exists for NIEM.
These tools present the full set of GJXDM and NIEM types and elements for the user to select into a subset. Components can be added to the subset by performing searches for the specific types or elements that are required. Once selected, types and elements are included in the subset. Types and elements erroneously included in the subset can be removed as desired.
Once the desired subset is built, the user can generate the schema subset (which is in fact often a set of related schemas).
NIEM SSGT: https://staging.niem.gtri.gatech.edu/niemtools/ssgt/help.iepd;jsessioni…
SSGT Tutorial: http://vimeo.com/109940669
Consequently, the Department of Justice funded the Georgia Tech Research Institute (GTRI) to build a Subset Schema Generation Tool (SSGT). This tool presents the full set of GJXDM types and elements for the user to select into a subset. Types and elements can either be selected as they are encountered, or the user can search the GJXDM using the GJXDM Model Viewer (which is contained in the SSGT). Once selected, types and elements are included in the subset. Types and elements erroneously included in the subset can be removed as desired.
Once the desired subset is built, the user can generate the schema subset (which is in fact often a set of related schemas, as is the full GJXDM).
Schema Subset Generation Tool: https://www.niem.gov/tools-catalog
Wantlist Specification for the Schema Subset Generation Tool
The Load/Save function in the Schema Subset Generation Tool (SSGT) uses a Wantlist to preserve the details of a schema subset. The Wantlist format is XML. The current Wantlist schema specification is located at:
Wantlist Schema
Please note that this schema will change as capabilities are added to SSGT (such as global constraints).