
| CEOS
Working Group on Information Systems and Services CEOS INTeroperability EXtensions |
DocRef.:CEOS/WGISS/PTT/SDDCEOS/WGISS/CINTEX/CM |
| Issue: | Version 1.3 |
| Date: | April 1999 |
This document has been approved for publication by the Catalogue INTeroperability EXperiment (CINTEX) of the Committee on Earth Observation Satellites (CEOS) and reflects the consensus of the CINTEX technical panel experts from the CEOS member agencies.
1.2 Organization of the Collections Manual (CM)
1.3 ICS/CINTEX Development Process
1.4 A Guide To CINTEX Documents
1.7 Catalogue Interoperability
1.7.1 Purpose and Scope of Catalogue Interoperability
1.7.2.1 Design Approach: CIP Space, IGP Space and ICS
1.7.2.2 Collections Data Model
1.7.2.3 CIP as a Z39.50 Profile
1.7.2.5 Data Product Ordering and Security
1.7.2.6 Guide Documents in ICS
1.7.3 Levels of Compliance to CIP, IGP and ICS
2. The ICS Collections Model
2.1.1 Archive Collection Descriptor
2.1.2 Theme Collection Descriptors
2.1.3 Theme and Archive Collection Descriptors Summary
3.2.1.1 Formulating the Collections Structure
3.2.1.1.2 Analysis of Existing Data
3.2.1.1.3 Organizing the Data into a Collections Structure
3.2.1.1.3.1 Using the Query Model
3.2.1.1.3.2 Including Collections
3.2.1.1.3.3 Creating a Root Collection
3.2.1.1.4 Relating Collections/Guides
3.2.1.2 Review of the Collections Structure
3.2.2 Collections Database (CDB)
3.2.2.1.1 Identify Additional Elements/Attributes
3.2.2.1.2 Verify Multiplicities
3.2.2.1.3 Verify Mandatory Attributes
3.2.2.1.6 Adding Product Metadata
3.2.2.2.1 Scientific Review Process
3.2.2.2.2 Periodic Consistency Review Process
3.2.3.1 Adding a CIP Collection
3.2.3.2 Adding CIP Collection with Local Attributes
3.2.3.2.1 Creating a New Entry -- Present Service
3.2.3.2.2 Creating a New Entry -- Search and Present Service
3.3.1 Modifying the Collections Structure
3.3.1.2 Deleting Theme Collections
3.3.2 Modify Existing Collections in the Collections Database
3.3.2.1 Deleting Collections
3.3.2.2 Modifying Existing Product Metadata
3.3.2.3 Deleting Product Descriptor
3.3.3.1 Modifying the Collection Related Entries in the Explain Database
3.3.3.2 Valids in the Explain
3.3.3.2.1 Capturing the Valids
3.3.3.2.2 Maintaining the ICS Valids
3.3.3.2.2.1 CIP Attribute Valid Value Changes
3.3.3.2.3 Adding Local Valids for Local Attributes
Figure 1-1 Collection Management Activities
Figure 1-3 VENN Diagram of CIP Space and ICS
Figure 1-4 The Concept Of A Collection
Figure 3-1 Collections Process
Figure 3-2 Formulating the Collections Structure
Figure 3-3 Adding Collections to an Existing Collections Structure
Figure 3-4 Deleting a Theme Collection from an Existing Collections Structure
Figure 3-5 Including Collections
Figure 3-6 Including Remote Collections
Figure 3-7 Adding a Root Collection
Figure 3-8 Mandatory Data for Collection Descriptors
Figure 3-9 Mandatory Data for Product Descriptors
Figure 3-10 Add New CIP Collection Entry to Explain
Figure 3-11 Add Collection with Local Attributes to Explain -- Present
Figure 3-12 Add Collection with Local Attributes to Explain -- Search & Present
Figure 3-13 Adding Collections - Option "1"
Figure 3-14 Adding Collections - Option "2"
Figure 3-15 Adding Collections
Figure A-1 Collection Structure "A"
Figure A-2 Collection Structure "B"
Figure A-3 Collection Structure "C"
Figure A-4 Collection Structure "D"
Figure A-5 Collection Structure "E"
Table 2-1 Example of Combining Attributes
Table 2-2 What Collections May Include
Table 3-1 Product Descriptor Analysis Results Table
Table 3-2 Collection Descriptor Analysis Results Table (Archive Collections)
Table 3-3 Collection Descriptor Analysis Results Table (Archive & Theme Collections)
Table 3-4 Collection Descriptor Analysis Results Table (Remote Collections)
Table 3-5 Collection Descriptor Analysis Results Table (Root Collection)
Table 3-6 Collection Descriptor Analysis Results Table (Adding Present Service Column)
Table 3-7 Explain Data Class Valid and Default Descriptions
Table 3-8 AVHRR Collection Explain Data Class Selections
Table 3-9 Present Service Extensions -- Explain Data Class Selections
Table 3-10 Search & Present Service Extensions -- Explain Data Class Selections
Table 3-11 Updating Collection Descriptor Analysis Results Table
Table B-1 CIP Elements
Table B-2 CIP Item Descriptors ARS
Table B-3 CIP Sub-Elements ARS
Document Status Sheet
| April 1997 | First issue of document to CINTEX |
| May 1997 | CET Review - Hold for Completion of Specification Consolidation |
| September 1998 | CET Review - Document Reorganized per CET |
| February 1999 | CET Review |
| April 1999 | Baseline Version |
The first section provides general information about the contents of the Collections Manual, such as the Purpose, Glossary and Definitions. This information applies to all of the subsequent sections. This section also includes an overview of the Committee on Earth Observation Satellites (CEOS) INTeroperability EXtensions (CINTEX) approach to catalogue interoperability.
The second section addresses the ICS Collections Model. In this section the various types of ICS Collections are defined in terms of their descriptors. Additionally, detailed characteristics that further define the various Collections are also addressed in this section.
The third section addresses Collection Management activities that relate to the creation and maintenance of Collection Descriptors. These activities provide the framework and suggested strategy for creating Collections Structures and entries in the Collections Database (CDB). The components of the Explain Database that relate to Collection Management are also addressed in this section.
The organization of the Collections Manual is centered on the roles that each individual plays in support of Collections Management. Figure 1-1 depicts these roles and their interaction with the various activities described in this manual. To begin, the Data Provider supplies the mandatory metadata for their data Collections/products. The mandatory metadata, coupled with specific requirements obtained from the users as well as the Data Providers, serves as input to the analysis activities. The ICS Site Administrator (ISA) then constructs or modifies a Query Model from the results of the analysis, and creates or modifies a Collections Structure and the corresponding entries in the CDB and Explain Databases.

The ICS URD specifies user requirements for an interoperable infrastructure linking catalogue systems of different agencies. For this purpose it defines requirements for all interoperable components and the protocols needed for exchanging messages between them. (Currently, the URD does not reflect the Hypertext Transfer Protocol (HTTP) approach to guide documents and will be updated.)
The ICS SDD defines the elements and interfaces which comprise the CEOS Interoperable Catalogue System (ICS). The SDD provides diagrams showing the interrelations between ICS elements, scenarios to explain the dynamic interaction, a data model showing the data relations, the communications services utilized in ICS, and the system management approach for ICS. The SDD provides both design and tutorial information.
The CIP Specification defines the interoperable protocol for exchanging messages related to data search and data ordering. CIP is defined as a profile of the Z39.50 information retrieval standard (ANSI/NISO Z39.50-1995 [R5], adopted internationally as ISO 23950) with extensions for distributed searching using the Collections Model. The specification defines all CIP messages, as well as the attributes used for searching and the elements needed for retrieval. CIP may be used outside of the CEOS ICS. The CIP Specification is the definitive source for determining CIP compliance.
Additional information about CINTEX activities and documents can be found at:
http://ceos.ccrs.nrcan.gc.ca/taskteam/cip.html
| ARS | Abstract Record Structure |
| AVHRR | Advanced Very High-Resolution Radiometer |
| BNSC | British National Space Centre |
| CCRS | Canada Centre for Remote Sensing |
| CDB | Collections Database |
| CEO | Centre for Earth Observation (European Commission) |
| CEOS | Committee on Earth Observation Satellites |
| CERES | Clouds and the Earth’s Radiant Energy System |
| CINTEX | CEOS INTeroperability EXtensions |
| CIP | Catalogue Interoperability Protocol |
| CM | Collections Manual |
| CMT | Collection Management Tools |
| CNES | Centre National d'Etudes Spatiales (France) |
| CSIRO | Commonwealth Scientific and Industrial Research Organisation (Australia) |
| DB | Data Base |
| DBMS | Data Base Management System |
| DLR | Deutsches Zentrum für Luft- und Raumfahrt |
| EO | Earth Observation |
| ESA | European Space Agency |
| ESRIN | European Space Research Institute (ESA) |
| GEO | Geospatial Metadata Profile |
| GILS | Government Information Locator Service |
| GIS | Geographic Information System |
| GSFC | Goddard Space Flight Center (NASA) |
| HTML | Hyper Text Mark-up Language |
| HTTP | Hypertext Transfer Protocol (for consistency with HTML) |
| ICS | Interoperable Catalogue System |
| ID | IDentifier |
| IGP | ICS Guide Protocol |
| IRE-RAS | Institute of Radio Engineering and Electronics, Russian Academy of Sciences |
| ISA | ICS Systems Administrator |
| ISO | The International Organisation for Standardization |
| LaRC | Langley Research Center (NASA) |
| LIS | Lightning Imaging Sensor |
| MODIS | Moderate-Resolution Imaging Spectroradiometer |
| NASA | National Aeronautics and Space Administration (US) |
| NASDA | National Space Development Agency (Japan) |
| NOAA | National Oceanic and Atmospheric Administration (US) |
| NRSC | National Remote Sensing Center |
| NSRS | Natural Environment Research Council (BNSC) |
| PTT | Protocol Task Team (Part of CEOS-WGISS-AS) |
| QA | Quality Assurance |
| RM | Retrieval Manager |
| RPN | Reverse Polish Notation |
| SAGE | Stratospheric Aerosol and Gas Experiment III |
| SDD | System Design Document |
| SST | Sea Surface Temperature |
| TBD | To Be Determined |
| TBR | To Be Resolved |
| TBS | To Be Supplied |
| URD | User Requirements Document |
| URL | Uniform Resource Locator |
| VISSR | Visible and Infrared Spin Scan Radiometer |
| WGISS | Working Group on Information Systems and Services (Part of CEOS) |
| WWW | World Wide Web |
| Abstract Record Structure | An Abstract Record Structure is the primary component of a database schema. An Abstract Record Structure applied to a database record results in an abstract database record. |
| Archive | An Archive of EO data can hold various types of data, ranging from satellite images and climatological products processed from the images, to observation data and climatological statistics. An Archive may also contain information describing the EO data, as well as supplementary data such as design documentation, algorithm object and source code, technical reports, user manuals, etc.
There is likely to be a database management system for maintenance and low-level access to the data. The Archive will, in general, be accessed by a front-end archive server that then presents the data as requested by the Retrieval Manager. |
| Archive Collection | Group of related metadata based on the contents of the Archive. |
| Attribute | An attribute is a characteristic of a search term, or one of several characteristic components that together form a characteristic of a search term. |
| Catalogue Interoperability | The ability to provide a Data User with the appearance of a single, unified catalogue for all participating data providers. In order to provide catalogue interoperability, all participating data providers must support at least one common method (i.e. an API) for accessing functions such as authentication, directory, inventory, guide and order. Each supplier may support additional consumer functional interfaces to support their private data users. |
| Catalogue System | A Catalogue System provides services such as inventory, browse, directory, order and guide, which may be supplemented by further services, but should contain at a minimum, inventory. The CIP is the protocol that shall enable the many services of many catalogue systems to interoperate. Usually a catalogue system resides at a particular agency or data provider facility but may be distributed across catalogue sites. |
| Catalogue Translator | One of three types of ICS Translators. Catalogue Translator converts CIP messages into a data provider's protocol for the services of Inventory, Directory, Browse, Guide and Ordering. A detailed discussion about the various translators can be found in the ICS System Design Document, Section 3.5. |
| Collection | A Collection is (1) a group of related data items with certain common characteristics. Collections are generally defined by data providers, but may also be defined by users. (2) An abbreviated term for "Collection Descriptor". |
| Collection Category | The type of Collection i.e. Theme or Archive. |
| Collection Descriptor | Metadata description of a Collection. This descriptor includes pointers to items included in the Collection. The included items may themselves be Collections, thus forming a hierarchical Collections Structure. |
| Collection Management Tool | Used by the ISA for tasks involved with populating and maintaining the data in the Retrieval Manager. These tasks involve translating Collection or directory information into CIP Collection format, checking for valid entries and the presence of mandatory data. |
| Collections Structure | A Collections Structure is a logical organization of Collection and Product Descriptors. |
| Data Provider | Individual responsible for the Scientific Content of the data product(s). |
| Element | An element is the smallest unit of information used to define the schema elements, which in turn define a schema. |
| Element Sets | Element Sets are compilations of elements that form a view of the data. CIP has identified eight (8) Element Sets. Each of these sets is described below.
- Full (F): the Full element set includes all defined standard elements from the appropriate database schema (for the CIP, this is usually the CIP database schema, as defined in Appendix B), and so, when applied, results in a null transformation. This is a large set of elements, but it ensures that clients receive everything their users may need to evaluate the retrieval record for further processing.
- CIP Full (CF): the CIP Full element set includes all defined standard CIP elements from the CIP database schema as defined in Appendix B. When the CIP database schema is used, the Full and CIP Full element sets are therefore equivalent. However, when a custom database schema is used (i.e. a custom schema defined by a data provider and containing Local Attributes in addition to the standard CIP ones), the CIP Full element set contains only the standard CIP elements, whereas the Full element set contains all the elements defined in the custom database schema, i.e. the standard CIP elements and the custom local elements.
- Brief (B): the Brief element set includes a minimal subset of the defined standard schema elements available from the appropriate database schema.
- Summary (Sum): the Summary element set includes a subset of the defined standard schema elements that is appropriate for interoperability with the GEO profile.
- Browse (Br): the Browse element set includes a subset of the defined schema elements that is appropriate for retrieval of browse data alone.
- Options (Opt): the Options element set includes a subset of the defined schema elements that is appropriate for retrieval of options alone.
- Local Attributes (LA): the Local Attributes element set includes a subset of the defined standard schema elements that is appropriate for retrieval of the local attributes in a product descriptor.
- Collection Members (CM): the Collection Members element set includes a subset of the defined schema elements that is appropriate for retrieval of the collection hierarchy tree.
Appendix B of this document identifies the elements that are included in each of the above Element Sets. |
| Guide | Data that is available to the user to enhance understanding of the EO data, spacecraft, instrument, etc., and hence make a detailed analysis of whether the product data will be of value for a particular application. Guide data may also contain information necessary for processing the product data further, such as calibration coefficients. |
| ICS Site Administrator | The human operator who performs all tasks needed to establish and maintain a Retrieval Manager. In practice this is more than one person, as the tasks are of various types: a scientist for Collection definition, a database expert for maintaining the CDB, a system operator for diagnosing and correcting operational activities, etc. For convenience, this manual treats all of these tasks as being performed by the ISA. |
| Item Descriptor | An item descriptor comprises one or more attributes. The attributes describe the item in question in a consistent manner, therefore resulting in dynamically defined item instances. The item descriptor is used to represent a number of key objects in the CIP domain, such as the Collection Descriptor.
The attributes and their values that constitute the item descriptor can be searched on so as to identify a particular item descriptor or group of item descriptors. |
| Present Service | The present service allows the client to request response records corresponding to database records represented by a specified result set. |
| Product data | A unique aggregation of data generated from information held in, or to be held in an archive (for predicted products). It can be located and retrieved by a user via CIP, possibly following further processing, such as map projection, sub-setting, band selection, etc., after or during extraction of the raw data as stored in the archive. |
| Product Descriptor | Metadata description for product data. |
| Result Set | A local data structure used as a selection mechanism for the transfer of records, identified by a query. Its logical structure is a named ordered list of result set items, and possibly, unspecified information that may be used as a surrogate for the search that created the result set. |
| Retrieval Manager | A Retrieval Manager services (and may be installed at) each catalogue site. It is used to integrate the local catalogue systems and to provide communication between users and the Retrieval Managers of other catalogue sites. It is anticipated that each catalogue site would have at least one Retrieval Manager and that the Retrieval Manager would 'know about' or 'own' a number of Collections. The data within these Collections would be the responsibility of that Retrieval Manager, with external Collections referenced only and managed by their respective Retrieval Managers.
The Retrieval Managers at each catalogue site would also communicate with each other using the CIP. The Retrieval Manager would then also communicate with local catalogue servers, such as archives and inventories, within its own site to service requests received from users. Another key function of the Retrieval Manager is to route search queries to other relevant Retrieval Managers and consolidate the search results before returning them to the user. |
| Schema | A common understanding shared by the client and the server of the information contained in the records of the database. The schema is defined in terms of schema elements. |
| Search Service | The search service enables an origin to query databases at a target system, and to receive information about the results of the query. |
| Tag Set | A tag set is the set of identifiers for the elements. |
| Task Package | The set of attributes that describe an activity which is started by an Extended Services Request. Based on the Z39.50 definition of a Task Package. |
| Theme Collection | Group of related metadata based on a common theme or purpose. |
| Translators | Software element that converts CIP into the protocols used by a data provider. Three Translators are identified in ICS: Catalogue Translator, OHS Translator, UPS Translator. A detailed discussion about the various translators can be found in the ICS System Design Document, Section 3.5. |
[R1] Catalogue Interoperability Experiment (CINTEX) Development Plan, CEOS/WGISS/CINTEX/Plan, Issue 1.0, 19 July 1996, Committee on Earth Observation Satellites/CINTEX.
[R2] Interoperable Catalogue System (ICS) User Requirements Document (URD), CEOS/WGISS/CINTEX/ICS-URD, Issue 2.2, March 1997, Committee on Earth Observation Satellites/CINTEX.
[R3] Catalogue Interoperability Protocol (CIP) Specification - Release B, CEOS/WGISS/CINTEX/CIP, Issue 2.4, June 1998, Committee on Earth Observation Satellites/CINTEX.
[R4] Interoperable Catalogue System (ICS) System Design Document (SDD), CEOS/WGISS/CINTEX/SDD, Issue 1.4, June 1998, Committee on Earth Observation Satellites/CINTEX.
[R5] Information Retrieval (Z39.50): Application Service Definition and Protocol Specification, ANSI/NISO Z39.50-1995, Official Text, July 1995, Z39.50 Maintenance Agency.
[R6] ICS Guide Design and Protocol Specification, CEOS/WGISS/CINTEX/GDPS, Issue 1.1, July 1998, Committee on Earth Observation Satellites/CINTEX.
[R7] CIP Specification Valids, CEOS/WGISS/CINTEX/GDPS, Issue 0.5, August 1998, Committee on Earth Observation Satellites/CINTEX.
Catalogue interoperability may extend beyond just the members of CEOS in promoting data access within a wider community of EO data providers and eventually to non-EO data providers.
The ICS domain can be seen in Figure 1-2 as divided into two virtual domains;
To support transparent access to multiple catalogues, a three-tier structure was used to design the ICS space. Clients exchange messages with a middleware layer, which in turn interacts with multiple catalogue servers. The middleware provides the routing and translation services that allow client requests to be presented to the multiple heterogeneous catalogues. The middleware consists of two types of elements: Managers and Translators. Managers provide an access point for clients and route the requests to the various servers. Translators, bound with the clients and servers, translate CIP or IGP to and from the native protocol of the client or server. Future client and server developments may use CIP or IGP directly and hence not require translators.
This approach supports a diversity of clients and servers. Clients may be used directly by a human user or may be an agency system acting on behalf of a user. Depending on the design of an existing catalogue system, services may be provided by different servers and translators. Because the routing service provided by the middleware is independent of the type of service, separate translators may be provided for inventory, browse, ordering, and user profiles. This architecture is also applicable to small data providers, such as university research groups, who are unable to provide adequate middleware at their site but still wish to join the ICS domain. Their local catalogue, inventory and guide documents can be made available to the ICS community by the inclusion of appropriate descriptions within another agency's middleware.
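The division of labour between Managers and Translators can be sketched as follows. This is a minimal illustration only, not the ICS design: the class names, the dictionary-shaped request, and the per-service routing table are assumptions made for the example, and real CIP messages are defined in the CIP Specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical request shape; a real CIP request is a Z39.50-profile message.
CipRequest = Dict[str, str]


@dataclass
class Translator:
    """Wraps one native catalogue server behind a common interface (illustrative)."""
    name: str
    to_native: Callable[[CipRequest], str]    # common request -> native query
    run_native: Callable[[str], List[str]]    # native query -> native results

    def handle(self, request: CipRequest) -> List[str]:
        native_query = self.to_native(request)
        # Tag each hit with the translator name so the client can see its origin.
        return [f"{self.name}:{hit}" for hit in self.run_native(native_query)]


@dataclass
class Manager:
    """Access point for clients; routes a request to the translators of a service."""
    routes: Dict[str, List[Translator]] = field(default_factory=dict)

    def register(self, service: str, translator: Translator) -> None:
        # Routing is keyed by service type (inventory, browse, ordering, ...).
        self.routes.setdefault(service, []).append(translator)

    def dispatch(self, service: str, request: CipRequest) -> List[str]:
        results: List[str] = []
        for translator in self.routes.get(service, []):
            results.extend(translator.handle(request))
        return results
```

Because routing is keyed only by service name, a site could register a different translator for each of inventory, browse and ordering without changing the Manager.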
CIP Space is a protocol-centric view of catalogue interoperability and provides for the loosest coupling needed to achieve catalogue interoperability among a wide community of EO data providers and users of EO data. A range of design solutions is permitted by the CIP and IGP spaces. To provide for a higher degree of uniform services at the cost of additional agreements between agencies, additional design for interoperability is defined in the ICS design document. The additional design definitions pertain to the allocation of functionality and data amongst components, agreement on an underlying communication protocol, and agreement on how to conduct distributed system management of ICS. The difference between CIP Space and the ICS is depicted in Figure 1-3. CIP Space is defined by those CEOS agencies and other federations and organizations which provide catalogue services using CIP and/or guide services using IGP. Those CEOS agencies which provide services, communications and systems management compatible with the ICS design make up the ICS. It should be noted that while all ICS members must implement CIP, guide handling is considered an optional element of ICS, and an ICS member may choose not to implement IGP. Note that other federations may choose to use the ICS design as the basis for their federation.
Assuming query and result routing between geographically dispersed sites (see Figure 1-2), an agreed middleware layer and its interfaces to users and providers need to be in place. To define such a system, CINTEX has established the following CIP, IGP and ICS standards:
Collection Characteristics are terms that serve to further describe the roles of the various Collections within ICS. For example, Terminal Collections are those Collections that appear in terminal positions in a Collections Structure and thus have a semantic meaning that ICS recognizes and responds to. Therefore, it is important for the ISA to understand the following Collections Characteristics.
By definition, a Collection is a grouping of items that have something in common. A Collection may have members that have many or fully common attributes (Archive Collection), or a Collection may have members that have a common semantic theme, though only a small subset of common attributes (Theme Collection).
Further, CIP specifies a list of standard attributes that can be searched. Some of these standard attributes are mandatory for all Collection members (with different mandatory sets for different descriptor types), while some are optional (although commonly understood). Finally, some attributes can be locally defined.
Evolution over time:
Static members do not change over time. This can be a result of static underlying Collections or the mechanism used to create the Theme Collection such as a Volcanic Eruption Theme Collection whose members will more than likely remain static over time.
Dynamic members may change over time based on changes in the included Collections. It is envisioned that the majority of Collections will not be static, but will evolve as ICS is used. This will occur in response to the way in which the user community wants to view relationships between the various data held by ICS, and the ways the CEOS Agencies wish to respond to those desires. Dynamic membership will require close supervision by the Retrieval Manager Administrator to ensure that the Collection Descriptor information is current.
Terminal Collection
Identifier
Each member (collection descriptor) of a Collections Structure must have an identifier that is unique within the Retrieval Manager. This unique identifier of a Collection Descriptor (Archive or Theme) will include the Retrieval Manager (RM) identifier and the Collection Identifier. A Product ID is assumed to be unique within its home Archive Collection.
Uniqueness:
A Collection Member (included item descriptor ID) may be a member of more than one Collection. However, duplicate members (included item descriptor IDs) must not be visible within a single Collection. For example, the Provider Archive Collection AMSR on ADEOSII could not appear twice as an included item descriptor ID in a Collection that contained both the Sea Surface Temperature Collection and the Andes Event Collection. This property is known as elimination of duplicates to achieve uniqueness.
In the case of a Collection that is a child of two or more included Collections, any operation, such as a search, that traverses the Collection Tree from the top-level Collection will end up repeatedly visiting the child Collection. The unique Collection Identifier provides a means of preventing repeated operations on the same Collection. This is achieved by noting which tree nodes (unique identifiers) have already been visited and skipping those nodes for the remainder of the same search. The Retrieval Manager performs this function.
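The visited-node bookkeeping described above can be sketched as a depth-first traversal that records each unique identifier once. This is an illustrative sketch, not the Retrieval Manager implementation: the in-memory `includes` mapping and the `RM/Collection` identifier format are assumptions made for the example.

```python
from typing import Dict, List, Set


def traverse_once(root: str, includes: Dict[str, List[str]]) -> List[str]:
    """Visit every Collection reachable from root exactly once.

    includes maps a Collection identifier to the identifiers of the
    Collections it includes (a hypothetical in-memory Collections Structure).
    """
    visited: Set[str] = set()
    order: List[str] = []
    stack = [root]
    while stack:
        collection_id = stack.pop()
        if collection_id in visited:
            continue  # already handled during this search, skip the repeat
        visited.add(collection_id)
        order.append(collection_id)
        # Push children in reverse so they are visited in their listed order.
        stack.extend(reversed(includes.get(collection_id, [])))
    return order


# Hypothetical structure: AMSR is included by both SST and ANDES.
structure = {
    "RM1/ROOT": ["RM1/SST", "RM1/ANDES"],
    "RM1/SST": ["RM1/AMSR"],
    "RM1/ANDES": ["RM1/AMSR"],
}
print(traverse_once("RM1/ROOT", structure))
# prints ['RM1/ROOT', 'RM1/SST', 'RM1/AMSR', 'RM1/ANDES']
```

The same visited set also protects against accidental cycles in a Collections Structure, since a node that includes one of its own ancestors is simply skipped on the second encounter.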
Remote Collections:
Remote Collections are members of a Collection Hierarchy whose descriptor information is stored or maintained at a CIP site other than the one in which it is listed as an included remote item.
Normally, a Collections Structure would be held in one place (for example as a database on a computer). A logical Collection Tree is where one or more of the members are held elsewhere - the complete Collection Tree thus spans multiple sites. If a Collection references (lists an included item descriptor or included task package) a Member Collection at a remote site, this Member Collection is termed a ‘remote member’.
Collection Descriptors do not have to maintain information about which Remote Parent Collections refer to them; remote members are indistinguishable from local Collection Members from the user's point of view. This concept is supported by the consistent use of Uniform Resource Locators (URLs) to identify Collection Descriptors, in the same manner as the complete World Wide Web (WWW) is seen by the user as a single database. A Retrieval Manager 'owns' those Collections for which it stores and maintains the attributes; it stores only the pointers (included item descriptor IDs or included task packages) to remote members, not their attributes and values.
No attribute or value of an attribute for a remote member, or the pointer (included item descriptor/task package name) to the remote member, can be guaranteed. The Retrieval Manager where the remote member is stored may not be available; the remote member may have changed its data structure (adding, changing or deleting attributes), or the remote member may have been deleted from the remote Retrieval Manager.
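Because nothing about a remote member can be guaranteed, code that expands a Collection needs to tolerate missing or unreachable members. The following is a minimal sketch under stated assumptions: descriptors are plain dictionaries, `local_store` holds the descriptors this Retrieval Manager owns, and `fetch_remote` is a hypothetical callable that returns `None` for a deleted member and raises `ConnectionError` when the remote Retrieval Manager is unavailable.

```python
from typing import Callable, Dict, List, Optional, Tuple

Descriptor = Dict[str, str]


def resolve_members(
    member_ids: List[str],
    local_store: Dict[str, Descriptor],
    fetch_remote: Callable[[str], Optional[Descriptor]],
) -> Tuple[List[Descriptor], List[str]]:
    """Return (resolved descriptors, identifiers that could not be resolved)."""
    resolved: List[Descriptor] = []
    unavailable: List[str] = []
    for member_id in member_ids:
        if member_id in local_store:
            # Owned locally: attributes and values are stored here.
            resolved.append(local_store[member_id])
            continue
        try:
            # Only the pointer is held locally; the attributes live remotely.
            descriptor = fetch_remote(member_id)
        except ConnectionError:
            descriptor = None  # remote Retrieval Manager not reachable
        if descriptor is None:
            unavailable.append(member_id)  # unreachable or deleted remotely
        else:
            resolved.append(descriptor)
    return resolved, unavailable
```

Returning the unresolved identifiers separately, rather than failing the whole operation, lets a caller report partial results to the user.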
Related Collections:
Collections may be related to one another without the need for a "parent-child" or the "include" construct. The relation may be through content or purpose, for example, and allows the spanning of one Collection Tree to another. A Collection Descriptor will contain a list (possibly empty) of related Collections as part of its content.
Local Attributes
Local Attributes are Collection-specific characteristics. The existence of Local Attributes may be specified within a Collection Descriptor by setting the Local Use Attribute Flag element. A flag of 0 indicates that the Collection Descriptor does not contain Local Attributes. A flag of 1 indicates that the Local Attributes are described within the Collection Descriptor, with the corresponding values captured in the member Product Descriptors. A flag of 2 indicates that the Local Attributes are described in the Explain Database, with the values for the attributes captured in the Product Descriptor. Local Attributes Using the Collections Database and Local Attributes Using Explain are addressed in Section 3.
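The three flag values can be captured as an enumeration so that tools reject any other value. This is a sketch only: the numeric values 0, 1 and 2 come from the text above, while the member names and the helper function are assumptions made for the example.

```python
from enum import IntEnum
from typing import Optional


class LocalUseAttributeFlag(IntEnum):
    """Values of the Local Use Attribute Flag (member names are illustrative)."""
    NO_LOCAL_ATTRIBUTES = 0
    DESCRIBED_IN_COLLECTION_DESCRIPTOR = 1
    DESCRIBED_IN_EXPLAIN_DATABASE = 2


def local_attribute_definitions_location(flag_value: int) -> Optional[str]:
    """Say where the Local Attribute descriptions are held, or None if there are none."""
    # IntEnum lookup raises ValueError for anything other than 0, 1 or 2.
    flag = LocalUseAttributeFlag(flag_value)
    if flag is LocalUseAttributeFlag.NO_LOCAL_ATTRIBUTES:
        return None
    if flag is LocalUseAttributeFlag.DESCRIBED_IN_COLLECTION_DESCRIPTOR:
        return "collection descriptor"
    return "explain database"
```

In both non-zero cases the attribute values themselves are held in the Product Descriptors; the flag only tells a client where to find the descriptions.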
Ordering Nodes
TBD
Additionally, the user may request that the search be contained locally to the target Retrieval Manager (i.e., a local search), or request that the search be propagated to other Retrieval Managers based on the Collections (i.e., a distributed search).
- Collection Search: finds Collection Descriptors of interest without searching included products.
- Product Search: finds individual product descriptors that may eventually lead to the order of an actual data product.
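The combination of search type and search scope can be sketched as two small enumerations plus a helper that decides which Retrieval Managers a search would reach. This is a hedged illustration: the enumeration names, the helper, and the `RM/Collection` identifier format used to derive the owning Retrieval Manager are assumptions made for the example, not part of the CIP Specification.

```python
from enum import Enum
from typing import List


class SearchKind(Enum):
    COLLECTION = "collection"  # finds Collection Descriptors only
    PRODUCT = "product"        # finds individual Product Descriptors


class SearchScope(Enum):
    LOCAL = "local"              # confined to the target Retrieval Manager
    DISTRIBUTED = "distributed"  # propagated based on the Collections


def retrieval_managers_to_query(
    scope: SearchScope, target_rm: str, member_ids: List[str]
) -> List[str]:
    """List the Retrieval Managers a search would reach, target first."""
    managers = [target_rm]
    if scope is SearchScope.LOCAL:
        return managers
    for member_id in member_ids:
        # Assumed identifier layout: "<RM identifier>/<Collection identifier>".
        owner = member_id.split("/", 1)[0]
        if owner not in managers:
            managers.append(owner)
    return managers
```

For example, a distributed search on a Collection whose members are `RM1/SST` and `RM2/AVHRR` would reach `RM1` and `RM2`, whereas the same search with local scope would stay on `RM1`.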
The GEO Profile supports Geographic Information Systems (GIS) applications and thus is of special interest to users of EO data. For this reason an alignment of the CIP and GEO profiles has been made. The objective of this alignment has been to allow both GEO and CIP clients to search and retrieve records from databases defined by either profile, and thereby maximize interoperability. The alignment was helped by the similarity of the spatial and temporal attributes of the metadata, but has had to take into account the different data models in CIP and GEO. It should be emphasized that the CIP/GEO interoperability is for search on the intersection of CIP and GEO attributes and the retrieval of item descriptors. There is no interoperability on the more advanced functions of CIP such as ordering and security.
This system is not based on Z39.50 and is not a mandatory
capability of an ICS node. However there is a strong linkage between the
CIP client/retrieval manager and the HTTP based client and indexing method
for Guide Documents. To allow coordinated access to catalogues and documents
an ICS client was designed with a CIP Client component and an IGP Client
Component. The ingest of documents and Collections into the ICS is coordinated
by the Collection Management Tool (CMT) to assist in maintaining the consistency
of the Collection descriptor and the HTTP index that enable search and
access of Guide Documents. Further details of this ingest process are discussed
in this Collections Manual. The specific design of the guide system can
be found in the ICS Guide Design and Protocol Specification [R7].
IGP has only one compliance level, full compliance, so a system
that does not implement the full IGP Specification cannot be
considered IGP compliant.
In addition to these two primary categories of Collection Descriptors ICS also supports the capturing of Product Descriptors. These descriptors describe the contents of individual product data that is stored in the local archives.
This section of the Collections Manual addresses the rules associated with the creation of Collection and Product Descriptors. Each subsection will begin with a brief definition of the object (Collection or Product Descriptor) followed by the mandatory information for each object and a discussion of the rules surrounding their creation. The purpose for this discussion is to ensure that the ISAs create these objects with the same semantic definition. This in turn will provide an important step towards data interoperability.
Archive Collection Descriptors are aggregations of product descriptor information. One Archive Collection Descriptor will typically exist for many data products. The common information across these products is reflected in the Collection Descriptor for the Archive Collection. For Collections to be considered Archive the following should be true:
2. All of the product descriptors reflected in the Archive Collection have the same set of elements in their description.
What They Contain:
These Collection Descriptors must contain:
- Spatial Coverage
- Included Product Descriptors
- Keyword (Spatial)
- Keyword (Temporal)
The Data Provider will create these Collections from the information
contained in the local inventory system.
Theme Collections are Collections that are based on themes or topics of interest. They are a mechanism for organizing the Earth Observation Data into manageable sets of information. For example, a Data Provider may desire to organize several of his Archive Collections into a Sea Surface Temperature Theme Collection that may or may not contain homogeneous data. For example, the geographic extent for each of the Archive Collections may be non-overlapping, thus resulting in a combined global extent for the new Theme Collection. Or, several of the Archive Collections may contain Atmospheric Geophysical Parameter Data, while the remaining Archive Collections address Sea Surface Temperature Measurements. Therefore, Theme Collections allow the ISA and/or Data Provider to combine many existing Collections into a single Collection.
What They Contain:
Each Theme Collection must
- Contain as a minimum the mandatory data identified in Appendix B Table B-2 and B-3, for the Collection Descriptor, and illustrated in the Mandatory Data for Collections Figure 3-8, Section 3.2, of this document.
- Contain at least one (1) Included Item Descriptor ID.
- Eventually trace to at least one (1) Provider Archive Collection so that the goals of ICS can be met.
Additionally, these Collections, like the Archive Collections, may also contain dynamic or static data. The content of the Static Collection will not vary over time. For example, a Science Data Provider may choose to create a Theme Collection which will reference all known Collection Descriptors that contain information about a specific event, i.e., the Mid West Flood of US in 1992. Once established the content of this Collection would not necessarily vary. On the other hand, the dynamic Collection will change as frequently as the underlying (included) Collection/Product Descriptors change. For example, assume that a Sea Surface Temperature Theme Collection has been established that includes several Archive Collections. Also assume that these Collections are still in the process of being continually updated with additional data product information. The additional data product information has forced a change to the Temporal Coverage for the related Archive Collections. Therefore, the Sea Surface Temperature Theme Collection’s Temporal Coverage would also be updated to reflect the changes in the underlying Archive Collections.
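The structural rules for Theme Collections given above (at least one included item, and an eventual trace to an Archive Collection) can be checked with a short traversal. The following is a minimal sketch under stated assumptions: the dictionary layout with `category` and `includes` keys and the function names are hypothetical, and only collection-to-collection links are followed.

```python
def traces_to_archive(collection_id, collections, _seen=None):
    """Return True if collection_id is, or eventually includes, an Archive Collection.

    `collections` maps a collection ID to a dict with the hypothetical keys
    "category" ("Theme" or "Archive") and "includes" (list of included IDs).
    """
    seen = set() if _seen is None else _seen
    if collection_id in seen:
        return False  # already visited; prevents looping on circular inclusions
    seen.add(collection_id)
    entry = collections.get(collection_id)
    if entry is None:
        return False  # unknown ID (for example a product or a remote collection)
    if entry["category"] == "Archive":
        return True
    return any(traces_to_archive(child, collections, seen)
               for child in entry["includes"])


def theme_is_valid(collection_id, collections):
    """Check the two structural rules for a Theme Collection."""
    entry = collections[collection_id]
    return bool(entry["includes"]) and traces_to_archive(collection_id, collections)
```

The mandatory-attribute rule is not covered here, because the attribute list lives in Appendix B and is not reproduced in this section.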
How They Are Created:
There are several creators of the Theme Collections. The Data Provider may create Theme Collections to support his user communities or research activities. He may include in his Theme Collection other Theme Collection, Product, or Archive Collection Descriptors. Table 2-1 provides an example of combining attributes from various Collections to create a single Theme Collection Descriptor. Column one identifies the attributes that will need to be combined across Collection Descriptors. Columns two through four identify the values of the attributes in the existing Collection Descriptors that will serve as input to the new Collection Descriptor in column five. Column five identifies the results of combining the values of the attributes in columns two through four for the new Theme Collection Descriptor. Column six provides comments on combining the attributes reflected in column one.
| Attributes | Theme Collection (Input Collection) | Archive Collection (Input Collection) | Archive Collection (Input Collection) | Theme Collection (New Collection) | Comments |
| ArchivingCentreId | N/A | GSFC | LaRC | N/A | |
| CollectionCategory | Registered | Registered | Registered | Registered | This is a default value for all collections |
| CollectionHierarchyCategory | Theme | Archive | Archive | Theme | |
| CollectionHierarchyPosition | Non-Terminal | Terminal | Terminal | Non-Terminal | |
| EndDate | 971201 | 980101 | 981201 | 981201 | |
| GeospatialForm | Model | Model | Remote-Sensing Image | Model, Remote-Sensing Image | |
| ItemDescriptorID | z39.50s://larc.nasa.gov/thc_1 | z39.50s://larc.nasa.gov/arc_fire_ax_isccp_dx | z39.50s://larc.nasa.gov/arc_atm_ceres_01 | z39.50s://larc.nasa.gov/thc_2 | Unique ID for this new Theme Collection |
| ItemDescriptorLanguage | English | Italian | French | English | Language for this new collection |
| ItemDescriptorName | AirQualLead | AirQualMts | AirQualUrban | AirQual | |
| RevisionDate | 960101 | 960101 | 961201 | 990401 | This should reflect the date of creation of the new Theme Collection. |
| StartDate | 960101 | 960101 | 961201 | 960101 | Earliest date of the combined collections. |
| Northboundingcoordinate | 40 | 45 | 23 | 45 | Take max. value |
| Purpose | Monitoring of Air Quality | Air Quality Urban | Air Quality Mountains | Monitoring of Air Quality Urban, Mountains | Semantic accumulation |
| ThemeKeyword | Earth Science>Atmosphere>Air Quality>Lead | Earth Science>Atmosphere>Air Quality>Carbon Monoxide | Earth Science>Atmosphere>Air Quality>Emissions | Earth Science>Atmosphere>Air Quality | |
| Local Attributes | None | a,b,c,d,e | a,c,d,f | Not Allowed | Combining Local Attributes is not allowed |
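The combination rules shown in Table 2-1 for the date, coordinate, and GeospatialForm attributes can be sketched as follows. This is an illustrative sketch, not part of ICS: the function names are hypothetical, dates are assumed to be YYMMDD strings as in the table, and the two-digit year is resolved with Python's `%y` convention (69-99 map to 1969-1999). Local Attributes are deliberately not carried over, since the table states that combining them is not allowed.

```python
from datetime import datetime


def parse_yymmdd(value):
    """Parse a YYMMDD string such as "981201" into a datetime."""
    return datetime.strptime(value, "%y%m%d")


def combine_for_theme(inputs):
    """Combine selected attributes of input Collection Descriptors.

    Each input is a dict keyed by the attribute names used in Table 2-1;
    "GeospatialForm" is assumed to be a list of strings.
    """
    forms = []
    for collection in inputs:
        for form in collection["GeospatialForm"]:
            if form not in forms:  # keep first-seen order, drop duplicates
                forms.append(form)
    return {
        # earliest start and latest end across all inputs
        "StartDate": min((c["StartDate"] for c in inputs), key=parse_yymmdd),
        "EndDate": max((c["EndDate"] for c in inputs), key=parse_yymmdd),
        # "Take max. value" per the table comment
        "Northboundingcoordinate": max(c["Northboundingcoordinate"] for c in inputs),
        "GeospatialForm": forms,
    }
```

Applied to the three input columns of Table 2-1, this yields the StartDate, EndDate, Northboundingcoordinate, and GeospatialForm values listed in the New Collection column. Attributes such as Purpose and ThemeKeyword require semantic judgment and are not automated here.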
Product Descriptors are the metadata that describe the instances of the product data that are archived in the local system. This metadata is used to search and retrieve information about the product data.
What They Contain:
All Product Descriptors must
- be referenced in an associated Archive Collection. Therefore, an Archive Collection must exist for each Product Descriptor or group of Product Descriptors.
- point to their associated product data instance. This is achieved by associating the Product ID contained in the Item Descriptor ID with the product data instance ID contained in the local inventory.
- contain the mandatory attributes specified in Appendix B, Table B-2 and B-3, and illustrated in the Mandatory Data for Products, Figure 3-9 of this manual.
How They Are Created:
Product Descriptors are primarily created from the local inventory descriptions. This creation activity centers around the mapping of existing product information contained in the local inventory to the CIP Product Descriptor Schema Definition (see Appendix B, Table B-2). This is a multi-step operation that requires knowledge of the site's existing inventory and the CIP Data Definitions. This process is defined on the CEO Web Page (TBS).
Each task identified in the above process may be the responsibility
of the ISA or shared among various individuals who may be responsible for
various aspects of the data contained in the ICS. For example, a Data Analyst
may assist in analyzing the EO Data and establishing a Collections Structure;
a Scientist may assist in the scientific review activity and a Database
Designer may be responsible for the creation and modification of the database
entries.
1. Ensuring that the information contained in the database conforms to the specified rules:
- Mandatory attributes exist,
- Valids are correct,
- Data Types for each attribute are correct.
2. Ensuring that the information contained in the Collections Structure conforms to the specified rules (Ref Section 2), for Archive Collections, Theme Collections, and Product Descriptors. Reference Section 3.2.1 of this document for a discussion on Collections Structures.
3. Ensuring that the Collections Structure is referentially correct which implies that the included and related items exist either locally or remotely.
4. Ensuring that all of the Collection IDs are unique within the Retrieval Manager.
5. Working with the Data Providers in defining Collections that satisfy the users' needs.
6. Ensuring that the designated scientific review group has scientifically reviewed the data.
7. Ensuring that any data mappings that may have occurred between existing systems and ICS are semantically correct.
8. Ensuring that the Terminal Theme Collection’s included Product Descriptors contain the related Archive Collection.
9. Ensuring that the Product Descriptors have been previously defined.
10. Ensuring that the Query Model has been defined.
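Two of the checks in the list above, uniqueness of Collection IDs within a Retrieval Manager and referential correctness of included and related items, lend themselves to automation. The sketch below is an assumption-laden illustration: the descriptor layout (`id`, `includes`, `related` keys) and the `known_remote_ids` parameter are hypothetical and do not correspond to any ICS tool.

```python
from collections import Counter


def duplicate_ids(descriptors):
    """Return the sorted list of IDs that occur more than once."""
    counts = Counter(d["id"] for d in descriptors)
    return sorted(item_id for item_id, n in counts.items() if n > 1)


def dangling_references(descriptors, known_remote_ids=frozenset()):
    """Return sorted (descriptor ID, referenced ID) pairs that cannot be resolved.

    A reference is resolved if it names a local descriptor or appears in
    `known_remote_ids`, a caller-supplied set of IDs known to exist remotely.
    """
    local_ids = {d["id"] for d in descriptors}
    missing = set()
    for d in descriptors:
        for ref in d.get("includes", []) + d.get("related", []):
            if ref not in local_ids and ref not in known_remote_ids:
                missing.add((d["id"], ref))
    return sorted(missing)
```

An empty result from both functions indicates that these two checks pass. The remaining items in the list, such as scientific review and semantic correctness of mappings, require human judgment.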
· The use attributes identified in Appendix B Table B-2 and B-3;
· The schemas;
· The tag sets;
· The Collection Identifiers;
· The record syntax information identified in the CIP Release B Specification as SUTRS, GRS-1, EXPLAIN, and ES Task Package.
2. That the use and present attributes/elements are identified in the Explain Database.
3. That logical consistency exists between the various
data objects in the Explain Database.
Collection Structures are site-specific data organizational strategies. Like any data organizational strategy, themes or aggregation topics are used to organize the data. These data organizational strategies will vary from site to site depending on the Collection Categories and the desired associations between the various Collections. Therefore, a standard software procedure to ensure that themes or aggregation topics are consistent across sites is not possible. It may be possible for a local site to develop site-specific software to ensure that the site's Theme Collections are consistent within the site's Collections Structure; however, this is external to the ICS Retrieval Manager. Figure 3-2 illustrates a proposed process for determining the appropriate Collections Structure for the site's data holdings. Additionally, Figures 3-3 and 3-4 illustrate a process for modifying an existing structure. The shaded processes in Figures 3-2, 3-3 and 3-4 are addressed in Sections 3.2.2 and 3.2.3 of this document.
In ICS the basic query parameters are the mandatory use attributes that are specified in Appendix B Table B-2 and B-3. In addition to these parameters the site may identify additional use attributes from the optional use attributes, also specified in Appendix B Table B-2 and B-3, or identify their own set of local use attributes. Through this specialization, site-specific Use Attributes can be developed that will represent the site-specific query requirements. For example each Query Model, regardless of its heritage (ICS and/or Site-Specific) will contain search parameters. The ICS Use Attributes for a product descriptor include Bounding Rectangle, Product Descriptor, Spatial Coverage, Temporal Coverage and Temporal Range search parameters. For the Collection Descriptor the Bounding Rectangle, Collection Descriptor, Collection Type, Data Originator, Included Collections Descriptors, Included Item Descriptors, Instrument Sensor, Keywords, Platform, Product Collection Specific, Revision, Spatial Coverage, Temporal Coverage, and Theme Keywords, are specified as search parameters. A site-specific capability may contain all of the above in addition to site-specific parameters such as Data Centre Name.
In addition to the parameters, what makes a Query Capability a Query Model is the coupling of the parameters, relationships, and values for the parameters. For example, assume that a user frequently requests all AVHRR and Atmospheric Collections. A Query Model would then be developed that specifies Sensor=AVHRR and Theme Keyword=Atmosphere. In the first pair, "Sensor" is the parameter, "=" is the relationship, and "AVHRR" is the value. There can exist any number of parameter = value pairs in a Site Query Model.
The user still has available, for searching purposes, the basic ICS Use Attributes. The Query Model does not eliminate this capability; it merely extends the query capability to another level of specificity.
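The coupling of parameter, relationship, and value described above can be represented as a list of triples. The following is a minimal sketch under stated assumptions: the `Criterion` class, the `RELATIONSHIPS` table, and the flat dictionary used for a descriptor are hypothetical, and only the "=" relationship from the example is supported.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One parameter/relationship/value triple of a Query Model."""
    parameter: str
    relationship: str
    value: str


# Only "=" appears in the example; other relationships would be added here.
RELATIONSHIPS = {"=": operator.eq}


def matches(descriptor, query_model):
    """Return True if the descriptor satisfies every criterion in the model."""
    return all(
        RELATIONSHIPS[c.relationship](descriptor.get(c.parameter), c.value)
        for c in query_model
    )


# The Query Model from the example: Sensor=AVHRR and Theme Keyword=Atmosphere.
AVHRR_ATMOSPHERE = [
    Criterion("Sensor", "=", "AVHRR"),
    Criterion("Theme Keyword", "=", "Atmosphere"),
]
```

A descriptor with Sensor "AVHRR" and Theme Keyword "Atmosphere" matches this model, while one with Sensor "CERES" does not.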
The two primary sources for the components of the Query
Model (parameters, relationships and values) are the user requirements
and data access patterns.
It may be useful to construct several tables that identify and record the characteristics for each of the collections and/or products. A Collection Table would capture the Collection characteristics, while a Product Table would capture the product characteristics. The purpose of these tables is to assist in identifying the characteristics for the descriptors, recording the values for the attributes among the set of Collections and/or products, constructing the Collections Structure, and populating the Collections Database. Tables 3-1 and 3-2 provide examples of the structure and content of these tables. As a minimum, it is recommended that the ICS Query Parameters serve as the column headings. In the example Tables 3-1 and 3-2 illustrated below, only a subset of the above Query Parameters was used to demonstrate this concept.
| z39.50s://larc.nasa.gov/pidax_dx_noa11_9206 | Product | Albedo | AVHRR | 920601 920630 | 39.9000 -39,9000 10.1000 -5.1000 | |
| z39.50s://larc.nasa.gov/pidax_dx_noa12_9206 | Product | Pressure | AVHRR | 920601 920630 | 39.9000 -39,9000 10.1000 -5.1000 | |
| z39.50s://larc.nasa.gov/piddx_8902_noaa11 | Product | Ice | AVHRR | 890201 890228 | 90.0000 -90.0000 -180.0000 180.0000 | |
| z39.50s://larc.nasa.gov/piddx_8901_noaa11 | Product | Snow | AVHRR | 890101 890131 | 90.0000 -90.0000 -180.0000 180.0000 | |
| z39.50s://larc.nasa.gov/pidmx_02_ceress2 | Product | Radiation Flux | CERES | 980101 980102 | 39.9000 -39,9000 10.1000 -5.1000 |