Core Catalogue Features

Self-Description Storage and Lifecycle

A Catalogue contains two types of storage: the Self-Description Storage and the Self-Description Graph. The Self-Description Storage contains the raw Self-Descriptions, serialized as JSON-LD files according to ADR-001, together with additional metadata concerning their lifecycle. Since Self-Descriptions are protected by cryptographic signatures, they are immutable and cannot be changed once published; only their lifecycle state, which is recorded in the additional metadata, can change. There are four possible lifecycle states. The default state is “active”. The other three states are terminal, i.e., no further state transitions follow them.

  1. Active
  2. End-of-Life (after a timeout date, e.g., the expiry of a cryptographic signature)
  3. Deprecated (by a newer Self-Description)
  4. Revoked (by the original issuer or a trusted party, e.g., because it contained wrong or fraudulent information)
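
The lifecycle rules above can be sketched as a small state machine. This is a minimal illustration, not part of the specification: the names LifecycleState, TERMINAL_STATES and transition are hypothetical, and the string values of the states are assumptions.

```python
from enum import Enum


class LifecycleState(Enum):
    # String values are illustrative assumptions, not specified identifiers.
    ACTIVE = "active"
    END_OF_LIFE = "end-of-life"
    DEPRECATED = "deprecated"
    REVOKED = "revoked"


# Only "active" has outgoing transitions; the other three states are terminal.
TERMINAL_STATES = frozenset(
    {LifecycleState.END_OF_LIFE, LifecycleState.DEPRECATED, LifecycleState.REVOKED}
)


def transition(current: LifecycleState, target: LifecycleState) -> LifecycleState:
    """Return the new state, or raise ValueError if the transition is not allowed."""
    if current in TERMINAL_STATES:
        raise ValueError(f"{current.value} is terminal; no further transitions")
    if target is LifecycleState.ACTIVE:
        raise ValueError("'active' is the default state, not a transition target")
    return target
```

Because the Self-Description itself is immutable, a function like this would only ever touch the lifecycle metadata stored alongside the JSON-LD file.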

The Catalogues provide access to the raw Self-Descriptions that are currently loaded, including their lifecycle metadata. This allows Consumers to verify the Self-Descriptions and their cryptographic proofs in a self-sovereign manner.

Self-Description Graph

The Self-Description Graph contains the information imported from the Self-Descriptions that are known to the Catalogue and in an “active” lifecycle state. The Self-Description Graph exists as a separate storage type because it is better suited for complex queries across Self-Descriptions. The graph is read-only from a user perspective. The only way to change the graph is via the lifecycle of the stored Self-Descriptions.

The raw JSON-LD files remain the ground truth. For example, if the Self-Description schemas are updated, the entire graph can be rebuilt from the JSON-LD files. To scale up the Catalogue, parallel instances of the Self-Description Graph can be generated to process queries independently. As a consequence, there might be a delay of a few minutes until a Self-Description that is loaded into the Self-Description Storage is imported into all graph instances.
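
Rebuilding a graph instance from the stored files can be sketched as follows. This is a simplified illustration under stated assumptions: the record layout (a "state" string and a "document" dictionary) and the function name rebuild_graph are hypothetical, and the flat triple extraction stands in for a real JSON-LD processor.

```python
def rebuild_graph(storage):
    """Derive the graph from scratch using only 'active' Self-Descriptions.

    `storage` is assumed to be an iterable of records of the form
    {"state": <lifecycle state>, "document": <parsed JSON-LD dict>}.
    """
    triples = set()
    for record in storage:
        # Self-Descriptions in a terminal state are not imported into the graph.
        if record["state"] != "active":
            continue
        doc = record["document"]
        subject = doc["@id"]
        for predicate, value in doc.items():
            # Simplification: only top-level attributes are turned into triples.
            if predicate in ("@id", "@context"):
                continue
            triples.add((subject, predicate, str(value)))
    return triples
```

Since the function reads nothing but the storage, each parallel graph instance could run it independently, which is also why instances may lag behind the storage for a short time.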

REST Interface

The Catalogues have no built-in user interface. Instead, they provide an API that can be used by an external user interface or by technical clients. The interfaces of the GAIA-X Federation Services use REST and, more specifically, OpenAPI to describe the interfaces.

The openCypher graph query language is intended for search. To present search results objectively and without discrimination, compliant Catalogues apply no internal ranking of results: apart from the explicit filter and sort criteria in the user-defined query statements, results are ordered randomly. The random seed for the search results can be set on a per-session basis so that query results are repeatable within a session with the Catalogue.
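
The repeatable random ordering can be illustrated with a seeded shuffle. This is a sketch only: the function name order_results and the representation of results as sortable identifiers are assumptions, and it covers the case in which the query specifies no sort criteria of its own.

```python
import random


def order_results(results, session_seed):
    """Return results in a pseudo-random order that is stable for a given seed."""
    # Sort first so the outcome does not depend on the order in which
    # the graph backend happened to return the rows.
    ordered = sorted(results)
    # A dedicated generator keeps the session seed isolated from global state.
    random.Random(session_seed).shuffle(ordered)
    return ordered
```

Calling the function twice with the same seed yields the same order, which gives repeatable results within a session, while a different seed is expected to produce a different permutation.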

Self-Description Verification

In a privately hosted Catalogue, the authentication information can be used to allow a user to upload new Self-Descriptions and/or change the lifecycle state of existing Self-Descriptions. In a public Catalogue, the cryptographic signatures of the Self-Descriptions are checked to determine whether the issuer of the Self-Description is the owner of its subject. If that is the case, the Self-Description is accepted by the Catalogue. Hence, Self-Descriptions in JSON-LD format can be communicated to the Catalogue by third parties, as the trust verification is independent of the distribution mechanism.
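
The acceptance rule for a public Catalogue can be sketched as below. Everything named here is hypothetical: the "issuer" and "subject" keys, the owner_of lookup, and the signature_is_valid callback are placeholders for mechanisms that this section does not define.

```python
def accept_self_description(sd, owner_of, signature_is_valid):
    """Accept only if the signature verifies and the issuer owns the subject.

    `owner_of` maps a subject identifier to its owner; `signature_is_valid`
    checks the cryptographic proof. Both are assumed to be supplied by the
    Catalogue implementation.
    """
    if not signature_is_valid(sd):
        return False
    return owner_of(sd["subject"]) == sd["issuer"]
```

Note that the function takes no argument describing who submitted the Self-Description, which mirrors the point that trust verification does not depend on the distribution mechanism.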

Besides the trust verification, the Self-Descriptions are checked for syntactic and semantic consistency. The Self-Description Schemas are the basis for these checks.

Schema Management

Every Self-Description has to adhere to one or more schemas and has to reference each of them uniquely. Schemas are defined using ontologies in the RDF format, as well as SHACL shapes. Only attributes defined by the schema can be used in the Self-Description. The core ontology for GAIA-X Self-Descriptions is managed by the AISBL. Additional schemas can be provided by individual ecosystems and application domains, such as healthcare.
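
The rule that only schema-defined attributes may be used can be illustrated with a plain set comparison. This is not SHACL validation, only a sketch of the closed-world idea; the function name unknown_attributes and the example attribute names are hypothetical.

```python
def unknown_attributes(sd, schema_attributes):
    """Return the attributes used by the Self-Description that no schema defines."""
    # JSON-LD keywords such as "@id" or "@type" are not schema attributes.
    used = {key for key in sd if not key.startswith("@")}
    return sorted(used - set(schema_attributes))
```

A Self-Description would pass this particular check only if the returned list is empty; a real Catalogue would additionally evaluate the SHACL shapes for cardinality and datatype constraints.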

At this time, no software interface for automated schema updates is defined. As schema updates are low-frequency events, applying them is a maintenance task for the Catalogue operator, who has direct access to the schema management implementation.

Interaction with the Portal

Another option for interacting with the Catalogue is a GUI frontend (e.g., a GAIA-X Portal or a custom GUI implementation) that uses the Catalogue REST API in the background. The interaction between the Catalogue and a GUI frontend is based on an authenticated session for the individual user of the GUI frontend.

Specification document