QGIS Enhancement: Overhaul Metadata Management In QGIS
Date 2017/02/26
Author Tim Sutton (@timlinux)
Contact [email protected]
Maintainer @timlinux
Version QGIS 3.0
Summary
- Provide the infrastructure in QGIS to author, consume and share GIS metadata (Dublin Core, ISO 19115, etc.)
- Support the mandatory fields and those fields that are needed to support service descriptions for QGIS Server
- Make the creation of metadata seamless where it does not exist
- Use templates and heuristics so that the minimum number of clicks is needed to define basic, functional metadata
- Focus on minimal compliance first (Dublin Core)
- Support the use of presets so when a new document is being created, the minimum amount of work is needed.
- Start with something simple, plan for incremental improvements over time
Preceding work
QGIS Metadata Strategy: https://gist.github.com/tomkralidis/33f781e361f6d855c2f4
Keep in mind the existing QEPs:
#33
#50
And keep in mind existing metadata editors:
MetaDoor (python editor) : https://www.dataone.org/software-tools/metadoor
Our intent is to build on these previous works and ideas by building a number of components that provide a comprehensive metadata strategy for QGIS.
Proposed Solution
The big picture of what we plan to produce is here:

We propose to first implement these components (WP = Work Package):
- WP1: A QGIS-specific metadata schema (under the guidance of Angelos Tzotsos and Tom Kralidis)
- WP2: A general-purpose class for representing QGIS metadata internally (under the guidance of Nyall Dawson)
- A replacement for the current layer metadata tab that will have:
  - WP4: A viewing mode where metadata is nicely laid out in an HTML document
  - WP8: An editing mode where metadata can be edited
We will make a separate QEP for the other work packages once the ones above are taken care of. More details on these work packages can be found below.
Work package 1 - Schema Definition/Selection:
Input schema selection: In this phase we will identify input schemas to be used for validation. We propose initially to support Dublin Core and validate against the following CSW Record schemas:
Although we start by supporting Dublin Core to keep things simple, we expect future evolution to move towards supporting ISO.
Internal Schema: In this phase we will specify a schema for internal representation of metadata within QGIS (‘the QGIS Metadata Schema’). This schema would be independent of any existing standards and would be the basic structure in which all incoming metadata would be stored. When we add support for additional formats in the future, the expectation would be that these formats are also transitioned to the QGIS internal format on import so that we can deal with a single common metadata structure internally.
Since the QGIS internal schema most likely won’t be a superset of all existing schemas, conversions between it and any other schema may result in a loss of information, which means we won’t support metadata “round trips”. One proposed solution to the loss of round tripping is to keep the original metadata document (if provided) and then interpolate new values into it when it is updated.
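The "keep the original document and interpolate new values" idea could be sketched as follows. This is only an illustration in Python using the standard library: the `record` root element, the `interpolate_values` helper and the choice of `dc:rights` as a field the internal schema does not track are all assumptions, not part of the proposal.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Register the prefix so serialised output keeps the familiar "dc:" names.
ET.register_namespace("dc", DC_NS)


def interpolate_values(original_xml: str, updates: dict) -> str:
    """Write updated values into the original document, leaving every
    element that is not being updated untouched (hypothetical helper)."""
    root = ET.fromstring(original_xml)
    for name, value in updates.items():
        tag = f"{{{DC_NS}}}{name}"
        element = root.find(tag)
        if element is None:
            # The field was absent in the original, so append it.
            element = ET.SubElement(root, tag)
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Assumed input: dc:rights stands in for a field QGIS does not model.
original = (
    f'<record xmlns:dc="{DC_NS}">'
    "<dc:title>Old title</dc:title>"
    "<dc:rights>CC-BY</dc:rights>"
    "</record>"
)
updated = interpolate_values(original, {"title": "New title"})
```

Because only the named elements are rewritten, anything the internal schema cannot represent survives the edit, which is the point of keeping the original document around.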
We will also identify which fields should be mandatory within the QGIS Metadata Schema. These should mostly be fields that we can extract automatically from the dataset, without requiring any intervention from the user. Only in this way can we guarantee the automatic generation of internal metadata for every dataset.
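As a rough illustration of mandatory fields that need no user input, the sketch below derives everything from the dataset path. The field names (`identifier`, `title`, `format`) and the `derive_mandatory_fields` helper are hypothetical; the actual mandatory list is still to be decided in WP1.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class MandatoryFields:
    # Hypothetical field names; the real list is to be decided in WP1.
    identifier: str
    title: str
    format: str


def derive_mandatory_fields(dataset_path: str) -> MandatoryFields:
    """Fill every mandatory field from the dataset path alone."""
    path = Path(dataset_path)
    return MandatoryFields(
        identifier=path.as_posix(),
        title=path.stem.replace("_", " "),
        format=path.suffix.lstrip(".").lower(),
    )


derived = derive_mandatory_fields("data/Road_Network.SHP")
```

A real implementation would presumably also read values such as extent and CRS from the data provider; those are left out here because they need QGIS itself.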
Other things to mention:
- The QGIS metadata format will be used when writing metadata into project files (and the validator will be used to parse it back).
- Service descriptions will be part of the schema.
- A document can describe either a service or a dataset, not both, and not more than one dataset.
- We plan to initially support layer-level metadata and then project (service) level metadata in the future.
- A small mockup by Tom to show how a QGIS Schema may look (incomplete):
We can take some inspiration from the GeoNode data model too - possibly providing a similar data model.
Although we start with Dublin Core, keep ISO in mind, as a future target.
Status: A proposed schema has been written here qgis/qgis/#4330
Work package 2 - QGIS Metadata API:
In this work package we will build the basic C++ framework for parsing metadata from a schema - initially Dublin Core and the QGIS Metadata Schema. This includes implementing an internal model for representing metadata, based on the metadata schema created in WP1.
Additional deliverables:
- Regression tests to ensure that we do not have regressions in the metadata subsystem.
- Clear API documentation, explaining how to use the API.
- Example code, to be included in the QGIS python cookbook, in order to make it easy to “get started” manipulating metadata programmatically.
- Python bindings, in order to expose the metadata classes to the plugin authors.
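To give a feel for what "parsing metadata from a schema into an internal model" might look like, here is a minimal Python sketch using only the standard library. The `LayerMetadata` class and `parse_dublin_core` function are hypothetical stand-ins, not the planned QGIS API, and the mapping of `dc:description` to an abstract and `dc:subject` to keywords is an assumption.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

DC_NS = "http://purl.org/dc/elements/1.1/"


@dataclass
class LayerMetadata:
    """Hypothetical internal model; not the actual QGIS class."""
    identifier: str = ""
    title: str = ""
    abstract: str = ""
    keywords: list = field(default_factory=list)


def parse_dublin_core(xml_text: str) -> LayerMetadata:
    """Map a flat Dublin Core record onto the internal model."""
    root = ET.fromstring(xml_text)

    def text_of(name: str) -> str:
        # Missing or empty elements become empty strings.
        return (root.findtext(f"{{{DC_NS}}}{name}") or "").strip()

    return LayerMetadata(
        identifier=text_of("identifier"),
        title=text_of("title"),
        abstract=text_of("description"),
        keywords=[
            (element.text or "").strip()
            for element in root.findall(f"{{{DC_NS}}}subject")
        ],
    )


sample = (
    f'<record xmlns:dc="{DC_NS}">'
    "<dc:identifier>roads-2017</dc:identifier>"
    "<dc:title>Road network</dc:title>"
    "<dc:subject>transport</dc:subject>"
    "<dc:subject>roads</dc:subject>"
    "</record>"
)
metadata = parse_dublin_core(sample)
```

Fields missing from the input (here `dc:description`) simply stay empty, so the internal model is always fully populated regardless of how sparse the incoming document is.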
Work package 3 - Implement QGIS Metadata Storage support
In this WP we will introduce an external physical format for storing metadata, the “metadata store”. The goal is to support portability, enabling users to share their metadata, even in offline scenarios. This WP will build directly on the outputs of WP1, which will define an “internal metadata schema”, and WP2, the “QGIS metadata API”, which will encode/decode from the internal schema to the supported schemas (right now, only Dublin Core).

QGIS will support two types of metadata stores: remote and local. In this WP we will focus on local stores only. In the diagram below we depict the inheritance model for metadata stores, where an abstract metadata store will have polymorphic behavior according to the particular data format. For instance, in the case of a PostgreSQL DB, the method “save” will create a table in the database, whereas in the case of a Shapefile, it would create an XML file.
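The inheritance model described above could be sketched roughly as below. This is Python rather than the planned C++, the class and method names are hypothetical, and an in-memory SQLite database stands in for the PostgreSQL case so that the sketch runs without a server.

```python
import sqlite3
from abc import ABC, abstractmethod
from pathlib import Path


class MetadataStore(ABC):
    """Abstract store; each data format decides how 'save' behaves."""

    @abstractmethod
    def save(self, layer_id: str, document: str) -> None: ...

    @abstractmethod
    def load(self, layer_id: str) -> str: ...


class XmlSidecarStore(MetadataStore):
    """File-based case (e.g. a Shapefile): one XML file per layer."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def _path(self, layer_id: str) -> Path:
        return self.directory / f"{layer_id}.xml"

    def save(self, layer_id: str, document: str) -> None:
        self._path(layer_id).write_text(document, encoding="utf-8")

    def load(self, layer_id: str) -> str:
        return self._path(layer_id).read_text(encoding="utf-8")


class SqliteStore(MetadataStore):
    """Database case: metadata lives in a table (SQLite as a stand-in)."""

    def __init__(self, database: str = ":memory:"):
        self.connection = sqlite3.connect(database)
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS metadata ("
            "layer_id TEXT PRIMARY KEY, document TEXT NOT NULL)"
        )

    def save(self, layer_id: str, document: str) -> None:
        with self.connection:  # commits on success
            self.connection.execute(
                "INSERT OR REPLACE INTO metadata VALUES (?, ?)",
                (layer_id, document),
            )

    def load(self, layer_id: str) -> str:
        row = self.connection.execute(
            "SELECT document FROM metadata WHERE layer_id = ?", (layer_id,)
        ).fetchone()
        if row is None:
            raise KeyError(layer_id)
        return row[0]
```

Calling code only ever sees `MetadataStore`, so adding a new backend later should not require changes to the metadata API itself.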

Some formats, such as text files, can be more limited than others. For that reason, we will create a “prime” format, the “QGIS metadata store”, which can accompany more restrictive formats. The prime format will be an SQLite database, because it is lightweight and well known within the QGIS community.
As the goal is to support all these different formats in the future, we will design an infrastructure to accommodate that, but in this first iteration we will focus on the simple use case of creating an XML file and an SQLite data store. The metadata contents will be passed by the metadata API. In this WP we will implement format translation, but not schema translation.
We will implement a user interface to allow the user to configure serialization/deserialization behavior, e.g. in which format we should write metadata, and where. In WP5, we will add metadata detection (which perhaps we can turn on and off in the project settings). For instance, if there is an XML file with the same name and path as a Shapefile, QGIS would attempt to automatically import metadata.
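The detection rule mentioned above (an XML file with the same name and path as the dataset) might look like this minimal sketch. The function name is hypothetical, and the assumption that the sidecar replaces the dataset's extension with `.xml` is ours; other naming conventions would need their own check.

```python
from pathlib import Path
from typing import Optional


def find_sidecar_metadata(dataset_path: str) -> Optional[Path]:
    """Return the XML file sharing the dataset's name and directory, if any.

    Assumes the sidecar is '<name>.xml' next to '<name>.shp'.
    """
    candidate = Path(dataset_path).with_suffix(".xml")
    return candidate if candidate.is_file() else None
```

For `/data/roads.shp` this looks for `/data/roads.xml` and returns `None` when no such file exists, which is the case where QGIS would fall back to generating metadata itself.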
The QGIS metadata store will be synced with any changes that we apply to the metadata. When we export metadata to XML, those changes will be written to the XML file.
Metadata search will also be polymorphic, according to the data format. In this iteration we will implement some text search for SQLite, and will use that rather than searching in text files which tends to be slower.
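A simple text search against the SQLite store could be sketched as follows. The table layout (`layer_id`, `title`, `abstract`) and the `search_metadata` function are assumptions made for illustration; a plain `LIKE` match is used here, though a full-text index could be substituted later.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE metadata ("
    "layer_id TEXT PRIMARY KEY, title TEXT, abstract TEXT)"
)
connection.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?)",
    [
        ("roads", "Road network", "Primary and secondary roads"),
        ("rivers", "Hydrology", "Major rivers and streams"),
    ],
)


def search_metadata(connection: sqlite3.Connection, term: str) -> list:
    """Return ids of layers whose title or abstract contains the term.

    SQLite's LIKE is case-insensitive for ASCII by default; '%' and '_'
    inside the term act as wildcards and are not escaped in this sketch.
    """
    pattern = f"%{term}%"
    rows = connection.execute(
        "SELECT layer_id FROM metadata "
        "WHERE title LIKE ? OR abstract LIKE ? ORDER BY layer_id",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

Searching the store this way avoids opening and parsing every XML file on disk, which is the motivation given above for preferring SQLite.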
Activities:
- Design infrastructure for metadata storage in QGIS
- Implement use case for “QGIS Metadata store” + XML
- Sync “QGIS metadata store” with user edits.
- Design and implement UI for serializing/deserializing behaviour.
- Implement metadata search for this use case.
Deliverables:
- Infrastructure to accommodate the external storage of metadata in QGIS, fully implemented for the use case of XML files.
- Support for searching the metadata store.
- UI for saving/loading metadata.
Dependencies:
WP1, WP2
Work package 4 - Implement QGIS metadata viewer:
Metadata is only useful if it is visible to the users of the dataset that it describes. For this reason we should make provision for presenting the metadata in an eye-pleasing and informative manner, with minimal work required on behalf of the user. We also aim to implement this early in the project workflow, so that we can start outputting the data stored in QgsMetadata.
Some thoughts:
- Implementation will be C++ with python bindings. For this reason we will not use Jinja2 or similar templating languages.
- We should probably include derived data (even if it is not stored in the internal model) such as layer extents, min / max / mean etc. values and so on as there is also a utilitarian value to the metadata in QGIS. So this WP will display the information from the metadata storage + generate some metadata from the data.
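The combination of stored metadata and derived statistics could be rendered with plain string building, without a templating engine. The sketch below is Python for illustration only (the actual implementation is planned in C++); the `render_metadata_html` function and the table layout are hypothetical.

```python
from html import escape
from statistics import mean


def render_metadata_html(metadata: dict, values: list) -> str:
    """Render stored fields plus statistics derived from the data."""
    rows = dict(metadata)
    if values:
        # Derived values are computed on the fly, not read from the store.
        rows["Minimum"] = min(values)
        rows["Maximum"] = max(values)
        rows["Mean"] = mean(values)
    body = "".join(
        f"<tr><th>{escape(str(key))}</th><td>{escape(str(value))}</td></tr>"
        for key, value in rows.items()
    )
    return f"<table>{body}</table>"


page = render_metadata_html({"Title": "Roads & <Rivers>"}, [1.0, 2.0, 6.0])
```

Escaping every value matters because metadata is user-supplied text and would otherwise be interpreted as markup by the HTML view.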
The idea is to replace this:

With something like this (taken from GeoNode):

Work package 8 - Implement QGIS metadata editor for layers
- The idea would be to closely emulate the GeoNode editor under development by GeoSolutions - prototype here:
In wizard mode:

In form mode:

Example(s)
(optional)
Affected Files
(required if applicable)
Performance Implications
(required if known at design time)
Further Considerations/Improvements
We have funding for around 80% of these work packages - if anyone is interested in co-funding the shortfall, please let us know.
There is a discussion group at: https://gitter.im/qgis/metadata for those who wish to collaborate in making QGIS metadata better.
The following people have already joined the effort and will be doing implementation work, planning, offering advice etc.
- Tom Kralidis (advisor, metadata)
- Angelos Tzotsos (metadata)
- Joana Simoes (metadata)
- Nyall Dawson (Core QGIS integration)
- Tim Sutton / Ismail Sunni / Etienne Trimaille (QGIS Layer Properties Metadata Panel)
Backwards Compatibility
This will be new code and will replace any existing metadata implementation work (including what is currently in layer properties dialog). We will try to make sure that server and other parts that rely on metadata do not break - we would welcome support and input from those working on QGIS Server.
Issue Tracking ID(s)
(optional)
Votes
(required)