Matadisco makes datasets more discoverable. It uses ATProto for publishing records that point to the metadata.
For more details about the background and the ideas behind the project refer to the announcement blog post of the IPFS Foundation as well as a more technical one at @vmx' blog.
This is the central entry point for the Matadisco project. It's the place to discuss ideas and issues.
The central piece of Matadisco is the Lexicon schema. To make it easier to read, the MLF syntax is used:
/// A Matadisco record
record matadisco {
/// The time the original metadata/data was created
publishedAt!: Datetime,
/// A URI that links to resource containing the metadata
resource!: Uri,
/// Preview of the data
preview: {
/// The media type the preview has
mimeType!: string,
/// The URL to the preview
url: Uri,
},
}
See cx.vmx.matadisco.json for the ATProto Lexicon schema in its usual JSON format.
This Lexicon is expected to evolve when more people start using it to publish data. You can already add arbitrary fields to ATProto records, but it would make sense to work together on common fields to increase interoperability on those.
Generating the JSON format from MLF file (it's expected that Rust is setup properly):
> cargo install --git https://tangled.org/stavola.xyz/mlf mlf-cli
> mlf generate lexicon --input cx.vmx.matadisco.mlf --output ./ --flatPublishing Matadisco records needs some service to run. Ideally it's integrated directly into the publishing pipeline of the metadata records themselves. Though you can also run a third party service. Projects doing that (please open a PRs to add your own project):
- sentinel-to-atproto: Publishes records sourced from Element 84's Earth Search STAC catalogue. It's using a scheduled Cloudflare Worker that queries the catalogue every few minutes.
- gdi-de-csw-to-atproto: Publishes records sourced from the German geodata catalogue geodatenkatalog.de. It's using schduled GitHub workers. The the XML based CSW requests needs a bit of processing and sometimes thousands of records get updated at once.They are then published stretched over a longer period of time.
Metadisco records can be used to build your own portals or workflows.
- matadisco-viewer: Very basic viewer of all incoming Metadisco records in real-time.