New module: schemaview. Dynamic views over schemas #25
This provides a way to dynamically perform operations over a "raw" schema object. It thus provides an alternative to loading schemas using schemaloader, and also provides a more generic replacement for https://github.com/biolink/biolink-model-toolkit/

schemaview implements a "facade" pattern over a schema object. It allows us to access things like the inferred properties of a slot without altering the underlying schema object.

The design is inspired partly by the OWLAPI, and all methods are parameterized by an "imports" flag which indicates whether the method should be resolved over the full imports closure or just the main schema. No merging of imports is necessary.

It also provides ancestor/descendant methods. By default these include mixins and is-as, but this can also be parameterized.

We also took the cache design from bmt, but this should be robust to updates: if modifications are made to the underlying schema, the cache will be rebuilt.

See:
- linkml/linkml#59
- linkml/linkml#144
- linkml/linkml#48
- linkml/linkml#270
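To make the description concrete, here is a minimal, self-contained sketch of the idea: a read-only facade with an `imports` flag, parameterized ancestor lookup, and a cache that is rebuilt when the underlying schema changes. Everything in it (`ToySchema`, `ToyClass`, `ToySchemaView`, the `registry` argument, and the `repr`-based change detection) is a hypothetical stand-in written for illustration, not the schemaview API from this PR.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyClass:
    name: str
    is_a: Optional[str] = None
    mixins: List[str] = field(default_factory=list)


@dataclass
class ToySchema:
    name: str
    classes: Dict[str, ToyClass] = field(default_factory=dict)
    imports: List[str] = field(default_factory=list)


class ToySchemaView:
    """Hypothetical facade: answers queries without mutating any schema."""

    def __init__(self, schema: ToySchema, registry: Dict[str, ToySchema]):
        self.schema = schema
        # Assumption: imports are resolved by name through this registry.
        self.registry = {**registry, schema.name: schema}
        self._cache: Dict[str, List[str]] = {}
        self._fingerprint = repr(self.registry)

    def _check_cache(self) -> None:
        # Assumption: a repr() snapshot is enough to detect modifications.
        current = repr(self.registry)
        if current != self._fingerprint:
            self._cache.clear()
            self._fingerprint = current

    def imports_closure(self) -> List[str]:
        self._check_cache()
        if "closure" not in self._cache:
            seen: List[str] = []
            todo = [self.schema.name]
            while todo:
                name = todo.pop()
                if name not in seen:
                    seen.append(name)
                    todo.extend(self.registry[name].imports)
            self._cache["closure"] = seen
        return self._cache["closure"]

    def all_classes(self, imports: bool = True) -> Dict[str, ToyClass]:
        # The imports flag selects the full closure or just the main schema.
        names = self.imports_closure() if imports else [self.schema.name]
        merged: Dict[str, ToyClass] = {}
        for name in names:
            merged.update(self.registry[name].classes)
        return merged

    def class_ancestors(self, class_name: str, imports: bool = True,
                        mixins: bool = True, is_a: bool = True) -> List[str]:
        # Breadth-first walk; the starting class is included in the result.
        classes = self.all_classes(imports)
        result: List[str] = []
        todo = [class_name]
        while todo:
            current = todo.pop(0)
            if current in result:
                continue
            result.append(current)
            cls = classes.get(current)
            if cls is None:
                continue
            if is_a and cls.is_a:
                todo.append(cls.is_a)
            if mixins:
                todo.extend(cls.mixins)
        return result


core = ToySchema("core", classes={"NamedThing": ToyClass("NamedThing"),
                                  "HasAge": ToyClass("HasAge")})
main = ToySchema("main", imports=["core"],
                 classes={"Person": ToyClass("Person", is_a="NamedThing",
                                             mixins=["HasAge"])})
view = ToySchemaView(main, {"core": core})

main_only = sorted(view.all_classes(imports=False))
with_imports = sorted(view.all_classes(imports=True))
ancestors = view.class_ancestors("Person")
ancestors_no_mixins = view.class_ancestors("Person", mixins=False)

# Modify the underlying schema after the closure has been cached.
closure_before = list(view.imports_closure())
view.registry["extra"] = ToySchema("extra")
main.imports.append("extra")
closure_after = list(view.imports_closure())
```

The schemas are never merged or copied; the view recomputes its answers from the live objects, which is why the closure picks up the added import on the next call.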
Looks very straightforward to use / easy to understand!
hsolbrig
What do the changes to compile_python.py have to do with this issue?
Is there a test case for this change? The reason I ask is that an earlier version of compile_python.py had something that looked very similar to this, but we discovered that cwd was rather arbitrary. As an example, things would behave differently if you ran your unit tests from the tests/test_something directory than from just plain tests. The real challenge, unfortunately, is determining the relative package path before this function gets called -- by the point you reach here, you don't have sufficient information to know the base.
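The cwd sensitivity described above can be illustrated with a small sketch. It is not the code in compile_python.py; `resolve_from_cwd`, `resolve_from_base`, and the `input/schema.yaml` path are hypothetical names used only to show why the same relative path resolves differently depending on where the tests are launched, and why an explicit base removes that dependence (it does not address how to determine the base in the first place).

```python
import os
import tempfile
from pathlib import Path


def resolve_from_cwd(relative: str) -> Path:
    # Depends on whatever directory the process happens to be in.
    return Path(os.getcwd()) / relative


def resolve_from_base(relative: str, base: Path) -> Path:
    # Depends only on the base the caller passes in.
    return (base / relative).resolve()


def demo() -> dict:
    original = Path.cwd()
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        tests = Path(tmp).resolve() / "tests"
        nested = tests / "test_something"
        nested.mkdir(parents=True)
        try:
            for label, directory in (("tests", tests), ("nested", nested)):
                os.chdir(directory)
                results[label] = (
                    resolve_from_cwd("input/schema.yaml"),
                    resolve_from_base("input/schema.yaml", tests),
                )
        finally:
            # Restore the cwd before the temporary directory is removed.
            os.chdir(original)
    return results


results = demo()
```

Running from `tests` versus `tests/test_something` gives two different cwd-based paths, while the base-relative path is the same in both runs.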
hsolbrig
Like the schema view idea though. That said, another alternative:
If you think about it, gen_json_schema, gen_shex, gen_jsonld_context and the like extract and transform the information that they need from a SchemaDefinition. In particular, generating RDF requires the name of the context file that was extracted from the schema.
One approach (not recommended) would be to create a new model for use in the csv loader/dumper -- take what you need from SchemaDefinition and add it to this model.
An alternative, however, might be to do a vanilla YAML dump of the fully processed schema definition. It could be used "out of the box" for other things without having to invoke the SchemaLoader (linkml) -- instead, just by doing a YAML load.
We could still use a good library for traversing the content. The stuff in the generators package has gotten a bit lumpy over time -- if we create this view package, we should look at refactoring the generator base to use it instead.
Sorry, these should have gone in a separate PR
Not totally following. Are you talking about the csvgenerator, or the runtime loaders/dumpers? Note that we originally created this as a separate repo rather than putting it in a separate runtime, as it needed access to the schema. But with a schemaview class in the runtime, we can bring the csv/tsv loaders/dumpers back in.
You could, although I think this is often confusing due to the materialization of induced classes.
Agreed, but I think this should be incremental. If you are OK with it, we could merge this PR; we have many applications that need something like SchemaView ASAP (it's essentially a generalization of BMT). We can then gradually start simplifying the generator code, doing this very carefully of course.
Adding a new library schemaview.
This provides a way to dynamically perform operations over a "raw"
schema object. It thus provides an alternative to loading schemas
using schemaloader, and also provides a more generic replacement
to https://github.com/biolink/biolink-model-toolkit/
schemaview implements a "facade" pattern over a schema object.
It allows us to access things like the inferred properties
of a slot, without altering the underlying schema object
Methods:
The design is inspired partly by the OWLAPI, and all methods
are parameterized by an "imports" flag which indicates whether
the method should be resolved over the full imports closure
or just the main schema. No merging of imports necessary
It also provides ancestor/descendant methods. These by default
include mixins and is-as, but these methods can also be parameterized
We also took the cache design from bmt, but this should be robust to updates.
E.g. if modifications are made to the underlying schema then the cache will
be rebuilt.
See:
- linkml/linkml#59
- linkml/linkml#144
- linkml/linkml#48
- linkml/linkml#270