SHACL processor #236

Merged
merged 69 commits into from
Jun 30, 2019

langsamu
Contributor

@langsamu langsamu commented Mar 29, 2019

Overview

This is an implementation of a fully compliant SHACL Core and SHACL-SPARQL processor.
Both data to be validated and validation rules reside in IGraphs.

I've created a web application that exposes the processor (OpenAPI).

Compliance

The processor is successfully tested against the SHACL Test Suite.

Compliance caveats

sh:datatype

The datatype constraint component relies on System.Xml.Schema to test whether literal nodes are ill-formed. Since that namespace is not available for .NET Core 1.1, the feature is not supported for that target.
The following tests are excluded from NETCORE:

yield return "core/complex/shacl-shacl.ttl";
yield return "core/node/datatype-001.ttl";
yield return "core/property/datatype-001.ttl";
yield return "core/property/datatype-ill-formed.ttl";
yield return "core/property/not-001.ttl";
yield return "core/property/or-datatypes-001.ttl";
yield return "core/targets/targetObjectsOf-001.ttl";
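
To illustrate the kind of ill-formedness check the sh:datatype caveat describes, here is a minimal sketch built on System.Xml.Schema. It is not taken from the PR: the class name LiteralChecks and the method IsIllFormed are hypothetical, and the processor's actual implementation may work differently. The sketch only handles XSD built-in types identified by their local name.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper for illustration only; not part of the PR.
public static class LiteralChecks
{
    // Returns true when lexicalForm is not a valid value of the
    // XSD built-in type with the given local name (e.g. "int").
    public static bool IsIllFormed(string lexicalForm, string xsdLocalName)
    {
        var name = new XmlQualifiedName(xsdLocalName, XmlSchema.Namespace);
        var type = XmlSchemaType.GetBuiltInSimpleType(name);

        // Assumption: names that are not XSD built-ins yield null,
        // in which case this sketch makes no judgement.
        if (type == null)
        {
            return false;
        }

        try
        {
            // ParseValue throws XmlSchemaException for invalid lexical forms.
            type.Datatype.ParseValue(lexicalForm, new NameTable(), null);
            return false;
        }
        catch (XmlSchemaException)
        {
            return true;
        }
    }
}
```

Under these assumptions, a literal such as "x"^^xsd:int would be reported as ill-formed, while "42"^^xsd:int would not.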

<a:b>

Two tests (core/node/maxLength-001 and core/node/minLength-001) include the URI node <a:b>, which is illegal in .NET and therefore also in dotNetRDF.

GraphMatcher

Two tests (core/path/path-complex-002 and core/property/nodeKind-001) produce validation reports with a topology such that assessing their equality resorts to brute force matching and is thus not feasible in a unit test.

Implementation report

ImplementationReport.Generate creates an EARL implementation report in the format required by the SHACL Test Suite and Implementation Report.
Cases that fail automated testing (as per the caveats above) have been tested manually and are reported as such.
The latest report is available alongside the console application that generates it.

I'll submit the implementation report if this PR is merged and released.

Usage

Conformance checking usage

  1. Instantiate a ShapesGraph with an existing IGraph containing a shapes graph.
  2. Call Conforms with an existing IGraph containing a data graph.
  3. The return value is a bool representing whether the data graph conforms to the shapes graph.

[Fact]
public void Conformance()
{
    var dataGraph = new Graph();
    dataGraph.LoadFromString(@"
        @prefix : <urn:> .
        :s :p :o .
    ");

    var shapesGraph = new Graph();
    shapesGraph.LoadFromString(@"
        @prefix : <urn:> .
        @prefix sh: <http://www.w3.org/ns/shacl#> .
        [
            sh:targetNode :s ;
            sh:class :C ;
        ] .
    ");

    var processor = new ShapesGraph(shapesGraph);
    var conforms = processor.Conforms(dataGraph);

    Assert.False(conforms);
}

Validation usage

  1. Instantiate a ShapesGraph with an existing IGraph containing a shapes graph.
  2. Call Validate with an existing IGraph containing a data graph.
  3. The return value is a Report representing the validation report.

[Fact]
public void Validation()
{
    var dataGraph = new Graph();
    dataGraph.LoadFromString(@"
        @prefix : <urn:> .
        :s :p :o .
    ");

    var shapesGraph = new Graph();
    shapesGraph.LoadFromString(@"
        @prefix : <urn:> .
        @prefix sh: <http://www.w3.org/ns/shacl#> .
        [
            sh:targetNode :s ;
            sh:property [
                sh:path :p ;
                sh:class :C
            ]
        ] .
    ");

    // The validation report expected for the graphs above
    var reportGraph = new Graph();
    reportGraph.LoadFromString(@"
        @prefix : <urn:> .
        @prefix sh: <http://www.w3.org/ns/shacl#> .
        [
            a sh:ValidationReport ;
            sh:conforms false ;
            sh:result [
                a sh:ValidationResult ;
                sh:sourceConstraintComponent sh:ClassConstraintComponent ;
                sh:resultSeverity sh:Violation ;
                sh:sourceShape [] ;
                sh:focusNode :s ;
                sh:resultPath :p ;
                sh:value :o
            ]
        ] .
    ");

    var processor = new ShapesGraph(shapesGraph);
    var report = processor.Validate(dataGraph);

    Assert.Equal(reportGraph, report.Graph);
}

API

// The processor
public class ShapesGraph {

    // Instantiate with existing shapes graph
    public ShapesGraph(IGraph shapesGraph) {}

    // Check conformance and produce validation report
    public Report Validate(IGraph dataGraph) {}

    // Check conformance only
    public bool Conforms(IGraph dataGraph) {}
}

// A validation report
public class Report {

    // Whether data graph conforms to shapes graph
    public bool Conforms { get; }

    // A report graph that can be used for test suite compliance checking
    public IGraph Normalised { get; }

    // Attach to the only report in an existing report graph
    public static Report Parse(IGraph reportGraph) {}
}

// The SHACL vocabulary
public static class Vocabulary {

    // A URI node per vocabulary term, e.g.
    public static IUriNode AlternativePath { get; }
}

The rest of the code is internal. It is an extensive implementation of the SHACL information model and validation logic. It could be made public in the future with additional guards and tests to facilitate programmatic construction and inspection of SHACL shapes graphs.
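
As a rough illustration of how the public members listed above fit together, the following sketch validates a data graph and prints the triples of the normalised report graph when it does not conform. The class ValidationExample and the method Check are hypothetical names, and the VDS.RDF.Shacl namespace is an assumption; only ShapesGraph, Validate, Report.Conforms and Report.Normalised come from the API summary.

```csharp
using System;
using VDS.RDF;
using VDS.RDF.Shacl; // Assumed namespace for ShapesGraph

// Hypothetical wrapper for illustration only; not part of the PR.
public static class ValidationExample
{
    // Validates data against shapes and returns whether it conforms.
    public static bool Check(IGraph shapes, IGraph data)
    {
        var processor = new ShapesGraph(shapes);
        var report = processor.Validate(data);

        if (!report.Conforms)
        {
            // Print each triple of the normalised report graph.
            foreach (var triple in report.Normalised.Triples)
            {
                Console.WriteLine(triple);
            }
        }

        return report.Conforms;
    }
}
```

When only the boolean outcome matters, calling Conforms directly on the ShapesGraph avoids handling the report at all.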

@langsamu
Contributor Author

@kal, @tpluscode: Before I submit this for your review, there are several issues I'd appreciate assistance with.

  1. As far as I can tell, two of the test cases in the test suite (1, 2) fail due to issue #235 (GraphMatcher edge case?). I couldn't find a fix or a workaround.
  2. Another fails due to issue #237 (SPARQL parsing fails filter with trailing dot after group).
  3. The SHACL datatype constraint requires checking whether a literal is ill-typed according to SPARQL (e.g. "x"^^xsd:int).
    My solution uses System.Xml.Schema, which is unavailable in .NETStandard 1.4.
    Should I
      1. Let the tests fail for this target?
      2. Exclude them?
      3. Throw PlatformNotSupportedException?

@kal
Contributor

kal commented May 7, 2019

@langsamu - I think that the best course of action with the datatypes issue would be to exclude those tests from testing the .NETStandard 1.4 build, and we should document that datatype constraint checking is not supported on that platform. You can use a file with the extension .NetCore.cs to add code that is only compiled under .NETStandard 1.4 (or use the preprocessor flag NETCORE).
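
For readers unfamiliar with the approach, the preprocessor-flag option mentioned above could look roughly like the sketch below. The class TestSuiteExclusions and the method Excluded are hypothetical names, and only two of the affected test files are shown; the NETCORE symbol is the one named in the comment above.

```csharp
using System.Collections.Generic;

// Hypothetical container for illustration only; not part of the PR.
public static class TestSuiteExclusions
{
    // Lists test files to skip for the current build target.
    public static IEnumerable<string> Excluded()
    {
#if NETCORE
        // Compiled only when the NETCORE symbol is defined.
        yield return "core/node/datatype-001.ttl";
        yield return "core/property/datatype-001.ttl";
#endif
        yield break;
    }
}
```

On targets where NETCORE is not defined, the method simply yields nothing, so the same test runner code can be shared across targets.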

@langsamu
Contributor Author

@kal, ready for review with thanks.

@kal
Contributor

kal commented Jun 30, 2019

This is awesome work, thanks @langsamu and apologies for taking so long to merge the PR!

@kal kal merged commit c2b2fab into dotnetrdf:master Jun 30, 2019
@kal kal added this to the 2.3 milestone Nov 29, 2019
@langsamu langsamu deleted the shacl branch January 5, 2020 17:12