-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Adding new schema diffing capability #2983
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Member
Author
|
This PR is not ready, but because of the size I am merging it now. All classes are marked internal and not yet ready for usage. The high level diff analyzer is not completely ready and some polish/cleanup is missing. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
A new schema diffing
This PR adds a new Schema diffing class providing a comprehensive and detailed capability to compare two GraphQL schemas.
A GraphQL schema diff is expressed as a list of minimal edit operations transforming the old schema into a new one.
This is based on the the graph edit distance definition.
The algorithm implemented here is based on the following two papers (both papers are from the same authors. The first one is newer, but the older one is more detailed in some aspects)
Accelerating Graph Similarity Search via Efficient GED Computation.
Efficient Graph Edit Distance Computation and Verification via Anchor-aware Lower Bound Estimation
The algorithm is a modified version of "AStar-BMao". It is adapted to directed graphs as a GraphQL schema is most naturally represented as directed graph (vs the undirected graphs used in the papers).
The main class is
SchemaDiffingreturning a list of graph edit operations (insert/delete/change vertex, insert/delete/change edge).The second main class is
EditOperationAnalyzerwhich analyzes a list of edit operation to produce a more high level list ofschema changes. (This is still wip).
Graph edit distance
Maybe this is obvious, but a GraphQL schema itself is a graph (in the CS sense). While there are different ways how to model a schema s graph for the this application the model looks like this:

See the
SchemaGraphclass for how it is implemented.The problem how to get from one schema to another is equivalent to finding the graph edit distance. The graph edit distance is the minimal amount of edit operations to go from one graph to another. Here (and in the referenced papers) we only consider insert/delete/change of vertices and insert/delete/change of edges and every change has the same "value" as the other.
The problem is that the general task of finding the graph edit distance is NP-hard. The papers above present the current state of art in finding a somehow efficient algorithm for the general problem. This algorithm is called AStar-BMao.
The new
SchemaDiffingclasses implements AStar-BMao with two major modification: first the type of edit operations we allow is restricted, because we have a "typed graph". (See the image above). For example we don't want to change aScalarvertex into aObjectvertex. We do this to speed up the diffing and in order to make the diffing outcome more semantically meaningful. The second change is that the graph is a directed compared to the undirected ones in the paper.The outcome of the
SchemaDiffingis a list of low level graph operations describing how to get from one schema to another. For example:vs
will result in one change vertex operation, renaming the field from
hellotohello2.Edit operation analyzer
While the low level edit operations are correct, they are not really great for users to understand what actually changed in a schema. The
EditOperationAnalyzerclass aims to solve that by interpreting the list of graph edit operations in a GraphQL sense and describing the schema changes as a list of Union/Interface/... changes.On top of that we can add some kind of breaking change detection that will mark certain changes as breaking/questionable etc. This will need to be configurable in some way to accomodate the different API specific needs.
A blog post describing breaking changes, based on this work here, is planned.
TODO