Graph data management is instrumental for several use cases such as recommendation, root cause analysis, financial fraud detection, and enterprise knowledge representation. Efficiently supporting these use cases yields a number of unique requirements, including the need for a concise query language and graph-aware query optimization techniques. The goal of the Linked Data Benchmark Council (LDBC) is to design a set of standard benchmarks that capture representative categories of graph data management problems, making the performance of systems comparable and facilitating competition among vendors. LDBC also conducts research on graph schemas and graph query languages. This paper introduces the LDBC organization and its work over the last decade.
Long-running “business transactions” which may be processed by discrete organizations across the public internet differ from classical atomic transactions in requiring increased protocol security and interoperability, and relaxable atomicity, isolation and durability properties. A protocol is required which is independent of communications mechanism, is capable of supporting fully ACID transaction processing, yet is also capable of supporting different AID qualities of service. Such a protocol would provide “appropriate transactionality” to applications. “Cohesive” actions (cohesions) could be processed as a superset of atomic actions, thus enabling a clean integration of legacy transactional resources and services, when appropriate.
Property graphs have reached a high level of maturity, witnessed by multiple robust graph database systems as well as the ongoing ISO standardization effort aiming at creating a new standard Graph Query Language (GQL). Yet, despite documented demand, schema support is limited both in existing systems and in the first version of the GQL Standard. It is anticipated that the second version of the GQL Standard will include a rich DDL. Aiming to inspire the development of GQL and enhance the capabilities of graph database systems, we propose PG-Schema, a simple yet powerful formalism for specifying property graph schemas. It features PG-Schema with flexible type definitions supporting multi-inheritance, as well as expressive constraints based on the recently proposed PG-Keys formalism. We provide the formal syntax and semantics of PG-Schema, which meet principled design requirements grounded in contemporary property graph management scenarios, and offer a detailed comparison of its featu...
Three documents from 2018 by Neo4j Inc. authors describing property graph data model, schema and mapping to SQL schema
This Work Charter describes the goals and scope of a graph data community effort to define a graph schema definition language that extends the proposed GQL feature of graph types.
Cypher is a query language for property graphs. It was originally designed and implemented as part of the Neo4j graph database, and it is currently used in a growing number of commercial systems, industrial applications and research projects. In this work, we provide denotational semantics of the core fragment of the read-only part of Cypher, which features in particular pattern matching, filtering, and most relational operations on tables.
Cypher is a property graph query language that provides expressive and efficient querying of graph data. Originally designed and implemented within the Neo4j graph database, it is now being used by several industrial database products, as well as open-source and research projects. Since 2015, Cypher has been an open, evolving language, with the aim of becoming a fully-specified standard with many independent implementations. We introduce Cypher and the property graph model, and then describe extensions – either actively being developed or under discussion – which will be incorporated into Cypher in the near future. These include (i) making Cypher into a fully compositional language by supporting multiple graphs and allowing graphs to be returned from queries; (ii) allowing for more complex patterns (based on regular path queries) to be expressed; and (iii) allowing for different pattern matching semantics – homomorphism, relationship isomorphism (the current default) or node isomorph...
This paper provides precise mathematical definitions of a property graph as specified in the proposed GQL international standard, which is an attributed mixed multigraph with loops. It further defines a partially-oriented walk in such a property graph, which is called a path in GQL, as well as restricted classes of such walks (trails, simple/acyclic paths).
Despite the maturity of commercial graph databases, little consensus has been reached so far on the standardization of data definition languages (DDLs) for property graphs (PG). Discussion on the characteristics of PG schemas is ongoing in many standardization and community groups. Although some basic aspects of a schema are already present in most commercial graph databases, full support that would allow property graphs to be constrained with more or less flexibility is missing. In this paper, we show how schema validation can be enforced through homomorphisms between PG schemas and PG instances by leveraging a concise schema DDL inspired by Cypher syntax. We also briefly discuss PG schema evolution that relies on graph rewriting operations, allowing both prescriptive and descriptive schemas to be considered.
Proceedings of the 2018 International Conference on Management of Data, 2018
The Cypher property graph query language is an evolving language, originally designed and implemented as part of the Neo4j graph database, and it is currently used by several commercial database products and researchers. We describe Cypher 9, which is the first version of the language governed by the openCypher Implementers Group. We first introduce the language by example, and describe its uses in industry. We then provide a formal semantic definition of the core read-query features of Cypher, including its variant of the property graph data model, and its "ASCII Art" graph pattern matching mechanism for expressing subgraphs of interest to an application. We compare the features of Cypher to other property graph query languages, and describe extensions, at an advanced stage of development, which will form part of Cypher 10, turning the language into a compositional language which supports graph projections and multiple named graphs.
The paper describes the present and the future of graph updates in Cypher, the language of the Neo4j property graph database and several other products. Update features include those with clear analogs in relational databases, as well as those that do not correspond to any relational operators. Moreover, unlike SQL, Cypher updates can be arbitrarily intertwined with querying clauses. After presenting the current state of update features, we point out their shortcomings, most notably violations of atomicity and non-deterministic behavior of updates. These have not been previously known in the Cypher community. We then describe the industry-academia collaboration on designing a revised set of Cypher update operations. Based on discovered shortcomings of update features, a number of possible solutions were devised. They were presented to key Cypher users, who were given the opportunity to comment on how update features are used in real life, and on their preferences for proposed fixes....
Seventh IEEE International Conference on E-Commerce Technology Workshops
Web Service protocol standards should be unambiguous and provide a complete description of the allowed behavior of the protocols' participants. Implementation of such protocols is an error-prone process, firstly because of the lack of precision and completeness of the standards, and secondly because of erroneous transformation of semantics from the specification to the final implementation. Applying the TLA+ paradigm, we first consider the protocol on an abstract level. Safety properties taken from real-world scenarios are compared to the facilities of the protocol. As a result, we identified some limitations in the applicability of the WS-BA protocol to abstract application use cases modelled from the real-world scenarios. These limitations are an omission of possible activities seen in the real world. Further, WS-C and WS-BA make assumptions about the internal structures of the participants, violating the SOA paradigm. The former error could be detected by the use of formal methods. The latter can be circumvented by a sophisticated implementation strategy. The proposed strategy of implementing WS-Coordination and WS-BusinessActivity allows non-intrusive integration of the transactional framework, considering SOA requirements. This paper describes the results of analysis and some design decisions taken during the proof-of-concept implementation of WS-C and WS-BA frameworks.
Proceedings of the 2022 International Conference on Management of Data
As graph databases become widespread, the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) have approved a project to create GQL, a standard property graph query language. This complements the SQL/PGQ project, which specifies how to define graph views over a SQL tabular schema, and to run read-only queries against them. Both projects have been assigned to the ISO/IEC JTC1 SC32 working group for Database Languages, WG3, which continues to maintain and enhance SQL as a whole. This common responsibility helps enforce a policy that the identical core of both PGQ and GQL is a graph pattern matching sub-language, here termed GPML. The WG3 design process is also analyzed by an academic working group, part of the Linked Data Benchmark Council (LDBC), whose
Papers by Alastair Green