
OpenCitations’ renewed compliance with the Principles of Open Scholarly Infrastructure (April 2026)

OpenCitations has been a formal adopter of the Principles of Open Scholarly Infrastructure (POSI) since its first self-assessment in August 2021. At that time, we had only recently been included in the SCOSS funding round and, although we had a clear vision of what we wanted to build and the role we wanted to play within the Open Science ecosystem, both our financial and human resources were still very limited. For this reason, the POSI self-assessment proved to be an important exercise, since it allowed us to critically reflect on both the strengths we could build upon and the weaknesses we needed to address as we developed OpenCitations into a sustainable, community-governed Open Science infrastructure.

That first assessment highlighted some key areas of attention. In particular, it pointed to the limited executive power of the community bodies involved in governance, as well as challenges related to long-term financial sustainability, especially the ability to generate and manage financial surpluses.

Since 2021, OpenCitations has evolved significantly, becoming a more complex infrastructure by expanding both the services it offers and the number of data sources it integrates. During this period, we have also expanded our team, worked on the development of a container-based infrastructure, and relaunched our governance framework.

These developments made it necessary to update our POSI self-assessment. This update is also timely, given the recent release of POSI Version 2 in October 2025. The new version of the principles is the result of a collective effort by POSI adopters and includes several clarifications to existing principles (for example, around the concept of lobbying and the distinction between transparent governance and transparent operations). In addition, new principles have been introduced to better reflect how scholarly infrastructures operate within the broader research ecosystem. 

We are therefore happy to share our updated POSI self-assessment, which provides a comprehensive snapshot of OpenCitations as of April 2026. At the same time, it represents a renewed declaration of our commitment to the POSI principles. 

Disclaimer. As with many early POSI adopters, our previous assessment used a traffic-light system to indicate the level of alignment with each principle. In this system: 

  • Green indicates that the principle is fully met and evidenced in practice. 
  • Yellow indicates that the principle is partially met, with active steps underway to achieve full alignment. 
  • Red indicates that the principle is not currently met, or that compliance is not feasible. 

During the POSI Version 2 working group discussions, it emerged that the use of this traffic-light system is not mandatory for POSI self-assessments. Nevertheless, we have decided to maintain this approach, as we believe it remains particularly effective as a communication tool and, at the same time, ensures continuity with our previous assessment. We have, however, taken the liberty of adding one more element: a symbol to indicate where we were already “green” in 2021, but have further improved our performance. 

The symbol is therefore a “plus” placed next to the green traffic light.

Coverage across the scholarly enterprise

OpenCitations demonstrates broad coverage by collecting citation data from global scholarship, ensuring representation across disciplines, geographic areas, and research communities. The scope of OpenCitations’ coverage is universal, not limited to a particular scholarly domain, nor to the English language, nor restricted by imposed acceptance criteria.

Stakeholder governed

Governance is structured to reflect the stakeholder community, with the International Advisory Board elected by the Council of members and responsible for approving the Trustee Network. For more information: https://opencitations.net/governance/ 

Non-discriminatory participation or membership

Membership is open to all individuals and organisations that support Open Science principles, enabling inclusive and non-discriminatory participation in line with OpenCitations’ founding values.

Transparent governance

OpenCitations maintains a high level of transparency by publicly documenting its organisational structure, governance processes, and decision-making procedures. Reports and relevant documentation are shared with the community, including annual reports and the Rules of Membership and Organizational Bodies.

Cannot lobby

OpenCitations does not engage in political or financial lobbying activities. Its role remains focused on supporting the scholarly community without pursuing regulatory changes that would advantage its own position.

Living will

A clear long-term stewardship strategy is in place, ensuring that data, services, and infrastructure can be responsibly transferred if needed. This is supported by a recently implemented fully replicable technical infrastructure and a new governance model designed to facilitate handover thanks to the presence of the Trustee Network, which is capable of appointing new hosting members to physically and administratively host the infrastructure.

Regular review of purpose and community value

OpenCitations has recently monitored its relevance and community value through mechanisms such as community surveys. In addition, it has established governance and technical strategies to support a responsible wind-down process, including, if necessary, the transfer of assets and infrastructure through its Trustee Network. The Trustee Network is also responsible for regularly monitoring OpenCitations’ adherence to its mission and values, as well as its annual activities and finances.

Transparent operations

OpenCitations ensures a high level of operational transparency by openly providing key documentation, including financial reports, a public roadmap, a mission statement and value proposition, a sustainability model and fee structure, and privacy policies.

Time-limited funds used only for time-limited activities

Grant income is restricted to funding specific, time-limited projects, including the appointment of personnel working on them, while core operations are not dependent on such funding. 

Goal to generate surplus

Thanks to memberships and donations, as well as in-kind support from the University of Bologna, OpenCitations has recently achieved the financial capacity to ensure stability until 2029, at least, in terms of funds allocated for staff salaries and technical operational expenses.

Establish and maintain financial reserves guided by policy

Although OpenCitations has reached a level of financial stability that ensures a budget surplus, there is currently no formal financial policy defining the amount of reserves to be allocated for a transition or wind-down plan, or to address exceptional or unforeseen events. However, OpenCitations has already initiated the necessary consultations to develop a Financial Reserve Policy, which will not only define the level and management of reserves but also provide a clear framework for handling revenues. In addition, it will establish and formalise procedures for approving both budget forecasts and actual expenditures.

Mission-consistent revenue generation

Revenue generation is aligned with the organisation’s mission (in particular, the value proposition according to which “external financial support is required from the stakeholder community to support OpenCitations and enable it to expand its delivery of high-quality comprehensive open bibliographic and citation metadata”), primarily through community funding via membership fees and donations. OpenCitations’ members are listed on the website: https://opencitations.net/members-and-donors/

Revenue generated from services, not data

OpenCitations charges no fees for any of its services, access to its data, or reuse of its software. OpenCitations members, donors and third parties all have equal free access.

Volunteer labour

OpenCitations’ core operations are carried out by paid staff, ensuring that the continuity and reliability of its services do not depend on volunteer labour. Nevertheless, OpenCitations recognises the value of voluntary labour. Indeed, the document Rules of Membership and Organizational Structure specifies that membership of the International Advisory Board is honorary and without remuneration, and that the Hosting Entity will reimburse all reasonable expenses related to travel, accommodation, and meals incurred in attending meetings, within the limits of the budget allocated for such purposes. While this is already stated in the governance framework, it is important to reiterate and further formalise it within the Financial Policy document currently under development. More broadly, the OpenCitations Mission Statement emphasises the importance of community engagement (voluntary and non-remunerated) through, for example, the involvement of community actors in the direct provision and curation of OpenCitations data, as well as the broader role of the community in supporting funding and participating in governance.

Transition planning

Transition planning is only partially developed. While the governance structure is in place, ensuring a management handover through the possibility of changing the hosting member upon approval by the Trustee Network, there is a lack of detailed operational documentation for individual roles within the management and technical teams, which may limit the organisation’s ability to ensure immediate continuity in the event of key personnel changes.

Open source

All OpenCitations software is released under open source licences, ensuring full transparency, reusability, and the possibility for the community to inspect, modify, and replicate the infrastructure.

Open data

To ensure the greatest possible reusability, all OpenCitations data is published under a Creative Commons CC0 Public Domain Waiver that permits downloading and re-use of any nature, including added-value re-purposing and commercial exploitation.

Available data

Data are provided through multiple access points, including REST APIs, SPARQL endpoints, query interfaces, and downloadable data dumps. This ensures broad accessibility and supports diverse use cases across the community.
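As a rough illustration of what working with a REST access point might look like, the sketch below builds a request URL for the citations of a DOI and reads the citing identifiers out of a JSON response. It is a minimal sketch under stated assumptions, not the actual OpenCitations API: the base URL, the path layout, the `citing`/`cited` field names, and the sample identifiers are all hypothetical, and the official API documentation remains the reference for the real endpoints and response format.

```python
import json
from typing import List
from urllib.parse import quote

# Assumption: placeholder base URL and path layout, for illustration only.
BASE_URL = "https://example.org/index/api"


def citations_url(doi: str, base_url: str = BASE_URL) -> str:
    """Build a hypothetical 'incoming citations' request URL for a DOI."""
    # Normalise the DOI (trim whitespace, lowercase) and percent-encode it,
    # keeping the slash that separates the DOI prefix from the suffix.
    normalised = quote(doi.strip().lower(), safe="/")
    return f"{base_url}/citations/doi:{normalised}"


def citing_ids(payload: str) -> List[str]:
    """Return the 'citing' value of each record in a JSON array."""
    # Assumption: the response is a JSON array of objects, each with a
    # 'citing' key; records without that key are skipped.
    records = json.loads(payload)
    return [record["citing"] for record in records if "citing" in record]


# Made-up payload standing in for a server response (not real data).
sample = json.dumps([
    {"citing": "doi:10.1234/a", "cited": "doi:10.5678/z"},
    {"citing": "doi:10.1234/b", "cited": "doi:10.5678/z"},
])

print(citations_url(" 10.5678/Z "))
print(citing_ids(sample))
```

Keeping the URL construction separate from the response parsing means the same parsing function could, in principle, be reused on records obtained from a downloaded data dump rather than from a live request.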

Patent non-assertion

OpenCitations commits not to pursue patents, ensuring its infrastructure remains fully open and replicable. 

Prioritise interoperability and open standards to ensure continuity and resilience

The infrastructure has been redesigned as a container-based solution, thereby facilitating replication, deployment across different environments, and long-term service continuity.

From the Community Survey to action: the OpenCitations Roadmap for 2026

In February, we published a blog post presenting the main outcomes of the Community Survey we conducted between October and December 2025 to help guide the future direction of OpenCitations. As we explained in that post, the data collection and analysis phase resulted in a report that we have made openly available on Zenodo. Following the publication of the report, the OpenCitations Team entered a phase of internal evaluation to define priorities for the coming year. In doing so, we carefully considered the need to ensure that OpenCitations continues to grow in a sustainable way, aligned with the capacity and resources of our team. 

As a first step in this process, the survey outcomes were presented at the International Advisory Board Meeting held in February. This meeting provided an opportunity for an open, constructive discussion of OpenCitations’ strategic direction. The feedback and reflections from that dialogue were fundamental in shaping the next stage of planning. Indeed, building on these discussions, a dedicated internal Task Force Meeting was then held to define the roadmap priorities. This meeting involved the Director, the CTO, the Research Manager, the Systems Administrator, and the Community Manager, who worked together to translate the survey insights and the International Advisory Board feedback into concrete strategic decisions. 

The starting point for this discussion was a slide highlighting three key areas of debate: 

  • Whether to prioritize facilitating the use of OpenCitations by improving interfaces, tutorials, and documentation or to focus primarily on expanding metadata coverage. 
  • Whether to reinforce the role of OpenCitations as a backend infrastructure, or to invest more effort in frontend development to increase its visibility. 
  • Whether to pursue a vertical growth model, focused on strengthening the existing core community, or a more horizontal approach aimed at achieving broader adoption across the research community. 

Based on these considerations, we decided to focus on a limited set of priorities for 2026. It’s important to note that the aspects that are not included in the most immediate roadmap should not be interpreted as being dismissed. On the contrary, since all these points originate from valuable feedback we received from the community, each of them represents an area that we intend to address in the future, in proportion to the size of our team and the resources available. We have thus defined a set of realistic objectives for the period M3–M12 of 2026 that are consistent with the nature of OpenCitations and with the work already underway. 

A key outcome of this planning phase is the decision to strengthen our engagement with the core community during 2026. One important step in this direction will be the development of tutorials tailored to specific use cases. For example, we plan to create a clear and comprehensive “Getting Started” tutorial to support Web developers in working with the OpenCitations APIs. At the same time, we will continue working on expanding our data coverage by pursuing the ingestion of new data sources. In particular, during 2026, we will integrate data from OUTCITE and MATILDA into OpenCitations.  

Among the use cases identified by the community, a particularly relevant one concerns semantic interoperability within research infrastructures. For this reason, we have decided to focus significantly on backend development in this area. This direction is consistent with the work we are already carrying out within the GRAPHIA project, where the implementation of a REST API endpoint compliant with the Scientific Knowledge Graph – Interoperability Framework (SKG-IF, https://skg-if.github.io) represents one of the key assets, with the aim of defining a common mechanism for interoperability across open scholarly infrastructures. The SKG-IF recommendation includes the definition of a shared exchange format, interoperability mechanisms based on SPAR ontologies, a common validation approach, and an API specification.  

Finally, one activity that sits at the intersection between backend development and coverage expansion is the work we are conducting with TIB Hannover and PKP (the developers of OJS) on crowdsourcing citation data. PKP, OpenCitations, and TIB have started implementing a workflow within OJS to support this effort. In particular, TIB has developed an OJS plugin that enables the ingestion of citations from OJS journals directly into OpenCitations. During 2025, OpenCitations worked on the development of an automated crowdsourcing workflow that will allow these citations to be directly integrated into our collections.

All these activities have been collected, together with their respective timelines, within our public roadmap, which remains the main reference point for monitoring the progress of OpenCitations development:  https://trello.com/b/RprHYoKL/opencitations  

Now that the analysis and planning process related to the Community Survey has reached its conclusion, we would like to renew our sincere thanks to everyone who contributed to it. We are grateful to the members of our community who responded so thoughtfully to the survey, to the partners who helped disseminate it, and to the members of the International Advisory Board who supported both the design of the survey and the strategic reflections that followed. The dense and coherent roadmap that has emerged from this process is the result of a collective effort. It reflects OpenCitations’ commitment to listening to its community and turning those insights into concrete objectives that will guide our work, strengthening the availability of open metadata and the interoperability between research infrastructures.

OpenCitations is a plural: insights from the OpenCitations Community Survey

At OpenCitations, we have always described ourselves as a community-driven and community-governed infrastructure. This definition deeply reflects our nature, since OpenCitations was born as a non-profit Open Science infrastructure to support open scholarship for the benefit of the scholarly community, and it continues to rely on that same community for its sustainability and governance, through memberships and participation in strategic decision-making processes.

For this reason, it is essential for us that both our technical development and our outreach activities remain aligned with the real needs of the academic community. Over the past year, we have focused on strengthening our technical infrastructure and increasing awareness of OpenCitations’ values and activities. We are now entering a phase in which future priorities and developments need to be defined, at a time when the landscape of open citation services has become more crowded and complex. This makes it increasingly important for OpenCitations to clarify its positioning within the research ecosystem. 

To address this need, we came up with a simple insight: instead of assuming what the community needs, we decided to ask directly. This intuition led to the idea of launching a Community Survey. Developed with the support and guidance of the OpenCitations International Advisory Board, the survey was open to anyone in the academic community, including users, members, and people encountering OpenCitations for the first time. It was intentionally broad, covering both technical and outreach aspects. Its goal was not to evaluate performance, but to better understand how OpenCitations is perceived, used, and expected to evolve. 

The survey was open from 20 October to 16 December 2025, and we would like to sincerely thank everyone who took the time to respond and to help disseminate it within their networks. The feedback we received was of high quality, showing a strong level of engagement and a genuine willingness to provide thoughtful answers, especially through the open-ended questions. 

Respondents came from a wide range of backgrounds, primarily within academia and research, including researchers, librarians, information specialists, developers, and professionals working on research infrastructures and scholarly communication. Affiliations were mostly universities, research institutes, libraries, and non-profit infrastructures, reflecting a community already closely connected to Open Science practices. Levels of familiarity with OpenCitations varied considerably, from regular users to people who discovered the infrastructure for the first time through the survey. 

Across responses, OpenCitations was consistently associated with strong values such as openness, transparency, reproducibility, and independence from commercial interests. A recurring theme is the trust in the quality of the data and in the team maintaining the infrastructure. At the same time, awareness of specific services was uneven. While bibliographic and citation databases are relatively well known, more technical components such as APIs, data models, and SPARQL endpoints are mainly familiar to technically experienced users. 

This perception was reflected by usage patterns: OpenCitations often operates behind the scenes, with its data adopted within repositories and other scholarly tools. Citation analysis and metrics are another common use case, where OpenCitations is valued as an open alternative data source to proprietary databases. More specialised applications, such as knowledge graph construction and editorial workflows, were also mentioned, highlighting the flexibility of open citation data. 

When it comes to choosing OpenCitations over other databases, one of the strongest motivations is alignment with Open Science values. Respondents repeatedly emphasised the importance of open licences and the public-good orientation of the infrastructure. At the same time, several barriers remain, concerning in particular the limited coverage in certain areas, delays in data updates, technical complexity, and fragmented documentation. Another limit to adoption is the lack of awareness, together with institutional reliance on commercial databases. 

Regarding community engagement, awareness of OpenCitations often spreads through informal academic networks, conferences, workshops, and mailing lists. Active participation, however, remains limited, since many respondents follow OpenCitations’ activities without directly engaging. Still, there is a clear interest in deeper involvement, especially when it comes to contributing to Open Science, learning new skills, and aligning with institutional priorities around openness and transparency. 

Looking ahead, respondents expressed strong expectations around the expansion and enrichment of citation data, improved interoperability with other infrastructures, tutorials to support non-technical users, and more user-friendly interfaces. 

We are very glad to have opened this dialogue with the community. While some of the feedback confirmed issues we were already aware of, the survey also revealed aspects we had not fully considered, particularly regarding how OpenCitations is perceived within the broader community. Overall, the Community Survey confirms that OpenCitations is widely recognised as a valuable and trustworthy open infrastructure for open science. At the same time, it highlights concrete areas for improvement, especially in usability, coverage, and community engagement.

We see this survey not as a conclusion, but as a starting point for ongoing dialogue. The detailed results of the survey, including charts and word clouds, are available on Zenodo for public consultation:   

Di Giambattista, C., & Peroni, S. (2026). Summary Report of the OpenCitations Community Survey 2025. Zenodo. https://doi.org/10.5281/zenodo.18470862 

This survey now opens a phase of reflection that will inform future steps in our roadmap, and may also lead us to repeat this experiment in the future, exploring alternative formats for community engagement beyond surveys. Above all, we look forward to sharing concrete outcomes of this collaborative process with the community, as we continue to build OpenCitations with the community and for the community. 

 

Building, Maintaining, Listening: OpenCitations’ 2025

Looking back, 2025 was a year defined for OpenCitations by both visible engagement and foundational work: while the technical infrastructure and collaborative workflows were strengthened behind the scenes, many accomplishments were shared openly with (and for the benefit of) the community. What follows, therefore, is not just a list of milestones but a narrative of alignment between the technical architecture, outreach, and collaboration.

Embracing and managing growth

One of the central challenges for OpenCitations in 2025 was how to grow without losing quality. Over the year, four major data dumps were released: two for OpenCitations Meta and two for OpenCitations Index. The last 2025 Index dump, released in July, surpassed 2.2 billion citation records. We acknowledge that dumps were released less frequently than usual. This slower pace reflects the technical team’s commitment, over the last two years, to a comprehensive infrastructure redesign focused particularly on data refinement and quality improvement. The goal was to provide fewer dumps, but with higher quality. Now that the mechanisms are well-established, we expect to increase the frequency of dumps in 2026.

Working to include new data sources

Many of the advances of 2025 emerged not from isolated development but from collaboration. Significant effort went into improving how citation data enters OpenCitations in the first place. Collaborations with external projects explored new ways to make the ingestion of citation data more efficient, potentially widening the range of sources that can contribute to the open citation ecosystem.

In particular, PKP, OpenCitations, and TIB have started implementing a workflow in OJS. TIB has created a plugin for OJS to enable ingestion of OJS citations directly into OpenCitations. During 2025, OpenCitations worked on an automated crowdsourcing workflow to ingest these citations directly into its collections.

Similarly, collaboration with the Matilda project, a bibliographic and bibliometric platform for open science, focused on validating tabular bibliographic data before ingestion, with the perspective of including Matilda citations as a future source of OpenCitations.

Re-engineering for the living will

Part of the 2025 work took place out of sight. The OpenCitations technical team undertook a comprehensive redesign of the infrastructure architecture, moving toward a microservices-based approach supported by Infrastructure as Code (IaC).

The redesign was guided by the long-term perspective of improving scalability and reliability, reducing operational friction, and ensuring that the infrastructure could move, adapt, and survive beyond any single deployment context. The result is an infrastructure that can be redeployed or transferred with minimal overhead. In this context, in light of the meaningful progress made in ensuring the “living will” of the infrastructure, in 2026 we plan to update our compliance assessment with the Principles of Open Scholarly Infrastructure, currently at version 2.0, released in October 2025. This version clarifies and expands on governance, sustainability and insurance principles for open scholarly infrastructures, including commitments to stakeholder governance, open licensing, clear data availability and interoperability to ensure resilience and continuity.

At the same time, attention turned to the question of quality as an ongoing practice. In a small pilot, we successfully tested HERITRACE, a semantic data-curation tool developed for the GLAM sector and announced at the end of 2024, which enables intuitive editing of bibliographic data along with provenance and change tracking. It stands out from existing platforms thanks to its ease of use, strong provenance management, and high customisation. Within OpenCitations, HERITRACE will allow users to curate bibliographic and citation metadata, improving overall data quality.

OpenCitations’ window display completed

In early 2025, OpenCitations also turned its attention outward, introducing a renewed visual identity, including a new logo. This was not simply an aesthetic update, but part of a broader effort to better communicate OpenCitations’ values: openness, transparency, and community.

The website redesign followed the same logic. With a cleaner structure and clearer navigation, the site now offers faster access to core services, while also making OpenCitations itself more visible. It also places greater emphasis on OpenCitations’ community-driven nature, with expanded sections highlighting the organisation’s mission, values, team, membership programme, partners, and international collaborations, making it easier for all users to engage with and understand OpenCitations’ work.

Talking about Open Research Information at WOOC2025

In May 2025, OpenCitations hosted the Workshop on Open Citations and Open Scholarly Metadata (WOOC) in Bologna. Over two days, researchers, publishers, funders, policy-makers, and infrastructure providers came together to discuss open access to research information.

The workshop combined strategic discussion, such as the Bologna Meeting on Open Research Information, with community presentations, posters, and invited talks, creating a space where policy, infrastructure, and scholarly practice could meet. As in previous editions, WOOC served not only as a venue for sharing work, but as a space for coordination, alignment, and long-term thinking around open scholarly metadata.

From audience to co-creation

As part of our outreach effort to improve community awareness of OpenCitations, in October we launched our own Newsletter, creating a regular channel for sharing updates, technical developments, and community stories, and helping to keep a diverse international audience informed and connected.

In parallel, OpenCitations launched a Community Survey to actively gather feedback from users, partners, and the research community as a whole. Rather than assuming needs or priorities, the survey explicitly invited critique, uncertainty, and suggestion, treating community members not as end users, but as co-creators of OpenCitations’ future. The responses will directly inform future development, reinforcing the idea of OpenCitations as a genuinely community-driven infrastructure.

What’s next?

If 2025 can be characterised by a single idea, it is continuity. Rather than chasing novelty, OpenCitations invested in making its data more reliable, its infrastructure more resilient, and its relationship with the community stronger.

These foundations matter. They enable growth without fragility, collaboration without dependency, and openness without compromise. As OpenCitations moves forward, the work of 2025 will remain present: not always visible, but deeply embedded in how open citation data is produced, shared, and sustained.

Shape the future of OpenCitations: take our Community Survey

OpenCitations has always existed thanks to and for its community, a diverse network of institutions and individuals who believe in the value of open scholarly data.

As a community-based open infrastructure, our strength lies in collaboration. The insights, feedback, and experiences shared by our community partners are what help us refine our services and keep our mission aligned with the evolving global research ecosystem.

To continue building an infrastructure that truly reflects the needs of the scholarly community, we’ve now launched the OpenCitations Community Survey.

Whether you are a supporter, a partner institution, a user, or have simply come across OpenCitations by chance (even if you’ve never used our data or looked closely at what we do), this survey is your opportunity to share your experience, tell us what works, what could be improved, and what you’d like to see in the future.

Your insights will directly inform how we evolve our services and activities to better support your work, research, tools and, more generally, the circulation of open knowledge.

It takes about 10 minutes to complete, and every single response helps us strengthen OpenCitations as a community-driven open infrastructure.

👉 Take the survey and help shape the future of OpenCitations:
https://forms.cloud.microsoft/e/GYSZ230686

An announcement from GraspOS: Final Conference of the GraspOS Project

SAVE THE DATE

Opening Research Assessment

Final Conference of the GraspOS Project

12–13 November 2025

CNR Area della Ricerca di Pisa, Italy

We are pleased to announce the Final Conference of the GraspOS project: Opening Research Assessment, taking place on 12–13 November 2025 in Pisa, Italy.

For research assessment to truly support Open Science, we must prioritise transparency, inclusivity, and fairness in how we recognise research activities and contributions. This means not only ensuring that all contributions to Open Science are valued but also promoting openness in the data, tools, services, and other resources used in the assessment process.

The Conference will present the results of the GraspOS project and engage in discussions on transforming research assessment into a system that is open, responsible and aligned with the principles of Open Science.

Invited talks and expert panels will address critical topics for advancing a responsible research assessment system that fully embraces Open Science principles including:

● Open infrastructures for responsible research assessment

● Transparent and inclusive assessment practices

● Recognition of contributions to Open Science

The event will be hybrid and free of charge; registration will be required. A call for poster contributions aligned with these themes will open soon, and a detailed programme and registration link will follow shortly.

The conference will be followed by the final event of the CoARA Italian National Chapter on the afternoon of 13 November.

Stay tuned, and please save the date!

The GraspOS Project Team


Website: https://graspos.eu/graspos-conference-2025 

GRAPHIA Project Launched in January 2025

OPERAS hosted the launch of the GRAPHIA project in Brussels earlier this year. On 22 and 23 January 2025, 21 partner institutions met at OPERAS’ office and online to celebrate the project, which is funded by the European Commission with over €8 million, plus an additional €1.6 million from the Swiss State Secretariat for Education, Research and Innovation. GRAPHIA aims to create the first comprehensive Social Sciences and Humanities (SSH) Knowledge Graph, designed to integrate fragmented data into a unified entry point. The focus will be on SSH disciplines, which contribute essential knowledge to society, influencing culture, economics and ethical decisions, among other areas. The project is expected to run from January 2025 to December 2027.

OPERAS coordinates the project, which aims to significantly improve SSH data visualisation and analysis capacities through pioneering artificial intelligence (AI) solutions. The project addresses gaps in provision that leave SSH knowledge disconnected and poorly accessible. It will do so by leveraging AI to create a Knowledge Graph that delivers an expansive representation of knowledge across the diverse SSH disciplines. GRAPHIA will empower researchers to uncover patterns and insights from unstructured data, illuminating social phenomena and cultural trends with a clarity not available in current solutions.

A major highlight of GRAPHIA will be an SSH Citation Index, an innovative framework for citation data extraction and enrichment that will accelerate access to previous literature across the range of SSH disciplines. GRAPHIA integrates industry partners to amplify the project’s impact by drawing on the perspectives and expertise of a range of stakeholders, reflecting the influence of SSH disciplines on society. Such collaboration will motivate innovations that serve academics while being commercially viable, opening SSH disciplines and solutions to new markets and technologies. GRAPHIA signals its partner organisations’ commitment to open science and to increasing the capabilities of EU research infrastructures, enhancing global competitiveness while facilitating a broad and long-lasting impact of the project results.

Some of the OpenCitations personnel working at the University of Bologna are involved in GRAPHIA, developing tools to enable data extraction from PDFs, and OpenCitations itself serves in the project as a source of information for the Knowledge Graph.

To follow the progress of the GRAPHIA project, join us on Bluesky and/or LinkedIn.

Contact GRAPHIA at: [email protected]