Sergey Parinov edited this page May 11, 2017 · 20 revisions

CitEcCyr AIMS AND CONCEPT

1. ABOUT THE PROJECT

The CitEcCyr project is to run: a) a public service that processes available research papers to build and update an open dataset of citation linkages; and b) a pilot version of an open scholarly infrastructure with the following features, which are to some degree new for the field of citation data processing:

  1. Open distributed architecture. We provide a concept, open source software and an initial core infrastructure for interoperable systems that process citation linkages from research papers.

Two nodes of this infrastructure, CitEc and CitEcCyr, are specialised by the language of the papers they process (Romano-Germanic languages and Russian, respectively). Other languages (Chinese, Japanese, Arabic, etc.) can be added in the same way. Nodes of this infrastructure can also specialise in other ways.

  2. Transparency. For each paper, publishers, authors and readers will be able to see how our system created its citation linkages and to trace why some of the paper's references were not processed or counted as citation linkages.

  3. Better presentation of citation linkages. We will count the number of mentions of each reference in a paper and present these mentions as semantic linkages. We will store the "text coordinates" of each reference mention together with its left and right context in the paper. Readers will see each mention as an open annotation object that points to the cited paper, with the ability to comment on it or to enrich the citation by linking it to a text fragment from the cited paper, etc.

  4. Enrichment. We will provide tools for authors to enter additional data that corrects errors made while processing citations from their papers. We will also provide a tool for authors to enrich their citation linkages, e.g. with qualitative characteristics of their motivation for citing other authors' papers.

  5. Public control. Readers will see how authors have used the enrichment facilities to increase their number of citations. The public will be able to react to improper behaviour by authors.
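
The "text coordinates" and left/right context mentioned in the item on better presentation of citation linkages could be stored in a record like the one sketched below. This is only an illustration under stated assumptions: the `Mention` class, its field names, the use of character offsets as coordinates and the 30-character context window are all hypothetical and are not taken from the CitEcCyr software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Mention:
    """One in-text mention of a reference (hypothetical record layout)."""
    marker: str         # the in-text marker, e.g. "[1]"
    start: int          # character offset where the mention begins
    end: int            # character offset just past the mention
    left_context: str   # text immediately before the mention
    right_context: str  # text immediately after the mention


def find_mentions(text: str, marker: str, window: int = 30) -> List[Mention]:
    """Collect every occurrence of `marker` with up to `window` characters of context."""
    mentions = []
    pos = text.find(marker)
    while pos != -1:
        end = pos + len(marker)
        mentions.append(
            Mention(
                marker=marker,
                start=pos,
                end=end,
                left_context=text[max(0, pos - window):pos],
                right_context=text[end:end + window],
            )
        )
        pos = text.find(marker, end)
    return mentions


# Invented sample text, used only to exercise the sketch.
sample = "Prior work [1] used ReDIF. We extend [1] and compare with [2]."
found = find_mentions(sample, "[1]")
for m in found:
    print(m.start, m.end, repr(m.left_context), repr(m.right_context))
```

The number of mentions of a reference is then simply `len(found)`, and each record carries enough information to locate the mention in the paper and display it as an annotation.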

2. BACKGROUND

The project builds on data and services already provided by RePEc (http://repec.org/, which collects paper metadata), CitEc (http://citec.repec.org/, which creates and visualises a dataset of citation linkages for papers from RePEc), the RePEc Author Service (https://authors.repec.org/, which holds authors' credentials and their relations to their papers) and some services at Sociorepec (https://sociorepec.org/) and Socionet (https://socionet.ru/).

The CitEcCyr system uses the CitEc architecture as a prototype and reuses some of its software modules. However, CitEc cannot process papers in Russian.

3. COMMON RESEARCH INFORMATION SPACE

A common research information space (ComRIS) is defined as:

A) collections (series) of metadata for public research papers that share a common or compatible ID model and a common metadata format;

B) public services that allow anyone whose papers satisfy the ComRIS requirements to include them in ComRIS freely, and also to browse ComRIS content and take the needed data for processing and for building new data and/or services.

The RePEc and Socionet model of collecting metadata works as a ComRIS. Below, therefore, ComRIS means the research paper data and services provided by RePEc and Socionet.

This means that the CitEcCyr system is configured to work with the RePEc model of paper IDs (Socionet uses the same model) and with its metadata format, called ReDIF.
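
To make the ID model and metadata format more concrete, here is a minimal sketch of reading one ReDIF-style record and splitting its handle into parts. It assumes the colon-separated handle layout `RePEc:archive:series:item` and the `Field-Name: value` line layout of ReDIF templates. The parser is deliberately simplified and is not the full ReDIF specification; the function names, the `Handle` tuple and the sample record (including the archive code `xxx`) are invented for illustration.

```python
import re
from typing import Dict, List, NamedTuple

# A field line starts at column 0 with a name followed by a colon.
FIELD = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(.*)$")


class Handle(NamedTuple):
    authority: str  # e.g. "RePEc"
    archive: str
    series: str
    item: str


def parse_handle(handle: str) -> Handle:
    """Split a handle into four colon-separated parts (no further validation)."""
    parts = handle.split(":", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"not a four-part handle: {handle!r}")
    return Handle(*parts)


def parse_template(text: str) -> Dict[str, List[str]]:
    """Simplified reader: repeated fields accumulate, indented lines continue a value."""
    fields: Dict[str, List[str]] = {}
    last_key = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            last_key = match.group(1).lower()
            fields.setdefault(last_key, []).append(match.group(2).strip())
        elif line.strip() and last_key is not None:
            # Continuation of the previous field's value.
            fields[last_key][-1] += " " + line.strip()
    return fields


# Invented sample record, used only to exercise the sketch.
sample_record = "\n".join([
    "Template-Type: ReDIF-Paper 1.0",
    "Author-Name: A. Author",
    "Author-Name: B. Author",
    "Title: An example title that",
    " continues on a second line",
    "Handle: RePEc:xxx:wpaper:001",
])

record = parse_template(sample_record)
handle = parse_handle(record["handle"][0])
print(handle.archive, handle.series, handle.item)
print(record["title"][0])
```

Because Socionet uses the same ID model, a node that reads records this way could in principle consume both sources; a production reader would also need to handle encodings, nested field clusters and values that begin a line with something like a URL.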

4. HOW THE SYSTEM SHOULD WORK

The CitEcCyr system creates citation relationships by processing Russian-language research papers from the common research information space provided by the RePEc and Socionet services. Figure 1 presents the general data flows of this processing.

Figure 1. Data flow diagram


A more detailed diagram of the main data processing stages and related workflows of the CitEcCyr system is presented in Fig. 2.

The CitEcCyr system (column «B» in Fig. 2) interacts with the CitEc system (column «C» in Fig. 2) and with a number of services of the common research information space (column «A» in Fig. 2).

Figure 2. Work flow diagram


5. CHALLENGES

Since we intend to create a fragment of open scholarly infrastructure, we recognise some challenges for collaboration with similar projects and systems:

a) we have to reach an agreement on:

  • how different systems that process citation data coordinate their input flow of research papers:

    • how to avoid duplication between them when processing research papers;

    • how to control and manage coverage of all available research papers, etc.;

  • how new systems can join this infrastructure and how they can specialise;

  • how different systems can organise their output data to present it to consumers as a common dataset;

b) we have to reach a better understanding of how our work relates to research community initiatives such as the scholarly commons:

  • in principle, we can propose scholarly commons principles such as:

    • authors of research papers have the right to know how their papers are used, including who used or cited them and when, which part of the paper was used, and for what purposes;

    • authors have the right to react publicly to the use of their papers, etc.