Caterina Caracciolo

This is a test
Address: Italy

Papers by Caterina Caracciolo

Creation and Use of Lexicons and Ontologies for NL Interfaces to Databases

In this paper we present an original approach to natural language query interpretation which has been implemented within the FuLL (Fuzzy Logic and Language) Italian project of BC S.r.l. In particular, we discuss here the creation of linguistic and ontological resources, together with the exploitation of existing ones, for natural language-driven database access and retrieval. Both the database and the queries we experiment with are Italian, but the methodology we broach naturally extends to other languages.

Topic driven access to scientific handbooks

Topic driven access to full text documents

Structured access to scientific information

Generating and Retrieving Text Segments for Focused Access to Scientific Documents

When presented with a retrieved document, users of a search engine are usually left with the task of pinning down the relevant information inside the document. Often this is done by a time-consuming combination of skimming, scrolling and Ctrl+F. In the setting of a digital library for scientific literature the issue is especially urgent when dealing with reference works, such as surveys and handbooks, as these typically contain long documents. Our aim is to develop methods for providing a “go-read-here” type of retrieval functionality, which points the user to a segment where she can best start reading to find out about her topic of interest. We examine multiple query-independent ways of segmenting texts into coherent chunks that can be returned in response to a query. Most (experienced) authors use paragraph breaks to indicate topic shifts, thus providing us with one way of segmenting documents. We compare this structural method with semantic text segmentation methods, both with respect to topical focus and relevancy. Our experimental evidence is based on manually segmented scientific documents and a set of queries against this corpus. Structural segmentation based on contiguous blocks of relevant paragraphs is shown to be a viable solution for our intended application of providing “go-read-here” functionality.

D1.1.3 NeOn Formalisms for Modularization: Syntax, Semantics, Algebra

NeOn-project.org. NeOn: Lifecycle Support for Networked Ontologies, Integrated Project (IST...

Results of the Ontology Alignment Evaluation Initiative 2008

Ontology matching consists of finding correspondences between ontology entities. OAEI campaigns aim at comparing ontology matching systems on precisely defined test sets. Test sets can use ontologies of different nature (from expressive OWL ontologies to simple directories) and use different modalities, e.g., blind evaluation, open evaluation, consensus. OAEI-2008 builds on previous campaigns by having 4 tracks with 8 test sets followed by 13 participants. Following the trend of previous years, more participants reach the forefront. The official results of the campaign are those published on the OAEI web site.

Towards Topic Driven Access to Full Text Documents

We address the issue of providing topic driven access to full text documents. The methodology we propose is a combination of topic segmentation and information retrieval techniques. By segmenting the text into topic driven segments, we obtain small and coherent documents that can be used in two ways: as a basis for automatically generating hypertext links, and as a visualization aid for the reader who is presented with a small set of focused and restricted text snippets. In the presence of a concept hierarchy, or ontology, information retrieval techniques can be used to connect the segments obtained to concepts in the ontology. In this paper we concentrate on the text segmentation phase: we describe our approach to segmentation, discuss issues related to evaluation, and report on preliminary results.

Towards Scientific Information Disclosure Through Concept Hierarchies

... al. (Eds.), VWF Berlin, 2002.

Networked Ontologies from the Fisheries Domain

In this paper we report on ongoing work concerning the creation of a network of ontologies based on metadata for time series relative to the domain of fisheries, and hint at the possibility of exploiting the network for web service applications. The results obtained so far show that the reengineering of classification systems stored as relational databases is possible, although some technical problems are still to be addressed.

D7.4.1 Software architecture for managing the Fisheries ontologies lifecycle

... Soonho Kim, Marta Iglesias Sucasas, Caterina Caracciolo, Andrew Bagdanov (FAO); ... Óscar Muñ...

Towards Interoperability of Geopolitical Information within FAO

Requirements for the Treatment of Multilinguality in Ontologies within FAO

International organizations like FAO are intrinsically multilingual. FAO is currently experimenting with semantic-oriented technologies based on ontologies, with the purpose of integrating data across various information systems and providing better services to end users. However, in order for these technologies to be used in real-life scenarios, models and tools for accommodating and managing multilingual data are needed. This paper analyzes the requirements for the treatment of multilinguality as resulting from the experience we gained at FAO.

Comparing human and automatic thesaurus mapping approaches in the agricultural domain

Computing Research Repository, 2008

Knowledge organization systems (KOS), like thesauri and other controlled vocabularies, are used to provide subject access to information systems across the web. Due to the heterogeneity of these systems, mapping between vocabularies becomes crucial for retrieving relevant information. However, mapping thesauri is a laborious task, and thus big efforts are being made to automate the mapping process. This paper examines two mapping approaches involving the agricultural thesaurus AGROVOC, one machine-created and one human-created. We address the basic question "What are the pros and cons of human and automatic mapping and how can they complement each other?" By pointing out the difficulties in specific cases or groups of cases and grouping the sample into simple and difficult types of mappings, we show the limitations of current automatic methods and come up with some basic recommendations on what approach to use when.

Talks by Caterina Caracciolo

Soil vocabularies and Open data

by Giovanni L'Abate and Caterina Caracciolo

On Open Data: Intellectual property and Data license
Role of standard vocabularies to search for and describe Open Data, especially in the context of soil data infrastructures
agINFRA work on lifting the local values used in ISIS to published, linked vocabularies
agINFRA Soil Terms vs Agrovoc & NALT
Toward a real interoperability of soil data: SOIL.WRB
