Francesco Cutugno

Università degli Studi di Napoli "Federico II", Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, Faculty Member

Papers by Francesco Cutugno

Syllable classification using static matrices and prosodic features

In this paper we explore the usefulness of prosodic features for syllable classification. To this end, we represent the syllable as a static analysis unit, so that its acoustic-temporal dynamics can be merged into a single set of features that the SVM classifier considers as a whole. In the first part of our experiment we used MFCCs as features for classification, obtaining a maximum accuracy of 86.66%. The second part of our study tests whether prosodic information is complementary to cepstral information for syllable classification. The results show that combining the two types of information does improve classification, but further analysis is needed to combine the two types of features more successfully.

Automatic Speech Segmentation for Italian (ASSI): tools, models, evaluation and applications

HAL (Le Centre pour la Communication Scientifique Directe), Jan 26, 2011

Abstract. The main aim of this work is to provide a set of tools for automatic segmentation of ...

Automatic speech segmentation for Italian: tools, models, evaluation, and applications

On this website, a set of statistical models is made available that can be used for the automati...

Pitch and Functional Characterization of Hesitation Phenomena in Italian Discourse

Schettino L, Betz S, Cutugno F, Wagner P. Pitch and functional characterization of hesitation phe...

Multiple-source Data Collection and Processing into a Graph Database Supporting Cultural Heritage Applications

Journal on Computing and Cultural Heritage, 2021

The continuous growth of available resources on the web, both in the form of Linked Open Data and on Social Networks, provides an important opportunity to gather information concerning specific kinds of touristic activities such as cultural tourism, eco-tourism, bike-tourism, and so on. Both decision makers and tourists can take advantage of these data, as demonstrated by previous works, with institutional actors foreseeing an increase in the use of these data to replace other time-consuming and expensive approaches. However, managing multiple sources built with different goals and structures is not straightforward, so specific design choices must be made when assembling this kind of information. Graph databases represent an ideal way to combine multiple-source data but, to be successful, strategies accounting for inconsistencies and format differences have to be defined to support coherent analysis. Also, the continuously changing nature of crowd-sourced data makes i...

On the use of the rhythmogram for automatic syllabic prominence detection

Interspeech 2011, 2011

Title: On the use of the rhythmogram for automatic syllabic prominence detection. Authors: Bo...

Percezione e categorizzazione di foni vocalici: adeguatezza delle procedure sperimentali

ABSTRACT. This paper discusses some problems concerning the investigation methods used in studies of phonetic perception. In particular, we highlight the risk of circularity inherent in experimental procedures that rely on perception tests on isolated segments with a forced-choice response paradigm. The results of the experiment presented here call into question the usefulness and adequacy of such procedures and, above all, warn the experimenter against attributing general and absolute validity to the categorization induced by the specific task required and by the set of responses made available. INTRODUCTION. The working hypothesis stems from the analysis of the results of two vowel perception experiments aimed at defining categories and locating boundaries between adjacent vowel phones. In both experiments, attention was directed to the portion of the F1/F2 plane corresponding to the area of existence of the Italian vowels lying on the continuum [a, E, e, i]. In the first experiment [1], a vowel identification test was carried out with synthetic vowel stimuli, divided into three series along the [a-i] axis, in order to identify the perceptual boundaries between the four vowel categories under examination: the results of the test confirmed that perceptual areas of existence delimited by sharp, well-defined boundaries can be identified, and led to the hypothesis that a perceptual evaluation could serve as a useful supplementary criterion for defining vowel categories, alongside the traditional articulatory-acoustic one.

Limiti e complessità del recupero dell'informazione da Treebank sintattiche

Su Alcune Correlazioni Tra Riduzioni Segmentali Tratti Prosodici Nel Parlato Spontaneo: Il Ruolo Del Fattore Tempo

Un'indagine sulla definizione del confine percettivo tra foni vocalici

The author presents and discusses the results of two identification tests carried out to verify the existence of perceptual boundaries between different vowel phonemes in Italian. The focus is on the portion of the F1/F2 diagram that delimits the area of the central and front vowels (a, ɛ, e, i). Within this area, the author studies the fluctuations of the perceptual boundaries between adjacent phonemes which, in spoken Italian, show significant zones of overlap.

API: Archivio del parlato italiano

The vowel system of Italian connected speech

Multigranular Scale Speech Recognizers: Technological and Cognitive View

Lecture Notes in Computer Science, 2005

We propose a Multigranular Automatic Speech Recognizer. The hypothesis is that the speech signal contains information distributed over several different time scales. Many works from scientific fields ranging from neurobiology to speech technologies seem to agree on this assumption. In a broad sense, it seems that speech recognition in humans is optimal because of a partial parallelization process in which the left-to-right stream of speech is captured in a multilevel grid where several linguistic analyses take place simultaneously. In this view, our investigation aims to apply these new ideas to the design of more robust and efficient recognizers.

A dialogue system for multimodal human-robot interaction

Proceedings of the 15th ACM on International conference on multimodal interaction, 2013

ABSTRACT. This paper presents a POMDP-based dialogue system for multimodal human-robot interaction (HRI). Our aim is to exploit a dialogical paradigm to allow a natural and robust interaction between the human and the robot. The proposed dialogue system should improve the robustness and the flexibility of the overall interactive system, including multimodal fusion, interpretation, and decision-making. The dialogue is represented as a Partially Observable Markov Decision Process (POMDP) to cast the inherent communication ambiguity and noise into the dialogue model. POMDPs have been used in spoken dialogue systems, mainly for tourist information services, but their application to multimodal human-robot interaction is novel. This paper presents the proposed model for dialogue representation and the methodology used to compute a dialogue strategy. The whole architecture has been integrated on a mobile robot platform and has been tested in a human-robot interaction scenario to assess the overall performance with respect to baseline controllers.

New features in Spoken Language Search Hawk (SpLaSH): Query Language and Query Sequence

Proceedings of LREC2010, 2010

In this work we present further development of the SpLaSH (Spoken Language Search Hawk) project. ...

Sillabificazione fonologica e sillabificazione fonetica

Atti del XXXIII, Congresso della Società di …, 2001

AN. ANA. S.: aligning text to temporal syntagmatic progression in Treebanks

… of the 5th Corpus Linguistics Conference …, 2009

1. Introduction. The impressive results derived from multimillion-word corpora and the experience ...

EVALITA 2009: Abla srl Participant Report

Abstract. In this paper we describe the two systems we presented at the EVALITA 2009 workshop, fo...

Time-and Text-Aligned Annotations: the SpLaSH Data Model

In this work we present the SpLaSH data model. SpLaSH (Spoken Language Search Hawk) is a freely avai...

Reducing hardware and software complexity in Eye-tracking techniques

In recent years, eye tracking has become one of the most promising methodologies for human-com...