Papers by Giovanni Di Liberto

Scientific Reports, Mar 2, 2021
Healthy ageing leads to changes in the brain that impact upon sensory and cognitive processing. It is not fully clear how these changes affect the processing of everyday spoken language. Prediction is thought to play an important role in language comprehension, where information about upcoming words is pre-activated across multiple representational levels. However, evidence from electrophysiology suggests differences in how older and younger adults use context-based predictions, particularly at the level of semantic representation. We investigate these differences during natural speech comprehension by presenting older and younger subjects with continuous, narrative speech while recording their electroencephalogram. We use time-lagged linear regression to test how distinct computational measures of (1) semantic dissimilarity and (2) lexical surprisal are processed in the brains of both groups. Our results reveal dissociable neural correlates of these two measures that suggest differences in how younger and older adults successfully comprehend speech. Specifically, our results suggest that, while younger and older subjects both employ context-based lexical predictions, older subjects are significantly less likely to pre-activate the semantic features relating to upcoming words. Furthermore, across our group of older adults, we show that the weaker the neural signature of this semantic pre-activation mechanism, the lower a subject's semantic verbal fluency score. We interpret these findings as indicating that prediction plays a generally reduced role at the semantic level in the brains of older listeners during speech comprehension, and that these changes may be part of an overall strategy to successfully comprehend speech with reduced cognitive resources.

Healthy ageing is accompanied by a myriad of sensory and cognitive changes. These include a decline in working memory [1] and episodic memory [2], as well as hearing loss [3] and a slowing of processing across cognitive domains [4]. Given that spoken language comprehension is a multifaceted cognitive skill involving all of these processes, it is remarkable that it remains relatively stable across a healthy adult's lifespan. An interesting question, therefore, is whether the neural systems supporting successful language comprehension undergo a strategic shift with age to preserve performance in the face of decline, resulting in measurable differences between younger and older adults engaged in comprehension tasks. A related question is whether such differences play into the reported extra difficulties that older adults experience in trying to follow everyday conversational speech, especially in challenging listening environments. Electrophysiology studies have indicated that age-related differences exist in the neural signatures relating to higher-level linguistic processing [11]. This has been shown consistently in studies examining the N400 component.
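
The "time-lagged linear regression" named above is the standard temporal response function (TRF) approach. Below is a minimal sketch of that idea, not the authors' code: the sampling rate, lag window, ridge parameter, and random stand-in data are all illustrative assumptions.

```python
# Minimal sketch of time-lagged linear regression (a temporal response
# function, TRF). All names, rates, and data below are illustrative
# assumptions, not the authors' pipeline.
import numpy as np

def lagged_design_matrix(stimulus, lags):
    """Stack time-shifted copies of a stimulus feature (n_samples,)
    into a design matrix of shape (n_samples, n_lags)."""
    n = len(stimulus)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        X[lag:, j] = stimulus[:n - lag] if lag > 0 else stimulus
    return X

fs = 128                              # assumed EEG sampling rate (Hz)
lags = np.arange(0, int(0.6 * fs))    # 0-600 ms lag window, as is typical
rng = np.random.default_rng(0)
surprisal = rng.random(10 * fs)       # stand-in word-level feature
eeg = rng.standard_normal(10 * fs)    # stand-in single EEG channel

X = lagged_design_matrix(surprisal, lags)
lam = 1.0                             # ridge parameter (cross-validated in practice)
# Ridge-regularised least squares: w = (X'X + lam*I)^-1 X'y
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)
prediction = X @ w                    # EEG predicted from the feature
```

Stacking time-shifted copies of the feature lets one linear fit estimate how the neural response to each word unfolds over the whole lag window.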

HAL (Le Centre pour la Communication Scientifique Directe), Dec 7, 2020
Spatial hearing allows the localization of sounds in complex acoustic environments. There is considerable evidence that this neural system rapidly adapts to changes in sensory inputs and behavioral goals. However, the mechanisms underlying this context-dependent coding are not well understood. In fact, previous studies on sound localization have mainly focused on the perception of simple artificial sounds, such as white-noise or pure-tone bursts. In addition, previous research has generally investigated the localization of sounds in the frontal hemicircle while ignoring rear sources. Yet the localization of rear sources is evolutionarily relevant and may show different neural coding, given the inherent lack of visual information. Here we present a pilot electroencephalography (EEG) study to identify robust indices of sound localization from participants listening to a short natural sound from eight source positions on the horizontal plane. We discuss a procedure to perform a within-subject classification of the perceived sound direction. Preliminary results suggest a pool of discriminative, subject-specific temporal and topographical features correlated with the characteristics of the acoustic event. Our preliminary analysis has identified temporal and topographical features that are sensitive to spatial localization, leading to significant decoding of sound direction for individual subjects. This pilot study adds to the literature a methodological approach that will lead to the objective classification of natural sound locations from EEG responses.
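
The abstract does not specify the classifier used for the within-subject direction classification, so the following is only a rough illustration with scikit-learn; the epoch shapes, eight-way labels, and logistic-regression choice are assumptions.

```python
# Hedged sketch of a within-subject decoding pipeline for eight source
# positions; shapes, features, and classifier are assumptions.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_channels, n_times = 160, 64, 100
X = rng.standard_normal((n_trials, n_channels, n_times))  # stand-in epochs
y = rng.integers(0, 8, n_trials)                          # 8 directions

# Flatten spatio-temporal features; a real pipeline would first select
# the discriminative time windows and electrodes the abstract describes.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(clf, X.reshape(n_trials, -1), y, cv=5)
print(f"decoding accuracy: {scores.mean():.2f} (chance = 0.125)")
```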

bioRxiv (Cold Spring Harbor Laboratory), Jun 26, 2023
Hearing impairment alters the sound input received by the human auditory system, reducing speech comprehension in noisy multi-talker auditory scenes. Despite such challenges, attentional modulation of envelope tracking in multi-talker scenarios is comparable between normal-hearing (NH) and hearing-impaired (HI) participants, with previous research suggesting an over-representation of the speech envelopes in HI individuals (see, e.g., Presacco et al., 2019), even though HI participants reported difficulties in performing the task. This result raises an important question: what speech-processing stage could reflect the difficulty in attentional selection, if not envelope tracking? Here, we use scalp electroencephalography (EEG) to test the hypothesis that such difficulties are underpinned by an over-representation of phonological-level information of the ignored speech sounds. To do so, we carried out a reanalysis of an EEG dataset where EEG signals were recorded as HI participants fitted with hearing aids attended to one speaker (target) while ignoring a competing speaker (masker) and spatialised multi-talker background noise. Multivariate temporal response function analyses revealed that EEG signals reflect stronger phonetic-feature encoding for target than masker speech streams. Interestingly, robust EEG encoding of phoneme onsets emerged for both target and masker streams, in contrast with previous work on NH participants and in line with our hypothesis of an over-representation of the masker. Stronger phoneme-onset encoding emerged for the masker, pointing to a possible neural basis for the higher distractibility experienced by HI individuals.

bioRxiv (Cold Spring Harbor Laboratory), Apr 3, 2024
Speech comprehension involves detecting words and interpreting their meaning according to the preceding semantic context. This process is thought to be underpinned by a predictive neural system that uses that context to anticipate upcoming words. Recent work demonstrated that such a predictive process can be probed from neural signals recorded during ecologically-valid speech listening tasks by using linear lagged models, such as the temporal response function. This is typically done by extracting stimulus features, such as the estimated word-level surprise, and relating such features to the neural signal. While modern large language models (LLMs) have led to a substantial leap forward in how word-level features and predictions are modelled, little progress has been made on the metrics used for evaluating how well a model relates stimulus features to neural signals. In fact, previous studies relied on evaluation metrics that were designed for studying continuous univariate sound features, such as the sound envelope, without considering the different requirements of word-level features, which are discrete and sparse in nature. As a result, studies probing lexical prediction mechanisms in ecologically-valid experiments typically exhibit small effect sizes, severely limiting the type of observations that can be drawn and leaving considerable uncertainty about how exactly our brains build lexical predictions. First, the present study discusses and quantifies these limitations on both simulated and actual electroencephalography signals capturing responses to a speech comprehension task. Second, we tackle the issue by introducing two assessment metrics for the neural encoding of lexical surprise that substantially improve on the state of the art. The new metrics were tested on both the simulated and actual electroencephalography datasets, demonstrating effect sizes over 140% larger than those for the vanilla temporal response function evaluation.
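
The abstract does not describe the two proposed metrics, so the sketch below only illustrates the underlying problem: with sparse word-level regressors, a correlation computed over the whole continuous signal dilutes the effect, whereas restricting evaluation to post-word-onset windows concentrates it. All names, windows, and data here are hypothetical.

```python
# Generic illustration (not the paper's metrics) of why sparse word-level
# regressors need onset-aware evaluation. All numbers are assumptions.
import numpy as np

fs = 128
rng = np.random.default_rng(0)
eeg = rng.standard_normal(60 * fs)                # stand-in recorded channel
pred = 0.1 * eeg + rng.standard_normal(60 * fs)   # stand-in TRF prediction
word_onsets = np.sort(rng.choice(59 * fs, 150, replace=False))

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

# Conventional metric: correlation over the full continuous signal,
# dominated by samples far from any word onset.
r_full = corr(eeg, pred)

# Onset-restricted metric: correlate only within a 0-600 ms window
# after each word onset, where word-level responses should live.
win = np.arange(0, int(0.6 * fs))
idx = (word_onsets[:, None] + win[None, :]).ravel()
r_onsets = corr(eeg[idx], pred[idx])
print(r_full, r_onsets)
```
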
Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems

Cortical signals have been shown to track acoustic and linguistic properties of continuous speech. This phenomenon has been measured across the lifespan, reflecting speech understanding as well as cognitive functions such as attention and prediction. Furthermore, atypical low-frequency cortical tracking of speech is found in children with phonological difficulties (developmental dyslexia). Accordingly, low-frequency cortical signals, especially in the delta and theta ranges, may play a critical role in language acquisition. A recent investigation (Attaheri et al., 2022) probed cortical tracking mechanisms in infants aged 4, 7 and 11 months as they listened to sung speech. Results from temporal response function (TRF), phase-amplitude coupling (PAC) and dynamic theta-delta power (PSD) analyses indicated speech envelope tracking and stimulus-related power via the delta and theta neural signals. Furthermore, delta- and theta-driven PAC was found at all ages, with gamma amplitudes d...
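
For reference, phase-amplitude coupling of the kind mentioned above is often quantified with the mean vector length of Canolty et al. (2006); the exact PAC estimator used in the study may differ. A self-contained sketch with stand-in data:

```python
# Sketch of a standard PAC estimate (mean vector length, Canolty et al.
# 2006); band edges and data are assumptions, not the study's settings.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

fs = 256
rng = np.random.default_rng(0)
eeg = rng.standard_normal(30 * fs)          # stand-in single EEG channel

theta_phase = np.angle(hilbert(bandpass(eeg, 4, 8, fs)))   # theta phase
gamma_amp = np.abs(hilbert(bandpass(eeg, 30, 45, fs)))     # gamma amplitude

# Mean vector length: amplitude-weighted phase consistency. Values near 0
# mean no coupling; larger values mean gamma amplitude clusters at a
# preferred theta phase.
mvl = np.abs(np.mean(gamma_amp * np.exp(1j * theta_phase)))
print(mvl)
```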

The prevalent ‘core phonological deficit’ model of dyslexia proposes that the reading and spelling difficulties characterizing affected children stem from prior developmental difficulties in processing speech sound structure, for example perceiving and identifying syllable stress patterns, syllables, rhymes and phonemes. Yet spoken word production appears normal. This suggests an unexpected disconnect between speech input and speech output processes. Here we investigated the output side of this disconnect from a speech rhythm perspective by measuring the speech amplitude envelope (AE) of multisyllabic spoken phrases. The speech AE contains crucial information regarding stress patterns, speech rate, tonal contrasts and intonational information. We created a novel computerized speech copying task in which participants copied aloud familiar spoken targets like “Aladdin”. Seventy-five children with and without dyslexia were tested, some of whom were also receiving an oral intervention d...
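
A common way to compute the speech amplitude envelope (AE) described above is the magnitude of the Hilbert transform followed by low-pass smoothing; whether the authors used exactly this recipe is not stated here, and the cutoff below is an assumption.

```python
# Sketch of amplitude envelope (AE) extraction via the Hilbert transform
# plus low-pass smoothing; cutoff and test signal are assumptions.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def amplitude_envelope(audio, fs, cutoff_hz=10.0):
    env = np.abs(hilbert(audio))             # broadband envelope
    b, a = butter(4, cutoff_hz / (fs / 2))   # keep only slow modulations
    return filtfilt(b, a, env)

fs = 16000
t = np.arange(fs) / fs
# Test tone with a 3 Hz amplitude modulation, mimicking syllable rhythm.
audio = np.sin(2 * np.pi * 220 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
env = amplitude_envelope(audio, fs)          # recovers the ~3 Hz modulation
```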

Slow cortical oscillations play a crucial role in processing the speech envelope, which is perceived atypically by children with Developmental Language Disorder (DLD) and developmental dyslexia. Here we use electroencephalography (EEG) and natural speech listening paradigms to identify neural processing patterns that characterize dyslexic versus DLD children. Using a story listening paradigm, we show that atypical power dynamics and phase-amplitude coupling between delta and theta oscillations characterize the dyslexic and DLD children groups, respectively. We further identify EEG common spatial patterns (CSP) during speech listening across delta, theta and beta oscillations that distinguish dyslexic from DLD children. A linear classifier using four delta-band CSP variables predicted dyslexia status (0.77 AUC). Crucially, these spatial patterns also identified children with dyslexia in EEG from a rhythmic syllable task, suggesting a core developmental deficit in neural processing of speech rhythm...
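
A sketch of the CSP-plus-linear-classifier analysis described above, using MNE-Python's CSP implementation; the epoch dimensions, delta-band pre-filtering, and LDA back-end are assumptions (the abstract specifies only four delta-band CSP variables, a linear classifier, and 0.77 AUC).

```python
# Sketch: four CSP components feeding a linear classifier, scored with AUC.
# Shapes, filtering, and classifier choice are assumptions.
import numpy as np
from mne.decoding import CSP
from sklearn.pipeline import make_pipeline
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 120, 64, 256
X = rng.standard_normal((n_epochs, n_channels, n_times))  # delta-band epochs
y = rng.integers(0, 2, n_epochs)          # 1 = dyslexia, 0 = control

clf = make_pipeline(CSP(n_components=4, log=True),
                    LinearDiscriminantAnalysis())
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
print(f"AUC = {auc:.2f}")   # the paper reports 0.77 on real data
```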

Even prior to producing their first words, infants are developing a sophisticated speech processing system, with robust word recognition present by 4-6 months of age. These emergent linguistic skills, observed in behavioural investigations, are likely to rely on increasingly sophisticated neural underpinnings. The infant brain is known to robustly track the speech envelope; however, to date no cortical tracking study has investigated the emergence of phonetic feature encoding. Here we utilise temporal response functions computed from electrophysiological responses to nursery rhymes to investigate the cortical encoding of phonetic features in a longitudinal cohort of infants when aged 4, 7 and 11 months, as well as in adults. The analyses reveal an increasingly detailed and acoustically-invariant phonetic encoding over the first year of life, providing the first direct evidence that the pre-verbal human cortex learns phonetic categories. By 11 months of age, however, infants still did...

The Journal of Neuroscience, 2021
Musical imagery is the voluntary internal hearing of music in the mind without the need for physical action or external stimulation. Numerous studies have already revealed brain areas activated during imagery. However, it remains unclear to what extent imagined music responses preserve the detailed temporal dynamics of the acoustic stimulus envelope and, crucially, whether melodic expectations play any role in modulating responses to imagined music, as they prominently do during listening. These modulations are important as they reflect aspects of the human musical experience, such as its acquisition, engagement, and enjoyment. This study explored the nature of these modulations in imagined music based on EEG recordings from 21 professional musicians (6 females and 15 males). Regression analyses were conducted to demonstrate that imagined neural signals can be predicted accurately, similarly to the listening task, and were sufficiently robust to allow for accurate identification of the imagined musical piece from the EEG. In doing so, our results indicate that imagery and listening tasks elicited an overlapping but distinctive topography of neural responses to sound acoustics, which is in line with previous fMRI literature. Melodic expectation, however, evoked very similar frontal spatial activation in both conditions, suggesting that they are supported by the same underlying mechanisms. Finally, neural responses induced by imagery exhibited a specific transformation from the listening condition, which primarily included a relative delay and a polarity inversion of the response. This transformation demonstrates the top-down predictive nature of the expectation mechanisms arising during both listening and imagery.

The Journal of Neuroscience, 2021
During music listening, humans routinely acquire the regularities of the acoustic sequences and use them to anticipate and interpret the ongoing melody. Specifically, in line with this predictive framework, it is thought that brain responses during such listening reflect a comparison between the bottom-up sensory responses and top-down prediction signals generated by an internal model that embodies the music exposure and expectations of the listener. To attain a clear view of these predictive responses, previous work has eliminated the sensory inputs by inserting artificial silences (or sound omissions) that leave behind only the corresponding predictions of the thwarted expectations. Here, we demonstrate a new alternative approach in which we decode the predictive electroencephalography (EEG) responses to the silent intervals that are naturally interspersed within the music. We did this as participants (experiment 1, 20 participants, 10 female; experiment 2, 21 participants, 6 female) listened to or imagined Bach piano melodies. Prediction signals were quantified and assessed via a computational model of the melodic structure of the music and were shown to exhibit the same response characteristics when measured during listening or imagining. These include an inverted polarity for both silence and imagined responses relative to listening, as well as response magnitude modulations that precisely reflect the expectations of notes and silences in both listening and imagery conditions. These findings therefore provide a unifying view that links results from many previous paradigms, including omission reactions and the expectation modulation of sensory responses, all in the context of naturalistic music listening.

Frontiers in Neuroscience
Here we replicate a neural tracking paradigm, previously published with infants (aged 4 to 11 months), with adult participants, in order to explore potential developmental similarities and differences in entrainment. Adults listened and watched passively as nursery rhymes were sung or chanted in infant-directed speech. Whole-head EEG (128 channels) was recorded, and cortical tracking of the sung speech in the delta (0.5–4 Hz), theta (4–8 Hz) and alpha (8–12 Hz) frequency bands was computed using linear decoders (multivariate Temporal Response Function models, mTRFs). Phase-amplitude coupling (PAC) was also computed to assess whether delta and theta phases temporally organize higher-frequency amplitudes for adults in the same pattern as found in the infant brain. Similar to the previous infant participants, the adults showed significant cortical tracking of the sung speech in both delta and theta bands. However, the frequencies associated with peaks in stimulus-induced spectral power (PSD)...
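
The backward (decoding) direction of the mTRF analysis mentioned above can be sketched as follows: band-pass the EEG into the frequency band of interest, then fit a regularised linear map from all channels back to the speech envelope. Everything below (rates, band edges, single-lag design, regularisation) is an illustrative simplification.

```python
# Sketch of band-limited cortical tracking via a backward model (decoder);
# real mTRF decoders stack multiple lags (e.g. 0-250 ms) per channel.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 64
rng = np.random.default_rng(0)
eeg = rng.standard_normal((120 * fs, 128))   # stand-in 128-channel EEG
envelope = rng.random(120 * fs)              # stand-in sung-speech envelope

b, a = butter(4, [0.5 / (fs / 2), 4 / (fs / 2)], btype="band")
delta_eeg = filtfilt(b, a, eeg, axis=0)      # isolate the delta band

# Ridge-regularised backward model: envelope from all channels.
X = delta_eeg
lam = 1e2
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)
recon = X @ w
r = np.corrcoef(recon, envelope)[0, 1]       # tracking strength in this band
```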

Frontiers in Neuroscience
Music perception requires the human brain to process a variety of acoustic and music-related properties. Recent research used encoding models to tease apart and study the various cortical contributors to music perception. To do so, such approaches study temporal response functions that summarise the neural activity over several minutes of data. Here we tested the possibility of assessing the neural processing of individual musical units (bars) with electroencephalography (EEG). We devised a decoding methodology based on a maximum correlation metric across EEG segments (maxCorr) and used it to decode melodies from EEG, based on an experiment where professional musicians listened to and imagined four Bach melodies multiple times. We demonstrate here that accurate decoding of melodies in single subjects and at the level of individual musical units is possible, both from EEG signals recorded during listening and imagination. Furthermore, we find that greater decoding accuracies are measured...
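
The abstract names the method (maxCorr) but not its details; the sketch below shows one plausible reading of a maximum-correlation decoder, in which a held-out EEG segment is assigned to the melody whose template response it correlates with most strongly. Template construction by trial averaging is an assumption.

```python
# Hedged sketch of a maxCorr-style decoder: classify a held-out EEG
# segment by maximum correlation with per-melody template responses.
import numpy as np

rng = np.random.default_rng(0)
n_melodies, n_trials, seg_len = 4, 10, 512
# Stand-in single-channel EEG segments, trials x samples per melody.
data = {m: rng.standard_normal((n_trials, seg_len)) for m in range(n_melodies)}

# Templates from trial averaging, leaving trial 0 out for testing.
templates = {m: x[1:].mean(axis=0) for m, x in data.items()}

def decode(segment, templates):
    corrs = {m: np.corrcoef(segment, t)[0, 1] for m, t in templates.items()}
    return max(corrs, key=corrs.get)   # melody with maximum correlation

predicted = decode(data[2][0], templates)  # classify the held-out trial
```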

Frontiers in Neuroscience, 2022
The amplitude envelope of speech carries crucial low-frequency acoustic information that assists linguistic decoding at multiple time scales. Neurophysiological signals are known to track the amplitude envelope of adult-directed speech (ADS), particularly in the theta band. Acoustic analysis of infant-directed speech (IDS) has revealed significantly greater modulation energy than in ADS in an amplitude-modulation (AM) band centered on ∼2 Hz. Accordingly, cortical tracking of IDS by delta-band neural signals may be key to language acquisition. Speech also contains acoustic information within its higher-frequency bands (beta, gamma). Adult EEG and MEG studies reveal an oscillatory hierarchy, whereby low-frequency (delta, theta) neural phase dynamics temporally organize the amplitude of high-frequency signals (phase-amplitude coupling, PAC). Whilst consensus is growing around the role of PAC in the mature adult brain, its role in the development of speech processing is unexplored. Here, w...

Scientific Reports, 2021
Driving a car places high cognitive demands on the driver, from sustained attention to perception and action planning. Recent research investigated the neural processes reflecting the planning of driving actions, aiming to better understand the factors leading to driving errors and to devise methodologies to anticipate and prevent such errors by monitoring the driver’s cognitive state and intention. While such anticipation was shown for discrete driving actions, such as emergency braking, there is no evidence for robust neural signatures of continuous action planning. This study aims to fill this gap by investigating continuous steering actions during a driving task in a car simulator with multimodal recordings of behavioural and electroencephalography (EEG) signals. System identification is used to assess whether robust neurophysiological signatures emerge before steering actions. Linear decoding models are then used to determine whether such cortical signals can predict continuous steering actions...
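
A hedged sketch of what continuous steering decoding could look like: a linear model mapping EEG at lags preceding the action onto the steering angle. The channel count, lag window, and regularisation below are assumptions, not the study's parameters.

```python
# Sketch of decoding a continuous steering signal from pre-movement EEG.
# All shapes, rates, and the lag window are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge

fs = 64
rng = np.random.default_rng(0)
eeg = rng.standard_normal((300 * fs, 32))   # stand-in 32-channel EEG
steering = rng.standard_normal(300 * fs)    # stand-in steering-wheel angle

# Use EEG from ~0-500 ms BEFORE each steering sample: rolling forward by
# `lag` places eeg[t - lag] at time t, so only pre-action activity is used.
pre_lags = np.arange(int(0.5 * fs))
X = np.hstack([np.roll(eeg, lag, axis=0) for lag in pre_lags])

# Drop the first second, where np.roll wraps samples around.
model = Ridge(alpha=10.0).fit(X[fs:], steering[fs:])
predicted = model.predict(X[fs:])           # continuous steering estimate
```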

The 15th International Conference on Auditory-Visual Speech Processing, 2019
Visual speech information, such as a speaker's mouth and eyebrow movements, enhances speech perception. Evidence for this perceptual benefit has mainly come from behavioural or neurophysiological studies that made use of event-related potentials (ERPs). ERP studies, however, are limited by repetitive and short stimuli that are not representative of natural speech. An approach that examines cortical tracking of the speech envelope allows for the use of continuous speech stimuli. This approach has recently been employed to demonstrate that adults' cortical tracking of the speech envelope is augmented when synchronous visual speech information is provided. To date, no study has investigated whether children, like adults, show stronger envelope tracking when congruent visual speech information is available. This study investigates this question by measuring four-year-olds' cortical tracking of continuous auditory-visual speech through electroencephalography (EEG). Cortical tracking was quantified by means of ridge regression models that estimate the linear mapping from the speech to the EEG signal and vice versa. Stimulus reconstruction for auditory-only and auditory-visual speech was found to be stronger than for visual-only speech.

eneuro, 2018
In real-world environments, humans comprehend speech by actively integrating prior knowledge (P) and expectations with sensory input. Recent studies have revealed effects of prior information in temporal and frontal cortical areas and have suggested that these effects are underpinned by enhanced encoding of speech-specific features, rather than a broad enhancement or suppression of cortical activity. However, in terms of the specific hierarchical stages of processing involved in speech comprehension, the effects of integrating bottom-up sensory responses and top-down predictions are still unclear. In addition, it is unclear whether the predictability that comes with prior information may differentially affect speech encoding relative to the perceptual enhancement that comes with that prediction. One way to investigate these issues is by examining the impact of P on indices of cortical tracking of continuous speech features. Here, we did this by presenting participants with degraded speech sentences that either were or were not preceded by a clear recording of the same sentences, while recording non-invasive electroencephalography (EEG). We assessed the impact of prior information on an isolated index of cortical tracking that reflected phoneme-level processing. Our findings suggest that prior information affects the early encoding of natural speech in a dual manner. First, the availability of prior information, as hypothesized, enhanced the perceived clarity of degraded speech, which was positively correlated with changes in phoneme-level encoding across subjects. In addition, P induced an overall reduction of this cortical measure, which we interpret as resulting from the increase in predictability.
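
The "isolated index of cortical tracking" reflecting phoneme-level processing can be illustrated (though not reproduced) by comparing the prediction accuracy of a model with acoustic-plus-phonetic regressors against an acoustic-only model; the gain indexes phoneme-level encoding. Feature dimensions and data below are stand-ins, and in practice the accuracies are cross-validated so the gain is not a trivial effect of extra parameters.

```python
# Sketch of isolating phoneme-level tracking as a model-comparison gain.
# Feature counts and data are assumptions; real analyses cross-validate
# both models so the richer model cannot win by overfitting alone.
import numpy as np

def ridge_fit_predict(X, y, lam=1.0):
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return X @ w

rng = np.random.default_rng(0)
n = 5000
acoustic = rng.random((n, 8))    # e.g. lagged spectrogram-band regressors
phonetic = rng.random((n, 19))   # e.g. lagged phonetic-feature regressors
eeg = rng.standard_normal(n)     # stand-in EEG channel

r_s = np.corrcoef(ridge_fit_predict(acoustic, eeg), eeg)[0, 1]
r_fs = np.corrcoef(
    ridge_fit_predict(np.hstack([acoustic, phonetic]), eeg), eeg)[0, 1]
phoneme_index = r_fs - r_s       # isolated phoneme-level tracking measure
```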