Papers by Steven Van Vaerenbergh
IEEE Signal Processing Magazine, Jul 1, 2013
Gaussian processes (GPs) are versatile tools that have been successfully employed to solve nonlinear estimation problems in machine learning, but that are rarely used in signal processing. In this tutorial, we present GPs for regression as a natural nonlinear extension to optimal Wiener filtering. After establishing their basic formulation, we discuss several important aspects and extensions, including recursive and adaptive algorithms for dealing with non-stationarity, low-complexity solutions, non-Gaussian noise models and classification scenarios. Furthermore, we provide a selection of relevant applications to wireless digital communications.
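The GP regression machinery this tutorial covers can be illustrated with a minimal NumPy sketch of the standard posterior equations under a squared-exponential kernel. This is not code from the paper; all function names and parameter values are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel: k(a, b) = variance * exp(-||a - b||^2 / (2 * lengthscale^2)).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-d2 / (2 * lengthscale ** 2))

def gp_regression(X, y, Xs, noise_var=0.1):
    # Posterior mean and marginal variance of a zero-mean GP at test points Xs,
    # given noisy training pairs (X, y).
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    Ks = rbf_kernel(Xs, X)
    alpha = np.linalg.solve(K, y)
    mean = Ks @ alpha
    cov = rbf_kernel(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

# Noisy samples of a nonlinear function, then prediction at two test inputs.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(30)
Xs = np.array([[0.0], [1.0]])
mean, var = gp_regression(X, y, Xs)
```

The predictive variance `var` is what distinguishes the GP from a point-estimate filter: it quantifies uncertainty at each test input.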

International Journal of Mathematical Education in Science and Technology, 2020
In this study, we explore automated reasoning tools (ART) in geometry education and we argue that these tools are part of a wider, nascent ecosystem for computer-supported geometric reasoning. To provide some context, we set out to summarize the capabilities of ART in GeoGebra (GGb), and we discuss the first research proposals of its use in the classroom. While the design and development of ART have been embraced already by several teams of mathematics researchers and developers, the educational community, which is an essential actor in this ecosystem, has not provided sufficient feedback yet on this new technology. We therefore propose a concrete path for incorporating ART in the classroom. We outline a set of necessary procedures towards this goal, and we include a discussion on the benefits and concerns arising from the use of these automated tools in the mathematical learning process.

2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), Sep 1, 2018
Gated recurrent neural networks have achieved remarkable results in the analysis of sequential data. Inside these networks, gates are used to control the flow of information, allowing even very long-term dependencies in the data to be modeled. In this paper, we investigate whether the original gate equation (a linear projection followed by an element-wise sigmoid) can be improved. In particular, we design a more flexible architecture, with a small number of adaptable parameters, which is able to model a wider range of gating functions than the classical one. To this end, we replace the sigmoid function in the standard gate with a non-parametric formulation extending the recently proposed kernel activation function (KAF), with the addition of a residual skip-connection. A set of experiments on sequential variants of the MNIST dataset shows that the adoption of this novel gate improves accuracy with a negligible cost in terms of computational power and with a large speed-up in the number of training iterations.

IEEE Transactions on Emerging Topics in Computational Intelligence, 2018
Complex-valued neural networks (CVNNs) are a powerful modeling tool for domains where data can be naturally interpreted in terms of complex numbers. However, several analytical properties of the complex domain (e.g., holomorphicity) make the design of CVNNs a more challenging task than their real counterparts. In this paper, we consider the problem of flexible activation functions (AFs) in the complex domain, i.e., AFs endowed with sufficient degrees of freedom to adapt their shape given the training data. While this problem has received considerable attention in the real case, a very limited literature exists for CVNNs, where most activation functions are developed in a split fashion (i.e., by considering the real and imaginary parts of the activation separately) or with simple phase-amplitude techniques. Leveraging the recently proposed kernel activation functions (KAFs), and related advances in the design of complex-valued kernels, we propose the first fully complex, non-parametric activation function for CVNNs, which is based on a kernel expansion with a fixed dictionary that can be implemented efficiently on vectorized hardware. Several experiments on common use cases, including prediction and channel equalization, validate our proposal when compared to real-valued neural networks and CVNNs with fixed activation functions.

Neural Networks, 2018
Neural networks are generally built by interleaving (adaptable) linear layers with (fixed) nonlinear activation functions. To increase their flexibility, several authors have proposed methods for adapting the activation functions themselves, endowing them with varying degrees of flexibility. None of these approaches, however, have gained wide acceptance in practice, and research in this topic remains open. In this paper, we introduce a novel family of flexible activation functions that are based on an inexpensive kernel expansion at every neuron. Leveraging several properties of kernel-based models, we propose multiple variations for designing and initializing these kernel activation functions (KAFs), including a multidimensional scheme that nonlinearly combines information from different paths in the network. The resulting KAFs can approximate any mapping defined over a subset of the real line, either convex or nonconvex. Furthermore, they are smooth over their entire domain, linear in their parameters, and they can be regularized using any known scheme, including the use of ℓ1 penalties to enforce sparseness. To the best of our knowledge, no other known model satisfies all these properties simultaneously. In addition, we provide a relatively complete overview of al...

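The core idea of a KAF, a per-neuron activation expressed as a kernel expansion over a fixed dictionary with trainable mixing weights, can be sketched in a few lines of NumPy. This is a simplified illustration, not the paper's implementation; here the weights are initialized by a small kernel ridge fit so the KAF starts out approximating tanh, one of the initialization strategies of this kind of model.

```python
import numpy as np

def kaf(x, alpha, dictionary, gamma=1.0):
    # Kernel activation function: f(x) = sum_i alpha_i * exp(-gamma * (x - d_i)^2).
    # x: (batch,) pre-activations; dictionary: fixed grid of D points;
    # alpha: (D,) mixing weights (the only trainable part).
    K = np.exp(-gamma * (x[:, None] - dictionary[None, :]) ** 2)
    return K @ alpha

# Fixed dictionary sampled over the expected pre-activation range.
D = np.linspace(-2, 2, 20)

# Initialize alpha so the KAF mimics tanh: a regularized kernel fit on the grid.
Kdd = np.exp(-1.0 * (D[:, None] - D[None, :]) ** 2)
alpha = np.linalg.solve(Kdd + 1e-4 * np.eye(20), np.tanh(D))

out = kaf(np.array([0.0, 1.0]), alpha, D)
```

Because the output is linear in `alpha`, the expansion can be trained by backpropagation like any other layer parameter and regularized with standard penalties.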
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018
In this paper, we study the problem of locating a predefined sequence of patterns in a time series. In particular, the studied scenario assumes a theoretical model is available that contains the expected locations of the patterns. This problem is found in several contexts, and it is commonly solved by first synthesizing a time series from the model, and then aligning it to the true time series through dynamic time warping. We propose a technique that increases the similarity of both time series before aligning them, by mapping them into a latent correlation space. The mapping is learned from the data through a machine-learning setup. Experiments on data from nondestructive testing demonstrate that the proposed approach shows significant improvements over the state of the art.
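The dynamic time warping step that the paper builds on can be written as a short dynamic program. This is the textbook DTW recursion, not the paper's full pipeline (which additionally learns a latent mapping before alignment); the example series are illustrative.

```python
import numpy as np

def dtw(a, b):
    # Classic dynamic time warping between two 1-D series: returns the
    # cumulative alignment cost and the optimal warping path.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack the optimal path from the end.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[n, m], path[::-1]

model = np.array([0.0, 1.0, 0.0, -1.0, 0.0])          # series synthesized from a model
measured = np.array([0.0, 0.0, 1.0, 0.0, -1.0, 0.0])  # same pattern, time-shifted
cost, path = dtw(model, measured)
```

Since `measured` is just `model` with one extra leading sample, the warping path absorbs the shift and the alignment cost is zero.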

IEEE transactions on neural networks and learning systems, 2012
In this paper, we introduce a kernel recursive least-squares (KRLS) algorithm that is able to track nonlinear, time-varying relationships in data. To this end, we first derive the standard KRLS equations from a Bayesian perspective (including a sensible approach to pruning) and then take advantage of this framework to incorporate forgetting in a consistent way, thus enabling the algorithm to perform tracking in nonstationary scenarios. The resulting method is the first kernel adaptive filtering algorithm that includes a forgetting factor in a principled and numerically stable manner. In addition to its tracking ability, it has a number of appealing properties. It is online, requires a fixed amount of memory and computation per time step, incorporates regularization in a natural manner and provides confidence intervals along with each prediction. We include experimental results that support the theory as well as illustrate the efficiency of the proposed algorithm.

2016 24th European Signal Processing Conference (EUSIPCO), 2016
We propose a new linear-in-the-parameters (LIP) nonlinear filter based on kernel methods to address the problem of nonlinear acoustic echo cancellation (NAEC). For this purpose we define a framework based on a parallel scheme in which any kernel-based adaptive filter (KAF) can be incorporated efficiently. This structure is composed of a classic adaptive filter on one branch, committed to estimating the linear part of the echo path, and a kernel adaptive filter on the other branch, to model the nonlinearities present in the echo path. In addition, we propose a novel low-complexity least mean square (LMS) KAF with very few parameters, to be used in the parallel architecture. Finally, we demonstrate the effectiveness of the proposed scheme in real NAEC scenarios, for different choices of the KAF.

We study the relationship between online Gaussian process (GP) regression and kernel least mean squares (KLMS) algorithms. While the latter cannot store the entire posterior distribution during online learning, we discover that their operation corresponds to the assumption of a fixed posterior covariance that follows a simple parametric model. Interestingly, several well-known KLMS algorithms correspond to specific cases of this model. The probabilistic perspective allows us to understand how each of them handles uncertainty, which could explain some of their performance differences.
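The KLMS algorithm that this analysis concerns has a very compact form: predict with the current kernel expansion, then add the new input as a center weighted by the step size times the prediction error. The following is a generic sketch of that recursion (with an unbounded dictionary, i.e., no sparsification), not the specific variants analyzed in the paper; all parameter values are illustrative.

```python
import numpy as np

def gauss(x, centers, sigma=0.5):
    # Gaussian kernel between a new input x and all stored centers.
    return np.exp(-np.sum((x - centers) ** 2, axis=-1) / (2 * sigma ** 2))

def klms_train(X, y, eta=0.5, sigma=0.5):
    # Kernel LMS: the filter is a growing expansion f(x) = sum_i w_i k(x, c_i).
    centers, weights, errors = [], [], []
    for x, t in zip(X, y):
        pred = 0.0
        if centers:
            pred = float(np.dot(weights, gauss(x, np.array(centers), sigma)))
        e = t - pred                 # instantaneous prediction error
        errors.append(e)
        centers.append(x)            # every input becomes a center...
        weights.append(eta * e)      # ...weighted by eta times its error
    return np.array(centers), np.array(weights), errors

# Online learning of a nonlinear map: errors should shrink over the stream.
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(300, 1))
y = np.sin(2 * X[:, 0])
centers, weights, errors = klms_train(X, y)
```

Note that the recursion only keeps a point estimate of the function, which is exactly the limitation the paper reinterprets: the missing posterior covariance is implicitly replaced by a fixed parametric model.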

This paper treats the identification of nonlinear systems that consist of a cascade of a linear channel and a nonlinearity, such as the well-known Wiener and Hammerstein systems. In particular, we follow a supervised identification approach that simultaneously identifies both parts of the nonlinear system. Given the correct restrictions on the identification problem, we show how kernel canonical correlation analysis (KCCA) emerges as the logical solution to this problem. We then extend the proposed identification algorithm to an adaptive version that can deal with time-varying systems. In order to avoid overfitting problems, we discuss and compare three possible regularization techniques for both the batch and the adaptive versions of the proposed algorithm. Simulations are included to demonstrate the effectiveness of the presented algorithm.

In this paper we propose a new kernel-based version of the recursive least-squares (RLS) algorithm for fast adaptive nonlinear filtering. Unlike other previous approaches, we combine a sliding-window approach (to fix the dimensions of the kernel matrix) with conventional L2-norm regularization (to improve generalization). The proposed kernel RLS algorithm is applied to a nonlinear channel identification problem (specifically, a linear filter followed by a memoryless nonlinearity), which typically appears in satellite communications or digital magnetic recording systems. We show that the proposed algorithm is able to operate in a time-varying environment and tracks abrupt changes in either the linear filter or the nonlinearity.
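The sliding-window idea can be illustrated by re-solving a regularized kernel system on the most recent samples at every step. This brute-force refit is O(M³) per step, whereas the actual sliding-window kernel RLS updates the inverse kernel matrix recursively; the sketch below only demonstrates why a finite window enables tracking of abrupt changes. All parameter values are illustrative.

```python
import numpy as np

def sw_kernel_fit_predict(X_win, y_win, x_new, c=0.01, sigma=1.0):
    # Solve the L2-regularized kernel system on the current window,
    # then predict at the new input (naive, non-recursive version).
    d2 = ((X_win[:, None, :] - X_win[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * sigma ** 2))
    alpha = np.linalg.solve(K + c * np.eye(len(X_win)), y_win)
    k_new = np.exp(-((X_win - x_new) ** 2).sum(-1) / (2 * sigma ** 2))
    return float(k_new @ alpha)

# A nonlinearity that flips sign abruptly halfway through the stream.
rng = np.random.default_rng(2)
M = 30                                   # window size
stream_x = rng.uniform(-1, 1, size=(400, 1))
stream_y = np.where(np.arange(400) < 200,
                    np.tanh(stream_x[:, 0]),    # system before the change
                    -np.tanh(stream_x[:, 0]))   # system after the change
preds = [sw_kernel_fit_predict(stream_x[n - M:n], stream_y[n - M:n], stream_x[n])
         for n in range(M, 400)]
errors = np.abs(np.array(preds) - stream_y[M:])
```

Because only the last M samples influence the solution, the filter forgets the pre-change system once the window has slid past the change point, and the error returns to its steady-state level.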

Clustering techniques for equalization have been proposed by a number of authors in the last decade. However, most of these approaches focus only on time-invariant single-input single-output (SISO) channels. In this paper we consider the case of fast time-varying multiple-input multiple-output (MIMO) channels. The varying nature of the mixing matrix poses new problems that cannot be solved by conventional clustering techniques. By introducing the time scale into the clustering process we are able to untangle the clusters, which in this way behave like intertwined threads. Then, a spectral clustering algorithm is applied. Finally, the identified clusters are assigned to the transmitted symbols using only a few pilots. The geometry of the transmitted constellation is exploited within the spectral clustering algorithm in order to reduce the number of clusters. As shown in the paper, the proposed procedure saves a considerable amount of pilot symbols in comparison to other recently propos...
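The spectral clustering step used in this line of work can be illustrated in its simplest two-way form: build a Gaussian affinity matrix over the received samples and split them by the sign of the second eigenvector (the Fiedler vector) of the normalized graph Laplacian. This is a generic textbook sketch on synthetic 2-D data, not the paper's time-aware procedure; the cluster locations and kernel width are illustrative.

```python
import numpy as np

def spectral_bipartition(X, sigma=0.3):
    # Two-way spectral clustering via the normalized graph Laplacian.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))          # Gaussian affinity matrix
    d = W.sum(axis=1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(d, d))  # normalized Laplacian
    _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)         # sign of the Fiedler vector

# Two well-separated clouds of received samples (e.g., two symbol clusters).
rng = np.random.default_rng(4)
a = rng.normal([0.0, 0.0], 0.05, size=(20, 2))
b = rng.normal([1.0, 1.0], 0.05, size=(20, 2))
labels = spectral_bipartition(np.vstack([a, b]))
```

For more than two clusters, the standard recipe replaces the sign test with k-means on the rows of the leading eigenvectors.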

ArXiv, 2021
This chapter provides an overview of the different Artificial Intelligence (AI) systems that are being used in contemporary digital tools for Mathematics Education (ME). It is aimed at researchers in AI and Machine Learning (ML), for whom we shed some light on the specific technologies that are being used in educational applications; and at researchers in ME, for whom we clarify: i) what the possibilities of the current AI technologies are, ii) what is still out of reach and iii) what is to be expected in the near future. We start our analysis by establishing a high-level taxonomy of AI tools that are found as components in digital ME applications. Then, we describe in detail how these AI tools, and in particular ML, are being used in two key applications, specifically AI-based calculators and intelligent tutoring systems. We finish the chapter with a discussion about student modeling systems and their relationship to artificial general intelligence.

A framework is presented to carry out prediction and classification of Motion Capture (MoCap) multichannel data, based on kernel adaptive filters and multi-kernel learning. To this end, a Kernel Adaptive Filter (KAF) algorithm extracts the dynamics of each channel, relying on the similarity between multiple realizations through the Maximum Mean Discrepancy (MMD) criterion. To assemble the dynamics extracted from all MoCap data, centered kernel alignment (CKA) is used to assess the contribution of each channel to the classification task (that is, its relevance). Validation is performed on a database of tennis players, achieving good classification accuracy on the considered stroke classes. Moreover, we find that the relevance of each channel agrees with the findings reported in the biomechanical analysis. Therefore, the combination of KAF together with CKA allows building a proper representation for extracting relevant dynamics from multiple-channel MoCap data.

The kernel least mean squares (KLMS) algorithm is a computationally efficient nonlinear adaptive filtering method that “kernelizes” the celebrated (linear) least mean squares algorithm. We demonstrate that the least mean squares algorithm is closely related to Kalman filtering, and thus, the KLMS can be interpreted as an approximate Bayesian filtering method. This allows us to systematically develop extensions of the KLMS by modifying the underlying state-space and observation models. The resulting extensions introduce many desirable properties such as “forgetting”, and the ability to learn from discrete data, while retaining the computational simplicity and time complexity of the original algorithm.
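The linear LMS recursion that this interpretation starts from is only one line of update: the weight vector moves along the input direction, scaled by the step size and the instantaneous error. Under the paper's reading, this corresponds to a Kalman update with a fixed, isotropic state covariance. The sketch below is the standard linear LMS on a synthetic noiseless system, for illustration only.

```python
import numpy as np

def lms(X, d, mu=0.05):
    # Least mean squares: w <- w + mu * e * x.  Viewed as Bayesian filtering,
    # this is a Kalman-style correction with a fixed isotropic covariance.
    w = np.zeros(X.shape[1])
    errors = []
    for x, t in zip(X, d):
        e = t - w @ x          # a-priori prediction error
        w = w + mu * e * x     # gradient-descent correction
        errors.append(e)
    return w, errors

# Identify a known linear system from its input/output stream.
rng = np.random.default_rng(3)
w_true = np.array([1.0, -2.0, 0.5])
X = rng.standard_normal((2000, 3))
d = X @ w_true
w, errors = lms(X, d)
```

KLMS replaces the input `x` in the update with its (implicit) feature-space image, which is why the resulting filter is a growing kernel expansion rather than a fixed-length weight vector.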

The function most types of RNA molecules perform is determined by their structure, which in turn is determined by the linear RNA sequence. Predicting the secondary structure of an RNA molecule from the linear base sequence is a challenge in bioinformatics, with applications in medical sciences, biology and phylogenetic history. In this study, the different known methods of RNA secondary structure prediction are studied first. Then, a number of algorithms and programs are developed, as tools to apply to the problem. An existing algorithm for finding substrings in large strings using suffix trees is extended to an algorithm that lists repetitions and "biological palindromes" in DNA or RNA sequences, and this is programmed in ANSI C. A basic hidden Markov model is programmed in Matlab, and then extended to the more general model of stochastic context-free grammars. Algorithms for this model are implemented in Chomsky normal form. Next, the stochastic context-free grammars are described specifically for RNA modelling. At the end of the project, an attempt is made to develop a new prediction approach. Two probabilistic models are constructed, considering RNA molecule features as treated in the existing thermodynamic approach of the Zuker algorithm. Extensions of the previous probabilistic algorithms are programmed for these two specific cases, the complete models are trained with sequences from RNA databases, and their prediction accuracy is tested on unknown sequences. The results suggest possible model improvements, and a list of refinements is given at the end of this report.

Dyna, 2021
This article describes the development of a method to predict the electrical energy demand of a retailer's customer portfolio. The project is motivated by the economic benefit obtained when the entity has accurate estimates of the energy demand at the time of buying energy in an electricity auction. The developed system is based on time-series analysis and machine learning. Since the project was carried out on data from a real-world environment, the article focuses on practical aspects of the design and development of such a system, such as the heterogeneity of the data sources and the delay in data availability. The predictions obtained by the developed system are compared with the results of a simple method used in practice. Keywords: energy demand forecasting, electrical energy, machine learning, data-driven prediction.

Over the last decade, kernel methods have proven to be very effective techniques for solving nonlinear problems. Part of their success can be attributed to their solid mathematical foundation in reproducing kernel Hilbert spaces (RKHS), and to the fact that they result in convex optimization problems. Moreover, they are universal nonlinear approximators and require only moderate computational complexity. Thanks to these characteristics, kernel methods constitute an attractive alternative to traditional nonlinear techniques, such as Volterra series, polynomial filters and neural networks. Kernel methods also present certain drawbacks that must be addressed adequately in the different applications, for example, the difficulties associated with handling large data sets and the overfitting problems caused by working in spaces of dimensio...

The popular dynamic mathematics program GeoGebra includes tools for rigorous mathematical verification and for the automated discovery of general propositions about geometric figures. This work first presents a brief description of these tools, and then focuses on a reflection on their potential educational impact, through a new design of school tasks in the field of geometry teaching that takes advantage of the new features of GeoGebra and helps guide the student in the exploration, conjecture and discovery of geometric properties in a given construction. Keywords: dynamic geometry, automated reasoning, elementary geometry, GeoGebra