Papers by Juan Morales-García
GreenhouseGuard: Enabling real-time warning prediction for smart greenhouse management
Journal of Ambient Intelligence and Smart Environments, Jun 14, 2024
Diagnosis of Cervical Cancer Using a Deep Learning Explainable Fusion Model
Lecture Notes in Computer Science, 2024

ACM Transactions on Intelligent Systems and Technology, May 7, 2024
Nowadays, Generative Large Language Models (GLLMs) have made a significant impact in the field of Artificial Intelligence (AI). One of the domains extensively explored for these models is their ability as generators of functional source code for software projects. Nevertheless, their potential as assistants to write the code needed to generate and model Machine Learning (ML) or Deep Learning (DL) architectures has not been fully explored to date. For this reason, this work focuses on evaluating the extent to which different tools based on GLLMs, such as ChatGPT or Copilot, are able to correctly define the source code necessary to generate viable predictive models. The use case defined is the forecasting of a time series that reports the indoor temperature of a greenhouse. The results indicate that, while it is possible to achieve good accuracy metrics with simple predictive models generated by GLLMs, the composition of predictive models with complex architectures using GLLMs is still far from improving the accuracy of predictive models generated by human data scientists.
Nationwide Air Pollution Forecasting with Heterogeneous Graph Neural Networks
ACM Transactions on Intelligent Systems and Technology, Dec 13, 2023

Research Square, Nov 19, 2023
Nowadays, Generative Large Language Models (GLLMs) have taken the Artificial Intelligence field by storm. One of the domains where these models have been extensively evaluated is their role as generators of functional source code for software projects. However, their potential as assistants for writing the code necessary to generate and model machine learning or deep learning architectures has not been fully explored to date. For this reason, this work focuses on evaluating the extent to which different tools based on GLLMs, such as ChatGPT or Copilot, are capable of correctly defining the source code necessary to generate viable predictive models. The use case defined is the forecasting of a time series reporting the indoor temperature of a greenhouse. The results show that, while it is possible to obtain good accuracy metrics with simple predictors, models with complex architectures composed by GLLMs still fall short of those built by human data scientists.


Ambient Intelligence and Smart Environments, Jun 22, 2023
League of Legends (LoL) is a multiplayer online battle arena video game developed and published by Riot Games. It is a team-based game with over 140 characters to make epic plays with. The game blends the speed and intensity of a real-time strategy game (RTS) with role-playing game (RPG) elements. Two teams of powerful champions, each with unique designs and play styles, battle head-to-head across multiple maps and game modes. Exploratory data analysis (EDA) is a statistical technique that can be used to analyze LoL match data and extract valuable information for both researchers and players. By using EDA techniques on LoL match data, players can identify patterns, trends, and relationships that can help optimize their gameplay strategy. EDA can also help players identify their strengths and weaknesses and important statistics for their gameplay. The paper provides an introduction to the treatment of LoL match data using EDA techniques. It presents the most common data analysis techniques and explores some examples of how to apply these techniques to LoL match data. Furthermore, the paper discusses some ways in which data analysis can help LoL players improve their game, such as identifying their strengths and weaknesses, patterns and trends, important statistics, and meta changes.

Evaluation of low-power devices for smart greenhouse development
The Journal of Supercomputing, Feb 5, 2023
The combination of artificial intelligence and the Internet of Things (AIoT) is enabling the next economic revolution, in which data and immediacy are the key players. Agriculture is one of the sectors that can benefit most from the use of AIoT to optimise resources and reduce its environmental footprint. However, this convergence requires computational resources that enable the execution of AI workloads and, in the context of agriculture, ensure autonomous operation and low energy consumption. In this work, we evaluate TinyML and edge computing platforms to predict the indoor temperature of an operational greenhouse in situ. In particular, the computational/energy trade-off of these platforms is assessed to analyse whether their use in this context is feasible. Two artificial neural networks (ANNs) are adapted to these platforms to predict the indoor temperature of the greenhouse. Our results show that the microcontroller-based devices can offer a competitive and energy-efficient computational alternative to more traditional edge computing approaches for lightweight ML workloads.

Motivated by the large number of wearables offering geolocation, human mobility mining has emerged as a novel research field within AI. The study of mobility creates increasingly predictable models in which it is easy to find patterns of behaviour. However, this data is not publicly available and access to it is restricted to large telecommunications operators. In this context, this paper aims to solve one of the main problems of human mobility databases, i.e. the scarcity of data for the generation of human mobility models. For this purpose, generative adversarial networks (GANs) are proposed to generate synthetic time-series mobility data. Moreover, several neural network models are proposed to assess the impact of synthetic data generation on the prediction of human mobility. Our results show that the use of synthetic data improves predictions of human mobility compared to models based only on the available measured data.
Evaluation of time-series libraries for temperature prediction in smart greenhouses

Data-driven evaluation of machine learning models for climate control in operational smart greenhouses
Journal of Ambient Intelligence and Smart Environments, Mar 27, 2023
Nowadays, human overpopulation is stressing our ecosystems in different ways, agriculture being a critical example as different predictions point towards food shortages in the near future. Accordingly, smart farming is becoming key to the optimization of natural resources so that different crops can be grown efficiently, consuming as few resources as possible. In particular, greenhouses have proved to be an effective way of producing a high volume of vegetables/fruits in a reduced space and within a short time span. Hence, optimizing greenhouse functioning results in less water use and nutrient consumption, less energy use, faster growth, and better product quality. In this article, we carry out an in-depth analysis of different machine learning (ML) models to improve climate control in smart greenhouses. As part of the analysis of the techniques we also considered 3 ways of pre-processing the data, as well as 12-hour and 24-hour forecasting. We focus on forecasting the indoor air temperature of an operational smart greenhouse, i.e. assessing the data anomalies that are inherently present in these environments due to the instability of IoT infrastructures. Several ML models are adapted to time series forecasting to provide an overview of these techniques and to find out which one performs better in this particular scenario. Our results show that, after statistically validating the results, the Random Forest Regression technique gives the best overall result with a mean absolute error of less than 1 degree Celsius.
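As a rough illustration of how a temperature series can be framed for the kind of fixed-horizon forecasting this abstract describes, the sketch below turns a univariate series into lag-window/target pairs and scores a naive persistence baseline with the mean absolute error. It is a minimal sketch and not the authors' pipeline: the function names, the window length and the persistence baseline are all assumptions made for illustration, and any regressor (for example a random forest) could be trained on the same pairs.

```python
from typing import List, Sequence, Tuple


def make_supervised(
    series: Sequence[float], n_lags: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (lag window, future value) pairs from a univariate series.

    Each window holds n_lags consecutive readings; its target is the
    reading `horizon` steps after the last value in the window.
    Names and parameters are hypothetical, chosen for this sketch.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive")
    windows: List[List[float]] = []
    targets: List[float] = []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        windows.append(list(series[start:start + n_lags]))
        targets.append(series[start + n_lags + horizon - 1])
    return windows, targets


def persistence_forecast(windows: Sequence[Sequence[float]]) -> List[float]:
    """Naive baseline: predict that the last observed value persists."""
    return [window[-1] for window in windows]


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Toy readings in degrees Celsius; real data would come from sensors.
    readings = [18.0, 18.5, 19.5, 21.0, 22.0, 22.5, 22.0, 21.0]
    X, y = make_supervised(readings, n_lags=3, horizon=2)
    print(mean_absolute_error(y, persistence_forecast(X)))
```

With hourly readings, a 12-hour or 24-hour forecast would correspond to `horizon=12` or `horizon=24` under this framing; the sampling rate actually used in the paper is not stated here.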

Sensors, Nov 6, 2020
The Internet of Things (IoT) is becoming a new socioeconomic revolution in which data and immediacy are the main ingredients. IoT generates large datasets on a daily basis, but these are currently considered "dark data", i.e., data generated but never analyzed. The efficient analysis of this data is mandatory to create intelligent applications for the next generation of IoT applications that benefit society. Artificial Intelligence (AI) techniques are very well suited to identifying hidden patterns and correlations in this data deluge. In particular, clustering algorithms are of the utmost importance for performing exploratory data analysis to identify a set (a.k.a., cluster) of similar objects. Clustering algorithms are computationally heavy workloads and need to be executed on high-performance computing (HPC) clusters, especially to deal with large datasets. This execution on HPC infrastructures is an energy-hungry procedure with additional issues, such as high-latency communications or privacy. Edge computing is a paradigm to enable lightweight computations at the edge of the network that has been proposed recently to solve these issues. In this paper, we provide an in-depth analysis of emergent edge computing architectures that include low-power Graphics Processing Units (GPUs) to speed up these workloads. Our analysis includes performance and power consumption figures for Nvidia's latest AGX Xavier to compare the energy-performance ratio of these low-cost platforms with a high-performance cloud-based counterpart. Three different clustering algorithms (i.e., k-means, Fuzzy Minimals (FM), and Fuzzy C-Means (FCM)) are designed to be optimally executed on edge and cloud platforms, showing a speed-up factor of up to 11× for the GPU code compared to sequential counterpart versions in the edge platforms and energy savings of up to 150% between the edge computing and HPC platforms.

Scientific Reports, Jul 26, 2021
We are witnessing the dramatic consequences of the COVID-19 pandemic which, unfortunately, go beyond the impact on the health system. Until herd immunity is achieved with vaccines, the only available mechanisms for controlling the pandemic are quarantines, perimeter closures and social distancing with the aim of reducing mobility. Governments only apply these measures for a limited period, since they involve the closure of economic activities such as tourism, cultural activities, or nightlife. The main criterion for establishing these measures and planning socioeconomic subsidies is the evolution of infections. However, the collapse of the health system and the unpredictability of human behavior, among others, make it difficult to predict this evolution in the short to medium term. This article evaluates different models for the early prediction of the evolution of the COVID-19 pandemic to create a decision support system for policy-makers. We consider a wide range of models including artificial neural networks such as LSTM and GRU and statistically based models such as autoregressive (AR) or ARIMA. Moreover, several consensus strategies to ensemble all models into one system are proposed to obtain better results in this uncertain environment. Finally, a multivariate model that includes mobility data provided by Google is proposed to better forecast trend changes in the 14-day cumulative incidence (CI). A real case study in Spain is evaluated, providing very accurate results for the prediction of 14-day CI in scenarios with and without trend changes, reaching 0.93 R², 4.16 RMSE and 1.08 MAE. The COVID-19 pandemic is the biggest global challenge in our recent history, which puts the welfare state of today's society at risk.
Spain is undoubtedly among the countries most affected by the pandemic, with up to 3,697,987 total cases of infection, and a total of 80,196 deaths (as reported on June 7, 2021) [1]. Governments worldwide are taking drastic measures such as social distancing, contact tracing, perimeter closures and even quarantines, which are either reinforced or alleviated depending on the epidemiological status of the disease [2]. These non-sanitary measures focus on the reduction of human mobility, which has an important socioeconomic effect [3]. For instance, according to the European Commission, the economic forecast for Spain is the worst in its recent history with a 9.4% drop in GDP, and an expected unemployment of up to 18.9% at the end of 2020. Globally speaking, the Organisation for Economic Cooperation and Development (OECD) [4] stated that these bad economic projections will lead to widespread poverty, child malnutrition, stress, and suicides, just to mention a few of the dramatic consequences for the population. However, beyond the economic consequences, the measures of social distancing and lockdowns can raise new social scenarios in fundamental aspects such as education, gender violence, immigration and other new issues that may arise because of such extreme public health measures. Early understanding of the evolution of the pandemic prevents scenarios that could increase the number of COVID-19 victims. Governments have implemented public health surveillance systems for COVID-19 based on the fundamental principles provided by the World Health Organization (WHO); i.e., tracking clinical and epidemiological figures such as confirmed cases, deaths and active cases, just to mention a few [5,6]. This information is usually provided by governments daily, and currently, these surveillance systems provide robust and stable information on the evolution of the pandemic [7].
However, this epidemiological information shows a posterior picture of the pandemic, i.e., once people have been infected and are showing symptoms, usually after an incubation period of
Using remote GPU virtualization techniques to enhance edge computing devices
Future Generation Computer Systems, May 1, 2023

Research Square, Jul 21, 2022
The Internet of Things (IoT) enables the next economic revolution in which data and immediacy are the key players. Edge computing is a compelling alternative for enabling computing capabilities at the network's edge. These computing capabilities could help transform the generated data into useful information by executing machine learning (ML) workloads. TinyML is emerging as a fast-growing field of the ML ecosystem that aims to run workloads on typically battery-operated devices, providing sensor data analytics at extremely low power (mW). In this work, we evaluate a wide range of TinyML and edge computing platforms to assess their computational/energy trade-off. Specifically, we analyse the Arduino Nano 33 BLE Sense microcontroller and three different edge computing devices, namely the CPU-based Raspberry Pi 4 and the CPU-GPU Nvidia Jetson Nano and AGX Xavier platforms. We run two lightweight artificial neural networks (ANNs) to forecast the internal temperature of an operational greenhouse. Our results show that the microcontroller-based devices can offer a competitive and energy-efficient computational alternative to more traditional edge computing approaches for lightweight ML workloads.

Research Square, May 13, 2021
We are witnessing the dramatic consequences of the COVID-19 pandemic which, unfortunately, go beyond the impact on the health system. Until herd immunity is achieved with vaccines, the only available mechanisms for controlling the pandemic are quarantines, perimeter closures and social distancing with the aim of reducing mobility. Governments only apply these measures for a limited period of time, since they involve the closure of economic activities such as tourism, cultural activities or nightlife. The main criterion for establishing these measures and planning socioeconomic subsidies is the evolution of infections. Early warning systems in all countries monitor the COVID-19 pandemic evolution. However, the collapse of the health system and the unpredictability of human behaviour, among others, make it difficult to predict this evolution in the short to medium term. This article evaluates different models for the early prediction of the evolution of the COVID-19 pandemic to create a decision support system for policy-makers. We consider a wide range of models including artificial neural networks such as LSTM and GRU and statistically-based models such as autoregressive (AR) or ARIMA. Moreover, several consensus strategies to ensemble all models into one system are proposed to obtain better results in this uncertain environment. Our results reveal that the ensemble of different models improves the overall accuracy of the prediction, reaching up to 0.93 R², 4.16 RMSE and 3.55 MAE when there are no trend changes in the time series. Mobility data provided by Google is also considered as exogenous information for our ensemble model to forecast trend changes, providing a good framework for a complete inference.
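To make the idea of a consensus strategy and the reported metrics concrete, the sketch below combines the forecasts of several models step by step (by mean or median) and scores the result with R², RMSE and MAE. It is only an illustrative sketch: the abstract does not say which consensus strategies the authors use, so the mean and median rules, the function names and the toy numbers are all assumptions.

```python
from math import sqrt
from statistics import mean, median
from typing import List, Sequence


def consensus(predictions: Sequence[Sequence[float]], strategy: str = "mean") -> List[float]:
    """Combine per-model forecasts into one forecast, time step by time step.

    `predictions` holds one forecast sequence per model, all of equal length.
    The "mean" and "median" rules are illustrative choices, not the paper's.
    """
    if not predictions:
        raise ValueError("at least one model forecast is required")
    n_steps = len(predictions[0])
    if any(len(p) != n_steps for p in predictions):
        raise ValueError("all forecasts must have the same length")
    combine = {"mean": mean, "median": median}[strategy]
    # zip(*predictions) groups the values that all models gave for one step.
    return [combine(step_values) for step_values in zip(*predictions)]


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    average = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - average) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy forecasts from three hypothetical models (e.g. LSTM, GRU, ARIMA).
    forecasts = [[10.0, 20.0, 30.0], [12.0, 22.0, 26.0], [14.0, 18.0, 40.0]]
    observed = [12.0, 20.0, 30.0]
    for rule in ("mean", "median"):
        combined = consensus(forecasts, rule)
        print(rule, combined, r2(observed, combined), rmse(observed, combined), mae(observed, combined))
```

A median rule is less sensitive than a mean to a single model that drifts far from the others, which is one reason both are commonly tried when ensembling forecasts.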
A Multi-Model Deep Learning Approach to Address Prediction Imbalances in Smart Greenhouses
Greenhouse intelligent warning system for precision agriculture
2023 19th International Conference on Intelligent Environments (IE)

SEPARATE: A tightly coupled, seamless IoT infrastructure for deploying AI algorithms in smart agriculture environments
Internet of Things