Papers by Hector Ceballos

Data
The estimation of occupancy is a crucial input for achieving improvements in energy efficiency. The scarcity or incompleteness of data on occupancy in enclosed spaces makes it challenging to develop new models that estimate occupancy with high accuracy. Furthermore, considerable variation among monitored spaces also makes it difficult to compare the results of different approaches. This dataset comprises the indoor environmental information (pressure, altitude, humidity, and temperature) and the corresponding occupancy level for two different rooms: (1) a fitness gym and (2) a living room. The fitness gym data were collected for six days between 18 September and 2 October 2019, obtaining 10,125 objects at a 1 s resolution, distributed among the following occupancy levels: low (2,442 objects), medium (5,325 objects), and high (2,358 objects). The living room data were collected for 11 days between 14 May and 4 June 2020, obtaining 295,823 objects at a 1 s resolution,...
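The class balance of the fitness gym collection can be checked directly from the counts reported above; a minimal sketch in Python (the counts are the figures quoted in the abstract, the helper name is illustrative):

```python
# Occupancy-level counts for the fitness gym collection, as reported above.
gym_counts = {"low": 2442, "medium": 5325, "high": 2358}

def class_distribution(counts):
    """Return each occupancy level's share of the total, as a rounded percentage."""
    total = sum(counts.values())
    return {level: round(100 * n / total, 1) for level, n in counts.items()}

dist = class_distribution(gym_counts)
print(sum(gym_counts.values()))  # 10125, matching the reported total
print(dist)
```

A check like this makes it easy to see that the "medium" level dominates the gym data, which matters when training classifiers on it.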
ABSTRACT The Electronic Institutions (EIs) framework is designed for regulating interactions among heterogeneous agents in open systems [1]. In EIs, agent interactions are speech acts whose exchange is organized into conversation protocols called scenes. Agents can participate simultaneously in multiple scenes, playing a single role in each of them. However, at some point, the execution of a given scene may require the presence of an agent playing a particular role.
2021 Machine Learning-Driven Digital Technologies for Educational Innovation Workshop

Journal of Big Data, 2022
Scientometrics is the field of study and evaluation of scientific measures such as the impact of research papers and academic journals. It is an important field because different rankings nowadays use key indicators for ranking universities, and universities themselves use them as Key Performance Indicators (KPIs). The purpose of this work is to propose a semantic modeling of scientometric indicators using the Statistical Data and Metadata Exchange (SDMX) ontology. We develop a case study at Tecnologico de Monterrey following the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology. We evaluate the benefits of storing and querying scientometric indicators as linked data as a means of providing a flexible, quick-access knowledge representation that supports indicator discovery, querying, and composition. The semi-automatic generation and subsequent storage of this linked data in the Neo4j graph database enabled an updatable, quick-access model.
Cantu et al., A Knowledge-Based Information System for Managing Research and Value Creation in a University Environment.
Abstract. As ontological knowledge becomes more and more important in agent-based systems, its handling becomes crucial for successful applications. In the context of agent-based applications, we propose a hybrid approach in which part of the ontology is handled locally, using a “client component”, and the rest of the ontological knowledge is handled by an “ontology agent”, which is accessed by the other agents in the system through their client component. In this sort of “caching” scheme, the most frequent ontological queries tend to remain stored locally. We propose specific methods for representing, storing, querying, and translating ontologies for effective use in the context of the “JITIK” system, which is a multiagent system for knowledge and information distribution. We report as well a working prototype implementing our proposal and discuss some performance figures.
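The "caching" scheme described above can be sketched as a client component that answers repeated queries locally and falls back to the ontology agent on a miss. This is a minimal LRU-style illustration, not the JITIK API; all names and the toy taxonomy are invented:

```python
from collections import OrderedDict

class OntologyClient:
    """Local client component: caches the most frequent ontology queries,
    delegating cache misses to a (here simulated) ontology agent."""
    def __init__(self, ontology_agent, capacity=128):
        self.agent = ontology_agent          # callable: query -> answer
        self.cache = OrderedDict()           # LRU order: oldest first
        self.capacity = capacity
        self.misses = 0

    def query(self, q):
        if q in self.cache:
            self.cache.move_to_end(q)        # mark as recently used
            return self.cache[q]
        self.misses += 1
        answer = self.agent(q)               # "remote" call to the ontology agent
        self.cache[q] = answer
        if len(self.cache) > self.capacity:  # evict least recently used entry
            self.cache.popitem(last=False)
        return answer

# Toy "ontology agent" resolving superclass queries over a tiny taxonomy.
taxonomy = {"dog": "mammal", "mammal": "animal"}
client = OntologyClient(lambda q: taxonomy.get(q), capacity=2)
client.query("dog"); client.query("dog"); client.query("mammal")
print(client.misses)  # 2: the repeated query was answered locally
```

The point of the scheme is exactly this effect: frequently repeated ontological queries stop generating traffic to the ontology agent.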

The strategies that researchers follow to publish or produce scientific content can have a long-term impact. Identifying which strategies are most influential in the future has been attracting increasing attention in the literature. In this study, we present a systematic review of recommendations of long-term strategies in research analytics and their implementation methodologies. The objective is to present an overview of the development of this topic from 2002 to 2018, including trends and the contexts addressed. The central objective is to identify data-oriented approaches to learning long-term research strategies, especially in process mining. We followed a protocol for systematic reviews in the engineering area in a structured manner. The results show the need for studies that generate more specific recommendations based on data mining. This outcome leaves open research opportunities from two particular perspectives—applying methodologies involving proce...

To understand and approach the spread of the SARS-CoV-2 epidemic, machine learning offers fundamental tools. This study presents the use of machine learning techniques for projecting COVID-19 infections and deaths in Mexico. The research has three main objectives: first, to identify which function best fits the infected population growth in Mexico; second, to determine the feature importance of climate and mobility; third, to compare the results of a traditional time series statistical model with a modern machine learning approach. The motivation for this work is to support health care providers in their preparation and planning. The methods compared are linear, polynomial, and generalized logistic regression models to describe the growth of COVID-19 incidents in Mexico. Additionally, machine learning and time series techniques are used to identify feature importance and perform forecasting for daily cases and fatalities. The study uses the publicly available data sets ...
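Of the three growth models compared, the generalized logistic (Richards) curve is the least standard; a sketch of its usual functional form follows. The parameter values are arbitrary placeholders for illustration, not values fitted to the Mexican data:

```python
import math

def generalized_logistic(t, K, A, r, nu, t0):
    """Richards curve: upper asymptote K, lower asymptote A,
    growth rate r, shape parameter nu, horizontal shift t0."""
    return A + (K - A) / (1 + math.exp(-r * (t - t0))) ** (1 / nu)

# Placeholder parameters, for illustration only.
K, A, r, nu, t0 = 100_000.0, 0.0, 0.15, 1.0, 60.0
print(round(generalized_logistic(t0, K, A, r, nu, t0)))  # 50000: with nu = 1, t = t0 gives K/2
```

With nu = 1 this reduces to the ordinary logistic; nu != 1 lets the fitted curve place its inflection point asymmetrically, which is why the generalized form is often preferred for epidemic growth.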

The main purpose of countries' economic expenditure on research and development is to achieve higher levels of scientific findings within research ecosystems, which in turn could generate better living standards for society. Therefore, the collected scientific production constitutes a faithful image of the capacity, trajectory, and scientific depth assignable to each country. The intention of this article is to contribute to the understanding of the factors that influence scientific production and how it could be improved. To address this challenge, we selected a sample of 19 countries considered partners in science and technology. On the one hand, we downloaded social and economic variables (gross domestic expenditure on R&D (GERD) as a percentage of gross domestic product (GDP), and researchers in full-time equivalent (FTE)); on the other hand, variables related to scientific results (total scientific production, scientific production by subject area and by institution, without overlooking the citations received as an impact measure), all within a 17-year time window. Through a causal model with multiple linear regression on panel data, the experiment confirms that two of the five selected independent (explanatory) variables explain 98% of scientific production for the countries analyzed. An important conclusion we highlight is the importance of checking compliance with statistical assumptions when using multiple regression in research studies. As a result, we built a reliable predictive model to analyze scenarios in which an increase in any of the independent variables causes a positive effect on scientific production. This model allows decision makers to make comparisons among countries and helps in the formulation of future national scientific policies.
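The multiple-regression step can be sketched with ordinary least squares solved via the normal equations, reporting R^2 (the share of variance explained, 98% in the study above). The rows below are synthetic stand-ins shaped like [intercept, GERD%, researchers-FTE], not the actual panel data:

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def ols(X, y):
    """Fit y = X b via the normal equations (X'X) b = X'y; return b and R^2."""
    Xt = list(zip(*X))
    XtX = [[sum(a * b for a, b in zip(r, c)) for c in Xt] for r in Xt]
    Xty = [sum(a * b for a, b in zip(r, y)) for r in Xt]
    beta = solve(XtX, Xty)
    yhat = [sum(b * x for b, x in zip(beta, row)) for row in X]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Synthetic panel-like rows [1, GERD%, FTE] with an exactly linear response.
X = [[1, g, f] for g, f in [(0.5, 10), (1.0, 20), (1.5, 25), (2.0, 40), (2.5, 45)]]
y = [2 + 3 * g + 0.5 * f for _, g, f in X]
beta, r2 = ols(X, y)
print([round(b, 6) for b in beta], round(r2, 6))  # recovers [2.0, 3.0, 0.5] with R^2 = 1.0
```

Because the synthetic response is exactly linear, the fit recovers the generating coefficients and R^2 = 1; real panel data would of course yield noisier coefficients and the 98% figure reported above.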
Knowledge Management Research & Practice
A case study for impelling university research productivity and impact through collaboration is presented. Scientometric results support the hypothesis that a knowledge management model increased research collaboration and thereby boosted a university's number of publications and citations. Results come from fifteen years of data at a Mexican university with 2,400 researchers who produced 24,000 works in fifteen research disciplines. These data are treated with social network visualizations and algorithms to identify patterns of collaboration and clustering, as well as with normalizations to make disciplines comparable and to verify increasing citation impact. The knowledge management model implemented in the study may be a cost-effective way for universities to intensify collaboration and improve research performance.

International Journal on Interactive Design and Manufacturing (IJIDeM)
This research article presents a study comparing the teaching performance of teaching-only versus teaching-and-research professors at higher education institutions. It is a common belief that teaching professors generally outperform research professors at teaching-and-research universities, according to student perceptions reflected in student surveys. This case study presents experimental evidence showing that this is not always the case and that, under certain circumstances, it can be the contrary. The case study is from Tecnologico de Monterrey (Tec), a teaching-and-research private university in Mexico that has developed a research profile during the last two decades using a mix of teaching-only and teaching-and-research faculty members; during this time period, the university has had a growing ascendancy in world university rankings. Data from an institutional student survey called the ECOA were used. The data set contains more than 118,000 graduate and undergraduate courses over 5 semesters (January 2017 to May 2019). The results were derived from statistical and data mining methods, including Analysis of Variance and Logistic Regression, applied to this data set of more than nine thousand professors who taught those courses. The results show that teaching-and-research professors perform better than, or at least the same as, teaching-only professors. Differences in teaching with respect to attributes such as professors' gender, age, and research level are also presented.
Ontological knowledge is getting more and more important in agent-based systems, and its handling is becoming crucial for successful applications. But placing all the ontology-handling capabilities in each of the system's agents could make them too heavy. We propose a combination of local and global ontology handling, where part of the ontology is handled locally, using a “client component”, and the rest of the ontological knowledge is handled by an “ontology agent”, which is accessed by the other agents in the system through their client component. We propose specific methods for representing, storing, querying, and translating ontologies for effective use in the context of the “JITIK” system, which is a multiagent system for knowledge and information distribution. We report a working prototype implementing our proposal.
Abstract. In order to aid domain experts during data integration, several schema matching techniques have been proposed. Despite the facilities these techniques provide, mappings between database schemas are still made manually. We propose a methodology for mapping two relational databases that uses ontology matching techniques and takes advantage of tools like D2R Server and AgreementMaker to automate mapping generation and enable unified access to information. We present the results obtained by some ontology matching algorithms in this context, demonstrating the feasibility of this approach.

The Electronic Institutions (EIs) framework is designed for regulating interactions among heterogeneous agents in open systems [1]. In EIs, agent interactions are speech acts whose exchange is organized into conversation protocols called scenes. Agents can participate simultaneously in multiple scenes, playing a single role in each of them. However, at some point, the execution of a given scene may require the presence of an agent playing a particular role. When such an agent is missing, a deadlock may ensue unless the institution or the agents themselves can invoke the participation of an agent to play the missing role. Such functionality is not provided in the current EI framework. We propose an extension of the framework that addresses this problem in a generic way: the provision of an institutional agent in charge of instantiating new agents and dispatching them to scenes through a participation request protocol. In this paper we make the proposal precise and illustrate it with...

Background: To understand and approach the COVID-19 spread, machine learning offers fundamental tools. This study presents the use of machine learning techniques for the projection of COVID-19 infections and deaths in Mexico. The research has three main objectives: first, to identify which function best fits the infected population growth in Mexico; second, to determine the feature importance of climate and mobility; third, to compare the results of a traditional time series statistical model with a modern machine learning approach. The motivation for this work is to support health care providers in their preparation and planning. Methods: The methods used are linear, polynomial, and generalized logistic regression models to evaluate the growth of COVID-19 incidents in the country. Additionally, machine learning and time-series techniques are used to identify feature importance and perform forecasting for daily cases and fatalities. The study uses the publicly avail...
Agent-based technologies, originally proposed with the aim of assisting human activities, have recently been adopted in industry for automating business processes. Business Process Model and Notation (BPMN) is a standard notation for modeling business processes that provides a rich graphical representation, which can be used for a common understanding of processes but also for automation purposes. We propose a normal form of Business Process Diagrams (BPDs), based on Activity Theory, that can be transformed into a Causal Bayesian Network, which in turn can be used to deal with the uncertainty introduced by human participants. We illustrate our approach on an elderly health care scenario obtained from an actual contextual study.
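The transformation target, a causal Bayesian network over process activities, can be illustrated with a two-node chain in which a human task may fail and propagate uncertainty downstream. The activities and probabilities below are invented for illustration and are not taken from the paper's scenario:

```python
# Minimal causal chain over two process activities: MedicationTaken -> VitalsStable.
# P(taken) models the uncertainty a human participant introduces into the process.
p_taken = 0.8                      # probability the patient takes the medication
p_stable_given = {True: 0.9,       # P(vitals stable | medication taken)
                  False: 0.4}      # P(vitals stable | medication not taken)

# Marginalize over the parent: P(stable) = sum_t P(stable | t) * P(t)
p_stable = p_stable_given[True] * p_taken + p_stable_given[False] * (1 - p_taken)
print(round(p_stable, 2))  # 0.8
```

In a full BPD-derived network each activity becomes a node conditioned on its causal predecessors, and the same marginalization propagates uncertainty through the whole process.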

More than four out of 10 sports fans consider themselves soccer fans, making the game the world's most popular sport. Sports are season-based and constantly change over time; moreover, statistics vary according to the sport and league. Understanding sports communities in social networks and identifying fans' expertise is a key indicator for soccer prediction. This research proposes a machine learning model using polarity on a dataset of 3,000 tweets taken during the last game week of the English Premier League 19/20 season. The end goal is a flexible mechanism that automates the process of gathering the corpus of tweets before a match and classifies their sentiment to estimate the probability of a win by evaluating network centrality. Keywords: Graph theory • Machine learning • Sentiment analysis • Social networks • Sports analytics. 1.1 Review on Social Network Analysis: Spread Influence. Some research studies, such as the one developed by Yan [16], evaluate the influence of users, represented as nodes, on other entities in the social network
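The centrality evaluation mentioned above can be sketched with degree centrality, one common choice for ranking users in a fan network. The toy graph of fan accounts below is invented; the paper's actual network and centrality measure are not reproduced here:

```python
from collections import defaultdict

def degree_centrality(edges):
    """Normalized degree centrality: degree / (n - 1) for each node
    of an undirected graph given as a list of (u, v) edges."""
    deg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
        nodes.update((u, v))
    n = len(nodes)
    return {v: deg[v] / (n - 1) for v in nodes}

# Toy interaction graph among four fan accounts.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
c = degree_centrality(edges)
print(max(c, key=c.get))  # "a": the most central (best-connected) fan
```

Weighting each account's predicted sentiment by a score like this is one way to let better-connected fans count more toward the match-outcome probability.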

Asia-Pacific Journal of Operational Research
Data envelopment analysis (DEA) is a methodology for evaluating the relative efficiencies of a set of decision-making units (DMUs) based on their multiple inputs and outputs. The original model is based on the assumption that DMUs operate independently of one another. However, this assumption may not apply in some situations, as in the case we present in this paper, in which DMUs can work together to produce joint outputs. What makes it more interesting is that this characteristic of sharing outputs among some DMUs differs from one DMU to another; this makes it more challenging to determine independent efficiency scores that account for this phenomenon. To address this, the current paper presents a methodology for measuring efficiency in situations in which DMUs share outputs with other units. We examine the case of a set of research groups at a Mexican university. For this study, the inputs used are the professors belonging to the various groups, and the outputs are the publis...
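In the simplest single-input, single-output case, the DEA (CCR) efficiency score reduces to each DMU's output/input ratio normalized by the best observed ratio; the general multi-input/output model, and the shared-output extension the paper develops, require a full linear-programming formulation omitted here. A sketch with invented research-group data:

```python
def ccr_efficiency_1x1(dmus):
    """Single-input, single-output DEA: efficiency = (y/x) / max(y/x).
    dmus maps a name to an (input, output) pair."""
    ratios = {name: y / x for name, (x, y) in dmus.items()}
    best = max(ratios.values())
    return {name: r / best for name, r in ratios.items()}

# (professors, publications) per research group; numbers are invented.
groups = {"G1": (10, 30), "G2": (8, 16), "G3": (12, 24)}
scores = ccr_efficiency_1x1(groups)
print({g: round(s, 2) for g, s in scores.items()})  # G1 is the efficient frontier
```

A score of 1.0 marks a DMU on the efficient frontier; the others are rated relative to it, which is the sense in which DEA efficiencies are "relative".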

The intelligent agent paradigm provides an appropriate metaphor for automating organizational processes that make intensive use of expert knowledge. The idea would be to introduce autonomous agents that carry out tasks on behalf of human users in organizational processes. Knowledge transfer from experts to autonomous agents has been approached in different ways, taking advantage of more than 20 years of experience in knowledge representation in Artificial Intelligence. Data mining and case-based reasoning use historical information to identify patterns that guide the agent's decisions. Inference rules are used to explicitly capture expert knowledge. An agent's capacity to adapt to changes relies on the formalism employed. In many cases, the expert is left aside in order to provide an automatic solution. On the other hand, Autonomic Computing is providing interesting approaches for dealing with the dynamic adaptation of components to organizational and environmental changes. This theory p...