Papers by Dinei A Rockenbach

Zenodo (CERN European Organization for Nuclear Research), Apr 25, 2018
NoSQL databases emerged to address limitations of relational databases. The many options within each category, with their distinct characteristics and focus, make this assessment very difficult for decision makers. Most of the time, decisions are taken without the attention and background the related complexities deserve. This article compares the relevant characteristics of each database, abstracting away the marketing that surrounds them. We concluded that although the databases are labeled under a specific category, there is significant disparity in the functionality each of them offers. We also observed that new databases keep emerging even though there are well-established databases in each of the categories studied. Finally, it is very challenging to suggest the best database for each category because each scenario has its own requirements, demanding a careful analysis that our work helps to simplify.
XXVI Brazilian Symposium on Programming Languages

Software: Practice and Experience, 2021
Databases are particularly useful tools for handling data generated by DNA sequencing. This article evaluates the performance of three databases under DNA-sequencing workloads: PostgreSQL and MySQL as relational databases and MongoDB as a NoSQL database. The results show that PostgreSQL outperforms the others.

2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2019
The stream processing paradigm is used in several scientific and enterprise applications to continuously compute results from data items coming from sources such as sensors. Fully exploiting the potential parallelism offered by current heterogeneous multi-cores equipped with one or more GPUs is still a challenge for stream processing applications. In this work, our main goal is to present the parallel programming challenges the programmer faces when exploiting CPU and GPU parallelism at the same time using traditional programming models. We highlight the parallelization methodology in two use cases (the Mandelbrot Streaming benchmark and PARSEC's Dedup application) to demonstrate the issues and benefits of using heterogeneous parallel hardware. The experiments demonstrate how a high-level parallel programming model targeting stream processing, such as the one offered by SPar, can reduce the programming effort while still offering a good level of performance compared with state-of-the-art programming models.
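SPar itself is a C++ annotation-based DSL, and the paper's use cases run on real heterogeneous hardware. As a language-neutral illustration of the underlying farm pattern (replicating a heavy stage over a stream of items while preserving their order), here is a minimal Python sketch; the stage and source names are hypothetical stand-ins, not code from the paper:

```python
from concurrent.futures import ThreadPoolExecutor

def source(n):
    """Emit a stream of data items (stand-in for a sensor or file feed)."""
    yield from range(n)

def heavy_stage(x):
    """Per-item computation (stand-in for Mandelbrot or Dedup work)."""
    return x * x

def farm(stream, workers=4):
    """Replicate the stage across workers; executor.map preserves input order,
    which is what stream-parallel 'farm' patterns typically guarantee."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        yield from pool.map(heavy_stage, stream)

print(list(farm(source(8))))  # squares of 0..7, in order
```

The ordering guarantee of `map` is what distinguishes a farm from an unordered task pool; on a GPU-equipped system the `heavy_stage` body would instead be offloaded to the device.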

International Journal of Forecasting, 2020
We examine the mortality rates due to occupational accidents in the three states of the southern region of Brazil, using the autoregressive integrated moving average (ARIMA), beta autoregressive moving average (βARMA), and Kumaraswamy autoregressive moving average (KARMA) models to fit the data sets, considering monthly observations from 2000 to 2017. We compare them to identify the best predictive model for the southern region of Brazil. We also provide a descriptive analysis, revealing the victims' vulnerability characteristics and comparing them between the states. A clear increase was seen in female participation in the labor market, but the number of deaths from occupational accidents did not increase in the same proportion. Moreover, the state of Paraná stood out for having the highest mortality rate from work-related accidents. The fitted ARIMA and βARMA models using a 6-month time frame presented similar accuracy measurements, while KARMA performed the worst.
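As a minimal illustration of the modeling idea behind this comparison (not the paper's actual βARMA or KARMA estimators, which handle bounded rates), here is an AR(1) fit by ordinary least squares together with a one-step mean absolute error as a simple accuracy measure; all function names are illustrative:

```python
def fit_ar1(series):
    """Estimate x_t = c + phi * x_{t-1} + e_t by ordinary least squares."""
    y = series[1:]          # targets x_t
    x = series[:-1]         # lagged predictors x_{t-1}
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) \
        / sum((a - mx) ** 2 for a in x)
    c = my - phi * mx
    return c, phi

def mae(series, c, phi):
    """Mean absolute one-step-ahead forecast error of the fitted model."""
    errs = [abs(b - (c + phi * a)) for a, b in zip(series[:-1], series[1:])]
    return sum(errs) / len(errs)
```

Comparing candidate models by such out-of-sample error measures is the same selection logic the paper applies to ARIMA, βARMA, and KARMA.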
The new hardware that has emerged in recent years follows the growing trend in the volume of data generated by the use of digital technologies. At the forefront of the search for more performance are Graphics Processing Units (GPUs), massively parallel hardware originally designed for graphics processing but nowadays widely used for many general-purpose tasks. The advent of GPUs has driven applications such as self-driving cars, real-time ray tracing, deep learning in artificial intelligence, and virtual reality (VR). However, this heterogeneous environment of GPUs and parallel Central Processing Units (CPUs) poses an additional challenge for parallel software development.


Concurrency and Computation: Practice and Experience, 2020
Stream processing is a parallel paradigm used in many application domains. With the advance of Graphics Processing Units (GPUs), their usage in stream processing applications has increased as well. The efficient utilization of GPU accelerators in streaming scenarios requires batching input elements into micro-batches, whose computation is offloaded onto the GPU, leveraging data parallelism within the same batch. Since data elements arrive continuously at the input speed, the bigger the micro-batch size, the higher the latency to completely buffer it and to start the processing on the device. Stream processing applications often have strict latency requirements, so the best micro-batch size must be found and adapted dynamically based on the workload conditions as well as the characteristics of the underlying device and network. In this work, we implement latency-aware adaptive micro-batching techniques and algorithms for streaming compression applications targeting GPUs. The evaluation is conducted using the Lempel-Ziv-Storer-Szymanski (LZSS) compression application considering different input workloads. As a general result, we noticed that algorithms with elastic adaptation factors respond better for stable workloads, while algorithms with narrower targets respond better for highly unbalanced workloads.
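The core trade-off described above — bigger batches improve GPU utilization but raise buffering latency — can be sketched as a simple feedback controller over the batch size. This is an illustrative sketch with made-up parameter names, not one of the algorithms evaluated in the paper:

```python
def adapt_batch_size(batch_size, observed_latency_ms, target_latency_ms,
                     factor=1.25, min_size=1, max_size=4096):
    """Shrink the micro-batch when the latency target is violated;
    grow it otherwise to improve GPU utilization.  The multiplicative
    `factor` plays the role of the paper's 'adaptation factor'."""
    if observed_latency_ms > target_latency_ms:
        batch_size = max(min_size, int(batch_size / factor))
    else:
        batch_size = min(max_size, int(batch_size * factor) + 1)
    return batch_size
```

A more elastic `factor` converges faster on stable workloads, while a tighter one tracks bursty workloads more closely — the same qualitative behavior the evaluation reports.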

Proceedings of the 8th Workshop on Parallel Programming Models - Special Edition on IoT and Machine Learning, 2019

Escola Regional de Banco de Dados (ERBD), 2017
Key-value databases emerged to address the limitations of relational databases; with the increasing capacity of RAM, they can offer greater performance and versatility in data storage and processing. The objective is to perform a comparative study of the in-memory key-value databases Redis, Memcached, Voldemort, Aerospike, Hazelcast, and Riak KV. The work thus contributes an analysis of the different databases, with results that qualitatively demonstrate their characteristics and point out their main advantages.
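All six systems share the same core data model; a toy in-memory store with optional per-key expiration (a feature Redis and Memcached expose as TTLs) sketches it. This is a minimal illustration of the model, not any of the compared systems' actual implementations:

```python
import time

class KVStore:
    """Toy in-memory key-value store with optional per-key TTL."""
    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def set(self, key, value, ttl=None):
        expiry = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, expiry)

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        value, expiry = item
        if expiry is not None and time.monotonic() > expiry:
            del self._data[key]  # lazy expiration on read
            return default
        return value
```

The real systems differ precisely in what they layer on top of this model: persistence, replication, cluster distribution, and richer value types, which is what the comparative study examines.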