Papers by Joel M Correa da Rosa

Journal of Applied Statistics, 2013
In this paper, we provide probabilistic predictions for soccer games of the 2010 FIFA World Cup by modelling the number of goals scored in a game by each team. We use a Poisson distribution for the number of goals for each team in a game, where the scoring rate is considered unknown. We use a Gamma distribution for the scoring rate, and the Gamma parameters are chosen using historical data and differences among teams defined by a strength factor for each team. The strength factor is a measure of discrimination among the national teams obtained from their memberships in fuzzy clusters. The clusters are obtained with the Fuzzy C-means algorithm applied to a vector of variables, most of them available on the official FIFA website. Static and dynamic models were used to predict the World Cup outcomes, and the performance of our predictions was evaluated using two comparison methods.
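As a rough illustration of the Poisson-Gamma idea described in this abstract, the sketch below computes win, draw and loss probabilities for one match. It is not the authors' implementation: the function names, the parameter values, the rate parameterization of the Gamma distribution and the assumption that the two teams' scores are independent are all illustrative choices, and the abstract does not say how the strength factor maps to the Gamma parameters. It relies only on the standard result that a Poisson count with a Gamma-distributed rate has a negative binomial distribution.

```python
import math


def goal_pmf(k, shape, rate):
    """P(goals = k) when goals ~ Poisson(lam) and lam ~ Gamma(shape, rate).

    Integrating out lam gives a negative binomial distribution.
    `shape` and `rate` are hypothetical inputs for illustration.
    """
    log_p = (
        math.lgamma(shape + k) - math.lgamma(shape) - math.lgamma(k + 1)
        + shape * math.log(rate / (rate + 1.0))
        - k * math.log(rate + 1.0)
    )
    return math.exp(log_p)


def match_probabilities(shape_a, rate_a, shape_b, rate_b, max_goals=25):
    """Win/draw/loss probabilities for team A against team B.

    Assumes the two scores are independent (an assumption of this sketch)
    and truncates each score distribution at `max_goals`.
    """
    pa = [goal_pmf(k, shape_a, rate_a) for k in range(max_goals + 1)]
    pb = [goal_pmf(k, shape_b, rate_b) for k in range(max_goals + 1)]
    win = sum(pa[i] * pb[j] for i in range(len(pa)) for j in range(i))
    draw = sum(pa[i] * pb[i] for i in range(len(pa)))
    loss = sum(pa[i] * pb[j] for i in range(len(pa)) for j in range(i + 1, len(pb)))
    return win, draw, loss


if __name__ == "__main__":
    # Made-up parameters: team A has a higher expected scoring rate (4/2)
    # than team B (2/2).
    print(match_probabilities(4.0, 2.0, 2.0, 2.0))
```

A larger shape-to-rate ratio means a higher expected scoring rate, so in this sketch a team's strength would enter through those two parameters.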

To God, for His love and faithfulness throughout this journey. To Prof. Dr. Álvaro Lima Veiga Filho, for all his understanding, encouragement and valuable supervision. To Prof. Dr. Marcelo Cunha Medeiros, for the constant support and encouragement that were fundamental to the preparation of this work. To Prof. Dr. Timo Teräsvirta, who made possible a period of valuable learning at the Stockholm School of Economics in Sweden. To my parents, siblings, and aunts and uncles, for their unconditional support. To the entire faculty of the Department of Statistics of the Universidade Federal do Paraná, who took on all teaching duties during this period of leave and offered admirable encouragement throughout. To CAPES, for the financial support granted through the now-discontinued PICDT program. To CNPQ, for the support granted during the exchange period in Sweden. To Maria Aparecida Thiengo, for her understanding, support, encouragement and help in preparing this material. To my friends Annistina and Mika, for their companionship during leisure hours, which were revitalizing. To Pernilla Watson, for the "logistical" support that helped my stay in Stockholm. To Denize Oliveira and Celso Sá, for the lessons about academic life. To Claúdio Sá, for the moments of support and companionship. To my great friends Jorge and Marcel, whose friendship will be eternal. To my friend Malin Hornell, for her help during the period of adaptation to Sweden. To the many friends I made in Stockholm. To my neighbors and friends Cristina, Eduardo and their children Álvaro and Maria Eduarda, for their pleasant company in the city of Rio de Janeiro. To Glaucy Ortiz, for the moments of companionship and support. For the moments of relaxation and companionship. To Lia, Danny and Ana Alice, for their companionship in the months leading up to the completion of this work.
Evolutionary Computation, 2009
Seventh International Conference on Intelligent Systems Design and Applications (ISDA 2007), 2007
One of the most important fields of research and application is time series forecasting. Finding a model that fits the data is not easy because, in most problems, the series are complex and noisy. Recently, ensembles of machines have been used to get accurate ...
2007 IEEE Congress on Evolutionary Computation, 2007
Time series forecasting has been considered an important tool to support decisions in different domains. A highly accurate prediction is essential to ensure the quality of these decisions. Time series forecasting is based on historical data and the predictions are ...

Time series forecasting has been widely used to support decisions; in this context, a highly accurate prediction is essential to ensure the quality of the decisions. Ensembles of machines currently receive a lot of attention; they combine predictions from different forecasting methods to improve accuracy. This paper explores Genetic Programming (GP) and the Boosting technique to obtain an ensemble of regressors and proposes a new formula for the final hypothesis. This new formula is based on the correlation coefficient instead of the geometric median traditionally used by the boosting algorithm. To validate this method, experiments were conducted using real, financial and artificial series generated by Monte Carlo simulation. The mean squared error (MSE) was used to compare the accuracy of the proposed method against other methods; the t-test and ANOVA were also used. The results obtained with this new methodology were compared with the resu...
Applied Intelligence, 2010
This paper explores Genetic Programming and the Boosting technique to obtain an ensemble of regressors and proposes a new formula for updating the weights, as well as for the final hypothesis. Unlike studies found in the literature, this paper investigates the use of the correlation metric as an additional factor for the error metric. This new
Textos Para Discussao, 2003
The goal of this paper is to introduce a tree-based model that combines aspects of CART (Classification and Regression Trees) and STR (Smooth Transition Regression). The model is called the Smooth Transition Regression Tree (STR-Tree). The main idea relies on specifying a parametric nonlinear model through a tree-growing procedure. The resulting model can be analyzed as a smooth transition regression with multiple regimes. Decisions about splits are entirely based on a sequence of Lagrange Multiplier (LM) tests of hypothe-

Clinical and Translational Science, 2015
Achieving timely accrual into clinical research studies remains a challenge for clinical translational research. We developed an evaluation measure, the Accrual Index (AI), normalized for sample size and study duration, using data from the protocol and study management databases. We applied the AI retrospectively and prospectively to assess its utility. Accrual Target, Projected Time to Accrual Completion (PTAC), Evaluable Subjects, Dates of Recruitment Initiation, Analysis, and Completion were defined. AI is (% Accrual Target accrued/% PTAC elapsed). Changes to recruitment practices were described, and data extracted from study management databases. December 2014 (or final) AI was analyzed for 101 studies initiating recruitment from 2007 to 2014. Median AI was ≥1 for protocols initiating recruitment in 2011, 2013, and 2014. The AI varied widely for studies pre-2013. Studies with AI > 4 utilized convenience samples for recruitment. Data-justified PTAC was refined in 2013-2014 after which the AI range narrowed. Protocol characteristics were not associated with study AI. Protocol AI reflects the relative agreement between accrual feasibility assessment (PTAC), and accrual performance, and is affected by recruitment practices. The AI may be useful in managing accountability, modeling accrual, allocating recruitment resources, and testing innovations in recruitment practices.
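The ratio stated in this abstract, AI = (% Accrual Target accrued) / (% PTAC elapsed), can be worked through with a small sketch. The function name, argument names and example numbers below are hypothetical, and the sketch assumes that elapsed time and PTAC are measured in the same unit from the date recruitment started; the abstract does not specify how subjects are counted beyond defining Evaluable Subjects.

```python
def accrual_index(accrued, accrual_target, time_elapsed, ptac):
    """Accrual Index = (% of Accrual Target accrued) / (% of PTAC elapsed).

    `time_elapsed` and `ptac` must use the same unit (for example months).
    All argument names are illustrative, not taken from the paper.
    """
    if accrual_target <= 0 or ptac <= 0 or time_elapsed <= 0:
        raise ValueError("accrual_target, ptac and time_elapsed must be positive")
    pct_accrued = 100.0 * accrued / accrual_target
    pct_elapsed = 100.0 * time_elapsed / ptac
    return pct_accrued / pct_elapsed


if __name__ == "__main__":
    # Made-up study: 50 of 100 subjects accrued, 6 of 12 projected months elapsed.
    print(accrual_index(50, 100, 6, 12))
    # Same elapsed time, but only 25 subjects accrued: behind schedule.
    print(accrual_index(25, 100, 6, 12))
```

Under this reading, a value of 1 means accrual is exactly on the projected schedule, values below 1 mean it is behind, and values above 1 mean it is ahead.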

Progress in neuro-psychopharmacology & biological psychiatry, Jan 4, 2016
Drug addiction, a leading health problem, is a chronic brain disease with a significant genetic component. Animal models and clinical studies established the involvement of glutamate and GABA neurotransmission in drug addiction. This study was designed to assess if 258 variants in 27 genes of these systems contribute to the vulnerability to develop drug addiction. Four independent analyses were conducted in a sample of 1860 subjects divided according to drug of abuse (heroin or cocaine) and ancestry (African and European). A total of 11 SNPs in eight genes showed nominally significant associations (P<0.01) with heroin and/or cocaine addiction in one or both ancestral groups but the associations did not survive correction for multiple testing. Of these SNPs, the GAD1 upstream SNP rs1978340 is potentially functional as it was shown to affect GABA concentrations in the cingulate cortex. In addition, SNPs GABRB3 rs7165224; DBI rs12613135; GAD1 SNPs rs2058725, rs1978340, rs2241164; an...

CNS Neuroscience & Therapeutics, 2015
Drug addiction is characterized, in part, by deregulation of synaptic plasticity in circuits involved in reward, stress, cue learning, and memory. This study was designed to assess whether 185 variants in 32 genes central to synaptic plasticity and signal transduction contribute to vulnerability to develop heroin and/or cocaine addiction. Analyses were conducted in a sample of 1860 subjects divided according to ancestry (African and European) and drug of abuse (heroin or cocaine). Eighteen SNPs in 11 genes (CDK5R1, EPHA4, EPHA6, FOSL2, MAPK3, MBP, MPDZ, NFKB1, NTRK2, NTSR1, and PRKCE) showed significant associations (P < 0.01), but the signals did not survive correction for multiple testing. SNP rs230530 in the NFKB1 gene, encoding the transcription regulator NF-kappa-B, was the only SNP indicated in both ancestry groups and both addictions. This SNP was previously identified in association with alcohol addiction. SNP rs3915568 in NTSR1, which encodes neurotensin receptor, and SNP rs1389752 in MPDZ, which encodes the multiple PDZ domain protein, were previously associated with heroin addiction or alcohol addiction, respectively. The study supports the involvement of genetic variation in signal transduction pathways in heroin and cocaine addiction and provides preliminary evidence suggesting several new risk or protective loci that may be relevant for diagnosis and treatment success.
Scientia Forestalis/Forest Sciences
Pharmacogenomics, 2015
Drug addiction is influenced by genetic factors. The aim of this study was to determine if genetic variants in the serotonergic and adrenergic pathways are associated with heroin and/or cocaine addiction. The study examined 140 polymorphisms in 19 genes in 1855 subjects with predominantly European or African ancestries. A total of 38 polymorphisms (13 genes) showed nominal associations, including novel associations in S100A10 (p11) and SLC18A2 (VMAT2). The association of HTR3B SNP rs11606194 with heroin addiction in the European ancestry subgroup remained significant after correction for multiple testing (p corrected = 0.04). The study strengthens our previous findings of association of polymorphisms in HTR3A, HTR3B and ADRA1A. The study suggests partial overlap in genetic susceptibility between populations of different ancestry and between heroin and cocaine addiction.
Pharmacogenomics, 2014
The dopaminergic pathways have been implicated in the etiology of drug addictions. The aim of this study was to determine if variants in dopaminergic genes are associated with heroin addiction. The study includes 828 former heroin addicts and 232 healthy controls, of predominantly European ancestry. Ninety-seven SNPs (13 genes) were analyzed. Nine nominally significant associations were observed at CSNK1E, ANKK1, DRD2 and DRD3. The results support our previous report of association of CSNK1E SNP rs1534891 with protection from heroin addiction. CSNK1E interacts with circadian rhythms and DARPP-32 and has been implicated in negative regulation of sensitivity to opioids in rodents. It may be a target for drug addiction treatment. Original submitted 8 August 2014; Revision submitted 8 October 2014.
Ciência Rural, 2007
Integrated production (IP) meets a growing demand for quality fruit and ensures food safety, environmentally sound production and traceability. From the IP perspective, fertilization and disease control practices are closely related; in ...

Proceedings of the 2011 15th International Conference on Computer Supported Cooperative Work in Design (CSCWD), 2011
3D virtual worlds are typically designed to reproduce the physical world, both in the features of the meeting place and in how people interact with each other. We try to get a better understanding of technical and behavioral issues found in different meeting environments designed for collaborative work. Based on the problems that we found, we propose the SLMeetingRoom, a model of environment for work meetings composed of a set of components to support basic working activities in Second Life. We also performed a pilot study under four different conditions of communication: face-to-face, videoconference, Second Life without the SLMeetingRoom and Second Life with the SLMeetingRoom. The pilot study indicated that the SLMeetingRoom model is a promising environment according to the criteria of cognitive effort and sense of presence; however, for task completeness and participation level, we could not find statistical evidence to support our research hypotheses.
Avances en Enfermería, 2014
Memórias do Instituto Oswaldo Cruz, 2013
We analysed the antimicrobial susceptibility, biofilm formation and genotypic profiles of 27 isolates of Staphylococcus haemolyticus obtained from the blood of 19 patients admitted to a hospital in Rio de Janeiro, Brazil. Our analysis revealed a clinical significance of 36.8% and a multi-resistance rate of 92.6% among these isolates. All but one isolate carried the mecA gene. The staphylococcal cassette chromosome mec type I was the most prevalent mec element detected (67%). Nevertheless, the isolates showed clonal diversity based on pulsed-field gel electrophoresis analysis. The ability to form biofilms was detected in 66% of the isolates studied. Surprisingly, no icaAD genes were found among the biofilm-producing isolates.

Computational Statistics & Data Analysis, 2008
This paper introduces a tree-based model that combines aspects of classification and regression trees (CART) and smooth transition regression (STR). The model is called the STR-tree. The main idea relies on specifying a parametric nonlinear model through a tree-growing procedure. The resulting model can be analyzed as a smooth transition regression with multiple regimes. Decisions about splits are entirely based on a sequence of Lagrange multiplier (LM) tests of hypotheses. An alternative specification strategy based on 10-fold cross-validation is also discussed, and a Monte Carlo experiment is carried out to evaluate the performance of the proposed methodology in comparison with standard techniques. The STR-tree model outperforms CART in correctly selecting the architecture of simulated trees. Furthermore, the LM test seems to be a promising alternative to 10-fold cross-validation. Function approximation is also analyzed. When put to the test with real and simulated data sets, the STR-tree model shows predictive ability superior to that of CART.