Papers by Giancarlo Mauri
Theoretical Computer Science, Jul 1, 2019
We study a variant of the problem of finding a collection of disjoint s-clubs in a given network. Given a graph, the problem asks whether there exists a collection of at most r disjoint s-clubs that covers at least k vertices of the network. An s-club is a connected graph that has diameter bounded by s, for a positive integer s. We demand that each club is non-trivial, that is, it has order at least t ≥ 2, for some positive integer t. We prove that the problem is APX-hard even when the input graph has bounded degree, s = 2, t = 3 and r = |V|. Moreover, we show that the problem is polynomial-time solvable when s ≥ 4, t = 3 and r = |V|, and when s ≥ 3, t = 2 and r = |V|. Finally, for s ≥ 2, we present a fixed-parameter algorithm for the problem, when parameterized by the number of covered vertices.
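As an illustration (not taken from the paper), the defining property of an s-club — connectivity and diameter at most s measured inside the induced subgraph — can be checked with a BFS from every vertex. The adjacency representation and function name below are hypothetical:

```python
from collections import deque

def is_s_club(adj, nodes, s):
    """Check whether the subgraph induced by `nodes` is an s-club:
    connected, with diameter at most s, where distances are measured
    inside the induced subgraph. `adj` maps vertex -> set of neighbours."""
    nodes = set(nodes)
    for src in nodes:
        # BFS restricted to the induced subgraph
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v in nodes and v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        # every vertex must be reachable, within distance s
        if set(dist) != nodes or max(dist.values()) > s:
            return False
    return True

# A path a-b-c has diameter 2: it is a 2-club but not a 1-club.
adj = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}
print(is_s_club(adj, ['a', 'b', 'c'], 2))  # True
print(is_s_club(adj, ['a', 'b', 'c'], 1))  # False
```

Note that distances must be computed in the induced subgraph, not the host graph: a set of vertices pairwise close in the network may fail to be an s-club once outside vertices are removed.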

The Journal of Supercomputing, Aug 24, 2018
Ordinary differential equations (ODEs) are a widespread formalism for the mathematical modeling o... more Ordinary differential equations (ODEs) are a widespread formalism for the mathematical modeling of natural and engineering systems, whose analysis is generally performed by means of numerical integration methods. However, real-world models are often characterized by stiffness, a circumstance that can lead to prohibitive execution times. In such cases, the practical viability of many computational tools-e.g., sensitivity analysis-is hampered by the necessity to carry out a large number of simulations. In this work, we present ginSODA, a general-purpose black-box numerical integrator that distributes the calculations on graphics processing units, and allows to run massive numbers of numerical integrations of ODE systems characterized by stiffness. By leveraging symbolic differentiation, meta-programming techniques, and source code hashing, ginSODA automatically builds highly optimized binaries for the CUDA architecture, preventing code re-compilation and allowing to speed up the computation with respect to the sequential execution. ginSODA also provides a simplified Python interface, which allows to define a system of ODEs and the test to be performed in a few lines of code. According to our results, ginSODA provides up to a 25× speedup with respect to the sequential execution.
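ginSODA itself targets CUDA and is not reproduced here; as a minimal stdlib sketch of why stiffness forces specialized integrators, compare implicit and explicit Euler on the stiff test equation y' = λy with λ = −1000 (function names are illustrative, not ginSODA's API):

```python
def backward_euler(lam, y0, t_end, h):
    """Implicit (backward) Euler for y' = lam*y, lam << 0.
    The update y_{n+1} = y_n + h*lam*y_{n+1} solves to
    y_{n+1} = y_n / (1 - h*lam), which is stable for any step h > 0."""
    y, t = y0, 0.0
    while t < t_end:
        y = y / (1.0 - h * lam)
        t += h
    return y

def forward_euler(lam, y0, t_end, h):
    """Explicit Euler: y_{n+1} = (1 + h*lam) * y_n.
    Diverges as soon as |1 + h*lam| > 1, i.e. unless h < 2/|lam|."""
    y, t = y0, 0.0
    while t < t_end:
        y = y + h * lam * y
        t += h
    return y

lam = -1000.0
# The true solution decays to ~0; with h = 0.01 the implicit method
# stays stable while the explicit one blows up by a factor of 9 per step.
print(abs(backward_euler(lam, 1.0, 1.0, 0.01)))
print(abs(forward_euler(lam, 1.0, 1.0, 0.01)))
```

With stiff systems the explicit step size needed for stability (here h < 0.002) is far smaller than what accuracy alone would require, which is why implicit solvers, and GPU parallelism across many integrations, pay off.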

arXiv (Cornell University), Mar 29, 2019
Prostate cancer is the most common cancer among US men. However, prostate imaging is still challenging despite the advances in multi-parametric Magnetic Resonance Imaging (MRI), which provides both morphologic and functional information pertaining to the pathological regions. Along with whole prostate gland segmentation, distinguishing between the Central Gland (CG) and Peripheral Zone (PZ) can guide differential diagnosis, since the frequency and severity of tumors differ in these regions; however, their boundary is often weak and fuzzy. This work presents a preliminary study on Deep Learning to automatically delineate the CG and PZ, aiming at evaluating the generalization ability of Convolutional Neural Networks (CNNs) on two multi-centric MRI prostate datasets. In particular, we compared three CNN-based architectures: SegNet, U-Net, and pix2pix. In such a context, the segmentation performances achieved with and without pre-training were compared in 4-fold cross-validation. In general, U-Net outperforms the other methods, especially when training and testing are performed on multiple datasets.

Communications in computer and information science, 2017
The links between metabolic dysfunctions and various diseases or pathological conditions are being increasingly revealed. This revival of interest in cellular metabolism has pushed forward new experimental technologies enabling the characterization of metabolic phenotypes. Unfortunately, while large datasets are being collected, which encompass the concentration of many metabolites of a system under different conditions, these datasets remain largely obscure. In fact, in spite of the efforts to interpret alterations in metabolic concentrations, it is difficult to correctly ascribe them to the corresponding variations in metabolic fluxes (i.e. the rate of turnover of molecules through metabolic pathways) and thus to the up- or down-regulation of given pathways. As a first step towards a systematic procedure to connect alterations in metabolic fluxes with shifts in metabolites, we propose to exploit a Monte Carlo approach to look for correlations between the variations in fluxes and in metabolites, observed when simulating the response of a metabolic network to a given perturbation. As a proof of principle, we investigate the dynamics of a simplified ODE model of yeast metabolism under different glucose abundances. We show that, although some linear correlations between shifts in metabolites and fluxes exist, those relationships are far from obvious. In particular, metabolite levels can show a low correlation with changes in the fluxes of the reactions that directly involve them, while exhibiting a strong connection with alterations in fluxes that are far apart in the network.
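The Monte Carlo idea — sample perturbations, simulate, and correlate flux shifts with metabolite shifts — can be sketched on a toy one-metabolite pathway (influx v_in, first-order efflux k·M, steady state M* = v_in/k). This toy model and all names are illustrative, not the paper's yeast model:

```python
import random
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Monte Carlo: perturb both the influx and the efflux rate constant,
# record (pathway flux, steady-state metabolite level) for each sample.
rng = random.Random(42)
samples = []
for _ in range(500):
    v_in = rng.uniform(0.5, 1.5)          # perturbed influx
    k = rng.uniform(0.5, 1.5)             # perturbed efflux rate constant
    samples.append((v_in, v_in / k))      # flux, metabolite steady state

flux, metab = zip(*samples)
r = pearson(flux, metab)
# r is clearly positive but well below 1: even in this trivial network
# the metabolite level only partially tracks the flux that feeds it.
```

Because the metabolite level depends on the ratio of two perturbed quantities, flux and concentration decouple — a miniature version of the paper's observation that metabolite shifts need not mirror the fluxes of adjacent reactions.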
Springer eBooks, 1997
In this paper, Multilayered Automata Networks are formally defined as a generalization of Cellular Automata Networks. They are hierarchically organized on the basis of nested graphs, and can show different kinds of dynamics, which make them suitable for modeling, for example, complex biological systems comprised of different entities organized in a hierarchical framework.

Theoretical Computer Science, Dec 1, 2019
Inspired by scaffold filling, a recent approach for genome reconstruction from incomplete data, we consider a variant of the well-known longest common subsequence problem for the comparison of two sequences. The new problem, called Longest Filled Common Subsequence (LFCS), aims to compare a complete sequence with an incomplete one, i.e. one with some missing elements. Given a complete sequence A, an incomplete sequence B, and a multiset M of symbols missing in B, LFCS asks for a sequence B* obtained by inserting the symbols of M into B so that B* induces a common subsequence with A of maximum length. We investigate the computational and approximation complexity of the problem: we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol, and we give a polynomial-time algorithm when the input sequences are over a constant-size alphabet. We give a 3/5-approximation algorithm for the Longest Filled Common Subsequence problem. Finally, we present a fixed-parameter algorithm for the problem, when it is parameterized by the number of symbols inserted in B that "match" symbols of A.
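To make the problem statement concrete, a brute-force LFCS solver (exponential, for tiny instances only — this is an illustration, not one of the paper's algorithms) tries every way of inserting the multiset into B and keeps the best LCS with A:

```python
def lcs(a, b):
    """Standard dynamic-programming longest common subsequence length."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if x == y
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[-1][-1]

def lfcs(a, b, missing):
    """Exhaustive Longest Filled Common Subsequence: insert the multiset
    `missing` into b in every possible way, return the best LCS with a."""
    if not missing:
        return lcs(a, b)
    best = 0
    tried = set()
    for i, sym in enumerate(missing):
        if sym in tried:          # skip duplicate symbols of the multiset
            continue
        tried.add(sym)
        rest = missing[:i] + missing[i + 1:]
        for pos in range(len(b) + 1):
            best = max(best, lfcs(a, b[:pos] + sym + b[pos:], rest))
    return best

# A = "abcab", B = "ab", M = {c, a}: filling B to "abca" yields an LCS
# of length 4 with A, versus only 2 for the unfilled B.
print(lfcs("abcab", "ab", ["c", "a"]))  # 4
```

The hardness results in the abstract say exactly that this exponential search cannot in general be replaced by an efficient exact procedure, which motivates the approximation and fixed-parameter algorithms.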

Fundamenta Informaticae, Aug 9, 2017
Reaction systems represent a theoretical framework based on the regulation mechanisms of facilitation and inhibition of biochemical reactions. The dynamic process defined by a reaction system is typically derived by hand, starting from the set of reactions and a given context sequence. However, this procedure may be error-prone and time-consuming, especially when the size of the reaction system increases. Here we present HERESY, a simulator of reaction systems accelerated on Graphics Processing Units (GPUs). HERESY is based on a fine-grained parallelization strategy, whereby all reactions are simultaneously executed on the GPU, therefore reducing the overall running time of the simulation. HERESY is particularly advantageous for the simulation of large-scale reaction systems, consisting of hundreds or thousands of reactions. By considering as test cases some reaction systems with an increasing number of reactions and entities, as well as an increasing number of entities per reaction, we show that HERESY allows up to a 29× speed-up with respect to a CPU-based simulator of reaction systems. Finally, we provide some directions for the optimization of HERESY, considering minimal reaction systems in normal form.
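The one-step semantics that a simulator like HERESY parallelizes is simple to state: a reaction fires when all its reactants are present and none of its inhibitors is, and the next state is the union of the products of all enabled reactions. A sequential stdlib sketch (the function name and toy system are illustrative, not HERESY's interface):

```python
def rs_result(reactions, state):
    """One step of a reaction system. Each reaction is a triple
    (reactants, inhibitors, products) of sets; a reaction is enabled
    on `state` when reactants <= state and no inhibitor is present.
    The next state is the union of products of all enabled reactions."""
    nxt = set()
    for reactants, inhibitors, products in reactions:
        if reactants <= state and not (inhibitors & state):
            nxt |= products
    return nxt

# A single reaction: needs 'a', is inhibited by 'b', produces both.
reactions = [
    ({'a'}, {'b'}, {'a', 'b'}),
]
print(rs_result(reactions, {'a'}))       # {'a', 'b'}
print(rs_result(reactions, {'a', 'b'}))  # set(): the reaction is inhibited
```

Note there is no persistence: an entity survives only if some enabled reaction produces it, which is why even this one-reaction system oscillates. HERESY's fine-grained strategy evaluates the enabledness test of every reaction in parallel on the GPU.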

BMC Bioinformatics, Apr 1, 2019
Background: In order to fully characterize the genome of an individual, the reconstruction of the two distinct copies of each chromosome, called haplotypes, is essential. The computational problem of inferring the full haplotype of a cell starting from read sequencing data is known as haplotype assembly, and consists in assigning all heterozygous Single Nucleotide Polymorphisms (SNPs) to exactly one of the two chromosomes. Indeed, the knowledge of complete haplotypes is generally more informative than analyzing single SNPs and plays a fundamental role in many medical applications. Results: To reconstruct the two haplotypes, we addressed the weighted Minimum Error Correction (wMEC) problem, which is a successful approach for haplotype assembly. This NP-hard problem consists in computing the two haplotypes that partition the sequencing reads into two disjoint subsets, with the least number of corrections to the SNP values. To this aim, we propose here GenHap, a novel computational method for haplotype assembly based on Genetic Algorithms, yielding optimal solutions by means of a global search process. In order to evaluate the effectiveness of our approach, we ran GenHap on two synthetic (yet realistic) datasets, based on the Roche/454 and PacBio RS II sequencing technologies. We compared the performance of GenHap against HapCol, an efficient state-of-the-art algorithm for haplotype phasing. Our results show that GenHap always obtains high accuracy solutions (in terms of haplotype error rate), and is up to 4× faster than HapCol in the case of Roche/454 instances and up to 20× faster when compared on the PacBio RS II dataset. Finally, we assessed the performance of GenHap on two different real datasets.
Conclusions: Future-generation sequencing technologies, producing longer reads with higher coverage, can highly benefit from GenHap, thanks to its capability of efficiently solving large instances of the haplotype assembly problem. Moreover, the optimization approach proposed in GenHap can be extended to the study of allele-specific genomic features, such as expression, methylation and chromatin conformation, by exploiting multi-objective optimization techniques. The source code and the full documentation are available at the following GitHub repository: https://github.com/andrea-tango/GenHap.
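The objective that GenHap's genetic algorithm minimizes can be illustrated in its unweighted form: given a candidate haplotype pair, each read is assigned to the closer haplotype, and the cost is the number of SNP values that must be corrected. This sketch (names hypothetical, unweighted for simplicity — GenHap solves the weighted variant) evaluates that cost:

```python
def mec_cost(reads, h1, h2):
    """Unweighted MEC cost of a fixed haplotype pair. Each read is a
    string over {'0', '1', '-'}, where '-' marks a SNP not covered by
    the read. Every read is charged the Hamming distance to the closer
    of the two haplotypes; the total is the number of corrections."""
    def dist(read, hap):
        return sum(1 for r, h in zip(read, hap) if r != '-' and r != h)
    return sum(min(dist(r, h1), dist(r, h2)) for r in reads)

# Three error-free reads plus one read with a single sequencing error:
# only that last read requires a correction against this haplotype pair.
reads = ["010-", "-101", "1101", "1111"]
print(mec_cost(reads, "0100", "1101"))  # 1
```

wMEC replaces the unit charge with per-SNP confidence weights from the sequencer; the hard part, which GenHap searches globally, is choosing the haplotype pair (equivalently, the bipartition of reads) minimizing this cost.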
Lecture Notes in Computer Science, 2009
The modelling of biochemical systems requires the knowledge of several quantitative parameters (e.g. reaction rates) which are often hard to measure in laboratory experiments. Furthermore, when the system involves small numbers of molecules, the modelling approach should also take into account the effects of randomness on the system dynamics. In this paper, we tackle the problem of estimating the unknown

Swarm and evolutionary computation, Apr 1, 2018
Among the existing global optimization algorithms, Particle Swarm Optimization (PSO) is one of the most effective methods for non-linear and complex high-dimensional problems. Since PSO performance strongly depends on the choice of its settings (i.e., inertia, cognitive and social factors, minimum and maximum velocity), Fuzzy Logic (FL) was previously exploited to select these values. So far, FL-based implementations of PSO aimed at the calculation of a single set of settings for the whole swarm. In this work we propose a novel self-tuning algorithm, called Fuzzy Self-Tuning PSO (FST-PSO), which exploits FL to calculate the inertia, the cognitive and social factors, and the minimum and maximum velocity independently for each particle, thus realizing a completely settings-free version of PSO. The novelty and strength of FST-PSO lie in the fact that it does not require any expertise in PSO functioning, since the behavior of every particle is automatically and dynamically adjusted during the optimization. We compare the performance of FST-PSO with standard PSO, Proactive Particles in Swarm Optimization, Artificial Bee Colony, Covariance Matrix Adaptation Evolution Strategy, Differential Evolution and Genetic Algorithms. We empirically show that FST-PSO basically outperforms all tested algorithms with respect to convergence speed and is competitive concerning the best solutions found, notably with a reduced computational effort.
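The settings that FST-PSO tunes per particle are the constants of the classic PSO velocity update. A minimal standard PSO with fixed, swarm-wide settings (a baseline sketch under assumed parameter values, not FST-PSO itself) makes explicit which knobs are involved:

```python
import random

def pso(f, dim, n_particles=20, iters=200, lo=-5.0, hi=5.0,
        w=0.72, c1=1.5, c2=1.5, seed=0):
    """Standard PSO minimizing f over [lo, hi]^dim.
    w: inertia; c1: cognitive factor; c2: social factor --
    exactly the settings FST-PSO adapts independently per particle."""
    rng = random.Random(seed)
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pb = [x[:] for x in xs]                # personal bests
    pbv = [f(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbv[i])
    gb, gbv = pb[g][:], pbv[g]             # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pb[i][d] - xs[i][d])
                            + c2 * r2 * (gb[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            val = f(xs[i])
            if val < pbv[i]:
                pb[i], pbv[i] = xs[i][:], val
                if val < gbv:
                    gb, gbv = xs[i][:], val
    return gb, gbv

# Sphere function: the swarm should contract close to the origin.
best, val = pso(lambda x: sum(t * t for t in x), dim=3)
```

With a single (w, c1, c2) triple the whole swarm shares one exploration/exploitation balance; FST-PSO's contribution is to let fuzzy rules set these values particle by particle during the run.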

Scientific Reports, Apr 8, 2021
Self-assembling processes are ubiquitous phenomena that drive the organization and the hierarchical formation of complex molecular systems. The investigation of assembling dynamics, emerging from the interactions among biomolecules like amino-acids and polypeptides, is fundamental to determine how a mixture of simple objects can yield a complex structure at the nano-scale level. In this paper we present HyperBeta, a novel open-source software that exploits an innovative algorithm based on hyper-graphs to efficiently identify and graphically represent the dynamics of β-sheet formation. Differently from the existing tools, HyperBeta directly manipulates data generated by means of coarse-grained molecular dynamics simulation tools (GROMACS), performed using the MARTINI force field. Coarse-grained molecular structures are visualized using HyperBeta's proprietary real-time high-quality 3D engine, which provides a plethora of analysis tools and statistical information, controlled by means of an intuitive event-based graphical user interface. The high-quality renderer relies on a variety of visual cues to improve the readability and interpretability of distance and depth relationships between peptides. We show that HyperBeta is able to track β-sheet formation in coarse-grained molecular dynamics simulations, and provides a completely new and efficient means for the investigation of the kinetics of these nano-structures. HyperBeta will therefore facilitate biotechnological and medical research where these structural elements play a crucial role, such as the development of novel high-performance biomaterials in tissue engineering, or a better comprehension of the molecular mechanisms at the basis of complex pathologies like Alzheimer's disease.

Journal of Translational Medicine, Oct 20, 2015
Background: Several promising biomarkers have been found for RCC, but none of them has been used in clinical practice for predicting tumour progression. The most widely used features for predicting tumour aggressiveness still remain the cancer stage, size and grade. Therefore, the aim of our study is to investigate the urinary peptidome to search and identify peptides whose concentrations in urine are linked to tumour growth measure and clinical data. Methods: A proteomic approach applied to the ccRCC urinary peptidome (n = 117), based on prefractionation with activated magnetic beads followed by MALDI-TOF profiling, was used. A systematic correlation study was performed on urinary peptide profiles obtained from MS analysis. Peptide identity was obtained by LC-ESI-MS/MS. Results: Fifteen, twenty-six and five peptides showed a statistically significant alteration of their urinary concentration according to tumour size, pT and grade, respectively. Furthermore, 15 and 9 signals were observed to have urinary levels statistically modified in patients at different pT or grade values, even at very early stages.

PLOS ONE, May 27, 2014
Defining the aggressiveness and growth rate of a malignant cell population is a key step in the clinical approach to treating tumor disease. The correct grading of breast cancer (BC) is a fundamental part in determining the appropriate treatment. Biological variables can make it difficult to elucidate the mechanisms underlying BC development. To identify potential markers that can be used for BC classification, we analyzed mRNA expression profiles, gene copy numbers, microRNA expression and their association with tumor grade in BC microarray-derived datasets. From mRNA expression results, we found that grade 2 BC is most likely a mixture of grade 1 and grade 3 that have been misclassified, being described by the gene signature of either grade 1 or grade 3. We assessed the potential of the new approach of integrating mRNA expression profiles, copy number alterations, and microRNA expression levels to select a limited number of genomic BC biomarkers. The combination of mRNA profile analysis and copy number data with microRNA expression levels led to the identification of two gene signatures of 42 and 4 altered genes (FOXM1, KPNA4, H2AFV and DDX19A) respectively, the latter obtained through a meta-analytical procedure. The 42-gene signature identifies 4 classes of up- or down-regulated microRNAs (17 microRNAs) and their 17 target mRNAs, and the 4-gene signature identified 4 microRNAs (Hsa-miR-320d, Hsa-miR-139-5p, Hsa-miR-567 and Hsa-let-7c). These results are discussed from a biological point of view with respect to pathological features of BC. Our identified mRNAs and microRNAs were validated as prognostic factors of BC disease progression, and could potentially facilitate the implementation of assays for laboratory validation, due to their reduced number.

arXiv (Cornell University), Jun 4, 2018
Finding cohesive subgraphs in a network is a well-known problem in graph theory. Several alternative formulations of cohesive subgraph have been proposed, a notable example being s-club, which is a subgraph where each vertex is at distance at most s from the others. Here we consider the problem of covering a given graph with the minimum number of s-clubs. We study the computational and approximation complexity of this problem, when s is equal to 2 or 3. First, we show that deciding if there exists a cover of a graph with three 2-clubs is NP-complete, and that deciding if there exists a cover of a graph with two 3-clubs is NP-complete. Then, we consider the approximation complexity of covering a graph with the minimum number of 2-clubs and 3-clubs. We show that, given a graph G = (V, E) to be covered, covering G with the minimum number of 2-clubs is not approximable within factor O(|V|^{1/2−ε}), for any ε > 0, and covering G with the minimum number of 3-clubs is not approximable within factor O(|V|^{1−ε}), for any ε > 0. On the positive side, we give an approximation algorithm of factor 2|V|^{1/2} log^{3/2} |V| for covering a graph with the minimum number of 2-clubs.
Fundamenta Informaticae, 2014
We continue the investigation of the computational power of space-constrained P systems. We show that only a constant amount of space is needed in order to simulate a polynomial-space bounded Turing machine. Due to this result, we propose an alternative definition of space complexity for P systems, where the amount of information contained in individual objects and membrane labels is also taken into account. Finally, we prove that, when less than a logarithmic number of membrane labels is available, moving the input objects around the membrane structure without rewriting them is not enough to even distinguish inputs of the same length.
Lecture Notes in Computer Science, 2019
Among PSPACE-complete problems, QSAT (quantified SAT) is one of the most frequently used to show that the class of problems solvable in polynomial time by families of a given variant of P systems includes the whole of PSPACE. However, most solutions require a membrane nesting depth that is linear in the number of variables of the QSAT instance under consideration. While a system of a certain depth is needed, since depth-1 systems only allow solving problems in P^#P, it was until now unclear whether a linear depth was, in fact, necessary. Here we use P systems with active membranes with charges, and we provide a construction proving that QSAT can be solved with a sublinear nesting depth of order n/log n, where n is the number of variables in the quantified formula given as input.
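For reference, the problem being solved is easy to state even though it is PSPACE-complete: evaluate a boolean formula under alternating quantifiers. A brute-force evaluator (exponential time, illustrative only; the encoding conventions are hypothetical) branches on each quantified variable in prefix order:

```python
def eval_qbf(quantifiers, clauses, assignment=()):
    """Evaluate a prenex quantified boolean formula.
    `quantifiers` is a list of ('A' or 'E', variable-index) pairs in
    prefix order; the i-th pair binds variable i+1. `clauses` is a CNF:
    literals are 1-based ints, negative meaning negated."""
    i = len(assignment)
    if i == len(quantifiers):
        # All variables bound: check every clause under the assignment.
        return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in cl)
                   for cl in clauses)
    q, _ = quantifiers[i]
    branches = (eval_qbf(quantifiers, clauses, assignment + (b,))
                for b in (False, True))
    return any(branches) if q == 'E' else all(branches)

# ∀x1 ∃x2 : (x1 ∨ x2) ∧ (¬x1 ∨ ¬x2) — true: pick x2 = ¬x1.
print(eval_qbf([('A', 1), ('E', 2)], [[1, 2], [-1, -2]]))  # True
# ∃x1 ∀x2 : same matrix — false: no single x1 works for both x2 values.
print(eval_qbf([('E', 1), ('A', 2)], [[1, 2], [-1, -2]]))  # False
```

The recursion tree here has depth n, one level per variable; the membrane-systems result in the abstract shows that, in that model, the analogous nesting can be compressed to depth O(n/log n).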

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 2017
Obesity is now one of the most critical and demanding public health conditions, due to its correlation with many medical and psychological comorbidities, such as cardiovascular, orthopedic, pneumological, endocrinological and psychopathological complications, above all type 2 diabetes. Obesity traditionally requires long and expensive treatments within a chronic care management approach, so clinical research has to develop, test and validate cheaper rehabilitation programs. For this reason, we developed the DIABESITY study: the design of an integrated mHealth platform to promote the empowerment of patients in self-monitoring and successfully managing their pathological conditions (focusing on obesity and type 2 diabetes) through the use of mobile devices. In this paper we report on this study by discussing the following two important aspects of DIABESITY: (i) dietary mHealth tools for home patients; (ii) measures to capture the psychological factors and processes which mediate behavior change and affect the initiation and maintenance phases.
Bio-Inspired Computing Models and Algorithms, 2019
Enjoying Natural Computing, 2018
We present some high-level open problems in the complexity theory of membrane systems, related to the actual computing power of confluence vs determinism, semi-uniformity vs uniformity, deep vs shallow membrane structures, and membrane division vs internal evolution of membranes. For each of these problems we present some reasonable approaches that, however, cannot be employed "as-is" to provide a complete solution. This will hopefully spark new ideas that will allow tackling these open problems.