Papers by Panagiotis Michailidis
Systems and Computational Biology - Bioinformatics and Computational Modeling
Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms, 2011
Parameter estimation in nonlinear regression models (NRMs) represents a major challenge for various scientific computing applications. In this study, we briefly consider a recent population-based metaheuristic algorithm named Jaya and use it to estimate the parameters of NRMs. The algorithm is experimentally tested on a set of benchmark regression problems of various levels of difficulty. We show that the algorithm can be used as an alternative means of parameter estimation in NRMs: it is computationally efficient and achieves a high success rate and accuracy.
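The Jaya update rule moves each candidate toward the current best solution and away from the current worst, with no algorithm-specific control parameters. A minimal Python sketch of the rule, shown here minimising a simple sum-of-squares objective as a stand-in for the residual sum of squares of a regression model (the objective, bounds, and population settings are illustrative assumptions, not the paper's experimental setup):

```python
import random

def jaya(objective, bounds, pop_size=20, iters=200, seed=1):
    """Minimise objective over box bounds with the Jaya update rule."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(iters):
        scores = [objective(x) for x in pop]
        best = pop[scores.index(min(scores))]
        worst = pop[scores.index(max(scores))]
        for k, x in enumerate(pop):
            cand = []
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Jaya rule: move toward best, away from worst
                v = x[j] + r1 * (best[j] - abs(x[j])) - r2 * (worst[j] - abs(x[j]))
                cand.append(min(hi, max(lo, v)))    # clamp to bounds
            if objective(cand) < scores[k]:          # greedy acceptance
                pop[k] = cand
    return min(pop, key=objective)
```

For NRM fitting, `objective` would be the sum of squared residuals of the model over the data set.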

International Journal of Parallel, Emergent and Distributed Systems, 2017
In this work, we propose a hybrid parallel Jaya optimisation algorithm for a multi-core environment with the aim of solving large-scale global optimisation problems. The proposed algorithm, called HHCPJaya, combines the hyper-population approach with a hierarchical cooperative search mechanism. HHCPJaya divides the population into many small subpopulations, each of which focuses on a distinct block of the original problem dimensions. In the hyper-population approach, we increase the number of small subpopulations by assigning more than one subpopulation to each core, and each subpopulation evolves independently, enhancing the explorative and exploitative nature of the population. We combine this hyper-population approach with a two-level hierarchical cooperative search scheme that assembles global solutions from all subpopulations. Furthermore, we incorporate an additional updating phase on the respective subpopulations based on the global solutions, with the aim of further improving the convergence rate and the quality of solutions. Several experiments applying the proposed parallel algorithm in different settings show that it is promising with respect to both solution quality and convergence rate, and that it requires relatively little computational effort to solve complex, large-scale optimisation problems.
Proceedings of the 18th Panhellenic Conference on Informatics, 2014
In this paper we present a parallel implementation of the string matching problem with static text allocation. Experiments are realized using the Message Passing Interface (MPI) library on a homogeneous and a heterogeneous cluster of workstations. Our experimental results show that this implementation achieves significant speedup for different text sizes and numbers of workstations. We also present a performance prediction model that is general enough to address performance evaluation of both types of computation (homogeneous and heterogeneous). This model agrees well with our experimental measurements of both implementations.
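The static text allocation can be illustrated without MPI: the text is split into equal chunks, each extended by m−1 characters of overlap so that occurrences spanning a chunk boundary are not missed, and each chunk is searched independently. A serial Python sketch of this decomposition (in the paper each chunk would be searched by an MPI worker; the loop over workers here only simulates that, and the function names are illustrative):

```python
def find_all(chunk, pattern, base):
    """All occurrence positions of pattern in chunk, offset by base."""
    hits, i = [], chunk.find(pattern)
    while i != -1:
        hits.append(base + i)
        i = chunk.find(pattern, i + 1)
    return hits

def partitioned_search(text, pattern, workers):
    m, n = len(pattern), len(text)
    size = (n + workers - 1) // workers          # chunk size per worker
    hits = []
    for w in range(workers):                     # each iteration = one worker
        start = w * size
        end = min(n, start + size + m - 1)       # overlap of m-1 characters
        # keep only matches starting inside this worker's own chunk,
        # so overlap regions do not produce duplicates
        hits += [p for p in find_all(text[start:end], pattern, start)
                 if p < start + size]
    return hits
```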
Sci. Ann. Cuza Univ., 2002
Scalable Computing: Practice and Experience, Jun 30, 2015
Multiple pattern matching algorithms are used to locate the occurrences of patterns from a finite pattern set in a large input string. Aho-Corasick, Set Horspool, Set Backward Oracle Matching, Wu-Manber and SOG, five of the best-known algorithms for multiple pattern matching, require considerable computing power, particularly when large datasets must be processed, as is common in computational biology applications. Over the past years, Graphics Processing Units (GPUs) have evolved into powerful parallel processors that outperform CPUs in scientific applications. This paper evaluates the speedup of the basic parallel strategy and of different optimization strategies for parallelizing the Aho-Corasick, Set Horspool, Set Backward Oracle Matching, Wu-Manber and SOG algorithms on a GPU.
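As a point of reference for the sequential baseline, Aho-Corasick builds a trie of the pattern set with failure links and then scans the text in a single pass. A compact Python sketch (the GPU strategies evaluated in the paper parallelise this scan, e.g. over text segments; this is the textbook algorithm, not the paper's GPU code):

```python
from collections import deque

def build_automaton(patterns):
    """Aho-Corasick trie with failure links and output sets."""
    goto, fail, out = [{}], [0], [set()]
    for pat in patterns:                      # build the trie
        s = 0
        for ch in pat:
            if ch not in goto[s]:
                goto.append({}); fail.append(0); out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(pat)
    q = deque(goto[0].values())               # BFS to set failure links
    while q:
        r = q.popleft()
        for ch, s in goto[r].items():
            q.append(s)
            f = fail[r]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[s] = goto[f].get(ch, 0) if f or ch in goto[0] else 0
            if fail[s] == s:                  # depth-1 states fall back to root
                fail[s] = 0
            out[s] |= out[fail[s]]            # inherit matches via failure link
    return goto, fail, out

def search(text, patterns):
    """Return (start_position, pattern) pairs for all occurrences."""
    goto, fail, out = build_automaton(patterns)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for pat in out[s]:
            hits.append((i - len(pat) + 1, pat))
    return hits
```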

Applied mathematical sciences, 2013
The main drawback of kernel density estimation methods is their huge computational requirements, especially for large data sets. One way to accelerate these methods is parallel processing. Recent advances in parallel processing have focused on the use of Graphics Processing Units (GPUs) with the Compute Unified Device Architecture (CUDA) programming model. In this work we discuss a naive and two optimised CUDA algorithms for two kernel estimation methods: univariate and multivariate. The optimised algorithms are based on shared memory tiles and loop unrolling techniques. We also present exploratory experimental results for the proposed CUDA algorithms over several parameter values, such as the number of threads per block, tile size, loop unroll level, number of variables, and data (sample) size. The experimental results show significant performance gains of all proposed CUDA algorithms over the serial CPU version, and modest further speed-ups of the two optimised CUDA algorithms over the naive GPU algorithm. Finally, general conclusions about all proposed CUDA algorithms for some parameters are drawn from the extended performance results.
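The univariate estimator evaluates f̂(x) = (1/nh) Σᵢ K((x − xᵢ)/h) at each query point, an O(n·m) computation for n samples and m query points, which is what the CUDA kernels parallelise over query points. A serial Python sketch with a Gaussian kernel (the bandwidth h is a free parameter; this shows the computation being accelerated, not the CUDA code itself):

```python
import math

def gaussian_kde(samples, points, h):
    """Evaluate a univariate Gaussian kernel density estimate at points."""
    n = len(samples)
    c = 1.0 / (n * h * math.sqrt(2 * math.pi))   # normalising constant
    # outer loop over query points = the loop a GPU maps to threads
    return [c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)
            for x in points]
```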
Proceedings of the Sixth International Conference on Engineering Computational Technology
Searching for the Longest Common Subsequence (LCS for short) is one of the most fundamental tasks in bioinformatics. In this paper, we present a parallel implementation of the LCS computation for heterogeneous master-worker platforms. It is the first parallel implementation of this problem in a cluster environment. We also report a set of numerical experiments on a heterogeneous platform in which the computational resources have different computing powers, while the workers are connected to the master by communication links of the same capacity. The experimental results demonstrate that the proposed parallel implementation is efficient, with low communication overhead.
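The underlying computation is the classical dynamic programming recurrence L[i][j] = L[i−1][j−1] + 1 on a match, else max(L[i−1][j], L[i][j−1]). A serial Python sketch keeping only two rows of the table (a master-worker version would distribute bands of this table among workers; the sketch is the baseline, not the paper's parallel scheme):

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    prev = [0] * (n + 1)                 # row i-1 of the DP table
    for i in range(1, m + 1):
        cur = [0] * (n + 1)              # row i
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    return prev[n]
```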

Applied Numerical Mathematics, 2016
Numerical linear algebra is one of the most important forms of scientific computation. The basic computations in numerical linear algebra are matrix computations and the solution of linear systems, which are used as kernels in many computational problems. This study demonstrates the parallelisation of these scientific computations using multicore programming frameworks; specifically, Pthreads, OpenMP, Intel Cilk Plus, Intel TBB, SWARM, and FastFlow. A unified, exploratory performance evaluation and a qualitative study of these frameworks are presented for parallel scientific computations over several parameters. The OpenMP and SWARM models produce good results running in parallel with compiler optimisation when implementing matrix operations at large and medium scales, whereas the remaining models do not perform as well for some matrix operations. The qualitative results show that the OpenMP, Cilk Plus, TBB, and SWARM frameworks require minimal programming effort, whereas the other models require advanced programming skills and experience. Finally, based on an extended study, general conclusions regarding the programming models and matrix operations for some parameters were obtained.
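A matrix-vector product parallelises naturally by partitioning rows among workers, which is the decomposition all of the compared frameworks express in their own idioms. A Python sketch using the standard library's ThreadPoolExecutor (in CPython the GIL limits real speedup for pure-Python arithmetic, so this illustrates the partitioning pattern rather than the performance of the C-level frameworks in the paper):

```python
from concurrent.futures import ThreadPoolExecutor

def matvec_rows(A, x, lo, hi):
    """Compute rows lo..hi-1 of the product A @ x."""
    return [sum(a * b for a, b in zip(A[i], x)) for i in range(lo, hi)]

def parallel_matvec(A, x, workers=4):
    n = len(A)
    step = (n + workers - 1) // workers                  # rows per worker
    bounds = [(lo, min(n, lo + step)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        parts = ex.map(lambda b: matvec_rows(A, x, *b), bounds)
    return [y for part in parts for y in part]           # reassemble in order
```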
Proceedings of the 2012 16th Panhellenic Conference on Informatics, PCI 2012, 2012
This paper implements basic computational kernels of scientific computing, such as the matrix-vector product, matrix product and Gaussian elimination, on multi-core platforms using several parallel programming tools; specifically, Pthreads, OpenMP, Intel Cilk++, Intel TBB, Intel ArBB, SMPSs, SWARM and FastFlow. The aim of this paper is to present a unified quantitative and qualitative study of these tools for parallel computation of scientific computing kernels on multicore platforms. Finally, based on this study we conclude that the ...
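Of the three kernels, Gaussian elimination has the least trivial structure: each elimination step k updates rows k+1…n−1 independently of one another, and that row-update loop is what the multi-core tools parallelise. A serial Python sketch with partial pivoting (illustrative baseline, not the paper's parallel code):

```python
def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]          # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))  # partial pivoting
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):    # independent row updates -> parallel loop
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):   # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x
```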

Proceedings - 15th IEEE International Conference on Computational Science and Engineering, CSE 2012 and 10th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, EUC 2012, 2012
Social researchers use computationally intensive statistical and econometric methods for data analysis. One way to accelerate these computations is parallel computing on multi-core platforms. In this paper we parallelize some representative computational kernels from statistics and econometrics on a multi-core platform using programming libraries such as Pthreads, OpenMP, Intel Cilk++, Intel TBB, Intel ArBB, SWARM and FastFlow. Specifically, these kernels are multivariate descriptive statistics (multivariate mean and multivariate covariance) and kernel density estimation (univariate and multivariate). The purpose of this paper is to present an extensive quantitative and qualitative study of multi-core programming models for parallel statistical and econometric computations. Finally, based on this study we conclude that the Intel ArBB and SWARM programming environments are the most efficient for implementing statistical computations at large and small scales, respectively, because they combine good performance with simplicity of programming.
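The two descriptive-statistics kernels reduce to column means and an outer-product accumulation over deviations. A serial Python sketch (sample covariance with the n−1 divisor; the per-cell accumulations are the independent units a multi-core version would distribute among threads):

```python
def multivariate_mean(X):
    """Column-wise mean of a list of observation rows."""
    n = len(X)
    return [sum(col) / n for col in zip(*X)]

def covariance_matrix(X):
    """Sample covariance matrix (n-1 divisor) of observation rows X."""
    n = len(X)
    mu = multivariate_mean(X)
    d = len(mu)
    C = [[0.0] * d for _ in range(d)]
    for row in X:
        dev = [xi - mi for xi, mi in zip(row, mu)]
        for i in range(d):            # each (i, j) cell is independent work
            for j in range(d):
                C[i][j] += dev[i] * dev[j] / (n - 1)
    return C
```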

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012, 2012
The broad introduction of multi-core platforms into computing has brought a great opportunity to develop computationally demanding applications, such as matrix computations, on parallel computing platforms. Basic matrix computations such as vector and matrix addition, dot product, outer product, matrix transpose, matrix-vector multiplication and matrix multiplication are challenging computational kernels arising in scientific computing. In this paper, we parallelize these basic matrix computations using multi-core parallel programming tools; specifically, Pthreads, OpenMP, Intel Cilk++, Intel TBB, Intel ArBB, SMPSs, SWARM and FastFlow. The purpose of this paper is to present a quantitative and qualitative study of these tools for parallel matrix computations. Finally, based on this study we conclude that the Intel ArBB and SWARM parallel programming tools are the most appropriate because they combine good performance with simplicity of programming.
Journal of Systems Architecture, 2008
In this paper, we present linear processor array architectures for flexible approximate string matching. These architectures are based on parallel realization of dynamic programming and non-deterministic finite automaton algorithms. The algorithms consist of two phases, preprocessing and searching. Starting from the data dependence graphs of the searching phase, parallel algorithms are derived that can be realized directly on special-purpose processor array architectures for approximate string matching. The preprocessing phase is also accommodated on the same processor array designs. Finally, the proposed architectures support flexible patterns, i.e. patterns with a 'don't care' symbol, a complement symbol, or a class symbol.
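The dynamic programming formulation behind approximate string matching is Sellers' algorithm: an edit-distance table whose first row is all zeros, so a match may start at any text position, and every position j with D[m][j] ≤ k ends an occurrence. A serial Python sketch, extended with a '?' don't-care symbol as one example of a flexible pattern (the '?' convention is illustrative; this is the textbook recurrence, not the processor-array realization):

```python
def approx_search(pattern, text, k):
    """Sellers' DP: 0-based end positions of matches with edit distance <= k."""
    m = len(pattern)
    prev = list(range(m + 1))            # column for the empty text prefix
    ends = []
    for j, tc in enumerate(text):
        cur = [0]                        # D[0][j] = 0: match starts anywhere
        for i, pc in enumerate(pattern, 1):
            cost = 0 if (pc == tc or pc == "?") else 1   # '?' is a don't care
            cur.append(min(prev[i - 1] + cost,           # match / substitution
                           prev[i] + 1,                  # insertion in text
                           cur[i - 1] + 1))              # deletion from text
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends
```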

The Journal of Supercomputing, 2005
A longest common subsequence (LCS) of two strings is a common subsequence of maximal length. The LCS problem is to find an LCS of two given strings and its length (LLCS). In this paper, we present a new linear processor array for solving the LCS problem. The array is based on parallelization of a recent LCS algorithm that consists of two phases, preprocessing and computation, where the computation phase is based on a bit-level dynamic programming approach. Implementations of both phases on the same processor array architecture are discussed. Further, we propose a block processor array architecture that reduces the overall communication and time requirements. Finally, we develop a performance model for estimating the performance of the processor array architecture on Pentium processors.
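Bit-level dynamic programming for the LLCS packs one DP column per bit, in the style of the Allison-Dix scheme: the preprocessing phase builds a match bit-mask per alphabet symbol, and the computation phase updates a single bit-vector V per text character, with zero bits of V marking positions where the LCS length has grown, so LLCS = m − popcount(V). A Python sketch of this scheme for reference (one variant of the bit-parallel approach, not necessarily the exact algorithm the array parallelizes):

```python
def llcs_bitparallel(a, b):
    """Bit-parallel LLCS: one bit of V per character of a."""
    m = len(a)
    mask = (1 << m) - 1
    pm = {}                               # preprocessing: match masks
    for i, ch in enumerate(a):
        pm[ch] = pm.get(ch, 0) | (1 << i)
    V = mask                              # all ones: LLCS 0 everywhere
    for ch in b:                          # one O(1) bit-update per character
        U = V & pm.get(ch, 0)
        V = ((V + U) | (V - U)) & mask
    return m - bin(V).count("1")          # zero bits count the LCS length
```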