Papers by Leonid B Sokolinsky
The paper describes the development principles and program structure of the Omega File Management System (OFMS) for the Omega parallel DBMS engine. It states the requirements for OFMS and describes its general structure and components. It presents an effective protocol for interaction with the Disk Subsystem Unit and describes the architecture of the Disk Subsystem Emulator. The paper also proposes a page replacement strategy based on static and dynamic page ratings. The OFMS described was implemented for the MBC-100 massively parallel computing system.

The paper discusses the development and investigation of efficient methods for parallel processing of very large databases on computer clusters using a columnar data representation. An approach that combines the advantages of relational and column-oriented DBMSs is proposed. A new type of distributed column index, fragmented on the domain-interval principle, is introduced. Column indexes are auxiliary structures kept permanently in the distributed main memory of a computer cluster. Surrogate keys are used to match the elements of a column index to the tuples of the original relation. Resource-hungry relational operations are performed on the corresponding column indexes rather than on the original relations of the database. The result is a precomputation table, from which the DBMS reconstructs the resulting relation. For the basic relational operations on column indexes, methods of parallel decomposition that do not require massive data exchanges between processor nodes are proposed. This approach improves the performance of OLAP-class queries by hundreds of times.
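
The idea of a column index fragmented by domain intervals can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sample relation `orders`, the attribute `qty`, and the interval boundaries are all assumptions made for illustration. A column index is modeled as a list of (surrogate key, value) pairs, each fragment holds the pairs whose value falls into one interval of the attribute domain, and a selection runs on the fragments to produce a list of surrogate keys that plays the role of the precomputation table.

```python
from bisect import bisect_right


def build_column_index(relation, attr):
    """Return (surrogate_key, value) pairs for one attribute of a relation."""
    return [(key, row[attr]) for key, row in relation.items()]


def fragment_by_domain_intervals(column_index, boundaries):
    """Split a column index into one fragment per domain interval.

    boundaries = [b1, ..., bm] defines m + 1 intervals:
    (-inf, b1), [b1, b2), ..., [bm, +inf).
    """
    fragments = [[] for _ in range(len(boundaries) + 1)]
    for key, value in column_index:
        fragments[bisect_right(boundaries, value)].append((key, value))
    return fragments


def select_keys(fragment, predicate):
    """Evaluate a selection on one fragment; only surrogate keys are returned."""
    return [key for key, value in fragment if predicate(value)]


# Hypothetical relation keyed by surrogate key.
orders = {
    101: {"qty": 5},
    102: {"qty": 17},
    103: {"qty": 10},
    104: {"qty": 42},
}

index = build_column_index(orders, "qty")
fragments = fragment_by_domain_intervals(index, boundaries=[10, 20])

# Each fragment can be processed independently (here: sequentially).
precomputed = sorted(
    key for frag in fragments for key in select_keys(frag, lambda v: v >= 10)
)

# The resulting relation is reconstructed from the original tuples.
result = [orders[key] for key in precomputed]
```

In this sketch the expensive part (scanning values) touches only the compact column index, and the original relation is read once, by surrogate key, to rebuild the answer.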

1st Ural Workshop on Parallel, Distributed, and Cloud Computing for Young Scientists (Ural-PDC 2015), 2015
The paper describes an approach to parallel natural join execution on computing clusters with GPU and MIC coprocessors. The approach is based on a decomposition of the natural join relational operator using column indices and domain-interval fragmentation. This decomposition allows resource-intensive relational operators to be executed in parallel without data transfers. All column index fragments are stored in main memory. To join two relations, each pair of index fragments corresponding to a particular domain interval is joined on a separate processor core. The approach allows efficient parallel query processing for very large databases on modern computing cluster systems with many-core accelerators. A prototype DBMS coprocessor system was implemented using this technique. Results of computational experiments on GPU and Xeon Phi are presented; they confirm the efficiency of the proposed approach. Keywords: big data · parallel query processing · column indices · domain-interval fragmentation · natural join · GPU · MIC
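
The per-interval join described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the prototype from the paper: the helper names, the sample indices, the boundaries, and the use of a thread pool in place of GPU or Xeon Phi cores are all hypothetical. Both column indices are fragmented with the same interval boundaries, so tuples with equal join values always land in fragments with the same number, and each pair of fragments can be joined without looking at any other pair.

```python
from bisect import bisect_right
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fragment(column_index, boundaries):
    """Split (surrogate_key, value) pairs into one fragment per domain interval."""
    frags = [[] for _ in range(len(boundaries) + 1)]
    for key, value in column_index:
        frags[bisect_right(boundaries, value)].append((key, value))
    return frags


def join_fragments(pair):
    """Hash-join two fragments that cover the same domain interval."""
    r_frag, s_frag = pair
    by_value = defaultdict(list)
    for r_key, value in r_frag:
        by_value[value].append(r_key)
    return [
        (r_key, s_key)
        for s_key, value in s_frag
        for r_key in by_value.get(value, [])
    ]


def parallel_join(r_index, s_index, boundaries, workers=4):
    """Join fragment i of R with fragment i of S; pairs are independent."""
    pairs = zip(fragment(r_index, boundaries), fragment(s_index, boundaries))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(join_fragments, pairs)
        return [match for part in parts for match in part]


# Hypothetical column indices on the join attribute: (surrogate key, value).
r_index = [(1, 3), (2, 15), (3, 27)]
s_index = [(10, 15), (11, 27), (12, 27), (13, 8)]

matches = parallel_join(r_index, s_index, boundaries=[10, 20])
```

The output is a list of surrogate-key pairs; the joined tuples themselves would be fetched from the original relations afterwards.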

Supercomputing. RuSCDays 2016, 2016
This paper is devoted to a new edition of the parallel Pursuit algorithm proposed by the authors in previous works. The Pursuit algorithm uses Fejer's mappings to build a pseudo-projection onto a polyhedron. The algorithm tracks changes in the input data and corrects the calculation process. The previous edition of the algorithm used a cube-shaped pursuit region with K cells in each dimension. The total number of cells is K^n, where n is the problem dimension. This resulted in high computational complexity. The new edition uses a cross-shaped pursuit region with one crossbar per dimension. Such a region consists of only n(K − 1) + 1 cells. The new algorithm is intended for cluster computing systems with Xeon Phi processors. Keywords: Non-stationary linear programming problem · Fejer's mappings · Pursuit algorithm · Massive parallelism · Cluster computing systems · MIC architecture · Intel Xeon Phi · Native mode · OpenMP
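
The two cell counts can be checked with a short calculation. The sketch below assumes that the n crossbars each contain K cells and all share a single cell (taken here as the cell with index K // 2 in every dimension); this assumption is not spelled out in the abstract, but it reproduces the stated count, since n·K cells minus the n − 1 extra copies of the shared cell gives n(K − 1) + 1. The function names are illustrative.

```python
from itertools import product


def cube_cells(K, n):
    """Cells in a cube-shaped region with K cells per dimension."""
    return K ** n


def cross_cells(K, n):
    """Cells in a cross-shaped region with one K-cell crossbar per dimension."""
    return n * (K - 1) + 1


def enumerate_cross(K, n):
    """List cross cells explicitly: at most one coordinate leaves the shared cell."""
    center = K // 2
    return [
        cell
        for cell in product(range(K), repeat=n)
        if sum(c != center for c in cell) <= 1
    ]


for K, n in [(5, 2), (5, 3), (7, 10)]:
    print(f"K={K}, n={n}: cube={cube_cells(K, n)}, cross={cross_cells(K, n)}")
```

For K = 7 and n = 10 the cube has 282,475,249 cells while the cross has 61, which shows why the cube-shaped region becomes impractical as the dimension grows.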

A new approach to data layout and load balancing was proposed for hierarchical multiprocessor relational database systems. A database multiprocessor model was described that enables simulation and examination of arbitrary hierarchical multiprocessor configurations in the context of online transaction processing applications. An important subclass of symmetrical multiprocessor hierarchies was considered, and a new data layout strategy based on partial mirroring was proposed for them. The disk space used to replicate the data was evaluated analytically. For symmetrical hierarchies with a certain regularity, theorems estimating the cost of replica formation were proved. An efficient load balancing method based on the partial mirroring technique was proposed. The methods described are oriented to clusters and Grid systems.

The paper addresses the decomposition of the join relational operator by means of distributed column indices. Such decomposition makes it possible to use modern many-core accelerators (GPU or Intel Xeon Phi) to speed up query execution for very large databases. Column indices are a new kind of index structure that exploits the "key-value" technique. The paper describes methods of column index fragmentation based on domain intervals. This technique allows parallel query processing to be organized without data exchanges. All column index fragments are stored in main memory in compressed form to conserve space. The approach can be implemented as a coprocessor for relational database systems. The database coprocessor can perform resource-intensive operations much faster than a conventional DBMS.

One important class of computational problems is problem-oriented workflow applications executed in a distributed computing environment. A problem-oriented workflow application can be represented by a directed graph whose vertices are tasks and whose arcs are data flows. For such an application, a priori estimates of the task execution times and of the amount of data to be transferred between tasks can be obtained. A distributed computing environment designed for executing such tasks in a certain subject domain is called a problem-oriented environment. To use the resources of the distributed computing environment efficiently, special scheduling algorithms are applied. A great number of such algorithms have been proposed. Some of them (such as the DSC algorithm) take into account specific features of problem-oriented workflow applications. Others (such as the Min–Min algorithm) take into account the many-core structure of the nodes of the computational network. However, none of them takes both factors into account. In this paper, a mathematical model of a problem-oriented computing environment is constructed, and a new problem-oriented scheduling (POS) algorithm is proposed. The POS algorithm takes into account both the specifics of problem-oriented jobs and the multi-core structure of the computing system nodes. Results of computational experiments comparing the POS algorithm with other known scheduling algorithms are presented.
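
The graph representation with a priori estimates can be made concrete with a small sketch. It does not implement POS, DSC, or Min–Min; it only shows how a workflow with estimated task times and transfer times can be stored and how a simple quantity, the critical path length, follows from those estimates. The task names, the numbers, and the function name are hypothetical, and the calculation assumes unlimited nodes with every arc paying its full transfer time.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_length(exec_time, transfer_time):
    """Earliest finish time of the whole workflow.

    exec_time:     {task: estimated execution time}
    transfer_time: {(producer, consumer): estimated data transfer time}
    """
    preds = {task: set() for task in exec_time}
    for src, dst in transfer_time:
        preds[dst].add(src)

    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(preds).static_order():
        start = max(
            (finish[p] + transfer_time[(p, task)] for p in preds[task]),
            default=0,
        )
        finish[task] = start + exec_time[task]
    return max(finish.values())


# Hypothetical diamond-shaped workflow: A feeds B and C, both feed D.
exec_time = {"A": 3, "B": 2, "C": 4, "D": 1}
transfer_time = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 1, ("C", "D"): 3}

length = critical_path_length(exec_time, transfer_time)
print(length)
```

Here the longest chain is A, C, D: 3 + 2 + 4 + 3 + 1 = 13. A scheduler that places C and D on the same node could avoid the last transfer, which is the kind of trade-off a workflow-aware algorithm has to weigh.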

The paper is dedicated to issues concerning simulation and analysis of hierarchical multiprocessor systems oriented to database applications. Requirements for a parallel database system model are given. A survey and comparative analysis of known parallel database system models are presented. A new multiprocessor database system model is introduced. This model allows us to simulate and evaluate arbitrary hierarchical multiprocessor configurations in the context of OLTP-class database applications. Examples of using the database multiprocessor model for simulation studies of multiprocessor database systems are presented.

The paper is devoted to the classification, design, and analysis of architectures of parallel database systems. A formalization of the notion “parallel database system” is suggested, which relies on the concept of a virtual machine. Based on this formalization, a new approach to the classification of architectures of parallel database systems is suggested. Requirements for parallel database systems are formulated, which serve as criteria for comparing various architectures. Various classes of architectures of parallel database systems are considered and compared.

The paper is devoted to the problem of effective query execution in cluster-based systems. An original approach to data placement and replication on the nodes of a cluster system is presented. Based on this approach, a load balancing method for parallel query processing is developed. A method for parallel query execution in cluster systems based on the load balancing method is suggested. Results of computational experiments are presented, and the efficiency of the proposed approaches is analyzed.

Database Systems for Advanced Applications, 9th International Conference, DASFAA 2004, 2004
This paper introduces a new approach to database disk buffering, called the LFU-K method. The LFU-K page replacement algorithm is an improvement on the Least Frequently Used (LFU) algorithm. The paper proposes a probability-theoretic model for the formal description of the LFU-K algorithm. Using this model, we derive estimates for the LFU-K parameters. The paper also describes an implementation of the LFU-2 policy. As we demonstrate by trace-driven simulation experiments, the LFU-2 algorithm provides significant improvement over conventional buffering algorithms for shared-nothing database systems.
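
For readers unfamiliar with the baseline, the classic LFU policy that LFU-K improves on can be sketched as below. This is plain LFU, not LFU-K or LFU-2, whose definitions are not given in the abstract. The class name, the `load_page` callback, and the tie-breaking rule (evict the least recently used page among the least frequently used ones) are assumptions made for illustration.

```python
class LFUBuffer:
    """Fixed-size page buffer that evicts the least frequently used page."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.pages = {}     # page_id -> page contents
        self.hits = {}      # page_id -> number of references so far
        self.last_use = {}  # page_id -> logical time of the last reference
        self.clock = 0

    def access(self, page_id, load_page):
        """Return the page, loading it (and evicting a victim) on a miss."""
        self.clock += 1
        if page_id not in self.pages:
            if len(self.pages) >= self.capacity:
                # Victim: fewest references; ties go to the oldest last use.
                victim = min(
                    self.pages,
                    key=lambda p: (self.hits[p], self.last_use[p]),
                )
                del self.pages[victim], self.hits[victim], self.last_use[victim]
            self.pages[page_id] = load_page(page_id)
            self.hits[page_id] = 0
        self.hits[page_id] += 1
        self.last_use[page_id] = self.clock
        return self.pages[page_id]


def read_from_disk(page_id):
    """Stand-in for a real disk read (hypothetical)."""
    return f"page-{page_id}"


buf = LFUBuffer(capacity=2)
for page_id in [1, 1, 2, 3]:
    buf.access(page_id, read_from_disk)

resident = set(buf.pages)
```

After the reference string 1, 1, 2, 3 the buffer holds pages 1 and 3: page 2 was referenced once and page 1 twice, so page 2 is the victim when page 3 arrives.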