2018, Numerical Linear Algebra with Applications
Large sparse linear systems arise in many areas of scientific computing, and the solution of these systems is the most time-consuming part in many large-scale problems. We present a hybrid parallel algorithm, named incomplete WZ parallel solver (IWZPS), for the solution of large sparse nonsingular diagonally dominant linear systems on distributed memory architectures. The method is a combination of both direct and iterative techniques. We compare the present hybrid parallel sparse algorithm IWZPS with the direct and iterative sparse solvers, namely, MUMPS and ILUPACK, respectively. In addition, we compare it with a hybrid parallel solver, DDPS.
2009 Computational Electromagnetics International Workshop, 2009
Abstract—Consider the system Ax = b, where A is a large sparse nonsymmetric matrix. It is assumed that A has no sparsity structure that may be exploited in the solution process, its spectrum may lie on both sides of the imaginary axis, and its symmetric part may be indefinite. For such systems direct methods may be both time consuming and storage demanding, while iterative methods may not converge. In this paper, a hybrid method, which attempts to avoid these drawbacks, is proposed. An LU factorization of A that depends on a strategy of dropping small non-zero elements during the Gaussian elimination process is used as a preconditioner for conjugate gradient-like schemes: ORTHOMIN, GMRES and CGS. Robustness is achieved by altering the drop tolerance and recomputing the preconditioner in the event that the factorization or the iterative method fails. If after a prescribed number of trials the iterative method is still not convergent, then a switch is made to a direct solver. Numerical examples, using matrices from the Harwell-Boeing test matrices, show that this hybrid scheme is often less time consuming and storage demanding than direct solvers, and more robust than iterative methods whose preconditioners depend on classical positional dropping strategies. I. THE HYBRID ALGORITHM. Consider the system of linear algebraic equations Ax = b, where A is a nonsingular, large, sparse and nonsymmetric matrix. We assume also that the matrix A is generally sparse (i.e. it has neither any special property, such as symmetry and/or positive definiteness, nor any special pattern, such as bandedness, that can be exploited in the solution of the system). Solving such linear systems may be a rather difficult task. This is so because commonly used direct methods (sparse Gaussian elimination) are too time consuming, and iterative methods whose success depends on the matrix having a definite symmetric part, or on the spectrum lying on one side of the imaginary axis, are not robust enough. Direct methods have the advantage that they normally produce a sufficiently accurate solution, although a direct estimate of the accuracy actually achieved requires additional work. On the other hand, when iterative methods converge sufficiently fast, they require computing time that is several orders of magnitude smaller than that of any direct method. This brief comparison of the main properties of direct and iterative methods for the problem at hand shows that the methods of both groups have some advantages and some disadvantages. Therefore it seems worthwhile to design methods that combine the advantages of both groups, while minimizing their disadvantages.
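A minimal serial sketch of the drop-tolerance strategy described in this abstract, written in Python with SciPy as a stand-in for the solvers it names: an incomplete LU factorization (spilu) with a given drop tolerance preconditions GMRES, the tolerance is tightened and the preconditioner recomputed when the factorization or the iteration fails, and a complete sparse LU is the final fallback. The test matrix, the drop-tolerance schedule and the iteration limit are illustrative assumptions, not values from the paper.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def hybrid_solve(A, b, drop_tols=(1e-1, 1e-2, 1e-4), maxiter=500):
    """ILU-preconditioned GMRES with progressively tighter drop tolerances;
    switch to a direct solver if every preconditioned attempt fails."""
    A = A.tocsc()
    for tau in drop_tols:
        try:
            ilu = spla.spilu(A, drop_tol=tau, fill_factor=10)
        except RuntimeError:          # factorization broke down: tighten the tolerance
            continue
        M = spla.LinearOperator(A.shape, ilu.solve)
        x, info = spla.gmres(A, b, M=M, maxiter=maxiter)
        if info == 0:                 # converged
            return x
    return spla.splu(A).solve(b)      # last resort: complete sparse LU

# small nonsymmetric, diagonally dominant test problem (illustrative)
n = 200
A = sp.diags([-1.0, 4.0, -2.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)
x = hybrid_solve(A, b)
print("residual:", np.linalg.norm(A @ x - b))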
This paper addresses the main issues raised during the parallelization of iterative and direct solvers for such systems on distributed memory multiprocessors. If no preconditioning is considered, iterative solvers are simple to parallelize, as the most time-consuming computational structures are matrix-vector products. Direct methods are much harder to parallelize, as new nonzero values may appear during computation and pivoting operations are usually required for numerical stability. Suitable data structures and distributions for sparse solvers are discussed within the framework of a data-parallel environment, and experimentally evaluated and compared with existing solutions.
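To illustrate why unpreconditioned iterative solvers are simple to parallelize, the sketch below (an illustrative toy, not the paper's code) partitions the rows of a CSR matrix into contiguous blocks, one per hypothetical processor, and forms the matrix-vector product block by block; in a genuine distributed-memory implementation each block would live on its own process and only the needed entries of x would be communicated.

import numpy as np
import scipy.sparse as sp

def row_partitioned_matvec(A, x, nproc):
    """Simulate a row-block distributed sparse matrix-vector product.
    Each 'processor' owns a contiguous slab of rows and computes its slice of y."""
    A = A.tocsr()
    n = A.shape[0]
    bounds = np.linspace(0, n, nproc + 1, dtype=int)
    y = np.empty(n)
    for p in range(nproc):                     # each iteration = one processor's work
        lo, hi = bounds[p], bounds[p + 1]
        y[lo:hi] = A[lo:hi, :] @ x             # local rows times (replicated) x
    return y

A = sp.random(1000, 1000, density=0.01, format="csr") + sp.identity(1000)
x = np.ones(1000)
assert np.allclose(row_partitioned_matvec(A, x, nproc=4), A @ x)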
Parallel Numerical Computation with Applications, 1999
Many problems in engineering and scientific domains require solving large sparse systems of linear equations as a computationally intensive step towards the final solution. It has long been a challenge to develop efficient parallel formulations of sparse direct solvers due to several different complex steps involved in the process. In this paper, we describe PSPASES, one of the first efficient, portable, and robust scalable parallel solvers for sparse symmetric positive definite linear systems that we have developed. We discuss the algorithmic and implementation issues involved in its development, and present performance and scalability results on the Cray T3E and SGI Origin 2000. PSPASES could solve the largest sparse system (1 million equations) ever solved by a direct method, with the highest performance (51 GFLOPS for Cholesky factorization) ever reported.
Lecture Notes in Computer Science, 2009
The availability of large-scale computing platforms comprised of tens of thousands of multicore processors motivates the need for the next generation of highly scalable sparse linear system solvers. These solvers must optimize parallel performance, processor (serial) performance, as well as memory requirements, while being robust across broad classes of applications and systems. In this paper, we present a new parallel solver that combines the desirable characteristics of direct methods (robustness) and effective iterative solvers (low computational cost), while alleviating their drawbacks (memory requirements, lack of robustness). Our proposed hybrid solver is based on the general sparse solver PARDISO, and the "Spike" family of hybrid solvers. The resulting algorithm, called PSPIKE, is as robust as direct solvers, more reliable than classical preconditioned Krylov subspace methods, and much more scalable than direct sparse solvers. We support our performance and parallel scalability claims using detailed experimental studies and comparison with direct solvers, as well as classical preconditioned Krylov methods.
1995
A parallel solvers package of three solvers with a unified user interface is developed for solving a range of sparse symmetric complex linear systems arising from discretization of partial differential equations based on unstructured meshes using finite element, finite difference and finite volume analysis. Once the data interface is set up, the package constructs the sparse symmetric complex matrix and solves the linear system by the method chosen by the user: either a preconditioned bi-conjugate gradient solver, a two-stage Cholesky LDL^T factorization solver, or a hybrid solver combining the two. A unique feature of the solvers package is that the user deals with local matrices on local meshes on each processor. Scaling problem size N with the number of processors P with N/P fixed, test runs on the Intel Delta with up to 128 processors show that the bi-conjugate gradient method scales linearly with N whereas the two-stage hybrid method scales with TN.
Parallel Computation, 1999
A number of techniques are described for solving sparse linear systems on parallel platforms. The general approach used is a domain-decomposition type method in which a processor is assigned a certain number of rows of the linear system to be solved. Strategies that are discussed include non-standard graph partitioners and a forced load-balance technique for the local iterations. A common practice when partitioning a graph is to seek to minimize the number of cut edges and to have an equal number of equations per processor. It is shown that partitioners that take into account the values of the matrix entries may be more effective.
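As a concrete illustration of the last point, the sketch below (purely illustrative, not from the paper) compares the plain edge cut of a given row-to-processor assignment with a cut weighted by the magnitudes of the matrix entries, the kind of value-aware metric such partitioners aim to reduce.

import numpy as np
import scipy.sparse as sp

def cut_metrics(A, part):
    """Plain and |value|-weighted edge cuts of a row-to-processor assignment.
    A is assumed structurally symmetric; part[i] is the processor owning row i."""
    C = sp.coo_matrix(A)
    cut_edges, cut_weight = 0, 0.0
    for i, j, v in zip(C.row, C.col, C.data):
        if i < j and part[i] != part[j]:      # count each off-diagonal pair once
            cut_edges += 1
            cut_weight += abs(v)
    return cut_edges, cut_weight

A = sp.random(100, 100, density=0.05, format="csr")
A = A + A.T                                   # symmetrize the pattern
part = np.arange(100) // 25                   # 4 contiguous row blocks
print(cut_metrics(A, part))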
Parallel Computing on Distributed Memory Multiprocessors, 1993
In this paper the direct solution of sparse linear systems on multiprocessor systems is considered. Elimination trees are used as a tool for identifying and exploiting parallelism in the parallel numerical factorization of the coefficient matrix. Some open problems are described and results of some numerical experiments are provided. This paper is based on a talk by the author at a Workshop on Methods and Algorithms for
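For reference, the elimination tree used above can be computed directly from the sparsity pattern of a symmetric matrix. The sketch below follows the classical ancestor/path-compression algorithm (as in Liu's work and CSparse's cs_etree); the small grid Laplacian test matrix is an illustrative assumption.

import scipy.sparse as sp

def elimination_tree(A):
    """Parent array of the elimination tree of a sparse symmetric matrix,
    using the classical ancestor/path-compression algorithm."""
    A = sp.csc_matrix(A)
    n = A.shape[0]
    parent = [-1] * n
    ancestor = [-1] * n
    for j in range(n):
        for i in A.indices[A.indptr[j]:A.indptr[j + 1]]:
            # walk from each nonzero row i < j up towards the root,
            # compressing the path so later walks stay short
            while i < j:
                nxt = ancestor[i]
                ancestor[i] = j
                if nxt == -1:          # i was a root: j becomes its parent
                    parent[i] = j
                    break
                i = nxt
    return parent

# 3x3 grid Laplacian as a small symmetric test pattern
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(3, 3))
A = sp.kron(sp.identity(3), T) + sp.kron(T, sp.identity(3))
print(elimination_tree(A))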
1994
Abstract: This paper presents research into parallel direct methods for block-diagonal-bordered sparse matrices: LU factorization and Cholesky factorization algorithms developed with special consideration for irregular sparse matrices from the electrical power systems community. Direct block-diagonal-bordered sparse linear solvers exhibit distinct advantages when compared to general direct parallel sparse algorithms for irregular matrices.
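A small dense sketch of the block-diagonal-bordered idea: the independent diagonal blocks are factored (the naturally parallel part), the border unknowns are obtained from a Schur complement system, and the block solutions are then recovered independently. Block sizes and the random test data are illustrative assumptions, and dense NumPy solves stand in for the sparse factorization kernels the paper develops.

import numpy as np

rng = np.random.default_rng(0)

def block_bordered_solve(A_blocks, B_blocks, C_blocks, D, f_blocks, g):
    """Solve a block-diagonal-bordered system via the Schur complement of the
    border block: A_k x_k + B_k y = f_k for each block k, sum_k C_k x_k + D y = g."""
    S = D.copy()
    h = g.copy()
    AinvB, Ainvf = [], []
    for A, B, C, f in zip(A_blocks, B_blocks, C_blocks, f_blocks):
        AiB = np.linalg.solve(A, B)        # these block solves are independent
        Aif = np.linalg.solve(A, f)
        S -= C @ AiB
        h -= C @ Aif
        AinvB.append(AiB)
        Ainvf.append(Aif)
    y = np.linalg.solve(S, h)              # small coupled border system
    xs = [Aif - AiB @ y for AiB, Aif in zip(AinvB, Ainvf)]   # independent back-substitution
    return xs, y

# two 4x4 diagonal blocks with a border of width 2, random well-conditioned data
A_blocks = [np.eye(4) * 5 + rng.standard_normal((4, 4)) for _ in range(2)]
B_blocks = [rng.standard_normal((4, 2)) for _ in range(2)]
C_blocks = [rng.standard_normal((2, 4)) for _ in range(2)]
D = np.eye(2) * 5 + rng.standard_normal((2, 2))
f_blocks = [rng.standard_normal(4) for _ in range(2)]
g = rng.standard_normal(2)
xs, y = block_bordered_solve(A_blocks, B_blocks, C_blocks, D, f_blocks, g)

# check against a monolithic solve
full = np.block([[A_blocks[0], np.zeros((4, 4)), B_blocks[0]],
                 [np.zeros((4, 4)), A_blocks[1], B_blocks[1]],
                 [C_blocks[0], C_blocks[1], D]])
rhs = np.concatenate(f_blocks + [g])
print(np.allclose(np.concatenate(xs + [y]), np.linalg.solve(full, rhs)))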
Parallel Computing, 2011
We investigate the efficient iterative solution of large-scale sparse linear systems on shared-memory multiprocessors. Our parallel approach is based on a multilevel ILU preconditioner which preserves the mathematical semantics of the sequential method in ILUPACK. We exploit the parallelism exposed by the task tree corresponding to the nested dissection hierarchy (task parallelism), employ dynamic scheduling of tasks to processors to
Lecture Notes in Computer Science, 2002
This paper presents an overview of pARMS, a package for solving sparse linear systems on parallel platforms. Preconditioners constitute the most important ingredient in the solution of linear systems arising from realistic scientific and engineering applications. The most common parallel preconditioners used for sparse linear systems adapt domain decomposition concepts to the more general framework of "distributed sparse linear systems". The parallel Algebraic Recursive Multilevel Solver (pARMS) is a recently developed package which integrates variants of both Schwarz procedures and Schur complement-type techniques. This paper discusses a few of the main ideas and design issues of the package. A few details on the implementation of pARMS are provided.
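The sketch below is a deliberately simplified, sequential stand-in for the distributed Schwarz ideas mentioned above: a non-overlapping block-Jacobi preconditioner (each diagonal block playing the role of a subdomain) applied to GMRES via SciPy. It is not pARMS, and the test matrix and block count are illustrative assumptions.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def block_jacobi_preconditioner(A, nblocks):
    """Non-overlapping block-Jacobi preconditioner: factor each diagonal block
    (one per 'subdomain') once, then apply the local solves independently."""
    A = A.tocsc()
    n = A.shape[0]
    bounds = np.linspace(0, n, nblocks + 1, dtype=int)
    local_lu = [spla.splu(A[lo:hi, lo:hi].tocsc())
                for lo, hi in zip(bounds[:-1], bounds[1:])]

    def apply(r):
        z = np.empty_like(r)
        for (lo, hi), lu in zip(zip(bounds[:-1], bounds[1:]), local_lu):
            z[lo:hi] = lu.solve(r[lo:hi])      # independent local solves
        return z

    return spla.LinearOperator(A.shape, apply)

# nonsymmetric test matrix (illustrative)
n = 400
A = sp.diags([-1.0, 4.0, -2.1], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)
M = block_jacobi_preconditioner(A, nblocks=4)
x, info = spla.gmres(A, b, M=M, maxiter=200)
print("converged" if info == 0 else "not converged")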
ACM Transactions on Mathematical Software, 2001
This paper provides a comprehensive study and comparison of two state-of-the-art direct solvers for large sparse sets of linear equations on large-scale distributed-memory computers. One is a multifrontal solver called MUMPS, the other is a supernodal solver called SuperLU. We describe the main algorithmic features of the two solvers and compare their performance characteristics with respect to uniprocessor speed, interprocessor communication, and memory requirements. For both solvers, preorderings for numerical stability and sparsity play an important role in achieving high parallel efficiency. We analyse the results with various ordering algorithms. Our performance analysis is based on data obtained from runs on a 512-processor Cray T3E using a set of matrices from real applications. We also use regular 3D grid problems to study the scalability of the two solvers.
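SciPy ships a sequential build of SuperLU, so a single-process taste of the supernodal solver discussed above is available as scipy.sparse.linalg.splu; the distributed solvers compared in the paper (MUMPS and SuperLU_DIST) have their own interfaces. The toy Poisson matrix below is an illustrative assumption, not one of the application matrices from the study.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# 2D Poisson matrix as a toy stand-in for the application matrices in the study
m = 50
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
A = (sp.kron(sp.identity(m), T) + sp.kron(T, sp.identity(m))).tocsc()
b = np.ones(A.shape[0])

lu = spla.splu(A, permc_spec="COLAMD")   # SuperLU factorization with a fill-reducing ordering
x = lu.solve(b)
print("residual:", np.linalg.norm(A @ x - b))
print("fill-in (nnz(L)+nnz(U)) / nnz(A):", (lu.L.nnz + lu.U.nnz) / A.nnz)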
SIAM Journal on Computing, 1993
Technical reports, 1998
We investigate and compare stable parallel algorithms for solving diagonally dominant and general narrow-banded linear systems of equations. Narrow-banded means that the bandwidth is very small compared with the matrix order and is typically between 1 and 100. The solvers compared are the banded system solvers of ScaLAPACK [11] and those investigated by Arbenz and Hegland [3, 6]. For the diagonally dominant case, the algorithms are analogs of the well-known tridiagonal cyclic reduction algorithm, while the inspiration for the general case is the lesser-known bidiagonal cyclic reduction, which allows a clean parallel implementation of partial pivoting. These divide-and-conquer type algorithms complement fine-grained algorithms which perform well only for wide-banded matrices, with each family of algorithms having a range of problem sizes for which it is superior. We present theoretical analyses as well as numerical experiments conducted on the Intel Paragon.
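A compact serial sketch of the tridiagonal cyclic reduction idea the abstract refers to: the odd-indexed unknowns are eliminated, the halved system is solved recursively, and the odd unknowns are recovered by back-substitution. In the parallel solvers each reduction level is what gets distributed across processors, and the banded and bidiagonal variants are considerably more involved; the test system below is an illustrative assumption.

import numpy as np
import scipy.sparse as sp

def cyclic_reduction(a, b, c, d):
    """Tridiagonal solve by recursive cyclic reduction.
    a[i] couples x[i] to x[i-1] (a[0] unused), b is the diagonal,
    c[i] couples x[i] to x[i+1] (c[-1] unused), d is the right-hand side."""
    n = len(b)
    if n == 1:
        return np.array([d[0] / b[0]])
    m = (n + 1) // 2
    ra, rb, rc, rd = np.zeros(m), np.zeros(m), np.zeros(m), np.zeros(m)
    for j in range(m):                       # eliminate the odd-indexed unknowns
        i = 2 * j
        alpha = a[i] / b[i - 1] if i > 0 else 0.0
        beta = c[i] / b[i + 1] if i < n - 1 else 0.0
        ra[j] = -alpha * a[i - 1] if i > 1 else 0.0
        rc[j] = -beta * c[i + 1] if i < n - 2 else 0.0
        rb[j] = (b[i]
                 - (alpha * c[i - 1] if i > 0 else 0.0)
                 - (beta * a[i + 1] if i < n - 1 else 0.0))
        rd[j] = (d[i]
                 - (alpha * d[i - 1] if i > 0 else 0.0)
                 - (beta * d[i + 1] if i < n - 1 else 0.0))
    x = np.zeros(n)
    x[0::2] = cyclic_reduction(ra, rb, rc, rd)   # solve the halved (even-index) system
    for i in range(1, n, 2):                     # back-substitute the odd unknowns
        x[i] = (d[i] - a[i] * x[i - 1]
                - (c[i] * x[i + 1] if i < n - 1 else 0.0)) / b[i]
    return x

# diagonally dominant test system
n = 31
a, b, c = -np.ones(n), 4.0 * np.ones(n), -np.ones(n)
d = np.arange(1.0, n + 1.0)
x = cyclic_reduction(a, b, c, d)
A = sp.diags([a[1:], b, c[:-1]], [-1, 0, 1])
print(np.allclose(A @ x, d))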
Lecture Notes in Computer Science, 1999
Applicable Algebra in Engineering, Communication and Computing, 2007
An important recent development in the area of solution of general sparse systems of linear equations has been the introduction of new algorithms that allow complete decoupling of symbolic and numerical phases of sparse Gaussian elimination with partial pivoting. This enables efficient solution of a series of sparse systems with the same nonzero pattern but different coefficient values, which is a fairly common situation in practical applications. This paper reports on a shared- and distributed-memory parallel general sparse solver based on these new symbolic and unsymmetric-pattern multifrontal algorithms.
We present the first parallel algorithm for solving systems of linear equations in symmetric, diagonally dominant (SDD) matrices that runs in polylogarithmic time and nearly-linear work. The heart of our algorithm is a construction of a sparse approximate inverse chain for the input matrix: a sequence of sparse matrices whose product approximates its inverse. Whereas other fast algorithms for solving systems of equations in SDD matrices exploit low-stretch spanning trees, our algorithm only requires spectral graph sparsifiers.
Report, University of Minnesota and IBM …, 2001
… , University of Cambridge, Tech. Rep. UCAM- …, 2005