2011
Recent innovation in the field of Very Large Scale Integration has resulted in the fabrication of high-speed processors. Sequential computers, even when equipped with such high-speed processors, are unable to meet the challenges of various real-life and real-time computational problems in areas such as image processing, climate modeling, remote sensing, and medical science, which require processing massive volumes of data. Parallel processing is one of the most appropriate technologies to meet the challenges of such application areas. A variety of numeric and non-numeric problems must often be solved in the above-mentioned areas. Prefix computation, polynomial root finding, matrix-matrix multiplication, and conflict graph construction are some of the very important computations frequently used in solving such problems. In this thesis, we mainly focus on the design of parallel algorithms for such computations, mapping them efficiently onto suitable interconnection networks...
With parallel processing the situation is entirely different. A parallel computer is one that consists of a collection of processing units, or processors, that cooperate to solve a problem by working simultaneously on different parts of that problem. The number of processors used can range from a few tens to several millions. As a result, the time required to solve the problem is significantly reduced relative to a traditional uniprocessor computer. This approach is attractive for a number of reasons. First, for many computational problems, the natural solution is a parallel one. Second, the cost and size of computer components have declined so sharply in recent years that parallel computers with a large number of processors have become feasible. And, third, it is possible in parallel processing to select the parallel architecture that is best suited to the problem or class of problems under consideration. Indeed, architects of parallel computers have the freedom to decide how many processors are to be used, how powerful these should be, what interconnection network links them to one another, whether they share a common memory, to what extent their operations are to be carried out synchronously, and a host of other issues. This wide range of choices is reflected in the many theoretical models of parallel computation proposed as well as in the several parallel computers that were actually built.
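The time reduction claimed here is conventionally quantified by the textbook notions of speedup and efficiency; the definitions below are standard and are included only to make the comparison concrete:

```latex
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}
```

where $T_1$ is the running time of the best sequential algorithm, $T_p$ the running time on $p$ processors, and $E(p) = 1$ corresponds to perfectly linear scaling.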
Journal of Communications and Networks, 2014
The high intensity of research and modeling in the fields of mathematics, physics, biology, and chemistry requires new computing resources. Because of the large computational complexity of such tasks, computing time is long and costly. The most effective way to increase performance is to adopt parallel principles. The purpose of this paper is to present the issue of parallel computing, with emphasis on the analysis of parallel systems and on the impact of communication delays on their efficiency and overall execution time. The paper focuses on finite algorithms for solving systems of linear equations, namely matrix manipulation (the Gaussian elimination method, GEM). Algorithms are designed for architectures with shared memory (open multiprocessing, OpenMP), distributed memory (message passing interface, MPI), and their combination (MPI + OpenMP). The properties of the algorithms were determined analytically and verified experimentally. Conclusions are drawn for theory and practice.
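The shared-memory variant discussed here can be sketched in a few lines of C with OpenMP. This is a minimal illustration of row-oriented parallel Gaussian elimination, not the authors' code; the dense augmented matrix, the fixed size `N`, and the absence of pivoting are simplifying assumptions.

```c
/* Minimal OpenMP sketch of GEM: forward elimination parallelized by rows.
 * Illustrative assumptions: dense augmented matrix, no pivoting. */
#include <stdio.h>
#include <omp.h>

#define N 3

/* Rows below pivot k are updated independently, so that loop is shared
 * among threads; the loop over k itself stays sequential because step
 * k+1 needs the pivot row produced by step k. */
static void gauss_eliminate(double a[N][N + 1])
{
    for (int k = 0; k < N - 1; k++) {
        #pragma omp parallel for
        for (int i = k + 1; i < N; i++) {
            double factor = a[i][k] / a[k][k];
            for (int j = k; j <= N; j++)
                a[i][j] -= factor * a[k][j];
        }
    }
}

/* Back substitution stays sequential: each x[i] depends on x[j], j > i. */
static void back_substitute(double a[N][N + 1], double x[N])
{
    for (int i = N - 1; i >= 0; i--) {
        double s = a[i][N];
        for (int j = i + 1; j < N; j++)
            s -= a[i][j] * x[j];
        x[i] = s / a[i][i];
    }
}

int main(void)
{
    /* 2x + y - z = 8, -3x - y + 2z = -11, -2x + y + 2z = -3 */
    double a[N][N + 1] = {{ 2,  1, -1,   8},
                          {-3, -1,  2, -11},
                          {-2,  1,  2,  -3}};
    double x[N];
    gauss_eliminate(a);
    back_substitute(a, x);
    printf("%g %g %g\n", x[0], x[1], x[2]); /* expected: 2 3 -1 */
    return 0;
}
```

The same row decomposition maps onto MPI by distributing row blocks across ranks and broadcasting each pivot row, which is where the communication delays analyzed in the paper enter.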
Microprocessors and Microsystems, 1987
This paper gives examples of parallel algorithms, explains the different parallel interconnection strategies, and highlights some microprocessor-based parallel computers. Parallel processing is an area of growing interest to the computer science and engineering communities. This paper is an introduction to some of the concepts involved in the design and use of large-scale parallel systems. Parallel machines classified as SIMD (synchronous) and MIMD (asynchronous) systems, composed of a large number of microprocessors, are explored. Parallel algorithms are examined, using image smoothing, recursive doubling, and contour tracing as examples. Single-stage and multistage networks are discussed. The single-stage Cube, PM2I, Four Nearest Neighbor, and Shuffle-Exchange networks are presented, and the multistage Cube network is described. Case studies of three microprocessor-based systems are given as examples of parallel machine designs, specifically the MPP SIMD machine, the Ultracomputer MIMD system, and the PASM SIMD/MIMD machine.
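As an illustration of one of the techniques named above, here is a serial simulation of recursive doubling for a global sum; the lockstep array update mimics what each SIMD processing element would do in hardware, and all names are illustrative.

```c
/* Recursive doubling, simulated serially: in step s every "processing
 * element" i adds in the value held by its partner at distance 2^s, so
 * PE 0 holds the global sum after ceil(log2 n) lockstep steps. */
#include <stdio.h>
#include <string.h>

static void recursive_doubling_sum(int x[], int n)
{
    int next[n];
    for (int stride = 1; stride < n; stride <<= 1) {
        memcpy(next, x, n * sizeof(int));
        for (int i = 0; i + stride < n; i++)  /* all PEs act in lockstep */
            next[i] = x[i] + x[i + stride];
        memcpy(x, next, n * sizeof(int));
    }
}

int main(void)
{
    int v[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    recursive_doubling_sum(v, 8);
    printf("%d\n", v[0]); /* expected: 36 */
    return 0;
}
```

On a real SIMD machine the partner exchange at distance 2^s is exactly what networks such as the Cube or PM2I provide in one routing step.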
The Journal of Supercomputing, 2020
The prefix computation strategy is a fundamental technique used to solve many problems in computer science such as sorting, clustering, and computer vision. A large number of parallel algorithms have been introduced that are based on a variety of high-performance systems. However, these algorithms do not consider the cost of the prefix computation operation. In this paper, we design a novel strategy for prefix computation to reduce the running time for high-cost operations such as multiplication. The proposed algorithm is based on (1) reducing the size of the partition and (2) keeping a fixed-size partition during all the steps of the computation. Experiments on a multicore system for different array sizes and number sizes demonstrate that the proposed parallel algorithm reduces the running time of the best-known optimal parallel algorithm by 62.7-79.6% on average. Moreover, the proposed algorithm has high speedup and is more scalable than those in previous works.
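For reference, here is a minimal OpenMP sketch of the blocked prefix-computation pattern such algorithms build on: each thread scans its own partition, the partition totals are scanned serially, and the resulting offsets are added back. The even partitioning and all names are illustrative assumptions; the paper's fixed-size-partition strategy is not reproduced here.

```c
/* Blocked parallel prefix sum: local scan per partition, serial scan of
 * partition totals, then offset add-back. Illustrative sketch only. */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

static void parallel_prefix_sum(long *a, int n)
{
    int p = omp_get_max_threads();
    long *block_sum = calloc((size_t)p + 1, sizeof(long));

    #pragma omp parallel num_threads(p)
    {
        int t = omp_get_thread_num();
        int lo = (int)((long long)n * t / p);
        int hi = (int)((long long)n * (t + 1) / p);

        for (int i = lo + 1; i < hi; i++)   /* local inclusive scan */
            a[i] += a[i - 1];
        block_sum[t + 1] = (hi > lo) ? a[hi - 1] : 0;

        #pragma omp barrier
        #pragma omp single
        for (int i = 1; i <= p; i++)        /* serial scan of block totals */
            block_sum[i] += block_sum[i - 1];
        /* implicit barrier after 'single' */

        for (int i = lo; i < hi; i++)       /* add offset of earlier blocks */
            a[i] += block_sum[t];
    }
    free(block_sum);
}

int main(void)
{
    long a[10];
    for (int i = 0; i < 10; i++) a[i] = i + 1;
    parallel_prefix_sum(a, 10);
    printf("%ld\n", a[9]); /* expected: 55 */
    return 0;
}
```

When the combining operation is expensive (e.g., big-number multiplication rather than `+`), the cost of each such pass is dominated by the operation count per partition, which is exactly the quantity the paper's partitioning strategy targets.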
2018
Traditional parallel data-processing algorithms split the information to be processed into parts (according to some principle: the number of cores, processing speed, etc.) and then process each part separately on a different core (or processor). Partitioning into parts takes considerable time and is not always an optimal solution: it cannot always minimize the idle time of the cores, and it is not always possible to find an optimal choice (or to change the algorithm during execution). The algorithms presented here follow the principle that the cores should process in parallel while keeping their idle time to a minimum. The article reviews two algorithms built on this principle: "smart delay" and a development of the transposed ribbon-like (banded) matrix multiplication algorithm, as sketched below.
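A sketch consistent with the two ideas in the abstract: B is transposed once so both operands are then scanned row-wise ("ribbon"-wise), and rows of C are handed out dynamically so no core stands idle while others finish. Flat row-major arrays and all names are illustrative assumptions, not the article's code.

```c
/* Ribbon-style matrix multiply: one-time transpose of B plus dynamic
 * row scheduling to minimize core idle time. Illustrative sketch. */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

static void matmul_ribbon(int n, const double *A, const double *B, double *C)
{
    double *Bt = malloc((size_t)n * n * sizeof(double));
    for (int i = 0; i < n; i++)             /* one-time transpose of B */
        for (int j = 0; j < n; j++)
            Bt[j * n + i] = B[i * n + j];

    /* dynamic scheduling: a free core grabs the next row immediately */
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double s = 0.0;
            for (int k = 0; k < n; k++)
                s += A[i * n + k] * Bt[j * n + k];
            C[i * n + j] = s;
        }
    free(Bt);
}

int main(void)
{
    double A[4] = {1, 2, 3, 4}, B[4] = {5, 6, 7, 8}, C[4];
    matmul_ribbon(2, A, B, C);
    printf("%g %g %g %g\n", C[0], C[1], C[2], C[3]); /* 19 22 43 50 */
    return 0;
}
```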
Dartmouth College, Hanover, NH, 1992
In this thesis we examine three problems in graph theory and propose efficient parallel algorithms for solving them. We also introduce a number of parallel algorithmic techniques.
Algorithmica, 1990
IEEE Transactions on Power Systems, 1988
Department of Electrical Engineering, College Station, Texas 77843. Abstract: A parallel processing scheme for the solution of sparse linear network equations is presented. The scheme assumes an already factorized coefficient matrix and decomposes the forward/backward substitution operations into independent sequences. In doing this, the sparse vector methods are employed and the full right-hand-side vector is considered as a sum of several sparse vectors. The developed scheme is simulated for various test systems and the calculated gains in computation times are given. Introduction: Most power system problems such as the power flow, state estimation, transient stability, etc. require repetitive solution of linear simultaneous equations. The heavy computational load associated with such repetitive solutions has been greatly reduced by using sparse matrix techniques [1,2]. These techniques were developed to minimize the number of floating point operations and therefore the total computation time on a serial computer. Recent developments in computer technology suggest that further reductions in computation time may be achieved via parallel processing. Several types of multiprocessors including vector processors, common-bus multiprocessors sharing a common memory, data flow machines, systolic arrays, etc. have been considered by researchers. Among these different architectures, the low cost, attachable array processors have been investigated for power system applications [3,4]. Meanwhile, various parallel algorithms were developed for the solution of sparse linear systems and the results of simulations were presented [5-10]. It was shown in [10] that the communication delays between processors may in fact be significant, particularly in utilizing sparse matrix techniques where the results produced by one processor are used by several other processors in the next step of computations. The method to be presented in this paper aims to minimize such delays due to interprocessor data exchanges and is based on the triangulation graph [6] concepts which have recently been used in extending the sparse inverse method to a parallel process [9]. In [9], the concurrent arithmetic operations are identified based on their precedence relations and they are assigned to different processors assuming that no time is required for processor communication…
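The decomposition of the substitution phase into "independent sequences" can be illustrated with the standard level-scheduling technique for a sparse triangular solve, in which rows whose dependencies all lie in earlier levels are processed concurrently. This is a generic, function-level stand-in, not the paper's sparse-vector scheme; the CSR layout and level arrays are illustrative assumptions.

```c
/* Level-scheduled forward substitution Lx = b for unit-diagonal L, with
 * the strictly lower part stored in CSR (row_ptr/col_idx/val); b is
 * overwritten with the solution. level_rows[level_ptr[l]..level_ptr[l+1]-1]
 * lists the rows of level l. Generic illustration only. */
#include <omp.h>

void forward_substitute_levels(const int *row_ptr, const int *col_idx,
                               const double *val, double *b,
                               int nlevels, const int *level_ptr,
                               const int *level_rows)
{
    for (int l = 0; l < nlevels; l++) {
        /* rows within one level have no dependencies on each other */
        #pragma omp parallel for
        for (int k = level_ptr[l]; k < level_ptr[l + 1]; k++) {
            int i = level_rows[k];
            double s = b[i];
            for (int t = row_ptr[i]; t < row_ptr[i + 1]; t++)
                s -= val[t] * b[col_idx[t]];
            b[i] = s;
        }
    }
}
```

The interprocessor delays the paper targets arise precisely at the boundaries between such levels, where results computed in one step are consumed by several processors in the next.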
IOSR Journal of Mathematics, 2016
Current microprocessors concentrate on multiprocessor or multi-core system architectures, and parallel algorithms increasingly target multi-core systems to make full use of the multiple processors available. The design of parallel algorithms and the measurement of their performance are major issues in today's multi-core environment. Numerical problems that require fast solution arise in almost every branch of science. In this paper we present parallel algorithms for computing the solution of systems of non-linear equations and for approximating the simple zeros of polynomial equations. The experimental results reveal that the parallel algorithms perform better than their sequential counterparts. We implemented the parallel algorithms using the multithreading features of OpenMP.
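One way the root-approximation step can be multithreaded with OpenMP is a Jacobi-style Durand-Kerner (Weierstrass) iteration, in which every approximate zero is updated independently within a sweep. This is a hedged stand-in for illustration; the paper's exact methods are not reproduced, and all names here are assumptions.

```c
/* Jacobi-style Durand-Kerner iteration for the simple zeros of a
 * polynomial; each root update within a sweep is independent, so the
 * sweep parallelizes directly with OpenMP. Illustrative sketch. */
#include <complex.h>
#include <stdio.h>
#include <string.h>
#include <omp.h>

/* coeffs[0..deg] in order of decreasing degree; roots[] holds the current
 * (initially distinct) guesses for the deg simple zeros. */
static void durand_kerner(int deg, const double complex *coeffs,
                          double complex *roots, int sweeps)
{
    double complex prev[deg];
    for (int s = 0; s < sweeps; s++) {
        memcpy(prev, roots, deg * sizeof(double complex));
        #pragma omp parallel for
        for (int i = 0; i < deg; i++) {
            double complex p = coeffs[0];        /* Horner evaluation */
            for (int k = 1; k <= deg; k++)
                p = p * prev[i] + coeffs[k];
            double complex denom = coeffs[0];    /* leading coefficient */
            for (int j = 0; j < deg; j++)
                if (j != i) denom *= prev[i] - prev[j];
            roots[i] = prev[i] - p / denom;
        }
    }
}

int main(void)
{
    double complex coeffs[3] = {1, -3, 2};       /* x^2 - 3x + 2 */
    double complex roots[2] = {0.4 + 0.9 * I, -0.6 + 0.5 * I};
    durand_kerner(2, coeffs, roots, 50);
    printf("%.3f%+.3fi  %.3f%+.3fi\n",
           creal(roots[0]), cimag(roots[0]),
           creal(roots[1]), cimag(roots[1]));    /* expected: near 1 and 2 */
    return 0;
}
```

Reading the previous sweep's values from `prev` avoids data races between threads at the cost of slightly slower convergence than the sequential Gauss-Seidel-style update.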
Computers & Mathematics with Applications, 1989
Lecture Notes in Computer Science, 2012
Theoretical Computer Science, 1997
IEEE Transactions on Power Systems, 1992
ADVANCES IN …, 2007
Siam Journal on Computing, 1989
The Computer Journal, 2001