1996, International Symposium on Parallel Architectures, Algorithms and Networks
The MULTIPLUS project aims at the development of a modular parallel architecture suitable for the study of several aspects of parallelism in both true shared memory and virtual shared memory environments. The MULTIPLUS architecture is able to support up to 1024 Processing Elements based on SPARC microprocessors. The MULPLIX Unix-like operating system offers a suitable parallel programming environment for the MULTIPLUS architecture.
Anais do VII Simpósio de Arquitetura de Computadores e Processamento de Alto Desempenho (SBAC-PAD 1995)
The MULTIPLUS project aims at the development of a modular distributed shared memory parallel architecture able to support up to 1024 processing elements based on SPARC microprocessors and at the implementation of MULPLIX, a Unix-like operating system which provides a suitable parallel programming environment for the MULTIPLUS architecture. After reviewing the main features of the definition of the MULTIPLUS architecture and the MULPLIX operating system, this paper describes in detail the current implementation of the main modules of the MULTIPLUS architecture and presents, with an illustration example, the parallel programming primitives already implemented within MULPLIX.
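The MULPLIX primitives themselves are not reproduced in this preview. As a hedged illustration of the kind of thread-spawn and join primitives a Unix-like shared-memory operating system of this sort typically exposes, the sketch below parallelizes a reduction across workers using standard POSIX threads; every call here is plain pthreads, not MULPLIX's actual API.

```c
/* Hedged sketch: generic shared-memory spawn/join primitives (POSIX threads),
 * illustrating the style of parallel programming a Unix-like OS such as
 * MULPLIX provides. These are standard pthreads calls, not MULPLIX's API. */
#include <pthread.h>
#include <stdio.h>

#define NWORKERS 4
#define N 1000

static double data[N];
static double total = 0.0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *partial_sum(void *arg) {
    long id = (long)arg;
    long chunk = N / NWORKERS;
    double local = 0.0;
    for (long i = id * chunk; i < (id + 1) * chunk; i++)
        local += data[i];
    pthread_mutex_lock(&lock);     /* serialize updates to the shared total */
    total += local;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t tid[NWORKERS];
    for (long i = 0; i < N; i++) data[i] = 1.0;
    for (long w = 0; w < NWORKERS; w++)
        pthread_create(&tid[w], NULL, partial_sum, (void *)w);
    for (long w = 0; w < NWORKERS; w++)
        pthread_join(tid[w], NULL);   /* join acts as the final barrier */
    printf("sum = %f\n", total);      /* expect 1000.000000 */
    return 0;
}
```

Compile with `-lpthread`; the mutex-guarded accumulation stands in for whatever lock or barrier primitive the real system offers.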
This paper provides an overview of advanced computer architecture, covering its presence in hardware and software and the different types of organization. The paper also covers the main components of shared memory multiprocessors. Our aim is to raise awareness of this growing field of multiprocessing, and the paper offers a comprehensive set of references for each concept in shared memory multiprocessors.
International Journal of Technology Enhancements and Emerging Engineering Research, 2013
Synchronization is a critical operation in many parallel applications. Conventional synchronization mechanisms are failing to keep up with the increasing demand for efficient synchronization operations as systems grow larger and network latency increases. The contribution of this paper is threefold. First, we revisit some representative synchronization algorithms in light of recent architectural innovations and provide an example of how the simplifying assumptions made by typical analytical models of synchronization mechanisms can lead to significant performance estimation errors. Second, we present an architectural innovation called active memory that enables very fast atomic operations in a shared-memory multiprocessor. Third, we use execution-driven simulation to quantitatively compare the performance of a variety of synchronization mechanisms based on both existing hardware techniques and active memory operations. To the best of our knowledge, synchronization based on active memory outperforms all existing spinlock a...
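For context on the spinlocks this abstract compares against, here is a hedged sketch of the conventional baseline: a test-and-test-and-set spinlock built from ordinary processor atomics (C11 `<stdatomic.h>`). The paper's point is that such spin-based schemes generate coherence traffic that active-memory atomics would avoid; this shows only the conventional technique, not active memory itself.

```c
/* Hedged sketch: a test-and-test-and-set spinlock from standard C11 atomics. */
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_bool locked; } spinlock_t;

static spinlock_t big_lock = { false };   /* initialize unlocked */

static void spin_lock(spinlock_t *l) {
    for (;;) {
        /* spin on a plain load first, so waiters mostly hit their cache... */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
        /* ...and only then attempt the expensive atomic exchange */
        if (!atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
            return;
    }
}

static void spin_unlock(spinlock_t *l) {
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

The relaxed read-before-exchange is exactly the kind of detail the abstract says simplified analytical models tend to miss when estimating contention behavior.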
Handbook on Parallel and Distributed Processing, 2000
2004
Shared memory systems form a major category of multiprocessors. In this category, all processors share a global memory. Communication between tasks running on different processors is performed through writing to and reading from the global memory. All interprocessor coordination and synchronization is also accomplished via the global memory. A shared memory computer system consists of a set of independent processors, a set of memory modules, and an interconnection network as shown in Figure 4.1. Two main problems need to be addressed when designing a shared memory system: performance degradation due to contention, and coherence problems. Performance degradation might happen when multiple processors are trying to access the shared memory simultaneously. A typical design might use caches to solve the contention problem. However, having multiple copies of data, spread throughout the caches, might lead to a coherence problem. The copies in the caches are coherent if they are all equal to the same value. However, if one of the processors writes over the value of one of the copies, then that copy becomes inconsistent because it no longer equals the value of the other copies. In this chapter we study a variety of shared memory systems and their solutions to the cache coherence problem.
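The communication model the abstract describes, where both data and coordination flow through global memory, can be made concrete with a small sketch. This is a minimal producer/consumer pair in C using pthreads and C11 atomics (an assumption of this illustration, not code from the chapter); the release/acquire pair is what guarantees the producer's write is visible to the consumer once the caches are coherent.

```c
/* Hedged sketch: two tasks communicating purely through shared memory. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int payload;              /* data passed through global memory */
static atomic_int ready = 0;     /* coordination also via global memory */

static void *producer(void *arg) {
    payload = 42;                                            /* write data...  */
    atomic_store_explicit(&ready, 1, memory_order_release);  /* ...then signal */
    return NULL;
}

static void *consumer(void *arg) {
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;                                          /* spin until signaled */
    printf("received %d\n", payload);              /* prints 42 */
    return NULL;
}

int main(void) {
    pthread_t p, c;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```

If the caches holding `ready` and `payload` were not kept coherent, the consumer could observe the flag without the data, which is precisely the coherence problem the chapter addresses.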
Lecture Notes in Computer Science, 1993
Microprocessors and Microsystems, 1989
Howard Oakley examines the features of an operating system designed to enable applications to be configured to hardware resources at run time. Mercury is a single-user operating system for transputers which facilitates the running of program tasks on one or more transputers in a processor array. It supports the existing OCCAM model for programming transputers, with the enhancements of being able to configure programs onto available hardware at load time, and full message passing facilities which are optimized for speed. The paper explains the design philosophy, and some of the more unusual features of Mercury are illustrated with practical examples.
Keywords: microprocessors, operating systems, Mercury, transputers
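Mercury's actual message-passing API is not shown in this preview. As a hedged illustration of the OCCAM-style channel model it supports, here is a one-slot blocking channel in C built from a mutex and condition variable; it buffers a single message rather than enforcing a strict OCCAM rendezvous, and all names here are this sketch's own, not Mercury's.

```c
/* Hedged sketch: an OCCAM-style one-slot channel in C. */
#include <pthread.h>

typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int value;
    int full;            /* 1 while a message is waiting to be received */
} channel_t;

static channel_t ch = { PTHREAD_MUTEX_INITIALIZER,
                        PTHREAD_COND_INITIALIZER, 0, 0 };

static void chan_send(channel_t *c, int v) {
    pthread_mutex_lock(&c->mu);
    while (c->full)                     /* wait for the slot to drain */
        pthread_cond_wait(&c->cv, &c->mu);
    c->value = v;
    c->full = 1;
    pthread_cond_broadcast(&c->cv);     /* wake the receiver */
    pthread_mutex_unlock(&c->mu);
}

static int chan_recv(channel_t *c) {
    pthread_mutex_lock(&c->mu);
    while (!c->full)                    /* wait for a message */
        pthread_cond_wait(&c->cv, &c->mu);
    int v = c->value;
    c->full = 0;
    pthread_cond_broadcast(&c->cv);     /* wake a blocked sender */
    pthread_mutex_unlock(&c->mu);
    return v;
}
```

On a transputer, the equivalent channel operations map onto hardware links, which is why Mercury can optimize them for speed.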
2000
Large-scale parallel computations are more common than ever, due to the increasing availability of multi-processor systems. However, writing parallel software is often a complicated and error-prone task. To relieve Diffpack users of the tedious and low-level technical details of parallel programming, we have designed a set of new software modules, tools, and programming rules, which will be the topic of
2001
Systems based on the Pentium® III and Pentium® 4 processors enable the exploitation of parallelism at a fine- and medium-grained level. Dual- and quad-processor systems, for example, enable the exploitation of medium-grained parallelism by using multithreaded code that takes advantage of multiple control and arithmetic logic units. Streaming Single-Instruction-Multiple-Data (SIMD) extensions, on the other hand, enable the exploitation of fine-grained parallelism.
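As a hedged illustration of the fine-grained SIMD parallelism the abstract refers to, the sketch below uses the SSE intrinsics from `<xmmintrin.h>` (the Streaming SIMD Extensions interface introduced with the Pentium III); a single `_mm_add_ps` adds four packed floats in one instruction. The example is this editor's, not taken from the paper.

```c
/* Hedged sketch: fine-grained data parallelism with SSE intrinsics. */
#include <xmmintrin.h>
#include <stdio.h>

int main(void) {
    float a[4] = { 1.0f,  2.0f,  3.0f,  4.0f};
    float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float r[4];

    __m128 va = _mm_loadu_ps(a);     /* load four floats into one register */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vr = _mm_add_ps(va, vb);  /* four additions in one instruction  */
    _mm_storeu_ps(r, vr);

    printf("%f %f %f %f\n", r[0], r[1], r[2], r[3]);  /* 11 22 33 44 */
    return 0;
}
```

Compile with `-msse` (or any later x86 target); the multithreaded medium-grained case would layer ordinary threads on top of loops like this one.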
Microprocessing and Microprogramming, 1991
IEEE Transactions on Computers, 2000
Advances in Computers, 2000
2007 IEEE International …, 2007
This paper proposes the study of a new computation model that attempts to address the underlying sources of performance degradation (e.g. latency, overhead, and starvation) and the difficulties of programmer productivity (e.g. explicit locality management and scheduling, performance tuning, fragmented memory, and synchronous global barriers) to dramatically enhance the broad effectiveness of parallel processing for high-end computing. In this paper, we present the progress of our research on a parallel programming and execution model, namely ParalleX. We describe the functional elements of ParalleX, one such model being explored as part of this project. We also report our progress on the development and study of a subset of ParalleX, the LITL-X, at the University of Delaware. We then present a novel architecture model, Gilgamesh II, as a ParalleX processing architecture. A design point study of Gilgamesh II and the architecture concept strategy are presented.
IEEE Transactions on Software Engineering, 2000
Wiley-Interscience eBooks, 2005
Software: Practice and Experience, 1996
ParC is an extension of the C programming language with block-oriented parallel constructs that allow the programmer to express fine-grain parallelism in a shared-memory model. It is suitable for the expression of parallel shared-memory algorithms, and also conducive to the parallelization of sequential C programs. In addition, performance-enhancing transformations can be applied within the language, without resorting to low-level programming. The language includes closed constructs to create parallelism, as well as instructions to cause the termination of parallel activities and to enforce synchronization. The parallel constructs are used to define the scope of shared variables, and also to delimit the sets of activities that are influenced by termination or synchronization instructions. The semantics of parallelism are discussed, especially relating to the discrepancy between the limited number of physical processors and the potentially much larger number of parallel activities in a program.
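ParC's own syntax is not shown in this abstract, so no attempt is made to reproduce it here. As a hedged analogue of its closed parallel constructs, where a block forks many activities over shared variables and implicitly joins when the construct closes, here is the same fork-join pattern in plain C with OpenMP:

```c
/* Hedged sketch: block-level fork-join over shared memory, in the spirit
 * of ParC's closed constructs (OpenMP analogue, not ParC syntax). */
#include <omp.h>
#include <stdio.h>

int main(void) {
    enum { N = 8 };
    int squares[N];

    /* fork: one activity per iteration, all sharing `squares` */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        squares[i] = i * i;
    /* join: the construct closes, so all activities have terminated */

    for (int i = 0; i < N; i++)
        printf("%d ", squares[i]);    /* 0 1 4 9 16 25 36 49 */
    printf("\n");
    return 0;
}
```

Compile with `-fopenmp`. As in ParC, the number of logical activities (one per iteration) can far exceed the number of physical processors; the runtime multiplexes them.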