


10th IPPS 1996: Honolulu, Hawaii, USA
- Proceedings of IPPS '96, The 10th International Parallel Processing Symposium, April 15-19, 1996, Honolulu, Hawaii, USA. IEEE Computer Society 1996, ISBN 0-8186-7255-2

Keynote Address
- Charles E. Leiserson:

Can Multithreaded Programming Save Massively Parallel Computing? 2-3
Session 1 - Compiler Optimization
- Lynn Choi, Pen-Chung Yew:

Eliminating Stale Data References through Array Data-Flow Analysis. 4-13 - Martin C. Rinard, Pedro C. Diniz:

Commutativity Analysis: A Technique for Automatically Parallelizing Pointer-Based Computations. 14-22 - Shaw-Yen Tseng, Chung-Ta King, Chuan Yi Tang:

Profiling Dependence Vectors for Loop Parallelization. 23-27 - David J. Kolson, Alexandru Nicolau, Nikil D. Dutt, Ken Kennedy:

A Method for Register Allocation to Loops in Multiple Register File Architectures. 28-33 - Jingling Xue:

Affine-by-Statement Transformations of Imperfectly Nested Loops. 34-38 - Rafael H. Saavedra-Barrera, Weihua Mao, Daeyeon Park, Jacqueline Chame, Sungdo Moon:

The Combined Effectiveness of Unimodular Transformations, Tiling, and Software Prefetching. 39-45
Session 2 - Scientific/Engineering Applications
- Ka-Cheong Leung, Ishfaq Ahmad, Hsiao-Ming Hsu:

Ocean Circulation on the Intel Paragon: Modeling and Implementation. 47-54 - Wei-keng Liao, Chao-Wei Ou, Sanjay Ranka:

Dynamic Alignment and Distribution of Irregularly Coupled Data Arrays for Scalable Parallelization of Particle-in-Cell Problems. 57-61 - Hiroaki Kobayashi, Hitoshi Yamauchi, Yuichiro Toh, Tadao Nakamura:

A Hierarchical Parallel Processing System for the Multipass-Rendering Method. 62-67 - Steve G. Steinberg, Jun Yang, Katherine A. Yelick:

Performance Modeling and Composition: A Case Study in Cell Simulation. 68-74
Session 3 - Distributed Memory Systems
- Hideki Murayama, Satoshi Yoshizawa, Takeshi Aimoto, Hidenori Inouchi, Shooichi Murase, Takehisa Hayashi, Hiroshi Iwamoto:

A Study of High-Performance Communication Mechanism for Multicomputer Systems. 76-83 - Timothy G. Mattson, David Scott, Stephen R. Wheat:

A TeraFLOP Supercomputer in 1996: The ASCI TFLOP System. 84-93 - Daniel J. Scales, Michael Burrows, Chandramohan A. Thekkath:

Experience with Parallel Computing on the AN2 Network. 94-103 - Thomas L. Sterling, Donald J. Becker, Chance Reschke, Daniel Savarese, Michael R. Berry:

Achieving a Balanced Low-Cost Architecture for Mass Storage Management through Multiple Fast Ethernet Channels on the Beowulf Parallel Workstation. 104-108 - Klaus E. Schauser, Chris J. Scheiman, J. Mitchell Ferguson, Paul Z. Kolano:

Exploiting the Capabilities of Communications Co-Processors. 109-115 - Andrew Sohn, Mitsuhisa Sato, Namhoon Yoo, Jean-Luc Gaudiot:

Effects of Multithreading on Data and Workload Distribution for Distributed-Memory Multiprocessors. 116-122
Session 4 - Shared Memory Systems
- Fong Pong, Michel Dubois:

Formal Verification of Delayed Consistency Protocols. 124-131 - Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall:

Dag-Consistent Distributed Shared Memory. 132-141 - Ricardo Bianchini, Thomas J. LeBlanc, Jack E. Veenstra:

Categorizing Network Traffic in Update-Based Protocols on Scalable Multiprocessors. 142-151 - Henk L. Muller, Paul W. A. Stallard, David H. D. Warren:

Implementing the Data Diffusion Machine Using Crossbar Routers. 152-158 - Sally A. McKee, William A. Wulf:

A Memory Controller for Improved Performance of Streamed Computations on Symmetric Multiprocessors. 159-165 - Stefanos Kaxiras:

Kiloprocessor Extensions to SCI. 166-172
Session 5 - Algorithms
- Miroslaw Kutylowski, Tomasz Wierzbicki:

Approximate Compaction and Padded-Sorting on Exclusive Write PRAMs. 174-181 - Maria Cristina Pinotti, Vincenzo A. Crupi, Sajal K. Das:

A Parallel Solution to the Extended Set Union Problem with Unlimited Backtracking. 182-186 - Bala Ravikumar, X. Xiong:

A Parallel Algorithm for Minimization of Finite Automata. 187-191 - Xiaotie Deng, Binhai Zhu:

A Randomized Algorithm for Voronoi Diagram of Line Segments on Coarse-Grained Multiprocessors. 192-198 - Shuvra S. Bhattacharyya, Sundararajan Sriram, Edward A. Lee:

Self-Timed Resynchronization: A Post-Optimization for Static Multiprocessor Schedules. 199-205 - Weifa Liang, Richard P. Brent:

Constructing the Spanners of Graphs in Parallel. 206-210
Session 6 - Programming Languages
- Laxmikant V. Kalé, Milind A. Bhandarkar, Narain Jagathesan, Sanjeev Krishnan, Josh Yelon:

Converse: An Interoperable Framework for Parallel Programming. 212-217 - Jose Nagib Cotrim Árabe, Adam Beguelin, Bruce Lowekamp, Erik Seligman, Mike Starkey, Peter Stephan:

Dome: Parallel Programming in a Distributed Computing Environment. 218-224 - Enrico Pontelli, Gopal Gupta:

Nested Parallel Call Optimization. 225-229 - Yair I. Friedman, Dror G. Feitelson, Iaakov Exman:

The Parallel Break Construct, or How to Kill an Activity Tree. 230-234 - Xingbin Zhang, Vijay Karamcheti, Tony Ng, Andrew A. Chien:

Optimizing COOP Languages: Study of a Protein Dynamics Program. 235-240 - Raju Pandey, James C. Browne:

Support for Extensibility and Reusability in a Concurrent Object-Oriented Programming Language. 241-247
Session 7 - Communication I
- Gheith A. Abandah, Edward S. Davidson:

Modeling the Communication Performance of the IBM SP2. 249-257 - Yucel Aydogan, Craig B. Stunkel, Cevdet Aykanat, Bülent Abali:

Adaptive Source Routing in Multistage Interconnection Networks. 258-267 - Sherry Moore, Lionel M. Ni:

The Effects of Network Contention on Processor Allocation Strategies. 268-273 - Robert W. Horst:

ServerNet Deadlock Avoidance and Fractahedral Topologies. 274-280 - Sajal K. Das, Sanjoy K. Sen:

Analysis of Memory Interference in Buffered Multiprocessor Systems in Presence of Hot Spots and Favorite Memories. 281-285 - Debashis Basak, Dhabaleswar K. Panda, Mohammad Banikazemi:

Benefits of Processor Clustering in Designing Large Parallel Systems: When and How? 286-290
Session 8 - Implementation of Primitive Operations
- David A. Bader, Joseph F. JáJá:

Practical Parallel Algorithms for Dynamic Data Redistribution, Median Finding, and Selection. 292-301 - Sun Chung, Anne Condon:

Parallel Implementation of Borůvka's Minimum Spanning Tree Algorithm. 302-308 - Ibraheem Al-Furaih, Srinivas Aluru, Sanjay Goil, Sanjay Ranka:

Practical Algorithms for Selection on Coarse-Grained Parallel Computers. 309-313 - George Karypis, Vipin Kumar:

Parallel Multilevel Graph Partitioning. 314-319 - Seungjo Bae, Sanjay Ranka:

PACK/UNPACK on Coarse-Grained Distributed Memory Parallel Machines. 320-324
Session 9 - Resource Allocation and Management
- Myung M. Bae, Bella Bose:

Resource Placement in Torus-Based Networks. 327-331 - Yiqun Ge, David Y. Y. Yun:

Simultaneous Compression of Makespan and Number of Processors Using CRP. 332-338 - Bodhisattwa Mukherjee, Karsten Schwan:

Implementation of Scalable Blocking Locks Using an Adaptive Thread Scheduler. 339-343 - Samuel H. Russ, Brian K. Flachs, Jonathan Robinson, Bjørn Heckel:

Hector: Automated Task Allocation for MPI. 344-348 - David K. Lowenthal, Gregory R. Andrews:

An Adaptive Approach to Data Placement. 349-353 - Dwip Banerjee, James C. Browne:

Complete Parallelization of Computations: Integration of Data Partitioning and Functional Parallelism for Dynamic Data Structures. 354-360
Keynote Address
- Charles L. Seitz:

MPPs versus Clusters. 362
Session 10 - Communication II
- Tsunehiko Kamachi, Kazuhiro Kusano, Kenji Suehiro, Yoshiki Seo:

Generating Realignment-Based Communication for HPF Programs. 364-371 - Cezary Dubnicki, Liviu Iftode, Edward W. Felten, Kai Li:

Software Support for Virtual Memory-Mapped Communication. 372-381 - Michèle Dion, Cyril Randriamaro, Yves Robert:

How to Optimize Residual Communications? 382-391 - Jan Jonsson, Jonas Vasell:

A Comparative Study of Methods for Time-Deterministic Message Delivery in a Multiprocessor Architecture. 392-398 - Bruce Lowekamp, Adam Beguelin:

ECO: Efficient Collective Operations for Communication on Heterogeneous Networks. 399-405 - Eric A. Brewer, Paul Gauthier, Armando Fox, Angela Schuett:

Software Techniques for Improving MPP Bulk-Transfer Performance. 406-412
Session 11 - Algorithms: Implementation
- David A. Bader, Joseph F. JáJá, David Harwood, Larry S. Davis:

Parallel Algorithms for Image Enhancement and Segmentation by Region Growing with an Experimental Study. 414-423 - Yu-Hua Lee, Shi-Jinn Horng:

The Chessboard Distance Transform and the Medial Axis Transform are Interchangeable. 424-428 - Armin Bäumker, Wolfgang Dittrich:

Parallel Algorithms for Image Processing: Practical Algorithms with Experiments. 429-433 - Bongki Moon, Anurag Acharya, Joel H. Saltz:

Study of Scalable Declustering Algorithms for Parallel Grid Files. 434-440 - Sanda M. Harabagiu, Dan I. Moldovan:

A Parallel Algorithm for Text Inference. 441-445
Session 12 - Performance Evaluation and Prediction
- Kelvin K. Yue, David J. Lilja:

Efficient Execution of Parallel Applications in Multiprogrammed Multiprocessor Systems. 448-456 - Xian-He Sun:

The Relation of Scalability and Execution Time. 457-462 - Thu D. Nguyen, Raj Vaswani, John Zahorjan:

Maximizing Speedup through Self-Tuning of Processor Allocation. 463-468 - Shaun Kaneshiro, Tatsuya Shindo:

Profiling Optimized Code: A Profiling System for an HPF Compiler. 469-473 - Thomas Fahringer:

Toward Symbolic Performance Prediction of Parallel Programs. 474-478 - Sivan Toledo:

Performance Prediction with Benchmaps. 479-485
Industrial Track - Invited Presentations
Session-I: Parallel Architectures - Implementation, Programming, and Performance
- Jeffrey M. Nick, Jen-Yao Chung, Nicholas S. Bowen:

IBM System/390 Division: Overview of IBM System/390 Parallel Sysplex - A Commercial Parallel Processing System. 488-495 - Alan L. Smeyne:

Litton Guidance and Control Systems, Inc.: Implementing Parallel Processing in a Rugged Embeddable Environment. 496-501 - Gérard Y. Vichniac, Barry Isenstein, Craig Lund, Arlan Pool:

Mercury Computer Systems, Inc.: Planned Direct Transfers: A Programming Model for Real-Time Applications. 502-505
Session-II: Networking and Distributed Computing
- Yogindra Abhyankar, Anil Degwekar, Abhay Karandikar:

Centre for Development of Advanced Computing: DS-Link over Fiber: A High-Speed Interconnect for Cluster Computing. 507-511 - Woo-Jong Hahn, Ando Ki, Kee-Wook Rim, Soo-Won Kim:

Electronics and Telecommunications Research Institute: A Multiprocessor Server with a New Highly Pipelined Bus. 512-517 - Robert W. Horst, Doug Jewett, William J. Watson, L. Young, Dimiter R. Avresky, R. Wilkinson, Chris M. Cunningham:

Tandem Computers Incorporated: Performance Modeling of ServerNet™ Topologies. 518-523
Session 13 - Synchronization, Virtual Memory, and Runtime System Support
- Georg Stellner:

CoCheck: Checkpointing and Process Migration for MPI. 526-531 - Peter H. Beckman, Dennis Gannon:

Tulip: A Portable Run-Time System for Object-Parallel Systems. 532-536 - Veronica L. M. Reis, Isaac D. Scherson:

A Virtual Memory Model for Parallel Supercomputers. 537-543 - Reiner W. Hartenstein, Jürgen Becker, Michael Herz, Rainer Kress, Ulrich Nageldinger:

A Partitioning Programming Environment for a Novel Parallel Architecture. 544-548 - Martin C. Rinard:

An Integrated Synchronization and Consistency Protocol for the Implementation of a High-Level Parallel Programming Language. 549-553 - Meenakshi Arunachalam, Alok N. Choudhary, Brad Rullman:

Implementation and Evaluation of Prefetching in the Intel Paragon Parallel File System. 554-559
Session 14 - Arrays and Hypercubes
- Qian-Ping Gu, Hisao Tamaki:

Routing a Permutation in the Hypercube by Two Sets of Edge-Disjoint Paths. 561-567 - Val Donaldson, Jeanne Ferrante:

Determining Asynchronous Acyclic Pipeline Execution Times. 568-572 - Bogdan S. Chlebus, José D. P. Rolim, Giora Slutzki:

Distributing Tokens on a Hypercube without Error Accumulation. 573-578 - Amit Sengupta, Cauligi S. Raghavendra:

On Some Global Operations in Faulty SIMD Hypercubes. 579-583 - Hari Krishna Tadepalli, Errol L. Lloyd:

An Improved Approximation Algorithm for Scheduling Task Trees on Linear Arrays. 584-590
Session 15 - Mathematical Methods
- Bing Bing Zhou, Richard P. Brent:

Jacobi-like Algorithms for Eigenvalue Decomposition of a Real Normal Matrix Using Real Arithmetic. 593-600 - Hong Q. Ding, Robert D. Ferraro:

An Element-Based Concurrent Partitioner for Unstructured Finite Element Meshes. 601-605 - William E. Hart, Scott B. Baden, Richard K. Belew, Scott R. Kohn:

Analysis of the Numerical Effects of Parallelism on a Parallel Genetic Algorithm. 606-612 - Shankar Ramaswamy, Eugene W. Hodges IV, Prithviraj Banerjee:

Compiling MATLAB Programs to ScaLAPACK: Exploiting Task and Data Parallelism. 613-619 - Eugene V. Zima, Karthi R. Vadivelu, Thomas L. Casavant:

Mapping Techniques for Parallel Evaluation of Chains of Recurrences. 620-624 - Adrian Moga, Michel Dubois:

Performance of Asynchronous Linear Iterations with Random Delays. 625-629
Panel
- William M. Farmer, Richard F. Freund, Mark Furtney, Paul Messina, Lionel M. Ni, Charles L. Seitz, Marc Snir:

For a Massive Number of Massively Parallel Machines: What are the Target Applications, Who are the Target Users, and What New R&D is Needed to Hit the Target? 631-634
Keynote Address
- Gregory F. Pfister:

Clusters for Commercial Computing: An Invisible Architecture. 636
Session 16 - Interconnection Networks
- Hyunmin Park, Dharma P. Agrawal:

Generic Methodologies for Deadlock-Free Routing. 638-643 - Yeimkuan Chang:

Partitionability of the Multistage Interconnection Networks. 644-649 - Mounir Hamdi, Siang W. Song:

On Embedding Various Networks into the Hypercube Using Matrix Transformations. 650-654 - Baback A. Izadi, Füsun Özgüner:

Optimal Subcube Fault Tolerance in a Circuit-Switched Hypercube. 655-659 - Yu-Chee Tseng, Shu-Hui Chang, Jang-Ping Sheu:

Fault-Tolerant Ring Embedding in Star Graphs. 660-665 - Mongkol Raksapatcharawong, Timothy Mark Pinkston:

An Optical Interconnect Model for k-ary n-cube Wormhole Networks. 666-672
Session 17 - Bus-Based Algorithms
- Ramachandran Vaidyanathan, Sudharani Nadella:

Fault-Tolerant Multiple Bus Networks for Fan-In Algorithms. 674-681 - Peter Damaschke:

Coping with Sparse Inputs on Enhanced Meshes - Semigroup Computation with COMMON CRCW Buses. 682-686 - Koji Nakano, Stephan Olariu:

An Optimal Algorithm for the Angle-Restricted All Nearest Neighbor Problem on the Reconfigurable. 687-691 - John Matthews, Charles U. Martel:

Parallel Algorithms Using Unreliable Broadcasts. 692-696 - Sandy Pavel, Selim G. Akl:

Efficient Algorithms for the Hough Transform on Arrays with Reconfigurable Optical Buses. 697-701 - Jerry L. Trahan, Chun-ming Lu, Ramachandran Vaidyanathan:

Integer and Floating Point Matrix-Vector Multiplication on the Reconfigurable Mesh. 702-706
Session 18 - Image and Radar Processing
- Shung-Shing Lee, Shi-Jinn Horng, Horng-Ren Tsai, Yu-Hua Lee:

Some Image Processing Algorithms on a RAP with Wider Bus Networks. 708-715 - Peter G. Meisl, Mabo Robert Ito, Ian G. Cumming:

Parallel Synthetic Aperture Radar Processing on Workstation Networks. 716-723 - Alberto Broggi:

The Evolution of a Massively Parallel Vision System for Real-Time Automotive Image Processing. 724-728 - Concettina Guerra:

2D Object Recognition on a Reconfigurable Mesh. 729-733 - Janice S. McMahon, Ken Teitelbaum:

Space-Time Adaptive Processing on the Mesh Synchronous Processor. 734-740 - Michael R. Berry, Tarek A. El-Ghazawi:

An Experimental Study of Input/Output Characteristics of NASA Earth and Space Sciences Applications. 741-747
Session 19 - Special-Purpose Applications
- Beverly Gocal:

Bitonic Sorting on Beneš Networks. 749-753 - Célio Estevan Morón:

Designing Adaptable Real-Time Fault-Tolerant Parallel Systems. 754-758 - James D. Allen, David E. Schimmel:

Improving Memory Performance for Indirect Accesses on SIMD Computers. 759-765 - Shousheng He, Mats Torkelson:

A New Approach to Pipeline FFT Processor. 766-770 - Hyun M. Chang, Myung Hoon Sunwoo, Tai-Hoon Cho:

Implementation of a SliM Array Processor. 771-775 - Bernardo Rodriguez, Harry F. Jordan, Gita Alaghband:

Temporal Characterization of Demands for Data Movement on Parallel Programs. 776-779
Session 20 - Communication III
- Amotz Bar-Noy, Ching-Tien Ho:

Broadcasting Multiple Messages in the Multiport Model. 781-788 - Yuanyuan Yang, Gerald M. Masson:

The Necessary Conditions for Clos-Type Nonblocking Multicast Networks. 789-795 - Yuanyuan Yang:

A Class of Interconnection Networks for Multicasting. 796-802 - Michael R. Steed, Mark J. Clement:

Performance Prediction of PVM Programs. 803-807 - Young-Joo Suh, Sudhakar Yalamanchili:

Algorithms for All-to-All Personalized Exchange in 2D and 3D Tori. 808-814 - Anjan K. Venkatramani, Timothy Mark Pinkston, José Duato:

Generalized Theory for Deadlock-Free Adaptive Wormhole Routing and its Application to Disha Concurrent. 815-821
Session 21 - Clusters and Domain Decomposition
- Cong Fu, Tao Yang:

Efficient Run-Time Support for Irregular Task Computations with Mixed Granularities. 823-830 - Joseph Gil, Alan S. Wagner:

A New Technique for 3-D Domain Decomposition on Multicomputers which Reduces Message-Passing. 831-835 - Vasudha Govindan, Mark A. Franklin:

Application Load Imbalance on Parallel Processors. 836-842 - Patrick W. Dowd, Todd M. Carrozzi, Frank A. Pellegrino, Amy Xin Chen:

Native ATM Application Programmer Interface Testbed for Cluster-Based Computing. 843-849 - Daniel Andresen, Tao Yang, Vegard Holmedahl, Oscar H. Ibarra:

SWEB: Towards a Scalable World Wide Web Server on Multicomputers. 850-856 - Rajendra Panwar, WooYoung Kim, Gul Agha:

Parallel Implementations of Irregular Problems Using High-Level Actor Language. 857-862
Additional Papers
- Kannappan Palaniappan, Mohammad Faisal, Chandra Kambhamettu, A. Frederick Hasler:

Implementation of an Automatic Semi-Fluid Motion Analysis Algorithm on a Massively Parallel Computer. 864-877 - Subhash Saini:

NAS Experiences of Porting CM Fortran Codes to HPF on IBM SP2 and SGI Power Challenge. 878-880 - Nihar R. Mahapatra, Shantanu Dutt:

Random Seeking: A General, Efficient, and Informed Randomized Scheme for Dynamic Load Balancing. 881-885 - Marián Vajtersic:

A Direct Block-Five-Diagonal System Solver for the VLSI Parallel Model. 886-890 - Ladan Kazerouni, Basant Rajan, R. K. Shyamasundar:

Mapping Linear Recurrences onto Systolic Arrays. 891-897















