


23rd ICS 2009: Yorktown Heights, NY, USA
- Michael Gschwind, Alexandru Nicolau, Valentina Salapura, José E. Moreira: Proceedings of the 23rd international conference on Supercomputing, 2009, Yorktown Heights, NY, USA, June 8-12, 2009. ACM 2009, ISBN 978-1-60558-498-0
Keynote Address I
- Mateo Valero: A European perspective on supercomputing. 1
Keynote Address II
- Don G. Grice: The Roadrunner project and the importance of energy efficiency on the road to exascale computing. 2
Keynote Address III
- Ian T. Foster: Computing outside the box. 3
Applications of the cell processor
- Konstantis Daloukas, Christos D. Antonopoulos, Nikolaos Bellas: Implementation of a wide-angle lens distortion correction algorithm on the Cell Broadband Engine. 4-13
- Daniele Paolo Scarpazza, Gregory F. Russell: High-performance regular expression scanning on the Cell/B.E. processor. 14-25
- Srinivas Chellappa, Franz Franchetti, Markus Püschel: Computer generation of fast Fourier transforms for the Cell Broadband Engine. 26-35
- Tao Liu, Haibo Lin, Tong Chen, Kevin O'Brien, Ling Shao: DBDB: optimizing DMATransfer for the Cell BE architecture. 36-45
Cache enhancement techniques
- Julien Dusser, Thomas Piquet, André Seznec: Zero-content augmented caches. 46-55
- Mohammad Hammoud, Sangyeun Cho, Rami G. Melhem: Dynamic cache clustering for chip multiprocessors. 56-67
- Lingxiang Xiang, Tianzhou Chen, Qingsong Shi, Wei Hu: Less reused filter: improving L2 cache performance via filtering less reused lines. 68-79
- Chuanjun Zhang, Bing Xue: Divide-and-conquer: a bubble replacement for low level caches. 80-89
Optimizing parallel applications
- Hiroshi Nakashima, Yohei Miyake, Hideyuki Usui, Yoshiharu Omura: OhHelp: a scalable domain-decomposing dynamic load balancing for particle-in-cell simulations. 90-99
- Mehmet Belgin, Godmar Back, Calvin J. Ribbens: Pattern-based sparse matrix representation for memory-efficient SMVM kernels. 100-109
- Abhinav Bhatele, Laxmikant V. Kalé, Sameer Kumar: Dynamic topology aware load balancing algorithms for molecular dynamics applications. 110-116
Transactional memory I
- JaeWoong Chung, Woongki Baek, Christos Kozyrakis: Fast memory snapshot for concurrent programming without synchronization. 117-125
- Vladimir Gajinov, Ferad Zyulkyarov, Osman S. Unsal, Adrián Cristal, Eduard Ayguadé, Tim Harris, Mateo Valero: QuakeTM: parallelizing a complex sequential application using transactional memory. 126-135
- Arrvindh Shriraman, Sandhya Dwarkadas: Refereeing conflicts in hardware transactional memory. 136-146
Compilers
- Albert Hartono, Muthu Manikandan Baskaran, Cédric Bastoul, Albert Cohen, Sriram Krishnamoorthy, Boyana Norris, J. Ramanujam, P. Sadayappan: Parametric multi-level tiling of imperfectly nested loops. 147-157
- Cheng Wang, Youfeng Wu, Edson Borin, Shiliang Hu, Wei Liu, Dave Sager, Tin-Fook Ngai, Jesse Fang: Dynamic parallelization of single-threaded binary programs using speculative slicing. 158-168
- Alexandru Nicolau, Guangqiang Li, Alexander V. Veidenbaum, Arun Kejariwal: Synchronization optimizations for efficient execution on multi-cores. 169-180
- Jun Shirako, Jisheng M. Zhao, V. Krishna Nandivada, Vivek Sarkar: Chunking parallel loops in the presence of synchronization. 181-192
High performance communications I
- Qasim Ali, Samuel P. Midkiff, Vijay S. Pai: Efficient high performance collective communication for the Cell blade. 193-203
- Junchang Wang, Haipeng Cheng, Bei Hua, Xinan Tang: Practice of parallelizing network applications on multi-core architectures. 204-213
- Stavros Passas, Kostas Magoutis, Angelos Bilas: Towards 100 Gbit/s Ethernet: multicore-based parallel communication protocol design. 214-224
- Jiuxing Liu, Bülent Abali: Virtualization polling engine (VPE): using dedicated CPU cores to accelerate I/O virtualization. 225-234
Accelerating applications with GPUs I
- M. Suhail Rehman, Kishore Kothapalli, P. J. Narayanan: Fast and scalable list ranking on the GPU. 235-243
- Sundaresan Venkatasubramanian, Richard W. Vuduc: Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems. 244-255
- Jiayuan Meng, Kevin Skadron: Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs. 256-265
Architectures for High-Performance Computing
- Leo Porter, Dean M. Tullsen: Creating artificial global history to improve branch prediction accuracy. 266-275
- Germán Rodríguez, Ramón Beivide, Cyriel Minkenberg, Jesús Labarta, Mateo Valero: Exploring pattern-aware routing in generalized fat tree networks. 276-285
High-performance communications II
- Tobias Hilbrich, Bronis R. de Supinski, Martin Schulz, Matthias S. Müller: A graph based approach for MPI deadlock detection. 296-305
- Matthew Small, Xin Yuan: Maximizing MPI point-to-point communication performance on RDMA-enabled clusters with customized protocols. 306-315
- Anthony Danalis, Lori L. Pollock, D. Martin Swany, John Cavazos: MPI-aware compiler optimizations for improving communication-computation overlap. 316-325
- Jiuxing Liu, Dan E. Poff, Bülent Abali: Evaluating high performance communication: a power perspective. 326-337
Storage solutions for supercomputing
- Ji-Yong Shin, Zenglin Xia, Ning-Yi Xu, Rui Gao, Xiongfei Cai, Seungryoul Maeng, Feng-Hsiung Hsu: FTL design exploration in reconfigurable high-performance SSD for server applications. 338-349
- Henry M. Monti, Ali Raza Butt, Sudharshan S. Vazhkudai: /scratch as a cache: rethinking HPC center scratch storage. 350-359
- Chao Jin, Hong Jiang, Dan Feng, Lei Tian: P-Code: a new RAID-6 code with optimal properties. 360-369
- Chuanyi Liu, Yu Gu, Linchun Sun, Bin Yan, Dongsheng Wang: R-ADMAD: high reliability provision for large-scale de-duplication archival storage systems. 370-379
Accelerating applications with GPUs II
- Guangming Tan, Ziyu Guo, Mingyu Chen, Dan Meng: Single-particle 3D reconstruction from cryo-electron microscopy images on GPU. 380-389
- Gabriel Falcão Paiva Fernandes, Vítor Manuel Mendes da Silva, Leonel Sousa: How GPUs can outperform ASICs for fast LDPC decoding. 390-399
- Wenjing Ma, Gagan Agrawal: A translation system for enabling data mining applications on GPUs. 400-409
Transactional memory II
- Polychronis Xekalakis, Nikolas Ioannou, Marcelo Cintra: Combining thread level speculation helper threads and runahead execution. 410-420
- Salil Mohan Pant, Gregory T. Byrd: Limited early value communication to improve performance of transactional memory. 421-429
Novel supercomputing applications
- Keith R. Bisset, Jiangzhuo Chen, Xizhou Feng, V. S. Anil Kumar, Madhav V. Marathe: EpiFast: a fast algorithm for large scale realistic epidemic simulations on distributed memory systems. 430-439
- Rob van Nieuwpoort, John W. Romein: Using many-core hardware to correlate radio astronomy signals. 440-449
- Jun Cao, Krista A. Novstrup, Ayush Goyal, Samuel P. Midkiff, James M. Caruthers: A parallel Levenberg-Marquardt algorithm. 450-459
Power management
- Barry Rountree, David K. Lowenthal, Bronis R. de Supinski, Martin Schulz, Vincent W. Freeh, Tyler K. Bletsch: Adagio: making DVS practical for complex HPC applications. 460-469
- Mohammad Arjomand, Hamid Sarbazi-Azad: A comprehensive power-performance model for NoCs with multi-flit channel buffers. 470-478
- Andrew Herdrich, Ramesh Illikkal, Ravi R. Iyer, Donald Newell, Vineet Chadha, Jaideep Moses: Rate-based QoS techniques for cache/memory in CMP platforms. 479-488
Posters
- Ahmad Faraj, Sameer Kumar, Brian E. Smith, Amith R. Mamidala, John A. Gunnels, Philip Heidelberger: MPI collective communications on the Blue Gene/P supercomputer: algorithms and optimizations. 489-490
- James Poe, Clay Hughes, Tao Li: TransMetric: architecture independent workload characterization for transactional memory benchmarks. 491-492
- Md. Mafijul Islam, Sally A. McKee, Per Stenström: Cancellation of loads that return zero using zero-value caches. 493-494
- Huayong Wang, Henrique Andrade, Bugra Gedik, Kun-Lung Wu: Auto-vectorization through code generation for stream processing applications. 495-496
- Aleksandr Ovcharenko, Onkar Sahni, Christopher D. Carothers, Kenneth E. Jansen, Mark S. Shephard: Subdomain communication to increase scalability in large-scale scientific applications. 497-498
- Yasuo Ishii, Mary Inaba, Kei Hiraki: Access map pattern matching for data cache prefetch. 499-500
- Karan Singh, Major Bhadauria, Sally A. McKee: Prediction-based power estimation and scheduling for CMPs. 501-502
- Jih-Ching Chiu, Kai-Ming Yang, Yu-Liang Chou: Design of a novel SIMD architecture by fusing operations and registers. 503-504
- Jian Li, Lixin Zhang, Charles Lefurgy, Richard R. Treumann, Wolfgang E. Denzel: Thrifty interconnection network for HPC systems. 505-506
- Liang Gu, Xiaoming Li: Performance modeling for DFT algorithms in FFTW. 507-508
- Major Bhadauria, Vincent M. Weaver, Sally A. McKee: PARSEC: hardware profiling of emerging workloads for CMP design. 509-510
- Mohamed E. Hussein, Wael Abd-Almageed: Approximate kernel matrix computation on GPUs for large scale learning applications. 511-512
- Diana Bautista, Julio Sahuquillo, Houcine Hassan, Salvador Petit, José Duato: Dynamic task set partitioning based on balancing memory requirements to reduce power consumption. 513-514
- Alexandros Papakonstantinou, Karthik Gururaj, John A. Stratton, Deming Chen, Jason Cong, Wen-mei W. Hwu: High-performance CUDA kernel execution on FPGAs. 515-516
- Shih-wei Liao, Tzu-Han Hung, Donald Nguyen, Hucheng Zhou, Chinyen Chou, Chia-Heng Tu: Prefetch optimizations on large-scale applications via parameter value prediction. 519-520
- Scott Beamer, Krste Asanovic, Christopher Batten, Ajay Joshi, Vladimir Stojanovic: Designing multi-socket systems using silicon photonics. 521-522
- Victor Lotrich, Norbert Flocke, Mark Ponton, Beverly A. Sanders, Erik Deumens, Rodney J. Bartlett, Ajith Perera: An infrastructure for scalable and portable parallel programs for computational chemistry. 523-524















