


27th ICS 2013: Eugene, OR, USA
- Allen D. Malony, Mario Nemirovsky, Samuel P. Midkiff:

International Conference on Supercomputing, ICS'13, Eugene, OR, USA - June 10 - 14, 2013. ACM 2013, ISBN 978-1-4503-2130-3
Keynote address
- Bob Blainey:

Business meets supercomputing: keynote talk. 1-2
DSLs and semantic based compilation 1
- Andrew Stone, Michelle Mills Strout:

Abstractions to separate concerns in semi-regular grids. 3-12
- Thomas Henretty, Richard Veras, Franz Franchetti, Louis-Noël Pouchet, J. Ramanujam, P. Sadayappan:

A stencil compiler for short-vector SIMD architectures. 13-24
- Chenyang Liu, Muhammad Hasan Jamal, Milind Kulkarni, Arun Prakash, Vijay S. Pai:

Exploiting domain knowledge to optimize parallel computational mechanics codes. 25-36
Tools and performance debugging
- José-María Arnau, Joan-Manuel Parcerisa, Polychronis Xekalakis:

TEAPOT: a toolset for evaluating performance, power and image quality on mobile graphics systems. 37-46
- Chang-Seo Park, Koushik Sen, Costin Iancu:

Scaling data race detection for partitioned global address space programs. 47-58
- Xing Wu, Frank Mueller:

Elastic and scalable tracing and accurate replay of non-deterministic events. 59-68
- Xu Liu, John M. Mellor-Crummey, Michael W. Fagan:

A new approach for performance analysis of openMP programs. 69-80
Memory and storage
- Kun Fang, Zhichun Zhu:

Conservative row activation to improve memory power efficiency. 81-90
- Sangyeun Cho, Chanik Park, Hyunok Oh, Sungchan Kim, Youngmin Yi, Gregory R. Ganger:

Active disk meets flash: a case for intelligent SSDs. 91-102
- Myoungsoo Jung, John Shalf, Mahmut T. Kandemir:

Design of a large-scale storage-class RRAM system. 103-114
- Ju-Young Jung, Sangyeun Cho:

Memorage: emerging persistent RAM based malleable main memory and storage architecture. 115-126
Keynote address
- Steven L. Teig:

Function, latency, bandwidth, power: towards a better computer. 127-128
Communication and heterogeneous systems
- Michail Alvanos, Montse Farreras, Ettore Tiotto, José Nelson Amaral, Xavier Martorell:

Improving communication in PGAS environments: static and dynamic coalescing in UPC. 129-138
- Bogdan Prisacari, Germán Rodríguez, Cyriel Minkenberg, Torsten Hoefler:

Bandwidth-optimal all-to-all exchanges in fat tree networks. 139-148
- Klaus Kofler, Ivan Grasso, Biagio Cosenza, Thomas Fahringer:

An automatic input-sensitive approach for heterogeneous task partitioning. 149-160
- Ivan Grasso, Simone Pellegrini, Biagio Cosenza, Thomas Fahringer:

LibWater: heterogeneous distributed computing made easy. 161-172
Architecture 1
- Tapasya Patki, David K. Lowenthal, Barry Rountree, Martin Schulz, Bronis R. de Supinski:

Exploring hardware overprovisioning in power-constrained, high performance computing. 173-182
- Ramakrishnan Rajamony, Mark W. Stephenson, William Evan Speight:

The power 775 architecture at scale. 183-192
- Ruisheng Wang, Lizhong Chen, Timothy Mark Pinkston:

Bubble coloring: avoiding routing- and protocol-induced deadlocks with minimal virtual channel requirement. 193-202
- Keith D. Underwood, Eric Borch, John Sizer, Timothy Stremcha, Michael Strom:

Evaluating on-die interconnects for a 4 TB/s router. 203-212
Algorithms
- Matthew Badin, Paolo D'Alberto, Lubomir Bic, Michael B. Dillencourt, Alexandru Nicolau:

Improving numerical accuracy for non-negative matrix multiplication on GPUs using recursive algorithms. 213-222
- Azzam Haidar, Mark Gates, Stanimire Tomov, Jack J. Dongarra:

Toward a scalable multi-GPU eigensolver via compute-intensive kernels and efficient communication. 223-232
- Panagiotis A. Foteinos, Nikos Chrisochoides:

High quality real-time image-to-mesh conversion for finite element simulations. 233-242
Architecture 2
- Komal Jothi, Haitham Akkary:

Tuning the continual flow pipeline architecture. 243-252
- Konstantinos Koukos, David Black-Schaffer, Vasileios Spiliopoulos, Stefanos Kaxiras:

Towards more efficient execution: a decoupled access-execute approach. 253-262
- Souad Koliai, Zakaria Bendifallah, Mathieu Tribalat, Cédric Valensi, Jean-Thomas Acquaviva, William Jalby:

Quantifying performance bottleneck cost through differential analysis. 263-272
Irregular algorithms
- Xing Liu, Mikhail Smelyanskiy, Edmond Chow, Pradeep Dubey:

Efficient sparse matrix-vector multiplication on x86-based many-core processors. 273-282
- Nicholas Gerard Edmonds, Jeremiah Willcock, Andrew Lumsdaine:

Expressing graph algorithms using generalized active messages. 283-292
- Hari Sundar, Dhairya Malhotra, George Biros:

HykSort: a new variant of hypercube quicksort on distributed memory architectures. 293-302
Memory
- Gabriel Marin, Collin McCurdy, Jeffrey S. Vetter:

Diagnosis and optimization of application prefetching performance. 303-312
- Changhui Lin, Vijay Nagarajan, Rajiv Gupta:

Address-aware fences. 313-324
- Vassilis Papaefstathiou, Manolis Katevenis, Dimitrios S. Nikolopoulos, Dionisios N. Pnevmatikatos:

Prefetching and cache management using task lifetimes. 325-334
Keynote address
- James E. Smith:

The role of computer designers in reverse-engineering the brain. 335-336
Runtime techniques
- Srinath Sridharan, Gagan Gupta, Gurindar S. Sohi:

Holistic run-time parallelism management for time and energy efficiency. 337-348
- R. Vasudevan, Sathish S. Vadhiyar, Laxmikant V. Kalé:

G-Charm: an adaptive runtime system for message-driven parallel applications on hybrid systems. 349-358
- Javier Bueno, Xavier Martorell, Rosa M. Badia, Eduard Ayguadé, Jesús Labarta:

Implementing OmpSs support for regions of data in architectures with multiple address spaces. 359-368
- Michael O. Lam, Jeffrey K. Hollingsworth, Bronis R. de Supinski, Matthew P. LeGendre:

Automatically adapting programs for mixed-precision floating-point computation. 369-378
Order in the house
- Pablo Prieto, Valentin Puente, José-Ángel Gregorio:

CMP off-chip bandwidth scheduling guided by instruction criticality. 379-388
- Wolfgang Frings, Dong H. Ahn, Matthew P. LeGendre, Todd Gamblin, Bronis R. de Supinski, Felix Wolf:

Massively parallel loading. 389-398
- Khaled Hamidouche, Sreeram Potluri, Hari Subramoni, Krishna Chaitanya Kandalla, Dhabaleswar K. Panda:

MIC-RO: enabling efficient remote offload on heterogeneous many integrated core (MIC) clusters with InfiniBand. 399-408
GPUs
- Xin Huo, Sriram Krishnamoorthy, Gagan Agrawal:

Efficient scheduling of recursive control flow on GPUs. 409-420
- Nabeel AlSaber, Milind Kulkarni:

SemCache: semantics-aware caching for efficient GPU offloading. 421-432
- Ping Xiang, Yi Yang, Mike Mantor, Norm Rubin, Lisa R. Hsu, Huiyang Zhou:

Exploiting uniform vector instructions for GPGPU performance, energy efficiency, and opportunistic reliability enhancement. 433-442
- Amit Sabne, Putt Sakdhnagool, Rudolf Eigenmann:

Scaling large-data computations on multi-GPU accelerators. 443-454
Posters
- Sriram Aananthakrishnan, Greg Bronevetsky, Ganesh Gopalakrishnan:

Hybrid approach for data-flow analysis of MPI programs. 455-456
- Michail Alvanos, Gabriel Tanase, Montse Farreras, Ettore Tiotto, José Nelson Amaral, Xavier Martorell:

Improving performance of all-to-all communication through loop scheduling in PGAS environments. 457-458
- Madhur Amilkanthwar, Shankar Balachandran:

CUPL: a compile-time uncoalesced memory access pattern locator for CUDA. 459-460
- Weiwei Chen, Ewa Deelman, Rizos Sakellariou:

Imbalance optimization in scientific workflows. 461-462
- Catalin Bogdan Ciobanu, Dionisios N. Pnevmatikatos, Kyprianos D. Papadimitriou, Georgi Nedeltchev Gaydadjiev:

FASTER run-time reconfiguration management. 463-464
- Hadrien A. Clarke, Antoine Trouvé, Kazuaki J. Murakami:

MAD7: a memory architecture simulator targeted at design space exploration. 465-466
- Truong Vinh Truong Duy, Taisuke Ozaki:

A decomposition method with minimal communication volume for parallelization of multi-dimensional FFTs. 467-468
- Truong Vinh Truong Duy, Taisuke Ozaki:

A massively parallel domain decomposition method for large-scale DFT electronic structure calculations. 469-470
- Panagiotis A. Foteinos, Daming Feng, Andrey N. Chernikov, Nikos Chrisochoides:

Multi-layered unstructured mesh generation. 471-472
- Justin A. Hogan, Raymond J. Weber, Brock J. LaMeres, Todd Kaiser:

Network-on-chip for a partially reconfigurable FPGA system. 473-474
- Saurabh Jha, Tejaswi Agarwal, B. Rajesh Kanna:

Exploiting data parallelism in the yConvex hypergraph algorithm for image representation using GPGPUs. 475-476
- Tao Jiang, Lele Zhang, Rui Hou, Yi Zhang, Qianlong Zhang, Lin Chai, Jing Han, Wuxiong Zhang, Cong Wang, Lixin Zhang:

The ARMv8 simulator. 477-478
- Erik Keever, James N. Imamura:

Imogen: a parallel 3D fluid and MHD code for GPUs. 479-480
- Min Li, Sushil Mantri, Pin Zhou, Ali Raza Butt:

SMIO: I/O similarity aware virtual machine management in virtual desktop environments. 481-482
- David Ozog, Sameer Shende, Allen D. Malony, Jeff R. Hammond, James Dinan, Pavan Balaji:

Inspector/executor load balancing algorithms for block-sparse tensor contractions. 483-484
- Swaroop Pophale, Tony Curtis, Barbara M. Chapman:

Improving performance of openSHMEM reference library by portable PE mapping technique. 485-486
- Sonish Shrestha:

Using platform-independent data locality analysis to predict cache performance on abstract hardware platforms. 487-488
- Tyler Sorensen, Ganesh Gopalakrishnan, Vinod Grover:

Towards shared memory consistency models for GPUs. 489-490
- Alejandro Valero, Julio Sahuquillo, Salvador Petit, José Duato:

Exploiting reuse information to reduce refresh energy in on-chip eDRAM caches. 491-492
- Cong Wang, Tao Jiang, Rui Hou:

V-OpenCL: a method to use remote GPGPU. 493-494
- Raymond J. Weber, Justin A. Hogan, Brock J. LaMeres, Todd Kaiser:

Power efficiency in a partially reconfigurable multiprocessor system. 495-496















