


32nd ICS 2018: Beijing, China
- Proceedings of the 32nd International Conference on Supercomputing, ICS 2018, Beijing, China, June 12-15, 2018. ACM 2018, ISBN 978-1-4503-5783-8

File system, I/O and Storage System
- Jinrui Cao, Om Rameshwar Gatla, Mai Zheng, Dong Dai, Vidya Eswarappa, Yan Mu, Yong Chen: PFault: A General Framework for Analyzing the Reliability of High-Performance Parallel File Systems. 1-11
- Jie Yu, Guangming Liu, Xin Liu, Wenrui Dong, Xiaoyong Li, Yusheng Liu: Rethinking Node Allocation Strategy for Data-intensive Applications in Consideration of Spatially Bursty I/O. 12-21
- Wenhui Zhang, Qiang Cao, Hong Jiang, Jie Yao: PA-SSD: A Page-Type Aware TLC SSD for Improved Write/Read Performance and Storage Efficiency. 22-32
- Anthony Kougkas, Hariharan Devarajan, Xian-He Sun: IRIS: I/O Redirection via Integrated Storage. 33-42

GPUs-I: Execution Model
- Husheng Zhou, Soroush Bateni, Cong Liu: GRU: Exploring Computation and Data Redundancy via Partial GPU Computing Result Reuse. 43-52
- Ang Li, Weifeng Liu, Linnan Wang, Kevin J. Barker, Shuaiwen Leon Song: Warp-Consolidation: A Novel Execution Model for GPUs. 53-64
- Xia Zhao, Zhiying Wang, Lieven Eeckhout: Classification-Driven Search for Effective SM Partitioning in Multitasking GPUs. 65-75

GPUs-II: GPU and Algorithm
- Bernhard Kerbl, Michael Kenzel, Joerg H. Mueller, Dieter Schmalstieg, Markus Steinberger: The Broker Queue: A Fast, Linearizable FIFO Queue for Fine-Granular Work Distribution on the GPU. 76-85
- Ben Karsin, Volker Weichert, Henri Casanova, John Iacono, Nodari Sitchinava: Analysis-driven Engineering of Comparison-based Sorting Algorithms on GPUs. 86-95
- Jinsung Kim, Aravind Sukumaran-Rajam, Changwan Hong, Ajay Panyala, Rohit Kumar Srivastava, Sriram Krishnamoorthy, P. Sadayappan: Optimizing Tensor Contractions in CCSD(T) for Efficient Execution on GPUs. 96-106

Architecture
- Zhaoxiang Jin, Soner Önder: A two-phase recovery mechanism. 107-117
- Reena Panda, Lizy K. John: HALO: A Hierarchical Memory Access Locality Modeling Technique For Memory System Explorations. 118-128
- Jose Antonio Pascual, Javier Navaridas: High-Performance, Low-Complexity Deadlock Avoidance for Arbitrary Topologies/Routings. 129-138

Accelerator
- Dongwoo Lee, Sungbum Kang, Kiyoung Choi: ComPEND: Computation Pruning through Early Negative Detection for ReLU in a Deep Neural Network Accelerator. 139-148
- Hao Yan, Hebin R. Cherian, Ethan C. Ahn, Lide Duan: CELIA: A Device and Architecture Co-Design Framework for STT-MRAM-Based Deep Learning Acceleration. 149-159
- Jacob Lambert, Seyong Lee, Jungwon Kim, Jeffrey S. Vetter, Allen D. Malony: Directive-Based, High-Level Programming and Optimizations for High-Performance Computing with FPGAs. 160-171

Application and Programming Framework
- Xue Li, Mingxing Zhang, Kang Chen, Yongwei Wu: ReGraph: A Graph Processing Framework that Alternately Shrinks and Repartitions the Graph. 172-183
- Xiuhong Li, Yun Liang, Wentai Zhang, Taide Liu, Haochen Li, Guojie Luo, Ming Jiang: cuMBIR: An Efficient Framework for Low-dose X-ray CT Image Reconstruction on GPUs. 184-194
- Feng Zhang, Jidong Zhai, Xipeng Shen, Onur Mutlu, Wenguang Chen: Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data. 195-206

Runtime System and Library
- Isaac Sánchez Barrera, Miquel Moretó, Eduard Ayguadé, Jesús Labarta, Mateo Valero, Marc Casas: Reducing Data Movement on Large Shared Memory Systems by Exploiting Computation Dependencies. 207-217
- Lluc Alvarez, Marc Casas, Jesús Labarta, Eduard Ayguadé, Mateo Valero, Miquel Moretó: Runtime-Guided Management of Stacked DRAM Memories in Task Parallel Programs. 218-228
- François Tessier, Paul Gressier, Venkatram Vishwanath: Optimizing Data Aggregation by Leveraging the Deep Memory Hierarchy on Large-scale Systems. 229-239

Program Analysis
- Lai Wei, John M. Mellor-Crummey: Automated Analysis of Time Series Data to Understand Parallel Program Behaviors. 240-251
- Hui Zhang, Jeffrey K. Hollingsworth: ChplBlamer: A Data-centric and Code-centric Combined Profiler for Multi-locale Chapel Programs. 252-262
- Shasha Wen, Lucy Cherkasova, Felix Xiaozhu Lin, Xu Liu: ProfDP: A Lightweight Profiler to Guide Data Placement in Heterogeneous Memory Systems. 263-273

System Design
- Nadja Peters, Sangyoung Park, Daniel Clifford, S. Kyostila, Ross McIlroy, Benedikt Meurer, Hannes Payer, Samarjit Chakraborty: Phase-Aware Web Browser Power Management on HMP Platforms. 274-283
- Ke Zhou, Si Sun, Hua Wang, Ping Huang, Xubin He, Rui Lan, Wenyan Li, Wenjie Liu, Tianming Yang: Demystifying Cache Policies for Photo Stores at Scale: A Tencent Case Study. 284-294
- Zhihao Jia, Sean Treichler, Galen M. Shipman, Patrick S. McCormick, Alex Aiken: Isometry: A Path-Based Distributed Data Transfer System. 295-306

Parallel Algorithm
- Yang You, James Demmel, Cho-Jui Hsieh, Richard W. Vuduc: Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems. 307-317
- Keke Zhai, Tania Banerjee, David Zwick, Jason Hackl, Sanjay Ranka: Dynamic Load Balancing for Compressible Multiphase Turbulence. 318-327

Compiler and OS
- Jiacheng Zhao, Huimin Cui, Yalin Zhang, Jingling Xue, Xiaobing Feng: Revisiting Loop Tiling for Datacenters: Live and Let Live. 328-340
- Shikai Li, Sunghyun Park, Scott A. Mahlke: Sculptor: Flexible Approximation with Selective Dynamic Loop Perforation. 341-351
- Jee Ho Ryoo, Lizy K. John, Arkaprava Basu: A Case for Granularity Aware Page Migration. 352-362

Optimization and Performance Tuning
- Changxi Liu, Biwei Xie, Xin Liu, Wei Xue, Hailong Yang, Xu Liu: Towards Efficient SpMV on Sunway Manycore Architectures. 363-373
- Venkatesan T. Chakaravarthy, Jee W. Choi, Douglas J. Joseph, Prakash Murali, Shivmaran S. Pandian, Yogish Sabharwal, Dheeraj Sreedhar: On Optimizing Distributed Tucker Decomposition for Sparse Tensors. 374-384
- Jayaraman J. Thiagarajan, Nikhil Jain, Rushil Anirudh, Alfredo Giménez, Rahul Sridhar, Aniruddha Marathe, Tao Wang, Murali Emani, Abhinav Bhatele, Todd Gamblin: Bootstrapping Parameter Space Exploration for Fast Tuning. 385-395















