


IPDPS 2017: Orlando, FL, USA
- 2017 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2017, Orlando, FL, USA, May 29 - June 2, 2017. IEEE Computer Society 2017, ISBN 978-1-5386-3914-6

Keynote 1
- Tandy J. Warnow: Computational Challenges in Constructing the Tree of Life. 1
Session 1: Graph Algorithms
- Gal Yehuda, Daniel Keren, Islam Akaria: Monitoring Properties of Large, Distributed, Dynamic Graphs. 2-11
- Patrick Flick, Srinivas Aluru: Parallel Construction of Suffix Trees and the All-Nearest-Smaller-Values Problem. 12-21
- Ariful Azad, Mathias Jacquelin, Aydin Buluç, Esmond G. Ng: The Reverse Cuthill-McKee Algorithm in Distributed-Memory. 22-31
- Maciej Besta, Florian Marending, Edgar Solomonik, Torsten Hoefler: SlimSell: A Vectorizable Graph Representation for Breadth-First Search. 32-41
Session 2: Computational Biology
- Haidong Lan, Weiguo Liu, Yongchao Liu, Bertil Schmidt: SWhybrid: A Hybrid-Parallel Framework for Large-Scale Protein Sequence Database Search. 42-51
- Yuandong Chan, Kai Xu, Haidong Lan, Weiguo Liu, Yongchao Liu, Bertil Schmidt: PUNAS: A Parallel Ungapped-Alignment-Featured Seed Verification Algorithm for Next-Generation Sequencing Read Alignment. 52-61
- Jing Zhang, Sanchit Misra, Hao Wang, Wu-chun Feng: Eliminating Irregularities of Protein Sequence Search on Multicore Architectures. 62-71
- Jie Wang, Xinfeng Xie, Jason Cong: Communication Optimization on GPU: A Case Study of Sequence Alignment Algorithms. 72-81
Session 3: Caches
- Bingchao Li, Jizhou Sun, Murali Annavaram, Nam Sung Kim: Elastic-Cache: GPU Cache Architecture for Efficient Fine- and Coarse-Grained Cache-Line Management. 82-91
- Qi Zeng, Jih-Kwon Peir: Content-Aware Non-Volatile Cache Replacement. 92-101
- Jiguang Wan, Wei Wu, Ling Zhan, Qing Yang, Xiaoyang Qu, Changsheng Xie: DEFT-Cache: A Cost-Effective and Highly Reliable SSD Cache for RAID Storage. 102-111
- Pengcheng Li, Dhruva R. Chakrabarti, Chen Ding, Liang Yuan: Adaptive Software Caching for Efficient NVRAM Data Persistence. 112-122
Session 4: Cloud & OS
- Song Wu, Chao Niu, Jia Rao, Hai Jin, Xiaohai Dai: Container-Based Cloud Platform for Mobile Computation Offloading. 123-132
- Hao He, Jiang Hu, Dilma Da Silva: Enhancing Datacenter Resource Management through Temporal Logic Constraints. 133-142
- Jie Zhang, Xiaoyi Lu, Dhabaleswar K. Panda: High-Performance Virtual Machine Migration Framework for MPI Applications on SR-IOV Enabled InfiniBand Clusters. 143-152
- Swann Perarnau, Judicael A. Zounmevo, Matthieu Dreher, Brian C. Van Essen, Roberto Gioiosa, Kamil Iskra, Maya B. Gokhale, Kazutomo Yoshii, Peter H. Beckman: Argo NodeOS: Toward Unified Resource Management for Exascale. 153-162
Session 5: Distributed Algorithms
- Andrea Clementi, Luciano Gualà, Guido Proietti, Giacomo Scornavacca: Rational Fair Consensus in the Gossip Model. 163-171
- Calvin Newport: Leader Election in a Smartphone Peer-to-Peer Network. 172-181
- Karine Altisen, Ajoy K. Datta, Stéphane Devismes, Anaïs Durand, Lawrence L. Larmore: Leader Election in Asymmetric Labeled Unidirectional Rings. 182-191
- Petra Berenbrink, Peter Kling, Christopher Liaw, Abbas Mehrabian: Tight Load Balancing Via Randomized Local Search. 192-201
Session 6: Numerical Simulation
- Hiroshi Nakashima, Yoshiki Summura, Keisuke Kikura, Yohei Miyake: Large Scale Manycore-Aware PIC Simulation with Efficient Particle Binning. 202-212
- Amrita Mathuriya, Ye Luo, Anouar Benali, Luke Shulenburger, Jeongnim Kim: Optimization and Parallelization of B-Spline Based Orbital Evaluations in QMC on Multi/Many-Core Shared Memory Processors. 213-223
- Kshitij Mehta, Maxime R. Hugues, Oscar R. Hernandez, David E. Bernholdt, Henri Calandra: One-Way Wave Equation Migration at Scale on GPUs Using Directive Based Programming. 224-233
- Mathias Jacquelin, Wibe A. de Jong, Eric J. Bylaska: Towards Highly scalable Ab Initio Molecular Dynamics (AIMD) Simulations on the Intel Knights Landing Manycore Processor. 234-243
Session 7: Novel Architectures
- Xubin Tan, Jaume Bosch, Miquel Vidal, Carlos Álvarez, Daniel Jiménez-González, Eduard Ayguadé, Mateo Valero: General Purpose Task-Dependence Management Hardware for Task-Based Dataflow Programming Models. 244-253
- Halit Dogan, Farrukh Hijaz, Masab Ahmad, Brian Kahne, Peter Wilson, Omer Khan: Accelerating Graph and Machine Learning Workloads Using a Shared Memory Multicore Architecture with Auxiliary Support for In-hardware Explicit Messaging. 254-264
- Xiang Pan, Anys Bacha, Radu Teodorescu: Respin: Rethinking Near-Threshold Multiprocessor Design with Non-volatile Memory. 265-275
- Syed Mohammad Asad Hassan Jafri, Ahmed Hemani, Kolin Paul, Naeem Abbas: MOCHA: Morphable Locality and Compression Aware Architecture for Convolutional Neural Networks. 276-286
Session 8: Performance Modeling and Tuning
- Biagio Cosenza, Juan José Durillo, Stefano Ermon, Ben H. H. Juurlink: Autotuning Stencil Computations with Structural Ordinal Regression Learning. 287-296
- Sabela Ramos, Torsten Hoefler: Capability Models for Manycore Memory Systems: A Case-Study with Xeon Phi KNL. 297-306
- David Beckingsale, Olga Pearce, Ignacio Laguna, Todd Gamblin: Apollo: Reusable Models for Fast, Dynamic Tuning of Input-Dependent Code. 307-316
- Ryan D. Friese, Nathan R. Tallent, Abhinav Vishnu, Darren J. Kerbyson, Adolfy Hoisie: Generating Performance Models for Irregular Applications. 317-326
Session 9: Communication & Coordination
- Keishla D. Ortiz-Lopez, Jennifer L. Welch: Bounded Reordering Allows Efficient Reliable Message Transmission. 327-336
- Dongxiao Yu, Yuexuan Wang, Tigran Tonoyan, Magnús M. Halldórsson: Dynamic Adaptation in Wireless Networks Under Comprehensive Interference via Carrier Sense. 337-346
- Pawel Garncarek, Tomasz Jurdzinski, Krzysztof Lorys: Fault-Tolerant Online Packet Scheduling on Parallel Channels. 347-356
- Torsten Hoefler, Amnon Barak, Amnon Shiloh, Zvi Drezner: Corrected Gossip Algorithms for Fast Reliable Broadcast on Unreliable Systems. 357-366
Session 10: Tools 1
- Hao Xu, Shasha Wen, Alfredo Giménez, Todd Gamblin, Xu Liu: DR-BW: Identifying Bandwidth Contention in NUMA Architectures with Supervised Learning. 367-376
- Hui Zhang, Jeffrey K. Hollingsworth: Data Centric Performance Measurement Techniques for Chapel Programs. 377-386
- Young Wn Song, Yann-Hang Lee: A Parallel FastTrack Data Race Detector on Multi-core Systems. 387-396
- Gokcen Kestor, Sriram Krishnamoorthy, Wenjing Ma: Localized Fault Recovery for Nested Fork-Join Programs. 397-408
Session 11: Networks
- Roberto Gioiosa, Antonino Tumeo, Jian Yin, Thomas Warfel, David J. Haglin, Santiago Betelú: Exploring DataVortex Systems for Irregular Applications. 409-418
- Jiyan Sun, Yan Zhang, Xin Wang, Shihan Xiao, Zhen Xu, Hongjing Wu, Xin Chen, Yanni Han: DC2-MTCP: Light-Weight Coding for Efficient Multi-Path Transmission in Data Center Network. 419-428
- Yi Dai, Kefei Wang, Gang Qu, Liquan Xiao, Dezun Dong, Xingyun Qi: A Scalable and Resilient Microarchitecture Based on Multiport Binding for High-Radix Router Design. 429-438
- Nikhil Jain, Abhinav Bhatele, Xiang Ni, Todd Gamblin, Laxmikant V. Kalé: Partitioning Low-Diameter Networks to Eliminate Inter-Job Interference. 439-448
Session 12: Libraries & Frameworks
- Jan Wroblewski, Kazuaki Ishizaki, Hiroshi Inoue, Moriyoshi Ohara: Accelerating Spark Datasets by Inlining Deserialization. 449-458
- Hong Zhang, Hai Huang, Liqiang Wang: MRapid: An Efficient Short Job Optimizer on Hadoop. 459-468
- Samuel K. Gutierrez, Kei Davis, Dorian C. Arnold, Randal S. Baker, Robert W. Robey, Patrick S. McCormick, Daniel Holladay, Jon A. Dahl, R. Joe Zerr, Florian Weik, Christoph Junghans: Accommodating Thread-Level Heterogeneity in Coupled Parallel Applications. 469-478
- Yuechao Pan, Yangzihao Wang, Yuduo Wu, Carl Yang, John D. Owens: Multi-GPU Graph Analytics. 479-490
Industry Tutorial
- Julie Bernauer: NVIDIA Deep Learning Tutorial. 491
Keynote 2
- Mark Seager: A Scalable System Architecture to Addressing the Next Generation of Predictive Simulation Workflows with Coupled Compute and Data Intensive Applications. 492
Session 13: Motion Planning & Similarity Search
- Sergio Rajsbaum, Armando Castañeda, David Flores-Peñaloza, Manuel Alcantara: Fault-Tolerant Robot Gathering Problems on Graphs With Arbitrary Appearing Times. 493-502
- Akhil Krishnan, Mikhail Markov, Borzoo Bonakdarpour: Distributed Vehicle Routing Approximation. 503-512
- Gokarna Sharma, Ramachandran Vaidyanathan, Jerry L. Trahan, Costas Busch, Suresh Rai: O(log N)-Time Complete Visibility for Asynchronous Robots with Lights. 513-522
- Vincent T. Lee, Justin Kotalik, Carlo C. del Mundo, Armin Alaghi, Luis Ceze, Mark Oskin: Similarity Search on Automata Processors. 523-534
Session 14: Applications
- Yulong Ao, Chao Yang, Xinliang Wang, Wei Xue, Haohuan Fu, Fangfang Liu, Lin Gan, Ping Xu, Wenjing Ma: 26 PFLOPS Stencil Computations for Atmospheric Modeling on Sunway TaihuLight. 535-544
- Bram Veenboer, Matthias Petschow, John W. Romein: Image-Domain Gridding on Graphics Processors. 545-554
- Beverly A. Sanders, Jason N. Byrd, Nakul Jindal, Victor F. Lotrich, Dmitry I. Lyakh, Ajith Perera, Rodney J. Bartlett: Aces4: A Platform for Computational Chemistry Calculations with Extremely Large Block-Sparse Arrays. 555-564
- Shun Yao, Dantong Yu: PhiOpenSSL: Using the Xeon Phi Coprocessor for Efficient Cryptographic Calculations. 565-574
Session 15: Tools 2
- Xuewen Cui, Thomas R. W. Scogland, Bronis R. de Supinski, Wu-chun Feng: Directive-Based Partitioning and Pipelining for Graphics Processing Units. 575-584
- Xiaoqing Luo, Frank Mueller, Philip H. Carns, Jonathan Jenkins, Robert Latham, Robert B. Ross, Shane Snyder: ScalaIOExtrap: Elastic I/O Tracing and Extrapolation. 585-594
- Jen-Cheng Huang, Lifeng Nai, Pranith Kumar, Hyojong Kim, Hyesoon Kim: SimProf: A Sampling Framework for Data Analytic Workloads. 595-604
- Hao Wang, Jing Zhang, Da Zhang, Sarunya Pumma, Wu-chun Feng: PaPar: A Parallel Data Partitioning Framework for Big Data Applications. 605-614
Session 16: Data and Graph Analytics
- Jiarui Fang, Haohuan Fu, Wenlai Zhao, Bingwei Chen, Weijie Zheng, Guangwen Yang: swDNN: A Library for Accelerating Deep Learning Applications on Sunway TaihuLight. 615-624
- Md. Naim, Fredrik Manne, Mahantesh Halappanavar, Antonino Tumeo: Community Detection on the GPU. 625-634
- Heng Lin, Xiongchao Tang, Bowen Yu, Youwei Zhuo, Wenguang Chen, Jidong Zhai, Wanwang Yin, Weimin Zheng: Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores. 635-645
- George M. Slota, Sivasankaran Rajamanickam, Karen D. Devine, Kamesh Madduri: Partitioning Trillion-Edge Graphs in Minutes. 646-655
Session 17: Linear Algebra
- Jianyu Huang, Leslie Rice, Devin A. Matthews, Robert A. van de Geijn: Generating Families of Practical Fast Matrix Multiplication Algorithms. 656-667
- Mathieu Faverge, Julien Langou, Yves Robert, Jack J. Dongarra: Bidiagonalization and R-Bidiagonalization: Parallel Tiled Algorithms, Critical Paths and Distributed-Memory Implementation. 668-677
- Tobias Wicky, Edgar Solomonik, Torsten Hoefler: Communication-Avoiding Parallel Algorithms for Solving Triangular Systems of Linear Equations. 678-687
- Ariful Azad, Aydin Buluç: A Work-Efficient Parallel Sparse Matrix-Sparse Vector Multiplication Algorithm. 688-697
Session 18: Power Management
- Abdulaziz Tabbakh, Murali Annavaram, Xuehai Qian: Power Efficient Sharing-Aware GPU Data Management. 698-707
- Rahul Boyapati, Jiayi Huang, Ningyuan Wang, Kyung Hoon Kim, Ki Hwan Yum, Eun Jung Kim: Fly-Over: A Light-Weight Distributed Power-Gating Mechanism for Energy-Efficient Networks-on-Chip. 708-717
- Zhenhua Li, Yuanyuan Yang: RCube: A Power Efficient and Highly Available Network for Data Centers. 718-727
- Thang Cao, Wei Huang, Yuan He, Masaaki Kondo: Cooling-Aware Job Scheduling and Node Allocation for Overprovisioned HPC Systems. 728-737
Session 19: Scheduling
- Vincenzo Bonifaci, Gianlorenzo D'Angelo, Alberto Marchetti-Spaccamela: Algorithms for Hierarchical and Semi-Partitioned Parallel Scheduling. 738-747
- Odorico Machado Mendizabal, Ruda S. T. De Moura, Fernando Luís Dotti, Fernando Pedone: Efficient and Deterministic Scheduling for Parallel State Machine Replication. 748-757
- Guillaume Aupy, Clement Brasseur, Loris Marchal: Dynamic Memory-Aware Task-Tree Scheduling. 758-767
- Olivier Beaumont, Lionel Eyraud-Dubois, Suraj Kumar: Approximation Proofs of a Fast and Efficient List Scheduling Algorithm for Task-Based Runtime Systems on Multicores and GPUs. 768-777
Session 20: Code Optimization
- Philippe Clauss, Ervin Altintas, Matthieu Kuhn: Automatic Collapsing of Non-Rectangular Loops. 778-787
- Yonghong Yan, Jiawen Liu, Kirk W. Cameron, Mariam Umar: HOMP: Automated Distribution of Parallel Loops and Data in Highly Parallel Accelerator-Based Systems. 788-798
- Jaime Arteaga Molina, Stéphane Zuckerman, Guang R. Gao: Multigrain Parallelism: Bridging Coarse-Grain Parallel Programs and Fine-Grain Event-Driven Multithreading. 799-808
- Josep M. Pérez, Vicenç Beltran, Jesús Labarta, Eduard Ayguadé: Improving the Integration of Task Nesting and Dependencies in OpenMP. 809-818
Keynote 3
- Mateo Valero: Runtime Aware Architectures. 819
Best Papers
- Scott Beamer, Krste Asanovic, David A. Patterson: Reducing Pagerank Communication via Propagation Blocking. 820-831
- Michael G. Gowanlock, Cody M. Rude, David M. Blair, Justin D. Li, Victor Pankratius: Clustering Throughput Optimization on the GPU. 832-841
- Pablo Fuentes, Enrique Vallejo, Ramón Beivide, Cyriel Minkenberg, Mateo Valero: FlexVC: Flexible Virtual Channel Management in Low-Diameter Networks. 842-854
- Benjamin Klenk, Holger Fröning, Hans Eberle, Larry Dennison: Relaxations for High-Performance Message Passing on Massively Parallel SIMT Processors. 855-865
Session 21: Algorithms
- Reza Mokhtari, Michael Stumm: The SEPO Model of Computation to Enable Larger-Than-Memory Hash Tables for GPU-Accelerated Big Data Analytics. 866-875
- Wei Xie, Yong Chen: Elastic Consistent Hashing for Distributed Storage Systems. 876-885
- Chenhan D. Yu, William B. March, George Biros: An N log N Parallel Fast Direct Solver for Kernel Matrices. 886-896
- Pieter Ghysels, Xiaoye Sherry Li, Christopher Gorman, François-Henry Rouet: A Robust Parallel Preconditioner for Indefinite Systems Using Hierarchical Matrices and Randomized Sampling. 897-906
Session 22: Coordination
- Sergei Arnautov, Pascal Felber, Christof Fetzer, Bohdan Trach: FFQ: A Fast Single-Producer/Multiple-Consumer Concurrent FIFO Queue. 907-916
- Ivan Walulya, Philippas Tsigas: Scalable Lock-Free Vector with Combining. 917-926
- Wei-Lun Hung, Vijay K. Garg: Automatic-Signal Monitors with Multi-object Synchronization. 927-936
- Yujie An, Quentin F. Stout: Optimal Algorithms for a Mesh-Connected Computer with Limited Additional Global Bandwidth. 937-946
Session 23: Power Management 2
- Sridutt Bhalachandra, Allan Porterfield, Stephen L. Olivier, Jan F. Prins: An Adaptive Core-Specific Runtime for Energy Efficiency. 947-956
- Ryuichi Sakamoto, Thang Cao, Masaaki Kondo, Koji Inoue, Masatsugu Ueda, Tapasya Patki, Daniel A. Ellsworth, Barry Rountree, Martin Schulz: Production Hardware Overprovisioning: Real-World Performance Optimization Using an Extensible Power-Aware Resource Management Framework. 957-966
- Qi Zhu, Bo Wu, Xipeng Shen, Li Shen, Zhiying Wang: Co-Run Scheduling with Power Cap on Integrated CPU-GPU Systems. 967-977
- Vignesh Adhinarayanan, Wu-chun Feng, David H. Rogers, James P. Ahrens, Scott Pakin: Characterizing and Modeling Power and Energy for Extreme-Scale In-Situ Visualization. 978-987
Session 24: MPI
- Wim Lavrijsen, Costin Iancu: Application Level Reordering of Remote Direct Memory Access Operations. 988-997
- Sergio M. Martin, Marsha J. Berger, Scott B. Baden: Toucan - A Translator for Communication Tolerant MPI Applications. 998-1007
- Yanfei Guo, Charles J. Archer, Michael Blocksome, Scott Parker, Wesley Bland, Ken Raffenetti, Pavan Balaji: Memory Compression Techniques for Network Address Management in MPI. 1008-1017
- Salvatore Di Girolamo, Flavio Vella, Torsten Hoefler: Transparent Caching for RMA Systems. 1018-1027
Session 25: ML & Tensors
- El Mahdi El Mhamdi, Rachid Guerraoui: When Neurons Fail. 1028-1037
- Venkatesan T. Chakaravarthy, Jee W. Choi, Douglas J. Joseph, Xing Liu, Prakash Murali, Yogish Sabharwal, Dheeraj Sreedhar: On Optimizing Distributed Tucker Decomposition for Dense Tensors. 1038-1047
- Jiajia Li, Jee Choi, Ioakeim Perros, Jimeng Sun, Richard W. Vuduc: Model-Driven Sparse CP Decomposition for Higher-Order Tensors. 1048-1057
- Shaden Smith, Jongsoo Park, George Karypis: Sparse Tensor Factorization on Many-Core Processors with High-Bandwidth Memory. 1058-1067
Session 26: Resource Management
- Ali Pourmiri, Mahdi Jafari Siavoshani, Seyed Pooya Shariatpanahi: Proximity-Aware Balanced Allocations in Cache Networks. 1068-1077
- Wei Chen, Jia Rao, Xiaobo Zhou: Addressing Performance Heterogeneity in MapReduce Clusters with Elastic Tasks. 1078-1087
- Masahiro Tanaka, Kenjiro Taura, Kentaro Torisawa: Autonomic Resource Management for Program Orchestration in Large-Scale Data Analysis. 1088-1097
- Tao Gao, Yanfei Guo, Boyu Zhang, Pietro Cicotti, Yutong Lu, Pavan Balaji, Michela Taufer: Mimir: Memory-Efficient and Scalable MapReduce for Large Supercomputing Systems. 1098-1108
Session 27: Compression & Memoization
- Bo Mao, Hong Jiang, Suzhen Wu, Yaodong Yang, Zaifa Xi: Elastic Data Compression with Improved Performance and Space Efficiency for Flash-Based Storage Systems. 1109-1118
- Sohan Lal, Jan Lucas, Ben H. H. Juurlink: E^2MC: Entropy Encoding Based Memory Compression for GPUs. 1119-1128
- Dingwen Tao, Sheng Di, Zizhong Chen, Franck Cappello: Significantly Improving Lossy Compression for Scientific Data Sets Based on Multidimensional Prediction and Error-Controlled Quantization. 1129-1139
- Iulian Brumar, Marc Casas, Miquel Moretó, Mateo Valero, Gurindar S. Sohi: ATM: Approximate Task Memoization in the Runtime System. 1140-1150
Session 28: Persistent Memory
- Jungwon Kim, Kittisak Sajjapongse, Seyong Lee, Jeffrey S. Vetter: Design and Implementation of Papyrus: Parallel Aggregate Persistent Storage. 1151-1162
- Joel Edward Denny, Seyong Lee, Jeffrey S. Vetter: Language-Based Optimizations for Persistence on Nonvolatile Main Memory Systems. 1163-1173
- Teng Wang, Adam Moody, Yue Zhu, Kathryn M. Mohror, Kento Sato, Tanzima Z. Islam, Weikuan Yu: MetaKV: A Key-Value Store for Metadata Management of Distributed Burst Buffers. 1174-1183
- Jiayang Guo, Yiming Hu, Bo Mao, Suzhen Wu: Parallelism and Garbage Collection Aware I/O Scheduler with Improved SSD Performance. 1184-1193















