20th PPOPP 2015: San Francisco, CA, USA
- Albert Cohen, David Grove:

Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, San Francisco, CA, USA, February 7-11, 2015. ACM 2015, ISBN 978-1-4503-3205-7
Concurrency
- Vincent Gramoli:

More than you ever wanted to know about synchronization: synchrobench, measuring the impact of the synchronization on concurrent algorithms. 1-10
- Dan Alistarh, Justin Kopinsky, Jerry Li, Nir Shavit:

The SprayList: a scalable relaxed priority queue. 11-20
- Maya Arbel, Adam Morrison:

Predicate RCU: an RCU for scalable concurrent updates. 21-30
- Guy Golan-Gueta, G. Ramalingam, Mooly Sagiv, Eran Yahav:

Automatic scalable atomicity via semantic locking. 31-41
Code Generation
- Austin R. Benson, Grey Ballard:

A framework for practical parallel fast matrix multiplication. 42-53
- Aravind Acharya, Uday Bondhugula:

PLUTO+: near-complete modeling of affine transformations for parallelism and locality. 54-64
- Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan:

Distributed memory code generation for mixed Irregular/Regular computations. 65-75
Transactional Memory
- Lingxiang Xiang, Michael L. Scott:

Software partitioning of hardware transactions. 76-86
- Alexandro Baldassin, Edson Borin, Guido Araujo:

Performance implications of dynamic memory allocators on transactional memory systems. 87-96
- Minjia Zhang, Jipeng Huang, Man Cao, Michael D. Bond:

Low-overhead software transactional memory with progress guarantees and strong semantics. 97-108
Large Scale Parallelism
- Milind Chabbi, Wim Lavrijsen, Wibe de Jong, Koushik Sen, John M. Mellor-Crummey, Costin Iancu:

Barrier elision for production parallel programs. 109-119
- Loïc Thébault, Eric Petit, Quang Dinh:

Scalable and efficient implementation of 3d unstructured meshes computation: a case study on matrix assembly. 120-129
- Nathan R. Tallent, Abhinav Vishnu, Hubertus Van Dam, Jeff Daily, Darren J. Kerbyson, Adolfy Hoisie:

Diagnosing the causes and severity of one-sided message contention. 130-139
Verification and Accelerators
- Yen-Jung Chang, Vijay K. Garg:

A parallel algorithm for global states enumeration in concurrent systems. 140-149
- Tiago Cogumbreiro, Raymond Hu, Francisco Martins, Nobuko Yoshida:

Dynamic deadlock verification for general barrier synchronisation. 150-160
- Yi-Ping You, Hen-Jung Wu, Yeh-Ning Tsai, Yen-Ting Chao:

VirtCL: a framework for OpenCL device abstraction and management. 161-172
- Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, P. Sadayappan:

On optimizing machine learning workloads via kernel fusion. 173-182
Algorithms
- Kaiyuan Zhang, Rong Chen, Haibo Chen:

NUMA-aware graph-structured analytics. 183-193
- Chenning Xie, Rong Chen, Haibing Guan, Binyu Zang, Haibo Chen:

SYNC or ASYNC: time to fuse for distributed graph-parallel computation. 194-204
- Yuan Tang, Ronghui You, Haibin Kan, Jesmin Jahan Tithi, Pramod Ganapathi, Rezaul Alam Chowdhury:

Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency. 205-214
Locking and Locality
- Milind Chabbi, Michael W. Fagan, John M. Mellor-Crummey:

High performance locks for multi-level NUMA systems. 215-226
- Zoltan Majó, Thomas R. Gross:

A library for portable and composable data locality optimizations for NUMA systems. 227-238
- Abdelhalim Amer, Huiwei Lu, Yanjie Wei, Pavan Balaji, Satoshi Matsuoka:

MPI+Threads: runtime contention and remedies. 239-248
Poster Abstracts
- Andrew J. McPherson, Vijay Nagarajan, Susmit Sarkar, Marcelo Cintra:

Fence placement for legacy data-race-free programs via synchronization read detection. 249-250
- Xianglan Piao, Channoh Kim, Younghwan Oh, Huiying Li, Jincheon Kim, Hanjun Kim, Jae W. Lee:

JAWS: a JavaScript framework for adaptive CPU-GPU work sharing. 251-252
- Hyunseok Seo, Jinwook Kim, Min-Soo Kim:

GStream: a graph streaming processing method for large-scale graphs on GPUs. 253-254
- Nabeel AlSaber, Milind Kulkarni:

SemCache++: semantics-aware caching for efficient multi-GPU offloading. 255-256
- Jungwon Kim, Seyong Lee, Jeffrey S. Vetter:

An OpenACC-based unified programming model for multi-accelerator systems. 257-258
- Paul Thomson, Alastair F. Donaldson:

The lazy happens-before relation: better partial-order reduction for systematic concurrency testing. 259-260
- Azzam Haidar, Tingxing Dong, Piotr Luszczek, Stanimire Tomov, Jack J. Dongarra:

Towards batched linear solvers on accelerated hardware platforms. 261-262
- Saurav Muralidharan, Michael Garland, Bryan Catanzaro, Albert Sidelnik, Mary W. Hall:

A collection-oriented programming model for performance portability. 263-264
- Yangzihao Wang, Andrew A. Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, John D. Owens:

Gunrock: a high-performance graph processing library on the GPU. 265-266
- Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Nancy M. Amato:

Decoupled load balancing. 267-268
- Ye Jin, Mingliang Liu, Xiaosong Ma, Qing Liu, Jeremy Logan, Norbert Podhorszki, Jong Youl Choi, Scott Klasky:

Combining phase identification and statistic modeling for automated parallel benchmark generation. 269-270
- Xuanhua Shi, Junling Liang, Sheng Di, Bingsheng He, Hai Jin, Lu Lu, Zhixiang Wang, Xuan Luo, Jianlong Zhong:

Optimization of asynchronous graph processing on GPU with hybrid coloring model. 271-272
- Scott West, Sebastian Nanz, Bertrand Meyer:

Efficient and reasonable object-oriented concurrency. 273-274
- Vassilis Vassiliadis, Konstantinos Parasyris, Charalambos Chalios, Christos D. Antonopoulos, Spyros Lalis, Nikolaos Bellas, Hans Vandierendonck, Dimitrios S. Nikolopoulos:

A programming model and runtime system for significance-aware energy-efficient computing. 275-276
- Martin Wimmer, Jakob Gruber, Jesper Larsson Träff, Philippas Tsigas:

The lock-free k-LSM relaxed priority queue. 277-278
- Emmanuelle Saillard, Patrick Carribault, Denis Barthou:

Static/Dynamic validation of MPI collective communications in multi-threaded context. 279-280
- Arunmoezhi Ramachandran, Neeraj Mittal:

CASTLE: fast concurrent internal binary search tree using edge-based locking. 281-282
- Madan Mohan Das, Gabriel Southern, Jose Renau:

Section based program analysis to reduce overhead of detecting unsynchronized thread communication. 283-284
- Harshvardhan, Nancy M. Amato, Lawrence Rauchwerger:

A hierarchical approach to reducing communication in parallel graph algorithms. 285-286
- Yifeng Chen, Xiang Cui, Hong Mei:

Tiles: a new language mechanism for heterogeneous parallelism. 287-288
- Cosmin Radoi, Stephan Herhut, Jaswanth Sreeram, Danny Dig:

Are web applications ready for parallelism? 289-290
