


SC 2025: St. Louis, MO, USA
- Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2025, St. Louis, MO, USA, November 16-21, 2025. ACM 2025, ISBN 979-8-4007-1466-5

ACM Gordon Bell Finalist
- Nicolas Vetsch, Alexander Maeder, Vincent Maillou, Anders Winka, Jiang Cao, Grzegorz Kwasniewski, Leonard Deuschle, Torsten Hoefler, Alexandros Nikolaos Ziogas, Mathieu Luisier:
Ab-initio Quantum Transport with the GW Approximation, 42,240 Atoms, and Sustained Exascale Performance. 1-13
- Benjamin Wilfong, Anand Radhakrishnan, Henry Le Berre, Daniel Vickers, Tanush Prathi, Nikolaos Tselepidis, Benedikt Dorschner, Reuben D. Budiardja, Brian Cornille, Stephen Abbott, Florian Schäfer, Spencer H. Bryngelson:
Simulating many-engine spacecraft: Exceeding 1 quadrillion degrees of freedom via information geometric regularization. 14-24
- Nicholas Frontiere, J. D. Emberson, Michael Buehlmann, Esteban M. Rangel, Salman Habib, Katrin Heitmann, Patricia Larsen, Vitali A. Morozov, Adrian Pope, Claude-André Faucher-Giguère, Antigoni Georgiadou, Damien Lebrun-Grandié, Andrey Prokopenko:
Cosmological Hydrodynamics at Exascale: A Trillion-Particle Leap in Capability. 25-35
- Taufeq Mohammed Razakh, Thomas Linker, Ye Luo, Nariman Piroozan, Simon John Pennycook, Nalini Kumar, Albert Musaelian, Anders Johansson, Boris Kozinsky, Rajiv K. Kalia, Priya Vashishta, Fuyuki Shimojo, Shinnosuke Hattori, Ken-ichi Nomura, Aiichiro Nakano:
Multiscale Light-Matter Dynamics in Quantum Materials: From Electrons to Topological Superlattices. 36-47
- Benran Zhang, Daniel Weinberg, Chih-En Hsu, Aaron R. Altman, Yuming Shi, James B. White, Derek Vigil-Fowler, Steven G. Louie, Jack R. Deslippe, Felipe H. da Jornada, Zhenglu Li, Mauro Del Ben:
Advancing Quantum Many-Body GW Calculations on Exascale Supercomputing Platforms. 48-59
- Stefan Henneking, Sreeram Venkat, Veselin Dobrev, John Camier, Tzanio V. Kolev, Milinda Fernando, Alice-Agnes Gabriel, Omar Ghattas:
Real-Time Bayesian Inference at Extreme Scale: A Digital Twin for Tsunami Early Warning Applied to the Cascadia Subduction Zone. 60-71
ACM Gordon Bell Climate Modeling Finalist
- Väinö Hatanpää, Eugene Ku, Jason Stock, Murali Emani, Sam Foreman, Chunyong Jung, Sandeep Madireddy, Tung Nguyen, Varuni Sastry, Ray A. O. Sinurat, Huihuo Zheng, Sam Wheeler, Troy Arcomano, Venkatram Vishwanath, Rao Kotamarthi:
AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions. 72-85
- Xiao Wang, Jong-Youl Choi, Takuya Kurihana, Isaac Lyngaas, Hong-Jun Yoon, Xi Xiao, David Pugmire, Ming Fan, Nasik Muhammad Nafi, Aristeidis Tsaris, Ashwin M. Aji, Maliha Hossain, Mohamed Wahib, Dali Wang, Peter E. Thornton, Prasanna Balaprakash, Moetasim Ashfaq, Dan Lu:
ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling. 86-98
- Ioan Hadade, Daniel Klocke, Jussi Enkovaara, Tuomas Lunttila, Thomas Rackow, Jan Frederik Engels, Claudia Frauen, René Redler, Jenni Kontkanen, Thomas Jung, Dmitry Sein, Irina Sandu, Balthasar Reuter, Nils Wedi, Sebastian Milinski, Francisco Doblas-Reyes, Miguel Castrillo, Mario C. Acosta, Sergi Girona, Pekka Manninen:
Destination Earth: The Climate Change Adaptation Digital Twin. 99-110
- Kai Xu, Maoxue Yu, Yuhu Chen, Jie Gao, Shuang Wang, Jiaying Song, Xiaohui Duan, Junwei Wei, Jiangfeng Yu, Hailong Liu, Jinrong Jiang, Yi Zhang, Pengfei Lin, Tianyi Wang, Pengfei Wang, Weipeng Zheng, Jingwei Xie, Jiakang Zhang, Zilu Liu, Xiaoyu Jin, Jilin Wei, Qixin Chang, Qingxia Lin, Yanzhi Zhou, Weiguo Liu, Wei Xue, Yiwen Li, Haohuan Fu, Yue Yu, Xuebin Chi, Lixin Wu:
Kilometer-Scale AI-Powered and Performance-Portable Earth System Model (AP3ESM) to Achieve Year-Scale Simulation Speed on Heterogeneous Supercomputers. 111-124
- Daniel Klocke, Claudia Frauen, Jan Frederik Engels, Dmitry Alexeev, René Redler, Reiner Schnur, Helmuth Haak, Luis Kornblueh, Nils Brüggemann, Fatemeh Chegini, Manoel Römmer, Lars Hoffmann, Sabine Griessbach, Mathis Bode, Jonathan Coles, Miguel Gila, William Sawyer, Alexandru Calotoiu, Yakup Budanaz, Pratyai Mazumder, Marcin Copik, Benjamin Weber, Andreas Herten, Hendryk Bockelmann, Torsten Hoefler, Cathy Hohenegger, Bjorn Stevens:
Computing the Full Earth System at 1km Resolution. 125-136
Auto-Tuning, Compilation, and Code Generation
- Andrei Ivanov, Siyuan Shen, Gioele Gottardo, Marcin Chrapek, Afif Boudaoud, Timo Schneider, Luca Benini, Torsten Hoefler:
PerfDojo: Automated ML Library Generation for Heterogeneous Architectures. 137-151
- Doru-Thom Popovici, Botao Wu, John Shalf, Martin Kong:
Automatic Generation of Mappings for Distributed Fourier Operations. 152-166
- Yangjie Zhou, Honglin Zhu, Qian Qiu, Weihao Cui, Zihan Liu, Peng Chen, Mohamed Wahib, Cong Guo, Siyuan Feng, Jintao Meng, Haidong Lan, Jingwen Leng, Yun Lin, Jin Song Dong, Wenxi Zhu, Minwen Deng:
A Sample-Free Compilation Framework for Efficient Dynamic Tensor Computation. 167-184
- Xinxin Qi, Jianbin Fang, Peng Zhang, Yonggang Che, Jie Ren:
Constraint-Driven Auto-Tuning of GEMM-like Operators for MT-3000 Many-core Processor. 185-199
Graph Neural Networks and Training
- Aditya K. Ranjan, Siddharth Singh, Cunyang Wei, Abhinav Bhatele:
Plexus: Taming Billion-edge Graphs with 3D Parallel Full-graph GNN Training. 200-216
- Seth Ockerman, Amal Gueroudji, Tanwi Mallick, Yixuan He, Line Pouchard, Robert B. Ross, Shivaram Venkataraman:
PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training. 217-236
- Hui Yu, Yu Zhang, Ligang He, Bing Peng, Jin Zhao, Zixiao Wang, Hao Qi, Hai Jin:
TaGNN: An Efficient Topology-aware Accelerator for High-performance Dynamic Graph Neural Network. 237-249
- Zuocheng Shi, Jie Sun, Ziyu Song, Mo Sun, Yang Xiao, Fei Wu, Zeke Wang:
Moment: Co-optimizing Physical Communication Topology and Data Placement for Multi-GPU Out-of-core GNN Training. 250-264
Performance: Benchmarks and Optimization
- Bin Ma, Viktor Nikitin, Xi Wang, Tekin Bicer, Dong Li:
mLR: Scalable Laminography Reconstruction based on Memoization. 265-280
- Aditya Kashi, Nicholson Koukpaizan, Hao Lu, Michael A. Matheson, Sarp Oral, Feiyi Wang:
Scaling the memory wall using mixed-precision - HPG-MxP on an exascale machine. 281-297
- James D. Trotter, Sinan Ekmekçibasi, Dogan Sagbili, Johannes Langguth, Xing Cai, Didem Unat:
CPU- and GPU-initiated Communication Strategies for Conjugate Gradient Methods on Large GPU Clusters. 298-315
- Shaokang Du, Kelun Lei, Xin You, Hailong Yang, Yufan Xu, Zhongzhi Luan, Yi Liu, Depei Qian:
Zero-Value Code Specialization via Profile-Guided Control Data Flow Analysis. 316-330
Performance: Analysis Tools
- Philipp Schaad, Tal Ben-Nun, Torsten Hoefler:
C.A.T.S.: Memory and Control Flow Tracing for Whole-Program Performance Analysis. 331-348
- Siyuan Shen, Tommaso Bonato, Zhiyi Hu, Pasquale Jordan, Tiancheng Chen, Torsten Hoefler:
ATLAHS: An Application-centric Network Simulator Toolchain for AI, HPC, and Distributed Storage. 349-367
- Yanbo Zhao, Yueming Hao, Zecheng Li, Shuyin Jiao, Xu Liu, Jiajia Li:
RedSan: A Redundant Memory Instruction Sanitizer for GPU Programs. 368-382
- Yuyang Jin, Xirui Shui, Mingshu Zhai, Zan Zong, Feng Zhang, Felix Wolf, Jidong Zhai:
TraceFlow: Efficient Trace Analysis for Large-Scale Parallel Applications via Interaction Pattern-Aware Trace Distribution. 383-396
State of the Practice
- Hao Lu, Michael A. Matheson, Noel Chalmers, Aditya Kashi, Nicholas Malaya, Feiyi Wang:
Insights from Optimizing HPL Performance on Exascale Systems: A Comparative Analysis of Panel Factorization. 397-410
- Edgar A. León, Joseph Glenski, Mark J. Stock, Kim H. McMahon, William Loewe, Clark Snyder, Larry Kaplan, Srinath Vadlamani, Timothy I. Mattox, Trent D'Hooge, Brian Behlendorf, Nathan Hanford, Ramesh Pankajakshan, Matthew L. Leininger:
Breaking the System Noise Barrier at Exascale. 411-436
- Valérie Hayot-Sasson, Nathaniel Hudson, André Bauer, Maxime Gonthier, Ian T. Foster, Kyle Chard:
Addressing Reproducibility Challenges in HPC with Continuous Integration. 437-457
- Pedro Valero-Lara, Aaron R. Young, Jeffrey S. Vetter, Zheming Jin, Swaroop Pophale, Mohammad Alaul Haque Monil, Keita Teranishi, William F. Godoy:
ChatHPC: Building the Foundations for a Productive and Trustworthy AI-Assisted HPC Ecosystem. 458-474
System Software and Cloud Computing: Resource Utilization
- Rohan Basu Roy, Tirthak Patel, Baolin Li, Siddharth Samsi, Vijay Gadepally, Devesh Tiwari:
GreenMix: Energy-Efficient Serverless Computing via Randomized Sketching on Asymmetric Multi-Cores. 475-489
- Nathan Jones, Tyler N. Allen, Rong Ge:
HELM: Characterizing Unified Memory Accesses to Improve GPU Performance under Memory Oversubscription. 490-504
- Zhong Zheng, Seyfal Sultanov, Michael E. Papka, Zhiling Lan:
Minimizing Power Waste in Heterogenous Computing via Adaptive Uncore Scaling. 505-518
- Bowen Zhang, Yuhang Wang, Zhuozhao Li:
BOER: Enhancing Resource Utilization for Deep Learning Inference with Hybrid Spatial GPU Sharing. 519-532
Containerization and Software Deployment
- Marcin Copik, Eiman Alnuaimi, Alok Kamatar, Valérie Hayot-Sasson, Alberto Madonna, Todd Gamblin, Kyle Chard, Ian T. Foster, Torsten Hoefler:
XaaS Containers: Performance-Portable Representation With Source and IR Containers. 533-555
- John Gouwar, Gregory Becker, Tamara Dahlgren, Nathan Hanford, Arjun Guha, Todd Gamblin:
Bridging the Gap Between Binary and Source Based Package Management in Spack. 556-569
- Hao Fan, Zhuo Huang, Shadi Ibrahim, Lin Gu, Song Wu:
EDDE: Container Deployment Framework Beyond the Cloud. 570-585
- Yuhao Gu, Haoquan Chen, Xianjie Chen, Jiangsu Du, Zhiguang Chen, Nong Xiao, Xianwei Zhang, Yutong Lu:
coMtainer: Compilation-assisted HPC Container Images with Enhanced Adaptability. 586-601
Performance: Sparse Matrix and Tensor Computation
- Da Ma, Khalid Ahmad, Kazem Cheshmi, Hari Sundar, Mary W. Hall:
Sparsified Preconditioned Conjugate Gradient Solver on GPUs. 602-616
- Saurabh Raje, Hunter McCoy, Atanas Rountev, Prashant Pandey, P. Sadayappan:
FaSTCC: Fast Sparse Tensor Contractions on CPUs. 617-630
- Weidong He, Haikun Liu, Zhuohui Duan, Xiaofei Liao, Shuhao Zhang, Fubing Mao, Hai Jin:
StraGCN: GPU-Accelerated Strassen's Sparse-Dense Matrix Multiplication for Graph Convolutional Network Training. 631-644
- Yukang Dong, Ziyuan Shen, Wenbin Jiang, Zhenghang Liu, Ye Xu, Bingyi He, Ran Zheng, Hai Jin:
Bridging the Gap between Unstructured SpMM and Structured Sparse Tensor Cores. 645-660
Precision and Real Number Representations
- Faveo Hoerold, Ivan R. Ivanov, Akash Dhruv, William S. Moses, Anshu Dubey, Mohamed Wahib, Jens Domke:
RAPTOR: Practical Numerical Profiling of Scientific Applications. 661-680
- Laslo Hunhold, James Quinlan, Stefan Wesner:
Numerical Performance of the Implicitly Restarted Arnoldi Method in OFP8, Bfloat16, Posit, and Takum Arithmetics. 681-694
- David Kai Zhang, Alex Aiken:
High-Performance Branch-Free Algorithms for Extended-Precision Floating-Point Arithmetic. 695-710
- Kengo Suzuki, Takeshi Iwashita:
A Nested Krylov Method Using Half-Precision Arithmetic. 711-727
Quantum Computing and Simulation
- Emmanouil Giortamis, Francisco Romão, Nathaniel Tornow, Dmitry Lugovoy, Pramod Bhatotia:
Qonductor: A Cloud Orchestrator for Quantum Computing. 728-745
- Yuqi Zhang, Yuxin Yang, Cheng-Chang Lu, Weiwen Jiang, Feixiong Cheng, Bo Fang, Qiang Guan:
QDockBank: A dataset for Ligand Docking on Protein Fragments Predicted on Utility-Level Quantum Computers. 746-761
- Taylor Lee Patti, Thien Nguyen, Justin Gage Lietz, Alex McCaskey, Brucek Khailany:
Augmenting Simulated Noisy Quantum Data Collection by Orders of Magnitude Using Pre-Trajectory Sampling with Batched Execution. 762-773
- Longshan Xu, Edwin Hsing-Mean Sha, Xiulin Cui, Qingfeng Zhuge:
Optimizing Quantum Circuit Mapping to Reduce Inter-Module Communications in Distributed Architectures. 774-788
Architectures and Networks: Hashing, Indexing, and Nearest Neighbor Search
- Sitian Chen, Amelie Chi Zhou, Yucheng Shi, Yusen Li, Xin Yao:
UpANNS: Enhancing Billion-Scale ANNS Efficiency with Real-World PIM Architecture. 789-804
- Zixiang Yu, Guangyang Deng, Zhirong Shen, Qiangsheng Su, Ronglong Wu, Xiaoli Wang, Quanqing Xu, Chuanhui Yang, Zhifeng Bao:
MetoHash: A Memory-Efficient and Traffic-Optimized Hashing Index on Hybrid PMem-DRAM Memories. 805-819
- Mingkai Chen, Tianhua Han, Cheng Liu, Shengwen Liang, Kuai Yu, Lei Dai, Ziming Yuan, Ying Wang, Lei Zhang, Huawei Li, Xiaowei Li:
DRIM-ANN: An Approximate Nearest Neighbor Search Engine based on Commercial DRAM-PIMs. 820-836
- Yanhao Li, Zijun Xu, Xuanjun Wen, Yanjie Song, Guancheng Li, Shu Yin:
Optimizing Data Acquisitions in Multi-Robot Systems. 837-854
Energy, Power, and Sustainability
- Yankai Jiang, Raghavendra Kanakagiri, Rohan Basu Roy, Devesh Tiwari:
ThirstyFLOPS: Water Footprint Modeling and Analysis Toward Sustainable HPC Systems. 855-869
- Alok Kamatar, Maxime Gonthier, Valérie Hayot-Sasson, André Bauer, Marcin Copik, Raul Castro Fernandez, Torsten Hoefler, Kyle Chard, Ian T. Foster:
Core Hours and Carbon Credits: Incentivizing Sustainability in HPC. 870-887
- Oscar Antepara, Zhengji Zhao, Brian Austin, Nan Ding, Leonid Oliker, Nicholas J. Wright, Samuel Williams:
Benchmark-driven Models for Energy Analysis and Attribution of GPU-Accelerated Supercomputing. 888-904
- Bagus Hanindhito, Bhavesh Patel:
Characterizing Performance, Power, and Energy of AMD CDNA3 GPU Family. 905-934
Machine Learning: Methods
- Aristeidis Tsaris, Isaac Lyngaas, John H. Lagergren, Mohamed Wahib, Larry M. York, Prasanna Balaprakash, Dan Lu, Feiyi Wang, Xiao Wang:
Distributed Cross-Channel Hierarchical Aggregation for Foundation Models. 935-948
- Lisa Gaedke-Merzhäuser, Vincent Maillou, Fernando Rodriguez Avellaneda, Olaf Schenk, Paula Moraga, Mathieu Luisier, Alexandros Nikolaos Ziogas, Håvard Rue:
Accelerated Spatio-Temporal Bayesian Modeling for Multivariate Gaussian Processes. 949-972
- Siqi Wang, Hailong Yang, Xuezhu Wang, Tongxuan Liu, Pengbo Wang, Yufan Xu, Xuning Liang, Kejie Ma, Tianyu Feng, Xin You, Ruihao Gong, Rui Wang, Zhongzhi Luan, Yi Liu, Depei Qian:
Towards Efficient LLM Inference via Collective and Adaptive Speculative Decoding. 973-990
- Shixun Wu, Yujia Zhai, Huangliang Dai, Yue Zhu, Haiyang Hu, Zizhong Chen:
TurboFNO: High-Performance Fourier Neural Operator with Fused FFT-GEMM-iFFT on GPU. 991-1005
Programming Frameworks
- Muhammad Usman, Mariano Benito, Sergio Iserte, Antonio J. Peña:
ODOS-MPI: HPC-Friendly SmartNIC Offloading of Computation/Communication Kernels. 1006-1027
- Zhuoping Yang, Jinming Zhuang, Xingzhen Chen, Alex K. Jones, Peipei Zhou:
AGILE: Lightweight and Efficient Asynchronous GPU-SSD Integration. 1028-1042
- Jiakun Yan, Marc Snir:
LCI: a Lightweight Communication Interface for Efficient Asynchronous Multithreaded Communication. 1043-1059
- Zhiheng Lin, Ke Meng, Changjie Xu, Weichen Cao, Guangming Tan:
COSMOS: Performance Portable Graph Pattern Matching with Domain-Specific Software Distributed Shared Memory. 1060-1072
Anomaly Detection, Failure Management, and Resilience 1
- Yonatan Levitt, Richard Barella, Sam Zeltner, Thomas Musta, Lance Cheney, Gustavo Espinosa, Olivier Franza, Balazs Gerofi:
Fine-grained Automated Failure Management for Extreme-Scale GPU Accelerated Systems. 1073-1084
- Huangliang Dai, Shixun Wu, Jiajun Huang, Zizhe Jian, Yue Zhu, Haiyang Hu, Zizhong Chen:
FT-Transformer: Resilient and Reliable Transformer with End-to-End Fault Tolerant Attention. 1085-1098
- Md Hasanur Rahman, Guanpeng Li:
Deploying Lightweight Input-Aware Selective Instruction Duplication in HPC Applications. 1099-1112
- Chenxuan Yao, Feifan Liu, Yuchong Hu, Zhengyu Liu, Xinjue Zheng, Wenxiang Zhou:
LowDiff: Efficient Frequent Checkpointing via Low-Cost Differential for High-Performance Distributed Training Systems. 1113-1126
Anomaly Detection, Failure Management, and Resilience 2
- Yu Sun, Zachary Coalson, Shiyang Chen, Hang Liu, Zhao Zhang, Sanghyun Hong, Bo Fang, Lishan Yang:
Demystifying the Resilience of Large Language Model Inference: An End-to-End Perspective. 1127-1144
- Shengkun Cui, Archit Patke, Hung Nguyen, Aditya Ranjan, Ziheng Chen, Phuong Cao, Gregory H. Bauer, Brett M. Bode, Catello Di Martino, Saurabh Jha, Chandra Narayanaswami, Daby Sow, Zbigniew T. Kalbarczyk, Ravishankar K. Iyer:
Story of Two GPUs: Characterizing the Resilience of Hopper H100 and Ampere A100 GPUs. 1145-1164
- Pengfei Yu, Jingjing Gu, Hao Han, Dazhong Shen, Bao Wen, Yang Liu:
Exploring and Mitigating Failure Behavior of Large Language Model Training Workloads in HPC Systems. 1165-1179
- Sibo Xia, Yongqian Sun, Xijie Pan, Yuan Yuan, Shenglin Zhang, Shaoyu Hu, Lei Tao, Yuqi Li, Jinghua Feng:
Effective Node-Level Anomaly Detection in HPC Systems via Coarse-Grained Clustering and Fine-Grained Model Sharing. 1180-1194
Architectures and Networks: Networking
- Tommaso Bonato, Sepehr Abdous, Abdul Kabbani, Ahmad Ghalayini, Nadeen Gebara, Terry Lam, Anup Agarwal, Tiancheng Chen, Zhuolong Yu, Konstantin Taranov, Mahmoud Elhaddad, Daniele De Sensi, Soudeh Ghorbani, Torsten Hoefler:
Uno: A One-Stop Solution for Inter- and Intra-Data Center Congestion Control and Reliable Connectivity. 1195-1210
- Zhenguo Wu, Benjamin Klenk, Larry Dennison, Keren Bergman:
ACTINA: Adapting Circuit-Switching Techniques for AI Networking Architectures. 1211-1222
- Mikhail Khalilov, Siyuan Shen, Marcin Chrapek, Tiancheng Chen, Kenji Nakano, Nicola Mazzoletti, Peter-Jan Gootzen, Salvatore Di Girolamo, Rami Nudelman, Gil Bloch, Jithin Jose, Abdul Kabbani, Sreevatsa Anantharamu, Jie Zhang, Konstantin Taranov, Zhuolong Yu, Scott Moe, Mahmoud Elhaddad, Torsten Hoefler:
SDR-RDMA: Software-Defined Reliability Architecture for Planetary Scale RDMA Communication. 1223-1239
- Giyong Jung, Saeid Gorgin, John Kim, Jungrae Kim:
Scaling Out Chip Interconnect Networks with Implicit Sequence Numbers. 1240-1251
Data Analytics, Visualization & Storage
- Chris Egersdoerfer, Philip H. Carns, Shane Snyder, Robert Ross, Dong Dai:
STELLAR: Storage Tuning Engine Leveraging LLM Autonomous Reasoning for High Performance Parallel File Systems. 1252-1266
- Jianqin Yan, Shi Qiu, Yina Lv, Yifan Hu, Hao Chen, Zhirong Shen, Xin Yao, Renhai Chen, Jiwu Shu, Gong Zhang, Yiming Zhang:
Phoenix: A Refactored I/O Stack for GPU Direct Storage without Phony Buffers. 1267-1283
- Hui Sun, Xiangxiang Jiang, Xiao Qin, Song Jiang, Enhui Wang:
gParaKV: A GPGPU-accelerated Key-Value Separation-based KV Store with Optimized Compaction and Garbage Collection. 1284-1298
- Wenjing Huang, Jinwu Yang, Shengquan Yin, Haoxu Li, Yida Gu, Zedong Liu, Xing Jing, Zheng Wei, Shiyuan Fu, Hao Hu, Guangming Tan, Dingwen Tao:
MANS: Efficient and Portable ANS Encoding for Multi-Byte Integer Data on CPUs and GPUs. 1299-1314
Machine Learning: Training at Scale 1
- Yueming Yuan, Ahan Gupta, Jianping Li, Sajal Dash, Feiyi Wang, Minjia Zhang:
X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms. 1315-1331
- Pradip Kunwar, Minh N. Vu, Maanak Gupta, Mahmoud Abdelsalam, Manish Bhattarai:
TT-LoRA MoE: Using Parameter-Efficient Fine-Tuning and Sparse Mixture-Of-Experts. 1332-1350
- Mohamed Wahib, Muhammed Abdullah Soyturk, Didem Unat:
Balanced and Elastic End-to-end Training of Dynamic LLMs. 1351-1367
- Adam Weingram, Duo Zhang, Zhonghao Chen, Hao Qi, Xiaoyi Lu:
HPC-R1: Characterizing R1-like Large Reasoning Models on HPC. 1368-1380
Machine Learning: Training at Scale 2
- Avinash Maurya, M. Mustafa Rafique, Franck Cappello, Bogdan Nicolae:
MLP-Offload: Multi-Level, Multi-Path Offloading for LLM Pre-training to Break the GPU Memory Wall. 1381-1394
- Junqi Yin, Mijanur Palash, Mallikarjun Shankar, Feiyi Wang:
RingX: Scalable Parallel Attention for Long-Context Learning on HPC. 1395-1408
- Zhouyang Li, Yuliang Liu, Wei Zhang, Tailing Yuan, Bin Chen, Chengru Song:
SlimPipe: Memory-Thrifty and Efficient Pipeline Parallelism for Long-Context LLM Training. 1409-1428
- Ao Sun, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun:
BurstEngine: An efficient distributed framework for training transformers On extremely Long sequences of over 1M tokens. 1429-1445
Algorithms: Sparse Matrix and Tensor Computation
- Abdullah Al Raqibul Islam, Helen Xu, Dong Dai, Aydin Buluç:
Improving SpGEMM Performance Through Matrix-Reordering and Cluster-wise Computation. 1446-1463
- Jakub Homola, Ondrej Meca, Lubomír Ríha, Tomás Brzobohatý:
Utilizing Sparsity in the GPU-accelerated Assembly of Schur Complement Matrices in Domain Decomposition Methods. 1464-1476
- Jie Ren, Tingxuan Zhong, Yuxi Hong, Guofeng Feng, Xincheng Wang, Weile Jia, Hatem Ltaief, David Elliot Keyes:
Caracal: A GPU-Resident Sparse LU Solver with Lightweight Fine-Grained Scheduling. 1477-1494
- Qi Li, Kun Li, Haozhi Han, Liang Yuan, Yunquan Zhang, Yifeng Chen, Junshi Chen, Hong An, Ting Cao, Mao Yang:
SparStencil: Retargeting Sparse Tensor Cores to Scientific Stencil Computations via Structured Sparsity Transformation. 1495-1509
Graph Processing and Pattern Matching
- Cameron Bradley, Ghadeer Ahmed H. Alabandi, Martin Burtscher:
Fringe-SGC: Counting Subgraphs with Fringe Vertices. 1510-1523
- Antonio De Caro, Gennaro Cordasco, Federico Ficarelli, Biagio Cosenza:
SIGMo: High-Throughput Batched Subgraph Isomorphism on GPUs for Molecular Matching. 1524-1538
- Xianghao Xu, Yucheng Zhang, Gongxuan Zhang, Yongli Cheng, Fang Wang:
Graphago: Accelerating SSD-based Graph Processing via Activity-Aware Graph Preprocessing. 1539-1552
- Long Deng, Yongkun Li, Zaigui Zhang, Yinlong Xu, John C. S. Lui:
Bubble: Towards Scalable Evolving Graph Processing via Mini-Batch Sorting. 1553-1571
Algorithms: Matrix Multiplication and GEMM Optimization
- Hemeng Wang, Yang Du, Sidu Li, Xiaowen Tian, Qingxiao Sun, Weifeng Liu:
KAMI: Communication-Avoiding General Matrix Multiplication within a Single GPU. 1572-1589
- Weihu Wang, Yaqi Xia, Donglin Yang, Xiaobo Zhou, Dazhao Cheng:
MXBLAS: Accelerating 8-bit Deep Learning with a Unified Micro-Scaled GEMM Library. 1590-1603
- Zheng Zhang, Hulin Wang, Hongming Xu, Donglin Yang, Xiaobo Zhou, Dazhao Cheng:
HyTiS: Hybrid Tile Scheduling for GPU GEMM with Enhanced Wave Utilization and Cache Locality. 1604-1618
- Huanqi Hu, Bowen Xiao, Shixuan Sun, Jianian Yin, Zhexi Zhang, Xiang Luo, Chengquan Jiang, Weiqi Xu, Xiaoying Jia, Xin Liu, Minyi Guo:
LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving. 1619-1630
Applications: Atomistic Modeling
- Yucheng Ouyang, Xin Chen, Ying Liu, Xin Chen, Honghui Shang, Zhenchuan Chen, Rongfen Lin, Xingyu Gao, Lifang Wang, Fang Li, Jiahao Shan, Haifeng Song, Huimin Cui, Xiaobing Feng, Jingling Xue:
TENSORMD: Accelerating Molecular Dynamics with a High-Performance Machine Learning Interatomic Potential. 1631-1645
- Bowen Kan, Yumeng Zhou, Daiyou Xie, Pengyu Zhou, Yunquan Zhang, Honghui Shang:
NNQS-SCI: Tackling Trillion-Dimensional Hilbert Space with Adaptive Neural Network Quantum States. 1646-1660
- Shunde Li, Zhijie Pan, Ningming Nie, Jue Wang, He Bai, Genshen Chu, Yan Zeng, Xinfu He, Yangang Wang, Changjun Hu, Xuebin Chi:
MISA-AKMC: Achieve Kinetic Monte Carlo Simulation of 20 Quadrillion Atoms on GPU Clusters. 1661-1675
Machine Learning: Inference and Serving
- Dimitrios Liakopoulos, Prasoon Sinha, Tianrui Hu, Myungjin Lee, Neeraja J. Yadwadkar:
MaverIQ: Fingerprint-Guided Extrapolation and Fragmentation-Aware Layering for Intent-Based LLM Serving. 1676-1696
- Sungin Hong, Hyunjun Kim, Hwansoo Han:
Compile-Time QoS Scheme for Deep Learning Inferences. 1697-1709
- Zizhao Mo, Jianxiong Liao, Huanle Xu, Zhi Zhou, Chengzhong Xu:
Hetis: Serving LLMs in Heterogeneous GPU Clusters with Fine-grained and Dynamic Parallelism. 1710-1724
- Tianyu Guo, Xianwei Zhang, Jiangsu Du, Zhiguang Chen, Nong Xiao, Yutong Lu:
gLLM: Global Balanced Pipeline Parallelism Systems for Distributed LLMs Serving with Token Throttling. 1725-1741
Scheduling, Tiling, and Parallelism
- Thomas Jakobsche, Fredrik Robertsén, Jessica R. Jones, Utz-Uwe Haus, Florina M. Ciorba:
SIREN: Software Identification and Recognition in HPC Systems. 1742-1754
- Shigang Li, Jingkun Dong, Jihao Chen, Zhi Ma, Zhongzhe Hu:
Hypertron: Efficiently Scaling Large Models by Exploring High-Dimensional Parallelization Space. 1755-1768
- Haoyu Yang, Zan Zong, Yuyang Jin, Kinman Lei, Jiaao He, Qigang Yang, Jidong Zhai:
UltraAttn: Efficiently Parallelizing Attention through Hierarchical Context-Tiling. 1769-1784
- Xinbiao Gan, Tiejun Li, Yiqi Wang, Qiang Zhang, Yongming Yi, Chunye Gong, Jie Liu, Kai Lu:
TianheEngine: Hierarchy-aware Adaptive Partitioning System for Trillion-scale Graph Processing. 1785-1799
Algorithms: Other Matrix and Tensor Methods
- João Pinheiro, Aditya Devarakonda, Grey Ballard:
Parallel Rank-Adaptive Higher Order Orthogonal Iteration. 1800-1815
- Han Huang, Jiabin Xie, Guangnan Feng, Xianwei Zhang, Dan Huang, Zhiguang Chen, Yutong Lu:
HStencil: Matrix-Vector Stencil Computation with Interleaved Outer Product and MLA. 1816-1829
- Hansheng Wang, Dajun Huang, Gaoyuan Zou, Lu Shi, Xu Jiang, Xi Wu, Hancong Duan, Shaoshuai Zhang:
Rethinking Back Transformation in 2-stage Eigenvalue Decomposition on Heterogeneous Architectures. 1830-1844
- Fan Yuan, Shengguo Li, Xiaojian Yang, Yunqing Huang, Hongxia Wang, Chuanfu Xu, Dezun Dong, Tiejun Li, Jianchun Wang, Jie Liu:
DAS-ILU: A Distributed Asynchronous Parallel ILU Factorization Based on Domain Decomposition. 1845-1858
Applications: Large-Scale Scientific Simulation
- Keiya Hirashima, Michiko S. Fujii, Takayuki R. Saitoh, Naoto Harada, Kentaro Nomura, Kohji Yoshikawa, Yutaka Hirai, Tetsuro Asano, Kana Moriwaki, Masaki Iwasawa, Takashi Okamoto, Junichiro Makino:
The First Star-by-star $N$-body/Hydrodynamics Simulation of Our Galaxy Coupling with a Surrogate Model. 1859-1873
- Junwei Feng, Junshi Chen, Xiangyu Zhang, Junhui Liu, Xinming Qin, Lingyun Wan, Sheng Chen, Wentiao Wu, Bingkun Hou, Yexuan Lin, Yihong Zhang, Zechuan Zhang, Yijun Hu, Weile Jia, Hong An, Jinlong Yang, Wei Hu:
Million-Atom Ab Initio Electron Dynamics: Discontinuous Galerkin Real-Time Time-Dependent Density Functional Theory. 1874-1887
- Zhuoqiang Guo, Runze Mao, Lijun Liu, Guangming Tan, Weile Jia, Zhi X. Chen:
Deep Learning-Enabled Supercritical Flame Simulation at Detailed Chemistry and Real-Fluid Accuracy Towards Trillion-Cell Scale. 1888-1900
Collective Operations and Communication
- Daniele De Sensi, Saverio Pasqualoni, Lorenzo Piarulli, Tommaso Bonato, Seydou Ba, Matteo Turisini, Jens Domke, Torsten Hoefler:
Bine Trees: Enhancing Collective Operations by Optimizing Communication Locality. 1901-1916
- Hao Qi, Weicong Chen, Chenghong Wang, Xiaoyi Lu:
DPAR: High-Performance, Secure, and Scalable Differential Privacy-based AllReduce. 1917-1934
- Nicholas Contini, Jake Queiser, Bharath Ramesh, Hari Subramoni, Dhabaleswar K. Panda:
A Streaming Collectives Interface Targeting Dataflow Acceleration and HPC Workloads. 1935-1950
- Kexin Li, Wenkan Huang, Qinggang Wang, Long Zheng, Xiaofei Liao, Hai Jin, Jingling Xue:
Diff-MoE: Efficient Batched MoE Inference with Priority-Driven Differential Expert Caching. 1951-1965
Compression and Data Reduction 1
- Franck Cappello, Robert Underwood, Yuri Alexeev, Allison H. Baker, Ebru Bozdag, Martin Burtscher, Kyle Chard, Sheng Di, Kyle Gerard Felker, Paul Christopher O'Grady, Hanqi Guo, Yafan Huang, Peng Jiang, Sian Jin, Petter Johansson, Shaomeng Li, Xin Liang, Erik Lindahl, Peter Lindstrom, Zarija Lukic, Magnus Lundborg, Danylo Lykov, Masaru Nagaso, Kento Sato, Amarjit Singh, Seung Woo Son, Shihui Song, William Tang, Dingwen Tao, Jiannan Tian, Kazutomo Yoshii, Kai Zhao:
What to Support When You're Compressing: The State of Practice Gaps and Opportunities for Scientific Data Compression. 1966-1979
- Xiao Li, Liangji Zhu, Anand Rangarajan, Sanjay Ranka:
Generative Latent Diffusion for Efficient Spatiotemporal Data Reduction. 1980-1991
- Qian Gong, Mark Ainsworth, Jieyang Chen, Xin Liang, Liangji Zhu, Ethan Klasky, Tushar M. Athawale, Qing Liu, Anand Rangarajan, Sanjay Ranka, Scott Klasky:
Stability-preserving Lossy Compression for Large-scale Partial Differential Equations. 1992-2005
- Yafan Huang, Sheng Di, Robert Underwood, Peco Myint, Miaoqi Chu, Guanpeng Li, Nicholas Schwarz, Franck Cappello:
lsCOMP: Efficient Light Source Compression. 2006-2023
Compression and Data Reduction 2
- Shixun Wu, Jinwen Pan, Jinyang Liu, Jiannan Tian, Ziwei Qiu, Jiajun Huang, Kai Zhao, Xin Liang, Sheng Di, Zizhong Chen, Franck Cappello:
Boosting Scientific Error-Bounded Lossy Compression through Optimized Synergistic Lossy-Lossless Orchestration. 2024-2037
- Daoce Wang, Pascal Grosset, Jesus Pulido, Jiannan Tian, Tushar M. Athawale, Jinda Jia, Baixi Sun, Boyuan Zhang, Sian Jin, Kai Zhao, James P. Ahrens, Fengguang Song:
STZ: A High Quality and High Speed Streaming Lossy Compression Framework for Scientific Data. 2038-2055
- Yafan Huang, Sheng Di, Guanpeng Li, Franck Cappello:
GPU Lossy Compression for HPC Can Be Versatile and Ultra-Fast. 2056-2075
- Yanliang Li, Wenbo Li, Qian Gong, Qing Liu, Norbert Podhorszki, Scott Klasky, Xin Liang, Jieyang Chen:
HP-MDR: High-performance and Portable Data Refactoring and Progressive Retrieval with Advanced GPUs. 2076-2093
Algorithms: Matching System Capabilities
- Budvin Edippuliarachchi, David Van Komen, Hari Sundar:
AMRaCut: Scalable Partitioning for Adaptive Mesh Refinement. 2094-2108
- Marco D'Antonio, Son Thai Mai, Philippas Tsigas, Hans Vandierendonck:
Wasp: Efficient Asynchronous Single-Source Shortest Path on Multicore Systems via Work Stealing. 2109-2125
- Haozhi Han, Kun Li, Fusong Ju, Qi Li, Hong An, Yifeng Chen, Yunquan Zhang, Ting Cao, Mao Yang:
Matrix Is All You Need: Rearchitecting Quantum Chemistry to Scale on AI Accelerators. 2126-2142
Applications: Biological Modeling
- Sree Charan Gundabolu, Mithuna Thottethodi, T. N. Vijaykumar:
BLAZE: Exploiting Hybrid Parallelism and Size-customized Kernels to Accelerate BLASTP on GPUs. 2143-2157
- Rin Kuriyama, Kaaya Akira, Laura Green, Beatriz Herrera, Kael Dai, Mari Iura, Gilles Gouaillardet, Asako Terasawa, Taira Kobayashi, Jun Igarashi, Anton Arkhipov, Tadashi Yamazaki:
Microscopic-Level Mouse Whole Cortex Simulation Composed of 9 Million Biophysical Neurons and 26 Billion Synapses on the Supercomputer Fugaku. 2158-2171
- Xiaohui Duan, Cheng Shen, Gaowei Chen, Shanshan Wu, Yizhen Wang, Yizhen Chen, Qixin Chang, Qiancheng Xia, Zekun Yin, Lin Gan, Yibing Shan, Guangwen Yang, Weiguo Liu, Niu Huang:
Trillion Ligands per Day: Performance-Portable Virtual Screening via Compound Database Optimization and Multi-Target Docking. 2172-2185
- Jiayu Fu, Jingle Xu, Lin Gan, Tianqi Mao, Zirong Shen, Yinuo Wang, Zeyu Song, Xiaohui Duan, Wei Xue, Guangwen Yang:
T2-RELION: Task Parallelism, Tensor Core Accelerated RELION for Cryo-EM 3D Reconstruction. 2186-2202
System Software and Cloud
- Lexiang Huang, Anjaly Parayil, Jue Zhang, Xiaoting Qin, Chetan Bansal, Jovan Stojkovic, Pantea Zardoshti, Pulkit A. Misra, Eli Cortez, Raphael Ghelman, Íñigo Goiri, Saravan Rajmohan, Jim Kleewein, Rodrigo Fonseca, Timothy Zhu, Ricardo Bianchini: Workload Intelligence: Workload-Aware IaaS abstraction for Cloud Efficiency. 2203-2215
- Xi Wang, Bin Ma, Jongryool Kim, Byungil Koh, Hoshik Kim, Dong Li: cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications. 2216-2232
- Guangda Liu, Chenqi Zhang, Yizhou Shan, Hao Feng, Zeke Wang, Shixuan Sun, Minyi Guo, Jieru Zhao: DHAP: Towards Efficient OLAP in a Disaggregated and Heterogeneous Environment. 2233-2250
- Hai Zhou, Dan Feng: Make Updates Faster: A Fast Multi-Stripe Updates Framework in Erasure-Coded Storage Clusters. 2251-2265
Reproducibility Reports
- Strahinja Trecakov: Reproducibility Report for SC25 Paper Uno: A One-Stop Solution for Inter- and Intra- Data Center Congestion Control and Reliable Connectivity. 2266-2267
- Iacopo Colonnelli: Reproducibility Report for SC25 Paper ATLAHS: An Application-centric Network Simulator Toolchain for AI, HPC, and Distributed Storage. 2268
- Minh Chung: Reproducibility Report for SC25 Paper TensorMD: Accelerating Molecular Dynamics with a High-Performance Machine Learning Interatomic Potential. 2269-2270
- Arjun Parab: Reproducibility Report for SC25 Paper TurboFNO: High-Performance Fourier Neural Operator with Fused FFT-GEMM-iFFT on GPU. 2271
- Joseph Schuchart: Reproducibility Report for SC25 Paper X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms. 2272-2273
- Gianluca Mittone: Reproducibility Report for SC25 Paper SIGMo: High-Throughput Batched Subgraph Isomorphism on GPUs for Molecular Matching. 2274
- Benjamin Brock: Reproducibility Report for SC25 Paper MLP-Offload: Multi-Level, Multi-Path Offloading for LLM Pre-training to Break the GPU Memory Wall. 2275-2276
- Marc-André Vef: Reproducibility Report for SC25 Paper STZ: A High Quality and High Speed Streaming Lossy Compression Framework for Scientific Data. 2277
- Philippe Swartvagher: Reproducibility Report for SC25 Paper Story of Two GPUs: Characterizing the Resilience of Hopper H100 and Ampere A100 GPUs. 2278-2279
- Kurt H. Maier: Reproducibility Report for SC25 Paper Optimizing Quantum Circuit Mapping to Reduce Inter-Module Communications in Distributed Architectures. 2280-2281
- Roberto R. Expósito: Reproducibility Report for SC25 Paper MXBLAS: Accelerating 8-bit Deep Learning with a Unified Micro-Scaled GEMM Library. 2282
- Kurt H. Maier: Reproducibility Report for SC25 Paper C.A.T.S.: Memory and Control Flow Tracing for Whole-Program Performance Analysis. 2283-2284
- Pedro Bruel: Reproducibility Report for SC25 Paper Numerical Performance of the Implicitly Restarted Arnoldi Method in OFP8, Bfloat16, Posit, and Takum Arithmetics. 2285
- Minh Chung: Reproducibility Report for SC25 Paper Moment: Co-optimizing Physical Communication Topology and Data Placement for Multi-GPU Out-of-core GNN Training. 2286
- Sixu Li: Reproducibility Report for SC25 Paper Sparsified Preconditioned Conjugate Gradient Solver on GPUs. 2287-2289
- Joshua Hoke Davis: Reproducibility Report for SC25 Paper MANS: Efficient and Portable ANS Encoding for Multi-Byte Integer Data on CPUs and GPUs. 2290-2291
- Dogan Sagbili: Reproducibility Report for SC25 Paper Caracal: A GPU-Resident Sparse LU Solver with Lightweight Fine-Grained Scheduling. 2292-2293
- Alessio Orsino: Reproducibility Report for SC25 Paper MetoHash: A Memory-Efficient and Traffic-Optimized Hashing Index on Hybrid PMem-DRAM Memories. 2294
- Brian J. N. Wylie: Reproducibility Report for SC25 Paper CPU- and GPU-initiated Communication Strategies for Conjugate Gradient Methods on Large GPU Clusters. 2295-2296
- Arjun Parab: Reproducibility Report for SC25 Paper lsCOMP: Efficient Light Source Compression. 2297
- Vinícius Garcia Pinto: Reproducibility Report for SC25 Paper lsCOMP: Efficient Light Source Compression. 2298-2299
- Quentin Guilloteau: Reproducibility Report for SC25 Paper Zero-Value Code Specialization via Profile-Guided Control Data Flow Analysis. 2300-2303
- Minh Chung: Reproducibility Report for SC25 Paper High-Performance Branch-Free Algorithms for Extended-Precision Floating-Point Arithmetic. 2304-2305
- Thomas Randall: Reproducibility Report for SC25 Paper GPU Lossy Compression for HPC Can Be Versatile and Ultra-Fast. 2306
- Sayef Azad Sakin: Reproducibility Report for SC25 Paper Ab-initio Quantum Transport with the GW Approximation, 42, 240 Atoms, and Sustained Exascale Performance. 2307-2308
- Amir Raoofy: Reproducibility Report for SC25 Paper KAMI: Communication-Avoiding General Matrix Multiplication within a Single GPU. 2309-2310
- Sandra Wienke: Reproducibility Report for SC25 Paper Demystifying the Resilience of Large Language Model Inference: An End-to-End Perspective. 2311-2313
- Philippe Swartvagher: Reproducibility Report for SC25 Paper HP-MDR: High-performance and Portable Data Refactoring and Progressive Retrieval with Advanced GPUs. 2314-2315
- Iacopo Colonnelli: Reproducibility Report for SC25 Paper Bridging the Gap Between Binary and Source Based Package Management in Spack. 2316-2317
- Marcel Koch: Reproducibility Report for SC25 Paper FaSTCC: Fast Sparse Tensor Contractions on CPUs. 2318
- Volker Weinberg: Reproducibility Report for SC25 Paper RedSan: A Redundant Memory Instruction Sanitizer for GPU Programs. 2319-2320
- Ruben Laso: Reproducibility Report for SC25 Paper RAPTOR: Practical Numerical Profiling of Scientific Applications. 2321
- Ruben Laso: Reproducibility Report for SC25 Paper Addressing Reproducibility Challenges in HPC with Continuous Integration. 2322
- Shaina Smith: Reproducibility Report for SC25 Paper ThirstyFLOPS: Water Footprint Modeling and Analysis Toward Sustainable HPC Systems. 2323-2324
- Sergej Breiter: Reproducibility Report for SC25 Paper DRIM-ANN: An Approximate Nearest Neighbor Search Engine based on Commercial DRAM-PIMs. 2325-2326
- Jan Laukemann: Reproducibility Report for SC25 Paper Bine Trees: Enhancing Collective Operations by Optimizing Communication Locality. 2327-2328
- Joao Vicente Ferreira Lima: Reproducibility Report for SC25 Paper XaaS Containers: Performance-Portable Representation With Source and IR Containers. 2329
