


Minyi Guo
Person information
- affiliation: Shanghai Jiao Tong University, Shanghai, China
2020 – today
- 2026
[j242]Runnan Shen, Jinquan Wang, Zhisheng Huo, Limin Xiao, Jiantong Huo, Minyi Guo, Jing Shang:
MEIS: Optimizing deduplication system with efficient index structure. J. Syst. Archit. 170: 103620 (2026)
[j241]Xinkai Wang, Chao Li, Yiyang Li, Lingyu Sun, Cheng Xu, Xiaofeng Hou, Jing Wang, Minyi Guo, Yaqian Zhao:
Enabling Learning-Based Efficiency Optimizer With Shadow Cycles in Resource-Constrained Autonomous Embedded Systems. IEEE Trans. Computers 75(3): 928-941 (2026)
[j240]Kaihua Fu, Jiuchen Shi, Yao Chen, Quan Chen, Weng-Fai Wong, Wei Wang, Bingsheng He, Minyi Guo:
QoS Awareness and Improved Throughput of Point Cloud Services With Dynamic Workloads. IEEE Trans. Computers 75(3): 1141-1155 (2026)
[j239]Keli Liu, Jing Yang, Xiaoli Ruan, Han Zhao, Shixuan Sun, Shaobo Li, Minyi Guo:
SAMD-SRG: Service-Aware Microservice Deployment Using Spectral Ranking and Graph-Enhanced Reinforcement Learning. IEEE Trans. Cogn. Commun. Netw. 12: 1649-1663 (2026)
[j238]Jiayi Zhang, Chen Chen, Zuo Gan, Wei Wang, Bo Li, Minyi Guo:
Mitigating Server-Side Communication Bottlenecks in Distributed Learning With Round-Robin Participant Coordination. IEEE Trans. Netw. 34: 2997-3012 (2026)
[c424]Peng Tang, Jiacheng Liu, Xiaofeng Hou, Yifei Pu, Jing Wang, Pheng-Ann Heng, Chao Li, Minyi Guo:
MoE-APEX: An Efficient MoE Inference System with Adaptive Precision Expert Offloading. ASPLOS (2) 2026: 1185-1200
[c423]Yang Liu, Yunfei Gu, Liqiang Zhang, Chentao Wu, Guangtao Xue, Jie Li, Minyi Guo, Junhao Hu, Jie Meng:
CacheSlide: Unlocking Cross Position-Aware KV Cache Reuse for Accelerating LLM Serving. FAST 2026: 83-99
[c422]Ziyu Huang, Yangjie Zhou, Zihan Liu, Xinhao Luo, Yijia Diao, Minyi Guo, Jidong Zhai, Yu Feng, Chen Zhang, Anbang Wu, Jingwen Leng:
FlashFuser: Expanding the Scale of Kernel Fusion for Compute-Intensive Operators via Inter-Core Connection. HPCA 2026: 1-14
[c421]Xiaotong Huang, He Zhu, Tianrui Ma, Yuxiang Xiong, Fangxin Liu, Zhezhi He, Yiming Gan, Zihan Liu, Jingwen Leng, Yu Feng, Minyi Guo:
SPLATONIC: Architectural Support for 3D Gaussian Splatting SLAM via Sparse Processing. HPCA 2026: 1-14
[c420]Jiuchen Shi, Hang Zhang, Yixiao Wang, Quan Chen, Yizhou Shan, Kaihua Fu, Wei Wang, Minyi Guo:
ELORA: Efficient LoRA and KV Cache Management for Multi-LoRA LLM Serving. HPCA 2026: 1-14
[c419]Xinkai Wang, Chao Li, Yiming Zhuansun, Jinyang Guo, Xiaofeng Hou, Jing Wang, Luping Wang, Weigao Chen, Cheng Huang, Guodong Yang, Liping Zhang, Minyi Guo:
AUM: Unleashing the Efficiency Potential of Shared Processors with Accelerator Units for LLM Serving. HPCA 2026: 1-15
[c418]Anbang Wu, Liqiang Lu, Jianwei Yin, Jingwen Leng, Minyi Guo:
CLINE: Improving Control Flow Compilation of Quantum Programs with Control Line Encoding. HPCA 2026: 1-13
[c417]Chuhao Xu, Zijun Li, Quan Chen, Han Zhao, Xueyan Tang, Minyi Guo:
Towards Resource-Efficient Serverless LLM Inference with SLINFER. HPCA 2026: 1-18
[c416]Chen Zhang, Qijun Zhang, Zhuoshan Zhou, Yijia Diao, Haibo Wang, Zhe Zhou, Zhipeng Tu, Zhiyao Li, Guangyu Sun, Zhuoran Song, Zhigang Ji, Jingwen Leng, Minyi Guo:
Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems. HPCA 2026: 1-15
[c415]Han Zhao, Weihao Cui, Zeshen Zhang, Wenhao Zhang, Jiangtong Li, Quan Chen, Pu Pang, Zijun Li, Zhenhua Han, Yuqing Yang, Minyi Guo:
LEGO: Supporting LLM-Enhanced Games with One Gaming GPU. HPCA 2026: 1-14
[i117]Xinwei Qiang, Hongmin Chen, Shixuan Sun, Jingwen Leng, Xin Liu, Minyi Guo:
DASH: Deterministic Attention Scheduling for High-throughput Reproducible LLM Training. CoRR abs/2601.21824 (2026)
- 2025
[j237]Du Liu, Lu Zhang, Yechen Xu, Xinkai Wang, Lingyu Sun, Yifei Pu, Xiaofeng Hou, Chao Li, Minyi Guo:
Power synchronization: taming massive diversified serverless functions under power constraints. Sci. China Inf. Sci. 68(3) (2025)
[j236]Lizeth Patricia Aguirre Sanchez, Yao Shen, Minyi Guo:
LCA: Deep Reinforcement Learning-Based Congestion Avoidance Routing Model in SDN. Comput. Networks 268: 111371 (2025)
[j235]Mizhipeng Zhang, Chentao Wu, Jie Li, Minyi Guo:
Dynamic-EC: an efficient dynamic erasure coding method for permissioned blockchain systems. Frontiers Comput. Sci. 19(1): 191101 (2025)
[j234]Runzhe Chen, Guandong Lu, Yakai Wang, Rui Zhang, Zheng Hu, Yanming Miao, Zhifang Cai, Jingwen Leng, Minyi Guo:
BAFT: bubble-aware fault-tolerant framework for distributed DNN training with hybrid parallelism. Frontiers Comput. Sci. 19(1): 191102 (2025)
[j233]Xiaoqing Cai, Han Zhao, Xiaofeng Hou, Weihao Cui, Quan Chen, Chao Li, Minyi Guo:
FLAPS: fluctuation-aware power auction strategy for reducing the power overload probability. Frontiers Comput. Sci. 19(5): 195108 (2025)
[j232]Yunzhe Li, Jiajun Yan, Yuzhou Wei, Kechen Liu, Yize Zhao, Chong Zhang, Hongzi Zhu, Li Lu, Shan Chang, Minyi Guo:
BlinkBud: Detecting Hazards from Behind via Sampled Monocular 3D Detection on a Single Earbud. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 9(4): 191:1-191:26 (2025)
[j231]Lizeth Patricia Aguirre Sanchez, Yao Shen, Minyi Guo:
MDQ: A QoS-Congestion Aware Deep Reinforcement Learning Approach for Multi-Path Routing in SDN. J. Netw. Comput. Appl. 235: 104082 (2025)
[j230]Yifei Pu, Xinfeng Xia, Xiaofeng Hou, Chi Wang, Cheng Xu, Jiacheng Liu, Jing Wang, Minyi Guo, Jingling Yuan, Chao Li:
MMBypass: Towards efficient multi-modal AI computing with adaptive bypass network. J. Parallel Distributed Comput. 201: 105078 (2025)
[j229]Jixian Su, Chiyu Hao, Shixuan Sun, Hao Zhang, Sen Gao, Jiaxin Jiang, Yao Chen, Chenyi Zhang, Bingsheng He, Minyi Guo:
Revisiting the Design of In-Memory Dynamic Graph Storage. Proc. ACM Manag. Data 3(1): 70:1-70:27 (2025)
[j228]Yunsong Zhou, Quan Liu, Hongzi Zhu, Yunzhe Li, Shan Chang, Minyi Guo:
Exploiting Ground Depth Estimation for Mobile Monocular 3D Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 47(4): 3079-3093 (2025)
[j227]Chiyu Hao, Jixian Su, Shixuan Sun, Hao Zhang, Sen Gao, Jianwen Zhao, Chenyi Zhang, Jieru Zhao, Chen Chen, Minyi Guo:
RapidStore: An Efficient Dynamic Graph Storage System for Concurrent Queries. Proc. VLDB Endow. 18(10): 3587-3600 (2025)
[j226]Pengyu Yang, Weihao Cui, Chunyu Xue, Han Zhao, Chen Chen, Quan Chen, Jing Yang, Minyi Guo:
Taming Flexible Job Packing in Deep Learning Training Clusters. ACM Trans. Archit. Code Optim. 22(1): 37:1-37:24 (2025)
[j225]Cheng Xu, Chao Li, Xiaofeng Hou, Junyi Mei, Jing Wang, Pengyu Wang, Shixuan Sun, Minyi Guo, Baoping Hao:
Enhancing High-Throughput GPU Random Walks Through Multi-Task Concurrency Orchestration. ACM Trans. Archit. Code Optim. 22(1): 42:1-42:26 (2025)
[j224]Yifu He, Han Zhao, Weihao Cui, Shulai Zhang, Quan Chen, Minyi Guo:
ARACHNE: Optimizing Distributed Parallel Applications with Reduced Inter-Process Communication. ACM Trans. Archit. Code Optim. 22(2): 52:1-52:26 (2025)
[j223]Han Zhao, Weihao Cui, Quan Chen, Zijun Li, Zhenhua Han, Nan Wang, Yu Feng, Jieru Zhao, Chen Chen, Jingwen Leng, Minyi Guo:
EDAS: Enabling Fast Data Loading for GPU Serverless Computing. ACM Trans. Archit. Code Optim. 22(3): 99:1-99:23 (2025)
[j222]Han Zhao, Junxiao Deng, Weihao Cui, Quan Chen, Youtao Zhang, Deze Zeng, Minyi Guo:
Adaptive Kernel Fusion for Improving the GPU Utilization While Ensuring QoS. IEEE Trans. Computers 74(2): 386-400 (2025)
[j221]Xiaofeng Hou, Cheng Xu, Chao Li, Jiacheng Liu, Xuehan Tang, Kwang-Ting Cheng, Minyi Guo:
Improving Efficiency in Multi-Modal Autonomous Embedded Systems Through Adaptive Gating. IEEE Trans. Computers 74(2): 691-704 (2025)
[j220]Jinquan Wang, Zhisheng Huo, Limin Xiao, Jinqian Yang, Jiantong Huo, Minyi Guo:
Hierarchical Hashing: A Dynamic Hashing Method With Low Write Amplification and High Performance for Non-Volatile Memory. IEEE Trans. Computers 74(4): 1138-1151 (2025)
[j219]Zijun Li, Chenyang Wu, Chuhao Xu, Quan Chen, Shuo Quan, Bin Zha, Qiang Wang, Weidong Han, Jie Wu, Minyi Guo:
Lightweight and Holistic-Scalable Serverless Secure Container Runtime for High-Density Deployment and High-Concurrency Startup. IEEE Trans. Computers 74(8): 2621-2634 (2025)
[j218]Yunzhe Li, Hongzi Zhu, Zhuohong Deng, Yunlong Cheng, Zimu Zheng, Liang Zhang, Shan Chang, Minyi Guo:
A Scene-Aware Model Adaptation Scheme for Cross-Scene Online Inference on Mobile Devices. IEEE Trans. Mob. Comput. 24(10): 11061-11075 (2025)
[j217]Zhengzhe Xiang, Xizi Xue, Yuanyi Chen, Schahram Dustdar, Minyi Guo:
Let Robots Watch Grass Grow: Optimal Task Assignment for Automatic Plant Factory. IEEE Trans. Sustain. Comput. 10(3): 464-474 (2025)
[j216]Chuanyue Xiong, Jing Yang, Jiahao Zhong, Zirui He, Pu Pang, Minyi Guo:
EMC-LSP: A Novel Lightweight Architecture for Edge Multi-Node Long Sequence Prediction. IEEE Trans. Sustain. Comput. 10(6): 1351-1365 (2025)
[c414]Siyuan Huang, Zhiyuan Ma, Jintao Du, Changhua Meng, Weiqiang Wang, Jingwen Leng, Minyi Guo, Zhouhan Lin:
Gumbel Reranking: Differentiable End-to-End Reranker Optimization. ACL (1) 2025: 7142-7161
[c413]Han Zhao, Yiying Xiang, Yu Liu, Xiaochun Ye, Deze Zeng, Jing Yang, Weihao Cui, Quan Chen, Jingwen Leng, Minyi Guo:
DACO: Unlocking Latent Dataflow Opportunities in Edge-Side SIMT Accelerators. APPT 2025: 3-16
[c412]Jinyuan Chen, Jiuchen Shi, Quan Chen, Lin Gu, Minyi Guo:
Veyth: Adaptive Container Placement for Optimizing Cross-Server Network Traffic of Microservice Applications. APPT 2025: 201-211
[c411]Xinkai Wang, Yiming Zhuansun, Chao Li, Jing Wang, Xiaofeng Hou, Lingyu Sun, Luping Wang, Minyi Guo:
AsymServe: Demystifying and Optimizing LLM Serving Efficiency on CPU Acceleration Units. APPT 2025: 231-245
[c410]Kunyun Wang, Shuo Yang, Jieru Zhao, Wenchao Ding, Quan Chen, Jingwen Leng, Minyi Guo:
SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity. APPT 2025: 246-256
[c409]Jing Wang, Taolei Wang, Juntao Huang, Yibo Liu, Xinkai Wang, Marius Kreutzer, Chao Li, Minyi Guo:
Accelerating Large-Scale Out-of-GPU-Core GNN Training with Two-Level Historical Caching. APPT 2025: 297-307
[c408]Ruogang Ma, Jiuchen Shi, Quan Chen, Minyi Guo:
Comber: QoS-Aware and Efficient Deployment for Co-located Microservices and Best-Effort Tasks in Disaggregated Datacenters. APPT 2025: 412-418
[c407]Yangjie Zhou, Wenting Shen, Jingwen Leng, Shuwen Lu, Zihan Liu, Weihao Cui, Zhendong Zhang, Wencong Xiao, Baole Ai, Yong Li, Wei Lin, Deze Zeng, Yun Liang, Quan Chen, Ning Liu, Minyi Guo:
Voyager: Input-Adaptive Algebraic Transformations for High-Performance Graph Neural Networks. ASPLOS (3) 2025: 247-263
[c406]Xinkai Wang, Xiaofeng Hou, Chao Li, Yuancheng Li, Du Liu, Guoyao Xu, Guodong Yang, Liping Zhang, Yuemin Wu, Xiaopeng Yuan, Quan Chen, Minyi Guo:
EXIST: Enabling Extremely Efficient Intra-Service Tracing Observability in Datacenters. ASPLOS (2) 2025: 355-372
[c405]Yu Feng, Zheng Liu, Weikai Lin, Zihan Liu, Jingwen Leng, Minyi Guo, Zhezhi He, Jieru Zhao, Yuhao Zhu:
StreamGrid: Streaming Point Cloud Analytics via Compulsory Splitting and Deterministic Termination. ASPLOS (2) 2025: 1189-1202
[c404]Yaoxuan Li, Pu Pang, Yecheng Yang, Quan Chen, Zhengxuan Yan, Guoyao Xu, Guodong Yang, Liping Zhang, Minyi Guo:
WDP: Mitigating Interference in CPU Sharing Through Wake-up Delay Driven Preemption for QoS-aware Co-location. SoCC 2025: 255-268
[c403]Yuzhuo Yang, Kaihua Fu, Quan Chen, Deze Zeng, Shuo Quan, Jie Wu, Minyi Guo:
FaaSGNN: Enabling Memory Efficient and Low Latency GNN Inference Services with Serverless Computing. SoCC 2025: 803-816
[c402]Tianze Wang, Guanjie Wang, Mingyan Yang, Manqi Luo, Mingchuan Zou, Chen Chen, Minyi Guo:
SemanticPrefetcher: Accelerate Data Lake Access with Semantics-Aware File Prefetching. CloudCom 2025: 1-8
[c401]Ranhao Jia, Yunfei Gu, Chentao Wu, Jie Li, Minyi Guo, Liqiang Zhang, Zaigui Zhang, Haijun Zhang:
FIFO-MEP: An Efficient Multi-Eviction-Point FIFO Cache with Stable Demotion for Burst-Oriented Access Mitigation. CLUSTER 2025: 1-13
[c400]Shifan Zhang, Hongzi Zhu, Yinan He, Minyi Guo, Ziyang Lou, Shan Chang:
WISNet: Pseudo Label Generation on Unbalanced and Patch Annotated Waste Images. CVPR 2025: 15076-15085
[c399]Yunfei Gu, Yixuan Liu, Xinyuan Wu, Bo Shao, Chentao Wu, Shiyi Li, Jieru Zhao, Jie Li, Minyi Guo, Kunlin Yang, Wengui Zhang, Feilong Lin:
MemSeer: Leverage Memory Failure Distinctions and Multi-Grained Prediction in Ultra-Scale Heterogeneous X86/ARM Clusters. DAC 2025: 1-7
[c398]Yixuan Liu, Yunfei Gu, Junhao Dai, Xinyuan Wu, Chentao Wu, Xinfei Guo, Jieru Zhao, Jie Li, Minyi Guo:
CXL-ECC: an Efficient LRC-based on-CXL-Memory-eXpander-Controller ECC to Enhance Reliability and Performance of DRAM Error Correction. DAC 2025: 1-7
[c397]Guangda Liu, Chengwei Li, Jieru Zhao, Chenqi Zhang, Minyi Guo:
ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression. DAC 2025: 1-7
[c396]Chenqi Zhang, Yu Feng, Jieru Zhao, Guangda Liu, Wenchao Ding, Chentao Wu, Minyi Guo:
STREAMINGGS: Voxel-Based Streaming 3D Gaussian Splatting with Memory Optimization and Architectural Support. DAC 2025: 1-7
[c395]Shulai Zhang, Quan Chen, Weihao Cui, Han Zhao, Chunyu Xue, Zhen Zheng, Wei Lin, Minyi Guo:
Improving GPU Sharing Performance through Adaptive Bubbleless Spatial-Temporal Sharing. EuroSys 2025: 573-588
[c394]Zixiao Chen, Chentao Wu, Yunfei Gu, Ranhao Jia, Jie Li, Minyi Guo:
Gaze into the Pattern: Characterizing Spatial Patterns with Internal Temporal Correlations for Hardware Prefetching. HPCA 2025: 173-187
[c393]Weiming Hu, Haoyan Zhang, Cong Guo, Yu Feng, Renyang Guan, Zhendong Hua, Zihan Liu, Yue Guan, Minyi Guo, Jingwen Leng:
M-ANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type. HPCA 2025: 1112-1126
[c392]Zihan Liu, Xinhao Luo, Junxian Guo, Wentao Ni, Yangjie Zhou, Yue Guan, Cong Guo, Weihao Cui, Yu Feng, Minyi Guo, Yuhao Zhu, Minjia Zhang, Chen Jin, Jingwen Leng:
VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference. HPCA 2025: 1496-1509
[c391]Xingyang Li, Jie Jiang, Yu Feng, Yiming Gan, Jieru Zhao, Zihan Liu, Jingwen Leng, Minyi Guo:
SLTarch: Towards Scalable Point-Based Neural Rendering by Taming Workload Imbalance and Memory Irregularity. ICCAD 2025: 1-9
[c390]Shifan Zhang, Hongzi Zhu, Yunzhe Li, Liang Zhang, Shan Chang, Minyi Guo:
CoPe: Taming Collaborative 3D Perception via Lite Network Attention across Mobile Agents. ICDCS 2025: 188-198
[c389]Jiangang Shen, Hongzi Zhu, Liang Zhang, Yunzhe Li, Shan Chang, Jie Wu, Minyi Guo:
DiVE: Differential Video Encoding for Online Edge-assisted Video Analytics on Mobile Agents. ICDCS 2025: 243-252
[c388]Wei Yu, Chen Chen, Qinbin Li, Jieru Zhao, Shixuan Sun, Bo Li, Minyi Guo:
FedSU: Communication-efficient Federated Learning with Speculative Updating. ICDCS 2025: 406-416
[c387]Yunzhe Li, Facheng Hu, Hongzi Zhu, Shifan Zhang, Liang Zhang, Shan Chang, Minyi Guo:
Saga: Capturing Multi-granularity Semantics from Massive Unlabelled IMU Data. ICDCS 2025: 879-889
[c386]Zhengyi Li, Yue Guan, Kang Yang, Yu Feng, Ning Liu, Yu Yu, Jingwen Leng, Minyi Guo:
An Efficient Private GPT Never Autoregressively Decodes. ICML 2025
[c385]Shihao Zhang, Chi Zhang, Chentao Wu, Jie Li, Minyi Guo, Hui Li, Liqiang Zhang:
Decision Shuffle: Efficient Pre-scheduling System for Push-based Shuffle in DAG Computing Frameworks. ICPP 2025: 730-740
[c384]Zhixin Tong, Jiuchen Shi, Quan Chen, Pu Pang, Shixuan Sun, Jie Meng, Jiang Liu, En Shao, Minyi Guo:
ORION: Optimizing OLAP Query Execution with Proactive Caching and Separate Operators. ICS 2025: 868-883
[c383]Fanrong Du, Jiuchen Shi, Quan Chen, Pu Pang, Li Li, Minyi Guo:
Generating Microservice Graphs with Production Characteristics for Efficient Resource Scaling. ICS 2025: 895-910
[c382]Yunzhe Li, Facheng Hu, Hongzi Zhu, Quan Liu, Xiaoke Zhao, Jiangang Shen, Shan Chang, Minyi Guo:
Prism: Mining Task-aware Domains in Non-i.i.d. IMU Data for Flexible User Perception. INFOCOM 2025: 1-10
[c381]Guangqiang Luan, Pu Pang, Quan Chen, Chen Chen, Guoyao Xu, Chi Zhang, Yanyi Zi, Yinghao Yu, Guodong Yang, Liping Zhang, Minyi Guo:
Reducing the End-to-End Latency of DNN-Based Recommendation Systems in GPU Pools. IPDPS 2025: 725-736
[c380]Yu Feng, Weikai Lin, Yuge Cheng, Zihan Liu, Jingwen Leng, Minyi Guo, Chen Chen, Shixuan Sun, Yuhao Zhu:
Lumina: Real-Time Neural Rendering by Exploiting Computational Redundancy. ISCA 2025: 1925-1939
[c379]Tianhao Huang, Lingyu Sun, Chao Li, Xiaofeng Hou, Yaqian Zhao, Jingwen Leng, Li Li, Minyi Guo:
Repurpose Accel-Sim for Next Generation NVIDIA Jetson GPU Architectural Design. ISLPED 2025: 1-7
[c378]Zichen Xu, Zijun Li, Quan Chen, Minyi Guo:
ServScale: Concurrency-Aware Serverless Execution and Scaling Paradigm. NPC (1) 2025: 13-25
[c377]Taolei Wang, Chao Li, Jing Wang, Xiaofeng Hou, Minyi Guo:
CGO: Cloud Game Orchestration via Resource Preception and CODEC Optimization. NPC (2) 2025: 38-50
[c376]Jinyang Guo, Xinkai Wang, Jing Wang, Xiaofeng Hou, Chao Li, Minyi Guo:
TriCooling-Sim: Efficient Thermal Simulation for High-Density Micro AI Data Centers. NPC (2) 2025: 227-239
[c375]Yuhang Fang, Pu Pang, Quan Chen, Li Li, Minyi Guo:
Reducing Load-Balancing Cost for Multithreading Applications on Asymmetric NUMA Machine. NPC (2) 2025: 448-460
[c374]Huanqi Hu, Bowen Xiao, Shixuan Sun, Jianian Yin, Zhexi Zhang, Xiang Luo, Chengquan Jiang, Weiqi Xu, Xiaoying Jia, Xin Liu, Minyi Guo:
LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving. SC 2025: 1619-1630
[c373]Guangda Liu, Chenqi Zhang, Yizhou Shan, Hao Feng, Zeke Wang, Shixuan Sun, Minyi Guo, Jieru Zhao:
DHAP: Towards Efficient OLAP in a Disaggregated and Heterogeneous Environment. SC 2025: 2233-2250
[c372]Sen Gao, Jianwen Zhao, Hao Zhang, Shixuan Sun, Chen Liang, Gongye Chen, Wenliang Zhang, Bo Ren, Chao Liu, Chenyi Zhang, Quan Chen, Chao Li, Jingwen Leng, Minyi Guo:
GES: High-Performance Graph Processing Engine and Service in Huawei. SIGMOD Conference Companion 2025: 391-403
[c371]Shulai Zhang, Ao Xu, Quan Chen, Han Zhao, Weihao Cui, Zhen Wang, Yan Li, Limin Xiao, Minyi Guo:
Efficient Performance-Aware GPU Sharing with Compatibility and Isolation through Kernel Space Interception. USENIX ATC 2025: 1003-1019
[i116]Yunzhe Li, Facheng Hu, Hongzi Zhu, Quan Liu, Xiaoke Zhao, Jiangang Shen, Shan Chang, Minyi Guo:
Prism: Mining Task-aware Domains in Non-i.i.d. IMU Data for Flexible User Perception. CoRR abs/2501.01598 (2025)
[i115]Weihao Cui, Ji Zhang, Han Zhao, Chao Liu, Wenhao Zhang, Jian Sha, Quan Chen, Bingsheng He, Minyi Guo:
XPUTimer: Anomaly Diagnostics for Divergent LLM Training in GPU Clusters of Thousand-Plus Scale. CoRR abs/2502.05413 (2025)
[i114]Jixian Su, Chiyu Hao, Shixuan Sun, Hao Zhang, Sen Gao, Jiaxin Jiang, Yao Chen, Chenyi Zhang, Bingsheng He, Minyi Guo:
Revisiting the Design of In-Memory Dynamic Graph Storage. CoRR abs/2502.10959 (2025)
[i113]Siyuan Huang, Zhiyuan Ma, Jintao Du, Changhua Meng, Weiqiang Wang, Jingwen Leng, Minyi Guo, Zhouhan Lin:
Gumbel Reranking: Differentiable End-to-End Reranker Optimization. CoRR abs/2502.11116 (2025)
[i112]Weiming Hu, Haoyan Zhang, Cong Guo, Yu Feng, Renyang Guan, Zhendong Hua, Zihan Liu, Yue Guan, Minyi Guo, Jingwen Leng:
M-ANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type. CoRR abs/2502.18755 (2025)
[i111]Zihan Liu, Xinhao Luo, Junxian Guo, Wentao Ni, Yangjie Zhou, Yue Guan, Cong Guo, Weihao Cui, Yu Feng, Minyi Guo, Yuhao Zhu, Minjia Zhang, Jingwen Leng, Chen Jin:
VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference. CoRR abs/2503.02236 (2025)
[i110]Xiaotong Huang, He Zhu, Zihan Liu, Weikai Lin, Xiaohong Liu, Zhezhi He, Jingwen Leng, Minyi Guo, Yu Feng:
SeeLe: A Unified Acceleration Framework for Real-Time Gaussian Splatting. CoRR abs/2503.05168 (2025)
[i109]Yu Feng, Zheng Liu, Weikai Lin, Zihan Liu, Jingwen Leng, Minyi Guo, Zhezhi He, Jieru Zhao, Yuhao Zhu:
StreamGrid: Streaming Point Cloud Analytics via Compulsory Splitting and Deterministic Termination. CoRR abs/2503.05197 (2025)
[i108]Weihao Cui, Ziyi Xu, Han Zhao, Quan Chen, Zijun Li, Bingsheng He, Minyi Guo:
Efficient Function-as-a-Service for Large Language Models with TIDAL. CoRR abs/2503.06421 (2025)
[i107]Jing Wang, Chao Li, Taolei Wang, Jinyang Guo, Hanzhang Yang, Yiming Zhuansun, Minyi Guo:
Survey of Disaggregated Memory: Cross-layer Technique Insights for Next-Generation Datacenters. CoRR abs/2503.20275 (2025)
[i106]Yunzhe Li, Facheng Hu, Hongzi Zhu, Shifan Zhang, Liang Zhang, Shan Chang, Minyi Guo:
Saga: Capturing Multi-granularity Semantics from Massive Unlabelled IMU Data for User Perception. CoRR abs/2504.11726 (2025)
[i105]Weihao Cui, Yukang Chen, Han Zhao, Ziyi Xu, Quan Chen, Xusheng Chen, Yangjie Zhou, Shixuan Sun, Minyi Guo:
Optimizing SLO-oriented LLM Serving with PD-Multiplexing. CoRR abs/2504.14489 (2025)
[i104]Hang Zhang, Jiuchen Shi, Yixiao Wang, Quan Chen, Yizhou Shan, Minyi Guo:
Improving the Serving Performance of Multi-LoRA Large Language Models via Efficient LoRA and KV Cache Management. CoRR abs/2505.03756 (2025)
[i103]Guangda Liu, Chengwei Li, Zhenyu Ning, Jing Lin, Yiwu Yao, Danning Ke, Minyi Guo, Jieru Zhao:
FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference. CoRR abs/2505.13109 (2025)
[i102]Kunyun Wang, Bohan Li, Kai Yu, Minyi Guo, Jieru Zhao:
Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism. CoRR abs/2505.14741 (2025)
[i101]Zhengyi Li, Yue Guan, Kang Yang, Yu Feng, Ning Liu, Yu Yu, Jingwen Leng, Minyi Guo:
An Efficient Private GPT Never Autoregressively Decodes. CoRR abs/2505.15252 (2025)
[i100]Zhenyu Ning, Guangda Liu, Qihao Jin, Wenchao Ding, Minyi Guo, Jieru Zhao:
LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval. CoRR abs/2505.15269 (2025)
[i99]Zheng Liu, He Zhu, Xinyang Li, Yirun Wang, Yujiao Shi, Wei Li, Jingwen Leng, Minyi Guo, Yu Feng:
Voyager: Real-Time Splatting City-Scale 3D Gaussians on Your Phone. CoRR abs/2506.02774 (2025)
[i98]Haosong Liu, Yuge Cheng, Zihan Liu, Aiyue Chen, Jing Lin, Yiwu Yao, Chen Chen, Jingwen Leng, Yu Feng, Minyi Guo:
Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers. CoRR abs/2506.05096 (2025)
[i97]Yu Feng, Weikai Lin, Yuge Cheng, Zihan Liu, Jingwen Leng, Minyi Guo, Chen Chen, Shixuan Sun, Yuhao Zhu:
Lumina: Real-Time Mobile Neural Rendering by Exploiting Computational Redundancy. CoRR abs/2506.05682 (2025)
[i96]Chenqi Zhang, Yu Feng, Jieru Zhao, Guangda Liu, Wenchao Ding, Chentao Wu, Minyi Guo:
STREAMINGGS: Voxel-Based Streaming 3D Gaussian Splatting with Memory Optimization and Architectural Support. CoRR abs/2506.09070 (2025)
[i95]Tianze Wang, Yifei Liu, Chen Chen, Pengfei Zuo, Jiawei Zhang, Qizhen Weng, Yin Chen, Zhenhua Han, Jieru Zhao, Quan Chen, Minyi Guo:
Efficient Unified Caching for Accelerating Heterogeneous AI Workloads. CoRR abs/2506.12370 (2025)
[i94]Yifei Liu, Zuo Gan, Zhenghao Gan, Weiye Wang, Chen Chen, Yizhou Shan, Xusheng Chen, Zhenhua Han, Yifei Zhu, Shixuan Sun, Minyi Guo:
Efficient Serving of LLM Applications with Probabilistic Demand Modeling. CoRR abs/2506.14851 (2025)
[i93]Jiale Xu, Rui Zhang, Yi Xiong, Cong Guo, Zihan Liu, Yangjie Zhou, Weiming Hu, Hao Wu, Changxu Shao, Ziqing Wang, Yongjie Yuan, Junping Zhao, Minyi Guo, Jingwen Leng:
eLLM: Elastic Memory Management Framework for Efficient LLM Serving. CoRR abs/2506.15155 (2025)
[i92]Chuhao Xu, Zijun Li, Quan Chen, Han Zhao, Minyi Guo:
LLM-Mesh: Enabling Elastic Sharing for Serverless LLM Inference. CoRR abs/2507.00507 (2025)
[i91]Chiyu Hao, Jixian Su, Shixuan Sun, Hao Zhang, Sen Gao, Jianwen Zhao, Chenyi Zhang, Jieru Zhao, Chen Chen, Minyi Guo:
RapidStore: An Efficient Dynamic Graph Storage System for Concurrent Queries. CoRR abs/2507.00839 (2025)
[i90]Xingyang Li, Jie Jiang, Yu Feng, Yiming Gan, Jieru Zhao, Zihan Liu, Jingwen Leng, Minyi Guo:
SLTarch: Towards Scalable Point-Based Neural Rendering by Taming Workload Imbalance and Memory Irregularity. CoRR abs/2507.21499 (2025)
[i89]Ping Chen, Zhuohong Deng, Ping Li, Shuibing He, Hongzi Zhu, Yi Zheng, Zhefeng Wang, Baoxing Huai, Minyi Guo:
Adacc: An Adaptive Framework Unifying Compression and Activation Recomputation for LLM Training. CoRR abs/2508.00806 (2025)
[i88]Jinyuan Chen, Jiuchen Shi, Quan Chen, Minyi Guo:
Kairos: Low-latency Multi-Agent Serving with Shared LLMs and Excessive Loads in the Public Cloud. CoRR abs/2508.06948 (2025)
[i87]Xinhao Luo, Zihan Liu, Yangjie Zhou, Shihan Fang, Ziyu Huang, Yu Feng, Chen Zhang, Shixuan Sun, Zhenzhe Zheng, Jingwen Leng, Minyi Guo:
ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive. CoRR abs/2508.18850 (2025)
[i86]Huanqi Hu, Bowen Xiao, Shixuan Sun, Jianian Yin, Zhexi Zhang, Xiang Luo, Chengquan Jiang, Weiqi Xu, Xiaoying Jia, Xin Liu, Minyi Guo:
LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving. CoRR abs/2509.01229 (2025)
[i85]Shulai Zhang, Ao Xu, Quan Chen, Han Zhao, Weihao Cui, Ningxin Zheng, Haibin Lin, Xin Liu, Minyi Guo:
Boosting Embodied AI Agents through Perception-Generation Disaggregation and Asynchronous Pipeline Execution. CoRR abs/2509.09560 (2025)
[i84]Mingyan Yang, Guanjie Wang, Manqi Luo, Yifei Liu, Chen Chen, Han Zhao, Yu Feng, Quan Chen, Minyi Guo:
Justitia: Fair and Efficient Scheduling for LLM Applications. CoRR abs/2510.17015 (2025)
[i83]Kangkang Sun, Jun Wu, Minyi Guo, Jianhua Li, Jianwei Huang:
Accurate Target Privacy Preserving Federated Learning Balancing Fairness and Utility. CoRR abs/2510.26841 (2025)
[i82]Ao Xu, Han Zhao, Weihao Cui, Quan Chen, Yukang Chen, Shulai Zhang, Shuang Chen, Jiemin Jiang, Zhibin Yu, Minyi Guo:
Harli: SLO-Aware Co-location of LLM Inference and PEFT-based Finetuning on Model-as-a-Service Platforms. CoRR abs/2511.11729 (2025)
[i81]Wenfeng Wang, Jiacheng Liu, Xiaofeng Hou, Xinfeng Xia, Peng Tang, Mingxuan Zhang, Chao Li, Minyi Guo:
MoE-SpeQ: Speculative Quantized Decoding with Proactive Expert Prefetching and Offloading for Mixture-of-Experts. CoRR abs/2511.14102 (2025)
[i80]Xiaotong Huang, He Zhu, Tianrui Ma, Yuxiang Xiong, Fangxin Liu, Zhezhi He, Yiming Gan, Zihan Liu, Jingwen Leng, Yu Feng, Minyi Guo:
Splatonic: Architecture Support for 3D Gaussian Splatting SLAM via Sparse Processing. CoRR abs/2511.18755 (2025)
[i79]Yunzhe Li, Jiajun Yan, Yuzhou Wei, Kechen Liu, Yize Zhao, Chong Zhang, Hongzi Zhu, Li Lu, Shan Chang, Minyi Guo:
BlinkBud: Detecting Hazards from Behind via Sampled Monocular 3D Detection on a Single Earbud. CoRR abs/2512.01366 (2025)
[i78]Yunzhe Li, Jianan Wang, Hongzi Zhu, James Lin, Shan Chang, Minyi Guo:
ThinkTrap: Denial-of-Service Attacks against Black-box LLM Services via Infinite Thinking. CoRR abs/2512.07086 (2025)
[i77]Ziyu Huang, Yangjie Zhou, Zihan Liu, Xinhao Luo, Yijia Diao, Minyi Guo, Jidong Zhai, Yu Feng, Chen Zhang, Anbang Wu, Jingwen Leng:
FlashFuser: Expanding the Scale of Kernel Fusion for Compute-Intensive Operators via Inter-Core Connection. CoRR abs/2512.12949 (2025)
[i76]Yue Guan, Changming Yu, Shihan Fang, Weiming Hu, Zaifeng Pan, Zheng Wang, Zihan Liu, Yangjie Zhou, Yufei Ding, Minyi Guo, Jingwen Leng:
Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding. CoRR abs/2512.23858 (2025)
[i75]Zhengyi Li, Kang Yang, Jin Tan, Wen-jie Lu, Haoqi Wu, Xiao Wang, Yu Yu, Derun Zhao, Yancheng Zheng, Minyi Guo, Jingwen Leng:
Nimbus: Secure and Efficient Two-Party Inference for Transformers. IACR Cryptol. ePrint Arch. 2025: 2250 (2025)
[i74]Zhengyi Li, Yue Guan, Kang Yang, Yu Feng, Ning Liu, Yu Yu, Jingwen Leng, Minyi Guo:
An Efficient Private GPT Never Autoregressively Decodes. IACR Cryptol. ePrint Arch.

