I am an associate professor at the School of Computer Science and Engineering, Sun Yat-sen University. Official Homepage. As a member of the Interdisciplinary Research Center (xRC), led by Prof. Yutong Lu and Prof. Nong Xiao, my research focuses on scalable and efficient computing systems for AI, HPC, and AI for Science, through innovations in hardware operators, task scheduling, resource management, and algorithm co-design.
I am looking for highly motivated undergraduate, master's, and PhD students. Please email me your CV if you are interested.
Education
- Ph.D. in Computer Science, Sun Yat-sen University (National Supercomputer Center in Guangzhou), advised by Yutong Lu, Xiangke Liao, and Yunfei Du.
- Visiting Scholar, National University of Singapore.
- M.Sc. in High Performance Computing and Data Science, The University of Edinburgh (Edinburgh Parallel Computing Centre).
- B.Sc. in Spatial Information and Digital Technology, Wuhan University.
News
- [2025-08] Two Papers accepted for ICCD 2025, congratulations to Xiao Shi and Jinhui Wei.
- [2025-07] Received an offer from Sun Yat-sen University and joined as an associate professor.
- [2025-07] One Paper accepted for ICPP 2025, congratulations to Hongbin Zhang.
- [2025-07] One Paper accepted for SC 2025.
- [2025-01] One Paper accepted for WWW 2025, congratulations to Yuhao Gu.
- [2024-12] One Paper accepted for TPDS.
- [2024-12] One Paper accepted for VLDB 2025, congratulations to Qingyin Lin!
- [2024-10] One Paper accepted for ASPLOS 2025, congratulations to Shenggan Cheng!
- [2024-04] One Paper accepted for SC 2024, congratulations to Yuanxin Wei!
- [2023-11] One Paper accepted for PPoPP 2024.
- [2021-10] Joined LuChen as a Research Intern, leading the large-scale Model Inference Project, EnergonAI, cooperating with Jiarui Fang, Shenggan Cheng, and Ziming Liu.
- [2021-06] Joined Tencent Shanghai, Visualization Group, as a Research Intern, working on technical verification of GPU pooling, mentored by Song Jike and Feng Kehuan.
- [2019-02] Visited Svalbard. What an amazing place!
Selected Publications
- * Denotes the corresponding author.
Conference Publications
- [VLDB 2025] [CCF-A] Qingyin Lin, Jiangsu Du*, Rui Li, Zhiguang Chen, Wenguang Chen, and Nong Xiao, “IncrCP: Decomposing and Orchestrating Incremental Checkpoints for Effective Recommendation Model Training”.
- [ICPP 2025] [CCF-B] Hongbin Zhang, Taosheng Wei, Zhenyi Zheng, Jiangsu Du*, Zhiguang Chen*, Yutong Lu, “TD-Pipe: Temporally-Disaggregated Pipeline Parallelism Architecture for High-Throughput LLM Inference”.
- [ICCD 2025] [CCF-B] Jinhui Wei, Ye Huang, Yuhui Zhou, Jiazhi Jiang, Jiangsu Du*, Hongbin Zhang, Taosheng Wei, Zhenyi Zheng, Jiangsu Du*, Zhiguang Chen, Yutong Lu, “Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism”.
- [ICCD 2025] [CCF-B] Xiao Shi, Jiangsu Du, Zhiguang Chen, Yutong Lu, “AuLoRA: Fine-Grained Loading and Computation Orchestration for Efficient LoRA LLM Serving”.
- [ASPLOS 2025] [CCF-A] Shenggan Cheng, Shengjie Lin, Lansong Diao, Hao Wu, Siyu Wang, Chang Si, Ziming Liu, Xuanlei Zhao, Jiangsu Du, Wei Lin, and Yang You, “Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning”.
- [SC 2025] [CCF-A] Yuhao Gu, Haoquan Chen, Xianjie Chen, Jiangsu Du, Zhiguang Chen, Nong Xiao, Xianwei Zhang, and Yutong Lu, “coMtainer: Compilation-assisted HPC Container Images with Enhanced Adaptability”.
- [WWW 2025] [CCF-A] Yuhao Gu, Junyu Chen, Jiangsu Du, Xiaoxi Zhang, and Xianwei Zhang, “ORFA: A WebAssembly-based Runtime to Optimize Remote Procedure Calls with Complete Expressiveness”.
- [PPoPP 2024] [CCF-A] Jiangsu Du, Jinhui Wei, Jiazhi Jiang, Shenggan Cheng, Dan Huang, Zhiguang Chen, Yutong Lu, “Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference”.
- [SC 2024] [CCF-A] Yuanxin Wei, Jiangsu Du*, Jiazhi Jiang, Xiao Shi, Xianwei Zhang, Dan Huang*, Nong Xiao, Yutong Lu, “APTMoE: Affinity-Aware Pipeline Tuning for MoE Models on Bandwidth-Constrained GPU Nodes”.
- [INFOCOM 2024] [CCF-A] Shengyuan Ye, Jiangsu Du*, Liekang Zeng, Wenzhong Ou, Xiaowen Chu, Yutong Lu, Xu Chen*, “Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference”.
- [DATE 2024] [CCF-B] Yuanxin Wei, Shengyuan Ye, Jiazhi Jiang, Xu Chen, Dan Huang*, Jiangsu Du*, Yutong Lu, “Communication-Efficient Model Parallelism for Distributed In-situ Transformer Inference”.
- [NPC 2024] [CCF-C] Yu Li, Yuanxin Wei, Jiangsu Du, Dan Huang, Nong Xiao, “Understanding the Inference Performance of Spatial Temporal Diffusion Transformer”.
- [ICCD 2023] [CCF-B] Jiazhi Jiang, Rui Tian, Jiangsu Du, Dan Huang, Yutong Lu, “MixRec: Orchestrating Concurrent Recommendation Model Training on CPU-GPU platform”.
- [DATE 2023] [CCF-B] Jiazhi Jiang, Zhijian Huang, Dan Huang, Jiangsu Du, Yutong Lu, “Accelerating Inference of 3D-CNN on ARM Many-core CPU via Hierarchical Model Partition”.
- [ICS 2022] [CCF-B] Jiangsu Du, Jiazhi Jiang, Yang You, Dan Huang, Yutong Lu, “Handling Heavy-tailed Input of Transformer Inference on GPUs”.
- [ICCD 2020] [CCF-B] Jiangsu Du, Minghua Shen, Yunfei Du, “A Distributed In-Situ CNN Inference System for IoT Applications”.
- [ICPP 2022] [CCF-B] Jiazhi Jiang, Jiangsu Du, Dan Huang, Dongsheng Li, Jiang Zheng, Yutong Lu, “Characterizing and Optimizing Transformer Inference on ARM Many-Core Processor”.
Journal Publications
- [TPDS] [CCF-A] Jiangsu Du, Xin Zhu, Minghua Shen, Yunfei Du, Yutong Lu, Nong Xiao, and Xiangke Liao, “Co-designing Transformer Architectures for Distributed Inference with Low Communication”.
- [TPDS] [CCF-A] Jiangsu Du, Yuanxin Wei, Shengyuan Ye, Jiazhi Jiang, Xu Chen, Dan Huang, and Yutong Lu, “Model Parallelism Optimization for Distributed Inference via Decoupled CNN Structure”.
- [TACO] [CCF-A] Jiangsu Du, Jiazhi Jiang, Jiang Zheng, Hongbin Zhang, Dan Huang, Yutong Lu, “Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs”.
- [JCST] [CCF-B] Jiangsu Du, Dongsheng Li, Yingpeng Wen, Jiazhi Jiang, Dan Huang, Xiangke Liao, and Yutong Lu, “SAIH: A Scalable Evaluation Methodology for Understanding AI Performance Trend on HPC Systems”.
- [IOTJ] [JCR-1] Jiangsu Du, Yunfei Du, Dan Huang, Yutong Lu, and Xiangke Liao, “Enhancing Distributed In-Situ CNN Inference in the Internet of Things”.
- [TPDS] [CCF-A] Rui Tian, Jiazhi Jiang, Jiangsu Du, Dan Huang, Yutong Lu, “Sophisticated Orchestrating Concurrent DLRM Training on CPU/GPU Platform”.
- [TPDS] [CCF-A] Jiazhi Jiang, Jiangsu Du, Dan Huang, Zhiguang Chen, Yutong Lu, Xiangke Liao, “Full-Stack Optimizing Transformer Inference on ARM Many-Core CPU”.
📚 Courses
- 2025.9 - Now, Principles of Computer Organization (《计算机组成原理》)