I am currently a fourth-year (final-year) Ph.D. student in the School of Computing, National University of Singapore, advised by Prof. Yang You. Before that, I obtained my bachelor’s and master’s degrees from Northwestern Polytechnical University, China, in 2019 and 2022, respectively. During my master’s studies, I was fortunate to collaborate with Dr. Nian Liu, under the supervision of Prof. Junwei Han.

My research interests include Efficient AI, Dynamic Models, and Model-system Co-design. I have published more than 20 papers at top international AI conferences and journals.

If you are interested in collaborating on projects related to efficient deep learning or other promising research directions, you are welcome to email me (wangbo.zhao96@gmail.com).

Beyond research, I am an amateur athlete specializing in the 400 meters (personal best: 53.40) and 400-meter hurdles (personal best: 1:01.78). I was honored to represent NUS in national-level competitions in Singapore and NWPU in provincial-level competitions in Shaanxi, China.

🔥 News

  • 2026.01: 🎉🎉 Four papers accepted to ICLR 2026! Proud of the team’s contributions to efficient DiT, MoE, visual tracking, and generation evaluation. Congratulations to everyone involved! 🚀
  • 2026.01: 🎉🎉 One paper accepted to TPAMI 2026. Congratulations to all the authors!
  • 2026.01: 🎉🎉 I have successfully completed my thesis defense!
  • 2025.11: 🎉🎉 I am invited to give a talk at the Eastern Institute of Technology and HKUST (GZ).
  • 2025.10: 🎉🎉 I am invited to give a talk at the Global College of Shanghai Jiao Tong University.
  • 2025.10: 🎉🎉 Recipient of the Google PhD Fellowship 2025 in Machine Learning and ML Foundations.
  • 2025.09: 🎉🎉 I am invited to give a talk at ShanghaiTech University.
  • 2025.07: 🎉🎉 One paper accepted to ICCV 2025.
  • 2025.06: 🎉🎉 I began my internship at Meta in Zurich.
  • 2025.05: 🎉🎉 One paper accepted to ICML 2025.
  • 2025.02: 🎉🎉 One paper accepted to CVPR 2025.
  • 2025.01: 🎉🎉 One paper accepted to ICLR 2025.
  • 2024.09: 🎉🎉 One paper accepted to NeurIPS 2024.
  • 2024.07: 🎉🎉 One paper accepted to ECCV 2024.

๐Ÿ“ Selected Publications

ICLR 2026

MoNE: Replacing Redundant Experts with Lightweight Novices for Structured Pruning of MoE

Geng Zhang, Yuxuan Han, Yuxuan Lou, Wangbo Zhao†, Yiqi Zhang, Yang You†

  • MoNE achieves efficient MoE compression by replacing redundant experts with lightweight “novices,” significantly reducing memory overhead while maintaining higher accuracy than traditional pruning.
ICLR 2026

RAPID^3: Tri-Level Reinforced Acceleration Policies for Diffusion Transformer

Wangbo Zhao, Yizeng Han, Zhiwei Tang, Jiasheng Tang, Pengfei Zhou, Kai Wang, Bohan Zhuang, Zhangyang Wang, Fan Wang, Yang You

  • RAPID^3 speeds up DiT models like FLUX by 3x using adaptive reinforcement learning policies that accelerate sampling without any fine-tuning of the original generator.
TPAMI 2026

DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation

Code

Wangbo Zhao*, Yizeng Han*, Jiasheng Tang, Kai Wang, Hao Luo, Yibing Song, Gao Huang, Fan Wang, Yang You

  • DyDiT++ extends DyDiT by integrating flow matching for broader applications, improving adaptability to complex tasks, and adding timestep-based dynamic LoRA for efficient training.
ICCV 2025

EA-ViT: Efficient Adaptation for Elastic Vision Transformer

Code

Chen Zhu, Wangbo Zhao†, Huiwen Zhang, Samir Khaki, Yuhao Zhou, Weidong Tang, Shuo Wang, Zhihang Yuan, Yuzhang Shang, Xiaojiang Peng, Kai Wang, Dawei Yang†

  • EA-ViT is an efficient adaptation framework for Vision Transformers, enabling a single process to generate flexible models of varying sizes for diverse resource constraints, using a nested elastic architecture and a lightweight router optimized with Pareto-optimal configurations.
ICML 2025

Unsupervised Learning for Class Distribution Mismatch

Pan Du, Wangbo Zhao†, Xinai Lu, Nian Liu, Zhikai Li, Chaoyu Gong, Suyun Zhao†, Hong Chen, Cuiping Li, Kai Wang, Yang You

  • UCDM addresses Class Distribution Mismatch (CDM) by leveraging unlabeled data to train classifiers through positive-negative pairs, synthesized using a diffusion model, and a confidence-based pseudo-labeling mechanism, achieving superior performance over semi-supervised methods without relying on labeled data.
CVPR 2025

A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs

Code

Wangbo Zhao, Yizeng Han, Jiasheng Tang, Zhikai Li, Yibing Song, Kai Wang, Zhangyang Wang, Yang You

  • We use the attention map aggregated from a small VLM to guide visual token pruning in a large VLM. We also develop an early-exiting mechanism that makes full use of the small VLM’s predictions and invokes the large VLM only when necessary, yielding a superior trade-off between accuracy and computation.
ICLR 2025

Dynamic diffusion transformer

Code

Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Yibing Song, Gao Huang, Fan Wang, Yang You

  • We propose dynamically adjusting the computation of DiT across timesteps and spatial locations of images. The computation of DiT-XL can be reduced by 50% without sacrificing generation quality.
NeurIPS 2024

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

Code

Wangbo Zhao, Jiasheng Tang, Yizeng Han, Yibing Song, Kai Wang, Gao Huang, Fan Wang, Yang You

  • We propose to adapt static ViT to dynamic ViT via parameter-efficient fine-tuning without full-parameter tuning.
ECCV 2024 (Oral)

MMBench: Is your multi-modal model an all-around player?

Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin

  • We propose MMBench, a bilingual benchmark for assessing the multi-modal capabilities of VLMs.
CVPR 2022

Modeling motion with multi-modal features for text-based video segmentation

Wangbo Zhao, Kai Wang, Xiangxiang Chu, Fuzhao Xue, Xinchao Wang, Yang You

  • We design a method to fuse and align appearance, motion, and linguistic features to achieve accurate text-based video segmentation.
ICCV 2021

Light field saliency detection with dual local graph learning and reciprocative guidance

Nian Liu*, Wangbo Zhao*, Dingwen Zhang, Junwei Han, Ling Shao

  • We introduce a reciprocative guidance scheme for light field saliency detection.
CVPR 2021

Weakly supervised video salient object detection

Wangbo Zhao, Jing Zhang, Long Li, Nick Barnes, Nian Liu, Junwei Han

  • We present the first weakly supervised video salient object detection model based on relabeled fixation guided scribble annotations.

📖 Education

  • 2022.08 - 2026.06, Ph.D., School of Computing, National University of Singapore, Singapore.
  • 2019.09 - 2022.04, Master, School of Automation, Northwestern Polytechnical University, China.
  • 2017.07 - 2019.01, Undergraduate, Université de technologie de Troyes, France.
  • 2015.09 - 2019.06, Undergraduate, Honors College, Northwestern Polytechnical University, China.