Ruiyang Xu (徐瑞阳)

Senior Undergraduate in Computer Science @ SJTU

Research: Multimodal Learning

Email: xry2022 [at] sjtu.edu.cn

About

I am a senior CS student at Shanghai Jiao Tong University, advised by Prof. Xie Chen.

My research focuses on fine-grained perception in multimodal LLMs, specifically through audio-visual synergy. I am currently a Research Intern at Alibaba Qwen, where I have contributed to the development of Qwen3-Omni, with an emphasis on audio-visual captioning and agents.

In my spare time, I listen to Britpop and J-rock.

News

Selected Publications

The Interspeech 2026 Audio Reasoning Challenge: Evaluating Reasoning Process Quality for Audio Reasoning Models and Agents
Ziyang Ma, Ruiyang Xu, Yinghao Ma, Chao-Han Huck Yang, Bohan Li, Jaeyeon Kim, Jin Xu, Jinyu Li, Carlos Busso, Kai Yu, Eng Siong Chng, Xie Chen
arXiv
2026.02
Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception
Ziyang Ma*, Ruiyang Xu*, Zhenghao Xing*, Yunfei Chu, Yuxuan Wang, Jinzheng He, Jin Xu, Pheng-Ann Heng, Kai Yu, Junyang Lin, Eng Siong Chng, Xie Chen
Proc. ICLR 2026

* Equal contribution

2025.10
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
Ziyang Ma, Yinghao Ma, Yanqiao Zhu, Chen Yang, Yi-Wen Chao, Ruiyang Xu, Wenxi Chen, Yuanzhe Chen, Zhuo Chen, Jian Cong, Kai Li, Keliang Li, Siyou Li, Xinfeng Li, Xiquan Li, Zheng Lian, Yuzhe Liang, Minghao Liu, Zhikang Niu, Tianrui Wang, Yuping Wang, Yuxuan Wang, Yihao Wu, Guanrou Yang, Jianwei Yu, Ruibin Yuan, Zhisheng Zheng, Ziya Zhou, Haina Zhu, Wei Xue, Emmanouil Benetos, Kai Yu, Eng-Siong Chng, Xie Chen
Proc. NeurIPS 2025
2025.05

Education

Shanghai Jiao Tong University
B.S. in Computer Science
2022 – Present