Yihao Quan (全奕豪)

I'm a fourth-year undergraduate student at Beijing Jiaotong University, majoring in Management Information Systems.

I will be pursuing my Ph.D. in Computer Science at Rutgers University, starting in Fall 2025, under the supervision of Prof. Ryan Tang.

Email  /  Scholar  /  LinkedIn  /  WeChat  /  GitHub


Research Interests

My research primarily focuses on Multimodal Learning and Trustworthy & Interpretable AI, with an emphasis on the following areas:

  • Vision + Language: Bridging the gap between our knowledge of language and vision representation, providing a foundation for more interpretable and controllable multimodal systems.
  • Unified Multimodal Understanding and Generation: Bridging architectural differences between autoregressive and diffusion models via mutual promotion of understanding and generation.
  • Mechanistic Interpretability: Understanding the internal mechanisms of LLMs and MLLMs through techniques such as sparse autoencoders, circuit analysis, causal tracing, and neuron analysis.
  • Hallucination, Factuality and Safety: Using interpretability findings to improve downstream tasks (e.g., factual knowledge, reasoning enhancement, hallucination reduction, model editing) and to design safer models.

    Representative papers are highlighted. * = Equal Contribution

News

Seeing Clearly by Layer Two: Enhancing Attention Heads to Alleviate Hallucination in LVLMs
Xiaofeng Zhang*, Yihao Quan*, Chaochen Gu, Chen Shen, Xiaosong Yuan, Shaotian Yan, Jieping Ye
arXiv
Under Review

A plug-and-play, training-free method that significantly mitigates hallucination across different VLMs and metrics.

From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks
Xiaofeng Zhang*, Yihao Quan*, Chaochen Gu, Chen Shen, Xiaosong Yuan, Shaotian Yan, Liang Xie, Wenxiao Wang, Hao Tang, Jieping Ye
GitHub / arXiv
Accepted by NAACL'25 Main (Oral)

A novel perspective on how LVLMs function, particularly on complex reasoning tasks.

Enhancing LVLMs’ Complex Reasoning via Similarity Computation
Fanshuo Zeng*, Xiaofeng Zhang*, Yihao Quan, Zheng Hui, Jiawei Yao
GitHub / arXiv
Accepted by AAAI'25

A novel image token reduction method, Simignore, designed to enhance the complex reasoning capabilities of LVLMs.


Selected Awards and Honors

  • 2022-2024: Second Class Scholarship
  • 2022-2024: Merit Student of Beijing Jiaotong University
  • 2022-2023: Dean’s List of Rochester Institute of Technology
  • 2022: Second Prize in the National College Student Mathematical Modeling Competition
  • 2022: Kaggle Silver Medal (Top 5%): Feedback Prize - Evaluating Student Writing

Feel free to steal this website's source code.