Xiaochuang Han  

(You can call me Han, which is easier to pronounce and remember)

firstname.lastname@gmail.com

[CV] [Google Scholar]



Bio

I am a Research Scientist at Meta FAIR. My research is centered on multimodal generation, with specific interests in omni-multimodal generative architectures, long and interactive text-video generation, and synthetic data generation. I have developed frameworks such as TV2TV for interleaved language and video generation, LMFusion for adapting pretrained language models to multimodal generation, and JPEG-LM for codec-based image generation. My previous work also includes diffusion language models, inference-time model collaboration, and training data attribution.

I earned my Ph.D. in Computer Science and Engineering from the University of Washington, where I was advised by Yulia Tsvetkov. Before UW, I received my M.S. from Carnegie Mellon University. I completed my undergraduate studies at Georgia Tech, where I was advised by Jacob Eisenstein. My research has been supported by the OpenAI Superalignment Fellowship (2024) and the Meta AI Mentorship Program (2023, 2022).


Selected Publications

Please see my Google Scholar or CV for a full list of publications.

TV2TV: A Unified Framework for Interleaved Language and Video Generation
Xiaochuang Han, Youssef Emad, Melissa Hall, John Nguyen, Karthik Padthe, Liam Robbins, Amir Bar, Delong Chen, Michal Drozdzal, Maha Elbayad, Yushi Hu, Shang-Wen Li, Sreya Dutta Roy, Jakob Verbeek, XuDong Wang, Marjan Ghazvininejad, Luke Zettlemoyer, and Emily Dinan.
CVPR 2026

MADFormer: Mixed Autoregressive and Diffusion Transformers for Continuous Image Generation
Junhao Chen, Yulia Tsvetkov, and Xiaochuang Han.
ICLR 2026

LMFusion: Adapting Pretrained Language Models for Multimodal Generation
Weijia Shi*, Xiaochuang Han*, Chunting Zhou, Weixin Liang, Xi Victoria Lin, Luke Zettlemoyer, and Lili Yu.
NeurIPS 2025

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
Xiaochuang Han, Marjan Ghazvininejad, Pang Wei Koh, and Yulia Tsvetkov.
arXiv preprint

David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs
Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov, and Marjan Ghazvininejad.
NAACL 2024

Tuning Language Models by Proxy
Alisa Liu, Xiaochuang Han, Yizhong Wang, Yulia Tsvetkov, Yejin Choi, and Noah A. Smith.
COLM 2024

Trusting Your Evidence: Hallucinate Less with Context-aware Decoding
Weijia Shi*, Xiaochuang Han*, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, and Scott Wen-tau Yih.
NAACL 2024

SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han, Sachin Kumar, and Yulia Tsvetkov.
ACL 2023

Understanding In-Context Learning via Supportive Pretraining Data
Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, and Tianlu Wang.
ACL 2023