👀 About Me

Hi,

🌱 I’m Xiaoxiao Ma, a first-year PhD student at USTC, in USTC-BIVLab supervised by Prof. Feng Zhao. I am currently a research intern at Meituan

📖 My research interests include:

  • Generative models & image synthesis, autoregressive models, vision-language models
  • Image restoration, image enhancement

📫 I welcome collaborations and internship opportunities; feel free to contact me via email.



🔥 News

  • 2025.06: Delighted to announce that HQ-CLIP was accepted by ICCV 2025!
  • 2024.09: Delighted to announce that MPI was accepted by NeurIPS 2024!
  • 2024.09: I was invited to give a talk at ByteDance as the author of STAR! See slides here
  • 2024.06: STAR was released on arXiv.



📝 Publications

arXiv 2024

STAR: Scale-wise Text-to-image generation via Auto-Regressive representations

Xiaoxiao Ma*, Mohan Zhou*, Tao Liang, Yalong Bai, et al.

Project

  • STAR is a novel scale-wise text-to-image model that is both effective and efficient.
  • Notably, STAR requires only 2.95s to generate a 512×512 image (compared to 6.48s for PixArt-α).
NeurIPS 2024

Masked Pre-trained Model Enables Universal Zero-shot Denoiser

Xiaoxiao Ma*, Zhixiang Wei*, Yi Jin, Pengyang Ling, et al.

Project

  • MPI is a zero-shot denoising pipeline designed to handle many types of noise degradation.
  • MPI takes only around 10s to denoise a single noisy image.
ICCV 2025

HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets and CLIP Models

Zhixiang Wei*, Guangting Wang*, Xiaoxiao Ma, et al.

Project

  • A CLIP training framework that uses 1.3B bidirectional image–text pairs and combines bidirectional supervision with label classification, achieving SoTA zero-shot and retrieval performance.
CVPR 2024

Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation

Zhixiang Wei*, Lin Chen*, Yi Jin*, Xiaoxiao Ma, et al.

Project

  • Rein is a parameter-efficient fine-tuning (PEFT) framework built on vision foundation models for domain generalized semantic segmentation (DGSS), with merely 1% trainable parameters.



💻 Experiences

  • 2025.04 - Present, Meituan, Beijing.
  • 2024.12 - 2025.03, Shanghai AI Laboratory, Shanghai.
  • 2024.04 - 2024.12, Duxiaoman, Beijing.



Academic Service (Reviewer)

  • NeurIPS 2025
  • IEEE TPAMI



🎖 Honors and Awards

  • 2022~2024 First Prize Scholarship of USTC for three consecutive years
  • 2024 National Scholarship for Undergraduate Students



📖 Education

  • 2022.09 - now, University of Science and Technology of China, Anhui. Master's candidate in Computer Vision
  • 2018.09 - 2022.06, China Agricultural University, Beijing. B.Eng. in Computer Science