publications | Shujian Zhang


2025

DLM-One: Diffusion Language Models for One-Step Sequence Generation

Tianqi Chen, Shujian Zhang, and Mingyuan Zhou

arXiv preprint arXiv:2506.00290, 2025

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Gemini Team

arXiv preprint arXiv:2507.06261, 2025


2024

Preprint

Introducing Gemini 2.0: our new AI model for the agentic era (2024)

Sundar Pichai, Demis Hassabis, and Koray Kavukcuoglu

2024
ICLR 2025

Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models

Tianqi Chen, Shujian Zhang, and Mingyuan Zhou

arXiv preprint arXiv:2409.11219, 2024
ICLR 2025

Statistical Advantages of Perturbing Cosine Router in Sparse Mixture of Experts

Huy Nguyen, Pedram Akbarian, Trang Pham, and 3 more authors

arXiv preprint arXiv:2405.14131, 2024
ACL 2025

T-REG: Preference Optimization with Token-Level Reward Regularization

Wenxuan Zhou, Shujian Zhang, Lingxiao Zhao, and 1 more author

arXiv preprint arXiv:2412.02685, 2024
ICLR 2025

Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy

Tong Wu, Shujian Zhang, Kaiqiang Song, and 7 more authors

arXiv preprint arXiv:2410.09102, 2024
EMNLP 2024

WPO: Enhancing RLHF with Weighted Preference Optimization

Wenxuan Zhou, Ravi Agrawal, Shujian Zhang, and 5 more authors

In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
NAACL 2024

LanguageFlow: Advancing Diffusion Language Generation with Probabilistic Flows

Shujian Zhang, Lemeng Wu, Chengyue Gong, and 1 more author

In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
ICML 2024

Sliced Wasserstein with random-path projecting directions

Khai Nguyen, Shujian Zhang, Tam Le, and 1 more author

In International Conference on Machine Learning, 2024
ICML 2024

Switchable Decision: Dynamic Neural Generation Networks

Shujian Zhang, Korawat Tanwisuth, Chengyue Gong, and 2 more authors

In International Conference on Machine Learning, 2024
Preference-grounded token-level guidance for language model fine-tuning

Shentao Yang, Shujian Zhang, Congying Xia, and 3 more authors

Advances in Neural Information Processing Systems, 2024

2023

CVPR 2023

FlowGrad: Controlling the Output of Generative ODEs with Gradients

Xingchao Liu, Lemeng Wu, Shujian Zhang, and 3 more authors

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Preprint

AutoML-GPT: Automatic Machine Learning with GPT

Shujian Zhang, Chengyue Gong, Lemeng Wu, and 2 more authors

arXiv preprint arXiv:2305.02499, 2023
ICML 2023

POUF: Prompt-oriented unsupervised fine-tuning for large pre-trained models

Korawat Tanwisuth, Shujian Zhang, Huangjie Zheng, and 2 more authors

In International Conference on Machine Learning, 2023
ICLR 2023

Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-Oriented Dialogue Systems

Yihao Feng, Shentao Yang, Shujian Zhang, and 4 more authors

arXiv preprint arXiv:2302.10342, 2023
Preprint

A prototype-oriented clustering for domain shift with source privacy

Korawat Tanwisuth, Shujian Zhang, Pengcheng He, and 1 more author

arXiv preprint arXiv:2302.03807, 2023

2022

EMNLP 2022

Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models

Shujian Zhang, Chengyue Gong, and Xingchao Liu

In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
NeurIPS 2022

A unified framework for alternating offline model training and policy learning

Shentao Yang, Shujian Zhang, Yihao Feng, and 1 more author

Advances in Neural Information Processing Systems, 2022
ICML 2022

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

Shentao Yang, Yihao Feng, Shujian Zhang, and 1 more author

In International Conference on Machine Learning, 2022
NAACL 2022

ALLSH: Active Learning Guided by Local Sensitivity and Hardness

Shujian Zhang, Chengyue Gong, Xingchao Liu, and 3 more authors

arXiv preprint arXiv:2205.04980, 2022

2021

Preprint

Crossformer: Transformer with Alternated Cross-Layer Guidance

Shujian Zhang, Zhibin Duan, Huangjie Zheng, and 4 more authors

2021
EMNLP 2021

Learning from uneven training data: Unlabeled, single label, and multiple labels

Shujian Zhang, Chengyue Gong, and Eunsol Choi

arXiv e-prints, 2021
Preprint

Capturing label distribution: A case study in NLI

Shujian Zhang, Chengyue Gong, and Eunsol Choi

arXiv preprint arXiv:2102.06859, 2021
ICLR 2021

Contextual dropout: An efficient sample-dependent dropout module

Xinjie Fan, Shujian Zhang, Korawat Tanwisuth, and 2 more authors

arXiv preprint arXiv:2103.04181, 2021
Preprint

FuseDream: Training-free text-to-image generation with improved CLIP+GAN space optimization

Xingchao Liu, Chengyue Gong, Lemeng Wu, and 3 more authors

arXiv preprint arXiv:2112.01573, 2021
ACL 2021

Knowing more about questions can help: Improving calibration in question answering

Shujian Zhang, Chengyue Gong, and Eunsol Choi

arXiv preprint arXiv:2106.01494, 2021
NeurIPS 2021

A prototype-oriented framework for unsupervised domain adaptation

Korawat Tanwisuth, Xinjie Fan, Huangjie Zheng, and 4 more authors

Advances in Neural Information Processing Systems, 2021
ICML 2021

Bayesian attention belief networks

Shujian Zhang, Xinjie Fan, Bo Chen, and 1 more author

In International Conference on Machine Learning, 2021
NeurIPS 2021

Alignment attention by matching key and query distributions

Shujian Zhang, Xinjie Fan, Huangjie Zheng, and 2 more authors

Advances in Neural Information Processing Systems, 2021

2020

NeurIPS 2020

Bayesian attention modules

Xinjie Fan, Shujian Zhang, Bo Chen, and 1 more author

Advances in Neural Information Processing Systems, 2020