Publications

ICCV 2025
Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding
Zhang, Y., Zhao, Z., Chen, Z., Ding, Z., Yang, X., & Sun, Y.
International Conference on Computer Vision (ICCV 2025), Honolulu, Hawaii, Oct 2025
Video
ICML 2025
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning
Chen, Z., Kang, M., & Li, B.
International Conference on Machine Learning (ICML 2025), Vancouver, Canada, July 2025
Guardrail Agent Safety
ICLR 2025
SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations
Chen, Z., Pinto, F., Pan, M., & Li, B.
International Conference on Learning Representations (ICLR 2025), Singapore, Apr 2025
Guardrail Multimodal Safety Video
ICLR 2025
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models
Xu, C., Zhang, J., Chen, Z., Xie, C., Kang, M., Yuan, Z., Xiong, Z., Zhang, C., Yuan, L., Zeng, Y., Xu, P., Guo, C., Zhou, A., Tan, J. Z., Wang, Z., Xiong, A., Zhao, X., Gai, Y., Pinto, F., Potter, Y., Xiang, Z., Lin, Z., Hendrycks, D., Song, D., & Li, B.
International Conference on Learning Representations (ICLR 2025), Singapore, Apr 2025
Risk Assessment Multimodal Safety Evaluation
ICLR 2025
AnyPrefer: An Automatic Framework for Preference Data Synthesis
Zhou, Y., Wang, Z., Wang, T., Xing, S., Xia, P., Li, B., Zheng, K., Zhang, Z., Chen, Z., Zheng, W., Zhang, X., Bansal, C., Zhang, W., Wei, Y., Bansal, M., & Yao, H.
International Conference on Learning Representations (ICLR 2025), Singapore, Apr 2025
Safety Alignment Reinforcement Learning
ICLR 2025
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Xia, P., Han, S., Qiu, S., Zhou, Y., Wang, Z., Zheng, W., Chen, Z., Cui, C., Ding, M., Li, L., Wang, L., & Yao, H.
International Conference on Learning Representations (ICLR 2025), Singapore, Apr 2025
Oral Presentation
Evaluation Multimodal Safety
ICLR 2025
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Cui, C., Zhang, A., Zhou, Y., Chen, Z., Deng, G., Yao, H., & Chua, T. S.
International Conference on Learning Representations (ICLR 2025), Singapore, Apr 2025
Safety Alignment Multimodal Safety Reinforcement Learning
NeurIPS 2024
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
Chen, Z., Xiang, Z., Xiao, C., Song, D., & Li, B.
Advances in Neural Information Processing Systems (NeurIPS 2024), Vancouver, Canada, Dec 2024
Risk Assessment Agent Safety
NeurIPS 2024
Calibrated Self-rewarding Vision Language Models
Zhou, Y., Fan, Z., Cheng, D., Yang, S., Chen, Z., Cui, C., Wang, X., Li, Y., Zhang, L., & Yao, H.
Advances in Neural Information Processing Systems (NeurIPS 2024), Vancouver, Canada, Dec 2024
Safety Alignment Multimodal Safety
ICML 2024
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding
Chen, Z., Zhao, Z., Luo, H., Yao, H., Li, B., & Zhou, J.
International Conference on Machine Learning (ICML 2024), Vienna, Austria, July 2024
Safety Alignment Multimodal Safety
CoRL 2024
EscIRL: Evolving Self-Contrastive IRL for Trajectory Prediction in Autonomous Driving
Wang, S., Chen, Z., Zhao, Z., Mao, C., Zhou, Y., He, J., & Hu, A. S.
Conference on Robot Learning (CoRL 2024), Munich, Germany, Nov 2024
Safety Alignment Reinforcement Learning
NAACL 2024
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition
Chen, Z., Zhao, Z., Zhu, Z., Zhang, R., Li, X., Raj, B., & Yao, H.
North American Chapter of the Association for Computational Linguistics (NAACL 2024), Mexico City, Mexico, Jun 2024
Safety Alignment
IROS 2024
Safe Reinforcement Learning via Hierarchical Adaptive Chance-Constraint Safeguards
Chen, Z., Zhao, Z., He, T., Chen, B., Zhao, X., Gong, L., & Liu, C.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024), Abu Dhabi, UAE, Oct 2024
Oral Presentation
Safety Alignment Reinforcement Learning
arXiv 2024
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Chen, Z., Du, Y., Wen, Z., Zhou, Y., Cui, C., Weng, Z., Tu, H., Wang, C., Tong, Z., Huang, Q., Chen, C., Ye, Q., Zhu, Z., Zhang, Y., Zhou, J., Zhao, Z., Rafailov, R., Finn, C., & Yao, H.
arXiv preprint arXiv:2407.04842, 2024
Evaluation Multimodal Safety