ML + Vision Top-6 Agent Survey - ICLR 2025 - Page 3 of 3

  • Venue: International Conference on Learning Representations
  • Year: 2025
  • Page: 3 / 3
  • Papers: 61-74 / 74

Bridging Compressed Image Latents and Multimodal Large Language Models
  • Authors: Chia-Hao Kao, Cheng Chien, Yu-Jen Tseng, Yi-Hsin Chen, Alessandro Gnutti, Shao-Yuan Lo, Wen-Hsiao Peng, Riccardo Leonardi
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: multimodal large language models).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
  • Authors: Yilun Hao, Yang Zhang, Chuchu Fan
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: program synthesis (matched: programming).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
  • Authors: Shuo Li, Tao Ji, Xiaoran Fan, Linsheng Lu, Leyi Yang, Yuming Yang, Zhiheng Xi, Rui Zheng, Yuran Wang, XH Zhao, Tao Gui, Qi Zhang, et al.
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: vlms).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment
  • Authors: Pritam Sarkar, Sayna Ebrahimi, Ali Etemad, Ahmad Beirami, Sercan Ö. Arik, Tomas Pfister
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: mllms).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
  • Authors: Yaxin Luo, Gen Luo, Jiayi Ji, Yiyi Zhou, Xiaoshuai Sun, Zhiqiang Shen, Rongrong Ji
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: multimodal large language models).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
  • Authors: Fanqing Meng, Jin Wang, Chuanhao Li, Quanfeng Lu, Hao Tian, Tianshuo Yang, Jiaqi Liao, Xizhou Zhu, Jifeng Dai, Yu Qiao, Ping Luo, Kaipeng Zhang, et al.
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: vision language models, large vision language models).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
  • Authors: Jie Zhang, Zhong Ling Wang, Mengqi Lei, Zheng Yuan, Bei Yan, Shiguang Shan, Xilin Chen
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: lvlms).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage
  • Authors: Zhi Gao, Bofei Zhang, Pengxiang Li, Xiaojian Ma, Tao Yuan, Yue Fan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: vlm).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
  • Authors: Zehan Qi, Xiao Liu, Iat Long Iong, Hanyu Lai, Xueqiao Sun, Jiadai Sun, Xinyue Yang, Yu Yang, Shuntian Yao, Wei Xu, Jie Tang, Yuxiao Dong
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: computer-use agents (matched: web agents).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Language Agents Meet Causality - Bridging LLMs and Causal World Models
  • Authors: John Gkountouras, Matthias Lindemann, Phillip Lippe, E. Gavves, Ivan Titov
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: LLM agents (matched: language agents).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
  • Authors: Kexun Zhang, Weiran Yao, Zuxin Liu, Yihao Feng, Zhiwei Liu, Rithesh Murthy, Tian Lan, Lei Li, Renze Lou, Jiacheng Xu, Bo Pang, Yingbo Zhou, et al.
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: program synthesis (matched: software engineering).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
  • Authors: Rylan Schaeffer, Dan Valentine, Luke Bailey, James Chua, Cristobal Eyzaguirre, Zane Durante, Joe Benton, Brando Miranda, Henry Sleight, T. Wang, John Hughes, Rajashree Agrawal, et al.
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: vision language models).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation
  • Authors: Jihyo Kim, Seulbi Lee, Sangheum Hwang
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: vision-language models (matched: vision language models).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.

Do LLM Agents Have Regret? A Case Study in Online Learning and Games
  • Authors: Chanwoo Park, Xiangyu Liu, A. Ozdaglar, Kaiqing Zhang
  • Year: 2025
  • Venue: International Conference on Learning Representations
  • DOI: Not stated.
  • Citations: 0
  • Relevance: 3 / 5
  • Why selected: Heuristic keyword/alias matches: LLM agents (matched: llm agents).
  • Code: Not found.
  • Extraction: method/data pending

Abstract

Not stated in metadata.

Claim

Not stated in abstract.