ML + Vision Top-6 Agent Survey (2023-2026)
Note: this is a Semantic Scholar snapshot generated on 2026-06-03. Citation counts are cumulative as of fetch time, not year-end counts. The 2026 rows are incomplete because several 2026 conferences were not fully published or indexed at snapshot time. Scoring uses heuristic keyword/alias matching for fast triage; selected papers should be treated as candidates for deeper review.
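As a rough illustration of what keyword/alias matching for triage can look like, here is a minimal sketch. The alias groups, the `topic_score` name, and the one-point-per-group rule are all assumptions made for illustration; the survey's actual keyword list and scoring rule are not given on this page, so this sketch is not expected to reproduce the Score column.

```python
import re

# Hypothetical alias groups; the survey's real keyword list is not shown here.
ALIAS_GROUPS = {
    "agent": ["agent", "agents", "agentic"],
    "gui": ["gui", "graphical user interface"],
    "tool_use": ["tool use", "tool-use", "function calling"],
    "planning": ["planning", "tree search"],
    "benchmark": ["benchmark", "benchmarking"],
}


def topic_score(text: str) -> int:
    """Count how many alias groups have at least one whole-word match in text."""
    lowered = text.lower()
    score = 0
    for aliases in ALIAS_GROUPS.values():
        # \b anchors keep "agent" from matching inside a longer token.
        if any(re.search(r"\b" + re.escape(alias) + r"\b", lowered) for alias in aliases):
            score += 1
    return score
```

A scorer of this kind is cheap to run over titles and abstracts, but it misses paraphrases and can fire on unrelated uses of a word, which is why matched papers are only candidates for deeper review.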
Summary
Papers Per Venue Per Year
| Venue | 2023 | 2024 | 2025 | 2026 | Total |
|---|---|---|---|---|---|
| Neural Information Processing Systems | 40 | 167 | 0 | 0 | 207 |
| International Conference on Machine Learning | 35 | 75 | 98 | 0 | 208 |
| International Conference on Learning Representations | 45 | 124 | 74 | 0 | 243 |
| Computer Vision and Pattern Recognition | 53 | 143 | 124 | 0 | 320 |
| IEEE International Conference on Computer Vision | 19 | 50 | 153 | 0 | 222 |
| European Conference on Computer Vision | 16 | 61 | 0 | 0 | 77 |
Top 10 By Citation Count
Top 5 By Topic Match
| Rank | Title | Venue | Year | Citations | Score | DOI |
|---|---|---|---|---|---|---|
| 1 | SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | Neural Information Processing Systems | 2024 | 1175 | 5 | 10.48550/arXiv.2405.15793 |
| 2 | OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments | Neural Information Processing Systems | 2024 | 757 | 5 | 10.48550/arXiv.2404.07972 |
| 3 | CogAgent: A Visual Language Model for GUI Agents | Computer Vision and Pattern Recognition | 2023 | 749 | 5 | 10.1109/CVPR52733.2024.01354 |
| 4 | Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | International Conference on Machine Learning | 2023 | 493 | 5 | 10.48550/arXiv.2310.04406 |
| 5 | Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents | International Conference on Learning Representations | 2024 | 344 | 5 | 10.48550/arXiv.2410.05243 |
Paginated Papers
The paper-level sections are split into static venue/year pages so the deployed MkDocs page stays readable and fast to load.
Total paper entries: 1277. Detail pages: 53. Target page size: 30 papers.
| Venue | Year | Papers | Detail pages | Start |
|---|---|---|---|---|
| Neural Information Processing Systems | 2023 | 40 | 2 | NeurIPS 2023 |
| Neural Information Processing Systems | 2024 | 167 | 6 | NeurIPS 2024 |
| Neural Information Processing Systems | 2025 | 0 | 1 | NeurIPS 2025 |
| International Conference on Machine Learning | 2023 | 35 | 2 | ICML 2023 |
| International Conference on Machine Learning | 2024 | 75 | 3 | ICML 2024 |
| International Conference on Machine Learning | 2025 | 98 | 4 | ICML 2025 |
| International Conference on Learning Representations | 2023 | 45 | 2 | ICLR 2023 |
| International Conference on Learning Representations | 2024 | 124 | 5 | ICLR 2024 |
| International Conference on Learning Representations | 2025 | 74 | 3 | ICLR 2025 |
| Computer Vision and Pattern Recognition | 2023 | 53 | 2 | CVPR 2023 |
| Computer Vision and Pattern Recognition | 2024 | 143 | 5 | CVPR 2024 |
| Computer Vision and Pattern Recognition | 2025 | 124 | 5 | CVPR 2025 |
| IEEE International Conference on Computer Vision | 2023 | 19 | 1 | ICCV 2023 |
| IEEE International Conference on Computer Vision | 2024 | 50 | 2 | ICCV 2024 |
| IEEE International Conference on Computer Vision | 2025 | 153 | 6 | ICCV 2025 |
| European Conference on Computer Vision | 2023 | 16 | 1 | ECCV 2023 |
| European Conference on Computer Vision | 2024 | 61 | 3 | ECCV 2024 |
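The detail page counts in the table above are consistent with a ceiling division by the target page size, with a minimum of one page for an empty venue/year bucket. The sketch below checks that reading against the table; the `detail_page_count` helper and the minimum-of-one rule are assumptions inferred from the numbers, not a description of the actual generator.

```python
import math


def detail_page_count(paper_count: int, page_size: int = 30) -> int:
    """Pages needed for one venue/year bucket; assumes empty buckets still get one page."""
    return max(1, math.ceil(paper_count / page_size))


# Paper counts copied from the table above, keyed by (venue, year).
counts = {
    ("NeurIPS", 2023): 40, ("NeurIPS", 2024): 167, ("NeurIPS", 2025): 0,
    ("ICML", 2023): 35, ("ICML", 2024): 75, ("ICML", 2025): 98,
    ("ICLR", 2023): 45, ("ICLR", 2024): 124, ("ICLR", 2025): 74,
    ("CVPR", 2023): 53, ("CVPR", 2024): 143, ("CVPR", 2025): 124,
    ("ICCV", 2023): 19, ("ICCV", 2024): 50, ("ICCV", 2025): 153,
    ("ECCV", 2023): 16, ("ECCV", 2024): 61,
}

pages = {key: detail_page_count(n) for key, n in counts.items()}
total_papers = sum(counts.values())
total_pages = sum(pages.values())
```

Under these assumptions the totals come out to 1277 paper entries and 53 detail pages, matching the figures stated above the table.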