
Agent Research Landscape · From Alpha Auto Search Outward

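This document places your alpha auto search work within the broader agent research landscape as six concentric layers, zooming out from the innermost layer (your project itself) to the history of cognitive architectures. Each layer lists: scope / key surveys / representative systems / positioning of your work / citation suggestions for the HKU presentation.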

2026-05-19 · Maintained by Paul Weng


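Concentric layer diagram

L5 · Cognitive Architectures / AGI Frameworks            outermost layer (theoretical roots)
└─ L4 · Agentic AI / RL Foundations
   └─ L3 · LLM-Based Agents (general)
      └─ L2 · AI for Science / AI for Math
         └─ L1 · Autonomous Research Agents
            └─ L0 · Alpha Auto Search          innermost layer (your work)

L0 is your research object itself; L1 is its most direct academic parent; L5 is the most distant intellectual root. In the presentation, state explicitly which layer you ground your work in, so that a question like "why not go bigger / smaller?" does not catch you off guard.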


L0 · Alpha Auto Search (innermost layer: your own work)

Scope: automated discovery / search of alpha factors or trading signals. Search unit = formula / program / NN weights / portfolio.

Key Surveys (2025-2026)

| Title | Authors | Venue / Year | arXiv / DOI |
| --- | --- | --- | --- |
| Survey on LLM-based Alpha Mining | | FITEE 2025 | 10.1631/FITEE.2500386 |
| AlphaEval: Comprehensive Eval for Formula Alpha Mining | Ding et al. | 2025 | 2508.13174 |

Representative Systems

| Tradition | Systems |
| --- | --- |
| Classical GP | gplearn / AutoAlpha / gpquant / AlphaForge / AlphaSAGE (GFlowNet) / AlphaPROBE |
| DL factor | FactorVAE / HIST / HireVAE / RVRAE / FactorGCL |
| LLM-driven formula | AlphaAgent / Alpha Jungle (LLM-MCTS) / QuantaAlpha / FactorMAD / Alpha-GPT |
| Benchmark | AlphaBench (ICLR 2026) / AlphaEval / Crypto-Alpha-Bench (your proposal) |

Positioning of your work

  • Verifier side: delivered (M8.6 walk-forward + microstructure gate + adaptive state controller)
  • Generator side: not built yet. This is the gap that RQ3 (Cognition Base) and the Researcher Agent are meant to fill
  • Benchmark contribution: Crypto-Alpha-Bench is proposed as the ImageNet moment for alpha auto search
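
For reference, the walk-forward idea behind the verifier can be sketched as follows. This is a minimal illustration and not the M8.6 implementation: the function name `walk_forward_splits`, the `embargo` gap between train and test windows, and the window sizes are all hypothetical.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_samples: int, train_size: int, test_size: int, embargo: int = 0
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges that roll forward through time.

    Each test window starts `embargo` samples after its train window ends,
    so no test sample ever precedes or overlaps its own training data.
    """
    start = 0
    while start + train_size + embargo + test_size <= n_samples:
        train = range(start, start + train_size)
        test_start = start + train_size + embargo
        test = range(test_start, test_start + test_size)
        yield train, test
        # Slide by one test window so consecutive test sets do not overlap.
        start += test_size


splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2, embargo=1))
for train, test in splits:
    print(list(train), "->", list(test))
```

The point of the rolling layout is that every out-of-sample score is computed on data strictly later than anything the candidate alpha was fitted on.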

Presentation citations

  • Slides 5 and 7-10: all L0 content
  • Existing artifacts: alpha_search_baselines.md / alpha_search_survey_taxonomy_and_bibliography.md / financial_sota_agent_survey.md

L1 · Autonomous Research Agents (the most direct academic parent)

Scope: LLM-powered agents that autonomously carry out the research workflow (hypothesis → experiment → analysis → writeup). Alpha auto search is this layer's instantiation in finance.

Key Surveys

| Title | Authors | Year | URL / arXiv |
| --- | --- | --- | --- |
| From Copilots to Colleagues: A Survey of Autonomous Research Agents | Deli Chen | early 2026 | victorchen96.github.io |
| Deep Research: A Survey of Autonomous Research Agents | Zhang et al. | 2025-08 | 2508.12752 |
| Deep Research Agents: A Systematic Examination And Roadmap | | 2025-06 | 2506.18096 |
| Deep Research: A Systematic Survey | | 2025-12 | 2512.02038 |
| Reinforcement Learning Foundations for Deep Research Systems | | 2025-09 | 2509.06733 |

Key Frameworks

  1. L1-L5 autonomy taxonomy (Chen 2026): analogous to the SAE driving-automation levels; the current frontier is at L4 and L5 is aspirational. These are autonomy levels, not the L0-L5 layers of this document
  2. 4-stage Deep Research pipeline (Zhang 2025): planning / question developing / web exploration / report generation
  3. 4 architectures: single-agent loop / multi-agent / hierarchical / tool-augmented

Representative Systems

| Domain | Systems |
| --- | --- |
| ML research | AI Scientist v1/v2 (Sakana) / MLR-Copilot / RD-Agent(Q) / AgentRxiv |
| Architecture discovery | ASI-ARCH (GAIR 2025) |
| Math/algorithms | FunSearch / AlphaProof / AlphaEvolve / AlphaGeometry |
| Chemistry | Coscientist / ChemCrow |
| Biology | BioPlanner / MedAgents |
| General research | GPT Deep Research / Perplexity Pro / STORM / Tongyi DeepResearch |

6 Open Problems (Chen 2026)

  1. Cognitive loop trap (repeatedly falling back into failing strategies)
  2. Context window limits
  3. Novelty evaluation (the survey calls it "fundamentally unsolved... philosophical")
  4. Reproducibility / determinism (SWE-bench std 5-15%)
  5. Safety / dual-use
  6. Cost ($100-1000/research campaign)

Positioning of your work

  • Alpha auto search = an instantiation of autonomous research agents in the finance domain
  • Chen 2026 singles out FunSearch as "nearest L5" because of its verifiable novelty; alpha factors naturally inherit this property (IC / Sharpe / PnL act as a mechanical verifier)
  • The Chen 2026 survey does not cover the finance domain at all: this is the academic whitespace for Crypto-Alpha-Bench
  • Nearly all 6 open problems map onto alpha search (see the follow-up mapping in agent_research_landscape.md §L1)
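
As a reminder of why IC and Sharpe count as a mechanical verifier, both reduce to a few lines of arithmetic. The sketch below uses only the standard library; `average_ranks`, `rank_ic`, and `annualized_sharpe` are hypothetical helper names, the inputs are assumed to be non-constant, and the default of 365 periods per year assumes daily returns on a market that trades every day.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rank_ic(factor: Sequence[float], forward_returns: Sequence[float]) -> float:
    """Rank IC: Spearman correlation between factor values and forward returns."""
    return pearson(average_ranks(factor), average_ranks(forward_returns))


def annualized_sharpe(returns: Sequence[float], periods_per_year: int = 365) -> float:
    """Mean over sample standard deviation, scaled by sqrt(periods per year)."""
    return mean(returns) / stdev(returns) * math.sqrt(periods_per_year)


print(rank_ic([1, 2, 3, 4], [10, 20, 30, 40]))
print(annualized_sharpe([0.01, 0.03], periods_per_year=4))
```

Computing these numbers is deterministic; deciding whether a high value reflects a real effect is a separate, statistical question.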

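Presentation citations

  • Slide 7: use the L1-L5 ladder in place of "4 common patterns"
  • Slide 9 (Crypto-Alpha-Bench gap): explicitly claim "first finance-domain research-agent benchmark"
  • Q&A: cite the 6 open problems from Chen 2026 and map each one to alpha search

L2 · AI for Science / AI for Math (the parent layer of the intellectual lineage)

Scope: applications of AI to scientific discovery / mathematical theorem proving / algorithm invention. Autonomous research agents are a recent branch of this line.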

Key Surveys

| Title | Authors | Year | arXiv |
| --- | --- | --- | --- |
| Agentic AI for Scientific Discovery: A Survey of Progress, Challenges, and Future Directions | | 2025-03 | 2503.08979 |

Landmark Systems (chronological)

| System | Year | Domain | Key |
| --- | --- | --- | --- |
| AlphaFold (2) | 2020 / 2021 | Protein structure | Evoformer + equivariant structure module; unlocked structural biology |
| AlphaTensor | 2022 | Matrix multiplication | RL + game-tree search to find new algorithms |
| FunSearch | 2023 (Nature) | Math / bin packing | LLM + evolution + program-level verification |
| GNoME | 2023 (Nature) | Materials science | GNN + active learning, 2.2M new materials |
| AlphaGeometry / AlphaGeometry 2 | 2024 / 2025 | IMO geometry | Neuro-symbolic + formal proofs |
| AlphaProof | 2024-2025 | IMO algebra/number theory | Gemini + AlphaZero MCTS + Lean kernel |
| AlphaEvolve | 2025 | Algorithm discovery | FunSearch + Pareto + long programs |
| ASI-ARCH | 2025-07 | Linear attention archs | Multi-agent + Scaling Law for Discovery |

Core Patterns (4 shared patterns)

  1. Generator-verifier separation
  2. Cognition base / knowledge grounding
  3. Multi-agent decomposition
  4. Compute-scaled discovery
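
The first pattern, generator-verifier separation, can be illustrated with a toy search loop. This is a deliberately tiny sketch under stated assumptions: the "program" is just a pair of integer coefficients, `generate` and `verify` are hypothetical names, and the hill-climbing rule stands in for whatever evolutionary or LLM-driven proposal mechanism a real system would use.

```python
import random
from typing import Tuple

Candidate = Tuple[int, int]  # toy "program": coefficients (a, b) of a*x + b


def generate(rng: random.Random, parent: Candidate) -> Candidate:
    """Generator: propose a mutation of the parent. It never sees a score."""
    a, b = parent
    return (a + rng.choice([-1, 0, 1]), b + rng.choice([-1, 0, 1]))


def verify(candidate: Candidate) -> int:
    """Verifier: deterministic score, independent of how the candidate was made.

    The hidden target is y = 3x + 2; the score is the negative squared error.
    """
    a, b = candidate
    data = [(x, 3 * x + 2) for x in range(5)]
    return -sum((a * x + b - y) ** 2 for x, y in data)


def search(steps: int, seed: int = 0) -> Tuple[Candidate, int]:
    """Keep a candidate only when the verifier says it strictly improves."""
    rng = random.Random(seed)
    best: Candidate = (0, 0)
    best_score = verify(best)
    for _ in range(steps):
        cand = generate(rng, best)
        score = verify(cand)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score


best, best_score = search(steps=200)
print(best, best_score)
```

Because the generator and verifier only meet through the score, either side can be swapped out (an LLM proposer, a walk-forward backtest) without touching the other.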

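Positioning of your work

  • Alpha auto search shares all 4 AI for Science patterns
  • Key differentiator: the financial verifier is not mechanical in the way math / code verifiers are, but statistical. The metrics can be computed mechanically, yet whether an alpha is real is a statistical question, so PBO / DSR / multiple-testing corrections take the place of the Lean kernel
  • Your self-evolution research reference is in effect a finance-specific safety adaptation of the AI for Science paradigm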
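
To make the "statistical verifier" point concrete, here is a hedged sketch of the Deflated Sharpe Ratio in the form commonly attributed to Bailey and López de Prado; it should be checked against the original paper before being relied on. The function names are hypothetical, `sharpe` is the per-period (non-annualized) Sharpe ratio, `sharpe_variance` is the variance of Sharpe ratios across the trials, and `kurtosis` is the raw (non-excess) kurtosis, so 3.0 corresponds to normal returns.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329
_STD_NORMAL = NormalDist()


def expected_max_sharpe(n_trials: int, sharpe_variance: float) -> float:
    """Approximate expected best Sharpe among n_trials strategies with no skill."""
    if n_trials < 2:
        return 0.0  # a single trial needs no deflation
    a = _STD_NORMAL.inv_cdf(1 - 1 / n_trials)
    b = _STD_NORMAL.inv_cdf(1 - 1 / (n_trials * math.e))
    return math.sqrt(sharpe_variance) * ((1 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe(
    sharpe: float,
    n_obs: int,
    n_trials: int,
    sharpe_variance: float,
    skew: float = 0.0,
    kurtosis: float = 3.0,
) -> float:
    """Probability that the observed Sharpe exceeds the selection-bias benchmark."""
    benchmark = expected_max_sharpe(n_trials, sharpe_variance)
    denom = math.sqrt(1 - skew * sharpe + (kurtosis - 1) / 4 * sharpe ** 2)
    z = (sharpe - benchmark) * math.sqrt(n_obs - 1) / denom
    return _STD_NORMAL.cdf(z)


single = deflated_sharpe(sharpe=0.1, n_obs=1000, n_trials=1, sharpe_variance=0.01)
many = deflated_sharpe(sharpe=0.1, n_obs=1000, n_trials=100, sharpe_variance=0.01)
print(single, many)
```

The same observed Sharpe looks far less convincing once it is known to be the best of 100 tries, which is exactly the role a proof kernel cannot play in finance.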

Presentation citations

  • Slide 2 (thesis) and Slide 7 (frontier evolution)
  • Existing artifact: alpha_search_baselines.md is the primary source for this layer

L3 · LLM-Based Agents (general) (the broader agent literature)

Scope: any agent system that uses an LLM as its reasoning engine. Not limited to research; includes coding / web browsing / robotics / business automation.

Key Surveys ("must-read" tier)

| Title | Authors | Year | arXiv |
| --- | --- | --- | --- |
| A Survey on LLM-Based Autonomous Agents | Wang et al. (Renmin U) | 2023 → 2025 | 2308.11432 |
| The Rise and Potential of LLM-Based Agents: A Survey | Xi et al. (Fudan) | 2023 | 2309.07864 |
| Large Language Model Agent: A Survey on Methodology, Applications and Challenges | | 2025-03 | 2503.21460 |
| Evaluation and Benchmarking of LLM Agents: A Survey | Mohammadi et al. | 2025-07 | 2507.21504 |
| LLM-based Agentic Reasoning Frameworks: A Survey | | 2025-08 | 2508.17692 |
| LLM-Based Human-Agent Collaboration | | 2025-05 | 2505.00753 |

Foundational Techniques

| Technique | Origin | Key paper / note |
| --- | --- | --- |
| Tool use / function calling | OpenAI 2023 / Toolformer | Schick 2023 |
| ReAct (interleaved Reason + Act) | Princeton 2022 | Yao 2210.03629 |
| Reflexion (verbal RL) | Northeastern 2023 | Shinn 2303.11366 |
| Chain of Thought / Tree of Thoughts | Google 2022 / Princeton 2023 | Wei / Yao |
| LATS (Language Agent Tree Search) | 2023 | Zhou 2310.04406 |
| Self-Refine / Self-Critique | CMU 2023 | Madaan 2303.17651 |
| Plan-and-Execute / Plan-and-Solve | 2023 | Wang |
| Voyager (open-world skill learning) | NVIDIA 2023 | Wang 2305.16291 |
| MemGPT / Memory hierarchies | Berkeley 2023 | Packer 2310.08560 |
| MCP (Model Context Protocol) | Anthropic 2024 | open standard for tool integration |
| A2A (Agent-to-Agent Protocol) | Google 2025 | cross-vendor agent communication |

Multi-Agent Frameworks

| Framework | Origin | Key paper / note |
| --- | --- | --- |
| CAMEL (role-play conversation) | KAUST 2023 | Li 2303.17760 |
| AutoGen | Microsoft 2023 | Wu 2308.08155 |
| MetaGPT (SOP-based software co.) | DeepWisdom 2023 | Hong 2308.00352 |
| ChatDev | THU 2023 | Qian 2307.07924 |
| AgentVerse | THU 2023 | Chen 2308.10848 |
| LangGraph (graph-based orchestration) | LangChain 2024 | open-source framework |

Production Coding Agents (frontier L4)

  • Devin (Cognition Labs 2024): the first production "AI software engineer"
  • Claude Code (Anthropic 2025): 72% on SWE-bench Verified
  • Cursor / Aider: IDE-integrated coding agents
  • SWE-Agent (Princeton 2024): open-source SWE benchmark champion
  • OpenHands (formerly OpenDevin)
  • Agentless / AgentCoder / AutoCodeRover: research baselines

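Positioning of your work

  • The L1-L6 LLM agent stack in your production system is a finance-specific instantiation of the L3 layer
  • You have already used the core L3 techniques (ReAct / Reflexion / tool calling / memory) in production; one sentence in the presentation is enough to cover this
  • Do not anchor the presentation at L3: this layer is shared with vision / coding agents, and your differentiator sits at L1 (research agents in finance)

Presentation citations

  • Not on the main slides, but you need to know this material for Q&A
  • Professor Han may ask about ReAct / MCP / agent reliability, all of which are L3 concepts

L4 · Agentic AI / RL Foundations (the methodological roots of the agent paradigm)

Scope: the mathematical and RL foundations of the agent paradigm, which predate LLMs by decades: the formalization of decision-making, planning, and learning.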

Key Surveys

| Title | Authors | Year | arXiv |
| --- | --- | --- | --- |
| The Landscape of Agentic Reinforcement Learning for LLMs | | 2025-09 | 2509.02547 |
| A Survey of Frontiers in LLM Reasoning | | 2025-04 | 2504.09037 |
| Logical Reasoning in LLMs: A Survey | | 2025-02 | 2502.09100 |

Classic RL Foundations

  • Sutton & Barto, Reinforcement Learning: An Introduction (2018, 2nd ed): the field's "bible"
  • Q-learning (Watkins 1989) / Policy gradient (Williams 1992)
  • DQN (Mnih 2013 / Nature 2015): first deep RL breakthrough
  • AlphaGo (Silver 2016, Nature) → AlphaZero (2017) → MuZero (2019)
  • PPO (Schulman 2017): the workhorse algorithm for RLHF
  • Decision Transformer (Chen 2021): sequence modeling for RL
  • World models / Dreamer V1-V3 (Hafner 2019-2023): model-based RL

LLM × RL: the modern lineage

| Technique | Year | Key |
| --- | --- | --- |
| RLHF | 2022 (InstructGPT) | Ouyang et al. |
| DPO (Direct Preference Optimization) | 2023 | Rafailov 2305.18290 |
| RLAIF | 2023 | Lee et al. (Google); the idea originates in Anthropic's Constitutional AI |
| Constitutional AI | 2022 | Anthropic 2212.08073 |
| DeepSeek-R1 (GRPO + RL on reasoning) | 2025 | DeepSeek |
| OpenAI o1 / o3 | 2024 / 2025 | inference-time RL / test-time compute |
| GFlowNet | 2021-ongoing | Bengio; structured policy learning |

Positioning of your work

  • Your walk-forward + adaptive state controller is conceptually close to offline RL with conservative policy improvement
  • "Exploration vs exploitation" in alpha search is a classic RL problem
  • You do not use RL directly (production decisions are deterministic), but in theory RQ2 (open-world LLM safety) can be grounded in offline RL safety guarantees
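
The exploration-versus-exploitation trade-off mentioned above is easiest to see in a bandit setting. The following is a minimal UCB1 sketch, not something taken from your system: the arms can be read as hypothetical search operators (for example mutate / crossover / LLM proposal), and the Bernoulli success rates in `true_means` are made-up numbers used only to drive the simulation.

```python
import math
import random
from typing import List


def ucb1_select(counts: List[int], totals: List[float], t: int) -> int:
    """Pick an arm by UCB1: empirical mean plus an exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # try every arm once before using the formula
    return max(
        range(len(counts)),
        key=lambda a: totals[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a]),
    )


def run_bandit(true_means: List[float], steps: int, seed: int = 0) -> List[int]:
    """Simulate Bernoulli rewards and return how often each arm was pulled."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    totals = [0.0] * len(true_means)
    for t in range(1, steps + 1):
        arm = ucb1_select(counts, totals, t)
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


counts = run_bandit(true_means=[0.1, 0.2, 0.8], steps=2000)
print(counts)
```

The bonus term shrinks as an arm is pulled more often, so weak operators are still sampled occasionally while most of the budget goes to the one that keeps paying off.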

Presentation citations

  • For a backup slide: use it when asked "why not use RL for alpha search?"
  • The intersection of conformal prediction and offline RL is RQ2's true methodological home
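
For the conformal prediction half of that intersection, the core computation is small enough to show in full. This is a generic split-conformal sketch, assuming exchangeable calibration data and absolute residuals as the nonconformity score; `conformal_radius` and `prediction_interval` are hypothetical names, and nothing here is specific to RQ2.

```python
import math
from typing import List, Tuple


def conformal_radius(calib_residuals: List[float], alpha: float) -> float:
    """Split-conformal radius from absolute residuals on a calibration set.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest residual, which gives
    at least 1 - alpha marginal coverage under exchangeability.
    """
    n = len(calib_residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(calib_residuals)[k - 1]


def prediction_interval(point: float, radius: float) -> Tuple[float, float]:
    """Symmetric interval around a point prediction."""
    return (point - radius, point + radius)


residuals = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
radius = conformal_radius(residuals, alpha=0.2)
print(prediction_interval(2.0, radius))
```

The guarantee is distribution-free but only marginal, which is one reason it pairs naturally with conservative offline policy improvement rather than replacing it.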

L5 · Cognitive Architectures / AGI Frameworks (outermost layer / intellectual ancestors)

Scope: the philosophical and cognitive-science roots of the agent paradigm. Usually not cited directly, but understanding them gives your framing more depth.

Classic Cognitive Architectures

  • SOAR (Newell 1990): symbolic problem-solving, production rules
  • ACT-R (Anderson 1976-): cognitive psychology + computational model
  • BDI (Bratman 1987): the Belief / Desire / Intention framework
  • Society of Mind (Minsky 1986)
  • Global Workspace Theory (Baars 1988)

Modern AGI / Foundation Model Debates

| Topic | Key paper |
| --- | --- |
| Sparks of AGI | Bubeck 2023, 2303.12712: an early touchstone for GPT-4 |
| Foundation Models | Bommasani 2021, 2108.07258 (Stanford CRFM) |
| Generalist agents (Gato) | DeepMind 2022 |
| Embodied AGI debates | ongoing |

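Positioning of your work

  • Do not cite this layer in the presentation: it is too far removed
  • Keep in mind: the production trading agent you built already implements a simplified BDI architecture (Belief = market state / Desire = user trader intent / Intention = schema-validated order). This is a practical realization of the L5 layer.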
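
The BDI mapping above can be written down as three small types and one deliberation step. This is an illustrative sketch only: the field names, the spread check, and the `deliberate` function are hypothetical and are not taken from your production schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Belief:
    """What the agent believes about the market right now."""
    symbol: str
    mid_price: float
    spread_bps: float


@dataclass(frozen=True)
class Desire:
    """What the user (trader) wants to achieve."""
    symbol: str
    side: str  # "buy" or "sell"
    notional_usd: float
    max_spread_bps: float


@dataclass(frozen=True)
class Intention:
    """A committed, schema-validated order."""
    symbol: str
    side: str
    quantity: float

    def __post_init__(self) -> None:
        # Validation stands in for a real order schema.
        if self.side not in ("buy", "sell"):
            raise ValueError(f"invalid side: {self.side!r}")
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")


def deliberate(belief: Belief, desire: Desire) -> Optional[Intention]:
    """Turn a desire into an intention only if current beliefs allow it."""
    if belief.symbol != desire.symbol:
        return None
    if belief.spread_bps > desire.max_spread_bps:
        return None  # market too wide: keep the desire, form no intention
    return Intention(desire.symbol, desire.side, desire.notional_usd / belief.mid_price)


order = deliberate(
    Belief("BTCUSDT", mid_price=50000.0, spread_bps=2.0),
    Desire("BTCUSDT", side="buy", notional_usd=1000.0, max_spread_bps=5.0),
)
print(order)
```

Keeping the three roles as separate types makes it explicit that an order is only ever produced by passing through validation.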

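Presentation citations

  • None. If the Q&A goes extremely deep ("where is the boundary of AGI?"), one sentence is enough

Summary: your strategic positioning through the concentric layers

Layer anchoring of your work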

L0 alpha auto search             ← the concrete thing you built
   ↑ anchor here for technical depth
L1 autonomous research agents     ← your academic argument lives at this layer
   ↑ anchor here for thesis framing
L2 AI for Science / AI for Math   ← your intellectual lineage
   ↑ cite here for inspiration / pattern
L3 LLM-Based Agents               ← your tooling layer
   ↑ Q&A preparation
L4 Agentic AI / RL                ← methodological roots
   ↑ backup slide
L5 Cognitive Architectures        ← philosophical background, usually not cited explicitly

HKU presentation: slide-by-layer suggestions

| Slide | Suitable layer |
| --- | --- |
| Slide 1 Title | L1 framing ("autonomous research agents in finance") |
| Slide 2 Thesis | L1 + L0 |
| Slide 4 What I Built | L0 + L3 (the production stack uses L3 techniques) |
| Slide 6 Negative Result | L0 |
| Slide 7 Frontier Pattern | L1-L5 ladder + the 4 shared patterns (replaces the original "4 common patterns") |
| Slide 8 Gap | L1 (Chen 2026 6 open problems) + L0 (alpha-specific) |
| Slide 9-10 Crypto-Alpha-Bench | L1 (autonomous research bench) + L2 (AI for Science verifier philosophy) |
| Slide 11 RQs | L1 (RQ2/RQ3) + L2 (RQ1 architectural prior from time series) |
| Slide 12 Ask | L1 (which use case for first paper) |

One-sentence elevator pitch (nested-framing version)

"I built a production-grade verification testbed for alpha auto search (L0), which is an instantiation of autonomous research agents (L1) in the finance domain. My work inherits the verifiable-novelty success pattern from FunSearch / AI for Science (L2), and the engineering stack from LLM-based agents (L3). The contribution is Crypto-Alpha-Bench, the field's first research-agent benchmark in finance, addressing the 6 open problems identified by Chen 2026."

This pitch covers 4 layers, takes about 30 seconds to say, and lets you exit gracefully if any professor interrupts.


Recommended further reading (by priority)

High priority (essential for the presentation):

  • Chen 2026 "From Copilots to Colleagues": the L1 anchor
  • Zhang 2025 "Deep Research Survey", arXiv 2508.12752: complements it at L1

Medium priority (to deepen the research agenda):

  • Wang 2023 "LLM-Based Autonomous Agents", arXiv 2308.11432: the classic L3 survey
  • "Agentic RL for LLMs", arXiv 2509.02547: the modern L4 counterpart

Low priority (background):

  • Sutton & Barto, the RL "bible": the L4 classic
  • Bubeck 2023 "Sparks of AGI": L5 cultural background


End of landscape document.

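This document is not meant for use during the presentation itself. It is meant to give you a clear nested framing in your own head, so that whichever layer from L0 to L5 a professor asks about, you know at which level to respond, which anchor paper to cite, and which adjacent layer to transition to.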