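Agent Research Landscape · From Alpha Auto Search Outward

This note places your alpha auto search work in the wider context of agent research, using six concentric layers that zoom out from the innermost layer (the project itself) to the historical lineage of cognitive architectures. Each layer records: scope / key surveys / representative systems / positioning of your work / citation suggestions for the HKU presentation.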

2026-05-19 · Maintained by Paul Weng

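Concentric-circle diagram

L5 · Cognitive Architectures / AGI Frameworks            outermost layer (theoretical roots)
└─ L4 · Agentic AI / RL Foundations
   └─ L3 · LLM-Based Agents (general)
      └─ L2 · AI for Science / AI for Math
         └─ L1 · Autonomous Research Agents
            └─ L0 · Alpha Auto Search          innermost layer (your work)

L0 is the research object itself; L1 is its most direct academic parent; L5 is the most distant intellectual root. In the presentation, state explicitly which layer you ground the work in, so that a question like "why not go broader / narrower?" does not catch you on the back foot.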

L0 · Alpha Auto Search (innermost layer / your core work)

Scope: automated discovery or search of alpha factors or trading signals. Search unit = formula / program / NN weights / portfolio.

Key Surveys (2025-2026)

Title	Authors	Venue	arXiv / DOI
Survey on LLM-based Alpha Mining	—	FITEE 2025	10.1631/FITEE.2500386
AlphaEval: Comprehensive Eval for Formula Alpha Mining	Ding et al.	2025	2508.13174

Representative Systems

Tradition	Systems
Classical GP	gplearn / AutoAlpha / gpquant / AlphaForge / AlphaSAGE (GFlowNet) / AlphaPROBE
DL factor	FactorVAE / HIST / HireVAE / RVRAE / FactorGCL
LLM-driven formula	AlphaAgent / Alpha Jungle (LLM-MCTS) / QuantaAlpha / FactorMAD / Alpha-GPT
Benchmark	AlphaBench (ICLR 2026) / AlphaEval / Crypto-Alpha-Bench (your proposal)

Positioning of your work

Verifier side delivered (M8.6 walk-forward + microstructure gate + adaptive state controller)
Generator side not yet built: this is what RQ3 (Cognition Base) and the Researcher Agent are meant to fill
Benchmark contribution: Crypto-Alpha-Bench is proposed as the "ImageNet moment" for alpha auto search
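
For readers unfamiliar with walk-forward verification, the idea can be sketched as rolling train/test windows that only ever test on data later than the training data. This is a minimal illustration, not the M8.6 implementation: the function name `walk_forward_splits`, the `embargo` gap, and the window sizes are all assumptions made for the example.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_obs: int, train_size: int, test_size: int, embargo: int = 0
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges that roll forward through time.

    `embargo` leaves a gap between train and test to limit leakage.
    All names and defaults are illustrative assumptions.
    """
    start = 0
    while start + train_size + embargo + test_size <= n_obs:
        train = range(start, start + train_size)
        test_start = start + train_size + embargo
        test = range(test_start, test_start + test_size)
        yield train, test
        start += test_size  # slide forward by one test window


for train, test in walk_forward_splits(n_obs=10, train_size=4, test_size=2, embargo=1):
    print(list(train), "->", list(test))
```

A candidate factor would be fitted or selected on each train window and scored only on the matching test window, so every reported score is out of sample.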

Presentation references

Slides 5 and 7-10 are entirely L0 content
Existing artifacts: alpha_search_baselines.md / alpha_search_survey_taxonomy_and_bibliography.md / financial_sota_agent_survey.md

L1 · Autonomous Research Agents (most direct academic parent)

Scope: LLM-powered agents that autonomously carry out the research workflow (hypothesis → experiment → analysis → writeup). Alpha auto search is this layer's instantiation in finance.

Key Surveys

Title	Authors	Year	arXiv / URL
From Copilots to Colleagues: A Survey of Autonomous Research Agents	Deli Chen	early 2026	victorchen96.github.io
Deep Research: A Survey of Autonomous Research Agents	Zhang et al.	2025-08	2508.12752
Deep Research Agents: A Systematic Examination And Roadmap	—	2025-06	2506.18096
Deep Research: A Systematic Survey	—	2025-12	2512.02038
Reinforcement Learning Foundations for Deep Research Systems	—	2025-09	2509.06733

Key Frameworks

L1-L5 autonomy taxonomy (Chen 2026): analogous to the SAE driving-automation levels; the current frontier is at L4, and L5 remains aspirational (these autonomy levels are distinct from the L0-L5 layers of this note)
4-stage Deep Research pipeline (Zhang 2025): planning / question developing / web exploration / report generation
4 architectures: single-agent loop / multi-agent / hierarchical / tool-augmented

Representative Systems

Domain	Systems
ML research	AI Scientist v1/v2 (Sakana) / MLR-Copilot / RD-Agent(Q) / AgentRxiv
Architecture discovery	ASI-ARCH (GAIR 2025)
Math/algorithms	FunSearch / AlphaProof / AlphaEvolve / AlphaGeometry
Chemistry	Coscientist / ChemCrow
Biology	BioPlanner / MedAgents
General research	GPT Deep Research / Perplexity Pro / STORM / Tongyi DeepResearch

6 Open Problems (Chen 2026)

Cognitive loop trap (repeatedly falling back into failed strategies)
Context window limits
Novelty evaluation (the survey calls it "fundamentally unsolved... philosophical")
Reproducibility / determinism (SWE-bench score std of 5-15%)
Safety / dual-use
Cost ($100-1000 per research campaign)

Positioning of your work

Alpha auto search = the instantiation of autonomous research agents in the finance domain
Chen 2026 singles out FunSearch as "nearest L5" because of its verifiable novelty; alpha factors naturally inherit this property (IC / Sharpe / PnL are mechanically computable verifiers)
The Chen 2026 survey does not cover the finance domain at all: this is the academic whitespace for Crypto-Alpha-Bench
Nearly all 6 open problems map onto alpha search (see the follow-up mapping in agent_research_landscape.md §L1)

Presentation references

Slide 7: use the L1-L5 autonomy ladder in place of "4 common patterns"
Slide 9 (Crypto-Alpha-Bench gap): explicitly claim "first finance-domain research-agent benchmark"
Q&A: cite the 6 open problems from Chen 2026 and map each one to alpha search

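L2 · AI for Science / AI for Math (parent layer of the intellectual lineage)

Scope: applications of AI to scientific discovery, mathematical theorem proving, and algorithm invention. Autonomous research agents are a recent branch of this line.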

Key Surveys

Title	Authors	Year	arXiv
Agentic AI for Scientific Discovery: A Survey of Progress, Challenges, and Future Directions	—	2025-03	2503.08979

Landmark Systems (chronological)

System	Year	Domain	Key idea
AlphaFold (2)	2020 / 2021	Protein structure	End-to-end differentiable, equivariant structure module + Evoformer; unlocked structural biology
AlphaTensor	2022	Matrix multiplication	RL + game-tree search to find new algorithms
FunSearch	2023 (Nature)	Math / bin packing	LLM + evolution + program-level verification
GNoME	2023 (Nature)	Materials science	GNN + active learning; 2.2M new materials
AlphaGeometry / AlphaGeometry 2	2024 / 2025	IMO geometry	Neuro-symbolic + formal proofs
AlphaProof	2024-2025	IMO algebra/number theory	Gemini + AlphaZero MCTS + Lean kernel
AlphaEvolve	2025	Algorithm discovery	FunSearch + Pareto + long programs
ASI-ARCH	2025-07	Linear attention archs	Multi-agent + Scaling Law for Discovery

Core Pattern (4 common patterns)

Generator-verifier separation
Cognition base / knowledge grounding
Multi-agent decomposition
Compute-scaled discovery
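
The first pattern, generator-verifier separation, can be sketched as a loop in which the generator only proposes candidates and an independent verifier alone decides what is accepted. Everything below is a toy under stated assumptions: the formula strings, the lookup-table scores standing in for a backtest, and the function name `generator_verifier_search` are hypothetical.

```python
import random
from typing import Callable, List, Tuple


def generator_verifier_search(
    generate: Callable[[random.Random], str],
    verify: Callable[[str], float],
    n_candidates: int,
    threshold: float,
    seed: int = 0,
) -> List[Tuple[str, float]]:
    """Propose candidates, score each once, keep those at or above threshold."""
    rng = random.Random(seed)
    seen, accepted = set(), []
    for _ in range(n_candidates):
        candidate = generate(rng)      # generator only proposes
        if candidate in seen:
            continue                   # skip duplicates to save verifier budget
        seen.add(candidate)
        score = verify(candidate)      # verifier alone decides acceptance
        if score >= threshold:
            accepted.append((candidate, score))
    return sorted(accepted, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins (hypothetical): formula strings and a lookup-table "backtest".
FORMULAS = ["rank(close)", "delta(volume, 5)", "ts_mean(ret, 10)"]
TOY_SCORES = {"rank(close)": 0.01, "delta(volume, 5)": 0.05, "ts_mean(ret, 10)": 0.03}

results = generator_verifier_search(
    generate=lambda rng: rng.choice(FORMULAS),
    verify=TOY_SCORES.__getitem__,
    n_candidates=20,
    threshold=0.03,
)
```

Keeping the two roles behind separate callables means the generator can be swapped (GP, LLM, MCTS) without touching the acceptance rule.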

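Positioning of your work

Alpha auto search shares all 4 AI for Science patterns
Key differentiator: the financial verifier is not a mechanical proof check (as in math/code) but a statistical one, so PBO (probability of backtest overfitting), DSR (deflated Sharpe ratio), and multiple-testing corrections take the place of the Lean kernel
Your self-evolution research reference is in effect a finance-specific safety adaptation of the AI for Science paradigm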
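
As an illustration of what a statistical verifier looks like, here is a minimal sketch of the deflated Sharpe ratio in the form given by Bailey and López de Prado: the observed Sharpe ratio is compared against the maximum Sharpe expected from `n_trials` unskilled strategies. It assumes a per-period (non-annualized) Sharpe ratio, raw kurtosis (3 for a normal distribution), and `n_trials >= 2`; the function and parameter names are chosen for this example.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials: int, sr_std: float) -> float:
    """Expected maximum Sharpe across n_trials strategies with no skill.

    sr_std is the standard deviation of Sharpe ratios across the trials.
    Requires n_trials >= 2.
    """
    nd = NormalDist()
    return sr_std * (
        (1 - EULER_GAMMA) * nd.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * nd.inv_cdf(1 - 1 / (n_trials * math.e))
    )


def deflated_sharpe(sr, n_obs, n_trials, sr_std, skew=0.0, kurt=3.0):
    """Probability that the true Sharpe exceeds the expected-maximum benchmark."""
    sr0 = expected_max_sharpe(n_trials, sr_std)
    denom = math.sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    z = (sr - sr0) * math.sqrt(n_obs - 1) / denom
    return NormalDist().cdf(z)


# The same observed Sharpe looks much weaker once 1000 trials are accounted for.
few = deflated_sharpe(sr=0.1, n_obs=500, n_trials=10, sr_std=0.05)
many = deflated_sharpe(sr=0.1, n_obs=500, n_trials=1000, sr_std=0.05)
```

The point of the comparison is that the verdict depends on how many candidates were tried, which a proof checker such as the Lean kernel never needs to know.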

Presentation references

Slide 2 thesis & Slide 7 frontier evolution
Existing artifact: alpha_search_baselines.md (the primary reference for this layer)

L3 · LLM-Based Agents (general) (the broader agent literature)

Scope: any agent system that uses an LLM as its reasoning engine; not limited to research, it also covers coding / web browsing / robotics / business automation.

Key Surveys ("must-read" level)

Title	Authors	Year	arXiv
A Survey on LLM-Based Autonomous Agents	Wang et al. (Renmin U)	2023 → 2025	2308.11432
The Rise and Potential of LLM-Based Agents: A Survey	Xi et al. (Fudan)	2023	2309.07864
Large Language Model Agent: A Survey on Methodology, Applications and Challenges	—	2025-03	2503.21460
Evaluation and Benchmarking of LLM Agents: A Survey	Mohammadi et al.	2025-07	2507.21504
LLM-based Agentic Reasoning Frameworks: A Survey	—	2025-08	2508.17692
LLM-Based Human-Agent Collaboration	—	2025-05	2505.00753

Foundational Techniques

Technique	Origin	Key paper
Tool use / function calling	OpenAI 2023 / Toolformer	Schick 2023
ReAct (interleaved Reason + Act)	Princeton 2022	Yao 2210.03629
Reflexion (verbal RL)	Northeastern 2023	Shinn 2303.11366
Chain of Thought / Tree of Thoughts	Google 2022 / Princeton 2023	Wei / Yao
LATS (Language Agent Tree Search)	2023	Zhou 2310.04406
Self-Refine / Self-Critique	CMU 2023	Madaan 2303.17651
Plan-and-Execute / Plan-and-Solve	2023	Wang
Voyager (open-world skill learning)	NVIDIA 2023	Wang 2305.16291
MemGPT / Memory hierarchies	Berkeley 2023	Packer 2310.08560
MCP (Model Context Protocol)	Anthropic 2024	open standard for tool integration
A2A (Agent-to-Agent Protocol)	Google 2025	cross-vendor agent communication

Multi-Agent Frameworks

Framework	Origin	Key paper
CAMEL (role-play conversation)	KAUST 2023	Li 2303.17760
AutoGen	Microsoft 2023	Wu 2308.08155
MetaGPT (SOP-based software co.)	DeepWisdom 2023	Hong 2308.00352
ChatDev	THU 2023	Qian 2307.07924
AgentVerse	THU 2023	Chen 2308.10848
LangGraph (graph-based orchestration)	LangChain 2024	open-source framework

Production Coding Agents (frontier, autonomy level L4)

Devin (Cognition Labs 2024): the first production "AI software engineer"
Claude Code (Anthropic 2025): 72% SWE-bench Verified
Cursor / Aider: IDE-integrated coding agents
SWE-Agent (Princeton 2024): open-source SWE benchmark champion
OpenHands (formerly OpenDevin)
Agentless / AgentCoder / AutoCodeRover: research baselines

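Positioning of your work

The L1-L6 LLM agent stack of your production system is a finance-specific instantiation of the L3 layer
You have already used the core L3 techniques (ReAct / Reflexion / tool calling / memory) in production; one sentence in the presentation is enough to cover this
Do not anchor the presentation at L3: this layer is shared with vision/coding agents, and your differentiator sits at L1 (research agent in finance)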

Presentation references

Not on the main slides, but you need to be familiar with it for Q&A
Prof. Han may ask about ReAct / MCP / agent reliability; these are all L3 concepts

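L4 · Agentic AI / RL Foundations (methodological roots of the agent paradigm)

Scope: the mathematical and RL foundations of the agent paradigm, which predate LLMs by decades: the formalization of decision-making, planning, and learning.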

Key Surveys

Title	Authors	Year	arXiv
The Landscape of Agentic Reinforcement Learning for LLMs	—	2025-09	2509.02547
A Survey of Frontiers in LLM Reasoning	—	2025-04	2504.09037
Logical Reasoning in LLMs: A Survey	—	2025-02	2502.09100

Classic RL Foundations

Sutton & Barto, Reinforcement Learning: An Introduction (2018, 2nd ed.): the standard reference
Q-learning (Watkins 1989) / Policy gradient (Williams 1992)
DQN (Mnih 2013 / Nature 2015): first deep RL breakthrough
AlphaGo (Silver 2016, Nature) → AlphaZero (2017) → MuZero (2019)
PPO (Schulman 2017): the workhorse algorithm for RLHF
Decision Transformer (Chen 2021): sequence modeling for RL
World models / Dreamer V1-V3 (Hafner 2019-2023): model-based RL

LLM × RL: the modern lineage

Technique	Year	Key
RLHF	2022 (InstructGPT)	Ouyang et al.
DPO (Direct Preference Optimization)	2023	Rafailov 2305.18290
RLAIF	2023	Lee et al. (Google); term introduced in Anthropic's Constitutional AI (2022)
Constitutional AI	2022	Anthropic 2212.08073
DeepSeek-R1 (GRPO + RL on reasoning)	2025	DeepSeek
OpenAI o1 / o3	2024 / 2025	inference-time RL / test-time compute
GFlowNet	2021-ongoing	Bengio; structured policy learning

Positioning of your work

Your walk-forward + adaptive state controller is conceptually close to offline RL with conservative policy improvement
Exploration vs. exploitation in alpha search is a classic RL problem
RL is not used directly (production decisions are deterministic), but in principle RQ2 (open-world LLM safety) could be grounded in offline RL safety guarantees
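
The exploration-versus-exploitation point can be illustrated with a UCB1 bandit, where each arm stands for a hypothetical family of alpha generators and the reward is whether a proposed factor passes verification. This is a textbook sketch, not part of the production system; the success probabilities and function names are invented for the example.

```python
import math
import random
from typing import List


def ucb1_select(counts: List[int], reward_sums: List[float], t: int) -> int:
    """Pick the arm with the highest UCB1 index; untried arms go first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    def index(arm: int) -> float:
        mean_reward = reward_sums[arm] / counts[arm]        # exploitation term
        bonus = math.sqrt(2.0 * math.log(t) / counts[arm])  # exploration term
        return mean_reward + bonus

    return max(range(len(counts)), key=index)


def run_bandit(success_probs: List[float], n_rounds: int, seed: int = 0) -> List[int]:
    """Simulate Bernoulli rewards and return how often each arm was pulled."""
    rng = random.Random(seed)
    counts = [0] * len(success_probs)
    reward_sums = [0.0] * len(success_probs)
    for t in range(1, n_rounds + 1):
        arm = ucb1_select(counts, reward_sums, t)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        reward_sums[arm] += reward
    return counts


# Hypothetical pass rates for three generator families.
pulls = run_bandit([0.2, 0.5, 0.8], n_rounds=2000)
```

The exploration bonus shrinks as an arm is sampled more often, so search budget gradually concentrates on the family with the best verified pass rate while weaker families are still revisited occasionally.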

Presentation references

For a backup slide: when asked "why not use RL for alpha search?"
The intersection of conformal prediction × offline RL is where RQ2 really sits methodologically
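
For reference, split conformal prediction in its simplest form turns held-out absolute residuals into a prediction interval with finite-sample coverage under exchangeability. The sketch below shows only that basic recipe; how it would be combined with offline RL for RQ2 is not specified in this note, and the function names are assumptions.

```python
import math
from typing import List, Tuple


def conformal_quantile(residuals: List[float], alpha: float) -> float:
    """Calibration quantile for a (1 - alpha) split-conformal interval.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual.
    Returns infinity when there are too few calibration points.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(residuals)[k - 1]


def predict_interval(y_hat: float, q: float) -> Tuple[float, float]:
    """Symmetric interval around a point prediction."""
    return (y_hat - q, y_hat + q)


# Ten calibration residuals |y - y_hat| from a held-out set (toy values).
calibration = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q = conformal_quantile(calibration, alpha=0.1)
interval = predict_interval(2.0, q)
```

Market data is rarely exchangeable, so any use in a trading setting would need an adaptive or time-aware variant; the sketch is only the starting point.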

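L5 · Cognitive Architectures / AGI Frameworks (outermost layer / intellectual ancestors)

Scope: the philosophical and cognitive-science roots of the agent paradigm. Usually not cited directly, but understanding it gives your framing more depth.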

Classic Cognitive Architectures

SOAR (Newell 1990): symbolic problem-solving, production rules
ACT-R (Anderson 1976-): cognitive psychology + computational model
BDI (Bratman 1987): the Belief / Desire / Intention framework
Society of Mind (Minsky 1986)
Global Workspace Theory (Baars 1988)

Modern AGI / Foundation Model Debates

Topic	Key paper
Sparks of AGI	Bubeck 2023 2303.12712; an early touchstone for GPT-4
Foundation Models	Bommasani 2021 2108.07258 (Stanford CRFM)
Generalist agents (Gato)	DeepMind 2022
Embodied AGI debates	ongoing

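Positioning of your work

Do not cite this layer in the presentation: it is too far removed
But keep in mind: the production trading agent you built already implements a simplified BDI architecture (Belief = market state / Desire = user trader intent / Intention = schema-validated order). This is a practical realization of the L5 layer.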
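
The BDI mapping above can be written down as three small types plus a deliberation step. This is a hypothetical sketch to show the shape of the idea; the field names, the sizing rule, and the validation checks are assumptions and do not describe the production schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    """Belief = market state (fields are illustrative)."""
    symbol: str
    mid_price: float


@dataclass(frozen=True)
class Desire:
    """Desire = trader intent (fields are illustrative)."""
    symbol: str
    side: str
    max_notional: float


@dataclass(frozen=True)
class Intention:
    """Intention = an order that must pass schema validation to exist."""
    symbol: str
    side: str
    quantity: float

    def __post_init__(self) -> None:
        if self.side not in ("buy", "sell"):
            raise ValueError(f"invalid side: {self.side!r}")
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")


def deliberate(belief: Belief, desire: Desire) -> Intention:
    """Turn intent into a validated order, sized from the current price."""
    if belief.symbol != desire.symbol:
        raise ValueError("belief and desire refer to different symbols")
    quantity = desire.max_notional / belief.mid_price
    return Intention(desire.symbol, desire.side, quantity)


order = deliberate(Belief("BTCUSDT", 50000.0), Desire("BTCUSDT", "buy", 1000.0))
```

Because validation runs in the constructor, an invalid order cannot be represented at all, which mirrors the "schema-validated" property in the mapping.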

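Presentation references

None. If a Q&A question goes extremely deep (e.g. "where is the boundary of AGI?"), you can mention it in one sentence.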

Summary: your strategic positioning seen through the concentric circles

Layer anchoring of your work

L0 alpha auto search             ← the concrete thing you do
   ↑ anchor here for technical depth
L1 autonomous research agents     ← your academic argument lives at this layer
   ↑ anchor here for thesis framing
L2 AI for Science / AI for Math   ← your intellectual lineage
   ↑ cite here for inspiration / pattern
L3 LLM-Based Agents               ← your tooling layer
   ↑ Q&A preparation
L4 Agentic AI / RL                ← methodological roots
   ↑ backup slide
L5 Cognitive Architectures        ← philosophical background, usually not cited explicitly

HKU presentation: slide-by-layer suggestions

Slide	Suitable layer
Slide 1 Title	L1 framing ("autonomous research agents in finance")
Slide 2 Thesis	L1 + L0
Slide 4 What I Built	L0 + L3 (the production stack uses L3 techniques)
Slide 6 Negative Result	L0
Slide 7 Frontier Pattern	L1-L5 autonomy ladder + the 4 core patterns (replacing the original "4 common patterns")
Slide 8 Gap	L1 (Chen 2026 6 open problems) + L0 (alpha-specific)
Slide 9-10 Crypto-Alpha-Bench	L1 (autonomous research bench) + L2 (AI for Science verifier philosophy)
Slide 11 RQs	L1 (RQ2/RQ3) + L2 (RQ1 architectural prior from time series)
Slide 12 Ask	L1 (which use case for first paper)

One-sentence elevator pitch (nested-framing version)

"I built a production-grade verification testbed for alpha auto search (L0), which is an instantiation of autonomous research agents (L1) in the finance domain. My work inherits the verifiable-novelty success pattern from FunSearch / AI for Science (L2), and the engineering stack from LLM-based agents (L3). The contribution is Crypto-Alpha-Bench, the field's first research-agent benchmark in finance, addressing 6 open problems identified by Chen 2026."

This pitch covers 4 layers, can be delivered in 30 seconds, and lets you exit gracefully if any professor interrupts.
