Building Agent Workflows with LangChain and LangGraph: A Hands-On Guide

遇珞

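1. Project Overview: When LangChain Meets LangGraph

While recently building agents for complex task processing with LangChain and LangGraph, I found that the combination works like a "navigation system" for the agent's reasoning. LangChain on its own handles linear task flows well, but it struggles in complex scenarios that need dynamic decisions. Adding LangGraph lets the agent adjust its execution path based on context.

This project is a good fit for three kinds of developers:

- Those who already know basic LangChain usage and want to handle more complex tasks
- Those who need to automate multi-step business processes with conditional branches
- Python developers interested in AI agent architecture design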

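2. Core Architecture Design

2.1 What LangGraph Adds

LangGraph adds three key capabilities on top of LangChain:

- Loop control: a conditional edge can route back to an earlier node, typically under a label such as "continue"
- Conditional branching: the next path is chosen dynamically from a node's result
- State persistence: a shared state is maintained for the whole run

This design lets an agent handle logic like the following: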

def should_continue(state):
    # Stop once the counter exceeds five; otherwise go around the loop again.
    if state["iteration"] > 5:
        return "end"
    return "continue"
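To see how a routing function like this drives a loop, here is a minimal plain-Python sketch that emulates what a conditional edge does. The `step` node and the `run_loop` driver are hypothetical names used only for illustration; in LangGraph the graph runtime does this dispatching once the routing function is registered with `add_conditional_edges`.

```python
def should_continue(state: dict) -> str:
    """Routing function: return the label of the edge to follow."""
    if state["iteration"] > 5:
        return "end"
    return "continue"


def step(state: dict) -> dict:
    """Hypothetical work node: here it only bumps the counter."""
    return {**state, "iteration": state["iteration"] + 1}


def run_loop(state: dict) -> dict:
    """Emulate a conditional edge that routes back to `step` until "end"."""
    while True:
        state = step(state)
        if should_continue(state) == "end":
            return state


final_state = run_loop({"iteration": 0})
print(final_state["iteration"])
```

Because the check is `> 5`, the node runs six times before the loop exits.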

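2.2 Common Architecture Patterns

In real projects I usually reach for one of these three patterns:

Central dispatcher pattern:
- A core node acts as the decision center
- Other nodes act as functional modules
- Suited to business processes that need centralized control
Pipeline (chain reaction) pattern:
- Nodes form a processing pipeline
- Each node triggers the next one when it finishes
- Suited to data transformation tasks
Blackboard pattern:
- All nodes share one state
- Each node decides for itself whether to activate, based on that state
- Suited to open-ended problem solving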
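The blackboard pattern is the least obvious of the three, so here is a small plain-Python sketch of the idea. It does not use LangGraph; `Agent`, `run_blackboard`, and the two example agents are hypothetical names chosen for illustration. Each agent declares an activation condition on the shared state, and the loop stops when no agent wants to run.

```python
from typing import Callable, NamedTuple


class Agent(NamedTuple):
    name: str
    can_run: Callable[[dict], bool]  # activation condition on the shared state
    run: Callable[[dict], dict]      # returns the fields to update


def run_blackboard(agents: list, state: dict, max_rounds: int = 10) -> dict:
    """Let agents fire until none of them is activated by the state."""
    for _ in range(max_rounds):
        ready = [a for a in agents if a.can_run(state)]
        if not ready:
            break
        for agent in ready:
            # Merge each agent's update into the shared state.
            state = {**state, **agent.run(state)}
    return state


agents = [
    Agent("parse", lambda s: "tokens" not in s,
          lambda s: {"tokens": s["text"].split()}),
    Agent("count", lambda s: "tokens" in s and "count" not in s,
          lambda s: {"count": len(s["tokens"])}),
]

result = run_blackboard(agents, {"text": "where is my order"})
print(result["count"])
```

The `max_rounds` cap plays the same role as a loop counter in a graph: it guarantees termination even if two agents keep re-activating each other.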

3. Code Walkthrough

3.1 A Basic Agent

Start with the simplest possible implementation, a customer service chat agent:

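from typing import TypedDict

from langgraph.graph import END, StateGraph


class ChatState(TypedDict, total=False):
    last_message: str
    user_input: str
    response: str


def receive_input(state):
    # Copy the raw message into the field that later nodes read.
    return {"user_input": state["last_message"]}


def generate_response(state):
    # Plug the actual LLM call in here.
    return {"response": f"Processed: {state['user_input']}"}


workflow = StateGraph(ChatState)
workflow.add_node("receive", receive_input)
workflow.add_node("respond", generate_response)
workflow.set_entry_point("receive")
workflow.add_edge("receive", "respond")
workflow.add_edge("respond", END)

# Usage example: the graph has to be compiled before it can be invoked.
app = workflow.compile()
result = app.invoke({"last_message": "What is the status of my order?"})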

3.2 Handling a More Complex Task

Below is an extended implementation that handles e-commerce after-sales requests:

import random
from typing import TypedDict

from langgraph.graph import END, StateGraph


class OrderState(TypedDict, total=False):
    status: str
    response: str


def check_order_status(state):
    # Simulate a lookup in the order system.
    return {"status": "shipped" if random.random() > 0.5 else "processing"}


def handle_shipped(state):
    return {"response": "Your order has shipped and should arrive within 3 days"}


def handle_processing(state):
    return {"response": "Your order is being processed, please wait"}


workflow = StateGraph(OrderState)
workflow.add_node("check_status", check_order_status)
workflow.add_node("shipped", handle_shipped)
workflow.add_node("processing", handle_processing)

workflow.set_entry_point("check_status")
# The lambda returns a label; the dict maps each label to the next node.
workflow.add_conditional_edges(
    "check_status",
    lambda x: "shipped" if x["status"] == "shipped" else "processing",
    {"shipped": "shipped", "processing": "processing"}
)
workflow.add_edge("shipped", END)
workflow.add_edge("processing", END)
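Routing logic is easier to test when it lives in a named function instead of a lambda. The sketch below pulls the lambda above out into `route_by_status` (a hypothetical name) and exercises it without building a graph. The same function could then be passed as the second argument to `add_conditional_edges`.

```python
def route_by_status(state: dict) -> str:
    """Return the routing label for the order status stored in the state."""
    return "shipped" if state["status"] == "shipped" else "processing"


# The routing table maps each label to the name of the node to run next.
ROUTES = {"shipped": "shipped", "processing": "processing"}

for status in ("shipped", "processing"):
    label = route_by_status({"status": status})
    print(status, "->", ROUTES[label])
```

Note that any status other than "shipped" falls through to the "processing" branch, so an unexpected value from the order system would be treated as still in progress.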

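4. Advanced Techniques and Optimization

4.1 Performance Optimization

These are the optimizations I have settled on for handling highly concurrent requests:

Node caching: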

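import functools

@functools.lru_cache(maxsize=128)
def expensive_operation(param):
    # Expensive computation goes here. The argument must be hashable,
    # so pass individual fields rather than the state dict itself.
    result = ...  # placeholder for the computed value
    return result

Asynchronous execution:

async def async_node(state):
    # Call the LLM asynchronously; llm is assumed to be a LangChain chat
    # model created elsewhere. Pass the prompt or messages in place of "...".
    message = await llm.ainvoke(...)
    return {"response": message.content}

Batch processing:

def batch_node(state):
    # process is the per-item function, assumed to be defined elsewhere.
    inputs = state["batch_inputs"]
    return {"outputs": [process(x) for x in inputs]}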
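The async and batch ideas combine naturally: a batch node can await all of its items concurrently. The following is a minimal sketch using only the standard library. `fake_llm_call` is a hypothetical stand-in for an awaitable model call, and `async_batch_node` is an illustrative name, not part of LangGraph.

```python
import asyncio


async def fake_llm_call(prompt: str) -> str:
    """Stand-in for an awaitable LLM call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"answer to: {prompt}"


async def async_batch_node(state: dict) -> dict:
    """Process every batch input concurrently and keep the input order."""
    tasks = [fake_llm_call(x) for x in state["batch_inputs"]]
    outputs = await asyncio.gather(*tasks)
    return {"outputs": list(outputs)}


result = asyncio.run(async_batch_node({"batch_inputs": ["a", "b", "c"]}))
print(result["outputs"])
```

`asyncio.gather` returns results in the order the awaitables were passed, so the outputs line up with the inputs even when the calls finish out of order.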

4.2 Debugging and Monitoring

These tools are especially useful when debugging a complex workflow:

Visual tracing:

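app = workflow.compile()
# Print a Mermaid diagram of the compiled graph for any Mermaid viewer.
print(app.get_graph().draw_mermaid())

Execution logging:

class DebugNode:
    def __call__(self, state):
        print(f"Current state: {state}")
        return state

workflow.add_node("debug", DebugNode())

Performance profiling:

from line_profiler import LineProfiler
app = workflow.compile()
inputs = {}  # initial state for the run being profiled
profiler = LineProfiler()
profiler.add_function(app.invoke)  # register node functions here too
profiler.runcall(app.invoke, inputs)
profiler.print_stats()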
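Instead of adding a separate debug node, logging can also be attached to the nodes themselves. The sketch below is one possible approach using only the standard library. The `traced` decorator is a hypothetical helper, not a LangGraph feature, and it records the input state, the returned update, and the elapsed time of each call.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("workflow")


def traced(node):
    """Wrap a node function so each call logs input, output, and duration."""
    @functools.wraps(node)
    def wrapper(state):
        start = time.perf_counter()
        update = node(state)
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s: in=%s out=%s (%.1f ms)",
                    node.__name__, state, update, elapsed_ms)
        return update
    return wrapper


@traced
def generate_response(state):
    return {"response": f"Processed: {state['user_input']}"}


update = generate_response({"user_input": "order status"})
```

Because `functools.wraps` preserves the function name, the wrapped node can be registered with `add_node` exactly like the original.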

5. Production Best Practices

5.1 Error Handling

A robust production system needs thorough error handling:

Node-level retries:

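from tenacity import retry, stop_after_attempt

@retry(stop=stop_after_attempt(3))
def unreliable_node(state):
    # Operation that may fail; tenacity calls the function up to three times.
    result = {}  # placeholder for the node's state update
    return result

Workflow-level fallback:

def fallback_node(state):
    return {"response": "The system is busy, please try again later"}

workflow.add_node("fallback", fallback_node)
workflow.add_edge("fallback", END)

Timeout control:

import timeout_decorator

@timeout_decorator.timeout(5)
def time_sensitive_node(state):
    # Must finish within 5 seconds, otherwise a timeout error is raised.
    # By default the decorator relies on signals, so call it from the main thread.
    result = {}  # placeholder for the node's state update
    return result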
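Retries and fallbacks can also be combined in one wrapper so that a node degrades to a canned reply rather than raising. This is a dependency-free sketch under stated assumptions, not how tenacity works internally. `retry_with_fallback`, `flaky_node`, and `broken_node` are hypothetical names, and catching every `Exception` is a simplification that a real system would probably narrow.

```python
import functools
import time

BUSY = {"response": "The system is busy, please try again later"}


def retry_with_fallback(attempts: int, fallback: dict, delay: float = 0.0):
    """Retry a node up to `attempts` times, then return `fallback`."""
    def decorator(node):
        @functools.wraps(node)
        def wrapper(state):
            for attempt in range(1, attempts + 1):
                try:
                    return node(state)
                except Exception:
                    if attempt < attempts:
                        time.sleep(delay * attempt)  # linear backoff
            return fallback
        return wrapper
    return decorator


calls = {"count": 0}


@retry_with_fallback(attempts=3, fallback=BUSY)
def flaky_node(state):
    # Fails twice, then succeeds on the third call.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated outage")
    return {"response": "ok"}


@retry_with_fallback(attempts=2, fallback=BUSY)
def broken_node(state):
    raise ConnectionError("always fails")


flaky_result = flaky_node({})
broken_result = broken_node({})
print(flaky_result, broken_result)
```

The first node recovers on its third attempt, while the second exhausts its attempts and returns the fallback reply.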

5.2 Security Measures

Pay particular attention to the following when handling user input:

Input sanitization:

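import html

def sanitize_input(text):
    # Escape HTML special characters (&, <, >, and quotes).
    return html.escape(text)

Permission checks:

def check_permission(state):
    # has_permission() is whatever check the application's user object provides.
    if not state["user"].has_permission():
        raise PermissionError("Operation not authorized")
    return {}  # authorized: no state change

Sensitive data filtering:

def filter_output(state):
    # Crude keyword check: blank the whole response if it mentions "credit_card".
    if "credit_card" in state["response"]:
        state["response"] = "[REDACTED]"
    return state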
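A keyword check only catches responses that literally contain the field name. A pattern-based filter is one way to catch the numbers themselves. The sketch below is an assumption-laden illustration: `CARD_PATTERN` is a rough heuristic for 13 to 16 digit sequences, not a validated card-number detector, and `redact_card_numbers` is a hypothetical helper name.

```python
import re

# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def redact_card_numbers(text: str) -> str:
    """Replace anything that looks like a card number with a placeholder."""
    return CARD_PATTERN.sub("[REDACTED]", text)


def filter_output(state: dict) -> dict:
    """Return a state update with card-like numbers removed from the response."""
    return {"response": redact_card_numbers(state["response"])}


update = filter_output(
    {"response": "Card 4111 1111 1111 1111 was charged for order 12345."}
)
print(update["response"])
```

Only the matching span is replaced, so the rest of the reply (including the short order number) stays readable.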

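6. Troubleshooting Guide

6.1 Common Errors and Fixes

Symptom	Likely cause	Fix
Workflow hangs	Loop condition never terminates	Check the exit condition of the "continue" edge
Node never runs	Misconfigured edge	Verify the source and target node names in each add_edge call
State is lost	Node does not return the expected fields	Make sure each node returns every field that later nodes read
Poor performance	Compute-heavy nodes	Add caching or asynchronous processing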

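6.2 Debugging Tips

Minimal reproduction:
- Start from the simplest two-node workflow
- Add complexity one step at a time
- Verify the expected behavior at each step

State snapshots: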

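import json

def debug_node(state):
    # Dump the current state to disk; default=str keeps non-JSON values from raising.
    with open("state_snapshot.json", "w", encoding="utf-8") as f:
        json.dump(state, f, ensure_ascii=False, default=str)
    return state

Breakpoint debugging:

import pdb

def debug_node(state):
    pdb.set_trace()  # drop into the interactive debugger
    return state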

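In real projects, the hardest problems to debug have been workflow hangs caused by a badly set loop condition. A practical trick is to add a counter to the loop node: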

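MAX_ITER = 10  # example limit; tune it for your workflow

def loop_node(state):
    # Count every pass through the loop and fail loudly past the limit.
    state["iteration"] = state.get("iteration", 0) + 1
    if state["iteration"] > MAX_ITER:
        raise ValueError("Loop iteration limit exceeded")
    return state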