Building a Travel Agent by Hand: Core Architecture and Tool-Calling Mechanism

妩媚怡口莲

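1. A Hand-Built Travel Agent: Overview

This hand-built version does not depend on the LangChain framework, and that is exactly why it exposes how agent technology works at its core. As an engineer who has spent years building AI applications, I particularly like this "peel the onion" style of implementation: it lets us see how an agent thinks, decides, and acts, rather than just calling a framework's API.

At its core, the project is a conversational system designed for travel scenarios. Unlike an ordinary chatbot, it can call external tools on its own initiative: it looks up flight information, recommends attractions, calculates budgets, and then combines the results into a reasonable recommendation. This "think, act, feed back" loop is the essence of agent technology.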

2. Core Architecture Design

2.1 Modular, Layered Design

The project's directory layout reflects the layered design directly:

manual_agent/
├── data/          # static data layer
├── tools/         # capability layer
├── base.py        # infrastructure layer
├── llm.py         # model interaction layer
├── agent.py       # core logic layer
└── main.py        # entry point

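The main advantages of this layering are:

High cohesion, low coupling: each layer handles only its own responsibility, so a change inside one layer does not affect the others as long as its interface stays the same
Easy to extend: adding a tool only means adding a file under the tools directory, with no change to the core logic
Easy to test: each layer can be tested on its own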

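2.2 Implementing the Tool-Calling Mechanism

The tool-calling mechanism in base.py is the foundation of the whole project. Its core is the Tool class and the @tool decorator: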

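class Tool:
    """Wraps a plain function together with the metadata the LLM needs."""

    def __init__(self, func, name=None, description=None):
        self.func = func
        self.name = name or func.__name__               # falls back to the function name
        self.description = description or func.__doc__  # falls back to the docstring

    def __call__(self, *args, **kwargs):
        # Lets callers write tool(**params) as well as tool.func(**params)
        return self.func(*args, **kwargs)


def tool(name=None, description=None):
    """Decorator factory: @tool("name", "description") turns a function into a Tool."""
    def decorator(func):
        return Tool(func, name, description)
    return decorator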

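This design turns an ordinary Python function into a tool the agent can call, while keeping the function's metadata (name and description) available for the LLM's decision-making. For a real project, I would strengthen parameter validation and error handling further.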
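The LLM layer calls tool.to_dict() on each tool, but the Tool class as shown has no such method. One way it could look is sketched here, assuming the API accepts OpenAI-style function schemas (DeepSeek describes its API as OpenAI-compatible). The to_dict body and the _JSON_TYPES mapping are assumptions made for illustration, not the project's actual code.

```python
import inspect

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


class Tool:
    def __init__(self, func, name=None, description=None):
        self.func = func
        self.name = name or func.__name__
        self.description = description or (func.__doc__ or "").strip()

    def to_dict(self):
        """Describe the tool as an OpenAI-style function schema (assumed format)."""
        properties, required = {}, []
        for param in inspect.signature(self.func).parameters.values():
            # Unannotated or unknown types fall back to "string"
            json_type = _JSON_TYPES.get(param.annotation, "string")
            properties[param.name] = {"type": json_type}
            if param.default is inspect.Parameter.empty:
                required.append(param.name)  # no default: the model must supply it
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                },
            },
        }


def tool(name=None, description=None):
    def decorator(func):
        return Tool(func, name, description)
    return decorator


@tool("query_weather", "Look up the weather for a city")
def query_weather(city: str, date: str = None):
    return f"{city} {date}"


schema = query_weather.to_dict()
```

Deriving the schema from the signature keeps the tool definition in one place: adding a parameter to the function updates what the LLM sees without a second edit.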

3. Key Components in Depth

3.1 The LLM Interaction Layer

The LLM class in llm.py wraps the interaction with the DeepSeek API. Its core method is generate_response:

def generate_response(self, messages, tools=None):
    payload = {
        "model": self.model,
        "messages": messages,
        "temperature": 0.7
    }
    if tools:
        payload["tools"] = [tool.to_dict() for tool in tools]
    
    response = requests.post(
        self.api_url,
        headers=self.headers,
        json=payload
    )
    return self._process_response(response)

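A few technical details are worth noting:

The temperature is set to 0.7 as a balance between determinism and creativity
Tool descriptions must be converted to dictionaries so the API can recognize them
Response handling has to cover two cases: a plain reply and a tool call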
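The article does not show _process_response, so the following is only a sketch of how the two cases might be told apart. It assumes an OpenAI-compatible response body, where tool calls arrive under choices[0].message.tool_calls with JSON-encoded arguments. LLMResponse and parse_chat_completion are hypothetical names; the field names mirror the ones the agent loop reads (type, content, tool_name, parameters, call_id).

```python
import json
from dataclasses import dataclass, field


@dataclass
class LLMResponse:
    type: str                      # "message" or "tool_call"
    content: str = ""
    tool_name: str = ""
    parameters: dict = field(default_factory=dict)
    call_id: str = ""


def parse_chat_completion(body: dict) -> LLMResponse:
    """Turn an already-decoded response body into an LLMResponse."""
    message = body["choices"][0]["message"]
    tool_calls = message.get("tool_calls") or []
    if tool_calls:
        call = tool_calls[0]  # this sketch only looks at the first call
        return LLMResponse(
            type="tool_call",
            tool_name=call["function"]["name"],
            # Arguments arrive as a JSON string, not a dict
            parameters=json.loads(call["function"]["arguments"]),
            call_id=call["id"],
        )
    return LLMResponse(type="message", content=message.get("content") or "")


plain = parse_chat_completion(
    {"choices": [{"message": {"role": "assistant", "content": "Hello"}}]}
)
called = parse_chat_completion({
    "choices": [{"message": {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "query_weather",
                         "arguments": "{\"city\": \"Beijing\"}"},
        }],
    }}]
})
```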

3.2 The Agent's Core Logic

The Agent class in agent.py implements the agent's core decision loop:

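import json


class Agent:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = {tool.name: tool for tool in tools}  # look up tools by name

    def run(self, user_input):
        messages = [{"role": "user", "content": user_input}]

        while True:
            response = self.llm.generate_response(messages, self.tools.values())

            if response.type == "message":
                return response.content
            elif response.type == "tool_call":
                # Record the assistant's tool request first: OpenAI-compatible chat
                # APIs expect each "tool" message to follow the call it answers.
                messages.append({
                    "role": "assistant",
                    "content": None,
                    "tool_calls": [{
                        "id": response.call_id,
                        "type": "function",
                        "function": {
                            "name": response.tool_name,
                            "arguments": json.dumps(response.parameters),
                        },
                    }],
                })
                tool = self.tools[response.tool_name]
                result = tool.func(**response.parameters)
                messages.append({
                    "role": "tool",
                    "content": str(result),
                    "tool_call_id": response.call_id
                })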

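This loop captures the agent's basic workflow:

Receive the user's input
The LLM decides whether a tool call is needed
If so, run the tool and collect the result
Feed the result back to the LLM to produce the final reply
Return the reply to the user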
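The workflow above can be exercised without any network access by swapping the real LLM for a scripted stand-in. The sketch below is an illustration under assumptions, not the project's code: ScriptedLLM and run_loop are hypothetical names, tools are plain callables rather than Tool objects, and a max_steps cap is added so a model that keeps requesting tools cannot loop forever.

```python
from types import SimpleNamespace


class ScriptedLLM:
    """Stand-in for the real LLM: replays canned responses in order."""

    def __init__(self, responses):
        self._responses = iter(responses)

    def generate_response(self, messages, tools=None):
        return next(self._responses)


def run_loop(llm, tools, user_input, max_steps=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        response = llm.generate_response(messages, tools)
        if response.type == "message":
            return response.content, messages
        # Tool call: execute it and feed the result back for the next round
        result = tools[response.tool_name](**response.parameters)
        messages.append({
            "role": "tool",
            "content": str(result),
            "tool_call_id": response.call_id,
        })
    raise RuntimeError(f"no final answer after {max_steps} steps")


def fake_weather(city):
    return f"{city}: sunny"


script = [
    SimpleNamespace(type="tool_call", tool_name="query_weather",
                    parameters={"city": "Beijing"}, call_id="call_1"),
    SimpleNamespace(type="message", content="It is sunny in Beijing."),
]
answer, transcript = run_loop(
    ScriptedLLM(script), {"query_weather": fake_weather}, "Weather in Beijing?"
)
```

A scripted model like this also makes the loop easy to unit-test: the transcript shows exactly which tool results the model would have seen.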

4. Tool Implementation Details

4.1 Weather Lookup Tool

The implementation in weather_tools.py shows how to work with structured data:

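import json
from datetime import datetime
from typing import Optional


@tool("query_weather", "Look up the weather for a given city")
def query_weather(city: str, date: Optional[str] = None):
    """
    Args:
        city: city name (currently only Beijing/Shanghai are supported)
        date: date in YYYY-MM-DD format; defaults to today
    """
    # Explicit encoding so non-ASCII city names load the same on every platform
    with open('data/weather.json', encoding='utf-8') as f:
        weather_data = json.load(f)

    date = date or datetime.now().strftime("%Y-%m-%d")
    city_data = weather_data.get(city)

    if not city_data:
        return f"Sorry, weather lookup is not supported for {city} yet"

    forecast = city_data.get(date)
    return forecast or f"No weather data found for {city} on {date}"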

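Key points:

Type annotations make the parameter types explicit
A default value simplifies the call
An unsupported city or a missing date returns a readable message instead of raising an exception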

4.2 Budget Calculation Tool

budget_tools.py shows how to implement business logic:

import json


@tool("calculate_total_budget", "Recommend an itinerary based on the budget")
def calculate_total_budget(days: int, budget: float, city: str):
    """
    Args:
        days: number of travel days
        budget: total budget (yuan)
        city: city name
    """
    if days <= 0:
        return {"error": "days must be a positive integer"}  # avoids division by zero
    daily_budget = budget / days

    with open('data/attractions.json', encoding='utf-8') as f:
        attractions = json.load(f).get(city, [])

    recommendations = []
    for attr in attractions:
        if attr['price'] <= daily_budget * 0.3:  # attraction spend capped at 30% of the daily budget
            recommendations.append({
                "name": attr["name"],
                "price": attr["price"],
                "time_needed": attr["time_needed"]
            })

    return {
        "daily_budget": daily_budget,
        "recommendations": recommendations[:3],  # recommend at most 3
        "suggestion": f"Suggested daily budget for food and transport: about {daily_budget*0.5:.1f} yuan"
    }

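This tool illustrates:

Encapsulating a business rule (the 30% budget cap)
Filtering the data and truncating it to the first three matches (the code does not sort, so results follow the order in the data file)
A structured return value that is easy for the LLM to process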
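To make the 30% rule concrete, here is a small worked example with made-up data: a 3000 yuan budget over 3 days gives 1000 yuan per day, so an attraction qualifies only if it costs 300 yuan or less. pick_attractions and the sample list are hypothetical and exist only for this illustration; the arithmetic follows the tool above.

```python
def pick_attractions(attractions, days, budget, share=0.3, limit=3):
    """Apply the per-day attraction cap and keep the first `limit` matches."""
    daily_budget = budget / days
    cap = daily_budget * share  # most a single attraction may cost
    affordable = [a for a in attractions if a["price"] <= cap]
    return daily_budget, cap, affordable[:limit]


# Made-up sample data, in the same shape as the entries the tool reads
sample = [
    {"name": "Museum", "price": 60},
    {"name": "Theme park", "price": 450},
    {"name": "Old town walk", "price": 0},
]

daily, cap, picks = pick_attractions(sample, days=3, budget=3000)
names = [a["name"] for a in picks]
```

Here the theme park is filtered out because 450 exceeds the 300 yuan cap, while the other two entries keep their original order.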

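5. Deployment and Usage in Practice

5.1 Environment Configuration

A few practical tips for configuring the DeepSeek API key:

Multiple environments: python-dotenv can manage the API keys for different environments
Secure storage: never hard-code the API key or commit it to version control
Access control: create a dedicated API key for the agent and set a usage limit on it

Example .env file: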

DEEPSEEK_API_KEY=your_key_here
ENVIRONMENT=development
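Reading the key at startup and failing early when it is missing avoids confusing authentication errors later. The helper below is a minimal sketch under assumptions: get_api_key is a hypothetical name, and it reads from os.environ only. With python-dotenv installed, calling load_dotenv() before it would copy the .env entries into the environment first.

```python
import os


def get_api_key(env=None, name="DEEPSEEK_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    # Treat the placeholder from the example .env file as "not configured"
    if not key or key == "your_key_here":
        raise RuntimeError(
            f"{name} is not set; add it to .env or the shell environment"
        )
    return key
```

Passing a plain dict as env makes the helper easy to test without touching the real environment.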

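5.2 Running and Debugging

Once the agent is running, the interaction can be improved in several ways:

Conversation history: add simple conversation history management in main.py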

conversation_history = []

while True:
    user_input = input("You: ")
    if user_input.lower() in ['exit', 'quit']:
        break

    conversation_history.append({"role": "user", "content": user_input})
    # Note: Agent.run as shown in 3.2 only receives the latest input, so the
    # history is recorded here but not yet sent to the model.
    response = agent.run(user_input)
    conversation_history.append({"role": "assistant", "content": response})

    print(f"Assistant: {response}")

Timeout handling: add a timeout to the API call

response = requests.post(
    self.api_url,
    headers=self.headers,
    json=payload,
    timeout=10  # 10-second timeout
)

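Rate limiting: implement simple request throttling

import time

class RateLimitedLLM(LLM):
    def __init__(self, *args, max_calls=3, per_second=1, **kwargs):
        super().__init__(*args, **kwargs)
        self.max_calls = max_calls    # calls allowed per window
        self.per_second = per_second  # window length in seconds
        self.calls = []               # timestamps of recent calls

    def generate_response(self, messages, tools=None):
        now = time.time()
        # Keep only the calls that are still inside the window
        self.calls = [call for call in self.calls if call > now - self.per_second]

        if len(self.calls) >= self.max_calls:
            # Wait until the oldest call leaves the window, then drop it
            time.sleep(max(0.0, self.calls[0] + self.per_second - now))
            self.calls.pop(0)

        self.calls.append(time.time())
        return super().generate_response(messages, tools)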

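6. Performance Optimization and Extension Directions

6.1 Performance Optimization Tips

Tool caching: results of frequently queried tools can be cached

from functools import lru_cache

# @tool must be outermost so the agent still receives a Tool object;
# lru_cache then memoizes the plain function underneath.
@tool("query_weather")
@lru_cache(maxsize=100)
def query_weather(city: str, date: str = None):
    # Original implementation goes here. Note that date=None ("today") is
    # cached too, so that entry can go stale after midnight.
    ...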

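Batch handling: support several tool calls in a single model turn

import json

def run(self, user_input):
    messages = [{"role": "user", "content": user_input}]

    while True:
        response = self.llm.generate_response(messages, self.tools.values())

        if response.type == "message":
            return response.content
        elif response.type == "tool_call":
            # Record the assistant's request so each tool message has a call to answer
            messages.append({
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {"id": call.id, "type": "function",
                     "function": {"name": call.name,
                                  "arguments": json.dumps(call.parameters)}}
                    for call in response.tool_calls
                ],
            })
            # Calls run one after another here; each result goes back in its own
            # tool message, keyed by the id of the call it answers
            for call in response.tool_calls:
                tool = self.tools[call.name]
                result = tool.func(**call.parameters)
                messages.append({
                    "role": "tool",
                    "content": str(result),
                    "tool_call_id": call.id
                })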

Result preprocessing: trim the data before the tool returns it

def preprocess_flight_data(flights):
    """Keep only the key fields to reduce token usage."""
    return [
        {
            "flight_no": f["flight_no"],
            "departure": f["departure_time"],
            "arrival": f["arrival_time"],
            "price": f["price"]
        }
        for f in flights[:5]  # return at most 5 entries
    ]

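6.2 Suggested Extensions

Memory and context: implement conversation history management

from collections import deque

class MemoryAgent(Agent):
    def __init__(self, *args, memory_size=5, **kwargs):
        super().__init__(*args, **kwargs)
        self.memory = deque(maxlen=memory_size)  # keeps only the last N messages

    def run(self, user_input):
        # Agent.run (section 3.2) takes a single string, so remembered turns are
        # folded into the prompt text; passing the message list straight to
        # super().run() would nest it inside one user message.
        history = "\n".join(f"{m['role']}: {m['content']}" for m in self.memory)
        prompt = f"{history}\nuser: {user_input}" if history else user_input

        response = super().run(prompt)
        self.memory.append({"role": "user", "content": user_input})
        self.memory.append({"role": "assistant", "content": response})

        return response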

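Multimodal support: extend the tools to handle images, audio, and so on

@tool("image_search", "Search for pictures of tourist attractions")
def image_search(attraction_name: str):
    # Call an image search API here (placeholder result below)
    return {
        "description": f"Picture of {attraction_name}",
        "image_url": "https://example.com/image.jpg"
    }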

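Authentication and authorization: add user authentication

from functools import wraps

# current_user is assumed to be provided by the application's auth layer
def authenticated_tool(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        if not current_user.authenticated:
            raise PermissionError("Please log in first")
        return func(*args, **kwargs)
    return wrapper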

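7. Common Problems and Solutions

7.1 Troubleshooting Failed Tool Calls

Symptom: the agent keeps trying to call the same tool but gets no response

Troubleshooting steps:

Check that the tool function defines its parameters correctly
Verify that the tool description accurately reflects what the tool does
Confirm that the tool definitions the LLM receives are correctly formatted
Check whether the API response actually contains a tool call request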
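The first check can be automated: before running a tool, compare the arguments the LLM supplied against the function's signature. This is a sketch of one possible approach using the standard inspect module. check_arguments is a hypothetical helper, and it validates argument names and arity only, not value types.

```python
import inspect


def check_arguments(func, params):
    """Return None if params fit func's signature, else a readable error string."""
    try:
        inspect.signature(func).bind(**params)
    except TypeError as exc:
        return f"bad arguments for {func.__name__}: {exc}"
    return None


def query_weather(city: str, date: str = None):
    return f"{city} {date}"


ok = check_arguments(query_weather, {"city": "Beijing"})
wrong_name = check_arguments(query_weather, {"town": "Beijing"})
```

Returning the error text to the LLM as the tool result, instead of raising, gives the model a chance to retry with corrected arguments.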

Solution:

# Add debug output in agent.py
print(f"Tool call requested: {response.tool_name}")
print(f"With parameters: {response.parameters}")

try:
    tool = self.tools[response.tool_name]
    result = tool.func(**response.parameters)
except Exception as e:
    print(f"Tool call failed: {str(e)}")
    result = f"Tool call failed: {str(e)}"

7.2 Handling API Rate Limits

Symptom: frequent 429 Too Many Requests errors

Solution:

Implement retries with exponential backoff

import time
import requests

def generate_response(self, messages, tools=None):
    max_retries = 3
    base_delay = 1

    for attempt in range(max_retries):
        try:
            response = requests.post(...)  # same arguments as in section 3.1
            response.raise_for_status()
            return self._process_response(response)
        except requests.exceptions.HTTPError as e:
            if e.response.status_code == 429:
                # Wait 1s, 2s, 4s, ... before the next attempt
                delay = base_delay * (2 ** attempt)
                time.sleep(delay)
            else:
                raise
    raise RuntimeError("Max retries exceeded")
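With base_delay = 1 and three attempts, the waits work out to 1, 2, and 4 seconds. When many clients retry on the same schedule they tend to hit the API in bursts again, so a small random jitter is commonly added to each wait. The function below only computes the schedule, which makes the arithmetic easy to check; backoff_delays and its jitter parameter are illustrative names, not part of the original code.

```python
import random


def backoff_delays(max_retries=3, base_delay=1.0, jitter=0.0, rng=random):
    """Return the wait before each retry: base_delay * 2**attempt plus jitter."""
    delays = []
    for attempt in range(max_retries):
        delay = base_delay * (2 ** attempt)
        delay += rng.uniform(0, jitter)  # jitter=0 keeps the schedule deterministic
        delays.append(delay)
    return delays


plain_schedule = backoff_delays()
jittered = backoff_delays(jitter=0.5, rng=random.Random(42))
```

In the retry loop, each value would be passed to time.sleep in place of the fixed delay.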

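Monitor API usage

import time

class MonitoredLLM(LLM):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.usage = {
            "last_hour": 0,
            "last_day": 0
        }

    def generate_response(self, messages, tools=None):
        response = super().generate_response(messages, tools)

        # Update the usage statistics
        now = time.time()
        self._update_usage(now)

        return response

    def _update_usage(self, timestamp):
        # Placeholder statistics: count every call. Expiring counts once they
        # fall outside the hour/day window is left to implement.
        self.usage["last_hour"] += 1
        self.usage["last_day"] += 1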
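The statistics logic itself is left open in the class above. One simple option is a sliding window that stores call timestamps and discards the ones older than the window. UsageWindow is a hypothetical helper sketched under that assumption; timestamps can be passed explicitly so the behavior is easy to verify.

```python
import time
from collections import deque


class UsageWindow:
    """Counts events that happened within the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._stamps = deque()  # timestamps, oldest first

    def record(self, timestamp=None):
        self._stamps.append(time.time() if timestamp is None else timestamp)

    def count(self, now=None):
        now = time.time() if now is None else now
        # Drop timestamps that have aged out of the window
        while self._stamps and self._stamps[0] <= now - self.window_seconds:
            self._stamps.popleft()
        return len(self._stamps)


hour = UsageWindow(3600)
for stamp in (0, 1800, 3500):
    hour.record(stamp)

at_one_hour = hour.count(now=3600)   # the call at t=0 has aged out
at_90_minutes = hour.count(now=5400) # the call at t=1800 has aged out too
```

One instance with a 3600-second window and another with 86400 seconds would cover the "last_hour" and "last_day" figures.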

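7.3 Result Formatting Problems

Symptom: the LLM cannot correctly parse the data a tool returns

Solution:

Standardize the tool return format

def standardize_response(data):
    """Normalize a tool response into a format the LLM can process easily."""
    if isinstance(data, (dict, list)):
        return {"status": "success", "data": data}
    return {"status": "success", "data": {"result": str(data)}}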

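Add result postprocessing

import json

def postprocess_tool_result(result):
    """Make sure the result is suitable for the LLM to process."""
    if isinstance(result, str):
        return result
    try:
        return json.dumps(result, ensure_ascii=False)
    except (TypeError, ValueError):
        # Not JSON-serializable: fall back to the plain string form
        return str(result)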

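8. Lessons from Real Projects

When deploying this kind of agent system in practice, a few points deserve special attention:

Atomic tool design: each tool should do one thing and do it well. Overly complex tools are hard for the LLM to call correctly. I once had a single tool that handled both weather and traffic queries, and call accuracy dropped sharply. Splitting it into two independent tools solved the problem.
Precise parameter descriptions: the parameter notes in a tool description should be as explicit as possible. For example, the "city" parameter should state which cities are currently supported; otherwise the LLM may try to call the tool with a location it does not support.
Friendly error handling: when a tool call fails, the error message should carry enough technical detail for debugging while staying friendly to the end user. We use layered error handling: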

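import logging

# Fragment from inside the tool-dispatch function. ValidationError and APIError
# stand for exception types defined by the application.
try:
    result = tool(**params)
except ValidationError as e:
    return {"error": "Invalid parameters", "details": str(e)}
except APIError as e:
    return {"error": "Service temporarily unavailable", "code": e.code}
except Exception as e:
    logging.error(f"Tool {tool.name} failed: {str(e)}")
    return {"error": "An error occurred while processing the request"}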