This project combines the YOLOv5 object detection algorithm with a PyQt graphical interface to build a complete pedestrian and vehicle detection and counting system. As an engineer who has worked in computer vision for a long time, I have seen many colleagues run into all kinds of problems when connecting algorithm models to real applications. This article walks through my complete implementation of this project, covering model training, interface development, and the key details of system integration.

YOLOv5 is one of the most popular object detection frameworks in industry today; compared with earlier versions, it strikes a better balance between speed and accuracy. PyQt, as the most mature GUI toolkit in the Python ecosystem, lets us quickly build a professional user interface. Combining the two produces a system that is both a capable detector and convenient to operate.

The main features of this system include:

- Loading images and video files for detection
- Real-time pedestrian and vehicle detection with YOLOv5
- Per-class object counting and on-screen statistics
- Exporting annotated images and statistics
Before starting the project, we need a suitable development environment. I recommend Python 3.8 or 3.9, since these versions have the best compatibility with the major deep learning frameworks. Use conda to create an isolated virtual environment:

```bash
conda create -n yolov5_pyqt python=3.8
conda activate yolov5_pyqt
```
The project needs the following key libraries:

```bash
pip install torch torchvision torchaudio  # core PyTorch packages
pip install opencv-python pyqt5           # image processing and GUI
pip install matplotlib pandas             # analysis and visualization
```

A note on PyTorch: choose the install command that matches your CUDA version. If you want GPU acceleration, get the command for your environment from the official PyTorch website.
Next, clone the official YOLOv5 repository:

```bash
git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt  # install YOLOv5's dependencies
```

The repository contains the complete training and inference code; we will modify and extend it later.
COCO (Common Objects in Context) is one of the most widely used benchmark datasets for object detection. It covers 80 common object categories, which makes it well suited to pedestrian and vehicle detection.

Download the COCO dataset:

```bash
mkdir -p data/coco
cd data/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip train2017.zip
unzip val2017.zip
unzip annotations_trainval2017.zip
```

After extraction, your directory structure should look like this:

```
data/coco/
├── annotations/
├── train2017/
└── val2017/
```
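It can be worth sanity-checking the annotations before training, for example to confirm which category IDs correspond to the pedestrian and vehicle classes. The helper below is my own addition (not part of YOLOv5) and only needs the standard library:

```python
import json

def category_ids(annotation_file, wanted):
    """Map class names to COCO category ids from an instances_*.json file.

    Note that COCO category ids are not contiguous (e.g. 'car' is id 3).
    """
    with open(annotation_file) as f:
        cats = json.load(f)["categories"]
    return {c["name"]: c["id"] for c in cats if c["name"] in wanted}
```

For example, `category_ids("data/coco/annotations/instances_val2017.json", {"person", "car", "bus", "truck"})` returns the id mapping for the classes this system cares about.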
YOLOv5 ships several predefined models (YOLOv5s, YOLOv5m, YOLOv5l, YOLOv5x). We choose YOLOv5s as the base model; it offers a good trade-off between speed and accuracy.

Create a custom configuration file, data/coco.yaml:

```yaml
# COCO dataset configuration
train: data/coco/train2017.txt
val: data/coco/val2017.txt

# number of classes
nc: 80

# class names
names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', ...]
```
Start training with:

```bash
python train.py --img 640 --batch 16 --epochs 100 --data data/coco.yaml --cfg models/yolov5s.yaml --weights yolov5s.pt --name coco_yolov5s
```

Key parameters:

- `--img 640`: input image size
- `--batch 16`: batch size (adjust to your GPU memory)
- `--epochs 100`: number of training epochs
- `--weights yolov5s.pt`: start from the pretrained weights

During training, YOLOv5 automatically logs metrics such as mAP and the loss values. You can monitor progress with TensorBoard:

```bash
tensorboard --logdir runs/train
```
After training, evaluate the model's performance with:

```bash
python val.py --data data/coco.yaml --weights runs/train/coco_yolov5s/weights/best.pt --img 640
```

If some classes perform poorly, consider collecting more examples of those classes, applying stronger data augmentation, or training for more epochs at a higher input resolution.
PyQt is a set of Python bindings for the Qt framework, providing a rich collection of UI widgets and layout managers. We will create a main window containing an image display area, a control panel with buttons, and a statistics panel.

First, create a basic window skeleton:
```python
import sys
from PyQt5.QtCore import Qt
from PyQt5.QtWidgets import (QApplication, QMainWindow, QWidget, QVBoxLayout,
                             QHBoxLayout, QLabel, QPushButton)

class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("YOLOv5 Detection System")
        self.setGeometry(100, 100, 1200, 800)

        # Central widget
        self.central_widget = QWidget()
        self.setCentralWidget(self.central_widget)

        # Main layout
        self.main_layout = QHBoxLayout(self.central_widget)

        # Image display area on the left
        self.image_label = QLabel()
        self.image_label.setAlignment(Qt.AlignCenter)
        self.main_layout.addWidget(self.image_label, stretch=3)

        # Control panel on the right
        self.control_panel = QWidget()
        self.control_layout = QVBoxLayout(self.control_panel)

        # Control buttons
        self.load_button = QPushButton("Load File")
        self.detect_button = QPushButton("Start Detection")
        self.save_button = QPushButton("Save Results")
        self.control_layout.addWidget(self.load_button)
        self.control_layout.addWidget(self.detect_button)
        self.control_layout.addWidget(self.save_button)
        self.control_layout.addStretch()
        self.main_layout.addWidget(self.control_panel, stretch=1)
```
Next, we implement image loading and display. PyQt displays images via QPixmap, while OpenCV uses BGR channel order, so a conversion is needed:

```python
from PyQt5.QtGui import QImage, QPixmap
from PyQt5.QtCore import Qt
import cv2

def load_image(self, file_path):
    # Read the image with OpenCV
    cv_image = cv2.imread(file_path)
    if cv_image is not None:
        # Convert the color space: BGR -> RGB
        rgb_image = cv2.cvtColor(cv_image, cv2.COLOR_BGR2RGB)
        # Convert to QImage
        h, w, ch = rgb_image.shape
        bytes_per_line = ch * w
        qt_image = QImage(rgb_image.data, w, h, bytes_per_line, QImage.Format_RGB888)
        # Scale the image to fit the display area
        scaled_pixmap = QPixmap.fromImage(qt_image).scaled(
            self.image_label.size(), Qt.KeepAspectRatio, Qt.SmoothTransformation)
        self.image_label.setPixmap(scaled_pixmap)
```
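The BGR-to-RGB step can also be done with a plain NumPy slice, which is handy for verifying the conversion without Qt or OpenCV. A minimal check:

```python
import numpy as np

# A 1x2 "image": one blue pixel and one red pixel, in OpenCV's BGR order
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels
# (equivalent to cv2.COLOR_BGR2RGB for a 3-channel image)
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255]: the blue pixel in RGB order
```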
For video files, we use a QTimer to drive real-time display:

```python
from PyQt5.QtCore import QTimer

class MainWindow(QMainWindow):
    def __init__(self):
        # ... other initialization code ...
        # Video handling
        self.video_capture = None
        self.timer = QTimer()
        self.timer.timeout.connect(self.update_video_frame)

    def load_video(self, file_path):
        self.video_capture = cv2.VideoCapture(file_path)
        if self.video_capture.isOpened():
            self.timer.start(30)  # advance one frame every 30 ms

    def update_video_frame(self):
        ret, frame = self.video_capture.read()
        if ret:
            # Process and display the current frame
            self.process_frame(frame)

    def stop_video(self):
        if self.video_capture:
            self.timer.stop()
            self.video_capture.release()
```
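The 30 ms interval above corresponds to roughly 33 fps. A small helper makes the fps-to-interval relationship explicit (this helper is my own addition, not part of the Qt API):

```python
def timer_interval_ms(target_fps):
    """Milliseconds between frames for a desired playback rate."""
    return int(round(1000 / target_fps))

print(timer_interval_ms(30))  # 33, so QTimer.start(33) plays back at ~30 fps
```

For true real-time playback you would read the source's native frame rate (e.g. from `cv2.CAP_PROP_FPS`) and pass it through this conversion.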
We need to wrap YOLOv5's detection logic so it can be called from PyQt. Create a dedicated detector class (the imports below match older releases of the ultralytics/yolov5 repository; newer versions renamed some of these utilities):

```python
import torch
from yolov5.models.experimental import attempt_load
from yolov5.utils.general import non_max_suppression, scale_coords
from yolov5.utils.plots import plot_one_box

class YOLOv5Detector:
    def __init__(self, weights_path, device='cuda' if torch.cuda.is_available() else 'cpu'):
        self.device = device
        self.model = attempt_load(weights_path, map_location=device)
        self.model.eval()
        # Class names
        self.names = self.model.module.names if hasattr(self.model, 'module') else self.model.names

    def detect(self, image, conf_thres=0.5, iou_thres=0.45):
        """
        Run object detection.
        :param image: input image (numpy array)
        :param conf_thres: confidence threshold
        :param iou_thres: IoU threshold
        :return: annotated result image, detection statistics
        """
        # Preprocessing
        img = self.preprocess(image)
        # Inference
        with torch.no_grad():
            pred = self.model(img, augment=False)[0]
        # Non-maximum suppression
        pred = non_max_suppression(pred, conf_thres, iou_thres)
        # Postprocessing
        result_img = image.copy()
        stats = {}
        for det in pred:  # detections for each image
            if det is not None and len(det):
                det[:, :4] = scale_coords(img.shape[2:], det[:, :4], image.shape).round()
                for *xyxy, conf, cls in det:
                    label = f'{self.names[int(cls)]} {conf:.2f}'
                    plot_one_box(xyxy, result_img, label=label, color=(0, 255, 0), line_thickness=2)
                    # Per-class counting
                    class_name = self.names[int(cls)]
                    stats[class_name] = stats.get(class_name, 0) + 1
        return result_img, stats
```
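The detect method above calls self.preprocess, which is not shown. YOLOv5's preprocessing letterboxes the image to the network input size while preserving aspect ratio. The scale-and-padding arithmetic at its core can be sketched standalone (this is a simplified version of YOLOv5's letterbox logic, with my own function name; the real implementation also pads to stride multiples and normalizes pixel values):

```python
def letterbox_params(h, w, target=640):
    """Scale ratio and per-side padding that fit an h x w image into a
    target x target canvas without changing its aspect ratio."""
    r = min(target / h, target / w)            # uniform resize factor
    new_h, new_w = round(h * r), round(w * r)  # resized content size
    pad_h, pad_w = target - new_h, target - new_w
    # Split padding evenly between top/bottom and left/right
    return r, (pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2)

print(letterbox_params(480, 640))  # (1.0, (80, 80), (0, 0))
```

A 480x640 frame fits the 640x640 canvas at scale 1.0 with 80 pixels of gray padding above and below; scale_coords later undoes exactly this transform to map boxes back onto the original frame.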
Integrate the detector into the main window to enable live detection:

```python
class MainWindow(QMainWindow):
    def __init__(self):
        # ... other initialization code ...
        # Initialize the detector
        self.detector = YOLOv5Detector('runs/train/coco_yolov5s/weights/best.pt')
        # Connect signals and slots
        self.load_button.clicked.connect(self.open_file_dialog)
        self.detect_button.clicked.connect(self.toggle_detection)
        # Detection state
        self.detection_active = False

    def toggle_detection(self):
        self.detection_active = not self.detection_active
        self.detect_button.setText("Stop Detection" if self.detection_active else "Start Detection")

    def process_frame(self, frame):
        if self.detection_active:
            # Run detection
            result_img, stats = self.detector.detect(frame)
            # Update the display
            self.update_image_display(result_img)
            self.update_stats_panel(stats)
        else:
            # Show the raw frame
            self.update_image_display(frame)
```
Add a statistics panel to display the detection results:

```python
class MainWindow(QMainWindow):
    def __init__(self):
        # ... other initialization code ...
        # Statistics panel
        self.stats_panel = QWidget()
        self.stats_layout = QVBoxLayout(self.stats_panel)
        # Title
        self.stats_title = QLabel("Detection Statistics")
        self.stats_title.setStyleSheet("font-weight: bold; font-size: 16px;")
        self.stats_layout.addWidget(self.stats_title)
        # Container for statistics entries
        self.stats_items = {}
        # Add to the control panel
        self.control_layout.addWidget(self.stats_panel)

    def update_stats_panel(self, stats):
        # Remove the existing entries
        for item in self.stats_items.values():
            item.setParent(None)
        self.stats_items.clear()
        # Add the new entries
        for class_name, count in stats.items():
            label = QLabel(f"{class_name}: {count}")
            self.stats_layout.addWidget(label)
            self.stats_items[class_name] = label
```
To keep the UI responsive, run the expensive detection step in a separate thread:

```python
from PyQt5.QtCore import QThread, pyqtSignal

class DetectionThread(QThread):
    finished = pyqtSignal(object, object)  # signal: detection finished

    def __init__(self, detector, frame):
        super().__init__()
        self.detector = detector
        self.frame = frame

    def run(self):
        result_img, stats = self.detector.detect(self.frame)
        self.finished.emit(result_img, stats)

class MainWindow(QMainWindow):
    def __init__(self):
        # ... other initialization code ...
        # Detection thread
        self.detection_thread = None

    def process_frame(self, frame):
        if self.detection_active:
            # If the previous detection thread is still running, stop it first
            # (terminate() is abrupt; a gentler option is to simply drop this frame)
            if self.detection_thread and self.detection_thread.isRunning():
                self.detection_thread.terminate()
            # Create and start a new detection thread
            self.detection_thread = DetectionThread(self.detector, frame)
            self.detection_thread.finished.connect(self.on_detection_finished)
            self.detection_thread.start()
        else:
            self.update_image_display(frame)

    def on_detection_finished(self, result_img, stats):
        self.update_image_display(result_img)
        self.update_stats_panel(stats)
```
To speed up inference, we can quantize the model:

```python
class YOLOv5Detector:
    def __init__(self, weights_path, device='cuda' if torch.cuda.is_available() else 'cpu'):
        self.device = device
        self.model = attempt_load(weights_path, map_location=device)
        # Dynamic quantization (CPU only)
        if device == 'cpu':
            self.model = torch.quantization.quantize_dynamic(
                self.model, {torch.nn.Linear}, dtype=torch.qint8)
        self.model.eval()
```

Note that dynamic quantization only covers the Linear layers here; since YOLOv5 is dominated by convolutions, the speedup on this particular model is modest.
When running video detection for long periods, pay attention to resource cleanup:

```python
class MainWindow(QMainWindow):
    def closeEvent(self, event):
        # Release resources
        if self.video_capture:
            self.video_capture.release()
        if self.detection_thread and self.detection_thread.isRunning():
            self.detection_thread.terminate()
        super().closeEvent(event)
```
A complete counting system needs to track objects as they move through the video:

```python
from collections import defaultdict

class ObjectCounter:
    def __init__(self, names):
        """
        :param names: list of class names (e.g. the detector's self.names)
        """
        self.names = names
        self.track_history = defaultdict(list)
        self.counted_ids = set()
        self.total_counts = defaultdict(int)
        self.next_id = 1  # monotonically increasing so track IDs are never reused

    def update(self, detections, frame_width):
        """
        Update the counter state.
        :param detections: list of detections [(x1, y1, x2, y2, conf, cls)]
        :param frame_width: frame width in pixels
        :return: detections with a track ID appended, and the running totals
        """
        current_ids = set()
        tracked = []
        for det in detections:
            *xyxy, conf, cls = det
            x1, y1, x2, y2 = xyxy
            center_x = (x1 + x2) / 2
            center_y = (y1 + y2) / 2

            # Find the nearest existing track
            min_dist = float('inf')
            best_id = None
            for obj_id, history in self.track_history.items():
                if history:
                    last_x, last_y = history[-1]
                    dist = ((center_x - last_x)**2 + (center_y - last_y)**2)**0.5
                    if dist < min_dist and dist < 50:  # distance threshold
                        min_dist = dist
                        best_id = obj_id

            if best_id is None:
                # New object
                best_id = self.next_id
                self.next_id += 1
                self.track_history[best_id] = []

            # Update the track
            self.track_history[best_id].append((center_x, center_y))
            if len(self.track_history[best_id]) > 30:  # keep the last 30 points
                self.track_history[best_id] = self.track_history[best_id][-30:]

            # Check whether the object crossed the counting line (assumed at mid-frame)
            if len(self.track_history[best_id]) > 1:
                prev_x = self.track_history[best_id][-2][0]
                if prev_x < frame_width/2 and center_x >= frame_width/2 and best_id not in self.counted_ids:
                    self.total_counts[self.names[int(cls)]] += 1
                    self.counted_ids.add(best_id)

            current_ids.add(best_id)
            tracked.append(tuple(det) + (best_id,))  # append the track ID

        # Drop tracks for objects that are no longer visible
        for obj_id in list(self.track_history.keys()):
            if obj_id not in current_ids:
                del self.track_history[obj_id]
                self.counted_ids.discard(obj_id)
        return tracked, self.total_counts
```

This greedy nearest-neighbor matcher is deliberately simple; for crowded scenes, a dedicated tracker such as SORT or DeepSORT is more robust.
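The core counting rule above — count an object once when its center crosses the mid-line from left to right — can be exercised in isolation. A stripped-down sketch with a single simulated track (this toy CrossingCounter is my own illustration, not part of the system above):

```python
class CrossingCounter:
    """Counts left-to-right crossings of a vertical line at x = line_x."""
    def __init__(self, line_x):
        self.line_x = line_x
        self.prev_x = {}      # track id -> last center x
        self.counted = set()  # track ids already counted

    def update(self, track_id, center_x):
        prev = self.prev_x.get(track_id)
        if (prev is not None and prev < self.line_x <= center_x
                and track_id not in self.counted):
            self.counted.add(track_id)
        self.prev_x[track_id] = center_x
        return len(self.counted)

counter = CrossingCounter(line_x=320)
for x in (250, 300, 330, 360):  # one object moving left to right
    total = counter.update(track_id=1, center_x=x)

print(total)  # 1: the object is counted exactly once despite staying past the line
```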
For zone-based detection and alerting, we need to test whether an object lies inside a user-defined region:

```python
class ZoneDetector:
    def __init__(self, zone_polygon):
        """
        :param zone_polygon: vertices of the detection zone [(x1, y1), (x2, y2), ...]
        """
        self.zone_polygon = zone_polygon

    def is_in_zone(self, x, y):
        """
        Test whether a point lies inside the polygon,
        using the ray-casting algorithm.
        """
        n = len(self.zone_polygon)
        inside = False
        p1x, p1y = self.zone_polygon[0]
        for i in range(n + 1):
            p2x, p2y = self.zone_polygon[i % n]
            if y > min(p1y, p2y):
                if y <= max(p1y, p2y):
                    if x <= max(p1x, p2x):
                        if p1y != p2y:
                            xinters = (y - p1y) * (p2x - p1x) / (p2y - p1y) + p1x
                        if p1x == p2x or x <= xinters:
                            inside = not inside
            p1x, p1y = p2x, p2y
        return inside
```
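A quick way to convince yourself the ray-casting test is right is to run it against a simple square. Here is a standalone copy of the same algorithm (a free function rather than a method, purely so it is easy to try):

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting point-in-polygon test (same logic as ZoneDetector.is_in_zone)."""
    n = len(polygon)
    inside = False
    p1x, p1y = polygon[0]
    for i in range(n + 1):
        p2x, p2y = polygon[i % n]
        # Toggle 'inside' each time a rightward ray from (x, y) crosses an edge
        if y > min(p1y, p2y) and y <= max(p1y, p2y) and x <= max(p1x, p2x):
            if p1y != p2y:
                xinters = (y - p1y) * (p2x - p1x) / (p2y - p1y) + p1x
            if p1x == p2x or x <= xinters:
                inside = not inside
        p1x, p1y = p2x, p2y
    return inside

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(point_in_polygon(5, 5, square))   # True: center of the square
print(point_in_polygon(15, 5, square))  # False: outside, to the right
```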
Add a save feature that exports both the annotated image and the statistics:

```python
import json
import os
from datetime import datetime
from PyQt5.QtWidgets import QMessageBox

class MainWindow(QMainWindow):
    def __init__(self):
        # ... other initialization code ...
        # Connect the save button
        self.save_button.clicked.connect(self.save_results)

    def save_results(self):
        if not hasattr(self, 'current_stats'):
            return
        os.makedirs("results", exist_ok=True)
        # Save the image
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        img_path = f"results/detection_{timestamp}.jpg"
        cv2.imwrite(img_path, cv2.cvtColor(self.current_image, cv2.COLOR_RGB2BGR))
        # Save the statistics
        stats_path = f"results/stats_{timestamp}.json"
        with open(stats_path, 'w') as f:
            json.dump(self.current_stats, f, indent=2)
        # Notify the user
        QMessageBox.information(self, "Saved", f"Results saved to:\n{img_path}\n{stats_path}")
```
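The file-naming and JSON-export pattern can be tried without the GUI. A minimal sketch using only the standard library (writing to a temporary directory instead of results/, and with save_stats as a hypothetical helper name):

```python
import json
import os
import tempfile
from datetime import datetime

def save_stats(stats, out_dir):
    """Write detection statistics to a timestamped JSON file; return its path."""
    os.makedirs(out_dir, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = os.path.join(out_dir, f"stats_{timestamp}.json")
    with open(path, "w") as f:
        json.dump(stats, f, indent=2)
    return path

out_dir = tempfile.mkdtemp()
path = save_stats({"person": 3, "car": 5}, out_dir)
with open(path) as f:
    print(json.load(f))  # {'person': 3, 'car': 5}
```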
Common problems in practice include:

- Model loading failures
- Slow detection speed
- Low detection accuracy

**Batch inference**

When processing multiple images, batching improves GPU utilization:
```python
def batch_detect(self, image_list, conf_thres=0.5, iou_thres=0.45):
    # Preprocessing
    img_batch = torch.cat([self.preprocess(img) for img in image_list], 0)
    # Inference
    with torch.no_grad():
        pred = self.model(img_batch)[0]
    # Non-maximum suppression
    pred = non_max_suppression(pred, conf_thres, iou_thres)
    # Postprocessing
    results = []
    for i, det in enumerate(pred):
        result_img = image_list[i].copy()
        stats = {}
        if det is not None and len(det):
            det[:, :4] = scale_coords(img_batch.shape[2:], det[:, :4], image_list[i].shape).round()
            for *xyxy, conf, cls in det:
                label = f'{self.names[int(cls)]} {conf:.2f}'
                plot_one_box(xyxy, result_img, label=label, color=(0, 255, 0), line_thickness=2)
                class_name = self.names[int(cls)]
                stats[class_name] = stats.get(class_name, 0) + 1
        results.append((result_img, stats))
    return results
```
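Feeding the batch method efficiently means grouping incoming frames into fixed-size chunks. The grouping itself is plain Python (chunked is a hypothetical helper, shown only to illustrate the batching step):

```python
def chunked(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

print(chunked([1, 2, 3, 4, 5, 6, 7], 3))  # [[1, 2, 3], [4, 5, 6], [7]]
```

Note that the last batch may be smaller than batch_size; batch_detect handles this naturally since it iterates over whatever it is given.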
**Asynchronous processing pipeline**

Use a producer-consumer pattern to build a processing pipeline:

```python
from queue import Queue
from threading import Thread

class ProcessingPipeline:
    def __init__(self, detector, batch_size=4):
        self.detector = detector
        self.batch_size = batch_size
        self.input_queue = Queue()
        self.output_queue = Queue()
        self.worker = Thread(target=self.process_batches)
        self.worker.daemon = True
        self.worker.start()

    def process_batches(self):
        batch = []
        while True:
            item = self.input_queue.get()
            if item is None:  # shutdown signal
                if batch:
                    self.process_and_put(batch)
                break
            batch.append(item)
            if len(batch) >= self.batch_size:
                self.process_and_put(batch)
                batch = []

    def process_and_put(self, batch):
        frames = [item[0] for item in batch]
        callbacks = [item[1] for item in batch]
        results = self.detector.batch_detect(frames)
        for result, callback in zip(results, callbacks):
            self.output_queue.put((result, callback))
```
**Path handling**

Use os.path to deal with cross-platform path issues:

```python
import os

config_dir = os.path.expanduser("~/.yolov5_detector")
if not os.path.exists(config_dir):
    os.makedirs(config_dir)
model_path = os.path.join(config_dir, "best.pt")
```
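On Python 3.6+, the same idea can be written more compactly with pathlib, which also normalizes separators across platforms. An equivalent sketch, not from the original project (it uses a temporary directory so it is side-effect free; the real app would start from Path.home()):

```python
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())  # stands in for Path.home() in this sketch
config_dir = base / ".yolov5_detector"
config_dir.mkdir(parents=True, exist_ok=True)  # no-op if it already exists
model_path = config_dir / "best.pt"

print(model_path.name)  # best.pt
```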
**High-DPI support**

For high-resolution displays, enable Qt's high-DPI scaling (the attributes must be set before the QApplication is created, as shown here):

```python
if __name__ == '__main__':
    # Enable high-DPI support
    QApplication.setAttribute(Qt.AA_EnableHighDpiScaling)
    QApplication.setAttribute(Qt.AA_UseHighDpiPixmaps)
    app = QApplication(sys.argv)
    window = MainWindow()
    window.show()
    sys.exit(app.exec_())
```
To run on machines without a Python environment, package the project as an executable.

Install PyInstaller:

```bash
pip install pyinstaller
```

Create a build script, build.spec:

```python
# -*- mode: python -*-
block_cipher = None

a = Analysis(['main.py'],
             pathex=['/path/to/your/project'],
             binaries=[],
             datas=[('yolov5', 'yolov5'), ('data', 'data')],
             hiddenimports=[],
             hookspath=[],
             runtime_hooks=[],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher,
             noarchive=False)
pyz = PYZ(a.pure, a.zipped_data,
          cipher=block_cipher)
exe = EXE(pyz,
          a.scripts,
          [],
          exclude_binaries=True,
          name='YOLOv5_Detector',
          debug=False,
          bootloader_ignore_signals=False,
          strip=False,
          upx=True,
          console=False,
          icon='icon.ico')
coll = COLLECT(exe,
               a.binaries,
               a.zipfiles,
               a.datas,
               strip=False,
               upx=True,
               name='YOLOv5_Detector')
```

Run the build:

```bash
pyinstaller build.spec
```
To shrink the deployment package and improve runtime efficiency, the model can be optimized:

```python
def optimize_model(input_weights, output_weights):
    # Load the model
    model = attempt_load(input_weights, map_location='cpu')
    # Convert to TorchScript
    model = model.fuse().eval()
    input_tensor = torch.rand(1, 3, 640, 640)
    traced_model = torch.jit.trace(model, input_tensor)
    # Quantize (CPU only)
    quantized_model = torch.quantization.quantize_dynamic(
        traced_model, {torch.nn.Linear}, dtype=torch.qint8)
    # Save the optimized model
    quantized_model.save(output_weights)
```

For Windows, you can build an installer with NSIS or Inno Setup, bundling the executable, the model weights, and any configuration files.
To extend the system with multi-camera input:

```python
class MultiCameraController:
    def __init__(self):
        self.cameras = {}
        self.timers = {}

    def add_camera(self, camera_id, rtsp_url=None):
        if rtsp_url:
            cap = cv2.VideoCapture(rtsp_url)
        else:
            cap = cv2.VideoCapture(camera_id)
        if cap.isOpened():
            self.cameras[camera_id] = cap
            timer = QTimer()
            timer.timeout.connect(lambda: self.update_frame(camera_id))
            timer.start(30)
            self.timers[camera_id] = timer
            return True
        return False

    def update_frame(self, camera_id):
        ret, frame = self.cameras[camera_id].read()
        if ret:
            # Process and display the frame
            # (process_frame is assumed to be a pyqtSignal defined on this class)
            self.process_frame.emit(camera_id, frame)
```
The detection service can also be deployed in the cloud behind an API:

```python
from flask import Flask, request, jsonify
import numpy as np
import cv2
import base64

app = Flask(__name__)
detector = YOLOv5Detector('best.pt')

@app.route('/detect', methods=['POST'])
def detect_api():
    # Get the uploaded image
    file = request.files.get('image')
    if not file:
        return jsonify({'error': 'No image provided'}), 400
    # Decode the image
    img_bytes = file.read()
    nparr = np.frombuffer(img_bytes, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    # Run detection
    result_img, stats = detector.detect(img)
    # Encode the result image
    _, buffer = cv2.imencode('.jpg', result_img)
    result_base64 = base64.b64encode(buffer).decode('utf-8')
    return jsonify({
        'stats': stats,
        'result_image': result_base64
    })

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```
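On the client side, the result_image field has to be base64-decoded back into bytes before it can be written to disk or decoded as a JPEG. The roundtrip, using only the standard library (the payload here is dummy bytes standing in for real JPEG data):

```python
import base64

jpeg_bytes = b"\xff\xd8\xff\xe0 fake jpeg payload"      # stand-in for cv2.imencode output
encoded = base64.b64encode(jpeg_bytes).decode("utf-8")  # what the API puts in the JSON
decoded = base64.b64decode(encoded)                     # what the client recovers

print(decoded == jpeg_bytes)  # True: the roundtrip is lossless
```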
To port the application to mobile platforms, you can look at PyQt for Android or Kivy.

Package an Android app with Buildozer:

```bash
pip install buildozer
buildozer init
```

Edit buildozer.spec:

```ini
[app]
title = YOLOv5 Detector
package.name = yolov5detector
package.domain = org.yolov5
source.dir = .
source.include_exts = py,png,jpg,kv,pt
version = 0.1
requirements = python3,pyqt5,opencv,torch
```

Build the APK:

```bash
buildozer android debug deploy run
```
In real projects, I have found that YOLOv5's detection quality depends heavily on the quality of the training data. In pedestrian and vehicle scenarios in particular, samples covering different lighting conditions and viewing angles make a large difference to the final result. For production use, I recommend collecting data from your specific scene and fine-tuning on it, which can significantly improve detection accuracy.