Last year I took over a power load forecasting project that required predicting regional electricity demand 24 hours ahead. Traditional statistical methods regularly produced holiday forecast errors above 15%, and the business team was chasing us daily for a fix. After trying mainstream models like LSTM and Prophet, we found that a BP (backpropagation) neural network performed well during steady periods, but hyperparameter tuning became the real problem: two weeks of manual tuning improved results by less than 2%, and the team was going crazy.
Then I remembered Bayesian Optimization from my graduate school days and decided to use it to overhaul our BP network's tuning workflow. To my surprise, the very first round of experiments cut MAPE by 3.8 percentage points, and the final model kept forecast error within 8.2% over the Spring Festival holiday. This post walks through the full solution, with complete code and the pitfalls textbooks won't tell you about.
Conventional grid search and random search have two fatal flaws for neural network tuning: every trial is evaluated independently, so nothing learned from earlier evaluations guides where to sample next; and the number of candidate combinations explodes as hyperparameters are added, so most of the budget is burned in regions that are already clearly unpromising.
Bayesian optimization builds a surrogate model with a Gaussian Process and closes in on the optimum with far fewer samples. Its core advantages: every past evaluation feeds the surrogate, which then predicts which hyperparameters are most promising to try next; and an acquisition function balances exploration (sampling where the surrogate is uncertain) against exploitation (sampling where it predicts low error).
In our tests, given the same compute budget, Bayesian optimization outperformed random search by 23%-47%.
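To make the surrogate-model loop concrete, here is a minimal sketch using scikit-optimize's `gp_minimize` on a toy one-dimensional objective (illustration only; everything later in this post uses Hyperopt's TPE algorithm rather than a Gaussian Process):

```python
# Minimal GP-based Bayesian optimization with scikit-optimize.
# The toy objective stands in for an expensive validation-error evaluation.
from skopt import gp_minimize

def toy_objective(x):
    # x is a list of one value; the minimum sits at x[0] = 0.3
    return (x[0] - 0.3) ** 2

result = gp_minimize(toy_objective,   # function to minimize
                     [(0.0, 1.0)],    # bounds of the single search dimension
                     n_calls=20,      # total evaluation budget
                     random_state=0)
print(result.x, result.fun)           # best point and best objective value
```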
Our baseline network structure:
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential([
    Dense(64, activation='relu', input_shape=(n_features,)),  # n_features: input dimension
    Dropout(0.3),
    Dense(32, activation='relu'),
    Dense(1)  # single output: predicted load
])
```
The hyperparameters to optimize: hidden layer width, dropout rate, learning rate, batch size, and number of training epochs.
The key is to make the model's validation error the optimization objective:
```python
def objective(params):
    model = build_model(params)  # build the network from the sampled hyperparameters
    history = model.fit(X_train, y_train,
                        validation_data=(X_val, y_val),
                        epochs=int(params['epochs']),
                        batch_size=int(params['batch_size']),
                        verbose=0)
    # fmin minimizes its objective, so return the validation MAPE itself
    # (returning its negative, a common mistake, would maximize the error)
    return history.history['val_mape'][-1]
```
Implementation with the Hyperopt library:
```python
import numpy as np
from hyperopt import fmin, tpe, hp

space = {
    'hidden_units': hp.quniform('hidden_units', 16, 256, 16),  # multiples of 16
    'dropout': hp.uniform('dropout', 0.1, 0.5),
    'lr': hp.loguniform('lr', np.log(1e-5), np.log(1e-2)),     # log-uniform learning rate
    'batch_size': hp.choice('batch_size', [32, 64, 128, 256]),
    'epochs': hp.quniform('epochs', 50, 300, 25)
}

best = fmin(objective, space, algo=tpe.suggest, max_evals=100)
```
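One subtlety: `fmin` reports `hp.choice` parameters as list indices and `hp.quniform` values as floats. Decode the result with Hyperopt's `space_eval` before reusing it:

```python
from hyperopt import space_eval

best_params = space_eval(space, best)  # e.g. maps a batch_size index of 2 back to 128
print(best_params)
```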
Distributed optimization via MongoTrials:
```python
from hyperopt.mongoexp import MongoTrials

# note the required '/jobs' suffix on the connection string
trials = MongoTrials('mongo://localhost:27017/hyperopt_db/jobs',
                     exp_key='load_forecast')  # exp_key: arbitrary experiment name
best = fmin(objective, space, algo=tpe.suggest, max_evals=100, trials=trials)
```
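Note that `MongoTrials` only queues the trials; they are executed by worker processes you start separately on each machine, e.g. `hyperopt-mongo-worker --mongo=localhost:27017/hyperopt_db --poll-interval=0.1`. The objective must also live in an importable module (not a notebook cell) so the workers can deserialize it.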
Pitfall 1: inconsistent normalization. Fit the scaler on the training split only, then reuse that same fitted scaler for validation, test, and online inference; fitting on the full dataset (or refitting per split) leaks future statistics into training and skews the comparison between trials.
Pitfall 2: time-series leakage. Never shuffle when splitting load data: a random split puts future samples into the training set, so validation error looks far better than true 24-hour-ahead performance. Always split chronologically, as the sketch below shows.
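For cross-validated tuning, sklearn's `TimeSeriesSplit` yields leakage-free folds (a sketch on toy data; the pipeline below uses a single chronological split instead):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(100).reshape(-1, 1)  # toy, chronologically ordered features
tscv = TimeSeriesSplit(n_splits=3)
for train_idx, val_idx in tscv.split(X):
    # every validation window lies strictly after its training window
    assert train_idx.max() < val_idx.min()
    print(f"train [0..{train_idx.max()}] -> val [{val_idx.min()}..{val_idx.max()}]")
```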
Pitfall 3: memory creep during long runs. A hundred trials, each building a fresh Keras model, can gradually exhaust RAM; on Linux we cap the process's address space so a runaway trial fails fast instead of freezing the machine (here `get_memory` reads total RAM in KB from `/proc/meminfo`):

```python
import resource

def get_memory():
    # total system memory in KB, read from /proc/meminfo (Linux only)
    with open('/proc/meminfo') as f:
        return next(int(line.split()[1]) for line in f
                    if line.startswith('MemTotal'))

def memory_limit(percentage=0.8):
    # cap this process's address space at a fraction of physical RAM
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS,
                       (int(get_memory() * 1024 * percentage), hard))
```
Putting it all together, the full pipeline:

```python
# Data preparation
def prepare_data(X, y):
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.model_selection import train_test_split
    # split chronologically first (shuffle=False) to avoid time-series leakage
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, shuffle=False, test_size=0.2)
    # fit the scaler on the training split only, then apply it to validation data
    scaler = MinMaxScaler()
    X_train = scaler.fit_transform(X_train)
    X_val = scaler.transform(X_val)
    # ... other processing
    return X_train, y_train, X_val, y_val
# Model construction
def build_model(params):
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout
    from tensorflow.keras.optimizers import Adam
    units = int(params['hidden_units'])  # hp.quniform yields floats; layer widths must be ints
    model = Sequential([
        Dense(units, activation='relu'),
        Dropout(params['dropout']),
        Dense(units // 2, activation='relu'),
        Dense(1)
    ])
    model.compile(
        optimizer=Adam(learning_rate=params['lr']),
        loss='mse',
        metrics=['mape']
    )
    return model
# Optimization loop
def run_optimization():
    from hyperopt import fmin, tpe, STATUS_OK
    from tensorflow.keras.callbacks import EarlyStopping

    def objective(params):
        model = build_model(params)
        history = model.fit(
            X_train, y_train,
            validation_data=(X_val, y_val),
            epochs=int(params['epochs']),
            batch_size=int(params['batch_size']),
            verbose=0,
            callbacks=[EarlyStopping(patience=10)]
        )
        return {
            # fmin minimizes 'loss'; report the best (lowest) validation MAPE,
            # since with early stopping the final epoch is rarely the best one
            'loss': min(history.history['val_mape']),
            'status': STATUS_OK,
            'params': params
        }

    best = fmin(objective, space, algo=tpe.suggest, max_evals=100)
    return best
```
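After the search finishes, decode the winning configuration with `space_eval` (see above) and retrain once for deployment. A sketch reusing the names defined earlier; in practice we also fold the validation window back into the training set first:

```python
from hyperopt import space_eval

best = run_optimization()
best_params = space_eval(space, best)  # indices/floats -> actual hyperparameter values
final_model = build_model(best_params)
final_model.fit(X_train, y_train,
                epochs=int(best_params['epochs']),
                batch_size=int(best_params['batch_size']),
                verbose=0)
```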
Key metrics before and after optimization:
| Metric | Manual tuning | Bayesian optimization | Change |
|---|---|---|---|
| MAPE (%) | 12.7 | 8.2 | -35.4% |
| Training time (h) | 38.5 | 21.2 | -45.0% |
| Hyperparameter trials | 76 | 100 | +31.6% |
And the improvements it delivered for the actual business: