

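1. School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710129
2. School of Automation, Northwestern Polytechnical University, Xi'an, Shaanxi 710129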
Received: 24 September 2025
Accepted: 08 December 2025
Published: 25 December 2025
ZHAO Nan-nan, YANG Fan, WANG Hao, et al. A Hybrid Multi-Head Attention and Residual Correction Model for Elastic Scaling Behavior Learning in Cloud Platforms[J]. Acta Electronica Sinica, 2025, 53(12): 4408-4428. DOI:10.12263/DZXB.20250831
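The rapid development of cloud platforms and microservice architectures has made elastic scaling a core mechanism for ensuring performance and resource efficiency. Although existing studies have made progress in workload forecasting and hybrid modeling, most still take resource utilization, such as CPU and memory, as the prediction target and then trigger replica adjustments indirectly through threshold rules or control logic. This "decoupled prediction and control" design has several shortcomings. On the one hand, prediction errors are amplified in the rule-mapping step, which makes accurate scaling actions hard to guarantee. On the other hand, the hysteresis, cooldown, and discrete-action mechanisms of real schedulers are hard to characterize, so forecasts are difficult to apply directly. Learning the scaling behavior directly, that is, modeling how the replica count changes over time, is therefore a more control-oriented and more deployable approach. To this end, this paper proposes a multi-head attention and residual correction model for learning container scaling behavior. The model first uses an autoregressive integrated moving average (ARIMA) model to extract the trend of the replica-count series, then models the residual series with a bidirectional long short-term memory (BiLSTM) network, introduces multi-head attention (MHA) to automatically capture key temporal features, and applies residual correction to improve prediction accuracy and robustness. On the public Alibaba cluster-trace-microservices-v2022 dataset of real traces, the model is systematically compared with mainstream models including PETformer, SparseTSF, TFEGRU, GRU, Transformer, Seq2Seq-LSTM, Seq2Seq-GRU, Seq2Seq-Transformer, GRU-LSTM, CNN-LSTM, and CNN-LSTM-GRU. The experimental results show that, relative to the baselines, the proposed ARIMA-BiLSTM-MHA hybrid model achieves relative improvements of 1.57%~71.56% in mean squared error (MSE), 0.72%~46.67% in root mean squared error (RMSE), 1.57%~59.10% in mean absolute error (MAE), 1.97%~60.48% in mean absolute percentage error (MAPE), and 0.27%~15.70% in the coefficient of determination (R²), with R² rising to as high as 0.9543. Furthermore, in a container replica scaling control experiment built on the DeathStarBench socialNetwork benchmark application, the behavior-learning-driven scaling strategy, compared with the CPU-threshold HPA strategy, lowers the average P99 latency by 2.11% and effectively suppresses tail-latency spikes during load transitions, while reducing the average replica count by about 17%, which significantly alleviates resource over-provisioning. The results demonstrate that the method learns scaling actions more accurately and stably and outputs directly usable control semantics, providing efficient and reliable decision support for automatic scaling on cloud platforms.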
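The five evaluation metrics named above have standard definitions, which can be sketched in plain Python as follows. This is only an illustrative sketch: the function names `regression_metrics` and `relative_improvement` are hypothetical, and the relative-improvement formula is an assumption, since the abstract does not state how its percentages were computed.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MSE, RMSE, MAE, MAPE (in percent) and R² for two equal-length series."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("series must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE assumes no true value is zero (replica counts are normally >= 1).
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R² is undefined for a constant true series; NaN is returned in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape, "R2": r2}


def relative_improvement(baseline: float, ours: float, higher_is_better: bool = False) -> float:
    """Assumed definition: percentage change relative to the baseline value.

    Error metrics (MSE, RMSE, MAE, MAPE) improve when they decrease;
    R² improves when it increases, so pass higher_is_better=True for it.
    """
    delta = (ours - baseline) if higher_is_better else (baseline - ours)
    return 100.0 * delta / abs(baseline)
```

For example, `regression_metrics([2, 4, 4, 6], [2, 3, 5, 6])` describes a toy replica series where two of four predictions are off by one replica.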
The rapid development of cloud platforms and microservice architectures has made elastic scaling a critical mechanism for ensuring both performance and cost efficiency. Although prior studies have advanced workload forecasting and hybrid modeling, most approaches still focus on predicting resource utilization (e.g., CPU or memory) and then mapping forecasts to scaling actions through threshold rules or controller logic. This forecast-control decoupling amplifies prediction errors and fails to capture practical mechanisms such as hysteresis, cooldown, and discrete scaling steps, thereby limiting deployment feasibility. To overcome these limitations, we directly learn scaling behaviors, modeling replica count dynamics as autoscaler control actions. We propose a hybrid model, ARIMA-BiLSTM-MHA, that integrates ARIMA for long-term trend extraction, BiLSTM for residual sequence modeling, multi-head attention for capturing critical temporal dependencies, and residual correction for improving robustness against bursty and non-stationary workloads. We conduct extensive experiments on the real-world Alibaba cluster-trace-microservices-v2022 dataset, where we systematically compare our method with baselines including PETformer, SparseTSF, TFEGRU, GRU, Transformer, Seq2Seq-LSTM, Seq2Seq-GRU, Seq2Seq-Transformer, GRU-LSTM, CNN-LSTM, and CNN-LSTM-GRU. Our approach consistently outperforms these methods, achieving relative improvements of 1.57%~71.56% (MSE), 0.72%~46.67% (RMSE), 1.57%~59.10% (MAE), 1.97%~60.48% (MAPE), and 0.27%~15.70% (R²), with R² reaching up to 0.9543. Furthermore, we conduct container replica autoscaling experiments based on the DeathStarBench socialNetwork benchmark. Compared with the CPU-threshold HPA strategy, the behavior-learning-driven strategy reduces the average replica count by approximately 17% while lowering the average P99 latency by 2.11% and effectively suppressing tail-latency spikes during load transitions, thereby significantly mitigating resource over-provisioning. These results indicate that our model can learn and forecast scaling actions more accurately and stably, providing forward-looking decision support for autoscaling in practical cloud environments.
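The decompose-then-correct structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The trend step is a bare-bones ARIMA(1,1,0) fitted by least squares without a constant term. The BiLSTM with multi-head attention is replaced by a trivial stand-in that averages recent residuals. All function names and the `window` and `min_replicas` parameters are hypothetical.

```python
from typing import List, Sequence


def fit_ar1_on_differences(series: Sequence[float]) -> float:
    """Least-squares AR(1) coefficient on the first-differenced series."""
    d = [b - a for a, b in zip(series, series[1:])]
    num = sum(x * y for x, y in zip(d, d[1:]))
    den = sum(x * x for x in d[:-1])
    return num / den if den else 0.0


def trend_forecast(history: Sequence[float], phi: float) -> float:
    """One-step-ahead ARIMA(1,1,0) forecast: y_t + phi * (y_t - y_{t-1})."""
    return history[-1] + phi * (history[-1] - history[-2])


def residual_correction(residuals: Sequence[float], window: int = 3) -> float:
    """Stand-in for the learned residual model: mean of the most recent residuals."""
    recent = list(residuals[-window:])
    return sum(recent) / len(recent) if recent else 0.0


def predict_replicas(series: Sequence[float], window: int = 3, min_replicas: int = 1) -> int:
    """Predict the next replica count as trend forecast plus residual correction."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    phi = fit_ar1_on_differences(series)
    # In-sample one-step residuals: what the trend model failed to explain.
    residuals: List[float] = [
        series[t] - trend_forecast(series[:t], phi) for t in range(2, len(series))
    ]
    raw = trend_forecast(series, phi) + residual_correction(residuals, window)
    # Replica counts are discrete and bounded below, so round and clamp.
    return max(min_replicas, round(raw))
```

The output is a replica count rather than a utilization forecast, so it could in principle be handed to a scaler without a threshold-mapping step, which is the point the abstract makes about control-oriented prediction.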