Latest Issue

    Vol. 53, Issue 12 (2025)

      Special Issue: Recipients of CIE Science and Technology Awards

    • BAO Chen-xi, SHENG Min, ZHOU Di, JI Si-jing, SHI Yan, LI Jian-dong
      Vol. 53, Issue 12, Pages: 4199-4215(2025) DOI: 10.12263/DZXB.20250885
      Abstract: The significant improvement in the onboard computing capabilities of large-scale satellite networks (LSNs) has enabled autonomous satellite resource management, a key means of ensuring timely end-to-end (E2E) delivery for diversified services. However, the highly dynamic topology of LSNs makes it difficult to coordinate inter-satellite communication and computing resources efficiently, which poses severe challenges to meeting the differentiated timeliness requirements of various services and to ensuring high-quality E2E services. To this end, this paper establishes a virtual node and link mapping model to form a virtual network that statically covers service requests, effectively avoiding the impact of high-speed satellite movement on E2E services. Furthermore, a network status information extraction model that integrates the LSN topology is designed; it captures, in real time, the dynamic evolution of structured inter-satellite communication and computing resources and of service demand characteristics. Leveraging a parameter-sharing decision-making mechanism for slice resource allocation, the approach achieves satellite-ground collaborative intelligent slice resource management and slices network communication and computing resources according to service demand characteristics. In addition, by designing a regional resource management mode and introducing service orientation information, local topology awareness and target positioning capabilities are provided for autonomous satellite E2E service decision-making. The approach achieves on-demand, efficient coordination of dynamic communication and computing resources, leveraging onboard processing capabilities to improve the E2E performance of diversified services under latency constraints. Simulation results show that, compared with the non-topology-aware algorithm, the proposed algorithm improves service completion performance by 28%, 25.2%, and 39.3%, respectively, under different communication and computing resources and numbers of service requests.
      Keywords: large-scale satellite networks; diversified services; communication and computing collaboration; resource management; end-to-end service
      Updated: 2026-04-24
    • Fine Measurement Technology of Insect Monitoring Radar

      HU Cheng, WANG Rui, LI Wei-dong, WANG Jiang-tao, YE Zi-han, CAI Jiong, JIANG Qi, ZHANG Ji-chuan, TAN Li-jia
      Vol. 53, Issue 12, Pages: 4216-4230(2025) DOI: 10.12263/DZXB.20250883
      Abstract: Insect migration is a key factor in the cross-regional outbreak of pests and the large-scale spread of diseases, leading to severe problems such as crop loss, environmental pollution, and biosecurity threats. Monitoring migratory insects is therefore of great significance for national strategies on food security and invasive-species biosecurity. Entomological radar is a specialized radar system designed for detecting migratory insects. Compared with traditional methods such as aerial net trapping and high-altitude light trapping, radar enables all-weather, all-day, non-invasive, and wide-area monitoring of migratory insects, making it one of the most effective technical means for studying and monitoring insect migration. It has already revealed many collective migration phenomena and patterns, such as layer formation and common orientation. However, research on insect migration remains a global challenge and is listed among the 125 scientific puzzles identified by the journal Science. To unravel the mysteries of insect migration, international entomology experts generally agree that the first step is to achieve flight trajectory analysis and species identification for individual insects. The core of trajectory analysis lies in measuring behavioral parameters such as insect head orientation, while species identification relies on measuring biological parameters such as body length, body width, and wingbeat frequency. However, insects range in size from millimeters to centimeters, spanning scattering regimes from the Rayleigh region to the resonance region, which results in complex scattering characteristics. Their radar cross-section (RCS) can be as low as -70 dBsm, yielding extremely weak echo signals with very low signal-to-noise ratios, and wingbeat echoes are weaker still. Consequently, conventional radar systems struggle to measure behavioral and biological parameters precisely. To address this, our research group has studied weak-target measurement technology in the resonance region. We have proposed a series of methods, including full-polarization multi-aspect 3D orientation estimation for insects in the resonance region, insect size measurement via multi-frequency full-polarization feature mapping, and wingbeat frequency measurement via enhancement of extremely weak micro-Doppler echoes. These approaches overcome the challenges of accurately measuring individual parameters such as head orientation, body length, body width, and wingbeat frequency. Building on this, we have developed a new-generation radar system for monitoring individual insects, achieving for the first time the measurement of 3D head orientation, body length, and body width, while significantly improving the accuracy of wingbeat frequency measurement. The system is currently deployed in Dongying, Shandong, a national agricultural high-tech zone and a key pathway in northern China’s migration corridor, where it operates for routine observation. It provides essential technological and data support for research on insect migration behavior and for early warning and control of major pest outbreaks.
      Keywords: entomological radar for individual monitoring; resonance region measurement; orientation; body size; wingbeat frequency
      Updated: 2026-04-24
    • SHEN Meng, JIA Ji-zhe, ZHAO Bu-fan, CHANG Li-yuan, YANG Ming, REN Chen-chen, SONG Yue, ZHU Lie-huang
      Vol. 53, Issue 12, Pages: 4231-4249(2025) DOI: 10.12263/DZXB.20250731
      Abstract: The widespread adoption of encrypted internet traffic ensures confidentiality and privacy, yet attackers increasingly leverage encryption techniques to conceal malicious network activities. Because encrypted malicious traffic exhibits characteristics similar to benign encrypted traffic, it can easily evade traditional detection methods based on feature signatures and deep packet inspection (DPI). Current research on encrypted malicious traffic detection focuses primarily on supervised learning paradigms. While effective against known attack types, these methods rely heavily on large, continuously updated sets of labeled malicious traffic samples. Confronted with rapidly evolving malware variants and the widespread use of encryption tunneling techniques, supervised learning models struggle to generalize to unseen attack types and show significant limitations in adaptability. Furthermore, the feature representations in existing methods often depend on manually engineered statistical features, which fail to capture the deep semantic information and complex temporal dynamics of malicious behaviors within the underlying data packets of encrypted flows, resulting in limited feature discriminability and ineffectiveness against novel attack patterns. To address these challenges, we propose MalGuard, a reliable method for detecting unknown encrypted malicious traffic via self-supervised learning of burst-feature tokens. By analyzing the underlying mechanisms of network transmission and observing the key characteristic distributions of benign and malicious traffic, we propose a novel burst-aware traffic tokenization method that yields a correlated representation of the semantic information and temporal dynamics of data packets and provides a high-information-density input for subsequent model pre-training. Building on this token representation, we design two traffic-specific self-supervised pre-training tasks: Span-Masked Language Modeling and a Span Boundary Objective. These tasks mask and reconstruct spans of packet content to enhance the model’s holistic perception of contextual dependencies within the data, enabling the extraction of generalized traffic features. Leveraging these features, we further construct a lightweight unsupervised learning algorithm adapted to the intrinsic distribution of traffic characteristics. By identifying outliers in the high-dimensional representation space, it achieves reliable detection of encrypted malicious traffic without requiring labeled malicious data. To validate the effectiveness of MalGuard, we conducted experimental evaluations on three public datasets. The results demonstrate that MalGuard outperforms state-of-the-art (SOTA) methods in detecting unknown encrypted malicious traffic. Specifically, we define the imbalance ratio β as the ratio of benign to malicious samples. At β = 4:1 and β = 16:1, MalGuard achieves average F1 scores of 91.76% and 84.56%, surpassing the best existing baseline by 6.01 and 28.33 percentage points, respectively.
      Keywords: network security; encrypted traffic analysis; malicious traffic detection; malware; self-supervised pre-training; unsupervised learning
      Updated: 2026-04-24
    • Infrared-Visible Image Fusion via Heterogeneous Multi-Level Distillation

      ZHANG Qi, SONG Hong, LI Jin-fu, MA Shi-han, LIN Yu-cong, YANG Jian
      Vol. 53, Issue 12, Pages: 4250-4266(2025) DOI: 10.12263/DZXB.20250764
      Abstract: Knowledge distillation transfers the representation capability of a complex teacher network to a lightweight student network, thereby enhancing model performance and deployment efficiency. However, existing knowledge distillation-based multimodal image fusion methods often neglect the heterogeneity of feature representations and modality preferences between teacher and student networks, as well as the inherent differences across modalities. This limitation results in inefficient knowledge transfer, insufficient semantic alignment, and degraded fusion performance. To address these issues, we propose an infrared and visible image fusion method based on multi-level knowledge distillation across heterogeneous models. Specifically, a cross-layer knowledge transfer mechanism is designed: at the feature layer, attention is used to guide the precise transfer of infrared salient targets and visible-light textures; at the relationship layer, similarity-based relational matching and topological structure alignment are employed to enhance cross-modal semantic adaptation; and at the output layer, response constraints are applied to ensure both the visual consistency and the semantic integrity of the fused results, alleviating the information imbalance caused by mismatched modality preferences between teacher and student networks. In addition, we construct a task-adaptive lightweight CNN-Transformer dual-branch student network that simultaneously models global information and captures local details, thereby enhancing its ability to receive and integrate heterogeneous knowledge. Experimental results on the MSRS, RoadScene, TNO, and M3FD datasets demonstrate that, under the guidance of three teacher models with significantly different architectures, the proposed method outperforms both the teacher models and state-of-the-art approaches in terms of the correlation coefficient (CC), peak signal-to-noise ratio (PSNR), sum of the correlations of differences (SCD), and structural similarity index measure (SSIM), while requiring only 0.0772 M parameters and achieving an inference time of 31.22 ms on a server platform. Moreover, the model maintains an inference time of 250.31 ms on the Jetson AGX Xavier edge platform, indicating strong suitability for edge deployment and practical applications.
      Keywords: infrared-visible image fusion; knowledge distillation; heterogeneous models; lightweight design; feature alignment
      Updated: 2026-04-24
    • Uncertainty Fusion-Based Multimodal Remote Sensing Data Classification

      HE Xin, CHEN Yu-shi, GU Yan-feng, LIU Tian-zhu
      Vol. 53, Issue 12, Pages: 4267-4278(2025) DOI: 10.12263/DZXB.20250882
      Abstract: Classification is a key technique and an active research topic in the interpretation of multimodal remote sensing data. In recent years, deep learning methods have achieved significant progress in pixel-level classification of multimodal remote sensing data. However, the different modalities contained in multimodal remote sensing data exhibit variability in their predicted results after feature extraction, which is referred to as predictive uncertainty. This uncertainty degrades the classification accuracy of multimodal remote sensing data classification methods. To reduce predictive uncertainty, this paper proposes an uncertainty-aware fusion framework for multimodal remote sensing data classification. From the perspective of evidence quality, the framework jointly extracts spatial and channel features from different modalities (e.g., synthetic aperture radar data, light detection and ranging data, and hyperspectral images) and constructs corresponding evidential neural networks. A specifically designed evidential fusion function is employed to integrate multimodal evidential information. During fusion, when different modalities produce conflicting predictions, a conflict-aware dynamic weight adjustment mechanism is introduced. This mechanism adaptively reduces the weight of conflicting modalities using a discount factor and dynamically reallocates the quality of evidence, thereby reducing model uncertainty. To further minimize the discrepancy among predictions from different modalities, a consistency loss function is incorporated into the model parameter optimization process to constrain the consistency of predictions across modalities. Experiments are conducted on three publicly available international multimodal remote sensing datasets, and the proposed method is compared with six state-of-the-art approaches. The results demonstrate that the proposed method achieves significant improvements in classification performance.
      Keywords: multimodal remote sensing data; evidential deep learning; uncertainty; pixel-level remote sensing image classification
      Updated: 2026-04-24
    • HAN Jia-yu, DAI Jun-yan, CHENG Qiang
      Vol. 53, Issue 12, Pages: 4279-4287(2025) DOI: 10.12263/DZXB.20250343
      Abstract: With the continuous growth of modern communication demands and the rapid development of RF technologies, metasurface-based wireless communication systems have emerged as one of the key technologies for next-generation wireless communication and radar systems owing to their lightweight design, high efficiency, and low cost, and they have attracted extensive attention from researchers worldwide. In this paper, we propose a wireless communication system based on a 1-bit transmissive metasurface. Specifically, we designed a low-loss transmissive metasurface operating in the 4.75-5.65 GHz frequency band that is capable of 180° phase modulation of transmitted electromagnetic waves. Furthermore, the metasurface can realize precise 2-bit phase modulation by adjusting the time delay and duty cycle of the control signals. Based on this theoretical framework, we established an efficient quadrature phase shift keying (QPSK) wireless communication system at 5 GHz. Experimental results demonstrate that the system achieves high-quality QPSK modulation and enables stable, accurate directional communication. These findings illustrate the performance and application potential of time-domain coded transmissive metasurfaces in wireless communications and provide theoretical foundations and technical support for the design and optimization of next-generation metasurface communication systems.
      Keywords: transmissive metasurface; low-loss; time-domain coding metasurface; wireless communications; QPSK modulation
      Updated: 2026-04-24

      PAPERS

    • LI De-gui, SUN Qi, MA Zi-yin, MEI Zhong-lei
      Vol. 53, Issue 12, Pages: 4288-4295(2025) DOI: 10.12263/DZXB.20250644
      Abstract: Because the general solution of the Helmholtz equations for spherical resonant cavities, obtained with the separation-of-variables method, uses only the integer-order associated Legendre function of the first kind, it cannot validly characterize the electromagnetic properties of shaped metal resonant cavities derived from this structure. To address this drawback, this study proposes a universal analytical framework for the electromagnetic properties of shaped metal resonant cavities. Associated Legendre functions of the first and second kinds, of arbitrary order and degree, are constructed by introducing the generalized hypergeometric function, so as to complete the general solution of the Helmholtz equations. On this basis, the analytical solutions for the electromagnetic fields of the fundamental and higher-order transverse magnetic and transverse electric modes are derived using Borgnis’ potential function method and verified by finite element numerical simulations. The results indicate that the relative errors between the analytical and numerical solutions for the resonance frequencies of the fundamental and higher-order modes are 0.070% and 0.069%, respectively. Furthermore, the normalized electromagnetic field distributions of the two solutions are consistent, with root-mean-square errors of only 1.670×10⁻³ and 2.667×10⁻³, respectively, validating the accuracy and reliability of the method. This study not only extends the conclusions for classical spherical cavities to shaped structures but also expands the application scope of analytical electromagnetic field modeling, which is instrumental to the precise design of novel microwave and optical devices.
      Keywords: shaped metal resonant cavity; separation-of-variables method; spherical coordinates; Helmholtz equation; associated Legendre functions of arbitrary order and degree; Borgnis’ potential function
      Updated: 2026-04-24
    • Rotatable Antenna-Aided Multi-User MISO Physical Layer Key Generation

      ZHU Zheng-yu, LIU Ji-tong, LI Xin-ze, ZHENG Bei-xiong, JIN Liang, HUANG Kai-zhi, ZHONG Zhou
      Vol. 53, Issue 12, Pages: 4296-4304(2025) DOI: 10.12263/DZXB.20250959
      Abstract: To counter the eavesdropping attacks and information leakage risks in wireless communication systems, a physical layer key generation (PLKG) scheme for multi-user multiple-input single-output (MISO) systems assisted by rotatable antennas (RAs) is proposed. By exploiting the directional controllability of RAs, an optimization problem is formulated in which the objective is to maximize the system’s sum key generation rate, subject to constraints on the total transmit power, the minimum quality-of-service requirement of each user, the maximum zenith angle of each RA, and the normalization of the pointing vector. To solve this non-convex problem, an optimization algorithm based on alternating optimization (AO), semi-definite relaxation (SDR), and successive convex approximation (SCA) is designed. Simulation results show that the proposed scheme achieves a higher sum key generation rate than the benchmark scheme, which demonstrates the effectiveness of RAs in enhancing the physical layer key generation rate.
      Keywords: rotatable antenna; physical layer key generation; alternating optimization; semi-definite relaxation; successive convex approximation
      Updated: 2026-04-24
    • MAO Xin-long, XIONG Yu-yang, YU Jian-hang, LI Shao-ping, DONG Yang, JIANG Yan-feng
      Vol. 53, Issue 12, Pages: 4305-4316(2025) DOI: 10.12263/DZXB.20250163
      Abstract: To reduce the nonlinearity and temperature drift of the output of condenser micro-electro-mechanical systems (MEMS) silicon microphones, a novel calibration algorithm based on the Padé approximation is proposed and tested, building on existing sensor output calibration strategies. The algorithm has been adopted in practical industrial applications and is highly practical. At 50% relative humidity (RH), the accuracy of the MEMS sensor improves from 6%FS to 1.2%FS. Compared with the current mainstream polynomial fitting calibration algorithm at the same accuracy, the proposed algorithm consumes about 14% less computational memory. The algorithm provides a useful reference for optimizing the output response of other sensors or circuits.
      Keywords: condenser MEMS silicon microphones; calibration algorithms; temperature and humidity calibration; linear fitting; Padé approximation
      Updated: 2026-04-24
    • Binocular Vision Localization and Measurement Based on Star-RTMPose

      ZHANG Meng-quan, XU Si-xiang, YANG Yu, WU Duan-zheng
      Vol. 53, Issue 12, Pages: 4317-4329(2025) DOI: 10.12263/DZXB.20250422
      Abstract: A binocular vision localization and measurement method based on star-enhanced real-time multi-person pose estimation (Star-RTMPose) is proposed to address the low efficiency, insufficient matching accuracy, sensitivity to illumination changes, and complex parameter tuning of traditional binocular vision feature point detection algorithms, which limit the accuracy of binocular vision localization and measurement. Taking continuous casting billets in the iron and steel metallurgy industry as the research object, the method targets the precise positioning and dimension measurement required for burr removal after flame cutting and proposes a corresponding technical implementation. First, images of continuous casting billets are collected using calibrated binocular cameras. The LabelMe tool is then used to annotate target regions and keypoints, which are uniformly converted to the Microsoft Common Objects in Context (MS COCO) format for model training. Subsequently, a two-stage framework of “target detection-keypoint extraction” is adopted to achieve precise detection: the real-time models for object detection (RTMDet) algorithm is first used to quickly locate the main area of the continuous casting billet, and the improved real-time multi-person pose estimation (RTMPose) model, Star-RTMPose, is then used to extract keypoint coordinates. The improvements are as follows: (1) the star triple block (StarTriBlock) module is introduced into the RTMPose backbone to enhance the network’s ability to characterize high-level semantic features of the target through a multi-branch dynamic fusion mechanism, making full use of the maximum receptive field and global spatial correlation information of this stage; (2) the 7×7 large-kernel convolution at the network head is replaced with the maximum depthwise separable convolution 2 (MaxDSC2) module, which is based on depthwise separable convolution and whose intermediate channel number is set to 0.45 times the input channel number, improving sensitivity to semantic information while reducing the number of parameters; (3) the traditional channel attention module is replaced with the simple parameter-free attention module (SimAM), which generates channel-spatial three-dimensional joint weights through the closed-form solution of an energy function, strengthens the network’s ability to capture spatial features, and avoids parameter redundancy. Finally, by combining the calibration parameters of the binocular camera with the triangulation principle, the three-dimensional reconstruction of keypoints and the dimensional measurement of continuous casting billets are completed. The experimental results show that, in the keypoint detection task, the inference time of the improved Star-RTMPose model for a single image is only 9.86 ms; compared with the baseline model RTMPose-T, its average precision (AP) is improved by 1.09 percentage points, its percentage of correct keypoints (PCK) by 0.40 percentage points, and its normalized mean error (NME) is reduced by 42.86%; with fewer parameters, the overall performance of the improved model is significantly superior to that of mainstream models such as HRNet-W32 and SwinTransformer-T. In terms of three-dimensional measurement accuracy, the relative error of the proposed method in measuring the long-side dimension of the Type 1 continuous casting billet is 1.715 and 0.365 percentage points lower than that of the traditional oriented FAST and rotated BRIEF (ORB) algorithm and the improved features from accelerated segment test (FAST) algorithm, respectively. The method effectively addresses the poor robustness of traditional algorithms, improves both detection accuracy and measurement accuracy, and thereby meets the demand for high-precision detection in industrial scenarios.
      Keywords: binocular vision; RTMPose; attention module; three-dimensional reconstruction; dimensional measurement; keypoint detection
      Updated: 2026-04-24
    • Mixed Space Mapping Strategy for Microwave Waveguide Filters

      GONG Jian-qiang, ZHANG Chen-lei, LIAO Zi-hao, LIU Yi-run, YU Han-chao, XIE Jian
      Vol. 53, Issue 12, Pages: 4330-4336(2025) DOI: 10.12263/DZXB.20250768
      Abstract: This paper presents an effective mixed space mapping (MSM) strategy for microwave waveguide filters that combines one-step aggressive space mapping (OS-ASM) with implicit space mapping (ISM). Executing OS-ASM enables the initial physical parameters to rapidly reach the proximity of the best solution, even if the initial parameters lie far from the targeted ones. The standard ISM steps conducted afterwards enable the simulation results of the high-fidelity model, or fine model (FM), of the waveguide filter to match the specified theoretical performance almost perfectly within a few iteration steps. Compared with traditional direct optimization methods, the proposed MSM strategy directly and iteratively optimizes the low-fidelity model, or coarse model (CM), which is based on the mode-matching technique and has a small computational burden. The few FM simulations, which are computationally expensive, are used only to verify the physical parameters produced by optimizing the continuously calibrated CM, which is why substantial efficiency and accuracy are achieved simultaneously. In addition, the demanding requirement in traditional OS-ASM that the CM must be able to realize an extreme fit is greatly relaxed. An eight-pole Chebyshev rectangular waveguide bandpass filter (BPF) and a four-pole dual-mode circular waveguide BPF are taken as examples to detail the implementation process and the effectiveness of the proposed MSM strategy.
      Keywords: waveguide filter; mixed space mapping; fine model; coarse model; mode-matching technique
      Updated: 2026-04-24
    • ZHANG Lu, LI Ming-ai
      Vol. 53, Issue 12, Pages: 4337-4348(2025) DOI: 10.12263/DZXB.20250435
      Abstract: Decoding motor imagery electroencephalogram (MI-EEG) signals with deep learning models is an active research topic in the field of brain-computer interface (BCI) technology. Given the time-frequency characteristics and individual differences of MI-EEG, numerous studies have conducted time-frequency analysis on MI-EEG and widely applied its time-frequency representations to MI-EEG decoding. However, most existing methods ignore the spatial distribution characteristics of multi-electrode MI-EEG and fail to fully explore and utilize the topological relationships between electrodes, which compromises the integrity of feature information and limits further improvement of decoding performance. To adaptively learn the topological information of multi-electrode MI-EEG and effectively enhance its time-frequency-spatial feature information, this paper proposes an attention network with continuous wavelet convolution and graph embedding (CWC-GEAN). The network consists of five modules: a multi-branch continuous wavelet convolution module (MCWCM), a multi-branch dynamic graph embedding module (MGEM), a multi-branch feature channel attention module (MFCAM), a multi-branch feature channel-time attention module (MFCTAM), and a feature fusion and classification block (FFCB). First, the original multi-electrode MI-EEG signals are input into the MCWCM, where continuous wavelet convolution is performed on four sub-bands (1 Hz to 8 Hz, 9 Hz to 16 Hz, 17 Hz to 24 Hz, and 25 Hz to 32 Hz) in four branches, respectively, and the optimal multi-scale frequency-spatial-temporal feature representations are obtained through dynamic learning of the scale factors. Then, a prior adjacency matrix containing the topological information between electrodes is constructed based on mutual information. The MGEM adaptively learns and adjusts this prior adjacency matrix for each sub-band and embeds it into the frequency-spatial-temporal feature representations of the corresponding branch, yielding graph structure features that contain the topological information between electrodes. Next, the MFCAM and the MFCTAM extract deep features from the graph structure features of each branch, successively obtain the feature channel attention vectors and feature channel-time attention matrices automatically, and apply feature weighting to obtain multi-branch discriminative features. Finally, the FFCB fuses the multi-branch discriminative features to obtain the final classification results. The performance of CWC-GEAN is evaluated on the public BCI Competition IV 2a dataset and the High-Gamma Dataset, with average classification accuracies of 85.45% and 95.09% and average Kappa values of 0.806 and 0.934, respectively. The results show that CWC-GEAN can adaptively learn and capture the time-frequency information and electrode topological information of MI-EEG and enhance time-frequency-spatial features. It also exhibits good model robustness and consistent classification results, with certain performance advantages over popular methods.
      Keywords: electroencephalogram; motor imagery; continuous wavelet convolution; topological information; attention; frequency-spatial-temporal feature
      Updated: 2026-04-24
    • LIU Ming-jie, LYU Meng-lin, LIU Ping, CHEN Jun-sheng, PIAO Chang-hao, KANG Zong-xu
      Vol. 53, Issue 12, Pages: 4349-4363(2025) DOI: 10.12263/DZXB.20250429
      摘要:Suspended atmospheric particulates in hazy weather markedly degrade the imaging quality of visible light systems, which manifests as reduced image contrast, color distortion, and loss of fine-grained details. Such image deterioration substantially impairs the performance of computer vision tasks. Consequently, image dehazing is commonly employed as a preprocessing step for high-level visual tasks to supply downstream processes with high-quality visual data. U-Net-based image dehazing architectures have garnered widespread attention due to their efficiency, detail-oriented feature extraction, and lightweight characteristics. However, current U-Net-based networks realize image dehazing based on features extracted from the spatial domain, ignoring the impact of features in the frequency domain. In addition, the decoder of U-Net-based networks always realizes feature upsampling by nearest-neighbor interpolation, which may cause spatial information loss and hinder semantic information transmission from high-level to low-level features, adversely affecting clear image restoration. To address the above issues, this paper proposes a novel image dehazing algorithm with dual-domain feature interaction and local correlation upsampling. Specifically, the dual-domain feature interaction module, comprising a dual-path feature fusion submodule and a frequency domain feature enhancement submodule, is designed to extract and fuse the spatial domain and frequency domain features of the image. It enhances the ability to capture the structural features of the image by introducing frequency domain information. A local correlation upsampling module embedded in the U-Net decoder is designed to capture the intrinsic correlation of local information in each feature map through an attention mechanism, and to transmit the high-level features together with compensatory information from the low-level features.
In addition, we propose a contrast analysis method based on heat maps to visually compare the dehazing performance of different methods, which uses color gradients to quantitatively measure the differences in the dehazing effect. It can effectively reflect the performance differences of various dehazing methods in terms of image detail restoration. The experimental results demonstrate that the dehazing effect of our proposed method is superior to that of the compared methods in both quantitative and qualitative evaluations. The peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) values on the SOTS-Indoor, SOTS-Outdoor and Haze4K datasets reach 41.46 dB and 0.9943, 37.73 dB and 0.9936, and 34.72 dB and 0.993, respectively.
      关键词:image dehaze;U-Net-based architecture;spatial and frequency domain feature interaction;local correlation upsampling;information fusion
      更新时间:2026-04-24
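For readers unfamiliar with the PSNR figures quoted in the abstract above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It works on flattened pixel lists with intensities assumed to lie in [0, 1]; the function name and the toy inputs are illustrative and are not taken from the paper.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: a uniform error of 0.1 on every pixel.
print(psnr([0.0, 0.0], [0.1, 0.1]))
```

Because the scale is logarithmic, each additional 10 dB corresponds to a tenfold reduction in mean squared error.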
    • ZHANG Shun-wai, CUI Bo-yu
      Vol. 53, Issue 12, Pages: 4364-4375(2025) DOI: 10.12263/DZXB.20250611
      摘要:The fundamental idea of relay cooperative technology is to enable single-antenna terminals to share their antennas through cooperation, forming a virtual multiple-input multiple-output (MIMO) system, which is an effective way to utilize spatial resources. Reconfigurable intelligent surface (RIS) technology leverages the flexible control capabilities of information metamaterials to directly manipulate the propagation direction of wireless electromagnetic waves. It can provide additional transmission paths beyond direct line-of-sight, enabling full-duplex transmission without introducing self-interference, while offering advantages such as low hardware cost, low energy consumption, flexible configuration, and intelligent reconfiguration. Existing research indicates that RIS and traditional relays should not merely compete with or replace one another. Despite their similar functions, they are fundamentally different, and there is a clear necessity and practical demand for coexistence and cooperation between them. Active reconfigurable intelligent surface (ARIS) technology, unlike traditional passive RIS (PRIS), incorporates active components capable of amplifying or processing signals, thereby providing additional power control and signal processing capabilities. To meet the inherent demands of future 6G communication networks, such as ubiquitous connectivity, full-coverage networks, green and low-carbon operations, and inclusive intelligence, and to increase the transmission rate and extend coverage, a hybrid system combining ARIS and relay technology is investigated. It draws on the advantages of ARIS, such as flexible beamforming and signal regulation capabilities, low overall power consumption, and relatively controllable cost, along with the merits of traditional relays, such as stability, reliability, and widespread deployment.
An ARIS-assisted multi-antenna decode-and-forward (DF) relay cooperative multi-user multiple-input single-output (MU-MISO) system model is established, where the base station (BS) transmits information to multiple single-antenna users with the assistance of both half-duplex DF relays and ARIS. To maximize the sum rate, a joint optimization problem of the source transmit beamforming vector, relay transmit beamforming vector, relay receive beamforming vector, and ARIS active beamforming matrix is formulated. Due to the coupling of optimization variables and the presence of non-convex constraints, the original joint problem is non-convex and difficult to solve directly. An alternating optimization (AO) algorithm is used to decouple the original joint optimization problem into multiple subproblems, which are then converted into semi-definite programming (SDP) problems and solved via the sequential convex approximation (SCA) method. Simulation results demonstrate that the proposed system significantly outperforms existing benchmark schemes. For example, when the number of ARIS elements is 20, the sum rate of the proposed system is improved by 7.4%, 11.2%, 12.9%, and 8.4% compared with benchmark schemes 1, 2, 3, and 4, respectively.
      关键词:ARIS;decode-and-forward relay;MU-MISO;alternating optimization algorithm;sum rate   
      更新时间:2026-04-24
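The sum-rate objective named in the abstract above can be illustrated with a small sketch. It only evaluates the Shannon sum rate, the sum over users of log2(1 + SINR_k), for given per-user SINR values in linear scale; it does not reproduce the paper's beamforming optimization. The `pre_log` parameter is a hypothetical knob (for example 0.5 to model a two-slot half-duplex relay protocol) and the SINR values are made up.

```python
import math
from typing import Sequence


def sum_rate(sinrs: Sequence[float], pre_log: float = 1.0) -> float:
    """Sum of per-user rates in bit/s/Hz for linear-scale SINR values."""
    if any(s < 0 for s in sinrs):
        raise ValueError("SINR values must be non-negative")
    return pre_log * sum(math.log2(1.0 + s) for s in sinrs)


# Two users with SINRs of 1 and 3 (linear scale, illustrative only).
print(sum_rate([1.0, 3.0]))
print(sum_rate([1.0, 3.0], pre_log=0.5))
```

In an alternating-optimization loop, a function like this would serve as the objective evaluated after each subproblem update to monitor convergence.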
    • WEI Jian-hao, ZHOU Ting-sen, LI Chuang, WEN Yan-hua, LI Ke-qin
      Vol. 53, Issue 12, Pages: 4376-4393(2025) DOI: 10.12263/DZXB.20250638
      摘要:Multimodal pedestrian trajectory prediction in city-scale traffic models faces critical challenges including sparse heterogeneous data with strong spatiotemporal correlations and privacy risks during large model pre-training. However, existing privacy-preserving methods for large models predominantly focus on protecting a single modality, such as images, text, or trajectories, while neglecting the high-dimensional correlation structures among modalities in the fusion space and the risk of cross-modal semantic leakage embedded in the gradients. As a result, these methods are vulnerable to model inversion and reconstruction attacks that can expose users’ real trajectory patterns and behavioral preferences, and they fail to effectively protect the privacy of both multimodal fused data and gradient correlations. Moreover, conventional attention mechanisms designed for dense data struggle to efficiently process sparse multimodal traffic features, resulting in suboptimal prediction accuracy. To address these issues, this paper proposes a privacy-preserving multimodal pedestrian trajectory prediction scheme for large model pre-training (PMPTL), achieving dual-efficient protection for both multimodal data and pre-trained models, along with high-accuracy prediction. Specifically, we design an innovative multimodal sparse trajectory flow fusion method based on a combination of Transformer and Mamba (MSTM), where the Transformer mechanism models global dependencies in pedestrian trajectory sequences and the Mamba mechanism is introduced to reduce the complexity of long-sequence modeling, thereby enabling efficient fusion of sparse spatiotemporal features. Secondly, we propose a resolution-aware grid partitioning-based adaptive weighted differential privacy (RGADP) method, which dynamically allocates privacy budgets according to grid resolution and the density of grid-level trajectory features, thereby achieving high-utility protection of fused feature privacy. 
Next, we propose a multimodal feature enhancement algorithm based on a dual-branch adaptive sparse self-attention mechanism (DBAS). By designing a dual-branch self-attention structure that dynamically adjusts weights to strengthen the representation of sparse data features, DBAS enables the large model to efficiently capture key characteristics of sparse trajectories in sparse scenarios and thereby improves pre-training efficiency. Additionally, an adaptive spatiotemporal Top-K sparsification with dithering quantization (ASDQ) method is introduced to reduce gradient redundancy and ensure secure model training. Finally, we propose an adaptive weighted aggregation-based multimodal sparse trajectory prediction framework (AWMT), which dynamically re-weights different model parameters to balance the strength of privacy protection and the accuracy of pedestrian trajectory prediction, thereby achieving high-precision trajectory forecasting. Theoretical analysis demonstrates that our scheme satisfies ϵ-DP protection. Experimental results on two real-world datasets show that our scheme reduces prediction error by 10% compared to state-of-the-art approaches and improves communication efficiency by 18.43%.  
      关键词:privacy-preserving large model;adaptive differential privacy;pedestrian trajectory prediction;efficient multimodal data fusion;dithering quantization;spatiotemporal feature modeling   
      更新时间:2026-04-24
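The gradient sparsification step mentioned in the abstract above builds on Top-K selection, which the following minimal sketch illustrates. It shows plain Top-K only: the adaptive spatiotemporal choice of K and the dithering quantization described for ASDQ are not reproduced, and the function name is illustrative.

```python
from typing import List, Sequence


def top_k_sparsify(gradient: Sequence[float], k: int) -> List[float]:
    """Keep the k entries with the largest magnitude and zero out the rest."""
    if k <= 0:
        return [0.0] * len(gradient)
    # Indices ordered by decreasing absolute value; ties keep their original order.
    order = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(gradient)]


print(top_k_sparsify([0.1, -3.0, 2.0, 0.5], k=2))
```

Only the retained entries and their indices need to be transmitted, which is where the communication saving comes from.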
    • CHEN Zhi-xiong, YAN Yu-hao, ZHOU Zhen-yu
      Vol. 53, Issue 12, Pages: 4394-4407(2025) DOI: 10.12263/DZXB.20250501
      摘要:Dual-mode communication technology, based on both power line communication (PLC) and wireless communication, offers advantages such as extensive coverage, high reliability, and flexible access. It can significantly improve the reliability, rate, and latency of data transmission, garnering considerable attention in scenarios such as metering network communications. However, in practical applications, factors such as dual-mode media access control (MAC) algorithms, burst traffic, and hybrid channel fading significantly impact the latency and other performance characteristics of metering networks. This poses considerable challenges for the theoretical analysis and computation of system latency boundary performance. To address the challenge of deterministic delay analysis under dual-mode, dual-medium, multi-parameter conditions, this paper proposes a deterministic delay performance calculation and optimization method for dual-mode communication networks based on peak age of information violation probability (PAVP). This offers a novel approach to ensuring timeliness in metering networks. First, addressing the coexistence of periodic and burst traffic in metering networks, a hybrid arrival model combining burst and periodic flows is established. Subsequently, a cross-layer service model is constructed considering dual-mode channel hybrid fading, superframe-based MAC layer hybrid access algorithms, and traffic prioritization. Building upon this foundation, stochastic network calculus (SNC) theory is employed to derive the theoretical upper bound for PAVP in dual-mode communication networks. This is achieved through moment generating functions (MGF) and max-plus algebra, enabling the calculation and analysis of delay bound performance for systems with random arrivals and random services. Considering queue stability and power constraints, a power optimization model for continuous packets is established.
The Lyapunov algorithm transforms the time-averaged optimization problem into a real-time optimization problem related to the current time slot queue and peak age of information (PAoI), thereby achieving dynamic optimization allocation of node and channel transmission power. Finally, system simulations analyze the impact of key parameters—including sampling period, mixed traffic intensity, device count, PAoI threshold, and MAC contention window—on delay bound performance. Results indicate an optimal sampling period exists under mixed traffic conditions to maximize delay bound performance. When dual-mode nodes exceed 20, the MAC layer access algorithm becomes the primary determinant of delay performance. Compared to single-mode communication and dual-mode fixed-parameter conditions, power optimization based on the Lyapunov algorithm further enhances the system’s delay boundary performance, improving data transmission real-time capability. These findings provide a theoretical foundation and technical reference for the engineering application of dual-mode communication in smart metering networks.  
      关键词:dual-mode communication;relay system;age of information;stochastic network calculus;Lyapunov optimal   
      更新时间:2026-04-24
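To make the peak age of information violation probability (PAVP) in the abstract above concrete, here is a minimal empirical sketch. It assumes a first-come-first-served link where every update is delivered in order, so the age peaks just before update i is delivered, at its delivery time minus the generation time of update i-1. This is a trace-based estimate, not the stochastic network calculus bound derived in the paper, and the function names are illustrative.

```python
from typing import List, Sequence


def peak_ages(generated: Sequence[float], delivered: Sequence[float]) -> List[float]:
    """Peak AoI values for in-order delivery: delivered[i] - generated[i - 1]."""
    if len(generated) != len(delivered):
        raise ValueError("need one delivery time per generated update")
    return [delivered[i] - generated[i - 1] for i in range(1, len(generated))]


def violation_probability(peaks: Sequence[float], threshold: float) -> float:
    """Fraction of peaks that exceed the PAoI threshold."""
    if not peaks:
        return 0.0
    return sum(1 for p in peaks if p > threshold) / len(peaks)


peaks = peak_ages(generated=[0, 2, 4], delivered=[1, 3, 6])
print(peaks, violation_probability(peaks, threshold=3.5))
```

A longer sampling period raises every peak through the generation gap, while a shorter one increases queueing delay, which is consistent with the abstract's remark that an optimal sampling period exists.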
    • ZHAO Nan-nan, YANG Fan, WANG Hao, ZHANG Jia-meng, WU Ruo-fei, ZHAO Tong-xuan, SHI Yu-lu, ZHANG Xiao, ZHAO Xiao-nan, HUANG Zhi-jie, HAN Shu-jie
      Vol. 53, Issue 12, Pages: 4408-4428(2025) DOI: 10.12263/DZXB.20250831
      摘要:The rapid development of cloud platforms and microservice architectures has made elastic scaling a critical mechanism for ensuring both performance and cost efficiency. Although prior studies have advanced workload forecasting and hybrid modeling, most approaches still focus on predicting resource utilization (e.g., CPU or memory) and then mapping forecasts to scaling actions through threshold rules or controller logic. This forecast-control decoupling amplifies prediction errors and fails to capture practical mechanisms such as hysteresis, cooldown, and discrete scaling steps, thereby limiting deployment feasibility. To overcome these limitations, we directly learn scaling behaviors, modeling replica count dynamics as autoscaler control actions. We propose a hybrid model, ARIMA-BiLSTM-MHA, that integrates ARIMA for long-term trend extraction, BiLSTM for residual sequence modeling, multi-head attention for capturing critical temporal dependencies, and residual correction for improving robustness against bursty and non-stationary workloads. We conduct extensive experiments on the real-world Alibaba cluster-trace-microservices-v2022 dataset, where we systematically compare our method with baselines including PETformer, SparseTSF, TFEGRU, GRU, Transformer, Seq2Seq-LSTM, Seq2Seq-GRU, Seq2Seq-Transformer, GRU-LSTM, CNN-LSTM and CNN-LSTM-GRU. Our results demonstrate that our approach consistently outperforms existing methods, achieving relative improvements of 1.57%~71.56% (MSE), 0.72%~46.67% (RMSE), 1.57%~59.10% (MAE), 1.97%~60.48% (MAPE), and 0.27%~15.70% (R²), with R² reaching up to 0.9543. Furthermore, we conduct container replica autoscaling experiments based on the DeathStarBench socialNetwork benchmark.
Compared with the CPU-threshold HPA strategy, the behavior learning-driven strategy reduces the average replica count by approximately 17% while lowering the average P99 latency by 2.11% and effectively suppressing tail-latency spikes during load transitions, thereby significantly mitigating resource over-provisioning. These results indicate that our model learns and forecasts scaling actions more accurately and stably, providing forward-looking decision support for autoscaling in practical cloud environments.
      关键词:cloud platforms;elastic scaling;scaling behavior learning;multi-head attention;residual correction;hybrid models;deep learning;time series forecasting
      更新时间:2026-04-24
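The CPU-threshold HPA baseline in the abstract above presumably refers to the Kubernetes Horizontal Pod Autoscaler; the abstract does not say so explicitly, so that is an assumption. Under that assumption, the sketch below follows the scaling rule documented for the HPA, desired = ceil(current * currentMetric / targetMetric), with a tolerance band around the target. The replica limits and the example utilizations are hypothetical.

```python
import math


def hpa_desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    tolerance: float = 0.1,   # no scaling while the ratio stays within 1 +/- tolerance
    min_replicas: int = 1,    # hypothetical lower limit
    max_replicas: int = 10,   # hypothetical upper limit
) -> int:
    """Threshold-style replica count derived from the ratio of observed to target utilization."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(hpa_desired_replicas(4, current_utilization=90.0, target_utilization=60.0))
print(hpa_desired_replicas(4, current_utilization=63.0, target_utilization=60.0))
```

A rule of this kind reacts only to the current metric value, which is the reactive behavior that a learned, forward-looking scaling policy aims to improve on.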
    • WU Guo-dong, HUANG Wen-jing, BAO Xian-li, LI Jing-xia, XIE Dong-chen
      Vol. 53, Issue 12, Pages: 4429-4443(2025) DOI: 10.12263/DZXB.20250654
      摘要:Current graph neural network-based recommendation studies rarely consider the repetitive behavior patterns of historical interactions and their frequency characteristics in the temporal dimension, making it difficult to capture the “shift” in node interactions over time. To address this, we propose a repetition and frequency-enhanced dynamic graph recommendation model (ReFDGRec) that integrates repetition-aware neighbor sampling with frequency domain analysis. ReFDGRec introduces a repetition-aware neighbor sampling strategy that not only considers the immediate neighborhood of individual nodes but also deeply explores node pairs with prior interactions, leveraging relevant historical information to enhance the understanding of node relationships. This approach enables more precise identification and extraction of high-frequency interaction patterns between users and items, capturing the dynamic evolution of user behavior and providing richer input features for the model. Additionally, considering the multi-resolution analysis capabilities of continuous wavelet transform (CWT), ReFDGRec effectively handles the non-stationarity and dynamic changes in user behavior across different time scales, capturing periodic trends and abrupt preference shifts. By transforming user behavior time series into the frequency domain with CWT, it simultaneously captures long-term trends and short-term fluctuations, effectively addressing the limitations of existing methods in handling non-stationary user behavior. This enhances the model’s performance in handling abrupt preference changes and dynamic interest shifts, delivering more accurate personalized recommendation services. To comprehensively evaluate the performance of the proposed model, systematic experiments are conducted on four public datasets, namely Wikipedia, UCI, MOOC, and MovieLens, covering both transductive and inductive dynamic recommendation scenarios.
In addition, three negative sampling strategies, including random, historical, and inductive sampling, are employed to ensure the rigor and fairness of the evaluation. Experimental results demonstrate that ReFDGRec consistently outperforms state-of-the-art baseline models such as DySAT, TGAT, TGN, GraphMixer, and RepeatMixer in terms of average precision metrics, achieving an average performance improvement of 2.3%~6.9%. Ablation studies further confirm the critical contributions of the node interaction frequency encoding scheme and the CWT-based enhancement module to the overall performance gains. Moreover, comparative analyses of time-frequency modeling methods indicate that the continuous wavelet transform is markedly more effective than the discrete Fourier transform and the short-time Fourier transform in modeling non-stationary behavioral sequences. By leveraging a theoretically guided repetition-aware mechanism and signal processing-driven frequency-domain enhancement techniques, this work provides a solution for dynamic graph recommendation that effectively captures interest evolution and behavioral drift, and has certain theoretical innovation and practical value.  
      关键词:dynamic graph;recommendation;repeat-aware;wavelet transform;frequency-enhanced   
      更新时间:2026-04-24
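The repetition signal described in the abstract above, how often a user-item pair has interacted before, can be sketched as a simple counting pass over a time-ordered interaction stream. This is only an illustration of the idea: the paper's actual frequency encoding and neighbor sampling are not specified in the abstract, and the event format and function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, str, float]  # (user, item, timestamp)


def prior_interaction_counts(events: List[Event]) -> List[int]:
    """For each event in time order, count earlier events with the same (user, item) pair."""
    seen: Dict[Tuple[str, str], int] = defaultdict(int)
    counts: List[int] = []
    for user, item, _ in sorted(events, key=lambda e: e[2]):
        counts.append(seen[(user, item)])
        seen[(user, item)] += 1
    return counts


events = [("u1", "i1", 1.0), ("u1", "i2", 2.0), ("u1", "i1", 3.0),
          ("u2", "i1", 4.0), ("u1", "i1", 5.0)]
print(prior_interaction_counts(events))
```

Counts of this kind could be fed to a model as an extra edge feature so that repeated pairs are distinguished from first-time interactions.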
    • ZHAN Hui-you, NI Hong-qiu, TAN Hai-sheng, WANG Tian-zhu, LI Xiang-yang
      Vol. 53, Issue 12, Pages: 4444-4459(2025) DOI: 10.12263/DZXB.20250461
      摘要:With the growing demand for deploying large language models (LLMs) in edge intelligence scenarios, retrieval-augmented generation (RAG) has emerged as a pivotal paradigm for edge-side deployment due to its ability to reduce dependency on large models while enhancing domain-specific knowledge coverage and privacy protection. However, RAG systems still face significant challenges on resource-constrained edge devices: large-scale, high-dimensional embedding vector indexes cannot be fully loaded into memory, and frequent cache evictions coupled with slow storage accesses lead to substantially increased retrieval latency. Currently, embedding vectors are primarily acquired through three approaches—disk loading, online generation, and in-memory caching—which differ significantly in terms of latency, computational overhead, and resource consumption, and lack a unified, efficient scheduling mechanism. To address these challenges, this paper proposes BP-Cache, an efficient online caching algorithm specifically tailored for resource-constrained edge RAG systems. The core innovation of BP-Cache lies in its multi-path access cost modeling and dynamic hierarchical cache management mechanism, designed to resolve performance conflicts at the edge through fine-grained resource scheduling. First, we uncover two key characteristics of embedding vector access patterns in edge environments through empirical analysis: (1) the cost of online vector generation exhibits a pronounced long-tail distribution, where the computational latency for large vector clusters far exceeds that of disk loading; and (2) access patterns show strong locality and sparsity, with over 60% of vectors accessed only once throughout their lifetime. 
Based on these observations, BP-Cache introduces a lightweight admission filtering mechanism that uses a small buffer cache as an “observation window” to temporarily hold newly arriving vectors, enabling rapid bypass of low-value access requests and effectively mitigating cache pollution. Simultaneously, the algorithm constructs a joint cost–size scoring model that integrates the online generation cost, disk I/O latency, and memory footprint of each vector cluster into a unified evaluation framework. When cache capacity is insufficient, BP-Cache dynamically prioritizes retaining vector clusters that deliver the highest utility per unit of memory, thereby approaching the performance of an optimal offline strategy—without requiring any knowledge of future access sequences. We implement and evaluate our system on a real-world edge platform based on the NVIDIA Jetson AGX Orin, conducting extensive experiments across multiple datasets from the BEIR benchmark suite. Results show that, compared to state-of-the-art edge RAG solutions such as EdgeRAG, BP-Cache reduces average retrieval latency by approximately 29% and improves cache hit rate by about 21% across multiple datasets, while significantly optimizing tail latency performance. Further sensitivity analyses confirm that the algorithm demonstrates excellent robustness and adaptability under varying cache capacities, small-buffer ratios, and vector cluster granularities.  
      关键词:edge computing;retrieval augmented generation;cache optimization;online algorithm;low-latency retrieval   
      更新时间:2026-04-24
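The cost-size scoring idea in the abstract above, retaining the vector clusters that deliver the most utility per unit of memory, can be sketched as follows. The scoring formula is an assumption made for illustration: it treats the cost of a miss as the cheaper of regenerating the cluster online or loading it from disk, divided by the memory footprint. The class, field, and function names are hypothetical and do not come from BP-Cache.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    name: str
    gen_cost_ms: float   # latency to regenerate the embeddings online
    disk_cost_ms: float  # latency to load the embeddings from disk
    size_mb: float       # memory footprint when cached


def utility_per_mb(c: Cluster) -> float:
    # Assumed miss cost: the cheaper of the two re-acquisition paths.
    return min(c.gen_cost_ms, c.disk_cost_ms) / c.size_mb


def choose_victims(cached: List[Cluster], needed_mb: float) -> List[Cluster]:
    """Evict the lowest-utility clusters until at least needed_mb has been freed."""
    victims: List[Cluster] = []
    freed = 0.0
    for c in sorted(cached, key=utility_per_mb):
        if freed >= needed_mb:
            break
        victims.append(c)
        freed += c.size_mb
    return victims


cache = [Cluster("a", 50.0, 20.0, 10.0), Cluster("b", 5.0, 30.0, 10.0), Cluster("c", 100.0, 80.0, 4.0)]
print([c.name for c in choose_victims(cache, needed_mb=15.0)])
```

Small clusters that are expensive to reacquire score highest and stay cached longest, which matches the intuition behind a per-unit-memory utility.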
    • CHANG Wen-wen, WANG Ya-jun, GUO Jin-cheng, SHU Kang, MA Yu
      Vol. 53, Issue 12, Pages: 4460-4473(2025) DOI: 10.12263/DZXB.20250911
      摘要:In this paper, we propose a classification and recognition model based on the multi-head cross-attention mechanism (MHCA) fusing EEG multi-domain features for the detection and recognition of epileptic electroencephalographic (EEG) signals in preictal, ictal and interictal states. The model is developed by sequentially splicing the two-dimensional images generated by continuous wavelet transform (CWT) of epileptic EEG signals according to the channels, and utilizing shallow convolutional neural network (CNN) to extract features from the spliced time-frequency images in order to effectively extract the time-frequency domain features of epileptic EEG signals. At the same time, the brain functional connectivity matrix is constructed to depict the functional connectivity between different brain regions to capture the potential spatial features during epileptic seizures, and finally, MHCA is used to realize the global interaction and adaptive fusion between time-frequency and spatial features to fully model the correlation and complementarity between multidimensional features, so as to construct a complete and unified feature characterization of epileptic EEG signals in the three dimensions of the time domain, the frequency domain, and the spatial domain. The experimental results show that the model can reach a maximum classification accuracy of 92.49% and a sensitivity of 92.48% in multi-subject classification of preictal, ictal and interictal phases, which reflects its good generalization ability and stability in cross-subject scenarios; and the model can reach a maximum accuracy of 98.39% and a sensitivity of 98.04% in single-subject classification, which fully verifies the efficacy of the method in the task of individualized epilepsy EEG recognition. 
The ablation experiments further confirm the critical role of the spatial information represented by the brain functional connectivity matrix and of the multi-head cross-attention mechanism in multi-domain feature fusion and discriminative feature enhancement, both of which contribute to the model's performance. This paper validates the efficacy of the proposed method for epileptic EEG classification and recognition, which not only provides a reliable and feasible technical means for clinical EEG detection and recognition, but also provides new research ideas and methodological references for the extraction, characterization and modeling of key features in epileptic EEG signals.
      关键词:epilepsy detection;EEG signals;multi-domain features;continuous wavelet transform;functional connectivity matrix;multi-head cross-attention mechanism   
      更新时间:2026-04-24
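A brain functional connectivity matrix of the kind mentioned in the abstract above can be illustrated with a small sketch. The abstract does not state which connectivity measure is used, so Pearson correlation between channel signals is an assumption here, chosen because it is one common option. The function names and toy signals are illustrative.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long signals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # A constant signal has no defined correlation; report 0 in that case.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def connectivity_matrix(channels: List[Sequence[float]]) -> List[List[float]]:
    """Symmetric channel-by-channel matrix of pairwise correlations."""
    return [[pearson(a, b) for b in channels] for a in channels]


eeg = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]  # three toy channels
for row in connectivity_matrix(eeg):
    print([round(v, 3) for v in row])
```

The resulting matrix has one row and column per electrode and can be treated as a spatial feature map alongside the time-frequency images.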
    • LLM-Enhanced Maritime UAV Routing Algorithm Against Gray-Hole Attacks

      LI Jie-ling, XIAO Liang, WANG Peng-cheng, LEI Yan, CHEN Qiao-xin, WANG Cheng-yao
      Vol. 53, Issue 12, Pages: 4474-4484(2025) DOI: 10.12263/DZXB.20250878
      摘要:Unmanned aerial vehicle (UAV) routing enables the efficient transmission of multimodal data such as images, audio and location to the shipborne destination node equipped with a large language model (LLM) to support inference tasks including target search, which are applicable to maritime applications such as environmental monitoring and search and rescue. However, UAV network topologies change rapidly under harsh maritime channel conditions, resulting in significant degradation of routing stability. Meanwhile, gray-hole attacks selectively discard packets, leading to substantial increases in packet loss rate and transmission latency, and even causing inference failures. To address these challenges, this paper proposes an LLM-enhanced maritime UAV routing algorithm against gray-hole attacks that exploits the environment features inferred by the LLM and the number of packets successfully forwarded by neighboring UAVs to construct a routing trust framework and applies reinforcement learning to jointly optimize the next-hop UAV and the transmit power. The routing policy distribution function is formulated based on the quality-of-service requirements and the trust levels of the neighboring UAVs, enabling rapid self-healing in response to dynamic network topologies and channel variations. To address feedback loss caused by sparse node distribution and rapidly varying channels in maritime environments, a feedback recovery mechanism is incorporated into routing experience replay to enhance routing stability. We develop a maritime UAV routing system, with the shipborne node as the destination, hosting a 7-billion-parameter LLaVA-1.5 model. Taking the captured images and one-hop neighbor information such as location as input, this model infers environment features and feeds the results back to UAVs to enhance the routing policy optimization.
Based on measured channel data from the Oucuo sea area in Xiamen, a simulation scenario is constructed with 30 UAVs under gray-hole attacks with different packet loss probabilities. The results show that the proposed algorithm improves the packet delivery ratio by 72.8%, reduces end-to-end latency by 75.1% and energy consumption by 64.7%, and effectively supports LLM-driven maritime applications.
      关键词:large language model;maritime unmanned aerial vehicle routing;gray-hole attacks;reinforcement learning;multimodal data   
      更新时间:2026-04-24
    • MA Jun, CHEN Xin-ran, XIANG Kai-ran, CHEN Fu-chang, YAN Jun-jie
      Vol. 53, Issue 12, Pages: 4485-4493(2025) DOI: 10.12263/DZXB.20251023
      摘要:This paper proposes a high-selectivity frequency selective surface (FSS) based on a low-profile stacked patch structure. To address a key challenge in FSS synthesis design, namely that complex structures make it difficult to accurately control resonant modes and coupling mechanisms and thereby limit the flexible control of transmission zeros, equivalent circuit models based on mixed electric and magnetic (EM) coupling theory are proposed and applied to the design of FSSs. First, a weakly coupled dual-patch FSS is designed and analyzed to clarify its operating mechanism. It is demonstrated that mixed EM coupling between the two patch layers can introduce a controllable transmission zero (TZ). By etching one or two narrow slots on the patches, the EM coupling strength can be effectively adjusted, allowing precise control over the TZ position. An equivalent circuit model is developed and validated to accurately predict the coupling behavior. Based on the working mechanism, two third-order FSSs with a single TZ and a fourth-order FSS with TZs on both sides of the passband are further designed, significantly enhancing frequency selectivity. A prototype operating at 4.72 GHz is fabricated and measured, and the measured results are in good agreement with the simulated results. The proposed FSS exhibits excellent out-of-band suppression, good angular stability, and a low profile, making it a suitable candidate for advanced spatial filtering and electromagnetic shielding applications.
      关键词:frequency selective surface;mixed electric and magnetic coupling;transmission zero;stacked patch;equivalent circuit;high selectivity   
      更新时间:2026-04-24
    • LIANG Jia-wei, LIANG Si-yuan, CHEN Ruo-yu, LIU Kuan-rong, HUANG Jian-jie, CAO Xiao-chun
      Vol. 53, Issue 12, Pages: 4494-4506(2025) DOI: 10.12263/DZXB.20250673
      摘要:Incremental object detection (IOD) aims to enable models to continuously learn the recognition and localization of new categories from streaming data, while effectively maintaining detection performance on previously learned old classes. However, current mainstream object detectors often suffer from catastrophic forgetting during incremental training: their performance on old classes degrades significantly when fine-tuned only with labeled data from new classes. Existing methods mostly rely on knowledge distillation or exemplar replay strategies to mitigate forgetting, but generally overlook two critical challenges: first, label assignment conflicts in region proposal generation, and second, the overfitting risk induced by hard-label supervision on limited old samples. This paper points out that existing methods adopt inconsistent label assignment strategies in the proposal generation stage: new category and background proposals are matched based on the intersection over union (IoU) with ground truth, whereas old category proposals rely on inferences from the old model. When these two types of proposals overlap spatially, the same candidate region may be assigned contradictory labels, leading to conflicting supervision signals for classification and regression tasks and interfering with effective training. Furthermore, even with a few replayed old samples, applying hard-label supervision makes the model prone to overfitting on small subsets, making it difficult to reproduce the generalization ability gained from the original large-scale datasets, which in turn weakens old knowledge preservation. To address these issues, we propose a decoupled learning framework for incremental object detection. First, a hierarchically decoupled region proposal assignment mechanism is designed to perform mutually exclusive screening of overlapping regions according to a priority order of “new categories → old categories → background”, eliminating label conflicts.
Subsequently, a dual-path decoupled supervision strategy is introduced: new categories and background regions are trained with ground-truth annotations (using an unbiased background definition), while all old category regions, regardless of whether they are explicitly labeled in replayed samples, are supervised solely through knowledge distillation to align their prediction distributions with the old model’s outputs. This avoids local overfitting induced by hard labels and ensures supervision consistency and learning stability throughout the training process. Experiments on Pascal VOC and MS COCO benchmarks demonstrate that the proposed method outperforms state-of-the-art (SOTA) methods in both single-step and multi-step incremental settings. Notably, in multi-step scenarios, our method improves the mean average precision (mAP) by over 2.0% and 2.9% respectively, validating its superiority in synergistically preserving old knowledge and learning new tasks. This work not only enhances the continual learning capability of IOD but also reveals the critical role of the collaborative design of proposal generation and supervision strategies in mitigating catastrophic forgetting.
      Keywords: incremental learning; object detection; knowledge distillation; exemplar replay; catastrophic forgetting
      Updated: 2026-04-24
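The priority rule described in the abstract above ("new categories → old categories → background") can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `iou` and `assign_labels`, the box format, and the 0.5 IoU threshold are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); hypothetical box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(proposals: List[Box], new_gt: List[Box],
                  old_pred: List[Box], thr: float = 0.5) -> List[str]:
    """Give each proposal exactly one label, checked in priority order.

    new_gt:   ground-truth boxes of new categories
    old_pred: boxes predicted by the old model for old categories
    thr:      IoU matching threshold (assumed value, not from the paper)
    """
    labels = []
    for p in proposals:
        if any(iou(p, g) >= thr for g in new_gt):
            labels.append("new")         # highest priority
        elif any(iou(p, o) >= thr for o in old_pred):
            labels.append("old")         # only if no new-category match
        else:
            labels.append("background")  # lowest priority
    return labels


proposals = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
new_gt = [(0, 0, 10, 10)]
old_pred = [(1, 1, 10, 10), (20, 20, 30, 30)]
labels = assign_labels(proposals, new_gt, old_pred)
```

Because the checks are mutually exclusive, the first proposal, which overlaps both a new-category box and an old-model prediction, receives a single label rather than two contradictory ones.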
    • WANG Fang-fang, LIU Ming-hua, QU Lian-en, WANG He, LI Dan-ning
      Vol. 53, Issue 12, Pages: 4507-4517(2025) DOI: 10.12263/DZXB.20250855
      Abstract: Pedestrian trajectory prediction is one of the core challenges in fields such as autonomous driving and robotic navigation. Its key difficulty lies in effectively modeling complex interactions among pedestrians and extracting multi-scale spatiotemporal features. This paper proposes a pedestrian trajectory prediction method based on graph convolution and an adaptive Transformer (GCAT), which achieves high-precision trajectory prediction through hierarchical feature extraction and adaptive interaction modeling. The model takes the position and velocity information of all pedestrians within a historical observation window as input. First, linear projection and sinusoidal positional encoding are applied to map the raw observations into a high-dimensional feature space, explicitly preserving temporal order information. Subsequently, a relational graph convolutional network is introduced to capture local topological structures and spatial interaction strengths among pedestrians. An adaptive adjacency matrix based on feature cosine similarity is constructed in real time to model pedestrian interactions, enabling the graph structure to adjust dynamically according to scene characteristics. In addition, an enhanced multi-layer convolutional structure is employed, where learnable residual weights adaptively balance the contributions of features at different layers. This design alleviates the vanishing-gradient problem in deep networks and strengthens the representation of local interaction features. Furthermore, the model incorporates a spatially adaptive Transformer to model global spatiotemporal dependencies. This module achieves continuous sampling over feature maps through learnable spatial offsets. Specifically, spatial offsets and attention weights are generated from the input features via linear layers. The offsets are added to reference point coordinates and normalized to obtain the actual sampling locations. Bilinear interpolation is then used to extract feature values at these locations from the feature maps, which are subsequently aggregated using the attention weights. This process yields enhanced representations that capture both local geometric variations and global temporal dependencies. The continuous sampling strategy enables the model to focus on the spatial regions most relevant to trajectory prediction and to adaptively handle geometric layout variations across different scenes. Meanwhile, the model further integrates multi-granularity temporal features, progressively extracting multi-level spatiotemporal representations ranging from local interactions to global dependencies. This design addresses key limitations of existing methods in modeling long-range dependencies, environmental adaptability, and multi-scale feature fusion. For experimental validation, the proposed method is systematically evaluated on two widely used public pedestrian trajectory prediction datasets, ETH and UCY. Compared with existing baseline models, the proposed approach achieves improvements of 5.1% and 13.2% in average displacement error (ADE) and final displacement error (FDE), respectively, demonstrating its effectiveness in complex interaction modeling and multi-scale spatiotemporal feature extraction.
      Keywords: trajectory prediction; local topology structure; global temporal dependencies; multi-scale feature fusion; prediction performance
      Updated: 2026-04-24
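The adaptive adjacency matrix mentioned in the abstract above is built from the cosine similarity of pedestrian features. A minimal sketch of that idea follows; the function name `cosine_adjacency`, the clipping of negative similarities to zero, and the row normalization are assumptions for illustration and are not stated in the abstract.

```python
import math
from typing import List


def cosine_adjacency(features: List[List[float]], eps: float = 1e-8) -> List[List[float]]:
    """Build a row-normalized adjacency matrix from pairwise cosine similarity.

    features: one feature vector per pedestrian (all of equal length).
    Negative similarities are clipped to zero and rows are normalized to
    sum to one; both steps are illustrative assumptions.
    """
    n = len(features)
    norms = [math.sqrt(sum(x * x for x in f)) for f in features]
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            sim = dot / (norms[i] * norms[j] + eps)
            adj[i][j] = max(sim, 0.0)
    for i in range(n):
        row_sum = sum(adj[i])
        if row_sum > 0:
            adj[i] = [w / row_sum for w in adj[i]]
    return adj


# Two pedestrians with identical features and one with an orthogonal feature.
adj = cosine_adjacency([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

Since the matrix is recomputed from the current features, the graph changes whenever the features do, which is the sense in which such an adjacency is "adaptive".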
    • HE Guang-peng, DI Zhi-xiong, DENG Yu-jiao, CHEN Xuan, ZHANG Ze-tao, LIU Yang
      Vol. 53, Issue 12, Pages: 4518-4526(2025) DOI: 10.12263/DZXB.20250798
      Abstract: In the design flow of very-large-scale integration (VLSI), the logic synthesis stage serves as a bridge between architectural design and physical implementation. However, there is often a significant discrepancy between the initial timing bottleneck paths analyzed by logic synthesis tools and the actual timing bottlenecks after placement and routing (P&R). This “timing inconsistency” mainly originates from two sources: first, in the synthesis stage, due to the lack of physical layout information, electronic design automation (EDA) tools usually employ wire load models to estimate interconnect delays, which makes it difficult to capture complex parasitic effects under deep sub-micron processes; second, modern chips contain a wide variety of logic gates and highly complex topological connections, which significantly increases the computational difficulty of accurate static timing analysis (STA) during logic synthesis. To address these challenges, this paper proposes BottleneckNet, a timing bottleneck prediction model for large-scale digital chips. First, in terms of feature engineering, to address the issue that large-scale design netlists are too large for deep learning models to train on, a netlist feature extraction method based on the concept of the register sub-graph (RSG) is proposed. This method performs structural pruning on the netlist topology, retaining only the critical combinational logic information between registers. The feature extraction process can be completed with a time complexity of O(n), thereby meeting the processing requirements for industrial large-scale netlists. In terms of model architecture, a dual-channel feature propagation model based on a graph neural network (GNN) is designed. This model simultaneously captures global features and local logic network topological information of the circuit. By fusing dual-channel features, BottleneckNet achieves accurate perception of post-P&R timing bottlenecks. 
A complete test dataset was constructed based on multiple sets of open-source large-scale designs. Simulation results show that the proposed BottleneckNet model demonstrates excellent overall performance. In terms of processing efficiency, the method can complete feature extraction and inference for million-gate designs within minutes. In terms of prediction accuracy, for the worst 5% to 20% of timing bottleneck paths, the prediction accuracy of the proposed method is not only significantly better than that of the light gradient boosting machine (LightGBM) model, but also much higher than the results given during the synthesis stage by Synopsys Design Compiler (DC), a mainstream industrial logic synthesis tool. The research results have theoretical significance and engineering value for guiding timing optimization after the logic synthesis stage and shortening the chip R&D cycle.  
      Keywords: logic synthesis; placement and routing; static timing analysis; register sub-graph; graph neural networks; dual-channel features; timing bottleneck
      Updated: 2026-04-24
    • ZHAO Dong-xing, LIU Hui, HUANG Ke-ju, YANG Jun-an
      Vol. 53, Issue 12, Pages: 4527-4540(2025) DOI: 10.12263/DZXB.20250843
      Abstract: Specific emitter identification (SEI) exploits subtle hardware discrepancies caused by manufacturing imperfections and device aging to perform transmitter identification and attribution at the physical layer. Compared with traditional authentication schemes that rely on protocols and cryptographic keys, SEI requires no modification to the protocol stack, is transparent to transmitted data, and incurs low deployment cost, making it valuable for applications such as spectrum regulation, wireless security, cognitive radio, and sensing in complex electromagnetic environments. However, in real-world wireless scenarios, time-varying and scene-dependent channel conditions introduce unstable modulation and distortion to radio-frequency fingerprints. Effects such as multipath fading, carrier frequency offset, and phase noise drift over time, causing the signals emitted by the same device to exhibit significant temporal variation. As a result, identification performance degrades markedly in the target domain, posing a major obstacle to practical deployment. To mitigate domain distribution shifts, existing studies mainly investigate transfer learning and domain adaptation approaches. Transfer learning relies on fine-tuning with labeled target-domain data and can improve target-domain performance, but it often disrupts previously learned source-domain knowledge and leads to catastrophic forgetting. Unsupervised domain adaptation reduces distribution discrepancies through feature alignment, pseudo labeling, and entropy minimization; however, due to the absence of explicit supervision, performance improvements are limited, and such methods struggle to handle continuously arriving data in online scenarios. Incremental learning emphasizes balancing adaptation to new data with the preservation of prior knowledge, yet most existing approaches still require labeled data or additional storage, making them difficult to apply directly to unlabeled cross-time SEI tasks. 
The advancement of generative modeling provides a new opportunity to address these challenges. Diffusion models characterize complex data distributions through forward noise injection and reverse denoising processes, and are well suited for modeling the superposition of channel perturbations and device-intrinsic features, enabling the recovery of radio-frequency fingerprints from distorted observations. Nevertheless, existing studies predominantly focus on denoising or data generation, and have not fully addressed cross-time identification and continual learning requirements. To this end, this paper proposes a diffusion-model-driven cross-time incremental SEI method. In the source domain, forward diffusion is employed to explicitly model channel perturbations, while in the target domain, reverse diffusion progressively restores discriminative representations that approximate the source-domain distribution, thereby suppressing feature drift. A cross-attention mechanism is incorporated into the diffusion network to inject emitter identity information during denoising, enhancing inter-class separability. Furthermore, an unsupervised incremental learning strategy is introduced, which achieves continual adaptation using only unlabeled target-domain samples through distribution consistency and knowledge-preservation regularization, effectively mitigating catastrophic forgetting. Cross-time identification experiments on the WiSig dataset demonstrate that the proposed method improves target-domain identification accuracy by more than 5 percentage points compared with representative domain adaptation methods, and enhances source-domain performance retention by approximately 10 percentage points relative to mainstream incremental learning strategies, validating its channel restoration capability, feature alignment effectiveness, and robustness under dynamic channel conditions.  
      Keywords: specific emitter identification; cross-time domain; diffusion model; incremental learning; signal processing; cross-attention mechanism
      Updated: 2026-04-24
    • XIE Yu-nong, ZHANG Zhi-yong
      Vol. 53, Issue 12, Pages: 4541-4559(2025) DOI: 10.12263/DZXB.20250771
      Abstract: Oxide semiconductors (OS), particularly amorphous oxide semiconductors (AOS), have emerged as important candidates for overcoming the physical scaling limits of silicon-based devices, owing to their moderate carrier mobility, extremely low off-state current, excellent large-area uniformity, and low-temperature fabrication processes compatible with conventional complementary metal-oxide-semiconductor (CMOS) technology. In recent years, AOS have not only achieved large-scale commercial application in high-end liquid crystal display (LCD) and organic light-emitting diode (OLED) display backplanes, but have also demonstrated great potential in low-power logic devices, high-density memory, and advanced integration architectures such as monolithic three-dimensional integrated circuits (M3D). In particular, under the stringent low thermal budget (below 400 °C) required for M3D fabrication, oxide semiconductors exhibit significant advantages in the joint optimization of power, performance, area, and cost (PPAC). As device dimensions continue to scale down, maintaining effective electrostatic control over channel carriers, suppressing short-channel effects, and ensuring long-term device reliability have become critical challenges limiting the further development of oxide semiconductor thin-film transistors (TFTs). Among various design strategies, gate engineering plays a pivotal role in determining transistor electrical characteristics, directly affecting key performance metrics such as threshold voltage, subthreshold swing, leakage current, and bias stability. This paper presents a systematic review of gate engineering in oxide semiconductor TFTs, with a focus on recent advances and technological trends in gate dielectric materials, gate structure design, and gate-channel interface engineering. 
At the gate dielectric level, the introduction of high-permittivity (high-κ) materials and their composite structures enables enhanced gate controllability, reduced operating voltage, and effective suppression of gate leakage current by scaling down the equivalent oxide thickness. At the gate structure level, three-dimensional non-planar architectures, including FinFETs, nanowire transistors, and gate-all-around (GAA) structures, significantly improve gate-to-channel coupling, thereby alleviating short-channel effects and enhancing device performance at aggressive scaling limits. At the interface engineering level, strategies such as interface passivation, band alignment optimization, and defect state modulation reduce interface trap density, improve carrier transport properties, and markedly enhance device stability and reliability. Despite the substantial progress achieved in gate engineering of oxide semiconductor devices, several critical challenges remain, including the complexity of reliability degradation mechanisms, the applicability of existing interface optimization strategies to short-channel devices, and the lack of high-performance p-type oxide semiconductor materials that are both compatible with back-end-of-line (BEOL) processes and performance-matched to n-type oxide semiconductors. These limitations hinder the development of complementary circuits and high-density integrated systems. Overall, oxide semiconductors are widely recognized as a key technological pathway in the post-Moore era, and with continued breakthroughs in gate-related materials, device structures, and interface control technologies, they are expected to play an increasingly important role in future high-performance, low-power electronic devices and three-dimensional integrated systems.  
      Keywords: oxide semiconductor; thin-film transistor; gate engineering; gate dielectric; gate structure; interface engineering; short-channel effects
      Updated: 2026-04-24
    • LIU Shuai, CHEN Da, PAN Yi-heng, LI Qian, LIN Chen-hao, SHEN Chao
      Vol. 53, Issue 12, Pages: 4560-4574(2025) DOI: 10.12263/DZXB.20250502
      Abstract: Reinforcement learning from human feedback (RLHF) can effectively align model outputs with human preferences and has been widely used to mitigate the hallucination problem of multimodal large language models (MLLMs) in practical applications. Among various RLHF approaches, direct preference optimization (DPO) avoids explicit reward modeling, enabling more stable and efficient improvement of MLLMs’ reliability and usability. As a result, DPO has attracted extensive attention from both academia and industry. However, the DPO training process still faces several challenges: issues such as training data distribution shift and insufficient distinction of the factuality of instructions during preference data construction may exacerbate model hallucinations. Additionally, existing methods underutilize the audio information accompanying multi-image data (e.g., videos). As an effective supplementary signal for visual understanding, audio has the potential to alleviate hallucinations. To address these problems, this paper proposes an instruction factuality assessment and audio-aided self-alignment training framework (IFAA). This framework generates high-quality preference data through four core modules to suppress hallucinations in MLLMs: (1) style-consistent response sampling, which reduces data distribution shift in DPO training; (2) a long-response segmentation strategy, which improves the accuracy of the model’s self-judgment; (3) an instruction factuality assessment module, which constructs preference data with a stronger factual basis; and (4) an audio-aided understanding module, which enhances the quality of preference data by fusing audio information. Finally, DPO training is conducted to further improve the model’s reliability. 
In addition, this paper introduces a confidence balance point selection mechanism based on the receiver operating characteristic (ROC) curve to mitigate the overconfidence issue of MLLMs. To verify the effectiveness and generalization ability of the proposed framework, experiments are conducted on five mainstream MLLM evaluation benchmarks. Taking the large language and vision assistant (LLaVA) 1.5 model as an example, after optimization with the IFAA framework, its sentence-level hallucination rate on the object hallucination benchmark (Object HalBench) dataset decreases by 43.1%, and its instance-level hallucination rate drops by 37.3%. Furthermore, transfer experiments on other cutting-edge models demonstrate that the preference data constructed with IFAA generalizes well, significantly reducing the hallucination rates of different models. These results confirm the applicability of the proposed framework across various models and provide a new, effective approach for hallucination mitigation in MLLMs.  
      Keywords: multimodal large language models; hallucination mitigation; preference learning; self-alignment; instruction factuality; audio assistance
      Updated: 2026-04-24
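The abstract above notes that DPO avoids explicit reward modeling. A minimal sketch of the standard per-pair DPO objective makes this concrete: the loss depends only on log-probabilities of the preferred and rejected responses under the policy and a frozen reference model. The function name `dpo_loss`, the argument names, and the default `beta = 0.1` are illustrative assumptions; this is the generic DPO loss, not the IFAA training code.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    margin = (log-ratio of policy to reference on the chosen response)
           - (log-ratio of policy to reference on the rejected response).
    beta is a temperature; 0.1 is an assumed default.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) equals log(1 + exp(-z)); branch for numerical stability.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# With no preference margin the loss equals log(2); it falls as the policy
# assigns relatively more probability to the chosen response.
neutral = dpo_loss(-1.0, -1.0, -1.0, -1.0)
improved = dpo_loss(-0.5, -2.0, -1.0, -1.0)
```

No reward model appears anywhere in the computation, which is why the quality of the preference pairs themselves matters so much for the outcome.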
    • DING Nan, FANG Xi-qi, HAO Yun-tao, HU Chuang-ye, XU Li
      Vol. 53, Issue 12, Pages: 4575-4591(2025) DOI: 10.12263/DZXB.20250674
      Abstract: Partition-based edge-end collaborative inference for deep neural networks (DNNs), which splits a model and deploys the parts on mobile robot terminals and edge servers respectively, can effectively alleviate resource constraints on terminal devices and avoid the loss of inference accuracy caused by existing model lightweighting techniques. However, this technology also poses new challenges for communication scheduling in Robot Operating System 2 (ROS 2): existing communication strategies struggle to ensure the effective transmission of critical collaborative inference data flows while simultaneously accommodating the transmission needs of other application data flows. To address this problem, this study proposes a hybrid data flow dynamic scheduling algorithm for mobile robot DNN edge-end collaborative inference in ROS 2 (DRECHS). First, based on an analysis of the mechanism of edge-end collaborative inference, we define the maximum allowable transmission time boundaries for DNN intermediate data to provide a theoretical basis for transmission optimization. Combining these boundary conditions, we design a scheduling model based on hybrid switching system theory, modeling the flow scheduling process as a dynamic switching model containing a priority-first subsystem and a time-first subsystem. On this basis, the specific hybrid data flow scheduling algorithm is proposed. Integrated into the data distribution service (DDS) flow controller of ROS 2, this algorithm dynamically generates output queues based on calculated queue status metrics, realizing fine-grained control of the underlying data transmission order. Thus, while meeting the transmission requirements of inference tasks, it achieves differentiated quality of service (QoS) optimization for data flows with different priorities, effectively balancing overall system transmission performance. 
For the adopted dynamic partitioning method, we design simulation experiments under different bandwidth conditions and compare the proposed algorithm with the system's built-in scheduling algorithms and other scheduling algorithms in terms of transmission delay and packet loss rate. Experimental results show that the proposed scheduling algorithm, through the hybrid switching system model and dynamic scheduling strategy, achieves differentiated quality of service optimization for data flows with different priorities while meeting high-priority data transmission requirements. Furthermore, this study proposes a corresponding deployment scheme and deploys the scheduling algorithm and the DNN edge-end collaborative inference framework on real devices, completing system verification. This deployment scheme provides a reference for deploying the proposed algorithm and framework in real-world scenarios.  
      Keywords: deep neural networks; edge-end collaboration; robot operating system 2; mixed data flow; hybrid switching systems
      Updated: 2026-04-24
    • Motif-Based Structural Robustness of Lower- and Higher-Order Networks

      XING Zhi-yao, XIANG Lin-ying
      Vol. 53, Issue 12, Pages: 4592-4606(2025) DOI: 10.12263/DZXB.20250683
      Abstract: Real-world multi-agent networked systems are commonly embedded in dynamic environments characterized by the interplay between attacks and defenses. Attackers aim to degrade system functionality by disrupting critical nodes or interaction structures, whereas defenders adopt corresponding repair and reconfiguration strategies to preserve overall system performance. The alternating actions of these two sides give rise to complex and highly nonlinear adversarial evolutionary processes. Such multi-agent systems are typically abstracted as complex networks, where nodes represent individual agents and edges describe their interaction relationships. Traditional graph-based modeling approaches exhibit clear advantages in characterizing pairwise interactions between nodes and have been widely employed in studies of network robustness and attack-defense games. However, these approaches encounter inherent limitations when attempting to capture the ubiquitous multi-agent coordination, group interactions, and higher-order coupling behaviors present in real-world systems, and thus fail to fully reflect the complexity of collective cooperation mechanisms. In recent years, with the rapid advancement of complex systems research, higher-order network modeling approaches have attracted increasing attention and have been incorporated into the analytical framework of multi-agent systems. Compared with conventional lower-order networks, higher-order networks provide richer structural representations for investigating the formation mechanisms and evolutionary dynamics of complex cooperative behaviors in multi-agent systems. In this context, this paper starts from lower-order network structures and introduces a higher-order network modeling framework to systematically investigate the structural robustness evolution of multi-agent networks under attack-defense games. 
Specifically, multiple representative attack and defense strategies are constructed to reflect realistic adversarial scenarios, and their combined effects on the structural robustness of higher-order networks are systematically analyzed. Particular emphasis is placed on examining the differences in robustness evolution between higher-order networks and their corresponding lower-order counterparts under various attack modes and defense mechanisms, as well as on elucidating the role of higher-order structures in enhancing or weakening system resilience against attacks. More concretely, this study focuses on motif-based structures and conducts an in-depth analysis of how different types of motifs in higher-order networks influence overall system robustness. Furthermore, the moderating effects of lower-order structural parameters, such as the average degree of the underlying network, on higher-order network robustness are investigated. A combination of numerical simulations and theoretical analysis is employed. Four representative lower-order network models are selected to generate their corresponding higher-order network structures. On this basis, four typical attack strategies are introduced to simulate agent node failures, enabling a systematic characterization of the dynamic structural evolution of networks during attack-defense interactions. By computing the relative size of the largest connected component, the robustness variations of networks under different attack-defense strategies are quantitatively evaluated. The results demonstrate that, compared with traditional lower-order networks, higher-order networks exhibit distinctly different robustness response characteristics when subjected to attacks. System robustness depends not only on pairwise connections between nodes but is also significantly influenced by the distribution of higher-order motif structures and lower-order structural parameters such as the average degree. 
Appropriate motif organization and suitable lower-order structural configurations can, to some extent, enhance system resistance to attacks. The findings of this study provide a novel theoretical perspective for understanding the robustness formation mechanisms of multi-agent systems with complex cooperative behaviors and offer valuable insights for the design of higher-order network structures and the optimization of attack-defense strategies.  
      Keywords: higher-order network; structural robustness; motif; multi-agent system
      Updated: 2026-04-24
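The abstract above evaluates robustness by the relative size of the largest connected component after node failures. A minimal sketch of that metric on an ordinary (lower-order) graph follows. The function name `largest_component_fraction`, the adjacency-dictionary format, normalization by the original node count, and the example of removing the highest-degree node are assumptions for illustration; the paper's four attack strategies are not specified in the abstract.

```python
from collections import deque
from typing import Dict, Iterable, Set


def largest_component_fraction(adj: Dict[int, Set[int]],
                               removed: Iterable[int] = ()) -> float:
    """Size of the largest connected component among surviving nodes,
    divided by the original number of nodes (assumed normalization)."""
    total = len(adj)
    if total == 0:
        return 0.0
    gone = set(removed)
    seen: Set[int] = set()
    best = 0
    for start in adj:
        if start in gone or start in seen:
            continue
        # Breadth-first search over surviving nodes only.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in gone and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / total


# Small example: node 0 is a hub connected to 1, 2, 3; node 3 also links to 4.
graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
intact = largest_component_fraction(graph)
hub_attacked = largest_component_fraction(graph, removed=[0])   # highest-degree node
leaf_attacked = largest_component_fraction(graph, removed=[4])  # a peripheral node
```

Plotting this fraction against the share of removed nodes gives the usual robustness curve; removing the hub fragments the example graph far more than removing a leaf.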
    • GENG Yuan, PAN Jin, WANG Ling-xiao, LIU Si-hao, LIU Yan-hui, YANG De-qiang
      Vol. 53, Issue 12, Pages: 4607-4613(2025) DOI: 10.12263/DZXB.20251010
      Abstract: This paper proposes a low-profile, ultra-wideband, low-cross-polarization tightly coupled dipole array (TCDA). Traditional TCDAs typically expand bandwidth at the expense of array profile height. To extend the low-frequency bandwidth of the TCDA without increasing profile height, a current-loop radiation mode equivalent to a magnetic dipole is introduced at low frequencies. Simulation analysis reveals that the Double-Y configuration does not achieve optimal balanced transformation across all frequency bands. Non-balanced feeding introduces net vertical currents within the feed network. By leveraging the Double-Y balun’s inherent non-balanced feeding characteristics at low frequencies, the net vertical currents drive a current loop formed by the dipole, Double-Y balun, shorted probe, and ground plane. This current loop, equivalent to a magnetic dipole, enables effective radiation. Consequently, the design enhances the array’s low-frequency bandwidth while simultaneously reducing the array’s profile. Traditional wide-angle impedance matching layers in TCDAs serve a single purpose, typically being employed solely to enhance array impedance matching and scanning capability. This paper instead proposes a multifunctional wide-angle impedance matching layer. By integrating a metasurface with a polarization grid, the layer not only retains its original functions of improving array impedance matching and scanning capability but also enhances the array’s cross-polarization performance. Ultimately, the proposed array achieves a 5:1 bandwidth (0.8 to 4 GHz) with a voltage standing wave ratio (VSWR) below 3, alongside scanning capabilities of 45° in the E- and H-planes. The array has a profile of 0.089 λlow, where λlow is the wavelength at the lowest operating frequency, and exhibits cross-polarization levels below -54 dB and -30 dB for 45° scanning in the E- and H-planes, respectively. 
To validate this design, a 10 × 10 tightly coupled array was fabricated.  
      Keywords: low-profile; ultra-wideband; magnetic dipole; metasurface; low cross-polarization; tightly coupled dipole array
      Updated: 2026-04-24
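The figures quoted in the abstract above (a 5:1 bandwidth over 0.8 to 4 GHz and a profile of 0.089 λlow) can be checked with simple arithmetic. The sketch below assumes λlow is the free-space wavelength at the lowest operating frequency; the function names are hypothetical, and the resulting physical height is a derived estimate, not a value reported in the abstract.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def bandwidth_ratio(f_low_hz: float, f_high_hz: float) -> float:
    """Ratio of the highest to the lowest operating frequency."""
    return f_high_hz / f_low_hz


def profile_height_m(profile_in_wavelengths: float, f_low_hz: float) -> float:
    """Physical height for a profile given in free-space wavelengths
    at the lowest operating frequency (free-space assumption)."""
    return profile_in_wavelengths * C / f_low_hz


ratio = bandwidth_ratio(0.8e9, 4.0e9)              # band edges from the abstract
height_mm = profile_height_m(0.089, 0.8e9) * 1000  # 0.089 wavelengths at 0.8 GHz
```

Under the free-space assumption, the 0.089 λlow profile corresponds to a height of roughly 33 mm.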
    • ZHANG Shi-bin, CAI Song-rui, YANG Min, CHEN Shi-hang
      Vol. 53, Issue 12, Pages: 4614-4629(2025) DOI: 10.12263/DZXB.20250650
      Abstract: The rapid development of artificial intelligence (AI) technology has enriched the Internet content ecosystem while simultaneously exacerbating the widespread propagation of multimodal fake news. In particular, the application of deepfake technology renders false information highly realistic at both visual and semantic levels, posing a severe threat to the trust system of the online public sphere. Although existing multimodal fake news detection techniques have utilized cross-modal attention mechanisms and large language models (LLMs) to achieve multimodal semantic alignment and reasoning enhancement, these methods still face challenges in specific scenarios. On the one hand, general-purpose large models are prone to “hallucination” and are often limited to coarse-grained semantic fusion, making it difficult to accurately capture mismatches between visual and textual entities. On the other hand, existing models often overlook physical artifacts in the image frequency domain and emotional manipulation signals in the text, resulting in limited discrimination capability when facing high-fidelity fake content produced by generative AI. To address these issues, this paper proposes a multimodal similarity-aware graph attention network (MS-GAT) based on multi-channel feature enhancement. The method first designs a multi-channel feature extraction module, utilizing the bidirectional encoder representations from transformers (BERT) model to extract deep semantic and emotional features of the text, combined with the vision transformer (ViT) to acquire image spatial features. Simultaneously, it introduces the fast Fourier transform (FFT) to capture anomalous artifacts in the image frequency domain and implements weighted fusion of multi-channel features through an adaptive gating unit. Building upon this, the paper constructs a similarity-aware heterogeneous graph containing visual-textual entity nodes and modality hub nodes. 
It utilizes the CLIP model to calculate the similarity of each node in a shared semantic space and thereby explicitly models the fine-grained associations between images and text. Finally, the model employs the graph attention network (GAT) to aggregate neighborhood information, dynamically adjusting the association strength between nodes via attention weights to focus on visual-textual inconsistency features, and incorporates an adaptive multi-task loss function to resolve the optimization imbalance problem in joint learning. The proposed method achieves accuracies of 94.5% and 87.6% on the Weibo17 and CFND datasets, respectively, with all key performance indicators outperforming existing mainstream baselines. Research results indicate that by integrating multi-channel visual-textual features with structured reasoning mechanisms, the proposed method successfully captures deep semantic conflicts between images and text, providing a new perspective and technical support for enhancing the interpretability and robustness of multimodal fake news detection.  
      Keywords: fake news detection; multimodal fusion; visual-textual similarity awareness; multi-channel feature extraction; graph attention network; heterogeneous graph
      Updated: 2026-04-24
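The abstract above mentions using the FFT to expose frequency-domain artifacts and an adaptive gating unit for weighted fusion. The following is a minimal sketch of those two ideas only, not the authors' implementation. The function names `high_freq_energy_ratio` and `gated_fusion`, the radial `cutoff` parameter, and the scalar sigmoid gate are all assumptions introduced for illustration.

```python
import numpy as np

def high_freq_energy_ratio(image: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral energy outside a centered low-frequency disc.

    `cutoff` is a hypothetical parameter: the disc radius expressed as a
    fraction of the smaller image dimension.
    """
    # Shift the zero-frequency component to the center of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    power = np.abs(spectrum) ** 2
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h // 2, xx - w // 2)
    high = power[radius > cutoff * min(h, w)].sum()
    total = power.sum()
    return float(high / total) if total > 0 else 0.0

def gated_fusion(a, b, gate_logit):
    """Illustrative scalar gate: g * a + (1 - g) * b with g = sigmoid(logit)."""
    g = 1.0 / (1.0 + np.exp(-gate_logit))
    return g * a + (1.0 - g) * b

# A flat image has all its energy at DC; a checkerboard sits at Nyquist.
flat = np.ones((8, 8))
checker = np.indices((8, 8)).sum(axis=0) % 2 * 2.0 - 1.0
print(high_freq_energy_ratio(flat), high_freq_energy_ratio(checker))
```

In a learned model the gate would typically be a vector produced by a small network rather than a fixed scalar; the scalar form here only shows the weighting arithmetic.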
    • LIU Xin, LIU Xiao-qing, DAI Wei
      Vol. 53, Issue 12, Pages: 4630-4639(2025) DOI: 10.12263/DZXB.20250927
      Abstract: Unknown time-delays pose a common challenge in industrial soft sensor modeling. Neglecting the identification of unknown time-delay variables, particularly multidimensional ones, can undermine model reliability and accuracy, leading to modeling failure. Accordingly, this paper proposes a stochastic incremental modeling method for industrial soft sensing with multidimensional unknown input time-delays, which is developed based on the stochastic configuration network (SCN) to jointly solve the iterative optimization problem of multidimensional unknown input time-delays and network model parameters. Initially, the SCN is used as a basic model to map the nonlinear relationships between input and output data, thereby revealing the sensitivity of conventional least-squares estimation to time-delay variables. Subsequently, the expectation-maximization (EM) algorithm is employed to establish a probabilistic framework, which formulates the probabilistic identification problem of the multidimensional unknown time-delay parameters. Furthermore, a solution space for the unknown time-delay variables is constructed, and the probability distribution of the unknown time-delay variables within the solution space is quantified by calculating the posterior probability density function. Finally, an iterative optimization strategy is adopted to derive a joint estimation formula for the parameters of both the unknown time-delay and the network model, thereby avoiding the error accumulation caused by separate estimations and obtaining the desired soft sensor model. The effectiveness and reliability of the proposed soft sensor model are validated through a numerical simulation and an industrial application involving a typical grinding process.  
      Keywords: soft sensing modeling method; stochastic configuration network; expectation-maximization (EM) algorithm; multidimensional unknown time-delays
      Updated: 2026-04-24
    • YANG Xin-lu, WANG Wen-bo, XING Yuan-xiu, DENG Zhao
      Vol. 53, Issue 12, Pages: 4640-4655(2025) DOI: 10.12263/DZXB.20250893
      Abstract: Dynamic electrocardiograms (ECGs) play an important role in clinical monitoring and wearable health assessment. However, due to their low amplitude and strong nonstationarity, ECG signals are highly susceptible to contamination by multiple sources of interference during acquisition, including baseline wander (BW), muscle artifacts (MA), electrode motion (EM), and environmental noise such as white Gaussian noise (WGN). The superposition of these disturbances distorts critical waveform components (P wave, QRS complex, and T wave), severely limiting the reliability of automatic analysis and clinical interpretation in wearable devices. Moreover, most existing ECG denoising methods are designed for single noise types or ideal operating conditions, and they often fail to simultaneously achieve effective noise suppression and waveform fidelity under multi-source mixed-noise and low signal-to-noise ratio (SNR) conditions. To address these challenges, a two-stage denoising method that combines an energy-selected tunable Q-factor wavelet transform with improved singular value decomposition (ES-TQWT-ISVD) is proposed. First, the multiresolution analysis capability of TQWT is employed to decompose noisy ECG signals into multiple subband components with different oscillatory characteristics. Based on the energy distribution differences of mixed noise in the time-frequency domain, criteria based on subband energy ratios and cumulative energy are constructed to adaptively select signal-dominant subbands, thereby achieving preliminary noise suppression. Subsequently, the selected subband signals are used to construct a Hankel matrix, and an adaptive order-determination strategy based on abrupt changes in the standard deviation of singular value subsets is introduced to identify the optimal reconstruction order. In this way, residual noise is further attenuated without relying on empirical thresholds, while preserving fine waveform details. 
Experiments were conducted on four types of single noise (WGN, BW, MA, and EM) and four types of mixed noise (BW+MA, BW+EM, EM+MA, and BW+MA+EM), constructed using the MIT-BIH Arrhythmia Database and the MIT-BIH Noise Stress Test Database, to systematically evaluate the denoising performance of the proposed method under different noise intensities and combinations. The experimental results demonstrate that, even under severe noise conditions at -5 dB, the proposed method achieves an SNR improvement of 12.46 dB, while maintaining a low root mean square error (0.057) and a high cosine similarity (91.07%). Compared with conventional TQWT and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) based methods, the proposed approach exhibits superior noise suppression capability and waveform preservation performance, and shows robust overall performance in multi-source mixed-noise scenarios. The results further indicate that the proposed method does not require training samples, has moderate computational complexity, and exhibits high detection consistency in feature wave localization tasks, making it suitable for high-quality ECG denoising and clinical front-end processing in complex dynamic environments.  
      Keywords: tunable Q-factor wavelet transform; electrocardiogram signal denoising; singular value decomposition; mixed noise suppression; feature wave localization
      Updated: 2026-04-24
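The second stage described in the abstract above builds a Hankel matrix from the signal and keeps only the leading singular components. A minimal sketch of that generic Hankel-SVD step follows. It is not the authors' ES-TQWT-ISVD method: the TQWT subband selection is omitted, and the reconstruction order `rank` is passed in by hand instead of being found by the adaptive order-determination strategy. The function name and the `window` parameter are assumptions.

```python
import numpy as np

def hankel_svd_denoise(signal: np.ndarray, rank: int, window: int) -> np.ndarray:
    """Low-rank Hankel approximation followed by anti-diagonal averaging."""
    n = len(signal)
    cols = n - window + 1
    # Hankel matrix: H[i, j] = signal[i + j].
    idx = np.arange(window)[:, None] + np.arange(cols)[None, :]
    H = signal[idx]
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    # Keep the `rank` largest singular components.
    H_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Map back to 1-D by averaging every entry that came from the same sample.
    out = np.zeros(n)
    counts = np.zeros(n)
    np.add.at(out, idx, H_low)
    np.add.at(counts, idx, 1.0)
    return out / counts

# Usage: a noisy sinusoid (a stand-in for a real ECG subband signal).
rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(2 * np.pi * 0.05 * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)
denoised = hankel_svd_denoise(noisy, rank=2, window=50)
print(np.mean((noisy - clean) ** 2), np.mean((denoised - clean) ** 2))
```

A single sinusoid gives a rank-2 Hankel matrix, which is why `rank=2` suits this toy signal; real ECG components would need a higher, data-dependent order.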
    • JIANG Xin-yu, YAN You-jie, WANG Bin-wen, ZHANG Kai-yue, BI Liang-jie, LI Hai-long, MENG Lin
      Vol. 53, Issue 12, Pages: 4656-4664(2025) DOI: 10.12263/DZXB.20250981
      Abstract: This paper systematically investigates the reconstruction of radiation characteristics for high-power pulsed array antennas, based on the time-domain pattern convolution method and constrained multi-objective optimization algorithms. Under constraints of a radiation field amplitude exceeding 12 kV/m and a pulse width variation rate below 20%, the study achieves reconfigurable control of both the antenna pattern and the radiated pulse waveform. To enhance computational model reliability, a “small-array extrapolation to large-array” strategy is adopted. Simulation results from a smaller-scale array are used to reasonably infer the radiation behavior of a large-scale array, thereby balancing computational efficiency and model accuracy. Building upon this foundation, a rigorous numerical analysis tool accounting for mutual coupling effects is integrated with the non-dominated sorting genetic algorithm II (NSGA-II) to establish a multi-objective optimization workflow for array antenna delay layout. This workflow enables flexible reconfiguration of radiation pulse characteristics. In practical engineering applications, designers can select the most suitable delay configuration scheme from the Pareto optimal solution set derived through optimization, tailored to specific mission requirements. High-power pulsed array antennas hold significant application value in ultra-wideband radar, electromagnetic countermeasures, and bioelectromagnetics. The core objective is to generate radiation field distributions with specific temporal and spatial characteristics within designated spatial regions. Traditional design methods predominantly focus on frequency-domain or steady-state performance, exhibiting limited control over transient pulse waveforms. 
The proposed time-domain pattern convolution method provides an intuitive representation of how excitation delays influence synthesized pulse waveforms, establishing a theoretical framework for precise control of time-domain radiation characteristics. By introducing the pulse width variation rate as a key constraint, it effectively ensures temporal consistency of the radiated pulse waveform within the target region, preventing system performance degradation caused by waveform distortion. Regarding the optimization approach, the NSGA-II algorithm used in this paper exhibits strong global search capabilities and convergence performance, making it particularly suitable for multi-objective engineering optimization problems involving complex nonlinear constraints. By embedding mutual coupling effects into the individual fitness evaluation process, the feasibility of optimization results in actual physical systems is ensured. In the numerical experiments, three typical scenarios are examined: maximizing radiation field amplitude; maximizing beam width; and minimizing pulse width variation rate within the half-power beam width. The high consistency between test data and calculation results validates the effectiveness and robustness of the developed optimization model in simultaneously handling spatiotemporal constraints.  
      Keywords: reconfiguration; non-dominated sorting genetic algorithm II (NSGA-II); time-domain pattern; pulse array antenna; pulse width variation rate; mutual coupling
      Updated: 2026-04-24
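To illustrate how excitation delays shape the synthesized pulse, as the abstract above discusses, here is a deliberately simplified delay-and-sum sketch for a uniform linear array. It is not the paper's time-domain pattern convolution method: element patterns and mutual coupling are ignored, the Gaussian pulse shape is an assumption, and every name (`gaussian_pulse`, `synthesized_waveform`, `steering_delays`) is hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def gaussian_pulse(t, sigma):
    """Unit-peak Gaussian pulse (assumed shape, for illustration only)."""
    return np.exp(-(t / sigma) ** 2)

def synthesized_waveform(t, theta, delays, spacing, sigma):
    """Far-field sum of identical pulses from a uniform linear array."""
    n = np.arange(len(delays))
    # Element n fires at delays[n]; toward angle theta its path is shorter
    # by n * spacing * sin(theta), so its pulse arrives that much earlier.
    arrival = delays - n * spacing * np.sin(theta) / C
    return gaussian_pulse(t[:, None] - arrival[None, :], sigma).sum(axis=1)

def steering_delays(num_elements, spacing, theta0):
    """Delays that align all pulses in the direction theta0."""
    return np.arange(num_elements) * spacing * np.sin(theta0) / C

# Usage: 8 elements, 0.1 m spacing, pulses aligned toward +30 degrees.
t = np.linspace(-5e-9, 5e-9, 2001)
theta0 = np.deg2rad(30.0)
delays = steering_delays(8, 0.1, theta0)
on_axis = synthesized_waveform(t, theta0, delays, 0.1, 1e-10)
off_axis = synthesized_waveform(t, -theta0, delays, 0.1, 1e-10)
print(on_axis.max(), off_axis.max())
```

In the steered direction the pulses add coherently, while away from it they spread out in time, which lowers the peak and changes the pulse width. A multi-objective optimizer such as NSGA-II would search over `delays` to trade these quantities off.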
    • PI Hui-bin, WU Long-sheng, WEN Yi, LUO Deng, YU Guo-fang
      Vol. 53, Issue 12, Pages: 4665-4670(2025) DOI: 10.12263/DZXB.20250733
      Abstract: This paper investigates the cryogenic total ionizing dose (TID) effects on a temperature sensor fabricated in a sub-20 nm bulk silicon FinFET technology, targeting applications in extreme environments such as deep-space exploration. The sensor core utilizes a bandgap reference circuit based on PNP bipolar junction transistors. Theoretical analysis reveals that the degradation of bipolar transistors under irradiation at 290 K or 110 K manifests as an increase in the base recombination current. However, the underlying dominant mechanisms are fundamentally different. At room temperature (290 K), the degradation is primarily governed by an increase in the interface trap density, consistent with the classical model involving hydrogen ion (H⁺) drift leading to the breaking of Si-H bonds. In contrast, under cryogenic conditions (110 K), the drift motion of H⁺ ions is effectively “frozen”, and the tunneling-assisted recombination mechanism via border traps becomes the main contributor to degradation. Given that the border trap concentration at 110 K is significantly higher than the interface trap concentration at 290 K, a pronounced enhancement of radiation damage at cryogenic temperatures is observed. Circuit analysis and simulation validation demonstrate that as the radiation-induced leakage current increases from 0 nA to 100 nA, the bandgap reference voltage undergoes significant changes, resulting in an increase of approximately 21 bits in the output temperature code. This finding theoretically predicts a positive drift in the temperature code due to irradiation. To experimentally verify the theoretical analysis, irradiation tests were conducted using a customized cryogenic TID effect test system and a Co⁶⁰ γ-ray source, accumulating a total dose of 1 Mrad(Si) under 110 K and 290 K conditions, respectively. 
The experimental results unequivocally show that after 1 Mrad(Si) irradiation, the cumulative increases in the temperature code for sensors irradiated at 110 K reach 37 bits and 30 bits, respectively, which are substantially greater than the 9 bit increase observed under 290 K room-temperature irradiation. This clear quantitative evidence firmly validates the existence of cryogenic temperature radiation damage enhancement effects. This study provides crucial theoretical insights and experimental data for the design and radiation hardening of FinFET integrated circuits, particularly sensors relying on bipolar devices, operating in extreme low-temperature and high-radiation environments.  
      Keywords: FinFET; temperature sensor; total ionizing dose effects; cryogenic temperature; bipolar junction transistor
      Updated: 2026-04-24
    • Adaptive Diversity-Driven Particle Swarm Optimization

      ZHANG Li
      Vol. 53, Issue 12, Pages: 4671-4685(2025) DOI: 10.12263/DZXB.20250823
      Abstract: To address the inherent limitations of particle swarm optimization (PSO) in handling high-dimensional, multi-modal, and complex optimization problems, such as rapid loss of population diversity and premature convergence, an adaptive diversity-driven particle swarm optimization (ADDPSO) algorithm is proposed. The algorithm is systematically reconstructed across five dimensions (initialization strategy, diversity quantification, parameter adaptation, information exchange structure, and perturbation mechanism) to comprehensively enhance its global exploration capability, convergence accuracy, and robustness within complex search spaces. First, to address uneven initial population distribution, a weighted fusion initialization strategy combining Logistic chaotic mapping and uniform random numbers is designed. A 7:3 ratio balances sequence traversal and random perturbation, significantly improving the uniformity of initial solution coverage in high-dimensional spaces. Second, a population diversity metric based on average Euclidean distance is introduced, with the diversity ratio defined as a real-time feedback signal for evolutionary adjustments, which enables dynamic parameter and strategy tuning. Built upon this, a dual-driven parameter adaptation mechanism combining time-decreasing rules and the diversity ratio is proposed, enabling smooth transitions between inertia weight and learning rate during exploration and exploitation phases. Furthermore, to overcome the excessive reliance of traditional velocity updates on a single global optimum, a three-tiered collaborative velocity update architecture, “individual cognition-elite guidance-population distribution”, is constructed. Elite guidance prevents search direction convergence by maintaining an elite archive and employing probability-based selection strategies based on fitness. 
Meanwhile, the population distribution component introduces a global coordination mechanism based on fitness deviation, achieving differentiated guidance through root-mean-square normalisation and sign-preserving coefficients. Additionally, the algorithm integrates diversity-aware simulated annealing with multi-strategy adaptive mutation operations and incorporates a dual-threshold acceptance criterion. This approach actively injects diversity while maintaining convergence trends, effectively suppressing premature convergence. Experimental validation was conducted on the CEC2017 test set comprising 12 high-dimensional complex functions (including mixed functions F18~F19 and composite functions F20~F29) and a gear system design problem. Results demonstrate that ADDPSO achieves optimal or near-optimal mean and standard deviation values on most functions. Particularly for highly complex functions such as F18, F20~F24, and F27~F29, its solution accuracy surpasses mainstream PSO variants by 1 to 4 orders of magnitude while exhibiting superior stability. On the gear system design problem, ADDPSO not only converged stably to theoretically optimal solutions but also significantly outperformed comparison algorithms, validating its reliability and consistency in engineering optimization. In summary, through multi-level, multi-mechanism collaborative design, ADDPSO systematically addresses PSO's diversity decay and premature convergence issues in high-dimensional complex optimization, demonstrating outstanding comprehensive performance and practical application potential.  
      Keywords: particle swarm optimization algorithm; population diversity; high-dimensional complexity; parameter adaptation; premature convergence; global optimization
      Updated: 2026-04-24
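The first two components named in the abstract above, the 7:3 fusion of a Logistic chaotic sequence with uniform random numbers and the diversity metric based on average Euclidean distance, can be sketched as follows. This is one plausible reading, not the published algorithm. The sketch assumes that the chaotic sequence carries the weight 0.7, that the logistic map uses the fully chaotic setting `mu = 4`, and that diversity is measured as the mean distance to the swarm centroid. All function names are hypothetical.

```python
import numpy as np

def logistic_sequence(n, x0=0.37, mu=4.0):
    """Iterate the logistic map x <- mu * x * (1 - x); values stay in [0, 1]."""
    seq = np.empty(n)
    x = x0
    for i in range(n):
        x = mu * x * (1.0 - x)
        seq[i] = x
    return seq

def fused_init(n_particles, dim, lower, upper, rng, w_chaos=0.7):
    """Weighted blend of chaotic and uniform samples, scaled to the bounds."""
    chaos = logistic_sequence(n_particles * dim).reshape(n_particles, dim)
    uniform = rng.random((n_particles, dim))
    unit = w_chaos * chaos + (1.0 - w_chaos) * uniform  # assumed 7:3 weighting
    return lower + unit * (upper - lower)

def mean_distance_diversity(pop):
    """Average Euclidean distance of the particles to the swarm centroid."""
    centroid = pop.mean(axis=0)
    return float(np.linalg.norm(pop - centroid, axis=1).mean())

# Usage: 30 particles in a 5-dimensional box [-10, 10]^5.
rng = np.random.default_rng(0)
swarm = fused_init(30, 5, -10.0, 10.0, rng)
print(swarm.shape, mean_distance_diversity(swarm))
```

A diversity ratio could then be formed by dividing the current diversity by its initial value and feeding that back into the inertia weight and learning factors, though the exact rule is not given in the abstract.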
    • SUN Dian-xing, HUANG Ya-sheng, PENG Rui-hui, TAN Shun-cheng, WANG Guo-hong
      Vol. 53, Issue 12, Pages: 4686-4707(2025) DOI: 10.12263/DZXB.20250887
      Abstract: Weak maritime target detection usually faces challenges such as small radar cross sections, low infrared contrast, and susceptibility to background interference from sea clutter, floating objects, islands, and seabirds. Due to inherent physical limitations, single-sensor detection methods have difficulty balancing detection probability and false-alarm suppression under complex sea conditions. Track-before-detect (TBD) techniques can effectively enhance weak target detection through multi-frame joint processing; however, traditional TBD methods mostly rely on prior motion model assumptions and are primarily designed for single-sensor scenarios, showing insufficient adaptability under target maneuvering or complex background conditions. To address these issues, this paper proposes a radar-infrared multi-view cooperative intelligent TBD technique to achieve reliable detection of weak targets in complex maritime environments. First, to deal with the dense background clutter and noise points in radar echoes, a radar low-threshold preprocessing mechanism is introduced, which removes part of the low-amplitude interference points while preserving target echo information, thereby reducing the computational complexity of subsequent processing. Then, considering the heterogeneity of radar and infrared sensors in measurement dimensions and spatial representations, a radar-infrared heterogeneous data spatial mapping model is constructed. Radar range-azimuth measurements are mapped onto the infrared image plane to generate radar-infrared virtual fusion images, enabling the alignment and fusion of the two types of sensor information in a unified pixel space, thus increasing the target measurement data rate and enhancing target saliency. 
Based on the constructed fusion images, a maximum-value multi-frame accumulation strategy is employed to perform energy accumulation on consecutive fusion images, highlighting the spatio-temporal correlation of weak targets and suppressing random noise. Meanwhile, by exploiting the cross-modal difference that real targets generate responses in both radar and infrared sensors, whereas some background interference appears only in a single sensor, a candidate target region delineation mechanism based on joint radar-infrared response constraints is established. This mechanism effectively excludes non-target interference such as seabirds and floating objects, provides reliable spatial constraints for subsequent detection, and significantly reduces false-alarm rates. In the target detection stage, considering that weak targets in multi-frame accumulated fusion images are characterized by small size, low contrast, and elongated continuous trajectories, an adaptive multi-scale feature enhancement network based on the YOLOv11 framework (AMSFE-YOLOv11) is constructed to achieve target trajectory detection and instance segmentation, while further suppressing complex background interference. The proposed method eliminates the dependence of traditional TBD approaches on prior motion models and can still effectively extract trajectory features and spatio-temporal correlations under target maneuvering conditions, achieving stable energy accumulation with good robustness and applicability. Finally, the proposed method is validated using real maritime radar and infrared data. The experimental results demonstrate that the proposed method achieves a detection probability exceeding 94.7% for weak targets, with a false alarm rate below 0.52%. 
Compared with single-sensor detection approaches, it exhibits a clear performance advantage, thereby validating the effectiveness and practical application potential of the proposed radar-infrared multi-view cooperative intelligent TBD technique in complex maritime environments.  
      Keywords: weak target detection; radar-infrared; multi-view cooperation; multi-frame accumulation; candidate target region; AMSFE-YOLOv11
      Updated: 2026-04-24
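Two steps from the abstract above lend themselves to a short illustration: maximum-value accumulation over consecutive frames, and keeping only regions where both sensors respond. The sketch below shows the array operations only. The function names and the fixed thresholds are assumptions, and the paper's spatial mapping model and detection network are not represented.

```python
import numpy as np

def max_accumulate(frames):
    """Pixel-wise maximum over a stack of frames with shape (T, H, W)."""
    return np.max(np.asarray(frames), axis=0)

def joint_response_mask(radar_acc, ir_acc, radar_thr, ir_thr):
    """Candidate region: pixels where BOTH accumulated images exceed a threshold."""
    return (radar_acc > radar_thr) & (ir_acc > ir_thr)

# Usage: a weak target moving one pixel per frame leaves a diagonal trail.
frames = np.zeros((5, 8, 8))
for k in range(5):
    frames[k, k, k] = 1.0
trail = max_accumulate(frames)
print(int(trail.sum()))  # number of pixels on the accumulated trajectory

# Interference seen by only one sensor is excluded by the joint constraint.
radar = np.zeros((8, 8)); radar[2, 2] = 1.0; radar[5, 5] = 1.0
infrared = np.zeros((8, 8)); infrared[2, 2] = 1.0; infrared[0, 7] = 1.0
print(np.argwhere(joint_response_mask(radar, infrared, 0.5, 0.5)))
```

Maximum-value accumulation keeps the strongest response per pixel, so a moving target appears as a continuous track in a single image, which is the kind of elongated trajectory the detection network is then asked to segment.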
    • LI Shuang-zhi, LIU Ying-ying, LI Lu-qi, GUO Xin
      Vol. 53, Issue 12, Pages: 4708-4718(2025) DOI: 10.12263/DZXB.20250986
      Abstract: To address the issues of high-order unitary constellation design and the high complexity of detection at the receiver, this paper focuses on non-coherent single-input multiple-output (SIMO) short-packet communication systems over block Rayleigh fading channels. It proposes a low-complexity multi-symbol high-order unitary constellation parameterization design method, aiming to efficiently exploit time-domain diversity in short-packet transmissions, improve system reliability, and reduce design and detection overhead. Short-packet communication is one of the key technologies for supporting ultra-reliable low-latency communication (URLLC). Due to limited transmission block lengths, traditional channel coding and coherent detection schemes suffer from reduced spectral efficiency as they require allocating a large number of pilot symbols for channel estimation. Although non-coherent communication does not require instantaneous channel state information, the design and maximum likelihood (ML) detection complexity of traditional unstructured unitary constellations grow exponentially with the modulation order, making them difficult to apply in high-order modulation or latency-sensitive scenarios. To tackle this, this paper introduces a parameterized and structured design approach. Under given transmission rate and average power constraints, the unitary constellation design problem is transformed into a mixed discrete-continuous optimization problem with the objective of maximizing the minimum chordal distance (MCD) between different transmitted signals. The core of the proposed method lies in recursively parameterizing the transmitted signal of length L into a series of independent angle parameters and phase parameters, assuming they belong to different constellation sets. 
This structural assumption not only decouples the high-dimensional constellation point optimization problem into the optimization of a finite number of parameters, significantly reducing the design complexity, but also establishes a structural foundation for implementing low-complexity recursive detection algorithms at the receiver. Specifically, the paper proposes a two-step solution strategy that jointly optimizes bit allocation and constellation structure: first, under a given bit allocation, it is derived that the optimal angle constellation should have an arithmetic sequence structure, and the optimal phase constellation should be a uniformly distributed phase-shift keying (PSK) constellation, based on maximizing the MCD property; second, based on this structure, offline search is employed to determine the optimal bit allocation for each parameter subspace, thereby maximizing the system’s MCD while ensuring structural regularity. At the receiver, benefiting from the parameterization and structural independence of the constellation, the paper further proposes a low-complexity recursive ML detection algorithm. This algorithm recursively decomposes the detection problem of any symbol length into two-symbol detection, reducing computational complexity. Theoretical analysis shows that the complexity of the proposed detection algorithm scales linearly with the sum of the constellation points in each parameter subspace, avoiding the exponential growth with the total modulation order characteristic of traditional global search algorithms. This makes it particularly suitable for high-order modulation and real-time short-packet communication scenarios. Simulation results demonstrate that compared to pilot-based quadrature amplitude modulation and PSK schemes, the proposed structured unitary constellation achieves better block error rate (BLER) performance. 
Moreover, in high-rate scenarios, it further enhances spectral efficiency by jointly utilizing the angle and phase dimensions. Meanwhile, the proposed recursive detector achieves BLER and bit error rate performances close to those of the global search detector, while reducing computational complexity by over 95%. This validates its feasibility in maintaining excellent performance while achieving low-complexity processing. The study provides an efficient and practical constellation design and detection scheme for non-coherent short-packet communication in SIMO systems, contributing to the deployment and application of URLLC in practical systems.  
      Keywords: SIMO system; short-packet communication; constellation design; non-coherent space-time modulation; recursive detection; minimum chordal distance (MCD) criterion
      Updated: 2026-04-24
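The design criterion in the abstract above is the minimum chordal distance (MCD) between transmitted signals. For unit-norm signal vectors x and y, a standard definition of the chordal distance is sqrt(1 - |x^H y|^2), which ignores a common phase rotation, as non-coherent detection requires. The sketch below evaluates the MCD of a small two-symbol constellation built from one angle and one phase parameter. The construction in `two_symbol_point` and the specific angle and phase sets are illustrative assumptions, not the optimized constellations of the paper.

```python
import numpy as np
from itertools import combinations

def chordal_distance(x, y):
    """Chordal distance sqrt(1 - |x^H y|^2) between normalized vectors."""
    x = x / np.linalg.norm(x)
    y = y / np.linalg.norm(y)
    corr = abs(np.vdot(x, y))  # np.vdot conjugates its first argument
    return float(np.sqrt(max(0.0, 1.0 - corr ** 2)))

def min_chordal_distance(constellation):
    """Smallest pairwise chordal distance over the constellation."""
    return min(chordal_distance(a, b) for a, b in combinations(constellation, 2))

def two_symbol_point(theta, phi):
    """Hypothetical two-symbol unit vector from an angle and a phase."""
    return np.array([np.cos(theta), np.exp(1j * phi) * np.sin(theta)])

# Usage: 2 evenly spaced angles times 4-PSK phases gives 8 points (3 bits).
angles = [np.pi / 8, 3 * np.pi / 8]
phases = [k * np.pi / 2 for k in range(4)]
points = [two_symbol_point(t, p) for t in angles for p in phases]
print(len(points), min_chordal_distance(points))
```

Because the constellation is indexed by separate angle and phase sets, a detector can search each set in turn instead of all points jointly, which is the structural property the recursive detector in the abstract exploits.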

      SURVEYS AND REVIEWS

    • Private Information Retrieval: Current Status and Future Prospects

      DU Rui-ying, HUANG Zheng-di, SHI Min, ZHOU Er-jun, HE Kun, CHEN Jing
      Vol. 53, Issue 12, Pages: 4719-4739(2025) DOI: 10.12263/DZXB.20250525
      Abstract: In the era of data-driven decision making, the deep integration of big data analytics and cloud computing has unleashed the value of data while pushing data security and privacy protection to the forefront of core challenges. As a key secure multi-party computation technology, private information retrieval allows users to retrieve specific information from remote databases without revealing the query target, providing a solid privacy guarantee for data queries in untrustworthy environments. The technology has demonstrated its application potential in many fields, such as healthcare and finance, and continues to receive extensive attention from both academia and industry. However, with the growth of data size and the number of users, existing schemes face a significant tension between efficiency and practicality. Early multi-server schemes based on information-theoretic security rely on the strong security assumption that multiple servers do not collude, while single-server schemes based on computational security face severe challenges in communication, computation, and storage overheads. Therefore, how to comprehensively improve retrieval efficiency while ensuring security has become a core issue in bringing the technology into practice. This paper systematically reviews and summarizes the current research status of private information retrieval technology. First, it clarifies the formal definition of private information retrieval and its core attributes, and outlines the mainstream cryptographic primitives that realize the technology. Second, this paper constructs a technology categorization framework based on the number of servers, divides the existing schemes into two main categories, multi-server and single-server, and analyzes in depth the design principles and performance trade-offs of different technology routes based on function secret sharing, puncturable pseudorandom functions, homomorphic encryption, and oblivious transfer. 
Further, this paper explores various practical variants derived to meet specific functional requirements, including batch private information retrieval, symmetric private information retrieval, keyword private information retrieval and updatable private information retrieval, and analyzes their respective problems and design features. At the application level, this paper specifically illustrates how private information retrieval can address practical privacy protection pain points through typical scenarios such as social discovery, anonymous communication and ad delivery. Finally, based on the comprehensive review and analysis, this paper looks forward to the future development trend of this area, pointing out that research should focus on further optimizing the theoretical overhead, designing a unified and flexible framework to support multi-functionality, and solving practical deployment challenges through system-level innovation, so as to promote private information retrieval technology from theory to a wide range of practical applications.  
      Keywords: private information retrieval; secure multiparty computation; privacy protection; homomorphic encryption; function secret sharing
      Updated: 2026-04-24
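As a concrete instance of the early multi-server, information-theoretic schemes the abstract above refers to, the classic two-server XOR construction can be sketched in a few lines. The database is a list of bits replicated on two servers that are assumed not to collude. The function names are hypothetical, and the scheme shown has linear communication cost, which is exactly the kind of overhead later work aims to reduce.

```python
import secrets

def make_queries(n, index):
    """Two selection vectors that differ only at the wanted index."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2

def answer(database, query):
    """Server side: XOR of the database bits selected by the query."""
    acc = 0
    for bit, selected in zip(database, query):
        if selected:
            acc ^= bit
    return acc

def retrieve(database, index):
    """Client side: combine both answers to recover database[index]."""
    q1, q2 = make_queries(len(database), index)
    # Every position other than `index` is selected in both queries or in
    # neither, so it cancels out under XOR.
    return answer(database, q1) ^ answer(database, q2)

# Usage: recover every bit of a small replicated database.
db = [1, 0, 1, 1, 0, 0, 1, 0]
print([retrieve(db, i) for i in range(len(db))])
```

Each query on its own is a uniformly random bit vector, so a single server learns nothing about the index; privacy fails only if the two servers compare their queries, which is the non-collusion assumption.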
    • LIU Jing-ling, HE Zhong, WANG Zi-xi, FANG Hao, CUI Rui, HUANG Jia-wei
      Vol. 53, Issue 12, Pages: 4740-4755(2025) DOI: 10.12263/DZXB.20250334
      Abstract: Data centers host cloud computing, big data analytics, and artificial intelligence workloads, whose traffic exhibits highly heterogeneous latency and throughput requirements. Latency-sensitive tasks require extremely low delay, whereas large-scale backup and analytics flows are more concerned with high throughput; accordingly, data center networks usually adopt efficient scheduling mechanisms to meet these diverse application needs. Traditional packet scheduling mechanisms are implemented on customized network devices to meet specific application requirements, which leads to high development costs and poor scalability and makes it difficult to rapidly deploy new scheduling algorithms in the network. With the emergence of new network devices such as programmable switches, the programmability of the data plane has been significantly enhanced. Network devices have shifted from fixed functionalities to flexible configurations, laying the groundwork for designing high-performance scheduling mechanisms in the data plane. Recently, researchers have proposed numerous innovative programmable scheduling mechanisms, significantly improving network performance. This paper provides a comprehensive survey of recent programmable scheduling mechanisms for data center networking and classifies them into four categories: general-purpose abstraction mechanisms, mechanisms for high-concurrency traffic, mechanisms for fairness assurance, and mechanisms for tenant-level isolation. General-purpose abstraction mechanisms provide flexible abstractions that support a wide range of scheduling policies. Mechanisms for high-concurrency traffic emphasize line-rate processing under massive flow concurrency. Mechanisms for fairness assurance focus on providing fair bandwidth allocation among flows or tenants. Mechanisms for tenant-level isolation target multi-tenant environments by providing strong isolation and hierarchical resource allocation. 
These four types of mechanisms have different design emphases but can complement each other to cope with complex and diverse data center scenarios. Then, representative scheduling mechanisms for each category are discussed, followed by a comparative analysis of the advantages and disadvantages of various categories. Finally, the paper concludes with a discussion on the future development trends of scheduling mechanisms based on programmable data planes.  
      关键词:data center networking;programmable data plane;priority scheduling;fairness;multi-tenant;latency   
      Updated: 2026-04-24
    • HAN Guang-jie, ZHU Sheng-chao, LIN Chuan, JIANG Jin-fang
      Vol. 53, Issue 12, Pages: 4756-4786(2025) DOI: 10.12263/DZXB.20250418
      Abstract: Multi-agent reinforcement learning (MARL), an important framework for handling agent cooperation and competition in complex dynamic environments, has developed rapidly in both theory and application in recent years and shows broad prospects in fields such as autonomous driving, swarm robotics, intelligent scheduling, and adversarial games. However, environmental non-stationarity, strong policy coupling, difficult credit assignment, and complex safety constraints are widespread in multi-agent systems, so MARL faces greater challenges than single-agent reinforcement learning. This paper first reviews the foundational modeling and theoretical framework of MARL, starting from formal descriptions such as Markov games and partially observable Markov games. Drawing on typical paradigms such as centralized training with decentralized execution and communication-based cooperative decision-making, it compares existing methods in terms of information utilization, computational complexity, and convergence properties, and summarizes core technologies such as value decomposition, policy gradients, multi-agent credit assignment, and communication modeling. On this basis, the paper summarizes several frontier research directions. The first is large language model (LLM)-based MARL, which draws on the knowledge reasoning and high-level planning capabilities of LLMs for task decomposition, policy guidance, and natural language communication, enhancing the generalization and collaboration capabilities of agents in open environments. The second is meta-learning-based MARL, which targets multi-task and distribution-shift scenarios and focuses on the rapid adaptation of policies to new tasks, new teammates, or new opponents, improving sample efficiency by learning “learn-to-learn” initializations or adaptation rules.
The third is explainability-oriented MARL, which uses methods such as attention visualization, causal analysis, and rule extraction to make the decision-making process more transparent, supporting policy auditing, human-agent collaboration, and safety supervision. The fourth is the application and deployment of large-scale MARL, which addresses the training efficiency, communication overhead, and scalability problems caused by sharp growth in the number of agents and state dimensions, and explores mechanisms such as hierarchical structures, population modeling, and parallel training. The fifth is multi-agent safe reinforcement learning, which studies safe decision-making under adversarial perturbations, uncertainties, and policy games from the perspectives of constraint satisfaction, risk control, and robustness. Finally, drawing on two typical application scenarios, cooperation and competition, the paper discusses the challenges MARL faces when deployed in real systems, such as insufficient sample efficiency, difficulty in simulation-to-real transfer, and insufficient analysis of fairness and steady-state games, aiming to provide a systematic reference for subsequent theoretical research and engineering applications of MARL.
      Keywords: multi-agent reinforcement learning; Markov games; large language model; meta-learning; explainability; multi-agent safe reinforcement learning
      Updated: 2026-04-24
    • A Survey on Shuffled Differential Privacy

      ZHANG Xiao-jian, WANG Hao-feng, FU Ji-bin
      Vol. 53, Issue 12, Pages: 4787-4810(2025) DOI: 10.12263/DZXB.20250017
      Abstract: Querying and analyzing users’ data under centralized differential privacy (CDP) and local differential privacy (LDP) has attracted considerable attention in recent years. Solutions continue to be proposed, but their limitations have also become apparent; these stem from the fact that CDP and LDP are two extreme models with respect to trust in the collector. In the CDP model, users fully trust the collector and report their raw data. The collector perturbs the raw data to respond to queries, so the query error is low; however, users in the CDP model cannot control their raw private data. In the LDP model, by contrast, users do not trust the collector and report only noisy data, so the query error over the noisy reports is high. The shuffled differential privacy (SDP) model effectively balances the trade-off between CDP and LDP. This paper surveys the state of the art of SDP for data query and analysis. The mechanisms and properties of the model are described, with a focus on summation queries, histogram estimation, frequency and mean estimation, and machine/federated learning. Following a comprehensive comparison and analysis of existing work, future research directions are put forward.
      Keywords: central differential privacy; local differential privacy; shuffled differential privacy; data queries; data analysis; machine learning
      Updated: 2026-04-24
    • A Survey on Visible-Infrared Cross-Modality Person Re-Identification

      LI Zhi-yong, JIANG Wei, LIU Hao-jie
      Vol. 53, Issue 12, Pages: 4811-4832(2025) DOI: 10.12263/DZXB.20250800
      Abstract: Person re-identification (ReID) is a core technology in intelligent video surveillance systems, with the fundamental objective of efficiently retrieving and matching a specific pedestrian across camera networks with non-overlapping fields of view. However, traditional approaches that rely solely on visible images suffer severe performance degradation under challenging illumination conditions such as nighttime or low-light environments. To address this limitation, visible-infrared person re-identification (VI-ReID) has emerged, aiming to enable cross-modal retrieval between visible and infrared images. This task not only inherits classic challenges from unimodal ReID, such as pose variations, viewpoint changes, and occlusions, but also faces a significant cross-modal discrepancy arising from the intrinsic differences in imaging mechanisms. This paper provides a systematic survey, comprehensive synthesis, and critical review of recent deep learning-based methods for VI-ReID. We categorize existing mainstream approaches into three major groups: (1) cross-modal network architecture design, which constructs specialized network structures to extract modality-invariant identity features, including dual-stream feature extraction networks, identity disentanglement modules, fine-grained feature alignment strategies, and architecture search-based designs; (2) generative learning methods, which seek to bridge the modality gap through modality translation or data augmentation, covering unidirectional or bidirectional image generation, construction of unified intermediate modalities, and feature-level generation and compensation techniques; (3) cross-modal similarity learning, which focuses on designing loss functions and metric learning schemes to pull together positive cross-modal pairs while pushing apart negative ones, primarily encompassing sample- or proxy-based contrastive learning and test-time optimized cross-modal re-ranking algorithms.
Moreover, recognizing the high cost of annotation and the prevalence of noisy or incomplete labels in real-world applications, this survey further investigates advances under non-fully-supervised learning paradigms, systematically summarizing the unique challenges and representative solutions in noisy-label learning, semi-supervised learning, and unsupervised learning. To offer a holistic performance evaluation, we conduct unified comparisons and analyses of representative algorithms under different supervision settings on three widely adopted public benchmarks: SYSU-MM01, RegDB, and LLCM. Finally, grounded in current technical bottlenecks, we outline promising future directions, including the development of more realistic and diverse datasets, mitigation of modality imbalance, lightweight model deployment, exploration of sustainable or lifelong learning mechanisms, and extension toward video-based or multi-source heterogeneous information-fused ReID. This survey aims to serve as a valuable theoretical reference and technical guide for future researchers in the field.
      Keywords: cross-modality person re-identification; pattern recognition; deep learning; representation learning; network architecture design; generative learning; intelligent surveillance
      Updated: 2026-04-24
    • Research Progress on Blockchain-Enabled Trusted Data Space Security

      SHANG Si-yuan, DU Xue-hui, LIU Ao-di, WANG Xiao-han, WU Xiang-yu, WANG Na
      Vol. 53, Issue 12, Pages: 4833-4858(2025) DOI: 10.12263/DZXB.20250617
      Abstract: With the release of the action plan for trusted data spaces, promoting the trading and circulation of data elements and breaking down “data silos” has become a national strategic priority. Blockchain, as a distributed computing and storage paradigm, provides immutability, decentralization, and traceability to support trusted data management. Unlike traditional databases, it uses immutable data structures and multi-party consensus to build trust between data providers and users. An urgent priority is to harness these advantages to strengthen trusted data space security. Numerous domestic and international reviews have surveyed blockchain’s foundational technologies (e.g., consensus mechanisms, smart contracts, network topology), application areas (e.g., information security, system protection, data management), and technical enhancements (e.g., sharding, cross-chain, privacy protection). However, a systematic review of how blockchain enables trusted data space security remains lacking. To fill this gap, this paper offers a comprehensive academic review of blockchain-enabled trusted data space security. First, we analyze the core architecture and system requirements, identify security issues from a full-lifecycle perspective, and propose a blockchain-based security framework for data acquisition, validation, sharing, and provenance. Second, we integrate blockchain with mainstream security mechanisms and synthesize research progress across four domains: trusted acquisition, compliance verification, secure sharing, and federated provenance. Third, we survey development trends in blockchain-enabled trusted data space security. Grounded in the foundational requirements of trusted data spaces as data infrastructure, we evaluate the strengths and limitations of existing work across all stages of data circulation.
Further breakthroughs are needed in on-chain data retrieval, data ownership protection, and the enforcement of data regulations to safeguard core capabilities, including trustworthy governance, resource interoperability, and value co-creation, and to advance the security architecture and technological development of blockchain-based trusted data spaces.
      Keywords: blockchain; trusted data space; data sharing; data security; privacy protection; technology enablement
      Updated: 2026-04-24
    • Charge Field Modulation Mechanism of Power Semiconductor Devices

      ZHANG Wen-tong, ZHANG Bo, LI Zhao-ji
      Vol. 53, Issue 12, Pages: 4859-4866(2025) DOI: 10.12263/DZXB.20250921
      Abstract: The essential difference between power semiconductor devices and micro/nano-devices is that the former have a voltage-withstanding layer and a termination region. In this study, the electric field within a power semiconductor device is decomposed into two components: the charge field Eq, originating from ionized charges, and the potential field Ep, induced by the applied potential. This decomposition allows the electric field interaction mechanisms in various devices to be analyzed independently by examining the additional charge fields generated by different charges. Based on this analysis, a charge field modulation mechanism for power semiconductor devices is introduced. This mechanism achieves holistic optimization of the electric potential and field distribution across the voltage-withstanding layer and the termination region through multi-dimensional modulation of the potential field Ep by the charge field Eq. The proposed mechanism applies to the design of voltage-withstanding layers and their corresponding terminations in both discrete and integrated power semiconductor devices, and it extends to wide-bandgap power semiconductor devices. In the future, this mechanism can be further integrated with artificial intelligence technology to enhance the efficiency of device design.
      Keywords: power semiconductor devices; voltage-sustaining layer; charge field; potential field; charge field modulation mechanism
      Updated: 2026-04-24