Latest Issue

    Vol. 53, Issue 10, 2025

      PAPERS

    • LI Yuan-hang, GU Peng-fei, ZHAO Hui, HE Zi, FAN Zhen-hong, DING Da-zhi
      Vol. 53, Issue 10, Pages: 3473-3482(2025) DOI: 10.12263/DZXB.20250392
      Abstract: Waveguide slot array antennas are key radiating components in microwave and millimeter-wave systems. They radiate because slots cut in the waveguide walls interrupt the surface currents. The offset, inclination angle, and length of the slots, together with their arrangement along the waveguide (e.g., resonant or non-resonant arrays), determine the antenna's radiation pattern, polarization, and impedance characteristics. Common configurations include longitudinal slots alternately offset on either side of the broad-wall centerline (used to form broadside arrays) and slots inclined at an angle to the axis (which can be used for frequency scanning or circular polarization). These antennas are compact, handle high power, and have low loss, so they are widely used in radar, communications, and electronic countermeasures. Traditional uniformly spaced linear waveguide slot arrays, however, are limited in frequency scanning range and sidelobe suppression. To overcome these constraints, this paper presents the design and fabrication of a serpentine ridge waveguide slot array antenna with a center frequency of 15.35 GHz. The structure incorporates a serpentine slow-wave line, which lengthens the propagation path of the electromagnetic wave and significantly increases the equivalent propagation phase constant. As a result, a given frequency change produces a larger beam deflection, enabling a wider angular frequency scan. Simulation and experimental results show that the proposed antenna scans from -30° to +30° with frequency while maintaining low sidelobes throughout the scan. The structure is also compact and low-profile, making it suitable for integrated systems with strict space requirements.
To further enhance performance, this paper employs the M-FOCUSS synthesis algorithm, which combines multiple measurement vectors (MMV) with the focal underdetermined system solver (FOCUSS), to sparsify the array. This method reduces the number of slot elements by approximately 28% while preserving radiation performance, achieving a sidelobe level of approximately -18 dB and maintaining favorable frequency scanning characteristics. The sparsification reduces the slotted aperture area on the waveguide surface and increases the power capacity by about 40%, which matters for high-power applications. Measurements agree well with full-wave simulations, validating both the serpentine ridge waveguide structure for expanding the beam scanning range and the M-FOCUSS sparse synthesis method. The design provides a new approach to wide-angle frequency scanning with low sidelobes in waveguide slot antennas and demonstrates the potential of sparse arrays for increasing power capacity and reducing manufacturing cost. It may also inform the development of next-generation high-performance scanning antennas.
      Keywords: waveguide slot antenna; ridge waveguide; frequency-scanning antenna; low sidelobe; sparse antenna
      Updated: 2026-02-05
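The sparse synthesis step in the abstract above can be illustrated with a minimal sketch of regularized M-FOCUSS. The abstract does not give the authors' formulation, so everything here is an assumption: `A` stands for a matrix mapping candidate slot excitations to pattern samples, each column of `B` is one measurement vector (for example the desired pattern at one frequency), and the exponent `p`, the regularizer `lam`, and the iteration count are illustrative choices.

```python
import numpy as np

def m_focuss(A, B, p=0.8, lam=1e-8, n_iter=50):
    """Regularized M-FOCUSS sketch: find a row-sparse X with A @ X ~= B."""
    # Start from the minimum-norm (non-sparse) solution.
    X = np.linalg.pinv(A) @ B
    for _ in range(n_iter):
        # One weight per row: rows with small norm are pushed towards zero.
        w = np.linalg.norm(X, axis=1) ** (1.0 - p / 2.0)
        AW = A * w  # scales column j of A by w[j], i.e. A @ diag(w)
        G = AW @ AW.conj().T + lam * np.eye(A.shape[0])
        Q = AW.conj().T @ np.linalg.solve(G, B)
        X = w[:, None] * Q
    return X

# Toy problem (hypothetical data): 30 candidate elements, 3 active, 2 snapshots.
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 30))
X_true = np.zeros((30, 2))
X_true[[4, 11, 25]] = rng.standard_normal((3, 2))
B = A @ X_true
X_hat = m_focuss(A, B)
```

Rows of `X_hat` whose norm stays far from zero mark the elements that would be kept; sharing one weight per row across all columns of `B` is what makes the selected support common to every measurement vector.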
    • A Lightweight Neural Speech Compression Method for Edge Devices

      LU Yu, FU Yong-jian, DING Dian, PAN Hao, XUE Guang-tao, REN Ju
      Vol. 53, Issue 10, Pages: 3483-3496(2025) DOI: 10.12263/DZXB.20250524
      Abstract: Neural audio compression methods perform remarkably well in low-bitrate speech reconstruction, but their high computational cost and deployment complexity limit practical use on edge devices. To address this, the paper proposes a lightweight neural speech compression system for resource-constrained scenarios such as mobile terminals. Building on the FunCodec framework, we redesign the encoder as a streamlined convolutional neural network and introduce a multi-objective knowledge distillation strategy that integrates perceptual alignment, spectral constraints, and adversarial training. Experimental results show that the proposed encoder significantly reduces model complexity and inference latency while maintaining comparable compression quality, enabling millisecond-level real-time speech encoding on edge devices. To improve transmission efficiency, we also present a Huffman coding-based entropy optimization method that adaptively encodes the residual quantization outputs, reducing storage by approximately 5% on average without compromising reconstruction quality. Overall, the system balances compression fidelity, computational efficiency, and deployability, making it well suited to real-world speech acquisition and processing on edge platforms.
      Keywords: audio compression; Huffman coding; knowledge distillation; edge computing
      Updated: 2026-02-05
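As a rough illustration of the entropy-coding step mentioned in the abstract above, the sketch below builds a Huffman code over a stream of quantizer indices. It is a textbook Huffman coder, not the paper's adaptive scheme; the index stream and the function names are made up for the example.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from observed frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap items: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 / 1.
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    inverse = {bitstring: s for s, bitstring in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return out

# Hypothetical codebook indices from one residual quantizer stage.
indices = [3, 3, 3, 3, 7, 7, 1, 3, 3, 7, 3, 0]
code = huffman_code(indices)
bits = encode(indices, code)
```

For this skewed toy stream the Huffman bitstring is shorter than a fixed 2-bit-per-index encoding (24 bits); the saving on real quantizer outputs depends on how non-uniform the index distribution is.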
    • DENG Shi-he, ZHANG Meng, SHEN Ya-fei, XIE Zhen-chao, WANG Wen-wei
      Vol. 53, Issue 10, Pages: 3497-3503(2025) DOI: 10.12263/DZXB.20250526
      Abstract: This paper presents the design and implementation of a low-insertion-loss, hermetic low-noise amplifier (LNA) module based on an indium phosphide high-electron-mobility transistor (InP HEMT) chip. Traditional E-plane probe waveguide packages are vulnerable to moisture during environmental testing; to address this, a hermetic vertical waveguide-to-microstrip transition is proposed, improving the reliability of the amplifier module in harsh conditions. In addition, a periodic gap waveguide structure placed around the waveguide short-circuit region effectively suppresses energy leakage and higher-order mode resonance in the quartz substrate. Simulations show that the proposed transition achieves a return loss better than -25 dB and an insertion loss below 0.3 dB across 87.5 to 90.5 GHz. Compensating for the parasitic inductance of the bonding wires improves the in-band return loss from -15 dB to -25 dB, reducing reflection during transmission. Measurements show a gain greater than 20 dB, an input return loss below -20 dB, a typical noise figure of 2.5 dB, and a total loss from the two-sided packaging of less than 1 dB within the operating band. The overall performance agrees well with the chip datasheet, validating the proposed design.
      Keywords: hermeticity; low-noise amplifier; gap waveguide; low-loss packaging; millimeter wave
      Updated: 2026-02-05
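To put the decibel figures in the abstract above into linear terms, the short calculation below converts them to power fractions. It assumes the abstract's sign convention, in which the quoted negative values are the reflected power relative to the incident power; the variable names are illustrative.

```python
def db_to_power_ratio(db):
    """Convert a power quantity in dB to a linear power ratio."""
    return 10.0 ** (db / 10.0)

# Reflected power fraction before and after bond-wire compensation.
reflected_before = db_to_power_ratio(-15.0)  # about 3.2 % of the incident power
reflected_after = db_to_power_ratio(-25.0)   # about 0.32 % of the incident power

# Fraction of power passed by a transition with 0.3 dB insertion loss.
transmitted = db_to_power_ratio(-0.3)        # about 93 %
```

Under this reading, the 10 dB improvement cuts the reflected power by a factor of ten.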
    • Dual-Band Waveform-Selective Metasurface Absorber

      TIAN Wen-liang, XING Rui, CHENG Yong-zhi, LUO Hui, LI Xiang-cheng
      Vol. 53, Issue 10, Pages: 3504-3513(2025) DOI: 10.12263/DZXB.20250520
      Abstract: Traditional metasurface absorbers (MSAs) typically operate in a single band and cannot distinguish pulse waves (PWs) from continuous waves (CWs). To address these limitations, this paper proposes a dual-band MSA based on full-wave rectifier nonlinear circuits. The unit cell has three layers: a top layer with two metal square-ring resonators of the same size loaded with nonlinear circuits, a middle dielectric layer, and a bottom metal ground plane. By combining capacitive and inductive nonlinear circuits and adding a parallel inductor to adjust the resonant frequency, four implementations of MSAs with dual-band waveform-selective absorption are developed. These MSAs independently absorb a specific waveform (CW or PW) in each of two frequency bands. The operating bands can be adjusted simply by changing the value of the parallel inductor, which makes the designs flexible and adaptable and helps overcome the limitations of existing absorbers in complex multi-spectrum scenarios. One of the four dual-band MSAs, the one loaded with an inductive nonlinear circuit and a capacitive nonlinear circuit with an additional parallel inductor, is taken as the representative case for investigating waveform selectivity. First, its selective absorption of CWs and PWs at different power levels is explored through electromagnetic-circuit co-simulation, clarifying the critical influence of the power threshold (-5 dBm) on diode rectification and impedance matching.
Second, the selectivity of the dual-band MSA for pulse waves under different pulse widths is studied, verifying its matching relationship with the circuit time constant. Subsequently, a detailed analysis is conducted on the influence of oblique incidence of transverse electric (TE)-polarized waves and transverse magnetic (TM)-polarized waves on the waveform-selective absorption performance of the dual-band MSA, confirming that this MSA exhibits wide-angle stability. Finally, this paper systematically elaborates on the influence of lumped circuit component parameters (RC, C, RL, L) on the selective absorption performance of the MSA for incident waves with different waveforms. The dual-band waveform-selective MSA proposed in this paper, with its flexible frequency band tunability and stable polarization/angle adaptability, provides an effective technical approach to address the issues of “pulse clutter filtering” and “inter-band interference suppression” in multi-band communications. It holds significant theoretical value and potential application prospects in the fields of antenna design, wireless communication signal optimization, and electromagnetic compatibility (EMC) protection.  
      Keywords: metasurface absorber (MSA); waveform selectivity; dual-band; continuous wave (CW); pulse wave (PW)
      Updated: 2026-02-05
    • TAN Ling, WANG Hai-feng, SONG Jing, YAO Yong-lei, XU Hai
      Vol. 53, Issue 10, Pages: 3514-3528(2025) DOI: 10.12263/DZXB.20250268
      Abstract: In power grid inspection, the transmission line inspection robot (TLIR) performs full-coverage inspection; its long-distance operation and high-frequency data acquisition place stringent demands on the real-time performance and energy efficiency of task processing. Mobile edge computing (MEC) deploys computing and offloading capabilities at the network edge and can effectively support real-time data processing for transmission line inspection. Traditional inspection strategies treat path planning and task offloading as two independent processes optimized in separate stages. This overlooks the dynamic associations and temporal coupling among variables and makes a global optimum in system performance hard to reach. To address the inconsistent decision timing in MEC-assisted dense transmission line inspection and the difficulty of balancing low-latency task demands against system energy saving, this paper proposes a granularity-nested multi-objective offloading strategy. It jointly optimizes path planning, task offloading, and resource allocation through a granularity-nested structure in which multiple single windows are embedded within a composite window. The composite window primarily controls TLIR path planning, while the single windows decide task offloading and resource allocation dynamically according to variations in communication conditions and available resources. This accommodates the different control periods of the optimization tasks while minimizing system delay and energy consumption.
To realize full-coverage inspection by the TLIR, an Eulerian-graph-based strategy analyzes the topology of the inspection scenario and constructs the shortest Eulerian circuit covering all transmission lines. Lyapunov optimization then transforms the long-term stochastic optimization of task backlog and energy management into a deterministic problem at the time-slot level. Given the complex coupling among optimization variables and the inconsistent decision timing, this paper further proposes a nested-granularity-aware multi-objective adaptive offloading algorithm (NGA-MOAO). It decomposes the original NP-hard problem into two subproblems and uses a cross-window joint optimization strategy based on single-window incentive feedback: the single windows generate incentive signals by dynamically adjusting task offloading and resource allocation, and a penalty term encoding the hard full-coverage constraint is superimposed on these signals to guide path planning in the composite window, achieving collaborative optimization among the variables. Simulation results show that, under different numbers of towers, weight-coefficient proportions, and task surges, NGA-MOAO achieves lower delay and energy consumption than the comparison schemes, with smaller fluctuations. Under full-coverage inspection, NGA-MOAO reduces inspection cost, energy consumption, and delay by at least 11.75%, 15.11%, and 8.32%, respectively, relative to the baselines, and increases resource utilization by at least 9.47%, making it applicable to full-coverage inspection in complex transmission line environments.
      Keywords: power grid inspection; full coverage; mobile edge computing; multi-objective; Eulerian graph; Lyapunov optimization; nested granularity
      Updated: 2026-02-05
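The full-coverage tour described in the abstract above can be pictured with Hierholzer's algorithm, a standard way to extract an Eulerian circuit. This is a sketch under a simplifying assumption that is not taken from the paper: the tower graph is already connected with every tower of even degree, so no line has to be duplicated. The tower labels and the function name are hypothetical.

```python
from collections import defaultdict

def eulerian_circuit(edges, start):
    """Hierholzer's algorithm on an undirected multigraph of (u, v) pairs."""
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    if any(len(nbrs) % 2 for nbrs in adj.values()):
        raise ValueError("every tower must have even degree")
    used = [False] * len(edges)
    stack, circuit = [start], []
    while stack:
        u = stack[-1]
        # Skip lines already traversed from their other endpoint.
        while adj[u] and used[adj[u][-1][1]]:
            adj[u].pop()
        if adj[u]:
            v, idx = adj[u].pop()
            used[idx] = True
            stack.append(v)
        else:
            circuit.append(stack.pop())
    if len(circuit) != len(edges) + 1:
        raise ValueError("graph is not connected")
    return circuit[::-1]

# Hypothetical layout: two triangles of transmission lines sharing tower "A".
lines = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D"), ("D", "E"), ("E", "A")]
tour = eulerian_circuit(lines, "A")
```

The returned tour starts and ends at the same tower and traverses each line exactly once. A layout with odd-degree towers would first need some lines duplicated, which this sketch does not attempt.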
    • WANG Tong, LIU Shang-he, CHEN Dong-wei, JIN Meng-zhe, FANG Qing-yuan
      Vol. 53, Issue 10, Pages: 3529-3539(2025) DOI: 10.12263/DZXB.20250490
      Abstract: With the widespread use of drone swarms in civilian fields, effective regulation requires key state parameters of the swarm, such as spatial angles and signal polarization, which can be obtained by joint direction of arrival (DOA) and polarization estimation with electromagnetic vector sensor arrays (EMVAs). However, when multiple non-cooperative unmanned aerial vehicles (UAVs) in a swarm are detected simultaneously in a complex electromagnetic environment, the joint DOA and polarization estimation performance of traditional methods on weak real UAV signal sources usually deteriorates when high-power interference impinges on the EMVA. Therefore, a joint DOA and polarization estimation method is proposed that exploits the invariant property of the noise subspace (IPNS) under the geometric algebra of Euclidean 3-space (G3) model. First, the impact of coexisting strong and weak signals on the parameter estimation performance of traditional G3-based subspace methods is studied using the expected spectrum of the multiple signal classification method in the G3 framework (G3-MUSIC). Then the invariance of the G3 noise subspace of the array covariance matrix is proved theoretically. Because the eigenvalues of the G3 noise subspace remain unchanged when the incident signal power increases, the method improves joint DOA and polarization estimation for weak real targets. In addition, by theoretically deriving how changes in the virtual source polarization parameters affect the invariance of the G3 noise subspace of the array covariance matrix, it is proved that the proposed method needs only a 2-dimensional spectral peak search, rather than a 4-dimensional one, to realize joint DOA and polarization estimation.
Simulations verify that traditional methods fail to distinguish the weak signals as the power of the strong interference increases. They also show that, when high-power signals impinge on an EMVA, the proposed method outperforms traditional methods across signal-to-noise ratios, strong-to-weak signal power ratios, and noise correlations. Compared with traditional methods based on the invariant noise subspace, the signal-to-noise ratio threshold for direction finding of the weak source is reduced by more than 3 dB, the accuracy of joint DOA and polarization estimation is improved by 88.7%, and the computational load is reduced by more than 97.11%. The proposed method can be used to obtain the locations of multiple UAVs and the polarization parameters of their emitted signals in non-cooperative drone swarms in complex electromagnetic environments, especially under high-power interference, and has potential application value in scenarios such as anti-interference for mobile wireless communication on UAV platforms.
      Keywords: parameter estimation of sources; resisting intensive interference; electromagnetic vector sensor; geometric algebra; invariant property of noise subspace; drone swarms
      Updated: 2026-02-05
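The invariance that the abstract above relies on can be checked numerically in a much simpler setting. The sketch below swaps the G3 electromagnetic vector sensor model for an ordinary complex-valued, half-wavelength uniform linear array with spatially white noise, which is an assumption made purely for illustration. It uses the exact covariance R = A P A^H + sigma^2 I, and the angles, powers, and function names are invented.

```python
import numpy as np

def ula_steering(angles_deg, n_sensors):
    """Steering matrix of a half-wavelength-spaced uniform linear array."""
    n = np.arange(n_sensors)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg))[None, :]
    return np.exp(1j * np.pi * n * np.sin(theta))

def noise_eigenvalues(angles_deg, powers, n_sensors=8, noise_power=0.1):
    """Smallest (n_sensors - n_sources) eigenvalues of R = A P A^H + sigma^2 I."""
    A = ula_steering(angles_deg, n_sensors)
    R = A @ np.diag(powers) @ A.conj().T + noise_power * np.eye(n_sensors)
    eigvals = np.linalg.eigvalsh(R)  # real eigenvalues in ascending order
    return eigvals[: n_sensors - len(powers)]

# Same two directions; the second source is made 1000 times stronger.
equal_power = noise_eigenvalues([-20.0, 35.0], powers=[1.0, 1.0])
strong_interferer = noise_eigenvalues([-20.0, 35.0], powers=[1.0, 1000.0])
```

In both cases the six noise eigenvalues equal the noise power (0.1 here) to numerical precision, however strong the second source becomes. With finite snapshots or correlated noise the picture is less clean, and those conditions are among the ones the abstract says were studied.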
    • SHAO Shu-yu, ZHANG Yang, YAN Wen-jing
      Vol. 53, Issue 10, Pages: 3540-3550(2025) DOI: 10.12263/DZXB.20250506
      Abstract: Traditional driving behavior recognition methods rely on external sensor data, are vulnerable to environmental interference, and struggle to reflect the driver's internal cognitive state. To address this, this paper constructs a multimodal physiological signal deep learning framework that integrates a Transformer and a convolutional neural network (CNN) to achieve high-precision recognition and interpretability analysis of driving behavior. The study uses the multimodal physiological dataset for behavior recognition (MPDB), which contains electroencephalogram (EEG), electrocardiogram (ECG), electromyography (EMG), and galvanic skin response (GSR) signals, and covers the complete pipeline from signal preprocessing and feature extraction to spatio-temporal fusion. After filtering, artifact correction, feature standardization, and time-frequency transformation, the signals of each modality are synchronously aligned into a spatio-temporal feature tensor that gives the different physiological modalities a unified representation. In the model, the CNN branch captures local spatio-temporal patterns and extracts short-term response features, while the Transformer branch uses self-attention to model long-term dependencies in the physiological signals and cross-modal interactions, so the model combines local sensitivity with global temporal modeling. The fusion network adopts a two-stream structure, combines multi-head attention with multi-scale convolution, and introduces a dynamic weight allocation mechanism for adaptive feature fusion. Training uses the AdamW optimizer and Dropout regularization to improve generalization and convergence stability.
Experimental results show that in the binary classification tasks (smooth driving/dynamic driving), the accuracy of the model reaches 94.9% and 98.75%, respectively. In five-class driving behavior recognition (smooth driving, acceleration, deceleration, lane changing, and turning), the average accuracy is 85.39%, significantly higher than that of a recurrent neural network (RNN), a long short-term memory (LSTM) network, a support vector machine (SVM), a CNN alone, and a Transformer alone. The model also achieves a good balance between F1 score and recall, confirming its advantages in multimodal signal representation and temporal dependency modeling. The training curves indicate fast convergence to a low loss value, suggesting that the framework is robust and not prone to overfitting. To enhance interpretability, the deep SHapley additive explanations (DeepSHAP) method is used for feature attribution analysis of the model's decisions. The analysis shows that high-frequency EEG signals (β and γ waves) and upper-limb EMG signals have a significant impact on acceleration operations, while the activity and reaction delay of the tibialis anterior muscle have a significant impact on lane-changing operations. The method reveals the physiological response patterns behind different driving operations and offers a new perspective on the neuro-behavioral hierarchy of drivers. In conclusion, the proposed Transformer-CNN fusion framework extracts the spatio-temporal features of multimodal physiological signals well.
It performs well in recognition accuracy, stability, and interpretability, provides practical technical support for intelligent driving monitoring systems, and points to a technical direction for applying multi-source signal modeling and explainable artificial intelligence in driving safety research. Future work will study the proposed method under naturalistic driving conditions so that it can be better applied to real-time monitoring of driving state and continuous risk prediction.
      Keywords: driving behavior; multimodal physiological signals; Transformer; convolutional neural network (CNN); SHAP
      Updated: 2026-02-05
    • LI Kang, YU Juan, HAN Jian-min, QIU Sheng, YANG Qiong
      Vol. 53, Issue 10, Pages: 3551-3565(2025) DOI: 10.12263/DZXB.20250307
      Abstract: Mobility trajectory data of individuals, vehicles, and other objects in urban environments contain rich information about residents' activities and are highly valuable for urban planning, traffic management, and epidemic spread analysis. However, privacy protection and commercial confidentiality significantly restrict the sharing and use of trajectory data. Releasing synthetic trajectories that preserve the characteristics of real ones has become a preferred way to overcome these restrictions. Recently, deep learning-based trajectory generation has attracted considerable attention from academia and industry, and various models based on generative adversarial networks, diffusion models, and other approaches have been proposed. Nevertheless, existing trajectory generation models have two major limitations: first, they fail to capture global spatial dependencies in human mobility patterns effectively; second, they inadequately model the influence of the urban environment on trajectory generation, so generated trajectories deviate from real-world scenarios. To address these limitations, this paper proposes a knowledge graph-driven trajectory generation model that integrates multi-source urban environmental information, named urban trajectory generation via knowledge graph-enhanced multi-source context fusion (KG-TrajGen). The model integrates key multi-source urban environmental data, including road network topology, points of interest (POIs), and functional zone classifications, to construct a foundational road knowledge graph (RKG) and an environment-semantics-enhanced road knowledge graph (E-RKG). A relational graph convolutional network learns basic road segment embeddings from the RKG, capturing both local and global spatial dependencies among roads.
Additionally, a structure-aware knowledge graph embedding method extracts urban environmental knowledge from the E-RKG, giving the model environmental awareness and enriching the road segment embeddings. A Transformer decoder then learns human activity patterns from historical trajectory data to obtain trajectory-history-enhanced road segment embeddings. Finally, by fusing the knowledge graph-enhanced and trajectory-history-enhanced road segment embeddings, the model generates environment-aware, fine-grained trajectories autoregressively. Experiments on two open-source real-world trajectory datasets show that KG-TrajGen significantly outperforms baseline methods in statistical feature error, frequent pattern feature error, and trajectory error metrics. Moreover, the generated trajectories perform better than those from baseline methods in downstream trajectory analysis tasks such as traffic flow prediction, validating the effectiveness of KG-TrajGen. The code for KG-TrajGen is available at https://github.com/trajgen/KG-TrajGen.
      Keywords: knowledge graph; road network; trajectory generation; privacy protection; urban environment; Transformer
      Updated: 2026-02-05
    • A Malware Classification Method Based on MP-FSCIL

      WANG Jian, LIU Qiang, WANG Lei
      Vol. 53, Issue 10, Pages: 3566-3578(2025) DOI: 10.12263/DZXB.20250692
      Abstract: Malware families continue to mutate through techniques such as code obfuscation and polymorphism, which shifts the feature space and invalidates model decision boundaries. In addition, rapidly evolving zero-day attacks and the few samples available at an early stage further aggravate the knowledge degradation and adaptation bottleneck of traditional detection models. To address these problems, this paper proposes a malware classification method based on multi-prototype few-shot class-incremental learning (MP-FSCIL), which aims to mitigate catastrophic forgetting and overfitting in dynamic environments. In the base-class training stage, large separable kernel attention (LSKA) is fused with DenseNet to build a dedicated feature extractor for malware images. The large-kernel attention of the LSKA module captures the global features of malware images, while the dense connections of DenseNet preserve fine-grained local features, addressing the insufficient capture of key features in malware images by traditional feature extractors. The proposed model achieves a classification accuracy of 99.36% on the Malimg dataset, better than existing few-shot class-incremental learning (FSCIL) methods on the same dataset. In the new-class adaptation stage, a collaborative mechanism of “adaptive clustering and multi-prototype learning” is constructed: the G-means algorithm iterates automatically over the distribution of malware features to determine the optimal number of clusters for each new class, and multi-prototype learning then generates multiple prototypes for each class.
This strategy addresses the weakness of traditional single-prototype methods in distinguishing malware families with high intra-class feature heterogeneity, and increases the model's average accuracy on new classes by 17.23% in each incremental session. Cross-dataset class-incremental experiments on the Malimg and Microsoft Big 2015 datasets validate the model in realistic scenarios of malware evolution. The results show that MP-FSCIL learns new-class features well while retaining its memory of old classes. Compared with existing methods, the model improves classification accuracy by 8.89% over all classes, and the performance degradation rate in the last incremental session drops to 12.21%. Moreover, the model has only 16.18 M parameters and an inference time of only 12.6 ms per sample, so it is suitable for practical deployment and provides a robust and scalable solution for malware detection in open, dynamic environments.
      Keywords: network security; malware; multi-prototype; few-shot class-incremental learning; large separable kernel attention; G-means algorithm
      Updated: 2026-02-05
    • ZHANG Yan, LUO Xiang-yu, QIN Zi-yue, ZHANG Miao, LI Zhi-fei
      Vol. 53, Issue 10, Pages: 3579-3592(2025) DOI: 10.12263/DZXB.20250485
      Abstract: As a critical tool for cybersecurity knowledge modeling, vulnerability knowledge graphs play an increasingly important role in key tasks such as vulnerability analysis, threat modeling, security situational awareness, and attack chain tracking. General-purpose knowledge graphs cover a wide range of domains, have long update cycles, and focus on generic knowledge and relationship modeling. Vulnerability knowledge graphs, by contrast, are updated frequently, face data heterogeneity, semantic ambiguity, and knowledge sparsity, and often need to incorporate unstructured descriptive information for joint modeling. However, existing methods are still limited to the triple-based modeling paradigm and ignore the rich security text descriptions in cybersecurity knowledge bases, which limits the accuracy of vulnerability knowledge graph completion and attack chain prediction. To address this, this paper constructs a vulnerability description knowledge graph (VKG-T), which improves the completion of vulnerability and weakness information by combining structural and semantic data. It also presents a dual-modality perception aggregation model for vulnerability description knowledge graph completion (VKGC-ST). The model integrates graph attention networks (GATs) with pre-trained language models, considers both the structural adjacency features of entities and their textual descriptions, and employs multi-level negative sampling and contrastive learning to improve semantic discrimination and structural correlation modeling.
In link prediction experiments on the vulnerability knowledge graph VKG-T and the general datasets FB15K-237 and WN18RR, VKGC-ST achieves the best performance across all metrics. On the vulnerability description knowledge graph dataset, the average improvement is 9.42% and the maximum improvement is 15.51%, demonstrating strong generalization ability and domain adaptability.
      Keywords: vulnerability knowledge graphs; knowledge graph completion; knowledge representation learning; dual-modality perception; contrastive learning; attack chain prediction
      Updated: 2026-02-05
    • A Kernel Data Race Detection Method Based on Inference-Verification Mode

      ZHENG Hao-ran, BAI Jia-ju, ZHANG Cen, GUAN Zhen-yu
      Vol. 53, Issue 10, Pages: 3593-3607(2025) DOI: 10.12263/DZXB.20250792
      Abstract: Data races are among the most critical concurrency issues in operating system kernels. A data race occurs when two or more kernel execution threads concurrently access the same shared memory location without proper synchronization, and at least one of the accesses is a write. Data races can cause data corruption, logical errors, and kernel crashes, and can even be exploited by attackers to construct privilege escalation or denial-of-service attacks. Thus, designing efficient and precise data race detection mechanisms during the operating system development and testing phases is crucial for ensuring system stability and security. However, the complexity and non-determinism of the kernel’s concurrent environment pose significant challenges to data race detection. Existing dynamic detection methods suffer from high runtime overhead and a weak capability to find complex races, as they need to track locksets or happens-before relationships and rely heavily on spontaneous thread interleavings produced by the kernel. These issues severely impact the efficiency and accuracy of data race detection. To address these challenges, this paper proposes RIV (Racepair Inference-Validator), a kernel data race detection method based on an “inference-verification” model. RIV first infers potential racy variable pairs by analyzing thread execution traces and memory access patterns, and then performs directed verification of these pairs through memory watchpoints and delay injection, achieving precise detection and reliable reproduction of data races. Meanwhile, RIV uses static taint analysis to identify potential shared variables, reducing code instrumentation and decreasing runtime overhead. By capturing the memory addresses and timing information of variable accesses, RIV ensures high detection accuracy. To validate the effectiveness of RIV, we conducted experimental evaluations on six widely used Linux kernel modules. We discovered 31 real data races with no false positives, 12 of which have been confirmed by Linux kernel developers. In performance comparisons, RIV demonstrated performance improvements of 1.5 times, 6.7 times, and 1.8 times over the existing kernel data race detection methods KCSAN, DILP, and SDILP, respectively. Furthermore, owing to its “inference-verification” model, RIV discovered more real data races, showing that it addresses the core weakness of existing methods in finding complex races. In conclusion, RIV provides an efficient, precise, and practical automated detection solution for the concurrency security of operating system kernels.
      Keywords: operating system kernel; data race detection; dynamic analysis; watchpoint sampling
      Updated: 2026-02-05
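The data-race definition in the abstract above (same location, different threads, at least one write) can be illustrated with a small sketch of the "inference" step. This is not RIV's algorithm: the `Access` record, the `candidate_race_pairs` function, and the time-window heuristic standing in for "concurrent" are all assumptions made for illustration, and the watchpoint-based verification step is not shown.

```python
from collections import namedtuple
from itertools import combinations

# One recorded memory access. All field names here are hypothetical.
Access = namedtuple("Access", ["thread", "addr", "is_write", "time"])


def candidate_race_pairs(trace, window):
    """Return access pairs that match the data-race definition and are close in time."""
    pairs = []
    for a, b in combinations(trace, 2):
        if (
            a.addr == b.addr                       # same shared memory location
            and a.thread != b.thread               # two different threads
            and (a.is_write or b.is_write)         # at least one access is a write
            and abs(a.time - b.time) <= window     # assumed proxy for "concurrent"
        ):
            pairs.append((a, b))
    return pairs


trace = [
    Access("T1", 0x1000, True, 10),
    Access("T2", 0x1000, False, 12),   # read close in time to the write above
    Access("T1", 0x2000, False, 20),
    Access("T2", 0x2000, False, 21),   # two reads: never a race
    Access("T2", 0x1000, True, 500),   # too far apart in time from T1's write
]
pairs = candidate_race_pairs(trace, window=5)
print(len(pairs))
```

A pair returned here is only a candidate. In an inference-verification scheme, each candidate would still need to be confirmed by forcing the two accesses to actually overlap.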
    • CHEN Yi-lei, XIONG Sheng-wu
      Vol. 53, Issue 10, Pages: 3608-3621(2025) DOI: 10.12263/DZXB.20240685
      Abstract: Speaker-independent visual dubbing aims to edit the lip movements of talking face videos according to speech signals, ensuring high audio-visual synchronization and natural fidelity. This task not only requires accurate lip-sync performance but also demands consistent facial texture and identity preservation. However, existing methods often suffer from texture inconsistencies between the restored and original facial regions when natural head movements occur, leading to unstable generation quality. To address these challenges, this paper proposes a cross-modal semantic enhanced and 3D face-guided motion-texture synergistic generation network. Specifically, we adopt 3D morphable models (3DMM) as an intermediate representation and decompose the task into two submodules: cross-modal semantic enhanced 3DMM expression coefficient prediction and 3D face-guided motion-texture synergistic rendering. In the first stage, a cross-modal attention mechanism integrates Wav2Lip-generated semantic image sequences with audio features, significantly improving synchronization accuracy and geometric consistency. In the second stage, a 3D face-guided rendering network leverages multi-reference faces and reconstructed 3D geometry to enhance texture consistency under head motion, while a multi-task learning framework further refines visual fidelity between the restored and real facial regions. Extensive experiments on the VoxCeleb1 and VoxCeleb2 datasets demonstrate that the proposed method achieves superior performance in generation fidelity, motion robustness, and synchronization compared with state-of-the-art approaches. On VoxCeleb1, our method improves peak signal-to-noise ratio (PSNR) by 7.76, reduces learned perceptual image patch similarity (LPIPS) by 0.08, increases structural similarity index measure (SSIM) by 0.11, decreases landmark distance (LMD) by 1.10, and improves lip-sync score (Sync) by 0.20 over the baseline. On VoxCeleb2, it improves PSNR by 7.12, reduces LPIPS by 0.10, increases SSIM by 0.11, decreases LMD by 1.10, and improves Sync by 0.15. These results verify the effectiveness and robustness of the proposed framework under complex head movements and diverse identities.
      Keywords: visual dubbing; speaker-independent talking face; cross-modal attention; 3D face modeling; motion-texture synergistic generation
      Updated: 2026-02-05
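For readers unfamiliar with the PSNR figures quoted in the abstract above, the following minimal sketch computes PSNR from its standard definition, 10·log10(MAX² / MSE). The function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions for illustration; the paper's own evaluation code is not shown in the source.

```python
import math


def psnr(img_a, img_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [100, 100, 100, 100]
restored = [110, 90, 100, 100]
score = psnr(reference, restored)
print(round(score, 2))
```

Higher values mean the restored image is closer to the reference, which is why the abstract reports PSNR gains as improvements.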
    • Boundary Thinking for Cognitive Uncertainty Problems

      ZHANG Qing-hua, HONG Cheng-xin, ZHAO Fan, GAO Man, CHENG Yun-long, WANG Guo-yin
      Vol. 53, Issue 10, Pages: 3622-3639(2025) DOI: 10.12263/DZXB.20250577
      Abstract: Uncertainty problems are ubiquitous in the real world and have a significant impact on human understanding, cognition, and decision-making, making them an important topic in uncertain artificial intelligence research. Although artificial intelligence has made some progress in dealing with uncertainty problems, effectively addressing cognitive uncertainty remains a challenging task. Uncertainty arises primarily from the conceptual boundary and from insufficient information to characterize it. Therefore, how to accurately identify the boundary for uncertainty and effectively deal with it has become an important scientific problem in the field of artificial intelligence. This paper first summarizes the theoretical models and methods for dealing with uncertainty, revealing that the study of cognitive uncertainty is essentially the study of the transition state (boundary) between two opposing states (certainty), that is, the problem of identifying and dealing with the boundary. Second, the forms in which uncertainty presents itself in different dimensions are analyzed from the perspective of the cognitive uncertainty boundary, such as the precise boundary of “point, line, and surface” and the fuzzy boundary of “interval, region, and space”. Finally, the boundary theory for uncertainty is discussed and summarized, and future research questions and directions are outlined. This study provides a new perspective on cognitive uncertainty and aims to promote the development and refinement of the boundary theory for uncertainty.
      Keywords: uncertainty problems; boundary; rough set; fuzzy set; state transition
      Updated: 2026-02-05
    • LI Yong-ming, HU Jie, ZHANG Xiao-heng, WANG Pin, LI Wen-zheng
      Vol. 53, Issue 10, Pages: 3640-3658(2025) DOI: 10.12263/DZXB.20250729
      Abstract: Pedestrian trajectory prediction has significant applications in autonomous driving, intelligent surveillance, and smart cities. However, the complexity and unpredictability of pedestrian interactions make this task a persistent challenge. Current models face two common limitations. (1) They consider only a single type of social coupling. This introduces redundant interactions and, more critically, fails to account for the varying nature of trajectory coupling across different scenarios and between different pedestrians. Consequently, models cannot deeply explore or effectively utilize diverse scene features. (2) They handle domain shift inadequately. The few methods that address domain shift rely on domain distribution alignment based on statistical criteria. Such approaches depend strongly on predefined statistical metrics, which leads to significant limitations in complex, dynamic environments. To address these issues, this paper proposes a hierarchical envelope adversarial domain adaptation model with loose-tight coupling. First, an envelope sample transformation mechanism is designed, which constructs envelope samples and extends them into graph structures. Second, an adversarial domain adaptation module is developed, which integrates both local and global domain adaptation strategies. Meanwhile, a loose-tight coupling envelope sample construction module is created, which dynamically adapts to diverse coupling relationships across scenarios. These innovations collectively enhance prediction accuracy and robustness. The experiments use two representative public datasets for validation and include comprehensive comparisons with six relevant baseline algorithms. Results demonstrate that our model achieves significantly higher accuracy than existing methods, with the average displacement error (ADE) and final displacement error (FDE) reduced by 17.6% and 19.1%, respectively. The time overhead meets practical requirements, which verifies the effectiveness of our key innovations.
      Keywords: pedestrian trajectory prediction; social interaction; adversarial domain adaptation; envelope sample transformation; loose-tight coupled
      Updated: 2026-02-05
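The ADE and FDE metrics reported in the abstract above have standard definitions: ADE averages the Euclidean error over every predicted time step, and FDE takes the error at the final step only. The sketch below computes both for a single trajectory; the function name `ade_fde` and the list-of-(x, y) representation are assumptions for illustration, not the paper's evaluation code.

```python
import math


def ade_fde(pred, truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) positions."""
    errors = [math.dist(p, t) for p, t in zip(pred, truth)]
    ade = sum(errors) / len(errors)   # mean error over all time steps
    fde = errors[-1]                  # error at the final time step
    return ade, fde


pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, truth)
print(ade, fde)
```

In this example the prediction drifts steadily from the ground truth, so FDE (2.0) is larger than ADE (1.0), which is the usual pattern for trajectory predictors.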
    • ZHU Song-hao, WANG Shuang-cheng
      Vol. 53, Issue 10, Pages: 3659-3670(2025) DOI: 10.12263/DZXB.20250309
      Abstract: This article addresses a novel and challenging problem of knowledge transfer from the source domain and the intermediate domain to a single target domain, where each category in the target domain has few labeled samples. The knowledge transfer process in this situation faces two difficulties. First, the target data is extremely scarce, resulting in an insufficient target domain feature distribution. Second, existing few-shot learning methods often extract features from each part indiscriminately, resulting in poor performance in few-shot object detection. To solve these problems, this paper proposes a few-shot multi-source domain object detection method. A new meta optimization mechanism is proposed to align the source domain and target domain by introducing a mixed domain, alleviating the problem of scarce feature distribution in the target domain. First, image-level mixing is used to generate mixed images, which together with the corresponding labels form the first mixed domain. Then, fine-grained features are generated through a dual-channel attention mechanism, and feature-level mixing is used to generate mixed features, which together with the corresponding labels form the second mixed domain. Finally, region of interest (ROI) features are generated through a region proposal network and a region of interest network, and ROI-level mixing is then used to generate mixed ROI features, which together with the corresponding labels form the third mixed domain. The three generated mixed domains are used together to calculate the loss function and complete the meta optimization process. A dual-channel attention mechanism comprising convolutional layers and feature calibration is proposed to learn more discriminative deep feature representations, where the convolutional layers prevent the loss of key spatial information and feature calibration selectively enhances important features and weakens unimportant ones. First, the convolutional layer submodules generate coarse-grained feature representations. Second, the feature calibration submodules establish attention weights based on the correlation between features, and these attention weights are integrated with the original features to selectively enhance important regions while suppressing unimportant regions. Extensive experiments on the COCO and PASCAL-VOC datasets demonstrate the effectiveness and robustness of the proposed method. It surpasses other methods in the same field in detection performance while maintaining good generalization performance on different datasets. Furthermore, the model compares favorably with other methods in the same field in parameter count.
      Keywords: few-shot multi-source domain object detection; cross-domain meta optimization; dual-channel attention
      Updated: 2026-02-05
    • LIANG Xin, JIANG Lin, PENG Chao, LIU Zhi-yi, HUANG Yi-sha, AI Yan-di
      Vol. 53, Issue 10, Pages: 3671-3691(2025) DOI: 10.12263/DZXB.20250412
      Abstract: Semantic segmentation in complex scenes often struggles to balance global semantic context and local fine-grained details during feature fusion. To address this issue, we propose the dynamic Kolmogorov-Arnold network (DynKANet), a novel segmentation framework inspired by the Kolmogorov-Arnold representation theorem. The proposed architecture comprises a multi-level feature extraction module and a dynamic feature fusion module. Specifically, the feature extraction stage integrates a residual-connected U-shaped context enhancement module for robust global semantic modeling and a difference-map-based refinement module to enhance local detail representation. Building on these representations, we design a KA-inspired dynamic fusion module that decomposes the fusion process into nested inner and outer functions, enabling precise modeling of complex interactions between global and local features. A self-adaptive dynamic fusion strategy is incorporated to ensure complementary integration and mitigate the conflict between different feature types. Additionally, we introduce a dynamic compound loss function (CE/TopK+Dice) guided by a conditional trigger mechanism to strengthen the network’s ability to learn from hard samples. Extensive experiments on 10 public datasets spanning 5 domains demonstrate that DynKANet achieves consistent improvements, with an average performance gain of 8% per dataset. These results highlight the strong generalization capability and practical potential of the proposed approach for real-world semantic segmentation tasks in challenging scenarios.
      Keywords: semantic segmentation; Kolmogorov-Arnold representation theorem; dynamic feature fusion; global semantics; local details
      Updated: 2026-02-05
    • JIANG Wei, GUAN Meng-yi, WEI Fu-peng, SUN Hao-chen, MENG Yao, WU Hui-xin
      Vol. 53, Issue 10, Pages: 3692-3704(2025) DOI: 10.12263/DZXB.20250259
      Abstract: Graph convolutional networks (GCNs) have been extensively applied to skeleton-based action recognition and have achieved remarkable performance. However, as the number of action categories and scene complexity increase, existing methods still face significant challenges in modeling detailed human body structures and temporal dependencies, which can be summarized as two main issues. First, when extracting relational features among joints, these methods often inadequately capture the interactions between peripheral joints (such as hands, feet, and head) and their synergistic effects with other joints. Second, when extracting temporal features, these methods focus on short-term temporal feature extraction and neglect long-term dependencies. To address these issues, this paper proposes an enhanced spatiotemporal graph convolutional network (EST-GCN), which consists of multi-branch spatial enhanced graph convolution (MSEGC) and multi-scale temporal enhanced convolution (MTEC) modules. The MSEGC module enhances the feature representation of peripheral joints by capturing relationships between peripheral joints and others through multi-stage learning and propagation within a two-stream graph convolution framework. Meanwhile, the MTEC module effectively captures long-term temporal dependencies across frames through multi-stage learning and propagation of temporal features from multi-scale convolutions, thereby expanding the temporal receptive field. The model sequentially extracts and fuses spatial and temporal features via MSEGC and MTEC, jointly modeling joint structural correlations and temporal dependencies to improve the discriminability of spatial-temporal features. To fully exploit the spatial-temporal information of skeleton data, three types of input features (joint positions, motion velocities, and bone features) are introduced and fused through a multi-stream strategy to enhance feature representation. The proposed method achieves accuracies of 92.4% and 96.2% on the X-Sub and X-View benchmarks of the NTU-RGB+D dataset, respectively, and 88.7% and 90.0% on the X-Sub and X-Setup benchmarks of the NTU-RGB+D 120 dataset, which validates its effectiveness. Furthermore, to validate the model’s performance in real-world scenarios, additional skeleton-based action recognition experiments are conducted on video samples from the NTU-RGB+D dataset, including tests under multi-person interactions and joint noise interference. The results show that the model can still achieve accurate recognition even when local joint misassignments occur, further verifying the practicality and robustness of the proposed approach.
      Keywords: action recognition; skeleton sequence; graph convolutional network; multi-branch spatial enhanced graph convolution; multi-scale temporal enhanced convolution; spatial-temporal features; multi-stream strategy
      Updated: 2026-02-05
    • ZHANG Xi-wei, FANG Xian-wen, MAO Gu-bao
      Vol. 53, Issue 10, Pages: 3705-3717(2025) DOI: 10.12263/DZXB.20250599
      Abstract: Amidst deepening digital transformation, data-driven process analysis, with Predictive Process Monitoring (PPM) at its core, has become pivotal for enhancing enterprise operational efficiency and decision-making. To improve the accuracy and generalization of PPM, existing research focuses on mining deep representations from vast event logs. However, the evolution of real-world business processes is influenced not only by temporal logic but also by underlying structural factors such as resource allocation and data dependencies. This complexity poses a formidable challenge to the representational capabilities of existing predictive models. Specifically, the performance of mainstream predictive methods is often constrained by their reliance on a single process view and static information fusion strategies. Most approaches, even those based on Graph Neural Networks (GNNs), tend to model processes from a single control-flow perspective. This overlooks critical dimensions such as resource interactions and data dependencies, creating a gap in the representation of deep process structures and multi-dimensional relationships. Furthermore, the few studies that attempt to integrate multi-dimensional information typically employ static fusion strategies, lacking a context-aware fusion capability and resulting in models with insufficient adaptability. To address these challenges, this paper proposes a context-aware multi-view graph fusion (CAM-GF) framework. The framework first transcends the limitations of the control-flow perspective by systematically constructing a process graph map. This graph map comprises not only basic control-flow views, such as a long-term dependency graph that captures macroscopic patterns, but also extended semantic views, like a resource interaction graph that reveals organizational collaboration, thereby capturing holistic and multi-level structural knowledge. Subsequently, a novel context-aware graph attention mechanism is designed for spatio-temporal information fusion. It takes the real-time prefix of a case as input to dynamically learn and assign fusion weights to each view. Finally, a Transformer is introduced to perform deep temporal modeling on the dynamically fused feature sequence to achieve precise next-activity prediction. To validate the framework’s effectiveness and practical value, comprehensive experiments were conducted on six public, real-world business process datasets. The results demonstrate that, compared to various mainstream baseline models, the CAM-GF framework achieves an average accuracy improvement of 4.16 percentage points on the next-activity prediction task. Furthermore, the dynamically generated attention weights make the model’s behavior interpretable, revealing how the model, based on predictive feedback and real-time context, can both rely on global process structures when local context fails and pivot to focus on critical semantic views, such as resource allocation, in specific situations. This validates the proposed framework’s advancement in both accuracy and transparency.
      Keywords: predictive process monitoring; graph attention networks; multi-view representation; context-aware fusion; interpretability
      Updated: 2026-02-05
    • Concept Evolution Detection Method for Open Feature Space

      SU Rui, GUO Hu-sheng, WANG Jing, WANG Wen-jian
      Vol. 53, Issue 10, Pages: 3718-3729(2025) DOI: 10.12263/DZXB.20250416
      Abstract: In many real-world scenarios, data is continuously generated in the form of streams. Due to the dynamic nature of streaming data, new categories may emerge during the generation process, which is known as concept evolution. Concept evolution is one of the primary challenges leading to the degradation or even failure of predictive performance in stream mining models. Therefore, concept evolution detection methods capable of promptly identifying changes in the class space and alerting models to perform adaptive adjustments have attracted widespread attention. However, most current concept evolution detection methods are built on the assumption that the feature space is static. In real scenarios, the feature space is also dynamic and open. Specifically, over time, some features may disappear and new features may emerge, violating this assumption and causing existing algorithms to fail. To address this problem, this paper proposes a concept evolution detection method for open feature space (CD_OF). The method constructs a micro-cluster ensemble model to classify incoming instances. For old features that disappear in the open feature space, the information they contain is converted to the shared features through a transfer matrix; for newly emerged features, the shared feature space is expanded and the ensemble model is reconstructed. On this basis, inter-sample similarity is defined based on the shared neighborhood information of the samples to detect concept evolution, and a dynamic decay model is established to solve the class vanishing and class recurrence problems in the open feature space. The experimental results show that the proposed method responds to feature changes in the open feature space in a timely manner and enhances the ability to detect concept evolution. The error rate is reduced by 1.7% to 11.4% compared to existing methods on real streaming data with feature space variations.
      Keywords: concept evolution; open feature space; feature similarity measure; shared neighborhood; dynamic decay model; online learning
      Updated: 2026-02-05
    • YE Deng-pan, TANG Long, CHEN Si-run, LIU Zi-yi, LÜ Yun-na, SHI Xiu-wen
      Vol. 53, Issue 10, Pages: 3730-3743(2025) DOI: 10.12263/DZXB.20250596
      Abstract: Fine-tuning text-to-image diffusion models enables high-quality customized image generation, yet it also introduces risks of privacy leakage and potential misuse for opinion manipulation. Current research primarily focuses on prompt- or image-level adversarial attacks to counter model customization; however, it overlooks the inter-modal correlation between prompt- and image-level adversarial perturbations, as well as the adversarial interplay among the model’s internal functional modules. This limitation restricts the practical effectiveness of existing anti-customization methods. To address this, we propose dual anti-diffusion (DADiff), a two-stage framework that integrates prompt-level adversarial attacks into the generation of image-level adversarial examples. In the first stage, DADiff generates adversarial prompt vectors to guide the subsequent image-level perturbation. In the second stage, beyond performing an end-to-end attack on the diffusion UNet, DADiff further perturbs its self-attention and cross-attention modules, aiming to break pixel-wise correlations and enforce consistency by aligning the cross-attention maps derived from the original instance prompt and those from the adversarial prompt vector. Additionally, DADiff introduces a local-random timestep gradient ensemble strategy, which updates adversarial perturbations by aggregating stochastic gradients sampled from multiple segmented timestep intervals. Experimental results on mainstream facial and artistic style datasets show that DADiff achieves an average performance improvement of 20% over existing methods across cross-prompt, keyword-mismatch, and cross-model anti-customization scenarios.
      Keywords: text-to-image generation; diffusion models; model fine-tuning; adversarial examples; model anti-customization
      Updated: 2026-02-05
    • LÜ Liang, LAN Jie, LAN Meng, LU Xian-kai, ZHANG Le-fei
      Vol. 53, Issue 10, Pages: 3744-3758(2025) DOI: 10.12263/DZXB.20250483
      Abstract: Semi-supervised semantic segmentation of high-resolution remote sensing images aims to leverage a small number of labeled samples together with a large amount of unlabeled data for joint training, thereby enhancing the performance of semantic segmentation models. This approach not only significantly reduces the cost of manual annotation but also fully exploits the potential value of unlabeled data. Existing methods typically divide high-resolution remote sensing images into multiple sub-views for training, focusing primarily on enforcing prediction consistency under different perturbations of the same view. However, such strategies often overlook the semantic and spatial relationships between different views, limiting the model’s ability to learn broader contextual information when labeled data are scarce. To address this issue, this paper proposes a cross-view context-aware semi-supervised semantic segmentation method for high-resolution remote sensing images. The proposed approach explicitly models the contextual interactions among multiple views to improve the quality of pseudo labels and introduces a multi-level cross-view consistency constraint to maintain prediction consistency within a broader contextual scope. Specifically, during training, multiple overlapping views, including a primary view and several contextual views, are sampled from the original high-resolution image and jointly fed into the model. A spatial-aware interaction fusion (SIF) module is designed to perform cross-view feature interaction and fusion via cross-attention and self-attention mechanisms. This module generates spatial attention activation maps that adaptively fuse the predictions from different views, thereby improving pseudo label accuracy. In addition, a multiple cross-view context consistency (CVCC) mechanism is introduced to enforce consistent predictions in overlapping regions by aligning their spatial correspondences. This constraint enhances the model’s ability to perceive and model cross-view contextual information, mitigating semantic ambiguity caused by view variations. To comprehensively evaluate the proposed method, extensive experiments are conducted on the Vaihingen and Potsdam datasets provided by the International Society for Photogrammetry and Remote Sensing, under various annotation ratios. Results show that the proposed method consistently outperforms state-of-the-art semi-supervised segmentation approaches. In particular, under an extremely low-label setting using only one labeled image, it achieves 6.84% and 12.73% mIoU improvements over the supervised baseline on Vaihingen and Potsdam, respectively, validating its superior performance and strong generalization under limited annotation.
      Keywords: remote sensing; semantic segmentation; semi-supervised learning; cross-view context consistency; spatial-aware interaction fusion; pseudo-label
      Updated: 2026-02-05
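The mIoU figures in the abstract above refer to mean intersection over union, the standard segmentation metric. The sketch below computes it over flat label lists; the function name `mean_iou`, the flat-list representation, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration rather than the paper's evaluation protocol.

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection over union for two equal-length lists of class labels."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes that appear in neither list
            ious.append(inter / union)
    return sum(ious) / len(ious)


pred = [0, 0, 1, 1]
truth = [0, 1, 1, 1]
miou = mean_iou(pred, truth, num_classes=2)
print(round(miou, 4))
```

Here class 0 scores 1/2 and class 1 scores 2/3, so a single mislabeled pixel lowers both per-class IoUs, which is what makes the metric sensitive to boundary errors.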
    • Recognition Method of Ship Targets for SAR Based on M3Net

      WANG Hao-tian, JI Zhen-yuan, HUA Qing-long, GUO Zhao-xin, ZHANG Yun
      Vol. 53, Issue 10, Pages: 3759-3772(2025) DOI: 10.12263/DZXB.20250356
      Abstract: To address the challenges of significant intra-class variations and high inter-class similarities in ship target recognition within synthetic aperture radar (SAR) images, this paper proposes a novel recognition method based on a multi-branch, multi-information, multi-depth feature fusion complex-valued network (M3Net). Traditional methods predominantly rely on manually designed amplitude features, failing to fully exploit the inherent complex-valued nature of raw SAR data and neglecting the crucial phase information and its coupling relationship with amplitude. This limitation results in insufficient characterization of ships’ fine structures and ultimately restricts recognition accuracy and model generalization capability. Through in-depth analysis of the noncircularity and complex signal kurtosis characteristics of ship targets, this study reveals that these features can effectively characterize the scattering properties distinguishing ships from the sea background, highlighting the representational advantages of complex-domain statistics for ship scattering characteristics. Building on this foundation, a deep complex feature extraction module (CFEM) is designed. This module employs complex-valued convolutional operations to extract amplitude-phase coupled features and introduces a cross-fusion of real and imaginary activation (CRIA) function. The CRIA mechanism, utilizing a dual-activation function cross-coupling approach, achieves nonlinear feature interactions and enhances the representational capacity for complex-valued features. Furthermore, the multi-branch, multi-information, multi-depth fusion network M3Net is constructed. M3Net integrates a core complex-valued convolutional neural network (CV-CNN) backbone, a pre-trained CFEM branch, and a real-valued feature branch. By incorporating a complex-domain attention mechanism, M3Net achieves dynamic weighted fusion of these heterogeneous features, adaptively highlighting the most discriminative feature channels. Experimental results on the reconstructed OpenSARship dataset demonstrate the effectiveness of the proposed method. Compared to the traditional CV-CNN, our approach achieves a 5.89% improvement in overall accuracy and reduces the maximum accuracy deviation across classes to 6.82%, significantly enhancing category balance.
      Keywords: synthetic aperture radar (SAR); complex-valued neural network; ship targets; recognition; complex-valued feature
      Updated: 2026-02-05
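
      The complex-valued convolution mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' CFEM or CRIA implementation, whose details the abstract does not give; it only shows how one complex convolution decomposes into four real convolutions, which is what couples the real and imaginary (and hence amplitude and phase) information. The function name, array sizes, and random inputs are assumptions chosen for illustration.

```python
import numpy as np


def complex_conv1d(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Full 1-D convolution of two complex arrays via four real convolutions.

    Uses (a + ib) * (c + id) = (ac - bd) + i(ad + bc), applied tap by tap.
    """
    xr, xi = x.real, x.imag
    wr, wi = w.real, w.imag
    real_part = np.convolve(xr, wr) - np.convolve(xi, wi)
    imag_part = np.convolve(xr, wi) + np.convolve(xi, wr)
    return real_part + 1j * imag_part


# Hypothetical inputs: 8 complex samples and a 3-tap complex kernel.
rng = np.random.default_rng(0)
x = rng.standard_normal(8) + 1j * rng.standard_normal(8)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

y = complex_conv1d(x, w)

# Amplitude and phase of the output both depend on the real and imaginary
# parts of the input, unlike an amplitude-only real-valued convolution.
amplitude, phase = np.abs(y), np.angle(y)
```

      The result matches NumPy's native complex convolution, so the decomposition can be checked with `np.allclose(y, np.convolve(x, w))`.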

      CORRESPONDENCE

    • LIU Jun, GU Guo-dong, LIU Bo-wen, LI Ya-ming, LIANG Shi-xiong, SONG Rui-liang, LIU Ning
      Vol. 53, Issue 10, Pages: 3773-3780(2025) DOI: 10.12263/DZXB.20250571
      Abstract: Spectrum resources in the sub-6 GHz band are increasingly scarce, which is driving research on millimeter-wave technology for 5G/6G communications. Because the oscillator is a key component determining communication quality and imaging accuracy, developing millimeter-wave oscillators with low phase noise and high stability has significant scientific value and application prospects. This study presents a Ka-band MMIC oscillator chip based on resonant tunneling diodes (RTDs), whose negative resistance characteristics effectively simplify oscillator circuit design. The RTD is a semiconductor device based on the quantum tunneling effect; its most representative configuration is the double-barrier single-quantum-well (DBQW) structure, in which the barriers are composed of wide-bandgap materials and the well of a narrow-bandgap material. RTDs exhibit both nonlinear and negative resistance properties, enabling them to operate as either an oscillator or a detector depending on the bias voltage. After comparing GaAs and InP epitaxial materials, the InP system was selected for its superior performance. With an optimized epitaxial structure, the fabricated RTD achieved a peak-to-valley current ratio of 3.9 and a peak current density of 290 kA/cm². The oscillator circuit, implemented on an InP substrate, integrates a coplanar waveguide, metal-insulator-metal (MIM) capacitors (for decoupling and DC blocking), and suppression resistors. The decoupling capacitor shorts RF signals to ground to prevent power dissipation, while the DC-blocking capacitor protects spectrum analyzers from damage during on-wafer measurements. Low-frequency bias oscillations are suppressed by a shunt resistor. The intrinsic capacitance of the RTD, combined with an equivalent inductance realized by a shorted transmission line, forms an LC oscillation network that generates the desired frequency. The stabilizing resistor is realized as a thin-film NiCr resistor, and the MIM capacitors are realized with Si3N4. On-wafer measurements show a fundamental oscillation frequency of 30.67 GHz, an output power of -2.2 dBm, phase noise of -87 dBc/Hz at 1 MHz and -114 dBc/Hz at 10 MHz, a tuning bandwidth of 0.72 GHz, a figure of merit (FoM) of -169.7 dBc/Hz, a total DC power consumption of 71.3 mW, and a chip area of 0.39 mm². The differences between simulated and measured results may stem from fabrication tolerances, transmission line deviations, and inaccuracies in the equivalent circuit model of the RTD. Compared with other Ka-band oscillators, this design offers higher output power and a smaller chip area, although a performance gap remains relative to oscillators implemented in the same process outside China. The performance will be improved by optimizing the material structure, improving the circuit design, and optimizing the suppression resistance values. This is the first reported RTD-based oscillator chip in China, and the approach is expected to extend to oscillators in the terahertz band.
      Keywords: resonant tunneling diode; Ka-band; oscillator; Monolithic Microwave Integrated Circuit; indium phosphide
      Updated: 2026-02-05
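
      The LC oscillation network described in the abstract above can be sketched with the ideal resonance relation f = 1 / (2π√(LC)). The sketch below is an illustration only: the abstract does not report the RTD capacitance or the equivalent inductance, so the 100 fF capacitance is a hypothetical placeholder, and the ideal lossless formula ignores parasitics and the RTD's bias dependence. Only the 30.67 GHz target comes from the abstract.

```python
import math


def resonant_frequency(inductance_h: float, capacitance_f: float) -> float:
    """Ideal lossless LC resonance: f = 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def required_inductance(target_hz: float, capacitance_f: float) -> float:
    """Inductance that resonates with a given capacitance at target_hz."""
    omega = 2.0 * math.pi * target_hz
    return 1.0 / (omega ** 2 * capacitance_f)


F_TARGET_HZ = 30.67e9  # fundamental oscillation frequency reported in the abstract
C_RTD_F = 100e-15      # HYPOTHETICAL intrinsic RTD capacitance (not from the source)

# Equivalent inductance the shorted transmission line would need to provide
# under these assumptions.
L_EQ_H = required_inductance(F_TARGET_HZ, C_RTD_F)

print(f"Required equivalent inductance: {L_EQ_H * 1e9:.3f} nH")
print(f"Check: resonance at {resonant_frequency(L_EQ_H, C_RTD_F) / 1e9:.2f} GHz")
```

      Under these assumptions the required inductance is a fraction of a nanohenry, the order of magnitude that a short on-chip shorted line stub can plausibly provide.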

      SURVEYS AND REVIEWS

    • Relations Between Several Classical Privacy Notions

      ZHENG Zhi-run, HUANG Cheng, WANG Ping, LI Cheng-xin, XU Wen, LI Zhe-tao
      Vol. 53, Issue 10, Pages: 3781-3793(2025) DOI: 10.12263/DZXB.20250078
      Abstract: We address the challenge of theoretically evaluating perturbation-based privacy-preserving mechanisms designed under different privacy notions. By analyzing the relationships among three classical privacy notions, namely identifiability, differential privacy, and mutual-information privacy, in both centralized and local settings, we propose a complete privacy notion framework that establishes theoretical relations among them. Specifically, given a constant σmin determined by the prior probability distribution of the real data (σmin=0 when the prior distribution is uniform), the following theorems are formally proved in both centralized and local settings. First, a mechanism satisfying εi-identifiability must also satisfy εd-differential privacy with εd=εi-σmin. Second, a mechanism satisfying εd-differential privacy must also satisfy εi-identifiability with εi=εd+σmin. Third, a mechanism satisfying εi-identifiability must also satisfy εm-mutual-information privacy with εm=2(εi-σmin). Fourth, a mechanism satisfying εm-mutual-information privacy does not guarantee εi-identifiability. Fifth, a mechanism satisfying εd-differential privacy must also satisfy εm-mutual-information privacy with εm=2εd. Sixth, a mechanism satisfying εm-mutual-information privacy does not guarantee εd-differential privacy. The proposed framework systematically derives the theoretical relationships among identifiability, differential privacy, and mutual-information privacy, enabling a more precise estimation of privacy budget bounds. Furthermore, the framework provides a theoretical foundation for privacy preservation in data-driven applications such as crowdsensing systems, location-based services, and large language models.
      Keywords: privacy preserving; data publishing; data privacy; differential privacy; mutual-information privacy; identifiability
      Updated: 2026-02-05
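
      The budget conversions stated in the abstract above can be written out as a small sketch. It encodes only the four positive implications exactly as the abstract gives them; the function names and the example values (εi = 1.0, σmin = 0.25) are assumptions for illustration, and the sketch says nothing about how σmin is derived from the prior, which the abstract does not specify.

```python
def dp_from_identifiability(eps_i: float, sigma_min: float) -> float:
    """eps_i-identifiability implies eps_d-differential privacy, eps_d = eps_i - sigma_min."""
    return eps_i - sigma_min


def identifiability_from_dp(eps_d: float, sigma_min: float) -> float:
    """eps_d-differential privacy implies eps_i-identifiability, eps_i = eps_d + sigma_min."""
    return eps_d + sigma_min


def mi_from_identifiability(eps_i: float, sigma_min: float) -> float:
    """eps_i-identifiability implies eps_m-mutual-information privacy, eps_m = 2 * (eps_i - sigma_min)."""
    return 2.0 * (eps_i - sigma_min)


def mi_from_dp(eps_d: float) -> float:
    """eps_d-differential privacy implies eps_m-mutual-information privacy, eps_m = 2 * eps_d."""
    return 2.0 * eps_d


# Hypothetical example values, not taken from the paper.
eps_i, sigma_min = 1.0, 0.25

eps_d = dp_from_identifiability(eps_i, sigma_min)
eps_m_direct = mi_from_identifiability(eps_i, sigma_min)
eps_m_via_dp = mi_from_dp(eps_d)
```

      The stated bounds are mutually consistent: converting identifiability to differential privacy and then to mutual-information privacy yields the same εm as the direct conversion.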
    • Research Progress on Blockchain-Empowered Data Storage Security Service

      ZHANG Yao-yao, ZHOU Yuan, YANG Qing-lin, ZHUANSUN Chen-lu, CHEN Kai, LI Yi, LIU Yuan, TIAN Zhi-hong
      Vol. 53, Issue 10, Pages: 3794-3816(2025) DOI: 10.12263/DZXB.20250347
      Abstract: With the rapid growth of the digital economy, data storage has become a critical component of the data management lifecycle, but it faces several critical security challenges, such as content integrity, privacy preservation, and sustainable availability. Benefiting from its distributed ledger structure and cryptographic consensus mechanisms, blockchain technology provides a trustworthy solution to these challenges. Because current blockchain systems remain constrained by scalability, efficiency, and security limitations, it is essential to systematically review their data storage service capacities and analyze their security boundaries. This paper surveys research progress in blockchain-enabled data security by investigating its key procedures, namely on-chain data storage, service access, authorization management, and ecosystem operation, to identify emerging technical trends toward secure data utilization and value exchange. Specifically, the paper begins with the on-chain data storage phase, analyzing the current state of consensus mechanisms, editable blockchains, and integrity auditing technologies that affect data integrity. Next, for the service access phase, it analyzes the attack principles of two types of threats, denial-of-service (DoS) attacks and inefficient functionality, and compares the strengths and weaknesses of existing storage availability solutions. For the authorization management phase, it discusses cross-domain data and unauthorized access issues from the perspectives of identity management and ciphertext storage, and analyzes existing solutions. To meet the ecosystem operation needs of data storage security services, it explores two types of technologies, scalable consensus protocols and sharding mechanisms, and analyzes their bottlenecks in terms of cost, efficiency, and adaptability. Finally, it discusses blockchain’s capability to empower large language models (LLMs) in the field of data security and looks ahead to research trends, including energy-efficient optimization mechanisms for post-quantum cryptographic algorithms, elastic blockchain scaling architectures based on federated learning, and LLMs driven by trusted data elements.
      Keywords: blockchain; data security; storage; security service; large language models
      Updated: 2026-02-05