Latest Issue

    Vol. 53, Issue 9, 2025

      Large-Scale Models and the Internet

    • LLM-Based Zero-Shot Imputation of Spatiotemporal Data

      MEI Ya-xin, QIN Hui-ling, LIANG Yu-zhu, ZHANG Guang-xue, WANG Tian
      Vol. 53, Issue 9, Pages: 3047-3059(2025) DOI: 10.12263/DZXB.20250473
      Abstract: Internet of things (IoT) sensing data commonly suffers from sparsity caused by deployment costs, environmental constraints, and equipment failures, which severely limits the overall performance of intelligent sensing systems. Most existing imputation methods rely on labeled data for supervised training, so they generalize poorly in “cold start” scenarios in new environments and fail to meet the practical demands of rapid IoT deployment and cross-domain applications. This paper introduces, for the first time, the intrinsic reasoning capabilities of large language models (LLMs) into spatiotemporal data imputation, proposing ZeroImpute, a framework based on multi-agent collaborative reasoning that shifts the paradigm from traditional “data-driven learning” to “knowledge-driven reasoning.” The core innovation lies in a collaborative reasoning system of specialized, task-oriented LLM agents: the temporal analysis agent is responsible for semantic understanding and reasoning over complex temporal dependencies, capturing forward evolutionary trends and backward constraints through bidirectional sequence modeling; the spatial analysis agent models and parses dynamic spatial relationships, identifying time-varying spatial correlations under the guidance of temporal context; and the imputation decision agent integrates multi-source semantic knowledge and applies an adaptive weight fusion algorithm to make the final imputation decision.
Each agent achieves deep understanding of complex spatiotemporal patterns through semantic knowledge representation and logical reasoning, transforming traditional numerical computation problems into semantic reasoning tasks that can be collaboratively processed by multiple agents, thereby overcoming the limitations of single models in handling complex spatiotemporal relationships. The framework possesses significant technical advantages: first, it achieves true zero-shot generalization capability, enabling direct deployment without requiring any domain-specific training data; second, through multi-agent specialization, it enhances the identification accuracy and reasoning quality of complex spatiotemporal patterns; third, it exhibits excellent interpretability with transparent agent reasoning processes, enhancing system trustworthiness; finally, plug-and-play deployment substantially reduces technical barriers and deployment costs for practical applications. Comprehensive evaluations on three real-world IoT datasets demonstrate that ZeroImpute achieves at least a 4.5% performance improvement in MAE compared to the best-performing specialized deep learning models under strictly zero-shot, zero-training settings. Moreover, the method exhibits robustness across different missing rate scenarios, effectively addressing critical practical challenges including rapid deployment in new regions, cross-domain data imputation generalization, and efficient deployment in resource-constrained environments. This research pioneers a new paradigm of multi-agent collaborative reasoning for spatiotemporal computation, providing novel technical pathways for the spatiotemporal data imputation field and offering crucial technical support and theoretical foundations for advancing IoT technology adoption across broader industrial applications.  
      Keywords: internet of things sensing; data sparsity; cold start; spatiotemporal data imputation; large language models; zero-shot
      Updated: 2025-12-27
    • LI Guang-rong, LI Guang-jun, SHANG Jing, WU Wen-tai, WANG Ze-ping, LONG Sai-qin
      Vol. 53, Issue 9, Pages: 3060-3077(2025) DOI: 10.12263/DZXB.20250494
      Abstract: Executing workflows in cloud-edge collaborative environments can reduce data transmission latency between the cloud and terminal devices. However, cloud computing nodes and edge devices differ significantly in computational capability, storage resources, and communication latency. Furthermore, the computational resources of edge servers vary dynamically due to factors such as workload pressure and performance degradation, and the complex topological dependencies within workflow applications introduce additional scheduling constraints. Together, these factors make workflow scheduling in this setting NP-hard. To address these challenges, this paper proposes a large language model-assisted cloud-edge collaborative workflow scheduling algorithm (LAWS). The algorithm uses a knowledge graph to structurally represent the chain-of-thought (CoT) reasoning process. It decomposes the scheduling problem into multiple sub-problems and extracts sub-knowledge graphs that serve as chain-of-thought guides for the large model, facilitating collaborative reasoning for scheduling decisions. Experimental results show that, compared with traditional algorithms, the proposed algorithm reduces workflow execution latency by 3% to 83% and computational energy consumption by 2.4% to 66.0%.
      Keywords: workflow scheduling; cloud-edge collaboration; large language models; knowledge graph; chain-of-thought; problem decomposition
    • YANG Xing-yuan, TIAN Le, YAO Ying, PAN Fan, HU Yu-xiang
      Vol. 53, Issue 9, Pages: 3078-3088(2025) DOI: 10.12263/DZXB.20250463
      Abstract: Traditional networks, which depend on manual configuration, are inefficient and expensive in the face of today’s rapidly expanding scales, increasingly complex demands, and growing need for real-time responsiveness. Large language models (LLMs), known for their exceptional ability to understand natural language, show great promise for automating network configuration. This paper introduces a streamlined LLM-based approach to automated configuration for software defined networking (SDN). For the data plane, we present RetroP4, a code generation method that uses retrieval-augmented generation (RAG) to create P4 code tailored to users’ intentions. For the control plane, we propose CtrlSynth, a method that automatically generates flow tables through task decomposition, aligning the configuration with users’ intentions and with the P4 code from the data plane. Compared with general-purpose large models, RetroP4 improves the syntactic correctness of generated P4 code by 25% and the semantic correctness by 87.5%. CtrlSynth accurately produces flow table information that corresponds to the P4 code, achieving 100% accuracy with up to 300 traffic-related intentions.
      Keywords: large language models (LLM); network configuration; software defined network (SDN); programming protocol-independent packet processors (P4); retrieval-augmented generation (RAG)
    • HU Rui, WU Hao, PAN Yu-xuan, ZHANG Lin, LIU Yu, ZHU Kong-lin
      Vol. 53, Issue 9, Pages: 3089-3102(2025) DOI: 10.12263/DZXB.20250451
      Abstract: Large language model (LLM) driven open-domain question answering (ODQA) systems, exemplified by frameworks such as GIST (Generating Identifiers and Selecting chunks for Tables), have attracted considerable research attention because of their potential for processing extensive tabular data. However, when such ODQA systems integrate data from multiple providers for Top-K candidate screening, traditional methods that require access to raw data face substantial challenges in data privacy, computational transparency, and participant trustworthiness. Existing research employs zero-knowledge proofs and stake-based mechanisms to achieve public verifiability, but the overhead of generating and verifying individual proofs is often prohibitive in large-scale scenarios. Moreover, conventional stake-based mechanisms are limited in fairness and adaptability in dynamic environments. This paper proposes an enhanced method for multi-party private table screening in LLM-driven ODQA that integrates multi-party computation (MPC), a publicly aggregable audit mechanism, and a dynamic reputation system. The study adapts the Top-K multi-party private table screening process using MPC to ensure data privacy. In addition, an efficient aggregable audit mechanism is introduced; it combines zero-knowledge proof techniques with random sampling, aggregate proof construction, time-window-based batching, and error localization, enabling public, batch verification of the correctness of the scoring and ranking process. A blockchain-based dynamic reputation feedback mechanism further enhances system fairness and constrains malicious behavior.
Experimental evaluations demonstrate that our Top-K candidate screening method, while preserving privacy, achieves high consistency with the original GIST screening approach, attaining a Top-50 average recall of 0.91 and an average Jaccard index of 0.83, thus indicating minimal impact on end-to-end ODQA task performance. Furthermore, the efficiency of publicly auditable proof generation and verification for large-scale tasks is significantly improved, saving approximately 87% of proof time compared to individual proofs. The adaptability and fairness of the feedback mechanism are also demonstrably enhanced.  
      Keywords: open-domain question answering; large language models; secure multi-party computation; publicly auditable; zero-knowledge proofs; blockchain
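The two consistency metrics quoted in this abstract (Top-K recall and the Jaccard index) can be illustrated with a minimal sketch. The table identifiers and the helper names `recall_at_k` and `jaccard_index` are hypothetical and chosen only for illustration; the abstract does not describe how the authors compute the metrics, so the standard set-based definitions are assumed here.

```python
def recall_at_k(reference, candidate):
    """Fraction of the reference Top-K list that the candidate list recovers."""
    ref, cand = set(reference), set(candidate)
    return len(ref & cand) / len(ref) if ref else 0.0


def jaccard_index(reference, candidate):
    """Size of the intersection divided by the size of the union."""
    ref, cand = set(reference), set(candidate)
    union = ref | cand
    return len(ref & cand) / len(union) if union else 1.0


# Hypothetical Top-5 results: the baseline screening versus a
# privacy-preserving screening that disagrees on one table.
baseline_top5 = ["t1", "t2", "t3", "t4", "t5"]
private_top5 = ["t1", "t2", "t3", "t4", "t9"]

recall = recall_at_k(baseline_top5, private_top5)     # 4 shared out of 5
jaccard = jaccard_index(baseline_top5, private_top5)  # 4 shared out of 6 distinct
```

When both lists have the same length K, a per-query recall r implies a Jaccard index of r / (2 - r). A recall of 0.91 would correspond to roughly 0.83, which is in line with the reported averages, although averages taken over many queries need not satisfy this identity exactly.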
    • WANG Zheng, WANG Lei, YOU Zhu-hong, WANG Lei, ZHAO Bo-wei
      Vol. 53, Issue 9, Pages: 3103-3116(2025) DOI: 10.12263/DZXB.20250436
      Abstract: Extensive studies have shown that circular RNA (circRNA), a type of endogenous non-coding RNA, plays a key role in the occurrence and development of various complex human diseases. By acting as molecular sponges, regulating gene transcription, or interacting with proteins, circRNAs participate in the regulation of disease-related signaling pathways. Analyzing the associations between circRNAs and diseases is of crucial scientific value for deepening the understanding of disease mechanisms, discovering novel biomarkers, and advancing precision medicine. However, traditional experimental methods are constrained by high costs, long cycles, and limited throughput, which severely restricts large-scale analysis of circRNA-disease associations. Efficient, low-cost computational methods are therefore essential for promoting research in this field. In response, this paper proposes ES-NMGCDA, a prediction model based on evolutionary computation. The model first constructs multi-source similarity networks of circRNAs and diseases, then uses the state analysis optimization algorithm (SAOA) to integrate and optimize these networks, and finally employs a causal forest classifier to predict circRNA-disease associations. By combining the search capability of SAOA with the inference capability of causal forests, ES-NMGCDA enables accurate and robust prediction of potential circRNA-disease associations. To evaluate ES-NMGCDA, we conducted 5-fold cross-validation on the widely used public benchmark dataset CircR2Disease. Experimental results show that the model achieved a prediction accuracy of 93.80% and also performed well on metrics such as precision and sensitivity, significantly outperforming several existing baseline methods.
Furthermore, to validate the model’s practical utility in real biomedical scenarios, we carried out two case studies. In the case study on circRNA-disease associations, 18 out of the top 20 circRNA-disease pairs with the highest prediction scores were supported by recent literature. In the case study focused on breast cancer, 43 out of the top 50 predicted circRNAs were confirmed to be closely associated with the disease. These results consistently indicate that the ES-NMGCDA model not only provides highly reliable candidate circRNA molecules for subsequent molecular biology experiments, significantly shortening research cycles and reducing experimental costs, but also offers new data support and theoretical foundations for understanding the role of circRNAs in complex diseases.  
      Keywords: multi-source similarity network; circRNA-disease; evolutionary computation; state analysis optimization algorithm; causal forest; potential association
    • PENG Han, RUAN Ri-qing, HU Ying, LIU Qiong-lin, ZHANG Zhen
      Vol. 53, Issue 9, Pages: 3117-3133(2025) DOI: 10.12263/DZXB.20250656
      Abstract: Named entity recognition (NER) is a fundamental task in the structural analysis and semantic understanding of legal texts, with the potential to greatly enhance judicial efficiency and promote fairness. However, because legal language is highly complex and domain specific, traditional NER methods struggle to capture contextual dependencies in legal documents. They often rely on shallow token-level predictions and lack both role-based entity interpretation and deeper contextual reasoning. These limitations are particularly pronounced for the nested entities, fine-grained entity categories, and ambiguous boundaries that frequently occur in judicial texts. To address these challenges, this paper introduces a novel NER framework for Chinese legal scenarios, termed JURIS (judicial understanding-enhanced reasoning via instruction-tuned strategies for named entity recognition). JURIS reformulates entity recognition as a context-driven conditional generation task and adopts a context-aware embedded annotation strategy, which preserves the original semantic structure of the text while enhancing contextual modeling. In addition, JURIS incorporates a tri-aspect understanding enhancement module (Tri-UEM) consisting of a standardization module, a knowledge-guided module, and an analogy-based learning module. These components jointly strengthen the model’s semantic understanding and discrimination ability in the legal domain by improving output consistency, injecting domain-specific knowledge, and enabling contextual analogy transfer. Experimental results show that JURIS consistently outperforms strong baseline models on multiple datasets, including CAIL2021, Drug, and CSKS2019, achieving state-of-the-art performance. It significantly improves recognition of nested and fine-grained entities while showing strong generalizability and applicability in domain-specific information extraction tasks.
      Keywords: judicial named entity recognition; understanding enhancement; instruction tuning; information extraction

      PAPERS

    • ZHANG Shu-xin, LIANG Chang-yi
      Vol. 53, Issue 9, Pages: 3134-3146(2025) DOI: 10.12263/DZXB.20250286
      Abstract: The mechanical-electromagnetic coupling analysis, design, and control of reflector antennas under multi-source loads are key challenges in the design of large-aperture reflector antennas. Existing mechanical-electromagnetic coupling calculation methods adopt integral operation models, which do not decouple the structural deformations from the electromagnetic computation and therefore involve complex integral operations and long computation times. Because distorted reflector antennas are subject to multi-source loads, the coupling analysis becomes complex and cannot support fast estimation and adjustment of electromagnetic performance. To solve this problem, a mechanical-electromagnetic decoupling computational method is proposed to obtain the radiation performance of distorted reflector antennas with the best-fit paraboloid. Taking the antenna state under the best-fit paraboloid as the basis, the normal distortion is used as the structural input to the electromagnetic computation. By adopting a second-order series expansion of the phase component, the structural distortion is separated from the electromagnetic computation, which converts the original integral computation into a matrix multiplication model. On this basis, a mechanical-electromagnetic decoupling computational model is established for distorted reflector antennas with the best-fit paraboloid, achieving fast decoupled electromagnetic computation. The method is verified on an 8 m reflector antenna as a typical example, using hypothetical structural deformations as well as deformations caused by gravity and wind loads at different elevation angles as structural inputs.
Simulations show that, for antenna electromagnetic performance under multi-source loads, the radiation pattern obtained by the proposed decoupling method matches well that of the original integral operation model. Moreover, the deviation in gain loss between the two models is within 0.1 dB, indicating that the accuracy of the proposed method meets antenna design requirements. Compared with the original integral model, the proposed method maintains computational accuracy while improving the efficiency of computing the electromagnetic performance of distorted reflector antennas under multi-source loads by about 95%, which lays the foundation for fast estimation and adjustment of electromagnetic performance.
      Keywords: best fit; reflector antennas; electromagnetic performance; normal distortion; mechanical-electromagnetic decoupling
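The idea of separating the distortion from the field computation through a second-order expansion of the phase term can be sketched in a few lines. This is a toy one-dimensional illustration under stated assumptions, not the authors' model: the aperture samples, the illumination taper, the 3 cm wavelength, the 0.3 mm distortion amplitude, and the on-axis phase factor `2 * k * distortion` are all hypothetical. It relies only on the expansion exp(jφ) ≈ 1 + jφ − φ²/2, which turns the distortion-dependent sum into weighted sums of φ and φ² whose weights can be precomputed.

```python
import cmath
import math


def exact_field(weights, phase_errors):
    """Direct evaluation: every sample needs a complex exponential."""
    return sum(w * cmath.exp(1j * p) for w, p in zip(weights, phase_errors))


def second_order_field(weights, phase_errors):
    """Second-order expansion: exp(j*p) ~ 1 + j*p - p**2 / 2.

    The weights do not depend on the distortion, so for a new distortion
    only two weighted sums (dot products) have to be recomputed.
    """
    zeroth = sum(weights)
    first = sum(w * p for w, p in zip(weights, phase_errors))
    second = sum(w * p * p for w, p in zip(weights, phase_errors))
    return zeroth + 1j * first - 0.5 * second


# Toy aperture (all values are assumptions made for this sketch).
n = 200
weights = [1.0 - 0.5 * (i / (n - 1)) ** 2 for i in range(n)]  # tapered illumination
wavelength = 0.03                                             # metres
k = 2 * math.pi / wavelength                                  # wavenumber
distortion = [0.0003 * math.sin(2 * math.pi * i / (n - 1)) for i in range(n)]
phase_errors = [2 * k * d for d in distortion]                # simplified on-axis path error

undistorted = sum(weights)
gain_loss_exact_db = 20 * math.log10(abs(exact_field(weights, phase_errors)) / undistorted)
gain_loss_approx_db = 20 * math.log10(abs(second_order_field(weights, phase_errors)) / undistorted)
```

Evaluating the field in many observation directions stacks one row of precomputed weights per direction, so the two weighted sums become matrix-vector products with the vectors of φ and φ². The expansion is only accurate while the phase errors stay small compared with one radian.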
    • Cross-Modal SAR Target Detection via Progressive Knowledge Transfer

      ZHAO Guo-wei, JIANG Jia-qing, DONG Gang-gang
      Vol. 53, Issue 9, Pages: 3147-3162(2025) DOI: 10.12263/DZXB.20250417
      Abstract: The wide application of electro-optical sensors creates an urgent need for cross-modal learning between optical and synthetic aperture radar (SAR) images. This paper proposes a new cross-modal SAR target detection method based on progressive knowledge transfer. First, a new technique for generating SAR images from optical images is presented, and the generated pseudo-SAR images form an intermediate domain. This bridges the semantic discrepancy between the SAR backscattering imaging mechanism and passive optical radiation imagery, so that optical radiation features can be fused effectively with SAR scattering characteristics. Second, a dual-stage domain adaptation strategy built on multi-scale feature alignment is presented. The semantic components of the optical source domain and the intermediate domain are first aligned through multi-scale feature learning; the scattering distributions of the intermediate domain and the SAR target domain are then aligned. Third, a quality-aware dynamic weighting strategy is presented to mitigate the impact of outlier samples in the intermediate domain; it dynamically adjusts the contributions of synthetic data based on confidence metrics. Finally, multiple rounds of experiments were conducted on the SpaceNet6, SSDD (SAR Ship Detection Dataset), and HRSID (High-Resolution SAR Images Dataset) datasets. The results demonstrate the advantages of the proposed method: it improves on the source-only learning method by 21.5 percentage points and on the state of the art by 3.3 percentage points. These results confirm the viability of electro-optical-to-SAR knowledge transfer for enhancing cross-modal target detection.
      Keywords: synthetic aperture radar; target detection; knowledge transfer; adversarial learning; multi-scale alignment; pseudo-label learning
    • JIANG Chun-sheng, HUO Yi-kang, HUA Qi-lin, SONG Shu-xiang, PAN Li-yang, XU Jun
      Vol. 53, Issue 9, Pages: 3163-3172(2025) DOI: 10.12263/DZXB.20250400
      Abstract: Two-dimensional atomic-threshold-switching field-effect transistors (2D ATS-FETs) offer great promise for low-power applications in logic computing, selectors, and neuromorphic systems in the post-Moore era, thanks to their ultra-low off-state current, extremely small subthreshold swing, ultra-low operating voltage, compact device structure, and compatibility with mainstream complementary metal-oxide-semiconductor (CMOS) processes. A 2D ATS-FET can be regarded as a series connection of an atomic threshold switching (TS) device and a baseline 2D FET. In this study, we first develop a current-voltage (I-V) model for the TS device based on conductive filament (CF) evolution dynamics and tunneling mechanisms. We then propose a current-voltage model for the baseline 2D FET based on drift-diffusion transport. Finally, using the fact that the conduction currents of the two series-connected devices must be equal, we implement a standard simulation program with integrated circuit emphasis (SPICE) model in the Verilog-A language that is compatible with mainstream commercial circuit simulators. The results calculated from the analytical model agree well with experimental data, validating the proposed theoretical model. Furthermore, we systematically investigate the electrical characteristics and working mechanisms of 2D ATS-FETs based on this analytical model. The model provides a reliable theoretical foundation and an effective research tool for studying device mechanisms, performance optimization, and circuit design of 2D ATS-FETs.
      Keywords: atomic-threshold-switching (TS); two-dimensional (2D) channel material; field effect transistor; subthreshold swing; analytical model
    • ZHAO Ben, XIA Wen-chao, ZHAO Hai-tao, NI Yi-yang, ZHU Hong-bo
      Vol. 53, Issue 9, Pages: 3173-3191(2025) DOI: 10.12263/DZXB.20250195
      Abstract: Reconfigurable intelligent surface (RIS) technology has great potential for improving positioning accuracy and data transmission rate. This paper investigates RIS-assisted location sensing and superimposed pilot (SP) transmission under the assumption that the position information of mobile user equipment (UE) and the channel state information (CSI) are unavailable. A transmission protocol frame structure composed of several location coherence intervals is designed, each with a pure-pilot and a data-pilot transmission duration. The former is used to estimate UE locations, while the latter is used to transmit data and pilot signals simultaneously. The Cramér-Rao bound on the positional parameter estimation error is then derived, and the inverse fast Fourier transform (IFFT) algorithm is adopted to estimate UE positions, which are in turn exploited for channel estimation. In addition, a closed-form achievable rate for SP transmission is derived, and on this basis a block coordinate descent algorithm is proposed to jointly optimize the transmit power of the UEs and the phase shifts of the RIS so as to maximize the weighted sum rate of all UEs. The convergence and complexity of the algorithms are also analyzed. Finally, simulation results validate the performance of the UE position estimation algorithm and the superiority of the proposed SP scheme over the regular pilot scheme.
      Keywords: integrated sensing and communication; reconfigurable intelligent surface; superimposed pilot; location sensing; channel estimation
    • TPST-BCH Coding Scheme with High-Performance Decoding

      ZHONG Zhuo-hong, WANG Qian-fan, WANG Yi-wen, SONG Lin-qi, MA Xiao
      Vol. 53, Issue 9, Pages: 3192-3201(2025) DOI: 10.12263/DZXB.20250582
      Abstract: This work proposes a novel coding scheme with low-complexity decoding based on BCH codes to meet the requirements of high-reliability and low-latency communication (HRLLC) applications. In the proposed design, BCH codes are used as component codes within a twisted-pair superposition transmission (TPST) framework, resulting in TPST-BCH codes. The upper-layer BCH codeword undergoes a random transformation before being superimposed onto the lower-layer codeword, and the resulting signal is further interleaved and fed back to the upper layer, enabling code length extension and reliability enhancement. For decoding, a serial interference cancellation strategy is developed, in which ordered statistics decoding with local constraints (LC-OSD) is first applied to generate a list of candidates for the upper layer. Given an upper-layer candidate, LC-OSD is then performed on the lower-layer codeword, and the candidate with the highest posterior probability is selected as the decoding output. To further reduce complexity, an early termination mechanism is introduced, including intra-layer early termination within LC-OSD and cross-layer early termination across decoding stages. Simulation results show that the proposed early termination design significantly reduces the average number of searches with negligible performance loss. Compared with existing coding schemes, the proposed TPST-BCH codes (with the proposed decoding algorithm) achieve better frame error rate (FER) performance than BCH codes of the same code length and rate (decoded with the LC-OSD algorithm) and than 5G LDPC codes (decoded with belief propagation decoding). They achieve comparable or slightly better FER performance than 5G Polar codes (decoded with successive cancellation list decoding), while exhibiting lower computational complexity and decoding latency than 5G Polar codes in the moderate-to-high SNR regions.
      Keywords: channel coding; BCH codes; twisted-pair superposition transmission (TPST); ordered statistics decoding with local constraints (LC-OSD)
    • A Single-Event Immune Oscillator for On-Chip Clock Systems

      SANG Hao, YUAN Heng-zhou, GUO Yang, LIU Sheng, CHEN Xiao-wen, XU Wei-xia
      Vol. 53, Issue 9, Pages: 3202-3210(2025) DOI: 10.12263/DZXB.20250667
      Abstract: With the continuous evolution of advanced semiconductor processes, chip integration and complexity have increased significantly, and system clock design is gradually shifting from off-chip crystal oscillators to on-chip clock systems. Frequency-locked loop (FLL) clock reference generation circuits based on inductor-capacitor voltage-controlled oscillators (LCVCOs) have become a key technology for high-reliability electronic systems because of their excellent noise suppression and strong radiation resistance. However, existing LCVCO architectures are highly sensitive to single-event effects in capacitor arrays and tail current sources, which can easily cause system failures in space radiation environments and severely limits their application prospects in aerospace. To address these issues, this paper proposes a novel LCVCO architecture aimed at enhancing the robustness of on-chip clock systems. The architecture introduces dynamic self-bias feedback, which adjusts the bias voltage of the tail current source in real time according to the oscillation signal, effectively suppressing 1/f noise and self-stabilizing the output amplitude. The phase noise and single-event sensitivity of oscillators are evaluated and compared using the impulse sensitivity function (ISF). Simulation results show that the ISF curve of the proposed LCVCO has lower amplitude and better symmetry at critical nodes, significantly improving noise performance and single event transient (SET) tolerance. Additionally, the resonant tank uses an NMOS-type capacitor array unit with low-impedance discharge paths, which quickly discharges single-event transient currents and reduces the frequency fluctuations they cause. The design is implemented in fin field-effect transistor (FinFET) technology, occupying an area of 0.06 mm² and consuming 9.6 mW of power.
At a 26 MHz output clock, the phase noise is optimized to -136 dBc/Hz@1MHz, with a figure of merit (FoM) value of 154.5 dBc/Hz@1MHz, and the root mean square (RMS) value of cycle jitter is 5.93 ps. Laser experiments show that the laser triggering threshold of the LCVCO designed in this paper has been increased to 1.5 nJ, an improvement of 114.3% compared to traditional structures, greatly reducing the impact of single-event effects. Heavy-ion irradiation experiments were conducted at a linear energy transfer (LET) value of 86.1 MeV·cm²/mg, showing that the maximum frequency deviation is less than 4.5%. The simulation and experimental results fully validate the high reliability of this circuit in the space radiation environment, which provides a more robust clocking solution for spaceborne electronic equipment.  
      Keywords: oscillator; phase-locked loop; clock; phase noise; single event transient; impulse sensitivity function
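The figure of merit quoted in this abstract can be cross-checked with a short calculation. The abstract does not state which FoM definition the authors use; the sketch below assumes the conventional oscillator definition (phase-noise magnitude plus 20·log10 of carrier over offset, minus 10·log10 of power in mW) and takes the 26 MHz output clock as the carrier. The function name `oscillator_fom` is hypothetical. The same block derives the baseline laser threshold implied by the stated 114.3% improvement, which the abstract does not report directly.

```python
import math


def oscillator_fom(phase_noise_dbc_hz, carrier_hz, offset_hz, power_mw):
    """Conventional oscillator figure of merit, reported as a positive number.

    Assumed definition: |L(df)| + 20*log10(f0 / df) - 10*log10(P / 1 mW).
    """
    return (-phase_noise_dbc_hz
            + 20 * math.log10(carrier_hz / offset_hz)
            - 10 * math.log10(power_mw / 1.0))


# Values taken from the abstract: -136 dBc/Hz at 1 MHz offset,
# 26 MHz output clock, 9.6 mW power consumption.
fom = oscillator_fom(-136.0, 26e6, 1e6, 9.6)

# Laser threshold: 1.5 nJ is stated to be a 114.3% improvement,
# so the implied baseline threshold is 1.5 / (1 + 1.143) nJ.
implied_baseline_nj = 1.5 / (1 + 1.143)
```

Under these assumptions the result rounds to the 154.5 dBc/Hz given in the abstract, and the implied baseline threshold is about 0.7 nJ.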
    • LI Qiang, YANG Yuan, WEN Yang, ZHAO Tian-yang, LI Ya-lan, RU Xiao
      Vol. 53, Issue 9, Pages: 3211-3222(2025) DOI: 10.12263/DZXB.20241156
      Abstract: To address hard switch fault (HSF), fault under load (FUL), and overload (OL) fault in SiC MOSFETs, this paper proposes an overcurrent protection method based on drain-voltage and source-voltage detection (DSD-OCP). DSD-OCP employs a detection circuit to monitor the drain voltage and source voltage of the SiC MOSFET in real time, enabling accurate identification of short-circuit and overload faults. It uses a drive circuit to control the turn-on and turn-off of the SiC MOSFET, providing fast short-circuit protection and adaptive overload protection, and it also integrates a soft turn-off function. The DSD-OCP circuit is designed and fabricated in a 0.5 µm BCD process with a chip area of 2.8 mm². The fabricated chip is used to build a 1200 V/80 mΩ SiC MOSFET test platform, on which the effectiveness of DSD-OCP is verified. Experimental results show that the HSF and FUL durations of the SiC MOSFET are 88 ns and 105 ns, respectively. Under different bus voltages, the DSD-OCP chip provides adaptive overload protection for the SiC MOSFET. Because the chip offers a soft turn-off function, the drain voltage overshoot of the SiC MOSFET during overcurrent protection does not exceed 110 V.
      关键词:SiC MOSFET;drain voltage and source voltage detection;fast short-circuit protection;adaptive overload protection;soft turn-off function   
      Updated: 2025-12-27
    • LIU Huan, LI Chen, CAI Jie-ding, LUO Xin-jiang, HU Yuan-yun, YOU Bin, XU Kui-wen, SONG Kai-xin, LI Wen-jun
      Vol. 53, Issue 9, Pages: 3223-3232(2025) DOI: 10.12263/DZXB.20250710
      Abstract: As fifth-generation (5G) mobile communications evolve toward sixth-generation (6G or B5G) millimeter-wave bands, communication networks face challenges of spectrum reconfiguration and system performance enhancement. Owing to its abundant bandwidth and high data rate, millimeter-wave communication has become a key technology to support high-capacity and low-latency services. As a core passive component in the RF front end, the bandpass filter plays a vital role in selecting in-band signals and suppressing out-of-band interference. In high-frequency and high-density communication environments, filters must balance frequency reuse, high integration, and compact packaging, achieving coordinated optimization of high selectivity, low insertion loss, and miniaturization, which places higher demands on structural design and packaging technologies. Low-temperature cofired ceramic (LTCC) technology, featuring multilayer three-dimensional integration, precise structural control, and excellent dielectric properties, has become an important approach for passive device packaging and integration. Based on LTCC technology, this paper presents a novel millimeter-wave bandpass filter with high selectivity, low loss, and compact size. The proposed filter employs bent half-wavelength and quarter-wavelength stepped impedance resonators (SIRs), and its size is effectively reduced by leveraging the multilayer integration capability of LTCC. To enhance selectivity, an omnidirectional cross-coupling topology is constructed by introducing additional cross-coupling paths on the basis of partial cross-coupling, and the design is theoretically analyzed through coupling matrix synthesis and equivalent circuit modeling, resulting in three transmission zeros near the passband. 
To minimize loss, a low-dielectric-loss K7 LTCC material is adopted, and metallized via sidewalls are employed to construct a shielding structure that suppresses surface-wave leakage and parasitic radiation. To verify the effectiveness of the proposed design method, a multilayer LTCC-based prototype operating in the millimeter-wave band was simulated and optimized using HFSS software, followed by fabrication and experimental verification. The measured results demonstrate that the filter exhibits an insertion loss of only 0.9 dB at the center frequency of 28 GHz, with a minimum in-band loss of 0.61 dB. The roll-off rates at the two edges of the passband reach 25.5 dB/GHz and 10.47 dB/GHz, respectively, indicating significantly improved frequency selectivity. The effective circuit size of the device is only 1.44 mm × 1.0 mm (0.134 λ0 × 0.093 λ0), achieving compactness in the millimeter-wave band. To address the difficulty in accurately extracting the intrinsic performance of surface-mount technology (SMT) LTCC filters during millimeter-wave testing, this paper introduces a multiline thru-reflect-line (MTRL) calibration method. By mounting the LTCC filter on a test board and combining it with self-fabricated MTRL standards, the method integrates calibration data into the vector network analyzer algorithm, effectively eliminating systematic errors caused by interconnection transitions. This approach provides a general and efficient solution for performance evaluation of millimeter-wave LTCC passive devices. In summary, this article presents an LTCC-based filter for 6G (B5G) millimeter-wave communication. The filter design is elaborated based on coupling matrix synthesis and equivalent circuit theory. Featuring an omnidirectional cross-coupling topology and bent half-/quarter-wavelength SIRs, the proposed filter achieves miniaturization, low loss, and high selectivity. 
Furthermore, by introducing the MTRL calibration technique, the intrinsic performance extraction problem of millimeter-wave filters is effectively addressed. This work offers valuable technical insight for the development of high-performance passive components in 6G millimeter-wave communication systems.  
      Keywords: 6G/B5G; millimeter-wave; cross-coupling; low-temperature cofired ceramics; bandpass filter
      Updated: 2025-12-27
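The electrical size quoted in the abstract above (0.134 λ0 × 0.093 λ0) can be cross-checked from the physical size and the 28 GHz center frequency. The following is a minimal sketch of that arithmetic; the function name `electrical_size` is hypothetical and not taken from the paper.

```python
# Speed of light in vacuum, expressed in millimetres per second.
C_MM_PER_S = 299_792_458_000.0


def electrical_size(length_mm: float, freq_hz: float) -> float:
    """Return a physical length as a fraction of the free-space wavelength."""
    wavelength_mm = C_MM_PER_S / freq_hz
    return length_mm / wavelength_mm


# Values taken from the abstract: 1.44 mm x 1.0 mm at 28 GHz.
f0 = 28e9
width = electrical_size(1.44, f0)
height = electrical_size(1.0, f0)
print(f"{width:.3f} x {height:.3f} free-space wavelengths")
```

With a free-space wavelength of about 10.7 mm at 28 GHz, both ratios agree with the figures reported in the abstract to three decimal places.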
    • JIANG Ming, LIU Yi-meng, XU Yue, KONG Ling-jun, WEI Yue-jun
      Vol. 53, Issue 9, Pages: 3233-3244(2025) DOI: 10.12263/DZXB.20241180
      Abstract: By fully exploiting the spatial-dimension resources of massive multi-antenna systems, spatiotemporal two-dimensional coding can effectively mitigate the severe performance degradation of short-block-length coding under very low delay constraints. However, a large gap remains between the performance of this coding scheme and that of equivalent large-block-length, near-Shannon-limit coding. To address the performance degradation encountered in practice with the interleaved serial concatenation structure adopted by existing spatiotemporal two-dimensional coding, this paper proposes a new spatiotemporal two-dimensional parallel concatenated code (spatiotemporal product code) suitable for efficient iterative decoding. The scheme applies a specific time/space domain mapping after parallel-concatenated product coding, which increases the interleaving depth, suits highly parallel iterative decoding, and improves the efficiency and reliability of the overall encoding and decoding process. In addition, the proposed scheme makes full use of the advantages of multiple antennas in the spatial dimension, and further improves transmission performance and reduces processing delay by configuring different component codes of the product code and different time/space domain mappings. Specifically, the paper introduces the single parity-check code as the component code in the spatial dimension to construct the spatiotemporal product code, investigates by simulation the impact of time/space domain mapping, component code length, code rate, and modulation on decoding performance, and analyzes the corresponding transmission and decoding delays in detail. Simulation results show that the proposed spatiotemporal product code scheme achieves a performance gain of 0.4 to 2.3 dB at a block error rate of 10⁻³ compared with the existing spatiotemporal two-dimensional coding scheme. The proposed scheme is also more suitable for parallel iterative decoding and ultra-low-latency transmission.
      Keywords: channel coding; multi-antenna transmission; parallel concatenated code coding; spatiotemporal two-dimensional coding; irregular product code
      Updated: 2025-12-27
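To illustrate the idea of a product code built from single parity-check (SPC) components, as mentioned in the abstract above, here is a minimal sketch. It is an assumption-laden toy: it uses SPC in both dimensions for brevity, whereas the abstract only states that SPC is the component code in the spatial dimension, and it does not model the paper's time/space domain mapping or iterative decoder. The function names are hypothetical.

```python
from typing import List, Tuple

Bits = List[List[int]]


def spc_product_encode(info: Bits) -> Bits:
    """Append an even-parity bit to every row, then an even-parity row."""
    rows = [row + [sum(row) % 2] for row in info]
    parity_row = [sum(col) % 2 for col in zip(*rows)]
    return rows + [parity_row]


def failed_checks(code: Bits) -> Tuple[List[int], List[int]]:
    """Return indices of rows and columns whose parity check fails."""
    bad_rows = [i for i, row in enumerate(code) if sum(row) % 2]
    bad_cols = [j for j, col in enumerate(zip(*code)) if sum(col) % 2]
    return bad_rows, bad_cols


# Toy 2 x 3 information block (illustrative values only).
info = [[1, 0, 1],
        [0, 1, 1]]
codeword = spc_product_encode(info)

# Flip one bit to emulate a channel error at row 0, column 2.
received = [row[:] for row in codeword]
received[0][2] ^= 1

print(codeword)
print(failed_checks(received))
```

A single bit error violates exactly one row check and one column check, so their intersection locates the error. This two-dimensional structure is what makes product codes natural candidates for parallel, iterative decoding.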
    • YUN Yan-zhi, MENG Qing-wei, WANG Han, MA Zhi-qiang
      Vol. 53, Issue 9, Pages: 3245-3255(2025) DOI: 10.12263/DZXB.20250135
      Abstract: To further enhance the security of wireless communication, an extended weighted fractional Fourier transform (EWFRFT) communication method using non-degenerate hyperchaos-driven three-dimensional (3D) constellation encryption is proposed. The method constructs a non-degenerate hyperchaotic system and uses its generated sequences to control the parameters of scaling and Rodrigues’ rotation. It generates randomized scaling matrices and Rodrigues’ rotation matrices and applies 3D constellation encryption to each constellation point through scaling followed by rotation. The encrypted constellation points are then combined into I/Q signals for EWFRFT processing. Furthermore, the mathematical model and cryptographic primitives for 3D constellation encryption are presented, which demonstrate its perfect confidentiality. The encryption of each constellation point is controlled by 8 mutually independent factors, making the transformed positions more random and unpredictable. Simulation results show that the proposed method not only disrupts the original structured constellation distribution, improving the anti-interception capability of wireless signals, but also yields excellent randomness in the transmitted information, effectively resisting common attacks such as exhaustive and statistical attacks.
      Keywords: physical layer security; non-degenerate hyperchaos; three-dimensional scaling; Rodrigues’ rotation; extended weighted fractional Fourier transform
      Updated: 2025-12-27
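The scaling-then-rotation step described in the abstract above can be sketched with Rodrigues' rotation formula, v' = v cos θ + (k × v) sin θ + k (k · v)(1 − cos θ), where k is the unit rotation axis. This is only an illustration under assumptions: the scale factors, axis, and angle below are fixed example values, whereas the paper derives them from hyperchaotic sequences, and the function names are hypothetical.

```python
import math


def rodrigues_rotate(v, axis, theta):
    """Rotate 3D vector v by angle theta (radians) about the given axis."""
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)  # unit rotation axis
    vx, vy, vz = v
    cross = (ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx)
    dot = kx * vx + ky * vy + kz * vz
    c, s = math.cos(theta), math.sin(theta)
    k = (kx, ky, kz)
    return [v[i] * c + cross[i] * s + k[i] * dot * (1 - c) for i in range(3)]


def encrypt_point(p, scale, axis, theta):
    """Scale each coordinate, then rotate (scaling followed by rotation)."""
    scaled = [scale[i] * p[i] for i in range(3)]
    return rodrigues_rotate(scaled, axis, theta)


def decrypt_point(q, scale, axis, theta):
    """Invert the rotation, then undo the per-axis scaling."""
    unrotated = rodrigues_rotate(q, axis, -theta)
    return [unrotated[i] / scale[i] for i in range(3)]


# Example key material (assumed values, not from the paper).
point = [1.0, -1.0, 1.0]
scale = [1.3, 0.7, 2.1]
axis = [1.0, 2.0, 2.0]
theta = 0.9

cipher = encrypt_point(point, scale, axis, theta)
plain = decrypt_point(cipher, scale, axis, theta)
print(cipher, plain)
```

A receiver holding the same scale factors, axis, and angle recovers the original point by rotating through −θ and dividing out the scaling, which is why both operations must be invertible (nonzero scale factors, nonzero axis).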
    • LIN Jin-jian, HUANG Ming-jun, SUN Hui-bo, LIN Zi-han, XIE Kai
      Vol. 53, Issue 9, Pages: 3256-3273(2025) DOI: 10.12263/DZXB.20250341
      Abstract: In complex electromagnetic environments, pulses emitted from multiple unknown radar emitters are highly interleaved in the time domain. Conventional deinterleaving methods suffer from performance degradation because key parameters, such as radio frequency (RF) and pulse width (PW), often exhibit high similarity to one another. In contrast, pulse amplitude (PA) is influenced by underlying physical mechanisms, including the antenna radiation pattern and the beam scanning mode. This is particularly evident in mechanically scanned radars, where PA presents recognizable envelope variation patterns that can provide supplementary discriminative information for deinterleaving. Based on this premise, this paper proposes a radar signal deinterleaving method founded on the joint use of time of arrival (TOA) and pulse amplitude (PA). The proposed method first analyzes the temporal variation patterns of PA under different radar operational modes. Inspired by density-based clustering, it identifies pulse subsets with similar geometric morphologies in the 2D TOA-PA space by combining constraints on neighborhood radius and local slope variations, thereby generating an initial set of cluster paths. To address the fragmentation of co-source tracks caused by missing pulses or noise interference, a cluster path fusion method is introduced. It screens candidate path pairs through temporal overlap, calculates global and local slope entropy to assess PA trend consistency, and employs the Hausdorff distance to measure spatial similarity between paths. This process merges similar paths to reconstruct physically plausible PA envelope tracks. Finally, a first-order difference histogram is constructed from the TOA sequence of the fused tracks, and pulse repetition interval (PRI) candidate grouping and parameter statistics are completed through an associated pulse pair analysis. 
Experiments were conducted in four simulated scenarios covering various combinations of missing-pulse rates and noise-pulse rates from 10% to 50%. Clustering performance was evaluated using four metrics: purity, F-score, Fowlkes-Mallows index (FMI), and adjusted Rand index (ARI). The method was benchmarked against seven mainstream clustering algorithms. The results demonstrate that the proposed method significantly outperforms the baseline algorithms in overall performance. The path fusion mechanism effectively suppresses the generation of spurious emitters, enhances the temporal continuity of clusters, and improves their correspondence to true emitters. The average relative error of PRI estimation did not exceed 0.6%. In summary, this paper performs an initial sort using joint TOA-PA features, introduces a path fusion strategy based on spatio-temporal similarity, and conducts the main deinterleaving via a TOA first-order difference histogram to achieve robust PRI detection. The approach is well suited to deinterleaving unknown emitters in non-cooperative electronic reconnaissance. Future research could focus on adaptive tuning of clustering hyperparameters, an evidence fusion mechanism for multi-epoch deinterleaving results, and validation of generalization performance on measured, real-world pulse descriptor word (PDW) datasets.
      Keywords: temporal-pulse amplitude; clustering; unknown radar emitters; signal sorting; fusion
      Updated: 2025-12-27
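The final step described in the abstract above, a first-order difference histogram over the TOA sequence, can be sketched as follows. This is a simplified illustration: the bin width, the TOA values, and the function names are assumptions, and the paper's associated pulse pair analysis and candidate grouping are not reproduced.

```python
from collections import Counter


def pri_histogram(toas, bin_width):
    """Histogram of first-order TOA differences, keyed by bin index."""
    diffs = [later - earlier for earlier, later in zip(toas, toas[1:])]
    return Counter(round(d / bin_width) for d in diffs)


def estimate_pri(toas, bin_width):
    """Take the most populated histogram bin as the PRI estimate."""
    hist = pri_histogram(toas, bin_width)
    bin_index, _count = hist.most_common(1)[0]
    return bin_index * bin_width


# Toy track in microseconds with jitter; the pulse near 500 is missing.
toas = [0.0, 100.2, 199.9, 300.1, 400.0, 600.1, 699.8]
hist = pri_histogram(toas, bin_width=1.0)
pri = estimate_pri(toas, bin_width=1.0)
print(hist, pri)
```

The missing pulse produces one difference near twice the true interval, which lands in a separate, sparsely populated bin. The dominant bin still reflects the underlying PRI, which is one reason the histogram is applied after path fusion has restored track continuity.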
    • A Causal Structure Learning Algorithm with Dynamic Weighted Condition Set

      CAO Dong-lei, CAO Fu-yuan, WANG Yun-xia, GAO Xiao-fang
      Vol. 53, Issue 9, Pages: 3274-3286(2025) DOI: 10.12263/DZXB.20250637
      Abstract: Constraint-based causal structure learning algorithms have the advantage of not relying on specific functional model assumptions and generally offer high computational efficiency. However, their V-structure orientation stage depends heavily on the results of conditional independence tests (CIT) on specific conditioning sets. Although the recently proposed Shapley-PC algorithm integrates multiple condition sets through Shapley value evaluation to mitigate CIT errors, it still fails to adequately account for the varying influence of different condition sets on orientation decisions, thereby overlooking the importance of certain key sets and reducing orientation accuracy. To address this issue, we propose a dynamically weighted causal structure learning (DW-CSL) algorithm. The core idea of the method is to combine normalized p-values with Shapley values to assign dynamic weights to condition sets of the same size, thereby finely quantifying their different contributions to orientation decisions and effectively suppressing the propagation of CIT errors in V-structure orientation. Specifically, the algorithm first constructs the causal skeleton based on the PC-Stable framework; then, during V-structure orientation, it introduces a dynamic weighted orientation rule that incorporates normalized p-values into Shapley value calculations, making CIT results from different condition sets comparable and enabling precise orientation of unshielded triples; finally, the remaining undirected edges are oriented using Meek’s rules. Experimental results on both synthetic and benchmark datasets demonstrate that, compared with baseline methods, DW-CSL improves V-structure recognition accuracy by an average of 4.75% and edge orientation accuracy by an average of 5.5%, thereby enhancing the stability and overall accuracy of causal structure learning.
      Keywords: causal structure learning; conditional independence test; dynamic weighted condition set; normalized p-value; Shapley value
      Updated: 2025-12-27
    • ZHONG Jiang, DAI Qi-zhu, LI Xue
      Vol. 53, Issue 9, Pages: 3287-3298(2025) DOI: 10.12263/DZXB.20240472
      Abstract: Continuous relation extraction (CRE) plays a crucial role in understanding and adapting to ever-changing data environments. Traditional CRE techniques face two major challenges: the continuous evolution of relation patterns and the risk of forgetting previously learned relations. Although storing and replaying typical examples of old relations has proven effective in reducing forgetting, repeatedly replaying these fixed and limited samples can lead to overfitting. To address this issue, this paper proposes a dynamic prototype-based continuous relation extraction method that combines density clustering with generative large language models, named Continuous Relation Extraction with Density-based Clustering and Generative Large Language Model (CRE-DC GLLM). Specifically, density clustering is used to select memory samples, alleviating the forgetting of previous tasks, and dynamic relation prototypes are designed based on both the full samples and the memory samples. In addition, a generative large language model generates pseudo-samples from the memory samples for replay training, mitigating the overfitting caused by repeated replay. Focused knowledge distillation is also used to improve adaptability to changing relation patterns. Experiments on the FewRel and TACRED datasets verify the effectiveness of the method. The results show significant improvements in the accuracy and efficiency of continuous relation extraction, with particularly strong performance in handling similar relations, preventing knowledge forgetting, and overcoming overfitting.
      Keywords: continuous relation extraction; clustering; large language model; density peaks; dynamic prototype
      Updated: 2025-12-27
    • WANG Ying, GAO Lan, ZHANG Zhe, LIU Xin, WU Yi-xiong, ZHANG Wei-gong
      Vol. 53, Issue 9, Pages: 3299-3309(2025) DOI: 10.12263/DZXB.20250312
      Abstract: To address the failure of traditional operator fusion algorithms in heterogeneous computing systems when crossing different computing units, this paper proposes an optimized operator fusion strategy and implements a hardware design for the new fusion algorithm. Starting from the original design goals of traditional operator fusion, we analyze the impact of operator fusion coverage on inference performance when deploying deep learning algorithms on edge-side heterogeneous computing systems. We explore the feasibility of cross-unit operator fusion and design an improved fusion algorithm model that increases fusion coverage. Furthermore, a heterogeneous computing platform composed of a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and a DLA (Deep Learning Accelerator) is constructed, incorporating a tightly coupled multi-level shared memory architecture tailored to the optimized fusion strategy. Experimental results demonstrate that the proposed fusion strategy significantly improves operator fusion coverage compared to the unoptimized version. Deployed on a Xilinx FPGA (Field-Programmable Gate Array) development board for object detection network inference, the proposed design achieves a 62.67% performance improvement and a 2.68× speedup for YOLOX-Nano inference, and a 71.10% performance improvement and a 3.46× speedup for YOLOv5s inference.
      Keywords: deep learning; operator fusion; convolutional neural network; heterogeneous computing system; FPGA; GPU
      Updated: 2025-12-27
    • Differentially Private with Sparse and Smooth Self-Distillation

      ZHAO Deng-feng, XUE Da-xuan, ZHAO Su-yun, CHEN Hong
      Vol. 53, Issue 9, Pages: 3310-3318(2025) DOI: 10.12263/DZXB.20250133
      Abstract: To mitigate privacy leakage risks in deep learning, numerous studies use differential privacy techniques to train neural networks. However, substantial performance degradation is often unavoidable. To address this privacy-utility trade-off, we propose the differentially private learning with sparse and smooth self-distillation (DP3SD) method, which leverages dual temperature scaling to enhance the utility of privacy-preserving deep learning. Specifically, DP3SD introduces a dual scaling loss function composed of a sparse classification loss and a smooth distillation loss. Incorporating a lower temperature into the classification loss sharpens the class prediction distribution of the student model, thereby reducing the influence of low-probability classes that are likely noise-induced. Conversely, applying a higher temperature to the distillation loss smooths the prediction distributions of both the teacher and student models, promoting stable and efficient knowledge transfer under differential privacy constraints. This dual scaling mechanism, trained with differentially private stochastic gradient descent under strict privacy guarantees, helps the student model progressively improve its learning from the teacher model while alleviating the perturbations caused by privacy constraints. Extensive experiments on three public datasets show that DP3SD effectively improves model performance while ensuring rigorous data privacy.
      Keywords: deep learning; differential privacy; privacy protection; knowledge distillation; stochastic gradient descent
      Updated: 2025-12-27
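The dual temperature scaling described in the abstract above can be illustrated with a temperature-scaled softmax: a temperature below 1 sharpens the distribution and a temperature above 1 smooths it. The sketch below is an assumption-heavy illustration, not the paper's loss. The temperatures, the weight `alpha`, the way the two terms are combined, and all function names are hypothetical, and no differential privacy noise or gradient clipping is modeled.

```python
import math


def softmax_t(logits, temperature):
    """Softmax of logits divided by a temperature (numerically stable)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dual_loss(student_logits, teacher_logits, label, t_low, t_high, alpha):
    """Low-temperature cross-entropy plus high-temperature KL divergence."""
    p_cls = softmax_t(student_logits, t_low)
    cross_entropy = -math.log(p_cls[label])
    p_teacher = softmax_t(teacher_logits, t_high)
    p_student = softmax_t(student_logits, t_high)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    return cross_entropy + alpha * kl


logits = [2.0, 1.0, 0.1]
sharp = softmax_t(logits, 0.5)    # assumed low temperature
neutral = softmax_t(logits, 1.0)
smooth = softmax_t(logits, 4.0)   # assumed high temperature
print(sharp, neutral, smooth)
```

In this toy example the largest probability grows as the temperature drops, which suppresses low-probability classes in the classification term, while the high-temperature distributions used in the distillation term are flatter and expose more of the teacher's relative class ordering.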
    • CHEN Yi-fei, LIU Yan-wei, LIU Jin-xia, GU Xiao-yan
      Vol. 53, Issue 9, Pages: 3319-3330(2025) DOI: 10.12263/DZXB.20250549
      Abstract: Holographic display technology can reproduce three-dimensional images that contain all the information of an object, providing users with a highly realistic visual experience. It is regarded as the most ideal naked-eye 3D display technology currently available. The unique immersive 3D experience offered by holographic displays gives holographic communication broad application prospects in fields such as healthcare, education, and virtual reality. However, the large-scale commercial application of holographic communication still faces numerous obstacles. One major issue affecting the quality of holographic communication is the multiple aliasing distortions caused by compression noise and channel interference during hologram transmission. Existing image distortion correction techniques mostly focus on a single distortion type and struggle to address mixed holographic distortions in complex scenarios, severely limiting the effectiveness of holographic technology in practical applications. To tackle this issue, this paper proposes a holographic image distortion correction method based on a multi-branch complex-valued convolutional neural network. The method constructs a multi-level parallel multi-branch network architecture to achieve in-depth extraction and collaborative fusion of multi-scale and multi-dimensional distortion features of holographic images. In addition, a complex-valued adaptive attention mechanism is proposed to enhance the network’s ability to perceive and suppress key distortion features such as phase distortion and amplitude attenuation, thereby achieving precise end-to-end correction of the distortions introduced during compression and transmission. 
In the experiments involving mixed-type holographic distortions including compression and channel noises, compared to the state-of-the-art deep learning distortion correction method SCUNet (Swin-Conv-UNet), the proposed method achieves an average improvement of over 0.41 dB in peak signal-to-noise ratio (PSNR) and an average increase of over 0.006 in structural similarity index (SSIM). These experimental results show that the proposed method can effectively suppress the brightness abnormalities caused by amplitude distortion, correct the phase distortions, and significantly enhance the reconstruction quality of holographic images.  
      Keywords: holographic image; distortion correction; complex-valued neural network; complex-domain attention mechanism; multi-branch-fused complex-valued network
      Updated: 2025-12-27
    • XUE Wei, CHEN Chuang-hui, DU Ming-yang, ZHONG Ping, ZHENG Xiao
      Vol. 53, Issue 9, Pages: 3331-3344(2025) DOI: 10.12263/DZXB.20250642
      Abstract: Medical image segmentation is a key technology in the field of smart healthcare, aiming to accurately identify and segment organs or pathological regions within images, thereby providing reliable quantitative evidence for clinical diagnosis and treatment decision-making. In recent years, medical image segmentation methods based on convolutional neural networks (CNNs) have been widely adopted due to their excellent capability in extracting local features. However, due to the inherent local receptive field of convolution operations, CNNs still suffer from limitations in modeling long-range spatial dependencies and global contextual information. Although Transformer-based methods achieve global feature modeling through the self-attention mechanism, their computational complexity grows quadratically with sequence length, limiting their efficiency in practical applications. To mitigate these issues, this paper proposes a new medical image segmentation network, which mainly consists of two core modules: cross-vision state space (C-VSS) and multi-branch interactive attention (MBIA). The C-VSS module integrates the local perception advantage of convolution with the long-sequence modeling capability of state space models. Through a dual-branch collaborative strategy, it achieves effective extraction and fusion of local and global features while maintaining linear computational complexity. The MBIA module enhances the representation of multi-scale contextual information through a multi-branch architecture and establishes bidirectional information interaction pathways between the encoder and the decoder to enable dynamic fusion of cross-level features, thereby improving the model’s ability to perceive complex structures. 
Experimental results on four public medical image segmentation datasets, including CVC-ColonDB, ISIC2017, ISIC2018, and COVID-19, demonstrate that our method outperforms the second-best approach by approximately 0.94, 0.83, 1.04, and 2.28 percentage points in intersection over union (IoU) and 0.63, 0.50, 1.56, and 1.51 percentage points in dice similarity coefficient (DSC), respectively. In addition, the proposed method achieves average (Avg) scores of 91.51%, 91.74%, 91.30%, and 88.78% on the four datasets, respectively, all of which are higher than those of the comparative methods, demonstrating its superior segmentation performance. Furthermore, ablation studies show that removing the C-VSS module alone leads to a decrease of 3.62, 2.15, 1.69, and 2.13 percentage points in IoU, and 2.25, 1.29, 1.02, and 1.40 percentage points in DSC, respectively. Removing the MBIA module alone results in a decline of 10.11, 0.50, 1.08, and 1.97 percentage points in IoU, and 6.54, 0.30, 0.65, and 1.30 percentage points in DSC, respectively. The experimental results fully verify the effectiveness of the C-VSS and MBIA modules, indicate that the MBIA module contributes more significantly to performance improvement, and reveal a notable synergy between the two.  
      Keywords: medical image segmentation; cross-visual state space module; multi-branch interactive attention module; dynamic feature fusion; convolutional neural network; Transformer
      Updated: 2025-12-27
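The two metrics reported in the abstract above, intersection over union (IoU) and the dice similarity coefficient (DSC), can be computed for binary masks as in the following minimal sketch. The flattened toy masks and the function name are illustrative assumptions; the convention of returning 1.0 for two empty masks is also an assumption, since evaluation toolkits differ on that edge case.

```python
def iou_and_dice(pred, target):
    """Compute IoU and Dice for two flattened binary masks (lists of 0/1)."""
    intersection = sum(p & t for p, t in zip(pred, target))
    pred_area, target_area = sum(pred), sum(target)
    union = pred_area + target_area - intersection
    total = pred_area + target_area
    iou = intersection / union if union else 1.0
    dice = 2 * intersection / total if total else 1.0
    return iou, dice


# Toy masks: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
iou, dice = iou_and_dice(pred, target)
print(iou, dice)
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Averages over a dataset need not obey this identity exactly, which is why papers typically report both.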
    • ZHANG Su-kai, CHEN Peng, DONG Zi-ying, WANG Wei
      Vol. 53, Issue 9, Pages: 3345-3357(2025) DOI: 10.12263/DZXB.20250475
      Abstract: To address the challenges in radar sea clutter simulation under complex sea conditions, including insufficient global feature modeling, limited multimodal generation capability, and a simplistic evaluation system, this paper proposes a generative adversarial network enhanced by multi-head self-attention mechanisms, the self-attention high-fidelity generative adversarial network (SA-HIFIGAN). The model incorporates multi-head self-attention modules in both the generator and the discriminator to strengthen the modeling of long-range spatiotemporal correlations in sea clutter. Additionally, a multi-scale and multi-period discriminator structure with classification functionality is designed. Furthermore, this paper constructs a hybrid evaluation system integrating distribution similarity, spectral error, and statistical stability, achieving multidimensional quality control for generated clutter. Experiments on field-measured X-band radar datasets validate the model’s effectiveness in terms of amplitude probability density, power spectral density, and spatiotemporal correlation. The results demonstrate that SA-HIFIGAN achieves high consistency with measured data across these metrics. It can generate clutter data with characteristics corresponding to sea state levels, and it also outperforms existing clutter generation methods such as the deep convolutional generative adversarial network (DCGAN) and the variational auto-encoder (VAE) in comprehensive scoring.
      Keywords: multi-head attention mechanism; generative adversarial network; sea clutter simulation; conditional control
      Updated: 2025-12-27
    • DU Wen-liang, XU Xiao-yu, ZHAO Jia-qi, LIU Bing, ZHOU Yong
      Vol. 53, Issue 9, Pages: 3358-3370(2025) DOI: 10.12263/DZXB.20250326
      Abstract: Remote sensing image-text retrieval aims to quickly and accurately retrieve semantically matching text or images from a massive remote sensing image-text database based on a given image or text. With the rapid development of Earth observation technology, the application value of this technology in fields such as urban planning, disaster emergency response, and environmental monitoring has become increasingly prominent, making it a research hotspot in multimodal information processing. Vision-language pre-training models, pre-trained on general-domain data, have laid the technical foundation for general image-text retrieval tasks by achieving efficient semantic alignment between images and text. However, a significant domain gap exists between general and remote sensing data, which limits the performance of these pre-trained models when directly applied to remote sensing tasks. Therefore, fine-tuning is necessary to adapt the vision-language model to the unique data distribution of the remote sensing domain. Yet existing fine-tuning methods face two core challenges when applied to the remote sensing domain. First, there is insufficient cross-modal alignment: current fine-tuning methods lack explicit cross-modal information interaction mechanisms, making it difficult to fully model the intrinsic correlation between images and text. Second, it is difficult to achieve fine-grained semantic representation: existing methods often struggle to capture fine-grained semantic information in remote sensing images, such as vast differences in target scales, high similarity between ground object classes, and complex spatial-topological relationships. Performance is particularly limited when dealing with small targets or semantic confusion caused by similar ground objects, which significantly reduces retrieval accuracy. 
This paper addresses the problems of insufficient cross-modal alignment and difficulty in fine-grained semantic representation in remote sensing image-text retrieval tasks by proposing a fine-tuning method based on a shared prompt and Mamba adapter. This method first establishes an explicit interaction mechanism for image and text features by designing a cross-modal shared prompt generation module. Then, it constructs a dual-branch Mamba adapter fine-tuning module for remote sensing scenarios to achieve fine-grained representation of image and text features, respectively. Finally, it uses contrastive loss and affiliation loss to alleviate the semantic confusion caused by small targets or similar ground objects in remote sensing images. Experimental results show that this method achieves mean average recall rates of 37.3% and 48.05% on the remote sensing image captioning dataset (RSICD) and remote sensing image-text match dataset (RSITMD) datasets, respectively, which are improvements of 3.68% and 1.52% compared to the current state-of-the-art adapter fine-tuning method. Furthermore, ablation studies have verified the effectiveness of the shared prompt generation module and the Mamba adapter.  
      Keywords: image-text retrieval; remote sensing images; Mamba adapter; vision-language model fine-tuning
      Updated: 2025-12-27
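The recall figures quoted in the remote sensing retrieval abstract above are typically built from Recall@K values in both retrieval directions. The following is a minimal sketch of that metric, not the authors' evaluation code. It assumes one matching text per image (index i matches index i), cutoffs K of 1, 5, and 10, and a hypothetical similarity matrix `sim`; the paper's exact protocol is not stated in the abstract.

```python
def recall_at_k(sim, k):
    """Fraction of queries whose true match (same index) ranks in the top k."""
    hits = 0
    for i, row in enumerate(sim):
        # Rank candidate indices by descending similarity to query i.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(sim)


def mean_recall(sim, ks=(1, 5, 10)):
    """Average Recall@K over image-to-text and text-to-image retrieval."""
    # Transposing the matrix swaps the roles of queries and candidates.
    sim_t = [list(col) for col in zip(*sim)]
    scores = [recall_at_k(m, k) for m in (sim, sim_t) for k in ks]
    return sum(scores) / len(scores)


# Hypothetical 3x3 similarity matrix (rows: images, columns: texts).
sim = [
    [0.9, 0.1, 0.2],
    [0.3, 0.2, 0.8],
    [0.1, 0.7, 0.4],
]
print(mean_recall(sim, ks=(1, 2)))
```

Averaging over both directions keeps the score from favoring a model that is strong at image-to-text retrieval but weak at text-to-image retrieval, or the reverse.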
    • Cross-Scene Point Prediction Crowd Counting Method Based on Meta-Weight-Net

      XU Xin, TAN Zhuo-lin, GAO Chen-qiang, XI Yue
      Vol. 53, Issue 9, Pages: 3371-3383(2025) DOI: 10.12263/DZXB.20250285
      Abstract: Cross-scene crowd counting often suffers from degraded accuracy due to data distribution disparities caused by factors such as illumination, scale, camera angles, and crowd density. To address the limitations of existing crowd counting models in cross-scene applications, a meta-learning-based scene-aware reweighting method is proposed. Instead of relying on traditional density map approaches that suffer from localization ambiguity, the method employs a point prediction counting model to directly estimate the precise coordinates of each individual. A meta-weight network is introduced to learn an explicit weighting scheme for the point prediction loss from meta-data, while a scene-aware branch treats each scene as an independent learning task, leveraging intrinsic features across scenes to adaptively adjust the weighting scheme and mitigate the impact of annotation noise on cross-scene generalization. Furthermore, to overcome the limitations of existing datasets in educational settings, a new campus multi-scene crowd counting dataset (MS-Crowd) is constructed, providing a more comprehensive benchmark for cross-scene evaluation. Experimental results demonstrate that the proposed method reduces the mean absolute error (MAE) by 19.7% and 10.7% on the MS-Crowd and the public outdoor dataset ShanghaiTech, respectively, validating its effectiveness.
      Keywords: crowd counting; crowd localization; meta-learning; cross-scene
      Updated: 2025-12-27
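The crowd counting abstract above reports its gains as reductions in mean absolute error (MAE). As a small illustration of how such a figure can be derived, the sketch below computes MAE over per-image counts and the relative reduction between two models. It assumes the quoted percentages are relative reductions, which the abstract does not state explicitly, and all counts are hypothetical.

```python
def mean_absolute_error(pred_counts, true_counts):
    """Mean absolute difference between predicted and true per-image counts."""
    if len(pred_counts) != len(true_counts):
        raise ValueError("prediction and ground-truth lists differ in length")
    total = sum(abs(p - t) for p, t in zip(pred_counts, true_counts))
    return total / len(true_counts)


def relative_reduction(baseline_mae, new_mae):
    """Percentage by which new_mae is lower than baseline_mae."""
    return (baseline_mae - new_mae) / baseline_mae * 100.0


# Hypothetical counts for four test images.
true_counts = [120, 45, 300, 80]
baseline_pred = [130, 40, 280, 90]
improved_pred = [125, 43, 290, 84]

baseline_mae = mean_absolute_error(baseline_pred, true_counts)
improved_mae = mean_absolute_error(improved_pred, true_counts)
print(baseline_mae, improved_mae, relative_reduction(baseline_mae, improved_mae))
```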
    • LI Hao, HAO Wen-ning, ZOU Shi-chen, XIE Xiao-yu
      Vol. 53, Issue 9, Pages: 3384-3396(2025) DOI: 10.12263/DZXB.20250308
      Abstract: Diffusion models have garnered significant attention in the field of image generation due to their high precision. The backbone networks of these models have evolved from U-Net to Transformer architectures. However, the computational complexity of Transformer-based models scales quadratically with sequence length, posing a substantial challenge for generating high-resolution images. To address this issue, we propose a novel progressive image synthesis method based on Diffusion-Mamba and scale-invariant loss. Our method leverages the efficient characteristics of Mamba and the powerful modeling capabilities of diffusion models by integrating multi-directional scanning mechanisms and lightweight local structure enhancement modules. It achieves an efficient transformation from low-resolution images to high-resolution images through a progressive cascaded diffusion process, significantly reducing computational complexity. Furthermore, we design a contrastive learning-based scale-invariant loss function that maximizes the mutual information of the same target across different resolutions, thereby aligning and enhancing cross-scale feature representations. Experimental results on the ImageNet dataset (FID = 1.67) demonstrate that the proposed method achieves comprehensive improvements in accuracy, validating its efficacy and efficiency.
      Keywords: image synthesis; diffusion model; state space model; contrastive learning; scale-invariant loss
      Updated: 2025-12-27
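The image synthesis abstract above describes a contrastive loss that maximizes mutual information for the same target across resolutions. One common way to express that idea is the InfoNCE objective, sketched below. This is an illustration only and not the authors' loss: it assumes that features of the same image at low and high resolution form the positive pair, that other images in the batch act as negatives, and that the temperature value is a hypothetical choice.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(low_res_feats, high_res_feats, temperature=0.1):
    """Mean InfoNCE loss; item i in both lists comes from the same image."""
    n = len(low_res_feats)
    total = 0.0
    for i in range(n):
        logits = [
            cosine(low_res_feats[i], high_res_feats[j]) / temperature
            for j in range(n)
        ]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / n


# Hypothetical 2-D features for a batch of two images.
low = [[1.0, 0.0], [0.0, 1.0]]
high_aligned = [[1.0, 0.0], [0.0, 1.0]]
high_swapped = [[0.0, 1.0], [1.0, 0.0]]
print(info_nce(low, high_aligned), info_nce(low, high_swapped))
```

The loss is small when each low-resolution feature is closest to its own high-resolution counterpart and large when the pairs are mismatched, which is the alignment behavior the abstract describes.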
    • HUANG Chen, LIU Hui-jie, ZHANG Yan, YANG Chao, SONG Jian-hua
      Vol. 53, Issue 9, Pages: 3397-3409(2025) DOI: 10.12263/DZXB.20250533
      Abstract: Multimodal aspect-based sentiment analysis (MABSA) aims to accurately identify aspect terms and determine their sentiment polarity from multimodal input data. Existing studies focus on integrating multimodal information to improve sentiment analysis performance. However, they still face two critical challenges in multi-aspect and multi-sentiment scenarios: (1) a lack of comprehensive perception of aspect terms in multimodal input data; and (2) sentiment semantic bias, where current models tend to focus on sentiment semantics strongly correlated with specific aspect terms, while ignoring weakly associated yet equally important sentiment cues. To address these issues, we propose a novel multimodal aspect-based sentiment analysis method, ANAGAL (Adaptive Noise and Aspect Graph Association Learning), which integrates adaptive noise handling and aspect-graph association learning to enhance analytical performance in scenarios involving multiple aspects and multiple sentiments. Specifically, an adaptive noise enhancement module is designed to supplement aspect information, thereby improving the model’s aspect perception and robustness. In addition, an aspect-graph association learning module is introduced to associate all aspect terms and learn related sentiment semantics. Extra parameters are further incorporated to calibrate sentiment representations, enabling the model to capture more generalized sentiment biases and better identify sentiment polarity associated with each aspect term. Extensive experimental evaluations on public datasets demonstrate that ANAGAL performs exceptionally well in addressing these challenges. Compared to existing state-of-the-art MABSA models, ANAGAL improves precision by 1.46 percentage points and 1.56 percentage points on the Twitter-2015 and Twitter-2017 datasets, and by 2.48 percentage points and 1.55 percentage points on the MASAD (Multimodal Aspect Sentiment Analysis Dataset) and EmoMeta datasets.
      Keywords: multimodal; aspect-based sentiment analysis; pre-trained language model; noise augmentation; aspect-graph association learning; graph attention network
      Updated: 2025-12-27
    • HUANG Yu-zhe, GUAN Yong-yuan, WEI Song-jie
      Vol. 53, Issue 9, Pages: 3410-3424(2025) DOI: 10.12263/DZXB.20250385
      Abstract: Network traffic time series anomaly detection, as a crucial component of time series research, has garnered widespread attention and study in both academia and industry. To address issues such as high training costs and low detection efficiency in existing methods, this paper proposes ScanMamba, a novel time series classification model based on the Mamba-DSCNN architecture. The model significantly enhances the modeling capability for complex network traffic time series data through a designed variable-range multidirectional scanning mechanism and a spatiotemporal feature fusion mechanism. Specifically, ScanMamba integrates the Mamba State Space Model with a depthwise separable convolutional neural network (DSCNN) to dynamically adjust the effective receptive field across multiple temporal resolutions via downsampling, capturing temporal dependency features at different scales. A multidirectional scanning fusion strategy is employed to strengthen the modeling of long-range dependencies and nonlinear patterns. A multiscale pooling module combined with an attention mechanism performs weighted feature fusion, effectively boosting classification performance. During training, the incorporation of residual connections and a deep supervision mechanism mitigates gradient vanishing, accelerates model convergence, and enhances generalization capability. Experimental results on the CIC-IDS2017 dataset demonstrate that ScanMamba achieves accuracy, recall, and F1 scores of 0.9831, 0.9849, and 0.9837, respectively. Its accuracy outperforms Mamba-ECANet by approximately 3%. For high-intensity attacks, ScanMamba attains F1 scores of 0.9980 and 0.9847, representing a 3.3 improvement over traditional LSTM methods in DDoS detection. Reducing the state space dimensionality decreased training time by approximately 10% with only a 0.25 performance drop.
The average inference latency per data point for ScanMamba is 6.3 ms, significantly lower than the 11.2 ms of traditional LSTM models and the 9.6 ms of Transformer-based architectures.
      Keywords: time series data; Mamba; network traffic; anomaly detection; deep learning; feature fusion
      Updated: 2025-12-27
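The ScanMamba abstract above relies on scanning a time series in more than one direction and fusing the results. The toy sketch below shows the general idea with a forward and a backward pass over the sequence. It is an illustration under strong assumptions: the recurrence is a fixed scalar linear state update rather than Mamba's selective state space layer, the fusion is a plain average, and the parameters `a` and `b` are hypothetical.

```python
def linear_scan(xs, a=0.9, b=0.1):
    """Scalar linear recurrence h_t = a * h_{t-1} + b * x_t, returning all h_t."""
    h = 0.0
    states = []
    for x in xs:
        h = a * h + b * x
        states.append(h)
    return states


def bidirectional_scan(xs, a=0.9, b=0.1):
    """Fuse a forward scan and a backward scan by averaging per time step."""
    forward = linear_scan(xs, a, b)
    # Scan the reversed sequence, then flip back to align time steps.
    backward = linear_scan(xs[::-1], a, b)[::-1]
    return [(f + g) / 2 for f, g in zip(forward, backward)]


# Hypothetical traffic feature with a single spike at the first step.
print(bidirectional_scan([1.0, 0.0, 0.0], a=0.5, b=1.0))
```

With the backward pass included, each time step receives context from later samples as well as earlier ones, which a single forward scan cannot provide.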

      CORRESPONDENCE

    • LIU Huan-lin, LIU Bo, CHEN Yong, MA Bing, QIU Yan, CHEN Hao-nan
      Vol. 53, Issue 9, Pages: 3425-3432(2025) DOI: 10.12263/DZXB.20240963
      Abstract: To enhance the security and disaster resilience of virtual network embedding (VNE) under multi-area fault scenarios in elastic optical networks, a survivable VNE method based on ant colony optimization and a quasi-real-time key pool (ACOQKP-SVNE) is proposed. A ciphertext data transmission path embedding strategy based on ant colony optimization is designed to reduce the potential risk of the ciphertext transmission path, and a quantum key distribution path embedding strategy based on quasi-real-time key pool construction is designed to improve key utilization. When a multi-area failure occurs, the affected physical components are recovered using a damage-aware failure recovery strategy based on the different physical components affected. Simulation results show that, compared with the baseline algorithms, the proposed ACOQKP-SVNE algorithm reduces the bandwidth blocking probability by up to about 16%.
      Keywords: elastic optical networks; survivable virtual network embedding; ant colony optimization; quantum key distribution; bandwidth blocking probability
      Updated: 2025-12-27
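The virtual network embedding abstract above uses ant colony optimization to pick lower-risk transmission paths. The sketch below shows the standard ant colony transition rule and pheromone update in isolation. It is not the ACOQKP-SVNE algorithm: the heuristic is assumed to be the inverse of a hypothetical per-link risk score, and `alpha`, `beta`, `rho`, and `deposit` are illustrative parameters.

```python
def transition_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Probability of choosing each candidate link: tau^alpha * eta^beta, normalized."""
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(pheromone, heuristic)]
    total = sum(weights)
    return [w / total for w in weights]


def evaporate_and_deposit(pheromone, chosen, rho=0.1, deposit=1.0):
    """Evaporate pheromone on every link, then reinforce the chosen link."""
    return [
        (1 - rho) * t + (deposit if j == chosen else 0.0)
        for j, t in enumerate(pheromone)
    ]


# Hypothetical risk scores for two candidate links; lower risk is better.
risk = [0.5, 0.25]
heuristic = [1.0 / r for r in risk]
pheromone = [1.0, 1.0]

probs = transition_probabilities(pheromone, heuristic, alpha=1.0, beta=1.0)
pheromone = evaporate_and_deposit(pheromone, chosen=1, rho=0.5, deposit=1.0)
print(probs, pheromone)
```

Using inverse risk as the heuristic biases ants toward safer links from the start, while the pheromone term gradually reinforces links that appear in good solutions.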

      SURVEYS AND REVIEWS

    • ZENG Xu, HUANG Rui-lai, TU Cheng, ZHANG Yi, ZHANG Xiao-sheng
      Vol. 53, Issue 9, Pages: 3433-3453(2025) DOI: 10.12263/DZXB.20250623
      Abstract: Wood, as one of the most abundant renewable resources on Earth, possesses a naturally porous structure, excellent mechanical properties, and biocompatibility, endowing it with significant potential in the field of green electronics. Unlike conventional polymeric or inorganic substrates, wood has unique fiber orientation and porous micro/nanostructures. These features provide advantages in charge transport, ion migration, and interfacial regulation, which offer a sustainable and biodegradable basis for wood-based electronic devices. In recent years, intelligent sensing technologies, self-powered systems, and flexible electronics have developed rapidly. Researchers have modified wood through structural design and material functionalization. As a result, wood has been endowed with new properties such as conductivity, flexibility, and optical transparency. This has revived its potential in green electronics. This review summarizes the physical structure and chemical composition of wood. It also introduces common modification and functionalization methods. These include delignification, chemical doping, carbonization, and bioinspired structural design. Such strategies enable wood to acquire diverse functional properties. Based on these advances, researchers have built various high-performance wood-based devices. Energy-harvesting devices, such as evaporation-driven and triboelectric generators, can capture energy from the environment. Multifunctional sensors can detect pressure, humidity, and gas with high sensitivity. Energy-storage devices, such as supercapacitors, show excellent energy storage and release performance. We further discuss the integration of wood-based electronic microsystems. Energy management modules, sensing units, and wireless communication modules are combined to achieve “sensing function-micro/nano power supply” integration. These systems provide stable operation and reliable signal response.
They also match the goals of green and sustainable development. Finally, we discuss future opportunities and challenges facing the development of wood-based electronic technologies.  
      Keywords: wood electronics; wood-based sensor; nanogenerator; wood-based evaporation-induced electricity generator; wood-based supercapacitor; wood-based microsystems
      Updated: 2025-12-27
    • ZHANG Ya-zhou, LIU Qi-meng, RONG Lu, ZHAO Bin, LI Ai-jun
      Vol. 53, Issue 9, Pages: 3454-3472(2025) DOI: 10.12263/DZXB.20250367
      Abstract: Large language models (LLMs) have achieved outstanding success across a wide range of downstream tasks in natural language processing (NLP), thanks to their remarkable ability to follow instructions and learn from context. As human intelligence is inherently multimodal, the momentum of this research has naturally expanded into other modalities, particularly vision and speech. In the realm of vision, large-scale models like GPT-4V and LLaVA employ foundational language models as the “brain,” enabling them to perform complex tasks in visual understanding and reasoning. These models have shown impressive abilities to break down task barriers, transcending traditional boundaries in vision-related tasks. In a similar vein, speech large language models (SLLMs) have attracted significant interest from both academia and industry. Notable models such as Whisper and Qwen-Audio have emerged as frontrunners, setting new performance records in speech-related tasks, including speech recognition, understanding, and synthesis. Their development demonstrates significant potential for further breakthroughs. This paper aims to provide a comprehensive review of the latest advancements in SLLM research. It delves into the foundational architecture of these models, thoroughly exploring key concepts such as model components, training strategies, data construction, and evaluation methods. Furthermore, it addresses the primary challenges that researchers face in this rapidly evolving field and discusses possible future directions for research and development in speech-based large models.
      Keywords: speech large language models; large language models; instruction following; speech understanding; pre-training
      Updated: 2025-12-27