3D Model Classification for Non-Aligned Poses

DING Bo; GAO Yuan; FAN Yu-fei; HE Yong-jun

doi:10.12263/DZXB.20211366

PAPERS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 51, Issue 9, Pages: 2379-2390 (2023)
- Author affiliation: School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080, China
- Funding: National Natural Science Foundation of China (61673142)
- DOI: 10.12263/DZXB.20211366
- CLC: TP391.4
- Received: 09 October 2021; Revised: 23 April 2022; Published: 25 September 2023

DING Bo, GAO Yuan, FAN Yu-fei, et al. 3D Model Classification for Non-Aligned Poses[J]. ACTA ELECTRONICA SINICA, 2023, 51(09): 2379-2390. DOI: 10.12263/DZXB.20211366.

Abstract

Current 3D model classification methods are validated on datasets whose initial poses are already aligned. In practical applications, however, the poses of 3D models are unknown, and non-aligned 3D models cause classification accuracy to drop sharply. This paper proposes a new 3D model classification method that is suitable for both aligned and non-aligned poses. The method employs a graph convolutional neural network (GCN) to learn the spatial relations between views, using the preset camera positions as the vertices of the graph structure. A temporal feature extraction network and an attention network are used to further improve the effect of the GCN. Experiments on the ModelNet10 and ModelNet40 datasets show that, with aligned poses, the proposed method achieves classification accuracies of 99.3% and 97.4%, respectively, which is much higher than existing methods. With non-aligned poses, the method also achieves high classification accuracy.
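The idea of treating camera positions as graph vertices can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The 12-camera circular rig, the k-nearest-neighbour edges, the feature sizes, and all function names below are assumptions chosen for illustration. The propagation rule is the standard symmetric-normalized graph convolution, swapped in because the abstract does not specify the layer. The temporal feature extraction network is omitted, and random vectors stand in for per-view CNN features.

```python
import numpy as np

def camera_positions(n_views=12, elevation_deg=30.0):
    """Hypothetical rig: n_views cameras evenly spaced on a circle around the object."""
    az = np.linspace(0.0, 2.0 * np.pi, n_views, endpoint=False)
    el = np.deg2rad(elevation_deg)
    return np.stack(
        [np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.full(n_views, np.sin(el))],
        axis=1,
    )

def knn_adjacency(positions, k=2):
    """Symmetric adjacency linking each camera (vertex) to its k nearest cameras."""
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)            # a camera is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    n = len(positions)
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)             # make the graph undirected

def normalize_adjacency(adj):
    """Compute D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization."""
    a_hat = adj + np.eye(len(adj))
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, features, weight):
    """One graph convolution followed by ReLU: each view mixes with its neighbours."""
    return np.maximum(a_norm @ features @ weight, 0.0)

def attention_pool(features, query):
    """Softmax attention over views, returning one descriptor for the whole shape."""
    scores = features @ query
    scores = scores - scores.max()            # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ features, weights

rng = np.random.default_rng(0)
pos = camera_positions()
a_norm = normalize_adjacency(knn_adjacency(pos, k=2))
view_feats = rng.normal(size=(12, 16))        # stand-in for per-view CNN features
hidden = gcn_layer(a_norm, view_feats, rng.normal(size=(16, 8)))
shape_desc, attn = attention_pool(hidden, rng.normal(size=8))
```

In this sketch the graph depends only on where the cameras sit, not on the rendered images, so the same adjacency can be reused for every model. A classifier head on `shape_desc` would complete the pipeline.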
