TY - JOUR
T1 - Global Attention-Guided Dual-Domain Point Cloud Feature Learning for Classification and Segmentation
AU - Li, Zihao
AU - Gao, Pan
AU - You, Kang
AU - Yan, Chuan
AU - Paul, Manoranjan
N1 - Publisher Copyright: IEEE
PY - 2024
Y1 - 2024
AB - Previous studies have demonstrated the effectiveness of point-based neural models on point cloud analysis tasks. However, a crucial issue remains in producing an efficient input embedding for raw point coordinates. Another issue lies in the limited efficiency of neighbor aggregation, which is a critical component of the network stem. In this paper, we propose a Global Attention-guided Dual-domain Feature Learning network (GAD) to address these issues. We first devise the Contextual Position-enhanced Transformer (CPT) module, which is equipped with an improved global attention mechanism, to produce a global-aware input embedding that guides subsequent aggregations. Then, the Dual-domain K-nearest neighbor Feature Fusion (DKFF) module is cascaded to conduct effective feature aggregation through novel dual-domain feature learning, which captures both local geometric relations and long-distance semantic connections. Extensive experiments on multiple point cloud analysis tasks (e.g., classification, part segmentation, and scene semantic segmentation) demonstrate the superior performance of the proposed method and the efficacy of the devised modules.
KW - classification
KW - Convolution
KW - dual-domain feature learning
KW - global attention-guided
KW - point cloud
KW - Point cloud compression
KW - Representation learning
KW - segmentation
KW - Semantics
KW - Shape
KW - Task analysis
KW - Three-dimensional displays
UR - http://www.scopus.com/inward/record.url?scp=85199056415&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85199056415&partnerID=8YFLogxK
U2 - 10.1109/TAI.2024.3429050
DO - 10.1109/TAI.2024.3429050
M3 - Article
AN - SCOPUS:85199056415
SN - 2691-4581
SP - 1
EP - 12
JO - IEEE Transactions on Artificial Intelligence
JF - IEEE Transactions on Artificial Intelligence
ER -