This work presents a 3D hand attitude estimation method for a fixed hand posture, based on a CNN and LightGBM applied to dual-view RGB images, to facilitate hand-posture teleoperation. First, using dual-view cameras and an IMU sensor, we propose a simple method for building 3D hand attitude datasets: it quickly acquires dual-view 2D hand image sets and automatically attaches the corresponding three-axis attitude-angle labels. Then, combining ensemble learning, with its strong regression-fitting capability, and deep learning, with its excellent automatic feature extraction, we present an integrated hand-attitude CNN regression model. The model uses two CNNs to extract dual-view hand image features and a Bayesian-optimization-based LightGBM as the ensemble learning component to perform 3D hand attitude regression. Finally, a mapping from dual-view 2D images to 3D hand attitude angles is established through a feature-integration training approach, and comparative experiments are conducted on the test set. The experimental results demonstrate that the proposed method effectively mitigates the hand self-occlusion problem and achieves 3D hand attitude estimation with only two ordinary RGB cameras.
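The overall pipeline described above (two view-specific CNN feature extractors, feature fusion, then an ensemble regressor mapping fused features to three attitude angles) can be sketched as follows. This is a minimal structural sketch, not the authors' implementation: the `cnn_features` function and all data are placeholders, and a ridge-style least-squares fit stands in for the Bayesian-optimized LightGBM regressor (`lightgbm.LGBMRegressor`) used in the paper.

```python
import numpy as np

# Hypothetical stand-in for a view-specific CNN backbone. In the actual
# model, each view would pass through a trained CNN whose penultimate
# layer yields a fixed-length feature vector per image.
def cnn_features(images, dim=128, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(images.shape[1], dim))
    return np.tanh(images @ W)  # placeholder nonlinearity

# Synthetic dual-view image batches (flattened pixels) and 3-axis
# attitude-angle labels, which the paper obtains from an IMU sensor.
rng = np.random.default_rng(42)
view_a = rng.normal(size=(200, 64))
view_b = rng.normal(size=(200, 64))
angles = rng.uniform(-90.0, 90.0, size=(200, 3))  # roll, pitch, yaw

# Step 1: extract per-view features and concatenate (feature integration).
feats = np.hstack([cnn_features(view_a, seed=1),
                   cnn_features(view_b, seed=2)])   # shape (200, 256)

# Step 2: regress the three attitude angles from the fused features.
# A regularized least-squares fit replaces the LightGBM regressor here
# purely to keep the sketch self-contained.
lam = 1e-2
A = feats.T @ feats + lam * np.eye(feats.shape[1])
W = np.linalg.solve(A, feats.T @ angles)
pred = feats @ W                                    # shape (200, 3)
```

In the paper's setup, `pred` would come from a LightGBM model whose hyperparameters are tuned by Bayesian optimization; the fusion-then-regress structure is the point of the sketch.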