Object recognition method based on the fusion of color and depth information

吴 鑫; 王桂英; 丛 杨

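Abstract (translated from the Chinese): Traditional machine vision relies on two-dimensional RGB images, which cannot meet the requirements of three-dimensional visual inspection. Depth images directly reflect the three-dimensional features of an object's surface and are attracting increasing attention. The scheme proposed in this paper combines RGB and depth information to segment the region containing the object, and uses histograms of oriented gradients (HOG) to extract features from the RGB image and the depth image separately. For classification, the paper uses the k-nearest neighbor (k-NN) algorithm to screen the features and identify the target object. Experimental results show that using depth and RGB information together yields high recognition accuracy, and that the scheme recognizes both objects and gestures well.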

Abstract: Traditional machine vision based on RGB images does not meet the requirements of 3D visual inspection. Range images directly reflect the 3D characteristics of an object's surface and are attracting increasing attention. How to use RGB and depth information for object recognition is the core issue studied in this paper. First, an object recognition system based on Kinect color and depth information was proposed. The Kinect sensor was used to acquire the color and depth information of the target object and its background, and this information was used to segment the object from the background. A HOG feature descriptor was then used to extract the characteristics of the target samples and to build a feature model. During recognition, the k-NN algorithm selected the most similar template category to classify and recognize the object. The scheme used the depth and RGB images together. Target objects were segmented with the Canny edge detection operator, making full use of the depth image's ability to reflect an object's contour directly. For feature extraction, histograms of oriented gradients (HOG) have good geometric and photometric invariance, so the HOG descriptor was used to describe the objects' features. The descriptor was modified, with its core algorithm kept unchanged, so that it could extract image features from target objects of any size. In the modified descriptor, the image was divided into 2 × 2 sub-images; counting the original image itself, there were 5 sub-images in total. Each sub-image was one block, which was divided into 2 × 2 cells. Two different quantization levels were used in each cell.
The unsigned gradient orientation range (0° to 180°) was divided into 9 bins, and the signed gradient orientation range (0° to 360°) was divided into 18 bins. Each cell therefore contributed 18 + 9 = 27 values, each block of 4 cells generated a 4 × (18 + 9) = 108-dimensional feature vector, and each RGB-D image (an RGB image and its corresponding depth image, with 5 blocks each) was represented by a 2 × 5 × 108 = 1080-dimensional feature vector. The k-NN algorithm was then used to classify and recognize objects: the Euclidean distance between the actual 1080-dimensional feature vector and the template of each sample was computed, and the k training samples with the smallest distances were selected. If all or most of the k samples belonged to the same class of template objects, the target object was assigned to that class. Finally, recognition accuracy was evaluated with different feature information: RGB only (visual features), depth only (shape features), and RGB-D (all features). The results showed that RGB-D features combined the advantages of RGB and depth and gave the highest recognition accuracy in both category and instance recognition. The proposed object recognition system provided a good solution to large-scale, multi-class object recognition and achieved the intended purpose of the experiment.
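
The descriptor layout and the k-NN vote described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, the inputs are assumed to be single-channel images given as nested lists (a grayscale version of the RGB image and the depth image), gradients are taken as simple central differences, no histogram normalization is applied, and the "all or most" rule is read as a plain majority vote. The sketch only shows how 5 blocks × 4 cells × (9 + 18) bins × 2 images gives 1080 values, and how a query vector could be matched against templates by Euclidean distance.

```python
import math
from collections import Counter

def cell_histogram(gray, r0, r1, c0, c1):
    """27 bins for one cell: 9 unsigned (0-180 deg) then 18 signed (0-360 deg)."""
    hist = [0.0] * 27
    h, w = len(gray), len(gray[0])
    for r in range(r0, r1):
        for c in range(c0, c1):
            # Central differences, clamped at the image border (an assumption).
            gx = gray[r][min(c + 1, w - 1)] - gray[r][max(c - 1, 0)]
            gy = gray[min(r + 1, h - 1)][c] - gray[max(r - 1, 0)][c]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 360.0
            hist[min(int((ang % 180.0) / 20.0), 8)] += mag   # unsigned bins
            hist[9 + min(int(ang / 20.0), 17)] += mag        # signed bins
    return hist

def block_descriptor(gray, r0, r1, c0, c1):
    """One block = 2 x 2 cells, so 4 * 27 = 108 values."""
    rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
    feats = []
    for ra, rb in ((r0, rm), (rm, r1)):
        for ca, cb in ((c0, cm), (cm, c1)):
            feats.extend(cell_histogram(gray, ra, rb, ca, cb))
    return feats

def image_descriptor(gray):
    """Whole image plus its 2 x 2 sub-images = 5 blocks, so 540 values."""
    h, w = len(gray), len(gray[0])
    hm, wm = h // 2, w // 2
    regions = [(0, h, 0, w), (0, hm, 0, wm), (0, hm, wm, w),
               (hm, h, 0, wm), (hm, h, wm, w)]
    feats = []
    for region in regions:
        feats.extend(block_descriptor(gray, *region))
    return feats

def rgbd_descriptor(gray_rgb, depth):
    """Concatenate the two 540-value descriptors into 1080 values."""
    return image_descriptor(gray_rgb) + image_descriptor(depth)

def knn_classify(query, templates, labels, k=3):
    """Majority label among the k templates nearest in Euclidean distance."""
    order = sorted(range(len(templates)),
                   key=lambda i: math.dist(query, templates[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

Because the blocks are defined as fractions of the image rather than as fixed pixel windows, the descriptor length stays at 1080 for any input size, which matches the abstract's point about handling target objects of any size.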

Object recognition method by combining color and depth information