Internet eggplant image retrieval method and system based on mixed features

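    • Summary: The explosive growth of image data on the Internet makes effective retrieval increasingly important; unlike text retrieval, effective image retrieval remains an open problem. This paper proposes a retrieval method for Internet eggplant images based on mixed features and develops an image retrieval system. Hu invariant moments serve as the geometric invariant features. The color moment method describes the color features by computing moments up to the third order in HSV space. The watershed algorithm extracts the contour features of the eggplant, and the length-to-width ratio distinguishes long eggplants from round ones. Finally, the geometric invariant, color, and contour features are combined to describe the eggplant object, with a different weight assigned to each of the 3 features, to form the retrieval system. In experiments on the test data set, the method and system reach a recall of 87.6% and a precision of 87.6%. This improves on using Hu invariant moments alone (recall 31.75%, precision 31.75%) and on Hu invariant moments plus color features (recall 52.8%, precision 52.8%), which verifies the effectiveness of the method.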

       

      Abstract: With the explosive growth of image data on the Internet, effective image retrieval is becoming more and more important. Unlike text retrieval, effective image retrieval is still an open problem. Specific crop images correspond to specific agricultural knowledge. To search eggplant knowledge more effectively, this paper proposes an eggplant image retrieval system based on hybrid features. The hybrid features comprise Hu invariant moments, a color moment vector, and the contour. Hu invariant moments are used as the geometric invariant features; the 7 Hu invariant moments are constructed from the second- and third-order normalized central moments. The main idea of invariant moments is to use a small number of region-based moments, which are invariant to rotation, translation, and scale, as the shape feature. For color features, we use the color moment method. The image is first converted from the RGB (red, green, blue) color space to the HSV (hue, saturation, value) color space. The H, S, and V channels are then used to construct a 9-dimensional color moment vector, which serves as the color feature descriptor. The similarity between color feature vectors is calculated using the Manhattan distance. For contour features, the Sobel operator is first used for edge detection, and the watershed algorithm is then used to segment the image and extract the contour. The watershed algorithm consists of 2 steps: a sorting process and a submerging (flooding) process. The images are first converted from color to grayscale, and the gray levels of all pixels are sorted from low to high. The submerging process is then executed in that order, from low to high: at each gray level h, the influence zone of each local minimum is determined and labeled using a FIFO (first in, first out) data structure. The main purpose of the watershed algorithm is to find the connected regions of the image.
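
The color descriptor described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the 9 dimensions are the mean, standard deviation, and cube root of the third central moment of each HSV channel (a common definition of color moments that the abstract does not spell out), and the function names `color_moments` and `manhattan_distance` are hypothetical. It uses only the Python standard library.

```python
import colorsys
import math


def _signed_cbrt(x):
    """Real cube root that keeps the sign of x."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def color_moments(pixels):
    """Return a 9-dimensional HSV color moment vector.

    pixels: sequence of (r, g, b) tuples with components in 0..255.
    Output order: (mean, std, skew) for H, then S, then V.
    The choice of these three moments is an assumption, see the lead-in.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
           for r, g, b in pixels]
    n = len(hsv)
    if n == 0:
        raise ValueError("image has no pixels")
    vector = []
    for channel in range(3):
        values = [p[channel] for p in hsv]
        mean = sum(values) / n
        second = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        vector.extend([mean, math.sqrt(second), _signed_cbrt(third)])
    return vector


def manhattan_distance(a, b):
    """Sum of absolute component differences between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))
```

A smaller Manhattan distance between two color moment vectors indicates more similar color distributions. Note that this sketch treats hue as a linear value even though it is circular.
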
Finally, the descriptor of the eggplant object in an image is formed by combining the geometric invariant, color, and contour features, each of which is assigned its own weight. The eggplant image retrieval system is built on this combined feature descriptor. To distinguish long eggplants from round eggplants, we observe that the length-to-width ratio of the contour is an ideal distinguishing feature. After image segmentation, the minimum bounding rectangle of each contour is calculated, and the ratio of its length to its width is used to distinguish long eggplants from round eggplants in the image. Experiments show that the system achieves 87.6% in both recall and precision on our test data sets, with the retrieved images listed according to their computed similarity values. Compared with using Hu invariant moments alone (recall and precision of 31.75%) and using color features combined with Hu invariant moments (recall and precision of 52.8%), our hybrid features are more robust and precise. The effectiveness of the proposed method is thus verified.
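
The weighted combination and the long/round test can be illustrated with a short sketch. The abstract gives neither the weight values nor the ratio threshold, so the weights `(0.2, 0.3, 0.5)` and the threshold `2.0` below are placeholders, and all function names are hypothetical. The sketch also assumes that the three per-feature distances have already been scaled to comparable ranges and that the side lengths of the minimum bounding rectangle have been computed elsewhere.

```python
def combined_distance(distances, weights):
    """Weighted sum of per-feature distances (smaller means more similar).

    distances: (d_hu, d_color, d_contour), assumed pre-normalized.
    weights:   one weight per feature; the values are placeholders.
    """
    if len(distances) != len(weights):
        raise ValueError("need exactly one weight per feature distance")
    return sum(w * d for w, d in zip(weights, distances))


def rank_images(per_image_distances, weights):
    """Return image ids ordered from most to least similar to the query.

    per_image_distances: dict mapping image id -> (d_hu, d_color, d_contour).
    """
    return sorted(
        per_image_distances,
        key=lambda image_id: combined_distance(
            per_image_distances[image_id], weights),
    )


def eggplant_shape(side_a, side_b, threshold=2.0):
    """Label a contour 'long' or 'round' from its bounding rectangle sides.

    threshold is a hypothetical cut-off on the length-to-width ratio.
    """
    long_side, short_side = max(side_a, side_b), min(side_a, side_b)
    if short_side <= 0:
        raise ValueError("rectangle sides must be positive")
    return "long" if long_side / short_side >= threshold else "round"
```

For example, with these placeholder values, `eggplant_shape(300, 60)` gives a ratio of 5 and is labeled as long, while `eggplant_shape(110, 100)` is labeled as round. In practice the weights and threshold would need to be tuned on labeled data.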

       
