Although it has been studied for several years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we present AnnoSearch, a novel way to annotate images using search and data mining technologies. Leveraging Web-scale images, we solve this problem in two steps: 1) searching for semantically and visually similar images on the Web, and 2) mining annotations from them. First, at least one accurate keyword is required to enable text-based search for a set of semantically similar images. Then, content-based search is performed on this set to retrieve visually similar images. Finally, annotations are mined from the descriptions (titles, URLs, and surrounding texts) of these images. It is worth highlighting that, to ensure efficiency, high-dimensional visual features are mapped to hash codes, which significantly speeds up the content-based search process. Our proposed approach enables annotating with an unlimited vocabulary, which is impossible for all existing approaches. Experimental results on real Web images show the effectiveness and efficiency of the proposed algorithm.
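The abstract does not say which hashing scheme AnnoSearch uses, so the following is only a minimal sketch of the general idea of mapping high-dimensional features to hash codes for fast search. It assumes sign-of-random-projection hashing and Hamming-distance ranking; the function names (`make_projections`, `hash_code`, `nearest`) are hypothetical and not taken from the paper.

```python
import random


def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions (assumed scheme, not the paper's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(feature, projections):
    """Pack the signs of the projections of `feature` into one integer code."""
    code = 0
    for direction in projections:
        dot = sum(f * d for f, d in zip(feature, direction))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def nearest(query_code, database_codes, k=3):
    """Indices of the k database codes closest to the query in Hamming distance."""
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return order[:k]
```

Comparing short integer codes with XOR and a bit count is far cheaper than computing distances between the original real-valued vectors, which is the kind of speed-up the abstract alludes to.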
In this paper, we present a novel approach to solving the supervised dimensionality reduction problem by encoding an image object as a general tensor of 2nd or higher order. First, we propose a Discriminant Tensor Criterion (DTC), whereby multiple interrelated lower-dimensional discriminative subspaces are derived for feature selection. Then, a novel approach called k-mode Cluster-based Discriminant Analysis is presented to iteratively learn these subspaces by unfolding the tensor along different tensor dimensions. We call this algorithm Discriminant Analysis with Tensor Representation (DATER), which has the following characteristics: 1) multiple interrelated subspaces can collaborate to discriminate different classes; 2) for classification problems involving higher-order tensors, the DATER algorithm can avoid the curse of dimensionality dilemma and overcome the small sample size problem; and 3) the computational cost in the learning stage is reduced to a large extent owing to the reduced data dimensions in generalized eigenvalue decomposition. We provide extensive experiments by encoding face images as 2nd- or 3rd-order tensors to demonstrate that the proposed DATER algorithm based on higher-order tensors has the potential to outperform the traditional subspace learning algorithms, especially in the small sample size cases.
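As an illustration of what "unfolding the tensor along different tensor dimensions" can mean, here is a small sketch of mode-k unfolding for a tensor stored as nested lists. It is not the DATER algorithm itself. The column ordering is an assumption (conventions differ between authors), and the helper names are hypothetical.

```python
from itertools import product


def shape_of(tensor):
    """Shape of a rectangular nested list, e.g. (2, 2, 2)."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]
    return tuple(shape)


def get(tensor, index):
    """Look up one entry of a nested list by a multi-index."""
    for i in index:
        tensor = tensor[i]
    return tensor


def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows follow the chosen mode, columns all other modes."""
    shape = shape_of(tensor)
    other = [range(n) for k, n in enumerate(shape) if k != mode]
    rows = []
    for i in range(shape[mode]):
        row = []
        for rest in product(*other):
            index = list(rest)
            index.insert(mode, i)  # put the chosen mode back in its position
            row.append(get(tensor, index))
        rows.append(row)
    return rows
```

For a 2×2×2 tensor `T = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]`, `unfold(T, 0)` gives `[[1, 2, 3, 4], [5, 6, 7, 8]]`. Each unfolding is an ordinary matrix whose row count equals the size of one mode, so a per-mode eigenproblem stays small compared with one posed on the fully vectorized image.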
An analysis-by-synthesis framework for face recognition with variant pose, illumination and expression (PIE) is proposed in this paper. First, an efficient 2D-to-3D integrated face reconstruction approach is introduced to reconstruct a personalized 3D face model from a single frontal face image with neutral expression and normal illumination. Then, realistic virtual faces with different PIE are synthesized based on the personalized 3D face to characterize the face subspace. Finally, face recognition is conducted based on these representative virtual faces. Compared with other related works, this framework has the following advantages: 1) only a single frontal face is required for face recognition, which avoids burdensome enrollment work; 2) the synthesized face samples provide the capability to conduct recognition under difficult conditions like complex PIE; and 3) the proposed 2D-to-3D integrated face reconstruction approach is fully automatic and more efficient. The extensive experimental results show that the synthesized virtual faces significantly improve the accuracy of face recognition with variant PIE.
Image annotation plays an important role in image retrieval and management. However, the results of the state-of-the-art image annotation methods are often unsatisfactory. Therefore, it is necessary to refine the imprecise annotations obtained by existing annotation methods.
Automatic annotation of photographs is one of the most desirable features in family photograph management systems. In this paper, we present a learning framework to automate face annotation in family photograph albums. First, methodologies of content-based image retrieval and face recognition are seamlessly integrated to achieve automated annotation. Second, face annotation is formulated in a Bayesian framework, in which the face similarity measure is defined as maximum a posteriori (MAP) estimation. Third, to deal with missing features, marginal probability is used so that samples which have missing features are compared with those having the full feature set to ensure a non-biased decision. The experimental evaluation has been conducted within a family album of a few thousand photographs, and the results show that the proposed approach is effective and efficient in automated face annotation in family albums.
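The abstract gives no model details, so the following is only a rough sketch of how a MAP decision can marginalize over missing features. It assumes independent Gaussian features per person, in which case integrating out a missing feature simply removes its factor from the likelihood. The names `map_label`, `models`, and `priors`, and the use of `None` to mark a missing feature, are all hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def map_label(sample, models, priors):
    """Label maximizing log prior + log likelihood over the observed features.

    `sample` is a list of feature values, with None marking a missing feature.
    `models` maps each label to (means, variances), one entry per feature.
    """
    best_label, best_score = None, -math.inf
    for label, (means, variances) in models.items():
        score = math.log(priors[label])
        for x, mean, var in zip(sample, means, variances):
            if x is None:
                # Missing feature: under the independence assumption its
                # marginal integrates to 1, so it contributes nothing.
                continue
            score += log_gaussian(x, mean, var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Because a missing feature contributes no term for any candidate, samples with incomplete features are scored on the same footing as complete ones rather than being penalized, which is one way to read the "non-biased decision" mentioned above.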
Papers by Lei Zhang