2012 ICPR


WePSAT1	Main Hall
Poster Shotgun (07): PR	Regular Session

08:30-09:00, Paper WePSAT1.1
Multiple Kernel Discriminant Analysis
Liu, Xiao-Zhang	Sun Yat-sen Univ.
Feng, Guocan	Sun Yat-sen Univ.
Keywords: Machine Learning and Data Mining, Classification and Clustering Abstract: This paper proposes a multiple kernel construction method for kernel discriminant analysis. The constructed kernel is a linear combination of several base kernels with a constraint on their weights. We present an iterative scheme that optimizes the weights by maximizing the margin maximization criterion (MMC). Experiments on several real-data UCI benchmarks show that the constructed kernel with optimized weights achieves high classification accuracy compared with multiple kernel learning under the support vector machine framework. The experiments also show that the constructed kernel relaxes parameter selection for kernel discriminant analysis to some extent.
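
The kernel construction described in this abstract can be illustrated with a minimal sketch. The abstract does not say which base kernels or which weight constraint the authors use, so the Gaussian base kernels, the bandwidth values, and the simplex constraint (non-negative weights summing to 1) below are assumptions chosen for illustration; the MMC-based weight optimization itself is not reproduced.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negative round-off

def combine_kernels(kernels, weights):
    """Weighted sum of base kernel matrices.

    Assumed constraint (not taken from the abstract): weights >= 0, sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(wi * Ki for wi, Ki in zip(w, kernels))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
# Hypothetical bandwidths for three base kernels.
base = [rbf_kernel(X, X, g) for g in (0.1, 1.0, 10.0)]
K = combine_kernels(base, [0.5, 0.3, 0.2])
```

Because each base matrix is positive semidefinite and the weights are non-negative, the combined matrix K is again a valid kernel matrix and can be passed to any kernel method.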

08:30-09:00, Paper WePSAT1.2
A New Fall Detection Method Using Weightlessness-Like Status Detection from a Single Tri-Axial Accelerometer
Jin, Lianwen	South China Univ. of Tech.
Xue, Yang	South China Univ. of Tech., China
Mao, Huiyun	South China Univ. of Tech.
Zhang, Hengzhi	South China Univ. of Tech.
Keywords: Gesture and Behavior Analysis, Pattern Recognition for Surveillance and Security Abstract: Falls and fall-related injuries remain a major problem in the public health domain. To deliver immediate and adequate medical support, automatic, reliable and immediate detection of falls is important. In this paper, we present a new real-time weightlessness-like-status based fall detection (WFD) method using a single tri-axial accelerometer attached at the waist or carried in a trouser pocket. The accelerometer signals are analyzed by vertical direction calibration, threshold detection, and weightlessness-like status detection to discriminate falls from normal daily living activities. The results show that this algorithm can detect falls with a sensitivity of 96.5% and a specificity of 95%-100%, depending on the accelerometer location.

08:30-09:00, Paper WePSAT1.3
Human Face Recognition under Occlusion Using LBP and Entropy Weighted Voting
Nikan, Soodeh	Univ. of Windsor
Ahmadi, Majid	Univ. of Windsor
Keywords: Pattern Recognition for Surveillance and Security, Pattern Recognition for Search, Retrieval and Visualization, Statistical, Syntactic and Structural Pattern Recognition Abstract: In this paper, a new block-based algorithm is proposed to deal with facial occlusion when only one sample per person is available. A Local Binary Pattern (LBP) descriptor is applied to the image sub-blocks to extract distinctive texture features from those areas separately. The chi-square distance is employed as the histogram similarity metric in the local classifiers corresponding to the different image blocks. Finally, a weighted majority voting scheme is used for decision fusion. Local entropy is proposed to assign weights to the classifiers' results according to the information richness of each block. In this way, we can reduce the effect of blocks with appearance deformation on the final decision. Experimental results show that our method achieves significantly higher recognition accuracy on the challenging AR face database than recent well-known approaches, without adding computational complexity.
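
The fusion step described in this abstract (per-block chi-square matching followed by entropy-weighted voting) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the LBP extraction is omitted and the block histograms are given directly, the weight of a block is taken to be the Shannon entropy of its probe histogram, and the function names and toy data are hypothetical.

```python
import numpy as np

def chi_square(h1, h2, eps=1e-12):
    """Chi-square distance between two histograms (0 when they are identical)."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def entropy(hist):
    """Shannon entropy in bits of a non-empty histogram."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def weighted_vote(probe_blocks, gallery_blocks, weights):
    """Each block votes for its nearest gallery identity; votes are weighted."""
    scores = {pid: 0.0 for pid in gallery_blocks}
    for b, (h, w) in enumerate(zip(probe_blocks, weights)):
        best = min(gallery_blocks,
                   key=lambda pid: chi_square(h, gallery_blocks[pid][b]))
        scores[best] += w
    return max(scores, key=scores.get)

# Toy data: two identities, three blocks, 4-bin histograms (all hypothetical).
gallery = {
    "a": [[4, 3, 2, 1], [1, 2, 3, 4], [2, 2, 3, 3]],
    "b": [[1, 1, 4, 4], [4, 4, 1, 1], [10, 0, 0, 0]],
}
# The third probe block is flat (as if occluded), so its entropy is zero.
probe = [[4, 3, 2, 1], [1, 2, 3, 4], [10, 0, 0, 0]]
weights = [entropy(h) for h in probe]
identity = weighted_vote(probe, gallery, weights)
```

In this toy case the flat third block matches identity "b" exactly, but its zero entropy gives it no voting weight, so the two informative blocks decide the result.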

08:30-09:00, Paper WePSAT1.4
Kernel Based Sparse Representation for Face Recognition
Qi, Zhu	Shenzhen Graduate School, Harbin Inst. of Tech.
Xu, Yong	Harbin Inst. of Tech.
Wang, Jinghua	The Hong Kong Pol. Univ.
Fan, Zizhu	Harbin Inst. of Tech., East China Jiaotong University
Keywords: Biometrics, Classification and Clustering, Machine Learning and Data Mining Abstract: In this paper, we extend the idea of sparse representation into the high dimensional feature space induced by the kernel function, and propose a kernel based test sample sparse representation and classification algorithm (KTSRC) for the first time. The KTSRC is based on the assumption that the test sample can be linearly represented by a part of the training samples in the high dimensional feature space. Although the explicit form of the sample in the feature space is unknown, we can implement the KTSRC by the kernel trick. The experimental results show that the KTSRC achieves promising performance in face recognition, and outperforms the state-of-the-art methods.

08:30-09:00, Paper WePSAT1.5
Sparse Residue for Occluded Face Image Reconstruction and Classification
Wang, Jinghua	The Hong Kong Pol. Univ.
Xu, Yong	Harbin Inst. of Tech.
You, Jane	The Hong Kong Pol. Univ.
Keywords: Classification and Clustering, Biometrics Abstract: The occlusion problem is one of the remaining challenges in face recognition. This work expresses an occluded image as the sum of a non-occluded image and a sparse occlusion. By solving a norm minimization problem, we isolate the sparse occlusion from the face image and simultaneously reconstruct the image. The reconstructed image is the same as the original one in most pixels. To classify an occluded image with unknown identity, we first express it linearly using the images of every person, and then make a decision based on the residues of the expressions. This paper also presents the relationship between the proposed method and other popular methods. The experiments validate the feasibility of the proposed method.

08:30-09:00, Paper WePSAT1.6
Skin Detection Via Linear Regression Tree
Zhang, Jixia	CASIA
Wang, Haibo	CASIA
Davoine, Franck	CNRS
Pan, Chunhong	Inst. of Automation, Chinese Acad. of Sciences
Keywords: Classification and Clustering, Detection, Separation and Segmentation, Image and Video Processing Abstract: A robust and efficient skin detector facilitates automatic human detection and tracking. In this paper, we propose a new skin detection method based on a linear regression tree, which decomposes the problem of discriminating different skin and non-skin colors into several simple problems. Experimental results on the MCG skin database demonstrate better generalization ability and discriminability than state-of-the-art methods.

08:30-09:00, Paper WePSAT1.7
A Near-Optimal Non-Myopic Active Learning Method
Zhao, Yue	Minzu Univ. of China
Yang, Guosheng	Minzu Univ. of China
Xu, Xiaona	Minzu Univ. of China
Ji, Qiang	RPI
Keywords: Machine Learning and Data Mining, Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition Abstract: Non-myopic active learning allows the learner to select multiple unlabeled samples at a time. It avoids tedious retraining after each selected sample and makes effective use of multiple labelers. However, current non-myopic active learning methods are typically greedy, selecting the top N unlabeled samples with the maximum score. While efficient, such a greedy active learning approach cannot guarantee the learner's performance. In this paper, we introduce a near-optimal non-myopic active learning algorithm that is efficient and also has a performance guarantee. Our experimental results on UCI data sets and a real-world application show that the proposed algorithm outperforms the myopic active learning method and the existing non-myopic active learning methods in both efficiency and accuracy.

08:30-09:00, Paper WePSAT1.8
Tensor Based Robust Color Face Recognition
Li, Billy	Curtin Univ. of Tech.
Liu, Wanquan	Curtin Univ. of Tech.
An, Senjian	Curtin Univ. of Tech.
Krishna, Aneesh	DEPARTMENT OF COMPUTING, CURTIN Univ. PERTH, WESTERN AUSTR
Keywords: Feature Reduction and Manifold Learning, Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition Abstract: In this paper we address the robust face recognition problem for color faces with large variations in pose, illumination and facial expression. A novel algorithm is proposed, namely the Multilinear Color Tensor Discriminant (MCTD) model. This approach utilizes a tensor representation to preserve image structure and enhances discriminative capability via a color space transformation. In addition, it uses multilinear analysis to handle variations in pose, illumination and expression, and improves performance by minimizing the least-squares reconstruction error in the tensor framework. Extensive experiments conducted on the CMU-PIE and CurtinFaces databases demonstrate the effectiveness of the proposed approach.

08:30-09:00, Paper WePSAT1.9
Closed-Form Information-Theoretic Divergences for Statistical Mixtures
Nielsen, Frank	Sony Computer Science Lab. Inc
Keywords: Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition, Pattern Recognition for Search, Retrieval and Visualization Abstract: Statistical mixtures such as Rayleigh, Wishart or Gaussian mixture models are commonly used in pattern recognition and signal processing tasks. Since the Kullback-Leibler divergence between any two such mixture models does not admit an analytical expression, the relative entropy can only be approximated numerically using time-consuming Monte-Carlo stochastic sampling. This drawback has motivated the quest for alternative information-theoretic divergences such as the recent Jensen-Rényi, Cauchy-Schwarz, or total square loss divergences that bypass the numerical approximations by providing exact analytic expressions. In this paper, we state sufficient conditions on the mixture distribution family so that these novel non-KL statistical divergences between any two such mixtures can be expressed in generic closed-form formulas.
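
As an illustration of what a closed-form mixture divergence looks like, the sketch below evaluates the Cauchy-Schwarz divergence between two one-dimensional Gaussian mixtures. It covers only this well-known Gaussian special case, not the paper's general conditions. It relies on the standard identity that the integral of the product of two Gaussian densities N(x; m1, v1) and N(x; m2, v2) equals N(m1; m2, v1 + v2). The function names and the example mixtures are hypothetical.

```python
import numpy as np

def gauss_overlap(m1, v1, m2, v2):
    """Integral of N(x; m1, v1) * N(x; m2, v2) dx, i.e. N(m1; m2, v1 + v2)."""
    v = v1 + v2
    return np.exp(-0.5 * (m1 - m2) ** 2 / v) / np.sqrt(2.0 * np.pi * v)

def mixture_inner(w_p, m_p, v_p, w_q, m_q, v_q):
    """Closed-form integral of p(x) * q(x) for two 1-D Gaussian mixtures."""
    total = 0.0
    for wi, mi, vi in zip(w_p, m_p, v_p):
        for wj, mj, vj in zip(w_q, m_q, v_q):
            total += wi * wj * gauss_overlap(mi, vi, mj, vj)
    return total

def cauchy_schwarz_divergence(p, q):
    """CS(p, q) = -log(<p, q> / sqrt(<p, p> <q, q>)).

    p and q are (weights, means, variances) triples.
    """
    pq = mixture_inner(*p, *q)
    pp = mixture_inner(*p, *p)
    qq = mixture_inner(*q, *q)
    return float(-np.log(pq / np.sqrt(pp * qq)))

# Hypothetical example mixtures: (weights, means, variances).
p = ([0.5, 0.5], [0.0, 3.0], [1.0, 0.5])
q = ([0.3, 0.7], [1.0, 4.0], [2.0, 1.0])
d_pq = cauchy_schwarz_divergence(p, q)
d_pp = cauchy_schwarz_divergence(p, p)
```

No sampling is involved: the cost is quadratic in the number of mixture components, and the divergence is zero when both arguments are the same mixture.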

08:30-09:00, Paper WePSAT1.10
Real-Time Smoke Detection Using Texture and Color Features
Wang, Yue	Inst. for Infocomm Res.
Chua, Teck Wee	Inst. for Infocomm Res.
Chang, Richard	Inst. for Infocomm Res.
Pham, Nam Trung	Inst. for Infocomm Res.
Keywords: Pattern Recognition for Surveillance and Security, Detection, Separation and Segmentation, Segmentation, Color and Texture Abstract: This paper presents a real-time smoke detection algorithm. A modified Center Symmetric Local Ternary Pattern (CS-LTP) is proposed as the smoke texture descriptor. The change in background texture provides a means to differentiate between smoke and non-smoke regions. Combined with the color information of smoke, our method achieves real-time performance at a minimum of 30 fps. A comparison with the state of the art is presented as well. The experimental results show that the proposed method obtains good performance.

08:30-09:00, Paper WePSAT1.11
Combining Generative and Discriminative Models for Classifying Social Images from 101 Object Categories
Ballan, Lamberto	Univ. of Florence
Bertini, Marco	Univ. of Florence
Del Bimbo, Alberto	Univ. of Florence
Serain, Andrea M.	Univ. of Florence
Serra, Giuseppe	Univ. of Florence
Zaccone, Benito F.	Univ. of Florence
Keywords: 2D/3D Object Detection and Recognition, Image and Video Understanding, Scene Understanding Abstract: In this paper we present a hybrid generative-discriminative approach for image categorization in real-world images, based on Latent Dirichlet Allocation and SVM classifiers. We use SVMs with non-linear kernels on different visual features in a multiple kernel combination framework. A major contribution of our work is also the introduction of a novel dataset, called MICC-Flickr101, based on the popular Caltech101 and collected from Flickr. We demonstrate the effectiveness and efficiency of our method testing it on both datasets, and we evaluate the impact of combining image features and tags for object recognition.

08:30-09:00, Paper WePSAT1.12
Classification of Kinematic Golf Putt Data with Emphasis on Feature Selection
Jensen, Ulf	Univ. of Erlangen-Nuremberg
Dassler, Frank	adidas Group
Eskofier, Bjoern	Univ. of Erlangen-Nuremberg
Keywords: Classification and Clustering, Feature Reduction and Manifold Learning, Machine Learning and Data Mining Abstract: The complex movement sequences of golf require supporting tools for players and coaches alike. We developed a system that classifies the experience level and trained it with data from an inertial sensor on the club head. Based on 315 golf putts from eleven subjects, the system differentiated between experienced and inexperienced players with a classification rate of 86.1%. To improve the classification system and obtain discriminant features, we additionally integrated a feature selection step. We compared different selection approaches and concluded that a leave-subject-out feature selection was the appropriate approach to predict the true performance of a live system. The selected features can be fed back to coaches and help them to guide players to a better putting technique.

08:30-09:00, Paper WePSAT1.13
Real-Time 3D Face Identification from a Depth Camera
Min, Rui	EURECOM
Choi, Jongmoo	Univ. of Southern California
Medioni, Gerard	Univ. of Southern California
Dugelay, Jean-Luc	EURECOM
Keywords: Biometrics, 2D/3D Object Detection and Recognition Abstract: We present a real-time 3D face identification system using a consumer-level depth camera (PrimeSensor). Our system takes a noisy sequence as input and produces reliable identification. Instead of registering a probe to all instances in the database, we propose to register it only with several intermediate references, which considerably reduces processing while preserving the recognition rate. The presented system routinely achieves a 100% identification rate when matching a video sequence (0.5-4 seconds), and 97.9% for single-frame recognition. These numbers refer to a real-world dataset of 20 people. The methodology extends directly to very large datasets. The process runs at 20 fps on an off-the-shelf laptop.

08:30-09:00, Paper WePSAT1.14
Spectral Correspondence Method for Fingerprint Minutia Matching
Fu, Xiang	Peking Univ. Scie
Liu, Chongjin	Peking Univ. of Electronics Engineering and Computer
Bian, Junjie	Peking Univ.
Feng, Jufu	Peking Univ.
Keywords: Biometrics, Pattern Recognition for Surveillance and Security, Pattern Recognition for Bioinformatics Abstract: This paper presents an effective spectral correspondence method for fingerprint matching. Minutia matching is formulated as recovering the dense sub-block in the correspondence matrix, and the spectral correspondence method is used to search for this dense sub-block. First, we propose the pairwise adjacency matrix (PAM), whose diagonal elements represent similarities of local minutia structures and whose other elements represent pairwise compatibilities between local minutia structure pairs, so that similarity and compatibility information are unified appropriately. Second, correct minutia pairs are likely to establish both large similarities and large compatibilities among each other, and they form a dense sub-block. Minutia matching is then formulated as recovering the dense sub-block in the PAM, which gives a clear mathematical meaning to "optimal matching minutia pairs". Third, we recover the dense sub-block with the spectral correspondence method, using the principal eigenvector of the PAM and imposing one-to-one mapping constraints. The proposed method has stronger descriptive ability and better robustness. Experiments conducted on the FVC database demonstrate its effectiveness and efficiency.

08:30-09:00, Paper WePSAT1.15
Hypergraph Based Semi-Supervised Learning for Gender Classification
Zhang, Zhihong	Univ. of York
Hancock, Edwin	Univ. of York
Ren, Peng	China Univ. of Petroleum (Huadong)
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining, Classification and Clustering Abstract: Graph-based methods are an important category of semi-supervised learning techniques. However, in many situations the graph representation of relational patterns can lead to substantial loss of information. This is because in real-world problems objects and their features tend to exhibit multiple relationships rather than simple pairwise ones. In this paper, we develop a semi-supervised learning method which is based on a weighted hypergraph representation. There are two main contributions in this paper. The first is that we develop a hypergraph representation based on the attributes of feature vectors, i.e. a feature hypergraph. With this representation, the structural information latent in the data can be more effectively modeled. Secondly, to address semi-supervised classification, we derive an $\ell_{1}$-norm for the spectral embedding minimization problem on the learned hypergraph. This leads to sparse and direct clustering results. We apply the method to the challenging problem of gender determination using features delivered by principal geodesic analysis (PGA). We obtain a classification accuracy as high as 91% on 2.5D facial needle-maps when 50% of the data are labeled.

08:30-09:00, Paper WePSAT1.16
Evaluation of Canonical Correlation Analysis: A Correlation Generation Model
Su, Ya	Tsinghua Univ.
Wang, Shengjin	Tsinghua Univ.
Fu, Yun	SUNY at Buffalo
Keywords: Feature Reduction and Manifold Learning, Machine Learning and Data Mining, Pattern Recognition for Surveillance and Security Abstract: Canonical Correlation Analysis (CCA) is a powerful technique for finding the correlations between two sets of multidimensional variables. Due to its performance in practice, many extensions have been put forward, such as least squares CCA. However, there is no unified way to compare their performance, i.e. in the sense of extracting canonical correlations. In this paper, we propose a framework to systematically evaluate the performance of CCA and its variants. Firstly, a Correlation Generation Model (CGM) is introduced to analyze CCA in three aspects: 1) Why are the multidimensional variables correlated? 2) How are they correlated? 3) How to evaluate this correlation? Based on CGM, it is possible to qualitatively study CCA in terms of accuracy and robustness. Most interestingly, the analysis reveals that CCA actually suffers from the Under Sample Problem (USP), which is often discussed in the machine learning field but ignored in the literature. Finally, experiments based on CGM are performed to evaluate CCA as well as its variants.

08:30-09:00, Paper WePSAT1.17
Submanifold Decomposition
Su, Ya	Tsinghua Univ.
Wang, Shengjin	Tsinghua Univ.
Fu, Yun	SUNY at Buffalo
Keywords: Feature Reduction and Manifold Learning, Machine Learning and Data Mining, Classification and Clustering Abstract: Extracting low-dimensional structures from high-dimensional space through spectral analysis has been prevalent in the fields of machine learning and computer vision. However, most manifold learning methods assume that there is a dominant low-dimensional manifold, while other variations are usually considered as noise or even ignored. This paper proposes a novel submanifold decomposition (SMD) algorithm, which simultaneously considers multiple submanifolds intertwined in the same high-dimensional space for decomposition. It makes full use of the multi-category labels of a dataset to improve the modeling of the manifold of each label. In order to apply the proposed method to practical applications, a linear version of SMD is subsequently developed. Comparative experiments demonstrate that the proposed method not only effectively extracts submanifolds by subspace learning, but also outperforms traditional manifold and subspace learning methods for visual recognition tasks.

08:30-09:00, Paper WePSAT1.18
Camera View Usage of Binary Infrared Sensors for Activity Recognition
Tao, Shuai	Hokkaido Univ.
Kudo, Mineichi	Hokkaido Univ.
Nonaka, Hidetoshi	Hokkaido Univ.
Toyama, Jun	Hokkaido Univ.
Keywords: Gesture and Behavior Analysis, Motion, Tracking and Video Analysis, Human Computer Interaction Abstract: The ceiling sensor system reported in this study is designed to recognize different activities of multiple persons in the home environment. The sensors output binary sequences that indicate the presence or absence of persons under the sensors. A short-period average of the binary response is shown to be usable as a pixel value of a top-view camera, while the camera-like view is more advantageous in the sense of preserving privacy. Using the "pixel values" as features, a support vector machine (SVM) classifier succeeded in recognizing eight activities of five subjects at an average recognition rate of 80.10%. This accuracy is not sufficient in general but is surprisingly high for such low-level information.

08:30-09:00, Paper WePSAT1.19
Using K-Nearest Neighbors to Handle Missing Weak Classifiers in a Boosted Cascade
Bouges, Pierre	Blaise Pascal Univ.
Chateau, Thierry	Blaise Pascal Univ.
Blanc, Christophe	Blaise Pascal Univ.
Loosli, Gaelle	Blaise Pascal Univ.
Keywords: Pattern Recognition for Surveillance and Security, 2D/3D Object Detection and Recognition Abstract: We propose a generic framework to handle missing weak classifiers at prediction time in a boosted cascade. The contribution is a probabilistic formulation of the cascade structure that considers the uncertainty introduced by missing weak classifiers. This new formulation involves two problems: 1) the approximation of posterior probabilities on each level and 2) the computation of thresholds on these probabilities to make a decision. Both problems are studied and solutions are proposed and evaluated. The method is then applied to a popular computer vision application: detecting occluded faces. Experimental results are provided on classic databases to evaluate the proposed solution relative to the basic one.

08:30-09:00, Paper WePSAT1.20
Structural Analysis of Protein Secondary Structure by GHT
Cantoni, Virginio	Pavia Univ.
Ferone, Alessio	Univ. of Naples Parthenope
Ozbudak, Ozlem	Istanbul Tech. Univ.
Petrosino, Alfredo	Univ. of Naples Parthenope
Keywords: Pattern Recognition for Bioinformatics, Pattern Recognition for Search, Retrieval and Visualization, Classification and Clustering Abstract: Structural biology is a branch of life science concerned with the study of the structure of biological macromolecules like proteins. The structure of a protein gives much more insight into its functions than its amino acid sequence does. Protein structure comparison is important for understanding the evolutionary relationships among proteins, predicting protein functions, and predicting protein structures from the chemical composition. In this paper we propose a new approach for structural block retrieval based on the Generalized Hough Transform (GHT). A first technique uses the single Secondary Structure (SS) as the primitive, an alternative adopts co-occurrences of SS pairs, a third approach uses SS triplets, and finally the primitive can be an entire block. We describe some experiments on the retrieval of elementary structural blocks consisting of four and five SSs.

08:30-09:00, Paper WePSAT1.21
Recovering Human Pose in 3D by Visual Manifolds
Wang, Zibin	The Chinese Univ. of Hong Kong
Chung, Chi-kit Ronald	The Chinese Univ. of Hong Kong
Keywords: Gesture and Behavior Analysis, 2D/3D Object Detection and Recognition, Feature Reduction and Manifold Learning Abstract: We describe a mechanism based upon activity manifolds that maps image data from more than one view to spatial pose. We learn the manifolds from training data, which are motion capture data of real human subjects performing the target actions. The nature of the training data allows the learned manifolds to conform naturally to multiple constraints, including (1) the body-part articulation constraint; (2) the image-consistency constraint; and (3) conformation to prior information about the possible human activities. A mirror function is specifically designed that allows the system to pick the proper manifold in a multi-activity scenario. Human pose in both the image space and 3D is expressed in terms of the body-part joint positions. Such a representation allows image data to be related across views and to 3D space with ease. Experimental results show that the manifolds not only map image data to 3D pose effectively; the presence of multiple images also improves the precision of the recovered pose and helps fix feature extraction errors in any single image.

08:30-09:00, Paper WePSAT1.22
Graph Kernels Based on Relevant Patterns and Cycle Information for Chemoinformatics
Gaüzère, Benoit	Univ. de Caen, CNRS UMR 6072, GREYC, ENSICAEN
Brun, Luc	ENSICAEN
Villemin, Didier	CNRS UMR 6507 LCMT, ENSICAEN
Brun, Myriam	Univ. de Caen, CNRS UMR 6072, GREYC, ENSICAEN
Keywords: Pattern Recognition for Bioinformatics, Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining Abstract: Chemoinformatics techniques aim to predict a molecule's properties through informational techniques. The computer science research fields concerned with chemoinformatics are machine learning and graph theory. From this point of view, graph kernels provide a nice framework combining these two fields. We present in this paper two contributions to this research field: a graph kernel based on an optimal linear combination of kernels on acyclic patterns and a new kernel on the cyclic system of two graphs. These two extensions are validated on two chemoinformatics datasets.

08:30-09:00, Paper WePSAT1.23
ARMA-HMM: A New Approach for Early Recognition of Human Activity
Li, Kang	SUNY at Buffalo
Fu, Yun	SUNY at Buffalo
Keywords: Gesture and Behavior Analysis, Image and Video Understanding, Motion, Tracking and Video Analysis Abstract: Early recognition of human activities is a highly desirable functionality for many visual intelligent systems. However, in computer vision, very little work has been devoted to this challenging and interesting task. In this paper, we address early recognition of human activity as a pattern recognition problem on time series data. A new model called ARMA-HMM is introduced to integrate the predictive power of both the sequential model HMM and the time series model ARMA. We also present a novel feature called Histogram of Oriented Velocity (HOV) to encode activity video as a sequential observation of motion signals. Experiments on a daily activity dataset and a realistic YouTube sports dataset show promising results for the proposed method.

08:30-09:00, Paper WePSAT1.24
Confidence-Assisted Classification Result Refinement for Object Recognition Featuring TopN-Exemplar-SVM
Yamasaki, Toshihiko	The Univ. of Tokyo
Chen, Tsuhan	Cornell Univ.
Keywords: Classification and Clustering, Image and Video Understanding Abstract: This paper proposes a cascaded classifier framework for better image recognition. The proposed method is based on the confidence values given by the classifiers. By using our proposed topN-Exemplar SVM in the second stage and comparing the confidence values with those from the first stage, the classification results with less confidence are successfully updated. The validity of our algorithm has been demonstrated by the experiments using three standard image datasets.

08:30-09:00, Paper WePSAT1.25
Unsupervised Spectral Feature Selection for Face Recognition
Zhang, Zhihong	Univ. of York
Hancock, Edwin	Univ. of York
Keywords: Machine Learning and Data Mining, Feature Reduction and Manifold Learning, Classification and Clustering Abstract: Most existing feature selection methods focus on ranking individual features based on a utility criterion, which neglects the correlations among features. To overcome this problem, we develop a novel feature selection technique using a spectral data transformation and $\ell_{1}$-norm regularized models for subset selection. Specifically, we propose a new two-step spectral regression technique for unsupervised feature selection. In the first step, we use kernel entropy component analysis (kECA) to transform the data into a lower-dimensional space so as to improve class separation. Second, we use $\ell_{1}$-norm regularization to select the features that best align with the data embedding resulting from kECA. The advantage of kECA is that the dimensionality-reducing data transformation maximally preserves entropy estimates for the input data whilst also best preserving the cluster structure of the data. Using $\ell_{1}$-norm regularization, we cast feature discriminant analysis into a regression framework which accommodates the correlations among features. As a result, we can evaluate joint feature combinations, rather than being confined to considering them individually. Experimental results demonstrate the effectiveness of our feature selection method on a number of standard face data-sets.

08:30-09:00, Paper WePSAT1.26
MEG Spatio-Temporal Source Reconstruction with Basis Functions Source Model
Kan, Jing	Univ. of York
Keywords: Pattern Recognition for Bioinformatics, Statistical, Syntactic and Structural Pattern Recognition Abstract: The aim of this paper is to introduce a classical pattern recognition method as a solution for medical imaging, and to provide a new angle on using pattern recognition theory for MEG source reconstruction. We explore a new method of MEG spatio-temporal source reconstruction based on modeling the neural source with extended basis functions. Inspired by the graph-theoretic result that the Laplacian eigenvectors of a spherical mesh are equivalent to its basis functions representing the cortex mesh, we build a new model to describe the current source distributed on each mesh vertex. This model consists of analogous basis functions and unknown weighted coefficients. Along with the leadfield, the weighted coefficients can be calculated in the light of the forward formulae of MEG. By extending this process from a single time point to a continuous time series, we are able to obtain the spatio-temporal reconstructed source distributed on the cortical mesh vertices. Under the condition of zero-mean Gaussian noise with small variance, the results show robustness to noise and better performance than minimum-norm, but insensitivity to deep sources.

08:30-09:00, Paper WePSAT1.27
Detection of Eyes by Circular Hough Transform and Histogram of Gradient
Ito, Yasutaka	Mie Univ.
Ohyama, Wataru	Mie Univ.
Wakabayashi, Tetsushi	Mie Univ.
Kimura, Fumitaka	Mie Univ.
Keywords: Biometrics Abstract: In order to achieve high face recognition accuracy, detection of facial parts such as the eyes, nose, and mouth is essential. In this paper, we propose a method to detect eyes in frontal face images. The proposed method consists of two major steps. The first is a two-dimensional Hough transform for detecting circles of unknown radius. The circular Hough transform first generates a two-dimensional parameter space (x, y) using the gradient of the grayscale image. The circle radius r is determined for each local maximum in the (x, y) space. The second step of the proposed method evaluates the likelihood of an eye using the histogram of gradient and a Support Vector Machine (SVM). The eye detection step first detects possible eye centers by the circular Hough transform. It then extracts a histogram of gradient (HOG) from a rectangular window centered at each eye center. The eye likelihood of the extracted feature vector is evaluated by the SVM, and pairs of eyes satisfying predefined conditions are generated and ordered by the sum of the likelihoods of both eyes. An evaluation experiment is conducted using 1,409 frontal face images from the FERET database. The experimental results show that the proposed method achieves a 98.65% detection rate for both eyes.

08:30-09:00, Paper WePSAT1.28
Medical Prognosis Based on Patient Similarity and Expert Feedback
Wang, Fei	IBM T. J. Watson Res. Center
Hu, Jianying	IBM
Sun, Jimeng	IBM T. J. Watson Res. Center
Keywords: Classification and Clustering, Machine Learning and Data Mining, Human Computer Interaction Abstract: Prognosis refers to the prediction of the future health status of a patient. Providing prognostic insight to clinicians is critical for physician decision support. In this paper we present a collaborative disease prognosis strategy leveraging the information of the clinically similar patient cohort, using a Local Spline Regression (LSR) based similarity measure. To improve the reliability of the approach, the algorithm can also incorporate physician's feedback in the form of whether the patients in a retrieved cohort are indeed similar to the query patient. The proposed methodology was tested on a real clinical data set containing records of over two hundred thousand patients over three years. We report the retrieval as well as prognosis performance to demonstrate the effectiveness of the system.

08:30-09:00, Paper WePSAT1.29
Theoretical Analysis of Learning Local Anchors for Classification
Pang, Junbiao	Beijing Univ. of Tech.
Huang, Qingming	Chinese Acad. of Sciences
Yin, Baocai	Beijing Univ. of Tech.
Qin, Lei	Inst. of Computing Tech. Chinese Acad. of Sciences, C
Wang, Dan	Beijing Univ. of Tech.
Keywords: Machine Learning and Data Mining, Classification and Clustering Abstract: In this paper, we present a theoretical analysis on learning anchors for local coordinate coding (LCC), which is a method to model functions for data lying on non-linear manifolds. In our analysis several local coding schemes, i.e., orthogonal coordinate coding (OCC), local Gaussian coding (LGC), local Student coding (LSC), are theoretically compared, in terms of the upper-bound locality error on any high-dimension data; this provides some insight to understand the local coding for classification tasks. We further give some interesting implications of our results, such as tradeoff between locality and approximation ability in learning anchors.

08:30-09:00, Paper WePSAT1.30
Sparse Representation for Motion Primitive-Based Human Activity Modeling and Recognition Using Wearable Sensors
Zhang, Mi	Univ. of Southern California
Xu, Wenyao	Univ. of California, Los Angeles
Sawchuk, Alexander	Univ. of Southern California
Sarrafzadeh, Majid	UCLA
Keywords: Gesture and Behavior Analysis Abstract: The use of wearable sensors for human activity monitoring and recognition is becoming an important technology due to its potential benefits to our daily lives. In this paper, we present a sparse representation-based human activity modeling and recognition approach using wearable motion sensors. Our approach first learns an overcomplete dictionary to find the motion primitives shared by all activity classes. Activity models are then built on top of these motion primitives by solving a sparse optimization problem. Experiments on a dataset including nine activities and fourteen subjects show the advantages of using sparse representation for activity modeling and demonstrate that our approach achieves a better recognition performance compared to the conventional motion primitive-based approach.

08:30-09:00, Paper WePSAT1.31
Looking for the Brain Stroke Signature
O'Reilly, Christian	Ec. Pol. de Montreal
Plamondon, Réjean	École Pol. de Montréal
Keywords: Biometrics, Pattern Recognition for Bioinformatics, Handwriting Recognition Abstract: This conference paper investigates the possibility of using on-line handwritten signatures for biomedical biometry. More specifically, features extracted from sigma-lognormal representations of signatures are applied to the problem of brain stroke susceptibility assessment. The area under the receiver operating characteristic curve (AUC) is used to evaluate the predictability of the most important modifiable brain stroke risk factors (diabetes, hypertension, hypercholesterolemia, obesity, cigarette smoking, cardiac problems) based on four different statistical models of the features' variation (random forest, linear discriminant analysis, logistic regression and linear regression). Our preliminary results show a potential predictability (AUC of about 0.7-0.8) for every risk factor, except for cigarette smoking. Avenues for improving these results are discussed.

08:30-09:00, Paper WePSAT1.32
A Nonparametric Bayesian Poisson Gamma Model for Count Data
Gupta, Sunil Kumar	Deakin Univ.
Phung, Dinh	Deakin Univ.
Venkatesh, Svetha	Curtin Univ. of Tech.
Keywords: Machine Learning and Data Mining, Feature Reduction and Manifold Learning, Statistical, Syntactic and Structural Pattern Recognition Abstract: We propose a nonparametric Bayesian, linear Poisson gamma model for count data and use it for dictionary learning. A key property of this model is that it captures the parts-based representation similar to nonnegative matrix factorization. We present an auxiliary variable Gibbs sampler, which turns the intractable inference into a tractable one. Combining this inference procedure with the slice sampler of Indian buffet process, we show that our model can learn the number of factors automatically. Using synthetic and real world datasets, we show that the proposed model outperforms other state-of-the-art nonparametric factor models.

08:30-09:00, Paper WePSAT1.33
Soft Decision Trees
Irsoy, Ozan	Bogazici Univ.
Yildiz, Olcay Taner	Isik Univ.
Alpaydin, Ethem	Bogazici Univ.
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining, Classification and Clustering Abstract: We discuss a novel decision tree architecture with soft decisions at the internal nodes where we choose both children with probabilities given by a sigmoid gating function. Our algorithm is incremental where new nodes are added when needed and parameters are learned using gradient-descent. We visualize the soft tree fit on a toy data set and then compare it with the canonical, hard decision tree over ten regression and classification data sets. Our proposed model has significantly higher accuracy using fewer nodes.
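The soft decision described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary-based node layout, the linear gate parameters `w` and `b`, and the example tree are assumptions made for illustration, and the incremental growth and gradient-descent training steps are omitted. Each internal node blends the responses of both children, weighted by a sigmoid gate.

```python
import math


def sigmoid(z):
    """Logistic function used as the gate at each internal node."""
    return 1.0 / (1.0 + math.exp(-z))


def soft_tree_predict(node, x):
    """Return the soft-tree response for input vector x.

    Assumed (hypothetical) node layout:
      leaf:     {"value": float}
      internal: {"w": [floats], "b": float, "left": node, "right": node}
    """
    if "value" in node:
        return node["value"]
    # Probability of following the left child; the right child gets 1 - g.
    g = sigmoid(sum(wi * xi for wi, xi in zip(node["w"], x)) + node["b"])
    left = soft_tree_predict(node["left"], x)
    right = soft_tree_predict(node["right"], x)
    # Both children contribute, unlike a hard tree that picks exactly one.
    return g * left + (1.0 - g) * right


# Illustrative one-split tree on a 1D input.
tree = {
    "w": [4.0],
    "b": 0.0,
    "left": {"value": 1.0},
    "right": {"value": -1.0},
}
```

Because the gate is smooth, the response changes continuously as the input crosses the split, which is what makes the parameters learnable by gradient descent.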

08:30-09:00, Paper WePSAT1.34
Finding Discriminative Features for Raman Spectroscopy
Kemmler, Michael	Friedrich Schiller Univ. of Jena, Germany
Denzler, Joachim	Friedrich-Schiller Univ. of Jena
Keywords: Pattern Recognition for Bioinformatics Abstract: To identify microorganisms is of utmost importance in various applications such as medical science and pharmaceutical industry. The technique of Raman spectroscopy is particularly useful in this scenario, since it extracts a high-dimensional molecular fingerprint from samples at hand. Instead of using the complete spectrum, it is often sensible to concentrate on a small number of discriminative dimensions. Apart from providing important molecular insights, this can be beneficial in terms of speed and accuracy. This work studies several state-of-the-art machine learning techniques suitable for feature ranking, many of which have not been used before in the context of Raman spectra classification. Experiments on three different bacteria classification problems show that boosting-based methods and zero-norm support vector machines are especially suited for this challenging task.

08:30-09:00, Paper WePSAT1.35
Iterative Neighbor-Joining Tree Clustering Algorithm for Genotypic Data
Amornbunchornvej, Chainarong	King Mongkut's Inst. of Tech. Ladkrabang
Limpiti, Tulaya	King Mongkut's Inst. of Tech. Ladkrabang
Assawamakin, Anunchai	National Center for Genetic Engineering and Biotechnology
Intarapanich, Apichart	National Electronics and Computer Tech. Center
Tongsima, Sissades	National Center for Genetic Engineering and Biotechnology
Keywords: Pattern Recognition for Bioinformatics, Classification and Clustering Abstract: Issues to explore in genotypic datasets include the number and characteristic patterns of subpopulations and, possibly, relationships among them. Model-based clustering methods have been adopted to find a number of clusters and the individual assignments. However, they cannot infer genetic relationships among subpopulations the way phylogenetic trees, e.g., the widely-used Neighbor-Joining (NJ) tree, can. In this paper we propose an unsupervised, iterative clustering framework called iNJclust. It performs clustering on an NJ tree with a graph-based partitioning technique. The iterative process enhances the zooming ability and corrects the topology of the final NJ trees. Inference on genetic similarities between subpopulations is also possible. As final outputs, the iNJclust algorithm provides an estimate of the number of clusters, individual assignments, a population tree, as well as sub-trees of the terminal nodes. We illustrate the superior clustering performance of the proposed algorithm using Human 27 populations, bovine 47 breeds, and sheep 28 breeds datasets.

08:30-09:00, Paper WePSAT1.36
Dominant Set and Target Clique Extraction
Hou, Jian	Bohai Univ.
E, Xu	Bohai Univ.
Chi, Lei	Bohai Univ.
Xia, Qi	Harbin Inst. of Tech.
Qi, Nai-Ming	Harbin Inst. of Tech.
Keywords: Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining Abstract: A standard paradigm to apply graph based representations to computer vision and pattern recognition is to construct a graph from the problem and then formulate the problem in terms of finding cliques in the graph. Many methods have been proposed to extract maximum clique, enumerate all cliques or a number of largest cliques. In this paper we present an approach to a new problem of target clique extraction, i.e., extracting a clique containing a particular vertex. This approach is based on the dominant set clustering and the recently proposed Infection and Immunization Dynamics. We intervene in the game evolution process and gear the convergence towards the target clique. Experiments validate the effectiveness of our approach.

08:30-09:00, Paper WePSAT1.37
Video Privacy Filters with Tolerance to Segmentation Errors for Video Conferencing and Surveillance
O'Gorman, Lawrence	Alcatel-Lucent Bell Lab.
Keywords: Pattern Recognition for Surveillance and Security, Detection, Separation and Segmentation, Image and Video Processing Abstract: It is sometimes desired to obscure the background of a person on a video conference or foreground people in a surveillance video. Background subtraction (or foreground detection) methods can help separate desired from undesired planes; however, current methods often have errors (holes in foreground or background), especially after lighting changes. We describe a unified approach to video privacy that capitalizes on the realization that private information is often in the image detail, which has edges, rather than in the uniform intensity regions. So a gradient based method for foreground detection can offer both error tolerance and privacy. We show results of error tolerance to lighting change, and degree of privacy gained by foreground and background privacy filters.

08:30-09:00, Paper WePSAT1.38
Facial Expression Recognition Based on Discriminative Dictionary Learning
Liu, Weifeng	China Univ. of Petroleum (East China)
Song, Caifeng	China Univ. of Petroleum (East China)
Wang, Yanjiang	China Univ. of Petroleum
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Gesture and Behavior Analysis, Pattern Recognition for Bioinformatics Abstract: Sparse Representation Classification (SRC) performs well in facial expression recognition (FER). However, SRC-based methods are costly to train on a large number of examples. Sparse coding based methods are better suited to large-scale facial expression recognition. K-SVD is a state-of-the-art sparse coding method. Unfortunately, K-SVD lacks discrimination capability because it focuses only on representational power. To address these problems, we apply the discriminative K-SVD (D-KSVD) algorithm to Gabor features for facial expression recognition. Compared with K-SVD, D-KSVD is more effective because it unifies the dictionary and the classifier. We conduct comprehensive experiments to verify the proposed algorithm on the JAFFE facial expression database. Experimental results indicate that the D-KSVD algorithm on Gabor features is more effective than the baselines, including the SRC and K-SVD algorithms.

08:30-09:00, Paper WePSAT1.39
Tracking Tetrahymena Pyriformis Cells Using Decision Trees
Wang, Quan	Rensselaer Pol. Inst.
Ou, Yan	Rensselaer Pol. Inst.
Julius, Agung	Rensselaer Pol. Inst.
Boyer, Kim	Rensselaer Pol. Inst.
Kim, Min Jun	Drexel Univ.
Keywords: Motion, Tracking and Video Analysis, Pattern Recognition for Bioinformatics, Image and Video Understanding Abstract: Matching cells over time has long been the most difficult step in cell tracking. In this paper, we approach this problem by recasting it as a classification problem. We construct a feature set for each cell, and compute a feature difference vector between a cell in the current frame and a cell in a previous frame. Then we determine whether the two cells represent the same cell over time by training decision trees as our binary classifiers. With the output of decision trees, we are able to formulate an assignment problem for our cell association task and solve it using a modified version of the Hungarian algorithm.
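The association step in this abstract can be sketched in Python as a minimum-cost one-to-one assignment. This is an illustration under stated assumptions, not the authors' code: the cost matrix is made up (read it as one minus a classifier's "same cell" score), and a brute-force search over permutations stands in for the modified Hungarian algorithm, which reaches the same optimum far more efficiently on larger problems.

```python
from itertools import permutations


def best_assignment(cost):
    """Brute-force minimum-cost one-to-one assignment.

    cost[i][j] is the cost of matching previous-frame cell i to
    current-frame cell j. Only practical for small square matrices.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Hypothetical costs for three cells, e.g. 1 - P(same cell).
cost = [
    [0.1, 0.9, 0.8],
    [0.7, 0.2, 0.9],
    [0.8, 0.6, 0.3],
]
matches, total = best_assignment(cost)
# matches[i] is the current-frame cell assigned to previous-frame cell i.
```

Solving the assignment jointly, rather than picking the best match for each cell independently, prevents two previous-frame cells from claiming the same current-frame cell.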

08:30-09:00, Paper WePSAT1.40
Illumination Normalization of Face Images with Cast Shadows
Matsukawa, Tetsu	The Univ. of Tokyo
Okabe, Takahiro	The Univ. of Tokyo
Sato, Yoichi	Univ. of Tokyo
Keywords: Biometrics Abstract: We propose a method for extracting and combining small-scale and large-scale illumination insensitive features for face recognition that can work even in the presence of cast shadows. Although several methods have been proposed to extract such features, they are not designed to handle severe lighting variation on a face and thus fail to work if cast shadows are present. In this paper, we extend quotient image-based illumination normalization by explicitly taking cast shadows into account so that illumination insensitive large-scale features can be obtained. The experimental results demonstrated that the proposed method achieves favorable normalization results under difficult illuminations with cast shadows.

08:30-09:00, Paper WePSAT1.41
Null QQ Plots: A Simple Graphical Alternative to Significance Testing for the Comparison of Classifiers
Berrar, Daniel	Tokyo Inst. of Tech.
Keywords: Machine Learning and Data Mining, Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition Abstract: The evaluation of machine learning algorithms is commonly based on statistical significance tests. However, the suitability of such tests is often questionable. We propose null QQ plots as a simple yet powerful graphical alternative to significance testing. Using ten benchmark data sets, we demonstrate that these plots concisely summarize the essential results from a comparative classification study, while they are easy to produce and interpret.

08:30-09:00, Paper WePSAT1.42
Automated Mitosis Detection Based on Exclusive Independent Component Analysis
Huang, Chao-Hui	Bioinformatics Inst. (BII), Agency for Science, Tech. and
Lee, Hwee Kuan	Bioinformatics Inst. (BII), Agency for Science, Tech. and
Keywords: Pattern Recognition for Bioinformatics, Machine Learning and Data Mining, Pattern Recognition for Search, Retrieval and Visualization Abstract: In this paper, we propose an approach for automated mitosis detection, which provides critical information for breast cancer prognosis. Essentially, the problem of mitosis detection involves classifying objects of irregular shape, which is a very challenging task. A novel algorithm, named eXclusive Independent Component Analysis (XICA), is proposed. XICA is an extension of generic ICA, but it focuses on the components of the differences between two classes of training patterns (called the exclusive basis set) rather than on the major (independent) components. Automated mitosis detection is performed based on the residuals obtained from computations involving the exclusive basis set of the respective training patterns. By computing the residual with respect to each exclusive basis set, we are able to classify the given testing patterns. The proposed approach has been tested on a mitosis image set provided by an ICPR 2012 contest, which contains 226 mitoses in 35 color images. It achieved an accuracy of 100% on the training patterns and 83.513% on the testing patterns.

08:30-09:00, Paper WePSAT1.43
Ensemble Learning for Change-Point Prediction
Hirade, Ryo	IBM Japan
Yoshizumi, Takayuki	IBM
Keywords: Classification and Clustering Abstract: In this paper, we propose a novel algorithm for the problem of predicting change-points. We assume that the causes for change-points can be characterized by the time interval between a change-point and its symptom. Based on this assumption, we first generate weak classifiers for capturing each characteristic, and then build an ensemble classifier with the weak classifiers. Experimental results show our algorithm improves the F-measure by 11% in the best case.

08:30-09:00, Paper WePSAT1.44
Concurrent Propagation for Solving Ill-Posed Problems of Global Discrete Optimisation
Gimel'farb, Georgy	Univ. of Auckland
Gong, Rui	Univ. of Auckland
Nicolescu, Radu	Univ. of Auckland
Delmas, Patrice	The Univ. of Auckland
Keywords: Machine Learning and Data Mining, Stereo and Image-Based Modeling, Statistical, Syntactic and Structural Pattern Recognition Abstract: Classical frameworks for global 1D discrete optimisation, namely dynamic programming (DP) and belief propagation (BP), presume well-posed problems with unique solutions. Ill-posed problems, being the most common in applied pattern recognition and computer vision, are regularised to restore well-posedness. However, typical heuristic regularisation does not guarantee that a set of multiple equivalent solutions is reduced to a single solution. An alternative concurrent propagation (CP) proposed in this paper extends the DP to allow for determining whether the problem is well- or ill-posed and storing implicitly in the latter case the entire set of solutions (e.g. for its structural analysis to improve regularisation). The CP, DP, and BP have similar computational complexity.

08:30-09:00, Paper WePSAT1.45
Human Action Recognition Based on Sparse Representation Induced by L1/L2 Regulations
Gao, Zan	Tianjin Univ. of Tech.
Liu, Anan	TJU
Zhang, Hua	TJUT
Xu, Guang-Ping	TJUT
Xue, Yanbing	TJUT
Keywords: Gesture and Behavior Analysis, Classification and Clustering, Pattern Recognition for Surveillance and Security Abstract: Sparse representation based classification (SRC) has been widely used for face recognition (FR). Although the SRC algorithm has also been adopted in human action recognition, the different regularization terms have not been evaluated. In this paper, we discuss and evaluate the role of different regularization terms of SRC in human action recognition. We then propose a human action recognition algorithm based on sparse representation induced by L1 and L2 regulations, called SR-L12. Experiments on the well-known KTH action dataset show that SR-L12 performs much better than nearest neighbor (NN), nearest subspace (NS), full-space (NF), SRC and collaborative representation classification (CRC). Moreover, the proposed method is comparable to most state-of-the-art algorithms for human action recognition.


WePSAT2	Multi-Purpose Hall
Poster Shotgun (08): SS	Regular Session

08:30-09:00, Paper WePSAT2.1
Motion Blur Free Photometric Stereo Using Correlation Image Sensor
Kurihara, Toru	Univ. of Tokyo
Ando, Shigeru	Univ. of Tokyo
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing, 2D/3D Object Detection and Recognition Abstract: We developed a motion blur restoration technique for surface orientation images using a correlation image sensor. This system consists of two components; one is ring-shaped modulation illumination for encoding surface orientation into the amplitude and phase of the reflected light intensity, and the other is the three-phase correlation image sensor (3PCIS) for demodulating the amplitude and phase of reflected light. The object motion is formalized by optical flow constraints. It is solved by weighted integral methods (WIM) developed by us, which is a direct algebraic method. The weighted integral method is suitable for correlation image sensor, because exposure time corresponds to integral over time in WIM and reference signals used by 3PCIS correspond to weighted function in WIM. It is demonstrated by both simulation and experiments that modulation imaging with sinusoids can be used to remove motion blur not only in intensity images, but also in normal vector maps.

08:30-09:00, Paper WePSAT2.2
Simultaneous Reflectance Estimation and Surface Shape Recovery Using Polarisation
Zhang, Lichi	Univ. of York
Hancock, Edwin	Univ. of York
Keywords: Image and Video Processing, Physics-Based Vision, Vision for Graphics Abstract: In this paper we develop a practical method for estimating shape and reflectance using only three polarised images. Using polarised light and retro-reflection settings during image acquisition, we separate the diffuse and specular reflectance components using Blind Source Separation without the accurate knowledge of the polariser angle information. Next, we compare the capacities of five chosen reflectance models, and estimate parameters of appropriate models for the two separated components together with their corresponding zenith angles. Finally, we recover surface shape by using a mixture model to match the two zenith angle estimations. We present experiments to demonstrate the validity of the proposed method for a variety of materials, and we show that the proposed method is capable of accurately estimating both shape and reflectance information.

08:30-09:00, Paper WePSAT2.3
Facial Emotion Recognition in Continuous Video
Cruz, Alberto	Univ. of California, Riverside
Bhanu, Bir	Univ. of California
Thakoor, Ninad	Univ. of California, Riverside
Keywords: Image and Video Processing, Image and Video Understanding, Human Computer Interaction Abstract: Facial emotion recognition--the detection of emotion states from video of facial expressions--has applications in video games, medicine, and affective computing. While there have been many advances, an approach has yet to be revealed that performs well on the non-trivial Audio/Visual Emotion Challenge 2011 data set. A majority of approaches still employ single frame classification, or temporally aggregate features. We assert that in unconstrained emotion video, a better classification strategy should model the change in features, versus simply combining them. We compute a derivative of features with histogram differencing and derivative of Gaussians and model the changes with a hidden Markov model. We are the first to incorporate temporal information in terms of derivatives. The efficacy of the approach is tested on the non-trivial AVEC2011 data set and increases classification rates on the data by as much as 13%.

08:30-09:00, Paper WePSAT2.4
A Novel Spatial-Temporal Multi-Scale Method for Detection and Analysis of Infrared Multiple Moving Objects
Zhang, Tianxu	Huazhong Univ. of Science and Technology, Wuhan
Li, Hao	Huazhong Univ. of Science and Technology, Wuhan
Li, Gaofei	Huazhong Univ. of Science and Technology, Wuhan
Chen, Jianchong	Huazhong Univ. of Science and Technology, Wuhan 430074
Keywords: Detection, Separation and Segmentation Abstract: In this paper, a novel spatial-temporal multi-scale method (STMSM) is proposed to solve the problem of detecting multiple moving objects against a complex background. Moving objects have multi-scale features in both the spatial and temporal domains. The motion salience sub-spaces determine the moving features, including the position, size and trajectory of each moving object, so the problem of detecting moving objects can be transformed into searching for optimal sub-spaces at different scales. This paper proposes a recursive algorithm for estimating motion salience in 3D space and an optimal determinant criterion. These can detect multiple objects at different spatial-temporal scales and extract their features against a complex background. The experimental results show that this method is effective in detecting multiple moving objects.

08:30-09:00, Paper WePSAT2.5
Classification Oriented Semi-Supervised Band Selection for Hyperspectral Images
Bai, Jun	Inst. of Automation, Chinese Acad. of Sciences
Xiang, Shiming	Inst. of Automation, Chinese Acad. of Sciences
Pan, Chunhong	Inst. of Automation, Chinese Acad. of Sciences
Keywords: Remote Sensing, Image and Video Processing Abstract: This paper proposes a new framework of band selection for hyperspectral images. The algorithm is designed for classification purposes. In this work, different subsets of bands are selected for different class pairs. Without prior knowledge of a spectral database, we estimate the spectral characteristics of objects using the labeled and unlabeled samples, benefiting from the concept of semi-supervised learning. Under the assumption of a Gaussian mixture model (GMM), the mean vectors and covariance matrices for each class are estimated. The separabilities for all pairs of classes are thus calculated on each band. The bands with the highest separabilities are then selected. To validate our band selection result, a support vector machine (SVM) is employed using a one-against-one (OAO) strategy. Experiments are carried out on a real hyperspectral image data set, and the results validate our algorithm.

08:30-09:00, Paper WePSAT2.6
Key Frame Selection Based on Jensen-Renyi Divergence
Xu, Qing	Tianjin Univ.
Keywords: Image and Video Processing Abstract: The key frame extraction is designed for obtaining a (very) compressed set of video frames that summarizes the essential content of a video sequence. In this paper, a well-known information theoretic measure, the Jensen-Renyi divergence (JRD), is studied to estimate the frame-by-frame distance between consecutive video images, for segmenting shots/subshots and for choosing key frames. Our new key frame extraction method, which is effective and computationally fast, contributes to a good and quick understanding of a large amount of video data.
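As a rough illustration of the measure named in this abstract, the Python sketch below computes a Jensen-Renyi divergence between normalized frame histograms. It is a sketch under assumptions rather than the paper's method: the order `alpha=0.5`, the uniform weights, and the toy two-bin histograms are illustrative choices, and the shot segmentation and key frame selection logic built on top of the distance is not shown.

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha != 1) of a normalized histogram."""
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)


def jensen_renyi_divergence(hists, weights=None, alpha=0.5):
    """Entropy of the weighted mixture minus the weighted mean entropy.

    hists is a list of normalized histograms of equal length. For alpha
    in (0, 1) the result is non-negative and is zero when all histograms
    coincide.
    """
    n = len(hists)
    if weights is None:
        weights = [1.0 / n] * n  # assumed uniform weighting
    bins = len(hists[0])
    mixture = [sum(w * h[k] for w, h in zip(weights, hists)) for k in range(bins)]
    mean_entropy = sum(w * renyi_entropy(h, alpha) for w, h in zip(weights, hists))
    return renyi_entropy(mixture, alpha) - mean_entropy


# Toy intensity histograms for two consecutive frames.
same = jensen_renyi_divergence([[0.5, 0.5], [0.5, 0.5]])
different = jensen_renyi_divergence([[1.0, 0.0], [0.0, 1.0]])
```

A large value between consecutive frames would suggest a content change, which is the kind of signal a shot or subshot boundary detector could threshold.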

08:30-09:00, Paper WePSAT2.7
Arbitrarily Oriented Text Detection Using Geodesic Distances between Corners and Skeletons
Zhang, Yong	Sun Yat-sen Univ.
Lai, Jian-huang	Sun Yat-sen Univ.
Keywords: Detection, Separation and Segmentation, Scene Understanding, Segmentation, Color and Texture Abstract: This paper proposes a corner and skeleton based method for arbitrarily oriented text detection. First, by calculating the minimum moment of inertia of each candidate text region, we obtain the orientation and minimum bounding box of each connected component. Second, based on the fact that corners are frequent and essential patterns in text regions, we propose a geodesic distance between a corner and the skeleton of a text region to measure the effective distance between corner and text. Finally, a geodesic distance weighted corner saturation parameter is given to determine which candidate regions are the true text regions. Experimental results on the ICDAR 2003 database show that the proposed method can handle natural scene text of both horizontal and non-horizontal orientation.

08:30-09:00, Paper WePSAT2.8
A Spectral Reflectance Representation for Recognition and Reproduction
Ratnasingam, Sivalogeswaran	NICTA
Robles-Kelly, Antonio	NICTA
Keywords: Coding and Compression, Image and Video Processing, Image and Video Understanding Abstract: In this paper we present a method to recover a spectral representation for reproduction and recognition on multispectral imagery. To do this, we commence by viewing the spectra in the image as a mixture which can be expressed in terms of the sample mean and a set of basis vectors and weights. This treatment leads to a MAP approach where the sample mean is given by the centers yielded by the application of the k-means clustering algorithm and the basis vectors are the eigenvectors of the corresponding covariance matrix. We compute the weights making use of a linear programming approach. We illustrate the utility of the method for purposes of skin recognition and spectra reconstruction.

08:30-09:00, Paper WePSAT2.9
Context-Aware Horror Video Scene Recognition Via Cost-Sensitive Sparse Coding
Ding, Xinmiao	China Univ. of Mining and Tech.
Li, Bing	National Lab. of Pattern Recognition, Inst. of Automa
Hu, Weiming	National Lab. of Pattern Recognition, Inst.
Xiong, Weihua	Omnivision Corp.
Wang, Zhenchong	China Univ. of Mining and Tech.
Keywords: Image and Video Understanding Abstract: Along with the ever-growing Web, horror video sharing through the Internet has interfered with our daily life and affected our health, especially that of children. Most current research on horror video filtering pays more attention to the extraction of global features or the selection of an optimal classifier, while neglecting the underlying contexts in a scene. In this paper, a novel cost-sensitive sparse coding (CSC) model is proposed to address the context inside a scene and the interrelations between audio-visual features simultaneously. The model essentially includes two aspects: one is to construct an inner contextual structure among frames from the same scene based on a graph; the other is to extend the classic sparse coding technique into a cost-sensitive sparse coding model for graph pattern classification as well as audio-visual feature fusion through a graph kernel. The experiments on various video scenes demonstrate that our method's performance is superior to that of other existing algorithms.

08:30-09:00, Paper WePSAT2.10
Pan-Sharpening Using Weighted Red-Black Wavelet
Liu, Qingjie	Beihang Univ.
Wang, Yunhong	Beihang Univ.
Zhang, Zhaoxiang	Beihang Univ.
Liu, Lining	Beihang Univ.
Keywords: Remote Sensing Abstract: In this paper, we propose a new method for remote sensing image pan-sharpening which is based on weighted red-black (WRB) wavelet and adaptive principal component analysis (PCA), where the adaptive PCA is used to reduce spectral distortions and the utilization of WRB wavelet is used to extract the spatial details in PAN images. To reduce the artifacts and spectral distortions in the pan-sharpened images, which were caused by the local instabilities and dissimilarities in the PAN and MS images, a local process strategy incorporating detail enhancement is introduced. The proposed method is tested on two datasets both acquired by QuickBird and compared with the existing methods. Experimental results show that our method can provide promising fused MS images at a high spatial resolution.

08:30-09:00, Paper WePSAT2.11
A Color Chart Detection Method for Automatic Color Correction
Minagawa, Akihiro	Fujitsu Lab. Ltd.
Katsuyama, Yutaka	Fujitsu Lab. Ltd.
Takebe, Hiroaki	Fujitsu Lab. Ltd.
Hotta, Yoshinobu	Fujitsu Lab. Ltd.
Keywords: Detection, Separation and Segmentation, Segmentation, Color and Texture, 2D/3D Object Detection and Recognition Abstract: A wide range of cameras is now used around the world in both indoor and outdoor settings. Although precise color reproducibility is an important requirement, the colors of captured images often seem to be different from those in the original scenes. To evaluate color reproducibility problems, the Macbeth color chart, as shown in Fig. 1, is frequently used. In conventional color correction processes, a color chart is first placed near a target and an image including the chart is captured. Image colors are then corrected based on color deviations of patches on the chart from their true color. However, since the location of the color chart in the image is not readily apparent in such images, color analysis and correction is normally done manually. In this paper, an automatic and fast color chart detection method is proposed. In the conventional approach to detecting a color chart, a special chart is used to determine the direction. The difficulty in detecting the chart lies in the fact that its position is unknown and its size within an image is not so large. In addition, in an uncorrected image, the chart colors may deviate from their true values. To detect a color chart precisely and automatically, we propose a colored pixel spotting approach based on the color array in the chart. Using this array information as a constraint leads to a low computational cost, which makes this method suitable for use in many types of cameras. The effectiveness of this algorithm is confirmed using a 167-image dataset that includes several sizes and rotations of color chart placed at arbitrary locations.

08:30-09:00, Paper WePSAT2.12
Single Image Super-Resolution Using Gaussian Mixture Model
He, HuaYong	School of Information Science and Tech. The computer applica
Li, JianHong	School of Information Science and Technology The computer applicatio
Luo, Xiaonan	Sun Yat-sen Univ.
Keywords: Image and Video Processing Abstract: In this paper we present a novel method for single image super-resolution (SR). Given the input low-resolution image, we create a pyramid pair: the ground truth pyramid and the interpolated pyramid. Our method aims to model the relationship between a pixel value in the ground truth pyramid and its corresponding 8-neighborhood vector in the interpolated pyramid using a Gaussian Mixture Model (GMM). Each pixel in the final high-resolution image is predicted from its corresponding 8-neighborhood vector through the trained GMM. Unlike prior example-based SR methods, our algorithm uses only the information in the input image rather than an external image database. Our proposed algorithm achieves much better results than many state-of-the-art algorithms in terms of both PSNR and visual perception.

08:30-09:00, Paper WePSAT2.13
Multi Scale Multi Structuring Element Top-Hat Transform for Linear Feature Detection
Xiangzhi, Bai	Beihang Univ.
Fugen, Zhou	Beihang Univ.
Bindang, Xue	Beihang Univ.
Keywords: Detection, Separation and Segmentation Abstract: To efficiently detect all the possible linear features, a multi scale multi structuring element top-hat transform based algorithm is proposed in this paper. The algorithm is divided into two parts: the multi scale multi structuring element top-hat transform and post-processing. In the multi scale multi structuring element top-hat transform, multiple structuring elements at multiple scales of increasing size are used by the top-hat transform to extract the useful information of linear features. In the post-processing, the detected linear feature regions are first binarized. Then, the small noise regions are removed. After that, the remaining linear feature regions are thinned to form the final binary detected linear features. Experimental results show that the proposed algorithm can efficiently detect all the possible linear features in different types of images and can be widely used for linear feature detection in different applications.
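
The first stage described in the abstract can be illustrated with a minimal sketch. It assumes grayscale images stored as lists of rows, centered line-shaped structuring elements in four directions, and a pixel-wise maximum to combine the responses; the abstract does not state the structuring elements or the combination rule, so these choices and all function names are hypothetical. Post-processing (binarization, small-region removal, thinning) is not shown.

```python
def line_offsets(length, direction):
    """Offsets (dy, dx) of a centered line; direction is a (dy, dx) step."""
    half = length // 2
    return [(k * direction[0], k * direction[1]) for k in range(-half, half + 1)]


def _morph(img, offsets, pick):
    """Erosion (pick=min) or dilation (pick=max); out-of-image pixels are ignored."""
    h, w = len(img), len(img[0])
    return [[pick(img[y + dy][x + dx] for dy, dx in offsets
                  if 0 <= y + dy < h and 0 <= x + dx < w)
             for x in range(w)] for y in range(h)]


def white_top_hat(img, offsets):
    """Image minus its opening (erosion followed by dilation)."""
    opened = _morph(_morph(img, offsets, min), offsets, max)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(img, opened)]


def multi_scale_top_hat(img, lengths=(3, 5),
                        directions=((0, 1), (1, 0), (1, 1), (1, -1))):
    """Pixel-wise maximum of top-hat responses over all scales and directions.

    Assumption: taking the maximum is one plausible way to merge responses;
    the paper may combine them differently.
    """
    h, w = len(img), len(img[0])
    result = [[0] * w for _ in range(h)]
    for length in lengths:
        for d in directions:
            th = white_top_hat(img, line_offsets(length, d))
            result = [[max(a, b) for a, b in zip(r1, r2)]
                      for r1, r2 in zip(result, th)]
    return result
```

A thin bright line survives the opening only along its own direction, so the structuring elements that cross it leave it in the top-hat response while a flat background is removed.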

08:30-09:00, Paper WePSAT2.14
Joint Multi-Frame Super-Resolution and Matting
Prabhu, Sahana	IITM
Ambasamudram, Rajagopalan	Indian Inst. of Tech. Madras
Keywords: Enhancement, Restoration and Filtering Abstract: Matting and super-resolution of frames from an image sequence have been studied independently in the literature. We propose a unified formulation to solve both inverse problems by assimilating matting within the super-resolution model. We adopt a multi-frame approach which uses data from adjacent frames to increase the resolution of the matte as well as the foreground.

08:30-09:00, Paper WePSAT2.15
Locally Linear Embedding Based Example Learning for Pan-Sharpening
Liu, Qingjie	Beihang Univ.
Liu, Lining	Beihang Univ.
Wang, Yunhong	Beihang Univ.
Zhang, Zhaoxiang	Beihang Univ.
Keywords: Remote Sensing Abstract: In this paper, a novel example-based method is proposed to solve the remote sensing pan-sharpening problem, utilizing an implicit non-parametric learning framework. The high-resolution (HR) and downsampled panchromatic (PAN) images are used to train the high/low-resolution patch pair dictionaries. From the perspective of locally linear embedding (LLE), every patch in each multi-spectral (MS) image band is modeled by its K nearest neighbors in the patch set generated from the low-resolution (LR) PAN image, and this model can be generalized to the HR condition. The intended HR MS patch is reconstructed from the corresponding neighbors in the HR PAN patches. Finally, the HR MS images are recovered by stitching these patches together. Two datasets of images acquired by the QuickBird satellite are used to test the performance of the proposed method. Experimental results show that the proposed method performs well in preserving spectral information as well as spatial details.

08:30-09:00, Paper WePSAT2.16
Envelope Extraction for Composite Shapes for Shape Retrieval
Song, Jianguo	Inst. of Computer Science & Tech.
Lu, Xiaoqing	Peking Univ.
Ling, Haibin	Temple Univ.
Wang, Xiao	Peking Univ.
Tang, Zhi	Peking Univ.
Keywords: Multimedia Analysis, Indexing and Retrieval, Image and Video Processing Abstract: Analysis of composite shapes has recently received an increasing amount of research attention. Unlike a silhouette, a composite shape rarely contains a complete envelope. In this paper, we propose a novel envelope extraction algorithm based on the Delaunay triangulation for composite shapes. By analyzing the spatial relationship among individual components of contours and their concavities, we establish new models to describe the envelope edges and their corresponding local enclosed regions. These new models are then used to extract accurate envelopes for composite shapes. We then apply the extracted envelopes to improve shape classification used in shape retrieval. The experimental results show that our algorithm effectively boosts existing shape retrieval algorithms.

08:30-09:00, Paper WePSAT2.17
Image Super-Resolution by Structural Sparse Coding
Ren, Jie	Peking Univ.
Liu, Jiaying	Peking Univ.
Wang, Mengyan	Peking Univ.
Guo, Zongming	Peking Univ.
Keywords: Image and Video Processing, Enhancement, Restoration and Filtering Abstract: Sparsity-based super-resolution has attracted much attention. Due to the high dimensionality of image data, sparsity-based methods often operate in a patch-wise manner and simply impose smoothness constraints on the overlapped regions between reconstructed patches. However, the imposed smoothness constraint is commonly too weak to regularize the super-resolution problem when the observed low-resolution image loses structure information. In this paper, we propose to improve the performance of the sparsity-based method by incorporating the structural correlations between neighboring patches. Concretely, the structural information is contained in the dictionary atoms which are used to sparsely represent the image patches. By incorporating the correlations of dictionary atoms into basic sparse coding, a structural sparse coding algorithm is proposed. Experimental results demonstrate that the proposed algorithm outperforms the sparsity-based baseline in both objective and subjective quality.

08:30-09:00, Paper WePSAT2.18
Estimation of the Human Performance for Pedestrian Detectability Based on Visual Search and Motion Features
Wakayama, Masashi	Nagoya Univ.
Deguchi, Daisuke	Nagoya Univ.
Doman, Keisuke	Nagoya Univ.
Ide, Ichiro	Nagoya Univ.
Murase, Hiroshi	Nagoya Univ.
Tamatsu, Yukimasa	DENSO Corp.
Keywords: Image and Video Understanding, Scene Understanding, Cognitive and Embodied Vision Abstract: This paper proposes a method for estimating the human performance of pedestrian detectability from in-vehicle camera images in order to warn a driver of the positions of pedestrians at an appropriate time. By introducing features related to visual search and motion of the target, the proposed method estimates the detectability of pedestrians accurately. Support Vector Regression (SVR) is used to estimate the detectability. Here, SVR is trained using features calculated by the proposed method with the ground truth obtained through experiments with human subjects. From experiments using in-vehicle camera images, we confirmed that the proposed features were effective for estimating the detectability of pedestrians.

08:30-09:00, Paper WePSAT2.19
A Hybrid Approach for Artificial Urdu Text Detection in Video Images
Jamil, Akhtar	COMSATS Inst. of Information Tech.
Abidi, Ali	National Univ. of Sciences & Tech.
Siddiqi, Imran	Bahria Univ.
Arif, Fahim	National Univ. of Sciences & Tech.
Keywords: Image and Video Processing, Image and Video Understanding, Multimedia Analysis, Indexing and Retrieval Abstract: The rapid growth of multimedia data containing rich textual information demands efficient indexing and retrieval techniques. In this paper, we propose a hybrid approach based on a combination of supervised and unsupervised techniques for the detection of horizontally aligned artificial Urdu text appearing in video images. First, we use an unsupervised approach to detect potential text regions which are later validated by a supervised method. In the first step, edge features followed by morphological operations are used to identify the candidate text regions. These regions are further refined by using edge density and geometrical filters. In the next step, these detected text regions are validated by an Artificial Neural Network which is trained on example text and non-text regions. The effectiveness of the proposed system is evaluated on a dataset of 500 images, yielding promising results.

08:30-09:00, Paper WePSAT2.20
Image Super-Resolution Based on Locality-Constrained Linear Coding
Taniguchi, Kazuki	Ritsumeikan Univ.
Han, Xian-Hua	Ritsumeikan Univ.
Iwamoto, Yutaro	Ritsumeikan Univ.
Sasatani, So	Ritsumeikan Univ.
Chen, Yen-wei	Ritsumeikan Univ.
Keywords: Enhancement, Restoration and Filtering Abstract: This paper presents a learning-based image super-resolution (SR) method for generating a high-resolution (HR) image from a single low-resolution (LR) image. Recent research investigated the image SR problem using sparse coding, which is based on good reconstruction of any local image patch by a sparse linear combination of atoms from an overcomplete dictionary. However, sparse-coding-based SR (ScSR) generally takes a significant amount of computational time to compute an HR image. Further, it can yield only a global dictionary D = [Dh; Dl] by jointly training the concatenated HR and LR local image patches, which results in no accurate correspondence between the HR and LR dictionaries. Therefore, we propose the generation of an HR image using a linear combination of several anchor points (codes) for a local patch based on locality-constrained linear coding (LLC), which is a fast implementation of local coordinate coding (LCC). In the proposed LLC-based strategy, each local patch is represented by a weighted linear combination of its nearby codes in a predefined codebook, and the linear weights become its local coordinate coding. Experimental results show that the HR images recovered with our proposed approach achieve comparable performance at a processing time much shorter than that of conventional methods.

08:30-09:00, Paper WePSAT2.21
Color Maximal-Dissimilarity Pattern for Pedestrian Detection
Wang, Qingyuan	Graduate Univ. of Chinese Acad. of Sciences
Pang, Junbiao	Beijing Univ. of Tech.
Liu, Guoyi	NEC Lab. China
Qin, Lei	Inst. of Computing Tech. Chinese Acad. C
Huang, Qingming	Chinese Acad. of Sciences
Jiang, Shuqiang	Chinese Acad. of Sciences
Keywords: Detection, Separation and Segmentation Abstract: Features play an important role in pedestrian detection, and considerable progress has been made on shape-based descriptors. However, color cues have barely been applied to detection tasks, seemingly due to the variable appearance of pedestrians. In this paper, the Color Maximal-Dissimilarity Pattern (CMDP) is proposed to encode color cues by two core operations, i.e., oriented filtering and max-pooling, which emulate the functions of the primary visual cortex (V1). Extensive experimental results reveal that the biologically explainable encoding scheme increases the invariance of color cues and outperforms the state-of-the-art color descriptor in terms of both accuracy and speed.

08:30-09:00, Paper WePSAT2.22
A Tracking Based Fast Online Complete Video Synopsis Approach
Sun, Lei	Tsinghua Univ. Beijing, China
Xing, Junliang	Tsinghua Univ.
Ai, Haizhou	Tsinghua Univ. China
Lao, Shihong	OMRON Social Solutions Co., LTD
Keywords: Multimedia Analysis, Indexing and Retrieval, Motion, Tracking and Video Analysis, Image and Video Processing Abstract: By segmenting moving objects out and then densely stitching them into background frames, video synopsis provides an efficient way to condense long videos while preserving most activities. Existing video synopsis methods, however, often suffer from either high computation cost due to global energy minimization or an unsatisfactory condense rate to avoid loss of important object activities. To address these problems, a tracking based fast online video synopsis approach is proposed in this paper, which makes the following three main contributions: 1) an online formulation of the video synopsis problem which makes the approach very fast and scalable to endless surveillance videos with reduced chronological disorders, 2) a tracking based scheme which can preserve most object activities, and 3) a complete optimization process over both temporal and spatial redundancies of the video which results in a much higher condense rate and a lower object conflict rate. Experimental results demonstrate the effectiveness and efficiency of the proposed approach compared to the traditional method on public surveillance videos.

08:30-09:00, Paper WePSAT2.23
Sorted Dominant Local Color for Searching Large and Heterogeneous Image Databases
Vidal, Marcio	Federal Univ. of Amazonas
Cavalcanti, João	Federal Univ. of Amazonas
Silva de Moura, Edleno	Federal Univ. of Amazonas
da Silva, Altigran	Univ. Federal do Amazonas
Torres, Ricardo	Inst. of Computing, Univ. of Campinas
Keywords: Multimedia Analysis, Indexing and Retrieval, Features and Image Descriptors Abstract: Recent work on Content-Based Image Retrieval (CBIR) has presented alternative methods for fast image indexing and retrieval using Bags of Visual Words (BoVW). In such methods, images are represented as sets of visual words, which can be indexed and searched using well-known text retrieval techniques, allowing fast search on large image databases. In this paper we propose a novel method based on BoVW that improves over current methods by using a new kind of local color descriptor, which we call SDLC, that encodes the most predominant color occurrences in blocks of different image regions. We report results of experiments we performed with two publicly available image databases. The results indicate that the use of SDLC led to a quite competitive CBIR method in comparison to the state-of-the-art.

08:30-09:00, Paper WePSAT2.24
Indexed Heat Curves for 3D-Model Retrieval
El Khoury, Rachid	TELECOM Lille1
Vandeborre, Jean-Philippe	Univ. of Lille 1
Daoudi, Mohamed	TELECOM Lille1
Keywords: Multimedia Analysis, Indexing and Retrieval, Pattern Recognition for Search, Retrieval and Visualization, Classification and Clustering Abstract: 3D-model analysis plays an important role in numerous applications. In this paper, we present an approach for 3D-model retrieval by creating an index of closed curves in R^3 generated from the center of a 3D-model, using a commute time mapping function. Our mapping function respects important properties in order to compute robust closed curves. Each curve describes a small region of the 3D-model. To describe the whole mesh, we compute a set of indexed closed curves. These curves yield a descriptor that is invariant to different transformations. Then we compute the distance between models by comparing the indexed curves. To evaluate our method, we used shapes from the SHREC 2012 database. The results show the robustness of our method on various classes of 3D-models with different positions.

08:30-09:00, Paper WePSAT2.25
Nonlocal Processing of 3D Colored Point Clouds
François, Lozes	Univ. de Caen Basse-Normandie GREYC UMR 6072
Elmoataz, Abderrahim	Univ. de Caen Basse-Normandie
Lezoray, Olivier	Univ. de Caen Basse-Normandie
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing Abstract: In this paper we present a methodology for nonlocal processing of 3D colored point clouds using regularization of functions defined on weighted graphs. To adapt it to nonlocal processing of 3D data, a new definition of patches for 3D point clouds is introduced and used for nonlocal filtering of 3D data such as colored point clouds. Results illustrate the benefits of our nonlocal approach to filter noisy 3D colored point clouds (either on spatial or colorimetric information).

08:30-09:00, Paper WePSAT2.26
Speech Emotion Recognition Based on Kernel Reduced-Rank Regression
Wenming, Zheng	Southeast Univ.
Zhou, Xiaoyan	Nanjing Univ. of Information Science & Tech.
Keywords: Speech and Audio Analysis Abstract: Emotion recognition from speech has been a very active research topic in pattern recognition. In this paper, we investigate the use of a kernel reduced-rank regression (KRRR) model to address the problem of emotion recognition from speech. KRRR is a nonlinear extension of the linear reduced-rank regression (RRR) model via the kernel trick, in which a kernel mapping is used for the multivariable of RRR. To find the optimal kernel for KRRR, a kernel optimization algorithm is also proposed in the paper. To evaluate the performance of the proposed method, we conduct extensive experiments on the Berlin emotional database. The experimental results confirm the effectiveness of the proposed method.

08:30-09:00, Paper WePSAT2.27
Context-Aware Learning for Automatic Sports Highlight Recognition
Ghanem, Bernard	King Abdullah Univ. of Science and Tech.
Kreidieh, Maya	American Univ. of Beirut
Farra, Marc	American Univ. of Beirut
Zhang, Tianzhu	Advanced Digital Sciences Center of Illinois
Keywords: Multimedia Analysis, Indexing and Retrieval, Image and Video Understanding, Image and Video Processing Abstract: Video highlight recognition is the procedure in which a long video sequence is summarized into a shorter video clip that depicts the most "salient" parts of the sequence. It is an important technique for content delivery systems and search systems which create multimedia content tailored to their users' needs. This paper deals specifically with capturing highlights inherent to sports videos, especially for American football. Our proposed system exploits the multimodal nature of sports videos (i.e. visual, audio, and text cues) to detect the most important segments among them. The optimal combination of these cues is learned in a data-driven fashion using user preferences (expert input) as ground truth. Unlike most highlight recognition systems in the literature that define a highlight to be salient only in its own right (globally salient), we also consider the context of each video segment w.r.t. the video sequence it belongs to (locally salient). To validate our method, we compile a large dataset of broadcast American football videos, acquire their ground truth highlights, and evaluate the performance of our learning approach.

08:30-09:00, Paper WePSAT2.28
No Reference Measurement of Contrast Distortion and Optimal Contrast Enhancement
Xu, Hongteng	Shanghai Jiao Tong Univ.
Zhai, Guangtao	Shanghai Jiao Tong Univ.
Yang, Xiaokang	Shanghai Jiao Tong Univ.
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing, Pattern Recognition for Art, Cultural Heritage and Entertainment Abstract: In this paper, a novel histogram-based model for contrast enhancement is proposed. Based on our analysis of the relationship between histogram and contrast, we establish a model which 1) achieves contrast enhancement by an optimal transform of the histogram, and 2) gives two metrics, called contrast gain and nonlinearity of transform, to measure the strength of enhancement and the seriousness of distortion caused by enhancement, respectively. The ratio of the two proposed metrics not only gives guidance for configuring the parameter of the algorithm, but also provides a useful measurement of contrast distortion, which can be a potential solution for judging whether the contrast of an image is optimal. Experimental results show the superior performance of the proposed algorithm in image enhancement.

08:30-09:00, Paper WePSAT2.29
Enhanced Semantic Descriptors for Functional Scene Categorization
Zen, Gloria	Univ. of Trento
Rostamzadeh, Negar	Univ. of Trento
Staiano, Jacopo	Univ. of Trento
Ricci, Elisa	Univ. of Perugia
Sebe, Nicu	Univ. of Trento
Keywords: Image and Video Understanding, Image and Video Processing, Detection, Separation and Segmentation Abstract: In this work we present a novel approach which combines semantic information with low level features extracted from a complex video scene. The proposed method for video scene understanding relies on a bag-of-words approach, in which, typically, visual words contain information about local motion, but information regarding what generated such motion is discarded. Instead, in our framework, the semantic information is embedded in the visual words, which allows a semantic categorization of the scene to be obtained automatically. We show the effectiveness of our method in a traffic analysis scenario: in this case, two main semantic classes, pedestrians and vehicles, are discovered.

08:30-09:00, Paper WePSAT2.30
Active Contours Segmentation with Edge Based and Local Region Based
Srikham, Manassanan	Chulalongkorn Univ.
Keywords: Detection, Separation and Segmentation, Image and Video Processing Abstract: In this paper, we propose a novel active contour method for image segmentation, which utilizes the advantages of the GAC and the LRAC methods. We consider the smoothing force of the GAC method and the local region-based force of the LRAC method. The advantages of our method are as follows. First, the proposed method uses a new region-based signed pressure force function, which can efficiently stop the contours at weak boundaries. Second, the proposed method can handle objects with heterogeneous texture and is able to reach into deep concave shapes. Finally, the proposed formulation can be easily implemented by a simple finite difference scheme and is computationally more efficient and accurate. The proposed method has been applied to both synthetic and real images.

08:30-09:00, Paper WePSAT2.31
Robust Detection of Adventitious Lung Sounds in Electronic Auscultation Signals
Sakai, Tomoya	Nagasaki Univ.
Kato, Madoka	Nagasaki Univ.
Miyahara, Sueharu	Nagasaki Univ.
Kiyasu, Senya	Nagasaki Univ.
Keywords: Detection, Separation and Segmentation, Speech and Audio Analysis, Enhancement, Restoration and Filtering Abstract: We present a sparse representation-based method for detecting adventitious lung sounds in low-quality auscultation signals. Since the noise cannot be represented sparsely by any bases, we can extract clear breath sounds and adventitious sounds from noisy electronic auscultation signals via the sparse representation. Using these clear sound components, we measure the level of abnormality, and robustly detect adventitious sounds with pulsating waveforms, a.k.a. crackles. We have experimentally confirmed that our detection achieves an average precision of about 85 percent regardless of noise level.

08:30-09:00, Paper WePSAT2.32
A Classwise Supervised Ordering Approach for Morphology Based Hyperspectral Image Classification
Courty, Nicolas	Univ. of Bretagne Sud
Aptoula, Erchan	Okan Univ.
Lefèvre, Sébastien	Univ. of South Brittany
Keywords: Remote Sensing, Classification and Clustering, Machine Learning and Data Mining Abstract: We present a new method for the spectral-spatial classification of hyperspectral images, by means of morphological features and manifold learning. In particular, mathematical morphology has proved to be an invaluable tool for the description of remote sensing images. However, its application to hyperspectral data is problematic, due to the absence of a complete lattice structure at higher dimensions. We address this issue by following up previous experimental indications on the interest of classwise orderings. The practical interest of the proposed approach is shown through comparison on the Pavia dataset with Extended Morphological Profiles, against which it achieves superior results.

08:30-09:00, Paper WePSAT2.33
Unsupervised People Organization and Its Application on Individual Retrieval from Videos
Hao, Pengyi	Waseda Univ.
Kamata, Sei-ichiro	Waseda Univ.
Keywords: Multimedia Analysis, Indexing and Retrieval, Image and Video Processing, Image and Video Understanding Abstract: In this paper, a method named histogram intersection metric learning from scene tracks is proposed for automatically organizing people in videos. We make the following contributions: (i) learning a histogram intersection distance instead of a Mahalanobis distance for widely used face features; (ii) learning the metric from scene tracks without manually labeling any examples, which enables learning across large variations in pose, expression, occlusion and illumination with a small number of face pairs and can distinguish different people powerfully. We first test face identification, track clustering, and people organization on a long film; then individual retrieval based on people organization from a large video dataset is evaluated, demonstrating significantly increased search quality with respect to previous approaches.
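
For orientation, the plain (unlearned) histogram intersection that the proposed metric builds on can be sketched as follows. This shows only the standard similarity and the distance derived from it, not the learned metric of the paper; the function names and the normalization to unit mass are assumptions made for illustration.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of the normalized histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 <= 0 or s2 <= 0:
        raise ValueError("histograms must have positive total mass")
    return sum(min(a / s1, b / s2) for a, b in zip(h1, h2))


def histogram_intersection_distance(h1, h2):
    """Distance in [0, 1]: 0 for identical shapes, 1 for disjoint histograms."""
    return 1.0 - histogram_intersection(h1, h2)
```

Two face descriptors with identical normalized histograms get distance 0, and descriptors with no overlapping bins get distance 1.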

08:30-09:00, Paper WePSAT2.34
Sparse Representation of Audio Features for Sputum Detection from Lung Sounds
Yamashita, Tatsuya	Gifu Univ.
Tamura, Satoshi	Gifu Univ.
Hayashi, Kenji	Gifu Univ.
Nishimoto, Yutaka	Gifu Univ.
Hayamizu, Satoru	Gifu Univ.
Keywords: Detection, Separation and Segmentation, Speech and Audio Analysis, Speech and Audio Processing Abstract: Medical staff need to check sputum accumulation in a patient's respiratory tract by lung sound auscultation at any time, which is a big burden for the staff. This paper aims to develop a system which notifies the medical staff of the appropriate timing for tracheal suction by analyzing the lung sounds of the patients. We present a novel framework for automatic sputum detection from lung sounds. We propose a sparse representation of audio features to realize robust detection in real environments. We show the effectiveness of our proposed method for three patients in an ICU of Gifu University Hospital, where the recorded lung sounds included electronic beeps, human voices, and other various noises.

08:30-09:00, Paper WePSAT2.35
Circular Object Detection Method Based on Separability and Uniformity of Feature Distributions Using Bhattacharyya Coefficient
Niigaki, Hitoshi	Nippon Telegraph and Telephone Corp.
Shimamura, Jun	NTT Corp.
Morimoto, Masashi	NTT Corp.
Keywords: Detection, Separation and Segmentation, Low-Level Vision, Features and Image Descriptors Abstract: This paper proposes a robust detection method for circular objects in noisy images with inhomogeneous contrast. This method detects circular objects not by the difference in image intensities between the object interior and its surroundings, but by the separability and uniformity of the image intensity distributions as calculated by the Bhattacharyya coefficient. The proposed method can detect obscure and textured circular objects, both of which are challenges for conventional methods. In addition, this method does not incur the cost of texture learning. Experiments demonstrate the effectiveness and robustness of the proposed method.
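
The Bhattacharyya coefficient named in the abstract has a standard definition for discrete distributions, BC(p, q) = sum_i sqrt(p_i * q_i). The sketch below computes it for two intensity histograms; how the paper builds the interior and surrounding distributions and turns the coefficient into separability and uniformity scores is not stated in the abstract, so the function name and the normalization step are assumptions.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Overlap in [0, 1] between two histograms: 1 = identical, 0 = disjoint."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    sp, sq = sum(p), sum(q)
    if sp <= 0 or sq <= 0:
        raise ValueError("histograms must have positive total mass")
    # Normalize both histograms to probability distributions before comparing.
    return sum(math.sqrt((a / sp) * (b / sq)) for a, b in zip(p, q))
```

A coefficient close to 0 between the histogram inside a candidate circle and the histogram of its surrounding ring would indicate well-separated distributions, even when the mean intensities are similar.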

08:30-09:00, Paper WePSAT2.36
Adaptive Support-Window Approximation to Bilateral Filtering
Lin, Guo-Shiang	Da-Yeh Univ.
Chen, Chun-Yu	National Chung Cheng Univ.
Kuo, Chun-Ting	National Chung Cheng Univ.
Lie, Wen-Nung	National Chung Cheng Univ.
Liu, Kai-Che	Medical Image Res. Department, Asian Inst. of TeleSurger
Keywords: Image and Video Processing Abstract: In this paper, a computationally efficient adaptive support-window scheme is proposed to approximate conventional bilateral filtering. The difference is that the pixel-wise weights in the bilateral filter are thresholded to be only 0 or 1. This results in an adaptive support window that depends on the local image structure around the anchor pixel. A cross-based algorithm is devised to obtain the adaptive support window. Experiments show that both noise removal and edge preservation can also be achieved using our proposed filter. By computing integral images during data aggregation, our algorithm is capable of achieving constant-time complexity regardless of the shape of the support window. Experiments demonstrate that our proposed computing scheme can reduce execution time by up to 98% with respect to the traditional bilateral filter.
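
The core idea of thresholding bilateral weights to 0 or 1 can be illustrated with a brute-force sketch. It assumes a grayscale image stored as a list of rows, a square search window, and a weight of 1 whenever the intensity difference to the anchor pixel is within a threshold. It deliberately omits the cross-based window construction and the integral-image aggregation that give the paper its constant-time behavior; the function and parameter names are hypothetical.

```python
def binary_weight_filter(img, radius, threshold):
    """Average each pixel over window neighbors whose intensity is within
    `threshold` of the anchor pixel (weight 1); all others get weight 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    if abs(img[yy][xx] - center) <= threshold:
                        total += img[yy][xx]
                        count += 1
            # count >= 1 because the anchor pixel always matches itself.
            out[y][x] = total / count
    return out
```

On a step edge the pixels across the edge fall outside the threshold, so each side is smoothed separately and the edge is preserved.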

08:30-09:00, Paper WePSAT2.37
Multiple-Food Recognition Considering Co-Occurrence Employing Manifold Ranking
Matsuda, Yuji	The Univ. of Electro-Communications, Tokyo
Yanai, Keiji	The Univ. of Electro-Communications, Tokyo
Keywords: Multimedia Analysis, Indexing and Retrieval, Image and Video Understanding, Pattern Recognition for Search, Retrieval and Visualization Abstract: In this paper, we propose a method to recognize food images which include multiple food items by considering co-occurrence statistics of food items. The proposed method employs a manifold ranking method which has been applied successfully to image retrieval in the literature. In the experiments, we prepared co-occurrence matrices of 100 food items using various kinds of data sources including Web texts, Web food blogs and our own food database, and evaluated the final results obtained by applying manifold ranking. The results show that co-occurrence statistics obtained from a food photo database are very helpful in improving the classification rate within the top ten candidates.
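
A minimal sketch of the generic manifold ranking iteration, f <- alpha * S * f + (1 - alpha) * y with S the symmetrically normalized affinity matrix, is given below. Using the co-occurrence matrix as the affinity matrix W and initial classifier scores as y is an assumption about how the pieces fit together; the abstract does not give the graph construction or the parameter values, and the names here are hypothetical.

```python
import math


def manifold_ranking(W, y, alpha=0.5, iters=200):
    """Propagate initial scores y over a graph with symmetric, non-negative
    affinity matrix W (zero diagonal) and return the final ranking scores."""
    n = len(W)
    deg = [sum(row) for row in W]
    # S = D^(-1/2) W D^(-1/2); entries with zero affinity stay zero.
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    f = list(y)
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f
```

With three items where only the first two co-occur and only the first has an initial score, the second item receives part of that score through the graph while the third stays at zero.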

08:30-09:00, Paper WePSAT2.38
Spatiotemporal Saliency Based on Distributed Opponent Oriented Energy
Zhou, Yue	Shanghai Jiaotong Univ. Inst. of image processing& pat
Shi, Kun	Shanghai Jiao Tong Univ.
Keywords: Image and Video Processing, Detection, Separation and Segmentation, Enhancement, Restoration and Filtering Abstract: A computational saliency model utilizing bio-inspired features for spatiotemporal saliency is presented in this paper. We first propose distributed opponent oriented energy for compact local dynamic texture description, motivated by the human visual system. Then, we integrate the derived motion characterization and a revised self-resemblance saliency framework. The high effectiveness and efficiency of the proposed method are extensively demonstrated both qualitatively and quantitatively, for background subtraction in the cases of extremely dynamic scenes and camera jitter. In terms of the trade-off between accuracy and computation cost, our method achieves competitive results compared with the state-of-the-art algorithm.

08:30-09:00, Paper WePSAT2.39
Image Enhancement by Wavelet Multi-Scale Edge Statistics
Liew, Alan Wee-Chung	Griffith Univ.
Jo, Jun	Griffith Univ.
Chun, Yong-Sik	Korea Aerospace Res. Inst.
Tae-Hong, Ahn	Chonnam Tech. Univ.
Tae Byong, Chae	Korea Aerospace Res. Inst.
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing Abstract: The distribution of wavelet modulus maxima across wavelet scales can be used to characterize edges in an image. In this paper, we present a novel algorithm that performs image enhancement by mapping the distribution of the wavelet modulus maxima of the blurred image to that of a generic sharp image. Experimental results confirm that the proposed algorithm is able to perform image enhancement without introducing unpleasant visual artifacts.

08:30-09:00, Paper WePSAT2.40
A Semi-Lagrangian Scheme for Area Preserving Flow
Carlini, Elisabetta	Sapienza Univ. di Roma
Ferretti, Roberto	Univ. di Roma Tre
Keywords: Enhancement, Restoration and Filtering, 2D/3D Object Detection and Recognition, Pattern Recognition for Search, Retrieval and Visualization Abstract: We propose a new Semi-Lagrangian scheme for the area-preserving Mean Curvature flow. The model uses a level set framework to propagate a closed planar curve. The corresponding flow has been proposed by Sapiro and Tannenbaum. The scheme has the advantage of allowing large time steps while still maintaining good accuracy. We apply the algorithm to recover the original shape of a synthetic image to which artificial noise has been added. The numerical result shows that the area is preserved during the filtering process.

08:30-09:00, Paper WePSAT2.41
Enhancement and Noise Reduction of Very Low Light Level Images
Zhang, Xiangdong	Xidian Univ.
Shen, Peiyi	Xidian Univ.
Luo, Lingli	Xidian Univ.
Zhang, Liang	Xidian Univ.
Song, Juan	Xidian Univ.
Keywords: Image and Video Processing, Enhancement, Restoration and Filtering, Low-Level Vision Abstract: A general method for image contrast enhancement and noise reduction is proposed in this paper. The method is developed especially for enhancing images acquired under very low light conditions, where the features of images are nearly invisible and the noise is severe. By applying an improved and effective image de-haze algorithm to the inverted input image, the intensity can be amplified so that the dark areas become bright and the contrast is enhanced. Then, the joint-bilateral filter with the original green component as the edge image is introduced to suppress the noise. Experimental results validate the performance of the proposed approach.

08:30-09:00, Paper WePSAT2.42
Visual Attention Region Determination for H.264 Videos
Hu, Kang-Ting	National Chung Cheng Univ.
Leou, Jin-Jang	National Chung Cheng Univ.
Hsiao, Han-Hui	National Chung Cheng Univ.
Keywords: Image and Video Processing, Enhancement, Restoration and Filtering Abstract: In this study, a visual attention region determination approach for H.264 videos using spatiotemporal features is proposed. After Gaussian filtering in Lab color space, the phase spectrum of Fourier transform (PFT) is used to generate the spatial saliency map of each video frame. On the other hand, the motion vector fields from an H.264 video bitstream are backward accumulated and the phase spectrum of Fourier transform (PFT) is used to obtain the temporal saliency map of each video frame. Then, the spatial and temporal saliency maps of each video frame are combined to obtain its spatiotemporal saliency map using adaptive fusion. Finally, a visual attention region determination scheme is used to determine visual attention regions (VARs) of each video frame. Based on the experimental results obtained in this study, the performance of the proposed approach is better than that of two comparison approaches.
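The phase spectrum of Fourier transform (PFT) step named in this abstract can be illustrated with a minimal sketch. It assumes a single-channel image given as a list of lists and uses a naive DFT so that it needs only the standard library; the Lab color conversion, the Gaussian filtering, the motion-vector accumulation and the adaptive fusion described by the authors are not reproduced, and the function names `dft2` and `pft_saliency` are hypothetical.

```python
import cmath


def dft2(a, sign):
    """Naive, unscaled 2D DFT of a list of lists (sign=-1 forward, +1 inverse)."""
    h, w = len(a), len(a[0])
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            s = 0j
            for y in range(h):
                for x in range(w):
                    angle = 2 * cmath.pi * (u * y / h + v * x / w)
                    s += a[y][x] * cmath.exp(sign * 1j * angle)
            out[u][v] = s
    return out


def pft_saliency(img):
    """Keep only the Fourier phase, transform back, and square the magnitude."""
    h, w = len(img), len(img[0])
    spectrum = dft2(img, -1)
    # Discard the amplitude: every non-zero coefficient is scaled to unit magnitude.
    phase_only = [[c / abs(c) if abs(c) > 1e-12 else 0j for c in row]
                  for row in spectrum]
    rec = dft2(phase_only, +1)
    return [[abs(c / (h * w)) ** 2 for c in row] for row in rec]


if __name__ == "__main__":
    # A flat background with one brighter pixel (illustrative data only).
    img = [[0.5] * 6 for _ in range(6)]
    img[2][4] = 1.0
    for row in pft_saliency(img):
        print(" ".join(f"{v:.2f}" for v in row))
```

Discarding the amplitude suppresses the dominant, repetitive image content, so the reconstructed map tends to concentrate on locations that deviate from the background. A practical version would use a fast Fourier transform and smooth the result.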

08:30-09:00, Paper WePSAT2.43
Node Localization in Unsynchronized Time of Arrival Sensor Networks
Burgess, Simon	Lund Univ.
Kuang, Yubin	Lund Univ.
Astroem, Kalle	Lund Univ.
Keywords: Remote Sensing Abstract: We present a method for solving the previously unstudied problem of localizing a set of receivers and directions from transmitters placed far from the receivers, measuring unsynchronized time of arrival data. The same problem is present in node localization of microphone and antenna arrays. The solution algorithm using 5 receivers and 9 transmitters is extended to the overdetermined case in a straightforward manner. Degenerate cases are shown to be when i) the measurement matrix has rank 4 or less or ii) the directions from the transmitters to the receivers lie on an intersection between the unit sphere and another quadric surface. In simulated experiments we explore how sensitive the solution is with respect to different degrees of far field approximations of the transmitters and with respect to noise in the data. Using real data we get a reconstruction of the receivers with a relative error of 14%.

08:30-09:00, Paper WePSAT2.44
Video Summarization Using Simple Action Patterns
Aydemir, M. Said	Yildiz Tech. Univ.
Ergul, Ugur	Yildiz Tech. Univ.
Guclu, Adem	Yildiz Tech. Univ.
Karsligil, M. Elif	Yildiz Tech. Univ.
Keywords: Image and Video Processing, Scene Understanding, Pattern Recognition for Surveillance and Security Abstract: Video summarization, whose uses range from information retrieval to data compression, plays a crucial role in multimedia understanding. In recent years, with the explosion in the number of videos and their areas of use, video summarization has become a necessity. Therefore, this work introduces a novel approach to the summarization problem based on human movement understanding. The proposed system provides efficient video knowledge extraction, especially for surveillance cases. Human-centric videos are analyzed with histograms of oriented gradients as the feature extractor and optical flow as the motion descriptor. On top of these, a template matching algorithm is implemented in a shrinkable and stretchable manner to search for periodicity and thereby extract patterns. Summarization is built on the validation of these extracted patterns with a correlation-based search-through subsystem.

08:30-09:00, Paper WePSAT2.45
A Probabilistic Framework for Logo Detection and Localization in Natural Scene Images
Roy, Ankush	Indian Statistical Inst.
Garain, Utpal	Indian Statistical Inst.
Keywords: Multimedia Analysis, Indexing and Retrieval, Detection, Separation and Segmentation, 2D/3D Object Detection and Recognition Abstract: This paper presents a probabilistic approach for logo detection and localization in natural scene images. Two probability distributions are computed: one considers the features extracted from the key points located inside a region, and the other refers to the shape geometry defined by the key points. Barycentric coordinates are used to define the shape statistics. The performance of the proposed approach is reported on two publicly available datasets. Logo detection is tested on BelgaLogos, where a statistically significant improvement is achieved over two recently proposed methods. Logo localization efficiency is tested on Flickr Logos 27.

08:30-09:00, Paper WePSAT2.46
Guided Inpainting and Filtering for Kinect Depth Maps
Liu, Junyi	Zhejiang Univ.
Gong, Xiaojin	Zhejiang Univ.
Liu, Jilin	Zhejiang Univ.
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing Abstract: Depth maps captured by Kinect-like cameras lack depth data in some areas and suffer from heavy noise. These defects have negative impacts on practical applications. In order to enhance the depth maps, this paper proposes a new inpainting algorithm that extends the original fast marching method (FMM) to reconstruct unknown regions. The extended FMM incorporates an aligned color image as the guidance for inpainting. An edge-preserving guided filter is further applied for noise reduction. To validate our algorithm and compare it with other existing methods, we perform experiments on both the Kinect data and the Middlebury dataset, which provide qualitative and quantitative results, respectively. The results show that our method is efficient and superior to others.


WeAT1	Main Hall
Invited Talk Session-III	Regular Session
Chair: Yagi, Yasushi	Osaka Univ.
Co-Chair: Suzuki, Kenji	The Univ. of Chicago

09:00-09:40, Paper WeAT1.1
Patient and Process Specific Imaging and Visualization for Computer Assisted Interventions (Invited Talk)
Navab, Nassir	Tech. Univ. München, Chair for Computer Aided Medical
Keywords: Abstract: In this talk, I first focus on the need for the development of novel techniques for patient and process specific intra-operative imaging and visualization and present some of our latest results as exemplary cases. As novel intra-operative and multi-modality imaging techniques provide the surgical crew with rich co-registered information, their appropriate visualization and their integration into surgical workflow, their validation and finally their full deployment are becoming active subjects of research in our community. Pattern recognition, computer vision and machine learning techniques are further developed to help recover and model surgical procedures and provide innovative solutions. I will in particular trace the Freehand SPECT and Camera Augmented Mobile C-arm (CAMC) from the early development of research ideas within our multi-disciplinary research laboratories to their deployment in different surgical suites. I will finally show how the 'real world laboratories' at our university hospitals demonstrate their efficiency through the smooth path they pave for bringing advanced imaging and visualization techniques into the surgical theatres.

09:40-10:00, Paper WeAT1.2
Constrained-MSER Detection of Retinal Pathology
Lim, Yong San Gilbert	National Univ. of Singapore
Lee, Mong Li	National Univ. of Singapore
Hsu, Wynne	National Univ. of Singapore
Keywords: Computer-Aided Diagnosis and Surgery, Medical Image Analysis and Registration Abstract: With the increase in age and diabetes-related eye diseases, there is a rising demand for systems which can efficiently screen and locate abnormalities in retinal images. In this paper, we propose a framework that utilizes a variant of the Maximally Stable Extremal Region method, termed C-MSER, to systematically detect various retinopathy pathologies such as microaneurysms, haemorrhages, hard exudates and soft exudates. Experiments on three real-world datasets show that C-MSER is effective for online screening of diabetic retinopathy.

10:00-10:20, Paper WeAT1.3
Automatic Localization of the Macula in a Supervised Graph-Based Approach with Contextual Superpixel Features
Wong, Wing Kee Damon	Inst. for Infocomm Res.
Liu, Jiang, Jimmy	Inst. for Infocomm Res. A-STAR
Tan, Ngan-Meng	Inst. for Infocomm Res.
Yin, Fengshou	Inst. for Infocomm Res.
Cheng, Xiangang	Inst. for Infocomm Res.
Cheung, Gemmy C.M.	Singapore Eye Res. Inst.
Bhargava, Mayuri	Singapore Eye Res. Inst.
Wong, Tien Yin	Singapore Eye Res. Inst.
Keywords: Medical Image Analysis and Registration, Computer-Aided Diagnosis and Surgery Abstract: Localization of the macula centre is an important step in retinal image analysis, in particular for macular disease. We propose the use of a superpixel-based approach for macular localization. Features are extracted from the superpixels, including a proposed feature which aims to describe the extent of the local region due to the superpixel influence. These features are used to calculate probability estimates to determine the macula centre. We evaluated our results on a large dataset of 728 images comprising normal, glaucoma and AMD eyes. The results are promising. Our method achieved an average error of 30 pixels, with all the detected macula centres within 1/8 disc diameters of the reference ground truth, which is lower than the other methods tested.

10:20-10:40, Paper WeAT1.4
Classification of Drusen Positions in Optical Coherence Tomography Data from Patients with Age-Related Macular Degeneration
Dufour, Pascal Andre	Univ. of Bern
De Zanet, Sandro	Univ. of Bern
Wolf-Schnurrbusch, Ute	Univ. Hospital Bern
Kowal, Jens	Univ. of Bern
Keywords: Medical Image Analysis and Registration, Computer-Aided Diagnosis and Surgery Abstract: Quantitative analysis of optical coherence tomography volumes is an important tool for both clinicians and researchers. Until now, most work has focused on segmentation of the intraretinal cell layers, but the segmentation of pathological datasets remains challenging. We propose the application of random forest to detect the locations of drusen in the retinal pigment epithelium. This is an important step for further analysis of optical coherence tomography data, for segmentation or otherwise. The presented combination of Bruch's Membrane segmentation with subsequent sampling around the retinal pigment epithelium is a way to quickly compute discriminative features for classification. The proposed method achieves an AUC of 0.94 on our test set, while keeping the computational complexity at a minimum. This makes a clinical setup feasible and provides a tool for clinicians and researchers to quantitatively measure disease progression.


WeAT2	Multi-Purpose Hall
Low-Level and Physics-Based Vision	Regular Session
Chair: Horiuchi, Takahiko	Chiba Univ.
Co-Chair: Wang, Hanzi	Xiamen Univ.

09:00-09:20, Paper WeAT2.1
Image Super-Resolution Based on Multikernel Regression
Gu, Ying	Xiamen Univ.
Qu, Yanyun	Xiamen Univ.
Fang, Tianzhu	Xiamen Univ.
Li, Cuihua	Xiamen Univ.
Wang, Hanzi	Xiamen Univ.
Keywords: Vision for Graphics, Physics-Based Vision, Low-Level Vision Abstract: In this paper, a novel approach to single image super-resolution based on multikernel regression is presented. This approach aims to learn the map between the space of high-resolution image patches and the space of blurred high-resolution image patches, which are the interpolation results generated from the corresponding low-resolution images. Kernel regression based super-resolution is promising, but kernel selection is a critical problem. In order to avoid selecting the kernel via large amounts of cross-verification, multikernel regression is applied to learn the map function. This approach is efficient, and the experimental results show that it delivers high-quality performance in comparison with other super-resolution methods.

09:20-09:40, Paper WeAT2.2
Predicting Environment Illumination Effects on Material Appearance
Filip, Jiri	Inst. of Information Theory and Automation of the AS CR
Haindl, Michael	Inst. of Information Theory and Automation
Stancik, Jaroslav	Varos s.r.o.
Keywords: Segmentation, Color and Texture, Features and Image Descriptors, Mixed and Augmented Reality Abstract: Environment illumination is a key to achieving a realistic visualization of material appearance. One way to achieve such an illumination is an approximation by rendering of the material surface lit by a finite set of point light sources. In this paper we employed visual psychophysics to identify a minimal number of point light sources approximating realistic illumination. Furthermore, we analyzed stimuli images and correlation of their statistics with obtained psychophysical data. Finally, image statistics were identified which can predict such a minimal environment representation for three tested materials, depending on the visual properties of the illumination environment.

09:40-10:00, Paper WeAT2.3
Tangent Estimation Along 3D Digital Curves
Postolski, Michal	Univ. Paris-Est, Lab. Eq
Janaszewski, Marcin	Tech. Univ. of Lodz, Computer Engineering Department
Kenmochi, Yukiko	Univ. Paris-Est
Lachaud, Jacques-Olivier	Lab. of Mathematics (UMR CNRS 5127), Univ. of Savoie,
Keywords: Vision for Graphics, 2D/3D Object Detection and Recognition, Medical Image Analysis and Registration Abstract: In this paper, we present a new three-dimensional (3D) tangent estimator by extending the two-dimensional (2D) lambda-maximal segment tangent (lambda-MST) estimator, which has very good theoretical and practical behaviors. We show that our proposed estimator keeps the same time complexity, accuracy and experimental asymptotic behaviors as the original 2D one.

10:00-10:20, Paper WeAT2.4
Estimation of Multiple Light Sources from Specular Highlights
Kato, Yu	Chiba Univ.
Horiuchi, Takahiko	Chiba Univ.
Tominaga, Shoji	Chiba Univ.
Keywords: Scene Understanding Abstract: This paper proposes a method for estimating the illuminant spectral power distributions and the positional relationship of multiple light sources under a complex illumination environment. A multiband camera system is used for capturing spectral images of dielectric objects in a scene. Specular highlights, which are detected on curved object surfaces with different object colors, are used as a clue for estimating the light source information. The illuminant spectra of light sources are estimated from the camera data for each highlight area. Then, the illuminant spectral estimates are obtained for a different set of light sources. Next, the positional relationship among the light sources is predicted by classifying the detected highlights and the estimated spectra using probabilistic relaxation labeling. The feasibility of the proposed method is examined in experiments on real scenes.

10:20-10:40, Paper WeAT2.5
Depth-Adaptive Superpixels
Weikersdorfer, David	Tech. Univ. München
Gossow, David	Tech. Univ. München
Beetz, Michael	Tech. Univ. München
Keywords: Segmentation, Color and Texture, 2D/3D Object Detection and Recognition, Vision for Robotics Abstract: We propose a novel oversegmentation technique for RGB-D images. The visible surface of the 3D geometry is partitioned into uniformly distributed and equally sized planar patches. This results in a classic oversegmentation of pixels into depth-adaptive superpixels which correctly reflect deformation through perspective projection. The advantages of depth-adaptive superpixels are demonstrated by using spectral graph theory to create image segmentations in near realtime. Our algorithms outperform state-of-the-art oversegmentation and image segmentation algorithms both in quality and runtime.


WeAT3	Room 101+102
Shape Analysis	Regular Session
Chair: Kittler, Josef	Univ. of Surrey
Co-Chair: Robles-Kelly, Antonio	NICTA

09:00-09:20, Paper WeAT3.1
Shape Analysis on the Hypersphere of Wavelet Densities
Moyou, Mark Matthew	Florida Inst. of Tech.
Peter, Adrian	Florida Inst. of Tech.
Keywords: 2D/3D Object Detection and Recognition, Classification and Clustering, Feature Reduction and Manifold Learning Abstract: We present a novel method for shape analysis which represents shapes as probability density functions and then uses the intrinsic geometry of this space to match similar shapes. In our approach, shape densities are estimated by representing the square-root of the density in a wavelet basis. Under this model, each density (of a corresponding shape) is then mapped to a point on a unit hypersphere. For each category of shapes, we find the intrinsic Karcher mean of the class on the hypersphere of shape densities, and use the minimum spherical distance between a query shape and the means to classify shapes. Our method is adaptable to a variety of applications, does not require burdensome preprocessing like extracting closed curves, and experimental results demonstrate it to be competitive with contemporary shape matching algorithms.
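The geometry used in this abstract can be sketched in a few lines. Taking the square root of a density turns it into a unit-norm vector, so each shape becomes a point on a hypersphere, and the distance between two shapes is the arc length between their points. The sketch below is an illustration under simplifying assumptions: it substitutes plain histogram bins for the wavelet coefficients used by the authors, takes the class means as given instead of computing Karcher means, and the names `to_sphere`, `spherical_distance` and `classify` are hypothetical.

```python
import math


def to_sphere(density):
    """Map a non-negative histogram to a unit vector via the square-root transform."""
    total = sum(density)
    return [math.sqrt(v / total) for v in density]


def spherical_distance(u, v):
    """Great-circle (arc-length) distance between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error


def classify(query, class_means):
    """Return the label whose mean point is closest on the hypersphere."""
    return min(class_means, key=lambda label: spherical_distance(query, class_means[label]))


if __name__ == "__main__":
    # Illustrative class means (assumed to be precomputed, not Karcher means).
    means = {
        "left-heavy": to_sphere([6, 3, 1, 0]),
        "right-heavy": to_sphere([0, 1, 3, 6]),
    }
    query = to_sphere([5, 4, 1, 0])
    print(classify(query, means))
```

Because the squared components of each vector sum to one, the distance is bounded by pi/2 for densities with disjoint support and is zero for identical densities.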

09:20-09:40, Paper WeAT3.2
A Grassmann Manifold-Based Domain Adaptation Approach
Zheng, Jingjing	Univ. of Maryland, Coll. Park
Liu, Ming-Yu	Mitsubishi Electric Res. Lab.
Chellappa, Rama	Univ. of Maryland
Phillips, Jonathon	NIST
Keywords: 2D/3D Object Detection and Recognition, Classification and Clustering, Image and Video Processing Abstract: Domain adaptation algorithms that handle shifts in the distribution between training and testing data are receiving much attention in computer vision. Recently, a Grassmann manifold-based domain adaptation algorithm that models the domain shift using intermediate subspaces along the geodesic connecting the source and target domains was presented by Gopalan et al. We build upon this work and propose replacing the step of concatenating feature projections on a very few sampled intermediate subspaces by directly integrating the distance between feature projections along the geodesic. The proposed approach considers all the intermediate subspaces along the geodesic. Thus, it is a more principled way of quantifying the cross-domain distance. We present the results of experiments on two standard datasets and show that the proposed algorithm yields favorable performance over previous approaches.

09:40-10:00, Paper WeAT3.3
Extracting Planar Structures Efficiently with Revisited BetaSAC
Decrouez, Marion Caroline	INRIA Rhone-Alpes
Romain, Dupont	CEA LIST
Gaspard, Francois	CEA LIST
Crowley, James	INP Grenoble
Keywords: Segmentation, Color and Texture, Scene Understanding, Vision for Robotics Abstract: We present a new method for the detection of multiple homographies in image pairs. Our aim is to show that we can approach the optimal solution in a short time using an approach based on the well-known RANSAC algorithm. Given feature correspondences between two similar images, our algorithm iteratively generates homography hypotheses using a suitable sampling, optimizes the promising hypotheses and combines them to keep the best models. The performance of the method is demonstrated on synthetic and real data, and we show that it outperforms the J-linkage technique for the multiple homographies estimation problem.

10:00-10:20, Paper WeAT3.4
Fast and Robust Monocular 3D Deformable Shape Estimation for Inextensible and Smooth Surfaces
Ferraz, Luis	Univ. Pompeu Fabra
Binefa, Xavier	Univ. Pompeu Fabra
Keywords: Stereo and Image-Based Modeling, Motion, Tracking and Video Analysis, Physics-Based Vision Abstract: We present a method for fast and robust recovery of the 3D shape of inextensible and smooth surfaces from a monocular image. We propose a weighted iterative least squares approach to minimize the reprojection error between 2D-3D point correspondences while preserving the 3D lengths. In addition, a local 3D smoothness constraint for each mesh vertex is proposed to increase the robustness to noisy correspondences and occluded or poorly represented facets. Moreover, the proposed method automatically updates the relevance of each constraint in order to maximize the smoothness and minimize the reprojection error. Experimental results show that our approach obtains accurate results and is faster than state-of-the-art algorithms using similar constraints.

10:20-10:40, Paper WeAT3.5
Detecting Discontinuities for Surface Reconstruction
Wang, Yinting	Zhejiang Univ.
Bu, Jiajun	Zhejiang Univ.
Li, Na	Zhejiang International Studies Univ.
Song, Mingli	Zhejiang Univ.
Tan, Ping	National Univ. of Singapore
Keywords: Stereo and Image-Based Modeling Abstract: Photometric stereo algorithms produce a map of normal directions from the input images. The 3D surface can be reconstructed from this normal map. Existing surface reconstruction works often assume the normal map is integrable but contaminated by small-scale non-integrable noise. However, real surfaces often contain large discontinuities such as occlusion boundaries and sharp depth changes, which break the integrability assumption commonly made in many works. Here, we propose a method to detect these discontinuities by combining multiple geometric cues with trained classifiers and a simple graph optimization. The surface is then reconstructed with the guidance of these detected discontinuities. Experiments show our method outperforms existing works.


WeAT4	Hall 200
Tracking	Regular Session
Chair: Elgammal, Ahmed	Rutgers Univ.
Co-Chair: Svoboda, Tomas	Czech Tech. Univ. Faculty of Electrical Engineering

09:00-09:20, Paper WeAT4.1
Robust Tracking by Accounting for Hard Negatives Explicitly
Lei, Peng	Beijing Inst. of Tech. Hill Res.
Wu, Tianfu	Lotus Hill Res. Inst.
Pei, Mingtao	Beijing Inst. of Tech.
Ming, Anlong	Beijing Univ. of Posts and Telecommunications
Yao, Zhenyu	Beijing Univ. of Posts and Telecommunications;Lotus Hill Re
Keywords: Motion, Tracking and Video Analysis Abstract: In this paper, we present a method of robust tracking by accounting for hard negatives (i.e., distractors) of the tracking target explicitly. Our method extends the recently proposed Tracking-Learning-Detection (TLD) approach [5] in two aspects: (i) When learning the on-line fern detector, instead of using a set of features which are first randomly generated and then fixed throughout the tracking, we utilize a feature selection stage which constantly improves the performance of the detector, especially in tracking articulated objects (e.g., pedestrians); (ii) To address the diversity of distractors, instead of tracking a target against the whole set of collected negative examples, we account for the hard negatives explicitly, so that tracking drifts are largely prevented when multiple similar-looking targets appear in videos (e.g., people with white skirts and jeans). Experiments on a series of diverse videos show that our method outperforms TLD.

09:20-09:40, Paper WeAT4.2
Point Track Creation in Unordered Image Collections Using Gomory-Hu Trees
Svärm, Linus	Lund Univ.
Zhayida, Simayijiang	Lund Univ.
Enqvist, Olof	Lund Univ.
Olsson, Carl	Lund Univ.
Keywords: Motion, Tracking and Video Analysis Abstract: Geometric reconstruction from image collections is a classical computer vision problem. The problem essentially consists of two steps: first, the identification of matches and assembling of point tracks, and second, multiple view geometry computations. In this paper we address the problem of constructing point tracks using graph theoretical algorithms. From standard descriptor matches between all pairs of images we construct a graph representing all image points and all possible matches. Using Gomory-Hu trees we make cuts in the graph to construct the individual point tracks. We present both theoretical and experimental results (on real datasets) that clearly demonstrate the benefits of using our approach.

09:40-10:00, Paper WeAT4.3
Robust Visual Tracking with the Cross-Bin Metric
Lv, Chaoxin	Xiamen Univ.
Yan, Yan	Xiamen Univ.
Wang, Hanzi	Xiamen Univ.
Keywords: Motion, Tracking and Video Analysis Abstract: In this paper, we propose an adaptive particle filter method based on cross-bin matching, which makes use of the fast and robust earth mover's distance (EMD) with a new ground distance as the similarity measure, for robust visual tracking. In contrast to the traditional bin-by-bin metrics, the cross-bin metric used in the EMD is capable of efficiently capturing the intrinsic affinity relationships among samples, resulting in more accurate and effective tracking results. Experimental results demonstrate the effectiveness and robustness of the proposed tracking method in coping with challenging situations, such as background clutter, occlusions, abrupt motions and jumps.
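The contrast between bin-by-bin and cross-bin metrics mentioned in this abstract can be seen in a small sketch. It assumes one-dimensional histograms of equal total mass and the plain ground distance |i - j| between bin indices, for which the earth mover's distance reduces to a running sum of differences; the new ground distance and the particle filter proposed by the authors are not reproduced, and the function names are hypothetical.

```python
def l1_distance(h1, h2):
    """Bin-by-bin metric: compares each bin only with the bin at the same index."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def emd_1d(h1, h2):
    """Cross-bin metric: 1D earth mover's distance with ground distance |i - j|.

    Assumes both histograms have the same total mass. The mass carried across
    each bin boundary is the running difference of the two histograms.
    """
    carried = 0.0
    work = 0.0
    for a, b in zip(h1, h2):
        carried += a - b
        work += abs(carried)
    return work


if __name__ == "__main__":
    reference = [1.0, 0.0, 0.0, 0.0]
    small_shift = [0.0, 1.0, 0.0, 0.0]
    large_shift = [0.0, 0.0, 0.0, 1.0]
    print(l1_distance(reference, small_shift), l1_distance(reference, large_shift))
    print(emd_1d(reference, small_shift), emd_1d(reference, large_shift))
```

The bin-by-bin distance cannot tell a small shift from a large one, while the cross-bin distance grows with the size of the shift, which is the property that makes such metrics attractive for comparing appearance histograms.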

10:00-10:20, Paper WeAT4.4
Tracking with Context As a Semi-Supervised Learning and Labeling Problem
Cerman, Lukáš	Center for Machine Perception, Department of Cybernetics, Faculty
Hlavac, Vaclav	Czech Tech. Univ. Faculty of Electrical Engineering
Keywords: Motion, Tracking and Video Analysis, Classification and Clustering Abstract: It is suggested how a Markov random field can be used for object tracking with context information. The tracking is formulated as a two-layer process. In the first phase, the image is represented by a set of feature points which are tracked by a standard tracker. In the second phase, the proposed semi-supervised learning and labeling algorithm is used to label the points into three classes: object, background and companion. The object state (pose) is defined by the set of points labeled as the object. The companion represents the object context and contains non-object points with a motion similar to the motion of the object. As initialization, only the labels of the object points are provided by a user in the very first frame. Appearance and motion models of the three classes and the labels of the remaining points in the whole video sequence are estimated in a GrabCut fashion. We show that the use of the companion class together with a 3D (space-time) Markov random field helps to identify object points behind full occlusions or under strong appearance changes.

10:20-10:40, Paper WeAT4.5
Camera Tracking Based on Circular Point Factorization
Calvet, Lilian	Univ. of Toulouse, IRIT
Gurdjos, Pierre	Univ. of Toulouse
Charvillat, Vincent	IRIT
Keywords: Motion, Tracking and Video Analysis, Geometric and Photometric Registration, Vision for Robotics Abstract: Concentric circles (C2Tags) are planar markers which offer great advantages for detection and tracking. As the circular point-pair (CPP) is the geometric information encoded by a C2Tag, this work is focused on factorization techniques for Structure-and-Motion from multiple CPP images. Gathering all of them in a measurement matrix, two issues are addressed: how to scale the existing entries and how to fill the missing ones. We describe a complete algorithm and prove it. We validate our contributions on simulated and real images.


WeAT5	Hall 300
Learning-I	Regular Session
Chair: Uchida, Seiichi	Kyushu Univ.
Co-Chair: He, Ran	Inst. of Automation, Chinese Acad. of Sciences

09:00-09:20, Paper WeAT5.1
Set-Valued Bayesian Inference with Probabilistic Equivalence
Le Capitaine, Hoel	Univ. of Nantes
Keywords: Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition Abstract: In this paper, a unified view of the problem of class-selection with Bayesian classifiers is presented. Selecting a subset of classes instead of a singleton makes it possible 1) to reduce the error rate and 2) to propose a reduced set to another classifier or an expert. This second step provides additional information, and therefore increases the quality of the result. The proposed framework, based on the evaluation of the probabilistic equivalence, makes it possible to recover the class-selective frameworks that have been proposed in the literature. Several experiments show the effectiveness of this generic proposition.

09:20-09:40, Paper WeAT5.2
Robust Multiple Model Estimation with Jensen-Shannon Divergence
Zhou, Kai	Inst. of Automation and Control, Vienna Univ. of Technol
Varadarajan, Karthik Mahesh	TU Wien
Zillich, Michael	Vienna Univ. of Tech.
Vincze, Markus	TU Wien
Keywords: Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition, Detection, Separation and Segmentation Abstract: In order to estimate multiple structures without prior knowledge of the noise scale, this paper utilizes Jensen-Shannon Divergence (JSD), which is a similarity measurement method, to represent the relations between pairwise data conceptually. This conceptual representation encompasses the geometrical relations between pairwise data as well as the information about whether pairwise data coexist in one model's inlier set or not. Tests on datasets comprising noisy inliers and a large percentage of outliers demonstrate that the proposed solution can efficiently estimate multiple models without prior information. Superior performance in synthetic experiments and practical tests is also demonstrated to validate the proposed approach.
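For reference, the Jensen-Shannon divergence named in this abstract can be computed as follows. This is a minimal sketch for two discrete distributions given as lists that each sum to one; how the authors derive such distributions from pairs of data points and model hypotheses is not stated in the abstract and is not reproduced here. The function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL divergence to the mixture of p and q."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


if __name__ == "__main__":
    a = [0.7, 0.2, 0.1]
    b = [0.1, 0.2, 0.7]
    print(js_divergence(a, a), js_divergence(a, b))
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric and, with base-2 logarithms, bounded between 0 and 1, so it stays finite even when the two distributions have disjoint support.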

09:40-10:00, Paper WeAT5.3
Multi-Label Learning Vector Quantization Algorithm
Jin, Xiao-Bo	Henan Univ. of Tech.
Geng, Guang-Gang	Computer Network Information Center, Chinese Acad. of Science
Yu, Junwei	School of Information Science and Engineering, Henan Univ.
Zhang, Dexian	School of Information Science and Engineering, Henan Univ.
Keywords: Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining Abstract: Multi-label learning is increasingly required by many domains such as text categorization and scene classification. Learning vector quantization (LVQ) offers a simple, powerful and scalable algorithm for single-label learning. In this work, we adapt LVQ to multi-label problems and call the resulting algorithm ML-LVQ. It adjusts two prototypes at a time for each label of the example to approximately minimize the ranking loss and thereby improve the ranking measures. Moreover, we use the single-label AdaBoost.MH as the meta-labeler to predict the number of labels for the test examples, which benefits the bipartition measures. Our empirical study on 6 public multi-label benchmark datasets shows that our proposed algorithm ML-LVQ is statistically significantly better than multi-label AdaBoost.MH and multi-label AdaBoost with the single-label AdaBoost.MH as the meta-labeler, especially under the evaluations of the one-error and the mac-F1 (p = 0.03).

10:00-10:20, Paper WeAT5.4
Centroid-Based Clustering for Graph Datasets
Chen, Lifei	Fujian Normal Univ.
Wang, Shengrui	Univ. of Sherbrooke
Yan, Xuanhui	Fujian Normal Univ.
Keywords: Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining Abstract: Due to the absence of node and edge correspondences between graphs, the existing central clustering algorithms usually perform graph clustering in some embedded spaces, or confine the cluster centers to the set median graphs. In this paper, a centroid-based algorithm is proposed for clustering directly on the graphs, based on a newly defined dissimilarity measure via structure matching. The resulting node correspondences are used to rearrange the graph structures, such that the cluster centroids, i.e., the generalized median graphs, can be defined directly on the graphs and subsequently be optimized in a k-means type process. The experimental results on four real-world graph datasets demonstrate that the new algorithm significantly improves the clustering accuracy, and is able to discover meaningful generalized median graphs.

10:20-10:40, Paper WeAT5.5
Recursive NMF: Efficient Label Tree Learning for Large Multi-Class Problems
Liu, Lei	Michigan State Univ.
Comar, Prakash	Michigan State Univ.
Saha, Sabyasachi	Narus Inc.
Tan, Pang-Ning	Michigan State Univ.
Nucci, Antonio	Narus Inc.
Keywords: Classification and Clustering, Machine Learning and Data Mining Abstract: Many object recognition or concept identification tasks require accurate detection of a large number of classes. These applications present enormous challenges to traditional classification methods, which are mostly designed for solving problems with a small number of classes. In this paper, we develop a method called recursive non-negative matrix factorization (RNMF) for building a hierarchical label tree over a set of classes. The internal nodes of the tree employ linear classifiers to propagate a data instance to its corresponding leaf node, where one or more one-class support vector machine (SVM) classifiers are applied to accurately predict its class. Our experimental results show that the proposed method achieves a significant gain in test efficiency and comparable accuracy to some of the more expensive label tree learning methods.


WePAT6	Room 201+202
Poster Session (07, 08)	Poster Session


WeBT1	Main Hall
Mixed Reality and Human Computer Interaction	Regular Session
Chair: Kato, Hirokazu	Nara Inst. of Science and Tech.
Co-Chair: Robles-Kelly, Antonio	NICTA

11:20-11:40, Paper WeBT1.1
Exploiting Sensors on Mobile Phones to Improve Wide-Area Localization
Arth, Clemens	Graz Univ. of Tech.
Mulloni, Alessandro	Graz Univ. of Tech.
Schmalstieg, Dieter	Graz Univ. of Tech.
Keywords: Mixed and Augmented Reality, Geometric and Photometric Registration, Features and Image Descriptors Abstract: In this paper, we discuss how the sensors available in modern smartphones can improve 6-degree-of-freedom (6DOF) localization in wide-area environments. In our research, we focus on phones as a platform for large-scale Augmented Reality (AR) applications. Thus, our aim is to estimate the position and orientation of the device accurately and quickly, since it is unrealistic to assume that users are willing to wait tens of seconds before they can interact with the application. We propose supplementing vision methods with sensor readings from the compass and accelerometer available in most modern smartphones. We evaluate this approach on a large-scale reconstruction of the city center of Graz, Austria. Our results show that our approach improves both accuracy and localization time, in comparison to an existing localization approach based solely on vision. We conclude our paper with a real-world validation of the approach on an iPhone 4S.

11:40-12:00, Paper WeBT1.2
Robust Model-Based Tracking Considering Changes in the Measurable DoF of the Target Object
Kumagai, Kenzo	Nara Inst. of Science and Tech.
Oikawa, Marina Atsumi	Nara Inst. of Science and Tech.
Taketomi, Takafumi	Nara Inst. of Science and Tech.
Yamamoto, Goshiro	Nara Inst. of Science and Tech.
Miyazaki, Jun	Nara Inst. of Science and Tech.
Kato, Hirokazu	Nara Inst. of Science and Tech.
Keywords: Geometric and Photometric Registration, Motion, Tracking and Video Analysis, Mixed and Augmented Reality Abstract: Model-based tracking approaches estimate the pose of the object by minimizing the re-projection error. However, when the object has some ambiguity, for instance, rotation invariance, the 3D pose cannot be correctly estimated. This paper proposes a novel method that allows continuous tracking even when the measurable Degrees of Freedom (DoF) of the target object change, and that is able to recover one missing DoF. A pose ambiguity test and recovery of the 3D pose by null space search were added to a general model-based tracking algorithm. Experiments were conducted in synthetic and real-world environments to validate the proposed method.

12:00-12:20, Paper WeBT1.3
Combining Contrast Saliency and Region Discontinuity for Precise Hand Segmentation in Projector-Camera System
Dai, Jingwen	The Chinese Univ. of Hong Kong
Chung, Chi-kit Ronald	The Chinese Univ. of Hong Kong
Keywords: Human Computer Interaction, Segmentation, Color and Texture Abstract: One goal of a projector-camera system is to let a human finger be used like a mouse to click and drag objects in the projected content. This requires segmentation of the human palm and fingers in the image data captured by the camera, which is a challenging task in the presence of the incessant variation of the projected video content and the shadow cast by the palm and fingers. We describe a coarse-to-fine hand segmentation method for projector-camera systems. After rough segmentation by contrast saliency detection and mean shift-based discontinuity-preserved smoothing, the refined result is confirmed through confidence evaluation. Extensive experimental results are shown to illustrate the accuracy and efficiency of the approach.

12:20-12:40, Paper WeBT1.4
Shadow Detection Via Rayleigh Scattering and Mie Theory
Gu, Lin	The Australian National Univ.
Robles-Kelly, Antonio	NICTA
Keywords: Occlusion and Shadow Detection, Physics-Based Vision Abstract: In this paper, we present a method to detect shadows in outdoor scenes. Here, we note that the shadow areas correspond to the diffuse skylight which arises from the scattering of the sunlight by particles in the atmosphere. This yields a treatment in which shadows in the image can be viewed as a linear combination of scattered light obeying Rayleigh scattering and Mie theory. This treatment allows for the computation of a ratio which permits casting the problem of recovering the shadowed areas in the image into a clustering setting making use of active contours. We illustrate the utility of the method for purposes of detecting shadows in real-world imagery and compare our results against a number of alternatives elsewhere in the literature.

12:40-13:00, Paper WeBT1.5
Iterative Clustering and Support Vectors-Based High-Confidence Query Selection for Motor Imagery EEG Signals Classification
Yang, Huijuan	Inst. for Infocomm Res. A*STAR
Guan, Cuntai	Inst. for Infocomm Res.
Ang, Kai Keng	Inst. for Infocomm Res. A*STAR
Zhang, Haihong	Inst. for Infocomm Res.
Wang, Chuanchu	Inst. for Infocomm Res.
Keywords: Human Computer Interaction, Pattern Recognition for Bioinformatics, Machine Learning and Data Mining Abstract: This paper proposes a novel active learning method for the classification of motor imagery electroencephalogram (EEG) signals. Specifically, we propose an iterative clustering and support vector-based criterion to select high-confidence samples to construct a robust training set. The common spatial pattern (CSP)-based features are iteratively clustered until the number of support vectors in the cluster is less than a predefined threshold. A predefined number of samples close to the cluster centers are chosen. When such clusters cannot be found, the samples that are farthest from a group of support vectors of class "0" and "1" are alternately chosen. Experimental results on BCI competition IV dataset IIb show superior performance compared with a baseline method, with a 9% increase in accuracy averaged across subjects and training sizes.


WeBT2	Multi-Purpose Hall
Computational Photography	Regular Session
Chair: Pal, Chris	École Pol. Montréal
Co-Chair: Sato, Jun	Nagoya Inst. of Tech.

11:20-11:40, Paper WeBT2.1
Learning Human Preferences to Sharpen Images
Nam, Myra	Univ. of Illinois at Urbana-Champaign
Ahuja, Narendra	UIUC
Keywords: Cognitive and Embodied Vision, Enhancement, Restoration and Filtering, Low-Level Vision Abstract: We propose an image sharpening method that automatically optimizes the perceived sharpness of an image. Image sharpness is defined in terms of the one-dimensional contrast across region boundaries. Regions are automatically extracted for all natural scales present that are themselves identified automatically. Human judgments are collected and used to learn a function that determines the best sharpening parameter values at an image location as a function of certain local image properties. We use the Gaussian mixture model (GMM) to estimate the joint probability density of the preferred sharpening parameters and local image properties. The latter are then adaptively estimated by parametric regression from GMM. Experimental results demonstrate the adaptive nature and superior performance of our approach over the traditional Unsharp Masking method.

11:40-12:00, Paper WeBT2.2
Deblurring Depth Blur and Motion Blur Simultaneously by Using Space-Time Coding
Naito, Ryosuke	Nagoya Inst. of Tech.
Kobayashi, Takeyuki	Nagoya Inst. of Tech.
Sakaue, Fumihiko	Nagoya Inst. of Tech.
Sato, Jun	Nagoya Inst. of Tech.
Keywords: Computational Photography Abstract: In recent years, various methods have been proposed for removing depth blur and motion blur by coding camera optics, such as aperture and exposure. However, these methods are limited to deblurring just a single type of blur, such as depth blur or motion blur. In this paper, we propose a method that enables us to remove depth blur and motion blur simultaneously by coding image capture in both space and time. The validity and the advantages of the proposed method are shown by real-image experiments and quantitative evaluations using a lens simulator.

12:00-12:20, Paper WeBT2.3
8-D Reflectance Field for Computational Photography
Tagawa, Seiichi	Osaka Univ.
Mukaigawa, Yasuhiro	Osaka Univ.
Yagi, Yasushi	Osaka Univ.
Keywords: Computational Photography, Vision for Graphics Abstract: Some computational photography techniques have been proposed to control the focus and illumination of captured images. However, the relationship between the techniques has been unclear because they were developed independently for different purposes. In this research we propose a unified framework that explains these computational photography techniques in terms of the computation of an 8-D reflectance field. Moreover, for an 8-D reflectance field we show that the synthetic aperture, the image-based relighting, and the confocal imaging techniques can be realized using the same measuring device.

12:20-12:40, Paper WeBT2.4
Assessment of Photo Aesthetics with Efficiency
Lo, Kuo-Yen	Acad. Sinica
Liu, Keng-Hao	Acad. Sinica
Chen, Chu-Song	Acad. Sinica
Keywords: Computational Photography, Features and Image Descriptors, Image and Video Understanding Abstract: Photo quality assessment has been a popular research topic. Many previous works achieved high classification rates in photo aesthetics assessment by designing new aesthetic features. However, those hand-crafted features are sometimes not describable, or are very time-consuming to compute and thus not applicable to real-time applications. In this paper, we propose aesthetic features that are highly efficient to compute. The experimental results show that our proposed features reach considerable performance. The computational cost of classifying an image is low, so that it is possible to realize online assessment during photo capture and provide instant feedback to users, or to implement a photo rating system on portable devices.

12:40-13:00, Paper WeBT2.5
Video Upscaling with Spatio-Temporal Self-Similar Examples
Ayvaci, Alper	UCLA
Jin, Hailin	Adobe Systems Incorporated
Lin, Zhe	Adobe
Cohen, Scott	Adobe Systems
Soatto, Stefano	UCLA
Keywords: Computational Photography, Motion, Tracking and Video Analysis, Low-Level Vision Abstract: We propose a new video upscaling technique that extends the example-based super-resolution frameworks to multiple frames. Our method relies on repeating patches that can be observed not only inside a single frame but across the whole video. This allows us to encode image patches with over-complete dictionaries that are constructed in a local spatio-temporal neighborhood around that patch. We demonstrate the ability of our method to produce high-quality results on real videos with HD resolution, and compare it against state-of-the-art super-resolution techniques.


WeBT3	Room 101+102
Registration and Correspondence	Regular Session
Chair: Goldgof, Dmitry	Univ. of South Florida
Co-Chair: Laurendeau, Denis	Univ. Laval

11:20-11:40, Paper WeBT3.1
Surface Matching by Curvature Distribution Images Generated Via Gaze Modeling
Maeda, Makoto	Kyushu Inst. of Tech.
Nakamae, Takashi	Kyushu Inst. of Tech.
Inoue, Katsuhiro	Kyushu Inst. of Tech.
Keywords: 2D/3D Object Detection and Recognition, Features and Image Descriptors, Stereo and Image-Based Modeling Abstract: In order to realize model-based 3D object recognition, we first propose a geometric feature extraction method based on a novel gaze modeling. In the modeling process, local surface models are independently estimated for parts of the range data restricted by several gaze domains. Since features are independently extracted from each gaze domain, inconsistent or incorrect features may be obtained. Therefore we introduce a stochastic method that enables us to integrate such features by evaluating the reliability of each gaze model. Next, we propose a shape descriptor, the curvature distribution image (CDI), to achieve object recognition by surface matching. It is generated based on the ratios between surface curvatures. In order to discuss how the performance of 3D shape description depends on the kind of curvature used, we have used 5 kinds of curvatures: Gaussian curvature, mean curvature, principal curvatures (maximum, minimum) and shape index. Since an object is represented as a set of descriptive CDIs, recognition is realized by finding the object that has the most similar CDIs in the database. The main contribution of this paper is an experimental analysis of the performance of CDIs generated with various generation parameters.

11:40-12:00, Paper WeBT3.2
Enhancing Motion Segmentation by Combination of Complementary Affinities
Zografos, Vasileios	Linkoping Univ.
Keywords: Motion, Tracking and Video Analysis, Detection, Separation and Segmentation Abstract: Complementary information, when combined in the right way, is capable of improving clustering and segmentation problems. In this paper, we show how it is possible to enhance motion segmentation accuracy with a very simple and inexpensive combination of complementary information, which comes from the column and row spaces of the same measurement matrix. We test our approach on the Hopkins155 dataset where it outperforms all other state-of-the-art methods.

12:00-12:20, Paper WeBT3.3
3D Tracking of Deformable Surface by Propagating Feature Correspondences
Liu, Ye	Fudan Univ.
Chen, Yan Qiu	Fudan Univ.
Keywords: Motion, Tracking and Video Analysis, Stereo and Image-Based Modeling, 2D/3D Object Detection and Recognition Abstract: This paper addresses the problem of 3D tracking of deformable surfaces undergoing non-rigid motion from multi-view video sequences. We propose a method that starts from a set of feature points which have been matched across views and time. A data-driven motion propagation technique makes the motion dense enough to give an initial guess for the parameter estimation of the vertices of the 3D model. Finally, after rejecting some outliers, the 3D mesh model is deformed using the estimated vertex motion. Experimental results demonstrate that the proposed method is accurate and robust even at a low frame rate and for medium elastic deformations.

12:20-12:40, Paper WeBT3.4
Realtime 2D Video/3D LiDAR Registration
Bodensteiner, Christoph	Fraunhofer IOSB
Arens, Michael	Fraunhofer IOSB
Keywords: Motion, Tracking and Video Analysis, 2D/3D Object Detection and Recognition, Vision for Robotics Abstract: Progress in LiDAR scanning has led to the availability of large-scale LiDAR datasets for urban areas. We use such pre-acquired data to determine the poses of 2D monocular cameras highly accurately in realtime. This is achieved by first correctly aligning key-frames of the multi-modal data using a combination of feature and intensity-based 2D/3D registration methods. The online pose is then determined in realtime by densely sampling and tracking features within the 2D video stream. The 3D coordinates of these features are determined by a fast GPU-based backprojection. The observed 2D/3D feature data is then fused using a recursive Bayesian filter in order to exploit temporal coherency. The method is evaluated using ground truth camera trajectories and different filter implementations. The proposed registration and filter framework executes at video frame rate and is up to 15% more accurate than a registration-only solution. Applications are numerous and include, for instance, augmented-reality applications, online georeferencing or metric online 3D reconstruction from monocular video data.

12:40-13:00, Paper WeBT3.5
3D Shape Isometric Correspondence by Spectral Assignment
Pan, Xiang	Zhejiang Univ. of Tech.
Shapiro, Linda	Univ. of Washington
Keywords: 2D/3D Object Detection and Recognition Abstract: Finding correspondences between two 3D shapes is common both in computer vision and computer graphics. In this paper, we propose a general framework that shows how to build correspondences by utilizing the isometric property. We show that the problem of finding such correspondences can be reduced to the problem of spectral assignment, which can be solved by finding the principal eigenvector of the pairwise correspondence matrix. The proposed framework consists of four main steps. First, it obtains initial candidate pairs by performing a preliminary matching using local shape features. Second, it constructs a pairwise correspondence matrix using geodesic distance and these initial pairs. Next, the principal eigenvector of the matrix is computed. Finally, the final correspondence is obtained from the maximal elements of the principal eigenvector. In our experiments, we show that the proposed method is robust under a variety of poses. Furthermore, our results show a great improvement over the best related method in the literature.
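
The four steps in the abstract can be illustrated with a minimal sketch. It assumes the candidate pairs and the geodesic distance matrices are already given, and it fills in details the abstract does not state: the Gaussian compatibility score, the parameter `sigma`, the power iteration, and the greedy one-to-one selection are all illustrative choices, not the authors' implementation.

```python
import math

def principal_eigenvector(M, iters=200):
    """Power iteration for a symmetric non-negative matrix (list of lists)."""
    n = len(M)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def spectral_assignment(pairs, dist_a, dist_b, sigma=1.0):
    """Select a one-to-one subset of candidate pairs (i on shape A, j on shape B).

    dist_a, dist_b: geodesic distance matrices on each shape (assumed given).
    sigma: hypothetical tolerance on the geodesic distance mismatch.
    """
    n = len(pairs)
    M = [[0.0] * n for _ in range(n)]
    for p, (i, j) in enumerate(pairs):
        for q, (k, l) in enumerate(pairs):
            if p == q:
                M[p][q] = 1.0
            elif i != k and j != l:
                # Two pairs agree when they preserve the geodesic distance.
                diff = dist_a[i][k] - dist_b[j][l]
                M[p][q] = math.exp(-(diff * diff) / (2.0 * sigma * sigma))
    v = principal_eigenvector(M)
    # Greedily accept pairs by decreasing eigenvector entry, keeping the
    # assignment one-to-one.
    used_a, used_b, result = set(), set(), []
    for p in sorted(range(n), key=lambda idx: v[idx], reverse=True):
        i, j = pairs[p]
        if i not in used_a and j not in used_b:
            result.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return sorted(result)
```

On a toy input where both shapes have three points with pairwise distances 1, 2 and 3, and every point-to-point pair is a candidate, the mutually consistent identity pairs dominate the principal eigenvector and are the ones selected.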


WeBT4	Hall 200
Classification and Tracking	Regular Session
Chair: Mori, Greg	Simon Fraser Univ.
Co-Chair: Yuen, Pong C	Hong Kong Baptist Univ.

11:20-11:40, Paper WeBT4.1
Soft-Signed Sparse Coding for Ground-Based Cloud Classification
Liu, Shuang	Inst. of Automation, Chinese Acad. of Sciences
Wang, Chunheng	Inst. of Automation, Chinese Acad. of Sciences
Xiao, Baihua	Inst. of Automation, Chinese Acad. of Sciences
Zhang, Zhong	Inst. of Automation, Chinese Acad. of Sciences
Shao, Yunxue	Inst. of Automation, Chinese Acad. of Sciences
Keywords: Segmentation, Color and Texture, Features and Image Descriptors Abstract: Traditional sparse coding has been successfully applied to texture and image classification in recent years. Yet this kind of method neglects the influence of the signs of the coding coefficients, which may cause information loss in the subsequent max pooling. In this paper, we propose a novel coding strategy for ground-based cloud classification, named soft-signed sparse coding. In our method, a constraint on the signs is explicitly added to the objective function of the traditional sparse coding model, which can effectively regulate the ratio between the number of positive and negative non-zero coefficients. As a result, the proposed method not only obtains a low reconstruction error but also takes into account the influence of the signs of the coding coefficients. The strategy is verified on two challenging cloud datasets, and the experimental results demonstrate the superior performance of our method compared with previous ones.

11:40-12:00, Paper WeBT4.2
Enhancing Cross-View Object Classification by Feature-Based Transfer Learning
Mo, Yi	Beihang Univ.
Zhang, Zhaoxiang	Beihang Univ.
Wang, Yunhong	Beihang Univ.
Keywords: 2D/3D Object Detection and Recognition, Classification and Clustering, Image and Video Processing Abstract: Object classification is of vital importance to intelligent traffic surveillance. A big challenge is that the camera view changes between scenes, which leads to a sharp decrease in accuracy since training and test samples no longer follow the same distribution. On the other hand, manually labeling training samples is time- and labor-consuming. We propose a feature-based transfer learning framework to bridge the divergence between different domain distributions when target view samples are scarce. Source view samples, which follow a different but relevant distribution, can be utilized to learn what a good classifier is like by structure learning. At the same time, a small number of target view samples can make a great contribution by reflecting the target distribution. Experimental results indicate that our method outperforms traditional approaches when target samples are too scarce to build a strong classifier.

12:00-12:20, Paper WeBT4.3
Semantic Superpixel Based Vehicle Tracking
Liu, Liwei	Tsinghua Univ.
Xing, Junliang	Tsinghua Univ.
Ai, Haizhou	Tsinghua Univ. China
Lao, Shihong	OMRON Social Solutions Co., LTD
Keywords: Motion, Tracking and Video Analysis, Pattern Recognition for Surveillance and Security Abstract: This paper focuses on tracking multiple vehicles in real-world traffic videos, which is very challenging due to frequent interactions and occlusions between different vehicles. To address these problems, we rely on superpixels, which have recently received great attention in a wide range of vision problems, e.g. object segmentation, tracking and recognition, for their ability to capture local appearance characteristics of objects and their spatial relations. As a mid-level feature, however, a superpixel itself is unable to carry semantic information, which may restrict its use in these problems. To this end, we introduce semantic information into superpixels from an offline-trained semantic object detector and successfully deploy them in the multiple vehicle tracking problem. The benefits of semantic superpixels include: 1) better temporal coherency of superpixels; 2) improved effectiveness and robustness of occlusion handling; 3) thanks to semantic analysis, significantly fewer false targets and false trajectories. Experiments show significant accuracy improvements of our approach in comparison with existing tracking methods.

12:20-12:40, Paper WeBT4.4
Efficient UAV Video Event Summarization
Trinh, Hoang	IBM Res.
Li, Jun	IBM Res.
Miyazawa, Sachiko	IBM Res.
Moreno, Juan	IBM Res.
Pankanti, Sharath	IBM Res.
Keywords: Motion, Tracking and Video Analysis, Image and Video Understanding, Multimedia Analysis, Indexing and Retrieval Abstract: In this paper, we present a method for efficiently summarizing UAV video data. Our approach is based on first detecting and tracking moving objects. The significant camera motion usually present in UAV video data is successfully handled by a robust feature-based frame registration technique. We then devise a saliency-based scoring method to score and rank detected object tracks. Object tracks are then grouped into video segments. The final step is to generate a concise summarization and visualization. Experimental results on the VIRAT UAV dataset show that we can accomplish a data reduction rate in excess of 1000 without missing any significant activities of interest.

12:40-13:00, Paper WeBT4.5
Multiple Local Kernel Integrated Feature Selection for Image Classification
Sun, Yu	Univ. of California, Riverside
Bhanu, Bir	Univ. of California
Keywords: Features and Image Descriptors, Segmentation, Color and Texture Abstract: Feature redundancy and loss of local features are central problems for image classification. Feature selection decreases feature redundancy by choosing a subset of features and eliminating those with low predictive value. The local feature representation is able to highlight objects in an image, thus overcoming the drawbacks of global features. This paper presents a new method, called the local kernel for feature selection, which integrates a local kernel of the segmented regions into feature selection to provide improved image classification. This is done by integrating the region-based image distance with the kernel of a Bayesian classifier. The proposed method is tested on two standard image databases, and its classification results are better than those of current feature selection and classification methods.


WeBT5	Hall 300
Learning-II	Regular Session
Chair: Deguchi, Daisuke	Nagoya Univ.
Co-Chair: Bhanu, Bir	Univ. of California

11:20-11:40, Paper WeBT5.1
Composite Likelihood Estimation for Restricted Boltzmann Machines
Yasuda, Muneki	Tohoku Univ.
Kataoka, Shun	Tohoku Univ.
Waizumi, Yuji	Tohoku Univ.
Tanaka, Kazuyuki	Graduate School of Information Sciences, Tohoku Univ.
Keywords: Machine Learning and Data Mining, Neural Networks, Statistical, Syntactic and Structural Pattern Recognition Abstract: Generally, learning the parameters of graphical models by using the maximum likelihood estimation is difficult and requires an approximation. Maximum composite likelihood estimations are statistical approximations of the maximum likelihood estimation and are higher-order generalizations of the maximum pseudo-likelihood estimation. In this paper, we propose a composite likelihood method and investigate its properties. Furthermore, we apply this to restricted Boltzmann machines.

11:40-12:00, Paper WeBT5.2
I Don't Know the Label: Active Learning with Blind Knowledge
Fang, Meng	Univ. of Tech. Sydney
Zhu, Xingquan	Florida Atlantic Univ.
Keywords: Machine Learning and Data Mining Abstract: Active learning traditionally assumes that the oracle is capable of providing labeling information for each query instance. In reality, the oracle might have no information for some queries and, unable to provide an accurate label, can only answer "I don't know the label". We focus on this problem and provide a unified objective function to ensure that each query instance submitted to the oracle is the one most in need of labeling and is also one that the oracle has sufficient knowledge to label. Experimental results on real-world and benchmark data sets demonstrate the effectiveness of the proposed design for supporting active learning using oracles with blind knowledge.

12:00-12:20, Paper WeBT5.3
Map Matching with Hidden Markov Model on Sampled Road Network
Raymond, Rudy	IBM Res. - Tokyo
Morimura, Tetsuro	IBM Res. - Tokyo
Osogami, Takayuki	IBM Res. - Tokyo
Hirosue, Noriaki	Kyoto Univ.
Keywords: Machine Learning and Data Mining, Statistical, Syntactic and Structural Pattern Recognition Abstract: This paper presents a map matching method based on an ideal Hidden Markov Model (HMM) to find a sequence of roads that corresponds to a given sequence of raw GPS points. Our method is a simplification of the more complex HMM-based method that maintains its capability to cope with the noise and sparsity of the raw GPS data. We test the method on publicly available real-world raw GPS data. Experiments show that despite its simplicity, the proposed method performs sufficiently well under sparse GPS points and sparse road network data.
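
As a rough illustration of HMM-based map matching in general, the sketch below runs Viterbi decoding over sampled road points. The abstract does not give the model details, so the Gaussian emission score, the adjacency-only transition rule, the parameter `sigma`, and all names are assumptions rather than the authors' formulation.

```python
import math

def viterbi_map_match(gps_points, nodes, adjacent, sigma=10.0):
    """Return the most likely sequence of road samples for the GPS points.

    gps_points: list of (x, y) observations.
    nodes:      dict mapping a road-sample name to its (x, y) position.
    adjacent:   dict mapping a name to the set of names reachable in one step.
    sigma:      assumed GPS noise scale (hypothetical default).
    """
    names = list(nodes)

    def log_emit(point, name):
        x, y = nodes[name]
        d2 = (point[0] - x) ** 2 + (point[1] - y) ** 2
        return -d2 / (2.0 * sigma * sigma)  # unnormalized Gaussian log-score

    def log_trans(a, b):
        # Staying put or moving to an adjacent sample is allowed.
        return 0.0 if a == b or b in adjacent[a] else -math.inf

    score = {n: log_emit(gps_points[0], n) for n in names}
    backpointers = []
    for point in gps_points[1:]:
        new_score, pointers = {}, {}
        for b in names:
            best = max(names, key=lambda a: score[a] + log_trans(a, b))
            new_score[b] = score[best] + log_trans(best, b) + log_emit(point, b)
            pointers[b] = best
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(names, key=lambda n: score[n])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

The point of the sequence model is that a GPS fix lying closest to an unconnected road sample is still matched to the connected road when the neighbouring fixes support it.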

12:20-12:40, Paper WeBT5.4
Multi-Task Signal Recovery by Higher Level Hyper-Parameter Sharing
Ali Pitchay, Sakinah	Univ. of Birmingham
Kaban, Ata	Univ. of Birmingham
Keywords: Machine Learning and Data Mining, Statistical, Syntactic and Structural Pattern Recognition Abstract: Sharing of hyper-parameters is often useful for multi-task problems as a means of encoding some notion of task similarity. Here we present a multi-task approach for signal recovery by sharing higher-level hyper-parameters which do not relate directly to the actual content of the signals of interest but only to their statistical characteristics. Our approach leads to a very simple model and algorithm that can be used to simultaneously recover multiple natural images with unrelated content. We investigate the advantages of this approach in relation to state of the art multi-task compressed sensing and we discuss our findings.

12:40-13:00, Paper WeBT5.5
Learning Markov Networks by Analytic Center Cutting Plane Method
Antoniuk, Kostiantyn	Center for Applied Cybernetics, Faculty of Electrical Engineering,
Franc, Vojtech	Czech Tech. Univ. in Prague
Hlavac, Vaclav	Czech Tech. Univ., Faculty of Electrical Engineering
Keywords: Machine Learning and Data Mining, Statistical, Syntactic and Structural Pattern Recognition, Segmentation, Color and Texture Abstract: During the last decade the super-modular Pair-wise Markov Networks (SM-PMN) have become a routinely used model for structured prediction. Their popularity can be attributed to efficient algorithms for the MAP inference. Comparably efficient algorithms for learning their parameters from data have not been available so far. We propose an instance of the Analytic Center Cutting Plane Method (ACCPM) for discriminative learning of the SM-PMN from annotated examples. We empirically evaluate the proposed ACCPM on a problem of learning the SM-PMN for image segmentation. Results obtained on two public datasets show that the proposed ACCPM significantly outperforms the current state-of-the-art algorithm in terms of computational time as well as the accuracy because it can learn models which were not tractable by existing methods.


WePSBT1	Main Hall
Poster Shotgun (09): PR	Regular Session

14:00-14:30, Paper WePSBT1.1
HDP-MRF: A Hierarchical Nonparametric Model for Image Segmentation
Nakamura, Takuma	Waseda Univ.
Harada, Tatsuhiro	Waseda Univ.
Suzuki, Tomohiko	Waseda Univ.
Matsumoto, Takashi	Waseda Univ.
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining, Image and Video Processing Abstract: Infinite Hidden Markov Random Fields have been proposed for image segmentation as a solution to the problem of automatically determining the number of regions in an image; however, the model does not maintain identity of segmented regions among multiple images. In order to identify segmented regions in images, we developed Hierarchical Dirichlet Process Markov Random Fields. Our model maintains global identification of segmented regions in multiple images by incorporating the idea of hierarchical modeling and automatically determines the number of segmented regions in each image. We show an experimental comparison between the previous model and our proposed model by changing the observation features from RGB value to color histogram features.

14:00-14:30, Paper WePSBT1.2
Classification of Surfaces and Inclinations During Outdoor Running Using Shoe-Mounted Inertial Sensors
Schuldhaus, Dominik	Univ. of Erlangen-Nuremberg
Kugler, Patrick	Univ. of Erlangen-Nuremberg
Leible, Magnus	Univ. of Freiburg
Jensen, Ulf	Univ. Erlangen-Nuremberg
Schlarb, Heiko	adidas AG, Herzogenaurach
Eskofier, Bjoern	Univ. of Erlangen-Nuremberg
Keywords: Classification and Clustering, Pattern Recognition for Bioinformatics Abstract: Embedded mobile systems for analysis and classification are becoming more and more important in the field of sports and sports science. Small and lightweight sensors in sportswear offer the possibility to monitor athletes in a realistic environment, e.g. during an outdoor run. During the activity, the sportswear can automatically adapt to the current environment and hence optimize the performance of the athlete. A major need is a running shoe that can automatically adapt to the current ground. In this paper, a classification system was developed that distinguishes between different surfaces and inclinations based on inertial sensors. The sensors were placed on the heel of a running shoe and acquired kinematic data of 21 subjects. For each subject, several rounds of a one-hour outdoor run were available and were used for the evaluation of the system. The classification system reached a mean classification rate of more than 80%.

14:00-14:30, Paper WePSBT1.3
Characterizing User-Subgroups in Flickr Group: A Block LDA Based Approach
Negi, Sumit	IBM Res.
Balasubramanyan, Ramnath	CMU
Chaudhury, Santanu	Indian Inst. of Tech. Delhi
Keywords: Machine Learning and Data Mining, Multimedia Analysis, Indexing and Retrieval Abstract: The last few years have seen an exponential increase in the amount of multimedia content available online, thanks to collaborative online communities such as Flickr, YouTube, etc. As opposed to "pure" social networking services, these collaborative online communities not only allow users to create new social links (e.g. add people to one's friend list) but also allow users to contribute multimedia content and engage in content-driven interactions. A good example of this can be seen in Flickr in general, and Flickr Group in particular, where users can comment on or "like" an image contributed by another user. This paper looks at utilizing this within-group user-user interaction information, along with image meta-data, to discover user communities (user-subgroups) that contribute content around specific topics (subgroup-themes) at specific points in time. A good example of this is a group of users (e.g. sports fans) contributing content and interacting with each other only at specific times of the year (e.g. close to their favorite sporting event). We demonstrate that our proposed generative model, Temporal Block-Link LDA, is able to successfully extract such user-subgroups, subgroup-themes and associated temporal patterns from data in an unsupervised manner.

14:00-14:30, Paper WePSBT1.4
Software-Based Performance and Complexity Analysis for the Design of Embedded Classification Systems
Ring, Matthias	Univ. of Erlangen-Nuremberg
Jensen, Ulf	Univ. Erlangen-Nuremberg
Kugler, Patrick	Univ. of Erlangen-Nuremberg
Eskofier, Bjoern	Univ. of Erlangen-Nuremberg
Keywords: Machine Learning and Data Mining, Classification and Clustering, Feature Reduction and Manifold Learning Abstract: Embedded microcontrollers are employed in an increasing number of applications as a target for the implementation of classification systems. This is true for example for the fields of sports, automotive and medical engineering. However, important challenges arise when implementing classification systems on embedded microcontrollers, which is mainly due to limited hardware resources. In this paper, we present a solution to the two main challenges, namely obtaining a classification system with low computational complexity and at the same time high classification accuracy. For the first challenge, we propose complexity measures on the mathematical operation and parameter level, because the abstraction level of the commonly used Landau notation is too high in the context of embedded system implementation. For the second challenge, we present a software toolbox that trains different classification systems, compares their classification rate, and finally analyzes the complexity of the trained system. To give an impression of the importance of such complexity measures when dealing with limited hardware resources, we present the example analysis of the popular Pima Indians Diabetes data set, where considerable complexity differences between classification systems were revealed.

14:00-14:30, Paper WePSBT1.5
Robust Online Trajectory Clustering without Computing Trajectory Distances
Ulm, Michael	Austrian Inst. of Tech.
Braendle, Norbert	Austrian Inst. of Tech.
Keywords: Classification and Clustering, Pattern Recognition for Surveillance and Security, Machine Learning and Data Mining Abstract: We propose a novel trajectory clustering algorithm which is suitable for online processing of pedestrian or vehicle trajectories computed with a vision-based tracker. Our approach does not require defining distances between trajectories, and can thus process broken trajectories which are inevitable in most cases when object trackers are applied to real world video footage. Clusters are defined as smooth vector fields on a bounded connected set, and cluster distance is based on pairwise distances between vector sets. The results are illustrated on a trajectory set from the Edinburgh Informatics Forum Pedestrian Dataset, on a trajectory set from a public transport junction, and trajectories from an experimental setup in a corridor.

14:00-14:30, Paper WePSBT1.6
Hyperspectral Image Classification Based on Multiple Improved Particle Swarm Cooperative Optimization and SVM
Ren, Yuemei	Northwestern Polytechnical University
Zhang, Yanning	Northwestern Pol. Univ.
Meng, Qingjie	Northwestern Pol. Univ.
Zhang, Lei	Northwestern Pol. Univ.
Keywords: Classification and Clustering, Remote Sensing, Image and Video Understanding Abstract: The huge increase in hyperspectral data dimensionality and information redundancy has brought high computational cost as well as a risk of over-fitting in classification. In this paper, we present an automatic band selection and classification method based on a novel wrapper model that combines multiple improved particle swarm cooperative optimization with a support vector machine (MIPSO-SVM). The MIPSO-SVM model optimizes the band subset and the SVM kernel parameters simultaneously. In the proposed model, the particle swarm is divided into two sub-swarms. PSO is first improved by a new update strategy for position and velocity. Then the sub-swarms perform the improved PSO (IPSO) independently, one for band selection and one for classifier parameter optimization. Finally, in the process of cooperative evolution, extremal optimization (EO) is incorporated to maintain the diversity of the swarms and enhance the space exploration ability of the proposed model. Experimental results demonstrate the effectiveness of the proposed method for band selection and classification of hyperspectral images.

14:00-14:30, Paper WePSBT1.7
STPCA: Sparse Tensor Principal Component Analysis for Feature Extraction
Wang, Su-Jing	Jilin Univ.
Sun, Ming-Fang	Jilin Univ.
Chen, Yu-Hsin	Jilin Univ.
Pang, Er-Ping	Jilin Univ.
Zhou, Chunguang	Jilin Univ.
Keywords: Feature Reduction and Manifold Learning, Biometrics, Pattern Recognition for Bioinformatics Abstract: Because many objects in the real world can be naturally represented as tensors, tensor subspace analysis has become a hot research area in pattern recognition and computer vision. However, existing tensor subspace analysis methods cannot provide an intuitive or semantic interpretation of the projection matrices. In this paper, we propose Sparse Tensor Principal Component Analysis (STPCA), which transforms the eigen-decomposition problem into a series of regression problems. Since its projection matrices are sparse, STPCA can also address the occlusion problem. Experiments on the Georgia Tech database and the AR database showed that the proposed method outperforms Multilinear Principal Component Analysis (MPCA) in terms of accuracy and robustness.

14:00-14:30, Paper WePSBT1.8
FastLOF: An Expectation-Maximization Based Local Outlier Detection Algorithm
Goldstein, Markus	German Res. Center for Artificial Intelligence DFKI GmbH
Keywords: Machine Learning and Data Mining, Pattern Recognition for Surveillance and Security, Classification and Clustering Abstract: Unsupervised anomaly detection techniques are becoming more and more important in a variety of application domains such as network intrusion detection, fraud detection and misuse detection. Today, most unsupervised anomaly detection techniques have quadratic complexity, making it almost impossible to apply them to very large data sets. In this paper, an Expectation-Maximization algorithm is proposed which computes the Local Outlier Factor (LOF) incrementally and up to 80% faster than the standard method. Another advantage of FastLOF is that intermediate results can be used by a system while the computation is still running. Evaluation on real-world data sets reveals that FastLOF performs comparably to the best outlier detection algorithms while being significantly faster.
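
For orientation, the following is a minimal sketch of the standard, quadratic-time Local Outlier Factor that the abstract uses as its baseline. It is not the FastLOF algorithm itself. The function name `lof_scores` and the simplifications (exactly k neighbors per point, ties broken by index, no duplicate points) are assumptions made for illustration.

```python
import math


def lof_scores(points, k):
    """Standard LOF via a full pairwise distance matrix (O(n^2) distances)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbors of each point, excluding the point itself.
    # Simplification: exactly k neighbors, ties broken by index order.
    neighbors = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # k-distance: distance to the k-th nearest neighbor.
    k_dist = [dist[i][neighbors[i][-1]] for i in range(n)]

    def local_reachability_density(i):
        # Reachability distance of i from neighbor j: max(k_dist(j), d(i, j)).
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        # Assumes distinct points, so the sum is non-zero.
        return len(reach) / sum(reach)

    lrd = [local_reachability_density(i) for i in range(n)]

    # LOF: mean ratio of the neighbors' density to the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    for point, score in zip(data, lof_scores(data, k=2)):
        print(point, round(score, 2))
```

Scores close to 1 indicate a point about as dense as its neighbors, while clearly larger scores indicate outliers. The pairwise distance matrix is the source of the quadratic cost mentioned in the abstract.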

14:00-14:30, Paper WePSBT1.9
A Filtering Mechanism for Normal Fish Trajectories
Beyan, Cigdem	Univ. of Edinburgh, School of Informatics, IPAB
Fisher, Robert	Univ. of Edinburgh
Keywords: Gesture and Behavior Analysis, Motion, Tracking and Video Analysis Abstract: Understanding fish behavior by extracting normal motion patterns and then identifying abnormal behaviors is important for understanding the effects of environmental change. In the literature, there are many studies on normal/abnormal behavior detection in areas such as human behavior analysis, traffic surveillance, and nursing home surveillance. However, the literature on normal/abnormal fish behavior understanding is very limited, especially when natural habitat applications are considered. In this study, we present a rule-based trajectory filtering mechanism to extract normal fish trajectories. It can potentially help to increase the accuracy of abnormal fish behavior detection systems and can be used as a preliminary method, especially when the number of abnormal fish behaviors is very small (e.g. 40-50 times smaller) compared to the number of normal fish behaviors and/or when the number of trajectories is huge.

14:00-14:30, Paper WePSBT1.10
Semi-Supervised Adaptive Parzen Gentleboost Algorithm for Fault Diagnosis
Li, Chengliang	Northwestern Pol. Univ.
Wang, Zhongsheng	Northwestern Pol. Univ.
Bu, Shuhui	Northwestern Pol. Univ.
Liu, Zhenbao	Northwestern Pol. Univ.
Keywords: Classification and Clustering, Pattern Recognition for Surveillance and Security Abstract: In this paper, we present a novel semi-supervised strategy for machine fault diagnosis. In the proposed method, we select the Parzen window as the generative classifier and Gentleboost as the discriminative classifier. Compared with SVM, the boosting method has the very interesting property of relative immunity to overfitting. In addition, we propose a novel adaptive Parzen window algorithm. It employs a variable adaptive Parzen window rather than a globally optimized and fixed window, so more accurate density estimates can be obtained. In experiments, artificial and machine vibration data are used for comparison with other algorithms. Our proposed algorithm achieves stronger robustness and a lower classification error rate.

14:00-14:30, Paper WePSBT1.11
Non-Markovian Dynamic Time Warping
Uchida, Seiichi	Kyushu Univ.
Fukutomi, Masahiro	Kyushu Univ.
Ogawara, Koichi	Wakayama Univ.
Feng, Yaokai	Kyushu Univ.
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Gesture and Behavior Analysis, Handwriting Recognition Abstract: This paper proposes a new dynamic time warping (DTW) method, called non-Markovian DTW. In conventional DTW, the warping function is generally optimized by dynamic programming (DP) subject to some Markovian constraints which restrict the relationship between neighboring time points. In contrast, non-Markovian DTW can introduce non-Markovian constraints for dealing with the relationship between points separated by a large time interval. This new and promising ability of DTW is realized by using graph cut as the optimizer of the warping function instead of DP. Specifically, the conventional DTW problem is first converted into an equivalent minimum cut problem on a graph, and then edges representing the non-Markovian constraints are added to the graph. An experiment on online character recognition showed the advantage of using non-Markovian constraints during DTW.
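
As a point of reference, here is a minimal sketch of conventional DTW solved by dynamic programming, i.e. the baseline that the abstract contrasts with, not the graph-cut formulation it proposes. The function name `dtw_distance`, the absolute-difference local cost, and the three-neighbor step pattern are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Conventional DTW cost between two 1-D sequences via dynamic programming."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Markovian constraint: each cell depends only on its three
            # immediate predecessors (insertion, deletion, match).
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([0, 0], [1, 1]))
```

Because each table entry looks only at adjacent entries, this recurrence cannot express a constraint between time points that are far apart, which is the limitation the abstract addresses.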

14:00-14:30, Paper WePSBT1.12
On the Relation between K-Means and PLSA
Roy Chaudhuri, Arghya	Indian Inst. of Science
Musti, Narasimha Murty	Indian Inst. of Science
Keywords: Classification and Clustering, Document Analysis Systems, Machine Learning and Data Mining Abstract: Non-negative matrix factorization (NMF) is a well known tool for unsupervised machine learning. It can be viewed as a generalization of the K-means clustering, Expectation Maximization based clustering and aspect modeling by Probabilistic Latent Semantic Analysis (PLSA). Specifically PLSA is related to NMF with KL-divergence objective function. Further it is shown that K-means clustering is a special case of NMF with matrix L2 norm based error function. In this paper our objective is to analyze the relation between K-means clustering and PLSA by examining the KL-divergence function and matrix L2 norm based error function.
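
The two objective functions named in the abstract can be written out as follows. This is a sketch in assumed notation (data matrix $X$, factors $W$ and $H$, centroid matrix $C$, cluster indicator matrix $G$), not the notation of the paper, and it states the commonly cited forms of these relations for nonnegative data.

```latex
% NMF with the matrix L2 (Frobenius) error, X \in \mathbb{R}_{\ge 0}^{d \times n}:
\min_{W \ge 0,\; H \ge 0} \; \lVert X - WH \rVert_F^2
  = \sum_{i,j} \bigl( X_{ij} - (WH)_{ij} \bigr)^2

% K-means as a constrained instance: the columns of C are the K centroids and
% G \in \{0,1\}^{K \times n} has exactly one 1 per column (hard assignment):
\min_{C,\; G} \; \lVert X - CG \rVert_F^2
  = \sum_{j=1}^{n} \lVert x_j - c_{k(j)} \rVert_2^2

% NMF with the (generalized) KL-divergence objective, the form related to PLSA:
\min_{W \ge 0,\; H \ge 0} \; D_{\mathrm{KL}}(X \,\Vert\, WH)
  = \sum_{i,j} \Bigl( X_{ij} \log \frac{X_{ij}}{(WH)_{ij}} - X_{ij} + (WH)_{ij} \Bigr)
```

Here $k(j)$ denotes the cluster to which column $x_j$ is assigned, i.e. the row of $G$ holding the single 1 in column $j$.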

14:00-14:30, Paper WePSBT1.13
Probabilistic Shape Parsing for View-Based Object Recognition
Macrini, Diego	Univ. of Ottawa
Whiten, Chris	Univ. of Ottawa
Laganiere, Robert	-
Greenspan, Michael	Queen's Univ.
Keywords: Statistical, Syntactic and Structural Pattern Recognition, 2D/3D Object Detection and Recognition Abstract: We present a novel probabilistic model for parsing shapes into several distinguishable parts for accurate shape recognition. This shape parsing is based on robust geometric features that permit high recognition accuracy. Although modelling shapes is an inherently uncertain process, our approach is lenient, in that the desired parse of a shape only needs to be within its k most probable parses. Using this set of shape decompositions, we can boost recognition accuracy even further by determining which parts of a shape are common across most views of objects in the same class.

14:00-14:30, Paper WePSBT1.14
Learning-Based Mitotic Cell Detection in Histopathological Images
Sommer, Christoph	ETH Zurich
Fiaschi, Luca	HCI/IWR Heidelberg
Hamprecht, Fred Andreas	Univ. of Heidelberg
Gerlich, Daniel	Inst. for Biochemistry, ETH Zurich
Keywords: Pattern Recognition for Bioinformatics, Medical Image Analysis and Registration, Machine Learning and Data Mining Abstract: Breast cancer grading of histological tissue samples by visual inspection is the standard clinical practice for the diagnosis and prognosis of cancer development. An important parameter for tumor prognosis is the number of mitotic cells present in histologically stained breast cancer tissue sections. We propose a hierarchical learning workflow for automated mitosis detection in breast cancer. From an initial training set, a pixel-wise classifier is learned to segment candidate cells, which are then classified into mitotic and non-mitotic cells using object shape and texture features. Our workflow builds on two open source biomedical image analysis software packages, "ilastik" and "CellCognition", which provide a user-friendly interface to powerful learning algorithms, with the potential of making the pathologist's work easier. We evaluate our approach on a dataset of 35 high-resolution histopathological images from 5 different specimens (provided by the International Conference for Pattern Recognition 2012 contest on Mitosis Detection in Breast Cancer Histological Images). Based on the candidate segmentation, our approach achieves an area under the precision-recall curve of 70% on an annotated dataset, with good localization accuracy, little parameter tuning and small user effort. Source code is provided.

14:00-14:30, Paper WePSBT1.15
Semi-Supervised Clustering with Learnable Cluster Dependent Kernels
Frigui, Hichem	Univ. of Louisville
Bchir, Ouiem	King Saud Univ. SA
Keywords: Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining Abstract: We propose a new semi-supervised relational clustering approach, called Semi-Supervised Fuzzy clustering with Learnable Cluster dependent Kernels (SS-FLeCK). The proposed algorithm learns the underlying cluster-dependent dissimilarity measure while finding compact clusters. The learned dissimilarity is based on a Gaussian kernel with cluster-dependent scaling parameters. SS-FLeCK uses side-information in the form of a small set of constraints on which instances should or should not reside in the same cluster. The proposed algorithm uses only the pairwise relation between the feature vectors. This makes it applicable when similar objects cannot be represented by a single prototype. Using synthetic and real data sets, we show that SS-FLeCK outperforms several other algorithms.

14:00-14:30, Paper WePSBT1.16
Training Data Recycling for Multi-Level Learning
Liu, Jingchen	Pennsylvania State Univ.
McCloskey, Scott	McGill Univ. Honeywell
Liu, Yanxi	Penn State Univ.
Keywords: Machine Learning and Data Mining, Pattern Recognition for Search, Retrieval and Visualization Abstract: Among ensemble learning methods, stacking with a meta-level classifier is frequently adopted to fuse the output of multiple base-level classifiers and generate a final score. Labeled data is usually split for base-training and meta-training, so that the meta-level learning is not impacted by over-fitting of base level classifiers on their training data. We propose a novel knowledge-transfer framework that reutilizes the base-training data for learning the meta-level classifier without such negative consequences. By recycling the knowledge obtained during the base-classifier-training stage, we make the most efficient use of all available information and achieve better fusion, thus a better overall performance. With extensive experiments on complicated video event detection, where training data is scarce, we demonstrate the improved performance of our framework over other alternatives.

14:00-14:30, Paper WePSBT1.17
Unsupervised Tibetan Speech Features Learning Based on Dynamic Bayesian Networks
Zhao, Yue	Minzu Univ. of China
Xu, Xiaona	Minzu Univ. of China
Yang, Guosheng	Minzu Univ. of China
Keywords: Machine Learning and Data Mining, Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition Abstract: This paper proposes an unsupervised method for learning speech features based on Dynamic Bayesian Networks (DBNs) that account for the spatiotemporal dependences in the speech signal. Although deep networks have been successfully applied to unsupervised feature learning, their structures are often fixed before learning and they fail to capture temporal representations. In this paper, we propose to construct DBNs for unsupervised learning of spatial-temporal features from speech data. The experimental results on Tibetan speech data showed that the features learned using the proposed DBNs outperform the state-of-the-art methods in word recognition accuracy.

14:00-14:30, Paper WePSBT1.18
Multitask Multiclass Privileged Information Support Vector Machines
Ji, You	East China Normal Univ.
Sun, Shiliang	East China Normal Univ.
Lu, Yue	East China Normal Univ.
Keywords: Machine Learning and Data Mining, Pattern Recognition for Search, Retrieval and Visualization, Classification and Clustering Abstract: In this paper, we propose a new learning paradigm named multitask multiclass privileged information support vector machines. The starting point of our work is the success of multitask multiclass support vector machines, which cast multitask multiclass problems as a constrained optimization problem with a quadratic objective function. Learning using privileged information is an advanced learning paradigm that integrates the idea of human teaching into machine learning. This paper extends multitask multiclass support vector machines to the privileged information learning strategy. Our approaches can take full advantage of both multitask learning and privileged information. Experimental results show that our approaches obtain very good results for multitask multiclass problems.

14:00-14:30, Paper WePSBT1.19
Learning Distance Metric Regression for Facial Age Estimation
Li, Changsheng	Inst. of Automation, Chinese Acad. of Sciences
Liu, Qingshan	Nanjing Univ. of Information Science and Tech.
Liu, Jing	National Lab. of Pattern Recognition, Inst.
Lu, Hanqing	Inst. of Automation, Chinese Acad. of Science
Keywords: Machine Learning and Data Mining, Pattern Recognition for Bioinformatics Abstract: This paper proposes a novel regression method based on distance metric learning for human age estimation. We take age estimation as a problem of distance-based ordinal regression, in which age difference is measured by an efficient distance metric. To reach this goal, we propose to learn such a distance metric that can preserve both the ordinal information of different age groups and the local geometry structure of the target neighborhoods simultaneously. Then, the facial aging trend can be truly discovered by the learned metric. Experimental results on the publicly available FG-NET database are very competitive against the state of the art.

14:00-14:30, Paper WePSBT1.20
Efficient Classification Using Phrases Generated by Topic Models
Gujraniya, Deepak	Indian Inst. of Science. Bangalore
Musti, Narasimha Murty	Indian Inst. of Science
Keywords: Classification and Clustering, Machine Learning and Data Mining, Statistical, Syntactic and Structural Pattern Recognition Abstract: There are many popular models available for the classification of documents, such as the Naive Bayes classifier, k-Nearest Neighbors and the Support Vector Machine. In all these cases, the representation is based on the Bag of Words model. This model does not capture the actual semantic meaning of a word in a particular document. Semantics are better captured by the proximity of words and their occurrence in the document. We propose a new Bag of Phrases model to capture this discriminative power of phrases for text classification. We present a novel algorithm to extract phrases from the corpus using the well-known topic model Latent Dirichlet Allocation (LDA) and to integrate them into the vector space model for classification. Experiments show that classifiers perform better with the new Bag of Phrases model than with related representation models.

14:00-14:30, Paper WePSBT1.21
A Discriminative Parametric Approach to Video-Based Score-Level Fusion for Biometric Authentication
Poh, Norman	Univ. of Surrey
Kittler, Josef	Univ. of Surrey
Alkoot, Fuad	HITN,
Keywords: Biometrics, Classification and Clustering, Machine Learning and Data Mining Abstract: Video-based biometric systems are becoming feasible thanks to advances in both algorithms and computation platforms. Such systems have many advantages: improved robustness to spoof attacks, performance gain thanks to variance reduction, and increased data quality/resolution, among others. We investigate a discriminative video-based score-level fusion mechanism, which enables an existing biometric system to further harness the riches of temporally sampled biometric data using a set of distribution descriptors. Our approach shows that higher order moments of the video scores contain discriminative information. To the best of our knowledge, this is the first time that higher order moments are reported to be effective in the score-level fusion literature. Experimental results based on face and speech unimodal systems, as well as multimodal fusion, show that our proposal can improve the performance over that of the standard fixed rule fusion strategies by as much as 50%.

14:00-14:30, Paper WePSBT1.22
Incremental Support Vector Clustering with Outlier Detection
Huang, Dong	Sun Yat-sen Univ.
Lai, Jian-huang	Sun Yat-sen Univ.
Wang, Chang-Dong	Sun Yat-sen Univ.
Keywords: Classification and Clustering, Machine Learning and Data Mining Abstract: Support vector clustering (SVC) is a nonparametric clustering algorithm inspired by support vector machines. Incremental support vector clustering (ISVC) extends the SVC algorithm to an incremental version for the case of large-scale datasets with the assumption of no outliers. In order to tackle the problem of clustering large-scale noisy datasets, this paper proposes the algorithm termed incremental support vector clustering with outlier detection (OD-ISVC). The proposed algorithm consists of two components, namely, incremental support vector (SV) construction and dynamic bounded support vector (BSV) management. We introduce the concept of BSV-pool, where the check and recycle procedure is designed for updating the temporarily stored BSVs and detecting outliers. The experiments on real and synthetic datasets demonstrate the effectiveness and efficiency of our method.

14:00-14:30, Paper WePSBT1.23
Face Recognition Using Multi-Modal Binary Patterns
Nguyen, Thanh Phuong	LORIA
Vu, Ngoc-Son	Gipsa-Lab.
Caplier, Alice	GIPSA-Lab. Grenoble Univ.
Keywords: Biometrics Abstract: A new descriptor called Multi-modal Binary Patterns (MMBP) is proposed for face recognition. It balances important requirements for real-world applications well, including robustness, discriminative power, and low computational cost. The proposed algorithm has several desirable properties: 1) it captures information from the face image in any direction, as it is an oriented feature, 2) being a spatial multi-scale structure, the descriptor captures not only local but also more global information about the object, 3) it is robust to image transformations such as variations of lighting and expression, and 4) it is computationally efficient. In more detail, to capture information in a given direction, a Local Line Binary Pattern (LLBP) based operator is first applied. The MMBP feature is then built by applying an LBP-based self-similarity operator to the values calculated by LLBP operators across different directions. A Whitened PCA dimensionality reduction technique is applied to obtain a more compact and efficient descriptor. Experimental results on the comprehensive FERET data set, which are comparable to the state of the art, validate the efficiency of our method.

14:00-14:30, Paper WePSBT1.24
Sketch-Based Face Alignment for Thermal Face Recognition
Sun, Lin	Zhejiang Univ. City Coll.
Dai, XiaoXia	Zhejiang Univ. City Coll.
Keywords: Biometrics Abstract: In this paper, we present a novel face alignment approach in thermal infrared face recognition. The alignment procedure is based on closest point set matching between sketch faces. Linear combination of positional and local pattern features is embedded in the pointwise distance to solve the local minimum problem for ICP due to edge noise in sketch faces. The comprehensive experiments, including intra-class, inter-class and variable expressions, show the alignment accuracy and face recognition performance results of our algorithm compared to manual labelling, ICP and congealing methods.

14:00-14:30, Paper WePSBT1.25
Fusing Biographical and Biometric Classifiers for Improved Person Identification
Tyagi, Vivek	IBM Res. India
K, Hima Prasad	IBM Res. India
Faruquie, Tanveer	IBM Res. India
Subramaniam, L. Venkata	IBM Res. India
Ratha, Nalini	IBM Res.
Keywords: Biometrics, Pattern Recognition for Surveillance and Security, Pattern Recognition for Search, Retrieval and Visualization Abstract: Several citizen service databases, such as police, national citizen identity, passport and vehicle registration databases, store both biographical and biometric information and contain a huge number of records. Achieving scalability and high accuracy for a 1:N person identification task on these databases is a huge challenge. In this work, we propose to use the complementary information present in the biographical data along with the biometric information of a user to improve the 1:N person identification task for large systems. We show that a likelihood ratio based method for score-level fusion of the biometric and biographical classifiers results in higher identification accuracy than using only the biometric classifiers or only the biographical classifiers.

14:00-14:30, Paper WePSBT1.26
Time Series Alignment with Gaussian Processes
Suematsu, Nobuo	Hiroshima City Univ.
Hayashi, Akira	Hiroshima City Univ.
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining Abstract: We propose a nonparametric Bayesian approach to time series alignment. Time series alignment is a technique often required when we analyze a set of time series in which there exists a typical structural pattern common to all the time series. Such a set of time series is typically obtained by repeated measurements of a biological, chemical or physical process. In time series alignment, we are required to estimate a common shape function, which describes a common structural pattern shared among a set of time series, and time transformation functions, each of which represents the time shifts involved in an individual time series. In this paper, we introduce a generative model for time series data in which the common shape function and the time transformation functions are modeled nonparametrically using Gaussian processes, and we develop an effective Markov Chain Monte Carlo algorithm, which realizes a nonparametric Bayesian approach to time series alignment. The effectiveness of our method is demonstrated in an experiment with synthetic data, and an experiment with real time series data is also presented.

14:00-14:30, Paper WePSBT1.27
Placing Landmarks Suitably for Shape Analysis by Optimization
Iwata, Kazunori	Hiroshima City Univ.
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining, 2D/3D Object Detection and Recognition Abstract: Shape analysis relies on using a finite number of points on the contour of an object to compare the shapes of objects. These points are called landmarks. Hence, when landmarks are not available for analysis, we must place some appropriately on the contour. In this paper, we describe a new method for placing landmarks well on the contours of objects in the same class. The landmarks located by our method are determined by solving an optimization problem. In experiments using line drawings, we demonstrate that our method places landmarks well on each drawing.

14:00-14:30, Paper WePSBT1.28
A Hierarchical Algorithm with Multi-Feature Fusion for Facial Expression Recognition
Zhang, Zheng	Tsinghua Univ.
Fang, Chi	Tsinghua Univ.
Ding, Xiaoqing	Tsinghua Univ.
Keywords: Pattern Recognition for Bioinformatics, Human Computer Interaction, Image and Video Understanding Abstract: In this paper, a novel hierarchical algorithm with multi-feature fusion is proposed for facial expression recognition. In this area, many researchers have reported good results, but few have made good use of the distribution characteristics of facial expressions themselves. In the analysis of the feature distribution, we find that happiness and surprise are clearly separated from the other expressions. We therefore aim to distinguish these two expressions in the first layer of our algorithm using Gabor features. In the second layer, we use Gabor and LBP features respectively to classify the other five expressions, and a well-designed fusion of the results of the two branches is adopted to improve the accuracy. Experimental results on the Cohn-Kanade database show that our algorithm achieves excellent accuracy. Furthermore, our algorithm also performs well on our hybrid database, in which there are extensive variations of expressions. This demonstrates the good generalization ability of our algorithm.

14:00-14:30, Paper WePSBT1.29
Appearance Modeling for Person Re-Identification Using Weighted Brightness Transfer Functions
Datta, Ankur	IBM T. J. Watson Res. Center
Brown, Lisa	IBM T. J. Watson Res. Center
Feris, Rogerio	IBM Res.
Pankanti, Sharath	IBM Res.
Keywords: Pattern Recognition for Surveillance and Security, Image and Video Processing, Pattern Recognition for Search, Retrieval and Visualization Abstract: The appearance of individuals across multiple cameras varies considerably due to illumination and viewpoint changes, making person re-identification a challenging problem. In this paper, we describe how to model this appearance variation by using a novel Weighted Brightness Transfer Function (WBTF). In combination with powerful low-level features, we show that WBTF leads to large performance improvements by assigning different weights to different BTFs and combining them accordingly. We have evaluated our algorithm on two public benchmark datasets, VIPeR and CAVIAR4REID, achieving new state-of-the-art performance on both datasets.

14:00-14:30, Paper WePSBT1.30
Learning Feature Weights of Symbols, with Application to Symbol Spotting
Nayef, Nibal	Tech. Univ. of Kaiserslautern
Afzal, Muhammad Zeshan	Tech. Univ. of Kaiserslautern
Breuel, Thomas	Univ. of Kaiserslautern
Keywords: Feature Reduction and Manifold Learning, Graphics Recognition Abstract: Finding discriminant features is useful for pattern recognition applications. In this work, geometric matching is combined with linear discriminant analysis (LDA) to learn the importance of the features of symbols, and assign weights to these features accordingly. The features are the line segments of the symbols. We use geometric matching within a symbol spotting system to get information on the matching between the line segments of a query symbol and the line segments of the spotted symbols found by the spotting system (both true and false matches). The matching information is used to compute feature vectors for a query symbol. The vectors represent how well the segments of a query are matched to the segments of the true and false matches. Then, LDA is trained on these vectors to get the weights of the line segments of different query symbols. This feature weighting approach is applied in symbol spotting. Using the query weighted features, the spotting system's precision improves from an average of 71% to an average of 98%, with a speedup factor of 2.1.

14:00-14:30, Paper WePSBT1.31
SuStorID: A Multiple Classifier System for the Protection of Web Services
Corona, Igino	Univ. of Cagliari, Italy
Tronci, Roberto	Univ. of Cagliari
Giacinto, Giorgio	Univ. of Cagliari
Keywords: Pattern Recognition for Surveillance and Security, Statistical, Syntactic and Structural Pattern Recognition Abstract: The security of web services is nowadays one of the major concerns for Internet users. Web services may manage confidential information, monetary transactions, or even health-critical systems, such as those employed in public airports or hospitals. A key problem of web services is that they should work as expected even in the presence of malicious inputs. Unfortunately, with the increasing complexity of web services, this task becomes more and more challenging. In this paper we present SuStorID, a multiple classifier system which is able to model legitimate inputs towards web services, given a sample of web traffic. If anomalous inputs are detected, web services are protected according to a set of anomaly templates. Our experiments, performed on a production environment, highlight that our system can accurately detect web attacks and help security operators to protect their web services against known and unknown attacks.

14:00-14:30, Paper WePSBT1.32
A Score-Level Fusion Method with Prior Knowledge for Fingerprint Matching
Zang, Yali	Inst. of Automation, Chinese Acad. of Sciences
Yang, Xin	Institute of Automation, Chinese Acad. of Sciences
Cao, Kai	School of Life Sciences and Tech. Xidian Univ.
Jia, Xiaofei	Institute of Automation, Chinese Acad. of Sciences
Zhang, Ning	Institute of Automation, Chinese Acad. of Sciences
Tian, Jie	Institute of Automation, Chinese Acad. of Sciences
Keywords: Biometrics Abstract: Fingerprint matching is one of the most widely used biometrics for personal identification. However, the performance of fingerprint identification systems is insufficient for many applications. Many methods have been proposed to improve system performance by introducing more information into the matching process. In this paper, we introduce a new kind of information named prior knowledge and propose a score-level fusion method with prior knowledge for fingerprint matching. The trend and discrimination of scores are used as prior knowledge with a sigmoid function to search for the optimal fusion parameters. Experimental results show that the proposed prior knowledge is useful for score fusion and fingerprint matching, and that the score-level fusion algorithm effectively improves system performance and is comparable to the best ones in FVC2004.

14:00-14:30, Paper WePSBT1.33
View-Invariant Gait Recognition from Low Frame-Rate Videos
Mansur, Al	Osaka Univ.
Makihara, Yasushi	The Inst.
Yagi, Yasushi	Osaka Univ.
Keywords: Biometrics, Pattern Recognition for Surveillance and Security Abstract: In this paper, we introduce a torus manifold-based temporal super resolution method for gait recognition from low frame-rate videos with view transitions. Given a low frame-rate gait sequence with view transition from an unknown person, we estimate three unknowns: view, phase, and style. We estimate view by walking trajectory and camera information, phase by dynamic programming using multiview exemplar sequences, and style by bilinear model and linear least squares. Once these parameters are known, we can synthesize a high frame-rate sequence corresponding to that unknown person and can use existing methods for gait recognition. Experiments with OU-ISIR multiview gait dataset demonstrate the effectiveness of the proposed method for frame-rates as low as 1 or 2 fps.

14:00-14:30, Paper WePSBT1.34
Visualizing Vein Patterns from Color Skin Images Based on Image Mapping for Forensics Analysis
Tang, Chaoying	School of Computer Engineering, Nanyang Tech. Univ.
Zhang, Hengyi	School of Computer Engineering, Nanyang Tech. Univ.
Kong, Adams	Nanyang Tech. Univ.
Craft, Noah	Department of Medicine, Los Angeles Biomedical Res. Inst.
Keywords: Biometrics Abstract: Traditionally, it was difficult to use vein patterns in evidence images for forensic identification, because they were nearly invisible in color images. We proposed a computational method based on skin optics to uncover vein patterns from color images. However, its performance is dependent on the accuracy of the skin optical model. In this paper, we propose an algorithm based on image mapping to visualize vein patterns. It extracts information from a pair of synchronized color and near infrared (NIR) images, and uses a neural network (NN) to map RGB values to NIR intensities. In addition, an NN weight adjustment scheme is proposed to improve the robustness of the algorithm. The proposed algorithm was examined on a database with 300 pairs of color and NIR images collected from the forearms of 150 subjects. The automatic matching results from the proposed algorithm were better than those from our previous method, and comparable to the results from matching NIR images with NIR images.

14:00-14:30, Paper WePSBT1.35
Dual Subspace Nonnegative Matrix Factorization for Person-Invariant Facial Expression Recognition
Tu, Yi-Han	National Tsing Hua Univ.
Hsu, Chiou-Ting	National Tsing Hua Univ.
Keywords: Classification and Clustering, Feature Reduction and Manifold Learning, Features and Image Descriptors Abstract: Person-dependent appearance changes tend to increase difficulties in automatic facial expression recognition. Although one can use neutral face images to reduce the personal variations, acquisition of neutral face images may not always be possible in real cases. In order to remove the person-dependent influence from expressive images, we propose a dual subspace nonnegative matrix factorization (DSNMF) to decompose facial images into two parts: identity and expression parts. The identity part should characterize person-dependent variations, while the expression part should characterize person-invariant expression features. Our experimental results show that the proposed method significantly outperforms existing approaches on the CK+ and JAFFE expression databases.

14:00-14:30, Paper WePSBT1.36
Non-Parametric Score Normalization for Biometric Verification Systems
Struc, Vitomir	Univ. of Ljubljana
Zganec Gros, Jerneja	Alpineon Ltd.
Pavesic, Nikola	Univ. of Ljubljana, Faculty of Electrical Engineering
Keywords: Biometrics, Classification and Clustering, Statistical, Syntactic and Structural Pattern Recognition Abstract: In this paper we study the problem of score normalization in biometric verification systems. Specifically, we introduce a new class of normalization techniques, which unlike the commonly used parametric score normalization techniques, such as z- or t-norm, make no assumptions regarding the shape of the underlying score distribution. The proposed class of normalization techniques first estimates the relevant score distribution in an impostor-centric manner using kernel density estimation and then maps the estimated distribution to a common one. Our experimental results obtained on the FRGCv2 face database show that the proposed non-parametric score normalization techniques consistently outperform their parametric counterparts when the target distribution takes a log-normal form and that all assessed techniques, i.e., z-, t-, zt- and tz-norms, improve upon the setting where no score normalization is used.
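
The contrast drawn in this abstract between parametric and non-parametric normalization can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian kernel, the Silverman rule-of-thumb bandwidth, and the standard normal target distribution are all assumptions chosen for illustration (the abstract reports its best results with a log-normal target). The sketch estimates the impostor score distribution with a kernel density estimate and maps a score through the estimated CDF to the target distribution; z-norm is shown alongside for comparison.

```python
import math
from statistics import NormalDist

import numpy as np


def z_norm(score, impostor_scores):
    """Parametric z-norm: standardize by the impostor mean and standard deviation."""
    x = np.asarray(impostor_scores, dtype=float)
    return (score - np.mean(x)) / np.std(x)


def kde_cdf(score, impostor_scores, bandwidth=None):
    """CDF of a Gaussian kernel density estimate built from impostor scores.

    The default bandwidth is Silverman's rule of thumb (an assumption here,
    not something stated in the abstract).
    """
    x = np.asarray(impostor_scores, dtype=float)
    if bandwidth is None:
        bandwidth = 1.06 * np.std(x) * len(x) ** (-0.2)
    z = (score - x) / bandwidth
    # Averaging the Gaussian CDF of each kernel gives the CDF of the KDE.
    return float(np.mean([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z]))


def nonparametric_norm(score, impostor_scores, bandwidth=None):
    """Map a score to a standard normal target through the estimated impostor CDF."""
    u = kde_cdf(score, impostor_scores, bandwidth)
    u = min(max(u, 1e-9), 1.0 - 1e-9)  # keep the inverse CDF finite
    return NormalDist().inv_cdf(u)
```

Both functions assume a non-degenerate impostor sample (more than one distinct score). Unlike z-norm, the KDE-based mapping uses the whole empirical shape of the impostor scores rather than only their mean and variance.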

14:00-14:30, Paper WePSBT1.37
Fingerprint Matching Utilizing Non-Distal Phalanges
Topçu, Berkay	Sabancı Univ.
Kayaoglu, Mehmet	TUBITAK BILGEM UEKAE
Kilinc, Merve	TUBITAK BILGEM UEKAE
Uludag, Umut	TUBITAK-BILGEM-UEKAE
Keywords: Biometrics Abstract: The human hand is composed of structures called carpal bones, metacarpal bones and phalanges (which form the fingers). Typically, fingerprint matching is used for personal authentication, with images and features obtained from the "tip" of the fingers, i.e., distal phalanges (sections, digits). In this study, we report fingerprint minutiae matching results, with images obtained from proximal and middle phalanges. Experiments conducted on a medium-size database, collected using a commercial low-cost optical (distal) fingerprint sensor without any modification, show that, in applications where distal phalanx images are not usable (e.g. due to missing digits, low quality finger surface due to manual labor), non-distal phalanges may provide an acceptable biometric verification source.

14:00-14:30, Paper WePSBT1.38
Motion Histogram Quantification for Human Action Recognition
Tabia, Hedi	IEF, Inst. d'Electronique Fondamentale
Gouiffès, Michèle	IEF Univ. Paris Sud 11
Lacassagne, Lionel	IEF Univ. Paris Sud 11
Keywords: Gesture and Behavior Analysis, Pattern Recognition for Surveillance and Security, Classification and Clustering Abstract: In this paper, we propose an approach for human activity categorization based on the use of optical flow direction and magnitude features. The main contribution of this paper is the feature representation that mirrors the geometry of the human body and relationships between its moving regions when performing activities. The features are quantified using a quantization algorithm. We analyze the performance of two well-known classifiers: the Naïve Bayes and the SVM. The results show the effectiveness of our approach.

14:00-14:30, Paper WePSBT1.39
Age-Invariant Face Verification Based on Local Classifier Ensemble
Mao, Xiaojiao	Nanjing Univ.
Yang, Yu-Bin	Nanjing Univ.
Li, Ning	Nanjing Univ.
Zhang, Yao	Nanjing Univ.
Keywords: Pattern Recognition for Bioinformatics, Pattern Recognition for Surveillance and Security, Biometrics Abstract: Human faces undergo a considerable amount of variation across ages. This paper proposes an age-invariant face verification method by using a Local Classifier Ensemble Model (LCEM). First, reference points are located based on an extended Active Shape Model and faces are aligned afterwards. Second, a face is grouped into several non-overlapping patches and each group is further divided into several overlapping sub-patches. Local classifiers are then trained for each sub-patch and integrated to build an ensemble classifier under a Semi-Naive Bayesian framework. Finally, the proposed approach is evaluated on the MORPH database and experimental results show that such a local classifier ensemble achieves significant and robust performance.

14:00-14:30, Paper WePSBT1.40
A Multiple Kernel Learning Approach to Multi-Modal Pedestrian Classification
San Biagio, Marco	Istituto Italiano di Tecnologia
Ulas, Aydin	Univ. degli Studi di Verona
Crocco, Marco	Istituto Italiano di Tecnologia
Cristani, Marco	Univ. of Verona
Castellani, Umberto	Univ. of Verona
Murino, Vittorio	Univ. of Verona
Keywords: Pattern Recognition for Surveillance and Security, Image and Video Processing Abstract: Pedestrian detection is a key problem in many computer vision applications, especially in surveillance and security systems. To this end, information integration from different imaging modalities, such as thermal infrared and visible spectrum, can significantly improve the detection rate with respect to mono-modal strategies. For this reason, an effective fusion scheme is necessary to combine the information presented by multiple sensors. In this paper, we propose a pedestrian classification method based on the multiple kernel learning framework; standard pixel features (such as spatial derivatives) from both imaging modalities are employed to learn several feature-related basic kernels and a compound kernel is found as an optimized linear combination of basic kernels. Finally the compound kernel is used to train an SVM. Experiments performed on the OTCBVS dataset [1] demonstrate that our recipe clearly outperforms a wide set of fusion methods from the literature.

14:00-14:30, Paper WePSBT1.41
Enhancing Biometric Security with Wavelet Quantization Watermarking Based Two-Stage Multimodal Authentication
Ma, Bin	Beihang Univ.
Li, ChunLei	Beihang Univ.
Wang, Yunhong	Beihang Univ.
Zhang, Zhaoxiang	Beihang Univ.
Huang, Di	Beihang Univ.
Keywords: Biometrics, Image and Video Processing Abstract: In this paper we propose a watermarking based two-stage authentication framework to enhance biometric system security. The face feature of one individual is embedded into his/her fingerprint image as a credibility token. During authentication, the watermark is first extracted to establish data authenticity. If legitimate, the face pattern can further serve as a supplemental trait in the biometric authentication process. A wavelet quantization based watermarking method is proposed to robustly embed information in fingerprints while preserving their discriminating features. Meanwhile, a sparse representation based classifier is adopted to efficiently exploit the identity information within face watermarks. Experimental results which evaluate both watermarking and biometric authentication performance demonstrate the effectiveness of this work.

14:00-14:30, Paper WePSBT1.42
Iris Recognition Using Ordinal Encoding of Log-Euclidean Covariance Matrices
Li, Peihua	Heilongjiang Univ.
Wu, Guolong	Heilongjiang Univ.
Keywords: Biometrics, Pattern Recognition for Surveillance and Security Abstract: Iris recognition in less constrained environments is challenging due to the degraded iris images. This paper proposes a novel method fusing multiple cues for iris recognition in the non-ideal imagery. The covariance matrices are used to represent local iris texture property, which capture the correlation of spatial coordinates, intensities, 1st and 2nd-order partial derivatives. The covariance matrices are symmetric positive definite (SPD) which form a Riemannian space rather than a Euclidean one. In the Log-Euclidean framework, the space of SPD matrices is equipped with a linear space structure so that in the logarithmic domain the Euclidean operations are applicable. This enables us to compute the logarithms of covariance matrices, leading to the Log-Euclidean covariance matrices (LECM), which can be handled in common Euclidean operations. The ordinal measure is further used to represent the order relation of iris texture by comparing LECMs at different positions. We finally perform iris matching based on the Hamming distance in which the noise effects are considered. Experiments on challenging databases show the effectiveness of the proposed method.
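
The Log-Euclidean step described in this abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, and the small ridge term `eps` added to keep the covariance strictly positive definite is an assumption. The matrix logarithm of a symmetric positive definite (SPD) matrix is computed from its eigendecomposition, after which ordinary Euclidean operations (here, a Frobenius distance) apply to the logarithms.

```python
import numpy as np


def covariance_descriptor(features, eps=1e-6):
    """Covariance of per-pixel feature vectors, shape (n_pixels, d), n_pixels >= 2.

    eps * I is a hypothetical regularizer that keeps the result strictly SPD.
    """
    C = np.atleast_2d(np.cov(np.asarray(features, dtype=float), rowvar=False))
    return C + eps * np.eye(C.shape[0])


def spd_log(C):
    """Matrix logarithm of an SPD matrix: V diag(log w) V^T with C = V diag(w) V^T."""
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T


def log_euclidean_distance(A, B):
    """Frobenius distance between the matrix logarithms of two SPD matrices."""
    return float(np.linalg.norm(spd_log(A) - spd_log(B), ord="fro"))
```

For example, `log_euclidean_distance(np.diag([1.0, np.e ** 2]), np.eye(2))` compares the logarithms diag(0, 2) and the zero matrix. The ordinal encoding and Hamming-distance matching mentioned in the abstract are not covered by this sketch.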

14:00-14:30, Paper WePSBT1.43
F-Measure Optimisation in Multi-Label Classifiers
Pillai, Ignazio	Univ. of Cagliari
Fumera, Giorgio	Univ. of Cagliari
Roli, Fabio	Univ. of Cagliari
Keywords: Machine Learning and Data Mining, Statistical, Syntactic and Structural Pattern Recognition, Classification and Clustering Abstract: When a multi-label classifier outputs a real-valued score for each class, a well known design strategy consists of tuning the corresponding decision thresholds by optimising the performance measure of interest on validation data. In this paper we focus on the F-measure, which is widely used in multi-label problems. We derive two properties of the micro-averaged F-measure, viewed as a function of the threshold values, which allow its global maximum to be found by an optimisation strategy with an upper bound on computational complexity of O(n^2 N^2), where N and n are respectively the number of classes and of validation samples. So far, only a suboptimal threshold selection rule and a greedy algorithm without any optimality guarantee were known for this task. We then devise a possible optimisation algorithm based on our strategy, and evaluate it on three benchmark multi-label data sets.
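
For readers unfamiliar with the objective being optimised, the following minimal sketch evaluates the micro-averaged F-measure as a function of per-class decision thresholds. It shows only the quantity itself, not the authors' optimisation strategy; the function name, the `>=` decision rule, and the convention of returning 0.0 when there are no positives at all are assumptions made for illustration.

```python
import numpy as np


def micro_f_measure(scores, labels, thresholds):
    """Micro-averaged F-measure for given per-class thresholds.

    scores:     (n, N) real-valued classifier outputs
    labels:     (n, N) binary ground truth (1 = class is relevant)
    thresholds: (N,)   one decision threshold per class
    """
    pred = np.asarray(scores) >= np.asarray(thresholds)
    truth = np.asarray(labels).astype(bool)
    # Counts are pooled over all classes and samples (micro-averaging).
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom > 0 else 0.0
```

Because the counts are pooled across classes, changing one class's threshold changes the trade-off for all the others, which is why the thresholds cannot simply be tuned independently.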

14:00-14:30, Paper WePSBT1.44
Combining General Multi-Class and Specific Two-Class Classifiers for Improved Customized ECG Heartbeat Classification
Ye, Can	Carnegie Mellon Univ.
Kumar, BVK Vijaya	Carnegie Mellon Univ.
Coimbra, Miguel	Univ. of Porto
Keywords: Pattern Recognition for Bioinformatics, Classification and Clustering, Machine Learning and Data Mining Abstract: We present an approach for customized heartbeat classification of electrocardiogram (ECG) signals, based on the construction of one general multi-class classifier and one specific two-class classifier. The general classifier is trained on a global training dataset, containing examples of all possible classes and patterns. On the other hand, the individual-specific classifier is built using a small amount of individual data, which is a binary one-against-the-rest classifier, providing discrimination between normal and abnormal patterns from that individual. Such an individual-specific classifier can be a two-class classifier or a one-class classifier, depending on the availability of abnormal patterns in the individual training dataset. The classifications from the two classifiers are fused to obtain a final decision. The proposed approach is applied to the study of ECG heartbeat classification problem, significantly outperforming state-of-the-art methods. The proposed method can also be useful in anomaly detection of other biomedical signals.

14:00-14:30, Paper WePSBT1.45
Facial Image-Based Gender Classification Using Local Circular Patterns
Wang, Chen	Beihang Univ.
Huang, Di	Beihang Univ.
Wang, Yunhong	Beihang Univ.
Zhang, Guangpeng	Beihang Univ.
Keywords: Biometrics Abstract: Gender is one of the most important demographic attributes of human beings, and recently automatic face based gender classification has received increasing attention due to its wide potential in many useful applications. To address this issue, in this paper, we propose a novel variant of Local Binary Patterns (LBP), namely Local Circular Patterns (LCP). LCP makes use of clustering-based quantization instead of the binary coding strategy of the LBP operator, leading to an improvement in discriminative power. Meanwhile, thanks to the natural property of clustering-based quantization, LCP is more robust than LBP to noise. Experiments are carried out on the FERET database and the classification accuracy is up to 95.36%, clearly highlighting the effectiveness of the proposed method.


WePSBT2	Multi-Purpose Hall
Poster Shotgun (10): CV	Regular Session

14:00-14:30, Paper WePSBT2.1
Arrangement Based Image Representation for Scene Recognition
Somanath, Gowri	Univ. of Delaware
Kambhamettu, Chandra	Univ. of Delaware
Keywords: Scene Understanding Abstract: Studies on human faculties of scene recognition have led to two broad classifications of the perceived information: local and global. It has been shown that both are processed separately and combined towards final category assignment. Recently, it was suggested that the accuracy of computational models for local information closely matches human performance, while this is not so for current global representations. In this paper, we propose a new global representation, AGIR. The key differences from current approaches are the explicit modeling of 'arrangement' (co-occurrence and configuration) in the scene, and the use of multiple hierarchical dictionaries. The effectiveness of the proposed scheme is shown through various experiments and comparisons on both indoor and outdoor scene recognition.

14:00-14:30, Paper WePSBT2.2
Efficient Statistical/Morphological Cell Texture Characterization and Classification
Thibault, Guillaume	Center for Mathematical Morphology, Mines-ParisTech
Angulo, Jesus	MINES ParisTech
Keywords: Features and Image Descriptors, Pattern Recognition for Bioinformatics, Image and Video Processing Abstract: This paper presents the different steps of an automatic fluorescence-labelled cell classification method. First, a study of data features is discussed in order to describe cell texture by means of morphological and statistical texture descriptors. Then, results on supervised classification using logistic regression, random forest and neural networks, for both morphological and statistical descriptors, are presented. We propose a final consolidated classifier based on a weighted probability for each class, where the weights are given by the empirical classification performances. The method is evaluated on the ICPR'12 HEp-2 contest dataset.

14:00-14:30, Paper WePSBT2.3
Parallelized Annealed Particle Filter for Real-Time Marker-Less Motion Tracking Via Heterogeneous Computing
Bian, Yatao	Shanghai Jiao Tong Univ.
Zhao, Xu	Shanghai Jiao Tong Univ.
Song, Jian	Shanghai Jiao Tong Univ.
Liu, Yuncai	Shanghai Jiao Tong Univ.
Keywords: Motion, Tracking and Video Analysis, Gesture and Behavior Analysis Abstract: We propose a parallelized Annealed Particle Filter method via heterogeneous computing (P-APF), to implement real-time marker-less motion tracking based on the OpenCL framework. The overall computing procedure in P-APF is decomposed into several computational tasks with corresponding granularity. According to the degree of parallelism, the tasks are assigned to standard and attached processors respectively, to fully leverage heterogeneous computing ability. A novel task latency hiding strategy is proposed to further reduce time cost. Experiments on different human motion datasets demonstrate that P-APF can achieve real-time tracking performance without losing accuracy. With an average acceleration ratio of 106 compared to the serial implementation, the time cost basically remains constant with the growth of particle number and view number in a limited range.

14:00-14:30, Paper WePSBT2.4
Multi-Pose Face Detection for Accurate Face Logging
Bagdanov, Andrew	Univ. of Florence
Del Bimbo, Alberto	Univ. of Florence
Lisanti, Giuseppe	Univ. of Florence
Masi, Iacopo	Univ. of Florence
Keywords: Pattern Recognition for Surveillance and Security Abstract: In this paper we present a technique for real-time face logging in video streams. Our system is capable of detecting faces across a range of poses and of tracking multiple targets in real time, grabbing face images and evaluating their quality in order to store only the best for each detected target. An advantage of our approach is that we qualify every logged face in terms of a quality measure based both on face pose and on resolution. Extensive qualitative and quantitative evaluation of the performance of our system is provided on many hours of realistic surveillance footage captured in different environments. Results show that our system can simultaneously minimize false positives and identity mismatches, while balancing this against the need to obtain face images of all people in a scene.

14:00-14:30, Paper WePSBT2.5
Image Classification Using HTM Cortical Learning Algorithms
Zhuo, Wen	Huazhong Univ. of Science and Tech.
Cao, Zhiguo	Huazhong Univ. of Science and Tech.
Xiao, Yang	Nanyang Tech. Univ.
Qin, Yueming	Huazhong Univ. of Science and Tech.
Yu, Zhenghong	Huazhong Univ. of Science and Tech.
Keywords: Features and Image Descriptors, Neural Networks Abstract: Recently the improved bag of features (BoF) model with locality-constrained linear coding (LLC) and spatial pyramid matching (SPM) achieved state-of-the-art performance in image classification. However, only adopting SPM to exploit spatial information is not enough for satisfactory performance. In this paper, we use hierarchical temporal memory (HTM) cortical learning algorithms to extend this LLC & SPM based model. HTM regions consisting of HTM cells are constructed to spatially pool the LLC codes. Each cell receives a subset of LLC codes, and adjacent subsets overlap so that more spatial information can be captured. Additionally, HTM cortical learning algorithms have two processes: a learning phase, which makes each HTM cell receive only the most frequent LLC codes, and an inhibition phase, which ensures that the output of HTM regions is sparse. The experimental results on the Caltech 101 and UIUC-Sport datasets show the improvement over the original LLC & SPM based model.

14:00-14:30, Paper WePSBT2.6
Real-Time Hand Status Recognition from RGB-D Imagery
Bagdanov, Andrew	Univ. of Florence
Del Bimbo, Alberto	Univ. of Florence
Seidenari, Lorenzo	Media Integration and Communication Center - Univ. of Floren
Usai, Lorenzo	Media Integration and Communication Center - Univ. of Flore
Keywords: Gesture and Behavior Analysis, Human Computer Interaction Abstract: One of the most critical limitations of Kinect™-based interfaces is the need for persistence in order to interact with virtual objects. Indeed, a user must keep her arm still for a not-so-short span of time while pointing at an object with which she wants to interact. The most natural way to overcome this limitation and improve interface reactivity is to employ a vision module able to recognize simple hand poses (e.g. open/closed) in order to add a state to the virtual pointer represented by the user hand. In this paper we propose a method to robustly predict the status of the user hand in real-time. We jointly exploit depth and RGB imagery to produce a robust feature for hand representation. Finally, we use temporal filtering to reduce spurious prediction errors. We have also prepared a dataset of more than 30K depth-RGB image pairs of hands that is being made publicly available. The proposed method achieves more than 98% accuracy and is highly responsive.

14:00-14:30, Paper WePSBT2.7
Using Spatial Pyramids with Compacted VLAT for Image Categorization
Negrel, Romain	ETIS/ENSEA - CNRS UMR 8051
Picard, David	ETIS/ENSEA CNRS UMR 8051
Gosselin, Philippe Henri	CNRS
Keywords: Features and Image Descriptors, Classification and Clustering, Multimedia Analysis, Indexing and Retrieval Abstract: In this paper, we propose a compact image signature based on VLAT. Our method integrates spatial information while significantly reducing the size of the original VLAT by using two projection steps. We carry out experiments showing our approach is competitive with state-of-the-art signatures.

14:00-14:30, Paper WePSBT2.8
A Novel, Efficient, Tree-Based Descriptor and Matching Algorithm
Fowers, Spencer	Brigham Young Univ.
Lee, Dah-Jye	Brigham Young Univ.
Ventura, Dan	Brigham Young Univ.
Wilde, Doran	Brigham Young Univ.
Keywords: Features and Image Descriptors, Motion, Tracking and Video Analysis Abstract: This paper presents the development of the Tree BAsis Sparse-coding Inspired Similarity feature descriptor (TreeBASIS). TreeBASIS utilizes a binary vocabulary tree that is computed off-line using basis dictionary images (BDIs) derived from sparse coding, and the resulting tree is stored in memory for on-line searching. During the on-line algorithm, a small region around a feature point is passed into the BASIS tree, where a Hamming distance is computed between the region and the effectively descriptive BDI (EDBDI) to determine the branch taken. The path the FRI takes is saved as the descriptor, and matching is performed by following the paths of two features. Experimental results show that the TreeBASIS descriptor outperforms SIFT and SURF on frame-to-frame aerial feature point matching.

14:00-14:30, Paper WePSBT2.9
RGBD Object Pose Recognition Using Local-Global Multi-Kernel Regression
El-Gaaly, Tarek	Rutgers Univ.
Torki, Marwan	Rutgers Univ.
Elgammal, Ahmed	Rutgers Univ.
Singh, Maneesh	Siemens Corp. Res.
Keywords: 2D/3D Object Detection and Recognition, Vision for Robotics, Feature Reduction and Manifold Learning Abstract: The advent of inexpensive depth augmented color (RGBD) sensors has brought about a large advancement in the perceptual capability of vision systems and mobile robots. Challenging vision problems like object category, instance and pose recognition have all benefited from this recent technological advancement. In this paper we address the challenging problem of pose recognition using simultaneous color and depth information. For this purpose, we extend a state-of-the-art regression framework by using a multi-kernel approach to incorporate depth information to perform more effective pose recognition on table-top objects. We do extensive experiments on a large publicly available dataset to validate our approach. We show significant performance improvements (more than 20%) over published results.

14:00-14:30, Paper WePSBT2.10
Video Figure Ground Labeling
Elqursh, Ali	Rutgers Univ.
Elgammal, Ahmed	Rutgers Univ.
Keywords: Motion, Tracking and Video Analysis, Scene Understanding, Segmentation, Color and Texture Abstract: Figure-ground labeling is a classical problem in computer vision in which the goal is to label different parts of the visual input as figural or background. Yet most existing approaches focus on single image figure-ground labeling with little emphasis on video. We present a method which integrates several cues to achieve figure-ground labeling on video sequences. The method is evaluated on challenging video sequences.

14:00-14:30, Paper WePSBT2.11
Local Phase Quantization Descriptors for Blur Robust and Illumination Invariant Recognition of Color Textures
Pedone, Matteo	Univ. of Oulu
Heikkilä, Janne	Univ. of Oulu
Keywords: Features and Image Descriptors Abstract: A novel extension for color images of the local phase quantization (LPQ) local descriptor is presented. The descriptor is obtained by using a multivector representation of color values in order to derive blur-robust features in frequency domain. We tested the proposed descriptor in texture classification problems, and quantified its robustness for several amounts of blur. The experiments show that the proposed descriptor achieves superior accuracy over its grayscale counterpart and other color texture descriptors. Furthermore its illumination-invariance properties guarantee remarkable performances in challenging scenarios of varying illumination, without the need of pre-processing textures with color-constancy algorithms.

14:00-14:30, Paper WePSBT2.12
Dense Reconstruction by Stereo-Motion under Perspective Camera Model
Fang, Mu	The Chinese Univ. of Hong Kong
Chung, Chi-kit Ronald	The Chinese Univ. of Hong Kong
Keywords: Stereo and Image-Based Modeling, Vision for Robotics, Vision for Graphics Abstract: This paper presents a new stereo-motion approach for 3D scene reconstruction in dense and accurate form, that allows the cameras to be described by the full perspective model. Given a short and arbitrary motion of a stereo rig of cameras, the projective depth of every image point can be recovered from the rank-four property of a matrix that comprises the image positions of the scene, and the associated 3D position can thereby be determined accurately. Compared to earlier methods, the approach allows the use of the full perspective model to describe the cameras, and thus can attain a higher accuracy. In addition, the projective depths are recovered without the need for an initial guess of the depth map, iterations, or approximation. The recovery process demands only a few stereo correspondences over the entire scene to start with, and points that are occluded in some of the views can also be reconstructed. Experiments on real image sequences are shown to illustrate the effectiveness of the approach.

14:00-14:30, Paper WePSBT2.13
3D Human Pose Estimation Using 2D Body Part Detectors
Barbulescu, Adela	Aalborg Univ.
Gong, Wenjuan	cvc
Gonzalez, Jordi	Centre de Visio per Computador, Univ. Autònoma de Barcelona
Moeslund, Thomas	Aalborg Univ.
Roca, F. Xavier	Computer Vision Center - Univ. Autonoma de Barcelona
Keywords: 2D/3D Object Detection and Recognition, Machine Learning and Data Mining Abstract: Automatic 3D reconstruction of human poses from monocular images is a challenging and popular topic in the computer vision community, which provides a wide range of applications in multiple areas. Solutions for 3D pose estimation involve various learning approaches, such as support vector machines and Gaussian processes, but many encounter difficulties in cluttered scenarios and require additional input data, such as silhouettes, or controlled camera settings. We present a framework that is capable of estimating the 3D pose of a person from single images or monocular image sequences without requiring background information and which is robust to camera variations. The framework models the non-linearity present in human pose estimation as it benefits from flexible learning approaches, including a highly customizable 2D detector. Results on the HumanEva benchmark show how they perform and influence the quality of the 3D pose estimates.

14:00-14:30, Paper WePSBT2.14
Matting-Driven Online Learning of Hough Forests for Object Tracking
Qin, Tao	Xiamen Univ.
Zhong, Bineng	Huaqiao Univ.
Chin, Tat-Jun	The Univ. of Adelaide
Wang, Hanzi	Xiamen Univ.
Keywords: Motion, Tracking and Video Analysis Abstract: Accurate segmentation provides a useful contour constraint to alleviate drifting during online learning for tracking. Towards this end, we present a closed-loop approach for object tracking that links Hough forests and alpha matting via an effective back-projection scheme for patches. A novel hybrid-Hough-forests-based method first estimates object location. Given the object location, the trimap of matting is then automatically generated from the patches back-projected from the Hough forests. Subsequently, an accurate contour of the object can be obtained based on a robust matting technique. Based on such an accurate contour, an update strategy is utilized to obtain reliable labeled samples to update the Hough forests to decrease the risk of model drift. Extensive comparisons on challenging sequences demonstrate the robustness and effectiveness of the proposed method.

14:00-14:30, Paper WePSBT2.15
Face Pose Estimation with Combined 2D and 3D HOG Features
Yang, Jiaolong	Beijing Inst. of Tech.
Liang, Wei	Beijing Lab. of Intelligent Information Tech.
Jia, Yunde	Beijing Inst. of Tech.
Keywords: 2D/3D Object Detection and Recognition Abstract: This paper describes an approach to location and orientation estimation of a person's face with color image and depth data from a Kinect sensor. The combined 2D and 3D histogram of oriented gradients (HOG) features, called RGBD-HOG features, are extracted and used throughout our approach. We present a coarse-to-fine localization paradigm to obtain localization results efficiently using multiple HOG filters trained with support vector machines (SVMs). A feed-forward multi-layer perceptron (MLP) network is trained for fine face orientation estimation over a continuous range. The experimental results demonstrate the effectiveness of the RGBD-HOG feature and our face pose estimation approach.

14:00-14:30, Paper WePSBT2.16
Evaluation of Local Feature Descriptors and Their Combination for Pedestrian Representation
Liang, Jixiang	Graduate Univ. of Chinese Acad. of Sciences
Ye, Qixiang	Graduate Univ. of Chinese Acad. of Sciences
Chen, Jie	Univ. of Oulu, Finland
Jiao, Jianbin	Graduate Univ. of Chinese Acad. of Sciences
Keywords: Features and Image Descriptors, Performance Evaluation Abstract: The pedestrian detection problem has been a touchstone for various image feature descriptors. In this paper, we evaluate four kinds of representative local descriptors (HOG, Haar-like, SURF and LBP) for pedestrian representation. Our goal is to find the best combination of feature descriptors by analyzing and evaluating their complementarities. With the cross validation method, we first find the best descriptor, which is then combined with the other descriptors one by one for evaluation. In addition to direct descriptor combination, we propose a new descriptor strategy, called structural combination. Experiments on two public pedestrian datasets show that the performance evaluation supports the complementarity analysis and that the complementarity depends on the combination strategy.

14:00-14:30, Paper WePSBT2.17
Multi-Dimensional Local Binary Pattern Descriptors for Improved Texture Analysis
Schaefer, Gerald	Loughborough Univ.
Doshi, Niraj P.	Loughborough Univ.
Keywords: Features and Image Descriptors, Segmentation, Color and Texture, Multimedia Analysis, Indexing and Retrieval Abstract: Texture analysis algorithms are employed in many computer vision applications. A group of high performing texture algorithms are based on the concept of local binary patterns (LBP) which describe the relationship of pixels to their local neighbourhood. LBP descriptors are invariant to intensity changes and rotation invariance is simple to derive. In addition, LBP features can be calculated for different neighbourhood radii and thus allow texture description at different scales. In conventional LBP methods, the histograms corresponding to different radii are simply concatenated which results in a loss of information between these scales and added ambiguity. In this paper, we address this problem and show that multi-dimensional LBP histograms provide effective texture descriptors. We demonstrate, on various texture datasets from the Outex suite and both for texture classification and texture retrieval scenarios, that our proposed approach consistently outperforms conventional LBP features.
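
The contrast drawn in this abstract between concatenated and multi-dimensional histograms can be illustrated with a minimal sketch. It assumes a basic 8-neighbour LBP without interpolation or rotation invariance, and the function names `lbp_code` and `lbp_histograms` are hypothetical; it is one plausible reading of the idea, not the authors' implementation.

```python
from collections import Counter

# Eight neighbour directions (dy, dx); each is scaled by the radius.
# Assumption: no circular interpolation, so diagonal neighbours lie
# slightly further away than axis-aligned ones.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x, radius):
    """8-bit LBP code of pixel (y, x): bit i is set if neighbour i >= centre."""
    centre = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if img[y + dy * radius][x + dx * radius] >= centre:
            code |= 1 << bit
    return code


def lbp_histograms(img, radii=(1, 2)):
    """Return (per-radius histograms, joint histogram) over interior pixels.

    The per-radius histograms are what a conventional method would
    concatenate; the joint histogram keeps, for every pixel, the tuple of
    codes observed at all radii, so co-occurrence across scales is retained.
    """
    margin = max(radii)
    height, width = len(img), len(img[0])
    per_radius = [Counter() for _ in radii]
    joint = Counter()
    for y in range(margin, height - margin):
        for x in range(margin, width - margin):
            codes = tuple(lbp_code(img, y, x, r) for r in radii)
            for hist, code in zip(per_radius, codes):
                hist[code] += 1
            joint[codes] += 1
    return per_radius, joint
```

Summing the joint histogram over one axis recovers the corresponding per-radius histogram, which shows that the concatenated form is a marginalisation of the joint one and therefore carries no more information than it.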

14:00-14:30, Paper WePSBT2.18
Exploiting Ramp Structures for Improving Optical Flow Estimation
Singh, Abhishek	Univ. of Illinois at Urbana-Champaign
Ahuja, Narendra	UIUC
Keywords: Low-Level Vision, Motion, Tracking and Video Analysis, Geometric and Photometric Registration Abstract: The underlying principle behind most optical flow algorithms is that the brightness of a pixel remains the same as it flows from one frame to the next. The first order Taylor approximation used in formulating this brightness constancy principle may not be accurate when intensity profiles change non-linearly. In this paper, we propose a method of alleviating the effect of this approximation. Instead of computing image gradients using conventional horizontal and vertical filters of fixed coefficients and sizes, we propose to obtain the gradient information by an explicit examination of ramp profiles at a given location, in all directions. The gradient information obtained using the proposed analysis is more robust under non-linear changes in intensity profiles. Our results demonstrate that by incorporating the ramp structure information as proposed, we are able to improve existing optical flow algorithms.
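
For reference, the brightness constancy principle and its first-order Taylor approximation mentioned in this abstract are usually written as follows. This is the standard textbook form, with flow (u, v) and partial derivatives I_x, I_y, I_t as assumed notation; it is not necessarily the notation used in the paper.

```latex
% Brightness constancy: a pixel keeps its intensity as it moves by (u, v).
I(x + u,\, y + v,\, t + 1) = I(x, y, t)

% First-order Taylor expansion of the left-hand side:
I(x + u,\, y + v,\, t + 1) \approx I(x, y, t) + I_x u + I_y v + I_t

% Combining the two gives the linearized optical flow constraint:
I_x u + I_y v + I_t = 0
```

The linearized constraint is accurate only where the intensity profile is close to linear over the displacement, which is the limitation the abstract refers to.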

14:00-14:30, Paper WePSBT2.19
Integrating Bottom-Up and Top-Down Processes for Accurate Pedestrian Counting
Lin, Yujie	Sun Yat-Sen Univ.
Liu, Ning	Sun Yat-Sen Univ.
Keywords: Motion, Tracking and Video Analysis, Pattern Recognition for Surveillance and Security Abstract: This paper presents a novel method for pedestrian counting in surveillance videos, which localizes and tracks the head-shoulders of pedestrians via integrated bottom-up/top-down processes. In the bottom-up stage, we extract and match informative local image features across frames to obtain the initial moving regions (i.e. potential pedestrians). The top-down stage comprises two steps: (i) head-shoulder verification via a part-based classifier and (ii) head-shoulder tracking guided by motion and appearance consistency. Moreover, the geometric context of the camera is employed to effectively narrow the search space of inference. We apply the method to challenging videos and outperform the state-of-the-art approach.

14:00-14:30, Paper WePSBT2.20
Context-Driven Moving Vehicle Detection in Wide Area Motion Imagery
Shi, Xinchu	Inst. of Automation
Ling, Haibin	Temple Univ.
Blasch, E	Air Force Res. Lab.
Hu, Weiming	National Lab. of Pattern Recognition,Inst.
Keywords: 2D/3D Object Detection and Recognition, Remote Sensing, Pattern Recognition for Surveillance and Security Abstract: Detection of moving vehicles in wide area motion imagery (WAMI) is increasingly important, with promising applications in surveillance, traffic scene understanding and public service applications such as emergency evacuation and policy security. However, the large camera motion, along with low contrast between vehicles and backgrounds, makes detection a challenging task. In this paper, we propose a novel moving vehicle detection approach that embeds the scene context, which is a road network estimated online. A two-step framework is used in the work. First, starting from an initial vehicle detection, trajectories are obtained by vehicle tracking. Then, the road network is extracted and used to reduce false detections. Quantitative evaluation demonstrates that the proposed contextual model remarkably improves the detection performance.

14:00-14:30, Paper WePSBT2.21
Shape Prior Regularized Continuous Max-Flow Approach to Image Segmentation
Duan, Yuping	Inst. for Infocomm Res.
Huang, Weimin	I2R
Chang, Huibin	Tianjin Normal Univ.
Keywords: Segmentation, Color and Texture, 2D/3D Object Detection and Recognition, Detection, Separation and Segmentation Abstract: In this work, we propose a novel segmentation method based on the continuous max-flow (CMF) formulation of the Potts model incorporating a statistical shape model. We increase the robustness and accuracy of the Potts model by using prior shape knowledge from Principal Component Analysis (PCA) to represent the desired shape. Our multi-label model can segment several objects simultaneously and guarantee one label with a structure similar to the shape prior. The proposed approach is applied to both synthetic images and medical images of the liver from computed tomography (CT). Numerous numerical experiments demonstrate that our model is efficient and yields good-quality results in practice.

14:00-14:30, Paper WePSBT2.22
Segmentation and Tracking of Multiple Interacting Mice by Temperature and Shape Information
Giancardo, Luca	Istituto Italiano di Tecnologia (IIT)
Sona, Diego	Istituto Italiano di Tecnologia (IIT)
Scheggia, Diego	Istituto Italiano di Tecnologia (IIT)
Papaleo, Francesco	Istituto Italiano di Tecnologia (IIT)
Murino, Vittorio	Univ. of Verona
Keywords: Motion, Tracking and Video Analysis, Medical Image Analysis and Registration, Segmentation, Color and Texture Abstract: The study of neurological processes and pharmaceutical effects often relies on the analysis of mice behaviour. Automatic tracking tools are widely employed for this purpose; however, they are mainly limited to a single mouse. We propose a real-time segmentation and tracking algorithm able to manage multiple interacting mice regardless of their fur colour or light settings via an infrared camera. The proposed approach combines position, temperature and shape information thanks to the two main contributions of this paper: the "temporal watershed" and its information fusion with mice "heat signatures". The former segments shapes thanks to an extension of a classical seed-based segmentation algorithm in an expectation maximization framework; the latter contributes to the preservation of mice identities through the dynamic heat distribution of each body. Preliminary results show that our algorithm achieves performance comparable to the state of the art, even with a larger number of targets to be tracked.

14:00-14:30, Paper WePSBT2.23
Object Detection Via Foreground Contour Feature Selection and Part-Based Shape Model
Zhang, Huigang	Beihang Univ.
Wang, Junxiu	Beihang Univ.
Bai, Xiao	Beihang Univ.
Zhou, Jun	Australian National Univ.
Cheng, Jian	Inst. of Automation, Chinese Acad. of Science
Zhao, Huijie	Beihang Univ. China
Keywords: 2D/3D Object Detection and Recognition, Features and Image Descriptors, Statistical, Syntactic and Structural Pattern Recognition Abstract: In this paper, we propose a novel approach for object detection via foreground feature selection and a part-based shape model. It automatically learns a shape model from cluttered training images without needing explicitly given bounding boxes on objects. Our approach commences by extracting a set of feature descriptors, and iteratively selects the foreground features using Earth Mover's Distance based matching. This leads to a part-based shape model that can be used for object detection. Experimental results show that the proposed method has comparable performance with the state-of-the-art shape-based detection methods but with fewer requirements on the data at the training stage.

14:00-14:30, Paper WePSBT2.24
Key Observation Selection for Effective Video Synopsis
Zhu, Xiaobin	Inst. of Automation, Chinese Acad. of Sciences
Liu, Jing	National Lab. of Pattern Recognition,Inst.
Wang, Jinqiao	Chinese Acad. of Science
Lu, Hanqing	Inst. of Automation, Chinese Acad. of Science
Keywords: Motion, Tracking and Video Analysis, Image and Video Understanding Abstract: Millions of video surveillance cameras are distributed around the world and endlessly capture a tremendous amount of video data. Browsing video frame by frame is time consuming and inefficient, since needless information is abundant in the raw videos. Video synopsis is an effective way to solve this problem by producing a short video representation while keeping the essential activities of the original video. However, traditional video synopsis only eliminates redundancy in the spatial and temporal domains, while neglecting redundancy in the content domain. Too many observations make the synopsis video confusing and degrade its subjective effect. In this paper, we present a novel video synopsis method based on key observation selection. Key observation selection is conducted for each activity to eliminate content redundancy. We have demonstrated the effectiveness of our approach on real surveillance videos.

14:00-14:30, Paper WePSBT2.25
3D Tracking of Soccer Players Using Time-Situation Graph in Monocular Image Sequence
Itoh, Hiroki	Kobe Univ.
Takiguchi, Tetsuya	Kobe Univ.
Ariki, Yasuo	Kobe Univ.
Keywords: Motion, Tracking and Video Analysis Abstract: In this paper, we propose a new method to track players using a 3D particle filter guided by the time-situation graph, in order to perform player tracking that is robust to occlusion in a soccer image sequence. A drawback of the conventional particle filter method is that it is difficult to rediscover players once they are lost in an image sequence. Thus, we represent the position information of two or more players as the time-situation graph beforehand. Then, by running the particle filter guided by this graph, the incorrect detection of players can be greatly reduced and the players can be robustly tracked even when occlusion occurs. As a result, the tracking accuracy was improved by 7.15 points in comparison with the conventional method.

14:00-14:30, Paper WePSBT2.26
Invariant Signatures for Omnidirectional Visual Place Recognition and Robot Localization in Unknown Environments
Marie, Romain	Univ. de Picardie Jules Verne
Labbani-Igbida, Ouiddad	Univ. of Picardie Jules Verne
Mouaddib, El Mustapha	MIS Lab. Univ. de Picardie Jules Verne
Keywords: Features and Image Descriptors, Vision for Robotics Abstract: The paper introduces a novel approach to place representation for robot localization and mapping. It uses classical invariance theory while proposing an adaptive kernel to omnidirectional images and exploiting only the main significant visual information in the images. The approach is validated in real world robot exploration and localization and compared to color histograms.

14:00-14:30, Paper WePSBT2.27
Covariance Profiles: A Signature Representation for Object Sets
Kolar Rajagopal, Anoop	Indian Inst. of Science
Mitra, Adway	Indian Inst. of Science
Bonde, Ujwal	Indian Inst. of Science
Bhattacharyya, Chiranjib	Indian Inst. of Science, Bangalore, Karnataka, 560012
Kalpathi, Ramakrishnan	Indian Inst. of Science, Bangalore
Keywords: Features and Image Descriptors, Classification and Clustering, Pattern Recognition for Surveillance and Security Abstract: We consider the problem of extracting a signature representation of similar entities employing covariance descriptors. Covariance descriptors can efficiently represent objects and are robust to scale and pose changes. We posit that covariance descriptors corresponding to similar objects share a common geometrical structure which can be extracted through joint diagonalization. We term this diagonalizing matrix the Covariance Profile (CP). CP can be used to measure the distance of a novel object to an object set through the diagonality measure. We demonstrate how CP can be employed on images as well as videos, for applications such as face recognition and object-track clustering.

14:00-14:30, Paper WePSBT2.28
Towards a Robust Hand-Eye Calibration Using Normal Flows
Hui, Tak-Wai	The Chinese Univ. of Hong Kong
Chung, Chi-kit Ronald	The Chinese Univ. of Hong Kong
Keywords: Motion, Tracking and Video Analysis, Vision for Robotics Abstract: Calibrating hand-eye geometry is often based on explicit feature correspondences. This article presents an alternative method that uses the apparent flow induced by the motion of the camera to achieve self-calibration. To make the method more robust against noise, the strategy is to use the orientation of the normal flow field, which is more noise-immune, to recover first the direction component of the hand-eye geometry. Outliers in the extracted flow data are identified using some intrinsic properties of the flow field together with the partially recovered hand-eye geometry. The final complete solution is refined using a robust process. The proposed method can also be used for determining the relative geometry of multiple cameras without demanding overlap in the visual fields of the cameras. Experimental results on synthetic data and real image data are shown to illustrate the feasibility of the method.

14:00-14:30, Paper WePSBT2.29
Visual Description and Recognition of Mechanical Tools with a Silhouette-Based Approach
Pazzaglia, Fabio	Univ. of Florence, Italy
Colombo, Carlo	Univ. of Florence
Keywords: Features and Image Descriptors, 2D/3D Object Detection and Recognition Abstract: In this paper we propose an original framework for the description and the subsequent recognition of objects of limited size. Although of general applicability, the framework is presented here as a way to trace different yet similar metal tools employed in the mechanical constructions industry. For the purpose of object description, time-varying silhouettes of the object are acquired under turntable motion and collated into a single image. The resulting footprint matrix represents the object in both a compact and effective way. Visual matching of footprint matrices is carried out with a computationally efficient algorithm that is organized into three distinct levels so as to benefit from a progressive suppression of the irrelevant information.

14:00-14:30, Paper WePSBT2.30
Pose Estimation from Minimal Dual-Receiver Configurations
Burgess, Simon	Lund Univ.
Kuang, Yubin	Lund Univ.
Astroem, Kalle	Lund Univ.
Keywords: Low-Level Vision, Speech and Audio Analysis, Scene Understanding Abstract: Using multiple receivers (microphones or antennas) in a rigid configuration, such as on a smartphone, it is possible to measure time difference of arrival to the receivers. This in turn can be used to determine the direction to the transmissions, if there are at least three receivers. When using two receivers it can be used to determine the angle to the transmissions relative to the line through the two receivers. In this paper we study three minimal problems for pose using such data: (i) determine position and orientation using five transmissions, (ii) determine position and orientation using four transmissions and known 'down' direction and (iii) determine position using three transmissions and known orientation. Numerically stable solvers are implemented. An experimental validation of the solvers is performed on simulated data.
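
The two-receiver case described in this abstract can be sketched as follows. The sketch assumes a far-field (plane-wave) source and a known propagation speed; the function name `angle_from_tdoa`, its sign convention and the default speed of sound are illustrative assumptions rather than details taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed default)


def angle_from_tdoa(delta_t, baseline, speed=SPEED_OF_SOUND):
    """Angle (radians) between the receiver baseline and a far-field source.

    delta_t  : arrival time at receiver B minus arrival time at receiver A (s)
    baseline : distance between receivers A and B (m)
    The angle is measured between the direction to the source and the
    vector pointing from B to A, using cos(theta) = speed * delta_t / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    ratio = speed * delta_t / baseline
    # Measurement noise can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.acos(ratio)
```

With a zero time difference the source lies broadside to the pair, so `angle_from_tdoa(0.0, 0.1)` returns pi/2. A single pair only constrains the source to a cone around the baseline, which is consistent with the abstract's remark that a full direction needs at least three receivers.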

14:00-14:30, Paper WePSBT2.31
Fast Automatic Saliency Map Driven Geometric Active Contour Model for Color Object Segmentation
Nguyen, Tran Lan Anh	Chonnam National Univ.
Vo, Quang Nhat	Chonnam National Univ.
Kodirov, Elyor	Chonnam National Univ.
Kim, Soohyung	Chonnam National Univ.
Lee, Gueesang	Chonnam National Univ.
Keywords: Segmentation, Color and Texture Abstract: Segmenting objects from color images to obtain useful information has recently been a challenging research area. In this paper, a novel algorithm combining a saliency map with an extension of a geometric active contour model is proposed to automatically segment the object of interest. The saliency map is first generated from the input image by a histogram based contrast method. The most salient regions are then detected as dominant parts of the object. After that, a contour is initialized using the salient regions thus determined. Finally, by applying a geometric active contour model, the contour evolves iteratively to segment object boundaries. Experimental results on various natural scene images show that our proposed method not only replaces manual contour initialization and improves the accuracy and noise robustness of segmentation, but also converges to an optimal solution earlier than recent active contour models.

14:00-14:30, Paper WePSBT2.32
Traffic Accident Risk Analysis Based on Relation of Common Route Models
Er, Uygar	Yildiz Tech. Univ.
Yuksel, Suleyman	Yildiz Tech. Univ.
Akoz, Omer	Yildiz Tech. Univ.
Karsligil, M. Elif	Yildiz Tech. Univ.
Keywords: Motion, Tracking and Video Analysis, Image and Video Understanding, Image and Video Processing Abstract: This paper proposes a novel accident prediction approach based on extracting the relation between vehicles of interest and increasing the risk factor according to anomaly detection in real-time traffic videos. In the learning process of the traffic model at intersections, we detect all trajectories by tracking each vehicle and then group them according to the road model. All trajectories are clustered by a Continuous Hidden Markov Model with Mixture of Gaussians (MoG), and a Common Route Model (CRM) is found for each group of trajectories. After extracting all CRMs and defining their relations, in the real-time traffic analysis process, the partial motion of the vehicles is evaluated and any anomalies are detected. In this approach, while searching for accident risk, partial trajectories of vehicles are classified to the most similar CRMs. For each source vehicle, risk factors are calculated with target vehicles that are in related CRMs and have a Region of Interest (ROI) intersecting that of the source vehicle. The advantage of this approach is that the system analyzes only vehicles at risk of an accident, which increases the performance of the system. In addition, since CRM information and features such as relations, directions and likelihood in the classification process are learned, anomalies can easily be detected and used as a risk enhancer. Experimental results show that the proposed model has a high prediction rate in real-world accident events.

14:00-14:30, Paper WePSBT2.33
Joint Shot Boundary Detection and Key Frame Extraction
Liu, Xiao	Zhejiang Univ.
Song, Mingli	Zhejiang Univ.
Zhang, Luming	Zhejiang Univ.
Wang, Senlin	Zhejiang Univ.
Bu, Jiajun	Zhejiang Univ.
Chen, Chun	Zhejiang Univ.
Tao, Dacheng	Nanyang Tech. Univ.
Keywords: Stereo and Image-Based Modeling, Motion, Tracking and Video Analysis Abstract: Representing a video by a set of key frames is useful for efficient video browsing and retrieval, but key frame extraction remains a challenge in the computer vision field. In this paper, we propose a joint framework to integrate both shot boundary detection and key frame extraction, wherein three probabilistic components are taken into account, i.e. the prior of the key frames, the conditional probability of shot boundaries and the conditional probability of each video frame. Thus key frame extraction can be treated as a Maximum A Posteriori problem, which can be solved by adopting an alternating strategy. Experimental results show that the proposed method preserves the scene level structure and extracts key frames that are representative and discriminative.

14:00-14:30, Paper WePSBT2.34
Detecting Occlusion Boundaries Via Saliency Network
Chen, Dapeng	Xi'an Jiaotong Univ.
Yuan, Zejian	Xi'an Jiaotong Univ.
Zhang, Geng	Xi'an Jiaotong Univ.
Zheng, Nanning	Xi'an Jiaotong Univ.
Keywords: Motion, Tracking and Video Analysis, Occlusion and Shadow Detection, Segmentation, Color and Texture Abstract: In this paper, we address the problem of detecting occlusion boundaries from video sequences. We build a bi-directed graph whose nodes are line fragments extracted from superpixel edges. Based on the graph, we compute a global occlusion saliency map by integrating motion, shape and topology cues into the framework of Saliency Network. Furthermore, with the structural information generated from the network, the property of structural consistency is proposed to prune the graph and refine the saliency map. Finally, we train a classifier to detect occlusion fragments combining the global saliency value and local edge strength. The detector outperforms the state-of-the-art on the benchmark of Stein and Hebert by improving average precision to 0.80.

14:00-14:30, Paper WePSBT2.35
Visual Cortex Inspired Features for Object Detection in X-Ray Images
Schmidt-Hackenberg, Ludwig	Univ. of Kaiserslautern
Yousefi, Mohammad Reza	Univ. of Kaiserslautern
Breuel, Thomas	Univ. of Kaiserslautern
Keywords: Features and Image Descriptors, 2D/3D Object Detection and Recognition, Image and Video Understanding Abstract: Visual cortex inspired features mimic what we know of the brain's visual cortex, which is still the best existing object detection system regarding speed and accuracy. For this paper we benchmarked two prominent implementations of these features, Mutch and Lowe's SLF-HMAX and Pinto et al.'s V1-like, against the popular local invariant features SIFT and PHOW in combination with the bag of visual words approach. The benchmark task was the detection of illicit objects in X-ray images of luggage. X-ray inspection is one of the main means of preventing the transport of illicit objects into sensitive areas. The visual cortex inspired features outperformed the conventional features, probably owing to the textureless nature of X-ray images and the encoding of geometric information.

14:00-14:30, Paper WePSBT2.36
3D Facial Expression Recognition Via Multiple Kernel Learning of Multi-Scale Local Normal Patterns
Li, Huibin	Ec. Centrale de Lyon
Chen, Liming	Ec. Centrale de Lyon
Huang, Di	Beihang Univ.
Wang, Yunhong	Beihang Univ.
Morvan, Jean-Marie	Univ. Lyon 1, Inst. Camille Jordan, 43 blvd du 11Novembr
Keywords: Features and Image Descriptors, Classification and Clustering, Biometrics Abstract: In this paper, we propose a fully automatic approach for person-independent 3D facial expression recognition. In order to extract discriminative expression features, each aligned 3D facial surface is compactly represented as multiple global histograms of local normal patterns from multiple normal components and multiple binary encoding scales, namely Multi-Scale Local Normal Patterns (MS-LNPs). 3D facial expression recognition is finally carried out by modeling multiple kernel learning (MKL) to efficiently embed and combine these histogram based features. By using the SimpleMKL algorithm with the chi-square kernel, we achieved an average recognition rate of 80.14% based on a fair experimental setup. To the best of our knowledge, our method outperforms most of the state-of-the-art ones.

14:00-14:30, Paper WePSBT2.37
Stereo-Based Tracking of Multiple Overlapping Persons
Satake, Junji	Toyohashi Univ. of Tech.
Miura, Jun	Toyohashi Univ. of Tech.
Keywords: Motion, Tracking and Video Analysis Abstract: This paper describes a method of tracking multiple persons with occlusions using stereo. We previously developed an accurate and stable tracking method using overlapping silhouette templates which considers how persons overlap in the image. It achieved fast tracking by using an approximated likelihood map based on kernel density estimation. The method, however, treated only two overlapping persons. In this paper, we extend the method to a more general situation and propose an improved approximation applicable to the case where three or more persons overlap. Experimental results show that the proposed method can track persons stably even when three persons overlap in the image.

14:00-14:30, Paper WePSBT2.38
Automatic Face Annotation by Multilinear AAM with Missing Values
Feng, Zhenhua	Univ. of Surrey
Kittler, Josef	Univ. of Surrey
Christmas, William	Univ. of Surrey
Wu, Xiaojun	Jiangnan Univ.
Pfeiffer, Sebastian	Goethe Univ. Frankfurt/Main
Keywords: Features and Image Descriptors, Segmentation, Color and Texture, 2D/3D Object Detection and Recognition Abstract: It has been shown that multilinear subspace analysis is a powerful tool to overcome difficulties posed by viewpoint, illumination and expression variations in the Active Appearance Model (AAM). However, the Higher Order Singular Value Decomposition (HOSVD) in multilinear analysis requires training samples to build the training tensor, which must include face images under all different variations. It is hard to obtain such a complete training tensor in practical applications. In this paper, we propose a multilinear AAM which can be generated from an incomplete training tensor using Multilinear Subspace Analysis with Missing Values (M2SA). Also, the 2D appearance is used for training the appearance tensor directly to reduce the memory requirements. Experimental results on the Multi-PIE face database show the efficiency of the proposed method.

14:00-14:30, Paper WePSBT2.39
Anomaly Detection with Spatio-Temporal Context Using Depth Images
Ma, Xiaolin	Nanjing Univ.
Lu, Tong	Nanjing Univ.
Xu, Feiming	Nanjing Univ.
Su, Feng	Nanjing Univ.
Keywords: Motion, Tracking and Video Analysis, Stereo and Image-Based Modeling Abstract: A novel statistical framework for modeling the intrinsic structure of crowded scenes and detecting abnormal activities is presented. The proposed framework essentially turns the complex anomaly detection process into two parts: motion pattern representation and spatio-temporal context modeling. We propose a new 4D spatio-temporal hypervolume representation by integrating the depth constraints to enrich motion information. When detecting abnormal behaviors from crowded scenes, we divide the hypervolume into local blocks and construct environmental contexts by coupling their spatio-temporal correlations together with the co-occurrence probabilities. As a result, statistical deviations can be detected as abnormal events. Experiments on a new depth image dataset composed of four crowded scene categories show that our spatio-temporal framework offers promising results in real-life crowded scenes with complex activities.

14:00-14:30, Paper WePSBT2.40
Pedestrian Lane Detection for Assistive Navigation of Blind People
Le, Manh Cuong	Univ. of Wollongong
Phung, Son Lam	Univ. of Wollongong
Bouzerdoum, Abdesselam	Univ. of Wollongong
Keywords: 2D/3D Object Detection and Recognition, Detection, Separation and Segmentation, Pattern Recognition for Surveillance and Security Abstract: Navigating safely in outdoor environments is a challenging activity for vision-impaired people. This paper is a step towards developing an assistive navigation system for the blind. We propose a robust method for detecting the pedestrian marked lanes at traffic junctions. The proposed method includes two stages: regions of interest (ROI) extraction and lane marker verification. The ROI extraction is performed by using colour and intensity information. A probabilistic framework employing multiple geometric cues is then used to verify the extracted ROI. The experimental results have shown that the proposed method is robust under challenging illumination conditions and obtains superior performance compared to the existing methods.

14:00-14:30, Paper WePSBT2.41
Video Object Segmentation by Clustering Region Trajectories
Zhang, Geng	Xi'an Jiaotong Univ.
Yuan, Zejian	Xi'an Jiaotong Univ.
Chen, Dapeng	Xi'an Jiaotong Univ.
Liu, Yuehu	Xi'an Jiaotong Univ.
Zheng, Nanning	Xi'an Jiaotong Univ.
Keywords: Motion, Tracking and Video Analysis, Image and Video Processing, Image and Video Understanding Abstract: We propose a novel approach to segment the moving objects in video clips. First, we introduce a region trajectory generation method based on watershed transform and graph clustering. Region trajectory has the advantage that it preserves object boundaries and it is able to compactly represent the video. Second, we employ a spectral embedding framework to cluster the region trajectories into objects. Affinities between region trajectories are computed based on their motion similarities. Finally, we introduce the background prior and foreground topology in the discretization procedure, in order to achieve consistent segmentation. Our results are insensitive to the number of eigenvectors selected. We validate our method on a challenging dataset and provide statistical comparison with state-of-the-art trajectory clustering methods.

14:00-14:30, Paper WePSBT2.42
Removal of Dust Artifacts from Focal Stack Image Sequences
Li, Chen	Zhejiang Univ.
Zhou, Kun	Zhejiang Univ.
Lin, Stephen	Microsoft Res. Asia
Keywords: Physics-Based Vision, Low-Level Vision Abstract: We propose a technique for removing the appearance of sensor dust in a focal stack image sequence captured with multiple focus settings. Our method is based on the key observation that sensor dust artifacts shift in image position with respect to focus setting, which allows scene information occluded by dust in one image to be inferred from other images in the focal stack. To deal with complications arising from differences in local defocus blur among the images, we analyze the relative blur among corresponding image regions in detecting and removing dust artifacts. Our results show improvements over the state-of-the-art technique for automatic removal of sensor dust.

14:00-14:30, Paper WePSBT2.43
Visual Saliency: A Manifold Way of Perception
Zhu, Hao	Beijing Normal Univ.
Han, Biao	Shanghai Univ.
Ruan, Xiang	Omron Corporation
Keywords: Cognitive and Embodied Vision, Low-Level Vision, Feature Reduction and Manifold Learning Abstract: Visual saliency plays an important role in the human visual system (HVS) since it is indispensable for object detection and recognition. A bottom-up saliency model is proposed, following the manifold characteristic of the HVS previously developed for understanding the HVS mechanism. The saliency of a given location in the visual field is defined as the power of the feature responses after dimensionality reduction with manifold learning for sparse representation of the raw input. This saliency definition also explains that the HVS can suppress the response to redundant patterns and excite the response to attended patterns. Experiments show that the resulting saliency model produces better predictions of human eye fixations on two datasets than four previously proposed bottom-up saliency detectors.

14:00-14:30, Paper WePSBT2.44
Binary Invariant Cross Color Descriptor Using Galaxy Sampling
Huang, Guo-Hao	National Chung-Hsing Univ.
Huang, Chun-Rong	National Chung-Hsing Univ.
Keywords: Features and Image Descriptors Abstract: In this paper, we propose a new descriptor which is computed by comparing invariant cross color channels of pairs of points in the local patch. To efficiently obtain the sampled pairs of points, a galaxy sampling pattern is proposed. As shown in the experiments, our descriptor using invariant cross color channels and the galaxy sampling achieves the best performance in most cases with only a slight increase in computation time.

14:00-14:30, Paper WePSBT2.45
Dissociating Rigid and Articulated Motion for Hand Tracking
Martínez, Oriol	Univ. Pompeu Fabra
Cirujeda, Pol	Univ. Pompeu Fabra
Ferraz, Luis	Univ. Pompeu Fabra
Binefa, Xavier	Univ. Pompeu Fabra
Keywords: Motion, Tracking and Video Analysis, Segmentation, Color and Texture Abstract: In this paper we present a novel template tracking method for articulated objects, specifically hands. The template is defined as a level set function and our goal is to distinguish between the rigid motion and the non-linear movement of its articulations. For the rigid motion estimation, at each frame, we use a combination of optical flow and Monte Carlo sampling over the affine Lie group, Aff(2). In order to cope with the articulated parts we evolve the template using a region based active contour. Finally, we show that the use of Monte Carlo methods over Aff(2) allows the estimation of the rigid motion avoiding the noise introduced by the articulated parts.

14:00-14:30, Paper WePSBT2.46
Stereo Matching on Low Intensity Quantization Images
Lin, Huei-Yung	National Chung Cheng Univ.
Keywords: Stereo and Image-Based Modeling, Features and Image Descriptors Abstract: This paper addresses the problem of how the image intensity quantization affects the stereo matching algorithms. We compute the disparity using the stereo images represented by various intensity quantization levels. It is shown that, depending on the stereo matching algorithms, even the image pairs with low intensity quantization are able to produce fairly good disparity results. Experiments on Middlebury datasets demonstrate that the global algorithms such as GC and BP are suitable for stereo matching using low intensity quantization images.


WeCT1	Main Hall
Medical Registration and Spatio-Temporal Analysis	Regular Session
Chair: Kensaku, Mori	Nagoya Univ.

14:30-14:50, Paper WeCT1.1
Spectral Clustering to Model Deformations for Fast Multimodal Prostate Registration
Mitra, Jhimli	Univ. de Bourgogne
Kato, Zoltan	Univ. of Szeged
Ghose, Soumya	Univ. de Bourgogne
Sidibe, Desire	Univ. de Bourgogne
Martí, Robert	Univ. of Girona
Llado, Xavier	Univ. of Girona
Oliver, Arnau	Univ. of Girona
Vilanova, Joan C.	Girona Magnetic Res. Center
Meriaudeau, Fabrice	LE2I
Keywords: Medical Image Analysis and Registration Abstract: This paper proposes a method to learn deformation parameters off-line for fast multimodal registration of ultrasound and magnetic resonance prostate images during ultrasound guided needle biopsy. The registration method involves spectral clustering of the deformation parameters obtained from a spline-based nonlinear diffeomorphism between training magnetic resonance and ultrasound prostate images. The deformation models built from the principal eigen-modes of the clusters are then applied on a test magnetic resonance image to register with the test ultrasound prostate image. The deformation model with the least registration error is finally chosen as the optimal model for deformable registration. The rationale behind modeling deformations is to achieve fast multimodal registration of prostate images while maintaining registration accuracies which is otherwise computationally expensive. The method is validated for 25 patients each with a pair of corresponding magnetic resonance and ultrasound images in a leave-one-out validation framework. The average registration accuracies i.e. Dice similarity coefficient of 0.927 ± 0.025, 95% Hausdorff distance of 5.14 ± 3.67 mm and target registration error of 2.44 ± 1.17 mm are obtained by our method with a speed-up in computation time by 98% when compared to Mitra et al. [7].
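Editorial illustration (not part of the paper): the Dice similarity coefficient reported above is a standard overlap measure, 2|A ∩ B| / (|A| + |B|), between two binary segmentations. The sketch below is a minimal pure-Python version; the function name `dice_coefficient`, the flat 0/1 mask representation, and the sample masks are assumptions made only for this example.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    Masks are flat sequences of 0/1 values of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    # Count positions where both masks are foreground.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Sum of the two foreground sizes.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Made-up masks: a registered prostate mask versus a reference mask.
registered = [0, 1, 1, 1, 0, 0]
reference = [0, 0, 1, 1, 1, 0]
print(dice_coefficient(registered, reference))
```

A value of 1 means the two masks coincide and 0 means they do not overlap at all.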

14:50-15:10, Paper WeCT1.2
Learning-Based Deformable Registration Using Weighted Mutual Information
Lu, YongNing	National Univ. of Singapore
Liao, Rui	Siemens Corp. Corp. Res. and Tech.
Zhang, Li	Siemens Corp. Corp. Res. and Tech.
Sun, Ying	National Univ. of Singapore
Chefd'hotel, Christophe	Siemens Corp. Corp. Res. and Tech.
Ong, Sim Heng	National Univ. of Singapore
Keywords: Medical Image Analysis and Registration, Statistical, Syntactic and Structural Pattern Recognition Abstract: Deformable registration of multi-modality medical images remains a challenging research topic. The incorporation of prior information on the expected joint distribution has been shown to noticeably improve registration accuracy and robustness. However, direct application of the learned joint histogram makes the algorithm sensitive to the difference between the training data and the test image. This paper explores a more intrinsic intensity mapping relationship using normalized pointwise mutual information, and integrates the learned relationship into the conventional mutual information (MI) to formulate a weighted mutual information (WMI). We further derive a closed-form expression of the first variation of WMI for non-parametric deformable registration in a variational framework. Experimental results show that the proposed WMI is more accurate and robust than MI, and is less sensitive to discrepancies between the training and test images, compared to the method in [1]. In addition, our prior can be learned from only a subset of the image, and can be object-specific.

15:10-15:30, Paper WeCT1.3
Automated Detection of Skeletal Muscle Twitches from B-Mode Ultrasound Images: An Application to Motor Neuron Disease
Harding, Peter John	Manchester Metropolitan Univ.
Hodson-Tole, Emma	Manchester Metropolitan Univ.
Cunningham, Ryan James	Manchester Metropolitan Univ.
Costen, Nicholas Paul	Manchester Metropolitan Univ.
Loram, Ian	Manchester Metropolitan Univ.
Keywords: Medical Image Analysis and Registration, Computer-Aided Diagnosis and Surgery Abstract: The presence of involuntary muscle twitches is a diagnostic indicator of neurodegenerative diseases, such as motor neurone disease (MND), but current methods of twitch detection are invasive and pose potential risks to patients. We present a method by which standard B-mode ultrasound can be used to automatically identify muscle twitches similar to those found in patients with MND. The results of initial experimentation are presented, and we show that this technique can detect muscle twitches with a high degree of accuracy.

15:30-15:50, Paper WeCT1.4
Neural-Net Classification for Spatio-Temporal Descriptor Based Depression Analysis
Joshi, Jyoti	Univ. of Canberra
Dhall, Abhinav	Australian National Univ.
Goecke, Roland	Univ. of Canberra
Breakspear, Michael	Univ. of New South Wales
Parker, Gordon	Univ. of New South Wales
Keywords: Human Computer Interaction, Gesture and Behavior Analysis Abstract: Depression is a severe psychiatric disorder. Despite the high prevalence, current clinical practice depends almost exclusively on self-report and clinical opinion, risking a range of subjective biases. This paper focuses on depression analysis based on visual cues from facial expressions and upper body movements. The proposed diagnostic support system is based on computing spatio-temporal features from video sequences. Space Time Interest Points are computed for the videos for analysing the upper body movements and a temporal visual words dictionary is learned from them. Intra-facial muscle movement is captured by computing an LBP-TOP based codebook. Various neural-net classifiers are explored and compared with an SVM. The approach is evaluated on real-world clinical data from interactive interviews with depressed and healthy subjects.

15:50-16:10, Paper WeCT1.5
Feature-Aligned 4D Spatiotemporal Image Registration
Xu, Huanhuan	LSU
Chen, Peizhi	Xiamen Univ.
Yu, Wuyi	LSU
Sawant, Amit	UT Southwestern Medical Center
Iyengar, S.S	Florida International
Li, Xin	Louisiana State Univ.
Keywords: Medical Image Analysis and Registration Abstract: In this paper, we develop a feature-aware 4D spatiotemporal image registration method. Our model is based on a 4D (3D+time) free-form B-spline deformation model which has both spatial and temporal smoothness. We first introduce an automatic 3D feature extraction and matching method based on an improved 3D SIFT descriptor, which is scale- and rotation-invariant. Then we use the results of feature correspondence to guide an intensity-based deformable image registration. Experimental results show that our method can lead to smooth temporal registration with good matching accuracy; therefore this registration model is potentially suitable for dynamic tumor tracking.


WeCT2	Multi-Purpose Hall
Applications of Image and Signal Processing	Regular Session
Chair: Sebe, Nicu	Univ. of Trento
Co-Chair: Ho, Shen-Shyang	Nanyang Tech. Univ.

14:30-14:50, Paper WeCT2.1
An Effective Vortex Detection Approach for Velocity Vector Field
Ho, Shen-Shyang	Nanyang Tech. Univ.
Keywords: Remote Sensing, Statistical, Syntactic and Structural Pattern Recognition, Detection, Separation and Segmentation Abstract: Detection of vortices, which are rotating flow features, is an important task to identify, analyze, and understand flow dynamics in a fluid. For example, it can be used to accurately tag nonrigid salient rotation features from a large number of wind vectors captured by orbiting satellites for hurricane research. In this paper, we describe in detail a general vortex detection algorithm motivated by the Hough transform and flow vector tree structures. The vortex detection algorithm allows one to find the exact vortex center efficiently if it is in the vector field. A special case of the algorithm has been successfully applied to cyclone annotation and tracking using QuikSCAT satellite wind measurements.

14:50-15:10, Paper WeCT2.2
Robust Detection of Single-Frame Defects in Archived Film
Wechtitsch, Stefanie	JOANNEUM Res.
Fassold, Hannes	JOANNEUM Res.
Schallauer, Peter	JOANNEUM Res. Forschungsgesellschaft m.b.H.
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing Abstract: The main issue in current algorithms for the detection of single-frame defects like dust, dirt and blotches in archived film is the significant number of false alarms due to motion compensation errors and film grain. This typically leads to disturbing artifacts occurring in the subsequent defect removal process. We propose a novel algorithm for the detection of single-frame defects which addresses this issue. A continuous response map is defined which indicates the probability of a pixel being part of a single-frame defect. Furthermore, we introduce the novel co-support which is applied to the response map for both noise suppression and spatial completion of potential defects. Finally, regions which are likely affected by motion compensation errors are identified (e.g., by analyzing the motion field) and the single-frame-defect response map is damped in these regions. Experimental results show that the proposed method outperforms state-of-the-art algorithms in terms of accuracy and robustness to motion estimation issues.

15:10-15:30, Paper WeCT2.3
Coupling Reduced Models for Optimal Motion Estimation
Drifi, Karim	INRIA
Herlin, Isabelle	INRIA
Keywords: Image and Video Processing, Remote Sensing Abstract: The paper discusses the issue of motion estimation by image assimilation in numerical models, based on Navier-Stokes equations. In such a context, model reduction is an attractive approach that is used to decrease cost in memory and computation time. The reduced models are obtained from a Galerkin projection on a small-size subspace, defined by its orthogonal basis. Long temporal image sequences may then be processed by a sliding-windows method. On the first window, a fixed basis is considered to define the reduced model. On the next one, a Principal Order Decomposition is applied, in order to define a basis that is simultaneously small-size and adapted to the studied image data. Results are given on synthetic data and quantified according to state-of-the-art methods. Application to satellite images is provided to demonstrate the potential of the approach.

15:30-15:50, Paper WeCT2.4
Detection of Bubbles As Concentric Circular Arrangements
Strokina, Nataliya	Lappeenranta Univ. of Tech. (LUT)
Matas, Jiri	CTU Prague
Eerola, Tuomas	Lappeenranta Univ. of Tech.
Lensu, Lasse	Lappeenranta Univ. of Tech.
Kalviainen, Heikki	Lappeenranta Univ. of Tech.
Keywords: Detection, Separation and Segmentation, Enhancement, Restoration and Filtering Abstract: The paper proposes a method for the detection of bubble-like transparent objects with multiple interfaces in a liquid. Depending on the lighting conditions, bubble appearance varies significantly, including contrast reversal and multiple inter-reflections. We formulate the bubble detection problem as the detection of Concentric Circular Arrangements (CCA). The CCAs are recovered in a hypothesize-optimize-verify framework. The hypothesis generation proceeds by sampling from the partially linked components of the non-maximum suppressed responses of oriented ridge filters followed by CCA parameter estimation. Parameter optimization is carried out by minimizing a novel cost-function by the simplex method. The verification decision is a function of the achieved cost function. The proposed method for bubble detection showed good performance in an industrial application requiring estimation of gas volume in pulp suspension, achieving 1.5% mean absolute relative error.

15:50-16:10, Paper WeCT2.5
Bayesian Separation of Wind Power Generation Signals
Yoon, Ji Won	IBM Res.
Fusco, Francesco	IBM
Wurst, Michael	IBM Res.
Keywords: Detection, Separation and Segmentation, Statistical, Syntactic and Structural Pattern Recognition, Machine Learning and Data Mining Abstract: One of the most challenging and important tasks for electricity grid operators and utility companies is to predict and estimate the precise energy consumption and generation of individual households which have their own decentralized production system. This is an under-determined source separation problem since only the difference between energy production and consumption in the micro-generation system is visible. Therefore, we present a latent variable model with a polynomial regression form for the separation and then the model is used by several statistical algorithms to explore the underlying energy consumption and production from the differenced signals. In order to efficiently find global optima of the hidden variables of the model, we develop a source separation algorithm based on the Integrated Nested Laplace Approximation (INLA).


WeCT3	Room 101+102
Matching	Regular Session
Chair: Saito, Hideo	Keio Univ.
Co-Chair: Tao, Dacheng	Nanyang Tech. Univ.

14:30-14:50, Paper WeCT3.1
Image Contextual Representation and Matching through Hierarchies and Higher Order Graphs
Rubio Ballester, Jose C.	Computer Vision Center. UAB
Serrat, Joan	Univ. Autonoma de Barcelona
López Peña, Antonio M.	CVC-UAB
Paragios, Nikos	Ec. Centrale de Paris
Keywords: Segmentation, Color and Texture, Scene Understanding, Pattern Recognition for Search, Retrieval and Visualization Abstract: We present a region matching algorithm which establishes correspondences between regions from two segmented images. An abstract graph-based representation conceals the image in a hierarchical graph, exploiting the scene properties at two levels. First, the similarity and spatial consistency of the image semantic objects is encoded in a graph of commute times. Second, the cluttered regions of the semantic objects are represented with a shape descriptor. Many-to-many matching of regions is especially challenging due to the instability of the segmentation under slight image changes, and we explicitly handle it through high order potentials. We demonstrate the matching approach applied to images of world famous buildings, captured under different conditions, showing the robustness of our method to large variations in illumination and viewpoint.

14:50-15:10, Paper WeCT3.2
Manhattan-Pyramid Distance: A Solution to an Anomaly in Pyramid Matching by Minimization
Chauhan, Aneesh	Univ. of Aveiro
Seabra Lopes, Luís	Univ. of Aveiro
Keywords: 2D/3D Object Detection and Recognition, Features and Image Descriptors, Vision for Robotics Abstract: In the field of computer vision, pyramid matching by minimization has gained increasing popularity. This paper points out and discusses an inherent anomaly in pyramid matching by minimization that can affect the performance of classification approaches based on this type of matching. As a solution, a new multi-resolution measure, called Manhattan-Pyramid Distance (MPD), is proposed. Systematic evaluations are carried out at the task of instance-based object classification on four object image datasets. Results show that MPD improves object classification performance with respect to a standard approach based on pyramid matching by minimization.

15:10-15:30, Paper WeCT3.3
Robust and Accurate Multi-View Reconstruction by Prioritized Matching
Ylimäki, Markus	Univ. of Oulu
Kannala, Juho	Univ. of Oulu
Holappa, Jukka	Univ. of Oulu
Heikkilä, Janne	Univ. of Oulu
Brandt, Sami Sebastian	Univ. of Copenhagen
Keywords: Stereo and Image-Based Modeling, Vision for Graphics, Computational Photography Abstract: This paper proposes a prioritized matching approach for finding corresponding points in multiple calibrated images for multi-view stereo reconstruction. The approach takes a sparse set of seed matches between pairs of views as input and then propagates the seeds to neighboring regions by using a prioritized matching method which expands the most promising seeds first. The output of the method is a three-dimensional point cloud. Unlike previous correspondence growing approaches, our method allows the best-first matching principle to be used in the generic multi-view stereo setting with an arbitrary number of input images. Our experiments show that matching the most promising seeds first provides very robust point cloud reconstructions efficiently with just a single expansion step. A comparison to the current state-of-the-art shows that our method produces reconstructions of similar quality but significantly faster.

15:30-15:50, Paper WeCT3.4
Combining Color and Geometry for Local Image Matching
Mazin, Baptiste	Telecom ParisTech, LTCI CNRS
Delon, Julie	CNRS, Telecom ParisTech
Gousseau, Yann	Telecom Paris
Keywords: Features and Image Descriptors, Segmentation, Color and Texture Abstract: This paper introduces a generic way to incorporate color information into local, SIFT-like descriptors, in view of image matching. First, a new color descriptor, relying on local hue histograms, is introduced. Second, we describe a procedure permitting the automatic setting of matching parameters when matching images using both geometric and color information. Experiments on a color image database show that our SIFT+Hue combination performs significantly better than classical color descriptors.

15:50-16:10, Paper WeCT3.5
Evaluation of Local Detectors and Descriptors for Fast Feature Matching
Miksik, Ondrej	Brno Univ. of Tech.
Mikolajczyk, Krystian	Univ. of Surrey
Keywords: Features and Image Descriptors, Low-Level Vision, Vision for Robotics Abstract: Local feature detectors and descriptors are widely used in many computer vision applications and various methods have been proposed during the past decade. There have been a number of evaluations focused on various aspects of local features, matching accuracy in particular; however, there have been no comparisons considering the accuracy and speed trade-offs of recent extractors such as BRIEF, BRISK, ORB, MRRID, MROGH and LIOP. This paper provides a performance evaluation of recent feature detectors and compares their matching precision and speed in a randomized kd-trees setup, as well as an evaluation of binary descriptors with efficient computation of Hamming distance.
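Editorial illustration (not part of the paper): binary descriptors such as BRIEF, BRISK and ORB are compared with the Hamming distance, i.e. the number of differing bits, which reduces to an XOR followed by a bit count. The sketch below assumes descriptors are stored as Python `bytes`; the helper names `hamming_distance` and `nearest_descriptor` and the sample byte strings are hypothetical, and the linear scan is only a stand-in for whatever matching structure a real system would use.

```python
def hamming_distance(desc_a: bytes, desc_b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptors must have the same length")
    # XOR leaves a 1 wherever the bits differ; count those 1s byte by byte.
    return sum(bin(x ^ y).count("1") for x, y in zip(desc_a, desc_b))


def nearest_descriptor(query: bytes, candidates: list) -> int:
    """Index of the candidate closest to the query (brute-force linear scan)."""
    return min(range(len(candidates)),
               key=lambda i: hamming_distance(query, candidates[i]))


# Made-up 2-byte descriptors; real binary descriptors are typically 32 or 64 bytes.
query = b"\xff\x00"
database = [b"\x00\x00", b"\xfe\x00", b"\xff\xff"]
print(hamming_distance(b"\x0f\x00", b"\x00\x01"))  # prints 5
print(nearest_descriptor(query, database))         # prints 1
```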


WeCT4	Hall 200
Surveillance and Security	Regular Session
Chair: Taniguchi, Rin-ichiro	Kyushu Univ.
Co-Chair: Hlavac, Vaclav	Czech Tech. Univ. Faculty of Electrical Engineering

14:30-14:50, Paper WeCT4.1
Learning to Count with Regression Forest and Structured Labels
Fiaschi, Luca	HCI/IWR Heidelberg
Nair, Rahul	HCI/IWR Heidelberg
Koethe, Ullrich	HCI/IWR Heidelberg
Hamprecht, Fred Andreas	Univ. of Heidelberg
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Pattern Recognition for Surveillance and Security Abstract: Following [Lempitsky and Zisserman, 2010], we seek to count objects by integrating over an object density map that is predicted from an input image. In contrast to that work, we propose to estimate the object density map by averaging over structured, namely patch-wise, predictions. Using an ensemble of randomized regression trees that use dense features as input, we obtain results that are of similar quality, at a fraction of the training time, and with low implementation effort. An open source implementation will be provided in the framework of http://ilastik.org.
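Editorial illustration (not part of the paper): counting by density estimation rests on the fact that a density map built from unit-mass kernels integrates to the object count. The sketch below only shows that property on a synthetic map built from dot annotations; it does not implement the regression forest, and the names `gaussian_kernel`, `density_map`, `count_objects`, the kernel size and the dot positions are all assumptions made for this example.

```python
import math


def gaussian_kernel(radius, sigma):
    """Square Gaussian kernel normalized so that its entries sum to 1."""
    weights = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                for dx in range(-radius, radius + 1)]
               for dy in range(-radius, radius + 1)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def density_map(height, width, dots, radius=2, sigma=1.0):
    """Place one unit-mass kernel at each annotated dot given as (row, col)."""
    kernel = gaussian_kernel(radius, sigma)
    density = [[0.0] * width for _ in range(height)]
    for r, c in dots:
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                rr, cc = r + dy, c + dx
                # Mass that would fall outside the image is dropped.
                if 0 <= rr < height and 0 <= cc < width:
                    density[rr][cc] += kernel[dy + radius][dx + radius]
    return density


def count_objects(density):
    """Integrate (sum) the density map to obtain the estimated count."""
    return sum(sum(row) for row in density)


dots = [(5, 5), (10, 12), (15, 7)]  # hypothetical object centers
print(round(count_objects(density_map(20, 20, dots)), 6))
```

In a learned system the density map would be predicted from image features rather than built from annotations; the summation step stays the same.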

14:50-15:10, Paper WeCT4.2
Edge Detection for Facial Images under Noisy Conditions
Madabusi, Sudarshan	International Inst. of Information Tech. Hyderabad
Gangashetty, Suryakanth	International Inst. of Information Tech. Gachibowli, H
Keywords: Pattern Recognition for Surveillance and Security, Image and Video Processing, Enhancement, Restoration and Filtering Abstract: Edge detection is an important image processing technique used in face recognition. In this paper we propose an edge detection algorithm suitable for extracting edge maps from facial images under noisy conditions. We employ several improvised techniques like composite gradient variation measures, weighted wide convolution kernels and dual maxima detection. This helps in removing spurious edge points along stronger edges, extracting fine edge lines along facial boundaries and curvatures, and suppressing noise. This is demonstrated by extracting edges from facial images degraded by Gaussian, speckle, and salt-and-pepper noise. For performance measurement and evaluation, Pratt's Figure of Merit and Signal to Noise statistics are used.

15:10-15:30, Paper WeCT4.3
Person Re-Identification Using View-Dependent Score-Level Fusion of Gait and Color Features
Kawai, Ryo	Osaka Univ.
Makihara, Yasushi	The Inst.
Hua, Chunsheng	Osaka Univ.
Iwama, Haruyuki	Osaka Univ.
Yagi, Yasushi	Osaka Univ.
Keywords: Pattern Recognition for Surveillance and Security, Motion, Tracking and Video Analysis Abstract: This paper describes a method for person re-identification across multiple non-overlapping cameras using both gait and color features. Because a single color feature is insufficient to distinguish persons with similar color clothes, a spatio-temporal histogram of oriented gradients is employed as a gradient-based shape and motion gait feature to discriminate such persons in conjunction with a background edge attenuation technique. However, since the gait feature is more sensitive to view differences than the color feature, a view-dependent score-level fusion framework adaptively controls the weights of the gait and color features. Experiments across seven non-overlapping cameras confirm the effectiveness of the proposed method.

15:30-15:50, Paper WeCT4.4
Camera Pan / Tilt Control with Multiple Trackers
Li, Yiming	Univ. of California at Riverside
Bhanu, Bir	Univ. of California
Keywords: Pattern Recognition for Surveillance and Security Abstract: In this paper, we consider multi-camera tracking and active camera control (pan and tilt). An auction mechanism from economics is developed to choose the best available camera. By modeling the camera bids with prior knowledge of the camera homographies, the system can "think" ahead to perform necessary panning or tilting operations. The uncertainties of homographies are considered inherently in the metrics used for computing camera bids. Further, to have a better tracking result, we use multiple trackers simultaneously. The trackers are rectified periodically based on the previous auction results. The proposed approach is evaluated in a real-world camera network.

15:50-16:10, Paper WeCT4.5
Joining Feature-Based and Similarity-Based Pattern Description Paradigms for Object Detection
Martelli, Samuele	Istituto Italiano di Tecnologia
Cristani, Marco	Univ. of Verona
Bazzani, Loris	Univ. of Verona
Tosato, Diego	Univ. of Verona
Murino, Vittorio	Univ. of Verona
Keywords: Pattern Recognition for Surveillance and Security, 2D/3D Object Detection and Recognition Abstract: In pattern recognition, two of the main paradigms for describing objects are the feature-based and (dis)similarity-based ones. The former aims at encoding tangible features that characterize the object per se. The latter gives a relational description of the object, considering other entities as reference. In this paper, we propose the marriage of these two philosophies: this is possible by considering an object as described by local parts. Actually, object parts can be described by features, and structural information can be extracted considering the similarities between parts. We cast our intuition in an object detection framework, where we select HOG as feature and simple Euclidean distances as similarity encoders. The results show how this hybrid representation outperforms the single paradigms, demonstrating their complementarity.


WeCT5	Hall 300
Learning and Boosting	Regular Session
Chair: Tamaki, Toru	Hiroshima Univ.
Co-Chair: Mori, Greg	Simon Fraser Univ.

14:30-14:50, Paper WeCT5.1
Multiple Instance Real Boosting with Aggregation Functions
Hajimirsadeghi, Hossein	Simon Fraser Univ.
Mori, Greg	Simon Fraser Univ.
Keywords: Statistical, Syntactic and Structural Pattern Recognition, Classification and Clustering, Machine Learning and Data Mining Abstract: We introduce a boosting framework for multiple instance learning (MIL) with varied aggregation of instances. In this framework, a diverse set of aggregation functions can be used to refine the notion of a positive bag for multiple instance learning. We investigate the effect of a wide range of orness in aggregation, using ordered weighted averaging. Thus, we obtain a new notion of a positive bag, which can represent different levels of ambiguity. We evaluate the performance of the proposed algorithm on popular MIL datasets. The experimental results show that this algorithm outperforms the standard MILBoost algorithm.
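Editorial illustration (not part of the paper): ordered weighted averaging (OWA) sorts its inputs in descending order and takes a weighted sum, so the weight vector alone decides whether the operator behaves like max, mean or min. Orness, in Yager's standard definition, measures how close the operator is to max. The sketch below implements only these two textbook formulas; the function names and the instance scores are made up, and nothing here reproduces the boosting procedure itself.

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def orness(weights):
    """Orness of an OWA weight vector: 1 for max, 0 for min, 0.5 for the mean."""
    n = len(weights)
    if n < 2:
        raise ValueError("orness needs at least two weights")
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)


# Hypothetical instance scores inside one bag.
instance_scores = [0.2, 0.9, 0.5]
max_like = [1.0, 0.0, 0.0]
mean_like = [1.0 / 3.0] * 3
min_like = [0.0, 0.0, 1.0]

for w in (max_like, mean_like, min_like):
    print(orness(w), owa(instance_scores, w))
```

With max-like weights a single high-scoring instance makes the bag positive, while lower orness requires more instances to agree.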

14:50-15:10, Paper WeCT5.2
Weighted Conditional Mutual Information Based Boosting for Classification of Imbalanced Datasets
Utasi, Ákos	MTA-SZTAKI
Keywords: Machine Learning and Data Mining, Classification and Clustering Abstract: This paper addresses the problem of binary classifier learning when the training data is imbalanced, i.e. the samples of the two classes have significantly different cardinality. We investigate two different cost-sensitive approaches in the conditional mutual information (CMI) based weak classifier selection procedure using histogram descriptors. The first method uses CMI for classifier selection, and cost factors are utilized in the construction of the final boosted classifier using support vector machine learning. In the second approach these costs are incorporated into the classifier selection step by weighting the CMI (wCMI). We evaluate the proposed methods in object recognition and detection tasks using two popular histogram-like descriptors. Extensive experiments showed that the proposed methods provide efficient tools to address both problems.

15:10-15:30, Paper WeCT5.3
Compressed Submanifold Multifactor Analysis with Adaptive Factor Structures
Luu, Khoa	Carnegie Mellon Univ.
Savvides, Marios	Carnegie Mellon Univ.
Bui, Tien D.	Concordia Univ.
Suen, Ching Y	Concordia Univ.
Keywords: Feature Reduction and Manifold Learning, Biometrics, Statistical, Syntactic and Structural Pattern Recognition Abstract: This paper proposes a novel approach named Compressed Submanifold Multifactor Analysis (CSMA) to concisely and precisely deal with multifactor analysis. Compared to the state-of-the-art MPCA method that loses the original local geometry structures of input factors due to the averaging process, our proposed approach can preserve their original geometry. In addition, the fast low-rank approximation of a given dataset with multifactors is also provided using Random Projection to reduce space requirements and give a more transparent representation. Our proposed method achieves both the fastest running time and the highest accuracy in the face recognition problem compared to MPCA and some other multifactor based methods on two challenging databases, i.e. CMU-MPIE and Extended YALE-B.

15:30-15:50, Paper WeCT5.4
Unsupervised Skeleton Learning for Manifold Denoising
Sun, Ke	Univ. of Geneva
Bruno, Eric	Knowledge Discovery & Data Mining, Firmenich S.A.
Marchand-Maillet, Stephane	Univ. of Geneva
Keywords: Machine Learning and Data Mining, Feature Reduction and Manifold Learning, Classification and Clustering Abstract: The representative samples can be pictured as the skeleton of a point cloud. We learn a discrete distribution defined over all samples, so that these skeleton points have large probabilities and the outliers have probabilities close to zero. The basic assumption is that any observation is generated from a nearby skeleton point. The learning objective is to minimize the communication cost from a random sample to its generation source. Experiments show that the learned distribution highlights a compact size of key positions. It is further applied to a denoising task as an indirect method of evaluation. The clustering structures of image datasets are best preserved among several methods investigated.

15:50-16:10, Paper WeCT5.5
Efficient and Accurate Learning of Bayesian Networks Using Chi-Squared Independence Tests
Tang, Yi	Center of Excellence for Document Analysis and Recognition, Univ.
Srihari, Sargur	Univ. at Buffalo, The State Univ. of New York
Keywords: Machine Learning and Data Mining, Handwriting Recognition, Statistical, Syntactic and Structural Pattern Recognition Abstract: Bayesian network structure learning is a well-known NP-complete problem, whose solution is of importance in machine learning. Two algorithms are proposed, both of which assess dependency between variables using the chi-squared test of independence between pairs of variables and the log-likelihood evaluation criterion for the network. The first determines the effect of adding a potential edge (in both directions) on the log-likelihood. The second uses K-L divergence to determine direction, and edges to be included are determined by thresholding normalized chi-squared statistics. Experiments on multinomial data show that the proposed algorithms are more efficient and accurate than an optimized branch and bound algorithm, and human experts.
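
The pairwise dependency test mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only computes the plain (unnormalized) Pearson chi-squared statistic for each pair of discrete variables and keeps the pairs above a threshold as candidate undirected edges. The function names `chi_squared_statistic` and `candidate_edges`, the data layout, and the threshold value are assumptions made for illustration.

```python
from collections import Counter


def chi_squared_statistic(xs, ys):
    """Pearson chi-squared statistic for independence of two discrete variables."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed joint counts
    row = Counter(xs)              # marginal counts of the first variable
    col = Counter(ys)              # marginal counts of the second variable
    stat = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def candidate_edges(data, threshold):
    """Return variable pairs whose statistic exceeds the (assumed) threshold.

    data maps a variable name to its list of observed values.
    """
    names = sorted(data)
    edges = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if chi_squared_statistic(data[u], data[v]) > threshold:
                edges.append((u, v))
    return edges


if __name__ == "__main__":
    samples = {
        "a": [0, 0, 1, 1],
        "b": [0, 0, 1, 1],   # copies "a", so strongly dependent on it
        "c": [0, 1, 0, 1],   # independent of "a" and "b" in this sample
    }
    # 3.84 is roughly the 5% critical value for one degree of freedom.
    print(candidate_edges(samples, threshold=3.84))
```

Edge direction and the log-likelihood scoring described in the abstract are deliberately left out of this sketch.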


WePBT6	Room 201+202
Poster Session (09, 10)	Poster Session


WeDT1	Main Hall
Geometric/Photometric Registration and Inpainting	Regular Session
Chair: Mase, Kenji	Nagoya Univ.
Co-Chair: Arth, Clemens	Graz Univ. of Tech.

16:50-17:10, Paper WeDT1.1
Multi-Camera Rectification Using Linearized Trifocal Tensor
Zilly, Frederik Leonhard	Fraunhofer Heinrich Hertz Inst.
Riechert, Christian	Fraunhofer Inst. for Telecommunications - Heinrich Hertz Inst.
Müller, Marcus	Fraunhofer Inst. for Telecommunications - Heinrich Hertz Ins
Waizenegger, Wolfgang	Fraunhofer Heinrich Hertz Inst.
Sikora, Thomas	Tech. Univ. Berlin, Communication Systems Group
Kauff, Peter	Fraunhofer Inst. for Telecommunications - Heinrich Hertz Ins
Keywords: Geometric and Photometric Registration, Stereo and Image-Based Modeling, Motion, Tracking and Video Analysis Abstract: Multi-camera systems such as linear camera arrays are commonly used to capture content for multi-baseline stereo estimation, view generation for auto-stereoscopic displays, or similar tasks. However, even after a careful mechanical alignment, residual vertical disparities and horizontal disparity offsets impair further processing steps. In consequence, the multi-camera content needs to be rectified on a common baseline. The trifocal tensor represents the geometry between three cameras and hence is a helpful tool to calibrate a multi-camera system, and to derive rectifying homographies. Against this background we propose a new method for a robust estimation of the trifocal tensor specialized for linear camera arrays and subsequent rectifying homography computation based on feature point triplets.

17:10-17:30, Paper WeDT1.2
Contrast-Enhancing Seam Detection and Blending Using Graph Cuts
Weibel, Thomas	Fraunhofer ITWM
Rösch, Ronald	Fraunhofer ITWM
Daul, Christian	Centre de Recherche en Automatique de Nancy / ENSEM
Wolf, Didier	Centre de Recherche en Automatique de Nancy
Keywords: Geometric and Photometric Registration Abstract: During the image placement onto the compositing surface (mosaic), stitching algorithms try to minimize visual inconsistencies (texture discontinuities), seam-induced color gradients, and blurry image regions. These problems are classically processed separately. In this contribution, we describe a two-step graph-cut algorithm that addresses these issues jointly. In the first step, optimal seam locations are detected while maximizing the contrast of the mosaic. The second step corrects vignetting and exposure differences along the previously determined seams while retaining the contrast, hue and saturation of the images. Qualitative and quantitative results demonstrate that the proposed method produces exposure-corrected mosaics that are locally sharper than the individual images from the sequence.

17:30-17:50, Paper WeDT1.3
Fast Global Non-Rigid Registration for Mosaic Creation
Castanheira de Souza, Rafael Henrique	Tokyo Inst. of Tech.
Okutomi, Masatoshi	Tokyo Inst. of Tech.
Torii, Akihiko	Tokyo Inst. of Tech.
Keywords: Geometric and Photometric Registration, Inpainting and Superimposing Abstract: In this work we present a new registration method designed for image mosaicing of scenes containing nonplanar surfaces. A global non-rigid deformation model may have a high number of parameters. To tackle this problem, we apply a registration framework combining a deformation model based on triangular meshes with feature point registration. Since we formulate the energy function for the global registration as a quadratic function, the optimum of the energy can be found very efficiently by solving sparse linear systems. We performed experiments to demonstrate the precision and efficiency of our method.

17:50-18:10, Paper WeDT1.4
An Intrinsic Coordinate System for 3D Face Registration
Koppen, Paul	Univ. of Surrey
Chan, Chi Ho	Univ. of Surrey
Christmas, William	Univ. of Surrey
Kittler, Josef	Univ. of Surrey
Keywords: Geometric and Photometric Registration, Medical Image Analysis and Registration, Biometrics Abstract: We present a method to estimate, based on the horizontal symmetry, an intrinsic coordinate system of faces scanned in 3D. We show that this coordinate system provides an excellent basis for subsequent landmark positioning and model-based refinement such as Active Shape Models, outperforming other --explicit-- landmark localisation methods including the commonly-used ICP+ASM approach.

18:10-18:30, Paper WeDT1.5
Image Inpainting Considering Symmetric Patterns
Kawai, Norihiko	Nara Inst. of Science and Tech.
Yokoya, Naokazu	Nara Inst. of Science and Tech.
Keywords: Inpainting and Superimposing, Image and Video Processing, Enhancement, Restoration and Filtering Abstract: This paper proposes a novel image inpainting method to remove undesired objects in an image. Conventionally, missing regions are filled in using similar textures in an image as exemplars. However, unnatural textures are often generated due to the paucity of available samples. In this study, we take into account symmetric transformation of texture patterns to increase exemplars. To generate plausible textures in missing regions with variously transformed patterns, we employ two approaches: (1) we use spatial coherence of texture patterns when searching for similar patterns, and (2) we define a new degree of confidence of exemplars for determining pixel values. The effectiveness of the proposed method is demonstrated by comparing results by three methods.


WeDT2	Multi-Purpose Hall
Texture and Saliency	Regular Session
Chair: Ding, Xiaoqing	Tsinghua Univ.
Co-Chair: Zhang, Hong	Univ. of Alberta

16:50-17:10, Paper WeDT2.1
Saliency Detection Via Divergence Analysis: A Unified Perspective
Huang, Jia-Bin	Univ. of Illinois, Urbana-Champaign
Ahuja, Narendra	UIUC
Keywords: Scene Understanding, 2D/3D Object Detection and Recognition, Image and Video Understanding Abstract: A number of bottom-up saliency detection algorithms have been proposed in the literature. Since these have been developed from intuition and principles inspired by psychophysical studies of human vision, the theoretical relations among them are unclear. In this paper, we present a unifying perspective. Saliency of an image area is defined in terms of divergence between certain feature distributions estimated from the central part and its surround. We show that various, seemingly different saliency estimation algorithms are in fact closely related. We also discuss some commonly used center-surround selection strategies. Experiments with two datasets are presented to quantify the relative advantages of these algorithms.
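
The center-surround definition in this abstract can be sketched in a few lines. The sketch below is one possible instance only, assuming scalar features in [0, 1], fixed-width histograms, and Kullback-Leibler divergence as the divergence measure; the abstract does not say which features or divergences the paper uses, and the function names are hypothetical.

```python
import math


def histogram(values, bins, lo=0.0, hi=1.0, eps=1e-6):
    """Normalized histogram with a small floor so that no bin is exactly zero."""
    counts = [eps] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def center_surround_saliency(center, surround, bins=8):
    """Saliency of a region as the divergence between center and surround features."""
    return kl_divergence(histogram(center, bins), histogram(surround, bins))


if __name__ == "__main__":
    bright_center = [0.9] * 20
    dark_surround = [0.1] * 60
    # A center that differs from its surround scores high; an identical one scores zero.
    print(center_surround_saliency(bright_center, dark_surround))
    print(center_surround_saliency(dark_surround, dark_surround))
```

Swapping `kl_divergence` for another divergence changes the saliency measure without touching the rest of the sketch.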

17:10-17:30, Paper WeDT2.2
Visual Saliency and Categorisation of Abstract Images
Laine-Hernandez, Mari	Aalto Univ.
Kinnunen, Juha Teemu Ensio	Aalto Univ.
Kamarainen, Joni-Kristian	Lappeenranta Univ. of Tech.
Lensu, Lasse	Lappeenranta Univ. of Tech.
Kalviainen, Heikki	Lappeenranta Univ. of Tech.
Oittinen, Pirkko	Aalto Univ.
Keywords: 2D/3D Object Detection and Recognition, Scene Understanding, Features and Image Descriptors Abstract: The visual object categorisation problem has attracted significant attention during the last ten years, and the two main hypotheses adopted by virtually all methods are i) detection of visual saliency and ii) bag-of-visual-words based categorisation. It is, however, difficult to verify the hypotheses with humans, since many recordings, such as gaze fixation locations, represent processing after recognition, and the object classification task is too easy for humans, producing no information about uncertainties in the cognitive process. To the authors' best knowledge, this work is the first attempt to study the main hypotheses and state-of-the-art algorithms for visual object categorisation with abstract images. These images inhibit rapid recognition and cause the observers' opinions to differ substantially when assigning the images to "similar categories". Our work reveals interesting findings: the performance of state-of-the-art methods drops to almost pure chance while human observers remain surprisingly consistent.

17:30-17:50, Paper WeDT2.3
Combining Local and Global Correlation for Texture Description
Hong, Xiaopeng	Univ. of Oulu
Zhao, Guoying	Univ. of Oulu
Pietikäinen, Matti	Univ. of Oulu
Chen, Xilin	Inst. of Computing Tech. Chinese Academy of Sciences
Keywords: Features and Image Descriptors, Segmentation, Color and Texture, Statistical, Syntactic and Structural Pattern Recognition Abstract: Local Binary Patterns (LBPs) and Covariance Matrices (CovMs) are two popular kinds of texture descriptors. However, the local correlation captured by LBPs and the global correlation captured by CovMs could not be directly combined to achieve enhanced discriminative power. This paper develops a powerful descriptor, named COV-LBP. Firstly, we propose a variant of LBPs on Euclidean space, named the LBP Difference feature (LBPD), which can be used to calculate any statistical image description. LBPD reflects how far one LBP lies from the LBP mean of a given image. It is simple, descriptive, rotation invariant, and computationally efficient. Secondly, by applying LBPD to multiple commonly used elementary features mapped from the original image, we provide a bank of discriminative features that can optionally be used in CovMs. Consequently, the information of LBPs and CovMs is embedded in a unified COV-LBP descriptor. Experimental results show that COV-LBP achieves promising performance on public texture classification databases.

17:50-18:10, Paper WeDT2.4
A Comprehensive Benchmark of Local Binary Pattern Algorithms for Texture Retrieval
Doshi, Niraj P.	Loughborough Univ.
Schaefer, Gerald	Loughborough Univ.
Keywords: Features and Image Descriptors, Multimedia Analysis, Indexing and Retrieval, Segmentation, Color and Texture Abstract: Image retrieval is a well-researched area and is often based on integrating various kinds of image features. Apart from colour features, texture features are deemed crucial for successful image retrieval. Local binary pattern (LBP) based texture algorithms have gained significant popularity in recent years and have been shown to be useful for a variety of tasks. In this paper, we provide a comprehensive benchmark of LBP based methods for texture retrieval. In particular, 16 LBP variants, leading to 38 different texture descriptors, are evaluated on a large dataset of more than 6000 texture images. Interestingly, conventional LBP features are shown to work best, while almost all LBP methods are shown to significantly outperform other texture methods including Tamura, co-occurrence and Gabor features.
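
For readers unfamiliar with the conventional LBP feature that this benchmark takes as its baseline, the following is a minimal sketch of the basic 3x3, 8-neighbour operator and its 256-bin histogram. It assumes a grayscale image stored as a list of rows; the neighbour ordering and the function names are illustrative choices, and none of the 16 benchmarked variants is reproduced here.

```python
def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code of the pixel at row r, column c."""
    center = image[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the center sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


if __name__ == "__main__":
    patch = [
        [9, 1, 1],
        [1, 5, 1],
        [1, 1, 1],
    ]
    # Only the top-left neighbour is at least as bright as the center.
    print(lbp_code(patch, 1, 1))
```

In a retrieval setting, two images would then be compared through a distance between their histograms; the choice of distance is outside this sketch.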

18:10-18:30, Paper WeDT2.5
Distinctive Texture Features from Perspective-Invariant Keypoints
Gossow, David	Tech. Univ. München
Weikersdorfer, David	Tech. Univ. München
Beetz, Michael	Tech. Univ. München
Keywords: Features and Image Descriptors, Low-Level Vision, Vision for Robotics Abstract: In this paper, we present an algorithm to detect and describe features of surface textures, similar to SIFT and SURF. In contrast to approaches solely based on the intensity image, it uses depth information to achieve invariance with respect to arbitrary changes of the camera pose. The algorithm works by constructing a scale space representation of the image which conserves the real-world size and shape of texture features. In this representation, keypoints are detected using a Difference-of-Gaussian response. Normal-aligned texture descriptors are then computed from the intensity gradient, normalizing the rotation around the normal using a gradient histogram. We evaluate our approach on a dataset of planar textured scenes and show that it outperforms SIFT and SURF under large viewpoint changes.


WeDT3	Room 101+102
Feature Description	Regular Session
Chair: El-Saban, Motaz	Microsoft Res.
Co-Chair: Crandall, David	Indiana Univ.

16:50-17:10, Paper WeDT3.1
A Fully Affine Invariant Feature Detector
Li, Wei	Chinese Acad. of Sciences
Shi, Zelin	Chinese Acad. of Sciences
Yin, Jian	The Res. Inst. on General Development of Air Force
Keywords: Features and Image Descriptors, 2D/3D Object Detection and Recognition, Detection, Separation and Segmentation Abstract: This paper proposes a Fully Affine Invariant Feature (FAIF) detector which is based on the covariance matrices of Maximally Stable Extremal Regions (MSER). The covariance matrix can be interpreted as an isotropy measure of an image region. A local anisotropic image region can be regarded as an affine-transformed isotropic image region. Therefore, the affine deformation of an MSER can be estimated from its covariance matrix. Filters must be compatible with image regions. An anisotropic image region should be smoothed by an elliptical Gaussian filter, which is difficult to implement. In order to use circular Gaussian filters, FAIF transforms an anisotropic image region into an isotropic one by rotating and compressing an ellipse into a circle. The fully affine invariant features are detected on the isotropic image regions by the Scale Invariant Feature Transform (SIFT) algorithm. Experimental results show that FAIF yields many more matching features than state-of-the-art approaches. FAIF functions perfectly even in extreme conditions.

17:10-17:30, Paper WeDT3.2
Contour Detection Via Random Forest
Zhang, Chao	Shanghai Jiao Tong Univ.
Ruan, Xiang	Omron Corporation
Zhao, Yuming	Shanghai Jiao Tong Univ.
Yang, Ming-Hsuan	Univ. of California at Merced
Keywords: Segmentation, Color and Texture, Machine Learning and Data Mining Abstract: Contour detection is an important and fundamental problem in computer vision that finds numerous applications. In this paper, we propose a learning algorithm for contour detection via random forest. Visual cues that can be extracted easily and efficiently are integrated to learn a detector where the decision for a contour pixel is made independently via the random forest at each location in the image. We evaluate the proposed algorithm against leading methods in the literature on the Berkeley Segmentation Dataset. Experimental results demonstrate that the proposed contour detection algorithm performs favorably against state-of-the-art methods in terms of speed and accuracy.

17:30-17:50, Paper WeDT3.3
EDVD - Enhanced Descriptor for Visual and Depth Data
Nascimento, Erickson	Univ. Federal de Minas Gerais
Schwartz, William	Federal Univ. of Minas Gerais
Campos, Mario Montenegro Campos	Univ. Federal de Minas Gerais
Keywords: Features and Image Descriptors, Low-Level Vision Abstract: Many problems in computer vision and robotics rely on automatically determining point correspondences from two images. Due to issues such as illumination variations, uncontrolled acquisition conditions and noise, this is a challenging problem. This work presents a method that combines visual and shape information to perform point correspondences which is invariant to rotation and scaling transformations in the image and geometry domains. Experimental results show that our approach is a robust and computationally efficient technique compared with classic descriptors in the literature.

17:50-18:10, Paper WeDT3.4
An Accurate and Contrast Invariant Junction Detector
Xia, Gui-Song	Univ. Paris-Dauphine
Delon, Julie	CNRS, Telecom ParisTech
Gousseau, Yann	Telecom Paris
Keywords: Low-Level Vision, Features and Image Descriptors Abstract: This paper introduces a generic method for the accurate analysis of junctions, relying on a statistical modeling of normalized image gradients. We analyze junctions as local visual events that do not happen by chance under a background model derived from the a-contrario methodology. The method not only provides thresholds for the detection of junctions, but also enables their accurate characterization, including a precise computation of their type, localization, scale and geometrical configuration. The efficiency of the method is evaluated through various experiments.

18:10-18:30, Paper WeDT3.5
Do Humans Fixate on Interest Points?
Ghanem, Bernard	King Abdullah Univ. of Science and Tech.
Dubey, Rachit	Nanyang Tech. Univ.
Dave, Akshat	Nanyang Tech. Univ.
Keywords: Features and Image Descriptors, Low-Level Vision, Performance Evaluation Abstract: Interest point detectors (e.g. SIFT, SURF, and MSER) have been successfully applied to numerous applications in high level computer vision tasks such as object detection and image classification. Despite their popularity, the perceptual relevance of these detectors has not been thoroughly studied. Here, perceptual relevance is meant to define the correlation between these point detectors and free-viewing human fixations on images. In this work, we provide empirical evidence to shed light on the fundamental question: "Do humans fixate on interest points in images?". We believe that insights into this question may play a role in improving the performance of vision systems that utilize these interest point detectors. We conduct an extensive quantitative comparison between the spatial distributions of human fixations and automatically detected interest points on a recently released dataset of 1003 images. This comparison is done at both the global (image) level as well as the local (region) level. Our experimental results show that there exists a weak correlation between the spatial distributions of human fixation and interest points.


WeDT4	Hall 200
Video Analysis and Surveillance	Regular Session
Chair: Shimosaka, Masamichi	The Univ. of Tokyo
Co-Chair: Chen, Hua-Tsung	National Chiao Tung Univ.

16:50-17:10, Paper WeDT4.1
Acceleration of Vanishing Point-Based Line Sampling Scheme for People Localization and Height Estimation Via 3D Sampling
Lo, Kuo-Hua	National Chiao Tung Univ.
Wang, Chih-Jung	National Chiao Tung Univ.
Chuang, Jen-Hui	National Chiao Tung Univ.
Chen, Hua-Tsung	National Chiao Tung Univ.
Keywords: Image and Video Understanding, Pattern Recognition for Search, Retrieval and Visualization, Pattern Recognition for Surveillance and Security Abstract: With the popularity of vision-based camera surveillance, research on people localization has attracted much attention. In this paper, we propose an efficient and effective system capable of locating a dense crowd of people in real time, using multiple cameras. For each camera view, sample lines of foreground objects, originating from a vanishing point, are projected onto the ground plane. Ground regions containing a high density of projected lines are then used to find people locations. Improving on previous works, the people localization approach proposed in this paper need not project all foreground pixels of all views to multiple reference planes or compute pairwise intersections of projected sample lines at different heights, resulting in a significant improvement in computational efficiency. Furthermore, people's heights can also be estimated. Experimental results on real surveillance scenes show that the proposed approach achieves comparable accuracy in people localization at five times the computing speed.

17:10-17:30, Paper WeDT4.2
Consistent Collective Activity Recognition with Fully Connected CRFs
Kaneko, Takuhiro	The Univ. of Tokyo
Shimosaka, Masamichi	The Univ. of Tokyo
Odashima, Shigeyuki	The Univ. of Tokyo
Fukui, Rui	The Univ. of Tokyo
Sato, Tomomasa	The Univ. of Tokyo
Keywords: Image and Video Understanding, Pattern Recognition for Surveillance and Security, Detection, Separation and Segmentation Abstract: Recognizing collective human activities has gained attention. Collective activities include queueing in a line, talking together and waiting by an intersection. It is often hard to differentiate between these activities only by the appearance of the individual. Hence, recent works exploit the contextual information of other people nearby. However, these works do not take enough care of the spatial and temporal consistency in a group (e.g. considering the consistency in only adjacent areas). To solve the problem, this paper describes a method to integrate individual recognition results via fully connected CRFs, which assume relationships among all the people. Unlike previous methods that determine the range of human relations by heuristics, our method describes the "multi-scale" relationships in position, size, movement and time sequence as flexible potentials, so as to handle various types, sizes and shapes of groups. Experimental results show that our method outperforms state-of-the-art methods.

17:30-17:50, Paper WeDT4.3
Automatic Heterogeneous Video Summarization in Temporal Profile
Cai, Hongyuan	Indiana Univ. Purdue Univ. Indianapolis
Zheng, Jiang Yu	Indiana Univ. Purdue Univ. Indianapolis
Keywords: Image and Video Understanding, Image and Video Processing, Multimedia Analysis, Indexing and Retrieval Abstract: Numerous videos are uploaded to video websites; most of them employ several kinds of camera operations for expanding the FOV, emphasizing events, and expressing cinematic effects. To generate a profile of heterogeneous types of videos, an automatic video profiling method has been proposed to include both spatial and temporal information in a 2D image scroll. In this paper, we propose a unified scheme to segment video clips and sections, compute the major optical flow and convergence factor, and then sample the video volume across the major flow for the profiles. A video profile shows an intrinsic scene space less influenced by the camera ego-motion. It is also a fine-grained temporal representation that can be displayed in a video track to guide access to frames and to help video editing, visual archiving of environments, video retrieval, and browsing.

17:50-18:10, Paper WeDT4.4
Annotating Videos from the Web Images
Wang, Han	Beijing Lab. of Intelligent Information Technology, School of
Wu, Xinxiao	Beijing Lab. of Intelligent Information Technology, School of
Jia, Yunde	Beijing Inst. of Tech.
Keywords: Multimedia Analysis, Indexing and Retrieval Abstract: In this paper, we propose a generic framework for annotating videos based on web images. To greatly reduce expensive human annotation of a tremendous quantity of videos, it is necessary to transfer the knowledge learned from web images, which are a rich source of information, to videos. A discriminative structural model is proposed to transfer knowledge from web images (auxiliary domain) to the video (target domain) by jointly modeling the interaction between video labels and web image attributes. The advantage of our framework is that it allows us to infer video labels using information from different domains, i.e. the video itself and image attributes. Experimental results on the UCF Sports Action Dataset demonstrate that it is effective to use knowledge gained from web images for video annotation.

18:10-18:30, Paper WeDT4.5
Foreground Detection Via Robust Low Rank Matrix Factorization Including Spatial Constraint with Iterative Reweighted Regression
Guyon, Charles	Lab. MIA (Mathematics, Image and Applications) - Univ. of
Bouwmans, Thierry	Lab. of Mathematics, Images and Applications, Univ. of
Zahzah, Elhadi	Lab. MIA (Mathematics, Image and Applications) - Univ. of
Keywords: Detection, Separation and Segmentation, Motion, Tracking and Video Analysis, Image and Video Processing Abstract: Foreground detection is the first step in a video surveillance system to detect moving objects. Robust Principal Components Analysis (RPCA) provides a nice framework to separate moving objects from the background. The background sequence is modeled by a low-rank subspace that can gradually change over time, while the moving foreground objects constitute the correlated sparse outliers. In this paper, we propose to use a low-rank matrix factorization with an IRLS (iteratively reweighted least squares) scheme and to address the spatial connectivity of the pixels in the minimization process. Experimental results on the Wallflower and I2R datasets show the pertinence of the proposed approach.


WeDT5	Hall 300
Categorization and Learning	Regular Session
Chair: Kambhamettu, Chandra	Univ. of Delaware
Co-Chair: Sukthankar, Rahul	Google

16:50-17:10, Paper WeDT5.1
A Pyramid Nearest Neighbor Search Kernel for Object Categorization
Cheng, Hong	Univ. of Electronic Science and Tech. of China
Yu, Rongchao	Univ. of Electronic Science and Tech. of China
Liu, Zicheng	Microsoft
Liu, Yiguang	Sichuan Univ.
Keywords: Features and Image Descriptors, 2D/3D Object Detection and Recognition Abstract: Nearest-Neighbor based Image Classification (NNIC) has drawn considerable attention in the past several years because it does not require classifier training. Similar to an orderless Bag-of-Feature image representation, the traditional NNIC ignores global geometric correspondence. In this paper, we present a technique to exploit the global geometric correspondence in a nearest neighbor classifier framework. We divide an image into increasingly fine sub-regions like the Spatial Pyramid Matching (SPM) approach, and introduce a Pyramid Nearest Neighbor Search (PNNS) kernel by measuring the search similarity between a local descriptor and a feature set in each pyramid window. Instead of using a fixed weighting as in SPM, the weights of the pyramid windows are learned from training data in a class-dependent manner. By doing so, we learn a class-specific geometric correspondence. Finally, an optimal nearest neighbor classifier framework is developed to incorporate the kernel functions over different pyramid windows. The proposed approach is evaluated on a number of public datasets, and the experiment results show that our approach significantly outperforms existing techniques.

17:10-17:30, Paper WeDT5.2
Spatial Graphlet Matching Kernel for Recognizing Aerial Image Categories
Zhang, Luming	Zhejiang Univ.
Song, Mingli	Zhejiang Univ.
Li, Sun	Zhejiang Univ.
Tao, Dacheng	Nanyang Tech. Univ.
Bu, Jiajun	Zhejiang Univ.
Chen, Chun	Zhejiang Univ.
Liu, Xiao	Zhejiang Univ.
Wang, Yinting	Zhejiang Univ.
Keywords: Features and Image Descriptors, Feature Reduction and Manifold Learning Abstract: This paper presents a method for recognizing aerial image categories based on matching graphlets (i.e., small connected subgraphs) extracted from aerial images. By constructing a Region Adjacency Graph (RAG) to encode the geometric property and the color distribution of each aerial image, we cast aerial image category recognition as RAG-to-RAG matching. Based on graph theory, RAG-to-RAG matching is conducted by matching all their respective graphlets. Towards an effective graphlet matching process, we develop a manifold embedding algorithm to transfer different-sized graphlets into equal-length feature vectors and further integrate these feature vectors into a kernel. This kernel is used to train an SVM [8] classifier for aerial image category recognition. Experimental results demonstrate that our method outperforms several state-of-the-art object/scene recognition models.

17:30-17:50, Paper WeDT5.3
Sparse Coding for Histograms of Local Binary Patterns Applied for Image Categorization: Toward a Bag-Of-Scenes Analysis
Paris, Sebastien	Univ. de la Méditerranée
Halkias, Xanadu	Univ. South-Toulon-Var, LSIS/DYNI
Glotin, Herve	LSIS
Keywords: Scene Understanding, Features and Image Descriptors, Classification and Clustering Abstract: In this work, we propose a novel approach for image categorization, which we will refer to as Bag-of-Scenes (BoS). It is based on the association of Sparse coding (Sc) and pooling techniques applied to histograms of multi-scale Local Binary Patterns (LBP) and its improved variant. This approach can be considered as a 2-layer hierarchical architecture. The first layer encodes the general structure of local patches via histograms of LBP, and the second encodes the relationships between pre-analyzed LBP-scenes. Our method outperforms SIFT-based approaches using Sc techniques and can be trained efficiently with a simple linear SVM. Our BoS method achieves 87.02%, 87.71% and 79.05% accuracy on the Scene-15, UIUC-Sport and Caltech101 datasets respectively.

17:50-18:10, Paper WeDT5.4
Online Human-Assisted Learning Using Random Ferns
Villamizar Vergel, Michael	CSIC-UPC
Garrell, Anaís	CSIC-UPC
Sanfeliu, Alberto	Univ. Pol. de Catalunya
Moreno-Noguer, Francesc	CSIC-UPC
Keywords: 2D/3D Object Detection and Recognition, Vision for Robotics Abstract: We present an Online Random Ferns (ORFs) classifier that progressively learns and builds enhanced models of object appearances. During the learning process, we allow human intervention to assist the classifier and discard false positive training samples. The amount of human intervention is minimized and integrated within the online learning, such that complex object appearances can be learned in a few seconds. After the assisted learning stage, the classifier is able to detect the object under severely changing conditions. The system runs at a few frames per second, and has been validated for face and object detection tasks on a mobile robot platform. We show that with minimal human assistance we are able to build a detector robust to viewpoint changes, partial occlusions, varying lighting and cluttered backgrounds.

18:10-18:30, Paper WeDT5.5
K-MLE for Mixtures of Generalized Gaussians
Schwander, Olivier	École Pol.
Nielsen, Frank	Sony Computer Science Lab. Inc
Schutz, Aurélien	Univ. de Bordeaux
Berthoumieu, Yannick	Univ. de Bordeaux
Keywords: Segmentation, Color and Texture, Machine Learning and Data Mining Abstract: We introduce an extension of the k-MLE algorithm, a fast algorithm for learning statistical mixture models relying on maximum likelihood estimators, which allows building mixtures of generalized Gaussian distributions without a fixed shape parameter. This allows us to finely model probability density functions which are made of highly non-Gaussian components. We theoretically prove the convergence of our method and show experimentally that it performs comparably to Expectation-Maximization methods while being more computationally efficient.

Technical Program for Wednesday November 14, 2012