|
MoPT4 |
Poster Session Hall |
MoP4 |
Poster Session |
|
15:00-17:10, Paper MoPT4.1 | |
View Invariant Gait Recognition Using Only One Uniform Model |
Yu, Shiqi | Shenzhen Univ |
Wang, Qing | Shenzhen Univ |
Shen, Linlin | Shenzhen Univ |
Huang, Yongzhen | Inst. of Automation, Chinese Acad. of Sciences |
Keywords: Gait recognition
Abstract: Gait recognition has proved useful for human identification at a distance. However, view variance of gait features is always a great challenge because of the difference in appearance. If the view of the probe is different from that of the gallery, a view transformation model can be employed to convert the gait feature from one view to another. But most existing models need to estimate the view angle first, and work for only one view pair. They cannot convert multi-view data to one specific view efficiently. We employ one deep model based on auto-encoders for view-invariant gait feature extraction. The model can synthesize gait features in a progressive way by stacked multi-layer auto-encoders. The unique advantage is that it can extract view-invariant features from any view using only one model, and view estimation is not needed. The proposed method is evaluated on a large dataset, CASIA Gait Dataset B. The experimental results show that it can achieve state-of-the-art performance, and the improvement is more obvious when the view variance is larger.
|
|
15:00-17:10, Paper MoPT4.3 | |
Learning Shape Variations of Motion Trajectories for Gait Analysis |
Devanne, Maxime | Telecom Lille/LIFL |
Wannous, Hazem | Univ. of Lille1 |
Berretti, Stefano | Univ. of Florence |
Pala, Pietro | Univ. of Firenze |
Daoudi, Mohammed | Télécom Lille/CRIStAL (UMR 9189) |
Del Bimbo, Alberto | Univ. of Florence |
Keywords: Gait recognition, Gesture and Behavior Analysis
Abstract: The analysis of human gait is increasingly investigated due to its wide range of potential applications in various domains, like rehabilitation, deficiency diagnosis, surveillance and movement optimization. In addition, the release of depth sensors offers new opportunities to achieve gait analysis in a non-intrusive context. In this paper, we propose a gait analysis method from depth sequences that analyzes each step separately so as to be robust to gait duration and incomplete cycles. We analyze the shape of the motion trajectory as a signature of the gait and consider shape variations within a Riemannian manifold to learn step models. During classification, the derivation of each performed step is evaluated in an online manner to qualitatively analyze the gait. Experiments are carried out in the context of abnormal gait detection and person re-identification through gait recognition. Results demonstrate the potential of the method in both scenarios.
|
|
15:00-17:10, Paper MoPT4.4 | |
Learning Robust Features for Gait Recognition by Maximum Margin Criterion |
Balazia, Michal | Faculty of Informatics, Masaryk Univ |
Sojka, Petr | Faculty of Informatics, Masaryk Univ |
Keywords: Gait recognition, Pattern Recognition for Surveillance and Security, Biometric systems and applications
Abstract: In the field of gait recognition from motion capture data, designing human-interpretable gait features is a common practice of many fellow researchers. To refrain from ad-hoc schemes and to find maximally discriminative features we may need to explore beyond the limits of human interpretability. This paper contributes to the state-of-the-art with a machine learning approach for extracting robust gait features directly from raw joint coordinates. The features are learned by a modification of Linear Discriminant Analysis with Maximum Margin Criterion so that the identities are maximally separated and, in combination with an appropriate classifier, used for gait recognition. Experiments on the CMU MoCap database show that this method outperforms eight other relevant methods in terms of the distribution of biometric templates in respective feature spaces expressed in four class separability coefficients. Additional experiments indicate that this method is a leading concept for rank-based classifier systems.
|
|
15:00-17:10, Paper MoPT4.5 | |
Leveraging Intra-Class Variations to Improve Large Vocabulary Gesture Recognition |
Conly, Christopher | Univ. of Texas at Arlington |
Dillhoff, Alex | Univ. of Texas at Arlington |
Athitsos, Vassilis | Univ. of Texas at Arlington |
Keywords: Gesture and Behavior Analysis, Human body motion and gesture based interaction, Motion, tracking and video analysis
Abstract: Large vocabulary gesture recognition using a training set of limited size is a challenging problem in computer vision. With few examples per gesture class, researchers often employ exemplar-based methods such as Dynamic Time Warping (DTW). This paper makes two contributions in the area of exemplar-based gesture recognition: 1) it introduces Multiple-Pass DTW (MP-DTW), a method in which scores from multiple DTW passes focusing on different gesture properties are combined, and 2) it introduces a new set of features modeling intra-class variation of several gesture properties that can be used in conjunction with MP-DTW or DTW. We demonstrate that these techniques provide substantial improvement over DTW in both user-dependent and user-independent experiments on American Sign Language (ASL) datasets, even when using noisy data generated by RGB-D skeleton detectors. We further show that using these techniques in a large vocabulary system with a limited training set provides significantly better results compared to Long Short-Term Memory (LSTM) network and Hidden Markov Model (HMM) approaches.
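As background for the exemplar-based matching mentioned in this abstract, the following is a minimal sketch of classic single-pass DTW with nearest-exemplar classification. It is not the authors' MP-DTW or their intra-class variation features; the function names `dtw_distance` and `classify`, the use of 1-D sequences, and the absolute-difference local cost are all assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (illustrative only)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (assumed: absolute difference)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify(query, exemplars):
    """Return the label of the exemplar closest to `query` under DTW.

    `exemplars` is a hypothetical dict mapping label -> list of sequences.
    """
    best_label, best_cost = None, float("inf")
    for label, sequences in exemplars.items():
        for seq in sequences:
            c = dtw_distance(query, seq)
            if c < best_cost:
                best_label, best_cost = label, c
    return best_label
```

With only a few exemplars per class, this kind of matcher needs no training beyond storing the examples, which is one reason exemplar-based methods suit limited training sets.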
|
|
15:00-17:10, Paper MoPT4.6 | |
Human Pose Estimation Based on Human Limbs |
Liang, Guoqiang | Xi'an Jiaotong Univ |
Lan, Xuguang | Xi'an Jiaotong Univ |
Wang, Jiang | Baidu Res |
Zheng, Nanning | Xi'an Jiaotong Univ |
Keywords: Gesture and Behavior Analysis, Image and video analysis and understanding, Deep learning
Abstract: Modeling the relationship among human joints is one of the most important components in human pose estimation. Previous methods usually define this relationship as geometric constraints on the relative location of two neighboring joints. In this definition, the local image appearance of the region connecting two neighboring joints is ignored. In fact, this image appearance, called a human limb, plays an important role in human joint localization in the human visual system. To make full use of this local image appearance, we propose to solve a new task: human limb detection. We combine it with human joint localization in one deep convolutional neural network. After getting coarse results, we employ a graphical model to remove false positive detections. In addition, shallow and deep features are combined in this model. We evaluate our method on the FLIC and LSP datasets. The experimental results show the effectiveness of our method.
|
|
15:00-17:10, Paper MoPT4.7 | |
A Fast and Accurate Motion Descriptor for Human Action Recognition Applications |
Ghorbel, Enjie | IRSEEM and URIA(Mines Douai) |
Boutteau, Remi | Esigelec Irseem |
Boonaert, Jacques | Ec. Des Mines De DOUAI |
Savatier, Xavier | IRSEEM |
Lecoeuche, Stephane | URIA - Mines Douai |
Keywords: Gesture and Behavior Analysis, Image and video analysis and understanding, Human body motion and gesture based interaction
Abstract: With the availability of the recent human skeleton extraction algorithm introduced by Shotton et al., interest in skeleton-based action recognition methods has been renewed. Despite the importance of the low-latency aspect in applications, it can be noted that the majority of recent approaches have not been evaluated in terms of computational cost. In this paper, a novel fast and accurate human action descriptor named Kinematic Spline Curves (KSC) is introduced. This descriptor is built by interpolating the kinematics of joints (position, velocity and acceleration). To overcome the anthropometric and the execution rate variabilities, we respectively propose the use of a skeleton normalization and a temporal normalization. For this purpose, a new temporal normalization method based on the Normalized Accumulated kinetic Energy (NAE) of the human skeleton is suggested. Finally, the classification step is performed using a linear Support Vector Machine (SVM). Experimental results on challenging benchmarks show the efficiency of our approach in terms of recognition accuracy and computational latency.
|
|
15:00-17:10, Paper MoPT4.8 | |
Unsupervised Mouse Behavior Analysis: A Data-Driven Study of Mice Interactions |
Katsageorgiou, Vasiliki-Maria | Istituto Italiano Di Tecnologia |
Zanotto, Matteo | Istituto Italiano Di Tecnologia |
Huang, Huiping | Istituto Italiano Di Tecnologia |
Ferretti, Valentina | Istituto Italiano Di Tecnologia |
Papaleo, Francesco | Istituto Italiano Di Tecnologia (IIT) |
Sona, Diego | Istituto Italiano Di Tecnologia (IIT) |
Murino, Vittorio | Istituto Italiano Di Tecnologia |
Keywords: Gesture and Behavior Analysis, Image and video analysis and understanding, Machine learning and data mining
Abstract: Automatic analysis of rodent behavior has been receiving growing attention in recent years since rodents have been the reference species for many neuroscientific studies, with the social interaction being among the subjects of the most important ones. Systems that are employed in these studies are mainly based on tracking of mice and activity classification through supervised learning methods, trained on datasets manually annotated by experts. In this paper, we introduce a completely unsupervised way of analysing tracking data for the automatic identification of social and non-social behaviors using models capable of spotting regularities in the data. In particular, a mean-covariance Restricted Boltzmann Machine is employed to abstract higher-level behavioral configurations of mice interacting in an arena for a long time.
|
|
15:00-17:10, Paper MoPT4.9 | |
A Novel Fingerprint Classification Method Based on Deep Learning |
Ruxin, Wang | School of Mathematical Science Univ. of Chinese Acad. Of |
Congying, Han | School of Mathematical Science Univ. of Chinese Acad. Of |
Tiande, Guo | School of Mathematical Science Univ. of Chinese Acad. Of |
Keywords: Fingerprint recognition, Classification and clustering, Deep learning
Abstract: Fingerprint classification is an effective technique for reducing the number of candidate fingerprints in the matching stage of an automatic fingerprint identification system (AFIS). In recent years, deep learning has emerged as a technology that has achieved great success in many fields, such as image processing and computer vision. In this paper, we make a preliminary attempt at the traditional fingerprint classification problem based on the new deep neural network method. For the four-class problem, choosing only the orientation field as the classification feature, we achieve 91.4% accuracy using stacked sparse autoencoders (SAE) with three hidden layers on the NIST-DB4 database. Two classification probabilities are then used for fuzzy classification, which can effectively enhance the accuracy of classification. By only adjusting the probability threshold, we obtain classification accuracies of 96.1% (threshold 0.85), 97.2% (threshold 0.90) and 98.0% (threshold 0.95) with a single layer SAE. Applying the fuzzy method, we obtain higher accuracy.
|
|
15:00-17:10, Paper MoPT4.10 | |
Define a Fingerprint Orientation Field Pattern |
Zhang, Ning | Institute of Automation, Chinese Acad. of Sciences |
Zang, Yali | Inst. of Automation, Chinese Acad. of Sciences |
Jia, Xiaofei | Institute of Automation, Chinese Acad. of Sciences |
Yang, Xin | Institute of Automation, Chinese Acad. of Sciences |
Tian, Jie | Inst. of Automation, Chinese Acad. of Sciences |
Keywords: Fingerprint recognition, Forensic biometrics and its applications
Abstract: Orientation Field (OF) is one of the most significant characteristics for distinguishing fingerprint images from non-fingerprint images. An effective definition of the fingerprint OF pattern will not only benefit fingerprint enhancement, but also contribute to latent fingerprint detection and segmentation. The existing fingerprint OF models either require prior knowledge of singular points, or cannot be generalized to all kinds of fingerprint OFs. In this paper, we propose to define the fingerprint OF patterns based on low rank decomposition and sparse coding. Then we apply this proposed method to fingerprint OF recognition and detection. Experimental results prove the effectiveness of our method.
|
|
15:00-17:10, Paper MoPT4.11 | |
Improving Cross Sensor Interoperability for Fingerprint Identification |
Lin, Chenhao | The Hong Kong Pol. Univ |
Kumar, Ajay | The Hong Kong Pol. Univ |
Keywords: Fingerprint recognition, Other Biometric applications
Abstract: Improving the accuracy of matching fingerprint images acquired from two different fingerprint sensors is an important research problem with several promising studies in the literature. Most of these studies focus on sensor interoperability using fingerprints acquired from different kinds of contact-based sensors. However, emerging contactless fingerprint technologies have shown their benefits. This paper investigates the fingerprint sensor interoperability problem using fingerprints acquired from contact-based and contactless sensors. We propose a generalized contact-based fingerprint deformation correction model (DCM) to improve the matching accuracy. This model is trained by estimating the deformation between a contact-based fingerprint and the corresponding contactless fingerprint (ground truth). We present a method to estimate contact-based fingerprint impression type and intensity. As a result, minutiae features from contact-based and contactless fingerprints can be better aligned using the proposed model. A database of 1200 2D contactless fingerprints and respective contact-based fingerprints from 200 clients is used for the experiments. The experimental results presented in this paper validate our approach and illustrate promising improvement in performance using the proposed model.
|
|
15:00-17:10, Paper MoPT4.12 | |
Local Active Content Fingerprint: Solutions for General Linear Feature Maps |
Kostadinov, Dimche | Univ. of Geneva |
Voloshynovskiy, Sviatoslav | Univ. of Geneva |
Diephuis, Maurits | Univ. of Geneva |
Ferdowsi, Sohrab | Univ. of Geneva |
Holotyak, Taras | Univ. of Geneva |
Keywords: Fingerprint recognition, Signal, image and video processing, Image and video analysis and understanding
Abstract: This paper presents solutions to the local patch based Active Content Fingerprint (aCFP) with linear modulation, a general linear feature map and convex constraints on the properties of the local feature descriptor. A direct approximation of the linear feature map is proposed such that the image distortion is as small as possible and the approximate linear feature map is as close as possible to the original map. Then an explicit regularization of the trade-off between the modulation distortion and the robustness of the local feature is introduced through a novel problem formulation. A computer simulation using local image patches extracted from a publicly available data set is provided, demonstrating the advantages under additive white Gaussian noise (AWGN), lossy JPEG compression and projective geometrical transform distortions.
|
|
15:00-17:10, Paper MoPT4.13 | |
A Proposed Pattern Recognition Framework for EEG-Based Blind Watermarking System |
Pham, Trung Duy | Univ. of Canberra |
Tran, Dat | Univ. of Canberra |
Ma, Wanli | Univ. of Canberra |
Keywords: Forensic biometrics and its applications, Classification and clustering, Security issues
Abstract: Copyright protection for multimedia data owners is of crucial importance, as the duplication of multimedia data has become easy with the advent of the Internet and digital multimedia technology. Current digital watermarking techniques for preserving product ownership are rule-based and do not directly deal with data synchronization; therefore their decoding performance drops significantly when the watermarked data is transmitted through a real communication channel. This paper proposes a pattern recognition framework to build a new blind watermark scheme for electroencephalography (EEG) data. Embedding a watermark is based on modifying the mean modulation relationship of approximation coefficients in the wavelet domain. Retrieving this watermark is done effectively using Support Vector Data Description (SVDD) models trained with the correlation between modified frequency coefficients and the watermark sequence in the wavelet domain. Experimental results show that the proposed scheme provides good imperceptibility and is more robust against various signal processing techniques and common attacks such as random cropping, noise addition, low-pass filtering, and resampling.
|
|
15:00-17:10, Paper MoPT4.15 | |
Novel Generative Model for Facial Expressions Based on Statistical Shape Analysis of Landmarks Trajectories |
Desrosiers, Paul Audain | Télécom Lille, CRIStAL UMR (CNRS 9189) |
Devanne, Maxime | Telecom Lille/LIFL |
Daoudi, Mohammed | Télécom Lille/CRIStAL (UMR 9189) |
Keywords: Facial expression recognition, Statistical, syntactic and structural pattern recognition, Shape modeling and encoding
Abstract: We propose a novel geometric framework for analyzing spontaneous facial expressions, with the specific goal of comparing, matching, and averaging the shapes of landmark trajectories. Here we represent facial expressions by the motion of the landmarks across time. The trajectories are represented by curves. We use elastic shape analysis of these curves to develop a Riemannian framework for analyzing shapes of these trajectories. In terms of empirical evaluation, our results on two databases, UvA-NEMO and Cohn-Kanade CK+, are very promising. From a theoretical perspective, this framework allows formal statistical inferences, such as generation of facial expressions.
|
|
15:00-17:10, Paper MoPT4.16 | |
Shannon Information Based Adaptive Sampling for Action Recognition |
Tian, Qing | McGill Univ |
Arbel, Tal | Centre for Intelligent Machines, McGill Univ |
Clark, James | McGill Univ |
Attachments: Supplementary material
Keywords: Gesture and Behavior Analysis, Image and video analysis and understanding, Motion, tracking and video analysis
Abstract: This paper investigates the effects of sampling on action recognition performance. Currently, dense (regular grid) sampling and uniform random sampling are popular strategies that achieve state-of-the-art performance. However, they are data-blind and pay equal attention to locations of different informativeness. In this paper, a Shannon information based adaptive sampling approach is proposed for action recognition. Results of different sampling approaches are compared on three benchmark datasets: the basic KTH and the challenging HMDB51 and UCF101 datasets. The method is shown to improve recognition accuracy as well as computational efficiency over the current state-of-the-art using less than one percent of the total pixels.
|
|
15:00-17:10, Paper MoPT4.17 | |
High Precision Gesture Sensing Via Quantitative Characterization of the Doppler Effect |
Ai, Haojun | Wuhan Univ |
Men, Yifang | Wuhan Univ |
Han, Liangliang | Aerospace System Engineering Shanghai |
Li, Zuchao | Wuhan Univ |
Mengyun, Liu | Wuhan Univ |
Keywords: Gesture and Behavior Analysis, Segmentation, features and descriptors, Audio and acoustic processing and analysis
Abstract: This paper presents a high precision gesture recognition system that leverages the Doppler effect of ultrasound to sense in-air hand gestures. The system can precisely identify a wider variety of gestures than other systems without any modification to consumer laptops. The system recognizes quantitatively detailed and complex movements from the signals reflected by a moving body. A Hidden Markov Model is used to construct a library of independent, discrete gestures. The gestures can be mapped to diverse application actions. Our method can distinguish among similar gestures with slight differences by extracting fewer, more effective features. Our proposed system reduces false positives caused by unintended motions and is versatile and adaptable to multiple devices. We implemented a proof-of-concept prototype on a laptop and extensively evaluated the system. Our results show that the system recognizes six gestures with an average accuracy of 98.6% and 18 gestures including similar ones with 95% accuracy. The flexibility and robustness on multiple devices highlight its ability to enable future ubiquitous non-contact gesture-based interaction with computing devices.
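For readers unfamiliar with the physical effect this abstract relies on, here is a small sketch of the standard first-order Doppler relation for a tone reflected off a moving object, Δf ≈ 2·v·f0/c (valid when v is much smaller than c). The carrier frequency, hand speed, and the function name `doppler_shift_hz` are illustrative assumptions; the abstract does not state the system's actual parameters or signal processing.

```python
def doppler_shift_hz(carrier_hz, speed_m_s, sound_speed_m_s=343.0):
    """Approximate frequency shift of a tone reflected by a moving object.

    speed_m_s > 0 means the object moves toward the speaker/microphone.
    Uses the first-order approximation 2 * v * f0 / c, which holds when
    the object is much slower than sound (343 m/s in air at about 20 °C).
    """
    return 2.0 * speed_m_s * carrier_hz / sound_speed_m_s


# Assumed example values: a 20 kHz pilot tone and a hand moving at 0.5 m/s.
shift = doppler_shift_hz(20000.0, 0.5)
```

A shift of this size is only a few tens of hertz around the carrier, which suggests why fine frequency resolution matters for telling similar gestures apart.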
|
|
15:00-17:10, Paper MoPT4.18 | |
Fast 3D Hand Estimation for Mobile Interactions |
Pei, Yuru | Peking Univ |
Ma, Gengyu | Usens Inc |
Attachments: Supplementary material
Keywords: Human body motion and gesture based interaction, 3D shape recovery, 2D/3D object detection and recognition
Abstract: The ubiquitous hand gesture plays an important role in natural human machine interaction (HMI). Recently, consumer color and depth cameras have been used to estimate hand shapes and postures for mid-air HMI. Based on the observation that 3D hand contours carry much information about hand postures, we estimate the 3D hand contour from infrared images with limited computational complexity for HMI on mobile devices. A variant of the dynamic programming (vDP) algorithm is proposed to handle the complex self-occlusion in 3D hand estimation, where a set of heuristic rules is introduced to avoid missing fingers. Furthermore, the constraints are used to reduce the search space in the contour alignment. Given 3D hand contours, a set of hand gestures, including touching, swiping, and pinching, can be applied to mid-air interactions. The proposed method is much faster than traditional depth estimation of the whole hand, and can achieve up to 500 Hz on PC and 100 Hz on mobile devices.
|
|
15:00-17:10, Paper MoPT4.19 | |
HIF3D: Handwriting-Inspired Features for 3D Skeleton-Based Action Recognition |
Boulahia, Said Yacine | IRISA/INSA De Rennes |
Anquetil, Eric | IRISA/INSA |
Kulpa, Richard | INRIA/Univ. De Rennes2 |
Multon, Franck | INRIA/Univ. De Rennes2 |
Keywords: Human body motion and gesture based interaction, Gesture and Behavior Analysis, Human Computer Interaction
Abstract: Action recognition based on human skeleton structure is nowadays a prosperous research field. This is mainly due to the recent advances in capture technologies and skeleton extraction algorithms. In this context, we observed that 3D skeleton-based actions share several properties with handwritten symbols, since they both result from a human performance. We accordingly hypothesize that the action recognition problem can take advantage of the trial and error already carried out on handwritten patterns. Therefore, inspired by one of the most efficient and compact handwriting feature sets, we propose in this paper a skeleton descriptor referred to as Handwriting-Inspired Features (HIF3D). First, data preprocessing is applied to joint trajectories in order to handle the variability among actors' morphologies. Then we extract the HIF3D features from the processed joint locations according to a time partitioning scheme so as to additionally encode the temporal information over the sequence. Finally, we selected the Support Vector Machine (SVM) for the classification step. Evaluations conducted on two challenging datasets, namely HDM05 and UTKinect, testify to the soundness of our approach, as the obtained results outperform the state-of-the-art algorithms that rely on skeleton data.
|
|
15:00-17:10, Paper MoPT4.20 | |
Locating Human Interactions with Discriminatively Trained Deformable Pose+Motion Parts |
van Gemeren, Coert | Utrecht Univ |
Poppe, Ronald | Utrecht Univ |
Veltkamp, Remco | Utrecht Univ |
Keywords: Human body motion and gesture based interaction, Gesture and Behavior Analysis, Image and video analysis and understanding
Abstract: We model dyadic (two-person) interactions by discriminatively training a spatio-temporal deformable part model of fine-grained human interactions. All interactions involve at most two persons. Our models are capable of localizing human interactions in unsegmented videos, marking the interactions of interest in space and time. Our contributions are as follows: First, we create a model that localizes human interactions in space and time. Second, our models use multiple pose and motion features per part. Third, we experiment with different ways of training our models discriminatively. When testing on the target class our models achieve a mean average precision score of 0.86. Cross dataset tests show that our models generalize well to different environments.
|
|
15:00-17:10, Paper MoPT4.21 | |
Fast Gesture Recognition with Multiple Stream Discrete HMMs on 3D Skeletons |
Borghi, Guido | Univ. of Modena and Reggio Emilia |
Vezzani, Roberto | Univ. of Modena and Reggio Emilia |
Cucchiara, Rita | Univ. Degli Studi Di Modena E Reggio Emilia |
Keywords: Human body motion and gesture based interaction, Human Computer Interaction, Gesture and Behavior Analysis
Abstract: HMMs are widely used in action and gesture recognition due to their implementation simplicity, low computational requirements, scalability and high parallelism. They perform well even with a limited training set. All these characteristics are hard to find together in other, even more accurate, methods. In this paper, we propose a novel double-stage classification approach, based on Multiple Stream Discrete Hidden Markov Models (MSD-HMM) and 3D skeleton joint data, able to reach high performance while maintaining all the advantages listed above. The approach allows both quick classification of pre-segmented gestures (offline classification) and temporal segmentation on streams of gestures (online classification) faster than real time. We test our system on three public datasets, MSRAction3D, UTKinect-Action and MSRDailyAction, and on a new dataset, Kinteract Dataset, explicitly created for Human Computer Interaction (HCI). We obtain state-of-the-art performance on all of them.
|
|
15:00-17:10, Paper MoPT4.22 | |
Localization of Skin Features on the Hand and Wrist from Small Image Patches |
Stearns, Lee | Univ. of Maryland |
Oh, Uran | Univ. of Maryland, Coll. Park |
Cheng, Bridget J. | Cornell Univ |
Findlater, Leah | Univ. of Maryland, Coll. Park |
Ross, David | Atlanta VA R&D Center for Vision & Neurocognitive Rehabilitation |
Chellappa, Rama | Univ. of Maryland |
Froehlich, Jon E. | Univ. of Maryland, Coll. Park |
Keywords: Human Computer Interaction, Biometric systems and applications, Texture and color analysis
Abstract: Skin-based biometrics rely on the distinctiveness of skin patterns across individuals for identification. In this paper, we investigate whether small image patches of the skin can be localized on a user’s body, determining not “who?” but instead “where?” Applying techniques from biometrics and computer vision, we introduce a hierarchical classifier that estimates a location from the image texture and refines the estimate with keypoint matching and geometric verification. To evaluate our approach, we collected 10,198 close-up images of 17 hand and wrist locations across 30 participants. Within-person algorithmic experiments demonstrate that an individual’s own skin features can be used to localize their skin surface image patches with an F1 score of 96.5%. As secondary analyses, we assess the effects of training set size and between-person classification. We close with a discussion of the strengths and limitations of our approach and evaluation methods as well as implications for future applications using a wearable camera to support touch-based, location-specific taps and gestures on the surface of the skin.
|
|
15:00-17:10, Paper MoPT4.23 | |
Improving Classifier Fusion Via Pool Adjacent Violators Normalization |
Goswami, Gaurav | Indraprastha Inst. of Information Tech. Delhi |
Ratha, Nalini | IBM Res |
Singh, Richa | IIIT Delhi |
Vatsa, Mayank | IIIT Delhi |
Keywords: Multi-biometrics
Abstract: Classifier fusion is a well studied problem in which decisions from multiple classifiers are combined at the score, rank, or decision level to obtain better results than a single classifier. Subsequently, various techniques for combining classifiers at each of these levels have been proposed in the literature. Many popular methods entail scaling and normalizing the scores obtained by each classifier to a common numerical range before combining the normalized scores using the sum rule or another classifier. In this research, we explore an alternative method to combine classifiers at the score level. The Pool Adjacent Violators (PAV) algorithm has traditionally been utilized to convert classifier match scores to confidence values that model posterior probabilities for test data. The PAV algorithm and other score normalization techniques have studied the same problem without being aware of each other. In this first ever study to combine the two, we propose the PAV algorithm for classifier fusion on publicly available NIST multi-modal biometrics score data. We observe that it provides several advantages over existing techniques and find that the interpretation learned by the PAV algorithm is more robust than the scaling learned by other popular normalization algorithms such as min-max. Moreover, the PAV algorithm enables the combined score to be interpreted as confidence and is able to further improve the results obtained by other approaches. We also observe that utilizing traditional normalization techniques first for individual classifiers and then normalizing the fused score using PAV offers a performance boost compared to only using the PAV algorithm.
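To make the algorithm named in this abstract concrete, below is a minimal sketch of Pool Adjacent Violators used to turn match scores into monotone confidence values. It shows generic isotonic fitting only, not the authors' fusion pipeline; the function names `pav` and `scores_to_confidence`, and the toy scores and labels, are assumptions made for illustration.

```python
def pav(values):
    """Non-decreasing least-squares fit of `values` via Pool Adjacent Violators."""
    blocks = []  # each block is [sum, count]; block mean = sum / count
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def scores_to_confidence(scores, labels):
    """Pair each training score with a monotone confidence estimate.

    `labels` are 1 for genuine and 0 for impostor comparisons (assumed
    convention). Returns (score, confidence) pairs sorted by score.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    fitted = pav([labels[i] for i in order])
    return [(scores[i], f) for i, f in zip(order, fitted)]
```

Because the fitted values are non-decreasing in the score and lie between 0 and 1 for binary labels, they can be read as confidence values, which is the property the abstract contrasts with plain range scaling such as min-max.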
|
|
15:00-17:10, Paper MoPT4.24 | |
Spoofing Detection for Embedded Face Recognition System Using a Low Cost Stereo Camera |
Tian, Guifen | Csrd, Toshiba |
Keywords: Other Biometric applications, 2D/3D object detection and recognition
Abstract: Spoofing detection is essential for practical face recognition systems. Based on the fact that a genuine face has distinctive geometric curvatures across its surface, this paper puts forward an ultra-fast yet accurate spoofing detection approach using a low-cost stereo camera. To obtain curvatures, the three-dimensional shapes of selected facial landmarks are analyzed by fitting the point cloud around each landmark to a specific partial face surface. Spoofing detection is then performed by evaluating the curvatures of each landmark and integrating them together. Experiments verify that the approach is able to detect spoofed faces in printed photographs, with or without various degrees of bending, at FAR equal to 0.00%. Meanwhile, genuine faces have only a small chance of being falsely rejected: FRR is 0.59% for near frontal faces and less than 5% for faces with large pose variations. Detection time is 51 milliseconds when executed on a single processor [1] running at a clock frequency of 266 MHz, which makes the detection very suitable for embedded face recognition systems.
|
|
15:00-17:10, Paper MoPT4.25 | |
Automatic Leaf Shape Category Discovery |
Olivares, Leonel | Univ. Central |
Victorino, Jorge | Univ. Central |
Gómez, Francisco | Univ. Nacional De Colombia |
Keywords: Pattern Recognition for Bioinformatics, Classification and clustering, Segmentation, features and descriptors
Abstract: Categorical description of leaf shapes is of paramount importance in agriculture and plant sciences. Traditionally, these descriptions have been based on categorical systems proposed by domain experts. Despite the importance of these visual descriptive systems, these approaches may be limited by the representation of unknown shapes as expected in exploratory domains. In this work, we propose a novel strategy to automatically discover the shape categories from a leaf dataset by using only the leaf-shape information. The proposed approach maintains high levels of visual interpretability, a major requirement for interpretation of biological data. The method is based on a complex Fourier shape representation, a low-dimensional representation of this information, and an adaptive kernel-based strategy to discover the shape categories. The proposed method was evaluated through the task of discovering shape categories from 6 different plant species for 3 different biological scenarios. Our experiments demonstrate that the proposed method is able to successfully infer the underlying shape categories presented in a leaf dataset.
|
|
15:00-17:10, Paper MoPT4.26 | |
Towards Protecting Biometric Templates without Sacrificing Performance |
Li, Jing | National Univ. of Singapore, Univ. of Science and Tech |
Wong, Yongkang | National Univ. of Singapore |
Sim, Terence | National Univ. of Singapore |
Keywords: Security issues, Face recognition
Abstract: The ideal biometric template protection scheme possesses the properties of irreversibility, revocability, unlinkability, and good performance. These properties protect the security of the biometrics system as well as users’ privacy. Practical systems, however, fall short of this ideal. In this paper, we present a novel protection scheme that achieves this ideal under the circumstance that a subject’s token and his biometric template are not concurrently exposed. Moreover, our scheme can add template protection to any face verifier. We do this by rendering virtual faces, rather than by devising new biometric features, which is the more common approach. Experimental evaluations using two public face recognition systems show that accuracy is not adversely affected with our scheme.
|
|
15:00-17:10, Paper MoPT4.27 | |
Face Anti-Spoofing with Multifeature Videolet Aggregation |
Ahmad Siddiqui, Talha | IIIT-Delhi |
Bharadwaj, Samarth | IBM |
Dhamecha, Tejas Indulal | IBM |
Agarwal, Akshay | IIIT Delhi |
Vatsa, Mayank | IIIT Delhi |
Singh, Richa | IIIT Delhi |
Ratha, Nalini | IBM Res |
Keywords: Security issues, Face recognition, Biometric systems and applications
Abstract: Biometric systems can be attacked in several ways, the most common being spoofing of the input sensor. Therefore, anti-spoofing is one of the most essential prerequisites for defending biometric systems against attacks. Face recognition is even more vulnerable because image capture is non-contact based. Several anti-spoofing methods have been proposed in the literature for both contact and non-contact based biometric modalities, often using video to study the temporal characteristics of a real vs. spoofed biometric signal. This paper presents a novel multi-feature evidence aggregation method for face spoofing detection. The proposed method fuses evidence from features encoding both texture and motion (liveness) properties in the face and also in the surrounding scene regions. The feature extraction algorithms are based on a configuration of local binary pattern and motion estimation using histogram of oriented optical flow. Furthermore, the multi-feature windowed videolet aggregation of these orthogonal features, coupled with support vector machine-based classification, provides robustness to different attacks. We demonstrate the efficacy of the proposed approach by evaluating on three standard public databases: CASIA-FASD, 3DMAD and MSU-MFSD, with equal error rates of 3.14%, 0%, and 0%, respectively.
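The texture features mentioned in this abstract build on the local binary pattern (LBP). The sketch below shows only the basic 8-neighbour, radius-1 LBP histogram on a grayscale image stored as a list of rows; it is not the specific LBP configuration used in the paper, and the function name and neighbour ordering are assumptions.

```python
def lbp_histogram(image):
    """Normalised 256-bin histogram of basic 3x3 LBP codes.

    image: list of equal-length rows of grayscale intensities.
    Border pixels are skipped because they lack a full neighbourhood.
    """
    # Clockwise neighbour offsets (row, col), starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# A flat patch: every interior pixel gets code 255 (all bits set).
flat = [[7] * 4 for _ in range(4)]
hist = lbp_histogram(flat)
```

In a video setting, one such histogram per frame window could serve as the texture part of a feature vector passed to a classifier.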
|
|
15:00-17:10, Paper MoPT4.28 | |
Exposing Seam Carving Forgery under Recompression Attacks by Hybrid Large Feature Mining |
Liu, Qingzhong | Sam Houston State Univ |
Keywords: Security issues, Pattern Recognition for Surveillance and Security
Abstract: While seam carving has been widely used in computer vision and multimedia processing, it is also used for tampering illusions. Although several methods have been proposed to detect seam carving-based forgery, to date, the detection of seam carving forgery under recompression attacks in JPEG images has not been explored. To fill this gap, we propose a hybrid large-scale feature mining-based detection method to distinguish doctored JPEG images from untouched JPEG images under recompression attacks. Over one hundred thousand features from the spatial domain and from the DCT transform domain are extracted. Ensemble learning is used to deal with the high dimensionality and to avoid the overfitting that may occur with some traditional learning classifiers. Our study demonstrates the efficacy of the proposed approach in exposing seam-carving forgery under recompression attacks, especially recompression from a lower quality level or at the same quality level.
|
|
15:00-17:10, Paper MoPT4.29 | |
Effective 3D Based Frontalization for Unconstrained Face Recognition |
Ferrari, Claudio | Univ. of Florence |
Lisanti, Giuseppe | Univ. Degli Studi Di Firenze |
Berretti, Stefano | Univ. of Florence |
Del Bimbo, Alberto | Univ. of Florence |
Keywords: Face recognition, Pattern Recognition for Surveillance and Security
Abstract: In this paper, we propose a new and effective frontalization algorithm for frontal rendering of unconstrained face images, and evaluate it for face recognition. Initially, a 3DMM is fit to the image, and an interpolating function maps each pixel inside the face region of the image to the 3D model. Thus, we can render a frontal view without introducing artifacts in the final image, thanks to the exact correspondence between each pixel and the 3D coordinates of the model. The 3D model is then back-projected onto the frontalized image, allowing us to localize the image patches from which to extract the feature descriptors, and thus enhancing the alignment of the same descriptor across different images. Our solution outperforms other frontalization techniques in terms of face verification. Results comparable to the state of the art on two challenging benchmark datasets are also reported, supporting our claim of effectiveness of the proposed face image representation.
|
|
15:00-17:10, Paper MoPT4.30 | |
Radon Transform Inspired Method for Hand Gesture Recognition |
Khorsandi, Mohammad Amin | Isfahan Univ. of Tech |
Karimi, Nader | Isfahan Univ. of Tech |
Soroushmehr, S.M. Reza | Univ. of Michigan |
Hajabdollahi, Mohsesn | Isfahan Univ. of Tech |
Samavi, Shadrokh | McMaster Univ |
Ward, Kevin | Univ. of Michigan |
Najarian, Kayvan | Univ. of Michigan |
Keywords: Gesture and Behavior Analysis, Image based modeling, Statistical, syntactic and structural pattern recognition
Abstract: Touchless communication is a new field for commanding electronic devices. This method is especially relevant when hygiene is a concern. Automated hand gesture recognition requires processing of hand images. Many research works have tried to cope with this recognition problem. Complexity and high computational costs are important drawbacks that make real-time execution of these algorithms difficult. In this paper a new hand gesture recognition method is proposed. To show the functionality of our method, we show how it can be used to recognize the number of fingers in segmented images. The proposed algorithm can also estimate the angles of the fingers, the direction of the hand, and the positions of the fingers. In this work, we transform an image to intercept-slope coordinates using a proposed Radon transform inspired mapping. Using this mapping, the algorithm becomes invariant to rotation, scale and position. Straight and separated fingers are extracted, and their locations and angles can be determined as well. Simplicity, robustness against rotation, scaling and position, and the absence of complex mathematical calculations are the advantages of our work.
|
|
15:00-17:10, Paper MoPT4.31 | |
StereoTag: A Novel Stereogram-Marker-Based Approach for Augmented Reality |
Nguyen, Minh | Auckland Univ. of Tech. New Zealand |
Yeap, Albert (Wai) | Auckland Univ. of Tech. New Zealand |
Keywords: Mixed and Augmented Reality, Stereo and multiple view geometry, 2D/3D object detection and recognition
Abstract: Augmented Reality (AR) is an active and exciting topic that aims to create intuitive computer interfaces by blending reality and virtual reality. One challenge of AR is to align virtual data with the environment. Typically, one uses a marker-based approach, such as a thick-bordered black and white 2D marker, which allows one to recover the relative pose (location and orientation) of a camera in real time. However, bar-code markers do not contain any intuitive visual meaning, and they thus look uninteresting and uninformative. We propose a new type of marker, referred to as a StereoTag, which embeds a meaningful stereogram image hiding 3D coded/decoded information. In the experiments conducted, our StereoTag is found to be relatively robust under various conditions and thus could be widely used in future AR applications.
|
|
15:00-17:10, Paper MoPT4.32 | |
Sketch Simplification by Classifying Strokes |
Ogawa, Toru | The Univ. of Tokyo |
Matsui, Yusuke | NII |
Yamasaki, Toshihiko | The Univ. of Tokyo |
Aizawa, Kiyoharu | The Univ. of Tokyo |
Keywords: Pattern Recognition for Art, Cultural Heritage and Entertainment, Vision for graphics, Signal, image and video processing
Abstract: In this paper, we propose a novel approach to automatically creating a clean line drawing from a scribbled sketch. The main problem is determining which strokes of a scribbled sketch should be merged, and in our method, we use a machine learning approach to solve this problem. We create training data by comparing scribbled sketches with manually drawn line drawings. Then, we verify that our method creates clean line drawings when training data are used as the input of the merging phase. In addition, our method includes a step to remove incorrect prediction results that are returned from the trained estimator. We perform tests to show that this step increases the rate of correct results, and that the line drawings created using this step are better than those created without it.
|
|
15:00-17:10, Paper MoPT4.33 | |
Over-Atoms Accumulation Orthogonal Matching Pursuit Reconstruction Algorithm for Fish Recognition and Identification |
Hsiao, YiHao | National Tsing Hua Univ |
Chen, Chaur-Chin | Department of Computer Science, National Tsing Hua Univ |
Keywords: Pattern Recognition for Bioinformatics, Classification and clustering, Face recognition
Abstract: Fish recognition and identification in an underwater environment are important research topics. In this study, several real-world underwater videos were collected to construct a fish category database for further fish recognition and identification. Recently, compressive sensing, which uses reconstruction algorithms to reconstruct a sparse signal, has been successfully applied to face recognition. Reconstruction algorithms can be roughly categorized into two groups: basis pursuit (BP) and matching pursuit (MP). BP-related methods adopt a convex optimization technique, while MP-related methods utilize greedy search and vector projection ideas. This study reviews concepts for these reconstruction algorithms and analyzes their performance. Moreover, an over-atoms accumulation orthogonal matching pursuit (OAOMP) method based on OMP is proposed. OAOMP includes two procedures: picking over atoms, and accumulating weighting coefficients of each subject to assign as new weights. OAOMP was compared with existing reconstruction algorithms in terms of reconstruction performance and run time. Experiments were conducted on a fish category database by using eigenfaces and fisherfaces for feature extraction. The experimental results demonstrated that BP-related methods have better recognition rates, while MP-related methods have shorter run times. Moreover, OAOMP is able to achieve better accuracy than OMP and other MP-related methods.
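The greedy search and vector projection steps of MP-related methods can be illustrated with standard orthogonal matching pursuit (OMP), the baseline that OAOMP extends. The sketch below implements plain OMP only, not OAOMP; the helper names, the unit-norm atom assumption and the normal-equations solver are choices made for this illustration, and the solver assumes the selected atoms are linearly independent.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve a small dense system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def omp(atoms, signal, n_nonzero):
    """Plain OMP. atoms: list of unit-norm vectors (the dictionary).

    Returns a dict mapping selected atom index -> coefficient.
    """
    residual = signal[:]
    support, coeffs = [], []
    for _ in range(n_nonzero):
        # Greedy step: pick the unused atom most correlated with the residual.
        scores = [abs(dot(a, residual)) for a in atoms]
        best = max((i for i in range(len(atoms)) if i not in support),
                   key=lambda i: scores[i])
        if scores[best] < 1e-12:
            break  # residual is already explained
        support.append(best)
        # Projection step: least squares on the selected atoms
        # via the normal equations G c = rhs.
        G = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], signal) for i in support]
        coeffs = solve(G, rhs)
        residual = [s - sum(c * atoms[i][t] for c, i in zip(coeffs, support))
                    for t, s in enumerate(signal)]
    return dict(zip(support, coeffs))


# Toy dictionary: three axis-aligned atoms plus one diagonal atom.
r = 2 ** -0.5
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [r, r, 0.0]]
signal = [2.0, 0.0, -3.0]
sparse_code = omp(atoms, signal, n_nonzero=2)
```

In a sparse-representation classifier, the atoms would be training feature vectors, and the coefficients grouped by subject would indicate which subject best explains the probe.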
|
|
15:00-17:10, Paper MoPT4.34 | |
An Effective Voiceprint Based Identity Authentication System for Mandarin Smartphone Users |
Liu, Junhong | Peking Univ |
Zou, Yuexian | Peking Univ |
Huang, Yichi | Peking Univ |
Keywords: Speaker recognition, Biometric systems and applications, Other Biometric applications
Abstract: Voiceprint based identity authentication systems (IAS) for smartphone users are in high demand in the mobile internet era. There are some successful application cases for English smartphone users. However, to our knowledge, research outcomes for Mandarin smartphone users are few. Analysis shows that some issues remain that need to be carefully considered: (1) security: vulnerability to replay attacks; (2) user experience: zero tolerance of misreading; (3) channel mismatch: poor performance when the user changes smartphones. Taking these issues into account, this study strives to develop an effective voiceprint based IAS (termed DR-EiSV-IAS) for Mandarin smartphone users. Specifically, a content disorder degree (CDD) module implemented with DNN based digit recognition is introduced to resist replay attacks and improve tolerance of misreading. In addition, speaker verification is carefully designed using an enhanced i-vector technique in which the i-vector framework is combined with WCCN to compensate for channel variability. To facilitate this study, we have built a Mandarin corpus, MTDSR2015, which is the first public and free Mandarin database recorded by smartphones for text-dependent speaker recognition research. Extensive experiments have been conducted on both MTDSR2015 and RSR2015 to validate the effectiveness of the proposed DR-EiSV-IAS.
|
|
15:00-17:10, Paper MoPT4.35 | |
Persistent Homology-Based Gait Recognition Robust to Upper Body Variations |
Lamar-Leon, Javier | Advanced Tech. Application Center |
Alonso-Baryolo, Raul | Advanced Tech. Application Center |
Garcia, Edel | Advanced Tech. Application Center |
Gonzalez-Diaz, Rocio | Univ. of Seville |
Keywords: Gait recognition
Abstract: Gait recognition is nowadays an important biometric technique for video surveillance tasks, due to the advantage that it can be used at a distance. However, when the upper body movements are unrelated to the natural dynamic of the gait, caused for example by carrying a bag or wearing a coat, the reported results show low accuracy. With the goal of solving this problem, we apply persistent homology to extract topological features from the lowest quarter of the body silhouettes. To obtain the features, we modify our previous algorithm for gait recognition to improve its efficacy and robustness to variations in the number of simplices of the gait complex. We evaluate our approach using the CASIA-B dataset, obtaining a considerable accuracy improvement of 93.8%, achieving at the same time invariance to upper body movements unrelated to the dynamic of the gait.
|
|
15:00-17:10, Paper MoPT4.36 | |
Towards Miss Universe Automatic Prediction: The Evening Gown Competition |
Carvajal, Johanna | The Univ. of Queensland |
Wiliem, Arnold | The Univ. of Queensland |
Sanderson, Conrad | NICTA |
Lovell, Brian Carrington | The Univ. of Queensland |
Keywords: Image and video analysis and understanding
Abstract: Can we predict the winner of Miss Universe after watching how the contestants stride down the catwalk during the evening gown competition? Fashion gurus say they can! In our work, we study this question from the perspective of computer vision. In particular, we want to understand whether existing computer vision approaches can be used to automatically extract the qualities exhibited by the Miss Universe winners during their catwalk. This study can pave the way towards new vision-based applications for the fashion industry. To this end, we propose a novel video dataset, called the Miss Universe dataset, comprising 10 years of the evening gown competition selected from the period 1996-2010. We further propose two ranking-related problems: (1) Miss Universe Listwise Ranking and (2) Miss Universe Pairwise Ranking. In addition, we develop an approach that simultaneously addresses the two proposed problems. To describe the videos we employ the recently proposed Stacked Fisher Vectors in conjunction with robust local spatio-temporal features. From our evaluation we found that although the addressed problems are extremely challenging, the proposed system is able to rank the winner among the top 3 predicted scores for 5 out of 10 Miss Universe competitions.
|
|
15:00-17:10, Paper MoPT4.37 | |
Landmark Manifold: Revisiting the Riemannian Manifold Approach for Facial Emotion Recognition |
Zhao, Kun | The Univ. of Queensland |
Yang, Siqi | The Univ. of Queensland |
Wiliem, Arnold | The Univ. of Queensland |
Lovell, Brian Carrington | The Univ. of Queensland |
Keywords: Facial expression recognition, Image and video analysis and understanding
Abstract: Automatically recognising facial emotions has drawn increasing attention in computer vision. Facial landmark based methods are among the most widely used approaches to this task. However, these approaches do not provide good performance. Thus, researchers usually tend to combine more information, such as textural and audio information, to increase the recognition rate. In this paper we propose a novel method, here called the landmark manifold, that shows it is possible to achieve competitive performance with facial landmark information alone. Through experiments on the well-known extended Cohn-Kanade facial emotion dataset (CK+), we show that with accurate facial landmarks, our simple approach is fast to run and can achieve performance competitive with far more computationally expensive methods.
|