TuPT1
Poster Session Hall
TuP1
Poster Session

15:00-17:10, Paper TuPT1.1
Reducing the Computational Cost of Shape Matching with the Distance Set |
Iwata, Kazunori | Hiroshima City Univ |
Keywords: Statistical, syntactic and structural pattern recognition
Abstract: The distance set is known to be a versatile local descriptor of shape. As this is simply a set of ordinary distances between sample points on a shape, it is easy to construct and use. More importantly, it remains invariant under many settings and deformations, unlike other typical descriptors. However, in shape matching with distance sets, there is a tradeoff between performance and computational feasibility. In this paper, we present a new descriptor by improving the choice and order of elements in the distance set. We show that our descriptor is more efficient for shape matching from the viewpoint of computer algorithms. Additionally, we demonstrate that, although our descriptor runs more quickly in practice, it is equivalent to the original distance set in terms of shape retrieval.
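As a concrete illustration of the basic descriptor (a hypothetical minimal sketch; the paper's improved choice and ordering of elements is not reproduced here), the distance set of each sample point can be taken as its sorted distances to the other sample points on the shape:

```python
import numpy as np

def distance_sets(points, k=None):
    """Basic distance-set descriptor: for each sample point on a shape,
    the sorted distances to all other sample points (illustrative only;
    the paper's improved element choice and ordering is not shown)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    sets = np.sort(d, axis=1)[:, 1:]          # drop the zero self-distance
    if k is not None:
        sets = sets[:, :k]                    # keep only the k nearest distances
    return sets

# four corners of a unit square as sample points
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
ds = distance_sets(square)
```

Because only pairwise distances enter, the descriptor is unchanged by translation and rotation of the shape.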


15:00-17:10, Paper TuPT1.2
Convex Optimization Approach for Multi-Label Feature Selection Based on Mutual Information |
Lim, Hyunki | Chung-Ang Univ |
Kim, Dae-Won | Chung-Ang Univ |
Keywords: Classification and clustering, Dimensionality reduction and manifold learning
Abstract: We propose a convex optimization approach for multi-label feature selection. An effective feature subset can be obtained by finding the global optimum of a convex objective function for multi-label feature selection, whereas conventional greedy approaches are prone to suboptimal results. In this paper, the mathematical procedures and considerations for the optimization approach are presented for multi-label feature selection based on mutual information. We compare the proposed method with conventional greedy-search-based methods to show the potential of optimization-based multi-label feature selection.


15:00-17:10, Paper TuPT1.3
Semi-Supervised Image Labelling Using Barycentric Graph Embeddings |
Robles-Kelly, Antonio | NICTA |
Wei, Ran | NICTA |
Keywords: Statistical, syntactic and structural pattern recognition, Classification and clustering
Abstract: Here, we turn our attention to barycentric embeddings and examine their utility for semi-supervised image labelling tasks. To this end, we view the pixels in the image as vertices in a graph and their pairwise affinities as weights of the edges between them. Abstracted in this manner, we can pose the semi-supervised labelling problem in a graph-theoretic setting where labels are assigned based upon the distance in the embedding space between the nodes corresponding to the unlabelled pixels and those whose labels are in hand. We do this using a barycentric embedding approach, which naturally leads to a setting in which the embedding coordinates can be computed by solving a system of linear equations. Moreover, the method presented here can incorporate side information such as that delivered by colour priors used elsewhere in the literature for semi-supervised colour image labelling. We illustrate the utility of our method for both colour and hyperspectral image labelling and compare our results against other techniques in the literature.


15:00-17:10, Paper TuPT1.4
Measuring Regularity of Network Patterns by Grid Approximations Using the LLL Algorithm |
Hajdu, Andras | Univ. of Debrecen, Hungary |
Harangi, Balazs | Univ. of Debrecen |
Besenczi, Renátó | Univ. of Debrecen |
Lazar, Istvan | Univ. of Debrecen |
Emri, Gabriella | Univ. of Debrecen |
Hajdu, Lajos | Univ. of Debrecen |
Tijdeman, Robert | Leiden Univ |
Keywords: Statistical, syntactic and structural pattern recognition, Medical image and signal analysis, Computer-aided detection and diagnosis
Abstract: In a recent work, we proposed a novel way to approximate point sets with grids using the LLL algorithm, which operates in polynomial time. Now, we show how this approach can be applied for pattern recognition purposes by interpreting the rate of approximation as a new feature for regularity measurement. Our practical problem is the characterization of pigment networks in skin lesions. For this task we also introduce a novel image processing method for the extraction of the pigment network. Then, we show how our grid approximation framework can be applied by specializing it to the recognition of hexagonal patterns. The classification performance of our approach on the pigment network characterization problem is measured on a database annotated by a clinical expert. Throughout the paper we address several practical issues that may help apply our general framework to other practical tasks as well.


15:00-17:10, Paper TuPT1.5
Distance-Preserving Vector Space Embedding for the Closest String Problem |
Nienkötter, Andreas | Univ. of Münster |
Jiang, Xiaoyi | Univ. of Münster |
Keywords: Statistical, syntactic and structural pattern recognition, Pattern Recognition for Bioinformatics
Abstract: The closest string problem is a core problem in computational biology, with applications in other fields such as coding theory. Many algorithms exist to solve this problem, but due to its inherent high computational complexity (it is typically NP-hard), it can only be solved efficiently by restricting the search space to a specific range of parameters. Often, the run-time of these algorithms is exponential in the maximum distance between strings, restricting these solutions to very small distances. Recently, a prototype embedding method has been proposed to solve the related generalized median problem for arbitrary objects. In this approach, objects are transformed into a vector space using prototype embedding, the problem is solved in vector space, and the solution is afterwards transformed back into the original space. This method has been successfully applied to generalized median computation in several domains where the computational complexity is inherently high. In this work, we apply prototype embedding to the closest string problem. We show that different embedding methods can result in a very good and fast approximation of the closest string, independent of the maximum distance and other parameters.
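The embedding idea can be sketched as follows (a rough illustration under simplifying assumptions: Hamming distance, the input strings themselves as prototypes, and a naive inverse transform that simply picks the nearest input string; the paper's inverse transform is more sophisticated):

```python
import numpy as np

def closest_string_by_embedding(strings, prototypes=None):
    """Very rough sketch of prototype embedding for the closest string
    problem: embed each string as its vector of Hamming distances to a few
    prototypes, average the vectors, and return the input string whose
    embedding is nearest to that mean."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    protos = prototypes or strings            # naive choice: inputs as prototypes
    emb = np.array([[hamming(s, p) for p in protos] for s in strings], float)
    mean = emb.mean(axis=0)                   # "median" computed in vector space
    return strings[int(np.argmin(np.linalg.norm(emb - mean, axis=1)))]

best = closest_string_by_embedding(["ACGT", "ACGA", "TCGT", "ACGT"])
```

Note that the cost here is polynomial in the number and length of the strings, independent of the maximum pairwise distance.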


15:00-17:10, Paper TuPT1.6
Quantum Thermodynamics of Time Evolving Networks |
Minello, Giorgia | Univ. CA' FOSCARI DI VENEZIA |
Torsello, Andrea | Univ. Ca' Foscari Venezia |
Hancock, Edwin | Univ. of York |
Keywords: Statistical, syntactic and structural pattern recognition, Reinforcement learning and temporal models
Abstract: In this paper, we present a novel thermodynamic framework for graphs that can be used to analyze time evolving networks, relating the thermodynamic variables to macroscopic changes in network topology and linking major structural transitions to phase changes in the thermodynamic picture. We start from a recent quantum-mechanical characterization of the structure of a network relating the graph Laplacian to a density operator and resulting in a characterization of the network's entropy. Then we adopt a Schrödinger picture of the dynamics of the network, resulting in an estimation of a hidden time-varying Hamiltonian from the data, from which we derive a measure of energy exchange. From these variables, using the thermodynamic identity, we obtain the temperature under the assumption of constant volume of the system. Evaluation on real-world data shows that the thermodynamic variables thus extracted are effective in detecting critical events occurring during network evolution.
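The starting point, a density operator obtained from the trace-normalized graph Laplacian and the resulting von Neumann entropy, can be sketched as follows (a simplified static illustration, not the paper's full time-evolving framework):

```python
import numpy as np

def von_neumann_entropy(adj):
    """Von Neumann entropy of a graph, with the trace-normalized Laplacian
    playing the role of a density operator (a simplified sketch of the
    quantum-mechanical characterization the paper builds on)."""
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    rho = lap / np.trace(lap)                 # density operator, Tr(rho) = 1
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]                    # 0 * log 0 -> 0 by convention
    return float(-np.sum(lam * np.log(lam)))

# triangle graph: Laplacian eigenvalues 0, 3, 3 -> rho eigenvalues 0, 1/2, 1/2
adj = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
S = von_neumann_entropy(adj)
```

Tracking this entropy over network snapshots is one way to surface the structural transitions the abstract refers to.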


15:00-17:10, Paper TuPT1.7
Taking into Account Stereoisomerism in the Prediction of Molecular Properties |
Grenier, Pierre-Anthony | Univ. De Caen, CNRS UMR 6072, GREYC, ENSICAEN |
Brun, Luc | ENSICAEN |
Villemin, Didier | Cnrs Umr 6507 Lcmt, Ensicaen |
Keywords: Statistical, syntactic and structural pattern recognition, Support vector machines and kernel methods, Pattern Recognition for Bioinformatics
Abstract: The prediction of molecular properties through Quantitative Structure-Activity (resp. Property) Relationships constitutes two active research fields, QSAR and QSPR. Within these frameworks, graph kernels allow a natural encoding of a molecule as a graph to be combined with classical statistical tools such as SVM or kernel ridge regression. Unfortunately, some molecules encoded by the same graph, differing only in the three-dimensional orientation of their atoms in space, have different properties. Such molecules are called stereoisomers. Their properties cannot be predicted by usual graph methods, which do not encode stereoisomerism. In a previous paper, we proposed to encode the stereoisomerism property of each atom by a local subgraph, called a minimal stereo subgraph, and designed a kernel based on the comparison of bags of such subgraphs. However, encoding a molecule by a bag of subgraphs induces a significant loss of information. In this paper, we propose a new kernel based both on the spatial relationships between minimal stereo subgraphs and on the local neighbourhood of each minimal stereo subgraph within its supergraph. Our experiments show the benefits of taking such information into account.


15:00-17:10, Paper TuPT1.8
Similarity Measures for Title Matching |
Gali, Najlah | Univ. of Eastern Finland |
Mariescu-Istodor, Radu | Univ. of Eastern Finland |
Fränti, Pasi | Univ. of Eastern Finland |
Keywords: Statistical, syntactic and structural pattern recognition, Symbolic learning, Classification and clustering
Abstract: In many web applications, users query a place name, a photo name, or another entity name using search words that include alternate spellings, abbreviations, and variants that are similar, but not identical, to the title associated with the desired entity. Given two titles, an effective similarity measure should be able to determine whether the titles represent the same entity or not. In this paper, we evaluate 21 measures with the aim of identifying the most appropriate measure for matching titles. Results show that Soft-TFIDF performs best.
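For comparison, a plain TF-IDF cosine similarity between titles (a simpler baseline than Soft-TFIDF, which additionally matches near-identical tokens, e.g. via a secondary string similarity such as Jaro-Winkler) can be sketched as:

```python
import math
from collections import Counter

def tfidf_cosine(a, b, corpus):
    """Plain TF-IDF cosine between two titles over a small title corpus --
    a simpler baseline than the Soft-TFIDF measure the paper finds best."""
    docs = [t.lower().split() for t in corpus]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))        # document frequency
    idf = {w: math.log(n / df[w]) for w in df}

    def vec(title):
        tf = Counter(title.lower().split())
        return {w: tf[w] * idf.get(w, 0.0) for w in tf}

    va, vb = vec(a), vec(b)
    dot = sum(va[w] * vb.get(w, 0.0) for w in va)
    na = math.sqrt(sum(x * x for x in va.values()))
    nb = math.sqrt(sum(x * x for x in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = ["eiffel tower paris", "tower bridge london", "statue of liberty"]
sim = tfidf_cosine("eiffel tower", "eiffel tower paris", corpus)
```

Exact-token measures like this fail on misspellings ("eifel tower"), which is precisely the gap the soft variant closes.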


15:00-17:10, Paper TuPT1.9
Fast Kernel SVM Training Via Support Vector Identification |
Mao, Xue | National Lab. of Pattern Recognition, Inst. of Automat |
Fu, Zhouyu | Monash Univ |
Wu, Ou | Inst. of Automation, CAS |
Hu, Weiming | National Lab. of Pattern Recognition, Inst |
Keywords: Support vector machines and kernel methods, Classification and clustering
Abstract: Training a kernel SVM on large datasets suffers from high computational complexity and requires a large amount of memory. However, a desirable property of SVM is that its decision function is solely determined by the support vectors, a subset of training examples with non-vanishing weights. This motivates a novel efficient algorithm for training kernel SVM via support vector identification. The training algorithm involves two steps. In the first step, we randomly sample the training data without replacement several times, each time drawing a small subset of the training data. A kernel SVM is then trained on each subset, and the resulting kernel SVM models are used to identify the support vectors on the margin. In the second step, an optimization problem is solved to estimate the Lagrange multipliers corresponding to these support vectors. After obtaining the support vectors and Lagrange multipliers, we can approximate the decision function of kernel SVM. Due to the cubic complexity of the standard kernel SVM training algorithm, training many kernel SVMs on small subsets of the training data is much more efficient than training a single kernel SVM on the whole training data, especially for large datasets. Therefore, our algorithm has better scalability than kernel SVM. Moreover, the SVMs on the subsets can be trained independently, so our algorithm can easily be parallelized for further speedup. Since our algorithm only identifies the support vectors on the margin, it produces fewer support vectors than standard kernel SVM, which makes it more efficient in prediction too. Experimental results show that our method outperforms state-of-the-art methods and achieves performance on par with kernel SVM, albeit with much improved efficiency.
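The two-step scheme can be sketched as follows (an illustrative approximation using scikit-learn's SVC as the subset solver and a plain refit on the candidate support vectors, rather than the authors' Lagrange-multiplier estimation step):

```python
import numpy as np
from sklearn.svm import SVC

def fast_svm_by_sv_identification(X, y, n_subsets=5, subset_size=100, seed=0):
    """Sketch of two-step training: kernel SVMs on small random subsets
    identify candidate support vectors, then a final SVM is fit on the
    candidates alone (a stand-in for the paper's second-step optimization)."""
    rng = np.random.default_rng(seed)
    candidates = set()
    for _ in range(n_subsets):
        idx = rng.choice(len(X), size=min(subset_size, len(X)), replace=False)
        svm = SVC(kernel="rbf", gamma="scale").fit(X[idx], y[idx])
        candidates.update(idx[svm.support_])      # support vectors of this subset
    cand = np.fromiter(candidates, dtype=int)
    final = SVC(kernel="rbf", gamma="scale").fit(X[cand], y[cand])
    return final, cand

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)  # nonlinear boundary
model, cand = fast_svm_by_sv_identification(X, y)
acc = model.score(X, y)
```

Each subset fit costs roughly cubic time in the subset size, so several small fits plus one fit on the candidates is far cheaper than one cubic fit on all of the data.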


15:00-17:10, Paper TuPT1.10
Adapting Instance Weights for Unsupervised Domain Adaptation Using Quadratic Mutual Information and Subspace Learning |
Khan, Mohammad Nazmul Alam | Oklahoma State Univ |
Heisterkamp, Douglas | Oklahoma State Univ |
Keywords: Transfer learning, Classification and clustering, 2D/3D object detection and recognition
Abstract: Domain adaptation (DA) algorithms utilize a label-rich old dataset (domain) to build a machine learning model (for classification, detection, etc.) on a label-scarce new dataset with a different data distribution. Recent approaches transform cross-domain data into a shared subspace by minimizing the shift between their marginal distributions. In this paper, we propose a novel iterative method to learn a common subspace based on non-parametric quadratic mutual information (QMI) between data and the corresponding class labels. We extend a prior work on discriminative subspace learning based on the maximization of QMI and integrate instance weighting into the QMI formulation. We propose an adaptive weighting model to identify relevant samples that share underlying similarity across domains and to ignore irrelevant ones. Due to the difficulty of applying cross-validation, an alternative strategy is integrated with the proposed algorithm to set the model parameters. A set of comprehensive experiments on benchmark datasets is conducted to demonstrate the efficacy of our proposed framework over state-of-the-art approaches.


15:00-17:10, Paper TuPT1.11
A Simple Approach for Unsupervised Domain Adaptation |
Guo, Xifeng | National Univ. of Defense Tech |
Chen, Wei | National Univ. of Defense Tech |
Yin, Jianping | National Univ. of Defense Tech |
Keywords: Transfer learning, Machine learning and data mining, Support vector machines and kernel methods
Abstract: Domain adaptation (DA) aims to eliminate the difference between the distribution of the labeled source domain on which a classifier is trained and that of the unlabeled or partly labeled target domain to which the classifier is to be applied. Compared with semi-supervised domain adaptation, where some labeled data from the target domain is utilized to help train the classifier, unsupervised domain adaptation, where no labels can be seen from the target domain, is without doubt more challenging. Most published approaches suffer from a high complexity of design or implementation. In this paper, we propose a simple method for unsupervised domain adaptation that minimizes domain shift by projecting each instance from the source and target domains into a common feature space using a linear kernel function. Our method is extremely simple and has no hyper-parameters (it can be implemented in two lines of Matlab code), yet it still outperforms state-of-the-art domain adaptation approaches on standard benchmark datasets.


15:00-17:10, Paper TuPT1.12
Multi-Task Learning for One-Class SVM with Additional New Features
Xue, Yongjian | Univ. of Tech. of Troyes |
Beauseroy, Pierre | Inst. Charles Delaunay |
Keywords: Transfer learning, Support vector machines and kernel methods, Machine learning and data mining
Abstract: In real applications of one-class classification, new features may be added for practical or technical reasons. When representative samples for the new features are lacking, the multi-task learning idea can be used to bring in information from the former learning model. Based on this assumption, a new multi-task learning approach is proposed to deal with training the updated system when new measurements are added. In the model, a parameter is introduced to control how much information is taken from the former model, and a heuristic search method is established to find a proper value for it. Experiments conducted on toy data and a real dataset show that the new method rapidly decreases the probability of false positives while keeping the probability of false negatives approximately stable as the number of samples for the newly introduced features increases.


15:00-17:10, Paper TuPT1.13
Semantic-Free Attributes for Image Classification |
Oliveau, Quentin | Télécom ParisTech |
Sahbi, Hichem | CNRS, TELECOM ParisTech |
Keywords: Classification and clustering
Abstract: Attributes are defined as mid-level image characteristics shared among different categories. These characteristics are suitable for handling classification problems, especially when training data are scarce. In this paper, we design discriminative real-valued attributes by learning nonlinear inductive maps. Our method is based on solving a constrained optimization problem that mixes three criteria: the first aims to predict real-valued attributes with high precision, the second maximizes their discrimination power, and the third is a particular smoothness term that provides gradual representations for data belonging to the same categories, resulting in highly discriminative and predictable attributes. Experiments conducted on the particular task of fine-grained image classification - with relatively small training sets - show that our attribute design approach is very competitive and outperforms related attribute learning methods.


15:00-17:10, Paper TuPT1.14
Co-Regularized Kernel K-Means for Multi-View Clustering |
Ye, Yongkai | National Univ. of Defense Tech |
Liu, Xinwang | Computer School, National Univ. of Defense Tech |
Yin, Jianping | National Univ. of Defense Tech |
Zhu, En | National Univ. of Defense Tech |
Keywords: Classification and clustering
Abstract: In clustering applications, multiple views of the data are often available. Although clustering could be done within each view independently, exploiting information across views promises to improve clustering accuracy. A common assumption in the field of multi-view learning is that the clustering results from multiple views should be consistent with a latent clustering. However, potential noise in some views makes this assumption difficult to satisfy, which ultimately hurts clustering performance. To address this issue, we propose a novel clustering algorithm in which the intrinsic clustering is found by maximizing the sum of weighted similarities between the clusterings of different views. Weights that indicate the qualities of the views are learned simultaneously along with the latent clustering and the clusterings of the individual views. A three-step alternating algorithm is designed to solve the problem efficiently. Empirical comparisons with a number of baselines on various datasets confirm the efficacy of our approach.


15:00-17:10, Paper TuPT1.15
Latent Model Ensemble with Auto-Localization |
Sun, Miao | Univ. of Missouri |
Han, Tony | Univ. of Missouri |
Xun, Xu | Sony Electronics |
Keywords: Classification and clustering, Deep learning
Abstract: Deep Convolutional Neural Networks (CNNs) have exhibited superior performance in many visual recognition tasks, including image classification, object detection, and scene labeling, due to their large learning capacity and resistance to overfitting. For the image classification task, most current deep CNN-based approaches take the whole size-normalized image as input and have achieved quite promising results. Compared with the previously dominant approaches based on feature extraction, pooling, and classification, deep CNN-based approaches mainly rely on the learning capability of the deep CNN to achieve superior results: the burden of minimizing intra-class variation while maximizing inter-class difference falls entirely on the implicit feature learning component of the deep CNN, and we rely upon the implicitly learned filters and pooling component to select the discriminative regions, which correspond to the activated neurons. However, if irrelevant regions constitute a large portion of the image of interest, the classification performance of a deep CNN that takes the whole image as input can be heavily affected. To solve this issue, we propose a novel latent CNN framework, which treats the most discriminative region as a latent variable. We jointly learn the global CNN with the latent CNN to avoid the aforementioned issue of large irrelevant regions, and our experimental results show the evident advantage of the proposed latent CNN over traditional deep CNNs: the latent CNN outperforms the state-of-the-art performance of deep CNNs on standard benchmark datasets including CIFAR-10, CIFAR-100, MNIST, and the PASCAL VOC 2007 classification dataset.


15:00-17:10, Paper TuPT1.16
Discovering Characteristic Landmarks on Ancient Coins Using Convolutional Networks |
Kim, Jongpil | Rutgers, the State Univ. of New Jersey |
Pavlovic, Vladimir | Rutgers Univ |
Keywords: Classification and clustering, Deep learning
Abstract: We propose a novel method to find characteristic landmarks and recognize ancient Roman imperial coins using deep convolutional neural networks (CNNs) combined with expert-designed domain hierarchies. We first propose a new framework to recognize the Roman coin which exploits the hierarchical knowledge structure embedded in the coin domain, which we combine with the CNN-based category classifiers. We next formulate an optimization problem to discover class-specific salient coin regions. Analysis of discovered salient regions confirms that they are largely consistent with human expert annotations. Experimental results show that the proposed framework is able to effectively recognize the ancient Roman coins as well as successfully identify landmarks in a general fine-grained classification problem. For this research, we have collected a new Roman coin dataset where all coins are annotated and consist of obverse (head) and reverse (tail) images.


15:00-17:10, Paper TuPT1.17
PLSNet: A Simple Network Using Partial Least Squares Regression for Image Classification |
Hasegawa, Ryoma | Meijo Univ |
Hotta, Kazuhiro | Meijo Univ |
Keywords: Classification and clustering, Deep learning
Abstract: PCANet is a simple network using Principal Component Analysis (PCA) for image classification that has obtained high accuracies on a variety of datasets. PCA projects explanatory variables onto a subspace whose first component has the largest variance. On the other hand, Partial Least Squares (PLS) regression projects explanatory variables onto a subspace whose first component has the largest covariance between the explanatory and objective variables, and the objective variables are predicted from that subspace. If class labels are used as the objective variables for PLS, the subspace is suitable for classification. Stacked PLS is a simple network using PLS for image classification that obtained high accuracy on the MNIST database; however, its performance was inferior to PCANet on other datasets. One of the differences between Stacked PLS and PCANet is the network architecture. In this paper, we combine the network architecture of PCANet with PLS and propose a new image classification method called PLSNet. It obtained higher accuracies than PCANet on the MNIST and CIFAR-10 datasets. Furthermore, we change how the filters for extracting features at the second convolution layer are constructed, yielding Improved PLSNet, which obtained higher accuracies than PLSNet. In addition, we give it a deeper network architecture, yielding Deep Improved PLSNet, which obtained higher accuracies than Improved PLSNet.


15:00-17:10, Paper TuPT1.18
Simultaneous Dual-Views Reconstruction with Adaptive Dictionary and Low-Rank Representation |
Yi, Shuangyan | Harbin Inst. of Tech. Shenzhen Graduate School |
He, Zhenyu | Harbin Inst. of Tech. Shenzhen Graduate School |
Li, Yi | Harbin Inst. of Tech. Shenzhen Graduate School |
Cheung, Yiu-ming | Hong Kong Baptist Univ |
Chen, Wen-Sheng | Shenzhen Univ |
Keywords: Classification and clustering, Dimensionality reduction and manifold learning
Abstract: Low-Rank Representation (LRR) is an effective self-expressiveness method, which uses the observed data itself as the dictionary to reconstruct the original data. LRR focuses on representing the global low-dimensional information, but ignores the fact that data often reside on low-dimensional manifolds embedded in a high-dimensional space. Therefore, LRR cannot capture the non-linear geometric structures within the data. As is well known, locality preserving projections (LPP) is able to preserve the intrinsic geometric structure embedded in high-dimensional data. To this end, we treat the data projected by LPP as an adaptive dictionary, and such a dictionary can capture the intrinsic geometric structures of the data. In this way, our method favors the global low-rank representation. Specifically, the proposed method provides a way to reconstruct the original data from two views, and hence we call it Simultaneous Dual-Views Reconstruction with Adaptive Dictionary and Low-Rank Representation. The proposed method can be used for unsupervised feature extraction and subspace clustering. Experiments on benchmark databases show the excellent performance of the proposed method in comparison with other state-of-the-art methods.


15:00-17:10, Paper TuPT1.19
Multi-Label Classification with Meta-Label-Specific Features |
Sun, Lu | Hokkaido Univ |
Kudo, Mineichi | Hokkaido Univ |
Kimura, Keigo | Hokkaido Univ |
Keywords: Classification and clustering, Dimensionality reduction and manifold learning, Machine learning and data mining
Abstract: Multi-label classification has attracted much attention in various fields, such as text categorization and semantic image annotation. Aiming to classify an instance into multiple labels, various multi-label classification methods have been proposed. However, existing methods typically build models in the identical feature (sub)space for all labels, which may be inconsistent with real-world problems. In this paper, we develop a novel method based on the assumption that meta-labels with specific features exist in the multi-label classification scenario. The proposed method consists of meta-label learning and specific feature selection. Experiments on twelve benchmark multi-label datasets show the efficiency of the proposed method compared with several state-of-the-art methods.


15:00-17:10, Paper TuPT1.20
Bus Trajectory Identification by Map-Matching |
Raymond, Rudy | IBM Res. - Tokyo |
Imamichi, Takashi | IBM Brazil |
Keywords: Classification and clustering, Dimensionality reduction and manifold learning, Machine learning and data mining
Abstract: We study the problem of identifying vehicle trajectories from sequences of noisy geospatial-temporal data. Nowadays we witness the accumulation of vehicle trajectory datasets in the form of sequences of GPS points. However, in many cases the sequences of GPS points are sparse and noisy, so that identifying the actual trajectories of vehicles is hard. Although there are many advanced map-matching techniques claiming to achieve high accuracy on this problem, only a few public datasets come with ground-truth trajectories to support the claims. On the other hand, some cities are releasing their bus datasets for real-time monitoring and analytics. Since buses are expected to run on predefined routes, such datasets are highly valuable for map-matching and other pattern recognition applications. Nevertheless, some buses in reality do not follow their predefined routes and behave anomalously. We propose a simple and robust technique based on the combination of map-matching, bag-of-roads, and dimensionality reduction for route identification. Experiments on datasets of buses in the city of Rio de Janeiro confirm the high accuracy of our method.
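The bag-of-roads representation can be sketched as follows (map-matching itself, which turns each GPS trace into a sequence of road-segment IDs, is assumed to have been done already; the road IDs and trips below are illustrative):

```python
import numpy as np

def bag_of_roads(road_id_sequences, n_roads):
    """Bag-of-roads sketch: a trip, given as a sequence of map-matched
    road-segment IDs, becomes a count vector over road segments. Such
    vectors can then be reduced (e.g. by PCA) and matched against the
    vectors of known bus routes."""
    vecs = np.zeros((len(road_id_sequences), n_roads))
    for i, seq in enumerate(road_id_sequences):
        for r in seq:
            vecs[i, r] += 1                   # count visits to each segment
    return vecs

# hypothetical trips over a network of 10 road segments
trips = [[0, 1, 2, 3], [0, 1, 2, 2, 3], [7, 8, 9]]
V = bag_of_roads(trips, n_roads=10)
```

Note how the first two trips, which share a route, produce nearly identical vectors, while the third trip is orthogonal to them.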


15:00-17:10, Paper TuPT1.21
Two-Dimensional PCA Hashing |
Mao, Minqi | Zhejiang Normal Univ |
Zheng, Zl | UCM |
Chen, Zy | ZJNU |
Ye, Rh | ZJNU |
He, Xw | ZJNU |
Keywords: Classification and clustering, Dimensionality reduction and manifold learning, Machine learning and data mining
Abstract: Recently, hashing algorithms have attracted much attention in the field of machine learning. Most existing hashing methods treat an image as a vector, obtained by flattening the image matrix, and adopt projection functions to map the original data to several dimensions of real values. Each of these projected real values is then quantized into a zero-one bit by thresholding, as in locality sensitive hashing (LSH) and principal component analysis hashing (PCAH). However, flattening an image into a long vector may cause the curse of dimensionality. In this paper, unlike the PCAH method, a two-dimensional (2D) image is used directly for feature extraction via two-dimensional principal component analysis (2DPCA), and 2DPCAH performs hashing on the 2DPCA-projected data. Furthermore, starting from the 2DPCA-projected data, we apply iterative quantization, which finds a rotation matrix that minimizes the quantization error of mapping the data to a zero-centered binary hypercube. The experimental results indicate that 2DPCAH, 2DPCA-RR, and 2DPCA-ITQ are competitive compared with traditional PCAH and some other classic methods.
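A minimal sketch of the 2DPCAH idea (an assumption-laden illustration: plain zero thresholding on mean-centred projections, without the 2DPCA-RR or 2DPCA-ITQ rotation step):

```python
import numpy as np

def two_d_pca_hash(images, n_dirs=2):
    """2DPCA hashing sketch: project each image matrix onto the top 2DPCA
    directions, then binarize the mean-centred projections at zero, in the
    spirit of PCAH but without flattening images into long vectors first."""
    mean = images.mean(axis=0)
    centred = images - mean
    G = sum(a.T @ a for a in centred) / len(images)   # image covariance matrix
    vals, vecs = np.linalg.eigh(G)
    W = vecs[:, np.argsort(vals)[::-1][:n_dirs]]      # top eigen-directions
    proj = centred @ W                                # shape (n, h, n_dirs)
    return (proj.reshape(len(images), -1) > 0).astype(np.uint8)

rng = np.random.default_rng(0)
imgs = rng.normal(size=(20, 8, 8))                    # toy 8x8 "images"
codes = two_d_pca_hash(imgs)
```

Working with the h x w matrix directly keeps the covariance matrix at size w x w, whereas flattened PCA would need an (h*w) x (h*w) one.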


15:00-17:10, Paper TuPT1.22
Sampling Based Approximate Spectral Clustering Ensemble for Partitioning Datasets |
Moazzen, Yaser | Istanbul Tech. Univ |
Taşdemir, Kadim | Antalya International Univ |
Keywords: Classification and clustering, Semi-supervised learning and spectral methods
Abstract: Spectral clustering is able to extract clusters with various characteristics without a parametric model; however, it is infeasible for large datasets due to its high computational cost and memory requirements. Approximate spectral clustering (ASC) addresses this challenge with a representative-based partitioning approach, which first finds a set of data representatives either by sampling or by quantization, then applies spectral clustering to them. To achieve an optimal partitioning with ASC, several sampling and quantization methods together with advanced similarity criteria have recently been proposed. While quantization is more accurate than sampling at the expense of heavy computation, and geodesic-based hybrid similarity criteria are often more informative than others, no single choice is optimal for all datasets. Alternatively, we propose to use ensemble learning to produce a consensus partitioning constructed from different sets of representatives and similarity criteria. The proposed ensemble (SASCE) not only produces a more accurate partitioning but also eliminates the need to determine the best pair (the optimum set of representatives and the optimum similarity). Thanks to the efficient similarity definition at the representative level, SASCE can be powerful for clustering small and medium datasets, outperforming traditional clustering approaches and their ensembles.
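One member of such an ensemble, a single quantization-based ASC run, can be sketched as follows (an illustrative pipeline using scikit-learn's KMeans for quantization and SpectralClustering on the representatives; the full SASCE consensus over many such runs is not shown):

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering

def approximate_spectral_clustering(X, n_clusters, n_reps=30, seed=0):
    """ASC sketch: quantize the data to a small set of representatives with
    k-means, run spectral clustering only on those representatives, then
    label every point by its nearest representative."""
    km = KMeans(n_clusters=n_reps, n_init=10, random_state=seed).fit(X)
    reps = km.cluster_centers_
    rep_labels = SpectralClustering(n_clusters=n_clusters, affinity="rbf",
                                    random_state=seed).fit_predict(reps)
    return rep_labels[km.predict(X)]          # propagate labels to all points

# two well-separated blobs (toy data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 2)) for c in (0, 4)])
labels = approximate_spectral_clustering(X, n_clusters=2)
```

The expensive eigendecomposition runs on only n_reps points instead of the full dataset, which is what makes the ensemble over many representative sets affordable.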


15:00-17:10, Paper TuPT1.23
Integrating Hidden Markov Models Based on Mixture-Of-Templates and K-NN2 Ensemble for Activity Recognition |
Kim, Yong-Joong | POSTECH |
Kim, Yonghyun | POSTECH |
Ahn, Juhyun | Pohang Univ. of Science and Tech |
Kim, Daijin | POSTECH |
Keywords: Classification and clustering, Signal, image and video processing, Other applications
Abstract: This paper considers the activity recognition problem using inertial sensor data. It is a challenging temporal pattern recognition problem, as the sensor data can easily be mixed with noise and also exhibits large intra-class variation, resulting from the different characteristics of people performing the same activity, and inter-class similarity among several similar activities. To handle these problems, concentrating on the classification method, this paper proposes a novel ensemble scheme of hidden Markov models for activity recognition. To improve recognition performance, our method models the outputs of multiple hidden Markov models using an enhanced template-based classifier fusion method, in which multiple local templates are generated as a Mixture-of-Templates, and a k-NN2 ensemble method is proposed to recognize activities more precisely. To show the effectiveness of the proposed method, we carried out several experiments on the UCI Human Activity Recognition dataset and compared our method with several alternatives. As a result, our method outperforms the other methods.
|
|
15:00-17:10, Paper TuPT1.24 | |
Weighted K-Nearest Neighbor Revisited |
Bicego, Manuele | Univ. of Verona |
Loog, Marco | Delft Univ. of Tech. / Univ. of Copenhagen |
Keywords: Classification and clustering, Statistical, syntactic and structural pattern recognition
Abstract: In this paper we show that weighted K-Nearest Neighbor, a variation of the classic K-Nearest Neighbor, can be reinterpreted from a classifier combining perspective, specifically as a fixed combiner rule, the sum rule. Subsequently, we experimentally demonstrate that it can be rather beneficial to consider other combining schemes as well. In particular, we focus on trained combiners and illustrate the positive effect these can have on classification performance.
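The abstract's fixed-combiner reading of weighted k-NN can be made concrete: each neighbour casts a soft vote for its class, and the votes are combined by the sum rule. A minimal sketch (not the authors' code), assuming inverse-distance weights, one common choice:

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k=3):
    """Weighted k-NN viewed as a fixed combiner: each of the k nearest
    neighbours contributes its weight as a soft vote for its own class,
    and the class scores are combined by the sum rule."""
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]
    classes = np.unique(y_train)
    scores = np.zeros(len(classes))
    for i in idx:
        w = 1.0 / (d[i] + 1e-12)           # inverse-distance weight
        scores[classes == y_train[i]] += w  # sum rule over neighbour votes
    return classes[np.argmax(scores)]
```

A trained combiner, as the paper advocates, would replace the fixed summation with weights learned from validation data.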
|
|
15:00-17:10, Paper TuPT1.25 | |
A New Geometrical Approach for Solving the Supervised Pattern Recognition Problem |
Valev, Ventzeslav | Inst. of Mathematics and Informatics, Bulgarian Acad. of S |
Yanev, Nicola | Univ. of Sofia and Inst. of Mathematics and Informatics |
Krzyzak, Adam | Concordia Univ
Keywords: Classification and clustering, Statistical, syntactic and structural pattern recognition, Machine learning and data mining
Abstract: This paper explores the supervised pattern recognition problem based on feature partitioning. This formulation leads to a new problem in computational geometry: the supervised pattern recognition problem is formulated as a heuristic good clique cover problem satisfying the k-nearest neighbors rule. First, a heuristic algorithm is applied to partition a graph into a minimal number of cliques. Next, cliques are merged using the k-nearest neighbors rule. An important advantage of this approach is the decomposition of a problem involving l classes into l optimization problems, each involving a single class. The computational complexity of the method, computational procedures, and classification rules are discussed. A geometrical interpretation of the solution is also given. Using the proposed approach, the geometrical structure of the training set is utilized in the best possible way.
|
|
15:00-17:10, Paper TuPT1.26 | |
Hybrid Markov Blanket Discovery |
Gao, Tian | Rensselaer Pol. Inst |
Ji, Qiang | RPI |
Keywords: Machine learning and data mining, Model selection
Abstract: In a Bayesian Network (BN), a target node is independent of all other nodes given its Markov Blanket (MB). By finding the MB, many problems can be solved directly or indirectly. There exist predominantly two approaches to finding the MB: score-based and constraint-based algorithms. We introduce a new Markov Blanket learning algorithm, Hybrid Markov Blanket (HMB) discovery, by combining these two approaches. Specifically, HMB first employs a score-based method to find the parents and children (PC) of the target node. HMB then introduces an efficient constraint-based approach to finding the target node's spouses without enforcing the symmetry constraint required by existing constraint-based methods. In comparison, HMB achieves better accuracy than traditional constraint-based approaches and better efficiency than existing score-based approaches. In addition, HMB is proven theoretically sound and complete. Empirical results on synthetic and standard MB discovery datasets demonstrate the superior performance of HMB.
|
|
15:00-17:10, Paper TuPT1.27 | |
Online Discriminant Projective Non-Negative Matrix Factorization |
Liao, Qing | The Hong Kong Univ. of Science and Tech. |
Zhang, Qian | The Hong Kong Univ. of Science and Tech. |
|
15:00-17:10, Paper TuPT1.28 | |
A Robust UAV Landing Site Detection System Using Mid-Level Discriminative Patches |
Guo, Xufeng | Queensland Univ. of Tech |
Denman, Simon | Queensland Univ. of Tech |
Fookes, Clinton | Queensland Univ. of Tech |
Sridha, Sridharan | Queensland Univ. of Tech |
Keywords: Machine learning and data mining, Scene understanding, Motion, tracking and video analysis
Abstract: The forced landing problem has become one of the main impediments to UAVs entering civilian airspace. Unfortunately, there is no robust forced landing site detection system that will reliably detect a safe landing site. One of the main reasons for this is the difficulty of considering the various classes of surface to determine whether or not they are safe. We propose a robust UAV landing site detection system using mid-level discriminative patches. The training and tuning process uses a dataset containing 1600 randomly selected Google Maps images with weak labels. We then show how the output from multiple mid-level discriminative patch detectors can be combined to indicate the level of danger for a given region. The proposed technique reliably detects safe landing areas in UAV imagery and achieves improved performance over the state of the art. The proposed system outperforms the baseline system by 29.4% in completeness and 33.9% in correctness, and is invariant to changes in illumination, sharpness, and image resolution.
|
|
15:00-17:10, Paper TuPT1.29 | |
Multi-Stage Multi-Task Feature Learning Via Adaptive Threshold |
Fan, Ya-Ru | Univ. of Electronic Science and Tech. of China |
Wang, Yilun | Univ. of Electronic Science and Tech. of China |
Huang, Ting-Zhu | Univ. of Electronic Science and Tech. of China |
Keywords: Model selection, Machine learning and data mining, Classification and clustering
Abstract: Multi-task feature learning aims to identify the features shared among tasks in order to improve generalization. Recent work has shown that non-convex learning models often return better solutions than their convex alternatives. Thus, a non-convex model based on the capped-ℓ1,ℓ1 regularization was proposed in [Gong2013], together with the corresponding efficient multi-stage multi-task feature learning algorithm (MSMTFL). However, this method uses a fixed threshold in the capped-ℓ1,ℓ1 regularization, and the lack of adaptivity may result in suboptimal practical performance. In this paper, we propose to employ an adaptive threshold in the capped-ℓ1,ℓ1 regularized formulation, and the corresponding variant of MSMTFL incorporates an additional scheme to determine the threshold adaptively. Since this threshold aims to distinguish true nonzero components of large magnitude from the others, the heuristic of detecting the "first significant jump" proposed in [Wang2010] is applied to determine its value adaptively. A preliminary theoretical analysis is provided to guarantee the feasibility of the proposed method. Several numerical experiments demonstrate that the proposed method outperforms existing state-of-the-art feature learning approaches.
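The "first significant jump" idea, as used here to pick the capping threshold, amounts to finding where sorted coefficient magnitudes leap from the noise floor to the signal level. One plausible reading, not the authors' exact rule (the `factor` parameter and median-gap reference are our assumptions):

```python
import numpy as np

def first_significant_jump(w, factor=5.0):
    """Pick a threshold separating large (signal) coefficients from
    small (noise) ones: sort magnitudes ascending, find the first gap
    between consecutive values that exceeds `factor` times the median
    gap, and place the threshold inside that gap."""
    m = np.sort(np.abs(w))
    gaps = np.diff(m)
    ref = np.median(gaps) + 1e-12       # typical gap among sorted magnitudes
    big = np.where(gaps > factor * ref)[0]
    if len(big) == 0:
        return m[-1]                    # no clear jump: keep everything capped
    j = big[0]
    return 0.5 * (m[j] + m[j + 1])      # midpoint of the first significant gap
```

Such a data-driven threshold replaces the fixed one in the capped-ℓ1,ℓ1 penalty, adapting to the actual magnitude distribution of the learned coefficients.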
|
|
15:00-17:10, Paper TuPT1.30 | |
Enhancing Label Inference Algorithms Considering Vertex Importance in Graph-Based Semi-Supervised Learning |
Oh, Byonghwa | Hyundai Card Corp |
Yang, Jihoon | Sogang Univ |
Keywords: Semi-supervised learning and spectral methods, Classification and clustering, Machine learning and data mining
Abstract: Graph-based semi-supervised learning has recently come into focus due to its two defining phases: graph construction, which converts the data into a graph, and label inference, which predicts the appropriate labels for unlabeled data using the constructed graph. Label inference is based on the smoothness assumption of semi-supervised learning. In this study, we propose an enhanced label inference approach that incorporates the importance of each vertex into existing inference algorithms to improve their prediction capabilities. We also present extensions of three algorithms that are capable of taking the vertex importance variable into account during learning. Experiments show that our algorithms perform better than the base algorithms on a variety of datasets, especially when the data is less smooth over the graphs.
|
|
15:00-17:10, Paper TuPT1.31 | |
Optimistic Semi-Supervised Least Squares Classification |
Krijthe, Jesse Hendrik | Delft Univ. of Tech |
Loog, Marco | Delft Univ. of Tech. / Univ. of Copenhagen |
Keywords: Semi-supervised learning and spectral methods, Classification and clustering, Statistical, syntactic and structural pattern recognition
Abstract: The goal of semi-supervised learning is to improve supervised classifiers by using additional unlabeled training examples. In this work we study a simple self-learning approach to semi-supervised learning applied to the least squares classifier. We show that a soft-label and a hard-label variant of self-learning can be derived by applying block coordinate descent to two related but slightly different objective functions. The resulting soft-label approach is related to an idea about dealing with missing data that dates back to the 1930s. We show that the soft-label variant typically outperforms the hard-label variant on benchmark datasets and partially explain this behaviour by studying the relative difficulty of finding good local minima for the corresponding objective functions.
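The block coordinate descent described above alternates between fitting the classifier and re-imputing targets for the unlabeled points. A sketch of the idea (not the paper's exact objective; binary 0/1 labels and the clipping rule are our simplifications):

```python
import numpy as np

def self_learn_lsq(Xl, yl, Xu, iters=20, soft=True):
    """Self-learning for the least squares classifier by block
    coordinate descent: (1) fit weights by least squares on labelled
    plus currently imputed unlabelled targets; (2) re-impute the
    unlabelled targets from the fit, clipped to [0,1] for the soft
    variant or rounded to {0,1} for the hard variant."""
    Xl1 = np.hstack([Xl, np.ones((len(Xl), 1))])   # add bias column
    Xu1 = np.hstack([Xu, np.ones((len(Xu), 1))])
    yu = np.full(len(Xu), 0.5)                     # neutral initial targets
    for _ in range(iters):
        X = np.vstack([Xl1, Xu1])
        y = np.concatenate([yl, yu])
        w, *_ = np.linalg.lstsq(X, y, rcond=None)  # step 1: fit classifier
        pred = Xu1 @ w                             # step 2: re-impute targets
        yu = np.clip(pred, 0, 1) if soft else (pred > 0.5).astype(float)
    return w
```

Each step decreases the shared objective, so the alternation converges; the soft and hard variants differ only in the imputation rule, matching the two objective functions the abstract contrasts.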
|
|
15:00-17:10, Paper TuPT1.32 | |
Constrained Local and Global Consistency for Semi-Supervised Learning |
de Sousa, Celso | Univ. De São Paulo |
Batista, Gustavo | Univ. De São Paulo |
Keywords: Semi-supervised learning and spectral methods, Machine learning and data mining
Abstract: One of the widely used algorithms for graph-based semi-supervised learning (SSL) is Local and Global Consistency (LGC). This algorithm can be viewed as a convex optimization problem that balances fitness on labeled examples against smoothness on the graph using a graph Laplacian. In this paper, we provide a novel graph-based SSL algorithm that incorporates two normalization constraints into the regularization framework of LGC. We prove that our method has a closed-form solution and generalizes two existing methods, making it more flexible than the originals. Through experiments on benchmark data sets, we show the effectiveness of our method, which consistently outperforms the competing methods.
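The LGC baseline that this paper constrains has a well-known closed form, F* = (I − αS)⁻¹Y with S = D^(−1/2) W D^(−1/2). A minimal sketch of that baseline (not the constrained method proposed in the paper):

```python
import numpy as np

def lgc(W, Y, alpha=0.9):
    """Local and Global Consistency label inference in closed form:
    F* = (I - alpha*S)^{-1} Y, where S = D^{-1/2} W D^{-1/2} is the
    symmetrically normalized affinity. W is a symmetric affinity matrix
    with zero diagonal; Y holds one-hot rows for labeled points and
    all-zero rows for unlabeled ones."""
    d = W.sum(axis=1)
    Dm = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = Dm @ W @ Dm
    F = np.linalg.solve(np.eye(len(W)) - alpha * S, Y)
    return F.argmax(axis=1)  # predicted class per node
```

The parameter alpha trades off smoothness over the graph against fidelity to the given labels; the paper's contribution is adding normalization constraints to this objective while retaining a closed-form solution.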
|
|
15:00-17:10, Paper TuPT1.32 | |
A Tight Convex Upper Bound on the Likelihood of a Finite Mixture |
Mezuman, Elad | IBM Res |
Weiss, Yair | Hebrew Univ |
Keywords: Model selection
Abstract: The likelihood function of a finite mixture model is a non-convex function with multiple local maxima and commonly used iterative algorithms such as EM will converge to different solutions depending on initial conditions. In this paper we ask: is it possible to assess how far we are from the global maximum of the likelihood? Since the likelihood of a finite mixture model can grow unboundedly by centering a Gaussian on a single datapoint and shrinking the covariance, we constrain the problem by assuming that the parameters of the individual models are members of a large discrete set (e.g. estimating a mixture of two Gaussians where the means and variances of both Gaussians are members of a set of a million possible means and variances). For this setting we show that a simple upper bound on the likelihood can be computed using convex optimization and we analyze conditions under which the bound is guaranteed to be tight. This bound can then be used to assess the quality of solutions found by EM (where the final result is projected on the discrete set) or any other mixture estimation algorithm. For any dataset our method allows us to find a finite mixture model together with a dataset-specific bound on how far the likelihood of this mixture is from the global optimum of the likelihood.
|
|
15:00-17:10, Paper TuPT1.33 | |
Object-Aware Tracking |
Bogun, Ivan | Florida Inst. of Tech |
Ribeiro, Eraldo | Florida Inst. of Tech |
Keywords: Semi-supervised learning and spectral methods, Support vector machines and kernel methods, Motion, tracking and video analysis
Abstract: In this paper, we address the problem of visual tracking in videos without using a pre-learned model of the object. This type of model-free tracking is a hard problem because of limited information about the object, abrupt object motion, and shape deformation. We propose to integrate an object-agnostic prior, called objectness, which is designed to measure the likelihood that a given location contains an object of any type, into a structured tracking framework. Our objectness prior is based on image segmentation and edges; thus, it requires no training data. By extending a structured tracker with the prior, we introduce a new tracker, which we call ObjStruck. We extensively evaluate our tracker on publicly available datasets and show that the objectness prior improves tracking accuracy.
|
|
15:00-17:10, Paper TuPT1.34 | |
Graph Edit Distance As a Quadratic Program |
Bougleux, Sébastien | Normandie Univ. UNICAEN |
Gaüzère, Benoit | Normandie Univ. INSA De Rouen, LITIS |
Brun, Luc | ENSICAEN |
Keywords: Statistical, syntactic and structural pattern recognition
Abstract: The graph edit distance (GED) measures the amount of distortion needed to transform one graph into another. Developed in the context of error-tolerant graph matching, it is one of the most flexible tools used in structural pattern recognition. However, computing the exact GED is NP-complete, so several suboptimal solutions, such as those based on bipartite assignments with edition, have been proposed. In this paper, we propose a binary quadratic programming problem whose global minimum corresponds to the exact GED. This problem is interpreted as a quadratic assignment problem (QAP) in which some constraints have been relaxed. This allows us to adapt the integer projected fixed point algorithm, initially designed for the QAP, to efficiently compute an approximate GED by finding a good local minimum. Experiments show that our method remains quite close to the exact GED on datasets composed of small graphs, while keeping execution times low on datasets composed of larger graphs.
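The bipartite-assignment baseline the abstract mentions can be sketched concisely: nodes of both graphs, padded with dummy rows and columns for insertions and deletions, form a square cost matrix solved optimally as a linear assignment problem. This illustrates the classical approximation the paper improves on, not its QAP formulation, and it ignores edge costs for brevity:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def bipartite_ged(labels1, labels2, c_sub=1.0, c_indel=1.0):
    """Bipartite approximation of GED on node labels only: an
    (n+m) x (n+m) cost matrix pairs each node with a counterpart,
    a deletion slot, or an insertion slot, then one optimal linear
    assignment gives an upper bound on the exact edit distance."""
    n, m = len(labels1), len(labels2)
    big = 1e9                                  # forbid non-designated dummy pairings
    C = np.zeros((n + m, n + m))
    for i in range(n):
        for j in range(m):
            C[i, j] = 0.0 if labels1[i] == labels2[j] else c_sub
    C[:n, m:] = big
    C[n:, :m] = big
    for i in range(n):
        C[i, m + i] = c_indel                  # delete node i of graph 1
    for j in range(m):
        C[n + j, j] = c_indel                  # insert node j of graph 2
    r, c = linear_sum_assignment(C)
    return C[r, c].sum()
```

Because the quadratic (edge) terms are dropped, this runs in polynomial time but only upper-bounds the exact GED; the paper's relaxed QAP keeps those quadratic terms and searches for a good local minimum instead.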
|
|
15:00-17:10, Paper TuPT1.35 | |
Graph Model Boosting for Structural Data Recognition |
Miyazaki, Tomo | Tohoku Univ |
Omachi, Shinichiro | Tohoku Univ |
Keywords: Statistical, syntactic and structural pattern recognition, Active and ensemble learning, Model selection
Abstract: This paper presents a novel method for structural data recognition using a large number of graph models. Broadly, existing methods for structural data recognition have two crucial problems: 1) only a single model is used to capture structural variation; 2) naive classification rules, such as the nearest neighbor method, are used. In this paper, we propose to strengthen both the capture of structural variation and the classification ability. The proposed method constructs a large number of graph models and trains decision tree classifiers with the models. This paper makes two contributions. The first is a novel graph model that can be constructed by straightforward calculation, which enables us to construct many models in feasible time. The second is a novel approach to capturing structural variation: we construct a large number of our models in a boosting framework so that structural variation is captured comprehensively. Consequently, we are able to perform structural data recognition with powerful classification ability and comprehensive coverage of structural variation. In experiments, we show that the proposed method achieves significant results and outperforms existing methods.
|
|
15:00-17:10, Paper TuPT1.36 | |
Unsupervised Automatic Attribute Discovery Method Via Multi-Graph Clustering |
Liu, Liangchen | Univ. of Queensland |
Nie, Feiping | NWPU |
Zhang, Teng | The Univ. of Queensland |
Wiliem, Arnold | The Univ. of Queensland |
Lovell, Brian Carrington | The Univ. of Queensland |
Keywords: Scene understanding, Classification and clustering
Abstract: Recently, various automated attribute discovery methods have been developed to discover useful visual attributes from a given set of images. Despite the progress made, most methods consider the supervised scenario, which assumes the existence of labelled data. Recent results suggest that it is possible to discover attributes from a set of unlabelled data. In this work, we propose a novel unsupervised attribute discovery method utilising a multi-graph approach that preserves both local neighbourhood structure and class separability. We employ multiple similarity graphs that encode various relationships between image exemplars. For evaluation, we first investigate the performance of the proposed approach on a clustering task. Then we apply our method to automatically discover visual attributes and compare it with various automatic attribute discovery and hashing methods. The results show that our proposed method improves performance on the clustering task. Furthermore, when evaluated using the recent meaningfulness metric, the proposed method outperforms the other unsupervised attribute discovery methods.
|
|
15:00-17:10, Paper TuPT1.37 | |
Context Aware Nonnegative Matrix Factorization Clustering |
Tripodi, Rocco | Ca' Foscari Univ |
Vascon, Sebastiano | Univ. Ca' Foscari of Venice |
Pelillo, Marcello | Ca' Foscari Univ |
Keywords: Classification and clustering, Face recognition, Document Understanding
Abstract: In this article we propose a method to refine the clustering results obtained with the nonnegative matrix factorization (NMF) technique by imposing consistency constraints on the final labeling of the data. The research community has focused its effort on the initialization and optimization parts of this method, without paying attention to the final cluster assignments. We propose a game-theoretic framework in which each object to be clustered is represented as a player that has to choose its cluster membership. The information obtained with NMF is used to initialize the strategy space of the players, and a weighted graph is used to model the interactions among them. These interactions allow each player to choose a cluster that is coherent with the clusters chosen by similar players, a property not guaranteed by NMF, since it produces a soft clustering of the data. Results on common benchmarks show that our model is able to improve the performance of many NMF formulations.
|
|
15:00-17:10, Paper TuPT1.38 | |
Finding Rigid Sub-Structure Patterns from 3D Point-Sets |
Chen, Zihe | State Univ. of New York at Buffalo |
Chen, Danyang | State Univ. of New York at Buffalo |
Ding, Hu | State Univ. of New York at Buffalo |
Huang, Ziyun | State Univ. of New York at Buffalo |
Li, Zheshuo | State Univ. of New York at Buffalo |
Sehgal, Nitasha | State Univ. of New York at Buffalo |
Fritz, Andrew | State Univ. of New York at Buffalo |
Berezney, Ronald | State Univ. of New York at Buffalo |
Xu, Jinhui | State Univ. of New York at Buffalo |
Keywords: Statistical, syntactic and structural pattern recognition, Pattern Recognition for Bioinformatics
Abstract: In this paper, we study the following rigid sub-structure pattern reconstruction problem: given a set of n input structures (i.e., point-sets), partition each structure into k rigid sub-structures so that the nk rigid substructures can be grouped into k clusters, each containing exactly one rigid substructure from every input structure, such that the total clustering cost is minimized, where the clustering cost of a cluster is the total distance between a pattern reconstructed for the cluster and every member rigid substructure. Unlike most existing models for pattern reconstruction (where each input point-set is often treated as a single structure), our model views each input point-set as a collection of k rigid substructures, and aims to extract similar rigid substructures from each input point-set to form k rigid clusters. The problem is motivated by an interesting biological application: determining the topological structure of chromosomes inside the cell nucleus. We propose a highly effective and practical solution based on a number of new insights into pattern reconstruction, clustering, and motion detection. We validate our method on synthetic, biological, and motion tracking datasets. Experimental results suggest that our approach yields a near-optimal solution.
|