|
WePSAT2 |
Multi-Purpose Hall |
Poster Shotgun (08): SS |
Regular Session |
|
08:30-09:00, Paper WePSAT2.1 | |
Motion Blur Free Photometric Stereo Using Correlation Image Sensor |
Kurihara, Toru | Univ. of Tokyo |
Ando, Shigeru | Univ. of Tokyo |
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing, 2D/3D Object Detection and Recognition
Abstract: We developed a motion blur restoration technique for surface orientation images using a correlation image sensor. The system consists of two components: ring-shaped modulated illumination, which encodes surface orientation into the amplitude and phase of the reflected light intensity, and a three-phase correlation image sensor (3PCIS), which demodulates the amplitude and phase of the reflected light. The object motion is formalized by optical flow constraints, which are solved by the weighted integral method (WIM), a direct algebraic method that we developed. WIM is well suited to the correlation image sensor, because the exposure time corresponds to the integral over time in WIM and the reference signals used by the 3PCIS correspond to the weight functions in WIM. Both simulation and experiments demonstrate that modulation imaging with sinusoids can remove motion blur not only in intensity images but also in normal vector maps.
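The demodulation step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a static pixel, reference signals of the form cos(omega*t + 2*pi*k/3), and an exposure of exactly one modulation period, and it omits the optical flow and WIM stages entirely. The function name and parameters are hypothetical.

```python
import cmath
import math


def three_phase_demodulate(signal, omega, period, n_samples=1200):
    """Recover amplitude and phase of a sinusoid from three correlations.

    Assumption: `signal(t)` is B + A*cos(omega*t + phi) and `period` is one
    full modulation period, so the constant offset B integrates to zero.
    """
    dt = period / n_samples
    ref_phases = [2.0 * math.pi * k / 3.0 for k in range(3)]

    # Each output is the temporal correlation of the incoming light with one
    # reference signal, accumulated over the exposure time.
    outputs = []
    for theta in ref_phases:
        acc = 0.0
        for i in range(n_samples):
            t = i * dt
            acc += signal(t) * math.cos(omega * t + theta) * dt
        outputs.append(acc)

    # Combining the three outputs as a complex sum gives (3*A*T/4)*exp(j*phi).
    z = sum(c * cmath.exp(1j * theta) for c, theta in zip(outputs, ref_phases))
    amplitude = abs(z) * 4.0 / (3.0 * period)
    phase = cmath.phase(z)
    return amplitude, phase


if __name__ == "__main__":
    w = 2.0 * math.pi * 50.0
    print(three_phase_demodulate(lambda t: 2.0 + 0.7 * math.cos(w * t + 1.1), w, 1.0 / 50.0))
```

The constant background term drops out because each reference signal has zero mean over a full period, which is why only the modulated component survives in the three outputs.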
|
|
08:30-09:00, Paper WePSAT2.2 | |
Simultaneous Reflectance Estimation and Surface Shape Recovery Using Polarisation |
Zhang, Lichi | Univ. of York |
Hancock, Edwin | Univ. of York |
Keywords: Image and Video Processing, Physics-Based Vision, Vision for Graphics
Abstract: In this paper we develop a practical method for estimating shape and reflectance using only three polarised images. Using polarised light and retro-reflection settings during image acquisition, we separate the diffuse and specular reflectance components using Blind Source Separation without the accurate knowledge of the polariser angle information. Next, we compare the capacities of five chosen reflectance models, and estimate parameters of appropriate models for the two separated components together with their corresponding zenith angles. Finally, we recover surface shape by using a mixture model to match the two zenith angle estimations. We present experiments to demonstrate the validity of the proposed method for a variety of materials, and we show that the proposed method is capable of accurately estimating both shape and reflectance information.
|
|
08:30-09:00, Paper WePSAT2.3 | |
Facial Emotion Recognition in Continuous Video |
Cruz, Alberto | Univ. of California, Riverside |
Bhanu, Bir | Univ. of California |
Thakoor, Ninad | Univ. of California, Riverside |
Keywords: Image and Video Processing, Image and Video Understanding, Human Computer Interaction
Abstract: Facial emotion recognition--the detection of emotion states from video of facial expressions--has applications in video games, medicine, and affective computing. While there have been many advances, an approach has yet to be revealed that performs well on the non-trivial Audio/Visual Emotion Challenge 2011 data set. A majority of approaches still employ single frame classification, or temporally aggregate features. We assert that in unconstrained emotion video, a better classification strategy should model the change in features, versus simply combining them. We compute a derivative of features with histogram differencing and derivative of Gaussians and model the changes with a hidden Markov model. We are the first to incorporate temporal information in terms of derivatives. The efficacy of the approach is tested on the non-trivial AVEC2011 data set and increases classification rates on the data by as much as 13%.
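As a rough illustration of taking a derivative of features over time with a derivative-of-Gaussian filter, the following Python sketch filters one scalar feature track. It is a simplification under stated assumptions: the kernel radius, the normalization, and the function names are hypothetical, and the histogram differencing and hidden Markov model stages from the abstract are not shown.

```python
import math


def dog_kernel(sigma, radius=None):
    """First-derivative-of-Gaussian kernel, scaled so a unit ramp maps to 1."""
    if radius is None:
        radius = max(1, int(3 * sigma + 0.5))
    ks = range(-radius, radius + 1)
    raw = [-k / sigma ** 2 * math.exp(-k * k / (2 * sigma ** 2)) for k in ks]
    # Normalise so that convolving the series f[t] = t yields exactly 1.
    gain = -sum(k * w for k, w in zip(ks, raw))
    return [w / gain for w in raw]


def temporal_derivative(series, sigma=1.0):
    """Smoothed temporal derivative of one feature track (valid region only)."""
    kernel = dog_kernel(sigma)
    radius = len(kernel) // 2
    out = []
    for t in range(radius, len(series) - radius):
        # Discrete convolution: sum over k of kernel[k] * series[t - k].
        out.append(sum(kernel[k + radius] * series[t - k]
                       for k in range(-radius, radius + 1)))
    return out


if __name__ == "__main__":
    ramp = [2.0 * t for t in range(20)]
    print(temporal_derivative(ramp))
```

Under these assumptions, the resulting derivative sequences (one per feature dimension) would be the observations passed to a temporal model.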
|
|
08:30-09:00, Paper WePSAT2.4 | |
A Novel Spatial-Temporal Multi-Scale Method for Detection and Analysis of Infrared Multiple Moving Objects |
Zhang, Tianxu | Huazhong Univ. of Science and Technology, Wuhan |
Li, Hao | Huazhong Univ. of Science and Technology, Wuhan |
Li, Gaofei | Huazhong Univ. of Science and Technology, Wuhan |
Chen, Jianchong | Huazhong Univ. of Science and Technology, Wuhan 430074 |
Keywords: Detection, Separation and Segmentation
Abstract: In this paper, a novel spatial-temporal multi-scale method (STMSM) is proposed to solve the problem of detecting multiple moving objects against a complex background. Moving objects have multi-scale features in both the spatial and temporal domains. The motion salience sub-spaces determine the moving features, including the position, size and trajectory of each moving object, so the problem of detecting moving objects can be transformed into searching for optimal sub-spaces at different scales. This paper proposes a recursive algorithm for estimating motion salience in 3D space and an optimal determinant criterion. These can detect multiple objects at different spatial-temporal scales and extract their features against a complex background. The experimental results show that this method is effective in detecting multiple moving objects.
|
|
08:30-09:00, Paper WePSAT2.5 | |
Classification Oriented Semi-Supervised Band Selection for Hyperspectral Images |
Bai, Jun | Inst. of Automation, Chinese Acad. of Sciences |
Xiang, Shiming | Inst. of Automation, Chinese Acad. of Sciences |
Pan, Chunhong | Inst. of Automation, Chinese Acad. of Sciences |
Keywords: Remote Sensing, Image and Video Processing
Abstract: This paper proposes a new band selection framework for hyperspectral images. The algorithm is designed for classification purposes. In this work, different subsets of bands are selected for different class pairs. Without prior knowledge of a spectral database, we estimate the spectral characteristics of objects using both labeled and unlabeled samples, benefiting from the concept of semi-supervised learning. Under the assumption of a Gaussian mixture model (GMM), the mean vectors and covariance matrices of each class are estimated. The separabilities of all pairs of classes are then calculated on each band, and the bands with the highest separabilities are selected. To validate our band selection result, a support vector machine (SVM) is employed with a one-against-one (OAO) strategy. Experiments are carried out on a real hyperspectral image data set, and the results validate our algorithm.
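The per-class-pair selection idea can be sketched in Python as follows. This is only an illustration under assumptions: the abstract does not name the separability measure, so the Bhattacharyya distance between one-dimensional Gaussians is used as a stand-in; class statistics are computed from labeled samples only, leaving out the semi-supervised GMM estimation; and all function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import fmean, pvariance


def band_stats(samples):
    """Per-band (mean, variance) for one class; `samples` is a list of spectra."""
    bands = list(zip(*samples))
    # A tiny constant keeps the variance strictly positive.
    return [(fmean(b), pvariance(b) + 1e-12) for b in bands]


def bhattacharyya_1d(m1, v1, m2, v2):
    """Bhattacharyya distance between two 1-D Gaussians (assumed measure)."""
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2))))


def select_bands_per_pair(class_samples, k):
    """Return the k most separable band indices for every pair of classes."""
    stats = {label: band_stats(s) for label, s in class_samples.items()}
    selected = {}
    for a, b in combinations(sorted(stats), 2):
        scores = [bhattacharyya_1d(*sa, *sb) for sa, sb in zip(stats[a], stats[b])]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        selected[(a, b)] = ranked[:k]
    return selected


if __name__ == "__main__":
    data = {
        "soil": [[0.10, 0.50, 0.90], [0.20, 0.50, 0.80], [0.15, 0.52, 0.85]],
        "water": [[0.10, 0.10, 0.90], [0.20, 0.12, 0.80], [0.15, 0.09, 0.85]],
    }
    print(select_bands_per_pair(data, k=1))
```

Each one-against-one classifier would then be trained only on the bands selected for its own class pair.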
|
|
08:30-09:00, Paper WePSAT2.6 | |
Key Frame Selection Based on Jensen-Renyi Divergence |
Xu, Qing | Tianjin Univ. |
Keywords: Image and Video Processing
Abstract: The key frame extraction is designed for obtaining a (very) compressed set of video frames that summarizes the essential content of a video sequence. In this paper, a well-known information theoretic measure, the Jensen-Renyi divergence (JRD), is studied to estimate the frame-by-frame distance between consecutive video images, for segmenting shots/subshots and for choosing key frames. Our new key frame extraction method, which is effective and computationally fast, contributes to a good and quick understanding of a large amount of video data.
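For readers unfamiliar with the measure, the Jensen-Renyi divergence of distributions p_1..p_n with weights w_i is H_a(sum_i w_i p_i) - sum_i w_i H_a(p_i), where H_a is the Renyi entropy of order a. The Python sketch below applies it to intensity histograms of consecutive frames. The histogram binning, the order a = 0.5, the threshold, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha != 1) for a discrete distribution."""
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1.0 - alpha)


def jensen_renyi_divergence(dists, weights=None, alpha=0.5):
    """Entropy of the weighted mixture minus the weighted mean of the entropies."""
    n = len(dists)
    if weights is None:
        weights = [1.0 / n] * n
    mixture = [sum(w * d[i] for w, d in zip(weights, dists))
               for i in range(len(dists[0]))]
    return renyi_entropy(mixture, alpha) - sum(
        w * renyi_entropy(d, alpha) for w, d in zip(weights, dists))


def normalized_histogram(pixels, bins=8, max_value=256):
    """Intensity histogram of a frame given as a flat list of integer pixels."""
    counts = [0] * bins
    for v in pixels:
        counts[v * bins // max_value] += 1
    return [c / len(pixels) for c in counts]


def shot_boundaries(frames, threshold, bins=8):
    """Indices i where the divergence between frames i-1 and i exceeds threshold."""
    hists = [normalized_histogram(f, bins) for f in frames]
    return [i for i in range(1, len(hists))
            if jensen_renyi_divergence([hists[i - 1], hists[i]]) > threshold]


if __name__ == "__main__":
    frames = [[10] * 16, [12] * 16, [200] * 16, [205] * 16]
    print(shot_boundaries(frames, threshold=0.1))
```

For orders between 0 and 1 the Renyi entropy is concave, so the divergence is non-negative and equals zero when the compared histograms are identical.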
|
|
08:30-09:00, Paper WePSAT2.7 | |
Arbitrarily Oriented Text Detection Using Geodesic Distances between Corners and Skeletons |
Zhang, Yong | Sun Yat-sen University |
Lai, Jian-huang | Sun Yat-sen Univ. |
Keywords: Detection, Separation and Segmentation, Scene Understanding, Segmentation, Color and Texture
Abstract: This paper proposes a corner and skeleton based method for arbitrarily oriented text detection. By calculating the minimum moment of inertia of each candidate text region, we firstly obtain the orientation and minimum bounding box of each connected component. Secondly, based on the fact that corners are frequent and essential patterns in text regions, we propose a geodesic distance between corner and the skeleton of text regions to measure the effective distance between corner and text. Finally, a geodesic distances weighted corner saturation parameter is given to determine which candidate regions are the true text regions. Experimental results in ICDAR 2003 database show that the proposed method can handle the natural scene text of both horizontal and nonhorizontal orientation.
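The first step, finding the orientation of a connected component from its axis of minimum moment of inertia, can be sketched with second-order central moments. The Python code below is a generic illustration, not the authors' code: the function names are hypothetical, the component is assumed to be given as a list of (x, y) pixel coordinates, and the box is simply the extent of the points along the recovered axis and its normal.

```python
import math


def principal_orientation(points):
    """Angle (radians) of the axis of minimum moment of inertia."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    # Standard closed form for the axis of least second moment.
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)


def oriented_extent(points):
    """(length, width) of the box aligned with the principal axis."""
    theta = principal_orientation(points)
    c, s = math.cos(theta), math.sin(theta)
    along = [x * c + y * s for x, y in points]     # projection on the axis
    across = [-x * s + y * c for x, y in points]   # projection on the normal
    return max(along) - min(along), max(across) - min(across)


if __name__ == "__main__":
    diagonal = [(i, i) for i in range(10)]
    print(principal_orientation(diagonal), oriented_extent(diagonal))
```

In image coordinates the y axis usually points down, so the sign of the returned angle depends on the coordinate convention in use.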
|
|
08:30-09:00, Paper WePSAT2.8 | |
A Spectral Reflectance Representation for Recognition and Reproduction |
Ratnasingam, Sivalogeswaran | NICTA |
Robles-Kelly, Antonio | NICTA |
Keywords: Coding and Compression, Image and Video Processing, Image and Video Understanding
Abstract: In this paper we present a method to recover a spectral representation for reproduction and recognition on multispectral imagery. To do this, we commence by viewing the spectra in the image as a mixture which can be expressed in terms of the sample mean and a set of basis vectors and weights. This treatment leads to a MAP approach where the sample mean is given by the centers yielded by the k-means clustering algorithm and the basis vectors are the eigenvectors of the corresponding covariance matrix. We compute the weights using a linear programming approach. We illustrate the utility of the method for skin recognition and spectra reconstruction.
|
|
08:30-09:00, Paper WePSAT2.9 | |
Context-Aware Horror Video Scene Recognition Via Cost-Sensitive Sparse Coding |
Ding, Xinmiao | China Univ. of Mining and Tech. |
Li, Bing | National Lab. of Pattern Recognition, Inst. of Automa |
Hu, Weiming | National Lab. of Pattern Recognition, Inst. |
Xiong, Weihua | Omnivision Corp. |
Wang, Zhenchong | China Univ. of Mining and Tech. |
Keywords: Image and Video Understanding
Abstract: Along with the ever-growing Web, horror video sharing through the Internet has interfered with our daily life and affected our health, especially children's. Most current research on horror video filtering pays more attention to the extraction of global features or the selection of an optimal classifier, while neglecting the underlying contexts in a scene. In this paper, a novel cost-sensitive sparse coding (CSC) model is proposed to address the context inside a scene and the interrelations between audio-visual features simultaneously. The model essentially includes two aspects: one is to construct an inner contextual structure among frames from the same scene based on a graph; the other is to extend the classic sparse coding technique into a cost-sensitive sparse coding model for graph pattern classification as well as audio-visual feature fusion through a graph kernel. Experiments on various video scenes demonstrate that our method's performance is superior to that of other existing algorithms.
|
|
08:30-09:00, Paper WePSAT2.10 | |
Pan-Sharpening Using Weighted Red-Black Wavelet |
Liu, Qingjie | Beihang Univ. |
Wang, Yunhong | Beihang Univ. |
Zhang, Zhaoxiang | Beihang Univ. |
Liu, Lining | Beihang Univ. |
Keywords: Remote Sensing
Abstract: In this paper, we propose a new method for remote sensing image pan-sharpening based on the weighted red-black (WRB) wavelet and adaptive principal component analysis (PCA), where the adaptive PCA is used to reduce spectral distortions and the WRB wavelet is used to extract the spatial details in PAN images. To reduce the artifacts and spectral distortions in the pan-sharpened images, which are caused by local instabilities and dissimilarities in the PAN and MS images, a local processing strategy incorporating detail enhancement is introduced. The proposed method is tested on two datasets, both acquired by QuickBird, and compared with existing methods. Experimental results show that our method can provide promising fused MS images at a high spatial resolution.
|
|
08:30-09:00, Paper WePSAT2.11 | |
A Color Chart Detection Method for Automatic Color Correction |
Minagawa, Akihiro | Fujitsu Lab. Ltd. |
Katsuyama, Yutaka | Fujitsu Lab. Ltd. |
Takebe, Hiroaki | Fujitsu Lab. Ltd. |
Hotta, Yoshinobu | Fujitsu Lab. Ltd. |
Keywords: Detection, Separation and Segmentation, Segmentation, Color and Texture, 2D/3D Object Detection and Recognition
Abstract: A wide range of cameras is now used around the world in both indoor and outdoor settings. Although precise color reproducibility is an important requirement, the colors of captured images often seem to be different from those in the original scenes. To evaluate color reproducibility problems, the Macbeth color chart, as shown in Fig. 1, is frequently used. In conventional color correction processes, a color chart is first placed near a target and an image including the chart is captured. Image colors are then corrected based on the color deviations of patches on the chart from their true colors. However, since the location of the color chart is not readily apparent in such images, color analysis and correction are normally done manually. In this paper, an automatic and fast color chart detection method is proposed. In the conventional approach to detecting a color chart, a special chart is used to determine the direction. The difficulty in detecting the chart lies in the fact that its position is unknown and its size within an image is not very large. In addition, in an uncorrected image, the chart colors may deviate from their true values. To detect a color chart precisely and automatically, we propose a colored pixel spotting approach that relies on the color array in the chart. Using this array information as a constraint leads to a low computational cost, which makes this method suitable for use in many types of cameras. The effectiveness of this algorithm is confirmed using a 167-image dataset that includes several sizes and rotations of color charts placed at arbitrary locations.
|
|
08:30-09:00, Paper WePSAT2.12 | |
Single Image Super-Resolution Using Gaussian Mixture Model |
He, HuaYong | School of Information Science and Tech. The computer applica |
Li, JianHong | School of Information Science and Technology The computer applicatio |
Luo, Xiaonan | Sun Yat-sen Univ. |
Keywords: Image and Video Processing
Abstract: In this paper we present a novel method for single image super-resolution (SR). Given the input low-resolution image, we create a pyramid pair: the ground truth pyramid and the interpolated pyramid. Our method aims to model the relationship between a pixel value in the ground truth pyramid and its corresponding 8-neighborhood vector in the interpolated pyramid using a Gaussian Mixture Model (GMM). Each pixel in the final high-resolution image is predicted from its corresponding 8-neighborhood vector through the trained GMM. Unlike prior example-based SR methods, our algorithm uses only the information in the input image rather than an external image database. Our proposed algorithm achieves much better results than many state-of-the-art algorithms in terms of both PSNR and visual perception.
|
|
08:30-09:00, Paper WePSAT2.13 | |
Multi Scale Multi Structuring Element Top-Hat Transform for Linear Feature Detection |
Xiangzhi, Bai | Beihang Univ. |
Fugen, Zhou | Beihang Univ. |
Bindang, Xue | Beihang Univ. |
Keywords: Detection, Separation and Segmentation
Abstract: To efficiently detect all possible linear features, an algorithm based on a multi scale multi structuring element top-hat transform is proposed in this paper. The algorithm is divided into two parts: the multi scale multi structuring element top-hat transform and post-processing. In the transform, multiple structuring elements at multiple scales of increasing size are used by the top-hat transform to extract the useful information of linear features. In post-processing, the detected linear feature regions are first binarized. Then, the small noise regions are removed. After that, the remaining linear feature regions are thinned to form the final binary detected linear features. Experimental results show that the proposed algorithm can efficiently detect all possible linear features in different types of images and could be widely used for linear feature detection in different applications.
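A small pure-Python sketch may help make the transform concrete. It assumes flat line-shaped structuring elements in four directions, a white top-hat (image minus its opening), and a pixel-wise maximum to combine scales and directions; the paper may use different elements and a different combination rule, and the noise removal and thinning steps are not shown. All names are hypothetical.

```python
def line_offsets(length, direction):
    """Offsets of a flat line structuring element centred on the origin."""
    half = length // 2
    dy, dx = direction
    return [(k * dy, k * dx) for k in range(-half, half + 1)]


def _morph(img, offsets, pick):
    """Erosion (pick=min) or dilation (pick=max); out-of-image offsets are skipped."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[y + dy][x + dx] for dy, dx in offsets
                    if 0 <= y + dy < h and 0 <= x + dx < w]
            out[y][x] = pick(vals)
    return out


def white_top_hat(img, offsets):
    """Image minus its morphological opening (erosion followed by dilation)."""
    opened = _morph(_morph(img, offsets, min), offsets, max)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(img, opened)]


def multiscale_top_hat(img, lengths=(3, 5),
                       directions=((0, 1), (1, 0), (1, 1), (1, -1))):
    """Pixel-wise maximum of top-hats over all scales and directions."""
    result = [[0] * len(img[0]) for _ in img]
    for length in lengths:
        for direction in directions:
            th = white_top_hat(img, line_offsets(length, direction))
            result = [[max(a, b) for a, b in zip(r1, r2)]
                      for r1, r2 in zip(result, th)]
    return result


def binarize(img, threshold):
    """First post-processing step: threshold the response map."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


if __name__ == "__main__":
    image = [[9 if x == 3 else 0 for x in range(7)] for _ in range(7)]
    for row in binarize(multiscale_top_hat(image), 5):
        print(row)
```

A thin bright line survives an opening with an element aligned to it but is removed by an opening with an element that crosses it, so the crossing elements are the ones that make the line appear in the top-hat response.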
|
|
08:30-09:00, Paper WePSAT2.14 | |
Joint Multi-Frame Super-Resolution and Matting |
Prabhu, Sahana | IITM |
Ambasamudram, Rajagopalan | Indian Inst. of Tech. Madras |
Keywords: Enhancement, Restoration and Filtering
Abstract: Matting and super-resolution of frames from an image sequence have been studied independently in the literature. We propose a unified formulation to solve both inverse problems by assimilating matting within the super-resolution model. We adopt a multi-frame approach which uses data from adjacent frames to increase the resolution of the matte as well as foreground.
|
|
08:30-09:00, Paper WePSAT2.15 | |
Locally Linear Embedding Based Example Learning for Pan-Sharpening |
Liu, Qingjie | Beihang Univ. |
Liu, Lining | Beihang Univ. |
Wang, Yunhong | Beihang Univ. |
Zhang, Zhaoxiang | Beihang Univ. |
Keywords: Remote Sensing
Abstract: In this paper, a novel example based method is proposed to solve the remote sensing pan-sharpening problem, utilizing an implicit non-parametric learning framework. The high resolution (HR) and downsampled panchromatic (PAN) images are used to train the high/low resolution patch pair dictionaries. Based on the perspective of locally linear embedding (LLE), every patch in each multi-spectral (MS) image band is modeled by its K nearest neighbors in patch set generated from low resolution (LR) PAN image, and this model can be generalized to the HR condition. The intended HR MS patch is reconstructed from the corresponding neighbors in HR PAN patches. Finally, the HR MS images are recovered by stitching these patches together. Two datasets of images acquired by QuickBird satellite are used to test the performance of the proposed method. Experimental results show that the proposed method performs well in preserving spectral information as well as spatial details.
|
|
08:30-09:00, Paper WePSAT2.16 | |
Envelope Extraction for Composite Shapes for Shape Retrieval |
Song, Jianguo | Inst. of Computer Science & Tech. |
Lu, Xiaoqing | Peking Univ. |
Ling, Haibin | Temple Univ. |
Wang, Xiao | Peking Univ. |
Tang, Zhi | Peking Univ. |
Keywords: Multimedia Analysis, Indexing and Retrieval, Image and Video Processing
Abstract: The analysis of composite shapes has recently received an increasing amount of research attention. Unlike a silhouette, a composite shape rarely contains a complete envelope. In this paper, we propose a novel envelope extraction algorithm based on the Delaunay triangulation for composite shapes. By analyzing the spatial relationship among individual components of contours and their concavities, we establish new models to describe the envelope edges and their corresponding local enclosed regions. These new models are then used to extract accurate envelopes for composite shapes. We then apply the extracted envelopes to improve shape classification used in shape retrieval. The experimental results show that our algorithm effectively boosts existing shape retrieval algorithms.
|
|
08:30-09:00, Paper WePSAT2.17 | |
Image Super-Resolution by Structural Sparse Coding |
Ren, Jie | Peking Univ. |
Liu, Jiaying | Peking Univ. |
Wang, Mengyan | Peking Univ. |
Guo, Zongming | Peking Univ. |
Keywords: Image and Video Processing, Enhancement, Restoration and Filtering
Abstract: Sparsity-based super-resolution has attracted lots of attention. Due to the high dimensionality of image data, sparsity-based methods are often in a patch-wise manner and simply impose the smoothness constraints on the overlapped regions between reconstructed patches. However, the imposed smoothness constraint is commonly weak to regularize super-resolution problem when the observed low-resolution image loses structure information. In this paper, we propose to improve the performance of the sparsity-based method by incorporating the structural correlations between neighboring patches. Concretely, the structural information is contained by the dictionary atoms which are used to sparsely represent the image patches. Incorporating the correlations of dictionary atoms into the basic sparse coding, a structural sparse coding algorithm is proposed. Experimental results demonstrate that the proposed algorithm outperforms the sparsity-based baseline in both objective and subjective quality.
|
|
08:30-09:00, Paper WePSAT2.18 | |
Estimation of the Human Performance for Pedestrian Detectability Based on Visual Search and Motion Features |
Wakayama, Masashi | Nagoya Univ. |
Deguchi, Daisuke | Nagoya Univ. |
Doman, Keisuke | Nagoya Univ. |
Ide, Ichiro | Nagoya Univ. |
Murase, Hiroshi | Nagoya Univ. |
Tamatsu, Yukimasa | DENSO Corp. |
Keywords: Image and Video Understanding, Scene Understanding, Cognitive and Embodied Vision
Abstract: This paper proposes a method for estimating the human performance of pedestrian detectability from in-vehicle camera images in order to warn a driver of the positions of pedestrians at an appropriate time. By introducing features related to visual search and motion of the target, the proposed method estimates the detectability of pedestrians accurately. Support Vector Regression (SVR) is used to estimate the detectability. Here, SVR is trained using features calculated by the proposed method with the ground truth obtained through experiments with human subjects. From experiments using in-vehicle camera images, we confirmed that the proposed features were effective for estimating the detectability of pedestrians.
|
|
08:30-09:00, Paper WePSAT2.19 | |
A Hybrid Approach for Artificial Urdu Text Detection in Video Images |
Jamil, Akhtar | COMSATS Inst. of Information Tech. |
Abidi, Ali | National Univ. of Sciences & Tech. |
Siddiqi, Imran | Bahria Univ. |
Arif, Fahim | National Univ. of Sciences & Tech. |
Keywords: Image and Video Processing, Image and Video Understanding, Multimedia Analysis, Indexing and Retrieval
Abstract: The rapid growth of multimedia data containing rich textual information demands efficient indexing and retrieval techniques. In this paper, we propose a hybrid approach based on a combination of supervised and unsupervised techniques for the detection of horizontally aligned artificial Urdu text appearing in video images. First, we use an unsupervised approach to detect potential text regions which are later validated by a supervised method. In the first step, edge features followed by morphological operations are used to identify the candidate text regions. These regions are further refined by using edge density and geometrical filters. In the next step, these detected text regions are validated by an Artificial Neural Network which is trained on example text and non-text regions. The effectiveness of the proposed system is evaluated on a dataset of 500 images, with promising results.
|
|
08:30-09:00, Paper WePSAT2.20 | |
Image Super-Resolution Based on Locality-Constrained Linear Coding |
Taniguchi, Kazuki | Ritsumeikan Univ. |
Han, Xian-Hua | Ritsumeikan Univ. |
Iwamoto, Yutaro | Ritsumeikan Univ. |
Sasatani, So | Ritsumeikan Univ. |
Chen, Yen-wei | Ritsumeikan Univ. |
Keywords: Enhancement, Restoration and Filtering
Abstract: This paper presents a learning-based method called image super-resolution (SR) for generating a high-resolution (HR) image from a single low-resolution (LR) image. Recent research investigated the image SR problem using sparse coding, which is based on good reconstruction of any image local patch by a sparse linear combination of atoms from an overcomplete dictionary. However, sparse-coding-based SR (ScSR) generally takes a significant amount of computational time to compute an HR image. Further, it can yield only a global dictionary D = [Dh; Dl] by jointly training the concatenated HR and LR image local patches, which results in no accurate correspondence between the HR and LR dictionaries. Therefore, we propose the generation of an HR image using a linear combination of several anchor points (codes) for a local patch based on locality-constrained linear coding (LLC), which is a fast implementation of local coordinate coding (LCC). In the proposed LLC-based strategy, each local patch is represented by a weighted linear combination of its nearer codes in a predefined codebook, and the linear weights become its local coordinate coding. Experimental results show that the recovered HR images with our proposed approach can achieve comparable performance at a processing time much shorter than those of conventional methods.
|
|
08:30-09:00, Paper WePSAT2.21 | |
Color Maximal-Dissimilarity Pattern for Pedestrian Detection |
Wang, Qingyuan | Graduate Univ. of Chinese Acad. of Sciences |
Pang, Junbiao | Beijing University of Tech. |
Liu, Guoyi | NEC Lab. China |
Qin, Lei | Inst. of Computing Tech. Chinese Acad. C |
Huang, Qingming | Chinese Acad. of Sciences |
Jiang, Shuqiang | Chinese Acad. of Sciences |
Keywords: Detection, Separation and Segmentation
Abstract: Features play an important role in pedestrian detection, and considerable progress has been made on shape-based descriptors. However, color cues have rarely been applied to detection tasks, seemingly due to the variable appearance of pedestrians. In this paper, the Color Maximal-Dissimilarity Pattern (CMDP) is proposed to encode color cues by two core operations, i.e., oriented filtering and max-pooling, which emulate the functions of the primary visual cortex (V1). Extensive experimental results reveal that the biologically explainable encoding scheme increases the invariance of color cues and outperforms the state-of-the-art color descriptor in terms of both accuracy and speed.
|
|
08:30-09:00, Paper WePSAT2.22 | |
A Tracking Based Fast Online Complete Video Synopsis Approach |
Sun, Lei | Tsinghua Univ. Beijing, China |
Xing, Junliang | Tsinghua Univ. |
Ai, Haizhou | Tsinghua Univ. China |
Lao, Shihong | OMRON Social Solutions Co., LTD |
Keywords: Multimedia Analysis, Indexing and Retrieval, Motion, Tracking and Video Analysis, Image and Video Processing
Abstract: By segmenting moving objects out and then densely stitching them into background frames, video synopsis provides an efficient way to condense long videos while preserving most activities. Existing video synopsis methods, however, often suffer from either a high computation cost due to global energy minimization or an unsatisfactory condense rate chosen to avoid the loss of important object activities. To address these problems, a tracking based fast online video synopsis approach is proposed in this paper, which makes the following three main contributions: 1) an online formulation of the video synopsis problem which makes the approach very fast and scalable to endless surveillance videos with reduced chronological disorders, 2) a tracking based schema which can preserve most object activities, and 3) a complete optimization process over both temporal and spatial redundancies of the video which results in a much higher condense rate and a lower object conflict rate. Experimental results demonstrate the effectiveness and efficiency of the proposed approach compared to the traditional method on public surveillance videos.
|
|
08:30-09:00, Paper WePSAT2.23 | |
Sorted Dominant Local Color for Searching Large and Heterogeneous Image Databases |
Vidal, Marcio | Federal Univ. of Amazonas |
Cavalcanti, João | Federal Univ. of Amazonas |
Silva de Moura, Edleno | Federal Univ. of Amazonas |
da Silva, Altigran | Univ. Federal do Amazonas |
Torres, Ricardo | Inst. of Computing, Univ. of Campinas |
Keywords: Multimedia Analysis, Indexing and Retrieval, Features and Image Descriptors
Abstract: Recent work on Content-Based Image Retrieval (CBIR) has presented alternative methods for fast image indexing and retrieval using Bags of Visual Words (BoVW). In such methods, images are represented as sets of visual words, which can be indexed and searched using well-known text retrieval techniques, allowing fast search on large image databases. In this paper we propose a novel method based on BoVW that improves over current methods by using a new kind of local color descriptor, which we call SDLC, that encodes the most predominant color occurrences in blocks of different image regions. We report results of experiments we performed with two publicly available image databases. The results indicate that the use of SDLC led to a quite competitive CBIR method in comparison to the state-of-the-art.
|
|
08:30-09:00, Paper WePSAT2.24 | |
Indexed Heat Curves for 3D-Model Retrieval |
El Khoury, Rachid | TELECOM Lille1 |
Vandeborre, Jean-Philippe | Univ. of Lille 1 |
Daoudi, Mohamed | TELECOM Lille1 |
Keywords: Multimedia Analysis, Indexing and Retrieval, Pattern Recognition for Search, Retrieval and Visualization, Classification and Clustering
Abstract: 3D-model analysis plays an important role in numerous applications. In this paper, we present an approach for 3D-model retrieval by creating an index of closed curves in R3 generated from the center of a 3D-model, using a commute time mapping function. Our mapping function respects important properties in order to compute robust closed curves. Each curve describes a small region of the 3D-model. To describe the whole mesh, we compute a set of indexed closed curves. These curves yield a descriptor that is invariant to different transformations. Then we compute the distance between models by comparing the indexed curves. In order to evaluate our method, we used shapes from the SHREC 2012 database. The results show the robustness of our method on various classes of 3D-models with different positions.
|
|
08:30-09:00, Paper WePSAT2.25 | |
Nonlocal Processing of 3D Colored Point Clouds |
François, Lozes | Univ. de Caen Basse-Normandie GREYC UMR 6072 |
Elmoataz, Abderrahim | Univ. de Caen Basse-Normandie |
Lezoray, Olivier | Univ. de Caen Basse-Normandie |
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing
Abstract: In this paper we present a methodology for nonlocal processing of 3D colored point clouds using regularization of functions defined on weighted graphs. To adapt it to nonlocal processing of 3D data, a new definition of patches for 3D point clouds is introduced and used for nonlocal filtering of 3D data such as colored point clouds. Results illustrate the benefits of our nonlocal approach to filter noisy 3D colored point clouds (either on spatial or colorimetric information).
|
|
08:30-09:00, Paper WePSAT2.26 | |
Speech Emotion Recognition Based on Kernel Reduced-Rank Regression |
Wenming, Zheng | Southeast Univ. |
Zhou, Xiaoyan | Nanjing Univ. of Information Science & Tech. |
Keywords: Speech and Audio Analysis
Abstract: Emotion recognition from speech has been a very active research topic in pattern recognition. In this paper, we investigate the use of the kernel reduced-rank regression (KRRR) model to address the problem of emotion recognition from speech. KRRR is a nonlinear extension of the linear reduced-rank regression (RRR) model via the kernel trick, in which a kernel mapping is used for the multivariable of RRR. To find the optimal kernel for KRRR, a kernel optimization algorithm is also proposed in the paper. To evaluate the performance of the proposed method, we conduct extensive experiments on the Berlin emotional database. The experimental results confirm the effectiveness of the proposed method.
|
|
08:30-09:00, Paper WePSAT2.27 | |
Context-Aware Learning for Automatic Sports Highlight Recognition |
Ghanem, Bernard | King Abdullah Univ. of Science and Tech. |
Kreidieh, Maya | American Univ. of Beirut |
Farra, Marc | American Univ. of Beirut |
Zhang, Tianzhu | Advanced Digital Sciences Center of Illinois |
Keywords: Multimedia Analysis, Indexing and Retrieval, Image and Video Understanding, Image and Video Processing
Abstract: Video highlight recognition is the procedure in which a long video sequence is summarized into a shorter video clip that depicts the most "salient" parts of the sequence. It is an important technique for content delivery systems and search systems which create multimedia content tailored to their users' needs. This paper deals specifically with capturing highlights inherent to sports videos, especially for American football. Our proposed system exploits the multimodal nature of sports videos (i.e. visual, audio, and text cues) to detect the most important segments among them. The optimal combination of these cues is learned in a data-driven fashion using user preferences (expert input) as ground truth. Unlike most highlight recognition systems in the literature that define a highlight to be salient only in its own right (globally salient), we also consider the context of each video segment w.r.t. the video sequence it belongs to (locally salient). To validate our method, we compile a large dataset of broadcast American football videos, acquire their ground truth highlights, and evaluate the performance of our learning approach.
|
|
08:30-09:00, Paper WePSAT2.28 | |
No Reference Measurement of Contrast Distortion and Optimal Contrast Enhancement |
Xu, Hongteng | Shanghai Jiao Tong Univ. |
Zhai, Guangtao | Shanghai Jiao Tong Univ. |
Yang, Xiaokang | Shanghai Jiao Tong Univ. |
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing, Pattern Recognition for Art, Cultural Heritage and Entertainment
Abstract: In this paper, a novel histogram-based model for contrast enhancement is proposed. Based on our analysis of the relationship between histogram and contrast, we establish a model which 1) achieves contrast enhancement by an optimal transform of the histogram, and 2) gives two metrics, called contrast gain and nonlinearity of transform, to measure the strength of enhancement and the severity of distortion caused by enhancement, respectively. The ratio of the two proposed metrics not only gives guidance for configuring the parameter of the algorithm, but also provides a useful measurement of contrast distortion, which is a potential way to judge whether the contrast of an image is optimal. Experimental results show the superior performance of the proposed algorithm in image enhancement.
|
|
08:30-09:00, Paper WePSAT2.29 | |
Enhanced Semantic Descriptors for Functional Scene Categorization |
Zen, Gloria | Univ. of Trento |
Rostamzadeh, Negar | Univ. of Trento |
Staiano, Jacopo | Univ. of Trento |
Ricci, Elisa | Univ. of Perugia |
Sebe, Nicu | Univ. of Trento |
Keywords: Image and Video Understanding, Image and Video Processing, Detection, Separation and Segmentation
Abstract: In this work we present a novel approach which combines semantic information with low-level features extracted from a complex video scene. The proposed method for video scene understanding relies on a bag-of-words approach, in which, typically, visual words contain information about local motion, but information regarding what generated such motion is discarded. Instead, in our framework, the semantic information is embedded in the visual words, which allows a semantic categorization of the scene to be obtained automatically. We show the effectiveness of our method in a traffic analysis scenario: in this case, two main semantic classes, pedestrians and vehicles, are discovered.
|
|
08:30-09:00, Paper WePSAT2.30 | |
Active Contours Segmentation with Edge Based and Local Region Based |
Srikham, Manassanan | Chulalongkorn Univ. |
Keywords: Detection, Separation and Segmentation, Image and Video Processing
Abstract: In this paper, we propose a novel active contour method for image segmentation, which utilizes the advantages of the GAC and the LRAC methods. We consider the smoothing force of the GAC method and the local region-based force of the LRAC method. The advantages of our method are as follows. First, the proposed method uses a new region-based signed pressure force function, which can efficiently stop the contours at weak boundaries. Second, the proposed method can handle objects with heterogeneous texture and is able to reach into deep concave shapes. Finally, the proposed formulation can be easily implemented by a simple finite difference scheme and is computationally more efficient and accurate. The proposed method has been applied to both synthetic and real images.
|
|
08:30-09:00, Paper WePSAT2.31 | |
Robust Detection of Adventitious Lung Sounds in Electronic Auscultation Signals |
Sakai, Tomoya | Nagasaki Univ. |
Kato, Madoka | Nagasaki Univ. |
Miyahara, Sueharu | Nagasaki Univ. |
Kiyasu, Senya | Nagasaki Univ. |
Keywords: Detection, Separation and Segmentation, Speech and Audio Analysis, Enhancement, Restoration and Filtering
Abstract: We present a sparse representation-based method for detecting adventitious lung sounds in low-quality auscultation signals. Since the noise cannot be represented sparsely by any bases, we can extract clear breath sounds and adventitious sounds from noisy electronic auscultation signals via the sparse representation. Using these clear sound components, we measure the level of abnormality, and robustly detect adventitious sounds with pulsating waveforms, a.k.a. crackles. We have experimentally confirmed that our detection achieves an average precision of about 85 percent regardless of noise level.
|
|
08:30-09:00, Paper WePSAT2.32 | |
A Classwise Supervised Ordering Approach for Morphology Based Hyperspectral Image Classification |
Courty, Nicolas | Univ. of Bretagne Sud |
Aptoula, Erchan | Okan Univ. |
Lefèvre, Sébastien | Univ. of South Brittany |
Keywords: Remote Sensing, Classification and Clustering, Machine Learning and Data Mining
Abstract: We present a new method for the spectral-spatial classification of hyperspectral images, by means of morphological features and manifold learning. In particular, mathematical morphology has proved to be an invaluable tool for the description of remote sensing images. However, its application to hyperspectral data is problematic, due to the absence of a complete lattice structure at higher dimensions. We address this issue by following up previous experimental indications on the interest of classwise orderings. The practical interest of the proposed approach is shown through comparison on the Pavia dataset with Extended Morphological Profiles, against which it achieves superior results.
|
|
08:30-09:00, Paper WePSAT2.33 | |
Unsupervised People Organization and Its Application on Individual Retrieval from Videos |
Hao, Pengyi | Waseda Univ. |
Kamata, Sei-ichiro | Waseda Univ. |
Keywords: Multimedia Analysis, Indexing and Retrieval, Image and Video Processing, Image and Video Understanding
Abstract: In this paper, a method named histogram intersection metric learning from scene tracks is proposed for automatically organizing people in videos. We make the following contributions: (i) learning a histogram intersection distance instead of a Mahalanobis distance for widely used face features; (ii) learning the metric from scene tracks without manually labeling any examples, which enables learning across large variations in pose, expression, occlusion and illumination with a small number of face pairs and can distinguish different people effectively. We first test face identification, track clustering, and people organization on a long film; then individual retrieval based on people organization is evaluated on a large video dataset, demonstrating significantly increased search quality with respect to previous approaches.
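
As a rough illustration of the quantity this abstract builds on, the following Python sketch computes a plain (unlearned) histogram intersection between two feature histograms. The function names and the L1 normalisation are assumptions made for this example; the metric learning from scene tracks described in the abstract is not shown.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima after L1 normalisation."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    h1 = h1 / h1.sum()  # assumes each histogram has a positive total
    h2 = h2 / h2.sum()
    return float(np.minimum(h1, h2).sum())

def histogram_intersection_distance(h1, h2):
    """Distance derived from the similarity: 0 for identical normalised histograms."""
    return 1.0 - histogram_intersection(h1, h2)
```

Identical histograms give a distance of 0 and histograms with no overlapping bins give a distance of 1.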
|
|
08:30-09:00, Paper WePSAT2.34 | |
Sparse Representation of Audio Features for Sputum Detection from Lung Sounds |
Yamashita, Tatsuya | Gifu Univ. |
Tamura, Satoshi | Gifu Univ. |
Hayashi, Kenji | Gifu Univ. |
Nishimoto, Yutaka | Gifu Univ. |
Hayamizu, Satoru | Gifu Univ. |
Keywords: Detection, Separation and Segmentation, Speech and Audio Analysis, Speech and Audio Processing
Abstract: Medical staff need to check for sputum accumulation in a patient's respiratory tract by lung sound auscultation at any time, which is a heavy burden for the staff. This paper aims to develop a system that notifies the medical staff of the appropriate timing for tracheal suction by analyzing the lung sounds of patients. We present a novel framework for automatic sputum detection from lung sounds. We propose a sparse representation of audio features to achieve robust detection in real environments. We show the effectiveness of our proposed method for three patients in an ICU of Gifu University Hospital, where the recorded lung sounds included electronic beeps, human voices, and various other noises.
|
|
08:30-09:00, Paper WePSAT2.35 | |
Circular Object Detection Method Based on Separability and Uniformity of Feature Distributions Using Bhattacharyya Coefficient |
Niigaki, Hitoshi | Nippon Telegraph and Telephone Corp. |
Shimamura, Jun | NTT Corp. |
Morimoto, Masashi | NTT Corp. |
Keywords: Detection, Separation and Segmentation, Low-Level Vision, Features and Image Descriptors
Abstract: This paper proposes a robust detection method for circular objects in noisy images with inhomogeneous contrast. This method detects circular objects not by the difference in image intensities between the object interior and its surroundings, but by the separability and uniformity of the image intensity distributions as calculated by the Bhattacharyya coefficient. The proposed method can detect obscure and textured circular objects, both of which are challenges for conventional methods. In addition, this method does not incur the cost of texture learning. Experiments demonstrate the effectiveness and robustness of the proposed method.
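
To make the role of the Bhattacharyya coefficient concrete, here is a minimal Python sketch that compares the intensity histogram inside a candidate circle with that of a surrounding ring. The helper `ring_separability`, its parameters, and the choice of 16 bins are hypothetical and chosen only for illustration; the uniformity term mentioned in the abstract is not modelled.

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """BC = sum_i sqrt(p_i * q_i) for two histograms, normalised to sum to 1."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))

def ring_separability(image, cy, cx, r_in, r_out, bins=16):
    """1 - BC between the disc interior and the surrounding ring (hypothetical helper)."""
    image = np.asarray(image, dtype=float)
    yy, xx = np.indices(image.shape)
    d2 = (yy - cy) ** 2 + (xx - cx) ** 2
    inside = image[d2 <= r_in ** 2]
    ring = image[(d2 > r_in ** 2) & (d2 <= r_out ** 2)]
    # Shared bin range; the small offset keeps the range valid for flat images.
    value_range = (float(image.min()), float(image.max()) + 1e-9)
    h_in, _ = np.histogram(inside, bins=bins, range=value_range)
    h_ring, _ = np.histogram(ring, bins=bins, range=value_range)
    return 1.0 - bhattacharyya_coefficient(h_in, h_ring)
```

A score near 1 means the two intensity distributions barely overlap, even if their mean intensities are similar, which is the property the abstract relies on for textured objects.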
|
|
08:30-09:00, Paper WePSAT2.36 | |
Adaptive Support-Window Approximation to Bilateral Filtering |
Lin, Guo-Shiang | Da-Yeh Univ. |
Chen, Chun-Yu | National Chung Cheng Univ. |
Kuo, Chun-Ting | National Chung Cheng Univ. |
Lie, Wen-Nung | National Chung Cheng Univ. |
Liu, Kai-Che | Medical Image Res. Department, Asian Inst. of TeleSurger |
Keywords: Image and Video Processing
Abstract: In this paper, a computation-efficient adaptive support-window scheme is proposed to approximate conventional bilateral filtering. The difference is that the pixel-wise weights in the bilateral filter are thresholded to be only 0 or 1. This results in an adaptive support window, depending on the local image structure around the anchor pixel. A cross-based algorithm is devised to obtain the adaptive support window. Experiments show that both noise removal and edge preservation can also be achieved using our proposed filter. By computing integral images during data aggregation, our algorithm is capable of achieving constant-time complexity regardless of the shape of the support window. Experiments demonstrate that our proposed computing scheme can reduce execution time by up to 98% with respect to the traditional bilateral filter.
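
The idea of thresholding bilateral weights to 0 or 1 can be sketched in Python as below. This is a brute-force illustration under assumed parameters (`radius`, `tau`) over a fixed square window; it does not implement the cross-based window construction or the integral-image acceleration mentioned in the abstract, so it is not constant-time.

```python
import numpy as np

def binary_weight_filter(image, radius=2, tau=0.1):
    """Average each pixel with the window neighbours whose intensity is within tau of it."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    total = np.zeros_like(img)
    count = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            # Weight is 1 if the neighbour is similar to the anchor pixel, else 0.
            mask = np.abs(shifted - img) <= tau
            total += np.where(mask, shifted, 0.0)
            count += mask
    # count >= 1 everywhere because the anchor pixel always passes the test.
    return total / count
```

Because dissimilar neighbours receive zero weight, a sharp step edge passes through unchanged while small fluctuations within a region are averaged out.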
|
|
08:30-09:00, Paper WePSAT2.37 | |
Multiple-Food Recognition Considering Co-Occurrence Employing Manifold Ranking |
Matsuda, Yuji | The Univ. of Electro-Communications, Tokyo |
Yanai, Keiji | The Univ. of Electro-Communications, Tokyo |
Keywords: Multimedia Analysis, Indexing and Retrieval, Image and Video Understanding, Pattern Recognition for Search, Retrieval and Visualization
Abstract: In this paper, we propose a method to recognize food images which include multiple food items, considering co-occurrence statistics of food items. The proposed method employs a manifold ranking method which has been applied successfully to image retrieval in the literature. In the experiments, we prepared co-occurrence matrices of 100 food items using various kinds of data sources, including Web texts, Web food blogs and our own food database, and evaluated the final results obtained by applying manifold ranking. The results show that co-occurrence statistics obtained from a food photo database are very helpful for improving the classification rate within the top ten candidates.
|
|
08:30-09:00, Paper WePSAT2.38 | |
Spatiotemporal Saliency Based on Distributed Opponent Oriented Energy |
Zhou, Yue | Shanghai Jiaotong Univ. Inst. of image processing& pat |
Shi, Kun | Shanghai Jiao Tong Univ. |
Keywords: Image and Video Processing, Detection, Separation and Segmentation, Enhancement, Restoration and Filtering
Abstract: A computational saliency model utilizing bio-inspired features for spatiotemporal saliency is presented in this paper. We first propose distributed opponent oriented energy for compact local dynamic texture description, motivated by the Human Vision System. Then, we integrate the derived motion characterization and a revised self-resemblance saliency framework. The high effectiveness and efficiency of the proposed method are extensively demonstrated, both qualitatively and quantitatively, for background subtraction in the cases of extremely dynamic scenes and camera jitter. In terms of the trade-off between accuracy and computation cost, our method achieves competitive results compared with the state-of-the-art algorithm.
|
|
08:30-09:00, Paper WePSAT2.39 | |
Image Enhancement by Wavelet Multi-Scale Edge Statistics |
Liew, Alan Wee-Chung | Griffith Univ. |
Jo, Jun | Griffith Univ. |
Chun, Yong-Sik | Korea Aerospace Res. Inst. |
Tae-Hong, Ahn | Chonnam Tech. Univ. |
Tae Byong, Chae | Korea Aerospace Res. Inst. |
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing
Abstract: The distribution of wavelet modulus maxima across wavelet scales can be used to characterize edges in an image. In this paper, we present a novel algorithm that performs image enhancement by mapping the distribution of the wavelet modulus maxima of the blurred image to that of a generic sharp image. Experimental results confirm that the proposed algorithm is able to perform image enhancement without introducing unpleasant visual artifacts.
|
|
08:30-09:00, Paper WePSAT2.40 | |
A Semi-Lagrangian Scheme for Area Preserving Flow |
Carlini, Elisabetta | Sapienza Univ. di Roma |
Ferretti, Roberto | Univ. di Roma Tre |
Keywords: Enhancement, Restoration and Filtering, 2D/3D Object Detection and Recognition, Pattern Recognition for Search, Retrieval and Visualization
Abstract: We propose a new Semi-Lagrangian scheme for the area-preserving Mean Curvature flow. The model uses a level set framework to propagate a closed planar curve. The corresponding flow was proposed by Sapiro and Tannenbaum. The scheme has the advantage of allowing large time steps while still maintaining good accuracy. We apply the algorithm to recover the original shape of a synthetic image to which artificial noise has been added. The numerical result shows that the area is preserved during the filtering process.
|
|
08:30-09:00, Paper WePSAT2.41 | |
Enhancement and Noise Reduction of Very Low Light Level Images |
Zhang, Xiangdong | Xidian Univ. |
Shen, Peiyi | Xidian Univ. |
Luo, Lingli | Xidian Univ. |
Zhang, Liang | Xidian Univ. |
Song, Juan | Xidian Univ. |
Keywords: Image and Video Processing, Enhancement, Restoration and Filtering, Low-Level Vision
Abstract: A general method for image contrast enhancement and noise reduction is proposed in this paper. The method is developed especially for enhancing images acquired under very low light conditions, where the features of images are nearly invisible and the noise is serious. By applying an improved and effective image de-haze algorithm to the inverted input image, the intensity can be amplified so that the dark areas become bright and the contrast is enhanced. Then, the joint-bilateral filter with the original green component as the edge image is introduced to suppress the noise. Experimental results validate the performance of the proposed approach.
|
|
08:30-09:00, Paper WePSAT2.42 | |
Visual Attention Region Determination for H.264 Videos |
Hu, Kang-Ting | National Chung Cheng Univ. |
Leou, Jin-Jang | National Chung Cheng Univ. |
Hsiao, Han-Hui | National Chung Cheng Univ. |
Keywords: Image and Video Processing, Enhancement, Restoration and Filtering
Abstract: In this study, a visual attention region determination approach for H.264 videos using spatiotemporal features is proposed. After Gaussian filtering in Lab color space, the phase spectrum of Fourier transform (PFT) is used to generate the spatial saliency map of each video frame. On the other hand, the motion vector fields from an H.264 video bitstream are backward accumulated and the phase spectrum of Fourier transform (PFT) is used to obtain the temporal saliency map of each video frame. Then, the spatial and temporal saliency maps of each video frame are combined to obtain its spatiotemporal saliency map using adaptive fusion. Finally, a visual attention region determination scheme is used to determine visual attention regions (VARs) of each video frame. Based on the experimental results obtained in this study, the performance of the proposed approach is better than that of two comparison approaches.
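
The phase spectrum of Fourier transform (PFT) step named in this abstract can be sketched in Python as follows, assuming a single-channel input. The function name is an assumption for this example, and the Gaussian filtering, Lab color handling, motion-vector accumulation, and adaptive fusion described above are all omitted.

```python
import numpy as np

def pft_saliency(gray):
    """Saliency from the phase spectrum only: |IFFT(exp(i * phase(FFT(I))))|^2."""
    g = np.asarray(gray, dtype=float)
    spectrum = np.fft.fft2(g)
    # Discard the magnitude and keep only the phase of every frequency component.
    phase_only = np.exp(1j * np.angle(spectrum))
    reconstruction = np.fft.ifft2(phase_only)
    return np.abs(reconstruction) ** 2
```

In practice the resulting map is usually smoothed before use; a temporal map could be obtained in the same way by passing a motion magnitude field instead of an intensity image.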
|
|
08:30-09:00, Paper WePSAT2.43 | |
Node Localization in Unsynchronized Time of Arrival Sensor Networks |
Burgess, Simon | Lund Univ. |
Kuang, Yubin | Lund Univ. |
Astroem, Kalle | Lund Univ. |
Keywords: Remote Sensing
Abstract: We present a method for solving the previously unstudied problem of localizing a set of receivers and directions from transmitters placed far from the receivers, measuring unsynchronized time of arrival data. The same problem is present in node localization of microphone and antenna arrays. The solution algorithm using 5 receivers and 9 transmitters is extended to the overdetermined case in a straightforward manner. Degenerate cases are shown to be when i) the measurement matrix has rank 4 or less or ii) the directions from the transmitters to the receivers lie on an intersection between the unit sphere and another quadric surface. In simulated experiments we explore how sensitive the solution is with respect to different degrees of far field approximations of the transmitters and with respect to noise in the data. Using real data we get a reconstruction of the receivers with a relative error of 14%.
|
|
08:30-09:00, Paper WePSAT2.44 | |
Video Summarization Using Simple Action Patterns |
Aydemir, M. Said | Yildiz Tech. Univ. |
Ergul, Ugur | Yildiz Tech. Univ. |
Guclu, Adem | Yildiz Tech. Univ. |
Karsligil, M. Elif | Yildiz Tech. Univ. |
Keywords: Image and Video Processing, Scene Understanding, Pattern Recognition for Surveillance and Security
Abstract: Video summarization, whose applications range from information retrieval to data compression, plays a crucial role in multimedia understanding. In recent years, with the explosion in the number of videos and their areas of use, video summarization has become a necessity. Therefore, this work introduces a novel approach to the summarization problem which is based on human movement understanding. The proposed system provides efficient video knowledge extraction, especially for surveillance cases. Human-centric videos are analyzed with histogram of oriented gradients as the feature extractor and optical flow as the motion descriptor. On top of these, a template matching algorithm is implemented in a shrinkable and stretchable manner to search for periodicity and thereby extract patterns. Summarization is built on the validation of these extracted patterns with a correlation-based search-through subsystem.
|
|
08:30-09:00, Paper WePSAT2.45 | |
A Probabilistic Framework for Logo Detection and Localization in Natural Scene Images |
Roy, Ankush | Indian Statistical Inst. |
Garain, Utpal | Indian Statistical Inst. |
Keywords: Multimedia Analysis, Indexing and Retrieval, Detection, Separation and Segmentation, 2D/3D Object Detection and Recognition
Abstract: This paper presents a probabilistic approach for logo detection and localization in natural scene images. Two probability distributions are computed: the first considers the features extracted from the key points located inside a region, and the second refers to the shape geometry defined by the key points. The barycentric coordinates are used to define the shape statistics. The performance of the proposed approach is reported on two publicly available datasets. Logo detection is tested on BelgaLogos, where a statistically significant improvement is achieved over two recently proposed methods. Logo localization efficiency has been tested on Flickr Logos 27.
|
|
08:30-09:00, Paper WePSAT2.46 | |
Guided Inpainting and Filtering for Kinect Depth Maps |
Liu, Junyi | Zhejiang Univ. |
Gong, Xiaojin | Zhejiang Univ. |
Liu, Jilin | Zhejiang Univ. |
Keywords: Enhancement, Restoration and Filtering, Image and Video Processing
Abstract: Depth maps captured by Kinect-like cameras lack depth data in some areas and suffer from heavy noise. These defects have negative impacts on practical applications. In order to enhance the depth maps, this paper proposes a new inpainting algorithm that extends the original fast marching method (FMM) to reconstruct unknown regions. The extended FMM incorporates an aligned color image as the guidance for inpainting. An edge-preserving guided filter is further applied for noise reduction. To validate our algorithm and compare it with other existing methods, we perform experiments on both the Kinect data and the Middlebury dataset, which provide qualitative and quantitative results, respectively. The results show that our method is efficient and superior to others.
|