Object tracking with dynamic feature modeling and frame watermarking via embedded filters

  • Zhuan Qing Huang

Western Sydney University: Doctoral thesis

Abstract

This thesis investigates the object tracking problem by modeling object features in a video sequence in terms of distinctive color characteristics, or a hybrid of these with spatial correlation and motion properties. It also studies the related ownership protection of individual video frames, and potentially of the object itself, through watermarking via embedded wavelet filters. Tracking an object in a video sequence poses a significant challenge across a wide range of applications, including process control, medical diagnosis, and human-computer interfaces, to name a few. The difficulty of the problem is often directly related to the complexity of the scene the object occupies and its variation within the video, and also depends on the critical feature modeling that underpins the tracking process. We propose in this thesis to model the distinctiveness of a non-rigid object within a moving background by identifying and analyzing discriminating colors, stable appearance, spatial relevance, and motion data, and to develop corresponding methods to track objects in different types of scenes, including a camouflaged object of interest. Many existing tracking methodologies are based on plain object templates, histograms, or detectable object contours. Such template approaches are often susceptible to changes in object shape, illumination, or local appearance, or to partial object occlusion. Some approaches account for certain local appearance changes but require knowing a priori how those changes take place, while others must resort to lengthy and complicated data training to retrieve a stable feature. We thus propose a kernel-based object model that transforms the object appearance into a statistical representation in terms of grouped color probability densities. Quite different from histogram approaches, this model represents the characteristics of groups of pixels with implicit or explicit incorporation of their spatial information.
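The kernel-based grouped-density idea can be sketched as below. This is a minimal illustration rather than the thesis's actual model: the Epanechnikov spatial kernel, the uniform per-channel quantization, and the function name are assumptions made for the sketch.

```python
import numpy as np

def kernel_color_model(roi, bins=8):
    """Build a kernel-weighted color density for an object region.

    Pixels near the region centre get higher weight (Epanechnikov
    kernel), so the model emphasises stable interior appearance over
    easily deformed boundary pixels.  `roi` is an HxWx3 uint8 patch.
    Returns a normalised density over bins**3 color groups.
    """
    h, w, _ = roi.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # normalised squared distance from the region centre
    r2 = ((ys - h / 2) / (h / 2)) ** 2 + ((xs - w / 2) / (w / 2)) ** 2
    weights = np.clip(1.0 - r2, 0.0, None)        # Epanechnikov profile
    # quantise each channel into `bins` groups -> one group index per pixel
    q = (roi.astype(np.int64) * bins) // 256
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    density = np.bincount(idx.ravel(), weights.ravel(), minlength=bins ** 3)
    return density / density.sum()
```

Grouping pixels into kernel-weighted color bins, rather than keeping a full per-pixel template, is what lets the representation tolerate local deformation while still carrying spatial emphasis.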
We differentiate the importance of the selected groups of pixels by identifying distinctive features in terms of stable appearance, undesirable regions of potentially large local appearance change, or areas of heavy object deformation. This framework provides the flexibility to model the object based on its dynamic characteristics in shape and in the distribution of color groups. It is designed to handle effectively a deformable object, local changes, or partial occlusion within a moving background, and it can also greatly reduce computational complexity by requiring a smaller number of statistical samples. We further model the distinctive features by means of the color contrast between the object and its background. We extract sections of the object's color distribution density that stand out distinctively from its local background, and use these to locate the object in newer frames through Bayesian estimation. We then propose to extract dominant elements of the distinctive features by maximizing the difference between the object and its local background through optimal segmentation. The object is located in newer frames via pixel similarity to the extracted dominant elements. In contrast to traditional approaches, in which only certain specific types of objects such as a bright target are applicable, or in which feature selection is based directly on the total difference of the densities, our proposed approach is generic and efficient, dynamically and automatically extracting updated distinctive features as tracking proceeds; such features are explored through different color spaces or their derived properties. One of the most challenging tracking problems arises when the object's color and texture resemble those of the background, and when the object shape and the background also change over the video sequence. Most existing tracking algorithms fail in such a harsh environment, or avoid it altogether.
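The density-contrast idea of keeping only colors where the object stands out from its local background can be illustrated with a simple likelihood-ratio back-projection. The histogram inputs, bin count, suppression rule, and function name here are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def distinctive_backprojection(frame, obj_hist, bg_hist, bins=2, eps=1e-6):
    """Score each pixel of a new frame by how distinctively
    'object-like' its color is.

    `obj_hist` and `bg_hist` are normalised color densities over
    bins**3 groups (object region vs. local background).  Bins where
    the object density does not exceed the background's are suppressed,
    keeping only the distinctive color groups; the surviving ratio is
    then back-projected onto the frame's pixels.
    """
    ratio = obj_hist / (bg_hist + eps)
    ratio[obj_hist <= bg_hist] = 0.0        # keep distinctive bins only
    q = (frame.astype(np.int64) * bins) // 256
    idx = (q[..., 0] * bins + q[..., 1]) * bins + q[..., 2]
    return ratio[idx]                       # per-pixel object score
```

A tracker can then localise the object in the new frame from the score map, e.g. by taking the mode or weighted centroid of the high-scoring region.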
In this regard, we propose an iterative method of Weighted Region Consolidation to track a camouflaged object. We detect the object motion based on both spatial and intensity densities by locating pixels with high motion probabilities in the difference data of successive frames. We then consolidate the object region by weighting the overall neighborhood intensity, and by a contour verification or voting method. In the realm of videos or frame images, watermarks may be inserted for various reasons, including copyright protection of parts of the multimedia data or even the embedding of sporadic object cues for future searching. We propose to watermark frame images by encoding the watermark bits into the choice of the wavelet filters, in complete contrast to the watermarking convention. These filters are selected from different classes in such a way that different sequences of filters lead to sufficiently distinguishable subbands. The proposed scheme is scalable in that the methodology can be used to build a larger or a smaller system, and it is also shown to be robust to noise injection, illumination changes, and some forms of geometric distortion or cropping.
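The first stage of the camouflage-tracking pipeline, motion detection from successive-frame differences followed by neighborhood consolidation, can be sketched as follows. The box-filter consolidation and the threshold values stand in for the thesis's weighted consolidation and are assumptions of this sketch; grayscale frames are assumed.

```python
import numpy as np

def box_mean(a, win):
    """Mean of `a` over a (2*win+1)^2 neighbourhood via an integral image."""
    k = 2 * win + 1
    p = np.pad(a, win, mode='edge')
    s = np.pad(p, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = a.shape
    return (s[k:k+h, k:k+w] - s[:h, k:k+w] - s[k:k+h, :w] + s[:h, :w]) / k**2

def motion_consolidation(prev, curr, thresh=10, win=2):
    """Locate high-motion pixels in the frame difference, then
    consolidate the evidence over each pixel's neighbourhood so that
    scattered detections inside the object merge into one region."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    motion = (diff > thresh).astype(float)   # high-motion pixel mask
    return box_mean(motion, win)             # consolidated motion score
```

The consolidation step matters precisely in the camouflage case: individual difference pixels are sparse and noisy when the object resembles the background, but their neighborhood density still separates the object region from isolated background flicker.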
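The filter-choice watermarking idea, encoding bits in *which* wavelet filter decomposes each level rather than in the coefficients, can be sketched for the encoder side only, in one dimension along image rows. Haar and Daubechies-4 as the two filter classes, the QMF high-pass construction, periodic extension, and all function names are assumptions of this sketch; the thesis's 2-D subband analysis and detector are not reproduced here.

```python
import numpy as np

S2, S3 = np.sqrt(2.0), np.sqrt(3.0)
# two candidate orthogonal analysis low-pass filters; the watermark
# bit value selects which one decomposes a given level
FILTERS = {
    0: np.array([1.0, 1.0]) / S2,                              # Haar
    1: np.array([1 + S3, 3 + S3, 3 - S3, 1 - S3]) / (4 * S2),  # db2
}

def analysis_1d(x, h):
    """One level of 1-D wavelet analysis with low-pass h, periodic
    extension; the high-pass is the quadrature-mirror of h."""
    g = h[::-1] * (-1) ** np.arange(len(h))
    n = len(x)
    lo = np.array([sum(h[k] * x[(2 * i + k) % n] for k in range(len(h)))
                   for i in range(n // 2)])
    hi = np.array([sum(g[k] * x[(2 * i + k) % n] for k in range(len(h)))
                   for i in range(n // 2)])
    return lo, hi

def watermark_rows(image, bits):
    """Decompose successive levels of the row transform, choosing the
    filter for each level from the next watermark bit."""
    approx = image.astype(float)
    details = []
    for b in bits:
        h = FILTERS[b]
        lo = np.apply_along_axis(lambda r: analysis_1d(r, h)[0], 1, approx)
        hi = np.apply_along_axis(lambda r: analysis_1d(r, h)[1], 1, approx)
        details.append(hi)
        approx = lo
    return approx, details
```

Because both candidate filters are orthogonal, the decomposition preserves the image regardless of the bit sequence; the watermark lives only in the statistical signature the chosen filter sequence leaves on the subbands, which is what the detector side would have to distinguish.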
Date of Award: 2009
Original language: English

Keywords

  • image processing
  • object tracking
  • object detection
  • digital video
  • digital techniques
