Object tracking using Radial basis function networks
A. Prem Kumar[a], T. N. Rickesh[b], R. Venkatesh Babu[c], R. Hariharan[d]
Abstract: The applications of visual tracking are broad in scope, ranging from surveillance and monitoring to smart rooms. A robust object-tracking algorithm using Radial Basis Function (RBF) networks has been implemented using OpenCV libraries. Pixel-based color features are used to develop the classifiers. The algorithm has been tested on various video samples under different conditions, and the results are analyzed.
1. Introduction
The objective of tracking is to follow the target object in successive video frames. A major utility of such an algorithm is in the design of video surveillance systems to tackle terrorism. For instance, large-scale surveillance might have played a crucial role in preventing (or tracing the trails of) the 26/11 terrorist attacks in Mumbai, and many bomb blasts in Kashmir, the North-east Indian region, and other parts of India. It is therefore important to have a robust object-tracking algorithm. Since the neural network framework does not require any assumptions about the structure of the input data, neural networks have been used in the fields of pattern recognition, image analysis, etc. The Radial Basis Function (RBF) based neural network is one of many ways to build classifiers. A robust algorithm for object tracking using RBF networks was described in [1]. We have implemented that algorithm using OpenCV libraries so that this module can be integrated into a larger surveillance system.
2. Object Tracking
Object tracking is an important task within the field of computer vision. The growth of high-performance computers, the availability of high-quality yet inexpensive video cameras, and the increasing need for automated video analysis have generated a great deal of interest in object-tracking algorithms. There are three key steps in video analysis: detection of interesting moving objects, tracking of such objects from frame to frame, and analysis of the tracks to recognize their behavior. Object tracking is pertinent in the tasks of:
- Motion-based recognition, that is, human identification based on gait, automatic object detection, etc.
- Automated surveillance, that is, monitoring a scene to detect suspicious activities or unlikely events.
- Video indexing, that is, automatic annotation and retrieval of videos in multimedia databases.
- Human-computer interaction, that is, gesture recognition, eye-gaze tracking for data input to computers, etc.
- Traffic monitoring, that is, real-time gathering of traffic statistics to direct traffic flow.
In its simplest form, tracking can be defined as the problem of estimating the trajectory of an object in the image plane as it moves around a scene. A tracker assigns consistent labels to the tracked objects in different frames of a video. Additionally, depending on the tracking domain, a tracker can also provide object-centric information, such as the orientation, area, or shape of an object. Tracking objects can be complex due to:
- Loss of depth information,
- Noise in images,
- Complex object motion,
- Non-rigid or articulated nature of objects,
- Partial and full object occlusions,
- Complex object shapes,
- Scene illumination changes, and
- Real-time processing requirements.
One can simplify tracking by imposing constraints on the motion and/or appearance of objects. For example, almost all tracking algorithms assume that the object motion is smooth with no abrupt changes. One can further constrain the object motion to be of constant velocity or constant acceleration based on a priori information. Prior knowledge about the number and the size of objects, or the object appearance and shape, can also be incorporated. The foremost factor is the object, its representation, and modeling.
3. Object Representation
Objects can be represented using their shapes and appearances. Here we describe the object shape representations commonly employed for tracking.
- Points. The object is represented by a point, that is, the centroid, or by a set of points. In general, the point representation is suitable for tracking objects that occupy small regions in an image.
- Primitive geometric shapes. Object shape is represented by a rectangle, ellipse, etc. Object motion for such representations is usually modeled by translation, affine, or projective transformation. Though primitive geometric shapes are more suitable for representing simple rigid objects, they are also used for tracking non-rigid objects.
- Object silhouette and contour. Contour representation defines the boundary of an object. The region inside the contour is called the silhouette of the object. Silhouette and contour representations are suitable for tracking complex non-rigid shapes.
4. Object Modeling
The purpose of modeling is to classify whether a chosen pixel belongs to the object or not. Some of the prominent features used for modeling are:
- Templates: Templates are formed using simple geometric shapes or silhouettes.
- Probabilistic densities of object appearance: The probability density estimates of the object appearance can either be parametric, such as a Gaussian or a mixture of Gaussians (for instance, RBF networks), or nonparametric, such as histograms. The probability densities of object appearance features (color, texture) can be computed from the image regions specified by the shape models (interior region of an ellipse or a contour).
- Histogram: This uses the color features of the image. Based on the histogram developed, a pixel can be classified as belonging to the object or not. Under conditions in which the background has a color similar to that of the object, classification can be based on a component color that differentiates object from non-object.
5. Radial Basis Function Networks
A radial basis function network [2] is an artificial neural network that uses radial basis functions as activation functions; its output is a linear combination of radial basis functions. RBF networks are neural nets consisting of three layers. The first, input layer feeds data to a hidden intermediate layer. The hidden layer processes the data and passes it to the output layer. Only the tap weights between the hidden layer and the output layer are modified during training. Each hidden-layer neuron represents a basis function of the output space with respect to a particular center in the input space. The activation function chosen is commonly a Gaussian kernel, centered at the point in the input space specified by the weight vector. The closer the input signal is to the current weight vector, the higher the output of the neuron. Radial basis function networks are commonly used in function approximation and series prediction.
6. Description of Algorithm
6.1 Object-Background Separation
The object is selected, and a white rectangle then marks the object domain. Another box is marked around the first one, enclosing a surrounding region with an equal number of pixels, which is used as the object background. The object and background are thus separated from each other. The R-G-B based joint probability density function (pdf) of the object region and that of the background region are obtained: the region within the inner marked rectangle gives the object pdf, and the marked background region gives the background pdf. The log-likelihood of a pixel considered in the object and background regions is obtained as

    L_i = log( h_o(i) / h_b(i) )

where h_o(i) and h_b(i) are the probabilities of the i-th pixel belonging to the object and the background, respectively.
The pixel is then classified by thresholding this log-likelihood: pixels whose log-likelihood exceeds τ_0 are taken as object pixels, where τ_0 is the threshold.
6.2 Feature Extraction
We use the color features of pixels to develop the RBF based classifier. Applying the classifier to a pixel yields the value −1 or +1: a pixel belonging to the object is assigned +1, and a pixel belonging to the background −1.
6.3 Developing the Object Model
The object model is developed using a radial basis function (RBF) classifier, called the 'Object classifier' or 'Object model'. The object classifier classifies pixels into object or background based on the output produced by the classifier. With a sufficient number of neurons (in the second layer), any function can be approximated to any required level of accuracy. Let µ_i be a d-dimensional real vector and σ_i a d-dimensional positive real vector; let them be the centre and the width of the Gaussian hidden neuron respectively, with α the output weights and N the number of pixels.
The output with k neurons has the following form [1]:

    y(u_i) = Σ_{j=1}^{k} α_j exp( −‖u_i − µ_j‖² / σ_j² )

The above equation can be rewritten in matrix form,

    Ỳ = Y_H α

where Y_H is the matrix representation of the hidden neurons: each row of Y_H contains the hidden-layer responses for one of the inputs U_1, U_2, U_3, …, U_n. The µ and σ values are selected randomly. The output weights are estimated analytically as

    α = (Y_H)† Ỳ

where (Y_H)† is the pseudo-inverse of Y_H.
6.4 Object Tracking
Object tracking is the process of tracing the path of an object from one frame to another in a video sequence. The centroid of the object is calculated from the output of the classifier. In the first frame, where we select the object, we calculate the centroid of the object in that frame. We then proceed to the next frame, where the new centroid for the object is calculated. If the newly calculated centroid is within an ε range (i.e., tolerance) of the previous frame's centroid, it is assigned as the current object centroid, and the tracker proceeds to the next frame.
7. Implementation
This algorithm was implemented in C++ using OpenCV libraries [3]. The code flow is given below.
Fig. 1: Code Flow
8. Results
The algorithm was tested on various video samples. The results are given below, and the problems encountered during the experiments are also noted.
8.1 Likelihood Results
The following figures show the source frames (Fig. 2(a), 3(a)) and their binary images (Fig. 2(b), 3(b)) based on likelihoods.
Fig. 2(a)
Fig. 2(b)
Fig. 3(a)
Fig. 3(b)
8.2 Classifier Results
The following figures show the results of the classifier. The first column (Fig. 4(a), 5(a)) shows the object selection. The second column (Fig. 4(b), 5(b)) shows the corresponding binary images based on likelihoods, and the third set (Fig. 4(c), 5(c)) shows the binary images obtained from the classifier.
8.3 Tracking Results
The following figures show the tracking rectangle around the object (left column of each pair: video frame) and the corresponding binary image from the classifier (right column).
Fig. 6(a)
Fig. 6(b)
Fig. 7(a)
Fig. 7(b)
Fig. 8(a)
Fig. 8(b)
Fig. 9(a)
Fig. 9(b)
Fig. 10(a)
Fig. 10(b)
Fig. 11(a)
Fig. 11(b)
Fig. 6(a), 7(a), 8(a), 9(a), 10(a), and 11(a) correspond to frame numbers 89, 172, 265, 316, 394, and 404 respectively.
8.4 Issues
The problems encountered in the tracking experiment are discussed below.
Fig. 12(c)
Fig. 12(d)
2) Occlusion: When the tracked object (the car in Fig. 13) is completely covered by another element of the surrounding environment (the tree in Fig. 13), the object information is lost, leading to failure of tracking.
Fig. 13(a)
Fig. 13(b)
Fig. 14(a)
Fig. 14(b)
Fig. 15(a)
Fig. 15(b)
9. Conclusions and future enhancements
A robust object-tracking algorithm using Radial Basis Function (RBF) networks has been implemented using OpenCV libraries. Pixel-based color features are used to develop the classifiers. The algorithm has been tested on various video samples under different conditions, and the results are analyzed. The cases where the tracking algorithm fails are also shown, along with possible reasons. The RBF networks could be redesigned to incorporate adaptive mechanisms for light variations, a varying object domain, thresholds, scale changes, and multiple camera feeds.
Acknowledgement: We thank Dr. U. N. Sinha (Head, Flosolver) for his constant encouragement and inspiration. Without his support and guidance, this work would not have been carried out.