Tuesday, December 2, 2008

[Research] Distribution of spatiotemporal events to describe movement in video


Since I am taking Applied Spatial Statistics, I am starting to consider a statistical approach to modeling the presence of events in video. Rather than describing Laptev's method, this approach will serve as a mechanism for formally discussing, in the paper, the existence of gradients in human movement.


Gradients will be treated as random variables that can appear at any pixel of the video. I plan to model those events with a probability function, with the goal of finding the maximum likelihood estimates. I am reading about the expectation-maximization algorithm and the book "Interactive Spatial Data Analysis" in order to classify types of movement based on the probability distribution of events.

The idea of using a probability distribution of events in videos can also be found in [1]. The authors use this approach to synchronize video recordings of the same scene taken from different viewpoints. A weak point is that the authors compare only two distributions (histograms), by directly subtracting each position in the histograms.
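To make the bin-by-bin comparison concrete, here is a minimal sketch of subtracting two histograms position by position. The function names and the sample counts are hypothetical, and taking the absolute value of each difference before summing is my assumption, not necessarily what the authors of [1] do.

```python
def normalize(hist):
    """Turn raw event counts into a probability distribution."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram has no events")
    return [count / total for count in hist]


def binwise_distance(h1, h2):
    """Sum of absolute differences between matching bins (assumed metric)."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(h1, h2))


# Hypothetical event counts per bin for two recordings of the same scene.
counts_a = [2, 5, 9, 4]
counts_b = [1, 6, 8, 5]
distance = binwise_distance(normalize(counts_a), normalize(counts_b))
print(distance)
```

A distance like this only compares bins at the same position, so it is sensitive to small shifts between the two distributions.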


I have also defined the "conflictive" spatiotemporal gradients as the ones that fall within [middle line ± 2 * spatial variance]. These events are removed, and I am generating the filtered event sets for all the videos.
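The removal of conflictive gradients can be sketched as a simple band filter. This is only an illustration: I assume each event is an (x, y, t) tuple, that the middle line is horizontal (so the test is on the y coordinate), and that the middle line and the spatial variance are already known. The function name and the sample values are hypothetical.

```python
def remove_conflictive(events, middle_y, spatial_variance, k=2.0):
    """Drop events whose y coordinate lies in [middle_y - k*var, middle_y + k*var].

    events: list of (x, y, t) tuples (assumed layout).
    The band uses the variance itself, as in the definition above,
    not the standard deviation.
    """
    low = middle_y - k * spatial_variance
    high = middle_y + k * spatial_variance
    return [e for e in events if not (low <= e[1] <= high)]


# Hypothetical events: the band is [42, 58] for middle_y=50, variance=4.
events = [(5, 10, 0), (7, 45, 1), (6, 50, 2), (8, 58, 3), (4, 90, 4)]
kept = remove_conflictive(events, middle_y=50, spatial_variance=4)
print(kept)
```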


[1] J. Yan, M. Pollefeys, Video Synchronization via Space-Time Interest Point Distribution, Advanced Concepts for Intelligent Vision Systems, 2004.

Thursday, November 27, 2008

[Research] Paper Review on Time Series Indexation


Discovering Characteristic Actions from On-Body Sensor Data. Minnen, D., Starner, T., Essa, I., and Isbell, C. College of Computing, Georgia Institute of Technology, Atlanta, GA 30332, USA. dminn@cc.gatech.edu. 10th IEEE International Symposium on Wearable Computers, Oct. 2006.


In this paper the authors discuss the discovery of human activity from sensor data. Rather than identifying the type of movement, this work aims to discover whether some patterns appear frequently in the data stream. The authors call these patterns motifs, and they are practically the unit motions we discover in our time series paper. One interesting idea in this paper is the use of a suffix tree to discover human motion in real time from stream data. Although the authors do not exploit this feature (they prefer to use an HMM), I think a suffix tree can help to discover fragments of unit motions.

Finding motifs in time series (2002). Jessica Lin, Eamonn Keogh, Stefano Lonardi, Pranav Patel. In the 2nd workshop on temporal data mining, at the 8th ACM SIGKDD international


This is the earlier paper by Lin et al. about finding motifs in time series using the SAX approach. The authors first present a brute-force algorithm to find motifs and then a refined version that uses a complicated local hashing scheme to index the time series. One thing I found useful in this paper was the assumption that all the normalized time series follow a normal distribution. I plan to evaluate the normal probability plot of our time series to check whether they are indeed normally distributed. If not, a more general distribution (adapted to the data) can be used to generate the SAX symbols for the time series. In terms of indexing, the authors use the ADM algorithm.
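The difference between breakpoints based on the normal assumption and breakpoints adapted to the data can be sketched as follows. This is a simplified illustration of SAX-style symbol generation, not the implementation of Lin et al.: the function names are hypothetical, the series length is assumed to be divisible by the number of segments, and the equal-frequency breakpoints are just one possible data-adaptive alternative.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(v - mu) / sigma for v in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: mean of each equal-width chunk."""
    if len(series) % n_segments:
        raise ValueError("series length must be divisible by n_segments")
    width = len(series) // n_segments
    return [mean(series[i:i + width]) for i in range(0, len(series), width)]


def gaussian_breakpoints(alphabet_size):
    """Cut points that split a standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def empirical_breakpoints(values, alphabet_size):
    """Data-adaptive alternative (assumption): equal-frequency cut points."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // alphabet_size] for i in range(1, alphabet_size)]


def to_symbols(values, breakpoints, alphabet="abcdefghij"):
    """Map each value to the letter of the region it falls in."""
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in values)


# Hypothetical series: a steadily increasing signal.
z = z_normalize([1, 2, 3, 4, 5, 6, 7, 8])
segments = paa(z, 4)
word_gaussian = to_symbols(segments, gaussian_breakpoints(4))
word_empirical = to_symbols(segments, empirical_breakpoints(z, 4))
print(word_gaussian, word_empirical)
```

If the normal probability plot shows strong departures from normality, the two sets of breakpoints would assign noticeably different symbols, which is the case where the adaptive version matters.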


Keogh, E., Palpanas, T., Zordan, V. B., Gunopulos, D., and Cardle, M. 2004. Indexing large human-motion databases. In Proceedings of the Thirtieth international Conference on Very Large Data Bases - Volume 30 (Toronto, Canada, August 31 - September 03, 2004). M. A. Nascimento, M. T. Özsu, D. Kossmann, R. J. Miller, J. A. Blakeley, and K. B. Schiefer, Eds. Very Large Data Bases. VLDB Endowment, 780-791.


In this paper, Keogh et al. discuss the advantage of using DTW rather than Euclidean distance when dealing with large databases. The concept of uniform scaling is presented as a method that is more precise than using DTW alone. Unfortunately, in this work all the time series have the same length, which is not common in real problems like human motion detection.

From reading these papers, I can see that we have the problem of motif detection: finding unit motions within the time series that represents one video. We also have the problem of partially identifying time series with a suffix tree, and we face specific problems like unit motions of varying length. I want to know the effect of scaling in amplitude and scaling in length (with Fourier coefficients) on the distance.
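As a reminder of why DTW is attractive for unit motions of varying length, here is a minimal sketch of the classic dynamic-programming DTW next to the Euclidean distance. It is a textbook version without a warping window or lower bounding, not the indexing method of Keogh et al., and the sample sequences are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance; only defined for sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic time warping cost with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Hypothetical unit motions: same shape, one delayed, one longer.
motion = [0, 1, 2, 1, 0]
delayed = [0, 0, 1, 2, 1]
longer = [0, 0, 1, 2, 1, 0]

print(euclidean(motion, delayed))  # penalizes the one-frame delay
print(dtw(motion, delayed))
print(dtw(motion, longer))         # lengths differ, Euclidean is undefined
```

DTW handles the sequence of different length directly, while the Euclidean distance cannot be computed for it at all without resampling.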

Tuesday, November 25, 2008

[Research] Unit motions within time series (adaptive method)

Unit motions are independent segments of the time series that contain motion, separated by intervals of "silence" (no motion). We had assumed a fixed value for the length of this separation: if the distance between two unit motions is more than 12 frames, we consider them different unit motions. However, this fixed value sometimes gives incorrect unit motions (unit motions were either merged or incorrectly split). I employed statistics to adapt this value to the nature of the time series. First I evaluate the mean u and the variance d of the lengths of the silent segments. Then the minimum distance to consider two unit motions independent is t = u + 0.7 * sqrt(d). I then recompute the unit motions for all the time series and cluster them. The new results are slightly better for some movements, which seems to indicate the importance of setting external parameters with adaptive methods.
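The adaptive threshold can be sketched as follows. This is a simplified illustration, not my actual code: it assumes the time series has already been reduced to one boolean flag per frame (motion or silence), it uses the population variance, and the function names and sample data are hypothetical.

```python
import math
from statistics import mean, pvariance


def motion_runs(active):
    """Return (start, end) pairs, end exclusive, for each run of motion."""
    runs, start = [], None
    for i, flag in enumerate(active):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(active)))
    return runs


def adaptive_threshold(gaps, k=0.7):
    """t = u + k * sqrt(d), with u the mean and d the variance of the gaps."""
    u = mean(gaps)
    d = pvariance(gaps)  # population variance (assumption)
    return u + k * math.sqrt(d)


def unit_motions(active, k=0.7):
    """Merge runs of motion whose separating silence is not longer than t."""
    runs = motion_runs(active)
    if len(runs) < 2:
        return runs
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(runs, runs[1:])]
    t = adaptive_threshold(gaps, k)
    merged = [runs[0]]
    for gap, run in zip(gaps, runs[1:]):
        if gap > t:
            merged.append(run)                     # a new unit motion
        else:
            merged[-1] = (merged[-1][0], run[1])   # same unit motion
    return merged


# Hypothetical flags: silences of 2, 3 and 20 frames between four bursts.
active = ([True] * 3 + [False] * 2 + [True] * 3 + [False] * 3
          + [True] * 3 + [False] * 20 + [True] * 4)
print(unit_motions(active))
```

With these sample gaps the threshold is about 14 frames, so the two short silences are absorbed into one unit motion and only the 20-frame silence separates unit motions.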

The new results are the following:

Movement        Hand-based (%)   Foot-based (%)
Boxing          94.7369          5.2631
Hand clapping   94.7368          5.2632
Hand waving     52.6316          47.3684
Jogging         26.8421          73.1579
Running         21.5790          78.4210
Walking         26.3159          73.6841

And these are the previous ones:

Movement        Hand-based   Foot-based
Boxing          89.5%        10.5%
Hand clapping   89.5%        10.5%
Hand waving     78.9%        21.1%
Jogging         28.8%        71.1%
Running         21.1%        78.9%
Walking         22.3%        77.7%

Saturday, November 22, 2008

[Research] Obtaining Time Series to characterize human movement in video data

I finished the slides for the presentation next Tuesday. There are 15 slides.
I read the final version of the HCI book chapter and fixed errors made by the editor during the revision of the paper.
Time series for all the videos were generated again. These time series do not consider the points that fall in the middle of the human body. I also ran tests to generate time series with different numbers of gradients per video. Our previous approach considered the 20 most important gradients in terms of spatiotemporal variation. I made histograms of the gradients for each video and discovered that taking 10% of all the gradients detected leads to more stable time series.
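The change from a fixed count to a percentage can be sketched like this. It is only an illustration under my own assumptions: each gradient is a (strength, x, y, t) tuple, the 10% kept are the strongest ones, and the function name and sample data are hypothetical.

```python
def strongest_fraction(gradients, fraction=0.10):
    """Keep the given fraction of gradients, strongest first.

    gradients: list of (strength, x, y, t) tuples (assumed layout).
    At least one gradient is always kept.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, round(fraction * len(gradients)))
    ranked = sorted(gradients, key=lambda g: g[0], reverse=True)
    return ranked[:keep]


# Hypothetical video with 20 detected gradients of increasing strength.
gradients = [(float(i), i, i, i) for i in range(20)]
selected = strongest_fraction(gradients)
print([g[0] for g in selected])
```

Unlike a fixed count of 20, the number of gradients kept this way grows with the amount of movement detected in each video.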