Let's data mining together: [Research] Paper Review on Time Series Indexation

Discovering Characteristic Actions from On-Body Sensor Data. Minnen, D. Starner, T. Essa, I. Isbell, C. College of Computing, Georgia Institute of Technology, Atlanta, GA 30332 USA. dminn@cc.gatech.edu. 10th IEEE International Symposium on Wearable Computers. Oct. 2006

In this paper, the authors discuss the discovery of human activity from sensor data. Rather than identifying the type of movement, this work aims to discover whether some patterns appear frequently in the data stream. The authors call these patterns motifs, and they are practically the unit motions we discover in our paper on time series. One interesting idea in this paper is the use of a suffix tree to discover human motion in real time from stream data. Although the authors do not exploit this feature (they prefer to use an HMM), I think a suffix tree can help to discover fragments of unit motions.
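
To make the suffix tree idea concrete, here is a minimal sketch of how repeated fragments could be counted in a stream that has already been discretized into symbols. It is not the authors' implementation: it uses a plain, depth-limited suffix trie (an uncompressed stand-in for a real suffix tree), and the names `build_suffix_trie`, `frequent_substrings` and the toy `stream` string are made up for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0  # number of suffixes that pass through this node


def build_suffix_trie(symbols, max_depth):
    """Insert every suffix of `symbols`, truncated to `max_depth` symbols."""
    root = TrieNode()
    for start in range(len(symbols)):
        node = root
        for sym in symbols[start:start + max_depth]:
            node = node.children.setdefault(sym, TrieNode())
            node.count += 1
    return root


def frequent_substrings(root, min_len, min_count):
    """Return {substring: occurrences} for substrings seen at least min_count times."""
    found = {}
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for sym, child in node.children.items():
            # Counts can only shrink deeper in the trie, so rare branches are pruned.
            if child.count >= min_count:
                word = prefix + sym
                if len(word) >= min_len:
                    found[word] = child.count
                stack.append((child, word))
    return found


# Hypothetical symbol stream, e.g. the output of a discretization step.
stream = "abcabdabcaabcd"
trie = build_suffix_trie(stream, max_depth=4)
motifs = frequent_substrings(trie, min_len=3, min_count=3)
print(motifs)
```

Because each new symbol only extends the last `max_depth` suffixes, the same structure could in principle be updated incrementally as the stream arrives, which is what makes it attractive for the real-time case.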

Finding motifs in time series (2002). Jessica Lin, Eamonn Keogh, Stefano Lonardi, Pranav Patel. In the 2nd workshop on temporal data mining, at the 8th ACM SIGKDD international

This is the earlier paper by Lin et al. about finding motifs in time series by using the SAX approach. Here, the authors first present a brute-force algorithm to find motifs and then a refined version that uses a complicated local hashing index of the time series. One thing I found useful in this paper was the assumption of a normal distribution for all the normalized time series. I plan to evaluate the normal probability plot of our time series to check whether we do indeed have a normal distribution. If not, a more general distribution (adaptive to the data) can be used to generate SAX symbols for the time series. In terms of indexing, the authors use the ADM algorithm.
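
As a rough sketch of the point about the normality assumption, the code below discretizes a series in the SAX style twice: once with breakpoints taken from the standard normal distribution, and once with breakpoints taken from the empirical quantiles of the data (the "adaptive" alternative I have in mind). The function names, the alphabet size of 4 and the skewed test series are my own assumptions, not something taken from the paper.

```python
import bisect
import random
import statistics
from collections import Counter


def z_normalize(series):
    mu = statistics.fmean(series)
    sigma = statistics.pstdev(series)
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation; assumes len(series) is a multiple of n_segments."""
    size = len(series) // n_segments
    return [statistics.fmean(series[i * size:(i + 1) * size]) for i in range(n_segments)]


def gaussian_breakpoints(alphabet_size):
    """Cut points that split N(0, 1) into equiprobable regions."""
    nd = statistics.NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def empirical_breakpoints(values, alphabet_size):
    """Cut points taken from the data's own quantiles (no normality assumed)."""
    return statistics.quantiles(values, n=alphabet_size)


def to_symbols(values, breakpoints):
    """Map each value to a letter according to the region it falls in."""
    return "".join(chr(ord("a") + bisect.bisect_right(breakpoints, v)) for v in values)


# Deliberately skewed (exponential) data, so the Gaussian assumption does not hold.
random.seed(0)
raw = [random.expovariate(1.0) for _ in range(400)]
segments = paa(z_normalize(raw), n_segments=40)

gauss_word = to_symbols(segments, gaussian_breakpoints(4))
adapt_word = to_symbols(segments, empirical_breakpoints(segments, 4))
print("gaussian breakpoints :", Counter(gauss_word))
print("empirical breakpoints:", Counter(adapt_word))
```

If the symbol counts under the Gaussian breakpoints are far from balanced, that is a cheap hint (in addition to the normal probability plot) that the normality assumption does not fit the data well.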

Keogh, E., Palpanas, T., Zordan, V. B., Gunopulos, D., and Cardle, M. 2004. Indexing large human-motion databases. In Proceedings of the Thirtieth international Conference on Very Large Data Bases - Volume 30 (Toronto, Canada, August 31 - September 03, 2004). M. A. Nascimento, M. T. Özsu, D. Kossmann, R. J. Miller, J. A. Blakeley, and K. B. Schiefer, Eds. Very Large Data Bases. VLDB Endowment, 780-791.

In this paper, Keogh et al. discuss the advantage of using DTW rather than Euclidean distance when dealing with large databases. Here the concept of uniform scaling is presented as a method that is more precise than using DTW alone. Unfortunately, in this work all the time series have the same length, which is not common in real problems like human motion detection.

From reading these papers, I can see that we have the problem of motif detection: finding unit motions within the time series that represents one video. We also have the problem of partially identifying time series with a suffix tree, and we face specific problems such as unit motions of varying length. I want to know the effect of scaling in amplitude and scaling in length (with Fourier coefficients) on the distance.
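
A first, very small experiment for the scaling question could look like the sketch below. It compares Euclidean distance and a textbook DTW on a sine wave that is scaled in amplitude and stretched in length. Note the assumptions: the stretching is done with plain linear interpolation (not with Fourier coefficients), the DTW has no warping window, and all function names and the test signal are only illustrative.

```python
import math


def euclidean(a, b):
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs series of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Textbook O(len(a) * len(b)) DTW with squared point cost and no window."""
    inf = math.inf
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            # Extend the cheapest of: step in a only, diagonal step, step in b only.
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return math.sqrt(prev[len(b)])


def resample(series, new_len):
    """Stretch or shrink a series (at least 2 points) by linear interpolation."""
    if new_len == 1:
        return [series[0]]
    scale = (len(series) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * scale
        lo = min(int(pos), len(series) - 2)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[lo + 1] * frac)
    return out


base = [math.sin(2 * math.pi * t / 49) for t in range(50)]
taller = [2.0 * x for x in base]   # scaling in amplitude
longer = resample(base, 75)        # scaling in length

print("amplitude x2 : euclid=%.3f dtw=%.3f" % (euclidean(base, taller), dtw(base, taller)))
print("length x1.5  : dtw=%.3f" % dtw(base, longer))
print("rescaled back: euclid=%.3f" % euclidean(base, resample(longer, 50)))
```

Euclidean distance cannot be applied directly to the stretched series, which is why the last line first rescales it back to the original length, in the spirit of uniform scaling.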

Thursday, November 27, 2008
