MINING TIMED SEQUENCES TO FIND SIGNATURES

Nabil Benayadi, Marc Le Goc

2010

Abstract

We introduce the problem of mining sequential patterns among timed messages in large databases of sequences using a stochastic approach. An example of the patterns we are interested in is: 50% of engine stops in a car occur between 0 and 2 minutes after a lack of gas in the engine is observed, which itself occurs between 0 and 1 minute after the fuel tank becomes empty. We call these patterns “signatures”. Previous research has considered similar patterns, but such work has three main problems: (1) the sensitivity of the algorithms to the values of their parameters, (2) the excessively large number of discovered patterns, and (3) the discovered patterns consider only the “after” relation (succession in time) and omit temporal constraints between the elements of a pattern. To address these issues, we present the TOM4L process (Timed Observations Mining for Learning), which uses a stochastic representation of a given set of sequences, to which inductive reasoning coupled with abductive reasoning is applied to reduce the search space. A very simple example is used to show the efficiency of the TOM4L process compared with other approaches in the literature.
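
To make the engine-stop example concrete, the following is a minimal sketch of how a chain of timed constraints could be checked against a sequence of timed messages. It is not the TOM4L process itself: the function names (`ends_signature`, `signature_ratio`), the message labels, and the toy data are all assumptions introduced for illustration, and the ratio computed here is simply the share of final-message occurrences preceded by a chain that satisfies every delay window.

```python
from typing import List, Tuple

Event = Tuple[float, str]        # (timestamp in minutes, message label)
Gap = Tuple[float, float]        # (min delay, max delay) in minutes


def ends_signature(events: List[Event], end_index: int,
                   labels: List[str], gaps: List[Gap]) -> bool:
    """Return True if events[end_index] closes an occurrence of the chain.

    labels lists the messages in temporal order; gaps[i] is the allowed
    delay window between labels[i] and labels[i + 1].
    """
    t_end, label = events[end_index]
    if label != labels[-1]:
        return False
    if len(labels) == 1:
        return True
    lo, hi = gaps[-1]
    # Look for a predecessor message inside the allowed delay window,
    # then check the rest of the chain recursively.
    for i, (t, lab) in enumerate(events):
        if i != end_index and lab == labels[-2] and lo <= t_end - t <= hi:
            if ends_signature(events, i, labels[:-1], gaps[:-1]):
                return True
    return False


def signature_ratio(events: List[Event], labels: List[str],
                    gaps: List[Gap]) -> float:
    """Share of occurrences of the final label that close the full chain."""
    ends = [i for i, (_, lab) in enumerate(events) if lab == labels[-1]]
    if not ends:
        return 0.0
    hits = sum(ends_signature(events, i, labels, gaps) for i in ends)
    return hits / len(ends)


# Hypothetical toy sequence: one engine stop follows the chain, one does not.
EVENTS: List[Event] = [
    (0.0, "tank_empty"),
    (0.5, "lack_of_gas"),
    (1.5, "engine_stop"),
    (10.0, "engine_stop"),
]
LABELS = ["tank_empty", "lack_of_gas", "engine_stop"]
GAPS: List[Gap] = [(0.0, 1.0), (0.0, 2.0)]

ratio = signature_ratio(EVENTS, LABELS, GAPS)
print(ratio)
```

On this toy sequence, one of the two engine stops is preceded by a lack of gas within 2 minutes and an empty tank within 1 minute before that, which mirrors the "50% of cases" reading of the example. A check like this only evaluates one given candidate pattern; it says nothing about how candidates are discovered or how the search space is reduced.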



Paper Citation


in Harvard Style

Benayadi N. and Le Goc M. (2010). MINING TIMED SEQUENCES TO FIND SIGNATURES. In Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT, ISBN 978-989-8425-23-2, pages 450-455. DOI: 10.5220/0003007604500455


in Bibtex Style

@conference{icsoft10,
author={Nabil Benayadi and Marc Le Goc},
title={MINING TIMED SEQUENCES TO FIND SIGNATURES},
booktitle={Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT},
year={2010},
pages={450-455},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003007604500455},
isbn={978-989-8425-23-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Software and Data Technologies - Volume 2: ICSOFT
TI - MINING TIMED SEQUENCES TO FIND SIGNATURES
SN - 978-989-8425-23-2
AU - Benayadi N.
AU - Le Goc M.
PY - 2010
SP - 450
EP - 455
DO - 10.5220/0003007604500455