# An Interval Distribution Analysis for RTI+

### Fabian Witter, Timo Klerx, Artus Krohn-Grimberghe

#### Abstract

The RTI+ algorithm learns a Probabilistic Deterministic Real-Time Automaton (PDRTA) from unlabeled timed sequences. RTI+ is efficient, runs in polynomial time, and can be applied to a variety of real-world behavior identification problems. Nevertheless, we uncover a lack of accuracy in identifying the intervals (or time guards) of the PDRTA. This inaccuracy can lead to wrong predictions of timed sequences in the learned model. We show by example that this effect is caused by segments of an interval that are not covered by training data. We call those segments gaps and distinguish three types of gaps that can appear. Two of these types cause wrong predictions of sequences and should thus be removed from the model. Therefore, we introduce our novel Interval Distribution Analysis (IDA), which uses statistical outlier detection to identify and remove gaps. In the context of ATM fraud detection, we show that IDA can improve the results of RTI+ in a real-world scenario.
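To make the notion of a gap concrete, the following is a minimal sketch of how uncovered segments of an interval could be flagged with a simple outlier test. It is not the IDA procedure from the paper, whose details the abstract does not give. The function `find_gaps`, its parameters, and the choice of a median-absolute-deviation (MAD) cutoff on the distances between neighboring time values are all assumptions made for illustration.

```python
from statistics import median


def find_gaps(values, lower, upper, threshold=2.5):
    """Flag unusually wide uncovered segments inside the interval [lower, upper].

    Hypothetical helper, not the paper's IDA: a segment between two
    neighboring time values is reported as a gap when its width is an
    outlier under a median + threshold * scaled-MAD cutoff.
    """
    # Keep only time values inside the interval and add the interval bounds,
    # so that uncovered segments at either end can also be detected.
    bounds = sorted({lower, upper, *(v for v in values if lower <= v <= upper)})
    if len(bounds) < 3:
        return []

    # Widths of the segments between neighboring points.
    widths = [b - a for a, b in zip(bounds, bounds[1:])]

    med = median(widths)
    mad = median([abs(w - med) for w in widths])

    # 1.4826 scales the MAD to be comparable to a standard deviation under
    # normality. If the MAD is zero, the cutoff reduces to the median width,
    # so any segment wider than the typical one is flagged.
    cutoff = med + threshold * 1.4826 * mad

    return [(a, b) for a, b, w in zip(bounds, bounds[1:], widths) if w > cutoff]
```

For example, with training time values `[1, 2, 3, 4, 5, 20, 21, 22, 23, 24]` in the interval `[0, 25]`, the sketch reports the single segment `(5, 20)` as a gap. Whether such a segment should actually be removed from a learned model depends on the gap type, which this sketch does not distinguish.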

#### Paper Citation

#### in Harvard Style

Witter F., Klerx T. and Krohn-Grimberghe A. (2017). **An Interval Distribution Analysis for RTI+**. In *Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM*, ISBN 978-989-758-222-6, pages 351-358. DOI: 10.5220/0006146603510358

#### in Bibtex Style

@conference{icpram17,

author={Fabian Witter and Timo Klerx and Artus Krohn-Grimberghe},

title={An Interval Distribution Analysis for RTI+},

booktitle={Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},

year={2017},

pages={351-358},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0006146603510358},

isbn={978-989-758-222-6},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM

TI - An Interval Distribution Analysis for RTI+

SN - 978-989-758-222-6

AU - Witter F.

AU - Klerx T.

AU - Krohn-Grimberghe A.

PY - 2017

SP - 351

EP - 358

DO - 10.5220/0006146603510358