The Discovery of Interacting Episodes and Temporal Rule Determination in Sequential Pattern Mining

Author: Carl Howard Mooney

Mooney, Carl Howard, 2007, The Discovery of Interacting Episodes and Temporal Rule Determination in Sequential Pattern Mining, Flinders University, School of Computer Science, Engineering and Mathematics

Terms of Use: This electronic version is (or will be) made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. You may use this material for uses permitted under the Copyright Act 1968. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact copyright@flinders.edu.au with the details.

Abstract

The purpose of data mining is to generate rules that can be used as the basis for making decisions. One such area is sequence mining, which, in terms of transactional datasets, can be stated as the discovery of inter-transaction associations, that is, associations between different transactions. The data used for sequence mining is not limited to data stored in overtly temporal or longitudinally maintained datasets, and in such domains data can be viewed as a series of events, or episodes, occurring at specific times. The problem thus becomes a search for collections of events that occur frequently together. While the mining of frequent episodes is an important capability, the manner in which such episodes interact can provide further useful knowledge in the search for a description of the behaviour of a phenomenon, but this has as yet received little investigation. Moreover, while many sequences are associated with absolute time values, most sequence mining routines treat time in a relative sense, returning only patterns that can be described in terms of Allen-style relationships (or simpler); that is, they convey nothing about the relative pace of occurrence. They thus produce rules with more limited expressive power. To date, temporal interval patterns have been based on the endpoints of the intervals; however, in many cases the ‘natural’ point of reference is the midpoint of an interval, and it is therefore appropriate to develop a mechanism for reasoning between intervals when midpoint information is known. This thesis presents a method for discovering interacting episodes from temporal sequences and for analysing them using temporal patterns. The mining can be conducted both with and without the mechanism for handling the pace of events, and the analysis is conducted using both the traditional interval algebras and a midpoint algebra presented in this thesis.
The visualisation of rules in data mining is a large and dynamic field in its own right, and although there has been a great deal of research into the visualisation of associations, there has been little in the area of sequence or episodic mining. Add to this the emerging field of mining stream data, and there is a need to pursue methods and structures for such visualisations; this thesis therefore also contributes toward research in this important area of visualisation.

Keywords: temporal logic, sequences, episodes, data mining

Subject: Computer Science thesis

Thesis type: Doctor of Philosophy
Completed: 2007
School: School of Computer Science, Engineering and Mathematics
Supervisor: Professor John F. Roddick