QTempIntMiner, QTIAPriori and QTIPrefixSpan are algorithms to mine quantitative intervals sequences. The algorithms are able to discover the largest frequent patterns in a set of short intervals sequences. It combines the sequence mining algorithm (GSP or PrefixSpan) to discover the frequent sequence and the clustering algorithm to discover the 'representative' temporal extends of the intervals (EM, KMeans, Affinity Propagation).


Author: Guyet Thomas ( thomas_dot_guyet_at_irisa.fr) and René Quiniou ( thomas_dot_guyet_at_inria.fr)
Year: 2011 (most recent algorithm version)
Laboratory : INRIA/IRISA (Team LACODAM ex. Team-Project inria DREAM)
Licence: Affero GPL
APP Deposit number: IDDN.FR.001.030005.000.S.P.2016.000.20700


News[top]

  1. Video presentation of QTempIntMiner at Inria Innovation meeting (2015)
  2. Algorithm will be soon available in C++ (with Python interface)
in French

Algorithms description [top]

QTempIntMiner

QTempIntMiner considers the problem of discovering frequent temporal patterns in a database of temporal sequences, where a temporal sequence is a set of items with associated dates and durations. Since the quantitative temporal information appears to be fundamental in many contexts, it is taken into account in the mining processes and returned as part of the extracted knowledge. To this end, we have adapted the classical GSP framework to propose an efficient algorithm based on a hyper-cube representation of temporal sequences. The extraction of quantitative temporal information is performed using a density estimation of the distribution of event intervals from the temporal sequences. An evaluation on synthetic data sets shows that the proposed algorithm can robustly extract frequent temporal patterns with quantitative temporal extents.

The QTempIntMiner algorithm is based on hyper-cube representations of the temporal sequences and the temporal patterns. An hyper-cube is a geometrical volume in dimension n. Given a temporal sequence S made of n temporal events, then the hyper-cube representation of S is the hyper-cube such that the temporal bounds of the sequence items are determined by the orthogonal projection along each corresponding timeline.




Figure 1. Temporal sequence of dimension 3 (on the left) and its hyper-cube representation (on the right)

Our aim is to define a unique and representative temporal extent for each interval of a temporal pattern. Since, the distributions of temporal intervals can highlight such representative temporal extents, we propose a solution to construct unique and representative temporal extents which relies on the estimation of the interval distributions.

The hyper-cube representation allows to consider the distribution of the intervals for all temporal dimensions globally at the same time. In Figure 2, the frequency of temporal regions is outlined with darkening color: the darker a region is, the more temporal sequences overlap this region.




Figure 2. Interval sequences distribution in dimension 2 : Each sequence is represented by a red square (hyper-cube of dimension 2). The distribution of the red intensity figure out the interval sequences distribution.

QTempIntMiner is based on hyper-cube distributions which can be approximated using density-based multidimensional clustering. We approximate distributions with Gaussian mixture models. Once computed, each Gaussian is associated with a possible temporal pattern. The temporal intervals to overlay over the symbolic signature are extracted from a Gaussian of the model.

QTempIntMiner constructs the longest frequent symbolic signatures. The main property of the GSP algorithm, on which QTempIntMiner is based, is : every symbolic signature that covers a frequent symbolic signature is frequent. This property is helpful to prune a lot of candidates of size n knowing the infrequent symbolic signatures of size strictly inferior to n because if a symbolic signature of size n is covered by an infrequent symbolic signature then it is not frequent.
The algorithm iterates two stages, namely (1) the selection of frequent temporal patterns of dimension n, and (2) the generation of symbolic signature candidates of dimension n + 1 from those of dimension n. These two stages are detailed in the following sections. The process terminates when there is no more potential longer candidate or when the candidates are no longer sufficiently frequent.
The algorithm has been adapted to process hyper-cube representations of temporal sequences. In particular, the selection of frequent temporal patterns uses the distribution estimation approach to extract the temporal intervals, and the candidate generation uses a priori constraints enriched by temporal bounds.

QTIAPriori & QTIPrefixSpan

QTIPrefixSpan and QTIAPriori extends respectivelly algorithms PrefixSpan and GSP with a multi-dimensional interval clustering step for extracting the representative temporal intervals associated to events in patterns.

IJCAI Poster

Publications [top]


Download [top]

If you used this software, if you find it interesting but not appropriate for your data set or if you encounter any problem: do not hesitate to contact us.


Contacts [top]

Guyet Thomas, AGROCAMPUS-OUEST/IRISA
Quiniou René, INRIA


Guyet Thomas, last modification 25/04/2016