GLCM/PGLCM: Efficient Parallel Mining of Closed Frequent Gradual Itemsets
GLCM and PGLCM are efficient algorithms for mining closed frequent gradual itemsets. Given a numerical dataset (e.g., biological databases, survey databases, data streams or sensor data), a gradual itemset expresses the covariation of several attributes, in the form "the more (or less) A, the more (or less) B".
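To make the notion of a gradual itemset concrete, here is a minimal Python sketch. It is not the GLCM implementation: the toy rows, the attribute names and the `support` helper are all hypothetical. It assumes a common definition of support from the gradual pattern literature (the length of the longest chain of records ordered consistently with the itemset, divided by the number of records) and uses strict comparisons as a simplification; the paper gives the exact definitions used by GLCM.

```python
from typing import Dict, List, Tuple

# A gradual itemset is modelled here as a non-empty list of (attribute, direction)
# pairs, where "+" means "the more" and "-" means "the less". Hypothetical encoding.
GradualItemset = List[Tuple[str, str]]

# Hypothetical toy dataset: one dict per record, numerical attributes only.
ROWS: List[Dict[str, float]] = [
    {"age": 22, "salary": 1200, "loan": 5},
    {"age": 24, "salary": 2300, "loan": 4},
    {"age": 28, "salary": 1350, "loan": 3},
    {"age": 30, "salary": 1500, "loan": 2},
    {"age": 35, "salary": 1400, "loan": 1},
]


def precedes(a: Dict[str, float], b: Dict[str, float], itemset: GradualItemset) -> bool:
    """True if, going from record a to record b, every attribute of the
    itemset varies strictly in the requested direction."""
    for attr, direction in itemset:
        if direction == "+" and not a[attr] < b[attr]:
            return False
        if direction == "-" and not a[attr] > b[attr]:
            return False
    return True


def support(rows: List[Dict[str, float]], itemset: GradualItemset) -> float:
    """Assumed support definition: longest chain of records that respects
    the itemset, divided by the total number of records."""
    if not rows:
        return 0.0
    # Sort on the first attribute in its direction, so that any record that
    # can precede another one also appears before it in the list.
    attr, direction = itemset[0]
    ordered = sorted(rows, key=lambda r: r[attr], reverse=(direction == "-"))
    # Quadratic dynamic programming: best[j] is the longest chain ending at j.
    best = [1] * len(ordered)
    for j in range(len(ordered)):
        for i in range(j):
            if precedes(ordered[i], ordered[j], itemset):
                best[j] = max(best[j], best[i] + 1)
    return max(best) / len(ordered)


# "The higher the age, the higher the salary": the longest chain has 3 of 5 records.
print(support(ROWS, [("age", "+"), ("salary", "+")]))  # 0.6
# "The higher the age, the lower the loan": all 5 records form a chain.
print(support(ROWS, [("age", "+"), ("loan", "-")]))    # 1.0
```

An itemset would be called frequent when this value reaches a user-chosen minimum support threshold. The brute-force computation above only illustrates the definition and has none of the complexity guarantees of GLCM.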
GLCM is the sequential version of the algorithm. It shares the main strengths of the LCM algorithm on which it is based: mining time that is linear in the number of closed frequent gradual itemsets, and constant memory consumption.
PGLCM is the parallel implementation. It takes advantage of the computing power of recent multi-core computers, and scales well with the number of available cores on complex datasets, where such computing power is really needed.
The low memory requirements of our two algorithms allow them to handle large real-world datasets that existing algorithms could not process because of memory saturation. Our algorithms thus remove the obstacle that prevented the use of gradual pattern analysis in realistic applications.
You can find more details about GLCM/PGLCM, as well as a detailed experimental study of their performance, in our ICDM 2010 paper: PDF.
Download:
Compilation and Run:
Read the README.txt file in the download package.
GLCM/PGLCM is the Master's thesis work of Trong Dinh Thac Do, carried out during his Master's internship between the LIG laboratory (University of Grenoble) and the LIRMM laboratory (University of Montpellier).
Current contact: Alexandre Termier.
Feel free to contact me by email if you have any questions!