Our method, called Laplacian Norm Alignment (LNA), characterizes
proteins as a multi-dimensional sequence of quantity of deformation at
different scales of the local space in which residues are embedded.
More precisely, the discrete Laplace operator (discrete Laplacian) is
used to measure the divergence of the gradient of residue positions in
a graph that describes a protein weighted adjacency map. Laplacian
coordinates of residues are resilient to translation and homothety but
not to rotation. This limitation is overcome by only keeping for each
residue the Euclidean norm of its Laplacian coordinates that is indeed
invariant to rotation.
Dynamic programming algorithms (e.g. Needleman-Wunsch or
Smith–Waterman) can then be used to match these multivariate descriptor
sequences and achieve similar classification or clustering results
compared to other approaches that requires in general more processing
time. These algorithms require few memory and are implemented on
Graphical Processing Unit (GPU) for an even faster computation time.
This allows to compare one protein structure against 180,000 others
within a second time span.
Please reference the LNA method and code in your paper as:
Nicolas Bonnel and Pierre-Francois Marteau, LNA: Fast Protein Classification Using A Laplacian Characterization of Tertiary Structure, IEEE/ACM Transaction on Computational Biology and Bioinformatics, in press PDF, homepage: http://www-irisa.univ-ubs.fr/LNA/