Software

TMG

A toolbox for various text mining (TM) tasks, specifically:

i) indexing,
ii) retrieval,
iii) dimensionality reduction,
iv) clustering,
v) classification.

Most of TMG is written in MATLAB, though a large segment of the indexing phase is written in Perl.

TMG is especially suited to TM applications where the data are high-dimensional but extremely sparse, because it uses the sparse matrix infrastructure of MATLAB. It was initially built as a preprocessing tool for creating term-document matrices (TDMs) from unstructured text, and was reportedly used with success by several researchers and instructors. The new version of TMG (May 2007) offers a much wider range of tools.
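
To illustrate what a sparse term-document matrix looks like, here is a minimal sketch in Python. It is not TMG code: the function name build_tdm, the whitespace tokenizer, and the dictionary-of-keys storage are assumptions made only for this example, and TMG itself relies on MATLAB sparse matrices and a far richer indexing pipeline.

```python
from collections import Counter


def build_tdm(documents):
    """Return (terms, tdm); tdm maps (term_index, doc_index) -> raw count.

    Hypothetical helper for illustration only. Only nonzero entries are
    stored, which is the point of a sparse representation.
    """
    # Assumption: lowercase and split on whitespace; no stemming or stop words.
    tokenized = [doc.lower().split() for doc in documents]
    terms = sorted({tok for doc in tokenized for tok in doc})
    index = {term: i for i, term in enumerate(terms)}

    tdm = {}
    for j, doc in enumerate(tokenized):
        for term, count in Counter(doc).items():
            tdm[(index[term], j)] = count
    return terms, tdm


docs = ["sparse matrix storage", "sparse text mining", "text mining toolbox"]
terms, tdm = build_tdm(docs)

# Fraction of matrix entries that are nonzero.
density = len(tdm) / (len(terms) * len(docs))
```

On realistic collections the density is typically a tiny fraction, which is why storing only the nonzero entries pays off.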

Primary contact: Eugenia-Maria Kontopoulou
Primary developer: Dimitrios Zeimpekis
Funding: University of Patras K. Karatheodori grant no. B120; Bodossaki Foundation Graduate Fellowship.
Links: TMG home, Some uses of TMG


g-Spike

A very fast and fairly robust direct parallel tridiagonal solver for the GPU, based on Givens rotations and the Spike polyalgorithm.
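
The role of Givens rotations can be seen in a small sequential sketch. The Python code below is not g-Spike: it omits the Spike partitioning and all GPU parallelism, and the function name and argument layout are assumptions made for illustration. It only shows how a tridiagonal system can be reduced to upper triangular form by rotations, without pivoting, and then solved by back substitution.

```python
import math


def givens_tridiagonal_solve(sub, diag, sup, rhs):
    """Solve T x = rhs for tridiagonal T using Givens rotations (QR).

    sub, sup: sub- and superdiagonal (length n-1); diag, rhs: length n.
    Hypothetical sequential helper; a singular T leads to ZeroDivisionError
    in the back substitution.
    """
    n = len(diag)
    r0 = list(diag)          # main diagonal of R
    r1 = list(sup) + [0.0]   # first superdiagonal of R (padded)
    r2 = [0.0] * n           # second superdiagonal of R (fill-in)
    y = list(rhs)

    for i in range(n - 1):
        # Rotation acting on rows i and i+1 that zeros the entry sub[i].
        a, b = r0[i], sub[i]
        h = math.hypot(a, b)
        c, s = (a / h, b / h) if h else (1.0, 0.0)
        r0[i] = h
        t1, t2 = r1[i], r0[i + 1]
        r1[i] = c * t1 + s * t2
        r0[i + 1] = -s * t1 + c * t2
        t3 = r1[i + 1]
        r2[i] = s * t3       # fill-in created two columns to the right
        r1[i + 1] = c * t3
        y[i], y[i + 1] = c * y[i] + s * y[i + 1], -s * y[i] + c * y[i + 1]

    # Back substitution with the three upper diagonals of R.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i]
        if i + 1 < n:
            acc -= r1[i] * x[i + 1]
        if i + 2 < n:
            acc -= r2[i] * x[i + 2]
        x[i] = acc / r0[i]
    return x


# Zero on the diagonal: elimination without pivoting would break down here.
x_small = givens_tridiagonal_solve([1.0], [0.0, 0.0], [1.0], [2.0, 3.0])
x_spd = givens_tridiagonal_solve([1.0, 1.0], [2.0, 2.0, 2.0], [1.0, 1.0],
                                 [4.0, 8.0, 8.0])
```

Because rotations are orthogonal, no pivoting is needed even when a diagonal entry is zero, which is one reason a rotation-based approach can be more robust than plain elimination.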

Primary contact: Ioannis Venetis
Funding: By-product of research conducted under the Research Funding Program THALES: Reinforcement of the interdisciplinary and/or inter-institutional research and innovation (MIS-379421, “Expertise development for the aeroelastic analysis and the design-optimization of wind turbines”).
Links: g-Spike page (if you cannot load the code, please contact us)

Reference: I. Venetis, A. Kouris, A. Sobczyk, E. Gallopoulos and A.H. Sameh, “A direct tridiagonal solver based on Givens rotations for GPU-based architectures”, Parallel Computing, Apr. 2015, DOI 10.1016/j.parco.2015.03.008. See also HPCLAB-SCG-06/11-14, CEID, University of Patras, Nov. 2014.


Jylab

A portable and flexible scientific computing environment that runs on any platform providing a recent JVM and enables the development of scientific applications over distributed computing platforms. Jylab packages Jython (http://www.jython.org/), for flexible Python-language scripting, together with a core set of open-source libraries implementing numerical linear algebra (NLA) routines and communication models. Recently, a package was implemented in Jylab that enables access to and use of the Grid infrastructure.

Primary contact: Georgios Kollias (research staff at IBM Research, Yorktown Heights)
Funding: PYTHAGORAS I grant, project B365016; GRID-APP project of the Hellenic General Secretariat of Research and Technology
Links: Jylabwiki


NNDSVD

MATLAB functions to initialize approximate nonnegative matrix factorization algorithms. The basic algorithm contains no randomization and is based on approximations of positive sections of the partial SVD factors of the data matrix, utilizing an algebraic property of unit rank matrices. The method is also suitable when seeking sparse factors. The approximants furnished by NNDSVD appear to lead to much faster error reduction than random initialization, though the eventual error is of similar quality.
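
The idea of taking positive sections of singular factors can be sketched for a single singular triplet. The Python code below is an illustration, not the distributed MATLAB functions: it assumes the triplet (sigma, u, v) has already been computed elsewhere, and the name nndsvd_pair is hypothetical. For each triplet it keeps whichever of the two sign-consistent parts of the rank-one term carries more weight.

```python
import math


def _norm(x):
    return math.sqrt(sum(t * t for t in x))


def nndsvd_pair(sigma, u, v):
    """Nonnegative (column, row) pair derived from one singular triplet.

    Hypothetical helper: the rank-one term u v^T is split into its
    positive-positive and negative-negative sections and the dominant
    section is kept, rescaled by sigma.
    """
    up = [max(t, 0.0) for t in u]
    un = [max(-t, 0.0) for t in u]
    vp = [max(t, 0.0) for t in v]
    vn = [max(-t, 0.0) for t in v]

    weight_pos = _norm(up) * _norm(vp)
    weight_neg = _norm(un) * _norm(vn)
    if weight_pos >= weight_neg:
        x, y, weight = up, vp, weight_pos
    else:
        x, y, weight = un, vn, weight_neg

    if weight == 0.0:
        # No sign-consistent section exists; return zero vectors.
        return [0.0] * len(u), [0.0] * len(v)

    scale = math.sqrt(sigma * weight)
    nx, ny = _norm(x), _norm(y)
    return [scale * t / nx for t in x], [scale * t / ny for t in y]


# One triplet with mixed signs; the positive sections dominate here.
w_col, h_row = nndsvd_pair(4.0, [0.8, -0.6], [0.8, -0.6])
```

Applying the same step to each of the leading triplets yields nonnegative starting factors with no random numbers involved; entries that are clipped to zero are also what makes the result naturally sparse.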

Primary contact: Christos Boutsidis (research staff at Yahoo! Labs, NY)
Funding: University of Patras K. Karatheodori grant no. B120
Links: NNDSVD presentation, major publication


IRLANB

MATLAB functions to compute the smallest singular triplets of large sparse matrices in a matrix-free manner. The algorithms used are based on Lanczos bidiagonalization, implicit restarting, harmonic Ritz values, deflation, and refinement. The method has been used with success in applications such as the computation of matrix pseudospectra and clustering.

Primary contact: Effrosyni Kokiopoulou (research staff at Google, Zurich)
Funding: University of Patras K. Karatheodori grant no. B120; Bodossaki Foundation Graduate Fellowship.
Links: TBA, major publication