MVAPACK is a chemometrics toolbox for processing NMR and GC/LC-MS metabolomics data. It was first introduced by Bradley Worley and Robert Powers and has since grown to cover full processing of nuclear magnetic resonance (NMR), mass spectrometry (MS), and chromatography data. The software is written as a package for GNU Octave and includes over 336 functions that automate data analysis, with an emphasis on simplifying manual inspection.
MVAPACK stands for MultiVariate Analysis Package. This focus motivates the use of Octave, which is a simple language to learn and has strong support for linear-algebra-based algorithms. Multivariate analysis techniques have been pivotal to metabolomics as a whole, especially principal component analysis (PCA), which has become a nearly emblematic analysis approach in the field. In addition to PCA, MVAPACK provides partial least squares (PLS), orthogonal partial least squares (OPLS), linear discriminant analysis (LDA), random forest (RF), and support vector machine (SVM) algorithms. It also provides multiple data loading functions to simplify analysis pipelines for different types of instrumental data.
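For readers new to PCA, the idea can be sketched in a few lines. The example below is a minimal, language-neutral illustration in plain Python and is not MVAPACK code: the function name `first_principal_component`, the toy `samples` matrix, and the choice of power iteration are all assumptions made for this sketch, and MVAPACK's own Octave routines may work differently. It mean-centers a small samples-by-variables matrix, builds the covariance matrix, and extracts the first principal component.

```python
# Illustrative sketch only: not MVAPACK code and not its API.
import math


def first_principal_component(rows, iterations=200):
    """Return (loading, scores, explained_fraction) for the first PC.

    rows: list of samples, each a list of variable intensities.
    Assumes at least two samples and some variance in the data.
    """
    n, p = len(rows), len(rows[0])
    # Mean-center each variable (column).
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Start from the covariance column of the highest-variance variable,
    # then use power iteration to approach the dominant eigenvector.
    k = max(range(p), key=lambda j: cov[j][j])
    v = cov[k][:]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured by this component.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    # Scores are the projections of each centered sample onto the loading.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores, eigenvalue / total_variance


# Hypothetical toy data: 4 samples x 3 variables (made up for illustration).
samples = [
    [2.0, 1.9, 0.1],
    [4.0, 4.1, 0.2],
    [6.0, 5.9, 0.1],
    [8.0, 8.1, 0.2],
]
loading, scores, explained = first_principal_component(samples)
print("loading:", loading)
print("scores:", scores)
print("explained variance fraction:", explained)
```

In this toy matrix the first two variables rise together across samples, so nearly all of the variance falls on a single component. Real metabolomics workflows typically also scale the variables and extract several components, which this sketch omits.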
The software package can be installed using the Dockerfile found in this repository. Alternatively, Octave-installable tar files can be requested through the Powers Group website. MVAPACK is also being developed as a web server with full GUI support and an interactive processing workflow.
We have developed a synthetic dataset in mzML format for benchmarking metabolomics data processing software. A publication (link here after submission) demonstrates the use and capabilities of multiple software applications on this dataset.
We also provide a standard mixture dataset and a biological dataset (M. smegmatis). Both were used to test the LC-MS pipelines and to benchmark the software applications discussed above.