# Software developed at Nofima

## The Måge plot

Some multiblock regression methods (such as SO-PLS and PO-PLS) allow a different number of components in each block. There are two strategies for selecting the numbers of components for these models: *sequential* and *global*. With the sequential strategy, the number of components for the first block is determined before the second block is introduced, and so on. With the global strategy, all blocks are taken into account from the beginning: models with all combinations of components from each block are tested, and the combination giving the minimum prediction error is selected. Often, several combinations have approximately equal prediction ability, and in such cases it is important to also take the total number of components into account. The *Måge plot* is a valuable tool for evaluating the models and selecting the optimal numbers of components.
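The difference between the two strategies can be illustrated with a minimal sketch. It assumes a two-block model and a table of cross-validated prediction errors that has already been computed. The error values and the function names `select_sequential` and `select_global` are hypothetical and are not part of any Nofima toolbox.

```python
# Hypothetical cross-validated prediction errors for a two-block model.
# rmsecv[a1][a2] is the error with a1 components from block 1 and
# a2 components from block 2. The numbers are made up for illustration.
rmsecv = [
    [1.00, 0.80, 0.70, 0.69],
    [0.60, 0.40, 0.28, 0.30],
    [0.55, 0.35, 0.33, 0.34],
    [0.50, 0.36, 0.34, 0.35],
]


def select_sequential(errors):
    """Fix block 1 first (block 2 absent), then choose block 2 given that choice."""
    a1 = min(range(len(errors)), key=lambda i: errors[i][0])
    a2 = min(range(len(errors[a1])), key=lambda j: errors[a1][j])
    return a1, a2


def select_global(errors):
    """Search every combination of components from both blocks at once."""
    combos = [(i, j) for i in range(len(errors)) for j in range(len(errors[i]))]
    return min(combos, key=lambda c: errors[c[0]][c[1]])


print("sequential:", select_sequential(rmsecv))
print("global:    ", select_global(rmsecv))
```

In this made-up table the sequential strategy commits to three components for block 1 before block 2 is seen, whereas the global search finds a combination with lower error and fewer components in total.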

The Måge plot shows the prediction error for each combination of components, as a function of the total number of components. From this perspective, it is possible to decide the total dimensionality of the system and the individual dimensionalities of each block at the same time. It is also easy to identify models that are indistinguishable from a prediction point of view.
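As a rough sketch of the data behind such a plot, the snippet below turns a table of prediction errors into one point per combination, with the total number of components as the x-coordinate and the error as the y-coordinate. The error values, the tolerance of 0.02 and the helper names `mage_points` and `near_optimal` are assumptions made for illustration. The MagePlot function itself may organise this differently.

```python
# Hypothetical errors: rmsecv[a1][a2] is the prediction error with a1
# components from block 1 and a2 from block 2 (made-up numbers).
rmsecv = [
    [1.00, 0.80, 0.70],
    [0.60, 0.29, 0.28],
    [0.55, 0.31, 0.33],
]


def mage_points(errors):
    """Return (total components, error, label) for every combination."""
    points = []
    for a1, row in enumerate(errors):
        for a2, err in enumerate(row):
            points.append((a1 + a2, err, f"{a1},{a2}"))
    return sorted(points)


def near_optimal(points, tolerance):
    """Combinations whose error is within `tolerance` of the minimum."""
    best = min(err for _, err, _ in points)
    return [p for p in points if p[1] <= best + tolerance]


points = mage_points(rmsecv)
candidates = near_optimal(points, tolerance=0.02)
# Among near-equivalent models, prefer the fewest components in total.
chosen = min(candidates)

for total, err, label in points:
    marker = "*" if (total, err, label) in candidates else " "
    print(f"{marker} total={total}  combination={label}  error={err:.2f}")
print("chosen:", chosen)
```

Plotting `total` against `err` and annotating each point with its label gives the layout described above. The starred rows are the models that the chosen tolerance treats as indistinguishable.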

A MATLAB function for creating the plot can be found here: MagePlot

## HySpec toolbox

A freeware toolbox for hyperspectral image analysis has been made available. It includes routines for importing, correcting, segmenting, analysing and visualising data from various origins. A graphical user interface is included for explorative data analysis. The toolbox is a work in progress and still has minor bugs. Further additions and changes are planned, but it is already useful for many tasks.

Download the toolbox hyspec in zip format.

## Multiblock-PLS (MB-PLS)

MB-PLS is a multi-block regression method developed to estimate regression equations between independent blocks.

This toolbox contains a MATLAB function for building calibration models and then predicting unknown samples. The function implements the methodology described in J.A. Westerhuis, T. Kourti, J.F. MacGregor, Analysis of multiblock and hierarchical PCA and PLS models, J. Chemometr. 12 (1998) 301–321.

Download the function plsmbExVal

**The output is a structure that will contain:**

**Calibration Model:**

- PLS parameters
- Predicted Y
- Pretreatment info

**Prediction Model:**

- Scores
- Predicted Y

## Multiblock classification

### MATLAB toolbox for classification by SO-PLS-LDA, MB-PLS-LDA and PLS-LDA

SO-PLS-LDA is a classification method that combines the multi-block SO-PLS regression method with LDA. This toolbox contains MATLAB functions for choosing the optimal complexity on the basis of the Måge plot, fitting the SO-PLS-LDA models using different types of cross-validation and then classifying by LDA. The output is a structure that contains not only the SO-PLS-LDA model, but also the MB-PLS-LDA model built on the same blocks and the PLS-LDA models on the single blocks. For applications and a description of the methods involved, see:

A. Biancolillo, I. Måge, T. Næs, *Combining SO-PLS and linear discriminant analysis for multi-block classification* (paper submitted to Chemometrics and Intelligent Laboratory Systems).

The toolbox (zip-file) can be downloaded here

## HotPLS toolbox for MATLAB

HotPLS is a free MATLAB toolbox accompanying the recently published article “Hierarchically ordered taxonomic classification by partial least squares” in Chemometrics and Intelligent Laboratory Systems. It consists of an example script, the same data set found in the article and all necessary functions to perform the calculations in the article.

The included set of functions can be used in any situation where classification is performed in a fixed hierarchy and where the data are multivariate, e.g. spectroscopic measurements or similar.

For a thorough description of the method and an example, see:

Kristian Hovde Liland, Achim Kohler, Volha Shapaval, Hot PLS—a framework for hierarchically ordered taxonomic classification by partial least squares. Chemometrics and Intelligent Laboratory Systems, Volume 138, 15 November 2014, Pages 41–47.

Follow this link to download the toolbox.

## Multiblock regression by PO/SO-PLS, new toolbox for MATLAB

PO-PLS and SO-PLS are methods for multiblock regression (data fusion) developed at Nofima. This toolbox contains MATLAB functions for model fitting, cross-validation, prediction and plotting of results. An example script and a data set are also included for illustration, along with a short description and documentation of the main functions. All functions are based on the Saisir data structure. For a detailed description of PO- and SO-PLS, see e.g.:

Næs, T., Tomic, O., Afseth, N.K., Segtnan, V.H., Måge, I. 2013. *Multi-block regression based on combinations of orthogonalisation, PLS-regression and canonical correlation analysis*. Chemometrics and Intelligent Laboratory Systems, Vol 124, pp 32-42.

Follow this link to download the toolbox.

## Open EMSC toolbox, published February 2014

The open EMSC toolbox is a freely available collection of functions for performing Extended Multiplicative Signal Correction on spectroscopic data. All functions are based on the Saisir data structure for chemometric data analysis. A graphical user interface is included to make the toolbox accessible to users with little or no programming knowledge. An accompanying paper explaining the basics of the toolbox is under review in a peer-reviewed journal.

Included in the toolbox is a complete replication of the tutorial paper of Nils Kristian Afseth and Achim Kohler published in Chemometrics and Intelligent Laboratory Systems in 2012.

Follow this link to download the toolbox (updated to also work in MATLAB 2014b).

## ConsumerCheck, published January 2014

ConsumerCheck is software that provides a graphical user interface (GUI) for analyzing typical data from consumer studies. The software was published in cooperation with DTU Denmark. It provides the following features:

- Visualization (box plot, stacked histograms, single sample histograms)
- PCA and related plots
- Preference mapping
- Conjoint analysis

The software is designed for analyzing consumer data, but users with adequate statistical training can also use ConsumerCheck to analyze other types of data with a similar structure. For more information, see the ConsumerCheck website: www.consumercheck.co

## 50-50 MANOVA

Multiple responses are common in industrial and scientific experimentation and a multivariate alternative to ordinary analysis of variance (ANOVA) is often required. Significance tests based on classical multivariate ANOVA (MANOVA) are, however, useless in many practical cases. The tests perform poorly in cases with several highly correlated responses and the method collapses when the number of responses exceeds the number of observations. 50-50 MANOVA is a method which handles this problem. Principal component analysis is an important part of the new methodology. The methodology was developed by Øyvind Langsrud at MATFORSK/NOFIMA.

The R-package ffmanova is available at CRAN: http://cran.r-project.org/web/packages/ffmanova/

MATLAB code, a GUI and relevant references can be found here: http://www.langsrud.com/stat/

The software also contains rotation testing, which is a framework for significance testing by computer simulation, adjustment of p-values and estimation of false discovery rates in multiple testing problems.

## PanelCheck

PanelCheck is software that provides a graphical user interface (GUI) for checking and visualising the performance of sensory panels and their assessors. For more information, please see the PanelCheck website.