Publication date:
2013
Citation:
Assessing feature relevance in NPLS models by VIP / Favilla, Stefania; Durante, Caterina; Li Vigni, Mario; Cocchi, Marina. - In: CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS. - ISSN 0169-7439. - PRINT. - 129:(2013), pp. 76-86. [10.1016/j.chemolab.2013.05.013]
Abstract:
Multilinear PLS (NPLS) and its discriminant version (NPLS-DA) are widely used tools for modelling multi-way data
arrays. Analysis of NPLS weights and NPLS regression coefficients allows data patterns, feature correlation
and covariance structure to be depicted. In this study we propose an extension of the Variable Importance
in Projection (VIP) parameter to multi-way arrays in order to highlight the features most relevant for predicting
the studied dependent properties, either for interpretation or to perform feature selection. The VIPs
are implemented for each mode of the data array; for multivariate dependent responses, VIP can be
expressed either with respect to each single y-variable or by taking all y-variables into account
together.
Three applications to real data are presented: i) NPLS modelling of the properties of
bread loaves from near infrared spectra of dough, acquired at different leavening times and corresponding
to different flour formulations, where VIP values were used to assess the spectral regions mainly involved in determining
flour performance; ii) authentication of extra virgin olive oils by NPLS-DA analysis of
gas chromatography/mass spectrometry (GC–MS) data, where VIP values were used to assess both GC and MS discriminant
features; iii) NPLS analysis of an fMRI-BOLD experiment based on a paradigm of acute
prolonged pain in healthy volunteers, aimed at efficiently reproducing the corresponding psychophysical
pain profiles, where VIP values were used to identify the brain regions mainly involved in determining the pain intensity
profile.
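As a rough illustration of the mode-wise VIP idea summarised in the abstract, the sketch below applies the classical PLS VIP formula separately to the weight matrix of each mode. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `vip_from_weights`, the weight matrices, and the explained sums of squares are all hypothetical, and the exact definitions used in the paper should be checked against the full text.

```python
import numpy as np

def vip_from_weights(weights, ssy):
    """Classical VIP scores from a (n_features, n_components) weight matrix.

    ssy[f] is the sum of squares of y explained by component f.
    Assumption: the same formula is applied to each mode's weights in turn.
    """
    weights = np.asarray(weights, dtype=float)
    ssy = np.asarray(ssy, dtype=float)
    n_features = weights.shape[0]
    # Scale each component's weight vector to unit length.
    unit = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    # Average of squared weights, weighted by explained y variance,
    # then scaled by the number of features in this mode.
    return np.sqrt(n_features * ((unit ** 2) @ ssy) / ssy.sum())

# Hypothetical model for a three-way array X (I x J x K) with two components.
rng = np.random.default_rng(0)
w_mode_j = rng.normal(size=(5, 2))  # made-up second-mode weights (e.g. wavelengths)
w_mode_k = rng.normal(size=(3, 2))  # made-up third-mode weights (e.g. time points)
ssy = np.array([4.0, 1.0])          # made-up explained y sum of squares per component

vip_j = vip_from_weights(w_mode_j, ssy)  # one VIP value per second-mode feature
vip_k = vip_from_weights(w_mode_k, ssy)  # one VIP value per third-mode feature
```

With this definition the squared VIP values within a mode average to one, which is the usual motivation for treating values above one as relevant.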
CRIS type:
Journal article
Keywords:
VIP; Multi-way data; NPLS; NPLS-DA; Feature selection
Author list:
Favilla, Stefania; Durante, Caterina; Li Vigni, Mario; Cocchi, Marina