NAME

i.hyper.preproc - General hyperspectral data preprocessing

KEYWORDS

raster, hyperspectral, preprocessing

SYNOPSIS

i.hyper.preproc
i.hyper.preproc --help
i.hyper.preproc [-bcqz] input=name output=name [polyorder=integer] [derivative_order=integer] [window_length=integer] [dr_method=string] [dr_components=integer] [dr_kernel=string] [dr_gamma=float] [dr_degree=integer] [dr_max_iter=integer] [dr_tol=float] [dr_alpha=float] [dr_l1_ratio=float] [dr_random_state=integer] [dr_chunk_size=integer] [dr_bands=string] [dr_export=name] [--overwrite] [--help] [--verbose] [--quiet] [--ui]

Flags:

-b
Apply baseline correction
-c
Apply continuum removal
-q
Interpolate missing values in valid bands
-z
Clamp negative values to zero
--overwrite
Allow output files to overwrite existing files
--help
Print usage summary
--verbose
Verbose module output
--quiet
Quiet module output
--ui
Force launching GUI dialog

Parameters:

input=name [required]
Input hyperspectral raster map
output=name [required]
Output preprocessed raster map
polyorder=integer
Polynomial order for Savitzky-Golay filter (0 = skip Savitzky-Golay)
Default: 0
derivative_order=integer
Derivative order (0 = smoothing only)
Default: 0
window_length=integer
Window length (must be odd number)
Default: 11
dr_method=string
Dimensionality reduction method (linear or nonlinear)
Options: pca, kpca, nystroem, fastica, truncatedsvd, nmf, sparsepca
dr_components=integer
Number of components to retain (PCA, KPCA, Nystroem, FastICA, TruncatedSVD, NMF, SparsePCA). 0 = automatic (at most 10, or the number of bands if fewer)
Default: 0
dr_kernel=string
Kernel type (used only for KPCA and Nystroem)
Options: linear, rbf, poly, sigmoid
Default: rbf
dr_gamma=float
Kernel gamma (KPCA and Nystroem only)
Default: 0.01
dr_degree=integer
Polynomial degree (used if kernel=poly)
Default: 3
dr_max_iter=integer
Maximum iterations for convergence (FastICA,NMF,SparsePCA)
Default: 200
dr_tol=float
Convergence tolerance (FastICA,NMF,SparsePCA)
Default: 1e-4
dr_alpha=float
Regularization strength (NMF,SparsePCA)
Default: 0.0
dr_l1_ratio=float
L1 ratio in [0,1] (NMF,SparsePCA)
Default: 0.0
dr_random_state=integer
Random seed for reproducibility (PCA,FastICA,NMF,SparsePCA,TruncatedSVD)
Default: 0
dr_chunk_size=integer
Number of spectra per chunk for dimensionality reduction (0 = automatic; KPCA is approximated if chunked)
Default: 0
dr_bands=string
Wavelength intervals or single values to include before reduction (e.g., 400–700,850–1300,2200)
dr_export=name
Optional path to export fitted reduction model (.pkl) for reuse

DESCRIPTION

i.hyper.preproc performs preprocessing of hyperspectral data stored as a 3D raster map (raster_3d). It is designed to improve data quality, suppress noise, and transform the spectral dimension into representations better suited for scientific analysis and machine learning workflows.

The module operates directly on hyperspectral cubes imported with i.hyper.import or other compatible 3D raster datasets. All transformations are performed along the spectral (z) dimension for each spatial position (x, y).
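As a rough illustration of what "along the spectral dimension" means, the sketch below filters every pixel spectrum of a small synthetic cube with SciPy's savgol_filter. It is not the module's own code: the (bands, rows, cols) array layout, the helper name smooth_cube, and the synthetic data are assumptions made for the example.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_cube(cube, window_length=11, polyorder=3, derivative_order=0):
    """Filter each pixel spectrum; `cube` is assumed to be (bands, rows, cols)."""
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd")
    if polyorder >= window_length:
        raise ValueError("polyorder must be smaller than window_length")
    # axis=0 is the spectral axis, so each (row, col) position is filtered independently
    return savgol_filter(cube, window_length, polyorder,
                         deriv=derivative_order, axis=0)


rng = np.random.default_rng(0)
cube = rng.random((50, 4, 3))                 # hypothetical 50-band cube
smoothed = smooth_cube(cube, window_length=7, polyorder=3)
slope = smooth_cube(cube, window_length=7, polyorder=3, derivative_order=1)
```

The output keeps the shape of the input, so the spatial grid is untouched; only values along the band axis change.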

Preprocessing steps can be chained together in a single run. Each stage is enabled through its own flag or parameter (for example, polyorder for Savitzky–Golay filtering, -b for baseline correction, -c for continuum removal, and dr_method for dimensionality reduction), and the enabled stages are executed sequentially. The module prints the full pipeline sequence in the console (for example: Savitzky–Golay → Baseline correction → Continuum removal → PCA), giving a clear overview of the operations in the order they are applied.

i.hyper.preproc is part of the i.hyper module family and provides a reproducible, modular framework for spectral preprocessing prior to feature extraction, classification, or regression. All output maps are 3D rasters (raster_3d) compatible with the rest of the i.hyper suite.

FUNCTIONALITY

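The following preprocessing methods are supported:

Savitzky–Golay filtering (polyorder, derivative_order, window_length): spectral smoothing and, with derivative_order > 0, spectral derivatives
Baseline correction (-b)
Continuum removal (-c)
Interpolation of missing values in valid bands (-q)
Clamping of negative values to zero (-z)
Dimensionality reduction (dr_method): PCA, KPCA, Nystroem, FastICA, TruncatedSVD, NMF, SparsePCA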

Multiple steps can be combined in one command by setting the corresponding flags and parameters together. For example, polyorder=3 window_length=9 -b -c dr_method=kpca runs Savitzky–Golay filtering, baseline correction, continuum removal, and KPCA in sequence. Intermediate rasters are handled internally and automatically cleaned up.
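Continuum removal is commonly defined as dividing a spectrum by its upper convex hull. The sketch below shows that common definition for a single spectrum using NumPy only. Whether i.hyper.preproc uses exactly this formulation is an assumption, and the function name and sample values are hypothetical.

```python
import numpy as np


def continuum_removed(wavelengths, reflectance):
    """Divide one spectrum by its upper convex hull (the continuum).

    Assumes strictly increasing wavelengths and positive reflectance.
    """
    w = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)
    hull = []  # indices of hull vertices, in order of increasing wavelength
    for i in range(w.size):
        # Drop the last vertex while it lies on or below the line from
        # the vertex before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (w[b] - w[a]) * (r[i] - r[a]) - (r[b] - r[a]) * (w[i] - w[a])
            if cross < 0:
                break
            hull.pop()
        hull.append(i)
    # Linear interpolation between hull vertices gives the continuum
    continuum = np.interp(w, w[hull], r[hull])
    return r / continuum


wavelengths = [400.0, 500.0, 600.0, 700.0, 800.0]
reflectance = [0.5, 0.6, 0.3, 0.6, 0.5]
removed = continuum_removed(wavelengths, reflectance)
```

In this toy spectrum the band at 600 sits below a flat continuum of 0.6, so its continuum-removed value is 0.5, while the hull vertices themselves map to 1.0.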

All dimensionality reduction methods are implemented using the scikit-learn library. For detailed algorithmic descriptions and parameter explanations, refer to the official scikit-learn documentation.
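To show how a scikit-learn estimator relates to a hyperspectral cube, the following sketch reshapes a synthetic cube into a (pixels, bands) matrix, fits PCA, and folds the scores back into a cube with one layer per component. The (bands, rows, cols) layout and the random data are assumptions; the module's internal data handling may differ.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
cube = rng.random((40, 6, 5))            # hypothetical (bands, rows, cols) cube
bands, rows, cols = cube.shape

# One row per pixel spectrum: (rows * cols, bands)
spectra = cube.reshape(bands, -1).T

pca = PCA(n_components=10, random_state=0)
scores = pca.fit_transform(spectra)      # (rows * cols, 10)

# Back to a cube whose depth is the number of components
reduced = scores.T.reshape(-1, rows, cols)
```

The other estimators named in dr_method (for example KernelPCA, FastICA, TruncatedSVD, NMF, and SparsePCA in sklearn.decomposition) expose the same fit/transform interface, so the reshaping pattern stays the same.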

NOTES

The module is constructed as a preprocessing pipeline engine. Each transformation acts spectrally while preserving full spatial alignment. Operations are reported in the console as a sequential pipeline.
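What acting spectrally means for one pixel can be sketched with the two simplest operations, gap interpolation (-q) and clamping (-z). The example below uses linear interpolation over band indices, which is an assumption; the interpolation scheme and the order in which the module applies the two flags are not documented here.

```python
import numpy as np


def fill_and_clamp(spectrum):
    """Linearly interpolate NaN bands, then clamp negative values to zero."""
    spectrum = np.asarray(spectrum, dtype=float)
    idx = np.arange(spectrum.size)
    valid = ~np.isnan(spectrum)
    if not valid.any():
        return spectrum                  # nothing to interpolate from
    filled = np.interp(idx, idx[valid], spectrum[valid])
    return np.clip(filled, 0.0, None)


pixel = [0.2, np.nan, 0.4, -0.1]         # hypothetical 4-band spectrum
cleaned = fill_and_clamp(pixel)
```

Because the function only looks at one spectrum at a time, applying it to every (x, y) position cannot shift or mix pixels spatially.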

For all dimensionality reduction methods (PCA, KPCA, Nystroem, FastICA, TruncatedSVD, NMF, and SparsePCA), the number of output components is set with the dr_components parameter.
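The automatic setting (dr_components=0) is described in the parameter list as "up to 10 or number of bands". One plausible reading of that rule is sketched below; the helper name and the exact behaviour are assumptions, not taken from the module source.

```python
def resolve_components(requested, n_bands):
    """Return the number of components to keep for a cube with n_bands bands."""
    if requested < 0:
        raise ValueError("dr_components must not be negative")
    if requested == 0:
        # Automatic: at most 10, and never more than the available bands
        return min(10, n_bands)
    return requested


auto_many_bands = resolve_components(0, 234)
auto_few_bands = resolve_components(0, 6)
explicit = resolve_components(30, 426)
```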

Chunked dimensionality reduction:
Large hyperspectral datasets can be processed in smaller portions using the dr_chunk_size option. This enables dimensionality reduction on datasets exceeding system memory capacity. When dr_chunk_size is used with kernel-based methods (e.g., KPCA), the algorithm operates as an approximation of the full kernel mapping, trading some precision for scalability.
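The general idea of fitting and applying a reduction chunk by chunk can be sketched with scikit-learn's IncrementalPCA. This estimator is used here only because it supports partial fitting directly; it is not necessarily what the module does internally for any dr_method, and the array sizes are made up.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
spectra = rng.random((1000, 50))         # hypothetical (pixels, bands) matrix
chunk_size = 200                         # plays the role of dr_chunk_size
n_components = 10

ipca = IncrementalPCA(n_components=n_components)

# First pass: update the model one chunk at a time
for start in range(0, spectra.shape[0], chunk_size):
    ipca.partial_fit(spectra[start:start + chunk_size])

# Second pass: project each chunk and stitch the scores together
reduced = np.vstack([
    ipca.transform(spectra[start:start + chunk_size])
    for start in range(0, spectra.shape[0], chunk_size)
])
```

Only one chunk has to be held in memory during each pass. Each chunk passed to partial_fit must contain at least n_components spectra.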

Model export and reuse:
Trained dimensionality reduction models can be exported using the dr_export option. The exported model (in .pkl format) can be reused to transform other spectra—such as field or laboratory measurements from a spectroradiometer—into the same reduced feature space. This allows consistent feature alignment between image-derived data and point spectra, facilitating integrated machine learning and spectral modeling workflows.
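A minimal sketch of the export and reuse round trip with joblib follows. The (feature_map, pca_after) tuple layout is inferred from the Nystroem example in this manual; the file name, array sizes, and estimator settings are hypothetical, and other methods may be stored differently.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.decomposition import PCA
from sklearn.kernel_approximation import Nystroem

rng = np.random.default_rng(0)
image_spectra = rng.random((500, 60))    # hypothetical (pixels, bands) spectra

# Fit a kernel feature map, then compress its output with PCA
feature_map = Nystroem(kernel="rbf", gamma=0.01, n_components=100, random_state=0)
Z = feature_map.fit_transform(image_spectra)
pca_after = PCA(n_components=30, random_state=0).fit(Z)

# Export both fitted objects to one .pkl file
model_path = os.path.join(tempfile.mkdtemp(), "nystroem_model.pkl")
joblib.dump((feature_map, pca_after), model_path)

# Reload and apply to new spectra with the same 60 bands
loaded_map, loaded_pca = joblib.load(model_path)
field_spectra = rng.random((5, 60))
field_reduced = loaded_pca.transform(loaded_map.transform(field_spectra))
```

Pickle-based files can execute arbitrary code when loaded, so only load .pkl models from sources you trust.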

Results can be directly used by i.hyper.explore, i.hyper.composite, or exported with i.hyper.export for further analysis.

EXAMPLES

# Example 1: Savitzky–Golay smoothing (basic denoising)

# Set the region
g.region raster_3d=prisma

# Perform Savitzky–Golay smoothing with a window of 7 bands and a polynomial order of 3
i.hyper.preproc input=prisma output=prisma_savgol \
                window_length=7 polyorder=3

# Console output:
Savitzky–Golay
Loading floating point  data with 4  bytes ...  (1254x1222x234)

# Example 2: PCA transformation

# Set the region
g.region raster_3d=enmap

# Perform PCA with 10 components
# Interpolate missing values in valid bands (-q)
i.hyper.preproc input=enmap output=enmap_pca \
                dr_method=pca dr_components=10 -q

# Console output:
PCA
Interpolating missing values across spectral bands...
Loading floating point  data with 4  bytes ...  (1263x1127x10)

# Example 3.1: Combined preprocessing pipeline

# Set the region
g.region raster_3d=tanager

# Savitzky–Golay derivative + baseline correction + continuum removal + Nystroem
# Interpolate missing values in valid bands (-q)
# Process the hyperspectral 3D map in chunks and export the fitted Nystroem model
i.hyper.preproc input=tanager output=tanager_ml \
                polyorder=3 derivative_order=1 window_length=9 \
                -b -c -q \
                dr_method=nystroem dr_components=30 \
                dr_chunk_size=5000 \
                dr_export=/models/tanager_nystroem.pkl

# Console output:
Savitzky–Golay → Baseline correction → Continuum removal → NYSTROEM
Interpolating missing values across spectral bands...
Loading floating point  data with 4  bytes ...  (869x804x426)

# Example 3.2: Using the exported Nystroem model in Python
import joblib
import numpy as np

# Load exported Nystroem model (kernel map + PCA compressor)
feature_map, pca_after = joblib.load("/models/tanager_nystroem.pkl")

# Load new field spectra (rows = samples, cols = wavelengths).
# The spectra must use the same wavelength order and scaling as the hyperspectral 3D map.
spectra = np.loadtxt("/data/field_spectra.txt")

# Apply the same nonlinear mapping and dimensionality reduction
Z = feature_map.transform(spectra)
spectra_reduced = pca_after.transform(Z)

SEE ALSO

i.hyper.explore, i.hyper.composite, i.hyper.export, i.hyper.import, r3.stats

DEPENDENCIES

AUTHORS

Alen Mangafić and Tomaž Žagar, Geodetic Institute of Slovenia

SOURCE CODE

Available at: i.hyper.preproc source code (history)

Latest change: Monday Nov 17 15:45:17 2025 in commit: 615887d217deac99a8f08bcf940384863fd47f2b


© 2003-2025 GRASS Development Team, GRASS GIS 8.4.2dev Reference Manual