i.hyper.preproc

General hyperspectral data preprocessing

i.hyper.preproc [-bcqz] input=name output=name [polyorder=integer] [derivative_order=integer] [window_length=integer] [dr_method=string] [dr_components=integer] [dr_kernel=string] [dr_gamma=float] [dr_degree=integer] [dr_max_iter=integer] [dr_tol=float] [dr_alpha=float] [dr_l1_ratio=float] [dr_random_state=integer] [dr_chunk_size=integer] [dr_bands=string] [dr_export=name] [--overwrite] [--verbose] [--quiet] [--qq] [--ui]

Example:

i.hyper.preproc input=name output=name

grass.tools.Tools.i_hyper_preproc(input, output, polyorder=0, derivative_order=0, window_length=11, dr_method=None, dr_components=0, dr_kernel="rbf", dr_gamma=0.01, dr_degree=3, dr_max_iter=200, dr_tol=1e-4, dr_alpha=0.0, dr_l1_ratio=0.0, dr_random_state=0, dr_chunk_size=0, dr_bands=None, dr_export=None, flags=None, overwrite=None, verbose=None, quiet=None, superquiet=None)

Example:

tools = Tools()
tools.i_hyper_preproc(input="name", output="name")

This grass.tools API is experimental in version 8.5 and expected to be stable in version 8.6.

grass.script.run_command("i.hyper.preproc", input, output, polyorder=0, derivative_order=0, window_length=11, dr_method=None, dr_components=0, dr_kernel="rbf", dr_gamma=0.01, dr_degree=3, dr_max_iter=200, dr_tol=1e-4, dr_alpha=0.0, dr_l1_ratio=0.0, dr_random_state=0, dr_chunk_size=0, dr_bands=None, dr_export=None, flags=None, overwrite=None, verbose=None, quiet=None, superquiet=None)

Example:

gs.run_command("i.hyper.preproc", input="name", output="name")

Parameters

input=name [required]
    Input hyperspectral raster map
output=name [required]
    Output preprocessed raster map
polyorder=integer
    Polynomial order for Savitzky-Golay filter (0 = skip Savitzky-Golay)
    Default: 0
derivative_order=integer
    Derivative order (0 = smoothing only)
    Default: 0
window_length=integer
    Window length (must be odd number)
    Default: 11
dr_method=string
    Dimensionality reduction method (linear or nonlinear)
    Allowed values: pca, kpca, nystroem, fastica, truncatedsvd, nmf, sparsepca
dr_components=integer
    Number of components to retain (PCA,KPCA,Nystroem,FastICA,TruncatedSVD,NMF,SparsePCA). 0 = automatic (up to 10 or number of bands)
    Default: 0
dr_kernel=string
    Kernel type (used only for KPCA and Nystroem)
    Allowed values: linear, rbf, poly, sigmoid
    Default: rbf
dr_gamma=float
    Kernel gamma (KPCA and Nystroem only)
    Default: 0.01
dr_degree=integer
    Polynomial degree (used if kernel=poly)
    Default: 3
dr_max_iter=integer
    Maximum iterations for convergence (FastICA,NMF,SparsePCA)
    Default: 200
dr_tol=float
    Convergence tolerance (FastICA,NMF,SparsePCA)
    Default: 1e-4
dr_alpha=float
    Regularization strength (NMF,SparsePCA)
    Default: 0.0
dr_l1_ratio=float
    L1 ratio in [0,1] (NMF,SparsePCA)
    Default: 0.0
dr_random_state=integer
    Random seed for reproducibility (PCA,FastICA,NMF,SparsePCA,TruncatedSVD)
    Default: 0
dr_chunk_size=integer
    Number of spectra per chunk for dimensionality reduction (0 = automatic; KPCA is approximated if chunked)
    Default: 0
dr_bands=string
    Wavelength intervals or single values to include before reduction (e.g., 400–700,850–1300,2200)
dr_export=name
    Optional path to export fitted reduction model (.pkl) for reuse
-b
    Apply baseline correction
-c
    Apply continuum removal
-q
    Interpolate missing values in valid bands
-z
    Clamp negative values to zero
--overwrite
    Allow output files to overwrite existing files
--help
    Print usage summary
--verbose
    Verbose module output
--quiet
    Quiet module output
--qq
    Very quiet module output
--ui
    Force launching GUI dialog

input : str, required
    Input hyperspectral raster map
    Used as: input, raster_3d, name
output : str, required
    Output preprocessed raster map
    Used as: output, raster_3d, name
polyorder : int, optional
    Polynomial order for Savitzky-Golay filter (0 = skip Savitzky-Golay)
    Default: 0
derivative_order : int, optional
    Derivative order (0 = smoothing only)
    Default: 0
window_length : int, optional
    Window length (must be odd number)
    Default: 11
dr_method : str, optional
    Dimensionality reduction method (linear or nonlinear)
    Allowed values: pca, kpca, nystroem, fastica, truncatedsvd, nmf, sparsepca
dr_components : int, optional
    Number of components to retain (PCA,KPCA,Nystroem,FastICA,TruncatedSVD,NMF,SparsePCA). 0 = automatic (up to 10 or number of bands)
    Default: 0
dr_kernel : str, optional
    Kernel type (used only for KPCA and Nystroem)
    Allowed values: linear, rbf, poly, sigmoid
    Default: rbf
dr_gamma : float, optional
    Kernel gamma (KPCA and Nystroem only)
    Default: 0.01
dr_degree : int, optional
    Polynomial degree (used if kernel=poly)
    Default: 3
dr_max_iter : int, optional
    Maximum iterations for convergence (FastICA,NMF,SparsePCA)
    Default: 200
dr_tol : float, optional
    Convergence tolerance (FastICA,NMF,SparsePCA)
    Default: 1e-4
dr_alpha : float, optional
    Regularization strength (NMF,SparsePCA)
    Default: 0.0
dr_l1_ratio : float, optional
    L1 ratio in [0,1] (NMF,SparsePCA)
    Default: 0.0
dr_random_state : int, optional
    Random seed for reproducibility (PCA,FastICA,NMF,SparsePCA,TruncatedSVD)
    Default: 0
dr_chunk_size : int, optional
    Number of spectra per chunk for dimensionality reduction (0 = automatic; KPCA is approximated if chunked)
    Default: 0
dr_bands : str, optional
    Wavelength intervals or single values to include before reduction (e.g., 400–700,850–1300,2200)
dr_export : str, optional
    Optional path to export fitted reduction model (.pkl) for reuse
    Used as: output, file, name
flags : str, optional
    Allowed values: b, c, q, z
    b
        Apply baseline correction
    c
        Apply continuum removal
    q
        Interpolate missing values in valid bands
    z
        Clamp negative values to zero
overwrite : bool, optional
    Allow output files to overwrite existing files
    Default: None
verbose : bool, optional
    Verbose module output
    Default: None
quiet : bool, optional
    Quiet module output
    Default: None
superquiet : bool, optional
    Very quiet module output
    Default: None

Returns:

result : grass.tools.support.ToolResult | None
If the tool produces text as standard output, a ToolResult object will be returned. Otherwise, None will be returned.

Raises:

grass.tools.ToolError: When the tool ended with an error.

DESCRIPTION

i.hyper.preproc performs preprocessing of hyperspectral data stored as a 3D raster map (raster_3d). It is designed to improve data quality, suppress noise, and transform the spectral dimension into representations better suited for scientific analysis and machine learning workflows.

The module operates directly on hyperspectral cubes imported with i.hyper.import or other compatible 3D raster datasets. All transformations are performed along the spectral (z) dimension for each spatial position (x, y).

Preprocessing steps can be chained in a single run: Savitzky–Golay filtering is enabled with polyorder, baseline correction and continuum removal with the -b and -c flags, and dimensionality reduction with dr_method. The stages are executed sequentially, and the module prints the full pipeline sequence to the console (for example: Savitzky–Golay → Baseline correction → Continuum removal → PCA), giving a clear overview of the operations in the order they are applied.

i.hyper.preproc is part of the i.hyper module family and provides a reproducible, modular framework for spectral preprocessing prior to feature extraction, classification, or regression. All output maps are 3D rasters (raster_3d) compatible with the rest of the i.hyper suite.

FUNCTIONALITY

The following preprocessing methods are supported:

  • Savitzky–Golay (polyorder, derivative_order, window_length): Polynomial smoothing and derivative computation to reduce spectral noise and enhance absorption features.
  • Baseline correction (-b): Removes global trends or offsets in reflectance curves.
  • Continuum removal (-c): Normalizes spectra to their convex hull to highlight relative absorption depths.
  • Principal Component Analysis (pca): Linear dimensionality reduction using eigendecomposition of the covariance matrix.
  • Kernel PCA (kpca): Nonlinear dimensionality reduction using kernel functions (linear, RBF, polynomial, sigmoid).
  • Nystroem approximation (nystroem): Scalable approximation of Kernel PCA using a low-rank kernel mapping followed by PCA compression. Provides nonlinear feature extraction suitable for large hyperspectral cubes.
  • Fast Independent Component Analysis (fastica): Separates statistically independent spectral sources or mixtures.
  • Truncated Singular Value Decomposition (truncatedsvd): Linear dimensionality reduction preserving dominant singular vectors (useful for sparse data).
  • Non-negative Matrix Factorization (nmf): Decomposes spectra into additive non-negative basis components.
  • Sparse Principal Component Analysis (sparsepca): PCA variant enforcing sparsity on component loadings for interpretability.
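
As a rough illustration of the Savitzky–Golay entry above: the filter fits a polynomial of order polyorder to each window of window_length bands by least squares and evaluates the fit (or its derivative) at the window center. The sketch below derives the window weights exactly from the normal equations using only the Python standard library. The function names are hypothetical and this is not the module's code, which presumably relies on SciPy (see DEPENDENCIES).

```python
from fractions import Fraction
from math import factorial


def savgol_weights(window_length, polyorder, deriv=0):
    """Return exact Savitzky-Golay weights for the center of one window."""
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd")
    if polyorder >= window_length:
        raise ValueError("polyorder must be less than window_length")
    if deriv > polyorder:
        raise ValueError("deriv must not exceed polyorder")
    half = window_length // 2
    offsets = range(-half, half + 1)
    n = polyorder + 1
    # Normal equations (A^T A) c = e_deriv, with A[j][k] = offset_j ** k
    ata = [[Fraction(sum(x ** (r + c) for x in offsets)) for c in range(n)]
           for r in range(n)]
    rhs = [Fraction(int(r == deriv)) for r in range(n)]
    # Gauss-Jordan elimination on the small (polyorder + 1)-sized system
    for col in range(n):
        pivot = next(r for r in range(col, n) if ata[r][col] != 0)
        ata[col], ata[pivot] = ata[pivot], ata[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(n):
            if r != col and ata[r][col] != 0:
                factor = ata[r][col] / ata[col][col]
                ata[r] = [a - factor * b for a, b in zip(ata[r], ata[col])]
                rhs[r] -= factor * rhs[col]
    coeffs = [rhs[i] / ata[i][i] for i in range(n)]
    # Weight for each band offset; deriv! converts a_deriv into the derivative
    return [factorial(deriv) * sum(c * x ** k for k, c in enumerate(coeffs))
            for x in offsets]


def apply_filter(spectrum, weights):
    """Apply the weights as a sliding dot product along one spectrum."""
    half = len(weights) // 2
    out = list(spectrum)  # edge bands are left unchanged in this sketch
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        out[i] = float(sum(w * v for w, v in zip(weights, window)))
    return out
```

With window_length=5 and polyorder=2 this reproduces the classic smoothing weights (-3, 12, 17, 12, -3)/35, and a spectrum that is itself quadratic passes through the smoother unchanged. How the module treats the edge bands is not stated in this manual, so the sketch simply leaves them as they are.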

Multiple steps can be combined in one command. For example, polyorder=3 -b -c dr_method=kpca runs Savitzky–Golay filtering, baseline correction, continuum removal, and Kernel PCA in sequence. Intermediate rasters are handled internally and cleaned up automatically.
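
The console pipeline strings shown in the EXAMPLES section suggest a fixed stage order: Savitzky–Golay first, then baseline correction, then continuum removal, then the dimensionality reduction method in upper case. The hypothetical helper below reconstructs that label from the option values. It is inferred from the documented console output only and is not taken from the module's source.

```python
def describe_pipeline(polyorder=0, flags="", dr_method=None):
    """Build a pipeline label in the order seen in the documented console output."""
    stages = []
    if polyorder > 0:  # polyorder=0 means Savitzky-Golay is skipped
        stages.append("Savitzky–Golay")
    if "b" in flags:
        stages.append("Baseline correction")
    if "c" in flags:
        stages.append("Continuum removal")
    if dr_method:
        stages.append(dr_method.upper())
    return " → ".join(stages)


# Options of Example 3.1: polyorder=3, flags -b -c -q, dr_method=nystroem
print(describe_pipeline(polyorder=3, flags="bcq", dr_method="nystroem"))
```

The -q flag does not appear in the label because the documented output reports interpolation on a separate line.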

All dimensionality reduction methods are implemented using the scikit-learn library. For detailed algorithmic descriptions and parameter explanations, refer to the official scikit-learn documentation.
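
Continuum removal, listed above, is not a scikit-learn method, so a small standalone sketch may help. It builds the upper convex hull of one spectrum, interpolates the hull at every band, and divides the reflectance by that continuum. The function names are hypothetical; the sketch assumes strictly increasing wavelengths and positive reflectance, and the module's actual implementation may differ.

```python
def upper_hull(points):
    """Upper convex hull of (x, y) points that are sorted by increasing x."""
    hull = []
    for p in points:
        # Drop the last hull point while it lies on or below the line to p
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def continuum_removed(wavelengths, reflectance):
    """Divide a spectrum by its convex-hull continuum (values end up in (0, 1])."""
    points = list(zip(wavelengths, reflectance))
    hull = upper_hull(points)
    result = []
    seg = 0
    for x, y in points:
        # Advance to the hull segment that contains this wavelength
        while seg < len(hull) - 2 and x > hull[seg + 1][0]:
            seg += 1
        (x1, y1), (x2, y2) = hull[seg], hull[seg + 1]
        continuum = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
        result.append(y / continuum)
    return result


# Illustrative values only, not measured data
wavelengths = [400, 500, 600, 700, 800]
reflectance = [0.5, 0.4, 0.6, 0.3, 0.5]
print(continuum_removed(wavelengths, reflectance))
```

Bands that touch the hull come out as 1, while absorption features appear as values below 1 whose depth is relative to the local continuum.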

NOTES

The module is constructed as a preprocessing pipeline engine. Each transformation acts spectrally while preserving full spatial alignment. Operations are reported in the console as a sequential pipeline.

With any dimensionality reduction method (PCA, KPCA, Nystroem, FastICA, TruncatedSVD, NMF, or SparsePCA), the number of output components is controlled by the dr_components parameter.
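
The sketch below shows one possible reading of how dr_bands and dr_components interact: the band specification is parsed into wavelength intervals, the matching bands are selected, and the automatic component count is resolved against the number of selected bands. Several details are assumptions made for illustration and are not confirmed by this manual: intervals are treated as inclusive, a single value selects the nearest band, both "-" and "–" are accepted as range separators, and "up to 10 or number of bands" is read as min(10, n_bands). All function names are hypothetical.

```python
import re


def parse_band_spec(spec):
    """Parse '400-700,850-1300,2200' into (low, high) tuples."""
    intervals = []
    for token in spec.split(","):
        parts = [float(p) for p in re.split(r"[-\u2013]", token.strip())]
        if len(parts) == 1:
            intervals.append((parts[0], parts[0]))  # single wavelength
        elif len(parts) == 2 and parts[0] <= parts[1]:
            intervals.append((parts[0], parts[1]))
        else:
            raise ValueError(f"invalid band specification: {token!r}")
    return intervals


def select_bands(wavelengths, intervals):
    """Return sorted indices of the bands matched by the intervals."""
    selected = set()
    for low, high in intervals:
        if low == high:
            # Assumption: a single value picks the nearest band center
            selected.add(min(range(len(wavelengths)),
                             key=lambda i: abs(wavelengths[i] - low)))
        else:
            selected.update(i for i, w in enumerate(wavelengths)
                            if low <= w <= high)
    return sorted(selected)


def resolve_components(dr_components, n_bands):
    """Assumed reading of '0 = automatic (up to 10 or number of bands)'."""
    return min(10, n_bands) if dr_components == 0 else dr_components


# Illustrative band centers in nanometres, not taken from a real sensor
wavelengths = [400, 550, 700, 800, 900, 1300, 2195, 2250]
bands = select_bands(wavelengths, parse_band_spec("400-700,850-1300,2200"))
print(bands, resolve_components(0, len(bands)))
```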

When dimensionality reduction is applied, the hyper.json metadata of the output map stores the dimensionality reduction parameters under the top-level key dimensionality_reduction.

Chunked dimensionality reduction: Large hyperspectral datasets can be processed in smaller portions using the dr_chunk_size option. This enables dimensionality reduction on datasets exceeding system memory capacity. When dr_chunk_size is used with kernel-based methods (e.g., KPCA), the algorithm operates as an approximation of the full kernel mapping, trading some precision for scalability.
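
To make the chunking idea concrete, the sketch below splits a set of spectra into index ranges of at most chunk_size spectra and accumulates per-band means one chunk at a time, the kind of statistic a reduction method needs before centering the data. It is a generic illustration with hypothetical function names. For brevity it reads from an in-memory list, whereas a real run would read each chunk from the 3D raster.

```python
def iter_chunks(n_spectra, chunk_size):
    """Yield (start, stop) index ranges covering n_spectra spectra."""
    for start in range(0, n_spectra, chunk_size):
        yield start, min(start + chunk_size, n_spectra)


def streaming_band_means(spectra, chunk_size):
    """Accumulate per-band means while touching only one chunk at a time."""
    n_bands = len(spectra[0])
    totals = [0.0] * n_bands
    for start, stop in iter_chunks(len(spectra), chunk_size):
        for spectrum in spectra[start:stop]:
            for band, value in enumerate(spectrum):
                totals[band] += value
    return [total / len(spectra) for total in totals]


# If the 869x804 cube of Example 3.1 is read as 869 * 804 spectra,
# dr_chunk_size=5000 would correspond to this many chunks:
print(len(list(iter_chunks(869 * 804, 5000))))
```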

Model export and reuse: Trained dimensionality reduction models can be exported using the dr_export option. The exported model (in .pkl format) can be reused to transform other spectra, such as field or laboratory measurements from a spectroradiometer, into the same reduced feature space. This allows consistent feature alignment between image-derived data and point spectra, facilitating integrated machine learning and spectral modeling workflows.
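
Before an exported model is applied to field or laboratory spectra, those spectra must be sampled at the same band centers as the image. One simple way to do that is linear interpolation, sketched below with the standard library only. The function name is hypothetical, and linear interpolation is an assumption made for illustration; workflows that account for the sensor's spectral response would need a more careful resampling.

```python
from bisect import bisect_left


def resample_spectrum(src_wavelengths, src_values, target_wavelengths):
    """Linearly interpolate one spectrum onto the target band centers.

    src_wavelengths must be sorted in increasing order.
    """
    resampled = []
    for w in target_wavelengths:
        if not src_wavelengths[0] <= w <= src_wavelengths[-1]:
            raise ValueError(f"target wavelength {w} is outside the measured range")
        hi = bisect_left(src_wavelengths, w)
        if src_wavelengths[hi] == w:
            resampled.append(src_values[hi])  # exact match, no interpolation
            continue
        lo = hi - 1
        t = (w - src_wavelengths[lo]) / (src_wavelengths[hi] - src_wavelengths[lo])
        resampled.append(src_values[lo] + t * (src_values[hi] - src_values[lo]))
    return resampled


# Illustrative values only: a field spectrum sampled every 10 nm,
# resampled to four hypothetical image band centers
field_wavelengths = [400, 410, 420]
field_reflectance = [0.1, 0.2, 0.4]
print(resample_spectrum(field_wavelengths, field_reflectance, [400, 405, 415, 420]))
```

The resampled rows can then be stacked into the samples-by-wavelengths array that the exported model expects.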

Results can be directly used by i.hyper.explore, i.hyper.composite, or exported with i.hyper.export for further analysis.

EXAMPLES

::: code

# Example 1: Savitzky–Golay smoothing (basic denoising)

# Set the region
g.region raster_3d=prisma

# Perform Savitzky–Golay smoothing with a window of 7 bands and a polynomial order of 3
i.hyper.preproc input=prisma output=prisma_savgol \
                window_length=7 polyorder=3

# Console output:
Savitzky–Golay
Loading floating point  data with 4  bytes ...  (1254x1222x234)

:::

::: code

# Example 2: PCA transformation

# Set the region
g.region raster_3d=enmap

# Perform PCA
# Interpolate missing values in valid bands
i.hyper.preproc input=enmap output=enmap_pca \
                dr_method=pca dr_components=10 -q

# Console output:
PCA
Interpolating missing values across spectral bands...
Loading floating point  data with 4  bytes ...  (1263x1127x10)

:::

::: code

# Example 3.1: Combined preprocessing pipeline

# Set the region
g.region raster_3d=tanager

# Savitzky–Golay derivative + baseline correction + continuum removal + Nystroem
# Interpolate missing values in valid bands
# Process the hyperspectral 3D map in chunks and export the fitted Nystroem model
i.hyper.preproc input=tanager output=tanager_ml \
                polyorder=3 derivative_order=1 window_length=9 \
                -b -c -q \
                dr_method=nystroem dr_components=30 \
                dr_chunk_size=5000 \
                dr_export=/models/tanager_nystroem.pkl

# Console output:
Savitzky–Golay → Baseline correction → Continuum removal → NYSTROEM
Interpolating missing values across spectral bands...
Loading floating point  data with 4  bytes ...  (869x804x426)

:::

::: code

# Example 3.2: Using the exported Nystroem model in Python
import joblib
import numpy as np

# Load exported Nystroem model (kernel map + PCA compressor)
feature_map, pca_after = joblib.load("/models/tanager_nystroem.pkl")

# Load new field spectra (rows = samples, cols = wavelengths).
# The spectra must use the same wavelength order and scaling as the hyperspectral 3D map.
spectra = np.loadtxt("/data/field_spectra.txt")

# Apply the same nonlinear mapping and dimensionality reduction
Z = feature_map.transform(spectra)
spectra_reduced = pca_after.transform(Z)

:::

SEE ALSO

i.hyper.metadata, i.hyper.explore, i.hyper.composite, i.hyper.export, i.hyper.import, r3.stats

DEPENDENCIES

  • NumPy: Core numerical operations and array manipulation.
  • SciPy: Signal processing.
  • scikit-learn: Machine learning algorithms for PCA, KPCA, FastICA, NMF, SparsePCA, TruncatedSVD, and Nystroem.

AUTHORS

Alen Mangafić and Tomaž Žagar, Geodetic Institute of Slovenia

SOURCE CODE

Available at: i.hyper.preproc source code (history)
Latest change: Monday Jun 22 12:53:40 2026 in commit 425b037