Calibration transfer#

A model trained on one instrument does not always predict accurately on a second instrument — even when measuring the same samples. Differences in detector response, optical alignment, wavelength accuracy, and environmental conditions all introduce systematic spectral shifts between instruments.

Calibration transfer (also called model transder or domain adaptation) corrects for these differences so that a model built on a source instrument can be applied to spectra measured on a target instrument, without rebuilding the model from scratch.

chemotools implements two classical transfer methods in chemotools.adaptation:

  • DirectStandardization (DS) — a global linear transformation.

  • PiecewiseDirectStandardization (PDS) — a local, window-based transformation using PLS regression.

Both follow the formulation introduced by Wang, Veltkamp & Kowalski (1991) [1].

The calibration transfer workflow#

The general workflow is:

  1. Measure a small set of transfer samples on both the source and target instruments.

  2. Fit the standardization transformer using the paired spectra.

  3. Apply the transformer to new target spectra before prediction.

Source instrument  ──►  calibration model (PLS, etc.)
                               ▲
Target instrument  ──►  standardization transformer  ──►  standardized spectra

After standardization, the corrected spectra can be fed directly into the source-instrument model.

Direct Standardization (DS)#

DirectStandardization finds a global linear map T that transforms target-instrument spectra to the source-instrument space:

\[\hat{X}_{\text{source}} = X_{\text{target}} \cdot T\]

T is estimated by ordinary least squares on the paired transfer spectra. DS is simple, fast, and works well when the spectral differences between instruments are globally smooth.

Fitting DS

Pass the target transfer spectra as X and the corresponding source transfer spectra as X_source:

import numpy as np
from chemotools.adaptation import DirectStandardization

rng = np.random.default_rng(42)
n_transfer, n_features = 30, 100

# Transfer samples measured on both instruments
X_source = rng.normal(size=(n_transfer, n_features))
X_target = X_source * 1.02 + 0.005 + rng.normal(size=(n_transfer, n_features)) * 0.002

ds = DirectStandardization()
ds.fit(X_target, X_source=X_source)

Applying DS to new spectra

# New spectra from the target instrument
X_new_target = rng.normal(size=(10, n_features))

# Transform to source-instrument space before prediction
X_new_standardized = ds.transform(X_new_target)

DS in a Pipeline

from sklearn.pipeline import Pipeline
from sklearn.cross_decomposition import PLSRegression

pipe = Pipeline([
    ("ds",  DirectStandardization()),
    ("pls", PLSRegression(n_components=3)),
])

# Fit the transfer step on transfer samples (X_target → X_source mapping)
# Fit the regression step on source spectra and reference values
# In practice these two fits are done separately; see the API reference.

注釈

When X_source is not provided, DS fits an identity transformation (i.e., transform returns X unchanged). This is useful as a no-op placeholder in pipelines during development.

Piecewise Direct Standardization (PDS)#

PiecewiseDirectStandardization extends DS by building one local PLS model per output feature, using a small window of neighbouring input features. This makes PDS more robust when:

  • the spectral shift between instruments is non-linear or varies across the spectral range.

  • there are wavelength registration differences (small offsets between the x-axis grids of the two instruments).

For each output feature \(j\), PDS uses the window \([j - w, \ldots, j + w]\) of the target spectrum (where w = window_length) to predict the corresponding feature of the source spectrum via PLS regression.

Fitting PDS

from chemotools.adaptation import PiecewiseDirectStandardization

pds = PiecewiseDirectStandardization(
    window_length=3,    # half-width of the local spectral window
    n_components=2,     # PLS components per local model
)
pds.fit(X_target, X_source=X_source)

Applying PDS

X_new_standardized = pds.transform(X_new_target)

Choosing between DS and PDS#

DS

PDS

Transformation

Global linear map

Local PLS per feature

Number of transfer samples needed

As many as features (can use regularization)

Few (local models have few variables)

Handles wavelength shifts

Poorly

Well (windowed input)

Handles non-linear differences

No

Partially

Computation time

Fast

Moderate (one PLS per feature)

Best for

Globally smooth instrument differences

Wavelength offsets, local non-linearities

A common strategy is to start with DS (fast, interpretable) and switch to PDS if prediction accuracy on the target instrument is insufficient.

References#

参考