OrthogonalPLS

class chemotools.projection.OrthogonalPLS(n_components: int = 1, copy=False)

Bases: TransformerMixin, BaseEstimator

A transformer that removes variation in X that is orthogonal to the target y using Orthogonal Projection to Latent Structures (OPLS) [1].

OPLS extends PLS by explicitly separating X into predictive variation, correlated with y, and orthogonal variation, unrelated to y. At each iteration, a predictive weight vector is estimated using the PLS criterion (maximizing covariance between X and y). Scores and loadings are then computed, and the loading vector is decomposed into a component aligned with the predictive weight and a component orthogonal to it. The orthogonal component defines an orthogonal score vector, which is used to deflate X.
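The iteration described above can be sketched in NumPy for a single target. This is a minimal illustration under stated assumptions, not the chemotools implementation: the helper name `remove_one_orthogonal_component` is hypothetical, X and y are assumed to be centered already, and the update formulas follow the usual single-y OPLS formulation.

```python
import numpy as np


def remove_one_orthogonal_component(X, y):
    """One OPLS iteration for a single target (hypothetical helper).

    X and y are assumed to be mean-centered already.
    """
    # Predictive weight: the X direction with maximal covariance with y.
    w = X.T @ y
    w /= np.linalg.norm(w)
    # Predictive scores and loadings.
    t = X @ w
    p = X.T @ t / (t @ t)
    # Split the loading: keep only the part orthogonal to the predictive weight.
    w_orth = p - (w @ p) * w
    w_orth /= np.linalg.norm(w_orth)
    # Orthogonal scores and loadings, then deflate X.
    t_orth = X @ w_orth
    p_orth = X.T @ t_orth / (t_orth @ t_orth)
    X_deflated = X - np.outer(t_orth, p_orth)
    return X_deflated, t_orth, w_orth, p_orth


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = X[:, 0] + 0.1 * rng.normal(size=20)
Xc = X - X.mean(axis=0)
yc = y - y.mean()

X_deflated, t_orth, w_orth, p_orth = remove_one_orthogonal_component(Xc, yc)
```

In this sketch the orthogonal scores are uncorrelated with y by construction, because `w_orth` is orthogonal to the predictive weight, and the deflated matrix keeps the original shape. If the loading is exactly parallel to the predictive weight (for example when centered X has rank 1), `w_orth` is zero and the normalization fails, so a real implementation would need a guard for that case.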

This procedure is repeated to remove multiple orthogonal components while retaining the predictive structure. Multivariate targets are supported via decomposition of the cross-covariance matrix.
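For multivariate targets, one common way to obtain a predictive weight from the cross-covariance matrix is to take its first left singular vector. The sketch below illustrates that idea only; whether chemotools decomposes the matrix in exactly this way is an assumption, and the data is synthetic.

```python
import numpy as np

# Synthetic data with two targets, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
Y = rng.normal(size=(30, 2))
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

# Cross-covariance matrix (up to a constant factor), shape (n_features, n_targets).
C = Xc.T @ Yc

# The first left singular vector is the unit-norm X direction whose scores
# have maximal covariance with a unit-norm combination of the targets.
U, s, Vt = np.linalg.svd(C, full_matrices=False)
w = U[:, 0]
```

With a single target column, this reduces (up to sign) to the normalized vector `Xc.T @ yc` used in the univariate case.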

The transformer returns X with orthogonal variation removed, preserving the original number of features.

Parameters:

n_components (int, default=1) – The number of orthogonal components to compute. This determines how many orthogonal variations will be removed from the data.
copy (bool, default=False) – If True, a copy of the input data is created and used for computations. If False, the input data is modified in place.

Variables:

x_weights (ndarray of shape (n_features, n_components)) – The weights of the predictive components.
x_weights_orth (ndarray of shape (n_features, n_components)) – The weights of the orthogonal components.
x_loadings (ndarray of shape (n_features, n_components)) – The loadings of the predictive components.
x_loadings_orth (ndarray of shape (n_features, n_components)) – The loadings of the orthogonal components.
x_scores (ndarray of shape (n_samples, n_components)) – The scores of the predictive components.
x_scores_orth (ndarray of shape (n_samples, n_components)) – The scores of the orthogonal components.
mean_X (ndarray of shape (n_features,)) – The mean of the original data X used for centering.
mean_y (float or ndarray of shape (n_targets,)) – The mean of the target variable y used for centering.
retained_variance_ratio (float) – The proportion of variance in X that is retained, i.e. explained by the predictive components.
removed_variance_ratio (float) – The proportion of variance in X that is removed, i.e. explained by the orthogonal components.

References

Examples

Fit and apply OrthogonalPLS to remove variation in X that is orthogonal to y.

>>> import numpy as np
>>> from chemotools.projection import OrthogonalPLS
>>> X = np.array([[1, 2], [3, 4], [5, 6]])
>>> y = np.array([1, 2, 3])
>>> opls = OrthogonalPLS(n_components=1)
>>> opls.fit(X, y)
OrthogonalPLS(n_components=1, copy=False)
>>> X_transformed = opls.transform(X, y)

fit(X: ndarray, y: ndarray) → OrthogonalPLS

Fit the OrthogonalPLS model to the training data.

Parameters:

X (array-like of shape (n_samples, n_features)) – The input data to fit the model to.
y (array-like of shape (n_samples,)) – The target values.

Returns:

self – Fitted estimator.

Return type:

OrthogonalPLS

transform(X: ndarray, y=None) → ndarray

Apply the OrthogonalPLS correction to X.

This returns the predictive part of the data, i.e. the variation in X that is related to y, after removing the orthogonal part (variation in X that is not related to y).

Parameters:

X (array-like of shape (n_samples, n_features)) – The input data to transform.
y (None) – Ignored. Present only for API consistency.

Returns:

X_transformed – The transformed data.

Return type:

array-like of shape (n_samples, n_features)
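As a rough illustration of how stored orthogonal weights and loadings can be applied to new samples, the sketch below centers the data with the training mean and removes each orthogonal component in turn. The helper `apply_orthogonal_filter` is hypothetical, and its arguments only mirror the documented attributes mean_X, x_weights_orth and x_loadings_orth. Whether the library adds the mean back after filtering is not stated here, so this sketch returns centered data.

```python
import numpy as np


def apply_orthogonal_filter(X_new, mean_X, W_orth, P_orth):
    """Remove stored orthogonal components from new samples (hypothetical helper).

    W_orth and P_orth have shape (n_features, n_components).
    """
    # Center with the training mean; this creates a new array, so X_new is untouched.
    X_filtered = np.asarray(X_new, dtype=float) - mean_X
    # Remove the components one at a time, in the order they were fitted.
    for w_orth, p_orth in zip(W_orth.T, P_orth.T):
        t_orth = X_filtered @ w_orth            # orthogonal scores of the new samples
        X_filtered -= np.outer(t_orth, p_orth)  # subtract the orthogonal part
    return X_filtered


# Toy values chosen so the effect is easy to see: the single orthogonal
# component is the first feature axis, so the first column is removed.
X_new = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
mean_X = np.zeros(3)
W_orth = np.array([[1.0], [0.0], [0.0]])
P_orth = np.array([[1.0], [0.0], [0.0]])

X_filtered = apply_orthogonal_filter(X_new, mean_X, W_orth, P_orth)
# X_filtered is [[0., 2., 3.], [0., 5., 6.]]
```

The output has the same shape as the input, which matches the documented behavior of preserving the original number of features.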