KernelRidge¶

class kooplearn.kernel.KernelRidge(n_components=None, *, lag_time=1, reduced_rank=True, kernel='linear', gamma=None, degree=3, coef0=1, kernel_params=None, alpha=1e-06, eigen_solver='auto', tol=0, max_iter=None, iterated_power='auto', n_oversamples=5, optimal_sketching=False, random_state=None, copy_X=True, n_jobs=None)[source]¶

Bases: BaseEstimator

Kernel model minimizing the \(L^{2}\) loss.

Implements a model approximating the Koopman (deterministic systems) or Transfer (stochastic systems) operator by lifting the state with a infinite-dimensional nonlinear feature map associated to a kernel \(k\) and then minimizing the \(L^{2}\) loss in the embedded space as described in Kostic et al. [1].

Tip

The dynamical modes obtained by calling kooplearn.kernel.KernelRidge.dynamical_modes correspond to the Kernel Dynamical Mode Decomposition by O. Williams et al. [2].

Parameters:

n_components (int or None, default None) – Number of components to retain. If None, all components are used.
lag_time (int, default 1) – Time delay between the pairs of snapshots \((X_t, X_{t + \text{lag_time}})\) used to train the estimator.
reduced_rank (bool, default True) – Whether to use reduced-rank regression introduced in Kostic et al. [8]. If False, initializes the classical principal component estimator.
kernel ({'linear', 'poly', 'rbf', 'sigmoid', 'cosine'} ) – or callable, default=’linear’ Kernel function to use, or a callable that returns a Gram matrix.
gamma (float or None, default None) – Kernel coefficient for rbf, poly, and sigmoid kernels. Ignored by other kernels. If None, it is set to 1 / n_features.
degree (float, default 3) – Degree for polynomial kernels. Ignored by other kernels.
coef0 (float, default 1) – Independent term in polynomial and sigmoid kernels. Ignored by other kernels.
kernel_params (dict or None, default None) – Parameters (keyword arguments) passed to a callable kernel. Ignored by other kernels.
alpha (float, default 1e-6) – Tikhonov (ridge) regularization coefficient. None is equivalent to alpha = 0, and internally calls specialized stable algorithms to deal with this specific case.
eigen_solver ({'auto', 'dense', 'arpack', 'randomized'}, default 'auto') –
Solver used to perform the internal SVD calculations. If n_components is much less than the number of training samples, randomized (or arpack to a smaller extent) may be more efficient than the dense solver.

auto :
the solver is selected automatically based on the number of samples and components: if the number of components to extract is less than 10 (strict) and the number of samples is more than 200 (strict), the arpack method is enabled. Otherwise the exact full eigenvalue decomposition is computed and optionally truncated afterwards (dense method).

dense :
run exact full eigenvalue decomposition calling the standard LAPACK solver via scipy.linalg.eigh, and select the components by postprocessing.

arpack :
run SVD truncated to n_components calling ARPACK solver using scipy.sparse.linalg.eigsh. It requires strictly 0 < n_components < n_samples.

randomized :
run randomized SVD as described in Turri et al. [3].
tol (float, default 0) – Convergence tolerance for arpack. If 0, optimal value is chosen automatically by arpack.
max_iter (int or None, default None) – Maximum number of iterations for arpack. If None, optimal value is chosen automatically by arpack.
iterated_power ({'auto'} or int, default 'auto') – Number of iterations for the power method computed by eigen_solver == 'randomized'. When 'auto', it is set to 7 when n_components < 0.1 * min(X.shape), otherwise it is set to 4.
n_oversamples (int, default 5) – Number of oversamples when using a randomized algorithm (eigen_solver == 'randomized').
optimal_sketching (bool, default False) – Sketching strategy for the randomized solver. If True performs optimal sketching (computationally expensive but more accurate).
random_state (int, RandomState instance or None, default None) – Used when eigen_solver is 'arpack' or 'randomized'. Pass an int for reproducible results across multiple function calls.
copy_X (bool, default True) – If True, input X is copied and stored by the model in the X_fit_ attribute. If no further changes will be done to X, setting copy_X=False saves memory by storing a reference.
n_jobs (int or None, default None) – Number of parallel jobs to run. None means 1 unless in a joblib.parallel_backend context. -1 uses all available processors.

Variables:

X_fit (ndarray of shape (n_samples, n_features)) – The data used to fit the model. If copy_X=False, then X_fit_ is a reference to the original data.
y_fit (ndarray of shape (n_samples, ) or None) – The observable used to fit the model. If no observable is provided during fitting, this attribute is None.
gamma (float) – Effective kernel coefficient for RBF, polynomial, and sigmoid kernels. When gamma is explicitly provided, this is the same as gamma. When gamma is None, this is the inferred value.
kernel_X (ndarray of shape (n_samples, n_samples)) – Kernel matrix evaluated at the initial states.
kernel_Y (ndarray of shape (n_samples, n_samples)) – Kernel matrix evaluated at the evolved states.
kernel_YX (ndarray of shape (n_samples, n_samples)) – Cross-kernel matrix between evolved and initial states.
U (ndarray of shape (n_samples, rank)) – Left projection matrix of the operator approximation: \(k(\cdot, X) U V^\top k(\cdot, Y)\) (see Kostic et al. [1]).
V (ndarray of shape (n_samples, rank)) – Right projection matrix of the operator approximation: \(k(\cdot, X) U V^\top k(\cdot, Y)\) (see Kostic et al. [1]).
rank (int) – Effective rank of the fitted estimator.

Examples

>>> from kooplearn.datasets import make_linear_system
>>> from kooplearn.kernel import KernelRidge
>>> import numpy as np
>>>
>>> # Generate a linear system
>>> A = np.array([[0.9, 0.1], [-0.1, 0.9]])
>>> X0 = np.array([1.0, 0.0])
>>> data = make_linear_system(X0, A, n_steps=100, noise=0.1, random_state=42).to_numpy()
>>>
>>> # Fit the model
>>> model = KernelRidge(n_components=2, kernel='linear', alpha=1e-3)
>>> model = model.fit(data)
>>> # Predict the future state
>>> pred = model.predict(data)
>>>
>>> # Get the eigenvalues of the Koopman operator
>>> eigvals = model.eig()

Methods

dynamical_modes(X, observable=False) → DynamicalModes[source]¶

Compute the mode decomposition of arbitrary observables of the evolution operator at the states defined by X. If \((\lambda_i, \xi_i, \psi_i)_{i = 1}^{r}\) are eigentriplets of the evolution operator, for any observable \(f\) the i-th mode of \(f\) at \(x\) is defined as: \(\lambda_i \langle \xi_i, f \rangle \psi_i(x)\). See Kostic et al. [1] for more details.

Parameters:

X (ndarray of shape (n_samples, n_features)) – States at which to evaluate the modes.
observable (bool, default False) – If True, computes the modes of the observable rather than those of the system state.

Returns:

Object containing the eigenvalues, modes, and their projections. See kooplearn.structs.DynamicalModes for details.

Return type:

DynamicalModes

eig(eval_left_on=None, eval_right_on=None)[source]¶

Compute the eigendecomposition of the learned evolution operator.

This method returns the eigenvalues of the estimated evolution operator, and optionally evaluates the corresponding left and/or right eigenfunctions on user-provided data.

Parameters:

eval_left_on (ndarray of shape (n_samples, n_features), optional) – Data points on which to evaluate the left eigenfunctions. If None, left eigenfunctions are not evaluated.
eval_right_on (ndarray of shape (n_samples, n_features), optional) – Data points on which to evaluate the right eigenfunctions. If None, right eigenfunctions are not evaluated.

Returns:

eigenvalues (ndarray of shape (n_components,)) – Eigenvalues of the estimated operator.
left_eigenfunctions (ndarray of shape (n_samples, n_components), optional) – Values of the left eigenfunctions evaluated on eval_left_on. Returned only if eval_left_on is provided.
right_eigenfunctions (ndarray of shape (n_samples, n_components), optional) – Values of the right eigenfunctions evaluated on eval_right_on. Returned only if eval_right_on is provided.

fit(X, y=None)[source]¶

Fit the kernel model to data.

Depending on the model parameters, this method fits the estimator using either a randomized or non-randomized algorithm, and either a full-rank or reduced-rank regression approach.

Parameters:

X (ndarray of shape (n_samples, n_features)) – Training trajectory data.
y (ndarray of shape (n_samples, n_features_out), default None) – Optional observable used for training. If None, the observable is assumed to be the state itself.

Returns:

self – Fitted model instance.

Return type:

object

predict(X, n_steps=1, observable=False) → ndarray[source]¶

Predict the system state or its expected value after n_steps.

Computes the predicted state — or, in the case of a stochastic system, the expected value \(\mathbb{E}[X_t \mid X_0 = X]\) — after t = n_steps time steps given the initial conditions X. If observable=True, returns the corresponding predicted observable instead of the state.

Parameters:

X (ndarray of shape (n_samples, n_features)) – Initial conditions used for prediction.
n_steps (int, default 1) – Number of future time steps to predict. Only the final predicted state (at time t = n_steps) is returned.
observable (bool, default False) – If True, returns the predicted observable at time \(t\) instead of the system state.

Returns:

Predicted (expected) state or observable at time \(t = n_steps\).

Return type:

ndarray of shape (n_samples, n_features)

risk(X=None)[source]¶

Compute the estimator risk.

Parameters:: X (ndarray or None, default None) – Trajectory used to evaluate the risk. If None, evaluates on training data.
Returns:: Risk of the estimator.
Return type:: float

score(X=None, y=None, n_steps=1, observable=False, metric=<function r2_score>, **metric_kws) → float[source]¶

Score the model predictions for timestep n_steps.

Computes the metric (default is sklearn.metrics.r2_score) evaluated between the model predictions at timestep n_steps and the true system state (or observable, if observable=True) at the same timestep.

Parameters:

X – ndarray of shape (n_samples, n_features) or None, default=None Trajectory used to compute the score. If None, evaluates on training data.
y – ndarray of shape (n_samples, n_features) or None, default=None Optional observable used to compute the score.
n_steps – int, default=1 Number of future time steps on which to compute the score. Only the predictions at the final timestep (t = n_steps) are compared to the true system state (or observable).
observable – bool, default=False If True, returns the predicted observable at time \(t\) instead of the system state.
metric – callable (default=r2_score) The metric function used to score the model predictions.
metric_kws – dict Optional parameters to pass to the metric function.

Returns:

float: Metric function value for the model predictions at the next timestep.

Return type:

score

[1] (1,2,3,4)

Vladimir Kostic, Pietro Novelli, Andreas Maurer, Carlo Ciliberto, Lorenzo Rosasco, and Massimiliano Pontil. Learning dynamical systems via koopman operator regression in reproducing kernel hilbert spaces. Advances in Neural Information Processing Systems, 35:4017–4031, 2022.

[2]

Matthew O. Williams, Clarence W. Rowley, and Ioannis G. Kevrekidis. A kernel-based method for data-driven koopman spectral analysis. Journal of Computational Dynamics, 2(2):247–265, 2015. URL: http://dx.doi.org/10.3934/jcd.2015005, doi:10.3934/jcd.2015005.

[3]

Giacomo Turri, Vladimir Kostic, Pietro Novelli, and Massimiliano Pontil. A randomized algorithm to solve reduced rank operator regression. SIAM Journal on Mathematics of Data Science, 2025.