GeneratorDirichlet

class kooplearn.kernel.GeneratorDirichlet(diffusion, n_components=None, *, gamma=None, alpha=1e-06, n_jobs=1, shift=1.0)

Bases: BaseEstimator

Kernel-based estimator for the infinitesimal generator of diffusion processes using the Dirichlet-form method.

This class approximates the infinitesimal generator $\mathcal{L}$ of a diffusion process by embedding data into a reproducing kernel Hilbert space (RKHS). The generator is learned from samples of the invariant distribution, without explicit knowledge of the drift or diffusion coefficients.

The estimator is designed for data generated by a general diffusion process of the form

\[\mathrm{d} X_t = a(X_t)\, \mathrm{d} t + b(X_t)\, \mathrm{d} W_t,\]

where $a : \mathbb{R}^d \to \mathbb{R}^d$ is the drift field, $b : \mathbb{R}^d \to \mathbb{R}^{d \times m}$ is the diffusion coefficient, and $W_t$ denotes an $m$-dimensional standard Brownian motion.
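
As a minimal sketch of what such data can look like, the following simulates a one-dimensional instance of the SDE above with the Euler-Maruyama scheme. The drift `a(x) = -x`, the constant coefficient `b = sqrt(2)` (an Ornstein-Uhlenbeck process), and the helper name `euler_maruyama` are illustrative assumptions and not part of kooplearn.

```python
import math
import random


def euler_maruyama(a, b, x0, dt, n_steps, seed=0):
    """Simulate dX = a(X) dt + b(X) dW in one dimension (illustrative helper)."""
    rng = random.Random(seed)
    x = float(x0)
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + a(x) * dt + b(x) * dw
        path.append(x)
    return path


def drift(x):
    return -x  # pulls the state back towards 0


def diffusion_coeff(x):
    return math.sqrt(2.0)  # constant noise amplitude


path = euler_maruyama(drift, diffusion_coeff, x0=0.0, dt=0.01, n_steps=1000)
```

The returned list holds the initial state followed by one state per time step.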

Assuming sufficient smoothness, the corresponding infinitesimal generator acts on smooth test functions as

\[(\mathcal{L} f)(x) = \nabla f(x)^\top a(x) + \frac{1}{2}\, \operatorname{Tr}\!\left[ b(x)^\top (\nabla^2 f(x))\, b(x) \right],\]

where $\nabla^2 f(x)$ denotes the Hessian of $f$.
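
To make the formula concrete, here is a small one-dimensional check, where the generator reduces to $(\mathcal{L} f)(x) = a(x) f'(x) + \frac{1}{2} b(x)^2 f''(x)$. Derivatives are approximated by central finite differences. The Ornstein-Uhlenbeck coefficients and the helper `apply_generator` are assumptions chosen for illustration, not library code.

```python
import math


def apply_generator(f, a, b, x, h=1e-3):
    """Approximate (L f)(x) = a f' + 0.5 b^2 f'' with central differences (1D)."""
    d1 = (f(x + h) - f(x - h)) / (2.0 * h)
    d2 = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return a(x) * d1 + 0.5 * b(x) ** 2 * d2


def a(x):
    return -x  # drift


def b(x):
    return math.sqrt(2.0)  # diffusion coefficient


def f1(x):
    return x  # satisfies L f1 = -1 * f1 for this process


def f2(x):
    return x * x - 1.0  # satisfies L f2 = -2 * f2 for this process


x = 0.7
lf1 = apply_generator(f1, a, b, x)
lf2 = apply_generator(f2, a, b, x)
```

For this choice of coefficients, `f1` and `f2` are eigenfunctions of the generator, so `lf1` and `lf2` should match `-f1(x)` and `-2 * f2(x)` up to discretization error.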

The generator governs the evolution of observables through the backward Kolmogorov equation

\[\frac{\partial}{\partial t} u(x,t) = \mathcal{L} u(x,t), \qquad u(x,0) = f(x),\]

with $u(x,t) = \mathbb{E}[f(X_t) \mid X_0 = x]$.
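
As a worked example (an illustration, not taken from the library), consider the one-dimensional Ornstein-Uhlenbeck process with $a(x) = -\theta x$ and constant $b$, and the observable $f(x) = x$. The conditional mean is $u(x,t) = x e^{-\theta t}$, and both sides of the backward Kolmogorov equation agree:

\[\frac{\partial}{\partial t} u(x,t) = -\theta x e^{-\theta t}, \qquad \mathcal{L} u(x,t) = \frac{\partial u}{\partial x}(x,t)\, a(x) + \frac{1}{2}\, b^2\, \frac{\partial^2 u}{\partial x^2}(x,t) = e^{-\theta t} (-\theta x) + 0 = -\theta x e^{-\theta t},\]

with initial condition $u(x,0) = x = f(x)$.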

The estimator constructs a kernelized approximation of the resolvent of the generator $( \mu I - \mathcal{L})^{-1}$ using first- and second-order derivatives of the kernel and solves a regularized variational problem to recover generator eigenvalues and eigenfunctions.
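
The link between the resolvent and the generator spectrum can be sketched as follows: if $\mathcal{L}\psi = \lambda\psi$, then $(\mu I - \mathcal{L})^{-1}\psi = (\mu - \lambda)^{-1}\psi$, so a resolvent eigenvalue $\nu$ maps back to $\lambda = \mu - 1/\nu$. The snippet below checks this identity on made-up eigenvalues. Whether the library applies exactly this conversion internally is an assumption.

```python
mu = 1.0                              # positive shift
generator_eigs = [0.0, -0.5, -2.0]    # hypothetical generator eigenvalues

# Eigenvalues of the resolvent (mu*I - L)^{-1} on the same eigenfunctions.
resolvent_eigs = [1.0 / (mu - lam) for lam in generator_eigs]

# Invert the map to recover the generator eigenvalues.
recovered = [mu - 1.0 / nu for nu in resolvent_eigs]
```

Because the generator eigenvalues are non-positive here and the shift is positive, every `mu - lam` is strictly positive and the resolvent eigenvalues are well defined.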

Parameters:

diffusion (float or ndarray) –
Diffusion specification for the SDE. The input is interpreted as the diffusion matrix directly (no multiplication to form $D = b b^\top$). The allowed formats are:

1. Scalar diffusion: if a float is provided, the diffusion is assumed isotropic, i.e. the same in all dimensions.

2. Constant diffusion: if an array of shape (n_features,) is provided, the diffusion is taken to be constant, i.e. independent of the state.

3. State-dependent diffusion: if an array of shape (n_samples, n_features) is provided, it is interpreted as a state-dependent diffusion.
n_components (int or None, optional) – Number of generator eigenmodes to retain. If None, all components are kept.
gamma (float, optional) – RBF kernel scale parameter. If None, defaults to 1 / n_features.
alpha (float or None, default 1e-6) – Tikhonov regularization parameter for the variational problem. If None, a specialized unregularized solver is used.
n_jobs (int, default 1) – Number of parallel workers for kernel computation.
shift (float, default 1.0) – Positive shift $\mu$ used to define the resolvent operator $(\mu I - \mathcal{L})^{-1}$.
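
The three accepted `diffusion` formats can be illustrated with plain NumPy values. The helper `as_state_dependent` below is hypothetical: it only shows how each format could be viewed as an array of shape (n_samples, n_features), and does not claim to reproduce the estimator's internal handling.

```python
import numpy as np


def as_state_dependent(diffusion, n_samples, n_features):
    """Broadcast any of the three formats to shape (n_samples, n_features)."""
    d = np.asarray(diffusion, dtype=float)
    if d.ndim == 0:  # 1. scalar: isotropic diffusion
        return np.full((n_samples, n_features), float(d))
    if d.shape == (n_features,):  # 2. constant diffusion
        return np.tile(d, (n_samples, 1))
    if d.shape == (n_samples, n_features):  # 3. state-dependent diffusion
        return d
    raise ValueError(f"Unsupported diffusion shape: {d.shape}")


n_samples, n_features = 5, 2
scalar = as_state_dependent(0.5, n_samples, n_features)
constant = as_state_dependent([0.5, 2.0], n_samples, n_features)
varying = as_state_dependent(np.ones((n_samples, n_features)), n_samples, n_features)
```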

Variables:

X_fit (ndarray of shape (n_samples, n_features)) – Training data used for fitting.
gamma (float) – Effective kernel parameter.
kernel_X (ndarray of shape (n_samples, n_samples)) – Kernel Gram matrix.
eigresults (dict) –
Result of the eigendecomposition step. Contains entries:
- "values" (ndarray of shape (r,)) – Generator eigenvalues.
- "left" (ndarray of shape (n_samples, r)) – Left eigenfunctions evaluated on the data.
- "right" (ndarray) – Right eigenfunctions represented in RKHS coordinates.
rank (int) – Number of retained eigenmodes.

Notes

This implementation provides a kernel-based estimator of the infinitesimal generator from equilibrium samples of the overdamped Langevin dynamics.

Attention

Currently, only the RBF kernel is supported.
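
For reference, a Gram matrix for the RBF kernel can be sketched as below, assuming the common parameterization $k(x, y) = \exp(-\gamma \|x - y\|^2)$ and a default of `gamma = 1 / n_features` when `gamma` is None. The function `rbf_gram` is illustrative, and the exact convention used inside the estimator is an assumption.

```python
import numpy as np


def rbf_gram(X, gamma=None):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) (illustrative helper)."""
    X = np.asarray(X, dtype=float)
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # default scale: 1 / n_features
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip small negative rounding noise
    return np.exp(-gamma * sq_dists)


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
K = rbf_gram(X)
```

The resulting matrix is symmetric with ones on the diagonal, as expected for a Gaussian kernel.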

Examples

>>> import numpy as np
>>> from kooplearn.kernel import GeneratorDirichlet
>>> from kooplearn.datasets import make_prinz_potential
>>> X = make_prinz_potential(X0=0, n_steps=500, gamma=1.0, sigma=2.0)
>>> model = GeneratorDirichlet(
...     diffusion=1.0,
...     n_components=4,
...     gamma=1.0,
... )
>>> model = model.fit(X)
>>> eigvals = model.eig()
>>> f_pred = model.predict(X, t=1.0)

Methods

dynamical_modes(X, observable=False) → DynamicalModes

Compute the dynamical mode decomposition of an observable.

For an observable $f$, its expansion in generator modes is:

\[f(x) = \sum_{i=1}^r \langle \xi_i, f \rangle \, \psi_i(x),\]

where $\xi_i$ and $\psi_i$ are left and right eigenfunctions. Time evolution under the semigroup $e^{t\mathcal{L}}$ acts as:

\[f_t(x) = \sum_i e^{t \lambda_i} \langle \xi_i, f \rangle \psi_i(x).\]
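
The expansion can be illustrated on a finite-state stand-in, where the generator is a small rate matrix and the eigenfunctions are eigenvectors. This is only a sketch of the two formulas above with made-up numbers: it does not use the estimator, and the inner product is the plain Euclidean one.

```python
import numpy as np

# Toy generator: rate matrix of a 3-state continuous-time Markov chain.
L = np.array([[-1.0, 1.0, 0.0],
              [1.0, -2.0, 1.0],
              [0.0, 1.0, -1.0]])

eigvals, psi = np.linalg.eig(L)   # columns of psi: right eigenvectors
xi = np.linalg.inv(psi)           # rows of xi: matching left eigenvectors

f = np.array([1.0, 0.0, -1.0])    # observable evaluated on the 3 states
coeffs = xi @ f                   # mode coefficients <xi_i, f>


def evolve(t):
    """f_t = sum_i exp(t * lambda_i) <xi_i, f> psi_i."""
    return psi @ (np.exp(t * eigvals) * coeffs)


f_0 = evolve(0.0)  # reproduces f
f_1 = evolve(1.0)  # f propagated to time t = 1
```

Here `f` happens to be an eigenvector of `L` with eigenvalue -1, so `evolve(t)` should equal `exp(-t) * f`.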

Parameters:

X (ndarray) – Points at which the right eigenfunctions will be evaluated.
observable (bool, default False) – If True, returns the predicted observable at time $t$ instead of the system state.

Returns:

Structured object containing:
- eigenvalues $e^{\lambda_i}$,
- mode coefficients,
- conditioning factors for evolution.

Return type:

DynamicalModes

eig(eval_left_on=None, eval_right_on=None)

Compute the eigenvalues of the estimated generator and, optionally, evaluate its left and right eigenfunctions.

Parameters:

eval_left_on (ndarray of shape (n_samples, n_features) or None, default None) – Points at which the left eigenfunctions are evaluated. If None, they are not evaluated.
eval_right_on (ndarray of shape (n_samples, n_features) or None, default None) – Points at which the right eigenfunctions are evaluated. If None, they are not evaluated.

Returns:

Generator eigenvalues, together with the left and/or right eigenfunctions evaluated on the requested points when eval_left_on or eval_right_on is given.

Return type:

ndarray, or tuple of ndarray

fit(X, y=None)

Fit the Dirichlet-form kernel model to trajectory data.

This computes:

- The Gram matrix K,
- Its first and second kernel derivatives,
- An approximation of the generator via reduced-rank regression,
- Its eigenvalues and eigenfunctions.

Parameters:

X (ndarray of shape (n_samples, n_features)) – Training states sampled from a diffusion process.
y (ndarray of shape (n_samples, n_features_out), default None) – Optional observable used for training. If None, the observable is assumed to be the state itself.

Returns:

self – Fitted estimator.

Return type:

GeneratorDirichlet

predict(X, t, observable=False) → ndarray

Predict the expected observable value at time $t$, conditional on the initial condition X.

This computes:

\[\mathbb{E}[f(X_t) \mid X_0 = X],\]

using the spectral representation of the generator and the Koopman semigroup $e^{t \mathcal{L}}$.

Parameters:

X (ndarray of shape (n_samples, n_features)) – Evaluation points.
t (float) – Time horizon for the Koopman propagation.
observable (bool, default False) – If True, returns the predicted observable at time $t$ instead of the system state.

Returns:

Predicted observable value $\mathbb{E}[f(X_t) \mid X_0 = X]$, evaluated at each row of X.

Return type:

ndarray of shape (n_samples, n_dim)