fetch_ordered_mnist
- kooplearn.datasets.fetch_ordered_mnist(*, num_digits: int = 10, data_home: str | PathLike | None = None, cache: bool = True, n_retries: int = 3, delay: float = 1.0)
Fetch the MNIST dataset and return an ordered subset interleaving samples from each digit class.
This function wraps sklearn.datasets.fetch_openml() for the MNIST dataset (OpenML ID 554) and reorders the samples so that digits 0 through num_digits - 1 are interleaved in the output. This is useful for generating class-balanced or periodic sequences for Koopman operator regression experiments.

The MNIST dataset contains 70,000 grayscale images of handwritten digits (60,000 for training and 10,000 for testing), each of size 28×28 pixels.
- Parameters:
num_digits (int, default 10) – Number of digit classes to include, from 1 to 10. For example, num_digits=3 returns only digits 0, 1, and 2.
data_home (str or path-like, optional) – Alternative download and cache folder for the dataset. By default, scikit-learn stores data in ~/scikit_learn_data.
cache (bool, default True) – Whether to cache the downloaded dataset.
n_retries (int, default 3) – Number of times to retry the download if a network error occurs.
delay (float, default 1.0) – Number of seconds to wait between download retries.
- Returns:
images (ndarray of shape (n_samples, 28, 28)) – Array of grayscale MNIST images, with values scaled to the [0, 1] range (float64).
targets (ndarray of shape (n_samples,)) – Corresponding digit labels (integers in [0, num_digits - 1]).
Notes
The dataset is reordered so that classes are interleaved in the returned arrays. For example, with num_digits=3, the ordering will be:
[0, 1, 2, 0, 1, 2, 0, 1, 2, ...]
Examples
>>> import numpy as np
>>> from kooplearn.datasets import fetch_ordered_mnist
>>> images, targets = fetch_ordered_mnist(num_digits=3)
>>> images.shape
(20709, 28, 28)
>>> np.unique(targets)
array([0, 1, 2])
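The interleaving described in the Notes can be illustrated without downloading anything. The following is a minimal sketch on synthetic data, not the library's actual implementation: the helper name interleave_by_class is hypothetical, and truncating every class to the size of the smallest one is an assumption (it is at least consistent with the example shape above, since 20709 = 3 × 6903).

```python
import numpy as np


def interleave_by_class(images, targets, num_digits):
    """Hypothetical helper: reorder samples so labels cycle 0, 1, ..., num_digits - 1."""
    # Indices of the samples of each requested class, in their original order.
    per_class = [np.flatnonzero(targets == d) for d in range(num_digits)]
    # Assumption: truncate every class to the size of the smallest one,
    # so that each cycle contains exactly one sample per class.
    n = min(len(idx) for idx in per_class)
    # Stack to shape (num_digits, n), transpose to (n, num_digits), then
    # flatten row by row: class 0, class 1, ..., class 0, class 1, ...
    order = np.stack([idx[:n] for idx in per_class]).T.ravel()
    return images[order], targets[order]


# Synthetic stand-in for MNIST: 200 random "images" with labels in 0..4.
rng = np.random.default_rng(0)
toy_targets = rng.integers(0, 5, size=200)
toy_images = rng.random((200, 28, 28))

ordered_images, ordered_targets = interleave_by_class(
    toy_images, toy_targets, num_digits=3
)
```

With this ordering, ordered_targets repeats the pattern 0, 1, 2, so consecutive samples form a periodic label sequence of period num_digits.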