:mod:`torchfilter.data`
=======================

.. py:module:: torchfilter.data

.. autoapi-nested-parse::

   Dataset utilities for learning & evaluating state estimators in PyTorch.


Package Contents
----------------

Classes
~~~~~~~

.. autoapisummary::

   torchfilter.data.ParticleFilterMeasurementDataset
   torchfilter.data.SingleStepDataset
   torchfilter.data.SubsequenceDataset


Functions
~~~~~~~~~

.. autoapisummary::

   torchfilter.data.split_trajectories


.. py:class:: ParticleFilterMeasurementDataset(trajectories: List[types.TrajectoryNumpy], *, covariance: np.ndarray, samples_per_pair: int, **kwargs)

   Bases: :class:`torch.utils.data.Dataset`

   .. autoapi-inheritance-diagram:: torchfilter.data.ParticleFilterMeasurementDataset
      :parts: 1

   A dataset interface for pre-training particle filter measurement models.

   Centers Gaussian distributions around our ground-truth states, and provides examples
   for learning the log-likelihood.

   :param trajectories: List of trajectories.
   :type trajectories: List[torchfilter.types.TrajectoryNumpy]

   :keyword covariance: Covariance of Gaussian PDFs.
   :kwtype covariance: np.ndarray
   :keyword samples_per_pair: Number of training examples to provide for each
                              state/observation pair. Half of these will typically be generated close
                              to the example, and the other half far away.
   :kwtype samples_per_pair: int
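
   The idea above can be sketched with NumPy alone: perturb a ground-truth state,
   then label the perturbed state with its Gaussian log-density. This is only an
   illustration under stated assumptions, not the library's implementation. The
   helper names ``gaussian_log_pdf`` and ``make_example`` and the ``far_scale``
   parameter are hypothetical and are not part of ``torchfilter``.

   .. code-block:: python

      import numpy as np

      def gaussian_log_pdf(x, mean, covariance):
          """Log-density of a multivariate Gaussian evaluated at x."""
          dim = mean.shape[0]
          diff = x - mean
          # Solve a linear system instead of inverting the covariance explicitly.
          mahalanobis = diff @ np.linalg.solve(covariance, diff)
          _, log_det = np.linalg.slogdet(covariance)
          return -0.5 * (dim * np.log(2.0 * np.pi) + log_det + mahalanobis)

      def make_example(true_state, covariance, rng, far_scale=1.0):
          """Perturb a ground-truth state and label it with its log-likelihood.

          far_scale is a hypothetical knob: values above 1 draw samples that
          tend to land further from the ground-truth state.
          """
          noise = rng.multivariate_normal(
              np.zeros(len(true_state)), covariance * far_scale**2
          )
          sampled_state = true_state + noise
          # The label is always computed under the unscaled covariance.
          log_likelihood = gaussian_log_pdf(sampled_state, true_state, covariance)
          return sampled_state, log_likelihood

      rng = np.random.default_rng(0)
      true_state = np.array([1.0, -2.0])
      covariance = np.diag([0.1, 0.2])
      near_state, near_ll = make_example(true_state, covariance, rng)
      far_state, far_ll = make_example(true_state, covariance, rng, far_scale=5.0)

   Drawing some samples with a larger spread is one plausible way to obtain the
   "close" and "far away" halves mentioned for ``samples_per_pair``.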

   .. method:: __getitem__(self, index) -> Tuple[types.StatesNumpy, types.ObservationsNumpy, np.ndarray]

      Get a state/observation/log-likelihood sample from our dataset. Nominally, we
      want our measurement model to predict the returned log-likelihood, i.e. the
      log of the PDF of the ``p(observation | state)`` distribution.

      :param index: Sample number in our dataset.
      :type index: int

      :returns: *tuple* -- ``(state, observation, log-likelihood)`` tuple.


   .. method:: __len__(self) -> int

      Total number of samples in the dataset.

      :returns: *int* -- Length of dataset.


.. py:class:: SingleStepDataset(trajectories: List[types.TrajectoryNumpy])

   Bases: :class:`torch.utils.data.Dataset`

   .. autoapi-inheritance-diagram:: torchfilter.data.SingleStepDataset
      :parts: 1

   A dataset interface that returns single-step training examples:
   ``(previous_state, state, observation, control)``

   By default, extracts these examples from a list of trajectories.

   :param trajectories: List of trajectories.
   :type trajectories: List[torchfilter.types.TrajectoryNumpy]
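
   A minimal sketch of how such single-step examples could be extracted from one
   trajectory is shown below. It assumes plain arrays of shape ``(T, ...)`` and
   that the observation and control are taken at the same timestep as ``state``.
   The helper ``single_step_examples`` is hypothetical and is not part of
   ``torchfilter``.

   .. code-block:: python

      import numpy as np

      def single_step_examples(states, observations, controls):
          """Build (previous_state, state, observation, control) tuples.

          Assumes all inputs are arrays of shape (T, ...), and pairs each state
          with the observation and control from the same timestep.
          """
          examples = []
          for t in range(1, len(states)):
              examples.append(
                  (states[t - 1], states[t], observations[t], controls[t])
              )
          return examples

      T = 5
      states = np.arange(T * 2, dtype=float).reshape(T, 2)
      observations = np.arange(T * 3, dtype=float).reshape(T, 3)
      controls = np.zeros((T, 1))

      examples = single_step_examples(states, observations, controls)
      previous_state, state, observation, control = examples[0]

   Under these assumptions, a trajectory of ``T`` timesteps yields ``T - 1``
   examples, since the first timestep has no previous state.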

   .. method:: __getitem__(self, index: int) -> Tuple[types.StatesNumpy, types.StatesNumpy, types.ObservationsNumpy, types.ControlsNumpy]

      Get a single-step prediction sample from our dataset.

      :param index: Sample number in our dataset.
      :type index: int

      :returns: *tuple* -- ``(previous_state, state, observation, control)`` tuple that
                contains data for a single step. Each tuple member should be either a
                numpy array or dict of numpy arrays with shape ``(...)``.


   .. method:: __len__(self) -> int

      Total number of samples in the dataset.

      :returns: *int* -- Length of dataset.


.. function:: split_trajectories(trajectories: List[types.TrajectoryNumpy], subsequence_length: int) -> List[types.TrajectoryNumpy]

   Helper for splitting a list of trajectories into a list of overlapping
   subsequences.

   For each trajectory, assuming a subsequence length of 10, this function
   includes in its output overlapping subsequences corresponding to
   timesteps...

   .. code-block::

          [0:10], [10:20], [20:30], ...

   as well as...

   .. code-block::

          [5:15], [15:25], [25:35], ...

   :param trajectories: List of trajectories.
   :type trajectories: List[torchfilter.types.TrajectoryNumpy]
   :param subsequence_length: Number of timesteps per subsequence.
   :type subsequence_length: int

   :returns: *List[torchfilter.types.TrajectoryNumpy]* -- List of subsequences.
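
   The two-pass windowing described above can be illustrated by computing only the
   ``(start, end)`` index pairs. This sketch assumes that the second pass is
   shifted by half a subsequence and that trailing windows shorter than
   ``subsequence_length`` are dropped. The helper ``subsequence_bounds`` is
   hypothetical and is not part of ``torchfilter``.

   .. code-block:: python

      def subsequence_bounds(trajectory_length, subsequence_length):
          """Return (start, end) index pairs for overlapping subsequences."""
          bounds = []
          # Two passes: one starting at 0, one shifted by half a subsequence.
          for offset in (0, subsequence_length // 2):
              start = offset
              # Keep only windows that fit entirely inside the trajectory.
              while start + subsequence_length <= trajectory_length:
                  bounds.append((start, start + subsequence_length))
                  start += subsequence_length
          return bounds

      bounds = subsequence_bounds(trajectory_length=40, subsequence_length=10)
      # bounds == [(0, 10), (10, 20), (20, 30), (30, 40),
      #            (5, 15), (15, 25), (25, 35)]

   Each timestep away from the trajectory ends then appears in two subsequences,
   once per pass.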


.. py:class:: SubsequenceDataset(trajectories: List[types.TrajectoryNumpy], subsequence_length: int)

   Bases: :class:`torch.utils.data.Dataset`

   .. autoapi-inheritance-diagram:: torchfilter.data.SubsequenceDataset
      :parts: 1

   A data preprocessor for producing training subsequences from
   a list of trajectories.

   Thin wrapper around ``torchfilter.data.split_trajectories()``.

   :param trajectories: List of trajectories, where each is a tuple of
                        ``(states, observations, controls)``. Each tuple member should be
                        either a numpy array or dict of numpy arrays with shape ``(T, ...)``.
   :type trajectories: List[torchfilter.types.TrajectoryNumpy]
   :param subsequence_length: Number of timesteps per subsequence.
   :type subsequence_length: int
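
   Because each tuple member may be either an array or a dict of arrays, any
   slicing along the time axis has to handle both cases. The sketch below shows
   one way to do that; the helper ``slice_member`` and the dict keys ``"image"``
   and ``"imu"`` are hypothetical and are not part of ``torchfilter``.

   .. code-block:: python

      import numpy as np

      def slice_member(member, start, end):
          """Slice timesteps [start:end] from an array or a dict of arrays."""
          if isinstance(member, dict):
              return {key: value[start:end] for key, value in member.items()}
          return member[start:end]

      T = 20
      states = np.zeros((T, 4))
      observations = {"image": np.zeros((T, 8, 8)), "imu": np.zeros((T, 6))}
      controls = np.zeros((T, 2))
      trajectory = (states, observations, controls)

      # Extract the first 10 timesteps from every member of the trajectory.
      subsequence = tuple(slice_member(member, 0, 10) for member in trajectory)

   Every member of the resulting tuple has a leading dimension of 10, matching
   the ``(subsequence_length, ...)`` shape described for ``__getitem__``.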

   .. method:: __getitem__(self, index: int) -> types.TrajectoryNumpy

      Get a subsequence from our dataset.

      :param index: Subsequence number in our dataset.
      :type index: int

      :returns: *tuple* -- ``(states, observations, controls)`` tuple that contains
                data for a single subsequence. Each tuple member should be either a
                numpy array or dict of numpy arrays with shape
                ``(subsequence_length, ...)``.


   .. method:: __len__(self) -> int

      Total number of subsequences in the dataset.

      :returns: *int* -- Length of dataset.