duo_ai.algorithms.pyod
======================

.. py:module:: duo_ai.algorithms.pyod


Classes
-------

.. autoapisummary::

   duo_ai.algorithms.pyod.PyODAlgorithmConfig
   duo_ai.algorithms.pyod.PyODAlgorithm


Module Contents
---------------

.. py:class:: PyODAlgorithmConfig

   Configuration dataclass for PyODAlgorithm.

   :param name: Name of the algorithm. Default is "pyod".
   :type name: str, optional
   :param num_rollouts: Number of rollouts to use for data generation. Default is 128.
   :type num_rollouts: int, optional
   :param percentiles: List of percentiles to try during threshold selection. Default is 0, 10, ..., 100 (``range(0, 101, 10)``).
   :type percentiles: list of float, optional
   :param explore_temps: List of temperatures to use during exploration rollouts. Default is [1.0].
   :type explore_temps: list of float, optional
   :param accept_rate: Acceptance rate for sampling data during rollouts. Default is 0.05.
   :type accept_rate: float, optional
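
   The defaults above could be declared along the following lines. This is a
   minimal sketch using a hypothetical stand-in dataclass, ``SketchConfig``,
   that mirrors the documented fields; it is not the library's own class.
   List-valued defaults are assumed to use ``default_factory``, since
   dataclasses reject mutable defaults such as a bare list.

   .. code-block:: python

      from dataclasses import dataclass, field
      from typing import List


      @dataclass
      class SketchConfig:
          """Hypothetical stand-in mirroring the documented config fields."""

          name: str = "pyod"
          num_rollouts: int = 128
          # Each instance gets its own list: 0, 10, ..., 100.
          percentiles: List[float] = field(
              default_factory=lambda: [float(p) for p in range(0, 101, 10)]
          )
          explore_temps: List[float] = field(default_factory=lambda: [1.0])
          accept_rate: float = 0.05


      # Use all defaults, or override selected fields by keyword.
      default = SketchConfig()
      custom = SketchConfig(
          num_rollouts=32,
          percentiles=[50.0, 90.0, 99.0],
          accept_rate=0.1,
      )

   Overriding ``percentiles`` with a shorter list narrows the set of candidate
   thresholds, which should shorten the search at the cost of a coarser result.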


   .. py:attribute:: name
      :type:  str
      :value: 'pyod'


   .. py:attribute:: num_rollouts
      :type:  int
      :value: 128


   .. py:attribute:: percentiles
      :type:  List[float]


   .. py:attribute:: explore_temps
      :type:  List[float]
      :value: [1.0]


   .. py:attribute:: accept_rate
      :type:  float
      :value: 0.05


.. py:class:: PyODAlgorithm(config: PyODAlgorithmConfig)

   Bases: :py:obj:`duo_ai.core.Algorithm`


   Algorithm for out-of-distribution (OOD) detection using PyOD models.

   .. rubric:: Examples

   >>> algo = PyODAlgorithm(PyODAlgorithmConfig())


   .. py:attribute:: config_cls


   .. py:attribute:: config


   .. py:attribute:: random


   .. py:method:: train(policy: duo.policies.PPOPolicy, env: gym.Env, validators: Dict[str, duo.core.Evaluator]) -> None

      Train the PyODAlgorithm by searching for the threshold that maximizes
      evaluation reward.

      :param policy: The policy to be evaluated and tuned.
      :type policy: duo.policies.PPOPolicy
      :param env: The environment instance for training and data generation.
      :type env: gym.Env
      :param validators: Dictionary mapping split names to evaluator instances for evaluation.
      :type validators: dict of str to duo.core.Evaluator

      :rtype: None

      .. rubric:: Examples

      >>> algorithm = PyODAlgorithm(PyODAlgorithmConfig())
      >>> algorithm.train(policy, env, validators)
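
      The threshold search described above can be sketched as follows. The
      sketch assumes that candidate thresholds are the configured percentiles
      of OOD scores computed on rollout data, and that each candidate is scored
      by an evaluation callback. The names ``percentile``,
      ``select_threshold``, and the toy ``evaluate`` function are hypothetical
      and not part of the library.

      .. code-block:: python

         import math
         from typing import Callable, Iterable, List, Tuple


         def percentile(ordered: List[float], q: float) -> float:
             """Linearly interpolated q-th percentile of a sorted list."""
             pos = (len(ordered) - 1) * q / 100
             lo = int(math.floor(pos))
             hi = min(lo + 1, len(ordered) - 1)
             frac = pos - lo
             return ordered[lo] * (1 - frac) + ordered[hi] * frac


         def select_threshold(
             scores: Iterable[float],
             percentiles: Iterable[float],
             evaluate: Callable[[float], float],
         ) -> Tuple[float, float]:
             """Return the candidate threshold with the highest reward."""
             ordered = sorted(scores)
             best_threshold, best_reward = float("nan"), -math.inf
             for q in percentiles:
                 threshold = percentile(ordered, q)
                 reward = evaluate(threshold)  # assumed: higher is better
                 if reward > best_reward:
                     best_threshold, best_reward = threshold, reward
             return best_threshold, best_reward


         # Toy data: OOD scores 0..10 and a reward that peaks at threshold 7.
         scores = [float(i) for i in range(11)]
         best_threshold, best_reward = select_threshold(
             scores,
             range(0, 101, 10),
             evaluate=lambda t: -((t - 7.0) ** 2),
         )

      In this toy setup the 70th percentile of the scores is the candidate that
      wins. In practice the callback would presumably run one of the
      ``validators`` with the candidate threshold applied.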


   .. py:method:: save_checkpoint(policy: duo.policies.PPOPolicy, name: str) -> None

      Save the current policy configuration and parameters to a checkpoint file.

      :param policy: The policy whose parameters are to be saved.
      :type policy: duo.policies.PPOPolicy
      :param name: Name for the checkpoint file.
      :type name: str

      :rtype: None

      .. rubric:: Examples

      >>> algorithm = PyODAlgorithm(PyODAlgorithmConfig())
      >>> algorithm.save_checkpoint(policy, "best_test")


   .. py:method:: _generate_data(env: gym.Env, policy: duo.policies.PPOPolicy, temperature: float, num_rollouts: int, accept_rate: float) -> dict

      Generate data for OOD detection by rolling out the policy in the environment.

      :param env: The environment used for rollouts.
      :type env: gym.Env
      :param policy: The policy to be evaluated.
      :type policy: duo.policies.PPOPolicy
      :param temperature: Temperature parameter for action selection.
      :type temperature: float
      :param num_rollouts: Total number of rollout episodes to generate.
      :type num_rollouts: int
      :param accept_rate: Acceptance rate for sampling data during rollouts.
      :type accept_rate: float

      :returns: **data** -- Dictionary containing collected data arrays for each feature.
      :rtype: dict

      .. rubric:: Examples

      >>> algorithm = PyODAlgorithm(PyODAlgorithmConfig())
      >>> data = algorithm._generate_data(env, policy, 1.0, 128, 0.05)
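
      How ``temperature`` and ``accept_rate`` might interact during data
      collection can be sketched as follows. The sketch assumes that actions
      are sampled from a temperature-scaled softmax and that each step is kept
      independently with probability ``accept_rate``. The toy policy, the
      one-dimensional state, and the names ``sample_action`` and
      ``generate_data`` are hypothetical stand-ins for the real environment and
      policy, and the sketch returns plain lists rather than arrays.

      .. code-block:: python

         import math
         import random
         from typing import Dict, List


         def sample_action(logits, temperature, rng):
             """Sample an index from softmax(logits / temperature)."""
             scaled = [x / temperature for x in logits]
             top = max(scaled)  # subtract the max for numerical stability
             weights = [math.exp(s - top) for s in scaled]
             return rng.choices(range(len(logits)), weights=weights, k=1)[0]


         def generate_data(
             num_rollouts: int,
             episode_len: int,
             temperature: float,
             accept_rate: float,
             seed: int = 0,
         ) -> Dict[str, List[float]]:
             """Roll out a toy policy and keep a random subset of steps."""
             rng = random.Random(seed)
             data: Dict[str, List[float]] = {"obs": [], "action": []}
             for _ in range(num_rollouts):
                 obs = 0.0  # toy one-dimensional state
                 for _ in range(episode_len):
                     logits = [obs, -obs]  # toy policy over two actions
                     action = sample_action(logits, temperature, rng)
                     # Keep this step with probability accept_rate.
                     if rng.random() < accept_rate:
                         data["obs"].append(obs)
                         data["action"].append(action)
                     obs += 1.0 if action == 0 else -1.0
             return data


         data = generate_data(
             num_rollouts=128, episode_len=20, temperature=1.0, accept_rate=0.05
         )

      Under these assumptions, a low ``accept_rate`` thins out consecutive,
      highly correlated steps, while a higher ``temperature`` flattens the
      action distribution and broadens the states visited.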