duo_ai.policies.pyod
====================

.. py:module:: duo_ai.policies.pyod


Classes
-------

.. autoapisummary::

   duo_ai.policies.pyod.PyODPolicyConfig
   duo_ai.policies.pyod.PyODPolicy


Module Contents
---------------

.. py:class:: PyODPolicyConfig

   Configuration dataclass for PyODPolicy.

   :param name: Name of the policy class. Default is "pyod".
   :type name: str, optional
   :param method: PyOD method to use, written as a dotted ``module.ClassName`` string. Default is "deepsvdd.DeepSVDD".
   :type method: str, optional
   :param feature_type: Type of feature representation to use. Default is "hidden".
   :type feature_type: str, optional
   :param pyod_config: Additional configuration for the PyOD model. Default is None.
   :type pyod_config: dict, optional
   :param load_path: Path to a checkpoint to load. Default is None.
   :type load_path: str, optional

   .. rubric:: Examples

   >>> config = PyODPolicyConfig(method="deepsvdd.DeepSVDD", feature_type="hidden")
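
   The fields and defaults listed above can be mirrored in a standalone dataclass, which is handy for trying out configuration values without importing ``duo_ai``. This is a minimal sketch, not the library's source: ``PyODPolicyConfigSketch`` is a hypothetical stand-in, and the ``contamination`` key is only an illustrative guess at what ``pyod_config`` might carry.

   .. code-block:: python

      from dataclasses import asdict, dataclass
      from typing import Any, Dict, Optional


      @dataclass
      class PyODPolicyConfigSketch:
          """Hypothetical stand-in mirroring the documented fields and defaults."""

          name: str = "pyod"
          method: str = "deepsvdd.DeepSVDD"
          feature_type: str = "hidden"
          pyod_config: Optional[Dict[str, Any]] = None
          load_path: Optional[str] = None


      # Override one field and keep the documented defaults for the rest.
      config = PyODPolicyConfigSketch(pyod_config={"contamination": 0.05})

      # A plain dict view is convenient for logging or serialization.
      config_dict = asdict(config)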


   .. py:attribute:: name
      :type:  str
      :value: 'pyod'


   .. py:attribute:: method
      :type:  str
      :value: 'deepsvdd.DeepSVDD'


   .. py:attribute:: feature_type
      :type:  str
      :value: 'hidden'


   .. py:attribute:: pyod_config
      :type:  Optional[Dict[str, Any]]
      :value: None


   .. py:attribute:: load_path
      :type:  Optional[str]
      :value: None


.. py:class:: PyODPolicy(config: PyODPolicyConfig, env: gym.Env)

   Bases: :py:obj:`duo_ai.core.policy.Policy`


   Policy that uses a PyOD outlier detector to select actions based on out-of-distribution (OOD) scores.

   .. rubric:: Examples

   >>> policy = PyODPolicy(PyODPolicyConfig(), env)
   >>> obs = ...
   >>> action = policy.act(obs)


   .. py:attribute:: config_cls


   .. py:attribute:: config


   .. py:attribute:: threshold
      :value: None


   .. py:attribute:: device


   .. py:attribute:: clf


   .. py:attribute:: feature_type


   .. py:attribute:: EXPERT


   .. py:method:: _get_pyod_class(config: PyODPolicyConfig) -> type

      Dynamically import and return the PyOD class specified in the config.

      :param config: Configuration object for the policy.
      :type config: PyODPolicyConfig

      :returns: The PyOD class to instantiate.
      :rtype: type

      :raises ImportError: If the specified class cannot be imported.

      .. rubric:: Examples

      >>> cls = policy._get_pyod_class(config)
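
      Resolving a dotted string into a class is commonly done with ``importlib``. The sketch below assumes the string has the form ``module.ClassName`` and is looked up relative to a base package supplied by the caller; which base package ``_get_pyod_class`` actually uses is not stated here, so ``resolve_class`` and its ``base_package`` argument are hypothetical. The demo call resolves a standard-library class so that it runs without PyOD installed.

      .. code-block:: python

         import importlib


         def resolve_class(method: str, base_package: str) -> type:
             """Hypothetical helper: import ``base_package.module`` and return ``ClassName``."""
             module_name, _, class_name = method.rpartition(".")
             if not module_name:
                 raise ImportError(f"expected 'module.ClassName', got {method!r}")
             # import_module raises ModuleNotFoundError (an ImportError) if the module is missing.
             module = importlib.import_module(f"{base_package}.{module_name}")
             try:
                 return getattr(module, class_name)
             except AttributeError as exc:
                 raise ImportError(
                     f"{class_name!r} not found in {module.__name__!r}"
                 ) from exc


         # Demonstration with the standard library instead of PyOD.
         decoder_cls = resolve_class("decoder.JSONDecoder", base_package="json")

      Converting a missing attribute into ``ImportError`` keeps a single failure type for callers, matching the ``ImportError`` documented above.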


   .. py:method:: reset(done: numpy.ndarray) -> None

      Reset the policy state at episode boundaries.

      :param done: Boolean array indicating which episodes in a batch require a reset.
      :type done: numpy.ndarray

      :rtype: None

      .. rubric:: Examples

      >>> policy.reset(done)


   .. py:method:: _make_input(obs: Dict[str, Any]) -> numpy.ndarray

      Construct the input feature array for the PyOD model from the observation.

      :param obs: Observation dictionary containing required features.
      :type obs: dict

      :returns: Concatenated feature array for the PyOD model.
      :rtype: numpy.ndarray

      :raises AssertionError: If no features are selected for PyOD input.

      .. rubric:: Examples

      >>> inp = policy._make_input(obs)


   .. py:method:: fit(data: Dict[str, Any]) -> None

      Fit the PyOD model using the provided data.

      :param data: Data dictionary containing features for fitting the model.
      :type data: dict

      :rtype: None

      .. rubric:: Examples

      >>> policy.fit(data)


   .. py:method:: get_train_scores() -> numpy.ndarray

      Get the OOD decision scores from the PyOD model after fitting.

      :returns: Array of decision scores for the training data.
      :rtype: numpy.ndarray

      .. rubric:: Examples

      >>> scores = policy.get_train_scores()
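
      Training scores are often used to pick a decision threshold. One common choice, shown here purely as an illustration, is a nearest-rank percentile cut. Whether ``PyODPolicy`` derives its ``threshold`` this way is not stated in this reference; ``percentile_threshold`` and the sample scores are hypothetical.

      .. code-block:: python

         import math
         from typing import Sequence


         def percentile_threshold(scores: Sequence[float], percentile: float) -> float:
             """Hypothetical helper: nearest-rank percentile of the given scores."""
             if not scores:
                 raise ValueError("scores must not be empty")
             ordered = sorted(scores)
             # Nearest-rank method: the smallest value with at least
             # `percentile` percent of the scores at or below it.
             rank = max(1, math.ceil(percentile * len(ordered) / 100))
             return ordered[rank - 1]


         # Made-up training scores; a higher score means "more anomalous".
         train_scores = [0.12, 0.30, 0.25, 0.90, 0.18, 0.22, 0.27, 0.15, 0.20, 0.35]
         threshold = percentile_threshold(train_scores, 90)  # -> 0.35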


   .. py:method:: act(obs: Dict[str, Any], temperature: Optional[float] = None) -> torch.Tensor

      Select actions based on OOD scores from the PyOD model.

      :param obs: Observation dictionary containing required features.
      :type obs: dict
      :param temperature: Unused. Included for API compatibility.
      :type temperature: float, optional

      :returns: Tensor of selected actions (expert or not) for the batch.
      :rtype: torch.Tensor

      .. rubric:: Examples

      >>> action = policy.act(obs)
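
      The "expert or not" decision can be pictured as a per-sample comparison of an OOD score against a threshold. The sketch below uses plain lists rather than tensors and makes two loud assumptions: that a score above the threshold selects the expert, and that the actions are encoded as ``1`` and ``0``. ``EXPERT``, ``NOVICE`` and ``select_actions`` are hypothetical names for illustration, not the policy's actual implementation.

      .. code-block:: python

         from typing import List, Sequence

         EXPERT = 1  # hypothetical encoding of "defer to the expert"
         NOVICE = 0  # hypothetical encoding of "do not defer"


         def select_actions(scores: Sequence[float], threshold: float) -> List[int]:
             """Hypothetical helper: pick EXPERT where the score exceeds the threshold."""
             # Assumption: scores strictly above the threshold count as out-of-distribution.
             return [EXPERT if score > threshold else NOVICE for score in scores]


         # One score per environment in the batch.
         actions = select_actions([0.10, 0.80, 0.35], threshold=0.35)  # -> [0, 1, 0]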


   .. py:method:: set_params(params: Dict[str, Any]) -> None

      Set the parameters of the policy.

      :param params: Dictionary of policy parameters to set.
      :type params: dict

      :rtype: None

      .. rubric:: Examples

      >>> policy.set_params({'threshold': 0.5, 'clf': clf})


   .. py:method:: get_params() -> Dict[str, Any]

      Get the current parameters of the policy.

      :returns: Dictionary of policy parameters.
      :rtype: dict

      .. rubric:: Examples

      >>> params = policy.get_params()
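
      Used together, ``get_params`` and ``set_params`` allow state to be copied from one policy to another. The sketch below shows that round trip on a hypothetical stand-in class; the ``threshold`` and ``clf`` keys are taken from the ``set_params`` example in this reference, and everything else, including the placeholder string used in place of a fitted detector, is an assumption.

      .. code-block:: python

         from typing import Any, Dict


         class ParamsHolderSketch:
             """Hypothetical stand-in exposing the same get/set parameter pattern."""

             def __init__(self) -> None:
                 self.threshold = None
                 self.clf = None

             def get_params(self) -> Dict[str, Any]:
                 return {"threshold": self.threshold, "clf": self.clf}

             def set_params(self, params: Dict[str, Any]) -> None:
                 self.threshold = params["threshold"]
                 self.clf = params["clf"]


         source = ParamsHolderSketch()
         # The string stands in for a fitted detector object.
         source.set_params({"threshold": 0.5, "clf": "fitted-detector-placeholder"})

         # Copy the state into a fresh instance.
         target = ParamsHolderSketch()
         target.set_params(source.get_params())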


   .. py:method:: train() -> None

      Set the PyOD model to training mode if applicable.

      :rtype: None

      .. rubric:: Examples

      >>> policy.train()


   .. py:method:: eval() -> None

      Set the PyOD model to evaluation mode if applicable.

      :rtype: None

      .. rubric:: Examples

      >>> policy.eval()