duo_ai.policies.pyod
====================

.. py:module:: duo_ai.policies.pyod


Classes
-------

.. autoapisummary::

   duo_ai.policies.pyod.PyODPolicyConfig
   duo_ai.policies.pyod.PyODPolicy


Module Contents
---------------

.. py:class:: PyODPolicyConfig

   Configuration dataclass for PyODPolicy.

   :param name: Name of the policy class. Default is "pyod".
   :type name: str, optional
   :param method: PyOD method to use. Default is "deepsvdd.DeepSVDD".
   :type method: str, optional
   :param feature_type: Type of feature representation to use. Default is "hidden".
   :type feature_type: str, optional
   :param pyod_config: Additional configuration for the PyOD model. Default is None.
   :type pyod_config: dict, optional
   :param load_path: Path to a checkpoint to load. Default is None.
   :type load_path: str, optional

   .. rubric:: Examples

   >>> config = PyODPolicyConfig(method="deepsvdd.DeepSVDD", feature_type="hidden")


   .. py:attribute:: name
      :type:  str
      :value: 'pyod'


   .. py:attribute:: method
      :type:  str
      :value: 'deepsvdd.DeepSVDD'


   .. py:attribute:: feature_type
      :type:  str
      :value: 'hidden'


   .. py:attribute:: pyod_config
      :type:  Optional[Dict[str, Any]]
      :value: None


   .. py:attribute:: load_path
      :type:  Optional[str]
      :value: None


.. py:class:: PyODPolicy(config: PyODPolicyConfig, env: gym.Env)

   Bases: :py:obj:`duo_ai.core.policy.Policy`


   Policy that uses a PyOD outlier detector for action selection based on OOD scores.

   .. rubric:: Examples

   >>> policy = PyODPolicy(PyODPolicyConfig(), env)
   >>> obs = ...
   >>> action = policy.act(obs)


   .. py:attribute:: config_cls


   .. py:attribute:: config


   .. py:attribute:: threshold
      :value: None


   .. py:attribute:: device


   .. py:attribute:: clf


   .. py:attribute:: feature_type


   .. py:attribute:: EXPERT


   .. py:method:: _get_pyod_class(config: PyODPolicyConfig) -> type

      Dynamically import and return the PyOD class specified in the config.

      :param config: Configuration object for the policy.
      :type config: PyODPolicyConfig

      :returns: The PyOD class to instantiate.
      :rtype: type

      :raises ImportError: If the specified class cannot be imported.

      .. rubric:: Examples

      >>> cls = policy._get_pyod_class(config)


   .. py:method:: reset(done: numpy.ndarray) -> None

      Reset the policy state at episode boundaries.

      :param done: Boolean array indicating which episodes in a batch require a reset.
      :type done: numpy.ndarray

      :rtype: None

      .. rubric:: Examples

      >>> policy.reset(done)


   .. py:method:: _make_input(obs: Dict[str, Any]) -> numpy.ndarray

      Construct the input feature array for the PyOD model from the observation.

      :param obs: Observation dictionary containing required features.
      :type obs: dict

      :returns: Concatenated feature array for the PyOD model.
      :rtype: np.ndarray

      :raises AssertionError: If no features are selected for PyOD input.

      .. rubric:: Examples

      >>> inp = policy._make_input(obs)


   .. py:method:: fit(data: Dict[str, Any]) -> None

      Fit the PyOD model using the provided data.

      :param data: Data dictionary containing features for fitting the model.
      :type data: dict

      :rtype: None

      .. rubric:: Examples

      >>> policy.fit(data)


   .. py:method:: get_train_scores() -> numpy.ndarray

      Get the OOD decision scores from the PyOD model after fitting.

      :returns: Array of decision scores for the training data.
      :rtype: np.ndarray

      .. rubric:: Examples

      >>> scores = policy.get_train_scores()


   .. py:method:: act(obs: Dict[str, Any], temperature: Optional[float] = None) -> torch.Tensor

      Select actions based on OOD scores from the PyOD model.

      :param obs: Observation dictionary containing required features.
      :type obs: dict
      :param temperature: Unused. Included for API compatibility.
      :type temperature: float, optional

      :returns: Tensor of selected actions (expert or not) for the batch.
      :rtype: torch.Tensor

      .. rubric:: Examples

      >>> action = policy.act(obs)


   .. py:method:: set_params(params: Dict[str, Any]) -> None

      Set the parameters of the policy.

      :param params: Dictionary of policy parameters to set.
      :type params: dict

      :rtype: None

      .. rubric:: Examples

      >>> policy.set_params({'threshold': 0.5, 'clf': clf})


   .. py:method:: get_params() -> Dict[str, Any]

      Get the current parameters of the policy.

      :returns: Dictionary of policy parameters.
      :rtype: dict

      .. rubric:: Examples

      >>> params = policy.get_params()


   .. py:method:: train() -> None

      Set the PyOD model to training mode if applicable.

      :rtype: None

      .. rubric:: Examples

      >>> policy.train()


   .. py:method:: eval() -> None

      Set the PyOD model to evaluation mode if applicable.

      :rtype: None

      .. rubric:: Examples

      >>> policy.eval()
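
As a rough illustration of how a ``method`` string such as ``"deepsvdd.DeepSVDD"`` could be turned into a class by ``_get_pyod_class``, the sketch below splits the string into a module path and a class name and imports it with ``importlib``. The helper name ``resolve_class`` and the ``package`` argument are hypothetical and not part of ``duo_ai``; the assumption that the string is resolved relative to ``pyod.models`` is a guess based on PyOD's package layout, not something this reference confirms. The standard library stands in for PyOD so the snippet runs without extra dependencies.

```python
import importlib


def resolve_class(method: str, package: str) -> type:
    """Resolve a "module.ClassName" string to a class inside ``package``.

    Hypothetical helper for illustration only; not the duo_ai implementation.
    """
    # Split on the last dot: the part before it is the module path relative
    # to ``package``, the part after it is the class name.
    module_name, _, class_name = method.rpartition(".")
    if not module_name:
        raise ImportError(f"Expected 'module.ClassName', got {method!r}")
    try:
        module = importlib.import_module(f"{package}.{module_name}")
        return getattr(module, class_name)
    except (ModuleNotFoundError, AttributeError) as exc:
        # Report both failure modes as ImportError, in line with the
        # documented ":raises ImportError:" behaviour.
        raise ImportError(f"Cannot import {method!r} from {package!r}") from exc


# Demonstrated with the standard library so the snippet runs without PyOD.
# With PyOD installed, the analogous (assumed) call would be
# resolve_class("deepsvdd.DeepSVDD", package="pyod.models").
decoder_cls = resolve_class("decoder.JSONDecoder", package="json")
print(decoder_cls.__name__)
```

Keeping the detector name as a plain string in ``PyODPolicyConfig`` means the configuration stays serializable, and a different detector can be selected without changing policy code.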