duo_ai.policies.pyod
Classes
PyODPolicyConfig | Configuration dataclass for PyODPolicy.
PyODPolicy | Policy that uses a PyOD outlier detector for action selection based on OOD scores.
Module Contents
- class duo_ai.policies.pyod.PyODPolicyConfig
Configuration dataclass for PyODPolicy.
- Parameters:
name (str, optional) – Name of the policy class. Default is “pyod”.
method (str, optional) – PyOD method to use. Default is “deepsvdd.DeepSVDD”.
feature_type (str, optional) – Type of feature representation to use. Default is “hidden”.
pyod_config (dict, optional) – Additional configuration for the PyOD model. Default is None.
load_path (str, optional) – Path to a checkpoint to load. Default is None.
Examples
>>> config = PyODPolicyConfig(method="deepsvdd.DeepSVDD", feature_type="hidden")
- name: str = 'pyod'
- method: str = 'deepsvdd.DeepSVDD'
- feature_type: str = 'hidden'
- pyod_config: Dict[str, Any] | None = None
- load_path: str | None = None
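As a rough illustration of how these fields fit together, the sketch below mirrors the documented defaults in a stand-in dataclass. `PyODPolicyConfigSketch` is a hypothetical name, not the duo_ai class. It assumes that `pyod_config` holds keyword arguments for the detector; `contamination` is a standard PyOD detector parameter, but whether duo_ai forwards it this way is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class PyODPolicyConfigSketch:
    """Stand-in mirroring the documented fields; not the duo_ai class."""

    name: str = "pyod"
    method: str = "deepsvdd.DeepSVDD"
    feature_type: str = "hidden"
    pyod_config: Optional[Dict[str, Any]] = None
    load_path: Optional[str] = None


# Override only the fields that differ from the documented defaults.
config = PyODPolicyConfigSketch(
    method="iforest.IForest",
    pyod_config={"contamination": 0.05},  # assumed to be detector kwargs
)

# A plain dict is convenient for logging or serializing the configuration.
config_dict = asdict(config)
```

Fields left unset keep their defaults, so `config.feature_type` is still `"hidden"` here.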
- class duo_ai.policies.pyod.PyODPolicy(config: PyODPolicyConfig, env: gym.Env)
Bases: duo_ai.core.policy.Policy
Policy that uses a PyOD outlier detector for action selection based on out-of-distribution (OOD) scores.
Examples
>>> policy = PyODPolicy(PyODPolicyConfig(), env)
>>> obs = ...
>>> action = policy.act(obs)
- config_cls
- config
- threshold = None
- device
- clf
- feature_type
- EXPERT
- _get_pyod_class(config: PyODPolicyConfig) → type
Dynamically import and return the PyOD class specified in the config.
- Parameters:
config (PyODPolicyConfig) – Configuration object for the policy.
- Returns:
The PyOD class to instantiate.
- Return type:
type
- Raises:
ImportError – If the specified class cannot be imported.
Examples
>>> cls = policy._get_pyod_class(config)
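The dynamic import described above can be sketched with `importlib`. This is a minimal illustration, not the library's implementation: the helper name `resolve_class` and the assumption that a `method` string such as "deepsvdd.DeepSVDD" is resolved relative to the `pyod.models` package are both hypothetical. The demonstration call uses a standard-library package so that the snippet runs without PyOD installed.

```python
import importlib


def resolve_class(method: str, package: str = "pyod.models") -> type:
    """Resolve a 'module.ClassName' string to a class under `package`.

    Hypothetical helper; the default package is an assumption.
    """
    module_name, _, class_name = method.rpartition(".")
    if not module_name or not class_name:
        raise ImportError(f"Expected 'module.ClassName', got {method!r}")
    try:
        module = importlib.import_module(f"{package}.{module_name}")
        return getattr(module, class_name)
    except (ModuleNotFoundError, AttributeError) as exc:
        # Surface both a missing module and a missing class as ImportError.
        raise ImportError(f"Cannot import {method!r} from {package!r}") from exc


# Demonstrated against the standard library instead of PyOD.
decoder_cls = resolve_class("decoder.JSONDecoder", package="json")
```

Wrapping both failure modes in `ImportError` matches the single exception type documented under Raises.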
- reset(done: numpy.ndarray) → None
Reset the policy state at episode boundaries.
- Parameters:
done (numpy.ndarray) – Boolean array indicating which episodes in a batch require a reset.
- Return type:
None
Examples
>>> policy.reset(done)
- _make_input(obs: Dict[str, Any]) → numpy.ndarray
Construct the input feature array for the PyOD model from the observation.
- Parameters:
obs (dict) – Observation dictionary containing required features.
- Returns:
Concatenated feature array for the PyOD model.
- Return type:
np.ndarray
- Raises:
AssertionError – If no features are selected for PyOD input.
Examples
>>> inp = policy._make_input(obs)
- fit(data: Dict[str, Any]) → None
Fit the PyOD model using the provided data.
- Parameters:
data (dict) – Data dictionary containing features for fitting the model.
- Return type:
None
Examples
>>> policy.fit(data)
- get_train_scores() → numpy.ndarray
Get the OOD decision scores from the PyOD model after fitting.
- Returns:
Array of decision scores for the training data.
- Return type:
np.ndarray
Examples
>>> scores = policy.get_train_scores()
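The class lists a `threshold` attribute that defaults to None, and this page does not say how it is chosen. One common approach, shown here purely as an assumption, is to take a high percentile of the training scores. The helper `percentile_threshold` is hypothetical and uses the nearest-rank method on a plain list of floats.

```python
import math
from typing import Sequence


def percentile_threshold(scores: Sequence[float], percentile: float) -> float:
    """Return the nearest-rank percentile of `scores` (0 < percentile <= 100)."""
    if not scores:
        raise ValueError("scores must be non-empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(scores)
    # 1-based nearest rank: the smallest rank covering `percentile` of the data.
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[rank - 1]


# Illustrative scores only; real values would come from get_train_scores().
train_scores = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7, 0.6, 0.8, 1.0]
threshold = percentile_threshold(train_scores, 90)
```

A value obtained this way could then be passed to `set_params` under the `'threshold'` key, as in that method's documented example.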
- act(obs: Dict[str, Any], temperature: float | None = None) → torch.Tensor
Select actions based on OOD scores from the PyOD model.
- Parameters:
obs (dict) – Observation dictionary containing required features.
temperature (float, optional) – Unused. Included for API compatibility.
- Returns:
Tensor of selected actions (expert or not) for the batch.
- Return type:
torch.Tensor
Examples
>>> action = policy.act(obs)
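Score-based action selection can be sketched as a simple comparison against a threshold. The rule below is an assumption rather than the documented behavior: it picks the expert action when a score exceeds the threshold. The action indices `EXPERT` and `NOVICE` and the function `select_actions` are hypothetical, and plain lists stand in for the tensors the real method returns.

```python
from typing import List, Sequence

EXPERT = 1  # hypothetical action index: defer to the expert
NOVICE = 0  # hypothetical action index: act without the expert


def select_actions(scores: Sequence[float], threshold: float) -> List[int]:
    """Choose EXPERT where the OOD score is strictly above `threshold`."""
    return [EXPERT if score > threshold else NOVICE for score in scores]


# One score per environment in the batch; values are illustrative only.
actions = select_actions([0.2, 0.95, 0.9], threshold=0.9)
```

The strict comparison means a score exactly at the threshold is treated as in-distribution in this sketch; the real policy may handle ties differently.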
- set_params(params: Dict[str, Any]) → None
Set the parameters of the policy.
- Parameters:
params (dict) – Dictionary of policy parameters to set.
- Return type:
None
Examples
>>> policy.set_params({'threshold': 0.5, 'clf': clf})
- get_params() → Dict[str, Any]
Get the current parameters of the policy.
- Returns:
Dictionary of policy parameters.
- Return type:
dict
Examples
>>> params = policy.get_params()