duo_ai.policies.pyod

Classes

PyODPolicyConfig

Configuration dataclass for PyODPolicy.

PyODPolicy

Policy that uses a PyOD outlier detector to select actions based on out-of-distribution (OOD) scores.

Module Contents

class duo_ai.policies.pyod.PyODPolicyConfig

Configuration dataclass for PyODPolicy.

Parameters:
  • name (str, optional) – Name of the policy class. Default is “pyod”.

  • method (str, optional) – PyOD method to use. Default is “deepsvdd.DeepSVDD”.

  • feature_type (str, optional) – Type of feature representation to use. Default is “hidden”.

  • pyod_config (dict, optional) – Additional configuration for the PyOD model. Default is None.

  • load_path (str, optional) – Path to a checkpoint to load. Default is None.

Examples

>>> config = PyODPolicyConfig(method="deepsvdd.DeepSVDD", feature_type="hidden")
name: str = 'pyod'
method: str = 'deepsvdd.DeepSVDD'
feature_type: str = 'hidden'
pyod_config: Dict[str, Any] | None = None
load_path: str | None = None
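The fields and defaults listed above can be mirrored in a small stand-alone dataclass, which is handy for experimenting without the package installed. This is only a sketch: PyODPolicyConfigSketch is a hypothetical name, and the "iforest.IForest" value assumes that method follows the same "module.ClassName" pattern as the documented default.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class PyODPolicyConfigSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    name: str = "pyod"
    method: str = "deepsvdd.DeepSVDD"
    feature_type: str = "hidden"
    pyod_config: Optional[Dict[str, Any]] = None
    load_path: Optional[str] = None


# Override only the fields that differ from the defaults.
# Assumption: `method` accepts any "module.ClassName" string of this shape.
config = PyODPolicyConfigSketch(
    method="iforest.IForest",
    pyod_config={"n_estimators": 200},
)
```

Fields that are not passed keep the defaults documented in the parameter list.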
class duo_ai.policies.pyod.PyODPolicy(config: PyODPolicyConfig, env: gym.Env)

Bases: duo_ai.core.policy.Policy

Policy that uses a PyOD outlier detector for action selection based on OOD scores.

Examples

>>> policy = PyODPolicy(PyODPolicyConfig(), env)
>>> obs = ...
>>> action = policy.act(obs)
config_cls
config
threshold = None
device
clf
feature_type
EXPERT
_get_pyod_class(config: PyODPolicyConfig) -> type

Dynamically import and return the PyOD class specified in the config.

Parameters:

config (PyODPolicyConfig) – Configuration object for the policy.

Returns:

The PyOD class to instantiate.

Return type:

type

Raises:

ImportError – If the specified class cannot be imported.

Examples

>>> cls = policy._get_pyod_class(config)
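One plausible way to resolve a string such as "deepsvdd.DeepSVDD" into a class is with importlib. The sketch below is not the library's implementation: resolve_class is a hypothetical helper, and the "pyod.models" package prefix is an assumption based on where PyOD keeps its detectors. It is demonstrated against the standard library so that it runs without PyOD installed.

```python
import importlib


def resolve_class(method: str, package: str = "pyod.models") -> type:
    """Resolve a 'module.ClassName' string to a class under `package`.

    The default package prefix is an assumption, not taken from the source.
    """
    module_name, _, class_name = method.rpartition(".")
    if not module_name:
        raise ImportError(f"expected 'module.ClassName', got {method!r}")
    try:
        module = importlib.import_module(f"{package}.{module_name}")
        return getattr(module, class_name)
    except (ModuleNotFoundError, AttributeError) as exc:
        # Surface both failure modes as ImportError, as the docs describe.
        raise ImportError(f"cannot import {method!r} from {package}") from exc


# Stand-in target from the standard library: collections.abc.Mapping.
mapping_cls = resolve_class("abc.Mapping", package="collections")
```

Re-raising as ImportError keeps a missing module and a missing class name under the single exception type listed under Raises.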
reset(done: numpy.ndarray) -> None

Reset the policy state at episode boundaries.

Parameters:

done (numpy.ndarray) – Boolean array indicating which episodes in a batch require a reset.

Return type:

None

Examples

>>> policy.reset(done)
_make_input(obs: Dict[str, Any]) -> numpy.ndarray

Construct the input feature array for the PyOD model from the observation.

Parameters:

obs (dict) – Observation dictionary containing required features.

Returns:

Concatenated feature array for the PyOD model.

Return type:

numpy.ndarray

Raises:

AssertionError – If no features are selected for PyOD input.

Examples

>>> inp = policy._make_input(obs)
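The concatenation step can be sketched with NumPy as follows. The observation keys "hidden" and "logits", the make_input helper, and the way features are selected are all assumptions made for illustration. The source does not say how feature_type maps onto observation keys.

```python
import numpy as np


def make_input(obs, feature_keys):
    """Concatenate the selected per-sample features along the feature axis."""
    # Mirrors the documented AssertionError when nothing is selected.
    assert len(feature_keys) > 0, "no features selected for PyOD input"
    parts = []
    for key in feature_keys:
        arr = np.asarray(obs[key], dtype=np.float32)
        # Flatten everything except the batch dimension.
        parts.append(arr.reshape(arr.shape[0], -1))
    return np.concatenate(parts, axis=1)


# Hypothetical batch of 4 observations with two feature groups.
obs = {
    "hidden": np.zeros((4, 8)),
    "logits": np.ones((4, 3)),
}
features = make_input(obs, ["hidden", "logits"])
```

The result is a 2-D array with one row per observation, which is the layout PyOD detectors expect for fit and decision_function.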
fit(data: Dict[str, Any]) -> None

Fit the PyOD model using the provided data.

Parameters:

data (dict) – Data dictionary containing features for fitting the model.

Return type:

None

Examples

>>> policy.fit(data)
get_train_scores() -> numpy.ndarray

Get the OOD decision scores from the PyOD model after fitting.

Returns:

Array of decision scores for the training data.

Return type:

numpy.ndarray

Examples

>>> scores = policy.get_train_scores()
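PyOD detectors expose training scores through the decision_scores_ attribute after fit, with larger values indicating more anomalous samples. The sketch below uses a toy stand-in detector with the same interface so that it runs without PyOD. The 95th-percentile rule for deriving a threshold is an assumption for illustration, not something the source specifies.

```python
import numpy as np


class MeanDistanceDetector:
    """Toy detector with a PyOD-style fit / decision_scores_ / decision_function interface."""

    def fit(self, X):
        self.center_ = X.mean(axis=0)
        self.decision_scores_ = self.decision_function(X)
        return self

    def decision_function(self, X):
        # Larger score = farther from the training mean = more anomalous.
        return np.linalg.norm(X - self.center_, axis=1)


rng = np.random.default_rng(0)
train_features = rng.normal(size=(200, 4))

clf = MeanDistanceDetector().fit(train_features)
train_scores = clf.decision_scores_

# Assumed calibration rule: treat the top 5% of training scores as OOD.
threshold = float(np.percentile(train_scores, 95))
```

A threshold derived this way could then be passed to set_params alongside the fitted detector.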
act(obs: Dict[str, Any], temperature: float | None = None) -> torch.Tensor

Select actions based on OOD scores from the PyOD model.

Parameters:
  • obs (dict) – Observation dictionary containing required features.

  • temperature (float, optional) – Unused. Included for API compatibility.

Returns:

Tensor of selected actions (expert or not) for the batch.

Return type:

torch.Tensor

Examples

>>> action = policy.act(obs)
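The selection rule can be pictured as a simple threshold on the OOD score. The sketch below uses NumPy rather than torch to stay lightweight. The NOVICE/EXPERT encoding, the select_actions helper, and the direction of the comparison (high score means defer to the expert) are assumptions, since the source only says the result is "expert or not".

```python
import numpy as np

NOVICE, EXPERT = 0, 1  # hypothetical action encoding


def select_actions(scores, threshold):
    """Return EXPERT where the OOD score exceeds the threshold, NOVICE elsewhere."""
    scores = np.asarray(scores)
    # Assumption: a higher OOD score means the observation is less familiar.
    return np.where(scores > threshold, EXPERT, NOVICE)


actions = select_actions([0.1, 0.9, 0.4, 0.7], threshold=0.5)
```

In the documented policy the scores would come from the fitted PyOD model and the result would be returned as a torch.Tensor.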
set_params(params: Dict[str, Any]) -> None

Set the parameters of the policy.

Parameters:

params (dict) – Dictionary of policy parameters to set.

Return type:

None

Examples

>>> policy.set_params({'threshold': 0.5, 'clf': clf})
get_params() -> Dict[str, Any]

Get the current parameters of the policy.

Returns:

Dictionary of policy parameters.

Return type:

dict

Examples

>>> params = policy.get_params()
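Together, get_params and set_params allow the fitted state to be copied from one policy instance to another. The round trip below is a minimal sketch with a hypothetical ParamHolder class. It assumes the parameter dictionary holds exactly the "threshold" and "clf" keys shown in the set_params example, and a plain dict stands in for the fitted detector.

```python
import copy


class ParamHolder:
    """Hypothetical stand-in holding the two parameters from the set_params example."""

    def __init__(self):
        self.threshold = None
        self.clf = None

    def get_params(self):
        # Deep-copy the detector so the caller cannot mutate our state.
        return {"threshold": self.threshold, "clf": copy.deepcopy(self.clf)}

    def set_params(self, params):
        self.threshold = params["threshold"]
        self.clf = params["clf"]


source = ParamHolder()
source.set_params({"threshold": 0.5, "clf": {"center": [0.0, 1.0]}})

# Transfer the state to a fresh instance.
target = ParamHolder()
target.set_params(source.get_params())
```

Whether the real policy copies the detector or shares a reference is not stated in the source. The deep copy here is a design choice of the sketch.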
train() -> None

Set the PyOD model to training mode if applicable.

Return type:

None

Examples

>>> policy.train()
eval() -> None

Set the PyOD model to evaluation mode if applicable.

Return type:

None

Examples

>>> policy.eval()