duo_ai.policies.pyod

Classes

`PyODPolicyConfig`	Configuration dataclass for PyODPolicy.
`PyODPolicy`	Policy that selects actions from the out-of-distribution (OOD) scores of a PyOD outlier detector.

Module Contents

class duo_ai.policies.pyod.PyODPolicyConfig

Configuration dataclass for PyODPolicy.

Parameters:

name (str, optional) – Name of the policy class. Default is “pyod”.
method (str, optional) – PyOD detector to use, written as a module name followed by a class name. Default is “deepsvdd.DeepSVDD”.
feature_type (str, optional) – Type of feature representation to use. Default is “hidden”.
pyod_config (dict, optional) – Additional configuration for the PyOD model. Default is None.
load_path (str, optional) – Path to a checkpoint to load. Default is None.

Examples

>>> config = PyODPolicyConfig(method="deepsvdd.DeepSVDD", feature_type="hidden")

name: str = 'pyod'

method: str = 'deepsvdd.DeepSVDD'

feature_type: str = 'hidden'

pyod_config: Dict[str, Any] | None = None

load_path: str | None = None
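The fields above behave like an ordinary dataclass. The sketch below uses a stand-in class, `PyODPolicyConfigSketch` (a hypothetical name, not the library class), that mirrors the documented fields and defaults, so it runs without `duo_ai` installed. `contamination` is a constructor argument shared by PyOD detectors; that `pyod_config` is forwarded to the detector as keyword arguments is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class PyODPolicyConfigSketch:
    """Stand-in mirroring the documented fields; not the library class."""

    name: str = "pyod"
    method: str = "deepsvdd.DeepSVDD"
    feature_type: str = "hidden"
    pyod_config: Optional[Dict[str, Any]] = None
    load_path: Optional[str] = None


# Override only what differs from the defaults.
# Assumption: pyod_config is passed to the detector constructor as kwargs.
config = PyODPolicyConfigSketch(pyod_config={"contamination": 0.05})
print(asdict(config))
```

With the real class, the same keyword arguments would presumably be passed to `PyODPolicyConfig` directly.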

class duo_ai.policies.pyod.PyODPolicy(config: PyODPolicyConfig, env: gym.Env)

Bases: duo_ai.core.policy.Policy

Policy that selects actions from the out-of-distribution (OOD) scores of a PyOD outlier detector.

Examples

>>> policy = PyODPolicy(PyODPolicyConfig(), env)
>>> obs = ...
>>> action = policy.act(obs)

config_cls

config

threshold = None

device

clf

feature_type

EXPERT

_get_pyod_class(config: PyODPolicyConfig) → type

Dynamically import and return the PyOD class specified in the config.

Parameters: config (PyODPolicyConfig) – Configuration object for the policy.
Returns: The PyOD class to instantiate.
Return type: type
Raises: ImportError – If the specified class cannot be imported.

Examples

>>> cls = policy._get_pyod_class(config)
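A minimal sketch of how a dotted `method` string can be resolved to a class with `importlib`. The helper name `resolve_class` and its `base_package` parameter are hypothetical, and the idea that `"deepsvdd.DeepSVDD"` is looked up under `pyod.models` is an assumption about the library rather than a description of its source. The demonstration resolves a standard-library class so that it runs without PyOD installed.

```python
import importlib


def resolve_class(method: str, base_package: str = "pyod.models") -> type:
    """Resolve 'module.ClassName' to a class (hypothetical helper)."""
    module_name, _, class_name = method.rpartition(".")
    if not module_name:
        raise ImportError(f"expected 'module.ClassName', got {method!r}")
    full_name = f"{base_package}.{module_name}" if base_package else module_name
    # import_module raises ModuleNotFoundError, a subclass of ImportError.
    module = importlib.import_module(full_name)
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ImportError(f"{full_name} has no attribute {class_name!r}") from exc


# Standard-library demonstration: resolves collections.abc.Mapping.
cls = resolve_class("abc.Mapping", base_package="collections")
print(cls)
```

Converting a missing attribute into `ImportError` keeps a single failure mode, which matches the exception documented above.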

reset(done: numpy.ndarray) → None

Reset the policy state at episode boundaries.

Parameters: done (numpy.ndarray) – Boolean array indicating which episodes in a batch require a reset.
Return type: None

Examples

>>> policy.reset(done)

_make_input(obs: Dict[str, Any]) → numpy.ndarray

Construct the input feature array for the PyOD model from the observation.

Parameters: obs (dict) – Observation dictionary containing required features.
Returns: Concatenated feature array for the PyOD model.
Return type: numpy.ndarray
Raises: AssertionError – If no features are selected for PyOD input.

Examples

>>> inp = policy._make_input(obs)
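The concatenation step can be sketched as follows. The function `make_input`, the `feature_keys` argument, and the observation keys `"hidden"` and `"logits"` are hypothetical; the sketch assumes each feature is a batch-first array and that the selected features are flattened and joined along the last axis, which may differ from what the library does.

```python
import numpy as np


def make_input(obs, feature_keys):
    """Concatenate the selected per-sample features (hypothetical helper)."""
    assert len(feature_keys) > 0, "no features selected for PyOD input"
    parts = []
    for key in feature_keys:
        arr = np.asarray(obs[key], dtype=np.float32)
        # Flatten everything except the batch dimension.
        parts.append(arr.reshape(arr.shape[0], -1))
    return np.concatenate(parts, axis=1)


# Batch of 2 observations with a 3-dim and a 2-dim feature.
obs = {"hidden": np.zeros((2, 3)), "logits": np.ones((2, 2))}
features = make_input(obs, ["hidden", "logits"])
print(features.shape)
```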

fit(data: Dict[str, Any]) → None

Fit the PyOD model using the provided data.

Parameters: data (dict) – Data dictionary containing features for fitting the model.
Return type: None

Examples

>>> policy.fit(data)

get_train_scores() → numpy.ndarray

Get the OOD decision scores from the PyOD model after fitting.

Returns: Array of decision scores for the training data.
Return type: numpy.ndarray

Examples

>>> scores = policy.get_train_scores()
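One common use of training scores is to pick a decision threshold from their distribution. The sketch below takes the 95th percentile; this is an illustrative choice, not necessarily how `PyODPolicy` sets `threshold`. The random array stands in for the output of `policy.get_train_scores()`.

```python
import numpy as np

# Stand-in for policy.get_train_scores(); real scores come from the fitted detector.
rng = np.random.default_rng(0)
train_scores = rng.normal(size=1000)

# Assumption: treat the top 5% of training scores as out-of-distribution.
threshold = float(np.percentile(train_scores, 95))
flagged = train_scores > threshold
print(threshold, int(flagged.sum()))
```

A higher percentile flags fewer samples as out-of-distribution; the right value depends on how costly a false alarm is in the task at hand.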

act(obs: Dict[str, Any], temperature: float | None = None) → torch.Tensor

Select actions based on OOD scores from the PyOD model.

Parameters:

obs (dict) – Observation dictionary containing required features.
temperature (float, optional) – Unused. Included for API compatibility.

Returns:

Tensor of selected actions (expert or not) for the batch.

Return type:

torch.Tensor

Examples

>>> action = policy.act(obs)
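A minimal sketch of threshold-based action selection, under stated assumptions: a score above the threshold selects the expert action, and the action ids `NOVICE = 0` and `EXPERT = 1` are hypothetical values chosen for illustration. The real method returns a `torch.Tensor`; NumPy is used here only to keep the sketch dependency-light.

```python
import numpy as np

NOVICE, EXPERT = 0, 1  # hypothetical action ids, not taken from the library


def select_actions(scores, threshold):
    """Pick EXPERT where the OOD score exceeds the threshold (assumed rule)."""
    scores = np.asarray(scores, dtype=np.float64)
    return np.where(scores > threshold, EXPERT, NOVICE)


actions = select_actions([0.1, 0.9, 0.4], threshold=0.5)
print(actions)
```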

set_params(params: Dict[str, Any]) → None

Set the parameters of the policy.

Parameters: params (dict) – Dictionary of policy parameters to set.
Return type: None

Examples

>>> policy.set_params({'threshold': 0.5, 'clf': clf})

get_params() → Dict[str, Any]

Get the current parameters of the policy.

Returns: Dictionary of policy parameters.
Return type: dict

Examples

>>> params = policy.get_params()

train() → None

Set the PyOD model to training mode if applicable.

Return type: None

Examples

>>> policy.train()

eval() → None

Set the PyOD model to evaluation mode if applicable.

Return type: None

Examples

>>> policy.eval()