duo_ai.algorithms.random

Classes

RandomAlgorithmConfig

Configuration dataclass for RandomAlgorithm.

RandomAlgorithm

Algorithm that searches for the best probability parameter to maximize evaluation reward.

Module Contents

class duo_ai.algorithms.random.RandomAlgorithmConfig

Configuration dataclass for RandomAlgorithm.

Parameters:
  • name (str, optional) – Name of the algorithm class. Default is “random”.

  • probs (list of float, optional) – List of candidate probabilities to search. Default is np.arange(0, 1.01, 0.1), i.e. eleven values from 0.0 to 1.0 in steps of 0.1.
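The two fields above can be mirrored in a small stand-alone sketch that shows how the default grid is built and how a custom one might be passed. This is an illustration only: `SketchConfig` and `default_probs` are hypothetical names, not part of duo_ai, and the grid is built in pure Python (with rounding) rather than with np.arange.

```python
from dataclasses import dataclass, field
from typing import List


def default_probs() -> List[float]:
    # Pure-Python stand-in for np.arange(0, 1.01, 0.1): 0.0, 0.1, ..., 1.0.
    # Rounding to one decimal is a choice of this sketch, not of the library.
    return [round(i * 0.1, 1) for i in range(11)]


@dataclass
class SketchConfig:
    # Hypothetical mirror of the documented fields, not the library class.
    name: str = "random"
    # A mutable default needs default_factory so instances do not share a list.
    probs: List[float] = field(default_factory=default_probs)


default_cfg = SketchConfig()
coarse_cfg = SketchConfig(probs=[0.0, 0.5, 1.0])  # a coarser search grid
print(len(default_cfg.probs), default_cfg.probs[0], default_cfg.probs[-1])
```

A shorter `probs` list trades search resolution for fewer evaluation runs, assuming each candidate probability is evaluated once.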

Examples

>>> config = RandomAlgorithmConfig()

name: str = 'random'
probs: List[float]

class duo_ai.algorithms.random.RandomAlgorithm(config: RandomAlgorithmConfig)

Bases: duo_ai.core.Algorithm

Algorithm that searches for the best probability parameter to maximize evaluation reward.

Examples

>>> algo = RandomAlgorithm(RandomAlgorithmConfig())

config_cls
config

train(policy: duo.policies.PPOPolicy, env: gym.Env, validators: Dict[str, duo.core.Evaluator]) -> None

Train the RandomAlgorithm by searching for the best probability parameter that maximizes evaluation reward.

Parameters:
  • policy (duo.policies.PPOPolicy) – The policy to be evaluated and tuned.

  • env (gym.Env) – The environment instance for training and data generation.

  • validators (dict of str to duo.core.Evaluator) – Dictionary mapping split names to evaluator instances for evaluation.

Return type:

None

Examples

>>> algorithm = RandomAlgorithm(RandomAlgorithmConfig())
>>> algorithm.train(policy, env, validators)
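The search that train describes can be pictured as a plain grid search: evaluate every candidate probability on each split and keep the one with the highest reward. The sketch below is a minimal stand-in under stated assumptions, not the library's implementation. `search_best_prob` is a hypothetical helper, each evaluator is modelled as a callable from a probability to a scalar reward (the real duo.core.Evaluator interface is not shown on this page), and tracking a separate best per split is a guess suggested by the "best_test" checkpoint name.

```python
from typing import Callable, Dict, List, Tuple


def search_best_prob(
    probs: List[float],
    evaluators: Dict[str, Callable[[float], float]],
) -> Dict[str, Tuple[float, float]]:
    """Return, per split, the (prob, reward) pair with the highest reward."""
    best: Dict[str, Tuple[float, float]] = {}
    for prob in probs:
        for split, evaluate in evaluators.items():
            reward = evaluate(prob)
            # Strict ">" keeps the earliest candidate when rewards tie.
            if split not in best or reward > best[split][1]:
                best[split] = (prob, reward)
    return best


# Toy evaluators with made-up reward curves peaking at 0.3 and 0.7.
toy_evaluators = {
    "val": lambda p: -(p - 0.3) ** 2,
    "test": lambda p: -(p - 0.7) ** 2,
}
grid = [i / 10 for i in range(11)]
result = search_best_prob(grid, toy_evaluators)
print(result["val"][0], result["test"][0])
```

In the real algorithm each evaluation presumably runs the policy in the environment, so the cost grows linearly with the number of candidate probabilities and splits.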

save_checkpoint(policy: duo.policies.PPOPolicy, name: str) -> None

Save the current policy configuration and parameters to a checkpoint file.

Parameters:
  • policy (duo.policies.PPOPolicy) – The policy whose parameters are to be saved.

  • name (str) – Name for the checkpoint file.

Return type:

None

Examples

>>> algorithm.save_checkpoint(policy, "best_test")
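
To make the idea of saving "configuration and parameters" under a checkpoint name concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `save_checkpoint_sketch`, the JSON format, the `<name>.json` file layout, and the dictionary keys are hypothetical, and the on-disk format actually used by duo_ai is not described on this page.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_checkpoint_sketch(
    save_dir: Path, name: str, policy_config: Dict[str, Any], params: Dict[str, Any]
) -> Path:
    """Write config and parameters to <save_dir>/<name>.json and return the path."""
    save_dir.mkdir(parents=True, exist_ok=True)
    path = save_dir / f"{name}.json"
    # Hypothetical payload layout; the real checkpoint contents may differ.
    payload = {"policy_config": policy_config, "params": params}
    path.write_text(json.dumps(payload, indent=2))
    return path


# Round trip in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    ckpt = save_checkpoint_sketch(
        Path(tmp), "best_test", {"cls": "random"}, {"prob": 0.3}
    )
    restored = json.loads(ckpt.read_text())
```

Storing the selected probability together with the policy configuration would let the best setting be restored without repeating the search.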