duo_ai.algorithms.random
Classes
RandomAlgorithmConfig | Configuration dataclass for RandomAlgorithm.
RandomAlgorithm | Algorithm that searches for the best probability parameter to maximize evaluation reward.
Module Contents
- class duo_ai.algorithms.random.RandomAlgorithmConfig
Configuration dataclass for RandomAlgorithm.
- Parameters:
name (str, optional) – Name of the algorithm class. Default is 'random'.
probs (list of float, optional) – Candidate probabilities to search over. Default is np.arange(0, 1.01, 0.1), i.e. 0.0 to 1.0 in steps of 0.1.
Examples
>>> config = RandomAlgorithmConfig()
- name: str = 'random'
- probs: List[float]
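The two documented fields can be mirrored in a small stand-in dataclass to show how a custom search grid might be supplied. This is only a sketch: `RandomAlgorithmConfigSketch` and `default_probs` are hypothetical names, not part of duo_ai, and the default grid is built with plain Python to give the same eleven values that the documented `np.arange(0, 1.01, 0.1)` default describes.

```python
from dataclasses import dataclass, field
from typing import List


def default_probs() -> List[float]:
    # Hypothetical helper: 0.0, 0.1, ..., 1.0 (eleven candidates), matching
    # the grid described by the documented default np.arange(0, 1.01, 0.1).
    return [i / 10 for i in range(11)]


@dataclass
class RandomAlgorithmConfigSketch:
    """Stand-in that mirrors the documented fields; not the library class."""

    name: str = "random"
    # A mutable default such as a list must go through default_factory,
    # so that each instance gets its own copy.
    probs: List[float] = field(default_factory=default_probs)


default_cfg = RandomAlgorithmConfigSketch()
coarse_cfg = RandomAlgorithmConfigSketch(probs=[0.0, 0.5, 1.0])

print(default_cfg.name, len(default_cfg.probs))
print(coarse_cfg.probs)
```

A coarser grid such as `[0.0, 0.5, 1.0]` would presumably mean fewer evaluation passes during the search, at the cost of a less precise optimum.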
- class duo_ai.algorithms.random.RandomAlgorithm(config: RandomAlgorithmConfig)
Bases: duo_ai.core.Algorithm
Algorithm that searches for the best probability parameter to maximize evaluation reward.
Examples
>>> algo = RandomAlgorithm(RandomAlgorithmConfig())
- config_cls
- config
- train(policy: duo.policies.PPOPolicy, env: gym.Env, validators: Dict[str, duo.core.Evaluator]) -> None
Train the RandomAlgorithm by searching for the best probability parameter that maximizes evaluation reward.
- Parameters:
policy (duo.policies.PPOPolicy) – The policy to be evaluated and tuned.
env (gym.Env) – The environment instance for training and data generation.
validators (dict of str to duo.core.Evaluator) – Dictionary mapping split names to evaluator instances for evaluation.
- Return type:
None
Examples
>>> algorithm = RandomAlgorithm(RandomAlgorithmConfig())
>>> algorithm.train(policy, env, validators)
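The search that `train` describes can be pictured as a loop over the candidate probabilities that keeps the one with the highest evaluation reward. The following is a minimal sketch of that idea, not the library's implementation: `search_best_prob`, `evaluate`, and `toy_reward` are hypothetical names, and the real method evaluates a policy through the `validators` rather than a plain function.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def search_best_prob(
    probs: Iterable[float],
    evaluate: Callable[[float], float],
) -> Tuple[float, float, Dict[float, float]]:
    """Return (best_prob, best_reward, rewards) over the candidate grid.

    `evaluate` is a hypothetical stand-in for running an evaluator with
    the policy configured to use the given probability.
    """
    rewards: Dict[float, float] = {}
    best_prob: Optional[float] = None
    best_reward = float("-inf")
    for prob in probs:
        reward = evaluate(prob)
        rewards[prob] = reward
        # Strict '>' keeps the earliest candidate when rewards tie.
        if reward > best_reward:
            best_prob, best_reward = prob, reward
    if best_prob is None:
        raise ValueError("probs must contain at least one candidate")
    return best_prob, best_reward, rewards


def toy_reward(prob: float) -> float:
    # Toy objective for illustration only: peaks at prob = 0.3.
    return -((prob - 0.3) ** 2)


grid = [i / 10 for i in range(11)]
best_prob, best_reward, rewards = search_best_prob(grid, toy_reward)
print(best_prob)
```

Keeping the full `rewards` mapping alongside the best candidate makes it easy to inspect how sensitive the reward is to the probability, which a bare argmax would hide.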
- save_checkpoint(policy: duo.policies.PPOPolicy, name: str) -> None
Save the current policy configuration and parameters to a checkpoint file.
- Parameters:
policy (duo.policies.PPOPolicy) – The policy whose parameters are to be saved.
name (str) – Name for the checkpoint file.
- Return type:
None
Examples
>>> algorithm.save_checkpoint(policy, "best_test")
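To make the idea of a named checkpoint concrete, here is a small sketch that writes a configuration and a parameter dictionary to a file derived from `name`. Everything about the storage is an assumption made for illustration: the documentation above does not state the file format, extension, or directory, and `save_checkpoint_sketch` is a hypothetical helper that uses `pickle` on plain dictionaries rather than a real policy object.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_checkpoint_sketch(
    config: Dict[str, Any],
    params: Dict[str, Any],
    name: str,
    save_dir: Path,
) -> Path:
    """Write config and params to '<save_dir>/<name>.ckpt' and return the path.

    The '.ckpt' extension and the pickle format are assumptions of this
    sketch, not documented behavior of save_checkpoint.
    """
    path = save_dir / f"{name}.ckpt"
    with path.open("wb") as f:
        pickle.dump({"config": config, "params": params}, f)
    return path


with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = save_checkpoint_sketch(
        {"name": "random"}, {"prob": 0.3}, "best_test", Path(tmp)
    )
    # Read the file back to confirm the round trip.
    with ckpt_path.open("rb") as f:
        restored = pickle.load(f)

print(ckpt_path.name, restored)
```

Storing the configuration next to the parameters means a checkpoint can be reloaded without separately tracking which settings produced it.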