duo_ai.algorithms.random
========================

.. py:module:: duo_ai.algorithms.random


Classes
-------

.. autoapisummary::

   duo_ai.algorithms.random.RandomAlgorithmConfig
   duo_ai.algorithms.random.RandomAlgorithm


Module Contents
---------------

.. py:class:: RandomAlgorithmConfig

   Configuration dataclass for RandomAlgorithm.

   :param name: Name of the algorithm class. Default is "random".
   :type name: str, optional
   :param probs: List of probabilities to search. Default is ``np.arange(0, 1.01, 0.1)``.
   :type probs: list of float, optional

   .. rubric:: Examples

   >>> config = RandomAlgorithmConfig()


   .. py:attribute:: name
      :type:  str
      :value: 'random'


   .. py:attribute:: probs
      :type:  List[float]


.. py:class:: RandomAlgorithm(config: RandomAlgorithmConfig)

   Bases: :py:obj:`duo_ai.core.Algorithm`

   Algorithm that searches for the best probability parameter to maximize evaluation reward.

   .. rubric:: Examples

   >>> algo = RandomAlgorithm(RandomAlgorithmConfig())


   .. py:attribute:: config_cls


   .. py:attribute:: config


   .. py:method:: train(policy: duo.policies.PPOPolicy, env: gym.Env, validators: Dict[str, duo.core.Evaluator]) -> None

      Train the RandomAlgorithm by searching for the best probability parameter that maximizes evaluation reward.

      :param policy: The policy to be evaluated and tuned.
      :type policy: duo.policies.PPOPolicy
      :param env: The environment instance for training and data generation.
      :type env: gym.Env
      :param validators: Dictionary mapping split names to evaluator instances for evaluation.
      :type validators: dict of str to duo.core.Evaluator

      :rtype: None

      .. rubric:: Examples

      >>> algorithm = RandomAlgorithm(RandomAlgorithmConfig())
      >>> algorithm.train(policy, env, validators)


   .. py:method:: save_checkpoint(policy: duo.policies.PPOPolicy, name: str) -> None

      Save the current policy configuration and parameters to a checkpoint file.

      :param policy: The policy whose parameters are to be saved.
      :type policy: duo.policies.PPOPolicy
      :param name: Name for the checkpoint file.
      :type name: str

      :rtype: None

      .. rubric:: Examples

      >>> self.save_checkpoint(policy, "best_test")
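
The search that ``RandomAlgorithm.train`` is described as performing (trying each value in ``probs`` and keeping the one with the highest evaluation reward) can be pictured as a plain grid search. The block below is a minimal sketch under stated assumptions, not the library's implementation: ``search_best_prob`` and ``toy_reward`` are hypothetical names introduced for illustration, the grid is a pure-Python stand-in for the documented default ``np.arange(0, 1.01, 0.1)``, and how ``duo_ai`` actually applies a probability to the policy or combines rewards across validators is not specified on this page.

```python
from typing import Callable, Iterable, Tuple


def search_best_prob(
    probs: Iterable[float],
    evaluate: Callable[[float], float],
) -> Tuple[float, float]:
    """Return (best_prob, best_reward) over a grid of candidate probabilities.

    Hypothetical helper: `evaluate` stands in for whatever evaluation the
    real algorithm runs for a given probability.
    """
    best_prob, best_reward = None, float("-inf")
    for prob in probs:
        reward = evaluate(prob)
        # Strict ">" keeps the earliest candidate when rewards tie.
        if reward > best_reward:
            best_prob, best_reward = prob, reward
    if best_prob is None:
        raise ValueError("probs must contain at least one candidate")
    return best_prob, best_reward


# Pure-Python stand-in for the documented default grid np.arange(0, 1.01, 0.1):
# eleven values from 0.0 to 1.0 in steps of 0.1.
default_probs = [round(i * 0.1, 1) for i in range(11)]


def toy_reward(prob: float) -> float:
    # Hypothetical reward that peaks at prob = 0.3; a real run would
    # query an evaluator instead of a closed-form function.
    return -(prob - 0.3) ** 2


best_prob, best_reward = search_best_prob(default_probs, toy_reward)
print(best_prob)
```

With the toy reward above, the search selects the grid point closest to the peak. In a real run, the point where a new best is found would be a natural place to call ``save_checkpoint``, although this page does not state when the library does so.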