Some libraries, such as ART, support efficient RL training of AI agents using Unsloth: https://docs.unsloth.ai/basics/reinforcement-learning-rl-guide/training-ai-agents-with-rl
Currently we support most dataset-based use cases with GRPO and DPO, but agent training would also be useful!