Reinforcement Learning AI News & Research

📈 Newsletters Towards Data Science 11 min read

Surviving High Uncertainty in Logistics with MARL

Part 2. Building scale-invariant agents that seamlessly change contexts.

#marl #logistics #scheduling

🕐 8 days ago

🍎 AI Labs Apple ML Research 2 min read

PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning

Multi-tool-integrated reasoning enables LLM-empowered tool-use agents to solve complex tasks by interleaving natural-language reasoning with calls to external tools. However, training such agents using outcome-only rewards suffers from credit-assignment ambiguity,…

#llm #tool-use #reinforcement-learning

🕐 9 days ago

📈 Newsletters Towards Data Science 8 min read

Introduction to Approximate Solution Methods for Reinforcement Learning

Learn about function approximation and the different choices for approximation functions.

#reinforcement learning #function approximation #machine learning

🕐 19 days ago

🐻 Research Berkeley AI Research 14 min read

Gradient-based Planning for World Models at Longer Horizons

GRASP is a new gradient-based planner for learned dynamics (a “world model”) that makes long-horizon planning practical by (1) lifting the trajectory into virtual states so optimization is parallel across…

#AI Planning #World Models #Gradient-Based Optimization

🕐 23 days ago

🤗 AI Labs Hugging Face Blog 12 min read

Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents

#reinforcement learning #e-commerce #conversational AI

🕐 27 days ago

🐻 Research Berkeley AI Research 9 min read

RL without TD learning

In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer. Unlike traditional methods, this algorithm is not based on temporal difference…

#reinforcement learning #off-policy RL #algorithm design

🕐 6 months ago

🔄 News Synced Review 8 min read

Can GRPO be 10x Efficient? Kwai AI’s SRPO Suggests Yes with SRPO

Kwai AI's SRPO framework slashes LLM RL post-training steps by 90% while matching DeepSeek-R1 performance in math and code. This two-stage RL approach with history resampling overcomes GRPO limitations. The…

#reinforcement learning #large language models #reasoning models

🕐 1 year, 19 days ago

🔄 News Synced Review 5 min read

DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT

DeepSeek AI, a prominent player in the large language model arena, has recently published a research paper detailing a new technique aimed at enhancing the scalability of general reward models…

#DeepSeek #Large Language Models #Reinforcement Learning

🕐 1 year, 1 month ago

🐻 Research Berkeley AI Research 9 min read

Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment

We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to…

#reinforcement learning #autonomous vehicles #traffic optimization

🕐 1 year, 1 month ago

🔬 Research Distill.pub 42 min read

Understanding RL Vision

With diverse environments, we can analyze, diagnose and edit deep reinforcement learning models using attribution.

#reinforcement learning #interpretability #computer vision

🕐 5 years ago

Reinforcement Learning AI News & Research · DeepTrendLab
