
Conversation

bergio13
Contributor

@bergio13 bergio13 commented Jul 10, 2025

Fixes #648

Initial draft for the integration of reinforcement learning methods within Agents.jl. It is a working sketch, still to be polished, refined, and improved. The file `rl_interface` contains the code that allows training the agents of an ABM using reinforcement learning. Examples of how to use this interface are provided in `rl_interface_examples`. These examples can be compared with the ones implemented without the interface (see `boltzmann_local` and `wolfsheep`) to see how much the interface simplifies the code the user needs to write.

@codecov-commenter

codecov-commenter commented Jul 10, 2025

Codecov Report

❌ Patch coverage is 20.15707% with 305 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.30%. Comparing base (8b5b456) to head (4eb1985).
⚠️ Report is 195 commits behind head on main.

Files with missing lines                            Patch %   Lines
ext/AgentsRL/src/rl_training_functions.jl            0.00%    132 Missing ⚠️
ext/AgentsRL/src/rl_utils.jl                         0.00%    116 Missing ⚠️
ext/AgentsRL/src/step_reinforcement_learning.jl      0.00%     37 Missing ⚠️
src/reinforcement_learning.jl                       79.31%     18 Missing ⚠️
src/Agents.jl                                       77.77%      2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1170      +/-   ##
==========================================
+ Coverage   70.12%   77.30%   +7.18%     
==========================================
  Files          42       42              
  Lines        2718     2992     +274     
==========================================
+ Hits         1906     2313     +407     
+ Misses        812      679     -133     


@bergio13
Contributor Author

bergio13 commented Jul 17, 2025

I am trying to implement the new model type in the following way:

```julia
struct ReinforcementLearningABM{
    S<:SpaceType,
    A<:AbstractAgent,
    C<:Union{AbstractDict{Int,A},AbstractVector{A}},
    T,G,K,F,P,R<:AbstractRNG} <: AgentBasedModel{S}
    # Standard ABM components
    agents::C
    agent_step::G
    model_step::K
    space::S
    scheduler::F
    properties::P
    rng::R
    agents_types::T
    agents_first::Bool
    maxid::Base.RefValue{Int64}
    time::Base.RefValue{Int64}

    # RL-specific components
    rl_config::Base.RefValue{Any}
    trained_policies::Dict{Type,Any}
    training_history::Dict{Type,Vector{Float64}}
    is_training::Base.RefValue{Bool}
end

# Extend mandatory internal API for AgentBasedModel
containertype(::ReinforcementLearningABM{S,A,C}) where {S,A,C} = C
agenttype(::ReinforcementLearningABM{S,A}) where {S,A} = A
discretimeabm(::ReinforcementLearningABM) = true
```

```julia
function ReinforcementLearningABM(
    A::Type,
    space::S=nothing,
    rl_config=nothing;
    agent_step!::G=dummystep,
    model_step!::K=dummystep,
    container::Type=Dict,
    scheduler::F=Schedulers.Randomly(),
    properties::P=nothing,
    rng::R=Random.default_rng(),
    agents_first::Bool=true,
    warn=true,
    kwargs...
) where {S<:SpaceType,G,K,F,P,R<:AbstractRNG}

    # Initialize agent container using proper construction
    agents = construct_agent_container(container, A)
    agents_types = union_types(A)
    T = typeof(agents_types)
    C = typeof(agents)

    model = ReinforcementLearningABM{S,A,C,T,G,K,F,P,R}(
        agents,
        agent_step!,
        model_step!,
        space,
        scheduler,
        properties,
        rng,
        agents_types,
        agents_first,
        Ref(0),
        Ref(0),
        Ref{Any}(rl_config),
        Dict{Type,Any}(),
        Dict{Type,Vector{Float64}}(),
        Ref(false)
    )

    return model
end
```
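
For context, a hypothetical usage sketch of this constructor follows; the agent type, space dimensions, and seed are placeholders for illustration, not part of the PR:

```julia
using Agents, Random

# Placeholder agent type for illustration only.
@agent struct WealthAgent(GridAgent{2})
    wealth::Int
end

# Construct the RL-enabled model much like a StandardABM. rl_config can be
# left as `nothing` and assigned later, since it is stored in a Ref{Any}.
space = GridSpace((10, 10))
model = ReinforcementLearningABM(WealthAgent, space;
    agent_step! = dummystep,
    rng = Xoshiro(42),
)
```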

I am able to create the model, yet when I try to use the `add_agent` function I get a not-implemented error. Does this mean I have to redefine the `add_agent`, `remove_agent`, etc. functions, or extend the model-accessing API functions?

What I mean is the following: is it better to do something like

```julia
const DictRLABM = ReinforcementLearningABM{S,A,<:AbstractDict{<:Integer,A}} where {S,A}
```

or should I try to extend the existing functions like this:

```julia
const DictABM = Union{
    StandardABM{S,A,<:AbstractDict{<:Integer,A}} where {S,A},
    EventQueueABM{S,A,<:AbstractDict{<:Integer,A}} where {S,A},
    ReinforcementLearningABM{S,A,<:AbstractDict{<:Integer,A}} where {S,A},
}
```

@Datseris
Member

You need to make the model a subtype of `AgentBasedModel`.

@bergio13
Contributor Author

bergio13 commented Jul 17, 2025

I thought I was already doing that here:

```julia
struct ReinforcementLearningABM{S<:SpaceType, A<:AbstractAgent, C<:Union{AbstractDict{Int,A},AbstractVector{A}}, T,G,K,F,P,R<:AbstractRNG} <: AgentBasedModel{S}
```

So this is not sufficient?

@Datseris
Member

Ah, sorry, I didn't read this correctly. I am a bit short on time right now, but I can spend some time to help here next weekend, the 26th-27th of July. @Tortar, perhaps you can give some advice before then?

@bergio13
Contributor Author

No worries, I think I figured it out. I went with this solution:

```julia
const DictABM = Union{
    StandardABM{S,A,<:AbstractDict{<:Integer,A}} where {S,A},
    EventQueueABM{S,A,<:AbstractDict{<:Integer,A}} where {S,A},
    ReinforcementLearningABM{S,A,<:AbstractDict{<:Integer,A}} where {S,A},
}
```

But we can discuss this during the next meeting, and if you don't like it I can change it.

@Datseris
Member

Yes, this DictABM design is not optimal, because it is extended by type while it should be extended by functions. Instead of using multiple dispatch to dispatch on models that are Dict-based, we should have a central function with a bunch of if statements that calls `agent_container(model)`, examines its type, and then acts accordingly. Let's go over this in our next meeting. I am up for weekly meetings on Thursdays.
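
The suggested design could look roughly like this (a sketch only: `nextid` is just one example of such a function, `agent_container` is the internal accessor mentioned above, and the branch bodies are illustrative rather than the actual internals):

```julia
# One central function that branches on the container type, instead of
# dispatching on a DictABM/VectorABM union of model types.
function nextid(model::AgentBasedModel)
    container = agent_container(model)
    if container isa AbstractDict
        # Dict-based models track the largest id ever used.
        return getfield(model, :maxid)[] + 1
    elseif container isa AbstractVector
        # Vector-based models use contiguous ids.
        return length(container) + 1
    else
        error("unsupported agent container type: $(typeof(container))")
    end
end
```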

@bergio13
Contributor Author

We have our next meeting scheduled for 24/07 at 11. By the way, is there any chance we could move it later in the day? I might have trouble connecting at that time. Furthermore, there are also some design choices regarding the implementation of the `POMDPs.initialstate()` function that this refactoring from `GeneralRLEnvironment` to `ReinforcementLearningABM` needs to take into account, in my opinion.

@Datseris
Member

> at 11

Which timezone? We can move; give me a time window (and zone) of preference and I'll put a date there, assuming Adriano is also available.

@Tortar
Member

Tortar commented Jul 18, 2025

> Yes, this DictABM design is not optimal, because it is extended by type while it should be extended by functions. Instead of using multiple dispatch to dispatch on models that are Dict-based, we should have a central function with a bunch of if statements that calls `agent_container(model)`, examines its type, and then acts accordingly.

Actually, the best way to solve this is to make `DictABM = AgentBasedModel{S,A,<:AbstractDict{<:Integer,A}} where {S,A}`. For now `AgentBasedModel` doesn't have all those type parameters, but they can be added.
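
A sketch of what that would enable (assuming `AgentBasedModel` gains the agent and container type parameters as proposed; `hasid` is a hypothetical example method, not part of the API):

```julia
# A single alias then covers every Dict-based model at once, whether it is
# a StandardABM, an EventQueueABM, or a ReinforcementLearningABM:
const DictABM = AgentBasedModel{S,A,<:AbstractDict{<:Integer,A}} where {S,A}

# Methods written against the alias apply to all of them, e.g.:
hasid(model::DictABM, id::Integer) = haskey(agent_container(model), id)
```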

@Tortar
Member

Tortar commented Jul 18, 2025

Any hour in the afternoon is okay for me, by the way, so I'll let you decide the hour of our meeting.

@bergio13
Contributor Author

We could make it after lunch; would 2.30pm (CEST) work?

@Datseris
Member

Yes, I'll update the invite.

bergio13 added 3 commits July 22, 2025 14:50
…erface for RL

- Reorganized old interface examples
- Added ReinforcementLearningABM and RLEnvironmentWrapper to enable compatibility with POMDPs-based RL algorithms provided by Crux.
- Implemented necessary POMDPs functions: actions, observations, observation, initialstate, initialobs, gen, isterminal, discount, and state_space.
- Added step_rl! and rl_agent_step! functions for RL agent behavior.
- Added examples to show how the new model type works
@Datseris
Member

Hello, I am sorry but I am on sick leave for two weeks and I cannot meet next week. I am available again on the 11th of August and would be happy to meet that week!

@Datseris
Member

Datseris commented Aug 8, 2025

Hello, I am returning from my sick leave next week. Would you like to have a videocall on Thursday?

@bergio13
Contributor Author

bergio13 commented Aug 11, 2025

> Hello, I am returning from my sick leave next week. Would you like to have a videocall on Thursday?

At what time would you like to meet?

@Datseris
Member

I can do 2pm UK time if that works.

@bergio13
Contributor Author

> I can do 2pm UK time if that works.

That works for me.

@Datseris Datseris marked this pull request as ready for review August 22, 2025 13:35
```julia
target_agent = model[agent_id]
agent_pos = target_agent.pos
width, height = getfield(model, :space).extent
observation_radius = model.rl_config[][:observation_radius]
```
Member

This could be a model property instead of being passed in `rl_config`.
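
A sketch of the suggested change (`MyAgent` and the radius value are placeholders for illustration):

```julia
# Instead of reading the radius out of rl_config:
#   observation_radius = model.rl_config[][:observation_radius]
# store it as a regular model property at construction time...
properties = Dict{Symbol,Any}(:observation_radius => 3)
model = ReinforcementLearningABM(MyAgent, space; properties)

# ...and access it like any other property:
observation_radius = model.observation_radius
```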

```julia
# decreases in the Gini coefficient. This creates an incentive for agents to learn
# movement patterns that promote wealth redistribution.

function boltzmann_calculate_reward(env, agent, action, initial_model, final_model)
```
Member

Probably `env` can be spared, and something like `boltzmann_calculate_reward(agent, action, previous_model, current_model)` could be better.
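
Given the comment in the snippet above (reward tied to decreases in the Gini coefficient), the leaner signature could look like this; a sketch only, with the `gini_coefficient` property name taken from the `properties` dict shown elsewhere in this review:

```julia
# Sketch of the suggested signature: reward the decrease in the Gini
# coefficient between the model state before and after the action.
function boltzmann_calculate_reward(agent, action, previous_model, current_model)
    return previous_model.gini_coefficient - current_model.gini_coefficient
end
```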

Comment on lines +219 to +222
```julia
properties = Dict{Symbol,Any}(
    :gini_coefficient => 0.0,
    :step_count => 0
)
```
Member

These are probably not needed.

@Tortar
Copy link
Member

Tortar commented Aug 22, 2025

I added review comments about what we discussed in the meeting, @bergio13 @Datseris, as well as some other example-code simplifications.

@Tortar
Member

Tortar commented Aug 23, 2025

I have taken care of most of these review items myself. I've also generated these videos showing how random agents differ from RL agents in our example:

boltzmann.mp4
rl_boltzmann.mp4

RL agents are clearly smarter :-)

Great work Giorgio!

Member

@Tortar Tortar left a comment

I verified locally that the documentation works; only the videos are missing at the moment, because they need to be included with HTML code after being produced. Apart from that, I think the PR is in pretty good shape, so I will approve it. Before merging, though, it would probably be useful if @Datseris had a final look at it (and the few open review comments were tackled).

Co-authored-by: George Datseris <datseris.george@gmail.com>
@Datseris
Member

Thanks a lot for your work, both @bergio13 and @Tortar. I agree that this looks great, and it is very close to finished! However, I would really like to have a final in-depth look before that. The only problem is that I am currently under a lot of pressure from my main job and therefore lack time. I will try to work on this over the coming weekend, if that's okay. Here and there during the evenings I will be adding comments to the review (you won't see them until I submit).

The PR is approved and as far as I can tell @bergio13 had a great GSOC project!

@bergio13
Contributor Author

> I have taken care of most of these review items myself. I've also generated these videos showing how random agents differ from RL agents in our example.
>
> RL agents are clearly smarter :-)
>
> Great work Giorgio!

> Thanks a lot for your work, both @bergio13 and @Tortar. I agree that this looks great, and it is very close to finished! [...] The PR is approved and as far as I can tell @bergio13 had a great GSOC project!

Thanks to you and to @Tortar for your help throughout this GSoC!


Successfully merging this pull request may close these issues.

Multi-Agent RL