GSoC: Integration of Agents.jl with RL methods #1170
Conversation
Codecov Report — Coverage Diff:

```
##             main    #1170      +/-   ##
==========================================
+ Coverage   70.12%   77.30%   +7.18%
==========================================
  Files          42       42
  Lines        2718     2992     +274
==========================================
+ Hits         1906     2313     +407
+ Misses        812      679     -133
```
I am trying to implement the new model type in the following way:

```julia
struct ReinforcementLearningABM{
    S<:SpaceType,
    A<:AbstractAgent,
    C<:Union{AbstractDict{Int,A},AbstractVector{A}},
    T,G,K,F,P,R<:AbstractRNG} <: AgentBasedModel{S}
    # Standard ABM components
    agents::C
    agent_step::G
    model_step::K
    space::S
    scheduler::F
    properties::P
    rng::R
    agents_types::T
    agents_first::Bool
    maxid::Base.RefValue{Int64}
    time::Base.RefValue{Int64}
    # RL-specific components
    rl_config::Base.RefValue{Any}
    trained_policies::Dict{Type,Any}
    training_history::Dict{Type,Vector{Float64}}
    is_training::Base.RefValue{Bool}
end

# Extend mandatory internal API for AgentBasedModel
containertype(::ReinforcementLearningABM{S,A,C}) where {S,A,C} = C
agenttype(::ReinforcementLearningABM{S,A}) where {S,A} = A
discretimeabm(::ReinforcementLearningABM) = true

function ReinforcementLearningABM(
    A::Type,
    space::S=nothing,
    rl_config=nothing;
    agent_step!::G=dummystep,
    model_step!::K=dummystep,
    container::Type=Dict,
    scheduler::F=Schedulers.Randomly(),
    properties::P=nothing,
    rng::R=Random.default_rng(),
    agents_first::Bool=true,
    warn=true,
    kwargs...
) where {S<:SpaceType,G,K,F,P,R<:AbstractRNG}
    # Initialize agent container using proper construction
    agents = construct_agent_container(container, A)
    agents_types = union_types(A)
    T = typeof(agents_types)
    C = typeof(agents)
    model = ReinforcementLearningABM{S,A,C,T,G,K,F,P,R}(
        agents,
        agent_step!,
        model_step!,
        space,
        scheduler,
        properties,
        rng,
        agents_types,
        agents_first,
        Ref(0),
        Ref(0),
        Ref{Any}(rl_config),
        Dict{Type,Any}(),
        Dict{Type,Vector{Float64}}(),
        Ref(false)
    )
    return model
end
```

I am able to create the model, yet when I try to use the function … What I mean is the following: is it better to do something like …
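For reference, the subtype-and-dispatch pattern under discussion can be reproduced in plain Julia without Agents.jl. The sketch below is illustrative only: `MiniABM`, `MiniRLABM`, and this `containertype` are stand-in names, not the actual Agents.jl API.

```julia
# Minimal sketch: a custom model type subtypes the abstract model type and
# extends the "mandatory internal API" by dispatching on its own type.
abstract type MiniABM{S} end

struct MiniRLABM{S,A,C} <: MiniABM{S}
    agents::C
    space::S
end

# Generic fallback errors unless a concrete model type extends the function:
containertype(::MiniABM) = error("containertype not implemented")
# The RL model type provides its own method via dispatch:
containertype(::MiniRLABM{S,A,C}) where {S,A,C} = C

model = MiniRLABM{Nothing,Int,Dict{Int,Int}}(Dict{Int,Int}(), nothing)
containertype(model)  # returns Dict{Int,Int}
```

The key point is that methods like `containertype` are resolved by the model's concrete type, so a new model type only needs to subtype `AgentBasedModel` and extend the required functions.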
you need to make the model a subtype of …

I thought I was already doing it here: … So this is not sufficient?

Ah, sorry, I didn't see this correctly. I am a bit short on time right now, but I can spend some time to help here next weekend, 26-27 of July. @Tortar, perhaps you can give some advice before that?

No worries, I think I figured it out. I went with this solution: …

Yes, this …
We have our next meeting scheduled on 24/07 at 11. By the way, is there any chance we can move it later in the day? I might have some problems connecting at that time. Furthermore, there are also some design choices regarding the implementation of the …

Which timezone? We can move it; give me a time window (and zone) of preference and I'll put a date there, assuming Adriano is also available.

Actually, the best way to solve this is to make …

For me any hour in the afternoon is okay, by the way, so I'll let you decide on the time of our meeting.

We could make it after lunch; would 2.30pm (CEST) work?

Yes, I updated the invite.
…erface for RL
- Reorganized old interface examples
- Added ReinforcementLearningABM and RLEnvironmentWrapper to enable compatibility with POMDPs-based RL algorithms provided by Crux
- Implemented necessary POMDPs functions: actions, observations, observation, initialstate, initialobs, gen, isterminal, discount, and state_space
- Added step_rl! and rl_agent_step! functions for RL agent behavior
- Added examples to show how the new model type works
Hello, I am sorry but I am on sick leave for two weeks and I cannot meet next week. I am available again on the 11th of August and would be happy to meet that week!

Hello, I am returning from my sick leave next week; would you like to have a videocall on Thursday?

At what time would you like to meet?

I can do 2pm UK time if that works.

For me it works.
```julia
target_agent = model[agent_id]
agent_pos = target_agent.pos
width, height = getfield(model, :space).extent
observation_radius = model.rl_config[][:observation_radius]
```
This could be a model property instead of being passed in `rl_config`.
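Wherever the radius ends up living, the observation itself is just a window of positions around the agent, wrapped at the grid edges. A self-contained sketch of that computation (the helper name and signature are illustrative, not Agents.jl API):

```julia
# Collect all grid positions within `radius` of `pos` on a periodic grid of
# size `extent`, wrapping 1-based coordinates around the torus with mod1.
function observation_window(pos::NTuple{2,Int}, radius::Int, extent::NTuple{2,Int})
    w, h = extent
    cells = NTuple{2,Int}[]
    for dx in -radius:radius, dy in -radius:radius
        x = mod1(pos[1] + dx, w)  # wraps e.g. 0 -> w and w+1 -> 1
        y = mod1(pos[2] + dy, h)
        push!(cells, (x, y))
    end
    return cells
end

observation_window((1, 1), 1, (10, 10))  # 9 cells; neighbors wrap to row/column 10
```

With a radius of `r` this always yields a `(2r+1)^2` cell window, which keeps the observation vector a fixed size regardless of where the agent stands.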
```julia
# decreases in the Gini coefficient. This creates an incentive for agents to learn
# movement patterns that promote wealth redistribution.

function boltzmann_calculate_reward(env, agent, action, initial_model, final_model)
```
Probably `env` can be spared, and something like `boltzmann_calculate_reward(agent, action, previous_model, current_model)` could be better.
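Since the comment above describes the reward as rewarding decreases in the Gini coefficient, the core of such a reward can be sketched without Agents.jl. Here `gini` and the plain wealth vectors are illustrative stand-ins for extracting agent wealths from the two model snapshots:

```julia
# Gini coefficient of a non-negative wealth vector (one standard formulation).
function gini(w::Vector{<:Real})
    n = length(w)
    s = sort(w)
    total = sum(s)
    total == 0 && return 0.0  # perfectly equal (all-zero) wealths
    # G = 2 * Σᵢ i*sᵢ / (n * Σᵢ sᵢ) - (n + 1) / n, with s sorted ascending
    return 2 * sum(i * s[i] for i in 1:n) / (n * total) - (n + 1) / n
end

# Reward = how much inequality dropped between the previous and current snapshot.
reward(prev_wealths, curr_wealths) = gini(prev_wealths) - gini(curr_wealths)

reward([1.0, 1.0, 4.0], [2.0, 2.0, 2.0])  # positive: inequality decreased
```

A reward of this shape is zero for actions that leave the wealth distribution unchanged, positive when redistribution occurs, and negative when wealth concentrates, which matches the incentive described in the code comment.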
```julia
properties = Dict{Symbol,Any}(
    :gini_coefficient => 0.0,
    :step_count => 0
)
```
These are probably not needed.
I have taken care of most of these review items myself. I've also generated videos (boltzmann.mp4, rl_boltzmann.mp4) showing how random agents differ from RL agents in our example. RL agents are clearly smarter :-) Great work Giorgio!
I verified locally that the documentation works; only the videos are lacking at the moment, because they need to be included with HTML code after being produced. Apart from that, I think the PR is in pretty good shape, so I will approve it. Before merging, though, it would probably be useful if @Datseris had a final look at it (and the few open review comments were tackled).
Co-authored-by: George Datseris <datseris.george@gmail.com>
Thanks a lot for your work, both @bergio13 and @Tortar. I agree that this looks great, and it is very close to finished! However, I would really like to have a final in-depth look before that. The only problem is that I am currently under a lot of pressure from my main job and therefore lack time. I will try to work on this over the coming weekend, if that's okay. Here and there during the evenings I will be adding comments to the review (you won't see them until I submit). The PR is approved, and as far as I can tell @bergio13 had a great GSoC project!

Thanks to you and to @Tortar for your help throughout this GSoC!
Fixes #648
Initial draft for the integration of reinforcement learning methods within Agents.jl. It is a working sketch, still to be polished, refined, and improved. The file `rl_interface` contains the code that allows training the agents of an ABM using reinforcement learning. Examples of how to use this interface are provided in `rl_interface_examples`. These examples can be compared with the ones implemented without the interface (see `boltzmann_local` and `wolfsheep`) to see how much the interface simplifies the code the user needs to implement.