Hello everybody,

I am trying to replicate the results of the paper *Inequity aversion improves cooperation in intertemporal social dilemmas* using MeltingPot. According to the tutorial I have been following, the reward that the agent receives is specified by the `Edible` component. Additionally, in `self_play.py` I notice that RLlib handles the training completely in the `trainer.train()` method. Therefore, I am not sure of the best way to implement advantageous and disadvantageous inequity aversion, since they need to punish or reward agents at each time step based on the rewards the other agents have received. Should I implement it in Lua, in a new substrate?

Additionally, when I run `python3 meltingpot/python/human_players/play_clean_up.py --observation WORLD.RGB`, I notice that there are no apples in the field. Is this normal? Will they appear as the agents clean the river?
Thanks in advance for your help and for providing this library!
Best regards.
I would implement the inequity aversion reward function as part of the agent, not the environment. If you use `self_play.py`, that would then mean using RLlib to implement it.
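Roughly, the subjective reward in that paper is computed from temporally smoothed rewards of all players. Here is a minimal sketch of how that could look on the agent side; the function name, the `alpha`/`beta` values, and the smoothing constants are illustrative placeholders, not anything shipped with MeltingPot or RLlib:

```python
import numpy as np

def inequity_aversion_rewards(rewards, smoothed, alpha=5.0, beta=0.05,
                              gamma=0.99, lam=0.95):
    """Fehr-Schmidt-style inequity aversion (sketch, hypothetical helper).

    rewards:  extrinsic rewards all N agents received this step.
    smoothed: temporally smoothed rewards carried over from the
              previous step (initialize to zeros).
    Returns the subjective rewards and the updated smoothed vector.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    # Temporal smoothing: e_j(t) = gamma * lam * e_j(t-1) + r_j(t).
    smoothed = gamma * lam * np.asarray(smoothed, dtype=np.float64) + rewards
    n = len(rewards)
    subjective = np.empty(n)
    for i in range(n):
        others = np.delete(smoothed, i)
        envy = np.maximum(others - smoothed[i], 0.0).sum()   # disadvantageous
        guilt = np.maximum(smoothed[i] - others, 0.0).sum()  # advantageous
        subjective[i] = (rewards[i]
                         - alpha * envy / (n - 1)
                         - beta * guilt / (n - 1))
    return subjective, smoothed
```

With RLlib you could call something like this from a callback that sees all agents' rewards each step, or wrap the multi-agent environment and rewrite the reward dict before it reaches the trainer, so no Lua or substrate changes should be needed.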
As for clean_up, yes, it's normal that there are no apples present at the start. You have to clean the river to get them to appear. The growth settings were chosen so that it's easiest to get lots of apples when two players clean at the same time. But if you move quickly with the human player, you can still get it to work alone.
Good luck! Let us know how it goes. We're always happy to discuss.