An environment implemented with Q-learning in which two populations each try to maximize their own growth and conquer the entire grid

mridul-g/Neuroeconomics


Neuroeconomics

Submission by Aryans - Team 1

  • Rishav Bikarwar - 200792
  • Aditya Bangar - 210069
  • Shubham Patel - 210709
  • Mridul Gupta - 220672
  • Sagar Arora - 220933

Parameters:

  • Environment Parameters

    • HEIGHT : size parameter
    • WIDTH : size parameter
    • INITIAL_HELPUL_POP : initial number of helpful macpens
    • INITIAL_TITTAT_POP : initial number of tittat macpens
    • INITIAL_UNHELPFUL_POP : initial number of unhelpful macpens
    • MOVEMENTS_PER_DAY : the number of movements per day
    • CANTEEN_COUNT : the number of canteens in our world
    • CANTEED_FOOD_PER_PERSON : the amount of food the canteen will give to a particular person
    • CANTEEN_DAILY_FOOD_LIMIT : the daily limit of a given canteen
    • CANTEEN_REAPPEAR_TIME : this parameter is required in case we want to change the positions of the canteens
    • GHOST_VAL : the amount of food the ghost gang takes away
    • FOOD_THRESHOLD_FOR_REPRODUCTION : reproduction threshold of the macpens
    • FOOD_COST_PER_MOVE : the amount of food lost in movement
    • MIN_START_FOOD : the least amount of food each macpen will have at the beginning
  • Q-Learning Parameters

    • LEARNING_RATE
    • START_EXPLORATION_PROB
    • MIN_EXPLORATION_PROB
    • DISCOUNT
    • EXPL_RATE_DECAY
    • PRODUCTION_REWARD
    • VISION_RADIUS
    • VISION_MOVEMENT_PROB

The behaviour of the world, and our whole analysis of it, depends on how these parameters are varied.
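For reference, the parameters above can be collected into a single configuration mapping. The key names mirror the list above (including the repo's own spellings such as `INITIAL_HELPUL_POP` and `CANTEED_FOOD_PER_PERSON`), but the values below are illustrative placeholders, not the ones used in our experiments:

```python
# Illustrative configuration; names follow the parameter list above,
# values are placeholders chosen only to make the sketch runnable.
CONFIG = {
    # Environment parameters
    "HEIGHT": 20,
    "WIDTH": 20,
    "INITIAL_HELPUL_POP": 10,
    "INITIAL_TITTAT_POP": 10,
    "INITIAL_UNHELPFUL_POP": 10,
    "MOVEMENTS_PER_DAY": 50,
    "CANTEEN_COUNT": 4,
    "CANTEED_FOOD_PER_PERSON": 5,
    "CANTEEN_DAILY_FOOD_LIMIT": 100,
    "CANTEEN_REAPPEAR_TIME": 10,
    "GHOST_VAL": 5,
    "FOOD_THRESHOLD_FOR_REPRODUCTION": 30,
    "FOOD_COST_PER_MOVE": 1,
    "MIN_START_FOOD": 10,
    # Q-learning parameters
    "LEARNING_RATE": 0.1,
    "START_EXPLORATION_PROB": 1.0,
    "MIN_EXPLORATION_PROB": 0.05,
    "DISCOUNT": 0.95,
    "EXPL_RATE_DECAY": 0.999,
    "PRODUCTION_REWARD": 10,
    "VISION_RADIUS": 3,
    "VISION_MOVEMENT_PROB": 0.8,
}
```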

The world is visualised in a pygame window:

  • red blocks = unhelpful
  • green blocks = helpful
  • yellow blocks = tittat
  • blue blocks = canteens

[Screenshot: pygame visualisation of the grid world]

Our approach

Movement Policy

  • We follow a Q-learning approach to decide each macpen's movement action.
  • A macpen is rewarded for actions that lead to it having more food, and at the moment it reproduces.
  • Each of these actions is associated with a fixed amount of reward.
  • Exploration vs. exploitation is balanced by an exploration probability that decays over time but never falls below a set minimum.
  • Macpens also have a VISION attribute, which lets them spot nearby canteens; movement along the visual cue happens only with a certain probability, so a macpen does not always follow the cue and keeps exploring.
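The movement policy above can be sketched as a tabular Q-learning agent with a decaying, floored exploration probability. This is a minimal illustration, not the repo's implementation; the state encoding, action set, and parameter values are assumptions:

```python
import random
from collections import defaultdict

# Q-learning parameters from the list above; values are illustrative.
LEARNING_RATE = 0.1
DISCOUNT = 0.95
START_EXPLORATION_PROB = 1.0
MIN_EXPLORATION_PROB = 0.05
EXPL_RATE_DECAY = 0.999

ACTIONS = ["up", "down", "left", "right", "stay"]  # assumed action set

class QAgent:
    def __init__(self):
        self.q = defaultdict(float)          # (state, action) -> Q value
        self.epsilon = START_EXPLORATION_PROB

    def choose(self, state):
        # Epsilon-greedy: explore with probability epsilon, which decays
        # each step but is floored at MIN_EXPLORATION_PROB, so the agent
        # never stops exploring entirely.
        if random.random() < self.epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: self.q[(state, a)])
        self.epsilon = max(MIN_EXPLORATION_PROB, self.epsilon * EXPL_RATE_DECAY)
        return action

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update toward the TD target.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + DISCOUNT * best_next
        self.q[(state, action)] += LEARNING_RATE * (td_target - self.q[(state, action)])
```

Rewards such as gaining food or reproducing would be fed in through `update` after each movement step.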

Interaction Policy

  • The interaction policy considers every pair of macpens sharing a grid cell; a macpen gives the other only enough food for it to just reach the survival threshold, which is just above GHOST_VAL.
  • If a macpen is of type TITTAT, it counts the total food it received when it was in need, and also tracks the times it needed food and got none; the balance of the two determines whether the tittat macpen behaves helpfully or unhelpfully.
  • A helpful macpen always donates; an unhelpful macpen never donates, as specified in the problem statement.
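The donation rule and the tittat decision can be sketched as below. The exact survival threshold and the tittat tie-breaking rule are assumptions ("just more than GHOST_VAL" is taken as GHOST_VAL + 1, and a non-negative balance is taken as helpful):

```python
GHOST_VAL = 5                         # illustrative value
SURVIVAL_THRESHOLD = GHOST_VAL + 1    # assumed: "just more than GHOST_VAL"

def donation_amount(donor_food, receiver_food):
    """Food the donor gives so the receiver just reaches the survival
    threshold, without the donor dropping below it either."""
    need = max(0, SURVIVAL_THRESHOLD - receiver_food)
    spare = max(0, donor_food - SURVIVAL_THRESHOLD)
    return min(need, spare)

def tittat_is_helpful(times_helped, times_denied):
    # A TITTAT macpen tallies help received vs. help denied; it acts
    # helpfully when the balance is non-negative (assumed tie-breaking).
    return times_helped - times_denied >= 0
```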

Some interesting results from our analysis

  • Colonies usually form around the canteens, because that is where food is most accessible; macpens reproduce on the spot, so the population grows there.

    • This mirrors real-world scenarios, where populations tend to grow near food sources and fertile land.
  • Colonies of unhelpful and tittat, or of helpful and tittat, generally co-exist; colonies of helpful and unhelpful are rarely found to co-exist, and even when they do, the helpful ones eventually die out.
