- Rishav Bikarwar - 200792
- Aditya Bangar - 210069
- Shubham Patel - 210709
- Mridul Gupta - 220672
- Sagar Arora - 220933
-
Environment Parameters
- HEIGHT : height of the grid world
- WIDTH : width of the grid world
- INITIAL_HELPUL_POP : initial number of helpful macpens
- INITIAL_TITTAT_POP : initial number of tittat macpens
- INITIAL_UNHELPFUL_POP : initial number of unhelpful macpens
- MOVEMENTS_PER_DAY : the number of movements per day
- CANTEEN_COUNT : the number of canteens in our world
- CANTEED_FOOD_PER_PERSON : the amount of food a canteen gives to a single macpen
- CANTEEN_DAILY_FOOD_LIMIT : the maximum amount of food a canteen can give out per day
- CANTEEN_REAPPEAR_TIME : the time after which canteens reappear; only needed if we want the canteens to change position
- GHOST_VAL : the amount of food the ghost gang takes away
- FOOD_THRESHOLD_FOR_REPRODUCTION : the amount of food a macpen needs in order to reproduce
- FOOD_COST_PER_MOVE : the amount of food a macpen loses per move
- MIN_START_FOOD : the least amount of food each macpen will have at the beginning
-
Q-Learning Parameters
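- LEARNING_RATE : step size of the Q-value update
- START_EXPLORATION_PROB : exploration probability at the start of training
- MIN_EXPLORATION_PROB : lower bound on the exploration probability
- DISCOUNT : discount factor applied to future rewards
- EXPL_RATE_DECAY : rate at which the exploration probability decays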
- PRODUCTION_REWARD
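- VISION_RADIUS : how far a macpen can see when looking for canteens
- VISION_MOVEMENT_PROB : probability that a macpen follows what it sees instead of exploring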
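
As a rough illustration of how the parameters above typically fit together in tabular Q-learning, here is a minimal Python sketch. The numeric values, the `ACTIONS` list, the multiplicative decay schedule, and all function names are assumptions made for illustration; they are not taken from the project code.

```python
import random
from collections import defaultdict

# Placeholder values for illustration only, NOT the project's settings.
LEARNING_RATE = 0.1
DISCOUNT = 0.9
START_EXPLORATION_PROB = 1.0
MIN_EXPLORATION_PROB = 0.05
EXPL_RATE_DECAY = 0.99

ACTIONS = ["up", "down", "left", "right"]  # assumed action set


def make_q_table():
    """Q-table mapping state -> {action: value}; unseen states start at 0.0."""
    return defaultdict(lambda: {a: 0.0 for a in ACTIONS})


def choose_action(q_table, state, exploration_prob, rng=random):
    """Epsilon-greedy: explore with probability exploration_prob, else exploit."""
    if rng.random() < exploration_prob:
        return rng.choice(ACTIONS)
    values = q_table[state]
    return max(values, key=values.get)


def update_q(q_table, state, action, reward, next_state):
    """Standard tabular Q-learning update."""
    best_next = max(q_table[next_state].values())
    td_target = reward + DISCOUNT * best_next
    q_table[state][action] += LEARNING_RATE * (td_target - q_table[state][action])


def decay_exploration(exploration_prob):
    """Multiplicative decay (assumed schedule), clamped at MIN_EXPLORATION_PROB."""
    return max(MIN_EXPLORATION_PROB, exploration_prob * EXPL_RATE_DECAY)
```

In a training loop one would start from `START_EXPLORATION_PROB`, call `choose_action` and `update_q` on every move, and call `decay_exploration` once per step or per day; which of the two the project uses is not stated here.
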
red blocks = unhelpful
green blocks = helpful
yellow blocks = tittat
blue blocks = canteens
- we use Q-learning to choose the movement action of each macpen.
- a macpen is rewarded for actions that lead to it having more food, and at the moment it reproduces.
- each of these events is associated with a certain amount of reward.
- the exploration vs. exploitation trade-off is handled by giving the exploration probability a minimum value (MIN_EXPLORATION_PROB), below which it can never go.
- each macpen also has a VISION attribute, which lets it look around for canteens. It follows what it sees only with a certain probability, so that it doesn't always follow the visual cue but also keeps exploring.
- the interaction policy considers combinations of all the macpens in a grid cell. A donating macpen gives the other macpen only enough food to bring it up to the survival threshold, which is just above GHOST_VAL.
- if a macpen is of type TITTAT, it keeps count of the total food it received when it was in need, and also keeps track of the times it needed food and didn't get it. The overall sum of these determines whether the tittat macpen behaves helpfully or unhelpfully.
- the helpful macpen always donates and the unhelpful macpen never donates, as specified in the question.
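
The interaction rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Macpen` class, the `goodwill` field, the value of `GHOST_VAL`, the choice of `SURVIVAL_THRESHOLD = GHOST_VAL + 1`, the rule that a donor never drops below the threshold itself, and the way refusals are subtracted from the tittat score are all hypothetical and may differ from the actual code.

```python
from dataclasses import dataclass
from itertools import combinations

GHOST_VAL = 5                       # placeholder value
SURVIVAL_THRESHOLD = GHOST_VAL + 1  # assumption: "just more than GHOST_VAL"


@dataclass
class Macpen:
    kind: str              # "helpful", "unhelpful" or "tittat"
    food: float
    goodwill: float = 0.0  # tittat only: food received in need minus food refused

    def willing_to_donate(self):
        if self.kind == "helpful":
            return True
        if self.kind == "unhelpful":
            return False
        # Assumed tittat rule: behave helpfully while the running sum is non-negative.
        return self.goodwill >= 0


def donate(donor, receiver):
    """Give the receiver just enough food to reach the survival threshold."""
    need = SURVIVAL_THRESHOLD - receiver.food
    if need <= 0:
        return 0.0  # receiver is not in need
    surplus = donor.food - SURVIVAL_THRESHOLD  # assumption: donor stays at threshold
    if donor.willing_to_donate() and surplus >= need:
        donor.food -= need
        receiver.food += need
        if receiver.kind == "tittat":
            receiver.goodwill += need  # remember help received
        return need
    if receiver.kind == "tittat":
        receiver.goodwill -= need      # remember help that was needed but not given
    return 0.0


def interact_cell(macpens):
    """Let every pair of macpens in one grid cell interact in both directions."""
    for a, b in combinations(macpens, 2):
        donate(a, b)
        donate(b, a)
```

For example, a helpful macpen holding 20 food that shares a cell with a macpen holding 3 food would hand over 3 units under these assumptions, leaving them at 17 and 6.
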