Let us consider N drones.
To solve the routing problem, we used Q-Learning, an off-policy temporal-difference (TD) control algorithm from reinforcement learning. The following approaches were considered:
- greedy (with exploration in the early stages)
- greedy (without exploration in the early stages)
- best action (with exploration in the early stages)
- best action (without exploration in the early stages)
- Q-FANET
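
To illustrate what "off-policy TD control" with optional early-stage exploration can look like, here is a minimal tabular Q-Learning sketch. It is not the code used in this project: the class name `QLearningRouter`, the parameters `alpha`, `gamma`, `epsilon`, and `explore_steps`, and the idea of treating candidate relay drones as actions are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class QLearningRouter:
    """Hypothetical tabular Q-Learning agent; states and actions are any hashable values."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.2, explore_steps=100, seed=0):
        self.alpha = alpha                  # learning rate
        self.gamma = gamma                  # discount factor
        self.epsilon = epsilon              # exploration probability in the early stage
        self.explore_steps = explore_steps  # length of the early stage (0 disables exploration)
        self.steps = 0
        self.q = defaultdict(float)         # Q[(state, action)], defaults to 0.0
        self.rng = random.Random(seed)

    def choose(self, state, actions):
        """Explore at random during the early stage, otherwise pick the best-known action."""
        self.steps += 1
        if self.steps <= self.explore_steps and self.rng.random() < self.epsilon:
            return self.rng.choice(actions)
        return max(actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, next_actions):
        """Off-policy TD update: bootstrap from the best next action, not the one taken."""
        best_next = max((self.q[(next_state, a)] for a in next_actions), default=0.0)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

Setting `explore_steps=0` gives a purely greedy agent, while a positive value allows random relay choices only during the first decisions. How states, actions, and rewards map onto the simulator's drones and packets is left open here.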
The simulator used for the experiments is available at https://github.com/flaat/DroNETworkSimulator/
To try the solutions, place the routing algorithms in this folder: https://github.com/flaat/DroNETworkSimulator/tree/main/src/routing_algorithms