My own implementation of the Counterfactual Multi-Agent Policy Gradients (COMA) algorithm.
The code is written as a research implementation rather than for an industry use case: it is meant to be read alongside the accompanying explanation.
I have written up some analysis of COMA on my personal website. It covers COMA's performance, the innovations the paper introduced, and an error in the ma-gym wiki entry for the combat game:
https://rdanks.wixsite.com/raydanks/post/counter-factual-multi-agent-policy-gradients-coma