Description of the problem
Since RL aims to solve MDPs (Markov Decision Processes), our first task should be to decide on their representation. It should be designed so that RL algorithms can easily use these representations to find optimal or sub-optimal solutions.
MDPs have the following elements:
State
Actions
Transition Probabilities
Transition Rewards
Policy
Performance Metric
SMDPs (Semi-Markov Decision Processes) have an additional element: the time of transition.
How can each of the above elements be represented? One idea is to use a class that encapsulates them, as sketched below.
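To make the class idea concrete, here is a minimal sketch in Python (assuming Python is the project's language; all class and attribute names below are illustrative, not a proposed API). The policy and performance metric are left out of the container on the assumption that the policy is the solver's output rather than part of the problem definition, with a discount factor standing in for the performance-metric element.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (s, a, s') triple used as the key for transition-indexed quantities
Transition = Tuple[str, str, str]

@dataclass
class MDP:
    """Container bundling the elements of a Markov Decision Process."""
    states: List[str]                 # finite set of states
    actions: List[str]                # finite set of actions
    # P[(s, a, s')] -> probability of reaching s' from s under action a
    transition_probabilities: Dict[Transition, float] = field(default_factory=dict)
    # R[(s, a, s')] -> immediate reward for that transition
    transition_rewards: Dict[Transition, float] = field(default_factory=dict)
    # discount factor; stands in for the performance-metric element here (assumption)
    gamma: float = 0.95

@dataclass
class SMDP(MDP):
    """Semi-Markov extension: adds the time spent in each transition."""
    # T[(s, a, s')] -> (expected) time of transition
    transition_times: Dict[Transition, float] = field(default_factory=dict)
```

A usage example with a toy two-state problem, to show how an RL algorithm would receive a single object instead of loose dictionaries:

```python
toy = MDP(
    states=["s0", "s1"],
    actions=["stay", "move"],
    transition_probabilities={("s0", "move", "s1"): 1.0, ("s1", "stay", "s1"): 1.0},
    transition_rewards={("s0", "move", "s1"): 1.0, ("s1", "stay", "s1"): 0.0},
)
```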
Example of the problem
References/Other comments