Switched from RLInterface to CommonRLInterface #53

zsunberg · 2020-12-13T04:15:41Z

I switched from RLInterface to CommonRLInterface, so we should be able to register this. It is also almost possible to use the package with a CommonRLInterface.AbstractEnv, but not quite (see #51 and #52). I am not sure if there are any performance regressions.

rejuvyesh

Looks good to me except for some small issues I have pointed out.

rejuvyesh · 2020-12-14T23:11:25Z

src/evaluation_policy.jl

@@ -7,25 +7,28 @@ Interface for defining an evaluation policy
    returns the average reward of the current policy, the user can specify its own function 
    f to carry the evaluation, we provide a default basic_evaluation that is just a rollout. 
 """
-function evaluation(f::Function, policy::AbstractNNPolicy, env::AbstractEnvironment, n_eval::Int64, max_episode_length::Int64, verbose::Bool = false)
+function evaluation(f::Function, policy::AbstractNNPolicy, env::AbstractEnv, n_eval::Int64, max_episode_length::Int64, verbose::Bool = false)


How did ^M come in? Likely a vim issue?

Yeah, I am not sure how the ^M got in. I will correct it. Thanks.
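For context, `^M` is how many editors and diff viewers display a carriage return (`\r`), which appears when a file is saved with Windows-style CRLF line endings. The following is only a rough sketch of the kind of normalization a tool such as dos2unix performs; `has_crlf` and `normalize_newlines` are hypothetical helper names and are not part of this package.

```julia
# Hypothetical helpers (illustration only, not code from this PR).

# True if the text contains any Windows-style CRLF line ending.
has_crlf(text::AbstractString) = occursin("\r\n", text)

# Replace every CRLF pair with a single LF.
normalize_newlines(text::AbstractString) = replace(text, "\r\n" => "\n")

sample = "function f()\r\n    return 1\r\nend\r\n"
fixed = normalize_newlines(sample)

println(has_crlf(sample))  # the sample contains carriage returns
println(has_crlf(fixed))   # the normalized text does not
```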

rejuvyesh · 2020-12-14T23:13:35Z

src/solver.jl

    return solve(solver, env)
 end

 function POMDPs.solve(solver::DeepQLearningSolver, problem::POMDP)
-    env = POMDPEnvironment(problem, rng=solver.rng)
+    env = POMDPCommonRLEnv{AbstractArray{Float32}}(problem) # ignores solver.rng because CommonRLEnv doesn't have rng support yet


Can't we get any more concrete type information than AbstractArray{Float32}?

I think AbstractArray{Float32} is what we want here. It just means that convert_o(AbstractArray{Float32}, o, pomdp) will be called on every observation. That way problem-writers can use static arrays or built-in arrays. If the problem implementation is type-stable, the compiler should still be able to infer a nice concrete return type.
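A minimal sketch of why an abstract target type can still give concrete results, using only Base Julia. `to_observation` is a hypothetical stand-in for the conversion step and is not the actual `convert_o` implementation; the point is only that converting to `AbstractArray{Float32}` leaves an array that already has `Float32` elements untouched and otherwise produces a concrete `Float32` array.

```julia
# Hypothetical stand-in for the conversion step (assumption: it behaves like
# Base.convert for array observations; the real convert_o may differ).
to_observation(::Type{T}, o) where {T} = convert(T, o)

o64 = [0.5, 1.5]          # Vector{Float64}, as a problem-writer might return
o32 = Float32[0.5, 1.5]   # already a Vector{Float32}

a = to_observation(AbstractArray{Float32}, o64)
b = to_observation(AbstractArray{Float32}, o32)

println(typeof(a))   # a concrete array type with Float32 elements
println(b === o32)   # an array that already matches is passed through as-is
```

Because the result type depends only on the input type, a type-stable problem implementation lets the compiler infer a concrete observation type even though the requested type is abstract.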

zsunberg · 2020-12-22T00:39:43Z

@MaximeBouton do you have any comments on or concerns about this?

rejuvyesh

LGTM!

MaximeBouton · 2021-01-06T14:55:40Z

Sorry I took a lot of time off :)
Reviewing this right now

MaximeBouton · 2021-01-06T14:59:27Z

That looks good! We can merge it. Should we tag a version 0.6 and register?

zsunberg · 2021-01-07T22:57:20Z

Yes, I think we can register! Go ahead, or I can do it in the next few days.

switched from RLInterface to CommonRLInterface

2824f38

zsunberg requested review from rejuvyesh and MaximeBouton December 13, 2020 04:15

rejuvyesh requested changes Dec 14, 2020

used dos2unix to fix lineendings

7908f96

zsunberg requested a review from rejuvyesh December 24, 2020 03:40

rejuvyesh approved these changes Dec 24, 2020

MaximeBouton merged commit f7f4f73 into master Jan 6, 2021

dylan-asmar deleted the common-rl branch December 19, 2023 21:51
