Observation Normalization Problem #4

im-Kitsch · 2022-05-04T23:19:01Z

Hi,

The normalization process used in your AWR implementation is not the same as the one in the original AWR paper.

Is there a reference or a reason for implementing it this way? It does not seem right compared to the official AWR code.

Best.

inference-based-rl/pfrlx/networks/empirical_normalization.py

Lines 62 to 85 in 8c93996

    
    def experience(self, x):
        """Learn input values without computing the output values of them"""
        if self.until is not None and self.count >= self.until:
            return
        count_x = x.shape[self.batch_axis]
        if count_x == 0:
            return
        self.count += count_x
        rate = count_x / self.count.float()
        assert rate > 0
        assert rate <= 1
        var_x, mean_x = torch.var_mean(
            x, axis=self.batch_axis, keepdims=True, unbiased=False
        )
        delta_mean = mean_x - self._mean
        self._mean += rate * delta_mean
        self._var += rate * (var_x - self._var + delta_mean * (mean_x - self._mean))
        # clear cache
        self._cached_std_inverse = None
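
For reference, here is a minimal sketch of what the update in the snippet above appears to compute. It is a scalar, pure-Python rewrite of the same arithmetic, not the repository's code: the function name `update_running_stats` is hypothetical, the `until` cutoff is left out, and the running state is assumed to start at a count of zero (the real initial values are not shown in the snippet). Under those assumptions the running mean and variance should match the population mean and variance of all data seen so far. It says nothing about how the official AWR code normalizes observations.

```python
import statistics


def update_running_stats(count, mean, var, batch):
    """One scalar update step mirroring the arithmetic in experience().

    Hypothetical helper for illustration only.
    """
    n = len(batch)
    if n == 0:
        return count, mean, var
    mean_x = statistics.fmean(batch)
    var_x = statistics.pvariance(batch)  # population variance, like unbiased=False
    count += n
    rate = n / count  # share of all samples contributed by this batch
    delta_mean = mean_x - mean
    mean += rate * delta_mean
    # Note: `mean` is already updated here, as in the original snippet.
    var += rate * (var_x - var + delta_mean * (mean_x - mean))
    return count, mean, var


# Assumed initial state: nothing seen yet.
batches = [[1.0, 2.0, 3.0], [10.0, 12.0], [-4.0, 0.5, 7.0, 2.0]]
count, mean, var = 0, 0.0, 0.0
for b in batches:
    count, mean, var = update_running_stats(count, mean, var, b)

# Compare against statistics computed over all samples at once.
all_data = [v for b in batches for v in b]
print("mean matches:", abs(mean - statistics.fmean(all_data)) < 1e-9)
print("var matches: ", abs(var - statistics.pvariance(all_data)) < 1e-9)
```

If that reading is right, the snippet keeps cumulative statistics over every observation passed to `experience`, so the question is whether the original AWR code normalizes differently.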
