Problem with Action convergence #6

Open
fyouly opened this issue Jun 15, 2022 · 1 comment
@fyouly

fyouly commented Jun 15, 2022

Hi,
This is really great work. I have reproduced your code and run it on my machine. Thank you for the effort.
However, my action converges to a value near -100 rather than to the value indicated by the Nash equilibrium line in the figure.
Could you help me with this problem? Or is it an issue related to my parameters? My current parameter settings are:

# Agent Parameters

POWER_CAPACITIES = [50 / 100, 50 / 100] #[50 / 100, 50 / 100] # 50
PRODUCTION_COSTS = [20 / 100, 20 / 100] #[20 / 100, 20 / 100] # 20
mean=np.array([6,6,6,6,6])
var=np.array([9,0,9,0,4])
DEMAND = [5,6] #[70 / 100, 70 / 100] # 70
ACTION_LIMITS = [-1, 1] # [-10/100,100/100]#[-100/100,100/100]
NUMBER_OF_AGENTS = 2
PAST_ACTION = 1
FRINGE = 0

# Neural Network Parameters

# Rescaling the rewards to avoid hard weight updates of the critic

REWARD_SCALING = 1 # 0.01 #
LEARNING_RATE_ACTOR = 1e-4
LEARNING_RATE_CRITIC = 1e-3
NORMALIZATION_METHOD = 'none' # options are BN = Batch Normalization, LN = Layer Normalization, none

# Noise Parameters

NOISE = 'GaussianNoise' # Options are: 'GaussianNoise', 'OUNoise', 'UniformNoise'
DECAY_RATE = 0.001 # 0.0004 strong; 0.0008 medium; 0.001 soft; if 0: not used; if 1: only simple noise without decay is used
REGULATION_COEFFICENT = 10 # if 1: not used; if 0: only simple noise is used

TOTAL_TEST_RUNS = 1 # How many runs should be executed
EPISODES_PER_TEST_RUN = 10000 # 10000 # How many episodes should one run contain
ROUNDS_PER_EPISODE = 24 # How many rounds are allowed per episode (right now the number of rounds has no impact, because 'done' is set once step >= round; choosing 1 is easier to interpret)
BATCH_SIZE = 128 # *0.5 # *2

@viktorzob
Collaborator

Hi,

Sorry for this very late reply!
I don't know if your question is still relevant, but I am happy to finally answer it:
It is totally fine if one of the agents converges to -100, as long as the other converges to the maximum action, i.e., the price cap. The agents are in a Nash equilibrium as long as one bids anything <= 50 and the other bids 100.
You can find a more detailed explanation in our paper: https://link.springer.com/article/10.1007/s10614-022-10351-6
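
To make the argument above concrete, here is a minimal sketch that checks by brute force whether a bid pair is a Nash equilibrium. It is not code from the repository. It assumes a uniform-price auction in which the last dispatched bid sets the price, and it uses the commented-out defaults from the question (capacity 50, cost 20, demand 70, bids in [-100, 100]). The function names `clear_market`, `profits`, and `is_nash` are hypothetical, and ties are broken in favour of the lower agent index, which is also an assumption.

```python
def clear_market(bids, capacities, demand):
    """Assumed uniform-price clearing: cheapest bids are dispatched first,
    and the last dispatched bid sets the price for everyone."""
    order = sorted(range(len(bids)), key=lambda i: bids[i])  # stable: ties go to lower index
    sold = [0.0] * len(bids)
    remaining = demand
    price = 0.0
    for i in order:
        if remaining <= 0:
            break
        sold[i] = min(capacities[i], remaining)
        remaining -= sold[i]
        price = bids[i]
    return price, sold


def profits(bids, capacities=(50, 50), costs=(20, 20), demand=70):
    """Profit per agent: (market price - production cost) * quantity sold."""
    price, sold = clear_market(bids, capacities, demand)
    return [(price - c) * q for c, q in zip(costs, sold)]


def is_nash(bids, grid, **kw):
    """True if no agent can raise its own profit by switching to any bid in `grid`."""
    base = profits(bids, **kw)
    for i in range(len(bids)):
        for b in grid:
            trial = list(bids)
            trial[i] = b
            if profits(trial, **kw)[i] > base[i] + 1e-9:
                return False
    return True


if __name__ == "__main__":
    grid = range(-100, 101)  # integer bids between the assumed action limits
    for pair in ([-100, 100], [50, 100], [60, 100]):
        print(pair, profits(pair), is_nash(pair, grid))
```

Under these assumptions, the low bidder sells its full capacity at the price set by the high bidder, so its exact bid does not change its profit. That is why a bid of -100 and a bid of 50 both pass the check when the other agent bids 100, while a sufficiently high "low" bid such as 60 invites undercutting and fails it.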
