MSTDP not learning #697
Replies: 2 comments
Thank you for sharing the graphs. The network code looks good. To begin with, let's remove the upper and lower bounds on the weights in each layer, specifically eliminating the wmin and wmax arguments on the connections. Additionally, for debugging purposes, I recommend removing the second middle layer. This will help us observe whether the output weights change. I suspect there may not be enough activity in the deeper layers to produce meaningful weight changes: spiking networks run the risk of vanishing spikes, where deeper layers receive progressively fewer spikes and therefore see much smaller weight updates than layers connected directly to the input.

Regarding your concerns about the continuous environment and the potential issues with negative inputs and outputs: I agree that working with both positive and negative rewards can be challenging, and I recommend starting with just one type of reward to assess its performance. Additionally, consider mapping the output to a continuous range by normalizing or scaling it before passing it to the environment. For example, you can transform spike outputs into a more continuous space by normalizing the total spike count of each neuron to fit the desired output range.

I also encourage you to read the following papers, as they contain valuable techniques that may be helpful:
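To make the spike-count scaling concrete, here is a minimal sketch (the function name and shapes are my own, not part of BindsNET): it averages each output neuron's spikes over the simulation window and rescales the resulting rate into the environment's action bounds.

```python
import torch

def spikes_to_action(spike_record: torch.Tensor, low: float, high: float) -> torch.Tensor:
    """Map a [time, n_out] spike record to a continuous action.

    Each neuron's spike count is normalized by the simulation length,
    giving a firing rate in [0, 1], which is then rescaled linearly
    into the environment's action range [low, high].
    """
    rates = spike_record.float().mean(dim=0)   # per-neuron rate in [0, 1]
    return low + (high - low) * rates          # e.g. low=-2.0, high=2.0 for Pendulum
```

If the recorded spikes still carry a batch dimension (e.g. [time, 1, n_out]), flatten everything but the time dimension before averaging.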
I've been trying to implement a BindsNET version of my reinforcement-learning PyTorch code. I opted not to use the pipeline and to use custom functions instead. I need help getting it to learn, since it does not seem to be working. I suspect part of my problem may stem from using continuous environments, or from my setup for them, since both the input and the output can be negative. If there is any advice or resources that could help me there, let me know.
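One way to deal with negative observations, assuming a rate-coding front end such as BindsNET's Poisson encoder, is to split each dimension into a positive and a negative channel so the encoded rates stay non-negative. A rough sketch, where the helper name and the scale factor are placeholders to tune:

```python
import torch
from bindsnet.encoding import poisson

def encode_signed(obs, time: int, dt: float = 1.0, scale: float = 100.0) -> torch.Tensor:
    """Poisson-encode a signed observation vector.

    Each observation dimension is split into a positive and a negative
    channel, so an n-dimensional observation becomes a [time, 2 * n]
    spike tensor whose rates are non-negative throughout.
    """
    obs = torch.as_tensor(obs, dtype=torch.float)
    channels = torch.cat([obs.clamp(min=0), (-obs).clamp(min=0)])  # all values >= 0
    return poisson(datum=scale * channels, time=time, dt=dt)       # rates ~ scale * |obs|
```

With this scheme the input layer needs 2 * n neurons instead of n, and the scale should be chosen so the largest observation magnitudes map to reasonable firing rates.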
What I've tried:
Observations:
I intend to keep the network update separate from its output generation, since there are times when I want to do the equivalent of 'no_grad'. However, I may be thinking too much in the ANN world, and perhaps I should always be performing the update. If someone has advice or papers that apply to this, please let me know.
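For the no_grad analogy, my understanding is that BindsNET gates plasticity with the network's learning flag rather than with autograd, so it can be toggled around run() calls. A hedged sketch, where the layer name "input", spikes, sim_time, and reward are placeholders for your own setup:

```python
# Inference-only rollout: the rough equivalent of torch.no_grad() is to
# turn the learning flag off so connection update rules are skipped.
network.train(False)                 # sets network.learning = False
network.run(inputs={"input": spikes}, time=sim_time)

# Learning step: re-enable plasticity and pass the reward, which
# reward-modulated rules such as MSTDP read as a keyword argument.
network.train(True)
network.run(inputs={"input": spikes}, time=sim_time, reward=reward)
```

One caveat: unlike backprop, MSTDP applies its weight updates during the forward simulation itself, so the reward has to be available while the spikes are being run rather than in a separate backward pass.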
I've been running it with basic gymnasium environments, like the classic-control Pendulum.
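For reference, here is a skeleton of how those pieces might fit together in a custom Pendulum loop. Everything in it is illustrative rather than working code: encode_signed and spikes_to_action are the hypothetical helpers sketched elsewhere in this thread, and network, output_monitor, and sim_time are placeholders for your own objects.

```python
import gymnasium as gym

env = gym.make("Pendulum-v1")
obs, _ = env.reset()
prev_reward = 0.0

for step in range(1000):
    spikes = encode_signed(obs, time=sim_time)
    network.run(inputs={"input": spikes}, time=sim_time, reward=prev_reward)

    out = output_monitor.get("s").view(sim_time, -1)      # flatten to [time, n_out]
    action = spikes_to_action(out, low=-2.0, high=2.0)    # single torque value

    obs, reward, terminated, truncated, _ = env.step(action.numpy())
    prev_reward = reward
    network.reset_state_variables()                       # clear per-step state if desired

    if terminated or truncated:
        obs, _ = env.reset()
```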
I appreciate any help that can be offered.
The following is the network from PyTorch, for reference purposes:
The following is the network in BindsNET. The LINode is the LIFNode, except that after the last step in forward it sets self.s = self.v.
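Since the actual class was not included above, here is a minimal sketch of what that subclass might look like based only on the description; treat it as a guess at the code, not a reproduction of it.

```python
from bindsnet.network.nodes import LIFNodes

class LINode(LIFNodes):
    """LIF layer that exposes its membrane potential as its output.

    After the standard LIF update, the spike state `s` is overwritten
    with the voltage `v`, so downstream layers and monitors read a
    graded (and possibly negative) value instead of binary spikes.
    """

    def forward(self, x):
        super().forward(x)
        self.s = self.v
```

Note that learning rules such as MSTDP also read s when building their traces, so replacing it with the voltage changes what the update rule sees; that may itself contribute to the weight updates looking wrong.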
Agent.learn Function: