Brooklyn restart failed: IndexingServiceException, due to NoShardAvailableActionException #66
Comments
@aledsage What parameters did you use when starting the server? I don't think this is to do with rebind. I think there is an occasional initialisation race - we should be waiting for Alien to be ready, or Alien should be waiting for Elasticsearch to be ready.
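To illustrate the "wait for Elasticsearch to be ready" idea, here is a minimal sketch of a polling helper with a timeout. It is not taken from the Brooklyn or Alien4Cloud code base: `ReadinessWaiter`, `waitUntilReady` and the `isReady` check are hypothetical names, and what the check would actually call (for example a cluster health query) is left as an assumption.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

public class ReadinessWaiter {

    /**
     * Polls the given check until it returns true or the timeout elapses.
     * Returns true if the dependency became ready in time, false otherwise
     * (including when the waiting thread is interrupted).
     */
    public static boolean waitUntilReady(BooleanSupplier isReady, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                if (isReady.getAsBoolean()) {
                    return true;
                }
            } catch (RuntimeException e) {
                // Hypothetical policy: a failing check (e.g. no shard available yet)
                // is treated as "not ready yet" rather than as a fatal error.
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The caller would run this before the first indexing call and fail start-up with a clear message if it returns false, instead of surfacing the underlying exception.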
This appears to be a problem with the way Elasticsearch is used. It can also be resolved by deleting the generated
@aledsage As @robertgmoss alluded, we think Elasticsearch's data directory is corrupted. More questions:
@aledsage I have been trying to replicate the error but am unable to. I suspect that your ES data folder was corrupted somehow, but cannot find a cause. Are you able to reproduce the error after deleting the
I'm also having difficulty reproducing this. The two stacktraces are interesting. Both involve the initialisation of A4C:
The source of the first (in this thread, the second time Brooklyn was started) is Brooklyn's start-up:
But the source of the latter is the API:
The case coming from the API surprises me but it should (!) be coincidental. Only one thread can initialise A4C at once. To reach the I'm going to take a look at the A4C code that does the indexing: https://github.com/alien4cloud/alien4cloud/blob/ca638b77f42dd0d65186edd1e23897a6e382b747/alien4cloud-core/src/main/java/alien4cloud/component/CSARRepositoryIndexerService.java#L92-92.
@aledsage Was anybody else running brooklyn-tosca on the same network as you at the same time?
I've posted the following to Alien4Cloud's public Slack channel:
@sjcorbett no-one else was using brooklyn-tosca on the same network.
Did this error happen when using a4c code to parse the Tosca, or when trying to talk to a separate a4c?
Is it possible for us to code more defensively for this? For example, what is the ideal behaviour if
What about any automated cleanup? @Graeme-Miller suggested that clearing out some corrupted directory of a4c/Elasticsearch could fix the problem. It would be good to document those steps (or perhaps even to do them automatically on restart?!). If this is just using a4c code to parse the Tosca, then presumably we could make assumptions about it being a dedicated Elasticsearch instance, and presumably it doesn't contain any persisted state that needs to be preserved between runs? (But we don't want to slow down the server start-up too much either.)
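As a rough sketch of what the automated cleanup might look like, the helper below recursively deletes a data directory so that it can be regenerated on the next start. This is an illustration only: `DataDirCleaner` and `deleteRecursively` are hypothetical names, the directory to clean is passed in because its actual location is not stated in this thread, and it assumes the open question above is answered with "no persisted state needs to be preserved".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class DataDirCleaner {

    /**
     * Recursively deletes dataDir if it exists.
     * Returns true if the directory existed and was deleted, false if there was nothing to do.
     * I/O failures are rethrown as UncheckedIOException.
     */
    public static boolean deleteRecursively(Path dataDir) {
        if (!Files.exists(dataDir)) {
            return false;
        }
        try (Stream<Path> paths = Files.walk(dataDir)) {
            // Reverse order so that children are deleted before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }
}
```

Running this only after a failed initialisation, rather than on every restart, would avoid adding to normal start-up time.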
When restarting a Brooklyn server that included brooklyn-tosca, the restart failed due to an error in alien4cloud. There were no tosca blueprints or catalog items.
The server's web-console reported "this brooklyn server has errors".
Looking at mgmt.errors(), it showed:
The workaround for me to restart the Brooklyn server was to disable brooklyn-tosca - in brooklyn.properties I added:
The full exception from the log was: