This repository has been archived by the owner on Jun 24, 2022. It is now read-only.

Elasticsearch does not start: '${ES_TMPDIR}' does not exist #791

Closed
sourcecode-glitch opened this issue Apr 1, 2021 · 7 comments

Comments

@sourcecode-glitch

Elasticsearch version:
Version: 6.1.2, Build: 5b1fea5/2018-01-10T02:35:59.208Z

Role version:
v7.12.0

JVM version (java -version):
1.8.0_275

OS version (uname -a if on a Unix-like system):
Linux nosql1 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u2 (2019-05-13) x86_64 GNU/Linux

Description of the problem including expected versus actual behaviour:
The role times out at the "Wait for elasticsearch to startup" task because Elasticsearch never starts. It seems that the environment variable ${ES_TMPDIR} in jvm.options is not resolved, so Elasticsearch looks for a directory literally named "${ES_TMPDIR}".
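
One way to check this hypothesis is to look for placeholders that are still literal in the deployed options file. The following is a minimal sketch only: `find_unexpanded_placeholders` is a hypothetical helper, not part of the role, and the path in the usage note is an assumption about where the role writes the file.

```shell
#!/bin/sh
# Hypothetical helper (not part of the role): print the non-comment lines of a
# jvm.options-style file that still contain a literal ${NAME} placeholder.
find_unexpanded_placeholders() {
    file="$1"
    # Drop comment lines first, then keep lines containing ${NAME}.
    grep -v '^[[:space:]]*#' "$file" | grep -E '\$\{[A-Za-z_][A-Za-z0-9_]*\}'
}
```

It could be called as `find_unexpanded_placeholders /etc/elasticsearch/jvm.options` (assumed path). A match alone does not prove a bug; it only shows that the file relies on whatever launches the JVM to substitute the value.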

I am running this via Molecule on a Vagrant VM (though I don't expect this to matter for this issue).

Playbook:

---
- name: Install elasticsearch
  hosts: all
  serial: 1
  remote_user: root
  roles:
    - role: 'elasticsearch'
      es_instance_name: "{{ ansible_nodename }}"
      es_data_dirs: "/var/lib/elasticsearch"
      es_config:
        cluster.name: es-cluster
        discovery.zen.ping.unicast.hosts: "{{ unicast_hosts | join(',') }}"
        network.host: "['{{ ansible_eth0.ipv4.address }}', '_local_']"
        discovery.zen.minimum_master_nodes: 2
        script.max_compilations_rate: "1000/1m"
      es_api_host: "{{ ansible_nodename }}"
      es_major_version: "6.x"
      es_version: "6.1.2"
      es_heap_size: '1g'

Provide logs from Ansible:

TASK [elasticsearch : Make sure elasticsearch is started] **********************
ok: [nosql1]

TASK [elasticsearch : Wait for elasticsearch to startup] ***********************
fatal: [nosql1]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for nosql1:9200"}

These are only the last lines. Please see the full log in case you need more details.

ES Logs if relevant:

vagrant@nosql1:~$ sudo journalctl -u elasticsearch
-- Logs begin at Thu 2021-04-01 09:44:41 GMT, end at Thu 2021-04-01 10:03:09 GMT. --
Apr 01 09:48:55 nosql1 systemd[1]: Started Elasticsearch.
Apr 01 09:49:01 nosql1 elasticsearch[6265]: JNA Warning: IOException removing temporary files: JNA temporary directory '${ES_TMPDIR}' does not exist
Apr 01 09:49:02 nosql1 systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
Apr 01 09:49:02 nosql1 systemd[1]: elasticsearch.service: Unit entered failed state.
Apr 01 09:49:02 nosql1 systemd[1]: elasticsearch.service: Failed with result 'exit-code'.

When starting the elasticsearch binary directly from the command line, there is a much more detailed Java stack trace.

It includes the line Caused by: java.nio.file.AccessDeniedException: /home/vagrant/${ES_TMPDIR}, so it seems the environment variable is not resolved: the literal placeholder appears to be treated as a relative path under the current working directory. In the shell, however, the variable is defined as /tmp:

elasticsearch@nosql1:/home/vagrant$ echo $ES_TMPDIR
/tmp
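
The difference between the two observations can be reproduced in isolation. This is a minimal sketch, assuming the option line reaches the JVM exactly as it is written in jvm.options. The names `option_line` and `expanded_line` are illustrative only, and the `sed` call merely stands in for whatever substitution step a launcher would have to perform.

```shell
#!/bin/sh
ES_TMPDIR=/tmp

# Single quotes keep the placeholder literal, as text read from a file would be.
option_line='-Djava.io.tmpdir=${ES_TMPDIR}'

# Without an explicit substitution step the placeholder is passed on unchanged.
printf 'literal:  %s\n' "$option_line"

# Illustrative substitution step: replace the placeholder with the variable's value.
expanded_line=$(printf '%s\n' "$option_line" | sed 's|\${ES_TMPDIR}|'"$ES_TMPDIR"'|g')
printf 'expanded: %s\n' "$expanded_line"
```

The shell expands `$ES_TMPDIR` only when it evaluates the text itself, which is why `echo` shows `/tmp` while a line taken verbatim from a file keeps the placeholder.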
@sourcecode-glitch
Author

I wonder whether this is because this repo uses one jvm.options file regardless of the ES version. This may be related to #738.

@jmlrt
Member

jmlrt commented Apr 12, 2021

Hi @sourcecode-glitch, without digging too much, I think this could be related to deploying the playbook as user root (remote_user: root). Did you try using become: yes instead, which is the recommended way?

jmlrt added the question label Apr 12, 2021
@sourcecode-glitch
Author

Thanks for the suggestion, but it did not work with become. I get exactly the same result.

Just for completeness, this is the diff compared with the previous version:

@@ -2,7 +2,7 @@
 - name: Install elasticsearch
   hosts: all
   serial: 1
-  remote_user: root
+  become: yes
   roles:
     - role: 'elasticsearch'
       es_instance_name: "{{ ansible_nodename }}"

@botelastic

botelastic bot commented Jul 12, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@jmlrt
Member

jmlrt commented Jul 19, 2021

still valid

botelastic bot removed the triage/stale label Jul 19, 2021
@botelastic

botelastic bot commented Oct 17, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic

botelastic bot commented Nov 16, 2021

This issue has been automatically closed because it has not had recent activity since being marked as stale.

botelastic bot closed this as completed Nov 16, 2021