Show how to use elasticsearch integration with security enabled #661

Open
taborzbislaw opened this issue Apr 13, 2024 · 1 comment
Labels: feature request, integration:elasticsearch

Comments

@taborzbislaw

I have installed Elasticsearch locally and run it as a daemon according to https://www.elastic.co/guide/en/elasticsearch/reference/current/starting-elasticsearch.html
I have exported the ELASTIC_PASSWORD and ES_HOME environment variables.

When I run
document_store = ElasticsearchDocumentStore(hosts = "http://localhost:9200")
I get the following error:

ConnectionError Traceback (most recent call last)
Cell In[2], line 1
----> 1 document_store = ElasticsearchDocumentStore(hosts = "http://localhost:9200")

File ~/anaconda3/envs/NLP/lib/python3.10/site-packages/haystack_integrations/document_stores/elasticsearch/document_store.py:104, in ElasticsearchDocumentStore.__init__(self, hosts, index, embedding_similarity_function, **kwargs)
101 self._kwargs = kwargs
103 # Check client connection, this will raise if not connected
--> 104 self._client.info()
106 # configure mapping for the embedding field
107 mappings = {
108 "properties": {
109 "embedding": {"type": "dense_vector", "index": True, "similarity": embedding_similarity_function},
(...)
122 ],
123 }

File ~/anaconda3/envs/NLP/lib/python3.10/site-packages/elasticsearch/_sync/client/utils.py:446, in _rewrite_parameters.<locals>.wrapper.<locals>.wrapped(*args, **kwargs)
443 except KeyError:
444 pass
--> 446 return api(*args, **kwargs)

File ~/anaconda3/envs/NLP/lib/python3.10/site-packages/elasticsearch/_sync/client/__init__.py:2453, in Elasticsearch.info(self, error_trace, filter_path, human, pretty)
2451 __query["pretty"] = pretty
2452 __headers = {"accept": "application/json"}
-> 2453 return self.perform_request( # type: ignore[return-value]
2454 "GET",
2455 __path,
2456 params=__query,
2457 headers=__headers,
2458 endpoint_id="info",
2459 path_parts=__path_parts,
2460 )

File ~/anaconda3/envs/NLP/lib/python3.10/site-packages/elasticsearch/_sync/client/_base.py:271, in BaseClient.perform_request(self, method, path, params, headers, body, endpoint_id, path_parts)
255 def perform_request(
256 self,
257 method: str,
(...)
264 path_parts: Optional[Mapping[str, Any]] = None,
265 ) -> ApiResponse[Any]:
266 with self._otel.span(
267 method,
268 endpoint_id=endpoint_id,
269 path_parts=path_parts or {},
270 ) as otel_span:
--> 271 response = self._perform_request(
272 method,
273 path,
274 params=params,
275 headers=headers,
276 body=body,
277 otel_span=otel_span,
278 )
279 otel_span.set_elastic_cloud_metadata(response.meta.headers)
280 return response

File ~/anaconda3/envs/NLP/lib/python3.10/site-packages/elasticsearch/_sync/client/_base.py:316, in BaseClient._perform_request(self, method, path, params, headers, body, otel_span)
313 else:
314 target = path
--> 316 meta, resp_body = self.transport.perform_request(
317 method,
318 target,
319 headers=request_headers,
320 body=body,
321 request_timeout=self._request_timeout,
322 max_retries=self._max_retries,
323 retry_on_status=self._retry_on_status,
324 retry_on_timeout=self._retry_on_timeout,
325 client_meta=self._client_meta,
326 otel_span=otel_span,
327 )
329 # HEAD with a 404 is returned as a normal response
330 # since this is used as an 'exists' functionality.
331 if not (method == "HEAD" and meta.status == 404) and (
332 not 200 <= meta.status < 299
333 and (
(...)
337 )
338 ):

File ~/anaconda3/envs/NLP/lib/python3.10/site-packages/elastic_transport/_transport.py:342, in Transport.perform_request(self, method, target, body, headers, max_retries, retry_on_status, retry_on_timeout, request_timeout, client_meta, otel_span)
340 try:
341 otel_span.set_node_metadata(node.host, node.port, node.base_url, target)
--> 342 resp = node.perform_request(
343 method,
344 target,
345 body=request_body,
346 headers=request_headers,
347 request_timeout=request_timeout,
348 )
349 _logger.info(
350 "%s %s%s [status:%s duration:%.3fs]"
351 % (
(...)
357 )
358 )
360 if method != "HEAD":

File ~/anaconda3/envs/NLP/lib/python3.10/site-packages/elastic_transport/_node/_http_urllib3.py:202, in Urllib3HttpNode.perform_request(self, method, target, body, headers, request_timeout)
194 err = ConnectionError(str(e), errors=(e,))
195 self._log_request(
196 method=method,
197 target=target,
(...)
200 exception=err,
201 )
--> 202 raise err from None
204 meta = ApiResponseMeta(
205 node=self.config,
206 duration=duration,
(...)
209 headers=response_headers,
210 )
211 self._log_request(
212 method=method,
213 target=target,
(...)
217 response=data,
218 )

ConnectionError: Connection error caused by: ConnectionError(Connection error caused by: ProtocolError(('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))))

  • OS: Linux Mint 19.3 (Tricia)
  • Haystack version 2.0

Please help me resolve this issue.

@taborzbislaw added the "bug" label on Apr 13, 2024
@taborzbislaw

When I start Elasticsearch in Docker with security disabled, using

sudo docker run -p 9200:9200 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.13.2

the examples from https://docs.haystack.deepset.ai/docs/elasticsearchbm25retriever work correctly.

But when the container is started with security enabled:

sudo docker run -p 9200:9200 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms1024m -Xmx1024m" docker.elastic.co/elasticsearch/elasticsearch:8.13.2

I get the error described in the previous comment: ConnectionError: Connection error caused by: ConnectionError(Connection error caused by: ProtocolError(('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))))
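
One possible explanation, offered as an assumption rather than a confirmed diagnosis: with security enabled, Elasticsearch 8.x serves its HTTP layer over TLS by default, so a plain http:// request is closed without a response. The sketch below uses only the Python standard library to check whether a port completes a TLS handshake. The helper names `with_https` and `speaks_tls` are hypothetical and exist only for this illustration.

```python
import socket
import ssl
from urllib.parse import urlsplit


def with_https(url: str) -> str:
    """Return the same URL with its scheme switched to https."""
    return urlsplit(url)._replace(scheme="https").geturl()


def speaks_tls(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TLS handshake succeeds on host:port.

    Certificate verification is turned off because this is only a probe
    for "does the port speak TLS at all"; do not reuse this context for
    real traffic.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:
        # Covers refused connections, timeouts, and ssl.SSLError
        # (which is a subclass of OSError).
        return False
```

If `speaks_tls("localhost", 9200)` returns True while the http:// URL fails, the node is most likely expecting `with_https("http://localhost:9200")` together with credentials.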

It would be nice if the examples showed how to use the Elasticsearch integration with security enabled, as the sentence-transformers example does:
https://github.com/UKPLab/sentence-transformers/blob/master/examples/applications/semantic-search/semantic_search_quora_elasticsearch.py

best
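
A minimal sketch of what such an example might look like. It assumes that the extra keyword arguments accepted by `ElasticsearchDocumentStore.__init__` (the `**kwargs` visible in the traceback above) are forwarded to the underlying `Elasticsearch` client, which accepts `basic_auth`, `ca_certs`, and `verify_certs` in its 8.x releases. `ES_HOST` and `ES_CA_CERT` are hypothetical environment variable names introduced only for this sketch; `ELASTIC_PASSWORD` is the variable mentioned in the first comment, and `elastic` is the built-in superuser.

```python
import os


def secured_store_kwargs(env=os.environ):
    """Collect connection settings for a security-enabled node.

    ES_HOST and ES_CA_CERT are hypothetical variable names used only
    in this sketch; ELASTIC_PASSWORD must be set.
    """
    kwargs = {
        # Note the https scheme: plain http is what triggers the
        # "Remote end closed connection without response" error.
        "hosts": env.get("ES_HOST", "https://localhost:9200"),
        "basic_auth": ("elastic", env["ELASTIC_PASSWORD"]),
    }
    ca_cert = env.get("ES_CA_CERT")
    if ca_cert:
        # Preferred: verify the server against the cluster's CA certificate.
        kwargs["ca_certs"] = ca_cert
    else:
        # Local experiments only: skips certificate verification entirely.
        kwargs["verify_certs"] = False
    return kwargs


def open_secured_store():
    """Create the document store (requires the elasticsearch-haystack package)."""
    from haystack_integrations.document_stores.elasticsearch import (
        ElasticsearchDocumentStore,
    )

    # Assumption: these kwargs are passed through to the Elasticsearch client.
    return ElasticsearchDocumentStore(**secured_store_kwargs())
```

With `ELASTIC_PASSWORD` exported, `open_secured_store()` should return a store that can be passed to the retriever examples. For the Docker image, the CA certificate generated at first start can be copied out of the container (Elastic's Docker guide places it at config/certs/http_ca.crt) and its local path supplied as `ES_CA_CERT`.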

@masci changed the title from "ElasticsearchDocumentStore is not created" to "Show how to use elasticsearch integration with security enabled" on Apr 22, 2024
@masci added the "feature request" and "integration:elasticsearch" labels and removed the "bug" label on Apr 22, 2024