docs(document_store): add usage information for aws elastic search (#1008)

Co-authored-by: Mario Jäckle <m.jaeckle@careerpartner.eu>
2025-12-07 04:20:37 +00:00 · 2021-04-30 10:38:25 +01:00 · 2021-04-30 10:38:25 +01:00 · a00703256f
commit a00703256f
parent 37a72d2f45
1 changed files with 16 additions and 14 deletions
--- a/docs/_src/usage/usage/document_store.md
+++ b/docs/_src/usage/usage/document_store.md
@ -11,10 +11,10 @@ id: "documentstoremd"
 # DocumentStores

 You can think of the DocumentStore as a "database" that:
- stores your texts and meta data  
- provides them to the retriever at query time 
+- stores your texts and meta data
+- provides them to the retriever at query time

-There are different DocumentStores in Haystack to fit different use cases and tech stacks. 
+There are different DocumentStores in Haystack to fit different use cases and tech stacks.

 ## Initialisation

@ -29,7 +29,7 @@ Initialising a new DocumentStore within Haystack is straight forward.

 [Install](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html)
 Elasticsearch and then [start](https://www.elastic.co/guide/en/elasticsearch/reference/current/starting-elasticsearch.html)
-an instance. 
+an instance.

 If you have Docker set up, we recommend pulling the Docker image and running it.
 ```bash
@ -49,6 +49,8 @@ Note that we also support [Open Distro for Elasticsearch](https://opendistro.git
 Follow [their documentation](https://opendistro.github.io/for-elasticsearch-docs/docs/install/)
 to run it and connect to it using Haystack's `OpenDistroElasticsearchDocumentStore` class.

+We also support [Amazon Elasticsearch Service](https://aws.amazon.com/elasticsearch-service/) with [signed requests](https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html).
+Use, for example, [aws-requests-auth](https://github.com/davidmuller/aws-requests-auth) to create an auth object and pass it as `aws4auth` to the `ElasticsearchDocumentStore` constructor.
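
The signed-request setup described above could look roughly like the following. This is a minimal sketch, not part of the commit: the helper names `aws_es_settings` and `build_document_store` are hypothetical, it assumes the credentials live in the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables, and it assumes the domain is reachable over HTTPS on port 443.

```python
import os


def aws_es_settings(host, region, env=os.environ):
    """Collect the keyword arguments that AWSRequestsAuth expects.

    Hypothetical helper: reading credentials from environment variables
    is an assumption, adapt it to however you manage secrets.
    """
    return {
        "aws_access_key": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        "aws_host": host,
        "aws_region": region,
        "aws_service": "es",  # service name used when signing Elasticsearch requests
    }


def build_document_store(host, region):
    """Create an ElasticsearchDocumentStore that signs its requests.

    Third-party imports are kept local so the settings helper can be
    used without aws-requests-auth or Haystack installed.
    """
    from aws_requests_auth.aws_auth import AWSRequestsAuth
    from haystack.document_store import ElasticsearchDocumentStore

    auth = AWSRequestsAuth(**aws_es_settings(host, region))
    # Assumption: the AWS domain endpoint is served via HTTPS on port 443.
    return ElasticsearchDocumentStore(
        host=host, port=443, scheme="https", aws4auth=auth
    )
```

Calling `build_document_store("<your-domain-endpoint>", "<your-region>")` connects to the cluster, so it needs valid credentials and network access to the domain.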

 </div>
 </div>
@ -59,7 +61,7 @@ to run it and connect to it using Haystack's `OpenDistroElasticsearchDocumentSto
 <div class="tabcontent">

 Follow the [official documentation](https://www.milvus.io/docs/v1.0.0/milvus_docker-cpu.md) to start a Milvus instance via Docker.
- 
+
 You can initialize the Haystack object that will connect to this instance as follows:
 ```python
 from haystack.document_store import MilvusDocumentStore
@ -75,7 +77,7 @@ document_store = MilvusDocumentStore()
 <label class="labelouter" for="tab-1-3">FAISS</label>
 <div class="tabcontent">

-The `FAISSDocumentStore` requires no external setup. Start it by simply using this line. 
+The `FAISSDocumentStore` requires no external setup. Start it by simply using this line.
 ```python
 from haystack.document_store import FAISSDocumentStore

@ -106,7 +108,7 @@ document_store = InMemoryDocumentStore()
 <div class="tabcontent">

 The `SQLDocumentStore` requires SQLite, PostgreSQL or MySQL to be installed and started.
-Note that SQLite already comes packaged with most operating systems. 
+Note that SQLite already comes packaged with most operating systems.

 ```python
 from haystack.document_store import SQLDocumentStore
@ -174,7 +176,7 @@ Having GPU acceleration will significantly speed this up.
 <!-- _comment: !! Make this a tab element to show how different datastores are initialized !! -->
 ## Choosing the Right Document Store

-The Document Stores have different characteristics. You should choose one depending on the maturity of your project, the use case and technical environment: 
+The Document Stores have different characteristics. You should choose one depending on the maturity of your project, the use case and technical environment:

 <div class="tabs tabsdschoose">

@ -183,13 +185,13 @@ The Document Stores have different characteristics. You should choose one depend
 <label class="labelouter" for="tab-2-1">Elasticsearch</label>
 <div class="tabcontent">

-**Pros:** 
+**Pros:**
 - Fast & accurate sparse retrieval with many tuning options
 - Basic support for dense retrieval
 - Production-ready
 - Support also for Open Distro

-**Cons:** 
+**Cons:**
 - Slow for dense retrieval with more than ~1 million documents

 </div>
@ -200,7 +202,7 @@ The Document Stores have different characteristics. You should choose one depend
 <label class="labelouter" for="tab-2-2">Milvus</label>
 <div class="tabcontent">

-**Pros:** 
+**Pros:**
 - Scalable DocumentStore that excels at handling vectors (hence suited to dense retrieval methods like DPR)
 - Encapsulates multiple ANN libraries (e.g. FAISS and ANNOY) and provides added reliability
 - Runs as a separate service (e.g. a Docker container)
@ -217,7 +219,7 @@ The Document Stores have different characteristics. You should choose one depend
 <label class="labelouter" for="tab-2-3">FAISS</label>
 <div class="tabcontent">

-**Pros:** 
+**Pros:**
 - Fast & accurate dense retrieval
 - Highly scalable due to approximate nearest neighbour algorithms (ANN)
 - Many options to tune dense retrieval via different index types (more info [here](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index))
@ -255,7 +257,7 @@ The Document Stores have different characteristics. You should choose one depend
 - No database requirements
 - Supports MySQL, PostgreSQL and SQLite

-**Cons:** 
+**Cons:**
 - Not scalable
 - Not persisting your data on disk

@ -268,7 +270,7 @@ The Document Stores have different characteristics. You should choose one depend

 #### Our Recommendations

-**Restricted environment:** Use the `InMemoryDocumentStore`, if you are just giving Haystack a quick try on a small sample and are working in a restricted environment that complicates running Elasticsearch or other databases  
+**Restricted environment:** Use the `InMemoryDocumentStore`, if you are just giving Haystack a quick try on a small sample and are working in a restricted environment that complicates running Elasticsearch or other databases

 **All-rounder:** Use the `ElasticsearchDocumentStore` if you want to evaluate the performance of different retrieval options (dense vs. sparse) and are aiming for a smooth transition from PoC to production