From a0913bf213f32e4b975ebc25899f4bff8968f0e0 Mon Sep 17 00:00:00 2001
From: Mohamed Malhou
Date: Wed, 27 Sep 2023 18:15:26 +0200
Subject: [PATCH] update readme (#4646)

GitOrigin-RevId: 5d51f83dae3bd38af5e2b8816cdcc60552c69712
---
 README.md                              | 15 ++++++++-------
 examples/pipelines/unstructured/app.py |  2 +-
 run_examples.py                        |  2 +-
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 54f9e18..2e1c6db 100644
--- a/README.md
+++ b/README.md
@@ -89,12 +89,13 @@ Read more about the implementation details and how to extend this application in
 
 To get started explore one of the examples:
 
-| Example                                                    | Description |
-| ---------------------------------------------------------- | ----------- |
-| [`contextless`](examples/pipelines/contextless/app.py)     | This simple example calls OpenAI ChatGPT API but does not use an index when processing queries. It relies solely on the given user query. We recommend it to start your Pathway LLM journey. |
-| [`contextful`](examples/pipelines/contextful/app.py)       | This default example of the app will index the documents located in the `data/pathway-docs` directory. These indexed documents are then taken into account when processing queries. The pathway pipeline being run in this mode is located at [`examples/pipelines/contextful/app.py`](examples/pipelines/contextful/app.py). |
-| [`contextful_s3`](examples/pipelines/contextful_s3/app.py) | This example operates similarly to the contextful mode. The main difference is that the documents are stored and indexed from an S3 bucket, allowing the handling of a larger volume of documents. This can be more suitable for production environments. |
-| [`local`](examples/pipelines/local/app.py)                 | This example runs the application using Huggingface Transformers, which eliminates the need for the data to leave the machine. It provides a convenient way to use state-of-the-art NLP models locally. |
+| Example                                                    | Description |
+| ---------------------------------------------------------- | ----------- |
+| [`contextless`](examples/pipelines/contextless/app.py)     | This simple example calls the OpenAI ChatGPT API but does not use an index when processing queries. It relies solely on the given user query. We recommend it as a starting point for your Pathway LLM journey. |
+| [`contextful`](examples/pipelines/contextful/app.py)       | This default example of the app will index the jsonlines documents located in the `data/pathway-docs` directory. These indexed documents are then taken into account when processing queries. The Pathway pipeline being run in this mode is located at [`examples/pipelines/contextful/app.py`](examples/pipelines/contextful/app.py). |
+| [`contextful_s3`](examples/pipelines/contextful_s3/app.py) | This example operates similarly to the contextful mode. The main difference is that the documents are stored in and indexed from an S3 bucket, allowing a larger volume of documents to be handled. This can be more suitable for production environments. |
+| [`unstructured`](examples/pipelines/unstructured/app.py)   | Process unstructured documents such as PDF, HTML, DOCX, PPTX, and more. Visit [unstructured-io](https://unstructured-io.github.io/unstructured/) for the full list of supported formats. |
+| [`local`](examples/pipelines/local/app.py)                 | This example runs the application using Huggingface Transformers, which eliminates the need for the data to leave the machine. It provides a convenient way to use state-of-the-art NLP models locally. |
 
 And follow the easy steps to install and run one of those examples.
 
@@ -207,7 +208,7 @@ When the process is complete, the App will be up and running inside a Docker container.
 ```
 
 ### Step 5: Launch the User Interface:
-Go to the `examples/ui/` directory and execute `streamlit run server.py`. Then, access the URL displayed in the terminal to engage with the LLM App using a chat interface.
+Go to the `examples/ui/` directory (or `examples/pipelines/unstructured/ui` if you are running the unstructured version) and execute `streamlit run server.py`. Then, access the URL displayed in the terminal to engage with the LLM App using a chat interface.
 
 ### Bonus: Build your own Pathway-powered LLM App
 
diff --git a/examples/pipelines/unstructured/app.py b/examples/pipelines/unstructured/app.py
index ae59b04..c9a0844 100644
--- a/examples/pipelines/unstructured/app.py
+++ b/examples/pipelines/unstructured/app.py
@@ -15,7 +15,7 @@
 
 Usage:
 In the root of this repository run:
-`poetry run ./run_examples.py unstruct`
+`poetry run ./run_examples.py unstructured`
 or, if all dependencies are managed manually rather than using poetry
 `python examples/pipelines/unstructured/app.py`
 
diff --git a/run_examples.py b/run_examples.py
index 7e4eea3..a51b7b9 100755
--- a/run_examples.py
+++ b/run_examples.py
@@ -145,7 +145,7 @@ def contextless(**kwargs):
 
 @cli.command()
 @common_options
-def unstruct(**kwargs):
+def unstructured(**kwargs):
     from examples.pipelines.unstructured import run
 
     return run(**kwargs)
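
The `run_examples.py` hunk above renames a click subcommand from `unstruct` to `unstructured`, following click's group/command decorator pattern. A minimal runnable sketch of that pattern, assuming a simplified `common_options` stand-in (the `--host`/`--port` options and the echo body are illustrative, not the repository's actual implementation):

```python
import click


def common_options(f):
    # Stand-in for the repository's @common_options decorator:
    # stack shared options onto each subcommand. The option names
    # here are assumptions for illustration only.
    f = click.option("--host", default="127.0.0.1", show_default=True)(f)
    f = click.option("--port", default=8080, type=int, show_default=True)(f)
    return f


@click.group()
def cli():
    """Run one of the example pipelines."""


@cli.command()
@common_options
def unstructured(**kwargs):
    # The real command imports and runs examples.pipelines.unstructured;
    # here we just echo the collected options.
    click.echo(f"would run unstructured pipeline with {kwargs}")


if __name__ == "__main__":
    cli()
```

Because the function name becomes the subcommand name, the rename is what makes `poetry run ./run_examples.py unstructured` (as documented in the updated `app.py` docstring) resolve correctly.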