English edits #7
Merged 4 commits on Jun 2, 2024
34 changes: 17 additions & 17 deletions README.md
@@ -2,22 +2,22 @@

<img src="./images/call-center-readme.png" alt="huggingface-mlrun" style="width: 600px"/>

In this demo we will be showcasing how we used LLMs to turn call center conversation audio files of customers and agents into valueable data in a single workflow orchastrated by MLRun.
This demo showcases how to use LLMs to turn audio files from call center conversations between customers and agents into valuable data, all in a single workflow orchestrated by MLRun.

MLRun will be automating the entire workflow, auto-scale resources as needed and automatically log and parse values between the workflow different steps.
MLRun automates the entire workflow, auto-scales resources as needed, and automatically logs and parses values between the different workflow steps.

By the end of this demo you will see the potential power of LLMs for feature extraction, and how easy it is being done using MLRun!
By the end of this demo you will see the potential power of LLMs for feature extraction, and how easily you can do this with MLRun!

We will use:
This demo uses:
* [**OpenAI's Whisper**](https://openai.com/research/whisper) - To transcribe the audio calls into text.
* [**Flair**](https://flairnlp.github.io/) and [**Microsoft's Presidio**](https://microsoft.github.io/presidio/) - To recognize PII for filtering it out.
* [**HuggingFace**](https://huggingface.co/) - as the main machine learning framework to get the model and tokenizer for the features extraction. The demo uses [tiiuae/falcon-40b-instruct](https://huggingface.co/tiiuae/falcon-40b-instruct) as the LLM to asnwer questions.
* and [**MLRun**](https://www.mlrun.org/) - as the orchastraitor to operationalize the workflow.
* [**Flair**](https://flairnlp.github.io/) and [**Microsoft's Presidio**](https://microsoft.github.io/presidio/) - To recognize PII so it can be filtered out.
* [**HuggingFace**](https://huggingface.co/) - The main machine learning framework used to get the model and tokenizer for feature extraction. The demo uses [tiiuae/falcon-40b-instruct](https://huggingface.co/tiiuae/falcon-40b-instruct) as the LLM to answer questions.
* [**MLRun**](https://www.mlrun.org/) - The orchestrator that operationalizes the workflow.

The demo contains a single [notebook](./notebook.ipynb) that covers the entire demo.
The entire demo is contained in a single [notebook](./notebook.ipynb).

Most of the functions are being imported from [MLRun's hub](https://docs.mlrun.org/en/stable/runtimes/load-from-hub.html) - a wide range of functions that can be used for a variety of use cases. You can find all the python source code under [/src](./src) and links to the used functions from the hub in the notebook.

Most of the functions are imported from [MLRun's hub](https://docs.mlrun.org/en/stable/runtimes/load-from-hub.html), which contains a wide range of functions that can be used for a variety of use cases. All functions used in the demo include links to their source in the hub. All of the Python source code is under [/src](./src).
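For illustration, here is a minimal sketch of how hub functions are imported and run with the MLRun SDK. The hub URLs and parameter names below are assumptions for the sketch; the notebook shows the exact calls:

    import mlrun

    # Create (or load) an MLRun project for the demo
    project = mlrun.get_or_create_project("call-center", context="./")

    # Import ready-made functions from the MLRun hub
    # (hub names here are illustrative -- see the notebook for the exact ones)
    project.set_function("hub://transcribe", name="transcribe")
    project.set_function("hub://pii_recognizer", name="pii-recognizer")
    project.set_function("hub://question_answering", name="question-answering")

    # MLRun runs each step and auto-logs its inputs, outputs, and artifacts
    run = project.run_function("transcribe", params={"input_path": "./calls"})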
Enjoy!

___
@@ -29,25 +29,25 @@
This project can run in different development environments:
* Inside GitHub Codespaces
* Other managed Jupyter environments

### Install the code and mlrun client
### Install the code and the mlrun client

To get started, fork this repo into your GitHub account and clone it into your development environment.

To install the package dependencies (not required in GitHub Codespaces), use:

make install-requirements

If you prefer to use Conda use this instead (to create and configure a conda env):
If you prefer to use Conda, use this instead (to create and configure a conda env):

make conda-env

> Make sure you open the notebooks and select the `mlrun` conda environment.

### Install or connect to MLRun service/cluster
### Install or connect to the MLRun service/cluster

The MLRun service and computation can run locally (minimal setup) or over a remote Kubernetes environment.

If your development environment support docker and have enough CPU resources run:
If your development environment supports Docker and there are sufficient CPU resources, run:

make mlrun-docker

@@ -57,10 +57,10 @@
If your environment is minimal, run mlrun as a process (no UI):

[conda activate mlrun &&] make mlrun-api

For MLRun to run properly you should set your client environment, this is not required when using **codespaces**, the mlrun **conda** environment, or **iguazio** managed notebooks.
For MLRun to run properly you should set your client environment. This is not required when using **codespaces**, the mlrun **conda** environment, or **iguazio** managed notebooks.

Your environment should include `MLRUN_ENV_FILE=<absolute path to the ./mlrun.env file>` (point to the mlrun .env file
in this repo), see [mlrun client setup](https://docs.mlrun.org/en/latest/install/remote.html) instructions for details.
in this repo); see [mlrun client setup](https://docs.mlrun.org/en/latest/install/remote.html) instructions for details.
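
For example, a minimal sketch of pointing the client at the env file (assuming a recent MLRun release, where `mlrun.set_env_from_file` is the documented helper; the paths are placeholders):

    import os
    import mlrun

    # Option 1: let the client pick up the file via the environment variable
    os.environ["MLRUN_ENV_FILE"] = "/absolute/path/to/this-repo/mlrun.env"

    # Option 2: load the file explicitly at the top of the notebook
    mlrun.set_env_from_file("./mlrun.env")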

> Note: You can also use a remote MLRun service (over Kubernetes), instead of starting a local mlrun,
> edit the [mlrun.env](./mlrun.env) and specify its address and credentials
> Note: You can also use a remote MLRun service (over Kubernetes), instead of starting a local mlrun:
> edit the [mlrun.env](./mlrun.env) and specify its address and credentials.
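
For reference, a typical `mlrun.env` for a remote service looks roughly like this (the keys beyond `MLRUN_DBPATH` depend on your deployment, and all values are placeholders):

    # URL of the remote MLRun API service
    MLRUN_DBPATH=https://mlrun-api.default-tenant.app.<your-cluster>.com
    # Iguazio-managed clusters also require credentials, e.g.:
    V3IO_USERNAME=<your-username>
    V3IO_ACCESS_KEY=<your-access-key>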