docs: [Retrieval > .. > PDF] update package installation instructions for Unstructured and PDFMiner (#20723)

**Description:** Adds the commands to install the packages required before
using _Unstructured_ and _PDFMiner_ from `langchain_community`.
**Documentation Page Being Updated:** [LangChain > Retrieval > Document
loaders > PDF > Using
Unstructured](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/#using-unstructured)
**Issue:** #20719 
**Dependencies:** no dependencies
**Twitter handle:** SalikaDave


---------

Co-authored-by: Bagatur <[email protected]>
salikadave and baskaryan authored Apr 24, 2024
1 parent a9e2e98 commit 6353991
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions docs/docs/modules/data_connection/document_loaders/pdf.mdx
@@ -129,6 +129,11 @@ data = loader.load()

## Using Unstructured

The `unstructured[all-docs]` package currently supports loading text files, PowerPoint presentations, HTML, PDFs, images, and more. To load only PDFs, install the `unstructured[pdf]` extra:

```bash
pip install "unstructured[pdf]"
```
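
Before constructing the loader, it can help to confirm that the package is actually available in the current environment. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of LangChain or Unstructured.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it.

    Hypothetical helper for illustration only.
    """
    return find_spec(module_name) is not None


# "unstructured" is the import name of the package installed above.
status = "installed" if is_installed("unstructured") else "missing"
print(f"unstructured: {status}")
```

This checks only that the top-level module can be found; it does not verify that the PDF-specific extras are present.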

```python
from langchain_community.document_loaders import UnstructuredPDFLoader
```

@@ -225,6 +230,11 @@ data = loader.load()

## Using PDFMiner

PDFMiner extracts text and layout information from PDF documents. Its maintained fork is published on PyPI as `pdfminer.six`:

```bash
pip install pdfminer.six
```
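
Note that the pip name (`pdfminer.six`) differs from the import name (`pdfminer`). One way to surface a helpful message when the package is missing is to import it lazily and re-raise with the install command. This is a sketch under that assumption; the `require` helper is hypothetical and is not how `PDFMinerLoader` is necessarily implemented.

```python
import importlib
from types import ModuleType


def require(module_name: str, pip_name: str) -> ModuleType:
    """Import a module on demand, pointing to the pip package if it is missing.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import {module_name}; "
            f"install it with `pip install {pip_name}`."
        ) from exc


# The import name is "pdfminer" even though the pip package is "pdfminer.six".
try:
    pdfminer = require("pdfminer", "pdfminer.six")
    print("pdfminer is available")
except ImportError as err:
    print(err)
```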

```python
from langchain_community.document_loaders import PDFMinerLoader
```
