Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

community: Skip nested directories when using S3DirectoryLoader #17829

Merged
merged 2 commits into from
Mar 9, 2024

Conversation

Falydoor
Copy link
Contributor

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Feb 20, 2024
Copy link

vercel bot commented Feb 20, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Feb 20, 2024 11:29pm

@dosubot dosubot bot added Ɑ: doc loader Related to document loader module (not documentation) 🔌: aws Primarily related to Amazon Web Services (AWS) integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Feb 20, 2024
Copy link
Collaborator

@baskaryan baskaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @3coins

@t-mac81
Copy link

t-mac81 commented Feb 28, 2024

Thanks for putting in the pull request @Falydoor Hopefully it gets merged soon as we also need this functionality.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Mar 9, 2024
@baskaryan baskaryan merged commit cf94091 into langchain-ai:master Mar 9, 2024
58 checks passed
@Falydoor Falydoor deleted the fix-s3-directory-loader branch March 9, 2024 06:56
gkorland pushed a commit to FalkorDB/langchain that referenced this pull request Mar 30, 2024
langchain-ai#17829)

- **Description:** `S3DirectoryLoader` is failing if prefix is a folder
(ex: `my_folder/`) because `S3FileLoader` will try to load that folder
and will fail. This PR skip nested directories so prefix can be set to
folder instead of `my_folder/files_prefix`.
- **Issue:**
  - langchain-ai#11917
  - langchain-ai#6535
  - langchain-ai#4326
- **Dependencies:** none
- **Twitter handle:** @Falydoor


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
hinthornw pushed a commit that referenced this pull request Apr 26, 2024
#17829)

- **Description:** `S3DirectoryLoader` is failing if prefix is a folder
(ex: `my_folder/`) because `S3FileLoader` will try to load that folder
and will fail. This PR skip nested directories so prefix can be set to
folder instead of `my_folder/files_prefix`.
- **Issue:**
  - #11917
  - #6535
  - #4326
- **Dependencies:** none
- **Twitter handle:** @Falydoor


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔌: aws Primarily related to Amazon Web Services (AWS) integrations Ɑ: doc loader Related to document loader module (not documentation) 🤖:improvement Medium size change to existing code to handle new use-cases lgtm PR looks good. Use to confirm that a PR is ready for merging. size:XS This PR changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants