
Add a ListRerank document compressor #13311

Merged
merged 26 commits into langchain-ai:master
Jul 18, 2024

Conversation

Contributor

@denver1117 commented Nov 13, 2023

Notes:

  1. I didn't add anything to docs. I wasn't exactly sure which patterns to follow as cohere reranker is under Retrievers with other external document retrieval integrations, but other contextual compression is here. Happy to contribute to either with some direction.
  2. I followed syntax, docstrings, implementation patterns, etc. as well as I could looking at nearby modules. One thing I didn't do was put the default prompt in a separate .py file like Chain Filter and Chain Extract. Happy to follow that pattern if it would be preferred.

vercel bot commented Nov 13, 2023

The latest updates on your projects:

Name: langchain | Status: ✅ Ready | Updated (UTC): Jul 18, 2024 8:31pm

@dosubot dosubot bot added the 🤖:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features label Nov 13, 2023
type="array[dict]",
)
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
Contributor Author

I experimented with a Pydantic parser that defines the full nested structure explicitly and saw notably more output parsing errors. Expressing the array[dict] type as an implicit nested type within a single ResponseSchema type argument was much more successful.
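
For context, a minimal sketch of the two parsing approaches being compared here. The schema name, description, and Pydantic field names below are illustrative placeholders, not the actual values used in this PR:

```python
from langchain.output_parsers import (
    PydanticOutputParser,
    ResponseSchema,
    StructuredOutputParser,
)
from pydantic import BaseModel, Field

# Approach used in this PR: a single ResponseSchema whose nested structure is
# expressed implicitly through the "array[dict]" type string.
response_schemas = [
    ResponseSchema(
        name="reranked_documents",  # placeholder name
        description="list of objects, each with a document index and a relevance score",
        type="array[dict]",
    )
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

# Alternative referenced in the comment above: a Pydantic parser that defines
# the full nested structure explicitly (placeholder field names).
class RankedDocument(BaseModel):
    document_id: int = Field(description="index of the document in the input list")
    relevance_score: float = Field(description="relevance of the document to the query")

class RerankOutput(BaseModel):
    reranked_documents: list[RankedDocument]

pydantic_parser = PydanticOutputParser(pydantic_object=RerankOutput)
```

Per the comment above, the first form produced noticeably fewer output parsing errors in practice.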

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 24, 2023
Contributor

@hwchase17 left a comment

This seems really cool! Would be extra good to add a notebook in the documentation as an example for this.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Nov 29, 2023
@dosubot dosubot bot removed the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Nov 30, 2023
Contributor Author

@denver1117 commented Nov 30, 2023

This seems really cool! Would be extra good to add a notebook in the documentation as an example for this.

Thanks @hwchase17. I fixed the lint issues and added to the documentation; it LGTM in the Vercel preview:

[Screenshot: Vercel documentation preview, 2023-11-30 8:48 AM]

This is ready to go from my perspective.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Dec 5, 2023
@denver1117
Contributor Author

I fixed the order of arguments (the required positional arg must come first, not the kwarg) and fixed the incorrect type on the Callable arg, as noted by the failed check.
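
A hedged, generic illustration of the kind of fix described above (a hypothetical helper, not the actual code in this PR):

```python
from typing import Callable

from langchain_core.documents import Document

# Illustrative only: the required positional parameter `documents` comes before
# the keyword parameter that has a default, and the Callable annotation names
# its argument and return types instead of using a bare or incorrect `Callable`.
def format_documents(
    documents: list[Document],
    formatter: Callable[[Document], str] = lambda doc: doc.page_content,
) -> str:
    return "\n".join(formatter(doc) for doc in documents)
```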

@baskaryan baskaryan added the needs documentation PR needs to be updated with documentation label Apr 1, 2024
@ccurme ccurme added the langchain Related to the langchain package label Jun 21, 2024
@hwchase17
Contributor

Needs a re-review and cleanup, but I generally like it.

@ccurme ccurme enabled auto-merge (squash) July 18, 2024 20:23
@ccurme ccurme merged commit 61ea7bf into langchain-ai:master Jul 18, 2024
55 checks passed
olgamurraft pushed a commit to olgamurraft/langchain that referenced this pull request Aug 16, 2024
- **Description:** This PR adds a new document compressor called
`ListRerank`, derived from `BaseDocumentCompressor`. It is a near-exact
implementation of the approach introduced in the paper [Zero-Shot Listwise
Document Reranking with a Large Language
Model](https://arxiv.org/pdf/2305.02156.pdf), which that paper finds to
outperform pointwise reranking, an approach partially implemented in
LangChain as
[LLMChainFilter](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter.py).
- **Issue:** None
- **Dependencies:** None
- **Tag maintainer:** @hwchase17 @izzymsft
- **Twitter handle:** @HarrisEMitchell

Notes:
1. I didn't add anything to `docs`. I wasn't exactly sure which patterns
to follow as [cohere reranker is under
Retrievers](https://python.langchain.com/docs/integrations/retrievers/cohere-reranker)
with other external document retrieval integrations, but other
contextual compression is
[here](https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression/).
Happy to contribute to either with some direction.
2. I followed syntax, docstrings, implementation patterns, etc. as well
as I could looking at nearby modules. One thing I didn't do was put the
default prompt in a separate `.py` file like [Chain
Filter](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter_prompt.py)
and [Chain
Extract](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_extract_prompt.py).
Happy to follow that pattern if it would be preferred.

---------

Co-authored-by: Harrison Chase <[email protected]>
Co-authored-by: Bagatur <[email protected]>
Co-authored-by: Chester Curme <[email protected]>
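
A hedged usage sketch of the compressor described above, not taken from this PR's docs. The `ListRerank` class name and the `compress_documents` interface come from the description; the import path and the `from_llm`/`top_n` constructor are assumptions for illustration (the class may also have been renamed during review):

```python
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI  # requires OPENAI_API_KEY

# Import path is an assumption; adjust to wherever the compressor actually lives.
from langchain.retrievers.document_compressors import ListRerank

docs = [
    Document(page_content="Listwise reranking scores all candidates in a single LLM call."),
    Document(page_content="The weather in Denver is sunny today."),
    Document(page_content="Pointwise reranking scores each document independently."),
]

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
compressor = ListRerank.from_llm(llm=llm, top_n=2)  # from_llm/top_n are assumed
reranked = compressor.compress_documents(documents=docs, query="How does listwise reranking work?")
for doc in reranked:
    print(doc.page_content)
```

In a retrieval pipeline, this compressor would typically be wrapped in a `ContextualCompressionRetriever`, the same pattern already used for `LLMChainFilter`.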
Labels

🤖:enhancement: A large net-new component, integration, or chain. Use sparingly. The largest features
langchain: Related to the langchain package
lgtm: PR looks good. Use to confirm that a PR is ready for merging.
needs documentation: PR needs to be updated with documentation
size:L: This PR changes 100-499 lines, ignoring generated files.
4 participants