Feat/shence #2

muntaqamahmood · 2023-11-30T15:09:06Z

No description provided.

…hain into feat/Shence

added disclaimer --------- Co-authored-by: Erick Friis <[email protected]>

@baskaryan

Implements [langchain-ai#12115](langchain-ai#12115)

Who can review? @baskaryan , @eyurtsev , @hwchase17

Integrated the Stack Exchange API into LangChain, enabling access to diverse communities within the platform. This addition lets users query Stack Exchange for specialized information and draw on content from varied knowledge repositories. A notebook example and test cases were included to demonstrate the functionality and reliability of this integration.

- Add StackExchange as a tool.
- Add unit test for the StackExchange wrapper and tool.
- Add documentation for the StackExchange wrapper and tool.

If you have time, could you please review the code and provide any feedback as necessary! My team welcomes any suggestions.

---------

Co-authored-by: Yuval Kamani <[email protected]>
Co-authored-by: Aryan Thakur <[email protected]>
Co-authored-by: Manas1818 <[email protected]>
Co-authored-by: aryan-thakur <[email protected]>
Co-authored-by: Bagatur <[email protected]>

…Cypher/Neo4j schema (langchain-ai#13851)

Instead of using JSON-like syntax to describe node and relationship properties, we changed to a shorter and more concise schema description.

Old:

```
Node properties are the following:
[{'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Movie'}, {'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Actor'}]
Relationship properties are the following:
[]
The relationships are the following:
['(:Actor)-[:ACTED_IN]->(:Movie)']
```

New:

```
Node properties are the following:
Movie {name: STRING},Actor {name: STRING}
Relationship properties are the following:

The relationships are the following:
(:Actor)-[:ACTED_IN]->(:Movie)
```
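To make the difference between the two formats concrete, here is a minimal sketch that turns the old list-of-dicts structures into the shorter text form. It is not the code from the PR: the function name `format_schema` and the `type` key used for relationship entries are assumptions made for illustration.

```python
from typing import Dict, List


def format_schema(
    node_props: List[Dict],
    rel_props: List[Dict],
    relationships: List[str],
) -> str:
    """Render schema structures in the concise text form (illustrative only)."""

    def fmt_props(props: List[Dict]) -> str:
        # [{'property': 'name', 'type': 'STRING'}] -> "name: STRING"
        return ", ".join(f"{p['property']}: {p['type']}" for p in props)

    nodes = ",".join(
        f"{el['labels']} {{{fmt_props(el['properties'])}}}" for el in node_props
    )
    # Assumption: relationship entries carry their name under a 'type' key.
    rels = ",".join(
        f"{el['type']} {{{fmt_props(el['properties'])}}}" for el in rel_props
    )
    return "\n".join(
        [
            "Node properties are the following:",
            nodes,
            "Relationship properties are the following:",
            rels,
            "The relationships are the following:",
            ",".join(relationships),
        ]
    )


node_props = [
    {"properties": [{"property": "name", "type": "STRING"}], "labels": "Movie"},
    {"properties": [{"property": "name", "type": "STRING"}], "labels": "Actor"},
]
schema_text = format_schema(node_props, [], ["(:Actor)-[:ACTED_IN]->(:Movie)"])
print(schema_text)
```

The shorter form drops the repeated `property`/`type`/`labels` keys, which reduces the number of tokens the schema takes up in a prompt.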

@eyurtsev

- **Description:** Updated to remove deprecated parameter penalty_alpha, and use string variation of prompt rather than json object for better flexibility.
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** N/A
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** @Symbldotai

---------

Co-authored-by: toshishjawale <[email protected]>
Co-authored-by: Harrison Chase <[email protected]>

…angchain-ai#13274)

- **Description:** Update 5 pdf document loaders in `langchain.document_loaders.pdf`, to store a url in the metadata (instead of a temporary, local file path) if the user provides a web path to a pdf: `PyPDFium2Loader`, `PDFMinerLoader`, `PDFMinerPDFasHTMLLoader`, `PyMuPDFLoader`, and `PDFPlumberLoader` were updated.
  - The updates follow the approach used to update `PyPDFLoader` for the same behavior in langchain-ai#12092
  - The `PyMuPDFLoader` changes required additional work in updating `langchain.document_loaders.parsers.pdf.PyMuPDFParser` to be able to process either an `io.BufferedReader` (from local pdf) or `io.BytesIO` (from online pdf)
  - The `PDFMinerPDFasHTMLLoader` change used a simpler approach since the metadata is assigned by the loader and not the parser
- **Issue:** Fixes langchain-ai#7034
- **Dependencies:** None

```python
# PyPDFium2Loader example:
# old behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': '/var/folders/7z/d5dt407n673drh1f5cm8spj40000gn/T/tmpm5oqa92f/tmp.pdf', 'page': 0}

# new behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0}
```
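The behavior change described above can be sketched without any loader at all. The helpers below are hypothetical stand-ins, not the loaders' actual code: they only show the decision of which path to record as `source` once a web PDF has been downloaded to a temporary file.

```python
from urllib.parse import urlparse


def is_web_path(path: str) -> bool:
    """Return True when the user-supplied path looks like an http(s) URL."""
    parsed = urlparse(path)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_metadata(user_path: str, local_path: str, page: int) -> dict:
    """Record the original URL as the source, falling back to the local path.

    `user_path` is what the caller passed to the loader; `local_path` is
    where the file actually lives on disk (a temp file for web PDFs).
    """
    source = user_path if is_web_path(user_path) else local_path
    return {"source": source, "page": page}


web_meta = build_metadata(
    "https://arxiv.org/pdf/1706.03762.pdf", "/tmp/tmpm5oqa92f/tmp.pdf", 0
)
local_meta = build_metadata("papers/attention.pdf", "papers/attention.pdf", 0)
print(web_meta)
print(local_meta)
```

Recording the URL keeps the metadata meaningful after the temporary file is deleted.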

@baskaryan

…ain-ai#14029)

- **Description:** use post field validation for `CohereRerank`
- **Issue:** langchain-ai#12899 and langchain-ai#13058
- **Dependencies:**
- **Tag maintainer:** @baskaryan

---------

Co-authored-by: Bagatur <[email protected]>

@eyurtsev

- **Description:** Mask API key for ForeFrontAI LLM and associated unit tests
- **Issue:** langchain-ai#12165
- **Dependencies:** N/A
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** `__mmahmad__`

I made the API key non-optional since linting required adding validation for None, but the key is required per documentation: https://python.langchain.com/docs/integrations/llms/forefrontai
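As an illustration of what masking an API key means in practice, here is a stdlib-only sketch. LangChain integrations commonly rely on pydantic's `SecretStr` for this; the `MaskedSecret` class below is a hypothetical stand-in that mimics the same idea (hide the value when printed, expose it only through an explicit accessor) and is not the code from the PR.

```python
class MaskedSecret:
    """Hold a secret string without leaking it through str() or repr()."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # Explicit accessor: the only way to read the real key.
        return self._value

    def __str__(self) -> str:
        return "**********" if self._value else ""

    def __repr__(self) -> str:
        return f"MaskedSecret('{self}')"


api_key = MaskedSecret("secret-api-key")
printed = str(api_key)
print(printed)          # masked form
print(repr(api_key))    # masked form as well
```

With this pattern, an accidental `print(llm)` or a logged config dict shows asterisks instead of the key.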

@baskaryan

- **Description:** Volc Engine MaaS is an enterprise-grade, large-model service platform designed for developers. You can visit its homepage at https://www.volcengine.com/docs/82379/1099455 for details. This change helps developers integrate quickly with the platform.
- **Issue:** None
- **Dependencies:** volcengine
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @he1v3tica

---------

Co-authored-by: lvzhong <[email protected]>

- **Description:** Added more of the endpoints supported by SerpApi that are not yet supported in LangChain, such as Google Trends, Google Finance, Google Jobs, and Google Lens
- **Issue:** [Add support for many of the querying endpoints with serpapi langchain-ai#11811](langchain-ai#11811)

---------

Co-authored-by: zushenglu <[email protected]>
Co-authored-by: Erick Friis <[email protected]>
Co-authored-by: Ian Xu <[email protected]>
Co-authored-by: zushenglu <[email protected]>
Co-authored-by: KevinT928 <[email protected]>
Co-authored-by: Bagatur <[email protected]>

@baskaryan

- **Description:** Added a tool called RedditSearchRun and an accompanying API wrapper, which searches Reddit for posts with support for time filtering, post sorting, query string and subreddit filtering.
- **Issue:** langchain-ai#13891
- **Dependencies:** `praw` module is used to search Reddit
- **Tag maintainer:** @baskaryan , and any of the other maintainers if needed
- **Twitter handle:** None.

Hello,

This is our first PR and we hope that our changes will be helpful to the community. We have run `make format`, `make lint` and `make test` locally before submitting the PR. To our knowledge, our changes do not introduce any new errors.

Our PR integrates the `praw` package which is already used by RedditPostsLoader in LangChain. Nonetheless, we have added integration tests and edited unit tests to test our changes. An example notebook is also provided.

These changes were put together by me, @Anika2000, @CharlesXu123, and @Jeremy-Cheng-stack

Thank you in advance to the maintainers for their time.

---------

Co-authored-by: What-Is-A-Username <[email protected]>
Co-authored-by: Anika2000 <[email protected]>
Co-authored-by: Jeremy Cheng <[email protected]>
Co-authored-by: Harrison Chase <[email protected]>

@baskaryan

- **Description:** Updated the documentation for the Dropbox loader and made the messages more verbose when loading a PDF file, since people were getting confused
- **Issue:** langchain-ai#13952
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17

---------

Co-authored-by: Erick Friis <[email protected]>

@baskaryan

# Description

We implemented a simple tool for accessing the Merriam-Webster Collegiate Dictionary API (https://dictionaryapi.com/products/api-collegiate-dictionary).

Here's a simple usage example:

```py
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType

llm = OpenAI()
tools = load_tools(["serpapi", "merriam-webster"], llm=llm)  # Serp API gives our agent access to Google
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What is the english word for the german word Himbeere? Define that word.")
```

Sample output:

```
> Entering new AgentExecutor chain...
I need to find the english word for Himbeere and then get the definition of that word.
Action: Search
Action Input: "English word for Himbeere"
Observation: {'type': 'translation_result'}
Thought: Now I have the english word, I can look up the definition.
Action: MerriamWebster
Action Input: raspberry
Observation: Definitions of 'raspberry':

1. rasp-ber-ry, noun: any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries
2. rasp-ber-ry, noun: a perennial plant (genus Rubus) of the rose family that bears raspberries
3. rasp-ber-ry, noun: a sound of contempt made by protruding the tongue between the lips and expelling air forcibly to produce a vibration; broadly : an expression of disapproval or contempt
4. black raspberry, noun: a raspberry (Rubus occidentalis) of eastern North America that has a purplish-black fruit and is the source of several cultivated varieties —called also blackcap
Thought: I now know the final answer.
Final Answer: Raspberry is an english word for Himbeere and it is defined as any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries.

> Finished chain.
```

# Issue

This closes langchain-ai#12039.

# Dependencies

We added no extra dependencies.

---------

Co-authored-by: Lara <[email protected]>
Co-authored-by: Harrison Chase <[email protected]>

@Farhan-Faisal

# Description

This PR implements Self-Query Retriever for MongoDB Atlas vector store.

I've implemented the comparators and operators that are supported by MongoDB Atlas vector store according to the section titled "Atlas Vector Search Pre-Filter" from https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-stage/. Namely:

```
allowed_comparators = [
    Comparator.EQ,
    Comparator.NE,
    Comparator.GT,
    Comparator.GTE,
    Comparator.LT,
    Comparator.LTE,
    Comparator.IN,
    Comparator.NIN,
]

"""Subset of allowed logical operators."""
allowed_operators = [
    Operator.AND,
    Operator.OR
]
```

Translations from comparators/operators to MongoDB Atlas filter operators (you can find the syntax in the "Atlas Vector Search Pre-Filter" section from the previous link) are done using the following dictionary:

```
map_dict = {
    Operator.AND: "$and",
    Operator.OR: "$or",
    Comparator.EQ: "$eq",
    Comparator.NE: "$ne",
    Comparator.GTE: "$gte",
    Comparator.LTE: "$lte",
    Comparator.LT: "$lt",
    Comparator.GT: "$gt",
    Comparator.IN: "$in",
    Comparator.NIN: "$nin",
}
```

In `visit_structured_query()` the filters are passed as "pre_filter" and not "filter" as in the MongoDB link above, since langchain's implementation of MongoDB atlas vector store (`libs\langchain\langchain\vectorstores\mongodb_atlas.py`) in `_similarity_search_with_score()` sets the "filter" key to have the value of the "pre_filter" argument.

```
params["filter"] = pre_filter
```

Test cases and documentation have also been added.

# Issue

langchain-ai#11616

# Dependencies

No new dependencies have been added.

# Documentation

I have created the notebook mongodb_atlas_self_query.ipynb outlining the steps to get the self-query mechanism working.

I worked closely with [@Farhan-Faisal](https://github.com/Farhan-Faisal) on this PR.

---------

Co-authored-by: Bagatur <[email protected]>
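To show how a mapping like `map_dict` turns a structured query into a MongoDB filter document, here is a self-contained sketch. The `Comparator` and `Operator` enums are local stand-ins rather than LangChain's classes, and the helper names `comparison` and `operation` are hypothetical; only the `$`-prefixed operator strings come from the description above.

```python
from enum import Enum
from typing import Any, Dict


class Comparator(str, Enum):
    # Local stand-in for the comparator enum referenced in the PR.
    EQ = "eq"
    NE = "ne"
    GT = "gt"
    GTE = "gte"
    LT = "lt"
    LTE = "lte"
    IN = "in"
    NIN = "nin"


class Operator(str, Enum):
    # Local stand-in for the logical operator enum.
    AND = "and"
    OR = "or"


MAP_DICT = {
    Operator.AND: "$and",
    Operator.OR: "$or",
    Comparator.EQ: "$eq",
    Comparator.NE: "$ne",
    Comparator.GTE: "$gte",
    Comparator.LTE: "$lte",
    Comparator.LT: "$lt",
    Comparator.GT: "$gt",
    Comparator.IN: "$in",
    Comparator.NIN: "$nin",
}


def comparison(attribute: str, comparator: Comparator, value: Any) -> Dict:
    """Translate one comparison, e.g. year >= 2000 -> {"year": {"$gte": 2000}}."""
    return {attribute: {MAP_DICT[comparator]: value}}


def operation(operator: Operator, *filters: Dict) -> Dict:
    """Combine already-translated filters under $and / $or."""
    return {MAP_DICT[operator]: list(filters)}


pre_filter = operation(
    Operator.AND,
    comparison("year", Comparator.GTE, 2000),
    comparison("genre", Comparator.IN, ["comedy", "drama"]),
)
print(pre_filter)
```

A dictionary of this shape is what would be handed to the vector store as the `pre_filter` argument.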

langchain-ai#13297) `response_if_no_docs_found` was not implemented in `ConversationalRetrievalChain` for async code paths. This change implements it and adds test cases.

Co-authored-by: Harrison Chase <[email protected]>
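The intended behavior can be sketched as follows: when the retriever returns nothing and a fallback response is configured, the async path should return that fallback instead of calling the LLM. The functions `fake_retrieve` and `answer` are hypothetical stand-ins, not the chain's actual methods.

```python
import asyncio
from typing import List, Optional


async def fake_retrieve(question: str) -> List[str]:
    """Hypothetical retriever that finds no documents."""
    await asyncio.sleep(0)  # yield control, as a real async retriever would
    return []


async def answer(
    question: str, response_if_no_docs_found: Optional[str] = None
) -> str:
    docs = await fake_retrieve(question)
    # Short-circuit with the configured fallback when nothing was retrieved.
    if not docs and response_if_no_docs_found is not None:
        return response_if_no_docs_found
    return f"answer based on {len(docs)} document(s)"


fallback = asyncio.run(
    answer("What is Vald?", response_if_no_docs_found="No documents found.")
)
print(fallback)
```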

@baskaryan

grammar correction  Co-authored-by: Harrison Chase <[email protected]>

@baskaryan

**Description:** Previously, the Vald integration supported only insecure gRPC connections; secure connections are now supported as well. In addition, gRPC metadata can be added to Vald requests to enable authentication with a token.

@baskaryan

**Description:** Added support for a Pandas DataFrame OutputParser with format instructions, along with unit tests and a demo notebook. Namely, we've added the ability to request data from a DataFrame, have the LLM parse the request, and then use that request to retrieve a well-formatted response.

Within LangChain, it seamlessly integrates with language models like OpenAI's `text-davinci-003`, facilitating streamlined interaction using the format instructions (just like the other output parsers).

This parser structures its requests as `<operation/column/row>[<optional_array_params>]`. The instructions detail permissible operations, valid columns, and array formats, ensuring clarity and adherence to the required format.

For example:

- When the LLM receives the input: "Retrieve the mean of `num_legs` from rows 1 to 3."
- The provided format instructions guide the LLM to structure the request as: "mean:num_legs[1..3]".

The parser processes this formatted request, leveraging the LLM's understanding to extract the mean of `num_legs` from rows 1 to 3 within the Pandas DataFrame.

This integration allows users to communicate requests naturally, with the LLM transforming these instructions into structured commands understood by the `PandasDataFrameOutputParser`. The format instructions act as a bridge between natural language queries and precise DataFrame operations, optimizing communication and data retrieval.

**Issue:**

- langchain-ai#11532

**Dependencies:** No additional dependencies :)

**Tag maintainer:** @baskaryan

**Twitter handle:** No need. :)

---------

Co-authored-by: Wasee Alam <[email protected]>
Co-authored-by: Harrison Chase <[email protected]>
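The request format is easier to follow with a small parser in hand. The sketch below is not the `PandasDataFrameOutputParser` implementation: it uses a plain dict of lists instead of a DataFrame to stay dependency-free, supports only `mean`, and assumes that `1..3` denotes an inclusive range of zero-based row positions. The names `parse_request` and `run_request` are hypothetical.

```python
import re
from statistics import mean
from typing import Dict, List, Optional, Tuple

# <operation>:<column>[<optional_array_params>], e.g. "mean:num_legs[1..3]"
REQUEST_RE = re.compile(
    r"^(?P<op>\w+):(?P<target>\w+)(?:\[(?P<params>[^\]]*)\])?$"
)


def parse_request(request: str) -> Tuple[str, str, Optional[List[int]]]:
    """Split a request into operation, column, and optional row positions."""
    match = REQUEST_RE.match(request.strip())
    if match is None:
        raise ValueError(f"Malformed request: {request!r}")
    rows: Optional[List[int]] = None
    params = match.group("params")
    if params:
        if ".." in params:
            start, end = (int(part) for part in params.split(".."))
            rows = list(range(start, end + 1))  # assumption: inclusive range
        else:
            rows = [int(part) for part in params.split(",")]
    return match.group("op"), match.group("target"), rows


def run_request(table: Dict[str, List[float]], request: str) -> float:
    """Evaluate a parsed request against a dict-of-lists table."""
    op, column, rows = parse_request(request)
    if op != "mean":
        raise ValueError(f"Unsupported operation in this sketch: {op}")
    values = table[column]
    if rows is not None:
        values = [values[i] for i in rows]
    return mean(values)


table = {"num_legs": [2, 4, 8, 0], "num_wings": [2, 0, 0, 0]}
result = run_request(table, "mean:num_legs[1..3]")
print(result)
```

Keeping the grammar this small is what makes it practical to describe in format instructions that the LLM can follow reliably.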

@baskaryan

### Description

Hello,

The [integration_test README](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests) was indicating incorrect paths for the `.env.example` and `.env` files.

`tests/.env.example` -> `tests/integration_tests/.env.example`

While it’s a minor error, it could **potentially lead to confusion** for the document’s readers, so I’ve made the necessary corrections.

Thank you! ☺️

### Related Issue

- langchain-ai#2806

…hain into feat/Shence

shenceyang and others added 30 commits November 11, 2023 15:40

add: setup initial files

d097fcf

Merge remote-tracking branch 'origin/develop' into feat/Shence

2778879

add: config and validation for steam

01194ab

add: functions with API calls & prompt

c16e000

add: Implemented the get_id function.

a4cdb72

add recommended games (WIP)

4abac50

add: Implemented the get_id function.

effd9af

Merge branch 'feat/Shence' of https://github.com/muntaqamahmood/langc…

b8b4799

…hain into feat/Shence

add: Modified get_id to return a dictionary now.

b76a7d2

Merge branch 'feat/Shence' of https://github.com/muntaqamahmood/langc…

b2454e9

…hain into feat/Shence

fix: change the details that returns to users

ce45cd5

remove json

61f0821

fix: fix bugs in steam.py

658e128

fix: fix bugs in steam.py

961eed8

fix: fix bugs in steam.py

ebfc67b

use function to get game details

cd7856f

change prompt wording

5638635

fix: lint check

e92aec1

Merge branch 'feat/Shence' of https://github.com/muntaqamahmood/langc…

2bc5150

…hain into feat/Shence

save: new recommed_games WIP commented out

7a1d5d2

add: updated recommended_games

f215696

add: modified get_id_link_price to only show the first research result.

5c8e90f

fix: recommendation flow and cleanup old code

f866610

Merge branch 'feat/Shence' of https://github.com/muntaqamahmood/langc…

6ede07f

…hain into feat/Shence

Merge branch 'langchain-ai:master' into feat/Shence

ac9772d

finish recommend games, add agent

634de5f

fix recommend games

0ba2909

remove unused param

8c374d9

DOCS: updated langchain stack img to be svg (langchain-ai#13540)

02a1303

DOCS langchain decorators update (langchain-ai#13535)

cc50e02

added disclaimer --------- Co-authored-by: Erick Friis <[email protected]>

baskaryan and others added 29 commits November 29, 2023 10:31

langchain[patch]: Release 0.0.343 (langchain-ai#14037)

d4405bc

delete unused file

1dd713a

Merge branch 'master' into feat/Shence

734c4cb

Fix issue where response_if_no_docs_found is not implemented on async… (

d1d693b

langchain-ai#13297) `response_if_no_docs_found` was not implemented in `ConversationalRetrievalChain` for async code paths. This change implements it and adds test cases.

Co-authored-by: Harrison Chase <[email protected]>

fix import

d42a27a

add: notebook title

124d7e6

Merge branch 'feat/Shence' of https://github.com/muntaqamahmood/langc…

87a84da

…hain into feat/Shence

fix: updated the steam.ipynb to provide example

45eff25

Merge branch 'master' into feat/Shence

d33fde4

cr

1ecfbfd

cr

cc7d2c5

spell check

e893a70

Merge branch 'develop' into feat/Shence

dc845fd

muntaqamahmood closed this Nov 30, 2023

