same error here
Checked other resources
Commit to Help
Example Code
Description
I have a workspace in Databricks that I use to parse large .pdf files into lists of langchain documents, then I store those lists in a dictionary associated with the name of the .pdf they came from like so:
I need to use these preprocessed documents in my AI application. So I pickle the dictionary in Databricks, and then move the pickle file to the Docker container in which my app runs and unpickle it. Then I can access the langchain docs associated with those .pdfs.
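The workflow described above can be sketched roughly as follows. This is only an illustration: the `SimpleNamespace` objects are a hypothetical stand-in for langchain documents so that the sketch runs without langchain installed, and the file names and the `make_doc` helper are made up.

```python
import pickle
import tempfile
from pathlib import Path
from types import SimpleNamespace


def make_doc(text, page):
    # Hypothetical stand-in for a langchain document (text plus metadata).
    return SimpleNamespace(page_content=text, metadata={"page": page})


# Dictionary keyed by the source .pdf name; the names are illustrative only.
docs_by_pdf = {
    "report_a.pdf": [make_doc("first page text", 1), make_doc("second page text", 2)],
    "report_b.pdf": [make_doc("only page text", 1)],
}

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = Path(tmp) / "docs_by_pdf.pkl"

    # Step 1 (the Databricks side in the description): pickle the dictionary.
    with pickle_path.open("wb") as f:
        pickle.dump(docs_by_pdf, f, protocol=4)

    # Step 2 (the Docker side in the description): unpickle it again.
    with pickle_path.open("rb") as f:
        restored = pickle.load(f)

print(restored["report_a.pdf"][0].page_content)
```

With real documents, pickle stores a reference to each object's class rather than the class itself, so the loading side has to be able to import the same class definitions that the dumping side used.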
This has worked fine for months. But on Friday I had to parse some new .pdfs so I went through the same process. This time, however, when I attempt to unpickle the dictionary, I get this error:
Now, the dictionary itself is not created using Pydantic, and the process works if, instead of a list of langchain documents, the dictionary holds just a list of strings. So my guess is that langchain uses Pydantic internally, and that is how Pydantic gets involved. My research suggested that this error may be due to mismatched Python versions, pickle versions, or Pydantic versions between the two environments. To address this, I set the Python version in both Databricks and the Docker container to 3.9.19 (and then I also tried 3.11). In both cases, the pickle version is 4.0. The Pydantic version is 2.9.2 in both environments.
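For reference, a minimal sketch of how the two environments could be compared side by side. The package list is an assumption: only Python, pickle, and Pydantic were checked above, and the langchain and pydantic-core distributions are included here as other versions that might differ. `environment_report` is a hypothetical helper name.

```python
import pickle
import platform
from importlib import metadata

# Distribution names are assumptions; adjust to whatever is actually installed.
PACKAGES = (
    "pydantic",
    "pydantic-core",
    "langchain",
    "langchain-core",
    "langchain-community",
)


def environment_report(packages=PACKAGES):
    """Collect interpreter, pickle, and package versions into one dict."""
    report = {
        "python": platform.python_version(),
        "pickle_format": pickle.format_version,
        "pickle_highest_protocol": pickle.HIGHEST_PROTOCOL,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            report[name] = None
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Running this in both Databricks and the Docker container and diffing the output would show whether anything besides the versions already checked differs between the two.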
This was working fine for months. The only change I can think of is that I can no longer import chromadb from langchain.vectorstores; I now import it from langchain_community.vectorstores.
Can anyone suggest anything else I can try?
System Info
The platform in Databricks is Linux 5.15 (Azure OS).
The platform in my Docker container is Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.36