-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature/document_translation #9
base: master
Are you sure you want to change the base?
Conversation
Currently translates PDF,TXT, and DOCX files.
from typing import Optional | ||
from azure.ai.translation.text.models import InputTextItem | ||
from azure.core.credentials import AzureKeyCredential | ||
from azure.ai.translation.text import TextTranslationClient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The filename says azure_document_translation, but it looks like we're importing from azure.ai.translation.text. The Azure AI document translation service translates full documents (e.g. pdf, docx, etc) from a source --> target container while the latter does realtime translation on text. I thought I saw a PR that already added the realtime translation feature, if this is a wrapper around the same service, maybe they should go into the same tool class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, we should probably import azure.ai.translation.text and azure.core types under the init and raise any import errors so the user knows that they need to install the library to use this tool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can move it in there for now but I have also been trying to get the document translation to work. Ill try to get it done before the meeting.
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
I havent moved the file yet just in case there are some more changes to be made and I am trying to get a better version working.
I edited it some more and changed the document loader to use unstructured which is part of langchain.
I am not entirely sure how to format the big try block with the unstructured imports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the initial feedback, making good progress! A few more comments, I think the next step would be to add some tests and examples
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_document_translation.py
Outdated
Show resolved
Hide resolved
I have no idea why the "make_spell_check" is failing. |
I will work on the notebook/examples and tests now |
92ae151
to
4ba0b27
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked out your branch and gave the code a try myself. I think with the following changes we should be able to invoke the tool, or use with an agent:
file_path = "hello.txt" # has the text "hola mundo"
ft = AzureFileTranslateTool(
text_translation_key=os.environ["AZURE_TRANSLATE_API_KEY"],
text_translation_endpoint=os.environ["AZURE_TRANSLATE_ENDPOINT"],
region="westus2",
file_path=file_path,
)
answer = ft.invoke({"query": f"What does the text in the file say in english?"})
print(answer) # hello world
or
ft = AzureFileTranslateTool(
text_translation_key=os.environ["AZURE_TRANSLATE_API_KEY"],
text_translation_endpoint=os.environ["AZURE_TRANSLATE_ENDPOINT"],
region="westus2",
)
tools = [ft]
model = AzureChatOpenAI(
openai_api_version=os.environ["OPENAI_API_VERSION"],
azure_deployment=os.environ["COMPLETIONS_MODEL"],
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
api_key=os.environ["AZURE_OPENAI_API_KEY"],
)
agent = create_structured_chat_agent(model, tools, prompt)
agent_executor = AgentExecutor(
agent=agent, tools=tools, verbose=True, handle_parsing_errors=True
)
file_path = "hello.txt"
answer = agent_executor.invoke(
{"input": f"What does the text in the file say in english? : {file_path}"}
)
print(answer)
{
"action": "Final Answer",
"action_input": "The text in the file says 'Hello world' in English."
}
libs/community/langchain_community/tools/azure_ai_services/azure_file_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_file_translation.py
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_file_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_file_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_file_translation.py
Outdated
Show resolved
Hide resolved
libs/community/langchain_community/tools/azure_ai_services/azure_file_translation.py
Outdated
Show resolved
Hide resolved
My guess is that this has nothing to do with your code changes and is a bug on the repo. I suspect if you pull in the latest from master, it may resolve. |
I was putting in the wrong version. Hopefully this works now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, nice job! I think we just need to add an example of using the tool and showing its output
except ImportError: | ||
raise ImportError("Run 'pip install azure-ai-translation-text'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate_environment is called before this function so we'll already have raised an ImportError if the translation library is not installed. We can remove the try/except here
bf00d90
to
317c923
Compare
4c5a2d7
to
59f178e
Compare
Currently translates PDF,TXT, and DOCX files.