-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Attaching/Drag-and-Dropping more file types #715
Conversation
|
Deploying chatcraft-org with
|
Latest commit: |
8aa52ae
|
Status: | ✅ Deploy successful! |
Preview URL: | https://0e9e0de8.console-overthinker-dev.pages.dev |
Branch Preview URL: | https://humphd-import-non-image-file.console-overthinker-dev.pages.dev |
Refactored to remove code duplication, simplified things, added support for Word DOCX (I can't get Jina.ai to do the right thing when I POST an HTML file to it, so for now I'm leaving the Word DOCX in HTML format, which still works). |
I'd say this is good to land and we can do follow-ups. Anyone want to test before I do? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@humphd Looks and works great!
Left some comments on why CSS and html to markdown might not be working.
src/lib/ai.ts
Outdated
@@ -510,3 +510,109 @@ export function isChatModel(model: string): boolean { | |||
!(isTextToSpeechModel(model) || isSpeechToTextModel(model) || isTextToImageModel(model)) | |||
); | |||
} | |||
|
|||
export type JinjaReaderResponse = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this Jina
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't believe how many times I've made this mistake. For whatever reason, my brain can't do jina vs. jinja...
src/lib/ai.ts
Outdated
// TODO: jina.ai is not returning the expected result for HTML, not sure why... | ||
return convertToMarkdownWithJina(html, "html"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this because you return the input content as is, when file
is a string
and type
is html
?
Lines 566 to 568 in c454e07
} else if (type === "html" && typeof file === "string") { | |
// HTML file is already a string | |
contents = file; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I want that behaviour: I only want to read the file into a base64 string if it's a PDF, but if it's already parsed HTML content, I want that bare string.
However, the bug seems to be with Jina.ai requiring a URL when you post your local HTML file, which makes no sense to me. Even when I add it, it seems to only return the first line of text.
I think I'll give up on this bit and rip it out.
_hover={{ borderColor: activeBorder }} | ||
borderColor={isDragActive ? activeBorder : "inherit"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -291,19 +304,23 @@ function DesktopPromptForm({ | |||
// Otherwise, let the default paste handling happen | |||
}; | |||
|
|||
const activeBorder = useColorModeValue("green.100", "green.600"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've done a version of this. I still think it could be made better than what I can do in a follow-up, but now it works.
src/hooks/use-file-import.tsx
Outdated
const document = (contents as JinjaReaderResponse).data; | ||
chat.addMessage(new ChatCraftHumanMessage({ text: `${document.content}\n` })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- While testing, I ran into a case where my pdf content was not text and jina returned the following content:
{
"text": "\n"
}
leading to an empty message like this:
Should we have an additional check here so we only add a user message when we have some content.trim()
?
- Also, should we be paying attention the chat's size by rejecting large files and showing error alerts, so we don't exceed model context lengths? Could probably be done in a follow up if we consider this an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an info warning when a file is empty instead of importing.
For now, I think we should just try to import all files and let the browser/LLM complain if things are too large.
This updates our "Attach Files..." menu option and Drag-and-Drop to support the following file types:
You can attach or drag-and-drop a bunch of files and it will process them all.
A few things could be improved, which I ran out of time to work on (can happen in follow ups, or others can push fixes):
There is a bunch of code duplication here that could be tamed@menghif, @Amnish04: can you see what I'm doing wrong with my CSS on the drop target?