Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Attaching/Drag-and-Dropping more file types #715

Merged
merged 7 commits into from
Nov 10, 2024

Conversation

humphd
Copy link
Collaborator

@humphd humphd commented Nov 8, 2024

This updates our "Attach Files..." menu option and Drag-and-Drop to support the following file types:

  • images: works same as before
  • PDF files: converts to markdown with jina.ai, then creates a new message with the contents
  • text files: tries to syntax highlight when type is known and creates a new message with the contents
  • Word DOCX: converts to HTML then to Markdown (the last bit isn't working yet)

You can attach or drag-and-drop a bunch of files and it will process them all.

A few things could be improved, which I ran out of time to work on (can happen in follow ups, or others can push fixes):

  1. I want proper CSS to happen when the user drags over the prompt form so you know it's a drop target. I tried to do something with a green border, but it's not working.
  2. There is a bunch of code duplication here that could be tamed
  3. I want to be able to add a jina.ai provider key in the settings modal so you can have LLM Providers and "Document Providers" (e.g., services that allow you to process file types into text for LLMs). This isn't critical, since it works now.

@menghif, @Amnish04: can you see what I'm doing wrong with my CSS on the drop target?

@humphd
Copy link
Collaborator Author

humphd commented Nov 8, 2024

It would be cool to add DOCX support next by passing the file through https://github.com/mwilliamson/mammoth.js/ and then sending the HTML to jina.ai to get Markdown. Done!

Copy link

cloudflare-workers-and-pages bot commented Nov 9, 2024

Deploying chatcraft-org with  Cloudflare Pages  Cloudflare Pages

Latest commit: 8aa52ae
Status: ✅  Deploy successful!
Preview URL: https://0e9e0de8.console-overthinker-dev.pages.dev
Branch Preview URL: https://humphd-import-non-image-file.console-overthinker-dev.pages.dev

View logs

@humphd
Copy link
Collaborator Author

humphd commented Nov 10, 2024

Refactored to remove code duplication, simplified things, added support for Word DOCX (I can't get Jina.ai to do the right thing when I POST an HTML file to it, so for now I'm leaving the Word DOCX in HTML format, which still works).

@humphd
Copy link
Collaborator Author

humphd commented Nov 10, 2024

I'd say this is good to land and we can do follow-ups. Anyone want to test before I do?

Copy link
Collaborator

@Amnish04 Amnish04 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@humphd Looks and works great!

Left some comments on why CSS and html to markdown might not be working.

src/lib/ai.ts Outdated
@@ -510,3 +510,109 @@ export function isChatModel(model: string): boolean {
!(isTextToSpeechModel(model) || isSpeechToTextModel(model) || isTextToImageModel(model))
);
}

export type JinjaReaderResponse = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this Jina?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't believe how many times I've made this mistake. For whatever reason, my brain can't do jina vs. jinja...

src/lib/ai.ts Outdated
Comment on lines 616 to 617
// TODO: jina.ai is not returning the expected result for HTML, not sure why...
return convertToMarkdownWithJina(html, "html");
Copy link
Collaborator

@Amnish04 Amnish04 Nov 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because you return the input content as is, when file is a string and type is html?

chatcraft.org/src/lib/ai.ts

Lines 566 to 568 in c454e07

} else if (type === "html" && typeof file === "string") {
// HTML file is already a string
contents = file;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I want that behaviour: I only want to read the file into a base64 string if it's a PDF, but if it's already parsed HTML content, I want that bare string.

However, the bug seems to be with Jina.ai requiring a URL when you post your local HTML file, which makes no sense to me. Even when I add it, it seems to only return the first line of text.

I think I'll give up on this bit and rip it out.

Comment on lines 318 to 319
_hover={{ borderColor: activeBorder }}
borderColor={isDragActive ? activeBorder : "inherit"}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_hover={{ borderColor: activeBorder }}
borderColor={isDragActive ? activeBorder : "inherit"}
border={"2px solid"}
borderColor={isDragActive ? activeBorder : "inherit"}

The reason is there is no border here to start with :)

image

@@ -291,19 +304,23 @@ function DesktopPromptForm({
// Otherwise, let the default paste handling happen
};

const activeBorder = useColorModeValue("green.100", "green.600");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const activeBorder = useColorModeValue("green.100", "green.600");
const activeBorder = useColorModeValue("green.200", "green.600");

I would change the lighter version to green.200. Current shade is too light.

Before:
image

After:
image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've done a version of this. I still think it could be made better than what I can do in a follow-up, but now it works.

Comment on lines 181 to 182
const document = (contents as JinjaReaderResponse).data;
chat.addMessage(new ChatCraftHumanMessage({ text: `${document.content}\n` }));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. While testing, I ran into a case where my pdf content was not text and jina returned the following content:
{
  "text": "\n"
}

leading to an empty message like this:
image

Should we have an additional check here so we only add a user message when we have some content.trim()?

  1. Also, should we be paying attention the chat's size by rejecting large files and showing error alerts, so we don't exceed model context lengths? Could probably be done in a follow up if we consider this an issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an info warning when a file is empty instead of importing.

For now, I think we should just try to import all files and let the browser/LLM complain if things are too large.

@humphd humphd merged commit ebb9258 into main Nov 10, 2024
4 checks passed
@humphd humphd deleted the humphd/import-non-image-files branch November 10, 2024 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants