How to deal with text input #6

Open
Ddiamondgold opened this issue Feb 2, 2022 · 2 comments

Comments

@Ddiamondgold

Hi,

Thanks for providing the text descriptions in the previous post. I really appreciate it.

But I still have several questions:

  1. I could not find the code showing how you feed the descriptions in as text input during training. Can you elaborate? If it is not on GitHub, could you send me the code (such as the data loader for image + text input)?
  2. For the 10 sentences for each image, do you simply concatenate them as input, or do you do any preprocessing of the raw text? May I have the preprocessing code?
  3. For the final result reported in the paper, 96.81%, is it the accuracy of the 200-class prediction? Or is it related to attribute prediction?

Thanks!

@nicolalandro
Owner

nicolalandro commented Feb 3, 2022

  1. I did not work on the text part. We trained the text and image models separately, extracted features from both, and then trained a model that merges them. The merging model (with the extracted features) can be found here. So I do not have a data loader for text + image; it would have to be written.
  2. I think that they concatenate them, but I'm not sure.
  3. The results in the paper are for classification; we do not predict attributes.
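
Since no text + image data loader is available, the pairing step described in point 1 could be sketched roughly as follows. This is only an illustration under assumptions: the function name `fuse_features`, the dictionary layout keyed by image id, and the toy values are hypothetical and are not taken from the project's code.

```python
from typing import Dict, List, Tuple


def fuse_features(
    image_feats: Dict[str, List[float]],
    text_feats: Dict[str, List[float]],
    labels: Dict[str, int],
) -> Tuple[List[List[float]], List[int]]:
    """Concatenate pre-extracted image and text features per image id.

    Hypothetical helper: assumes all three dicts are keyed by the same
    image ids. Ids missing from any of the dicts are skipped.
    """
    fused, targets = [], []
    for image_id in sorted(image_feats):
        if image_id not in text_feats or image_id not in labels:
            continue
        # Image features first, then text features, in one flat vector.
        fused.append(list(image_feats[image_id]) + list(text_feats[image_id]))
        targets.append(labels[image_id])
    return fused, targets


# Toy example with made-up values.
image_feats = {"img_001": [0.1, 0.2], "img_002": [0.3, 0.4]}
text_feats = {"img_001": [1.0], "img_002": [2.0]}
labels = {"img_001": 0, "img_002": 1}

X, y = fuse_features(image_feats, text_feats, labels)
```

The fused vectors in `X` and the class labels in `y` could then be passed to whatever classifier plays the role of the merging model.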

I can try to ask for the text code, but it may have been lost. In any case, I think that on HuggingFace you can find better models to extract features from text. For example, here you can find BART, TAPAS, BERT, and other BERT-derived models for text feature extraction, so I do not think the original text code is needed to recreate the flow today. (I know it would be easier if it were available 😢)
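
For anyone recreating the text branch with a HuggingFace model, a rough sketch could look like the following. It assumes the `transformers` and `torch` packages are installed, uses `bert-base-uncased` purely as an example checkpoint, and joins the captions of one image with spaces, which may not match the original preprocessing. The masked mean pooling is a common choice for sentence features, not the paper's confirmed method, and the helper names are hypothetical.

```python
from typing import List


def join_captions(captions: List[str]) -> str:
    """Join all captions of one image into one string (assumed preprocessing)."""
    return " ".join(c.strip() for c in captions if c.strip())


def extract_text_features(texts: List[str], model_name: str = "bert-base-uncased"):
    """Return one mean-pooled feature vector per input text."""
    # Imported lazily so join_captions works without these packages installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    # Long inputs are truncated to the model's maximum sequence length.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (batch, tokens, hidden)

    # Average only over real tokens, ignoring padding positions.
    mask = inputs["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)


# Made-up captions for a single image.
captions = ["a small bird with a red head. ", " it has black wings."]
text = join_captions(captions)
```

Calling `extract_text_features([text])` would download the checkpoint on first use and return a tensor with one row per input text.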

@Ddiamondgold
Author

No worries! Thank you so much for the explanation. I will try looking for text feature extractors. As long as I know that the result is classification accuracy and the text is concatenated, that should be good enough for me. Thanks!!
