StuCleaner removes hyperlinks #24

Open
SamuelYLay opened this issue Aug 15, 2024 · 2 comments

@SamuelYLay

When I upload a file to StuCleaner/gulagcleaner, it nicely removes all the watermarks, but it also removes all the hyperlinks, including the ones in the table of contents, which are useful :(
Here are both files, if you need them:
math2988-lecture-notes.pdf
math2988-lecture-notes-stucleaner.pdf

@YM162 YM162 self-assigned this Sep 17, 2024
@YM162
Owner

YM162 commented Sep 17, 2024

Hi! Thanks for the heads up! It should be an easy fix.

Right now we remove all annotations, which are used for all the clickable elements (including the links in the watermarks), by setting them to an empty array in gulagcleaner_rs/src/models/method.rs, line 115.

There is probably a good way of filtering the "bad" annotations from the "good" ones instead of removing all of them. This could be done either by checking the type of the annotation (URL, intra-document?) or by using a regex for the Studocu/Wuolah URLs.
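
As a rough illustration of the filtering idea above, here is a minimal sketch. It is not the code in gulagcleaner_rs: `LinkTarget`, `WATERMARK_HOSTS`, `is_watermark_link` and `keep_good_annotations` are hypothetical names, the annotation is modelled as a plain enum rather than a real PDF object, and a simple substring check stands in for the regex mentioned above. The host names `studocu.com` and `wuolah.com` are assumed to be the ones that appear in the watermark links.

```rust
/// Hypothetical, simplified stand-in for the target of a PDF link annotation.
#[derive(Debug, Clone, PartialEq)]
enum LinkTarget {
    /// External link (URI action), e.g. a watermark pointing at a website.
    Uri(String),
    /// Intra-document link, e.g. a table-of-contents entry jumping to a page.
    Internal { page: u32 },
}

/// Assumed watermark hosts; the real list may need to be longer.
const WATERMARK_HOSTS: [&str; 2] = ["studocu.com", "wuolah.com"];

/// Returns true only for external links whose URI mentions a watermark host.
/// Intra-document links are never treated as watermarks.
fn is_watermark_link(target: &LinkTarget) -> bool {
    match target {
        LinkTarget::Internal { .. } => false,
        LinkTarget::Uri(uri) => {
            // Compare case-insensitively so "WUOLAH.COM" is caught as well.
            let uri = uri.to_ascii_lowercase();
            WATERMARK_HOSTS.iter().any(|host| uri.contains(host))
        }
    }
}

/// Drops the "bad" annotations and keeps everything else,
/// instead of replacing the whole list with an empty array.
fn keep_good_annotations(annots: Vec<LinkTarget>) -> Vec<LinkTarget> {
    annots
        .into_iter()
        .filter(|target| !is_watermark_link(target))
        .collect()
}
```

With this sketch, a list holding a Studocu URL, an unrelated URL and an internal link would come back with only the last two. Combining both checks (annotation type and URL) keeps table-of-contents links safe even if the host list is incomplete.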

I'll try to fix it in a couple of weeks if I have the time, but if someone else wants to give it a go before then, go ahead :)

@YM162
Owner

YM162 commented Oct 3, 2024

We can now check if a hyperlink should be removed in the Wuolah PDFs using the page_type::is_annots_wuolah() function.

It should be easy to modify the function a bit to make the same check in the Studocu PDFs.
