[webtoonxyz] add support #5141

Closed

@Dragonatorul commented Jan 31, 2024

Webtoon.xyz uses the same CMS as mangaread.org (WordPressMadara), so this is pretty much a copy-paste of the mangaread extractor.

However, the site also uses Cloudflare checks. My local tests return the following error, even with cookies extracted from Firefox:

» python3 . --cookies ../../cookies.txt https://www.webtoon.xyz/read/the-world-after-the-end/
[webtoonxyz][warning] Cloudflare challenge
[webtoonxyz][error] HttpError: '403 Forbidden' for 'https://www.webtoon.xyz/read/the-world-after-the-end'

EDIT: It worked after adding the correct user-agent:

» python3 . --cookies ../../cookies.txt --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:122.0) Gecko/20100101 Firefox/122.0" https://www.webtoon.xyz/read/the-world-after-the-end/
./gallery-dl/webtoonxyz/The World After The End/c000/The World After The End_c000_001.jpg
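
The likely reason the second run works is that Cloudflare clearance cookies are generally only accepted together with the user agent of the browser that solved the challenge. A minimal sketch of the idea, using only the standard library and not gallery-dl's own code; `build_request` and the cookie value are hypothetical placeholders, and nothing is sent over the network:

```python
import urllib.request

# User agent of the browser the cookies were exported from (taken from the command above).
FIREFOX_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:122.0) "
    "Gecko/20100101 Firefox/122.0"
)


def build_request(url, user_agent, cookie_header):
    """Build (but do not send) a request that pairs cookies with a user agent.

    Hypothetical helper for illustration only.
    """
    headers = {"User-Agent": user_agent, "Cookie": cookie_header}
    return urllib.request.Request(url, headers=headers)


req = build_request(
    "https://www.webtoon.xyz/read/the-world-after-the-end/",
    FIREFOX_UA,
    "cf_clearance=PLACEHOLDER",  # placeholder, not a real cookie value
)
```

If the user agent sent with the request differs from the one that obtained the cookies, the 403 shown above is the expected outcome.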

@Dragonatorul changed the title from "[webtoonxyz] Add support for webtoon.xyz" to "[webtoonxyz] add support" on Jan 31, 2024
@enduser420
Contributor

Both this and #5140 can be rewritten using the BaseExtractor to prevent having 3 copies of the same extractor.

@Dragonatorul
Author

> Both this and #5140 can be rewritten using the BaseExtractor to prevent having 3 copies of the same extractor.

I'll have a look tomorrow to see how that's done.

@Dragonatorul marked this pull request as draft on February 1, 2024 00:23
@Dragonatorul
Author

@enduser420 I'm still looking over how BaseExtractor works.

I have a question though:

Notice how each of these 3 sites has a different path: /manga/ vs /read/ vs /webtoon/.

From other implementations of BaseExtractor, it seems the presumption is that all instances use the same path. I suppose we could have (manga|read|webtoon) in the match regexp, but that doesn't feel right to me. Is that alright though?
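
To make the question concrete, here is a rough sketch of what the alternation could look like, written as a standalone regex rather than against the actual BaseExtractor API. `PATH_PATTERN` and `parse_path` are hypothetical names, and the assumed layout (prefix, then series slug, then an optional chapter segment) is a guess based on the URLs in this PR:

```python
import re

# One pattern covering all three path prefixes mentioned above.
# Assumed layout: /<prefix>/<series-slug>[/<chapter>][/]
PATH_PATTERN = re.compile(
    r"^/(?P<prefix>manga|read|webtoon)"
    r"/(?P<slug>[^/?#]+)"
    r"(?:/(?P<chapter>[^/?#]+))?"
    r"/?$"
)


def parse_path(path):
    """Return the named groups for a matching path, or None otherwise."""
    match = PATH_PATTERN.match(path)
    return match.groupdict() if match else None


series = parse_path("/read/the-world-after-the-end/")
chapter = parse_path("/manga/some-series/chapter-1/")
other = parse_path("/comic/some-series/")
```

The downside is that the pattern accepts any prefix on any site, so a URL using /manga/ on a site that actually uses /read/ would still match.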

@enduser420
Contributor

That's alright.

@Dragonatorul
Author

Superseded by #5166
