F4CK V2PH

F4CK V2PH is a script for scraping images from the V2PH website. It supports album scraping, image URL extraction, and image downloading.

Main Features

Scrape Album URLs Based on Names: Enter a model's name to scrape the image albums associated with it.
Retrieve Image URLs from Albums: Fetch individual image URLs after scraping albums.
Download Images: Download images from URLs specified in a .txt file.

Demo

Scraping albums

The output is saved to albums_url\[NAME].txt
Example: albums_url\2024-03-22_Jangjoo.txt

https://www.v2ph.com/album/a9m6e8oa.html
https://www.v2ph.com/album/an3nx54z.html
https://www.v2ph.com/album/z69nxe8a.html

Scraping images

The output is saved to image_url\[NAME]\[ALBUM_TITLE].txt
Example: image_url\2024-03-22_Jangjoo\[ArtGravia] VOL.162 Jang Joo.txt

https://cdn.v2ph.com/photos/TJqv9cajZNKBo1xR.jpg
    out=[ArtGravia] VOL.162 Jang Joo 35.jpg
https://cdn.v2ph.com/photos/UMRnw_W4l9rOsTru.jpg
    out=[ArtGravia] VOL.162 Jang Joo 3.jpg
https://cdn.v2ph.com/photos/EwVJlPhy88rHjMFY.jpg
    out=[ArtGravia] VOL.162 Jang Joo 15.jpg
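
The file above follows the aria2c input-file layout: a URL line, then an indented option line such as out=. As a minimal sketch of how such a file could be produced (not the script's actual code), the hypothetical helper below writes one entry per image. The function name, its parameters, and the sequential numbering are assumptions made for illustration.

```python
import os
from pathlib import Path
from urllib.parse import urlparse


def write_aria2_input(album_title, image_urls, dest):
    """Write an aria2c input file (hypothetical helper, not part of the repo).

    Each URL is followed by an indented "out=" line naming the saved file.
    """
    lines = []
    for index, url in enumerate(image_urls):
        # Reuse the extension from the URL path, defaulting to .jpg.
        suffix = os.path.splitext(urlparse(url).path)[1] or ".jpg"
        lines.append(url)
        # aria2c treats lines starting with whitespace as options for the URL above.
        lines.append(f"    out={album_title} {index}{suffix}")
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return dest
```

Numbering here simply follows list order; the real script may assign indices differently, as the example output suggests.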

Note

You can turn off headless mode by setting headless=False in module\driver.py

[Demo video: menu-2.mp4]

Downloading images

The output is saved to images\[ALBUM_TITLE]\[ALBUM_TITLE] [INDEX].jpg
Example: images\[ArtGravia] VOL.154 Jang Joo

images\[ArtGravia] VOL.154 Jang Joo\[ArtGravia] VOL.154 Jang Joo 0.jpg
images\[ArtGravia] VOL.154 Jang Joo\[ArtGravia] VOL.154 Jang Joo 1.jpg
images\[ArtGravia] VOL.154 Jang Joo\[ArtGravia] VOL.154 Jang Joo 2.jpg
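
Since aria2c is listed as a prerequisite and the image URL files use its input-file layout, the download step can plausibly be sketched as a single aria2c call per album. The helper names below are hypothetical, and the choice of flags (--input-file, --dir, --continue) is an assumption about how the script invokes aria2c, not a confirmed detail.

```python
import subprocess
from pathlib import Path


def build_aria2_command(input_file, album_title, root="images"):
    """Build an aria2c command line for one album (hypothetical helper)."""
    target_dir = Path(root) / album_title
    return [
        "aria2c",
        f"--input-file={input_file}",  # URL list with out= options
        f"--dir={target_dir}",         # folder the images are saved into
        "--continue=true",             # resume partially downloaded files
    ]


def download_album(input_file, album_title, root="images"):
    """Run aria2c and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_aria2_command(input_file, album_title, root), check=True)
```

Passing the command as a list avoids shell quoting problems with album titles that contain spaces or brackets.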

[Demo video: menu-3.mp4]

Prerequisites

Python 3.8+ (Tested on Python 3.10.9)
Google Chrome (Tested on version 128.0.6613.138)
aria2c (Tested on version 1.37.0)
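
A quick way to confirm the prerequisites before running is a small check like the one below. It is only a sketch: the function name is hypothetical, and it covers the Python version and aria2c only, because the location of Google Chrome varies by platform.

```python
import shutil
import sys


def missing_prerequisites(min_python=(3, 8), tools=("aria2c",)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```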

Installation

Clone this repository:

git clone https://github.com/senhan07/V2PH.git

Navigate to the project directory:

cd V2PH

Install the required packages:
Using a virtual environment is recommended (optional):

python -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt

Or, without a virtual environment:

pip install -r requirements.txt

Run

python main.py

Notes

Each account has 16 tokens, which reset every 24 hours.
The script will automatically create a new account if all tokens are exhausted.
It prioritizes accounts with the most available tokens.
Before running, the script checks the last login time of each account and resets its tokens if 24 hours have passed.
Just watch out for IP bans :).
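
The token rules above can be illustrated with a short sketch. This is not the script's actual implementation: the Account fields and the function names are hypothetical, and only the 16-token limit, the 24-hour reset, and the most-tokens-first priority come from the notes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

MAX_TOKENS = 16                    # tokens per account, per the notes
RESET_AFTER = timedelta(hours=24)  # reset interval, per the notes


@dataclass
class Account:
    name: str             # hypothetical identifier
    tokens: int           # tokens currently available
    last_login: datetime  # expected to be updated by the caller on login


def refresh_tokens(accounts: List[Account], now: datetime) -> None:
    """Restore full tokens on accounts whose last login is 24+ hours old."""
    for account in accounts:
        if now - account.last_login >= RESET_AFTER:
            account.tokens = MAX_TOKENS


def pick_account(accounts: List[Account], now: datetime) -> Optional[Account]:
    """Return the account with the most tokens, or None if all are exhausted."""
    refresh_tokens(accounts, now)
    usable = [account for account in accounts if account.tokens > 0]
    if not usable:
        return None  # the caller would create a new account here
    return max(usable, key=lambda account: account.tokens)
```

Returning None rather than creating an account inside pick_account keeps the selection logic free of network calls, so it stays easy to test.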

Todo

Fix delay issue when checking existing scraped URLs.
Implement proxy support for IP rotation.
Bypass Cloudflare Turnstile.
~~Bypass Captcha with RecaptchaSolver~~.

Disclaimer

This script is purely intended for practicing my programming and logic skills, not because of any personal interest in the content itself.

Contributing

Contributions are welcome! If you encounter any issues or have suggestions for improvement, feel free to open an issue or submit a pull request.

License

This script is licensed under the MIT License.
