GitHub - vance-coder/proxy_pool: Proxy IP pool for Python3

proxy_pool (for personal use)

Proxy IP pool for Python3

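There are already plenty of proxy IP pools on GitHub, so why reinvent the wheel?

The existing pools crawl only a few sites, so getting more sources means writing your own extensions;
Most of them hand out general-purpose proxy IPs that you have to re-validate yourself;
A pool of my own fits my specific needs and can be extended however I like;
In short, nothing on GitHub works well for me and it is all frustrating to use, so watch me build one that is even worse and even harder to use!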

The sites currently crawled are:

云代理 www.ip3366.net
旗云代理 http://www.qydaili.com/
unknown http://www.goubanjia.com
快代理 http://www.kuaidaili.com/free/inha/
89免费代理 http://www.89ip.cn/index_1.html
IP海代理 http://www.iphai.com/free/ng
极速代理 http://www.superfastip.com/welcome/freeip/1
西刺代理 https://www.xicidaili.com/nn/
西拉免费代理IP http://www.xiladaili.com/https/1/ relatively high availability, but has anti-scraping limits
http://www.nimadaili.com/gaoni/ relatively high availability, but has anti-scraping limits
http://ip.kxdaili.com/ipList/1.html#ip
http://31f.cn/
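http://www.shenjidaili.com/shareip/ HTTP proxies (page format differs from the other sites; not handled yet)
http://www.66ip.cn/areaindex_19/1.html has anti-scraping limits; content is loaded dynamically via JS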
http://www.dlnyys.com/free/
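
Sites like the ones above typically list proxies in HTML tables, with the IP and port either in one cell or in adjacent cells. As a rough illustration of what a per-site extractor might look like, here is a regex-based sketch. It is not the code in ProxyPool.py or SearchProxyIP.py; the function name `extract_proxies`, the pattern, and the sample HTML are all hypothetical, and pages that build their rows with JS would need a different approach.

```python
import re

# An IPv4 address, a separator, then a port number.
# Assumption: IP and port appear either as "ip:port" or in adjacent <td> cells.
PROXY_RE = re.compile(
    r'(\d{1,3}(?:\.\d{1,3}){3})'              # IP address
    r'(?:\s*:\s*|\s*</td>\s*<td[^>]*>\s*)'    # ":" or a cell boundary
    r'(\d{2,5})\b'                            # port
)

def extract_proxies(html):
    """Return de-duplicated 'ip:port' strings found in html, in page order."""
    found = []
    for ip, port in PROXY_RE.findall(html):
        octets_ok = all(0 <= int(octet) <= 255 for octet in ip.split('.'))
        port_ok = 0 < int(port) <= 65535
        candidate = f'{ip}:{port}'
        if octets_ok and port_ok and candidate not in found:
            found.append(candidate)
    return found

# Hypothetical sample rows, including one invalid address that is filtered out.
sample = '''
<tr><td>1.2.3.4</td><td>8080</td><td>HTTP</td></tr>
<tr><td>5.6.7.8:3128</td></tr>
<tr><td>999.1.1.1</td><td>80</td></tr>
'''
print(extract_proxies(sample))  # ['1.2.3.4:8080', '5.6.7.8:3128']
```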

Requirements:

Python 3.6+
Redis

Setup:

pip install requests faker redis
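
Redis is listed as a requirement, presumably as the store for crawled proxies. The README does not say which data structure ProxyPool.py uses, so the following is only a sketch under the assumption that proxies are kept in a Redis set. The key name `proxy_pool:ips` and the helper names are hypothetical; `sadd`, `srandmember`, and `srem` are standard redis-py commands. The helpers take the client as an argument so they can be tried without a running server.

```python
POOL_KEY = 'proxy_pool:ips'  # hypothetical key name, not taken from ProxyPool.py

def save_proxies(client, proxies):
    """Add 'ip:port' strings to the set; return how many were new."""
    proxies = list(proxies)
    # SADD needs at least one member, so skip the call for an empty batch.
    return client.sadd(POOL_KEY, *proxies) if proxies else 0

def random_proxy(client):
    """Return one random proxy as str, or None if the pool is empty."""
    value = client.srandmember(POOL_KEY)
    if value is None:
        return None
    # redis-py returns bytes unless the client uses decode_responses=True.
    return value.decode() if isinstance(value, bytes) else value

def drop_proxy(client, proxy):
    """Remove a proxy, for example after it fails validation."""
    client.srem(POOL_KEY, proxy)

# Usage, assuming a local Redis server on the default port:
#   import redis
#   client = redis.Redis(host='localhost', port=6379, db=0)
#   save_proxies(client, ['1.2.3.4:8080'])
#   print(random_proxy(client))
```

A set gives de-duplication for free, which suits a pool that crawls overlapping sources; a sorted set would be the natural alternative if proxies were to be ranked by a score.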

Start crawling:

Adjust the init() parameters in ProxyPool.py to suit your setup
Start the crawler: python3 ProxyPool.py

Usage demo:

# Use together with the get_proxy_ip method of ProxyPool
import requests
from ProxyPool import ProxyPool

proxy_ip = ProxyPool().get_proxy_ip()

proxies = {
    'http': 'http://' + proxy_ip,
    'https': 'http://' + proxy_ip  # assumes a plain HTTP proxy; HTTPS requests are tunneled through it
}
res = requests.get('https://baidu.com/', proxies=proxies, timeout=5)
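
Free proxies fail often, which is why a second validation pass matters. Below is a minimal sketch of checking a proxy before relying on it, built on the demo above. `build_proxies` and `is_alive` are hypothetical helpers, not part of ProxyPool.py, and the test URL is simply the one used in the demo. It assumes `get_proxy_ip()` returns an `ip:port` string for a plain HTTP proxy.

```python
import requests

def build_proxies(proxy_ip):
    """Build the requests proxies mapping for an 'ip:port' string."""
    # Assumption: a plain HTTP proxy, so both schemes go through an http://
    # proxy URL (HTTPS requests are tunneled through it via CONNECT).
    proxy_url = 'http://' + proxy_ip
    return {'http': proxy_url, 'https': proxy_url}

def is_alive(proxy_ip, test_url='https://baidu.com/', timeout=5):
    """Return True if test_url answers without an error status via the proxy."""
    try:
        res = requests.get(test_url, proxies=build_proxies(proxy_ip), timeout=timeout)
        return res.ok
    except requests.RequestException:
        # Covers refused connections, proxy errors, and timeouts.
        return False
```

One way to use it is to call `is_alive(proxy_ip)` on the value returned by `get_proxy_ip()` and ask the pool for another proxy whenever it returns False.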

