<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Mr.Pan Blog</title>
<subtitle>You had me at hello</subtitle>
<link href="/atom.xml" rel="self"/>
<link href="http://plq.91sq.cc/"/>
<updated>2019-04-02T07:18:01.449Z</updated>
<id>http://plq.91sq.cc/</id>
<author>
<name>Mr.Pan</name>
</author>
<generator uri="http://hexo.io/">Hexo</generator>
<entry>
<title>Deploying a project with flask+nginx+uwsgi+supervisor</title>
<link href="http://plq.91sq.cc/2019/04/02/flask-nginx-uwsgi-supervisor%E9%A1%B9%E7%9B%AE%E9%83%A8%E7%BD%B2/"/>
<id>http://plq.91sq.cc/2019/04/02/flask-nginx-uwsgi-supervisor项目部署/</id>
<published>2019-04-02T07:12:56.000Z</published>
<updated>2019-04-02T07:18:01.449Z</updated>
<content type="html"><![CDATA[<h3 id="环境"><a href="#环境" class="headerlink" title="环境"></a>Environment</h3><pre><code>- Linux: Ubuntu 16.04
- uWSGI 2.0.18
- Flask 1.0.2
- supervisor 3.2.0
- nginx/1.8.1</code></pre>
<h3 id="首先区分几个概念"><a href="#首先区分几个概念" class="headerlink" title="首先区分几个概念"></a>First, a few concepts to tell apart</h3><p><img src="https://img2018.cnblogs.com/blog/778496/201904/778496-20190401002529915-1118976912.png" alt="uwsgi和WSGI协议"></p><a id="more"></a>
<ol><li><p>WSGI</p><ul><li>Web Server Gateway Interface</li><li>A specification: the interface between a web server and a web application (Django/Flask), and the bridge through which the two communicate.</li><li>It is a convention rather than a piece of software. It requires a WSGI application to be implemented as a callable object. Any application that follows the specification can run on any WSGI-compliant server.</li></ul></li>
<li><p>uWSGI</p><ul><li>A web server that implements the WSGI, uwsgi, and HTTP protocols, among others.</li><li>Written in C, so it is fast and stable. It receives the dynamic requests forwarded by the front-end server and hands them to the web application for processing.</li></ul></li>
<li><p>uwsgi<br> A wire protocol specific to the uWSGI server, used for communication between uWSGI and other servers (for example, Nginx).</p></li></ol>
<blockquote><p>In Django the entry file is wsgi.py, which is generated automatically when the project is created. The web server uses it to talk to Django: it exposes a callable application object whose class implements the __call__ method.</p></blockquote>
<blockquote><p>In Flask, the app = Flask(__name__) object defined in the entry file manager.py is the callable application that the web server talks to.</p></blockquote>
<h3 id="简单的服务器项目准备"><a href="#简单的服务器项目准备" class="headerlink" title="简单的服务器项目准备"></a>Preparing a simple server project</h3><p>Create a new project and write a simple Flask web app.<br>File: ~/Desktop/flask_deploy/manager.py<br><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"># coding=utf8</span><br><span class="line">from flask import Flask</span><br><span class="line"></span><br><span class="line">app = Flask(__name__)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">@app.route('/', methods=['GET'])</span><br><span class="line">def index():</span><br><span class="line">    return 'hello world'</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">if __name__ == '__main__':</span><br><span class="line">    app.run(debug=False)</span><br></pre></td></tr></table></figure></p>
<h3 id="1-配置python项目虚拟环境"><a href="#1-配置python项目虚拟环境" class="headerlink" title="1 配置python项目虚拟环境"></a>1 Set up a Python virtual environment for the project</h3><ul><li><p>Install the virtual environment management tools</p><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">pip install virtualenv virtualenvwrapper</span><br></pre></td></tr></table></figure></li><li><p>Edit the .bashrc file in your home directory and add the following</p><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">export WORKON_HOME=$HOME/.virtualenvs  # $HOME/.virtualenvs is where the virtual environments are installed</span><br><span class="line">source /usr/local/bin/virtualenvwrapper.sh</span><br></pre></td></tr></table></figure></li></ul>
<p>You can locate the script with whereis virtualenvwrapper.sh<br><a href="https://blog.csdn.net/l1902090/article/details/24887997" target="_blank" rel="noopener">Linux commands and file lookup</a></p>
<ul><li>Run the following command to apply the configuration<blockquote><p>source ~/.bashrc</p></blockquote></li><li>Related commands<figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">mkvirtualenv -p python3 env_name  # -p selects the Python interpreter</span><br><span class="line">workon + tab*2  # list the virtual environments on this machine</span><br><span class="line">workon env_name  # enter a virtual environment</span><br><span class="line">deactivate  # leave the virtual environment</span><br><span class="line">rmvirtualenv env_name  # delete a virtual environment</span><br></pre></td></tr></table></figure></li></ul>
<h3 id="2-uwsgi安装与配置"><a href="#2-uwsgi安装与配置" class="headerlink" title="2 uwsgi安装与配置"></a>2 Installing and configuring uWSGI</h3><p>Inside the current virtual environment, install the required packages</p>
<blockquote><p>pip install flask uwsgi<br>Create the file ~/Desktop/flask_deploy/uwsgi.ini in the project directory<br>vi uwsgi.ini<br><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">[uwsgi]</span><br><span class="line"># use socket communication when connecting through nginx</span><br><span class="line">socket=127.0.0.1:8000</span><br><span class="line"># use HTTP when running uWSGI's built-in web server directly</span><br><span class="line">#http=127.0.0.1:8000</span><br><span class="line"># project directory</span><br><span class="line">chdir=/home/python/Desktop/flask_deploy</span><br><span class="line"># Python virtual environment</span><br><span class="line">home=/home/python/.virtualenvs/deploy</span><br><span class="line"># WSGI file to load</span><br><span class="line">wsgi-file=manager.py</span><br><span class="line"># which variable in the loaded module uWSGI should call</span><br><span class="line">callable=app</span><br><span class="line"># number of worker processes</span><br><span class="line">processes=2</span><br><span class="line"># number of threads per worker process</span><br><span class="line">threads=2</span><br><span class="line"># write the master process pid to this file</span><br><span class="line">pidfile=%(chdir)/uwsgi.pid</span><br><span class="line"># log files</span><br><span class="line">req-logger=file:/home/python/Desktop/flask_deploy/log/req.log</span><br><span class="line">logger=file:/home/python/Desktop/flask_deploy/log/err.log</span><br><span class="line"></span><br><span class="line">#uid=xxx # user id uWSGI runs as; defaults to the user who started it</span><br><span class="line">#gid=xxx # group id uWSGI runs as</span><br><span class="line">#procname-prefix-spaced=site # prefix for the worker process names</span><br></pre></td></tr></table></figure></p></blockquote>
<p>There are several ways to specify the WSGI entry point in the configuration file</p>
<figure class="highlight bash"><table><tr><td class="code"><pre><span class="line"># WSGI file to load</span><br><span class="line">wsgi-file=manager.py</span><br><span class="line"># which variable in the loaded module uWSGI should call</span><br><span class="line">callable=app</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line"># module name:callable object app</span><br><span class="line">module=manager:app</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">module=manager</span><br><span class="line">callable=app</span><br></pre></td></tr></table></figure>
<h5 id="uwsgi相关命令"><a href="#uwsgi相关命令" class="headerlink" title="uwsgi相关命令"></a>uWSGI commands</h5><pre><code>uwsgi --ini uwsgi.ini   # start
uwsgi --stop uwsgi.pid  # stop
pkill -9 uwsgi          # stop</code></pre>
<h3 id="3-supervisor-安装与监控"><a href="#3-supervisor-安装与监控" class="headerlink" title="3 supervisor 安装与监控"></a>3 Installing supervisor and monitoring</h3><p>Overview: supervisor is a general-purpose process manager written in Python. It turns an ordinary command-line process into a background daemon, monitors its state, and restarts it automatically when it exits unexpectedly.</p>
<h5 id="安装"><a href="#安装" class="headerlink" title="安装:"></a>Installation:</h5><blockquote><p>apt-get install supervisor</p></blockquote>
<p>The default configuration file is /etc/supervisor/supervisord.conf. Put your own configuration files in the /etc/supervisor/conf.d/ directory; they must have the .conf extension.</p>
<h5 id="配置解释"><a href="#配置解释" class="headerlink" title="配置解释"></a>Configuration explained</h5><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">[program:uwsgi]</span><br><span class="line">command=/home/python/.virtualenvs/deploy/bin/uwsgi /home/python/Desktop/flask_deploy/uwsgi.ini</span><br><span class="line">user=root</span><br><span class="line">autostart=true</span><br><span class="line">autorestart=true</span><br><span class="line">stdout_logfile=/home/python/Desktop/flask_deploy/log/uwsgi_supervisor.log</span><br><span class="line">stderr_logfile=/home/python/Desktop/flask_deploy/log/uwsgi_supervisor_err.log</span><br></pre></td></tr></table></figure><pre><code>- [program:module_name]: the name of a supervisor program section
- command: the command that starts the program, e.g. /usr/bin/python app.py
- user: the user the process runs as
- autostart=true: start together with supervisor
- autorestart=true: restart automatically after a crash
- stdout_logfile, stderr_logfile: log files for standard output and standard error</code></pre>
<h5 id="启动supervisor"><a href="#启动supervisor" class="headerlink" title="启动supervisor"></a>Starting supervisor</h5><blockquote><p>sudo supervisord -c /etc/supervisor/supervisord.conf # supervisord.conf automatically includes the .conf files under conf.d/</p></blockquote>
<p>Related commands<br> 1️⃣supervisorctl status # show the running programs<br> 2️⃣supervisorctl start module_name # start a program<br> 3️⃣supervisorctl stop module_name # stop a program<br> 4️⃣supervisorctl shutdown # shut down supervisor and all its programs</p>
<p>After starting, you can check with ps -aux | grep that both uwsgi and supervisor are running.</p>
<h3 id="4-Nginx安装与配置"><a href="#4-Nginx安装与配置" class="headerlink" title="4 Nginx安装与配置"></a>4 Installing and configuring Nginx</h3><blockquote><p>apt-get install nginx<br>Installed under /etc/nginx/ by default<br>Configuration file: /etc/nginx/conf/flask_deploy.conf<br><figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">events {</span><br><span class="line">    worker_connections 1024;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">http {</span><br><span class="line">    include mime.types;</span><br><span class="line">    default_type application/octet-stream;</span><br><span class="line">    server {</span><br><span class="line">        listen 80;</span><br><span class="line">        server_name 127.0.0.1; # public address</span><br><span class="line"></span><br><span class="line">        location / {</span><br><span class="line">            include uwsgi_params;</span><br><span class="line">            uwsgi_pass 127.0.0.1:8000;</span><br><span class="line">        }</span><br><span class="line">    }</span><br><span class="line">}</span><br></pre></td></tr></table></figure></p></blockquote>
<p>Start</p><blockquote><p>/usr/sbin/nginx -c /etc/nginx/conf/flask_deploy.conf</p></blockquote>
<p>Related commands:<br> 1️⃣nginx -s reload<br> 2️⃣nginx -s stop </p>
<h4 id="nginx-详细介绍及语法参考-nginx-详细配置说明"><a href="#nginx-详细介绍及语法参考-nginx-详细配置说明" class="headerlink" title="nginx 详细介绍及语法参考:nginx:详细配置说明"></a>For a detailed introduction to nginx and its syntax, see: <a href="https://juejin.im/post/5bff57246fb9a049be5d3297#heading-39" target="_blank" rel="noopener">nginx: detailed configuration guide</a></h4>
<h5 id="不出意外的话浏览器访问-127-0-0-1即可出现hello-world。"><a href="#不出意外的话浏览器访问-127-0-0-1即可出现hello-world。" class="headerlink" title="不出意外的话浏览器访问:127.0.0.1即可出现hello world。"></a>If all goes well, visiting 127.0.0.1 in a browser shows hello world.</h5>
<h4 id="部署负载均衡"><a href="#部署负载均衡" class="headerlink" title="部署负载均衡"></a>Deploying load balancing</h4><p>To deploy load balancing with nginx + uWSGI + Flask + supervisor:</p>
<ol><li>Add a uwsgi2.ini file (a second uWSGI startup configuration) to the project directory, changing the socket address, pidfile, and log file paths.</li><li>Following the steps above, add a uwsgi2 program section with the corresponding settings to the supervisor configuration.</li><li>nginx load-balancing configuration<figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">events {</span><br><span class="line">    worker_connections 1024;</span><br><span class="line">}</span><br><span class="line"></span><br><span class="line">http {</span><br><span class="line">    include mime.types;</span><br><span class="line">    default_type application/octet-stream;</span><br><span class="line"></span><br><span class="line">    upstream flask {</span><br><span class="line">        server 127.0.0.1:8000;</span><br><span class="line">        server 127.0.0.1:8001;</span><br><span class="line">    }</span><br><span class="line"></span><br><span class="line">    server {</span><br><span class="line">        listen 80;</span><br><span class="line">        server_name 127.0.0.1; # public address</span><br><span class="line"></span><br><span class="line">        location / {</span><br><span class="line">            include uwsgi_params;</span><br><span class="line">            uwsgi_pass flask;</span><br><span class="line">        }</span><br><span class="line">    }</span><br><span class="line">}</span><br></pre></td></tr></table></figure></li></ol>
<p>That gives you a simple load-balanced server. Visit 127.0.0.1 while using tail to watch the request logs (req-logger) set in the two uWSGI configuration files, and you can see the traffic being distributed between them.</p>
<h3 id="小结"><a href="#小结" class="headerlink" title="小结"></a>Summary</h3><p>supervisor is a background process manager. It is not limited to monitoring the uWSGI server; it can also watch any other service that might crash unexpectedly.</p>
<h3 id="其他"><a href="#其他" class="headerlink" title="其他"></a>Other</h3><p>An alternative web server is Gunicorn, a Python HTTP server ported from Ruby's Unicorn. It is compatible with a wide range of frameworks, needs no configuration file, and is light on resources.</p>
<ol><li>Install<blockquote><p>pip install gunicorn<br>Start the server<br>gunicorn -w 4 -b 127.0.0.1:8080 manager:app --daemon # run as a daemon; defaults to False</p></blockquote></li></ol>
<h4 id="gunicorn-以配置文件方式启动"><a href="#gunicorn-以配置文件方式启动" class="headerlink" title="gunicorn 以配置文件方式启动"></a>Starting gunicorn from a configuration file</h4><p>File name: gunicorn.conf<br><figure class="highlight python"><table><tr><td class="code"><pre><span class="line"># IP address and port the web server listens on</span><br><span class="line">bind = '127.0.0.1:8080'</span><br><span class="line"># number of worker processes</span><br><span class="line">workers = 4</span><br><span class="line"># run the server in the background</span><br><span class="line">daemon = True</span><br><span class="line"># file that stores the master process id</span><br><span class="line">pidfile = 'gunicorn.pid'</span><br><span class="line"># access.log is created after the server starts and stores the access log</span><br><span class="line">accesslog = 'access.log'</span><br><span class="line"># error.log is created after the server starts and stores the error log</span><br><span class="line">errorlog = 'error.log'</span><br></pre></td></tr></table></figure></p>
<p>How to start:</p><blockquote><p>gunicorn -c gunicorn.conf manager:app</p></blockquote>
<h3 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h3><p> <a href="https://www.liaoxuefeng.com/article/0013738926914703df5e93589a14c19807f0e285194fe84000" target="_blank" rel="noopener">supervisor: a powerful tool for managing Linux background processes</a><br> <a href="http://xuxping.com/2017/07/09/flask+nginx+uwsgi+supervisor%E9%A1%B9%E7%9B%AE%E9%83%A8%E7%BD%B2/" target="_blank" rel="noopener">Deploying a project with flask+nginx+uwsgi+supervisor</a></p>]]></content>
<summary type="html">
<h3 id="环境"><a href="#环境" class="headerlink" title="环境"></a>Environment</h3><pre><code>- Linux: Ubuntu 16.04
- uWSGI 2.0.18
- Flask 1.0.2
- supervisor 3.2.0
- nginx/1.8.1
</code></pre><h3 id="首先区分几个概念"><a href="#首先区分几个概念" class="headerlink" title="首先区分几个概念"></a>First, a few concepts to tell apart</h3><p><img src="https://img2018.cnblogs.com/blog/778496/201904/778496-20190401002529915-1118976912.png" alt="uwsgi和WSGI协议"></p>
</summary>
<category term="python学习" scheme="http://plq.91sq.cc/categories/python%E5%AD%A6%E4%B9%A0/"/>
<category term="python" scheme="http://plq.91sq.cc/tags/python/"/>
<category term="deploy" scheme="http://plq.91sq.cc/tags/deploy/"/>
</entry>
<entry>
<title>Read large file with python</title>
<link href="http://plq.91sq.cc/2019/03/29/Read-large-file-with-python/"/>
<id>http://plq.91sq.cc/2019/03/29/Read-large-file-with-python/</id>
<published>2019-03-28T16:35:44.000Z</published>
<updated>2019-03-28T16:45:00.757Z</updated>
<content type="html"><![CDATA[<h4 id="python读取大文件"><a href="#python读取大文件" class="headerlink" title="python读取大文件"></a>Reading large files in Python</h4><ol><li>The more Pythonic way: use a with block <ul><li>The file is closed automatically</li><li>Exceptions can be handled inside the with block<figure class="highlight python"><table><tr><td class="code"><pre><span class="line">with open(filename, 'rb') as f:</span><br><span class="line">    for line in f:</span><br><span class="line">        pass  # do something with the line</span><br></pre></td></tr></table></figure></li></ul></li></ol><a id="more"></a>
<p><strong>The biggest advantage</strong>: iterating over the file object f with for line in f automatically uses buffered I/O and memory management, so you do not need to worry about the file being large.</p>
<blockquote><p>There should be one – and preferably only one – obvious way to do it.</p></blockquote>
<ol start="2"><li>Use a generator </li></ol>
<p>For finer-grained control over what is read on each iteration, read the large file with a yield-based generator<br><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">def readInChunks(file_obj, chunkSize=2048):</span><br><span class="line">    """</span><br><span class="line">    Lazy function to read a file piece by piece.</span><br><span class="line">    Default chunk size: 2kB.</span><br><span class="line">    """</span><br><span class="line">    while True:</span><br><span class="line">        data = file_obj.read(chunkSize)</span><br><span class="line">        if not data:</span><br><span class="line">            break</span><br><span class="line">        yield data</span><br><span class="line"></span><br><span class="line"># do_something is a placeholder for your own processing</span><br><span class="line">with open('bigFile') as f:</span><br><span class="line">    for chunk in readInChunks(f):</span><br><span class="line">        do_something(chunk)</span><br></pre></td></tr></table></figure></p>
<ol start="3"><li>On Linux, use the split command (splits a file into several smaller files by size or by line count) </li></ol>
<figure class="highlight bash"><table><tr><td class="code"><pre><span class="line">wc -l BLM.txt  # count the lines in BLM.txt</span><br><span class="line"># split the file</span><br><span class="line">split -l 2482 ../BLM/BLM.txt -d -a 4 BLM_</span><br><span class="line"># splits BLM.txt into smaller files of 2482 lines each (-l 2482), with the prefix BLM_, numeric rather than alphabetic suffixes (-d), and four-digit suffixes (-a 4)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"># split by line count</span><br><span class="line">split -l 300 large_file.txt new_file_prefix</span><br><span class="line"># split by file size</span><br><span class="line">split -b 10m server.log waynelog</span><br><span class="line"></span><br><span class="line"># merge the pieces with redirection: '&gt;' writes to a file, '&gt;&gt;' appends to it</span><br><span class="line">cat file_prefix* &gt; large_file</span><br></pre></td></tr></table></figure>
<blockquote><p>In day-to-day work, user data, log caches, and the like are all large files.</p></blockquote>
<h4 id="补充:linecache模块"><a href="#补充:linecache模块" class="headerlink" title="补充:linecache模块"></a>Addendum: the linecache module</h4><p>When a file is read through linecache, Python tries to serve its content from a cache, which speeds up reads and reduces I/O operations. </p>
<blockquote><p>linecache.getline(filename, lineno): read line number lineno from the file; note that the result includes the trailing newline<br>linecache.clearcache(): clear the current file cache<br>linecache.checkcache(filename=None): check that the cached content is still valid, since the file on disk may have changed; with no argument, it checks every entry in the cache<br><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">import linecache</span><br><span class="line">linecache.getline(linecache.__file__, 8)</span><br></pre></td></tr></table></figure></p></blockquote>
<p>Exercise:<br>Given a 400 MB file (generated from /etc/passwd), count how many times the string root appears in it<br><figure class="highlight python"><table><tr><td class="code"><pre><span class="line">import time</span><br><span class="line">total = 0  # running count of 'root' occurrences</span><br><span class="line">start = time.time()</span><br><span class="line">with open('file', 'r') as f:</span><br><span class="line">    for line in f:</span><br><span class="line">        total += line.count('root')</span><br><span class="line">end = time.time()</span><br><span class="line">print(total, end - start)</span><br></pre></td></tr></table></figure></p>
<p><strong>Note</strong>: this program can sometimes run up to 10 times faster than an equivalent C or shell version. The likely reason is that Python reads the data through its buffer and uses caching internally, which reduces I/O.</p>
<p>References : <a href="https://stackoverflow.com/questions/8009882/how-to-a-read-large-file-line-by-line-in-python" target="_blank" rel="noopener">How to read a large file</a></p>]]></content>
<summary type="html">
<h4 id="python读取大文件"><a href="#python读取大文件" class="headerlink" title="python读取大文件"></a>Reading large files in Python</h4><ol>
<li>The more Pythonic way: use a with block <ul>
<li>The file is closed automatically</li>
<li>Exceptions can be handled inside the with block<figure class="highlight python"><table><tr><td class="code"><pre><span class="line">with open(filename, 'rb') as f:</span><br><span class="line">    for line in f:</span><br><span class="line">        pass  # do something with the line</span><br></pre></td></tr></table></figure>
</li>
</ul>
</li>
</ol>
</summary>
<category term="python学习" scheme="http://plq.91sq.cc/categories/python%E5%AD%A6%E4%B9%A0/"/>
<category term="python" scheme="http://plq.91sq.cc/tags/python/"/>
</entry>
<entry>
<title>Five ways to write the Fibonacci sequence in Python</title>
<link href="http://plq.91sq.cc/2018/07/13/%E6%96%90%E6%B3%A2%E9%82%A3%E5%A5%91%E6%95%B0%E5%88%97%E7%9A%845%E7%A7%8Dpython%E5%86%99%E6%B3%95/"/>
<id>http://plq.91sq.cc/2018/07/13/斐波那契数列的5种python写法/</id>
<published>2018-07-13T10:57:07.000Z</published>
<updated>2019-03-28T12:09:42.556Z</updated>
<content type="html"><![CDATA[<p> 斐波那契数列(Fibonacci sequence),又称黄金分割数列、因数学家<strong>列昂纳多·斐波那契</strong>(Leonardoda Fibonacci)以兔子繁殖为例子而引入,故又称为“兔子数列”,指的是这样一个数列:1、1、2、3、5、8、13、21、34、……在数学上,斐波纳契数列以如下被以递归的方法定义:F(1)=1,F(2)=1, F(n)=F(n-1)+F(n-2)(n>=2,n∈N*)</p><a id="more"></a><p><img src="/assets/blogimg/fibonacci1.jpg" alt=""></p><blockquote><p>斐波那契数列,难点在于算法,还有如果变成生成器,generator,就要用for循环去遍历可迭代的generator </p></blockquote><h4 id="第一种递归法"><a href="#第一种递归法" class="headerlink" title="第一种递归法"></a>第一种递归法</h4><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">fib_recur</span><span class="params">(n)</span>:</span></span><br><span class="line"> <span class="keyword">assert</span> n >= <span class="number">0</span>, <span class="string">"n > 0"</span></span><br><span class="line"> <span class="keyword">if</span> n <= <span class="number">1</span>:</span><br><span class="line"> <span class="keyword">return</span> n</span><br><span class="line"> <span class="keyword">return</span> fib_recur(n<span class="number">-1</span>) + fib_recur(n<span class="number">-2</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> range(<span class="number">1</span>, <span class="number">20</span>):</span><br><span class="line"> print(fib_recur(i), end=<span class="string">' '</span>)</span><br></pre></td></tr></table></figure><blockquote><p>写法最简洁,但是效率最低,会出现大量的重复计算,时间复杂度O(1.618^n),而且最深度1000 </p></blockquote><h4 id="第二种递推法"><a href="#第二种递推法" class="headerlink" title="第二种递推法"></a>第二种递推法</h4><figure class="highlight python"><table><tr><td 
class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">fib_loop</span><span class="params">(n)</span>:</span></span><br><span class="line"> a, b = <span class="number">0</span>, <span class="number">1</span></span><br><span class="line"> <span class="keyword">for</span> i <span class="keyword">in</span> range(n+<span class="number">1</span>):</span><br><span class="line"> a, b = b, a+b</span><br><span class="line"> <span class="keyword">return</span> a</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> range(<span class="number">20</span>):</span><br><span class="line"> print(fib_loop(i), end=<span class="string">' '</span>)</span><br></pre></td></tr></table></figure><blockquote><p>递推法,就是递增法,时间复杂度是 O(n),呈线性增长,如果数据量巨大,速度会越拖越慢 </p></blockquote><h4 id="第三种生成器"><a href="#第三种生成器" class="headerlink" title="第三种生成器"></a>第三种生成器</h4><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">fib_loop_while</span><span class="params">(max)</span>:</span></span><br><span class="line"> a, b = <span class="number">0</span>, <span class="number">1</span></span><br><span class="line"> <span class="keyword">while</span> max > <span 
class="number">0</span>:</span><br><span class="line">        a, b = b, a+b</span><br><span class="line">        max -= <span class="number">1</span></span><br><span class="line">        <span class="keyword">yield</span> a</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> fib_loop_while(<span class="number">10</span>):</span><br><span class="line">    print(i, end=<span class="string">' '</span>)</span><br></pre></td></tr></table></figure><blockquote><p>Any function that contains yield is a generator function. The generator it returns is an iterable that provides both __iter__ and __next__, so its elements can be retrieved by iterating over it.</p></blockquote><h4 id="第四种类实现内部魔法方法"><a href="#第四种类实现内部魔法方法" class="headerlink" title="第四种类实现内部魔法方法"></a>Method 4: a class that implements the magic methods</h4><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Fibonacci</span><span class="params">(object)</span>:</span></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__init__</span><span class="params">(self, num)</span>:</span></span><br><span class="line">        self.num = num</span><br><span class="line"></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__iter__</span><span class="params">(self)</span>:</span></span><br><span class="line">        <span 
class="keyword">if</span> self.num &lt; <span class="number">1</span>:</span><br><span class="line">            <span class="keyword">return</span></span><br><span class="line">        a, b, n = <span class="number">0</span>, <span class="number">1</span>, self.num</span><br><span class="line">        <span class="keyword">while</span> n &gt; <span class="number">0</span>:</span><br><span class="line">            a, b = a + b, a</span><br><span class="line">            n -= <span class="number">1</span></span><br><span class="line">            <span class="keyword">yield</span> a</span><br><span class="line"></span><br><span class="line">    <span class="comment"># __iter__ is a generator function, so the generator it returns</span></span><br><span class="line">    <span class="comment"># already provides __next__; the class needs no __next__ of its own.</span></span><br><span class="line"></span><br><span class="line">f = Fibonacci(<span class="number">15</span>)</span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> f:</span><br><span class="line">    print(i)</span><br></pre></td></tr></table></figure><h4 id="第五种-矩阵"><a href="#第五种-矩阵" class="headerlink" title="第五种-矩阵"></a>Method 5: matrix</h4><p><img src="/assets/blogimg/fibonacci2.png" alt=""><br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span 
class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">### 1</span></span><br><span class="line"><span class="keyword">import</span> numpy</span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">fib_matrix</span><span class="params">(n)</span>:</span></span><br><span class="line">    res = pow((numpy.matrix([[<span class="number">1</span>, <span class="number">1</span>], [<span class="number">1</span>, <span class="number">0</span>]])), n) * numpy.matrix([[<span class="number">1</span>], [<span class="number">0</span>]])</span><br><span class="line">    <span class="keyword">return</span> res[<span class="number">0</span>, <span class="number">0</span>]</span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> range(<span class="number">10</span>):</span><br><span class="line">    print(int(fib_matrix(i)), end=<span class="string">' '</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">### 2</span></span><br><span class="line"><span class="comment"># Compute the Fibonacci sequence with a matrix</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">Fibonacci_Matrix_tool</span><span class="params">(n)</span>:</span></span><br><span class="line">    Matrix = numpy.matrix(<span class="string">"1 1;1 0"</span>)</span><br><span class="line">    <span class="comment"># The result is of type matrix</span></span><br><span class="line">    <span class="keyword">return</span> pow(Matrix, n)  <span class="comment"># pow(Matrix, n) is equivalent to Matrix ** n</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">Fibonacci_Matrix</span><span class="params">(n)</span>:</span></span><br><span class="line">    result_list = []</span><br><span class="line">    <span class="keyword">for</span> i <span class="keyword">in</span> range(<span class="number">0</span>, 
n):</span><br><span class="line">        result_list.append(numpy.array(Fibonacci_Matrix_tool(i))[<span class="number">0</span>][<span class="number">0</span>])</span><br><span class="line">    <span class="keyword">return</span> result_list</span><br><span class="line"><span class="comment"># Call it</span></span><br><span class="line">Fibonacci_Matrix(<span class="number">10</span>)</span><br></pre></td></tr></table></figure></p><blockquote><p>Because exponentiation can be sped up by repeated squaring, the matrix method needs only O(log n) matrix multiplications.<br>Here the matrix method is implemented with the scientific computing package numpy.</p></blockquote>]]></content>
<summary type="html">
<p>The Fibonacci sequence, also known as the golden-ratio sequence, was introduced by the mathematician <strong>Leonardo Fibonacci</strong> using rabbit breeding as an example, so it is also called the "rabbit sequence". It is the sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, ... Mathematically, it is defined recursively: F(1)=1, F(2)=1, F(n)=F(n-1)+F(n-2) (n&gt;=3, n∈N*)</p>
</summary>
<category term="算法" scheme="http://plq.91sq.cc/categories/%E7%AE%97%E6%B3%95/"/>
<category term="python" scheme="http://plq.91sq.cc/tags/python/"/>
<category term="面试题" scheme="http://plq.91sq.cc/tags/%E9%9D%A2%E8%AF%95%E9%A2%98/"/>
</entry>
<entry>
<title>Learning the async coroutine modules: async &amp; aiohttp</title>
<link href="http://plq.91sq.cc/2018/07/09/%E5%BC%82%E6%AD%A5%E5%8D%8F%E7%A8%8B%E6%A8%A1%E5%9D%97async-aiohttp%E5%AD%A6%E4%B9%A0/"/>
<id>http://plq.91sq.cc/2018/07/09/异步协程模块async-aiohttp学习/</id>
<published>2018-07-09T15:17:56.000Z</published>
<updated>2019-03-28T12:07:17.901Z</updated>
<content type="html"><![CDATA[<h2 id="异步爬虫:async-await-与aiohttp学习"><a href="#异步爬虫:async-await-与aiohttp学习" class="headerlink" title="异步爬虫:async/await 与aiohttp学习"></a>Async crawlers: learning async/await and aiohttp</h2><blockquote><p>Python ships with asyncio, whose programming model is a message loop.<br> We obtain a reference to an EventLoop directly from the asyncio module,<br> then hand the coroutines we want to run to that EventLoop, which gives us asynchronous I/O.</p></blockquote><a id="more"></a><p><img src="/assets/blogimg/asyncio.png" alt=""></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> threading</span><br><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"><span class="comment"># @asyncio.coroutine marks a generator as a coroutine (deprecated since Python 3.8, removed in 3.11)</span></span><br><span class="line"><span class="meta">@asyncio.coroutine</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">hello</span><span class="params">()</span>:</span></span><br><span class="line">    print(<span class="string">"Hello world! 
(%s)"</span> % threading.current_thread())</span><br><span class="line">    <span class="comment"># Asynchronous call to asyncio.sleep(1); asyncio.sleep(1) is itself a coroutine that stands in for a slow I/O operation</span></span><br><span class="line">    r = <span class="keyword">yield</span> <span class="keyword">from</span> asyncio.sleep(<span class="number">1</span>)</span><br><span class="line">    <span class="comment"># print(r)</span></span><br><span class="line">    print(<span class="string">"Hello world! (%s)"</span> % threading.current_thread())</span><br><span class="line"></span><br><span class="line"><span class="comment"># Get the EventLoop (the event queue)</span></span><br><span class="line">loop = asyncio.get_event_loop()</span><br><span class="line"><span class="comment"># Build the task list</span></span><br><span class="line">tasks = [hello(), hello()]</span><br><span class="line"><span class="comment"># Run the coroutines</span></span><br><span class="line">loop.run_until_complete(asyncio.wait(tasks))</span><br><span class="line">loop.close()</span><br></pre></td></tr></table></figure><h3 id="新语法-python3-5-async-await"><a href="#新语法-python3-5-async-await" class="headerlink" title="新语法(python3.5+,async/await)"></a>New syntax (Python 3.5+, async/await)</h3><blockquote><p>Requires Python 3.5 or later.<br>Replace @asyncio.coroutine with async, and replace yield from with await.</p></blockquote><ul><li>async plays the same role as @asyncio.coroutine: it declares that the function is a coroutine</li><li>await calls another coroutine and suspends the current one until it finishes</li><li>Since Python 3.6, yield may also appear inside an async def function, which turns it into an asynchronous generator rather than a plain coroutine</li></ul><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">async</span> <span class="function"><span class="keyword">def</span> <span class="title">hello2</span><span class="params">()</span>:</span></span><br><span class="line">    print(<span class="string">"New syntax: Hello World"</span>)</span><br><span class="line">    <span class="comment"># 
asyncio.sleep(1) is itself a coroutine that stands in for a slow I/O operation</span></span><br><span class="line">    r = <span class="keyword">await</span> asyncio.sleep(<span class="number">1</span>)</span><br><span class="line">    print(<span class="string">"New syntax: Hello again"</span>)</span><br></pre></td></tr></table></figure><h3 id="使用session获取数据,session可以进行多项操作,比如post-get-put-option等等"><a href="#使用session获取数据,session可以进行多项操作,比如post-get-put-option等等" class="headerlink" title="使用session获取数据,session可以进行多项操作,比如post, get, put, option等等"></a>Fetching data with a session; a session supports many request methods such as post, get, put and options</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">headers = {<span class="string">'content-type'</span>: <span class="string">'application/json'</span>}</span><br><span class="line"><span class="keyword">async</span> <span class="function"><span class="keyword">def</span> <span class="title">__fetch2</span><span class="params">()</span>:</span></span><br><span class="line">    <span class="keyword">async</span> <span class="keyword">with</span> aiohttp.ClientSession() <span class="keyword">as</span> session:</span><br><span class="line">        proxy_auth = aiohttp.BasicAuth(<span class="string">'user'</span>, <span class="string">'pass'</span>)</span><br><span class="line">        <span class="keyword">async</span> <span class="keyword">with</span> session.get(<span class="string">'http://httpbin.org'</span>,</span><br><span class="line">                               headers=headers,</span><br><span class="line">                               proxy=<span class="string">"http://proxy.com"</span>,</span><br><span class="line">                               
proxy_auth=proxy_auth) <span class="keyword">as</span> resp:</span><br><span class="line">            <span class="comment"># Alternative: put the credentials in the proxy URL</span></span><br><span class="line">            <span class="comment"># proxy = "http://user:pass@proxy.com"</span></span><br><span class="line">            print(resp.status)</span><br><span class="line">            print(<span class="keyword">await</span> resp.text())</span><br></pre></td></tr></table></figure><h3 id="aiohttp客户端"><a href="#aiohttp客户端" class="headerlink" title="aiohttp客户端"></a>aiohttp client</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> asyncio</span><br><span class="line"><span class="keyword">import</span> aiohttp</span><br><span class="line"></span><br><span class="line">URL = <span class="string">"http://www.httpbin.org/get"</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">async</span> <span class="function"><span class="keyword">def</span> <span class="title">fetch</span><span class="params">(session)</span>:</span></span><br><span class="line">    <span class="keyword">async</span> <span class="keyword">with</span> session.get(URL) <span class="keyword">as</span> response:</span><br><span class="line">        <span class="keyword">return</span> <span 
class="keyword">await</span> response.text()</span><br><span class="line"></span><br><span class="line"><span class="keyword">async</span> <span class="function"><span class="keyword">def</span> <span class="title">main</span><span class="params">(loop)</span>:</span></span><br><span class="line">    <span class="keyword">async</span> <span class="keyword">with</span> aiohttp.ClientSession(loop=loop) <span class="keyword">as</span> session:  <span class="comment"># the official docs recommend creating a session</span></span><br><span class="line">        <span class="comment"># Create several tasks</span></span><br><span class="line">        tasks = [loop.create_task(fetch(session)) <span class="keyword">for</span> _ <span class="keyword">in</span> range(<span class="number">3</span>)]</span><br><span class="line">        finished, unfinished = <span class="keyword">await</span> asyncio.wait(tasks)</span><br><span class="line">        all_results = [r.result() <span class="keyword">for</span> r <span class="keyword">in</span> finished]</span><br><span class="line">        print(len(all_results))</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">'__main__'</span>:</span><br><span class="line">    loop = asyncio.get_event_loop()</span><br><span class="line">    loop.run_until_complete(main(loop))</span><br><span class="line">    loop.close()</span><br></pre></td></tr></table></figure><h3 id="aiohttp服务器"><a href="#aiohttp服务器" class="headerlink" title="aiohttp服务器,"></a>aiohttp server</h3><blockquote><p>The aiohttp server is also a lightweight server framework. asyncio implements protocols such as TCP, UDP and SSL, and aiohttp is an HTTP framework built on top of asyncio.<br>asyncio makes concurrent I/O possible in a single thread. Used only on the client side, its benefit is limited. On the server side, for example in a web server, HTTP connections are I/O operations, so a single thread plus coroutines can serve many users with high concurrency.</p></blockquote><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> aiohttp <span class="keyword">import</span> web</span><br><span class="line"></span><br><span class="line"><span class="keyword">async</span> <span class="function"><span class="keyword">def</span> <span class="title">handle</span><span class="params">(request)</span>:</span></span><br><span class="line">    name = request.match_info.get(<span class="string">'name'</span>, <span class="string">"Anonymous"</span>)</span><br><span class="line">    text = <span class="string">"Hello, "</span> + name</span><br><span class="line">    <span class="keyword">return</span> web.Response(text=text)</span><br><span class="line"></span><br><span class="line">app = web.Application(debug=<span class="keyword">True</span>)</span><br><span class="line">app.add_routes([web.get(<span class="string">'/'</span>, handle),</span><br><span class="line">                web.get(<span class="string">'/{name}'</span>, handle)])</span><br><span class="line"></span><br><span class="line">web.run_app(app, host=<span class="string">"127.0.0.1"</span>, port=<span class="number">8000</span>)</span><br></pre></td></tr></table></figure><h4 id="可学习的写法,"><a href="#可学习的写法," class="headerlink" title="可学习的写法,"></a>A pattern worth learning</h4><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span 
class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> bs4 <span class="keyword">import</span> BeautifulSoup <span class="keyword">as</span> bs  <span class="comment"># assumed import: bs is taken to be BeautifulSoup</span></span><br><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">parseListPage</span><span class="params">()</span>:</span></span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__init__</span><span class="params">(self,page_str)</span>:</span></span><br><span class="line">        self.page_str = page_str</span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__enter__</span><span class="params">(self)</span>:</span></span><br><span class="line">        page_str = self.page_str</span><br><span class="line">        page = bs(page_str,<span class="string">'lxml'</span>)</span><br><span class="line">        <span class="comment"># Collect the article links</span></span><br><span class="line">        articles = page.find_all(<span class="string">'div'</span>,attrs={<span class="string">'class'</span>:<span class="string">'article_title'</span>})</span><br><span class="line">        art_urls = []</span><br><span class="line">        <span class="keyword">for</span> a <span class="keyword">in</span> articles:</span><br><span class="line">            x = a.find(<span class="string">'a'</span>)[<span class="string">'href'</span>]</span><br><span class="line">            art_urls.append(<span class="string">'http://blog.csdn.net'</span>+x)</span><br><span class="line">        <span class="keyword">return</span> art_urls</span><br><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__exit__</span><span class="params">(self, exc_type, exc_val, exc_tb)</span>:</span></span><br><span class="line">        <span class="keyword">pass</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">with</span> parseListPage(ret) <span class="keyword">as</span> tmp:</span><br><span class="line">    
articles_url += tmp</span><br></pre></td></tr></table></figure><p>References:<br><a href="https://aiohttp.readthedocs.io" target="_blank" rel="noopener">Official documentation</a><br><a href="https://blog.csdn.net/u014595019/article/details/52295642/" target="_blank" rel="noopener">multiangle's blog post</a><br><a href="https://www.liaoxuefeng.com">www.liaoxuefeng.com</a></p>]]></content>
<summary type="html">
<h2 id="异步爬虫:async-await-与aiohttp学习"><a href="#异步爬虫:async-await-与aiohttp学习" class="headerlink" title="异步爬虫:async/await 与aiohttp学习"></a>Async crawlers: learning async/await and aiohttp</h2><blockquote>
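<p>Python ships with asyncio, whose programming model is a message loop.<br> We obtain a reference to an EventLoop directly from the asyncio module,<br> then hand the coroutines we want to run to that EventLoop, which gives us asynchronous I/O.</p>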
</blockquote>
</summary>
<category term="python学习" scheme="http://plq.91sq.cc/categories/python%E5%AD%A6%E4%B9%A0/"/>
<category term="asyncio" scheme="http://plq.91sq.cc/tags/asyncio/"/>
</entry>
<entry>
<title>Learning the PIL module: a powerful image processing package</title>
<link href="http://plq.91sq.cc/2018/07/01/PIL%E6%A8%A1%E5%9D%97%E5%AD%A6%E4%B9%A0-%E5%BC%BA%E5%A4%A7%E7%9A%84%E5%9B%BE%E5%83%8F%E5%A4%84%E7%90%86%E5%8C%85/"/>
<id>http://plq.91sq.cc/2018/07/01/PIL模块学习-强大的图像处理包/</id>
<published>2018-07-01T12:43:47.000Z</published>
<updated>2019-03-28T12:09:17.392Z</updated>
<content type="html"><![CDATA[<p>PIL (Python Imaging Library) is a powerful image processing package with a simple, easy-to-use API. PIL itself only supports up to Python 2.7; on Python 3, install Pillow instead (a compatible fork later put together by a group of volunteers).</p><a id="more"></a><blockquote><p>pip3 install pillow </p></blockquote><p>Using the API:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 1. Open an image: pass a file path or a file name</span></span><br><span class="line"><span class="keyword">from</span> PIL <span class="keyword">import</span> Image</span><br><span class="line">im = Image.open(<span class="string">'filename'</span>)</span><br><span class="line">im = Image.open(<span class="string">'/Users/michael/test.jpg'</span>)</span><br><span class="line"><span class="comment"># 2. Get the image size:</span></span><br><span class="line">w, h = im.size</span><br><span class="line"><span class="comment"># Scale down to 50%</span></span><br><span class="line">im.thumbnail((w//<span class="number">2</span>, h//<span class="number">2</span>))</span><br><span class="line"><span class="comment"># 3. Display the image</span></span><br><span class="line">im.show()</span><br></pre></td></tr></table></figure></p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 
4. Save the image</span></span><br><span class="line"><span class="comment"># Save the image as GIF, JPEG, etc.</span></span><br><span class="line">im.save(<span class="string">'save.gif'</span>, <span class="string">"GIF"</span>)</span><br><span class="line">im.save(<span class="string">'save.jpg'</span>, <span class="string">"JPEG"</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># 5. Crop the image</span></span><br><span class="line"><span class="comment"># Define the crop region as (left, upper, right, lower)</span></span><br><span class="line">box = (<span class="number">100</span>, <span class="number">100</span>, <span class="number">400</span>, <span class="number">400</span>)</span><br><span class="line"><span class="comment"># Note that the box is a tuple; many methods of the im object take tuples</span></span><br><span class="line">region = im.crop(box)  <span class="comment"># returns a new image object</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 6. Paste (merge) images</span></span><br><span class="line">im.paste(region, box)  <span class="comment"># paste the box-sized region back into the original image object</span></span><br></pre></td></tr></table></figure><p>Other features such as slicing, rotation, filters, text rendering and palettes are all available.<br>Blur effect:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> PIL <span class="keyword">import</span> Image, ImageFilter</span><br><span class="line"><span class="comment"># Open a JPEG image file; note that the path is relative to the current directory:</span></span><br><span class="line">im = Image.open(<span class="string">'test.jpg'</span>)</span><br><span class="line"><span class="comment"># Apply the blur filter:</span></span><br><span class="line">im2 = im.filter(ImageFilter.BLUR)</span><br><span class="line">im2.save(<span class="string">'blur.jpg'</span>, <span class="string">'jpeg'</span>)</span><br></pre></td></tr></table></figure></p><p><img 
src="https://cdn.liaoxuefeng.com/cdn/files/attachments/001407671964310a6b503be6fcb4648928e2e4c522d04c7000" alt=""></p><p>Reference: <a href="https://www.liaoxuefeng.com">Liao Xuefeng's official website</a></p>]]></content>
<summary type="html">
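<p>PIL (Python Imaging Library) is a powerful image processing package with a simple, easy-to-use API. PIL itself only supports up to Python 2.7; on Python 3, install Pillow instead (a compatible fork later put together by a group of volunteers).</p>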
</summary>
<category term="python模块" scheme="http://plq.91sq.cc/tags/python%E6%A8%A1%E5%9D%97/"/>
</entry>
<entry>
<title>Hello World</title>
<link href="http://plq.91sq.cc/2018/05/27/hello-world/"/>
<id>http://plq.91sq.cc/2018/05/27/hello-world/</id>
<published>2018-05-27T13:53:27.455Z</published>
<updated>2018-05-31T05:44:28.968Z</updated>
<content type="html"><![CDATA[<p>Welcome to <a href="https://hexo.io/" target="_blank" rel="noopener">Hexo</a>! This is your very first post. Check <a href="https://hexo.io/docs/" target="_blank" rel="noopener">documentation</a> for more info. If you get any problems when using Hexo, you can find the answer in <a href="https://hexo.io/docs/troubleshooting.html" target="_blank" rel="noopener">troubleshooting</a> or you can ask me on <a href="https://github.com/hexojs/hexo/issues" target="_blank" rel="noopener">GitHub</a>.</p><h2 id="Quick-Start"><a href="#Quick-Start" class="headerlink" title="Quick Start"></a>Quick Start</h2><h3 id="Create-a-new-post"><a href="#Create-a-new-post" class="headerlink" title="Create a new post"></a>Create a new post</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo new <span class="string">"My New Post"</span></span><br></pre></td></tr></table></figure><a id="more"></a><p>More info: <a href="https://hexo.io/docs/writing.html" target="_blank" rel="noopener">Writing</a></p><h3 id="Run-server"><a href="#Run-server" class="headerlink" title="Run server"></a>Run server</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo server</span><br></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/server.html" target="_blank" rel="noopener">Server</a></p><h3 id="Generate-static-files"><a href="#Generate-static-files" class="headerlink" title="Generate static files"></a>Generate static files</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo generate</span><br></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/generating.html" target="_blank" rel="noopener">Generating</a></p><h3 
id="Deploy-to-remote-sites"><a href="#Deploy-to-remote-sites" class="headerlink" title="Deploy to remote sites"></a>Deploy to remote sites</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo deploy</span><br></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/deployment.html" target="_blank" rel="noopener">Deployment</a></p>]]></content>
<summary type="html">
<p>Welcome to <a href="https://hexo.io/" target="_blank" rel="noopener">Hexo</a>! This is your very first post. Check <a href="https://hexo.io/docs/" target="_blank" rel="noopener">documentation</a> for more info. If you get any problems when using Hexo, you can find the answer in <a href="https://hexo.io/docs/troubleshooting.html" target="_blank" rel="noopener">troubleshooting</a> or you can ask me on <a href="https://github.com/hexojs/hexo/issues" target="_blank" rel="noopener">GitHub</a>.</p>
<h2 id="Quick-Start"><a href="#Quick-Start" class="headerlink" title="Quick Start"></a>Quick Start</h2><h3 id="Create-a-new-post"><a href="#Create-a-new-post" class="headerlink" title="Create a new post"></a>Create a new post</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ hexo new <span class="string">"My New Post"</span></span><br></pre></td></tr></table></figure>
</summary>
</entry>
</feed>