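Lately
some friends,
after reading 小帥b's articles,
have swiped all of 小帥b's memes
and are now in my WeChat
spamming those memes to show off.
All I can do is smirk
and say one thing:
let's take them on.
Other friends
read the articles without tapping "Wow"
and still nag me for updates.
小帥b can only say one thing:
keep taking them on.
ok
Next up, we're going to play with a new library.
This library is called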
Requests
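This library is a tiny bit more badass than Urllib, the one we covered last time in "Python Crawler 03: Using the Urllib Library to Make Our Python Pretend to Be a Browser".
After all, Requests is built on top of urllib3 (a third-party package, not the standard-library urllib).
With it we can use less code
to simulate what a browser does.
Life is short,
so next up is
the right way to learn Python
skr
Since it isn't one of Python's built-in libraries,
we need to install it first.
Just install it with pip: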
pip install requests
Once it's installed, it's ready to use.
Now let's get a feel for requests.
Import the requests module
import requests
A GET request in one line of code
r = requests.get('https://api.github.com/events')
A POST request in one line of code
r = requests.post('https://httpbin.org/post', data = {'key':'value'})
All the other assorted HTTP requests
>>> r = requests.put('https://httpbin.org/put', data = {'key':'value'})
>>> r = requests.delete('https://httpbin.org/delete')
>>> r = requests.head('https://httpbin.org/get')
>>> r = requests.options('https://httpbin.org/get')
Want to pass request parameters along?
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get('https://httpbin.org/get', params=payload)
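If you're curious what params actually does to the URL, here is a small sketch. It builds the request with requests.Request(...).prepare() instead of sending it, so it runs without a network; that offline trick is my choice for illustration, not something the original snippet does.

```python
import requests

# Build the request without sending it, so this runs offline.
payload = {'key1': 'value1', 'key2': 'value2'}
prepared = requests.Request('GET', 'https://httpbin.org/get', params=payload).prepare()

# The params dict ends up URL-encoded in the query string.
print(prepared.url)
```

In a real call you would read the same thing from r.url after requests.get(...).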
Pretend to be a browser
>>> url = 'https://api.github.com/some/endpoint'
>>> headers = {'user-agent': 'my-App/0.0.1'}
>>> r = requests.get(url, headers=headers)
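Why bother setting the header at all? A rough sketch: by default requests announces itself as python-requests, which is easy for a site to spot. The example below compares the default with the custom value, again using a prepared (unsent) request so it runs offline.

```python
import requests

# What requests sends when you don't override the header.
print(requests.utils.default_user_agent())

# Same headers dict as above, attached to a request that is never sent.
headers = {'user-agent': 'my-App/0.0.1'}
prepared = requests.Request(
    'GET', 'https://api.github.com/some/endpoint', headers=headers
).prepare()

# Header lookup is case-insensitive.
print(prepared.headers['User-Agent'])
```

To really look like a browser you would paste in a browser's own User-Agent string; which one to use is up to you.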
Get the server's response as text
>>> import requests
>>> r = requests.get('https://api.github.com/events')
>>> r.text
'[{"repository":{"open_issues":0,"url":"https://github.com/...
>>> r.encoding
'utf-8'
Get the response body as bytes
>>> r.content
b'[{"repository":{"open_issues":0,"url":"https://github.com/...
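The difference between r.text and r.content is just decoding: r.text is, roughly, r.content decoded with r.encoding. A tiny standard-library sketch of that idea, with made-up bytes standing in for a real response:

```python
# Made-up body bytes, standing in for r.content.
content = '{"msg": "盤他"}'.encode('utf-8')
encoding = 'utf-8'  # stand-in for r.encoding

# Roughly what r.text gives you: the bytes decoded into a str.
text = content.decode(encoding)

print(type(content).__name__, len(content))
print(type(text).__name__, len(text))
```

If the text comes out garbled on a real page, setting r.encoding to the right value before reading r.text is the usual fix.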
Get the status code
>>> r = requests.get('https://httpbin.org/get')
>>> r.status_code
200
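A status code is only useful if you check it. Here is a minimal sketch using requests.codes and raise_for_status(). The Response object is built by hand with a pretend 404, purely so the example runs without a network; normally it comes back from requests.get(...).

```python
import requests

# Hand-built Response with a pretend 404, only for offline illustration.
r = requests.Response()
r.status_code = 404

# Compare against the named constant instead of a bare number.
print(r.status_code == requests.codes.ok)

# raise_for_status() turns 4xx/5xx codes into an exception.
try:
    r.raise_for_status()
    failed = False
except requests.exceptions.HTTPError as exc:
    failed = True
    print('request failed:', exc)
```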
Get the response headers
>>> r.headers
{
'content-encoding': 'gzip',
'transfer-encoding': 'chunked',
'connection': 'close',
'server': 'Nginx/1.0.4',
'x-runtime': '148ms',
'etag': '"e1ca502697e5c9317743dc078f67693f"',
'content-type': 'application/json'
}
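One nice detail: r.headers is a case-insensitive dictionary, so you don't have to remember how the server capitalized things. A small sketch using the same class directly, with two of the headers above typed in by hand:

```python
from requests.structures import CaseInsensitiveDict

# Same kind of object as r.headers, filled in by hand for illustration.
headers = CaseInsensitiveDict({
    'content-encoding': 'gzip',
    'content-type': 'application/json',
})

print(headers['Content-Type'])      # any capitalization works
print(headers.get('CONTENT-TYPE'))
print(headers.get('x-missing'))     # missing keys give None with .get()
```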
Get the JSON response content
>>> import requests
>>> r = requests.get('https://api.github.com/events')
>>> r.json()
[{'repository': {'open_issues': 0, 'url': 'https://github.com/...
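r.json() blows up if the body isn't valid JSON (say the server hands back an HTML error page). A rough standard-library sketch of guarding against that; parse_body is a hypothetical helper name, and the strings are made up. With a real response you would wrap r.json() in the same kind of try/except, since as far as I know its error is also a ValueError subclass.

```python
import json

def parse_body(text):
    """Return the decoded JSON value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None

events = parse_body('[{"repository": {"open_issues": 0}}]')
print(events[0]['repository']['open_issues'])
print(parse_body('<html>oops</html>'))
```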
Get the raw socket stream of the response
>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
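Those weird bytes aren't garbage: \x1f\x8b\x08 is the start of a gzip stream, because r.raw hands you the body before it is decompressed. A small standard-library sketch with a made-up payload shows the same signature:

```python
import gzip

# Made-up payload, compressed the way a server might send it.
body = b'[{"repository": {"open_issues": 0}}]'
compressed = gzip.compress(body)

# The first bytes are the gzip magic number plus the deflate method byte.
print(compressed[:3])

# Decompressing gets the readable body back.
print(gzip.decompress(compressed) == body)
```

For most jobs r.content or r.text is what you want, since those are already decompressed.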
POST requests
When you want several values under a single key
>>> payload_tuples = [('key1', 'value1'), ('key1', 'value2')]
>>> r1 = requests.post('https://httpbin.org/post', data=payload_tuples)
>>> payload_dict = {'key1': ['value1', 'value2']}
>>> r2 = requests.post('https://httpbin.org/post', data=payload_dict)
>>> print(r1.text)
{ ... "form": { "key1": [ "value1", "value2" ] }, ...}
>>> r1.text == r2.text
True
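To see why r1 and r2 come back identical, you can look at the form body requests would send. A sketch using prepared (unsent) requests, so it runs offline:

```python
import requests

url = 'https://httpbin.org/post'
payload_tuples = [('key1', 'value1'), ('key1', 'value2')]
payload_dict = {'key1': ['value1', 'value2']}

# Prepare both requests without sending them.
body1 = requests.Request('POST', url, data=payload_tuples).prepare().body
body2 = requests.Request('POST', url, data=payload_dict).prepare().body

# Both spellings encode to the same form body, with key1 repeated.
print(body1)
print(body1 == body2)
```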
Pass the json parameter when making a request
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, json=payload)
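What does json= buy you over data=? Roughly: requests serializes the dict for you and sets the Content-Type header. A sketch with a prepared (unsent) request; the URL is the same placeholder endpoint as above.

```python
import json
import requests

payload = {'some': 'data'}
prepared = requests.Request(
    'POST', 'https://api.github.com/some/endpoint', json=payload
).prepare()

# The header is set automatically.
print(prepared.headers['Content-Type'])

# The body is the dict serialized as JSON.
print(json.loads(prepared.body))
```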
Want to upload a file?
>>> url = 'https://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
>>> r.text
{ ... "files": { "file": "<censored...binary...data>" }, ...}
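Under the hood, files= turns the request into a multipart/form-data upload. A sketch that shows this without a network or a real report.xls: the io.BytesIO object and its contents are stand-ins I made up, and the (filename, fileobj) tuple form lets you name the file explicitly.

```python
import io
import requests

# In-memory stand-in for open('report.xls', 'rb').
fake_file = io.BytesIO(b'not really a spreadsheet')
files = {'file': ('report.xls', fake_file)}

prepared = requests.Request('POST', 'https://httpbin.org/post', files=files).prepare()

# requests picks the multipart content type and a boundary for you.
print(prepared.headers['Content-Type'].split(';')[0])

# The filename and the file bytes both end up in the body.
print(b'filename="report.xls"' in prepared.body)
```

With a real file, opening it in a with block (with open('report.xls', 'rb') as f:) makes sure it gets closed afterwards.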
Get cookie information
>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)
>>> r.cookies['example_cookie_name']
'example_cookie_value'
Send cookie information
>>> url = 'https://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')
>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'
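Passing cookies= simply adds a Cookie header to the request. A sketch with a prepared (unsent) request to make that visible offline:

```python
import requests

cookies = dict(cookies_are='working')
prepared = requests.Request(
    'GET', 'https://httpbin.org/cookies', cookies=cookies
).prepare()

# The dict is flattened into a single Cookie header.
print(prepared.headers['Cookie'])
```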
Set a timeout
>>> requests.get('https://github.com/', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)
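In a real crawler you probably don't want a timeout to crash the whole script. A minimal sketch of catching it; fetch_status is a hypothetical helper name, and the (3.05, 10) default is just an example of the (connect, read) tuple form that timeout accepts.

```python
import requests

def fetch_status(url, timeout=(3.05, 10)):
    """Return the status code, or None if the request timed out.

    timeout is (connect seconds, read seconds); a single number
    applies the same limit to both.
    """
    try:
        return requests.get(url, timeout=timeout).status_code
    except requests.exceptions.Timeout:
        # Covers both ConnectTimeout and ReadTimeout.
        return None
```

Something like fetch_status('https://github.com/', timeout=0.001) should then come back as None instead of a traceback.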
Other than "badass",
what else is there to say??






