@pandait 2017-05-16T05:02:06.000000Z

Python Web Crawling: Basic Usage of the Requests Library with Python 3.6

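When working with Python, handling HTTP requests is something we cannot avoid. To write a crawler in Python that fetches resources from the web, we have to send network requests to get the resources we want.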

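Requests is a powerful HTTP library for Python. It wraps the methods and functions used to send network requests, which makes this kind of programming more convenient.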

Installing Requests
$ pip install requests
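If you have not installed pip yet, search online for how to install it first.
How to use Requests
You can work through the Chinese-language Requests documentation to learn the library.
Common uses of the Requests library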

# requests.get is declared as: def get(url, params=None, **kwargs)
import requests

req = requests.get('http://www.pandait.me')
print(req)

Output: <Response [200]>

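If we send a request to a URL that cannot be reached, an error is raised:
try:
    req = requests.get('http://www.pandaitxx.me')
except requests.exceptions.RequestException as ex:
    # RequestException is the base class for the errors Requests raises
    print(ex.args)
Output: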

(MaxRetryError("HTTPConnectionPool(host='www.pandaitxx.me', port=80): Max retries exceeded with url: / (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))",),)

For details on try/except, look up Python's error handling on your own.
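
As a minimal sketch of more targeted error handling, the example below separates connection failures from other request errors using the exception classes that Requests defines. The helper name fetch_status and the 5-second timeout are illustrative assumptions, not part of the original article.

```python
import requests


def fetch_status(url, timeout=5):
    """Return the HTTP status code for url, or None if the request fails.

    fetch_status is a hypothetical helper used only for illustration.
    """
    try:
        resp = requests.get(url, timeout=timeout)
    except requests.exceptions.ConnectionError as ex:
        # DNS failures, refused connections, and similar network problems
        print('connection failed:', ex)
        return None
    except requests.exceptions.RequestException as ex:
        # Base class for the other errors Requests raises (timeouts, bad URLs, ...)
        print('request failed:', ex)
        return None
    return resp.status_code
```

Calling fetch_status('http://www.pandaitxx.me') should print the connection error and return None, assuming that host still does not resolve.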

Send a request and print the returned result/HTML:
print(req.text)     # body decoded to a str
print(req.content)  # body as raw, still-encoded bytes
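
The difference between the two attributes can be sketched without a network call: req.content is the body as raw bytes, and req.text is those bytes decoded to a string with the encoding Requests determines for the response (req.encoding). The sample string below is an assumption standing in for a real response body.

```python
# Stand-in for req.content: the body as raw bytes (sample data, not a real response)
content = '<title>熊猫</title>'.encode('utf-8')

# Stand-in for req.text: the same bytes decoded with the response encoding
text = content.decode('utf-8')

print(type(content).__name__)
print(type(text).__name__)
print(text)
```

If req.text looks garbled on a real page, setting req.encoding before reading req.text is one way to override the detected encoding.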
Request types include POST, GET, PUT, DELETE, HEAD, and OPTIONS.
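
Each of these request types has a matching function in Requests (requests.post, requests.put, requests.delete, requests.head, requests.options). The sketch below builds the requests without sending them, so it runs offline; the form field is made-up sample data.

```python
import requests

url = 'http://www.pandait.me'

# Build (but do not send) one request per HTTP method
for method in ('GET', 'POST', 'PUT', 'DELETE', 'HEAD', 'OPTIONS'):
    prepared = requests.Request(method, url).prepare()
    print(prepared.method, prepared.url)

# A POST usually carries a body; a dict passed as data is form-encoded
post = requests.Request('POST', url, data={'key1': '123'}).prepare()
print(post.body)
```

To actually send a request, call the function directly, for example requests.post(url, data={'key1': '123'}).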

Passing URL parameters in a request

datapram = {'key1': '123', 'key2': 'pandait'}
req = requests.get('http://www.pandait.me', params=datapram)
print(req.url)
Output: http://www.pandait.me/?key1=123&key2=pandait
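
To see how Requests turns the dictionary into a query string, here is a small offline sketch that builds the URL without sending the request. The extra values (a list for key2 and a None for key3) are assumptions added to show two more encoding rules.

```python
import requests

# key2 has two values; key3 is None, so it is left out of the URL
datapram = {'key1': '123', 'key2': ['pandait', 'python'], 'key3': None}

prepared = requests.Request('GET', 'http://www.pandait.me', params=datapram).prepare()
print(prepared.url)
```

A list value is repeated once per item under the same key, and keys whose value is None are dropped.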

HTTP headers

The response headers are available as req.headers, for example:
{'Server': 'nginx/1.6.2', 'Date': 'Tue, 16 May 2017 04:52:23 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'X-Powered-By': 'PHP/5.4.45', 'X-Pingback': 'http://www.pandait.me/action/xmlrpc'}
Request headers can be passed as a parameter, for example:

headers = {'user-agent': '....'}
r = requests.get('http://www.pandait.me', headers=headers)
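
A small offline sketch of both directions: attaching a request header and reading headers case-insensitively. The user-agent string is a made-up placeholder, and the response headers are copied from the sample shown earlier rather than fetched live.

```python
import requests
from requests.structures import CaseInsensitiveDict

# Request side: a custom header is attached to the prepared request
headers = {'user-agent': 'my-crawler/0.1'}  # placeholder value, not from the article
prepared = requests.Request('GET', 'http://www.pandait.me', headers=headers).prepare()
print(prepared.headers['User-Agent'])

# Response side: Requests keeps headers in a case-insensitive mapping
response_headers = CaseInsensitiveDict({
    'Server': 'nginx/1.6.2',
    'Content-Type': 'text/html; charset=UTF-8',
})
print(response_headers['content-type'])
```

Because header lookups ignore case, 'user-agent' and 'User-Agent' refer to the same header.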
That is all for this article, which only briefly introduces and records the basic usage of the Python Requests library. For more, see Advanced Usage of Python Requests.
