
Handling request timeouts in requests, with a summary of its exceptions

By 老董 (我爱我家房产SEO), 2019-07-18

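Sometimes a web page loads very slowly in the browser and takes a long while to appear. The same thing happens when you request a page from code. For this reason the requests module provides a timeout parameter that sets a time limit in seconds; once the limit is exceeded, a requests.exceptions.Timeout exception is raised.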

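PS: As a rule, all crawler code should pass the timeout parameter. Without it, a single request can wait on the server's response indefinitely and block the whole program, which then looks as if it has hung.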

# -*- coding: utf-8 -*-
import requests

url = 'https://github.com'
try:
    # A deliberately tiny timeout, to force the exception
    requests.get(url, timeout=0.0001)
except requests.exceptions.Timeout as e:
    print(e)

HTTPSConnectionPool(host='github.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, 'Connection to github.com timed out. (connect timeout=0.0001)'))

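The timeout is not a time limit on the entire response download. A single number is applied both to establishing the connection and to each wait for data: an exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds). A large response body that keeps arriving slowly can therefore take far longer than timeout seconds in total without raising anything. If no timeout is specified explicitly, requests do not time out.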
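To make the connect and read phases visible, the timeout can also be passed as a (connect, read) tuple. The following is a minimal sketch rather than code from the original article: the function name fetch, the status strings, and the values 3.05 and 10 are illustrative assumptions.

```python
import requests


def fetch(url, connect_timeout=3.05, read_timeout=10):
    """Request url with separate connect and read timeouts (illustrative helper).

    Returns a (status, detail) tuple instead of raising on a timeout.
    """
    try:
        # timeout=(connect, read): seconds allowed to establish the connection,
        # then seconds allowed for each wait on data from the server.
        r = requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        # The connection could not be established in time.
        return "connect-timeout", None
    except requests.exceptions.ReadTimeout:
        # Connected, but the server then sent nothing within read_timeout.
        return "read-timeout", None
    return "ok", r.status_code
```

Both exceptions are subclasses of requests.exceptions.Timeout, so code that only needs to know "it timed out" can keep catching Timeout as in the example above.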

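All sorts of things can go wrong when fetching a page: the web server is down, the domain has expired and no longer resolves, a foreign site cannot be reached from inside China, the network connection drops, and so on. A timeout is therefore only one of several exceptions you can run into. The rest of this article gives a brief overview.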

Overview of requests exceptions

ConnectionError: raised by Requests when a network problem occurs (for example, a failed DNS lookup or a refused connection).

# -*- coding: utf-8 -*-
import requests

try:
    requests.get('https://google.com', timeout=5)
except requests.exceptions.ConnectionError as e:
    print(e)

('Connection aborted.', ConnectionResetError(10054, '远程主机强迫关闭了一个现有的连接。', None, 10054, None))
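The result of the example above depends on where it is run. As a reproducible alternative, here is a rough sketch that triggers a refused connection on the local machine. The helper names free_local_port and connection_refused_demo are made up for illustration, and the sketch assumes that nothing starts listening on the chosen port in the meantime.

```python
import socket

import requests


def free_local_port():
    """Ask the OS for an unused port, then release it so nothing listens on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def connection_refused_demo():
    """Return True if connecting to a closed local port raises ConnectionError."""
    url = "http://127.0.0.1:%d/" % free_local_port()
    session = requests.Session()
    session.trust_env = False  # ignore proxy settings from the environment
    try:
        session.get(url, timeout=5)
    except requests.exceptions.ConnectionError as e:
        print(type(e).__name__, e)
        return True
    finally:
        session.close()
    return False
```

Calling connection_refused_demo() should print a ConnectionError and return True, since no server is accepting connections on that port.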

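Timeout: raised when a request times out. Its subclasses are requests.exceptions.ConnectTimeout and requests.exceptions.ReadTimeout. The ConnectTimeoutError that appears in the printed message is the underlying urllib3 error, not the name of the requests exception to catch.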

# -*- coding: utf-8 -*-
import requests

try:
    requests.get('https://baidu.com', timeout=0.00001)
except requests.exceptions.Timeout as e:
    print(e)

HTTPSConnectionPool(host='baidu.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, 'Connection to baidu.com timed out. (connect timeout=1e-05)'))

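TooManyRedirects: raised when a request exceeds the configured maximum number of redirects. (In the author's tests, many of the ways of setting the maximum number of redirects did not take effect.)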
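One setting that does exist for this limit is the max_redirects attribute of a requests.Session. The sketch below is an illustration under stated assumptions, not the author's test: it starts a throwaway local HTTP server that redirects forever, and the names LoopingRedirectHandler and hits_redirect_limit are hypothetical.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class LoopingRedirectHandler(BaseHTTPRequestHandler):
    """Illustrative handler: every GET is answered with another redirect."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", "/next")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def hits_redirect_limit(max_redirects=3):
    """Return True if the redirect loop ends in TooManyRedirects."""
    server = HTTPServer(("127.0.0.1", 0), LoopingRedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port

    session = requests.Session()
    session.trust_env = False  # ignore proxy settings from the environment
    session.max_redirects = max_redirects  # limit applies to this session only
    try:
        session.get(url, timeout=5)
    except requests.exceptions.TooManyRedirects as e:
        print(e)
        return True
    finally:
        session.close()
        server.shutdown()
        server.server_close()
    return False
```

The limit lives on the session object, so a plain requests.get call is not affected by it; that may be one reason some attempts to change the limit appear to do nothing.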

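# -*- coding: utf-8 -*-
import requests

# Placeholder request headers (an assumption); substitute your own
my_header = {'User-Agent': 'Mozilla/5.0'}

def get_html(url, retry=1):
    """Fetch url and return its HTML; on any requests error, retry up to `retry` more times."""
    try:
        r = requests.get(url=url, headers=my_header, timeout=5)
    except requests.exceptions.RequestException as e:
        print(e)
        if retry > 0:
            # Pass the result of the retry back to the caller
            return get_html(url, retry - 1)
        return None
    else:
        return r.text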

All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException, so you can catch them all with this base class.

# -*- coding: utf-8 -*-
import requests

try:
    requests.get('https://360.com', timeout=0.00001)
except requests.exceptions.RequestException as e:
    print(e)

HTTPSConnectionPool(host='360.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, 'Connection to 360.com timed out. (connect timeout=1e-05)'))
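The inheritance relationships between these exception classes can be checked directly with issubclass. This is a small sketch for illustration; the alias rex is just a local name chosen here.

```python
from requests import exceptions as rex

# Every exception discussed in this article derives from RequestException.
for exc in (rex.ConnectionError, rex.Timeout, rex.ConnectTimeout,
            rex.ReadTimeout, rex.TooManyRedirects):
    print(exc.__name__, issubclass(exc, rex.RequestException))

# ConnectTimeout is both a ConnectionError and a Timeout,
# so either except clause will catch it.
print(issubclass(rex.ConnectTimeout, rex.ConnectionError))
print(issubclass(rex.ConnectTimeout, rex.Timeout))

# ReadTimeout is a Timeout but not a ConnectionError.
print(issubclass(rex.ReadTimeout, rex.Timeout))
print(issubclass(rex.ReadTimeout, rex.ConnectionError))
```

Because ConnectTimeout sits under two parents, the order of except clauses decides which handler runs when both ConnectionError and Timeout are caught separately.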

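You can also catch them with Python's built-in Exception class. Keep in mind that this catches every other ordinary error in the try block as well, including ones unrelated to the request.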

# -*- coding: utf-8 -*-
import requests

try:
    requests.get("https://360.com", timeout=0.00001)
except Exception as e:
    print(e)

HTTPSConnectionPool(host='360.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(, 'Connection to 360.com timed out. (connect timeout=1e-05)'))
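When the different failures need different handling, the except clauses can be ordered from most specific to most general. The following is one possible arrangement, offered as a sketch: the function name describe_failure and the returned strings are assumptions made for illustration.

```python
import requests


def describe_failure(url, timeout=5):
    """Return a short label for how a GET request to url turned out."""
    try:
        requests.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Listed first so that ConnectTimeout, which is also a
        # ConnectionError, is reported as a timeout.
        return "timeout"
    except requests.exceptions.ConnectionError:
        return "connection error"
    except requests.exceptions.TooManyRedirects:
        return "too many redirects"
    except requests.exceptions.RequestException as e:
        # Any other exception raised by requests itself.
        return "other requests error: %s" % type(e).__name__
    return "ok"
```

The base class goes last because Python runs the first matching except clause; placing RequestException first would hide all the more specific handlers.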
