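Python Web Scraping Basics: Exception Handling in urllib
Introduction
The error module of the urllib library defines the exceptions raised by the request module. When a request fails, the request module raises one of the exceptions defined in the error module.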
URLError
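URLError is the base exception class of the error module (it inherits from OSError). Catching it handles the exceptions that the request module raises for failed requests.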
from urllib import request, error

try:
    response = request.urlopen('https://www.qq.com/index.html')
except error.URLError as e:
    # reason describes why the request failed
    print(e.reason)
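The pattern above can also be tried without network access. The following is a minimal sketch, assuming a `file://` URL that points to a path which does not exist on the local machine; the helper name `try_open` and the path are made up for illustration.

```python
from urllib import error, request


def try_open(url):
    """Return the URLError reason, or None if the URL opened normally."""
    try:
        with request.urlopen(url) as response:
            response.read()
    except error.URLError as e:
        return e.reason
    return None


# Hypothetical path that is assumed not to exist on this machine.
reason = try_open("file:///no-such-dir-for-urllib-demo/missing.txt")
print(type(reason).__name__, reason)

# URLError itself is a subclass of OSError.
print(issubclass(error.URLError, OSError))
```

Here the reason is the underlying exception object from the failed file lookup rather than a plain string.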
HTTPError
HTTPError is a subclass of URLError that represents HTTP request errors, such as a failed authentication.
Attribute | Description |
---|---|
code | The HTTP status code, for example 404 |
reason | The reason for the error |
headers | The response headers |
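To look at these attributes without sending a request, an HTTPError can be constructed by hand. This is only a sketch for inspection: normally urlopen creates the exception, and the URL and header value here are placeholders.

```python
import io
from email.message import Message
from urllib import error

# Placeholder response headers for the demonstration.
headers = Message()
headers["Content-Type"] = "text/html"

# Build an HTTPError manually: (url, code, msg, hdrs, fp).
e = error.HTTPError(
    "https://example.com/missing",  # hypothetical URL, never contacted
    404,
    "Not Found",
    headers,
    io.BytesIO(b""),
)

print(e.code)                         # 404
print(e.reason)                       # Not Found
print(e.headers["Content-Type"])      # text/html
print(isinstance(e, error.URLError))  # True
e.close()
```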
Simple example
from urllib import request, error

try:
    response = request.urlopen('https://www.qq.com/index.html')
except error.HTTPError as e:
    # Print the reason, status code, and response headers on separate lines
    print(e.reason, e.code, e.headers, sep='\n')
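Improved version
Because HTTPError is a subclass of URLError, catch HTTPError first and fall back to URLError for all other request errors.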
from urllib import request, error

try:
    response = request.urlopen('https://www.qq.com/index.html')
except error.HTTPError as e:
    # Catch the more specific subclass first
    print(e.reason, e.code, e.headers, sep='\n')
except error.URLError as e:
    # Any other request error
    print(e.reason)
else:
    # Runs only when no exception was raised
    print('Request Successfully')
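The same try/except/else structure can be wrapped in a function so that each branch can be exercised without a network connection. This is a rough sketch, not part of urllib: `classify_request` and the three `fake_*` openers are hypothetical names that stand in for urlopen.

```python
import io
from email.message import Message
from urllib import error, request


def classify_request(url, opener=request.urlopen):
    """Open url and describe the outcome as a short string."""
    try:
        response = opener(url)
    except error.HTTPError as e:
        # Must come first: HTTPError is a subclass of URLError.
        return f"HTTP error {e.code}: {e.reason}"
    except error.URLError as e:
        return f"URL error: {e.reason}"
    else:
        response.close()
        return "ok"


# Hypothetical stand-ins for urlopen, so the demo needs no network.
def fake_not_found(url):
    raise error.HTTPError(url, 404, "Not Found", Message(), io.BytesIO(b""))


def fake_unreachable(url):
    raise error.URLError("host unreachable")


def fake_ok(url):
    return io.BytesIO(b"<html></html>")


url = "http://example.invalid/"  # placeholder URL, never actually contacted
print(classify_request(url, fake_not_found))    # HTTP error 404: Not Found
print(classify_request(url, fake_unreachable))  # URL error: host unreachable
print(classify_request(url, fake_ok))           # ok
```

If the two except clauses were swapped, the URLError clause would also catch every HTTPError, and the status code branch would never run.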
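reason may be an object rather than a string
When a request times out, the exception's reason is a socket.timeout object (an alias of the built-in TimeoutError since Python 3.10).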
import socket
from urllib import request, error

try:
    # A very short timeout makes a timeout error likely
    response = request.urlopen('https://www.baidu.com', timeout=0.01)
except error.URLError as e:
    if isinstance(e.reason, socket.timeout):
        print('TIME OUT')
else:
    print('Request Successfully')
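The isinstance check can be collected into a small helper. The sketch below assumes that a timeout may also surface as a bare socket.timeout (for example while the response is being read) instead of being wrapped in URLError, so it accepts both forms; `is_timeout` is a hypothetical name.

```python
import socket
from urllib import error


def is_timeout(exc):
    """Return True if exc is a timeout, whether wrapped in URLError or not."""
    if isinstance(exc, socket.timeout):
        return True
    return isinstance(exc, error.URLError) and isinstance(exc.reason, socket.timeout)


# Exceptions built by hand for illustration; no request is sent.
print(is_timeout(error.URLError(socket.timeout("timed out"))))  # True
print(is_timeout(socket.timeout("timed out")))                  # True
print(is_timeout(error.URLError("connection refused")))         # False
```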