Authentication:
HTTPBasicAuthHandler (handles HTTP Basic authentication)
HTTPPasswordMgrWithDefaultRealm (usually appears together with the auth handler)
# Create a password manager
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
# Register the target URL, username, and password
password_mgr.add_password(None,url,username,password)
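The first argument is the realm. Passing None registers the credentials under the default realm.
To add credentials for a specific realm, replace None with that realm's name (the realm string the server sends in its authentication challenge, not a domain name).
WithDefaultRealm means the credentials registered under None act as a fallback: they are used for any matching URL whose realm has no entry of its own, so different URLs and realms can share one set of credentials.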
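The fallback behaviour described above can be checked without any network access. This is a minimal sketch: the URLs, realm names, and credentials are made up for illustration, and find_user_password is the lookup method the auth handler calls internally.

```python
import urllib.request

mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()

# Hypothetical credentials: one default entry (realm None) and one
# entry for a specific realm on a sub-path.
mgr.add_password(None, "https://example.com/", "default_user", "default_pw")
mgr.add_password("Admin Area", "https://example.com/admin/", "admin", "secret")

# A realm with its own entry gets that entry.
admin_creds = mgr.find_user_password("Admin Area", "https://example.com/admin/page")

# A realm with no entry falls back to the credentials stored under None.
other_creds = mgr.find_user_password("Other Realm", "https://example.com/docs")

print(admin_creds)
print(other_creds)
```

A plain HTTPPasswordMgr has no such fallback, which is why the WithDefaultRealm variant is the one usually paired with HTTPBasicAuthHandler.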
#!/usr/bin/env python3
import urllib.request


def auto_login():
    url = 'https://ssr3.scrape.center/'
    # Username and password
    username = 'admin'
    password = 'admin'
    # Create a password manager
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # Register the URL, username, and password
    password_mgr.add_password(None, url, username, password)
    # Create a Basic auth handler and pass it the password manager
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    # Build an opener that uses the handler
    opener = urllib.request.build_opener(handler)
    response = opener.open(url)
    print(response.read().decode('utf-8'))


auto_login()
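For reference, HTTP Basic authentication boils down to one request header: the username and password joined by a colon, base64-encoded, and prefixed with "Basic". The sketch below builds that header by hand for the same admin/admin pair; the helper name basic_auth_header is made up for illustration and is not part of urllib.

```python
import base64


def basic_auth_header(username, password):
    """Return the value of the Authorization header for HTTP Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


header_value = basic_auth_header("admin", "admin")
print(header_value)
```

Because base64 is an encoding and not encryption, Basic auth is only safe over HTTPS.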
Cookie
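1. Log in with a username and password.
2. After the first successful login, the server responds with a "Set-Cookie" header.
3. On later visits the cookie is sent back, so the credentials do not need to be entered again.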
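The three steps above can be sketched offline with a CookieJar. This is an illustration under assumptions: FakeResponse is a hypothetical stand-in for the server's reply (a real response comes from opener.open), and the URLs and the sessionid cookie are invented.

```python
import email.message
import http.cookiejar
import urllib.request


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; the jar only calls info()."""

    def __init__(self, set_cookie_values):
        self._headers = email.message.Message()
        for value in set_cookie_values:
            self._headers["Set-Cookie"] = value

    def info(self):
        return self._headers


def simulate_login_flow():
    jar = http.cookiejar.CookieJar()

    # Steps 1 and 2: the login request succeeds and the reply sets a cookie.
    login_request = urllib.request.Request("http://example.com/login")
    login_response = FakeResponse(["sessionid=abc123; Path=/"])
    jar.extract_cookies(login_response, login_request)

    # Step 3: a later request to the same site gets the cookie attached.
    next_request = urllib.request.Request("http://example.com/profile")
    jar.add_cookie_header(next_request)
    return next_request.get_header("Cookie")


cookie_header = simulate_login_flow()
print(cookie_header)
```

With a real site, HTTPCookieProcessor performs both the extract and the add step automatically on every request made through the opener.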
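Cookie handling is provided by the http.cookiejar module, used together with the HTTPCookieProcessor handler.
MozillaCookieJar: saves cookies in the cookies.txt format used by Mozilla-type browsers.
LWPCookieJar: saves cookies in the libwww-perl (LWP) cookies file format.
Both classes can write (save) and read (load) files in their own format.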
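To compare the two file formats without touching a real site, the sketch below stores one hand-built cookie with each class, reloads it, and prints the first line of each file. The cookie itself (sessionid on example.com) and the helper names make_cookie and round_trip are assumptions for illustration; normally the cookies come from a server's Set-Cookie header.

```python
import http.cookiejar
import os
import tempfile
import time


def make_cookie(name, value, domain="example.com"):
    """Build a Cookie by hand (illustrative; servers normally create these)."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False,
        expires=int(time.time()) + 3600,  # one hour from now
        discard=False, comment=None, comment_url=None, rest={},
    )


def round_trip(jar_class, path):
    """Save one cookie with jar_class, reload it, return (first_line, cookies)."""
    jar = jar_class(path)
    jar.set_cookie(make_cookie("sessionid", "abc123"))
    jar.save(ignore_discard=True, ignore_expires=True)

    with open(path, encoding="utf-8") as fh:
        first_line = fh.readline().strip()

    loaded = jar_class(path)
    loaded.load(ignore_discard=True, ignore_expires=True)
    return first_line, [(c.name, c.value) for c in loaded]


with tempfile.TemporaryDirectory() as tmp:
    for jar_class in (http.cookiejar.MozillaCookieJar, http.cookiejar.LWPCookieJar):
        path = os.path.join(tmp, jar_class.__name__ + ".txt")
        first_line, cookies = round_trip(jar_class, path)
        print(jar_class.__name__, "|", first_line, "|", cookies)
```

A file written by one class cannot be loaded by the other, so the same class should be used for saving and loading.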
#!/usr/bin/env python3
import urllib.request
import http.cookiejar

url = "https://www.baidu.com"
filename = "cookie1.txt"

# Step 1 (run once): request the page and save the cookies to a file
# cookie = http.cookiejar.CookieJar()
# cookie = http.cookiejar.LWPCookieJar(filename=filename)
# handle = urllib.request.HTTPCookieProcessor(cookie)
# opener = urllib.request.build_opener(handle)
# response = opener.open(url)
# for item in cookie:
#     print(item)
# cookie.save(ignore_discard=True, ignore_expires=True)

# Step 2: load the saved cookies from the file and reuse them
cookie = http.cookiejar.LWPCookieJar()
cookie.load(filename=filename, ignore_discard=True, ignore_expires=True)
handle = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handle)
response = opener.open(url)
print(response.read().decode('utf-8'))
Project:
Use cookies to bypass a website's login
#!/usr/bin/env python3
import random
import urllib.request
import urllib.parse
import urllib.error

# Target URL
# Note: httpbin's /post endpoint only accepts POST, so the plain GET below gets HTTP 405
url = 'http://httpbin.org/post'

# Pool of proxy addresses
ip_list = [
    "http://183.161.45.66:17114",
    "http://119.41.198.172:18350",
    "http://27.191.60.244:15982",
    "http://27.215.237.221:20983",
]

# Use random.choice to pick one proxy at random on each run
proxy = random.choice(ip_list)
print(proxy)
try:
    proxy_handler = urllib.request.ProxyHandler({'http': proxy, 'https': proxy})
    opener = urllib.request.build_opener(proxy_handler)
    response = opener.open(url)
    print(response.read().decode('utf-8'))
except urllib.error.URLError as e:
    print("error: ", e)
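Public proxies like the ones in the list often stop responding, so a single random pick may fail. One possible variation, sketched here under that assumption, tries the proxies in random order and returns the first successful result. The helper names build_proxy_opener and fetch_with_fallback are made up for illustration.

```python
import random
import urllib.error
import urllib.request


def build_proxy_opener(proxy):
    """Return an opener that routes http and https traffic through proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


def fetch_with_fallback(url, proxies, timeout=5.0):
    """Try each proxy in random order; return (proxy, body) or (None, None)."""
    candidates = list(proxies)
    random.shuffle(candidates)
    for proxy in candidates:
        opener = build_proxy_opener(proxy)
        try:
            with opener.open(url, timeout=timeout) as response:
                return proxy, response.read().decode("utf-8")
        except (urllib.error.URLError, OSError) as exc:
            # Dead or refusing proxy: report it and move on to the next one.
            print(f"{proxy} failed: {exc}")
    return None, None
```

Usage would look like `proxy, body = fetch_with_fallback('http://httpbin.org/get', ip_list)`, where a (None, None) result means every proxy failed.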
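Exception handling:
1. URLError
Defined in urllib's error module:
URLError inherits from OSError
except error.URLError as e:
    print(e.reason)  # print the reason for the error
2. HTTPError:
Raised specifically for HTTP request errors, such as a 404 or 500 response. It is a subclass of URLError and additionally carries the status code and response headers.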
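Because HTTPError is a subclass of URLError, the HTTPError clause has to come first, otherwise the URLError clause catches everything. The sketch below shows the ordering with hand-built exceptions, so it runs without a network; the describe helper and the example.invalid URL are made up for illustration.

```python
import email.message
import io
import urllib.error


def describe(exc):
    """Raise exc and report which except clause handled it."""
    try:
        raise exc
    except urllib.error.HTTPError as e:
        # More specific class first: has a status code as well as a reason.
        return f"HTTP {e.code}: {e.reason}"
    except urllib.error.URLError as e:
        # Everything else: DNS failure, refused connection, timeout, ...
        return f"URL error: {e.reason}"


http_err = urllib.error.HTTPError(
    "http://example.invalid/missing", 404, "Not Found",
    email.message.Message(), io.BytesIO(b""),
)
url_err = urllib.error.URLError("connection refused")

print(describe(http_err))
print(describe(url_err))
```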
#!/usr/bin/env python3
import socket
import urllib.request
from urllib import error

try:
    url = 'https://www.baidu.com/'
    # Very short timeout, used here to force a timeout error
    response = urllib.request.urlopen(url, timeout=0.01)
    # header = {
    #     'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/139.0.0.0 Safari/537.36'
    # }
    # req = urllib.request.Request(url=url, headers=header)
    # response = urllib.request.urlopen(req)
    # print(response.read().decode('utf-8'))
# If enabled, HTTPError must be caught before URLError (it is a subclass)
# except error.HTTPError as e:
#     print(e)
except error.URLError as e:
    print(e.reason)
    if isinstance(e.reason, socket.timeout):
        print("Timed out")
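One detail worth noting: a timeout while connecting arrives wrapped in URLError (as in the script above), while a timeout while reading the response can surface as a bare socket.timeout. A small wrapper that covers both cases might look like the sketch below; is_timeout and fetch are hypothetical helper names, not part of urllib.

```python
import socket
import urllib.error
import urllib.request


def is_timeout(exc):
    """True if exc is a timeout, either bare or wrapped in a URLError."""
    if isinstance(exc, urllib.error.URLError):
        return isinstance(exc.reason, socket.timeout)
    return isinstance(exc, socket.timeout)


def fetch(url, timeout=3.0):
    """Return the decoded body, or None if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except (urllib.error.URLError, socket.timeout) as exc:
        print("Timed out" if is_timeout(exc) else f"Failed: {exc}")
        return None
```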