Tools:
Python 3.6
Fiddler 4
Required libraries:
requests
BeautifulSoup (bs4), with lxml as the parser
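First, capture the login traffic and look at what the login request needs:
The captured request contains an authenticity_token field. Its value is randomly generated and can be read from the page returned by /login, so it has to be fetched before logging in.
Note that the cookie set by /login is needed as well.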
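To make the token step concrete, here is a minimal sketch of pulling authenticity_token out of a login page. It uses only the standard library so it runs without a network connection. The names TokenParser, extract_token and SAMPLE_LOGIN_HTML are made up for illustration, and the sample HTML is a hand-written stand-in, not the real GitHub page.

```python
from html.parser import HTMLParser


class TokenParser(HTMLParser):
    """Collects the value of the first <input name="authenticity_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (self.token is None and tag == "input"
                and attributes.get("name") == "authenticity_token"):
            self.token = attributes.get("value")


def extract_token(html_text):
    """Return the token found in html_text, or None if there is none."""
    parser = TokenParser()
    parser.feed(html_text)
    return parser.token


# Hand-written stand-in for the /login page (assumption, not real markup).
SAMPLE_LOGIN_HTML = """
<form action="/session" method="post">
  <input type="hidden" name="authenticity_token" value="abc123==">
  <input type="text" name="login">
  <input type="password" name="password">
</form>
"""

print(extract_token(SAMPLE_LOGIN_HTML))  # abc123==
```

The script in this post does the same lookup with BeautifulSoup, which is shorter once the library is installed.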
The login form has action = '/session',
so the target URL for the POST is 'https://github.com/session'.
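The step from the relative action to the full URL can be checked with the standard library. This is only a small sketch; the variable names are illustrative.

```python
from urllib.parse import urljoin

login_page_url = "https://github.com/login"  # page that contains the form
form_action = "/session"                     # the form's action attribute

# A relative action is resolved against the URL of the page it appears on.
post_url = urljoin(login_page_url, form_action)
print(post_url)  # https://github.com/session
```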
# coding: utf-8
import requests
from bs4 import BeautifulSoup

url = 'https://github.com/login'
url2 = 'https://github.com/session'
# First request /login to get the cookie and the authenticity_token
r = requests.get(url)
html = BeautifulSoup(r.text, 'lxml')
# Keep the cookies set by /login
cookie = r.cookies
# The token is the value of the hidden <input name="authenticity_token">
authen = html.find('input', {'name': 'authenticity_token'}).attrs['value']
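# Form fields to send; replace the asterisks with your username and password
postdata = {
    'commit': 'Sign in', 'utf8': '✓',  # check mark U+2713, not the root sign
    'authenticity_token': authen,
    'login': '********', 'password': '********',
}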
# Set up the request headers
header = {
    'User-Agent': ('Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36'),
    'Referer': 'https://github.com/login',
    'Connection': 'keep-alive',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
}
# With the header and cookie in place, send the login request
r = requests.post(url2, data=postdata, headers=header, cookies=cookie)
# Save the returned page to a file
with open('123.html', 'w', encoding='utf-8') as f:
    f.write(r.text)
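The script above passes the cookies along by hand. A possible alternative, sketched here, is requests.Session, which stores cookies from the first response and sends them with the next request. The function names build_login_payload and login are made up for this example, the shortened User-Agent is a placeholder, and the sketch assumes the login form still has the fields described in this post.

```python
import requests
from bs4 import BeautifulSoup

LOGIN_URL = "https://github.com/login"
SESSION_URL = "https://github.com/session"


def build_login_payload(token, username, password):
    """Assemble the form fields that are posted to /session."""
    return {
        "commit": "Sign in",
        "utf8": "✓",
        "authenticity_token": token,
        "login": username,
        "password": password,
    }


def login(username, password):
    """Fetch the token, then post the credentials in the same session."""
    session = requests.Session()
    # Headers set here are sent with every request made through the session.
    session.headers.update({"User-Agent": "Mozilla/5.0", "Referer": LOGIN_URL})

    page = session.get(LOGIN_URL, timeout=10)
    page.raise_for_status()
    soup = BeautifulSoup(page.text, "html.parser")
    token_input = soup.find("input", {"name": "authenticity_token"})
    if token_input is None:
        raise RuntimeError("authenticity_token not found on the login page")

    payload = build_login_payload(token_input["value"], username, password)
    # Cookies received from /login are attached automatically by the session.
    return session.post(SESSION_URL, data=payload, timeout=10)
```

Calling login('your-username', 'your-password') returns the response object, whose text can be written to a file the same way as above.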