Source: python中国网  Date: 2019-07-18

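  The HTTP protocol does not prescribe any particular encoding for data submitted via POST. The server reads the Content-Type field in the request headers to determine the encoding, then parses the body accordingly. The common encodings are: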

 - application/x-www-form-urlencoded # submits data as an HTML form; the most common and familiar type

 - application/json # submits data as a JSON string

 - multipart/form-data # uploads files

  The sections below use requests to send a POST request with each of these three encodings.
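  As a quick offline comparison, the following sketch prepares (but never sends) one request per encoding and prints the Content-Type that requests chooses. The helper name content_type_for and the in-memory stand-in for a file are assumptions made for illustration; they are not part of the original article.

```python
import io
import requests

URL = 'http://httpbin.org/post'  # the test endpoint used throughout this article


def content_type_for(**kwargs):
    """Prepare a POST request without sending it and return its Content-Type."""
    prepared = requests.Request('POST', URL, **kwargs).prepare()
    return prepared.headers.get('Content-Type')


form_type = content_type_for(data={'wd': 'www.python66.com'})
json_type = content_type_for(json={'wd': 'www.python66.com'})
# In-memory stand-in for a real file, so the sketch runs without a file on disk
file_type = content_type_for(files={'file': io.BytesIO(b'fake file bytes')})

print(form_type)  # application/x-www-form-urlencoded
print(json_type)  # application/json
print(file_type)  # multipart/form-data; boundary=<random value>
```

  Because nothing goes over the network, this is a convenient way to check which keyword argument (data, json or files) produces which encoding.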

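  1. Submitting a form

  Form submission with requests is typically used for website logins, where a username and password are sent. Taking http://httpbin.org/post as an example: to send a POST request as a form, build the request parameters as a dict and pass it to the data parameter of requests.post(). (httpbin.org echoes back the content of the submitted request; the "Content-Type": "application/x-www-form-urlencoded" in the output confirms that the data was submitted as a form.) The code is as follows: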

# -*- coding: utf-8 -*-
import requests

# Custom request headers
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
}


def get_html(url, key_value, retry=2):
    """POST key_value as form data; retry up to `retry` times on failure."""
    try:
        r = requests.post(url=url, headers=HEADERS, data=key_value, timeout=5)
    except Exception as e:
        print(e)
        if retry > 0:
            # Pass key_value along and return the result of the retry
            return get_html(url, key_value, retry - 1)
        return None
    else:
        page = r.text
        return page


if __name__ == "__main__":
    url = 'http://httpbin.org/post'
    kw = {'wd': 'www.python66.com'}
    html = get_html(url, kw)
    print(html)
  Output:
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "wd": "www.python66.com"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "19", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
  }, 
  "json": null, 
  "origin": "223.72.81.198, 223.72.81.198", 
  "url": "https://httpbin.org/post"
}


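  The "Content-Length": "19" in the output above can be reproduced offline. This sketch, which assumes nothing beyond requests and the standard library, prepares the same form request without sending it and inspects the encoded body.

```python
import requests
from urllib.parse import urlencode

kw = {'wd': 'www.python66.com'}

# Prepare the request only; no network traffic is generated
prepared = requests.Request('POST', 'http://httpbin.org/post', data=kw).prepare()

print(prepared.body)                       # wd=www.python66.com
print(prepared.headers['Content-Length'])  # 19
# The body is ordinary URL encoding of the dict
print(prepared.body == urlencode(kw))      # True
```

  The dict is flattened into key=value pairs joined by "&", which is the format a browser uses when it submits an HTML form.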



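  2. Submitting a JSON string

  JSON submission (shown as a request payload when you capture traffic in the browser) is mainly used in ajax requests that load data dynamically.

  There are two options. You can encode the dict yourself with json.dumps() and pass the resulting string through the data parameter, or you can pass the dict directly through the json parameter, in which case it is encoded automatically and the Content-Type does not have to be declared explicitly in the request headers. The json parameter was added in version 2.4.2. The code is as follows: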

# -*- coding: utf-8 -*-
import requests
import json

# Custom request headers; Content-Type is needed when the body is a
# pre-encoded JSON string passed through `data`
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
    'Content-Type': 'application/json; charset=UTF-8',
}


def get_html(url, key_value, retry=2):
    """POST an already-encoded JSON string through the data parameter."""
    try:
        r = requests.post(url=url, headers=HEADERS, data=key_value, timeout=5)
    except Exception as e:
        print(e)
        if retry > 0:
            # Pass key_value along and return the result of the retry
            return get_html(url, key_value, retry - 1)
        return None
    else:
        page = r.text
        return page


def get_html_json(url, key_value, retry=2):
    """POST a dict through the json parameter; requests encodes it."""
    try:
        r = requests.post(url=url, headers=HEADERS, json=key_value, timeout=5)
    except Exception as e:
        print(e)
        if retry > 0:
            return get_html_json(url, key_value, retry - 1)
        return None
    else:
        page = r.text
        return page


if __name__ == "__main__":
    url = 'https://api.xxxx.com/xxx/xxx'
    kw = {'domain': 'www.python66.com'}
    # Option 1: encode with json.dumps and send via data
    html = get_html(url, json.dumps(kw))
    # Option 2: pass the dict via the json parameter
    html_json = get_html_json(url, kw)
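
  To see how the two options differ, the sketch below prepares both requests without sending them. The URL is a hypothetical placeholder and the helper decoded is introduced only for this illustration.

```python
import json
import requests

URL = 'https://api.example.com/endpoint'  # hypothetical URL; nothing is sent
kw = {'domain': 'www.python66.com'}

via_json = requests.Request('POST', URL, json=kw).prepare()
via_data = requests.Request('POST', URL, data=json.dumps(kw)).prepare()


def decoded(body):
    """Parse a prepared body (bytes or str) back into a Python object."""
    text = body.decode('utf-8') if isinstance(body, bytes) else body
    return json.loads(text)


# Both bodies carry the same JSON payload
same_payload = decoded(via_json.body) == decoded(via_data.body) == kw
# Only the json parameter sets the Content-Type header automatically
json_header = via_json.headers.get('Content-Type')
data_header = via_data.headers.get('Content-Type')

print(same_payload)  # True
print(json_header)   # application/json
print(data_header)   # None
```

  This is why the code above sets Content-Type in its headers: when a pre-encoded string goes through data, requests does not add the header on its own.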

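  3. Uploading files

  File uploads are rarely needed in crawlers. The Content-Type is multipart/form-data; to send a multipart POST request, simply pass a file to the files parameter of requests.post(). Again using http://httpbin.org/post as the example, the code is as follows: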

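# -*- coding: utf-8 -*-
import requests

# Custom request headers
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
}


def get_html(url, files, retry=2):
    """Upload open file objects as multipart/form-data; retry on failure."""
    try:
        # The files parameter (not data) triggers multipart encoding
        r = requests.post(url=url, headers=HEADERS, files=files, timeout=5)
    except Exception as e:
        print(e)
        if retry > 0:
            for f in files.values():
                f.seek(0)  # rewind so the retry resends the whole file
            return get_html(url, files, retry - 1)
        return None
    else:
        page = r.text
        return page


if __name__ == "__main__":
    url = 'http://httpbin.org/post'
    # Open in binary mode; the file is closed once the upload is done
    with open('ajax.png', 'rb') as f:
        html = get_html(url, {'file': f})
    print(html)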
  Output:
{
  "args": {}, 
  "data": "", 
  "files": {
    "file": "data:application/octet-stream;base64,...too long, omitted..."
  }, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Content-Length": "68870", 
    "Content-Type": "multipart/form-data; boundary=66f5b203f18f79960ac438c59af481b0", 
    "Host": "httpbin.org", 
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36"
  }, 
  "json": null, 
  "origin": "223.72.72.67, 223.72.72.67", 
  "url": "https://httpbin.org/post"
}




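  Warning

  It is recommended to open files in binary mode, because Requests may try to supply the Content-Length header for you, and when it does, the value is set to the number of bytes in the file. Opening the file in text mode may cause errors.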
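
  The byte-based Content-Length can be checked offline. The following sketch uses an in-memory bytes buffer as a stand-in for a file opened in 'rb' mode, and passes a (filename, file object, content type) tuple to files; the payload bytes are made up for illustration.

```python
import io
import requests

payload = b'\x89PNG fake image bytes'  # stand-in for the bytes of ajax.png

# (filename, file object, content type) tuple form of the files parameter
files = {'file': ('ajax.png', io.BytesIO(payload), 'image/png')}

# Prepare the request only; nothing is sent
prepared = requests.Request('POST', 'http://httpbin.org/post', files=files).prepare()

content_type = prepared.headers['Content-Type']
content_length = int(prepared.headers['Content-Length'])

print(content_type)                          # multipart/form-data; boundary=<random value>
print(content_length == len(prepared.body))  # True
print(payload in prepared.body)              # True
```

  The header counts the bytes of the whole multipart body, which is why the file contents need to be read as bytes rather than as decoded text.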