Making Web Requests with Python - EW帮帮网

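I. Introduction
Web requests and responses are the basis of web communication. The client initiates a request, and the server processes it and returns a response.

A web request is a client (such as a browser or an application) asking a server for a resource over the network.
A web response is the data the server returns in answer to that request.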

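A web request usually consists of the following parts:
   Request line: the request method (GET, POST, etc.), the URL, and the HTTP protocol version (such as HTTP/1.1).
   Request headers: metadata such as client information, the content type of the request body, and the browser type.
   Request body: in a POST request, the data submitted by the user, such as form data or a file.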

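A web response is returned by the server and consists of the following parts:
   Status line: the HTTP protocol version, the status code, and the status message.
   Response headers: information about the response, such as the content type and server information.
   Response body: the actual data returned (such as an HTML page or JSON data).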
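
The parts listed above can be made concrete by assembling a request and taking a response apart as plain text. The following is a minimal sketch, not how requests works internally: build_request and parse_response are hypothetical helpers, and the host, path, and form fields are illustrative values.

```python
def build_request(method, path, host, headers=None, body=""):
    """Return the text of an HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # A blank line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_response(raw):
    """Split a raw HTTP response into (version, status_code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(code), reason, headers, body


# Illustrative request: a form submission to a hypothetical /post endpoint.
request_text = build_request(
    "POST", "/post", "httpbin.org",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    body="name=Alice&age=25",
)
print(request_text)

# Illustrative response text, shortened by hand.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>...</html>"
)
version, code, reason, headers, body = parse_response(raw_response)
print(version, code, reason)
print(headers)
print(body)
```

In everyday code the requests library builds and parses these messages for you; the sketch only shows where each part sits in the text that travels over the connection.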

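HTTP protocol overview
HTTP (Hypertext Transfer Protocol) is the protocol used to transfer data on the web; it governs the communication between the browser and the server. Common HTTP methods are:
   GET: asks the server for a resource; usually used to read data.
   POST: submits data to the server; usually used for form submissions, file uploads, and similar tasks.
   PUT: updates a resource on the server.
   DELETE: deletes a resource on the server.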
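
The method is simply a property of the request, which can be inspected without sending anything. A small sketch, assuming the standard library's urllib.request.Request is used instead of requests so that it runs offline; the httpbin.org URLs are illustrative only.

```python
from urllib.request import Request

# These objects are only constructed; nothing is sent over the network.
get_req = Request("https://httpbin.org/get")
post_req = Request("https://httpbin.org/post", data=b"name=Alice")
put_req = Request("https://httpbin.org/put", data=b"name=Alice", method="PUT")
delete_req = Request("https://httpbin.org/delete", method="DELETE")

for req in (get_req, post_req, put_req, delete_req):
    # Without an explicit method, urllib picks GET, or POST when data is given.
    print(req.get_method(), req.full_url)
```

With the requests library used in the rest of this article, the corresponding calls are requests.get, requests.post, requests.put, and requests.delete.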

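Common HTTP status codes include:
   200 OK: the request succeeded and the server returned the requested data.
   301 Moved Permanently: the resource has been permanently moved.
   404 Not Found: the requested resource does not exist.
   500 Internal Server Error: the server encountered an internal error.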
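
The codes and their messages are also available in Python's standard library. The sketch below looks up the four codes listed above with http.HTTPStatus and groups them by their first digit; describe is a hypothetical helper written for this illustration.

```python
from http import HTTPStatus


def describe(code):
    """Return 'code phrase (category)' for a numeric HTTP status code."""
    status = HTTPStatus(code)
    if 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirection"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "informational"
    return f"{code} {status.phrase} ({category})"


for code in (200, 301, 404, 500):
    print(describe(code))
```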

II. Examples

Before starting, install the requests library (for example with pip install requests).

1. Sending a GET request

import requests

# Send a GET request
response = requests.get('https://www.example.com')

# Print the response status code
print('Status Code:', response.status_code)

# Print the response body
print('Response Body:', response.text)

# Print the response headers
print('Response Headers:', response.headers)

# Print the length of the response body (in characters)
print('Content Length:', len(response.text))


Output
Status Code: 200
Response Body: <!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;

    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style>
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>

Response Headers: {'Accept-Ranges': 'bytes', 'Content-Type': 'text/html', 'ETag': '"84238dfc8092e5d9c0dac8ef93371a07:1736799080.121134"', 'Last-Modified': 'Mon, 13 Jan 2025 20:11:20 GMT', 'Vary': 'Accept-Encoding', 'Content-Encoding': 'gzip', 'Content-Length': '648', 'Cache-Control': 'max-age=505', 'Date': 'Wed, 21 May 2025 13:07:39 GMT', 'Alt-Svc': 'h3=":443"; ma=93600,h3-29=":443"; ma=93600,quic=":443"; ma=93600; v="43"', 'Connection': 'keep-alive'}
Content Length: 1256


2. Sending a POST request
import requests

# Send a POST request with form data
url = 'https://httpbin.org/post'
data = {'name': 'Alice', 'age': 25}
response = requests.post(url, data=data)

# Print the response status code
print('Status Code:', response.status_code)

# Print the response body (parsed from JSON)
print('Response Body:', response.json())

Output
Status Code: 200
Response Body: {'args': {}, 'data': '', 'files': {}, 'form': {'age': '25', 'name': 'Alice'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Content-Length': '17', 'Content-Type': 'application/x-www-form-urlencoded', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.32.3', 'X-Amzn-Trace-Id': 'Root=1-682dd19b-270ac4c9489daadd71163f7c'}, 'json': None, 'origin': '1.194.17.20', 'url': 'https://httpbin.org/post'}


3. Handling response headers and status codes

import requests

# Send a GET request
response = requests.get('https://www.example.com')

# Get the response headers
print('Response Headers:', response.headers)

# Get the response status code
print('Status Code:', response.status_code)

# Get the content type
print('Content-Type:', response.headers.get('Content-Type'))


Output
Response Headers: {'Accept-Ranges': 'bytes', 'Content-Type': 'text/html', 'ETag': '"84238dfc8092e5d9c0dac8ef93371a07:1736799080.121134"', 'Last-Modified': 'Mon, 13 Jan 2025 20:11:20 GMT', 'Vary': 'Accept-Encoding', 'Content-Encoding': 'gzip', 'Content-Length': '648', 'Cache-Control': 'max-age=3000', 'Date': 'Wed, 21 May 2025 13:20:04 GMT', 'Alt-Svc': 'h3=":443"; ma=93600,h3-29=":443"; ma=93600,quic=":443"; ma=93600; v="43"', 'Connection': 'keep-alive'}
Status Code: 200
Content-Type: text/html

4. Sending a GET request with query parameters
import requests

# Send a GET request with query parameters
url = 'https://httpbin.org/get'
params = {'name': 'Alice', 'age': 25}
response = requests.get(url, params=params)

# Print the response body
print('Response Body:', response.json())

Output
Response Body: {'args': {'age': '25', 'name': 'Alice'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.32.3', 'X-Amzn-Trace-Id': 'Root=1-682dd40e-3fafbc6f3058f20905989d75'}, 'origin': '1.194.17.20', 'url': 'https://httpbin.org/get?name=Alice&age=25'}
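
The 'url' field in the output shows that the params dictionary ended up in the query string. A minimal sketch of that encoding step, assuming the standard library's urllib.parse is used to reproduce it by hand (requests does the equivalent work itself when params is passed):

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "https://httpbin.org/get"
params = {"name": "Alice", "age": 25}

# Encode the dictionary as key=value pairs joined by '&'.
query = urlencode(params)
full_url = f"{base_url}?{query}"
print(full_url)

# Decode the query string again; every value comes back as a list of strings.
parsed = parse_qs(urlsplit(full_url).query)
print(parsed)

# Characters with a special meaning in URLs are escaped automatically.
escaped = urlencode({"q": "a b&c"})
print(escaped)
```

Note that the number 25 comes back as the string '25', which matches the 'args' field in the output above.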


5. Sending a POST request with form data
import requests

# Send a POST request with form data
url = 'https://httpbin.org/post'
data = {'username': 'testuser', 'password': 'mypassword'}
response = requests.post(url, data=data)

# Print the response body
print('Response Body:', response.json())


Output
Response Body: {'args': {}, 'data': '', 'files': {}, 'form': {'password': 'mypassword', 'username': 'testuser'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Content-Length': '37', 'Content-Type': 'application/x-www-form-urlencoded', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.32.3', 'X-Amzn-Trace-Id': 'Root=1-682dd705-3612dc5410b2d72d2fa43bcd'}, 'json': None, 'origin': '1.194.17.20', 'url': 'https://httpbin.org/post'}
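
The Content-Type header in the output, application/x-www-form-urlencoded, means the dictionary was sent as URL-encoded key=value pairs in the request body. The sketch below rebuilds such a body with the standard library; it is an illustration of the format, not the code requests runs internally.

```python
from urllib.parse import urlencode

data = {"username": "testuser", "password": "mypassword"}

# Form bodies use the same key=value&key=value encoding as query strings.
body = urlencode(data).encode("ascii")
print(body)
print(len(body))
```

The length of this body is the same as the Content-Length value reported in the output above.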

6. Handling a JSON response
import requests

# Send a GET request and get a JSON response
url = 'https://api.github.com/users/octocat'
response = requests.get(url)

# Parse the JSON data
data = response.json()

# Print the user's GitHub information
print('User Login:', data['login'])
print('User Name:', data['name'])

Output
User Login: octocat
User Name: The Octocat
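
Parsing a JSON body produces ordinary Python dictionaries and lists, so missing or empty fields need the usual care. A small offline sketch using the standard json module; the payload is a shortened, made-up sample shaped like the response above, not real API output.

```python
import json

# Hypothetical payload for illustration only.
raw = '{"login": "octocat", "name": "The Octocat", "company": null}'

data = json.loads(raw)

# Direct indexing raises KeyError if the key is missing.
print(data["login"])

# JSON null becomes None in Python.
company = data.get("company")
print(company)

# get() with a default avoids an exception for keys that are absent.
email = data.get("email", "<not provided>")
print(email)

# Pretty-print the parsed data for inspection.
print(json.dumps(data, indent=2))
```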

7. Opening files and using modes
# Open the file in read-only mode
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)

# Open the file in write mode; existing content is overwritten
with open('example.txt', 'w') as file:
    file.write('new \n')

# Open the file in append mode; new content is added to the end of the file
with open('example.txt', 'a') as file:
    file.write('zhuijia \n')

# Open the file in binary mode (for example, to read an image)
with open('image.jpg', 'rb') as file:
    binary_data = file.read()
    print('Binary data read:', binary_data[:20])

Output
Note: example.txt and image.jpg must already exist, otherwise the read-mode open() calls raise FileNotFoundError.
111
Binary data read: b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01\xcd'

Contents of example.txt
Before running the script:
111
After running the script:
new 
zhuijia 
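
The difference between the modes can be checked in isolation. The sketch below repeats the write and append steps in a temporary directory, so it does not touch any real files, and adds the 'x' mode, which is not used above: it creates a file but refuses to overwrite an existing one.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")

    # 'w' creates the file if needed and truncates it if it exists.
    with open(path, "w", encoding="utf-8") as file:
        file.write("111\n")
    with open(path, "w", encoding="utf-8") as file:
        file.write("new \n")

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a", encoding="utf-8") as file:
        file.write("zhuijia \n")

    with open(path, "r", encoding="utf-8") as file:
        content = file.read()

    # 'x' fails with FileExistsError because the file is already there.
    try:
        with open(path, "x", encoding="utf-8") as file:
            file.write("never written\n")
        created = True
    except FileExistsError:
        created = False

print(repr(content))
print(created)
```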

8. Reading files
# read() method: read the whole file at once
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)

Output
new 
zhuijia


# readline() method: read one line at a time
with open('example.txt', 'r') as file:
    line = file.readline()
    while line:
        print(line.strip())
        line = file.readline()

Output
new
zhuijia


# readline() method, run against a four-line version of example.txt
with open('example.txt', 'r') as file:
    line = file.readline()
    while line:
        print(line.strip())
        line = file.readline()

Output
111new
222zhuijia
333
444

# readlines() method: read all lines into a list
with open('example.txt', 'r') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())

Output
111new
222zhuijia
333
444
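
Besides read(), readline(), and readlines(), a file object can be iterated directly, which reads one line at a time without loading the whole file into a list. A short sketch in a temporary directory; the four-line content mirrors the output above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")
    with open(path, "w", encoding="utf-8") as file:
        file.write("111new\n222zhuijia\n333\n444\n")

    numbered = []
    with open(path, "r", encoding="utf-8") as file:
        # Iterating over the file yields each line, including its newline.
        for number, line in enumerate(file, start=1):
            numbered.append(f"{number}: {line.rstrip()}")

print("\n".join(numbered))
```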

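9. Writing files with the write() method
with open('output.txt', 'w') as file:
    file.write("This is the first line of data.\n")
    file.write("This is the second line of data.\n")

Output
The file output.txt is created with the following content:
This is the first line of data.
This is the second line of data.

Writing multiple lines with the writelines() method
lines = ["First line of data.\n", "Second line of data.\n", "Third line of data.\n"]
with open('output.txt', 'w') as file:
    file.writelines(lines)

Output
First line of data.
Second line of data.
Third line of data.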
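
One detail worth checking is that writelines() does not add line breaks by itself, which is why each string in the list above ends with \n. A minimal sketch in a temporary directory comparing both cases; the item names are made up for the illustration.

```python
import os
import tempfile

items = ["first", "second", "third"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output.txt")

    # Without trailing newlines, the strings are written back to back.
    with open(path, "w", encoding="utf-8") as file:
        file.writelines(items)
    with open(path, "r", encoding="utf-8") as file:
        joined = file.read()

    # writelines() accepts any iterable, so a generator can add the newlines.
    with open(path, "w", encoding="utf-8") as file:
        file.writelines(f"{item}\n" for item in items)
    with open(path, "r", encoding="utf-8") as file:
        separated = file.read()

print(repr(joined))
print(repr(separated))
```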

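10. Downloading a file
import requests

url = 'https://www.example.com/image.jpg'
response = requests.get(url)

# Check whether the request succeeded
if response.status_code == 200:
    # Write the file in binary mode
    with open('downloaded_image.jpg', 'wb') as file:
        file.write(response.content)
    print('Image downloaded successfully!')
else:
    print(f'Download failed, status code: {response.status_code}')

Output (the example URL does not point to an existing image, so the server answers 404)
Download failed, status code: 404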
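
For large files, response.content holds the whole download in memory. One common alternative is to write the body in chunks. The sketch below shows only the writing side with a hypothetical helper, save_chunks, fed by a hand-made list of byte chunks so that it runs offline; with requests, the chunks would typically come from requests.get(url, stream=True) followed by response.iter_content(chunk_size=8192).

```python
import os
import tempfile


def save_chunks(chunks, path):
    """Write an iterable of bytes chunks to path and return the number of bytes written."""
    total = 0
    with open(path, "wb") as file:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                file.write(chunk)
                total += len(chunk)
    return total


# Stand-in for the chunks a streamed response would yield.
fake_chunks = [b"\x89PNG\r\n", b"\x1a\n", b"", b"rest of file"]

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "downloaded_image.jpg")
    written = save_chunks(fake_chunks, target)
    size_on_disk = os.path.getsize(target)

print(written, size_on_disk)
```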

11. Simulating file operation errors

import os

# Check that the file exists before opening it
if os.path.exists('example.txt'):
    with open('example.txt', 'r') as file:
        content = file.read()
        print(content)
else:
    print('File does not exist!')

Output
111new 
222zhuijia 
333
444

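File permissions
try:
    with open('readonly_file.txt', 'w') as file:
        file.write('Trying to write to a read-only file')
except PermissionError:
    print('Permission denied, cannot write to the file.')

Output
No exception is raised in this run: readonly_file.txt was created with the content "Trying to write to a read-only file". PermissionError is raised only when the file already exists and is read-only, or when the directory is not writable.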
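
Checking os.path.exists() first and catching exceptions can be combined into one pattern: simply try to open the file and handle the specific errors. A minimal sketch with a hypothetical helper, read_text, exercised in a temporary directory so that it does not depend on any existing files.

```python
import os
import tempfile


def read_text(path):
    """Return (content, error); exactly one of the two is None."""
    try:
        with open(path, "r", encoding="utf-8") as file:
            return file.read(), None
    except FileNotFoundError:
        return None, "file does not exist"
    except PermissionError:
        return None, "permission denied"


with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "example.txt")
    with open(existing, "w", encoding="utf-8") as file:
        file.write("111new\n")

    found = read_text(existing)
    missing = read_text(os.path.join(tmp_dir, "missing.txt"))

print(found)
print(missing)
```

Handling the exception also covers the case where the file disappears between the existence check and the open() call.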

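12. Getting file information
import os

file_path = 'example.txt'
# getsize() returns the size in bytes
print('File size:', os.path.getsize(file_path), 'bytes')
# getmtime() returns the last modification time in seconds since the epoch
print('Last modified:', os.path.getmtime(file_path))

Output
File size: 30 bytes
Last modified: 1747887031.546634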
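
The modification time is a raw timestamp in seconds since the epoch, which is hard to read. A short sketch of converting it with the standard datetime module; the value is the one printed above, and UTC is chosen here as an assumption so that the result does not depend on the local time zone.

```python
from datetime import datetime, timezone

mtime = 1747887031.546634  # value taken from the output above

# Convert the timestamp to a timezone-aware datetime in UTC.
modified_utc = datetime.fromtimestamp(mtime, tz=timezone.utc)

print(modified_utc.isoformat())
print(modified_utc.strftime("%Y-%m-%d %H:%M:%S"))
```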

Deleting a file
import os

file_path = 'example.txt'
if os.path.exists(file_path):
    os.remove(file_path)
    print(f"{file_path} deleted!")
else:
    print("File does not exist!")

Output
example.txt deleted!

Running the script a second time:
File does not exist!
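
The same delete-if-present logic can be written with pathlib. A small sketch in a temporary directory, assuming Python 3.8 or later for the missing_ok parameter of Path.unlink():

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = Path(tmp_dir) / "example.txt"
    file_path.write_text("111\n", encoding="utf-8")

    existed_before = file_path.exists()
    file_path.unlink()  # delete the file
    existed_after = file_path.exists()

    # With missing_ok=True, deleting a file that is already gone is not an error.
    file_path.unlink(missing_ok=True)

print(existed_before, existed_after)
```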


13. Catching common exceptions

import requests
from requests.exceptions import RequestException, Timeout, HTTPError

try:
    # Send a GET request with a 5-second timeout
    response = requests.get('https://www.example.com', timeout=5)

    # Raise HTTPError if the status code is 4xx or 5xx (for example 404 or 500)
    response.raise_for_status()

    # The request succeeded, so print the response body
    print('Response Body:', response.text)

# Handle request timeouts
except Timeout:
    print('Request timed out')

# Handle HTTP errors (such as status codes 404 and 500)
except HTTPError as http_err:
    print(f'HTTP error occurred: {http_err}')

# Handle any other request-related errors
except RequestException as req_err:
    print(f'Request error occurred: {req_err}')

# The finally block always runs and can be used to clean up resources (such as closing files or connections)
finally:
    print('Request attempt completed.')

Output
Response Body: <!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-color: #f0f0f2;
        margin: 0;
        padding: 0;
        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;

    }
    div {
        width: 600px;
        margin: 5em auto;
        padding: 2em;
        background-color: #fdfdff;
        border-radius: 0.5em;
        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
    }
    a:link, a:visited {
        color: #38488f;
        text-decoration: none;
    }
    @media (max-width: 700px) {
        div {
            margin: 0 auto;
            width: auto;
        }
    }
    </style>
</head>

<body>
<div>
    <h1>Example Domain</h1>
    <p>This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.</p>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>

Request attempt completed.


Or, if the host name cannot be resolved:
Request error occurred: HTTPSConnectionPool(host='www.example.com', port=443): Max retries exceeded with url: / (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x00000289D9A22F90>: Failed to resolve 'www.example.com' ([Errno 11001] getaddrinfo failed)"))
Request attempt completed.
