Proxy configuration is not working in Python
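I'm trying to rotate my IP while web scraping, but it doesn't seem to work: whenever I check the IP the requests are made from, it is always the same. Here is the code I'm using:
Code: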
import requests
from bs4 import BeautifulSoup
import random

headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 5.1.1; SM-G928X Build/LMY47X) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.83 Mobile Safari/537.36'}

def get_free_proxies():
    url = "https://free-proxy-list.net/"
    # get the HTTP response and construct soup object
    soup = BeautifulSoup(requests.get(url).content, "html.parser")
    proxies = list()
    for row in soup.find("table", attrs={"id": "proxylisttable"}).find_all("tr")[1:]:
        tds = row.find_all("td")
        try:
            ip = tds[0].text.strip()
            port = tds[1].text.strip()
            host = f"{ip}:{port}"
            proxies.append(host)
        except IndexError:
            continue
    return proxies

def get_session(proxies):
    # construct an HTTP session
    session = requests.Session()
    # choose one random proxy
    proxy = random.choice(proxies)
    session.proxies = {"http": proxy, "https": proxy}
    #session.proxies.update(proxy)
    return session

proxies = get_free_proxies()
for i in range(5):
    session = get_session(proxies)
    print("Request page with IP:", session.get("http://icanhazip.com", timeout=1.5).text.strip())
The output is always the same IP and never changes. It is, by the way, my own computer's IP.
Does anyone know why this fails?
Thanks, everyone.
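Perhaps you have the environment variable http_proxy set, in which case the proxy specified in that variable is used when the request is sent. To change this behavior, set the session's trust_env attribute to False when you create the session: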
def get_session(proxies):
    # construct an HTTP session
    session = requests.Session()
    # choose one random proxy
    proxy = random.choice(proxies)
    session.proxies = {"http": proxy, "https": proxy}
    # ignore proxy settings taken from environment variables
    session.trust_env = False
    return session
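To check whether this is actually the cause, you could first list the proxy-related environment variables that are set. This is only a minimal sketch: the helper name find_env_proxies and the list of variable names are my own choices for illustration, not part of requests, and on some platforms proxy settings can also come from system configuration, which this does not inspect.

```python
import os

# Commonly used proxy variables (assumed list, checked case-insensitively).
PROXY_ENV_VARS = ("http_proxy", "https_proxy", "all_proxy", "no_proxy")


def find_env_proxies(environ=None):
    """Return the non-empty proxy-related variables found in the environment.

    `environ` defaults to os.environ; a plain dict can be passed for testing.
    """
    environ = os.environ if environ is None else environ
    found = {}
    for name, value in environ.items():
        if name.lower() in PROXY_ENV_VARS and value:
            found[name] = value
    return found


if __name__ == "__main__":
    # Print whatever proxy variables are currently set, if any.
    for name, value in find_env_proxies().items():
        print(f"{name}={value}")
```

If this prints nothing, the environment is probably not the culprit and the problem is more likely in the proxy list itself.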