Managing several user agents on each request using Selenium with Python
I use the code below to enter a code in the search bar, click a button, and finally extract some information:
from selenium import webdriver
import time
from fake_useragent import UserAgent
url = 'https://www.ufficiocamerale.it/'
vat = '06655971007'
useragent = UserAgent()
profile = webdriver.FirefoxProfile()
profile.set_preference("general.useragent.override", useragent.random)
driver = webdriver.Firefox(profile)
driver.get(url)
time.sleep(5)
item = driver.find_element_by_xpath('//form[@id="formRicercaAzienda"]//input[@id="search_input"]')
item.send_keys(vat)
time.sleep(1)
button = driver.find_element_by_xpath('//form[@id="formRicercaAzienda"]//p//button[@type="submit"]')
button.click()
time.sleep(5)
all_items = driver.find_elements_by_xpath('//ul[@id="first-group"]/li')
for item in all_items:
    if '@' in item.text:
        print(item.text.split(' ')[1])
driver.close()
Now I would like to modify the code so that it runs the process above several times with a for loop, like this:
from selenium import webdriver
import time
from fake_useragent import UserAgent
url = 'https://www.ufficiocamerale.it/'
vats = ['06655971007', '06655971007', '01010101010']
for vat in vats:
    useragent = UserAgent()
    # rest of the code
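But it does nothing. What am I doing wrong? Is it the way the user agent is defined?
Could you be more specific about what "it does nothing" means?
Does the code work correctly without the loop?
*When I tested on this site with "06655971007" as input, it printed nothing, because none of the returned strings contained an @.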
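To make that silent case visible, the check can be pulled into a small helper that returns None when no address is present. This is only a sketch: the helper name `first_email`, the loose regular expression, and the sample rows are assumptions made for illustration and are not taken from the site's actual markup.

```python
import re

# Loose pattern for illustration only; it is not a full e-mail validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def first_email(texts):
    """Return the first e-mail address found in an iterable of strings, or None."""
    for text in texts:
        match = EMAIL_RE.search(text)
        if match:
            return match.group(0)
    return None


# Hypothetical strings standing in for the `item.text` values of the <li> elements.
rows_with_mail = ["Sede legale: ROMA", "PEC: enelenergia@pec.enel.it"]
rows_without_mail = ["Sede legale: ROMA", "Codice fiscale: 01010101010"]

print(first_email(rows_with_mail))     # an address is found
print(first_email(rows_without_mail))  # None, so the caller can report it
```

With the Selenium code, the call would look like `first_email(item.text for item in all_items)`, and a None result can be reported instead of printing nothing.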
Edit
from selenium import webdriver
import time
#from fake_useragent import UserAgent
url = 'https://www.ufficiocamerale.it/'
vats = ['06655971007', '06655971007', '01010101010']
for vat in vats:
    #useragent = UserAgent()
    profile = webdriver.FirefoxProfile()
    profile.set_preference("general.useragent.override", "useragent.random")
    driver = webdriver.Chrome('./chromedriver.exe')
    #driver = webdriver.Firefox(profile)
    driver.get(url)
    time.sleep(5)
    item = driver.find_element_by_xpath('//form[@id="formRicercaAzienda"]//input[@id="search_input"]')
    item.send_keys(vat)
    time.sleep(1)
    button = driver.find_element_by_xpath('//form[@id="formRicercaAzienda"]//p//button[@type="submit"]')
    button.click()
    time.sleep(5)
    all_items = driver.find_elements_by_xpath('//ul[@id="first-group"]/li')
    found_it = False
    for item in all_items:
        if '@' in item.text:
            print(vat + " = " + item.text.split(' ')[1])
            found_it = True
    if not found_it:
        print(vat + " no email found")
    driver.close()
Output:
01010101010 no email found
08157270961 = vince.srl@legalmail.it
06655971007 = enelenergia@pec.enel.it
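The code above runs with Chrome and leaves the user-agent override commented out. If a different user agent per iteration is still wanted, one possible approach is to build a fresh Firefox instance inside the loop with its own override. The sketch below assumes Selenium 4 (where preferences are set on `FirefoxOptions`) and a working geckodriver. The `USER_AGENTS` pool and the function names `pick_user_agent`, `new_firefox`, and `run` are hypothetical. The pool could equally be replaced by `UserAgent().random` from fake_useragent.

```python
import random

# Hypothetical pool of user-agent strings, for illustration only.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def pick_user_agent(pool, rng=random):
    """Pick one user-agent string at random from the pool."""
    return rng.choice(pool)


def new_firefox(user_agent):
    """Start a Firefox session whose user agent is overridden (assumes Selenium 4)."""
    # Imported here so pick_user_agent() can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.set_preference("general.useragent.override", user_agent)
    return webdriver.Firefox(options=options)


def run(vats, url):
    """Open one browser per VAT number, each with its own user agent."""
    for vat in vats:
        driver = new_firefox(pick_user_agent(USER_AGENTS))
        try:
            driver.get(url)
            # ... search for `vat` and read the results as in the code above ...
        finally:
            # quit() ends the whole session even if a step above raised.
            driver.quit()


# Example call (starts real browsers, so it is left commented out):
# run(['06655971007', '01010101010'], 'https://www.ufficiocamerale.it/')
```

The override is applied when the browser starts, so each iteration needs its own driver instance. The `try`/`finally` keeps a failed lookup from leaving a browser window open.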