Selenium - How to click the More button of each individual item to scrape the data from the dropdown
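I'm trying to scrape information from a page, but when I open the exported .CSV file it is blank apart from the header.
I'm trying to scrape the 10 results on this page: https://www.narpm.org/find/property-managers/?submitted=true&toresults=1&resultsperpage=10&a=managers&orderby=&fname=&lname=&company=&chapter=S005&city=&state=&xRadius=
I can scrape the name, company, city and state, but clicking the 'More' dropdown doesn't seem to work. (I don't get any errors; the CSV is just blank.)
I suspect the problem is in this line: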
driver.find_element_by_xpath('//div[@class="col-md-4 col-lg-1 arrow"]').click()
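Here is my full code:
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()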
options.headless = True
driver = webdriver.Chrome(executable_path='/Users/vilje/anaconda3/envs/webscrape/chromedriver', options=options)
driver.set_window_size(1440, 900)
# Creates master dataframe
df = pd.DataFrame(columns=['Name','Company', 'City', 'State', 'Phone', 'About'])
# URL
driver.get('https://www.narpm.org/find/property-managers/?submitted=true&toresults=1&resultsperpage=10&a=managers&orderby=&fname=&lname=&company=&chapter=S005&city=&state=&xRadius=')
name = driver.find_elements_by_xpath('//span[@class="name"]')
company = driver.find_elements_by_xpath('//div[@class="col-md-6 col-lg-4"]')
city = driver.find_elements_by_xpath('//div[@class="col-md-4 col-lg-2"]')
state = driver.find_elements_by_xpath('//div[@class="col-md-4 col-lg-2"]')
# Expand the 'More' button
driver.find_element_by_xpath('//div[@class="col-md-4 col-lg-1 arrow"]').click()
phone = driver.find_elements_by_xpath('//div[@class="col-sm-6 col-lg-3 with-icon lighter-text"]')
about = driver.find_elements_by_xpath('//div[@class="col-sm-12"]')
name_list = []
for n in range(len(name)):
    name_list.append(name[n].text)
company_list = []
for c in range(len(company)):
    company_list.append(company[c].text)
city_list = []
for c in range(len(city)):
    city_list.append(city[c].text)
state_list = []
for s in range(len(state)):
    state_list.append(state[s].text)
phone_list = []
for p in range(len(phone)):
    phone_list.append(phone[p].text)
about_list = []
for a in range(len(about)):
    about_list.append(about[a].text)
# List of each property managers name, company, city, state, phone and about section paired together
data_tuples = list(zip(name_list[0:], company_list[0:], city_list[0:], state_list[0:], phone_list[0:], about_list[0:]))
# Creates dataframe of each tuple in list
temp_df = pd.DataFrame(data_tuples, columns=['Name','Company', 'City', 'State', 'Phone', 'About'])
# Appends to master dataframe
df = df.append(temp_df)
driver.close()
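Can anyone help me click the 'More' button for every person so that I can scrape the data from the dropdown?
To click all the elements with the text More, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR: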
driver.get("https://www.narpm.org/find/property-managers/?submitted=true&toresults=1&resultsperpage=10&a=managers&orderby=&fname=&lname=&company=&chapter=S005&city=&state=&xRadius=")
for more in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.row div.arrow"))):
    more.click()
Using XPATH:
driver.get("https://www.narpm.org/find/property-managers/?submitted=true&toresults=1&resultsperpage=10&a=managers&orderby=&fname=&lname=&company=&chapter=S005&city=&state=&xRadius=")
for more in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='row']//div[contains(@class, 'arrow') and contains(., 'More')]"))):
    more.click()
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
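A side note on the symptom in the question: one possible reason for a CSV that contains only the header is that zip() stops at its shortest input. If any one of the lists (for example phone_list) comes back empty because the details were never expanded or the locator matched nothing, every row is dropped. This is only a guess at the cause, not a confirmed diagnosis. The short sketch below uses made-up sample values (the names and companies are hypothetical, not taken from the site) to show the behaviour, with itertools.zip_longest as one way to keep the rows while you debug.

```python
from itertools import zip_longest

# Hypothetical sample data standing in for the scraped text
name_list = ["Alice", "Bob"]
company_list = ["Acme PM", "Beta Realty"]
phone_list = []  # nothing matched, e.g. the details were never expanded

# zip() stops at the shortest input, so one empty list empties the result
rows_zip = list(zip(name_list, company_list, phone_list))

# zip_longest() keeps every row and pads the missing values
rows_padded = list(zip_longest(name_list, company_list, phone_list, fillvalue=""))

print(rows_zip)
print(rows_padded)
```

Printing len() of each list before zipping is a quick way to see which locator is coming back empty.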