Scrape data from csv downloaded after right clicking on webpage using selenium python

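I want to scrape data from a web page using Python and Selenium. There is a CSV download option that becomes visible only after right clicking inside the chart frame. I have not been able to right click on the page and click the Download CSV option with Selenium. This is the page I am trying to get data from: https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB I tried the following code: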

options = webdriver.ChromeOptions()
options.binary_location = r"<Path where chrome application is installed>"
driver = webdriver.Chrome(r"<path to chrome driver>",chrome_options=options)
driver.get("https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB")
timeout = 10
from selenium.webdriver import ActionChains
action = ActionChains(driver)
action.move_to_element(driver.find_element_by_xpath("//lego-canvas-container[@class='lego-canvas-container']")).perform()
action.context_click().perform()

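With this, the given XPath cannot be found; I even tried class names such as the report area. Can anyone explain how to right click anywhere in the frame and then find the Download CSV option in the menu?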

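Because the menu is rendered by JavaScript only after a right click, its XPath cannot be found until you have right clicked. Try this code; it worked for me.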

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("start-maximized")
chrome_options.add_argument("disable-infobars")
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-blink-features=AutomationControlled")
# Selenium 3 style; Selenium 4 passes the driver path through a Service object instead.
driver = webdriver.Chrome(executable_path="chromedriver.exe", options=chrome_options)
driver.implicitly_wait(10)
driver.get("https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB")
# Pause briefly, move the pointer onto the report, then right click to open the context menu.
ActionChains(driver).pause(1).move_by_offset(150, 150).context_click().perform()
# Locate the menu entry (span[5] of the menu panel) only after the menu exists, then click it.
menu_item = driver.find_element(By.XPATH, '//*[@id="mat-menu-panel-0"]/div/span[5]/button')
ActionChains(driver).move_to_element(menu_item).click().perform()
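
None of the options above say where Chrome saves the CSV once the menu item is clicked. A minimal sketch of one way to pin that down, assuming Chrome's `download.default_directory` and `download.prompt_for_download` preferences; the helper names `build_download_prefs` and `make_driver` and the folder name `downloads` are hypothetical, not part of the answer:

```python
import os


def build_download_prefs(download_dir):
    """Return Chrome preferences that save downloads to download_dir without a prompt."""
    return {
        # Chrome expects an absolute path here.
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
    }


def make_driver(download_dir):
    """Start Chrome with the download preferences applied (hypothetical helper)."""
    # Imported here so build_download_prefs can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    options.add_experimental_option("prefs", build_download_prefs(download_dir))
    # Assumes chromedriver can be found without an explicit path (for example, on PATH).
    return webdriver.Chrome(options=options)
```

With this, `driver = make_driver("downloads")` could stand in for the `webdriver.Chrome(...)` call, and the CSV should then land in the `downloads` folder.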

Use the XPath below to locate the element, right click it, then find the CSV button and click it.

driver.get("https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB")
time.sleep(5)  # Delay so the page loads properly; an explicit wait would also work.
element = driver.find_element(By.XPATH, "//div[@class='drop-zone-text']")
# Move to the element and right click it to open the context menu.
ActionChains(driver).move_to_element(element).context_click().perform()
# Click Download CSV once the menu item is clickable.
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, "//button[contains(.,'Download CSV')]"))).click()

You need to import the following libraries:

import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains

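I find it a bit easier to right click and then use the arrow keys to select the appropriate option.

You can perform a right click (context_click) anywhere on the canvas to open the popup menu. Then you can move up and down with the arrow keys and select the 'Download CSV' option.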

from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

actions = ActionChains(driver)

# Find the canvas element
element = driver.find_element(By.XPATH, '//*[@id="body"]/div/div/div[1]/div[2]/div/div[1]/div[1]/div[1]/div/lego-report/lego-canvas-container/div/file-drop-zone/span/content-section/div[3]/canvas-component')

# Right click the element, then press the Down key twice followed by Enter to move to the Download CSV option and select it.
actions.move_to_element(element).context_click().send_keys(Keys.DOWN, Keys.DOWN, Keys.ENTER).perform()

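driver.get("https://datastudio.google.com/reporting/d97f5736-2b85-4f39-beba-6dc386c24429/page/Z3ToB")
sleep(5)  # give the report time to render
wait = WebDriverWait(driver, 10)  # assumed 10-second timeout
source = wait.until(EC.presence_of_element_located((By.XPATH, "/html/body")))
action = ActionChains(driver)
# Right click the page body to open the context menu.
action.context_click(source).perform()
# Click the fifth menu entry, the Download CSV button shown below.
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#mat-menu-panel-0 > div > span:nth-child(5) > button"))).click()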

Oddly enough, it works with this. It seems you need to wait, context click on the body, and then click the menu element.

<button _ngcontent-fys-c1="" class="mat-focus-indicator mat-tooltip-trigger mat-menu-item ng-star-inserted" mat-menu-item="" role="menuitem" tabindex="0" aria-disabled="false"> Download CSV <!----><!----><!----><div class="mat-menu-ripple mat-ripple" matripple=""></div></button>

Imports

from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from time import sleep
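
Once the Download CSV item has been clicked, the file still has to finish downloading before its data can be read. The following is a rough sketch using only the standard library, assuming Chrome saves into a known, otherwise empty folder and that the export is UTF-8 encoded; `wait_for_csv` and `read_rows` are hypothetical helper names:

```python
import csv
import glob
import os
import time


def wait_for_csv(download_dir, timeout=30.0, poll=0.5):
    """Return the path of the newest finished .csv in download_dir, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        # Chrome writes to a temporary *.crdownload file until the download finishes.
        partial = glob.glob(os.path.join(download_dir, "*.crdownload"))
        finished = glob.glob(os.path.join(download_dir, "*.csv"))
        if finished and not partial:
            return max(finished, key=os.path.getmtime)
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no finished CSV in {download_dir!r} after {timeout}s")
        time.sleep(poll)


def read_rows(csv_path):
    """Read the CSV into a list of dicts keyed by the header row."""
    # utf-8-sig also strips a byte order mark if the export includes one.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))
```

For example, calling `rows = read_rows(wait_for_csv("downloads"))` right after the click would give one dict per data row, provided the browser is configured to save into that folder.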