Screen scraping a Datepicker with Scrapy and Selenium on mouse hover
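So I need to scrape a page like this, and I am using Scrapy + Selenium to interact with the datepicker calendar.
I realized that if a date is available, a tooltip shows its price on hover, and if a date is not available, nothing happens when you hover over it.
What code would get me the price that appears dynamically when you hover over an available date? And how can I tell, just by hovering, whether a date is available?
Because the page is so dynamic, solving this is not entirely straightforward: you have to use waits here and there, and it is tricky to catch the HTML of components that only appear on click or hover.
Here is complete working code that navigates to the page, clicks the "Check In" input, waits for the calendar to load and reports the availability of each day in the calendar (it uses the presence of the ui-datepicker-unselectable class to determine that). Then, for each available cell, it uses the move_to_element() browser action to hover over the cell, waits for the tooltip and gets the price: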
from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("https://www.airbnb.pt/rooms/265820?check_in=2016-04-26&guests=1&check_out=2016-04-29")

# wait for the check in input to load
wait = WebDriverWait(driver, 10)
elem = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.book-it-panel input[name=checkin]")))
elem.click()

# wait for datepicker to load
wait.until(
    EC.visibility_of_element_located((By.CSS_SELECTOR, '.ui-datepicker:not(.loading)'))
)

# every <td> of the calendar grid, including empty padding cells
days = driver.find_elements(By.CSS_SELECTOR, ".ui-datepicker table.ui-datepicker-calendar tr td")
for cell in days:
    day = cell.text.strip()
    if not day:
        continue  # skip cells that carry no day number

    if "ui-datepicker-unselectable" in cell.get_attribute("class"):
        status = "Unavailable"
    else:
        status = "Available"

    price = "n/a"
    if status == "Available":
        # hover the cell and wait for the tooltip
        ActionChains(driver).move_to_element(cell).perform()
        price = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.datepicker-tooltip'))).text

    print(day, status, price)
It prints:
1 Unavailable n/a
2 Unavailable n/a
3 Unavailable n/a
4 Unavailable n/a
5 Unavailable n/a
6 Unavailable n/a
7 Unavailable n/a
8 Unavailable n/a
9 Unavailable n/a
10 Unavailable n/a
11 Unavailable n/a
12 Unavailable n/a
13 Available €40
14 Unavailable n/a
15 Unavailable n/a
16 Unavailable n/a
17 Unavailable n/a
18 Unavailable n/a
19 Available €36
20 Available €49
21 Unavailable n/a
22 Available €49
23 Unavailable n/a
24 Unavailable n/a
25 Available €40
26 Available €39
27 Available €35
28 Available €37
29 Available €37
30 Available €37
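If the rows need to go into Scrapy items rather than being printed, the availability check and the tooltip text could be turned into structured values first. The following is only a minimal sketch: the helper names is_available, parse_price and to_record are hypothetical, the sample class strings are illustrative inputs, and it assumes the tooltip text contains a plain integer amount such as "€40".

```python
import re

# Class that marks a calendar cell as not selectable (taken from the answer above).
UNSELECTABLE = "ui-datepicker-unselectable"


def is_available(class_attr):
    """Return True unless the cell's class list contains the unselectable marker."""
    # Split into class tokens so only an exact class name matches.
    return UNSELECTABLE not in (class_attr or "").split()


def parse_price(tooltip_text):
    """Return the first integer found in the tooltip text, or None if there is none."""
    match = re.search(r"\d+", tooltip_text or "")
    return int(match.group()) if match else None


def to_record(day, class_attr, tooltip_text):
    """Build a plain dict for one calendar cell (hypothetical record layout)."""
    available = is_available(class_attr)
    return {
        "day": int(day),
        "available": available,
        "price": parse_price(tooltip_text) if available else None,
    }


print(to_record("13", "ui-datepicker-current-day", "€40"))
# {'day': 13, 'available': True, 'price': 40}
print(to_record("14", "ui-datepicker-unselectable ui-state-disabled", ""))
# {'day': 14, 'available': False, 'price': None}
```

In the loop above, to_record(day, cell.get_attribute("class"), price) would be the natural call site; the currency symbol is dropped here, so keep the raw text as well if the currency matters.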
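Hi, please find the answer below.
Important note: we wait a few seconds after the click event on the calendar, because the page's JavaScript needs some internal processing time after the calendar opens.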
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.interactions.Actions;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

// The class name is arbitrary; it is only here so the snippet compiles.
public class DatepickerPrices {
    public static void main(String[] args) throws InterruptedException {
        // Backslashes in a Java string literal must be escaped.
        System.setProperty("webdriver.chrome.driver", "D:\\eclipseProject\\Whosebug\\chromedriver_win32 (1)\\chromedriver.exe");
        WebDriver driver = new ChromeDriver();
        driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
        driver.manage().window().maximize();
        Actions act = new Actions(driver);
        // Selenium 3 style constructor; Selenium 4 takes a java.time.Duration instead.
        WebDriverWait wait = new WebDriverWait(driver, 30);
        driver.get("https://www.airbnb.pt/rooms/265820?check_in=2016-04-26&guests=1&check_out=2016-04-29");
        // select the first date picker (check in)
        driver.findElement(By.xpath("//*[@class='col-sm-6']/input")).click();
        // NOTE: sleep because the page's JavaScript needs processing time after the calendar opens
        Thread.sleep(5000);
        // NOTE: the calendar is not completely visible, so scroll down a little
        ((JavascriptExecutor) driver).executeScript("window.scrollBy(0,200)");
        // collect all selectable calendar dates into a list
        List<WebElement> myhAvDates = driver.findElements(By.xpath("//*[@class='ui-datepicker-calendar']/tbody/tr/td/a[contains(@class, 'ui-state-default')]"));
        System.out.println("Size is " + myhAvDates.size());
        for (int i = 0; i < myhAvDates.size(); i++) {
            wait.until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.xpath("//*[@class='ui-datepicker-calendar']/tbody/tr/td/a")));
            System.out.println("Available Date is : " + myhAvDates.get(i).getText());
            // hover over the date and wait for the price tooltip
            act.moveToElement(myhAvDates.get(i)).build().perform();
            wait.until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector(".datepicker-tooltip")));
            WebElement toolTipElement = driver.findElement(By.cssSelector(".datepicker-tooltip"));
            System.out.println("Available Date is : " + myhAvDates.get(i).getText() + "==" + "And price is " + toolTipElement.getText());
            // re-fetch the date links before the next iteration
            myhAvDates = driver.findElements(By.xpath("//*[@class='ui-datepicker-calendar']/tbody/tr/td/a"));
        }
    }
}
The code above produces output like:
Size is 10
Available Date is : 13
Available Date is : 13==And price is €40
Available Date is : 19
Available Date is : 19==And price is €36
Available Date is : 20
Available Date is : 20==And price is €49
Available Date is : 22
Available Date is : 22==And price is €49
Available Date is : 25
Available Date is : 25==And price is €40
Available Date is : 26
Available Date is : 26==And price is €39
Available Date is : 27
Available Date is : 27==And price is €35
Available Date is : 28
Available Date is : 28==And price is €37
Available Date is : 29
Available Date is : 29==And price is €37
Available Date is : 30
Available Date is : 30==And price is €37
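The fixed Thread.sleep(5000) above could probably be replaced by a condition-based wait, as the first answer does with WebDriverWait. To illustrate the idea behind such waits, here is a simplified polling loop in plain Python. It is a sketch of the general technique, not Selenium's actual implementation; wait_until and calendar_ready are hypothetical names, and calendar_ready is only a stand-in for a real check such as "the calendar is visible and no longer loading".

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Stand-in condition: pretends the calendar is ready on the third check.
calls = {"n": 0}


def calendar_ready():
    calls["n"] += 1
    return calls["n"] >= 3


wait_until(calendar_ready, timeout=2.0, poll=0.01)
print(calls["n"])  # 3
```

Compared with a fixed sleep, this returns as soon as the condition holds and fails loudly if it never does, which is the main reason to prefer explicit waits on a page this dynamic.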