Python Selenium - fetching titles and links in a loop
I am trying to do this with Selenium in Python:
Title 1
Link 1
Title 2
Link 2
...
This is the code I have so far:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
PATH = r"C:\Users\Desktop\py\msedgedriver.exe"
driver = webdriver.Edge(PATH)
driver.maximize_window()
driver.get('https://www.google.com/')
searchbar = driver.find_element(by=By.CLASS_NAME, value='gLFyf')
searchbar.send_keys('selenium')
searchbar.send_keys(Keys.RETURN)
titles = driver.find_elements(by=By.CLASS_NAME, value='LC20lb')
links = driver.find_elements(by=By.TAG_NAME, value='a')
for link in links:
    href = link.get_attribute('href')
    print(href)
for title in titles:
    print(title.text)
time.sleep(5)
driver.quit()
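However, the links that get printed are Google search links, not the links to the websites themselves. Also, all of the links are printed before the titles (I understand why this happens, but I don't know how to fix it).
Is there a way to solve these two problems? Thanks in advance.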
Replace the two for loops in your code with:
for i, link in enumerate(links):
    try:
        print(titles[i].text)
    except IndexError:
        # There are more links than titles, so later links have no title.
        pass
    print(link.get_attribute("href"))
    print()
Output:
Selenium Tutorial for Beginners: Learn WebDriver & Testing
https://www.google.com/search?q=selenium&source=lnms&tbm=bks&sa=X&ved=2ahUKEwiHuNKqiOv3AhXITWwGHZXxBlwQ_AUoAXoECAIQAw
Selenium: Definition, How it works and Why you need it
https://www.google.com/search?q=selenium&source=lnms&tbm=isch&sa=X&ved=2ahUKEwiHuNKqiOv3AhXITWwGHZXxBlwQ_AUoAnoECAIQBA
What Is Selenium - A Tutorial on How to Use ... - LambdaTest
https://www.google.com/search?q=selenium&source=lnms&tbm=vid&sa=X&ved=2ahUKEwiHuNKqiOv3AhXITWwGHZXxBlwQ_AUoA3oECAIQBQ
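Note that the output above interleaves titles and links, but the links are still Google's own (the tbm=bks, tbm=isch and tbm=vid URLs are the Books, Images and Videos tabs). This is because links holds every a element on the page, so titles[i] and links[i] do not belong to the same result. A possible way around this, sketched below, is to start from each title and look up the a element that encloses it. This assumes each LC20lb title sits inside the anchor that points to the result page, which is an assumption about Google's markup and may change. The helper names collect_results and print_results are hypothetical, not part of Selenium.

```python
def collect_results(title_elements):
    """Return (title, href) pairs, one per non-empty title element.

    Assumes each title element is nested inside the <a> that links to
    the result page (an assumption about the page's markup).
    """
    results = []
    for title in title_elements:
        text = title.text.strip()
        if not text:
            # Selenium returns an empty string for elements that are
            # not displayed, so skip those.
            continue
        # "xpath" is the string value of selenium's By.XPATH.
        anchor = title.find_element("xpath", "./ancestor::a")
        results.append((text, anchor.get_attribute("href")))
    return results


def print_results(results):
    """Print each title, then its link, then a blank line."""
    for text, href in results:
        print(text)
        print(href)
        print()
```

With the driver from the question, both for loops could then be replaced by print_results(collect_results(titles)), where titles is the list already returned by driver.find_elements(by=By.CLASS_NAME, value='LC20lb'). If a title has no enclosing anchor, find_element raises NoSuchElementException, so you may want to catch that if the page layout differs.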