Python Selenium driver keeps old data after click

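I am using Python Selenium to scrape YouTube video URLs. I first load the home page, then click a random result. On the second page I want to get the suggested videos on the right. But when I do this, the driver just adds the suggested videos to the list of videos from the home page. I don't know why... Do I need to reset or clear something between the two find_elements calls?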

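import random
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumed; the question does not show how the driver is created

driver.get('https://www.youtube.com/')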
time.sleep(8)
items = driver.find_elements(By.XPATH, "//a[@id='thumbnail'][@class='yt-simple-endpoint inline-block style-scope ytd-thumbnail'][contains(@href, 'watch?v=')]")

for i in items:
    url = i.get_attribute("href")
    print(str(url))

rand = random.choice(items)
rand.click()
time.sleep(10)

# GET SUGGESTED VIDEO ON THE RIGHT
yt_right_pane_items = driver.find_elements(By.XPATH, "//a[@id='thumbnail'][@class='yt-simple-endpoint inline-block style-scope ytd-thumbnail'][contains(@href, 'watch?v=')]")

for i in yt_right_pane_items:
    url = i.get_attribute("href")
    print(str(url))

Output from the home page:

https://www.youtube.com/watch?v=0YuC4ZJJI5c
https://www.youtube.com/watch?v=FyUIEU1qW1w&t=13147s
https://www.youtube.com/watch?v=H9-ekUCFCr0
https://www.youtube.com/watch?v=BoVAOpSiD_A
https://www.youtube.com/watch?v=lJqDZKAxOOY
https://www.youtube.com/watch?v=nJL1k37T6r8
https://www.youtube.com/watch?v=o1dhGnZIxfI
https://www.youtube.com/watch?v=y57jYUogWFs
https://www.youtube.com/watch?v=4V0e9IpzSfs

Second output = videos from the first find_elements + videos from the second find_elements:

https://www.youtube.com/watch?v=0YuC4ZJJI5c
https://www.youtube.com/watch?v=FyUIEU1qW1w&t=13147s
https://www.youtube.com/watch?v=H9-ekUCFCr0
https://www.youtube.com/watch?v=BoVAOpSiD_A
https://www.youtube.com/watch?v=lJqDZKAxOOY
https://www.youtube.com/watch?v=nJL1k37T6r8
https://www.youtube.com/watch?v=o1dhGnZIxfI
https://www.youtube.com/watch?v=y57jYUogWFs
https://www.youtube.com/watch?v=4V0e9IpzSfs
https://www.youtube.com/watch?v=jHa20EBYPU8
https://www.youtube.com/watch?v=ImnTNcqtvlY
https://www.youtube.com/watch?v=ppiIs2YoFqo
https://www.youtube.com/watch?v=P3TFt5oqDJU
https://www.youtube.com/watch?v=BisnRXb_sk0
https://www.youtube.com/watch?v=l5Pjhl1vgUw
https://www.youtube.com/watch?v=nvsZKNYwHt0
https://www.youtube.com/watch?v=L6VBHflOeuY
https://www.youtube.com/watch?v=1MPRbX7ACh8

With the second find_elements, I only want to get the new videos from the page I clicked through to.

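The problem is neither Selenium nor the list but YouTube: after the click it keeps the home page links in the page and only hides them.

Your xpath searches all links, even the hidden ones, but it should search only in the visible part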

//div[@id='columns']

Full xpath

//div[@id='columns']//a[@id='thumbnail'][@class='yt-simple-endpoint inline-block style-scope ytd-thumbnail'][contains(@href, 'watch?v=')]

If you want only the SUGGESTED VIDEO ON THE RIGHT, then search in

//div[@id='related'] 

Full xpath

//div[@id='related']//a[@id='thumbnail'][@class='yt-simple-endpoint inline-block style-scope ytd-thumbnail'][contains(@href, 'watch?v=')]
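
As a rough sketch, the scoped xpath could be used in the question's script like this. The helper names thumbnail_xpath and collect_urls are made up for illustration, and the ids columns and related are the ones mentioned above; YouTube's markup may change, so treat this as an assumption rather than a guaranteed selector.

```python
# XPath for video thumbnail links, taken from the question.
THUMBNAIL = (
    "//a[@id='thumbnail']"
    "[@class='yt-simple-endpoint inline-block style-scope ytd-thumbnail']"
    "[contains(@href, 'watch?v=')]"
)


def thumbnail_xpath(container_id):
    """Build an XPath that matches thumbnails only inside div#container_id."""
    return f"//div[@id='{container_id}']{THUMBNAIL}"


def collect_urls(driver, container_id):
    """Return the href of every thumbnail link inside div#container_id."""
    # "xpath" is the string value of Selenium's By.XPATH constant.
    elements = driver.find_elements("xpath", thumbnail_xpath(container_id))
    return [element.get_attribute("href") for element in elements]


# Usage after rand.click() and the wait, assuming `driver` is the
# WebDriver from the question:
# for url in collect_urls(driver, "related"):
#     print(url)
```

Passing "columns" instead of "related" gives the first xpath, limited to the visible part of the page.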


Another method is to use set() to remove the duplicated elements

new = list( set(second_list) - set(first_list) )

duplicated = list( set(second_list) & set(first_list) )

This is useful because you can get duplicates in the suggestions on every page.
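
A small sketch of the set() idea applied to URL strings, assuming first_list and second_list hold the hrefs printed by the two loops in the question. The function name split_new_and_duplicated and the sample URLs are hypothetical. Unlike a plain set difference, it keeps the order in which the links appeared on the page.

```python
def split_new_and_duplicated(first_list, second_list):
    """Split second_list into URLs absent from first_list and URLs shared with it."""
    first = set(first_list)  # a set gives fast membership tests
    new, duplicated = [], []
    # dict.fromkeys() drops repeats within second_list but keeps the original order.
    for url in dict.fromkeys(second_list):
        if url in first:
            duplicated.append(url)
        else:
            new.append(url)
    return new, duplicated


# Hypothetical sample data standing in for the two scraped lists.
first_list = [
    "https://www.youtube.com/watch?v=aaa",
    "https://www.youtube.com/watch?v=bbb",
]
second_list = first_list + [
    "https://www.youtube.com/watch?v=ccc",
    "https://www.youtube.com/watch?v=ddd",
]

new, duplicated = split_new_and_duplicated(first_list, second_list)
for url in new:
    print(url)
```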