How to extract the view count of every video in a YouTube search results page using Selenium?
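What I want:
- To be able to extract the view count of every video that appears on the YouTube search results page, using Selenium.
- For example: if I search for "Believer from Imagine Dragons" on YouTube, it should give me the view counts of all the resulting videos (e.g. 104M views, 1.5B views, 698M views, etc.), for up to, say, the first 20 videos.
What I tried: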
from selenium import webdriver
driver=webdriver.Chrome(executable_path='C:\ProgramData\chocolatey\bin\chromedriver.exe')
search = 'Believer from Imagine Dragons'
driver.get("https://www.youtube.com/results?search_query=" + search)
main = driver.find_elements_by_id("metadata")
for datas in main:
    info = datas.find_elements_by_id("metadata-line")
    for views in info:
        view_counts = views.find_element_by_xpath("""//*[@id="metadata-line"]/span[1]""")
        print('view_counts: ' + str(view_counts.text))
This outputs:
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
I also tried:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver=webdriver.Chrome(executable_path='C:\ProgramData\chocolatey\bin\chromedriver.exe')
search = 'Believer from Imagine Dragons'
driver.get("https://www.youtube.com/results?search_query=" + search)
main = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "metadata"))
)
data = main.find_elements_by_id("metadata-line")
for datas in data:
    views = datas.find_element_by_xpath("""//*[@id="metadata-line"]/span[1]""")
    print(views.text)
This outputs:
104M views
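However, neither of them gave me what I wanted. Please help.
Future goal (if you can help with it):
- To be able to play the video with the most views on that page.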
To extract the text (for example, the view count) from each <span> using Selenium and Python, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR and get_attribute("innerHTML"):
driver.get("https://www.youtube.com/results?search_query=Believer%20from%20Imagine%20Dragons")
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div#metadata-line span:first-child")))])
Using XPATH and the text attribute:
driver.get("https://www.youtube.com/results?search_query=Believer%20from%20Imagine%20Dragons")
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@id='metadata-line']/span[@class='style-scope ytd-video-meta-block' and contains(., 'views')]")))])
Console output:
['1.5B views', '104M views', '32M views', '93M views', '98M views', '2.3M views', '39M views', '26M views', '1.4B views', '9.6M views', '6.7M views', '748K views', '1.3B views', '11M views', '84M views', '51M views', '13M views', '18M views', '197M views', '7.2M views', '79K views', '3.5M views']
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
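For the future goal mentioned in the question (playing the most-viewed video), the strings in the console output first have to be turned into comparable numbers. The following is only a minimal sketch of that step, not part of Selenium: `parse_views` and `SUFFIXES` are hypothetical names, and it assumes every label starts with a number optionally followed by K, M or B, as in the output shown above. A label such as "No views" is not handled.

```python
# Hypothetical helper (not a Selenium API): turn '1.5B views' into an integer.
# Assumption: labels look like '<number>[K|M|B] views', as in the console output.
SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

def parse_views(label):
    """Convert a label such as '104M views' into an integer count."""
    token = label.split()[0].replace(",", "")  # e.g. '104M'
    suffix = token[-1].upper()
    if suffix in SUFFIXES:
        return round(float(token[:-1]) * SUFFIXES[suffix])
    return int(token)

# A few sample labels taken from the console output.
labels = ['1.5B views', '104M views', '748K views', '79K views']

# Index of the label with the highest parsed view count.
best = max(range(len(labels)), key=lambda i: parse_views(labels[i]))
print(best, labels[best])  # prints: 0 1.5B views
```

The resulting index could then be used to pick the corresponding element from the list returned by WebDriverWait; locating and clicking that video's link depends on the page markup and is not shown here.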
Outro
Link to useful documentation:
- get_attribute() method: gets the given attribute or property of the element.
- text attribute: returns the text of the element.
- Difference between text and innerHTML using Selenium