Crawling 'UserWarning': what should I do?

Question

I found this web crawler on Google. It worked fine a month ago, but now it no longer works.
I don't know what happened.
What is wrong, and how can I fix it?

Code

from urllib.request import urlopen
from urllib.request import urlretrieve
from urllib.parse import quote_plus
from bs4 import BeautifulSoup
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

search= input('검색어:')
url = f'https://www.google.com/search?q={quote_plus(search)}&source=inms&tbm=isch&sa=X&ved=2haUKEwid64aF87LoAhUafd4KHcEtBZEQ_AUoAXoECBgQAw&biw=1536&bih=754'

driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get(url)
for i in range(500):
    driver.execute_script("window.scrollBy(0,10000)")

html = driver.page_source
soup = BeautifulSoup(html)
img = soup.select('.rg_i.Q4LuWd.tx8vtf') 
n = 1
imgurl = []
for i in img:
    try:
        imgurl.append(i.attrs['src'])
    except KeyError:
        imgurl.append(i.attrs["data-src"])

for i in imgurl:
    urlretrieve(i,"크롤링 예예/"+ search + str(n)+ ".jpg")
    n +=1
    print(imgurl)
    if (n==15):
        break


driver.close()

Error message

[WDM] - Cache is valid for [03/07/2020]
[WDM] - Looking for [chromedriver 83.0.4103.39 win32] driver in cache
[WDM] - Driver found in cache [C:\Users\u\.wdm\drivers\chromedriver.0.4103.39\win32\chromedriver.exe]

DevTools listening on ws://127.0.0.1:57086/devtools/browser/fc2f441e-49f8-466c-aa17-7e29c3e27ac2
yt.py:17: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 17 of the file yt.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.

  soup = BeautifulSoup(html)

Thanks for your help.

Answer 1

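You need to change this line

soup = BeautifulSoup(html)

to

soup = BeautifulSoup(html, 'lxml')

and the warning should go away. Note that the message is only a warning: BeautifulSoup already chose lxml on its own, and naming the parser explicitly just keeps that choice the same on every system.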
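
As a minimal sketch of the fix above, the snippet below parses a small hand-written HTML string (a hypothetical stand-in for driver.page_source) with an explicitly named parser and records any warnings raised while doing so. It uses Python's built-in "html.parser" so that it runs without lxml installed; passing 'lxml' as in the answer works the same way when lxml is available. The sample markup and the img.rg_i selector are assumptions for illustration, not Google's actual page structure.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source (not real Google markup).
html = (
    '<div>'
    '<img class="rg_i" src="a.jpg">'
    '<img class="rg_i" data-src="b.jpg">'
    '</div>'
)

# Record every warning raised while the soup is built.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Naming the parser explicitly is what silences the UserWarning.
    soup = BeautifulSoup(html, "html.parser")

# Same src / data-src fallback as in the question, written with .get().
imgurl = [img.get("src") or img.get("data-src") for img in soup.select("img.rg_i")]

parser_warnings = [
    w for w in caught if "No parser was explicitly specified" in str(w.message)
]
print("parser warnings:", len(parser_warnings))
print("image URLs:", imgurl)
```

If the parser argument is removed from the BeautifulSoup call, the same warning shown in the question should reappear in the recorded list.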

python

selenium

beautifulsoup

web-crawler

web-scraping