Check a variable for NoneType and break a while loop

Question

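I'm fairly new to programming and have started teaching myself web scraping with Python. I'm scraping player data from multiple pages of a website and have built a while loop that grabs the href of the 'next' button to reach the next player's page. Everything works, except for breaking out of the while loop after the last available player. There the 'next' button is greyed out and has no link behind it, so I want to stop iterating and save everything to csv.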

My script looks like this:

#name base url and first page to start

BaseUrl = #url
PageUrl = #also url

while True:

  #scraping tables

  try:
      # retrieve link for 'next' player in order
      link = soup.find(attrs={"class": "go_to_next_player"}).get('href')
      # join base url and new link href
      PageUrl = BaseUrl + link
      if link is None:
          break
  except IndexError as e:
      print(e)
      break

#writing to csv

I thought I could check whether the retrieved href is empty, hence the 'is None' check followed by a break, but I get this error:

In line > PageUrl = BaseUrl + link
TypeError: must be str, not NoneType

Any help is appreciated! I'm new to this, so please excuse my beginner code.

Answer 1

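In your code the concatenation BaseUrl + link runs before the 'is None' check, so the TypeError is raised before the check is ever reached. Check whether link is None before doing anything with it, and break out of the loop if it is: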

if link is not None:
    PageUrl = BaseUrl + link
else:
    break
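
For context, here is a minimal sketch of how the whole loop might look with the check moved ahead of the concatenation. The helper names next_href and collect_page_urls, and the load_soup parameter (a callable that takes a URL and returns the parsed BeautifulSoup document), are hypothetical and not part of the original script. The sketch also guards against the button element itself being missing, in which case soup.find returns None and calling .get on it would raise AttributeError rather than the IndexError the original code catches.

```python
def next_href(soup):
    """Return the 'next' button's href, or None if the button or its href is missing."""
    button = soup.find(attrs={"class": "go_to_next_player"})
    # find() returns None when nothing matches; calling .get() on None
    # would raise AttributeError, so handle that case first.
    if button is None:
        return None
    # Tag.get() returns None when the attribute is absent
    # (the greyed-out button on the last player's page).
    return button.get("href")


def collect_page_urls(base_url, first_url, load_soup):
    """Follow 'next' links starting at first_url and return every page URL visited.

    load_soup is a hypothetical callable: URL -> parsed BeautifulSoup document.
    """
    visited = []
    page_url = first_url
    while True:
        visited.append(page_url)
        soup = load_soup(page_url)
        # ... scrape the tables from soup here ...
        link = next_href(soup)
        if link is None:
            break  # last player reached: stop before concatenating
        page_url = base_url + link
    return visited
```

With requests and bs4 installed, load_soup could be something like lambda url: BeautifulSoup(requests.get(url).text, "html.parser"). The csv writing then goes after the call to collect_page_urls, once the loop has ended.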

beautifulsoup

href

web-scraping

python-3.x

nonetype