AttributeError: 'NoneType' object has no attribute 'text'. Web scraping indeed with Python
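I wasn't satisfied with the other questions posted on this site.
My goal is to scrape job postings from Indeed.com. I'm running into an attribute error. I don't know why I'm getting this error, since I made sure the tags match between the HTML and the Python. Can anyone help me solve this?
Code: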
import urllib.request as urllib
from bs4 import BeautifulSoup
import csv

# empty array for results
results = []

# initialize the Indeed URL to url string
url = 'https://www.indeed.com/jobs?q=software+developer&l=Phoenix,+AZ&jt=fulltime&explvl=entry_level'
soup = BeautifulSoup(urllib.urlopen(url).read(), 'html.parser')
results = soup.find_all('div', attrs={'class': 'jobsearch-SerpJobCard'})

for i in results:
    title = i.find('div', attrs={"class":"title"})
    print('\ntitle:', title.text.strip())
    salary = i.find('span', attrs={"class":"salaryText"})
    print('salary:', salary.text.strip())
    company = i.find('span', attrs={"class":"company"})
    print('company:', company.text.strip())
Error log:
Traceback (most recent call last):
  File "c:/Users/Scott/Desktop/code/ScrapingIndeed/index.py", line 16, in <module>
    print('salary:', salary.text.strip())
AttributeError: 'NoneType' object has no attribute 'text'
The markup from indeed.com that I'm trying to scrape:
<span class="salaryText">
- an hour</span>
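The answer is fairly simple. You need to look at the source of the HTML you are trying to scrape.
Not all div elements contain the salary information you are looking for. For those job cards, i.find(...) finds no match and returns None, and None has no text attribute, so salary.text raises the AttributeError.
All you need to do is check that the salary lookup actually found something before using its text.
For example, look at the revised code: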
salary = i.find('span', attrs={"class":"salaryText"})
if salary is not None:
    print('salary:', salary.text)
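To see the behaviour without hitting the live site, here is a minimal sketch that parses a small hand-written snippet. The markup, job titles, company names and salary figure are made up for illustration (only the class names come from the question), and the second card deliberately has no salaryText span.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration: the second card has no salaryText span.
html = """
<div class="jobsearch-SerpJobCard">
  <div class="title">Junior Developer</div>
  <span class="company">Example Co</span>
  <span class="salaryText">
  $20 - $25 an hour</span>
</div>
<div class="jobsearch-SerpJobCard">
  <div class="title">Software Engineer</div>
  <span class="company">Sample Inc</span>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')
cards = soup.find_all('div', attrs={'class': 'jobsearch-SerpJobCard'})

salaries = []
for card in cards:
    salary = card.find('span', attrs={'class': 'salaryText'})
    # find() returns None when nothing matches, so guard before using .text
    salaries.append(salary.text.strip() if salary is not None else None)

print(salaries)
```

Without the guard, the loop would fail on the second card with the same AttributeError shown in the question.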
The full code is as follows:
import urllib.request as urllib
from bs4 import BeautifulSoup
import csv

# empty array for results
results = []

# initialize the Indeed URL to url string
url = 'https://www.indeed.com/jobs?q=software+developer&l=Phoenix,+AZ&jt=fulltime&explvl=entry_level'
soup = BeautifulSoup(urllib.urlopen(url).read(), 'html.parser')
results = soup.find_all('div', attrs={'class': 'jobsearch-SerpJobCard'})

for i in results:
    title = i.find('div', attrs={"class":"title"})
    print('\ntitle:', title.text.strip())
    salary = i.find('span', attrs={"class":"salaryText"})
    if salary is not None:
        print('salary:', salary.text)
    company = i.find('span', attrs={"class":"company"})
    print('company:', company.text.strip())
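The title and company lookups can return None for the same reason, so it may be worth guarding all three fields the same way. One possible approach is a small helper; text_or_default is a hypothetical name, not part of BeautifulSoup, and the stand-in objects below only mimic the .text attribute of a tag so the sketch runs without any network access.

```python
from types import SimpleNamespace


def text_or_default(node, default="N/A"):
    """Return node.text with surrounding whitespace removed, or default if node is None."""
    if node is None:
        return default
    return node.text.strip()


# Stand-ins for what i.find(...) might return; the salary text is made up.
found = SimpleNamespace(text="\n$20 - $25 an hour")
missing = None

salary_found = text_or_default(found)
salary_missing = text_or_default(missing)

print('salary:', salary_found)
print('salary:', salary_missing)
```

In the loop above this would be used as print('salary:', text_or_default(salary)), and likewise for title and company.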