BeautifulSoup: SyntaxError: invalid character in identifier


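I am trying to scrape all the dates from this web page. The dates are inside a table.

How: use find, specifying the table's element and its attribute (highlighted in blue).

Problem: when I try to extract the whole table I get a syntax error: invalid character in identifier.

Other relevant information: the site requires a username and password, so I use a session to store my credentials.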

import requests
from getpass import getpass
from requests import get
from requests.exceptions import RequestException
from contextlib import closing
from bs4 import BeautifulSoup
from requests.auth import HTTPBasicAuth

URL = "https://d2l.pima.edu/d2l/lms/dropbox/user/folders_list.d2l?ou=475011&isprv=0"
s = requests.Session()
s.auth = ("myusername", "mypass")
s.headers.update({"x-test": "true"}) 

# Fetch through the session so the stored credentials are sent;
# both "x-test" and "x-test2" headers go out with this request
page = s.get(URL, headers={"x-test2": "true"})

soup = BeautifulSoup(page.content, "html.parser")
results = soup.find("div", attrs= {"id":"id_content_r_c1"}​)

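The error points at the last line of code: invalid character in identifier. However, I triple-checked it and compared it with other code that works, and I could not find any difference.

Also, here is the DOC of my web page.

Traceback: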

runfile('/Users/rahelmizrahi/Python/scripts/d2lwebscrape1.py', wdir='/Users/rahelmizrahi/Python/scripts')
  File "/Users/rahelmizrahi/Python/scripts/d2lwebscrape1.py", line 26
    results = soup.find("div", attrs= {"id":"id_content_r_c1"}​)
                                                              ^
SyntaxError: invalid character in identifier

This is probably the result of copy/pasting code. Let's look at the failing line, character by character:

>>> import unicodedata as ud
>>> s = 'results = soup.find("div", attrs= {"id":"id_content_r_c1"})'
>>> for c in s:print(c, ud.name(c))
... 
r LATIN SMALL LETTER R
e LATIN SMALL LETTER E
s LATIN SMALL LETTER S
u LATIN SMALL LETTER U
l LATIN SMALL LETTER L
t LATIN SMALL LETTER T
s LATIN SMALL LETTER S
  SPACE
= EQUALS SIGN
  SPACE
s LATIN SMALL LETTER S
...
1 DIGIT ONE
" QUOTATION MARK
} RIGHT CURLY BRACKET
 ZERO WIDTH SPACE
) RIGHT PARENTHESIS

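The second-to-last character, "ZERO WIDTH SPACE" (U+200B), is invisible, and it is the problem: Python does not treat it as whitespace, so the parser reads it as part of an identifier and rejects it. Delete it, or retype the line by hand.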
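
If the stray character could be anywhere in a pasted script, checking line by line in the REPL gets tedious. Below is a minimal sketch of the same idea applied to a whole block of text. The helper names `find_invisible` and `strip_invisible` are made up for this example, and it assumes that "invisible" means Unicode category Cf (format characters), which includes ZERO WIDTH SPACE.

```python
import unicodedata


def find_invisible(text):
    """Return (line, column, name) for every Unicode format (Cf) character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Assumption: category "Cf" is a good proxy for "invisible".
            if unicodedata.category(ch) == "Cf":
                hits.append((line_no, col, unicodedata.name(ch, "UNKNOWN")))
    return hits


def strip_invisible(text):
    """Return text with every Cf character removed."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# The failing line from the question, with the hidden character written
# as an explicit escape so it is visible here.
sample = 'results = soup.find("div", attrs={"id": "id_content_r_c1"}\u200b)'

for line_no, col, name in find_invisible(sample):
    print(f"line {line_no}, column {col}: {name}")

print(strip_invisible(sample))
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the result to `find_invisible`. Stripping every Cf character is fine for plain ASCII source, but it may also remove characters that are intentional inside string literals (for example the zero width joiner used in some emoji sequences), so review the reported positions before rewriting a file.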