Python: remove characters from a database query result list before parsing the URL
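I have a big problem and I really don't know what to do about it.
My database has 50 rows of movie URLs. Example:
http://www.csfd.cz/hledat/?q=new+girl+s05e03
When I run a query against the database, I get back a list like this:
['http://www.csfd.cz/hledat/?q=new+girl+s05e03'] ...
The goal is to take each URL from that list and pass it to a function that fetches the HTML content (BeautifulSoup):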
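import requests
from bs4 import BeautifulSoup

def csfd_content(url):
    # Download the page and parse it into a BeautifulSoup tree.
    content = requests.get(url).content
    soup = BeautifulSoup(content, "html.parser")
    return soup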
This is what I tried:
##CSFD BEGIN
cur.execute('Select search_name from movies')
urls = cur.fetchall()
for url in urls:
    search_url = csfd_content(url)
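The result is:
No connection adapters were found for '['http://www.csfd.cz/hledat/?q=new+girl+s05e03']'
So clearly what gets passed in is not a valid URL! Can someone help me get a normal URL back, without the ['']?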
Accessing rows from the cursor:
https://docs.python.org/2/library/sqlite3.html
See section 11.13.4.
An extract:
class sqlite3.Row
A Row instance serves as a highly optimized row_factory for Connection objects. It tries to mimic a tuple in most of its features.
It supports mapping access by column name and index, iteration, representation, equality testing and len().
If two Row objects have exactly the same columns and their members are equal, they compare equal.
Changed in version 2.6: Added iteration and equality (hashability).
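Applied to the question above, and provided the connection's row_factory is set to sqlite3.Row (the default row type is a plain tuple, which cannot be indexed by column name), use:
url['name of column in dbase']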
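As a rough illustration of the name-based access described above, here is a minimal sketch. It assumes an in-memory SQLite database standing in for the real one, and the variable names (conn, urls_by_name) are made up for the example; only the movies table and the search_name column come from the question.

```python
import sqlite3

# Assumption: an in-memory database stands in for the asker's real one.
conn = sqlite3.connect(":memory:")
# Ask sqlite3 to return Row objects instead of plain tuples.
# Set this before creating the cursor.
conn.row_factory = sqlite3.Row
cur = conn.cursor()

cur.execute("CREATE TABLE movies (search_name TEXT)")
cur.execute(
    "INSERT INTO movies (search_name) VALUES (?)",
    ("http://www.csfd.cz/hledat/?q=new+girl+s05e03",),
)

cur.execute("SELECT search_name FROM movies")
# Each row can now be indexed by column name (or still by position).
urls_by_name = [row["search_name"] for row in cur.fetchall()]
print(urls_by_name)
```

Each element of urls_by_name is a plain string, so it can be handed to csfd_content() directly.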
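This happens because cursor.fetchall() returns a list of tuples (or possibly lists), one per row, so you are passing a tuple to requests.get() when it expects a string. To fix it, pass the first item of each tuple to requests.get(). You can use url[0]: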
cur.execute('Select search_name from movies')
urls = cur.fetchall()
for url in urls:
    search_url = csfd_content(url[0])
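To see the row shape for yourself, the following sketch may help. It assumes an in-memory SQLite database in place of the real one; the second URL and the names rows and plain_urls are invented for illustration. It makes no network requests and only shows how to turn the fetched rows into plain strings.

```python
import sqlite3

# Assumption: an in-memory database stands in for the asker's real one.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE movies (search_name TEXT)")
cur.executemany(
    "INSERT INTO movies (search_name) VALUES (?)",
    [
        ("http://www.csfd.cz/hledat/?q=new+girl+s05e03",),
        # Hypothetical second row, added only for this example.
        ("http://www.csfd.cz/hledat/?q=new+girl+s05e04",),
    ],
)

cur.execute("SELECT search_name FROM movies ORDER BY rowid")
rows = cur.fetchall()

# With the default row factory, every row is a tuple,
# even when only one column is selected.
print(type(rows[0]), rows[0])

# Unpack the single column of each row to get plain strings.
plain_urls = [url for (url,) in rows]
print(plain_urls)
```

The (url,) unpacking is equivalent to url[0] here, and it raises an error if a row ever contains more than one column, which can catch a changed query early.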