Getting incorrect link in BeautifulSoup Python web scraper
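I am writing a web scraper and am struggling to get the href links from a web page. The URL is https://www.seedinvest.com/auto and I am trying to get the href links of the individual listings. Here is an example: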
<a class="card-url full-size-content" href="https://www.seedinvest.com/soil.connect/seed"></a>
Here is my code:
import pandas as pd
import os
import requests
from bs4 import BeautifulSoup
import re
URL = "https://www.seedinvest.com/offerings"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")
# Search for all embedded company website links and display them
links = []
for link in soup.findAll(class_="card-url full-size-content"):
    links.append(link.get('href'))
print(links)
When I run my code, I get this:
[]
Can you help me find the right links?
Perhaps you used the wrong URL in your code: https://www.seedinvest.com/offerings instead of https://www.seedinvest.com/auto?
It works with the /auto URL:
import pandas as pd
import os
import requests
from bs4 import BeautifulSoup
import re
URL = "https://www.seedinvest.com/auto"
page = requests.get(URL)
soup = BeautifulSoup(page.text, "html.parser")
# Search for all embedded company website links and display them
links = []
for link in soup.findAll(class_="card-url full-size-content"):
    links.append(link.get('href'))
print(links)
Output:
['https://www.seedinvest.com/nowrx/series.c', 'https://www.seedinvest.com/appmail/seed', 'https://www.seedinvest.com/soil.connect/seed', 'https://www.seedinvest.com/cytonics/series.c.2']
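When a scraper returns an empty list, it can help to check the selector separately from the network request. The following is a minimal sketch under that assumption: it parses a small inline HTML snippet built around the example anchor from the question instead of fetching the live page. The names `SAMPLE_HTML` and `extract_card_links` are hypothetical, introduced only for illustration, and the two extra anchors in the snippet are made up to show what gets filtered out.

```python
from bs4 import BeautifulSoup

# Inline stand-in for the page; only the first anchor comes from the question.
SAMPLE_HTML = """
<div>
  <a class="card-url full-size-content" href="https://www.seedinvest.com/soil.connect/seed"></a>
  <a class="other" href="https://www.seedinvest.com/offerings"></a>
  <a class="card-url full-size-content"></a>
</div>
"""


def extract_card_links(html):
    """Return the href of every <a> tag that carries both card classes."""
    soup = BeautifulSoup(html, "html.parser")
    # The CSS selector matches both classes in any order and skips anchors
    # that have no href attribute, so the result never contains None.
    return [a["href"] for a in soup.select("a.card-url.full-size-content[href]")]


links = extract_card_links(SAMPLE_HTML)
print(links)  # ['https://www.seedinvest.com/soil.connect/seed']
```

If the selector works on a saved copy of the page but the live request still yields nothing, the URL passed to `requests.get` is the next thing to check, as in the answer above.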