将数据从 BeautifulSoup 导出到 CSV
Export data from BeautifulSoup to CSV
[免责声明] 我已经了解了该领域的许多其他答案,但它们似乎对我不起作用。
我希望能够将我抓取的数据导出为 CSV 文件。
我的问题是如何编写将数据输出到 CSV 的代码?
当前代码
import requests
from bs4 import BeautifulSoup
url = "http://implementconsultinggroup.com/career/#/6257"
r = requests.get(url)
req = requests.get(url).text
soup = BeautifulSoup(r.content)
links = soup.find_all("a")
for link in links:
if "career" in link.get("href") and 'COPENHAGEN' in link.text:
print "<a href='%s'>%s</a>" %(link.get("href"), link.text)
代码输出
View Position
</a>
<a href='/career/management-consultants-to-help-our-customers-succeed-with-
it/'>
Management consultants to help our customers succeed with IT
COPENHAGEN • At Implement Consulting Group, we wish to make a difference in
the consulting industry, because we believe that the ability to create Change
with Impact is a precondition for success in an increasingly global and
turbulent world.
View Position
</a>
<a href='/career/management-consultants-within-process-improvement/'>
Management consultants within process improvement
COPENHAGEN • We are looking for consultants with profound
experience in Six Sigma, Lean and operational
management
我试过的代码
with open('ImplementTest1.csv',"w") as csv_file:
writer = csv.writer(csv_file)
writer.writerow(["link.get", "link.text"])
csv_file.close()
以 CSV 格式输出
第 1 列:Url 链接
第 2 列:职位描述
例如
第 1 列:/career/management-consultants-to-help-our-customers-succeed-with-
它/
第 2 栏:帮助我们的客户在 IT 方面取得成功的管理顾问
哥本哈根 • 在 Implement Consulting Group,我们希望在
咨询行业,因为我们相信创造变革的能力
具有影响力是在日益全球化和
动荡的世界。
试试这个脚本并获得 csv 输出:
import csv ; import requests
from bs4 import BeautifulSoup
outfile = open('career.csv','w', newline='')
writer = csv.writer(outfile)
writer.writerow(["job_link", "job_desc"])
res = requests.get("http://implementconsultinggroup.com/career/#/6257").text
soup = BeautifulSoup(res,"lxml")
links = soup.find_all("a")
for link in links:
if "career" in link.get("href") and 'COPENHAGEN' in link.text:
item_link = link.get("href").strip()
item_text = link.text.replace("View Position","").strip()
writer.writerow([item_link, item_text])
print(item_link, item_text)
outfile.close()
[免责声明] 我已经了解了该领域的许多其他答案,但它们似乎对我不起作用。
我希望能够将我抓取的数据导出为 CSV 文件。
我的问题是如何编写将数据输出到 CSV 的代码?
当前代码
import requests
from bs4 import BeautifulSoup
url = "http://implementconsultinggroup.com/career/#/6257"
r = requests.get(url)
req = requests.get(url).text
soup = BeautifulSoup(r.content)
links = soup.find_all("a")
for link in links:
if "career" in link.get("href") and 'COPENHAGEN' in link.text:
print "<a href='%s'>%s</a>" %(link.get("href"), link.text)
代码输出
View Position
</a>
<a href='/career/management-consultants-to-help-our-customers-succeed-with-
it/'>
Management consultants to help our customers succeed with IT
COPENHAGEN • At Implement Consulting Group, we wish to make a difference in
the consulting industry, because we believe that the ability to create Change
with Impact is a precondition for success in an increasingly global and
turbulent world.
View Position
</a>
<a href='/career/management-consultants-within-process-improvement/'>
Management consultants within process improvement
COPENHAGEN • We are looking for consultants with profound
experience in Six Sigma, Lean and operational
management
我试过的代码
with open('ImplementTest1.csv',"w") as csv_file:
writer = csv.writer(csv_file)
writer.writerow(["link.get", "link.text"])
csv_file.close()
以 CSV 格式输出
第 1 列:Url 链接
第 2 列:职位描述
例如
第 1 列:/career/management-consultants-to-help-our-customers-succeed-with- 它/
第 2 栏:帮助我们的客户在 IT 方面取得成功的管理顾问 哥本哈根 • 在 Implement Consulting Group,我们希望在 咨询行业,因为我们相信创造变革的能力 具有影响力是在日益全球化和 动荡的世界。
试试这个脚本并获得 csv 输出:
import csv ; import requests
from bs4 import BeautifulSoup
outfile = open('career.csv','w', newline='')
writer = csv.writer(outfile)
writer.writerow(["job_link", "job_desc"])
res = requests.get("http://implementconsultinggroup.com/career/#/6257").text
soup = BeautifulSoup(res,"lxml")
links = soup.find_all("a")
for link in links:
if "career" in link.get("href") and 'COPENHAGEN' in link.text:
item_link = link.get("href").strip()
item_text = link.text.replace("View Position","").strip()
writer.writerow([item_link, item_text])
print(item_link, item_text)
outfile.close()