Data crawled is repeated when exporting to csv

Question

I am trying to crawl data from this website. My idea is to crawl all the links on the site, then use a for loop to send a request to each link and get the detailed data.

This is my code. As you can see, I use the selenium web driver to open the URL and then use Beautiful Soup to scrape the data.

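Well, it runs successfully, but when the data is exported to the CSV file, the fields from upload_date to number_employees for every link after the first one are the same as those of the first link.

The upload_date to number_employees values for each link are presented on the page in this box.

How should I solve this problem?
Thank you very much. <3 P/s: I have another question: I need to log in to the site to crawl the salary in each link, but I have not found an answer yet.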

Answer 1

When you save the crawled data, you always append the same values inside the loop, here:

upload_date = content[0]
position = content[1]
career = content[2]
skill = content[3]
language_of_cv = content[4]
detail_address = content[5]
number_employees = content[6]

You have to iterate over the scraped data, taking these values from each link's own content, to save everything in the csv file.
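A minimal sketch of that idea follows. The asker's full script is not shown, so this assumes each link yields a list `content` holding the seven box values in the order used above. The names `FIELDS`, `rows_from_pages`, `write_csv` and the sample values are hypothetical, and the selenium and Beautiful Soup steps are replaced by hard-coded lists so the example runs on its own.

```python
import csv
import io

# Column order assumed to match content[0] .. content[6] in the snippet above.
FIELDS = ["upload_date", "position", "career", "skill",
          "language_of_cv", "detail_address", "number_employees"]


def rows_from_pages(pages):
    """Build one row per link, reading the values inside the loop."""
    rows = []
    for content in pages:  # content: values scraped from one link's box
        # Map this link's values to the field names before appending,
        # so no row reuses values left over from an earlier link.
        rows.append(dict(zip(FIELDS, content)))
    return rows


def write_csv(rows, handle):
    """Write a header line followed by one line per row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder data standing in for two scraped links (not real results).
pages = [
    ["2021-01-01", "position A", "career A", "skill A",
     "language A", "address A", "10-24"],
    ["2021-01-02", "position B", "career B", "skill B",
     "language B", "address B", "25-99"],
]

rows = rows_from_pages(pages)
buffer = io.StringIO()  # swap for open("output.csv", "w", newline="") to write a file
write_csv(rows, buffer)
csv_text = buffer.getvalue()
```

In the real script, the body of the `for` loop would be the place to load each link with the driver and parse it, so that `content` is rebuilt for every link before its row is appended.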

python

selenium

beautifulsoup

web-crawler

web-scraping