How do I adjust this nested loop to store the output of different URL requests in separate databases or .csv files?

Question

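So I'm working on a simple project, but apparently I'm stuck at the first step. Basically, I'm requesting .json files from a public GitHub repository. I want to download 7 different files and convert them into 7 differently named databases.

I tried this nested loop to create 7 different csv files. The only problem is that it gives me 7 differently named csv files that all have the same content (that of the last URL). I think it has something to do with the way I store the JSON output in the list "data". How can I fix this?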

import pandas as pd
import datetime
import re, json, requests #this is needed to import the data from the github repository

naz_l_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-andamento-nazionale-latest.json'
naz_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-andamento-nazionale.json'
reg_l_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-regioni-latest.json'
reg_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-regioni.json'
prov_l_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-province-latest.json'
prov_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-province.json'
news_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-note.json'

list_of_url= [naz_l_url,naz_url, reg_l_url,reg_url,prov_url,prov_l_url,news_url]
csv_names = ['01','02','03','04','05','06','07']

for i in list_of_url:
 resp = requests.get(i)
 data = pd.read_json(resp.text, convert_dates=True)
 for x in csv_names:
  data.to_csv(f"{x}_df.csv")

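I'd like to try two different approaches: one loop that gives me csv files, and another loop that gives me pandas DataFrames. But first I need to fix the problem of the loop giving me the same output every time.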

Answer 1

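The problem is that you iterate over the complete list of names for every URL you download. Note how "for x in csv_names" sits inside the "for i in list_of_url" loop. Each download is therefore written to all seven files, overwriting whatever the previous download put there, so after the final iteration every file holds the data from the last URL.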
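To make the overwriting visible without touching the network, here is a minimal sketch that mimics the nested loop. The names fake_downloads and nested_write are made up for illustration, and a plain dict stands in for the files on disk:

```python
# Stand-in for the real downloads: each "URL" maps to the content it would return.
# These keys and values are invented for illustration.
fake_downloads = {"url_a": "content A", "url_b": "content B", "url_c": "content C"}
names = ["01", "02", "03"]


def nested_write(downloads, names):
    """Mimic the nested loop: every download is written to every file name."""
    files = {}  # plays the role of the file system
    for url, content in downloads.items():
        for name in names:
            # Same file names on every outer iteration, so earlier content is lost.
            files[f"{name}_df.csv"] = content
    return files


result = nested_write(fake_downloads, names)
print(result)
# {'01_df.csv': 'content C', '02_df.csv': 'content C', '03_df.csv': 'content C'}
```

Every entry ends up with the content of the last "URL", which is the same symptom as the seven identical csv files.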

Where the problem is

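Python uses indentation levels to determine where a loop body begins and ends (where other languages might use curly braces, begin/end, or do/end). I suggest you brush up on this topic, for example with "Concept of Indentation in Python". The official documentation on compound statements covers it as well.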
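As a small illustration of how indentation decides how often a statement runs, the sketch below counts executions at two indentation levels. The list contents are placeholders, not the real URLs or names:

```python
outer_calls = 0
inner_calls = 0

for url in ["a", "b", "c"]:
    outer_calls += 1      # indented once: runs once per URL
    for name in ["01", "02"]:
        inner_calls += 1  # indented twice: runs once per (URL, name) pair

print(outer_calls, inner_calls)  # 3 6
```

A statement that should run once per URL, such as writing that URL's csv file, belongs at the first indentation level.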

Suggested solution

I suggest you change how the files are named and derive each file name from its URL instead, like this:

import pandas as pd
import datetime
import re, json, requests #this is needed to import the data from the github repository
from urllib.parse import urlparse

naz_l_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-andamento-nazionale-latest.json'
naz_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-andamento-nazionale.json'
reg_l_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-regioni-latest.json'
reg_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-regioni.json'
prov_l_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-province-latest.json'
prov_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-province.json'
news_url = 'https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-json/dpc-covid19-ita-note.json'

list_of_url= [naz_l_url,naz_url, reg_l_url,reg_url,prov_url,prov_l_url,news_url]
csv_names = ['01','02','03','04','05','06','07']

for url in list_of_url:
    resp = requests.get(url)
    data = pd.read_json(resp.text, convert_dates=True)
    # here is where you DON'T want to have a nested `for` loop
    file_name = urlparse(url).path.split('/')[-1].replace('json', 'csv')
    data.to_csv(file_name)
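If you would rather keep the numbered names from csv_names, one alternative (not part of the code above) is to pair each URL with exactly one name using zip, and to keep the DataFrames in a dict so you get both the csv files and the frames from a single loop. This is only a sketch: pair_urls_with_names and download_frames are hypothetical helper names, the example.com URLs are placeholders, and the timeout and raise_for_status calls are additions of mine.

```python
def pair_urls_with_names(urls, names):
    """Return (url, csv_file_name) pairs, one per URL, in order."""
    if len(urls) != len(names):
        raise ValueError("need exactly one name per URL")
    return [(url, f"{name}_df.csv") for url, name in zip(urls, names)]


def download_frames(pairs):
    """Download each URL once, write its csv, and return {file_name: DataFrame}."""
    # Imported here so the pairing helper above can be tried without these installed.
    from io import StringIO

    import pandas as pd
    import requests

    frames = {}
    for url, file_name in pairs:
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()  # fail loudly on a bad response
        data = pd.read_json(StringIO(resp.text), convert_dates=True)
        data.to_csv(file_name)      # one file per URL, no inner loop
        frames[file_name] = data    # keep the DataFrame for later use
    return frames


# Placeholder URLs for illustration only; substitute list_of_url and csv_names.
example_pairs = pair_urls_with_names(
    ["https://example.com/a.json", "https://example.com/b.json"],
    ["01", "02"],
)
print(example_pairs)
```

With the real lists you would call download_frames(pair_urls_with_names(list_of_url, csv_names)). Note that the names are assigned in the order the URLs appear in list_of_url.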


Tags: python, nested
