Pandas - Add items to dataframe
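I am trying to add row items to a dataframe, but I cannot get the dataframe to update.
What I have tried so far is commented out because it does not do what I need.
I just want to download the JSON file and store it in a dataframe with the given columns. It seems I cannot extract the sub-components from the JSON file and store them in a brand-new dataframe.
Please find my code below: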
import requests, json, urllib
import pandas as pd
url = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
data = pd.read_json(url)
headers = []
df = pd.DataFrame()
for key, item in data['vulnerabilities'].items():
    for k in item.keys():
        headers.append(k)
col = list(set(headers))
new_df = pd.DataFrame(columns=col)
for item in data['vulnerabilities'].items():
    print(item[1])
    # new_df['product'] = item[1]['product']
    # new_df['vendorProject'] = item[1]['vendorProject']
    # new_df['dueDate'] = item[1]['dueDate']
    # new_df['shortDescription'] = item[1]['shortDescription']
    # new_df['dateAdded'] = item[1]['dateAdded']
    # new_df['vulnerabilityName'] = item[1]['vulnerabilityName']
    # new_df['cveID'] = item[1]['cveID']
    # new_df.append(item[1], ignore_index = True)
new_df
In the end, my df is still empty.
Nested JSON data can be converted directly into a flat dataframe with pd.json_normalize(). The headers are extracted from the JSON itself.
new_df = pd.json_normalize(data['vulnerabilities'])  # json_normalize already returns a DataFrame
Update: this specifically un-nests the vulnerabilities column.
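As a minimal offline sketch of the json_normalize() approach: the payload below is a hypothetical stand-in for the downloaded feed. The field names are taken from the question's code, but every value (and the "title" key) is made up for illustration. It assumes the feed is a top-level object whose "vulnerabilities" key holds a list of records.

```python
import pandas as pd

# Hypothetical stand-in for the parsed feed; all values are invented.
payload = {
    "title": "sample catalog",
    "vulnerabilities": [
        {
            "cveID": "CVE-0000-0001",
            "vendorProject": "ExampleVendor",
            "product": "ExampleProduct",
            "vulnerabilityName": "Example buffer overflow",
            "dateAdded": "2022-01-01",
            "shortDescription": "Placeholder description one.",
            "dueDate": "2022-02-01",
        },
        {
            "cveID": "CVE-0000-0002",
            "vendorProject": "OtherVendor",
            "product": "OtherProduct",
            "vulnerabilityName": "Example injection",
            "dateAdded": "2022-01-02",
            "shortDescription": "Placeholder description two.",
            "dueDate": "2022-02-02",
        },
    ],
}

# One row per record; the column names come from the record keys.
new_df = pd.json_normalize(payload["vulnerabilities"])

# Equivalent call that starts from the whole object and names the nested list.
same_df = pd.json_normalize(payload, record_path="vulnerabilities")

print(new_df.shape)
print(new_df.columns.tolist())
```

With the real feed, the same call would be applied to the parsed JSON instead of the sample payload; no manual header collection or row loop is needed.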
It works with this:
import requests, json, urllib
import pandas as pd
url = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
data = pd.read_json(url)
headers = []
df = pd.DataFrame()
# Collect every key that appears in any vulnerability record
for key, item in data['vulnerabilities'].items():
    for k in item.keys():
        headers.append(k)
col = list(set(headers))
new_df = pd.DataFrame(columns=col)
for item in data['vulnerabilities'].items():
    # Append the record as a new row at the next integer position
    new_df.loc[len(new_df.index)] = item[1]  # <=== THIS
new_df.head()
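For context on why the commented-out attempts in the question left the frame empty: assigning a scalar to a column of a frame that has no rows creates no rows, and DataFrame.append returned a new frame instead of modifying new_df in place (the method was removed in pandas 2.0). A possible alternative to growing the frame one row at a time is to collect the rows in a plain list and build the frame once. The sketch below uses hypothetical records with made-up values and only two of the columns from the question.

```python
import pandas as pd

# Hypothetical records standing in for the feed's vulnerability entries.
records = [
    {"cveID": "CVE-0000-0001", "product": "ExampleProduct", "dueDate": "2022-02-01"},
    {"cveID": "CVE-0000-0002", "product": "OtherProduct", "dueDate": "2022-02-02"},
]

# Pick out the wanted sub-components of each record into a list of row dicts.
rows = []
for record in records:
    rows.append({"cveID": record["cveID"], "product": record["product"]})

# Build the frame in a single step from the collected rows.
new_df = pd.DataFrame(rows, columns=["cveID", "product"])
print(new_df)
```

Building the frame once avoids reallocating it on every iteration, which tends to matter as the number of records grows.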