XML SDMX reading in Python
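I am struggling to read an SDMX XML file with Python from the following link:
https://www.newyorkfed.org/xml/fedfunds.html or the direct link.
Ideally I would like to get the fund rates into a dataframe. I tried pandasdmx, but it does not seem to work with this source.
My current code: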
from urllib.request import urlopen
import xml.etree.ElementTree as ET

url = "https://websvcgatewayx2.frbny.org/autorates_fedfunds_external/services/v1_0/fedfunds/xml/retrieve?typ=RATE&f=03012016&t=04032020"
d2 = urlopen(url).read()
root = ET.fromstring(d2)
# Walk every element and print the ones that carry an OBS_VALUE attribute
for elem in root.iter():
    k = elem.get('OBS_VALUE')
    if k is not None:
        print(k)
I would like something like this:
            FUNDRATE_OBS_POINT='1%'  FUNDRATE_OBS_POINT='25%'
2020-04-02  0.03                     0.05
2020-04-01  0.03                     0.05
2020-04-01  0.01                     0.05
I find this approach very ugly: for every "data" element I need to check that it is not None. Is there a better way?
Try the following:
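from lxml import etree
import pandas as pd
import requests

# `url` is the same address as in the question
resp = requests.get(url)
doc = etree.fromstring(resp.content)

headers = []
dates = []
columns = []

# One Series element per observation point (1%, 25%, ...)
fop = doc.xpath('//Series[@FUNDRATE_OBS_POINT]')

# Note: an XPath that starts with '//' searches the whole document, even when
# it is called on an element; './/' would restrict it to that element's subtree.
datpath = fop[0].xpath('//*[@*="ns13:ObsType"]')
for dat in datpath:
    dates.append(dat.attrib.get('TIME_PERIOD'))

for item in fop:
    headers.append(item.attrib.get('FUNDRATE_OBS_POINT'))
    entries = item.xpath('//*[@*="ns13:ObsType"]')
    column = []
    for entry in entries:
        column.append(entry.attrib.get('OBS_VALUE'))
    columns.append(column)

# Empty frame indexed by date, then one column per observation point
df = pd.DataFrame(columns=headers, index=dates)
for a, b in zip(headers, columns):
    df[a] = b
df.head(3)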
Output:
            1%    25%   50%   75%   99%   TARGET_HIGH  TARGET_LOW
2020-04-02  0.03  0.03  0.03  0.03  0.03  0.03         0.03
2020-04-01  0.03  0.03  0.03  0.03  0.03  0.03         0.03
2020-03-31  0.01  0.01  0.01  0.01  0.01  0.01         0.01
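Every column in the output above carries the same values, which is what one would expect if each lookup scans the whole document rather than a single Series. A possible variation, sketched here with only the standard library, is to search relative to each Series and to select only elements that actually have an OBS_VALUE attribute, so no None check is needed. The SAMPLE document and the series_table helper are hypothetical: the sample is a namespace-free stand-in modeled on the attribute names in this thread, not the real feed, and its values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the feed (no namespaces, made-up values).
SAMPLE = """<DataSet>
  <Series FUNDRATE_OBS_POINT="1%">
    <Obs TIME_PERIOD="2020-04-02" OBS_VALUE="0.03"/>
    <Obs TIME_PERIOD="2020-04-01" OBS_VALUE="0.03"/>
  </Series>
  <Series FUNDRATE_OBS_POINT="25%">
    <Obs TIME_PERIOD="2020-04-02" OBS_VALUE="0.05"/>
    <Obs TIME_PERIOD="2020-04-01" OBS_VALUE="0.05"/>
  </Series>
</DataSet>"""


def series_table(xml_text):
    """Return {date: {obs_point: value}} built from every Series element."""
    root = ET.fromstring(xml_text)
    table = {}
    for series in root.iter("Series"):
        point = series.get("FUNDRATE_OBS_POINT")
        if point is None:
            continue  # skip any Series that is not keyed by observation point
        # './/' keeps the search inside this Series; the [@OBS_VALUE]
        # predicate matches only elements that have that attribute.
        for obs in series.findall(".//*[@OBS_VALUE]"):
            date = obs.get("TIME_PERIOD")
            table.setdefault(date, {})[point] = float(obs.get("OBS_VALUE"))
    return table


table = series_table(SAMPLE)
print(table["2020-04-02"])  # {'1%': 0.03, '25%': 0.05}
```

If pandas is available, pd.DataFrame.from_dict(table, orient="index") should turn this mapping into a frame with dates as rows and observation points as columns. The real feed may use namespaced tags, in which case the tag name passed to iter() would need adjusting.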