How to read a KML file with a specific identifier in Python?
I am trying to read these KML files provided by the German Weather Service (DWD): example_data
With the following code I cannot access the dwd: children:
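from zipfile import ZipFile
from urllib.request import urlretrieve
# assumption: the undefined name `parser` in the original snippet is pykml's parser
from pykml import parser

url = 'http://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10641/kml/MOSMIX_L_LATEST_10641.kmz'
# save the download under the file name that is opened below
urlretrieve(url, "local_data.kmz")
kmz = ZipFile("local_data.kmz", 'r')
# the KMZ archive is a zip file; read the first member, which is the KML document
kml = kmz.open(kmz.filelist[0].filename, 'r').read()
root = parser.fromstring(kml)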
With the command root.Document.Placemark.ExtendedData.getchildren()
I can access the following list (length 114, truncated here):
[<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71705b2b08>,
<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706befc8>,
<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706bef88>,
<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706bebc8>,
<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706bea88>,
<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706beb08>,
<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706be848>,
<Element {https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd}Forecast at 0x7f71706be988>]
But with root.Document.Placemark.ExtendedData.Forecast
I get the following error message:
AttributeError: no such child: {http://www.opengis.net/kml/2.2}Forecast
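I guess the problem is that the lookup uses the standard opengis KML schema, while the Forecast elements live in the dwd namespace. How can I access the data?
Here is the head of the file: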
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<kml:kml xmlns:dwd="https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:xal="urn:oasis:names:tc:ciq:xsdschema:xAL:2.0" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
<kml:Document>
<kml:ExtendedData>
<dwd:ProductDefinition>
<dwd:Issuer>Deutscher Wetterdienst</dwd:Issuer>
<dwd:ProductID>DWD_MOSMIX_1H</dwd:ProductID>
<dwd:GeneratingProcess>DWD MOSMIX hourly, Version 1.0</dwd:GeneratingProcess>
<dwd:IssueTime></dwd:IssueTime>
<dwd:ReferencedModel>
<dwd:Model dwd:name="ICON" dwd:referenceTime="2018-05-17T00:00:00Z"/>
<dwd:Model dwd:name="ECMWF/IFS" dwd:referenceTime="2018-05-17T00:00:00Z"/>
</dwd:ReferencedModel>
<dwd:ForecastTimeSteps>
<dwd:TimeStep>2018-05-17T10:00:00.000Z</dwd:TimeStep>
<dwd:TimeStep>2018-05-17T11:00:00.000Z</dwd:TimeStep>
<dwd:TimeStep>2018-05-17T12:00:00.000Z</dwd:TimeStep>
<dwd:TimeStep>2018-05-17T13:00:00.000Z</dwd:TimeStep>
<dwd:TimeStep>2018-05-17T14:00:00.000Z</dwd:TimeStep>
<dwd:TimeStep>2018-05-17T15:00:00.000Z</dwd:TimeStep>
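For parsing custom elements out of a KML/KMZ file, the BeautifulSoup Python library is probably the simplest option. Its API can find all matching elements or a single element by id, class, attribute name/value, and so on. It also parses XML leniently, so malformed elements will not break it.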
import requests
from zipfile import ZipFile
from bs4 import BeautifulSoup
url = 'http://opendata.dwd.de/weather/local_forecasts/mos/MOSMIX_L/single_stations/10641/kml/MOSMIX_L_LATEST_10641.kmz'
r = requests.get(url)
with open('local_data.kmz', 'wb') as fout:
    fout.write(r.content)
with ZipFile('local_data.kmz', 'r') as kmz:
    kml = kmz.open(kmz.filelist[0].filename, 'r').read()
soup = BeautifulSoup(kml, 'xml')
# iterate over each TimeStep element in dwd:ForecastTimeSteps
steps = soup.find("dwd:ForecastTimeSteps")
for step in steps.find_all("dwd:TimeStep"):
    print(step.text)
Output:
2021-10-20T16:00:00.000Z
2021-10-20T17:00:00.000Z
2021-10-20T18:00:00.000Z
...
2021-10-30T20:00:00.000Z
2021-10-30T21:00:00.000Z
2021-10-30T22:00:00.000Z
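If you would rather stay with a namespace-aware parser, the dwd: elements can also be reached by passing an explicit prefix-to-URI map to the lookup. Below is a minimal sketch using the standard library's xml.etree.ElementTree instead of BeautifulSoup. The namespace URIs are copied from the file header in the question; SAMPLE is a shortened, hypothetical stand-in for the real document, and the helper names read_time_steps and read_issuer are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map; the URIs are copied from the file header in the question.
NS = {
    "kml": "http://www.opengis.net/kml/2.2",
    "dwd": "https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd",
}

# Shortened stand-in for the real KML document (hypothetical: two time steps only).
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<kml:kml xmlns:dwd="https://opendata.dwd.de/weather/lib/pointforecast_dwd_extension_V1_0.xsd"
         xmlns:kml="http://www.opengis.net/kml/2.2">
  <kml:Document>
    <kml:ExtendedData>
      <dwd:ProductDefinition>
        <dwd:Issuer>Deutscher Wetterdienst</dwd:Issuer>
        <dwd:ForecastTimeSteps>
          <dwd:TimeStep>2018-05-17T10:00:00.000Z</dwd:TimeStep>
          <dwd:TimeStep>2018-05-17T11:00:00.000Z</dwd:TimeStep>
        </dwd:ForecastTimeSteps>
      </dwd:ProductDefinition>
    </kml:ExtendedData>
  </kml:Document>
</kml:kml>"""


def read_time_steps(kml_bytes):
    """Return the text of every dwd:TimeStep element, in document order."""
    root = ET.fromstring(kml_bytes)
    path = ".//dwd:ForecastTimeSteps/dwd:TimeStep"
    return [el.text for el in root.iterfind(path, NS)]


def read_issuer(kml_bytes):
    """Return the text of the first dwd:Issuer element, or None if absent."""
    root = ET.fromstring(kml_bytes)
    return root.findtext(".//dwd:Issuer", namespaces=NS)


time_steps = read_time_steps(SAMPLE)
issuer = read_issuer(SAMPLE)
print(issuer)
for step in time_steps:
    print(step)
```

To run this on the real file, pass the kml bytes read from the KMZ archive instead of SAMPLE. The same idea explains the AttributeError in the question: attribute access in lxml.objectify looks for the child in the parent's namespace (kml), so a child from another namespace has to be requested with its full name, for example getattr(parent, "{namespace-uri}Forecast").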