Removing the zip code from a Python list (to obtain the state name from MapQuest output)
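This should be simple, but I can't get it to work.
The MapQuest geocoding API returns strings like the ones below. I want to isolate the state name from them, which is a bit tricky. Think of 'Pennsylvania Avenue' (in D.C.), or of 'Washington', which can be a state, a street name, or a city.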
s = "Goldman Sachs Tower, 200, West Street, Battery Park City, Manhattan Community Board 1, New York County, NYC, New York, 10282, United States of America"
s = "9th St NW, Logan Circle/Shaw, Washington, District of Columbia, 20001, United States of America"
s = "Casper, Natrona County, Wyoming, United States of America"
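But I noticed that MapQuest writes the state name just before the zip code, near the end of the string.
To get the state name, this works, provided there is a zip code: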
s = s.split(",")
s = [x.strip() for x in s]
state = s[-3]
However, when there is no zip code, as in the third string, I get the county (Natrona County) instead.
I tried to eliminate the zip code like this:
s = s.split(",")
s = [x.strip() for x in s if '\d{5}' not in x ]
But the regular expression '\d{5}' does not work: I want Wyoming, not Natrona County.
Use the re module:
import re
s = "9th St NW, Logan Circle/Shaw, Washington, District of Columbia, 20001, United States of America"
s = s.split(",")
number = re.compile(r"\d{5}")
s = [x.strip() for x in s if not number.search(x)]
print(s)
print(s[-2])
Output:
['9th St NW', 'Logan Circle/Shaw', 'Washington', 'District of Columbia', 'United States of America']
District of Columbia
Here is a simple little tutorial: regex tutorial
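The original attempt fails because `'\d{5}' not in x` is a plain substring test: it looks for the literal characters backslash, d, {5} in each field and never treats them as a pattern. As a minimal sketch of the approach above wrapped in a reusable function, the following assumes Python 3.4 or later and assumes the state is always the field immediately before the country once any zip code is dropped. The names `ZIP_RE` and `extract_state` are hypothetical, chosen only for illustration. It uses `fullmatch` rather than `search`, so only a field that consists entirely of a zip code is removed.

```python
import re

# A field counts as a zip code only if the WHOLE field is five digits,
# optionally followed by a ZIP+4 suffix such as "-1234".
ZIP_RE = re.compile(r"\d{5}(?:-\d{4})?")


def extract_state(address):
    """Return the field just before the country, skipping any zip code field.

    Assumes the comma-separated layout shown in the question:
    "..., <state>, [<zip>,] <country>".
    """
    parts = [p.strip() for p in address.split(",")]
    # Drop fields that are entirely a zip code; "200" or "Board 1" stay.
    parts = [p for p in parts if not ZIP_RE.fullmatch(p)]
    if len(parts) < 2:
        return None  # not enough fields to contain both state and country
    return parts[-2]


addresses = [
    "Goldman Sachs Tower, 200, West Street, Battery Park City, Manhattan Community Board 1, New York County, NYC, New York, 10282, United States of America",
    "9th St NW, Logan Circle/Shaw, Washington, District of Columbia, 20001, United States of America",
    "Casper, Natrona County, Wyoming, United States of America",
]
states = [extract_state(a) for a in addresses]
print(states)
```

For the three sample strings from the question this yields New York, District of Columbia, and Wyoming, whether or not a zip code is present.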