Address standardization within a database
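Working in MS Access 2013. I have a ton of locations/addresses that need to be standardized.
Examples include addresses like:
- 500 W Main St
- 500 West Main St
- 500 West Main Street
You get the point.
I've considered running a query that pulls all records where the Left(7) or so characters exist more than once in the database, but that logic has obvious flaws.
Is there a function, query, or anything else that would help me generate a list of records whose address might exist multiple times in slightly different forms?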
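This is a tricky business: equal parts black magic and science. You'd be amazed at the variations on Boulevard alone.
This is why I use the Google API. It can be time-consuming for the initial data set, but after that only new additions have to be resolved.
For example,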
https://maps.googleapis.com/maps/api/geocode/json?address=500 S Main St,Providence RI 02903
returns, in part,
"formatted_address" : "500 S Main St, Providence, RI 02903, USA"
The good news is that
https://maps.googleapis.com/maps/api/geocode/json?address=500 South Main Steet,Providence RI 02903
returns the same formatted address as the previous query:
"formatted_address" : "500 S Main St, Providence, RI 02903, USA"
VBA example:
After executing the following code...
' VBA project Reference required:
' Microsoft XML, v3.0
Dim httpReq As New MSXML2.ServerXMLHTTP
httpReq.Open "GET", "https://maps.googleapis.com/maps/api/geocode/json?address=500 South Main Steet,Providence RI 02903", False
httpReq.send
Dim response As String
response = httpReq.responseText
...the string variable response
contains the following JSON data:
{
"results" : [
{
"address_components" : [
{
"long_name" : "500",
"short_name" : "500",
"types" : [ "street_number" ]
},
{
"long_name" : "South Main Street",
"short_name" : "S Main St",
"types" : [ "route" ]
},
{
"long_name" : "Fox Point",
"short_name" : "Fox Point",
"types" : [ "neighborhood", "political" ]
},
{
"long_name" : "Providence",
"short_name" : "Providence",
"types" : [ "locality", "political" ]
},
{
"long_name" : "Providence County",
"short_name" : "Providence County",
"types" : [ "administrative_area_level_2", "political" ]
},
{
"long_name" : "Rhode Island",
"short_name" : "RI",
"types" : [ "administrative_area_level_1", "political" ]
},
{
"long_name" : "United States",
"short_name" : "US",
"types" : [ "country", "political" ]
},
{
"long_name" : "02903",
"short_name" : "02903",
"types" : [ "postal_code" ]
},
{
"long_name" : "2915",
"short_name" : "2915",
"types" : [ "postal_code_suffix" ]
}
],
"formatted_address" : "500 S Main St, Providence, RI 02903, USA",
"geometry" : {
"bounds" : {
"northeast" : {
"lat" : 41.82055829999999,
"lng" : -71.4028137
},
"southwest" : {
"lat" : 41.8204014,
"lng" : -71.40319219999999
}
},
"location" : {
"lat" : 41.8204799,
"lng" : -71.40300289999999
},
"location_type" : "ROOFTOP",
"viewport" : {
"northeast" : {
"lat" : 41.8218288302915,
"lng" : -71.40165396970851
},
"southwest" : {
"lat" : 41.8191308697085,
"lng" : -71.40435193029151
}
}
},
"partial_match" : true,
"place_id" : "ChIJicPQAT9F5IkRfq2njkYqZtE",
"types" : [ "premise" ]
}
],
"status" : "OK"
}
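Once you have the response text, the part you want for standardization is formatted_address. The partial_match flag is also worth checking, since it is true in the response above, where the street name was misspelled in the request. A rough sketch of pulling both out follows, in Python for brevity (from VBA you would need a JSON parser of your choice). The function name extract_standard_address is made up for this example, and the sample string is a trimmed copy of the response above.

```python
import json


def extract_standard_address(response_text):
    """Return (formatted_address, partial_match) for the first result.

    Returns (None, False) when the status is not "OK" or there are no results.
    extract_standard_address is a hypothetical helper name.
    """
    data = json.loads(response_text)
    if data.get("status") != "OK" or not data.get("results"):
        return None, False
    first = data["results"][0]
    return first.get("formatted_address"), bool(first.get("partial_match", False))


# Trimmed copy of the response shown above
sample = """
{
  "results": [
    {
      "formatted_address": "500 S Main St, Providence, RI 02903, USA",
      "partial_match": true
    }
  ],
  "status": "OK"
}
"""

address, partial = extract_standard_address(sample)
print(address, partial)
```

Records that come back with partial_match set are probably the ones to review by hand before you overwrite anything.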
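John's answer is absolutely correct. I'd also add that you can achieve the same thing with the HERE API. You can do this for free with HERE Maps, and no credit card is required to get started.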
https://geocode.search.hereapi.com/v1/geocode?q=500 West Main Street&apiKey=YOUR_API_KEY
Returns:
{
"items": [
{
"title": "500 W Main St, Alhambra, CA 91801-3308, United States",
"id": "here:af:streetsection:-2rEzgpCkFyX.gMQjWtV1A:CgcIBCCl6q07EAEaAzUwMChk",
"resultType": "houseNumber",
"houseNumberType": "PA",
"address": {
"label": "500 W Main St, Alhambra, CA 91801-3308, United States",
"countryCode": "USA",
"countryName": "United States",
"state": "California",
"county": "Los Angeles",
"city": "Alhambra",
"street": "W Main St",
"postalCode": "91801-3308",
"houseNumber": "500"
},
"position": {
"lat": 34.09193,
"lng": -118.13238
},
"access": [
{
"lat": 34.09241,
"lng": -118.13272
}
],
"mapView": {
"west": -118.13347,
"south": 34.09103,
"east": -118.13129,
"north": 34.09283
},
"scoring": {
"queryScore": 1.0,
"fieldScore": {
"streets": [
1.0
],
"houseNumber": 1.0
}
}
},
additional results...
So you can normalize your data based on the title.
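To get back to the original question (a list of records whose address exists more than once in slightly different forms), one option is to group record IDs by the standardized title and keep only the groups with more than one member. The sketch below is in Python and uses a canned lookup in place of a real geocoder call. The function names, the record IDs, and the canned mapping are all assumptions made up for illustration, apart from the one title taken from the response above.

```python
from collections import defaultdict


def find_duplicates(records, standardize):
    """Group record ids by standardized address.

    records: iterable of (record_id, raw_address) pairs.
    standardize: callable returning a standardized string, or None if unresolved.
    Returns only the groups that contain more than one record id.
    """
    groups = defaultdict(list)
    for record_id, raw_address in records:
        key = standardize(raw_address)
        if key is not None:
            groups[key].append(record_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Canned stand-in for a geocoder call; a real version would query the API.
TITLE = "500 W Main St, Alhambra, CA 91801-3308, United States"
CANNED = {
    "500 W Main St": TITLE,         # assumed to resolve to the same title
    "500 West Main Street": TITLE,  # title taken from the response above
    "12 Example Rd": "12 Example Rd (made-up standardized form)",
}


def fake_standardize(raw_address):
    return CANNED.get(raw_address)


records = [(1, "500 W Main St"), (2, "500 West Main Street"), (3, "12 Example Rd")]
duplicates = find_duplicates(records, fake_standardize)
print(duplicates)
```

In practice you would probably store the standardized title in its own column, so that only newly added records need a lookup.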