How to identify locations from text

Here is a sample of the function I use to get the code:

df <- read.csv("secondary.csv", header = TRUE)
S <- "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"

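I suggest building every possible string of length N - x, where N is the length of the string and x is a variable number of characters dropped from the front. In other words, build every suffix of the string: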

# Split S into single characters, then paste each tail back together to get every suffix
allchr <- unlist(strsplit(S, ""))
listsubstr <- sapply(seq_along(allchr), function(i) paste0(allchr[i:length(allchr)], collapse = ""))

  # [1] "s / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"
  # [2] " / O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001" 
  # [3] "/ O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001"  
  # [4] " O sk hungu 101 / 90 MODEL HOUSE TALAB GAGNI SHUKUL LUCKNOW UTTAR PRADESH LUCKNOW UTTAR PRADESH 226001" 
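
The suffix list above can also be built without splitting into characters. The following is a minimal sketch using base R's substring(); the shortened example string and the names suffixes, words and word_suffixes are mine, not from the question. The second half assumes that a location always starts at a word boundary, which may not hold for your data but cuts the number of candidates considerably.

```r
# Shortened example string, for illustration only
S <- "MODEL HOUSE LUCKNOW 226001"

# Character-level suffixes: substring() is vectorized over its start positions
suffixes <- substring(S, seq_len(nchar(S)))

# Word-level suffixes (assumption: candidates begin at a word boundary)
words <- strsplit(S, " +")[[1]]
word_suffixes <- vapply(
  seq_along(words),
  function(i) paste(words[i:length(words)], collapse = " "),
  character(1)
)

print(head(suffixes, 3))
print(word_suffixes)
```

With the word-level variant, the 26-character example yields 4 candidates instead of 26.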

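You can then iterate over this list and check each entry for a valid geocode. I have to give pseudocode here, because I am not sure how to check whether a string is a valid geocode.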

sapply(listsubstr, is.geocode)     # pseudocode: is.geocode() is a placeholder, not an existing function
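
Since is.geocode() is only a placeholder, here is one way it could be filled in, purely as a sketch. It assumes that "valid geocode" means "starts with a name from a list of known places"; the vector known_places is hypothetical, and the real check might instead be a lookup in your own data or a call to a geocoding service.

```r
# Hypothetical list of known place names (assumption, not from the question)
known_places <- c("LUCKNOW", "UTTAR PRADESH")

# Stand-in for the is.geocode() placeholder:
# TRUE when the string starts with any known place name
is.geocode <- function(x) {
  any(startsWith(toupper(x), known_places))
}

# Shortened example string, for illustration only
S <- "90 MODEL HOUSE LUCKNOW UTTAR PRADESH 226001"
suffixes <- substring(S, seq_len(nchar(S)))

# Keep only the suffixes that pass the check
hits <- suffixes[vapply(suffixes, is.geocode, logical(1), USE.NAMES = FALSE)]
print(hits)
```

Using vapply() with logical(1) rather than sapply() guarantees a logical vector even when the input is empty.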

You could also do this with recursion.

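myfun <- function(x) {
  if (nchar(x) == 0) {
    return(NA_character_)            # nothing left to test: no suffix matched
  } else if (is.geocode(x)) {        # pseudocode: is.geocode() is a placeholder
    return(x)
  } else {
    myfun(substr(x, 2, nchar(x)))    # drop the first character and try again
  }
}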
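
The recursive search above can also be written as a loop, which avoids deep call stacks on long strings. This is a minimal sketch: first_suffix_match and starts_with_lucknow are hypothetical names, and the toy predicate only stands in for whatever geocode check you end up using.

```r
# Iterative version of the recursive search.
# `pred` is any function that takes one string and returns TRUE or FALSE.
first_suffix_match <- function(x, pred) {
  while (nchar(x) > 0) {
    if (pred(x)) {
      return(x)                      # first (longest) suffix that passes
    }
    x <- substr(x, 2, nchar(x))      # drop the first character
  }
  NA_character_                      # no suffix passed the check
}

# Toy predicate, for illustration only
starts_with_lucknow <- function(x) startsWith(x, "LUCKNOW")

result <- first_suffix_match("HOUSE LUCKNOW 226001", starts_with_lucknow)
print(result)
```

Passing the check in as an argument lets you swap in a real geocode test later without touching the loop.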