How to tidy a "Price" column containing "," into a plain number format? For example, 1,250,000 to 1250000 (R programming)
I tried to remove the commas in the Price column with the following code:
property2 %>%
select(Price) %>%
str_remove_all(",")
But the result returned looks like this:
\" \"525000\" \"300000\" \"490000\" \"4100000\" \"750000\" \"2130000\" \"585000\" \"2480000\" \"710000\" \"565000\" \"1400000\" \"880000\" \"3500000\" \"1230000\" \"3150000\" \"499000\" \"480000\" \"475000\" \"2700000\" \"6500000\" \"5100000\" \"5000000\" \"5500000\" \"480000\" \"540000\")"
Warning message:
In stri_replace_all_regex(string, pattern, fix_replacement(replacement), :
argument is not an atomic vector; coercing
Data info:
Location Price Rooms add_rooms Bathrooms `Car Parks`
<chr> <chr> <chr> <chr> <dbl> <dbl>
1 KLCC 1,25~ 2 1 3 2
2 Damansa~ 6,80~ 6 NA 7 NA
3 Dutamas 1,03~ 3 NA 4 2
4 Cheras NA NA NA NA NA
5 Bukit J~ 900,~ 4 1 3 2
6 Taman T~ 5,35~ 4 2 5 4
7 Seputeh NA NA NA NA NA
8 Taman T~ 2,60~ 5 NA 4 4
9 Taman T~ 1,95~ 4 1 4 3
10 Sri Pet~ 385,~ 3 NA 2 1
You can use gsub to remove the "," and then convert the result to a numeric vector.
test_df <- structure(list(Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000")), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))
# Remove the commas (NA values stay NA)
test_df$Price <- gsub(",", "", test_df$Price)
# Convert the cleaned strings to numeric
test_df$Price <- as.numeric(test_df$Price)
test_df$Price
Or in tidyverse style:
library(dplyr)  # provides %>% and mutate()
test_df <- structure(list(Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000")), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))
# Strip the commas, then convert to numeric, inside mutate()
test_df <- test_df %>%
  mutate(Price = gsub(",", "", Price) %>%
    as.numeric())
test_df$Price
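As for why the attempt in the question gave that output: the warning "argument is not an atomic vector; coercing" suggests that select(Price) handed str_remove_all a one-column tibble (a list) rather than a character vector, so the whole column was deparsed into a single string. A minimal sketch of two ways around that, assuming dplyr and stringr are installed; property_demo and its values are a made-up stand-in for property2, not the asker's real data:

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for property2 (illustrative values only)
property_demo <- tibble(
  Location = c("A", "B", "C"),
  Price = c("1,250,000", "900,000", NA)
)

# Option 1: clean the column in place with mutate(), keeping the data frame
cleaned <- property_demo %>%
  mutate(Price = as.numeric(str_remove_all(Price, ",")))

# Option 2: pull() extracts the column as a plain character vector,
# which is what str_remove_all() expects
prices <- property_demo %>%
  pull(Price) %>%
  str_remove_all(",") %>%
  as.numeric()
```

Option 1 is probably what you want if the rest of the data frame is still needed; option 2 returns only the numeric vector.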
You can use lapply and readr::parse_number, credit to: How to read data when some numbers contain commas as thousand separator?
#your data
test_df = structure(list(Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000")), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))
test_df$Price<- lapply(test_df$Price, readr::parse_number)
test_df
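One thing to be aware of: lapply returns a list, so Price ends up as a list column rather than a numeric column. Since readr::parse_number accepts a whole character vector, a sketch without lapply (assuming readr is installed, and using a plain data.frame here only to keep the example self-contained) could look like this:

```r
library(readr)

# Same sample prices as above, in a plain data.frame
test_df <- data.frame(
  Price = c("1,250,000", "6,800,000", "1,030,000", NA, "900,000"),
  stringsAsFactors = FALSE
)

# parse_number() is vectorised: it drops the grouping commas
# and returns a double vector, leaving NA as NA
test_df$Price <- parse_number(test_df$Price)

is.numeric(test_df$Price)
```

This keeps Price as an ordinary numeric column, so functions such as mean(test_df$Price, na.rm = TRUE) work on it directly.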