Binary operations in a dataframe
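I have a small question about binary operations in data frames. I have a data frame and want to create a new column, PerWeek, that holds the result of dividing Gross by Weeks. How can I do this, given that the Gross elements are not numeric?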
boxoffice = function(){
url = "https://www.imdb.com/chart/boxoffice"
read_table = read_html("https://www.imdb.com/chart/boxoffice")
movie_table = html_table(html_nodes(read_table, "table")[[1]])
Name = movie_table[2]
Gross = movie_table[4]
Weeks = movie_table[5]
BoxOffice =
for (i in 1:10){
PerWeek = movie_table[4][i] %/% movie_table[5][i]
}
df = data.frame(Name,BoxOffice,PerWeek)
return(df)
}
If your Gross values are always in millions, you can extract the number from each one, multiply it by 1e6 to get the amount in dollars, and then divide by Weeks.
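library(rvest)
library(dplyr)
# Scrape the box office chart and take the first table on the page
url = "https://www.imdb.com/chart/boxoffice"
read_table = read_html(url)
movie_table = html_table(html_nodes(read_table, "table")[[1]])
# Drop the first and last columns, keeping Title, Weekend, Gross and Weeks
movie_table <- movie_table[-c(1, ncol(movie_table))]
# parse_number() extracts the numeric part of each Gross string
movie_table %>% mutate(per_week_calc = readr::parse_number(Gross) * 1e6/Weeks)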
# Title Weekend Gross Weeks per_week_calc
#1 Onward .5M .3M 2 30150000
#2 I Still Believe .5M .5M 1 9500000
#3 Bloodshot .3M .5M 1 10500000
#4 The Invisible Man .0M .4M 3 21466667
#5 The Hunt .3M .8M 1 5800000
#6 Sonic the Hedgehog .6M 5.8M 5 29160000
#7 The Way Back .4M .4M 2 6700000
#8 The Call of the Wild .2M .1M 4 15525000
#9 Emma. .4M .0M 4 2500000
#10 Bad Boys for Life .1M 4.3M 9 22700000
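The same calculation also works without dplyr or a loop, because arithmetic on data frame columns is vectorized. Below is a minimal base R sketch. It assumes made-up sample rows (the `movies` data frame and its titles are hypothetical, standing in for the scraped table) and assumes every Gross string has the form "$<number>M".

```r
# Hypothetical sample data standing in for the scraped table
movies <- data.frame(
  Title = c("Film A", "Film B", "Film C"),
  Gross = c("$60.3M", "$9.5M", "$145.8M"),
  Weeks = c(2, 1, 5),
  stringsAsFactors = FALSE
)

# Remove everything except digits and the decimal point, then convert to numeric
gross_millions <- as.numeric(gsub("[^0-9.]", "", movies$Gross))

# Vectorized division: one result per row, no for loop needed
movies$PerWeek <- gross_millions * 1e6 / movies$Weeks

print(movies)
```

Note that `/` is used rather than `%/%`: the latter is integer division and would discard the fractional part of the result.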
If your data also contains values in billions or thousands, see:
Changing Million/Billion abbreviations into actual numbers? ie. 5.12M -> 5,120,000 and
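
As a rough illustration of that idea, here is a minimal base R sketch that handles mixed suffixes. The helper name `parse_abbrev` is hypothetical, and the code assumes that the only suffixes are K, M and B and that a string without one is already a plain number. It is not necessarily the approach taken in the linked question.

```r
# Hypothetical helper: convert strings such as "$5.12M" or "$1.2B" to plain numbers.
# Assumes an optional trailing K, M or B; any other string gets a multiplier of 1.
parse_abbrev <- function(x) {
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
  # Numeric part: drop every character that is not a digit or a decimal point
  value <- as.numeric(gsub("[^0-9.]", "", x))
  # Suffix: the trailing K/M/B letter if present (otherwise the string is unchanged)
  suffix <- toupper(sub(".*([KkMmBb])$", "\\1", x))
  # Look up the multiplier; unrecognized suffixes yield NA, which becomes 1
  mult <- unname(multipliers[suffix])
  mult[is.na(mult)] <- 1
  value * mult
}

parse_abbrev(c("$5.12M", "$1.2B", "$750K", "950"))
```

With such a helper, the per-week column could be computed as `parse_abbrev(Gross) / Weeks` instead of hard-coding the 1e6 factor.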