How to use the stop function in R to check function parameters
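I am trying to create a function that summarizes and classifies data. As part of it, I want to add some checks to make sure the arguments passed in are valid, e.g. that the percentages sum to 1, that a variable is character or numeric, etc.
In the example below, I try to check that the a, b, c arguments sum to 1 and that one of the variables is of class numeric. However, when I pass arguments that fail the checks, the code still runs, regardless of whether the values sum to 1.
Following a suggestion, I changed the code to stopifnot(), but it fails when it reaches the second check, is.numeric().
At the bottom I have included some examples of applying the function to the diamonds dataset. The stopifnot function does not work as expected. Any help is appreciated.
How do I make the function stop when a+b+c == 1 does not hold or when the dim variable is not numeric?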
library(tidyverse)
customer_segmentation <- function(df,group,dim,a=.7,b=.26,c=.04)
{
#this is part of the code that I am focused on--------------
stopifnot(a+b+c==1L,!is.numeric(dim))
#you can ignore the part below----------
df %>% #dataframe
group_by({{group}}) %>%
#creates a bunch of columns
summarize(
across({{ dim }}, #make sure this dimension can be aggregated, later version will handle ratios
list(sum=~sum(.,na.rm=TRUE),
mean=~mean(.,na.rm=TRUE),
n = ~ n(),
median = ~ median(.,na.rm=TRUE),
sd = ~ sd(.,na.rm=TRUE),
mad = ~ mad(.,na.rm=TRUE), #median absolute deviation
aad = ~ mad(., center =mean(.,na.rm=TRUE),na.rm=TRUE), #average absolute deviation
IQR05 = ~quantile(., .05,na.rm=TRUE),
IQR25 = ~quantile(., .25,na.rm=TRUE),
IQR75 = ~quantile(., .75,na.rm=TRUE),
IQR95 = ~quantile(., .95,na.rm=TRUE)
),
.names = "{.fn}") #gives each column their name
) %>%
ungroup() %>%
arrange(desc(sum)) %>% # assuming positive values, descends highest to lowest (should add some logic to switch this)
mutate(cum_sum=cumsum(sum), #cumulative value, if ratio, need some sort of check - need specify
prop_total=sum/max(cum_sum), #assumes positive values, need check
cum_prop_total=cumsum(prop_total), #cumsum percent of total
cum_unit_prop=row_number()/max(row_number()), #unit percent
group_classification_by_dim=
case_when(
cum_prop_total<=a ~"A",
cum_prop_total<=(a+b) ~"B",
TRUE ~ "C"),
dim_threshold=
case_when(group_classification_by_dim=="A"~a,
group_classification_by_dim=="B"~(a+b),
TRUE ~ c)
) %>%
select(-c(prop_total,cum_sum)) %>%
relocate(dim_threshold,group_classification_by_dim,cum_prop_total,cum_unit_prop)
}
#this does not works but should not (a,b,c sums do not equal 1, x is numeric)
diamonds %>%
customer_segmentation(group = clarity,dim=x,a=.7L,b=.2L,c=.1L)
#this does not work but should work (a,b,c sums to 1, x is a numeric)
diamonds %>%
customer_segmentation(group = clarity,dim=x,a=.9,b=.2,c=.1)
#is numeric
diamonds$x %>% class()
#does not work because can't find "x", however abc works with default values
##object 'x' not found
diamonds %>%
customer_segmentation(group = clarity,dim=x)
stopifnot() is typically used to exit when an assertion fails. It is simple to use:
> a = 5
> b = 6
> stopifnot(a == b)
Error: a == b is not TRUE
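It is often helpful to include your own message for the user. In R 4.0.0 and later, this is supplied as the name of the expression: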
> stopifnot("A and B should be the same, fool!" = a == b)
Error: A and B should be the same, fool!
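The named-message form can also be applied to the column check from the question. The following is a minimal sketch, not the original poster's code: check_dim_is_numeric and the toy data frame are hypothetical names introduced here, and it assumes dplyr is installed and R is 4.0.0 or later. Because dim is passed as a bare column name, is.numeric(dim) tries to evaluate an object named x and fails with "object 'x' not found"; the sketch extracts the column from df first and tests that instead.

```r
library(dplyr)

# Hypothetical helper (not from the original post).
check_dim_is_numeric <- function(df, dim) {
  # `dim` is a bare column name, so is.numeric(dim) would look for an object
  # with that name outside df; extract the column from df instead.
  values <- pull(df, {{ dim }})
  stopifnot("dim must refer to a numeric column of df" = is.numeric(values))
  invisible(TRUE)
}

# Toy data, assumed for illustration only.
toy <- data.frame(grp = c("p", "q", "r"), x = c(1.5, 2.5, 3.5))

check_dim_is_numeric(toy, x)      # passes silently
# check_dim_is_numeric(toy, grp)  # would stop with the custom message
```

Note that the check uses is.numeric(values) rather than !is.numeric(...): stopifnot() stops when a condition is not TRUE, so each condition should state what must hold.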
We can also perform several checks in a single statement:
stopifnot("A is too small" = a > 0.5,
"B should be the same as A" = a == b)
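Another approach gives some extra flexibility when printing the error message:
a == b || stop('A should be equal to B, but A=', a, ', B=', b)
In this case, if the first part of the OR statement (a == b) is TRUE, the second part (stop(...)) is not evaluated.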
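As a rough illustration of this short-circuit pattern, the sketch below wraps it in a small function and captures the resulting error with tryCatch() so the message can be inspected. The name assert_equal is hypothetical and only serves the example.

```r
# Hypothetical helper: stops with a message that reports the offending values.
assert_equal <- function(a, b) {
  # stop() is only reached when a == b is FALSE.
  a == b || stop("A should be equal to B, but A=", a, ", B=", b, call. = FALSE)
  invisible(TRUE)
}

# tryCatch() captures the error so its message can be inspected.
msg <- tryCatch(assert_equal(5, 6), error = function(e) conditionMessage(e))
print(msg)
# [1] "A should be equal to B, but A=5, B=6"
```

Unlike a named stopifnot() condition, the message here is built at run time, so it can include the actual values.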
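On comparing floating-point numbers
Because floating-point numbers (numbers with a fractional part) cannot always be stored exactly in binary, comparing them for exact equality can give surprising results.
The machine epsilon, the smallest positive x for which 1 + x != 1, may vary between R installations: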
> .Machine$double.eps
[1] 2.220446e-16
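all.equal() uses a default tolerance of .Machine$double.eps^0.5 (about 1.5e-8), whereas == compares the stored values exactly. Consider the following example, in which b is too small to change the stored value of a: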
> a = 1
> b = 0.00000000000000001
> a - b == 1
[1] TRUE
> a + b == 1
[1] TRUE
> b == 0
[1] FALSE
To control the tolerance, use the all.equal() function:
> format( pi , nsmall=7)
[1] "3.1415927"
> format( 355/113 , nsmall=7)
[1] "3.1415929"
> pi == 355/113
[1] FALSE
> isTRUE(all.equal( pi, 355/113 ))
[1] FALSE
> isTRUE(all.equal( pi, 355/113, tolerance = 0.0000001))
[1] TRUE
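Applied to the question, the a + b + c == 1 check can be written with a tolerance instead of exact equality. This is a minimal sketch under that assumption; sums_to_one and its tol argument are hypothetical names, not part of the original function.

```r
# Hypothetical helper: TRUE when the proportions sum to 1 within `tol`.
sums_to_one <- function(..., tol = sqrt(.Machine$double.eps)) {
  # all.equal() returns a character description when the values differ,
  # so isTRUE() is needed to get a plain TRUE/FALSE.
  isTRUE(all.equal(sum(...), 1, tolerance = tol))
}

sums_to_one(0.7, 0.26, 0.04)       # TRUE
sums_to_one(0.9, 0.2, 0.1)         # FALSE, the sum is 1.2
0.1 + 0.2 == 0.3                   # FALSE: exact comparison is too strict
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE
```

Inside the function from the question, this could be used as stopifnot("a, b and c must sum to 1" = sums_to_one(a, b, c)).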