How to use the stop function in R to check function parameters

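I am trying to create a function that summarizes and classifies data. As part of it, I want to add some checks to make sure the arguments passed in are valid, e.g. that the percentages sum to 1, that a variable is character or numeric, and so on.

In the example below, I try to check that the a, b and c arguments sum to 1 and that one of the variables is of class numeric. However, when I pass arguments that do not meet these conditions, the code still runs, regardless of whether the values sum to 1.

Following a suggestion, I changed the code to use stopifnot(), but it fails once I add the second check, is.numeric().

I have put some examples at the bottom showing how the function is applied to the diamonds dataset. The stopifnot function is not working properly. Any help is appreciated.

How do I stop the function unless a + b + c == 1 and the dim variable is numeric?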

library(tidyverse)


customer_segmentation <- function(df,group,dim,a=.7,b=.26,c=.04)
  {
  
 #this is part of the code that I am focused on--------------
  stopifnot(a+b+c==1L,!is.numeric(dim))
  
 #you can ignore the part below---------- 
  
  
  
  df %>% #dataframe
    group_by({{group}}) %>% 

    #creates a bunch of columns
    summarize(  
      across({{ dim }}, #make sure this dimension can be aggregated; a later version will handle ratios
             list(sum=~sum(.,na.rm=TRUE),
                  mean=~mean(.,na.rm=TRUE),
                  n =  ~ n(),
                  median =  ~ median(.,na.rm=TRUE),
                  sd =  ~ sd(.,na.rm=TRUE),
                  mad =  ~ mad(.,na.rm=TRUE), #median absolute deviation
                  aad =  ~ mad(., center =mean(.,na.rm=TRUE),na.rm=TRUE), #average absolute deviation
                  IQR05 = ~quantile(., .05,na.rm=TRUE),
                  IQR25 = ~quantile(., .25,na.rm=TRUE),
                  IQR75 = ~quantile(., .75,na.rm=TRUE),
                  IQR95 = ~quantile(., .95,na.rm=TRUE)
                  ),
             .names = "{.fn}") #gives each column their name
    ) %>%
    ungroup() %>% 

    arrange(desc(sum)) %>% # assuming positive values, descends highest to lowest (should add some logic to switch this)
    
    mutate(cum_sum=cumsum(sum), #cumulative value; if ratio, need some sort of check - need to specify
         prop_total=sum/max(cum_sum), #assumes positive values, need check
         cum_prop_total=cumsum(prop_total), #cumsum percent of total
         cum_unit_prop=row_number()/max(row_number()), #unit percent
         group_classification_by_dim=
           case_when(
           cum_prop_total<=a ~"A",
           cum_prop_total<=(a+b) ~"B",
           TRUE ~ "C"),
         dim_threshold=
           case_when(group_classification_by_dim=="A"~a,
                     group_classification_by_dim=="B"~(a+b),
                     TRUE ~ c)
         ) %>% 
    select(-c(prop_total,cum_sum)) %>% 
    relocate(dim_threshold,group_classification_by_dim,cum_prop_total,cum_unit_prop)

}


#this does not work but should (a, b, c sum to 1, x is numeric)

diamonds %>% 
  customer_segmentation(group = clarity,dim=x,a=.7L,b=.2L,c=.1L)

#this does not work, as expected (a, b, c do not sum to 1, x is numeric)

diamonds %>% 
  customer_segmentation(group = clarity,dim=x,a=.9,b=.2,c=.1)


#is numeric
diamonds$x %>% class()

#does not work because can't find "x", however abc works with default values
##object 'x' not found
diamonds %>% 
  customer_segmentation(group = clarity,dim=x)

stopifnot()

stopifnot() is typically used to exit when an assertion fails. It is simple to use:

> a = 5
> b = 6
> stopifnot(a == b)

Error: a == b is not TRUE

It is often helpful to include your own message for the user. This is set as the name of the expression (supported since R 4.0.0):

> stopifnot("A and B should be the same, fool!" = a == b)

Error: A and B should be the same, fool!

We can also perform several checks in one statement:

stopifnot("A is too small" = a > 0.5,
          "B should be the same as A" = a == b)

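Applied to the question, several checks in one stopifnot() call might look like the sketch below. It is only an illustration: the helper name check_segmentation_args and the toy data frame are hypothetical, and it assumes dplyr is installed and R is at least 4.0.0 (for the named messages). Because dim is passed as an unquoted column name, the column is first extracted with pull() and the embrace operator before is.numeric() is applied.

```r
library(dplyr)

# Hypothetical helper (not from the original post): validates the inputs
# before any summarising is done.
check_segmentation_args <- function(df, dim) {
  stopifnot(
    "df must be a data frame" = is.data.frame(df),
    # dim is an unquoted column name, so extract the column from df first;
    # is.numeric(dim) on its own would look for an object with that name.
    "dim must refer to a numeric column" = is.numeric(pull(df, {{ dim }}))
  )
  invisible(TRUE)
}

# Toy data, made up for this example.
toy <- data.frame(g = c("p", "q", "q"), x = c(1.5, 2, 3))

check_segmentation_args(toy, dim = x)    # passes silently
# check_segmentation_args(toy, dim = g)  # would stop: g is a character column
```

Note that the original check was written as !is.numeric(dim), which asserts that the value is not numeric; the sketch drops the negation.
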
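Another idea, which gives some extra flexibility when printing the error message:

a == b || stop('A should be equal to B, but A=', a, ', B=', b)

In this case, if the first part of the OR statement (a == b) is TRUE, the second part (stop(...)) is not evaluated.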
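
A minimal sketch of this short-circuit pattern wrapped in a function, using stop(), base R's function for signalling an error. The helper name check_equal is hypothetical; tryCatch() is used only to capture the message so that it can be inspected.

```r
# Hypothetical helper for illustration.
check_equal <- function(a, b) {
  # stop() pastes its arguments together to form the error message,
  # so the offending values can be reported.
  a == b || stop("A should be equal to B, but A=", a, ", B=", b)
  invisible(TRUE)
}

check_equal(5, 5)  # passes silently

# Capture the error message instead of aborting the script.
msg <- tryCatch(check_equal(5, 6), error = function(e) conditionMessage(e))
print(msg)
```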

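On comparing floating-point numbers

Because floating-point numbers (numbers with decimals) cannot always be stored exactly in binary, comparisons between them are subject to rounding error.

The machine epsilon, the smallest positive number x for which 1 + x != 1, may vary between R installations: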

> .Machine$double.eps
[1] 2.220446e-16
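
A short sketch of what this value means in practice, assuming standard IEEE 754 double arithmetic: adding the epsilon to 1 changes the result, while adding half of it does not.

```r
eps <- .Machine$double.eps

# Adding eps to 1 yields the next representable double above 1.
print(1 + eps != 1)

# Adding half of eps is rounded back to exactly 1.
print(1 + eps / 2 == 1)
```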

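all.equal() uses a default comparison tolerance of about .Machine$double.eps^0.5 (roughly 1.5e-8), whereas == applies no tolerance at all, so a difference too small to be represented is simply lost to rounding. Consider the following example: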

> a = 1
> b = 0.00000000000000001

> a - b == 1
[1] TRUE

> a + b == 1
[1] TRUE

> b == 0
[1] FALSE

To control the tolerance, use the function all.equal():

> format( pi , nsmall=7)
[1] "3.1415927"

> format( 355/113 , nsmall=7)
[1] "3.1415929"

> pi == 355/113
[1] FALSE

> isTRUE(all.equal( pi, 355/113 ))
[1] FALSE

> isTRUE(all.equal( pi, 355/113, tolerance = 0.0000001))
[1] TRUE
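
Tying this back to the question, the check that a, b and c sum to 1 can be sketched with all.equal() instead of ==. The helper name check_proportions is hypothetical, the named message assumes R 4.0.0 or later, and the result of the exact comparison assumes standard IEEE 754 doubles.

```r
a <- 0.7; b <- 0.2; c <- 0.1

# The exact comparison can fail because the sum is rounded at each step;
# with standard IEEE 754 doubles it is FALSE for these values.
print(a + b + c == 1)

# Comparison within all.equal()'s default tolerance.
print(isTRUE(all.equal(a + b + c, 1)))

# Hypothetical helper: stops unless the three proportions sum to 1.
check_proportions <- function(a, b, c) {
  stopifnot("a, b and c must sum to 1" = isTRUE(all.equal(a + b + c, 1)))
  invisible(TRUE)
}

check_proportions(0.7, 0.2, 0.1)    # passes silently
# check_proportions(0.9, 0.2, 0.1)  # would stop: the values sum to 1.2
```

all.equal() returns a character description rather than FALSE when the values differ, which is why it is wrapped in isTRUE() before being passed to stopifnot().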