Referencing a vector in an R sqldf statement

Question

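I have a sqldf statement that references a data frame, but I also want it to reference a vector that is not part of the data frame, as shown below.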

sqldf("select count(*) from carddata where new_user_indicator == 'Y' & loyalty_threshold >  average_loyalty_threshold")

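average_loyalty_threshold is a standalone vector that is computed separately and is not part of the data frame.

How can I reference a standalone vector in the sqldf where clause?

Thanks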

Answer 1

Suppose your data looks like this:

library(sqldf)

carddata = data.frame(new_user_indicator = c('N','N','Y','Y','Y'),
                      loyalty_threshold = c(1,1,5,3,1))

If your goal is to select all rows of carddata whose loyalty threshold is higher than the value held in a separate single-value vector, you can use the following:

# create a dataframe from average_loyalty_threshold so that sqldf will see it as a table
average_loyalty_threshold = data.frame(threshold = 2)

sqldf("select count(*)
      from carddata
      where new_user_indicator == 'Y'
      and loyalty_threshold > (select * from average_loyalty_threshold)")

#returns

  count(*)
1        2

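The subquery (select * from average_loyalty_threshold) picks out the single value you are looking for. This works because the data frame has exactly one row and one column.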

There is a simpler way, though:

average_loyalty_threshold = 2

fn$sqldf("select count(*)
  from carddata
  where new_user_indicator == 'Y'
  and loyalty_threshold > `average_loyalty_threshold`")

#returns

  count(*)
1        2

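Here I pass the loyalty threshold directly into the query: fn$ evaluates the backtick-quoted R expression and substitutes its value into the SQL string before sqldf runs it.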
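As far as I know, fn$ (from the gsubfn package, which sqldf loads) also substitutes variables written with a $ prefix, which some find easier to read than backticks. The following is a minimal sketch on the same sample data; the alias n and the variable name result_dollar are only illustrative.

```r
library(sqldf)

# Same sample data as above
carddata <- data.frame(new_user_indicator = c('N', 'N', 'Y', 'Y', 'Y'),
                       loyalty_threshold = c(1, 1, 5, 3, 1))

average_loyalty_threshold <- 2

# $average_loyalty_threshold is replaced by the value of the R variable
# before the query text is handed to sqldf
result_dollar <- fn$sqldf("select count(*) as n
  from carddata
  where new_user_indicator = 'Y'
  and loyalty_threshold > $average_loyalty_threshold")

print(result_dollar)
```

Both forms are plain text substitution, so a character value would still need its own single quotes inside the SQL string.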

You could also build the query text with sprintf(), but as others pointed out in the comments, fn$ is the recommended way to reference external variables.
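For completeness, here is a rough sketch of the sprintf() variant on the same sample data. The names query and result_sprintf and the alias n are assumptions made for illustration; %s is used so that the number 2 is pasted in as "2".

```r
library(sqldf)

# Same sample data as above
carddata <- data.frame(new_user_indicator = c('N', 'N', 'Y', 'Y', 'Y'),
                       loyalty_threshold = c(1, 1, 5, 3, 1))

average_loyalty_threshold <- 2

# Build the SQL text first; %s is replaced by the threshold value
query <- sprintf("select count(*) as n
  from carddata
  where new_user_indicator = 'Y'
  and loyalty_threshold > %s", average_loyalty_threshold)

# Run the finished query string
result_sprintf <- sqldf(query)

print(result_sprintf)
```

Because the value is pasted into the SQL text, this approach is only suitable for values you control, not for untrusted input.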

r

dataframe

sqldf