Condition-based sampling from distributions

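I am working on a data set for students to practice hypothesis testing. The data should contain fictitious processing times for the production of construction-equipment vehicles. The vehicles come in different types and with different options, and these options may affect the processing time. Based on the processing times and the machine specifications, students will investigate which factors have a significant effect on processing time and predict how long it takes to produce a particular machine with a particular configuration.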

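The end goal of the data set is a total processing time for each machine. In essence, the (total) processing time should be the sum of base time + option 1 time + option 2 time + option 3 time, and so on. Each option time is sampled randomly from a distribution so that the structure is not too obvious. Only the total time will be given to the students, but I need the option times to build the total.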

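I know how to do random sampling with rnorm() and other distributions. What I do not know is how to generate data conditionally, based only on the contents of a column.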

The data set looks like this.

Machine                  <-   c(1,2,3,4,5,6,7,8,9,10)
Pump.Option              <-   c("30 Liter", "40 Liter", "30 Liter", "30 Liter", "30 Liter", "30 Liter", "50 Liter", "30 Liter", "30 Liter", "40 Liter")
Piping.Option            <-   c("No special piping", "No special piping", "special piping", "No special piping", "special piping", "No special piping", "No special piping", "special piping", "special piping", "No special piping")
Lights.Option            <-   c("Std light", "Std & Additional", "Std & Additional","Std & Additional", "Std & Additional", "Std & Additional", "Std light", "Std & Additional", "Std & Additional", "Std & Additional")
Valve.Option             <-   c("Safety valve", "Safety valve", "Normal valve", "Normal valve", "Safety valve", "Normal valve", "Safety valve", "Safety valve", "Normal valve", "Safety valve")
Pump.Time                <-   NA       
Piping.Time              <-   NA
Lights.Time              <-   NA
Valve.Time               <-   NA
Total.Time               <-   NA


DF.Sample                <- data.frame(Machine, Pump.Option, Piping.Option, Lights.Option, Valve.Option, Pump.Time, Piping.Time, Lights.Time, Valve.Time, Total.Time)

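Depending on the contents of the Pump.Option, Piping.Option, and Lights.Option columns, the times Pump.Time, Piping.Time, and Lights.Time need to be generated. These times are then used to calculate the total time for that machine.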

The times for the options look like this.

You can use dplyr's case_when for this, which gives a relatively clean syntax compared with a set of nested ifelse statements:

library(dplyr)

DF.Sample %>%
    mutate(Pump.Time = case_when(
            Pump.Option == "30 Liter" ~ 0,        
            Pump.Option == "40 Liter" ~ rnorm(n(), mean = 10, sd = 4),
            Pump.Option == "50 Liter" ~ rnorm(n(), mean = 20, sd = 10)
        ), 
        Piping.Time = case_when(
           Piping.Option == "No special piping" ~ 0, 
           Piping.Option == "special piping" ~ rnorm(n(), mean = 10, sd = 4)
        ),
        Lights.Time = case_when(
           Lights.Option == "Std light" ~ 0,
           Lights.Option == "Std & Additional" ~ rnorm(n(), mean = 10, sd = 4)
        )
    )
#>    Machine Pump.Option     Piping.Option    Lights.Option Valve.Option
#> 1        1    30 Liter No special piping        Std light Safety valve
#> 2        2    40 Liter No special piping Std & Additional Safety valve
#> 3        3    30 Liter    special piping Std & Additional Normal valve
#> 4        4    30 Liter No special piping Std & Additional Normal valve
#> 5        5    30 Liter    special piping Std & Additional Safety valve
#> 6        6    30 Liter No special piping Std & Additional Normal valve
#> 7        7    50 Liter No special piping        Std light Safety valve
#> 8        8    30 Liter    special piping Std & Additional Safety valve
#> 9        9    30 Liter    special piping Std & Additional Normal valve
#> 10      10    40 Liter No special piping Std & Additional Safety valve
#>    Pump.Time Piping.Time Lights.Time
#> 1   0.000000    0.000000    0.000000
#> 2   4.956528    0.000000   17.716970
#> 3   0.000000   11.051394   10.142101
#> 4   0.000000    0.000000   11.886158
#> 5   0.000000   15.291671    6.745524
#> 6   0.000000    0.000000    5.228694
#> 7  21.520437    0.000000    0.000000
#> 8   0.000000    9.777887    9.222347
#> 9   0.000000   11.219067   14.726647
#> 10 12.761031    0.000000    6.111458
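
The question also asks for a total time per machine, which the code above does not build. One way this might be done is sketched below: add a base time, sum the option times, and keep only the total for the students. The Base.Time column and its parameters (mean = 100, sd = 5) are assumptions made for illustration and do not come from the question; the small df is a stand-in for DF.Sample with only two option columns, and set.seed() is added so the simulated data can be regenerated.

```r
library(dplyr)

set.seed(42)  # fixed seed so the same data set can be regenerated

# Small stand-in for DF.Sample with just two option columns
df <- data.frame(
    Machine = 1:6,
    Pump.Option = c("30 Liter", "40 Liter", "30 Liter",
                    "50 Liter", "30 Liter", "40 Liter"),
    Piping.Option = c("No special piping", "special piping", "special piping",
                      "No special piping", "No special piping", "special piping")
)

result <- df %>%
    mutate(
        # Hypothetical base time; these parameters are not from the question
        Base.Time = rnorm(n(), mean = 100, sd = 5),
        Pump.Time = case_when(
            Pump.Option == "30 Liter" ~ 0,
            Pump.Option == "40 Liter" ~ rnorm(n(), mean = 10, sd = 4),
            Pump.Option == "50 Liter" ~ rnorm(n(), mean = 20, sd = 10)
        ),
        Piping.Time = case_when(
            Piping.Option == "No special piping" ~ 0,
            Piping.Option == "special piping" ~ rnorm(n(), mean = 10, sd = 4)
        ),
        # Total is the base time plus every option time
        Total.Time = Base.Time + Pump.Time + Piping.Time
    )

# Only the options and the total would be handed to the students
student_data <- result %>%
    select(Machine, Pump.Option, Piping.Option, Total.Time)
```

Keeping result around alongside student_data means the component times stay available for checking the students' answers.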

Data

DF.Sample <- data.frame(
    Machine = c(1,2,3,4,5,6,7,8,9,10), 
    Pump.Option = c("30 Liter", "40 Liter", "30 Liter", "30 Liter", "30 Liter", "30 Liter", "50 Liter", "30 Liter", "30 Liter", "40 Liter"),
    Piping.Option = c("No special piping", "No special piping", "special piping", "No special piping", "special piping", "No special piping", "No special piping", "special piping", "special piping", "No special piping"),
    Lights.Option = c("Std light", "Std & Additional", "Std & Additional","Std & Additional", "Std & Additional", "Std & Additional", "Std light", "Std & Additional", "Std & Additional", "Std & Additional"),
    Valve.Option = c("Safety valve", "Safety valve", "Normal valve", "Normal valve", "Safety valve", "Normal valve", "Safety valve", "Safety valve", "Normal valve", "Safety valve")
)
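
As an alternative to case_when (not the approach used above), the distribution parameters could be kept in lookup vectors named by option level, relying on rnorm() being vectorised over mean and sd. This is only a sketch in base R: the names pump_mean and pump_sd are made up for illustration, and the parameter values are copied from the case_when call.

```r
set.seed(1)  # fixed seed so the draw can be repeated

# Same Pump.Option values as in DF.Sample
Pump.Option <- c("30 Liter", "40 Liter", "30 Liter", "30 Liter", "30 Liter",
                 "30 Liter", "50 Liter", "30 Liter", "30 Liter", "40 Liter")

# Hypothetical lookup vectors, named by option level
pump_mean <- c("30 Liter" = 0, "40 Liter" = 10, "50 Liter" = 20)
pump_sd   <- c("30 Liter" = 0, "40 Liter" = 4,  "50 Liter" = 10)

# Indexing by the option column gives one mean and one sd per row;
# with sd = 0, rnorm() returns the mean itself, here exactly 0
Pump.Time <- rnorm(length(Pump.Option),
                   mean = pump_mean[Pump.Option],
                   sd   = pump_sd[Pump.Option])
```

With this layout, adding a new option level means adding one entry to each lookup vector rather than another case_when branch.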