Mean of a numeric variable across subgroups within a string variable

I have two variables in the following format:

desc waiting_days

              storage   display    value
variable name   type    format     label      variable label
----------------------------------------------------------------
waiting_days    float   %9.0g

desc category

              storage   display    value
variable name   type    format     label      variable label
----------------------------------------------------------------
category~y      str11   %11s                  Categories

waiting_days is numeric, while category is a string variable with 11 unique subgroups, taking values such as category 1, category 2, and so on.

I am trying to create average_waiting_days, holding the mean of waiting_days for each category of the string variable category.

waiting_days    category       average_waiting_days

319             category 2     100 days
8763            category 6     85 days
7455            category 3     300 days
464             category 6     85 days
900             category 3     300 days
500             category 3     300 days

Here are a few options for computing means by category.

// Create a new variable with means by category
egen avg_waiting_days = mean(waiting_days), by(category)

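// Create a table of means (syntax for Stata 16 and earlier;
// in Stata 17 or later use: table category, statistic(mean waiting_days))
table category, c(mean waiting_days)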

// Calculate means with standard errors
encode category, gen(category_num)
mean waiting_days, over(category_num)
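
To try the egen approach on a small dataset, here is a minimal sketch that types in the six rows from the question and adds the group mean to every row. It assumes the strings are stored exactly as shown ("category 2" and so on). The target values in the question (100, 85, 300) look illustrative, so the means computed from these six rows alone will differ from them.

```stata
clear
// Enter the six example rows from the question
input waiting_days str11 category
319  "category 2"
8763 "category 6"
7455 "category 3"
464  "category 6"
900  "category 3"
500  "category 3"
end

// Mean of waiting_days within each category, repeated on every row of the group
bysort category: egen average_waiting_days = mean(waiting_days)

// Show the result, with a separator line between categories
list waiting_days category average_waiting_days, sepby(category)
```

The new variable is numeric, so it holds the mean without a "days" suffix. egen works directly on the string variable, so encode is only needed for commands such as mean, whose over() option requires a numeric variable.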