Stata - Generate unique combinations ignoring order
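I am trying to create a dataset that lists all combinations of rows by ID, ignoring the order within a combination (i.e. Apple -> Orange is the same as Orange -> Apple). I started from the code in a linked question, which gives me all combinations but keeps both orderings of each pair. How can I modify this code to get the desired output? Or should I use a different approach entirely, and if so, which one?
My current code, based on the linked question: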
clear all
input id str10 fruit
1 "Apple"
1 "Banana"
1 "Orange"
2 "Orange"
2 "Apple"
3 "Pear"
3 "Kiwi"
3 "Apple"
3 "Lemon"
end
tempfile t1
save `t1'
clear
use `t1'
rename fruit f2
keep id f2
joinby id using `t1'
order id fruit f2
sort id fruit f2
drop if fruit==f2
list, sepby(id)
My desired output is:
ID fruit f2
1 Apple Banana
1 Apple Orange
1 Banana Orange
2 Orange Apple
3 Pear Kiwi
3 Pear Apple
3 Pear Lemon
3 Kiwi Apple
3 Kiwi Lemon
3 Apple Lemon
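After joinby, you can generate a helper variable that holds fruit and f2 together in sorted order, so that the same combination always gets the same value. You can then use duplicates drop on this variable and id to remove the duplicates.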
clear
input id str10 fruit
1 "Apple"
1 "Banana"
1 "Orange"
2 "Orange"
2 "Apple"
3 "Pear"
3 "Kiwi"
3 "Apple"
3 "Lemon"
end
tempfile t1
save `t1'
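* Rename so the copy in memory can be paired with the saved copy
rename fruit f2
* Form all pairwise combinations within each id
joinby id using `t1'
order id fruit f2
sort id fruit f2
* Remove self-pairs
drop if fruit==f2
* Order-independent key: the two names in alphabetical order
gen combination = cond(fruit < f2, fruit + " " + f2, f2 + " " + fruit)
* Keep one row per id and key
duplicates drop id combination, force
drop combination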
list, sepby(id)
+----------------------+
| id fruit f2 |
|----------------------|
1. | 1 Apple Banana |
2. | 1 Apple Orange |
3. | 1 Banana Orange |
|----------------------|
4. | 2 Apple Orange |
|----------------------|
5. | 3 Apple Kiwi |
6. | 3 Apple Lemon |
7. | 3 Apple Pear |
8. | 3 Kiwi Lemon |
9. | 3 Kiwi Pear |
10. | 3 Lemon Pear |
+----------------------+
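The listing above shows each pair in alphabetical order, whereas the desired output in the question keeps the order in which the fruits appear in the data (for example Orange Apple for id 2). Below is a minimal sketch of one way to get that ordering. It assumes that each fruit appears at most once per id and that the row order of the dataset is meaningful. pos and pos2 are hypothetical helper variable names, not part of the original code.

```stata
clear
input id str10 fruit
1 "Apple"
1 "Banana"
1 "Orange"
2 "Orange"
2 "Apple"
3 "Pear"
3 "Kiwi"
3 "Apple"
3 "Lemon"
end

* Record the original row position (hypothetical helper variable)
gen long pos = _n

tempfile t1
save `t1'

* Rename so the copy in memory can be paired with the saved copy
rename fruit f2
rename pos pos2

* Form all pairwise combinations within each id
joinby id using `t1'

* Keep only pairs where fruit comes before f2 in the original data;
* this drops self-pairs and mirrored pairs in one step
keep if pos < pos2

* Restore the original ordering and tidy up
sort id pos pos2
drop pos pos2
order id fruit f2
list, sepby(id)
```

Because only rows with pos < pos2 are kept, no helper key or duplicates drop is needed here. If alphabetical order within a pair is acceptable, the same idea should work without the position variable, using keep if fruit < f2 directly after joinby (again assuming no repeated fruit within an id).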