Two variables are not recognized as identical by assert
I would like to understand the behavior of the following code in detail:
clear
set obs 10000
set seed 98034
* I generate three variables
generate double u1 = runiform()
generate double u2=u1
*check
assert u2==u1
***
generate double var1=runiform()
* I generate some ids
generate byte id_=0
forvalues i=1(1)`=10000/100'{
replace id_=`i' if _n>`=(`i'-1)*`=10000/100''
}
*I sum by id_ u1 and u2
bysort id_: egen double u11= total(u1)
bysort id_: egen double u21= total(u2)
*check
assert u11==u21
***
*I drop duplicates
bysort id_: drop if _n>1
*I generate a new variable which should be equal to var1 (I am adding and
*subtracting the same quantities)
generate double var2= var1 - u11 + u21
*(1)
assert var2==var1
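In particular, I cannot understand why assert (1) fails. Every variable I sum was generated in the same way, so var1 and var2 should be identical.
Interestingly, if I order the terms of the sum differently, the assert succeeds: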
drop var2
generate double var2= - u11 + u21 + var1
*(2)
assert var2==var1
The two variables are not identical. To see this, change their display format:
format var1 %20.15f
format var2 %20.15f
list var1 var2 in 1/10
+---------------------------------------+
| var1 var2 |
|---------------------------------------|
1. | 0.498376312204773 0.498376312204776 |
2. | 0.394671386281136 0.394671386281132 |
3. | 0.515152901323075 0.515152901323077 |
4. | 0.789668809822002 0.789668809822004 |
5. | 0.931897887273974 0.931897887273976 |
|---------------------------------------|
6. | 0.947614996238336 0.947614996238336 |
7. | 0.207296218919878 0.207296218919879 |
8. | 0.368812285027951 0.368812285027950 |
9. | 0.565084085641873 0.565084085641871 |
10. | 0.331114583239097 0.331114583239099 |
+---------------------------------------+
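The ordering of mathematical operations does matter. Operators of equal precedence are evaluated from left to right, so var1 - u11 + u21 is computed as (var1 - u11) + u21: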
generate double var2= var1 - u11 + u21
format var2 %20.15f
generate double v2a = - u11 + u21
generate double v2b = var1 + v2a
format v2b %20.15f
generate double v2c = var1 - u11
generate double v2d = v2c + u21
format v2d %20.15f
list var2 v2b v2d in 1/10
+-----------------------------------------------------------+
| var2 v2b v2d |
|-----------------------------------------------------------|
1. | 0.498376312204776 0.498376312204773 0.498376312204776 |
2. | 0.394671386281132 0.394671386281136 0.394671386281132 |
3. | 0.515152901323077 0.515152901323075 0.515152901323077 |
4. | 0.789668809822004 0.789668809822002 0.789668809822004 |
5. | 0.931897887273976 0.931897887273974 0.931897887273976 |
|-----------------------------------------------------------|
6. | 0.947614996238336 0.947614996238336 0.947614996238336 |
7. | 0.207296218919879 0.207296218919878 0.207296218919879 |
8. | 0.368812285027950 0.368812285027951 0.368812285027950 |
9. | 0.565084085641871 0.565084085641873 0.565084085641871 |
10. | 0.331114583239099 0.331114583239097 0.331114583239099 |
+-----------------------------------------------------------+
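A minimal sketch of the same effect without any dataset, assuming two made-up scalars x and big. Neither is in the original code; the values 0.1 and 50.3 are chosen only so that one operand is much larger than the other, as u11 is relative to var1:

```stata
* x and big are hypothetical values, used for illustration only
scalar x   = 0.1
scalar big = 50.3

* evaluated left to right as (x - big) + big: the low-order bits of x
* are rounded away in the intermediate result x - big
display %20.17f x - big + big
display (x - big + big) == x

* evaluated as (-big + big) + x: the first step is exactly zero,
* so x comes back unchanged
display %20.17f -big + big + x
display (-big + big + x) == x
```

The first comparison should display 0 and the second 1, which mirrors assert (1) failing and assert (2) succeeding.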
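Given the size of the differences, this is a precision issue. u11 and u21 are each the sum of 100 uniform draws, so they are about 50, and a double of that magnitude cannot resolve differences much smaller than about 7e-15. The intermediate result var1 - u11 therefore loses the lowest bits of var1, and adding u21 back does not restore them. In -u11 + u21 + var1 the first step is exactly zero, so var1 is left unchanged. For details, type help precision at Stata's prompt.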
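If the goal is only to check that two doubles agree up to rounding error, one option is to compare within a tolerance instead of testing exact equality. The sketch below is self-contained and uses a hypothetical one-observation dataset; the value 50.3 stands in for u11, and the tolerance 1e-12 is an arbitrary assumption, not a recommended value. reldif(x,y) is Stata's built-in relative difference, |x - y| / (|y| + 1).

```stata
* hypothetical one-observation dataset, for illustration only
clear
set obs 1
generate double big  = 50.3
generate double var1 = 0.1
generate double var2 = var1 - big + big

* exact equality fails because of rounding in var1 - big;
* capture keeps the failed assert from stopping the do-file
capture assert var2 == var1
display _rc    // nonzero when the assertion is false

* comparison within a tolerance (1e-12 is an arbitrary choice)
assert reldif(var2, var1) < 1e-12
```

Whether a tolerance is appropriate, and how large it should be, depends on the magnitudes involved in the actual data.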