Dummy equals 1 if a row has the same entry as another row

I have an employment history dataset that looks like this:

Year    PersonID    Company
2010      a            1
2010      b            1
2010      c            2
2010      d            3
2010      e            1
2011      a            2
2011      b            1
2011      c            2
2011      d            3
2011      e            1

I want to create a variable that equals 1 if the person works at the same company as Person a in that year. Note that Person a may change jobs over time.

The result should look like this:

Year    PersonID    Company     SameAsA
2010      a            1          1
2010      b            1          1
2010      c            2          0
2010      d            3          0
2010      e            1          1
2011      a            2          1
2011      b            1          0
2011      c            2          1
2011      d            3          0
2011      e            1          0

How can I generate the variable "SameAsA"?

Not very elegant, but the following seems to do what you want.

clear
input Year  str3 PersonID   Company
2010    a   1
2010    b   1
2010    c   2
2010    d   3
2010    e   1
2011    a   2
2011    b   1
2011    c   2
2011    d   3
2011    e   1
end

bysort Year: gen company_a = Company if PersonID == "a"
bysort Year: egen max = max(company_a)

gen     SameAsA = 0
replace SameAsA = 1 if Company == max

drop company_a max

list

     +-------------------------------------+
     | Year   PersonID   Company   SameAsA |
     |-------------------------------------|
  1. | 2010          a         1         1 |
  2. | 2010          b         1         1 |
  3. | 2010          c         2         0 |
  4. | 2010          d         3         0 |
  5. | 2010          e         1         1 |
     |-------------------------------------|
  6. | 2011          a         2         1 |
  7. | 2011          b         1         0 |
  8. | 2011          c         2         1 |
  9. | 2011          d         3         0 |
 10. | 2011          e         1         0 |
     +-------------------------------------+
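
The same two-step idea (find a's company in each year, then compare) can probably be written more compactly. The following is only a sketch, not a replacement for the code above: it assumes each year contains at most one row for Person a, and the helper variable name company_a is just illustrative. If a year has no row for Person a, company_a is missing there, so SameAsA would be 1 only for rows whose Company is itself missing.

```stata
clear
input Year  str3 PersonID   Company
2010    a   1
2010    b   1
2010    c   2
2010    d   3
2010    e   1
2011    a   2
2011    b   1
2011    c   2
2011    d   3
2011    e   1
end

* Spread a's company to every row of the same year;
* cond() returns missing for everyone else, and max() ignores missings
bysort Year: egen company_a = max(cond(PersonID == "a", Company, .))

* A true-or-false expression evaluates to 1 or 0, so no replace is needed
gen byte SameAsA = (Company == company_a)

drop company_a
list, sepby(Year)
```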

You want an indicator variable for a being at the company at a given time. @Cybernike's approach can be telescoped thus:

clear
input Year  str3 PersonID   Company
2010    a   1
2010    b   1
2010    c   2
2010    d   3
2010    e   1
2011    a   2
2011    b   1
2011    c   2
2011    d   3
2011    e   1
end

bysort Year Company : egen wanted = max(PersonID == "a") 

list, sepby(Year Company) 

     +------------------------------------+
     | Year   PersonID   Company   wanted |
     |------------------------------------|
  1. | 2010          e         1        1 |
  2. | 2010          a         1        1 |
  3. | 2010          b         1        1 |
     |------------------------------------|
  4. | 2010          c         2        0 |
     |------------------------------------|
  5. | 2010          d         3        0 |
     |------------------------------------|
  6. | 2011          e         1        0 |
  7. | 2011          b         1        0 |
     |------------------------------------|
  8. | 2011          c         2        1 |
  9. | 2011          a         2        1 |
     |------------------------------------|
 10. | 2011          d         3        0 |
     +------------------------------------+
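
To see why the one-liner works, it may help to unpack it. The expression PersonID == "a" evaluates to 1 for a's rows and 0 otherwise, so its maximum within each Year and Company group is 1 exactly when a is in that group. The sketch below spells this out and cross-checks it against the per-year comparison; the names is_a and company_a are illustrative, and the check assumes Person a appears once in every year.

```stata
clear
input Year  str3 PersonID   Company
2010    a   1
2010    b   1
2010    c   2
2010    d   3
2010    e   1
2011    a   2
2011    b   1
2011    c   2
2011    d   3
2011    e   1
end

* Step 1: flag a's own rows (1 if the row is Person a, 0 otherwise)
gen byte is_a = (PersonID == "a")

* Step 2: a group's maximum is 1 if any row in that group is flagged
bysort Year Company: egen wanted = max(is_a)

* Cross-check: compare each row's company with a's company that year
bysort Year: egen company_a = max(cond(PersonID == "a", Company, .))
assert wanted == (Company == company_a)

drop is_a company_a
```

If the assert line runs silently, the two constructions agree on this dataset.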

For more discussion, see this FAQ and this tutorial review.