Stata: Analyzing Queue Length for Different Stores

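Suppose I have a list of stores (e.g., Walmart, Costco, etc.), along with: 1. data on their opening hours, 2. data on each customer who enters the store, and 3. data on each customer who pays and leaves the store. (For 2 and 3, customers may wait outside before the store opens and may leave after it closes; these cases are not excluded from the data.) How can I calculate the queue (i.e., the number of customers in the store) at each hourly mark during opening hours?

Initially, I thought I could look at each data point and work out the number of customers at that particular time mark (which is not too tricky). However, that does not account for the other hours during which the store is open. That is, saying there are 4 people in the store at 3 p.m. is not enough, because it says nothing about 4, 5, 6 p.m., and so on, even if the number of people stays the same. In short, I am not sure how to find the number of customers in each hour the store is open. Also, I assume I need to create a separate data frame to store this information, since its size will not match the dataset I currently have.

I have pasted some sample data below: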

input str15 store_name double(open close customer_arrival customer_leave)
"Walmart" 15may2020 13:00:00 15may2020 22:00:00 15may2020 20:40:00 15may2020 22:51:00
"Costco" 15may2020 19:00:00 16may2020 4:00:00 15may2020 21:31:00 16may2020 1:10:00
"Costco" 15may2020 19:00:00 16may2020 4:00:00 16may2020 1:32:00 16may2020 7:40:00
"Costco" 15may2020 19:00:00 16may2020 4:00:00 15may2020 20:52:00 16may2020 00:42:00
"Target" 16may2020 03:00:00 16may2020 12:00:00 16may2020 02:13:00 16may2020 04:47:00
"Target" 16may2020 03:00:00 16may2020 12:00:00 16may2020 07:28:00 16may2020 13:55:00

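For example, for Walmart on the given date, nobody is in the store from 13:00 to 20:40, which means the buckets from 13:00 through 20:00 would show 0. The 21:00 and 22:00 buckets would then show 1 person in the queue, and 23:00 would show none again.

Any advice on how to proceed would be much appreciated. If I have left out any details, or if I can be clearer, please let me know. Cheers.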

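Here is one trick that does it. It first expands the data to one row per customer per opening hour, which makes the dataset much larger, so if your dataset is already large to begin with, you may run into memory problems.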

* Example generated by -dataex-. To install: ssc install dataex
clear
input str15 store_name double(open close customer_arrival customer_leave)
"Walmart" 1.9051668e+12  1.905201e+12 1.9051944e+12 1905202260000
"Costco"  1.9051884e+12 1.9052208e+12 1905197460000 1.9052106e+12
"Costco"  1.9051884e+12 1.9052208e+12 1905211920000  1.905234e+12
"Costco"  1.9051884e+12 1.9052208e+12 1905195120000 1905208920000
"Target"  1.9052172e+12 1.9052496e+12 1905214380000 1905223620000
"Target"  1.9052172e+12 1.9052496e+12 1905233280000 1.9052565e+12
end
format %tc open
format %tc close
format %tc customer_arrival
format %tc customer_leave

// create customer id
gen customer_id = _n

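// create hourly dataset: one row per customer per opening hour
// (24 is added to the hour difference when the store closes after midnight)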
gen hours = cond(hh(open) < hh(close), hh(close) - hh(open), hh(close) + 24 - hh(open))
expand hours
sort store_name open close
bysort store_name customer_id (open close): gen double start = open + 3600000 * (_n - 1)
bysort store_name customer_id (open close): gen double end = start + 3600000
format start end %tc
drop open close hours

// flag customers whose visit overlaps the hourly interval from start to end
gen in_store = (customer_arrival < start & customer_leave >= start) | (customer_arrival >= start & customer_arrival < end)

// sum customers in store per hour
collapse (sum) in_store, by(store_name start end)

list, sepby(store_name) noobs

  +---------------------------------------------------------------+
  | store_~e                start                  end   in_store |
  |---------------------------------------------------------------|
  |   Costco   15may2020 19:00:00   15may2020 20:00:00          0 |
  |   Costco   15may2020 20:00:00   15may2020 21:00:00          1 |
  |   Costco   15may2020 21:00:00   15may2020 22:00:00          2 |
  |   Costco   15may2020 22:00:00   15may2020 23:00:00          2 |
  |   Costco   15may2020 23:00:00   16may2020 00:00:00          2 |
  |   Costco   16may2020 00:00:00   16may2020 01:00:00          2 |
  |   Costco   16may2020 01:00:00   16may2020 02:00:00          2 |
  |   Costco   16may2020 02:00:00   16may2020 03:00:00          1 |
  |   Costco   16may2020 03:00:00   16may2020 04:00:00          1 |
  |---------------------------------------------------------------|
  |   Target   16may2020 03:00:00   16may2020 04:00:00          1 |
  |   Target   16may2020 04:00:00   16may2020 05:00:00          1 |
  |   Target   16may2020 05:00:00   16may2020 06:00:00          0 |
  |   Target   16may2020 06:00:00   16may2020 07:00:00          0 |
  |   Target   16may2020 07:00:00   16may2020 08:00:00          1 |
  |   Target   16may2020 08:00:00   16may2020 09:00:00          1 |
  |   Target   16may2020 09:00:00   16may2020 10:00:00          1 |
  |   Target   16may2020 10:00:00   16may2020 11:00:00          1 |
  |   Target   16may2020 11:00:00   16may2020 12:00:00          1 |
  |---------------------------------------------------------------|
  |  Walmart   15may2020 13:00:00   15may2020 14:00:00          0 |
  |  Walmart   15may2020 14:00:00   15may2020 15:00:00          0 |
  |  Walmart   15may2020 15:00:00   15may2020 16:00:00          0 |
  |  Walmart   15may2020 16:00:00   15may2020 17:00:00          0 |
  |  Walmart   15may2020 17:00:00   15may2020 18:00:00          0 |
  |  Walmart   15may2020 18:00:00   15may2020 19:00:00          0 |
  |  Walmart   15may2020 19:00:00   15may2020 20:00:00          0 |
  |  Walmart   15may2020 20:00:00   15may2020 21:00:00          1 |
  |  Walmart   15may2020 21:00:00   15may2020 22:00:00          1 |
  +---------------------------------------------------------------+
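
Note that the counts above are per hourly interval: a customer is counted in every hour that their visit overlaps, so Walmart shows 1 for 20:00 to 21:00 even though that customer only arrives at 20:40. If the count at each hourly mark is wanted instead, as in the question's Walmart example, a minimal sketch might look like the one below. It makes a few assumptions that are mine rather than the question's: the variable names marks and mark are made up for illustration, a customer is treated as present from arrival up to (but not including) the moment they leave, and the marks run from opening time through the last whole hour at or before closing time.

```stata
* Sketch: customers present at each hourly mark (a snapshot), not per interval
clear
input str15 store_name double(open close customer_arrival customer_leave)
"Walmart" 1.9051668e+12  1.905201e+12 1.9051944e+12 1905202260000
"Costco"  1.9051884e+12 1.9052208e+12 1905197460000 1.9052106e+12
"Costco"  1.9051884e+12 1.9052208e+12 1905211920000  1.905234e+12
"Costco"  1.9051884e+12 1.9052208e+12 1905195120000 1905208920000
"Target"  1.9052172e+12 1.9052496e+12 1905214380000 1905223620000
"Target"  1.9052172e+12 1.9052496e+12 1905233280000 1.9052565e+12
end
format %tc open close customer_arrival customer_leave

// one id per customer record
gen long customer_id = _n

// number of hourly marks from opening through closing, inclusive
// (3600000 milliseconds per hour; "marks" is a hypothetical name)
gen marks = floor((close - open) / 3600000) + 1

// one row per customer per hourly mark
expand marks
bysort customer_id: gen double mark = open + 3600000 * (_n - 1)
format mark %tc

// assumption: present if arrived at or before the mark and not yet left
gen byte in_store = customer_arrival <= mark & customer_leave > mark

// total customers in the store at each mark
collapse (sum) in_store, by(store_name mark)

list, sepby(store_name) noobs
```

With the sample data, this should give 0 for Walmart at 20:00 and 1 at both 21:00 and 22:00, which is the pattern described in the question. It expands the data in the same way as the code above, so the same memory caveat applies.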