Q: Stata: append by ID and timestamp

Stack Overflow user

Asked 2020-10-11 03:16:23

1 answer, 116 views, 0 followers, 1 vote

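I have two datasets. The first contains product assortment information at the grocery store/day level; it reflects all products available in a store on a given day.

The second contains data on the individuals who visited those stores on a given date.

As screenshot 2 shows, one individual (highlighted, panid=1101758) bought only 2 products, Michelob and Sam Adams, in week 1677 2 at store 234140, although we know that on that same day this person had 4 options in total at the store, namely 2 additional Budweisers (screenshot 1, highlighted observations).

I need to merge/append these two datasets for each individual by store/day, so that the final dataset shows that the individual made those two purchases and also had two more options available at that store/day. That individual would therefore have 4 observations: 2 purchased products and 2 other available options. I have many different stores, days, and individuals.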

// Store data: one row per product available in a store on a given day
clear
input store day str9 brand
1 1 "Bud"
1 1 "Bud"
1 1 "Michelob"
1 1 "Sam Adams"
1 1 "Coors"
end


// Household data: one row per product bought by a household (hh)
clear
input hh store day str9 brand
1 1 1 "Michelob"
1 1 1 "Sam Adams"
2 1 1 "Bud"
2 1 1 "Bud"
3 1 1 "Coors"
end

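In the Stata code above, you can see that a different individual bought 2 Budweisers. The same treatment must apply to that person: the data should show that they had 4 options (Michelob, Sam Adams, Bud, Bud) but ended up choosing only the 2 Buds.

Here is an example of the final result I would like to obtain: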

// Desired result: choice = 1 if bought, 0 if available but not bought
clear
input hh store day str9 brand choice
1 1 1 "Michelob" 1
1 1 1 "Sam Adams" 1
1 1 1 "Bud" 0
1 1 1 "Bud" 0
1 1 1 "Coors" 0
2 1 1 "Bud" 1
2 1 1 "Bud" 1
2 1 1 "Michelob" 0
2 1 1 "Sam Adams" 0
2 1 1 "Coors" 0
3 1 1 "Coors" 1
3 1 1 "Michelob" 0
3 1 1 "Sam Adams" 0
3 1 1 "Bud" 0
3 1 1 "Bud" 0
end

stata

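1 Answer

Stack Overflow user

Answered 2020-10-12 15:49:56

Here is one way. It involves creating an indicator for duplicate products within store and day, using joinby to create all possible combinations of hh and products by store and day, and finally merging to get the choice variable.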
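To see the joinby step in isolation, here is a minimal sketch on a two-household, two-product toy dataset. The tempfile name `visitors` and the reduced data are assumptions made for illustration only; they are not part of the original question or answer.

```stata
// Toy list of households that visited store 1 on day 1 (hypothetical data)
clear
input store day hh
1 1 1
1 1 2
end
tempfile visitors
save `visitors'

// Toy list of products available in the same store/day
clear
input store day str9 brand
1 1 "Bud"
1 1 "Coors"
end

// joinby forms every pairwise combination within each store/day group,
// so 2 products x 2 households gives 4 rows
joinby store day using `visitors'
assert _N == 4

sort hh brand
list, noobs sepby(hh)
```

By default joinby drops observations that have no match in the other file, so a store/day with no visiting household would disappear from the result.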

// Import hh data
clear
input hh store day str9 brand
1 1 1 "Michelob"
1 1 1 "Sam Adams"
2 1 1 "Bud"
2 1 1 "Bud"
3 1 1 "Coors"
end

// Number repeated purchases of the same brand within each hh/store/day, so
// that two households buying the same brand both start counting at 1
bysort hh store day brand: gen n_brand = _n
gen choice = 1

tempfile hh hh_join
save `hh'

// Create dataset for use with joinby to create all possible combinations
// of hh and products per day/store
drop brand n_brand choice
duplicates drop
save `hh_join'

// Import store data
clear
input store day str9 brand
1 1 "Bud"
1 1 "Bud"
1 1 "Michelob"
1 1 "Sam Adams"
1 1 "Coors"
end

// Create number of duplicate products for merging
bysort store day brand: gen n_brand = _n

// Create all possible combinations of hh and products per day/store
joinby store day using `hh_join'
order hh store day brand n_brand
sort hh store day brand n_brand

// Merge with hh data to get choice variable
merge 1:1 hh store day brand n_brand using `hh'
drop _merge

// Replace choice with 0 if missing
replace choice = 0 if missing(choice)

list, noobs sepby(hh)

The result is:

. list, noobs sepby(hh)

  +-------------------------------------------------+
  | hh   store   day       brand   n_brand   choice |
  |-------------------------------------------------|
  |  1       1     1         Bud         1        0 |
  |  1       1     1         Bud         2        0 |
  |  1       1     1       Coors         1        0 |
  |  1       1     1    Michelob         1        1 |
  |  1       1     1   Sam Adams         1        1 |
  |-------------------------------------------------|
  |  2       1     1         Bud         1        1 |
  |  2       1     1         Bud         2        1 |
  |  2       1     1       Coors         1        0 |
  |  2       1     1    Michelob         1        0 |
  |  2       1     1   Sam Adams         1        0 |
  |-------------------------------------------------|
  |  3       1     1         Bud         1        0 |
  |  3       1     1         Bud         2        0 |
  |  3       1     1       Coors         1        1 |
  |  3       1     1    Michelob         1        0 |
  |  3       1     1   Sam Adams         1        0 |
  +-------------------------------------------------+
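The listing above can be checked mechanically. The following is a minimal sketch that re-enters the listed result and asserts three properties: the merge key is unique, every household sees all 5 products of the store/day, and the number of chosen products per household matches the household data. The variable names `n_options` and `n_chosen` are hypothetical helpers introduced here, and the hard-coded counts apply only to this toy example.

```stata
// Re-enter the result shown above
clear
input hh store day str9 brand n_brand choice
1 1 1 "Bud" 1 0
1 1 1 "Bud" 2 0
1 1 1 "Coors" 1 0
1 1 1 "Michelob" 1 1
1 1 1 "Sam Adams" 1 1
2 1 1 "Bud" 1 1
2 1 1 "Bud" 2 1
2 1 1 "Coors" 1 0
2 1 1 "Michelob" 1 0
2 1 1 "Sam Adams" 1 0
3 1 1 "Bud" 1 0
3 1 1 "Bud" 2 0
3 1 1 "Coors" 1 1
3 1 1 "Michelob" 1 0
3 1 1 "Sam Adams" 1 0
end

// The merge key must identify each row uniquely
isid hh store day brand n_brand

// Every household should see all 5 products offered in the store/day
bysort hh store day: gen n_options = _N
assert n_options == 5

// Purchases per household should match the hh data (2, 2 and 1)
bysort hh store day: egen n_chosen = total(choice)
assert n_chosen == 2 if inlist(hh, 1, 2)
assert n_chosen == 1 if hh == 3
```

On real data, a similar guard could sit in the pipeline itself: adding the `assert(master match)` option to the `merge 1:1` command makes Stata stop with an error if a purchase in the household file has no matching product row in the store data.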

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/64297305
