我有以下数据库。Agreement_id代表协议。如果两家公司拥有相同的agreement_id,这意味着他们已经签署了一项协议。第一排和第二排的agreements_id = 1,第3、第4、第5和第6行的agreement_id =2,这意味着这4家公司已经签订了协议。
ID agreement_id firm firm_id year
1 1 RMC Group PLC 74961D 1980
2 1 Lafarge Asland SA 50590P 1980
3 2 Cable & Wireless Communicati 12682P 1981
4 2 Portugal Telecom SA 737273 1981
5 2 CITIC 1616 Holdings Ltd 17339L 1981
6 2 Macau Post 1A7458 1981
7 3 Zarubezhneft' 98914R 1981
8 3 PetroVietnam 71675K 1981
9 4 Suntory Ltd 86787Q 1981
10 4 Anheuser-Busch Cos Inc 035229 1981
11 5 Daimler-Benz Aerospace AG 23384K 1981
12 5 Finmeccanica SpA 318027 1981
13 6 Canadian Oil Sands Ltd 13643E 1982
14 6 Imperial Oil Ltd 453038 1982
15 6 Suncor Energy Inc-Certain Asts3A3559 1982
16 6 Nexen Inc-Chemical Division 05829M 1982
17 6 Mocal Energy Ltd 55441R 1982
18 6 Murphy Oil Corp 626717 1982
19 6 China Petroleum & Chemical 16941R 1982我所想做的就是构建dyads,并将唯一的firm_ids (已经签订合同的公司--相同的agreement_id --站在同一列)分成两个单独的列。换句话说,我需要相同的一致性_id的行组合。
ID agreement_id firm_id_1 firm_id_2 year
1 1 74961D 50590P 1980
2 2 12682P 737273 1981
3 2 12682P 17339L 1981
4 2 12682P 1A7458 1981
5 2 737273 17339L 1981
6 2 737273 1A7458 1981
7 2 17339L 1A7458 1981我认为,在这方面,最困难的任务是将多边协定转变为发展中国家。
这个问题的最后一部分是,我希望使用firm_id_2作为焦点公司,换句话说,使用第二家公司作为焦点公司(重复每一行,并改变事务所_id_1和事务所_id_2的值)。这是输出:
ID agreement_id firm_id_1 firm_id_2 year
1 1 74961D 50590P 1980
2 1 50590P 74961D 1980
3 2 12682P 737273 1981
4 2 737273 12682P 1981
5 2 12682P 17339L 1981
6 2 17339L 12682P 1981任何帮助都是非常感谢的。
提前谢谢。
发布于 2019-09-13 12:41:03
我用R解决了问题:
colnames(df) <- c("X", "ID","Agreement", "Firm", "FirmID", "Year") # assigning column names to dta_out
dta_inp <- df
# merging dta_inp with itself on agreement and year
dta_out <- merge(dta_inp, dta_inp, by.x = c("Agreement", "Year"), by.y = c("Agreement", "Year"))
# determining values for which FirmID.x == FirmID.y
dta_out <- cbind(dta_out, out = (dta_out$FirmID.x == dta_out$FirmID.y))
# removing values for which FirmID.x == FirmID.y
dta_out <- dta_out[dta_out$out == FALSE, ][,-5] 删除不必要的列并按协议进行排序之后:
Agreement Year FirmID.X Firm.y FirmID.y
1 1980 74961D Lafarge Asland SA 50590P
1 1980 50590P RMC Group PLC 74961D
2 1981 12682P Portugal Telecom SA 737273
2 1981 12682P CITIC 1616 Holdings Ltd 17339L
2 1981 12682P Macau Post 1A7458
2 1981 737273 Cable & Wireless Communicati 12682P
2 1981 737273 CITIC 1616 Holdings Ltd 17339L
2 1981 737273 Macau Post 1A7458
2 1981 17339L Cable & Wireless Communicati 12682P
2 1981 17339L Portugal Telecom SA 737273
2 1981 17339L Macau Post 1A7458
2 1981 1A7458 Cable & Wireless Communicati 12682P
2 1981 1A7458 Portugal Telecom SA 737273
2 1981 1A7458 CITIC 1616 Holdings Ltd 17339LGithub链接:https://gist.github.com/farid-mammadaliyev/2163b206435f71574b4c54966d18539c
https://stackoverflow.com/questions/57758652
复制相似问题