我在MySQL/Dataframe中有下面提到的表格(MySQL版本- 5.7.18):
Table_1:
ID Date uid
I-1 2020-01-01 10:12:15 K-1
I-2 2020-01-02 10:12:15 K-1
I-3 2020-02-01 10:12:15 K-2
I-4 2020-02-02 10:12:15 K-3
I-5 2020-02-04 10:12:15 K-4
I-6 2019-11-01 10:12:15 K-4
I-7 2019-11-01 10:12:15 K-3
I-8 2018-12-13 10:12:15 K-5
I-9 2019-05-17 10:12:15 K-4
I-19 2020-03-11 10:12:15 K-7 Table_2:
ID city code
I-1 New York 123
I-2 Washington 122
I-3 Tokyo 123
I-4 London 144
I-5 Dubai 101
I-6 Dubai 101
I-7 London 144
I-8 Tokyo 143
I-9 Dubai 101
I-19 Dubai 150使用上述表,我希望在2020年1月1日至2002年2月29日之间获取记录,并比较整个数据库中的ID,以检查city和code是否与其他ID相匹配,并进一步对其进行分类,以检查有多少具有相同的uid,有多少有不同的。
哪里,
city和code与数据库中其他ID的匹配组合Match uid进行分类,以确定有多少ID具有相似的uidMatch uid进行分类,以确定有多少ID没有类似的uidID的类似uid所需输出
ID Date city code uid Match Same_uid different_uid uid_count
I-1 2020-01-01 10:12:15 New York 123 K-1 No 0 0 2
I-2 2020-01-02 10:12:15 Washington 122 K-1 No 0 0 2
I-3 2020-02-01 10:12:15 Tokyo 123 K-2 No 0 0 1
I-4 2020-02-02 10:12:15 London 144 K-3 Yes 1 0 2
I-5 2020-02-04 10:12:15 Dubai 101 K-4 Yes 2 0 3 发布于 2020-08-04 12:31:21
似乎(经过纠正)你需要
SELECT t1.ID,
t1.`Date`,
t1. city,
t1.code,
t1.uid,
CASE WHEN SUM((t1.city = t2.city) * (t1.code = t2.code)) - 1
THEN 'Yes'
ELSE 'No' END `Match`,
SUM((t1.city = t2.city) * (t1.code = t2.code) * (t1.uid = t2.uid)) - 1 same_uid,
SUM((t1.city = t2.city) * (t1.code = t2.code) * (t1.uid != t2.uid)) different_uid,
SUM(t1.uid = t2.uid) uid_count
FROM cities t1
CROSS JOIN cities t2
WHERE t1.`Date` >= '2020-01-01' AND t1.`Date` < '2020-03-01'
GROUP BY t1.ID, t1.`Date`, t1. city, t1.code, t1.uid
ORDER BY t1.IDPS。SUM()s中的多个可替换为AND。
如果我把这些信息放在两个不同的表格里
SELECT t11.ID,
t11.`Date`,
t21.city,
t21.code,
t11.uid,
CASE WHEN SUM((t21.city = t22.city) * (t21.code = t22.code)) - 1
THEN 'Yes'
ELSE 'No' END `Match`,
SUM((t21.city = t22.city) * (t21.code = t22.code) * (t11.uid = t12.uid)) - 1 same_uid,
SUM((t21.city = t22.city) * (t21.code = t22.code) * (t11.uid != t12.uid)) different_uid,
SUM(t11.uid = t12.uid) uid_count
FROM /* cities t1 */
(Table_1 t11 NATURAL JOIN Table_2 t21)
CROSS JOIN /* cities t2 */
(Table_1 t12 NATURAL JOIN Table_2 t22)
WHERE t11.`Date` >= '2020-01-01' AND t11.`Date` < '2020-03-01'
GROUP BY t11.ID, t11.`Date`, t21.city, t21.code, t11.uid
ORDER BY t11.IDhttps://stackoverflow.com/questions/63245789
复制相似问题