日常工作中我们需要把不同数据来源的数据,合并到一起,不同的数据来源,不同的业务逻辑合并方式也不同。日常遇到的需求有以下几种:
已知t1,t2两张表,其中数据不重复,请把两表数据合并
--有t1表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
+-----+-------+--------+
--t2表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 3 | cc | 92 |
| 4 | dd | 86 |
+-----+-------+--------+
SQL 这种数据,数据我们直接使用union all进行合并。
select id,
name,
score
from t1
union all
select id,
name,
score
from t2;执行结果
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
| 3 | cc | 92 |
| 4 | dd | 86 |
+-----+-------+--------+已知有t1,t3两张表,其中数据(可能)存在重复,要求结果不重复。
--有t1表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
+-----+-------+--------+
--t3表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 4 | dd | 86 |
+-----+-------+--------+
因为需要去重,所以直接使用union
SQL
select id,
name,
score
from t1
union
select id,
name,
score
from t3;执行结果
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
| 4 | dd | 86 |
+-----+-------+--------+已知有t1,t4两张表,其中id存在重复,如果存在重复id,优先取t1表数据。
--有t1表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
+-----+-------+--------+
--t4表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 55 |
| 4 | dd | 86 |
+-----+-------+--------+因为需要根据不同的数据来源确定优先级,所以
SQL
select id,
name,
score,
source
from (select id,
name,
score,
source,
row_number() over (partition by id order by source asc) as rn
from (select id,
name,
score,
1 as source
from t1
union all
select id,
name,
score,
4 as source
from t4) tt) ttt
where ttt.rn = 1执行结果
+-----+-------+--------+---------+
| id | name | score | source |
+-----+-------+--------+---------+
| 1 | aa | 90 | 1 |
| 2 | bb | 87 | 1 |
| 4 | dd | 86 | 4 |
+-----+-------+--------+---------+注意: 这里我们给出了source代表数据行来源数据哪个表。实际可以不带该列。
已知有t1,t5两张表,其中id存在重复,对于相同id的得分,name字段优先取t1表,score字段优先取t5表
--有t1表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
+-----+-------+--------+
--t5表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | ee | 67 |
| 4 | dd | 86 |
+-----+-------+--------+这里我们使用coalesce(expr1, expr2, ...)函数,从左到右依次取值,优先取值的字段放到前面。
SQL
select coalesce(t1.id, t5.id) as id,
coalesce(t1.name, t5.name) as name,
coalesce(t5.score, t1.score) as score
from t1
full outer join t5
on t1.id = t5.id执行结果
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 67 |
| 2 | bb | 87 |
| 4 | dd | 86 |
+-----+-------+--------+已知有t1,t5两张表,其中id存在重复,对于相同id的得分,name取字母序大的,score取分值小的;
--有t1表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
+-----+-------+--------+
--t5表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | ee | 67 |
| 4 | dd | 86 |
+-----+-------+--------+这里涉及到了函数
greatest(expr, ...) 取参数中的最大值
least(expr, ...)取参数中的最小值
SQL
select coalesce(t1.id, t5.id) as id,
greatest(t1.name, t5.name) as name,
least(t1.score, t5.score) as score
from t1
full outer join t5
on t1.id = t5.id执行结果
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | ee | 67 |
| 2 | bb | 87 |
| 4 | dd | 86 |
+-----+-------+--------+已知有t1,t5两张表,其中id存在重复,对于相同id的得分,name取score分值大的,score取分值大的。
--有t1表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
+-----+-------+--------+
--t5表如下
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | ee | 67 |
| 4 | dd | 86 |
+-----+-------+--------+这里需要判断条件,可以使用if或者case when,这里需要注意取值的时候需要coalsece()处理。 欢迎留言说出原因
SQL
select coalesce(t1.id, t5.id) as id,
case when t1.score > t5.score then coalesce(t1.name,t5.name) else coalesce(t5.name,t1.name) end as name,
greatest(t1.score, t5.score) as score
from t1
full outer join t5
on t1.id = t5.id执行结果
+-----+-------+--------+
| id | name | score |
+-----+-------+--------+
| 1 | aa | 90 |
| 2 | bb | 87 |
| 4 | dd | 86 |
+-----+-------+--------+