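I have a PostgreSQL table with a column that contains arrays of strings. In some rows the array holds only unique strings, in others it holds duplicates. Where duplicates exist, I want to remove them from that row's array.
I have tried a few queries but could not get this to work.
The table looks like this: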
veh_id | vehicle_types
--------+----------------------------------------
1 | {"byd_tang","volt","viper","laferrari"}
2 | {"volt","viper"}
3 | {"byd_tang","sonata","jaguarxf"}
4 | {"swift","teslax","mirai"}
5 | {"volt","viper"}
6 | {"viper","ferrariff","bmwi8","viper"}
7 | {"ferrariff","viper","viper","volt"}
I expect the following result:
veh_id | vehicle_types
--------+----------------------------------------
1 | {"byd_tang","volt","viper","laferrari"}
2 | {"volt","viper"}
3 | {"byd_tang","sonata","jaguarxf"}
4 | {"swift","teslax","mirai"}
5 | {"volt","viper"}
6 | {"viper","ferrariff","bmwi8"}
7 | {"ferrariff","viper","volt"}
Answered 2019-09-05 22:43:11
Since each row's array is independent, a plain correlated subquery with an ARRAY constructor does the job:
SELECT *, ARRAY(SELECT DISTINCT unnest (vehicle_types)) AS vehicle_types_uni
FROM vehicle;
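Note that NULL is converted to an empty array ('{}'). We would need to special-case it, but it is excluded in the UPDATE below anyway.
Fast and simple. But don't use this. You didn't say so, but typically you will want to preserve the original order of array elements, and your sample data suggests as much. Use WITH ORDINALITY in the correlated subquery, which makes it a bit more involved: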
SELECT *, ARRAY (SELECT v
                 FROM   unnest(vehicle_types) WITH ORDINALITY t(v,ord)
                 GROUP  BY 1
                 ORDER  BY min(ord)
                ) AS vehicle_types_uni
FROM   vehicle;
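UPDATE to actually remove the dupes:
UPDATE vehicle
SET    vehicle_types = ARRAY (
          SELECT v
          FROM   unnest(vehicle_types) WITH ORDINALITY t(v,ord)
          GROUP  BY 1
          ORDER  BY min(ord)
       )
WHERE  cardinality(vehicle_types) > 1  -- optional
AND    vehicle_types <> ARRAY (
          SELECT v
          FROM   unnest(vehicle_types) WITH ORDINALITY t(v,ord)
          GROUP  BY 1
          ORDER  BY min(ord)
       );  -- suppress empty updates (optional)
Both added WHERE conditions are optional and only there to improve performance. The first one is completely redundant, since the second already filters out every row it would. Each condition also excludes the NULL case. The second one suppresses all empty updates, that is, updates that would write back an unchanged array.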
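If you try this without preserving the original order, you will likely update most rows needlessly, just because the order of elements changed even where there are no dupes.
Requires Postgres 9.4 or later.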
db<>fiddle https://dbfiddle.uk/?rdbms=postgres_12&fiddle=fe638c315af1193aa28d8150f421f86a
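For a quick check without touching the real table, the order-preserving query and the NULL caveat mentioned above can be tried against an inline VALUES list. This is only a sketch: rows 6 and 7 are taken from the question, while row 8 (a NULL array) and the column alias vehicle_types_uni_null_safe are made up for illustration.

```sql
-- Self-contained sketch: the VALUES list stands in for the real table.
-- Rows 6 and 7 come from the question; row 8 (NULL) is a hypothetical extra row.
SELECT veh_id,
       -- order-preserving de-duplication; a NULL array comes out as '{}'
       ARRAY (
          SELECT v
          FROM   unnest(vehicle_types) WITH ORDINALITY t(v, ord)
          GROUP  BY 1
          ORDER  BY min(ord)
       ) AS vehicle_types_uni,
       -- same expression wrapped in CASE, so a NULL array stays NULL
       CASE WHEN vehicle_types IS NOT NULL THEN
          ARRAY (
             SELECT v
             FROM   unnest(vehicle_types) WITH ORDINALITY t(v, ord)
             GROUP  BY 1
             ORDER  BY min(ord)
          )
       END AS vehicle_types_uni_null_safe
FROM  (VALUES
          (6, '{viper,ferrariff,bmwi8,viper}'::text[]),
          (7, '{ferrariff,viper,viper,volt}'::text[]),
          (8, NULL)
      ) AS vehicle(veh_id, vehicle_types)
ORDER  BY veh_id;
```

Rows 6 and 7 should come back as {viper,ferrariff,bmwi8} and {ferrariff,viper,volt}, matching the expected result in the question. For row 8 the first column is '{}' and the second is NULL, which is the special-casing a plain SELECT would need if NULL must be kept.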
Answered 2019-09-05 18:05:34
I don't think it is efficient, but something like this might work:
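with expanded as (
  select veh_id, unnest (vehicle_types) as vehicle_type
  from vehicles
)
select veh_id, array_agg (distinct vehicle_type)
from expanded
group by veh_id;
If you really want to get fancy and do it in a single pass over each array (roughly O(n) for short arrays, although the = any lookup scans the output array each time), you can write a custom function: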
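create or replace function unique_array(input_array text[])
returns text[] as $$
DECLARE
  output_array text[] := array[]::text[];
BEGIN
  -- append each element only if it has not been seen yet, keeping the original order;
  -- NULL elements are skipped because the comparison below yields NULL for them
  for i in 1..cardinality(input_array) loop
    if not (input_array[i] = any (output_array)) then
      output_array := output_array || input_array[i];
    end if;
  end loop;
  return output_array;
END;
$$
language plpgsql immutable strict;  -- strict: a NULL array returns NULL instead of failing on the loop bound
Usage example: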
select veh_id, unique_array(vehicle_types)
from vehicles;
https://stackoverflow.com/questions/57810811