I have a PostgreSQL table with the following schema and data:
CREATE TABLE IF NOT EXISTS T(
    id uuid PRIMARY KEY,
    username varchar(15),
    person varchar(10),
    tweets int,
    followers int,
    following int,
    likes int,
    created_at date
);
id | username | person | tweets | followers | following | likes | created_at
:----------------------------------- | :----------- | :--------- | -----: | --------: | --------: | ----: | :---------
3fa34100-d688-4051-a687-ec49d05e7212 | renok | null | 110 | 6 | 0 | 0 | 2020-10-10
bab9ceb9-2770-49ea-8489-77e5d763a223 | Lydia_C | test user2 | 515 | 1301 | 1852 | 1677 | 2020-10-10
4649077a-9188-4821-a1ec-3b38608ea44a | Kingston_Sav | null | 2730 | 1087 | 1082 | 1339 | 2020-10-10
eef80836-e140-4adc-9598-8b612ab1825b | TP_s | null | 1835 | 998 | 956 | 1832 | 2020-10-10
fd3ff8c7-0994-40b6-abe0-915368ab9ae5 | DKSnr4 | null | 580 | 268 | 705 | 703 | 2020-10-10
3fa34100-d688-4051-a687-ec49d05e7312 | renok | null | 119 | 6 | 0 | 0 | 2020-10-12
bab9ceb9-2770-49ea-8489-77e5d763a224 | Lydia_C | test user2 | 516 | 1301 | 1852 | 1687 | 2020-10-12
4649077a-9188-4821-a1ec-3b38608ea44B | Kingston_Sav | null | 2737 | 1090 | 1084 | 1342 | 2020-10-12
eef80836-e140-4adc-9598-8b612ae1835c | TP_s | null | 1833 | 998 | 957 | 1837 | 2020-10-12
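fd3ff8c7-0994-40b6-abe0-915368ab7ab5 | DKSnr4 | null | 570 | 268 | 700 | 703 | 2020-10-12

For each unique username, I want the difference between the most recent date and the next most recent date, and then the username with the largest margin (difference). For example, in the table above the most recent date is 2020-10-12 and the next most recent date is 2020-10-10.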
So I want to get something like this:
id | username | person | tweets | followers | following | likes | created_at | prev_followers | gain
:----------------------------------- | :----------- | :----- | -----: | --------: | --------: | ----: | :--------- | -------------: | ---:
4649077a-9188-4821-a1ec-3b38608ea44a | Kingston_Sav | null | 2737 | 1090 | 1084 | 1342 | 2020-10-12 | 1087 | 3

Posted on 2020-10-29 13:47:28
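Many roads lead to Rome. The query below should be a good one (fast and flexible) to "find the username with the largest margin".
This assumes that all involved columns are defined NOT NULL, and that there is at most one entry per username per day. Otherwise you have to do more.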
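Those two assumptions can be checked before relying on the query. The following is a minimal sketch, assuming the table is called tbl (the placeholder name this answer uses) and that only username, followers and created_at matter for the calculation. The constraint name tbl_username_day_uni is hypothetical.

```sql
-- Usernames with more than one entry on the same day (should return no rows)
SELECT username, created_at, count(*) AS entries
FROM   tbl
GROUP  BY username, created_at
HAVING count(*) > 1;

-- Rows with a NULL in any column the calculation depends on (should return no rows)
SELECT *
FROM   tbl
WHERE  username   IS NULL
   OR  followers  IS NULL
   OR  created_at IS NULL;

-- Optional: enforce "one entry per username per day" going forward
-- (the constraint name is made up for this example)
ALTER TABLE tbl
  ADD CONSTRAINT tbl_username_day_uni UNIQUE (username, created_at);
```

If the first two queries return rows, the data needs to be cleaned or the main query adapted before the result can be trusted.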
WITH cte AS (
   SELECT *, dense_rank() OVER (ORDER BY created_at DESC) AS rnk
   FROM   tbl
   )
SELECT d1.*
     , d2.followers AS prev_followers
     , d1.followers - d2.followers AS gain
FROM  (SELECT * FROM cte WHERE rnk = 1) d1
JOIN  (SELECT * FROM cte WHERE rnk = 2) d2 USING (username)
ORDER  BY gain DESC
        , d1.followers, username  -- added tiebreaker
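LIMIT  1;

The CTE named cte attaches rank numbers with dense_rank() (not rank(), not row_number()). Then join the latest day (rnk = 1) to the previous day (rnk = 2) and compute the gain. Obviously, a user must have entries for both days to qualify. Finally, order by gain and take the first row.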
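Why dense_rank() and not the other two: several rows share the same created_at, and only dense_rank() numbers the distinct days without gaps. A small self-contained illustration, using made-up inline values rather than the real table:

```sql
-- Two rows on the latest day, one row on the day before
SELECT created_at
     , dense_rank() OVER (ORDER BY created_at DESC) AS dense_rnk
     , rank()       OVER (ORDER BY created_at DESC) AS rnk
     , row_number() OVER (ORDER BY created_at DESC) AS rn
FROM  (VALUES (date '2020-10-12')
            , (date '2020-10-12')
            , (date '2020-10-10')) v(created_at);
```

Here dense_rank() yields 1, 1, 2, so the previous day can be addressed as 2 no matter how many rows the latest day has. rank() yields 1, 1, 3 (a gap after the ties), and row_number() yields 1, 2, 3 (peers get different numbers), so neither can be filtered with a fixed rnk = 2.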
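Note the added ORDER BY expressions that try to break possible ties: several users may have the same gain, so you have to define how to deal with that. One way is to add tiebreakers. In my example, the user with the smaller absolute number of followers is preferred (higher relative gain), and if that is still ambiguous, the alphabetically first username wins.
Again, there are many ways. Postgres 13 added the standard SQL clause WITH TIES for exactly this purpose: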
WITH cte AS (
   SELECT *, dense_rank() OVER (ORDER BY created_at DESC) AS rnk
   FROM   tbl
   )
SELECT d1.*
     , d2.followers AS prev_followers
     , d1.followers - d2.followers AS gain
FROM  (SELECT * FROM cte WHERE rnk = 1) d1
JOIN  (SELECT * FROM cte WHERE rnk = 2) d2 USING (username)
ORDER  BY gain DESC
FETCH  FIRST 1 ROWS WITH TIES;
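To see what WITH TIES does in isolation, here is a minimal sketch with made-up inline values (Postgres 13 or later). The usernames and gains are invented for the illustration.

```sql
-- 'a' and 'b' tie for the top gain; WITH TIES returns both of them
SELECT username, gain
FROM  (VALUES ('a', 3)
            , ('b', 3)
            , ('c', 1)) v(username, gain)
ORDER  BY gain DESC
FETCH  FIRST 1 ROWS WITH TIES;
```

Ties are judged by the ORDER BY list only. Adding a tiebreaker column there, as in the LIMIT 1 variant, would make every row distinct again and defeat the purpose.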
Posted on 2020-10-29 10:15:17
WITH
cte1 AS ( SELECT DISTINCT created_at
          FROM t
          ORDER BY 1 DESC LIMIT 2 ),  -- the two most recent dates
cte2 AS ( SELECT src.*,
                 -- partition by username: id is the primary key, so it is unique per row
                 LEAD(src.followers) OVER (PARTITION BY src.username
                                           ORDER BY src.created_at DESC) prev_followers,
                 src.followers - LEAD(src.followers) OVER (PARTITION BY src.username
                                                           ORDER BY src.created_at DESC) gain
          FROM t src
          JOIN cte1 ON src.created_at = cte1.created_at )
SELECT *
FROM cte2
WHERE gain IS NOT NULL
ORDER BY gain DESC LIMIT 1;

Whenever two records have the same difference (margin), it always shows one instead of two.
WITH
cte1 AS ( SELECT DISTINCT created_at
          FROM t
          ORDER BY 1 DESC LIMIT 2 ),  -- the two most recent dates
cte2 AS ( SELECT src.*,
                 -- partition by username: id is the primary key, so it is unique per row
                 LEAD(src.followers) OVER (PARTITION BY src.username
                                           ORDER BY src.created_at DESC) prev_followers,
                 src.followers - LEAD(src.followers) OVER (PARTITION BY src.username
                                                           ORDER BY src.created_at DESC) gain
          FROM t src
          JOIN cte1 ON src.created_at = cte1.created_at ),
cte3 AS ( SELECT *, RANK() OVER (ORDER BY gain DESC) rnk  -- tied gains share rank 1
          FROM cte2
          WHERE gain IS NOT NULL )
SELECT *
FROM cte3
WHERE rnk = 1;

https://dba.stackexchange.com/questions/278849