我正在学习Hive,并想编写一个优化的HiveQL/SQL查询
我的桌子是这样的:
CREATE TABLE sales (dealer VARCHAR(25), make VARCHAR(25), type VARCHAR(25), day INT);
INSERT INTO sales (dealer, make, type, day) VALUES
("Xyz", "Highlander", "SUV", "0"),
("Xyz", "Prius", "HATCH", "1"),
("Xyz", "Prius", "HATCH", "2"),
("Xyz", "Prius", "HATCH", "3"),
("Xyz", "Versa", "HATCH", "1"),
("Xyz", "Versa", "HATCH", "2"),
("Xyz", "Versa", "HATCH", "3"),
("Xyz", "S3", "SEDAN", "1"),
("Xyz", "S3", "SEDAN", "2"),
("Abc", "Forrester", "SUV", "1");给定一个“交易商”D,我希望在过去的X天中,在一个查询中计算每个“类型”的顶N "make“。
SELECT dealer, make, type, COUNT(*) AS frequency FROM sales
WHERE day > 0 AND dealer LIKE 'Xyz' GROUP BY make, type
ORDER BY frequency DESC LIMIT 5问题是,当使用GROUP BY on "make“和"type”表示前1时,我只会得到:
DEALER, MAKE, TYPE, COUNT
Xyz, Prius, Hatch, 3
Xyz, Versa, Hatch, 3
Xyz, S3, Sedan, 2
...但我想
Xyz, Prius, Hatch, 3
Xyz, S3, Sedan, 2
...对于每一个“类型”,顶部N。
有人能帮我理解如何写这样的查询吗?
SQL Fiddle http://sqlfiddle.com/#!2/df9304/5
*更新*
级别()似乎是有用的:
发布于 2014-08-26 02:29:53
在阅读了更多的文档和链接问题的提示之后:
SELECT dealer, make, rank, type FROM (
SELECT dealer, make, rank() OVER (PARTITION BY type ORDER BY count DESC) AS rank, type FROM (
SELECT dealer, make, count(*) AS count, type FROM Sales WHERE dealer = "Xyz" GROUP BY dealer, type, make
) CountedSales
) RankedSales
WHERE RankedSales.rank < 3;内部查询执行计数,中间查询执行秩(),外部查询限制排序。
销售表目录
hive> select * from Sales;
OK
Xyz Highlander SUV NULL
Xyz Highlander SUV NULL
Xyz Rouge SUV NULL
Xyz Rouge SUV NULL
Xyz Prius HATCH NULL
Xyz Prius HATCH NULL
Xyz Prius HATCH NULL
Xyz Versa HATCH NULL
Xyz S3 SEDAN NULL
Xyz S3 SEDAN NULL
Xyz S3 SEDAN NULL
Xyz A8 SEDAN NULL
Xyz A8 SEDAN NULL
Xyz A8 SEDAN NULL
Xyz A8 SEDAN NULL
Time taken: 0.054 seconds, Fetched: 15 row(s)现在是实际的查询。
hive> SELECT dealer, make, rank, type FROM (
> SELECT dealer, make, rank() OVER (PARTITION BY type ORDER BY count DESC) AS rank, type FROM (
> SELECT dealer, make, count(*) AS count, type FROM Sales WHERE dealer = "Xyz" GROUP BY dealer, type, make
> ) CountedSales
> ) RankedSales
> WHERE RankedSales.rank < 3;
...
Execution completed successfully
MapredLocal task succeeded
OK
Xyz Prius 1 HATCH
Xyz Versa 2 HATCH
Xyz A8 1 SEDAN
Xyz S3 2 SEDAN
Xyz Rouge 1 SUV
Xyz Highlander 1 SUV
Time taken: 28.491 seconds, Fetched: 6 row(s)https://stackoverflow.com/questions/25465662
复制相似问题