这将是一个很长的过程,下面是一个简短的总结:我在其计划中使用COUNT STOPKEY和ORDER BY STOPKEY的top-N查询仍然很慢,没有什么好的理由。
现在,细节。它从一个缓慢的函数开始。在现实生活中,它涉及到使用regexps的字符串操作。为了演示起见,这里有一个故意愚蠢的递归斐波纳契算法。我发现它对输入的速度相当快,大约是25,慢在30左右,在35岁的时候很可笑。
-- I repeat: Please no advice on how to do Fibonacci correctly.
-- This is slow on purpose!
CREATE OR REPLACE FUNCTION tmp_fib (
n INTEGER
)
RETURN INTEGER
AS
BEGIN
IF n = 0 OR n = 1 THEN
RETURN 1;
END IF;
RETURN tmp_fib(n-2) + tmp_fib(n-1);
END;
/现在是一些输入:名字和数字的列表。
CREATE TABLE tmp_table (
name VARCHAR2(20) UNIQUE NOT NULL,
num NUMBER(2,0)
);
INSERT INTO tmp_table (name,num)
SELECT 'Alpha', 10 FROM dual UNION ALL
SELECT 'Bravo', 11 FROM dual UNION ALL
SELECT 'Charlie', 33 FROM dual;下面是一个慢速查询的示例:使用慢速Fibonacci函数来选择其num生成具有双位数的Fibonacci数的行。
SELECT p.name, p.num
FROM tmp_table p
WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
ORDER BY p.name;这对于11和33是正确的,所以Bravo和Charlie在输出中。运行大约需要5秒,几乎所有这一切都是tmp_fib(33)的缓慢计算。因此,我希望通过将慢查询转换为top-N查询来完成更快版本的慢速查询。在N=1中,如下所示:
SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
ORDER BY p.name
)
WHERE ROWNUM <= 1;现在它返回最高的结果,Bravo。但还是要花5秒才能跑!唯一的解释是,它仍然在计算tmp_fib(33),尽管计算结果与结果无关。它应该能够确定Bravo将被输出,因此没有必要为表的其余部分测试WHERE条件。
我想,也许只是需要告诉优化器,tmp_fib是昂贵的。所以我试着告诉它,就像这样:
ASSOCIATE STATISTICS WITH FUNCTIONS tmp_fib DEFAULT COST (999999999,0,0);这会改变计划中的一些成本数字,但并不能使查询运行得更快。
如果这与版本相关,则SELECT * FROM v$version的输出:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE 11.2.0.2.0 Production
TNS for 64-bit Windows: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production下面是top-1查询的自动跟踪。它似乎声称查询花费了1秒,但事实并非如此。它跑了大约5秒。
NAME NUM
-------------------- ----------
Bravo 11
Execution Plan
----------------------------------------------------------
Plan hash value: 548796432
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 55 | 4 (25)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 55 | 4 (25)| 00:00:01 |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 55 | 4 (25)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | TMP_TABLE | 1 | 55 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=1)
3 - filter(ROWNUM<=1)
4 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("P"."NUM")),'(.)\1'))
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
27 recursive calls
0 db block gets
25 consistent gets
0 physical reads
0 redo size
593 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processedUPdATE:正如我在评论中提到的,INDEX提示非常有助于查询。即使它不能很好地转化到我的真实世界场景中,它也可以被接受为正确的答案。具有讽刺意味的是,甲骨文似乎已经从经验中吸取了教训,现在默认选择了INDEX计划;我不得不告诉它NO_INDEX复制最初的缓慢行为。
在现实场景中,我应用了一个更复杂的解决方案,将查询重写为PL/SQL函数。下面是我的技术,应用于fib问题:
CREATE OR REPLACE PACKAGE tmp_package IS
TYPE t_namenum IS TABLE OF tmp_table%ROWTYPE;
FUNCTION get_interesting_names (howmany INTEGER) RETURN t_namenum PIPELINED;
END;
/
CREATE OR REPLACE PACKAGE BODY tmp_package IS
FUNCTION get_interesting_names (howmany INTEGER) RETURN t_namenum PIPELINED IS
CURSOR c IS SELECT name, num FROM tmp_table ORDER BY name;
rec c%ROWTYPE;
outcount INTEGER;
BEGIN
OPEN c;
outcount := 0;
WHILE outcount < howmany LOOP
FETCH c INTO rec;
EXIT WHEN c%NOTFOUND;
IF REGEXP_LIKE(tmp_fib(rec.num), '(.)\1') THEN
PIPE ROW(rec);
outcount := outcount + 1;
END IF;
END LOOP;
END;
END;
/
SELECT * FROM TABLE(tmp_package.get_interesting_names(1));感谢回答者,他们阅读了问题,运行了测试,帮助我理解了执行计划,我会处理这个问题,但他们建议。
发布于 2013-05-21 22:23:42
后续评论,因为这是太大。在11.2.0.3 (OEL)下运行,您的查询:
SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
ORDER BY p.name
)
WHERE ROWNUM <= 1;
NAME NUM
-------------------- ----------
Bravo 11
Elapsed: 00:00:00.094
Plan hash value: 1058933870
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 4 (25)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
|* 2 | VIEW | | 3 | 75 | 4 (25)| 00:00:01 |
| 3 | SORT ORDER BY | | 3 | 75 | 4 (25)| 00:00:01 |
| 4 | TABLE ACCESS FULL| TMP_TABLE | 3 | 75 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=1)
2 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("NUM")),'(.)\1'))
Note
-----
- dynamic sampling used for this statement (level=2)注意您所看到的SORT ORDER BY中的变化,以及相应的rows值。将order-by移到子选择中看起来更像您的:
SELECT * FROM (
SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
ORDER BY p.name
)
WHERE REGEXP_LIKE(tmp_fib(num), '(.)\1')
)
WHERE ROWNUM <= 1;
NAME NUM
-------------------- ----------
Bravo 11
Elapsed: 00:00:07.894
Plan hash value: 548796432
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 171 (99)| 00:00:03 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 25 | 171 (99)| 00:00:03 |
|* 3 | SORT ORDER BY STOPKEY| | 1 | 25 | 171 (99)| 00:00:03 |
|* 4 | TABLE ACCESS FULL | TMP_TABLE | 1 | 25 | 170 (99)| 00:00:03 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=1)
3 - filter(ROWNUM<=1)
4 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("P"."NUM")),'(.)\1'))
Note
-----
- dynamic sampling used for this statement (level=2)不知道这在实际场景中有多有帮助或有多实用,但在这种情况下(无论如何,在我的环境中),在所有获取的列中添加一个索引--以获得完整的索引扫描而不是全表扫描--似乎会改变这种行为:
CREATE INDEX tmp_index ON tmp_table(name, num);
index TMP_INDEX created.
SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
WHERE REGEXP_LIKE(tmp_fib(p.num), '(.)\1')
ORDER BY p.name
)
WHERE ROWNUM <= 1;
NAME NUM
-------------------- ----------
Bravo 11
Elapsed: 00:00:00.093
Plan hash value: 1841475998
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 1 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
|* 2 | VIEW | | 3 | 75 | 1 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN| TMP_INDEX | 3 | 75 | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=1)
2 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("NUM")),'(.)\1'))
Note
-----
- dynamic sampling used for this statement (level=2)
SELECT * FROM (
SELECT * FROM (
SELECT p.name, p.num
FROM tmp_table p
ORDER BY p.name
)
WHERE REGEXP_LIKE(tmp_fib(num), '(.)\1')
)
WHERE ROWNUM <= 1;
NAME NUM
-------------------- ----------
Bravo 11
Elapsed: 00:00:00.093
Plan hash value: 1841475998
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 25 | 1 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 1 | 25 | 1 (0)| 00:00:01 |
|* 3 | INDEX FULL SCAN| TMP_INDEX | 1 | 25 | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=1)
3 - filter( REGEXP_LIKE (TO_CHAR("TMP_FIB"("P"."NUM")),'(.)\1'))
Note
-----
- dynamic sampling used for this statement (level=2)顺便说一句,在我运行了几次rownum变体之后,我终于开始得到ORA-01000: maximum open cursors exceeded错误了。我将在每次运行结束时删除对象,但要保持连接。我认为这暗示了另一个错误,尽管可能与你看到的没有关系,因为即使在索引扫描时也会发生这种情况。
发布于 2013-06-03 21:34:09
显然,兴趣消失了,所以我只想在一个自我回答中总结出可能的解决方案。
https://stackoverflow.com/questions/16679808
复制相似问题