为什么连续使用标量值函数的次数越多,查询的累计运行速度就越慢?
我有这个表是用从第三方购买的数据建立的。
我删掉了一些东西让这篇文章变得更短。但只是为了让你了解事情是如何设置的。
CREATE TABLE [dbo].[GIS_Location](
[ID] [int] IDENTITY(1,1) NOT NULL, --PK
[Lat] [int] NOT NULL,
[Lon] [int] NOT NULL,
[Postal_Code] [varchar](7) NOT NULL,
[State] [char](2) NOT NULL,
[City] [varchar](30) NOT NULL,
[Country] [char](3) NOT NULL,
CREATE TABLE [dbo].[Address_Location](
[ID] [int] IDENTITY(1,1) NOT NULL, --PK
[Address_Type_ID] [int] NULL,
[Location] [varchar](100) NOT NULL,
[State] [char](2) NOT NULL,
[City] [varchar](30) NOT NULL,
[Postal_Code] [varchar](10) NOT NULL,
[Postal_Extension] [varchar](10) NULL,
[Country_Code] [varchar](10) NULL,然后我有两个函数来查找LAT和LON。
CREATE FUNCTION [dbo].[usf_GIS_GET_LAT]
(
@City VARCHAR(30),
@State CHAR(2)
)
RETURNS INT
WITH EXECUTE AS CALLER
AS
BEGIN
DECLARE @LAT INT
SET @LAT = (SELECT TOP 1 LAT FROM GIS_Location WITH(NOLOCK) WHERE [State] = @State AND [City] = @City)
RETURN @LAT
END
CREATE FUNCTION [dbo].[usf_GIS_GET_LON]
(
@City VARCHAR(30),
@State CHAR(2)
)
RETURNS INT
WITH EXECUTE AS CALLER
AS
BEGIN
DECLARE @LON INT
SET @LON = (SELECT TOP 1 LON FROM GIS_Location WITH(NOLOCK) WHERE [State] = @State AND [City] = @City)
RETURN @LON
END当我运行以下命令时...
SET STATISTICS TIME ON
SELECT
dbo.usf_GIS_GET_LAT(City,[State]) AS Lat,
dbo.usf_GIS_GET_LON(City,[State]) AS Lon
FROM
Address_Location WITH(NOLOCK)
WHERE
ID IN (SELECT TOP 100 ID FROM Address_Location WITH(NOLOCK) ORDER BY ID DESC)
SET STATISTICS TIME OFF100 ~= 8 ms、200 ~= 32 ms、400 ~= 876 ms
--编辑对不起,我应该说得更清楚些。我并不打算调优上面列出的查询。这只是一个示例,说明它处理的记录越多,执行时间就越慢。在真实的应用程序中,这些函数用作where子句的一部分,用于在城市和州周围构建半径,以包括该区域中的所有记录。
发布于 2009-05-01 00:53:30
在大多数情况下,最好避免引用表的标量值函数,因为(正如其他人所说的)它们基本上是黑盒,需要为每一行运行一次,并且不能通过查询计划引擎进行优化。因此,即使关联表有索引,它们也倾向于线性扩展。
您可能希望考虑使用内联表值函数,因为它们是与查询一起内联计算的,并且可以优化。您可以获得您想要的封装,但是在select语句中粘贴表达式的性能。
作为内联的副作用,它们不能包含任何过程代码(不声明@变量;设置@变量= ..;返回)。但是,它们可以返回多个行和列。
您可以重写函数,如下所示:
create function usf_GIS_GET_LAT(
@City varchar (30),
@State char (2)
)
returns table
as return (
select top 1 lat
from GIS_Location with (nolock)
where [State] = @State
and [City] = @City
);
GO
create function usf_GIS_GET_LON (
@City varchar (30),
@State char (2)
)
returns table
as return (
select top 1 LON
from GIS_Location with (nolock)
where [State] = @State
and [City] = @City
);使用它们的语法也略有不同:
select
Lat.Lat,
Lon.Lon
from
Address_Location with (nolock)
cross apply dbo.usf_GIS_GET_LAT(City,[State]) AS Lat
cross apply dbo.usf_GIS_GET_LON(City,[State]) AS Lon
WHERE
ID IN (SELECT TOP 100 ID FROM Address_Location WITH(NOLOCK) ORDER BY ID DESC)发布于 2009-04-29 02:00:39
:他们没有。
标量函数中不存在导致其性能根据执行标量函数中的行数而呈指数级下降的bug。再次尝试测试,并查看SQL profiler,查看CPU、READS和DURATION列。增加测试大小,以包括耗时超过1秒、2秒、5秒的测试。
CREATE FUNCTION dbo.slow
(
@ignore int
)
RETURNS INT
AS
BEGIN
DECLARE @slow INT
SET @slow = (select count(*) from sysobjects a
cross join sysobjects b
cross join sysobjects c
cross join sysobjects d
cross join sysobjects e
cross join sysobjects f
where a.id = @ignore)
RETURN @slow
END
go
SET STATISTICS TIME ON
select top 1 dbo.slow(id)
from sysobjects
go
select top 5 dbo.slow(id)
from sysobjects
go
select top 10 dbo.slow(id)
from sysobjects
go
select top 20 dbo.slow(id)
from sysobjects
go
select top 40 dbo.slow(id)
from sysobjects
SET STATISTICS TIME OFF输出
SQL Server Execution Times:
CPU time = 203 ms, elapsed time = 202 ms.
SQL Server Execution Times:
CPU time = 889 ms, elapsed time = 939 ms.
SQL Server Execution Times:
CPU time = 1748 ms, elapsed time = 1855 ms.
SQL Server Execution Times:
CPU time = 3541 ms, elapsed time = 3696 ms.
SQL Server Execution Times:
CPU time = 7207 ms, elapsed time = 7392 ms.请记住,如果对结果集中的行运行标量函数,则会逐行执行标量函数,而不会进行全局优化。
发布于 2009-05-19 15:26:31
你可以将你的功能封装在一个内联的TVF中,这样会快得多:
http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/05/23/reuse-your-code-with-cross-apply.aspx
https://stackoverflow.com/questions/800017
复制相似问题