Q: postgres: two simple, equivalent queries, but one is much slower

Stack Overflow user

Asked on 2013-12-08 00:28:06

Answers: 1 · Views: 148 · Followers: 0 · Votes: 1

I am running postgres (9.1) on Ubuntu (12.04 LTS) on an Amazon EC2 server. I have a table:

          Table "public.principal_component"
    Column     |       Type       | Modifiers
---------------+------------------+-----------
 eigenvalue_id | integer          | not null
 stm_id        | integer          | not null
 value         | double precision |
Indexes:
    "principal_component_pk" PRIMARY KEY, btree (eigenvalue_id, stm_id)
    "pc_eigval_index" btree (eigenvalue_id)
Foreign-key constraints:
    "principal_component_eigenvalue_id_fkey" FOREIGN KEY (eigenvalue_id) REFERENCES eigenvalue(id)
    "principal_component_stm_id_fkey" FOREIGN KEY (stm_id) REFERENCES stm(id)

Edit: this table contains 69,789,400 rows.

I tried running the following query:

select count(*) from principal_component where eigenvalue_id >= 801 and 
    eigenvalue_id <= 900

but it took so long that I cancelled it. So I used bash to run a separate query for each id value in the range of the query above:

time for ((a = 801; a <= 900; a++))
do 
    command="select count(*) from principal_component where eigenvalue_id=$a" 
    sudo -u postgres psql text_analytics -c "$command"
done

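This bash command took 16 seconds in total (for all 100 individual queries, plus display and so on).

I then re-ran the first query and timed it: it took 250 seconds.

Edit: the result of each query is 0; the count is 0 (as I expected).

Why is there such a difference? Here are the explain plan results for each query. The fast, individual query: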

explain analyze select count(*) from principal_component where eigenvalue_id = 801
                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Aggregate  (cost=168209.10..168209.11 rows=1 width=0) (actual time=13.367..13.369 rows=1 loops=1)
   ->  Index Scan using pc_eigval_index on principal_component  (cost=0.00..167883.16 rows=130377 width=0) (actual time=13.344..13.344 rows=0 loops=1)
         Index Cond: (eigenvalue_id = 801)
 Total runtime: 13.512 ms
(4 rows)

The slow, "combined" query:

explain analyze select count(*) from principal_component where eigenvalue_id >= 801 and
    eigenvalue_id <= 900
                                     QUERY PLAN
---------------------------------------------------------------------------------------
 Aggregate  (cost=1618222.49..1618222.50 rows=1 width=0) (actual time=774.585..774.586 rows=1 loops=1)
   ->  Bitmap Heap Scan on principal_component  (cost=656742.39..1560409.48 rows=23125206 width=0) (actual time=774.558..774.558 rows=0 loops=1)
         Recheck Cond: ((eigenvalue_id >= 801) AND (eigenvalue_id <= 900))
         ->  Bitmap Index Scan on pc_eigval_index  (cost=0.00..650961.09 rows=23125206 width=0) (actual time=774.549..774.549 rows=0 loops=1)
               Index Cond: ((eigenvalue_id >= 801) AND (eigenvalue_id <= 900))
 Total runtime: 774.751 ms
(6 rows)

I know nothing about reading plans, so I apologize in advance if I am missing something obvious. Thanks in advance for any ideas.

sql

postgresql

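1 Answer

Stack Overflow user

Answered on 2013-12-08 01:53:48

Consider running analyze on your table, or vacuum analyze if it has been heavily updated recently, or, if you are already doing that, increasing the amount of statistics it collects. An estimate of 23M rows is much too far off the mark.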
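As a rough sketch of the three suggestions above, using the table and column names from the question: the planner estimated 23,125,206 rows for the range (about a third of the table's 69,789,400 rows) while the actual count was 0. The statistics target of 1000 below is an arbitrary illustrative value, not a recommendation from the answer, and the final query is only one assumed way to inspect what the planner has collected.

```sql
-- Refresh the planner's statistics for the table.
ANALYZE principal_component;

-- If the table has seen heavy updates or deletes recently,
-- clean up dead rows and refresh statistics in one pass.
VACUUM ANALYZE principal_component;

-- Collect a more detailed sample for the filtered column, then re-analyze.
-- 1000 is an illustrative value (assumption); the default target is 100.
ALTER TABLE principal_component
    ALTER COLUMN eigenvalue_id SET STATISTICS 1000;
ANALYZE principal_component;

-- Inspect what the planner now knows about the column.
SELECT n_distinct, most_common_vals, histogram_bounds
FROM pg_stats
WHERE tablename = 'principal_component'
  AND attname = 'eigenvalue_id';
```

After this, re-running explain analyze on the range query shows whether the row estimate has moved closer to the actual count.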

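Beyond that, there is not much to be done in Postgres 9.1. 9.2 would perform an index-only scan, which would eliminate the bitmap index scan from the process.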
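A minimal way to check this after an upgrade might look like the following. It assumes PostgreSQL 9.2 or later; index-only scans rely on the visibility map, which vacuum maintains, and the planner may still pick a different plan depending on its statistics.

```sql
-- Assumes PostgreSQL 9.2+; index-only scans do not exist in 9.1.
-- VACUUM updates the visibility map that index-only scans depend on.
VACUUM principal_component;

EXPLAIN ANALYZE
SELECT count(*)
FROM principal_component
WHERE eigenvalue_id >= 801
  AND eigenvalue_id <= 900;
-- If the planner chooses it, the plan contains a node such as
-- "Index Only Scan using pc_eigval_index" with a "Heap Fetches" line.
```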

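Aside: strictly speaking, these queries are not equivalent or comparable. To be comparable with the per-id loop, the first one would need to do:

select count(*) from ... group by eigenvalue_id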
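Spelled out against the table from the question, the grouped form could look like the sketch below. The second query is an addition not found in the answer: a plain group by returns no row at all for ids without matching rows, so it uses generate_series to report an explicit 0 for each id, mirroring the output of the 100 individual queries.

```sql
-- One count per eigenvalue_id in the range; ids with no rows are omitted.
SELECT eigenvalue_id, count(*) AS n
FROM principal_component
WHERE eigenvalue_id BETWEEN 801 AND 900
GROUP BY eigenvalue_id
ORDER BY eigenvalue_id;

-- Variant (assumption, not from the answer): one row per id, including zeros.
-- count(pc.eigenvalue_id) skips the NULLs produced by the outer join.
SELECT g.id AS eigenvalue_id, count(pc.eigenvalue_id) AS n
FROM generate_series(801, 900) AS g(id)
LEFT JOIN principal_component AS pc
       ON pc.eigenvalue_id = g.id
GROUP BY g.id
ORDER BY g.id;
```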

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/20443684
