我写了一个查询,它花费了太多的时间(32分钟),所以我尝试了其他方法来更快地找到一个。
我终于写了另一个,花了不到5秒的时间,问题是我不理解我的优化。
有没有人能解释一下它怎么会这么快。
hugeTable有494500行
smallTable1有983行
smallTable2有983行
cursor.execute('UPDATE hugeTable dst,
(
SELECT smallTable1.hugeTableId, smallTable2.valueForHugeTable
FROM smallTable2
INNER JOIN smallTable1 ON smallTable1.id = smallTable2.id
-- This select represent 983 rows
)src
SET dst.columnToUpdate = src.valueForHugeTable
WHERE dst.id2 = %s AND dst.id = src.hugeTableId;', inputId2)
-- Condition : dst.id2 = %s alone target 983 rows.
-- Combinasion of : dst.id2 = %s AND dst.id = src.hugeTableId target a single unique row.
-- This query takes 32 minutes这里有一种方法,可以用更多的步骤完成完全相同的请求,但速度更快:
-- First create a temporary table to hold (983) rows from hugeTable that has to be updated
cursor.execute('CREATE TEMPORARY TABLE tmpTable AS
SELECT * from hugeTable
WHERE id2 = %s;', inputid)
-- Update the rows into tmpTable instead of into hugeTable
cursor.execute('UPDATE tmpTable dst,
(
SELECT smallTable1.hugeTableId, smallTable2.valueForHugeTable
FROM smallTable2
INNER JOIN smallTable1 ON smallTable1.id = smallTable2.id
-- This select represent 983 rows
)src
SET dst.columnToUpdate = src.valueForHugeTable
WHERE dst.id = src.hugeTableId;')
-- Then delete the (983) rows we want to update
cursor.execute('DELETE FROM hugeTable WHERE id2 = %s;', inputId2)
-- And create new rows replacing the above deleled ones with rows from tmpTable
cursor.execute('INSERT INTO hugeTable SELECT * FROM tmpTable;')
-- This takes litle under 5 seconds.我想知道为什么第一种方法需要这么多时间。理解这一点将帮助我获得一个新的MySql级别。
发布于 2016-04-26 03:09:23
将复合索引添加到dst:INDEX(id2, id) (按任意顺序)。
More
案例1:
UPDATE hugeTable dst,
( SELECT smallTable1.hugeTableId, smallTable2.valueForHugeTable
FROM smallTable2
INNER JOIN smallTable1 ON smallTable1.id = smallTable2.id
)src SET dst.columnToUpdate = src.valueForHugeTable
WHERE dst.id2 = 1234
AND dst.id = src.hugeTableId;案例2:
CREATE TEMPORARY TABLE tmpTable AS
SELECT *
from hugeTable
WHERE id2 = 1234;
UPDATE tmpTable dst,
( SELECT smallTable1.hugeTableId, smallTable2.valueForHugeTable
FROM smallTable2
INNER JOIN smallTable1 ON smallTable1.id = smallTable2.id
)src SET dst.columnToUpdate = src.valueForHugeTable
WHERE dst.id = src.hugeTableId;在不知道MySQL版本和看过EXPLAINs的情况下,我只能猜测它们为什么会如此不同……
( SELECT ... JOIN ... )可能已‘物化’到隐式临时表中,也可能未‘物化’到隐式临时表中。(更新的版本在这方面做得更好)。dst或src上没有足够的索引,那么“工作量”就是这两个表大小的乘积。请注意,在情况2中,dst要小得多。(这可能是您正在寻找的答案。)UPDATE可能会消除上面的一些问题。它可能会(也可能不会)消除时序差异。如需进一步讨论,请提供
对于每个表的更新version
SHOW CREATE TABLE -- innodb_buffer_pool_size
SHOW TABLE STATUS有多大--对于每个table
EXPLAIN UPDATE ... --对于每个更新--至少需要5.6
( id2 = inputId2 )的百分比是多少
https://stackoverflow.com/questions/36820030
复制相似问题