这是一场长达19个小时的噩梦。
我有一个非常大的查询,本质上需要连接几个表中的大型数据集。在进行连接之后,我需要用select语句中的数据更新原始表。SELECT语句超级快,而UPDATE语句超级慢。
这是select语句。
SELECT l.col1,
l.col2,
l.col3,
p.personid
FROM table1 p
LEFT JOIN table2 l ON (l.col1 = p.col1)
LEFT JOIN
(SELECT name,
col AS 'col2'
FROM tbl3 f
WHERE f.col LIKE '%-F') pcf ON (pcf.col1 = p.col1)
LEFT JOIN
(SELECT name,
col AS 'col3'
FROM tbl4 f
WHERE f.col LIKE '%-M') pcm ON (pcm.col1 = p.col1)
WHERE p.requestid = '1928'现在,如果我采用完全相同的一系列连接,并将其放入更新上下文中,则查询将永远花费时间。
UPDATE table1 p
LEFT JOIN table2 l ON (l.col1 = p.col1)
LEFT JOIN
(SELECT name,
col AS 'col2'
FROM tbl3 f
WHERE f.col LIKE '%-F') pcf ON (pcf.col1 = p.col1)
LEFT JOIN
(SELECT name,
col AS 'col3'
FROM tbl4 f
WHERE f.col LIKE '%-M') pcm ON (pcm.col1 = p.col1)
SET p.col1 = l.col1,
p.col2 = l.col2,
p.col3 = l.col3
WHERE p.requestid = '1928'So...why UPDATE JOIN语句会比SELECT JOIN语句花费更长的时间吗?更长的时间。我已经试过临时表了,但不起作用。
仅供参考,我正在处理50k或更多记录的表。
如果您对EXPLAIN结果感到好奇,那么这就是我解释select查询时发生的事情(尽管您显然不能使用EXPLAIN for update?)
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY p ALL NULL NULL NULL NULL 613246 Using where
1 PRIMARY l eq_ref PRIMARY,name_3,name,name_2 PRIMARY 257 drilldev_db.p.lastname 1
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 23435
1 PRIMARY <derived3> ALL NULL NULL NULL NULL 13610
1 PRIMARY <derived4> ALL NULL NULL NULL NULL 13053
1 PRIMARY <derived5> ALL NULL NULL NULL NULL 8273
1 PRIMARY <derived6> ALL NULL NULL NULL NULL 11481
1 PRIMARY <derived7> ALL NULL NULL NULL NULL 6708
1 PRIMARY <derived8> ALL NULL NULL NULL NULL 9588
1 PRIMARY <derived9> ALL NULL NULL NULL NULL 5494
1 PRIMARY <derived10> ALL NULL NULL NULL NULL 6981
1 PRIMARY <derived11> ALL NULL NULL NULL NULL 4107
1 PRIMARY <derived12> ALL NULL NULL NULL NULL 5973
1 PRIMARY <derived13> ALL NULL NULL NULL NULL 3851
1 PRIMARY <derived14> ALL NULL NULL NULL NULL 4935
1 PRIMARY <derived15> ALL NULL NULL NULL NULL 3574
1 PRIMARY <derived16> ALL NULL NULL NULL NULL 5793
1 PRIMARY <derived17> ALL NULL NULL NULL NULL 4706
17 DERIVED f ref year,gender gender 257 364263 Using where; Using temporary; Using filesort
16 DERIVED f ref year,gender gender 257 397322 Using where; Using temporary; Using filesort
15 DERIVED f range year,gender year 4 NULL 54092 Using where; Using temporary; Using filesort
14 DERIVED f range year,gender year 4 NULL 54092 Using where; Using temporary; Using filesort
13 DERIVED f range year,gender year 4 NULL 62494 Using where; Using temporary; Using filesort
12 DERIVED f range year,gender year 4 NULL 62494 Using where; Using temporary; Using filesort
11 DERIVED f range year,gender year 4 NULL 69317 Using where; Using temporary; Using filesort
10 DERIVED f range year,gender year 4 NULL 69317 Using where; Using temporary; Using filesort
9 DERIVED f ref year,gender gender 257 364263 Using where; Using temporary; Using filesort
8 DERIVED f range year,gender year 4 NULL 94949 Using where; Using temporary; Using filesort
7 DERIVED f ref year,gender gender 257 364263 Using where; Using temporary; Using filesort
6 DERIVED f ref year,gender gender 257 397322 Using where; Using temporary; Using filesort
5 DERIVED f ref year,gender gender 257 364263 Using where; Using temporary; Using filesort
4 DERIVED f ref year,gender gender 257 397322 Using where; Using temporary; Using filesort
3 DERIVED f ALL NULL NULL NULL NULL 37045 Using where
2 DERIVED f ALL NULL NULL NULL NULL 37045 Using where谢谢!
-b
发布于 2013-08-30 02:30:21
让我们考虑一下这个问题。如果您选择(只是抓取)表中的行,而不是在遍历它们时更新every...single....row,那么什么会花费更长的时间?读取n行还是修改(更新)n行?
将它比作阅读一本书的10行和在一张纸上写同样的10行。哪一个需要更长的时间?
我还可以补充一点,您读取的行数越多,更新的行数就越多,差异就越大。就像读一本书和写一本书的行数有更大的区别一样,你读/写的行数越多。
发布于 2013-08-30 03:32:22
如果这是你的实际语句,你不需要第二个和第三个左连接,因为它们不会改变结果。
顺便说一句,MySQL并不能有效地处理“复杂”的查询:-)如果你在一个临时表中物化SELECT的结果并使用它,它应该会更快。
https://stackoverflow.com/questions/18518432
复制相似问题