
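I have this schema, which will sit behind a crawler: mitza.mine.nu. I'm not good at SQL, so I only got as far as something close to the result I want. However, the query runs too long: with three words it runs forever on 4555 records, and with two words it takes almost a minute. (The current live sample runs a different query.)
The query should do this:
The words w1 and w2 are looked up in DICT to get their word ids (271 and 8596, for example); this is done in a separate query. Then select all records, ordered as follows:
first all records that contain both words, ordered by the sum of the weights, then records for word 1, then records for word 2, ordered by weight.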
SELECT DISTINCT(links.linkid),domain.ip,links.linkid,
links.url,words.weight,words.wordid
FROM links
JOIN words ON (words.linkid=links.linkid)
JOIN domain ON (domain.siteid=links.siteid)
WHERE links.linkid IN (SELECT linkid FROM words WHERE wordid=271)
AND links.linkid IN (SELECT linkid FROM words WHERE wordid=8596)
ORDER BY words.weight DESC LIMIT 0, 8
Answered 2014-03-21 05:00:56
Try the following query, which avoids the repeated calls to the links table.
SELECT DISTINCT(links.linkid),domain.ip,links.linkid,
links.url,words.weight,words.wordid
FROM links
JOIN words ON (words.linkid=links.linkid)
JOIN domain ON (domain.siteid=links.siteid)
WHERE (words.wordid=271 or words.wordid=8596)
ORDER BY words.weight DESC LIMIT 0, 8
Also make sure there are indexes on the primary keys.
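As a rough sketch of how the index advice could be checked: primary keys are indexed automatically in MySQL, so the columns more likely to be missing an index are the ones used for filtering and joining. The table and column names below come from the queries in this thread; the index name is hypothetical, and whether the optimizer actually picks the index depends on the data.

```sql
-- Assumption: words(wordid, linkid, weight) as used in the queries above.
-- idx_words_wordid_linkid is a made-up name for illustration.

-- Composite index: serves the filter on wordid and the join on linkid.
CREATE INDEX idx_words_wordid_linkid ON words (wordid, linkid);

-- Ask the optimizer for its plan instead of running the query.
-- In MySQL, the "key" column of the output names the index chosen per table.
EXPLAIN
SELECT links.linkid, domain.ip, links.url, words.weight, words.wordid
FROM links
JOIN words ON (words.linkid = links.linkid)
JOIN domain ON (domain.siteid = links.siteid)
WHERE (words.wordid = 271 OR words.wordid = 8596)
ORDER BY words.weight DESC
LIMIT 0, 8;
```

If the plan shows a full scan of words (type ALL, no key), the index is not being used and the column order or the query shape is worth revisiting.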
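Answered 2014-03-21 05:37:05
Don't use inner queries. Also, avoid OR in the WHERE clause, since it can keep the optimizer from using an index. Create an index on wordid, linkid and siteid, and try the following query: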
SELECT DISTINCT(links.linkid),domain.ip,links.linkid,
links.url,words.weight,words.wordid
FROM links
JOIN words ON (words.linkid=links.linkid)
JOIN domain ON (domain.siteid=links.siteid)
WHERE words.wordid IN (271,8596)
ORDER BY words.weight DESC LIMIT 0, 8
Answered 2019-05-23 02:50:36
Try this:
SELECT DISTINCT(links.linkid),domain.ip,links.linkid,
links.url,words.weight,words.wordid
FROM links
JOIN
(SELECT linkid,weight,wordid FROM words WHERE wordid IN (271,8596)) words
ON (words.linkid=links.linkid)
JOIN domain
ON (domain.siteid=links.siteid)
ORDER BY words.weight DESC LIMIT 0, 8;
https://stackoverflow.com/questions/22550251