我需要扫描旧网页中的数据库表,并删除死链接。现在我使用的是这个
$row = mysql_fetch_assoc(mysql_query("SELECT count(*) as count FROM TABLE"));
$randomrow = mt_rand(0, $row['count'] -1);
$serpq = mysql_query("SELECT * FROM TABLE LIMIT 1 OFFSET $randomrow") ;
if(mysql_num_rows($serpq) > 0) {
while($keyf = mysql_fetch_array($serpq)) {
$serpid = $keyf['id'] ;
$url = $keyf['url'] ;
if (!check_url($url))
$delq=mysql_query("DELETE FROM TABLE WHERE id='$serpid'") ;
}
}
function check_url($url) {
$headers = @get_headers( $url);
$headers = (is_array($headers)) ? implode( "\n ", $headers) : $headers;
return (bool)preg_match('#^HTTP/.*\s+[(200|301|302)]+\s#i', $headers);
}但是这个脚本加载站点只需要1-2秒的小表格,但是使用这个代码和巨大的表格(有1000万行),它使得web加载在5-7秒内完成。有什么有效的方法可以去除大表的死链接,同时又不会让web超载?
发布于 2019-07-29 16:36:07
我有几个建议: 1)使用mysqli或PDO代替mysql (mysql在PHP5.5.0中被弃用,在PHP7.0.0中被删除)。2)减少对数据库的查询次数。
/** @var mysqli $db $row */
$row = mysqli_fetch_assoc(mysqli_query($db,"SELECT count(*) as count FROM TABLE"));
$count = $row['count'];
$offset = 0;
$limit = 500;
while ($offset < $count) {
$deleteIds = [];
$sql = "SELECT * FROM TABLE LIMIT $limit OFFSET $offset";
if ($result = mysqli_query($db, $sql)) {
while ($row = mysqli_fetch_assoc($result)) {
if (!check_url($row['url'])) {
// Collecting ids which we want to delete
$deleteIds[] = $row['id'];
}
}
}
if (count($deleteIds) > 0) {
$deleteIdsString = implode(',', $deleteIds);
// Delete rows
mysqli_query($db,"DELETE FROM TABLE WHERE id IN ($deleteIdsString)");
}
// Increase $offset
$offset = $offset + $limit - count($deleteIds);
// Decrease $count
$count = $count - count($deleteIds);
}https://stackoverflow.com/questions/57249417
复制相似问题