首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >在插入数据库之前检查记录是否存在

在插入数据库之前检查记录是否存在
EN

Stack Overflow用户
提问于 2020-05-17 05:28:50
回答 2查看 47关注 0票数 0

我有一个包含超过640,000条记录的数据库,我每周都会用JSON文件中的数据更新这些记录。我想要做的是只将当前不存在的记录加载到数据库中。下面的脚本可以处理少量的数据,但是当我尝试加载一个大文件时,它会超时(我得到了一个500内部服务器错误)。有没有更好的方法来做这件事?

代码语言:javascript
复制
<?php
set_time_limit(0);
ini_set('memory_limit','2000M');

$url = 'json/OERrecordstest.json';
$contents = file_get_contents($url);
$records = json_decode($contents, true);
include("../config.php"); 

echo "<div class='card card-body'>";
foreach($records as $record) {  

    $type = $record['type'];
    $name = $record['title'];
    $title = addslashes($name);
    $creator = $record['author'];
    $author = addslashes($creator);
    $link = addslashes($record['link']);
    $origin = $record['source'];
    $source = addslashes($origin);
    $description = addslashes($record['description']);
    $base_url = $record['base_url'];
    $isbn_number = $record['isbn_number'];
    $e_isbn_number = $record['e_isbn_number'];
    $publication_date = $record['publication_date'];
    $license = $record['license'];
    $subject = addslashes($record['subject']);
    $image_url = $record['image_url'];
    $review = $record['review'];
    $language = $record['language'];
    $license_url = $record['license_url'];
    $publisher = addslashes($record['publisher']);
    $publisher_url = $record['publisher_url'];

    $query = $conn->prepare("SELECT * FROM oer_search WHERE title=:title AND author=:author AND source=:source");
    $query->bindParam(":title", $name);
    $query->bindParam(":author", $creator);
    $query->bindParam(":source", $origin);
    $query->execute();

    if ($query->rowCount() == 0) {
        $insert = $conn->prepare("INSERT INTO oer_search (type, title, author, link, source, description, base_url, isbn_number, e_isbn_number, publication_date, license, subject, image_url, review, language, license_url, publisher, publisher_url) VALUES ('$type', '$title', '$author', '$link', '$source', '$description', '$base_url', '$isbn_number', '$e_isbn_number', '$publication_date', '$license', '$subject', '$image_url', '$review', '$language', '$license_url', '$publisher', '$publisher_url')");
        $insert->execute();
    }

}

if($insert){
    echo "<p><span class='recordInserted'><em>$name was successfully inserted into SOAR.</em></span></p>";
}
else {
    echo "<p><span class='recordInserted'><em>Record(s) already exist in SOAR.</em></span></p>";
}
echo "</div>";
?>
EN

回答 2

Stack Overflow用户

发布于 2020-05-17 05:34:03

我不能评论,我写了一个答案,因为我的分数不够。你能把它改成这样试一下吗?

代码语言:javascript
复制
 $query = $conn->prepare("SELECT id FROM oer_search WHERE title=:title AND author=:author AND source=:source limit 1");

代码语言:javascript
复制
<?php
if(!session_id()) session_start();
ini_set('memory_limit', '2000M');

$url = 'json/OERrecordstest.json';
$contents = file_get_contents($url);
$records = json_decode($contents, true);
include("../config.php");

echo "<div class='card card-body'>";

if (!$_SESSION["records"]) {
    foreach ($records as $record) {
        $_SESSION["records"][$record["id"]] = $records;
    }
}
$i = 0;
foreach ($_SESSION["records"] as $record) {
    $i++;
    if ($i > 1000) break;

    $type = $record['type'];
    $name = $record['title'];
    $title = addslashes($name);
    $creator = $record['author'];
    $author = addslashes($creator);
    $link = addslashes($record['link']);
    $origin = $record['source'];
    $source = addslashes($origin);
    $description = addslashes($record['description']);
    $base_url = $record['base_url'];
    $isbn_number = $record['isbn_number'];
    $e_isbn_number = $record['e_isbn_number'];
    $publication_date = $record['publication_date'];
    $license = $record['license'];
    $subject = addslashes($record['subject']);
    $image_url = $record['image_url'];
    $review = $record['review'];
    $language = $record['language'];
    $license_url = $record['license_url'];
    $publisher = addslashes($record['publisher']);
    $publisher_url = $record['publisher_url'];

    $query = $conn->prepare("SELECT id FROM oer_search WHERE title=:title AND author=:author AND source=:source limit 1");
    $query->bindParam(":title", $name);
    $query->bindParam(":author", $creator);
    $query->bindParam(":source", $origin);
    $query->execute();

    if ($query->rowCount() == 0) {
        $insert = $conn->prepare("INSERT INTO oer_search (type, title, author, link, source, description, base_url, isbn_number, e_isbn_number, publication_date, license, subject, image_url, review, language, license_url, publisher, publisher_url) VALUES ('$type', '$title', '$author', '$link', '$source', '$description', '$base_url', '$isbn_number', '$e_isbn_number', '$publication_date', '$license', '$subject', '$image_url', '$review', '$language', '$license_url', '$publisher', '$publisher_url')");
        $insert->execute();

        unset($_SESSION["records"][$record["id"]]);
    }

}

print "remaining data :". count($_SESSION["records"]);
?>
票数 -1
EN

Stack Overflow用户

发布于 2020-05-17 06:07:43

提高批量导入速度的技巧:

  • 将您的SQL prepare移到循环之外(您只需执行一次)
  • Collect data以插入到1000个批中(例如..通常有更多的可能性)
  • 在插入期间使用事务/禁用索引计算从现有数据中使用查找数组查找重复项(不查询导入的每一行的数据库)
  • 一般:避免在循环

中执行SQL查询

希望这能有点帮助

票数 -1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/61843837

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档