I have a script that uses Simple HTML DOM to scrape every URL listed in a CSV file.
The output is as follows:
CoolerMaster Devastator II Azul
Coolbox DeepTeam - Combo teclado, ratón y alfombrilla
Asus Claymore RED - Teclado gaming
INSERT INTO productos (nombre) VALUES('Asus Claymore RED - Teclado gaming')
Items added to the database!
INSERT INTO productos (nombre) VALUES('Asus Claymore RED - Teclado gaming')
Items added to the database!
INSERT INTO productos (nombre) VALUES('Asus Claymore RED - Teclado gaming')
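Items added to the database!
As you can see, the scraper finds 3 different products, but when I try to insert them into the MySQL database, it saves only the last product, and saves it three times.
Here is my PHP code: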
<?php
require 'libs/simple_html_dom/simple_html_dom.php';
set_time_limit(0);
function scrapUrl($url)
{
$html = new simple_html_dom();
$html->load_file($url);
global $name;
$names = $html->find('h1');
foreach ($names as $name) {
echo $name->innertext;
echo '<br>';
}
$rutaCSV = 'csv/urls1.csv'; // Ruta del csv.
$csv = array_map('str_getcsv', file($rutaCSV));
foreach ($csv as $linea) {
$url = $linea[0];
scrapUrl($url);
}
$servername = "localhost";
$username = "";
$password = "";
$dbname = "";
// Create connection
$conn = new mysqli($servername, $username, $password, $dbname);
// Check connection
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
foreach ($csv as $linea) {
$url = $linea[0];
$sql = "INSERT INTO productos (nombre) VALUES('$name->plaintext')";
print ("<p> $sql </p>");
if ($conn->query($sql) === TRUE) {
echo "Items added to the database!";
} else {
echo "Error: " . $sql . "<br>" . $conn->error;
}
}
$conn->close();
?>
So what I need is for the MySQL queries to be:
INSERT INTO productos (nombre) VALUES('CoolerMaster Devastator II Azul')
Items added to the database!
INSERT INTO productos (nombre) VALUES('Coolbox DeepTeam - Combo teclado, ratón y alfombrilla')
Items added to the database!
INSERT INTO productos (nombre) VALUES('Asus Claymore RED - Teclado gaming')
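Items added to the database!
Any help is much appreciated!
Thanks and best regards.
Posted on 2017-10-31 08:41:20
Well, after thinking about it for a long time, I finally got it working.
I'm leaving the code here in case anyone else can use it.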
<?php
require 'libs/simple_html_dom/simple_html_dom.php';
set_time_limit(0);
function scrapUrl($url)
{
$html = new simple_html_dom();
$html->load_file($url);
global $name;
global $price;
global $manufacturer;
$result = array();
foreach($html->find('h1') as $name){
$result[] = $name->plaintext;
echo $name->plaintext;
echo '<br>';
}
foreach($html->find('h2') as $manufacturer){
$result[] = $manufacturer->plaintext;
echo $manufacturer->plaintext;
echo '<br>';
}
foreach($html->find('.our_price_display') as $price){
$result[] = $price->plaintext;
echo $price->plaintext;
echo '<br>';
}
$servername = "localhost";
$username = "";
$password = "";
$dbname = "";
// Create connection
$conn = new mysqli($servername, $username, $password, $dbname);
// Check connection
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
$price_go=str_replace(",",".",str_replace(" €","",$price->plaintext));
$sql = "INSERT INTO productos (nombre, nombreFabricante, precio) VALUES('$name->plaintext', '$manufacturer->plaintext', $price_go)";
print ("<p> $sql </p>");
if ($conn->query($sql) === TRUE) {
echo "Producto añadido al comparador!";
echo '<br>';
} else {
echo "Error: " . $sql . "<br>" . $conn->error;
}
$conn->close();
//echo $url;
}
$rutaCSV = 'csv/urls1.csv'; // Ruta del csv.
$csv = array_map('str_getcsv', file($rutaCSV));
//print_r($csv); // Verás que es un array donde cada elemento es array con una de las url.
foreach ($csv as $linea) {
$url = $linea[0];
scrapUrl($url);
}
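?>
I'm pretty sure there is some junk in my code, but it works.
I hope this helps someone.
Regards, and thanks for your help.
Posted on 2017-10-30 08:53:18
Here is your code, just formatted (please check it, you have a missing }):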
function scrapUrl($url)
{
$html = new simple_html_dom();
$html->load_file($url);
global $name; // -- using global is crap - I would avoid that. Pass the object in as an argument of the function eg. scrapUrl($url, $name)
$names = $html->find('h1');
foreach ($names as $name) {
// -- you're re-assigning $name, overwriting your global on each iteration of this loop
// -- What is the purpose of this? It does nothing but output.
echo $name->innertext;
echo '<br>';
}
// -- missing } where is this function closed at?
$rutaCSV = 'csv/urls1.csv'; // Ruta del csv.
$csv = array_map('str_getcsv', file($rutaCSV));
foreach ($csv as $linea) {
// -- this can be combined with the one with the query
// -- just put the function call in that one and delete this one
$url = $linea[0];
scrapUrl($url); // recursive? depends on where your function is closed
// -- what's the purpose of this function? It returns nothing.
}
$servername = "localhost";
$username = "";
$password = "";
$dbname = "";
// Create connection
$conn = new mysqli($servername, $username, $password, $dbname);
// Check connection
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
foreach ($csv as $linea) {
$url = $linea[0]; // -- whats this url used for?
$sql = "INSERT INTO productos (nombre) VALUES('$name->plaintext')";
// -- this query is vulnerable to SQL injection; use a prepared statement
// -- what's $name->plaintext? Where is it assigned?
print ("<p> $sql </p>");
if ($conn->query($sql) === TRUE) {
echo "Items added to the database!";
} else {
echo "Error: " . $sql . "<br>" . $conn->error;
}
// -- when you loop over the CSV but insert $name->plaintext multiple times
// -- where is that property changed inside this loop, how is it correlated to the csv data
}
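$conn->close();
So, first of all, you are missing a closing }. What other errors you have depends on where it is supposed to go.
One of your CSV loops can (probably) be removed. In any case, I added a bunch of notes as comments marked like this: // --.
Your main problem, that is, the reason you keep inserting the same value, is in the following lines: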
foreach ($csv as $linea) {
$url = $linea[0]; // -- whats this url used for?
$sql = "INSERT INTO productos (nombre) VALUES('$name->plaintext')";
// -- $name->plaintext does not change per iteration of the loop
// -- you are just repeatedly inserting that data
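...
Notice that the value of $name->plaintext is what gets inserted, but it has no correlation to the $csv variable and nothing in this loop modifies it. So it is no surprise that it stays the same.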
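To make the effect concrete, here is a minimal sketch of the same pattern with no scraping or database involved. The $scraped array is a hypothetical stand-in for the names found on the three pages (taken from the output in the question), and the arrays $buggy and $fixed are names introduced only for this illustration.

```php
<?php
// Hypothetical stand-in for the <h1> text scraped from the three pages.
$scraped = [
    'CoolerMaster Devastator II Azul',
    'Coolbox DeepTeam - Combo teclado, ratón y alfombrilla',
    'Asus Claymore RED - Teclado gaming',
];

// First loop: $name is overwritten on every pass, so after the loop
// it holds only the last value.
foreach ($scraped as $name) {
    // the original code only echoes the name here
}

// Second loop: nothing in the body changes $name, so the same
// (last) value is used on every iteration.
$buggy = [];
foreach ($scraped as $unused) {
    $buggy[] = $name;
}

// Fix: use each name inside the loop that produces it.
$fixed = [];
foreach ($scraped as $name) {
    $fixed[] = $name;
}

print_r($buggy); // the last product, three times
print_r($fixed); // the three different products
```

The same reasoning applies to the original script: the INSERT has to run in the loop where each name is available, or the names have to be collected and passed along explicitly.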
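OK, now that I have picked your code apart (nothing personal), let's see if we can simplify it a bit.
UPDATE -- this is the best I can do, given the code above. I just combined it, fixed some logic errors, and trimmed and simplified it. Overcomplicating the task is a common beginner mistake. (But I have no way to test this.)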
<?php
require 'libs/simple_html_dom/simple_html_dom.php';
$servername = "localhost";
$username = "";
$password = "";
$dbname = "";
// Create connection
$conn = new mysqli($servername, $username, $password, $dbname);
// Check connection
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
$rutaCSV = 'csv/urls1.csv'; // Ruta del csv.
$csv = array_map('str_getcsv', file($rutaCSV));
//prepare query outside of the loops
$stmt = $conn->prepare("INSERT INTO productos (nombre)VALUES(?)");
foreach ($csv as $linea) {
//iterate over each csv line
$html = new simple_html_dom();
//load url $linea[0]
$html->load_file($linea[0]);
//find names in the document, and return them
foreach( $html->find('h1') as $name ){
//iterate over each name and bind elements text to the query
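// bind_param takes its arguments by reference, so copy the text into a plain variable first
$nombre = $name->plaintext;
$stmt->bind_param('s', $nombre);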
if ($stmt->execute()){
echo "Items added to the database!";
} else {
echo "Error: " . $stmt->error; // $sql is not defined in this version
}
}
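}
Here I simplified it further, because using the scrapUrl() function doesn't really make sense. We are not reusing that code, so it only adds a function call and makes the code harder to read.
Even if it doesn't work as is, I encourage you to compare your original code with what I have. Walk through it in your head so you get a feel for how I removed the redundancy and so on.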
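Hope this helps, cheers!
Posted on 2017-10-30 09:00:44
There are a lot of problems in your code.
I suggest changing the scrapUrl function so that it stores the names of the scraped products in an array and returns that array.
I also suggest using the correct variable in the SQL query to pass the data to it.
Also, it is better to use PDO and prepared statements than to interpolate raw data into the SQL query string.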
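The suggestions above could be sketched roughly as follows. This is an illustration under stated assumptions, not the original script: scrapeNames() is a hypothetical helper that uses PHP's built-in DOMDocument instead of simple_html_dom, the $pages array holds inline HTML in place of the fetched URLs, and an in-memory SQLite database (which needs the pdo_sqlite extension) stands in for MySQL so the example runs on its own.

```php
<?php
// Hypothetical helper: returns the text of every <h1> in an HTML string.
// DOMDocument is used here only to keep the sketch free of extra libraries.
// Real pages with non-ASCII text may need explicit encoding handling.
function scrapeNames(string $html): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from imperfect markup
    $names = [];
    foreach ($doc->getElementsByTagName('h1') as $h1) {
        $names[] = trim($h1->textContent);
    }
    return $names;
}

// Assumption: in-memory SQLite stands in for the MySQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE productos (nombre TEXT)');

// Hypothetical stand-ins for the pages behind the URLs in the CSV.
$pages = [
    '<html><body><h1>CoolerMaster Devastator II Azul</h1></body></html>',
    '<html><body><h1>Asus Claymore RED - Teclado gaming</h1></body></html>',
];

// Prepare once, execute once per scraped name.
$stmt = $pdo->prepare('INSERT INTO productos (nombre) VALUES (:nombre)');
foreach ($pages as $html) {
    foreach (scrapeNames($html) as $nombre) {
        $stmt->execute([':nombre' => $nombre]);
    }
}

$saved = $pdo->query('SELECT nombre FROM productos ORDER BY rowid')
             ->fetchAll(PDO::FETCH_COLUMN);
print_r($saved);
```

Because the function returns its results, no global is needed, and each name is bound as a parameter rather than pasted into the SQL text. For the real database, the PDO constructor would take a MySQL DSN and the credentials in place of the SQLite one.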
https://stackoverflow.com/questions/47010924