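I'm experimenting with ArborJS, trying to build a knowledge tree. Here is my test area (left-click to enter a node, right-click to go back to the start). I've fully fleshed out the "Humanities and Arts" section, so I suggest trying it out on that part.
I'm building the tree from Wikipedia's List of Academic Disciplines article.
Right now I pull the data from a MySQL table (via PHP). The table structure is TreeNodeID, ParentID, Title. "TreeNodeID" is the primary key (auto-increment), "ParentID" is the ID of the node's parent, and "Title" is the text that should be displayed on the node.
I'm currently on page 7 of 27 of that article. I feel I'm not using my computer's capabilities to automate this manual entry process.
I've just put all of the subjects into a text file. It is formatted like this: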
Anthropology
    Biological Anthropology
        Forensic Anthropology
        Gene-Culture Coevolution
        Human Behavioral Ecology
    Anthropological Linguistics
        Synchronic Linguistics
        Diachronic Linguistics
        Ethnolinguistics
        Sociolinguistics
    Cultural Anthropology
        Anthropology of Religion
        Economic Anthropology
    Archaeology
...
How can I use PHP to walk through this file and populate my database (adding the correct ParentID for each node)?
Update #3: working code (based on the correct answer below)
<?php
//echo "Checkpoint 1";
$data = "
Social sciences
    Anthropology
        Biological anthropology
            Forensic anthropology
            Gene-culture coevolution
            Human behavioral ecology
            Human evolution
            Medical anthropology
            Paleoanthropology
            Population genetics
            Primatology
        Anthropological linguistics
            Synchronic linguistics (or Descriptive linguistics)
            Diachronic linguistics (or Historical linguistics)
            Ethnolinguistics
            Sociolinguistics
        Cultural anthropology
            Anthropology of religion
            Economic anthropology
            Ethnography
            Ethnohistory
            Ethnology
            Ethnomusicology
            Folklore
            Mythology
            Political anthropology
            Psychological anthropology
        Archaeology
...(goes on for a long time)
";
//echo "Checkpoint 2\n";
$lines = preg_split("/\n/", $data);
$parentids = array(0 => null); // indent level => ID of the last node inserted at that level
// Placeholder DSN and credentials; replace them with your own connection details
$db = new PDO('mysql:host=localhost;dbname=your_db', 'username', 'pass');
$sql = 'INSERT INTO `TreeNode` SET ParentID = ?, Title = ?';
$stmt = $db->prepare($sql);
//echo "Checkpoint 3\n";
foreach ($lines as $line) {
    if (trim($line) === '') {
        continue; // skip blank lines, including the first and last lines of $data
    }
    if (!preg_match('/^([\s]*)(.*)$/', $line, $m)) {
        continue;
    }
    $spaces = strlen($m[1]);
    $level = intval($spaces / 4); // assumes four spaces per indent
    //$level = strlen($m[1]); // use this instead if data is tab indented
    $title = $m[2];
    $parentid = ($level > 0 ? $parentids[$level - 1] : 1); // All "roots" are children of "Academia", which has an ID of 1
    $rv = $stmt->execute(array($parentid, $title));
    $parentids[$level] = $db->lastInsertId(); // remember this node as the current parent for deeper levels
    echo "inserted under parent " . $parentid . " title: " . $title . "\n";
}
?>
Posted on 2012-09-17 02:57:09
Untested, but this should work for you (using PDO):
<?php
$data = "
Anthropology
    Biological Anthropology
        Forensic Anthropology
        Gene-Culture Coevolution
        Human Behavioral Ecology
    Anthropological Linguistics
        Synchronic Linguistics
        Diachronic Linguistics
        Ethnolinguistics
        Sociolinguistics
    Cultural Anthropology
        Anthropology of Religion
        Economic Anthropology
    Archaeology
";
$lines = preg_split("/\n/", $data);
$parentids = array(0 => null); // indent level => ID of the last node inserted at that level
// $db is assumed to be an existing PDO connection
$sql = 'INSERT INTO `table` SET ParentID = ?, Title = ?';
$stmt = $db->prepare($sql);
foreach ($lines as $line) {
    if (trim($line) === '') {
        continue; // skip blank lines, including the first and last lines of $data
    }
    if (!preg_match('/^([\s]*)(.*)$/', $line, $m)) {
        continue;
    }
    $spaces = strlen($m[1]);
    $level = intval($spaces / 4); # assumes four spaces per indent
    #$level = strlen($m[1]); # use this instead if data is tab indented
    $title = $m[2];
    $parentid = $level > 0
        ? $parentids[$level - 1]
        : null;
    $rv = $stmt->execute(array($parentid, $title));
    $parentids[$level] = $db->lastInsertId();
}
Posted on 2012-09-17 02:49:48
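I'd say it would be easier to copy and paste the list into a text file first and then indent it, as you did above. Then parse it:
Count the indent level of each line to work out the hierarchy, treating lines with zero indentation as roots. This lets you build an associative array containing every discipline, which you then walk through a second time. For example:
Assign a parse_id to every node. Then, as each node is inserted, store its db_id (the value of mysqli_insert_id) in the array alongside its parse_id. That mapping is used to turn the parent's parse_id into the parent_id the database needs.
Assuming you don't need to check for shared fields of study or for unique node text, that should be simple enough.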
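The two passes described above could be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: the function names `parseIndentedList` and `insertNodes`, the four-spaces-per-level indent, and the `$insert` callable are all assumptions made for the example. A fake counter stands in for the database so the sketch runs on its own.

```php
<?php
// Pass 1: parse indented text into nodes keyed by a temporary parse_id.
// Assumes four spaces per indent level (hypothetical default, adjust as needed).
function parseIndentedList($text, $indentWidth = 4)
{
    $nodes = array();
    $lastAtLevel = array(); // level => parse_id of the most recent node at that level
    $parseId = 0;
    foreach (preg_split('/\R/', $text) as $line) {
        if (trim($line) === '') {
            continue; // ignore blank lines
        }
        $indent = strlen($line) - strlen(ltrim($line, ' '));
        $level = intval($indent / $indentWidth);
        $hasParent = ($level > 0 && isset($lastAtLevel[$level - 1]));
        $parseId++;
        $nodes[$parseId] = array(
            'title' => trim($line),
            'parent_parse_id' => $hasParent ? $lastAtLevel[$level - 1] : null,
        );
        $lastAtLevel[$level] = $parseId;
    }
    return $nodes;
}

// Pass 2: insert nodes in order, translating each parse_id to a real database ID.
// $insert is a hypothetical callable: (parentDbId, title) => new database ID.
function insertNodes(array $nodes, $insert)
{
    $dbIds = array(); // parse_id => db_id
    foreach ($nodes as $parseId => $node) {
        $parent = $node['parent_parse_id'];
        $parentDbId = ($parent === null) ? null : $dbIds[$parent];
        $dbIds[$parseId] = call_user_func($insert, $parentDbId, $node['title']);
    }
    return $dbIds;
}

// Demo with a fake auto-increment counter standing in for the database.
$text = "Anthropology\n    Biological Anthropology\n        Forensic Anthropology\n    Archaeology\n";
$nodes = parseIndentedList($text);
$next = 100;
$dbIds = insertNodes($nodes, function ($parentDbId, $title) use (&$next) {
    return $next++;
});
```

With mysqli, the callable would presumably run a prepared INSERT and return `$mysqli->insert_id`. Because parents always appear before their children in the file, each parent's database ID is already in the map by the time a child needs it.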
Posted on 2012-09-17 03:57:15
You could try something like the following:
<?php
// parser.php
include_once './vendor/autoload.php';
use Symfony\Component\DomCrawler\Crawler;
$crawler = new Crawler(file_get_contents('http://en.wikipedia.org/wiki/List_of_academic_disciplines'));
// Table-of-contents entries: section numbers and their matching titles
$texts = $crawler->filter('.tocnumber + .toctext');
$numbers = $crawler->filter('.tocnumber');
$disciplines = array();
$last = '';
for ($i = 0; $i < count($numbers); $i++) {
    $value = $numbers->eq($i)->text();
    // A section number without a dot (e.g. "3") is a top-level entry
    if (!preg_match('/\d+\.\d+/', $value)) {
        // is a root discipline
        $last = $texts->eq($i)->text();
    } else {
        // is a sub-discipline; group it under the most recent root
        $disciplines[$last][$texts->eq($i)->text()] = $texts->eq($i)->text();
    }
}
var_dump($disciplines);
With this you can do more, such as persisting the result to a database or anything else, and it is useful for other DOM-parsing tasks as well.
I'm using the CssSelector and DomCrawler components from Symfony, which are easy to install.
composer.json
{
    "name": "wiki-parser",
    "require": {
        "php": ">=5.3.3",
        "symfony/dom-crawler": "2.1.0",
        "symfony/css-selector": "2.1.0"
    }
}
In the console:
$ php composer.phar install
Take a look at getcomposer.
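As one possible way to persist the `$disciplines` array that parser.php builds, the sketch below walks the two-level array and hands each row to an insert routine. The function name `persistDisciplines` and the `$insert` callable are hypothetical, and the sample array is made up for the demo. An in-memory recorder stands in for the database so the example runs without a connection.

```php
<?php
// $disciplines is assumed to have the shape produced by parser.php:
// array('Root title' => array('Child title' => 'Child title', ...), ...)
// $insert is a hypothetical callable: (parentId, title) => new row ID.
function persistDisciplines(array $disciplines, $insert)
{
    $count = 0;
    foreach ($disciplines as $rootTitle => $children) {
        // Insert the root first so its ID is available for the children
        $rootId = call_user_func($insert, null, $rootTitle);
        $count++;
        foreach ($children as $childTitle) {
            call_user_func($insert, $rootId, $childTitle);
            $count++;
        }
    }
    return $count; // number of rows handed to $insert
}

// Demo: record rows in memory instead of writing to a real table.
$rows = array();
$insert = function ($parentId, $title) use (&$rows) {
    $rows[] = array('ParentID' => $parentId, 'Title' => $title);
    return count($rows); // pretend auto-increment ID
};
$sample = array(
    'Anthropology' => array('Archaeology' => 'Archaeology', 'Ethnology' => 'Ethnology'),
);
$inserted = persistDisciplines($sample, $insert);
```

With PDO, the callable would presumably execute a prepared INSERT and return `$db->lastInsertId()`, in the same way as the loop in the first answer.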
https://stackoverflow.com/questions/12449355