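I have some PHP code that successfully searches the $post data for the $list keywords and echoes the results wherever the similarity is roughly 80-90%. The code is as follows: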
$list = array(
    "Data" => "9",
    "Data Structure" => "10",
    "Database" => "11",
    "Creativity" => "12",
    "Forest" => "13",
    "Al Pacino" => "14",
    "Humans" => "15",
    "Technology" => "16"
);
$post = array('Database', 'Law', 'Tech', 'Creative');
$all_key_values = $all_keys = array();
foreach ($post as $keyword) {
    foreach ($list as $word => $num) {
        // Number of matching characters between the two strings.
        $sim_chars = similar_text($keyword, $word);
        if ($sim_chars / strlen($keyword) > .8 || $sim_chars / strlen($word) > .8) {
            $all_key_values[] = $num;
            $all_keys[] = $word;
        }
        // Otherwise accept the pair if one string contains the other.
        elseif (stripos($keyword, $word) !== false || strpos($word, $keyword) !== false) {
            $all_key_values[] = $num;
            $all_keys[] = $word;
        }
    }
}
print_r(implode(',', $all_key_values));
print_r(implode(',', $all_keys));

Now, the problem is that I want to use an Aho-Corasick library written in Java to search $fulltext for the $list keywords. You can find the code here.
require_once("http://localhost:8080/JavaBridge/java/Java.inc");
$list = array(
    "Data" => "9",
    "Data Structure" => "10",
    "Database" => "11",
    "Creativity" => "12",
    "Forest" => "13",
    "Al Pacino" => "14",
    "Humans" => "15",
    "Technology" => "16"
);
$fulltext = "A forest, also referred to as a wood or the woods, is an area with a high density of trees. As with cities, depending on various cultural definitions, what is considered a forest may vary significantly in size and have different classifications according to how and of what the forest is composed.[1] A forest is usually an area filled with trees but any tall densely packed area of vegetation may be considered a forest, even underwater vegetation such as kelp forests, or non-vegetation such as fungi,[2] and bacteria. Tree forests cover approximately 9.4 percent of the Earth's surface (or 30 percent of total land area), though they once covered much more (about 50 percent of total land area). They function as habitats for organisms, hydrologic flow modulators, and soil conservers, constituting one of the most important aspects of the biosphere. A typical tree forest is composed of the overstory (canopy or upper tree layer) and the understory. The understory is further subdivided into the shrub layer, herb layer, and also the moss layer and soil microbes. In some complex forests, there is also a well-defined lower tree layer. Forests are central to all human life because they provide a diverse range of resources: they store carbon, aid in regulating the planetary climate, purify water and mitigate natural hazards such as floods. Forests also contain roughly 90 percent of the worlds terrestrial biodiversity.";

So my question is: how do I call the Aho-Corasick library to search $fulltext for the $list keywords and find the ones that match 100%? Thank you very much for your help and time.
Posted on 2014-06-28 04:51:32
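You cannot include a Java library in PHP code. You can, however, write a Java server application that accepts data from your PHP code. Anything from socket communication or a web service to a simple command-line tool is conceivable. Of course, you could also reimplement the Java library in PHP, which would probably teach you a lot about PHP, Java, and the algorithm.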
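As a rough illustration of the reimplementation route, here is a minimal Aho-Corasick sketch in plain PHP (assuming PHP 7 or later). It is not a port of the Java library and does not reflect its API; the function names build_automaton() and search_keywords() are made up for this example. It assumes case-insensitive, byte-wise matching of ASCII keywords and reports substring hits, so "Forest" would also match inside "forests".

```php
<?php
// Minimal Aho-Corasick sketch. build_automaton() and search_keywords()
// are hypothetical names for this example, not part of any library.

function build_automaton(array $keywords): array
{
    // Node 0 is the root. Each node stores its outgoing edges, a failure
    // link, and the keywords recognised when the node is reached.
    $nodes = [['next' => [], 'fail' => 0, 'out' => []]];

    // Phase 1: insert every keyword into a trie (lowercased).
    foreach ($keywords as $keyword) {
        $keyword = (string) $keyword;
        $lower = strtolower($keyword);
        if ($lower === '') {
            continue; // ignore empty keywords
        }
        $state = 0;
        $len = strlen($lower);
        for ($i = 0; $i < $len; $i++) {
            $ch = $lower[$i];
            if (!isset($nodes[$state]['next'][$ch])) {
                $nodes[] = ['next' => [], 'fail' => 0, 'out' => []];
                $nodes[$state]['next'][$ch] = count($nodes) - 1;
            }
            $state = $nodes[$state]['next'][$ch];
        }
        $nodes[$state]['out'][] = $keyword;
    }

    // Phase 2: breadth-first pass that sets the failure links.
    // Depth-1 nodes keep their default failure link to the root.
    $queue = array_values($nodes[0]['next']);
    for ($head = 0; $head < count($queue); $head++) {
        $state = $queue[$head];
        foreach ($nodes[$state]['next'] as $ch => $child) {
            // Walk the parent's failure chain until a node with an edge
            // on $ch is found, or the root is reached.
            $fail = $nodes[$state]['fail'];
            while ($fail !== 0 && !isset($nodes[$fail]['next'][$ch])) {
                $fail = $nodes[$fail]['fail'];
            }
            if (isset($nodes[$fail]['next'][$ch])) {
                $fail = $nodes[$fail]['next'][$ch];
            }
            $nodes[$child]['fail'] = $fail;
            // Inherit keywords that end at the failure target.
            $nodes[$child]['out'] = array_merge($nodes[$child]['out'], $nodes[$fail]['out']);
            $queue[] = $child;
        }
    }
    return $nodes;
}

function search_keywords(array $nodes, string $text): array
{
    $found = [];
    $state = 0;
    $lower = strtolower($text);
    $len = strlen($lower);
    for ($i = 0; $i < $len; $i++) {
        $ch = $lower[$i];
        // On a mismatch, fall back along failure links.
        while ($state !== 0 && !isset($nodes[$state]['next'][$ch])) {
            $state = $nodes[$state]['fail'];
        }
        if (isset($nodes[$state]['next'][$ch])) {
            $state = $nodes[$state]['next'][$ch];
        }
        foreach ($nodes[$state]['out'] as $keyword) {
            $found[$keyword] = true; // record each keyword once
        }
    }
    return array_keys($found);
}

// Usage with a shortened keyword list and a made-up sentence.
$list = array("Data" => "9", "Database" => "11", "Forest" => "13");
$automaton = build_automaton(array_keys($list));
$matches = search_keywords($automaton, "A forest stores no database.");
foreach ($matches as $word) {
    echo $word, " => ", $list[$word], "\n";
}
```

The text is scanned once regardless of how many keywords there are, which is the main reason to prefer this over one stripos() call per keyword when the list is large. If only whole-word hits should count, the match positions would need an extra boundary check, which this sketch leaves out.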
Posted on 2014-06-28 05:05:59
The old PHP-Java bridge is defunct. There is still the PHP/Java Bridge, but getting it running would probably take quite a bit of extra coding.
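However, there are some Aho-Corasick implementations in PHP that would probably solve your problem in the least time. If you want to try something really cool, take a look at Caucho Quercus, a reimplementation of PHP that runs inside a Java application server. It is really cool, and calling Java code from PHP is a breeze.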
https://stackoverflow.com/questions/24460572