Task: I have two columns of product names. I need to find the most similar cell in column B for cell A1, then for A2, A3, and so on.
Input:
Col A | Col B
-------------
Red | Blackwell
Black | Purple
White | Whitewater
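Green | Reddit
Output:
Red = Reddit / 66% similar
Black = Blackwell / 71% similar
White = Whitewater / 66% similar
Green = Reddit / 30% similar
I think Levenshtein distance could help with the ranking, but I don't know how to apply it.
Thanks in advance; any information is helpful.
Posted on 2018-02-16 13:13:54
Use nested loops: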
<?php
// Arrays of words
$colA = ['Red', 'Black', 'White', 'Green'];
$colB = ['Blackwell', 'Purple', 'Whitewater', 'Reddit'];
// Loop through the words to find the closest match for each
foreach ($colA as $a) {
    // Current max number of matching characters
    $maxMatches = -1;
    $bestMatch = '';
    // Percentage of the best match so far (stays 0 if $colB is empty)
    $matchPercentage = 0.0;
    foreach ($colB as $b) {
        // Calculate the number of matching characters and the percentage
        $matches = similar_text($a, $b, $percent);
        if ($matches > $maxMatches) {
            // Found a better match, update
            $maxMatches = $matches;
            $bestMatch = $b;
            $matchPercentage = $percent;
        }
    }
    echo "$a = $bestMatch / " .
        number_format($matchPercentage, 2) .
        "% similar\n";
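}
The outer loop iterates over the elements of the first array and, for each element, initializes the best match found so far and its number of matching characters.
The inner loop iterates over the array of candidate matches, looking for the best one. For each candidate it measures the similarity (you could use levenshtein here instead of similar_text, but the latter is more convenient because it calculates the percentage for you), and if the current candidate matches better than the current best match, it updates those variables.
For each word in the outer loop, we echo the best match found and its percentage, formatted as requested.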
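If you would rather try levenshtein in the same nested-loop shape, here is a minimal sketch. Since levenshtein returns an edit distance rather than a percentage, the sketch assumes one common normalisation: divide the distance by the length of the longer string. The helper names levenshteinPercent and bestLevenshteinMatch are made up for this example, and the resulting percentages will not be the same as the ones similar_text reports.

```php
<?php
// Hypothetical helper (not from the answer above): converts an edit distance
// into a 0-100 score by dividing by the length of the longer string.
function levenshteinPercent(string $a, string $b): float
{
    $longest = max(strlen($a), strlen($b));
    if ($longest === 0) {
        return 100.0; // two empty strings are identical
    }
    return (1 - levenshtein($a, $b) / $longest) * 100;
}

// Hypothetical helper: returns [best candidate, its score] for one word.
function bestLevenshteinMatch(string $word, array $candidates): array
{
    $bestMatch = null;
    $bestPercent = -1.0;
    foreach ($candidates as $candidate) {
        $percent = levenshteinPercent($word, $candidate);
        if ($percent > $bestPercent) {
            // Higher score means smaller relative distance, so keep it
            $bestPercent = $percent;
            $bestMatch = $candidate;
        }
    }
    return [$bestMatch, $bestPercent];
}

$colA = ['Red', 'Black', 'White', 'Green'];
$colB = ['Blackwell', 'Purple', 'Whitewater', 'Reddit'];

foreach ($colA as $a) {
    [$bestMatch, $percent] = bestLevenshteinMatch($a, $colB);
    echo "$a = $bestMatch / " . number_format($percent, 2) . "% similar\n";
}
```

With this normalisation Red pairs with Reddit (50.00%), Black with Blackwell (55.56%) and White with Whitewater (50.00%). Keep in mind that levenshtein is case-sensitive, so you may want to lowercase both strings first.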
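Posted on 2018-02-16 13:36:57
I don't know where you got those expected percentages, so I'll just use the values produced by the PHP functions, and you can decide whether you want to do any calculation on them.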
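For what it's worth, the first three expected percentages in the question look like truncated similar_text() values. The percentage that function reports is the number of matching characters times two, divided by the combined length of both strings. A small sketch to check this by hand, using only pairs from the question (the variable names are mine):

```php
<?php
// Pairs taken from the question's expected output.
$pairs = [['Red', 'Reddit'], ['Black', 'Blackwell'], ['White', 'Whitewater']];

foreach ($pairs as [$a, $b]) {
    // similar_text() returns the match count and fills $percent by reference
    $matches = similar_text($a, $b, $percent);
    // Recompute the percentage by hand from the match count
    $manual = $matches * 2 / (strlen($a) + strlen($b)) * 100;
    printf(
        "%s / %s: %d matching chars, %.2f%% (by hand: %.2f%%)\n",
        $a, $b, $matches, $percent, $manual
    );
}
// Red / Reddit: 3 matching chars, 66.67% (by hand: 66.67%)
// Black / Blackwell: 5 matching chars, 71.43% (by hand: 71.43%)
// White / Whitewater: 5 matching chars, 66.67% (by hand: 66.67%)
```

The 30% for Green does not come out of this formula, so that one may have been estimated some other way.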
levenshtein() simply doesn't give the matches you asked for in the question. I think using similar_text() would be more sensible.
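One caveat when comparing the two functions: levenshtein() returns an edit distance, so a smaller number means a closer match, whereas a larger similar_text() percentage means a closer match. If you want to experiment with levenshtein() anyway, a rough sketch that ranks candidates with the smallest distance first could look like this (rankByLevenshtein is a name I made up for the example):

```php
<?php
$arrayB = ['Blackwell', 'Purple', 'Whitewater', 'Reddit'];

// Hypothetical helper: returns candidate => distance, closest candidate first.
function rankByLevenshtein(string $word, array $candidates): array
{
    $distances = [];
    foreach ($candidates as $candidate) {
        $distances[$candidate] = levenshtein($word, $candidate);
    }
    asort($distances); // ascending: smallest distance (closest match) first
    return $distances;
}

foreach (['Red', 'Black'] as $word) {
    $ranked = rankByLevenshtein($word, $arrayB);
    reset($ranked); // make sure key()/current() point at the first element
    echo "$word is closest to " . key($ranked) .
        " (distance " . current($ranked) . ")\n";
}
// Red is closest to Reddit (distance 3)
// Black is closest to Blackwell (distance 4)
```

Raw distances are not adjusted for string length, so short words can tie across several candidates or pick an unexpected winner. That is one reason the percentage from similar_text() is easier to work with here.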
Code: (Demo)
$arrayA = ['Red', 'Black', 'White', 'Green'];
$arrayB = ['Blackwell', 'Purple', 'Whitewater', 'Reddit'];
// similar_text
foreach ($arrayA as $a) {
    // generate assoc array of assessments (candidate => percentage)
    $temp = array_combine($arrayB, array_map(function ($v) use ($a) {
        similar_text($v, $a, $percent);
        return $percent;
    }, $arrayB));
    arsort($temp); // sort descending (highest percentage first)
    // access first key and value
    $result[] = "$a is most similar to " . key($temp) . " (sim-score:" . number_format(current($temp)) . "%)";
}
var_export($result);
echo "\n--\n";
// levenshtein doesn't offer the desired matching
foreach ($arrayA as $a) {
    // generate assoc array of assessments (candidate => edit distance)
    $temp = array_combine($arrayB, array_map(function ($v) use ($a) {
        return levenshtein($v, $a);
    }, $arrayB));
    arsort($temp); // sort descending (largest distance first)
    // access first key and value
    $result2[] = "$a is most similar to " . key($temp) . " (lev-score:" . current($temp) . ")";
}
var_export($result2);
Output:
array (
0 => 'Red is most similar to Reddit (sim-score:67%)',
1 => 'Black is most similar to Blackwell (sim-score:71%)',
2 => 'White is most similar to Whitewater (sim-score:67%)',
3 => 'Green is most similar to Purple (sim-score:36%)',
)
--
array (
0 => 'Red is most similar to Whitewater (lev-score:9)',
1 => 'Black is most similar to Whitewater (lev-score:9)',
2 => 'White is most similar to Blackwell (lev-score:8)',
3 => 'Green is most similar to Blackwell (lev-score:8)',
)
https://stackoverflow.com/questions/48825825