Question: Laplace smoothing for Biopython

Stack Overflow user

Asked 2010-10-25 08:09:13

1 answer, 4.1K views, 0 followers, 4 votes

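I am trying to add Laplace smoothing support to Biopython's Naive Bayes code for my bioinformatics project.

I have read a lot of material on the Naive Bayes algorithm and Laplace smoothing, and I think I understand the basic idea, but I cannot integrate it with that code (in fact, I cannot see where the 1, the Laplace count, should be added).

I am not familiar with Python and I am new to programming. I would appreciate any suggestions from someone who knows Biopython.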

Tags: biopython, python, machine-learning, bayesian

1 Answer

Stack Overflow user

Accepted answer

Answered 2010-10-25 10:38:48

Try using the following definition of the _contents() method instead:

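def _contents(items, laplace=False):
    # count occurrences of values
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1.0
    # normalize
    for k in counts:
        if laplace:
            # add-one smoothing: one pseudo-count per value seen in items,
            # so the denominator grows by the number of distinct values
            counts[k] += 1.0
            counts[k] /= (len(items) + len(counts))
        else:
            # plain relative frequency
            counts[k] /= len(items)
    return counts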

Then change the call on line 194 to:

# Estimate P(value|class,dim)
nb.p_conditional[i][j] = _contents(values, True)

Pass True to enable smoothing and False to disable it.

Here is a comparison of the output without and with smoothing:

# without
>>> carmodel.p_conditional
[[{'Red': 0.40000000000000002, 'Yellow': 0.59999999999999998},
  {'SUV': 0.59999999999999998, 'Sports': 0.40000000000000002},
  {'Domestic': 0.59999999999999998, 'Imported': 0.40000000000000002}],
 [{'Red': 0.59999999999999998, 'Yellow': 0.40000000000000002},
  {'SUV': 0.20000000000000001, 'Sports': 0.80000000000000004},
  {'Domestic': 0.40000000000000002, 'Imported': 0.59999999999999998}]]

# with
>>> carmodel.p_conditional
[[{'Red': 0.42857142857142855, 'Yellow': 0.5714285714285714},
  {'SUV': 0.5714285714285714, 'Sports': 0.42857142857142855},
  {'Domestic': 0.5714285714285714, 'Imported': 0.42857142857142855}],
 [{'Red': 0.5714285714285714, 'Yellow': 0.42857142857142855},
  {'SUV': 0.2857142857142857, 'Sports': 0.7142857142857143},
  {'Domestic': 0.42857142857142855, 'Imported': 0.5714285714285714}]]
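
The smoothed numbers can be checked by hand. The snippet below is a minimal sketch that assumes the first class has five training instances, two 'Red' and three 'Yellow'. Those counts are inferred from the unsmoothed 0.4 / 0.6 output, not taken from the original data set.

```python
def _contents(items, laplace=False):
    # same logic as the definition given earlier in this answer
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1.0
    for k in counts:
        if laplace:
            counts[k] += 1.0
            counts[k] /= (len(items) + len(counts))
        else:
            counts[k] /= len(items)
    return counts


# Hypothetical colour column for one class: 2 Red, 3 Yellow (assumed counts)
colours = ["Red", "Red", "Yellow", "Yellow", "Yellow"]

plain = _contents(colours)           # 2/5 and 3/5
smoothed = _contents(colours, True)  # (2+1)/(5+2) and (3+1)/(5+2)

print(plain)
print(smoothed)
```

With two distinct values, the smoothed denominator is 5 + 2 = 7, which gives 3/7 ≈ 0.4286 for 'Red' and 4/7 ≈ 0.5714 for 'Yellow', matching the first entry of the smoothed output.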

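Beyond that, I think there may be a bug in the code:

The code splits the instances by class, and then, for each class and each dimension, counts how many times each value of that dimension appears.

The problem is that if, within the subset of instances belonging to one class, not every value of a dimension happens to appear, then _contents() will not see all possible values when it is called, and so it will return the wrong probabilities...

I think you need to keep track of all the unique values for each dimension (from the entire data set) and take them into account during counting.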

4 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/4011115
