Origin
Pearson's χ² test (Pearson's Chi-squared Test), also called the chi-square test, was proposed by the British statistician Karl Pearson at the end of the 19th century.
## Statistics for All Table Factors
##
## Pearson's Chi-squared
## ------------------------
## Chi^2 = 12.85707    d.f. = 1    p = 0.0003362066
##
## Pearson's Chi-squared
## --------------------
## Chi^2 = 11.3923    d.f. = 1    p = 0.0007374901
##
## McNemar's Chi-squared
## ----------------------------
## Chi^2 = 50.7    d.f. = 1    p = 1.076196e-12

## McNemar's Chi-squared test with continuity correction
##
## data: ana
## McNemar's chi-squared = 5.7857, df = 1, p-value [...]
[...] data = mtcars)
#>
#> Kruskal-Wallis rank sum test
#>
#> data: wt by factor(cyl)
#> Kruskal-Wallis chi-squared [...]

#> Fligner-Killeen test of homogeneity of variances
#>
#> data: wt by cyl
#> Fligner-Killeen:med chi-squared [...]

chisq.test(TeaTasting)
#> Warning in chisq.test(TeaTasting): Chi-squared approximation may be incorrect
#>
#> Pearson's Chi-squared test with Yates' continuity correction
#>
#> data: TeaTasting
#> X-squared [...]

[...] test with continuity correction
#>
#> data: Performance
#> McNemar's chi-squared = 17, df = 1, p-value [...]
To test a sample against unequal expected frequencies (for example, 0.75 and 0.25 below):

# Probability table; the values must sum to 1
pt <- c(.75, .25)
chisq.test(ct, p=pt)
#>
#> Chi-squared [...]

[...] : Named num 1
#>  ..- attr(*, "names")= chr "df"
#> $ p.value : num 0.0204
#> $ method : chr "Chi-squared [...]

#>             0  1
#> control    11  3
#> treatment   6 10
chisq.test(ct)
#>
#> Pearson's Chi-squared [...]
#> X-squared = 3.593, df = 1, p-value = 0.05802
chisq.test(ct, correct=FALSE)
#>
#> Pearson's Chi-squared [...]

[...] test with continuity correction
#>
#> data: ct
#> McNemar's chi-squared = 4, df = 1, p-value = 0.0455
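The control/treatment table above can be checked by hand. The following is a minimal sketch in plain Python, not the R implementation: the helper names `chi2_2x2` and `chi2_p_value_df1` are made up for illustration, and the p-value formula only holds for 1 degree of freedom. With the Yates correction it should reproduce the reported X-squared = 3.593 and p-value = 0.05802, and a statistic of 4 gives the 0.0455 shown in the McNemar output.

```python
import math


def chi2_p_value_df1(statistic):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    # For df = 1 only: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


def chi2_2x2(table, correct=True):
    """Pearson chi-squared statistic for a 2x2 table, optionally Yates-corrected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if correct:
                # Yates' correction: shrink |O - E| by 0.5, never below zero.
                diff = max(0.0, diff - 0.5)
            statistic += diff ** 2 / expected
    return statistic


# Counts copied from the control/treatment table in the text.
ct = [[11, 3], [6, 10]]
stat_yates = chi2_2x2(ct, correct=True)
stat_plain = chi2_2x2(ct, correct=False)
print("with correction:   ", stat_yates, chi2_p_value_df1(stat_yates))
print("without correction:", stat_plain, chi2_p_value_df1(stat_plain))
```

Dropping the correction (`correct=False`) gives a larger statistic and therefore a smaller p-value for the same table, which is why the two `chisq.test` calls can disagree about significance at the 0.05 level.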
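# import the necessary packages
import numpy as np

def chi2_distance(histA, histB, eps=1e-10):
    # compute the chi-squared distance; eps guards against division by zero
    d = 0.5 * np.sum([((a - b) ** 2) / (a + b + eps)
                      for (a, b) in zip(histA, histB)])
    # return the chi-squared distance
    return d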
#> Fligner-Killeen test of homogeneity of variances
#>
#> data: count by spray
#> Fligner-Killeen:med chi-squared [...]

[...] test of homogeneity of variances
#>
#> data: len by interaction(supp, dose)
#> Fligner-Killeen:med chi-squared [...]

#> Fligner-Killeen test of homogeneity of variances
#>
#> data: len by dose
#> Fligner-Killeen:med chi-squared [...]
       mean       sd  group
低剂组 14.2 2.167948 低剂组
高剂组 25.0 1.581139 高剂组
模型组  7.2 3.420526 模型组

$低剂组
Chi-squared test for given probabilities
data: X[[i]]
X-squared = 1.3239, df = 4, p-value = 0.8573

$高剂组
Chi-squared test for given probabilities
data: X[[i]]
X-squared = 0.4, df = 4, p-value = 0.9825

$模型组
Chi-squared [...]

[...] value adjustment method: bonferroni

Kruskal-Wallis rank sum test
data: y$V1 and a1
Kruskal-Wallis chi-squared [...]
> chisq.test(a)
Pearson's Chi-squared test with Yates' continuity correction
data: a
X-squared = 4.3672 [...]

[...] 0.14583333 0.14814815 0.06666667
Warning message:
In prop.test(caesar.shoe.yes, caesar.shoe.total) : Chi-squared [...]

> prop.trend.test(caesar.shoe.yes,caesar.shoe.total)
Chi-squared Test for Trend in Proportions
data: [...]

Divorced  36  46  38 21
Single   218 327 106 67
> chisq.test(caff.marital)
Pearson's Chi-squared [...]

chisq.test() can also be applied to raw data; here we use the juul data from earlier as an example:
> attach(juul)
> chisq.test(tanner,sex)
Pearson's Chi-squared [...]
[...] mydata)
fit
##
## Kruskal-Wallis rank sum test
##
## data: death_rate by drug
## Kruskal-Wallis chi-squared [...]

Friedman M test:
fit <- friedman.test(df)
fit
##
## Friedman rank sum test
##
## data: df
## Friedman chi-squared [...]
##     [...]    1
##   0 1610  222
##   1 2987  729
chisq.test(tab,correct = F)
##
## Pearson's Chi-squared [...]

##     [...]    1
##   0 2777  751
##   1 1820  200
chisq.test(tab,correct = F)
##
## Pearson's Chi-squared [...]

[...] $catholic, correct = F): Chi-squared
## approximation may be incorrect
## [1] 0.4755703 0.8423902 0.5696924

[...] $catholic, correct = F): Chi-squared
## approximation may be incorrect
## Warning in chisq.test(. $catholic, correct = F): Chi-squared
## approximation may be incorrect
## [1] 0.3022080 0.5994507 0.9316443
[...] Tree (CART)
Iterative Dichotomiser 3 (ID3)
C4.5 and C5.0 (different versions of a powerful approach)
Chi-squared [...]
[...] (Treatment~Improved,data=Arthritis,distribution=approximate(B=9999))
Approximative Pearson's Chi-Squared Test
data: Treatment by Improved (1, 2, 3)
chi-squared = 13.055, p-value = 0.0018

The variable Improved has to be converted from an ordered factor to a categorical factor because [...]
[...] rpubs.com/chixinzero/490992) makes the difference clear
> chisq.test(table(count_matrix['HLA-A',]>0, cluster))
Pearson's Chi-squared [...] , df = 1, p-value = 0.01347
> chisq.test(table(count_matrix['HLA-B',]>0, cluster))
Pearson's Chi-squared [...]
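Multiple mutation spectra are compared and the result is shown as a heatmap. The following four distances are defined to measure the difference between mutation spectra: Chi-squared distance, Cosine distance, Hellinger distance, Jensen-Shannon [...]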
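To make the four measures concrete, here is a minimal sketch in plain Python. It is not the tool's own code: the function names are illustrative, several variants of each definition exist (for example, the Jensen-Shannon quantity is computed here as a base-2 divergence rather than its square root), and the two example spectra are made-up counts.

```python
import math


def normalize(counts):
    """Turn raw counts into proportions that sum to 1."""
    total = float(sum(counts))
    return [c / total for c in counts]


def chi2_distance(p, q, eps=1e-10):
    # Half the sum of squared differences scaled by the bin totals.
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))


def cosine_distance(p, q):
    # One minus the cosine of the angle between the two vectors.
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / (norm_p * norm_q)


def hellinger_distance(p, q):
    # Euclidean distance between square-root vectors, scaled into [0, 1].
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2.0)


def jensen_shannon_divergence(p, q):
    # Average KL divergence of p and q from their midpoint m (log base 2).
    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    m = [(a + b) / 2.0 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical spectra: counts per mutation category, invented for this sketch.
spectrum_a = normalize([30, 10, 5, 5])
spectrum_b = normalize([10, 30, 5, 5])
for name, fn in [("chi-squared", chi2_distance), ("cosine", cosine_distance),
                 ("Hellinger", hellinger_distance), ("Jensen-Shannon", jensen_shannon_divergence)]:
    print(name, fn(spectrum_a, spectrum_b))
```

With these definitions all four measures are 0 for identical spectra and 1 for spectra with no category in common, so their values can be shown on the same heatmap colour scale.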
chisq.test(mytable)
Output:
Pearson's Chi-squared test
data: mytable
X-squared = 13.055, df = 2, p-value [...]
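The statistic in this output can be recomputed from a contingency table. The sketch below is plain Python under a loud assumption: the source does not show `mytable`, so the counts used here are assumed to be the Treatment by Improved cross-tabulation of the Arthritis data. The helper names are illustrative, and the closed-form p-value is valid only for 2 degrees of freedom.

```python
import math


def chi2_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def chi2_p_value_df2(statistic):
    # For df = 2 only: the upper tail of the chi-squared distribution is exp(-x / 2).
    return math.exp(-statistic / 2.0)


# ASSUMED counts (not shown in the source): rows Placebo/Treated,
# columns None/Some/Marked improvement.
assumed_table = [[29, 7, 7], [13, 7, 21]]
statistic, df = chi2_independence(assumed_table)
p_value = chi2_p_value_df2(statistic)
print(statistic, df, p_value)
```

If the assumed counts are right, the statistic agrees with the X-squared = 13.055 and df = 2 reported above.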
[...] .)
#>
#> Portmanteau Test (asymptotic)
#>
#> data: Residuals of VAR object var3
#> Chi-squared = 34, df = [...]
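Univariate feature selection: features are ranked and selected by a score computed between each feature and the target attribute.
Ranking criteria: Pearson correlation coefficient; distance metrics (similarity measures); Chi-squared test; Information [...]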
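As an illustration of the chi-squared criterion, the following plain-Python sketch scores each categorical feature by the Pearson chi-squared statistic of its contingency table with the target and sorts the features by that score. The function names and the toy data are hypothetical; a real pipeline would more likely use an existing library, and raw statistics are only loosely comparable between features with different numbers of categories.

```python
from collections import Counter


def chi2_score(feature, target):
    """Pearson chi-squared statistic of the feature-by-target contingency table."""
    n = len(feature)
    joint = Counter(zip(feature, target))
    feature_totals = Counter(feature)
    target_totals = Counter(target)
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for t_value, t_count in target_totals.items():
            expected = f_count * t_count / n
            observed = joint.get((f_value, t_value), 0)
            score += (observed - expected) ** 2 / expected
    return score


def rank_features(features, target):
    """Return (name, score) pairs sorted from most to least associated."""
    scores = {name: chi2_score(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one feature mirrors the target, the other is unrelated.
target = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "informative": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "noise": ["x", "y", "x", "y", "x", "y", "x", "y"],
}
ranking = rank_features(features, target)
print(ranking)
```

Keeping the top-k entries of `ranking` is the selection step; the score itself says nothing about redundancy between features, which is the usual limitation of univariate ranking.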
[...] type="PT.asymptotic")
#>
#> Portmanteau Test (asymptotic)
#>
#> data: Residuals of VAR object var3
#> Chi-squared [...]
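[...] end) as 差 from data group by X1")
3. Apply the chi-squared test: model quality and the correlation of the residuals
chisq.test(a) ## drop the extra column: number of experiments
Pearson's Chi-squared [...]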