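I have this dataset (mydata.csv) and I want to run a non-parametric paired test (Wilcoxon), but when I run my code I get the error below. I don't understand what it means or how to fix it: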
import pandas as pd
import pingouin as pg
df = pd.read_csv("mydata.csv")
              id  price    review  score
0        7949480   99.0  Check-in   10.0
1        6627449  125.0  Check-in   10.0
2        5557381   69.0  Check-in   10.0
3        9147025  125.0  Check-in   10.0
4       11675715   85.0  Check-in   10.0
...          ...    ...       ...    ...
273745  12288416  130.0     Value   10.0
273746   7930288   95.0     Value   10.0
273747  18342528   50.0     Value   10.0
273748  16232278   42.0     Value    9.0
273749  18223756  115.0     Value   10.0
pg.pairwise_ttests(dv="price", within="review", subject="id", data=df, parametric=False).round(3)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-17-cb7090cfec05> in <module>
----> 1 pg.pairwise_ttests(dv="price", within="review", subject="id", data=reviews_test, parametric=False).round(3)
~\anaconda3\lib\site-packages\pingouin\pairwise.py in pairwise_ttests(data, dv, between, within, subject, parametric, marginal, alpha, tail, padjust, effsize, correction, nan_policy, return_desc, interaction, within_first)
402 if paired:
403 stat_name = 'W-val'
--> 404 df_ttest = wilcoxon(x, y, tail=tail)
405 else:
406 stat_name = 'U-val'
~\anaconda3\lib\site-packages\pingouin\nonparametric.py in wilcoxon(x, y, tail)
443
444 # Compute test
--> 445 wval, pval = scipy.stats.wilcoxon(x, y, zero_method='wilcox',
446 correction=True, alternative=tail)
447
~\anaconda3\lib\site-packages\scipy\stats\morestats.py in wilcoxon(x, y, zero_method, correction, alternative, mode)
2961 if zero_method in ["wilcox", "pratt"]:
2962 if n_zero == len(d):
-> 2963 raise ValueError("zero_method 'wilcox' and 'pratt' do not "
2964 "work if x - y is zero for all elements.")
2965 if zero_method == "wilcox":
ValueError: zero_method 'wilcox' and 'pratt' do not work if x - y is zero for all elements.
Posted on 2020-12-10 09:06:10
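This is too long for a comment. The problem shows up when the data are pivoted: each id has the same price under every review group: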
import pandas as pd
import pingouin as pg
df = pd.read_csv("https://github.com/felfonsecal/StackOverflowQuestions/raw/main/mydata.csv")
df[df['id']==7949480]
             id  price         review  score
0       7949480   99.0       Check-in   10.0
45625   7949480   99.0    Cleanliness   10.0
91250   7949480   99.0  Communication   10.0
136875  7949480   99.0       Location   10.0
182500  7949480   99.0         Rating   10.0
228125  7949480   99.0          Value   10.0
df.groupby(['id'])['price'].nunique()
id
590 1
592 1
686 1
930 1
1235 1
..
18561941 1
18562129 1
18569355 1
18577490 1
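18598103 1
For every individual, the price does not change across groups, so running a paired test on data like this is meaningless: every paired difference is zero, which is exactly what the ValueError reports.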
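A quick check along these lines can be run before calling the paired test. The sketch below uses a small made-up table in the same long format as the question. The helper name `varies_within_subject` and the toy values are illustrative assumptions, not part of pingouin or of the original data.

```python
import pandas as pd

# Toy data in the same long format as the question (values are made up).
toy = pd.DataFrame({
    "id":     [1, 1, 1, 2, 2, 2],
    "review": ["Check-in", "Location", "Value"] * 2,
    "price":  [99.0, 99.0, 99.0, 125.0, 125.0, 125.0],
    "score":  [10.0, 9.0, 8.0, 9.0, 10.0, 7.0],
})


def varies_within_subject(data, dv, subject):
    """Return True if `dv` takes more than one value for at least one subject."""
    return bool((data.groupby(subject)[dv].nunique() > 1).any())


# Wide table: one row per subject, one column per review category.
wide = toy.pivot(index="id", columns="review", values="price")

# Paired differences between two categories, as a paired test would form them.
diff = wide["Check-in"] - wide["Value"]

print(varies_within_subject(toy, "price", "id"))  # price is constant per id
print(varies_within_subject(toy, "score", "id"))  # score changes within an id
print((diff == 0).all())                          # all paired differences are zero
```

If the check returns False for the chosen dv, every within-subject difference is zero and a paired test has nothing to compare. The toy `score` column is included only to show the check returning True. Whether it is the right dependent variable for the real data is a separate question for the analysis.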
https://stackoverflow.com/questions/65225316