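I have a DataFrame, df_pattern, that holds the pattern (the rules):
import pandas as pd

df_pattern = pd.DataFrame({'SiteId': [4, 5, 6, 7, 8],
                           'ZoneId': [1, 1, 1, 2, 2]})
A second DataFrame, df_checked, must follow this pattern:
df_checked = pd.DataFrame({'SiteId': [6, 5, 7, 4, 8, 7, 5, 8, 6],
                           'ZoneId': [1, 1, 2, 2, 2, 2, 1, 1, 1]})
SiteId values 4, 5 and 6 must be associated only with ZoneId 1, and 7 and 8 only with ZoneId 2. So the result should list the rows of df_checked that break the pattern:
index  SiteId  ZoneId
    3       4       2
    7       8       1
Thanks.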
Posted on 2021-02-05 08:31:58
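Add a pattern column to df_pattern to flag its rows as pattern rows, then left-merge df_checked with df_pattern. Any row of df_checked whose pattern value is null after the merge is not a pattern row.
df_pattern['pattern'] = 1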
dfn = pd.merge(df_checked, df_pattern, how='left')
print(dfn.loc[dfn.pattern.isnull(), ['SiteId','ZoneId']])
   SiteId  ZoneId
3       4       2
7       8       1
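The flag-column approach above can be sketched as a self-contained script. This is a minimal version under two assumptions that are not in the original answer: it uses assign() so that df_pattern is not modified in place, and it names the merge keys explicitly with on=. The names flagged and violations are illustrative only.

```python
import pandas as pd

df_pattern = pd.DataFrame({'SiteId': [4, 5, 6, 7, 8],
                           'ZoneId': [1, 1, 1, 2, 2]})
df_checked = pd.DataFrame({'SiteId': [6, 5, 7, 4, 8, 7, 5, 8, 6],
                           'ZoneId': [1, 1, 2, 2, 2, 2, 1, 1, 1]})

# assign() returns a flagged copy, so df_pattern itself stays unchanged
# (illustrative choice; the answer above adds the column in place).
flagged = df_pattern.assign(pattern=1)

# Left merge on both columns: rows of df_checked with no matching
# (SiteId, ZoneId) pair in the pattern get NaN in the pattern column.
dfn = df_checked.merge(flagged, how='left', on=['SiteId', 'ZoneId'])

# Keep only the rows that did not match any pattern row.
violations = dfn.loc[dfn['pattern'].isna(), ['SiteId', 'ZoneId']]
print(violations)
```

Because every (SiteId, ZoneId) pair in df_pattern is unique, the merge keeps one output row per row of df_checked, so the index of violations lines up with the original row positions.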
print(dfn)
   SiteId  ZoneId  pattern
0       6       1      1.0
1       5       1      1.0
2       7       2      1.0
3       4       2      NaN
4       8       2      1.0
5       7       2      1.0
6       5       1      1.0
7       8       1      NaN
8       6       1      1.0
Posted on 2021-02-05 08:59:37
A similar answer, for reference:
df_all = df_checked.merge(df_pattern, how='left', indicator=True)
   SiteId  ZoneId     _merge
0       6       1       both
1       5       1       both
2       7       2       both
3       4       2  left_only
4       8       2       both
5       7       2       both
6       5       1       both
7       8       1  left_only
8       6       1       both
df_checked[df_all._merge == 'left_only']
   SiteId  ZoneId
3       4       2
7       8       1
https://stackoverflow.com/questions/66059658
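The indicator-based check could also be wrapped in a small reusable function. The sketch below is only one possible packaging: the function name find_violations and its keys parameter are hypothetical and not part of either answer. It additionally assumes that duplicate pattern rows should be ignored, and it uses a positional mask so that df_checked may carry any index.

```python
import pandas as pd

def find_violations(checked, pattern, keys=('SiteId', 'ZoneId')):
    """Return the rows of `checked` whose key combination is absent from `pattern`.

    Hypothetical helper for illustration, not a pandas API.
    """
    keys = list(keys)
    # Drop duplicate pattern rows so each row of `checked` matches at most once;
    # validate='many_to_one' makes pandas raise if that assumption is broken.
    unique_pattern = pattern[keys].drop_duplicates()
    merged = checked.merge(unique_pattern, how='left', on=keys,
                           indicator=True, validate='many_to_one')
    # The merged frame has one row per row of `checked`, in the same order,
    # so a positional boolean mask selects the right rows for any index.
    mask = (merged['_merge'] == 'left_only').to_numpy()
    return checked[mask]

df_pattern = pd.DataFrame({'SiteId': [4, 5, 6, 7, 8],
                           'ZoneId': [1, 1, 1, 2, 2]})
df_checked = pd.DataFrame({'SiteId': [6, 5, 7, 4, 8, 7, 5, 8, 6],
                           'ZoneId': [1, 1, 2, 2, 2, 2, 1, 1, 1]})

result = find_violations(df_checked, df_pattern)
print(result)
```

Filtering by position rather than by index label is a deliberate choice here: the merge returns a fresh 0..n-1 index, which matches df_checked only when df_checked also has a default index.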