Python/R: Remove duplicate rows - keep unique author pairs

Stack Overflow user
Asked 2017-11-24 02:15:44
2 answers, 134 views, 0 followers, score 1

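This is a sample extracted from my database. I am working on a visualization of co-authorship, so based on this sample I need to keep only one relationship between any two authors. For example, I have to remove one of Brian Norton - Maria Roo Ons or Maria Roo Ons - Brian Norton so that each relationship stays unique.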

-------------------------------------------------------------------------------------------------
|              article_title                                | author_name     |   coauthor_name |
-------------------------------------------------------------------------------------------------
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Brian Norton
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | Brian Norton
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Brian Norton
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Sarah McCormack | Brian Norton
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Sarah McCormack | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Sarah McCormack | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Sarah McCormack | S. Shynu
-------------------------------------------------------------------------------------------------

The ideal final output looks like this.

-------------------------------------------------------------------------------------------------
|              article_title                                | author_name     |   coauthor_name |
-------------------------------------------------------------------------------------------------
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Sarah McCormack

In this case, I want to keep only one row for each pair of authors. How can I do this in R or Python? Thank you very much for your help.

2 Answers

Stack Overflow user

Answered 2017-11-24 02:30:09

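I assume you have a separate database and are connecting to it from Python.

Possible approaches:

1) You can add a row number based on the article column and then deduplicate. For how to do it in SQL, see this.

Then you can run the query through a Python database connector.

2) You can pull the records into a pandas DataFrame and do the analysis there. pandas is good at processing and manipulating data.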
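
As a rough sketch of approach 1 run from Python, the snippet below uses the standard-library `sqlite3` module with an in-memory table. The table name `coauthors`, the shortened title "Paper A", and the choice to partition on the alphabetically ordered pair are assumptions made for illustration, not details from the answer. It also assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample: each pair appears twice, once in each direction.
rows = [
    ("Paper A", "Brian Norton", "Maria Roo Ons"),
    ("Paper A", "Maria Roo Ons", "Brian Norton"),
    ("Paper A", "Brian Norton", "Max Ammann"),
    ("Paper A", "Max Ammann", "Brian Norton"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coauthors "
    "(article_title TEXT, author_name TEXT, coauthor_name TEXT)"
)
conn.executemany("INSERT INTO coauthors VALUES (?, ?, ?)", rows)

# Number the rows inside each (article, unordered pair) group and keep the first.
# Two-argument MIN/MAX are SQLite scalar functions that order the two names.
query = """
SELECT article_title, author_name, coauthor_name
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY article_title,
                            MIN(author_name, coauthor_name),
                            MAX(author_name, coauthor_name)
               ORDER BY author_name
           ) AS rn
    FROM coauthors
)
WHERE rn = 1
ORDER BY author_name, coauthor_name
"""
unique_pairs = conn.execute(query).fetchall()
for row in unique_pairs:
    print(row)
```

On another database the same idea should carry over, though the two-argument `MIN`/`MAX` would likely need to be replaced by `LEAST`/`GREATEST` or a `CASE` expression.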

Score: 1

Stack Overflow user

Answered 2017-11-24 06:50:18

I assume your data frame looks like the one shown below, since you have not shared what other cases might occur.

article author1 author2
A       a       b
A       b       a
A       a       a
A       b       b

In R, this is how I would get the rows you are looking for. I assume your data frame is df1.

# This will create a new dataframe df2 with only those rows where author1 and author2 are different

df2 <- df1[df1$author1 != df1$author2, ]

The output is similar to the one you provided in the question.

article author1 author2
  A       a       b
  A       b       a

Let me know if this is what you need.
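
Note that in the toy output above both a/b and b/a remain, since the filter only drops rows where the two authors are identical. One possible way to also collapse reversed pairs is sketched here in pandas, reusing the assumed column names `article`, `author1`, and `author2`. The name `df3` and the way the key is built are illustrative choices, not part of the answer.

```python
import pandas as pd

# Same toy data frame as assumed in the answer.
df1 = pd.DataFrame({
    "article": ["A", "A", "A", "A"],
    "author1": ["a", "b", "a", "b"],
    "author2": ["b", "a", "a", "b"],
})

# Step 1: drop self-pairs (the pandas equivalent of the R filter).
df2 = df1[df1["author1"] != df1["author2"]]

# Step 2: build an order-independent key per row, so that
# (A, a, b) and (A, b, a) map to the same tuple.
keys = [
    (article, *sorted((first, second)))
    for article, first, second in zip(df2["article"], df2["author1"], df2["author2"])
]

# Step 3: keep only the first row seen for each key.
is_first = ~pd.Series(keys, index=df2.index).duplicated()
df3 = df2[is_first].reset_index(drop=True)
print(df3)
```

Which of the two directions survives depends only on row order, so sorting the frame beforehand is one way to control which row is kept.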

Score: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/47461443
