Pandas month

Stack Overflow user
Asked 2019-03-29 01:58:04
Answers: 1, Views: 230, Following: 0, Votes: 0

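I want to create a column in a DataFrame whose value is derived from two other columns.

In the example below, two DataFrames are created: df1 and df2.

A third DataFrame, df3, is then created by merging the first two. In df3, the "Dates" column is converted to datetime type.

Then the column "DateMonth" is created by extracting the month from the "Dates" column.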

import pandas as pd

# df1 and df2:
id_sales   = [1, 2, 3, 4, 5, 6]
col_names  = ['Id', 'parrotId', 'Dates']
df1        = pd.DataFrame(columns = col_names)
df1.Id     = id_sales
df1.parrotId = [1, 2, 3, 1, 2, 3]
df1.Dates  = ['2012-12-25', '2012-08-20', '2013-07-23', '2014-01-14', '2016-02-21', '2015-10-31']

col_names2 = ['parrotId', 'months']
df2        = pd.DataFrame(columns = col_names2)
df2.parrotId = [1, 2, 3]
df2.months = [0, ('Feb, Mar, Apr'), 0]

# df3
df3 = pd.merge(df1, df2, on = 'parrotId')
df3.Dates = pd.to_datetime(df3.Dates)
df3['DateMonth'] = df3.Dates.dt.month

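In df3, I need a new column whose value is 1 if the month in the "DateMonth" column appears in the "months" column.

My difficulty is that each value in the "months" column is either zero or a list of months.

How can I achieve this result?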


1 Answer

Stack Overflow user

Accepted answer

Posted on 2019-03-29 03:48:30

Try the following solution:

import pandas as pd
from datetime import datetime

# define function for df.apply
def matched(row):
    if isinstance(row['months'], str):
        # for the case ('Feb, Mar, Apr'): convert each month abbreviation to its
        # number and return True if the month of 'Dates' is among them
        return row['Dates'].month in [datetime.strptime(mon.strip(), '%b').month for mon in row['months'].split(',')]
    else:
        # for numbers: return True if the months match
        return row['Dates'].month == row['months']

# df1 and df2:
id_sales   = [1, 2, 3, 4, 5, 6]
col_names  = ['Id', 'parrotId', 'Dates']
df1        = pd.DataFrame(columns = col_names)
df1.Id     = id_sales
df1.parrotId = [1, 2, 3, 1, 2, 3]
df1.Dates  = ['2012-12-25', '2012-08-20', '2013-07-23', '2014-01-14', '2016-02-21', '2015-10-31']

col_names2 = ['parrotId', 'months']
df2        = pd.DataFrame(columns = col_names2)
df2.parrotId = [1, 2, 3]
df2.months = [12, ('Feb, Mar, Apr'), 0]

df3 = pd.merge(df1, df2, on = 'parrotId')
df3.Dates = pd.to_datetime(df3.Dates)

# use apply to run the function on each row, astype converts boolean to int (0/1) 
df3['DateMonth'] = df3.apply(matched, axis=1).astype(int)
df3

Output:
   Id  parrotId       Dates         months  DateMonth
0   1         1  2012-12-25             12          1
1   4         1  2014-01-14             12          0
2   2         2  2012-08-20  Feb, Mar, Apr          0
3   5         2  2016-02-21  Feb, Mar, Apr          1
4   3         3  2013-07-23              0          0
5   6         3  2015-10-31              0          0
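
Note that `strptime` with `'%b'` parses month abbreviations according to the current locale, so the function above may fail on a machine whose locale is not English. As a hedged alternative, here is a minimal sketch that uses an explicit lookup table instead. The names `MONTH_NUMBERS`, `allowed_months`, and the `Matched` column are hypothetical and not part of the original answer, and the sample frame simply reproduces the `Dates` and `months` columns shown in the output above.

```python
import pandas as pd

# Hypothetical lookup table (an assumption, not from the original answer):
# English month abbreviations mapped to month numbers, independent of locale.
MONTH_NUMBERS = {
    'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
    'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12,
}

def allowed_months(value):
    """Return the set of month numbers encoded by one 'months' cell."""
    if isinstance(value, str):
        # 'Feb, Mar, Apr' becomes {2, 3, 4}
        return {MONTH_NUMBERS[name.strip()] for name in value.split(',')}
    # A plain number such as 12 becomes {12}; 0 becomes {0}, which no real month matches
    return {int(value)}

# Sample data mirroring the 'Dates' and 'months' columns of df3
df3 = pd.DataFrame({
    'Dates': pd.to_datetime(['2012-12-25', '2014-01-14', '2012-08-20',
                             '2016-02-21', '2013-07-23', '2015-10-31']),
    'months': [12, 12, 'Feb, Mar, Apr', 'Feb, Mar, Apr', 0, 0],
})

# Compare the month of each date with the months allowed for that row
date_months = df3['Dates'].dt.month
df3['Matched'] = [
    int(month in allowed_months(cell))
    for month, cell in zip(date_months, df3['months'])
]
print(df3['Matched'].tolist())
```

If the month names come in another language (for example Portuguese abbreviations), the table would need matching keys; an unknown abbreviation raises a `KeyError` here rather than being silently skipped.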
Votes: 1
Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/55409478
