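I have a df like the one shown below, but much larger. Some of the dates in the lastDate column are incorrect, and a date is incorrect only when the fixedDate column (which holds the correct date) has a value in the same row.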
dff = pd.DataFrame(
{"lastDate":['2016-3-27', '2016-4-11', '2016-3-27', '2016-3-27', '2016-5-25', '2016-5-31'],
"fixedDate":['2016-1-3', '', '2016-1-18', '2016-4-5', '2016-2-27', ''],
"analyst":['John Doe', 'Brad', 'John', 'Frank', 'Claud', 'John Doe']
})


The first is what I have; the second is what I want after the loop.
Posted on 2017-08-03 22:21:47
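First, convert these columns to datetime dtype (empty strings become NaT):
for col in ['fixedDate', 'lastDate']:
    df[col] = pd.to_datetime(df[col])
Then you can use a boolean mask to overwrite lastDate wherever fixedDate is not null:
mask = pd.notnull(df['fixedDate'])
df.loc[mask, 'lastDate'] = df['fixedDate']
For example,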
import pandas as pd

df = pd.DataFrame({
    "lastDate": ['2016-3-27', '2016-4-11', '2016-3-27', '2016-3-27', '2016-5-25', '2016-5-31'],
    "fixedDate": ['2016-1-3', '', '2016-1-18', '2016-4-5', '2016-2-27', ''],
    "analyst": ['John Doe', 'Brad', 'John', 'Frank', 'Claud', 'John Doe'],
})

# Parse both date columns; the empty strings in fixedDate become NaT
for col in ['fixedDate', 'lastDate']:
    df[col] = pd.to_datetime(df[col])

# Overwrite lastDate only in rows where fixedDate has a value
mask = pd.notnull(df['fixedDate'])
df.loc[mask, 'lastDate'] = df['fixedDate']
print(df)
yields
    analyst  fixedDate   lastDate
0  John Doe 2016-01-03 2016-01-03
1      Brad        NaT 2016-04-11
2      John 2016-01-18 2016-01-18
3     Frank 2016-04-05 2016-04-05
4     Claud 2016-02-27 2016-02-27
5  John Doe        NaT 2016-05-31

https://stackoverflow.com/questions/45495327
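
As a side note, the same update can probably be written without an explicit mask. The following is a minimal sketch, not part of the original answer, which assumes the same sample data as in the question and uses `Series.fillna` to take fixedDate where it exists and fall back to lastDate otherwise:

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    "lastDate": ['2016-3-27', '2016-4-11', '2016-3-27', '2016-3-27', '2016-5-25', '2016-5-31'],
    "fixedDate": ['2016-1-3', '', '2016-1-18', '2016-4-5', '2016-2-27', ''],
    "analyst": ['John Doe', 'Brad', 'John', 'Frank', 'Claud', 'John Doe'],
})

# Parse both date columns; empty strings in fixedDate become NaT
for col in ['fixedDate', 'lastDate']:
    df[col] = pd.to_datetime(df[col])

# Use fixedDate where it is present; where it is NaT, keep the existing lastDate
df['lastDate'] = df['fixedDate'].fillna(df['lastDate'])

print(df[['analyst', 'fixedDate', 'lastDate']])
```

Both forms rely on index alignment between the two columns, so they should give the same result as the mask-based assignment as long as the columns come from the same DataFrame.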