How do I interpolate in pandas when there is no missing data?

My original data:

My desired output:

Posted on 2020-05-09 15:51:29
You can try reindexing first and then interpolating:
df.reindex(np.linspace(0, len(df)-1, 2*len(df)-1))\
.reset_index(drop=True) \
.apply(lambda x: x.interpolate('linear'))

Explanation:
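There are two main steps: insert a NaN row between every two rows, then interpolate.
1. Create an index that starts at the first df index, ends at the last df index, and has a spacing of 0.5 (half the current df index spacing). A simple way to do this is numpy.linspace: np.linspace(0, len(df)-1, 2*len(df)-1)
2. Reindex with reindex, using the index created in step 1.
3. Reset the index with reset_index.
4. Apply the interpolation to all columns using apply followed by interpolate.

Code and step-by-step output: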
# import module
import pandas as pd
import numpy as np
# create dataframe
df = pd.DataFrame({'a': [4.18483, 4.15276, 4.13156, 4.1169], 'b': [99.90139, 98.91539, 97.9294, 96.9434]})
# Step 1
print(np.linspace(0, len(df)-1, 2*len(df)-1))
# [0. 0.5 1. 1.5 2. 2.5 3. ]
# Step 2
print(df.reindex(np.linspace(0, len(df)-1, 2*len(df)-1)))
#            a         b
# 0.0  4.18483  99.90139
# 0.5      NaN       NaN
# 1.0  4.15276  98.91539
# 1.5      NaN       NaN
# 2.0  4.13156  97.92940
# 2.5      NaN       NaN
# 3.0  4.11690  96.94340
# Step 3
print(df.reindex(np.linspace(0, len(df)-1, 2*len(df)-1))
.reset_index(drop=True))
#          a         b
# 0  4.18483  99.90139
# 1      NaN       NaN
# 2  4.15276  98.91539
# 3      NaN       NaN
# 4  4.13156  97.92940
# 5      NaN       NaN
# 6  4.11690  96.94340
# Step 4
print(df.reindex(np.linspace(0, len(df)-1, 2*len(df)-1))
.reset_index(drop=True)
.apply(lambda x: x.interpolate('linear')))
#           a          b
# 0  4.184830  99.901390
# 1  4.168795  99.408390
# 2  4.152760  98.915390
# 3  4.142160  98.422395
# 4  4.131560  97.929400
# 5  4.124230  97.436400
# 6  4.116900  96.943400

https://stackoverflow.com/questions/61692215
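
As a variation on the steps above, here is a minimal sketch that wraps them in a reusable helper. The function name upsample_rows and its factor parameter are hypothetical and are not part of pandas or of the answer above. It assumes every column is numeric. Instead of a float index built with np.linspace, it spreads the original rows over an integer index, so reindex does not depend on exact float matches when more than one row is inserted per gap. DataFrame.interpolate is called directly, since it already works column by column.

```python
import numpy as np
import pandas as pd


def upsample_rows(df, factor=2):
    """Hypothetical helper: return df with (factor - 1) linearly
    interpolated rows inserted between each pair of original rows.
    Assumes all columns are numeric."""
    n = len(df)
    spaced = df.reset_index(drop=True)
    # Move the original rows to positions 0, factor, 2*factor, ...
    spaced.index = spaced.index * factor
    # Reindex onto every integer position; the new rows are filled with NaN.
    full = spaced.reindex(np.arange((n - 1) * factor + 1))
    # Linear interpolation treats the rows as equally spaced.
    return full.interpolate(method="linear")


df = pd.DataFrame({'a': [4.18483, 4.15276, 4.13156, 4.1169],
                   'b': [99.90139, 98.91539, 97.9294, 96.9434]})

result = upsample_rows(df, factor=2)
print(result)
```

With factor=2 this should reproduce the Step 4 output above; a larger factor inserts more evenly spaced rows between each original pair.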