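I have a pandas DataFrame, shown below, that gives the distances from individual points to three cities: Fargo, Orange and Jersey City. In each column only four rows are meaningful. Rows 0 to 3 of the 'Fargo' column hold the 4 shortest distances from any point to Fargo; the remaining 8 rows of that column are populated only because the same frame also looks up the 4 shortest distances to 'Orange', and so on. To summarize, this is the data:
import pandas as pd

Points = ['Point1','Point4','Point5','Point2','Point2','Point5','Point1','Point4','Point3','Point6','Point4','Point1']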
Fargo = [2.90300755828,3.91961324034,21.9825588597,24.3141420303,24.3141420303,21.9825588597,2.90300755828,3.91961324034,25.3599772676,25.8509998739,3.91961324034,2.90300755828]
Orange = [25.5464458592,27.1527975618,6.17298387907,4.80214941294,4.80214941294,6.17298387907,25.5464458592,27.1527975618,46.4066249652,45.8853687976,27.1527975618,25.5464458592]
Jersey_City = [21.1030418227,19.6763385681,39.3194029761,41.8121131045,41.8121131045,39.3194029761,21.1030418227,19.6763385681,2.09632277264,2.67885042284,19.6763385681,21.1030418227]
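toy_data = pd.DataFrame(index=Points, columns=['Fargo', 'Orange', 'Jersey_City'])
toy_data['Fargo'] = Fargo
toy_data['Orange'] = Orange
toy_data['Jersey_City'] = Jersey_City

Take the Fargo column: its first 4 rows (rows 0 to 3) are the points with the shortest distance to Fargo. Similarly, in the Orange column, rows 4 to 7 are the points with the shortest distance to Orange. At the moment, however, rows 4 to 7 of the Fargo column are also filled in, with the distances from Orange's four nearest points to Fargo. What I want is a single DataFrame that keeps only the 4 points with the shortest distance for each city: rows 0-3 in the Fargo column, rows 4-7 in the Orange column, and rows 8-11 in the Jersey_City column. I want to keep these 4 nearest points per city and drop everything else, as I have done by hand below. This is what I want: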
Fargo = [2.9030075582789885,3.919613240342197,21.982558859743925,24.314142030334484,'NAN','NAN','NAN','NAN','NAN','NAN','NAN','NAN']
Orange = ['NAN','NAN','NAN','NAN',4.802149412942695,6.172983879065276,25.546445859236265,27.15279756182145,'NAN','NAN','NAN','NAN']
Jersey_City = ['NAN','NAN','NAN','NAN','NAN','NAN','NAN','NAN',2.096322772642856,2.67885042283533,19.676338568056806,21.10304182269932]
result_wanted_data =pd.DataFrame(index= Points,columns = ['Fargo','Orange','Jersey_City'])
result_wanted_data['Fargo']=Fargo
result_wanted_data['Orange']=Orange
result_wanted_data['Jersey_City']=Jersey_City

Posted on 2018-11-15 13:56:49
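This may not be exactly what you want, but I think it solves the problem:
import math
import numpy as np
import pandas as pd

# Assumption: `data` is not defined in the original answer; it is taken here
# to be toy_data with the point names moved into column 0.
data = toy_data.reset_index()

# Distance for row i comes from city column 1, 2 or 3 (four rows per city).
newdf = np.empty([12])
for i in range(12):
    newdf[i] = data.iloc[i, math.ceil((i + 1) / 4)]
# City name that each row belongs to.
newdf1 = []
cities = list(data.columns.values[1:])
for i in range(12):
    newdf1.append(cities[math.ceil((i + 1) / 4) - 1])
# Point name of each row.
strs = ["" for x in range(12)]
for i in range(12):
    strs[i] = data.iloc[i, 0]

final_data = pd.DataFrame(columns=['city', 'point', 'distance'])
final_data['city'] = newdf1
final_data['distance'] = newdf
final_data['point'] = strs

Posted on 2018-11-15 14:16:31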
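You can use np.split() and a for loop:
import numpy as np

# Split positions: a boundary every 4 rows, one fewer than the number of columns.
x = 0
split = []
for num in range(len(toy_data.columns) - 1):
    split.append(x + 4)
    x += 4
dfs = np.split(toy_data, split)

# From the i-th chunk keep only the i-th column, then stack the chunks again.
data = []
for i in range(len(dfs)):
    data.append(pd.DataFrame(dfs[i][dfs[i].columns[i]]))
pd.concat(data, sort=False)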
            Fargo     Orange  Jersey_City
Point1   2.903008        NaN          NaN
Point4   3.919613        NaN          NaN
Point5  21.982559        NaN          NaN
Point2  24.314142        NaN          NaN
Point2        NaN   4.802149          NaN
Point5        NaN   6.172984          NaN
Point1        NaN  25.546446          NaN
Point4        NaN  27.152798          NaN
Point3        NaN        NaN     2.096323
Point6        NaN        NaN     2.678850
Point4        NaN        NaN    19.676339
Point1        NaN        NaN    21.103042

Posted on 2018-11-15 14:19:05
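The wide, NaN-padded frame the question asks for can also be produced without splitting and re-concatenating, by masking the original frame. The following is only a sketch under assumptions: `rows_per_city`, `block`, `mask` and `result` are names made up for illustration, and it assumes that every city owns exactly one consecutive block of 4 rows, in the same order as the columns.

```python
import numpy as np
import pandas as pd

points = ['Point1', 'Point4', 'Point5', 'Point2', 'Point2', 'Point5',
          'Point1', 'Point4', 'Point3', 'Point6', 'Point4', 'Point1']
toy_data = pd.DataFrame({
    'Fargo': [2.90300755828, 3.91961324034, 21.9825588597, 24.3141420303,
              24.3141420303, 21.9825588597, 2.90300755828, 3.91961324034,
              25.3599772676, 25.8509998739, 3.91961324034, 2.90300755828],
    'Orange': [25.5464458592, 27.1527975618, 6.17298387907, 4.80214941294,
               4.80214941294, 6.17298387907, 25.5464458592, 27.1527975618,
               46.4066249652, 45.8853687976, 27.1527975618, 25.5464458592],
    'Jersey_City': [21.1030418227, 19.6763385681, 39.3194029761, 41.8121131045,
                    41.8121131045, 39.3194029761, 21.1030418227, 19.6763385681,
                    2.09632277264, 2.67885042284, 19.6763385681, 21.1030418227],
}, index=points)

# Assumption: each city owns one consecutive block of 4 rows, in column order.
rows_per_city = 4

# Block number of every row: 0,0,0,0,1,1,1,1,2,2,2,2.
block = np.arange(len(toy_data)) // rows_per_city

# 12x3 boolean mask, True only where a row's block matches the column position.
mask = block[:, None] == np.arange(toy_data.shape[1])

# Keep the masked entries; everything else becomes NaN.
result = toy_data.where(mask)
print(result)
```

Because the mask is a plain NumPy array, the repeated point labels in the index do not get in the way of the selection.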
You can use the following:
import numpy as np

# Row positions split into 3 consecutive blocks, one block per city column.
intervals = np.array_split(np.arange(toy_data.shape[0]), 3)
df = pd.DataFrame(columns=['Distances'], index=toy_data.reset_index().index)
for i, j in zip(range(toy_data.shape[1]), intervals):
    # Copy block j of column i into the single 'Distances' column.
    df.loc[j, 'Distances'] = toy_data.reset_index(drop=True).iloc[j, i]
print(df)
    Distances
0     2.90301
1     3.91961
2     21.9826
3     24.3141
4     4.80215
5     6.17298
6     25.5464
7     27.1528
8     2.09632
9     2.67885
10    19.6763
11     21.103

https://stackoverflow.com/questions/53320534
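The block-by-block loop can also be written as a single NumPy lookup that picks, for every row, the column of the city owning that row. This is a minimal sketch rather than part of any answer above: `rows_per_city`, `row_pos`, `col_pos` and `nearest` are illustrative names, and it assumes 4 consecutive rows per city in column order. It also keeps the city and point next to each distance, so the result stays readable without the original index.

```python
import numpy as np
import pandas as pd

points = ['Point1', 'Point4', 'Point5', 'Point2', 'Point2', 'Point5',
          'Point1', 'Point4', 'Point3', 'Point6', 'Point4', 'Point1']
toy_data = pd.DataFrame({
    'Fargo': [2.90300755828, 3.91961324034, 21.9825588597, 24.3141420303,
              24.3141420303, 21.9825588597, 2.90300755828, 3.91961324034,
              25.3599772676, 25.8509998739, 3.91961324034, 2.90300755828],
    'Orange': [25.5464458592, 27.1527975618, 6.17298387907, 4.80214941294,
               4.80214941294, 6.17298387907, 25.5464458592, 27.1527975618,
               46.4066249652, 45.8853687976, 27.1527975618, 25.5464458592],
    'Jersey_City': [21.1030418227, 19.6763385681, 39.3194029761, 41.8121131045,
                    41.8121131045, 39.3194029761, 21.1030418227, 19.6763385681,
                    2.09632277264, 2.67885042284, 19.6763385681, 21.1030418227],
}, index=points)

# Assumption: each city owns one consecutive block of 4 rows, in column order.
rows_per_city = 4

row_pos = np.arange(len(toy_data))   # 0..11
col_pos = row_pos // rows_per_city   # 0,0,0,0,1,1,1,1,2,2,2,2

# Pair each row position with its column position to pull one value per row.
nearest = pd.DataFrame({
    'city': toy_data.columns.to_numpy()[col_pos],
    'point': toy_data.index.to_numpy(),
    'distance': toy_data.to_numpy()[row_pos, col_pos],
})
print(nearest)
```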