I have data in a single column with the following structure:
Datetime stamp 1
Obs1
Obs2
Obs3
Datetime stamp 2
Obs1
Obs2
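Obs3
I would like to convert it as shown below, so that each datetime stamp becomes a column header and all observations for that datetime stamp become the rows under it:
Datetime stamp 1    Datetime stamp 2
Obs1                Obs1
Obs2                Obs2
Obs3                Obs3
Posted on 2021-10-11 07:38:10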
Assuming your single column is stored in a list/array, you can build the required sublists like this:
lst = ['Datetime stamp 1', 'Obs1', 'Obs2', 'Obs3', 'Datetime stamp 2', 'Obs1', 'Obs2', 'Obs3']
result = []
temp = [lst[0]]  # current group, starting with the first datetime stamp
for item in lst[1:]:
    if item.startswith('Datetime'):
        # a new datetime stamp closes the previous group
        result.append(temp)
        temp = [item]
    else:
        temp.append(item)
result.append(temp)  # do not forget the last group
print(result)
Output:
[['Datetime stamp 1', 'Obs1', 'Obs2', 'Obs3'], ['Datetime stamp 2', 'Obs1', 'Obs2', 'Obs3']]
It is now a list of lists, where each element can represent one column for you.
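To get from that list of lists to the header-plus-rows layout asked for in the question, the groups can be transposed. The following is a minimal sketch using only the standard library; the names `rows`, `header`, and `body` are illustrative, and padding shorter groups with `None` via `zip_longest` is an assumption about how uneven groups should be handled.

```python
from itertools import zip_longest

# The list of lists produced by the loop above.
result = [
    ['Datetime stamp 1', 'Obs1', 'Obs2', 'Obs3'],
    ['Datetime stamp 2', 'Obs1', 'Obs2', 'Obs3'],
]

# Transpose the groups: the first transposed row holds the datetime stamps
# (the headers), every later row holds one observation per stamp.
# zip_longest pads shorter groups with None instead of truncating.
rows = [list(row) for row in zip_longest(*result, fillvalue=None)]
header, body = rows[0], rows[1:]

print('\t'.join(header))
for row in body:
    # Print padded (missing) cells as empty strings.
    print('\t'.join('' if cell is None else cell for cell in row))
```

If every datetime stamp is guaranteed to have the same number of observations, plain `zip(*result)` would do the same job.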
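Posted on 2021-10-11 07:38:17
Assuming the format is always the same (i.e. every split begins with the string "Datetime"), you can get the indices where the string starts with "Datetime" and select all the data between consecutive splits: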
import pandas as pd
data = pd.Series(["Datetime stamp 1",
                  "Obs1",
                  "Obs2",
                  "Obs3",
                  "Datetime stamp 2",
                  "Obs1",
                  "Obs2",
                  "Obs3"])
# Get splits
idx_split = data.str.startswith("Datetime ")
idx_split = idx_split.index[idx_split]  # [0, 4]
N_COLS = len(idx_split)  # number of columns
vals = [0] * N_COLS  # initialize values
# Loop over each split index and slice the data
for i in range(N_COLS - 1):
    vals[i] = list(data[idx_split[i]:idx_split[i + 1]])
# Get the last one (indexing with -1 also works when there is only one split)
vals[-1] = list(data[idx_split[-1]:])
print(vals)
# [['Datetime stamp 1', 'Obs1', 'Obs2', 'Obs3'],
#  ['Datetime stamp 2', 'Obs1', 'Obs2', 'Obs3']]
# Get the first element from each list and use that as column name
# + remove it
cols = [p.pop(0) for p in vals]
# The data list is in the wrong shape for pandas, use https://stackoverflow.com/questions/6473679/transpose-list-of-lists to transpose the list to the right shape
df = pd.DataFrame(list(map(list, zip(*vals))), columns=cols)
print(df)
#   Datetime stamp 1 Datetime stamp 2
# 0             Obs1             Obs1
# 1             Obs2             Obs2
# 2             Obs3             Obs3
https://stackoverflow.com/questions/69522335