Q: Split a list of strings based on index

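Stack Overflow user

Asked 2020-11-30 21:20:46

Answers 1, Views 45, Followers 0, Votes 0

My data is generated in a not very useful format: first several spaces, then an index number (1-12 in this example), then the actual value associated with that index. I want to split each string into two lists: one containing the indices and the other containing the values. I wrote the code below, which does what I want. However, it seems cumbersome, and for a dataset of a few thousand lines it takes several seconds. Is there a way to speed this up for large datasets?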

data = ['         11.814772E3',
 '         2-1.06152E3',
 '         33.876477E1',
 '         4-2.65704E3',
 '         51.141537E4',
 '         61.378482E4',
 '         71.401565E4',
 '         86.782599E3',
 '         9-1.22921E3',
 '        103.400054E3',
 '        111.558086E3',
 '        121.017818E4']

import math
import numpy as np

values_total = [] #non-space characters of each line
location     = [] #index when id goes to value
ids          = [] #Store ids
values       = [] #Store values

step_array = np.linspace(1, 1000, 1000) #needed to calculate index values (num must be an int)

for i in range(len(data)):

    #Check how many indices have to be removed
    location.append([])
    location[i].append(int(math.log10(step_array[i]))+1)

    #Store the non-space characters of the current line
    values_total.append([])
    for j in range(len(data[i])):
        if data[i][j] != ' ':
            values_total[i].append(data[i][j])

    #Split list based on calculated lengths
    ids.append(values_total[i][:location[i][0]])
    values.append(values_total[i][location[i][0]:])

python

1 Answer

Stack Overflow user

Accepted answer

Answered 2020-11-30 21:30:51

You can try the following code:

indices = []
vals = []
for i, d in enumerate(data, 1):  # enumerate starting from 1, so we know current index
    tmp = d.strip()  # remove whitespace
    split_idx = len(str(i))  # figure out the length of the current index
    indices.append(i)  # current index
    vals.append(float(tmp[split_idx:]))  # everything after current index length
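
The loop above relies on the i-th line carrying index i, so the width of the index is simply `len(str(i))`. As a minimal sketch under that same assumption, the idea can be wrapped in a reusable function that also checks the assumption for each line. The names `split_indexed` and `start` are illustrative and not part of the original answer.

```python
def split_indexed(lines, start=1):
    """Split lines of the form '<spaces><index><value>' into two lists.

    Assumes the lines are in order and the first line carries index
    `start`, so the index width is known without parsing the value.
    """
    indices = []
    values = []
    for i, line in enumerate(lines, start):
        text = line.strip()   # drop the leading spaces
        prefix = str(i)       # the index we expect at the front
        if not text.startswith(prefix):
            raise ValueError(f"line {i} does not start with its index: {line!r}")
        indices.append(i)
        values.append(float(text[len(prefix):]))  # the rest is the value
    return indices, values


sample = ['         11.814772E3', '         2-1.06152E3', '         33.876477E1']
sample_indices, sample_values = split_indexed(sample)
print(sample_indices)
print(sample_values)
```

The `startswith` check is an addition of this sketch: if a line is missing or out of order, it raises an error instead of silently cutting the value at the wrong position.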

Votes 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/65074379
