Q: How to find the start and end indices for all rows in a Python list

Stack Overflow user

Asked 2021-09-05 07:33:52

2 answers, 52 views, 0 followers, 1 vote

My code is:

df=pd.read_csv("file")
l1=[]
l2=[]
for i in range(0,len(df['unions']),len(df['district'])):
    l1.append(' '.join((df['unions'][i], df['district'][i])))
    l2.append(({"entities": [[(ele.start(), ele.end() - 1) for ele in re.finditer(r'\S+', df['unions'][i])] ,df['subdistrict'][i]],}))

TRAIN_DATA=list(zip(l1,l2))
print(TRAIN_DATA)

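Result: [('Dhansagar Bagerhat', {'entities': [[(0, 8)], 'Sarankhola']})]

Expected: [('Dhansagar Bagerhat', {'entities': [[(0, 8)], 'Sarankhola'],[[(10, 17)], 'AnyLabel']})]

How can I get the output for all rows? I only get the result for one row. It looks like my loop is not working. Can anyone point out my mistake?

My CSV file looks like this. "AnyLabel" is another column. I have about 500 rows: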

unions        subdistrict   district 
Dhansagar     Sarankhola    Bagerhat 
Daibagnyahati Morrelganj    Bagerhat 
Ramchandrapur Morrelganj    Bagerhat 
Kodalia       Mollahat      Bagerhat

loops

python

pandas

list

dataframe

2 Answers

Stack Overflow user

Accepted answer

Posted 2021-09-05 07:46:47

Try using str.join

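import re
import pandas as pd

df = pd.read_csv("file")
l1 = []
l2 = []

for idx, row in df.iterrows():
    # Text for this row: the union name followed by the district name.
    l1.append(' '.join((row['unions'], row['district'])))
    # For each token of "unions subdistrict": [start, inclusive end] plus the token itself.
    l2.append({"entities": [[[ele.start(), ele.end() - 1], ele.group(0)] for ele in re.finditer(r'\S+', ' '.join([row['unions'], row['subdistrict']]))]})

TRAIN_DATA = list(zip(l1, l2))
print(TRAIN_DATA)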

Output:

[('Dhansagar Bagerhat', {'entities': [[[0, 8], 'Dhansagar'], [[10, 19], 'Sarankhola']]}), ('Daibagnyahati Bagerhat', {'entities': [[[0, 12], 'Daibagnyahati'], [[14, 23], 'Morrelganj']]}), ('Ramchandrapur Bagerhat', {'entities': [[[0, 12], 'Ramchandrapur'], [[14, 23], 'Morrelganj']]}), ('Kodalia Bagerhat', {'entities': [[[0, 6], 'Kodalia'], [[8, 15], 'Mollahat']]})]
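Note that the code above stores unions + district as the text but computes the offsets on unions + subdistrict. If the goal is closer to the asker's expected output (offsets into the stored text, each paired with a label), a rough sketch could look like the following. It rests on several assumptions: plain dictionaries stand in for the DataFrame rows, 'label' is a hypothetical name for the "AnyLabel" column, and the first token is paired with the subdistrict and the second with that label, which is only one reading of the expected output in the question.

```python
import re

# Stand-in rows; with pandas these could come from df.to_dict('records').
# The 'label' key is a hypothetical name for the asker's "AnyLabel" column.
rows = [
    {'unions': 'Dhansagar', 'subdistrict': 'Sarankhola',
     'district': 'Bagerhat', 'label': 'AnyLabel'},
    {'unions': 'Kodalia', 'subdistrict': 'Mollahat',
     'district': 'Bagerhat', 'label': 'AnyLabel'},
]

TRAIN_DATA = []
for row in rows:
    text = ' '.join((row['unions'], row['district']))
    # Inclusive (start, end) span of each token in the text that is actually stored.
    spans = [(m.start(), m.end() - 1) for m in re.finditer(r'\S+', text)]
    # Assumed pairing: first token -> subdistrict, second token -> label column.
    labels = [row['subdistrict'], row['label']]
    entities = [[span, label] for span, label in zip(spans, labels)]
    TRAIN_DATA.append((text, {'entities': entities}))

print(TRAIN_DATA[0])
# ('Dhansagar Bagerhat', {'entities': [[(0, 8), 'Sarankhola'], [(10, 17), 'AnyLabel']]})
```

This pairing only holds when the union and district names are single words; a multi-word name would produce more spans than labels, and zip would silently drop the extras.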

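Votes: 1

Stack Overflow user

Posted 2021-09-05 07:43:12

Your use of range is wrong. You are telling it to iterate over the numbers from 0 to len(df['unions']), but with a step of len(df['district']), which is the same length. So in effect you are telling it to visit only the first row (index 0). You can see this by printing the row numbers: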

for i in range(0,len(df['unions']),len(df['district'])):
    print(i)
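The same effect can be checked without the CSV file. This is a minimal sketch that uses a hypothetical row count of 4 in place of the real DataFrame lengths:

```python
# Hypothetical row count standing in for len(df['unions']) and len(df['district']),
# which are equal because both columns come from the same DataFrame.
n_rows = 4

# Step equal to the stop value: only the first index is produced.
stepped = list(range(0, n_rows, n_rows))
print(stepped)  # [0]

# Default step of 1: every row index is produced.
every_row = list(range(n_rows))
print(every_row)  # [0, 1, 2, 3]
```

With about 500 rows the result is the same: range(0, 500, 500) contains only 0.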

Also, you shouldn't iterate over the rows like that; use df.iterrows() instead:

import re
import pandas as pd

df = pd.read_csv("file")
l1 = []
l2 = []

for i, row in df.iterrows():
    l1.append(' '.join((row['unions'], row['district'])))
    # (start, inclusive end) for each token of "unions subdistrict".
    l2.append({"entities": [[(ele.start(), ele.end() - 1) for ele in re.finditer(r'\S+', ' '.join([row['unions'], row['subdistrict']]))]]})
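The span computation inside the loop can be tried on a single string first. Below is a small sketch; token_spans is a hypothetical helper name, not something from the question or from pandas. It returns inclusive end indices, matching the ele.end() - 1 used above.

```python
import re

def token_spans(text):
    """Return (start, inclusive_end) for each whitespace-delimited token."""
    return [(m.start(), m.end() - 1) for m in re.finditer(r'\S+', text)]

text = 'Dhansagar Sarankhola'
spans = token_spans(text)
print(spans)  # [(0, 8), (10, 19)]

# Because the end index is inclusive, slicing needs end + 1 to recover the token.
start, end = spans[1]
print(text[start:end + 1])  # Sarankhola
```

If the consumer of TRAIN_DATA expects exclusive end offsets instead, dropping the - 1 would be the change to make; which convention applies depends on that consumer and is not stated in the question.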

Votes: 1

The original page content is provided by Stack Overflow. Translation support is provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/69061409
