
Python: looping through nested sequences of varying length

Stack Overflow user
Asked on 2019-09-03 11:28:25
Answers: 1 · Views: 203 · Followers: 0 · Votes: 1

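I am trying to write a simple program that assigns codes to courses by looking them up against a list of keywords.

So far I can handle a keyword list in which every row contains exactly 2 keyword patterns: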

import pandas as pd

# The list of keywords, with each row's length fixed at 2
keyword = pd.DataFrame({
        'code':['001','002','003'], 
        'keyword': [
                ['edu|teach','primary sch|secondary sch|junior sch|preliminary sch'],  # length = 2
                ['elderly|disabled|special','care'],        # length = 2
                ['digital|social media','marketing']]       # length = 2
            })

# The list of educational programmes to which codes are to be assigned
course = pd.DataFrame({
        'course': 
            ['certificate in digital marketing',
             'certificate in elderly care',
             'diploma in primary school education',
             'bachelor in traditional chinese medicine',
             'master of law']
            })

# Generate the shortlist of coded courses

courseresult = pd.DataFrame()
for i in range(0,len(keyword['keyword'])):
    # .copy() avoids a SettingWithCopyWarning when the new column is added
    courseshortlist = course[
            (course.course.str.contains(keyword['keyword'][i][0]) & course.course.str.contains(keyword['keyword'][i][1])) 
           ].copy()
    courseshortlist['autocode'] = keyword['code'][i]
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
    courseresult = pd.concat([courseresult, courseshortlist])

However, I am not sure how to write the loop when the keyword lists vary in length:

keyword_variable = pd.DataFrame({
        'code':['001','002','003','004','005'], 
        'keyword': [
                ['law'],                                # length = 1
                ['edu|teach','primary sch|secondary sch|junior sch|preliminary sch'], # length = 2
                ['elderly|disabled|special','care'],  # length = 2
                ['digital|social media','marketing'], # length = 2
                ['traditional','chinese','medicine']  # length = 3
                ] 
            })

Update: I have managed to get what I want with some ugly, clumsy try/except code:

courseresult = pd.DataFrame()
for i in range(0,len(keyword_variable['keyword'])):
    try: 
        condition0 = course.course.str.contains(keyword_variable['keyword'][i][0])
        condition1 = course.course.str.contains(keyword_variable['keyword'][i][1])
        condition2 = course.course.str.contains(keyword_variable['keyword'][i][2])
        condition = condition0 & condition1 & condition2
    except IndexError: 
        try: 
            condition0 = course.course.str.contains(keyword_variable['keyword'][i][0])
            condition1 = course.course.str.contains(keyword_variable['keyword'][i][1])
            condition = condition0 & condition1 
        except IndexError: 
            condition = course.course.str.contains(keyword_variable['keyword'][i][0])
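    # .copy() avoids a SettingWithCopyWarning when the new column is added
    courseshortlist = course[condition].copy()
    courseshortlist['autocode'] = keyword_variable['code'][i]
    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
    courseresult = pd.concat([courseresult, courseshortlist])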

courseresult
Out[1]: 
                                     course autocode
4                             master of law      001
2       diploma in primary school education      002
1               certificate in elderly care      003
0          certificate in digital marketing      004
3  bachelor in traditional chinese medicine      005

But I am sure there must be a better way to do this. Many thanks!


1 Answer

Stack Overflow user

Accepted answer

Posted on 2019-09-03 12:31:37

Assuming you don't really need the results in a separate DataFrame:

for i in range(0,len(keyword_variable['keyword'])):
    # Start from an all-True mask aligned with course's index
    condition = pd.Series(True, index=course.index)
    # AND in one condition per keyword, however many the row has
    for k in keyword_variable['keyword'][i]:
        condition = condition & course.course.str.contains(k)
    course.loc[condition, 'autocode'] = keyword_variable['code'][i]

print(course)
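
The inner loop is a fold over the keyword conditions, so it can also be written as a reduction. The following is only a sketch of that idea, not part of the original answer: the helper name `all_keywords_mask` is hypothetical, and the small `course` frame is a cut-down stand-in for the one in the question.

```python
from functools import reduce
import operator

import pandas as pd

# Cut-down sample data (assumption: same shape as the question's `course`)
course = pd.DataFrame({
    'course': ['certificate in digital marketing',
               'certificate in elderly care',
               'master of law']
})

def all_keywords_mask(texts, patterns):
    """Return a boolean Series that is True where every pattern matches.

    Hypothetical helper: each pattern is treated as a regular expression,
    which is what Series.str.contains does by default.
    """
    masks = (texts.str.contains(p) for p in patterns)
    # The all-True starting value means an empty pattern list matches every row
    start = pd.Series(True, index=texts.index)
    return reduce(operator.and_, masks, start)

mask = all_keywords_mask(course['course'], ['digital|social media', 'marketing'])
print(course[mask])
```

Because the reduction starts from an all-True Series, it works for keyword lists of any length, including 1 and 3, without special cases.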

If you do need a new copy, just make a copy first; the rest of the solution is the same.
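
A minimal sketch of that copy-first variant, using the data from the question. The names `coded` and `courseresult` and the final `dropna` step are my own choices for illustration rather than something the answer specifies.

```python
import pandas as pd

keyword_variable = pd.DataFrame({
    'code': ['001', '002', '003', '004', '005'],
    'keyword': [
        ['law'],
        ['edu|teach', 'primary sch|secondary sch|junior sch|preliminary sch'],
        ['elderly|disabled|special', 'care'],
        ['digital|social media', 'marketing'],
        ['traditional', 'chinese', 'medicine'],
    ]
})

course = pd.DataFrame({
    'course': ['certificate in digital marketing',
               'certificate in elderly care',
               'diploma in primary school education',
               'bachelor in traditional chinese medicine',
               'master of law']
})

# Work on a copy so the original `course` frame is left untouched
coded = course.copy()
coded['autocode'] = None

for code, patterns in zip(keyword_variable['code'], keyword_variable['keyword']):
    # Start from an all-True mask and AND in one condition per pattern
    condition = pd.Series(True, index=coded.index)
    for pattern in patterns:
        condition &= coded['course'].str.contains(pattern)
    coded.loc[condition, 'autocode'] = code

# Keep only the courses that received a code
courseresult = coded.dropna(subset=['autocode'])
print(courseresult)
```

One difference from the concatenation approach in the question: if a course matches more than one keyword row, this version keeps a single row carrying the last matching code, whereas concatenating shortlists would produce one row per match.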

Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/57764690
