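I have a column in a DataFrame that looks like the one below. It comes from a survey question that lets you select multiple answers. What I want is a new column that indicates whether a person is a native English speaker.
A person is a native English speaker when english is not followed by (fluently) or any other qualifier.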
35874 english (fluently), chinese (fluently), spanis...
40792 english
39405 english (fluently)
51413 english
49929 english (fluently), french (fluently), german ...
7147 english (fluently), japanese (poorly), thai (o...
17019 english
12321 english (okay), french (fluently), german (poo...
20974 english, sign language (fluently), spanish (po...
6134 english (fluently)
43291 english (fluently), german (fluently), french ...
7023 english (fluently), french (poorly), japanese ...
47637 english, french, hawaiian, japanese, spanish
56354 english, spanish (okay), other (fluently), fre...
5094 english (fluently), lisp (okay), japanese (poo...
14654 english
37842 english (fluently)
11962 english, chinese
37021 english
30360 english (fluently)
43865 english, spanish
29744 english (fluently), italian (fluently), spanis...
37088 english (fluently), dutch (okay), spanish (okay)
52986 english
59871 english
28376 english
3973 english (fluently), spanish (poorly)
46417 english (fluently), spanish (fluently), french...
7986 english (fluently), ancient greek (poorly)
9919 english, spanish (okay)
I tried regular expressions and split(',', expand=True), but I am still having trouble.
Posted on 2020-10-17 10:05:04
Use str.split(',') and check whether the resulting list contains 'english':
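# Cast to str so that missing values become the string 'nan' instead of breaking the check below
df2['speaks'] = df2['speaks'].astype(str)
# A bare 'english' entry (no qualifier in parentheses) marks a native speaker; strip() drops the space left after each comma
df2['English Native?'] = df2['speaks'].str.split(',').apply(lambda x: 'Native' if 'english' in [s.strip() for s in x] else 'Not Native')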
df2
Out[1]:
0 speaks English Native?
0 35874 english (fluently), chinese (fluently), spanis... Not Native
1 40792 english Native
2 39405 english (fluently) Not Native
3 51413 english Native
4 49929 english (fluently), french (fluently), german ... Not Native
5 7147 english (fluently), japanese (poorly), thai (o... Not Native
6 17019 english Native
7 12321 english (okay), french (fluently), german (poo... Not Native
8 20974 english, sign language (fluently), spanish (po... Native
9 6134 english (fluently) Not Native
10 43291 english (fluently), german (fluently), french ... Not Native
11 7023 english (fluently), french (poorly), japanese ... Not Native
12 47637 english, french, hawaiian, japanese, spanish Native
13 56354 english, spanish (okay), other (fluently), fre... Native
14 5094 english (fluently), lisp (okay), japanese (poo... Not Native
15 14654 english Native
16 37842 english (fluently) Not Native
17 11962 english, chinese Native
18 37021 english Native
19 30360 english (fluently) Not Native
20 43865 english, spanish Native
21 29744 english (fluently), italian (fluently), spanis... Not Native
22 37088 english (fluently), dutch (okay), spanish (okay) Not Native
23 52986 english Native
24 59871 english Native
25 28376 english Native
26 3973 english (fluently), spanish (poorly) Not Native
27 46417 english (fluently), spanish (fluently), french... Not Native
28 7986 english (fluently), ancient greek (poorly) Not Native
29 9919 english, spanish (okay) Native
https://stackoverflow.com/questions/64398336
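Since the question mentions trying regular expressions, here is a rough sketch of how the same rule could be written as one vectorized `str.contains` call. The sample rows are made up for illustration (they are not taken from the original data), and the names `native_regex` and `native_split` are hypothetical. The pattern assumes that a bare `english` entry is either at the start of the string or right after a comma, and is followed by a comma or the end of the string.

```python
import pandas as pd

# Hypothetical sample rows shaped like the 'speaks' column in the question.
df2 = pd.DataFrame({
    "speaks": [
        "english",
        "english (fluently)",
        "english, spanish (okay)",
        "spanish (fluently), english",
        "french (fluently), english (okay)",
        None,  # a respondent who left the question blank
    ]
})

# 'english' must be a whole entry: preceded by the start of the string or a
# comma, and followed by a comma or the end of the string (no qualifier).
pattern = r"(?:^|,\s*)english\s*(?:,|$)"
is_native = df2["speaks"].str.contains(pattern, na=False)
df2["native_regex"] = is_native.map({True: "Native", False: "Not Native"})


def has_bare_english(value):
    """Split-based check, used here only to cross-check the regex."""
    if not isinstance(value, str):
        return False
    return "english" in [item.strip() for item in value.split(",")]


df2["native_split"] = (
    df2["speaks"].apply(has_bare_english).map({True: "Native", False: "Not Native"})
)

print(df2)
```

Both columns should agree on every row. The regex version avoids building a Python list per row, and `na=False` treats missing answers as not native, which is a choice you may want to change.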