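I have a list of Turkish word groups. I want to apply stemming, and I found the turkishnlp package. It has some shortcomings, but it usually returns the correct stem. However, when I apply it to the list, I don't want the structure of the list to change, and I want words it doesn't recognize to stay as they are.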
For example, I have this list: mylist = [['yolda', 'gelirken', 'kopek', 'gördüm'], ['cok', 'tatlıydı']]
I wrote this function:

from trnlp import TrnlpWord

def tr_stemming(x):
    obj = TrnlpWord()
    obj.setword(x) if isinstance(x, str) else type(x)(map(tr_stemming, x))
    return obj.get_stem if isinstance(x, str) else type(x)(map(tr_stemming, x))

This function returns the following list:
tr_stemming(mylist)
[['yol', 'gelir', '', 'gör'], ['', 'tatlı']]

However, I want to get this output: [['yol', 'gelir', 'kopek', 'gör'], ['cok', 'tatlı']]
How can I update my function? Thanks for your help!
Posted on 2022-03-28 11:26:14

IIUC, you can modify your function to:
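
from trnlp import TrnlpWord

def tr_stemming(x):
    if isinstance(x, str):
        obj = TrnlpWord()
        obj.setword(x)
        stem = obj.get_stem
        # Keep the original word when no stem is found (empty result).
        return stem if stem else x
    elif isinstance(x, list):
        # Recurse into sublists so the nesting is preserved.
        return [tr_stemming(e) for e in x]
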
out = tr_stemming(mylist)

Output:

[['yol', 'gelir', 'kopek', 'gör'], ['cok', 'tatlı']]

https://stackoverflow.com/questions/71646581
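
The same idea can be factored so that the stemmer is passed in as a parameter, which makes the fallback logic easy to try without trnlp installed. This is only a minimal sketch and not part of the original answer: `map_nested`, `toy_stem` and `TOY_STEMS` are hypothetical names, and the lookup table merely imitates a stemmer that returns an empty string for unknown words. Unlike the function above, it also preserves tuples and passes other values through unchanged.

```python
from typing import Any, Callable


def map_nested(stem: Callable[[str], str], x: Any) -> Any:
    """Apply `stem` to every string in nested lists/tuples, keeping the shape."""
    if isinstance(x, str):
        result = stem(x)
        # Fall back to the original word when the stemmer returns nothing.
        return result if result else x
    if isinstance(x, (list, tuple)):
        # Rebuild the same container type around the processed elements.
        return type(x)(map_nested(stem, e) for e in x)
    # Anything else (numbers, None, ...) is returned unchanged.
    return x


# Hypothetical stand-in for a real stemmer: unknown words yield ''.
TOY_STEMS = {
    "yolda": "yol",
    "gelirken": "gelir",
    "gördüm": "gör",
    "tatlıydı": "tatlı",
}


def toy_stem(word: str) -> str:
    return TOY_STEMS.get(word, "")


mylist = [["yolda", "gelirken", "kopek", "gördüm"], ["cok", "tatlıydı"]]
print(map_nested(toy_stem, mylist))
# [['yol', 'gelir', 'kopek', 'gör'], ['cok', 'tatlı']]
```

To use this with trnlp, `toy_stem` would be swapped for a small wrapper around the `setword` / `get_stem` calls shown in the answer.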