
Parsing references in a specific format from papers with Python

Stack Overflow user
Asked 2022-09-14 09:11:56
Answers: 1 · Views: 48 · Followers: 0 · Votes: 1

I have a list of texts like this:

inp = """Something at the beginning

References
1. Ryff, C.D. (2014) Psychological Well-Being Revisited: Advances in the Science and Practice of Eudaimonia. 
2. Deci, E.L. & Ryan, R.M. (2002) Self-determination research: reflections and future directions. 
3. Acedo, F. J., & Casillas, J. C. (2005). Current paradigms in the international management field.
    
Other References
1. Tarelli, E. (2003), “How to transfer responsibilities from expatriates to local nationals”.
2. Riusala, K. and Suutari, V. (2004), “International knowledge transfers through expatriates”.
3. Wallace, J. (2001), “The benefits of mentoring for female lawyers”.

Something at the end
12. Wallace, J. (2001), “The benefits of mentoring for female lawyers”.
Something else at the end"""

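The "Other References" section appears in some of the texts; in others, the next section starts with "Good References". Similar strings can also appear anywhere else in the text. The reference strings are sometimes separated by '\n' and sometimes only by spaces. In addition, '\n' can occur anywhere in the text, even in the middle of a reference string.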

I need a regex to use with re.findall that returns all the strings after "References" as a list of strings like this:

['Ryff, C.D. (2014) Psychological Well-Being Revisited: Advances in the Science and Practice of Eudaimonia.', 'Deci, E.L. & Ryan, R.M. (2002) Self-determination research: reflections and future directions.', 'Acedo, F. J., & Casillas, J. C. (2005). Current paradigms in the international management field.']

But only the entries that follow "References", not anything that appears before or after that section.

I was advised to use this regex:

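import re

# Step 1: capture the numbered block that follows a line-initial "References" heading.
refs = re.findall(r'^References\s+((?:\d+\.\s*.*?\n)+)', inp, flags=re.M|re.S)
data = ''.join(refs)
# Step 2: strip the leading "N." from each newline-terminated entry.
output = re.findall(r'\d+\.\s*(.*?)\n', data)
print(output)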

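However, it works only when the reference strings are separated by '\n', which is not the case in some of the texts. A '\n' can also occur anywhere in the text. I do not need these '\n' characters at all, so they can simply be removed from the text.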
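Since the newlines carry no meaning here, one way to remove them is to collapse every run of whitespace into a single space before matching. The following is only a minimal sketch; the helper name `collapse_whitespace` is hypothetical and not part of the original code.

```python
import re

def collapse_whitespace(text):
    # Replace every run of whitespace (spaces, tabs, newlines) with one space.
    return re.sub(r"\s+", " ", text).strip()

print(collapse_whitespace("References\n1. Ryff, C.D.\n(2014) Psychological\n  Well-Being"))
```

Note that after this step a pattern can no longer rely on `^` or `\n` to find line starts, so any heading has to be located either beforehand or by its text alone.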

An example where the suggested regex does not work:

inp = """Something at the beginning

References 1. Ryff, C.D. (2014) Psychological Well-Being Revisited: Advances in the Science and Practice of Eudaimonia. Additional Fields. 2. Deci, E.L. & Ryan, R.M. (2002) Self-determination research: reflections and future directions. 3. Acedo, F. J., & Casillas, J. C. (2005). Current paradigms in the international management field. Other References 1. Tarelli, E. (2003), “How to transfer responsibilities from expatriates to local nationals”.
2. Riusala, K. and Suutari, V. (2004), “International knowledge transfers through expatriates”.
3. Wallace, J. (2001), “The benefits of mentoring for female lawyers”.

Something at the end
12. Wallace, J. (2001), “The benefits of mentoring for female lawyers”.
Something else at the end"""
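To make the failure concrete, the suggested two-step regex can be run on a much shorter input with the same shape (entries separated by spaces, with the next section starting on the same line). The function name `suggested_parse` and the input `compact` are made up for this illustration.

```python
import re

def suggested_parse(text):
    # Same two regexes as the suggested approach, wrapped in a function.
    refs = re.findall(r'^References\s+((?:\d+\.\s*.*?\n)+)', text, flags=re.M | re.S)
    data = ''.join(refs)
    return re.findall(r'\d+\.\s*(.*?)\n', data)

# Hypothetical compact input: two entries and the next heading share one line.
compact = "References 1. A. 2. B. Other References 1. C.\n2. D.\n"
print(suggested_parse(compact))
# ['A. 2. B. Other References 1. C.', 'D.']
```

Because each entry is assumed to end at a '\n', everything up to the first newline is returned as a single item, and entries from the following section leak into the result.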

Can anyone suggest code that would help me get the list of references?


1 Answer

Stack Overflow user

Answered 2022-09-14 09:28:01

I think this code can solve your problem:

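import re

# The lookbehind anchors the capture right after "References" plus one whitespace character.
refs = re.findall(r'(?<=References\s)((?:\d+\.\s*.*?\n)+)[^.\s]*', inp, flags=re.M|re.S)
data = ''.join(refs)
output = re.findall(r'\d+\.\s*(.*?)\n', data)
print(output)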
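For inputs where the entries are separated only by spaces, a different approach may be more robust: locate the heading, collapse all whitespace, cut the section at the next known heading, and then split on the numbering. The following is a minimal sketch of that idea, not the method from the answer above. The name `extract_references` and the `stop_headings` parameter are hypothetical. It assumes that the "References" heading starts a line (as in both sample inputs) and that the section ends at "Other References" or "Good References".

```python
import re

def extract_references(text, stop_headings=("Other References", "Good References")):
    """Return the numbered entries that follow a line-initial 'References' heading."""
    # Assumption: the heading of interest begins a line.
    heading = re.search(r"^References\b", text, flags=re.M)
    if heading is None:
        return []
    # Newlines carry no meaning here, so collapse all whitespace runs to one space.
    body = " ".join(text[heading.end():].split())
    # Assumption: the section ends at the first of the known follow-up headings.
    stop = re.search("|".join(re.escape(h) for h in stop_headings), body)
    if stop:
        body = body[:stop.start()]
    # Accept only markers that continue the sequence 1., 2., 3., ...
    marks, pos, n = [], 0, 1
    while True:
        m = re.compile(rf"(?:^|\s){n}\.\s+").search(body, pos)
        if m is None:
            break
        marks.append((m.start(), m.end()))
        pos, n = m.end(), n + 1
    # Each entry runs from the end of its marker to the start of the next one.
    refs = []
    for i, (_, text_start) in enumerate(marks):
        text_end = marks[i + 1][0] if i + 1 < len(marks) else len(body)
        refs.append(body[text_start:text_end].strip())
    return refs

sample = (
    "Intro\n\nReferences 1. Ryff, C.D. (2014) Psychological\nWell-Being Revisited. "
    "2. Deci, E.L. & Ryan, R.M. (2002) Self-determination research. "
    "Other References 1. Tarelli, E. (2003), How to transfer responsibilities.\n"
)
print(extract_references(sample))
```

Requiring the markers to count up from 1 keeps stray numbers such as years or a later "12." from being treated as entry boundaries. If none of the stop headings occurs in the text, the last entry will run to the end of the input, so the list of headings likely needs tuning for real data.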
Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/73714319
