Q: Using regex to grab all text (including newlines)

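Stack Overflow user

Asked 2022-05-18 03:02:47

Answers 3 · Views 34 · Followers 0 · Votes 1

I'm trying to figure out how to grab all of the text after [text](URL), but because of the newlines (\n\n) I'm having trouble including everything that follows. I'm currently trying variations of (?<=.\)\n\n)(.*\n+), but it only captures the next paragraph.

Here is what the text looks like: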

---
layout: post
title: "13 - First Principles of AGI Safety with Richard Ngo"
date: 2022-03-30 22:15 -0700
categories: episode
---

[Google Podcasts link](https://podcasts.google.com/feed/aHR0cHM6Ly9heHJwb2RjYXN0LmxpYnN5bi5jb20vcnNz/episode/OTlmYzM1ZjEtMDFkMi00ZTExLWExYjEtNTYwOTg2ZWNhOWNi)

How should we think about artificial general intelligence (AGI), and the risks it might pose? What constraints exist on technical solutions to the problem of aligning superhuman AI systems with human intentions? In this episode, I talk to Richard Ngo about his report analyzing AGI safety from first principles, and recent conversations he had with Eliezer Yudkowsky about the difficulty of AI alignment.

Topics we discuss:
- [The nature of intelligence and AGI](#agi-intelligence-nature)
  - [The nature of intelligence](#nature-of-intelligence)
  - [AGI: what and how](#agi-what-how)
  - [Single vs collective AI minds](#single-collective-ai-minds)
- [AGI in practice](#agi-in-practice)
  - [Impact](#agi-impact)
  - [Timing](#agi-timing)
  - [Creation](#agi-creation)
  - [Risks and benefits](#agi-risks-benefits)
- [Making AGI safe](#making-agi-safe)
  - [Robustness of the agency abstraction](#agency-abstraction-robustness)
  - [Pivotal acts](#pivotal-acts)
- [AGI safety concepts](#agi-safety-concepts)
  - [Alignment](#ai-alignment)
  - [Transparency](#transparency)
  - [Cooperation](#cooperation)
- [Optima and selection pressures](#optima-selection-pressures)
- [The AI alignment research community](#ai-alignment-research-community)
  - [Updates from Yudkowsky conversation](#yudkonversation-updates)
  - [Corrections to the community](#community-corrections)
  - [Why others don't join](#why-others-dont-join)
- [Richard Ngo as a researcher](#ngo-as-researcher)
- [The world approaching AGI](#world-approaching-agi)
- [Following Richard's work](#following-richards-work)

**Daniel Filan:**
Hello, everybody. Today, I'll be speaking with Richard Ngo. Richard is a researcher at OpenAI, where he works on AI governance and forecasting. He also was a research engineer at DeepMind, and designed the course ["AGI Safety Fundamentals"](https://www.eacambridge.org/agi-safety-fundamentals). We'll be discussing his report, [AGI Safety from First Principles](https://www.alignmentforum.org/s/mzgtmmTKKn5MuCzFJ), as well as his [debate with Eliezer Yudkowsky](https://www.alignmentforum.org/s/n945eovrA3oDueqtq) about the difficulty of AI alignment. For links to what we're discussing, you can check the description of this episode, and you can read the transcripts at [axrp.net](https://axrp.net/). Well, Richard, welcome to the show.

**Richard Ngo:**
Thanks so much for having me.

Thanks for your help!

python-re

regex

3 Answers

Stack Overflow user

Accepted answer

Posted 2022-05-18 03:22:37

Assuming you can read the entire text into a string variable, you can use re.search here:

import re
# Match the first [label](http...) link, skip the whitespace after it, capture the rest.
s = re.search(r'\[.*?\]\(https?://.*?\)\s+(.*)', text, flags=re.S)
print(s.group(1))

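This prints the text you seem to want, starting with:

How should we think about artificial general intelligence (AGI), and the risks it might pose? What constraints exist on technical solutions to the problem of aligning superhuman AI systems with human intentions? In this episode, I talk to Richard Ngo about his report analyzing AGI safety from first principles, and recent conversations he had with Eliezer Yudkowsky about the difficulty of AI alignment.

Note that we run this regex search in dot-all mode (re.S), so that .* can match across newlines.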
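To make the effect of the flag concrete, here is a minimal sketch that runs the same pattern with and without re.S. The `text` string is a shortened stand-in for the post in the question (assumed sample data with a placeholder URL), not the real file.

```python
import re

# Shortened stand-in for the post in the question (assumed sample data).
text = (
    "---\n"
    "layout: post\n"
    "---\n"
    "\n"
    "[Google Podcasts link](https://example.com/feed)\n"
    "\n"
    "First paragraph.\n"
    "\n"
    "Second paragraph.\n"
)

pattern = r'\[.*?\]\(https?://.*?\)\s+(.*)'

# Without re.S, "." stops at the first newline, so only one line is captured.
one_line = re.search(pattern, text).group(1)

# With re.S (DOTALL), "." also matches "\n", so the capture runs to the end.
everything = re.search(pattern, text, flags=re.S).group(1)

print(repr(one_line))
print(repr(everything))
```

The first result stops at the end of "First paragraph.", which is the same symptom described in the question; the second contains both paragraphs.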

Votes 1

Stack Overflow user

Posted 2022-05-18 03:28:26

I decided to go with the following approach:

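import re
# Find where the "[Google Podcasts link](...)" line and its blank line end, then keep what follows.
end = re.search(r'\[(Google Podcasts link)\]\((.+)\)\n\n', text).end()
text = text[end:]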

So I just search for the text after which I want my content to start, and then use .end() to slice the string at that position.
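One thing to watch with this approach is that re.search returns None when the link is missing, so calling .end() on the result raises an AttributeError. Below is a slightly more defensive sketch of the same idea. The helper name `text_after_link`, the `label` parameter, and the sample string are all hypothetical, and the URL part is matched with `[^)\n]+` instead of `(.+)`, which is a small variation on the pattern above.

```python
import re

def text_after_link(text, label="Google Podcasts link"):
    """Return everything after the first [label](url) line, or None if absent."""
    # re.escape guards against regex metacharacters in the label.
    pattern = r'\[' + re.escape(label) + r'\]\([^)\n]+\)\n+'
    match = re.search(pattern, text)
    if match is None:
        return None
    # Slice from the end of the match, as in the answer above.
    return text[match.end():]

# Assumed sample data with a placeholder URL.
sample = (
    "intro\n\n"
    "[Google Podcasts link](https://example.com/feed)\n\n"
    "Body line 1.\n\n"
    "Body line 2.\n"
)

body = text_after_link(sample)
missing = text_after_link("no link here")
print(repr(body))
print(missing)
```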

Votes 0

Stack Overflow user

Posted 2022-05-18 08:41:19

Since there appears to be only one occurrence, you can also split on a pattern:

(?m)^\s*\[[^][]*]\(https?://[^\s()]*\)\s*

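Explanation

(?m) enables multiline mode
^ matches the start of a line (because of multiline mode)
\s* matches optional whitespace characters
\[[^][]*] matches from [ to ]
\(https?://[^\s()]*\) matches the URL between parentheses
\s* matches trailing whitespace characters

See the match in a regex101 demo.

Example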

result = re.split(r"(?m)^\s*\[[^][]*]\(https?://[^\s()]*\)\s*", text)
print(result[1])

Or, more specifically:

result = re.split(r"(?m)^\s*\[Google Podcasts link]\(https?://[^\s()]*\)\s*", text)
print(result[1])

See a Python demo.
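If the body could contain another line that starts with a Markdown link, a plain split would cut the text again at that line and result[1] would lose everything after it. A possible safeguard, sketched here under that assumption, is to pass maxsplit=1 so that only the first link line is used as the separator. The `sample` string and its URLs are made up for illustration.

```python
import re

# Assumed sample data: front matter, the link line, then a body that
# itself contains another link at the start of a line.
sample = (
    "---\ntitle: demo\n---\n\n"
    "[Google Podcasts link](https://example.com/feed)\n\n"
    "Body paragraph.\n\n"
    "[Another link](https://example.com/other)\n"
)

link_line = re.compile(r"(?m)^\s*\[[^][]*]\(https?://[^\s()]*\)\s*")

# maxsplit=1 splits only at the first link line and keeps the rest intact.
parts = link_line.split(sample, maxsplit=1)
front_matter, body = parts

# For comparison: without maxsplit, the second link line splits the text again.
unlimited = link_line.split(sample)

print(repr(body))
print(len(unlimited))
```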

Votes 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/72282756
