Q: Cannot add values to a list

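Stack Overflow user

Asked 2020-11-27 23:37:37

Answers: 1 | Views: 59 | Followers: 0 | Votes: 1

I'm trying to store the email addresses I scrape from a set of links, and I've run into two problems. First, for some reason the variable email ends up holding two identical items. Second, the if statement detects that email has a value, but the value is not stored in the emails list. Thanks for your help!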

import re
import time

import requests

emails = []
comment = []

with open('comment.txt', 'r') as filehandle:
    for line in filehandle:
        currentPlace = line[:-1]
        comment.append(currentPlace)

print(emails)

i = 0
while i < len(comment) :
    url = str(comment[i]) + '/about'

    print("Crawling URL %s" % url)
    response = requests.get(url)

    email = re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", response.text, re.I)

    print(email)

    if email:
        emails.append(email)

    email.clear()

    i += 1
    time.sleep(0.2)

print(emails)

Output:

[]
Crawling URL ...
['email@gmail.com', 'email@gmail.com']
Crawling URL ...
[]
Crawling URL ...
['email@yahoo.com', 'email@yahoo.com']
Crawling URL ...
[]
Crawling URL ...
[]
[[], []]

My old code produced the correct output:

emails = set()
print("Crawling URL %s" % starting_url)

response = requests.get(starting_url)

new_emails = set(re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", response.text, re.I))
emails.update(new_emails)
print(emails)
# create a BeautifulSoup object for the HTML document
soup = BeautifulSoup(response.text, 'lxml')

Tags: python, list, web-scraping

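1 Answer

Stack Overflow user

Answered 2020-11-27 23:43:19

re.findall (https://docs.python.org/3/library/re.html#re.findall) returns a list of all matches of the regular expression. So your email regexp is finding 2 matches on each of those pages.

You then call emails.append(email). But email is itself a list of email addresses, so your emails list would end up looking like [["1@1.com","1@1.com"], ["2@2.com","2@2.com"], ... ].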
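To see the re.findall behaviour in isolation, here is a minimal sketch using the pattern from the question. The html string is made up for illustration; it assumes the address appears once in a mailto: link and once as visible text, which is one plausible reason for the duplicates but is not confirmed by the question.

```python
import re

# Same pattern as in the question, compiled once for reuse.
EMAIL_RE = re.compile(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", re.I)

# Hypothetical page fragment (an assumption, not taken from the real pages):
# the address occurs in the href attribute and again in the link text.
html = '<a href="mailto:email@gmail.com">email@gmail.com</a>'

# findall returns every non-overlapping match, duplicates included.
matches = EMAIL_RE.findall(html)
print(matches)

# Converting to a set collapses the repeats, as the older code did.
unique = set(matches)
print(unique)
```

If the duplicates are unwanted, wrapping the result in set(), as the older code in the question does, is probably the simplest way to drop them.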
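The output in the question shows [[], []] rather than nested lists of addresses. A likely explanation is that append stores a reference to the email list, not a copy, so the later email.clear() empties the very list that was just appended. The sketch below first reproduces that effect and then shows one possible fix. collect_emails and the sample pages are hypothetical names and data introduced for illustration, and the function works on page text that has already been downloaded, so no network access is needed.

```python
import re

EMAIL_RE = re.compile(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", re.I)

# 1) Reproducing the problem: append keeps a reference, clear empties it.
found = ["email@gmail.com", "email@gmail.com"]
nested = []
nested.append(found)  # nested[0] is the same object as found
found.clear()         # so this also empties nested[0]
print(nested)


# 2) One possible fix: add the individual addresses, skipping repeats.
def collect_emails(pages):
    """Return unique addresses from an iterable of page texts, in first-seen order."""
    emails = []
    seen = set()
    for text in pages:
        for address in EMAIL_RE.findall(text):
            if address not in seen:
                seen.add(address)
                emails.append(address)
    return emails


# Hypothetical page texts standing in for response.text of each URL.
pages = [
    '<a href="mailto:email@gmail.com">email@gmail.com</a>',
    "no address on this page",
    "Contact: email@yahoo.com or email@yahoo.com",
]
result = collect_emails(pages)
print(result)
```

In the loop from the question, the smallest change along these lines would be to replace emails.append(email) with emails.extend(email) and drop the email.clear() call, or to go back to a set with emails.update(email) as in the older code.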

Votes: 2

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/65039973
