Q: List prints on one line, can't introduce newlines

Stack Overflow user

Asked 2021-03-25 02:47:49


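For a project, I'm writing a web scraper that collects statistics about certain monsters from a game's website. The problem is that when I append the data to a list, it gets printed as one very long line. I have already tried .append(clean_data.getText().replace('\n', "\n")). Note that if I don't use .getText(), I get a lot of td and tr tags in the list, which gets very messy. I think the problem is that the text I'm getting is treated as plain text, so when I replace \n with \n it is replaced literally with \n, as if the \n were not being recognized.

My code: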

import requests
import pandas as pd
from bs4 import BeautifulSoup
import csv

url = 'https://guildstats.eu/monsters?world=Yonabra'

page = requests.get(url)

soup = BeautifulSoup(page.content, 'html.parser')

monsters = ('adult goannas', 'young goannas', 'manticores', 'feral sphinxes', 'ogre ruffians', 'ogre rowdies', 'ogre sages', 'dogs')
finding_td = soup.find_all('td', string=monsters)
list_of_monsters = []
for looking_for_parent in finding_td:
    parent_tr = looking_for_parent.find_parents('tr')
    for clean_data in parent_tr:

        list_of_monsters.append(clean_data.getText().replace('\n', " "))

print(list_of_monsters)

It gives the following output:

[' 7 adult goannas  2020-05-28 1519 0 736893 133 ', ' 222 dogs  2020-05-27 143 0 40043 0 ', ' 298 feral sphinxes  2020-05-28 1158 1 480598 152 ', ' 498 manticores  2020-05-28 961 1 299491 68 ', ' 581 ogre rowdies  2020-05-28 306 0 188324 13 ', ' 582 ogre ruffians  2020-05-29 217 0 121964 7 ', ' 583 ogre sages  2020-05-28 156 0 63489 8 ', ' 911 young goannas  2020-05-28 1880 0 972217 74 ']

I would like it to look more like this:

[' 7 adult goannas  2020-05-28 1519 0 736893 133 '
' 222 dogs  2020-05-27 143 0 40043 0 '
' 298 feral sphinxes  2020-05-28 1158 1 480598 152 '
' 498 manticores  2020-05-28 961 1 299491 68 '
' 581 ogre rowdies  2020-05-28 306 0 188324 13 '
' 582 ogre ruffians  2020-05-29 217 0 121964 7 '
' 583 ogre sages  2020-05-28 156 0 63489 8 '
' 911 young goannas  2020-05-28 1880 0 972217 74 ']

list

beautifulsoup

scrape

1 Answer

Stack Overflow user

Answered 2021-03-25 03:44:08

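What you want is to change the separator used when the array is printed: a newline instead of ",". As @QHarr mentioned, you can use Python's pprint to print the result in a more readable format.

Try: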

import requests
from bs4 import BeautifulSoup
from pprint import pprint

url = 'https://guildstats.eu/monsters?world=Yonabra'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')

# Names of the monsters whose table rows should be kept
monsters = ('adult goannas', 'young goannas', 'manticores', 'feral sphinxes', 'ogre ruffians', 'ogre rowdies', 'ogre sages', 'dogs')

# Find every <td> whose text is exactly one of the monster names
finding_td = soup.find_all('td', string=monsters)

list_of_monsters = []
for looking_for_parent in finding_td:
    # Walk up to the enclosing <tr> so the whole table row is captured
    parent_tr = looking_for_parent.find_parents('tr')
    for clean_data in parent_tr:
        # Flatten the row's text onto one line by turning newlines into spaces
        list_of_monsters.append(clean_data.getText().replace("\n", " "))

# pprint puts each element on its own line when the list is too wide for one line
pprint(list_of_monsters)

This gives:

[' 7 adult goannas  2020-05-28 1519 0 736893 133 ',
 ' 222 dogs  2020-05-27 143 0 40043 0 ',
 ' 298 feral sphinxes  2020-05-28 1158 1 480598 152 ',
 ' 498 manticores  2020-05-28 961 1 299491 68 ',
 ' 581 ogre rowdies  2020-05-28 306 0 188324 13 ',
 ' 582 ogre ruffians  2020-05-29 217 0 121964 7 ',
 ' 583 ogre sages  2020-05-28 156 0 63489 8 ',
 ' 911 young goannas  2020-05-28 1880 0 972217 74 ']

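The \n characters you are getting are already newline characters; there is no need to add extra escaping in Python. As you have already found, replace("\n", " ") gives you the replacement you want. Also, because you are printing an array, each element is shown in its repr form, so even if an element ended with a newline it would still be displayed as \n. pprint does not modify the original array; it only prints it in a more readable layout.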
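To make the points above concrete, here is a minimal sketch that needs no network access. The sample rows are copied from the output shown earlier; the variable names (rows, list_repr, one_per_line, formatted) are only illustrative. It shows why a newline inside a list element is displayed as \n, and two alternatives to pprint that print one row per line without the brackets, quotes, and commas.

```python
from pprint import pformat

# Sample rows copied from the output above (assumed representative).
rows = [
    ' 7 adult goannas  2020-05-28 1519 0 736893 133 ',
    ' 222 dogs  2020-05-27 143 0 40043 0 ',
    ' 298 feral sphinxes  2020-05-28 1158 1 480598 152 ',
]

# Converting a list to text uses the repr of each element, so a real
# newline inside a string is shown as a backslash followed by n.
list_repr = str(['dogs\n'])
print(list_repr)

# Alternative 1: join the stripped rows with a real newline and
# print the resulting string instead of the list.
one_per_line = "\n".join(row.strip() for row in rows)
print(one_per_line)

# Alternative 2: unpack the list and let print() separate the items.
print(*rows, sep="\n")

# pformat returns the text that pprint would print. With the default
# width of 80, a list that does not fit on one line is split so that
# each element gets its own line. The list itself is not modified.
formatted = pformat(rows)
print(formatted)
```

Joining with "\n" is probably the better fit when the goal is plain text output (for example, writing to a file), while pprint is convenient for quickly inspecting the list during debugging.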


Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/66792348
