Beautiful Soup: not grabbing the correct information

Stack Overflow user

Asked on 2015-12-10 23:52:49

Answers 1 · Views 61 · Followers 0 · Votes 0

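I am using Beautiful Soup to scrape the bolded flower names and their corresponding image links from: http://www.all-my-favourite-flower-names.com/list-of-flower-names.html

I want this to work not only for the flowers that start with "A", but also for all the other flowers (the ones starting with "B", "C", "D", and so on).

I was able to cobble something together for some of the "A" flowers.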

for flower in soup.find_all('b'):  # Finds flower names and appends them to the flowers list
    flower = flower.string
    if flower is not None and flower[0] == "A":
        flowers.append(flower.strip('.()'))

for link in soup.find_all('img'):  # Finds 'src' in each <img> tag and appends it to the links list
    # Note: str.strip removes any of the characters h, t, p, s, ':' and '/' from both ends, not the literal prefix
    links.append(link['src'].strip('https://'))

for straggler in soup.find_all('a'):  # Finds the only flower name that doesn't follow the pattern of the other names and inserts it into the flowers list
    floss = straggler.string
    if floss is not None and floss == "Ageratum houstonianum.":
        flowers.insert(3, floss)

The obvious problem is that this will break as soon as anything on the page changes. Can someone help me out?

python

django

beautifulsoup

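1 Answer

Stack Overflow user

Accepted answer

Posted on 2015-12-11 01:28:12

The problem seems to be that the flowers are paginated. Something like this should help you loop through the different pages. The code is untested.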

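import urllib2  # Python 2 only; on Python 3 use urllib.request.urlopen instead
from bs4 import BeautifulSoup

# Map each page label to the suffix used in that page's URL
test = {'A': '', 'B': '-B', 'XYZ': '-X-Y-Z'}
flower_list = []
for key, value in test.items():
    url = 'http://www.all-my-favourite-flower-names.com/list-of-flower-names{0}.html'.format(value)
    page = urllib2.urlopen(url).read()
    soup = BeautifulSoup(page, 'html.parser')
    # Now do your logic for every page, and probably save the flower names in a list.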
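
As a rough illustration of what the per-page logic might look like, here is a minimal sketch that assumes Python 3 and the bs4 package. It runs on a small made-up HTML string rather than the real site, so SAMPLE_HTML, the example.com image URLs, and the helper names extract_flowers and remove_scheme are all hypothetical. It also assumes that every name sits inside a <b> tag and that the odd one out is an <a> nested inside a <b>, which may not match the actual page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page layout may differ.
SAMPLE_HTML = """
<html><body>
  <p><b>Abelia.</b> <img src="https://example.com/images/abelia.jpg"></p>
  <p><b>(Bergenia)</b> <img src="https://example.com/images/bergenia.jpg"></p>
  <p><b><a href="/ageratum.html">Ageratum houstonianum.</a></b></p>
  <p><b></b></p>
</body></html>
"""


def remove_scheme(url):
    """Drop a leading 'https://' or 'http://' prefix if present.

    str.strip('https://') would instead remove any of those characters
    from both ends of the string, which can eat part of the host name.
    """
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            return url[len(prefix):]
    return url


def extract_flowers(html):
    """Return (names, image_links) found in one page of HTML."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for bold in soup.find_all("b"):
        # get_text() also picks up a name nested in an <a> inside the <b>,
        # where .string can come back as None.
        text = bold.get_text(strip=True).strip(".()")
        if text:  # skip empty <b> tags
            names.append(text)
    # Only consider <img> tags that actually carry a src attribute.
    links = [remove_scheme(img["src"]) for img in soup.find_all("img", src=True)]
    return names, links


names, links = extract_flowers(SAMPLE_HTML)
print(names)
print(links)
```

Because nothing here filters on the first letter, the same function could be called on the HTML of each page in the loop, with the results appended to flower_list. Whether the names and images line up one to one on the real pages is something to check before pairing them.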

Votes 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/34213904
