Q: Conditional statement syntax

Stack Overflow user

Asked 2018-06-08 05:21:02

Answers: 2 | Views: 91 | Followers: 0 | Score: -1

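I'm writing a web crawler at the moment and my Python is rusty as hell, so I was wondering whether there is a shorter syntax for accomplishing the following: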

def parse(self, response):
    prc_path = '//span[@class="result-meta"]/span[@class="result-price"]/text()'
    sqf_path = '//span[@class="result-meta"]/span[@class="housing"]/text()'
    loc_path = '//span[@class="result-meta"]/span[@class="result-hood"]/text()'
    prc_resp = response.xpath(prc_path).extract_first()
    sqf_resp = response.xpath(sqf_path).extract_first()
    loc_resp = response.xpath(loc_path).extract_first()
    if sqf_resp and loc_resp:
        yield {
            'prc': response.xpath(prc_path).extract_first(),
            'sqf': response.xpath(sqf_path).extract_first(),
            'loc': response.xpath(loc_path).extract_first()
        }
    elif sqf_resp:
        yield {
            'prc': response.xpath(prc_path).extract_first(),
            'sqf': response.xpath(sqf_path).extract_first()
        }
    else:
        yield {
            'prc': response.xpath(prc_path).extract_first(),
            'loc': response.xpath(loc_path).extract_first()
        }

As you can see, there is quite a bit of repetition, and I'd like to keep the code as DRY as possible.

python

if-statement

scrapy

yield

2 Answers

Stack Overflow user

Accepted answer

Answered 2018-06-08 05:33:54

You can create the dictionary first and then add the appropriate entries to it.

# Test the extracted values (prc_resp, sqf_resp, loc_resp from the question),
# not the XPath strings, which are always non-empty and therefore always truthy.
result = {'prc': prc_resp}
if sqf_resp:
    result['sqf'] = sqf_resp
if loc_resp:
    result['loc'] = loc_resp
yield result
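
The same pattern can be tried outside Scrapy. The following is only a minimal sketch, assuming the three values have already been extracted; the `build_item` name and the sample strings are hypothetical and not part of the original answer.

```python
def build_item(prc, sqf=None, loc=None):
    """Always keep the price; add the optional fields only when they are present."""
    item = {'prc': prc}
    if sqf:  # None and '' are both skipped
        item['sqf'] = sqf
    if loc:
        item['loc'] = loc
    return item


# Hypothetical sample values standing in for extract_first() results.
with_sqf = build_item('$1200', sqf='2br - 900ft2')
with_loc = build_item('$950', loc='(downtown)')
print(with_sqf)
print(with_loc)
```

Inside `parse`, the equivalent would be `yield build_item(prc_resp, sqf_resp, loc_resp)`.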

You can also factor out the repeated response.xpath(...).extract_first() call with a dictionary comprehension.

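paths = {'prc': prc_path, 'sqf': sqf_path, 'loc': loc_path}
# Extract every field once, then keep only the fields that actually matched.
# Note: this also drops 'prc' if the price itself is missing.
found = {key: response.xpath(path).extract_first()
         for (key, path) in paths.items()}
yield {key: value for (key, value) in found.items() if value}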

In versions of Python before 2.7, which lack dictionary comprehensions, this would be:

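paths = {'prc': prc_path, 'sqf': sqf_path, 'loc': loc_path}
# Same logic, using dict() over generator expressions.
found = dict((key, response.xpath(path).extract_first())
             for (key, path) in paths.items())
yield dict((key, value) for (key, value) in found.items() if value)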

Score: 1

Stack Overflow user

Answered 2018-06-08 05:37:38

I would use a lookup map:

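def parse(self, response):
    # initialize your prc_path/sqf_path/loc_path here
    lookup_map = {"prc": prc_path, "sqf": sqf_path, "loc": loc_path}  # add as many as needed
    # Extract first, then filter on the extracted value; the paths themselves are always truthy.
    found = {k: response.xpath(v).extract_first() for k, v in lookup_map.items()}
    return {k: v for k, v in found.items() if v}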
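
A lookup-driven extraction like this can be exercised without running a crawl. The sketch below is an illustration under stated assumptions: `FakeResponse` and `FakeSelectorList` are hypothetical stand-ins that mimic only the `xpath(...).extract_first()` calls used in this thread, not the real Scrapy classes, and the short XPath strings and sample values are made up. The filter tests the extracted value rather than the path.

```python
class FakeSelectorList:
    """Hypothetical stand-in for the object returned by response.xpath()."""

    def __init__(self, values):
        self._values = values

    def extract_first(self):
        # Mirror the behaviour relied on in the thread: first match or None.
        return self._values[0] if self._values else None


class FakeResponse:
    """Hypothetical stand-in for a response; maps XPath strings to match lists."""

    def __init__(self, matches):
        self._matches = matches

    def xpath(self, path):
        return FakeSelectorList(self._matches.get(path, []))


def extract_fields(response, lookup_map):
    """Extract every field once and keep only those with a non-empty value."""
    found = {key: response.xpath(path).extract_first()
             for key, path in lookup_map.items()}
    return {key: value for key, value in found.items() if value}


# Made-up paths and data: the listing has a price and a location but no size.
LOOKUP_MAP = {'prc': '//price', 'sqf': '//housing', 'loc': '//hood'}
response = FakeResponse({'//price': ['$1200'], '//hood': ['(downtown)']})
item = extract_fields(response, LOOKUP_MAP)
print(item)
```

Adding a new field then only requires a new entry in the lookup map; the extraction and filtering code stays unchanged.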

Score: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/50750332
