I want to extract the JSON from HTML source code using a regular expression.
This is my source code:
<div class="record-img"></div><span>沪公网安备 31010502000392号</span></a></div></div></div><script id="__NEXT_DATA__" type="application/json">{"dataManager":"[]","props":{"pageProps":{"error":"no_error","query":{"catId":"-11"},"list":[{"mallName":"双星八特澜蔓专卖店","merchantType":4,"goodsId":246071838673,"goodsSign":"JoD7R1E7y","goodsName":"双星男鞋夏季透气2021新款男士网面韩版百搭春季运动休闲跑步鞋子","goodsDesc":"双星男鞋夏季透气2021新款男士网面韩版百搭春季运动休闲跑步鞋子","goodsImageUrl":"https://t00img.yangkeduo.com/goods/images/2021-05-21/81cc46223c6e76e075a292acd1da3514.jpeg","goodsThumbnailUrl":"https://t00img.yangkeduo.com/goods/i............{"catId":"-11"},"buildId":"8L-Nn12bTfxvzblo_QyVL","dynamicBuildId":false,"runtimeConfig":{"youhuiHost":"//youhui.pinduoduo.com","loginHost":"//api.yangkeduo.com","goodsHost":"//api.yangkeduo.com","isDev":false}}</script><script async="" id="__NEXT_PAGE__/search/landing" src="/_next/static/8L-Nn12bTfxvzblo_QyVL/pages/search/landing.js"></script>
I want the JSON that starts with the word dataManager. How do I write a regular expression for it? I am using Python.
Posted on 2021-06-22 01:05:53
So, from your earlier question, I assume you have r.text, from which you want to retrieve the JSON part. You can do this in multiple ways; read why regex is not one of them.
This is how I would do it:
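from lxml import html
import json
# r is assumed to be the response object from the earlier question
tree = html.fromstring(r.text)
# Select the text content of the <script id="__NEXT_DATA__"> element
json_as_str = tree.xpath('//script[@id="__NEXT_DATA__"]/text()')[0]
# Parse the JSON string into a Python dict
json_as_dict = json.loads(json_as_str)
https://stackoverflow.com/questions/68076233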
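For anyone who cannot install lxml, the same idea (parse the HTML, then hand the script text to json.loads) can be sketched with only the standard library. This is not the method from the answer above but a swapped-in alternative using html.parser. The names ScriptJSONExtractor, extract_next_data and sample_html are hypothetical, and the sample page is a shortened stand-in for the real response, so treat it as a rough sketch under those assumptions.

```python
import json
from html.parser import HTMLParser


class ScriptJSONExtractor(HTMLParser):
    """Collect the raw text inside the <script> tag that has a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self._inside = False  # True while we are inside the target <script>
        self.chunks = []      # text pieces reported by the parser

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == self.target_id:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_next_data(page_html):
    """Return the parsed JSON from <script id="__NEXT_DATA__">."""
    parser = ScriptJSONExtractor("__NEXT_DATA__")
    parser.feed(page_html)
    parser.close()
    return json.loads("".join(parser.chunks))


# Hypothetical, shortened page; in the question this would be r.text.
sample_html = (
    '<div><span>footer</span></div>'
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"dataManager":"[]","props":{"pageProps":{"error":"no_error",'
    '"list":[{"mallName":"Example Store","goodsId":246071838673}]}}}'
    '</script>'
    '<script async="" src="/_next/static/landing.js"></script>'
)

data = extract_next_data(sample_html)
first_item = data["props"]["pageProps"]["list"][0]
print(first_item["goodsId"])
```

Once the dict is available, the goods entries sit under data["props"]["pageProps"]["list"], matching the structure shown in the question.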