
How do I search the HTML of a website with Python 3.6?

Stack Overflow user
Asked on 2019-02-12 01:54:20
2 answers, 233 views, 0 followers, score 3

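I have a lot of gift codes and need to build a checker that tells me whether each gift works or not. It would do this by searching the page's HTML for certain words; the text I'm looking for is "Gift Code Invalid".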

When I try to read the HTML with urllib or requests, only a small part of the HTML is loaded. I'm a beginner, so I may be doing something wrong.

My code is:

import requests
link = "https://discord.gift/o2uzOR7YE3CoBpGq"
r = requests.get(link)
print(r.text)

The output is:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <meta content="width=device-width, initial-scale=1.0, maximum-scale=1, user-scalable=no" name="viewport" />

    <!-- section:seometa -->
    <meta property="og:type" content="website" />
    <meta property="og:site_name" content="Discord" />
    <meta property="og:title" content="Discord - Free voice and text chat for gamers" />
    <meta
      property="og:description"
      content="Step up your game with a modern voice & text chat app. Crystal clear voice, multiple server and channel support, mobile apps, and more. Get your free server now!"
    /><meta property="og:image" content="https://discordapp.com/assets/ee7c382d9257652a88c8f7b7f22a994d.png" />    <meta name="twitter:card" content="summary_large_image" />
    <meta name="twitter:site" content="@discordapp" />
    <meta name="twitter:creator" content="@discordapp" />
    <!-- endsection -->

    <link
      rel="chrome-webstore-item"
      href="https://chrome.google.com/webstore/detail/lcbhdgefieegnkbopmgklhlpjjdgmbog"
    />
<link rel="stylesheet" href="/assets/0.830216ebaf585f92a484.css" integrity="sha256-qzZED1N67NuVMyWOdvhIGhtLtKnOXSg+F3HcanmdW4Q= sha512-D0iS5hrftKNpXWnvjpfujnvlabUq6K5gsHbsdvctRMtQXzdf2jvZ/JwaRHAPSb9Z5Xb2o8SBeXeMTajvtrkeRw=="><link rel="icon" href="/assets/07dca80a102d4149e9736d4b162cff6f.ico" />    <!-- section:title -->
    <title>Discord</title>
    <!-- endsection -->
  </head>

  <body>
    <div id="app-mount"></div><script nonce="NjksMjM0LDU4LDI4LDkxLDUxLDYzLDE3Mg==">window.__OVERLAY__ = /overlay/.test(location.pathname)</script><script nonce="NjksMjM0LDU4LDI4LDkxLDUxLDYzLDE3Mg==">window.GLOBAL_ENV = {
      API_ENDPOINT: '//discordapp.com/api',
      WEBAPP_ENDPOINT: '//discordapp.com',
      CDN_HOST: 'cdn.discordapp.com',
      ASSET_ENDPOINT: 'https://discordapp.com',
      WIDGET_ENDPOINT: '//discordapp.com/widget',
      INVITE_HOST: 'discord.gg',
      GIFT_CODE_HOST: 'discord.gift',
      MARKETING_ENDPOINT: '//discordapp.com',
      NETWORKING_ENDPOINT: '//router.discordapp.net',
      RELEASE_CHANNEL: 'stable',
      BRAINTREE_KEY: 'production_5st77rrc_49pp2rp4phym7387',
      STRIPE_KEY: 'pk_live_CUQtlpQUF0vufWpnpUmQvcdi',
    };</script><script nonce="NjksMjM0LDU4LDI4LDkxLDUxLDYzLDE3Mg==">!function(){if(null!=window.WebSocket){var n=function(n){try{var e=localStorage.getItem(n);return null==e?null:JSON.parse(e)}catch(n){return null}},e=n("token"),o=n("gatewayURL");if(e&&o){var r=null!=window.DiscordNative||null!=window.require?"etf":"json",t=o+"/?encoding="+r+"&v=6";void 0!==window.Uint8Array&&(t+="&compress=zlib-stream"),console.log("[FAST CONNECT] "+t+", encoding: "+r+", version: 6");var a=new WebSocket(t);a.binaryType="arraybuffer";var i=Date.now(),s={open:!1,gateway:t,messages:[]};a.onopen=function(){console.log("[FAST CONNECT] connected in "+(Date.now()-i)+"ms"),s.open=!0},a.onclose=a.onerror=function(){window._ws=null},a.onmessage=function(n){s.messages.push(n)},window._ws={ws:a,state:s}}}}();</script><script src="/assets/294f56f239ff22f62fc1.js" integrity="sha256-wTRQJKoqMfG3makS9dDuuegpcHSdaGmfoEBQUPXMdDM= sha512-OVrPyjx2akoJ6QS8OZ+9blz/ADtDHruxw4gwLsjfDVUgolO1ZtcgWbOo0Zj9JBNyzAjKOSCfoFoN9lnkF0EYCw=="></script><script src="/assets/eaa48b00154d2e7ac545.js" integrity="sha256-FRTrm1gL5gkDUoKwVuL9hrrmllKXQsZg7r5zy0Xo4bo= sha512-QZ4c5JQKE5rLJf1uGLQaHHL4NpkAigt4TtluicuMZDYDE5fiL7wkaD2CMBxr0xhOO5aNfSFCxcaqBkU/xOEggQ=="></script><script src="/assets/c73d229b094bb39f0686.js" integrity="sha256-thaBLLvK6Up+B8O7zIOF9Uv8IF+gwGuOW+WUe26l/vk= sha512-5ez2fLO3oMI1UPZDif1Szfjwz04ftTNfhWWSqM81hNhuVN7kckAAZR5a1SuQG8rgsqXwN1is53uAL5M2rz/FOg=="></script>  </body>
</html>

In the first screenshot you can see that the site's HTML contains the text "Gift Code Invalid", but that string is not in the Python output.

https://ctrlv.cz/kKd3


2 Answers

Stack Overflow user

Accepted answer

Posted on 2019-02-12 03:10:52

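The site sends an AJAX request in the background to check whether the gift code is valid. The server replies with a JSON response indicating whether the code is valid, and JavaScript then fills that data into the page.

The simplest way to get the result you want is to replicate that AJAX request and read the message. You can do this without Selenium, requests-html, or any other JavaScript rendering mechanism, and still get the output you want: whether the gift is valid.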

import requests
gift_code='o2uzOR7YE3CoBpGq' #gift code here
link = f"https://discordapp.com/api/v6/entitlements/gift-codes/{gift_code}?with_application=true&with_subscription_plan=true"
r = requests.get(link)
print(r.json()['message'])

Output:

Unknown Gift Code
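
Since the question mentions many gift links, the single request above could be wrapped in a small loop. The following is only a sketch under several assumptions: the endpoint is the one quoted in this answer and may have changed since 2019; treating HTTP 200 as "valid" is a guess, because the thread only shows the response for an unknown code; and `code_from_link`, `describe`, and `check_links` are hypothetical helper names, not part of requests or of Discord's API.

```python
from urllib.parse import urlparse

# Endpoint copied from the answer above; it may have changed since 2019.
API_TEMPLATE = (
    "https://discordapp.com/api/v6/entitlements/gift-codes/{code}"
    "?with_application=true&with_subscription_plan=true"
)


def code_from_link(link):
    """Return the gift code, taken as the last path segment of the link."""
    return urlparse(link).path.strip("/").split("/")[-1]


def describe(status_code, payload):
    """Turn one API response into a short verdict (hypothetical helper)."""
    if status_code == 200:
        # Assumption: a 200 response means the code exists.
        return "valid"
    if status_code == 429:
        return "rate limited, retry later"
    if isinstance(payload, dict) and "message" in payload:
        # Error bodies carry a 'message' field, e.g. 'Unknown Gift Code'.
        return "rejected: " + str(payload["message"])
    return "unexpected HTTP status " + str(status_code)


def check_links(links, get=None):
    """Query the API once per link and return {gift_code: verdict}."""
    if get is None:
        # Imported lazily so the helpers above work without requests installed.
        import requests
        get = requests.get
    results = {}
    for link in links:
        code = code_from_link(link)
        response = get(API_TEMPLATE.format(code=code), timeout=10)
        try:
            payload = response.json()
        except ValueError:  # the body was not JSON
            payload = {}
        results[code] = describe(response.status_code, payload)
    return results
```

Calling `check_links(["https://discord.gift/o2uzOR7YE3CoBpGq"])` would then send one request per link. The `get` parameter exists only so the network call can be swapped for a stub when trying the logic offline.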
Score: 1

Stack Overflow user

Posted on 2019-02-12 02:04:33

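The "Gift Code Invalid" text you are looking for is probably rendered by JavaScript. requests does not execute JavaScript, which is why you can't find it in the response.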
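
One way to see this is to strip the scripts from the static HTML and look at what text is left. The sketch below uses only the standard library; `VisibleText`, `visible_text`, and the trimmed `STATIC_HTML` sample (a shortened stand-in for the output in the question) are illustrative names and data, not part of requests.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes that sit outside <script>, <style>, and <title>."""

    SKIPPED = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text a reader would see if no JavaScript ever ran."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Trimmed-down stand-in for the response printed in the question.
STATIC_HTML = """<!DOCTYPE html>
<html>
  <head><title>Discord</title></head>
  <body>
    <div id="app-mount"></div>
    <script>window.GLOBAL_ENV = {GIFT_CODE_HOST: 'discord.gift'};</script>
  </body>
</html>"""

print(repr(visible_text(STATIC_HTML)))       # ''
print("Gift Code Invalid" in STATIC_HTML)    # False
```

The body holds only an empty `app-mount` div plus scripts, so there is no text to search until a browser (or a renderer) runs those scripts and fills the div.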

If you are using Python 3.6, try requests-html, which can render the page including its JavaScript output.

Updated example:

from requests_html import HTMLSession

link = 'https://discord.gift/o2uzOR7YE3CoBpGq'
targetString = "Gift Code Invalid"
session = HTMLSession()
r = session.get(link)
print("Before render is call: ", targetString in r.html.text)
# sleep pauses after the initial render so the page's JavaScript has time to produce the final text
r.html.render(wait=2, sleep=1)
print("After render is call: ", targetString in r.html.text)

Output:

Before render is call:  False
After render is call:  True
Process finished with exit code 0

See the library's documentation for other methods, such as finding elements or converting the response to an lxml object after rendering: https://html.python-requests.org/

Score: 4
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/54636399
