I don't know what my problem is. This is what shows up in the terminal, and all I get is a CSV with nothing in it.
$ python3 test1.py
LIST -->
Scraping
Traceback (most recent call last):
  File "test1.py", line 162, in <module>
    search_bing(i)
  File "test1.py", line 131, in search_bing
    driver.get("https://duckduckgo.com/?q=linkedin+" + n + "&t=hb&ia=web")
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 264, in get
    self.execute(Command.GET, {'url': url})
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in execute
    self.error_handler.check_response(response)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Runtime.executionContextCreated has invalid 'context': {"auxData": {"frameId":"40864.1","isDefault":true},"id":1,"name":"","origin":"://"}
  (Session info: chrome=58.0.3029.81)
  (Driver info: chromedriver=2.9.248307,platform=Mac OS X 10.12.4 x86_64)

Below is the full script. You can ignore the group-code part: the HTML comes from the site I'm scraping, so I took it out for this post.
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
from selenium import webdriver
import time
import csv
c = csv.writer(open("linkedin-group-results.csv", "w"))
c.writerow(["Member","Profile"])
driver = webdriver.Chrome(executable_path=r'/usr/local/bin/chromedriver')
your_groups_code = """
#enter group code here
"""
users = []
ul = []
def search_bing(name):
    n = name.replace(" ", "+")
    driver.get("https://duckduckgo.com/?q=linkedin+" + n + "&t=hb&ia=web")
    time.sleep(3)
    s = BeautifulSoup(driver.page_source, 'lxml')
    fr = s.find("div", class_="result__body links_main links_deep")
    for a in fr.find_all('a'):
        try:
            if 'linkedin.com/in' in a['href']:
                print('found linkedin url', a['href'])
                if a['href'] in ul:
                    print('skipping dup')
                else:
                    ul.append(a['href'])
                    c.writerow([name, a['href']])
                    break
        except Exception as e:
            print(e, '..continue')
soup = BeautifulSoup(your_groups_code, 'lxml')
for a in soup.find_all('img'):
    name = a['alt']
    if name in users:
        print('skipping dup')
    else:
        users.append(name)
if len(users) > 1:
    print('LIST -->', users)
    for i in users:
        print("Scraping", i)
        search_bing(i)
else:
    print('Congrats! You are making progress.. Now please insert the code of '
          'the linkedin group you want to scrape (as seen in tutorial)')

Posted on 2017-04-29 02:01:31
You appear to be using an older version of ChromeDriver, 2.9, which is most likely not compatible with Chrome 58. Please download and try the latest version, 2.29. See the release notes: https://chromedriver.storage.googleapis.com/2.29/notes.txt
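As a side note: from ChromeDriver 73 onward the driver's major version is required to match the browser's major version, which makes this kind of compatibility check mechanical. A minimal sketch of that check (the old 2.x drivers, like the 2.9 in the traceback, used their own numbering scheme, so for those you still have to consult the release notes):

```python
def majors_match(chrome_version, driver_version):
    # From ChromeDriver 73 onward, the driver's major version must equal
    # the browser's major version. The legacy 2.x drivers are numbered
    # separately and will always fail this check.
    return chrome_version.split(".")[0] == driver_version.split(".")[0]

# Version strings copied from the traceback in the question.
print(majors_match("58.0.3029.81", "2.9.248307"))    # → False
print(majors_match("96.0.4664.45", "96.0.4664.45"))  # → True
```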
Posted on 2017-04-29 00:35:29
You've removed a lot of the code, so it's hard to debug. From what I can tell, your code is failing in the driver.get() call when search_bing() is invoked. I tried a simplified version of this code and it worked, so I'd suggest checking whether there's a problem with the 'name' var being passed into search_bing().
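One way to rule out the name value: str.replace(" ", "+") only handles spaces, while urllib.parse.quote_plus also escapes non-ASCII and reserved characters that could produce a malformed URL. A sketch of that (build_search_url is a hypothetical helper, not part of the original script):

```python
from urllib.parse import quote_plus

def build_search_url(name):
    # quote_plus turns spaces into '+' and percent-encodes anything else
    # that a bare str.replace(" ", "+") would leave in the URL verbatim.
    return "https://duckduckgo.com/?q=" + quote_plus("linkedin " + name) + "&t=hb&ia=web"

print(build_search_url("John Smith"))
# → https://duckduckgo.com/?q=linkedin+John+Smith&t=hb&ia=web
```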
#! /usr/bin/env python
from bs4 import BeautifulSoup
from selenium import webdriver
import time
import csv
c = csv.writer(open("linkedin-group-results.csv", "w"))
c.writerow(["Member","Profile"])
driver = webdriver.Chrome(executable_path=r'/usr/local/bin/chromedriver')
name = 'John Smith'
n = name.replace(" ", "+")
driver.get("https://duckduckgo.com/?q=linkedin+" + n + "&t=hb&ia=web")

https://stackoverflow.com/questions/43690142