I'm running a Selenium scraper on Heroku. It runs, but every 6-7 seconds it crashes with this error.
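2021-06-24T15:33:07.835601+00:00 heroku[web.1]: Error R10 (Boot timeout) -> Web process failed to bind to $PORT within 60 seconds of launch
2021-06-24T15:33:07.884148+00:00 heroku[web.1]: Stopping process with SIGKILL
2021-06-24T15:33:08.022511+00:00 heroku[web.1]: Process exited with status 137
2021-06-24T15:33:08.119687+00:00 heroku[web.1]: State changed from starting to crashed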
import os
import time

import requests
from PIL import Image
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Fallback paths, used when the buildpack environment variables are not set.
GOOGLE_CHROME_PATH = '/app/.apt/usr/bin/google_chrome'
CHROMEDRIVER_PATH = '/app/.chromedriver/bin/chromedriver'
# Read here but never bound: nothing in this script listens on PORT.
PORT = int(os.environ.get('PORT', 13978))

options1 = Options()
# Environment variable names carry no leading '$'.
options1.binary_location = os.environ.get('GOOGLE_CHROME_BIN', GOOGLE_CHROME_PATH)
options1.add_argument("--headless")
options1.add_argument("--example-flag")
options1.add_argument('--no-sandbox')
options1.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(
    executable_path=os.environ.get('CHROMEDRIVER_PATH', CHROMEDRIVER_PATH),
    options=options1,
)
driver.maximize_window()
driver.get('https://tryshowtime.com/c/spotlights')
print("On the page")
time.sleep(6)

i = 0
old_rest = set()
while True:
    try:
        # Scroll down so the page loads more images.
        driver.execute_script("window.scrollBy(0,3225)", "")
        time.sleep(12)
        images = driver.find_elements_by_xpath('//div[@class="relative"]//img')
        ans = set(images) - set(old_rest)  # Remove old elements
        for image in ans:
            i += 1
            link = image.get_attribute('src')
            print(f"got {i}th link")
            img_f = requests.get(link, stream=True)
            with open(f'Image_{i}.jpg', 'wb') as f:
                f.write(img_f.content)
            # Re-open the download, normalize it to RGB and resize it.
            img = Image.open(f'Image_{i}.jpg')
            if img.mode != 'RGB':
                img = img.convert('RGB')
            img_final = img.resize((1024, 1024))
            img_final.save(f'Image_{i}.jpg')
            print("Image saved successfully")
        old_rest = images
    except Exception as exc:
        # Report the error instead of silently swallowing it.
        print(f"Error: {exc}")

It looks like I haven't set the port correctly? It's the same for another scraper of mine on Heroku, but that scraper only runs on command, not continuously. Can anyone point out what the problem is?
INFO:- Could this be failing on Heroku because it exceeds the dyno's memory (512 MB)?
Posted on 2021-07-02 15:39:27
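I was using the web process type in my Procfile. I changed it to worker, and it worked. A web process must bind to $PORT within 60 seconds of launch, which this scraper never does; a worker process has no such requirement.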
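If the process type had to stay web for some reason, another way to avoid the R10 error would be to actually listen on $PORT while the scraper runs. The following is only a minimal sketch of that idea, assuming Python's standard http.server module is acceptable; HealthHandler and start_port_listener are hypothetical names that do not appear in the original post.

```python
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = b"scraper running\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging so the scraper output stays readable.
        pass


def start_port_listener(port=None):
    """Bind an HTTP server to `port` (default: $PORT) in a daemon thread.

    Returns the server object so the caller can inspect or stop it.
    """
    if port is None:
        # Same fallback value as the PORT constant in the question.
        port = int(os.environ.get("PORT", 13978))
    server = HTTPServer(("0.0.0.0", port), HealthHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

In this sketch, start_port_listener() would be called once at the top of the script, before the webdriver is created, so that the port is bound well within the 60-second window. The listener runs in a daemon thread, so it stops when the scraping loop exits. For a script that serves no real traffic, switching to a worker process as described above remains the simpler option.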
https://stackoverflow.com/questions/68119042