
Batch download Google Images with labels

Stack Overflow user

Asked 2016-02-06 22:28:59

1 answer, 2.6K views, 0 followers, 3 votes

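I'm trying to find an efficient, reproducible way to batch download full-size image files from a Google Images search. Others have asked similar questions, but I haven't found anything that does what I'm looking for, or that I can understand.

Most people point to the deprecated Google Image Search API or to the Google Custom Search API, which doesn't seem to work across the whole web, or they just download images from a single URL.

I imagine this could be a two-step process: first, extract all the image URLs from a search, then batch download from those URLs?

I should add that I'm a beginner (which is probably obvious; sorry). So if anyone could explain this to me and point me in the right direction, it would be much appreciated.

I've also looked into freeware options, but those seem patchy as well. Unless someone knows of a reliable one.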
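The second step of that two-step idea (batch downloading from a list of URLs) could look roughly like the sketch below. It is only an illustration, not a tested solution from the thread: the names `filename_for`, `fetch` and `download_all` are hypothetical, it uses the standard library rather than `requests`, and it assumes the URLs point directly at image files.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_for(url, index):
    """Build a local file name from the URL path, prefixed with its index."""
    name = os.path.basename(urlparse(url).path) or 'image'
    return '%03d_%s' % (index, name)


def fetch(url, timeout=10):
    """Return the raw bytes behind a URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, directory, fetch=fetch):
    """Save every URL into directory; return the list of URLs that failed."""
    os.makedirs(directory, exist_ok=True)
    failed = []
    for index, url in enumerate(urls):
        try:
            data = fetch(url)
        except OSError:  # urllib.error.URLError and timeouts are OSError subclasses
            failed.append(url)
            continue
        with open(os.path.join(directory, filename_for(url, index)), 'wb') as f:
            f.write(data)
    return failed
```

Calling `download_all(urls, 'myDirectory')` writes one file per URL and returns the URLs that could not be fetched, so a failed download does not stop the rest of the batch. The `fetch` parameter is there only so the network call can be swapped out.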

Download images from google image search (python)

In Python, is there a way I can download all/some the image files (e.g. JPG/PNG) from a **Google Images** search result?

Also, does anyone know anything about these labels, and whether they exist somewhere or are associated with these images? https://en.wikipedia.org/wiki/Google_Image_Labeler

import json
import os
import time
from io import BytesIO

import requests
from PIL import Image
from requests.exceptions import ConnectionError


def go(query, path):
    """Download full size images from Google image search.

    Don't print or republish images without permission.
    I used this to train a learning algorithm.
    """
    # Endpoint of the deprecated Google Image Search API.
    BASE_URL = ('https://ajax.googleapis.com/ajax/services/search/images?'
                'v=1.0&q=' + query + '&start=%d')

    BASE_PATH = os.path.join(path, query)

    if not os.path.exists(BASE_PATH):
        os.makedirs(BASE_PATH)

    start = 0  # Google's start query string parameter for pagination.
    while start < 60:  # Google will only return a max of 56 results.
        r = requests.get(BASE_URL % start)
        for image_info in json.loads(r.text)['responseData']['results']:
            url = image_info['unescapedUrl']
            try:
                image_r = requests.get(url)
            except ConnectionError:
                print('could not download %s' % url)
                continue

            # Remove file-system path characters from name.
            title = image_info['titleNoFormatting'].replace('/', '').replace('\\', '')

            # Open in binary mode: the JPEG data is bytes, not text.
            file = open(os.path.join(BASE_PATH, '%s.jpg' % title), 'wb')
            try:
                Image.open(BytesIO(image_r.content)).save(file, 'JPEG')
            except IOError:
                # Throw away some gifs...blegh.
                print('could not save %s' % url)
                continue
            finally:
                file.close()

        print(start)
        start += 4  # 4 images per page.

        # Be nice to Google and they'll be nice back :)
        time.sleep(1.5)


# Example use
go('landscape', 'myDirectory')

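Update

I was able to create a custom search that uses the full web, as specified here, and ran it successfully to get image links. But as mentioned in the previous post, the results don't exactly match the normal Google Images results.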
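For reference, fetching image links from a custom search engine can be sketched as below. This is a rough illustration rather than the exact code used for the update: it assumes the Custom Search JSON API, and `api_key` and `engine_id` are placeholders for your own credentials. The helper names (`build_search_url`, `extract_links`, `search_image_links`) are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_ENDPOINT = 'https://www.googleapis.com/customsearch/v1'


def build_search_url(query, api_key, engine_id, start=1, num=10):
    """Build a Custom Search JSON API request URL for an image search."""
    params = {
        'key': api_key,         # placeholder: your own API key
        'cx': engine_id,        # placeholder: your custom search engine ID
        'q': query,
        'searchType': 'image',  # restrict results to images
        'start': start,         # 1-based index of the first result
        'num': num,             # the API returns at most 10 results per request
    }
    return API_ENDPOINT + '?' + urlencode(params)


def extract_links(response):
    """Pull the image URLs out of a decoded JSON response."""
    return [item['link'] for item in response.get('items', [])]


def search_image_links(query, api_key, engine_id, pages=3):
    """Collect image links across several result pages (needs network access)."""
    links = []
    for page in range(pages):
        url = build_search_url(query, api_key, engine_id, start=1 + 10 * page)
        with urllib.request.urlopen(url, timeout=10) as response:
            links.extend(extract_links(json.load(response)))
    return links
```

The list returned by `search_image_links` is just a list of URLs, so it can be handed to whatever download step you prefer.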

google-custom-search

google-image-search

python

image

batch-processing

1 Answer

Stack Overflow user

Answered 2017-09-09 03:26:49

Try the ImageSoup module. To install it, just run:

pip install imagesoup

Example code:

>>> from imagesoup import ImageSoup
>>>
>>> soup = ImageSoup()
>>> images_wanted = 50
>>> query = 'landscape'
>>> images = soup.search(query, n_images=images_wanted)

Now you have a list of 50 landscape images from Google Images. Let's play with the first one:

>>> im = images[0]
>>> im.URL
https://static.pexels.com/photos/279315/pexels-photo-279315.jpeg
>>> im.size
(2600, 1300)
>>> im.mode
RGB
>>> im.dpi
(300, 300)
>>> im.color_count
493230
>>> # Let's check the main 4 colors in the image. We use
>>> # reduce_size = True to speed up the process.
>>> im.main_color(reduce_size=True, n=4)
[('black', 0.2244), ('darkslategrey', 0.1057), ('darkolivegreen', 0.0761), ('dodgerblue', 0.0531)]
>>> # Let's take a look at our image.
>>> im.show()

>>> # Nice image! Let's save it.
>>> im.to_file('landscape.jpg')

The number of images returned by each search can vary. It is usually fewer than 900. To get all of the images, set n_images=1000.
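To save the whole result list rather than a single image, a loop along these lines might work. It is a sketch under assumptions: it relies only on the `to_file` method shown above, the helper name `save_all` is hypothetical, and it names every file `.jpg` even though some results may be in other formats.

```python
import os


def save_all(images, directory, prefix='image'):
    """Write each result to directory via its to_file method; return the paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for index, image in enumerate(images):
        # Assumption: every result is saved with a .jpg extension.
        path = os.path.join(directory, '%s_%03d.jpg' % (prefix, index))
        image.to_file(path)
        paths.append(path)
    return paths
```

With the session above, `save_all(images, 'myDirectory', prefix='landscape')` would write `landscape_000.jpg`, `landscape_001.jpg`, and so on.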

To contribute or report a bug, see the GitHub repository: https://github.com/rafpyprog/ImageSoup

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/35242151
