首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >谷歌公共专利数据SQL (BigQuery)

谷歌公共专利数据SQL (BigQuery)
EN

Stack Overflow用户
提问于 2021-07-10 07:38:20
回答 2查看 284关注 0票数 0

我试图通过SQL检索Google公共专利数据中的特定cpc代码和受让人。我正在寻找“大众”和cpc.code "H01M8“这两个词。

但我发现了一个错误:

不匹配操作符=参数类型的签名:数组>、STRING。支持签名: ANY = 15:3的任何签名

代码:

代码语言:javascript
复制
SELECT
  publication_number application_number,
  family_id,
  publication_date,
  filing_date,
  priority_date,
  priority_claim,
  ipc,
  cpc.code,
  inventor,
  assignee_harmonized,
FROM
  `patents-public-data.patents.publications`
WHERE
  assignee_harmonized = "VOLKSWAGEN" AND cpc.code = "H01M8"
LIMIT
  1000

我还对搜索多个受让人感兴趣,例如:

in ("VOLKSWAGEN", "PORSCHE", "AUDI", "SCANIA", "SKODA", "MAZDA", "TOYOTA", "HONDA", "BOSCH", "KYOCERA", "PANASONIC", "TOTO", "NISSAN", "LG FUEL CELL SYSTEMS", "SONY", "HYUNDAI", "SUZUKI", "PLUG POWER", "SFC ENERGY", "BALLARD", "KIA MOTORS", "SIEMENS", "KAWASAKI", "BAYERISCHE MOTORENWERKE", "HYDROGENICS", "POWERCELL SWEDEN", "ELRINGKLINGER", "PROTON MOTOR")

我最近开始使用SQL,没有看到错误:/

非常感谢你的帮助!

EN

回答 2

Stack Overflow用户

发布于 2021-07-10 11:18:42

在Google中,访问数组元素需要BigQuery UNNEST。这里描述了这一点:

https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays

下面的查询适用于我。

代码语言:javascript
复制
SELECT
  publication_number application_number,
  family_id,
  publication_date,
  filing_date,
  priority_date,
  priority_claim,
  ipc,
  cpc__u.code,
  inventor,
  assignee_harmonized,
FROM
  `patents-public-data.patents.publications`,
  UNNEST(assignee_harmonized) AS assignee_harmonized__u,
  UNNEST(cpc) AS cpc__u
WHERE
  assignee_harmonized__u.name = "VOLKSWAGEN AG"
  AND cpc__u.code LIKE "H01M8%"
LIMIT
  1000

以下是我为产生结果所做的更改:

  1. UNNEST(assignee_harmonized) as assignee_harmonized__u访问assignee_harmonized__u.name.
  2. UNNEST(cpc) as cpc__u以访问cpc__u.code.
  3. assignee_harmonized__u.name = "VOLKSWAGEN AG",因为"VOLKSWAGEN"不返回results.
  4. cpc__u.code LIKE "H01M8%",因为"H01M8"不返回结果。一个示例值是H01M8/10.

这将返回以下内容:

查询完成(经过2.3秒,29.2 GB处理)

如果您想要筛选多个受让人名称,IN的工作方式如下所示,但是需要有一个与VOLKSWAGEN AGAUDI AG完全匹配的名称。

assignee_harmonized__u.name IN ("VOLKSWAGEN", "PORSCHE", "AUDI", "SCANIA", "SKODA", "MAZDA", "TOYOTA", "HONDA", "BOSCH", "KYOCERA", "PANASONIC", "TOTO", "NISSAN", "LG FUEL CELL SYSTEMS", "SONY", "HYUNDAI", "SUZUKI", "PLUG POWER", "SFC ENERGY", "BALLARD", "KIA MOTORS", "SIEMENS", "KAWASAKI", "BAYERISCHE MOTORENWERKE", "HYDROGENICS", "POWERCELL SWEDEN", "ELRINGKLINGER", "PROTON MOTOR")

如果要使用多个字符串进行LIKE样式匹配,可以尝试使用REGEXP_CONTAINS

https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_contains

票数 0
EN

Stack Overflow用户

发布于 2021-07-10 12:57:05

非常感谢,现在我创建了这个代码来筛选多家公司。是否可以将请求的查询从一行单元格中的"cpc__u.code“中提取出来?使用",“分隔输出字符串之间的代码?同样,我也想在这里考虑一下assignee_harmonized__u.name!

你认为这些公司会接受这个先例和"IN“运营商的筛选吗?

代码语言:javascript
复制
SELECT
  publication_number application_number,
  family_id,
  publication_date,
  filing_date,
  priority_date,
  priority_claim,
  cpc__u.code,
  inventor,
  assignee_harmonized,
  assignee
FROM
  `patents-public-data.patents.publications`,
  UNNEST(assignee_harmonized) AS assignee_harmonized__u,
  UNNEST(cpc) AS cpc__u
WHERE
  assignee_harmonized__u.name in ("VOLKSWAGEN", "PORSCHE", "AUDI", "SCANIA", "SKODA", "MAZDA", "TOYOTA", "HONDA", "BOSCH", "KYOCERA", "PANASONIC", "TOTO", "NISSAN", "LG FUEL CELL SYSTEMS", "SONY", "HYUNDAI", "SUZUKI", "PLUG POWER", "SFC ENERGY", "BALLARD", "KIA MOTORS", "SIEMENS", "KAWASAKI", "BAYERISCHE MOTORENWERKE", "HYDROGENICS", "POWERCELL SWEDEN", "ELRINGKLINGER", "PROTON MOTOR")
  AND cpc__u.code LIKE "H01M8%"
LIMIT
  100000
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/68325713

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档