我试图通过SQL检索Google公共专利数据中的特定cpc代码和受让人。我正在寻找“大众”和cpc.code "H01M8“这两个词。
但我发现了一个错误:
不匹配操作符=参数类型的签名:数组>、STRING。支持签名: ANY = 15:3的任何签名
代码:
SELECT
publication_number application_number,
family_id,
publication_date,
filing_date,
priority_date,
priority_claim,
ipc,
cpc.code,
inventor,
assignee_harmonized,
FROM
`patents-public-data.patents.publications`
WHERE
assignee_harmonized = "VOLKSWAGEN" AND cpc.code = "H01M8"
LIMIT
1000我还对搜索多个受让人感兴趣,例如:
in ("VOLKSWAGEN", "PORSCHE", "AUDI", "SCANIA", "SKODA", "MAZDA", "TOYOTA", "HONDA", "BOSCH", "KYOCERA", "PANASONIC", "TOTO", "NISSAN", "LG FUEL CELL SYSTEMS", "SONY", "HYUNDAI", "SUZUKI", "PLUG POWER", "SFC ENERGY", "BALLARD", "KIA MOTORS", "SIEMENS", "KAWASAKI", "BAYERISCHE MOTORENWERKE", "HYDROGENICS", "POWERCELL SWEDEN", "ELRINGKLINGER", "PROTON MOTOR")
我最近开始使用SQL,没有看到错误:/
非常感谢你的帮助!
发布于 2021-07-10 11:18:42
在Google中,访问数组元素需要BigQuery UNNEST。这里描述了这一点:
https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays
下面的查询适用于我。
SELECT
publication_number application_number,
family_id,
publication_date,
filing_date,
priority_date,
priority_claim,
ipc,
cpc__u.code,
inventor,
assignee_harmonized,
FROM
`patents-public-data.patents.publications`,
UNNEST(assignee_harmonized) AS assignee_harmonized__u,
UNNEST(cpc) AS cpc__u
WHERE
assignee_harmonized__u.name = "VOLKSWAGEN AG"
AND cpc__u.code LIKE "H01M8%"
LIMIT
1000以下是我为产生结果所做的更改:
UNNEST(assignee_harmonized) as assignee_harmonized__u访问assignee_harmonized__u.name.UNNEST(cpc) as cpc__u以访问cpc__u.code.assignee_harmonized__u.name = "VOLKSWAGEN AG",因为"VOLKSWAGEN"不返回results.cpc__u.code LIKE "H01M8%",因为"H01M8"不返回结果。一个示例值是H01M8/10.这将返回以下内容:
查询完成(经过2.3秒,29.2 GB处理)
如果您想要筛选多个受让人名称,IN的工作方式如下所示,但是需要有一个与VOLKSWAGEN AG或AUDI AG完全匹配的名称。
assignee_harmonized__u.name IN ("VOLKSWAGEN", "PORSCHE", "AUDI", "SCANIA", "SKODA", "MAZDA", "TOYOTA", "HONDA", "BOSCH", "KYOCERA", "PANASONIC", "TOTO", "NISSAN", "LG FUEL CELL SYSTEMS", "SONY", "HYUNDAI", "SUZUKI", "PLUG POWER", "SFC ENERGY", "BALLARD", "KIA MOTORS", "SIEMENS", "KAWASAKI", "BAYERISCHE MOTORENWERKE", "HYDROGENICS", "POWERCELL SWEDEN", "ELRINGKLINGER", "PROTON MOTOR")
如果要使用多个字符串进行LIKE样式匹配,可以尝试使用REGEXP_CONTAINS。
https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_contains
发布于 2021-07-10 12:57:05
非常感谢,现在我创建了这个代码来筛选多家公司。是否可以将请求的查询从一行单元格中的"cpc__u.code“中提取出来?使用",“分隔输出字符串之间的代码?同样,我也想在这里考虑一下assignee_harmonized__u.name!
你认为这些公司会接受这个先例和"IN“运营商的筛选吗?
SELECT
publication_number application_number,
family_id,
publication_date,
filing_date,
priority_date,
priority_claim,
cpc__u.code,
inventor,
assignee_harmonized,
assignee
FROM
`patents-public-data.patents.publications`,
UNNEST(assignee_harmonized) AS assignee_harmonized__u,
UNNEST(cpc) AS cpc__u
WHERE
assignee_harmonized__u.name in ("VOLKSWAGEN", "PORSCHE", "AUDI", "SCANIA", "SKODA", "MAZDA", "TOYOTA", "HONDA", "BOSCH", "KYOCERA", "PANASONIC", "TOTO", "NISSAN", "LG FUEL CELL SYSTEMS", "SONY", "HYUNDAI", "SUZUKI", "PLUG POWER", "SFC ENERGY", "BALLARD", "KIA MOTORS", "SIEMENS", "KAWASAKI", "BAYERISCHE MOTORENWERKE", "HYDROGENICS", "POWERCELL SWEDEN", "ELRINGKLINGER", "PROTON MOTOR")
AND cpc__u.code LIKE "H01M8%"
LIMIT
100000https://stackoverflow.com/questions/68325713
复制相似问题