首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >雅典娜在CloudFront日志上查询5分钟间隔内的最大速率

雅典娜在CloudFront日志上查询5分钟间隔内的最大速率
EN

Stack Overflow用户
提问于 2020-11-05 00:29:38
回答 1查看 383关注 0票数 1

我正在尝试查询CloudFront日志,以根据IP地址推导出最大请求速率的流量基准。我正在努力将日期和时间列组合成一个时间戳,我可以用它来生成5分钟间隔。我是不是接近这个查询了?

代码语言:javascript
复制
SELECT from_iso8601_timestamp(CONCAT(CAST(date as varchar), 'T', time, 'Z')) AS datetime, request_ip, COUNT(request_ip) AS count
FROM cloudfront_logs
WHERE "date"
  BETWEEN DATE '2020-10-28'
      AND DATE '2020-11-04'
GROUP BY CONCAT(CAST(date as varchar), 'T', time, 'Z'), request_ip
ORDER BY datetime ASC
LIMIT 10;

我正在尝试达到一个点,我可以做这样的事情:

代码语言:javascript
复制
SELECT
  FROM_UNIXTIME((UNIX_TIMESTAMP(table.timestamp) DIV 300) * 300) AS Timestamp,
  request_ip,
  COUNT(request_ip) AS count
FROM cloudfront_logs
WHERE "date"
  BETWEEN DATE '2020-10-28'
      AND DATE '2020-11-04'
GROUP BY CONCAT(CAST(date as varchar), 'T', time, 'Z'), request_ip, count
ORDER BY datetime ASC
LIMIT 10;

对于每个请求,以下是基于日志格式的cloudfront_logs表的格式。在创建表的过程中,我曾尝试创建连接的时间戳,但出现了很多语法错误:

代码语言:javascript
复制
CREATE EXTERNAL TABLE IF NOT EXISTS default.cloudfront_logs (
  `date` DATE,
  time STRING,
  location STRING,
  bytes BIGINT,
  request_ip STRING,
  method STRING,
  host STRING,
  uri STRING,
  status INT,
  referrer STRING,
  user_agent STRING,
  query_string STRING,
  cookie STRING,
  result_type STRING,
  request_id STRING,
  host_header STRING,
  request_protocol STRING,
  request_bytes BIGINT,
  time_taken FLOAT,
  xforwarded_for STRING,
  ssl_protocol STRING,
  ssl_cipher STRING,
  response_result_type STRING,
  http_version STRING,
  fle_status STRING,
  fle_encrypted_fields INT,
  c_port INT,
  time_to_first_byte FLOAT,
  x_edge_detailed_result_type STRING,
  sc_content_type STRING,
  sc_content_len BIGINT,
  sc_range_start BIGINT,
  sc_range_end BIGINT
)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '\t'
LOCATION 's3://CloudFront_bucket_name/CloudFront/'
TBLPROPERTIES ( 'skip.header.line.count'='2' )

摘自Querying Amazon CloudFront Logs上的亚马逊网络服务官方文档

失败示例:

代码语言:javascript
复制
CREATE EXTERNAL TABLE IF NOT EXISTS default.cloudfront_logs (
  timestamp AS from_iso8601_timestamp(CONCAT(CAST(date as varchar), 'T', time, 'Z')) DATE,
  `date` DATE,
  time STRING,
  location STRING,
  bytes BIGINT,

语法错误:

代码语言:javascript
复制
line 1:8: no viable alternative at input 'create external' (service: amazonathena; status code: 400; error code: invalidrequestexception; request id: e5147782-2e62-465e...)

更新:

以下是来自实际日志的示例:

代码语言:javascript
复制
#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type cs-protocol-version fle-status fle-encrypted-fields c-port time-to-first-byte x-edge-detailed-result-type sc-content-type sc-content-len sc-range-start sc-range-end
2020-10-29      04:59:03        MEL50-C1        1524    192.168.0.2     GET     example.cloudfront.net    /mobile/js/category/category.js 200     https://www.example.com/    Mozilla/5.0%20(Linux;%20Android%2010;%20A7)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/192.168.0.1%20Mobile%20Safari/537.36      v=2020102601    cookie=redacted Hit     BSVYL08_EXAMPLE==        cdn.example.com      https   2001    0.005   -       TLSv1.3 TLS_AES_128_GCM_SHA256  Hit     HTTP/2.0
        -       -       26906   0.004   Hit     application/javascript  787     -       -

解决方案:

代码语言:javascript
复制
SELECT from_unixtime(floor(to_unixtime(from_iso8601_timestamp(CONCAT(CAST(date AS VARCHAR), 'T', time, 'Z'))) / 300) * 300), request_ip, COUNT(request_ip) AS count
FROM cloudfront_logs
WHERE "date"
  BETWEEN DATE '2020-10-28'
      AND DATE '2020-11-04'
GROUP BY from_unixtime(floor(to_unixtime(from_iso8601_timestamp(CONCAT(CAST(date AS VARCHAR), 'T', time, 'Z'))) / 300) * 300), request_ip
ORDER BY count DESC;
EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2020-11-06 01:50:12

也许有一种更短的方式来表达它,但你应该能够使用以下表达式将日期+时间舍入为5分钟间隔:

代码语言:javascript
复制
from_unixtime(floor(to_unixtime(from_iso8601_timestamp(CONCAT(CAST(date AS VARCHAR), 'T', time, 'Z'))) / 300) * 300)

如果你在第一个查询中使用它,你应该会得到你想要的结果。

票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/64683996

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档