首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >从域名中提取TLD并根据TLD对其进行分组

从域名中提取TLD并根据TLD对其进行分组
EN

Stack Overflow用户
提问于 2020-12-01 18:17:24
回答 2查看 124关注 0票数 0

我需要从网址中提取TLD,如果它与预先定义的一组TLDs(com、edu、nz、au)相匹配,我需要对其进行总结。如果它与预定义的TLD不匹配,它应该被归类到“其他”类别中。如果某一特定的商业活动无法使用网页,则应将其归类为“不可用”。

预期产出:

代码语言:javascript
复制
CLIENT TYPE     TOTAL
-------------  ----------
com             4
au              5
nz              0
Not Available   0
Other           0

我编写了以下查询,但它没有给出具有0值的行。

代码语言:javascript
复制
select tld2, NVL(cnt,0)  from (select REGEXP_SUBSTR (webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL, 1) as tld2, count(*) cnt from client group by REGEXP_SUBSTR (webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL, 1))a where tld2 in ('com','edu','gov','org')
UNION ALL
select 'Not Available' as tld2, COUNT(webaddress) from client where webaddress is null
UNION
select 'Other' as tld2, NVL(cnt,0)  from (select REGEXP_SUBSTR (webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL, 1) as tld2, count(*) cnt from client group by REGEXP_SUBSTR (webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL, 1))a where tld2 not in ('com','edu','gov','org');

如果我在这里使用案例,谁能帮我指点一下吗?

EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2020-12-01 20:51:48

请试着修改一下你的方法:

代码语言:javascript
复制
select tld2, NVL(cnt,0)  from (select REGEXP_SUBSTR (webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL, 1) as tld2, count(*) cnt from client group by REGEXP_SUBSTR (webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL, 1))a where tld2 in ('com','edu','gov','org')
UNION ALL
select 'Not Available' as tld2, cnt from (select COUNT(webaddress) cnt from client where webaddress is null)
UNION
select 'Other' as tld2, cnt  from (select count(webaddress) cnt from client where REGEXP_SUBSTR (webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL, 1) not in ('com','edu','gov','org'))a ;

谢谢。

票数 1
EN

Stack Overflow用户

发布于 2020-12-01 19:09:39

看一下这个例子:

代码语言:javascript
复制
with sub as (
select case when webaddress is null then 'Not Available' 
             when domain_name  in ('com','edu','gov','org') then domain_name else 'Other' end client_type 
             from (
SELECT 
    regexp_substr(webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL,
                  1) domain_name,
    webaddress
FROM
    (
        SELECT
            'https://stackoverflow.com/questions/65096217/' webaddress
        FROM
            dual
        UNION ALL
        SELECT
            'https://stackoverflow.edu/questions/65096217/' 
        FROM
            dual
        UNION ALL
        SELECT
            'https://stackoverflow.edu/questions/6509621/' 
        FROM
            dual
        UNION ALL
        select 'https://stackoverflow.de/questions/65096217/' 
        from dual
        /*UNION ALL
        select null 
        from dual*/
    ))),
cat as (select regexp_substr('Not Available,com,edu,gov,org,Other','[^,]+', 1, level ) val
from dual
connect by regexp_substr('Not Available,com,edu,gov,org,Other', '[^,]+', 1, level) is not null)
select c.val, sum(case when s.client_type is null then 0 else 1 end)
from sub s right outer join cat c on (c.val = s.client_type)
group by c.val;

以前(不完全溶胶):非常简单的解决方案使用平原组和案例stmt。可能是这样的:

代码语言:javascript
复制
select  case when s.webaddress is null then 'Not Available' 
             when s.domain_name  in ('com','edu','gov','org') then s.domain_name else 'Other' end client_type, 
             count(*)
        from (SELECT
    regexp_substr(webaddress, '\.([a-z]+)(/|$)', 1, 1, NULL,
                  1) domain_name,
    webaddress
FROM
    (
        SELECT
            'https://stackoverflow.com/questions/65096217/' webaddress
        FROM
            dual
        UNION ALL
        SELECT
            'https://stackoverflow.edu/questions/65096217/' 
        FROM
            dual
        UNION ALL
        SELECT
            'https://stackoverflow.edu/questions/6509621/' 
        FROM
            dual
        UNION ALL
        SELECT
            NULL 
        FROM
            dual
        UNION ALL
        select 'https://stackoverflow.de/questions/65096217/' 
        from dual
    )) s
        group by case when s.webaddress is null then 'Not Available' 
             when s.domain_name  in ('com','edu','gov','org') then s.domain_name else 'Other' end
;
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/65096217

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档