这是我之前问题的第二部分。我有一张牌照号码表,并试图确定它们是否合法。我试图运行的测试之一是,数字是否是顺序/计数模式。真正有帮助的是,如果我能得到一个顺序字符的最大计数,但我觉得这更困难。例如:
LICENSE_NUMBER IS_COUNTING NUM_SEQ_CHARS
123456789 TRUE 9
123455678 FALSE 5
456789012 TRUE 9
12345 TRUE 5
123451234 FALSE 5谢谢你的帮忙!
发布于 2016-11-22 15:38:14
^(01(2345678901)*(2(3(4(5(6(7(8(90?)?)?)?)?)?)?)?)?
|12(3456789012)*(3(4(5(6(7(8(9(01?)?)?)?)?)?)?)?)?
|23(4567890123)*(4(5(6(7(8(9(0(12?)?)?)?)?)?)?)?)?
|34(5678901234)*(5(6(7(8(9(0(1(23?)?)?)?)?)?)?)?)?
|45(6789012345)*(6(7(8(9(0(1(2(34?)?)?)?)?)?)?)?)?
|56(7890123456)*(7(8(9(0(1(2(3(45?)?)?)?)?)?)?)?)?
|67(8901234567)*(8(9(0(1(2(3(4(56?)?)?)?)?)?)?)?)?
|78(9012345678)*(9(0(1(2(3(4(5(67?)?)?)?)?)?)?)?)?
|89(0123456789)*(0(1(2(3(4(5(6(78?)?)?)?)?)?)?)?)?
|90(1234567890)*(1(2(3(4(5(6(7(89?)?)?)?)?)?)?)?)?)$它将匹配任何两个或多个数字的字符串,该字符串完全由顺序数字组成。
123456789 MATCH
123455678 NOT MATCH
456789012 MATCH
12345 MATCH演示:http://regexr.com/3enkb
发布于 2016-11-24 16:29:13
这里是确定给定许可证号码中连续递增数字的最大计数的一种有效方法。我没有费心使用“是”/“否”列--这可以通过比较MAX_COUNT和length(license_number)来获得。把它当作练习。
我们需要将许可证号分解成它的各个字符,跟踪位置。然后我们可以减去“位置”,取剩余模10。因为甲骨文认为mod(-1, 10)是-1而不是9(他们真的需要回到小学重新学习MODULO函数),所以我需要添加10,所以在所有情况下我都得到了正确的结果(在下面的解决方案中解释了一个“奇怪”)。现在的问题是:为每个许可号找到最长的“相等”剩余序列mod 10。
在一个序列中找到连续的“某事”是使用所谓的"Tabibitosan方法“(在我的解决方案中称为prep的CTE i中的两个preps的差异)中最有效的。然后,这就变成了分组、计数和获取max()的问题。
with
-- begin of test data; not part of the solution
test_data ( license_number ) as (
select '123456789' from dual union all
select '123455678' from dual union all
select '456789012' from dual union all
select '12345' from dual union all
select '123451234' from dual union all
select '402023488' from dual union all
select '4189012' from dual
),
-- end of test data; solution (SQL query) continues below this line
tokenized ( license_number, idx, res ) as (
select license_number, level,
mod(10 + to_number(substr(license_number, level, 1)) - level, 10)
from test_data
connect by level <= length(license_number)
and prior license_number = license_number
and prior sys_guid() is not null
),
prep ( license_number, res, grp ) as (
select license_number, res,
row_number() over (partition by license_number order by idx) -
row_number() over (partition by license_number, res order by idx)
from tokenized
),
grouped ( license_number, res, grp, ct ) as (
select license_number, res, grp, count(*)
from prep
group by license_number, res, grp
)
select license_number, max(ct) as max_count
from grouped
group by license_number
;输出
LICENSE_NUMBER MAX_COUNT
-------------- ---------
123455678 5
123456789 9
456789012 9
123451234 5
4189012 5
12345 5
402023488 3发布于 2016-11-22 19:05:22
此查询还计算顺序字符的计数:
SELECT LICENSE_NUMBER,
CASE length( LICENSE_NUMBER ) WHEN max( cnt) THEN 'TRUE' ELSE 'FALSE' END As IS_COUNTING ,
max( cnt) As NUM_SEQ_CHARS
FROM (
SELECT LICENSE_NUMBER, p, count(*) As cnt
FROM (
SELECT LICENSE_NUMBER,
SUM( xx ) OVER ( Partition By LICENSE_NUMBER ORDER BY x ) As p
FROM (
SELECT LICENSE_NUMBER,
x, qq,
CASE WHEN qq - 1 = LAG( qq ) OVER ( Partition By LICENSE_NUMBER ORDER BY x )
THEN 0
WHEN qq = 0 AND 9 = LAG( qq ) OVER ( Partition By LICENSE_NUMBER ORDER BY x )
THEN 0
ELSE 1
END As xx
FROM (
SELECT LICENSE_NUMBER, x, substr( LICENSE_NUMBER, x, 1 ) qq
FROM (
SELECT *
FROM table
CROSS JOIN (
SELECT level x FROM dual
CONNECT BY LEVEL <= ( SELECT MAX( length( LICENSE_NUMBER )) FROM table )
)
)
)
)
)
GROUP BY LICENSE_NUMBER, p
)
GROUP BY LICENSE_NUMBERhttps://stackoverflow.com/questions/40745219
复制相似问题