首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >使用ValidFrom/ValidTo dates连接多个表(SCD2)

使用ValidFrom/ValidTo dates连接多个表(SCD2)
EN

Stack Overflow用户
提问于 2022-05-19 23:38:46
回答 1查看 242关注 0票数 1

问题:如何连接多个(3+)表,这些表都具有SCD 2 validFrom/validTo dates?

我有以下表格:

代码语言:javascript
复制
-- table 1
CREATE TABLE dbo.Clients (
    clientCode    varchar(10) NOT NULL,
    startDate     date NOT NULL,
    [name]        varchar(200) NOT NULL,
    CONSTRAINT PK_Clients PRIMARY KEY CLUSTERED (clientCode, startDate)
);

-- table 2
CREATE TABLE dbo.Projects (
    clientCode    varchar(10) NOT NULL,  --- Each project belongs to a client.
    projectCode   varchar(10) NOT NULL,
    startDate     date NOT NULL,
    [name]        varchar(200) NOT NULL,
    CONSTRAINT PK_Projects PRIMARY KEY CLUSTERED (projectCode, startDate)
);

。。使用以下虚拟数据:

代码语言:javascript
复制
-- dummy data
INSERT INTO dbo.Clients (clientCode, startDate, [name])
VALUES ('A', {d '2010-01-01'}, 'Client A (first)'),
       ('A', {d '2011-04-01'}, 'Client A (second)'),
       ('A', {d '2011-09-01'}, 'Client A (third)'),
       ('A', {d '2012-02-01'}, 'Client A (fourth)'),
       ('A', {d '2014-01-01'}, 'Client A (fifth)'),
       ('B', {d '2010-01-01'}, 'Client B (first)'),
       ('B', {d '2011-02-01'}, 'Client B (second)'),
       ('B', {d '2011-08-01'}, 'Client B (third)'),
       ('B', {d '2011-12-01'}, 'Client B (fourth)'),
       ('B', {d '2012-11-01'}, 'Client B (fifth)');

-- dummy data
INSERT INTO dbo.Projects (clientCode, projectCode, startDate, [name])
VALUES ('A', '1', {d '2010-01-15'}, 'Project 1, first revision'),
       ('A', '1', {d '2012-04-22'}, 'Project 1, second revision'),
       ('A', '2', {d '2010-02-08'}, 'Project 2, first revision'),
       ('A', '2', {d '2010-09-12'}, 'Project 2, second revision'),
       ('A', '2', {d '2012-08-18'}, 'Project 2, third revision'),
       ('B', '3', {d '2011-04-01'}, 'Project 3, first revision'),
       ('B', '3', {d '2011-12-01'}, 'Project 3, second revision'),
       ('B', '3', {d '2014-02-28'}, 'Project 3, third revision');

使用这两个表,我们生成startDate和endDate间隔:

代码语言:javascript
复制
--- Clients:
WITH c (clientCode, [name], startDate, endDate) AS (
    SELECT clientCode, [name], startDate,
           --- Find the next record's startDate, ordered by startDate.
           LEAD(startDate, 1, {d '2099-12-31'}) OVER (
               PARTITION BY clientCode
               ORDER BY startDate) AS endDate
    FROM dbo.Clients),

--- Projects:
     p (projectCode, clientCode, [name], startDate, endDate) AS (
    SELECT projectCode, clientCode, [name], startDate,
           --- Find the next record's startDate, order by startDate
           LEAD(startDate, 1, {d '2099-12-31'}) OVER (
               PARTITION BY projectCode
               ORDER BY startDate) AS endDate
    FROM dbo.Projects)

SELECT c.clientCode, c.[name] AS clientName,
       p.projectCode, p.[name] AS projectName,
       --- Start date is the last of (c.startDate, p.startDate)
       (CASE WHEN c.startDate<p.startDate THEN p.startDate ELSE c.startDate END) AS startDate,
       --- End date is the first of (c.endDate, p.endDate)
       (CASE WHEN c.endDate<p.endDate THEN c.endDate ELSE p.endDate END) AS endDate
FROM c
LEFT JOIN p ON
    c.clientCode=p.clientCode AND
    c.startDate<p.endDate AND
    c.endDate>p.startDate

-- IF two new tables were introducted (t3 and t4), would the following JOINS work?
-- LEFT JOIN dbo.Table3 as t3
-- on p.clientCode = t3.clientcode AND
-- p.startdate<t3.endate AND
-- p.endDate>t3.startdate
-- LEFT JOIN dbo.Table4 as t4
-- on t3.toolId = t4.toolid AND      --> toolId is a new key that I need for the join, since t4 does not have clientCode
-- t3.startdate<t4.enddate AND
-- t3.enddate>t4.startdate
ORDER BY c.clientCode, p.projectCode, 5;

My problem:在上面查询的底部,我注释掉了左边的联接,当引入更多的SCD2表时,我将不得不这样做。我不确定我所做的左撇子注释是否会起作用。你觉得有什么问题吗?

添加更多联接可能与上述查询中使用的语句的情况相冲突..:

代码语言:javascript
复制
       --- Start date is the last of (c.startDate, p.startDate)
       (CASE WHEN c.startDate<p.startDate THEN p.startDate ELSE c.startDate END) AS startDate,
       --- End date is the first of (c.endDate, p.endDate)
       (CASE WHEN c.endDate<p.endDate THEN c.endDate ELSE p.endDate END) AS endDate

这种情况下,使用语句是因为我不希望有两个间隔来引用同一日期。因此,输出间隔由较大的of (a.startTime,b.startTime)和较小的(a.endTime,b.endTime)定义。

我在这里看到了一个问题,因为在这种情况下,语句只计算来自2个表的startDate和endDate间隔,而不是3、4个或更多的表。

如何解决这个问题呢?

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2022-05-21 05:22:10

您是否有兴趣使用SqlServer的geometry数据类型来表示时间段?在这里,我将其应用于您的示例:

代码语言:javascript
复制
WITH c (clientCode, [name], Perd) AS (
    SELECT clientCode, [name],
           Perd=geometry::STGeomFromText('LINESTRING (' + format(startdate,'yyyyMMdd')+' 0, '+
                      format(LEAD(startDate, 1, {d '2099-12-31'}) OVER (
                               PARTITION BY clientCode
                               ORDER BY startDate) , 'yyyyMMdd') +' 0)', 0)
    FROM #Clients),

--- Projects:
     p (projectCode, clientCode, [name], Perd) AS (
    SELECT projectCode, clientCode, [name], 
           Perd=geometry::STGeomFromText('LINESTRING (' + format(startdate,'yyyyMMdd')+' 0, '+
                       format(LEAD(startDate, 1, {d '2099-12-31'}) OVER (
                                PARTITION BY projectCode
                                ORDER BY startDate) , 'yyyyMMdd') +' 0)', 0)
    FROM #Projects)
SELECT c.clientCode, c.[name] AS clientName,
       p.projectCode, p.[name] AS projectName,
       startDate=try_cast(format(c.Perd.STIntersection(p.Perd).STEndPoint().STX ,'########') as date),
       endDate=try_cast(format(c.Perd.STIntersection(p.Perd).STStartPoint().STX, '########') as date)
FROM 
    c
    inner join
    p on
    c.clientCode=p.clientCode AND p.Perd.STIntersection(c.Perd).STLength()>0
order by 1,5

这可以更容易地将nest作为子查询,并连接到另一个时态表。

不过,我可以想象,对于非常大的数据集,这种情况不会很快发生。

票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/72312128

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档