文章/答案/技术大牛

发布

社区首页 >问答首页 >如何使用SQL连接具有分组和大小写的两行

问如何使用SQL连接具有分组和大小写的两行
EN

Stack Overflow用户

提问于 2017-11-20 21:20:31

回答 1查看 47关注 0票数 0

我有一些在SQLServer2016Express中的基因组数据，它目前是用一个参考基因组和由一个SubjectID、基因和密码子(例如一个三元组)分割的参考基因组和测试基因组的长格式形成的。

我真正需要的是将我的数据重组成一个元组连接在一起的数据，但只有当元组中存在一个突变(与参考基因组相比)时。对于每个人来说，这将是一个更有用的格式。

我的数据是这样的

DECLARE @myTable TABLE 
(
     SubjectID VARCHAR(MAX), 
     country VARCHAR(MAX), 
     gene VARCHAR(MAX), 
     position INT, 
     ReferenceNucleotide VARCHAR(1), 
     TestNucleotide VARCHAR(1), 
     codon INT, 
     nucleotide_order INT 
)

INSERT INTO @myTable 
VALUES
('1-0002','India','gyrA',   65,'A','x', 92,1),
('1-0002','India','gyrA',   66,'T','x', 92,2),
('1-0002','India','gyrA',   67,'C','C', 92,3),
('1-0002','India','gyrA',   68,'T','T', 93,1),
('1-0002','India','gyrA',   69,'A','A', 93,2),
('1-0002','India','gyrA',   70,'C','C', 93,3),
('1-0002','India','gyrA',   71,'G','G', 94,1),
('1-0002','India','gyrA',   72,'A','A', 94,2),
('1-0002','India','gyrA',   73,'C','C', 94,3),
('1-0002','India','gyrA',   74,'A','A', 95,1),
('1-0002','India','gyrA',   75,'G','C', 95,2),
('1-0002','India','gyrA',   76,'C','C', 95,3),
('1-0002','India','gyrA',   77,'C','C', 96,1),
('1-0002','India','gyrA',   78,'T','T', 96,2),
('1-0002','India','gyrA',   79,'G','N', 96,3)

然而，有几个条件

如果所有三个核苷酸都是相同的，作为参考和测试基因组，我想要一个'WT‘
如果核苷酸有什么不同，我想要测试基因组中的三元组(按核苷酸顺序)。
我需要按SubjectID和基因分组，因为我有很多学科和基因

我的结果看起来就像

1-0002 India gyrA 92 xxC
1-0002 India gyrA 93 WT
1-0002 India gyrA 94 WT
1-0002 India gyrA 95 ACC
1-0002 India gyrA 96 CTN

我可以确定密码子在哪里，需要有三元组，但我很难把它们连在一起。

DECLARE @myCodons TABLE (SubjectID varchar(max), country varchar(max), gene  varchar(max), codon int, WT int)

INSERT INTO @myCodons
    SELECT 
        SubjectID, country, gene, codon,
        SUM(CASE WHEN RefNucleotide=TestNucleotide THEN 0 ELSE 1 END) AS WT
    FROM 
        @myTable
    GROUP BY 
        SubjectID, country, gene, codon 

SELECT * 
FROM @myCodons 
ORDER BY codon

tsql

group-by

string-concatenation

sql-server

回答 1

Stack Overflow用户

发布于 2017-11-20 22:41:13

一种可能有用的替代办法：

select SubjectID, country, gene, codon, case when RefGenomeStr = TestGenomeStr then 'WT' else TestGenomeStr end wanted_string
from @myTable t1
cross apply(
    SELECT
        STUFF((
              SELECT
                    ', ' +RefGenome
              FROM @myTable t2
              WHERE t2.SubjectID= t1.SubjectID and t2.country = t1.country and  t2.gene = t1.gene and t2.codon = t1.codon
              FOR XML PATH ('')
              )
             , 1, 1, '')
      , STUFF((
              SELECT
                    ', ' +TestGenome
              FROM @myTable t2
              WHERE t2.SubjectID= t1.SubjectID and t2.country = t1.country and  t2.gene = t1.gene and t2.codon = t1.codon
              FOR XML PATH ('')
              )
             , 1, 1, '')
         ) ca (RefGenomeStr,TestGenomeStr)    
where nucleotide_order = 1

结果：

+----+-----------+---------+------+-------+---------------+
|    | SubjectID | country | gene | codon | wanted_string |
+----+-----------+---------+------+-------+---------------+
|  1 | 1-0002    | India   | gyrA |    92 |  x, x, C      |
|  2 | 1-0002    | India   | gyrA |    93 | WT            |
|  3 | 1-0002    | India   | gyrA |    94 | WT            |
|  4 | 1-0002    | India   | gyrA |    95 |  A, C, C      |
|  5 | 1-0002    | India   | gyrA |    96 |  C, T, N      |
+----+-----------+---------+------+-------+---------------+

Demo

票数 0

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/47401245

复制

相似问题

问如何使用SQL连接具有分组和大小写的两行
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何使用SQL连接具有分组和大小写的两行EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何使用SQL连接具有分组和大小写的两行
EN