
Retrieve data from SQL Server and concatenate results across rows based on grouping

Stack Overflow user
Asked 2016-01-09 05:52:56
2 answers, 81 views, 0 followers, 0 votes

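I had been working on a problem for a few days and finally came up with a solution that works for me. In case the solution is useful to others, I am asking the question and answering it myself.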

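I have read-only access to a large SQL Server database containing more than one million records. Some of the tables in the database are linked in many-to-many relationships via lookup tables. To simplify matters, the tables look like this: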

table names
|-----------|
| id | name |
|----|------|
|  1 | dave |
|  2 | phil |
|  3 | john |       table food_relationships        table clothes_relationships
|  4 | pete |       |--------------------------|    |----------------------------|
|-----------|       | id | names_id | foods_id |    | id | names_id | clothes_id |
                    |----|----------|----------|    |----|----------|------------|
table foods         |  1 |        1 |        1 |    |  1 |        1 |          1 |
|---------------|   |  2 |        1 |        3 |    |  2 |        1 |          3 |
| id | food     |   |  3 |        1 |        4 |    |  3 |        1 |          4 |
|----|----------|   |  4 |        2 |        2 |    |  4 |        2 |          2 |
|  1 | beef     |   |  5 |        2 |        3 |    |  5 |        2 |          3 |
|  2 | tomatoes |   |  6 |        2 |        4 |    |  6 |        2 |          4 |
|  3 | bacon    |   |  7 |        2 |        5 |    |  7 |        3 |          1 |
|  4 | cheese   |   |  8 |        3 |        3 |    |  8 |        3 |          3 |
|  5 | apples   |   |  9 |        3 |        5 |    |  9 |        3 |          5 |
|---------------|   | 10 |        4 |        1 |    | 10 |        4 |          2 |
                    | 11 |        4 |        2 |    | 11 |        4 |          4 |
table clothes       | 12 |        4 |        3 |    | 12 |        4 |          5 |
|---------------|   | 13 |        4 |        5 |    |----------------------------|
| id | clothes  |   |--------------------------|
|----|----------|
|  1 | trousers |
|  2 | shorts   |
|  3 | shirt    |
|  4 | socks    |
|  5 | jumper   |
|  6 | jacket   |
|---------------|

These tables can be recreated with the following SQL (adapted from a MySQL database, so it may need some tweaking to work in SQL Server):

CREATE TABLE `clothes` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `clothes` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO `clothes` (`id`, `clothes`)
VALUES
    (1,'trousers'),
    (2,'shorts'),
    (3,'shirt'),
    (4,'socks'),
    (5,'jumper'),
    (6,'jacket');

CREATE TABLE `clothes_relationships` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `names_id` int(11) DEFAULT NULL,
  `clothes_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO `clothes_relationships` (`id`, `names_id`, `clothes_id`)
VALUES
    (1,1,1),
    (2,1,3),
    (3,1,4),
    (4,2,2),
    (5,2,3),
    (6,2,4),
    (7,3,1),
    (8,3,3),
    (9,3,5),
    (10,4,2),
    (11,4,4),
    (12,4,5);

CREATE TABLE `food_relationships` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `names_id` int(11) DEFAULT NULL,
  `foods_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO `food_relationships` (`id`, `names_id`, `foods_id`)
VALUES
    (1,1,1),
    (2,1,3),
    (3,1,4),
    (4,2,2),
    (5,2,3),
    (6,2,4),
    (7,2,5),
    (8,3,3),
    (9,3,5),
    (10,4,1),
    (11,4,2),
    (12,4,3),
    (13,4,5);

CREATE TABLE `foods` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `food` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO `foods` (`id`, `food`)
VALUES
    (1,'beef'),
    (2,'tomatoes'),
    (3,'bacon'),
    (4,'cheese'),
    (5,'apples');

CREATE TABLE `names` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

INSERT INTO `names` (`id`, `name`)
VALUES
    (1,'dave'),
    (2,'phil'),
    (3,'john'),
    (4,'pete');

I would like to query the database and somehow get the following output:

|-------------------------------------------------------------|
| name | food                         | clothes               |
|------|------------------------------|-----------------------|
| dave | beef,cheese,bacon            | trousers,socks,shirt  |
| john | apples,bacon                 | jumper,shirt,trousers |
| pete | beef,apples,bacon,tomatoes   | shorts,jumper,socks   |
| phil | bacon,tomatoes,apples,cheese | shirt,shorts,socks    |
|-------------------------------------------------------------|

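However, running a SELECT query that joins the 'names' table to one or both of the other tables (via their respective lookup tables) produces multiple rows per name. For example: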

SELECT
    names.name,
    foods.food

FROM
    names
    LEFT JOIN food_relationships ON names.id = food_relationships.names_id
    LEFT JOIN foods ON food_relationships.foods_id = foods.id;

...produces the following set of results:

|-----------------|
| name | food     |
|------|----------|
| dave | beef     |
| dave | bacon    |
| dave | cheese   |
| phil | tomatoes |
| phil | bacon    |
| phil | cheese   |
| phil | apples   |
| john | bacon    |
| john | apples   |
| pete | beef     |
| pete | tomatoes |
| pete | bacon    |
| pete | apples   |
|-----------------|

The problem is compounded if the SELECT query returns data from both tables:

SELECT
    names.name,
    foods.food,
    clothes.clothes

FROM
    names
    LEFT JOIN food_relationships ON names.id = food_relationships.names_id
    LEFT JOIN foods ON food_relationships.foods_id = foods.id
    LEFT JOIN clothes_relationships ON names.id = clothes_relationships.names_id
    LEFT JOIN clothes ON clothes_relationships.clothes_id = clothes.id;

|-----------------------------|
| name | food     | clothes   |
|------|----------|-----------|
| dave | beef     | trousers  |
| dave | beef     | shirt     |
| dave | beef     | socks     |
| dave | bacon    | trousers  |
| dave | bacon    | shirt     |
| dave | bacon    | socks     |
| dave | cheese   | trousers  |
| dave | cheese   | shirt     |
| dave | cheese   | socks     |
| phil | tomatoes | shorts    |
| phil | tomatoes | shirt     |
| phil | tomatoes | socks     |
| phil | bacon    | shorts    |
| phil | bacon    | shirt     |
| phil | bacon    | socks     |
| phil | cheese   | shorts    |
| phil | cheese   | shirt     |
| phil | cheese   | socks     |
| phil | apples   | shorts    |
| phil | apples   | shirt     |
| phil | apples   | socks     |
| ...
| etc.
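
The row multiplication above can be reproduced locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the real server (an assumption made purely for illustration). Only the `names` table and the two link tables are loaded, since they alone determine how many rows the double join yields per person.

```python
import sqlite3

# In-memory SQLite stands in for the real server (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE food_relationships (id INTEGER PRIMARY KEY, names_id INTEGER, foods_id INTEGER);
    CREATE TABLE clothes_relationships (id INTEGER PRIMARY KEY, names_id INTEGER, clothes_id INTEGER);
    INSERT INTO names VALUES (1,'dave'),(2,'phil'),(3,'john'),(4,'pete');
    INSERT INTO food_relationships VALUES
        (1,1,1),(2,1,3),(3,1,4),(4,2,2),(5,2,3),(6,2,4),(7,2,5),
        (8,3,3),(9,3,5),(10,4,1),(11,4,2),(12,4,3),(13,4,5);
    INSERT INTO clothes_relationships VALUES
        (1,1,1),(2,1,3),(3,1,4),(4,2,2),(5,2,3),(6,2,4),
        (7,3,1),(8,3,3),(9,3,5),(10,4,2),(11,4,4),(12,4,5);
""")

# Count the rows that the two-way join produces for each person.
rows_per_name = dict(conn.execute("""
    SELECT names.name, COUNT(*)
    FROM names
    LEFT JOIN food_relationships    ON names.id = food_relationships.names_id
    LEFT JOIN clothes_relationships ON names.id = clothes_relationships.names_id
    GROUP BY names.name
""").fetchall())
conn.close()

print(rows_per_name)
```

Each person ends up with (number of foods) x (number of clothes) rows, for example 3 x 3 = 9 for dave and 4 x 3 = 12 for phil, which is why every food and every item of clothing appears several times.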

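The question is: how can I query the SQL Server database to retrieve all the data, but process it so that there is only one row of data per person?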

2 Answers

Stack Overflow user

Accepted answer

Posted 2016-01-09 05:52:56

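If the database were MySQL, the solution would be relatively easy, because MySQL has the GROUP_CONCAT function, which concatenates values across rows. So, for one of the tables, I could use: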

SELECT
    names.name,
    GROUP_CONCAT(foods.food)

FROM
    names
    LEFT JOIN food_relationships ON names.id = food_relationships.names_id
    LEFT JOIN foods ON food_relationships.foods_id = foods.id

GROUP BY (names.name);

...to give:

name    food
dave    beef,cheese,bacon
john    apples,bacon
pete    beef,apples,bacon,tomatoes
phil    bacon,tomatoes,apples,cheese

To get the equivalent data from both the 'foods' and 'clothes' tables, I could use the following:

SELECT
    temp_foods_table.name               AS 'name',
    temp_foods_table.food               AS 'food',
    temp_clothes_table.clothes          AS 'clothes'

FROM
(
    SELECT
        names.name,
        GROUP_CONCAT(foods.food)        AS 'food'

    FROM
        names
        LEFT JOIN food_relationships ON names.id = food_relationships.names_id
        LEFT JOIN foods ON food_relationships.foods_id = foods.id

    GROUP BY (names.name)

) AS temp_foods_table

LEFT JOIN

(
    SELECT
        names.name,
        GROUP_CONCAT(clothes.clothes)    AS 'clothes'

    FROM
        names
        LEFT JOIN clothes_relationships ON names.id = clothes_relationships.names_id
        LEFT JOIN clothes ON clothes_relationships.clothes_id = clothes.id

    GROUP BY (names.name)

) AS temp_clothes_table

ON temp_foods_table.name = temp_clothes_table.name;

...to give the following result:

name    food                            clothes
dave    beef,cheese,bacon               trousers,socks,shirt
john    apples,bacon                    jumper,shirt,trousers
pete    beef,apples,bacon,tomatoes      shorts,jumper,socks
phil    bacon,tomatoes,apples,cheese    shirt,shorts,socks
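
For anyone without a MySQL instance to hand, the two-subquery pattern above can be tried in SQLite, which also provides a GROUP_CONCAT aggregate. This is only a sketch under that assumption: the data is a cut-down copy of the sample tables (two people), the aliases `f` and `c` are shortened for brevity, and the order of items inside each concatenated string is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE foods   (id INTEGER PRIMARY KEY, food TEXT);
    CREATE TABLE clothes (id INTEGER PRIMARY KEY, clothes TEXT);
    CREATE TABLE food_relationships    (names_id INTEGER, foods_id INTEGER);
    CREATE TABLE clothes_relationships (names_id INTEGER, clothes_id INTEGER);
    INSERT INTO names   VALUES (1,'dave'),(2,'phil');
    INSERT INTO foods   VALUES (1,'beef'),(2,'tomatoes'),(3,'bacon'),(4,'cheese'),(5,'apples');
    INSERT INTO clothes VALUES (1,'trousers'),(2,'shorts'),(3,'shirt'),(4,'socks');
    INSERT INTO food_relationships    VALUES (1,1),(1,3),(1,4),(2,2),(2,3),(2,4),(2,5);
    INSERT INTO clothes_relationships VALUES (1,1),(1,3),(1,4),(2,2),(2,3),(2,4);
""")

# Aggregate each many-to-many relationship separately, then join the two
# one-row-per-person results, so no duplicates are ever generated.
query = """
    SELECT f.name, f.food, c.clothes
    FROM (
        SELECT names.name, GROUP_CONCAT(foods.food) AS food
        FROM names
        LEFT JOIN food_relationships ON names.id = food_relationships.names_id
        LEFT JOIN foods ON food_relationships.foods_id = foods.id
        GROUP BY names.name
    ) AS f
    LEFT JOIN (
        SELECT names.name, GROUP_CONCAT(clothes.clothes) AS clothes
        FROM names
        LEFT JOIN clothes_relationships ON names.id = clothes_relationships.names_id
        LEFT JOIN clothes ON clothes_relationships.clothes_id = clothes.id
        GROUP BY names.name
    ) AS c ON f.name = c.name
    ORDER BY f.name
"""
result = conn.execute(query).fetchall()
conn.close()

for name, food, clothes in result:
    print(name, "|", food, "|", clothes)
```

Because each subquery is grouped before the join, the outer query only ever pairs one food row with one clothes row per person.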

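In SQL Server, however, things seem less straightforward. For a single table, several solutions are suggested online, including Common Table Expressions and FOR XML PATH. However, they all give the distinct impression of being workarounds rather than purpose-designed features, and each has drawbacks (for example, the FOR XML PATH solution treats the text as XML, so special characters in the text can cause problems). In addition, some commenters voiced concern that such workarounds rely on undocumented or deprecated features and may therefore be unreliable in the long term.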
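
To make the special-character concern concrete, here is a small Python illustration of XML entity escaping. It is not SQL Server code, and the food names are hypothetical; it only shows the kind of substitution that happens when text containing `&`, `<` or `>` is serialized as XML.

```python
from xml.sax.saxutils import escape

# Hypothetical values; any text containing &, < or > is affected.
foods = ["fish & chips", "mac <and> cheese"]

# Roughly what an XML serializer does to text before it is concatenated.
as_xml_text = ",".join(escape(food) for food in foods)
print(as_xml_text)  # fish &amp; chips,mac &lt;and&gt; cheese
```

A concatenated string built this way no longer matches the stored values unless the entities are decoded again.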

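So I decided not to tie myself in SQL knots and instead to post-process the data after retrieval using Python and Pandas. I always transfer the data to a Pandas DataFrame anyway for plotting and analysis, so this was not a big inconvenience. To concatenate the data across multiple columns, I used groupby(). However, because there are two many-to-many tables, each column contains duplicates, so the final concatenated strings included all of those duplicates. To keep only unique values, I used Python sets (which, by definition, can only contain unique values). The only potential drawback of this approach is that the order of the strings is not preserved, but that is not a problem in my case. The final Python solution looks like this: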

Import the necessary libraries:

>>> import pandas as pd
>>> import pymssql
>>> import getpass

Enter the details needed to connect to the database:

>>> myServer = input("Enter server address: ")
>>> myUser = input("Enter username: ")
>>> myPwd = getpass.getpass("Enter password: ")

Create a connection:

>>> myConnection = pymssql.connect(server=myServer, user=myUser, password=myPwd, port='1433')

Define the query that retrieves the necessary data:

>>> myQuery = """SELECT
                         names.name,
                         foods.food,
                         clothes.clothes

                     FROM
                         names
                         LEFT JOIN food_relationships ON names.id = food_relationships.names_id
                         LEFT JOIN foods ON food_relationships.foods_id = foods.id
                         LEFT JOIN clothes_relationships ON names.id = clothes_relationships.names_id
                         LEFT JOIN clothes ON clothes_relationships.clothes_id = clothes.id """

Run the query, put the results into a DataFrame and close the connection:

>>> myLatestData = pd.io.sql.read_sql(myQuery, con=myConnection)
>>> myConnection.close()

Concatenate the strings across multiple rows and remove the duplicates:

>>> tempDF = myLatestData.groupby('name').agg(lambda col: ','.join(set(col)))

Print the final data:

>>> print(tempDF)

name                          food                clothes
dave             beef,bacon,cheese   socks,trousers,shirt
john                  bacon,apples  jumper,trousers,shirt
pete    tomatoes,beef,bacon,apples    socks,jumper,shorts
phil  tomatoes,bacon,cheese,apples     socks,shorts,shirt
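
If the order of the items does matter, a small helper can replace `set()`. The sketch below is one possible variant, not part of the original solution: `join_unique` is a hypothetical name, and it additionally skips missing values (None or NaN), which a LEFT JOIN can produce for people with no foods or clothes.

```python
def join_unique(values, sep=","):
    """Join values with sep, dropping duplicates and missing entries
    while keeping first-seen order (dicts preserve insertion order in Python 3.7+)."""
    # v == v is False only for NaN, so this filters both None and NaN.
    present = (v for v in values if v is not None and v == v)
    return sep.join(dict.fromkeys(str(v) for v in present))


# Duplicated values as produced by the double join for one person.
dave_foods = ["beef", "beef", "beef", "bacon", "bacon", "bacon",
              "cheese", "cheese", "cheese"]
print(join_unique(dave_foods))  # beef,bacon,cheese
```

With pandas, the helper could be passed straight to the aggregation step, for example `myLatestData.groupby('name').agg(join_unique)`.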

For me, this solution is far more intuitive than trying to do all of the data processing inside the SQL query. I hope it helps someone else.

Votes: 0

Stack Overflow user

Posted 2016-01-09 09:29:59

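If it is...

You can use the STUFF function. For example:

DECLARE @Heroes TABLE ( HeroName VARCHAR(20) )

INSERT INTO @Heroes ( HeroName ) VALUES ( 'Superman' ), ( 'Batman' ), ( 'Ironman' ), ( 'Wolverine' )

SELECT STUFF((SELECT ',' + HeroName FROM @Heroes ORDER BY HeroName FOR XML PATH('')), 1, 1, '') AS [Output]

Output

Batman,Ironman,Superman,Wolverine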
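
For readers unfamiliar with the STUFF / FOR XML PATH idiom, it works in two steps: the inner query prefixes every value with a comma and concatenates them, and STUFF then removes the first character. A rough Python analogy (not T-SQL, and only a sketch of the idea) looks like this:

```python
heroes = ["Superman", "Batman", "Ironman", "Wolverine"]

# Inner query: prefix every name with a comma and concatenate, in sorted order.
prefixed = "".join("," + name for name in sorted(heroes))  # ",Batman,Ironman,..."

# STUFF(prefixed, 1, 1, ''): replace 1 character starting at position 1 with ''.
output = prefixed[1:]
print(output)  # Batman,Ironman,Superman,Wolverine
```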

I think this should answer your question.

Thanks

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/34690199
