Q: Pandas: flatten a tree structure

Stack Overflow user

Asked 2021-03-24 15:18:45

Answers: 3, Views: 495, Followers: 0, Score: 2

I have a category tree represented as follows.

import pandas as pd

asset_tree = [
    {'id': 1, 'name': 'Linear Asset', 'parent_id': -1},
    {'id': 2, 'name': 'Lateral', 'parent_id': 1},
    {'id': 3, 'name': 'Main', 'parent_id': 1},
    {'id': 4, 'name': 'Point Asset', 'parent_id': -1},
    {'id': 5, 'name': 'Fountain', 'parent_id': 4},
    {'id': 6, 'name': 'Hydrant', 'parent_id': 4}
]
tree = pd.DataFrame(asset_tree)
print(tree)

This gives me the following DataFrame:

   id          name  parent_id
0   1  Linear Asset         -1
1   2       Lateral          1
2   3          Main          1
3   4   Point Asset         -1
4   5      Fountain          4
5   6       Hydrant          4

The top-level nodes of the tree have parent_id equal to -1, so the tree can be drawn as follows:

Linear Asset
   | - Lateral
   | - Main
Point Asset
   | - Fountain
   | - Hydrant

I need to generate the following DataFrame.

   id          name  parent_id  flat_name
0   1  Linear Asset         -1  Linear Asset
1   2       Lateral          1  Linear Asset : Lateral
2   3          Main          1  Linear Asset : Main
3   4   Point Asset         -1  Point Asset
4   5      Fountain          4  Point Asset : Fountain
5   6       Hydrant          4  Point Asset : Hydrant

The tree is generated dynamically and can have any number of levels, so the following tree

asset_tree = [
    {'id': 1, 'name': 'Linear Asset', 'parent_id': -1},
    {'id': 2, 'name': 'Lateral', 'parent_id': 1},
    {'id': 3, 'name': 'Main', 'parent_id': 1},
    {'id': 4, 'name': 'Point Asset', 'parent_id': -1},
    {'id': 5, 'name': 'Fountain', 'parent_id': 4},
    {'id': 6, 'name': 'Hydrant', 'parent_id': 4},
    {'id': 7, 'name': 'Steel', 'parent_id': 2},
    {'id': 8, 'name': 'Plastic', 'parent_id': 2},
    {'id': 9, 'name': 'Steel', 'parent_id': 3},
    {'id': 10, 'name': 'Plastic', 'parent_id': 3}
]

should produce the following result:

   id          name  parent_id  flat_name
0   1  Linear Asset         -1  Linear Asset
1   2       Lateral          1  Linear Asset : Lateral
2   3          Main          1  Linear Asset : Main
3   4   Point Asset         -1  Point Asset
4   5      Fountain          4  Point Asset : Fountain
5   6       Hydrant          4  Point Asset : Hydrant
6   7         Steel          2  Linear Asset : Lateral : Steel
7   8       Plastic          2  Linear Asset : Lateral : Plastic
8   9         Steel          3  Linear Asset : Main : Steel
9  10       Plastic          3  Linear Asset : Main : Plastic

python

pandas

3 Answers

Stack Overflow user

Accepted answer

Answered 2021-03-24 15:35:30

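Here is a recursive function, meant to be used with apply, that does this. It takes an id and returns that node's "path" through the tree (df here is the DataFrame built from asset_tree, which the question calls tree):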

def flatname(ID):
    row = df[df['id'] == ID].squeeze()
    if row['parent_id'] == -1:
        return row['name']
    else:
        return flatname(row['parent_id']) + ' : ' + row['name']

To use it, call:

df['flat_name'] = df['id'].apply(flatname)

df after applying this to the second example:

   id          name  parent_id                         flat_name
0   1  Linear Asset         -1                      Linear Asset
1   2       Lateral          1            Linear Asset : Lateral
2   3          Main          1               Linear Asset : Main
3   4   Point Asset         -1                       Point Asset
4   5      Fountain          4            Point Asset : Fountain
5   6       Hydrant          4             Point Asset : Hydrant
6   7         Steel          2    Linear Asset : Lateral : Steel
7   8       Plastic          2  Linear Asset : Lateral : Plastic
8   9         Steel          3       Linear Asset : Main : Steel
9  10       Plastic          3     Linear Asset : Main : Plastic

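The OP pointed out that the function above explicitly references the df variable, which is defined outside the function's scope. If you name your DataFrame something else, or want to call the function on several DataFrames, this can cause problems. One fix is to turn the apply function into a private helper and add an outer, more user-friendly function that calls it: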

def _flatname_recurse(ID, df):
    row = df[df['id'] == ID].squeeze()
    if row['parent_id'] == -1:
        return row['name']
    else:
        return _flatname_recurse(row['parent_id'], df=df) + ' : ' + row['name']

# asset_df to specify we are looking for a specific kind of df
def flatnames(asset_df):
    return asset_df['id'].apply(_flatname_recurse, df=asset_df)

Then call:

df['flat_name'] = flatnames(df)
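
As a quick check of the helper-plus-wrapper design, the sketch below reuses the two functions from this answer on two separately named DataFrames. The variable names small_tree and big_tree are illustrative assumptions and do not appear in the original answer; the data is the question's two example trees.

```python
import pandas as pd

def _flatname_recurse(ID, df):
    # Select the single row whose id matches and squeeze it into a Series
    row = df[df['id'] == ID].squeeze()
    if row['parent_id'] == -1:
        return row['name']
    return _flatname_recurse(row['parent_id'], df=df) + ' : ' + row['name']

def flatnames(asset_df):
    # Extra keyword arguments to Series.apply are forwarded to the function
    return asset_df['id'].apply(_flatname_recurse, df=asset_df)

# Hypothetical names: the first and second example trees from the question
small_tree = pd.DataFrame([
    {'id': 1, 'name': 'Linear Asset', 'parent_id': -1},
    {'id': 2, 'name': 'Lateral', 'parent_id': 1},
    {'id': 3, 'name': 'Main', 'parent_id': 1},
    {'id': 4, 'name': 'Point Asset', 'parent_id': -1},
    {'id': 5, 'name': 'Fountain', 'parent_id': 4},
    {'id': 6, 'name': 'Hydrant', 'parent_id': 4},
])
big_tree = pd.concat([small_tree, pd.DataFrame([
    {'id': 7, 'name': 'Steel', 'parent_id': 2},
    {'id': 8, 'name': 'Plastic', 'parent_id': 2},
    {'id': 9, 'name': 'Steel', 'parent_id': 3},
    {'id': 10, 'name': 'Plastic', 'parent_id': 3},
])], ignore_index=True)

# Neither call depends on a global variable named df
small_tree['flat_name'] = flatnames(small_tree)
big_tree['flat_name'] = flatnames(big_tree)
print(big_tree)
```

Because the DataFrame is passed in explicitly, the same wrapper works regardless of what the caller names the variable.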

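Also note that I previously used row = df.iloc[ID - 1, :] to identify the row. That worked in this case, but it relies on each id being exactly one greater than the row's positional index. The approach shown above, which filters on the id column, is more general.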
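
To illustrate why the positional lookup is fragile, here is a minimal sketch using a deliberately reordered frame. The name shuffled and the row order are assumptions made up for this illustration; they are not part of the original answer.

```python
import pandas as pd

# A few rows from the first example, deliberately out of id order
shuffled = pd.DataFrame([
    {'id': 4, 'name': 'Point Asset', 'parent_id': -1},
    {'id': 1, 'name': 'Linear Asset', 'parent_id': -1},
    {'id': 2, 'name': 'Lateral', 'parent_id': 1},
])

ID = 1
# Positional lookup: takes whatever row happens to sit at position ID - 1
by_position = shuffled.iloc[ID - 1, :]
# Value lookup: takes the row whose id column actually equals ID
by_value = shuffled[shuffled['id'] == ID].squeeze()

print(by_position['name'])  # Point Asset (the wrong row)
print(by_value['name'])     # Linear Asset
```

The positional version only agrees with the value-based one when ids start at 1, are contiguous, and the rows are sorted by id.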

Score: 4

Stack Overflow user

Answered 2021-03-24 15:31:48

You can use recursion to find the path through the parent ids:

import pandas as pd
asset_tree = [{'id': 1, 'name': 'Linear Asset', 'parent_id': -1}, {'id': 2, 'name': 'Lateral', 'parent_id': 1}, {'id': 3, 'name': 'Main', 'parent_id': 1}, {'id': 4, 'name': 'Point Asset', 'parent_id': -1}, {'id': 5, 'name': 'Fountain', 'parent_id': 4}, {'id': 6, 'name': 'Hydrant', 'parent_id': 4}]
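a_tree = {i['id']: i for i in asset_tree}  # dict keyed by id for more efficient lookup

def get_parent(d, c=[]):
    # c collects names from the node up to the root; it is never mutated in place.
    # The walrus operator (:=) requires Python 3.8 or later.
    if (k := a_tree.get(d['parent_id'])) is None:
        return c + [d['name']]
    return get_parent(k, c + [d['name']])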

r = [{**i, 'flat_name':' : '.join(get_parent(i)[::-1])} for i in asset_tree]
df = pd.DataFrame(r)

Output:

   id          name  parent_id               flat_name
0   1  Linear Asset         -1            Linear Asset
1   2       Lateral          1  Linear Asset : Lateral
2   3          Main          1     Linear Asset : Main
3   4   Point Asset         -1             Point Asset
4   5      Fountain          4  Point Asset : Fountain
5   6       Hydrant          4   Point Asset : Hydrant

On your larger asset_tree:

asset_tree = [{'id': 1, 'name': 'Linear Asset', 'parent_id': -1}, {'id': 2, 'name': 'Lateral', 'parent_id': 1}, {'id': 3, 'name': 'Main', 'parent_id': 1}, {'id': 4, 'name': 'Point Asset', 'parent_id': -1}, {'id': 5, 'name': 'Fountain', 'parent_id': 4}, {'id': 6, 'name': 'Hydrant', 'parent_id': 4}, {'id': 7, 'name': 'Steel', 'parent_id': 2}, {'id': 8, 'name': 'Plastic', 'parent_id': 2}, {'id': 9, 'name': 'Steel', 'parent_id': 3}, {'id': 10, 'name': 'Plastic', 'parent_id': 3}]
a_tree = {i['id']:i for i in asset_tree}
r = [{**i, 'flat_name':' : '.join(get_parent(i)[::-1])} for i in asset_tree]
df = pd.DataFrame(r)

Output:

   id          name  parent_id                         flat_name
0   1  Linear Asset         -1                      Linear Asset
1   2       Lateral          1            Linear Asset : Lateral
2   3          Main          1               Linear Asset : Main
3   4   Point Asset         -1                       Point Asset
4   5      Fountain          4            Point Asset : Fountain
5   6       Hydrant          4             Point Asset : Hydrant
6   7         Steel          2    Linear Asset : Lateral : Steel
7   8       Plastic          2  Linear Asset : Lateral : Plastic
8   9         Steel          3       Linear Asset : Main : Steel
9  10       Plastic          3     Linear Asset : Main : Plastic
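
The same dictionary-lookup idea can also be packaged without a module-level lookup table or recursion. The following is only a sketch under stated assumptions: add_flat_names and its sep parameter are hypothetical names introduced here, and the input is assumed to be acyclic (a cycle in parent_id would make the loop run forever).

```python
import pandas as pd

def add_flat_names(records, sep=' : '):
    """Hypothetical helper: return a DataFrame with an added flat_name column."""
    # Lookup table built locally, so nothing depends on a global variable
    by_id = {rec['id']: rec for rec in records}
    rows = []
    for rec in records:
        parts = []
        node = rec
        # Walk upward until parent_id is not a known id (for example -1)
        while node is not None:
            parts.append(node['name'])
            node = by_id.get(node['parent_id'])
        # Names were collected leaf-first, so reverse them before joining
        rows.append({**rec, 'flat_name': sep.join(reversed(parts))})
    return pd.DataFrame(rows)

asset_tree = [
    {'id': 1, 'name': 'Linear Asset', 'parent_id': -1},
    {'id': 2, 'name': 'Lateral', 'parent_id': 1},
    {'id': 3, 'name': 'Main', 'parent_id': 1},
    {'id': 4, 'name': 'Point Asset', 'parent_id': -1},
    {'id': 5, 'name': 'Fountain', 'parent_id': 4},
    {'id': 6, 'name': 'Hydrant', 'parent_id': 4},
    {'id': 7, 'name': 'Steel', 'parent_id': 2},
    {'id': 8, 'name': 'Plastic', 'parent_id': 2},
    {'id': 9, 'name': 'Steel', 'parent_id': 3},
    {'id': 10, 'name': 'Plastic', 'parent_id': 3},
]
result = add_flat_names(asset_tree)
print(result)
```

Using a loop instead of recursion sidesteps Python's recursion limit on very deep trees, at the cost of slightly more bookkeeping.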

Score: 2

Stack Overflow user

Answered 2021-03-24 15:32:19

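This is a network (graph) problem, so try networkx. Here tree is the DataFrame from the question, built from the second, larger asset_tree: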

import networkx as nx

# build the graph
G = nx.from_pandas_edgelist(tree, source='parent_id', target='id',
                            create_using=nx.DiGraph)

# map id to name
node_names = tree.set_index('id')['name'].to_dict()

# get path from root (-1) to the node
def get_path(node):
    # this is a tree, so exactly one simple path for each node
    for path in nx.all_simple_paths(G, -1, node):
        return ' : '.join(node_names.get(i) for i in path[1:])

tree['flat_name'] = tree['id'].apply(get_path)

Output:

   id          name  parent_id                         flat_name
0   1  Linear Asset         -1                      Linear Asset
1   2       Lateral          1            Linear Asset : Lateral
2   3          Main          1               Linear Asset : Main
3   4   Point Asset         -1                       Point Asset
4   5      Fountain          4            Point Asset : Fountain
5   6       Hydrant          4             Point Asset : Hydrant
6   7         Steel          2    Linear Asset : Lateral : Steel
7   8       Plastic          2  Linear Asset : Lateral : Plastic
8   9         Steel          3       Linear Asset : Main : Steel
9  10       Plastic          3     Linear Asset : Main : Plastic
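
Since every node in a tree has exactly one path from the root, a possible variation is to compute all root-to-node paths in a single call rather than once per row. The sketch below assumes networkx's single_source_shortest_path and swaps it in for the per-node path search; it is not the answerer's code, and the variable paths is introduced only for this illustration.

```python
import networkx as nx
import pandas as pd

tree = pd.DataFrame([
    {'id': 1, 'name': 'Linear Asset', 'parent_id': -1},
    {'id': 2, 'name': 'Lateral', 'parent_id': 1},
    {'id': 3, 'name': 'Main', 'parent_id': 1},
    {'id': 4, 'name': 'Point Asset', 'parent_id': -1},
    {'id': 5, 'name': 'Fountain', 'parent_id': 4},
    {'id': 6, 'name': 'Hydrant', 'parent_id': 4},
    {'id': 7, 'name': 'Steel', 'parent_id': 2},
    {'id': 8, 'name': 'Plastic', 'parent_id': 2},
    {'id': 9, 'name': 'Steel', 'parent_id': 3},
    {'id': 10, 'name': 'Plastic', 'parent_id': 3},
])

# Directed edges point from parent to child; -1 acts as a virtual root
G = nx.from_pandas_edgelist(tree, source='parent_id', target='id',
                            create_using=nx.DiGraph)
node_names = tree.set_index('id')['name'].to_dict()

# One traversal from the root: dict mapping each node to its path from -1
paths = nx.single_source_shortest_path(G, -1)

# Drop the leading -1 from each path and join the remaining names
tree['flat_name'] = tree['id'].map(
    lambda n: ' : '.join(node_names[i] for i in paths[n][1:])
)
print(tree)
```

In a tree the shortest path and the only simple path coincide, so this should give the same flat_name values while traversing the graph once.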

Score: 2

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/66784106
