Data transformation involving dictionaries and lists

Code Review user

Asked 2019-01-08 16:37:14

1 answer, 118 views, 0 followers, 4 votes

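My initial problem is to transform a List of Mappings whose values are Lists of values. The transformed list must contain the Cartesian product of all the lists (the values in the dictionaries) that belong to separate dictionaries. In other words, the values of lists that sit in the same dictionary are "coupled".

Basically, if you disregard the keys of the dictionaries, this would be solved simply with itertools.product.

Input: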

[
   {
      ('B3G', 'B1'): [1.0, 2.0], 
      ('B1G', 'B1'): [11.0, 12.0]
   }, 
   {
      ('B2G', 'B1'): [1.5, 2.5, 3.5]
   }
]

Output:

[
  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 1.5},
  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 2.5},
  {('B3G', 'B1'): 1.0, ('B1G', 'B1'): 11.0, ('B2G', 'B1'): 3.5},
  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 1.5},
  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 2.5},
  {('B3G', 'B1'): 2.0, ('B1G', 'B1'): 12.0, ('B2G', 'B1'): 3.5}
]
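To make the "coupling" concrete, here is a minimal sketch of what the transformation looks like when the keys are ignored. The variable names (`first`, `second`, `pools`, `combos`) are only illustrative and are not taken from the question's code. Lists from the same dictionary are paired up with zip, and only then is itertools.product applied across dictionaries.

```python
import itertools

# Values from the example input, with the dictionary keys dropped.
first = [[1.0, 2.0], [11.0, 12.0]]  # two lists from the same dictionary (coupled)
second = [[1.5, 2.5, 3.5]]          # one list from the second dictionary

# zip(*lists) pairs coupled values position by position.
pools = [list(zip(*first)), list(zip(*second))]

# The Cartesian product is taken across dictionaries, not across coupled lists.
combos = list(itertools.product(*pools))

for combo in combos:
    print(combo)
```

This yields six combinations (2 coupled pairs times 3 values), matching the six dictionaries in the expected output.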

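To make things more confusing, the keys of each dictionary are Tuples of strings.

Here is a possible implementation, using a class to isolate the whole mess.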

from dataclasses import dataclass, field
from typing import List, Mapping, Sequence, Tuple


@dataclass
class ParametricMapping:
    """Abstraction for multi-dimensional parametric mappings."""

    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])

    @property
    def combinations(self) -> List[Mapping[Tuple[str], float]]:
        """Cartesian product adapted to work with dictionaries, roughly similar to `itertools.product`."""

        labels = [label for arg in self.mappings for label in tuple(arg.keys())]
        pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]

        def cartesian_product(*args):
            """Cartesian product similar to `itertools.product`"""
            result = [[]]
            for pool in args:
                result = [x + [y] for x in result for y in pool]
            return result

        results = []
        for term in cartesian_product(*pools):
            results.append([pp for p in term for pp in p])

        tmp = []
        for r in results:
            tmp.append({k: v for k, v in zip(labels, r)})

        if len(tmp) == 0:
            return [{}]
        else:
            return tmp

Question: how can I improve this to make it cleaner (priority #1) and faster (#2)?

python

python-3.x

1 Answer

Code Review user

Answered 2019-01-09 01:51:50

TL;DR: Scroll to the end for the provided code with the suggested improvements applied

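Suggestions

1. Standalone `cartesian_product` helper function outside the scope of `combinations`

A better solution would be to simply use itertools.product, since most Python readers will be familiar with it, and it is well documented for those who are not.

If you still don't want to use itertools.product:

While building a hierarchy of scopes that only exposes things where necessary does improve clarity, in this case I think making cartesian_product a nested function inside combinations just obscures the purpose of combinations. It is better to define cartesian_product above it, which makes the code below cleaner and easier to understand. Someone reading your code will then first see cartesian_product's definition and understand its relatively simple purpose. By the time they reach ParametricMapping.combinations, they are already familiar with cartesian_product, and their train of thought is not derailed by trying to understand a nested function.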
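As a rough sketch of that suggestion, the helper from the question can be lifted to module level unchanged. The comparison against itertools.product at the end is an addition for illustration only: it suggests that the two agree on contents and order, with the hand-written version returning lists where itertools.product yields tuples.

```python
import itertools


def cartesian_product(*pools):
    """Cartesian product of the given pools, returned as a list of lists."""
    result = [[]]
    for pool in pools:
        # Extend every partial combination with every element of this pool.
        result = [x + [y] for x in result for y in pool]
    return result


pairs = cartesian_product([1, 2], ["a", "b"])
print(pairs)

# Same combinations in the same order as the standard library version.
matches_stdlib = [tuple(p) for p in pairs] == list(itertools.product([1, 2], ["a", "b"]))
print(matches_stdlib)
```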

2. `flatten` helper function

This helper function would be used like so:

for term in cartesian_product(*pools):
    results.append(flatten(term))

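This might seem silly because flattening is such a simple operation, but in this case there are some fairly tricky list/dict comprehensions nearby. Replacing that part with a simple flatten call may help clear up some of the confusion and emphasize that it is a straightforward operation, a fact that may be lost on some readers of the code in its current state. My point here is that stacking lots of loops and comprehensions on top of each other (especially without documentation or comments) gets confusing quickly.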
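For illustration, here is one way such a helper could look, applied to a single term of the product. The sample `term` value is made up from the question's example data, and the itertools.chain.from_iterable line is just a standard-library equivalent shown for comparison, not something the answer prescribes.

```python
from itertools import chain


def flatten(nested):
    """Flatten one level of nesting: [[a, b], [c]] -> [a, b, c]."""
    return [item for sublist in nested for item in sublist]


# One term of the product: a coupled pair followed by a single value.
term = ((1.0, 11.0), (1.5,))

flat = flatten(term)
print(flat)

# The standard library offers the same one-level flattening.
flat_stdlib = list(chain.from_iterable(term))
print(flat_stdlib)
```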

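3. Consolidate some code

If you take both of the suggestions above, a good opportunity arises to consolidate some code. After the changes above, what was originally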

results = []
for term in cartesian_product(*pools):
    results.append([pp for p in term for pp in p])

should now be

results = []
for term in itertools.product(*pools):
    results.append(flatten(term))

This block can be expressed clearly and concisely with the following list comprehension:

results = [flatten(term) for term in itertools.product(*pools)]

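Note that, because the functionality has been separated out into helper functions, the purpose of this list comprehension is very clear: it creates results, a list containing the flattened outputs of product.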

4. Check for empty mappings at the top of `combinations`

Rather than checking len(tmp) == 0 at the end of combinations, make the first two lines of combinations the following:

if self.mappings == [{}]:
    return [{}]

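The else clause at the bottom of combinations is then no longer necessary, and you can simply use return tmp. This is cleaner because it handles the case where mappings is empty immediately, so that case bypasses all the code that previously ran before len(tmp) was checked at the end of combinations. It also lets the reader assume that mappings is non-empty once they reach the actual work combinations is doing, which is one less thing to worry about. Alternatively, a more concise option is to replace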

if len(tmp) == 0:
    return [{}]
else:
    return tmp

with

return tmp or [{}]

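This works because if mappings is empty, tmp will be an empty list, which evaluates to False, so the value after the or is returned. This more concise version may come at the cost of reduced readability.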
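A small sketch of why the fallback is needed at all, and how the `or` form behaves. The helper name `with_default` is hypothetical and exists only for this illustration. With the default mappings of [{}], the single pool is empty, and a product that includes an empty pool produces no terms.

```python
import itertools


def with_default(tmp):
    """Return tmp unless it is empty (falsy), in which case return [{}]."""
    return tmp or [{}]


# A single empty dictionary gives one empty pool ...
pools = [list(zip(*{}.values()))]
print(pools)

# ... and a product over an empty pool yields nothing at all.
terms = list(itertools.product(*pools))
print(terms)

print(with_default(terms))
print(with_default([{"a": 1}]))
```

Note that `or` falls back on any falsy value, not just an empty list, which is part of why the explicit length check can be easier to read.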

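5. Use helper methods or non-init fields to define labels and pools

Remove the first two lines of code from combinations and move them either into helper methods or into additional fields that are set after initialization. I recommend doing this to further clarify the work combinations is actually doing and to clean up the code overall. Both options have the added benefit of exposing labels and pools as additional attributes of the dataclass that can be accessed outside of combinations.

For an implementation that uses helper methods, see the ParametricMapping1 class defined in the section below; for an alternative implementation that uses non-init fields, see the ParametricMapping2 class defined after it.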
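One practical difference between the two approaches, sketched here with two deliberately stripped-down, hypothetical classes (`WithMethod` and `WithField`, using plain string keys for brevity): a helper method recomputes its result on every call, while a field set in `__post_init__` is computed once and will not notice later changes to mappings. Whether that matters depends on whether instances are mutated after creation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WithMethod:
    mappings: List[Dict[str, List[float]]] = field(default_factory=lambda: [{}])

    def _labels(self) -> List[str]:
        # Recomputed on every call, so it reflects the current mappings.
        return [key for mapping in self.mappings for key in mapping]


@dataclass
class WithField:
    mappings: List[Dict[str, List[float]]] = field(default_factory=lambda: [{}])
    labels: List[str] = field(init=False, repr=False)

    def __post_init__(self):
        # Computed once, when the instance is created.
        self.labels = [key for mapping in self.mappings for key in mapping]


a = WithMethod([{"x": [1.0]}])
b = WithField([{"x": [1.0]}])

# Mutate both instances after construction.
a.mappings.append({"y": [2.0]})
b.mappings.append({"y": [2.0]})

print(a._labels())  # sees the new key
print(b.labels)     # still the value computed at construction time
```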

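TL;DR

Finally, if all the suggestions are followed, the code should include the following flatten declaration, along with one of the following two blocks (ParametricMapping1 or ParametricMapping2):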

def flatten(l):
    return [item for sublist in l for item in sublist]

With helper methods:

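import itertools
from dataclasses import dataclass, field
from typing import List, Mapping, Sequence, Tuple

@dataclass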
class ParametricMapping1:
    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])

    def _labels(self) -> List[Tuple[str]]:
        return flatten(self.mappings)

    def _pools(self) -> List[List[Sequence[float]]]:
        return [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]

    @property
    def combinations(self) -> List[Mapping[Tuple[str], float]]:
        if self.mappings == [{}]:
            return [{}]

        pool_values = [flatten(term) for term in itertools.product(*self._pools())]
        return [dict(zip(self._labels(), v)) for v in pool_values]

Or…

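import itertools
from dataclasses import dataclass, field
from typing import List, Mapping, Sequence, Tuple

@dataclass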
class ParametricMapping2:
    mappings: List[Mapping[Tuple[str], Sequence[float]]] = field(default_factory=lambda: [{}])
    labels: List[Tuple[str]] = field(init=False, repr=False)
    pools: List[List[Sequence[float]]] = field(init=False, repr=False)

    def __post_init__(self):
        self.labels = flatten(self.mappings)
        self.pools = [list(map(tuple, zip(*arg.values()))) for arg in self.mappings]

    @property
    def combinations(self) -> List[Mapping[Tuple[str], float]]:
        pool_values = [flatten(term) for term in itertools.product(*self.pools)]
        return [dict(zip(self.labels, v)) for v in pool_values] or [{}]
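As a sanity check, the same logic can be written as a plain function and run against the example from the question. The function name `combine` and the `Key` alias are assumptions made for this sketch and do not appear in the original code; the body follows the same steps as the classes above.

```python
import itertools
from typing import Dict, List, Mapping, Sequence, Tuple

Key = Tuple[str, ...]


def flatten(nested):
    return [item for sublist in nested for item in sublist]


def combine(mappings: Sequence[Mapping[Key, Sequence[float]]]) -> List[Dict[Key, float]]:
    labels = flatten(mappings)  # iterating a dict yields its keys
    # Couple the lists inside each dictionary position by position.
    pools = [list(zip(*m.values())) for m in mappings]
    rows = [flatten(term) for term in itertools.product(*pools)]
    return [dict(zip(labels, row)) for row in rows] or [{}]


data = [
    {("B3G", "B1"): [1.0, 2.0], ("B1G", "B1"): [11.0, 12.0]},
    {("B2G", "B1"): [1.5, 2.5, 3.5]},
]

result = combine(data)
for row in result:
    print(row)
```

One thing to keep in mind: zip stops at the shortest input, so if two coupled lists in the same dictionary have different lengths, the extra values are silently dropped.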

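Edit (2019-01-09-1530):

The definitions of _labels and self.labels in the two code blocks above, respectively, have been simplified per @MathiasEttinger's excellent suggestion. See the revision history for their original definitions.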

6 votes

Original page content provided by Code Review.

Original link:

https://codereview.stackexchange.com/questions/211121
