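I kept forgetting that the standard join() accepts only a single iterable, so I wrote some functions that act recursively on any arguments passed to them. Ironically, the deep version of join() is not meant to be called by the user; it is invoked only from the deep version of sum(). The standard sum() fails and tells you to use join() when you try to sum() strings, whereas this version simply calls the deep join() automatically.
A few questions prompted me to post this here:
It's 100 lines in total, including comments, but each function is between 4 and 12 lines long (plus the definition). Feel free to comment on all of them or on any one.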
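For reference, the built-in behaviour described above can be reproduced with the standard library alone. This is only a minimal sketch; the sample values are made up for illustration.

```python
# Standard-library behaviour only; the sample data is illustrative.
words = ['foo', 'bar']

# sum() refuses to add strings and raises TypeError.
try:
    sum(words, '')
    sum_rejects_str = False
except TypeError:
    sum_rejects_str = True

# str.join() takes exactly one iterable, and every element must be a str.
joined = ' '.join(words)

try:
    ' '.join(['foo', ['bar', 123]])  # the nested list is not a str element
    join_rejects_nested = False
except TypeError:
    join_rejects_nested = True

print(sum_rejects_str, joined, join_rejects_nested)
```

The functions below are an attempt to remove both restrictions by recursing into whatever is passed.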

def _djoin(*args, s=''):
    """
    Executes a recursive string join on all passed arguments and their contents.

    Parameters:
        *args (tuple): An unrolled tuple of arguments.
        s (string): Optional. Separates each element with the given string.
    """
    if len(args) == 1:
        try:
            iter(args[0])
            if type(args[0]) == str:
                raise TypeError
            return s.join(_djoin(arg, s=s) for arg in args[0])
        except TypeError:
            return str(args[0])
    return s.join(_djoin(arg, s=s) for arg in args)

def dall(*args):
    """
    Executes a recursive all() on all passed arguments and their contents.

    Parameter:
        *args (tuple): An unrolled tuple of arguments.
    """
    if len(args) == 1:
        try:
            iter(args[0])
            if type(args[0]) == str or not len(args[0]):
                raise TypeError
            return all(dall(arg) for arg in args[0])
        except TypeError:
            return bool(args[0])
    return all(dall(arg) for arg in args)

def dany(*args):
    """
    Executes a recursive any() on all passed arguments and their contents.

    Parameter:
        *args (tuple): An unrolled tuple of arguments.
    """
    if len(args) == 1:
        try:
            iter(args[0])
            if type(args[0]) == str or not len(args[0]):
                raise TypeError
            return any(dany(arg) for arg in args[0])
        except TypeError:
            return bool(args[0])
    return any(dany(arg) for arg in args)

def dsum(*args, s=0):
    """
    Executes a recursive sum() on all passed arguments and their contents.
    If s is a string, _djoin(args, s) is returned.

    Parameters:
        *args (tuple): An unrolled tuple of arguments.
        s: An initial value to which all other values will be added.
    """
    if type(s) == str:
        return _djoin(*args, s=s)
    if len(args) == 1:
        try:
            iter(args[0])
            if type(args[0]) == str:
                raise TypeError
            return sum((dsum(arg, s=s) if arg else s for arg in args[0]), s)
        except TypeError:
            if type(s) == list:
                return [args[0]]
            return (args[0])
    return sum((dsum(arg, s=s) for arg in args), s)

def ssum(*seq):
    """
    Executes a sum() on the given seq, automatically determining a reasonable start value.

    Parameter:
        *seq (tuple): An unrolled tuple of arguments.
    """
    n = next(iter(seq))
    if len(seq) == 1:
        return sum(n, type(next(iter(n)))())
    return sum(seq, type(n)())

def dlen(*args, deep=False):
    """
    Executes a recursive len() on all passed arguments and their contents.

    Parameters:
        *args (tuple): An unrolled tuple of arguments.
        deep (bool): An initial value to which all other values will be added (with type conversions if necessary).
    """
    if len(args) == 1:
        try:
            iter(args[0])
            if type(args[0]) == str:
                raise TypeError
            return sum((dlen(arg, deep=deep) for arg in args[0]))
        except TypeError:
            if deep and type(args[0]) == str:
                return len(args[0])
            return 1
    return sum((dlen(arg, deep=deep) for arg in args))

This is what I used to test them. So far, every line prints the function result followed by the expected result, and they all match. New, more strenuous tests that expose bugs would be welcome.

from deep import *
import datetime as dt
print(_djoin('foo', 'bar', 123, s=' '), '#### foo bar 123')
print(_djoin(['foo', 'bar', 123], s=' '), '#### foo bar 123')
print(_djoin('foo', 'bar', [123,456,789,'baz'], s=' '), '#### foo bar 123 456 789 baz')
print(_djoin(['foo', 'bar', [123,456,789,'baz']], s=' '), '#### foo bar 123 456 789 baz')
print(_djoin([10,11,12, 0.0000000000003], s=' '), '#### 10 11 12 3e-13')
print(' '.join(_djoin([10,11,12, 0.0000000000003])), '#### 1 0 1 1 1 2 3 e - 1 3')
print(dall([1],[1],[[],[]]), False)
print(dall([0],), False)
print(dall(0), False)
print(dall(1), True)
print(dall(1,2,[3]), True)
print(dall([1],), True, '\n')
print(dany([],[0],[[],[]]), False)
print(dany([0],), False)
print(dany(0), False)
print(dany(1), True)
print(dany([0,0],[3]), True)
print(dany([],[1]), True, '\n')
print(dsum(1,2,3), 6)
print(dsum([1,2,3]), 6)
print(dsum(1,[2,3]), 6)
print(dsum([1,2],[3,4],5), 15)
print(dsum([1,2],[3,4], s=[]), [1,2,3,4])
print(dsum([1,2],[3,4],5, s=[]), [1,2,3,4,5])
print(dsum(1,2,3,[4,[5,6]]), 21)
print(dsum('a','b',s='-'), 'a-b')
print(dsum(1,2,3, s='-'), '1-2-3')
print(dsum(1,2,3), 6)
print(dsum(dt.timedelta(3), dt.timedelta(4), s=dt.timedelta()), "7 days, 0:00:00")
print(dsum(1,2,3,[[],[3,0]]), 9, '\n')
print(dlen([1,2,3]), 3)
print(dlen([1,2],[3]), 3)
print(dlen([[1,2],[0],[2,[2,[2]]]]), 6)
print(dlen([['hello',2],[0],[2,[2,[2]]]]), 6)
print(dlen([['hello',2],[0],[2,[2,[2]]]], deep=True), 10, '\n')
print(ssum([[1,2],[3]]), [1,2,3])
print(ssum([1,2,3]), 6)
print(ssum(1,2,3), 6)
print(ssum([1,2],[3]), [1,2,3])

Posted on 2015-04-14 09:50:18
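There is a lot of duplication here: the code that flattens the *args tuple appears several times. I would factor it out into a single function, _flatten, which can be a generator so that it copes with large inputs:

def _flatten(iter_):
    if isinstance(iter_, str):
        yield iter_
    else:
        try:
            for obj in iter_:
                yield from _flatten(obj)
        except TypeError:
            yield iter_

(Note that yield from is only available from Python 3.3 onwards.) This neatly flattens your tuple of arguments:
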
>>> list(_flatten(('foo', 'bar', [123,456,789,'baz'])))
['foo', 'bar', 123, 456, 789, 'baz']

Now _djoin becomes:

def _djoin(*args, s=''):
    return s.join(map(str, _flatten(args)))

and works exactly the same:

>>> _djoin('foo', 'bar', [123,456,789,'baz'], s=' ')
'foo bar 123 456 789 baz'

Similarly, dall becomes return all(_flatten(args)).
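Note that in the _flatten implementation above I used isinstance rather than type(iter_) == str. This handles inheritance properly (i.e. subclasses of str are also treated correctly). dsum should use it too:

def dsum(*args, s=0):
    if isinstance(s, str):
        return _djoin(*args, s=s)
    ...

See e.g. "Differences between isinstance() and type() in Python".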
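A small illustration of the difference, using a hypothetical str subclass named Label that exists only for this example:

```python
class Label(str):
    """Hypothetical str subclass, used only for this illustration."""


label = Label('foo')

exact_match = type(label) == str        # False: the exact type is Label
is_a_string = isinstance(label, str)    # True: Label inherits from str

print(exact_match, is_a_string)
```

With the type() comparison, such an object would not be recognised as a string and would be iterated character by character instead.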
The current test suite requires you to read every line to check that the output is as expected. Life would be much simpler if you used, for example, assert for this:

assert _djoin('foo', 'bar', 123, s=' ') == 'foo bar 123'

If everything works this produces no output, but if a test fails it raises an error:

>>> assert _djoin('foo', 'bar', 123, s=' ') == 'foo bar 123'
>>> assert _djoin('foo', 'bar', 123, s=' ') == 'derp'
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    assert _djoin('foo', 'bar', 123, s=' ') == 'derp'
AssertionError

Alternatively, you could consider implementing doctests, e.g.:

def _djoin(*args, s=''):
    """Flatten the arguments and join them together as strings.

        >>> _djoin('foo', 'bar', 123, s=' ')
        'foo bar 123'

    """
    ...

Then, at the bottom of deep.py, you can easily run all of the tests with:

if __name__ == '__main__':
    import doctest
    doctest.testmod(verbose=True)

You get useful output showing what was tested, what passed and what did not. For example, here is the output for a failing version of ssum that I developed:

...
Trying:
    ssum(1, 2, 3)
Expecting:
    6
ok
Trying:
    ssum('foo', 'bar', 'baz')
Expecting:
    'foobarbaz'
**********************************************************************
File "C:/Python34/deep.py", line 49, in __main__.ssum
Failed example:
    ssum('foo', 'bar', 'baz')
Expected:
    'foobarbaz'
Got:
    'foofoobarbaz'
1 items had no tests:
    __main__
3 items passed all tests:
   2 tests in __main__._djoin
   3 tests in __main__._flatten
   1 tests in __main__.dsum
**********************************************************************
1 items had failures:
   1 of   2 in __main__.ssum
8 tests in 5 items.
7 passed and 1 failed.
***Test Failed*** 1 failures.

(I had passed seq, rather than iter_, to _djoin. Oops!)
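The ssum implementation seems a bit odd; the repeated use of iter and next makes the code hard to read and is unlikely to be efficient. Instead, consider something like this:

def ssum(*seq):
    """Sum over the sequence, determining a sensible start value.

        >>> ssum(1, 2, 3)
        6
        >>> ssum('foo', 'bar', 'baz')
        'foobarbaz'

    """
    iter_ = _flatten(seq)
    first = next(iter_)
    if isinstance(first, str):
        return _djoin(first, iter_)
    return sum(iter_, first)

This makes it clear that the logic determines the "sensible start value" from the type of the first object.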
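To see what "determined by the first object" means in practice, here is a small self-contained run of the ssum above together with the helpers it relies on. The timedelta and nested-list calls are extra illustrations that are not in the answer. One consequence worth knowing: because the arguments are flattened first, nested lists are summed element by element instead of being concatenated.

```python
import datetime as dt


def _flatten(iter_):
    if isinstance(iter_, str):
        yield iter_
    else:
        try:
            for obj in iter_:
                yield from _flatten(obj)
        except TypeError:
            yield iter_


def _djoin(*args, s=''):
    return s.join(map(str, _flatten(args)))


def ssum(*seq):
    iter_ = _flatten(seq)
    first = next(iter_)   # the first leaf doubles as the start value
    if isinstance(first, str):
        return _djoin(first, iter_)
    return sum(iter_, first)


print(ssum(1, 2, 3))                                  # 6
print(ssum('foo', 'bar', 'baz'))                      # foobarbaz
print(ssum(dt.timedelta(days=3), dt.timedelta(days=4)))  # 7 days, 0:00:00
print(ssum([1, 2], [3]))                              # 6, not [1, 2, 3]
```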
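
Posted on 2015-04-14 10:31:01
The reason you have so much code is that you haven't decomposed the problem well. For example, dsum() is responsible for flattening, type checking and summing. The flattening work is common to many of your functions and should be delegated to a function of its own. You'll find that this is a problem that has been solved before.
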
def flatten(*sequence):
    for item in sequence:
        if isinstance(item, str) or not hasattr(item, '__iter__'):
            yield item
        else:
            for i in item:
                yield from flatten(i)
                # yield from is a Python 3.3 feature
                # https://docs.python.org/3/whatsnew/3.3.html#pep-380

def dall(*sequence):
    return all(flatten(*sequence))

def dany(*sequence):
    return any(flatten(*sequence))

# I would just inline _djoin() within dsum().
def _djoin(*sequence, s=''):
    return s.join(str(item) for item in flatten(*sequence))

def dsum(*sequence, s=0):
    if isinstance(s, str):
        return _djoin(*sequence, s=s)
    try:
        return sum(flatten(*sequence), s)
    except TypeError:
        return s + list(flatten(*sequence))

def dlen(*sequence, deep=False):
    def length(item):
        return len(item) if deep and isinstance(item, str) else 1
    return sum(length(item) for item in flatten(*sequence))

These implementations agree with your test cases, except that dall([1],[1],[[],[]]) returns True rather than the False you wanted. This is due to a difference in interpretation. Note that all([]) is True, while bool([]) is False. The question is: how deep do you take the recursion? If you recurse all the way into the empty lists, the result is True. If instead you treat an empty list as False, the result is False. Given how many special cases it took to write dall() your way, though, I'm inclined to say that your interpretation is the contrived one.
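The two interpretations can be put side by side. In the sketch below, flatten and dall are the versions shown above, while dall_empty_is_false is a hypothetical variant written only to illustrate the other reading, in which an empty container counts as False.

```python
def flatten(*sequence):
    for item in sequence:
        if isinstance(item, str) or not hasattr(item, '__iter__'):
            yield item
        else:
            for i in item:
                yield from flatten(i)


def dall(*sequence):
    # Recurses all the way down: empty containers contribute no leaves.
    return all(flatten(*sequence))


def dall_empty_is_false(*sequence):
    # Hypothetical variant: an empty container is treated like a falsy leaf.
    for item in sequence:
        if isinstance(item, str) or not hasattr(item, '__iter__'):
            if not item:
                return False
        else:
            children = list(item)
            if not children or not dall_empty_is_false(*children):
                return False
    return True


print(all([]), bool([]))                          # True False
print(dall([1], [1], [[], []]))                   # True
print(dall_empty_is_false([1], [1], [[], []]))    # False
```

The second variant needs an explicit emptiness check at every level, which is the kind of special case referred to above.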
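The one function I haven't reimplemented is ssum(). You say that it "automatically determines a reasonable start value". It isn't at all clear to me what counts as reasonable; it all sounds rather arbitrary, so the function's implementation is essentially its own specification.

Posted on 2015-04-14 10:00:56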
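If you look at your functions, you'll see that they all share almost identical code:

if len(args) == 1:
    try:
        iter(args[0])
        if type(args[0]) == str or not len(args[0]):
            raise TypeError
        return A(B(arg) for arg in args[0])
    except TypeError:
        return C(args[0])
return A(B(arg) for arg in args)

except for the parts that I've replaced with A, B and C. But even though these parts differ from function to function, B is always a recursive call to the containing function, A implements the "combining" logic, and C computes the result for a single item.
So you can simplify the code by extracting this common code into a function, for example:
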
def map_reduce_tree(f, r, *args):
    """Apply f to each leaf element of the tree args and combine the
    results by calling r.

    """
    if len(args) == 1:
        try:
            iter(args[0])
            if type(args[0]) == str or not len(args[0]):
                raise TypeError
            return r(map_reduce_tree(f, r, a) for a in args[0])
        except TypeError:
            return f(args[0])
    else:
        return r(map_reduce_tree(f, r, a) for a in args)

Now dall becomes:

map_reduce_tree(bool, all, *args)

dany becomes:

map_reduce_tree(bool, any, *args)

dsum becomes:

identity = lambda x:x
map_reduce_tree(identity, sum, *args)

djoin becomes (with str rather than identity as f, so that non-string leaves are converted before joining):

map_reduce_tree(str, ''.join, *args)

and so on. If you're wondering why I called this map_reduce_tree, it's because "map-reduce" is a well-known data processing model, and a tree is the recursive data structure that we're operating on.
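The "and so on" covers other reductions as well. As an illustration, and as an assumption about how the remaining function might be expressed rather than something stated above, dlen maps every leaf to 1 and reduces with sum. The dmax function is a further hypothetical use, included only to show that any reducer works.

```python
def map_reduce_tree(f, r, *args):
    """Apply f to each leaf element of the tree args and combine the
    results by calling r.
    """
    if len(args) == 1:
        try:
            iter(args[0])
            if type(args[0]) == str or not len(args[0]):
                raise TypeError
            return r(map_reduce_tree(f, r, a) for a in args[0])
        except TypeError:
            return f(args[0])
    else:
        return r(map_reduce_tree(f, r, a) for a in args)


def dlen(*args):
    # Map step: every leaf counts as 1. Reduce step: add the counts.
    return map_reduce_tree(lambda leaf: 1, sum, *args)


def dmax(*args):
    # Hypothetical: the largest leaf anywhere in the nested arguments.
    return map_reduce_tree(lambda leaf: leaf, max, *args)


print(dlen([[1, 2], [0], [2, [2, [2]]]]))   # 6
print(dlen([1, 2], [3]))                    # 3
print(dmax(3, [7, [2]], 5))                 # 7
```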
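Now we can simplify map_reduce_tree as follows:
1. Instead of calling iter and catching TypeError, we can use the abstract base class collections.abc.Iterable and write isinstance(x, Iterable).
2. By taking a single argument args rather than a variable number of arguments *args, the two instances of r(map_reduce_tree(f, r, a) for a in ...) can be merged into one.
3. The special case not len(args[0]) can be dropped: it is better for r to handle an empty sequence of arguments than for f to try to handle it.
The result is:
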
from collections.abc import Iterable

def map_reduce_tree(f, r, args):
    """Apply f to each leaf element of the tree args and combine the
    results by calling r.

    """
    if isinstance(args, Iterable) and type(args) != str:
        return r(map_reduce_tree(f, r, a) for a in args)
    else:
        return f(args)

But we can decompose this further. There are three steps here: (1) recursively walk the tree; (2) apply f to each leaf element; (3) combine the results by calling r. So we can split the work into three parts: step (1) uses the leaves function below, step (2) uses the built-in map, and step (3) just calls r.

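def leaves(tree):
    """Generate the leaf elements of tree."""
    if isinstance(tree, Iterable) and type(tree) != str:
        for t in tree:
            yield from leaves(t)
    else:
        yield tree

Now dall becomes:

all(leaves(args))

(we don't need to apply bool to the leaves, because all already does that). Similarly, dany becomes:

any(leaves(args))

dsum becomes:

sum(leaves(args))

(since f is the identity function, we can omit the map step), and djoin becomes:

''.join(map(str, leaves(args)))

(here f is str, so the map step stays, otherwise non-string leaves could not be joined). I hope you'll agree that this is much shorter and easier to understand than the original.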
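Putting the three steps together, the whole module might look roughly like the following. This is a sketch under assumptions: the s and deep keyword parameters are carried over from the question rather than from this answer, and map(str, ...) is used in djoin so that non-string leaves can be joined.

```python
import datetime as dt
from collections.abc import Iterable


def leaves(tree):
    """Generate the leaf elements of tree."""
    if isinstance(tree, Iterable) and type(tree) != str:
        for t in tree:
            yield from leaves(t)
    else:
        yield tree


def dall(*args):
    return all(leaves(args))


def dany(*args):
    return any(leaves(args))


def dsum(*args, s=0):
    # The identity map step is omitted; s is the start value, as in sum().
    return sum(leaves(args), s)


def djoin(*args, s=''):
    # Map step: str. Reduce step: s.join.
    return s.join(map(str, leaves(args)))


def dlen(*args, deep=False):
    # Map step: 1 per leaf, or the string length when deep=True.
    return sum(len(x) if deep and isinstance(x, str) else 1
               for x in leaves(args))


print(dsum(1, 2, 3, [4, [5, 6]]))                            # 21
print(djoin('foo', 'bar', [123, 456, 789, 'baz'], s=' '))    # foo bar 123 456 789 baz
print(dlen([['hello', 2], [0], [2, [2, [2]]]], deep=True))   # 10
print(dsum(dt.timedelta(3), dt.timedelta(4), s=dt.timedelta()))  # 7 days, 0:00:00
```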
https://codereview.stackexchange.com/questions/86842