令我懊恼的是,STL和Boost都没有笛卡尔的产品。即,作为一个或多个可迭代的参数,创建一个生成每个元素组合的std::tuples的迭代器(每个迭代器按参数的顺序从每个迭代器中抽取一个)。
我在这里的目标是使cartesian_product的行为与python的itertools.product完全一样(除了可选的repeat kwarg)。理想情况下,它将是一个零成本抽象(与使用嵌套的for循环相比)。
例如:
#include
std::vector as = {1, 2};
std::vector bs = {'a', 'b'};
std::vector cs = {1.5, 2.5};
for (auto [a, b, c] : cartesian_product(as, bs, cs)) {
std::cout << "(a = " << a << ", b = " << b << ", c = " << c << ")" << std::endl;
}应产生:
(a = 1, b = a, c = 1.5)
(a = 1, b = a, c = 2.5)
(a = 1, b = b, c = 1.5)
(a = 1, b = b, c = 2.5)
(a = 2, b = a, c = 1.5)
(a = 2, b = a, c = 2.5)
(a = 2, b = b, c = 1.5)
(a = 2, b = b, c = 2.5)为了使其具有通用性并支持任意数量的args,我必须使用参数包并进行一些模板模式匹配。
以下是我得出的结论:
#pragma once
#include
template
class product_iterator;
template
class product;
template
class product {
public:
explicit product(const T &x) : m_x(x) {}
product_iterator begin() const;
product_iterator end() const;
protected:
const T &m_x;
};
template
class product {
public:
product(const T &x, const Ts&... xs) : m_x(x), m_xs(product(xs...)) {}
product_iterator begin() const;
product_iterator end() const;
protected:
const T &m_x;
product m_xs;
};
template
class product_iterator {
friend class product;
public:
std::tuple operator*() const {
return std::make_tuple(*m_it);
}
const product_iterator &operator++() {
m_it++;
return *this;
}
bool operator==(const product_iterator &other) const {
return m_it == other.m_it;
}
bool operator!=(const product_iterator &other) const {
return !(*this == other);
}
protected:
typedef typename T::const_iterator t_iterator;
product_iterator(t_iterator it, t_iterator end) : m_it(it), m_end(end) {}
t_iterator m_it;
t_iterator m_end;
};
template
class product_iterator {
friend class product;
public:
decltype(auto) operator*() const {
return std::tuple_cat(std::make_tuple(*m_x), *m_xs);
}
const product_iterator &operator++() {
if (++m_xs == m_xs_end && ++m_x != m_x_end) {
m_xs = m_xs_begin;
}
return *this;
}
bool operator==(const product_iterator &other) const {
return m_x == other.m_x && m_xs == other.m_xs;
}
bool operator!=(const product_iterator &other) const {
return !(*this == other);
}
protected:
typedef typename T::const_iterator t_iterator;
typedef product_iterator ts_iterator;
product_iterator(t_iterator x, t_iterator x_end, ts_iterator xs,
ts_iterator xs_begin, ts_iterator xs_end)
: m_x(x), m_x_end(x_end), m_xs(xs), m_xs_begin(xs_begin), m_xs_end(xs_end) {}
t_iterator m_x;
t_iterator m_x_end;
ts_iterator m_xs;
ts_iterator m_xs_begin;
ts_iterator m_xs_end;
};
template
product_iterator product::begin() const {
return product_iterator(m_x.begin(), m_x.end());
}
template
product_iterator product::end() const {
return product_iterator(m_x.end(), m_x.end());
}
template
product_iterator product::begin() const {
return product_iterator(m_x.begin(), m_x.end(), m_xs.begin(),
m_xs.begin(), m_xs.end());
}
template
product_iterator product::end() const {
return product_iterator(m_x.end(), m_x.end(), m_xs.end(), m_xs.begin(),
m_xs.end());
}
template
product cartesian_product(Ts&... xs) {
return product(xs...);
}我已经测试了正确性和速度。clang 6和gcc 8都在努力将其优化为相当于嵌套的for循环。有关实证结果,请参见这个有一个可重复基准的要点。在我的机器上,我一直能做到以下几点:
$ g++-8 --version
g++-8 (Homebrew GCC 8.1.0) 8.1.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ clang++-6 --version
clang version 6.0.1 (tags/RELEASE_601/final)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /usr/local/opt/llvm/bin
$ make clean && CXX=g++-8 make benchmark
# ... snip ...
time ./run_cartesian
32.53 real 32.42 user 0.03 sys
time ./run_loop
32.39 real 32.29 user 0.03 sys
$ make clean && CXX=clang++-6 make benchmark
# ... snip ...
time ./run_cartesian
30.31 real 30.24 user 0.02 sys
time ./run_loop
27.30 real 27.24 user 0.02 sys不知道gcc在这里的时间是怎么回事(希望这个基准没有被限制!),但与clang有一个明显的区别。此外,查看gcc-8和clang-6的哥德波特,可以看出两者似乎都无法优化product_iterator的一些抽象。
此外,如果用累加器替换dummy并执行实际的点积,如下所示:
uint32_t dot(const std::vector &as, const std::vector &bs) {
uint32_t acc = 0;
for (auto [a, b] : cartesian_product(as, bs)) {
acc += a * b;
}
return acc;
}和:
uint32_t dot(const std::vector &as, const std::vector &bs) {
uint32_t acc = 0;
for (auto a : as) {
for (auto b : bs) {
acc += a * b;
}
}
return acc;
}这种不同变得难以置信的明显。循环版本运行时下降到大约1秒,笛卡尔产品在我的机器上停留在30。看看哥德波特为了这个,您可以非常清楚地看到,gcc和clang都能够将嵌套的循环版本矢量化,而不是笛卡尔版本。
我对以下几件事很好奇:
cartesian_product优化循环?理想情况下,cartesian_product应该是零成本抽象(与嵌套的for循环相比)。发布于 2018-07-14 07:19:11
将m_x用于容器(在product中)和迭代器(在product_iterator中)会引起混淆。类似于m_xs。
编译器依赖于循环的“嵌套”来进行优化。product_iterator会删除这一点,因此编译器无法将其解释得非常完美。
比较而言: Clang和GCC能够为这两个备选方案(点积示例)生成几乎相同的程序集(主要是一些不同的XMM寄存器分配),就像在嵌套循环1中一样。
uint32_t dot(const std::vector& as, const std::vector& bs) {
uint32_t acc = 0;
acc = std::accumulate(
std::begin(as),
std::end(as),
acc,
[&bs](auto&& acc, auto&& a) {
return std::accumulate(
std::begin(bs),
std::end(bs),
acc,
[&a](auto&& acc, auto&& b) { return acc + a*b; });
});
return acc;
}template
auto cartesian_product(Callable&& op, const Cont& cont) {
for(auto&& x : cont) {
op(x);
}
}
template
void cartesian_product(Callable&& op, const Cont& cont, const Conts&... conts) {
for(auto&& x : cont) {
cartesian_product([&x, &op](auto&&... args) { op(x, args...); }, conts...);
}
}
uint32_t dot(const std::vector& as, const std::vector& bs) {
uint32_t acc = 0;
cartesian_product([&acc](auto&& a, auto&& b){ acc += a*b; }, as, bs);
return acc;
}{})而不是()。product_iterator::m_end。product_iterator::m_x_end。(可以删除operator++中的比较。)private。product_iterators的建设者有必要成为protected吗?使它们成为public,消除了friend class声明的必要性。std::begin/std::end。这允许使用那些成员函数不存在的类型(例如C样式数组,如int arr[4])。iterator_category、value_type、reference、difference_type和pointer (用于推断迭代器的特性)operator->())operator++(int))const product*中保留一个product_iterator,而不是m_xs_begin和m_xs_end。毕竟,其他一些代码(例如,使用迭代器的代码)可能会修改底层容器(S),对于某些容器来说,这是定义良好的行为。https://codereview.stackexchange.com/questions/198475
复制相似问题