首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >无法使用dask-cudf创建第三列。

无法使用dask-cudf创建第三列。
EN

Stack Overflow用户
提问于 2022-06-04 10:52:42
回答 1查看 40关注 0票数 1

我有以下dask_cudf.core.DataFrame

代码语言:javascript
复制
import pandas as pd
import numpy as np
import dask_cudf
import cudf


data = {"x":range(1,21), "nor":np.random.normal(2, 4, 20), "unif":np.random.uniform(size = 20)}
df = cudf.DataFrame(data)
ddf = dask_cudf.from_cudf(df, npartitions = 2)
ddf.compute()

我想为列norunif创建第1到第5个滞后值。然而,我创建它们的方式如下:

代码语言:javascript
复制
colz = ["nor", "unif"]
ddf[[s + "_" + str(1) for s in colz]] = ddf[colz].shift(1)
ddf[[s + "_" + str(2) for s in colz]] = ddf[colz].shift(2)

我可以创建第一个和第二个滞后值,但仅此而已。当我运行值大于2的shift时,会得到以下错误:

代码语言:javascript
复制
    /usr/local/lib/python3.7/site-packages/dask/dataframe/utils.py in raise_on_meta_error(funcname, udf)
    175     try:
--> 176         yield
    177     except Exception as e:

16 frames
cudf/_lib/copying.pyx in cudf._lib.copying.shift()

RuntimeError: parallel_for failed: cudaErrorInvalidConfiguration: invalid configuration argument

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/dask/dataframe/utils.py in raise_on_meta_error(funcname, udf)
    195         )
    196         msg = msg.format(f" in `{funcname}`" if funcname else "", repr(e), tb)
--> 197         raise ValueError(msg) from e
    198
    199

ValueError: Metadata inference failed in `shift`.

Original error is below:
------------------------
RuntimeError('parallel_for failed: cudaErrorInvalidConfiguration: invalid configuration argument')

Traceback:
---------
  File "/usr/local/lib/python3.7/site-packages/dask/dataframe/utils.py", line 176, in raise_on_meta_error
    yield
  File "/usr/local/lib/python3.7/site-packages/dask/dataframe/core.py", line 5833, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/usr/local/lib/python3.7/site-packages/dask/utils.py", line 1021, in __call__
    return getattr(__obj, self.method)(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/cudf/core/frame.py", line 1788, in shift
    return self._shift(periods)
  File "/usr/local/lib/python3.7/site-packages/cudf/core/frame.py", line 1793, in _shift
    zip(self._column_names, data_columns), self._index
  File "/usr/local/lib/python3.7/site-packages/cudf/core/dataframe.py", line 818, in _from_data
    out = super()._from_data(data, index)
  File "/usr/local/lib/python3.7/site-packages/cudf/core/frame.py", line 140, in _from_data
    Frame.__init__(obj, data, index)
  File "/usr/local/lib/python3.7/site-packages/cudf/core/frame.py", line 78, in __init__
    self._data = cudf.core.column_accessor.ColumnAccessor(data)
  File "/usr/local/lib/python3.7/site-packages/cudf/core/column_accessor.py", line 121, in __init__
    data = dict(data)
  File "/usr/local/lib/python3.7/site-packages/cudf/core/frame.py", line 1791, in <genexpr>
    data_columns = (col.shift(offset, fill_value) for col in self._columns)
  File "/usr/local/lib/python3.7/site-packages/cudf/core/column/column.py", line 391, in shift
    return libcudf.copying.shift(self, offset, fill_value)
  File "cudf/_lib/copying.pyx", line 633, in cudf._lib.copying.shift

我似乎不明白为什么会发生这种事。

EN

回答 1

Stack Overflow用户

发布于 2022-07-02 00:06:17

谢谢您的最小复制;只要做一点小小的改动就可以了。不要.compute()达斯克太早。如果您需要在dask/dask_cudf中执行某些操作并继续处理,请使用.persist()

代码语言:javascript
复制
import pandas as pd
import numpy as np
import dask_cudf
import cudf


data = {"x":range(1,21), "nor":np.random.normal(2, 4, 20), "unif":np.random.uniform(size = 20)}
df = cudf.DataFrame(data)
ddf = dask_cudf.from_cudf(df, npartitions = 2)
colz = ["nor", "unif"]
ddf[[s + "_" + str(1) for s in colz]] = ddf[colz].shift(1)
ddf[[s + "_" + str(2) for s in colz]] = ddf[colz].shift(2)
ddf[[s + "_" + str(3) for s in colz]] = ddf[colz].shift(3)
ddf[[s + "_" + str(5) for s in colz]] = ddf[colz].shift(5)
ddf.compute()

输出

代码语言:javascript
复制
    x   nor unif    nor_1   unif_1  nor_2   unif_2  nor_3   unif_3  nor_5   unif_5
0   1   3.711132    0.021615    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>
1   2   -2.465054   0.081927    3.711131915 0.021614727 <NA>    <NA>    <NA>    <NA>    <NA>    <NA>
2   3   1.543548    0.481731    -2.465054359    0.081927168 3.711131915 0.021614727 <NA>    <NA>    <NA>    <NA>
3   4   8.820771    0.040135    1.543548323 0.481731194 -2.465054359    0.081927168 3.711131915 0.021614727 <NA>    <NA>
4   5   0.233656    0.135811    8.82077073  0.040135259 1.543548323 0.481731194 -2.465054359    0.081927168 <NA>    <NA>
5   6   2.526556    0.360873    0.23365638  0.135810979 8.82077073  0.040135259 1.543548323 0.481731194 3.711131915 0.021614727
6   7   2.799205    0.383579    2.526555817 0.360873336 0.23365638  0.135810979 8.82077073  0.040135259 -2.465054359    0.081927168
7   8   5.960305    0.362417    2.799205226 0.383579063 2.526555817 0.360873336 0.23365638  0.135810979 1.543548323 0.481731194
8   9   1.878898    0.609364    5.960304782 0.362416925 2.799205226 0.383579063 2.526555817 0.360873336 8.82077073  0.040135259
9   10  1.217635    0.041408    1.878898482 0.609364119 5.960304782 0.362416925 2.799205226 0.383579063 0.23365638  0.135810979
10  11  0.580250    0.128405    1.21763458  0.04140812  1.878898482 0.609364119 5.960304782 0.362416925 2.526555817 0.360873336
11  12  4.907322    0.708164    0.580249571 0.128405085 1.21763458  0.04140812  1.878898482 0.609364119 2.799205226 0.383579063
12  13  6.591673    0.105310    4.907321929 0.708164063 0.580249571 0.128405085 1.21763458  0.04140812  5.960304782 0.362416925
13  14  -2.974896   0.587859    6.591673409 0.105310053 4.907321929 0.708164063 0.580249571 0.128405085 1.878898482 0.609364119
14  15  2.284847    0.978458    -2.974896021    0.587858754 6.591673409 0.105310053 4.907321929 0.708164063 1.21763458  0.04140812
15  16  -5.616458   0.114736    2.28484689  0.97845785  -2.974896021    0.587858754 6.591673409 0.105310053 0.580249571 0.128405085
16  17  -3.003533   0.279865    -5.616457873    0.114736009 2.28484689  0.97845785  -2.974896021    0.587858754 4.907321929 0.708164063
17  18  0.241106    0.923462    -3.003532592    0.279864688 -5.616457873    0.114736009 2.28484689  0.97845785  6.591673409 0.105310053
18  19  -2.100202   0.613850    0.241106056 0.923462497 -3.003532592    0.279864688 -5.616457873    0.114736009 -2.974896021    0.587858754
19  20  8.364832    0.929587    -2.100201941    0.613850209 0.241106056 0.923462497 -3.003532592    0.279864688 2.28484689  0.97845785
​
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/72499158

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档