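I'm trying to convert Python data types so that they can be returned through SQL Server using the sp_execute_external_script procedure. A few columns in particular are giving me trouble. Sample data: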
>>> df.column1
0 NaN
1 1403
2 NaN
3 NaN
4 NaN
Using a method found in another answer (https://stackoverflow.com/a/60779074/3084939), I wrote a function that does the conversion and returns a new Series.
def str_convert(series):
    null_cells = series.isnull()
    return series.astype(str).mask(null_cells, np.NaN)
Then I do:
df.column1 = str_convert(df.column1)
When I run the procedure in Management Studio, I get an error:
Msg 39004, Level 16, State 20, Line 0
A 'Python' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004.
Msg 39019, Level 16, State 2, Line 0
An external script error occurred:
C:\SQL\MSSQL14.SQL2017\PYTHON_SERVICES.3.7\lib\site-packages\revoscalepy\functions\RxSummary.py:4: FutureWarning: The Panel class is removed from pandas. Accessing it from the top-level namespace will also be removed in the next version
from pandas import DataFrame, Index, Panel
INTERNAL ERROR: should have tag
error while running BxlServer: caught exception: Error communicating between BxlServer and client: 0x000000e9
STDOUT message(s) from external script:
Express Edition will continue to be enforced.
Warning: numpy.int64 data type is not supported. Data is converted to float64.
Warning: numpy.int64 data type is not supported. Data is converted to float64.
SqlSatelliteCall function failed. Please see the console output for more information.
Traceback (most recent call last):
STDOUT message(s) from external script:
File "C:\SQL\MSSQL14.SQL2017\PYTHON_SERVICES.3.7\lib\site-packages\revoscalepy\computecontext\RxInSqlServer.py", line 605, in rx_sql_satellite_call
rx_native_call("SqlSatelliteCall", params)
File "C:\SQL\MSSQL14.SQL2017\PYTHON_SERVICES.3.7\lib\site-packages\revoscalepy\RxSerializable.py", line 375, in rx_native_call
ret = px_call(functionname, params)
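RuntimeError: The type numpy.ndarray(numpy.ustr) for column1 is not supported.
I'm not sure where to start. When I simply do the following, it does not raise an error, but the NaN values are replaced with the string 'nan', so they come back as strings rather than as NULL in SQL Server, which is not what I want. Hopefully someone else has some insight into what is going on. I tried searching but didn't find anything relevant.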
df.column1 = df.column1.astype(str)
Edit:
A simpler example seems to show that this happens when the first value in the Series is NaN.
declare @script nvarchar(max) = N'
import os
import datetime
import numpy as np
import pandas as pd
df = pd.DataFrame([[np.NaN, "a", "b"],["w","x",np.NaN],[1, 2, 3]])
df.columns = ["a","b","c"]
print(df.head())
'
execute sp_execute_external_script
@language = N'Python',
@script = @script,
@output_data_1_name = N'df'
with result sets ((
a varchar(100) null
,b varchar(100) null
,c varchar(100) null
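))
Posted on 2021-04-15 07:03:03
I think this may be because np.NaN is not a string and therefore cannot be converted to varchar.
Try converting the df values to str, i.e. in your latest example: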
df["a"] = df["a"].apply(str)
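Note that `apply(str)` behaves like the `astype(str)` call in the question: a missing value becomes the literal string 'nan' rather than staying missing. The sketch below shows that behavior in plain pandas, next to a hypothetical helper (`to_str_keep_missing`, not part of pandas or revoscalepy) that stringifies only the non-missing cells and leaves `None` in the others. Whether the SQL Server Python runtime accepts such a column when its first value is missing is an assumption that has not been verified here.

```python
import numpy as np
import pandas as pd


def to_str_keep_missing(series):
    """Hypothetical helper: str() every non-missing cell, keep missing cells as None."""
    values = [None if pd.isna(v) else str(v) for v in series]
    # dtype=object stops pandas from converting None back into NaN.
    return pd.Series(values, index=series.index, name=series.name, dtype=object)


# Same shape as column "a" in the simplified example: missing value first.
s = pd.Series([np.nan, "w", 1], name="a")

as_text = s.apply(str)         # the missing value turns into the string 'nan'
kept = to_str_keep_missing(s)  # the missing value stays missing

print(as_text.tolist())  # ['nan', 'w', '1']
print(kept.tolist())     # [None, 'w', '1']
```

If the runtime still rejects the column, the failure is more likely related to how the column type is inferred than to the string conversion itself, but that is a guess based on the edit in the question.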
https://stackoverflow.com/questions/64688549