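How do I apply multiple transformers to a single pandas DataFrame column using the ColumnTransformer API?
For example, I want to take the cube root of the values in a DataFrame column and then standardize them: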
df = pd.DataFrame(
    np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]),
    columns=['a', 'b', 'c']
)
transformer = ColumnTransformer(
    [
        ('root3_std', StandardScaler() + FunctionTransformer(np.cbrt), ['a'])  # <-- pseudocode
    ],
    remainder='passthrough'
)
If I write
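transformer = ColumnTransformer(
    [
        ('root3', FunctionTransformer(np.cbrt), ['a']),
        ('standardize', StandardScaler(), ['a'])
    ],
    remainder='passthrough'
)
I get two separate columns, one with the cube roots and the other with the standardized original values. How can I apply both transformers to the column in one go?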
Answered 2020-10-28 09:42:22
from sklearn.pipeline import Pipeline
import pandas as pd
import numpy as np
from sklearn.preprocessing import FunctionTransformer, StandardScaler

df = pd.DataFrame(
    np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]),
    columns=['a', 'b', 'c']
)
# Chain the two steps: the cube root runs first, then its output is standardized
pipe = Pipeline([('function_transformer', FunctionTransformer(np.cbrt)),
                 ('standard_scaler', StandardScaler())])
pipe.fit_transform(df[['a']])
# Output:
array([[-1.32381804],
       [ 0.23106179],
       [ 1.09275626]])
https://stackoverflow.com/questions/64568504
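The answer above runs the pipeline on df[['a']] directly. To get what the question asks for, the same Pipeline can probably be passed to ColumnTransformer as the transformer for column 'a', with remainder='passthrough' keeping 'b' and 'c' unchanged. The following is a minimal sketch under that assumption. The step names and the `result` variable are illustrative and not from the original post, and the column is selected as the list ['a'] so that StandardScaler receives 2-D input.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

df = pd.DataFrame(
    np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]),
    columns=['a', 'b', 'c']
)

# Chain both steps so they run in sequence on the same column
# (step names here are illustrative)
root3_std = Pipeline([
    ('root3', FunctionTransformer(np.cbrt)),
    ('standardize', StandardScaler()),
])

# Apply the chained steps to column 'a' only; 'b' and 'c' pass through unchanged.
# ['a'] (a list) selects a 2-D slice, which StandardScaler requires.
transformer = ColumnTransformer(
    [('root3_std', root3_std, ['a'])],
    remainder='passthrough'
)

result = transformer.fit_transform(df)
print(result)
```

With this setup the first column of the result should hold the standardized cube roots of 'a', followed by the untouched 'b' and 'c' columns.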