
Pandas Melt: From Columns to Rows

Stack Overflow user
Asked 2019-09-28 05:09:05
Answers: 1 · Views: 585 · Followers: 0 · Votes: 1

This is my first question after three days of teaching myself Python, so please go easy on me.

I have concatenated four DataFrames:

frames = [dfLocationID, dfDimensions, dfCategories, dfTags]  
result = pd.concat(frames,
                   ignore_index=True,
                   sort=False)

which gives me the following:

        Location ID   Dimensions     Categories              Tags
    0        1000.0          NaN            NaN               NaN
    1           NaN  3,000 sq ft            NaN               NaN
    2           NaN          NaN  * In the Zone               NaN
    3           NaN          NaN      Apartment               NaN
    4           NaN          NaN           Loft               NaN
    5           NaN          NaN            NaN          Bohemian
    6           NaN          NaN            NaN          Colorful
    7           NaN          NaN            NaN   Eclectic Quirky
    8           NaN          NaN            NaN           Kitchen
    9           NaN          NaN            NaN       Living Room
    10          NaN          NaN            NaN             Piano
    11          NaN          NaN            NaN        Wood Floor
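
For anyone who wants to reproduce this, here is a minimal sketch of the setup. The contents of the four frames are an assumption, inferred from the printed output above (one single-column frame per field); the real frames may be built differently.

```python
import pandas as pd

# Assumed inputs: one single-column frame per field, values taken from the output above
dfLocationID = pd.DataFrame({"Location ID": [1000.0]})
dfDimensions = pd.DataFrame({"Dimensions": ["3,000 sq ft"]})
dfCategories = pd.DataFrame({"Categories": ["* In the Zone", "Apartment", "Loft"]})
dfTags = pd.DataFrame({"Tags": ["Bohemian", "Colorful", "Eclectic Quirky", "Kitchen",
                                "Living Room", "Piano", "Wood Floor"]})

# Stacking frames that share no columns leaves NaN wherever a column is missing
frames = [dfLocationID, dfDimensions, dfCategories, dfTags]
result = pd.concat(frames, ignore_index=True, sort=False)
print(result)
```

Because the frames have no columns in common, each row ends up with exactly one non-null value, which is the staircase pattern shown above.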

What I want to end up with is this:

        Location ID   Dimensions        Item              Data
    0        1000.0  3,000 sq ft  Categories     * In the Zone
    1        1000.0  3,000 sq ft  Categories         Apartment
    2        1000.0  3,000 sq ft  Categories              Loft
    3        1000.0  3,000 sq ft        Tags          Bohemian
    4        1000.0  3,000 sq ft        Tags          Colorful
    5        1000.0  3,000 sq ft        Tags   Eclectic Quirky
    6        1000.0  3,000 sq ft        Tags           Kitchen
    7        1000.0  3,000 sq ft        Tags       Living Room
    8        1000.0  3,000 sq ft        Tags             Piano
    9        1000.0  3,000 sq ft        Tags        Wood Floor

So I tried this:

dfTemp = ((dfLocationID.join(dfDimensions, how='outer')).join(dfCategories, how='outer')).join(dfTags, how='outer')

which gives:

       Location ID   Dimensions     Categories              Tags
    0       1000.0  3,000 sq ft  * In the Zone          Bohemian
    1          NaN          NaN      Apartment          Colorful
    2          NaN          NaN           Loft   Eclectic Quirky
    3          NaN          NaN            NaN           Kitchen
    4          NaN          NaN            NaN       Living Room
    5          NaN          NaN            NaN             Piano
    6          NaN          NaN            NaN        Wood Floor

Now I try to turn the last two columns into rows:

dfFinal = dfTemp.melt(id_vars=["Location ID", "Dimensions"],
                          var_name="Item",
                          value_name="Data")

But what I get is:

        Location ID   Dimensions        Item              Data
    0        1000.0  3,000 sq ft  Categories     * In the Zone
    1           NaN          NaN  Categories         Apartment
    2           NaN          NaN  Categories              Loft
    3           NaN          NaN  Categories               NaN
    4           NaN          NaN  Categories               NaN
    5           NaN          NaN  Categories               NaN
    6           NaN          NaN  Categories               NaN
    7        1000.0  3,000 sq ft        Tags          Bohemian
    8           NaN          NaN        Tags          Colorful
    9           NaN          NaN        Tags   Eclectic Quirky
    10          NaN          NaN        Tags           Kitchen
    11          NaN          NaN        Tags       Living Room
    12          NaN          NaN        Tags             Piano
    13          NaN          NaN        Tags        Wood Floor
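
The NaN values in the id columns appear because melt repeats each row's id values exactly as they are, and after the outer join only row 0 of dfTemp carries a Location ID and Dimensions. One possible way around it, sketched here on a shortened, made-up copy of dfTemp and assuming the frame holds a single location, is to forward-fill the id columns before melting and drop the empty Data rows afterwards:

```python
import pandas as pd

# Shortened stand-in for dfTemp (illustrative values only)
dfTemp = pd.DataFrame({
    "Location ID": [1000.0, None, None],
    "Dimensions": ["3,000 sq ft", None, None],
    "Categories": ["Apartment", "Loft", None],
    "Tags": ["Bohemian", "Colorful", "Piano"],
})

id_cols = ["Location ID", "Dimensions"]

# Copy the values from row 0 down into the rows below it
filled = dfTemp.copy()
filled[id_cols] = filled[id_cols].ffill()

# Melt, then discard rows whose Data cell was only padding from the join
dfFinal = (filled.melt(id_vars=id_cols, var_name="Item", value_name="Data")
                 .dropna(subset=["Data"])
                 .reset_index(drop=True))
print(dfFinal)
```

Forward-filling is only safe here because every row belongs to the same location; with several locations in one frame the fill would have to be done per location.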

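Any suggestions on how to clean up the data? Also, I will have to iterate over different Location IDs, and the number of values in Categories and Tags will not be constant.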

Thanks.


1 Answer

Stack Overflow user

Answered 2019-09-28 06:31:03

First, I would convert the NaN values to None, since they are easier to work with:

df = df.where((pd.notnull(df)), None)

Next, you want the first and second columns to hold the same value in every row (I don't know where you get that assumption from):

df['Location ID'] = df['Location ID'].iloc[0]
df['Dimensions'] = df['Dimensions'].iloc[1]

Then you can run your melt call as is. After that, you only need to filter out every row where the "Item" or "Data" column is None:

df = df[~(df["Item"].isnull() | df["Data"].isnull())]

The output is then what you wanted:

        Location ID  Dimensions        Item             Data
    2        1000.0  3000 sq ft  Categories    * In the Zone
    3        1000.0  3000 sq ft  Categories        Apartment
    4        1000.0  3000 sq ft  Categories             Loft
    17       1000.0  3000 sq ft        Tags         Bohemian
    18       1000.0  3000 sq ft        Tags         Colorful
    19       1000.0  3000 sq ft        Tags  Eclectic Quirky
    20       1000.0  3000 sq ft        Tags          Kitchen
    21       1000.0  3000 sq ft        Tags      Living Room
    22       1000.0  3000 sq ft        Tags            Piano
    23       1000.0  3000 sq ft        Tags       Wood Floor
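
Put together, the steps above might look like the sketch below. It assumes df is the concatenated frame from the question (rebuilt here from assumed inputs) and skips the None conversion, since isnull() treats NaN and None the same way. The names melted and final are only illustrative.

```python
import pandas as pd

# Rebuild the concatenated frame from the question (assumed inputs)
frames = [
    pd.DataFrame({"Location ID": [1000.0]}),
    pd.DataFrame({"Dimensions": ["3,000 sq ft"]}),
    pd.DataFrame({"Categories": ["* In the Zone", "Apartment", "Loft"]}),
    pd.DataFrame({"Tags": ["Bohemian", "Colorful", "Eclectic Quirky", "Kitchen",
                           "Living Room", "Piano", "Wood Floor"]}),
]
df = pd.concat(frames, ignore_index=True, sort=False)

# Broadcast the single known value of each id column to every row
df["Location ID"] = df["Location ID"].iloc[0]
df["Dimensions"] = df["Dimensions"].iloc[1]

# Melt the remaining columns (Categories, Tags) into Item/Data pairs
melted = df.melt(id_vars=["Location ID", "Dimensions"],
                 var_name="Item",
                 value_name="Data")

# Keep only rows that actually carry a value
final = melted[~(melted["Item"].isnull() | melted["Data"].isnull())]
print(final)
```

The surviving index labels (2 to 4 and 17 to 23) are positions in the melted 24-row frame; append .reset_index(drop=True) if a clean 0 to 9 index is preferred.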

If you need to do this for different locations, wrap this process in a function called transform and use groupby:

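df_new = pd.DataFrame(columns=df.columns)
for name, group in df.groupby(['Location ID', 'Dimensions']):
    # transform is the function wrapping the steps above
    df_group = transform(group)
    # pd.concat returns a new DataFrame, so the result has to be assigned back
    df_new = pd.concat([df_new, df_group], axis=0)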
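
Here is one way transform could be written, as a sketch only. The helper make_block and the second location's data are made up for illustration. Two deviations from the snippet above are worth flagging: groupby drops rows whose key is NaN, so Location ID is forward-filled first (assuming each location's rows are contiguous and start with its Location ID row), and the grouping is done on Location ID alone because Dimensions is still mostly NaN at that point.

```python
import pandas as pd

ID_COLS = ["Location ID", "Dimensions"]

def transform(group):
    """Melt one location's rows into long form (hypothetical helper)."""
    group = group.copy()
    for col in ID_COLS:
        # Broadcast the first non-null value of each id column to all rows
        group[col] = group[col].dropna().iloc[0]
    melted = group.melt(id_vars=ID_COLS, var_name="Item", value_name="Data")
    return melted.dropna(subset=["Data"])

def make_block(loc_id, dims, cats, tags):
    """Build one location's concatenated frame, as in the question (made-up data)."""
    frames = [pd.DataFrame({"Location ID": [loc_id]}),
              pd.DataFrame({"Dimensions": [dims]}),
              pd.DataFrame({"Categories": cats}),
              pd.DataFrame({"Tags": tags})]
    return pd.concat(frames, ignore_index=True, sort=False)

df = pd.concat([make_block(1000.0, "3,000 sq ft", ["Apartment", "Loft"], ["Piano"]),
                make_block(2000.0, "1,200 sq ft", ["Studio"], ["Kitchen", "Wood Floor"])],
               ignore_index=True)

# Label every row with its location so that groupby keeps all of them
df["Location ID"] = df["Location ID"].ffill()

# Collect the per-location results and concatenate once at the end
pieces = [transform(group) for _, group in df.groupby("Location ID")]
df_new = pd.concat(pieces, ignore_index=True)
print(df_new)
```

Collecting the pieces in a list and calling pd.concat once avoids re-copying the accumulated frame on every iteration, and it copes with a different number of Categories and Tags per location.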

Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/58141356
