I have the following code:
import pandas as pd

# Load the workbook and read the sheet that holds the raw order lines.
y = pd.ExcelFile('C:\\Users\\vibhu\\Desktop\\Training docs\\excel training\\super store data transformation\\Sample - Superstore data transformation by Vaibhav.xlsx')
superstore_orders = y.parse(sheet_name='Orders Input data')
superstore_orders.dtypes

# Fact table: the keys plus the measures.
factual_table = superstore_orders[['Order ID', 'Customer ID', 'Postal Code', 'Product ID', 'Product Name', 'Sales', 'Quantity', 'Discount', 'Profit']]

# Dimension tables, each de-duplicated on its key.
Order_table = superstore_orders[['Order ID', 'Order Date', 'Ship Date', 'Ship Mode']]
Order_table1 = Order_table.drop_duplicates(subset='Order ID', keep='first', inplace=False)
Customer_table = superstore_orders[['Customer ID', 'Customer Name', 'Segment']]
Customer_table1 = Customer_table.drop_duplicates(subset='Customer ID', keep='first', inplace=False)
Geographical_table = superstore_orders[['Postal Code', 'Country', 'City', 'State', 'Region']]
Geographical_table1 = Geographical_table.drop_duplicates(subset='Postal Code', keep='first', inplace=False)
Product_table = superstore_orders[['Product ID', 'Category', 'Sub-Category', 'Product Name']]
Product_table1 = Product_table.drop_duplicates(subset=['Product ID', 'Product Name'], keep='first', inplace=False)

# Join the dimension tables back onto the fact table, one key at a time.
Final_factual_data = pd.merge(Order_table1, factual_table, how='left', on='Order ID')
Final_factual_data = pd.merge(Customer_table1, Final_factual_data, how='left', on='Customer ID')
Final_factual_data = pd.merge(Geographical_table1, Final_factual_data, how='left', on='Postal Code')
Final_factual_data = pd.merge(Product_table1, Final_factual_data, how='left', on=['Product ID', 'Product Name'])
The output has its columns in this order: Product ID, Category, Sub-Category, Product Name, Postal Code, Country, City, State, Region, Customer ID, Customer Name, Segment, Order ID, Order Date, Ship Date, Ship Mode, Sales, Quantity, Discount, Profit.
I need to rearrange them into the following order:
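Order ID, Order Date, Ship Date, Ship Mode, Customer ID, Customer Name, Segment, Postal Code, Country, City, State, Region, Product ID, Product Name, Product Key, Category, Sub-Category, Sales, Quantity, Discount, Profit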
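The column order in the merged output follows from how `pd.merge` lays out its result: the left frame's columns come first, followed by the right frame's non-key columns. Because each dimension table is passed as the left argument, the dimension columns end up ahead of the fact columns. The sketch below illustrates this on toy data; the frames `products` and `facts` and their values are made up for illustration and are not taken from the workbook.

```python
import pandas as pd

# Hypothetical stand-ins for one dimension table and the fact table.
products = pd.DataFrame({
    "Product ID": ["P1", "P2"],
    "Category": ["Furniture", "Technology"],
})
facts = pd.DataFrame({
    "Order ID": ["O1", "O2", "O3"],
    "Product ID": ["P1", "P2", "P1"],
    "Sales": [10.0, 20.0, 30.0],
})

# Dimension table on the left: its columns lead the result.
dim_left = pd.merge(products, facts, how="left", on="Product ID")
print(list(dim_left.columns))

# Fact table on the left: the fact columns lead instead.
fact_left = pd.merge(facts, products, how="left", on="Product ID")
print(list(fact_left.columns))
```

Swapping the argument order changes the layout but rarely yields exactly the order wanted, so an explicit reordering step after the merges is usually still needed.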
Posted 2019-12-26 08:28:36
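Final_factual_data1 = Final_factual_data[['Order ID', 'Order Date', 'Ship Date', 'Ship Mode', 'Customer ID', 'Customer Name', 'Segment', 'Country', 'City', 'State', 'Postal Code', 'Region', 'Product ID', 'Category', 'Sub-Category', 'Product Name', 'Sales', 'Quantity', 'Discount', 'Profit']]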
This code gave me the result I wanted.
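For anyone comparing approaches, here is a minimal sketch on a toy frame (the data and the names `df` and `desired` are made up for illustration). It shows that selecting with a list of names reorders the columns while each value stays with its own column, that `reindex(columns=...)` gives the same result when every name exists, and that assigning a list to `.columns` only relabels by position.

```python
import pandas as pd

# Hypothetical frame whose columns are not in the wanted order.
df = pd.DataFrame({
    "Sales": [10.0, 20.0],
    "Order ID": ["O1", "O2"],
    "Profit": [1.5, 2.5],
})
desired = ["Order ID", "Sales", "Profit"]

# Selecting with a list of names returns the columns in that order.
reordered = df[desired]

# reindex(columns=...) is equivalent here because every name exists.
same = df.reindex(columns=desired)

# Assigning to .columns relabels by position and moves no data,
# so the column now called "Order ID" still holds the old Sales values.
relabeled = df.copy()
relabeled.columns = desired
print(relabeled["Order ID"].tolist())
```

The list-selection form raises a `KeyError` on a misspelled name, which makes it a reasonable default for catching typos in a long column list.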
Posted 2019-12-25 10:42:38
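Just assign the ordered sequence of names to the columns attribute. Note that this only relabels the columns by position and does not move any data, so it fits only when the values are already in the intended positions: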
Final_factual_data.columns = ['Order ID', 'order date', 'ship date', 'ship mode', 'Customer ID', 'customer name', 'segment', 'Postal Code', 'country', 'city', 'state region', 'Product ID', 'Product Name', 'product key', 'category', 'subcategory', 'Sales', 'Quantity', 'Discount', 'Profit']
https://stackoverflow.com/questions/59477408