These are the datasets I am using while trying out Featuretools.
data
   Unit Price  Customer Name    Product Category  Region   Profit       Quantity ordered new  Sales    Order ID
0  2.88        Janice Fletcher  Office Supplies   Central  1.320000     2                     5.90     88525
1  2.84        Bonnie Potter    Office Supplies   West     4.560000     4                     13.01    88522
2  6.68        Bonnie Potter    Office Supplies   West     -47.640000   7                     49.92    88523
3  5.68        Bonnie Potter    Office Supplies   West     -30.510000   7                     41.64    88523
4  205.99      Bonnie Potter    Technology        West     998.202300   8                     1446.67  88523
9426 rows × 8 columns
returns
   Order ID  Status
0  65        Returned
1  612       Returned
2  614       Returned
3  678       Returned
4  710       Returned
1634 rows × 2 columns
users
   Region   Manager
0  Central  Chris
1  East     Erin
2  South    Sam
3  West     William
import featuretools as ft

entities = {
    "data": (data, "Order ID"),
    "returns": (returns, "Status"),
    "users": (users, "Manager")}
relationships = [
    ('data', 'Order ID', 'returns', 'Order ID'),
    ('data', 'Region', 'users', 'Region')]
combined_table, features_defs = ft.dfs(entities=entities,
                                       relationships=relationships,
                                       target_entity="Unit Price")
combined_table
This is the error message I get:
AssertionError: Index is not unique on dataframe (Entity data)
Can anyone tell me what I am doing wrong?
Posted on 2020-06-26 03:17:56
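The index values of each entity have to be unique. In your data entity, the Order ID values are not all unique (88523, for example, appears on several of the rows shown), so Order ID cannot be used as that entity's index.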
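One way to confirm this before calling ft.dfs is to check the candidate index column with pandas. The following is a minimal sketch, not a definitive fix: the DataFrame is a small stand-in containing only the Order ID and Sales values from the five rows shown in the question, and the row_id column name is an assumption made for illustration.

```python
import pandas as pd

# Stand-in for the `data` table: two columns, using the five rows
# shown in the question.
data = pd.DataFrame({
    "Order ID": [88525, 88522, 88523, 88523, 88523],
    "Sales": [5.90, 13.01, 49.92, 41.64, 1446.67],
})

# An entity index must identify exactly one row, so check for repeats first.
print(data["Order ID"].is_unique)  # False

# List the Order ID values that occur on more than one row.
duplicated_ids = data.loc[data["Order ID"].duplicated(keep=False), "Order ID"].unique()
print(duplicated_ids.tolist())  # [88523]

# One possible fix: add a surrogate key column that is unique per row.
# "row_id" is an assumed name, not something Featuretools requires.
data = data.reset_index(drop=True)
data["row_id"] = data.index
print(data["row_id"].is_unique)  # True
```

The entity could then be declared as "data": (data, "row_id"), with Order ID kept as an ordinary column. The same check is worth running on the other entities: in the rows shown, Status is "Returned" on every line, so it would probably fail as an index for returns as well.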
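In addition:
target_entity = "Unit Price" will not work, because you have to supply an entity (data, returns, or users) rather than a column of a table/entity. On each run, Featuretools generates features for only one table/entity, not for all of them.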
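Both points can be checked together before the dfs call. This is a rough sketch under stated assumptions: check_dfs_inputs is a hypothetical helper (it is not part of Featuretools), the DataFrames are small stand-ins built from rows in the question, and the only thing assumed about the API is the entities dictionary shape used in the question, name -> (dataframe, index column).

```python
import pandas as pd


def check_dfs_inputs(entities, target_entity):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper, not part of Featuretools. `entities` maps an entity
    name to a (dataframe, index_column) tuple, as in the question.
    """
    problems = []
    # target_entity must name an entity (a key of `entities`), not a column.
    if target_entity not in entities:
        problems.append(
            f"target_entity {target_entity!r} is not one of {sorted(entities)}"
        )
    # Every entity's index column must exist and hold unique values.
    for name, (frame, index_col) in entities.items():
        if index_col not in frame.columns:
            problems.append(f"{name}: no column named {index_col!r}")
        elif not frame[index_col].is_unique:
            problems.append(f"{name}: index {index_col!r} is not unique")
    return problems


# Stand-in tables with a few rows from the question.
data = pd.DataFrame({"Order ID": [88525, 88522, 88523, 88523],
                     "Region": ["Central", "West", "West", "West"]})
users = pd.DataFrame({"Region": ["Central", "East", "South", "West"],
                      "Manager": ["Chris", "Erin", "Sam", "William"]})

# Arguments shaped like the original ones: two problems are reported.
bad = check_dfs_inputs({"data": (data, "Order ID"), "users": (users, "Manager")},
                       target_entity="Unit Price")
for problem in bad:
    print(problem)

# A repaired version: a surrogate key for `data` (assumed column name) and
# an entity name as the target. Using Region as the index of `users` is an
# assumption, chosen because the relationship joins on that column.
data["row_id"] = range(len(data))
good = check_dfs_inputs({"data": (data, "row_id"), "users": (users, "Region")},
                        target_entity="data")
print(good)  # []
```

With inputs like the repaired ones, the call from the question would take target_entity="data" (or "returns" or "users"), depending on which table the generated features should describe.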
https://stackoverflow.com/questions/62555372