Q: How to use pandas.to_sql but only add rows that do not already exist

Stack Overflow user

Asked on 2020-05-11 08:32:40

4 answers · 6.6K views · 0 followers · 9 votes

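I have some experience with Python but am very new to SQL. I am trying to use pandas.to_sql to add table data to my database, but before appending I want to check whether the data already exists.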

Here are my two DataFrames:

>>> df0.to_markdown()
|    |   Col1 |   Col2 |
|---:|-------:|-------:|
|  0 |      0 |     00 |
|  1 |      1 |     11 |

>>> df1.to_markdown()
|    |   Col1 |   Col2 |
|---:|-------:|-------:|
|  0 |      0 |     00 |
|  1 |      1 |     11 |
|  2 |      2 |     22 |

So here I use pandas to_sql:

>>> df0.to_sql(con=con, name='test_db', if_exists='append', index=False)
>>> df1.to_sql(con=con, name='test_db', if_exists='append', index=False)

Here I check the data in the database file:

>>> df_out = pd.read_sql("""SELECT * FROM test_db""", con)
>>> df_out.to_markdown()
|    |   Col1 |   Col2 |
|---:|-------:|-------:|
|  0 |      0 |      0 |
|  1 |      1 |     11 |
|  2 |      0 |      0 | # Duplicate
|  3 |      1 |     11 | # Duplicate
|  4 |      2 |     22 |

But I want my database to look like this, so I do not want the duplicate rows to be added to it:

|    |   Col1 |   Col2 |
|---:|-------:|-------:|
|  0 |      0 |      0 |
|  1 |      1 |     11 |
|  3 |      2 |     22 |

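Is there an option I can set, or a line of code I can add, to achieve this?

Thanks!

Edit: there is SQL that can pull out only the unique rows when reading, but what I want is to avoid adding the duplicate data to the database in the first place.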

python

mysql

pandas

pandas-to-sql

4 Answers

Stack Overflow user

Answered on 2021-04-06 13:58:40

Don't use to_sql; a simple query will do the job:

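from sqlalchemy import text

# PostgreSQL syntax: rows that violate the test_db_pkey constraint are skipped.
query = text(f""" INSERT INTO test_db VALUES {','.join([str(i) for i in list(df0.to_records(index=False))])} ON CONFLICT ON CONSTRAINT test_db_pkey DO NOTHING""")

with self.engine.begin() as conn:  # begin() commits when the block exits
    conn.execute(query)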

Repeat this for each DataFrame, changing df0 to df1.

See these links for a better understanding:

Insert values if records don't already exist in Postgres

How to upsert pandas DataFrame to PostgreSQL table?
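As a rough, self-contained illustration of the same "skip on conflict" idea, here is a sketch that runs against an in-memory SQLite database instead of PostgreSQL. It swaps in SQLite's `INSERT OR IGNORE` (MySQL has a similar `INSERT IGNORE`) and uses bound parameters rather than building the VALUES list as a string. It assumes Col1 is the key that defines a duplicate; the helper name `insert_new_rows` is made up for this example, and the tuples stand in for what `df.itertuples(index=False, name=None)` would yield.

```python
import sqlite3

# Plain tuples standing in for the rows of df0 and df1.
rows_df0 = [(0, 0), (1, 11)]
rows_df1 = [(0, 0), (1, 11), (2, 22)]

con = sqlite3.connect(":memory:")
# Assumption: Col1 uniquely identifies a row, so it is the primary key.
con.execute("CREATE TABLE test_db (Col1 INTEGER PRIMARY KEY, Col2 INTEGER)")


def insert_new_rows(con, rows):
    """Insert rows, skipping any whose Col1 already exists in test_db."""
    con.executemany(
        "INSERT OR IGNORE INTO test_db (Col1, Col2) VALUES (?, ?)", rows
    )
    con.commit()


insert_new_rows(con, rows_df0)
insert_new_rows(con, rows_df1)  # only (2, 22) is new

result = con.execute("SELECT Col1, Col2 FROM test_db ORDER BY Col1").fetchall()
print(result)  # [(0, 0), (1, 11), (2, 22)]
```

This only works if the table has a primary key or unique constraint for the database to enforce; a table created implicitly by to_sql has none, so the constraint would have to be defined up front.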

5 votes

Stack Overflow user

Answered on 2022-04-26 05:13:12

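For example, in my SQLite3 database I used a temporary table.

I insert all the DataFrame data into the temporary table:

df0.to_sql(con=con, name='temptable', if_exists='append', index=False)

df1.to_sql(con=con, name='temptable', if_exists='append', index=False)

Then I copy only the new data and drop the temporary table: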

con.executescript('''
INSERT INTO test_db
SELECT temptable.* FROM temptable 
LEFT JOIN test_db on 
   test_db.Col1 = temptable.Col1
WHERE test_db.Col1 IS NULL; -- only rows that are not yet present in the 'test_db' table

DROP TABLE temptable;
''')
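For anyone who wants to try the staging-table pattern end to end, the following is a minimal sketch using only the standard-library sqlite3 module. It is a variation, not the answer's exact code: it matches on both Col1 and Col2, uses `NOT EXISTS` in place of the LEFT JOIN, and adds `DISTINCT` so duplicates inside one batch are also collapsed. The table name `staging` and the helper `append_new_rows` are invented for the example, and the tuples stand in for DataFrame rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test_db (Col1 INTEGER, Col2 INTEGER)")
# The table starts out holding the rows of df0.
con.executemany("INSERT INTO test_db VALUES (?, ?)", [(0, 0), (1, 11)])
con.commit()


def append_new_rows(con, rows):
    """Stage rows in a temporary table, then copy over only unseen ones."""
    con.execute("CREATE TEMP TABLE staging (Col1 INTEGER, Col2 INTEGER)")
    try:
        con.executemany("INSERT INTO staging VALUES (?, ?)", rows)
        con.execute("""
            INSERT INTO test_db (Col1, Col2)
            SELECT DISTINCT s.Col1, s.Col2
            FROM staging AS s
            WHERE NOT EXISTS (
                SELECT 1 FROM test_db AS t
                WHERE t.Col1 = s.Col1 AND t.Col2 = s.Col2
            )
        """)
        con.commit()
    finally:
        # Always remove the staging table, even if the copy failed.
        con.execute("DROP TABLE staging")


append_new_rows(con, [(0, 0), (1, 11), (2, 22)])  # the rows of df1

result = con.execute("SELECT Col1, Col2 FROM test_db ORDER BY Col1").fetchall()
print(result)  # [(0, 0), (1, 11), (2, 22)]
```

Unlike a constraint-based insert, this needs no primary key on test_db, at the cost of writing every batch twice.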

0 votes

Stack Overflow user

Answered on 2022-09-26 09:20:57

There are two ways:

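1. If the data in the database is not too large, read it into a DataFrame and combine the two columns (Col1 and Col2) into a new column, combined_column, saving its values to a list, combined_column_list. From df0 and df1, keep only the rows whose combined_column does not appear in combined_column_list, and insert those rows directly into the database table.

2. Insert df0 and df1 into a temporary table, for example one named "temp". Then run the following code using pymysql:

conn = pymysql.connect(host=DB_ip, user=DB_user, passwd=DB_password, db=DB_name)
cur = conn.cursor()
temp_query = "insert into test_db select * from temp where (Col1, Col2) not in (select Col1, Col2 from test_db);"
cur.execute(temp_query)
conn.commit()

This inserts only the new data into the database table.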
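The first approach can be sketched in pandas roughly as follows. Instead of concatenating Col1 and Col2 into a combined column, this version uses a left merge with `indicator=True` as an anti-join, which avoids string-building; that is a substitution, not what the answer literally describes. It runs against an in-memory SQLite database for convenience, assumes the target table already exists and is non-empty, and the helper name `append_new_rows` and its parameters are made up for illustration.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
df0 = pd.DataFrame({"Col1": [0, 1], "Col2": [0, 11]})
df1 = pd.DataFrame({"Col1": [0, 1, 2], "Col2": [0, 11, 22]})


def append_new_rows(df, con, table="test_db", keys=("Col1", "Col2")):
    """Append only the rows of df whose key columns are not in the table."""
    keys = list(keys)
    # Read just the key columns already stored in the database.
    existing = pd.read_sql(f"SELECT DISTINCT {', '.join(keys)} FROM {table}", con)
    # Left merge with an indicator column: 'left_only' marks unseen rows.
    merged = df.merge(existing, on=keys, how="left", indicator=True)
    new_rows = merged.loc[merged["_merge"] == "left_only", df.columns]
    new_rows.to_sql(table, con, if_exists="append", index=False)
    return len(new_rows)


df0.to_sql("test_db", con, if_exists="append", index=False)  # initial load
added = append_new_rows(df1, con)  # only the (2, 22) row is new

df_out = pd.read_sql("SELECT * FROM test_db", con)
print(added)
print(df_out)
```

Because the existing keys are pulled into memory, this suits small tables only, as the answer notes; it is also not safe against another writer inserting the same rows between the read and the write.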

0 votes

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/61725432
