Question: Transaction rollback

Stack Overflow user

Asked 2019-11-27 09:40:26

2 answers · 505 views · 0 followers · 0 votes

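I have a large list whose elements are 53 million smaller lists. I want to insert each smaller list as a row into a database in batches of 1,000,000: each time the script connects to the db it commits 1,000,000 rows, disconnects, and then reconnects for the next 1,000,000 rows.

My problem is this: if an error occurs partway through, for example after 50,000,000 rows have been committed, I need to delete all the rows in the db and try to commit everything again from the start.

I was thinking I could use rollback() to remove all 50,000,000 rows added so far, but because I am using a loop, I don't know how to roll back all 50,000,000 rows that were committed in batches.

Does anyone have a suggestion?

Here is my script; "results" is the list whose elements are the 53,000,000 smaller lists.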

batch = []
counter = 0
BATCH_SIZE =1000000
cursor_count = 0

def prepare_names(names):
    return [w.replace("'", '') for w in names]

for i in range(len(results)):
    if counter < BATCH_SIZE:
        batch.append(prepare_names([results[i][0], results[i][1], results[i][2]]))  # batch => [[ACC1234.0, 'Some full taxa name'], ...]
        counter += 1
    else:
        batch.append(prepare_names([results[i][0], results[i][1], results[i][2]]))

        values = (", ").join([f"('{d[0]}', '{d[1]}', '{d[2]}')" for d in batch])
        sql = f"INSERT INTO accession_taxonomy(accession_number, taxon_id, taxonomy) VALUES {values}"

        try:
            cursor.execute(sql)
            db.commit()
        except Exception as exception:
            print(exception)
            print(f"Problem with query: {sql}")

        print(cursor.rowcount, "Records Inserted")
        cursor_count += cursor.rowcount
        counter = 0
        batch = []
else:
    if batch:
        values = (", ").join([f"('{d[0]}', '{d[1]}', '{d[2]}')" for d in batch])
        sql = f"INSERT INTO accession_taxonomy(accession_number, taxon_id, taxonomy) VALUES {values}"

        try:
            cursor.execute(sql)
            db.commit()
        except Exception as exception:
            print(exception)
            print(f"Problem with query: {sql}")

        print(cursor.rowcount, "Records Inserted")
        cursor_count += cursor.rowcount

print("Total Number Of %s Rows Has Been Added." %(cursor_count))
db.close()

mysql

transactions

rollback

python

2 Answers

Stack Overflow user

Accepted answer

Posted 2019-11-27 10:05:28

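I would use some flags to make sure that something was inserted and that nothing went wrong.

Then use these flags to choose whether to commit or roll back, for example: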

nothing_wrong_happened = True
something_was_inserted = False

for i in range(len(results)):

    # Your code that generates the query

        try:
            cursor.execute(sql)
            something_was_inserted = True  # <-- you inserted something
        except Exception as exception:
            nothing_wrong_happened = False # <-- Something bad happened
            print(exception)
            print(f"Problem with query: {sql}")

        # the rest of your code
else:

    # Your code that generates the query

        try:
            cursor.execute(sql)
            something_was_inserted = True  # <-- you inserted something
        except Exception as exception:
            nothing_wrong_happened = False # <-- Something bad happened
            print(exception)
            print(f"Problem with query: {sql}")

        # the rest of your code

# The loop is now over
if (something_was_inserted):
    if (nothing_wrong_happened):
        db.commit()   # commit everything
    else:
        db.rollback() # rollback everything
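
Note that this pattern only helps if nothing is committed inside the loop, so that every batch belongs to the same open transaction (and, in MySQL, if autocommit is off and the table uses a transactional engine such as InnoDB). As a minimal sketch of the same idea, the version below replaces the two flags with a try/except around the whole loop. It is not the answerer's code: the function name insert_all, the placeholder parameter, and the default batch size are hypothetical, and Python's built-in sqlite3 module stands in for the MySQL connection so the sketch can run anywhere. With a MySQL driver the placeholder would typically be %s instead of ?.

```python
import sqlite3
from itertools import islice


def insert_all(db, rows, batch_size=1000, placeholder="?"):
    """Insert every row inside ONE transaction; undo everything on any error.

    `db` is any DB-API connection with autocommit disabled. The name
    `insert_all` and its parameters are illustrative assumptions.
    """
    p = placeholder
    sql = (
        "INSERT INTO accession_taxonomy(accession_number, taxon_id, taxonomy) "
        f"VALUES ({p}, {p}, {p})"
    )
    cursor = db.cursor()
    rows = iter(rows)
    inserted = 0
    try:
        while True:
            # Take the next slice of at most batch_size rows.
            batch = list(islice(rows, batch_size))
            if not batch:
                break
            # Parameterized insert: the driver handles quoting of values.
            cursor.executemany(sql, batch)
            inserted += len(batch)
        # Reached only if every batch succeeded.
        db.commit()
    except Exception:
        # Nothing was committed yet, so this discards every batch.
        db.rollback()
        raise
    return inserted
```

Because the values are passed as parameters rather than formatted into the SQL string, single quotes in the names no longer need to be stripped beforehand.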

0 votes

Stack Overflow user

Posted 2019-11-27 10:02:29

There is no rollback after a commit.

Consider:

1st Attempt 1M rows : committed
2nd Attempt 1M rows : committed
3rd Attempt 1M rows : error

You can only roll back the third attempt. The first and second are already final.
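
This can be checked with a small experiment. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so that it runs without a server); the point it illustrates, that rollback() discards only work that has not yet been committed, is general transaction behavior.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accession_taxonomy(accession_number TEXT)")

# Batch 1: committed, so it is now permanent.
db.execute("INSERT INTO accession_taxonomy VALUES ('ACC1')")
db.commit()

# Batch 2: still uncommitted when the simulated error happens.
db.execute("INSERT INTO accession_taxonomy VALUES ('ACC2')")
db.rollback()  # undoes only the uncommitted batch 2

remaining = [
    row[0]
    for row in db.execute("SELECT accession_number FROM accession_taxonomy")
]
print(remaining)  # ['ACC1']
```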

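The workaround is to alter the accession_taxonomy table and add a field (call it insertHash). Your batch process sets a value for this field that is unique to this batch run, say today's date. If any insert step fails, you can then run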

Delete T from accession_taxonomy T Where T.insertHash ='TheValueUSet'

So essentially it looks like this:

1st Attempt 1M rows : committed
2nd Attempt 1M rows : committed
3rd Attempt 1M rows : error
Delete AllRows where insertHash = 'TheValueUSet'
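
The steps above could be sketched in Python roughly as follows. This is a sketch under stated assumptions rather than a definitive implementation: the function name load_in_committed_batches, the use of a random UUID as the run value (instead of today's date), and the placeholder parameter are all hypothetical, and the table is assumed to already have the extra insertHash column. sqlite3 is used as a stand-in connection; a MySQL driver would typically use %s as the placeholder.

```python
import sqlite3
import uuid


def load_in_committed_batches(db, rows, batch_size=1000, placeholder="?"):
    """Commit each batch separately, tagging rows so a failed run can be deleted.

    `rows` is a list of 3-item sequences. All names here are illustrative.
    """
    run_id = uuid.uuid4().hex  # unique value for this run (assumed choice)
    p = placeholder
    insert_sql = (
        "INSERT INTO accession_taxonomy"
        "(accession_number, taxon_id, taxonomy, insertHash) "
        f"VALUES ({p}, {p}, {p}, {p})"
    )
    delete_sql = f"DELETE FROM accession_taxonomy WHERE insertHash = {p}"
    cursor = db.cursor()
    try:
        for start in range(0, len(rows), batch_size):
            # Append the run tag to every row of this batch.
            batch = [(*row, run_id) for row in rows[start:start + batch_size]]
            cursor.executemany(insert_sql, batch)
            db.commit()  # this batch is now permanent
    except Exception:
        db.rollback()                          # discard the half-finished batch
        cursor.execute(delete_sql, (run_id,))  # remove batches already committed
        db.commit()
        raise
    return run_id
```

Rows written by earlier, successful runs carry a different insertHash value, so the cleanup delete leaves them untouched.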

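That said, are you sure you want to send 1 million rows in a single statement? Have you checked whether your server can accept a packet that large?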
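
In MySQL the largest statement the server accepts is governed by the max_allowed_packet system variable, which can be inspected with SHOW VARIABLES LIKE 'max_allowed_packet'. One possible way to stay under such a limit is to cap each batch by an approximate byte budget as well as by a row count. The helper below is a hypothetical sketch: the name batches_within_budget, the per_value_overhead allowance, and the size estimate itself are assumptions, and the estimate is deliberately rough rather than an exact measure of the packet the driver will send.

```python
def batches_within_budget(rows, max_rows, max_bytes, per_value_overhead=4):
    """Yield lists of rows capped by row count and by a rough byte estimate.

    The byte figure approximates each row's size as SQL text; it is an
    estimate only, so max_bytes should sit well below the server limit.
    """
    batch, batch_bytes = [], 0
    for row in rows:
        # UTF-8 length of each value plus a small allowance for quotes/commas.
        row_bytes = sum(
            len(str(value).encode("utf-8")) + per_value_overhead for value in row
        )
        # Close the current batch before it would exceed either cap.
        if batch and (len(batch) >= max_rows or batch_bytes + row_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(row)
        batch_bytes += row_bytes
    if batch:
        yield batch  # final, possibly smaller, batch
```

Each yielded batch can then be passed to a single insert call. A row that is larger than max_bytes on its own is still emitted, as a batch of one.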

0 votes

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/59067164
