Optimizing iteration over lists and dictionaries
Code Review user
Asked on 2021-05-14 19:23:39
1 answer · 57 views · 0 followers · 0 votes

The goal of this project is to parse a large file containing data. I parse the data into a list of dictionaries, then run calculations on that data and can optionally plot it for visualization purposes. I use the data for simple calculations that reward each worker according to their performance. I wrote this so that every worker mining in the pool gets paid fairly, which allows multiple people to mine on the same account for faster payout times.

Example of the saved data file:

Worker_Data.Data:

Data={'pool_current_hashrate': '100215904', 'pool_average_hashrate': '61640734', 'pool_reported_hashrate': '78165786', 'current_hashrate_alex147': '47721859', 'average_hashrate_alex147': '35791394', 'reported_hashrate_alex147': '36895352', 'current_hashrate_henry147': '52494045', 'average_hashrate_henry147': '25849340', 'reported_hashrate_henry147': '41354162', 'time_stamp': '1620751617', 'eth': '0.008999485617836284', 'zil': '4.654624711084'}
Data={'pool_current_hashrate': '100215904', 'pool_average_hashrate': '61640734', 'pool_reported_hashrate': '78337185', 'current_hashrate_alex147': '47721859', 'average_hashrate_alex147': '35791394', 'reported_hashrate_alex147': '36890956', 'current_hashrate_henry147': '52494045', 'average_hashrate_henry147': '25849340', 'reported_hashrate_henry147': '41509445', 'time_stamp': '1620751678', 'eth': '0.008999485617836284', 'zil': '4.654624711084'}

Note: there can be as few as 1 worker or as many as the pool allows, and each one reports the mining rig's current, average, and reported hashrates.

I parse this file, which contains tens of thousands of these lines, to calculate how much "work" each worker did in the time between payouts, and then what share of each change in the balance they are entitled to. Each line becomes a separate dictionary in the list.

Code:

from ast import literal_eval

PATH = "A:\\Python Project\\ezil_api\\Data\\" # path of data file
WORKER_SPLIT = 0.50  # used if start balance is not 0


def make_file(name, config_dict, type_conf, path=PATH):
    with open(path + name + "." + type_conf, "a+") as file:
        for keys, values in zip(config_dict.keys(), config_dict.values()):
            file.write(f"{keys}={values}\n")


def read_data(path, file_name):
    data = []
    with open(path + file_name, "r+") as config:
        lines = config.readlines()
        for line in lines:
            line = line[line.find("=") + 1:]
            line_data = literal_eval(line)
            data.append(line_data)
    return data


def eval_data():
    workers = []
    start_balance_eth = 0
    start_balance_zil = 0
    balance_eth = []
    balance_zil = []
    balance_delta_eth = []
    balance_delta_zil = []
    delta_eth_range = [0]
    time = []
    time_delta = []
    balance_workers_eth = {}
    balance_workers_zil = {}
    hashrate_workers = {}
    integral_worker = {}
    worker_percentage = {}
    b = {}
    odd = 0
    even = 0
    hashrate_pool = []
    balance_eth_delta = []
    total_integral = []
    temp_integral = 0

    files_workers = read_data(path=PATH, file_name="Worker_Data.Data")

    from time import time as t

    for worker_data in files_workers:
        index = files_workers.index(worker_data)

        worker_list_temp = [worker_temp[17:] for worker_temp in worker_data.keys() if "average_hashrate_" in worker_temp]

        for worker in worker_list_temp:
            if worker not in workers:
                workers.append(worker)
                hashrate_workers[worker] = []
                balance_workers_eth[worker] = 0
                balance_workers_zil[worker] = 0
                integral_worker[worker] = []
                worker_percentage[worker] = []
                b[worker] = []

        current_balance_eth = float(worker_data["eth"])
        current_balance_zil = float(worker_data["zil"])
        current_time = int(worker_data["time_stamp"])

        for worker in workers:
            current_worker_in_keys = False
            for keys in worker_data.keys():
                if worker in keys:
                    current_worker_in_keys = True
            if current_worker_in_keys:
                worker_hashrate = worker_data[f"current_hashrate_{worker}"]
                hashrate_workers[worker].append(int(worker_hashrate))
            else:
                hashrate_workers[worker].append(0)

        hashrate_pool.append(float(worker_data["pool_current_hashrate"]))

        if index > 0:
            if current_balance_eth > balance_eth[-1]:
                delta_eth = current_balance_eth - balance_eth[-1]
                balance_delta_eth.append(delta_eth)
            else:
                balance_delta_eth.append(0)
            if current_balance_zil > balance_zil[-1]:
                delta_zil = current_balance_zil - balance_zil[-1]
                balance_delta_zil.append(delta_zil)
            else:
                balance_delta_zil.append(0)

            delta_time = current_time - time[-1]
            time_delta.append(delta_time)

        else:
            start_balance_eth = current_balance_eth
            start_balance_zil = current_balance_zil
            balance_delta_eth.append(0)
            balance_delta_zil.append(0)

        balance_eth.append(current_balance_eth)
        balance_zil.append(current_balance_zil)
        time.append(current_time)

    for d_eth, index_temp in zip(balance_delta_eth, range(len(balance_delta_eth))):
        if d_eth != 0:
            delta_eth_range.append(index_temp)

    for worker in workers:
        # if it doesn't have data for balances, it splits it between workers
        if start_balance_zil > 0:
            balance_workers_zil[worker] += start_balance_zil * WORKER_SPLIT
        if start_balance_eth > 0:
            balance_workers_eth[worker] += start_balance_eth * WORKER_SPLIT

        for index in range(len(delta_eth_range)):
            # integral of hashrate
            if index > 0:
                temp_time_delta_list = time_delta[delta_eth_range[index - 1]:delta_eth_range[index]]
                temp_hashrate_list = [hashrate_workers[worker][delta_eth_range[index - 1]:delta_eth_range[index]],
                                      temp_time_delta_list]
                while len(temp_hashrate_list[0]) < len(temp_hashrate_list[1]):
                    temp_hashrate_list[0].append(0)

                temp_hashrate_len = len(temp_hashrate_list[0])
                x = temp_hashrate_list[0]
                y = temp_hashrate_list[1]
                if temp_hashrate_len > 4:
                    # do simpsons integration:
                    # start = (delta x * h[0] + delta x * h[-1])/3
                    # odd = (delta x * h[1] + delta x * h[3]...) * (4/3)
                    # evens = (delta x * h[2] + delta x h[4]...) * (2/3)
                    start = (x[0] * y[0] + x[-1] * y[-1]) * (4 / 3)

                    for i in range(len(temp_hashrate_list)):
                        if ((temp_hashrate_len - 1) > i) and (i > 0):
                            if i % 2:
                                odd += (x[i] * y[i]) * (4 / 3)
                            else:
                                even += (x[i] * y[i]) * (2 / 3)

                    integral = start + even + odd
                    integral_worker[worker].append(integral)
                    even = 0
                    odd = 0

                elif temp_hashrate_len > 1:
                    # do trapezoid integration
                    # delta x/2(h[0] + 2*h[1] + 2*h[2]... + h[-1])
                    trap_integral = ((x[0] * y[0]) + (x[-1] * y[-1]))
                    for i in range(len(temp_hashrate_list)):
                        if ((temp_hashrate_len - 1) > i) and (i > 0):
                            trap_integral += (x[i] * y[i])
                    integral_worker[worker].append(trap_integral)

                elif temp_hashrate_len == 1:
                    # do riemann sum integration
                    # y * delta x
                    riemann_integral = y[0] * x[0]
                    integral_worker[worker].append(riemann_integral)

    for index in range(len(integral_worker[workers[0]])):
        for worker in integral_worker.keys():
            temp_integral += integral_worker[worker][index]
        total_integral.append(temp_integral)
        temp_integral = 0

    for worker in workers:
        for integral_t, worker_integral in zip(total_integral, integral_worker[worker]):
            try:
                worker_percentage[worker].append(worker_integral / integral_t)
            except ZeroDivisionError:
                pass

    for delta in balance_delta_eth:
        if delta != 0:
            balance_eth_delta.append(delta)

    for worker in workers:
        for percentage, delta in zip(worker_percentage[worker], balance_eth_delta):
            balance_workers_eth[worker] += percentage * delta
            b[worker].append(balance_workers_eth[worker])

    def plot():
        import matplotlib.pyplot as plt
        time.sort(reverse=True)
        plt.xlabel = "Delta Balance index"
        plt.ylabel = "ETH Balance"
        plt.title("Index Vs ETH")

        for worker_d in workers:
            temp_x = []
            for index_d in range(len(worker_percentage[worker_d])):
                temp_x.append(index_d + 1)
            x_d = temp_x
            y_d = b[worker_d]
            plt.plot(x_d, y_d, "-", label=f"{worker_d}")

        def plot_ddx():
            d_list = []
            for local_index in range(len(delta_eth_range)):
                if local_index > 0:
                    temp_index = delta_eth_range[local_index]
                    prev_temp_index = delta_eth_range[local_index - 1]

                    delta_eth_temp = balance_eth[temp_index] - balance_eth[prev_temp_index]
                    delta_time_temp = time[temp_index] - time[prev_temp_index]
                    average_hashrate_temp = (sum(hashrate_pool[prev_temp_index:temp_index])) / (
                            temp_index - prev_temp_index)

                    d_list.append(
                        ((delta_eth_temp / delta_time_temp) / average_hashrate_temp) * 1000000 * 60 * 60 * 24 * 10)
                    # magic numbers are as follows:
                    # 1000000, convert to per mh/s,
                    # 60*60*24, convert from seconds to days,

            plt.plot(x_d, d_list, "-", label="ETH per 10 Mh/s per day")

        plt.legend()
        plt.show()

    for keys in balance_workers_eth.keys():
        print(keys, balance_workers_eth[keys])

    plot()


if __name__ == "__main__":
    eval_data()

Is there any way I can speed this up? Currently, at 22,000 lines, the program takes about 40 seconds to run. The part that takes the most time is where I iterate over files_workers line by line: for worker_data in files_workers: -- according to my tests it accounts for a large share of the runtime and pegs one of my CPU cores. Is there a more efficient way to approach this? I appreciate any help/constructive criticism.
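
For reference, a minimal sketch of one way to confirm where the time goes, using Python's built-in cProfile module (assumes it is run inside the script above, so that eval_data is in scope):

import cProfile
import pstats

# Profile a full run of eval_data() and dump the raw stats to a file.
cProfile.run("eval_data()", "eval_data.prof")

# Print the ten functions with the highest cumulative time.
stats = pstats.Stats("eval_data.prof")
stats.sort_stats("cumulative").print_stats(10)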


1 Answer

Code Review user

Accepted answer

Posted on 2021-05-27 20:30:08

For starters, the code appears to search for the index of the current dict object:

for worker_data in files_workers:
    index = files_workers.index(worker_data)

However:

  1. You can easily avoid this by using enumerate.
  2. It seems you only care whether this is the first iteration, so perhaps a flag would suffice (both are sketched below).
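
A minimal sketch of both points (illustrative only; files_workers is the list from the question). list.index rescans the list from the front on every iteration, which makes the loop quadratic in the number of lines, while enumerate yields each position in a single pass:

# Point 1: enumerate supplies the index without rescanning the list,
# turning the O(n^2) index lookups into a single O(n) pass.
for index, worker_data in enumerate(files_workers):
    if index == 0:
        pass  # first-iteration setup (start balances, zero deltas)
    else:
        pass  # compare against the previous entry

# Point 2: if the index is only used to detect the first iteration,
# a boolean flag removes the need to track an index at all.
first = True
for worker_data in files_workers:
    if first:
        first = False
        pass  # first-iteration setup
    else:
        pass  # normal processing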
Votes: 2
Original page content provided by Code Review.
Original link:
https://codereview.stackexchange.com/questions/260752
