
Q: Optimizing Python code for row-by-row comparison

Stack Overflow user

Asked 2020-10-29 02:21:59


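I have written Python code that uses several if conditions and a for loop. The main goal of the code is to produce a traffic light system based on certain conditions.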

Red = -1
Yellow = 0
Green = 1

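It takes four months (m0, m1, m2, m3) and a dataframe as input, evaluates the conditions on each row, and returns -1, 0, or 1.

The code compares month 1 with month 0, month 2 with month 1, and month 3 with month 2.

For the input: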

if month+1 < Month for any value, then red else green.

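For example, if the revenue for July 2020 is lower than for June 2020, the input is red, otherwise it is green. The result is calculated from the three comparisons and can be -1, 0, or 1.

The code I wrote works, but it is not optimized in any way. Is there a better way to do this?

This will be an O(n) operation, but there should at least be a concise way to write it in Python. Or can the code also be improved operationally?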

def getTrafficLightData(df, dimension, m1, m2, m3, m4):
'''
Inputs - 
    Dataframe
    dimension = on which we want to calculate traffic light system
    m1, m2, m3, m4 - Could be any for months but we have taken consecutive months for Traffic Light System.
    Example Call - getTrafficLightData(report6_TLS_data, "Revenue_","2020-6","2020-7","2020-8","2020-9")

'''

TFS_df = pd.DataFrame(columns=[dimension + "_TLS"])

if dimension == "Overstrike_":
    suffix = "%"
    for i in range(len(df)):
        if (
            (
                df[dimension + m1 + suffix].iloc[i]
                > df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                > df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                > df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                < df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                < df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                < df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [-1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                < df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                > df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                < df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [-1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                > df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                < df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                < df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [-1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                > df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                > df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                < df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [0]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                > df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                < df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                > df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                < df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                > df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                > df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                < df[dimension + m2 + suffix].iloc[i]
            )
            and (  #
                df[dimension + m2 + suffix].iloc[i]
                < df[dimension + m3 + suffix].iloc[i]
            )
            and (  #
                df[dimension + m3 + suffix].iloc[i]
                > df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [0]
        else:
            TFS_df.loc[i] = [0]
    return TFS_df
else:
    if dimension == "Margin_":
        suffix = "%"
    else:
        suffix = ""
    for i in range(len(df)):
        if (
            (
                df[dimension + m1 + suffix].iloc[i]
                > df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                > df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                > df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [-1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                < df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                < df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                < df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                < df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                > df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                < df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                > df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                < df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                < df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                > df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                > df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                < df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [0]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                > df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                < df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                > df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [-1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                < df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                > df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                > df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [-1]
        elif (
            (
                df[dimension + m1 + suffix].iloc[i]
                < df[dimension + m2 + suffix].iloc[i]
            )
            and (
                df[dimension + m2 + suffix].iloc[i]
                < df[dimension + m3 + suffix].iloc[i]
            )
            and (
                df[dimension + m3 + suffix].iloc[i]
                > df[dimension + m4 + suffix].iloc[i]
            )
        ):
            TFS_df.loc[i] = [0]
        else:
            TFS_df.loc[i] = [0]
    return TFS_df

The function is called as follows:

report6_TLS_data['Revenue_TLS']=getTrafficLightData(report6_TLS_data, "Revenue_","2020-6","2020-7","2020-8","2020-9")
report6_TLS_data["Margin_TLS"]=getTrafficLightData(report6_TLS_data, "Margin_","2020-6","2020-7","2020-8","2020-9")
report6_TLS_data["Overstrike_TLS"]=getTrafficLightData(report6_TLS_data, "Overstrike_","2020-6","2020-7","2020-8","2020-9")
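
For reference, the branches in the function above can be read as a rule on the signs of the three month-over-month changes: the result is the direction of the last change when at least one of the two earlier changes points the same way, and 0 otherwise (including any tie). The following is only a minimal sketch of that reading, not the original implementation. The name `traffic_light` is hypothetical, and the sketch assumes every Overstrike_ branch is the mirror image of the corresponding branch for the other dimensions.

```python
import numpy as np
import pandas as pd


def traffic_light(df, dimension, m1, m2, m3, m4):
    """Hypothetical compact variant of getTrafficLightData (a sketch)."""
    # Column suffix follows the original function: "%" for Margin_ and Overstrike_.
    suffix = "%" if dimension in ("Margin_", "Overstrike_") else ""
    columns = [dimension + m + suffix for m in (m1, m2, m3, m4)]
    values = df[columns].to_numpy(dtype=float)

    # Sign of each month-over-month change: -1 drop, 0 tie, 1 rise.
    signs = np.sign(np.diff(values, axis=1))
    last = signs[:, -1]

    # Keep the direction of the last change only when one of the two earlier
    # changes agrees with it; rows containing a tie always map to 0.
    no_ties = (signs != 0).all(axis=1)
    agrees = (signs[:, :-1] == last[:, None]).any(axis=1)
    result = np.where(no_ties & agrees, last, 0).astype(int)

    # Assumption: for Overstrike_ the colours are simply inverted.
    if dimension == "Overstrike_":
        result = -result
    return pd.Series(result, index=df.index, name=dimension + "_TLS")
```

Because the returned Series shares the index of `df`, it can be assigned to a new column in the same way as in the calls above.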

Any hints would be helpful.

The input data has the following form:

ym  PART NUMBER BranchCode  Revenue_2019-1  Revenue_2019-10 Revenue_2019-11 Revenue_2019-12 Revenue_2019-2  Revenue_2019-3  Revenue_2019-4  Revenue_2019-5  Revenue_2019-6  Revenue_2019-7  Revenue_2019-8  Revenue_2019-9  Revenue_2020-1  Revenue_2020-2  Revenue_2020-3  Revenue_2020-4  Revenue_2020-5  Revenue_2020-6  Revenue_2020-7  Revenue_2020-8  Revenue_2020-9  Margin_2019-1   Margin_2019-10  Margin_2019-11  Margin_2019-12  Margin_2019-2   Margin_2019-3   Margin_2019-4   Margin_2019-5   Margin_2019-6   Margin_2019-7   Margin_2019-8   Margin_2019-9   Margin_2020-1   Margin_2020-2   Margin_2020-3   Margin_2020-4   Margin_2020-5   Margin_2020-6   Margin_2020-7   Margin_2020-8   Margin_2020-9   Overstrike_2019-1   Overstrike_2019-10  Overstrike_2019-11  Overstrike_2019-12  Overstrike_2019-2   Overstrike_2019-3   Overstrike_2019-4   Overstrike_2019-5   Overstrike_2019-6   Overstrike_2019-7   Overstrike_2019-8   Overstrike_2019-9   Overstrike_2020-1   Overstrike_2020-2   Overstrike_2020-3   Overstrike_2020-4   Overstrike_2020-5   Overstrike_2020-6   Overstrike_2020-7   Overstrike_2020-8   Overstrike_2020-9   Transactions_2019-1 Transactions_2019-10    Transactions_2019-11    Transactions_2019-12    Transactions_2019-2 Transactions_2019-3 Transactions_2019-4 Transactions_2019-5 Transactions_2019-6 Transactions_2019-7 Transactions_2019-8 Transactions_2019-9 Transactions_2020-1 Transactions_2020-2 Transactions_2020-3 Transactions_2020-4 Transactions_2020-5 Transactions_2020-6 Transactions_2020-7 Transactions_2020-8 Transactions_2020-9 Margin_2019-1%  Margin_2019-10% Margin_2019-11% Margin_2019-12% Margin_2019-2%  Margin_2019-3%  Margin_2019-4%  Margin_2019-5%  Margin_2019-6%  Margin_2019-7%  Margin_2019-8%  Margin_2019-9%  Margin_2020-1%  Margin_2020-2%  Margin_2020-3%  Margin_2020-4%  Margin_2020-5%  Margin_2020-6%  Margin_2020-7%  Margin_2020-8%  Margin_2020-9%  Overstrike_2019-1%  Overstrike_2019-10% Overstrike_2019-11% Overstrike_2019-12% Overstrike_2019-2%  
Overstrike_2019-3%  Overstrike_2019-4%  Overstrike_2019-5%  Overstrike_2019-6%  Overstrike_2019-7%  Overstrike_2019-8%  Overstrike_2019-9%  Overstrike_2020-1%  Overstrike_2020-2%  Overstrike_2020-3%  Overstrike_2020-4%  Overstrike_2020-5%  Overstrike_2020-6%  Overstrike_2020-7%  Overstrike_2020-8%  Overstrike_2020-9%
0   BAGG001 BC  71.75   90.00   20.25   43.50   42.50   30.00   70.00   44.25   45.00   46.75   129.50  58.00   81.00   36.00   33.25   0.75    15.00   24.75   0.00    0.00    2.50    32.97   39.15   8.95    14.31   18.95   7.86    30.68   19.27   19.74   18.12   59.38   22.30   34.95   17.59   14.10   0.32    6.35    5.30    0.00    0.00    1.06    0.00    0.00    0.00    1.00    0.00    1.00    0.00    0.00    3.00    3.00    1.00    1.00    2.00    0.00    0.00    0.00    0.00    2.00    0.00    0.00    0.00    8   16  5   9   5   6   12  7   10  7   13  10  13  5   11  1   2   4   0   0   1   1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    0.00    0.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    0.00    0.00    1.00
1   BAGG001 PK  0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    25.50   0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    9.90    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0   0   0   0   0   0   0   0   0   2   0   0   0   0   0   0   0   0   0   0   0   0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00

python

pandas

optimization

1 Answer

Stack Overflow user

Answered 2020-10-29 02:51:22

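The for loop you are running over every row is already built into pd.DataFrame as the apply() method.

This method basically applies a given function to each row of the pd.DataFrame.

Based on just one of your cases, you could do the following: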

In [0]: import pandas as pd
df = pd.DataFrame({
    "Month 0":[3,6,3,14,6,3,1,6],
    "Month 1":[1,4,3,15,6,3,4,5],
    "Month 2":[9,4,1,14,1,3,2,1], 
    "Month 3":[1,6,9,14,6,2,6,8],
})

Out[0]:
   Month 0  Month 1  Month 2  Month 3
0        3        1        9        1
1        6        4        4        6
2        3        3        1        9
3       14       15       14       14
4        6        6        1        6
5        3        3        3        2
6        1        4        2        6
7        6        5        1        8

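Then you can write your comparison function. It should have 3 possible outcomes: -1 if the quantity in the second month is lower than in the first month, 0 if they are equal, and 1 if the quantity in the second month is greater.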

def compare(prior, posterior):
    """Compare the quantities of two consecutive months.

    Returns -1 if posterior is lower than prior, 0 if they are the same,
    and 1 if posterior is greater."""
    
    if posterior < prior:
        return -1
    elif prior == posterior:
        return 0
    else:
        return 1

Now you can apply this function to the pd.DataFrame using apply with a lambda as its argument.

df["Month_1 Output"] = df.apply(lambda row: compare(row["Month 0"], row["Month 1"]), axis=1)
df["Month_2 Output"] = df.apply(lambda row: compare(row["Month 1"], row["Month 2"]), axis=1)
df["Month_3 Output"] = df.apply(lambda row: compare(row["Month 2"], row["Month 3"]), axis=1)

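Note that you can now run a for loop over the different dimensions. Also, if you have many months, it is better to use a for loop than to write the code for each one separately. The output of the code above will be: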

   Month_1 Output  Month_2 Output  Month_3 Output
0              -1               1              -1
1              -1               0               1
2               0              -1               1
3               1              -1               0
4               0              -1               1
5               0               0              -1
6               1              -1               1
7              -1              -1               1
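
The loop over months mentioned above could look like the following. This is a minimal sketch rather than part of the original answer: it pairs each month with the next one using `zip`, and it swaps the row-wise `apply` for `np.sign` on the column difference, which yields the same -1, 0, 1 values as `compare` for numeric columns. The `months` list is an assumption introduced for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Month 0": [3, 6, 3, 14, 6, 3, 1, 6],
    "Month 1": [1, 4, 3, 15, 6, 3, 4, 5],
    "Month 2": [9, 4, 1, 14, 1, 3, 2, 1],
    "Month 3": [1, 6, 9, 14, 6, 2, 6, 8],
})

# Assumed helper list: the month columns in chronological order.
months = ["Month 0", "Month 1", "Month 2", "Month 3"]

# Pair each month with the one that follows it and compare whole columns at once.
for k, (prior, posterior) in enumerate(zip(months, months[1:]), start=1):
    # np.sign gives -1 when posterior < prior, 0 when equal, 1 when greater.
    df[f"Month_{k} Output"] = np.sign(df[posterior] - df[prior]).astype(int)
```

Adding another month then only requires appending its column name to `months`.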

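Edit

If you only want to know whether month+1 is < month for any value, that is the same as checking whether any of the comparisons above returned -1, so you can do this: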

df["output"] = (
                df[["Month_1 Output","Month_2 Output","Month_3 Output"]].
                    apply(lambda r: (r == -1).any(), axis=1).
                    astype(int)
               )

# With this you would have red as 1 and green as 0, so you could do
df["output"] = df["output"].apply(lambda x: -1 if x==1 else 1)



print(df[["Month_1 Output","Month_2 Output","Month_3 Output", "output"]])

Out[]:
   Month_1 Output  Month_2 Output  Month_3 Output  output
0              -1               1              -1      -1
1              -1               0               1      -1
2               0              -1               1      -1
3               1              -1               0      -1
4               0              -1               1      -1
5               0               0              -1      -1
6               1              -1               1      -1
7              -1              -1               1      -1
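
The same red/green rule can also be computed without any row-wise `apply`. The sketch below is one possible vectorized form, not the answer's own code: it uses `DataFrame.diff(axis=1)` to get the month-over-month changes and `np.where` to map "any decrease" to -1 and everything else to 1. The `sample` frame holds hypothetical values chosen so that both colours appear.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two rows that never decrease and one row that does.
sample = pd.DataFrame({
    "Month 0": [1, 3, 5],
    "Month 1": [2, 3, 4],
    "Month 2": [3, 3, 6],
    "Month 3": [4, 3, 7],
})

months = ["Month 0", "Month 1", "Month 2", "Month 3"]

# Change from each month to the next; the first column has no predecessor, so drop it.
changes = sample[months].diff(axis=1).iloc[:, 1:]

# Red (-1) if any month is lower than the month before it, otherwise green (1).
sample["output"] = np.where((changes < 0).any(axis=1), -1, 1)
```

Note that a row with only ties counts as green here, which matches the rule "red if month+1 < month for any value, else green".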


Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/64579065
