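I'm trying to use a for loop inside a single function to print the cost function for three different slopes, with bias = 0. The dataset has 5 rows, and the cost function measures how well attendance predicts marks. If I define three separate functions, one per slope value, I can print the cost function without problems. Here is my code: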
dataset = {"Attendance": [100, 87, 15, 63, 47], "Marks": [100, 95, 6, 73, 50]}
Marks = pd.DataFrame(dataset, columns=["Attendance", "Marks"])
bias = 0
slope = {"values": [-1, 0, 3]}
slope = pd.DataFrame(slope)
def error():
    a = []
    sum_of_squared_error = 0
    for i in range(len(slope)):
        for j in range(0, len(Marks)):
            x = Marks.iloc[j, 0]
            y = Marks.iloc[j, 1]
            sum_of_squared_error += (y - (slope.iloc[0, i]*x + bias)) ** 2
        cost_function = sum_of_squared_error / (2 * len(Marks))
        a.append(cost_function)
    return a
error()
I get this error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-126-1bf54ede9ae1> in <module>()
13 a.append(cost_function)
14 return a
---> 15 error()
5 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _validate_integer(self, key, axis)
2061 len_axis = len(self.obj._get_axis(axis))
2062 if key >= len_axis or key < -len_axis:
-> 2063 raise IndexError("single positional indexer is out-of-bounds")
2064
2065 def _getitem_tuple(self, tup: Tuple):
IndexError: single positional indexer is out-of-bounds
Posted on 2020-08-08 19:11:33
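You are not accessing the elements of the slope DataFrame correctly. slope.shape returns (3, 1), so the loop index has to move along the rows, not the columns.
sum_of_squared_error += (y - (slope.iloc[0, i]*x + bias)) ** 2
should be:
sum_of_squared_error += (y - (slope.iloc[i, 0]*x + bias)) ** 2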
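To make the row-versus-column point concrete, here is a minimal sketch that rebuilds only the slope DataFrame from the question and compares the two indexing orders. The names by_row and raised are introduced here for illustration only, and the exact wording of the exception message may differ between pandas versions.

```python
import pandas as pd

# Same one-column DataFrame as in the question
slope = pd.DataFrame({"values": [-1, 0, 3]})

print(slope.shape)  # (3, 1): three rows, one column

# Row position varies, column position stays 0: valid for i = 0, 1, 2
by_row = [int(slope.iloc[i, 0]) for i in range(len(slope))]
print(by_row)  # [-1, 0, 3]

# Column position varies: only column 0 exists, so position 1 is out of bounds
try:
    slope.iloc[0, 1]
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

The first index passed to iloc selects the row and the second selects the column, which is why swapping them only fails once i goes past 0.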
In addition, sum_of_squared_error should be reset to 0 after each pass of the inner loop, so that every slope starts from a fresh sum:
import pandas as pd
dataset = {"Attendance": [100, 87, 15, 63, 47], "Marks": [100, 95, 6, 73, 50]}
Marks = pd.DataFrame(dataset, columns=["Attendance", "Marks"])
bias = 0
slope = {"values": [-1, 0, 3]}
slope = pd.DataFrame(slope)
def error():
    a = []
    sum_of_squared_error = 0
    for i in range(len(slope)):            # one pass per slope value (row of slope)
        for j in range(0, len(Marks)):     # accumulate over every data row
            x = Marks.iloc[j, 0]
            y = Marks.iloc[j, 1]
            sum_of_squared_error += (y - (slope.iloc[i, 0]*x + bias)) ** 2
        cost_function = sum_of_squared_error / (2 * len(Marks))
        sum_of_squared_error = 0           # reset before the next slope
        a.append(cost_function)
    return a
error()
Output:
>>> error()
[10147.0, 2689.0, 9081.4]
Posted on 2020-08-08 19:31:04
The error is on this line:
sum_of_squared_error += (y - (slope.iloc[0, i]*x + bias)) ** 2
This happens because iloc only accepts positions up to length - 1 along each axis.
Reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html
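In your case, i takes the values 0, 1 and 2 (from range(len(slope))), and the lookup fails as soon as i reaches 1, because slope has no column at position 1 or 2.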
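One way to avoid this kind of positional mix-up altogether is to loop over the slope values themselves and let pandas do the row-wise arithmetic. The following is only a sketch under that assumption, not the approach used in either answer: the helper cost and the names marks, slopes and costs are hypothetical, and the cost is the same half mean squared error as in the question.

```python
import pandas as pd

# Same data as in the question
marks = pd.DataFrame({"Attendance": [100, 87, 15, 63, 47],
                      "Marks": [100, 95, 6, 73, 50]})
slopes = [-1, 0, 3]
bias = 0

def cost(slope_value, data, bias=0):
    """Half mean squared error for the line y = slope_value * x + bias."""
    # Column arithmetic is applied to all rows at once, so no row index is needed
    residuals = data["Marks"] - (slope_value * data["Attendance"] + bias)
    return float((residuals ** 2).sum() / (2 * len(data)))

# Iterate over the slope values directly instead of over positions
costs = [cost(m, marks, bias) for m in slopes]
print(costs)  # [10147.0, 2689.0, 9081.4]
```

Because no positional index is involved, there is nothing that can run out of bounds, and the result matches the output of the corrected loop.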
https://stackoverflow.com/questions/63319129