我对JSON数据有一个简单的select查询,使用一些简单的计算,这些计算在我多次运行JSON数据时返回不同的结果。
为了我的生命,我只是不知道为什么,到目前为止,我还没有找到答案。从查询统计数据中可以看出,构建了一个临时表,我认为它在查询完成后无法截断/保存在某个地方,但我无法找到(更不用说截断/删除)该表。不过,这只是一个有效的假设-答案可能是在某种背景下或者.
有人能找到这方面的逻辑吗?
我在本地实例的MySQL DB上使用MySQL工作台。MySQL: 8.0.19版MySQL社区服务器- GPL。
这是表的简化版本:
CREATE TABLE regulation_entries (
id INTEGER UNSIGNED AUTO_INCREMENT,
employee_id VARCHAR(10) NOT NULL,
regulation JSON NOT NULL,
PRIMARY KEY (id)
);
# Sample data to work with
INSERT INTO regulation_entries VALUES
(DEFAULT, 2, '{"entry_base_salary": "8000", "pension_pct": "0.08"}'),
(DEFAULT, 3, '{"entry_base_salary": "10000", "pension_pct": "0.08"}'),
(DEFAULT, 5, '{"entry_base_salary": "11000", "pension_pct": "0.08"}'),
(DEFAULT, 8, '{"entry_base_salary": "11000", "pension_pct": "0.08"}'),
(DEFAULT, 9, '{"entry_base_salary": "9000", "pension_pct": "0.08"}'),
(DEFAULT, 1, '{"entry_base_salary": "14000", "pension_pct": "0.10"}'),
(DEFAULT, 6, '{"entry_base_salary": "13000", "pension_pct": "0.08"}'),
(DEFAULT, 7, '{"entry_base_salary": "14000", "pension_pct": "0.08"}'),
(DEFAULT, 2, '{"base_salary_adjustment": "500"}'),
(DEFAULT, 9, '{"base_salary_adjustment": "800"}'),
(DEFAULT, 3, '{"base_salary_adjustment": "400"}'),
(DEFAULT, 5, '{"base_salary_adjustment": "350"}'),
(DEFAULT, 8, '{"base_salary_adjustment": "200"}'),
(DEFAULT, 9, '{"base_salary_adjustment": "1250"}'),
(DEFAULT, 1, '{"base_salary_adjustment": "-200"}'),
(DEFAULT, 1, '{"base_salary_adjustment": "50"}'),
(DEFAULT, 6, '{"base_salary_adjustment": "700"}'),
(DEFAULT, 7, '{"base_salary_adjustment": "825"}');
# The query that is bugging me:
SELECT employee_id,
SUM(regulation->>'$.entry_base_salary') AS entry_base_salary,
regulation->>'$.pension_pct' AS pension_pct,
AVG(regulation->>'$.pension_pct') * SUM(regulation->>'$.entry_base_salary') AS entry_pension,
SUM(regulation->>'$.base_salary_adjustment') AS salary_adjustments,
SUM(regulation->>'$.entry_base_salary') + SUM(regulation->>'$.base_salary_adjustment') AS future_salary,
AVG(regulation->>'$.pension_pct') * (SUM(regulation->>'$.entry_base_salary') + SUM(regulation->>'$.base_salary_adjustment')) AS future_pension
FROM sreg.regulation_entries
GROUP BY employee_id
ORDER BY employee_id当我运行查询时,我希望看到以下内容:
employee_id #entry_base_salary #pension_pct #entry_pension #salary_adjustments #future_salary #future_pension#
#1 #14000 #0.10 #1400 #1250 #15250 #1525
#2 #8000 #0.08 #640 #1100 #9100 #728
#3 #10000 #0.08 #800 #1000 #11000 #880
#5 #11000 #0.08 #880 #950 #11950 #956
#6 #13000 #0.08 #1040 #1900 #14900 #1192
#7 #14000 #0.08 #1120 #2025 #16025 #1282
#8 #11000 #0.08 #880 #800 #11800 #944
#9 #9000 #0.08 #720 #2650 #11650 #932但是,如果再次执行相同的查询,就会得到新的随机结果。例如:
employee_id #entry_base_salary #pension_pct #entry_pension #salary_adjustments #future_salary #future_pension#
1 #14000 #0.10 #1400 #3050 #17050 #1705
2 #8000 #0.08 #640 #2900 #10900 #872
3 #10000 #0.08 #800 #2800 #12800 #1024
5 #11000 #0.08 #880 #2750 #13750 #1100
6 #13000 #0.08 #1040 #3700 #16700 #1336
7 #14000 #0.08 #1120 #3825 #17825 #1426
8 #11000 #0.08 #880 #2600 #13600 #1088
9 #9000 #0.08 #720 #4450 #13450 #1076
Another example
1 #14000 #0.10 #1400 #2.119191149652875e88 #15250 #1525
2 #8000 #0.08 #640 #2.119191149652875e88 #9100 #728
3 #10000 #0.08 #800 #2.119191149652875e88 #11000 #880
5 #11000 #0.08 #880 #2.119191149652875e88 #11950 #956
6 #13000 #0.08 #1040 #2.119191149652875e88 #14900 #1192
7 #14000 #0.08 #1120 #2.119191149652875e88 #16025 #1282
8 #11000 #0.08 #880 #2.119191149652875e88 #11800 #944
9 #9000 #0.08 #720 #2.119191149652875e88 #11650 #932
And a third
1 #14000 #0.10 #1400 #3650 #17650 #1765
2 #8000 #0.08 #640 #3500 #11500 #920
3 #10000 #0.08 #800 #3400 #13400 #1072
5 #11000 #0.08 #880 #3350 #14350 #1148
6 #13000 #0.08 #1040 #4300 #17300 #1384
7 #14000 #0.08 #1120 #4425 #18425 #1474
8 #11000 #0.08 #880 #3200 #14200 #1136
9 #9000 #0.08 #720 #5050 #14050 #1124以前有人见过这个吗?有什么解释吗?或者更好的是,有没有人知道我要改变什么才能得到一致的结果?
发布于 2020-06-08 14:51:58
这完全不是关于JSON的。这是MySQL 8.0.18中的一个bug,修正在8.0.20中。这个错误可能发生在非JSON列中。
https://bugs.mysql.com/bug.php?id=97920聚合函数和()返回随机数
事实上,这些数字并不完全是随机的,但是每次运行查询时都会增加。这就像SUM()在temp表中累积和一样,在查询的后续运行中使用相同的temp表,而不将和归零。
下面是上面错误中给出的测试用例的结果:
mysql> SELECT ym, ROUND(SUM(vb)), ROUND(SUM(vc)) FROM t_test GROUP BY ym;
+--------+----------------+----------------+
| ym | ROUND(SUM(vb)) | ROUND(SUM(vc)) |
+--------+----------------+----------------+
| 201912 | 22675 | 227 |
+--------+----------------+----------------+
mysql> SELECT ym, ROUND(SUM(vb)), ROUND(SUM(vc)) FROM t_test GROUP BY ym;
+--------+----------------+----------------+
| ym | ROUND(SUM(vb)) | ROUND(SUM(vc)) |
+--------+----------------+----------------+
| 201912 | 22675 | 454 |
+--------+----------------+----------------+
mysql> SELECT ym, ROUND(SUM(vb)), ROUND(SUM(vc)) FROM t_test GROUP BY ym;
+--------+----------------+----------------+
| ym | ROUND(SUM(vb)) | ROUND(SUM(vc)) |
+--------+----------------+----------------+
| 201912 | 22675 | 682 |
+--------+----------------+----------------+在我测试时,每次运行查询时,它都会继续增加227。227是正确的结果,并在我第一次运行查询时被返回。
此外,如果要求和的底层数据不包括任何NULL,则不会出现问题。
mysql> SELECT ym, ROUND(SUM(vb)), ROUND(SUM(vc)) FROM t_test where vc is not null GROUP BY ym;
+--------+----------------+----------------+
| ym | ROUND(SUM(vb)) | ROUND(SUM(vc)) |
+--------+----------------+----------------+
| 201912 | 252 | 227 |
+--------+----------------+----------------+每次运行此查询时,都会返回正确的227值。
在查询中也会发生同样的情况。由于表达式regulation->>'$.base_salary_adjustment'在JSON不包含该键的行上返回NULL,所以在使用SUM()时它会显示相同的错误。
如果我将查询修改为将NULL转换为0,则它将给出正确的结果,并且无论运行多少次查询都不会更改。
SUM(COALESCE(regulation->>'$.base_salary_adjustment', 0)) AS salary_adjustments这必须在每次引用JSON键(或对可能为NULL的表达式使用SUM() )时完成。
我建议您升级到MySQL 8.0.20。
https://stackoverflow.com/questions/62263597
复制相似问题