
“数据明明插进去了,为什么 SHOW TABLE STATUS 说表是空的?” 这是许多 DBA 和开发在批量导入数据后常遇到的“灵异现象”。 更诡异的是:SHOW INDEX FROM orders 的主键 Cardinality 却显示有 298,920 条记录! 到底谁在说谎?是 MySQL 出了 Bug,还是我们误解了它的行为?
上述情况是我刚刚做测试时偶尔发现的,数据库版本是MYSQL8.0.39社区版,查询的现象如下:

下面我们还原一下过程并从源码中探索原因。
1. 问题复现:一个批量插入数据的存储过程
我使用如下存储过程生成测试订单数据:
DELIMITER ;;
CREATE PROCEDURE GenerateOrderData()
BEGIN
DECLARE i INT DEFAULT 1;
START TRANSACTION; -- 关键:开启大事务
WHILE (i <= 100000) DO
INSERT INTO `orders` (
`order_number`, `customer_id`, `product_id`,
`quantity`, `amount`, `order_date`, `status`
) VALUES (
CONCAT('ORD', DATE_FORMAT(NOW(), '%Y%m%d'), LPAD(i, 8, '0')),
FLOOR(1 + RAND() * 5000),
FLOOR(1 + RAND() * 200),
FLOOR(1 + RAND() * 5),
ROUND((10 + RAND() * 500) * FLOOR(1 + RAND() * 5), 2),
DATE_SUB(NOW(), INTERVAL FLOOR(RAND() * 365) DAY),
FLOOR(1 + RAND() * 4)
);
SET i = i + 1;
END WHILE;
COMMIT; -- 提交
END;;
DELIMITER ;
-- 执行
CALL GenerateOrderData();调用3次存储过程插入数据,执行后验证结果如下:
mysql> show table status like 'orders';
+--------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+-----------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+--------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+-----------+
| orders | InnoDB | 10 | Dynamic | 0 | 0 | 16384 | 0 | 65536 | 0 | 1 | 2025-12-23 18:47:50 | NULL | NULL | utf8mb3_general_ci | NULL | | 订单表 |
+--------+--------+---------+------------+------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+-----------+
1 row in set (0.00 sec)
mysql> show index from `orders`;
+--------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+--------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| orders | 0 | PRIMARY | 1 | id | A | 298920 | NULL | NULL | | BTREE | | | YES | NULL |
| orders | 1 | idx_customer_date | 1 | customer_id | A | 4978 | NULL | NULL | | BTREE | | | YES | NULL |
| orders | 1 | idx_customer_date | 2 | order_date | A | 298920 | NULL | NULL | | BTREE | | | YES | NULL |
| orders | 1 | idx_order_date | 1 | order_date | A | 7535 | NULL | NULL | | BTREE | | | YES | NULL |
| orders | 1 | idx_customer_status | 1 | customer_id | A | 5329 | NULL | NULL | | BTREE | | | YES | NULL |
| orders | 1 | idx_customer_status | 2 | status | A | 20382 | NULL | NULL | | BTREE | | | YES | NULL |
| orders | 1 | idx_product_id | 1 | product_id | A | 190 | NULL | NULL | | BTREE | | | YES | NULL |
+--------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
7 rows in set (0.01 sec)
矛盾点:数据真实存在,但表状态却说“没数据”!
以上可以发现 show table status的rows是0 ,show index 方式查询结果有298920(接近实际值)。顿感诧异,这两个不是调用同一个方法采样的?于是我也问了几个大佬,他们也没遇到过或者之前没注意到过这个现象,那我们就自己探索一下。
2. 表面差异:两个命令的本质不同
我们已知的知识如下
命令 | 返回字段 | 含义 | 是否精确 |
|---|---|---|---|
SHOW TABLE STATUS | Rows | 表的估算行数 | 估算 |
SHOW INDEX FROM | Cardinality | 索引的唯一值数量估算 | 估算(但通常更准) |
两者都不是实时精确值!它们只是 InnoDB 为优化器提供的“提示”(hints)。但为何一个为 0,一个却接近真实?我们从表面情况来猜测,是不是bug?是不是不是同一个地方取统计数据?是不是进行采样的时机和方式不同?
3. 源码深挖:Rows = 0 从何而来?
3.1 SHOW TABLE STATUS 的调用链
在 MySQL 源码中,该命令最终调用存储引擎的 handler::info() 接口:
// sql/sql_show.cc
table->file->info(HA_STATUS_VARIABLE);
InnoDB 实现如下(storage/innobase/handler/ha_innodb.cc):
/** Returns statistics information of the table to the MySQL interpreter,
in various fields of the handle object.
@return HA_ERR_* error code or 0 */
int ha_innobase::info(uint flag) /*!< in: what information is requested */
{
return (info_low(flag, false /* not ANALYZE */));
}
上面有个大大的“not ANALYZE”

int ha_innobase::info(uint flag) {
if (flag & HA_STATUS_VARIABLE) {
stats.records = table->stat_n_rows; // ← 核心字段!
}
}3.2 stat_n_rows 的生命周期
定义于 dict_table_t 结构体(storage/innobase/include/dict0mem.h):
/** Statistics for query optimization. @{ */
/** Creation state of 'stats_latch'. */
std::atomic<os_once::state_t> stats_latch_created;
/** This latch protects:
"dict_table_t::stat_initialized",
"dict_table_t::stat_n_rows (*)",
"dict_table_t::stat_clustered_index_size",
"dict_table_t::stat_sum_of_other_index_sizes",
"dict_table_t::stat_modified_counter (*)",
"dict_table_t::indexes*::stat_n_diff_key_vals[]",
"dict_table_t::indexes*::stat_index_size",
"dict_table_t::indexes*::stat_n_leaf_pages".
(*) Those are not always protected for
performance reasons. */
rw_lock_t *stats_latch;


rows何时被设为 0的呢?有以下情况:
可在源码 storage/innobase/dict/dict0stats.cc中发现
/** Write all zeros (or 1 where it makes sense) into a table and its indexes'
statistics members. The resulting stats correspond to an empty table. */
static void dict_stats_empty_table(dict_table_t *table) /*!< in/out: table */
{
/* Zero the stats members */
dict_table_stats_lock(table, RW_X_LATCH);
table->stat_n_rows = 0;
table->stat_clustered_index_size = 1;
/* 1 page for each index, not counting the clustered */
table->stat_sum_of_other_index_sizes = UT_LIST_GET_LEN(table->indexes) - 1;
table->stat_modified_counter = 0;
dict_index_t *index;
for (index = table->first_index(); index != nullptr; index = index->next()) {
if (index->type & DICT_FTS) {
continue;
}
ut_ad(!dict_index_is_ibuf(index));
dict_stats_empty_index(index);
}
table->stat_initialized = true;
dict_table_stats_unlock(table, RW_X_LATCH);
}

何时更新?
调用analyze时,会调用ha_innobase::info_low(uint flag, bool is_analyze)进行收集

/** Returns statistics information of the table to the MySQL interpreter, in
various fields of the handle object.
@param[in] flag what information is requested
@param[in] is_analyze True if called from "::analyze()".
@return HA_ERR_* error code or 0 */
intha_innobase::info_low(uint flag, bool is_analyze){
dict_table_t *ib_table;
uint64_t n_rows;
DBUG_TRACE;
DEBUG_SYNC_C("ha_innobase_info_low");// 来自 dict0stats.cc
dberr_t dict_stats_update(dict_table_t *table, dict_stats_upd_option_t stats_upd_option)
{
switch (stats_upd_option) {
case DICT_STATS_RECALC_PERSISTENT:
// ... 被 ANALYZE TABLE 或后台线程调用
err = dict_stats_update_persistent(table);
return dict_stats_save(table, nullptr);
case DICT_STATS_FETCH_ONLY_IF_NOT_IN_MEMORY:
// 当表统计未初始化时进入
if (!table->stat_initialized) {
return DB_SUCCESS;
}
// 尝试从持久化存储加载
err = dict_stats_fetch_from_ps(t);
if (err == DB_STATS_DO_NOT_EXIST) {
if (dict_stats_auto_recalc_is_enabled(table)) {
// 如果启用自动重算 且 统计不存在,则触发重算!
return dict_stats_update(table, DICT_STATS_RECALC_PERSISTENT);
}
}
// ...
}
}如果触发的时候优先选择第一个索引,通常是聚簇索引(Clustered Index),也就是主键索引进行统计
index = table->first_index();
table->stat_n_rows =
index->stat_n_diff_key_vals[dict_index_get_n_unique(index) - 1];注: 普通 INSERT/UPDATE/DELETE 不会更新 stat_n_rows!即使你插入 100 万行,只要没触发统计更新,它就一直保持旧值(如 0)。
4. Cardinality 是怎么算出来的?
4.1 SHOW INDEX 的行为特殊
当你执行 SHOW INDEX,MySQL 需要返回 Cardinality,因此会主动检查并可能刷新索引统计。
if (need_index_stats) {
innodb_stats_update_if_needed(table);
}源码路径(ha_innodb.cc):
/** Match index columns between MySQL and InnoDB.
This function checks whether the index column information
is consistent between KEY info from mysql and that from innodb index.
@param[in] key_info Index info from mysql
@param[in] index_info Index info from InnoDB
@return true if all column types match. */
bool innobase_match_index_columns(const KEY *key_info,
const dict_index_t *index_info) {
const KEY_PART_INFO *key_part;
const KEY_PART_INFO *key_end;
const dict_field_t *innodb_idx_fld;
const dict_field_t *innodb_idx_fld_end;
DBUG_TRACE;
/* Check whether user defined index column count matches */
if (key_info->user_defined_key_parts != index_info->n_user_defined_cols) {
return false;
}
4.2 InnoDB 如何估算 Cardinality
InnoDB 使用 页采样(Page Sampling) 机制:
按比例推算全局基数:

因此,即使 stat_n_rows = 0,只要索引 B+Tree 中有真实数据页,采样就能得到合理估计。
4.3 为什么 Cardinality 能“感知”新数据,而 Rows 不能?
因此和show table status的本质区别是一个是“按需计算”,一个是“懒加载缓存”。
5. 小结
InnoDB 为何不自动更新统计?性能权衡:避免 DML 变慢
如果每插入一行都要采样20个页来更新统计,那么整个innodb的写入性能将暴跌!因此,InnoDB 采用延迟更新+ 事件驱动 策略,DML 操作只更新 stat_modified_counter(修改计数器)真正的统计更新由以下事件触发:
因此,在批量导入后不要依赖后台自动更新,应显式执行 ANALYZE TABLE 以确保统计信息立即生效。
(其他的代码内容太多,看晕了,请大佬们来补充)
源码解析相关文章请参考: