@adamhand 2019-03-08T01:24:10.000000Z 字数 12638 阅读 1104

05 | 深入浅出索引（下）

首先来看一个问题：

在下面这个表 T 中，如果执行 select * from T where k between 3 and 5，需要执行几次树的搜索操作，会扫描多少行？

mysql> create table T (
ID int primary key,
k int NOT NULL DEFAULT 0, 
s varchar(16) NOT NULL DEFAULT '',
index k(k))
engine=InnoDB;
insert into T values(100,1, 'aa'),(200,2,'bb'),(300,3,'cc'),(500,5,'ee'),(600,6,'ff'),(700,7,'gg');

这个表的InnoDB存储结构如下图所示：

下面是这条SQL语句的执行流程：

在 k 索引树上找到 k=3 的记录，取得 ID = 300；
再到 ID 索引树查到 ID=300 对应的 R3；
在 k 索引树取下一个值 k=5，取得 ID=500；
再回到 ID 索引树查到 ID=500 对应的 R4；
在 k 索引树取下一个值 k=6，不满足条件，循环结束。

在这个过程中，回到主键索引树搜索的过程，我们称为回表。可以看到，这个查询过程读了 k 索引树的 3 条记录（步骤 1、3 和 5），回表了两次（步骤 2 和 4）。

在这个例子中，由于查询结果所需要的数据只在主键索引上有，所以不得不回表。那么，有没有可能经过索引优化，避免回表过程呢？

覆盖索引

如果执行的语句是 select ID from T where k between 3 and 5，这时只需要查 ID 的值，而 ID 的值已经在 k 索引树上了，因此可以直接提供查询结果，不需要回表。也就是说，在这个查询里面，索引 k 已经“覆盖了”我们的查询需求，我们称为覆盖索引。也就是说，索引已经包含所有需要查询的字段的值，只读取索引就能够查询到需要的值。

由于覆盖索引可以减少树的搜索次数，显著提升查询性能，所以使用覆盖索引是一个常用的性能优化手段。

联合索引

联合索引有叫多列索引，是使用多个列作为条件进行查询。

先来看一个问题，假如有一个市民表定义如下：

CREATE TABLE `tuser` (
  `id` int(11) NOT NULL,
  `id_card` varchar(32) DEFAULT NULL,
  `name` varchar(32) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  `ismale` tinyint(1) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `id_card` (`id_card`),
  KEY `name_age` (`name`,`age`)
) ENGINE=InnoDB;

在上表中，是否有必要将身份证号和名字建立联合索引？

众所周知，身份证号是市民的唯一标识。也就是说，如果有根据身份证号查询市民信息的需求，只要在身份证号字段上建立索引就够了，没有必要使用(身份证号，姓名)两列联合查询。那么再建立一个（身份证号、姓名）的联合索引，是不是浪费空间？

如果现在有一个高频请求，要根据市民的身份证号查询他的姓名，这个联合索引就有意义了。它可以在这个高频请求上用到覆盖索引，不再需要回表查整行记录，减少语句的执行时间。

所以，可以看到在以下两种情况下可以使用联合索引：

使用多个列作为查询条件，可以将这多个列建立联合索引
从一个二级索引查找某列，将该索引和该列建立联合索引，可以减少回表次数

最左前缀原则

在使用联合索引时，需要满足最左前缀原则。最左前缀：查询条件必须包含联合索引最左边的列。比如联合索引为(name,age,gender)，查询条件必须包含name才行，但是name和查询条件中的其他条件字段顺序可以任意。

下面，使用（name，age）这个联合索引来分析。这个索引的示意图如下：

可以看到，索引项是按照索引定义里面出现的字段顺序排序的。即先按照name进行排序，再按照age进行排序。

最左前缀可以是联合索引的最左 N 个字段，也可以是字符串索引的最左 M 个字符。

最左N个字段。当逻辑需求是查到所有名字是“张三”的人时，可以快速定位到 ID4，然后向后遍历得到所有需要的结果。
最左M个字符。如果要查的是所有名字第一个字是“张”的人， SQL 语句的条件是"where name like ‘张 %’"。这时，也能够用上这个索引，查找到第一个符合条件的记录是 ID3，然后向后遍历，直到不满足条件为止。

索引列的顺序

在建立联合索引的时候，如何安排索引内的字段顺序。有两个原则。

如果通过调整顺序，可以少维护一个索引，那么这个顺序往往就是需要优先考虑采用的。这个主要是考虑减少空间。
让选择性最强的索引列放在前面，也就是说，将重复最小的索引放在最前面。这个主要是考虑减少查询时间。

比如，有a和b两列，现在需要建立a的索引和a、b两列的联合索引，不需要b列的索引，那建立(a,b)联合查询就会比(b,a)好，因为前者建立之后就不再需要单独建立a的索引。

索引的选择性是指：不重复的索引值和记录总数的比值。最大值为 1，此时每个记录都有唯一的索引与其对应。选择性越高，查询效率也越高。

例如下面显示的结果中 customer_id 的选择性比 staff_id 更高，因此最好把 customer_id 列放在多列索引的前面。

SELECT COUNT(DISTINCT staff_id)/COUNT(*) AS staff_id_selectivity,
COUNT(DISTINCT customer_id)/COUNT(*) AS customer_id_selectivity,
COUNT(*)
FROM payment;

   staff_id_selectivity: 0.0001
customer_id_selectivity: 0.0373
               COUNT(*): 16049

索引下推

索引下推主要解决的是不符合最左前缀的查询。

在上面的表中，有一个需求：检索出表中“名字第一个字是张，而且年龄是 10 岁的所有男孩”。那么，SQL 语句是这么写的：

mysql> select * from tuser where name like '张 %' and age=10 and ismale=1;

因为存在like语句，所以不能使用联合索引。所以这个语句在搜索索引树的时候，只能用 “张”，找到第一个满足条件的记录 ID3。

接下来，如果在 MySQL 5.6 之前，只能从 ID3 开始一个个回表。到主键索引上找出数据行，再对比字段值。

而 MySQL 5.6 引入的索引下推优化（index condition pushdown)，可以在索引遍历过程中，对索引中包含的字段先做判断，直接过滤掉不满足条件的记录，减少回表次数。

两种情况的执行过程分别如下图所示：

不使用索引下推

使用索引下推

上述图中虚线表示回表。可以看到，不使用索引下推时，InnoDB在(name,age)索引中看不到age的值，需要回表4次。而使用索引下推，InnoDB 在 (name,age) 索引内部就判断了 age 是否等于 10，对于不等于 10 的记录，直接判断并跳过。只需要回表2次。

默认启动。可以通过optimizer_switch系统变量去控制它是否开启：

SET optimizer_switch = 'index_condition_pushdown=off';
SET optimizer_switch = 'index_condition_pushdown=on';

补丁

最左前缀原则

最左前缀原则需要满足的条件：

查询条件的字段数要小于等于多列索引的字段数
中间字段不能存在like
只能用and不能用or

比如联合索引(a,b,c)，满足最左前缀可以使用联合索引的情况有：

a
a and b
a and c
a and b and c

需要注意的是，最左前缀原则是只有用到了最左边的那个索引就会执行，与顺序无关，所以形如c and b and a下面的也会使用联合索引。

不符合的情况有：

a or b
a or c
a or b or c
a like 'xx' and b

下面用一个具体的例子来验证，使用mqsql8.0。

首先建表：

mysql>  create table T(
    id int(20) primary key not null auto_increment,
    name varchar(255) default null,
    age int(20) default null,
    gender varchar(255) default null,
    address varchar(255) default null,
    KEY `unique_index` (`name`,`age`,`gender`))ENGINE=InnoDB;

上面的表中对name,age,gender字段加上了联合索引。

然后插入几条数据：

insert into T values(1, '张三',20,'male','beijing');
insert into T values(2, '王五',21,'feale','shanghai');
insert into T values(3, '张三',23,'male','shandong');
insert into T values(4, '张无',24,'male','chognqing');
insert into T values(5, '李九',20,'male','guangzhou');

查询条件为`name`

explain select * from t where name='张三'\G;

结果为：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ref
possible_keys: unique_index
          key: unique_index
      key_len: 767
          ref: const
         rows: 2
     filtered: 100.00
        Extra: Using index
1 row in set, 1 warning (0.00 sec)

可以看到，key: Using index，使用了联合索引。(关于explain关键字各个字段的含义，可以参考[MySQL高级](一) EXPLAIN用法和结果分析)

查询条件为`age`

explain select * from t where age=20\G;

结果为：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

可以看到，key: null，没有使用索引。

查询条件为`gender`

explain select * from t where gender='male'\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

没有使用索引。

查询条件为`name and age`

explain select * from t where name='张三' and age=20\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ref
possible_keys: unique_index
          key: unique_index
      key_len: 773
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

使用了联合索引。

查询条件为`age and name`

依旧使用name，但是不在最左边，也不影响使用联合索引。

explain select * from t where age=10 and name='张三'\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ref
possible_keys: unique_index
          key: unique_index
      key_len: 773
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

查询条件为`name and gender`

explain select * from t where name='张三' and gender='male'\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ref
possible_keys: unique_index
          key: unique_index
      key_len: 768
          ref: const
         rows: 1
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

这两个条件分别位于联合索引位置的第一和第三，测试联合索引依旧有效。

查询条件为`name or age`

explain select * from t where name='fdf' or age=10\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ALL
possible_keys: unique_index
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

把 and 换成 or，联合所索引无效。

查询条件为`age and gender`

explain select * from t where age=10 and gender='male'\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

这两个条件分别位于联合索引位置的第二和第三，发现联合索引无效。

三个索引一起用

explain select * from t where name='zhagnsan' and  age=10 and gender='male'\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ref
possible_keys: unique_index
          key: unique_index
      key_len: 1541
          ref: const,const,const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

三个索引一起用，有效。

包含`like`的语句

explain select * from t where name like '%z%' and  age=10 and gender='male'\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

包含like的语句，索引无效。

语句包含`<=`等含有范围的关键字

explain select * from t where name='zhagnsan' and  age<=10 and gender='male'\G;

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t
   partitions: NULL
         type: range
possible_keys: unique_index
          key: unique_index
      key_len: 1541
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，索引有效。

小补丁

前面通过实际情况分析了最左前缀规则，但是分析的还比较粗略，知识分析了前缀索引能不能用，没有具体分析select语句中的具体哪个条件可以用索引，哪个条件不可以用索引。下面就这个问题分析一下。

还是创建一个表，为了分析简单起见，创建一个只有tinyint类型的表，该类型占1个字节。

mysql> create table t3(
    id1 tinyint not null primary key auto_increment,
    id2 tinyint not null,
    id3 tinyint not null,
    id4 tinyint not null,
    id5 tinyint not null
)engine=InnoDB;
insert into t3 values(1,2,3,4,5);

下面来分析一下语句中的哪个条件可以使用索引而哪个条件不可以使用。

语句1

explain select * from t3 where id2=2 and id3=3 and id4=4\G;

结果如下：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: ref
possible_keys: union_index
          key: union_index
      key_len: 3
          ref: const,const,const
         rows: 2
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

可以看到，key字段不为空，使用了索引。更进一步，key_len的值为3，表示使用的索引长度为3个字节。而tinyint类型的长度为1个字节，所以说明，id2、id3和id4都使用了索引。

语句2

explain select * from t3 where id2=2 and id3>3 and id4=4\G;

执行结果为：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: range
possible_keys: union_index
          key: union_index
      key_len: 2
          ref: NULL
         rows: 1
     filtered: 50.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，key_len的值为2，也就是id2、id3和id4中只用到了两个索引。为了验证哪两个字段使用了索引，执行下列语句：

explain select * from t3 where id2=2 and id3>3\G;

结果为：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: range
possible_keys: union_index
          key: union_index
      key_len: 2
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，将id4去掉之后还是使用两个索引，说明id4没有用到索引，而id2和id3用到了索引。

语句3

explain select * from t3 where id2=2 and id3=3 and id4>4\G;

执行结果为：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: range
possible_keys: union_index
          key: union_index
      key_len: 3
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，三列都用了索引。

语句4

explain select * from t3 where id2>2 and id3=3 and id4=4\G;

执行结果为：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: range
possible_keys: union_index
          key: union_index
      key_len: 1
          ref: NULL
         rows: 1
     filtered: 50.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，只有一列用了索引。为了验证是哪一列用了索引，执行下列语句：

explain select * from t3 where id2>2\G;

结果为为：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: range
possible_keys: union_index
          key: union_index
      key_len: 1
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，仍然用了一个索引，说明id3和id4没用到索引，只有id2用了。

语句5

explain select * from t3 where id2=2 and id3=3 order by id4\G;

结果如下

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: ref
possible_keys: union_index
          key: union_index
      key_len: 2
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，key_ken为2，可想而知，id2和id3用了索引而id4没用到。

语句6

explain select * from t3 where id2=2 and id4=4 order by id3\G;

结果如下，

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: ref
possible_keys: union_index
          key: union_index
      key_len: 1
          ref: const
         rows: 1
     filtered: 50.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可想而知，只有id2用了索引。

语句7

explain select * from t3 where id3=3 and id4=4 order by id2\G;

结果如下，

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2
     filtered: 50.00
        Extra: Using where; Using filesort
1 row in set, 1 warning (0.00 sec)

可以看到，所有列都没用到索引。

语句8

explain select * from t3 where id2 like 2 and id3=3 and id4=4\G;

结果如下：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: ALL
possible_keys: union_index
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 2
     filtered: 50.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

可以看到，没有用到索引。

语句9

explain select * from t3 where id2=2 and id3 like 3 and id4=4\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: ref
possible_keys: union_index
          key: union_index
      key_len: 1
          ref: const
         rows: 2
     filtered: 50.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，只有id2用了索引。

语句10

explain select * from t3 where id2=2 and id3=3 and id4 like 4\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t3
   partitions: NULL
         type: ref
possible_keys: union_index
          key: union_index
      key_len: 2
          ref: const,const
         rows: 2
     filtered: 50.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

可以看到，只有id2和id3用了索引。

小结：

使用>、<等范围符号的时候，使用的列可以用到联合索引，但是后面的列都用不到。
使用order by的列用不到联合索引，后面的列也都用不到索引。
使用like的情况和使用order by的情况一样，本列用不到索引，后面的列也用不到。

这种情况的原因是联合索引是按照索引的顺序进行排序的，而使用范围符号和order by时就破坏了有序性，所以导致联合索引不能用。

单引号和双引号的区别

mysql的字段名、表名通常不需要加任何引号，如果非要加上引号，必须加反引号(``)；
mysql的别名可以不加引号，如果加引号，单引号和双引号以及反引号都可以；
单引号和双引号都可以表示字符串；

如果双引号的sql执行报错，可以执行：

SET SESSION SQL_MODE=ANSI_QUOTES;

一个问题

如果将上面验证最左前缀的表中的address字段去掉，变为下面的表：

create table T2(
    id int(20) primary key not null auto_increment,
    name varchar(255) default null,
    age int(20) default null,
    gender varchar(255) default null,
    KEY `unique_index` (`name`,`age`,`gender`))ENGINE=InnoDB;

也就是说，表中所有字段都是索引(主键id默认为聚簇索引)，这时，即使查询条件不满足最左前缀规则，也可以使用联合索引了。比如执行下面的查询语句：

explain select * from t2 where age=10 and gender='male'\G;

结果为：

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
   partitions: NULL
         type: index
possible_keys: unique_index
          key: unique_index
      key_len: 1541
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

参考

MYSQL创建表，字段用双引号时报错
 mysql的单引号和双引号的区别？？（基本没有）
理解Mysql的多列索引
 mysql-覆盖索引
 mysql高效索引之覆盖索引
 mysql覆盖索引详解
 mysql联合索引详解
 多个单列索引和联合索引的区别详解
 [MySQL高级](一) EXPLAIN用法和结果分析

05 | 深入浅出索引（下）

覆盖索引

联合索引

最左前缀原则

索引列的顺序

索引下推

补丁

最左前缀原则

查询条件为name

查询条件为age

查询条件为gender

查询条件为name and age

查询条件为age and name

查询条件为name and gender

查询条件为name or age

查询条件为age and gender

三个索引一起用

包含like的语句

语句包含<=等含有范围的关键字

小补丁

语句1

语句2

语句3

语句4

语句5

语句6

语句7

语句8

语句9

语句10

单引号和双引号的区别

一个问题

参考

内容目录

查询条件为`name`

查询条件为`age`

查询条件为`gender`

查询条件为`name and age`

查询条件为`age and name`

查询条件为`name and gender`

查询条件为`name or age`

查询条件为`age and gender`

包含`like`的语句

语句包含`<=`等含有范围的关键字