解析OceanBase v4.2函数索引进行查询优化

发布于:2024-04-20 ⋅ 阅读:(15) ⋅ 点赞:(0)

一、如何通过函数索引来进行查询优化

函数索引是一种优化查询的技术,其主要作用在于提升包含函数调用的查询语句的执行速度。当查询语句中包含函数调用时,数据库系统需要逐行执行函数计算,这无疑会增加查询的复杂性,导致查询速度下降。然而,通过创建函数索引,我们可以在查询时直接定位到匹配的函数值,从而避免重复计算,提高查询的速度。下面,我们将通过一个具体示例来展示如何利用函数索引来优化查询性能。

假设有一张表t1,t1中有一个数据类型为date的列date_col:

create table t1(date_col date, sales_col int, name_col varchar(10));

如果需要获取3月份的数据,那么可以使用Month()函数来查询:

select * from t1 where month(date_col) = 3;

这种场景下,数据库需要为表中每行计算month(date_col),过滤掉不符合"month(date_col) = 3"的行。如果需要频繁使用月份信息来过滤数据,每次都需要重新计算month(date_col),就会造成大量开销。因此可以在date_col上建一个函数索引,将month(date_col)存储到了索引表中,来加速查询,创建函数索引的语句如下:

create index i1 on t1((month(date_col)));

建立索引后,查询时就可以直接使用该索引,避免了对每个行进行函数计算,提高了查询效率。

explain select * from t1 where month(date_col) = 3;
+-------------------------------------------------------------------------------------------------+
| Query Plan                                                                                      |
+-------------------------------------------------------------------------------------------------+
| ==================================================                                              |
| |ID|OPERATOR        |NAME  |EST.ROWS|EST.TIME(us)|                                              |
| --------------------------------------------------                                              |
| |0 |TABLE RANGE SCAN|t1(i1)|1       |7           |                                              |
| ==================================================                                              |
| Outputs & filters:                                                                              |
| -------------------------------------                                                           |
|   0 - output([t1.date_col], [t1.sales_col], [t1.name_col]), filter(nil), rowset=16              |
|       access([t1.__pk_increment], [t1.date_col], [t1.sales_col], [t1.name_col]), partitions(p0) |
|       is_index_back=true, is_global_index=false,                                                |
|       range_key([t1.SYS_NC19$], [t1.__pk_increment]), range(3,MIN ; 3,MAX),                     |
|       range_cond([t1.SYS_NC19$ = 3])                                                            |
+-------------------------------------------------------------------------------------------------+

二、OceanBase 4.2在MySQL模式下支持函数索引功能

OceanBase 4.1 以及之前的版本中,已在Oracle模式下支持了函数索引功能。OceanBase 4.2 在MySQL模式支持函数索引功能,兼容MySQL 8.0。

示例1:使用create index语句创建函数索引。

语法如下:

CREATE [UNIQUE | FULLTEXT | SPATIAL] INDEX index_name
    ON tbl_name (expr,...)
    [index_option] ...

key_part: (expr) [ASC | DESC]

expr是一个合法的函数索引表达式,且允许是布尔表达式,例如"c1=c1"。与MySQL不同的是,OceanBase禁止在函数索引的定义中引用生成列。

例如以下语句在t1_func表上创建了一个索引定义是c1+c2 < 1的函数索引i1。

create table t1_func(c1 int, c2 int);
create index i1 on t1_func ((c1+c2 < 1));

示例2:使用alter table语句创建函数索引。

语法如下:

ALTER TABLE tbl_name
    [alter_option [, alter_option] ...]
    [partition_options]

alter_option: {
    table_options
  | ADD {INDEX | KEY} [index_name]
        [index_type] (key_part,...) [index_option] ...
  | ADD SPATIAL [INDEX | KEY] [index_name]
        (key_part,...) [index_option] ...
  | ...

key_part: (expr) [ASC | DESC]

例如以下语句在t1_func上添加了3个函数索引,其中一个名字是i2,另外两个由系统自动生成的名称,格式为‘functional_index’前缀加编号。

alter table t1_func add index ((concat(c1,'a')));
alter table t1_func add index ((c1+1));
alter table t1_func add index i2 ((concat(c1,'a')));

示例3:使用create table语句在建表时创建函数索引。

语法如下:

CREATE [TEMPORARY] TABLE [IF NOT EXISTS] tbl_name
    (create_definition,...)
    [table_options]
    [partition_options]

create_definition: {
    col_name column_definition
  | {INDEX | KEY} [index_name] [index_type] (key_part,...)
      [index_option] ...
  | SPATIAL [INDEX | KEY] [index_name] (key_part,...)
      [index_option] ...
  | [CONSTRAINT [symbol]] UNIQUE [INDEX | KEY]
      [index_name] [index_type] (key_part,...)
      [index_option] ...
  ...
}

key_part: (expr) [ASC | DESC]

例如以下语句在创建表t2_func时,创建了一个函数索引i1。

create table t2_func (c1 int, c2 int, index i1 ((c1+1)), unique key ((c1+c2)));

可以使用show create table语句查看创建的函数索引:

show create table t1_func;
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t1_func | CREATE TABLE `t1_func` (
  `c1` int(11) DEFAULT NULL,
  `c2` int(11) DEFAULT NULL,
  KEY `i1` (((`c1` + `c2`) < 1)) BLOCK_SIZE 16384 LOCAL,
  KEY `functional_index` (concat(`c1`,'a')) BLOCK_SIZE 16384 LOCAL,
  KEY `functional_index_2` ((`c1` + 1)) BLOCK_SIZE 16384 LOCAL,
  KEY `i2` (concat(`c1`,'a')) BLOCK_SIZE 16384 LOCAL
) DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'lz4_1.0' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

show create table t2_func;
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table   | Create Table                                                                                                                                                                                                                                                                                                                                                                       |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t2_func | CREATE TABLE `t2_func` (
  `c1` int(11) DEFAULT NULL,
  `c2` int(11) DEFAULT NULL,
  UNIQUE KEY `functional_index` ((`c1` + `c2`)) BLOCK_SIZE 16384 LOCAL,
  KEY `i1` ((`c1` + 1)) BLOCK_SIZE 16384 LOCAL
) DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'lz4_1.0' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

也可以使用show index命令来查看已创建的函数索引,例如:

show index from t1_func;
+---------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------------+
| Table   | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment   | Index_comment | Visible | Expression          |
+---------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------------+
| t1_func |          1 | i1                 |            1 | SYS_NC18$   | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | ((`c1` + `c2`) < 1) |
| t1_func |          1 | functional_index   |            1 | SYS_NC19$   | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | concat(`c1`,'a')    |
| t1_func |          1 | functional_index_2 |            1 | SYS_NC20$   | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | (`c1` + 1)          |
| t1_func |          1 | i2                 |            1 | SYS_NC19$   | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | concat(`c1`,'a')    |
+---------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------------+

show index from t2_func;
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------+
| Table   | Non_unique | Key_name         | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment   | Index_comment | Visible | Expression    |
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------+
| t2_func |          1 | i1               |            1 | SYS_NC18$   | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | (`c1` + 1)    |
| t2_func |          0 | functional_index |            1 | SYS_NC19$   | A         |        NULL | NULL     | NULL   | YES  | BTREE      | available |               | YES     | (`c1` + `c2`) |
+---------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+---------------+

三、OceanBase 4.2禁止非确定性函数用于函数索引

OceanBase 4.2版本禁止了一些非确定性的系统函数被用于创建函数索引和生成列,以提升稳定性。这些系统函数的结果会随着系统或用户环境的变化而变化的函数,例如:

select current_time();
+----------------+
| current_time() |
+----------------+
| 20:44:22       |
+----------------+

select current_time();
+----------------+
| current_time() |
+----------------+
| 20:44:24       |
+----------------+

MySQL模式

以下函数被禁止用于生成列和函数索引:

aes_decrypt from_unixtime statement_digest
aes_encrypt host_ip sys_privilege_check
benchmark icu_version sysdate
connection_id is_serving_tenant time_to_usec
current_date last_excution_id unix_timestamp
current_time last_insert_id usec_to_time
current_timestamp lnnvl user
current_user localtimestamp utc_date
current_user_priv mysql_port utc_time
database name_const utc_timestamp
des_decrypt ob_version uuid
des_encrypt rand uuid_short
des_hex_str random_bytes validate_password_strength
dump row_count version
encrypt rpc_port weight_string
found_rows sleep

一些时间相关的系统函数禁止在参数为TIME类型时用于函数索引,例如:

create table time_func(c1 time, c2 date);
create index i1 on time_func((date(c1)));
ERROR 3758 (HY000): Expression of functional index contains a disallowed function.

涉及以下函数:

adddate dayofyear timestampdiff
date last_day to_days
date_sub month week
datediff monthname weekday
day quarter weekofyear
dayname timestamp year
dayofmonth timestamp_nvl yearweek
dayofweek timestampadd

此外,cast函数将time类型的参数转换成非time的其他时间类型,或者将timestamp类型的参数转换为非timestamp的其他数据类型时,也被禁止用于生成列和函数索引,例如:

create index i1 on time_func((cast(c1 as DATE)));
ERROR 3758 (HY000): Expression of functional index contains a disallowed function.

Oracle模式

与之前的版本相比,4.2版本在Oracle模式下新增禁止了以下函数用于生成列和函数索引:

dump ob_version rpc_port
host_ip regexp_count scn_to_timestamp
is_serving_tenant regexp_instr timestamp_to_scn
last_trace_id regexp_substr

以下系统函数在参数是字符串类型时禁止用于函数索引:

add_months next_day
last_day timestamp_nvl
months_between

例如:

create table t1(c1 date, c2 varchar(40), c3 timestamp);
create index i1 on t1(ADD_MONTHS(c2, 1));
ORA-01743: only pure functions can be indexed

此外,cast函数在将字符串转为时间类型、将时间类型转为字符串、将不带timezone信息的时间类型转为带timezone信息的时间类型的情况下禁止用于函数索引,例如:

create table t2(c1 varchar(20), c2 date, c3 TIMESTAMP WITH TIME ZONE, c4 number);
create index i1 on t2(cast(c1 as DATE));
ORA-01743: only pure functions can be indexed

create index i1 on t2(cast(c2 as TIMESTAMP WITH TIME ZONE));
ORA-01743: only pure functions can be indexed

create index i1 on t2(cast(c3 as VARCHAR(10)));
ORA-01743: only pure functions can be indexed

四、总结

Oceanbase 4.2版本支持了在MySQL模式下创建和使用函数索引,并且禁止了部分非确定性函数用于创建函数索引和生成列以提升稳定性。但是其中部分函数在MySQL和Oracle中是允许用于函数索引的,例如:

(Mysql 8.0.31)
mysql> create table time_func(c1 timestamp, c2 date);
Query OK, 0 rows affected (0.04 sec)

mysql> create index i1 on time_func((date(c1)));
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

Oceanbase后续版本将会放开对这些函数的限制,进一步提升兼容性。