Creating Indexes: GaussDB (PostgreSQL) Development Notes

Published: 2025-05-22

1. Viewing Database and Table Information

(1) Check the database version

  • SELECT version();

PostgreSQL 9.2.4 (GaussDB 9.1.0 build 53f3b265) compiled at 2025-03-21 14:49:19 commit 9292 last mr 24793

(2) View table information

  • **\d command**: in the gsql command-line tool, run \d table_name to view a table's basic information, including its columns, data types, and indexes.

  • **\d+ command**: run \d+ table_name for more detail, such as per-column specifics and the table's storage options.

  • **Query the information_schema.columns view**: run SELECT * FROM information_schema.columns WHERE table_name = 'table_name' to list the columns of a table, including column name, data type, and nullability.

  • **Query the pg_tables system view**: run SELECT * FROM pg_tables WHERE tablename = 'table_name' to see basic table information such as the table name, schema, and tablespace.

select count(1) from table_for_test_index;

-- total row count: 4579369

2. Index Management

(1) View indexes

  • **\d command**: in gsql, run \d table_name to view a table's index information.

  • **Query the pg_indexes system view**: run SELECT * FROM pg_indexes WHERE tablename = 'table_name' to view the indexes on a table, including the index name, table name, and index definition.
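As a convenience, pg_indexes can be combined with the size functions inherited from PostgreSQL to see how much disk space each index uses. A minimal sketch, assuming pg_relation_size and pg_size_pretty are available in your GaussDB version:

```sql
-- List each index on the table with its definition and on-disk size.
-- Qualify with schemaname so the regclass cast works outside search_path.
SELECT indexname,
       pg_size_pretty(pg_relation_size((schemaname || '.' || indexname)::regclass)) AS index_size,
       indexdef
FROM   pg_indexes
WHERE  tablename = 'table_for_test_index';
```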

(2) Create indexes

  • Create a regular index: CREATE INDEX index_name ON table_name (column_name);

  • Create a unique index: CREATE UNIQUE INDEX index_name ON table_name (column_name);

3. Query Optimization

(1) Execution plan analysis

  • **Use EXPLAIN ANALYZE**: view a query's execution plan to analyze its performance. Key metrics include each operator's actual execution time (A-time), actual rows returned (A-rows), estimated rows (E-rows), and peak memory use (Peak Memory).
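A minimal sketch of using this in practice, checking one predicate against the test table:

```sql
-- Compare the planner's estimates against reality for a single filter.
EXPLAIN ANALYZE
SELECT count(*)
FROM   table_for_test_index
WHERE  confirm_first_time >= '2025-02-28';
-- In the output, a large gap between E-rows and A-rows usually means
-- stale statistics: run ANALYZE table_for_test_index; and re-check.
```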

(2) Optimization suggestions

  • **Add indexes**: create indexes on the columns used in query predicates to speed up filtering and sorting.

  • **Optimize the SQL**: write predicates in a form the database can match against existing indexes.
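One common pitfall worth sketching here: a composite index is only used efficiently when the query filters on its leading column. Illustrated with the scenario-4 index defined later in this article:

```sql
-- Given: CREATE INDEX idx_confirm_first_time_team_id
--        ON table_for_test_index (confirm_first_time, team_id);

-- Can use the index: the predicate covers the leading column.
SELECT * FROM table_for_test_index
WHERE  confirm_first_time >= '2025-02-28' AND team_id = 1122;

-- Cannot use the index efficiently: the leading column is skipped,
-- so this typically falls back to a full scan.
SELECT * FROM table_for_test_index
WHERE  team_id = 1122;
```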

  • **Maintain indexes regularly**: periodically run VACUUM and ANALYZE on the table to keep indexes compact and statistics accurate.

4. Database Maintenance

(1) VACUUM

  • **Purpose**: reclaims the space left behind by DELETE and UPDATE operations and refreshes visibility information, reducing wasted disk space and improving query performance.

  • **Usage**: VACUUM [VERBOSE] [table_name];

  • **Example**: VACUUM VERBOSE table_for_test_index;

(2) ANALYZE

  • **Purpose**: collects table statistics so the query planner can generate better execution plans.

  • **Usage**: ANALYZE [VERBOSE] [table_name];

  • **Example**: ANALYZE VERBOSE table_for_test_index;

(3) Combined command

  • **VACUUM ANALYZE**: runs VACUUM to reclaim space, then ANALYZE to refresh statistics.

  • **Example**: VACUUM ANALYZE activity;

5. Case Study

create table table_for_test_index
(
    data_id              bigint,
    confirm_first_time   timestamp,
    inspect_status       text,
    item_name            varchar(4000),
    item_content         varchar(4000),
    item_result          text,
    province_code        varchar(400),
    province_name        varchar(400),
    area_code            varchar(400),
    area_name            varchar(400),
    sub_dealer_id        integer,
    sub_dealer_name      varchar(400),
    team_id              integer,
    team_name            varchar(400)
);

-- Planned indexes
-- Scenario 2, single-column index:
CREATE INDEX idx_table_for_test_index_confirm_first_time ON table_for_test_index (confirm_first_time);
-- Scenario 4, composite index:
CREATE INDEX idx_confirm_first_time_team_id ON table_for_test_index (confirm_first_time, team_id);

-- Target SQL

explain analyse
 select order_no,
        item_name,
        item_content,
        item_result,
        province_name,
        area_name,
        dealer_name,
        dealer_team_name,
        supo_dealer_manager_name,
        to_char(fst_pass_time, 'YYYY-MM-DD') fst_pass_time,
        to_char(confirm_first_time, 'YYYY-MM-DD') as            confirm_first_time
 from table_for_test_index
 where 1 = 1
   AND confirm_first_time >= '2025-02-28T00:00'
   and '2025-05-22T23:59:59.999999999' >= confirm_first_time
   and item_name in ('固定', '焊接', '线缆', '阴影')
   and item_result in ('不合格')
   and area_code in
       ('aa','cc','vv','ss')
   and province_code in ('a')
   and sub_dealer_id in (11)
   and team_id in (1122)
 order by data_id
 LIMIT 10

(1) Scenario 1: no index

 id |                         operation                          |        A-time        | A-rows | E-rows |  Peak Memory   | E-memory | A-width | E-width |  E-costs  
----+------------------------------------------------------------+----------------------+--------+--------+----------------+----------+---------+---------+-----------
  1 | ->  Limit                                                  | 1448.363             |      1 |     10 | 8KB            |          |         |     307 | 136675.50 
  2 |    ->  Streaming (type: GATHER)                            | 1448.360             |      1 |     30 | 160KB          |          |         |     307 | 136679.86 
  3 |       ->  Limit                                            | [1396.774, 1425.354] |      1 |     30 | [8KB, 8KB]     | 1MB      |         |     307 | 136671.86 
  4 |          ->  Sort                                          | [1396.769, 1425.350] |      1 |    663 | [40KB, 40KB]   | 16MB     | [0,360] |     307 | 136672.38 
  5 |             ->  Subquery Scan on table_for_test_index      | [1396.596, 1425.256] |      1 |    662 | [24KB, 32KB]   | 1MB      |         |     307 | 136667.05 
  6 |                ->  Hash Semi Join (7, 8)                   | [1396.594, 1425.183] |      1 |    662 | [24KB, 32KB]   | 1MB      |         |     307 | 136665.94 
  7 |                   ->  Seq Scan on table_for_test_index     | [1396.586, 1424.857] |      1 |    662 | [80KB, 80KB]   | 1MB      |         |     314 | 136662.74 
  8 |                   ->  Hash                                 | [0.076, 0.076]       |     88 |    264 | [296KB, 296KB] | 16MB     | [23,23] |      32 | 1.10      
  9 |                      ->  Values Scan on "*VALUES*"         | [0.031, 0.031]       |     88 |    264 | [16KB, 16KB]   | 1MB      |         |      32 | 1.10      

            Predicate Information (identified by plan id)            
---------------------------------------------------------------------
  6 --Hash Semi Join (7, 8)
        Hash Cond: ((table_for_test_index.area_code)::text = "*VALUES*".column1)
  7 --Seq Scan on table_for_test_index
        Filter: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time) AND (item_result = '不合格'::text) AND ((province_code)::text = '45'::text) AND (sub_dealer_id = 1077) AND (team_id = 1122) AND ((item_name)::text = ANY ('{压块固定问题,焊接防腐,线缆虚接,阴影遮挡}'::text[])))
        Rows Removed by Filter: 4579368

           Memory Information (identified by plan id)            
-----------------------------------------------------------------
Coordinator Query Peak Memory:
        Query Peak Memory: 0MB
Datanode:
        Max Query Peak Memory: 0MB
        Min Query Peak Memory: 0MB
  4 --Sort
        Sort Method: quicksort  Memory: 25kB ~ 25kB
  8 --Hash
        Max Buckets: 32768  Max Batches: 1  Max Memory Usage: 4kB
        Min Buckets: 0  Min Batches: 0  Min Memory Usage: 0kB

                                 User Define Profiling                                  
----------------------------------------------------------------------------------------
Plan Node id: 2  Track name: coordinator get datanode connection
  (actual time=[9.030, 9.030], calls=[1, 1])
Plan Node id: 2  Track name: coordinator begin transaction
  (actual time=[0.002, 0.002], calls=[1, 1])
Plan Node id: 2  Track name: coordinator send command
  (actual time=[0.091, 0.091], calls=[3, 3])
Plan Node id: 2  Track name: coordinator get the first tuple
  (actual time=[37.865, 37.865], calls=[1, 1])
Plan Node id: 7  Track name: heap scan read buffer
  (sample time=[72.655, 77.479], estimated time=[726.567, 774.834], calls=[87041, 87065])

                            ====== Query Summary =====                            
----------------------------------------------------------------------------------
Datanode executor start time [dn_6005_6006, dn_6001_6002]: [0.358 ms, 0.399 ms]
Datanode executor run time [dn_6005_6006, dn_6003_6004]: [1398.015 ms, 1435.453 ms]
Datanode executor end time [dn_6003_6004, dn_6001_6002]: [0.031 ms, 0.037 ms]
System available mem: 4546560KB
Query Max mem: 4546560KB
Query estimated mem: 32768KB
Enqueue time: 34.512 ms
Coordinator executor start time: 0.203 ms
Coordinator executor run time: 1450.309 ms
Coordinator executor end time: 0.041 ms
Parser runtime: 0.000 ms
Planner runtime: 2.931 ms
Query Id: 73464969046324330
Unique SQL Id: 2238842644
Unique SQL Hash: sql_07acf26f0590c5ec805e5875efec3870
Total runtime: 1488.180 ms

  • **Execution time**: total runtime is 1488.180 ms.

  • **Main operators**:

    • **Seq Scan**: with no index, the database must scan the whole table sequentially to find matching rows, so a huge number of rows are read and discarded (Rows Removed by Filter: 4579368); this accounts for most of the query time.

    • **Hash Semi Join**: joins two data sets, one from the sequential scan and one from a Values Scan; the join condition is on area_code.

    • **Sort**: orders the result set by data_id using quicksort, with 25kB of memory.

    • **Limit**: caps the result at 10 rows, reducing the amount of data returned.

  • **Diagnosis**: the sequential scan dominates the cost; a large volume of data is scanned and then filtered out, wasting most of the execution time.

(2) Scenario 2: add an index on confirm_first_time

 id |                             operation                              |       A-time       | A-rows  | E-rows  |  Peak Memory   | E-memory | A-width | E-width |  E-costs  
----+--------------------------------------------------------------------+--------------------+---------+---------+----------------+----------+---------+---------+-----------
  1 | ->  Limit                                                          | 636.978            |       1 |      10 | 8KB            |          |         |     307 | 102387.55 
  2 |    ->  Streaming (type: GATHER)                                    | 636.974            |       1 |      30 | 160KB          |          |         |     307 | 102391.90 
  3 |       ->  Limit                                                    | [614.182, 626.947] |       1 |      30 | [8KB, 8KB]     | 1MB      |         |     307 | 102383.90 
  4 |          ->  Sort                                                  | [614.180, 626.944] |       1 |     663 | [40KB, 40KB]   | 16MB     | [0,360] |     307 | 102384.43 
  5 |             ->  Subquery Scan on table_for_test_index              | [614.131, 626.898] |       1 |     662 | [24KB, 32KB]   | 1MB      |         |     307 | 102379.10 
  6 |                ->  Hash Semi Join (7, 9)                           | [614.128, 626.895] |       1 |     662 | [24KB, 32KB]   | 1MB      |         |     307 | 102377.99 
  7 |                   ->  Bitmap Heap Scan on table_for_test_index     | [613.883, 626.598] |       1 |     662 | [72KB, 72KB]   | 1MB      |         |     314 | 102374.79 
  8 |                      ->  Bitmap Index Scan                         | [69.465, 75.358]   | 1082697 | 1081447 | [4MB, 4MB]     | 1MB      |         |       0 | 3609.07   
  9 |                   ->  Hash                                         | [0.071, 0.083]     |     264 |     264 | [296KB, 296KB] | 16MB     | [23,23] |      32 | 1.10      
 10 |                      ->  Values Scan on "*VALUES*"                 | [0.027, 0.045]     |     264 |     264 | [16KB, 16KB]   | 1MB      |         |      32 | 1.10      

                                Predicate Information (identified by plan id)                                
-------------------------------------------------------------------------------------------------------------
  6 --Hash Semi Join (7, 9)
        Hash Cond: ((table_for_test_index.area_code)::text = "*VALUES*".column1)
  7 --Bitmap Heap Scan on table_for_test_index
        Recheck Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time))
        Filter: ((item_result = '不合格'::text) AND ((province_code)::text = '45'::text) AND (sub_dealer_id = 1077) AND (team_id = 1122) AND ((item_name)::text = ANY ('{压块,焊接,线缆,阴影}'::text[])))
        Rows Removed by Filter: 1082696
  8 --Bitmap Index Scan
        Index Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time))

           Memory Information (identified by plan id)            
-----------------------------------------------------------------
Coordinator Query Peak Memory:
        Query Peak Memory: 2MB
Datanode:
        Max Query Peak Memory: 5MB
        Min Query Peak Memory: 5MB
  4 --Sort
        Sort Method: quicksort  Memory: 25kB ~ 25kB
  9 --Hash
        Max Buckets: 32768  Max Batches: 1  Max Memory Usage: 4kB
        Min Buckets: 32768  Min Batches: 1  Min Memory Usage: 4kB

                                 User Define Profiling                                  
----------------------------------------------------------------------------------------
Plan Node id: 2  Track name: coordinator get datanode connection
  (actual time=[5.730, 5.730], calls=[1, 1])
Plan Node id: 2  Track name: coordinator begin transaction
  (actual time=[0.003, 0.003], calls=[1, 1])
Plan Node id: 2  Track name: coordinator send command
  (actual time=[0.141, 0.141], calls=[3, 3])
Plan Node id: 2  Track name: coordinator get the first tuple
  (actual time=[13.444, 13.444], calls=[1, 1])
Plan Node id: 8  Track name: bitmap heap scan read buffer
  (sample time=[35.496, 36.298], estimated time=[355.004, 363.007], calls=[40175, 40206])

                           ====== Query Summary =====                           
--------------------------------------------------------------------------------
Datanode executor start time [dn_6005_6006, dn_6003_6004]: [0.413 ms, 0.486 ms]
Datanode executor run time [dn_6005_6006, dn_6003_6004]: [614.189 ms, 626.956 ms]
Datanode executor end time [dn_6005_6006, dn_6001_6002]: [0.054 ms, 0.156 ms]
System available mem: 4546560KB
Query Max mem: 4546560KB
Query estimated mem: 32768KB
Enqueue time: 5.010 ms
Coordinator executor start time: 0.317 ms
Coordinator executor run time: 637.256 ms
Coordinator executor end time: 0.043 ms
Parser runtime: 0.000 ms
Planner runtime: 4.348 ms
Query Id: 73464969046932357
Unique SQL Id: 2238842644
Unique SQL Hash: sql_07acf26f0590c5ec805e5875efec3870
Total runtime: 647.191 ms

  • **Execution time**: total runtime is 647.191 ms.

  • **Main operators**:

    • **Bitmap Index Scan**: uses the confirm_first_time index to locate candidate rows quickly. It returned 1082697 rows in [69.465, 75.358] ms, greatly reducing the amount of data to process compared with a sequential scan.

    • **Bitmap Heap Scan**: fetches the full rows identified by the bitmap index scan and applies the remaining filters (item_result, province_code, sub_dealer_id, team_id, item_name).

    • **Sort**: orders the result set by data_id using quicksort, with 25kB of memory.

    • **Limit**: caps the result at 10 rows, reducing the amount of data returned.

  • **Effect**: compared with the unindexed run, query time drops by more than half, showing that the index delivers a significant performance gain.

(3) Scenario 3: indexes on confirm_first_time and data_id

 id |                             operation                              |       A-time       | A-rows  | E-rows  |  Peak Memory   | E-memory | A-width | E-width |  E-costs  
----+--------------------------------------------------------------------+--------------------+---------+---------+----------------+----------+---------+---------+-----------
  1 | ->  Limit                                                          | 628.006            |       1 |      10 | 8KB            |          |         |     307 | 102387.55 
  2 |    ->  Streaming (type: GATHER)                                    | 628.002            |       1 |      30 | 160KB          |          |         |     307 | 102391.90 
  3 |       ->  Limit                                                    | [610.628, 617.545] |       1 |      30 | [8KB, 8KB]     | 1MB      |         |     307 | 102383.90 
  4 |          ->  Sort                                                  | [610.625, 617.543] |       1 |     663 | [40KB, 40KB]   | 16MB     | [0,360] |     307 | 102384.43 
  5 |             ->  Subquery Scan on table_for_test_index              | [610.567, 617.491] |       1 |     662 | [24KB, 32KB]   | 1MB      |         |     307 | 102379.10 
  6 |                ->  Hash Semi Join (7, 9)                           | [610.491, 617.487] |       1 |     662 | [24KB, 32KB]   | 1MB      |         |     307 | 102377.99 
  7 |                   ->  Bitmap Heap Scan on table_for_test_index     | [610.248, 617.215] |       1 |     662 | [72KB, 72KB]   | 1MB      |         |     314 | 102374.79 
  8 |                      ->  Bitmap Index Scan                         | [68.714, 82.117]   | 1082697 | 1081447 | [4MB, 4MB]     | 1MB      |         |       0 | 3609.07   
  9 |                   ->  Hash                                         | [0.071, 0.074]     |     264 |     264 | [296KB, 296KB] | 16MB     | [23,23] |      32 | 1.10      
 10 |                      ->  Values Scan on "*VALUES*"                 | [0.024, 0.028]     |     264 |     264 | [16KB, 16KB]   | 1MB      |         |      32 | 1.10      

                                Predicate Information (identified by plan id)                                
-------------------------------------------------------------------------------------------------------------
  6 --Hash Semi Join (7, 9)
        Hash Cond: ((table_for_test_index.area_code)::text = "*VALUES*".column1)
  7 --Bitmap Heap Scan on table_for_test_index
        Recheck Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time))
        Filter: ((item_result = '不合格'::text) AND ((province_code)::text = '45'::text) AND (sub_dealer_id = 1077) AND (team_id = 1122) AND ((item_name)::text = ANY ('{压块,焊接,线缆,阴影}'::text[])))
        Rows Removed by Filter: 1082696
  8 --Bitmap Index Scan
        Index Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time))

           Memory Information (identified by plan id)            
-----------------------------------------------------------------
Coordinator Query Peak Memory:
        Query Peak Memory: 0MB
Datanode:
        Max Query Peak Memory: 5MB
        Min Query Peak Memory: 5MB
  4 --Sort
        Sort Method: quicksort  Memory: 25kB ~ 25kB
  9 --Hash
        Max Buckets: 32768  Max Batches: 1  Max Memory Usage: 4kB
        Min Buckets: 32768  Min Batches: 1  Min Memory Usage: 4kB

                                 User Define Profiling                                  
----------------------------------------------------------------------------------------
Plan Node id: 2  Track name: coordinator get datanode connection
  (actual time=[5.962, 5.962], calls=[1, 1])
Plan Node id: 2  Track name: coordinator begin transaction
  (actual time=[0.002, 0.002], calls=[1, 1])
Plan Node id: 2  Track name: coordinator send command
  (actual time=[0.095, 0.095], calls=[3, 3])
Plan Node id: 2  Track name: coordinator get the first tuple
  (actual time=[7.891, 7.891], calls=[1, 1])
Plan Node id: 8  Track name: bitmap heap scan read buffer
  (sample time=[34.368, 35.879], estimated time=[343.731, 358.835], calls=[40175, 40206])

                           ====== Query Summary =====                           
--------------------------------------------------------------------------------
Datanode executor start time [dn_6005_6006, dn_6003_6004]: [0.504 ms, 0.547 ms]
Datanode executor run time [dn_6001_6002, dn_6003_6004]: [610.664 ms, 617.553 ms]
Datanode executor end time [dn_6005_6006, dn_6001_6002]: [0.051 ms, 0.067 ms]
System available mem: 4546560KB
Query Max mem: 4546560KB
Query estimated mem: 32768KB
Enqueue time: 4.886 ms
Coordinator executor start time: 0.191 ms
Coordinator executor run time: 628.284 ms
Coordinator executor end time: 0.045 ms
Parser runtime: 0.000 ms
Planner runtime: 3.145 ms
Query Id: 73464969047523501
Unique SQL Id: 2238842644
Unique SQL Hash: sql_07acf26f0590c5ec805e5875efec3870
Total runtime: 636.740 ms

  • **Execution time**: total runtime is 636.740 ms.

  • **Main operators**:

    • **Bitmap Index Scan**: uses the confirm_first_time index to locate candidate rows quickly, taking [68.714, 82.117] ms.

    • **Bitmap Heap Scan**: fetches the full rows identified by the bitmap index scan and applies the remaining filters.

    • **Sort**: orders the result set by data_id using quicksort, with 25kB of memory. Because data_id is indexed, the sort may benefit from that index, slightly reducing sort time.

    • **Limit**: caps the result at 10 rows, reducing the amount of data returned.

  • **Effect**: compared with Scenario 2, total execution time drops slightly, suggesting the data_id index gives the sort a modest boost and further improves query efficiency.

(4) Scenario 4: composite index on confirm_first_time and team_id

 id |                             operation                              |      A-time      | A-rows | E-rows |  Peak Memory   | E-memory | A-width | E-width | E-costs 
----+--------------------------------------------------------------------+------------------+--------+--------+----------------+----------+---------+---------+---------
  1 | ->  Limit                                                          | 28.507           |      1 |     10 | 8KB            |          |         |     307 | 5987.59 
  2 |    ->  Streaming (type: GATHER)                                    | 28.504           |      1 |     30 | 96KB           |          |         |     307 | 5991.94 
  3 |       ->  Limit                                                    | [22.979, 24.620] |      1 |     30 | [8KB, 8KB]     | 1MB      |         |     307 | 5983.94 
  4 |          ->  Sort                                                  | [22.977, 24.616] |      1 |    663 | [40KB, 40KB]   | 16MB     | [0,360] |     307 | 5984.47 
  5 |             ->  Subquery Scan on table_for_test_index              | [22.968, 24.602] |      1 |    662 | [24KB, 32KB]   | 1MB      |         |     307 | 5979.14 
  6 |                ->  Hash Semi Join (7, 9)                           | [22.966, 24.584] |      1 |    662 | [24KB, 32KB]   | 1MB      |         |     307 | 5978.02 
  7 |                   ->  Bitmap Heap Scan on table_for_test_index     | [22.859, 24.464] |      1 |    662 | [72KB, 72KB]   | 1MB      |         |     314 | 5974.83 
  8 |                      ->  Bitmap Index Scan                         | [22.762, 24.362] |     67 |   1145 | [32KB, 32KB]   | 1MB      |         |       0 | 4510.28 
  9 |                   ->  Hash                                         | [0.064, 0.075]   |    264 |    264 | [296KB, 296KB] | 16MB     | [23,23] |      32 | 1.10    
 10 |                      ->  Values Scan on "*VALUES*"                 | [0.023, 0.033]   |    264 |    264 | [16KB, 16KB]   | 1MB      |         |      32 | 1.10    

                                Predicate Information (identified by plan id)                                
-------------------------------------------------------------------------------------------------------------
  6 --Hash Semi Join (7, 9)
        Hash Cond: ((table_for_test_index.area_code)::text = "*VALUES*".column1)
  7 --Bitmap Heap Scan on table_for_test_index
        Recheck Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time) AND (team_id = 1122))
        Filter: ((item_result = '不合格'::text) AND ((province_code)::text = '45'::text) AND (sub_dealer_id = 1077) AND ((item_name)::text = ANY ('{压块,焊接,线缆,阴影}'::text[])))
        Rows Removed by Filter: 66
  8 --Bitmap Index Scan
        Index Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time) AND (team_id = 1122))

           Memory Information (identified by plan id)            
-----------------------------------------------------------------
Coordinator Query Peak Memory:
        Query Peak Memory: 0MB
Datanode:
        Max Query Peak Memory: 0MB
        Min Query Peak Memory: 0MB
  4 --Sort
        Sort Method: quicksort  Memory: 25kB ~ 25kB
  9 --Hash
        Max Buckets: 32768  Max Batches: 1  Max Memory Usage: 4kB
        Min Buckets: 32768  Min Batches: 1  Min Memory Usage: 4kB

                            User Define Profiling                            
----------------------------------------------------------------------------
Plan Node id: 2  Track name: coordinator get datanode connection
  (actual time=[0.012, 0.012], calls=[1, 1])
Plan Node id: 2  Track name: coordinator begin transaction
  (actual time=[0.003, 0.003], calls=[1, 1])
Plan Node id: 2  Track name: coordinator send command
  (actual time=[0.097, 0.097], calls=[3, 3])
Plan Node id: 2  Track name: coordinator get the first tuple
  (actual time=[2.296, 2.296], calls=[1, 1])
Plan Node id: 8  Track name: bitmap heap scan read buffer
  (sample time=[0.000, 0.001], estimated time=[0.000, 0.011], calls=[21, 22])

                          ====== Query Summary =====                          
------------------------------------------------------------------------------
Datanode executor start time [dn_6001_6002, dn_6005_6006]: [0.107 ms, 0.128 ms]
Datanode executor run time [dn_6005_6006, dn_6001_6002]: [22.986 ms, 24.641 ms]
Datanode executor end time [dn_6001_6002, dn_6003_6004]: [0.041 ms, 0.045 ms]
System available mem: 4546560KB
Query Max mem: 4546560KB
Query estimated mem: 4362KB
Enqueue time: 0.055 ms
Coordinator executor start time: 0.229 ms
Coordinator executor run time: 28.775 ms
Coordinator executor end time: 0.045 ms
Parser runtime: 0.000 ms
Planner runtime: 3.295 ms
Query Id: 73464969050955362
Unique SQL Id: 2238842644
Unique SQL Hash: sql_07acf26f0590c5ec805e5875efec3870
Total runtime: 32.631 ms

Creating the composite index:

CREATE INDEX idx_confirm_first_time_team_id ON table_for_test_index (confirm_first_time, team_id);

  • **Execution time**: total runtime drops to 32.631 ms, a dramatic improvement over the unindexed 1488.180 ms and also better than the single-index scenarios.

  • **Main operators**:

    • **Bitmap Index Scan**: uses the composite index on (confirm_first_time, team_id) to locate matching rows quickly, taking [22.762, 24.362] ms.

    • **Bitmap Heap Scan**: fetches the full rows identified by the bitmap index scan and applies the remaining filters.

    • **Sort**: orders the result set by data_id using quicksort, with 25kB of memory.

    • **Limit**: caps the result at 10 rows, reducing the amount of data returned.

  • **Effect**: the composite index sharply reduces the amount of data scanned, cutting query time dramatically; choosing the right index has a major impact on performance.

(5) Comparing the four scenarios

In GaussDB (PostgreSQL), query performance varies with whether indexes exist and which kind is used. A summary of the four scenarios:

Scenario 1: no index
  • Characteristics: the query requires a full table scan, checking every row and discarding those that fail the predicates.

  • Performance: poor on large tables, with a long execution time: 1488.180 ms in this case.

Scenario 2: single-column index on confirm_first_time
  • Characteristics: the index quickly narrows the scan to the confirm_first_time range, reducing the data to read.

  • Performance: a significant improvement: 647.191 ms in this case.

Scenario 3: separate single-column indexes on confirm_first_time and data_id
  • Characteristics: the confirm_first_time index filters the time range; the data_id index helps the sort.

  • Performance: slightly better than Scenario 2: 636.740 ms in this case.

Scenario 4: composite index on (confirm_first_time, team_id)
  • Characteristics: the composite index applies both columns' predicates during the index scan, minimizing the data read, without the cost of maintaining two separate single-column indexes.

  • Performance: by far the best: 32.631 ms in this case.
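For reference, the index DDL behind the scenarios can be collected in one place. A sketch, noting that the article never shows the exact statement for the scenario-3 data_id index, so that index name is hypothetical:

```sql
-- Scenario 2: single-column index on the time filter.
CREATE INDEX idx_table_for_test_index_confirm_first_time
    ON table_for_test_index (confirm_first_time);

-- Scenario 3: adds a separate index on the sort column
-- (index name is illustrative; the exact DDL is not shown above).
CREATE INDEX idx_table_for_test_index_data_id
    ON table_for_test_index (data_id);

-- Scenario 4: composite index covering both the time filter and team_id.
CREATE INDEX idx_confirm_first_time_team_id
    ON table_for_test_index (confirm_first_time, team_id);
```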

6. Problems You May Hit When Adding Indexes, and Solutions

(1) Insufficient privileges

  • **Symptom**: index creation fails with an error saying the user lacks permission to modify the table structure.

  • **Fix**: make sure the current user has the ALTER privilege on the table. It can be granted with:

    • Grant privilege: GRANT ALTER ON TABLE table_name TO username;

(2) Table-structure conflicts

  • **Symptom**: on a distributed table (here, a round-robin distributed table), creating a primary key or unique index fails:

[2025-05-20 15:20:45] [0A000] ERROR: Round-Robin table does not support PrimaryKey/UniqueIndex

  • **Fix**: in GaussDB (PostgreSQL), round-robin distributed tables do not support primary keys or unique indexes. Create a regular index instead to improve query performance:

    • Create a regular index: CREATE INDEX idx_column ON table_name (column_name);

(3) Index creation failures

  • **Symptom**: index creation fails with an error such as a duplicate index name or an unsupported column type.

  • **Fix**:

    • Check the index name: make sure it is not already in use.

    • Check the column type: make sure the column type supports indexing; for unsupported types, consider an expression (function-based) index or changing the column type.
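A minimal sketch of the expression-index workaround mentioned above, using standard PostgreSQL syntax (support for a given expression in GaussDB is an assumption):

```sql
-- Index an expression rather than a raw column, useful when queries
-- always filter on a transformed value.
CREATE INDEX idx_area_code_lower
    ON table_for_test_index (lower(area_code));

-- A query can only use this index if it applies the same expression:
SELECT * FROM table_for_test_index
WHERE  lower(area_code) = 'aa';
```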

(4) Performance impact

  • **Symptom**: building an index can hurt the table's write performance; on large tables the build can take a long time and consume significant disk space.

  • **Fix**:

    • Pick the right time: build indexes during off-peak hours to minimize the impact on the business.

    • Build concurrently: use the CONCURRENTLY keyword to reduce write locking during the build. For example:

      • Concurrent build: CREATE INDEX CONCURRENTLY idx_column ON table_name (column_name);

(5) Index maintenance cost

  • **Symptom**: indexes add overhead to INSERT, UPDATE, and DELETE operations, because every data change must also update the indexes.

  • **Fix**:

    • Weigh the number of indexes: create only the indexes your queries need; avoid over-indexing.

    • Maintain indexes regularly: for example, rebuild a degraded index to restore performance:

      • Rebuild an index: REINDEX INDEX index_name;

7. Summary

  1. **Run VACUUM and ANALYZE regularly** to keep performance and query planning in good shape: VACUUM reclaims dead space, and ANALYZE refreshes statistics so the planner can generate better execution plans.

  2. **Create indexes deliberately**: index the columns your queries filter and sort on. Indexing predicate columns (such as confirm_first_time) and sort columns (such as data_id) can dramatically improve query performance.

  3. **Analyze execution plans**: use EXPLAIN ANALYZE to find bottlenecks and target your optimizations. Watch each operator's execution time, row counts, and memory use.

  4. **Handle index-creation issues**: watch for privilege problems, table-structure conflicts, and failed builds, and apply the corresponding fixes. Maintain indexes periodically and balance their number against write overhead.

This covers the key steps and techniques for development and optimization in GaussDB (PostgreSQL), along with the problems you may hit when adding indexes and how to solve them; hopefully it is useful in day-to-day work.

