背景
在开发企业级应用时,我们经常需要处理批量数据的插入和更新操作。传统的逐条处理方式性能低下,而简单的REPLACE INTO
或INSERT ... ON DUPLICATE KEY UPDATE
在某些场景下又不够灵活。本文将介绍一种基于临时表的高效批量插入/更新方案,解决复杂业务场景下的数据同步问题。
场景
这个表需要大量插入和更新数据,频繁的比对浪费时间,且效率不高,要减少数据库连接时间开销。可以采用临时表的方式进行插入更新。假设下表,根据username和age组合判断记录是否存在,存在则更新,不存在则插入。
整体设计逻辑
创建mapper接口
/**
* 批量插入或更新(根据username和age判断)
* @param list 批量数据列表
* @return 影响的行数
*/
int batchInsertOrUpdateByUsernameAndAge(List<BatchTest> list);
创建mapper.xml
<insert id="batchInsertOrUpdateByUsernameAndAge" parameterType="java.util.List">
<!-- 创建临时表存储批量数据 -->
CREATE TEMPORARY TABLE temp_batch_test (
username varchar(50),
age int,
email varchar(100),
status tinyint
) ENGINE=MEMORY;
<!-- 插入数据到临时表 -->
INSERT INTO temp_batch_test (username, age, email, status)
VALUES
<foreach collection="list" item="item" separator=",">
(#{item.username}, #{item.age}, #{item.email}, #{item.status})
</foreach>;
<!-- 更新已存在的记录(匹配username和age) -->
UPDATE batch_test b
JOIN temp_batch_test t ON b.username = t.username AND b.age = t.age
SET
b.email = t.email,
b.status = t.status;
<!-- 插入新记录(不存在的username和age组合) -->
INSERT INTO batch_test (username, age, email, status)
SELECT t.username, t.age, t.email, t.status
FROM temp_batch_test t
LEFT JOIN batch_test b ON t.username = b.username AND t.age = b.age
WHERE b.username IS NULL;
<!-- 删除临时表 -->
DROP TEMPORARY TABLE IF EXISTS temp_batch_test;
</insert>
更新和插入的逻辑
案例数据流程
初始数据(batch_test表)
id | username | age | status | |
---|---|---|---|---|
1 | 张三 | 25 | zhangsan@old.com | 1 |
2 | 李四 | 30 | lisi@old.com | 1 |
批量输入数据(temp_batch_test表)
username | age | status | |
---|---|---|---|
张三 | 25 | zhangsan@new.com | 0 |
李四 | 35 | lisi@new.com | 1 |
王五 | 28 | wangwu@new.com | 1 |
操作结果
更新操作:
匹配记录:张三(25岁)
执行:
UPDATE ... SET email='zhangsan@new.com', status=0
插入操作:
新记录:李四(35岁)、王五(28岁)
执行:
INSERT INTO ... VALUES ('李四',35,...), ('王五',28,...)
最终数据
id | username | age | status | ||
---|---|---|---|---|---|
1 | 张三 | 25 | zhangsan@new.com | 0 | ← 更新 |
2 | 李四 | 30 | lisi@old.com | 1 | |
3 | 李四 | 35 | lisi@new.com | 1 | ← 新增 |
4 | 王五 | 28 | wangwu@new.com | 1 | ← 新增 |
实现逻辑详解
核心逻辑步骤
临时表创建阶段
CREATE TEMPORARY TABLE temp_batch_test ( username varchar(50), age int, email varchar(100), status tinyint ) ENGINE=MEMORY;
使用MEMORY引擎提高临时表操作速度
只包含必要字段,减少内存占用
数据加载阶段
INSERT INTO temp_batch_test VALUES ('张三',25,'zhangsan@new.com',0), ('李四',35,'lisi@new.com',1), ('王五',28,'wangwu@new.com',1);
使用MyBatis的foreach实现动态批插
参数化查询防止SQL注入
更新阶段
UPDATE batch_test b JOIN temp_batch_test t ON b.username = t.username AND b.age = t.age SET b.email = t.email, b.status = t.status;
插入阶段(重点)
这是插入操作的核心技术,通过 LEFT JOIN + IS NULL
实现:
FROM temp_batch_test t
LEFT JOIN batch_test b ON t.username = b.username AND t.age = b.age
WHERE b.username IS NULL
执行过程:
左连接:将临时表(t)与主表(b)按username和age进行连接
过滤:只保留主表中不存在的记录(即b.username为NULL的记录)
内存中的连接结果示例:
t.username | t.age | t.email | t.status | b.username | b.age | b.email | |
---|---|---|---|---|---|---|---|
张三 | 25 | zhangsan@new.com | 0 | 张三 | 25 | ... | 主表存在 |
李四 | 35 | lisi@new.com | 1 | NULL | NULL | NULL | 主表不存在 |
王五 | 28 | wangwu@new.com | 1 | NULL | NULL | NULL | 主表不存在 |
WHERE条件过滤后结果:
t.username | t.age | t.email | t.status |
---|---|---|---|
李四 | 35 | lisi@new.com | 1 |
王五 | 28 | wangwu@new.com | 1 |
执行批量插入
将过滤后的结果插入主表:
INSERT INTO batch_test (username, age, email, status) -- 上一步的查询结果
执行效果等价于:
INSERT INTO batch_test (username, age, email, status) VALUES ('李四', 35, 'lisi@new.com', 1), ('王五', 28, 'wangwu@new.com', 1);
关键技术点解析
反连接(Anti-Join)模式:
通过LEFT JOIN + IS NULL实现"不存在于"的逻辑
比NOT IN或NOT EXISTS性能更好,特别是大数据量时
复合条件判断:
ON t.username = b.username AND t.age = b.age
同时匹配username和age字段
只有当两个字段都相等时才认为是重复记录
NULL安全比较:
如果age可能为NULL,应该使用:ON t.username = b.username AND (t.age = b.age OR (t.age IS NULL AND b.age IS NULL))
批量插入优势:
单次SQL执行所有插入操作
比循环执行单条INSERT效率高10-100倍
减少网络往返和SQL解析开销
清理阶段
- 显式释放临时表资源
- 避免连接池复用时的表冲突
DROP TEMPORARY TABLE temp_batch_test;
实际执行案例
初始主表数据
id | username | age | status | 备注 | |
---|---|---|---|---|---|
1 | 张三 | 25 | zhangsan@old.com | 1 | |
2 | 李四 | 30 | lisi@old.com | 1 |
批量处理数据
username | age | status | 操作说明 | |
---|---|---|---|---|
张三 | 25 | zhangsan@new.com | 0 | 更新操作 |
李四 | 35 | lisi@new.com | 1 | 插入操作 |
王五 | 28 | wangwu@new.com | 1 | 插入操作 |
插入操作执行过程
临时表与主表LEFT JOIN中间结果:
临时表数据 主表匹配结果 张三(25) 匹配id=1的记录 李四(35) 无匹配(NULL) 王五(28) 无匹配(NULL) 过滤后待插入数据:
username age email status 李四 35 lisi@new.com 1 王五 28 wangwu@new.com 1 最终主表数据:
id username age email status 操作说明 1 张三 25 zhangsan@new.com 0 被更新 2 李四 30 lisi@old.com 1 3 李四 35 lisi@new.com 1 新插入 4 王五 28 wangwu@new.com 1 新插入
性能优化建议
索引优化:
ALTER TABLE batch_test ADD INDEX `idx_username_age` (`username`, `age`);
批量大小控制:
建议每批500-1000条记录
过大的批次可能导致内存问题
临时表优化:
CREATE TEMPORARY TABLE ... ( INDEX `idx_temp` (`username`, `age`) ) ENGINE=MEMORY;
服务器参数:
# my.cnf配置 tmp_table_size = 256M max_heap_table_size = 256M
这种插入机制通过巧妙的SQL设计,实现了高效、准确的批量数据插入,是处理数据同步场景的理想解决方案。
必要配置
properties文件
# application.properties
spring.datasource.url=jdbc:mysql://localhost:3306/dbname?allowMultiQueries=true
spring.datasource.hikari.connection-init-sql=SET SESSION sql_mode='NO_ENGINE_SUBSTITUTION'
测试数据:一定要自己动手试试
第一批数据(全部插入,status=1) - 20条
[
{"username": "仇癸霖2", "age": 22, "email": "2244442", "status": 1},
{"username": "靳浩然", "age": 33, "email": "hvbk3d38@vip.qq.com", "status": 1},
{"username": "束雪", "age": 7, "email": "ssxtbf_ios@qq.com", "status": 1},
{"username": "公孙雨", "age": 28, "email": "rain_gs@163.com", "status": 1},
{"username": "欧阳明日", "age": 45, "email": "oymr@hotmail.com", "status": 1},
{"username": "司马青", "age": 19, "email": "smqing@126.com", "status": 1},
{"username": "令狐冲", "age": 32, "email": "linghuchong@gmail.com", "status": 1},
{"username": "东方不败", "age": 40, "email": "dfbb@yeah.net", "status": 1},
{"username": "西门吹雪", "age": 35, "email": "xmcx@sina.com", "status": 1},
{"username": "慕容复", "age": 38, "email": "murongfu@qq.com", "status": 1},
{"username": "赵灵儿", "age": 18, "email": "zle@163.com", "status": 1},
{"username": "李逍遥", "age": 25, "email": "lxy@gmail.com", "status": 1},
{"username": "林月如", "age": 22, "email": "lyr@126.com", "status": 1},
{"username": "景天", "age": 30, "email": "jtian@qq.com", "status": 1},
{"username": "唐雪见", "age": 27, "email": "txj@sina.com", "status": 1},
{"username": "龙葵", "age": 20, "email": "lkui@163.com", "status": 1},
{"username": "紫萱", "age": 300, "email": "zxuan@yeah.net", "status": 1},
{"username": "徐长卿", "age": 35, "email": "xczq@hotmail.com", "status": 1},
{"username": "重楼", "age": 500, "email": "chonglou@gmail.com", "status": 1},
{"username": "花楹", "age": 15, "email": "huaying@qq.com", "status": 1}
]
第二批数据(混合更新和插入,更新status=0/新插入status=1) - 30条
[
// 需要更新的记录(username+age与第一批重复)
{"username": "仇癸霖2", "age": 22, "email": "new_2244442", "status": 0},
{"username": "靳浩然", "age": 33, "email": "new_hvbk3d38@vip.qq.com", "status": 0},
{"username": "束雪", "age": 7, "email": "new_ssxtbf_ios@qq.com", "status": 0},
{"username": "公孙雨", "age": 28, "email": "new_rain_gs@163.com", "status": 0},
{"username": "欧阳明日", "age": 45, "email": "new_oymr@hotmail.com", "status": 0},
// 新插入的记录
{"username": "张无忌", "age": 28, "email": "zwj@mingjiao.org", "status": 1},
{"username": "赵敏", "age": 25, "email": "zhaomin@yuandynasty.com", "status": 1},
{"username": "周芷若", "age": 24, "email": "zzr@emei.org", "status": 1},
{"username": "小昭", "age": 20, "email": "xiaozao@persia.com", "status": 1},
{"username": "殷离", "age": 22, "email": "yinli@butterfly.com", "status": 1},
{"username": "杨逍", "age": 40, "email": "yangxiao@mingjiao.org", "status": 1},
{"username": "范遥", "age": 38, "email": "fanyao@mingjiao.org", "status": 1},
{"username": "黛绮丝", "age": 42, "email": "daiqisi@persia.com", "status": 1},
{"username": "谢逊", "age": 50, "email": "xiexun@lionking.com", "status": 1},
{"username": "殷天正", "age": 60, "email": "yintianzheng@tiandihui.com", "status": 1},
{"username": "韦一笑", "age": 45, "email": "weiyixiao@batman.com", "status": 1},
{"username": "说不得", "age": 48, "email": "shuobude@monk.com", "status": 1},
{"username": "冷谦", "age": 52, "email": "lengqian@cool.com", "status": 1},
{"username": "彭莹玉", "age": 55, "email": "pengyingyu@pearl.com", "status": 1},
{"username": "周颠", "age": 50, "email": "zhoudian@crazy.com", "status": 1},
{"username": "铁冠道人", "age": 58, "email": "tieguandaoren@taoist.com", "status": 1},
{"username": "朱元璋", "age": 35, "email": "zhuyuanzhang@emperor.com", "status": 1},
{"username": "常遇春", "age": 38, "email": "changyuchun@general.com", "status": 1},
{"username": "徐达", "age": 40, "email": "xuda@marshal.com", "status": 1},
{"username": "汤和", "age": 42, "email": "tanghe@general.com", "status": 1},
{"username": "邓愈", "age": 37, "email": "dengyu@general.com", "status": 1},
{"username": "沐英", "age": 30, "email": "muying@general.com", "status": 1},
{"username": "蓝玉", "age": 45, "email": "lanyu@general.com", "status": 1},
{"username": "傅友德", "age": 50, "email": "fuyoude@general.com", "status": 1}
]