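Goal: without changing the front-end API, add fine-grained flow control to the 高仙 (Gausium) OpenAPI proxy path, give read endpoints a cache fallback when they degrade, and make write endpoints fail explicitly on errors without side effects.
I. Overview of the approach
Push the annotation down: move @SentinelResource from the Controller layer into the Service (GsOpenApiServiceImpl), so that each business method maps to exactly one Sentinel resource name and governance lives in one place.
Resource naming convention (used to bind rules):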
gs.listRobots
gs.getRobotStatus
gs.postListRobotMap
gs.listSubareas
gs.listRobotCommands
gs.sendTempTask
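Read/write split:
Read (query) endpoints: on flow control or circuit breaking, fall back to the cache (short TTL).
Write (task dispatch) endpoints: on flow control or circuit breaking, return an explicit failure (no caching, so a task is never sent wrongly or sent twice).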
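The read/write split above can be sketched without Sentinel or Redis at all. The following is a minimal illustration, assuming an in-memory map in place of Redis and a `Supplier` in place of the upstream HTTP call; the class name `DegradePolicySketch` and its methods are hypothetical and not part of the project.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical illustration of the read/write split; no Sentinel or Redis involved.
final class DegradePolicySketch {
    // Stand-in for the Redis cache used by the read endpoints.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Read path: refresh the cache on success, serve the cached copy on failure.
    String read(String key, Supplier<String> upstream) {
        try {
            String body = upstream.get();
            if (body != null) {
                cache.put(key, body);
            }
            return body;
        } catch (RuntimeException e) {
            String cached = cache.get(key);
            return cached != null ? cached : "{\"code\":503,\"msg\":\"upstream error, no cache\"}";
        }
    }

    // Write path: never consult the cache; a failure is reported as a failure.
    String write(Supplier<String> upstream) {
        try {
            return upstream.get();
        } catch (RuntimeException e) {
            return "{\"code\":503,\"msg\":\"upstream error, task not sent\"}";
        }
    }
}
```

The asymmetry is the point: a stale read is usually tolerable for a few minutes, whereas replaying or faking a write is not.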
II. Service changes (core code skeleton)
1) Inject Redis and centralize the key/JSON helpers
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Slf4j
@Service
public class GsOpenApiServiceImpl implements GsOpenApiService {
    private final RestTemplate restTemplate;
    private final GsOpenApiProperties props;
    private final StringRedisTemplate redis;
    private static final ObjectMapper OM = new ObjectMapper();

    // Cache keys, managed in one place (some segments are redacted in this post)
    private static String kRobotList() { return "robot:list"; }
    private static String kRobotStatus(String sn) { return "robot:xxxx:" + sn; }
    private static String kRobotMapList(String sn) { return "robot:map:xxxx:" + sn; }
    private static String kSubareas(String mapId, String sn) { return "robot:map:xxxx:" + mapId + ":" + sn; }
    private static String kRobotCmds(String sn, int p, int s) { return "robot:cmds:" + sn + ":" + p + ":" + s; }

    // JSON helpers: return "{}" / null instead of propagating serialization errors
    private String toJson(Object o) { try { return OM.writeValueAsString(o); } catch (Exception e) { return "{}"; } }
    private <T> T fromJson(String s, TypeReference<T> t) { try { return OM.readValue(s, t); } catch (Exception e) { return null; } }

    // Constructor and token retrieval omitted...
}
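Read endpoints write to the cache whenever they return normally; when flow control or circuit breaking kicks in, or an exception occurs, they fall back to reading the cache.
2) Read endpoint (example: status query): with cache fallback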
@SentinelResource(
    value = "gs.getRobotStatus",
    blockHandler = "getRobotStatusBlockHandler",
    fallback = "getRobotStatusFallback"
)
public String getRobotStatus(String robotSn) {
    // Path segments are redacted in this post
    String api = props.getBaseUrl() + "/openapi/xxxxx/xx/xxxxx/" + xxxx + "/status";
    HttpHeaders headers = new HttpHeaders();
    headers.setBearerAuth(getToken());
    ResponseEntity<String> resp = restTemplate.exchange(api, HttpMethod.GET, new HttpEntity<>(headers), String.class);
    String body = resp.getBody();
    if (body != null) {
        // Write cache with a short TTL (3 minutes here); a cache failure must not fail the request
        try { redis.opsForValue().set(kRobotStatus(robotSn), body, 3, TimeUnit.MINUTES); } catch (Exception ignore) {}
    }
    return body;
}

// Flow-controlled or circuit-broken: read the cache; with no cache, return a 429-style JSON
public String getRobotStatusBlockHandler(String robotSn, BlockException ex) {
    log.warn("[getRobotStatus] blocked: {}, sn={}", ex.getClass().getSimpleName(), robotSn);
    String cached = redis.opsForValue().get(kRobotStatus(robotSn));
    return (cached != null) ? cached : "{\"code\":429,\"msg\":\"flow control / circuit breaking, no cache available\"}";
}

// Method threw an exception: read the cache; with no cache, return a 503-style JSON
public String getRobotStatusFallback(String robotSn, Throwable ex) {
    log.warn("[getRobotStatus] fallback: {}, sn={}", ex.toString(), robotSn);
    String cached = redis.opsForValue().get(kRobotStatus(robotSn));
    return (cached != null) ? cached : "{\"code\":503,\"msg\":\"service error, no cache available\"}";
}
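The other read endpoints (robot list / map list / subareas / command list) follow the same pattern: write the cache on success, read it on failure, with a TTL tuned per endpoint:
listRobots: 5 min
postListRobotMap: 3 min
listSubareas: 10 min
listRobotCommands: 2 min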
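One way to keep these TTLs from being scattered across methods is a single lookup table keyed by resource name. This is only a sketch: the class `CacheTtlSketch` and the method `ttlFor` are hypothetical, and the 3-minute value for `gs.getRobotStatus` is taken from the status example above.

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical helper: one place that maps Sentinel resource names to cache TTLs.
final class CacheTtlSketch {
    private static final Map<String, Duration> TTL = Map.of(
        "gs.listRobots", Duration.ofMinutes(5),
        "gs.getRobotStatus", Duration.ofMinutes(3),
        "gs.postListRobotMap", Duration.ofMinutes(3),
        "gs.listSubareas", Duration.ofMinutes(10),
        "gs.listRobotCommands", Duration.ofMinutes(2));

    // Resources without an entry (including the write endpoint gs.sendTempTask)
    // get Duration.ZERO, which callers should treat as "do not cache".
    static Duration ttlFor(String resource) {
        return TTL.getOrDefault(resource, Duration.ZERO);
    }
}
```

A caller would check `ttlFor(resource).isZero()` before writing to Redis, which also makes it harder to start caching a write endpoint by accident.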
3) Write endpoint (example: site-less temporary task): no cache, explicit failure
@SentinelResource(
    value = "gs.sendTempTask",
    blockHandler = "sendTempTaskBlock",
    fallback = "sendTempTaskFallback"
)
public String sendTempTask(GsTempTaskDto dto) {
    // Path segments are redacted in this post
    String api = props.getBaseUrl() + "/openapi/xxxxx/xxxxxx/xxxxxxx";
    HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.APPLICATION_JSON);
    headers.setBearerAuth(getToken());
    return restTemplate.postForEntity(api, new HttpEntity<>(dto, headers), String.class).getBody();
}

// Flow-controlled or circuit-broken: fail explicitly, with no caching and no side effects
public String sendTempTaskBlock(GsTempTaskDto dto, BlockException ex) {
    log.warn("[sendTempTask] blocked: {}", ex.getClass().getSimpleName());
    return "{\"code\":429,\"msg\":\"flow control / circuit breaking, task not sent\"}";
}

// Method threw an exception: same explicit failure
public String sendTempTaskFallback(GsTempTaskDto dto, Throwable ex) {
    log.warn("[sendTempTask] fallback: {}", ex.toString());
    return "{\"code\":503,\"msg\":\"service error, task not sent\"}";
}
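An explicit failure on the write endpoint lets the front end show a clear message and let the user retry deliberately; it avoids both "looked like it succeeded" and "a retry dispatched the task twice".
III. RestTemplate and unified exception handling
1) Timeouts / connection pool (so that slow calls and errors are detected accurately)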
@Bean
public RestTemplate restTemplate() {
    var http = HttpClients.custom()
        .disableAutomaticRetries()   // no silent retries, so a write is never re-sent behind our back
        .setMaxConnTotal(200)
        .setMaxConnPerRoute(50)
        .build();
    var f = new HttpComponentsClientHttpRequestFactory(http);
    f.setConnectTimeout(2000);            // ms, TCP connect
    f.setReadTimeout(5000);               // ms, waiting for response data
    f.setConnectionRequestTimeout(2000);  // ms, waiting for a pooled connection
    return new RestTemplate(f);
}
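Note: if you want 4xx/5xx responses to throw exceptions directly, do not install a custom ErrorHandler; if you prefer the "200 + error body" style, keep an ErrorHandler that swallows errors and parse the body in business code. This post takes the first route, because "throw an exception, land in the fallback" is easier to reason about.
2) Global exception handling (optional but recommended)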
@Slf4j
@RestControllerAdvice
public class GlobalExceptionHandler {
    @ExceptionHandler(BlockException.class)
    public AjaxResult onBlock(BlockException ex) {
        return AjaxResult.error(429, "Flow control / degrade triggered: " + ex.getClass().getSimpleName());
    }

    @ExceptionHandler(Throwable.class)
    public AjaxResult onAny(Throwable ex, HttpServletRequest req) {
        log.error("Unhandled ex, uri={}", req.getRequestURI(), ex);
        return AjaxResult.error(500, "System busy, please try again later");
    }
}
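If you also have URL-level flow control (not going through the annotation), you can implement a custom UrlBlockHandler that uniformly returns HTTP 429.
IV. Persisting rules in Nacos (Flow/Degrade)
1) Application config (excerpt from ruoyi-robot-dev.yml, redacted)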
spring:
  cloud:
    sentinel:
      eager: true
      transport:
        dashboard: <SENTINEL_DASHBOARD_HOST:PORT> # e.g. 127.0.0.1:8718
      datasource:
        flow:
          nacos:
            serverAddr: <NACOS_ADDR> # e.g. 127.0.0.1:8848
            groupId: DEFAULT_GROUP
            username: <NACOS_USER>
            password: <NACOS_PASS>
            dataId: ruoyi-robot-flow-rules
            dataType: json
            ruleType: flow
        degrade:
          nacos:
            serverAddr: <NACOS_ADDR>
            groupId: DEFAULT_GROUP
            username: <NACOS_USER>
            password: <NACOS_PASS>
            dataId: ruoyi-robot-degrade-rules
            dataType: json
            ruleType: degrade
2) Everyday baseline (Flow rules)
[
{"resource":"gs.listRobots","grade":1,"count":5,"intervalSec":1},
{"resource":"gs.getRobotStatus","grade":1,"count":10,"intervalSec":1},
{"resource":"gs.postListRobotMap","grade":1,"count":6,"intervalSec":1},
{"resource":"gs.listSubareas","grade":1,"count":6,"intervalSec":1},
{"resource":"gs.listRobotCommands","grade":1,"count":8,"intervalSec":1},
{"resource":"gs.sendTempTask","grade":1,"count":2,"intervalSec":1}
]
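grade=1 means the threshold is a QPS value. If a load test is meant to exercise circuit breaking, raise these thresholds first; otherwise flow control always fires before the breaker does. As far as I can tell, intervalSec is a field of gateway flow rules rather than of the plain flow rule, so it is most likely ignored here; the QPS window is one second either way.
3) Everyday baseline (Degrade rules)
Read endpoints use the slow-call ratio strategy and the write endpoint uses the exception ratio strategy (the thresholds are examples only; tune them to your endpoints' actual latency and stability):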
[
{"resource":"gs.getRobotStatus","grade":0,"count":800,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
{"resource":"gs.listRobots","grade":0,"count":1200,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
{"resource":"gs.postListRobotMap","grade":0,"count":1200,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
{"resource":"gs.listSubareas","grade":0,"count":1200,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
{"resource":"gs.listRobotCommands","grade":0,"count":1200,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
{"resource":"gs.sendTempTask","grade":1,"count":0.2,"minRequestAmount":10,"statIntervalMs":10000,"timeWindow":5}
]
Note: the Nacos config must be pure JSON. Do not add // comments, or the rules fail to load (this bit us once).
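To make the rule fields above more concrete, here is a simplified sketch of how the two strategies decide whether to open the breaker within one statistics window. It is not Sentinel's implementation: the class and method names are hypothetical, and the strict "greater than" comparisons reflect my understanding of Sentinel's behavior rather than anything stated in this post.

```java
// Hypothetical, simplified model of the two degrade strategies within one stat window.
final class DegradeRuleSketch {
    // grade=0 (slow-call ratio): "count" is the max allowed RT in ms.
    // A call is slow when its RT exceeds maxRtMs; the breaker opens when the
    // window has at least minRequestAmount calls and the slow ratio exceeds the threshold.
    static boolean tripsOnSlowRatio(long[] rtsMs, long maxRtMs, double slowRatioThreshold, int minRequestAmount) {
        if (rtsMs.length < minRequestAmount) {
            return false;
        }
        long slow = 0;
        for (long rt : rtsMs) {
            if (rt > maxRtMs) {
                slow++;
            }
        }
        return (double) slow / rtsMs.length > slowRatioThreshold;
    }

    // grade=1 (exception ratio): "count" is the ratio threshold, e.g. 0.2.
    static boolean tripsOnErrorRatio(int total, int errors, double ratioThreshold, int minRequestAmount) {
        if (total < minRequestAmount) {
            return false;
        }
        return (double) errors / total > ratioThreshold;
    }
}
```

With the gs.getRobotStatus rule above (800 ms, 0.5, 20 requests), 11 slow calls out of 20 in a 10-second window would open the breaker, while 19 slow calls out of 19 would not, because the minimum request count is not reached.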
V. Integration and self-test checklist (with placeholders)
Go through the gateway (do not call the service directly):
Robot list: GET <GW>/external/gs/xxxxx
Robot status: GET <GW>/external/gs/xxxxx
Map list (form): POST <GW>/external/gs/map/xxxxxx, Content-Type: application/x-www-form-urlencoded, Body: robotSn=<ROBOT_SN>
Subarea query (JSON): POST <GW>/external/gs/map/xxxxx, Body: {"mapId":"<MAP_ID>","robotSn":"<ROBOT_SN>"}
Dispatch a site-less task (JSON): POST <GW>/external/gs/xxxxx, Body: {"robotSn":"<ROBOT_SN>","taskName":"<TASK>","mapId":"<MAP_ID>","subareaId":"<SUB_ID>","loop":false,"times":1}
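If a concurrency or load test is meant to verify circuit breaking rather than flow control:
First raise the Flow (QPS) thresholds;
Then use a very small Degrade RT threshold, or temporarily add a sleep(...) inside the method;
Check that the logs show DegradeException (circuit breaking) rather than FlowException (flow control).
Expected logs:
Flow control hit: FlowException
Circuit breaker hit: DegradeException
When a read endpoint is hit, the response should be served from the cache fallback; when a write endpoint is hit, the response is the explicit failure JSON.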
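When running such a load test, it can help to tally the response bodies on the client side. The sketch below is a hypothetical helper (the class `LoadTestTally` and its bucket names are not from the project), and it assumes the failure JSON keeps the `"code":429` / `"code":503` shape used in the handlers above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical client-side helper: bucket response bodies collected during a load test.
final class LoadTestTally {
    static Map<String, Integer> tally(List<String> bodies) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String body : bodies) {
            String bucket;
            if (body == null) {
                bucket = "empty";
            } else if (body.contains("\"code\":429")) {
                bucket = "blocked_no_cache";   // blockHandler ran and found no cached value
            } else if (body.contains("\"code\":503")) {
                bucket = "error_no_cache";     // fallback ran and found no cached value
            } else {
                bucket = "live_or_cached";     // indistinguishable from the client side
            }
            counts.merge(bucket, 1, Integer::sum);
        }
        return counts;
    }
}
```

Note the last bucket: a cached fallback looks exactly like a live response to the caller, which is why the server logs remain the authoritative signal for FlowException versus DegradeException.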
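VI. Common pitfalls and fixes
Nacos JSON must not contain comments, or the rules fail to load.
POST form vs JSON: robotMap must be sent as application/x-www-form-urlencoded; the subarea query is JSON.
@SentinelResource handler signatures must match the original method (same parameters and return type), with a trailing BlockException parameter for the blockHandler or Throwable for the fallback.
Resource name alignment: the resource field of the rules published under each dataId must equal @SentinelResource.value.
Gateway and in-service rules enabled at the same time: whichever reaches its threshold first takes effect. To test the in-service circuit breaker, temporarily relax the gateway flow control or make the in-service thresholds more sensitive.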
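The resource-name pitfall is easy to catch mechanically before publishing rules. The following is a rough sketch, assuming the rule JSON has the flat shape shown in this post; the class `RuleResourceCheck` is hypothetical, and the regex is a shortcut for illustration rather than a real JSON parser.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-publish check: every rule must target a declared @SentinelResource value.
final class RuleResourceCheck {
    private static final Pattern RESOURCE = Pattern.compile("\"resource\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the resource names found in the rule JSON that are not in the declared set.
    static Set<String> unknownResources(String rulesJson, Set<String> declared) {
        Set<String> unknown = new LinkedHashSet<>();
        Matcher m = RESOURCE.matcher(rulesJson);
        while (m.find()) {
            String name = m.group(1);
            if (!declared.contains(name)) {
                unknown.add(name);
            }
        }
        return unknown;
    }
}
```

A rule whose resource name has a typo is loaded without complaint and simply never matches, so a check like this in CI or a release script is cheap insurance.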
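VII. Definition of Done (DoD)
FlowException and DegradeException show up reliably in the logs during load testing;
Read endpoints fall back to the cache under flow control or circuit breaking; with no cache they return a clear 429/503 JSON;
Write endpoints return a uniform failure JSON under flow control, circuit breaking, or exceptions, with no side effects;
Rule changes in Nacos take effect without a restart, and are restored to the everyday baseline after testing.
Appendix: redacted placeholders
<GW>: your gateway address (e.g. http://localhost:8080)
<NACOS_ADDR>: Nacos address (e.g. 127.0.0.1:8848)
<SENTINEL_DASHBOARD_HOST:PORT>: Sentinel dashboard (e.g. 127.0.0.1:8718)
<NACOS_USER> / <NACOS_PASS>: Nacos credentials
<ROBOT_SN>: robot serial number (e.g. GS***-****)
<MAP_ID> / <SUB_ID>: map / subarea ID
<TASK>: task name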