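Smart Campus Robots: In-Service Rate Limiting and Circuit-Breaker Degradation in Practice (Sentinel + Redis Fallback)

Goal: without changing the frontend-facing API, add fine-grained rate limiting to the 高仙 (Gausium) OpenAPI proxy path, give read endpoints a cache-backed degradation fallback, and make write endpoints fail explicitly, with no side effects, when something goes wrong.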

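I. Overview of the Approach

Push the annotation down: move @SentinelResource from the Controller layer into the Service (GsOpenApiServiceImpl), so that each business method maps to exactly one Sentinel resource name and all rules are governed in one place.
Resource naming convention (used to bind rules)
- gs.listRobots
- gs.getRobotStatus
- gs.postListRobotMap
- gs.listSubareas
- gs.listRobotCommands
- gs.sendTempTask
Separate handling for reads and writes
- Read (query) endpoints: rate limited or circuit broken → fall back to the cache (short TTL).
- Write (task dispatch) endpoints: rate limited or circuit broken → return an explicit failure (no caching, to avoid sending a wrong or duplicate task).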

II. Service Changes (Core Code Skeleton)

1) Inject Redis and centralize the key/JSON helpers

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Slf4j
@Service
public class GsOpenApiServiceImpl implements GsOpenApiService {

    private final RestTemplate restTemplate;
    private final GsOpenApiProperties props;
    private final StringRedisTemplate redis;

    private static final ObjectMapper OM = new ObjectMapper();

    // Cache keys, managed in one place ("xxxx" segments are redacted)
    private static String kRobotList() { return "robot:list"; }
    private static String kRobotStatus(String sn){ return "robot:xxxx:" + sn; }
    private static String kRobotMapList(String sn){ return "robot:map:xxxx:" + sn; }
    private static String kSubareas(String mapId, String sn){ return "robot:map:xxxx:" + mapId + ":" + sn; }
    private static String kRobotCmds(String sn, int p, int s){ return "robot:cmds:" + sn + ":" + p + ":" + s; }

    // JSON helpers: serialization errors are swallowed and mapped to "{}" / null
    private String toJson(Object o){ try { return OM.writeValueAsString(o); } catch (Exception e){ return "{}"; } }
    private <T> T fromJson(String s, TypeReference<T> t){ try { return OM.readValue(s, t); } catch (Exception e){ return null; } }

    // Constructor and token retrieval omitted...
}

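Read endpoints write to the cache on a normal return; when rate limiting or circuit breaking kicks in, or the call throws, they fall back to the cached value.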
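The write-on-success, read-on-failure pattern can be tried out without Spring, Sentinel, or Redis. The following is a minimal sketch under stated assumptions: a `ConcurrentHashMap` with expiry timestamps stands in for Redis, and the names `TtlCache` and `CachedRead`, the injected clock, and the `Supplier` upstream call are all hypothetical, not taken from the project.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** In-memory stand-in for Redis (assumption): values expire after a per-entry TTL. */
class TtlCache {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;
        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> store = new ConcurrentHashMap<>();
    private final LongSupplier clockMillis; // injected so expiry can be tested

    TtlCache(LongSupplier clockMillis) { this.clockMillis = clockMillis; }

    void set(String key, String value, Duration ttl) {
        store.put(key, new Entry(value, clockMillis.getAsLong() + ttl.toMillis()));
    }

    /** Returns the cached value, or null if absent or expired. */
    String get(String key) {
        Entry e = store.get(key);
        if (e == null) return null;
        if (clockMillis.getAsLong() >= e.expiresAtMillis) {
            store.remove(key, e);
            return null;
        }
        return e.value;
    }
}

class CachedRead {
    /**
     * Calls upstream and caches a non-null result. On any runtime failure,
     * serves the cached copy if one is still live, otherwise the error JSON.
     */
    static String read(TtlCache cache, String key, Duration ttl,
                       Supplier<String> upstream, String errorJson) {
        try {
            String body = upstream.get();
            if (body != null) cache.set(key, body, ttl);
            return body;
        } catch (RuntimeException ex) {
            String cached = cache.get(key);
            return cached != null ? cached : errorJson;
        }
    }
}
```

In the real service the try/catch is replaced by the blockHandler and fallback methods of @SentinelResource; the sketch only shows the data flow between the upstream call and the cache.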

2) Read endpoint (example: status query), with cache fallback

@SentinelResource(
    value = "gs.getRobotStatus",
    blockHandler = "getRobotStatusBlockHandler",
    fallback = "getRobotStatusFallback"
)
public String getRobotStatus(String robotSn) {
    // Path segments and the "xxxx" variable are redacted in the original
    String api = props.getBaseUrl() + "/openapi/xxxxx/xx/xxxxx/" + xxxx + "/status";
    HttpHeaders headers = new HttpHeaders();
    headers.setBearerAuth(getToken());
    ResponseEntity<String> resp = restTemplate.exchange(api, HttpMethod.GET, new HttpEntity<>(headers), String.class);

    String body = resp.getBody();
    if (body != null) {
        // Write the cache with a short TTL (3 minutes here); a Redis failure must not break the read
        try { redis.opsForValue().set(kRobotStatus(robotSn), body, 3, TimeUnit.MINUTES); } catch (Exception ignore) {}
    }
    return body;
}

// Rate limited or circuit broken → read the cache; with no cache, return JSON with 429 semantics
public String getRobotStatusBlockHandler(String robotSn, BlockException ex) {
    log.warn("[getRobotStatus] blocked: {}, sn={}", ex.getClass().getSimpleName(), robotSn);
    String cached = redis.opsForValue().get(kRobotStatus(robotSn));
    return (cached != null) ? cached : "{\"code\":429,\"msg\":\"rate limited or circuit broken, no cache available\"}";
}

// Method threw → read the cache; with no cache, return JSON with 503 semantics
public String getRobotStatusFallback(String robotSn, Throwable ex) {
    log.warn("[getRobotStatus] fallback: {}, sn={}", ex.toString(), robotSn);
    String cached = redis.opsForValue().get(kRobotStatus(robotSn));
    return (cached != null) ? cached : "{\"code\":503,\"msg\":\"service error, no cache available\"}";
}

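The other read endpoints (robot list / map list / subareas / command list) follow the same pattern: write the cache on success, read it on failure. Only the TTL differs:

listRobots: 5 min
postListRobotMap: 3 min
listSubareas: 10 min
listRobotCommands: 2 min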
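One way to keep these TTLs from drifting across methods is a single lookup table keyed by Sentinel resource name. This is a minimal sketch, not the project's code: the class `ReadCacheTtl` and the method `ttlFor` are hypothetical names, and the 3-minute value for gs.getRobotStatus comes from the status example above.

```java
import java.time.Duration;
import java.util.Map;

class ReadCacheTtl {
    // TTLs as listed in the article, keyed by Sentinel resource name
    static final Map<String, Duration> TTL = Map.of(
        "gs.listRobots", Duration.ofMinutes(5),
        "gs.getRobotStatus", Duration.ofMinutes(3),
        "gs.postListRobotMap", Duration.ofMinutes(3),
        "gs.listSubareas", Duration.ofMinutes(10),
        "gs.listRobotCommands", Duration.ofMinutes(2));

    /** TTL for a read resource; throws for anything else, e.g. write resources, which are never cached. */
    static Duration ttlFor(String resource) {
        Duration d = TTL.get(resource);
        if (d == null) {
            throw new IllegalArgumentException("no cache TTL configured for " + resource);
        }
        return d;
    }
}
```

Leaving gs.sendTempTask out of the table on purpose means an accidental attempt to cache a write fails loudly instead of silently succeeding.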

3) Write endpoint (example: temporary task without a site), no caching, explicit failure

@SentinelResource(
    value = "gs.sendTempTask",
    blockHandler = "sendTempTaskBlock",
    fallback = "sendTempTaskFallback"
)
public String sendTempTask(GsTempTaskDto dto) {
    String api = props.getBaseUrl() + "/openapi/xxxxx/xxxxxx/xxxxxxx";
    HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.APPLICATION_JSON);
    headers.setBearerAuth(getToken());
    return restTemplate.postForEntity(api, new HttpEntity<>(dto, headers), String.class).getBody();
}

// Rate limited or circuit broken: fail explicitly, with no caching and no side effects
public String sendTempTaskBlock(GsTempTaskDto dto, BlockException ex) {
    log.warn("[sendTempTask] blocked: {}", ex.getClass().getSimpleName());
    return "{\"code\":429,\"msg\":\"rate limited or circuit broken, task not dispatched\"}";
}

// Method threw: fail explicitly, same as above
public String sendTempTaskFallback(GsTempTaskDto dto, Throwable ex) {
    log.warn("[sendTempTask] fallback: {}", ex.toString());
    return "{\"code\":503,\"msg\":\"service error, task not dispatched\"}";
}

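An explicit failure on write endpoints lets the frontend show a clear message and let the user retry, and it avoids both "assumed it succeeded" and "a retry caused a duplicate dispatch". One caveat: if the exception is a read timeout, the upstream may already have received the request, so a retry after a 503 is only safe if the upstream call is idempotent.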
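The handlers above build their failure JSON by hand, which works for fixed messages but breaks as soon as a message contains a quote or a newline. A small builder keeps the failure shape uniform. This is a sketch under assumptions: `FailJson` is a hypothetical helper, and the {"code", "msg"} shape is copied from the handlers in this article.

```java
class FailJson {
    /** Builds {"code":<code>,"msg":"<msg>"} with minimal JSON string escaping. */
    static String of(int code, String msg) {
        StringBuilder sb = new StringBuilder("{\"code\":").append(code).append(",\"msg\":\"");
        for (int i = 0; i < msg.length(); i++) {
            char c = msg.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become 4-digit hex escapes
                        sb.append('\\').append('u').append(String.format("%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append("\"}").toString();
    }
}
```

With it, the block handler could return `FailJson.of(429, "rate limited or circuit broken, task not dispatched")`. In a project that already has Jackson on the classpath, serializing a small result object would do the same job.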

III. RestTemplate and Unified Exception Handling

1) Timeouts and connection pool (so that slow calls and errors are detected accurately)

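@Bean
public RestTemplate restTemplate() {
    var http = HttpClients.custom()
        .disableAutomaticRetries()      // no silent retries: failures surface, writes are not replayed
        .setMaxConnTotal(200)           // pool size across all routes
        .setMaxConnPerRoute(50)         // pool size per target host
        .build();
    var f = new HttpComponentsClientHttpRequestFactory(http);
    f.setConnectTimeout(2000);              // ms to establish the connection
    f.setReadTimeout(5000);                 // ms to wait for response data
    f.setConnectionRequestTimeout(2000);    // ms to wait for a pooled connection
    return new RestTemplate(f);
}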

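Note: if you want 4xx/5xx responses to throw exceptions, do not register a custom ErrorHandler; RestTemplate's default handler already throws. If you prefer the "200 + error body" style, keep an ErrorHandler that swallows errors and parse the body in business code. This article uses the throwing style, because exception → fallback is clearer.

2) Global exception handling (optional, but recommended)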

@Slf4j
@RestControllerAdvice
public class GlobalExceptionHandler {
  @ExceptionHandler(BlockException.class)
  public AjaxResult onBlock(BlockException ex){
    return AjaxResult.error(429, "Flow control or degradation triggered: " + ex.getClass().getSimpleName());
  }
  @ExceptionHandler(Throwable.class)
  public AjaxResult onAny(Throwable ex, HttpServletRequest req){
    log.error("Unhandled ex, uri={}", req.getRequestURI(), ex);
    return AjaxResult.error(500, "System busy, please try again later");
  }
}

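If you also have URL-level rate limiting (not annotation-based), you can define a custom UrlBlockHandler that returns HTTP 429 uniformly (newer Sentinel web adapters name this interface BlockExceptionHandler).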

IV. Persisting Rules in Nacos (Flow/Degrade)

1) Application config (`ruoyi-robot-dev.yml` excerpt, redacted)

spring:
  cloud:
    sentinel:
      eager: true
      transport:
        dashboard: <SENTINEL_DASHBOARD_HOST:PORT>   # e.g. 127.0.0.1:8718
      datasource:
        flow:
          nacos:
            serverAddr: <NACOS_ADDR>               # e.g. 127.0.0.1:8848
            groupId: DEFAULT_GROUP
            username: <NACOS_USER>
            password: <NACOS_PASS>
            dataId: ruoyi-robot-flow-rules
            dataType: json
            ruleType: flow
        degrade:
          nacos:
            serverAddr: <NACOS_ADDR>
            groupId: DEFAULT_GROUP
            username: <NACOS_USER>
            password: <NACOS_PASS>
            dataId: ruoyi-robot-degrade-rules
            dataType: json
            ruleType: degrade

2) Day-to-day baseline (Flow rate limiting), as JSON

[
  {"resource":"gs.listRobots","grade":1,"count":5,"intervalSec":1},
  {"resource":"gs.getRobotStatus","grade":1,"count":10,"intervalSec":1},
  {"resource":"gs.postListRobotMap","grade":1,"count":6,"intervalSec":1},
  {"resource":"gs.listSubareas","grade":1,"count":6,"intervalSec":1},
  {"resource":"gs.listRobotCommands","grade":1,"count":8,"intervalSec":1},
  {"resource":"gs.sendTempTask","grade":1,"count":2,"intervalSec":1}
]

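grade=1 means the threshold is on QPS (grade=0 would limit concurrent threads instead). When load testing to observe circuit breaking, raise these thresholds first so that rate limiting does not always trigger before the breaker.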
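To make "count per second" concrete, here is a deliberately simplified fixed-window limiter. It is not Sentinel's implementation, which uses a sliding window, and the class name `QpsLimiter` and the injected millisecond clock are assumptions made for illustration and testability.

```java
import java.util.function.LongSupplier;

/** Simplified fixed-window QPS limiter: at most `limit` permits per wall-clock second. */
class QpsLimiter {
    private final int limit;
    private final LongSupplier clockMillis; // injected clock (assumption, for testability)
    private long currentSecond = -1;
    private int used;

    QpsLimiter(int limit, LongSupplier clockMillis) {
        this.limit = limit;
        this.clockMillis = clockMillis;
    }

    /** Returns true if the call may proceed, false if it would exceed the per-second limit. */
    synchronized boolean tryAcquire() {
        long second = clockMillis.getAsLong() / 1000;
        if (second != currentSecond) { // a new one-second window starts: reset the counter
            currentSecond = second;
            used = 0;
        }
        if (used >= limit) return false;
        used++;
        return true;
    }
}
```

With `limit = 2`, matching the gs.sendTempTask rule above, the third call inside the same second is rejected and the first call of the next second passes again. In the real service a rejected call surfaces as a FlowException routed to the blockHandler.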

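3) Day-to-day baseline (Degrade circuit breaking)

Read endpoints use the slow-call ratio strategy (grade=0, where count is the maximum RT in milliseconds); the write endpoint uses the exception ratio strategy (grade=1, where count is the ratio threshold). The thresholds are examples only; tune them to your endpoints' latency and stability. As JSON: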

[
  {"resource":"gs.getRobotStatus","grade":0,"count":800,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
  {"resource":"gs.listRobots","grade":0,"count":1200,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
  {"resource":"gs.postListRobotMap","grade":0,"count":1200,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
  {"resource":"gs.listSubareas","grade":0,"count":1200,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
  {"resource":"gs.listRobotCommands","grade":0,"count":1200,"slowRatioThreshold":0.5,"minRequestAmount":20,"statIntervalMs":10000,"timeWindow":10},
  {"resource":"gs.sendTempTask","grade":1,"count":0.2,"minRequestAmount":10,"statIntervalMs":10000,"timeWindow":5}
]

Note: the Nacos config must be pure JSON. Do not write // comments in it, or the rules fail to load (I hit this once).
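How the slow-call ratio fields interact is easier to see in code. The sketch below only models the decision to open the breaker for one statistics window; it leaves out the recovery phase after timeWindow and is not Sentinel's implementation. `SlowRatioRule` and `shouldOpen` are hypothetical names; the field meanings follow the rule JSON above.

```java
/** Simplified model of a grade=0 (slow-call ratio) degrade rule for a single statistics window. */
class SlowRatioRule {
    final long maxRtMs;               // "count" in the rule JSON: calls slower than this are "slow"
    final double slowRatioThreshold;  // open when the slow ratio exceeds this value
    final int minRequestAmount;       // below this many calls in the window, never open

    SlowRatioRule(long maxRtMs, double slowRatioThreshold, int minRequestAmount) {
        this.maxRtMs = maxRtMs;
        this.slowRatioThreshold = slowRatioThreshold;
        this.minRequestAmount = minRequestAmount;
    }

    /** Decides from the response times observed in one window whether the breaker should open. */
    boolean shouldOpen(long[] rtsMs) {
        if (rtsMs.length < minRequestAmount) return false; // too few samples to judge
        int slow = 0;
        for (long rt : rtsMs) {
            if (rt > maxRtMs) slow++;
        }
        return (double) slow / rtsMs.length > slowRatioThreshold;
    }
}
```

For the gs.getRobotStatus rule (count=800, slowRatioThreshold=0.5, minRequestAmount=20), 19 slow calls in a window do not open the breaker because the sample is too small, while 11 slow calls out of 20 do.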

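V. Integration and Self-Test Notes (with placeholders)

Go through the gateway (do not call the service directly):
- Robot list: GET <GW>/external/gs/xxxxx
- Robot status: GET <GW>/external/gs/xxxxx
- Map list (form): POST <GW>/external/gs/map/xxxxxx, Content-Type: application/x-www-form-urlencoded, Body: robotSn=<ROBOT_SN>
- Subarea query (JSON): POST <GW>/external/gs/map/xxxxx, Body: {"mapId":"<MAP_ID>","robotSn":"<ROBOT_SN>"}
- Dispatch a temporary task without a site (JSON): POST <GW>/external/gs/xxxxx, Body: {"robotSn":"<ROBOT_SN>","taskName":"<TASK>","mapId":"<MAP_ID>","subareaId":"<SUB_ID>","loop":false,"times":1}
When running concurrency or load tests, if you want to verify circuit breaking rather than rate limiting:
1. Raise the Flow (QPS) thresholds first;
2. Use a very small Degrade RT threshold, or temporarily add a sleep(...) inside the method;
3. Check that the logs show DegradeException (circuit breaking) rather than FlowException (rate limiting).

Expected in the logs:
- Rate limit hit: FlowException
- Circuit breaker hit: DegradeException
- When a read endpoint is blocked, the response is served from the cache fallback; when a write endpoint is blocked, the explicit failure JSON is returned.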

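VI. Common Pitfalls and Fixes

Nacos JSON must not contain comments, or the rules fail to load.
POST form vs JSON: robotMap must use application/x-www-form-urlencoded; the subarea query is JSON.
@SentinelResource handlers must match the original method's return type and parameter list, with a trailing BlockException for the blockHandler and a trailing Throwable for the fallback.
Resource name alignment: the resource in the rules published under the dataId must equal @SentinelResource.value.
Gateway and in-service limits enabled together: whichever reaches its threshold first takes effect. To test in-service circuit breaking, temporarily relax the gateway limit or make the in-service thresholds more sensitive.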
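The resource name pitfall is easy to miss because a rule whose resource matches nothing simply never fires. A small check that could run in a unit test is sketched below; `ResourceNameCheck` and `unmatched` are hypothetical names, and how the two sets get populated (from annotations and from the rule JSON) is left to the caller.

```java
import java.util.Set;
import java.util.TreeSet;

class ResourceNameCheck {
    /** Returns rule resources that match no declared @SentinelResource value, in sorted order. */
    static Set<String> unmatched(Set<String> declared, Set<String> ruleResources) {
        Set<String> diff = new TreeSet<>(ruleResources);
        diff.removeAll(declared);
        return diff;
    }
}
```

An empty result means every rule is bound to a real resource; anything else is usually a typo or a renamed method.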

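VII. Definition of Done (DoD)

During load testing, FlowException and DegradeException appear reliably in the logs;
Read endpoints fall back to the cache when rate limited or circuit broken, and return clear 429/503 JSON when there is no cache;
Write endpoints return a uniform failure JSON when rate limited, circuit broken, or failing, with no side effects;
Rule changes in Nacos take effect without a restart, and the day-to-day baseline is restored after testing.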

Appendix: Redaction Placeholders

<GW>: your gateway address (e.g. http://localhost:8080)
<NACOS_ADDR>: Nacos address (e.g. 127.0.0.1:8848)
<SENTINEL_DASHBOARD_HOST:PORT>: Sentinel dashboard (e.g. 127.0.0.1:8718)
<NACOS_USER> / <NACOS_PASS>: Nacos credentials
<ROBOT_SN>: robot serial number (e.g. GS***-****)
<MAP_ID> / <SUB_ID>: map / subarea ID
<TASK>: task name
