好复杂,俺不中了。。
问题描述
本查询旨在分析2023年各产品的销售情况,包括:
- 每个产品的总销售额、单价、总销量和月均销售额
- 每个产品销量最高的月份及其销量
- 每个产品购买量最高的客户年龄段
解题思路
1. 基础数据准备 (base_data
)
首先从订单、客户和产品表中提取2023年的销售数据,并计算:
- 订单月份 (
month(order_date)
) - 订单总金额 (
quantity * unit_price
) - 客户年龄段分组 (
customer_age_group
)
WITH base_data AS (
SELECT
products.product_id,
MONTH(order_date) AS month,
unit_price,
quantity,
quantity * unit_price AS total_amount,
customer_age,
CASE
WHEN customer_age BETWEEN 1 AND 10 THEN '1-10'
WHEN customer_age BETWEEN 11 AND 20 THEN '11-20'
WHEN customer_age BETWEEN 21 AND 30 THEN '21-30'
WHEN customer_age BETWEEN 31 AND 40 THEN '31-40'
WHEN customer_age BETWEEN 41 AND 50 THEN '41-50'
WHEN customer_age BETWEEN 51 AND 60 THEN '51-60'
ELSE '61+'
END AS customer_age_group
FROM
orders
JOIN customers USING (customer_id)
JOIN products USING (product_id)
WHERE
YEAR(order_date) = 2023
)
2. 计算产品月最高销量 (max_monthly_quantity_data
)
找出每个产品在所有月份中的最高销量:
- 按产品和月份分组计算总销量
- 使用窗口函数
max(sum(quantity))
找出每个产品的月最高销量
max_monthly_quantity_data AS (
SELECT DISTINCT
product_id,
max_monthly_quantity
FROM
(
SELECT
product_id,
MAX(SUM(quantity)) OVER (
PARTITION BY product_id
) AS max_monthly_quantity
FROM
base_data
GROUP BY
product_id,
month
ORDER BY
product_id ASC,
month ASC
) t1
)
3. 识别主要客户群体 (grouped_data
)
找出每个产品购买量最高的客户年龄段:
- 按产品和客户年龄段分组计算总销量
- 使用
ROW_NUMBER()
按销量降序排序(销量相同时按年龄段升序) - 筛选排名第一的记录
grouped_data AS (
SELECT
product_id,
customer_age_group
FROM
(
SELECT
product_id,
customer_age_group,
total_quantity,
ROW_NUMBER() OVER (
PARTITION BY product_id
ORDER BY total_quantity DESC, customer_age_group ASC
) AS sales_rank
FROM
(
SELECT
product_id,
customer_age_group,
SUM(quantity) AS total_quantity
FROM
base_data
GROUP BY
product_id,
customer_age_group
) t2
ORDER BY
product_id,
sales_rank
) t3
WHERE
sales_rank = 1
)
4. 计算基本销售统计 (basic_stat
)
计算每个产品的核心销售指标:
- 总销售额 (
sum(total_amount)
) - 单价 (
unit_price
) - 总销量 (
sum(quantity)
) - 月均销售额 (
sum(total_amount)/12
)
basic_stat AS (
SELECT
product_id,
ROUND(SUM(total_amount), 2) AS total_sales,
ROUND(unit_price, 2) AS unit_price,
SUM(quantity) AS total_quantity,
ROUND(SUM(total_amount) / 12, 2) AS avg_monthly_sales
FROM
base_data
GROUP BY
product_id,
unit_price
)
5. 合并最终结果
将三个中间结果合并,并按总销售额降序、产品ID升序、客户年龄段升序排序:
SELECT
*
FROM
basic_stat
JOIN max_monthly_quantity_data USING (product_id)
JOIN grouped_data USING (product_id)
ORDER BY
total_sales DESC,
product_id ASC,
customer_age_group ASC
技术亮点
- 多维度分析:同时考虑了时间维度(月份)和客户维度(年龄段)
- 窗口函数应用:使用
MAX() OVER()
和ROW_NUMBER()
高效计算极值和排名 - 数据完整性:确保所有计算都基于2023年的销售数据
- 精确分组:正确处理了产品和月份、产品和客户年龄段的交叉分组
结果解读
最终结果将展示:
- 按总销售额排序的所有产品
- 每个产品的核心销售指标
- 该产品销量最高的月份对应的销量
- 该产品最主要的客户年龄段