以下文章来源于数有道 ,作者数据星爷
SQL查询是数据分析工作的基础,也是CDA数据分析师一级的核心考点,人工智能时代,AI能为我们节省多少工作量?本来想用deepseek部署,被卡出三界外,不在五行中,后来选择了通义灵码。
环境
AI环境:VScode+通义灵码
练习网站:https://sqlfiddle.com/postgresql/online-compiler
数据库:数据库是一个有组织的数据集合,通常以电子表格的形式存储。它由多个表组成,每个表代表一种特定类型的数据集合。
表:表是数据库中数据的基本存储单位,类似于Excel中的工作表。表由行(记录)和列(字段)构成。每一行代表一条记录,每一列代表一个属性。
例如,在一个电商数据库中,可能会有以下几张表:
SQL(Structured Query Language,结构化查询语言)是一种用于管理和操作关系型数据库的标准语言。它的基本结构包括以下几个部分:
假设我们有一个名为 users 的表,包含以下字段:
我们可以编写一个简单的SQL查询来获取所有用户的姓名和邮箱:
SELECT name, email FROM users;
如果我们只想获取名字中包含 "张" 的用户,可以使用 WHERE 子句:
SELECT name, email
FROM users
WHERE name LIKE '%张%';
通过理解数据库和表的概念以及SQL语言的基本结构,你可以开始构建简单的查询语句来检索和操作数据。这是学习SQL的基础,后续我们将在此基础上深入学习更复杂的查询和操作。
数据检索是SQL中最常用的操作之一,它允许你从数据库中提取所需的数据。我们将详细讲解如何使用SELECT语句及其相关子句来实现各种数据检索操作。
首先,我们创建一个包含10条员工数据的临时表 employees。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
示例 要获取所有员工的姓名和部门:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT name, department
FROM employees;
结果:
示例 要获取工资大于5000的员工信息:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT name, salary
FROM employees
WHERE salary > 5000;
结果
示例 要按工资从高到低排序员工信息:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT name, salary
FROM employees
ORDER BY salary DESC;
结果
示例 要获取前5名最高工资的员工信息,并按工资降序排序:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT name, salary
FROM employees
ORDER BY salary DESC
LIMIT 5;
结果
示例 要获取属于 "销售部" 的前3名最高工资的员工信息,并按工资降序排序:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT name, salary
FROM employees
WHERE department = '销售部'
ORDER BY salary DESC
LIMIT 3;
结果
通过上述示例,你可以看到如何使用 WITH 子句创建一个临时表,并在此基础上进行各种数据检索操作。这些示例涵盖了 SELECT、WHERE、ORDER BY 和 LIMIT 的基本用法,帮助你更好地理解和验证查询结果。
使用COUNT(), SUM(), AVG(), MAX(), MIN()等聚合函数
使用GROUP BY对数据进行分组
首先,我们创建一个包含10条员工数据的临时表 employees。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
聚合函数用于对一组值执行计算并返回单个值。常用的聚合函数包括:
示例- count 计算总员工数:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT COUNT(*) AS total_employees
FROM employees;
结果:
total_employees
10
示例-sum 计算所有员工的总薪水:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT SUM(salary) AS total_salary
FROM employees;
结果:
total_salary
59000
示例-avg 计算所有员工的平均薪水:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT AVG(salary) AS average_salary
FROM employees;
结果:
average_salary
5900.00
示例-max 返回最高薪水:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT MAX(salary) AS max_salary
FROM employees;
结果:
max_salary
8000
示例-min 返回最低薪水:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT MIN(salary) AS min_salary
FROM employees;
结果:
min_salary
4000
GROUP BY 子句用于将结果集按一个或多个列进行分组,通常与聚合函数一起使用。
示例-count 按部门分组,计算每个部门的员工数:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
结果:
示例-sum 按部门分组,计算每个部门的总薪水:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT department, SUM(salary) AS total_salary
FROM employees
GROUP BY department;
结果:
示例-avg 按部门分组,计算每个部门的平均薪水:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT department, AVG(salary) AS average_salary
FROM employees
GROUP BY department;
结果:
示例-max-min 按部门分组,返回每个部门的最高和最低薪水:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, '销售部' AS department, 5000 AS salary UNION ALL
SELECT 2, '李四', '市场部', 6000 UNION ALL
SELECT 3, '王五', '销售部', 5500 UNION ALL
SELECT 4, '赵六', '技术部', 7000 UNION ALL
SELECT 5, '孙七', '市场部', 6500 UNION ALL
SELECT 6, '周八', '技术部', 7500 UNION ALL
SELECT 7, '吴九', '销售部', 4500 UNION ALL
SELECT 8, '郑十', '市场部', 5000 UNION ALL
SELECT 9, '钱十一', '技术部', 8000 UNION ALL
SELECT 10, '王十二', '销售部', 4000
)
SELECT department, MAX(salary) AS max_salary, MIN(salary) AS min_salary
FROM employees
GROUP BY department;
结果:
通过上述示例,你可以看到如何使用聚合函数和 GROUP BY 子句来对数据进行分组和计算。这些示例涵盖了 COUNT()、SUM()、AVG()、MAX() 和 MIN() 的基本用法,并结合 GROUP BY 进行分组操作。
我们将详细讲解SQL中的多表操作,包括内连接(INNER JOIN)、左连接(LEFT JOIN)、右连接(RIGHT JOIN)和全外连接(FULL OUTER JOIN)。我们将使用 WITH 子句创建两个临时表 employees 和 departments,然后进行各种多表查询。
首先,我们创建两个临时表 employees 和 departments。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部'
)
内连接返回两个表中满足连接条件的所有行。
示例 要获取每个员工及其所属部门的名称:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部'
)
SELECT e.name, d.department_name, e.salary
FROM employees e
INNER JOIN departments d ON e.department_id = d.id;
结果
左连接返回左表中的所有行,以及右表中满足连接条件的行。如果右表中没有匹配的行,则结果为 NULL。
示例 要获取所有员工及其所属部门的名称,即使某些员工没有部门:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000 UNION ALL
SELECT 11, '无部门员工', NULL, 4000 -- 添加一个没有部门的员工
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部'
)
SELECT e.name, d.department_name, e.salary
FROM employees e
LEFT JOIN departments d ON e.department_id = d.id;
结果
右连接返回右表中的所有行,以及左表中满足连接条件的行。如果左表中没有匹配的行,则结果为 NULL。
示例 要获取所有部门及其所属员工的名称,即使某些部门没有员工:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部' UNION ALL
SELECT 4, '人力资源部' -- 添加一个没有员工的部门
)
SELECT e.name, d.department_name, e.salary
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.id;
结果
全外连接返回两个表中的所有行,如果某个表中没有匹配的行,则结果为 NULL。需要注意的是,并非所有数据库系统都支持 FULL OUTER JOIN,例如 MySQL 不支持,但可以通过 UNION 实现类似效果。
示例 要获取所有员工及其所属部门的名称,以及所有部门及其所属员工的名称,即使某些员工或部门没有匹配:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部' UNION ALL
SELECT 4, '人力资源部' -- 添加一个没有员工的部门
)
SELECT e.name, d.department_name, e.salary
FROM employees e
FULL OUTER JOIN departments d ON e.department_id = d.id;
结果
通过上述示例,你可以看到如何使用内连接、左连接、右连接和全外连接来操作多个表。这些示例涵盖了 INNER JOIN、LEFT JOIN、RIGHT JOIN 和 FULL OUTER JOIN 的基本用法,并结合 WITH 子句创建临时表进行验证。
我们将详细讲解SQL中的高级特性,包括公用表表达式(CTE)、子查询和窗口函数。我们将使用 WITH 子句创建临时表 employees 和 departments,然后进行各种高级查询。
首先,我们创建两个临时表 employees 和 departments。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部'
)
公用表表达式(CTE)是一个临时结果集,可以在查询中多次引用。CTE 使用 WITH 子句定义。
示例 要计算每个部门的平均薪水,并使用 CTE 来简化查询:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部'
),
department_avg_salary AS (--每个部门平均薪水
SELECT department_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
)
SELECT d.department_name, das.avg_salary
FROM departments d
JOIN department_avg_salary das ON d.id = das.department_id;
结果
子查询是嵌套在另一个查询中的查询。子查询可以出现在 SELECT、FROM、WHERE 子句中。
示例 要获取薪水高于平均薪水的员工:
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
结果:
窗口函数对一组行进行计算,并返回每个行的计算结果。窗口函数通常与 OVER 子句一起使用。
ROW_NUMBER(),RANK() 和 DENSE_RANK()
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 4500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT
name, department_id, salary,
ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY salary DESC) AS row_number,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank_salary,
DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS dense_rank_salary
FROM employees
结果:
NTILE(2)
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 4500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT
name, department_id, salary,
NTILE(2) OVER (PARTITION BY department_id ORDER BY salary DESC) AS ntile_2
FROM
employees;
结果:
LAG() 和 LEAD()
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 4500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT
name, department_id, salary,
LAG(salary, 1, 0) OVER (PARTITION BY department_id ORDER BY salary) AS lag_salary,
LEAD(salary, 1, 0) OVER (PARTITION BY department_id ORDER BY salary) AS lead_salary
FROM employees;
结果:
FIRST_VALUE() 和 LAST_VALUE()
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 4500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT
name, department_id, salary,
FIRST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY salary) AS first_value_salary,
LAST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY salary) AS last_value_salary
FROM employees;
结果
????注意: LAST_VALUE() 默认情况下会在窗口内逐行计算,如果需要在整个分区计算,可以使用 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 4500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT
name, department_id, salary,
FIRST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY salary) AS first_value_salary,
LAST_VALUE(salary) OVER (PARTITION BY department_id ORDER BY salary ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_value_salary
FROM employees;
结果:
聚合函数在窗口中的用法允许你在窗口内进行聚合计算,例如计算窗口内的总和、平均值等。这通常使用 OVER 子句来定义窗口。
计算每个员工的累积薪水(按薪水降序排列)。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT
name, salary,
SUM(salary) OVER (ORDER BY salary DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_salary
FROM
employees;
结果:
????在这个示例中,SUM(salary) OVER (ORDER BY salary DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) 计算从第一个行到当前行的累积薪水。
计算每个员工的移动平均薪水(按薪水降序排列,窗口大小为3)。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT name, salary,
AVG(salary) OVER (ORDER BY salary DESC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_salary
FROM employees;
结果:
????在这个示例中,AVG(salary) OVER (ORDER BY salary DESC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) 计算当前行及其前两行的平均薪水。
计算每个员工的窗口内行数(按薪水降序排列,窗口大小为3)。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT name, salary,
COUNT(*) OVER (ORDER BY salary DESC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS window_count
FROM employees;
结果:
????在这个示例中,COUNT(*) OVER (ORDER BY salary DESC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) 计算当前行及其前两行的行数。
计算每个员工的窗口内最大值和最小值(按薪水降序排列,窗口大小为3)。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
)
SELECT name, salary,
MAX(salary) OVER (ORDER BY salary DESC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS max_salary,
MIN(salary) OVER (ORDER BY salary DESC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS min_salary
FROM employees;
结果:
????在这个示例中,MAX(salary) OVER (ORDER BY salary DESC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) 和 MIN(salary) OVER (ORDER BY salary DESC ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) 分别计算当前行及其前两行的最大值和最小值。
通过上述示例,你可以看到如何使用 FIRST_VALUE() 和 LAST_VALUE() 窗口函数来获取窗口内的第一个和最后一个值,以及如何在窗口中使用聚合函数(如 SUM()、AVG()、COUNT()、MAX() 和 MIN())进行复杂的计算。
下是三个综合的SQL案例,涵盖了我们之前学习的各种概念,包括数据检索、聚合与分组、多表操作以及高级特性(如窗口函数和公用表表达式)。这些案例将帮助你更好地理解和应用这些知识。
分析每个部门的员工薪资情况,包括平均薪资、最高薪资、最低薪资以及每个员工的薪资排名。
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部'
)
WITH employees AS (
SELECT 1 AS id, '张三' AS name, 1 AS department_id, 5000 AS salary UNION ALL
SELECT 2, '李四', 2, 6000 UNION ALL
SELECT 3, '王五', 1, 5500 UNION ALL
SELECT 4, '赵六', 3, 7000 UNION ALL
SELECT 5, '孙七', 2, 6500 UNION ALL
SELECT 6, '周八', 3, 7500 UNION ALL
SELECT 7, '吴九', 1, 4500 UNION ALL
SELECT 8, '郑十', 2, 5000 UNION ALL
SELECT 9, '钱十一', 3, 8000 UNION ALL
SELECT 10, '王十二', 1, 4000
),
departments AS (
SELECT 1 AS id, '销售部' AS department_name UNION ALL
SELECT 2, '市场部' UNION ALL
SELECT 3, '技术部'
),
department_stats AS (
SELECT department_id,
AVG(salary) AS avg_salary,
MAX(salary) AS max_salary,
MIN(salary) AS min_salary
FROM employees
GROUP BY department_id
)
SELECT e.name,
e.salary,
d.department_name,
ds.avg_salary,
ds.max_salary,
ds.min_salary,
RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) AS salary_rank
FROM employees e
JOIN departments d ON e.department_id = d.id
JOIN department_stats ds ON e.department_id = ds.department_id;
分析每个销售员的销售业绩,包括总销售额、平均销售额以及每个销售员的排名。
WITH sales AS (
SELECT 1 AS id, '张三' AS salesperson, 1000 AS amount UNION ALL
SELECT 2, '李四', 1500 UNION ALL
SELECT 3, '张三', 2000 UNION ALL
SELECT 4, '李四', 2500 UNION ALL
SELECT 5, '王五', 3000 UNION ALL
SELECT 6, '王五', 3500 UNION ALL
SELECT 7, '张三', 4000 UNION ALL
SELECT 8, '李四', 4500 UNION ALL
SELECT 9, '王五', 5000 UNION ALL
SELECT 10, '张三', 5500
)
WITH sales AS (
SELECT 1 AS id, '张三' AS salesperson, 1000 AS amount UNION ALL
SELECT 2, '李四', 1500 UNION ALL
SELECT 3, '张三', 2000 UNION ALL
SELECT 4, '李四', 2500 UNION ALL
SELECT 5, '王五', 3000 UNION ALL
SELECT 6, '王五', 3500 UNION ALL
SELECT 7, '张三', 4000 UNION ALL
SELECT 8, '李四', 4500 UNION ALL
SELECT 9, '王五', 5000 UNION ALL
SELECT 10, '张三', 5500
),
sales_summary AS (
SELECT salesperson,
SUM(amount) AS total_sales,
AVG(amount) AS avg_sales
FROM sales
GROUP BY salesperson
)
SELECT salesperson,
total_sales,
avg_sales,
RANK() OVER (ORDER BY total_sales DESC) AS sales_rank
FROM sales_summary;
分析股票价格的变化趋势,包括最高价、最低价以及每日价格变化。
WITH stock_prices AS (
SELECT '2023-01-01' AS date, 100 AS open_price, 105 AS close_price UNION ALL
SELECT '2023-01-02', 105, 110 UNION ALL
SELECT '2023-01-03', 110, 108 UNION ALL
SELECT '2023-01-04', 108, 112 UNION ALL
SELECT '2023-01-05', 112, 115 UNION ALL
SELECT '2023-01-06', 115, 113 UNION ALL
SELECT '2023-01-07', 113, 118 UNION ALL
SELECT '2023-01-08', 118, 120 UNION ALL
SELECT '2023-01-09', 120, 119 UNION ALL
SELECT '2023-01-10', 119, 122
)
WITH stock_prices AS (
SELECT '2023-01-01' AS date, 100 AS open_price, 105 AS close_price UNION ALL
SELECT '2023-01-02', 105, 110 UNION ALL
SELECT '2023-01-03', 110, 108 UNION ALL
SELECT '2023-01-04', 108, 112 UNION ALL
SELECT '2023-01-05', 112, 115 UNION ALL
SELECT '2023-01-06', 115, 113 UNION ALL
SELECT '2023-01-07', 113, 118 UNION ALL
SELECT '2023-01-08', 118, 120 UNION ALL
SELECT '2023-01-09', 120, 119 UNION ALL
SELECT '2023-01-10', 119, 122
),
daily_stats AS (
SELECT date,
close_price,
LAG(close_price) OVER (ORDER BY date) AS prev_close_price,
MAX(close_price) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS max_close_price,
MIN(close_price) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS min_close_price
FROM stock_prices
)
SELECT date,
close_price,
prev_close_price,
max_close_price,
min_close_price,
close_price - prev_close_price AS price_change
FROM daily_stats;
6.3.4 结果
通过这三个综合案例,你可以看到如何将数据检索、聚合与分组、多表操作以及高级特性(如窗口函数和公用表表达式)结合起来,解决实际问题。这些案例涵盖了从简单到复杂的查询,帮助你更好地理解和应用SQL知识。
随着各行各业进行数字化转型,数据分析能力已经成了职场的刚需能力,这也是这两年CDA数据分析师大火的原因。和领导提建议再说“我感觉”“我觉得”,自己都觉得心虚,如果说“数据分析发现……”,肯定更有说服力。想在职场精进一步还是要学习数据分析的,统计学、概率论、商业模型、SQL,Python还是要会一些,能让你工作效率提升不少。备考CDA数据分析师的过程就是个自我提升的过程。
CDA 考试官方报名入口:https://www.cdaglobal.com/pinggu.html
数据分析咨询请扫描二维码
若不方便扫码,搜微信号:CDAshujufenxi
在数字化浪潮中,数据驱动决策已成为企业发展的核心竞争力,数据分析人才的需求持续飙升。世界经济论坛发布的《未来就业报告》, ...
2025-03-28你有没有遇到过这样的情况?流量进来了,转化率却不高,辛辛苦苦拉来的用户,最后大部分都悄无声息地离开了,这时候漏斗分析就非 ...
2025-03-27TensorFlow Datasets(TFDS)是一个用于下载、管理和预处理机器学习数据集的库。它提供了易于使用的API,允许用户从现有集合中 ...
2025-03-26"不谋全局者,不足谋一域。"在数据驱动的商业时代,战略级数据分析能力已成为职场核心竞争力。《CDA二级教材:商业策略数据分析 ...
2025-03-26当你在某宝刷到【猜你喜欢】时,当抖音精准推来你的梦中情猫时,当美团外卖弹窗刚好是你想吃的火锅店…… 恭喜你,你正在被用户 ...
2025-03-26当面试官问起随机森林时,他到底在考察什么? ""请解释随机森林的原理""——这是数据分析岗位面试中的经典问题。但你可能不知道 ...
2025-03-25在数字化浪潮席卷的当下,数据俨然成为企业的命脉,贯穿于业务运作的各个环节。从线上到线下,从平台的交易数据,到门店的运营 ...
2025-03-25在互联网和移动应用领域,DAU(日活跃用户数)是一个耳熟能详的指标。无论是产品经理、运营,还是数据分析师,DAU都是衡量产品 ...
2025-03-24ABtest做的好,产品优化效果差不了!可见ABtest在评估优化策略的效果方面地位还是很高的,那么如何在业务中应用ABtest? 结合企业 ...
2025-03-21在企业数据分析中,指标体系是至关重要的工具。不仅帮助企业统一数据标准、提升数据质量,还能为业务决策提供有力支持。本文将围 ...
2025-03-20解锁数据分析师高薪密码,CDA 脱产就业班助你逆袭! 在数字化浪潮中,数据驱动决策已成为企业发展的核心竞争力,数据分析人才的 ...
2025-03-19在 MySQL 数据库中,查询一张表但是不包含某个字段可以通过以下两种方法实现:使用 SELECT 子句以明确指定想要的字段,或者使 ...
2025-03-17在当今数字化时代,数据成为企业发展的关键驱动力,而用户画像作为数据分析的重要成果,改变了企业理解用户、开展业务的方式。无 ...
2025-03-172025年是智能体(AI Agent)的元年,大模型和智能体的发展比较迅猛。感觉年初的deepseek刚火没多久,这几天Manus又成为媒体头条 ...
2025-03-14以下的文章内容来源于柯家媛老师的专栏,如果您想阅读专栏《小白必备的数据思维课》,点击下方链接 https://edu.cda.cn/goods/sh ...
2025-03-13以下的文章内容来源于刘静老师的专栏,如果您想阅读专栏《10大业务分析模型突破业务瓶颈》,点击下方链接 https://edu.cda.cn/go ...
2025-03-12以下的文章内容来源于柯家媛老师的专栏,如果您想阅读专栏《小白必备的数据思维课》,点击下方链接 https://edu.cda.cn/goods/sh ...
2025-03-11随着数字化转型的加速,企业积累了海量数据,如何从这些数据中挖掘有价值的信息,成为企业提升竞争力的关键。CDA认证考试体系应 ...
2025-03-10推荐学习书籍 《CDA一级教材》在线电子版正式上线CDA网校,为你提供系统、实用、前沿的学习资源,助你轻松迈入数据分析的大门! ...
2025-03-07在数据驱动决策的时代,掌握多样的数据分析方法,就如同拥有了开启宝藏的多把钥匙,能帮助我们从海量数据中挖掘出关键信息,本 ...
2025-03-06