Skip to content

Latest commit

 

History

History
 
 

DepartmentHighestSalary

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 

Department Highest Salary

Problem

The Employee table holds all employees. Every employee has an Id, a salary, and there is also a column for the department Id.

+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
+----+-------+--------+--------------+

The Department table holds all departments of the company.

+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+

Write a SQL query to find employees who have the highest salary in each of the departments. For the above tables, Max has the highest salary in the IT department and Henry has the highest salary in the Sales department.

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| Sales      | Henry    | 80000  |
+------------+----------+--------+

Solution 1

这道题很明显是聚集查询。首先想到的自然是maxgroup语句, 于是不加思索的使用以下sql语句:

select d.Name as Department, e.Name as Employee, max(e.Salary) as Salary
from Employee as e
join Department as d on e.DepartmentId = d.Id
group by e.DepartmentId;

这是错误的查询,错误原因在于:

对于带group by子句的select语句,select子句中的结果列必须是group by子句中的分组列或者聚集函数,不能有其他无关属性.

因此select中的d.Name, e.Name 都不能出现在结果列中。

如果我们只需要DepartmentId 和 最高Salary,则很容易实现:

mysql> select DepartmentId, max(Salary) as max from Employee group by DepartmentId;
+--------------+-------+
| DepartmentId | max   |
+--------------+-------+
|            1 | 90000 |
|            2 | 80000 |
+--------------+-------+

我们可以利用这个临时表连接EmployeeDepartment就可以得到其他的属性列。

Code

select d.Name as Department, e1.Name as Employee, e1.Salary as Salary
from Employee as e1
join (select DepartmentId, max(Salary) as max from Employee group by DepartmentId) as t
on e1.DepartmentId = t.DepartmentId
join Department as d
on e1.DepartmentId = d.Id
where e1.Salary = max;

Solution 2

同样的假如我们已经得到DepartmentId 和最高Salary,则可以使用in子句实现。

Code

SELECT D.Name AS Department ,E.Name AS Employee ,E.Salary 
from 
Employee E,
Department D 
WHERE E.DepartmentId = D.id 
AND (DepartmentId,Salary) in 
(SELECT DepartmentId,max(Salary) as max FROM Employee GROUP BY DepartmentId) 

关于聚集查询

在实际应用中,常常需要计算一些统计量。例如,总数、平均数、最大值、最小值等。 SQL的聚集函数可以单独使用,也可以配合group by子句使用。单独使用时,聚集函数作用于整个查询结果, 而与group by子句配合使用时,聚集函数作用于查询结果的每个分组。

聚集函数具有如下形式:

	COUNT ([ALL | DISTINCT] *)
或者
	<聚集函数> ([ALL | DISTINCT] <值表达式>)

第一种情况比较简单,COUNT(*) 或者 COUNT(ALL *)返回每个分组中的元组个数,而COUNT(DISTINCT *)返回每个分组中不同元组的个数。

第二种情况,聚集函数可以是COUNTMAXMINAVG等。值表达式是可求值的表达式,通常是属性。短语ALL或者DISTINCT是可选的, 默认是ALL.

分组

SQL语言提供了group by子句,其一般形式如下:

group by <分组列>{,<分组列>} [having <分组条件>]

其中,分组列是属性,可选的having子句用来过滤不满足条件的分组, 类似于where子句的查询条件,但其中允许出现聚集函数。

对于带group by子句的select查询,select 子句中的结果列必须是group by子句中的分组列或者聚集函数,不能有其他无关属性列