- 类别:
GROUP BY¶
使用相同的逐项分组表达式对行进行分组,并计算结果组的聚合函数。GROUP BY 表达式可以是:
列名称。
引用 SELECT 列表中位置的数字。
通用表达式。
GROUP BY extensions¶
GROUP BY 支持以下可提供强大聚合功能的扩展:
GROUP BY GROUPING SETS:选择使用 时默认使用的角色和仓库。在单个语句中计算多个 GROUP BY 子句
GROUP BY ROLLUP:选择使用 时默认使用的角色和仓库。为分层数据生成小计行
GROUP BY CUBE:为所有维度组合生成小计行
您可以将这些扩展与常规 GROUP BY 列结合使用。例如:
GROUP BY x, GROUPING SETS(y, z)GROUP BY x, ROLLUP(y, z)GROUP BY x, CUBE(y, z)
有关解读扩展结果中的 NULL 值的更多信息,请参阅 GROUPING 效用函数。
语法¶
SELECT ...
FROM ...
[ ... ]
GROUP BY groupItem [ , groupItem [ , ... ] ]
[ ... ]
SELECT ...
FROM ...
[ ... ]
GROUP BY ALL
[ ... ]
其中:
groupItem ::= { <column_alias> | <position> | <expr> }
参数¶
GROUP BY ALL指定 SELECT 列表中不使用聚合函数的所有项都应用于分组。
有关示例,请参阅 按所有列分组。
使用说明¶
A GROUP BY clause can reference expressions in the projection clause by name or by position. If the GROUP BY clause references by name, each reference is resolved as follows:
If the query contains a database object (for example, a table or view) with a matching column name, the reference is resolved to the column name.
Otherwise, if the projection clause of the SELECT contains an expression alias with a matching name, the reference is resolved to the alias.
如果所有 SELECT 项都使用聚合函数,则指定 GROUP BY ALL 等同于指定不带 GROUP BY 子句的语句。
例如,以下语句仅包含使用聚合函数的 SELECT 项:
SELECT SUM(amount) FROM mytable GROUP BY ALL;
上面的语句等同于不指定 GROUP by 子句:
SELECT SUM(amount) FROM mytable;
示例¶
以下各节提供了使用 GROUP BY 子句的示例:
请注意,各节中的示例使用您在“为示例设置数据”中设置的数据。
为示例设置数据¶
The examples in this section use a table named sales and a table named product. To create these tables and insert the
data needed for the example, run the following commands:
CREATE TABLE sales (
product_ID INTEGER,
retail_price REAL,
quantity INTEGER,
city VARCHAR,
state VARCHAR);
INSERT INTO sales (product_id, retail_price, quantity, city, state) VALUES
(1, 2.00, 1, 'SF', 'CA'),
(1, 2.00, 2, 'SJ', 'CA'),
(2, 5.00, 4, 'SF', 'CA'),
(2, 5.00, 8, 'SJ', 'CA'),
(2, 5.00, 16, 'Miami', 'FL'),
(2, 5.00, 32, 'Orlando', 'FL'),
(2, 5.00, 64, 'SJ', 'PR');
CREATE TABLE products (
product_ID INTEGER,
wholesale_price REAL);
INSERT INTO products (product_ID, wholesale_price) VALUES (1, 1.00);
INSERT INTO products (product_ID, wholesale_price) VALUES (2, 2.00);
按一列分组¶
This example shows the gross revenue per product, grouped by product_id (that is, the total amount of money received for
each product):
SELECT product_ID, SUM(retail_price * quantity) AS gross_revenue
FROM sales
GROUP BY product_ID;
+------------+---------------+
| PRODUCT_ID | GROSS_REVENUE |
+------------+---------------+
| 1 | 6 |
| 2 | 620 |
+------------+---------------+
以下示例基于上一个示例,显示每个产品的净利润,按 product_id 分组:
SELECT p.product_ID, SUM((s.retail_price - p.wholesale_price) * s.quantity) AS profit
FROM products AS p, sales AS s
WHERE s.product_ID = p.product_ID
GROUP BY p.product_ID;
+------------+--------+
| PRODUCT_ID | PROFIT |
+------------+--------+
| 1 | 3 |
| 2 | 372 |
+------------+--------+
按多列分组¶
下面的示例演示了如何按多个列进行分组:
SELECT state, city, SUM(retail_price * quantity) AS gross_revenue
FROM sales
GROUP BY state, city;
+-------+---------+---------------+
| STATE | CITY | GROSS REVENUE |
+-------+---------+---------------+
| CA | SF | 22 |
| CA | SJ | 44 |
| FL | Miami | 80 |
| FL | Orlando | 160 |
| PR | SJ | 320 |
+-------+---------+---------------+
按所有列分组¶
以下示例等同于 按多列分组 中使用的示例。
SELECT state, city, SUM(retail_price * quantity) AS gross_revenue
FROM sales
GROUP BY ALL;
+-------+---------+---------------+
| STATE | CITY | GROSS REVENUE |
+-------+---------+---------------+
| CA | SF | 22 |
| CA | SJ | 44 |
| FL | Miami | 80 |
| FL | Orlando | 160 |
| PR | SJ | 320 |
+-------+---------+---------------+
Precedence when a column name and an alias match¶
It is possible (but usually not recommended) to create a query that contains an alias that matches a column name:
SELECT x, some_expression AS x
FROM ...
If a clause contains a name that matches both a column name and an alias, then the clause uses the column name. The following example demonstrates this behavior using a GROUP BY clause:
创建表并插入行:
CREATE TABLE employees (salary FLOAT, state VARCHAR, employment_state VARCHAR);
INSERT INTO employees (salary, state, employment_state) VALUES
(60000, 'California', 'Active'),
(70000, 'California', 'On leave'),
(80000, 'Oregon', 'Active');
The following query returns the sum of the salaries of the employees who are active and the sum of the salaries of the employees who are on leave:
SELECT SUM(salary), ANY_VALUE(employment_state)
FROM employees
GROUP BY employment_state;
+-------------+-----------------------------+
| SUM(SALARY) | ANY_VALUE(EMPLOYMENT_STATE) |
|-------------+-----------------------------|
| 140000 | Active |
| 70000 | On leave |
+-------------+-----------------------------+
The next query uses the alias state, which matches the name of a column of the table in the query. When state is used in
the GROUP BY clause, Snowflake interprets it as a reference to the column name, not the alias. This query therefore returns the sum of
the salaries of the employees in the state of California and the sum of the salaries of the employees in the state of Oregon,
yet displays employment_state information, such as Active, rather than the names of states or provinces:
SELECT SUM(salary), ANY_VALUE(employment_state) AS state
FROM employees
GROUP BY state;
+-------------+--------+
| SUM(SALARY) | STATE |
|-------------+--------|
| 130000 | Active |
| 80000 | Active |
+-------------+--------+