Scalar SQL UDFs¶

本主题介绍特定于 SQL UDFs（用户定义函数）的概念和用法详细信息。

一般用法

SQL UDF 计算任意 SQL 表达式，并返回该表达式的结果。

函数定义可以是 SQL 表达式，该表达式返回一个标量（即单个）值，或者如果定义为表函数，则返回一组行。例如，下面是计算圆面积的标量 UDF 的基本示例：

CREATE FUNCTION area_of_circle(radius FLOAT)
  RETURNS FLOAT
  AS
  $$
    pi() * radius * radius
  $$
  ;

SELECT area_of_circle(1.0);

输出：

SELECT area_of_circle(1.0);
+---------------------+
| AREA_OF_CIRCLE(1.0) |
|---------------------|
|         3.141592654 |
+---------------------+

The expression can be a query expression (a SELECT expression). For example:

CREATE FUNCTION profit()
  RETURNS NUMERIC(11, 2)
  AS
  $$
    SELECT SUM((retail_price - wholesale_price) * number_sold)
        FROM purchases
  $$
  ;

在 SQL UDF 中使用查询表达式时，不要在 UDF 主体中包含分号来终止查询表达式。

只能包含一个查询表达式。表达式可以包含 UNION [ALL]。

Note

虽然 UDF 的主体可以包含完整的 SELECT 语句，但它不能包含 DDL 语句或 SELECT 以外的任何 DML 语句。

Note

标量函数 (UDFs) 最多有 500 个输入实参。

可记忆的 UDFs¶

A scalar SQL UDF can be memoizable. A memoizable function caches the result of calling a scalar SQL UDF and then returns the cached result when the output is needed at a later time. The benefit of using a memoizable function is to improve performance for complex queries, such as multiple column lookups in mapping tables referenced within a row access policy or masking policy.

策略所有者（例如对行访问策略具有 OWNERSHIP 权限的角色）可以更新其策略条件，以将具有映射表的子查询替换为可记忆函数。当用户以后在查询中引用受策略保护的列时，可根据需要使用可记忆函数的缓存结果。

Note

The USE_CACHED_RESULT session parameter must be set to TRUE to use memoizable functions.

创建可记忆函数

You can define a scalar SQL UDF to be memoizable in the CREATE FUNCTION statement by specifying the MEMOIZABLE keyword. You can create a memoizable to function with or without arguments. By using arguments, you have more freedom to define the SQL UDF. When you write a policy to call the memoizable function, you have more freedom in terms of how to define the policy.

如果指定实参，则实参必须是具有以下数据类型之一的常量值：

VARCHAR 以及其他字符串数据类型。
NUMBER 以及其他数字数据类型。
TIMESTAMP 以及其他日期数据类型。
BOOLEAN。

Nonconstant values and their data types, such as semi-structured data types and table columns are not supported.

编写可记忆函数时：

Specify BOOLEAN or other scalar data types as the result_data_type.

Exercise caution when specifying ARRAY as the result_data_type because there are limits to cache size.
不要指定其他数据类型，例如 OBJECT 和 VARIANT。
不要以任何方式引用其他可记忆函数。

调用可记忆函数

可记忆函数可以在 SELECT 语句中调用，也可以包含在策略定义中，然后策略定义根据策略条件来调用可记忆函数。

调用可记忆函数时，请注意：

For SQL UDFs that return the ARRAY data type or specify a non-scalar value, use the memoizable function as an argument in the ARRAY_CONTAINS function.
缓存大小限制：

对于当前的 Snowflake 会话，每个可记忆函数都有 10 KB 的限制。

如果可记忆函数超过结果集缓存的此限制，则 Snowflake 不会缓存调用可记忆函数的结果。此时， UDF 将根据函数编写方式用作普通的标量 UDF。

缓存使用情况：

Memoizable functions have a reusable result cache for different SQL statements when the query environment and context do not change. Generally, this means the result cache applies to different SQL statements provided that:
- 对查询中引用的对象和列的访问控制授权保持不变。
- 未通过 DML 语句等方式修改查询中引用的对象。
The CHILD_QUERIES_WAIT_TIME column in the Account Usage QUERY_HISTORY view records the time (in milliseconds) to complete the cached lookup when calling a memoizable function.
在以下情况下，可记忆函数不会重用缓存的结果：
- 函数引用表或其他对象，并且对引用的表进行了更新。
- 对表的访问控制发生了变更。
- 函数调用非确定性函数。
- 函数调用外部函数或不是 SQL UDF 的 UDF。

示例

SQL 标量 UDF 基本示例¶

此示例返回数学常数 pi 的硬编码近似值。

CREATE FUNCTION pi_udf()
  RETURNS FLOAT
  AS '3.141592654::FLOAT'
  ;

SELECT pi_udf();

输出：

SELECT pi_udf();
+-------------+
|    PI_UDF() |
|-------------|
| 3.141592654 |
+-------------+

常见 SQL 示例¶

Query expression with SELECT statement¶

创建要使用的表和数据：

CREATE TABLE purchases (number_sold INTEGER, wholesale_price NUMBER(7,2), retail_price NUMBER(7,2));
INSERT INTO purchases (number_sold, wholesale_price, retail_price) VALUES 
   (3,  10.00,  20.00),
   (5, 100.00, 200.00)
   ;

创建 UDF：

CREATE FUNCTION profit()
  RETURNS NUMERIC(11, 2)
  AS
  $$
    SELECT SUM((retail_price - wholesale_price) * number_sold)
        FROM purchases
  $$
  ;

在查询中调用 UDF ：

SELECT profit();

输出：

SELECT profit();
+----------+
| PROFIT() |
|----------|
|   530.00 |
+----------+

WITH 子句中的 UDF¶

CREATE TABLE circles (diameter FLOAT);

INSERT INTO circles (diameter) VALUES
    (2.0),
    (4.0);

CREATE FUNCTION diameter_to_radius(f FLOAT) 
  RETURNS FLOAT
  AS 
  $$ f / 2 $$
  ;

WITH
    radii AS (SELECT diameter_to_radius(diameter) AS radius FROM circles)
  SELECT radius FROM radii
    ORDER BY radius
  ;

输出：

+--------+
| RADIUS |
|--------|
|      1 |
|      2 |
+--------+

JOIN 操作¶

此示例使用更复杂的查询，其中包含了 JOIN 操作：

创建要使用的表和数据：

CREATE TABLE orders (product_ID varchar, quantity integer, price numeric(11, 2), buyer_info varchar);
CREATE TABLE inventory (product_ID varchar, quantity integer, price numeric(11, 2), vendor_info varchar);
INSERT INTO inventory (product_ID, quantity, price, vendor_info) VALUES 
  ('X24 Bicycle', 4, 1000.00, 'HelloVelo'),
  ('GreenStar Helmet', 8, 50.00, 'MellowVelo'),
  ('SoundFX', 5, 20.00, 'Annoying FX Corporation');
INSERT INTO orders (product_id, quantity, price, buyer_info) VALUES 
  ('X24 Bicycle', 1, 1500.00, 'Jennifer Juniper'),
  ('GreenStar Helmet', 1, 75.00, 'Donovan Liege'),
  ('GreenStar Helmet', 1, 75.00, 'Montgomery Python');

创建 UDF：

CREATE FUNCTION store_profit()
  RETURNS NUMERIC(11, 2)
  AS
  $$
  SELECT SUM( (o.price - i.price) * o.quantity) 
    FROM orders AS o, inventory AS i 
    WHERE o.product_id = i.product_id
  $$
  ;

在查询中调用 UDF ：

SELECT store_profit();

输出：

SELECT store_profit();
+----------------+
| STORE_PROFIT() |
|----------------|
|         550.00 |
+----------------+

The topic CREATE FUNCTION contains additional examples.

在不同子句中使用 UDFs¶

在任何可以使用标量表达式的地方都可以使用标量 UDF。例如：

-- ----- These examples show a UDF called from different clauses ----- --

select MyFunc(column1) from table1;

select * from table1 where column2 > MyFunc(column1);

在 UDF 中使用 SQL 变量¶

此示例演示如何设置 SQL 变量并在 UDF 中使用该变量：

SET id_threshold = (SELECT COUNT(*)/2 FROM table1);

CREATE OR REPLACE FUNCTION my_filter_function()
RETURNS TABLE (id int)
AS
$$
SELECT id FROM table1 WHERE id > $id_threshold
$$
;

可记忆函数

有关示例，请参阅：

Memoizable function without arguments in a row access policy.
Memoizable function with arguments in a masking policy.