snowflake.snowpark.udtf.UDTFRegistration.register¶
- UDTFRegistration.register(handler: Type, output_schema: Union[StructType, Iterable[str], PandasDataFrameType], input_types: Optional[List[DataType]] = None, input_names: Optional[List[str]] = None, name: Optional[Union[str, Iterable[str]]] = None, is_permanent: bool = False, stage_location: Optional[str] = None, imports: Optional[List[Union[str, Tuple[str, str]]]] = None, packages: Optional[List[Union[str, module]]] = None, replace: bool = False, if_not_exists: bool = False, parallel: int = 4, strict: bool = False, secure: bool = False, external_access_integrations: Optional[List[str]] = None, secrets: Optional[Dict[str, str]] = None, immutable: bool = False, max_batch_size: Optional[int] = None, comment: Optional[str] = None, *, statement_params: Optional[Dict[str, str]] = None, **kwargs) UserDefinedTableFunction [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/udtf.py#L532-L675)¶
Registers a Python class as a Snowflake Python UDTF and returns the UDTF. The usage, input arguments, and return value of this method are the same as they are for udtf(), but register() cannot be used as a decorator. See examples in UDTFRegistration; hedged usage sketches also follow the parameter list below.
- Parameters:
handler – A Python class used for creating the UDTF.
output_schema – A list of column names, or a StructType instance that represents the table function’s columns, or a PandasDataFrameType instance for a vectorized UDTF. If a list of column names is provided, the process method of the handler class must have a return type hint to indicate the output schema data types.
input_types – A list of DataType representing the input data types of the UDTF. Optional if type hints are provided.
input_names – A list of str representing the input column names of the UDTF. This only applies to vectorized UDTFs and is essentially a no-op for regular UDTFs. If unspecified, the default column names are ARG1, ARG2, etc.
name – A string or list of strings that specify the name or fully-qualified object identifier (database name, schema name, and function name) for the UDTF in Snowflake. If it is not provided, a name will be automatically generated for the UDTF. A name must be specified when is_permanent is True.
is_permanent – Whether to create a permanent UDTF. The default is False. If it is True, a valid stage_location must be provided.
stage_location – The stage location where the Python file for the UDTF and its dependencies should be uploaded. The stage location must be specified when is_permanent is True, and it will be ignored when is_permanent is False. It can be any stage other than temporary stages and external stages.
imports – A list of imports that only apply to this UDTF. You can use a string to represent a file path (similar to the path argument in add_import()) in this list, or a tuple of two strings to represent a file path and an import path (similar to the import_path argument in add_import()). These UDTF-level imports will override the session-level imports added by add_import().
packages – A list of packages that only apply to this UDTF. These UDTF-level packages will override the session-level packages added by add_packages() and add_requirements(). To use Python packages that are not available in Snowflake, refer to custom_package_usage_config().
replace – Whether to replace a UDTF that was already registered. The default is False. If it is False, attempting to register a UDTF with a name that already exists results in a SnowparkSQLException exception being thrown. If it is True, an existing UDTF with the same name is overwritten.
if_not_exists – Whether to skip creation of a UDTF when one with the same signature already exists. The default is False. if_not_exists and replace are mutually exclusive, and a ValueError is raised when both are set. If it is True and a UDTF with the same signature exists, the UDTF creation is skipped.
session – Use this session to register the UDTF. If it’s not specified, the session that you created before calling this function will be used. You need to specify this parameter if you have created multiple sessions before calling this method.
parallel – The number of threads to use for uploading UDTF files with the PUT command. The default value is 4 and supported values are from 1 to 99. Increasing the number of threads can improve performance when uploading large UDTF files.
strict – Whether the created UDTF is strict. A strict UDTF will not be invoked if any input is null; instead, a null value is always returned for that row. Note that the UDTF might still return null for non-null inputs.
secure – Whether the created UDTF is secure. For more information about secure functions, see Secure UDFs.
statement_params – Dictionary of statement-level parameters to be set while executing this action.
external_access_integrations – The names of one or more external access integrations. Each integration you specify allows access to the external network locations and secrets the integration specifies.
secrets – The key-value pairs of string-typed secrets used to authenticate to the external network location. The secrets can be accessed from handler code. The secrets specified as values must also be specified in the external access integration, and the keys are strings used to retrieve the secrets using the secret API.
immutable – Whether the UDTF result is deterministic for the same input.
max_batch_size – The maximum number of rows per input pandas DataFrame or pandas Series inside a vectorized UDTF. Because a vectorized UDTF is executed within a 60-second time limit, this optional argument can be used to reduce the running time of each batch by setting a smaller batch size. Note that setting a larger value does not guarantee that Snowflake will encode batches with the specified number of rows. It is ignored when registering a non-vectorized UDTF.
comment – Adds a comment for the created object. See COMMENT.
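A minimal, hedged sketch of a non-vectorized registration, assuming `session` is an existing snowflake.snowpark.Session; the GenerateRange handler class and the output column name num are illustrative, not part of the API:

```python
# Minimal sketch, assuming `session` is an existing snowflake.snowpark.Session.
from snowflake.snowpark.functions import lit
from snowflake.snowpark.types import IntegerType, StructField, StructType


class GenerateRange:
    def process(self, n: int):
        # Emit one row per integer in [0, n).
        for i in range(n):
            yield (i,)


generate_range_udtf = session.udtf.register(
    handler=GenerateRange,
    output_schema=StructType([StructField("num", IntegerType())]),
    input_types=[IntegerType()],
    name="generate_range",
    replace=True,
)

# Call the registered UDTF through the returned UserDefinedTableFunction.
session.table_function(generate_range_udtf(lit(5))).show()
```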
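A hedged sketch of a permanent registration that reuses the illustrative GenerateRange handler above; the stage @my_stage, the fully-qualified name, the package, and the import path are placeholders, not values required by the API:

```python
# Sketch only: stage, object name, package, and import path are placeholders.
session.udtf.register(
    handler=GenerateRange,
    output_schema=StructType([StructField("num", IntegerType())]),
    input_types=[IntegerType()],
    name="my_db.my_schema.generate_range",
    is_permanent=True,
    stage_location="@my_stage",  # required because is_permanent=True
    packages=["numpy"],  # UDTF-level packages override session-level packages
    imports=[("/tmp/my_helpers.py", "my_helpers")],  # (file path, import path) tuple form
    replace=True,
    comment="Generates the integers 0..n-1 as rows.",
)
```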
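A hedged sketch of a vectorized registration using PandasDataFrameType and max_batch_size; the Rescale handler, the column names, and the batch size are illustrative assumptions:

```python
# Sketch only: a vectorized UDTF whose end_partition receives its input as a
# pandas DataFrame and yields a pandas DataFrame matching output_schema.
from snowflake.snowpark.types import FloatType, IntegerType, PandasDataFrameType


class Rescale:
    def end_partition(self, df):
        # Column names follow input_names; scale one column and return both.
        df["total"] = df["total"] * 1.1
        yield df


rescale_udtf = session.udtf.register(
    handler=Rescale,
    output_schema=PandasDataFrameType([IntegerType(), FloatType()], ["id", "total"]),
    input_types=[PandasDataFrameType([IntegerType(), FloatType()])],
    input_names=['"id"', '"total"'],
    max_batch_size=1000,  # cap rows per input DataFrame, per the parameter above
    replace=True,
)
```

A vectorized UDTF of this shape is typically invoked over partitions, for example by calling .over(partition_by=...) on the TableFunctionCall returned by rescale_udtf(...).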