snowflake.snowpark.functions.xmlget¶
- snowflake.snowpark.functions.xmlget(xml: Union[Column, str], tag: Union[Column, str], instance_num: Union[Column, str, int] = 0) Column [source] (https://github.com/snowflakedb/snowpark-python/blob/v1.16.0/src/snowflake/snowpark/functions.py#L6404-L6453)¶
Extracts an XML element object (often referred to as simply a tag) from a content of outer XML element object by the name of the tag and its instance number (counting from 0).
The following example returns the first inner level (level2) from the XML object created via parse_xml.
Example:
>>> df = session.create_dataframe(['<level1 attr1="a">1<level2 attr2="b">2<level3>3a</level3><level3>3b</level3></level2></level1>'], schema=["str"]).select(parse_xml("str").as_("obj")) >>> df.collect() [Row(OBJ='<level1 attr1="a">\n 1\n <level2 attr2="b">\n 2\n <level3>3a</level3>\n <level3>3b</level3>\n </level2>\n</level1>')] >>> df.select(xmlget("obj", lit("level2")).as_("ans")).collect() [Row(ANS='<level2 attr2="b">\n 2\n <level3>3a</level3>\n <level3>3b</level3>\n</level2>')]
When multiple tags exist at a level, instance_num can be used to distinguish which element to return.
Example:
>>> df.select(xmlget(xmlget("obj", lit("level2")), lit("level3"), lit(0)).as_("ans")).collect() [Row(ANS='<level3>3a</level3>')] >>> df.select(xmlget(xmlget("obj", lit("level2")), lit("level3"), lit(1)).as_("ans")).collect() [Row(ANS='<level3>3b</level3>')] >>> df.select(xmlget("obj", lit("level2"), lit(5)).as_("ans")).collect() [Row(ANS=None)]
In order to get the tagname, the value of an attribute or the content within a tag the get function can be used.
Example:
>>> df.select(get(xmlget("obj", lit("level2")), lit("@")).as_("ans")).collect() [Row(ANS='"level2"')] >>> df.select(get(xmlget("obj", lit("level2")), lit("$")).as_("ans")).collect() [Row(ANS='[\n 2,\n {\n "$": "3a",\n "@": "level3"\n },\n {\n "$": "3b",\n "@": "level3"\n }\n]')] >>> df.select(get(xmlget(xmlget("obj", lit("level2")), lit("level3")), lit("$")).as_("ans")).collect() [Row(ANS='"3a"')] >>> df.select(get(xmlget("obj", lit("level2")), lit("@attr2")).as_("ans")).collect() [Row(ANS='"b"')]