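Inferred signatures for Hugging Face pipelines
Snowflake Model Registry automatically infers the signature of a Hugging Face pipeline that contains a single task from the following list: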
conversational
fill-mask
question-answering
summarization
table-question-answering
text2text-generation
text-classification (alias: sentiment-analysis)
text-generation
token-classification (alias: ner)
translation
translation_xx_to_yy
zero-shot-classification
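This topic describes the signatures of these types of Hugging Face pipelines, including descriptions and examples of the required inputs and expected outputs. All inputs and outputs are Snowpark DataFrames.
For general guidance on logging Hugging Face pipelines in the registry, see Hugging Face pipeline.
Conversational pipeline
A pipeline whose task is "conversational" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.ConversationalPipeline) has the following inputs and outputs.
Inputs
user_inputs: A list of strings representing the user's previous and current inputs. The last item in the list is the current input.
generated_responses: A list of strings representing the model's previous responses.
Example: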
---------------------------------------------------------------------------
|"user_inputs" |"generated_responses" |
---------------------------------------------------------------------------
|[ |[ |
| "Do you speak French?", | "Yes I do." |
| "Do you know how to say Snowflake in French?" |] |
|] | |
---------------------------------------------------------------------------
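Outputs
generated_responses: A list of strings representing the model's previous and current responses. The last item in the list is the current response.
Example: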
-------------------------
|"generated_responses" |
-------------------------
|[ |
| "Yes I do.", |
| "I speak French." |
|] |
-------------------------
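The relationship between the two input columns can be sketched in plain Python. This is only an illustration: it assumes, as in the example above, that `user_inputs` holds exactly one more entry than `generated_responses`. The helper name `build_conversation_row` is hypothetical and is not part of the Snowflake or Hugging Face APIs.

```python
def build_conversation_row(past_user_inputs, past_responses, new_input):
    """Build one input row for a conversational pipeline (hypothetical helper)."""
    if len(past_user_inputs) != len(past_responses):
        raise ValueError("each past user input needs a matching model response")
    return {
        # Previous inputs followed by the current one, which must come last.
        "user_inputs": [*past_user_inputs, new_input],
        # Only the responses the model has already produced.
        "generated_responses": list(past_responses),
    }


row = build_conversation_row(
    ["Do you speak French?"],
    ["Yes I do."],
    "Do you know how to say Snowflake in French?",
)
print(row["user_inputs"][-1])  # the current input
```

After the pipeline runs, the returned `generated_responses` list would contain one more entry than the one passed in, so the two lists end up the same length.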
Fill-mask pipeline
A pipeline whose task is "fill-mask" has the following inputs and outputs.
Inputs
inputs: A string containing the mask to fill.
Example:
--------------------------------------------------
|"inputs" |
--------------------------------------------------
|LynYuu is the [MASK] of the Grand Duchy of Yu. |
--------------------------------------------------
Outputs
outputs: A string containing a JSON representation of a list of objects, each of which may contain keys such as score, token, token_str, or sequence. For details, see FillMaskPipeline.
Example:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|"outputs" |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|[{"score": 0.9066258072853088, "token": 3007, "token_str": "capital", "sequence": "lynyuu is the capital of the grand duchy of yu."}, {"score": 0.08162177354097366, "token": 2835, "token_str": "seat", "sequence": "lynyuu is the seat of the grand duchy of yu."}, {"score": 0.0012052370002493262, "token": 4075, "token_str": "headquarters", "sequence": "lynyuu is the headquarters of the grand duchy of yu."}, {"score": 0.0006560495239682496, "token": 2171, "token_str": "name", "sequence": "lynyuu is the name of the grand duchy of yu."}, {"score": 0.0005427763098850846, "token": 3200, "token_str"... |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
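Because `outputs` arrives as a JSON string rather than a structured value, it has to be parsed before the candidates can be compared. The following is a minimal sketch using only the standard library; the string holds the first two candidates from the example row above, since the row itself is truncated.

```python
import json

# First two candidates copied from the example row above.
outputs = (
    '[{"score": 0.9066258072853088, "token": 3007, "token_str": "capital", '
    '"sequence": "lynyuu is the capital of the grand duchy of yu."}, '
    '{"score": 0.08162177354097366, "token": 2835, "token_str": "seat", '
    '"sequence": "lynyuu is the seat of the grand duchy of yu."}]'
)

candidates = json.loads(outputs)
# .get() is used because any given key may be absent from an object.
best = max(candidates, key=lambda c: c.get("score", 0.0))
print(best["token_str"], "->", best["sequence"])
```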
Token classification
A pipeline whose task is "ner" or "token-classification" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TokenClassificationPipeline) has the following inputs and outputs.
Inputs
inputs: A string containing the tokens to classify.
Example:
------------------------------------------------
|"inputs" |
------------------------------------------------
|My name is Izumi and I live in Tokyo, Japan. |
------------------------------------------------
Outputs
outputs: A string containing a JSON representation of a list of result objects, each of which may contain keys such as entity, score, index, word, name, start, or end. For details, see TokenClassificationPipeline (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TokenClassificationPipeline).
Example:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|"outputs" |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|[{"entity": "PRON", "score": 0.9994392991065979, "index": 1, "word": "my", "start": 0, "end": 2}, {"entity": "NOUN", "score": 0.9968984127044678, "index": 2, "word": "name", "start": 3, "end": 7}, {"entity": "AUX", "score": 0.9937735199928284, "index": 3, "word": "is", "start": 8, "end": 10}, {"entity": "PROPN", "score": 0.9928083419799805, "index": 4, "word": "i", "start": 11, "end": 12}, {"entity": "PROPN", "score": 0.997334361076355, "index": 5, "word": "##zumi", "start": 12, "end": 16}, {"entity": "CCONJ", "score": 0.999173104763031, "index": 6, "word": "and", "start": 17, "end": 20}, {... |
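In the example output, the name "Izumi" is split into the two tokens "i" and "##zumi". The sketch below shows one way to rejoin such pieces after parsing the JSON string. It assumes the "##" prefix marks a continuation of the previous token (the WordPiece convention), which depends on the model's tokenizer. The helper `merge_word_pieces` is hypothetical.

```python
import json

# The two objects for "i" and "##zumi", copied from the example row above.
outputs = (
    '[{"entity": "PROPN", "score": 0.9928083419799805, "index": 4, '
    '"word": "i", "start": 11, "end": 12}, '
    '{"entity": "PROPN", "score": 0.997334361076355, "index": 5, '
    '"word": "##zumi", "start": 12, "end": 16}]'
)


def merge_word_pieces(tokens):
    """Join '##' continuation tokens onto the preceding token (hypothetical helper)."""
    merged = []
    for tok in tokens:
        if tok["word"].startswith("##") and merged:
            prev = merged[-1]
            prev["word"] += tok["word"][2:]  # drop the '##' marker
            prev["end"] = tok["end"]         # extend the span; score stays that of the first piece
        else:
            merged.append(dict(tok))         # copy so the parsed input is not mutated
    return merged


words = merge_word_pieces(json.loads(outputs))
print(words[0]["word"], words[0]["start"], words[0]["end"])
```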
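Question answering (single output)
A pipeline whose task is "question-answering" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.QuestionAnsweringPipeline), where top_k is either unset or set to 1, has the following inputs and outputs.
Inputs
question: A string containing the question to answer.
context: A string that may contain the answer.
Example: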
-----------------------------------------------------------------------------------
|"question" |"context" |
-----------------------------------------------------------------------------------
|What did Doris want to do? |Doris is a cheerful mermaid from the ocean dept... |
-----------------------------------------------------------------------------------
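Outputs
score: Floating-point confidence score from 0.0 to 1.0.
start: Integer index of the first token of the answer in the context.
end: Integer index of the last token of the answer in the original context.
answer: A string containing the answer that was found.
Example: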
--------------------------------------------------------------------------------
|"score" |"start" |"end" |"answer" |
--------------------------------------------------------------------------------
|0.61094731092453 |139 |178 |learn more about the world of athletics |
--------------------------------------------------------------------------------
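In the example row, `end - start` equals the length of the answer string (178 - 139 = 39), which is consistent with `start` and `end` being character offsets into the context. The sketch below relies on that reading, which is an assumption. It uses a hypothetical context and answer because the context in the example is truncated, and the helper `mark_answer` is likewise hypothetical.

```python
def mark_answer(context, start, end, margin=10):
    """Return the answer in brackets with up to `margin` characters on each side."""
    left = max(0, start - margin)
    right = min(len(context), end + margin)
    return context[left:start] + "[" + context[start:end] + "]" + context[end:right]


# Hypothetical context and answer, not taken from the example above.
context = "Doris said she wanted to learn to swim before summer."
answer = "learn to swim"
start = context.index(answer)  # character offset where the answer begins
end = start + len(answer)      # character offset just past the answer

print(mark_answer(context, start, end))
```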
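Question answering (multiple outputs)
A pipeline whose task is "question-answering" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.QuestionAnsweringPipeline), where top_k is set to a value greater than 1, has the following inputs and outputs.
Inputs
question: A string containing the question to answer.
context: A string that may contain the answer.
Example: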
-----------------------------------------------------------------------------------
|"question" |"context" |
-----------------------------------------------------------------------------------
|What did Doris want to do? |Doris is a cheerful mermaid from the ocean dept... |
-----------------------------------------------------------------------------------
Outputs
outputs: A string containing a JSON representation of a list of result objects, each of which may contain keys such as score, start, end, or answer.
Example:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|"outputs" |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|[{"score": 0.61094731092453, "start": 139, "end": 178, "answer": "learn more about the world of athletics"}, {"score": 0.17750297486782074, "start": 139, "end": 180, "answer": "learn more about the world of athletics.\""}, {"score": 0.06438097357749939, "start": 138, "end": 178, "answer": "\"learn more about the world of athletics"}] |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
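With multiple outputs, the candidates come back as one JSON string, so filtering by confidence happens after parsing. The following sketch parses the example row above and keeps the answers at or above a cut-off. The `threshold` value is an arbitrary, hypothetical choice.

```python
import json

# The example row above, as a raw string so the JSON escapes survive intact.
outputs = (
    r'[{"score": 0.61094731092453, "start": 139, "end": 178, '
    r'"answer": "learn more about the world of athletics"}, '
    r'{"score": 0.17750297486782074, "start": 139, "end": 180, '
    r'"answer": "learn more about the world of athletics.\""}, '
    r'{"score": 0.06438097357749939, "start": 138, "end": 178, '
    r'"answer": "\"learn more about the world of athletics"}]'
)

answers = json.loads(outputs)
threshold = 0.1  # hypothetical cut-off
kept = [a["answer"] for a in answers if a["score"] >= threshold]
print(kept)
```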
Summarization
A pipeline whose task is "summarization" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.SummarizationPipeline), where return_tensors is False or unset, has the following inputs and outputs.
Inputs
documents: A string containing the text to summarize.
Example:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|"documents" |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|Neuro-sama is a chatbot styled after a female VTuber that hosts live streams on the Twitch channel "vedal987". Her speech and personality are generated by an artificial intelligence (AI) system wh... |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
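Outputs
summary_text: A string containing the generated summary or, if num_return_sequences is greater than 1, a string containing a JSON representation of a list of results, each of which is a dictionary containing fields that include summary_text.
Example: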
---------------------------------------------------------------------------------
|"summary_text" |
---------------------------------------------------------------------------------
| Neuro-sama is a chatbot styled after a female VTuber that hosts live streams |
---------------------------------------------------------------------------------
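Since the same `summary_text` column holds either a plain string or a JSON list depending on `num_return_sequences`, downstream code needs to know which form to expect. A minimal sketch of one way to normalize both cases to a list follows. The helper `summaries_from_cell` and the two-summary JSON string are hypothetical.

```python
import json


def summaries_from_cell(cell, num_return_sequences=1):
    """Normalize a summary_text cell to a list of summaries (hypothetical helper)."""
    if num_return_sequences <= 1:
        return [cell]  # the cell is the summary itself
    # Otherwise the cell is a JSON list of dictionaries with a summary_text field.
    return [item["summary_text"] for item in json.loads(cell)]


single = summaries_from_cell(
    " Neuro-sama is a chatbot styled after a female VTuber that hosts live streams"
)
multi = summaries_from_cell(
    '[{"summary_text": "first summary"}, {"summary_text": "second summary"}]',
    num_return_sequences=2,
)
print(len(single), len(multi))
```

Passing `num_return_sequences` explicitly avoids guessing the form from the cell's contents, which could misfire if a summary happened to be valid JSON.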
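Table question answering
A pipeline whose task is "table-question-answering" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TableQuestionAnsweringPipeline) has the following inputs and outputs.
Inputs
query: A string containing the question to answer.
table: A string containing a JSON-serialized dictionary of the form {column -> [values]}, representing the table that may contain the answer.
Example: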
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|"query" |"table" |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|Which channel has the most subscribers? |{"Channel": ["A.I.Channel", "Kaguya Luna", "Mirai Akari", "Siro"], "Subscribers": ["3,020,000", "872,000", "694,000", "660,000"], "Videos": ["1,200", "113", "639", "1,300"], "Created At": ["Jun 30 2016", "Dec 4 2017", "Feb 28 2014", "Jun 23 2017"]} |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
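Outputs
answer: A string containing a possible answer.
coordinates: A list of integers representing the coordinates of the cells where the answer was found.
cells: A list of strings containing the contents of the cells where the answer was found.
aggregator: A string containing the name of the aggregator used.
Example: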
----------------------------------------------------------------
|"answer" |"coordinates" |"cells" |"aggregator" |
----------------------------------------------------------------
|A.I.Channel |[ |[ |NONE |
| | [ | "A.I.Channel" | |
| | 0, |] | |
| | 0 | | |
| | ] | | |
| |] | | |
----------------------------------------------------------------
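The `table` input and the `coordinates` output can be related with a short standard-library sketch. It serializes the example table with `json.dumps` and then looks up the cells named by the example coordinates. Treating each coordinate pair as [row, column] is an assumption: the example pair [0, 0] gives the same cell under either ordering.

```python
import json

# The table from the example input, in {column -> [values]} form.
table = {
    "Channel": ["A.I.Channel", "Kaguya Luna", "Mirai Akari", "Siro"],
    "Subscribers": ["3,020,000", "872,000", "694,000", "660,000"],
    "Videos": ["1,200", "113", "639", "1,300"],
    "Created At": ["Jun 30 2016", "Dec 4 2017", "Feb 28 2014", "Jun 23 2017"],
}
table_cell = json.dumps(table)  # string value for the "table" input column

coordinates = [[0, 0]]  # from the example output row
parsed = json.loads(table_cell)
columns = list(parsed)  # key order is preserved by json.loads
# Assumption: each pair is [row, column].
cells = [parsed[columns[col]][row] for row, col in coordinates]
print(cells)
```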
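Text classification (single output)
A pipeline whose task is "text-classification", where top_k is either unset or None, has the following inputs and outputs.
Inputs
text: A string to classify.
text_pair: A string to classify along with text, used with models that compute text similarity. Leave empty if the model does not use it.
Example: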
----------------------------------
|"text" |"text_pair" |
----------------------------------
|I like you. |I love you, too. |
----------------------------------
Outputs
label: A string representing the classification label of the text.
score: Floating-point confidence score from 0.0 to 1.0.
Example:
--------------------------------
|"label" |"score" |
--------------------------------
|LABEL_0 |0.9760091304779053 |
--------------------------------
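Text classification (multiple outputs)
A pipeline whose task is "text-classification", where top_k is set to a number, has the following inputs and outputs.
Note
A text classification task is considered multiple-output if top_k is set to any number, even if that number is 1. To get a single output, set top_k to None.
Inputs
text: A string to classify.
text_pair: A string to classify along with text, used with models that compute text similarity. Leave empty if the model does not use it.
Example: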
--------------------------------------------------------------------
|"text" |"text_pair" |
--------------------------------------------------------------------
|I am wondering if I should have udon or rice fo... | |
--------------------------------------------------------------------
Outputs
outputs: A string containing a JSON representation of a list of results, each of which contains fields that include label and score.
Example:
--------------------------------------------------------
|"outputs" |
--------------------------------------------------------
|[{"label": "NEGATIVE", "score": 0.9987024068832397}] |
--------------------------------------------------------
Text generation
A pipeline whose task is "text-generation" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TextGenerationPipeline), where return_tensors is False or unset, has the following inputs and outputs.
Note
Text generation pipelines where return_tensors is True are not supported.
Inputs
inputs: A string containing the prompt.
Example:
--------------------------------------------------------------------------------
|"inputs" |
--------------------------------------------------------------------------------
|A descendant of the Lost City of Atlantis, who swam to Earth while saying, " |
--------------------------------------------------------------------------------
Outputs
outputs: A string containing a JSON representation of a list of result objects, each of which contains fields that include generated_text.
Example:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|"outputs" |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|[{"generated_text": "A descendant of the Lost City of Atlantis, who swam to Earth while saying, \"For my life, I don't know if I'm gonna land upon Earth.\"\n\nIn \"The Misfits\", in a flashback, wh... |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
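In the example row, `generated_text` begins with the prompt itself. If only the continuation is wanted, the prompt can be removed after parsing, as in the sketch below. It uses a shortened stand-in for the `outputs` cell because the example row is truncated, and it assumes the prompt is echoed back verbatim, which may not hold for every pipeline configuration.

```python
import json

prompt = 'A descendant of the Lost City of Atlantis, who swam to Earth while saying, "'
# Shortened, hypothetical stand-in for an outputs cell.
outputs = json.dumps([{"generated_text": prompt + "For my life, I don't know."}])

results = json.loads(outputs)
continuations = []
for result in results:
    text = result["generated_text"]
    # Strip the prompt only when the text really starts with it.
    continuations.append(text[len(prompt):] if text.startswith(prompt) else text)
print(continuations)
```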
Text-to-text generation
A pipeline whose task is "text2text-generation" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.Text2TextGenerationPipeline), where return_tensors is False or unset, has the following inputs and outputs.
Note
Text-to-text generation pipelines where return_tensors is True are not supported.
Inputs
inputs: A string containing the prompt.
Example:
--------------------------------------------------------------------------------
|"inputs" |
--------------------------------------------------------------------------------
|A descendant of the Lost City of Atlantis, who swam to Earth while saying, " |
--------------------------------------------------------------------------------
输出¶
generated_text:如果
num_return_sequences
为 1,则为包含生成文本的字符串;如果 num_return_sequences 大于 1,则为以 JSON 格式表示字典结果列表的字符串,字典包含generated_text
在内的字段。
示例:
----------------------------------------------------------------
|"generated_text" |
----------------------------------------------------------------
|, said that he was a descendant of the Lost City of Atlantis |
----------------------------------------------------------------
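Translation generation
A pipeline whose task is "translation" (https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TranslationPipeline), where return_tensors is False or unset, has the following inputs and outputs.
Note
Translation generation pipelines where return_tensors is True are not supported.
Inputs
inputs: A string containing the text to translate.
Example: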
------------------------------------------------------------------------------------------------------
|"inputs" |
------------------------------------------------------------------------------------------------------
|Snowflake's Data Cloud is powered by an advanced data platform provided as a self-managed service. |
------------------------------------------------------------------------------------------------------
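Outputs
translation_text: A string representing the generated translation if num_return_sequences is 1; otherwise, a string containing a JSON representation of a list of result dictionaries, each containing fields that include translation_text.
Example: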
---------------------------------------------------------------------------------------------------------------------------------
|"translation_text" |
---------------------------------------------------------------------------------------------------------------------------------
|Le Cloud de données de Snowflake est alimenté par une plate-forme de données avancée fournie sous forme de service autogérés. |
---------------------------------------------------------------------------------------------------------------------------------
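Zero-shot classification
A pipeline whose task is "zero-shot-classification" has the following inputs and outputs.
Inputs
sequences: A string containing the text to classify.
candidate_labels: A list of strings containing the labels to apply to the text.
Example: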
-----------------------------------------------------------------------------------------
|"sequences" |"candidate_labels" |
-----------------------------------------------------------------------------------------
|I have a problem with Snowflake that needs to be resolved asap!! |[ |
| | "urgent", |
| | "not urgent" |
| |] |
|I have a problem with Snowflake that needs to be resolved asap!! |[ |
| | "English", |
| | "Japanese" |
| |] |
-----------------------------------------------------------------------------------------
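Outputs
sequence: The input string.
labels: A list of strings representing the labels that were applied.
scores: A list of floating-point confidence scores, one for each label.
Example: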
--------------------------------------------------------------------------------------------------------------
|"sequence" |"labels" |"scores" |
--------------------------------------------------------------------------------------------------------------
|I have a problem with Snowflake that needs to be resolved asap!! |[ |[ |
| | "urgent", | 0.9952737092971802, |
| | "not urgent" | 0.004726255778223276 |
| |] |] |
|I have a problem with Snowflake that needs to be resolved asap!! |[ |[ |
| | "Japanese", | 0.5790848135948181, |
| | "English" | 0.42091524600982666 |
| |] |] |
--------------------------------------------------------------------------------------------------------------
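The `labels` and `scores` lists are parallel: the score at a given position belongs to the label at the same position. In the second example row the output order (Japanese, English) differs from the input order, so pairing by position in the output is safer than relying on the order of `candidate_labels`. A minimal sketch follows. The helper `top_label` is hypothetical.

```python
def top_label(labels, scores):
    """Return the (label, score) pair with the highest score (hypothetical helper)."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    return max(zip(labels, scores), key=lambda pair: pair[1])


# Values copied from the two example output rows above.
first = top_label(["urgent", "not urgent"], [0.9952737092971802, 0.004726255778223276])
second = top_label(["Japanese", "English"], [0.5790848135948181, 0.42091524600982666])
print(first[0], second[0])
```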