1-Transformer_models-2-Transformers_what_can_they_do

Original course link: https://huggingface.co/course/chapter1/3?fw=pt

Transformers, what can they do?

In this section, we will look at what Transformer models can do and use our first tool from the 🤗 Transformers library: the pipeline() function.

👀 See that Open in Colab button on the top right? Click on it to open a Google Colab notebook with all the code samples of this section. This button will be present in any section containing code examples.
If you want to run the examples locally, we recommend taking a look at the setup instructions.

Transformers are everywhere!

Transformer models are used to solve all kinds of NLP tasks, like the ones mentioned in the previous section. Here are some of the companies and organizations using Hugging Face and Transformer models, who also contribute back to the community by sharing their models:

[Image: Companies using Hugging Face]
The 🤗 Transformers library provides the functionality to create and use those shared models. The Model Hub contains thousands of pretrained models that anyone can download and use. You can also upload your own models to the Hub!

⚠️ The Hugging Face Hub is not limited to Transformer models. Anyone can share any kind of models or datasets they want! Create a huggingface.co account to benefit from all available features!

Before diving into how Transformer models work under the hood, let’s look at a few examples of how they can be used to solve some interesting NLP problems.

Working with pipelines

The most basic object in the 🤗 Transformers library is the pipeline() function. It connects a model with its necessary preprocessing and postprocessing steps, allowing us to directly input any text and get an intelligible answer:

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")

[{'label': 'POSITIVE', 'score': 0.9598047137260437}]

We can even pass several sentences!

classifier(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"]
)

[{'label': 'POSITIVE', 'score': 0.9598047137260437},
 {'label': 'NEGATIVE', 'score': 0.9994558095932007}]

By default, this pipeline selects a particular pretrained model that has been fine-tuned for sentiment analysis in English. The model is downloaded and cached when you create the classifier object. If you rerun the command, the cached model will be used instead and there is no need to download the model again.

There are three main steps involved when you pass some text to a pipeline:

  1. The text is preprocessed into a format the model can understand.
  2. The preprocessed inputs are passed to the model.
  3. The predictions of the model are post-processed, so you can make sense of them.
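The three steps above can be sketched with a toy classifier written in plain Python. Everything here is made up for illustration: the word weights in `TOY_WEIGHTS` and the helper names are assumptions, and a real pipeline uses a tokenizer and a neural network instead.

```python
import math

# Hypothetical toy vocabulary: word -> (negative weight, positive weight).
# A real model learns its parameters from data; these numbers are invented.
TOY_WEIGHTS = {
    "love": (-2.0, 2.0),
    "great": (-1.5, 1.5),
    "hate": (2.5, -2.5),
    "awful": (2.0, -2.0),
}
LABELS = ["NEGATIVE", "POSITIVE"]


def preprocess(text):
    """Step 1: turn raw text into tokens the toy model understands."""
    return [word.strip(".,!?").lower() for word in text.split()]


def model(tokens):
    """Step 2: produce one raw score (logit) per label."""
    logits = [0.0, 0.0]
    for token in tokens:
        neg, pos = TOY_WEIGHTS.get(token, (0.0, 0.0))
        logits[0] += neg
        logits[1] += pos
    return logits


def postprocess(logits):
    """Step 3: convert logits to probabilities (softmax) and keep the best label."""
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": LABELS[best], "score": probs[best]}


def toy_pipeline(text):
    """Chain the three steps, as a pipeline does."""
    return postprocess(model(preprocess(text)))


print(toy_pipeline("I love this course!"))
print(toy_pipeline("I hate this so much!"))
```

The returned dictionaries have the same `label` and `score` keys as the sentiment-analysis output shown earlier, which is the part of the design worth noticing: the caller never sees tokens or logits.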

Some of the currently available pipelines are:

  • feature-extraction (get the vector representation of a text)
  • fill-mask
  • ner (named entity recognition)
  • question-answering
  • sentiment-analysis
  • summarization
  • text-generation
  • translation
  • zero-shot-classification

Let’s have a look at a few of these!

Zero-shot classification

We’ll start by tackling a more challenging task where we need to classify texts that haven’t been labelled. This is a common scenario in real-world projects because annotating text is usually time-consuming and requires domain expertise. For this use case, the zero-shot-classification pipeline is very powerful: it allows you to specify which labels to use for the classification, so you don’t have to rely on the labels of the pretrained model. You’ve already seen how the model can classify a sentence as positive or negative using those two labels — but it can also classify the text using any other set of labels you like.

from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "business"],
)

{'sequence': 'This is a course about the Transformers library',
 'labels': ['education', 'business', 'politics'],
 'scores': [0.8445963859558105, 0.111976258456707, 0.043427448719739914]}

This pipeline is called zero-shot because you don’t need to fine-tune the model on your data to use it. It can directly return probability scores for any list of labels you want!
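As a small sketch of how the returned dictionary might be consumed, the snippet below works on the output copied from the example above; it does not call the model. The variable names are illustrative.

```python
# Output copied from the zero-shot example above.
result = {
    "sequence": "This is a course about the Transformers library",
    "labels": ["education", "business", "politics"],
    "scores": [0.8445963859558105, 0.111976258456707, 0.043427448719739914],
}

# Labels and scores are aligned by position, so zip pairs each label with its score.
label_scores = dict(zip(result["labels"], result["scores"]))

# In this output the labels are sorted by descending score,
# so the first label is the model's best guess.
best_label = result["labels"][0]

# In this output the scores over the candidate labels sum to about 1.
total = sum(result["scores"])

print(best_label)
print(label_scores["politics"])
print(round(total, 4))
```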

✏️ Try it out! Play around with your own sequences and labels and see how the model behaves.

Text generation

Now let’s see how to use a pipeline to generate some text. The main idea here is that you provide a prompt and the model will auto-complete it by generating the remaining text. This is similar to the predictive text feature that is found on many phones. Text generation involves randomness, so it’s normal if you don’t get the same results as shown below.

from transformers import pipeline

generator = pipeline("text-generation")
generator("In this course, we will teach you how to")

[{'generated_text': 'In this course, we will teach you how to understand and use '
                    'data flow and data interchange when handling user data. We '
                    'will be working with one or more of the most commonly used '
                    'data flows — data flows of various types, as seen by the '
                    'HTTP'}]

You can control how many different sequences are generated with the argument num_return_sequences and the total length of the output text with the argument max_length.
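To make the roles of these two arguments and of randomness concrete, here is a toy sampler in plain Python. It is a sketch under loud assumptions: `NEXT_WORDS` is an invented lookup table, `toy_generate` is a hypothetical helper, and length is counted in whitespace-separated words (prompt included), whereas the library counts tokens.

```python
import random

# Hypothetical next-word table; a real language model instead predicts
# a probability for every token in its vocabulary.
NEXT_WORDS = {
    "to": ["build", "train", "use"],
    "build": ["models", "pipelines"],
    "train": ["models", "tokenizers"],
    "use": ["pipelines", "models"],
    "models": ["and", "quickly"],
    "pipelines": ["and", "quickly"],
    "tokenizers": ["and", "quickly"],
    "and": ["build", "train", "use"],
    "quickly": ["and"],
}


def toy_generate(prompt, max_length=12, num_return_sequences=2, seed=None):
    """Sample continuations word by word until max_length words are reached."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(num_return_sequences):
        words = prompt.split()
        while len(words) < max_length:
            candidates = NEXT_WORDS.get(words[-1])
            if not candidates:
                break  # no known continuation for this word
            words.append(rng.choice(candidates))
        outputs.append({"generated_text": " ".join(words)})
    return outputs


for item in toy_generate("In this course, we will teach you how to", max_length=14):
    print(item["generated_text"])
```

Because each next word is drawn at random, two runs without a fixed `seed` will usually print different continuations, which mirrors why your results can differ from the ones shown in this section.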

✏️ Try it out! Use the num_return_sequences and max_length arguments to generate two sentences of 15 words each.

Using any model from the Hub in a pipeline

The previous examples used the default model for the task at hand, but you can also choose a particular model from the Hub to use in a pipeline for a specific task — say, text generation. Go to the Model Hub and click on the corresponding tag on the left to display only the supported models for that task. You should get to a page like this one.

Let’s try the distilgpt2 model! Here’s how to load it in the same pipeline as before:

from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)

[{'generated_text': 'In this course, we will teach you how to manipulate the world and '
                    'move your mental and physical capabilities to your advantage.'},
 {'generated_text': 'In this course, we will teach you how to become an expert and '
                    'practice realtime, and with a hands on experience on both real '
                    'time and real'}]

You can refine your search for a model by clicking on the language tags, and pick a model that will generate text in another language. The Model Hub even contains checkpoints for multilingual models that support several languages.

Once you select a model by clicking on it, you’ll see that there is a widget enabling you to try it directly online. This way you can quickly test the model’s capabilities before downloading it.

✏️ Try it out! Use the filters to find a text generation model for another language. Feel free to play with the widget and use it in a pipeline!

The Inference API

All the models can be tested directly through your browser using the Inference API, which is available on the Hugging Face website. You can play with the model directly on this page by inputting custom text and watching the model process the input data.

The Inference API that powers the widget is also available as a paid product, which comes in handy if you need it for your workflows. See the pricing page for more details.

Mask filling

The next pipeline you’ll try is fill-mask. The idea of this task is to fill in the blanks in a given text:

from transformers import pipeline

unmasker = pipeline("fill-mask")
unmasker("This course will teach you all about <mask> models.", top_k=2)

[{'sequence': 'This course will teach you all about mathematical models.',
  'score': 0.19619831442832947,
  'token': 30412,
  'token_str': ' mathematical'},
 {'sequence': 'This course will teach you all about computational models.',
  'score': 0.04052725434303284,
  'token': 38163,
  'token_str': ' computational'}]

The top_k argument controls how many possibilities you want to be displayed. Note that here the model fills in the special <mask> word, which is often referred to as a mask token. Other mask-filling models might have different mask tokens, so it’s always good to verify the proper mask word when exploring other models. One way to check it is by looking at the mask word used in the widget.
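The two ideas in this paragraph, a model-specific mask token and a `top_k` cutoff, can be sketched without loading a model. In the snippet below, `build_input` and `top_k_fills` are hypothetical helpers, the first two candidate scores are rounded from the output above, and the last two are invented for illustration.

```python
# Mask tokens differ between model families. RoBERTa-style checkpoints
# use "<mask>", while BERT-style checkpoints use "[MASK]".
MASK_TOKENS = {"roberta-style": "<mask>", "bert-style": "[MASK]"}

TEMPLATE = "This course will teach you all about {mask} models."


def build_input(template, mask_token):
    """Insert the model-specific mask token into a template sentence."""
    return template.format(mask=mask_token)


def top_k_fills(template, candidate_scores, top_k=2):
    """Keep the top_k highest-scoring words and fill each into the sentence."""
    ranked = sorted(candidate_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [
        {"sequence": template.format(mask=word), "score": score, "token_str": word}
        for word, score in ranked[:top_k]
    ]


# First two scores are rounded from the output above; the last two are invented.
scores = {"mathematical": 0.196, "computational": 0.041, "predictive": 0.030, "toy": 0.001}

print(build_input(TEMPLATE, MASK_TOKENS["bert-style"]))
for fill in top_k_fills(TEMPLATE, scores, top_k=2):
    print(fill["sequence"])
```

With a real pipeline, the tokenizer's `mask_token` attribute (for example `unmasker.tokenizer.mask_token`) is another way to check which mask word a model expects.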

✏️ Try it out! Search for the bert-base-cased model on the Hub and identify its mask word in the Inference API widget. What does this model predict for the sentence in our pipeline example above?

Named entity recognition

Named entity recognition (NER) is a task where the model has to find which parts of the input text correspond to entities such as persons, locations, or organizations. Let’s look at an example:

from transformers import pipeline

ner = pipeline("ner", grouped_entities=True)
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")

[{'entity_group': 'PER', 'score': 0.99816, 'word': 'Sylvain', 'start': 11, 'end': 18},
 {'entity_group': 'ORG', 'score': 0.97960, 'word': 'Hugging Face', 'start': 33, 'end': 45},
 {'entity_group': 'LOC', 'score': 0.99321, 'word': 'Brooklyn', 'start': 49, 'end': 57}
]

Here the model correctly identified that Sylvain is a person (PER), Hugging Face an organization (ORG), and Brooklyn a location (LOC).
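The `start` and `end` values in that output are character offsets into the input sentence. A minimal sketch, working only on the output copied from above, shows how they might be used to collect the entities by type:

```python
text = "My name is Sylvain and I work at Hugging Face in Brooklyn."

# Output copied from the NER example above.
entities = [
    {"entity_group": "PER", "score": 0.99816, "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "score": 0.97960, "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "score": 0.99321, "word": "Brooklyn", "start": 49, "end": 57},
]

# Slicing the input with start/end recovers each entity's exact span.
by_type = {}
for entity in entities:
    span = text[entity["start"]:entity["end"]]
    by_type.setdefault(entity["entity_group"], []).append(span)

print(by_type)
# {'PER': ['Sylvain'], 'ORG': ['Hugging Face'], 'LOC': ['Brooklyn']}
```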

We pass the option grouped_entities=True in the pipeline creation function to tell the pipeline to regroup together the parts of the sentence that correspond to the same entity: here the model correctly grouped “Hugging” and “Face” as a single organization, even though the name consists of multiple words. In fact, as we will see in the next chapter, the preprocessing even splits some words into smaller parts. For instance, Sylvain is split into four pieces: S, ##yl, ##va, and ##in. In the post-processing step, the pipeline successfully regrouped those pieces.
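The regrouping of word pieces can be sketched in a few lines. This is a simplified illustration, not the pipeline's actual code: `merge_word_pieces` is a hypothetical helper that handles only the `##` continuation convention, and the `Hu`/`##gging` split in the second call is invented for the example.

```python
def merge_word_pieces(pieces):
    """Join WordPiece-style tokens: a piece starting with '##' continues the previous word."""
    words = []
    for piece in pieces:
        if piece.startswith("##") and words:
            words[-1] += piece[2:]  # drop the '##' marker and glue to the previous piece
        else:
            words.append(piece)  # a piece without '##' starts a new word
    return words


print(merge_word_pieces(["S", "##yl", "##va", "##in"]))  # ['Sylvain']
print(merge_word_pieces(["Hu", "##gging", "Face"]))      # ['Hugging', 'Face']
```

Grouping whole words that belong to the same entity, such as "Hugging" and "Face", is a separate step that relies on the predicted entity labels.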

✏️ Try it out! Search the Model Hub for a model able to do part-of-speech tagging (usually abbreviated as POS) in English. What does this model predict for the sentence in the example above?

Question answering

The question-answering pipeline answers questions using information from a given context:

from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)

{'score': 0.6385916471481323, 'start': 33, 'end': 45, 'answer': 'Hugging Face'}

Note that this pipeline works by extracting information from the provided context; it does not generate the answer.
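A short sketch may make "extracting" concrete. The first part slices the context with the `start` and `end` offsets from the output above. The second part is a simplified span search over invented per-word scores; a real model scores tokens rather than words, and `best_span` is a hypothetical helper.

```python
context = "My name is Sylvain and I work at Hugging Face in Brooklyn"

# Part 1: the answer above is literally a slice of the context.
answer = context[33:45]
print(answer)  # Hugging Face

# Part 2: a simplified span search. The scores below are invented;
# a real model produces a start score and an end score for every token.
words = context.split()
start_scores = [0.0] * len(words)
end_scores = [0.0] * len(words)
start_scores[words.index("Hugging")] = 4.0
start_scores[words.index("Brooklyn")] = 1.0
end_scores[words.index("Face")] = 3.5
end_scores[words.index("Brooklyn")] = 1.5


def best_span(start_scores, end_scores, max_words=5):
    """Return (start, end) maximizing start score + end score, with start <= end."""
    best, best_score = (0, 0), float("-inf")
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_words, len(end_scores))):
            if s + end_scores[j] > best_score:
                best, best_score = (i, j), s + end_scores[j]
    return best


i, j = best_span(start_scores, end_scores)
print(" ".join(words[i:j + 1]))  # Hugging Face
```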

Summarization

Summarization is the task of reducing a text into a shorter text while keeping all (or most) of the important aspects referenced in the text. Here’s an example:

from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    """
    America has changed dramatically during recent years. Not only has the number of
    graduates in traditional engineering disciplines such as mechanical, civil,
    electrical, chemical, and aeronautical engineering declined, but in most of
    the premier American universities engineering curricula now concentrate on
    and encourage largely the study of engineering science. As a result, there
    are declining offerings in engineering subjects dealing with infrastructure,
    the environment, and related issues, and greater concentration on high
    technology subjects, largely supporting increasingly complex scientific
    developments. While the latter is important, it should not be at the expense
    of more traditional engineering.

    Rapidly developing economies such as China and India, as well as other
    industrial countries in Europe and Asia, continue to encourage and advance
    the teaching of engineering. Both China and India, respectively, graduate
    six and eight times as many traditional engineers as does the United States.
    Other industrial countries at minimum maintain their output, while America
    suffers an increasingly serious decline in the number of engineering graduates
    and a lack of well-educated engineers.
"""
)

[{'summary_text': ' America has changed dramatically during recent years . The '
                  'number of engineering graduates in the U.S. has declined in '
                  'traditional engineering disciplines such as mechanical, civil '
                  ', electrical, chemical, and aeronautical engineering . Rapidly '
                  'developing economies such as China and India, as well as other '
                  'industrial countries in Europe and Asia, continue to encourage '
                  'and advance engineering .'}]

As with text generation, you can specify a max_length or a min_length for the result.

Translation

For translation, you can use a default model if you provide a language pair in the task name (such as "translation_en_to_fr"), but the easiest way is to pick the model you want to use on the Model Hub. Here we’ll try translating from French to English:

from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")

[{'translation_text': 'This course is produced by Hugging Face.'}]

As with text generation and summarization, you can specify a max_length or a min_length for the result.

✏️ Try it out! Search for translation models in other languages and try to translate the previous sentence into a few different languages.
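As a starting point for this exercise, the two naming patterns seen in this section can be written as small helpers. Both functions are hypothetical conveniences, and the second one only extrapolates from the single checkpoint name used above: it is an assumption that a given language pair exists under that name, so confirm it on the Model Hub first.

```python
def translation_task(src, tgt):
    """Task name with an embedded language pair, e.g. 'translation_en_to_fr'."""
    return f"translation_{src}_to_{tgt}"


def opus_mt_model_id(src, tgt):
    """Guess a Helsinki-NLP checkpoint name from the pattern used in the example above.

    Assumption: not every language pair has a checkpoint under this name,
    so check that the model exists on the Hub before using it.
    """
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"


print(translation_task("en", "fr"))  # translation_en_to_fr
print(opus_mt_model_id("fr", "en"))  # Helsinki-NLP/opus-mt-fr-en

# Candidate names to look up on the Hub for the French sentence above.
for tgt in ["de", "es", "it"]:
    print(opus_mt_model_id("fr", tgt))
```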

The pipelines shown so far are mostly for demonstrative purposes. They were programmed for specific tasks and cannot perform variations of them. In the next chapter, you’ll learn what’s inside a pipeline() function and how to customize its behavior.
