GPT2 For Text Classification using Hugging Face Transformers


This notebook is used to fine-tune the GPT2 model for text classification with the Hugging Face transformers library on a custom dataset.

Hugging Face is very nice to us and includes all the functionality needed for GPT2 to be used in classification tasks. Thank you Hugging Face!

I wasn't able to find much information on how to use GPT2 for classification, so I decided to make this tutorial using a similar structure to other transformer models.

If this in-depth educational content is useful for you, subscribe to our AI research mailing list to be alerted when we release new material.

Main idea: Since GPT2 is a decoder transformer, the last token of the input sequence is used to make predictions about the next token that should follow the input. This means that the last token of the input sequence contains all the information needed for the prediction. With this in mind, we can use that information to make a prediction in a classification task instead of a generation task.

In other words, instead of using the first token embedding to make the prediction, as we do in BERT, we will use the last token embedding to make the prediction with GPT2.

Since we only cared about the first token in BERT, we were padding to the right. Now in GPT2 we are using the last token for prediction, so we need to pad on the left. Thanks to a nice upgrade to Hugging Face Transformers, we can configure the GPT2 tokenizer to do just that.
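To make the left padding idea concrete, here is a minimal standalone sketch (just for illustration, not one of the notebook cells below) that loads a plain GPT2 tokenizer, reuses the end-of-sequence token as the padding token and pads on the left, so that the last position of every sequence in a batch always holds a real token:

from transformers import GPT2Tokenizer

# Load a standard GPT2 tokenizer just for this illustration.
demo_tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
# GPT2 has no padding token, so reuse the end-of-sequence token.
demo_tokenizer.pad_token = demo_tokenizer.eos_token
# Pad on the left so the final position always contains a real token.
demo_tokenizer.padding_side = "left"

batch = demo_tokenizer(["a short review", "a slightly longer movie review"],
                       padding=True, return_tensors="pt")
# The shorter sequence gets pad tokens (id 50256) at the start, so the
# last token of both sequences is a real word-piece token.
print(batch['input_ids'])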

What should I know for this notebook?

Since I am using PyTorch to fine-tune our transformers models, any knowledge of PyTorch is very useful.

Knowing a little bit about the transformers library helps too.

How to use this notebook?

Like with every project, I built this notebook with reusability in mind.

All changes will happen in the data processing part, where you need to customize the PyTorch Dataset, Data Collator and DataLoader to fit your own data needs.

All parameters that can be changed are under the Imports section. Each parameter is nicely commented and structured to be as intuitive as possible.

Dataset

This notebook will cover fine-tuning a pretrained transformer on a custom dataset. I will use the well-known movie reviews positive-negative labeled Large Movie Review Dataset.

The description provided on the Stanford website:

This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well. Raw text and already processed bag of words formats are provided. See the README file contained in the release for more details.

Why this dataset? I believe it is an easy to understand and use dataset for classification. I think sentiment data is always fun to work with.

Coding

Now let's do some coding! We will go through each coding cell in the notebook and describe what it does, show the code and, when relevant, the output.

I made this format easy to follow if you decide to run each code cell in your own python notebook.

When I learn from a tutorial I always try to replicate the results. I believe it's easy to follow along if you have the code next to the explanations.

Data Download

Download the Large Movie Review Dataset and unzip it locally.

# Download the dataset.
!wget -q -nc http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
# Unzip the dataset.
!tar -zxf /content/aclImdb_v1.tar.gz

Installs

  • transformers library needs to be installed to use all the awesome code from Hugging Face. To get the latest version I will install it straight from GitHub.
  • ml_things library used for various machine learning related tasks. I created this library to reduce the amount of code I need to write for each machine learning project.
# Install transformers library.
!pip install -q git+https://github.com/huggingface/transformers.git
# Install helper functions.
!pip install -q git+https://github.com/gmihaila/ml_things.git
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
|████████████████████████████████| 2.9MB 6.7MB/s
|████████████████████████████████| 890kB 48.9MB/s
|████████████████████████████████| 1.1MB 49.0MB/s
Building wheel for transformers (PEP 517) ... done
Building wheel for sacremoses (setup.py) ... done
|████████████████████████████████| 71kB 5.2MB/s
Building wheel for ml-things (setup.py) ... done
Building wheel for ftfy (setup.py) ... done

Imports

Import all needed libraries for this notebook. Declare the parameters used for this notebook:

  • set_seed(123) – Always good to set a fixed seed for reproducibility.
  • epochs – Number of training epochs (authors recommend between 2 and 4).
  • batch_size – Number of batches – depending on the max sequence length and GPU memory. For 512 sequence length a batch of 10 usually works without cuda memory issues. For a small sequence length you can try a batch of 32 or higher.
  • max_length – Pad or truncate text sequences to a specific length. I will set it to 60 to speed up training.
  • device – Look for gpu to use. Will use cpu by default if no gpu is found.
  • model_name_or_path – Name of the transformers model – will use an already pretrained model. Path of a transformer model – will load your own model from local disk. In this tutorial I will use the gpt2 model.
  • labels_ids – Dictionary of labels and their ids – this will be used to convert string labels to numbers.
  • n_labels – How many labels we are using in this dataset. This is used to decide the size of the classification head.
import io
import os
import torch
from tqdm.notebook import tqdm
from torch.utils.data import Dataset, DataLoader
from ml_things import plot_dict, plot_confusion_matrix, fix_text
from sklearn.metrics import classification_report, accuracy_score
from transformers import (set_seed,
                          TrainingArguments,
                          Trainer,
                          GPT2Config,
                          GPT2Tokenizer,
                          AdamW,
                          get_linear_schedule_with_warmup,
                          GPT2ForSequenceClassification)

# Set seed for reproducibility.
set_seed(123)

# Number of training epochs (authors on fine-tuning Bert recommend between 2 and 4).
epochs = 4

# Number of batches - depending on the max sequence length and GPU memory.
# For 512 sequence length batch of 10 works without cuda memory issues.
# For small sequence length can try batch of 32 or higher.
batch_size = 32

# Pad or truncate text sequences to a specific length
# if `None` it will use maximum sequence of word piece tokens allowed by model.
max_length = 60

# Look for gpu to use. Will use `cpu` by default if no gpu found.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Name of transformers model - will use already pretrained model.
# Path of transformer model - will load your own model from local disk.
model_name_or_path = 'gpt2'

# Dictionary of labels and their id - this will be used to convert
# string labels to number ids.
labels_ids = {'neg': 0, 'pos': 1}

# How many labels are we using in training.
# This is used to decide size of classification head.
n_labels = len(labels_ids)

Helper Functions

I like to keep all classes and functions that will be used in this notebook under this section to help maintain a clean look of the notebook:

MovieReviewsDataset(Dataset)

If you have worked with PyTorch before, this is pretty standard. We need this class to read in our dataset, parse it and return texts with their associated labels.

In this class I only need to read in the content of each file, use fix_text to fix any Unicode problems and keep track of positive and negative sentiments.

I will append all texts and labels to lists.

There are three main parts of this PyTorch Dataset class:

  • __init__() where we read in the dataset and store the texts and labels.
  • __len__() where we need to return the number of examples we read in. This is used when calling len(MovieReviewsDataset()).
  • __getitem__() always takes as an input an int value that represents which example from our dataset to return. If a value of 3 is passed, we will return the example at position 3 from our dataset.
class MovieReviewsDataset(Dataset):
    r"""PyTorch Dataset class for loading data.

    This is where the data parsing happens.

    This class is built with reusability in mind: it can be used as is.

    Arguments:

      path (:obj:`str`):
          Path to the data partition.

    """

    def __init__(self, path, use_tokenizer):

        # Check if path exists.
        if not os.path.isdir(path):
            # Raise error if path is invalid.
            raise ValueError('Invalid `path` variable! Needs to be a directory')
        self.texts = []
        self.labels = []
        # Since the labels are defined by folders with data we loop
        # through each label.
        for label in ['pos', 'neg']:
            sentiment_path = os.path.join(path, label)

            # Get all files from path.
            files_names = os.listdir(sentiment_path)#[:10] # Sample for debugging.
            # Go through each file and read its content.
            for file_name in tqdm(files_names, desc=f'{label} files'):
                file_path = os.path.join(sentiment_path, file_name)

                # Read content.
                content = io.open(file_path, mode='r', encoding='utf-8').read()
                # Fix any unicode issues.
                content = fix_text(content)
                # Save content.
                self.texts.append(content)
                # Save the string label (encoded to a number later by the collator).
                self.labels.append(label)

        # Number of examples.
        self.n_examples = len(self.labels)

        return

    def __len__(self):
        r"""When used `len` return the number of examples."""

        return self.n_examples

    def __getitem__(self, item):
        r"""Given an index return an example from the position.

        Arguments:

          item (:obj:`int`):
              Index position to pick an example to return.

        Returns:
          :obj:`Dict[str, str]`: Dictionary of inputs that contains the text and
          associated label.

        """

        return {'text': self.texts[item],
                'label': self.labels[item]}
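As a quick sanity check, here is a small illustrative snippet (not one of the notebook cells) which assumes the dataset was already unpacked to /content/aclImdb as in the download step; indexing the dataset returns a plain dictionary with the raw review text and its string label:

# Build the dataset on the training split; `use_tokenizer` is not used by
# this class, so passing `None` is fine for this check.
sample_dataset = MovieReviewsDataset(path='/content/aclImdb/train', use_tokenizer=None)
print(len(sample_dataset))                # 25000 examples.
print(sample_dataset[3]['label'])         # 'pos' or 'neg'.
print(sample_dataset[3]['text'][:100])    # First 100 characters of the review.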

Gpt2ClassificationCollator

I use this class to create the Data Collator. This will be used in the DataLoader to create the batches of data that get fed to the model. I use the tokenizer and label encoder on each sequence to convert texts and labels to numbers.

Lucky for us, Hugging Face thought of everything and made the tokenizer do all the heavy lifting (split text into tokens, padding, truncating, encoding text into numbers), and it is very easy to use!

There are two main parts of this Data Collator class:

  • __init__() where we initialize the tokenizer we plan to use, how to encode our labels, and if we need to set the sequence length to a different value.
  • __call__() used as the function collator that takes a batch of data examples as input. It needs to return an object in a format that can be fed to our model. Luckily our tokenizer does that for us and returns a dictionary of variables ready to be fed to the model in this way: model(**inputs). Since we are fine-tuning the model I also included the labels.
class Gpt2ClassificationCollator(object):
    r"""
    Data Collator used for GPT2 in a classification task.

    It uses a given tokenizer and label encoder to convert any text and labels
    to numbers that can go straight into a GPT2 model.

    This class is built with reusability in mind: it can be used as is as long
    as the `dataloader` outputs a batch in dictionary format that can be passed
    straight into the model - `model(**batch)`.

    Arguments:

      use_tokenizer (:obj:`transformers.tokenization_?`):
          Transformer type tokenizer used to process raw text into numbers.

      labels_encoder (:obj:`dict`):
          Dictionary to encode any label names into numbers. Keys map to
          label names and values map to the number associated to those labels.

      max_sequence_len (:obj:`int`, `optional`):
          Value to indicate the maximum desired sequence length to truncate or
          pad text sequences. If no value is passed it will use the maximum
          sequence size supported by the tokenizer and model.

    """

    def __init__(self, use_tokenizer, labels_encoder, max_sequence_len=None):

        # Tokenizer to be used inside the class.
        self.use_tokenizer = use_tokenizer
        # Check max sequence length.
        self.max_sequence_len = use_tokenizer.model_max_length if max_sequence_len is None else max_sequence_len
        # Label encoder used inside the class.
        self.labels_encoder = labels_encoder

        return

    def __call__(self, sequences):
        r"""
        This function allows the class object to be used as a function call.
        Since the PyTorch DataLoader needs a collator function, I can use this
        class as a function.

        Arguments:

          sequences (:obj:`list`):
              List of texts and labels.

        Returns:
          :obj:`Dict[str, object]`: Dictionary of inputs that feed into the
          model. It satisfies the statement `model(**Returned Dictionary)`.

        """

        # Get all texts from sequences list.
        texts = [sequence['text'] for sequence in sequences]
        # Get all labels from sequences list.
        labels = [sequence['label'] for sequence in sequences]
        # Encode all labels using label encoder.
        labels = [self.labels_encoder[label] for label in labels]
        # Call tokenizer on all texts to convert into tensors of numbers with
        # appropriate padding.
        inputs = self.use_tokenizer(text=texts, return_tensors="pt", padding=True, truncation=True, max_length=self.max_sequence_len)
        # Update the inputs with the associated encoded labels as a tensor.
        inputs.update({'labels': torch.tensor(labels)})

        return inputs
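To see what the collator produces, here is a small illustrative example (again, not one of the notebook cells); it loads and configures a GPT2 tokenizer ahead of time purely for the demonstration, using the left padding setup described earlier:

from transformers import GPT2Tokenizer

# Load and configure a tokenizer just for this demonstration.
demo_tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
demo_tokenizer.padding_side = "left"
demo_tokenizer.pad_token = demo_tokenizer.eos_token

demo_collator = Gpt2ClassificationCollator(use_tokenizer=demo_tokenizer,
                                           labels_encoder={'neg': 0, 'pos': 1},
                                           max_sequence_len=60)

# The collator receives a list of examples exactly as MovieReviewsDataset
# returns them and outputs a dictionary ready for `model(**batch)`.
demo_batch = demo_collator([{'text': 'Great movie!', 'label': 'pos'},
                            {'text': 'Terrible acting and a boring plot.', 'label': 'neg'}])
print(demo_batch.keys())               # input_ids, attention_mask, labels
print(demo_batch['input_ids'].shape)   # (2, longest sequence in the batch)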

train(dataloader, optimizer_, scheduler_, device_)

I created this function to perform a full pass through the DataLoader object (the DataLoader object is created from our Dataset-type object using the MovieReviewsDataset class). This is basically one epoch through the entire dataset.

The dataloader is created from the PyTorch DataLoader, which takes the object created with the MovieReviewsDataset class and puts each example in batches. This way we can feed our model batches of data!

The optimizer_ and scheduler_ are very common in PyTorch. They are required to update the parameters of our model and to update our learning rate during training. There is a lot more to it, but I won't go into details. This can actually be a huge rabbit hole, since A LOT happens behind these functions that we don't need to worry about. Thank you PyTorch!

In the process we keep track of the actual labels and the predicted labels along with the loss.

def train(dataloader, optimizer_, scheduler_, device_):
    r"""
    Train pytorch model on a single pass through the data loader.

    It will use the global variable `model` which is the transformer model
    loaded on `device_` that we want to train on.

    This function is built with reusability in mind: it can be used as is as long
    as the `dataloader` outputs a batch in dictionary format that can be passed
    straight into the model - `model(**batch)`.

    Arguments:

      dataloader (:obj:`torch.utils.data.dataloader.DataLoader`):
          Parsed data into batches of tensors.

      optimizer_ (:obj:`transformers.optimization.AdamW`):
          Optimizer used for training.

      scheduler_ (:obj:`torch.optim.lr_scheduler.LambdaLR`):
          PyTorch scheduler.

      device_ (:obj:`torch.device`):
          Device used to load tensors before feeding to model.

    Returns:

      :obj:`List[List[int], List[int], float]`: List of [True Labels, Predicted
        Labels, Train Average Loss].
    """

    # Use global variable for model.
    global model

    # Tracking variables.
    predictions_labels = []
    true_labels = []
    # Total loss for this epoch.
    total_loss = 0

    # Put the model into training mode.
    model.train()

    # For each batch of training data...
    for batch in tqdm(dataloader, total=len(dataloader)):

        # Add original labels - use later for evaluation.
        true_labels += batch['labels'].numpy().flatten().tolist()

        # Move batch to device.
        batch = {k: v.type(torch.long).to(device_) for k, v in batch.items()}

        # Always clear any previously calculated gradients before performing a
        # backward pass.
        model.zero_grad()

        # Perform a forward pass (evaluate the model on this training batch).
        # This will return the loss (together with the logits) because we
        # have provided the `labels`.
        # The documentation for the sequence classification model call is here:
        # https://huggingface.co/transformers/v2.2.0/model_doc/bert.html#transformers.BertForSequenceClassification
        outputs = model(**batch)

        # The call to `model` always returns a tuple, so we need to pull the
        # loss value out of the tuple along with the logits. We will use logits
        # later to calculate training accuracy.
        loss, logits = outputs[:2]

        # Accumulate the training loss over all of the batches so that we can
        # calculate the average loss at the end. `loss` is a Tensor containing a
        # single value; the `.item()` function just returns the Python value
        # from the tensor.
        total_loss += loss.item()

        # Perform a backward pass to calculate the gradients.
        loss.backward()

        # Clip the norm of the gradients to 1.0.
        # This is to help prevent the "exploding gradients" problem.
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)

        # Update parameters and take a step using the computed gradient.
        # The optimizer dictates the "update rule"--how the parameters are
        # modified based on their gradients, the learning rate, etc.
        optimizer_.step()

        # Update the learning rate.
        scheduler_.step()

        # Move logits and labels to CPU.
        logits = logits.detach().cpu().numpy()

        # Convert these logits to a list of predicted label values.
        predictions_labels += logits.argmax(axis=-1).flatten().tolist()

    # Calculate the average loss over the training data.
    avg_epoch_loss = total_loss / len(dataloader)

    # Return all true labels and predictions for future evaluations.
    return true_labels, predictions_labels, avg_epoch_loss

validation(dataloader, device_)

I implemented this function in a very similar way to train, but without the parameter updates, backward pass and gradient descent part. We don't need to do all of those VERY computationally intensive tasks because we only care about our model's predictions.

I use the DataLoader in a similar way as in train to get batches to feed to our model.

In the process I keep track of the actual labels and the predicted labels along with the loss.

def validation(dataloader, device_):
    r"""
    Validation function to evaluate model performance on a separate set of data.

    This function will return the true and predicted labels so we can use them
    later to evaluate the model's performance.

    This function is built with reusability in mind: it can be used as is as long
    as the `dataloader` outputs a batch in dictionary format that can be passed
    straight into the model - `model(**batch)`.

    Arguments:

      dataloader (:obj:`torch.utils.data.dataloader.DataLoader`):
          Parsed data into batches of tensors.

      device_ (:obj:`torch.device`):
          Device used to load tensors before feeding to model.

    Returns:

      :obj:`List[List[int], List[int], float]`: List of [True Labels, Predicted
        Labels, Validation Average Loss].
    """

    # Use global variable for model.
    global model

    # Tracking variables.
    predictions_labels = []
    true_labels = []
    # Total loss for this epoch.
    total_loss = 0

    # Put the model in evaluation mode--the dropout layers behave differently
    # during evaluation.
    model.eval()

    # Evaluate data for one epoch.
    for batch in tqdm(dataloader, total=len(dataloader)):

        # Add original labels - use later for evaluation.
        true_labels += batch['labels'].numpy().flatten().tolist()

        # Move batch to device.
        batch = {k: v.type(torch.long).to(device_) for k, v in batch.items()}

        # Tell the model not to compute or store gradients, saving memory and
        # speeding up validation.
        with torch.no_grad():

            # Forward pass, calculate logit predictions.
            # Because the `labels` are part of the batch, the loss is returned
            # along with the logits.
            # The documentation for the sequence classification model call is here:
            # https://huggingface.co/transformers/v2.2.0/model_doc/bert.html#transformers.BertForSequenceClassification
            outputs = model(**batch)

            # The call to `model` always returns a tuple, so we need to pull the
            # loss value out of the tuple along with the logits. We will use logits
            # later to calculate validation accuracy.
            loss, logits = outputs[:2]

            # Move logits and labels to CPU.
            logits = logits.detach().cpu().numpy()

            # Accumulate the validation loss over all of the batches so that we can
            # calculate the average loss at the end. `loss` is a Tensor containing a
            # single value; the `.item()` function just returns the Python value
            # from the tensor.
            total_loss += loss.item()

            # Get predictions as a list.
            predict_content = logits.argmax(axis=-1).flatten().tolist()

            # Update list.
            predictions_labels += predict_content

    # Calculate the average loss over the validation data.
    avg_epoch_loss = total_loss / len(dataloader)

    # Return all true labels and predictions for future evaluations.
    return true_labels, predictions_labels, avg_epoch_loss

Load Model and Tokenizer

Loading the three essential parts of the pretrained GPT2 transformer: configuration, tokenizer and model.

For this example I will use gpt2 from the Hugging Face pretrained transformers. You can use any variation of GPT2 you want.

In creating the model_config I will mention the number of labels I need for my classification task. Since I only predict two sentiments, positive and negative, I only need two labels for num_labels.

Creating the tokenizer is pretty standard when using the Transformers library. After creating the tokenizer, it is critical for this tutorial to set padding to the left, tokenizer.padding_side = "left", and initialize the padding token to tokenizer.eos_token, which is GPT2's original end-of-sequence token. This is the most essential part of this tutorial, since GPT2 uses the last token for prediction and we therefore need to pad to the left.

Hugging Face already did most of the work for us by adding a classification layer to the GPT2 model. In creating the model I used GPT2ForSequenceClassification. Since we have a custom padding token we need to initialize it for the model using model.config.pad_token_id. Finally, we need to move the model to the device we defined earlier.

# Get model configuration.
print('Loading configuraiton...')
model_config = GPT2Config.from_pretrained(pretrained_model_name_or_path=model_name_or_path, num_labels=n_labels)

# Get model's tokenizer.
print('Loading tokenizer...')
tokenizer = GPT2Tokenizer.from_pretrained(pretrained_model_name_or_path=model_name_or_path)
# Default to left padding.
tokenizer.padding_side = "left"
# Define PAD Token = EOS Token = 50256.
tokenizer.pad_token = tokenizer.eos_token

# Get the actual model.
print('Loading model...')
model = GPT2ForSequenceClassification.from_pretrained(pretrained_model_name_or_path=model_name_or_path, config=model_config)

# Resize model embedding to match the new tokenizer.
model.resize_token_embeddings(len(tokenizer))

# Fix model padding token id.
model.config.pad_token_id = model.config.eos_token_id

# Load model to the defined device.
model.to(device)
print('Model loaded to `%s`'%device)
Loading configuraiton...
Loading tokenizer...
Loading model...
Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Model loaded to `cuda`

Dataset and Collator

This is where I create the PyTorch Dataset and DataLoader with the Data Collator objects that will be used to feed data into our model.

This is where I use the MovieReviewsDataset class to create the PyTorch Dataset that will return texts and labels.

Since we need to input numbers to our model, we need to convert the texts and labels to numbers. This is the purpose of a collator! It takes data output by the PyTorch Dataset and passes it through the Data Collator function to output sequences for our model.

I'm keeping the tokenizer away from the PyTorch Dataset to make the code cleaner and better structured. You can obviously use the tokenizer inside the PyTorch Dataset and output sequences that can go straight into the model without a data collator; a sketch of that alternative is shown right below.
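For reference only, here is a hedged sketch of what that alternative could look like (a hypothetical variant, not one of the notebook cells): the Dataset tokenizes each example itself and pads everything to a fixed length so PyTorch's default collate can stack the tensors. It assumes a tokenizer whose padding token has already been set, as done in the Load Model and Tokenizer section above.

class TokenizedMovieReviewsDataset(Dataset):
    r"""Illustrative variant that tokenizes inside `__getitem__`, so no custom
    collator is needed (every example is padded to the same fixed length)."""

    def __init__(self, path, use_tokenizer, labels_encoder, max_sequence_len=60):
        # Reuse the plain-text dataset defined above to read the files.
        self.raw = MovieReviewsDataset(path=path, use_tokenizer=None)
        self.use_tokenizer = use_tokenizer
        self.labels_encoder = labels_encoder
        self.max_sequence_len = max_sequence_len

    def __len__(self):
        return len(self.raw)

    def __getitem__(self, item):
        example = self.raw[item]
        # Pad/truncate to a fixed length so the default collate can stack tensors.
        inputs = self.use_tokenizer(text=example['text'], padding='max_length',
                                    truncation=True, max_length=self.max_sequence_len,
                                    return_tensors='pt')
        return {'input_ids': inputs['input_ids'].squeeze(0),
                'attention_mask': inputs['attention_mask'].squeeze(0),
                'labels': torch.tensor(self.labels_encoder[example['label']])}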

I strongly recommend using a validation text file in order to determine how much training is needed to avoid overfitting. After you figure out which parameters yield the best results, the validation file can be incorporated into the train set and you can run a final train with the whole dataset.
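The IMDB release does not ship with a separate validation split, so one possible way to carve one out of the training folder (a sketch, not one of the notebook cells) is torch.utils.data.random_split:

from torch.utils.data import random_split

# A hypothetical 90/10 train/validation split built from the training folder,
# assuming the data was unpacked to `/content/aclImdb` as above.
full_train = MovieReviewsDataset(path='/content/aclImdb/train', use_tokenizer=tokenizer)
n_valid = int(0.1 * len(full_train))
train_split, valid_split = random_split(
    full_train, [len(full_train) - n_valid, n_valid],
    generator=torch.Generator().manual_seed(123))
print('train: %d  valid: %d' % (len(train_split), len(valid_split)))

The resulting Subset objects can then be passed to DataLoader with the same collator used below.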

The Data Collator is used to format the PyTorch Dataset outputs to match the inputs needed by GPT2.

# Create data collator to encode text and labels into numbers.
gpt2_classificaiton_collator = Gpt2ClassificationCollator(use_tokenizer=tokenizer,
                                                          labels_encoder=labels_ids,
                                                          max_sequence_len=max_length)

print('Dealing with Train...')
# Create pytorch dataset.
train_dataset = MovieReviewsDataset(path='/content/aclImdb/train',
                                    use_tokenizer=tokenizer)
print('Created `train_dataset` with %d examples!'%len(train_dataset))

# Move pytorch dataset into dataloader.
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, collate_fn=gpt2_classificaiton_collator)
print('Created `train_dataloader` with %d batches!'%len(train_dataloader))

print()

print('Dealing with Validation...')
# Create pytorch dataset.
valid_dataset = MovieReviewsDataset(path='/content/aclImdb/test',
                                    use_tokenizer=tokenizer)
print('Created `valid_dataset` with %d examples!'%len(valid_dataset))

# Move pytorch dataset into dataloader.
valid_dataloader = DataLoader(valid_dataset, batch_size=batch_size, shuffle=False, collate_fn=gpt2_classificaiton_collator)
print('Created `eval_dataloader` with %d batches!'%len(valid_dataloader))
Dealing with Train...
pos files: 100%|████████████████████████████████|12500/12500 [01:17<00:00, 161.19it/s]
neg files: 100%|████████████████████████████████|12500/12500 [01:05<00:00, 190.72it/s]
Created `train_dataset` with 25000 examples!
Created `train_dataloader` with 782 batches!
Reading pos files...
pos files: 100%|████████████████████████████████|12500/12500 [00:54<00:00, 230.93it/s]
neg files: 100%|████████████████████████████████|12500/12500 [00:42<00:00, 291.07it/s]
Created `valid_dataset` with 25000 examples!
Created `eval_dataloader` with 782 batches!

Train

I created the optimizer and scheduler used by PyTorch in training. I used the most common parameters used by transformers models.

I looped through the number of defined epochs and called the train and validation functions.

I'm trying to output similar info after each epoch as Keras: train_loss: — val_loss: — train_acc: — valid_acc.

After training, plot the train and validation loss and accuracy curves to check how the training went.

Note: The training plots might look a little weird: the validation accuracy starts higher than the training accuracy and the validation loss starts lower than the training loss. Normally this would be the opposite. I assume the data split just happens to be easier for the validation part or too hard for the training part, or both. Since this tutorial is about using GPT2 for classification I will not worry too much about the results of the model.

# Note: AdamW is a class from the huggingface library (as opposed to pytorch).
# I believe the 'W' stands for 'Weight Decay fix'.
optimizer = AdamW(model.parameters(),
                  lr = 2e-5, # default is 5e-5, our notebook had 2e-5
                  eps = 1e-8 # default is 1e-8.
                  )

# Total number of training steps is number of batches * number of epochs.
# `train_dataloader` contains batched data so `len(train_dataloader)` gives
# us the number of batches.
total_steps = len(train_dataloader) * epochs

# Create the learning rate scheduler.
scheduler = get_linear_schedule_with_warmup(optimizer,
                                            num_warmup_steps = 0, # Default value in run_glue.py
                                            num_training_steps = total_steps)

# Store the average loss after each epoch so we can plot them.
all_loss = {'train_loss':[], 'val_loss':[]}
all_acc = {'train_acc':[], 'val_acc':[]}

# Loop through each epoch.
print('Epoch')
for epoch in tqdm(range(epochs)):
    print()
    print('Training on batches...')
    # Perform one full pass over the training set.
    train_labels, train_predict, train_loss = train(train_dataloader, optimizer, scheduler, device)
    train_acc = accuracy_score(train_labels, train_predict)

    # Get predictions from model on validation data.
    print('Validation on batches...')
    valid_labels, valid_predict, val_loss = validation(valid_dataloader, device)
    val_acc = accuracy_score(valid_labels, valid_predict)

    # Print loss and accuracy values to see how training evolves.
    print("  train_loss: %.5f - val_loss: %.5f - train_acc: %.5f - valid_acc: %.5f"%(train_loss, val_loss, train_acc, val_acc))
    print()

    # Store the loss and accuracy values for plotting the learning curves.
    all_loss['train_loss'].append(train_loss)
    all_loss['val_loss'].append(val_loss)
    all_acc['train_acc'].append(train_acc)
    all_acc['val_acc'].append(val_acc)

# Plot loss curves.
plot_dict(all_loss, use_xlabel='Epochs', use_ylabel='Value', use_linestyles=['-', '--'])

# Plot accuracy curves.
plot_dict(all_acc, use_xlabel='Epochs', use_ylabel='Value', use_linestyles=['-', '--'])
Epoch
100%|████████████████████████████████|4/4 [15:11<00:00, 227.96s/it]

Training on batches...
100%|████████████████████████████████|782/782 [02:42<00:00, 4.82it/s]
Validation on batches...
100%|████████████████████████████████|782/782 [02:07<00:00, 6.13it/s]
  train_loss: 0.54128 - val_loss: 0.38758 - train_acc: 0.75288 - valid_acc: 0.81904

Training on batches...
100%|████████████████████████████████|782/782 [02:36<00:00, 5.00it/s]
Validation on batches...
100%|████████████████████████████████|782/782 [01:41<00:00, 7.68it/s]
  train_loss: 0.36716 - val_loss: 0.37620 - train_acc: 0.83288 - valid_acc: 0.82912

Training on batches...
100%|████████████████████████████████|782/782 [02:36<00:00, 5.00it/s]
Validation on batches...
100%|████████████████████████████████|782/782 [01:24<00:00, 9.24it/s]
  train_loss: 0.31409 - val_loss: 0.39384 - train_acc: 0.86304 - valid_acc: 0.83044

Training on batches...
100%|████████████████████████████████|782/782 [02:36<00:00, 4.99it/s]
Validation on batches...
100%|████████████████████████████████|782/782 [01:09<00:00, 11.29it/s]
  train_loss: 0.27358 - val_loss: 0.39798 - train_acc: 0.88432 - valid_acc: 0.83292
Train and validation loss.
Train and validation accuracy.

Evaluate

When dealing with classification it's useful to look at precision, recall and F1 score.

A good gauge to have when evaluating a model is the confusion matrix.

# Get predictions from model on validation data. This is where you should use
# your test data.
true_labels, predictions_labels, avg_epoch_loss = validation(valid_dataloader, device)

# Create the evaluation report.
evaluation_report = classification_report(true_labels, predictions_labels, labels=list(labels_ids.values()), target_names=list(labels_ids.keys()))
# Show the evaluation report.
print(evaluation_report)

# Plot confusion matrix.
plot_confusion_matrix(y_true=true_labels, y_pred=predictions_labels,
                      classes=list(labels_ids.keys()), normalize=True,
                      magnify=0.1,
                      );
Training on batches... 100%|████████████████████████████████|782/782 [01:09<00:00, 11.24it/s]

              precision    recall  f1-score   support

         neg       0.84      0.83      0.83     12500
         pos       0.83      0.84      0.83     12500

    accuracy                           0.83     25000
   macro avg       0.83      0.83      0.83     25000
weighted avg       0.83      0.83      0.83     25000
Confusion matrix normalized.
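As a closing illustration (not one of the notebook cells), here is a hedged sketch of how the fine-tuned model and tokenizer from above could be used to classify a new review; the example text is made up:

# Minimal inference sketch, reusing `model`, `tokenizer`, `device`,
# `max_length` and `labels_ids` defined earlier in this notebook.
id_to_label = {id_: label for label, id_ in labels_ids.items()}

text = "This movie was a wonderful surprise, I loved every minute of it."
inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=max_length)
inputs = {k: v.to(device) for k, v in inputs.items()}

model.eval()
with torch.no_grad():
    logits = model(**inputs).logits

prediction = logits.argmax(dim=-1).item()
print('Predicted sentiment: %s' % id_to_label[prediction])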

Final Note

If you made it this far: Congrats! 🎊 and Thank you! 🙏 for your interest in my tutorial!

I've been using this code for a while now and I feel it got to a point where it is nicely documented and easy to follow.

Of course it is easy for me to follow because I built it. That is why any feedback is welcome and it helps me improve my future tutorials!

If you see something wrong, please let me know by opening an issue on my ml_things GitHub repository!

A lot of tutorials out there are mostly a one-time thing and are not maintained. I plan on keeping my tutorials up to date as much as I can.

This article was originally published on George Mihaila's personal website and re-published to TOPBOTS with permission from the author.


Source: https://www.topbots.com/gpt2-text-classification-using-hugging-face-transformers/
