
Introduction to Safetensors
Image by Author

Hugging Face has developed a new serialization format called Safetensors, aimed at simplifying and streamlining the storage and loading of large and complex tensors. Tensors are the primary data structure used in deep learning, and their size can pose challenges when it comes to efficiency.

Safetensors uses a combination of efficient serialization and compression algorithms to reduce the size of large tensors, making it faster and more efficient than other serialization formats such as pickle. In benchmarks, loading <code>model.safetensors</code> is 76.6X faster on CPU and 2X faster on GPU than loading the traditional PyTorch serialization format, <code>pytorch_model.bin</code>. Check out the Speed Comparison for details.
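As a rough sketch, you can measure a comparison like this yourself (the file names here are placeholders, and absolute numbers will vary with hardware and disk cache):

import time
import torch
from safetensors.torch import save_file, load_file

# Save the same weights in both formats (placeholder file names)
weights = {"weight": torch.zeros((4096, 4096))}
torch.save(weights, "weights.bin")         # traditional pickle-based format
save_file(weights, "weights.safetensors")  # safetensors format

start = time.perf_counter()
torch.load("weights.bin")
print(f"torch.load: {time.perf_counter() - start:.4f}s")

start = time.perf_counter()
load_file("weights.safetensors")
print(f"load_file:  {time.perf_counter() - start:.4f}s")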

Ease of use

Safetensors has a simple and intuitive API for serializing and deserializing tensors in Python. This means that developers can focus on building their deep learning models instead of spending time on serialization and deserialization.

Cross-platform compatibility

You can serialize in Python and conveniently load the resulting files in various programming languages and platforms, such as C++, Java, and JavaScript. This allows for seamless sharing of models across different programming environments.
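Part of what makes this possible is the simplicity of the format itself: a file begins with an 8-byte little-endian integer giving the length of a JSON header, followed by the header and then the raw tensor bytes. A minimal sketch of reading the header in Python (the file name is a placeholder; any language with JSON support can do the same):

import json
import struct

with open("model.safetensors", "rb") as f:
    header_size = struct.unpack("<Q", f.read(8))[0]  # little-endian uint64
    header = json.loads(f.read(header_size))         # names, dtypes, shapes, byte offsets

print(header)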

Speed

Safetensors is optimized for speed and can efficiently handle the serialization and deserialization of large tensors. As a result, it is an excellent choice for applications that use large language models.

Size Optimization

It uses a blend of effective serialization and compression algorithms to decrease the size of large tensors, resulting in faster and more efficient performance compared to other serialization formats such as pickle.
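A quick sketch to inspect the on-disk footprint of the two formats yourself (the file names are placeholders, and the exact difference depends on the tensors being stored):

import os
import torch
from safetensors.torch import save_file

weights = {"weight": torch.zeros((1024, 1024))}
torch.save(weights, "weights.bin")
save_file(weights, "weights.safetensors")

print("pickle:     ", os.path.getsize("weights.bin"), "bytes")
print("safetensors:", os.path.getsize("weights.safetensors"), "bytes")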

Secure

To prevent corruption during the storage or transfer of serialized tensors, Safetensors uses a checksum mechanism. This adds a layer of security, ensuring that all data stored in Safetensors is accurate and dependable. Moreover, it helps prevent DoS attacks.
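To see why a safe format matters, note that pickle-based checkpoints can execute arbitrary code when loaded. A minimal illustration of the risk (the class below is a contrived example, not real malware):

import pickle

class Malicious:
    # pickle invokes __reduce__ during deserialization, so unpickling this
    # object executes the returned call, here a harmless shell command
    def __reduce__(self):
        import os
        return (os.system, ("echo this code ran at load time",))

payload = pickle.dumps(Malicious())
pickle.loads(payload)  # runs the command; a safetensors file, by contrast, is
                       # just a JSON header plus raw bytes and cannot run code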

Lazy loading

When working in distributed settings with multiple nodes or GPUs, it is helpful to load only a portion of the tensors on each device. BLOOM uses this format to load the model on 8 GPUs in just 45 seconds, compared to 10 minutes with the regular PyTorch weights.
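A minimal sketch of partial loading with the <code>safe_open</code> API (the file name and tensor-name prefix are placeholders):

from safetensors import safe_open

tensors = {}
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    for name in f.keys():
        if name.startswith("a."):  # load only the tensors this process needs
            tensors[name] = f.get_tensor(name)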

In this section, we will look at the <code>safetensors</code> API and how you can save and load tensor files.

We can install safetensors using the pip package manager:

pip install safetensors

 

We will use the example from Torch shared tensors to build a simple neural network and save the model using the <code>safetensors.torch</code> API for PyTorch.

from torch import nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(100, 100)
        self.b = self.a  # shared tensor: b points to the same layer as a

    def forward(self, x):
        return self.b(self.a(x))

model = Model()
print(model.state_dict())

 

As we can see, we have successfully created the model. 

OrderedDict([('a.weight', tensor([[-0.0913,  0.0470, -0.0209,  ..., -0.0540, -0.0575, -0.0679],
        [ 0.0268,  0.0765,  0.0952,  ..., -0.0616,  0.0146, -0.0343],
        [ 0.0216,  0.0444, -0.0347,  ..., -0.0546,  0.0036, -0.0454],
        ...,

 

Now, we will save the model by providing the <code>model</code> object and the file name. After that, we will load the saved file into the <code>model</code> object created using <code>nn.Module</code>.

from safetensors.torch import load_model, save_model

save_model(model, "model.safetensors")
load_model(model, "model.safetensors")
print(model.state_dict())

 

OrderedDict([('a.weight', tensor([[-0.0913,  0.0470, -0.0209,  ..., -0.0540, -0.0575, -0.0679],
        [ 0.0268,  0.0765,  0.0952,  ..., -0.0616,  0.0146, -0.0343],
        [ 0.0216,  0.0444, -0.0347,  ..., -0.0546,  0.0036, -0.0454],
        ...,

 

In the second example, we will try to save tensors created using <code>torch.zeros</code>. For that, we will use the <code>save_file</code> function.

import torch
from safetensors.torch import save_file, load_file

tensors = {
    "weight1": torch.zeros((1024, 1024)),
    "weight2": torch.zeros((1024, 1024)),
}
save_file(tensors, "new_model.safetensors")

 

And to load the tensors, we will use the <code>load_file</code> function. 

load_file("new_model.safetensors")

 

{'weight1': tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]]),
 'weight2': tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]])}

 

The safetensors API is available for PyTorch, TensorFlow, PaddlePaddle, Flax, and NumPy. You can learn more about it by reading the Safetensors documentation.
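For instance, the NumPy flavor mirrors the PyTorch one shown above; a minimal sketch (the array and file names are illustrative):

import numpy as np
from safetensors.numpy import save_file, load_file

save_file({"embedding": np.zeros((10, 16), dtype=np.float32)}, "arrays.safetensors")
arrays = load_file("arrays.safetensors")
print(arrays["embedding"].shape)  # (10, 16)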

 

Image from Torch API

In short, Safetensors is a new way to store the large tensors used in deep learning applications. Compared to other formats, it is faster, more efficient, and easier to use. Additionally, it ensures the integrity and safety of data while supporting various programming languages and platforms. By utilizing Safetensors, machine learning engineers can optimize their time and concentrate on developing superior models.

I highly recommend using Safetensors for your projects. Many top AI companies, such as Hugging Face, EleutherAI, and StabilityAI, utilize Safetensors for their projects.

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master’s degree in Technology Management and a bachelor’s degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
 
