Project Details

🤗 Hugging Face

A dive into the Hugging Face tokenizers and transformers libraries

By Hugging Face

Description

Hugging Face’s open-source framework Transformers has been downloaded over a million times, has amassed over 25,000 stars on GitHub, and has been tested by researchers at Google, Microsoft and Facebook. We're excited to offer new resources from Hugging Face for state-of-the-art NLP.

The new Transformers container comes with all dependencies pre-installed, so you can immediately use the library's state-of-the-art models for training, fine-tuning and inference. The new notebooks cover how to train a tokenizer from scratch, how to load popular pretrained language models in a couple of lines of code, and how to use pipelines, which wrap the tokenizer and model in a single call for downstream tasks; short sketches of each follow below. The library includes models like BERT, GPT-2, T5, Transformer-XL, XLM, and more.
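As a taste of what the tokenizer notebook walks through, here is a minimal sketch of training a byte-level BPE tokenizer from scratch with the tokenizers library. The corpus file, vocabulary size and special tokens below are placeholder assumptions, not values taken from the notebook:

    # Minimal sketch: train a byte-level BPE tokenizer from scratch.
    from tokenizers import ByteLevelBPETokenizer

    tokenizer = ByteLevelBPETokenizer()
    tokenizer.train(
        files=["corpus.txt"],  # hypothetical plain-text training corpus
        vocab_size=30_000,     # assumed vocabulary size
        min_frequency=2,
        special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
    )
    tokenizer.save_model("my-tokenizer")  # writes vocab.json and merges.txt

Loading a pretrained model really does take a couple of lines. A sketch using the library's Auto classes, with bert-base-uncased as one example checkpoint:

    # Minimal sketch: load a pretrained tokenizer and model, run one forward pass.
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("Hello, Transformers!", return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)  # (batch, sequence length, hidden size)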

Some of the use cases covered include:

  • Sentence classification (sentiment analysis)
  • Token classification (named entity recognition, part-of-speech tagging)
  • Feature extraction
  • Question answering
  • Summarization
  • Mask filling
  • Translation
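A pipeline bundles the tokenizer and model behind a single call for tasks like those above. Here is a short sketch; the task names are the library's standard identifiers, and each pipeline downloads a default model on first use:

    # Minimal sketch: pipelines for a few of the tasks listed above.
    from transformers import pipeline

    # Sentence classification (sentiment analysis)
    classifier = pipeline("sentiment-analysis")
    print(classifier("Hugging Face makes NLP remarkably easy."))

    # Question answering over a short context
    qa = pipeline("question-answering")
    print(qa(question="What comes pre-installed?",
             context="The Transformers container comes with all dependencies pre-installed."))

    # Mask filling (the default model uses the <mask> token)
    unmasker = pipeline("fill-mask")
    print(unmasker("Paris is the <mask> of France."))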

For a walkthrough of the code with Hugging Face's ML Engineer Morgan Funtowicz, check out the webinar.