This issue recommends a lightweight natural language processing (NLP) toolkit: fastNLP.
fastNLP is an open-source project started by the natural language processing group at Fudan University in China. It is a lightweight NLP framework that aims to make it quick to implement NLP tasks and to build complex models.
fastNLP has the following features:
- A unified tabular data container that simplifies data preprocessing;
- Built-in Loaders and Pipes for many datasets, which removes the need to write preprocessing code;
- Convenient NLP utilities, such as Embedding loading (including ELMo and BERT) and caching of intermediate data;
- Automatic download of some datasets and pre-trained models;
- A variety of neural network components and reproduced models (covering tasks such as Chinese word segmentation, named entity recognition, syntactic parsing, text classification, text matching, coreference resolution, and summarization);
- A Trainer with a variety of built-in Callback functions that help with experiment logging, exception capture, and similar tasks.
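To make the "tabular data container" idea concrete, here is a minimal pure-Python sketch. It is not fastNLP's DataSet class or its API: the `TinyTable` class, its `apply` method, and the field names `raw_words`, `words`, and `seq_len` are hypothetical and chosen only for illustration. The point is that each preprocessing step becomes a function over rows that adds a new field.

```python
class TinyTable:
    """A toy column-oriented table: field name -> list of values, one per row.

    Hypothetical illustration only; this is not fastNLP's DataSet.
    """

    def __init__(self, columns):
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all fields must have the same number of rows")
        self.columns = {name: list(values) for name, values in columns.items()}

    def __len__(self):
        # Number of rows (0 for a table with no fields).
        return len(next(iter(self.columns.values()), []))

    def rows(self):
        # Yield each row as a dict of field name -> value.
        names = list(self.columns)
        for i in range(len(self)):
            yield {name: self.columns[name][i] for name in names}

    def apply(self, func, new_field):
        # Compute func(row) for every row and store the results as a new field.
        self.columns[new_field] = [func(row) for row in self.rows()]
        return self


table = TinyTable({"raw_words": ["This is a test .", "Another short sentence ."]})
table.apply(lambda row: row["raw_words"].lower().split(), "words")
table.apply(lambda row: len(row["words"]), "seq_len")
print(table.columns["seq_len"])
```

Keeping every derived field next to the raw text in one table is what lets later steps (indexing, batching, padding) refer to fields by name instead of passing parallel lists around.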
Installation:
fastNLP depends on the following packages:
numpy>=1.14.2
torch>=1.0.0
tqdm>=4.28.1
nltk>=3.4.1
requests
spacy
prettytable>=0.7.2
How torch should be installed may depend on your operating system and CUDA version; see the official PyTorch website for details. Once the dependencies are installed, run the following commands on the command line to complete the installation:
>>> pip install fastNLP
>>> python -m spacy download en
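Before or after installing, it can help to see which of the listed dependencies are already present in the current environment. The following is a small sketch that uses only the Python standard library (`importlib.metadata`, available from Python 3.8). The helper name `installed_versions` is hypothetical, and the script only reports versions; it does not compare them against the minimum versions listed above.

```python
from importlib import metadata

# Distribution names taken from the dependency list above.
REQUIRED = ["numpy", "torch", "tqdm", "nltk", "requests", "spacy", "prettytable"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed can then be installed with pip; for torch, follow the instructions on the PyTorch website rather than relying on a generic command.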
Detailed tutorials:
- Preprocess text using DataSet
DataSet in fastNLP - fastNLP 0.6.0 documentation
- Use Vocabulary to convert text to indices
Vocabulary in fastNLP - fastNLP 0.6.0 documentation
- Use the Embedding module to convert text into vectors
Using the Embedding module to convert text into vectors - fastNLP 0.6.0 documentation
- Load and process datasets using Loader and Pipe
Using Loader and Pipe to load and process datasets - fastNLP 0.6.0 documentation
- Use Trainer and Tester to quickly train and test
Using Trainer and Tester to quickly train and test - fastNLP 0.6.0 documentation
- Use DataSetIter to customize the training process
Using DataSetIter to implement a custom training process - fastNLP 0.6.0 documentation
- Use Metric to quickly evaluate your model
Using Metric to quickly evaluate your model - fastNLP 0.6.0 documentation
- Use Modules and Models to quickly build custom models
Using Modules and Models to quickly build custom models - fastNLP 0.6.0 documentation
- Use Callback to customize your training process
Using Callback to customize your training process - fastNLP 0.6.0 documentation
- Further reading 1: Uses of BertEmbedding
Various uses of BertEmbedding - fastNLP 0.6.0 documentation
- Further reading 2: Introduction to distributed training
Distributed parallel training - fastNLP 0.6.0 documentation
- Further reading 3: Use fitlog to aid fastNLP research
Using fitlog to assist research with fastNLP - fastNLP 0.6.0 documentation
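For readers new to the "convert text to indices" step that the Vocabulary tutorial covers, the underlying idea can be sketched in a few lines of plain Python. This is not fastNLP's Vocabulary class: the function names `build_vocab` and `to_indices`, and the choice of `<pad>` and `<unk>` as special tokens at indices 0 and 1, are assumptions made for illustration.

```python
from collections import Counter


def build_vocab(sentences, padding="<pad>", unknown="<unk>"):
    """Build a word -> index mapping from tokenized sentences.

    Special tokens come first; the remaining words are ordered by
    descending frequency, with ties kept in first-seen order.
    """
    counts = Counter(word for sentence in sentences for word in sentence)
    word2idx = {padding: 0, unknown: 1}
    for word, _ in counts.most_common():
        if word not in word2idx:
            word2idx[word] = len(word2idx)
    return word2idx


def to_indices(sentence, word2idx, unknown="<unk>"):
    """Replace each token with its index; unseen tokens map to the unknown index."""
    return [word2idx.get(word, word2idx[unknown]) for word in sentence]


train = [["this", "is", "a", "test"], ["this", "is", "fine"]]
vocab = build_vocab(train)
print(to_indices(["this", "is", "new"], vocab))  # [2, 3, 1]
```

The index sequences produced this way are what an embedding layer consumes. The tutorials listed above show how the library handles the same step, together with padding and batching.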
You can read through these tutorials on your own for more detail.
Open source address: Click download