macanv / BERT-BiLSTM-CRF-NER
A TensorFlow solution to the NER task using a BiLSTM-CRF model with Google BERT fine-tuning; it also provides server services
Repository Overview (README excerpt)
BERT-BiLSTM-CRF-NER

A TensorFlow solution to the NER task using a BiLSTM-CRF model with Google BERT fine-tuning. This is TensorFlow code for Chinese named entity recognition that uses Google's pre-trained BERT model together with a BLSTM-CRF model.

For Chinese documentation, see https://blog.csdn.net/macanv/article/details/85684284

If this project helps you, please star the repository. Thanks!

• The Chinese training data ($PATH/NERdata/) comes from https://github.com/zjy-ucas/ChineseNER
• The CoNLL-2003 data ($PATH/NERdata/ori/) comes from https://github.com/kyzhouhzau/BERT-NER
• The evaluation code comes from https://github.com/guillaumegenthial/tf_metrics/blob/master/tf_metrics/__init__.py

This project implements NER on top of Google's BERT code and a BiLSTM-CRF network. It is geared toward Chinese data, but other languages need only small code changes.

THIS PROJECT ONLY SUPPORTS Python 3.

Download project and install

You can install this project as a package. If you do not want to install it, you only need to clone the project and reference the file of [file name not shown in this excerpt] to train the model or start the service.

UPDATE:
• 2020.2.6: added simple Flask NER service code
• 2019.2.25: fixed some bugs in the NER service
• 2019.2.19: added a text classification service
• fixed the "Missing loss" error
• added the label_list parameter to the training process, so you can use -label_list xxx to specify the labels used in training

Train model

Use -help to view the parameters for training the named entity recognition model. Of these, data_dir, bert_config_file, output_dir, init_checkpoint, and vocab_file must be specified.

The train/dev/test datasets share one format: each line holds a token followed by that token's label, and sentences are separated by a blank line. The maximum length of each sentence is set by the max_seq_length parameter. You can get training data from the two Git repositories listed above.

You train the NER model by running the training command with these parameters, with init_checkpoint pointing at your BERT checkpoint. You can specify the labels with the -label_list parameter; otherwise the project gets the labels from the training data.
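As a rough illustration of the dataset format and label collection described above, the following sketch parses token/label lines into sentences and derives the label set from them. The function names read_ner_lines and collect_labels and the sample data are hypothetical and are not part of this project; the sketch assumes whitespace-separated columns with the token first and the label last.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # list of (token, label) pairs


def read_ner_lines(lines: Iterable[str]) -> List[Sentence]:
    """Group 'token label' lines into sentences; a blank line ends a sentence."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: close the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        # Assumption: token is the first column, label is the last column.
        current.append((parts[0], parts[-1]))
    if current:
        # The file may not end with a blank line.
        sentences.append(current)
    return sentences


def collect_labels(sentences: List[Sentence]) -> List[str]:
    """Return the sorted set of labels seen in the data."""
    return sorted({label for sentence in sentences for _, label in sentence})


# Illustrative sample only; not taken from the project's datasets.
sample = """
John B-PER
lives O
in O
Paris B-LOC

Acme B-ORG
Corp I-ORG
hired O
him O
""".splitlines()

sentences = read_ner_lines(sample)
labels = collect_labels(sentences)
print(len(sentences), labels)
```

In this picture, a list passed through -label_list would take the place of the derived label set.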
After training, the NER model is saved in the {output_dir} you specified on the command line above.

My training environment: Tesla P40, 24 GB memory.

As Service

Much of the server and client code comes from an excellent open-source project: hanxiao's bert as service. If my code violates any license agreement, please let me know and I will correct it right away.

~~and NER server/client service code can be applied to other tasks with simple modifications, such as text categorization, which I will provide later.~~

This project provides Named Entity Recognition and Text Classification server services. You are welcome to submit requests, or to share your model on GitHub or through this project.

Use -help to view the parameters of the NER service. model_dir and bert_model_dir are required; once they are set, you can start either the NER service or the text classification service from the command line. The parameters are:

• mode: if mode is NER or CLASS, the Named Entity Recognition or Text Classification service is started, respectively. If it is BERT, the service behaves the same as the bert as service project.
• bert_model_dir: the directory of a BERT model, which you can download from https://github.com/google-research/bert
• ner_model_dir: your NER model checkpoint directory
• model_pb_dir: the directory where the frozen model is saved; after the optimize function runs, it will contain a binary file such as ner_model.pb

You can download my NER model from https://pan.baidu.com/s/1m9VcueQ5gF-TJc00sFD88w (extraction code: guqq), or the text classification model from https://pan.baidu.com/s/1oFPsOUh1n5AM2HjDIo2XCw (extraction code: bbu8).

Put ner_mode.pb / classification_model.pb in model_pb_dir and the other files in model_dir. Different models need to be stored separately: for example, put the NER model's label_list.pkl and label2id.pkl in model_dir/ner/ and the text classification files in model_dir/text_classification.

The text classification model can classify 12 categories of Chinese data: '游戏' (games), '娱乐' (entertainment), '财经' (finance), '时政' (current politics), '股票' (stocks), '教育' (education), '社会' (society), '体育' (sports), '家居' (home), '时尚' (fashion), '房产' (real estate), '彩票' (lottery).

When the service starts, it prints its startup information. You can then test it with client code:

• NER client. If you want to customize the word segmentation method, you only need to make a few simple changes to the client-side code.
• Text classification client.

Note that the NER service and the text classification service cannot be started together in a single command, but you can run the start command twice to start the two services on different ports.

Flask server service

Sometimes a multi-threaded deep learning model service cannot use the client/server service. In that case you can replace it with a simple HTTP service, for example one built with Flask. See bert_base/server/simple_flask_http_service.py as a reference for building your own simple HTTP service.

License

MIT.

The following tutorial is an old version and will be removed in the future.
How to train

• Download the BERT Chinese model.
• Create the output directory: create an output path inside the project path.
• Train the model. The first method is to run the training command; alternatively, replace the BERT path and the project path in bert_lstm_ner.py and then run it.

USING BLSTM-CRF OR ONLY CRF FOR DECODING

Edit line 450 of bert_lstm_ner.py and set the crf_only parameter of the add_blstm_crf_layer function: crf_only=True gives a CRF-only output layer, and crf_only=False gives a BiLSTM with a CRF output layer.

Result

All parameters were left at their defaults. The results reported on the dev set and on the test set are label-level results. The entity-level result is computed at lines 796-798 of the code and is output during the predict process.

My model can be downloaded from Baidu Cloud: https://pan.baidu.com/s/1GfDFleCcTv5393ufBYdgqQ (extraction code: 4cus)

NOTE: my model was trained with the crf_only parameter.

ONLINE PREDICT

Once the model has finished training, just run the prediction script.

Using NER as Service

Using NER as a service is simple: you just need to run the Python script from the project root path. You can download my NER model from https://pan.baidu.com/s/1m9VcueQ5gF-TJc00sFD88w (extraction code: guqq). Set ner_mode.p…