NamrataNair/SectionTagging

Contents

topic-annotate: A modified version of the ClarityNLP SectionTagger, used to generate clinical sentences and their section-header labels in fastText format.

scripts: Contains the Python script used for the text classification model.
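For readers unfamiliar with the fastText format mentioned for topic-annotate, here is a minimal sketch of what one training line looks like. fastText expects one example per line, with the label marked by the `__label__` prefix (its default). The function name `to_fasttext_line` and the label normalization (lowercasing, whitespace to underscores) are assumptions made for illustration; the actual output of topic-annotate may differ.

```python
import re

LABEL_PREFIX = "__label__"  # fastText's default label prefix


def to_fasttext_line(section_header: str, sentence: str) -> str:
    """Format one (section header, sentence) pair as a fastText training line.

    Hypothetical helper: the normalization below is an assumption,
    not necessarily what topic-annotate does.
    """
    # A label must be a single token, so collapse whitespace to underscores.
    label = re.sub(r"\s+", "_", section_header.strip().lower())
    # fastText reads one example per line, so flatten any internal newlines.
    text = " ".join(sentence.split())
    return f"{LABEL_PREFIX}{label} {text}"


if __name__ == "__main__":
    print(to_fasttext_line("History of Present Illness",
                           "Patient reports chest pain."))
```

Each sentence produced this way carries its section header as the classification target, which is the shape a fastText-style text classifier consumes.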

Notes on the embeddings:

  1. The bert-base-clinical-cased embedding used in the Python script can be obtained from ClinicalBERT by running the following commands in a notebook cell:

    !wget -O pretrained_bert_tf.tar.gz https://www.dropbox.com/s/8armk04fu16algz/pretrained_bert_tf.tar.gz\?dl\=1
    !tar -xzf pretrained_bert_tf.tar.gz
    !tar -xzf pretrained_bert_tf/bert_pretrain_output_all_notes_150000.tar.gz
    !mv bert_pretrain_output_all_notes_150000 bert-base-clinical-cased
    !mv ./bert-base-clinical-cased/bert_config.json ./bert-base-clinical-cased/config.json

  2. The cui2vec_embed_vectors.bin file used in the Python script can be obtained by emailing the authors.

  3. The forward-lm.pt and backward-lm.pt files are fine-tuned versions of the pubmed-forward and pubmed-backward language models and can also be obtained upon request.
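Outside a notebook, the extract-and-rename steps for the ClinicalBERT archive in item 1 can be reproduced with the Python standard library. The sketch below assumes the archive has already been downloaded and that its layout matches what the commands imply (an outer `pretrained_bert_tf/` directory containing `bert_pretrain_output_all_notes_150000.tar.gz`); the function name `prepare_clinical_bert` and the `workdir` parameter are hypothetical.

```python
import tarfile
from pathlib import Path

# Directory name taken from the tar/mv commands in item 1.
INNER = "bert_pretrain_output_all_notes_150000"


def prepare_clinical_bert(archive: Path, workdir: Path) -> Path:
    """Unpack the downloaded archive and rename files as in the shell commands.

    Hypothetical helper; assumes the archive layout implied by those commands.
    """
    workdir.mkdir(parents=True, exist_ok=True)

    # Equivalent of: tar -xzf pretrained_bert_tf.tar.gz
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(workdir)

    # Equivalent of: tar -xzf pretrained_bert_tf/<INNER>.tar.gz
    inner_archive = workdir / "pretrained_bert_tf" / f"{INNER}.tar.gz"
    with tarfile.open(inner_archive, "r:gz") as tar:
        tar.extractall(workdir)

    # Equivalent of: mv <INNER> bert-base-clinical-cased
    model_dir = workdir / "bert-base-clinical-cased"
    (workdir / INNER).rename(model_dir)

    # Equivalent of: mv bert_config.json config.json
    (model_dir / "bert_config.json").rename(model_dir / "config.json")
    return model_dir
```

A call such as `prepare_clinical_bert(Path("pretrained_bert_tf.tar.gz"), Path("."))` would leave a `bert-base-clinical-cased` directory in the current folder, matching the result of the commands above.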
