BAND: Biomedical Alert News Dataset

Zihao Fu,1 Meiru Zhang,1 Zaiqiao Meng,2,1 Yannan Shen,3 David Buckeridge,3 Nigel Collier1
1Language Technology Lab, University of Cambridge
2School of Computing Science, University of Glasgow
3School of Population and Global Health, McGill University

About

The Biomedical Alert News Dataset (BAND) is a well-annotated dataset aimed at improving disease surveillance and understanding of disease spread. It includes 1,508 samples from reported news articles, open emails, and alerts, along with 30 epidemiology-related questions. BAND is designed to challenge and advance NLP models in tasks like Named Entity Recognition (NER), Question Answering (QA), and Event Extraction (EE), with a focus on epidemiological analysis.

Dataset

The BAND dataset can be found in the corresponding folders of this repository.

Usage

Question Answering

Run Decoder-Only Model

./scripts/train_lm.sh lmrand base

Run Encoder-Decoder Model

./scripts/train_seq2seq.sh band_rand ptm=t5

Event Extraction

Run Encoder-Decoder Model

./scripts/train_outbreak.sh

Run Decoder-Only Model

python train_lm.py --name "gpt2"

Named Entity Recognition

Run Token-Based NER Model

./scripts/run_token_ner.sh

Run CRF-Based NER Model

./scripts/run_crf_ner.sh

Run Span-Based NER Model

./scripts/run_span_ner.sh

Citation

If you find our dataset or paper useful, please cite our work:

@inproceedings{band2024,
  title={BAND: Biomedical Alert News Dataset},
  author={Fu, Zihao and Zhang, Meiru and Meng, Zaiqiao and Shen, Yannan and Buckeridge, David and Collier, Nigel},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024}
}

License

This project is licensed under the MIT License; see the LICENSE file for details.