2 School of Computing Science, University of Glasgow
3 School of Population and Global Health, McGill University
The Biomedical Alert News Dataset (BAND) is a well-annotated dataset aimed at improving disease surveillance and understanding of disease spread. It includes 1,508 samples from reported news articles, open emails, and alerts, along with 30 epidemiology-related questions. BAND is designed to challenge and advance NLP models in tasks like Named Entity Recognition (NER), Question Answering (QA), and Event Extraction (EE), with a focus on epidemiological analysis.
The BAND dataset can be found in the corresponding folders of this repository.
./scripts/train_lm.sh lmrand base
./scripts/train_seq2seq.sh band_rand ptm=t5
./scripts/train_outbreak.sh
python train_lm.py --name "gpt2"
./scripts/run_token_ner.sh
./scripts/run_crf_ner.sh
./scripts/run_span_ner.sh
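For convenience, the three NER commands above could be chained from one small Python wrapper. This is only a sketch, not part of the repository: the function name `run_ner_baselines` and its `dry_run` flag are hypothetical, and it assumes the scripts are bash scripts that take no arguments and are launched from the repository root.

```python
import subprocess
from pathlib import Path

# Script paths copied from the commands listed above.
NER_SCRIPTS = [
    "./scripts/run_token_ner.sh",
    "./scripts/run_crf_ner.sh",
    "./scripts/run_span_ner.sh",
]


def run_ner_baselines(repo_root=".", dry_run=False):
    """Run each NER script in order and return the scripts that were handled.

    Hypothetical helper: assumes each script is a bash script that needs
    no arguments and is started from the repository root.
    """
    root = Path(repo_root)
    handled = []
    for script in NER_SCRIPTS:
        if dry_run:
            # Only report what would be executed.
            print(f"would run: {script}")
        else:
            if not (root / script).is_file():
                raise FileNotFoundError(f"missing script: {root / script}")
            # check=True raises CalledProcessError at the first failing script.
            subprocess.run(["bash", script], cwd=root, check=True)
        handled.append(script)
    return handled


if __name__ == "__main__":
    # Dry run by default so nothing is trained by accident.
    run_ner_baselines(dry_run=True)
```

Calling `run_ner_baselines()` without `dry_run=True` executes the scripts one after another and stops at the first one that exits with a non-zero status.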
If you find our dataset or paper useful, please cite our work:
@inproceedings{band2024,
title={BAND: Biomedical Alert News Dataset},
author={Fu, Zihao and Zhang, Meiru and Meng, Zaiqiao and Shen, Yannan and Buckeridge, David and Collier, Nigel},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2024}
}
This project is licensed under the MIT License. See the LICENSE file for details.