Zihao Fu,1 Meiru Zhang,1 Zaiqiao Meng,2,1 Yannan Shen,3 David Buckeridge,3 Nigel Collier1
1Language Technology Lab, University of Cambridge
2School of Computing Science, University of Glasgow
3School of Population and Global Health, McGill University
The Biomedical Alert News Dataset (BAND) is a well-annotated dataset aimed at improving disease surveillance and understanding of disease spread. It includes 1,508 samples from reported news articles, open emails, and alerts, along with 30 epidemiology-related questions. BAND is designed to challenge and advance NLP models in tasks like Named Entity Recognition (NER), Question Answering (QA), and Event Extraction (EE), with a focus on epidemiological analysis.
The BAND dataset can be found in the corresponding folders.
./scripts/train_lm.sh lmrand base
./scripts/train_seq2seq.sh band_rand ptm=t5
./scripts/train_outbreak.sh
python train_lm.py --name "gpt2"
./scripts/run_token_ner.sh
./scripts/run_crf_ner.sh
./scripts/run_span_ner.sh
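The three NER scripts above can also be run in one pass. The following is a minimal sketch, assuming the scripts are executable from the repository root and take no arguments. The function name `run_ner_baselines` and the `logs` directory are hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: run the token, CRF, and span NER scripts one after another.
# The script paths come from this README; the function name and the
# log directory are assumptions made for illustration only.
run_ner_baselines() {
  local log_dir="${1:-logs}"   # hypothetical output directory for logs
  local variant script
  mkdir -p "$log_dir"
  for variant in token crf span; do
    script="./scripts/run_${variant}_ner.sh"
    # Skip a variant whose script is missing instead of aborting the loop.
    if [ ! -x "$script" ]; then
      echo "skip: $script not found or not executable" >&2
      continue
    fi
    echo "running $script"
    # Capture stdout and stderr per variant; report failures but keep going.
    "$script" > "$log_dir/${variant}_ner.log" 2>&1 || echo "failed: $script" >&2
  done
}
```

After sourcing the snippet, calling `run_ner_baselines` from the repository root would write one log file per variant, which keeps the output of the three runs separate for later comparison.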
If you find our dataset or paper useful, please cite our work:
@inproceedings{band2024,
title={BAND: Biomedical Alert News Dataset},
author={Fu, Zihao and Zhang, Meiru and Meng, Zaiqiao and Shen, Yannan and Buckeridge, David and Collier, Nigel},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2024}
}
This project is licensed under the MIT License; see the LICENSE file for details.