HUMAID-NER: A Disaster Tweet Dataset for Joint Named Entity Recognition and Event Classification via Uncertainty-Weighted Multitask Learning

Aijaz Ali; Nazish Basir; Sarfaraz Nawaz; Danish Nazir Arain; Haris Ali

doi:10.62019/zabvxd97

Authors

Aijaz Ali, Department of Software Engineering, Faculty of Engineering and Technology, University of Sindh, Jamshoro, Pakistan
Nazish Basir, Department of Information Technology, Faculty of Engineering and Technology, University of Sindh, Jamshoro, Pakistan
Sarfaraz Nawaz, Department of Software Engineering, Faculty of Engineering and Technology, University of Sindh, Jamshoro, Pakistan
Danish Nazir Arain, Dr. A. H. S. Bukhari Postgraduate Centre of ICT, University of Sindh, Jamshoro, Pakistan
Haris Ali, Department of Software Engineering, Mehran University of Engineering & Technology, Jamshoro, Pakistan

DOI:

https://doi.org/10.62019/zabvxd97

Keywords:

disaster tweet analysis, named entity recognition, humanitarian event classification, multitask learning, uncertainty weighting, RoBERTa, crisis informatics, social media NLP, HUMAID-NER, BIO tagging.

Abstract

Rapid extraction of structured information from social media is central to effective humanitarian response, yet disaster tweet resources to date offer only document-level category labels with no span-level entity annotations. We address this gap with HUMAID-NER, the first named entity recognition dataset built on the HumAID benchmark: 60,000 English disaster tweets annotated in BIO format across ten operationally motivated entity types, including CASUALTY, DISPLACED, REQUEST, RESOURCE, and RESCUE, yielding 21 BIO label classes (a B- and an I- tag for each of the ten types, plus O) and roughly 175,000 labelled entity spans. Annotations were produced through a reproducible three-stage hybrid pipeline that combines a spaCy transformer backbone, disaster-domain EntityRuler patterns, and structured regular expressions with priority-based overlap resolution. We also propose a joint multitask learning framework that performs disaster-specific NER and humanitarian event classification through a single RoBERTa-large encoder. A core difficulty in joint training is task conflict: the NER objective produces up to 2,688 token-level gradient signals per example while classification contributes only one, and under fixed task weights this imbalance caused classification macro-F1 to fall by 1.4 points across epochs. Homoscedastic uncertainty weighting with learnable per-task log-variance parameters resolves the conflict. It is paired with a two-stage training schedule that freezes the lower 18 of 24 encoder layers in the second stage to permit task-specific specialisation without eroding shared representations. A controlled four-row ablation study isolates each component’s contribution.
On the HUMAID-NER validation set, the proposed system simultaneously reaches an NER span micro-F1 of 0.841 and a classification macro-F1 of 0.761. Under this setting, classification performance meets or exceeds that of dedicated single-task RoBERTa-large classifiers on the same benchmark (0.730–0.750), suggesting that joint modelling introduces no classification trade-off while adding complete entity extraction capability. A real-time web dashboard demonstrates end-to-end deployment. Dataset, models, and pipeline code are released to support reproducibility and future crisis informatics research.
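
For readers unfamiliar with homoscedastic uncertainty weighting, the following is a minimal illustrative sketch, not the authors' implementation. It assumes one commonly used simplified form of the combined objective, total = sum_i (exp(-s_i) * L_i + s_i), where s_i is a learnable log-variance for task i; the abstract does not state the exact formula used. The function names and the loss values are hypothetical, and plain Python stands in for a deep learning framework.

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with learnable log-variances s_i.

    Assumed form (not confirmed by the source):
        total = sum_i( exp(-s_i) * L_i + s_i )
    exp(-s_i) down-weights a task with large s_i, while the "+ s_i"
    term penalises inflating s_i without bound.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))

def log_var_gradients(task_losses, log_vars):
    """Analytic gradient of the total with respect to each s_i:
    d(total)/d(s_i) = 1 - exp(-s_i) * L_i
    """
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]

def fit_log_vars(task_losses, log_vars, lr=0.1, steps=500):
    """Plain gradient descent on the log-variances with the losses held fixed.

    In real training the losses change every step; they are frozen here
    only to show where the weights settle.
    """
    s = list(log_vars)
    for _ in range(steps):
        grads = log_var_gradients(task_losses, s)
        s = [si - lr * g for si, g in zip(s, grads)]
    return s

# Hypothetical per-batch losses: index 0 = NER, index 1 = classification.
losses = [2.0, 0.5]
initial = [0.0, 0.0]  # s_i = 0 gives every task weight exp(0) = 1

total_at_start = uncertainty_weighted_loss(losses, initial)
fitted = fit_log_vars(losses, initial)
weights = [math.exp(-s) for s in fitted]
```

With the losses held fixed, each s_i settles near log(L_i), so the task with the larger loss receives the smaller weight. This is the sense in which learnable log-variances can rebalance a dominant token-level objective against a single classification signal without hand-tuned task weights.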

Author Biography

Nazish Basir, Department of Information Technology, Faculty of Engineering and Technology, University of Sindh, Jamshoro, Pakistan

Assistant Professor