Named Entity Recognition for Specific Domains - Take Advantage of Transfer Learning

  • Sunna Torge, Technische Universität Dresden, Centre for Information Services and High Performance Computing (ZIH), 01062 Dresden
  • Waldemar Hahn, Technische Universität Dresden, Centre for Information Services and High Performance Computing (ZIH), 01062 Dresden
  • Lalith Manjunath
  • René Jäkel, Technische Universität Dresden, Centre for Information Services and High Performance Computing (ZIH), 01062 Dresden

Abstract

Automated text analysis such as named entity recognition (NER) heavily relies on large amounts of high-quality training data. For domain-specific NER, transfer learning approaches aim to overcome the lack of domain-specific training data. In this paper, we investigate transfer learning approaches to improve domain-specific NER in low-resource domains. The first part of the paper is dedicated to information transfer from known to unknown entities using BiLSTM-CRF neural networks, also considering the influence of varying training data size. In the second part, pre-trained BERT models are fine-tuned for domain-specific German NER. The performance of models of both architectures is compared with respect to different hyperparameters and a set of 16 entities. The experiments are based on the revised German SmartData Corpus and a baseline model trained on this corpus.
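
To illustrate the second approach mentioned in the abstract, the sketch below shows how a pre-trained German BERT checkpoint can be fine-tuned for token classification (NER) with the Hugging Face Transformers library. This is a minimal, generic example, not the authors' setup: the checkpoint name, the label subset, and the toy sentence are illustrative assumptions; the paper itself uses 16 entity types from the revised German SmartData Corpus.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Illustrative label subset; the paper works with 16 entity types
# from the revised German SmartData Corpus (not reproduced here).
label_list = ["O", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]
label2id = {label: i for i, label in enumerate(label_list)}

# Any pre-trained German BERT checkpoint can serve as the starting point;
# "bert-base-german-cased" is used here purely as an example.
model_name = "bert-base-german-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(
    model_name, num_labels=len(label_list)
)

# One toy sentence with word-level gold labels (hypothetical annotation).
words = ["Die", "Technische", "Universität", "Dresden", "liegt", "in", "Dresden", "."]
word_labels = ["O", "B-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC", "O"]

# Tokenize into subwords and align labels: only the first subword of each
# word keeps its label; the rest are set to -100 and ignored by the loss.
encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
aligned = []
previous_word = None
for word_id in encoding.word_ids(batch_index=0):
    if word_id is None or word_id == previous_word:
        aligned.append(-100)
    else:
        aligned.append(label2id[word_labels[word_id]])
    previous_word = word_id
labels = torch.tensor([aligned])

# A single forward pass yields the token-classification loss that would be
# minimized during fine-tuning on the domain-specific corpus, e.g. with the
# Hugging Face Trainer or a plain PyTorch training loop.
outputs = model(**encoding, labels=labels)
print("loss:", outputs.loss.item())
```

In a full fine-tuning run, the same label-alignment step would be applied to every sentence of the domain-specific training set, and the loss above would be optimized over several epochs before evaluating entity-level precision, recall, and F1 on held-out data.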
Published
Sep 16, 2022
How to Cite
TORGE, Sunna et al. Named Entity Recognition for Specific Domains - Take Advantage of Transfer Learning. International Journal of Information Science and Technology, [S.l.], v. 6, n. 3, p. 4 - 15, sep. 2022. ISSN 2550-5114. Available at: <https://www.innove.org/ijist/index.php/ijist/article/view/189>. Date accessed: 07 dec. 2024. doi: http://dx.doi.org/10.57675/IMIST.PRSM/ijist-v6i3.189.
Section
Special Issue : Machine Learning and Natural Language Processing