Domain Adaptation of a Large Language Model

Adapt a pre-trained LLM to a new domain using the HuggingFace library

Mina Ghashami · Towards Data Science · Nov 2023

Large language models (LLMs) like BERT are usually pre-trained on general-domain corpora such as Wikipedia and BookCorpus. When we apply them to more specialized domains like medicine, there is often a drop in performance compared to models adapted for those domains.

In this article, we will explore how to adapt a pre-trained LLM like DeBERTa base to the medical domain using the HuggingFace library. Specifically, we will cover an effective technique called continued pre-training, where we do further pre-training of the LLM on text from our target domain. This adapts the model to the new domain and improves its performance.

This is a simple yet effective technique for tuning LLMs to your domain and gaining significant improvements in downstream task performance.
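Before walking through the steps in detail, here is a minimal sketch of what this continued pre-training step can look like with HuggingFace Transformers. The model checkpoint, toy corpus, and hyperparameters below are illustrative assumptions rather than the article's actual setup; the point is simply that we keep training DeBERTa with its original masked-language-modeling objective, but on in-domain text.

```python
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Illustrative model choice: DeBERTa base, continued with its original MLM objective.
model_name = "microsoft/deberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Toy stand-in for the real in-domain corpus built during data preparation below.
corpus = Dataset.from_dict({
    "text": [
        "<patient>name:John, surname:Doer, patientID:1234, age:34</patient>",
        "<patient>name:Jane, surname:Smith, patientID:5678, age:52</patient>",
    ]
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens so the model keeps learning the MLM objective on domain text.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="deberta-medical",    # hypothetical output directory
    num_train_epochs=1,              # illustrative hyperparameters only
    per_device_train_batch_size=2,
)

trainer = Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator)
trainer.train()
```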

Let’s get started.

The first step in any such project is to prepare the data. Since our dataset is from the medical domain, it contains the following fields, among many others:

[Image by author: a sample of the dataset's fields]

Listing all the fields here is impractical, as there are many. But even this glimpse into the existing fields helps us form the input sequence for an LLM.
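For illustration, a record-based dataset like this could be loaded and inspected with the HuggingFace datasets library. The file name and format below are hypothetical, since the article's exact data source is not shown here:

```python
from datasets import load_dataset

# Hypothetical example: assume the medical records live in a CSV file
# with one structured record per row.
dataset = load_dataset("csv", data_files="medical_records.csv", split="train")

# Inspect the available fields before deciding how to serialize them into text.
print(dataset.column_names)   # e.g. ['name', 'surname', 'patientID', 'age', ...]
print(dataset[0])             # one raw record as a dict
```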

The first point to keep in mind is that the input has to be a single sequence, because LLMs read their input as text sequences.

To form a record into a sequence, we can inject special tags that tell the LLM what piece of information comes next. Consider the following example: <patient>name:John, surname:Doer, patientID:1234, age:34</patient>. Here, <patient> is a special tag that tells the LLM that what follows is information about a patient.

So we form the input sequence as follows:

[Image by author: the input sequence format with injected tags]

As you see, we have injected four tags:

  1. <patient> </patient>: to contain…
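Putting this together, below is a minimal sketch of serializing one record into a tagged sequence. Only the <patient> tag comes from the example above; the other tags and field names are hypothetical stand-ins for the remaining fields:

```python
# A minimal sketch of serializing one record into a tagged text sequence.
# Only <patient> appears in the article's example; <symptoms>, <diagnosis>,
# and <treatment> are hypothetical stand-ins for the remaining tags.
def record_to_sequence(record: dict) -> str:
    patient = (
        f"<patient>name:{record['name']}, surname:{record['surname']}, "
        f"patientID:{record['patientID']}, age:{record['age']}</patient>"
    )
    symptoms = f"<symptoms>{record['symptoms']}</symptoms>"
    diagnosis = f"<diagnosis>{record['diagnosis']}</diagnosis>"
    treatment = f"<treatment>{record['treatment']}</treatment>"
    return patient + symptoms + diagnosis + treatment


example = {
    "name": "John", "surname": "Doer", "patientID": 1234, "age": 34,
    "symptoms": "persistent cough, fever",
    "diagnosis": "acute bronchitis",
    "treatment": "rest, fluids, bronchodilator",
}
print(record_to_sequence(example))
```

Since tags like <patient> are not part of the pre-trained vocabulary, a common follow-up (not shown in the excerpt above) is to register them with tokenizer.add_tokens(...) and call model.resize_token_embeddings(len(tokenizer)) so the model learns embeddings for them during continued pre-training.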
