Fine-Tune Your LLM Without Maxing Out Your GPU

How you can fine-tune your LLMs with modest hardware and a tight budget

John Adeojo
Towards Data Science
Image by Author: Generated with Midjourney

With the success of ChatGPT, we have witnessed a surge in demand for bespoke large language models.

However, there has been a barrier to adoption. As these models are so large, it has been challenging for businesses, researchers, or hobbyists with a modest budget to customise them for their own datasets.

Now, with innovations in parameter-efficient fine-tuning (PEFT) methods, it is entirely possible to fine-tune large language models at a relatively low cost. In this article, I demonstrate how to achieve this in a Google Colab notebook.
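
To make "parameter-efficient" concrete, here is a rough sketch of what a LoRA setup with Hugging Face's peft library can look like. The base model, label count, and hyperparameters below are illustrative assumptions, not necessarily the exact configuration used later in this article.

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# Load a base model for illustration (the model choice here is an assumption).
base_model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=214,  # set to the number of target categories (214 "sub-issues" for our dataset)
)

# LoRA freezes the base weights and learns small low-rank update matrices,
# which is what keeps memory and compute requirements modest.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                 # rank of the low-rank update matrices
    lora_alpha=16,       # scaling applied to the updates
    lora_dropout=0.1,
    target_modules=["query", "value"],  # attention projections to adapt
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

Because only the small adapter matrices are trained, the fine-tuning fits comfortably within the memory of a single Colab GPU.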

I anticipate that this article will prove valuable for practitioners, hobbyists, learners, and even hands-on start-up founders.

So, if you need to mock up a cheap prototype, test an idea, or create a cool data project to stand out from the crowd, keep reading.

Businesses often have private datasets that drive some of their processes.

To give you an example, I worked for a bank where we logged customer complaints in an Excel spreadsheet. An analyst was responsible for categorising these complaints (manually) for reporting purposes. With thousands of complaints arriving each month, this process was time-consuming and prone to human error.

Had we had the means, we could have fine-tuned a large language model to carry out this categorisation for us, saving time through automation and potentially reducing the rate of incorrect categorisations.

Inspired by this example, the remainder of this article demonstrates how we can fine-tune an LLM to categorise consumer complaints about financial products and services.

The dataset comprises real consumer complaints about financial products and services. It is open, publicly available data published by the Consumer Financial Protection Bureau.

There are over 120k anonymised complaints, categorised into approximately 214 “subissues”.

Source link
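
Before fine-tuning anything, a quick pandas sanity check is enough to confirm the scale of the data and the label distribution. The file name and column names below ("Consumer complaint narrative", "Sub-issue") are assumptions based on the public CFPB export; adjust them to match your own copy of the data.

```python
import pandas as pd

# Load the complaints export (file name is an assumption for illustration).
df = pd.read_csv("complaints.csv")

print(f"Complaints: {len(df):,}")                  # roughly 120k rows
print(f"Sub-issues: {df['Sub-issue'].nunique()}")  # roughly 214 categories
print(df["Sub-issue"].value_counts().head(10))     # most common categories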