How you can fine-tune your LLMs with limited hardware and a tight budget
With the success of ChatGPT, we have witnessed a surge in demand for bespoke large language models.
However, there has been a barrier to adoption. Because these models are so large, it has been challenging for businesses, researchers, and hobbyists with a modest budget to customise them for their own datasets.
Now, with innovations in parameter-efficient fine-tuning (PEFT) methods, it is entirely possible to fine-tune large language models at a relatively low cost. In this article, I demonstrate how to achieve this in a Google Colab notebook.
I anticipate that this article will prove valuable for practitioners, hobbyists, learners, and even hands-on start-up founders.
So, if you need to mock up a cheap prototype, test an idea, or create a cool data science project to stand out from the crowd, keep reading.
Businesses often have private datasets that drive some of their processes.
To give you an example, I worked for a bank where we logged customer complaints in an Excel spreadsheet. An analyst was responsible for manually categorising these complaints for reporting purposes. With thousands of complaints arriving each month, this process was time-consuming and prone to human error.
Had we had the resources, we could have fine-tuned a large language model to carry out this categorisation for us, saving time through automation and potentially reducing the rate of incorrect categorisations.
Inspired by this example, the remainder of this article demonstrates how we can fine-tune an LLM for categorising consumer complaints about financial products and services.
The dataset comprises real consumer complaints data for financial services and products. It is open, publicly available data published by the Consumer Financial Protection Bureau.
There are over 120k anonymised complaints, categorised into approximately 214 “subissues”.
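To make the shape of the task concrete, here is a minimal sketch of how complaint text and its sub-issue label could be paired up and turned into class ids. It assumes the export has columns named "Consumer complaint narrative" and "Sub-issue", which should be checked against the actual file. The rows below are invented placeholders rather than real complaints, and the helper name load_labelled_complaints is hypothetical.

```python
import csv
import io
from collections import Counter

# Invented placeholder rows, NOT real CFPB complaints. The column names are
# an assumption about the export format and should be verified.
SAMPLE_CSV = """Consumer complaint narrative,Sub-issue
"I was charged a late fee twice.",Late fee charged in error
"My report shows an account I never opened.",Information belongs to someone else
"Another account on my report is not mine.",Information belongs to someone else
"",Late fee charged in error
"""


def load_labelled_complaints(handle,
                             text_col="Consumer complaint narrative",
                             label_col="Sub-issue"):
    """Return (text, label) pairs, skipping rows that lack either field."""
    pairs = []
    for row in csv.DictReader(handle):
        text = (row.get(text_col) or "").strip()
        label = (row.get(label_col) or "").strip()
        if text and label:
            pairs.append((text, label))
    return pairs


pairs = load_labelled_complaints(io.StringIO(SAMPLE_CSV))

# Count how many complaints fall under each sub-issue to spot class imbalance.
label_counts = Counter(label for _, label in pairs)

# Map each sub-issue to an integer id, as a classification model would need.
label2id = {label: i for i, label in enumerate(sorted(label_counts))}

print(len(pairs), "labelled complaints across", len(label2id), "classes")
print(label_counts.most_common())
```

On the real file, the same function could be given an open file handle instead of the in-memory sample. With a couple of hundred classes, the label counts are worth a look before training, since rare sub-issues may have very few examples.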