A complete LLM project walk-through with code implementation
When I was learning data science and machine learning at university, the curriculum was geared heavily towards algorithms and machine learning techniques. I still remember those days of cracking the math: not exactly fun, but nonetheless a rewarding process that gave me a solid foundation.
Once I graduated and started working as a data scientist, I soon realized the challenge: in real life, problems rarely present themselves as nicely formulated and readily addressable by machine learning techniques. It is the data scientist’s job to first define, scope, and convert the real-life problem into a machine learning problem, before even talking about the algorithms. This is a crucial step, as completely different approaches may be adopted depending on how the problem is formulated, what the desired outcome is, what data is available, the timeline, the budget, the computing infrastructure, and many other factors. In short, it is not a simple math problem anymore.
This gap in my data science training left me feeling disoriented and pressured at first. Luckily, I had my mentor and project colleagues, who helped me a lot in picking up the essentials and learning to ask the right questions. Step by step, I became more confident in managing data science projects.
Reflecting on my own experience, I really wish I had had the chance to learn those soft skills in data science, to be better prepared for my professional life. I have now gone through those struggles myself, but is there anything I could do for newly graduated data scientists?
A famous book for preparing for management consulting interviews is “Case in Point”. This book provides numerous practice case studies that cover a wide range of topics and industries. By observing and understanding how those case studies are solved, candidates can learn quite a lot about practical problem-solving and be ready for real-life challenges.
Inspired by this case-study format, a thought occurred to me: can we leverage recent large language models (LLMs) to generate relevant, diverse…