How to Start a Machine Learning Project Before Starting a Machine Learning Project

Jul 9, 2024

As someone who has made plenty of mistakes in ML, I’ve come to appreciate how much critical thinking machine learning projects require. Despite the huge potential of ML, many projects fall short of expectations. Some reports show that only 15% of businesses’ ML projects succeed, and just 53% of AI projects make it from prototype to production.

Avoid Fooling Yourself

One of the most important lessons I’ve learned in machine learning development is how tempting it is to solve the wrong problem. It’s surprisingly easy to succumb to biases or assumptions that lead us astray. Approaching new challenges with a healthy dose of skepticism and a critical mindset is key to ensuring we’re tackling the root issue.

Before diving into any ML endeavor, I’ve found it extremely useful to craft a kind of “treasure map”. This map distills the project down to its essential landmarks, stripping away unnecessary complexities. Here are five pivotal questions that have helped me understand the business problem:

What Problem Are You Solving?

Understanding the crux of the problem is crucial. It’s easy to get caught up in crafting sophisticated models that miss the mark. For instance, in a pricing project, the stated focus might be predicting product prices to boost margins. However, if you start by building a price prediction model that merely mirrors the existing pricing system, you perpetuate its outdated practices and end up with a misaligned solution and no innovation. Always ask, “What is the true business problem we’re aiming to solve?”

What Problems Are You Ignoring?

Just as important as identifying what we’re solving is recognizing what we’re intentionally leaving out. This strategic omission helps avoid unnecessary complexity. For instance, consider a project tasked with detecting suspicious behavior in a supermarket surveillance system. As a first step, you can choose to ignore complex activity recognition models and focus solely on detecting movement, identifying people in restricted areas, or recognizing individuals who do not pass through specific zones, such as the checkout. This approach simplifies the problem, allowing you to tackle it in manageable chunks.
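To make the “detect movement first” idea concrete, here is a minimal sketch of frame differencing in plain Python. It assumes grayscale frames supplied as nested lists of 0-255 intensities. The function names and both thresholds are hypothetical choices for illustration, not values from the surveillance project.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of 0-255 pixel intensities


def changed_fraction(prev: Frame, curr: Frame, pixel_threshold: int = 25) -> float:
    """Return the fraction of pixels whose intensity changed by more than pixel_threshold."""
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have the same shape")
    total = sum(len(row) for row in curr)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_prev, row_curr in zip(prev, curr)
        for p, c in zip(row_prev, row_curr)
        if abs(p - c) > pixel_threshold
    )
    return changed / total


def has_motion(
    prev: Frame,
    curr: Frame,
    pixel_threshold: int = 25,      # assumed: minimum intensity change that counts
    area_threshold: float = 0.01,   # assumed: minimum share of changed pixels
) -> bool:
    """Flag motion when more than area_threshold of the pixels changed."""
    return changed_fraction(prev, curr, pixel_threshold) > area_threshold
```

A baseline like this is crude (lighting changes will trigger it), but it gives you something to measure against before investing in activity recognition.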

Who Is Your Customer and Why Do They Care?

In ML projects, knowing the real customer (the individual or team who will ultimately use and benefit from the solution) is vital. By cutting out intermediaries and directly understanding end-users’ needs and goals, we ensure alignment between the solution and their expectations, thereby avoiding potential miscommunications. It is crucial to understand where and how the solution will be used: What pain points are they facing? Will the model improve product quality or employee productivity? Does it need to be deployed on the edge or run in real time?

What Do Existing Solutions Look Like?

Understanding current solutions provides a solid foundation and prevents reinventing the wheel. This insight can stimulate improvements and innovations.

How Do You Measure Success?

Establishing clear success criteria is pivotal for any ML initiative. Without tangible benchmarks, it’s impossible to gauge progress or the efficacy of our solution. Success metrics should be precise and aligned with project goals. For instance, in the surveillance video project, success hinged on reducing the number of personnel needed to monitor the system, aiming for a substantial increase in productivity (where the same number of people could monitor more cameras) while maintaining quality.
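One way to pin such a criterion down is to write it as a check you can run on before-and-after numbers. The sketch below is an assumption-laden illustration: the `MonitoringStats` fields, the use of an incident catch rate as the quality measure, and both thresholds are hypothetical, not figures from the project.

```python
from dataclasses import dataclass


@dataclass
class MonitoringStats:
    cameras: int
    operators: int
    incidents_total: int   # incidents that actually occurred (e.g. from an audit)
    incidents_caught: int  # incidents the operators flagged

    def __post_init__(self) -> None:
        if self.operators <= 0 or self.incidents_total <= 0:
            raise ValueError("operators and incidents_total must be positive")

    @property
    def cameras_per_operator(self) -> float:
        return self.cameras / self.operators

    @property
    def catch_rate(self) -> float:
        return self.incidents_caught / self.incidents_total


def meets_success_criteria(
    before: MonitoringStats,
    after: MonitoringStats,
    min_productivity_gain: float = 0.5,  # assumed target: +50% cameras per operator
    max_quality_drop: float = 0.02,      # assumed tolerance: 2 points of catch rate
) -> bool:
    """True when productivity rose enough and quality did not drop too far."""
    gain = after.cameras_per_operator / before.cameras_per_operator - 1.0
    quality_drop = before.catch_rate - after.catch_rate
    return gain >= min_productivity_gain and quality_drop <= max_quality_drop
```

Writing the criterion this way forces the team to agree up front on what “maintaining quality” means and how it will be measured.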

Embracing Iteration and Discovery

ML projects often need a discovery phase to assess feasibility. Lasting several weeks, this phase involves delving into the problem domain, evaluating data quality and availability, and validating initial assumptions. It serves as a crucial litmus test, setting realistic expectations. By iterating through potential solutions and validating assumptions early on, you can mitigate risks and boost the odds of project success.
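As a small illustration of what “evaluating data quality” can look like in the first days of discovery, here is a sketch of two quick checks: how often each field is missing, and how the labels are distributed. It assumes records arrive as dictionaries; the function and column names are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List

Row = Dict[str, Any]


def missing_rate(rows: List[Row], columns: List[str]) -> Dict[str, float]:
    """Fraction of rows in which each column is absent or None."""
    if not rows:
        return {col: 0.0 for col in columns}
    return {
        col: sum(1 for row in rows if row.get(col) is None) / len(rows)
        for col in columns
    }


def label_shares(rows: List[Row], label_column: str) -> Dict[Any, float]:
    """Share of each label among the rows that have one."""
    labels = [row[label_column] for row in rows if row.get(label_column) is not None]
    if not labels:
        return {}
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}
```

If half the values of a key field are missing, or one label accounts for almost every record, that is worth knowing before committing to a modeling approach.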

Conclusion

Crafting effective ML systems demands a disciplined approach that avoids common pitfalls. By working through these questions and adhering to fundamental principles, we can navigate the intricate landscape of ML development and deliver solutions that have a meaningful impact. Remember, MLOps isn’t just about deploying models that ace test sets; it’s about building systems that operate reliably in the real world, delivering tangible value to users and organizations.