How to start a machine learning project with an external AI company – a practical guide
Today’s topic comes directly from discussions with our clients about their concerns. I won’t beat around the bush: machine learning projects carry a high risk. Unlike software development, which is difficult but still far easier to plan, they involve many uncertainties. When you start an ML project, you often don’t know whether technology can solve your problem at all, because no one has solved it before. Such a project therefore belongs to the domain of research and starts with a feasibility study. By the way, if an AI company promises up front to solve a niche, atypical problem, I’d recommend being extremely cautious, because it sounds too good to be true. There are just too many questions, predictions, and hypotheses that need to be supported by evidence.
Beyond a doubt, the game is worth the candle. If a machine learning project succeeds, you can boost your business immensely, gain a great advantage over your competition, and create new, unique products. In this article, I want to show you how we launch and manage machine learning projects to reduce risk and costs while still delivering the expected results in a planned way.
What is machine learning
Machine learning is a subfield of artificial intelligence. Its purpose is to develop algorithms that let computers learn and improve automatically over time. These algorithms are built to find patterns in data and, eventually, to make data-based predictions and decisions. The keyword here is ‘automatically’, which means without any further human intervention. Different types of machine learning algorithms are used in plenty of industries and applications, such as medical diagnosis, insurance, finance, telecommunications, and production.
Specifics of machine learning projects
Machine learning projects aim at significant changes in an organization: improving production processes, optimizing supply chains, and making better business decisions. They can also be the basis for new, groundbreaking products or for unique features added to existing ones. Whether your aim is to improve manufacturing efficiency by 10% or to predict how many baubles you need to order this year to meet market demand, there are no ready-made solutions available.
Embrace unpredictability
Machine learning projects, like other artificial intelligence projects, therefore come with relatively high risk. As with every innovation, a research project means we need to confirm our hypotheses first. We must prove that what we want to achieve is in fact achievable. For this reason, it is practically impossible to determine at the very beginning the time and budget needed to complete most machine learning projects. This creates obvious problems, such as how to persuade your board of directors to accept both the costs and the unpredictability of the outcome. This doesn’t apply to typical projects that take advantage of machine learning, such as recommender systems, which make use of ready-made tools and libraries. I’d in fact call those IT projects.
Minimize the risk
The key here is to minimize the risk and costs and to make the process as safe as possible. Apart from the project itself, there’s also the question of choosing the right supplier, which always adds to the risk when you are cooperating for the first time. We’ve already expanded upon the topic in our previous blog post on 5 common fears about working with an external provider, where we reflect on general issues concerning cooperation with an outsourcing company. Now that you know how risky the process is, let’s discuss how to make your machine learning project less uncertain.
Stages of our machine learning development process
Companies don’t want to create long-lasting bonds with their providers before they get to know them well, and that’s completely understandable. We make this easier by allowing the cooperation to end quickly if something goes wrong and a client isn’t satisfied with the outcome. You don’t have to sign a multi-year contract at the very beginning of the cooperation, and you can test us beforehand. This way we also help you verify quite quickly, even after one or two sprints, whether your project is likely to succeed.
So, here’s the process explained. We have developed it over the years on projects carried out not only for our clients but also internally. I believe this way of managing R&D projects helps to minimize the risk for both sides.
Phase 1: analyze the problem
You come to us with a problem. You believe you can solve it, or simply do things better (faster, more effectively), with e.g. neural networks or deep learning.
Since every case is different when it comes to machine learning, we need to learn as much as we can about your organization and its challenges. To achieve that, after signing an NDA, we interview you and conduct a workshop session (which is discussed further in this article). Once we know your problem, the existing solutions and equipment, and the area that is to be automated, and once we have sample data, we can propose an initial idea for a solution or an approach to solving the problem. This stage of the project is entirely free and requires some information from you (possibly sample data) and 2-3 hours of your time.
Phase 2: create a feasibility study
Once we know the problem, we can investigate it further on our own. In phase two we devote time to verifying the initial assumptions and theories developed in phase one. We conduct a feasibility study to assess whether particular algorithms and solutions will work. In the end, we try to propose a complete, long-term solution. This phase takes up to a week.
Phase 3: divide into subprojects and start with the first one
Machine learning projects can last from a few weeks to several months, and it’s difficult to give an accurate estimate with minimal information. To manage such a project effectively, we try to divide it into subprojects that have clear timing, deliverables, and estimates. It’s important to define the first 1-2 sprints, during which we can verify the core hypothesis of the project. That way, you know clearly whether the path we’ve taken is correct and will solve your problem.
Without a long-term commitment on your side, you can validate the basic assumptions of the ML project, such as how many training samples are needed, how effective the system/functions will be, and whether the results you’re expecting are likely to be achieved. Moreover, this method allows you to get to know us quickly: whether we keep our promises, how responsive we are, and whether you like working with us.
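To make the question of how many training samples are needed more concrete, here is a minimal sketch of a learning curve: train on growing subsets and watch how accuracy changes on held-out data. Everything in it is an assumption made for illustration. The data is synthetic (two Gaussian blobs), the "model" is a simple nearest-centroid classifier, and the function names are hypothetical; in a real first sprint you would plug in your own data and the model under evaluation.

```python
import random
from statistics import mean


def make_samples(rng, n_per_class):
    """Synthetic two-class data: 2-D Gaussian blobs (a stand-in for real data)."""
    data = []
    for label, centre in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Nearest-centroid 'model': the mean point of each class seen in training."""
    centroids = {}
    for label in {lbl for _, lbl in train}:
        points = [p for p, lbl in train if lbl == label]
        centroids[label] = (mean(x for x, _ in points), mean(y for _, y in points))
    return centroids


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    def squared_distance(label):
        cx, cy = centroids[label]
        return (point[0] - cx) ** 2 + (point[1] - cy) ** 2
    return min(centroids, key=squared_distance)


def learning_curve(train, test, sizes):
    """Accuracy on the fixed test set after training on the first n samples."""
    curve = []
    for n in sizes:
        model = fit_centroids(train[:n])
        correct = sum(predict(model, p) == lbl for p, lbl in test)
        curve.append((n, correct / len(test)))
    return curve


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    train = make_samples(rng, 100)
    test = make_samples(rng, 100)
    for n, accuracy in learning_curve(train, test, [10, 25, 50, 100, 200]):
        print(f"{n:4d} training samples -> accuracy {accuracy:.2f}")
```

If accuracy stops improving as the training set grows, collecting more data is unlikely to help on its own; if it is still climbing at the largest size, that is a hint about how much more data the full project may need.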
In our experience, the key part of each ML project is data: not only its quantity but also its quality. The initial sprints allow us to evaluate realistically what challenges we will face when working with the client’s data, and this is particularly important for estimating the project. These challenges range from simple things, like mistakes and a lack of standardization in the data, to problems with gathering real-life datasets that could be used for the project.
This way we keep the risk as low as possible for both sides. You don’t have to sign long-term contracts with an external provider, and we don’t have to estimate projects that are highly unpredictable and thus risky. Of course, we won’t deliver the full production solution within the first sprint, but it will be significantly cheaper to verify whether the idea is executable at all, and how long it might take to build a complete system.
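As an illustration of the simple data problems mentioned above, the sketch below counts missing values, labels that differ only in case or whitespace, and exact duplicate rows. It is an assumed, minimal example rather than our actual tooling: the function name audit_records, the field names, and the sample records are all made up for demonstration.

```python
from collections import Counter, defaultdict


def audit_records(records, required_fields, label_field):
    """Return a small data-quality report for a list of dict records."""
    # Count empty or absent values for every required field.
    missing = Counter()
    for row in records:
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1

    # Group label spellings that differ only by case or surrounding whitespace.
    variants = defaultdict(set)
    for row in records:
        raw = row.get(label_field)
        if raw is not None and str(raw).strip():
            variants[str(raw).strip().lower()].add(str(raw))
    inconsistent = {key: sorted(v) for key, v in variants.items() if len(v) > 1}

    # Count rows that are exact copies of an earlier row.
    seen = Counter(
        tuple(sorted((key, str(value)) for key, value in row.items()))
        for row in records
    )
    duplicates = sum(count - 1 for count in seen.values() if count > 1)

    return {
        "rows": len(records),
        "missing": dict(missing),
        "inconsistent_labels": inconsistent,
        "duplicate_rows": duplicates,
    }


if __name__ == "__main__":
    # Hypothetical sample records, invented purely for this example.
    sample = [
        {"id": 1, "reading": "12", "label": "OK"},
        {"id": 2, "reading": "", "label": "ok"},
        {"id": 3, "reading": "7", "label": "Defect"},
        {"id": 3, "reading": "7", "label": "Defect"},
        {"id": 4, "reading": None, "label": " defect "},
    ]
    print(audit_records(sample, ["reading", "label"], "label"))
```

A report like this, run on sample data in the first sprint, gives an early and cheap signal of how much cleaning the data will need before any model is trained.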
Examples of machine learning projects
Companies from a variety of industries can take advantage of machine learning. Notable examples include microbiology, nanoelectronics, transportation (self-driving cars), mining, healthcare, financial services, and manufacturing.
For example, we’ve developed deep learning algorithms that count and classify microorganisms grown on cell-culture dishes. To achieve high precision, we used methods such as deep neural networks, image processing, and object detection. You can read more about this object counting project here.
Another example, from a different industry, is our AI and AR platform for facilitating industrial processes. It allows managers to benefit from augmented reality training by adopting it straight away. The full case study of nsFlow is right here.
Summary
It is possible to develop machine learning solutions without high risk and immense expenses. You don’t need a multi-million budget to build them and achieve what you set out to do, as long as you proceed step by step. The machine learning development process described above will help you achieve your goals with more peace of mind.