This blog is authored by Hai Nguyen, Senior Data Scientist at Gousto.
Gousto is the UK’s best value recipe box, serving up more recipe choices and variety than anyone else in the market. The recipes are designed by professional chefs and include a diverse range of flavors and cuisines, allowing you to try new recipes without having to source exotic ingredients or spend time measuring portions. Each box contains pre-measured, fresh ingredients and easy-to-follow recipes so you can quickly prepare delicious, nourishing meals at home without any food waste.
We sell millions of meals every month, and many of these orders come from the recommendations, offers, and discounts generated by our recipe machine learning (ML) recommendation engine. Ensuring that our models recommend the best possible recipes is a challenging task.
We use the Databricks Lakehouse Platform as our ML pipeline to build the recommendation engine. Using Databricks has helped cut down our model deployment time by 50%. Additionally, we have been able to improve the performance massively, translating to a significant commercial impact on the business.
Changing Weekly Menus
At Gousto, we use machine learning to create recipe recommendations. This capability is necessary because we have a diverse range of customers with different tastes and dietary preferences. We do not want to recommend a meat recipe to a vegetarian or one with cheese to a lactose-intolerant customer.
Every week, our menu features about 75 changing recipes, with several variants (e.g., veggie/fish) per recipe. With so many customers, it would be impossible to manually create recommendations, discounts, and offers tailored to each individual's preferences. To tackle this challenge, our team developed Rouxcommender, a recipe recommendation engine.
ML allows us to create highly accurate recommendations based on customer data, previous customer orders, and our rich recipe database. Our team developed deep learning models that use transformer-based architectures to learn customers' flavor preferences from their purchasing behavior. These models look at a number of factors, including previous orders, how easy the recipe is to cook, the number of vegetables in the recipe, and many more.
This process allows us to learn from our customers' purchasing patterns, predict what they will order, and suggest recipes to them. As a result, our customers can be confident that they will only see recipes they are likely to enjoy on our weekly menu. Despite the success of our engine, it didn't come without some challenges along the way:
Lack of Visibility
Our engine produces a massive amount of data. Each week, our team has to create and store millions of customer recommendations because we cannot be certain which customers will order. Querying the data produced during this process was inherently tricky, and our data analysts had little to no visibility of this recommendation data.
We used a number of model-building techniques along with several repositories; our team had to work across four data repositories just to change the ML model. Using these different techniques and repositories was a clunky process, so we were looking for a solution that would let us perform the entire ML model workflow in one environment.
Once a model is built, our team must ensure that it will perform as expected. This process requires our team to use different tools. Integrating the tools and ensuring the models were properly deployed on various platforms was challenging.
Long Model Development Time
Model development was a cumbersome process that took about six months. We did not have a platform that let us easily integrate our tools and deploy the model, and we needed a solution that could help us track each model and experiment.
Using Databricks to Make Dish Recommendations
At Gousto, we use Databricks to improve our machine learning recommendation models and processes.
One Environment and Improved Visibility
We can handle everything in one environment, from ETL and data collection to interactive dashboards and data monitoring. This way, we can use that data to build the models and, at the same time, track their performance with MLflow. Instead of spending time integrating tools and platforms for A/B tests, we can focus on continuous improvement.
We have also been able to improve our data visibility throughout the organization. Databricks’ quick access to data has made it possible for our analysts to query and explore the data easily, which has helped us move our projects forward quickly.
Improved Model Performance and Deployment Speed
One of our biggest wins has been the speed at which we can deliver more value. In particular, the MLflow tracking server, the MLflow Model Registry, and other integrated Databricks features such as Delta, the SQL workspace, and Petastorm have been invaluable in allowing us to track the iterations and runs of our models. We use MLflow extensively, and because all of its features are already integrated into Databricks, we can build and iterate on models at a much faster pace.
We can now iterate on models much more quickly, allowing us to keep up with the latest customer needs and behavior patterns. Instead of spending six months on each iteration, we can deliver models 50% faster: we shipped the first version of the model last year, and this year we have delivered two new models in two consecutive quarters.
These two models have also had a really significant commercial impact on the business. Databricks has allowed us to iterate quickly and try different things, which has led to a lot of success, especially regarding the performance of our top recipes.
Since adopting Databricks in our model development pipeline, we have been able to recommend much more relevant recipes to our customer base, and sales from the top-recommended recipes have increased significantly.
We’re looking forward to continuing to use Databricks to drive value for Gousto. Our team hopes to use future Databricks capabilities to further improve our model iteration and testing process.