The text generation capabilities of ChatGPT, Dolly and the like are truly impressive, and they are rightfully recognized as major steps forward in the field of AI. But as the excitement around the future heralded by these models settles in, many organizations are beginning to ask: how can we make use of these technologies today?
As with most new technologies, the full range of applications for these large language models (LLMs) is not yet known, but we can identify several areas where they can augment and enhance the work we do today – as we shared in a previous blog. Scenarios where individuals must summarize large volumes of written content in order to provide informed opinions or guidance are a natural fit.
Customers Need Help Searching Product Catalogs
One area where we see an immediate need that can help drive growth for retailer and consumer goods companies (and not just cut costs) is in the area of search. With the rapid expansion of online activity in the last few years, more and more customers are engaging online outlets for a wider range of needs. In response, many organizations have rapidly expanded the range of content and goods they make available online to better ensure customers have access to the items they want.
While more is often better, many online sites hit a tipping point beyond which the sheer number of offerings makes it harder for customers to find what they are looking for. Without the precise terms to locate a specific widget or an article on a narrowly defined topic, consumers find themselves frustrated, scrolling through lists of items that just aren’t quite right.
Using LLMs, we can task a model with poring over product descriptions, written content or the transcripts associated with audio recordings and responding to user searches with suggestions for things relevant to their prompts. Users don’t need the precise terms to find what they are looking for, just a general description with which the LLM can orient itself to their needs. The end result is a powerful new experience that leaves users feeling as if they have received personalized, expert guidance as they engage the site.
Fine-Tuning Ensures Tailored Search Results
To build such a solution, organizations do not need to subscribe to third-party services. Like most machine learning models available today, most LLMs are built on open source technologies and are licensed for a broad range of uses. Many come pre-trained on large volumes of data from which they have already learned many of the language patterns we wish to support. Bear in mind, however, that this knowledge may inherit usage restrictions that block some use cases.
Pre-trained LLMs can be used to greatly reduce the content requirements and training times associated with bringing a model online. As proven by Databricks’s Dolly 2.0 model, if trained on even a relatively small volume of content, these models can perform content summarization and generation tasks with impressive acumen. And to be effective in searching a specific body of documents, the model doesn’t even need to be trained specifically on it.
But with fine-tuning, we can adjust the orientation of the model toward the specific content it is intended to work with. By taking a pre-trained model and engaging it in additional rounds of training on the product descriptions, product reviews, written posts, transcripts, etc. that make up a specific site, we improve the model's ability to respond to user prompts in a manner consistent with that content, making fine-tuning a worthwhile step for many organizations.
Getting Started Enabling LLM-based Search
So, how does one go about doing all of this? The answer is surprisingly straightforward. To get started:
- Download a pre-trained, open source LLM model
- Use the model to transform the product text into an embedding
- Configure the model to use those embeddings as the body of knowledge against which to focus its search
- Deploy the model as a microservice you can integrate with your various applications
These steps will provide you a basic, out-of-the-box search capability that is surprisingly robust. To fine-tune the search:
- Collect a set of searches and product results
- Label the results for their relevance
- Fit the model to these results, and
- Repeat steps 2-4 above
As simple as these steps appear, there are some new terms and concepts that are worth exploring.
Understanding Some Key Concepts
First, where does one find a pre-trained, open source LLM? Dolly 2.0, mentioned earlier, is one such model, and it can be freely downloaded and widely used per the licensing terms presented on its download site. Hugging Face is another popular place to discover (large and otherwise) language models that are ideal for what the AI community refers to as semantic search. With a little more search effort, you can probably find many other LLMs available for download, but do take a moment to review the licensing terms associated with each to understand their availability for commercial re-use.
Next, what is an embedding? The answer to this question can get quite technical, but in a nutshell, an embedding is a numerical representation of a sentence, paragraph or document. The mechanics of how these are generated are buried within the model, but the key thing to understand is that when a model converts two documents to embeddings, the mathematical distance (difference) between the numerical values tells us something about the degree of similarity between them.
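To make the distance idea concrete, here is a toy illustration using hand-made three-dimensional vectors (real model embeddings typically have hundreds of dimensions, and the documents attached to each vector below are purely illustrative):

```python
# Toy illustration of how distance between embeddings signals similarity.
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: near 1.0 means similar
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

doc_a = [0.9, 0.1, 0.0]   # e.g. "oak dining table"
doc_b = [0.8, 0.2, 0.1]   # e.g. "walnut kitchen table"
doc_c = [0.0, 0.1, 0.9]   # e.g. "cotton bath towel"

print(cosine_similarity(doc_a, doc_b))  # high: similar documents
print(cosine_similarity(doc_a, doc_c))  # low: dissimilar documents
```

Cosine similarity is one common choice of distance measure for embeddings; Euclidean distance and dot products are also widely used.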
How are embeddings coupled with the model? This part is a little more complicated, but open source tools like LangChain provide the building blocks for it. The key thing to understand is that the embeddings that form the details of the product catalog we wish to search are not searchable from within a traditional relational database or even a NoSQL data store. A specialized vector store needs to be used instead.
Next, what is a microservice? A microservice is a lightweight application that receives a request, such as a search phrase, and returns a response. Packaging the model and the embeddings it will search within a microservice not only makes the search functionality widely accessible to applications; most microservice infrastructure solutions also support elastic scalability, so you can allocate resources to the service to keep up with demand as it ebbs and flows. This is essential for managing uptime while controlling cost.
Finally, how does one label search results? While a lot of the items addressed in the previous questions get very technical, this one is surprisingly simple. All you need is a set of queries and the results returned for them. (Most search engines used on ecommerce sites provide functionality for this.) This data set doesn’t need to be especially large to be effective, though the more search results available, the better.
A human then must assign a numerical score to each search result to indicate its relevance to the search phrase. While this can be made complicated, you will likely find very good results by simply assigning relevant search results a value of 1.0, irrelevant search results a value of 0.0, and partially relevant results a value somewhere in between.
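The resulting labeled data is nothing more than a table of search phrases, returned products, and scores. A toy example, using the simple 0.0/0.5/1.0 scheme just described (the queries and products are invented):

```python
# A toy example of labeled search results: (query, product, relevance)
labeled_results = [
    ("coffee table", "walnut coffee table",    1.0),  # relevant
    ("coffee table", "glass end table",        0.5),  # partially relevant
    ("coffee table", "stainless cookware set", 0.0),  # irrelevant
]

for query, product, score in labeled_results:
    print(f"{query!r} -> {product!r}: {score}")
```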
Want to See Exactly How This Is Done?
At Databricks, our goal has always been to make data and AI technologies accessible to a wide variety of organizations. With that in mind, we’ve developed an online search solution accelerator using the Wayfair Annotation Dataset (WANDS). This dataset provides descriptive text for 42,000+ products on the Wayfair website and 233K labeled results generated from 480 searches.
Using an open source model from Hugging Face, we first assemble an out-of-the box search with no fine-tuning and are able to deliver surprisingly good results. We then fine-tune the model using our labeled search results, boosting search performance considerably. These models are then packaged for deployment as a microservice hosted with Databricks model serving.
All the gory details of this work are presented in four notebook assets that you can freely download here. The notebooks are annotated with descriptive content that clarifies the steps being performed and the alternative paths organizations may take to better meet their specific needs. We encourage you to first run these notebooks as-is using the publicly available data and then borrow any code you need to get your own search capabilities off the ground.