Retail Reference Architecture Part 2: Approaches to Inventory Optimization

MongoDB

#Technical

Series:

  1. Building a Flexible, Searchable, Low-Latency Product Catalog
  2. Approaches to Inventory Optimization
  3. Query Optimization and Scaling
  4. Recommendations and Personalizations

In part one of our series on retail reference architecture, we looked at some best practices for how a high-volume retailer might use MongoDB as the persistence layer for a large product catalog. That involved index, schema, and query optimization to ensure our catalog could support features like search, per-store pricing, and browsing with faceted search, all with high performance. Over the next two posts we will look at similar types of optimization, applied to an entirely different aspect of the retail business: inventory.

A solid central inventory system that is accessible across a retailer’s stores and applications is a large part of the foundation needed for improving and enriching the customer experience. Here are just a few of the features that a retailer might want to enable:

  • Reliably check real-time product availability.
  • Give the option for in-store pick-up at a particular location.
  • Detect the need for intra-day replenishment if there is a run on an item.

The Problem with Inventory Systems

These features seem basic, but they present real challenges given the types of legacy inventory systems commonly used by major retailers. In these systems, individual stores keep their own field inventories, which report data back to a central RDBMS at a set interval, usually nightly. That RDBMS then reconciles and categorizes all of the data received that day and makes it available for analytics and reporting, as well as for consumption by external and internal applications. Commonly there is also a caching layer between the RDBMS and those applications, since relational databases are often not well-suited to the transaction volume such clients require, particularly consumer-facing mobile or web apps.

So the problem with the status quo is pretty clear. The basic setup of these systems isn’t suited to providing a continually accurate picture of how much inventory we have and where it is located. There is also the added complexity of maintaining multiple systems, i.e. caching, persistence, etc. MongoDB, however, is ideal for supporting these features with a high degree of accuracy and availability, even when individual retail stores are widely dispersed geographically.

Design Principles

To begin, we determined that the inventory system in our retail reference architecture needed to do the following:

  • Provide a single view of inventory, accessible by any client at any time.
  • Be usable by any system that needs inventory data.
  • Handle a high-volume, read-dominated workload, i.e. inventory checks.
  • Handle a high volume of real-time writes, i.e. inventory updates.
  • Support bulk writes to refresh the system of record.
  • Be geographically distributed.
  • Remain horizontally scalable as the number of stores or items in inventory grows.

In short, what we needed was to build a high performance, horizontally scalable system where stores and clients over a large geographic area could transact in real-time with MongoDB to view and update inventory.

Stores Schema

Since a primary requirement of our use case was to maintain a centralized, real-time view of total inventory per store, we first needed to create the schema for a stores collection so that we had locations to associate our inventory with. The result is a fairly straightforward document per store:

{
	"_id":ObjectId("78s89453d8chw28h428f2423"),
	"className":"catalog.Store",
	"storeId":"store100",
	"name":"Bessemer Store",
	"address":{
		"addr1":"1 Main St.",
		"city":"Bessemer",
		"state":"AL",
		"zip":"12345",
		"country":"USA"
	},
	"location":[-86.95444, 33.40178],
	…
}

We then created the following indices to optimize the most common types of reads on our store data (a sketch of the corresponding createIndex calls follows the list):

  • {"storeId":1},{"unique":true}: Get inventory for a specific store.
  • {"name":1}: Get a store by name.
  • {"address.zip":1}: Get all stores within a zip code, i.e. a store locator.
  • {"location":"2dsphere"}: Get all stores around a specified geolocation.
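A minimal sketch of how these indices might be created in the mongo shell; the collection and field names come from the sample document above, and the exact syntax is an assumption for current MongoDB versions:

// Indices on the stores collection.
db.stores.createIndex({ "storeId": 1 }, { unique: true }) // one document per store
db.stores.createIndex({ "name": 1 })                      // lookup by store name
db.stores.createIndex({ "address.zip": 1 })               // store locator by zip code
db.stores.createIndex({ "location": "2dsphere" })         // proximity queries on the location field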

Of these, the location index is especially useful for our purposes, as it allows us to query stores by proximity to a location, e.g. a user looking for the nearest store with a product in stock. To take advantage of this in a sharded environment, we used a geoNear command that retrieves the documents whose ‘location’ attribute is within a specified distance of a given point, sorted nearest first:

db.runCommand({
	geoNear: "stores",
	near: {
		type: "Point", // a GeoJSON point
		coordinates: [-82.8006, 40.0908]
	},
	maxDistance: 10000.0, // in meters when 'near' is a GeoJSON point
	spherical: true // required for 2dsphere indexes
})
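Note that the geoNear command was removed in MongoDB 4.2, so on newer releases a roughly equivalent query can be written with the $geoNear aggregation stage instead; a sketch, using the same point and distance:

db.stores.aggregate([
	{
		$geoNear: {
			near: { type: "Point", coordinates: [-82.8006, 40.0908] },
			distanceField: "distance", // computed distance is written to this field
			maxDistance: 10000.0,      // in meters for GeoJSON points
			spherical: true
		}
	}
])

As with the command, results come back sorted from nearest to farthest.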

This schema gave us the ability to locate our objects, but the much bigger challenge was tracking and managing the inventory in those stores.

Inventory Data Model

Now that we had stores to associate our items with, we needed to create an inventory collection to track the actual inventory count of each item and all of its variants. Some trade-offs were required here, however. To minimize both the number of round trips to the database and the need for application-level joins, we decided to duplicate data from the stores collection into the inventory collection. The document we came up with looked like this:

{
	"_id":"902372093572409542jbf42r2f2432",
	"storeId":"store100",
	"location":[-86.95444, 33.40178],
	"productId":"20034",
	"vars":[
		{"sku":"sku1", "quantity":5},
		{"sku":"sku2", "quantity":23},
		{"sku":"sku3", "quantity":2},
		…
	]
}

Notice first that we included both the 'storeId' and 'location' attributes in our inventory document. Clearly 'storeId' is necessary so that we know which store has which items, but what happens when we query for inventory near the user? Both the inventory data and the store's location are required to complete the request. By adding geolocation data to the inventory document, we eliminate the need for a separate query against the stores collection, as well as a join between the stores and inventory collections.
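For example, once the product/geo index described below is in place, a single geoNear command against the inventory collection can answer "which nearby stores have this product?" without touching the stores collection. A minimal sketch, reusing the sample product ID above and the coordinates from the earlier store query:

db.runCommand({
	geoNear: "inventory",
	near: {
		type: "Point",
		coordinates: [-82.8006, 40.0908]
	},
	maxDistance: 10000.0, // in meters
	spherical: true,
	query: { "productId": "20034" } // restrict results to one product's inventory
})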

For our schema we also decided to represent inventory at the productId level. As noted in part one of our retail reference architecture series, each product can have many, even thousands, of variants based on size, color, style, etc., and all of these variants must be represented in our inventory. So the question is: should we favor larger documents that contain a potentially large variants array, or many more documents that each represent inventory at the variant level? In this case, we favored larger documents to minimize data duplication and to decrease the total number of documents in our inventory collection that would need to be queried or updated.

Next, we created our indices (see the createIndex sketch after the list):

  • {storeId:1}: Get all items in inventory for a specific store.
  • {productId:1, storeId:1}: Get inventory of a product for a specific store.
  • {productId:1, location:"2dsphere"}: Get all inventory of a product within a specific distance.
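As with the stores collection, a sketch of how these might be created (assumed syntax; note that a 2dsphere index can participate in a compound index alongside the product lookup):

// Indices on the inventory collection.
db.inventory.createIndex({ "storeId": 1 })                           // all inventory for a store
db.inventory.createIndex({ "productId": 1, "storeId": 1 })           // one product's inventory at one store
db.inventory.createIndex({ "productId": 1, "location": "2dsphere" }) // one product's inventory near a point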

It’s worth pointing out here that we chose not to include an index on 'vars.sku'. The reason is that it wouldn't actually buy us very much, since we can already look up inventory by 'productId'. So, for example, a query to get a specific variant SKU looks like this:

db.inventory.find(
	{
		"storeId":"store100",
		"productId":"20034",
		"vars.sku":"sku11736"
	},
	{"vars.$":1}
)

This query doesn't actually benefit much from an added index on 'vars.sku'. Our index on 'productId' already gets us to the right document, so an index on the variant is unnecessary. In addition, because the variants array can have thousands of entries, an index on it could take up a large block of memory, decreasing the number of documents that can be held in memory and slowing queries. All things considered, that's an unacceptable trade-off given our goals.
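One thing this match pattern does make easy, even without a 'vars.sku' index, is targeting an individual variant for a write. A hedged sketch of what decrementing a variant's on-hand quantity might look like using the positional $ operator (the SKU shown is just the illustrative one from the query above):

// Match the document by store and product, and the array element by SKU;
// the positional $ then updates only the matched variant's quantity.
db.inventory.updateOne(
	{
		"storeId": "store100",
		"productId": "20034",
		"vars.sku": "sku11736"
	},
	{ "$inc": { "vars.$.quantity": -1 } }
)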

So what makes this schema so good anyhow? We’ll take a look in our next post at some of the features this approach makes available to our inventory system.

Learn More

To discover how you can re-imagine the retail experience with MongoDB, read our white paper. In this paper, you'll learn about the new retail challenges and how MongoDB addresses them.


Learn more about how leading brands differentiate themselves with technologies and processes that enable the omni-channel retail experience.

Read our guide on the digitally oriented consumer


<< Read Part 1

Read Part 3 >>