What Is a Feature Store Anyway?
AUTHOR
Sara Hoormann
As Artificial Intelligence (AI) continues to become more prevalent in supply chain, you may start to notice the term feature store pop up in related articles and discussions. Whether you are a supply chain executive, data science manager, or supply chain analyst, there are likely two main questions that follow: “What is it?” and “Why should I care?”
Before we jump into exploring what a feature store is, however, we have to first define a feature. This part should be easy. You probably use features in many ways in your daily life. The most natural way to think about the concept of a feature is when describing yourself to others. For example, my features include being female, 5’9” tall, and having blond hair. I might tell a new acquaintance about these features so they can gauge whether I’m among the people in the coffee shop where we’ve agreed to meet.
But what if I needed someone to find me in a crowded airport instead? I might need to get a little more creative with my features, right? More detailed features might include the length of my hair, whether I typically wear a hat, or if I commonly travel alone or with others. As the features I use get more creative, there is often some trial and error involved in determining which ones are the most useful in picking me out of a crowd.
Have you ever played the game Guess Who? The winner is typically the person who can come up with the best features, narrowing the sea of faces down to one before their opponent!
This is very similar to the way a form of AI called machine learning (ML) works. ML models require features as they build internal rules to predict answers based on the signals provided.
Let’s switch to a supply chain example. Consider the problem of predicting whether a product will stock out or not. Features included in a related model might be as straightforward as the product’s category, its associated lead time, or its usual demand variability. These features typically already exist, out of the box, in your IT systems. However, more creative features are often needed to get the most predictive power out of your model. Other useful signals might include the demand of a set of complementary products, the number of stockouts in the last six weeks, the distribution of orders by day of the week, or even weather patterns near suppliers. These features are more complicated, and typically don’t already exist in your available data. They must be created, calculated, and delivered to the ML models.
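To make one of these engineered features concrete, here is a minimal sketch of "the number of stockouts in the last six weeks". The data layout, the SKU names, and the choice to count a stockout as a day with zero units on hand are all assumptions made for illustration, not a standard definition.

```python
from datetime import date, timedelta

# Hypothetical daily inventory snapshots: (product_id, snapshot_date, on_hand_units).
snapshots = [
    ("SKU-1", date(2023, 12, 15), 0),  # older than six weeks, should be ignored
    ("SKU-1", date(2024, 1, 20), 5),
    ("SKU-1", date(2024, 2, 5), 0),
    ("SKU-1", date(2024, 2, 10), 0),
    ("SKU-2", date(2024, 2, 9), 0),
]

def stockouts_last_six_weeks(rows, product_id, as_of):
    """Count days with zero on-hand units in the six weeks ending at as_of.

    Assumption: a "stockout" is any snapshot day with zero units on hand.
    """
    window_start = as_of - timedelta(weeks=6)
    return sum(
        1
        for pid, day, on_hand in rows
        if pid == product_id and window_start < day <= as_of and on_hand == 0
    )

feature_value = stockouts_last_six_weeks(snapshots, "SKU-1", date(2024, 2, 12))
print(feature_value)
```

Even in this tiny example, someone had to decide what counts as a stockout and where the six-week window begins and ends.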
This creative piece of what the industry calls feature engineering is often viewed as a huge competitive advantage for companies. The more complex the features, however, the more time-consuming and complicated they are to engineer. Data scientists often spend inordinate amounts of time manipulating and transforming data, leaving less time for more valuable (and fun) data science work.
Feature consistency has also been a big issue for models delivered by data scientists in recent years. The complexity of feature creation requires that critical decisions be made in the process, like deciding on a specific definition of a ‘stockout’, determining what distribution type to apply, or deciding what exact proximity of weather pattern to a supplier is considered relevant. Until now, data scientists have often created features on the fly as they build, test, and deploy ML models. This makes standardizing very difficult, both across models and between the training and production phases of a single model. Even small changes in the way a feature is engineered from one stage of model building to another can have significant impacts on the quality of predictions.
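A small sketch shows how easily this kind of drift can creep in. Both definitions of a stockout below are hypothetical, as is the threshold of three units. The point is only that two reasonable-sounding definitions give the model different numbers for the same data.

```python
# Hypothetical on-hand units for one product over seven days.
on_hand = [12, 3, 0, 0, 2, 0, 8]

def stockout_days_training(units):
    # Assumed training-time definition: a stockout is a day with zero units.
    return sum(1 for u in units if u == 0)

def stockout_days_production(units, threshold=3):
    # Assumed production-time definition: a stockout is a day at or below a threshold.
    return sum(1 for u in units if u <= threshold)

training_value = stockout_days_training(on_hand)
production_value = stockout_days_production(on_hand)

# The model was trained on one number and is now being fed another.
print(training_value, production_value)
```

A model trained on the first definition and served the second sees a feature that means something different from what it learned.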
As you might have guessed, this is where the advent of the feature store comes into play. At the highest level, a feature store substantially reduces the effort required to put machine learning solutions into operation across your organization. The less time this takes, the more quickly you’re able to deliver predictions, ultimately giving you more prediction bang for your buck. Simple as that. At its core, a feature store provides data scientists (or anyone deploying machine learning models) with access to ready-made features on demand.
Feature stores also provide the ability to deliver features quickly and consistently for ongoing use. Whether you require batch predictions (i.e., pulling predictions at set intervals of time) or real-time predictions (i.e., constantly recomputing values as data changes), the feature store precomputes each feature and has it ready to be called upon where and when you need it.
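The precompute-then-serve idea can be sketched in a few lines. This is a toy illustration under simple assumptions, not how any particular feature store product works: the feature names, the in-memory dictionary, and the functions get_online and get_batch are all hypothetical.

```python
# Hypothetical raw data: units ordered per product over the last seven days.
daily_orders = {
    "SKU-1": [4, 6, 5, 7, 3, 0, 3],
    "SKU-2": [1, 0, 2, 1, 0, 0, 3],
}

def precompute_features(raw):
    """Scheduled job: compute every feature once and store it by product."""
    return {
        product: {"orders_7d_total": sum(units), "orders_7d_max": max(units)}
        for product, units in raw.items()
    }

# The precomputed table stands in for the feature store's storage layer.
feature_table = precompute_features(daily_orders)

def get_online(product):
    """Real-time path: a single lookup for one product at request time."""
    return feature_table[product]

def get_batch(products):
    """Batch path: lookups for many products at once, from the same table."""
    return [feature_table[p] for p in products]

print(get_online("SKU-1"))
print(get_batch(["SKU-1", "SKU-2"]))
```

Because both paths read from the same precomputed table, a batch job and a real-time request see identical feature values.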
The biggest benefit, however, is the ability to standardize and organize your features. As we mentioned previously, data scientists often start from scratch and create features as they need them. A feature store links directly to all feature definitions and presents an easily searchable catalog of features previously created by you or others across your organization.
Let’s consider another supply chain example: a model that predicts whether or not an order will fail to deliver on time. One model feature might be the average transit time to customers over the last week. This feature requires multiple data sources and many decisions on how exactly to calculate “transit time”. With a feature store, modelers can now simply jump in and search for whether they (or someone else) have already created a similar feature. If so, they have already saved tons of time, and can be sure they are using a standard method for engineering the feature so that their work is consistent. On top of this, the feature can also be recorded in code like a variable, linking back to a single feature definition. This way you (or others) can edit or update the “transit time” feature and it will automatically update all the other models currently utilizing that feature as well.
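Here is a minimal sketch of that idea: a searchable catalog in which each feature name points to a single definition. The FeatureRegistry class, the feature name avg_transit_days, and both ways of calculating transit time are hypothetical and exist only for illustration. Real feature stores offer far more than this.

```python
from statistics import mean

class FeatureRegistry:
    """Toy catalog mapping a feature name to one shared definition."""

    def __init__(self):
        self._definitions = {}

    def register(self, name, fn, description=""):
        # Registering an existing name replaces its definition for everyone.
        self._definitions[name] = (fn, description)

    def search(self, keyword):
        key = keyword.lower()
        return sorted(
            name
            for name, (_, desc) in self._definitions.items()
            if key in name.lower() or key in desc.lower()
        )

    def compute(self, name, data):
        fn, _ = self._definitions[name]
        return fn(data)

# Hypothetical shipments, with days counted from an arbitrary start.
shipments = [
    {"ordered": 0, "shipped": 1, "delivered": 4},
    {"ordered": 2, "shipped": 2, "delivered": 6},
]

registry = FeatureRegistry()
registry.register(
    "avg_transit_days",
    lambda rows: mean(r["delivered"] - r["shipped"] for r in rows),
    description="Average transit time to customers, ship date to delivery",
)

print(registry.search("transit"))  # a modeler finds the existing feature
before = registry.compute("avg_transit_days", shipments)

# Someone updates the shared definition to measure from the order date instead.
registry.register(
    "avg_transit_days",
    lambda rows: mean(r["delivered"] - r["ordered"] for r in rows),
    description="Average transit time to customers, order date to delivery",
)
after = registry.compute("avg_transit_days", shipments)
print(before, after)
```

Every model that asks for avg_transit_days by name picks up the new definition on its next call, with no change to the model code itself.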
Finally (and maybe most importantly), organizations are now able to ensure they get their data science money’s worth. As I mentioned before, really great features take a certain kind of genius to deliver, and good data scientists are paid well as a result. Organizations can put this genius to work over and over again with the help of the feature store.
Have I sold you? While that wasn’t my intention, I do ‘predict’ (see what I did there) that machine learning models will provide a huge competitive advantage for supply chains over the coming years. Organizations that take predictions past the typical ‘AI-driven demand forecasting’ and search out ways to embed machine learning into the decisions they make will come out on top in the long run. Whether you are just starting your journey with predictions or ramping up to the next level, consider feature stores as a way to greatly improve data organization, feature consistency, and machine learning efficiency.