Feature Store
Summit 2026
[ FSS/24 / Presentation ]

Large-Scale Embedding Feature Generation at Uber

Oct 15, 2024, 10:50 AM · 15 min

Embeddings are integral to many of Uber’s top-tier models, driving critical machine learning (ML) systems such as UberEats, HomeFeed, and Ads platforms. This talk explores in depth how embeddings are generated at scale for entities such as eaters and restaurants. These embeddings are used extensively as features in critical downstream models and in nearest-neighbor retrieval systems. We will discuss the entire lifecycle of embeddings, from creation to deployment, along with essential aspects such as versioning, analytics, and monitoring, which ensure the safe and consistent use of embeddings in both offline and online environments. We will also showcase ongoing enhancements to Michelangelo, Uber’s central ML platform, that support embeddings as a new data type alongside numerical and categorical data types. These upgrades make embeddings first-class citizens, promoting embedding reuse and significantly improving ML systems. Through a detailed case study of our HomeFeed ML system, we will demonstrate the tangible benefits of using embeddings, highlighting their impact on business metrics and performance.

[ SPEAKER ]