10 min
Oct 12, 2022
1:05 am

OpenMLDB: An Open-Source Real-Time Feature Platform Computing Consistent Features for Training and Inference

Lu introduces OpenMLDB, an open-source machine learning database that provides a real-time feature platform for ML applications and reduces the cost of taking features from development to deployment.

About this session

Real-time features (features computed from real-time data, with results returned in milliseconds) are essential for many machine learning applications, such as risk analytics, self-driving cars, and personalized recommendation. However, a feature script developed by data scientists (usually in Python or SparkSQL) cannot be deployed directly to online real-time serving because of production requirements such as low latency, high throughput, and high availability. An engineering team must therefore be involved to optimize the code, which takes significant effort. OpenMLDB is an open-source machine learning database that provides a real-time feature platform for ML applications and significantly reduces the cost of taking feature engineering from development to deployment.

OpenMLDB consists of a batch SQL engine (an improved version of Spark), a real-time SQL engine (built from scratch to compute real-time features efficiently), and a unified execution plan generator that keeps feature definition and computation consistent across the two SQL engines. As a result, a feature script that data scientists write in SQL can be deployed directly online while still meeting the low-latency, high-throughput, and high-availability requirements of online real-time feature computation.
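The consistency idea can be illustrated with a small Python sketch. It is not OpenMLDB's API or implementation: `WindowFeature`, `evaluate`, `batch_features`, and `online_features` are hypothetical names, and the data is invented for illustration. The point is only that when the batch path and the online path both evaluate one shared feature definition, training and serving values agree.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical row layout: (key, timestamp in seconds, amount).
Row = Tuple[str, int, float]


@dataclass(frozen=True)
class WindowFeature:
    """Hypothetical feature definition: aggregate amounts per key over a trailing window."""
    window_seconds: int
    agg: Callable[[List[float]], float]


def evaluate(feature: WindowFeature, earlier: List[Row], request: Row) -> float:
    """Shared evaluation logic used by both paths (stands in for a unified plan)."""
    key, ts, amount = request
    in_window = [
        a for (k, t, a) in earlier
        if k == key and ts - feature.window_seconds <= t <= ts
    ]
    # The request row itself is always part of its own window.
    return feature.agg(in_window + [amount])


def batch_features(feature: WindowFeature, rows: List[Row]) -> List[float]:
    """Offline path: compute the feature for every historical row at once."""
    ordered = sorted(rows, key=lambda r: r[1])
    return [evaluate(feature, ordered[:i], row) for i, row in enumerate(ordered)]


def online_features(feature: WindowFeature, stream: List[Row]) -> List[float]:
    """Online path: answer one request at a time against rows stored so far."""
    store: List[Row] = []
    results: List[float] = []
    for row in stream:
        results.append(evaluate(feature, store, row))
        store.append(row)
    return results


# Assumed example: sum of amounts per key over the trailing 60 seconds.
sum_60s = WindowFeature(window_seconds=60, agg=sum)
rows = [("u1", 0, 10.0), ("u2", 5, 3.0), ("u1", 30, 20.0), ("u1", 100, 5.0)]

training = batch_features(sum_60s, rows)
serving = online_features(sum_60s, rows)
assert training == serving
```

In this toy version consistency comes for free because both paths call the same function. The session describes a harder setting, in which two separate engines are kept consistent by a shared execution plan generator.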

More information is available on the OpenMLDB GitHub repository: https://github.com/4paradigm/OpenMLDB/
