Real-time ML: Accelerating Python for inference (< 10ms) at scale
Learn how a symbolic Python interpreter converts Python into DAGs to accelerate ML pipelines.
Real-time machine learning depends on features and data that by definition can't be pre-computed: detecting fraud or acute conditions like sepsis requires processing events that emerged seconds ago. How do we build an infrastructure platform that executes complex data pipelines end-to-end, on demand, in under 10ms, while meeting data teams where they are, in Python, the language of ML? We'll share how we built a symbolic Python interpreter that accelerates ML pipelines by transpiling Python into DAGs of static expressions. These expressions are optimized and run at scale with Velox, an open-source (~4k GitHub stars) unified query engine in C++ from Meta.
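To make the idea concrete, here is a minimal, hypothetical sketch of symbolic interpretation via operator overloading: calling an ordinary Python function with proxy objects records its operations as a DAG of static expressions instead of computing values. The `Expr`, `col`, and `fraud_score` names are illustrative assumptions, not the speakers' actual implementation; a real system would hand the resulting DAG to an engine like Velox for optimization and execution.

```python
class Expr:
    """A node in a DAG of static expressions, built by tracing Python ops."""
    def __init__(self, op, *args):
        self.op = op
        self.args = args

    # Overloaded operators record the computation instead of performing it.
    def __add__(self, other):
        return Expr("add", self, _lift(other))

    def __mul__(self, other):
        return Expr("mul", self, _lift(other))

    def __gt__(self, other):
        return Expr("gt", self, _lift(other))

    def __repr__(self):
        if self.op in ("col", "lit"):
            return f"{self.op}({self.args[0]})"
        return f"{self.op}({', '.join(map(repr, self.args))})"


def _lift(value):
    """Wrap plain Python values as literal expression nodes."""
    return value if isinstance(value, Expr) else Expr("lit", value)


def col(name):
    """A symbolic reference to an input column/feature."""
    return Expr("col", name)


# A feature written as ordinary Python...
def fraud_score(amount, avg_amount):
    return amount > avg_amount * 3


# ...traced symbolically: calling it with Expr proxies yields a DAG,
# not a value. The DAG can then be optimized and executed by a query engine.
dag = fraud_score(col("amount"), col("avg_amount"))
print(dag)  # gt(col(amount), mul(col(avg_amount), lit(3)))
```

Because the trace captures the full expression tree statically, the backend can apply classic query optimizations (constant folding, common-subexpression elimination, vectorized execution) that the Python interpreter alone cannot.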