Feature Engineering with Hamilton: Write Once, Run Everywhere
Elijah will discuss Hamilton, a lightweight open-source framework in Python that enables data practitioners to cleanly define dataflows.
Write Once, Run Everywhere
Most data transformations are written twice. In feature engineering for machine learning, data scientists regularly have to build, manage, and iterate on batch jobs, then translate those jobs to a service setting to load data and make fresh predictions. At best, this process is an engineering headache. At worst, it results in difficult-to-detect deltas between training and inference, complex code, and highly bespoke infrastructure. In this talk we discuss Hamilton, a lightweight open-source framework in Python that enables data practitioners to define dataflows cleanly and portably. Hamilton places no restrictions on the nature of transformations, allowing data scientists to use their favorite Python libraries. With Hamilton, you can run the same code in your Airflow DAG for training as you would in your FastAPI service for inference, and get the same result.
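To make the "write once" idea concrete, here is a minimal plain-Python sketch of the pattern. It does not use Hamilton or reproduce its API. It only assumes the general idea that each feature is a function whose parameter names refer to raw inputs or to other features. The feature names (spend_per_signup, is_high_spend), the compute helper, and the two entry points are all hypothetical and chosen purely for illustration.

```python
import inspect

# Hypothetical feature definitions. Each function is one node in the dataflow;
# each parameter name refers to a raw input or to another feature function.
def spend_per_signup(spend: float, signups: float) -> float:
    return spend / signups

def is_high_spend(spend_per_signup: float, threshold: float = 10.0) -> bool:
    return spend_per_signup > threshold

FEATURES = {f.__name__: f for f in (spend_per_signup, is_high_spend)}

def compute(name, inputs, cache=None):
    """Resolve one feature by recursively resolving its parameters by name."""
    cache = {} if cache is None else cache
    if name in inputs:          # raw input (or an explicit override)
        return inputs[name]
    if name in cache:           # already computed during this call
        return cache[name]
    func = FEATURES[name]
    kwargs = {}
    for param in inspect.signature(func).parameters.values():
        if param.name in inputs or param.name in FEATURES:
            kwargs[param.name] = compute(param.name, inputs, cache)
        elif param.default is inspect.Parameter.empty:
            raise KeyError(f"missing input for {name}: {param.name}")
        # otherwise the function's own default value is used
    cache[name] = func(**kwargs)
    return cache[name]

# Batch context (e.g. a training job): apply the definitions to many rows.
def batch_job(rows):
    return [compute("is_high_spend", row) for row in rows]

# Online context (e.g. a request handler): apply the same definitions to one payload.
def serve_request(payload):
    return compute("is_high_spend", payload)
```

Because both entry points call the same feature functions, a row seen in training and the same payload seen at inference produce the same value, which is the property the talk is about. A real framework adds far more (validation, dataframe support, execution backends), none of which this sketch attempts.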