Building Machine Learning Systems with a Feature Store
by
Jim Dowling
Description: Building Machine Learning Systems with a Feature Store by Jim Dowling covers the development and operation of batch, real-time, and large language model-based machine learning systems using a unified feature store approach.
ISBN: 9781098165222
I wrote a book on feature stores, published by O'Reilly. The bad query they wrote in ClickHouse could also have been caused by another error: duplicate rows in materialized feature data. Hopsworks, for example, prevents duplicate rows by building on primary key uniqueness enforcement in Apache Hudi. In contrast, Delta Lake and Iceberg do not enforce primary key constraints, and neither does ClickHouse. So they could hit the same bug again because of a bug in feature ingestion, and given that they hacked together their feature store, it is not beyond the bounds of possibility.
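To make the duplicate-row failure mode concrete, here is a minimal sketch in plain Python. It is not Hopsworks, Hudi, or ClickHouse code; the function name `find_duplicate_keys`, the column names, and the sample rows are all hypothetical. It shows how a re-ingested row with the same primary key silently inflates an aggregate when the storage layer does not enforce uniqueness, and how a simple check on the key columns can catch it.

```python
from collections import Counter


def find_duplicate_keys(rows, primary_key):
    """Return {key_tuple: count} for primary-key values seen more than once."""
    counts = Counter(tuple(row[col] for col in primary_key) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical materialized feature rows keyed by (user_id, event_date).
rows = [
    {"user_id": 1, "event_date": "2024-01-01", "clicks_7d": 12},
    {"user_id": 2, "event_date": "2024-01-01", "clicks_7d": 5},
    # Same key written twice, e.g. by a retried ingestion job.
    {"user_id": 1, "event_date": "2024-01-01", "clicks_7d": 12},
]

duplicates = find_duplicate_keys(rows, primary_key=("user_id", "event_date"))

# A naive aggregate over the table counts the duplicated row twice.
total_clicks = sum(row["clicks_7d"] for row in rows)

print(duplicates)
print(total_clicks)
```

A check like this could run as a validation step after ingestion when the table format itself offers no primary key constraint.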
Reference: https://www.oreilly.com/library/view/building-machine-learni...