This document offers recommendations for building machine learning software, drawn from Netflix's experience.
It makes six recommendations. First, be flexible about where and when computation happens by distributing components across offline, nearline, and online systems. Second, think about distribution starting from the outermost levels of the problem, parallelizing across subsets of data, hyperparameters, and machines. Third, design application software for experimentation by sharing components between experiment and production code. Fourth, make algorithms and models extensible and modular by providing reusable building blocks. Fifth, describe input and output transformations together with the models they belong to. Sixth, do not rely solely on metrics for testing; unit test the code as well.
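As one illustration of the fifth and sixth recommendations, the following is a minimal sketch, not Netflix's actual implementation. It assumes a model can be packaged as three plain functions: an input transformation, a scoring step, and an output transformation. Every name here (`PackagedModel`, `FEATURES`, `WEIGHTS`, the feature names, and the threshold) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical feature names and weights, used only for this sketch.
FEATURES = ("hours_watched", "days_since_signup")
WEIGHTS = (0.5, -0.01)


@dataclass(frozen=True)
class PackagedModel:
    """Bundles a scoring function with the transformations around it."""
    encode: Callable[[Dict[str, float]], List[float]]  # raw record -> feature vector
    score: Callable[[List[float]], float]              # feature vector -> raw score
    decode: Callable[[float], str]                     # raw score -> application output

    def predict(self, record: Dict[str, float]) -> str:
        # Experiment and production code both call this one method, so the
        # transformations cannot drift apart from the model they belong to.
        return self.decode(self.score(self.encode(record)))


def encode(record: Dict[str, float]) -> List[float]:
    # Missing features default to 0.0 (an assumption made for this sketch).
    return [float(record.get(name, 0.0)) for name in FEATURES]


def score(features: List[float]) -> float:
    # A linear score: the weighted sum of the features.
    return sum(w * x for w, x in zip(WEIGHTS, features))


def decode(raw: float) -> str:
    # Map the raw score to a decision using an illustrative threshold of 1.0.
    return "recommend" if raw > 1.0 else "skip"


model = PackagedModel(encode=encode, score=score, decode=decode)
```

Because each step is an ordinary function, it can be unit tested directly, independently of any offline metric: for example, asserting that `encode` fills in missing features, or that `model.predict` returns `"skip"` for an empty record.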