A comment on the "Pitfalls" paper on large, static model degradation...

You mention that retraining frequently would solve the problem, but that it's too expensive. You are half right. The reason it's too expensive is not inherent in the problem - it's an issue with the chosen ML technology (deep learning). At Textician, we use very fast regression algorithms. Our customers can retrain entire models overnight on vanilla hardware, with a side benefit that convergence is built in.
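(Textician's actual algorithm isn't specified here, so as a hypothetical illustration only: a closed-form regression like ridge is one way convergence can be "built in" - retraining is a single linear solve, not an iterative optimization loop that may or may not converge.)

```python
import numpy as np

def retrain_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y.
    One linear solve, no learning rate, no epochs - so there is
    no convergence behavior to monitor or tune."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# "Retraining" is just re-running the solve on the latest data:
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
w_true = rng.normal(size=20)
y = X @ w_true + 0.01 * rng.normal(size=1000)
w = retrain_ridge(X, y)
```

On commodity hardware this kind of solve finishes in milliseconds at this scale, which is the contrast being drawn with retraining a large deep-learning model.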

This solves problems of both time and space. As the paper discusses, models degrade over time, but in our application we also face the issue that each installation must deal with different jargon and/or documentation customs. (We work with medical records text, which is highly variable from doctor to doctor and from facility to facility.) Rapid (re)training is the solution here and a competitive advantage for us versus competitors that use one-size-fits-all static ML models or GOFAI rules-based systems.*

So - yes - large, static deep-learning models degrade over time and space, but the solution is not more data or regular retraining. It's picking a more appropriate technology!

* Rule-based systems in this application have 500K+ rules! Tuning them over time is a nontrivial task, let alone tuning them per installation.