Historically, when we deploy a machine learning model into production, the parameters that the model learned during its training on data were the sole driver of the model’s outputs. With the Generative LLMs that have taken the world by storm in the past few years, however, the model parameters alone are not enough to get reliably high-quality outputs. For that, the so-called decoding method that we choose when we deploy our LLM into production is also critical.
Read More