When updating large language models, a common problem arises: the new model may underperform the old one on certain specific tasks, even if its overall quality is higher. For users, this inconsistency can be confusing and frustrating.
To address this, the authors conducted an in-depth study and proposed a method called MUSCLE, whose distinctive goal is to keep the behavior of the new and old models consistent across updates. Specifically, the authors observed that after a model update, some questions the old model answered correctly are now answered incorrectly by the new model. This phenomenon is referred to as a "negative flip."
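The negative-flip idea can be quantified as a simple per-example metric: the fraction of evaluation examples the old model got right but the new model gets wrong. The sketch below is illustrative; the function name and the boolean-correctness representation are assumptions, not from the paper.

```python
def negative_flip_rate(old_correct, new_correct):
    """Fraction of examples answered correctly by the old model
    but incorrectly by the new model (a "negative flip")."""
    if len(old_correct) != len(new_correct) or not old_correct:
        raise ValueError("inputs must be equal-length, non-empty lists")
    flips = sum(1 for o, n in zip(old_correct, new_correct) if o and not n)
    return flips / len(old_correct)

# Per-example correctness of each model on a tiny evaluation set.
old = [True, True, False, True, False]
new = [True, False, False, True, True]
print(negative_flip_rate(old, new))  # 1 flip out of 5 -> 0.2
```

Note that the new model answering a previously wrong question correctly (a positive flip, as in the last example) does not offset a negative flip from the user's perspective.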
To reduce the occurrence of negative flips, MUSCLE trains a dedicated "compatibility adapter." This adapter learns from both the new and old models, keeping the new model's behavior aligned with the old one's while still improving overall performance, giving users a more stable and reliable service.
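One common way to make a new model "learn from both models" is a distillation-style objective: the usual task loss, plus a term pulling the adapted model's output distribution toward the old model's. The sketch below illustrates that general idea in plain Python; the function names, the KL-based penalty, and the weight `alpha` are illustrative assumptions, not the paper's exact formulation.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(p, q):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def compatibility_loss(new_logits, old_logits, task_loss, alpha=0.5):
    """Hypothetical combined objective: task loss plus a distillation
    term that keeps the new model close to the old model's outputs."""
    p_old = softmax(old_logits)
    p_new = softmax(new_logits)
    return task_loss + alpha * kl_div(p_old, p_new)

# When the two models agree exactly, the penalty term vanishes.
print(compatibility_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], task_loss=0.7))
```

Tuning a weight like `alpha` would trade off raw task performance against consistency with the old model's behavior.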