1 item with this tag.

  • papers

    NuMuon

    Paper note on making LLM training explicitly produce lower-rank, more compressible weights by constraining Muon updates with a nuclear-norm budget.