Instant PyTorch Training: Hot-Swapping LLMs without VRAM Unloading

2025-04-21

Loading a large language model can take minutes, which significantly slows down development. This project introduces hot-swapping for PyTorch training code: a background process keeps the model resident in VRAM, so each run starts near-instantaneously. Even after a script exits, the model stays loaded, ready for immediate use on the next run. Remote debugging and a Dear ImGui UI are also supported, boosting developer efficiency. Replace your `from_pretrained` calls and your script launches instantly, with debugging ready to attach.
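
The post doesn't spell out the mechanism at this point, but here is a minimal sketch of one way to build it: a long-lived daemon loads the weights once, and each new script run receives them over a local connection. Because `torch.multiprocessing` registers CUDA IPC reductions with the pickler, sending CUDA tensors transfers handles to the same GPU memory rather than copies. The endpoint, auth key, and `gpt2` checkpoint are all illustrative, not this project's actual API.

```python
# daemon.py -- hypothetical sketch: load once, keep weights in VRAM, and
# hand out CUDA IPC references to every client that connects.
import torch
import torch.multiprocessing  # noqa: F401  (registers CUDA IPC pickle reductions)
from multiprocessing.connection import Listener
from transformers import AutoModelForCausalLM

ADDRESS = ("localhost", 6000)  # illustrative endpoint
AUTHKEY = b"hotswap"           # illustrative shared secret

def main():
    # The expensive load is paid once, when the daemon starts.
    model = AutoModelForCausalLM.from_pretrained("gpt2").cuda()
    state = model.state_dict()  # references to the CUDA tensors, no copy
    with Listener(ADDRESS, authkey=AUTHKEY) as listener:
        while True:
            with listener.accept() as conn:
                # CUDA tensors are pickled as IPC handles, so the client
                # maps the daemon's GPU memory instead of receiving data.
                conn.send(state)

if __name__ == "__main__":
    main()
```

A training script would then fetch the shared weights instead of calling `from_pretrained`, along these lines:

```python
# train.py -- hypothetical client: build the module skeleton without
# allocating weights, then adopt the daemon's CUDA tensors.
import torch
from multiprocessing.connection import Client
from transformers import AutoConfig, AutoModelForCausalLM

with Client(("localhost", 6000), authkey=b"hotswap") as conn:
    state = conn.recv()  # CUDA tensors backed by the daemon's VRAM

config = AutoConfig.from_pretrained("gpt2")
with torch.device("meta"):
    # Meta device: parameters are created without allocating memory.
    model = AutoModelForCausalLM.from_config(config)
# assign=True adopts the shared tensors in place of the meta parameters.
# Caveat for this sketch: non-persistent buffers (e.g. cached masks) are
# not in the state dict and would still need materializing.
model.load_state_dict(state, assign=True)
```

Since the daemon owns the underlying memory, it must outlive every client, which is exactly the "model stays loaded after the script exits" behavior described above.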

Development Hot-Swapping