* [Ollama v0.30.0](https://github.com/ollama/ollama) – Tool for running and managing large language models.

* [LocalAI v4.3.2](https://github.com/mudler/LocalAI) – Self-hosted alternative to popular AI APIs for local inference on consumer-grade hardware.

* [yzma v1.15.0](https://github.com/hybridgroup/yzma) – Go-based library for hardware-accelerated local inference with llama.cpp integration.

* [Gerbil v1.25.0](https://github.com/lone-cloud/gerbil) – Cross-platform desktop app for running large language models locally, with integrated image generation.

* [HelixML 2.11.14](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.

* [llama-swap v219](https://github.com/mostlygeek/llama-swap) – Reliable on-demand model switching across local OpenAI-compatible inference servers, without restarting client applications.