* [LocalAI v4.0.0](https://github.com/mudler/LocalAI) – Self-hosted alternative to popular AI APIs for local inference on consumer-grade hardware.
* [Ollama v0.18.0](https://github.com/ollama/ollama) – Tool for running and managing large language models.
* [Transformer Lab v0.32.0](https://github.com/transformerlab/transformerlab-app) – Cross-platform app for downloading, training, fine-tuning, chatting with, and evaluating large language and diffusion models.
* [node-llama-cpp v3.18.0](https://github.com/withcatai/node-llama-cpp) – Run AI models locally with Node.js bindings for llama.cpp.
* [HelixML v2.9.0](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [yzma v1.11.0](https://github.com/hybridgroup/yzma) – Go-based library for hardware-accelerated local inference with llama.cpp integration.
* [Gerbil v1.20.0](https://github.com/lone-cloud/gerbil) – Desktop app for running large language models locally, with cross-platform support and integrated image generation.
* [llama_cpp.rb v0.24.0](https://github.com/yoshoku/llama_cpp.rb) – Ruby bindings for llama.cpp, enabling easy integration of the library into Ruby applications.
* [llama.rn v0.11.4](https://github.com/mybigday/llama.rn) – React Native binding for running LLaMA model inference, with multimodal support including vision and audio.
* [llama-swap v198](https://github.com/mostlygeek/llama-swap) – Reliable on-demand model switching between local OpenAI-compatible inference servers without restarting applications.