* [LocalAI v4.0.0](https://github.com/mudler/LocalAI) – Self-hosted alternative to popular AI APIs for local inference on consumer-grade hardware.
* [Ollama v0.18.0](https://github.com/ollama/ollama) – Tool for running and managing large language models.
* [Transformer Lab v0.32.0](https://github.com/transformerlab/transformerlab-app) – Cross-platform app for downloading, training, fine-tuning, chatting with, and evaluating large language and diffusion models.
* [node-llama-cpp v3.18.0](https://github.com/withcatai/node-llama-cpp) – Run AI models locally with Node.js bindings for llama.cpp.
* [HelixML v2.9.0](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [yzma v1.11.0](https://github.com/hybridgroup/yzma) – Go-based library for hardware-accelerated local inference with llama.cpp integration.
* [Gerbil v1.20.0](https://github.com/lone-cloud/gerbil) – Desktop app for running large language models locally, with cross-platform support and integrated image generation.
* [llama_cpp.rb v0.24.0](https://github.com/yoshoku/llama_cpp.rb) – Ruby bindings for llama.cpp, enabling easy integration of the library into Ruby applications.
* [llama.rn v0.11.4](https://github.com/mybigday/llama.rn) – React Native binding for running LLaMA model inference, with multimodal support including vision and audio.
* [llama-swap v198](https://github.com/mostlygeek/llama-swap) – Reliable on-demand model switching between local OpenAI-compatible inference servers without restarting applications.