* [LocalAI v4.0.0](https://github.com/mudler/LocalAI) – Self-hosted alternative to popular AI APIs for local inferencing on consumer-grade hardware.
* [Ollama v0.18.0](https://github.com/ollama/ollama) – Tool for running and managing large language models.
* [Transformer Lab v0.32.0](https://github.com/transformerlab/transformerlab-app) – Cross-platform app for downloading, training, fine-tuning, chatting with, and evaluating large language and diffusion models.
* [node-llama-cpp v3.18.0](https://github.com/withcatai/node-llama-cpp) – Run AI models locally with Node.js bindings for llama.cpp.
* [Ollama v0.18.2](https://github.com/ollama/ollama) – Tool for running and managing large language models.
* [node-llama-cpp v3.18.1](https://github.com/withcatai/node-llama-cpp) – Run AI models locally with Node.js bindings for llama.cpp.
* [Ollama v0.18.1](https://github.com/ollama/ollama) – Tool for running and managing large language models.
* [HelixML 2.9.0](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [yzma v1.11.0](https://github.com/hybridgroup/yzma) – Go-based library for hardware-accelerated local inference with llama.cpp integration.
* [Transformer Lab v0.31.1](https://github.com/transformerlab/transformerlab-app) – Cross-platform app for downloading, training, fine-tuning, chatting with, and evaluating large language and diffusion models.
* [HelixML 2.9.6](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.9.5](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.9.4](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.9.3](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.9.1](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [yzma v1.11.1](https://github.com/hybridgroup/yzma) – Go-based library for hardware-accelerated local inference with llama.cpp integration.
* [HelixML 2.8.7](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [llama_cpp.rb v0.24.2](https://github.com/yoshoku/llama_cpp.rb) – Ruby bindings for llama.cpp, enabling easy integration of the library in Ruby applications.
* [HelixML 2.8.6](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.8.5](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.8.4](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [Gerbil v1.20.3](https://github.com/lone-cloud/gerbil) – Desktop app for running large language models locally with cross-platform support and integrated image generation.
* [HelixML 2.8.3](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.8.2](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.8.1](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [llama-swap v198](https://github.com/mostlygeek/llama-swap) – Reliable on-demand model switching between local OpenAI-compatible inference servers without restarting applications.
* [Ollama v0.18.0-rc2](https://github.com/ollama/ollama) – Tool for running and managing large language models.
* [HelixML 2.9.5-rc2](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.9.2-rc2](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.
* [HelixML 2.8.2-rc2](https://github.com/helixml/helix) – Private GenAI stack for deploying AI agents with support for RAG, API calls, vision, and efficient GPU scheduling.