* [llm-server v3.0.0](https://github.com/raketenkater/llm-server) – Hardware-detecting launcher that tunes and starts GGUF inference servers with automatic GPU placement, backend selection, and OpenAI-compatible serving.

* [llama.rn v0.12.4](https://github.com/mybigday/llama.rn) – React Native binding for running LLaMA model inference, with multimodal support for vision and audio.