Open-Source Intelligent Command Layer
Updated Jun 9, 2026 - Python
Make local LLM inference faster with chunk-level KV cache reuse
🎬 Nano Cinema: An all-in-one local AI video production studio. Automatically orchestrates Llama-3 (Script), SDXL-Turbo (Visuals), EdgeTTS (Audio), and LTX-Video (Motion) into a seamless Python workflow. Create cinematic short films with no API fees, full privacy, and professional-grade editing logic included!!! 🚀
Local-first Personal AI Memory OS - RAG over your entire life. Git, notes, calendar, location. 100% offline. No cloud.
Cross-platform desktop tool for chaining local AI models and plugins into powerful, agentic workflows. It supports prompt-driven orchestration, visual DAG editing, and full offline execution.
An intelligent local AI agent powered by open-source LLMs, featuring free web search, hybrid memory, and context-aware query rewriting for real-time, grounded answers.
A web-based editor that opens and edits HWP / HWPX files directly in the browser. You can modify Hangul documents without installing a separate program, and use local AI (Ollama) to get Korean synonym suggestions.
**LocalEcho** is a fully local, open-source Text-to-Speech engine powered by **Qwen3 TTS** models
A lightweight, self-contained Python project for running a local large language model (LLM) with minimal dependencies. It uses TinyLlama-1.1B-Chat-v1.0 with llama-cpp-python for inference, and Rich for a user-friendly console chat interface.
Local AI assistant that lives in your MacBook's notch. Powered by Ollama — chat, vision, web search, and an autonomous file-system Agent Mode (beta), all on-device.
One API. 20+ AI Providers. Smart Routing. Strong Security. Self-hosted OpenAI-compatible gateway with intelligent fallback & battle-tested defenses.
Lightweight Ruby gem for interacting with locally running Ollama LLMs with streaming, chat, and full offline privacy.
Run GGUF models through a GUI: easy, fast, and 100% local.
Free macOS dictation with local AI profiles — Wispr Flow alternative powered by Whisper + llama.cpp. No cloud, no subscription.
LayerRun is a Rust-based local LLM runtime for memory-aware model execution, layer-wise loading, model inspection, and flexible inference serving.
Setup guide for an AI mini PC that hosts local LLMs via LM Studio in an RDP/headless-GUI setup. The example uses a Minisforum AI X1 Pro (AMD Ryzen AI 9 HX 370, 64 GB RAM).
Local AI desktop app built for a single user. No accounts. No teams. No telemetry. Just you and your models.
Local-first desktop AI daemon that runs fully offline. Tracks active desktop context, exposes a CLI, streams responses from local LLMs via Ollama, and runs as a systemd user service. Built for systems-level learning: IPC, daemons, streaming inference, OS integration.
Reproducible LLM inference optimization lab for comparing backends, quantization, latency, VRAM, and output sanity on local hardware.