[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
Updated May 14, 2026 - Python
97% token reduction for AI coding sessions — zero deps, 31 languages, MCP server
CLI proxy that reduces LLM token usage by 60-90%. Declarative YAML filters for Claude Code, Cursor, Copilot, Gemini. rtk alternative in Go.
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025
Less is more. Make your agents smarter and faster. It’s not just about saving time; it’s about the feeling of not wasting it.
A high-performance Semantic Signal Engine with Context OS for Agentic AI. Run your AI with zero noise, pure context, and 90% lower token costs.
A discovery and compression tool for your Python codebase. Creates a knowledge graph for an LLM context window, efficiently outlining your project | Code structure visualization | LLM Context Window Efficiency | Static analysis for AI | Large Language Model tooling #LLM #AI #Python #CodeAnalysis #ContextWindow #DeveloperTools
AI-powered text compression library for RAG systems and API calls. Reduce token usage by up to 50-60% while preserving semantic meaning with advanced compression strategies.
A lightweight tool to optimize your JavaScript / TypeScript project for LLM context windows by using a knowledge graph | AI code understanding | LLM context enhancement | Code structure visualization | Static analysis for AI | Large Language Model tooling #LLM #AI #JavaScript #TypeScript #CodeAnalysis #ContextWindow #DeveloperTools
[CVPR 2025] PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models
ZON → 35-70% cheaper LLM prompts than JSON/TOON. Zero overhead.
[AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models
A lightweight tool to optimize your C# project for LLM context windows by using a knowledge graph | Code structure visualization | Static analysis for AI | Large Language Model tooling | .NET ecosystem support #LLM #AI #CSharp #DotNet #CodeAnalysis #ContextWindow #DeveloperTools
An open-source intelligent codebase visualizer for JavaScript, React, Next.js, and Node.js, built for easy PR review, fast onboarding, and deep architectural understanding
token-ninja routes deterministic shell commands locally — zero LLM calls, ~19µs latency. Works silently inside AI tools via MCP.
⚡ Cut Claude token usage by 90%+ — free, open-source, local-first context compression for Claude Code. Hybrid RAG (BM25 + ONNX vectors), AST chunking, reranking. No API needed.
A discovery and compression tool for your Java codebase. Creates a knowledge graph for an LLM context window, efficiently outlining your project #LLM #AI #Java #CodeAnalysis #ContextWindow #DeveloperTools #StaticAnalysis #CodeVisualization
CLI proxy for coding agents that cuts noisy terminal output while preserving command behavior
DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference
A terse-output skill for AI agents. Shorter replies without stripping technical details.