Open-source toolkit for reliable RAG pipelines: convert PDFs to Markdown, clean documents, inspect chunks, compare chunking strategies, and enrich metadata for LLM applications.
markdown chunking document-processing document-extraction rag pdf-processing chunking-algorithm text-splitter llms langchain retrieval-augmented-generation rag-evaluation rag-pipeline pdf-to-markdown chonkie docling document-chunking semantic-chunker chunk-validation rag-chunk
Updated Jun 6, 2026 - Python