Developer Documentation

Welcome to the QDrant Loader developer documentation! This guide provides everything you need to understand, extend, test, and deploy QDrant Loader. Whether you're contributing to the core project or building custom extensions, you'll find detailed technical information and practical examples here.

🎯 Quick Navigation

Core Development

Architecture Guide - System design, components, and data flow
Extending QDrant Loader - Custom connectors and processors

Quality & Deployment

Testing Guide - Testing strategies, frameworks, and best practices
Deployment Guide - Production deployment, containerization, and CI/CD

Contributing

Best Practices - Pythonic patterns, AI/RAG guidelines, and PR review checklist

Documentation

Documentation Maintenance - Maintaining and updating documentation

🏗️ Architecture Overview

QDrant Loader follows a modular architecture designed for multi-project document ingestion and vector storage:

┌─────────────────────────────────────────────────────────┐
│                   QDrant Loader Core                    │
├─────────────────────────────────────────────────────────┤
│ Data Sources    │ Processing      │ Vector Storage      │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────┐ │
│ │ Connectors  │ │ │ Processors  │ │ │ QDrant Client   │ │
│ │ - Local     │ │ │ - MarkItDown│ │ │ - Collections   │ │
│ │ - Git       │ │ │ - Text      │ │ │ - Vectors       │ │
│ │ - Confluence│ │ │ - Chunking  │ │ │ - Search        │ │
│ │ - Jira      │ │ │ - Embedding │ │ │ - Metadata      │ │
│ │ - PublicDocs│ │ │             │ │ │                 │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────┘ │
├─────────────────────────────────────────────────────────┤
│ MCP Server      │ CLI Interface   │ Configuration       │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────┐ │
│ │ Search APIs │ │ │ Commands    │ │ │ YAML Config     │ │
│ │ - Semantic  │ │ │ - init      │ │ │ - Multi-project │ │
│ │ - Hierarchy │ │ │ - ingest    │ │ │ - Workspace     │ │
│ │ - Attachment│ │ │ - config    │ │ │ - Environment   │ │
│ │             │ │ │ - project   │ │ │ - Validation    │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────┘ │
└─────────────────────────────────────────────────────────┘

🚀 Getting Started for Developers

1. Development Environment Setup

# Clone the repository
git clone https://github.com/martin-papy/qdrant-loader.git
cd qdrant-loader

# Install all workspace packages with development dependencies
# uv automatically creates and manages the virtual environment
uv sync --all-packages --all-extras

# Start QDrant for development
docker run -p 6333:6333 qdrant/qdrant:latest

2. Running Tests

# Run all tests from workspace root
make test

# Run specific package tests
make test-loader
make test-mcp
make test-core

# Run with coverage
make test-coverage

# Or run pytest directly via uv
uv run pytest packages/qdrant-loader/tests/
uv run pytest packages/qdrant-loader-mcp-server/tests/ --cov=src --cov-report=html

3. Code Quality Checks

# From workspace root
make lint
make format

# Or run tools directly via uv
uv run ruff check --fix .
uv run black .
uv run isort .

📚 Core Concepts for Developers

Data Flow Architecture

Understanding the data flow is crucial for development:

1. Configuration Phase
   - Multi-project workspace configuration
   - Global settings and project-specific sources
   - Environment variable management
   - Validation and initialization

2. Ingestion Phase
   - Connectors fetch documents from data sources
   - File conversion using MarkItDown library
   - Content extraction and cleaning
   - Chunking strategies for large documents
   - Metadata extraction and enrichment

3. Embedding Phase
   - Text content converted to embeddings via configurable LLM providers (OpenAI, Azure OpenAI, Ollama)
   - Batch processing for efficiency
   - Error handling and retries
   - Progress tracking and metrics

4. Storage Phase
   - Vectors stored in QDrant collections
   - Metadata indexed for filtering
   - Project-based organization
   - State tracking and change detection

5. Search Phase (MCP Server)
   - Semantic similarity search
   - Hierarchy-aware search
   - Attachment-specific search
   - Project filtering and organization
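The chunking, batching, retry, and change-detection steps listed above can be illustrated with a small standard-library sketch. It does not reproduce QDrant Loader's actual pipeline code: every function name here (`chunk_text`, `batched`, `embed_with_retry`, `has_changed`) is hypothetical, the chunker is a plain character window rather than the project's chunking strategies, and the embedder is whatever callable you pass in.

```python
import hashlib
import time
from typing import Callable, Iterator

# Hypothetical type alias: any callable that maps texts to vectors.
Embedder = Callable[[list[str]], list[list[float]]]


def chunk_text(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into character windows of `size` that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_with_retry(
    embed: Embedder, batch: list[str], retries: int = 3, base_delay: float = 0.0
) -> list[list[float]]:
    """Call `embed`, retrying on ConnectionError with exponential backoff."""
    for attempt in range(retries):
        try:
            return embed(batch)
        except ConnectionError:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be at least 1")


def has_changed(doc_id: str, content: str, state: dict[str, str]) -> bool:
    """Return True and record the new hash when content differs from the last run."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if state.get(doc_id) == digest:
        return False
    state[doc_id] = digest
    return True
```

In a loop over fetched documents, `has_changed` would gate the work so that only new or modified content is chunked, batched, and sent to the embedder.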

Connector System

QDrant Loader uses a connector-based architecture for data sources:

# Example connector implementation
from qdrant_loader.connectors.base import BaseConnector
from qdrant_loader.core.document import Document

class CustomConnector(BaseConnector):
    async def get_documents(self) -> list[Document]:
        """Get documents from the source."""
        documents = []
        # fetch_data() stands in for your own retrieval logic;
        # implement it on this class before using the connector
        for item in self.fetch_data():
            doc = Document(
                content=item.content,
                metadata=item.metadata,
                source_type="custom",
                source_name=self.config.name
            )
            documents.append(doc)
        return documents

Available connectors:

LocalFileConnector - Local file system
GitConnector - Git repositories
ConfluenceConnector - Confluence spaces
JiraConnector - Jira projects
PublicDocsConnector - Public documentation sites
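To experiment with the connector pattern without installing the package, the following self-contained sketch mirrors its shape. The `Document` and `BaseConnector` classes below are simplified stand-ins written for this example, not the real `qdrant_loader` classes, and `InMemoryConnector` and `collect` are hypothetical names.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Document:
    """Stand-in for qdrant_loader's Document, limited to the fields shown above."""
    content: str
    metadata: dict
    source_type: str
    source_name: str


class BaseConnector(ABC):
    """Stand-in base class: one async method that returns documents."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    async def get_documents(self) -> list[Document]:
        ...


class InMemoryConnector(BaseConnector):
    """Hypothetical connector that serves documents from a dict of key -> text."""

    def __init__(self, name: str, records: dict[str, str]) -> None:
        super().__init__(name)
        self.records = records

    async def get_documents(self) -> list[Document]:
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return [
            Document(
                content=text,
                metadata={"key": key},
                source_type="in_memory",
                source_name=self.name,
            )
            for key, text in self.records.items()
        ]


async def collect(connectors: list[BaseConnector]) -> list[Document]:
    """Run all connectors concurrently and flatten their results in order."""
    results = await asyncio.gather(*(c.get_documents() for c in connectors))
    return [doc for docs in results for doc in docs]
```

Calling `asyncio.run(collect([...]))` from synchronous code returns one flat list, with documents grouped in the order the connectors were passed in.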

🔧 Development Workflows

Contributing to Core

Fork and Clone

git clone https://github.com/your-username/qdrant-loader.git
cd qdrant-loader
git remote add upstream https://github.com/martin-papy/qdrant-loader.git

Create Feature Branch

git checkout -b feature/your-feature-name

Development Cycle

# Make changes
# Run tests
make test

# Check code quality
make lint

# Commit changes
git commit -m "feat: add new feature"

Submit Pull Request
- Ensure all tests pass
- Update documentation
- Add changelog entry
- Request review

Custom Connector Development

Create Connector Structure

my-connector/
├── src/
│   └── my_connector/
│       ├── __init__.py
│       ├── connector.py
│       └── config.py
├── tests/
└── pyproject.toml

Implement Connector Interface

from qdrant_loader.connectors.base import BaseConnector
from qdrant_loader.config.source_config import SourceConfig
from qdrant_loader.core.document import Document

class MyConnector(BaseConnector):
    def __init__(self, config: SourceConfig):
        super().__init__(config)
        # Initialize your connector

    async def get_documents(self) -> list[Document]:
        # Implement document fetching logic
        pass

Add Configuration Support

from qdrant_loader.config.source_config import SourceConfig

class MyConnectorConfig(SourceConfig):
    source_type: str = "my_connector"
    api_key: str
    base_url: str
    # Add your configuration fields

📖 Detailed Guides

Architecture Guide

Deep dive into system design, component interactions, and architectural decisions. Essential reading for understanding how QDrant Loader works internally.

Key Topics:

Multi-project workspace architecture
Connector and processor interfaces
Async ingestion pipeline design
State management and change detection
MCP server integration

Extending Guide

Comprehensive guide for building custom functionality and connectors. Learn how to extend QDrant Loader for your specific needs.

Key Topics:

Custom connector development
File conversion extensions
Configuration schema extensions
Testing custom components
Packaging and distribution

Testing Guide

Testing strategies, frameworks, and best practices for ensuring code quality and reliability.

Key Topics:

Unit testing with pytest
Integration testing strategies
Async testing patterns
Mock and fixture usage
CI/CD integration
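As a small illustration of the async testing and mocking topics listed above, the sketch below tests a coroutine without a running QDrant instance. The `ingest` helper and the `upsert` method name are hypothetical and exist only for this example. The test drives the coroutine with `asyncio.run`, so pytest can collect it without an async plugin.

```python
import asyncio
from unittest.mock import AsyncMock


async def ingest(connector, store) -> int:
    """Hypothetical helper: fetch documents and hand each one to a store."""
    documents = await connector.get_documents()
    for document in documents:
        await store.upsert(document)
    return len(documents)


def test_ingest_stores_every_document():
    # AsyncMock attributes are awaitable, so no real connector or database is needed.
    connector = AsyncMock()
    connector.get_documents.return_value = ["doc-1", "doc-2"]
    store = AsyncMock()

    count = asyncio.run(ingest(connector, store))

    assert count == 2
    connector.get_documents.assert_awaited_once()
    assert store.upsert.await_count == 2
    store.upsert.assert_any_await("doc-1")
```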

Deployment Guide

Production deployment strategies, containerization, and operational best practices.

Key Topics:

Docker containerization
Environment configuration
Monitoring and logging
Performance optimization
Security considerations

🛠️ Development Tools and Utilities

Available CLI Commands

# Initialize QDrant collection
qdrant-loader init --workspace .

# Ingest documents
qdrant-loader ingest --workspace .

# View configuration
qdrant-loader config --workspace .

# Project management
qdrant-loader config --workspace .

# Start MCP server
mcp-qdrant-loader

Debugging and Profiling

# Enable debug logging
qdrant-loader --log-level DEBUG --workspace . ingest

# Profile performance
qdrant-loader ingest --workspace . --profile

# Memory profiling (requires memory_profiler)
python -m memory_profiler your_script.py

Development Scripts

# Makefile targets
make test    # Run all tests
make lint    # Run linting
make format  # Format code
make docs    # Build documentation
make clean   # Clean build artifacts

🔗 Integration Examples

Workspace Configuration

# config.yaml
global:
  qdrant:
    url: "http://localhost:6333"
    collection_name: "my_collection"
  llm:
    provider: "openai"
    base_url: "https://api.openai.com/v1"
    api_key: "${LLM_API_KEY}"
    models:
      embeddings: "text-embedding-3-small"
      chat: "gpt-4o-mini"
    embeddings:
      vector_size: 1536

projects:
  - project_id: "docs"
    sources:
      - source_type: "local_files"
        name: "documentation"
        config:
          base_url: "file://./docs"
          include_paths:
            - "**/*.md"

Programmatic Usage

import asyncio

from qdrant_loader.config import get_settings
from qdrant_loader.core.async_ingestion_pipeline import AsyncIngestionPipeline

async def main() -> None:
    # Load settings
    settings = get_settings()

    # Create and run pipeline
    pipeline = AsyncIngestionPipeline(settings)
    await pipeline.run()

# "await" is only valid inside a coroutine, so drive main() with asyncio.run
asyncio.run(main())

MCP Server Integration

# The MCP server runs as a separate process
# Start with: mcp-qdrant-loader
# It provides search tools to AI development environments
# Tools available:
# - search_documents
# - search_with_hierarchy
# - search_attachments
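MCP clients talk to the server with JSON-RPC 2.0 messages, and a tool is invoked through the `tools/call` method. The sketch below only builds such a request for the `search_documents` tool named above. The argument names (`query`, `limit`) are assumptions made for illustration and should be checked against the tool schema the server actually advertises; `build_tool_call` and `encode_message` are hypothetical helpers.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def build_tool_call(tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def encode_message(message: dict) -> str:
    """Serialize one message as a single newline-terminated JSON line."""
    return json.dumps(message) + "\n"


# Assumed argument names; the real schema may differ.
request = build_tool_call("search_documents", {"query": "deployment guide", "limit": 5})
```

A real client also performs the MCP `initialize` handshake before calling tools, so in practice an MCP client library or an AI development environment handles this exchange for you.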

📝 Development Checklist

Before Submitting Code

All tests pass (make test)
Code style checks pass (make lint)
Type checking passes (mypy)
Documentation updated
Changelog entry added (if applicable)

For New Features

Design document created (for major features)
Tests cover all code paths
Documentation includes examples
Backward compatibility maintained
Configuration schema updated (if needed)

For Bug Fixes

Root cause identified
Regression test added
Fix verified in multiple environments
Documentation updated (if needed)

🤝 Community and Support

Getting Help

GitHub Issues - Bug reports and feature requests
Discussions - Questions and community support
Documentation - Comprehensive guides and references
Code Examples - Real-world usage patterns

Contributing Guidelines

Code of Conduct - Be respectful and inclusive
Issue Templates - Use provided templates for consistency
Pull Request Process - Follow the established workflow
Review Process - Participate in code reviews
Documentation - Keep documentation up to date

Development Roadmap

Core Features - Enhanced search capabilities and performance
Connectors - Additional data source integrations
Developer Experience - Better tooling and documentation
Enterprise Features - Advanced security and compliance

Ready to start developing? Choose your path:

New to QDrant Loader? Start with the Architecture Guide
Creating connectors? Follow the Extending Guide
Setting up CI/CD? Use the Deployment Guide

Need help? Join our community discussions or open an issue on GitHub!
