Plugins Ecosystem — FiftyOne 1.17.0 documentation

Plugins Ecosystem

Welcome to the FiftyOne Plugins ecosystem! 🚀

Here you’ll discover cutting-edge research, state-of-the-art models, and powerful add-ons that unlock new FiftyOne workflows.

FiftyOne plugins allow you to extend and customize the functionality of the core tool to suit your specific needs. From advanced computer vision models to integrations with other popular AI tools, this curated collection of plugins will transform FiftyOne into your bespoke visual AI development workbench.
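As a minimal sketch of how a plugin from this collection might be pulled into a local environment, the snippet below assembles a `fiftyone plugins download` command and prints it. The repository URL and the plugin name `@voxel51/io` are illustrative assumptions that do not come from this page, and `build_download_command` and `download_plugins` are hypothetical helpers, not part of FiftyOne. Check each plugin's own README for its actual install instructions.

```python
import shlex
import subprocess


def build_download_command(repo_url, plugin_names=None):
    """Build the argument list for the `fiftyone plugins download` CLI command."""
    cmd = ["fiftyone", "plugins", "download", repo_url]
    if plugin_names:
        # --plugin-names limits the download to the listed plugins
        cmd += ["--plugin-names", *plugin_names]
    return cmd


def download_plugins(repo_url, plugin_names=None):
    """Run the download. Needs the `fiftyone` CLI on PATH and network access."""
    subprocess.run(build_download_command(repo_url, plugin_names), check=True)


# Assumed example values (not taken from this page)
cmd = build_download_command(
    "https://github.com/voxel51/fiftyone-plugins",
    plugin_names=["@voxel51/io"],
)
print(shlex.join(cmd))  # inspect the command before running it
```

Calling `download_plugins(...)` with the same arguments would execute the command. Building the argument list separately makes it easy to review what will run before anything is installed.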

Fiftyone-locate-anything ⭐ 6

by Burhan-Q
NVIDIA LocateAnything-3B is an open-vocabulary grounding VLM from the Eagle family, supporting object detection, phrase grounding, pointing, scene-text/OCR localization, document layout, and GUI element grounding across images and video.

Community,model,vlm,Model

Fiftycomfy ⭐ 15

by harpreetsahota
A FiftyOne Panel for modular node-based workflows which takes inspiration from ComfyUI.

Community,visualization

Qwen3vl Video ⭐ 18

by harpreetsahota
A FiftyOne zoo model integration for Qwen3-VL that enables comprehensive video understanding with multiple label types in a single forward pass, and also computes video embeddings.

Community,model,vlm,Model

Semantic Video Search ⭐ 28

by danielgural
Search through your video datasets using FiftyOne Brain and Twelve Labs!

Community,model,search

Annotation ⭐ 140

by voxel51
Utilities for integrating FiftyOne with annotation tools

Voxel51,annotation

Brain ⭐ 140

by voxel51
Utilities for working with the FiftyOne Brain

Voxel51,curation,visualization

Dashboard ⭐ 140

by voxel51
Create your own custom dashboards from within the App

Voxel51,visualization

Io ⭐ 140

by voxel51
A collection of import/export utilities

Voxel51,io

Indexes ⭐ 140

by voxel51
Utilities for working with FiftyOne database indexes

Voxel51,utils

Plugins ⭐ 140

by voxel51
Utilities for managing and building FiftyOne plugins

Voxel51,utils

Delegated ⭐ 140

by voxel51
Utilities for managing your delegated operations

Voxel51,utils

Runs ⭐ 140

by voxel51
Utilities for managing your custom runs

Voxel51,utils

Utils ⭐ 140

by voxel51
Call your favorite SDK utilities from the App

Voxel51,utils

Zoo ⭐ 140

by voxel51
Download datasets and run inference with models from the FiftyOne Zoo, all without leaving the App

Voxel51,model,dataset

Youtube Panel Plugin ⭐ 5

by jacobmarks
Play YouTube videos in the FiftyOne App!

Community,visualization

Screenparser ⭐ 0

by Burhan-Q
ScreenParser is a YOLO11-L detector fine-tuned by the docling-project on ~1.45M screenshots to localize 55 UI element classes (buttons, tables, navigation bars, text inputs, icons, etc.) in application and web screenshots.

Community,model,detection

Fiftyone-vllm ⭐ 2

by Burhan-Q
Run inference using an online vLLM instance for image captioning, classification, object detection, VQA, and OCR.

Community,model,vlm

Fo-doom ⭐ 6

by Burhan-Q
Play the classic DOOM (1993) shareware game directly within the FiftyOne App.

Community,examples

Fo-openai ⭐ 1

by Burhan-Q
Label images with OpenAI vision models for classification, detection, captioning, VQA, and OCR via the Responses API with Pydantic-validated structured output.

Community,model,vlm

Fo-glitch ⭐ 0

by Burhan-Q
Augment dataset samples with image-corruption artifacts: pixel sorting, block corruption, channel shifting, scan-line noise, frame tearing, and more.

Community,data,curation

Cradiov4 ⭐ 17

by harpreetsahota
CRADIOv4 performs visual feature extraction whose image embeddings can be used by a downstream model for various tasks. This implementation also produces attention maps.

Community,model,embeddings,Model

Fiftyone Lerobot Importer ⭐ 5

by harpreetsahota
Import your LeRobot format dataset into FiftyOne format

Community,io

Nomic-embed-multimodal ⭐ 2

by harpreetsahota
Nomic Embed Multimodal is a family of vision language models built on Qwen2.5-VL that generates high-dimensional embeddings for both images and text in a shared vector space.

Community,model,embeddings,Model

Online Video Depth Anything ⭐ 6

by harpreetsahota
Integrating Online Video Depth Anything (oVDA), a temporally consistent monocular depth estimator for videos that runs in an online setting with low VRAM consumption.

Community,model,depth,Model

Mlflow ⭐ 6

by voxel51
Track model training experiments on your FiftyOne datasets with MLflow!

Voxel51,training

Gemini-vision-plugin ⭐ 5

by AdonaiVera
This plugin integrates Google Gemini's multimodal vision models (e.g., gemini-2.5-flash) into your FiftyOne workflows. Prompt with text and one or more images; receive a text response grounded in visual inputs.

Community,model,vlm,Model

Jina Embeddings V4 ⭐ 2

by harpreetsahota
Jina Embeddings v4 is a state-of-the-art vision language model that generates embeddings for both images and text in a shared vector space.

Community,model,embeddings,Model

Ade ⭐ 2

by landingai
Parse, extract, and split documents using LandingAI's Agentic Document Extraction (ADE) API. Converts PDFs, images, spreadsheets, and Office files into structured Markdown with spatial bounding box grounding.

Community,ocr,model,text

Egoexor ⭐ 27

by ardamamur
EgoExOR is an Operating Room dataset fusing egocentric and exocentric perspectives for surgical procedures. See here to load it with FiftyOne.

Community,dataset,medical

Optimal Confidence Threshold ⭐ 6

by danielgural
Find the optimal confidence threshold for your detection models automatically!

Community,evaluation

Moondream3 ⭐ 17

by harpreetsahota
Moondream 3 (Preview) is a vision language model with a mixture-of-experts architecture (9B total parameters, 2B active). This model makes no compromises, delivering state-of-the-art visual reasoning while still retaining our efficient and deployment-friendly ethos.

Community,model,vlm,Model

Torchvision-classifier-finetuner ⭐ 1

by sherpan
Fine-tune pretrained torchvision backbones (ResNet-50, EfficientNet-B2, MobileNetV3) on FiftyOne datasets with Classification labels and run inference directly from the App.

Community,model,training

Video2dataset ⭐ 1

by parva101
Convert YouTube URLs or local videos into FiftyOne image datasets with uniform/scene-change/hybrid frame sampling, perceptual deduplication, and source metadata.

Community,utils

Synthetic Gui Samples Plugins ⭐ 8

by harpreetsahota
A FiftyOne plugin for generating synthetic samples for datasets in COCO4GUI format

Community,model,vlm

Nvlabs Cradiov3 ⭐ 27

by harpreetsahota
Implementing NVLabs C-RADIOv3 Embeddings Model as Remotely Sourced Zoo Model for FiftyOne

Community,model,embeddings,Model

Multi Annotator Toolkit ⭐ 6

by madave94
Tackle noisy annotation! Find and analyze annotation issues in datasets with multiple annotators per image.

Community,annotation

Mose-v2 ⭐ 2

by voxel51
Load and explore the MOSE complex video object segmentation dataset via the FiftyOne Zoo.

Community,dataset,video,Dataset

Rerun-plugin ⭐ 10

by voxel51
Visualize Rerun data files (.rrd) inside the FiftyOne App

Voxel51,visualization

Envi-spetral-viewer ⭐ 3

by ehofesmann
Explore hyperspectral image datasets, interactively visualize pixel-level spectra, and dynamically recolor images.

Community,visualization,hyperspectral

Coco4gui Fiftyone ⭐ 4

by harpreetsahota
Implementing the COCO4GUI dataset type in FiftyOne with importers and exporters

Community,io

Qwen3vl Embeddings ⭐ 7

by harpreetsahota
Qwen3-VL-Embedding maps text, images, and video into a unified representation space, enabling powerful cross-modal retrieval and understanding.

Community,model,embeddings,Model

Qwen3 5 Vl ⭐ 4

by harpreetsahota
Implementing Qwen3.5VL as a Remote Source Zoo Model for FiftyOne.

Community,model,Model

Gemma4 ⭐ 0

by Burhan-Q
Google's Gemma 4 multimodal family as a Remote Zoo Model, supporting image and video operations including detection, classification, VQA, OCR, and temporal localization.

Community,model,vlm,video,Model

Davis-2017 ⭐ 0

by voxel51
Load and explore the DAVIS-2017 video segmentation dataset via the FiftyOne Zoo.

Community,dataset,video,Dataset

Isaac-0 2 ⭐ 9

by perceptron-ai-inc
Isaac-0.2 is Perceptron AI's hybrid-reasoning vision language model supporting object detection, keypoint detection, OCR, instance segmentation, visual question answering, and UI understanding. Includes thinking and tool use for improving detection in complex scenes.

Community,model,vlm,Model

Molmo Point ⭐ 1

by harpreetsahota
Integrating MolmoPoint, a model that locates and tracks objects in images and videos by pointing and returning precise pixel coordinates

Community,model,tracking,Model

Albumentations Augmentation ⭐ 15

by jacobmarks
Test out any Albumentations data augmentation transform with FiftyOne!

Community,data

Lightonocr-2 ⭐ 6

by harpreetsahota
LightOnOCR-2-1B is a compact multilingual VLM that converts document images into clean, naturally ordered text without brittle multi-stage OCR pipelines.

Community,model,vlm,Model

Caption Viewer ⭐ 3

by harpreetsahota
A plugin that intelligently displays and formats vision language model outputs and text fields. Perfect for viewing OCR results, receipt analysis, document processing, and any text-heavy computer vision workflows.

Community,visualization,vlm

Image Editing Panel ⭐ 2

by harpreetsahota
Chat-based image editing powered by HuggingFace image-to-image Inference API.

Community,model,huggingface

Zero Shot Prediction ⭐ 37

by jacobmarks
Run zero-shot (open vocabulary) prediction on your data!

Community,model

Vlm Prompt Lab ⭐ 4

by harpreetsahota
Experiment with any VLM that can be run in a Hugging Face image-text-to-text pipeline right in the FiftyOne App!

Community,model,vlm

Text Evaluation Metrics ⭐ 2

by harpreetsahota
This plugin provides five text evaluation metrics for comparing predictions against ground truth: ANLS, Exact Match, Normalized Similarity, Character Error Rate, and Word Error Rate.

Community,model,evaluation,text

Kimi Vl A3b ⭐ 7

by harpreetsahota
FiftyOne Remotely Sourced Zoo Model integration for Moonshot AI's Kimi-VL-A3B models enabling object detection, keypoint localization, and image classification with strong GUI and document understanding capabilities.

Community,model,vlm,Model

Image Issues ⭐ 35

by jacobmarks
Find common image quality issues in your datasets

Community,curation

Qwen Image Edit ⭐ 5

by harpreetsahota
Chat-based image editing powered by drbaph/Qwen-Image-Edit-2511-FP8

Community,model

Siglip2 ⭐ 5

by harpreetsahota
A FiftyOne Remotely Sourced Zoo Model integration for Google's SigLIP2 model enabling natural language search across images in your FiftyOne Dataset

Community,model,vlm,Model

Sample-inspector ⭐ 1

by allenleetc
Adjust image brightness and contrast and filter semantic masks by class in a sample detail view!

Community,curation

Roi-patches ⭐ 1

by mgustineli
Tile images into a configurable grid of ROI patches with adjustable overlap for region-based analysis, using FiftyOne's native patches view.

Community,curation

Vlmrun-voxel51-plugin ⭐ 9

by vlm-run
Extract structured data from visual and audio sources including documents, images, and videos

Community,model,vlm

Molmo2 ⭐ 1

by harpreetsahota
Molmo2 is a family of open vision language models developed by the Allen Institute for AI (Ai2) that support image, video, and multi-image understanding and grounding.

Community,model,vlm,Model

Hiera Video Embeddings ⭐ 4

by harpreetsahota
Compute embeddings for video using Facebook Hiera Models

Community,model,video

Sam3 Images ⭐ 4

by harpreetsahota
Integration of Meta's SAM3 (Segment Anything Model 3) into FiftyOne, with full support of text prompts, keypoint prompts, bounding box prompts, auto segmentation, and image embeddings.

Community,model,segmentation,Model

Fast Vlm ⭐ 11

by harpreetsahota
Integrating FastVLM as a Remote Source Zoo Model for FiftyOne

Community,model,vlm,Model

Hf Fine Tuner Plugin ⭐ 3

by harpreetsahota
A plugin to fine-tune Hugging Face models on your FiftyOne Dataset.

Community,model,huggingface

Fiftyone-tile ⭐ 1

by mmoollllee
Tile your high resolution images to squares for training small object detection models

Community,visualization

Glm Ocr ⭐ 4

by harpreetsahota
GLM-OCR is a lightweight 0.9B vision language model achieving state-of-the-art document understanding, including formula recognition, table recognition, and structured information extraction.

Community,model,vlm,Model

Gui Actor ⭐ 2

by harpreetsahota
Implementing Microsoft's GUI Actor as a Remote Zoo Model for FiftyOne

Community,model,vlm,Model

Image Captioning ⭐ 11

by jacobmarks
Caption all your images with state-of-the-art vision language models!

Community,model,vlm

Medgemma 1 5 ⭐ 1

by harpreetsahota
Implementing MedGemma 1.5 as a Remote Zoo Model for FiftyOne

Community,model,medical,Model

Fiftyone Wandb Plugin ⭐ 4

by harpreetsahota
This plugin connects FiftyOne datasets with Weights & Biases to enable reproducible, data-centric ML workflows.

Community,utils,evaluation

Isaac0 1 ⭐ 5

by harpreetsahota
Isaac-0.1 is the first in Perceptron AI's family of models built to be the intelligence layer for the physical world. This integration supports various computer vision tasks including object detection, classification, OCR, visual question answering, and more.

Community,model,vlm,Model

Apple Sharp ⭐ 4

by harpreetsahota
SHARP is Apple's state-of-the-art model for predicting 3D Gaussian Splats from a single RGB image. This integration brings SHARP to FiftyOne, enabling batch inference on image datasets with 3D visualization.

Community,model,3d,Model

Active Learning ⭐ 18

by jacobmarks
Accelerate your data labeling with Active Learning!

Community,annotation

Anonymize ⭐ 7

by swheaton
Anonymize/blur images based on a FiftyOne Detections field.

Community,curation

Plotly-map-panel ⭐ 1

by allenleetc
Plotly-based Map Panel with adjustable marker cosmetics!

Community,visualization

Clustering Algorithms ⭐ 5

by danielgural
Find the clusters in your data using some of the best algorithms available!

Community,curation

Reverse Image Search ⭐ 14

by jacobmarks
Find the images in your dataset most similar to an image from your filesystem or the internet!

Community,curation

Huggingface Hub ⭐ 2

by voxel51
Push FiftyOne datasets to the Hugging Face Hub, and load datasets from the Hub into FiftyOne!

Voxel51,dataset,huggingface

Transformers ⭐ 2

by voxel51
Run inference on your datasets using Hugging Face Transformers models!

Voxel51,model,huggingface

Pdf-loader ⭐ 4

by brimoor
Load your PDF documents into FiftyOne as per-page images

Community,io

Mineru 2 5 ⭐ 6

by harpreetsahota
MinerU2.5 is a 1.2B-parameter vision language model for efficient high-resolution document parsing. This model can support grounding OCR as well as free text OCR.

Community,model,vlm,Model

Fiftyone-vlm-efficient ⭐ 4

by AdonaiVera
Improve VLM training data quality with state-of-the-art dataset pruning and quality techniques

Community,model,vlm,curation

Nemotron Nano Vl ⭐ 3

by harpreetsahota
Implementing Llama-3.1-Nemotron-Nano-VL-8B-V1 as a Remote Zoo Model for FiftyOne

Community,model,vlm,Model

Model-comparison ⭐ 14

by allenleetc
Compare two object detection models!

Community,evaluation

Vggt ⭐ 20

by harpreetsahota
Implementing Meta AI's VGGT as a FiftyOne Remote Zoo Model

Community,model,3d,Model

Medsiglip ⭐ 2

by harpreetsahota
Implementing MedSigLIP as a Remote Zoo Model for FiftyOne

Community,model,medical,Model

Nanonets Ocr2 ⭐ 1

by harpreetsahota
Nanonets-OCR2 transforms documents into structured markdown with intelligent content recognition and semantic tagging, making it ideal for downstream processing by Large Language Models (LLMs).

Community,model,ocr,Model

Olmocr-2 ⭐ 1

by harpreetsahota
olmOCR-2 is a state-of-the-art OCR model built on Qwen2.5-VL architecture that extracts text from document images with high accuracy.

Community,model,ocr,Model

Deepseek Ocr ⭐ 3

by harpreetsahota
DeepSeek-OCR is a vision language model designed for optical character recognition with a focus on "contextual optical compression."

Community,model,vlm,Model

Semantic Document Search ⭐ 9

by jacobmarks
Perform semantic search on text in your documents!

Community,search

Kosmos2 5 ⭐ 3

by harpreetsahota
Kosmos-2.5 excels at two core tasks: generating spatially-aware text blocks (OCR) and producing structured markdown output from images.

Community,model,ocr,Model

Medgemma ⭐ 10

by harpreetsahota
Implementing MedGemma as a Remote Zoo Model for FiftyOne

Community,model,medical,Model

Bimodernvbert ⭐ 1

by harpreetsahota
BiModernVBert is a vision language model built on the ModernVBert architecture that generates embeddings for both images and text in a shared 768-dimensional vector space.

Community,model,embeddings,Model

Colmodernvbert ⭐ 1

by harpreetsahota
ColModernVBert is a multi-vector vision language model built on the ModernVBert architecture that generates ColBERT-style embeddings for both images and text.

Community,model,embeddings,Model

Colqwen2 5 V0 2 ⭐ 1

by harpreetsahota
ColQwen2.5 is a vision language model based on Qwen2.5-VL-3B-Instruct that generates ColBERT-style multi-vector representations for efficient document retrieval. This version takes dynamic image resolutions (up to 768 image patches) and doesn't resize them, preserving aspect ratios for better accuracy.

Community,model,embeddings,Model

Bddoia-fiftyone ⭐ 2

by AdonaiVera
Load and explore the BDDOIA Safe/Unsafe Action dataset via the FiftyOne Zoo

Community,dataset,Dataset

Ui Tars ⭐ 7

by harpreetsahota
Implementing UI-TARS-1.5 as a Remote Zoo Model for FiftyOne

Community,model,vlm,Model

Fiftyone-agents ⭐ 1

by AdonaiVera
A comprehensive FiftyOne plugin for testing and evaluating multiple vision language models with dynamic prompts and built-in evaluation capabilities

Community,vlm,evaluation

Colpali V1 3 ⭐ 1

by harpreetsahota
ColPali is a vision language model based on PaliGemma-3B that generates ColBERT-style multi-vector representations for efficient document retrieval.

Community,model,embeddings,Model

Paligemma2 ⭐ 5

by harpreetsahota
Implementing PaliGemma-2-Mix as a Remote Zoo Model for FiftyOne

Community,model,vlm,Model

Minicpm-v ⭐ 4

by harpreetsahota
Integrating MiniCPM-V 4.5 as a Remote Source Zoo Model in FiftyOne

Community,model,vlm,Model

Multimodal Rag ⭐ 21

by jacobmarks
Create and test multimodal RAG pipelines with LlamaIndex, Milvus, and FiftyOne!

Community,search,embeddings

Audio Retrieval ⭐ 11

by jacobmarks
Find the images in your dataset most similar to an audio file!

Community,audio

Nemo Retriever Parse Plugin ⭐ 4

by harpreetsahota
Implementing NVIDIA NeMo Retriever Parse as a FiftyOne Plugin

Community,model,ocr

Clustering ⭐ 11

by jacobmarks
Cluster your images using embeddings with FiftyOne and scikit-learn!

Community,curation

Vitpose ⭐ 3

by harpreetsahota
Run ViTPose Models from Hugging Face on your FiftyOne Dataset

Community,model,pose

Moondream2 ⭐ 3

by harpreetsahota
Moondream2 implementation as a remotely sourced zoo model for FiftyOne

Community,model,vlm,Model

Florence2 ⭐ 4

by harpreetsahota
Implementing Florence2 as a Remote Zoo Model for FiftyOne

Community,model,vlm,Model

Showui ⭐ 2

by harpreetsahota
Integrating ShowUI into FiftyOne as a Remote Source Zoo Model

Community,model,vlm,Model

Mimo Vl ⭐ 3

by harpreetsahota
Implementing MiMo-VL as a Remote Zoo Model for FiftyOne

Community,model,vlm,Model

Os Atlas ⭐ 5

by harpreetsahota
Integrating OS-Atlas Base into FiftyOne as a Remote Source Zoo Model

Community,model,vlm,Model

Vqa-plugin ⭐ 19

by jacobmarks
Ask (and answer) open-ended visual questions about your images!

Community,model,vqa

Segments-voxel51-plugin ⭐ 5

by segmentsai
Integrate FiftyOne with the Segments.ai annotation tool!

Community,annotation

Edit Label Attributes ⭐ 3

by ehofesmann
Edit attributes of your labels directly in the FiftyOne App!

Community,annotation

Qwen2 5 Vl ⭐ 1

by harpreetsahota
Implementing Qwen2.5-VL as a Remote Zoo Model for FiftyOne

Community,model,vlm,Model

Pytesseract Ocr ⭐ 11

by jacobmarks
Run optical character recognition with PyTesseract!

Community,model,ocr

Audio Loader ⭐ 5

by danielgural
Import your audio datasets as spectrograms into FiftyOne!

Community,audio,visualization

Visual Document Retrieval ⭐ 3

by harpreetsahota
A FiftyOne Remotely Sourced Zoo Model integration for LlamaIndex's VDR model enabling natural language search across document images, screenshots, and charts in your datasets.

Community,model,ocr,Model

Image Deduplication ⭐ 18

by jacobmarks
Find exact and approximate duplicates in your dataset!

Community,curation

Emoji Search ⭐ 7

by jacobmarks
Semantically search emojis and copy to clipboard!

Community,examples

Janus Vqa ⭐ 6

by harpreetsahota
Run the Janus Pro models from DeepSeek on your FiftyOne dataset

Community,model,vlm

Depth Pro Plugin ⭐ 2

by harpreetsahota
Perform zero-shot metric monocular depth estimation using the Apple Depth Pro model

Community,model,depth

Outlier Detection ⭐ 7

by danielgural
Find those troublesome outliers in your dataset automatically!

Community,curation

Text To Image ⭐ 33

by jacobmarks
Add synthetic data from prompts with text-to-image models and FiftyOne!

Community,model,vlm

Concept Space Traversal ⭐ 5

by jacobmarks
Navigate concept space with CLIP, vector search, and FiftyOne!

Community,embeddings

Concept Interpolation ⭐ 6

by jacobmarks
Find images that best interpolate between two text-based extremes!

Community,curation

Gpt4 Vision ⭐ 9

by jacobmarks
Chat with your images using GPT-4 Vision!

Community,model,vlm

Fiftyone-timestamps ⭐ 1

by mmoollllee
Compute datetime-related fields (sunrise, dawn, evening, weekday, ...) from your samples' filenames or creation dates

Community,curation

Keyword Search ⭐ 3

by jacobmarks
Perform keyword search on a specified field!

Community,search

Img To Video ⭐ 1

by danielgural
Bring images to life with image to video!

Community,video

Double Band Filter ⭐ 2

by jacobmarks
Filter on two numeric ranges simultaneously!

Community,search

Filter Values ⭐ 1

by ehofesmann
Filter a field of your FiftyOne dataset by one or more values.

Community,search

Line2d ⭐ 4

by wayofsamu
Visualize x,y-Points as a line chart.

Community,visualization

Twilio Automation ⭐ 2

by jacobmarks
Automate data ingestion with Twilio!

Community,data

Labs Panel ⭐ 12

by 51labs
Panel listing all the available FiftyOne Labs features

Labs,ml,utils

Video Apply Model ⭐ 12

by 51labs
Apply an image model to a video dataset using a torch dataloader

Labs,ml,video

Few Shot Learning ⭐ 12

by 51labs
Few-shot learning with multiple model types

Labs,ml,classification

Label Propagation ⭐ 12

by 51labs
Propagate labels across frames of a video

Labs,ml,video,segmentation

Box Combine ⭐ 12

by 51labs
Boxes Fusion for detections

Labs,ml,detection

Click Segmentation ⭐ 12

by 51labs
Image segmentation via prompts

Labs,ml,segmentation

Zero-shot-coreset-selection ⭐ 3

by 51labs
Zero-shot coreset selection (ZCore) for unlabeled image data

Labs,ml

Note

Community plugins are external projects maintained by their respective authors. They are not part of FiftyOne core and may change independently. Please review each plugin’s documentation and license before use.