annomage — viraj mavani

abstract

Anno-Mage is a semi-automatic image annotation tool that suggests bounding box annotations, which users can then confirm, correct, or supplement manually. It supports two detection backends: RetinaNet ResNet50 FPN V2 (PyTorch, COCO-pretrained) and OWL-v2 (Hugging Face Transformers, open-vocabulary zero-shot detection). Because annotators review and adjust suggested boxes instead of drawing every box from scratch, the tool is meant to reduce the time and effort required to build annotated datasets for object detection tasks. Annotations are exported as CSV or Pascal VOC XML.

Anno-Mage ships as a web app (FastAPI backend + React frontend).

demo

Download Source (GitHub)

user guide

installation

PyPI package (recommended):

pip install anno-mage
anno-mage

From source:

cd web
bash start.sh
# backend on :8000, frontend on :3000

instructions

Launch the tool, either from source or with the anno-mage CLI.
Select the folder of images to annotate.
Enter your object class names.
Choose an auto-suggestion engine — RetinaNet ResNet50 FPN V2 (COCO classes) or OWL-v2 (open-vocabulary, any text prompt) — and let it suggest bounding boxes for each image.
Refine boxes via drag-and-drop; navigate images with arrow keys — annotations save automatically.
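
To make the suggestion step more concrete, here is a minimal Python sketch of how COCO suggestions could be produced with torchvision's RetinaNet ResNet50 FPN V2 and then narrowed to the class names entered by the user. It is an illustration, not Anno-Mage's actual code: the names Suggestion, keep_suggestions, and suggest_with_retinanet are hypothetical, and the 0.5 score threshold and the case-insensitive label match are assumptions. Only the torchvision calls are taken from that library's public API.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Suggestion(NamedTuple):
    """One suggested box (hypothetical type, not part of Anno-Mage)."""
    label: str
    score: float
    box: Tuple[float, float, float, float]  # x1, y1, x2, y2 in pixels


def keep_suggestions(
    suggestions: Sequence[Suggestion],
    wanted_labels: Sequence[str],
    min_score: float = 0.5,  # assumed threshold, tune per dataset
) -> List[Suggestion]:
    """Keep confident suggestions whose label matches a requested class name."""
    wanted = {name.lower() for name in wanted_labels}
    return [
        s for s in suggestions
        if s.score >= min_score and s.label.lower() in wanted
    ]


def suggest_with_retinanet(image_path: str) -> List[Suggestion]:
    """Run the COCO-pretrained RetinaNet ResNet50 FPN V2 on one image."""
    # Imported lazily so keep_suggestions works without PyTorch installed.
    import torch
    from torchvision.io import ImageReadMode, read_image
    from torchvision.models.detection import (
        RetinaNet_ResNet50_FPN_V2_Weights,
        retinanet_resnet50_fpn_v2,
    )

    weights = RetinaNet_ResNet50_FPN_V2_Weights.DEFAULT
    model = retinanet_resnet50_fpn_v2(weights=weights).eval()
    image = read_image(image_path, mode=ImageReadMode.RGB)
    with torch.no_grad():
        prediction = model([weights.transforms()(image)])[0]
    categories = weights.meta["categories"]  # COCO class names
    return [
        Suggestion(categories[int(label)], float(score), tuple(box.tolist()))
        for label, score, box in zip(
            prediction["labels"], prediction["scores"], prediction["boxes"]
        )
    ]
```

With this sketch, keep_suggestions(suggest_with_retinanet("photo.jpg"), ["dog", "person"]) would return only the confident dog and person boxes (the file name is a placeholder). An open-vocabulary engine such as OWL-v2 would take the class names as text prompts instead, so the label filter would be unnecessary there.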

usage

Annotations are written to ~/annotations/:

CSV: annotations.csv — columns: image_path,x1,y1,x2,y2,label
Pascal VOC XML: annotations_voc/ — one .xml file per image
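
As an illustration of how the two export formats relate, the following Python sketch reads rows in the CSV column order listed above and builds a minimal Pascal VOC style element for one image. It rests on several assumptions: the helper names read_boxes and to_voc_xml are hypothetical, coordinates are treated as pixel values and rounded down to integers, a header row is skipped if present, and the XML contains only filename, object name, and bndbox fields. The files Anno-Mage writes may include more (for example image size), so treat this as a reading aid rather than a description of the exporter.

```python
import csv
import io
import os
import xml.etree.ElementTree as ET
from collections import defaultdict

COLUMNS = ["image_path", "x1", "y1", "x2", "y2", "label"]


def read_boxes(csv_file):
    """Group CSV rows by image: {image_path: [(x1, y1, x2, y2, label), ...]}."""
    boxes = defaultdict(list)
    for row in csv.reader(csv_file):
        if not row or row == COLUMNS:  # skip blank lines and an optional header
            continue
        image_path, x1, y1, x2, y2, label = row
        coords = tuple(int(float(value)) for value in (x1, y1, x2, y2))
        boxes[image_path].append(coords + (label,))
    return dict(boxes)


def to_voc_xml(image_path, image_boxes):
    """Build a minimal Pascal VOC <annotation> element for one image."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = os.path.basename(image_path)
    for x1, y1, x2, y2, label in image_boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        # CSV x1,y1,x2,y2 map onto VOC xmin,ymin,xmax,ymax.
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), (x1, y1, x2, y2)):
            ET.SubElement(bndbox, tag).text = str(value)
    return root


# In-memory sample standing in for annotations.csv (made-up values).
sample = io.StringIO(
    "images/cat.jpg,10,20,110,220,cat\n"
    "images/cat.jpg,5,5,50,60,dog\n"
)
boxes = read_boxes(sample)
root = to_voc_xml("images/cat.jpg", boxes["images/cat.jpg"])
print(ET.tostring(root, encoding="unicode"))
```

To run this against real output, pass an open file handle for annotations.csv to read_boxes in place of the in-memory sample.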

community

Join the discussion on Slack.

acknowledgments

Meditab Software Inc.: for supporting this project.
PyTorch / Torchvision teams: for the RetinaNet ResNet50 FPN V2 detector behind the COCO auto-suggestion engine.
Hugging Face Transformers: for OWL-v2 open-vocabulary zero-shot detection.
Computer Vision Group, LDCE: for feedback and testing.
