Florence2-Sharp

Guides

Walkthroughs for each major Florence-2 task family.

Captioning

Short, detailed, and verbose captions, and when to choose each.

OCR

Read text from an image as plain text or with bounding boxes per region.

Object detection

Detect objects with OD, DENSE_REGION_CAPTION, and OPEN_VOCABULARY_DETECTION.

Phrase grounding

Highlight the region of an image that matches a natural-language phrase.

© 2026 Florence2-Sharp. All rights reserved.