AI Vision Benchmark Results

SnapJournal — Schedule Extraction from Handwritten Planner | 10 iterations per model | March 10, 2026

F1 Score Overview — All Models

Per-Iteration F1 Scores — Consistency View

Panels: OpenAI, Gemini, Claude, Grok

Stability & Variance

Latency vs Accuracy

Accuracy Breakdown by Field

Detailed Results Table

Model | Provider | Tier | Avg F1 | Min F1 | Max F1 | Std Dev | Precision | Recall | Avg Latency | Consistency
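For readers unfamiliar with the column names, the sketch below shows one common way the F1-related columns could be derived from per-iteration results. It is an illustration only, not the benchmark's actual scoring code: the function names `f1_score` and `summarize` are hypothetical, the per-run counts of correct, spurious, and missed fields are an assumed input format, and the use of population standard deviation is an assumption (the page does not say whether Std Dev is population or sample). The Consistency column is not reproduced because its definition is not given.

```python
from statistics import mean, pstdev


def f1_score(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) for a single extraction run.

    Assumed inputs: counts of correctly extracted fields, spurious
    fields, and missed fields. This is the standard F1 definition,
    not necessarily the exact scoring used by the benchmark.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def summarize(f1_scores):
    """Aggregate per-iteration F1 scores into the summary columns.

    Assumption: Std Dev is the population standard deviation over
    the iterations; swap in statistics.stdev for the sample version.
    """
    return {
        "avg_f1": mean(f1_scores),
        "min_f1": min(f1_scores),
        "max_f1": max(f1_scores),
        "std_dev": pstdev(f1_scores),
    }
```

With 10 iterations per model, `summarize` would be called once per model on its list of 10 per-iteration F1 values.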