Agentarium

Image Analyzer Agent

v1.0.0 · active

Generic scientific image analyzer that downloads remote images, base64-encodes them, and sends batches to a vision-capable language model via responses.parse for structured per-image analysis. Returns one FigureAnalysis per image (type, description, axes, legend, caption, anomaly notes) plus a consolidated Markdown report.
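
The encode-and-batch step described above can be sketched as follows. This is a minimal illustration, not the agent's actual code: the data-URL transport format, the default MIME type, the batch size of 2, and the helper names `to_data_url` and `batched` are all assumptions, since the listing does not state them. The download step and the model call are left out.

```python
import base64
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Base64-encode raw image bytes into a data URL (assumed transport format)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches holding at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical stand-ins for the bytes of three downloaded images.
downloaded = [b"abc", b"defg", b"hi"]
data_urls = [to_data_url(raw) for raw in downloaded]

# Each batch would become one request to the vision model (batch size assumed).
batches = list(batched(data_urls, batch_size=2))
```

Keeping encoding and batching as separate pure functions makes both easy to test without network access or a model key.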

by NASA-IMPACT akd-ext contributors (NASA-IMPACT) · other · analysis

T1 · Conformant

tested on: gpt-5.2

license: Apache-2.0

framework: openai-agents-sdk

citable url: https://agentarium.science/a/image-analyzer/v/1.0.0

Guardrails & validation (explicitly declared by the author; shown so you can judge, not registry-verified)

Guardrails declared — author-stated

✓

No fabrication

Never invents values, trends, or features not visible in the image; uses 'approximately' when precision is unreadable.

✓

Verbatim slug

slug field is always copied verbatim from the caption — never shortened, invented, or inferred.

✓

URL passthrough

url field in FigureAnalysis is always left empty by the model and filled programmatically from the download map.

✓

Context non-override

The supplied Context paragraph is used only to resolve ambiguous labels; it never overrides what the image actually shows.

✓

Notes isolation

notes field contains only genuine anomalies; primary descriptive content must go in description, never in notes.

✓

Unreadable-image stub

Unreadable images return a stub entry with description='image could not be read' and figure_type='unknown' rather than being skipped.

✓

Non-contradiction

description and notes are checked for consistency; they must not contradict each other.
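
Two of the guardrails above, URL passthrough and the unreadable-image stub, are mechanical enough to sketch in code. The example below is illustrative only: the function names, the use of plain dictionaries, the slug-keyed shape of the download map, and the empty defaults for fields other than `description` and `figure_type` are assumptions rather than the author's implementation.

```python
from typing import Dict, List


def unreadable_stub(slug: str) -> Dict[str, object]:
    """Build the stub entry for an image that could not be read.

    The description and figure_type values come from the listing;
    the empty url and notes defaults are assumptions.
    """
    return {
        "slug": slug,
        "url": "",
        "figure_type": "unknown",
        "description": "image could not be read",
        "notes": "",
    }


def fill_urls(
    analyses: List[Dict[str, object]], download_map: Dict[str, str]
) -> List[Dict[str, object]]:
    """Return copies of the analyses with url taken from download_map.

    Any url value supplied by the model is ignored and overwritten.
    """
    filled = []
    for entry in analyses:
        updated = dict(entry)
        updated["url"] = download_map.get(str(entry["slug"]), "")
        filled.append(updated)
    return filled


# Hypothetical download map (slug -> source URL), reusing the example URLs.
download_map = {
    "fig1_loss_curves": "https://example.org/paper/fig1_loss_curves.png",
    "fig2_attention_heatmap": "https://example.org/paper/fig2_attention_heatmap.png",
}
model_output = [
    {"slug": "fig1_loss_curves", "url": "", "figure_type": "plot",
     "description": "Two training loss curves.", "notes": ""},
    unreadable_stub("fig2_attention_heatmap"),
]
final = fill_urls(model_output, download_map)
```

Because `fill_urls` copies each entry, the raw model output stays available for debugging alongside the finalized records.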

Validation methodology — author-stated

Tested: 50 known Earth-science queries with ground-truth CMR concept IDs (seed-record placeholder).

Data: Curated query set from NASA-IMPACT teams (seed-record placeholder).

Metric: Reference collection appears in ranked top-5 (seed-record placeholder).

Result: Seed record; the author should publish a real validation before public release.

Caveat: This is a registry seed record; the validation block was filled with placeholders. The author is encouraged to submit a new version with real numbers and a real caveat.

Required tools: live health (live status of MCP endpoints this agent depends on; not registry-verified)

Reproductions (independent runs by other scientists; Tier 5 trigger)

Ran this agent yourself? File an independent reproduction; it can move the listing to Tier 5.

Other disclosures (as described by the author)

Intended use: Designed for automated extraction of structured metadata from scientific figures (plots, illustrations, schematics) in research papers and technical reports. Intended as a preprocessing step for downstream literature analysis, figure indexing, or accessibility workflows where a machine-readable description of each figure is needed.
Out of scope: Not a scientific reasoning or interpretation agent — it describes what is visible, not what it means scientifically. Does not assess statistical validity, reproduce numerical results, or draw conclusions beyond what the figure shows. Not designed for real-time or interactive figure annotation; operates in batch mode only. Not suitable for figures with intentionally obscured or encrypted content.
Known failure modes: Low-resolution or heavily compressed images may produce imprecise value readings reported as "approximately." Composite multi-panel figures may have panels mis-typed if panel boundaries are unclear. Figures with non-standard color scales or perceptually similar palettes may have legend entries mis-matched. Download failures are silently skipped — a missing FigureAnalysis in the output indicates a failed download, not an absent figure. Axis labels in non-Latin scripts may not be transcribed correctly.
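
Because failed downloads are skipped without an error, a caller may want to compare the requested URLs against the returned analyses. The check below is a sketch of one way to do that; `missing_downloads` is a hypothetical helper, and it assumes each returned entry carries the `url` it was downloaded from, as in the worked example.

```python
from typing import Dict, List


def missing_downloads(
    requested_urls: List[str], analyses: List[Dict[str, object]]
) -> List[str]:
    """Return requested URLs that have no corresponding analysis entry."""
    returned = {entry.get("url") for entry in analyses}
    return [url for url in requested_urls if url not in returned]


requested = [
    "https://example.org/paper/fig1_loss_curves.png",
    "https://example.org/paper/fig2_attention_heatmap.png",
]
# Hypothetical output in which the second download failed and was skipped.
analyses = [
    {"slug": "fig1_loss_curves",
     "url": "https://example.org/paper/fig1_loss_curves.png"},
]
skipped = missing_downloads(requested, analyses)
```

Any URL reported here points to a failed download rather than an absent figure, so it is a candidate for a retry.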

Worked example

input

urls:
  - "https://example.org/paper/fig1_loss_curves.png"
  - "https://example.org/paper/fig2_attention_heatmap.png"
context: "Training ablation study comparing baseline vs. proposed model on CIFAR-10."

output

analyses:
  - slug: "fig1_loss_curves"
    url: "https://example.org/paper/fig1_loss_curves.png"
    figure_type: "plot"
    description: "line_plot — Two training loss curves plotted against epoch (x-axis 0–100).
      Baseline (blue solid) starts at approximately 2.3, decreases steeply to ~0.8 by
      epoch 20, then plateaus around 0.6 with minor oscillations through epoch 100.
      Proposed model (orange dashed) starts identically at ~2.3, decreases more steeply
      reaching ~0.4 by epoch 20, and continues declining to ~0.2 by epoch 100 with no
      visible plateau. The two curves cross at approximately epoch 5; the proposed model
      remains strictly lower thereafter."
    x_axis: "Epoch (0–100)"
    y_axis: "Cross-entropy loss (0–2.5)"
    legend: ["baseline — blue solid", "proposed — orange dashed"]
    caption: "Figure 1. Training loss curves for baseline and proposed model."
    notes: ""
  - slug: "fig2_attention_heatmap"
    url: "https://example.org/paper/fig2_attention_heatmap.png"
    figure_type: "plot"
    description: "heatmap_or_matrix — 12×12 attention weight matrix. High values (deep red,
      ~0.8–1.0) concentrated on the diagonal, indicating strong self-attention. Notable
      off-diagonal cluster in rows 3–5, columns 8–10 with values ~0.4–0.6, suggesting
      cross-token dependencies. Lower-left triangle predominantly near zero (blue)."
    x_axis: "Token position (0–11)"
    y_axis: "Token position (0–11)"
    legend: []
    caption: "Figure 2. Layer-6 self-attention weights."
    notes: "Color scale legend not visible in image; intensity interpreted from colorbar tick marks."
markdown: |
  # Image Analysis Report
  ## Context
  Training ablation study comparing baseline vs. proposed model on CIFAR-10.
  ## Figures (2 total)
  ### 1. `fig1_loss_curves` — _plot_
  ...
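
For downstream code, the per-image records in the output above could be loaded into a typed structure. The dataclass below mirrors the field names shown in the example; it is a sketch, and the field types and default values are assumptions, since the listing does not publish the actual FigureAnalysis schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FigureAnalysis:
    """Assumed shape of one per-image record, inferred from the example output."""
    slug: str
    url: str = ""
    figure_type: str = "unknown"
    description: str = ""
    x_axis: str = ""
    y_axis: str = ""
    legend: List[str] = field(default_factory=list)
    caption: str = ""
    notes: str = ""


# Second record from the worked example, with the description shortened here.
raw = {
    "slug": "fig2_attention_heatmap",
    "url": "https://example.org/paper/fig2_attention_heatmap.png",
    "figure_type": "plot",
    "description": "12x12 attention weight matrix.",
    "x_axis": "Token position (0–11)",
    "y_axis": "Token position (0–11)",
    "legend": [],
    "caption": "Figure 2. Layer-6 self-attention weights.",
    "notes": "",
}
record = FigureAnalysis(**raw)
```

Unpacking with `**raw` raises a `TypeError` on unexpected keys, which gives an early signal if the real schema differs from this guess.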

What this listing is. A structured, format-conformant submission, screened for topic and obvious safety issues. The registry does not verify that the agent is correct, that it works, or that the author's disclosures are accurate. Evaluate before relying on it for research.