Hosted and managed by the University of Alabama in Huntsville

Agentarium

scientific agent registry

Moderation Policy

Moderation in Agentarium is the smallest amount of human review that maintains trust. Most decisions are made by the automated conformance gate; humans step in only when judgment is required.

What the conformance gate does (no humans involved)

The gate runs on every submission and checks:

  • All required fields are present and non-empty.
  • Required-field content is substantive (no "n/a", "none", "tbd" placeholders).
  • The validation block has a non-trivial caveat — "none" is rejected.
  • Every tool the agent references in its prompt is registered and declared.
  • Declared tools are actually mentioned in the prompt (warning, not block).
  • The submitted record validates against agent.schema.json.
  • Guardrails are declared as discrete mechanisms, not as adjectives.
  • The author exists, has ORCID-verified status, and (for first-time authors) has an endorsement.

Submissions that fail the gate get specific error messages and never enter the human queue. Authors fix and resubmit.
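A minimal sketch of how a few of these checks might look, covering placeholder detection and the tool cross-check only. The record layout (`prompt`, `tools`), the `tool:name` reference convention, and the function names are assumptions made for illustration; the actual gate and `agent.schema.json` may differ.

```python
import re

# Placeholder values the policy names as non-substantive.
PLACEHOLDERS = {"n/a", "none", "tbd"}


def is_substantive(value: str) -> bool:
    """True if the value is non-empty and not a bare placeholder."""
    normalized = value.strip().lower()
    return bool(normalized) and normalized not in PLACEHOLDERS


def check_submission(record: dict, required: list, registered_tools: set):
    """Return (errors, warnings) for a submitted record.

    Errors block the submission; warnings do not.
    """
    errors, warnings = [], []

    # Required fields must be present and substantive.
    for field in required:
        if not is_substantive(str(record.get(field, ""))):
            errors.append(f"{field}: missing or placeholder")

    declared = set(record.get("tools", []))
    # Hypothetical convention: the prompt references tools as `tool:<name>`.
    referenced = set(re.findall(r"tool:([\w-]+)", record.get("prompt", "")))

    # Every referenced tool must be registered and declared (blocking).
    for name in sorted(referenced):
        if name not in registered_tools:
            errors.append(f"prompt references unregistered tool '{name}'")
        elif name not in declared:
            errors.append(f"prompt references undeclared tool '{name}'")

    # Declared tools that the prompt never mentions only raise a warning.
    for name in sorted(declared - referenced):
        warnings.append(f"declared tool '{name}' is never mentioned in the prompt")

    return errors, warnings
```

Keeping errors and warnings in separate lists mirrors the distinction the policy draws between blocking checks and the "warning, not block" check.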

Auto-approval matrix

| Action | Auto-approved when | Human-reviewed when |
|---|---|---|
| Agent submission, endorsed author | Gate passes + tools resolve + on-topic domain | Borderline domain, manual flag |
| Agent submission, unendorsed author | Never | Endorsement first, then auto-flows |
| Tool registration, remote-query | Auth declared, schema valid, URL reachable | Borderline permission scope |
| Tool registration, local-action | Never | Always |
| Author withdrawal of own agent | ORCID match | Identity dispute |
| Agent supersession (replace own) | Same author + same concept_id | Cross-author supersession |
| Endorsement request | Never | Always (one moderator) |
| Public flag report | Never | Always (within 7 days) |

The principle: conformance is mechanical; identity and scope are human.
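The agent-submission rows of the matrix can be read as a small routing function. The sketch below is illustrative only: the `Route` enum, the parameter names, and the choice to send unresolved tools to human review are assumptions, not the registry's actual implementation.

```python
from enum import Enum


class Route(Enum):
    RETURNED = "returned to author"   # gate failure: fix and resubmit
    HUMAN = "human review"
    AUTO = "auto-approved"


def route_agent_submission(
    gate_passed: bool,
    author_endorsed: bool,
    tools_resolve: bool,
    domain_on_topic: bool,
    manually_flagged: bool = False,
) -> Route:
    """Route an agent submission according to the auto-approval matrix."""
    # Gate failures never enter the human queue.
    if not gate_passed:
        return Route.RETURNED
    # Unendorsed authors are never auto-approved; endorsement comes first.
    if not author_endorsed:
        return Route.HUMAN
    # Borderline domains and manual flags require human judgment.
    if manually_flagged or not domain_on_topic:
        return Route.HUMAN
    # Assumption: unresolved tools fall back to human review.
    if not tools_resolve:
        return Route.HUMAN
    return Route.AUTO
```

For example, `route_agent_submission(True, True, True, True)` yields `Route.AUTO`, while the same call with `author_endorsed=False` yields `Route.HUMAN`.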

Service-level targets

| Action | Target turnaround |
|---|---|
| First-time author endorsement | 72 hours |
| Tool registration review | 72 hours |
| Flag report review | 7 days |
| Withdrawal processing | 24 hours |
| Appeal review | 72 hours |

These are targets, not guarantees. The current queue depth and any unmet SLAs are shown on /about.
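As a rough illustration, the targets above can be expressed as a lookup table from which a due time and an overdue check follow. The action keys and function names are hypothetical; only the durations come from this policy.

```python
from datetime import datetime, timedelta, timezone

# Durations taken from the service-level targets; keys are illustrative.
SLA_TARGETS = {
    "first_time_author_endorsement": timedelta(hours=72),
    "tool_registration_review": timedelta(hours=72),
    "flag_report_review": timedelta(days=7),
    "withdrawal_processing": timedelta(hours=24),
    "appeal_review": timedelta(hours=72),
}


def target_deadline(action: str, opened_at: datetime) -> datetime:
    """Time by which the action should be handled to meet its target."""
    return opened_at + SLA_TARGETS[action]


def is_overdue(action: str, opened_at: datetime, now: datetime) -> bool:
    """True if the target turnaround has already elapsed."""
    return now > target_deadline(action, opened_at)


opened = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(target_deadline("flag_report_review", opened))  # 2025-01-08 00:00:00+00:00
```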

What gets moderated, in practice

First-time author endorsement

A new author can't publish without one moderator (or qualified endorser) signing off. See Endorsement Policy for the full process.

Off-topic submissions

If an author selects "something else / off-topic" or the domain classifier flags the submission as borderline, the submission routes to moderation. Moderators decide whether the submission belongs in a recognized scientific domain or whether the registry isn't the right home for it. Off-topic decisions include a redirect suggestion when possible.

Tool registrations

  • Remote-query tools auto-register if the endpoint responds to a health check with the declared schema. Borderline permission claims (e.g., "remote-query" for a tool that takes actions) go to human review.
  • Local-action tools always get human review. The review is heavier: documentation must clearly state what files/processes the tool touches, what user data it transmits, and what authorization is required. Local-action tools may be approved with documented warnings shown to consumers.
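The two tool-registration paths could be sketched as follows. The `ToolRegistration` fields and the outcome strings are assumptions for illustration, and the policy does not say what happens when a remote-query health check fails, so the `"not_registered"` outcome is a guess.

```python
from dataclasses import dataclass


@dataclass
class ToolRegistration:
    kind: str                    # "remote-query" or "local-action"
    health_check_ok: bool        # endpoint responded to the health check
    schema_matches: bool         # response matched the declared schema
    takes_actions: bool = False  # tool does more than query


def route_tool_registration(reg: ToolRegistration) -> str:
    """Return "auto_register", "human_review", or "not_registered"."""
    # Local-action tools always get human review.
    if reg.kind == "local-action":
        return "human_review"
    # A "remote-query" claim for a tool that takes actions is borderline.
    if reg.takes_actions:
        return "human_review"
    if reg.health_check_ok and reg.schema_matches:
        return "auto_register"
    # Assumption: a failed health check is returned to the submitter.
    return "not_registered"
```

Checking the tool kind before anything else ensures that no combination of passing health checks can auto-register a local-action tool.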

Flag reports

Anyone can flag a listed agent or tool. Flag reasons:

  • misrepresented_validation — the validation block doesn't match the agent's actual behavior
  • broken_tool — endpoint has been unreachable for more than the grace period
  • fraud — fabricated authorship, fake ORCID, fake institution
  • off_topic — listing doesn't belong in a scientific registry
  • safety — the agent or tool poses a safety concern
  • other — anything else; include a description

Moderators review within 7 days. The accused author is notified, can respond, and the decision goes to the public audit log with reasons.
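For illustration, the flag reasons map naturally onto an enumeration, with a check that `other` carries a description. The `FlagReport` structure and `validate_flag` helper are hypothetical; only the reason codes come from this policy.

```python
from dataclasses import dataclass
from enum import Enum


class FlagReason(Enum):
    MISREPRESENTED_VALIDATION = "misrepresented_validation"
    BROKEN_TOOL = "broken_tool"
    FRAUD = "fraud"
    OFF_TOPIC = "off_topic"
    SAFETY = "safety"
    OTHER = "other"


@dataclass
class FlagReport:
    target_id: str          # hypothetical identifier of the flagged agent or tool
    reason: FlagReason
    description: str = ""


def validate_flag(report: FlagReport) -> list:
    """Return a list of problems; an empty list means the report is acceptable."""
    problems = []
    # The "other" reason must come with a description.
    if report.reason is FlagReason.OTHER and not report.description.strip():
        problems.append("reason 'other' requires a description")
    return problems
```

Constructing the enum from a submitted string, as in `FlagReason("fraud")`, raises `ValueError` for any reason code outside the list above.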

Withdrawals

  • Voluntary withdrawal (author's own listing): ORCID-verified author, 24-hour processing, agent stays citable with a public withdrawal notice.
  • Forced withdrawal (moderator-initiated): requires two moderators (one to propose, one to approve), full audit log entry, author appeal option.
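The authorization rules for the two withdrawal paths are simple enough to sketch directly. The function names and the use of plain strings for ORCID iDs and moderator identifiers are assumptions for illustration.

```python
def can_withdraw_voluntarily(requester_orcid: str, author_orcid: str) -> bool:
    """Voluntary withdrawal: the requester must be the listing's own author."""
    return requester_orcid == author_orcid


def can_force_withdraw(proposer: str, approver: str, moderators: set) -> bool:
    """Forced withdrawal: two distinct moderators, one proposing, one approving."""
    return (
        proposer in moderators
        and approver in moderators
        and proposer != approver
    )
```

The `proposer != approver` condition is what enforces the two-moderator requirement; without it a single moderator could fill both roles.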

Appeals

Every moderation decision can be appealed once. The appeal:

  • Routes to a different moderator than the original.
  • Must be filed within 30 days of the original decision.
  • Is itself logged in the audit trail.
  • Results in: uphold, reverse, or modify (with new reason).

There is no second-level appeal in v1. If a community-wide concern arises, it goes through the policy-change process (see Governance, "Changes to these policies").
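A minimal sketch of the appeal rules, assuming timestamps are available for the original decision and the filing. The function names and the "first eligible moderator" selection strategy are illustrative assumptions; the policy only requires that the reviewer differ from the original moderator.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

APPEAL_WINDOW = timedelta(days=30)
OUTCOMES = {"uphold", "reverse", "modify"}


def appeal_problems(decided_at: datetime, filed_at: datetime,
                    already_appealed: bool) -> list:
    """Return the reasons an appeal cannot be accepted (empty if it can)."""
    problems = []
    # Each decision can be appealed only once; there is no second level in v1.
    if already_appealed:
        problems.append("decision has already been appealed")
    if filed_at - decided_at > APPEAL_WINDOW:
        problems.append("filed more than 30 days after the decision")
    return problems


def pick_reviewer(original_moderator: str, moderators: list) -> Optional[str]:
    """Pick a moderator other than the one who made the original decision."""
    for candidate in moderators:
        if candidate != original_moderator:
            return candidate
    return None  # no eligible reviewer available
```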

Moderator selection and accountability

  • Founding pool: 1 UAH staff moderator + 2–3 AKD/NASA-IMPACT moderators.
  • New moderators are nominated by an existing moderator and require approval from one other moderator. Conflicts are disclosed publicly on the new moderator's profile.
  • Moderator terms: 2 years renewable. Moderators can step down at any time.
  • Each moderator's actions are visible on their public profile. This is intentional: moderation power is paired with public traceability.

What we don't do

  • We don't pre-review correctness. Author claims are author claims.
  • We don't moderate the content of an agent's prompts beyond the safety screen. The gate checks structure; the prompt's scientific quality is the reader's job.
  • We don't act on anonymous accusations without independent evidence.
  • We don't moderate based on funding source, institutional affiliation, or other non-conduct attributes.

When in doubt

Moderators err toward the lighter action:

  • A confusing submission gets a clarification request, not rejection.
  • A borderline domain gets a question to the author, not a block.
  • A first-time mistake gets a fix-and-resubmit, not a strike.
  • A clear bad-faith pattern gets the heavier action.

This is the same posture arXiv takes, and we adopt it for the same reason: the registry's credibility comes from being predictable and fair, not from being strict.