Hosted and managed by the University of Alabama in Huntsville

Agentarium

scientific agent registry

Moderator Handbook

This is the internal playbook moderators work from. It's public because the registry's credibility depends on visible reasoning.

If you're a new moderator, read this end-to-end once. If you're an experienced moderator, use it as your quick reference.

Your role in one paragraph

You are not a peer reviewer. You don't judge whether agents are scientifically correct, whether the validation numbers are accurate, or whether the methodology is sound. You judge whether a submission belongs in the registry as a scientific artifact — does it have the structure, is the topic in scope, is the author plausible? The reader's job is to evaluate the science. Your job is to keep the registry from filling up with junk, fraud, or off-topic content.

What you'll see in the queue

Every item in your moderation queue is there because the automated gate couldn't make the call alone. There are five types of items:

  1. First-time author endorsements — someone new wants to publish; one moderator endorses them. (See Endorsement Policy.)
  2. Borderline-domain submissions — the automatic on-topic check flagged this as ambiguous.
  3. Local-action tool registrations — always reviewed, because the tool can act on user machines.
  4. Flag reports from the public — someone reported a listed agent or tool.
  5. Appeals — a previous moderation decision is being contested.

The queue UI shows: item type, age, SLA, and whether you have any disclosed conflicts.
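For readers who think in data structures, the queue described above can be pictured roughly as follows. This is an illustrative sketch only, not the registry's actual schema: the class names, field names, and the choice to express the SLA in days are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class ItemType(Enum):
    """The five kinds of item that reach a moderator's queue."""
    ENDORSEMENT = "first-time author endorsement"
    BORDERLINE_DOMAIN = "borderline-domain submission"
    LOCAL_ACTION_TOOL = "local-action tool registration"
    FLAG_REPORT = "flag report from the public"
    APPEAL = "appeal"


@dataclass
class QueueItem:
    """Hypothetical shape of one queue row, mirroring what the UI shows."""
    item_type: ItemType
    age_days: int                  # how long the item has been waiting
    sla_days: int                  # assumed unit; the real SLA format may differ
    has_disclosed_conflict: bool   # whether you have a disclosed conflict

    def is_overdue(self) -> bool:
        # An item is overdue once its age exceeds its SLA.
        return self.age_days > self.sla_days
```

A row with `has_disclosed_conflict=True` is one you would normally recuse from rather than act on.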

Endorsements — what to actually check

A first-time author endorsement is a quick check (target: 5 minutes per request), not a deep review.

Check these:

  • Their ORCID record exists and matches the institution they claim.
  • Their ORCID has at least some prior public activity (publications, datasets, etc.), or their affiliation is with a known institution.
  • Their first draft submission has substantive content — not obvious test data, not placeholder text, not copy-paste from another agent.
  • The claimed domain is plausible for someone with their background.
  • The author isn't a known sock puppet or banned account (the moderation tool will flag the account if it shares an IP address or email with a banned account).

Don't check:

  • Whether their validation numbers are right
  • Whether the agent will work
  • Whether the tools they declare exist and are good
  • Whether you'd cite this agent yourself

The decision:

  • Endorse — the author is real and the submission isn't on its face fraudulent.
  • Decline — there's something specific that gives you pause. Provide a one-sentence reason (the author sees this).
  • Request info — you need more before deciding. Provide a specific question.

If you can't endorse but you also don't want to decline outright, decline with a helpful reason and a suggestion. "Please add an example with a complete validation block before I review again" is better than "decline."

Off-topic / borderline domain

The on-topic classifier flags submissions where the domain looks ambiguous or out-of-scope.

Check:

  • Is this in a recognized scientific domain (Earth, Planetary, Astrophysics, Physical, Bio)?
  • If "other," is there a clear scientific use case described?
  • Is the agent positioned as a scientific tool, or is it a general-purpose tool that happens to mention science?

Decision:

  • Approve — it's scientific, even if niche.
  • Reject with redirect — "This looks like a general-purpose tool, not a scientific agent. Consider [other registry] or rewrite the submission to focus on the scientific application."
  • Approve with note — borderline; the listing gets a "domain reviewed: borderline" banner.

Local-action tool registrations

This is the heaviest review you do. Local-action tools can read/write user files, execute code, or otherwise act on user machines. Approval means consumers of the registry will see this tool as "vetted enough to register" — that's a real claim and you should be careful.

Required, before approving:

  • The tool's documentation explicitly states what files / directories / processes it touches.
  • The tool's documentation states what authorization is required (e.g., "user must run an installer," "user grants per-file permission").
  • The endpoint is reachable.
  • The schema declares the actions clearly.
  • The operator is ORCID-verified and institutionally affiliated.
  • You can articulate, in one sentence, what a misuse of this tool would look like — and whether the documentation makes that misuse harder.

Approve with consumer warnings:

If the tool is legitimate but powerful, approve with a banner that will be shown to every consumer: "This tool can [do X on your machine]. Review before installing."

Reject:

  • The operator can't be verified.
  • The documentation is too vague to evaluate the safety surface.
  • The tool exposes capabilities that you don't think a registered tool should have without much heavier review.

It's fine to reject and explain what would unblock the request. "Add a documented permission model and resubmit" is a normal outcome.

Flag reports

Anyone can flag a listing. Most flags are honest reports of real problems; a small fraction are mistaken or bad-faith. Treat each on its merits.

Process:

  1. Read the flag report carefully. Note the flag reason category.
  2. Look at the listed agent/tool, the author's other listings, the audit log.
  3. If the report's claim can be verified against public information (e.g., "broken tool" — check the health log), verify it.
  4. Reach out to the author through the moderation thread. Give them 72 hours to respond (24 hours for safety-category flags).
  5. Decide.
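The response windows in step 4 come down to simple date arithmetic. The sketch below shows one way to compute the author's deadline; the function and constant names are hypothetical, and the moderation tool may track this differently.

```python
from datetime import datetime, timedelta, timezone

# Response windows taken from step 4 above.
STANDARD_WINDOW = timedelta(hours=72)
SAFETY_WINDOW = timedelta(hours=24)


def author_response_deadline(contacted_at: datetime, is_safety_flag: bool) -> datetime:
    """Return when the author's response window closes.

    contacted_at is the moment you posted in the moderation thread.
    """
    window = SAFETY_WINDOW if is_safety_flag else STANDARD_WINDOW
    return contacted_at + window


# Example: author contacted at noon UTC on 1 March 2025.
contacted = datetime(2025, 3, 1, 12, 0, tzinfo=timezone.utc)
standard_deadline = author_response_deadline(contacted, is_safety_flag=False)
safety_deadline = author_response_deadline(contacted, is_safety_flag=True)
```

Using timezone-aware timestamps avoids ambiguity when the moderator and the author are in different time zones.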

Outcomes:

  • Resolved, no action — the flag is wrong or unsubstantiated.
  • Resolved, actioned — corrective action taken (banner added, status flipped, withdrawal).
  • Escalated — needs another moderator's eyes or UAH admin review.

Every flag decision is logged publicly with your reason.

Withdrawals

Voluntary withdrawal by the author is mostly automated; you might see one in the queue if there's an identity question. Verify that the ORCID matches, then approve.

Forced withdrawal is heavy. It needs:

  • A second moderator's approval, unless you are UAH staff.
  • A written reason from one of the allowed categories.
  • A 72-hour author response window. The exception is safety or legal cases, where the withdrawal is immediate and the author responds afterward.
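The three requirements above can be read as a precondition check. The following is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, and the allowed reason categories are passed in rather than listed because this handbook does not enumerate them.

```python
from typing import Collection, List


def forced_withdrawal_blockers(
    *,
    second_moderator_approved: bool,
    acting_moderator_is_uah_staff: bool,
    reason_category: str,
    allowed_categories: Collection[str],
    hours_since_author_notified: float,
    is_safety_or_legal: bool,
) -> List[str]:
    """Return the unmet requirements; an empty list means none remain."""
    blockers = []
    # Either a second moderator signs off, or the acting moderator is UAH staff.
    if not (second_moderator_approved or acting_moderator_is_uah_staff):
        blockers.append("needs a second moderator's approval")
    # The written reason must fall under one of the allowed categories.
    if reason_category not in allowed_categories:
        blockers.append("reason is not in an allowed category")
    # Safety and legal cases skip the wait; the author responds afterward.
    if not is_safety_or_legal and hours_since_author_notified < 72:
        blockers.append("72-hour author response window is still open")
    return blockers
```

Returning the list of blockers rather than a bare yes/no keeps the outcome explainable, which matters because the withdrawal notice is public.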

The withdrawal notice is public and names you (and the second moderator). Make sure the reason will read well to a third party reviewing the audit log a year from now.

Appeals

Appeals go to a moderator other than the one who made the original decision, so any appeal in your queue contests someone else's call.

Read in this order:

  1. The appeal text (the author's argument).
  2. The original moderation decision and its reason.
  3. The submission itself.
  4. The audit log.

Decide:

  • Uphold — the original decision was correct; explain why in light of the appeal's specific points.
  • Reverse — the original decision was wrong; restore status; note that the appeal succeeded.
  • Modify — partial correction (e.g., uphold the rejection but change the reason; or restore with a banner).

Appeals are one-shot. There's no second-level appeal in v1. So write your decision as if it's the final answer — because it is.

Communication style

  • Be specific. "This submission is missing the validation caveat field" beats "this isn't ready."
  • Be neutral. You're not the author's adversary or advocate. You're the registry's filter.
  • Be brief. A two-sentence reason is fine if it's specific. A two-paragraph reason is okay if the situation needs it. A page-long reason is almost always a sign you're doing the reader's job.
  • Cite policy when you reject. Link to the specific section of the policy you're applying. This is how the policies get refined over time.

What to do when you're not sure

  • Recuse if you have a conflict. See Conflict of Interest.
  • Ask another moderator. Use the internal moderation thread. Other moderators see your question and can weigh in.
  • Default to the lighter action. A "request more info" beats a "reject." A 30-day grace period beats a withdrawal. Authors should get fixable feedback before they get hard rejections, except where bad faith is clear.
  • Don't act fast on emotional reactions. If a submission makes you angry, sleep on it. The 7-day flag SLA gives you time.

Tooling reference

| Action | Where |
|---|---|
| See your queue | /moderation |
| Take an action | Per-item detail page |
| File a recusal | Item detail page → Recuse |
| See your action history | Your profile page |
| Disclose a conflict | Item detail page → Disclose conflict |
| Escalate | Item detail page → Escalate to UAH admin |
| Ask a co-moderator | Item detail page → Internal note |
| Look up the policies | /governance |

When to step down

It happens. Life events, conflicts with the role, scope drift — any of these are reasons to step down. The process:

  1. Notify the moderator pool.
  2. Hand off any open items.
  3. UAH admin removes your moderator role.

Your past moderation actions remain in the public audit log. That's a feature, not a bug — moderator accountability doesn't end when the role does.

What we expect from you

  • Respond to queue items within the SLA.
  • Write reasons that an outsider could read and understand.
  • Disclose conflicts; recuse when they're material.
  • Treat authors and flag reporters with professional respect.
  • Stay current with policy changes (announced 14 days in advance).
  • Don't be a stranger. Quiet moderators get rotated out, not because they're bad at the role but because the queue needs people who are actively working it.

Three hours a week is the rough ask. Less is fine in slow periods; more is fine in active ones. The dashboard shows queue depth so you can self-pace.

Where to ask questions

In the moderator-only channel (set up at launch). For policy clarifications that should become public, open a PR on the relevant policy document.