---
title: "Introducing Science Superpowers: Scientific Discipline for Your Research Agent"
description: "Our new open source methodology makes AI research agents pre-register hypotheses, work reproducibly, and verify before they claim. Pre-registration over TDD."
updatedAt: "2026-05-28"
tags: ["AI", "Research", "Open Source", "Skills"]
canonical: "https://k-dense.ai/blog/introducing-science-superpowers"
---
Give a capable AI agent a dataset and a loosely worded question, and it will happily get to work. It loads the data, tries a few models, finds something that clears p < 0.05, and writes you a confident summary. The code runs cleanly. The plots are pretty. And you have no real way of knowing whether the result is true or whether the agent just went fishing until something surfaced.

This is the uncomfortable part of pointing powerful models at science. The bottleneck was never whether an agent *could* run the analysis. It is whether you should believe the answer.

Today we are releasing **[Science Superpowers](https://github.com/K-Dense-AI/science-superpowers)**, an open source, MIT-licensed methodology that gives your research agent the discipline to earn that belief. It is a complete computational-science workflow delivered as a set of composable skills, plus the instructions that make sure your agent actually uses them.

## Capability is not the bottleneck. Discipline is.

Science already has a credibility problem, and it is mostly a human one. When *Nature* surveyed more than 1,500 researchers, a large majority said science faces a [reproducibility crisis](https://www.nature.com/articles/533452a). The usual suspects are well documented: [p-hacking](https://journals.sagepub.com/doi/10.1177/0956797611417632) (trying specifications until one is significant), [HARKing](https://journals.sagepub.com/doi/10.1207/s15327957pspr0203_4) (hypothesizing after the results are known), optional stopping, and the garden of forking paths.

Now hand that same toolbox to a tireless agent that can run a thousand analyses before lunch. Speed does not fix the problem. It industrializes it. An undisciplined agent will fork down every path, surface the one that "worked," and narrate it back to you as if it were the plan all along. More compute, without rigor, just means more ways to fool yourself faster.
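
To put a rough number on that, here is a minimal sketch of the standard family-wise error calculation. It assumes the tests are independent and that every null hypothesis is true, which real forking paths rarely satisfy exactly, so read it as an illustration rather than a model of any particular agent. The function name is ours, not part of Science Superpowers.

```python
def chance_of_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent tests of true nulls
    comes out below alpha purely by chance."""
    return 1.0 - (1.0 - alpha) ** n_tests


# How quickly "something significant" becomes near-certain as analyses pile up.
for n in (1, 5, 20, 100):
    rate = chance_of_false_positive(n)
    print(f"{n:>4} tests -> {rate:.0%} chance of at least one spurious hit")
```

Under those assumptions, twenty unplanned analyses already carry roughly a 64% chance of producing at least one "significant" result from pure noise.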

The fix is not a smarter model. It is a stricter process around the model.

## What Science Superpowers is

Science Superpowers is a science-domain reimplementation of [Superpowers](https://github.com/obra/superpowers), the agentic methodology Jesse Vincent built for software-development agents. The architecture is the same: composable skills that auto-trigger via a session-start bootstrap. What changes is the domain and the central discipline.

Where Superpowers puts test-driven development at its core, Science Superpowers puts **pre-registration**. The pre-registration skill draws the parallel exactly:

> In test-driven development you write the test before the code so you know the test tests something. Here you write the prediction before the result so you know the result confirms something.

That single reordering, prediction before result, is the difference between a finding that replicates and one that evaporates.

## The Iron Law: predict before you peek

Pre-registration is the spine of the whole methodology, and the skill states it without hedging:

> NO CONFIRMATORY CLAIM WITHOUT A PRE-REGISTERED PREDICTION FIRST

Before the agent touches an outcome, it writes down the hypotheses, the exact analysis (model, variables, transformations, exclusions, covariates), a directional prediction, and the decision rule that will count as confirming or disconfirming. It states, explicitly, which result would prove it wrong. Then it commits that document to git. The commit is the timestamp. It proves the prediction came before the result.
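
As a rough illustration of what gets frozen, the sketch below models a pre-registration as an immutable record with a content hash. The class name, field names, and example study are hypothetical and chosen only to mirror the list above; the actual skill is a set of instructions, and its document format may look nothing like this.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Preregistration:
    hypothesis: str
    analysis: str       # model, variables, transformations, exclusions, covariates
    prediction: str     # the directional prediction
    decision_rule: str  # what counts as confirming or disconfirming
    falsified_by: str   # the result that would prove the hypothesis wrong

    def fingerprint(self) -> str:
        """SHA-256 of the registered content, stable across runs."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


# Hypothetical example, written before any outcome data is loaded.
prereg = Preregistration(
    hypothesis="Treatment lowers mean systolic blood pressure versus control.",
    analysis="Welch two-sample t-test on change from baseline; no exclusions.",
    prediction="Mean change is lower in the treatment group than in control.",
    decision_rule="Confirmed if one-sided p < 0.05; otherwise not confirmed.",
    falsified_by="Mean change in treatment is greater than or equal to control.",
)
print(prereg.fingerprint())
```

Any edit to the registered text changes the fingerprint, which is the same property the git commit provides: a later reader can check that the plan they see is the plan that was frozen.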

After the freeze, the agent runs exactly what it registered. Anything it did not register (a new subgroup, a different model, a covariate that "obviously" belongs once you have seen the data) gets labeled exploratory. Exploratory work is allowed and valuable, but it is never dressed up as confirmation. It becomes a lead for the next pre-registration, not a conclusion in this one.
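
The labeling rule is simple enough to sketch in a few lines. This is an illustration under our own assumptions, not the skill's implementation: `AnalysisSpec` and `label` are hypothetical names, and a real analysis spec would carry more detail than four fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen makes specs hashable, so they can live in a set
class AnalysisSpec:
    model: str
    outcome: str
    covariates: tuple = ()
    subgroup: Optional[str] = None


def label(spec: AnalysisSpec, registered: frozenset) -> str:
    """An analysis is confirmatory only if it matches a registered spec exactly."""
    return "confirmatory" if spec in registered else "exploratory"


# Hypothetical registered plan: one model, one outcome, nothing else.
registered = frozenset({AnalysisSpec(model="welch_t", outcome="bp_change")})

planned = AnalysisSpec(model="welch_t", outcome="bp_change")
afterthought = AnalysisSpec(model="welch_t", outcome="bp_change", subgroup="age_over_65")

print(label(planned, registered))       # matches the registered spec
print(label(afterthought, registered))  # subgroup was not registered
```

The exact-match comparison is the point of the design: a spec that differs in any field, even one that seems harmless, falls on the exploratory side.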

This is enforced from the very first step. The framing skill puts a hard gate at the front of every investigation: the agent may not load the dataset, fit a model, or plot an outcome until you have approved a precise, falsifiable question. The reasoning is blunt. Once an agent has seen the data, it cannot un-see it, and every later choice becomes suspect. Framing first is what keeps a result confirmatory instead of a story told after the fact. This applies to every investigation, no matter how simple it looks. A "quick t-test" has forking paths too.

## The workflow

Science Superpowers walks the agent through the full research lifecycle, one disciplined step at a time:

1. **Frame the question** (`framing-research-questions`). Turn a fuzzy interest into a precise, falsifiable question, with hypotheses and success criteria, before any data is touched.
2. **Survey prior work** (`surveying-prior-work`). Ground the question and the chosen methods in what is already known: standard methods, known confounds, prior effect sizes.
3. **Design the analysis** (`designing-the-analysis`). Break the work into bite-sized steps with exact datasets, variables, tests, power, and decision rules.
4. **Pre-register** (`preregistering-analysis`). The Iron Law. Lock predictions and decision rules, then freeze them with a commit.
5. **Set up a reproducible workspace** (`setting-up-reproducible-analysis`). Pinned environment, fixed seeds, immutable raw data, clean baseline.
6. **Execute** (`subagent-driven-analysis` or `executing-analysis`). Run the pre-registered plan with review checkpoints.
7. **Investigate anomalies** (`investigating-anomalous-results`). When a result looks wrong, run a root-cause investigation instead of quietly dropping the inconvenient data.
8. **Verify before claiming** (`verifying-results-before-claiming`). Re-run, check assumptions, test robustness, reproduce the evidence.
9. **Red-team the result** (`requesting-red-team-review` and `receiving-critical-review`). Dispatch a skeptical reviewer to attack the analysis, then respond with rigor rather than performative agreement.
10. **Report and archive** (`reporting-and-archiving-findings`). Run a reproducibility check, decide how to report, then archive the code, data, and environment.

A `dispatching-parallel-investigations` skill handles concurrent independent threads, and two meta skills (`writing-science-skills` and `using-science-superpowers`) round out the library at fifteen skills total. The recurring theme across all of them: investigate the root cause, verify against fresh evidence, and never quietly drop data that does not fit.
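
Two of the ingredients from step 5, fixed seeds and immutable raw data, can be sketched with the standard library alone. This is one possible approach under our own assumptions (a manifest of SHA-256 hashes recorded when the raw data first arrives); the helper names are hypothetical and the skill itself may prescribe something different.

```python
import hashlib
import random
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large raw files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_raw_data(manifest: dict, root: Path) -> list:
    """Return the relative paths whose current hash differs from the manifest.

    `manifest` maps relative path -> expected SHA-256 hex digest.
    A missing file raises FileNotFoundError rather than passing silently.
    """
    return [
        rel for rel, expected in manifest.items()
        if sha256_of(root / rel) != expected
    ]


def seeded_rng(seed: int) -> random.Random:
    """A dedicated generator, so results do not depend on global random state."""
    return random.Random(seed)
```

Calling `verify_raw_data` at the start of every run turns "the raw data is immutable" from a convention into a check, and passing a `seeded_rng` explicitly keeps any resampling or shuffling repeatable.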

## Why deliver this as skills

A skill is just a folder with a `SKILL.md` file: instructions plus optional supporting scripts, built on the [Agent Skills](https://agentskills.io) open standard. Because of progressive disclosure, the agent keeps a compact index of skill names and descriptions in context and only loads a full skill when a task matches it. You get a deep methodology without burning your context window on it.
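
To make progressive disclosure concrete, here is a minimal sketch of how a harness might build that compact index and load a skill body only on demand. It assumes each `SKILL.md` opens with a front matter block containing flat `name` and `description` lines, and it parses only that simple shape, not full YAML. The function names and the one-folder-per-skill layout are our assumptions; real harnesses implement this themselves.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict:
    """Parse flat `key: value` lines between the leading `---` fences."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of front matter; the skill body is not read into the index
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip('"')
    return meta


def build_index(skills_root: Path) -> list:
    """Compact index: one name and description per skill, no instructions."""
    index = []
    for skill_md in sorted(skills_root.glob("*/SKILL.md")):
        meta = read_frontmatter(skill_md)
        index.append({
            "name": meta.get("name", skill_md.parent.name),
            "description": meta.get("description", ""),
            "path": str(skill_md),
        })
    return index


def load_skill(index: list, name: str) -> str:
    """Read the full SKILL.md only when a task actually matches the skill."""
    for entry in index:
        if entry["name"] == name:
            return Path(entry["path"]).read_text(encoding="utf-8")
    raise KeyError(name)
```

Only the output of `build_index` needs to sit in context at all times; the full text behind `load_skill` is paid for only when it is used.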

The part that matters most here is that the skills **auto-trigger**. A session-start bootstrap loads the index the moment you fire up your agent, and the agent checks for a relevant skill before any task. You do not have to remember to say "pre-register this" or "verify before you claim." The methodology shows up on its own. These are mandatory workflows, not gentle suggestions. (For more on why we think skills are the right abstraction for scientific work, see [Agent Skills: The Final Piece for AI-Powered Scientific Research](/blog/agent-skills-final-piece-for-ai-powered-research).)

## Install it in your harness

Science Superpowers runs wherever your agent runs. Installation differs by harness, so if you use more than one, install it for each. Exact steps live in the [repository README](https://github.com/K-Dense-AI/science-superpowers).

- **Cursor**: install from the plugin marketplace, or point Cursor at the repo as a plugin. The `sessionStart` hook loads the bootstrap automatically.
- **Claude Code**: register a marketplace pointing at the repo and install the `science-superpowers` plugin. The `SessionStart` hook loads the bootstrap.
- **Codex**: use the committed Codex manifest.
- **Gemini CLI**: install as an extension.
- **OpenCode**: follow the install notes in the repo.
- **Google Antigravity**: it supports Agent Skills natively (the same `SKILL.md` format) and reads always-on rules at session start, so the skills and bootstrap load right in.

Once installed, you do not invoke anything special. Start describing what you want to investigate, and the agent steps you through framing, prior work, design, and pre-registration before it runs a single line on your data.

## How this fits with K-Dense

We build everything at K-Dense around one stance: AI should be a [co-scientist, not an autopilot](/blog/ai-co-scientist-not-ai-scientist). A co-scientist's output has to be inspectable and trustworthy, and trust is not something a model can assert. It has to be earned through process. Pre-registration, reproducible workspaces, root-cause investigation, and verify-before-claiming are how you earn it.

Science Superpowers is the open foundation for that process, packaged so you can drop it into your own agent today. We are open-sourcing it because methodology improves the way science itself does: one person writes down a workflow, someone else adapts it, and the standard rises for everyone.

---

The science is yours. The rigor should be the default.

**Get the repo:** [Science Superpowers on GitHub](https://github.com/K-Dense-AI/science-superpowers). Star it, install it in your agent, and open a PR for the skill you wish existed.

*Questions, or war stories from your own bench? Email contact@k-dense.ai.*

**Related reading:**
- [Agent Skills: The Final Piece for AI-Powered Scientific Research](/blog/agent-skills-final-piece-for-ai-powered-research)
- [AI Co-Scientist, not AI Scientist](/blog/ai-co-scientist-not-ai-scientist)
- [Agentic Data Scientist: An Open Source AI That Actually Does the Analysis](/blog/agentic-data-scientist-open-source)