OKR Orca

Diagnose your OKR.

Here is what a coach's read looks like.

Excerpt
Verdict: Refine (3 fixes recommended)
Scale: Rewrite, Reframe, Refine, Solid, Ship
O: Become the most trusted checkout in our category.
KR1: Increase NPS from 32 to 55 by end of Q3.
KR2: Launch the new payment page.
KR3: Reduce support tickets from 1,200/week to under 600/week by Q3.
Fix first: KR2 is an output. Reframe it as a behaviour change.
Already good: KR1 has a baseline and a target.

Diagnose your OKR in 60 seconds.

Paste your OKR. Get a structured critique in seconds: what's broken, why, and three rewrites you can use immediately. Your OKR text goes directly from your browser to OpenAI or Anthropic, not to this site. No account, no tracking, and your API key stays in the browser.

The rule engine runs locally before the LLM call: output-verb detection, a timebox regex, and a placeholder check. You get an instant pre-score, no key required.

Try an example

Or paste your own OKR


Key stored in browser only. One OKR set at a time.

~$0.005 to $0.01 per analysis with GPT-4o. ~$0.008 to $0.015 with Claude Sonnet 4.6. Cmd+Enter to submit.

Paste 12 to 24 OKRs, separated by blank lines

A blank line separates one OKR from the next. Up to 24 per batch.

Approximately one Claude Sonnet call per OKR (~$0.20 to $0.30 for a batch of 24). Calls run sequentially, so allow 30 to 60 seconds for a full batch.

How this works. Paste 12 to 24 OKRs separated by blank lines. Each OKR gets a quick Claude Sonnet score and a snapshot of anti-patterns. The list returns sorted lowest-score-first so the worst offenders surface at the top, with the bottom three expanded by default.
Sorted by score, lowest first. Three worst expanded.
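
The batch flow described above (split on blank lines, cap at 24, sort lowest-score-first, expand the three worst) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names `split_okrs` and `rank_batch` are hypothetical, and the scores are assumed to come from elsewhere (the per-OKR model call).

```python
import re

def split_okrs(text: str, limit: int = 24) -> list[str]:
    """Split pasted text into OKR blocks on blank lines, keeping at most `limit`."""
    blocks = [block.strip() for block in re.split(r"\n\s*\n", text.strip())]
    return [block for block in blocks if block][:limit]

def rank_batch(scored: list[tuple[str, int]], expand: int = 3) -> list[dict]:
    """Sort (okr, score) pairs lowest-score-first and mark the worst `expand` as expanded."""
    ordered = sorted(scored, key=lambda pair: pair[1])
    return [
        {"okr": okr, "score": score, "expanded": index < expand}
        for index, (okr, score) in enumerate(ordered)
    ]
```

Sorting ascending puts the weakest OKRs first, which is what lets the three worst be expanded by position alone.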
Where are you?
Are you starting from scratch, do you have a goal but no KRs, or do you have a draft you want to pressure-test?

~$0.02-0.04 per session. Conversation stays in your browser only.

Add your pillar context first, then we can shape an OKR that ladders up to it.

~$0.02-0.04 per session. Conversation stays in your browser only.

The rubric explained.

OKR Orca scores against a seven-criterion rubric. Three criteria apply to the Objective. Two criteria apply to each Key Result. Two apply to the set as a whole.

The core principle: "Who does what by how much." An outcome is a measurable change in behaviour. Outputs are not outcomes. Impact (revenue, profit) is too far removed. Key Results live at the outcome layer.

Objective criteria

O1 Clarity
Is the customer benefit explicit? 0 = No customer benefit stated. 1 = Vague, unclear who benefits. 2 = Clear customer and specific scope.
O2 Timebox
Is there a deadline? 0 = No deadline. 1 = Implicit ("this quarter"). 2 = Explicit date or quarter.
O3 Strategy
Is a solution prescribed? 0 = A feature or solution is named. 1 = Generic but solution-free. 2 = Problem-framed, no solution prescribed.
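
As an illustration of how the O2 Timebox criterion could be checked with a regex, here is a minimal sketch. The patterns and the function name `score_timebox` are assumptions for this example, not the tool's real rule set; it recognises only quarters (Q1 to Q4), ISO dates, and "month day" dates as explicit, and "this/next quarter" style phrases as implicit.

```python
import re

MONTHS = (
    "january|february|march|april|may|june|july|"
    "august|september|october|november|december"
)

# Explicit deadline: a quarter (Q3), an ISO date (2025-09-30), or "September 30".
EXPLICIT = re.compile(
    rf"\bQ[1-4]\b|\b\d{{4}}-\d{{2}}-\d{{2}}\b|\b(?:{MONTHS})\s+\d{{1,2}}\b",
    re.IGNORECASE,
)

# Implicit deadline: relative phrases such as "this quarter".
IMPLICIT = re.compile(r"\b(?:this|next)\s+(?:quarter|month|year)\b", re.IGNORECASE)

def score_timebox(objective: str) -> int:
    """Return 2 for an explicit date or quarter, 1 for an implicit one, 0 for none."""
    if EXPLICIT.search(objective):
        return 2
    if IMPLICIT.search(objective):
        return 1
    return 0
```

The explicit pattern is tested first so that an Objective containing both "this quarter" and "Q3" gets the higher score.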

Key Result criteria (per KR)

Outcome form
Output or outcome? 0 = Output keyword detected (launch, migrate, deliver, build, implement). 1 = Has a metric but vague actor or direction. 2 = Full "who does what from X to Y".
Measurability
Baseline and target present? 0 = Neither. 1 = One present, one missing. 2 = Both present with implied data source.
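
The two Key Result checks lend themselves to simple heuristics. The sketch below is an assumption about how such a rule might look, not the tool's implementation: `has_output_verb` flags the five output keywords listed above (including inflected forms such as "launched"), and `score_measurability` treats a "from X to Y" phrase as baseline plus target and a lone number as one of the two. Being keyword-based, it can misfire, for example on "Reduce build time", where "build" is a noun.

```python
import re

# The five output keywords named in the rubric, plus simple inflections.
OUTPUT_VERBS = re.compile(
    r"\b(?:launch|migrate|deliver|build|implement)\w*\b", re.IGNORECASE
)

NUMBER = r"\d[\d,.]*%?"

# "from 32 to 55" or "from 1,200/week to under 600/week".
FROM_TO = re.compile(
    rf"\bfrom\s+{NUMBER}\S*\s+to\s+(?:\w+\s+)?{NUMBER}", re.IGNORECASE
)

# Any standalone number; digits glued to letters (KR1, Q3) are ignored.
ANY_NUMBER = re.compile(rf"(?<![A-Za-z\d]){NUMBER}")

def has_output_verb(kr: str) -> bool:
    """True if the KR contains one of the output keywords."""
    return OUTPUT_VERBS.search(kr) is not None

def score_measurability(kr: str) -> int:
    """Return 2 if baseline and target are both present, 1 if only one number, else 0."""
    if FROM_TO.search(kr):
        return 2
    if ANY_NUMBER.search(kr):
        return 1
    return 0
```

Applied to the excerpt at the top of the page, KR2 ("Launch the new payment page") is flagged as an output with no numbers, while KR1 and KR3 both carry a baseline and a target.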

Set-level criteria

A1 Alignment
Is there a parent objective or strategy reference? 0 = No reference. 1 = Implied. 2 = Explicit parent link.
C1 Completeness
Placeholders present? 0 = Placeholders detected (X%, TBD, (owner), (tbc)). 1 = Minor gaps. 2 = Fully specified.
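
A placeholder check for C1 Completeness can be as small as one regex over the four strings named above. This is a hedged sketch: the pattern and the helper name `find_placeholders` are illustrative assumptions, and it only separates "placeholders detected" from "none found". It does not try to judge the "minor gaps" middle score.

```python
import re

# X%, TBD, (owner), (tbc), matched case-insensitively.
PLACEHOLDER = re.compile(
    r"\bX\s?%|\bTBD\b|\(\s*owner\s*\)|\(\s*tbc\s*\)", re.IGNORECASE
)

def find_placeholders(okr: str) -> list[str]:
    """Return every placeholder string found in the OKR text, in order of appearance."""
    return [match.group(0) for match in PLACEHOLDER.finditer(okr)]
```

An empty list means no known placeholder was found; any hit corresponds to a C1 score of 0.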

The "So What?" test

For every KR, ask: (1) If all KRs are green, is the Objective obviously achieved? (2) If this KR turns red, does it signal a real problem? (3) Does the team actually control this metric? Any "no" means the KR needs rewriting.

Common anti-patterns: Output-as-KR. Impact-as-KR (revenue targets). Vanity metrics (engagement without "who + by how much"). Placeholders. Binary milestones (100% migrated). Task lists disguised as KRs.

Frequently asked.

Is my OKR text sent to a server?

Not to an OKR Orca server. Your OKR text is sent directly from your browser to OpenAI or Anthropic. OKR Orca has no backend. The source is public and you can verify this yourself.

Where is my API key stored?

In your browser's localStorage only. It is never transmitted to okrorca.com; it is sent only to the provider you chose (OpenAI or Anthropic) to authenticate your requests. You can clear it at any time via the "change" link in the nav.

Which models are supported?

OpenAI: GPT-4o. Anthropic: Claude Sonnet 4.6. Each is the default for its provider. The rule-engine pre-score runs locally and is free regardless of key.

How much does an analysis cost?

With GPT-4o roughly $0.005 to $0.01 per analysis. With Claude Sonnet 4.6 roughly $0.008 to $0.015. The estimate shown on the input panel is indicative only.

Can I use it without an API key?

Yes. The rule-engine pre-score runs instantly in the browser with no key required. You get output-verb flags, timebox checks, placeholder detection, and a pre-score. LLM rewrite suggestions require a key.

What is the rule-engine pre-score?

A heuristic score (0-100) computed locally in under 100ms. It checks for output verbs in KRs, missing timeboxes in the Objective, placeholder strings, and the presence of baseline and target numbers. It is a fast signal, not a replacement for the full LLM analysis.

What is Coach mode?

Coach mode is the second tab in the input panel. Instead of pasting an existing OKR, you start a conversation. The tool asks one focused question at a time, surfaces the gap between what you want to achieve and what you have written, and helps you arrive at an OKR in your own words. When the draft is ready, you can send it straight to Diagnose for a rubric score.

Who built this.

I'm Frederik Metz, an agile coach in Munich. I've reviewed several hundred OKRs over the last few years, mostly from product teams trying to get the rubric right and getting tangled up in outputs disguised as outcome KRs.

OKR Orca is the diagnostic I wish my teams had earlier. The 7-criterion rubric is the one I use in coaching conversations. It's opinionated, but it's the opinion that consistently produces OKRs that move outcomes, not OKRs that produce planning theatre.

The tool is free. Your key stays in your browser. No signup, no tracking. If you find a bug or want a feature, the source is on GitHub.

View source on GitHub  ·  [email protected]
