Trakkr Docs

Metrics

:::summarybox remember
- Every number Trakkr shows you, grouped by what it actually measures
- The real formula behind each score - with worked examples where they're not obvious
- "What good looks like" with the caveat that context decides
:::


Visibility metrics

How much real estate your brand takes up in AI answers. These four are your headline numbers and show up across the Dashboard, Prompts, and Competitors pages.

Visibility Score {#visibility}

What it measures: How prominently AI models mention your brand across every tracked prompt and model, weighted by where you appear in a list-style answer.

How it's calculated: Three steps.

  1. For every appearance, Trakkr awards a position score. First position = 10 points, second = 9, down to tenth = 1. Position 11+ or not mentioned = 0.
  2. Sum the position scores, then divide by the maximum possible score across all prompt-model combinations (not just the ones where you appeared). That denominator is what keeps the score honest - it stops a brand that appears rarely but always at #1 from outscoring a brand that consistently lands in the top three.
  3. Apply a square-root scaling so consistent mid-range performance counts more than occasional spikes.

| Position | Points |
| --- | --- |
| 1st | 10 |
| 2nd | 9 |
| 3rd | 8 |
| 10th | 1 |
| 11th+ or not mentioned | 0 |

raw = (sum of position scores) / (prompts × models × 10) × 100

score = 100 × √(raw / 100)

Worked example. You're tracked on 10 prompts across 3 models - 30 opportunities. You appear at positions [1, 3, 5] across three of those runs and don't show up in the other 27.

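The position scores are 10 + 8 + 6 = 24 out of a possible 30 × 10 = 300, so raw = 8 and score = 100 × √0.08 ≈ 28. A Visibility Score of 28 at low coverage - and that's deliberately conservative.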
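
The three steps can be sketched in a few lines of Python. This is a minimal illustration of the formula above, not Trakkr's implementation: `position_points` and `visibility_score` are hypothetical names, and the input is assumed to be a flat list of list positions with unmentioned runs simply left out.

```python
import math

def position_points(position):
    """Points for one appearance: 1st = 10, 2nd = 9, ..., 10th = 1, otherwise 0."""
    if position is None or position < 1 or position > 10:
        return 0
    return 11 - position

def visibility_score(positions, prompts, models):
    """Visibility Score (0-100) from the positions of each appearance.

    Runs where the brand was not mentioned contribute 0 points, so they
    only need to be reflected in the denominator.
    """
    max_points = prompts * models * 10  # every prompt-model pair, not just hits
    raw = sum(position_points(p) for p in positions) / max_points * 100
    return 100 * math.sqrt(raw / 100)  # square-root scaling

# Worked example: 10 prompts x 3 models, appearances at positions 1, 3 and 5.
score = visibility_score([1, 3, 5], prompts=10, models=3)
print(round(score))  # 28
```

Because the denominator counts all 30 opportunities, three strong appearances still land in the high 20s rather than near the top of the scale.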

What's good: 40+ is solid. 60+ is excellent. 80+ means you dominate your tracked prompts (which usually means your tracked prompts are too easy - consider widening).

Where you see it: Dashboard hero, every Competitors row, top of the Brands list, Prompts page averages.


Presence Rate {#presence}

What it measures: The percentage of prompts where your brand appears at all - regardless of position.

How it's calculated: Prompts where you're mentioned ÷ total prompts × 100. No position weighting and no sqrt scaling.

What's good: 50%+ is solid coverage. 75%+ is excellent. 100% almost always means your tracking is too narrow - add a few aspirational prompts you don't yet win.

Why it's not the same as Visibility: Presence is breadth, Visibility is depth. A brand can hit 100% Presence but land at a Visibility Score of 50 if it's always mentioned last. Use both.
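
As a rough sketch of the Presence Rate formula, assuming you have a simple prompt-to-boolean mapping (the `presence_rate` name and the sample prompts are made up for illustration):

```python
def presence_rate(mentioned_by_prompt):
    """Percentage of tracked prompts where the brand appears at all.

    `mentioned_by_prompt` maps each tracked prompt to True or False.
    """
    if not mentioned_by_prompt:
        return 0.0
    hits = sum(1 for mentioned in mentioned_by_prompt.values() if mentioned)
    return hits / len(mentioned_by_prompt) * 100

tracked = {
    "best crm for startups": True,
    "crm with email automation": True,
    "cheapest crm": False,
    "crm alternatives": True,
}
print(presence_rate(tracked))  # 75.0
```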


Average Position {#position}

What it measures: When AI lists brands, where do you typically sit?

How it's calculated: Sum of all your positions across mentions ÷ total mentions. Prompts where you don't appear aren't counted - they push Presence down, not Average Position up.

What's good: 1-2 is excellent. 3-4 is good. 5+ means you're appearing but as an afterthought.

Watch out for: A great Average Position with low Presence Rate means you only show up in your easiest prompts. Narrow your view to the prompts where you're not yet appearing and the picture changes.
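
A minimal sketch of the Average Position calculation, assuming each run is recorded as a list position or `None` when you weren't mentioned (`average_position` is a hypothetical helper, not a Trakkr API):

```python
def average_position(positions):
    """Mean list position over mentions only; None (no mention) is skipped."""
    mentioned = [p for p in positions if p is not None]
    if not mentioned:
        return None  # no mentions, so there is nothing to average
    return sum(mentioned) / len(mentioned)

# Three mentions and three misses: the misses do not drag the average down.
runs = [1, 2, None, None, None, 3]
print(average_position(runs))  # 2.0
```

The same data gives an excellent Average Position of 2.0 while half the runs are misses, which is exactly the trap described above.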


Mentions {#mentions}

What it measures: Raw count of times your brand appeared across all prompts and models in the selected period.

How it's counted: One mention per (prompt × model × run). The same brand appearing twice in one answer counts once. Mention totals across the period are summed across daily runs.

Why it matters: Volume is the lever for catching outliers. A sudden spike usually traces back to a single new citation source. A slow decline often points to a model retrain or a competitor displacing you on a high-frequency prompt.
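
The counting rule can be sketched as a de-duplication over (prompt, model, run) tuples. The record shape and the `count_mentions` name are assumptions made for this example:

```python
def count_mentions(appearances):
    """Count one mention per distinct (prompt, model, run) combination.

    `appearances` holds one (prompt, model, run_date) tuple for every time
    the brand name occurs in an answer; repeats within one answer collapse.
    """
    return len(set(appearances))

log = [
    ("best crm", "chatgpt", "2024-05-01"),
    ("best crm", "chatgpt", "2024-05-01"),  # named twice in the same answer
    ("best crm", "claude", "2024-05-01"),
    ("best crm", "chatgpt", "2024-05-02"),  # the next daily run counts again
]
print(count_mentions(log))  # 3
```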


Demand metrics

How much traffic flows through the prompts you track. These help you prioritize which prompts deserve attention before you grind on visibility.

Demand Score {#demand-score}

What it measures: How likely people are to ask this question in AI chat, expressed as a 0-100 score.

How it's calculated: A weighted blend of three signals, with adjustments.

  1. Search demand - real clickstream data showing how often people search for related topics, log-scaled to 0-100.
  2. LLM affinity - how naturally the query fits AI conversation patterns. Comparisons and creative requests score higher than simple navigational facts.
  3. Specificity penalty - very long, narrow queries get points deducted. They usually indicate niche topics with lower overall demand.

The output is compressed into a 20-90 range so a low-demand prompt isn't pinned at 0 (every prompt has some interest) and an extreme outlier doesn't max out the chart.

| Score | Meaning |
| --- | --- |
| 70+ | High demand - valuable real estate |
| 40-69 | Medium demand - solid opportunity |
| <40 | Lower demand - niche or specialized |

How to use it: Sort prompts by Demand Score, then look at the ones where your Visibility is lowest. That's your highest-impact backlog.

Watch out for: Demand Score is directional, not surgical. A 72 vs a 68 is a coin flip - treat the score as a bucket, not a rank.
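
One way to build that backlog while treating demand as a bucket rather than a rank is sketched below. The thresholds come from the table above; the function names and the dictionary keys (`demand`, `visibility`) are assumptions for illustration.

```python
def demand_bucket(score):
    """Map a Demand Score to the High / Medium / Lower buckets."""
    if score >= 70:
        return "High"
    if score >= 40:
        return "Medium"
    return "Lower"

BUCKET_ORDER = {"High": 0, "Medium": 1, "Lower": 2}

def backlog(prompts):
    """Order prompts by demand bucket first, then by lowest Visibility Score."""
    return sorted(
        prompts,
        key=lambda p: (BUCKET_ORDER[demand_bucket(p["demand"])], p["visibility"]),
    )

prompts = [
    {"prompt": "a", "demand": 72, "visibility": 55},
    {"prompt": "b", "demand": 85, "visibility": 12},
    {"prompt": "c", "demand": 45, "visibility": 5},
    {"prompt": "d", "demand": 71, "visibility": 30},
]
print([p["prompt"] for p in backlog(prompts)])  # ['b', 'd', 'a', 'c']
```

Within the High bucket the order is driven by Visibility alone, so a 72 and an 85 are not ranked against each other on demand.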


AI Volume {#ai-volume}

What it measures: The estimated number of times people ask AI platforms about a given topic each month, across ChatGPT, Gemini, Claude, Perplexity, Copilot, and others.

Why it's an estimate: Unlike Google, AI platforms don't publish query volume data. Trakkr combines multiple data sources, then rounds conservatively - the real number is usually higher than what's shown, never lower.

How it's calculated: A three-tier waterfall, choosing the highest-confidence data available.

| Confidence | Label | How it works |
| --- | --- | --- |
| High | Measured | Direct panel data from AI search platforms, smoothed with a trailing 12-month average. Most accurate. |
| Medium | Calibrated estimate | Derived from Google search volume using learned ratios per query type - e.g. comparison queries have higher AI crossover than navigational ones. |
| Low | Projected estimate | Classified by topic type when no search data is available. Shown as a range rather than a specific number. |

Hover any volume number to see its confidence tier. High and medium estimates display a number with a ~ prefix; low estimates display a range.
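
The waterfall idea, take the highest-confidence source that exists, can be sketched like this. Everything here is illustrative: `display_volume`, its parameters, and the exact number formatting are assumptions, not Trakkr's actual output.

```python
def display_volume(measured=None, calibrated=None, projected_range=None):
    """Return (label, display string) from the best available data source.

    Falls through Measured -> Calibrated estimate -> Projected estimate.
    """
    if measured is not None:
        return "Measured", f"~{measured:,}"
    if calibrated is not None:
        return "Calibrated estimate", f"~{calibrated:,}"
    if projected_range is not None:
        low, high = projected_range
        return "Projected estimate", f"{low:,}-{high:,}"  # range, no ~ prefix
    return None  # no data at all

print(display_volume(calibrated=9000))               # ('Calibrated estimate', '~9,000')
print(display_volume(projected_range=(1000, 5000)))  # ('Projected estimate', '1,000-5,000')
```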

Platform breakdown: Total volume is split across platforms by current market share - ChatGPT ~72%, Gemini ~12%, Claude ~6%, Perplexity ~5%, Copilot ~3%, others ~2% (refreshed quarterly). The skew is then adjusted by query type - Perplexity over-indexes on research, Claude on technical, Copilot on productivity.
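
As a rough illustration of the first step only (the market-share split, before any query-type skew is applied), assuming the approximate shares quoted above and a hypothetical `split_volume` helper:

```python
# Approximate shares quoted above; they are refreshed quarterly, so treat as a snapshot.
PLATFORM_SHARE = {
    "ChatGPT": 0.72,
    "Gemini": 0.12,
    "Claude": 0.06,
    "Perplexity": 0.05,
    "Copilot": 0.03,
    "Others": 0.02,
}

def split_volume(total_monthly_volume):
    """Split an estimated monthly AI Volume across platforms by market share."""
    return {
        platform: round(total_monthly_volume * share)
        for platform, share in PLATFORM_SHARE.items()
    }

breakdown = split_volume(10_000)
print(breakdown["ChatGPT"], breakdown["Perplexity"])  # 7200 500
```

The query-type adjustment is not specified on this page, so it is deliberately left out of the sketch.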

Query Type: Every prompt is classified into one of seven types - comparison, recommendation, how_to, factual, navigational, creative, technical. This feeds the platform skew and the AI Overviews trigger likelihood.

How to use it: Pair AI Volume with Visibility to find your biggest opportunities. High volume + low visibility = high-impact prompts to improve next.


Competitive metrics

How you stack up against rivals you track. These live on the Competitors page and the head-to-head drill-downs.

Share of Voice {#share-of-voice}

What it measures: Your visibility expressed as a proportion of total visibility across all tracked competitors. The closest thing Trakkr has to a "category leadership" number.

Share of Voice on the Competitors page is actually two donuts side by side. Both matter:

| Donut | What it measures |
| --- | --- |
| Recommended First | Of all #1 mentions across your prompts, what share are you? Captures leadership. |
| Mention Share | Of all mentions at any position, what share are you? Captures total category footprint. |

A brand can dominate Mention Share but lose Recommended First if it's always in the answer but rarely at the top. The gap between the two is itself a useful diagnostic.

What's good: 35%+ Mention Share in a market with five tracked rivals usually means you're the de facto leader on the prompts you've chosen.
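
Both donuts can be derived from the same list of mentions. The sketch below assumes one (brand, position) pair per brand per answer; `share_of_voice` and the sample brands are hypothetical.

```python
from collections import Counter

def share_of_voice(mentions):
    """Return (recommended_first, mention_share), each a brand -> percent dict."""
    totals = Counter(brand for brand, _ in mentions)
    firsts = Counter(brand for brand, position in mentions if position == 1)

    def as_percentages(counts):
        denominator = sum(counts.values())
        if denominator == 0:
            return {brand: 0.0 for brand in totals}
        return {brand: counts[brand] / denominator * 100 for brand in totals}

    return as_percentages(firsts), as_percentages(totals)

mentions = [
    ("You", 2), ("Rival", 1),                # answer 1
    ("You", 3), ("Rival", 1), ("Other", 2),  # answer 2
    ("You", 1),                              # answer 3
]
recommended_first, mention_share = share_of_voice(mentions)
print(round(mention_share["You"]), round(recommended_first["You"]))  # 50 33
```

Here "You" holds half of all mentions but only a third of the #1 spots, which is the gap between the two donuts described above.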


Win Rate {#win-rate}

What it measures: How often you rank higher than a specific competitor when both of you appear in the same answer.

How it's calculated: Prompts where you outrank competitor X ÷ prompts where you both appear × 100.

What's good: 55%+ means you're winning. 70%+ means you dominate that competitor. Below 40% with high co-occurrence is the threat zone.

Watch out for: Win Rate ignores prompts where you don't appear at all. A 90% Win Rate against a competitor who only shows up on three prompts isn't the flex it sounds like - check Presence first.
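
A minimal sketch of the Win Rate calculation, assuming positions are stored per prompt and that a tie does not count as a win (the page does not say how ties are handled; `win_rate` is a hypothetical helper):

```python
def win_rate(your_positions, rival_positions):
    """Percentage of shared prompts where you rank above the rival.

    Both arguments map prompt -> list position (lower is better). A prompt
    missing from either dict means that brand did not appear there.
    """
    shared = your_positions.keys() & rival_positions.keys()
    if not shared:
        return None  # the brands never co-occur, so Win Rate is undefined
    wins = sum(1 for p in shared if your_positions[p] < rival_positions[p])
    return wins / len(shared) * 100

you = {"a": 1, "b": 3, "c": 2, "d": 4}
rival = {"a": 2, "b": 1, "c": 5, "e": 1}
print(round(win_rate(you, rival), 1))  # 66.7
```

Prompts "d" and "e" drop out of the calculation entirely, which is why the result says nothing about Presence.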


Threat Tier {#threats}

What it measures: Trakkr's classification of a competitor's pressure on your brand, computed from visibility gap and co-occurrence.

| Tier | Roughly means |
| --- | --- |
| High | Visibility gap >20 points against you, or Win Rate <30% on 10+ shared prompts |
| Medium | Gap >10 points, or Win Rate <40% with regular co-occurrence |
| Low | Anyone who outranks you anywhere - worth watching |

Where you see it: The Threats filter on the Competitors page and the per-competitor row.
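
The tier rules read naturally as a cascade of checks. The sketch below follows the rough thresholds in the table; the `threat_tier` function and the `regular_co_occurrence` cut-off of 5 shared prompts are assumptions, since the page does not define "regular".

```python
def threat_tier(gap, win_rate, shared_prompts, outranks_you_anywhere,
                regular_co_occurrence=5):
    """Rough threat tier for one competitor.

    gap: your Visibility Score minus theirs (negative = they lead).
    win_rate: your Win Rate against them in percent, or None if never shared.
    regular_co_occurrence: hypothetical cut-off for "regular" co-occurrence.
    """
    deficit = -gap  # points the competitor is ahead by
    has_win_rate = win_rate is not None
    if deficit > 20 or (has_win_rate and win_rate < 30 and shared_prompts >= 10):
        return "High"
    if deficit > 10 or (has_win_rate and win_rate < 40
                        and shared_prompts >= regular_co_occurrence):
        return "Medium"
    if outranks_you_anywhere:
        return "Low"
    return None  # not a threat under these rules

print(threat_tier(gap=-25, win_rate=50, shared_prompts=4, outranks_you_anywhere=True))  # High
print(threat_tier(gap=-5, win_rate=35, shared_prompts=8, outranks_you_anywhere=True))   # Medium
```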


Competitive Gap {#competitive-gap}

The percentage-point difference between your Visibility Score and a competitor's. Positive = you're ahead. Negative = they are. Coloured green or red on every comparison row. It's a presentation of the underlying scores, not a separate metric.

Head-to-Head {#head-to-head}

The drill-down view that opens when you click a competitor row - wins, losses, and ties across every prompt and model where you both appear, plus a per-model breakdown. It's a view, not a number.


Citation metrics

How AI sources its answers about you. These power the Citations page and the citation widget on the Dashboard.

Citations {#citations-count}

What it measures: Unique URLs that AI models reference when discussing your brand in the selected period. De-duplicated across runs - the same URL cited five days in a row counts as one citation, not five.

Why it matters: More citations from authoritative sources = stronger AI presence. The list itself is your improvement roadmap.
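
The de-duplication rule amounts to counting a set of URLs across the whole period. A small sketch, assuming citations are grouped by daily run (`count_citations` and the example URLs are placeholders):

```python
def count_citations(cited_urls_by_run):
    """Count unique cited URLs across every run in the selected period."""
    unique_urls = set()
    for urls in cited_urls_by_run.values():
        unique_urls.update(urls)
    return len(unique_urls)

runs = {
    "2024-05-01": ["https://example.com/review", "https://example.org/roundup"],
    "2024-05-02": ["https://example.com/review"],
    "2024-05-03": ["https://example.com/review", "https://example.net/guide"],
}
print(count_citations(runs))  # 3
```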


Citation Quality Score {#citation-quality}

What it measures: Average authority of the sources citing your brand, on a 0-100 scale.

How it's calculated: A weighted average of Domain Authority across every citing URL, with extra weight given to sources cited multiple times. A single citation on TechCrunch lifts the score more than ten citations on small blogs.


Domain Authority {#domain-authority}

What it measures: How authoritative a citing website is. Forbes outranks a random blog.

Where it comes from: A blend of inbound link profile, traffic estimates, and domain age - borrowed from established SEO metrics and refined with AI-specific signals (whether the domain is a preferred source for live-retrieval models, for example).


Source Type {#source-type}

What it measures: How Trakkr classifies a citing domain. Different types deserve different responses.

| Type | Examples | Leverage |
| --- | --- | --- |
| Earned media | TechCrunch, NYT, industry pubs | Highest - chase these |
| Institution | .edu, .gov, IEEE, ISO | Very high - hard to land but durable |
| Review | G2, Capterra, TrustRadius | High for SaaS |
| Owned | Your own domain | Moderate - models down-weight self-references |
| Social | Reddit, LinkedIn, X | Variable - Reddit is unusually weighted by AI |
| PR wire | PRNewswire, Business Wire | Low - models discount these |
| Competition | Competitor blog comparisons | Variable - good signal of category presence |
| Other | Anything uncategorised | Investigate before acting |

Citation Intent {#citation-intent}

What it measures: The buyer intent behind queries that triggered each citation. Shown as a coverage bar on the Citations page so you can see whether your citations span the funnel or cluster in one stage.

| Intent | What it captures |
| --- | --- |
| Comparison | "X vs Y" queries |
| Alternative | "Alternatives to X" queries |
| Best For | "Best X for use case Y" queries |
| Discovery | "What's a tool that does X" |
| Recommendation | "Recommend me a tool for X" |

Watch out for: Heavy coverage on Discovery but nothing on Comparison means buyers know you exist but you're not in the final shortlist. That's a different fix than the reverse.


Reputation & Sentiment

How AI - and the sources AI reads - talk about you. The Citation Sentiment, Perception, and Reddit features all measure different facets of this.

Citation Sentiment {#citation-sentiment}

What it measures: Whether each citing page discusses your brand positively, neutrally, or negatively.

How it's calculated: Each citing page's content is passed through a sentiment classifier with brand context - so "Notion is the leader" reads positive, "Notion has bugs in version 2" reads negative, and "Notion costs $10/month" reads neutral.

Where you see it: A green/grey/red bar on every source card on the Citations page, and as a filter in the Citation Feed.


Reddit Citation Score {#reddit-citation-score}

What it measures: How likely AI models are to learn from a given Reddit thread, on a 0-100 scale. Drives the Citation Band filter (High / Mid / Low) and the Opportunity ranking on the Reddit page.

How it's calculated: A blend of subreddit authority, thread engagement, recency, and whether the discussion centres on your category. High-citation threads are the ones worth contributing to authentically.


Reddit Draft Quality Score {#reddit-draft-grade}

What it measures: The quality of an AI-drafted Reddit reply, on a 0-10 scale, with four sub-scores and a disclosure flag.

| Sub-score | What it grades |
| --- | --- |
| Helpfulness | Does it actually answer the question? |
| Specificity | Does it reference the thread, not just talk past it? |
| Tone | Does it sound like a human contributor, not a brand? |
| Non-spammy | Does it avoid pitch language and self-promotion? |
| Disclosure | Boolean - is the brand affiliation transparent? |

What's good: 7.5+ overall with all sub-scores above 6. Anything below 6 reads as marketing and Reddit will downvote it.


Perception metrics

How AI describes your brand qualitatively. These live on the Perception page.

Overall Perception {#perception-score}

What it measures: How positively AI describes your brand across 20 attributes in 5 categories, summarised as a single 0-100 score.

What's good: 75+ is excellent. 60-74 is good. Below 60 needs work - and the per-category breakdown will tell you where.

| Category | Attributes |
| --- | --- |
| Trust & Reliability | Overall trust · Reliability · Transparency · Safety perception |
| Quality & Performance | Overall quality · Problem resolution · Responsiveness · User satisfaction |
| Value & Experience | Value for money · Ease of interaction · Accessibility · Necessity |
| Market Position | Brand recognition · Professional image · Recommendation likelihood · Uniqueness |
| Innovation & Appeal | Forward thinking · Adaptability · Likability · Confidence-inspiring |

Each category is the average of its four attributes. The Perception page shows the full 20-attribute grid plus a per-model breakdown.
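
The category roll-up can be sketched as a plain average over each group of four attributes. The attribute names come from the table above; `category_scores` and the sample values are illustrative, and how the five categories combine into the overall score is not specified here, so the sketch stops at the category level.

```python
from statistics import mean

CATEGORIES = {
    "Trust & Reliability": ["Overall trust", "Reliability", "Transparency", "Safety perception"],
    "Quality & Performance": ["Overall quality", "Problem resolution", "Responsiveness", "User satisfaction"],
    "Value & Experience": ["Value for money", "Ease of interaction", "Accessibility", "Necessity"],
    "Market Position": ["Brand recognition", "Professional image", "Recommendation likelihood", "Uniqueness"],
    "Innovation & Appeal": ["Forward thinking", "Adaptability", "Likability", "Confidence-inspiring"],
}

def category_scores(attribute_scores):
    """Average each category's four attribute scores (each on a 0-100 scale)."""
    return {
        category: mean(attribute_scores[name] for name in attributes)
        for category, attributes in CATEGORIES.items()
    }

# Made-up scores: every attribute at 70 except a weak Transparency reading.
scores = {name: 70 for attributes in CATEGORIES.values() for name in attributes}
scores["Transparency"] = 50
print(category_scores(scores)["Trust & Reliability"])  # 65
```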

Watch out for: Perception scores are noisier than visibility scores because they're sentiment classifications on a smaller corpus of mentions. Treat anything with fewer than 20 mentions in the period as directional, not diagnostic.


Crawler & AI Traffic metrics

What AI bots do on your site, and what humans do after they leave AI. These power the Crawlers page and the Visitors page.

Total Visits {#crawler-total}

What it measures: All page requests from tracked AI crawlers in the selected period. The hero number on the Crawlers dashboard.


Conversations {#crawler-conversations}

What it measures: Live AI fetches happening during a real chat - ChatGPT-User, Perplexity-User, Claude-User, MistralAI-User, Meta-ExternalAgent.

Why it matters: This is the strongest leading indicator that a citation is about to land. A spike on a page usually means an answer was generated that referenced it.


Indexing {#crawler-indexing}

What it measures: Search bots pre-fetching content for an AI search index - OAI-SearchBot, PerplexityBot, Claude-SearchBot, Applebot.

Why it matters: Indexing hits precede citations on retrieval-heavy models like Perplexity and ChatGPT Search. Heavy indexing of a page is often a 1-7 day leading indicator of new citations.


Training {#crawler-training}

What it measures: Bulk crawlers gathering content for a future model training cycle - GPTBot, ClaudeBot, CCBot, Amazonbot, Bytespider, DeepSeekBot.

Why it matters: Training effects show up in 6-18 months, not days. Track these for long-term trajectory, not weekly action.


Agent (emerging) {#crawler-agent}

What it measures: Autonomous agent bots that act on behalf of a user - currently Google-Agent and emerging equivalents.

Why it matters: Volume is currently small enough that the dashboard tracks it as a platform filter rather than a top-level category card alongside Training / Indexing / Conversations. The metric is still the first read on whether AI agents are starting to use your site to complete tasks - when the population grows past a single bot, it gets promoted.
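
If you want to reproduce the four crawler categories from your own server logs, a simple token match on the User-Agent header is one possible starting point. The bot lists below are copied from this page; `classify_crawler` and the sample User-Agent strings are simplified illustrations, and real headers are longer.

```python
CRAWLER_CATEGORIES = {
    "Conversations": ["ChatGPT-User", "Perplexity-User", "Claude-User",
                      "MistralAI-User", "Meta-ExternalAgent"],
    "Indexing": ["OAI-SearchBot", "PerplexityBot", "Claude-SearchBot", "Applebot"],
    "Training": ["GPTBot", "ClaudeBot", "CCBot", "Amazonbot", "Bytespider",
                 "DeepSeekBot"],
    "Agent": ["Google-Agent"],
}

def classify_crawler(user_agent):
    """Return the category whose bot token appears in the User-Agent, else None."""
    ua = user_agent.lower()
    for category, tokens in CRAWLER_CATEGORIES.items():
        if any(token.lower() in ua for token in tokens):
            return category
    return None  # an ordinary browser or an untracked bot

print(classify_crawler("Mozilla/5.0 (compatible; GPTBot/1.2)"))        # Training
print(classify_crawler("Mozilla/5.0 (compatible; ChatGPT-User/1.0)"))  # Conversations
```

User-Agent strings can be spoofed, so treat a match as a hint rather than proof of origin.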


AI Visitors {#ai-visitors}

What it measures: Real humans landing on your site from an AI referrer (ChatGPT, Perplexity, Claude, Gemini, Copilot). Powered by your Google Analytics 4 connection.

Why it matters: Crawler hits prove the bot saw you. AI Visitors prove the citation converted into traffic.


Citation Correlation {#citation-correlation}

What it measures: The relationship between crawler hits on a page and citations to it - shown as a chart on the Crawlers page that overlays bot traffic against citation appearances.

How to use it: Pages with high indexing but no citations are usually the closest "near-miss" - they're being read but not chosen. Often a content or schema fix away from landing.


Opportunity & Outreach metrics

How Trakkr scores and ranks the citation gaps worth chasing. These power Outreach.

Fit Score {#fit-score}

What it measures: How well a citation source matches your brand and prompts, on a 0-100 scale.

How it's calculated: A blend of source-prompt relevance, source-type weight, whether competitors are already cited there, and alignment with your positioning.

What's good: 70+ is a strong fit. Below 50 is usually noise.


Difficulty {#difficulty}

What it measures: How hard this kind of source typically is to land coverage on - Low, Medium, or High.

| Difficulty | Typical sources |
| --- | --- |
| Low | Roundup posts, smaller blogs, niche reviews |
| Medium | Mid-tier publications, established review sites |
| High | Long-form editorial, institutional sources, top-tier news |

Priority {#priority}

What it measures: Trakkr's blended ranking of an opportunity - Fit, Difficulty, competitor pressure on the domain, and recency signals. The default sort in the Outreach queue.

When to override: Switch to sorting by Fit when you want pure leverage and don't care about momentum signals.


Quadrants {#quadrants}

Cross Fit and Difficulty and you get four buckets. The names are the strategy:

| Difficulty / Fit | High fit | Low fit |
| --- | --- | --- |
| Easy | Quick Wins - start here | Low priority - skip unless idle |
| Hard | Worth It - the big landings | Skip - low payoff, high effort |
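
The four buckets can be expressed as a small lookup. This sketch makes two assumptions the page does not state: a Fit Score of 70+ (the "strong fit" benchmark) counts as high fit, and only Low difficulty counts as Easy, with Medium and High both treated as Hard. `quadrant` is a hypothetical helper.

```python
def quadrant(fit_score, difficulty, fit_threshold=70):
    """Place an opportunity in one of the four Fit x Difficulty buckets.

    fit_threshold and the Low -> Easy mapping are assumptions for this sketch.
    """
    high_fit = fit_score >= fit_threshold
    easy = difficulty == "Low"
    if easy:
        return "Quick Wins" if high_fit else "Low priority"
    return "Worth It" if high_fit else "Skip"

print(quadrant(82, "Low"))   # Quick Wins
print(quadrant(82, "High"))  # Worth It
```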

Trend & comparison windows

How your numbers are moving over time. Every score on the Dashboard, Prompts, Competitors, and Citations pages can be compared across four windows.

| Window | Reads as | Best for |
| --- | --- | --- |
| 7-day | Short-term momentum | Catching outliers and live-retrieval movement |
| 14-day | Smoothed short-term | Filtering out single-day noise |
| 30-day | Medium-term trajectory | Real signal - this is the one to act on |
| 90-day | Long-term direction | Quarterly reviews, training-model effects |

Reading direction: Green = improving. Red = declining. Grey = stable.

Reading magnitude:

Why model retrains show up here: A 30-day decline that hits across most prompts and models simultaneously almost always traces to either a new competitor landing a major citation or a model retraining with different training data. Single-model declines are usually fixable; cross-model declines need a content or citation response.


Site & content quality

Two scores that don't fit the visibility framework but shape it from upstream.

Audit Score {#audit-score}

Your site's AI-readiness rating from Optimize, on a 0-100 scale. Measures how easily AI crawlers can extract, parse, and cite your content. A higher Audit Score generally translates to more Indexing crawler hits and faster citation pickup. See the Optimize docs for the full checklist.

Narrative Score {#narrative-score}

For any Narrative you track, a 0-100 score per model showing how strongly the model associates your brand with the topic. Different from Visibility - this is what AI says about a specific theme, not how often you appear overall.


Three reminders before you act on a number {#reminders}

There's no universal "good" score. Every benchmark on this page is a rule of thumb. Your market, prompt mix, and competitor set all change the meaning. The most useful comparison is to your own number from last week.

Model performance varies a lot. Scoring 80 on Claude and 30 on Perplexity is normal - they have different training data, different cutoffs, different retrieval approaches. The Dashboard's per-model breakdown is where most diagnoses start.

Presence and Position answer different questions. Presence asks: were you mentioned at all? Position asks: where in the list? Both matter, and you usually need to fix Presence first.