Methodology

How we gather, process, and score Beyblade data

Bladers Hub does not claim to be the holy grail of Beyblade X data. Our data is proprietary and manually reviewed, and any externally submitted data must be verified by our team before it is admitted to our database. It is therefore important to understand that we cannot claim our data is the ultimate truth: other local Beyblade groups may arrive at results that differ considerably from ours.

What we do want to ensure, however, is that our data is as useful to you as possible. To that end, we have implemented statistical tools that compensate for vastly different sample sizes and produce meaningful performance scores. Below we explain exactly how our scoring works and what those numbers mean.

01

Our Data Tracking Methodology

Before we dive into how we analyze data, it's crucial to understand how we collect and verify it. At Bladers Hub, data quality and authenticity are our top priorities.

Beyblade X

Currently, we are focused on Beyblade X only. Other Beyblade series could be added in the future, but this will depend on the further development of the franchise.

Uncut Video Verification

Only video material showing uncut matches is processed. This ensures transparency and prevents cherry-picking of favorable results. Every battle we track must be verifiable from start to finish.

Two-Player Competitive Matches

All tracked battles must be 2-player matchups where both participants are actively competing. This reduces bias and ensures performance data reflects real-world competitive scenarios.

Official Equipment Requirements

Stadium

Original Takara Tomy Beyblade X Stadium only

Beyblades & Parts

Official Hasbro and Takara Tomy Beyblade X products

These standards ensure consistency across all tracked battles and allow for meaningful performance comparisons.

Why This Matters: By maintaining strict video verification standards and equipment requirements, we ensure that every data point in our system represents genuine competitive performance. This commitment to authenticity is what makes our statistical analysis meaningful and reliable.

02

Scoring Methodology: PartScores & BeyScores

Understanding what those scores on your dashboard actually mean - from PartScores (individual blades, ratchets, bits) to BeyScores (complete combinations).

PartScore: Rating Individual Parts (0.0 - 10.0)

Every individual part (blade, ratchet, bit, or assist blade) that has competed in at least 10 battles gets a PartScore. This score reflects how well that part performs competitively, combining multiple factors into a single number from 0.0 to 10.0.

Quick Summary: A PartScore around 5.0 is average; the score tiers at the end of this section explain how to read the higher ranges. Parts with fewer than 10 battles don't get scored yet - we need enough data to make a fair assessment!

How We Calculate PartScores (The 4 Components)

1
Base Performance
0-10 points

What it measures: Core combat effectiveness

Formula:

(Winrate × 6) + (Avg Points ÷ 3 × 4)

Breaking it down:

  • Winrate × 6: Your winrate (0.0 to 1.0) multiplied by 6
    → If you win 100% of battles, you get 6 points from winrate
  • Avg Points ÷ 3: Your average points per battle divided by 3 (max possible)
    → This normalizes your points to a 0-1 scale (3 is Xtreme finish)
  • × 4: That normalized value multiplied by 4
    → If you average 3 points per battle, you get 4 points from this

Why split 6 + 4? Winrate is more important (60% of base score) than how you win (40%), but Xtreme finishes still matter!
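To make the arithmetic concrete, here is a minimal Python sketch of the base-performance term as described above; the function and argument names are our own illustration, not code from our pipeline.

```python
def base_performance(winrate: float, avg_points: float) -> float:
    """Base performance: winrate is worth up to 6 points, and average points
    per battle (normalized by the 3-point Xtreme finish) up to 4 points."""
    return (winrate * 6) + (avg_points / 3) * 4

print(round(base_performance(1.0, 3.0), 2))   # 10.0 - a perfect record of Xtreme finishes
print(round(base_performance(0.68, 1.9), 2))  # 6.61 - the WizardRod example further below
```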

2
Sample Confidence
×0.4 to ×1.0

What it measures: Data reliability

Formula:

min(1.0, 0.4 + (Battles ÷ 150 × 0.6))

  • Starts at 0.4 (40% confidence) with few battles
  • Gains 0.6 over 150 battles (reaching 1.0 at 150 battles)
  • Caps at 1.0 (full confidence) at 150+ battles

Example: With 75 battles → 0.4 + (75/150 × 0.6) = 0.7 multiplier

Why so punishing? This ensures well-tested parts with 100+ battles score higher than flashy but under-tested combos. The 7-10 score range is reserved for parts/combos with substantial battle data.
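As a rough sketch of how the multiplier behaves (illustrative naming, same assumptions as the formula above):

```python
def sample_confidence(battles: int) -> float:
    """Confidence multiplier: starts at 0.4 and grows linearly, capping at 1.0 from 150 battles."""
    return min(1.0, 0.4 + (battles / 150) * 0.6)

print(round(sample_confidence(10), 2))   # 0.44 - barely above the floor
print(round(sample_confidence(75), 2))   # 0.7, matching the example above
print(round(sample_confidence(200), 2))  # 1.0 - capped at full confidence
```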

3
Matchup Diversity
0-2.5 points

What it measures: Versatility against different opponents

Formula:

(Unique Opponents Faced ÷ Total Opponents in DB) × 2.5

  • More unique opponents = higher bonus
  • Maximum 2.5 points if faced ALL opponents
  • Rewards parts tested in varied matchups

Example: Faced 30 out of 60 opponents → (30/60) × 2.5 = 1.25 points
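The diversity bonus is a straightforward proportion; a minimal sketch with illustrative names:

```python
def diversity_bonus(unique_opponents_faced: int, total_opponents_in_db: int) -> float:
    """Share of the opponent pool this part has faced, scaled to a 2.5-point maximum."""
    return (unique_opponents_faced / total_opponents_in_db) * 2.5

print(diversity_bonus(30, 60))  # 1.25, matching the example above
```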

4
Consistency Bonus
0-2.5 points

What it measures: Performance stability across matchups

Formula:

(1 - Standard Deviation of Winrates) × 2.5

  • We calculate your winrate against each unique opponent
  • Then measure how much those winrates vary (standard deviation)
  • Low variance = consistent = high bonus
  • Maximum 2.5 points for zero variance (perfect consistency)

Example: Your winrates vary by 0.18 → (1 - 0.18) × 2.5 = 2.05 points
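A small sketch of this component. The per-opponent winrates below are made-up values, and we assume the population standard deviation; the text above does not specify which variant is used.

```python
from statistics import pstdev

def consistency_bonus(per_opponent_winrates: list[float]) -> float:
    """Low spread across per-opponent winrates earns up to 2.5 bonus points."""
    return (1 - pstdev(per_opponent_winrates)) * 2.5

# Hypothetical winrates against four different opponents:
print(round(consistency_bonus([0.70, 0.60, 0.75, 0.65]), 2))  # 2.36 - a tight spread earns most of the bonus
```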

The Final Formula

PartScore = (Base Performance × Confidence) + Diversity Bonus + Consistency Bonus

Result is capped at 10.0 and rounded to one decimal place
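Putting the four components together, here is a minimal, self-contained Python sketch of the full calculation as described above. All names are illustrative, and the standard-deviation variant is an assumption; the actual pipeline may differ in such details.

```python
from statistics import pstdev

def part_score(winrate: float, avg_points: float, battles: int,
               unique_opponents: int, total_opponents: int,
               per_opponent_winrates: list[float]) -> float:
    """PartScore = (base performance x confidence) + diversity + consistency,
    capped at 10.0 and rounded to one decimal place."""
    base = (winrate * 6) + (avg_points / 3) * 4              # 0-10 points
    confidence = min(1.0, 0.4 + (battles / 150) * 0.6)       # x0.4 to x1.0
    diversity = (unique_opponents / total_opponents) * 2.5   # 0-2.5 points
    consistency = (1 - pstdev(per_opponent_winrates)) * 2.5  # 0-2.5 points
    return round(min(10.0, base * confidence + diversity + consistency), 1)
```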

Real Example: Scoring WizardRod Blade

Step 1: Calculate Base Performance (0-10 points)

Data from battles: WizardRod has 68% winrate (0.68) and 1.9 avg points across 85 battles

Winrate component: 0.68 × 6 = 4.08
The 6 is winrate's max contribution to base score

Points component: (1.9 ÷ 3) × 4 = 0.633 × 4 = 2.53
The 3 is max points per battle (Xtreme finish)
The 4 is points' max contribution to base score

→ 4.08 + 2.53 = 6.61 base points

Step 2: Apply Sample Confidence Multiplier

Battle count: 85 battles

Confidence formula: 0.4 + (85 ÷ 150) × 0.6 = 0.74
Start at 0.4, gain 0.6 over 150 battles

→ 6.61 × 0.74 = 4.89 points after confidence

Step 3: Add Matchup Diversity Bonus

Opponent variety: Faced 42 out of 60 total unique opponents in database

Diversity formula: (42 ÷ 60) × 2.5 = 1.75
The 2.5 is the maximum diversity bonus

+1.75 bonus points

Step 4: Add Consistency Bonus

Performance variance: Standard deviation of winrates across opponents = 0.18

Consistency formula: (1 - 0.18) × 2.5 = 2.05
The 2.5 is the maximum consistency bonus

+2.05 bonus points

Final Calculation:
4.89 (base × confidence) + 1.75 (diversity) + 2.05 (consistency)
= 8.7 PartScore

(Capped at 10.0 and rounded to one decimal)
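The same arithmetic as a short, self-contained Python check (the 0.18 standard deviation is taken as given, exactly as in the steps above):

```python
base = (0.68 * 6) + (1.9 / 3) * 4        # 6.61 base points
confidence = 0.4 + (85 / 150) * 0.6      # 0.74 multiplier
diversity = (42 / 60) * 2.5              # 1.75 bonus points
consistency = (1 - 0.18) * 2.5           # 2.05 bonus points

print(round(min(10.0, base * confidence + diversity + consistency), 1))  # 8.7
```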

BeyScore: Rating Complete Combinations (0.0 - 10.0)

When you build a complete Beyblade (blade + ratchet + bit, plus assist blade for CX blades), we calculate a BeyScore. This tells you how strong that specific combination is expected to be.

Method 1: Tested Combos (10+ Battles)

If this exact combo has battled 10+ times:

We calculate a Score FOR THE COMBO ITSELF (treating the entire combo as a single "part"). This means we apply the same 4-component formula to the combo's battle data:

  • Base Performance: The combo's winrate and average points
  • Sample Confidence: Based on the combo's battle count
  • Matchup Diversity: How many different opponents this combo faced
  • Consistency Bonus: How stable the combo's performance is

Why not use individual parts? Real battle data reveals synergies! A combo might perform better or worse than its individual parts suggest. By calculating from the combo's actual battles, we capture these interactions automatically.

Example: WizardRod 3-60 Ball has 120 battles. We calculate its PartScore using those 120 battles (same formula as individual parts) → BeyScore = 8.9

Method 2: Untested Combos (Less than 10 Battles)

If the combo hasn't been tested enough:

We combine the PartScores of each component using weighted averages:

BX / UX Blades
  • Blade: 40%
  • Ratchet: 30%
  • Bit: 30%
CX Blades (with Assist)
  • Blade: 35%
  • Ratchet: 25%
  • Bit: 25%
  • Assist Blade: 15%

Untested Penalty: We apply a 5% penalty (×0.95) because untested combos might not perform as well as their parts suggest due to unknown synergies.

Example Calculation (BX Blade):

  • PhoenixWing PartScore: 7.8
  • 9-60 PartScore: 6.5
  • GearNeedle PartScore: 8.2

→ (7.8 × 0.40) + (6.5 × 0.30) + (8.2 × 0.30) = 7.53
→ Apply untested penalty: 7.53 × 0.95 = 7.2 BeyScore
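As a minimal sketch of the untested-combo estimate, assuming the BX/UX weights and the 5% penalty listed above (the function name is our own illustration):

```python
def untested_beyscore_bx(blade: float, ratchet: float, bit: float) -> float:
    """Weighted average of PartScores for a BX/UX combo, with the 5% untested penalty.
    CX combos would use the 35/25/25/15 weights plus the assist blade's PartScore."""
    weighted = blade * 0.40 + ratchet * 0.30 + bit * 0.30
    return round(weighted * 0.95, 1)

print(untested_beyscore_bx(7.8, 6.5, 8.2))  # 7.2, matching the example above
```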

Important: BeyScores are only calculated if all required parts have PartScores (meaning each has 10+ battles). If even one part lacks data, the combo won't have a BeyScore yet.

What Do These Scores Actually Mean?

Important Context: Our scoring system is designed to be conservative with high scores. Currently, the best proven competitive combos with full battle data score around 6.0-6.5, and reaching 7.0+ is exceptionally difficult. The 8.0-10.0 range is reserved for truly dominant performance backed by extensive data.

Scores grow over time: As combos accumulate more battle data, their confidence multiplier increases, allowing scores to naturally migrate upward. A combo with a great winrate but limited battles might score 5.5 initially, then reach 6.5+ as more data confirms its strength. This prevents volatile scores driven by small sample sizes.

6.5+ 🔥 Elite

Proven competitive dominance with extensive battle data. These are tournament-level parts/combos that have consistently demonstrated exceptional performance across many matchups.

5.5 - 6.4 ⭐ Strong

Solid competitive choices with good performance data. Reliable options that hold their own in most matchups. Many excellent combos sit in this range while building more battle data.

4.5 - 5.4 ✨ Viable

Respectable performance with room to grow. These parts/combos work well in certain matchups or may be building toward higher scores as more data comes in.

Below 4.5 ⚠️ Developing

Still gathering data or showing struggles in competitive settings. May work in niche scenarios, need more battles to establish true performance, or require optimization.

03

Statistical Smoothing: Dealing with Small Sample Sizes

Our scores are great for ranking parts, but what about finding the "best pairings"? When looking at specific part combinations (like "best blade for this ratchet"), small sample sizes can be misleading. That's where Empirical Bayes smoothing comes in.

The Problem: Small Data Samples Can Be Misleading

Imagine you're looking at the "best blade pairing" for a bit like "Ball":

PhoenixWing

100%

Paired with Ball has battled 6 times and won all 6

Sounds amazing!

WizardRod

66%

Paired with Ball has battled 83 times and won 55

Solid - and, for anyone who follows competitive play, a genuinely strong result. Next to PhoenixWing's 100%, though, it looks weak.

Without adjustment, PhoenixWing looks "better" because of the 100% winrate. But with only 6 battles, that could be luck—like flipping heads 6 times in a row. WizardRod's 66% is based on way more evidence, so it's more trustworthy. Small samples often show extreme results (high or low) just by chance, while big samples give a truer picture.

This happens across our dashboard for "best" blade/ratchet/bit pairings. Raw data favors rare, lucky combos over proven ones. To fix this, we use a smart technique called Empirical Bayes smoothing.

What is Empirical Bayes Smoothing?

Think of it like blending a pairing's actual battle record with an "average" from all similar battles. It's a way to "calm down" extreme results from small data, making them more realistic without ignoring them.

The "Average" (Prior)

We look at the overall winrate or average points across all similar parts in our database. For example, if most blades, bits or ratchets have about 50% winrate (which makes sense, as wins and losses balance out across battles), that's our starting point.

Blending Based on Data Size

  • Small data (e.g., 6 battles)? We pull the result strongly toward the average. It's like saying, "We don't have enough proof yet, so let's be cautious."
  • Big data (e.g., 83 battles)? We barely change it - it's like saying, "This has lots of evidence, so trust it more."

How Much Blending?

We use the "median" (middle) number of battles across similar parts as a guide. If most pairings have 20 battles, we blend as if adding 20 "average" battles to each.

This isn't guesswork: both the prior and the blending strength are calculated from our proprietary data and verified submissions ("empirical" means derived from real evidence). The result highlights truly strong combos, not just lucky ones.

A Real Example: PhoenixWing vs. WizardRod

Let's use numbers from a bit like "Ball" (based on typical data):

Raw Data (No Smoothing)
  • PhoenixWing: 6 wins out of 6 → 100%
  • WizardRod: 55 wins out of 83 → 66%
  • "Best Blade" for "Ball": PhoenixWing (higher %)
VS
With Smoothing
  • Overall average winrate (prior): 50%
  • Blending strength: Median battles = 20
  • PhoenixWing smoothed: (6 + 10) / (6 + 20) ≈ 62% - the extra 10 "wins" are 50% of the 20 blended battles
  • WizardRod smoothed: (55 + 10) / (83 + 20) ≈ 63%
  • "Best Blade": WizardRod (63% > 62%)

Now the more tested pairing wins!

As more data comes in, smoothing fades and big samples win out.
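For the curious, here is a minimal sketch of this kind of smoothing, assuming the prior winrate and the median battle count are supplied as described above (our production implementation may differ in its details):

```python
def smoothed_winrate(wins: int, battles: int,
                     prior_winrate: float = 0.5, median_battles: int = 20) -> float:
    """Blend the observed winrate with the prior, as if adding `median_battles`
    battles at the prior winrate to the record."""
    return (wins + prior_winrate * median_battles) / (battles + median_battles)

print(round(smoothed_winrate(6, 6), 2))    # 0.62 - PhoenixWing: 100% raw, pulled strongly toward 50%
print(round(smoothed_winrate(55, 83), 2))  # 0.63 - WizardRod: 66% raw, barely moves
```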

Where We Apply It and How to Use It

Best Pairings

The smoothing mechanism is applied only when determining the "best pairings" for blades, ratchets & bits.

Toggle Option

You can toggle it on or off in the dashboard to view either the raw results or the smoothed ones.

Not Applied To

Overall winrates, battle counts, or finish types.

Not Applied To

Expandable tables that show individual performance stats against opponents or individual pairing stats with other parts. These are always raw by design.

And by the way: the expandable tables below each blade/ratchet/bit you're checking always give full disclosure of the number of encounters and the individual performance stats behind the data.