Methodology
How we gather, process, and score Beyblade data
Bladers Hub does not claim to be the holy grail of Beyblade X data. Our data is proprietary and manually reviewed. If external data is submitted to us, it must be verified by our team before being admitted to our database. It is therefore important to understand that we cannot present our data as the ultimate truth; other local Beyblade groups may arrive at results that differ significantly from ours.
What we want to ensure, however, is that our data is as useful to you as possible. To that end, we have implemented statistical tools that compensate for vastly different sample sizes and provide meaningful performance scores. Below we explain exactly how our scoring works and what those numbers mean.
Our Data Tracking Methodology
Before we dive into how we analyze data, it's crucial to understand how we collect and verify it. At Bladers Hub, data quality and authenticity are our top priorities.
Beyblade X
Currently, we are focused on Beyblade X only. Other Beyblade series could be added in the future, but this will depend on the further development of the franchise.
Uncut Video Verification
Only video material showing uncut matches is processed. This ensures transparency and prevents cherry-picking of favorable results. Every battle we track must be verifiable from start to finish.
Two-Player Competitive Matches
All tracked battles must be 2-player matchups where both participants are actively competing. This eliminates bias and ensures performance data reflects real-world competitive scenarios.
Official Equipment Requirements
Stadium
Original Takara Tomy Beyblade X Stadium only
Beyblades & Parts
Official Hasbro and Takara Tomy Beyblade X products
These standards ensure consistency across all tracked battles and allow for meaningful performance comparisons.
Why This Matters: By maintaining strict video verification standards and equipment requirements, we ensure that every data point in our system represents genuine competitive performance. This commitment to authenticity is what makes our statistical analysis meaningful and reliable.
Scoring Methodology: PartScores & BeyScores
Understanding what those scores on your dashboard actually mean - from PartScores (individual blades, ratchets, bits) to BeyScores (complete combinations).
PartScore: Rating Individual Parts (0.0 - 10.0)
Every individual part (blade, ratchet, bit, or assist blade) that has competed in at least 10 battles gets a PartScore. This score reflects how well that part performs competitively, combining multiple factors into a single number from 0.0 to 10.0.
Quick Summary: A PartScore of 5.0 is average, 7.0+ is strong, and 8.5+ is exceptional. Parts with fewer than 10 battles don't get scored yet - we need enough data to make a fair assessment!
How We Calculate PartScores (The 4 Components)
Base Performance
What it measures: Core combat effectiveness
Formula:
(Winrate × 6) + (Avg Points ÷ 3 × 4)
Breaking it down:
- Winrate × 6: Your winrate (0.0 to 1.0) multiplied by 6
  → If you win 100% of battles, you get 6 points from winrate
- Avg Points ÷ 3: Your average points per battle divided by 3 (the maximum possible)
  → This normalizes your points to a 0-1 scale (3 is an Xtreme finish)
- × 4: That normalized value multiplied by 4
  → If you average 3 points per battle, you get 4 points from this
Why split 6 + 4? Winrate is more important (60% of base score) than how you win (40%), but Xtreme finishes still matter!
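A minimal Python sketch of this component, using the WizardRod numbers from the worked example further down (function and variable names are ours, not the site's actual code):

```python
def base_performance(winrate: float, avg_points: float) -> float:
    """Core combat effectiveness: up to 6 points from winrate, up to 4 from
    normalized average points (3 points per battle = Xtreme finish = maximum)."""
    return (winrate * 6) + (avg_points / 3 * 4)

# 68% winrate and 1.9 average points (the WizardRod example below):
print(round(base_performance(0.68, 1.9), 2))  # 6.61
```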
Sample Confidence
What it measures: Data reliability
Formula:
min(1.0, 0.4 + (Battles ÷ 150 × 0.6))
- Starts at 0.4 (40% confidence) with few battles
- Gains 0.6 over 150 battles (reaching 1.0 at 150 battles)
- Caps at 1.0 (full confidence) at 150+ battles
Example: With 75 battles → 0.4 + (75/150 × 0.6) = 0.7 multiplier
Why so punishing? This ensures well-tested parts with 100+ battles score higher than flashy but under-tested combos. The 7-10 score range is reserved for parts/combos with substantial battle data.
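In Python, a sketch of this multiplier might look like this (a direct transcription of the formula above; names are ours):

```python
def sample_confidence(battles: int) -> float:
    """Confidence multiplier: starts at 0.4 and grows linearly to 1.0 at 150+ battles."""
    return min(1.0, 0.4 + (battles / 150) * 0.6)

print(round(sample_confidence(75), 2))   # 0.7
print(round(sample_confidence(200), 2))  # 1.0 (capped)
```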
Matchup Diversity
What it measures: Versatility against different opponents
Formula:
(Unique Opponents Faced ÷ Total Opponents in DB) × 2.5
- More unique opponents = higher bonus
- Maximum 2.5 points if faced ALL opponents
- Rewards parts tested in varied matchups
Example: Faced 30 out of 60 opponents → (30/60) × 2.5 = 1.25 points
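The same formula as a small Python sketch (again with our own placeholder names):

```python
def matchup_diversity(unique_opponents_faced: int, total_opponents_in_db: int) -> float:
    """Versatility bonus: up to 2.5 points for having faced every opponent in the database."""
    return (unique_opponents_faced / total_opponents_in_db) * 2.5

print(matchup_diversity(30, 60))  # 1.25
```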
Consistency Bonus
What it measures: Performance stability across matchups
Formula:
(1 - Standard Deviation of Winrates) × 2.5
- We calculate your winrate against each unique opponent
- Then measure how much those winrates vary (standard deviation)
- Low variance = consistent = high bonus
- Maximum 2.5 points for zero variance (perfect consistency)
Example: Your winrates vary by 0.18 → (1 - 0.18) × 2.5 = 2.05 points
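A sketch of this bonus, assuming the spread is measured as the population standard deviation of the per-opponent winrates (the exact estimator is an assumption on our part):

```python
from statistics import pstdev

def consistency_bonus(per_opponent_winrates: list[float]) -> float:
    """Stability bonus: a low spread in per-opponent winrates earns up to 2.5 points."""
    spread = pstdev(per_opponent_winrates)  # assumed: population standard deviation
    return (1 - spread) * 2.5

# A spread of 0.18 yields (1 - 0.18) * 2.5 = 2.05 bonus points.
```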
The Final Formula
PartScore = (Base Performance × Sample Confidence) + Matchup Diversity + Consistency Bonus
The result is capped at 10.0 and rounded to one decimal place.
Real Example: Scoring WizardRod Blade
Data from battles: WizardRod has 68% winrate (0.68) and 1.9 avg points across 85 battles
Winrate component: 0.68 × 6 = 4.08
The 6 is winrate's max contribution to base score
Points component: (1.9 ÷ 3) × 4 = 0.633 × 4 = 2.53
The 3 is max points per battle (Xtreme finish)
The 4 is points' max contribution to base score
→ 4.08 + 2.53 = 6.61 base points
Battle count: 85 battles
Confidence formula: 0.4 + (85 ÷ 150) × 0.6 = 0.74
Start at 0.4, gain 0.6 over 150 battles
→ 6.61 × 0.74 = 4.89 points after confidence
Opponent variety: Faced 42 out of 60 total unique opponents in database
Diversity formula: (42 ÷ 60) × 2.5 = 1.75
The 2.5 is the maximum diversity bonus
→ +1.75 bonus points
Performance variance: Standard deviation of winrates across opponents = 0.18
Consistency formula: (1 - 0.18) × 2.5 = 2.05
The 2.5 is the maximum consistency bonus
→ +2.05 bonus points
4.89 (base × confidence) + 1.75 (diversity) + 2.05 (consistency)
= 8.7 PartScore
(Capped at 10.0 and rounded to one decimal)
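Putting the four components together, here is a minimal sketch of the whole calculation that reproduces the WizardRod example (8.7). Intermediate values are not rounded here, and all names are our own placeholders rather than the site's code:

```python
def part_score(winrate: float, avg_points: float, battles: int,
               unique_opponents: int, total_opponents: int,
               winrate_std: float) -> float:
    """PartScore = (base performance x sample confidence) + diversity + consistency,
    capped at 10.0 and rounded to one decimal place."""
    base = (winrate * 6) + (avg_points / 3 * 4)
    confidence = min(1.0, 0.4 + (battles / 150) * 0.6)
    diversity = (unique_opponents / total_opponents) * 2.5
    consistency = (1 - winrate_std) * 2.5
    return round(min(10.0, base * confidence + diversity + consistency), 1)

# WizardRod: 68% winrate, 1.9 avg points, 85 battles, 42/60 opponents, 0.18 spread
print(part_score(0.68, 1.9, 85, 42, 60, 0.18))  # 8.7
```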
BeyScore: Rating Complete Combinations (0.0 - 10.0)
When you build a complete Beyblade (blade + ratchet + bit, plus assist blade for CX blades), we calculate a BeyScore. This tells you how strong that specific combination is expected to be.
Method 1: Tested Combos (10+ Battles)
If this exact combo has battled 10+ times:
We calculate a Score FOR THE COMBO ITSELF (treating the entire combo as a single "part"). This means we apply the same 4-component formula to the combo's battle data:
- Base Performance: The combo's winrate and average points
- Sample Confidence: Based on the combo's battle count
- Matchup Diversity: How many different opponents this combo faced
- Consistency Bonus: How stable the combo's performance is
Why not use individual parts? Real battle data reveals synergies! A combo might perform better or worse than its individual parts suggest. By calculating from the combo's actual battles, we capture these interactions automatically.
Example: WizardRod 3-60 Ball has 120 battles. We calculate its PartScore using those 120 battles (same formula as individual parts) → BeyScore = 8.9
Method 2: Untested Combos (Less than 10 Battles)
If the combo hasn't been tested enough:
We combine the PartScores of each component using weighted averages:
BX / UX Blades
- Blade: 40%
- Ratchet: 30%
- Bit: 30%
CX Blades (with Assist)
- Blade: 35%
- Ratchet: 25%
- Bit: 25%
- Assist Blade: 15%
Untested Penalty: We apply a 5% penalty (×0.95) because untested combos might not perform as well as their parts suggest due to unknown synergies.
Example Calculation (BX Blade):
- PhoenixWing PartScore: 7.8
- 9-60 PartScore: 6.5
- GearNeedle PartScore: 8.2
→ (7.8 × 0.40) + (6.5 × 0.30) + (8.2 × 0.30) = 7.53
→ Apply untested penalty: 7.53 × 0.95 = 7.2 BeyScore
Important: BeyScores are only calculated if all required parts have PartScores (meaning each has 10+ battles). If even one part lacks data, the combo won't have a BeyScore yet.
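A sketch of the untested-combo estimate, using the weights and the 5% penalty described above (the dictionary keys and function name are illustrative):

```python
# Component weights for untested combos (see the tables above).
BX_UX_WEIGHTS = {"blade": 0.40, "ratchet": 0.30, "bit": 0.30}
CX_WEIGHTS = {"blade": 0.35, "ratchet": 0.25, "bit": 0.25, "assist": 0.15}
UNTESTED_PENALTY = 0.95  # 5% penalty for unknown synergies

def estimated_bey_score(part_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the component PartScores with the untested penalty applied."""
    weighted = sum(part_scores[slot] * weight for slot, weight in weights.items())
    return round(weighted * UNTESTED_PENALTY, 1)

# BX example from above: PhoenixWing 7.8, 9-60 6.5, GearNeedle 8.2
print(estimated_bey_score({"blade": 7.8, "ratchet": 6.5, "bit": 8.2}, BX_UX_WEIGHTS))  # 7.2
```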
What Do These Scores Actually Mean?
Important Context: Our scoring system is designed to be conservative with high scores. Currently, the best proven competitive combos with full battle data score around 6.0-6.5, and reaching 7.0+ is exceptionally difficult. The 8.0-10.0 range is reserved for truly dominant performance backed by extensive data.
Scores grow over time: As combos accumulate more battle data, their confidence multiplier increases, allowing scores to naturally migrate upward. A combo with a great winrate but limited battles might score 5.5 initially, then reach 6.5+ as more data confirms its strength. This prevents volatile scores driven by small sample sizes.
From the highest score range to the lowest, the tiers roughly break down as follows:
- Proven competitive dominance with extensive battle data. These are tournament-level parts/combos that have consistently demonstrated exceptional performance across many matchups.
- Solid competitive choices with good performance data. Reliable options that hold their own in most matchups. Many excellent combos sit in this range while building more battle data.
- Respectable performance with room to grow. These parts/combos work well in certain matchups or may be building toward higher scores as more data comes in.
- Still gathering data or showing struggles in competitive settings. May work in niche scenarios, need more battles to establish true performance, or require optimization.
Statistical Smoothing: Dealing with Small Sample Sizes
Our scores are great for ranking parts, but what about finding the "best pairings"? When looking at specific part combinations (like "best blade for this ratchet"), small sample sizes can be misleading. That's where Empirical Bayes smoothing comes in.
The Problem: Small Data Samples Can Be Misleading
Imagine you're looking at the "best blade pairing" for a bit like "Ball":
PhoenixWing
Paired with Ball has battled 6 times and won all 6
Sounds amazing!
WizardRod
Paired with Ball has battled 83 times and won 55
A solid and, as we all know, genuinely phenomenal result. Yet it looks weak next to PhoenixWing's perfect record.
Without adjustment, PhoenixWing looks "better" because of the 100% winrate. But with only 6 battles, that could be luck—like flipping heads 6 times in a row. WizardRod's 66% is based on way more evidence, so it's more trustworthy. Small samples often show extreme results (high or low) just by chance, while big samples give a truer picture.
This happens across our dashboard for "best" blade/ratchet/bit pairings. Raw data favors rare, lucky combos over proven ones. To fix this, we use a smart technique called Empirical Bayes smoothing.
What is Empirical Bayes Smoothing?
Think of it like blending a pairing's actual battle record with an "average" from all similar battles. It's a way to "calm down" extreme results from small data, making them more realistic without ignoring them.
The "Average" (Prior)
We look at the overall winrate or average points across all similar parts in our database. For example, if most blades, bits or ratchets have about 50% winrate (which makes sense, as wins and losses balance out across battles), that's our starting point.
Blending Based on Data Size
- Small data (e.g., 6 battles)? We pull the result strongly toward the average. It's like saying, "We don't have enough proof yet, so let's be cautious."
- Big data (e.g., 83 battles)? We barely change it - it's like saying, "This has lots of evidence, so trust it more."
How Much Blending?
We use the "median" (middle) number of battles across similar parts as a guide. If most pairings have 20 battles, we blend as if adding 20 "average" battles to each.
This isn't guessing, though: the blending is calculated from our proprietary data and your verified submissions (empirical means "from real evidence"). It helps highlight truly strong combos, not just lucky ones.
A Real Example: PhoenixWing vs. WizardRod
Let's use numbers from a bit like "Ball" (based on typical data):
- PhoenixWing: 6 wins out of 6 → 100%
- WizardRod: 55 wins out of 83 → 66%
- "Best Blade" for "Ball": PhoenixWing (higher %)
- Overall average winrate (prior): 50%
- Blending strength: Median battles = 20 (adding 20 "average" battles at a 50% winrate means adding 0.5 × 20 = 10 wins)
- PhoenixWing smoothed: (6 + 10) / (6 + 20) ≈ 62%
- WizardRod smoothed: (55 + 10) / (83 + 20) ≈ 63%
- "Best Blade": WizardRod (63% > 62%)
Now the more tested pairing wins!
As more data comes in, smoothing fades and big samples win out.
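For the curious, here is a minimal sketch of that blending step, reproducing the numbers above. The parameter names are ours, and in the real pipeline the prior and blending strength are derived from the database rather than hard-coded:

```python
def smoothed_winrate(wins: int, battles: int,
                     prior_winrate: float = 0.5, prior_strength: float = 20) -> float:
    """Empirical Bayes smoothing: blend the observed record with the overall average,
    weighted by the median battle count across similar pairings (prior_strength)."""
    return (wins + prior_winrate * prior_strength) / (battles + prior_strength)

print(round(smoothed_winrate(6, 6), 2))    # 0.62 -> PhoenixWing
print(round(smoothed_winrate(55, 83), 2))  # 0.63 -> WizardRod
```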
Where We Apply It and How to Use It
Best Pairings
The smoothing mechanism is only applied to determining the "best pairings" for blades, ratchets & bits.
Toggle Option
You can toggle smoothing on or off in the dashboard to view either the raw data or the smoothed results.
Not Applied To
Overall winrates, battle counts, or finish types.
Not Applied To
Expandable tables that show individual performance stats against opponents or individual pairing stats with other parts. These are always raw by design.
And by the way: the expandable tables below each blade/ratchet/bit you're checking always give full disclosure of the number of encounters and the individual performance stats.