Model Scaling Law Performance Calculator

JJ Ben-Joseph

Empirical scaling laws summarize how training loss tends to improve as you increase training resources. This calculator focuses on the common “data scaling” relationship where loss decreases as a power law in the number of training tokens.

What this calculator estimates

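Given a baseline training run with token count N0 and observed training loss L0, plus a scaling exponent α and an irreducible loss floor B, the calculator solves for the coefficient A, projects the loss at a new token count N1, and, if you supply a target loss Ltarget, estimates the number of tokens needed to reach it.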

Definitions and units

The scaling-law formula

The calculator uses the common form:

$$L(N) = A \cdot N^{-\alpha} + B$$

Solving for A from the baseline

Using the baseline observation (N0, L0):

$$A = (L_0 - B) \cdot N_0^{\alpha}$$

This requires L0 > B. If L0 is less than or equal to B, the fitted A is non-positive and the model no longer represents a diminishing-loss curve.
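The baseline fit can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function name `fit_coefficient` is hypothetical, and the input values are made up for the example rather than taken from the calculator's defaults.

```python
def fit_coefficient(n0: float, l0: float, alpha: float, b: float) -> float:
    """Solve L0 = A * N0**(-alpha) + B for A."""
    if n0 <= 0:
        raise ValueError("n0 must be positive")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if l0 <= b:
        # Mirrors the L0 > B requirement: otherwise A would be zero or negative.
        raise ValueError("baseline loss l0 must exceed the floor b")
    return (l0 - b) * n0 ** alpha


# Illustrative inputs only (assumed, not the calculator's defaults).
a = fit_coefficient(n0=1e9, l0=3.0, alpha=0.5, b=1.0)

# Sanity check: plugging A back into the formula should recover L0.
recovered_l0 = a * 1e9 ** (-0.5) + 1.0
print(a, recovered_l0)
```

Rejecting L0 ≤ B up front is a design choice for the sketch; a calculator could instead display a warning and skip the projection.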

Projecting loss at N1

Once A is known, the projected loss at N1 is:

$$L(N_1) = A \cdot N_1^{-\alpha} + B$$
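As a rough sketch of the projection step, the Python snippet below fits A from the baseline and evaluates the curve at a new token count. The function name `project_loss` and all numeric inputs are assumptions chosen for illustration, not the calculator's defaults.

```python
def project_loss(n0: float, l0: float, n1: float, alpha: float, b: float) -> float:
    """Project the loss at n1 tokens from a baseline (n0, l0)."""
    if n0 <= 0 or n1 <= 0:
        raise ValueError("token counts must be positive")
    if l0 <= b:
        raise ValueError("baseline loss l0 must exceed the floor b")
    a = (l0 - b) * n0 ** alpha      # coefficient fitted from the baseline
    return a * n1 ** (-alpha) + b   # L(N1) = A * N1^(-alpha) + B


# Illustrative inputs only: 4x the baseline tokens with alpha = 0.5.
projected = project_loss(n0=1e9, l0=3.0, n1=4e9, alpha=0.5, b=1.0)
print(projected)
```

With these inputs the reducible part of the loss (L0 − B = 2.0) is scaled by 4^(−0.5) = 0.5, so the projection lands near 2.0. Algebraically the same result follows from (L0 − B) × (N1/N0)^(−α) + B, which avoids forming the large intermediate value A.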

Solving for tokens needed to reach a target loss

If you provide a target loss Ltarget (must satisfy Ltarget > B), then:

$$N_{\text{target}} = \left( \frac{A}{L_{\text{target}} - B} \right)^{1/\alpha}$$
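The inversion can be sketched the same way. In the Python snippet below, `tokens_for_target` is a hypothetical helper name and the inputs are illustrative assumptions, not the calculator's defaults.

```python
def tokens_for_target(n0: float, l0: float, l_target: float,
                      alpha: float, b: float) -> float:
    """Tokens needed to reach l_target, given a baseline (n0, l0)."""
    if n0 <= 0 or alpha <= 0:
        raise ValueError("n0 and alpha must be positive")
    if l0 <= b:
        raise ValueError("baseline loss l0 must exceed the floor b")
    if l_target <= b:
        # The curve approaches b only asymptotically, so no finite N works.
        raise ValueError("target loss must exceed the floor b")
    a = (l0 - b) * n0 ** alpha
    return (a / (l_target - b)) ** (1.0 / alpha)


# Illustrative inputs only (assumed values).
n_half = tokens_for_target(n0=1e9, l0=3.0, l_target=2.0, alpha=0.5, b=1.0)
n_quarter = tokens_for_target(n0=1e9, l0=3.0, l_target=1.5, alpha=0.5, b=1.0)
print(n_half / 1e9, n_quarter / 1e9)  # token multipliers relative to the baseline
```

With α = 0.5, each halving of the reducible loss (L − B) multiplies the required tokens by about 4, which is why targets close to B become expensive quickly.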

How to interpret the results

Worked example (matches the default inputs)

Suppose you observed:

First compute A:

Now project loss at N1:

If you also set Ltarget = 1.5:

This illustrates a common takeaway: pushing training loss close to the floor B can require enormous increases in tokens.

Quick comparison: what changes when you scale data?

| Change | What happens to L(N)? | Practical implication |
| --- | --- | --- |
| Increase N (more tokens) | L decreases roughly as $N^{-\alpha}$ until it nears B | Diminishing returns; biggest gains are earlier |
| Increase α | Curve falls faster with N | Fewer extra tokens needed for the same loss drop |
| Increase B | Loss floor rises; all projections shift upward | Data/architecture/label noise may be limiting |
| Increase L0 with same N0 | Implied A increases | Worse baseline implies higher losses at all N unless you change B or α |
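The diminishing-returns pattern can be made concrete with a short Python sketch. It relies on the identity L(2^k × N0) = (L0 − B) × 2^(−αk) + B, which follows from the formula above; the function name `loss_after_doublings` and the input values are assumptions for illustration only.

```python
def loss_after_doublings(l0: float, alpha: float, b: float, doublings: int) -> float:
    """Loss after doubling the baseline token count `doublings` times."""
    return (l0 - b) * 2.0 ** (-alpha * doublings) + b


# Illustrative inputs only: L0 = 3.0, alpha = 0.5, B = 1.0.
losses = [loss_after_doublings(3.0, 0.5, 1.0, k) for k in range(5)]

# Improvement bought by each successive doubling of the token count.
gains = [losses[k] - losses[k + 1] for k in range(4)]

for k, gain in enumerate(gains, start=1):
    print(f"doubling {k}: loss {losses[k]:.3f}, gain {gain:.3f}")
```

Each doubling recovers a fixed fraction (1 − 2^(−α)) of the remaining gap to B, so the absolute gain shrinks with every step even though the token cost doubles.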

Assumptions & limitations (read before acting on projections)

Practical tips

