Proving the marginal value theorem

James V. Stone

The marginal value theorem holds true under three fairly mild conditions:

  1. The fixed cost $T$ is larger than zero.

  2. The reward function $E(t)$ increases with $t$.

  3. The slope $dE(t)/dt$ of the reward function decreases with $t$ (i.e. $E(t)$ is a diminishing returns function).
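As a concrete illustration, the three conditions can be checked numerically for one candidate reward function. The sketch below assumes a hypothetical reward function $E(t) = 1 - e^{-t}$ and a fixed cost $T = 1$; neither comes from the article, and any other diminishing returns function with $T > 0$ would serve equally well.

```python
import math

T = 1.0  # assumed fixed cost; any positive value satisfies condition 1


def E(t):
    """Hypothetical reward function, chosen only for illustration."""
    return 1.0 - math.exp(-t)


def r(t):
    """Slope dE/dt of the hypothetical reward function above."""
    return math.exp(-t)


# Sample times 0.1, 0.2, ..., 5.0
ts = [0.1 * k for k in range(1, 51)]

cond1 = T > 0                                          # condition 1
cond2 = all(E(b) > E(a) for a, b in zip(ts, ts[1:]))   # condition 2: E increases
cond3 = all(r(b) < r(a) for a, b in zip(ts, ts[1:]))   # condition 3: slope decreases

print(cond1, cond2, cond3)
```

A check on a finite grid is only a sanity test, not a proof; for this particular $E(t)$ the conditions also follow directly from $dE/dt = e^{-t} > 0$ and $d^2E/dt^2 = -e^{-t} < 0$.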

We wish to prove that the instantaneous reward rate $r(t)$ equals the average reward rate $R(t)$ when $R(t)$ is maximal. To do this, we find the value of $r(t)$ at the maximum of $R(t)$, using the fact that the slope of $R(t)$ is zero at a maximum.

The average reward rate is defined as

  $\displaystyle  R(t)  $ $\displaystyle  =  $ $\displaystyle  \frac{E(t)}{T+t},  $   (1)

and its derivative is

  $\displaystyle  \frac{dR}{dt}  $ $\displaystyle  =  $ $\displaystyle  \frac{1}{T+t} \frac{dE}{dt} + E(t) \frac{d(T+t)^{-1}}{dt},  $   (2)

where (by definition)

  $\displaystyle  \frac{dE}{dt}  $ $\displaystyle  =  $ $\displaystyle  r(t),  $   (3)

is the instantaneous reward rate, and where

  $\displaystyle  \frac{d(T+t)^{-1}}{dt}  $ $\displaystyle  =  $ $\displaystyle  \frac{-1}{(T+t)^{2}}.  $   (4)
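The product rule expression in Equation 2, with Equations 3 and 4 filled in, can be sanity-checked against a finite-difference estimate of the slope of $R(t)$. This is a minimal sketch, again assuming the hypothetical reward function $E(t) = 1 - e^{-t}$ and fixed cost $T = 1$, which are not taken from the article.

```python
import math

T = 1.0  # assumed fixed cost (illustrative value)


def E(t):
    """Hypothetical reward function, chosen only for illustration."""
    return 1.0 - math.exp(-t)


def r(t):
    """Instantaneous reward rate dE/dt (Equation 3) for this E."""
    return math.exp(-t)


def R(t):
    """Average reward rate (Equation 1)."""
    return E(t) / (T + t)


def dR_product_rule(t):
    """Equation 2 with Equations 3 and 4 substituted."""
    return r(t) / (T + t) + E(t) * (-1.0 / (T + t) ** 2)


def dR_numeric(t, h=1e-6):
    """Central finite-difference estimate of dR/dt."""
    return (R(t + h) - R(t - h)) / (2.0 * h)


# Largest disagreement between the two slopes at a few sample times
max_gap = max(abs(dR_product_rule(t) - dR_numeric(t)) for t in [0.5, 1.0, 2.0, 4.0])
print(max_gap)
```

The two slopes should agree up to the small error expected from the finite-difference approximation and floating-point rounding.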

Substituting Equations 3 and 4 into Equation 2,

  $\displaystyle  \frac{dR}{dt}  $ $\displaystyle  =  $ $\displaystyle  r(t) \frac{1}{T+t} - \frac{E(t)}{(T+t)^{2}} ,  $   (5)

where, from Equation 1,

  $\displaystyle  \frac{ E(t) }{T+t}  $ $\displaystyle  =  $ $\displaystyle  R(t),  $   (6)

is the average reward rate, so that Equation 5 becomes

  $\displaystyle  \frac{dR}{dt}  $ $\displaystyle  =  $ $\displaystyle  \frac{r(t) }{T+t} - \frac{R(t)}{T+t}.  $   (7)

At a maximum, this is equal to zero,

  $\displaystyle  \frac{r(t) }{T+t} - \frac{R(t)}{T+t}  $ $\displaystyle  =  $ $\displaystyle  0.  $   (8)

Finally, multiplying both sides by $(T+t)$, and re-arranging yields

  $\displaystyle  r(t)  $ $\displaystyle  =  $ $\displaystyle  R(t).  $   (9)

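This proves that the average reward rate has zero slope when the instantaneous reward rate equals the average reward rate. This stationary point is a maximum: differentiating Equation 7 and then setting $dR/dt = 0$ and $r(t) = R(t)$ gives $d^2R/dt^2 = (dr/dt)/(T+t)$, which is negative because $T+t$ is positive and, by condition 3, $dr/dt$ is negative. Hence the average reward rate is maximal when the instantaneous reward rate equals the average reward rate.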
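The result can also be illustrated numerically. The sketch below assumes the hypothetical reward function $E(t) = 1 - e^{-t}$ and fixed cost $T = 1$ (neither is from the article). It locates the time at which $r(t) = R(t)$ by bisection, locates the time that maximises $R(t)$ by a brute-force grid search, and compares the two.

```python
import math

T = 1.0  # assumed fixed cost (illustrative value)


def E(t):
    """Hypothetical reward function, chosen only for illustration."""
    return 1.0 - math.exp(-t)


def r(t):
    """Instantaneous reward rate dE/dt for this E."""
    return math.exp(-t)


def R(t):
    """Average reward rate E(t) / (T + t)."""
    return E(t) / (T + t)


def solve_r_equals_R(lo=0.0, hi=5.0, iterations=60):
    """Bisection on g(t) = r(t) - R(t), which is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if r(mid) - R(mid) > 0:
            lo = mid  # r still exceeds R, so the crossing lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Time at which the instantaneous and average reward rates coincide
t_star = solve_r_equals_R()

# Independent estimate: time that maximises R on a grid with spacing 0.001
grid = [0.001 * k for k in range(0, 5001)]
t_grid = max(grid, key=R)

print(t_star, t_grid, r(t_star), R(t_star))
```

For this choice of $E(t)$ and $T$, the two estimates of the optimal time should agree to within the grid spacing, and the printed values of $r$ and $R$ at `t_star` should match.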
