      Tails, Bayes loses (and invents data assimilation in the process)

      Chris Budd
      9 May, 2024

      Brief summary

      Scientists often only have imperfect data to base their predictions on. This is the case, for example, when it comes to weather forecasting. This article explains a technique called data assimilation designed to deal with this problem. It updates an initial prediction in the light of new data, taking account of the fact that both data and original prediction are imperfect.

      In the previous article we learnt how to use Bayes' theorem to adjust the probability of an event in the light of data. The example we used was a coin that had come up heads ten times in a row. Given this data, it seems likely that the coin is biased towards heads, so it seems the chance of heads coming up at the eleventh toss should be higher than the 50% you'd expect from an unbiased coin. Bayes' theorem confirms this intuition.
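
      To make this concrete, here is a minimal sketch of that calculation. The numbers are illustrative assumptions, not taken from the previous article: suppose that before tossing we think there is a 50% chance the coin is fair and a 50% chance it is biased, with a 90% chance of heads.

# A minimal sketch of the Bayesian update for the coin example.
# Hypothetical prior: 50% chance the coin is fair, 50% chance it is biased
# with P(heads) = 0.9. The observed data is ten heads in a row.

p_fair, p_biased = 0.5, 0.5            # prior probabilities of the two hypotheses
heads_if_fair, heads_if_biased = 0.5, 0.9

# Likelihood of seeing ten heads under each hypothesis
like_fair = heads_if_fair ** 10        # (0.5)^10, about 0.001
like_biased = heads_if_biased ** 10    # (0.9)^10, about 0.35

# Bayes' theorem: posterior is proportional to prior times likelihood
evidence = p_fair * like_fair + p_biased * like_biased
post_fair = p_fair * like_fair / evidence
post_biased = p_biased * like_biased / evidence

# Probability that the eleventh toss is a head, averaged over the hypotheses
p_next_head = post_fair * heads_if_fair + post_biased * heads_if_biased
print(post_biased, p_next_head)        # about 0.997 and 0.899

      With these assumptions the coin is almost certainly the biased one, and the chance of heads on the eleventh toss is pushed well above 50%.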

      However, there is another explanation of what we have observed. The coin itself may be perfectly fair, and it may be the data itself that is biased. For example, I may have removed my glasses when recording the heads and the tails. In this case I simply can't tell one from the other, and for convenience I have recorded each toss as a head. Alternatively, I may have correctly seen whether each toss was a head or a tail, but due to a computer malfunction all the tosses got recorded as heads.

      These are examples of an instrumentation error. Such errors are not uncommon when recording data (although they are usually not as extreme as in this example). No data recording device is perfect and all make some errors.

      Another possibility is that I may have been deliberately lying when I recorded the data, in order to convince you that the coin is biased even though it may have fallen as a tail several times. This sort of situation is, of course, not uncommon when using data in criminal cases, where a judgement has to be made about whether or not to believe a witness.

      Thus we are faced with a question: given data which is not (completely) reliable, what, if anything, can we say about the system we are studying, such as whether the coin is in fact fair?

      Bayes helps out

      To help to make an estimate of the state of the system given that the data might be dodgy, we need a measure of the reliability of the data. A good example is a measuring instrument, such as a thermometer. The measurements it makes of a temperature T may vary slightly each time, but if the mean (average) value of those measurements is T, then we say the thermometer is unbiased. The variance of the measurements reflects how they spread out around the mean, and gives a way of assessing how reliable the measurements are. If there is a large variance we take less account of the data than if there is a small variance. This allows us to update a (prior) prediction of an event by assessing just how much we need to change the prediction given a possibly unreliable data measurement.
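
      Here is a toy simulation, with made-up numbers, of what "unbiased" and "variance" mean for such a thermometer: the mean of many readings sits close to the true temperature, while the variance measures the spread, and hence the reliability, of any single reading.

import random

# Toy simulation (hypothetical numbers): a thermometer reads the true
# temperature T plus random noise. The mean of many readings is close to T
# (the thermometer is unbiased), and the variance measures their spread.

random.seed(1)
T = 20.0           # true temperature
noise_sd = 0.5     # standard deviation of the instrument noise

readings = [T + random.gauss(0, noise_sd) for _ in range(1000)]

mean = sum(readings) / len(readings)
variance = sum((r - mean) ** 2 for r in readings) / len(readings)

print(mean)        # close to 20.0, so the thermometer is unbiased
print(variance)    # close to 0.25 = noise_sd**2; a small variance means reliable data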

      This process of updating a prediction in the light of imperfect data is often called data assimilation. Data assimilation combines a dodgy prediction with dodgy data to produce an updated prediction which is more accurate than either. This really is like magic: we are truly getting something for (almost) nothing!

      Meteorologists have been using data assimilation for about twenty years, and this has had a significant impact on the reliability of the weather forecast. To reliably predict tomorrow's weather over the whole of the Earth based on today's weather, meteorologists would in theory need to take about a billion measurements of the state of the atmosphere today. In practice this is impossible: the most they can manage is about a million measurements, so they don't have enough information about the weather today from data alone.

      To get around this problem, meteorologists take yesterday's forecast for today's weather and then nudge that forecast in the direction of data measured today (and yes, "nudge" is a technical term!). The resulting estimate of today's weather is then used to make tomorrow's forecast.

      Data assimilation produces the nudge, and the general idea is as follows. The meteorologist makes a (prior) prediction of today's weather based on a knowledge of yesterday's weather. They also measure as much of today's weather as possible, for example using a thermometer (or just looking out of the window). Since each measurement is going to be a little different, even an unbiased thermometer will come up with a whole spread of possible measurements.

      The prediction of today's weather from yesterday will also have an associated error, actually a whole spread of possible errors (this is because our weather models and the computations we make on them are far from perfect). We write Epred for the variance of this spread. The prediction is then compared with the (limited) observations (the data) that we can make of today's weather, which will also have an error whose variance we call Edata.

      If Epred is small compared to Edata, then the prediction is only nudged a little in the direction of the data. Loosely speaking, this is because the prediction is more reliable than the data we measured today, so we don't want to take too much account of today's data. In contrast if Epred is large compared to Edata, then the data is taken a lot more into account.

      The result after nudging, the analysis A, is the best estimate of today's weather consistent with both the original prediction and the data. Using the analysis the weather forecaster can then predict the weather for the next few days. (See the appendix of this article to find out more detail about the nudging process in a simple application involving estimating the temperature.)
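
      For a single quantity, such as a temperature, one standard way of making the nudge can be written in a couple of lines. The sketch below uses made-up numbers and is a simplified illustration rather than the full calculation described in the appendix.

# A simplified scalar sketch of the nudge, with made-up numbers.
# prediction: yesterday's forecast of today's temperature (error variance E_pred)
# data: today's noisy measurement of the temperature (error variance E_data)

prediction, E_pred = 14.0, 4.0
data, E_data = 17.0, 1.0

# The gain sets how far to move towards the data: close to 0 when the
# prediction is trusted (E_pred small), close to 1 when the data is trusted.
gain = E_pred / (E_pred + E_data)

analysis = prediction + gain * (data - prediction)
E_analysis = (1 - gain) * E_pred    # smaller than both E_pred and E_data

print(analysis, E_analysis)         # 16.4 and 0.8

      Here the data is trusted more than the prediction (E_data is smaller than E_pred), so the analysis of 16.4 degrees sits much closer to the measured 17 than to the predicted 14, and its error variance of 0.8 is smaller than that of either ingredient: the "something for (almost) nothing" mentioned above.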

      [Figure: An illustration of the process of data assimilation. The pink ellipse represents the prediction and its possible errors, the orange ellipse the data and its possible errors. Data assimilation nudges the original prediction in the direction of the data so that it lies within the error ellipses of both the original prediction and the data.]

      The idea of assimilating the data into the forecast (in variants called 3DVAR, 4DVAR and Ensemble Kalman Filtering) is how our weather is predicted every day by the Met Office, the European Centre for Medium-Range Weather Forecasts (ECMWF), and other weather centres all over the world.

      [Figure: An illustration of the process of data assimilation in meteorology.]

      The role of Bayes' theorem in this, and all other applications of data assimilation, is that it tells us exactly how big the nudge needs to be. It updates the prediction in the light of new data, taking account of the fact that both data and original prediction are imperfect. We can use it to produce an algorithm for finding the best prediction. See the technical explanation in the appendix of this article to find out how all this works.
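
      For readers who want a taste of how Bayes' theorem fixes the size of the nudge, here is a sketch for a single quantity, under the simplifying assumption (made here for illustration) that both the prediction error and the data error follow Gaussian bell curves.

% Bayes' theorem with a Gaussian prediction and Gaussian data:
% x is the true value, x_pred the prediction (error variance E_pred),
% x_data the measurement (error variance E_data).
\[
  p(x \mid x_{\mathrm{data}}) \;\propto\;
  \underbrace{\exp\!\left(-\frac{(x - x_{\mathrm{pred}})^2}{2E_{\mathrm{pred}}}\right)}_{\text{prior (prediction)}}
  \;
  \underbrace{\exp\!\left(-\frac{(x - x_{\mathrm{data}})^2}{2E_{\mathrm{data}}}\right)}_{\text{likelihood (data)}}
\]
% Completing the square shows the posterior is again Gaussian, centred on the
% analysis A with variance E_A:
\[
  A \;=\; x_{\mathrm{pred}} + \frac{E_{\mathrm{pred}}}{E_{\mathrm{pred}} + E_{\mathrm{data}}}\,
          \bigl(x_{\mathrm{data}} - x_{\mathrm{pred}}\bigr),
  \qquad
  \frac{1}{E_A} \;=\; \frac{1}{E_{\mathrm{pred}}} + \frac{1}{E_{\mathrm{data}}}.
\]

      The factor multiplying the difference between data and prediction is exactly the gain used in the temperature sketch above: Bayes' theorem is what determines its size.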

      The same idea of systematically combining the knowledge of a system with a stream of data is used in the very successful Kalman filter which was originally introduced to track satellites, and is now universal in many devices including aircraft navigation systems and your mobile phone (see this article to find out more). A sophisticated version of the same idea is used in modern machine learning where complex neural networks are trained against (possibly unreliable) data to perform a huge variety of different tasks.
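
      As a flavour of how this works, here is a toy one-dimensional Kalman filter (with made-up numbers, and not the algorithm of any particular navigation system): an object moves at roughly constant speed, and at each time step the filter predicts where it should be, then nudges that prediction towards a noisy position measurement using the same variance-weighted gain as above.

import random

# Toy 1D Kalman filter with hypothetical numbers: track the position of an
# object moving at roughly constant velocity from noisy measurements.

random.seed(0)
v, dt = 1.0, 1.0     # assumed velocity and time step
Q = 0.1              # variance of the model error (the prediction is imperfect)
R = 2.0              # variance of the measurement error (the data is imperfect)

true_pos = 0.0       # the real position, which the filter never sees directly
x, E = 0.0, 1.0      # current estimate of the position and its error variance

for step in range(10):
    # The truth evolves, with a little unmodelled wobble
    true_pos += v * dt + random.gauss(0, Q ** 0.5)
    measurement = true_pos + random.gauss(0, R ** 0.5)

    # Predict: move the estimate forward with the model; the error grows
    x_pred, E_pred = x + v * dt, E + Q

    # Update: nudge the prediction towards the data, weighted by the variances
    gain = E_pred / (E_pred + R)
    x = x_pred + gain * (measurement - x_pred)
    E = (1 - gain) * E_pred

    print(f"step {step}: truth {true_pos:.2f}, estimate {x:.2f} (variance {E:.2f})")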

      Truly the modern world relies on Bayes' theorem and its many applications!


      About this article

      [Photograph: Chris Budd.]

      Chris Budd OBE is Professor of Applied Mathematics at the University of Bath and part of Maths4DL, a joint research programme of the Universities of Bath and Cambridge and University College London, which explores the mathematics of deep learning. Chris is particularly interested in applying mathematics to the real world and promoting the public understanding of mathematics.

      Chris has written the popular maths book Climate, Chaos and COVID, published by World Scientific, and has co-written, with C. Sangwin, the popular mathematics book Mathematics Galore!, published by Oxford University Press. Chris also features in the book 50 Visions of Mathematics, edited by Sam Parc.


      This content is part of our collaboration with the Mathematics for Deep Learning (Maths4DL) research programme, which brings together researchers from the universities of Bath and Cambridge, and University College London. Maths4DL aims to combine theory, modelling, data and computation to unlock the next generation of deep learning. You can see more content produced with Maths4DL here.


