Do we know what matter is?


Matter is at the same time something very mundane and incredibly complex. We deal with material objects all the time and jet when we are press to define what matter is, we end up with a unsatisfactory answer like “matter is that of which things are made of”. And then we can give examples of different materials (type of matter): steel, water, soil, rock, air,…

Our understanding of matter has changed a lot over the last century with new scientific insights, but much is still debated, many of the positions defended have a long history at their backs. This topic has been a central issue of philosophy for over 2000 years, clearly a lot of though has gone into the matter.

Philosophical understanding of matter

Matter has been a topic of great discussion from the time of the Greeks, and still is central debate between religious people, who in some way or other believe on the existence of non-material, let’s say spiritual, things and atheist or materialists who state that matter is all that exist.

The discussion of the essence of matter in ancient Greeks

The ancient Greek thought matter was formed by mixing 4 elements: water, air, fire and earth. Different materials from wood to iron and gold were thought to be mixtures on different percentages of the four basic elements. Thales of Miletus thought the first principle was water, this came from the observation that there is moisture everywhere so water is par of everything. His pupil Anaximander said that water couldn’t be the first, principia, since water couldn’t produce fire, the same happened with the other elements, non were able to create their opposite, thus there was no principal element as such. He nonetheless proposed a perfect, unlimited, eternal and indefinite substance, the Aperion, from which all was created. Anaximenes, Anaximandres pupil, returned to the elemental principia, theory but proposed air as the original element from which everything else is created. He said that though rarification air produced fire and through compression water and subsequently earth.

Pythagoras of Samos said that numbers not matter were the origin of everything. Heraclitus said that no matter was possible since all in life is flux and continuous change. On the other hand Parmenides believed that the universe was static and the hold the only truth, our senses however were changing and unreliable, rendering knowledge of truth impossible. Leucippus and Democritus held that matter was composed of indivisible constituents, atoms. As one can see there was a long debate of what matter was even as far back as the 6th century before Christ.

John Dalton discovered the atom in 1803, giving evidence towards the views held by Leucippus and Democrates.

Modern philosophy: idealism and materialism

The Idealism movement is a group of thinkers which held that reality is principally mental, all we know of the world is what our mind interprets of the world in itself which is out of our reach. The most well known of these thinkers are: Immanuel Kant, G. W. F. Hegel and A. Schopenhauer. In their theories matter is relegated to a secondary place since the world is basically immaterial, or at least our knowledge of the universe is strongly conditioned by our mental process meaning the original essence of matter is not critical since all we ever have a chance to know is mental.

On the opposite side the materialism is a form of philosophical monism which holds that matter is the fundamental substance in nature, and that all phenomena, including mental phenomena and consciousness, are results of material interactions.

There are other philosophical theories that fall in dualistic or pluralistic reality, meaning the world is not purely mental, spiritual or material but is a composition of 2 or more aspects. Descartes is probably one the best known representative of the dualistic nature of reality.


Scientific understanding of matter

From a scientific perspective in classical mechanics a material object is characterized with a position in space and time and some physical properties like volume and mass. All of these properties of matter have a very specific definition and meaning but the only definition of matter that will extract from mechanics is: “matter is that which lays on a position in space and time and occupies a volume and has mass” .

If we go further into quantum and relativistic mechanics we find that ordinary matter is structured, formed by atoms which themselves are composed of particles: electrons , neutrons and protons. neutrons and protons are also composite particles each consisting of 3 quarks. Other more exotic forms of matter exist, formed by all sorts of particles: muons, tauons, neutrinos (3 flavous), mesons (formed by a quark anti quark pair)… In addition we know from relativity that mass and energy are really part of the same thing so massless particles like photons also qualify as matter. From this perspective matter is formed from a sea of particles which in turn are just things that occupy a position in spacetime and have some physical properties like mass, electric charge and spin.

Particles on the standard model of physics.

As we can see science is good at telling us the structure of matter, which are its constructive blocks, but can’t really answer what matter is. This is a consequence of the scientific method, through which hypothesis are falsified and theoretical predictions are verified. This process ensures that the lasting theory has endured and all the predictions based on it have been verified, this however doesn’t mean that some future prediction may fail meaning a new refinement of the theory is required. As a consequence what science can definitely say is how the real world is not. It can’t be anything that produces falsifiable results, since these results are a definite prove that the world is not how we propose. Thats why science will never be able to answer what anything is, only what it is not.

In this sense matter is not a continuous media, since it is made of discrete pieces (subatomic particles). Matter is not static since these particles are in continuous motion. Matter is definitely not mass since mass is only a measure of a body’s resistance to change of motion, inertia, which is a property of most matter (all except mass-less particles, like photons). Different type of matter interact with each other in different ways by four forces: gravitation, electromagnetism, weak force and strong force.


Today, in our scientific worldview most people have a materialistic perspective on nature and life. All matter is made of atoms and all that is or ever has been is made of the particles that constitute the standard model and possible some other particles not jet discovered. This begs the question of whether abstract concepts exist, and by “abstract concepts” here I include such things as chairs and tables, not only truly abstract ones as goodness and happiness. These concepts exist in our minds but not in nature, in a materialistic explanation these concepts don’t really exist they are generated from chemical reactions in our brains the same way awareness arises. And here is were the opposites touch, in a materialistic perspective abstract concepts don’t exist, because the are non-material, from the point of view of idealism or dualism they do exist but in a different “realm of ideas”. Both agree that these concepts are non-material and, in the sense that we use them every day, they undoubtedly must exist in some non-material way. The only difference is whether we disregard this non-material existence as non-existence. So in an ontological way the difference is really not that great as initially expected.


El Cáos a través del péndulo doble

El péndulo doble es uno de los sistemas más sencillos, cualquiera puede construirse uno en su casa con dos masas sujetadas de dos barras, sin embargo demuestra la complejidad de la mecánica en la naturaleza. La imagen en la portada (fuente original) muestra lo compleja que resulta la trayectoria de este segundo péndulo; el primer péndulo recorre arcos de circunferencia, trayectoria roja sin embargo al segundo muestra toda clase de quiebros inesperados, linea amarilla.

Se trata de un sistema donde el comportamiento caótico se presenta de una forma aparente hasta para ángulos de desplazamiento iniciales no muy grandes.

Lo primero que haremos es deducir las ecuaciones del movimiento del péndulo doble. Esto llevará una buena parte del post deduciremos las ecuaciones por dos procedimientos distintos, el Newtoniano y el Lagrangiano. Después exploraremos en que sentido es el movimiento caótico a través del análisis de resultados de la simulación de las ecuaciones deducidas para distintas condiciones iniciales.

Método Newtoniano

Vamos a empezar por plantear el problema desde un punto de vista Newtoniano. Veremos después como el planteamiento se estandariza y facilita si se plantea desde un punto de vista Lagrangiano.

Presentamos en la Figura 1 un esquema de fuerzas y aceleraciones que actúan sobre el sistema.

Figura 1: a) Esquema de fuerzas b) Esquema de aceleraciones, en un péndulo doble.

Según la segunda ley de Newton F=ma (fuerza igual a masa por aceleración) en este caso aplicamos esta ley a las fuerzas de cada una de las masas en la direción vertical (y) y horizontal (x). Tenemos:





De la tercera de estas 4 ecuaciones sacamos:


De la primera sustituyendo el valor obtenido pata F_2 tenemos:


De las dos ecuaciones restantes, la segunda y la cuarta obtenemos las ecuaciones que rigen el sistema:



Ahora solo queda deducir las aceleraciones del sistema, para ello partimos de las posiciones de los péndulos, las derivamos con respecto al tiempo para obtener las velocidades y las volvemos a derivar para obtener aceleraciones.

Empezamos por las posiciones, de la Figura 1b, se observa (tened en cuenta que el eje y lo consideramos positivo en el sentido ascendente y el origen esta en el punto fijo del péndulo):





Ahora derivamos con respecto al tiempo y obtenemos las velocidades:





dónde usamos la notación  \dot{z} para denotar la derivada temporal de z. Por último realizamos la segunda derivada:




a_{y2}=L_1(\mathrm{sin}{\theta}_1\ddot{{\theta}_1}+\mathrm{cos}{\theta}_1\dot{{\theta}_1})+L_2(\mathrm{sin}{\theta}_2\ddot{{\theta}_2} +\mathrm{cos}{\theta}_2\dot{{\theta}_2})

Método Lagrangiano

La teoría de Euler-LaGrange dice que en mecánica se cumple la ecuación

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right)-\frac{\partial L}{\partial x}=0

Donde $L$ es el Lagrangiano, $x$ es una coordenada generalizada del sistema y $\dot{x}$ su derivada temporal. En nuestro problema coordenadas generalizadas son las posiciones angulares, {\theta }_1 y{\theta }_2. La ley de Euler-LaGrange se demuestra desde los principios newtonianos y es equivalente a aquellos. Se trata de un planteamiento desde el punto de vista energético.

El Lagrangiano se define como:


Donde T es la energía cinética y V la energía potencial del sistema.

Recordemos que la energía potencial gravitatoria es V=-mgy  donde y es la altura (signo menos porque hemos elegido el eje y hacia abajo).

y_1=L_1{\mathrm{cos} {\theta }_1\ }
y_1=L_1{\mathrm{cos} {\theta }_1\ }+L_2{\mathrm{cos} {\theta }_2\ }
V=-m_1gL_1{\mathrm{cos} {\theta }_1\ }-m_2g\left(L_1{\mathrm{cos} {\theta }_1\ }+L_2{\mathrm{cos} {\theta }_2\ }\right)
V=-g\left[\left(m_1+m_2\right)L_1{\mathrm{cos} {\theta }_1\ }+{m_2L}_2{\mathrm{cos} {\theta }_2\ }\right]

La energía cinética es un poco más difícil de sacar. Recordemos que


La velocidad de la masa 1 es inmediata.

v_1=L_1\dot{{\theta }_1}

Sin embargo el de la masa 2 resulta de la suma vectorial de la velocidad 1 más la velocidad de la masa dos relativa al sistema no-inercial.

v_r=L_2\dot{{\theta }_2}

Es fácil de ver que el ángulo que forman \overrightarrow{v_1} y \overrightarrow{v_r} es {\theta }_2-{\theta }_1 y por tanto por el teorema del coseno:

{v_2}^2={\left(L_1\dot{{\theta }_1}\right)}^2+{\left(L_2\dot{{\theta }_2}\right)}^2+2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }

Por tanto la energía cinética es:

T=\frac{1}{2}m_1{L_1}^2{\dot{{\theta }_1}}^2+\frac{1}{2}m_2\left[{\left(L_1\dot{{\theta }_1}\right)}^2+{\left(L_2\dot{{\theta }_2}\right)}^2+2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }\right]

El Lagrangiano por tanto:

L=\frac{1}{2}(m_1+m_2){L_1}^2{\dot{{\theta }_1}}^2+\frac{1}{2}m_2\left[{\left(L_2\dot{{\theta }_2}\right)}^2+2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }\right]+ g\left[\left(m_1+m_2\right)L_1{\mathrm{cos} {\theta }_1\ }+{m_2L}_2{\mathrm{cos} {\theta }_2\ }\right]

El siguiente paso por tanto es calcular las derivadas parciales del Lagrangiano como función $L({\theta }_1,{\theta }_2,\ \dot{{\theta }_1}\dot{{,\theta }_2})$:

\frac{\partial L}{\partial {\theta }_1}=m_2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }-g\left(m_1+m_2\right)L_1{\mathrm{sin} {\theta }_1\ }
\frac{\partial L}{\partial {\theta }_2}=-m_2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }-gm_2L_2{\mathrm{sin} {\theta }_1\ }
\frac{\partial L}{\partial \dot{{\theta }_1}}=\left(m_1+m_2\right){L_1}^2\dot{{\theta }_1}+m_2L_1L_2\dot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }
\frac{\partial L}{\partial \dot{{\theta }_2}}=m_2{L_2}^2\dot{{\theta }_2}+m_2L_1L_2\dot{{\theta }_1}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{{\theta }_1}}\right)=\left(m_1+m_2\right){L_1}^2\ddot{{\theta }_1}+m_2L_1L_2\left[\ddot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }-\dot{{\theta }_2}(\dot{{\theta }_2}-\dot{{\theta }_1}){\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }\right]

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{{\theta }_2}}\right)=m_2{L_2}^2\ddot{{\theta }_2}+m_2L_1L_2\left[\ddot{{\theta }_1}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }-\dot{{\theta }_1}(\dot{{\theta }_2}-\dot{{\theta }_1}){\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }\right]

Por tanto, aplicando \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{{\theta }_i}}\right)-\frac{\partial L}{\partial {\theta }_i}=0, las dos ecuaciones que rigen el sistema (simplificando los términos que se cancelan) son:

\left(m_1+m_2\right)L_1\ddot{{\theta }_1}+m_2L_2\left[\ddot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }-{\dot{{\theta }_2}}^2{\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }\right]+g\left(m_1+m_2\right){\mathrm{sin} {\theta }_1\ }=0

L_2\ddot{{\theta }_2}+L_1\left[\ddot{{\theta }_1}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }-{\dot{{\theta }_1}}^2{\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }\right]+g{\mathrm{sin} {\theta }_1\ }=0

Para que estén escritas de la misma manera que en el apartado anterior solo habría que despejar \ddot{{\theta }_1} y \ddot{{\theta }_2}, cosa que vamos a dejar para el lector.

La ventaja del método Lagrangiano es que no requiere calcular las fuerzas que soportan los péndulos que es un proceso engorroso. Además la manera de llegar hasta las ecuaciones es siempre igual, a cambio requiere realizar varias derivadas.

Comportamiento Caótico del sistema

En Física se dice que un sistema es caótico cuando pequeñas diferencias en las condiciones iniciales del sistema conducen a situaciones muy distintas con el trascurso del tiempo. Para mostrar esto voy a simular dos casos en el primero los péndulos van a soltarse con unos ángulos: {\theta}_1=30^{\circ} \quad (\frac{\pi}{6} \mathrm{rad}) y {\theta}_2=60^{\circ} \quad (\frac{\pi}{3} \mathrm{rad}); el segundo se soltara desde:{\theta}_1=31^{\circ} \quad (\frac{31\pi}{180} \mathrm{rad}) y {\theta}_2=61^{\circ} \quad (\frac{61\pi}{3} \mathrm{rad}). Una diferencia de tan solo 1º produce diferencias importantes en tan solo 3s de simulación. De la misma manera diferencias de 1″ llevarían a errores inaceptables al cabo de unos minutos. Los péndulos simulados tienen una longitud de 1 m y unas masas de 1 kg.

Figura 2: Diferencia en la posición de cada una de las masas entre los casos simulados. En azul la diferencia entre los valores de {\theta}_1 en radianes. En naranja la diferencia entre los valores de {\theta}_2 en radianes.
Figura 3: posiciones de las masas de los pendulos durante los primeros 3s.


De la Figura 3 vemos que los dos sistemas se siguen bastante bien durante 1.5 s o así, a partir de ahí empiezan a separarse bastante bruscamente. La Figura 2 muestra la diferencia entre el caso cuyas condiciones iniciales son (30º, 60º) y aquel que tiene (31º,61º). Durante los primeros 1.75s o así las graficás de la Figura 2 estan bastante cerca de cero, indicando que la diferencia entre los dos sistemas es escasa. Sin embargo a los 2s diverge sustancialmente ya que llegan a mas de 2 rad (114º) de diferencia. Observar que la máxima diferencia posible es de 180º, significando que estan en la posición opuesta.

Notar que un error de 1º en la posición de un péndulo de 1m es bastante grande ya que equivale a unos 17 mm de desplazamiento, por esta razón los dos casos divergen a los pocos segundos.

Climate Radiation Model

This model is inspired in the model posted by David Evans in his blog page. The model is based in the concept of emission layers of the atmosphere. The different active gases that are part of the composition of the the atmosphere, each emits infra red radiation at characteristic wavelengths and from different atmospheric layers.

The active gases of the atmosphere, sometimes called greenhouse gases, are H2O, CO2, O3 and CH4 in order of decreasing thermal emissions. Apart from the active gases some radiation is emitted directly from Earth’s surface and the top of the clouds through what is called “The atmospheric IR window“, the spectrum to which the atmosphere is transparent in the IR. In David’s nomenclature these 6 possible sinks for the incoming heat are called “pipes”, of these two are of minor importance O3 and CH4, leaving 4 main pipes. Energy can redistribute though the other pipes if one of them gets blocked, as for example by adding CO2.

Fig 1. The spectral outgoing long-wave radiation (OLR). Showing the spectral windows of the different gases and the transparent window from which the surface emits. In gray is the blackbody emission of an object at 300K = 23ºC

David does a very good job at summarizing the available data on the highs of emission of the different gases and the top of the clouds, here. The gases are supposed ti be almost black body emitters in the window through which each is active, meaning the emitted energy is only a function of the temperature of the layer of the atmosphere from which the emission takes place. Since the temperature of the atmosphere decreases with altitude (in the troposphere), a higher layer emits less power than one closer to the Earth’s surface.

David’s OLR (outgoing long-wave radiation) model is only concerned on how the variation of various parameters modify the distribution of heat through the pipes, how these parameters may be dependent of the temperature or other independent variables is outside his scope.

Here I am going to layout a thermal model, based in well known physics to try to explain some of these missing relations. The first step is to build a model that fits the data, so to that purpose I am going to use the numbers from David Evans’ post:

  • Lapse rate 6.5ºC/km, surface temperature= 288K
  • Cloud cover = 62%, albedo = 30%, solar constant = 1367.7 W/m²
  • Water emission layer: height=8km, output power = 33%
  • Carbon Dioxide layer: height=7km, output power = 20%
  • Cloud top emission layer: height=3.3km, output power = 20%
  • Methane emission layer: height=3km, output power = 2%
  • Ozone emission layer: height=16km, output power= 5.8%
  • Surface emission layer: height=0km, output power=18.2%

Note: for now I have treated the CO2 as emitting from a constant average hight, I liked David’s treatment of the wights of the spectral emission on this spectrum, and I am planning on taking a similar approach on my next refinement. (End note)

The model uses a 2 surfaces representation of The Earth: surface 0 the ground surface (the origin) and the top of the atmosphere surface which is characterized by the maximum height of the convective Hadley. Temperatures are assumed to be linear throughout the atmosphere, so once the convective overturn is specified and the temperature at the top of the Hadley cell is known, the temperature of any other layer is linearly interpoled. The amount of energy that flows through each pipe is controlled by six additional parameters that represent the spectral width of the different spectral windows for each pipe. In the analogy of flow coming out of a damp through a set of pipes in parallel, these parameters represent the widths of the pipes.  For now these values have been adjusted to fit the percentages specified above, but I pretend to deduce their dependence with the height of the emission layers and the wave-lengths of the windows in the next post of the series.

The complete equations of the model and the values of the different parameters are on the link. The core of the model is equations 41, 50 and 51; representing the energy balance in both regions, the surface and the atmosphere.

clamte model diagram

Fig 2. Model schematic. One surface and one band model. Two balance equations one on the surface and one on the upper atmosphere as a whole. The atmosphere emits from different layers which are at different temperatures

The incoming solar power, modified by albedo, is the heat source of planet Earth and this heat is assumed to be absorbed on the surface. The surface balances the heat by radiation and convection mechanisms. The surface radiates either directly to space (about 18%) or to the clouds, this makes a total of three heat sinks for the surface: the two radiation and the convective mechanism.


The atmosphere on the other hand is heated by the surface, through the convention and the radiation to clouds mechanisms, which being heat sinks for the surface, become sources for the atmosphere. The atmosphere is balanced by its own sinks which is the radiation to space from the different active layers: clouds, H2O, CO2, CH4 and O3.

Q_{Conv}+Q_{ToClouds}=Q_{FromClouds}+ Q_{H2O}-Q_{CO2}+Q_{CH4}+Q_{O3}

Each of the radiative emission layers is modeled like so:

Q_i=A_i \epsilon f_i \sigma T_i^4

T_i=T_0-\alpha h_i

where A_i is the surface area, \epsilon is the emittance of the atmosphere (0.996), \sigma is Stephan-Boltzmann constant, T_i is the temperature of the emission layer in K, f_i is the window factor, T_0 is the temperature of Earth’s surface, \alpha is the lapse rate and h_i the height of the emission layer.

The convective heat is modeled as so:

Q_{Conv}=A_0 h_{conv} (T_0-T_1)

The lapse rate is then:


where A_0 is the area of Earth’s surface, h_{conv} is the convection film coefficient, T_1 is the temperature at the top of the Hadley convective cell, and H is the height of the convective cell.

The direct radiation to space is then:

Q_{Direct}=A_0 \epsilon f_{direct}(1-c)\sigma T_o^4

where c is the cloud cover and f_{direct} the direct atmospheric window.

The radiation to clouds is:

Q_{ToClouds}=A_0 \epsilon f_{direct}c\sigma T_o^4-A_1 \epsilon f_{clouds}c\sigma T_1^4

with f_{clouds} being the atmospheric window from the top of the clouds and A_1 the surface of a sphere which encompass the convective layer of Earth.

Lastly the solar irradiation is

Q_{Solar}=A_0 G_s/4 (1-a)

Whit G_s as the solar constant and a as the albedo.

The model has then 8 parameters that can be adjusted to fit the experimental data: all 6 window factors, the convective coefficient and the height of the convective cell. These parameters are set by imposing the experimental outgoing power distribution, the experimental mean lapse rate and the surface mean temperature which are a total of 7 restrictions. This leaves an extra degree of freedom which I chose as setting the height of the convective cell as 8.2 km arbitrarily.

There are several problems with the current model, that will be addressed in the next post of the serie:

  1. The temperature of the stratosphere increases with height from the tropopausa at about 10-12 km so the ozone temperature layer is not correct. The actual ozone layer is above 20-30 km high but I chose to leave it at 16km so that it’s temperature not fall drastically when using the linear lapse rate. The stratosphere increases temperature  because the O3 captures part of the UV light from the sun and is heated. In future models I may include this effect.
  2. Although the physical meaning of the window factors is clear, these factors can be deduced mathematically from the temperature of the emission layer and the wavelength interval as the fraction of the Planck distribution at the temperature that is emitted through the window. This will be tried on next model, once done the factor will be linked to the height of the layer, the lapse rate and the surface temperature through the temperature of the layer. The fact that the model has an extra degree of freedom (the height of the convection cell) increases my confidence that once the theoretical window fractions are calculated, which inevitably will be different from those obtained from the adjustment, the model will still fit the experimental data within reason.
  3.   CO2 emits radiation from a whole range of heights in the atmosphere through the weights of the spectral window (see figure 1), the treatment of this feature will be studied. I think it is the result of a lower opacity (larger optical length) of the CO2 at those wavelengths so the solution is only partly lowering the emission height but also the emittance at those wavelengths, since a lower absorption (opacity) will always be accompanied by a lower emittance at a same wavelength (Kirchhoff Law of radiation)


This has been a very interesting post for me. I look forwards to the continuation. Any comment, or doubt or correction is welcomed.



Uncertainty and Bayesian probability

bayesian_processThis post is to address the relation between the uncertainty on the state of a system with the information we have of it. Said in those terms is kind of obvious, the more information the less uncertainty. More information can take the form of knowing some other aspect of the system or being more precise information on previously acknowledge aspects.

In mathematics the way to deal with this kind of problem is probability and the way probabilities change when additional data is taken into account is the law of Bayes, hence Bayesian probabilities.

To illustrate how this works I’m going to go through an example. Lets imagine we have a tank of worm water, like a bath tub, and we want to know what temperature the water in the tank has. We could stick a thermometer in the tank and measure its temperature, assuming the temperature of the water in the tank is well mixed then that would yield the temperature of the tank with the uncertainty characteristic of our measuring device.

Fig. (1) The uncertainty distribution of a simple measurement of the water in the tank

If we are using a mercury thermometer the uncertainty would be around \pm0.2\textrm{ }\textdegree\textrm{C}. Measuring devices’ uncertainties are usually assumed to be normally distributed unless specific evidence is available. Usually the uncertainty level used is 95% or 2\sigma (two standard deviations). Meaning that when I say that the water in the tank is at 36.9\pm0.2\textrm{ }\textdegree\textrm{C}, it means I’m 95% certain that the temperature is between 36.7 and 37.1. The probability distribution of Fig. (1) shows the exact meaning, we are more confident the closer we get to the mean value.

To see how our knowledge of the system varies when further information is taken into account we are going to consider that this tank is not alone in the universe but it interacts with it.

Lets make the tank be in thermal equilibrium by adding a hot water inlet and a outlet, configured in such a way that the level of water is constant. Lets assume the tank is well insulated on all of its lateral walls and floor but open to the room temperature air on its top surface. For this system to be in equilibrium the mass flows of the inlet and outlet must coincide and the incoming heat through the inlet must be equal to the heat losses to the ambient.

This system is easily described with a simple equation relating the variables in play:

mc_p(T_{in}-T)=Ah(T-T_{amb})                                   (1)


  • m is the mass flow.
  • c_p is the specific heat of the water, which we are going to take as a constant known with absolute certainty to be c_p=4.187\quad\frac{\textrm{kJ}}{\textrm{kg}\textdegree\textrm{C}}   .
  • T_{in} is the temperature of the water comming into the tank.
  • T is the temperature of the tank.
  • A is the area of the surface water of the tank, which we also will assume is a perfectly known quantity T_{in}=0.64\quad\textrm{m}^2
  • h is the convection film coeficient, which we’ll assume is h=10 \quad\frac{\textrm{W}}{\textrm{m}^2\textdegree\textrm{C}}.
  • T_{amb} is the ambient temperature of the air in contact with the water surface.

Lets assume we are measuring the inlet temperature, the ambient temperature and the mass flow, as follows:

  1. T_{in}= 58 \pm 0.2\quad \textdegree\textrm{C}.
  2. T_{amb}= 17 \pm 0.2\quad \textdegree\textrm{C}
  3. m= 0.0015 \pm 3.7E-5\quad \frac{\textrm{kg}}{\textrm{s}}

Having these measurements gives us a pretty good idea of the temperature of the tank even before explicitly measuring it. Fig. (2) shows this probability, it turns out we already know the temperature of the tank is 37.257 \pm 0.52\textrm{ }\textdegree\textrm{C}. It is less precise that the direct measurement but the monitoring of the interactions with the outside does provide an estimate of the temperature of the system.

Fig. (2) Prior probability of the state of the tank, before measuring it explicitly.

Now lets consider what happens when we measure the temperature and the measurement device shows like before 36.9\pm0.2\textrm{ }\textdegree\textrm{C}. Since our prior knowledge of the state of the system is different, in the previous case we didn’t know anything of the system before hand, now we believe the temperature is 37.257 \pm 0.52\textrm{ }\textdegree\textrm{C}. How this influences the ultimate state of our knowledge, after the measurement?

Bayes’ theorem provides the method that allows us to to update beliefs when new evidence arrives (more on that on wikipedia). Applying the theorem to the example at hand we find that the prior beliefs modify the end result bringing the mean a little towards the mean of our prior beliefs and reduces a little the uncertainty of the measurement. Figure (3) shows how our prior probabilities (blue) bring the measurement (green) slightly towards the right to, transforming our prior believes (blue) to to our later, more precise, believes (red). The final state of knowledge of the system is 36.946 \pm 0.187\textrm{ }\textdegree\textrm{C}, the uncertainty has gone down from 0.2 to 0.187 because of our prior knowledge, moving the mean from 36.9 to 36.946.

Fig. (3) The Bayesian process. Our prior believes (blue) influence our final state of knowledge (red) after measuring the system (green). It doesn’t only modify the most likely value but also reduces the uncertainty of the measurement.