El Cáos a través del péndulo doble

El péndulo doble es uno de los sistemas más sencillos, cualquiera puede construirse uno en su casa con dos masas sujetadas de dos barras, sin embargo demuestra la complejidad de la mecánica en la naturaleza. La imagen en la portada (fuente original) muestra lo compleja que resulta la trayectoria de este segundo péndulo; el primer péndulo recorre arcos de circunferencia, trayectoria roja sin embargo al segundo muestra toda clase de quiebros inesperados, linea amarilla.

Se trata de un sistema donde el comportamiento caótico se presenta de una forma aparente hasta para ángulos de desplazamiento iniciales no muy grandes.

Lo primero que haremos es deducir las ecuaciones del movimiento del péndulo doble. Esto llevará una buena parte del post deduciremos las ecuaciones por dos procedimientos distintos, el Newtoniano y el Lagrangiano. Después exploraremos en que sentido es el movimiento caótico a través del análisis de resultados de la simulación de las ecuaciones deducidas para distintas condiciones iniciales.

Método Newtoniano

Vamos a empezar por plantear el problema desde un punto de vista Newtoniano. Veremos después como el planteamiento se estandariza y facilita si se plantea desde un punto de vista Lagrangiano.

Presentamos en la Figura 1 un esquema de fuerzas y aceleraciones que actúan sobre el sistema.

Figura 1: a) Esquema de fuerzas b) Esquema de aceleraciones, en un péndulo doble.

Según la segunda ley de Newton F=ma (fuerza igual a masa por aceleración) en este caso aplicamos esta ley a las fuerzas de cada una de las masas en la direción vertical (y) y horizontal (x). Tenemos:

F_2\mathrm{sin}{\theta}_2-F_1\mathrm{sin}{\theta}_1=m_1a_{x1}

F_1\mathrm{cos}{\theta}_1-F_2\mathrm{cos}{\theta}_2-m_1g=m_1a_{y1}

-F_2\mathrm{sin}{\theta}_2=m_2a_{x2}

F_2\mathrm{cos}{\theta}_2-m_2g=m_2a_{y2}

De la tercera de estas 4 ecuaciones sacamos:

F_2=-\frac{m_2a_{x2}}{\mathrm{sin}{\theta}_2}

De la primera sustituyendo el valor obtenido pata F_2 tenemos:

F_1=-\frac{m_1a_{x1}+m_2a_{x2}}{\mathrm{sin}{\theta}_1}

De las dos ecuaciones restantes, la segunda y la cuarta obtenemos las ecuaciones que rigen el sistema:

m_1a_{x1}+m_1g-\frac{m_2a_{x2}}{\mathrm{tan}{\theta}_2}+\frac{m_1a_{x1}}{\mathrm{tan}{\theta}_1}

a_{y2}+\frac{a_{x2}}{\mathrm{tan}{\theta}_2}+g=0

Ahora solo queda deducir las aceleraciones del sistema, para ello partimos de las posiciones de los péndulos, las derivamos con respecto al tiempo para obtener las velocidades y las volvemos a derivar para obtener aceleraciones.

Empezamos por las posiciones, de la Figura 1b, se observa (tened en cuenta que el eje y lo consideramos positivo en el sentido ascendente y el origen esta en el punto fijo del péndulo):

x_1=L_1\mathrm{sin}{\theta}_1

y_1=-L_1\mathrm{cos}{\theta}_1

x_2=L_1\mathrm{sin}{\theta}_1+L_2\mathrm{sin}{\theta}_2

y_2=-L_1\mathrm{cos}{\theta}_1-L_2\mathrm{cos}{\theta}_2

Ahora derivamos con respecto al tiempo y obtenemos las velocidades:

v_{x1}=L_1\mathrm{cos}{\theta}_1\dot{{\theta}_1}

v_{y1}=L_1\mathrm{sin}{\theta}_1\dot{{\theta}_1}

v_{x2}=L_1\mathrm{cos}{\theta}_1\dot{{\theta}_1}+L_2\mathrm{cos}{\theta}_2\dot{{\theta}_2}

v_{y2}=L_1\mathrm{sin}{\theta}_1\dot{{\theta}_1}+L_2\mathrm{sin}{\theta}_2\dot{{\theta}_2}

dónde usamos la notación  \dot{z} para denotar la derivada temporal de z. Por último realizamos la segunda derivada:

a_{x1}=L_1(\mathrm{cos}{\theta}_1\ddot{{\theta}_1}-\mathrm{sin}{\theta}_1\dot{{\theta}_1)}

a_{y1}=L_1(\mathrm{sin}{\theta}_1\ddot{{\theta}_1}+\mathrm{cos}{\theta}_1\dot{{\theta}_1})

a_{x2}=L_1(\mathrm{cos}{\theta}_1\ddot{{\theta}_1}-\mathrm{sin}{\theta}_1\dot{{\theta}_1})+L_2(\mathrm{cos}{\theta}_2\ddot{{\theta}_2}-\mathrm{sin}{\theta}_2\dot{{\theta}_2})

a_{y2}=L_1(\mathrm{sin}{\theta}_1\ddot{{\theta}_1}+\mathrm{cos}{\theta}_1\dot{{\theta}_1})+L_2(\mathrm{sin}{\theta}_2\ddot{{\theta}_2} +\mathrm{cos}{\theta}_2\dot{{\theta}_2})

Método Lagrangiano

La teoría de Euler-LaGrange dice que en mecánica se cumple la ecuación

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right)-\frac{\partial L}{\partial x}=0

Donde $L$ es el Lagrangiano, $x$ es una coordenada generalizada del sistema y $\dot{x}$ su derivada temporal. En nuestro problema coordenadas generalizadas son las posiciones angulares, {\theta }_1 y{\theta }_2. La ley de Euler-LaGrange se demuestra desde los principios newtonianos y es equivalente a aquellos. Se trata de un planteamiento desde el punto de vista energético.

El Lagrangiano se define como:

L=T-V

Donde T es la energía cinética y V la energía potencial del sistema.

Recordemos que la energía potencial gravitatoria es V=-mgy  donde y es la altura (signo menos porque hemos elegido el eje y hacia abajo).

y_1=L_1{\mathrm{cos} {\theta }_1\ }
y_1=L_1{\mathrm{cos} {\theta }_1\ }+L_2{\mathrm{cos} {\theta }_2\ }
V=-m_1gy_1-m_2gy_2
V=-m_1gL_1{\mathrm{cos} {\theta }_1\ }-m_2g\left(L_1{\mathrm{cos} {\theta }_1\ }+L_2{\mathrm{cos} {\theta }_2\ }\right)
V=-g\left[\left(m_1+m_2\right)L_1{\mathrm{cos} {\theta }_1\ }+{m_2L}_2{\mathrm{cos} {\theta }_2\ }\right]

La energía cinética es un poco más difícil de sacar. Recordemos que

T=\frac{1}{2}mv^2

La velocidad de la masa 1 es inmediata.

v_1=L_1\dot{{\theta }_1}

Sin embargo el de la masa 2 resulta de la suma vectorial de la velocidad 1 más la velocidad de la masa dos relativa al sistema no-inercial.

\overrightarrow{v_2}=\overrightarrow{v_1}+\overrightarrow{v_r}
v_r=L_2\dot{{\theta }_2}

Es fácil de ver que el ángulo que forman \overrightarrow{v_1} y \overrightarrow{v_r} es {\theta }_2-{\theta }_1 y por tanto por el teorema del coseno:

{v_2}^2={\left(L_1\dot{{\theta }_1}\right)}^2+{\left(L_2\dot{{\theta }_2}\right)}^2+2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }

Por tanto la energía cinética es:

T=\frac{1}{2}m_1{v_1}^2+\frac{1}{2}m_2{v_2}^2
T=\frac{1}{2}m_1{L_1}^2{\dot{{\theta }_1}}^2+\frac{1}{2}m_2\left[{\left(L_1\dot{{\theta }_1}\right)}^2+{\left(L_2\dot{{\theta }_2}\right)}^2+2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }\right]

El Lagrangiano por tanto:

L=\frac{1}{2}(m_1+m_2){L_1}^2{\dot{{\theta }_1}}^2+\frac{1}{2}m_2\left[{\left(L_2\dot{{\theta }_2}\right)}^2+2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }\right]+ g\left[\left(m_1+m_2\right)L_1{\mathrm{cos} {\theta }_1\ }+{m_2L}_2{\mathrm{cos} {\theta }_2\ }\right]

El siguiente paso por tanto es calcular las derivadas parciales del Lagrangiano como función $L({\theta }_1,{\theta }_2,\ \dot{{\theta }_1}\dot{{,\theta }_2})$:

\frac{\partial L}{\partial {\theta }_1}=m_2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }-g\left(m_1+m_2\right)L_1{\mathrm{sin} {\theta }_1\ }
\frac{\partial L}{\partial {\theta }_2}=-m_2L_1L_2\dot{{\theta }_1}\dot{{\theta }_2}{\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }-gm_2L_2{\mathrm{sin} {\theta }_1\ }
\frac{\partial L}{\partial \dot{{\theta }_1}}=\left(m_1+m_2\right){L_1}^2\dot{{\theta }_1}+m_2L_1L_2\dot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }
\frac{\partial L}{\partial \dot{{\theta }_2}}=m_2{L_2}^2\dot{{\theta }_2}+m_2L_1L_2\dot{{\theta }_1}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{{\theta }_1}}\right)=\left(m_1+m_2\right){L_1}^2\ddot{{\theta }_1}+m_2L_1L_2\left[\ddot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }-\dot{{\theta }_2}(\dot{{\theta }_2}-\dot{{\theta }_1}){\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }\right]

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{{\theta }_2}}\right)=m_2{L_2}^2\ddot{{\theta }_2}+m_2L_1L_2\left[\ddot{{\theta }_1}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }-\dot{{\theta }_1}(\dot{{\theta }_2}-\dot{{\theta }_1}){\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }\right]

Por tanto, aplicando \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{{\theta }_i}}\right)-\frac{\partial L}{\partial {\theta }_i}=0, las dos ecuaciones que rigen el sistema (simplificando los términos que se cancelan) son:

\left(m_1+m_2\right)L_1\ddot{{\theta }_1}+m_2L_2\left[\ddot{{\theta }_2}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }-{\dot{{\theta }_2}}^2{\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }\right]+g\left(m_1+m_2\right){\mathrm{sin} {\theta }_1\ }=0

L_2\ddot{{\theta }_2}+L_1\left[\ddot{{\theta }_1}{\mathrm{cos} \left({\theta }_2-{\theta }_1\right)\ }-{\dot{{\theta }_1}}^2{\mathrm{sin} \left({\theta }_2-{\theta }_1\right)\ }\right]+g{\mathrm{sin} {\theta }_1\ }=0

Para que estén escritas de la misma manera que en el apartado anterior solo habría que despejar \ddot{{\theta }_1} y \ddot{{\theta }_2}, cosa que vamos a dejar para el lector.

La ventaja del método Lagrangiano es que no requiere calcular las fuerzas que soportan los péndulos que es un proceso engorroso. Además la manera de llegar hasta las ecuaciones es siempre igual, a cambio requiere realizar varias derivadas.

Comportamiento Caótico del sistema

En Física se dice que un sistema es caótico cuando pequeñas diferencias en las condiciones iniciales del sistema conducen a situaciones muy distintas con el trascurso del tiempo. Para mostrar esto voy a simular dos casos en el primero los péndulos van a soltarse con unos ángulos: {\theta}_1=30^{\circ} \quad (\frac{\pi}{6} \mathrm{rad}) y {\theta}_2=60^{\circ} \quad (\frac{\pi}{3} \mathrm{rad}); el segundo se soltara desde:{\theta}_1=31^{\circ} \quad (\frac{31\pi}{180} \mathrm{rad}) y {\theta}_2=61^{\circ} \quad (\frac{61\pi}{3} \mathrm{rad}). Una diferencia de tan solo 1º produce diferencias importantes en tan solo 3s de simulación. De la misma manera diferencias de 1″ llevarían a errores inaceptables al cabo de unos minutos. Los péndulos simulados tienen una longitud de 1 m y unas masas de 1 kg.

difference_vs_time
Figura 2: Diferencia en la posición de cada una de las masas entre los casos simulados. En azul la diferencia entre los valores de {\theta}_1 en radianes. En naranja la diferencia entre los valores de {\theta}_2 en radianes.
position_vs_time
Figura 3: posiciones de las masas de los pendulos durante los primeros 3s.

 

De la Figura 3 vemos que los dos sistemas se siguen bastante bien durante 1.5 s o así, a partir de ahí empiezan a separarse bastante bruscamente. La Figura 2 muestra la diferencia entre el caso cuyas condiciones iniciales son (30º, 60º) y aquel que tiene (31º,61º). Durante los primeros 1.75s o así las graficás de la Figura 2 estan bastante cerca de cero, indicando que la diferencia entre los dos sistemas es escasa. Sin embargo a los 2s diverge sustancialmente ya que llegan a mas de 2 rad (114º) de diferencia. Observar que la máxima diferencia posible es de 180º, significando que estan en la posición opuesta.

Notar que un error de 1º en la posición de un péndulo de 1m es bastante grande ya que equivale a unos 17 mm de desplazamiento, por esta razón los dos casos divergen a los pocos segundos.

Advertisements

Climate Radiation Model

This model is inspired in the model posted by David Evans in his blog page. The model is based in the concept of emission layers of the atmosphere. The different active gases that are part of the composition of the the atmosphere, each emits infra red radiation at characteristic wavelengths and from different atmospheric layers.

The active gases of the atmosphere, sometimes called greenhouse gases, are H2O, CO2, O3 and CH4 in order of decreasing thermal emissions. Apart from the active gases some radiation is emitted directly from Earth’s surface and the top of the clouds through what is called “The atmospheric IR window“, the spectrum to which the atmosphere is transparent in the IR. In David’s nomenclature these 6 possible sinks for the incoming heat are called “pipes”, of these two are of minor importance O3 and CH4, leaving 4 main pipes. Energy can redistribute though the other pipes if one of them gets blocked, as for example by adding CO2.

nimbus-clear
Fig 1. The spectral outgoing long-wave radiation (OLR). Showing the spectral windows of the different gases and the transparent window from which the surface emits. In gray is the blackbody emission of an object at 300K = 23ºC

David does a very good job at summarizing the available data on the highs of emission of the different gases and the top of the clouds, here. The gases are supposed ti be almost black body emitters in the window through which each is active, meaning the emitted energy is only a function of the temperature of the layer of the atmosphere from which the emission takes place. Since the temperature of the atmosphere decreases with altitude (in the troposphere), a higher layer emits less power than one closer to the Earth’s surface.

David’s OLR (outgoing long-wave radiation) model is only concerned on how the variation of various parameters modify the distribution of heat through the pipes, how these parameters may be dependent of the temperature or other independent variables is outside his scope.

Here I am going to layout a thermal model, based in well known physics to try to explain some of these missing relations. The first step is to build a model that fits the data, so to that purpose I am going to use the numbers from David Evans’ post:

  • Lapse rate 6.5ºC/km, surface temperature= 288K
  • Cloud cover = 62%, albedo = 30%, solar constant = 1367.7 W/m²
  • Water emission layer: height=8km, output power = 33%
  • Carbon Dioxide layer: height=7km, output power = 20%
  • Cloud top emission layer: height=3.3km, output power = 20%
  • Methane emission layer: height=3km, output power = 2%
  • Ozone emission layer: height=16km, output power= 5.8%
  • Surface emission layer: height=0km, output power=18.2%

Note: for now I have treated the CO2 as emitting from a constant average hight, I liked David’s treatment of the wights of the spectral emission on this spectrum, and I am planning on taking a similar approach on my next refinement. (End note)

The model uses a 2 surfaces representation of The Earth: surface 0 the ground surface (the origin) and the top of the atmosphere surface which is characterized by the maximum height of the convective Hadley. Temperatures are assumed to be linear throughout the atmosphere, so once the convective overturn is specified and the temperature at the top of the Hadley cell is known, the temperature of any other layer is linearly interpoled. The amount of energy that flows through each pipe is controlled by six additional parameters that represent the spectral width of the different spectral windows for each pipe. In the analogy of flow coming out of a damp through a set of pipes in parallel, these parameters represent the widths of the pipes.  For now these values have been adjusted to fit the percentages specified above, but I pretend to deduce their dependence with the height of the emission layers and the wave-lengths of the windows in the next post of the series.

The complete equations of the model and the values of the different parameters are on the link. The core of the model is equations 41, 50 and 51; representing the energy balance in both regions, the surface and the atmosphere.

clamte model diagram

Fig 2. Model schematic. One surface and one band model. Two balance equations one on the surface and one on the upper atmosphere as a whole. The atmosphere emits from different layers which are at different temperatures

The incoming solar power, modified by albedo, is the heat source of planet Earth and this heat is assumed to be absorbed on the surface. The surface balances the heat by radiation and convection mechanisms. The surface radiates either directly to space (about 18%) or to the clouds, this makes a total of three heat sinks for the surface: the two radiation and the convective mechanism.

Q_{Solar}=Q_{Conv}+Q_{Direct}+Q_{ToClouds}

The atmosphere on the other hand is heated by the surface, through the convention and the radiation to clouds mechanisms, which being heat sinks for the surface, become sources for the atmosphere. The atmosphere is balanced by its own sinks which is the radiation to space from the different active layers: clouds, H2O, CO2, CH4 and O3.

Q_{Conv}+Q_{ToClouds}=Q_{FromClouds}+ Q_{H2O}-Q_{CO2}+Q_{CH4}+Q_{O3}

Each of the radiative emission layers is modeled like so:

Q_i=A_i \epsilon f_i \sigma T_i^4

T_i=T_0-\alpha h_i

where A_i is the surface area, \epsilon is the emittance of the atmosphere (0.996), \sigma is Stephan-Boltzmann constant, T_i is the temperature of the emission layer in K, f_i is the window factor, T_0 is the temperature of Earth’s surface, \alpha is the lapse rate and h_i the height of the emission layer.

The convective heat is modeled as so:

Q_{Conv}=A_0 h_{conv} (T_0-T_1)

The lapse rate is then:

\alpha=(T_0-T_1)/H

where A_0 is the area of Earth’s surface, h_{conv} is the convection film coefficient, T_1 is the temperature at the top of the Hadley convective cell, and H is the height of the convective cell.

The direct radiation to space is then:

Q_{Direct}=A_0 \epsilon f_{direct}(1-c)\sigma T_o^4

where c is the cloud cover and f_{direct} the direct atmospheric window.

The radiation to clouds is:

Q_{ToClouds}=A_0 \epsilon f_{direct}c\sigma T_o^4-A_1 \epsilon f_{clouds}c\sigma T_1^4

with f_{clouds} being the atmospheric window from the top of the clouds and A_1 the surface of a sphere which encompass the convective layer of Earth.

Lastly the solar irradiation is

Q_{Solar}=A_0 G_s/4 (1-a)

Whit G_s as the solar constant and a as the albedo.

The model has then 8 parameters that can be adjusted to fit the experimental data: all 6 window factors, the convective coefficient and the height of the convective cell. These parameters are set by imposing the experimental outgoing power distribution, the experimental mean lapse rate and the surface mean temperature which are a total of 7 restrictions. This leaves an extra degree of freedom which I chose as setting the height of the convective cell as 8.2 km arbitrarily.

There are several problems with the current model, that will be addressed in the next post of the serie:

  1. The temperature of the stratosphere increases with height from the tropopausa at about 10-12 km so the ozone temperature layer is not correct. The actual ozone layer is above 20-30 km high but I chose to leave it at 16km so that it’s temperature not fall drastically when using the linear lapse rate. The stratosphere increases temperature  because the O3 captures part of the UV light from the sun and is heated. In future models I may include this effect.
  2. Although the physical meaning of the window factors is clear, these factors can be deduced mathematically from the temperature of the emission layer and the wavelength interval as the fraction of the Planck distribution at the temperature that is emitted through the window. This will be tried on next model, once done the factor will be linked to the height of the layer, the lapse rate and the surface temperature through the temperature of the layer. The fact that the model has an extra degree of freedom (the height of the convection cell) increases my confidence that once the theoretical window fractions are calculated, which inevitably will be different from those obtained from the adjustment, the model will still fit the experimental data within reason.
  3.   CO2 emits radiation from a whole range of heights in the atmosphere through the weights of the spectral window (see figure 1), the treatment of this feature will be studied. I think it is the result of a lower opacity (larger optical length) of the CO2 at those wavelengths so the solution is only partly lowering the emission height but also the emittance at those wavelengths, since a lower absorption (opacity) will always be accompanied by a lower emittance at a same wavelength (Kirchhoff Law of radiation)

 

This has been a very interesting post for me. I look forwards to the continuation. Any comment, or doubt or correction is welcomed.

 

 

Uncertainty and Bayesian probability

bayesian_processThis post is to address the relation between the uncertainty on the state of a system with the information we have of it. Said in those terms is kind of obvious, the more information the less uncertainty. More information can take the form of knowing some other aspect of the system or being more precise information on previously acknowledge aspects.

In mathematics the way to deal with this kind of problem is probability and the way probabilities change when additional data is taken into account is the law of Bayes, hence Bayesian probabilities.

To illustrate how this works I’m going to go through an example. Lets imagine we have a tank of worm water, like a bath tub, and we want to know what temperature the water in the tank has. We could stick a thermometer in the tank and measure its temperature, assuming the temperature of the water in the tank is well mixed then that would yield the temperature of the tank with the uncertainty characteristic of our measuring device.

simple_measurement
Fig. (1) The uncertainty distribution of a simple measurement of the water in the tank

If we are using a mercury thermometer the uncertainty would be around \pm0.2\textrm{ }\textdegree\textrm{C}. Measuring devices’ uncertainties are usually assumed to be normally distributed unless specific evidence is available. Usually the uncertainty level used is 95% or 2\sigma (two standard deviations). Meaning that when I say that the water in the tank is at 36.9\pm0.2\textrm{ }\textdegree\textrm{C}, it means I’m 95% certain that the temperature is between 36.7 and 37.1. The probability distribution of Fig. (1) shows the exact meaning, we are more confident the closer we get to the mean value.

To see how our knowledge of the system varies when further information is taken into account we are going to consider that this tank is not alone in the universe but it interacts with it.

Lets make the tank be in thermal equilibrium by adding a hot water inlet and a outlet, configured in such a way that the level of water is constant. Lets assume the tank is well insulated on all of its lateral walls and floor but open to the room temperature air on its top surface. For this system to be in equilibrium the mass flows of the inlet and outlet must coincide and the incoming heat through the inlet must be equal to the heat losses to the ambient.

This system is easily described with a simple equation relating the variables in play:

mc_p(T_{in}-T)=Ah(T-T_{amb})                                   (1)

Where:

  • m is the mass flow.
  • c_p is the specific heat of the water, which we are going to take as a constant known with absolute certainty to be c_p=4.187\quad\frac{\textrm{kJ}}{\textrm{kg}\textdegree\textrm{C}}   .
  • T_{in} is the temperature of the water comming into the tank.
  • T is the temperature of the tank.
  • A is the area of the surface water of the tank, which we also will assume is a perfectly known quantity T_{in}=0.64\quad\textrm{m}^2
  • h is the convection film coeficient, which we’ll assume is h=10 \quad\frac{\textrm{W}}{\textrm{m}^2\textdegree\textrm{C}}.
  • T_{amb} is the ambient temperature of the air in contact with the water surface.

Lets assume we are measuring the inlet temperature, the ambient temperature and the mass flow, as follows:

  1. T_{in}= 58 \pm 0.2\quad \textdegree\textrm{C}.
  2. T_{amb}= 17 \pm 0.2\quad \textdegree\textrm{C}
  3. m= 0.0015 \pm 3.7E-5\quad \frac{\textrm{kg}}{\textrm{s}}

Having these measurements gives us a pretty good idea of the temperature of the tank even before explicitly measuring it. Fig. (2) shows this probability, it turns out we already know the temperature of the tank is 37.257 \pm 0.52\textrm{ }\textdegree\textrm{C}. It is less precise that the direct measurement but the monitoring of the interactions with the outside does provide an estimate of the temperature of the system.

pdf_T_prior
Fig. (2) Prior probability of the state of the tank, before measuring it explicitly.

Now lets consider what happens when we measure the temperature and the measurement device shows like before 36.9\pm0.2\textrm{ }\textdegree\textrm{C}. Since our prior knowledge of the state of the system is different, in the previous case we didn’t know anything of the system before hand, now we believe the temperature is 37.257 \pm 0.52\textrm{ }\textdegree\textrm{C}. How this influences the ultimate state of our knowledge, after the measurement?

Bayes’ theorem provides the method that allows us to to update beliefs when new evidence arrives (more on that on wikipedia). Applying the theorem to the example at hand we find that the prior beliefs modify the end result bringing the mean a little towards the mean of our prior beliefs and reduces a little the uncertainty of the measurement. Figure (3) shows how our prior probabilities (blue) bring the measurement (green) slightly towards the right to, transforming our prior believes (blue) to to our later, more precise, believes (red). The final state of knowledge of the system is 36.946 \pm 0.187\textrm{ }\textdegree\textrm{C}, the uncertainty has gone down from 0.2 to 0.187 because of our prior knowledge, moving the mean from 36.9 to 36.946.

bayesian_process
Fig. (3) The Bayesian process. Our prior believes (blue) influence our final state of knowledge (red) after measuring the system (green). It doesn’t only modify the most likely value but also reduces the uncertainty of the measurement.

Joseph’s tiling the court yard problem

An example of a problem involving the resolution of Pell’s 2nd order diophantine equation

Problem statement

Joseph is planning to tile his square court yard with square tiles. He has requested an offer from his two friends Frank and Peter. Frank proposes to use tiles with an area 17 times bigger than Peter. Knowing that Peter’s tiles cover exactly the whole floor of the court and that Frank’s on the other hand fall short for the area a one of Peter’s tiles, how many tiles are the minimum that Peter and Frank, each have proposed in their respective offers?

Solution:

Let and be the number of tiles used by Peter and Frank respectively (we know they must be perfect squares).

Let A be the area of Peter’s tiles and 17A the area of Frank’s, then:

  • The total area of Peter’s tiles is Ax².
  • The total area of Frank’s tiles is on the other hand 17Ay².

Since the difference in Frank’s and Peter’s total tiles’ areas is the area of one of Peter’s tiles, we have:

Ax² – 17Ay² = A

, which when dividing each side of the equation by A, yields:

x² – 17y² = 1                    (1)

The problem then consists is finding the positive integer roots of equation (1).

This is a particular case of Pell’s equation: x² – Dy² = 1 where D is a positive integer. It is a particular type of 2nd order diophantine equation. In particular it has infinite solutions over the integers for any value of D. The proof of this amazing fact can be found here

With modern computing, an equation like this is easily solved by brute force, just by trying all possible integers starting from 1. None the less if D was some big number it would still be of interest to implement some better algorithm. So I am going to solve the problem by hand to illustrate the procedure.

First we rearrange equation (1):

\frac{x^2}{y^2} = \frac{1}{y^2} + 17

where we notice that on the left side of the equation as y increases even slightly  1 /y² << 17, meaning x² / y² is roughly 17 or x / y is roughly the square root of 17. This insight suggests that we could try and find rational approximations to the irrational quantity \sqrt{17} and search for the solution among the numerator and denumerators of such fractions.

To find rational approximations we use the continued fractions technique. It has been proven that the solution is always within the set of continued fractions, or convergents of the \sqrt{D} in the general Pell’s equation (also in the previous link)

To calculate the continuous fractions of \sqrt{17} we proceed:

  1.                                                  17 = (4 + z)² = 16 + 8z +z²
  2.                                                    1 = 8z + z² = z (8 + z)
  3.                                                             z=1 / (8 + z)

This last expression can be used recursively to calculate the value of z, replacing z once and again by  \frac{1}{8+z}. These are the continued fractions, so:

\sqrt{17}=4+\cfrac{1}{8+\cfrac{1}{8+\cfrac{1}{8+\dots}}}

as we continue expanding the terms we get closer and closer to \sqrt{17}, each of the approximations is called a convergent. So the first convergent would be 4, the second 4+\frac{1}{8}=\frac{33}{8}, …

Lets try these convergents as a solution of (1)

  • first convergent: 4=\frac{4}{1} so  x = 4,  y = 1 which gives  x² – 17y² = -1
  • second convergent: \frac{33}{8} so x = 33, y = 8 and x² – 17y² = 1

Eureka!! we have found the solution and we only had to try to numbers. Petter will use 33²   = 1069 tiles and Frank 8² = 64.

I hope it was interesting for you, it surely was for me. It is almost magic how the integer solutions of the equation form a rational approximation of an irrational number… who would have thought.