---
jupytext:
  formats: ipynb,md:myst
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.16.4
kernelspec:
  display_name: Python 3 (phys-581)
  language: python
  name: python3
---

```{code-cell}
:tags: [hide-cell]

import mmf_setup;mmf_setup.nbinit()
import logging;logging.getLogger('matplotlib').setLevel(logging.CRITICAL)
%matplotlib inline
import numpy as np, matplotlib.pyplot as plt
```

(sec:Drag)=
# Drag

In class, we explored the following model for drag force:
\begin{gather*}
  F_{d} \approx 2\rho_{\text{air}} A v^2 \sin^{\alpha}\theta
\end{gather*}
where $\rho_{\text{air}}$ is the mass-density of air, $A$ is the cross-sectional area,
$v$ is the falling speed, and $\theta$ is the cone half-angle.  We expect $\alpha = 1$,
but the question is: using simple resources available in class, can we reasonably test
the hypothesis that $\alpha = 1$?

Here we present some results for analyzing this model.

:::{admonition} Quantities

[Dynamic viscosity of air][]:
\begin{gather*}
  \eta_{\text{air}} \approx 18.5\; \mathrm{\mu Pa\cdot s}
  \left(\frac{T}{300\mathrm{K}}\right)^{0.7}.
\end{gather*}

[Dynamic viscosity of air]: <https://en.wikipedia.org/wiki/Viscosity#Air>

:::
## Terminal Velocity

We first simplify the model to express the drag force as
\begin{gather*}
  F_{d} = \beta v^2, \qquad \beta \approx 2\rho_{\text{air}} A \sin^{\alpha}\theta.
\end{gather*}
Newton's law gives the terminal velocity when $\dot{v} = 0$:
\begin{gather*}
  m\dot{v} = mg - F_d = mg - \beta v^2, \qquad
  v_T = \sqrt{\frac{mg}{\beta}} \approx
      \sqrt{\frac{mg}{2\rho_{\text{air}} A \sin^{\alpha}\theta}}.
\end{gather*}
:::{admonition} Do It!
Think about this form.  We can test several things.  First, our model posits that the
drag force depends quadratically on the velocity.  Can you test this from your data?
Next we have the dependence on the cone angle: will your results be sensitive enough to
test this?  Do a dimensional analysis: is there a dimensionless parameter?

This analysis has ignored the viscosity of the air: include $\eta_{\text{air}}$ in your
analysis.
:::
It might be difficult in class to ensure that the cones are measured only during the
time when they reach $v_T$, so we may want to determine the average velocity after
falling through a time $t$ or height $h$.

To do this, we choose units so that $v_T = m = g = 1$:
\begin{gather*}
  1 = \underbrace{m}_{\text{mass}} 
    = \underbrace{\frac{v_T}{g}}_{\text{time}} 
    = \underbrace{\frac{v_T^2}{g}}_{\text{distance}}
    = \underbrace{v_T}_{\text{speed}}.
\end{gather*}
With these units, we have:
\begin{gather*}
  \dot{v} = 1 - v^2\\
  \int_{0}^{v}\frac{1}{1-v^2}\d{v} = \int_0^{t}\d{t}, \qquad
  \frac{1}{2}\ln\frac{1+v}{1-v} = t, \qquad
  1+v = (1-v)e^{2t},\\
  v = \frac{e^{2t} - 1}{e^{2t} + 1} 
    = \frac{e^{t} - e^{-t}}{e^{t} + e^{-t}}
    = \tanh t,\\
  h = \int_0^{t}\d{t}\;\tanh t = t - \ln\bigl(1+\tanh(t)\bigr).
\end{gather*}
```{code-cell}
from scipy.integrate import solve_ivp
def dv_dt(t, v):
    return 1-v**2
res = solve_ivp(dv_dt, t_span=(0, 3), y0=[0], max_step=0.1)
t, v = res.t, res.y[0]
assert np.allclose(v, np.tanh(t))
h = t - np.log(1+np.tanh(t))

fig, axs = plt.subplots(1, 2, figsize=(8, 3))
ax = axs[0]
ax.plot(t, v, label="solve_ivp")
ax.plot(t, np.tanh(t), label="exact")
ax.plot(t, h/t, label="$h/t$")
ax.legend()
ax.set(xlabel="$t$ [$v_T/g$]", ylabel="$v$ [$v_T$]");
ax = axs[1]
ax.plot(t, h)
ax.set(xlabel="$t$ [$v_T/g$]", ylabel="$h$ [$v_T^2/g$]")
plt.tight_layout();
```
We see from the plot that the falling object will reach terminal velocity within about
2-3 time units $v_T/g$.

## Sample Data

Here we analyze a set of data from one of the groups.  The data here is a list of
terminal velocities $v$ correlated with $\sin\theta$:

```{code-cell}
from uncertainties import ufloat, unumpy as unp

m = ufloat(1.6, 0.1)*1e-3
base_diameter = [9.4e-2, 8.2e-2, 7e-2, 6e-2, 4.5e-2, 3.3e-2]
dr = 0.001  # Error in table?
r = np.array([ufloat(d/2, dr) for d in base_diameter])
side_length = ufloat(5.25e-2, 0.1e-2)
v_terminal = np.array([ufloat(*v) for v in [
    (1.82, 0.13),
    (2.20, 0.17),
    (2.74, 0.20), 
    (2.82, 0.21),
    (3.93, 0.32),
    (4.35, 0.40)]])
sin_thetas = r/side_length

_n, _s = unp.nominal_values, unp.std_devs
fig, ax = plt.subplots()
ax.errorbar(_n(sin_thetas), _n(v_terminal), ls='none', 
            xerr=_s(sin_thetas), yerr=_s(v_terminal))
ax.set(xlabel=rf"$\sin\theta$", ylabel="$v$ [m/s]",
       xscale="log", yscale="log");
a, b = np.polyfit(np.log(_n(sin_thetas)), np.log(_n(v_terminal)), deg=1)

x_ = _n(sin_thetas)
th = np.linspace(np.asin(x_.min()), np.asin(x_.max()))
x = np.sin(th)
y = np.exp(b)*x**a
ax.plot(x, y, label=fr"$v = {np.exp(b):.2g}/\sin^{{{-a:.2g}}}\theta$")
ax.legend()
```

## Quantitative Analysis

In this case, to search for a power-law dependence, we can just do a linear regression
of the appropriate variables as we plotted above, but we will present this as a general
fit to demonstrate how you might do this for more complicated models.

:::{margin}
To include the errors in $x$, we can use {mod}`scipy.odr`.
:::
We can start with {func}`scipy.optimize.curve_fit`, which fits
\begin{gather*}
  y = f(x, p_1, p_2, \dots) + \epsilon, \qquad \epsilon \sim \mathcal{N}_{\sigma}
\end{gather*}
where the errors $\epsilon$ are assumed to be normally distributed with standard
deviation $\sigma$.  This does not allow us to model the errors in $x$, but we can get
an estimate of the uncertainties.

We will use the model
\begin{gather*}
  y = v = ax^{\alpha}, \qquad x = \sin \theta.
\end{gather*}

```{code-cell}
from scipy.optimize import curve_fit

def f(sin_theta, a, alpha):
    return a * sin_theta**alpha

xdata = _n(sin_thetas)
ydata = _n(v_terminal)
sigma = _s(v_terminal)
p0 = [1.0, 1.0]  # Initial guess
res = curve_fit(
    f, xdata=xdata, ydata=ydata, sigma=sigma,
    absolute_sigma=True,  # Assume the stated errors are correct
    )
popt, pcov = res
perr = np.sqrt(np.diag(pcov))
a = ufloat(popt[0], perr[0])
alpha = ufloat(popt[1], perr[1])

# Reduced chi^2
chi2r = np.sum((abs(ydata - f(xdata, *popt))/sigma)**2)/(len(xdata) - len(popt))

fig, ax = plt.subplots()
ax.errorbar(xdata, ydata, ls='none', 
            xerr=_s(sin_thetas), yerr=_s(v_terminal))
x = sin_th = np.linspace(xdata.min(), xdata.max())
y = f(x, a=a, alpha=alpha)
l, = ax.plot(x, _n(y), label=fr"${a:.2S}(\sin\theta)^{{{alpha:.2S}}}\theta$")
ax.fill_between(x, _n(y)-_s(y), _n(y)+_s(y), fc=l.get_c(), alpha=0.5)
ax.legend()
ax.set(xlabel=rf"$\sin\theta$", ylabel="$v$ [m/s]", 
       title=fr"$v = {a:.2S}(\sin\theta)^{{{alpha:.2S}}}\theta$: $\chi^2_r = {chi2r:.4g}$");
```

## Bayesian Analsyis
Now let's do this with Bayesian analysis.  For now we will ignore the errors in the
abscissa.  Our model is thus
\begin{gather*}
  v_t = a\sin^{-\alpha} \theta + \epsilon
\end{gather*}
where the errors $\epsilon$ are assumed to have a known distribution $p(\epsilon)$ which
we take to be Gaussian
\begin{gather*}
  p_{\epsilon}(\epsilon) \propto e^{-\epsilon^2/2/\sigma^2}.
\end{gather*}
We also need a prior $p_0(a, \alpha)$.  If we do not know anything, we might choose an
**uninformed prior** $p_0(a, \alpha) \propto 1$.  *Note: this is not normalizable... we
will deal with this later.*

Now we apply Bayes theorem to determine the **posterior** distribution:
\begin{gather*}
  p(a, \alpha|D) \propto p(D|a, \alpha)p_0(a, \alpha)
\end{gather*}
where the likelihood $p(D|a, \alpha)$ is the probability of obtaining the data $D$ given
a fixed set of parameters $a$ and $\alpha$.  In our case, this is exactly what we mean
by the estimated error $\epsilon$:
\begin{gather*}
   \epsilon = v_t -a\sin^{-\alpha} \theta,\\
   p(D|a, \alpha) = p_{\epsilon}(\epsilon) =  p_{\epsilon}\Bigl(v_t -a\sin^{-\alpha} \theta\Bigr).
\end{gather*}
For multiple data points, we would take the product over all data points.  For example,
consider two data points:
\begin{gather*}
  p(a, \alpha|D_1, D_2) \propto p(D_2|a, \alpha)\underbrace{p(D_1|a, \alpha)p_0(a,
  \alpha)}_{p(a, \alpha | D_1)}.
\end{gather*}
Note that this has a simple interpretation: the bracketed quantity -- what we learn
after taking one measurement -- becomes the prior for adding the second measurement.  As
long as everything is uncorrelated, the order in which we add the data does not matter.

With gaussians, and flat priors etc., you can do everything analytically (Do It!) but
here we will do it numerically with the [emcee](https://emcee.readthedocs.io/en/stable/)
package so we can generalize.  This package works with the log of everything, so we have
\begin{gather*}
  \log p(a, \alpha| D) \propto \log p(D|a, \alpha) + \log p(a, \alpha).
\end{gather*}
To use the code, you must write the right-hand-side function.  In our case:
\begin{gather*}
  \log p(a, \alpha| D) \propto \sum_{n} \log p_{\epsilon}\Bigl(v_n -a\sin^{-\alpha}
  \theta_n\Bigr) + \log p_0(a, \alpha).
\end{gather*}
For gaussian errors, and constant prior, this is exactly the negative $\chi^2$ usually
used for curve fitting:
\begin{gather*}
  \log p(a, \alpha| D) \propto -
  \underbrace{\sum_{n} \frac{\Bigl(v_n -a\sin^{-\alpha}\theta_n\Bigr)^2}{2\sigma^2}}_{\chi^2} + \log p_0(a, \alpha).
\end{gather*}


```{code-cell}
import emcee

def log_prior(a, alpha):
    return 0.0  # Uninformed
    
def log_probability(params, sin_theta, v, v_err):
    a, alpha = params
    eps = (v - a/sin_theta**alpha)
    sigma = v_err
    return -0.5 * np.sum((eps/sigma)**2)

rng = np.random.default_rng(seed=2)
nwalkers = 32
ndim = 2
guess = np.array([1.8, 0.83]) + rng.normal(size=(nwalkers, ndim))

sampler = emcee.EnsembleSampler(
    nwalkers, ndim, log_probability, 
    args=(_n(sin_thetas), _n(v_terminal), _s(v_terminal)))

sampler.run_mcmc(guess, 5000, progress=False);
```

Now that we have run our sampler, we can estimate the autocorrelation times:
```{code-cell}
tau = sampler.get_autocorr_time()
print(tau)
```
These suggest that we only need about 30 steps to remove any spurious correlations from
the initial data.  We then include every 15th sample (about half the autocorrelation
time).  Finally, we flatten the samples so we can pass it to the plotting routine.
```{code-cell}
flat_samples = sampler.get_chain(discard=100, thin=15, flat=True)
print(flat_samples.shape)
```

```{code-cell}
import corner

fig = corner.corner(flat_samples, labels=("$a$", r"$\alpha$"))
```

## Errors in the Abscissa

So far we have neglected the errors in $\sin\theta$.  Here we fix this.  The idea is to
consider the half-angle $\theta_i$ as a another parameter: we are trying to make cones
with this half-angle, but we then measure the cones and obtain values $x_i$ that deviate
from $\sin \theta_i$ by a random amount $\delta_i$.  Thus, our complete error model is:
\begin{gather*}
  v_{i} = \frac{a}{\sin^\alpha \theta_i} + \epsilon_i, \\
  x_{i} = \sin \theta_i + \delta_i.
\end{gather*}
If these errors are independent, then the likelihood is the product
\begin{gather*}
  p_{\epsilon_{i}}\Bigl(v_{i} - \frac{a}{\sin^\alpha \theta_i}\Bigr)
  p_{\delta_{i}}\Bigl(x_{i} - \sin\theta_i\Bigr).
\end{gather*}
We must now include the abscissa $\vect{\theta}$ as additional parameters.  Following
through Bayes's theorem, we end up with a posterior $p(a, \alpha, \vect{\theta}|D)$.
Explicitly, for a single data point
\begin{gather*}
  p(a, \alpha) \propto  p_{\epsilon}\Bigl(v - \frac{a}{\sin^\alpha \theta}\Bigr)
  p_{\delta}\Bigl(x - \sin\theta\Bigr)p_0(a, \alpha, \theta).
\end{gather*}
Now, we don't really care about the values of $\vect{\theta}$ -- they are **nuisance
parameters**.  To obtain the posterior distribution of $a$ and $\alpha$, we should
**marginalize** over these by integrating:
\begin{gather*}
  p(a, \alpha|D) = \int p(a, \alpha, \vect{\theta}|D)\d\vect{\theta}.
\end{gather*}
While this seems tricky analytically, it is trivial with samples -- simply ignore the
nuisance variables!
:::{admonition} Do It!  Why does this work?
:::





















