
Commit 3d0d06e

committed: update doc
1 parent 52798cc commit 3d0d06e

10 files changed: +912 -0 lines changed

docs/modules/Kalmanfilter_basics.rst

Lines changed: 567 additions & 0 deletions
Large diffs are not rendered by default.
docs/modules/Kalmanfilter_basics_2.rst

Lines changed: 341 additions & 0 deletions
@@ -0,0 +1,341 @@

KF Basics - Part 2
------------------

Probabilistic Generative Laws
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1st Law:
^^^^^^^^

The belief representing the state :math:`x_{t}` is conditioned on all
past states, measurements, and controls. This can be expressed
mathematically by the conditional probability below:

.. math:: p(x_{t} | x_{0:t-1},z_{1:t-1},u_{1:t})

1) :math:`z_{t}` represents the **measurement**

2) :math:`u_{t}` the **motion command**

3) :math:`x_{t}` the **state** (position, velocity, etc.) of
   the robot or its environment at time t.

‘If we know the state :math:`x_{t-1}` and :math:`u_{t}`, then knowing
the states :math:`x_{0:t-2}`, :math:`z_{1:t-1}` becomes immaterial
through the property of **conditional independence**.’ The state
:math:`x_{t-1}` introduces a conditional independence between
:math:`x_{t}` and :math:`z_{1:t-1}`, :math:`u_{1:t-1}`.

Therefore the law now holds as:

.. math:: p(x_{t} | x_{0:t-1},z_{1:t-1},u_{1:t})=p(x_{t} | x_{t-1},u_{t})

2nd Law:
^^^^^^^^

If :math:`x_{t}` is complete, then:

.. math:: p(z_{t} | x_{0:t},z_{1:t-1},u_{1:t})=p(z_{t} | x_{t})

That :math:`x_{t}` is **complete** means that the past states, controls and
measurements carry no additional information for predicting the future.

:math:`x_{0:t-1}`, :math:`z_{1:t-1}` and :math:`u_{1:t}` are
**conditionally independent** of :math:`z_{t}` given a complete
:math:`x_{t}`.

The filter works in two parts:

:math:`p(x_{t} | x_{t-1},u_{t})` -> **State Transition Probability**

:math:`p(z_{t} | x_{t})` -> **Measurement Probability**

Conditional dependence and independence example:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:math:`\bullet`\ **Independent but conditionally dependent**

Let's say you flip two fair coins.

A - Your first coin flip is heads

B - Your second coin flip is heads

C - Your first two flips were the same

A and B here are independent. However, A and B are conditionally
dependent given C, since if you know C then your first coin flip will
inform the other one.

:math:`\bullet`\ **Dependent but conditionally independent**

A box contains two coins: a regular coin and one fake two-headed coin
(:math:`P(H)=1`). I choose a coin at random and toss it twice. Define the
following events.

A = First coin toss results in an H.

B = Second coin toss results in an H.

C = Coin 1 (regular) has been selected.

If we know A has occurred (i.e., the first coin toss has resulted in
heads), we would guess that it is more likely that we have chosen Coin 2
than Coin 1. This in turn increases the conditional probability that B
occurs. This suggests that A and B are not independent. On the other
hand, given C (Coin 1 is selected), A and B are independent.

Bayes Rule:
~~~~~~~~~~~

.. math:: \text{Posterior} = \frac{\text{Likelihood} \times \text{Prior}}{\text{Marginal}}

Here,

**Posterior** = Probability of an event occurring based on certain
evidence.

**Likelihood** = How probable the evidence is, given the event.

**Prior** = Probability of just the event occurring, without having
any evidence.

**Marginal** = Probability of the evidence over all possible
events.

Example:

1% of women have breast cancer (and therefore 99% do not). 80% of
mammograms detect breast cancer when it is there (and therefore 20% miss
it). 9.6% of mammograms detect breast cancer when it is not there (and
therefore 90.4% correctly return a negative result).

We can turn the process above into an equation, which is Bayes Theorem.
Here is the equation:

:math:`\displaystyle{\Pr(\mathrm{A}|\mathrm{X}) = \frac{\Pr(\mathrm{X}|\mathrm{A})\Pr(\mathrm{A})}{\Pr(\mathrm{X|A})\Pr(\mathrm{A})+ \Pr(\mathrm{X | not \ A})\Pr(\mathrm{not \ A})}}`

:math:`\bullet`\ Pr(A|X) = Chance of having cancer (A) given a positive
test (X). This is what we want to know: how likely is it to have cancer
with a positive result? In our case it is 7.8%.

:math:`\bullet`\ Pr(X|A) = Chance of a positive test (X) given that you
had cancer (A). This is the chance of a true positive, 80% in our case.

:math:`\bullet`\ Pr(A) = Chance of having cancer (1%).

:math:`\bullet`\ Pr(not A) = Chance of not having cancer (99%).

:math:`\bullet`\ Pr(X|not A) = Chance of a positive test (X) given that
you didn't have cancer (~A). This is a false positive, 9.6% in our case.
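
To see where the 7.8% comes from, here is a quick sketch of the
arithmetic in Python (the variable names are ours, purely for
illustration):

.. code-block:: ipython3

    # Bayes theorem applied to the mammogram example above
    p_cancer = 0.01                  # Pr(A): prior probability of cancer
    p_pos_given_cancer = 0.80        # Pr(X|A): true positive rate
    p_pos_given_no_cancer = 0.096    # Pr(X|not A): false positive rate

    # Marginal Pr(X): a positive test over both possibilities
    p_pos = (p_pos_given_cancer * p_cancer
             + p_pos_given_no_cancer * (1 - p_cancer))

    p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos
    print(p_cancer_given_pos)        # ~0.078, i.e. about 7.8%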

Bayes Filter Algorithm
~~~~~~~~~~~~~~~~~~~~~~

The basic filter algorithm is:

for all :math:`x_{t}`:

1. :math:`\overline{bel}(x_t) = \int p(x_t | u_t, x_{t-1}) bel(x_{t-1})dx_{t-1}`

2. :math:`bel(x_t) = \eta p(z_t | x_t) \overline{bel}(x_t)`

end.

:math:`\rightarrow`\ The first step of the filter is to calculate the prior
for the next step using Bayes theorem. This is the **Prediction** step.
The belief, :math:`\overline{bel}(x_t)`, is the belief **before**
incorporating the measurement (:math:`z_{t}`) at time t=t. This is the
step where the motion occurs and information is lost, because the means
and covariances of the Gaussians are added. The RHS of the equation uses
the law of total probability to calculate the prior.

:math:`\rightarrow` The second step is the **Correction** or update step,
which calculates the belief of the robot **after** taking the
measurement (:math:`z_{t}`) at time t=t into account. This is where we
incorporate the sensor information about the whereabouts of the robot.
We gain information here, as the Gaussians get multiplied. (Multiplying
Gaussians produces a mean that lies in between the original means, and a
covariance that is smaller than either.)

Bayes filter localization example:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: ipython3

    from IPython.display import Image
    Image(filename="bayes_filter.png", width=400)

.. image:: Kalmanfilter_basics_2_files/Kalmanfilter_basics_2_5_0.png
   :width: 400px

Given: a robot with a sensor to detect doorways along a hallway. The
robot knows what the hallway looks like but doesn't know where it is on
the map.

1. Initially (first scenario), it doesn't know where it is with respect
   to the map and hence the belief assigns equal probability to each
   location on the map.

2. The first sensor reading is incorporated and it shows the presence of
   a door. The robot now knows what the map looks like but cannot
   localize yet, as the map has 3 doors. Therefore it assigns equal
   probability to each door.

3. The robot now moves forward. This is the prediction step, and the
   motion causes the robot to lose some information, so the variance of
   the Gaussians increases (diagram 4). The final belief is the
   **convolution** of the posterior from the previous step and the
   motion model. Also, the means shift to the right due to the motion.

4. Again, incorporating the measurement, the sensor senses a door, and
   this time too the probability of a door is equal for the three doors.
   This is where the filter's magic kicks in. For the final belief
   (diagram 5), the posterior calculated after sensing is the product of
   the predicted belief and the measurement likelihood. It improves the
   robot's belief at the location near the second door. The variance
   **decreases** and the belief **peaks**.

5. Finally, after a series of iterations of motion and correction, the
   robot is able to localize itself with respect to the environment
   (diagram 6).

Do note that the robot knows the map but doesn't know exactly where it
is on the map. A minimal histogram (discrete Bayes) filter sketch of
these predict and update steps is given below.
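
In the sketch below, the map layout, sensor accuracy and motion noise
values are our own assumptions, not part of the original example:

.. code-block:: ipython3

    import numpy as np

    # Hallway map: 1 = door, 0 = wall (an assumed 10-cell map with 3 doors)
    hallway = np.array([1, 1, 0, 0, 0, 0, 1, 0, 0, 0])
    belief = np.ones(len(hallway)) / len(hallway)   # uniform initial belief

    def update(belief, hallway, z, p_hit=0.75, p_miss=0.25):
        # Correction: weight each cell by the measurement likelihood, normalize
        likelihood = np.where(hallway == z, p_hit, p_miss)
        posterior = belief * likelihood
        return posterior / posterior.sum()

    def predict(belief, move=1, p_exact=0.8, p_under=0.1, p_over=0.1):
        # Prediction: convolve the belief with a noisy motion model
        return (p_exact * np.roll(belief, move)
                + p_under * np.roll(belief, move - 1)
                + p_over * np.roll(belief, move + 1))

    belief = update(belief, hallway, z=1)   # sense a door
    belief = predict(belief, move=1)        # move one cell to the right
    belief = update(belief, hallway, z=1)   # sense a door again
    print(belief.round(3))                  # highest probability at cell 1 (second door)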

Bayes and Kalman filter structure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The basic structure and concept of the Kalman filter remain the same as
for the Bayes filter. The only key difference is the mathematical
representation of the Kalman filter. The Kalman filter is nothing but a
Bayesian filter that uses Gaussians.

For a Bayes filter to be a Kalman filter, **each term of the belief is
now a Gaussian**, rather than a histogram. The basic formulation of the
**Bayes filter** algorithm is:

.. math::

    \begin{aligned}
    \bar {\mathbf x} &= \mathbf x \ast f_{\mathbf x}(\bullet)\, \, &\text{Prediction} \\
    \mathbf x &= \mathcal L \cdot \bar{\mathbf x}\, \, &\text{Correction}
    \end{aligned}

:math:`\bar{\mathbf x}` is the *prior*

:math:`\mathcal L` is the *likelihood* of a measurement given the prior
:math:`\bar{\mathbf x}`

:math:`f_{\mathbf x}(\bullet)` is the *process model*, the Gaussian term
that predicts the next state, e.g. using velocity or acceleration to
track position.

:math:`\ast` denotes *convolution*.
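
As a small numerical check of the prediction step (our own aside, not
from the original text), convolving two Gaussian densities yields
another Gaussian whose mean and variance are the sums of the originals:

.. code-block:: ipython3

    import numpy as np

    dx = 0.01
    x = np.arange(-10.0, 10.0, dx)

    def gaussian(x, mu, var):
        return np.exp(-(x - mu)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

    prior = gaussian(x, mu=1.0, var=0.5)      # current state estimate
    process = gaussian(x, mu=2.0, var=1.0)    # process model f_x(.)

    predicted = np.convolve(prior, process) * dx       # bar x = x * f_x(.)
    x_sum = 2 * x[0] + dx * np.arange(len(predicted))  # support of the sum

    mean = np.sum(x_sum * predicted) * dx
    var = np.sum((x_sum - mean)**2 * predicted) * dx
    print(mean, var)   # ~3.0 and ~1.5: means and variances add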

Kalman Gain
~~~~~~~~~~~

.. math:: x = (\mathcal L \bar x)

where :math:`x` is the posterior and :math:`\mathcal L` and :math:`\bar x`
are Gaussians.

Therefore the mean of the posterior is given by:

.. math::

    \mu=\frac{\bar\sigma^2\, \mu_z + \sigma_z^2 \, \bar\mu} {\bar\sigma^2 + \sigma_z^2}

.. math:: \mu = \left( \frac{\bar\sigma^2}{\bar\sigma^2 + \sigma_z^2}\right) \mu_z + \left(\frac{\sigma_z^2}{\bar\sigma^2 + \sigma_z^2}\right)\bar\mu

In this form it is easy to see that we are scaling the measurement and
the prior by weights:

.. math:: \mu = W_1 \mu_z + W_2 \bar\mu

The weights sum to one because the denominator is a normalization term.
We introduce a new term, :math:`K=W_1`, giving us:

.. math::

    \begin{aligned}
    \mu &= K \mu_z + (1-K) \bar\mu\\
    &= \bar\mu + K(\mu_z - \bar\mu)
    \end{aligned}

where

.. math:: K = \frac {\bar\sigma^2}{\bar\sigma^2 + \sigma_z^2}

The variance in terms of the Kalman gain:

.. math::

    \begin{aligned}
    \sigma^2 &= \frac{\bar\sigma^2 \sigma_z^2 } {\bar\sigma^2 + \sigma_z^2} \\
    &= K\sigma_z^2 \\
    &= (1-K)\bar\sigma^2
    \end{aligned}

:math:`K` is the *Kalman gain*. It is the crux of the Kalman filter. It
is a scaling term that chooses a value partway between :math:`\mu_z` and
:math:`\bar\mu`.
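
A short sketch of this update in Python (with illustrative variable
names, assuming the prior and the measurement are given as mean/variance
pairs):

.. code-block:: ipython3

    def update(prior_mu, prior_var, z_mu, z_var):
        """Univariate Kalman update written in the gain form above."""
        K = prior_var / (prior_var + z_var)      # Kalman gain
        mu = prior_mu + K * (z_mu - prior_mu)    # mu = bar_mu + K(mu_z - bar_mu)
        var = (1 - K) * prior_var                # sigma^2 = (1 - K) bar_sigma^2
        return mu, var

    # Prior N(10, 4) and measurement N(12, 1): the estimate is pulled toward
    # the more certain measurement and the variance shrinks below both.
    print(update(10.0, 4.0, 12.0, 1.0))          # ~ (11.6, 0.8)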

Kalman Filter - Univariate and Multivariate
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Prediction**

:math:`\begin{array}{|l|l|l|} \hline \text{Univariate} & \text{Univariate} & \text{Multivariate}\\ & \text{(Kalman form)} & \\ \hline \bar \mu = \mu + \mu_{f_x} & \bar x = x + dx & \bar{\mathbf x} = \mathbf{Fx} + \mathbf{Bu}\\ \bar\sigma^2 = \sigma_x^2 + \sigma_{f_x}^2 & \bar P = P + Q & \bar{\mathbf P} = \mathbf{FPF}^\mathsf T + \mathbf Q \\ \hline \end{array}`

:math:`\mathbf x,\, \mathbf P` are the state mean and covariance. They
correspond to :math:`x` and :math:`\sigma^2`.

:math:`\mathbf F` is the *state transition function*. When multiplied by
:math:`\bf x` it computes the prior.

:math:`\mathbf Q` is the process covariance. It corresponds to
:math:`\sigma^2_{f_x}`.

:math:`\mathbf B` and :math:`\mathbf u` are the control inputs to the
system.

**Correction**

:math:`\begin{array}{|l|l|l|} \hline \text{Univariate} & \text{Univariate} & \text{Multivariate}\\ & \text{(Kalman form)} & \\ \hline & y = z - \bar x & \mathbf y = \mathbf z - \mathbf{H\bar x} \\ & K = \frac{\bar P}{\bar P+R}& \mathbf K = \mathbf{\bar{P}H}^\mathsf T (\mathbf{H\bar{P}H}^\mathsf T + \mathbf R)^{-1} \\ \mu=\frac{\bar\sigma^2\, \mu_z + \sigma_z^2 \, \bar\mu} {\bar\sigma^2 + \sigma_z^2} & x = \bar x + Ky & \mathbf x = \bar{\mathbf x} + \mathbf{Ky} \\ \sigma^2 = \frac{\sigma_1^2\sigma_2^2}{\sigma_1^2+\sigma_2^2} & P = (1-K)\bar P & \mathbf P = (\mathbf I - \mathbf{KH})\mathbf{\bar{P}} \\ \hline \end{array}`

:math:`\mathbf H` is the measurement function.

:math:`\mathbf z,\, \mathbf R` are the measurement mean and noise
covariance. They correspond to :math:`z` and :math:`\sigma_z^2` in the
univariate filter. :math:`\mathbf y` and :math:`\mathbf K` are the
residual and Kalman gain.

The details will be different from the univariate filter because these
are vectors and matrices, but the concepts are exactly the same:

- Use a Gaussian to represent our estimate of the state and error
- Use a Gaussian to represent the measurement and its error
- Use a Gaussian to represent the process model
- Use the process model to predict the next state (the prior)
- Form an estimate partway between the measurement and the prior
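
Below is a small NumPy sketch of these two multivariate steps for an
assumed constant-velocity model (state = position and velocity, with
only the position measured); the specific F, H, Q and R values are
illustrative only:

.. code-block:: ipython3

    import numpy as np

    dt = 1.0
    F = np.array([[1.0, dt], [0.0, 1.0]])         # state transition (constant velocity)
    B = np.zeros((2, 1)); u = np.zeros((1, 1))    # no control input in this sketch
    H = np.array([[1.0, 0.0]])                    # measurement function: position only
    Q = np.diag([0.01, 0.01])                     # process noise covariance
    R = np.array([[0.5]])                         # measurement noise covariance

    x = np.array([[0.0], [1.0]])                  # initial state mean
    P = np.eye(2)                                 # initial state covariance

    def predict(x, P):
        x = F @ x + B @ u                         # bar x = Fx + Bu
        P = F @ P @ F.T + Q                       # bar P = FPF' + Q
        return x, P

    def update(x, P, z):
        y = z - H @ x                             # residual
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
        x = x + K @ y                             # x = bar x + Ky
        P = (np.eye(2) - K @ H) @ P               # P = (I - KH) bar P
        return x, P

    for z in [1.1, 2.0, 2.9]:                     # noisy position measurements
        x, P = predict(x, P)
        x, P = update(x, P, np.array([[z]]))
    print(x.ravel())                              # estimated position and velocity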

References:
~~~~~~~~~~~

1. Roger Labbe's
   `repo <https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python>`__
   on Kalman Filters. (The majority of the text in these notes is from
   this.)

2. Probabilistic Robotics by Sebastian Thrun, Wolfram Burgard and Dieter
   Fox, MIT Press.
7 image files added (1.04 MB, 5.64 KB, 11.9 KB, 18.2 KB, 17.8 KB, 37.8 KB, 3.52 KB); not rendered.

docs/modules/appendix.rst

Lines changed: 4 additions & 0 deletions
@@ -3,3 +3,7 @@
Appendix
==============

.. include:: Kalmanfilter_basics.rst

.. include:: Kalmanfilter_basics_2.rst
