
The Kalman filter simply explained [Part 2]

After clarifying the core of the Kalman filter in Part 1, we now turn to the more complicated part. The procedure from Part 1, multiplying and adding mean values and variances, only works in the one-dimensional case.

That is, it works if the state you want to measure can be fully described with just one variable. However, the example mentioned at the beginning, determining the position of a vehicle in a tunnel, can no longer be fully described with one variable. Only the position is of interest, but strictly speaking it is already 2-dimensional in the plane (\(x\) & \(y\)). In addition, only the speed (\(\dot x\) & \(\dot y\)) can be measured, not the position directly. This leads to a 4D Kalman filter with the following state variables:

$$ x = \begin{bmatrix}
x \\
y \\
\dot x \\
\dot y
\end{bmatrix} = \begin{matrix}
\text{position X} \\
\text{position Y} \\
\text{speed in X} \\
\text{speed in Y}
\end{matrix} $$

Mr. Kalman thought about how to compute an optimal estimate of all states despite the noisy measurements of the individual sensors.

Multi-dimensional Kalman filter

I would like to explain the procedure again using the example of a vehicle with a navigation device that drives into a tunnel. The last known position is the one recorded before the GPS signal was lost. After that, only the speed information from the vehicle (wheel speeds & yaw rate) is available (with built-in navigation systems) as a normally distributed, noisy measured variable. From this, a speed in \(x\) and \(y\) can be calculated.

Initial conditions / initialization

System state \ (x \)

At the beginning you have to initialize the filter with an initial state. In the 1-dimensional case this was \(\mu_0\); in the multi-dimensional case it is a vector.

$$ x_0 = \begin{bmatrix}
x_0 \\
y_0 \\
\dot x_0 \\
\dot y_0
\end{bmatrix} $$

If nothing is known, 0 can simply be entered here. If some boundary conditions are already known, these can be communicated to the filter. The choice of the following covariance matrix \ (P \) controls how quickly the filter converges on the correct (measured) values.

Covariance matrix \ (P \)

An uncertainty must be given for the initial state \(x_0\). In the 1D case this was \(\sigma_0\); now it is a matrix that defines an initial uncertainty for each state individually. Here, as an example, with \(\sigma_0 = 10\) for all 4 states.

$$ P = \begin{bmatrix}
10 & 0 & 0 & 0 \\
0 & 10 & 0 & 0 \\
0 & 0 & 10 & 0 \\
0 & 0 & 0 & 10
\end{bmatrix} $$

This matrix is changed most frequently during the filter run; it is updated in both the Predict and the Correct step. If you are quite sure about the states at the beginning, you can enter low values here. If you do not know exactly what the values of the state vector are at the beginning, the covariance matrix \(P\) should be filled with very large values (one million or similar) so that the filter converges relatively quickly (i.e. finds the correct values on the basis of the measurements).

Dynamics matrix \ (A \)

The core of the filter, however, is the following definition, which you should only set up yourself with a good understanding of the physical context. For many real problems this is not easy. For our simple example (movement in the plane), the physics follows from uniform motion: for the position, \(x_{t+1} = \dot x_t \cdot dt + x_t\), and for the speed, \(\dot x_{t+1} = \dot x_t\). For the state vector shown above, the dynamics in matrix notation are as follows:

$$ A = \begin{bmatrix}
1 & 0 & dt & 0 \\
0 & 1 & 0 & dt \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} $$

This says “where” the state vector moves from one calculation step to the next within \(dt\). This dynamic model is also called the “Constant Velocity” model, because it assumes that the speed remains constant during one calculation step of the filter.

$$ \begin{bmatrix}
x \\
y \\
\dot x \\
\dot y
\end{bmatrix}_{t+1} = \begin{bmatrix}
1 & 0 & dt & 0 \\
0 & 1 & 0 & dt \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} \cdot \begin{bmatrix}
x \\
y \\
\dot x \\
\dot y
\end{bmatrix}_t $$

As an example, just write out the first line, which calculates the position after a calculation step of duration \(dt\):

$$ x_{t+1} = x_t + dt \cdot \dot x_t $$

This simply reflects the physical relationship for uniform motion. A higher-order form would be the constant acceleration model, which would be a 6D filter and would also contain the accelerations in the state vector. In principle, other dynamics can also be specified here (e.g. for a holonomic vehicle).
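As a quick numerical check, here is a minimal NumPy sketch of the constant velocity model (the time step and state values are assumed purely for illustration): one prediction step with this \(A\) advances the position by \(\dot x \cdot dt\) and leaves the speed unchanged.

```python
import numpy as np

dt = 0.1  # assumed time step in seconds

# Dynamics matrix of the constant velocity model for the state [x, y, vx, vy]
A = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])

# Assumed state: at the origin, driving with 10 m/s in x direction
x = np.array([0.0, 0.0, 10.0, 0.0])

# One prediction step: the position advances by vx*dt, the speed stays constant
x_next = A @ x
print(x_next)  # → [ 1.  0. 10.  0.]
```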

Process noise covariance matrix \ (Q \)

Since the movement of the vehicle can also be disturbed (in the sense of superimposed, normally distributed noise), this is introduced via the process noise covariance matrix \(Q\). This matrix tells the filter how the system state can “jump” from one step to the next. Imagine the driving vehicle: it can be disturbed by a gust of wind or bumps in the road, which results in a force being applied. A change of speed by the driver is also an acceleration acting on the vehicle. If an acceleration \(\ddot x\) acts on the system state, the physical dependencies are

$$ x = \frac{1}{2} dt^2 \cdot \ddot x $$

$$ \dot x = dt \cdot \ddot x $$

The matrix \(Q\) is a covariance matrix, which contains the following elements:

$$ Q = \begin{bmatrix}
\sigma_{x}^2 & \sigma_{xy} & \sigma_{x \dot x} & \sigma_{x \dot y} \\
\sigma_{yx} & \sigma_{y}^2 & \sigma_{y \dot x} & \sigma_{y \dot y} \\
\sigma_{\dot x x} & \sigma_{\dot x y} & \sigma_{\dot x}^2 & \sigma_{\dot x \dot y} \\
\sigma_{\dot y x} & \sigma_{\dot y y} & \sigma_{\dot y \dot x} & \sigma_{\dot y}^2
\end{bmatrix} $$

The easiest way to calculate it is to set up the vector \(G\), multiply it by its transpose and scale the result with the assumed variance of the applied acceleration, e.g. with \(\sigma_a = 8 \, m/s^2\):

$$ Q = G \cdot G^T \cdot \sigma_a^2 $$

with \(G = \begin{bmatrix} 0.5 \Delta t^2 & 0.5 \Delta t^2 & \Delta t & \Delta t \end{bmatrix}^T\).
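A minimal NumPy sketch of this construction (using the example values \(dt = 1\) and \(\sigma_a = 8 \, m/s^2\) from above):

```python
import numpy as np

dt = 1.0       # time step (example value from above)
sigma_a = 8.0  # assumed standard deviation of the acceleration in m/s^2

# G maps an acceleration onto the state [x, y, vx, vy]
G = np.array([[0.5 * dt**2],
              [0.5 * dt**2],
              [dt],
              [dt]])

# Process noise covariance Q = G * G^T * sigma_a^2
Q = G @ G.T * sigma_a**2
print(Q[0, 0], Q[2, 2])  # → 16.0 64.0
```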

Control matrix \ (B \) and control variable \ (u \)

The influence of external control variables (e.g. steering, braking, etc.) is also possible via the control matrix \(B\) and the control vector \(u\), but is omitted here to keep the explanation simple.

Measurement matrix \ (H \)

The filter must also be told what is measured and how it relates to the state vector. In the example of the vehicle entering the tunnel, only the speed is measured, no longer the position! The values can be measured directly with a factor of 1 (i.e. the speed is measured directly in the correct unit), which is why only a 1.0 is put in the appropriate places in \(H\).

$$ H = \begin{bmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} $$

If the sensors measure in a different unit, or if the quantity is measured only indirectly, the relationship must be mapped in the measurement matrix \(H\) with an appropriate formula.
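As a small sketch (the state values are made up for illustration): applying \(H\) to the state vector yields the expected measurement, i.e. just the two speed components.

```python
import numpy as np

# Measurement matrix: only the speeds vx and vy are measured directly
H = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1]])

# Hypothetical state [x, y, vx, vy]
x = np.array([5.0, 3.0, 10.0, 0.5])

# Expected measurement for this state: the two speed components
z_expected = H @ x
print(z_expected)  # → [10.   0.5]
```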

Measurement noise covariance matrix \ (R \)

As with the variance \(\sigma_0\) in the one-dimensional case, a measurement uncertainty must also be specified here.

$$ R = \begin{bmatrix}
10 & 0 \\
0 & 10
\end{bmatrix} $$

This measurement uncertainty indicates how much you trust the measured values of the sensors. Since we only measure \(\dot x\) and \(\dot y\), it is a 2×2 matrix. If the sensor is very precise, small values should be used here; if the sensor is relatively imprecise, large values.

Identity matrix \ (I \)

Last but not least, an identity matrix is necessary.

$$ I = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} $$

Predict filter step

This part of the Kalman filter now dares to predict the system state into the future. In addition, under certain conditions (observability), it can also be used to estimate a state that cannot be measured! That's amazing, but exactly what we need in our case. We cannot measure the position of the vehicle, because the GPS of the navigation device has no reception in the tunnel. By initializing the state vector with a position and measuring the speed, an optimal prediction of the position can still be made using the dynamics \(A\).

$$ x_{t+1} = A \cdot x_t $$

The covariance \(P\) must also be recalculated. The uncertainty about the state of the system increases in the Predict step, as we have already seen in the 1D case. In the multi-dimensional case, the process noise \(Q\) is added, so the uncertainty \(P\) becomes larger and larger.

$$ P = A \cdot P \cdot A^T + Q $$

That's it. We have calculated \(dt\) into the future. The Kalman filter has made a statement about the expected system state in the future.

Now exactly \(dt\) of time passes, and the filter measures/corrects: it checks whether the prediction of the system state still matches the new measured values well. If so, the covariance \(P\) of the filter becomes smaller (the filter is more certain); if not, larger (something is wrong, the filter becomes less certain).

Measure / correct filter step

The following mathematical calculations are not something you must necessarily be able to derive. Rudolf E. Kalman thought them up in a few quiet minutes; they look crazy, but they work.

Current measured values \(Z\) come from the sensors, from which an innovation \(w\) with respect to the state vector \(x\) is calculated using the measurement matrix \(H\).

$$ w = Z - (H \cdot x) $$

Next, the uncertainty with which the calculation continues is determined (in the multi-dimensional case, the variance becomes a covariance matrix). For this, the uncertainty \(P\), the measurement matrix \(H\) and the measurement uncertainty \(R\) are required.

$$ S = (H \cdot P \cdot H^T + R) $$

From this, the so-called Kalman gain is determined. It states whether the measured values or the system dynamics should be trusted more.

$$ K = \frac{P \cdot H^T}{S} $$

The Kalman gain becomes small when the filter is already certain about its state (small \(P\)) relative to the measurement noise \(R\); if the prediction is uncertain compared to the measurements, the elements of the matrix \(K\) become larger.
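This trade-off can be made visible with a small NumPy sketch (the values for \(P\) and \(R\) are assumed purely for illustration): for a precise sensor, the gain on the measured speed approaches 1 (trust the measurement); for a noisy sensor, it approaches 0 (trust the prediction).

```python
import numpy as np

H = np.array([[0., 0., 1., 0.],
              [0., 0., 0., 1.]])
P = np.eye(4) * 10.0  # assumed state uncertainty

def kalman_gain(R):
    """K = P * H^T * S^-1 with innovation covariance S = H*P*H^T + R."""
    S = H @ P @ H.T + R
    return P @ H.T @ np.linalg.inv(S)

K_precise = kalman_gain(np.eye(2) * 0.1)    # very precise sensor
K_noisy   = kalman_gain(np.eye(2) * 100.0)  # very noisy sensor

# Gain applied to the measured vx in each case
print(round(K_precise[2, 0], 3), round(K_noisy[2, 0], 3))  # → 0.99 0.091
```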

The system state is now updated with this information.

$$ x = x + (K \cdot w) $$

And a new covariance is determined for the upcoming Predict step.

$$ P = (I - (K \cdot H)) \cdot P $$

Now it goes back to the Prediction step. Graphically it looks like this:

This filter runs continuously as long as measured values come in. It can also be operated “open loop”, i.e. only the prediction step is carried out when no measured values are available. Then the uncertainty \(P\) becomes larger and larger.
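This open-loop growth of the uncertainty can be sketched in a few lines of NumPy (\(dt\), \(\sigma_a\) and the initial \(P\) are the example values from above): running only the predict step makes the total uncertainty, here measured as the trace of \(P\), grow with every iteration.

```python
import numpy as np

dt = 1.0
A = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
G = np.array([[0.5 * dt**2], [0.5 * dt**2], [dt], [dt]])
Q = G @ G.T * 8.0**2  # process noise with assumed sigma_a = 8 m/s^2
P = np.eye(4) * 10.0  # initial uncertainty

# 'Open loop': only the predict step runs, no correction by measurements
traces = []
for _ in range(5):
    P = A @ P @ A.T + Q
    traces.append(np.trace(P))

print(traces)  # the total uncertainty grows with every step
```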

Matlab implementation of a multi-dimensional Kalman filter

% Paul Balzer | Motorblog
% CC-BY-SA 2.0 license
clear all; clc

%% Generate measurements
% Since we have no real measured values here,
% we simply generate a few noisy values ourselves.
it = 100;       % Number of measured values
realv = 10;     % true speed
% measurements: row 1 = x', row 2 = y'
measurements = [realv + 1.*randn(1,it); zeros(1,it)];
dt = 1;         % Time step

%% Initialize
%    x  y  x'  y'
x = [0; 0; 10; 0];   % Initial State (Location and velocity)
P = [10, 0, 0, 0; ...
     0, 10, 0, 0; ...
     0, 0, 10, 0; ...
     0, 0, 0, 10];   % Initial Uncertainty
A = [1, 0, dt, 0; ...
     0, 1, 0, dt; ...
     0, 0, 1, 0; ...
     0, 0, 0, 1];    % Transition Matrix
H = [0, 0, 1, 0; ...
     0, 0, 0, 1];    % Measurement function
R = [10, 0; ...
     0, 10];         % Measurement noise covariance
Q = [1/4*dt^4, 1/4*dt^4, 1/2*dt^3, 1/2*dt^3; ...
     1/4*dt^4, 1/4*dt^4, 1/2*dt^3, 1/2*dt^3; ...
     1/2*dt^3, 1/2*dt^3, dt^2, dt^2; ...
     1/2*dt^3, 1/2*dt^3, dt^2, dt^2];  % Process Noise Covariance
I = eye(4);          % Identity matrix

%% Kalman Filter Steps
for n = 1:length(measurements)
    % Prediction
    x = A*x;            % Predicted state from previous state and dynamics
    P = A*P*A' + Q;     % Predicting the covariance

    % Correction
    Z = measurements(:,n);
    y = Z - (H*x);      % Innovation from measured value difference
    S = (H*P*H' + R);   % Innovation covariance
    K = P*H'*inv(S);    % Filter matrix (Kalman gain)
    x = x + (K*y);      % Update the system state
    P = (I - (K*H))*P;  % Update the covariance
end

The filter at work

Let us now assume that we drive into a tunnel with \(\dot x = 10 \, m/s\) and that the last known position was \(x = 0, y = 0\). The Kalman filter then determines the position of the vehicle, although it cannot be measured at all. The following figure shows 6 filter runs:

You can see how the filter becomes more and more uncertain about the current position (the curve at the top left becomes flatter and flatter), but more and more certain about the speed being driven (the curve at the top right becomes narrower and narrower). The measured speed is always very close to 10 m/s, which makes the innovation very small, so the filter hardly has to correct anything.

Since the vehicle speed is available via the CAN bus approx. every 20 ms, 6 iterations correspond to only a good 0.1 s. Depending on the choice of the initial conditions, the filter converges relatively quickly. After 100 iterations (corresponding to 2 s in the vehicle), for example, the variance is already very low, so the filter is quite sure that its estimates are correct.

top: measured speed, left: variance of the speed, right: estimated speed

Python implementation of the Kalman filter

If Matlab is not available, here is a sample implementation in Python.

import numpy as np

# Setup corresponding to the Matlab example above
# (x, P, A, H, R, Q, I as np.matrix, so that * is matrix multiplication)
dt = 1.0
it = 100
measurements = np.vstack((10 + np.random.randn(1, it), np.zeros((1, it))))
x = np.matrix([[0.], [0.], [10.], [0.]])
P = np.matrix(np.eye(4)) * 10.0
A = np.matrix([[1., 0., dt, 0.],
               [0., 1., 0., dt],
               [0., 0., 1., 0.],
               [0., 0., 0., 1.]])
H = np.matrix([[0., 0., 1., 0.],
               [0., 0., 0., 1.]])
R = np.matrix(np.eye(2)) * 10.0
G = np.matrix([[0.5 * dt**2], [0.5 * dt**2], [dt], [dt]])
Q = G * G.T
I = np.matrix(np.eye(4))

for n in range(len(measurements[0])):
    # Time Update (Prediction)
    # ========================
    # Project the state ahead
    x = A * x
    # Project the error covariance ahead
    P = A * P * A.T + Q

    # Measurement Update (Correction)
    # ===============================
    # Compute the Kalman Gain
    S = H * P * H.T + R
    K = (P * H.T) * np.linalg.pinv(S)

    # Update the estimate via z
    Z = measurements[:, n].reshape(2, 1)
    y = Z - (H * x)        # Innovation or Residual
    x = x + (K * y)

    # Update the error covariance
    P = (I - (K * H)) * P

The Python code from the video can be found on balzer82.github.io/Kalman.

 

Filter design: how do I choose Q and R?

Overall, it does not matter how large the numerical values in \(Q\) and \(R\) are in absolute terms; what matters is their ratio. If \(R\) is chosen ten times larger and \(Q\) as well, this hardly has any effect in the filter. The ratio between the values is crucial. The correct choice of \(Q\) and \(R\) is directly responsible for the filter performance and forms the fundamental question of filter design: should the filter trust the measurements more (follow the dynamics, but let noise through) or the system model more (smooth strongly, but react sluggishly)?

This either/or question can only be decided application-specifically. In some cases you just want to smooth a poorly measuring sensor for a relatively constant process; in other cases you want to fuse several sensors while retaining the dynamics. The matrices are to be chosen accordingly. Alternatively, the filter can of course also be designed adaptively, i.e. \(Q\) and/or \(R\) are changed during operation.

Conclusion

The Kalman filter is relatively quick and easy to implement and, under certain conditions (not explained in more detail here), provides an optimal estimate of the state for normally distributed, noisy sensor values. Mr. Kalman was so convinced of his algorithm that he managed to inspire an engineer friend at NASA. And so this filter helped for the first time in the Apollo Guidance Computer during the moon landings.

By the way: for commercial inquiries there is also the possibility to consult me on the topic. Hire me!

Errata

02/2014: There was an error in the previous version: The Update / Predict steps were reversed in the Matlab code.
03/2014: The explanation has been revised again. Python implementation added
07/2016: Travolta Filter Design Tradeoff added