Description
Bayesian Networks, Temporal Models and Markov Decision Processes
What values can the nodes take, or what states can they be in? For now we will consider only nodes that take discrete
values. The values should be both mutually exclusive and exhaustive, which
means that the variable must take on exactly one of these values at a time. Common
types of discrete nodes include:
• Boolean nodes, which represent propositions, taking the binary values true (T)
and false (F). In a medical diagnosis domain, the node Cancer would represent
the proposition that a patient has cancer.
• Ordered values. For example, a node Pollution might represent a patient’s pollution
exposure and take the values {low, medium, high}.
• Integral values. For example, a node called Age might represent a patient’s age
and have possible values from 1 to 120.
Even at this early stage, modeling choices are being made. For example, an alternative
to representing a patient’s exact age might be to clump patients into different
age groups, such as {baby, child, adolescent, young, middle-aged, old}. The trick is to
choose values that represent the domain efficiently, but with enough detail to perform
the reasoning required. More on this later!
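As a small illustration of this kind of discretization choice, the sketch below maps a numeric Age onto the coarser groups above; the boundary ages are assumptions made for the example, not values given in the text.

def age_group(age: int) -> str:
    # Illustrative cut-offs only; any binning that suits the reasoning task would do.
    if age <= 2:
        return "baby"
    if age <= 12:
        return "child"
    if age <= 19:
        return "adolescent"
    if age <= 39:
        return "young"
    if age <= 64:
        return "middle-aged"
    return "old"

print(age_group(45))  # -> "middle-aged"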
TABLE 2.1: Preliminary choices of nodes and values for the lung cancer example.

Node name   Type     Values
Pollution   Binary   {low, high}
Smoker      Boolean  {T, F}
Cancer      Boolean  {T, F}
Dyspnoea    Boolean  {T, F}
X-ray       Binary   {pos, neg}
This is a modified version of the so-called "Asia" problem (Lauritzen and Spiegelhalter, 1988), given in §2.5.3.
The structure, or topology, of the network should capture qualitative relationships
between variables. In particular, two nodes should be connected directly if one affects
or causes the other, with the arc indicating the direction of the effect. So, in our
medical diagnosis example, we might ask: what factors affect a patient's chance of
having cancer? If the answer is "Pollution and smoking," then we should add arcs
from Pollution and Smoker to Cancer. Similarly, having cancer will affect the patient’s
breathing and the chances of having a positive X-ray result. So we add arcs
from Cancer to Dyspnoea and XRay. The resultant structure is shown in Figure 2.1.
It is important to note that this is just one possible structure for the problem; we look
at alternative network structures in §2.4.3.
FIGURE 2.1: A BN for the lung cancer problem. Pollution and Smoker are parents of Cancer, and Cancer is a parent of XRay and Dyspnoea. The conditional probability tables are:

P(P=L) = 0.90        P(S=T) = 0.30

P  S  P(C=T|P,S)
H  T  0.05
H  F  0.02
L  T  0.03
L  F  0.001

C  P(X=pos|C)
T  0.90
F  0.20

C  P(D=T|C)
T  0.65
F  0.30
Structure terminology and layout
In talking about network structure it is useful to employ a family metaphor: a node
is a parent of a child if there is an arc from the former to the latter.
Problem 1:
The patient has dyspnoea and is not a smoker. If the X-ray comes back positive, what is the probability that the patient has
cancer? Use the Bayesian network above. Show in detail the calculations that take place.
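A minimal sketch of how this query can be set up as inference by enumeration, assuming the CPT values reconstructed in Figure 2.1 above; the dictionary layout and function name are illustrative choices, not part of the problem.

# P(C | X=pos, D=T, S=F) is proportional to
#   sum over P of  P(P) * P(S=F) * P(C | P, S=F) * P(X=pos | C) * P(D=T | C)
P_pollution = {"L": 0.90, "H": 0.10}        # P(P)
P_smoker    = {True: 0.30, False: 0.70}     # P(S)
P_cancer    = {("H", True): 0.05, ("H", False): 0.02,   # P(C=T | P, S)
               ("L", True): 0.03, ("L", False): 0.001}
P_xray_pos  = {True: 0.90, False: 0.20}     # P(X=pos | C)
P_dysp      = {True: 0.65, False: 0.30}     # P(D=T | C)

def joint(c, s, d, x_pos):
    # Unnormalised P(C=c, S=s, D=d, X=x_pos), summing out Pollution.
    total = 0.0
    for p, pp in P_pollution.items():
        pc = P_cancer[(p, s)] if c else 1 - P_cancer[(p, s)]
        px = P_xray_pos[c] if x_pos else 1 - P_xray_pos[c]
        pd = P_dysp[c] if d else 1 - P_dysp[c]
        total += pp * P_smoker[s] * pc * px * pd
    return total

num = joint(c=True,  s=False, d=True, x_pos=True)
den = num + joint(c=False, s=False, d=True, x_pos=True)
print("P(Cancer=T | X=pos, D=T, S=F) =", num / den)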
Problem 2 (40 points):
Consider a two-bit register. The register has four possible states: 00, 01, 10 and 11. Initially, at time t = 0, the contents of
the register are chosen at random to be one of these four states, each with equal probability. At each time step, the register
is randomly manipulated as follows: with probability 1/2, the register is left unchanged; with probability 1/4, the two bits
of the register are exchanged (01 becomes 10, 10 becomes 01, 00 becomes 00, and 11 becomes 11); and with probability
1/4, the right bit is flipped (example: 01 becomes 00). After the register has been manipulated in this fashion, the left bit is
observed.
Suppose that on the first two time steps (t = 1 and t = 2), we observe the sequence 0, 0. In other words, the observed bit at
time step t = 1 is 0, and the observed bit at time step t = 2 is also 0.
1. Show how the register can be formulated as a temporal model (transition model + observation model): What is
the probability of transitioning from every state to every other state? What is the probability of observing each output
(0 or 1) in each state?
2. Use the forward algorithm (filtering) to determine the probability of being in each state at time t after observing only
the first t bits, for t = 1, 2.
3. Use the forward-backward algorithm (smoothing) to determine the probability of being in each state at time t = 1
given both observed bits.
4. Use the forward algorithm (prediction) to determine the probability of observing 1 at the next time step, t = 3.
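One possible way to encode the register as a temporal (hidden Markov) model and run the forward pass of parts 1 and 2 is sketched below; the state names, dictionary layout, and function names are illustrative assumptions, not part of the problem statement, and the same machinery extends to the smoothing and prediction parts.

states = ["00", "01", "10", "11"]
prior = {s: 0.25 for s in states}            # uniform initial distribution at t = 0

def transition(s):
    # P(next state | current state s): 1/2 unchanged, 1/4 swap the bits, 1/4 flip the right bit.
    unchanged = s
    swapped = s[::-1]
    flipped = s[0] + ("1" if s[1] == "0" else "0")
    dist = {t: 0.0 for t in states}
    dist[unchanged] += 0.5
    dist[swapped] += 0.25
    dist[flipped] += 0.25
    return dist

def observation(s, obs):
    # The left bit is observed exactly, so P(obs | s) is either 1 or 0.
    return 1.0 if s[0] == obs else 0.0

def forward(observations):
    # Filtering: P(state at t | first t observations) for each t.
    belief = dict(prior)
    history = []
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = {t: sum(belief[s] * transition(s)[t] for s in states) for t in states}
        # Update: weight by the observation likelihood and renormalise.
        unnorm = {s: predicted[s] * observation(s, obs) for s in states}
        z = sum(unnorm.values())
        belief = {s: unnorm[s] / z for s in states}
        history.append(belief)
    return history

print(forward(["0", "0"]))   # filtered distributions for t = 1 and t = 2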
Problem 3 (30 points):
Consider the Markov Decision Process (MDP) with transition probabilities and reward function as given in the tables below.
Assume the discount factor γ = 1 (i.e., there is no actual discounting).
s   a   s'   T(s, a, s')
A   1   A    1
A   1   B    0
A   2   A    0.5
A   2   B    0.5

s   a   R(s, a)
A   1   0
A   2
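A minimal finite-horizon value-iteration sketch for an MDP of this shape is given below; since the tables above are cut off in this excerpt, every T and R entry not shown in them is a hypothetical placeholder (marked as such) and should be replaced with the full tables before use.

# T[(s, a)] -> {s': probability}; only the ("A", ...) rows come from the table above.
T = {
    ("A", 1): {"A": 1.0, "B": 0.0},
    ("A", 2): {"A": 0.5, "B": 0.5},
    ("B", 1): {"B": 1.0},          # hypothetical placeholder, not from the problem
    ("B", 2): {"A": 1.0},          # hypothetical placeholder, not from the problem
}
# R[(s, a)] -> immediate reward; only R[("A", 1)] comes from the table above.
R = {
    ("A", 1): 0.0,
    ("A", 2): 1.0,                 # hypothetical placeholder, not from the problem
    ("B", 1): 0.0,                 # hypothetical placeholder, not from the problem
    ("B", 2): 2.0,                 # hypothetical placeholder, not from the problem
}
gamma = 1.0  # no discounting, as stated in the problem

def value_iteration(states, actions, horizon):
    # Finite-horizon value iteration:
    #   V_0(s) = 0,  V_{k+1}(s) = max_a [ R(s, a) + gamma * sum_{s'} T(s, a, s') * V_k(s') ]
    V = {s: 0.0 for s in states}
    for _ in range(horizon):
        V = {s: max(R[(s, a)] + gamma * sum(p * V[s2] for s2, p in T[(s, a)].items())
                    for a in actions)
             for s in states}
    return V

print(value_iteration(["A", "B"], [1, 2], horizon=2))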