Description

Bayesian Network, Temporal Models and Markov Decision Process

or what state can they be in? For now we will consider only nodes that take discrete

values. The values should be both mutually exclusive and exhaustive, which

means that the variable must take on exactly one of these values at a time. Common

types of discrete nodes include:

• Boolean nodes, which represent propositions, taking the binary values true (T)

and false (F). In a medical diagnosis domain, the node Cancer would represent

the proposition that a patient has cancer.

• Ordered values. For example, a node Pollution might represent a patient’s pollution

exposure and take the values {low, medium, high}.

• Integral values. For example, a node called Age might represent a patient’s age

and have possible values from 1 to 120.

Even at this early stage, modeling choices are being made. For example, an alternative

to representing a patient’s exact age might be to clump patients into different

age groups, such as {baby, child, adolescent, young, middle-aged, old}. The trick is to

choose values that represent the domain efficiently, but with enough detail to perform

the reasoning required. More on this later!
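The grouping of exact ages into coarser values can be sketched as follows. This is only an illustration: the cut points in `CUT_POINTS` and the function name `age_group` are hypothetical and are not given in the text; a real model would choose boundaries that suit the domain.

```python
from bisect import bisect_right

# Coarse values for the Age node, as listed in the text.
AGE_GROUPS = ["baby", "child", "adolescent", "young", "middle-aged", "old"]

# ASSUMPTION: illustrative boundaries in years (not from the source).
# An age a falls in group i when CUT_POINTS[i-1] <= a < CUT_POINTS[i].
CUT_POINTS = [2, 13, 20, 40, 65]


def age_group(age: int) -> str:
    """Map an exact age (1..120) to exactly one coarse age group."""
    if not 1 <= age <= 120:
        raise ValueError("age must lie in the node's range 1..120")
    return AGE_GROUPS[bisect_right(CUT_POINTS, age)]


if __name__ == "__main__":
    for a in (1, 12, 30, 70):
        print(a, age_group(a))
```

Because every age maps to exactly one group, the coarse values remain mutually exclusive and exhaustive.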

TABLE 2.1
Preliminary choices of nodes and values for the lung cancer example.

Node name   Type      Values
Pollution   Binary    {low, high}
Smoker      Boolean   {T, F}
Cancer      Boolean   {T, F}
Dyspnoea    Boolean   {T, F}
X-ray       Binary    {pos, neg}

[2] This is a modified version of the so-called “Asia” problem (Lauritzen and Spiegelhalter, 1988), given in §2.5.3.

The structure, or topology, of the network should capture qualitative relationships

between variables. In particular, two nodes should be connected directly if one affects

or causes the other, with the arc indicating the direction of the effect. So, in our

medical diagnosis example, we might ask what factors affect a patient’s chance of

having cancer? If the answer is “Pollution and smoking,” then we should add arcs

from Pollution and Smoker to Cancer. Similarly, having cancer will affect the patient’s

breathing and the chances of having a positive X-ray result. So we add arcs

from Cancer to Dyspnoea and XRay. The resultant structure is shown in Figure 2.1.

It is important to note that this is just one possible structure for the problem; we look

at alternative network structures in §2.4.3.
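The arcs described above can be written down as a small data structure. The sketch below stores, for each node, the list of its parents and checks that the result is a directed acyclic graph; the names `PARENTS` and `children` are illustrative choices, not taken from the text.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each node maps to its parents, i.e. the nodes with an arc pointing into it.
PARENTS = {
    "Pollution": [],
    "Smoker": [],
    "Cancer": ["Pollution", "Smoker"],
    "Dyspnoea": ["Cancer"],
    "XRay": ["Cancer"],
}


def children(node: str) -> list[str]:
    """Return the nodes that have an arc coming from `node`."""
    return [n for n, parents in PARENTS.items() if node in parents]


# static_order() raises graphlib.CycleError if the arcs form a cycle,
# so obtaining an ordering confirms that the structure is a DAG.
ORDER = list(TopologicalSorter(PARENTS).static_order())

if __name__ == "__main__":
    print("children of Cancer:", children("Cancer"))
    print("one valid ordering:", ORDER)
```

Any ordering returned lists parents before their children, which is the order in which the probability tables of the network would be specified.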

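[Figure 2.1: network diagram with arcs from Pollution and Smoker to Cancer, and from Cancer to XRay and Dyspnoea, annotated with the probability tables below.]

P(P=L) = 0.90

P(S=T) = 0.30

P   S   P(C=T|P,S)
H   T   0.05
H   F   0.02
L   T   0.03
L   F   0.001

C   P(X=pos|C)
T   0.90
F   0.20

C   P(D=T|C)
T   0.65
F   0.30

FIGURE 2.1: A BN for the lung cancer problem.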
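As a worked illustration of how the tables in Figure 2.1 combine, the standard factorization of a Bayesian network multiplies one conditional probability per node, each conditioned on that node's parents. The numerical example assumes the table entries P(P=L) = 0.90, P(S=T) = 0.30, P(C=T|P=L,S=F) = 0.001, P(X=pos|C=F) = 0.20 and P(D=T|C=F) = 0.30 as read from the figure.

```latex
\begin{align*}
P(P, S, C, X, D) &= P(P)\, P(S)\, P(C \mid P, S)\, P(X \mid C)\, P(D \mid C) \\[4pt]
P(P{=}L,\, S{=}F,\, C{=}F,\, X{=}\mathit{neg},\, D{=}F)
  &= 0.90 \times (1 - 0.30) \times (1 - 0.001) \times (1 - 0.20) \times (1 - 0.30) \\
  &= 0.90 \times 0.70 \times 0.999 \times 0.80 \times 0.70 \approx 0.352
\end{align*}
```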

Structure terminology and layout

In talking about network structure it is useful to employ a family metaphor: a node

is a parent of a child, if there is an arc from the former to the latter. Extending the

The patient has dyspnoea and is not a smoker. If the X-ray comes back positive, what is the probability that the patient has

cancer? Use the Bayesian network above. Show the calculations in detail.
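One way to check a hand calculation for this query is inference by enumeration: sum the joint probability over the unobserved variable (Pollution) for each value of Cancer, then normalize. The sketch below is a minimal version of that idea, not a prescribed solution method. It assumes the table entries of Figure 2.1 are assigned to rows as (H,T) = 0.05, (H,F) = 0.02, (L,T) = 0.03, (L,F) = 0.001; all constant and function names are illustrative.

```python
# Conditional probability tables, ASSUMED to be read from Figure 2.1.
P_POLLUTION_LOW = 0.90
P_SMOKER = 0.30
P_CANCER = {  # P(C=T | Pollution, Smoker)
    ("H", True): 0.05,
    ("H", False): 0.02,
    ("L", True): 0.03,
    ("L", False): 0.001,
}
P_XRAY_POS = {True: 0.90, False: 0.20}   # P(X=pos | C)
P_DYSPNOEA = {True: 0.65, False: 0.30}   # P(D=T | C)


def joint(p: str, s: bool, c: bool, x: bool, d: bool) -> float:
    """P(P=p, S=s, C=c, X=x, D=d) as a product of one factor per node."""
    pr = P_POLLUTION_LOW if p == "L" else 1 - P_POLLUTION_LOW
    pr *= P_SMOKER if s else 1 - P_SMOKER
    pc = P_CANCER[(p, s)]
    pr *= pc if c else 1 - pc
    pr *= P_XRAY_POS[c] if x else 1 - P_XRAY_POS[c]
    pr *= P_DYSPNOEA[c] if d else 1 - P_DYSPNOEA[c]
    return pr


def posterior_cancer(s: bool, x: bool, d: bool) -> float:
    """P(C=T | S=s, X=x, D=d), summing out the unobserved Pollution node."""
    weight = {
        c: sum(joint(p, s, c, x, d) for p in ("L", "H"))
        for c in (True, False)
    }
    return weight[True] / (weight[True] + weight[False])


if __name__ == "__main__":
    # Evidence: not a smoker, positive X-ray, dyspnoea present.
    print(round(posterior_cancer(s=False, x=True, d=True), 4))
```

The factor for Smoker is the same in both terms and cancels in the normalization, which is a useful observation when writing the calculation out by hand.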

Problem 2 (40 points):

Consider a two-bit register. The register has four possible states: 00, 01, 10 and 11. Initially, at time t = 0, the contents of

the register are chosen at random to be one of these four states, each with equal probability. At each time step, the register

is randomly manipulated as follows: with probability 1/2, the register is left unchanged; with probability 1/4, the two bits

of the register are exchanged (01 becomes 10, 10 becomes 01, 00 becomes 00, and 11 becomes 11); and with probability

1/4, the right bit is flipped (example: 01 becomes 00). After the register has been manipulated in this fashion, the left bit is

observed.

Suppose that on the first two time steps (t = 1 and t = 2), we observe the sequence 0, 0. In other words, the observed bit at

time step t = 1 is 0, and the observed bit at time step t = 2 is also 0.

1. Show how the register can be formulated with a temporal model (Transition model + Observation model): What is

the probability of transitioning from every state to every other state? What is the probability of observing each output

(0 or 1) in each state?

2. Use the forward algorithm (filtering) to determine the probability of being in each state at time t after observing only

the first t bits, for t = 1, 2.

3. Use the forward-backward algorithm (smoothing) to determine the probability of being in each state at time t = 1

given both observed bits.

4. Use the forward algorithm (prediction) to determine the probability of observing 1 in the next time step t = 3.
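The four steps above can be cross-checked with a short script. The sketch below encodes the manipulation rules as a transition model and the noise-free reading of the left bit as an observation model, then applies filtering, one backward step for smoothing, and one-step prediction. It uses exact fractions so results can be compared directly with hand calculations. All function and variable names are illustrative, and the code assumes the observation at time t is taken after the manipulation at time t, as the problem states.

```python
from fractions import Fraction

STATES = ["00", "01", "10", "11"]


def transition(state: str) -> dict:
    """Return {next_state: probability} for one manipulation step."""
    left, right = state
    out = {s: Fraction(0) for s in STATES}
    out[state] += Fraction(1, 2)                      # register unchanged
    out[right + left] += Fraction(1, 4)               # bits exchanged
    flipped = "1" if right == "0" else "0"
    out[left + flipped] += Fraction(1, 4)             # right bit flipped
    return out


def p_obs(bit: str, state: str) -> Fraction:
    """The left bit is observed without noise."""
    return Fraction(1) if state[0] == bit else Fraction(0)


def normalize(dist: dict) -> dict:
    total = sum(dist.values())
    return {s: p / total for s, p in dist.items()}


def predict(dist: dict) -> dict:
    """Push a state distribution through one transition step."""
    nxt = {s: Fraction(0) for s in STATES}
    for s, p in dist.items():
        for s2, t in transition(s).items():
            nxt[s2] += p * t
    return nxt


def forward(prior: dict, observations: list) -> list:
    """Filtered distributions P(X_t | e_1..e_t) for t = 1..len(observations)."""
    filtered, current = [], prior
    for e in observations:
        pred = predict(current)
        current = normalize({s: p_obs(e, s) * pred[s] for s in STATES})
        filtered.append(current)
    return filtered


def backward_one_step(e_next: str) -> dict:
    """b(x_t) = sum over x' of P(e_{t+1} | x') P(x' | x_t), with b = 1 afterwards."""
    return {
        s: sum(p_obs(e_next, s2) * t for s2, t in transition(s).items())
        for s in STATES
    }


prior = {s: Fraction(1, 4) for s in STATES}           # uniform at t = 0
f1, f2 = forward(prior, ["0", "0"])                   # filtering, t = 1, 2
b1 = backward_one_step("0")
smoothed_1 = normalize({s: f1[s] * b1[s] for s in STATES})  # P(X_1 | e_1, e_2)
pred_3 = predict(f2)                                  # P(X_3 | e_1, e_2)
p_one_at_3 = sum(p_obs("1", s) * pred_3[s] for s in STATES)

if __name__ == "__main__":
    print("filtered t=1:", f1)
    print("filtered t=2:", f2)
    print("smoothed t=1:", smoothed_1)
    print("P(observe 1 at t=3):", p_one_at_3)
```

Using `Fraction` rather than floating point keeps every intermediate value exact, which makes it easier to match each printed number against the corresponding line of a written derivation.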

Problem 3 (30 points):

Consider the Markov Decision Process (MDP) with transition probabilities and reward function as given in the tables below.

Assume the discount factor γ = 1 (i.e., there is no actual discounting).

s   a   s'   T(s, a, s')
A   1   A    1
A   1   B    0
A   2   A    0.5
A   2   B    0.5

s   a   R(s, a)
A   1   0
A   2