FOUNDATIONS OF PROBABILISTIC PROGRAMMING

What does a probabilistic program actually compute? How can one formally reason about such programs? This valuable guide covers these elementary questions and more. It provides a state-of-the-art overview of the theoretical underpinnings of modern probabilistic programming and their applications in machine learning, security, and other domains, at a level suitable for graduate students and non-experts in the field. In addition, the book treats the connection between probabilistic programs and mathematical logic and security (what is the probability that software leaks confidential information?), and presents three programming languages for different applications: Excel tables, program testing, and approximate computing. This title is also available as Open Access on Cambridge Core.

gilles barthe is Scientific Director at the Max Planck Institute for Security and Privacy and Research Professor at the IMDEA Software Institute, Madrid. His recent research develops programming language techniques and verification methods for probabilistic languages, with a focus on cryptographic and differentially private computations.

joost-pieter katoen is Professor at RWTH Aachen University and the University of Twente. His research interests include formal verification, formal semantics, concurrency theory, and probabilistic computation. He co-authored the book Principles of Model Checking (2008). He received an honorary doctorate from Aalborg University, is a member of the Academia Europaea, and is an ERC Advanced Grant holder.

alexandra silva is Professor of Algebra, Semantics, and Computation at University College London. A theoretical computer scientist with contributions in the areas of semantics of programming languages, concurrency theory, and probabilistic network verification, her work has been recognized by multiple awards, including the Needham Award 2018, the Presburger Award 2017, the Leverhulme Prize 2016, and an ERC Starting Grant in 2015.

Published online by Cambridge University Press


FOUNDATIONS OF PROBABILISTIC

PROGRAMMING

Edited by

GILLES BARTHE

Max-Planck-Institut für Cybersicherheit und Schutz der Privatsphäre, Bochum, Germany

JOOST–PIETER KATOEN

Rheinisch-Westfälische Technische Hochschule, Aachen, Germany

ALEXANDRA SILVA

University College London


University Printing House, Cambridge CB2 8BS, United Kingdom

One Liberty Plaza, 20th Floor, New York, NY 10006, USA

477 Williamstown Road, Port Melbourne, VIC 3207, Australia

314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India

79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge.

It furthers the University’s mission by disseminating knowledge in the pursuit of

education, learning, and research at the highest international levels of excellence.

www.cambridge.org

Information on this title: www.cambridge.org/9781108488518

DOI: 10.1017/9781108770750

© Gilles Barthe, Joost-Pieter Katoen and Alexandra Silva 2021

This work is in copyright. It is subject to statutory exceptions and to the provisions of

relevant licensing agreements; with the exception of the Creative Commons version the

link for which is provided below, no reproduction of any part of this work may

take place without the written permission of Cambridge University Press.

An online version of this work is published at doi.org/10.1017/9781108770750 under a Creative Commons Open Access license CC-BY, which permits re-use, distribution and reproduction in any medium for any purpose, provided that appropriate credit is given to the original work and that any changes made are indicated. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0

All versions of this work may contain content reproduced under license from third parties.

Permission to reproduce this third-party content must be obtained from these third parties directly.

When citing this work, please include a reference to the DOI 10.1017/9781108770750

First published 2021

Printed in the United Kingdom by TJ Books Limited, Padstow Cornwall

A catalogue record for this publication is available from the British Library.

ISBN 978-1-108-48851-8 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of

URLs for external or third-party internet websites referred to in this publication

and does not guarantee that any content on such websites is, or will remain,

accurate or appropriate.


Contents

Contributors page vii

Preface xi

1 Semantics of Probabilistic Programming: A Gentle Introduction

Fredrik Dahlqvist, Alexandra Silva and Dexter Kozen 1

2 Probabilistic Programs as Measures

Sam Staton 43

3 Application of Computable Distributions to the Semantics of Probabilistic Programs
Daniel Huang, Greg Morrisett and Bas Spitters 75

4 On Probabilistic λ-Calculi

Ugo Dal Lago 121

5 Probabilistic Couplings from Program Logics

Gilles Barthe and Justin Hsu 145

6 Expected Runtime Analysis by Program Verification
Benjamin Lucien Kaminski, Joost-Pieter Katoen and Christoph Matheja 185

7 Termination Analysis of Probabilistic Programs with Martingales

Krishnendu Chatterjee, Hongfei Fu and Petr Novotný 221

8 Quantitative Analysis of Programs with Probabilities and Concentration of Measure Inequalities
Sriram Sankaranarayanan 259

9 The Logical Essentials of Bayesian Reasoning

Bart Jacobs and Fabio Zanasi 295




10 Quantitative Equational Reasoning

Giorgio Bacci, Radu Mardare, Prakash Panangaden and Gordon

Plotkin 333

11 Probabilistic Abstract Interpretation: Sound Inference and Application to Privacy
José Manuel Calderón Trilla, Michael Hicks, Stephen Magill, Piotr Mardziel and Ian Sweet 361

12 Quantitative Information Flow with Monads in Haskell

Jeremy Gibbons, Annabelle McIver, Carroll Morgan and

Tom Schrijvers 391

13 Luck: A Probabilistic Language for Testing

Leonidas Lampropoulos, Benjamin C. Pierce, Li-yao Xia, Diane Gallois-Wong, Cătălin Hriţcu, John Hughes 449

14 Tabular: Probabilistic Inference from the Spreadsheet

Andrew D. Gordon, Claudio Russo, Marcin Szymczak, Johannes

Borgström, Nicolas Rolland, Thore Graepel and Daniel Tarlow 489

15 Programming Unreliable Hardware

Michael Carbin and Sasa Misailovic 533


Preface

Probabilistic programs

Probabilistic programs describe recipes for inferring statistical conclusions from a complex mixture of uncertain data and real-world observations. They can represent probabilistic graphical models far beyond the capabilities of Bayesian networks and are expected to have a major impact on machine intelligence. Probabilistic programs are ubiquitous. They steer autonomous robots and self-driving cars, are key to describing security mechanisms, naturally code up randomised algorithms for solving NP-hard or even unsolvable problems, and are rapidly encroaching on AI. Probabilistic programming aims to make probabilistic modelling and machine learning accessible to the programmer.

What is this book all about?

Probabilistic programs, though typically relatively small in size, are hard to grasp, let alone check automatically. Elementary questions are notoriously hard: even the most elementary question, "does a program halt with probability one?", is "more undecidable" than the halting problem. This book is about the theoretical foundations of probabilistic programming. It is primarily concerned with fundamental questions such as the following: What is Bayesian probability theory? What is the precise mathematical meaning of probabilistic programs? How can almost-sure termination be shown? How can the (possibly infinite) expected runtime of probabilistic programs be determined? How can two similar programs be compared? It covers several analysis techniques for probabilistic programs, such as abstract interpretation, algebraic reasoning, and deriving concentration of measure inequalities.

These chapters are complemented with chapters on the formal definition of concrete probabilistic programming languages and some applications of probabilistic programs.


https://doi.org/10.1017/9781108770750.001 Published online by Cambridge University Press


How to read this volume?

The volume consists of five parts: semantics, verification, logic, security, and programming languages.

Semantics.

The first part on semantics consists of four chapters on different aspects of the formal semantics of probabilistic programming languages. Dahlqvist et al. start off in Chapter 1 by presenting an operational and denotational semantics of an imperative language with discrete and continuous distributions. The chapter by Staton presents a compositional measure-theoretic semantics for a first-order probabilistic language with arbitrary soft conditioning. Chapter 3 by Huang et al. studies semantics with a focus on computability. Motivated by the tension between the discrete and the continuous in probabilistic modelling, type-2 computable distributions are introduced as the elementary mathematical objects to give semantics to probabilistic languages. Dal Lago's Chapter 4 completes the semantics part by treating a probabilistic version of the λ-calculus, the backbone of functional programming languages.

Verification.

The second part on formal verification starts with a chapter by Barthe and Hsu on the use of couplings, a well-known concept in probability theory, for verifying probabilistic programs. Kaminski et al. present a weakest pre-condition calculus in the style of Dijkstra for determining the expected run-time of a probabilistic program. This can be used to determine whether a program that terminates with probability one nevertheless needs infinitely many steps on average to do so. Chapter 7 by Chatterjee et al. presents various techniques based on supermartingales to decide the almost-sure termination of a probabilistic program, i.e., does a program terminate with probability one on all possible inputs? The verification part ends with a chapter by Sankaranarayanan about the quantitative analysis of probabilistic programs by means of concentration of measure inequalities. This includes Chernoff–Hoeffding and Bernstein inequalities.

Logic.

The third part focuses on logic and consists of two chapters. In Chapter 9, Jacobs and Zanasi present an insightful new perspective on fundamental aspects of probability theory which are relevant for probabilistic programming. Their chapter introduces a category-theoretic formalization of Bayesian reasoning, and in particular a string-diagram-friendly one. This contribution is complemented by the chapter by Bacci et al., which surveys some recent contributions on extending the classical Birkhoff/Lawvere theory of equational reasoning to the quantitative setting. In that setting, compared entities are not necessarily equal but are rather related by a notion of distance.

Security.

Part four is concerned with security, an important application field in which probabilities are pivotal. Chapter 11 by Calderón et al. collects together results on probabilistic abstract interpretation and applies them to probabilistic programming in the context of security. Chapter 12 by Gibbons et al. presents an embedded domain-specific language in Haskell to compute hyper-distributions induced by programs. This is used to compute the amount of leakage of a program by measuring variations on post-distributions that include Shannon entropy and Bayes risk (that is, the maximum information an attacker can learn in a single run).

Programming languages.

The final part of this volume is concerned with three concrete probabilistic programming languages: Luck, Tabular, and Rely. Chapter 13 describes Luck, proposed by Lampropoulos et al., a language for test generation and a framework for property-based testing of functional programs. Luck combines local instantiation of unknown variables and global constraint solving to make test generation more efficient than existing approaches. Gordon et al. introduce Tabular, a domain-specific programming language designed to express probabilistic models and to perform probabilistic inference over relational data. Tabular can be used from Microsoft Excel or as stand-alone software. Chapter 14 presents the syntax, semantics and type system of Tabular and shows how it can be used to design probabilistic models and to perform probabilistic inference. Chapter 15, the last chapter of this volume, by Carbin and Misailovic, presents Rely, a programming language that enables reasoning about the probability that a program produces the correct result when executed on unreliable hardware.

How this volume emerged

This book consists of 15 contributed chapters and a preface. The idea for this volume emerged at the first summer school on Foundations of Programming and Software Systems held in Braga, Portugal, May–June 2017. This biennial school series is supported by EATCS (European Association for Theoretical Computer Science), ETAPS (European Joint Conferences on Theory and Practice of Software), ACM SIGPLAN (Special Interest Group on Programming Languages) and ACM SIGLOG (Special Interest Group on Logic and Computation). It was felt that there was no comprehensive book on the theoretical foundations of probabilistic programming languages. We sincerely hope that this volume contributes to filling this gap. Enjoy reading!

Acknowledgements

Many people have helped us to make this volume possible. First and foremost, we would like to thank all authors for their fine contributions and for their patience during the production of this book. All chapters have been subject to peer review. We thank the reviewers Alejandro Aguirre, Krishnendu Chatterjee, Ugo Dal Lago, Thomas Ehrhard, Claudia Faggian, Marco Gaboardi, Francesco Gavazzo, Jeremy Gibbons, Andrew D. Gordon, Ichiro Hasuo, Jan Hoffmann, Justin Hsu, Bart Jacobs, Benjamin Lucien Kaminski, Pasquale Malacaria, Radu Mardare, Christoph Matheja, Annabelle McIver, Sasa Misailovic, Petr Novotný, Prakash Panangaden, Corina Pasareanu, Alejandro Russo, Sriram Sankaranarayanan, Steffen Smolka, and Marcin Szymczak for their thorough reviewing work.

We thank Luis Barbosa, Catarina Fernandes and Renato Neves for their invaluable efforts in the local organisation of the FoPPS 2017 summer school. We thank Joshua Moerman for his efforts in the editing process. David Tranah and Anna Scriven from Cambridge University Press are thanked for their support during this process. We thank Hanna Schraffenberger for designing the cover of the book. Finally, we are grateful for the financial support from the ERC (AdG FRAPPANT and StG ProFoundNet) and the MPI on Security and Privacy, which enabled us to publish this volume under gold open access.


1

Semantics of Probabilistic Programming:

A Gentle Introduction

Fredrik Dahlqvist and Alexandra Silva

University College London

Dexter Kozen

Cornell University

Abstract: Reasoning about probabilistic programs is hard because it compounds the difficulty of classic program analysis with sometimes subtle questions of probability theory. Having precise mathematical models, or semantics, describing their behaviour is therefore particularly important. In this chapter, we review two probabilistic semantics. First an operational semantics which models the local, step-by-step behaviour of programs, then a denotational semantics describing global behaviour as an operator transforming probability distributions over memory states.

1.1 Introduction

A probabilistic program is any program whose execution is probabilistic. This

usually means that there is a source of randomness that allows weighted choices

to be made during execution. Given an initial machine-state, in the event that the

program halts, there will be a distribution describing the probability of output

events. Any deterministic program is trivially a probabilistic program that does

not make any random choices. The source of randomness is typically a random

number generator, which is assumed to provide independent samples from a known

distribution. In practice, these are often pseudo-random number generators, which

do not provide true randomness, but only an approximation; however, it is possible

to construct hardware random number generators that provide true randomness, for

example by measuring a noisy electromagnetic process.

Reasoning about deterministic programs usually involves answering binary yes/no questions: Is the postcondition always satisfied? Does this program halt on all inputs? Does it always halt in polynomial time? On the other hand, reasoning about probabilistic programs usually involves more quantitative questions: What is the probability that the postcondition is satisfied? What is the probability that this


From Foundations of Probabilistic Programming, edited by Gilles Barthe, Joost-Pieter Katoen and Alexandra

Silva published 2020 by Cambridge University Press.


https://doi.org/10.1017/9781108770750.002 Published online by Cambridge University Press


program halts? Is its expected halting time polynomial? In order to answer questions like these, the first step should be to develop a formal mathematical semantics for probabilistic programs, which will allow us to formalise such questions precisely. This is the main purpose of this chapter.

Reasoning about probabilistic programs is in general difficult because it compounds the difficulty of deterministic program analysis with questions of probability theory, which can sometimes be counterintuitive. We will use examples to illustrate all the main ideas presented in this chapter. We introduce these examples here and will return to them as we develop the semantics of probabilistic programs. We start with two examples involving discrete probabilities for which naive probability theory provides a sufficient framework for reasoning. We will then present two programs that involve continuous distributions for which a more general theory known as measure theory is needed. The requisite background for understanding these concepts is presented in Section 1.2.

x := 0;
while x == 0 do
    x := coin()

[Automaton: from the start state [x → ?], the assignment x := 0 leads to [x → 0]; from [x → 0], the transition 1/2 : x := 0 loops back and the transition 1/2 : x := 1 leads to [x → 1].]

Figure 1.1 A simple coin-toss program

We start with the simple program of Fig. 1.1 displayed next to the small probabilistic automaton it implements. Here the construct coin() is our random number generator; each successive call returns 0 or 1, each with probability 1/2, and successive calls are independent, which means that n successive calls will yield one of the 2^n possible sequences of n binary digits, each with probability 2^(−n). A distribution on {0, 1} that takes value 1 with probability p and 0 with probability 1 − p is called a Bernoulli distribution with (success) parameter p. Thus coin() is a Bernoulli distribution with success parameter 1/2.

It is intuitively clear that this program eventually halts with probability 1. Looking at the automaton of Fig. 1.1, one can see that the probability of the program going through n iterations of the body of the loop is 2^(−n). Moreover, the expected number of iterations of the body of the loop is given by

∑_{n=1}^∞ n·2^(−n) = 2.

This type of simple probabilistic process involving repeated independent trials until some fixed "success" event occurs is called a Bernoulli process. If the probability of success in each trial is p, then the expected time until success is 1/p. In this example, p = 1/2. We will show in Section 1.3 how the mathematical interpretation of this program (its semantics) can be constructed compositionally, that is to say line-by-line, and how it agrees with these simple observations.
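The expected-iterations claim can also be checked empirically. The following sketch (a hypothetical Python rendering of Fig. 1.1, with random.randint standing in for coin() and the helper name iterations_until_success our own) simulates the loop many times and averages the iteration counts, which should come out close to the exact expectation 2 = 1/p with p = 1/2.

```python
import random

def coin():
    # Fair coin: returns 0 or 1, each with probability 1/2.
    return random.randint(0, 1)

def iterations_until_success():
    # Mirrors the loop of Fig. 1.1: flip until the first 1 appears,
    # counting the number of loop iterations.
    x, n = 0, 0
    while x == 0:
        x = coin()
        n += 1
    return n

random.seed(0)
trials = 100_000
mean = sum(iterations_until_success() for _ in range(trials)) / trials
print(mean)  # close to the exact expectation 2
```

With 100,000 trials the sample mean is typically within a few hundredths of 2, in line with the Bernoulli-process expectation 1/p.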

Our second example is also discrete, but intuitively less obvious. The program

of Fig. 1.2 implements a random walk on the two-dimensional grid Z × Z. In each

iteration of the body of the loop, the function step updates the current coordinates

by moving left, right, down, or up, each with equal probability 1/4.

main {
    u := 0;
    v := 0;
    step(u, v);
    while u != 0 || v != 0 do
        step(u, v)
}

step(u, v) {
    x := coin();
    y := coin();
    u := u + (x - y);
    v := v + (x + y - 1)
}

Figure 1.2 A random walk on a two-dimensional grid

The loop continues until the random walk returns to the origin. The first call to step outside the loop ensures that the program takes at least one step, so it does not halt immediately. The question of the halting probability is now much less obvious. The state space is infinite, and there is no constraint on how far the random walk can travel from the origin. Indeed, for any distance, there is a nonzero probability that it goes at least that far. However, it turns out that the probability that the program halts is 1. In the terminology of probability theory, we would say that the two-dimensional random walk is recurrent at every point. This example illustrates how the analysis of probabilistic programs can rely on results from probability theory that are far from obvious. Indeed, the three-dimensional version is not recurrent; the probability that a random walk on Z^3 eventually returns to the origin is strictly less than 1.
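A quick experiment makes the recurrence claim plausible. The sketch below is a hypothetical Python translation of Fig. 1.2's step; since return to the origin is almost sure but can take arbitrarily long, the simulation caps each walk at a step budget (the helper returns_within is our own name) and measures the fraction of walks that return within that budget, which creeps upward as the budget grows.

```python
import random

def step(u, v):
    # One move of Fig. 1.2: left, right, down, or up, each with probability 1/4.
    x, y = random.randint(0, 1), random.randint(0, 1)
    return u + (x - y), v + (x + y - 1)

def returns_within(budget):
    # Run the walk of Fig. 1.2, truncated: did it return to (0, 0)
    # within `budget` further steps after the mandatory first step?
    u, v = step(0, 0)
    for _ in range(budget):
        if u == 0 and v == 0:
            return True
        u, v = step(u, v)
    return u == 0 and v == 0

random.seed(1)
trials = 2000
for budget in (10, 100, 1000):
    frac = sum(returns_within(budget) for _ in range(trials)) / trials
    print(budget, frac)  # the fraction grows with the budget
```

No finite budget reaches probability 1, of course; recurrence only says the fractions tend to 1 as the budget tends to infinity, and the convergence for the two-dimensional walk is known to be slow.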

We now consider two programs that require continuous distributions. The semantics of such programs cannot be defined without the full power of measure theory, the mathematical foundation of probabilities and integration. The program of Fig. 1.3 approximates the constant π using Monte Carlo integration, a probabilistic integration method. The program works by taking a large number of independent, uniformly distributed random samples from the square [0, 1] × [0, 1] and counting the number that fall inside the unit circle. As the area of the square is 1 and the area of the part of the unit circle inside that square is π/4, by the law of large numbers we expect to see a π/4 fraction of sample points lying inside the circle.

i := 0;
n := 0;
while i < 1e9 do
    x := rand();
    y := rand();
    if (x*x + y*y) < 1 then n := n+1;
    i := i+1
i := 4*n/1e9;

Figure 1.3 Probabilistic computation of π.

In this example, the random number generator rand() samples from the uniform distribution on the interval [0, 1]. This distribution is often called Lebesgue measure. Here the state space [0, 1] is uncountable and the probability of drawing any particular x ∈ [0, 1] is zero. Such probability distributions are called continuous. The natural question to ask about this program is not whether it terminates (it clearly does) but whether it returns a good approximation of π with high probability. We will answer this question in Section 1.3.
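A direct Python transcription of Fig. 1.3 (with a much smaller sample count than the figure's 10^9, so it runs in moments; the function name estimate_pi is our own) already shows the law of large numbers at work.

```python
import random

def estimate_pi(samples):
    # Monte Carlo integration as in Fig. 1.3: count uniform points from
    # the square [0,1] x [0,1] falling inside the unit circle; the
    # fraction approaches pi/4, so 4 times the fraction approaches pi.
    n = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y < 1:
            n += 1
    return 4 * n / samples

random.seed(0)
print(estimate_pi(100_000))  # roughly 3.14
```

With 10^5 samples the standard error of the estimate is about 0.005, so the printed value typically agrees with π to two decimal places; the 10^9 samples of Fig. 1.3 shrink the error by a further factor of 100.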

Finally, the program in Fig. 1.4 generates a real number in [0, 1] whose expansion in base 3 does not contain any 1's. This program is not like the others in that it does not halt (nor is it meant to). The program generates a sample from a curious and in many respects counterintuitive distribution called the Cantor distribution. It cannot be described using discrete probability distributions (i.e. finite or countable weighted sums of point masses), although the program only uses a discrete fair coin as a source. The Cantor distribution is also an example of a continuous probability distribution, which assigns probability zero to every element of the state space. It is also an example of a so-called singular distribution, since it can be shown that the set of all its possible outcomes—that is to say the set of all real numbers whose base-3 expansion contains no 1's—has measure 0 in the Lebesgue measure on [0, 1].

x := 0;
d := 1;
while true do
    d := d/3;
    x := x+2*coin()*d

Figure 1.4 Cantor distribution program.
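Since the program of Fig. 1.4 never halts, any executable version must truncate it. The sketch below (our own hypothetical Python rendering, using exact fractions to avoid floating-point noise) runs the loop for a fixed number of iterations and then verifies the defining property of the sample: every digit of its base-3 expansion is 0 or 2, never 1.

```python
import random
from fractions import Fraction

def cantor_sample(k):
    # Truncated run of Fig. 1.4: k iterations of the loop, in exact
    # rational arithmetic. Each iteration contributes the ternary digit
    # 2*coin() at position i, so x = sum of 2*b_i * 3^(-i).
    x, d = Fraction(0), Fraction(1)
    for _ in range(k):
        d /= 3
        x += 2 * random.randint(0, 1) * d
    return x

def ternary_digits(x, k):
    # First k digits of x in base 3 (x assumed in [0, 1)).
    digits = []
    for _ in range(k):
        x *= 3
        digit = int(x)
        digits.append(digit)
        x -= digit
    return digits

random.seed(0)
x = cantor_sample(20)
print(float(x), ternary_digits(x, 20))  # no digit equals 1
```

Running the loop for k iterations fixes the first k ternary digits of the eventual sample; the untruncated, non-halting program determines all digits "in the limit", which is exactly the sense in which it samples from the Cantor distribution.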

1.2 Measure theory: What you need to know

Measures are a generalization of the concepts of length, area, or volume of Euclidean geometry to other spaces. They form the basis of probability and integration theory. In this section, we explain what it means for a space to be a measurable space, we define measures on these spaces, and we examine the rich structure of spaces of measures, which will be essential to the semantics of probabilistic programs defined in Section 1.3.5. When not specified otherwise we use the word measure to refer to finite measures.

1.2.1 Some intuition

The concepts of length, area, and volume on Euclidean spaces are examples of (positive) measures. These are sufficient to illustrate most of the desired properties of measures and some pitfalls to avoid. For the sake of simplicity, let us examine the concept of length. Given an interval [a, b] ⊆ R, its length is of course ℓ([a, b]) = b − a. But the length function ℓ makes sense for other subsets of R besides intervals. So we will begin with two related questions:

(a) Which subsets of R can meaningfully be assigned a "length" consistent with the length of intervals? I.e., what should the domain of ℓ be?

(b) Which properties should the length function ℓ satisfy?

The answer to question (a) will give rise to the notion of measurable space, and the answer to question (b) will give rise to the notion of measure, both defined formally in Section 1.2.2.

Note that larger intervals have larger lengths: if [a, b] ⊆ [c, d], then we have that ℓ([a, b]) = b − a ≤ d − c = ℓ([c, d]). This intuitively obvious property is a general feature of all positive measures: they associate nonnegative real numbers to subsets monotonically with respect to set inclusion. Let us now take two disjoint intervals



[a_1, b_1] and [a_2, b_2] with b_1 < a_2. It is natural to define the length of [a_1, b_1] ∪ [a_2, b_2] as the sum of the lengths of the respective intervals, i.e.

ℓ([a_1, b_1] ∪ [a_2, b_2]) = ℓ([a_1, b_1]) + ℓ([a_2, b_2]) = (b_1 − a_1) + (b_2 − a_2).

We can draw two conclusions from this natural definition. First, if A, B are two disjoint subsets of R in the domain of ℓ, then their union should also belong to the domain of ℓ, and the measure of the union should be the sum of the measures. More generally, if A_i, 1 ≤ i ≤ n, is any finite collection of pairwise disjoint sets in the domain of ℓ, then ⋃_{i=1}^n A_i should also be in the domain of ℓ, and the measure of the union should be the sum of the measures of the A_i; that is,

ℓ(⋃_{i=1}^n A_i) = ∑_{i=1}^n ℓ(A_i). (1.1)

A real-valued function on subsets satisfying (1.1) is called (finitely) additive. All measures will be finitely additive, and in fact more. Consider the countable collection of pairwise disjoint intervals [n, n + 2^(−n)), n ∈ N. Generalising (1.1), it is natural to define ℓ on the union of these intervals as

ℓ(⋃_{n=0}^∞ [n, n + 2^(−n))) = ∑_{n=0}^∞ 2^(−n) = 2.
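The geometric series behind this example is easy to check numerically; a few illustrative lines of Python (the helper name partial_length is our own) show the partial sums of the interval lengths approaching 2.

```python
def partial_length(k):
    # Total length of the first k disjoint intervals [n, n + 2^(-n)):
    # the sum of 2^(-n) for n = 0, ..., k-1, which tends to 2.
    return sum(2 ** -n for n in range(k))

for k in (1, 4, 16, 64):
    print(k, partial_length(k))  # partial sums approach the total measure 2
```

Countable additivity is exactly the licence to pass from these finite partial sums to the infinite total: the length of the whole union is the limit, 2.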

Again, we can draw two conclusions from this natural definition. First, if A_i for i ∈ N is a countable collection of pairwise disjoint sets in the domain of ℓ, then ⋃_{i∈N} A_i should be in the domain of ℓ; second, that (1.1) should be extended to such countable collections, so that

ℓ(⋃_{i=0}^∞ A_i) = ∑_{i=0}^∞ ℓ(A_i). (1.2)

A function satisfying (1.2) is called countably additive or σ-additive. Every measure will be countably additive. The reader will now legitimately ask: what happens if the sum in (1.2) diverges? To deal with this behaviour, one simply allows ∞ as a possible length, that is to say the codomain of ℓ can be the extended real line R_+ ∪ {∞}. In particular, this allows us to define the length of R via (1.2) as:

ℓ(R) = ℓ(⋃_{n∈Z} [n, n + 1)) = ∞.

However, for the purpose of semantics of probabilistic programs, we will not need measures taking the value ∞. A measure is called finite if it only assigns finite values in R to any set in its domain. For the remainder of this chapter, the term "measure", otherwise unqualified, will refer to finite measures.
