
Building Skills in Python

programming Python

Uploaded by

aton.dehoutem331
Copyright
© Attribution Non-Commercial (BY-NC)

Building Skills in Python

Release 2.6.5

Steven F. Lott

April 20, 2010

CONTENTS

I Front Matter

1 Preface
1.1 Why Read This Book?
1.2 Audience
1.3 Organization of This Book
1.4 Limitations
1.5 Programming Style
1.6 Conventions Used in This Book
1.7 Acknowledgements

II Language Basics

2 Background and History
2.1 History
2.2 Features of Python
2.3 Comparisons
2.4 Some Jargon

3 Python Installation
3.1 Windows Installation
3.2 Macintosh Installation
3.3 GNU/Linux and UNIX Overview
3.4 Build from Scratch Installation

4 Getting Started
4.1 Command-Line Interaction
4.2 The IDLE Development Environment
4.3 Script Mode
4.4 Getting Help
4.5 Syntax Formalities
4.6 Exercises
4.7 Other Tools
4.8 Style Notes: Wise Choice of File Names

5 Simple Numeric Expressions and Output
5.1 Seeing Output with the print() Function (or print Statement)
5.2 Numeric Types and Operators
5.3 Numeric Conversion (or Factory) Functions
5.4 Built-In Math Functions
5.5 Expression Exercises
5.6 Expression Style Notes

6 Advanced Expressions
6.1 Using Modules
6.2 The math Module
6.3 The random Module
6.4 Advanced Expression Exercises
6.5 Bit Manipulation Operators
6.6 Division Operators

7 Variables, Assignment and Input
7.1 Variables
7.2 The Assignment Statement
7.3 Input Functions
7.4 Multiple Assignment Statement
7.5 The del Statement
7.6 Interactive Mode Revisited
7.7 Variables, Assignment and Input Function Exercises
7.8 Variables and Assignment Style Notes

8 Truth, Comparison and Conditional Processing
8.1 Truth and Logic
8.2 Comparisons
8.3 Conditional Processing: the if Statement
8.4 The pass Statement
8.5 The assert Statement
8.6 The if-else Operator
8.7 Condition Exercises
8.8 Condition Style Notes

9 Loops and Iterative Processing
9.1 Iterative Processing: For All and There Exists
9.2 Iterative Processing: The for Statement
9.3 Iterative Processing: The while Statement
9.4 More Iteration Control: break and continue
9.5 Iteration Exercises
9.6 Condition and Loops Style Notes
9.7 A Digression

10 Functions
10.1 Semantics
10.2 Function Definition: The def and return Statements
10.3 Function Use
10.4 Function Varieties
10.5 Some Examples
10.6 Hacking Mode
10.7 More Function Definition Features
10.8 Function Exercises
10.9 Object Method Functions
10.10 Functions Style Notes

11 Additional Notes On Functions
11.1 Functions and Namespaces
11.2 The global Statement
11.3 Call By Value and Call By Reference
11.4 Function Objects

III Data Structures

12 Sequences: Strings, Tuples and Lists
12.1 Sequence Semantics
12.2 Overview of Sequences
12.3 Exercises
12.4 Style Notes

13 Strings
13.1 String Semantics
13.2 String Literal Values
13.3 String Operations
13.4 String Comparison Operations
13.5 String Statements
13.6 String Built-in Functions
13.7 String Methods
13.8 String Modules
13.9 String Exercises
13.10 Digression on Immutability of Strings

14 Tuples
14.1 Tuple Semantics
14.2 Tuple Literal Values
14.3 Tuple Operations
14.4 Tuple Comparison Operations
14.5 Tuple Statements
14.6 Tuple Built-in Functions
14.7 Tuple Exercises
14.8 Digression on The Sigma Operator

15 Lists
15.1 List Semantics
15.2 List Literal Values
15.3 List Operations
15.4 List Comparison Operations
15.5 List Statements
15.6 List Built-in Functions
15.7 List Methods
15.8 Using Lists as Function Parameter Defaults
15.9 List Exercises

16 Mappings and Dictionaries
16.1 Dictionary Semantics
16.2 Dictionary Literal Values
16.3 Dictionary Operations
16.4 Dictionary Comparison Operations
16.5 Dictionary Statements
16.6 Dictionary Built-in Functions
16.7 Dictionary Methods
16.8 Using Dictionaries as Function Parameter Defaults
16.9 Dictionary Exercises
16.10 Advanced Parameter Handling For Functions

17 Sets
17.1 Set Semantics
17.2 Set Literal Values
17.3 Set Operations
17.4 Set Comparison Operators
17.5 Set Statements
17.6 Set Built-in Functions
17.7 Set Methods
17.8 Using Sets as Function Parameter Defaults
17.9 Set Exercises

18 Exceptions
18.1 Exception Semantics
18.2 Basic Exception Handling
18.3 Raising Exceptions
18.4 An Exceptional Example
18.5 Complete Exception Handling and The finally Clause
18.6 Exception Functions
18.7 Exception Attributes
18.8 Built-in Exceptions
18.9 Exception Exercises
18.10 Style Notes
18.11 A Digression

19 Iterators and Generators
19.1 Iterator Semantics
19.2 Generator Function Semantics
19.3 Defining a Generator Function
19.4 Generator Functions
19.5 Generator Statements
19.6 Iterators Everywhere
19.7 Generator Function Example
19.8 Generator Exercises

20 Files
20.1 File Semantics
20.2 File Organization and Structure
20.3 Additional Background
20.4 Built-in Functions
20.5 File Statements
20.6 File Methods
20.7 Several Examples
20.8 File Exercises

21 Functional Programming with Collections
21.1 Lists of Tuples
21.2 List Comprehensions
21.3 Sequence Processing Functions: map(), filter() and reduce()
21.4 Advanced List Sorting
21.5 The Lambda
21.6 Multi-Dimensional Arrays or Matrices
21.7 Exercises

22 Advanced Mapping Techniques
22.1 Default Dictionaries
22.2 Inverting a Dictionary
22.3 Exercises

IV Data + Processing = Objects

23 Classes
23.1 Semantics
23.2 Class Definition: the class Statement
23.3 Creating and Using Objects
23.4 Special Method Names
23.5 Some Examples
23.6 Object Collaboration
23.7 Class Definition Exercises

24 Advanced Class Definition
24.1 Inheritance
24.2 Polymorphism
24.3 Built-in Functions
24.4 Collaborating with max(), min() and sort()
24.5 Initializer Techniques
24.6 Class Variables
24.7 Static Methods and Class Method
24.8 Design Approaches
24.9 Advanced Class Definition Exercises
24.10 Style Notes

25 Some Design Patterns
25.1 Factory
25.2 State
25.3 Strategy
25.4 Design Pattern Exercises

26 Creating or Extending Data Types
26.1 Semantics of Special Methods
26.2 Basic Special Methods
26.3 Special Attribute Names
26.4 Numeric Type Special Methods
26.5 Collection Special Method Names
26.6 Collection Special Method Names for Iterators and Iterable
26.7 Collection Special Method Names for Sequences
26.8 Collection Special Method Names for Sets
26.9 Collection Special Method Names for Mappings
26.10 Mapping Example
26.11 Iterator Examples
26.12 Extending Built-In Classes
26.13 Special Method Name Exercises

27 Attributes, Properties and Descriptors
27.1 Semantics of Attributes
27.2 Properties
27.3 Descriptors
27.4 Attribute Handling Special Method Names
27.5 Attribute Access Exercises

28 Decorators
28.1 Semantics of Decorators
28.2 Built-in Decorators
28.3 Defining Decorators
28.4 Defining Complex Decorators
28.5 Decorator Exercises

29 Managing Contexts: the with Statement
29.1 Semantics of a Context
29.2 Using a Context
29.3 Defining a Context Manager Function
29.4 Defining a Context Manager Class
29.5 Context Manager Exercises

V Components, Modules and Packages

30 Modules
30.1 Module Semantics
30.2 Module Definition
30.3 Module Use: The import Statement
30.4 Finding Modules: The Path
30.5 Variations on An import Theme
30.6 The exec Statement
30.7 Module Exercises
30.8 Style Notes

31 Packages
31.1 Package Semantics
31.2 Package Definition
31.3 Package Use
31.4 Package Exercises
31.5 Style Notes

32 The Python Library
32.1 Overview of the Python Library
32.2 Most Useful Library Sections
32.3 Library Exercises

33 Complex Strings: the re Module
33.1 Semantics
33.2 Creating a Regular Expression
33.3 Using a Regular Expression
33.4 Regular Expression Exercises

34 Dates and Times: the time and datetime Modules
34.1 Semantics: What is Time?
34.2 Some Class Definitions
34.3 Creating a Date-Time
34.4 Date-Time Calculations and Manipulations
34.5 Presenting a Date-Time
34.6 Formatting Symbols
34.7 Time Exercises
34.8 Additional time Module Features

35 File Handling Modules
35.1 The os.path Module
35.2 The os Module
35.3 The fileinput Module
35.4 The glob and fnmatch Modules
35.5 The tempfile Module
35.6 The shutil Module
35.7 The File Archive Modules: tarfile and zipfile
35.8 The sys Module
35.9 Additional File-Processing Modules
35.10 File Module Exercises

36 File Formats: CSV, Tab, XML, Logs and Others
36.1 Overview
36.2 Comma-Separated Values: The csv Module
36.3 Tab Files: Nothing Special
36.4 Property Files and Configuration (or .INI) Files: The ConfigParser Module
36.5 Fixed Format Files, A COBOL Legacy: The codecs Module
36.6 XML Files: The xml.etree and xml.sax Modules
36.7 Log Files: The logging Module
36.8 File Format Exercises
36.9 The DOM Class Hierarchy

37 Programs: Standing Alone
37.1 Kinds of Programs
37.2 Command-Line Programs: Servers and Batch Processing
37.3 The optparse Module
37.4 Command-Line Examples
37.5 Other Command-Line Features
37.6 Command-Line Exercises
37.7 The getopt Module

38 Architecture: Clients, Servers, the Internet and the World Wide Web
38.1 About TCP/IP
38.2 The World Wide Web and the HTTP protocol
38.3 Writing Web Clients: The urllib2 Module
38.4 Writing Web Applications
38.5 Sessions and State
38.6 Handling Form Inputs
38.7 Web Services
38.8 Client-Server Exercises
38.9 Socket Programming

VI Projects

39 Areas of the Flag
39.1 Basic Red, White and Blue
39.2 The Stars

40 Bowling Scores

41 Musical Pitches
41.1 Equal Temperament
41.2 Overtones
41.3 Circle of Fifths
41.4 Pythagorean Tuning
41.5 Five-Tone Tuning

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

510 511 511 512 513 515 515 517 518 519 521 521 522 523 524 525 525 526 528 529 529 531 532 534 535 537 539 542 545 545 549 552 552 553

42 What Can be Computed?
    42.1 Background
    42.2 The Turing Machine
    42.3 Example Machine
    42.4 Turing Machine Implementation
    42.5 Exercise 1
    42.6 Test Machines
    42.7 Exercise 2
    42.8 Better Implementations
    42.9 Exercise 3
    42.10 Consequences
    42.11 Other Applications
    42.12 Alternative Specifications
    42.13 Exercise 4

43 Mah Jongg Hands
    43.1 Tile Class Hierarchy
    43.2 Wall Class
    43.3 TileSet Class Hierarchy
    43.4 Hand Class
    43.5 Some Test Cases
    43.6 Hand Scoring - Points
    43.7 Hand Scoring - Doubles
    43.8 Limit Hands

44 Chess Game Notation
    44.1 Algebraic Notation
    44.2 Algorithms for Resolving Moves
    44.3 Descriptive Notation
    44.4 Game State
    44.5 PGN Processing Specifications

VII Back Matter

45 Bibliography
    45.1 Use Cases
    45.2 Computer Science
    45.3 Design Patterns
    45.4 Languages
    45.5 Problem Domains

46 Indices and Tables

47 Production Notes

Bibliography

Building Skills in Python, Release 2.6.5

A Programmer's Introduction to Python

Legal Notice

This work is licensed under a Creative Commons License. You are free to copy, distribute, display, and perform the work under the following conditions:

• Attribution. You must give the original author, Steven F. Lott, credit.
• Noncommercial. You may not use this work for commercial purposes.
• No Derivative Works. You may not alter, transform, or build upon this work.

For any reuse or distribution, you must make clear to others the license terms of this work.


Part I

Front Matter

CHAPTER

ONE

PREFACE
The Zen Of Python

Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

1.1 Why Read This Book?


You need this book because you need to learn Python. Here are a few reasons why you might need to learn Python:

• You need a programming language which is easy to read and has a vast library of modules focused on solving the problems you're faced with.
• You saw an article about Python specifically, or dynamic languages in general, and want to learn more.
• You're starting a project where Python will be used or is in use.
• A colleague has suggested that you look into Python.
• You've run across a Python code sample on the web and need to learn more.

Python reflects a number of growing trends in software development, putting it at or near the leading edge of good programming languages. It is a very simple language surrounded by a vast library of add-on modules. It is an open source project, supported by many individuals. It is an object-oriented language, binding data and processing into class definitions. It is a platform-independent, scripted language, with complete access to operating system APIs. It supports integration of complex solutions from pre-built components. It is a dynamic language, which avoids many of the complexities and overheads of compiled languages.

This book is a close-to-complete presentation of the Python language. It is oriented toward learning, which involves accumulating many closely intertwined concepts. In our experience teaching, coaching and doing programming, there is an upper limit on the "clue absorption rate". In order to keep within this limit, we've found that it helps to present a language as ever-expanding layers. We'll lead you from a very tiny, easy to understand subset of statements to the entire Python language and all of the built-in data structures. We've also found that doing a number of exercises helps internalize each language concept.

Three Faces of a Language. There are three facets to a programming language: how you write it, what it means, and the additional practical considerations that make a program useful. While many books cover the syntax and semantics of Python, in this book we'll also cover the pragmatic considerations. Our core objective is to build enough language skills that good object-oriented design will be an easy next step.

The syntax of a language is covered in the language reference manual available online. In the case of relatively simple languages, like Python, the syntax is simple. We'll provide additional examples of language syntax.

The semantics of the language can be a bit more slippery than the syntax. Some languages involve obscure or unique concepts that make it difficult to see what a statement really means. In the case of languages like Python, which have extensive additional libraries, the burden is doubled. First, one has to learn the language, then one has to learn the libraries. The number of open source packages made available by the Python community can increase the effort required to understand an entire architecture. The reward, however, is high-quality software based on high-quality components, with a minimum of development and integration effort.

Many languages offer a number of tools that can accomplish the same basic task. Python is no exception. It is often difficult to know which of many alternatives performs better or is easier to adapt. We'll try to focus on showing the most helpful approach, emphasizing techniques that apply for larger development efforts. We'll try to avoid quick and dirty solutions that are only appropriate when learning the language.

1.2 Audience
Professional programmers who need to learn Python are our primary audience. We provide specific help for you in a number of ways.

• Since Python is simple, we can address newbie programmers who don't have deep experience in a number of other languages. We will call out some details in specific newbie sections. Experienced programmers can skip these sections.
• Since Python has a large number of sophisticated built-in data structures, we address these separately and fully. An understanding of these structures can simplify complex programs.
• The object-orientation of Python provides tremendous flexibility and power. This is a deep subject, and we will provide an introduction to object-oriented programming in this book. More advanced design techniques are addressed in Building Skills in Object-Oriented Design, [Lott05].
• The accompanying libraries make it inexpensive to develop complex and complete solutions with minimal effort. This, however, requires some time to understand the packaged components that are available, and how they can be integrated to create useful software. We cover some of the most important modules to specifically prevent programmers from reinventing the wheel with each project.

Instructors are a secondary audience. If you are looking for classroom projects that are engaging, comprehensible, and focus on perfecting language skills, this book can help. Each chapter in this book contains exercises that help students master the concepts presented in the chapter.

This book assumes a basic level of skill with any of the commonly-available computer systems. The following skills will be required.

• Download and install open-source application software. Principally, this is the Python distribution kit from http://www.python.org. However, we will provide references to additional software components.
• Create text files. We will address doing this in IDLE, the Python Integrated Development Environment (IDE). We will also talk about doing this with a garden-variety text editor like Komodo, VIM, EMACS, TEXTPAD and BBEDIT.
• Run programs from the command-line. This includes the DOS command shell in Microsoft Windows, or the Terminal tool in Linux or Apple's Macintosh OS X.
• Be familiar with high-school algebra and some trigonometry. Some of the exercises make heavy use of basic algebra and trigonometry.

When you've finished with this book you should be able to do the following.

• Use the core procedural programming constructs: variables, statements, exceptions, functions. We will not, for example, spend any time on design of loops that terminate properly.
• Create class definitions and subclasses. This includes managing the basic features of inheritance, as well as overloaded method names.
• Use the Python collection classes appropriately. This includes the various kinds of sequences, and the dictionary.

1.3 Organization of This Book


This book falls into five distinct parts. To manage the clue absorption rate, the first three parts are organized in a way that builds up the language in layers from central concepts to more advanced features. Each layer introduces a few new concepts, and is presented in some depth. Programming exercises are provided to encourage further exploration of each layer. The last two parts cover the extension modules and provide specifications for some complex exercises that will help solidify programming skills.

Some of the chapters include digressions on more advanced topics. These can be skipped, as they cover topics related to programming in general, or notes about the implementation of the Python language. These are reference material to help advanced students build skills above and beyond the basic language.

The first part, Language Basics, introduces the basic features of the Python language, covering most of the statements but sticking with basic numeric data types. Background and History provides some history and background on Python. Getting Started covers installation of Python, using the interpreter interactively and creating simple program files. Simple Numeric Expressions and Output covers the basic expressions and core numeric types. Variables, Assignment and Input introduces variables, assignment and some simple input constructs. Truth, Comparison and Conditional Processing adds truth and conditions to the language. Loops and Iterative Processing adds iteration. In Functions we'll add basic function definition and function call constructs; Additional Notes On Functions introduces some advanced function call features.

The second part, Data Structures, adds a number of data structures to enhance the expressive power of the language. In this part we will use a number of different kinds of objects, prior to designing our own objects. Sequences: Strings, Tuples and Lists extends the data types to include various kinds of sequences. These include Strings, Tuples and Lists. Mappings and Dictionaries describes mappings and dictionaries. Exceptions covers exception objects, and exception creation and handling. Files covers files and several closely related operating system (OS) services. Functional Programming with Collections describes more advanced sequence techniques, including multi-dimensional matrix processing. This part attempts to describe a reasonably complete set of built-in data types.

The third part, Data + Processing = Objects, unifies data and processing to define the object-oriented programming features of Python. Classes introduces the basics of class definitions and introduces simple inheritance. Advanced Class Definition adds some features to basic class definitions. Some Design Patterns extends this discussion further to include several common design patterns that use polymorphism. Creating or Extending Data Types describes the mechanism for adding types to Python that behave like the built-in types.

Part four, Components, Modules and Packages, describes modules, which provide a higher-level grouping of class and function definitions. It also summarizes selected extension modules provided with the Python environment. Modules provides basic semantics and syntax for creating modules. We cover the organization of packages of modules in Packages. An overview of the Python library is the subject of The Python Library. Complex Strings: the re Module covers string pattern matching and processing with the re module. Dates and Times: the time and datetime Modules covers the time and datetime modules. Programs: Standing Alone covers the creation of main programs. We touch just the tip of the client-server iceberg in Architecture: Clients, Servers, the Internet and the World Wide Web.

Some of the commonly-used modules are covered during earlier chapters. In particular the math and random modules are covered in The math Module and the string module is covered in Strings. Files touches on fileinput, os, os.path, glob, and fnmatch.

Finally, part five, Projects, presents several larger and more complex programming problems. These are ranked from relatively simple to quite complex. Areas of the Flag covers computing the area of the symbols on the American flag. Bowling Scores covers scoring in a game of bowling. Musical Pitches has several algorithms for the exact frequencies of musical pitches. What Can be Computed? has several exercises related to computability and the basics of finite state machines. Mah Jongg Hands describes algorithms for evaluating hands in the game of Mah Jongg. Chess Game Notation deals with interpreting the log from a game of chess.

1.4 Limitations
This book can't cover everything Python. There are a number of things which we will not cover in depth, and some things which we can't even touch on lightly. This list will provide you directions for further study.

• The rest of the Python library. The library is a large, sophisticated, rapidly-evolving collection of software components. We selected a few modules that are widely-used. There are many books which cover the library in general, and books which cover specific modules in depth.
• The subject of Object-Oriented (OO) design is the logical next step in learning Python. That topic is covered in Building Skills in Object-Oriented Design [Lott05].
• Database design and programming requires a knowledge of Python and a grip on OO design. It requires a digression into the relational model and the SQL language.
• Graphical User Interface (GUI) development requires a knowledge of Python, OO design and database design. There are two commonly-used toolkits: Tkinter and pyGTK.
• Web application development, likewise, requires a knowledge of Python, OO design and database design. This topic requires digressions into internetworking protocols, specifically HTTP and SOAP, plus HTML, XML and CSS languages. There are numerous web development frameworks for Python.


1.5 Programming Style


We have to adopt a style for presenting Python. We won't present a complete set of coding standards; instead we'll present examples. This section has some justification of the style we use for the examples in this book.

Just to continue this rant, we find that actual examples speak louder than any of the gratuitously detailed coding standards which are so popular in IT shops. We find that many IT organizations waste considerable time trying to write descriptions of a preferred style. A good example, however, trumps any description. As consultants, we are often asked to provide standards to an inexperienced team of programmers. The programmers only look at the examples (often cutting and pasting them). Why spend money on empty verbiage that is peripheral to the useful example?

One important note: we specifically reject using complex prefixes for variable names. Prefixes are little more than visual clutter. In many places, for example, an integer parameter with the amount of a bet might be called pi_amount where the prefix indicates the scope (p for a parameter) and type (i for an integer). We reject the pi_ as potentially misleading and therefore uninformative.

This style of name is only appropriate for primitive types, and doesn't address complex data structures well at all. How does one name a parameter that is a list of dictionaries of class instances? pldc_?

In some cases, prefixes are used to denote the scope of instance variables. Variable names might include a cryptic one-letter prefix like f to denote an instance variable; sometimes programmers will use my or the as an English-like prefix. We prefer to reduce clutter. In Python, instance variables are always qualified by self., making the scope crystal clear.

All of the code samples were tested on Python 2.6 for MacOS, using an iMac running MacOS 10.5. Additional testing of all code was done with Windows 2000 on a Dell Latitude laptop as well as a VMWare implementation of Fedora 11.
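The naming point above can be seen in a small sketch. The class Bet and its attribute and method names are hypothetical, invented here for illustration; they are not taken from the book's examples. The sketch uses a single-argument print() call so that it behaves the same under Python 2.6 and later releases.

```python
class Bet(object):
    """A hypothetical wager, used only to illustrate naming."""

    def __init__(self, amount):
        # A plain parameter name: no pi_ prefix for scope or type.
        # The 'self.' qualifier already marks an instance variable.
        self.amount = amount

    def double(self):
        """Return a new Bet with twice the amount."""
        return Bet(self.amount * 2)


bet = Bet(5)
bigger = bet.double()
print(bigger.amount)
```

Nothing in the names says "parameter" or "integer", yet the scope of each name is visible from where it appears: amount is a parameter inside __init__(), and self.amount is the instance variable.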

1.6 Conventions Used in This Book


Here is a typical code sample.

Typical Python Example

    combo = { }                          # (1) empty dictionary of roll frequencies
    for i in range(1,7):
        for j in range(1,7):
            roll= i+j
            combo.setdefault( roll, 0 )  # (2) make sure the key exists
            combo[roll] += 1
    for n in range(2,13):                # (3) report each total as a percentage
        print "%d %.2f%% " % ( n, 100*combo[n]/36.0 )

1. This creates a Python dictionary, a map from key to value. If we initialize it with something like the following: combo = dict( [ (n,0) for n in range(2,13) ] ), we don't need the setdefault() method call below.
2. This assures that the rolled number exists in the dictionary with a default frequency count of 0.
3. Print each member of the resulting dictionary. Something more obscure like [ (n,combo[n]/36.0) for n in range(2,13)] is certainly possible.

The output from the above program will be shown as follows:

    2 2.78%
    3 5.56%
    4 8.33%
    5 11.11%
    6 13.89%
    7 16.67%
    8 13.89%
    9 11.11%
    10 8.33%
    11 5.56%
    12 2.78%
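Note 1 above mentions initializing the dictionary up front so that setdefault() isn't needed. The following is a minimal sketch of that variant, not the book's own listing. The name lines is introduced here for illustration, the fraction is scaled by 100 so the printed figure is a true percentage, and print() is called with a single argument so the sketch runs under Python 2.6 and later.

```python
# Pre-populate every possible total (2 through 12) with a zero count,
# so no setdefault() call is needed inside the loops.
combo = dict([(n, 0) for n in range(2, 13)])

for i in range(1, 7):
    for j in range(1, 7):
        combo[i + j] += 1

# Build the report as a list of strings; 36 is the number of
# equally likely (i, j) pairs for two six-sided dice.
lines = ["%d %.2f%%" % (n, 100.0 * combo[n] / 36) for n in range(2, 13)]
print("\n".join(lines))
```

The trade-off is small: the pre-populated dictionary states the set of valid totals once, while setdefault() discovers them as the rolls are counted.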

We will use the following type styles for references to a specific Class, method(), or attribute; an attribute may be either a class variable or an instance variable.

Sidebars

When we do have a significant digression, it will appear in a sidebar, like this.

Tip: There will be design tips, and warnings, in the material for each exercise. These reflect considerations and lessons learned that aren't typically clear to starting OO designers.

1.7 Acknowledgements
I'd like to thank Carl Frederick for asking me if I was using Python to develop complex applications. At the time, I said I'd have to look into it. This is the result of that investigation.

I am indebted to Thomas Pautler, Jim Bullock, Michal Van Dorpe, Matthew Curry, Igor Sakovich, Drew, John Larsen, Robert Lucente, Lex Hider, John Nowlan and Tom Elliott for supplying much-needed corrections to errors in previous editions. John Hayes provided particularly complete and meticulous copy-editing.


Part II

Language Basics


The Processing View


A programming language involves two closely interleaved topics. On one hand, there are the procedural constructs that process information inside the computer, with visible effects on the various external devices. On the other hand are the various types of data structures and relationships for organizing the information manipulated by the program.

This part describes the most commonly-used Python statements, sticking with basic numeric data types. Data Structures will present a reasonably complete set of built-in data types and features for Python. While the two are tightly interwoven, we pick the statements as more fundamental because we can (and will) add new data types. Indeed, the essential thrust of object-oriented programming (covered in Data + Processing = Objects) is the creation of new data types.

Some of the examples in this part refer to the rules of various common casino games. Knowledge of casino gambling is not essential to understanding the language or this part of the book. We don't endorse casino gambling. Indeed, many of the exercises reveal the magnitude of the house edge in most casino games. However, casino games have just the right level of algorithmic complexity to make for excellent programming exercises.

We'll provide a little background on Python in Background and History. From there, we'll move on to installing Python in Python Installation. In Simple Numeric Expressions and Output we'll introduce the print statement (and print() function); we'll use this to see the results of arithmetic expressions including the numeric data types, operators, conversions, and some built-in functions. We'll expand on this in Advanced Expressions.

We'll introduce variables, the assignment statement, and input in Variables, Assignment and Input, allowing us to create simple input-process-output programs. When we add truth, comparisons and conditional processing in Truth, Comparison and Conditional Processing, and iteration in Loops and Iterative Processing, we'll have all the tools necessary for programming. In Functions and Additional Notes On Functions, we'll show how to define and use functions, the first of many tools for organizing programs to make them understandable.


CHAPTER

TWO

BACKGROUND AND HISTORY


History of Python and Comparison with Other Languages
This chapter describes the history of Python in History. Features of Python is an overview of the features of Python. After that, Comparisons is a subjective comparison between Python and a few other languages, using some quality criteria harvested from two sources: the Java Language Environment White Paper and On the Design of Programming Languages. This material can be skipped by newbies: it doesn't help explain Python, it puts it into a context among other programming languages.

2.1 History
Python is a relatively simple programming language that includes a rich set of supporting libraries. This approach keeps the language simple and reliable, while providing specialized feature sets as separate extensions.

Python has an easy-to-use syntax, focused on the programmer who must type in the program, read what was typed, and provide formal documentation for the program. Many languages have syntax focused on developing a simple, fast compiler; but those languages may sacrifice readability and writability. Python strikes a good balance between fast compilation, readability and writability.

Python is implemented in C, and relies on the extensive, well understood, portable C libraries. It fits seamlessly with Unix, Linux and POSIX environments. Since these standard C libraries are widely available for the various MS-Windows variants, and other non-POSIX operating systems, Python runs similarly in all environments.

The Python programming language was created in 1991 by Guido van Rossum based on lessons learned doing language and operating system support. Python is built from concepts in the ABC language and Modula-3. For information on ABC, see The ABC Programmer's Handbook [Geurts91], as well as http://www.cwi.nl/~steven/abc/. For information on Modula-3, see Modula-3 [Harbison92], as well as http://www.research.compaq.com/SRC/modula-3/html/home.html.

The current Python development is centralized at http://www.python.org.

2.2 Features of Python


Python reflects a number of growing trends in software development. It is a very simple language surrounded by a vast library of add-on modules. It is an open source project, supported by dozens of individuals. It is an object-oriented language. It is a platform-independent, scripted language, with complete access to operating system APIs. It supports integration of complex solutions from pre-built components. It is a dynamic language, allowing more run-time flexibility than statically compiled languages.

Additionally, Python is a scripting language with full access to Operating System (OS) services. Consequently, Python can create high level solutions built up from other complete programs. This allows someone to integrate applications seamlessly, creating high-powered, highly-focused meta-applications. This kind of very-high-level programming (programming in the large) is often attempted with shell scripting tools. However, the programming power in most shell script languages is severely limited. Python is a complete programming language in its own right, allowing a powerful mixture of existing application programs and unique processing to be combined.

Python includes the basic text manipulation facilities of Awk or Perl. It extends these with extensive OS services and other useful packages. It also includes some additional data types and an easier-to-read syntax than either of these languages.

Python has several layers of program organization. The Python package is the broadest organizational unit; it is a collection of modules. The Python module, analogous to the Java package, is the next level of grouping. A module may have one or more classes and free functions. A class has a number of static (class-level) variables, instance variables and methods. We'll look at these layers in detail in appropriate sections.

Some languages (like COBOL) have features that are folded into the language itself, leading to a complicated mixture of core features, optional extensions, operating-system features and special-purpose data structures or algorithms. These poorly designed languages may have problems with portability. This complexity makes these languages hard to learn. One hint that a language has too many features is that a language subset is available. Python suffers from none of these defects: the language has only about 24 statements (of which five are declaratory in nature), and the compiler is simple and portable. This makes the language easy to learn, with no need to create a simplified language subset.
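The layers of organization described above (module, class, class-level variable, instance variable, method, free function) can be seen together in one short sketch. The names Die, sides, generator, face and roll_total are hypothetical, chosen here for illustration. If this text were saved in a file, that file would be a module; a directory of such files would form a package.

```python
import random


class Die(object):
    """One die; a class groups data with the methods that use it."""

    sides = 6  # class-level variable, shared by every Die

    def __init__(self, seed=None):
        # Instance variables belong to one object and are reached via self.
        self.generator = random.Random(seed)
        self.face = None

    def roll(self):
        """Method: update this die's face and return it."""
        self.face = self.generator.randint(1, Die.sides)
        return self.face


def roll_total(dice):
    """Free (module-level) function: roll each die and sum the faces."""
    return sum(d.roll() for d in dice)


total = roll_total([Die(1), Die(2)])
print(total)
```

Each Die carries its own generator and face, while sides is stored once on the class; roll_total() lives at module level because it doesn't belong to any single die.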

2.3 Comparisons
We'll measure Python with two yardsticks. First, we'll look at a yardstick originally used for Java. Then we'll look at a yardstick based on experience designing Modula-2.

2.3.1 The Java Yardstick


The Java Language Environment White Paper [Gosling96] lists a number of desirable features of a programming language:

• Simple and Familiar
• Object-Oriented
• Secure
• Interpreted
• Dynamic
• Architecture Neutral
• Portable
• Robust
• Multithreaded
• Garbage Collection
• Exceptions
• High Performance

Python meets and exceeds most of these expectations. We'll look closely at each of these twelve desirable attributes.

Simple and Familiar. By simple, we mean that there is no GOTO statement, we don't need to explicitly manage memory and pointers, there is no confusing preprocessor, and we don't have the aliasing problems associated with unions. We note that this list summarizes the most confusing and bug-inducing features of the C programming language.

Python is simple. It relies on a few core data structures and statements. The rich set of features is introduced by explicit import of extension modules. Python lacks the problem-plagued GOTO statement, and includes the more reliable break, continue and exception raise statements. Python conceals the mechanics of object references from the programmer, making it impossible to corrupt a pointer. There is no language preprocessor to obscure the syntax of the language. There is no C-style union (or COBOL-style REDEFINES) to create problematic aliases for data in memory.

Python uses an English-like syntax, making it reasonably familiar to people who read and write English or related languages. There are few syntax rules, and ordinary, obvious indentation is used to make the structure of the software very clear.

Object-Oriented. Python is object oriented. Almost all language features are first class objects, and can be used in a variety of contexts. This is distinct from Java and C++, which create confusion by having objects as well as primitive data types that are not objects. The built-in type() function can interrogate the types of all objects. The language permits creation of new object classes. It supports single and multiple inheritance. Polymorphism is supported via run-time interpretation, leading to some additional implementation freedoms not permitted in Java or C++.

Secure. The Python language environment is reasonably secure from tampering.
Pre-compiled Python modules can be distributed to prevent altering the source code. Additional security checks can be added by supplementing the built-in __import__() function.

Many security flaws are problems with operating systems or framework software (for example, database servers or web servers). There is, however, one prominent language-related security problem: the buffer overflow problem, where an input buffer, of finite size, is overwritten by input data which is larger than the available buffer. Python doesn't suffer from this problem.

Python is a dynamic language, and abuse of features like the exec statement or the eval() function can introduce security problems. These mechanisms are easy to identify and audit in a large program.

Interpreted. An interpreted language, like Python, allows for rapid, flexible, exploratory software development. Compiled languages require a sometimes lengthy edit-compile-link-execute cycle. Interpreted languages permit a simpler edit-execute cycle. Interpreted languages can support a complete debugging and diagnostic environment. The Python interpreter can be run interactively, which can help with program development and testing.

The Python interpreter can be extended with additional high-performance modules. Also, the Python interpreter can be embedded into another application to provide a handy scripting extension to that application.

Dynamic. Python executes dynamically. Python modules can be distributed as source; they are compiled (if necessary) at import time. Object messages are interpreted, and problems are reported at run time, allowing for flexible development of applications.

In C++, any change to centrally used class headers will lead to lengthy recompilation of dependent modules. In Java, a change to the public interface of a class can invalidate a number of other modules, leading to recompilation in the best case, or runtime errors in the worst case.

Portable.
Since Python rests squarely on a portable C source, Python programs behave the same on a variety of platforms. Subtle issues like memory management are completely hidden. Operating system inconsistency makes it impossible to provide perfect portability of every feature. Portable GUIs are built using the widely-ported Tk GUI tools (Tkinter), or the GTK+ tools and the pyGTK bindings.

Robust. Programmers do not directly manipulate memory or pointers, making the language run-time environment very robust. Errors are raised as exceptions, allowing programs to catch and handle a variety of conditions. All Python language mistakes lead to simple, easy-to-interpret error messages from exceptions.

Multithreaded. The Python threading module is a Posix-compliant threading library. This is not completely supported on all platforms, but does provide the necessary interfaces. Beyond thread management, OS process management is also available, as is execution of shell scripts and other programs from within a Python program.

Additionally, many of the web frameworks include thread management. In products like TurboGears, individual web requests implicitly spawn new threads.

Garbage Collection. Memory management can be done with explicit deletes or automated garbage collection. Since Python uses garbage collection, the programmer doesn't have to worry about memory leaks (failure to delete) or dangling references (deleting too early).

The Python run-time environment handles garbage collection of all Python objects. Reference counters are used to assure that no live objects are removed. When objects go out of scope, they are eligible for garbage collection.

Exceptions. Python has exceptions, and a sophisticated try statement that handles exceptions. Unlike the standard C library, where status codes are returned from some functions, invalid pointers are returned from others, and a global error number variable is used for determining error conditions, Python signals almost all errors with an exception. Even common, generic OS services are wrapped so that exceptions are raised in a uniform way.

High Performance. The Python interpreter is quite fast.
However, where necessary, a class or module that is a bottleneck can be rewritten in C or C++, creating an extension to the runtime environment that improves performance.
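A few of these attributes can be seen directly in a short script. The following is only a minimal sketch, not something taken from the book: the class names Base1, Base2 and Child and the file path are made up for illustration, and the code is written so that it should run under Python 2.6 as well as Python 3.

```python
# Illustrative sketch only; names and the file path are hypothetical.
import os

# Object-Oriented: classes support multiple inheritance, and a subclass
# can override a method (polymorphism resolved at run time).
class Base1(object):
    def name(self):
        return "Base1"

class Base2(object):
    def describe(self):
        return "described by Base2"

class Child(Base1, Base2):      # multiple inheritance
    def name(self):             # overrides Base1.name
        return "Child"

# The built-in type() function can interrogate numbers, strings,
# built-in functions and classes alike.
types_seen = [type(1), type(2.5), type("text"), type(len), type(Child)]

# Exceptions: a failed OS service raises an exception instead of
# returning a status code. The path is assumed not to exist.
missing = os.path.join("no-such-directory-for-this-sketch", "file.txt")
try:
    open(missing)
    outcome = "opened"
except IOError:                 # on Python 3, IOError is an alias of OSError
    outcome = "IOError raised"

print(Child().name(), Child().describe(), outcome)
```

Nothing here manages memory explicitly; the objects created above are reclaimed by the run-time environment once they are no longer referenced.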

2.3.2 The Modula-2 Yardstick


One of the languages which strongly influenced the design of Python was Modula-2. In 1974, N. Wirth (creator of Pascal and its successor, Modula-2) wrote an article, On the Design of Programming Languages [Wirth74], which defined some timeless considerations in designing a programming language. He suggests the following: a language must be easy to learn and easy to use; safe from misinterpretation; extensible without changing existing features; machine [platform] independent; the compiler [interpreter] must be fast and compact; there must be ready access to system services, libraries and extensions written in other languages; the whole package must be portable.

Python syntax is designed for readability; the language is quite simple, making it easy to learn and use. The Python community is always alert to ways to simplify Python. The Python 3.0 project is actively working to remove a few poorly conceived features of Python. This will mean that Python 3.0 will be simpler and easier to use, but incompatible with Python 2.x in a few areas.

Most Python features are brought in via modules, assuring that extensions do not change or break existing features. This allows tremendous flexibility and permits rapid growth in the language libraries.

The Python interpreter is very small. Typically, it is smaller than the Java Virtual Machine. Since Python is (ultimately) written in C, it has the same kind of broad access to external libraries and extensions. Also, this makes Python completely portable.

2.4 Some Jargon


For folks new to developing software, it might help to understand a few distinctions made above.

Interpreted vs. Not Interpreted (i.e., Compiled). Python is a byte-code interpreter. A Python code object is a sequence of bytes that represent various operations and values. The Python interpreter steps through the bytes, performing the operations. A compiled language (e.g., C, C++, etc.) is translated from source form to an executable binary specific to an operating system and hardware platform. Java is similar to Python: it's compiled, and the Java Virtual Machine is a byte-code interpreter.

Dynamic vs. Not Dynamic (i.e., Static). Python is a dynamic language. Variables and functions do not have defined data types. Instead, a variable is simply a label attached to an object. A function is a callable object with parameters, but no declared result type. Each object has a strongly-defined permanent class. There is no sophisticated compile-time type checking. Instead, any type mismatches will be detected at run-time. Since many types are nearly interchangeable, there isn't a need for a lot of type checking. For examples of interchangeable (polymorphic) types, see Simple Numeric Expressions and Output. Languages like C, C++ and Java have statically-declared variables and functions.

Scripting vs. Non-Scripting. The scripting distinction is an operational feature of POSIX-compliant operating systems. Files which begin with a #!/path/to/interpreter line will be used as scripts by the OS. They can be executed from the command-line because the interpreter is named in the first line of the file. Languages like Java, C and C++ do not have this feature; these files must be compiled before they can be executed.
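The first two distinctions can be observed from inside Python itself. The following is a small sketch, not an example from the book; the function name fahrenheit_to_celsius and the variable label are made up for illustration. It uses the standard dis module to show the byte codes behind a function, and then shows a variable acting as a label that can be attached to objects of different classes.

```python
# Illustrative sketch; should run on Python 2.6+ and Python 3.
import dis

def fahrenheit_to_celsius(f):
    return (f - 32) * 5.0 / 9

# Interpreted: the function carries a code object whose co_code
# attribute is the sequence of bytes the interpreter steps through.
raw_bytes = fahrenheit_to_celsius.__code__.co_code
dis.dis(fahrenheit_to_celsius)   # prints a readable listing of the byte codes

# Dynamic: a variable is a label; the object it is attached to has the type.
label = 42
first_type = type(label)
label = "forty-two"
second_type = type(label)

# Type mismatches are detected at run time, as an exception.
mismatch_detected = False
try:
    "65" - 32
except TypeError:
    mismatch_detected = True
```

The exact byte codes printed by dis.dis() vary from one Python release to another, so the listing is best read as an illustration rather than a specification.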


CHAPTER

THREE

PYTHON INSTALLATION
Downloading, Installing and Upgrading Python
This chapter is becoming less and less relevant as Python comes pre-installed with most Linux-based operating systems. Consequently, the most interesting part of this chapter is the Windows Installation, where we describe downloading and installing Python on Windows.

Python runs on a wide, wide variety of platforms. If your particular operating system isn't described here, refer to http://www.python.org/community/ to locate an implementation.

Mac OS developers will find it simplest to upgrade to Leopard (Mac OS 10.5) or Snow Leopard (Mac OS 10.6), since it has Python included. The Mac OS installation includes the complete suite of tools. We'll look at upgrading in Macintosh Installation.

For other GNU/Linux developers, you'll find that Python is generally included in most distributions. Further, many Linux distributions automatically upgrade their Python installation. For example, Fedora Core 11 includes Python 2.6 and installs upgrades as they become available. You can find installation guidelines in GNU/Linux and UNIX Overview.

The Goal. The goal of installation is to get the Python interpreter and associated libraries. Windows users will get a program called python.exe. Linux and Mac OS users will get the Python interpreter, a program named python.

In addition to the libraries and the interpreter, your Python installation comes with a tutorial document (also available at http://docs.python.org/tutorial/) on Python that will step you through a number of quick examples. For newbies, this provides an additional point of view that you may find helpful. You may also want to refer to the Beginner's Guide Wiki at http://wiki.python.org/moin/BeginnersGuide.

3.1 Windows Installation


In some circumstances, your Windows environment may require administrator privilege. The details are beyond the scope of this book. If you can install software on your PC, then you have administrator privileges. In a corporate or academic environment, someone else may be the administrator for your PC.

The Windows installation of Python has three broad steps.

1. Pre-installation: make backups and download the installation kit.
2. Installation: install Python.
3. Post-installation: check to be sure everything worked.

We'll go through each of these in detail.


3.1.1 Windows Pre-Installation


Backup. Before installing software, back up your computer. I strongly recommend that you get a tool like Norton's Ghost (http://www.symantec.com/norton/ghost) or clonezilla (http://clonezilla.org/). Products like these will create a CD that you can use to reconstruct the operating system on your PC in case something goes wrong. It is difficult to undo an installation in Windows, and get your computer back the way it was before you started.

I've never had a single problem installing Python. I've worked with a number of people, however, who either have bad luck or don't read carefully and have managed to corrupt their Windows installation by downloading and installing software. While Python is safe, stable, reliable, virus-free, and well-respected, you may be someone with bad luck who has a problem. Often the problem already existed on your PC and installing Python was the straw that broke the camel's back. A backup is cheap insurance.

You should also have a folder for saving your downloads. You can create a folder in My Documents called downloads. I suggest that you keep all of your various downloaded tools and utilities in this folder for two reasons.

• If you need to reinstall your software, you know exactly what you downloaded.
• When you get a new computer (or an additional computer), you know what needs to be installed on that computer.

Download. After making a backup, go to the http://www.python.org web site and look for the Download area. In here, you're looking for the pre-built Windows installer. This book will emphasize Python 2.6. In that case, the kit will have a filename like python-2.6.x.msi. When you click on the filename, your browser should start downloading the file. Save it in your downloads folder.

Backup. Now is a good time to make a second backup. Seriously. This backup will have your untouched Windows system, plus the Python installation kit. It is still cheap insurance.

If you have anti-virus software [you do, don't you?] you may need to disable this until you are done installing Python.

At this point, you have everything you need to install Python:

• A backup
• The Python installer

3.1.2 Windows Installation


You'll need two things to install Python. If you don't have both, see the previous section on pre-installation.

• A backup
• The Python installer

Double-click the Python installer (python-2.6.x.msi).

The first step is to select a destination directory. The default destination should be C:\Python26. Note that Python does not expect to live in the C:\Program Files folder. Because the Program Files folder has a space in the middle of the name (something that is atypical for all operating systems other than Windows), subtle problems can arise. Consequently, Python folks prefer to put Python into C:\Python26 on Windows machines. Click Next to continue.

If you have a previous installation, then the next step is to confirm that you want to back up replaced files. The option to make backups is already selected and the folder is usually C:\Python26\BACKUP. This is the way it should be. Click Next to continue.

The next step is the list of components to install. You have a list of five components.

• Python interpreter and libraries. You want this.
• Tcl/Tk (Tkinter, IDLE, pydoc). You want this, so that you can use IDLE to build programs.
• Python HTML Help file. This is some reference material that you'll probably want to have.
• Python utility scripts (Tools/). We won't be making any use of this in this book. In the long run, you'll want it.
• Python test suite (Lib/test/). We won't make any use of this, either. It won't hurt anything if you install it.

There is an Advanced Options... button that is necessary if you are using a company-supplied computer for which you are not the administrator. If you are not the administrator, and you have permission to install additional software, you can click on this button to get the Advanced Options panel. There's a button labeled Non-Admin install that you'll need to click in order to install Python on a PC where you don't have administrator privileges. Click Next to continue.

You can pick a Start Menu Group for the Python program, IDLE and the help files. Usually, it is placed in a menu named Python 2.6. I can't see any reason for changing this, since it only seems to make things harder to find. Click Next to continue.

The installer puts files in the selected places. This takes less than a minute. Click Finish; you have just installed Python on your computer.

Tip: Debugging Windows Installation

The only problem you are likely to encounter doing a Windows installation is a lack of administrative privileges on your computer. In this case, you will need help from your support department to either do the installation for you, or give you administrative privileges.

3.1.3 Windows Post-Installation


In your Start... menu, under All Programs, you will now have a Python 2.6 group that lists five things:

• IDLE (Python GUI)
• Module Docs
• Python (command line)
• Python Manuals
• Uninstall Python

Important: Testing

If you select the Python (command line) menu item, you'll see the Python (command line) window. This will contain something like the following.

Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> ^Z

If you hit Ctrl-Z and then Enter, Python will exit. The basic Python program works. You can skip to Getting Started to start using Python.

If you select the Python Manuals menu item, this will open a Microsoft Help reader that will show the complete Python documentation library.


3.2 Macintosh Installation


Python is part of the Mac OS environment. Tiger (Mac OS 10.4) includes Python 2.3.5 and IDLE. Leopard (Mac OS 10.5) includes Python 2.5.1. Snow Leopard (Mac OS 10.6) includes Python 2.6. Generally, you don't need to do much to get started. You'll just need to locate the various Python files. Look in /System/Library/Frameworks/Python.Framework/Versions for the relevant files.

In order to upgrade software in the Macintosh OS, you must know the administrator, or owner, password. If you are the person who installed or initially set up the computer, you had to pick an owner password during the installation. If someone else did the installation, you'll need to get the password from them.

A Mac OS upgrade of Python has three broad steps.

1. Pre-upgrade: make backups and download the installation kit.
2. Installation: upgrade Python.
3. Post-installation: check to be sure everything worked.

We'll go through each of these in detail.

3.2.1 Macintosh Pre-Installation


Before installing software, back up your computer. While you can't easily burn a DVD of everything on your computer, you can usually burn a DVD of everything in your personal Mac OS X Home directory.

I've never had a single problem installing Python. I've worked with a number of people, however, who either have bad luck or don't read carefully and have managed to corrupt their Mac OS installation by downloading and installing software. While Python is safe, stable, reliable, virus-free, and well-respected, you may be someone with bad luck who has a problem. A backup is cheap insurance.

Download. After making a backup, go to the http://www.python.org web site and look for the Download area. In here, you're looking for the pre-built Mac OS X installer. This book will emphasize Python 2.6. In that case, the kit filename will start with python-2.6.2.macosx. Generally, the filename will have a date embedded in it and look like python-2.6.2.macosx2009-04-16.dmg. When you click on the filename, your browser should start downloading the file. Save it in your Downloads folder.

Backup. Now is a good time to make a second backup. Seriously. It is still cheap insurance.

At this point, you have everything you need to install Python:

• A backup
• The Python installer

3.2.2 Macintosh Installation


When you double-click the python-2.6.2-macosx2009-04-16.dmg, it will create a disk image named Universal MacPython 2.6.x. This disk image has your license, a ReadMe file, and the MacPython.mpkg. When you double-click the MacPython.mpkg file, it will take all the necessary steps to install Python on your computer. The installer will take you through seven steps. Generally, you'll read the messages and click Continue.

Introduction. Read the message and click Continue.

Read Me. This is the contents of the ReadMe file on the installer disk image. Read the message and click Continue.

License. You can read the history of Python, and the terms and conditions for using it. To install Python, you must agree with the license. When you click Continue, you will get a pop-up window that asks if you agree. Click Agree to install Python.

Select Destination. Generally, your primary disk drive, usually named Macintosh HD, will be highlighted with a green arrow. Click Continue.

Installation Type. If you've done this before, you'll see that this will be an upgrade. If this is the first time, you'll be doing an install. Click the Install or Upgrade button. You'll be asked for your password. If, for some reason, you aren't the administrator for this computer, you won't be able to install software. Otherwise, provide your password so that you can install software.

Finish Up. The message is usually The software was successfully installed. Click Close to finish.

3.2.3 Macintosh Post-Installation


In your Applications folder, you'll find a MacPython 2.6 folder, which contains a number of applications.

• BuildApplet
• Extras
• IDLE
• PythonLauncher
• Update Shell Profile.command

Look in /System/Library/Frameworks/Python.Framework/Versions for the relevant files. In the bin, Extras and Resources directories you'll find the various applications. The bin/idle file will launch IDLE for us.

Once you've finished installation, you should check to be sure that everything is working correctly.

Important: Testing

From the terminal you can enter the python command. You should see the following.

MacBook-5:~ slott$ python
Python 2.6.3 (r263:75184, Oct 2 2009, 07:56:03)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

Enter end-of-file (Ctrl-D) to exit from Python.

3.3 GNU/Linux and UNIX Overview


In Checking for Python we'll provide a procedure for examining your current configuration to see if you have Python in the first place. If you have Python, and it's version 2.6, you're all done. Otherwise, you'll have to determine what tools you have for doing an installation or upgrade.

If you have Yellowdog Updater Modified (YUM), see YUM Installation.

If you have one of the GNU/Linux variants that uses the Red Hat Package Manager (RPM), see RPM Installation.


The alternative is to use the source installation procedure in Build from Scratch Installation.

Root Access. In order to install software in GNU/Linux, you must know the administrator, or root, password. If you are the person who installed the GNU/Linux, you had to pick an administrator password during the installation. If someone else did the installation, you'll need to get the password from them.

Normally, we never log in to GNU/Linux as root except when we are installing software. In this case, because we are going to be installing software, we need to log in as root, using the administrative password.

If you are a GNU/Linux newbie and are in the habit of logging in as root, you're going to have to get a good GNU/Linux book, create another username for yourself, and start using a proper username, not root. When you work as root, you run a terrible risk of damaging or corrupting something. When you are logged on as anyone other than root, you will find that you can't delete or alter important files.

Unix is not Linux. For non-Linux commercial Unix installations (Solaris, AIX, HP/UX, etc.), check with your vendor (Oracle/Sun, IBM, HP, etc.). It is very likely that they have an extensive collection of open source projects like Python pre-built for your UNIX variant. Getting a pre-built kit from your operating system vendor is an easy way to install Python.

3.3.1 Checking for Python


Many GNU/Linux and Unix systems have Python installed. On some older Linuxes [Linuxi? Lini? Linen?] there may be an older version of Python that needs to be upgraded. Here's what you do to find out whether or not you already have Python.

We can't easily cover all variations. We'll use Fedora as a typical Linux distribution.

Run the Terminal tool. You'll get a window which prompts you by showing something like [slott@linux01 slott]$. In response to this prompt, enter env python, and see what happens.

Here's what happens when Python is not installed.

[slott@linux01 slott]$ env python
tcsh: python: not found

Here's what you see when there is a properly installed, but out-of-date Python on your GNU/Linux box.

[slott@linux01 slott]$ env python
Python 2.3.5 (#1, Mar 20 2005, 20:38:20)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1809)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> ^D

We used an ordinary end-of-file (Control-D) to exit from Python. In this case, the version number is 2.3.5, which is good, but we need to install an upgrade.
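The same check can be made from inside Python, rather than by reading the banner. The following is a minimal sketch, assuming the 2.6 threshold this book emphasizes; the names REQUIRED and is_recent_enough are made up for illustration. It relies only on the standard sys.version and sys.version_info values.

```python
# Illustrative sketch; should run on old and new Python releases alike.
import sys

REQUIRED = (2, 6)   # the release this book emphasizes

def is_recent_enough(version_info=sys.version_info, required=REQUIRED):
    """Return True when the (major, minor) version is at least `required`."""
    return tuple(version_info[:2]) >= required

print(sys.version)           # the same text shown in the start-up banner
print(is_recent_enough())    # False would mean an upgrade is needed
```

Passing a tuple by hand, as in is_recent_enough((2, 3, 5)), shows how the out-of-date installation above would be reported.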

3.3.2 YUM Installation


If you are a Red Hat or Fedora user, you likely have a program named Yum. If you don't have Yum, you should upgrade to Fedora Core 11.

Note that Yum repositories do not cover every combination of operating system and Python distribution. In these cases, you should consider an operating system upgrade in order to introduce a new Python distribution.

If you have an out-of-date Python, you'll have to enter two commands in the Terminal window.


yum upgrade python
yum install tkinter

The first command will upgrade the Python 2.6 distribution. You can use the command install instead of upgrade in the unlikely event that you somehow have Yum, but don't have Python.

The second command will assure that the extension package named tkinter is part of your Fedora installation. It is not, typically, provided automatically. You'll need this to make use of the IDLE program used extensively in later chapters.

In some cases, you will also want a package called the Python Development Tools. This includes some parts that are used by Python add-on packages.
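Once the packages are installed, it can be reassuring to confirm that the Tk bindings are actually importable. The following is a small sketch, not a procedure from the book; the function name tkinter_available is made up for illustration. It tries the module name used by Python 2 (Tkinter) and then the one used by Python 3 (tkinter).

```python
# Illustrative sketch: report whether the Tk bindings can be imported.
def tkinter_available():
    """Return True if the Tkinter (Python 2) or tkinter (Python 3) module imports."""
    try:
        import Tkinter          # module name in Python 2
    except ImportError:
        try:
            import tkinter      # module name in Python 3
        except ImportError:
            return False
    return True

if tkinter_available():
    print("Tk bindings found; IDLE should be able to start.")
else:
    print("Tk bindings are missing; install the tkinter package.")
```

This only checks that the module can be imported. It does not open a window, so it says nothing about whether a graphical display is available.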

3.3.3 RPM Installation


Many variants of GNU/Linux use the Red Hat Package Manager (RPM). The rpm tool automates the installation of software and the important dependencies among software components. If you don't know whether or not your GNU/Linux uses the Red Hat Package Manager, you'll have to find a GNU/Linux expert to help you make that determination.

Red Hat Linux (and the related Fedora Core distributions) have a version of Python pre-installed. Sometimes, the pre-installed Python is an older release and needs an upgrade.

This book will focus on Fedora Core GNU/Linux because that's what I have running. Specifically, Fedora Core 8. You may have a different GNU/Linux, in which case, this procedure is close, but may not be precisely what you'll have to do.

The Red Hat and Fedora GNU/Linux installation of Python has three broad steps.

1. Pre-installation: make backups.
2. Installation: install Python. We'll focus on the simplest kind of installation.
3. Post-installation: check to be sure everything worked.

We'll go through each of these in detail.

3.3.4 RPM Pre-Installation


Before installing software, back up your computer.

You should also have a directory for saving your downloads. I recommend that you create a /opt directory for these kinds of options which are above and beyond the basic Linux installation. You can keep all of your various downloaded tools and utilities in this directory for two reasons.

• If you need to reinstall your software, you know exactly what you downloaded.
• When you get a new computer (or an additional computer), you know what needs to be installed on that computer.

3.3.5 RPM Installation


A typical scenario for installing Python is a command like the following. This has specific file names for Fedora Core 9. You'll need to locate appropriate RPMs for your distribution of Linux.

rpm -i http://download.fedora.redhat.com/pub/fedora/linux/development\
/i386/os/Packages/python-2.5.1-18.fc9.i386.rpm


Often, that's all there is to it. In some cases, you'll get warnings about the DSA signature. These are expected, since we didn't tell RPM the public key that was used to sign the packages.

3.3.6 RPM Post-Installation


Important: Testing Run the Terminal tool. At the command line prompt, enter env python, and see what happens.
[slott@localhost trunk]$ env python
Python 2.6 (r26:66714, Jun 8 2009, 16:07:26)
[GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

If you hit Ctrl-D (the GNU/Linux end-of-file character), Python will exit. The basic Python program works.

3.4 Build from Scratch Installation


There are many GNU/Linux variants, and we can't even begin to cover each variant. You can use a similar installation on Windows or the Mac OS, if you have the GCC compiler installed. Here's an overview of how to install using a largely manual sequence of steps.

1. Pre-Installation. Make backups and download the source kit. You're looking for a file named Python-2.6.x.tgz.

2. Installation. The installation involves a fairly common set of commands. If you are an experienced system administrator, but a novice programmer, you may recognize these.

Change to the /opt/python directory with the following command.
cd /opt/python

Unpack the archive file with the following command.


tar -zxvf Python-2.6.x.tgz

Use the following three commands to configure the installation scripts and make the Python package. A separate install command, run as root, then installs Python on your computer.

cd Python-2.6
./configure
make

As root, you'll need to do the following command. Either use sudo or su to temporarily elevate your privileges.
make install

3. Post-installation. Check to be sure everything worked. Important: Testing Run the Terminal tool. At the command line prompt, enter env python, and see what happens.


[slott@localhost trunk]$ env python
Python 2.6 (r26:66714, Jun 8 2009, 16:07:26)
[GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

If you hit Ctrl-D (the GNU/Linux end-of-file character), Python will exit. The basic Python program works.

Tip: Debugging Other Unix Installation

The most likely problem you'll encounter in doing a generic installation is not having the appropriate GNU GCC compiler. In this case, you will see error messages from configure which identify the list of missing packages. Installing the GNU GCC can become a complex undertaking.
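After a build from source, a machine may have more than one Python on it, so it helps to confirm which interpreter actually answered the python command. The following is a minimal sketch using the standard sys.executable, sys.prefix and sys.version_info values; the function name describe_interpreter is made up for illustration.

```python
# Illustrative sketch: report which Python installation is running.
import sys

def describe_interpreter():
    """Return the path, install prefix and version of the running Python."""
    return {
        "executable": sys.executable,   # full path of the interpreter program
        "prefix": sys.prefix,           # directory the installation lives under
        "version": "%d.%d.%d" % tuple(sys.version_info[:3]),
    }

for key, value in sorted(describe_interpreter().items()):
    print("%s: %s" % (key, value))
```

If the reported prefix is not the directory you just installed into, an older Python is probably earlier on your search path.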


CHAPTER

FOUR

GETTING STARTED
Interacting with Python
Python is an interpreted, dynamic language. The Python interpreter can be used in two modes: interactive and scripted. In interactive mode, Python responds to each statement while we type. In script mode, we give Python a file of statements and turn it loose to interpret all of the statements in that script. Both modes produce identical results. When we're producing a finished application program, we set it up to run as a script. When we're experimenting or exploring, however, we may use Python interactively.

We'll describe the interactive command-line mode for entering simple Python statements in Command-Line Interaction. In The IDLE Development Environment we'll cover the basics of interactive Python in the IDLE environment. We'll describe the script mode for running Python program files in Script Mode.

We'll look at the help function in Getting Help.

Once we've started interacting with Python, we can address some syntax issues in Syntax Formalities. We'll mention some other development tools in Other Tools. We'll also address some style issues in Style Notes: Wise Choice of File Names.

4.1 Command-Line Interaction


We'll look at interaction on the command line first, because it is the simplest way to interact with Python. It parallels scripted execution, and helps us visualize how Python application programs work. This is the heart of IDLE as well as the foundation for any application programs we build.

This is not the only way or even the most popular way to run Python. It is, however, the simplest and serves as a good place to start.

4.1.1 Starting and Stopping Command-Line Python


Starting and stopping Python varies with your operating system. Generally, all of the variations are nearly identical, and differ only in minor details.

Windows. There are two ways to start interactive Python under Windows.

1. You can run the command tool (cmd.exe) and enter the python command.
2. You can run the Python (Command Line) program under the Python 2.6 menu item on the Start menu.


To exit from Python, enter the end-of-file character sequence, Control-Z and Return.

Mac OS, GNU/Linux and Unix. You will run the Terminal tool. You can enter the command python to start interactive Python.

To exit from Python, enter the end-of-file character, Control-D.
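The platform difference in the end-of-file keystroke can be captured in a few lines. The following is a small sketch, with a made-up function name (end_of_file_hint); it relies on the standard os.name value, which is "nt" on Windows and "posix" on Mac OS, GNU/Linux and Unix.

```python
# Illustrative sketch: which keystroke ends an interactive session here?
import os

def end_of_file_hint(os_name=os.name):
    """Return the end-of-file keystroke for the given os.name value."""
    if os_name == "nt":                 # Windows
        return "Control-Z, then Return"
    return "Control-D"                  # Mac OS, GNU/Linux and Unix

print("To leave interactive Python, type: " + end_of_file_hint())
```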

4.1.2 Entering Python Statements


When we run the Python interpreter (called python , or Python.exe in Windows), we see a greeting like the following:
[slott@localhost trunk]$ env python Python 2.6 (r26:66714, Jun 8 2009, 16:07:26) [GCC 4.4.0 20090506 (Red Hat 4.4.0-4)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>>

When we get the >>> prompt, the Python interpreter is looking for input. We can type any Python statements we want. Each complete statement is executed when it is entered. In this section only, well emphasize the prompts from Python. This can help newbies see the complete cycle of interaction between themselves and the Python interpreter. In the long run well be writing scripts and wont emphasize this level of interaction. Well only cover a few key rules. The rest of the rules are in Syntax Formalities. Rule 1. The essential rule of Python syntax is that a statement must be complete on a single line. There are some exceptions, which well get to below.
>>> 2 + 3 5

This shows Python doing simple integer arithmetic. When you entered 2 + 3 and then hit Return, the Python interpreter evaluated this statement. Since the statement was only an expression, Python printed the results. Well dig into to the various kinds of numbers in Simple Numeric Expressions and Output . For now, its enough to know that you have integers and oating-point numbers that look much like other programming languages. As a side note, integers have two slightly dierent avors fast (but small) and long (but slow). Python prefers to use the fast integers (called int) until your numbers get so huge that it has to switch to long. Arithmetic operators include the usual culprits: + , - , *, / , % and ** standing for addition, subtraction, multiplication, division, modulo (remainder after division) and raising to a power. The usual mathematical rules of operator precedence (multiplys and divides done before adds and subtracts) are in full force, and ( and ) are used to group terms against precedence rules. For example, converting 65 Fahrenheit to Celsius is done as follows:
>>> (65 - 32) * 5 / 9
18
>>> (65.-32)*5/9
18.333333333333332
>>>


Note that the first example used all integer values, and the result was an integer. In the second example, the presence of a float caused all the values to be coerced to float.

Also note that Python has the standard binary-to-decimal precision issue. The actual value computed does not have a precise binary representation, and the default display of the decimal equivalent looks strange. We'll return to this in Numeric Types and Operators.

Incomplete Statements. What happens when an expression statement is obviously incomplete?
>>> ( 65 - 32 ) * 5 /
  File "<stdin>", line 1
    ( 65 - 32 ) * 5 /
                    ^
SyntaxError: invalid syntax

Parentheses. There is an escape clause in the basic rule of one statement, one line. When the parentheses are incomplete, Python will allow the statement to run on to multiple lines. Python will change the prompt to ... to show that the statement is incomplete, and more is expected.
>>> ( 65 - 32
... )*5 / 9
18

Rule 5. It is also possible to continue a long statement using a \ escape at the end of the line.
>>> 5 + 6 *\
... 7
47

This escape allows you to break up an extremely long statement for easy reading.

Indentation. Python relies heavily on indentation to make a program readable. When interacting with Python, we are often typing simple expression statements, which are not indented. Later, when we start typing compound statements, indentation will begin to matter. Here's what happens if we try to indent a simple expression statement.
>>>  5+6
SyntaxError: invalid syntax

Note that some statements are called compound statements; they contain an indented suite of statements. Python will change the prompt to ... and wait until the entire compound statement is entered before it does the evaluation. We'll return to these when it's appropriate in Truth, Comparison and Conditional Processing.

History. When we type an expression statement, Python evaluates it and displays the result. When we type any other kind of statement, Python executes it silently. We'll see more of this, starting in Variables, Assignment and Input.

Small mistakes can be frustrating when typing a long or complex statement. Python has a reasonable command history capability, so you can use the up-arrow key to recover a previous statement.

Generally, you'll prefer to create script files and run the scripts. When debugging a problem, however, interactive mode can be handy for experimenting.

One of the desirable features of well-written Python is that most things can be tested and demonstrated in small code fragments. Often a single line of easy-to-enter code is the desired style for interesting programming features. Many examples in reference manuals and unit test scripts are simply captures of interactive Python sessions.

4.2 The IDLE Development Environment


There are a number of possible integrated development environments (IDEs) for Python. Python includes the IDLE tool, which we'll emphasize. Additionally, you can download or purchase a number of IDEs that support Python. In Other Tools we'll look at other development tools.

Starting and stopping IDLE varies with your operating system. Generally, all of the variations are nearly identical, and differ only in minor details.

4.2.1 IDLE On Windows


There are several ways to start IDLE in Windows.

1. You can use IDLE (Python GUI) from the Python2.6 menu on the Start menu.

2. You can also run IDLE from the command prompt. This requires two configuration settings in Windows.

   Ensure that C:\Python26\Lib\idlelib is on your system PATH. This directory contains IDLE.BAT.

   Ensure that .pyw files are associated with C:\Python26\pythonw.exe. In order to suppress creation of a console window for a GUI application, Python on Windows offers pythonw.exe.

You can quit IDLE by using the Quit menu item under the File menu.

4.2.2 IDLE On Mac OS X


In the Mac OS, if you've done an upgrade, you may find the IDLE program in the Python 2.6 folder in your Applications folder. You can double-click this icon to run IDLE.

If you have the baseline application, you'll have to find IDLE in the directory /System/Library/Frameworks/Python.framework/Versions/Current/bin. Generally, this directory is part of your PATH setting, and you can type the command idle & in a Terminal window to start IDLE.

When you run IDLE by double-clicking the idle icon, you'll notice that two windows are opened: a Python Shell window and a Console window. The Console window isn't used for much.

When you run IDLE from the Terminal window, no console window is opened. The Terminal window is the Python console.

You can quit IDLE by using the Quit menu item under the File menu. You can also quit by using the Quit Idle menu item under the Idle menu.

Since the Macintosh keyboard has a command key, ⌘, as well as a control key, ctrl, there are two keyboard mappings for IDLE. You can use the Configure IDLE... item under the Options menu to select any of the built-in Key Sets. Selecting the IDLE Classic Mac settings may be more comfortable for Mac OS users.


4.2.3 IDLE on GNU/Linux


We'll avoid the GNOME and KDE subtleties. Instead, we'll focus on running IDLE from the Terminal tool. Since the file path is rather long, you'll want to edit your .profile (or .bash_profile) to include the following alias definition.
alias idle='env python /usr/lib/python2.5/idlelib/idle.py &'

This allows you to run IDLE by entering the command idle in a Terminal window. You can quit IDLE by using the Exit menu item under the File menu.

4.2.4 Basic IDLE Operations


Initially, you'll see the following greeting from IDLE.

Python 2.6.3 (r263:75184, Oct 2 2009, 07:56:03)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "copyright", "credits" or "license()" for more information.

    ****************************************************************
    Personal firewall software may warn about the connection IDLE
    makes to its subprocess using this computer's internal loopback
    interface.  This connection is not visible on any external
    interface and no data is sent to or received from the Internet.
    ****************************************************************

IDLE 2.6.3
>>>

You may notice a Help menu. This has the Python Docs menu item, which you can access through the menu or by hitting F1. This will launch Safari to show you the Python documents available on the Internet.

The personal firewall notification is a reminder that IDLE uses Internetworking Protocols (IP) as part of its debugger. If you have a software firewall on your development computer, and the firewall software complains, you can allow the connection.

IDLE has a simple and relatively standard text editor, which does Python syntax highlighting. It also has a Python Shell window which manages an interactive Python session. You will see that the Python Shell window has a Shell and a Debug menu.

When you use the New menu item in the File menu, you'll see a file window, which has a slightly different menu bar. A file window has a name which is a file name (or untitled), and two unique menus, a Run and a Format menu.

Generally, you'll use IDLE in two ways:

You'll enter Python statements in the Python Shell window.

You'll create files, and run those module files using the Run Module item in the Run menu. This option is usually F5.

4.2.5 The Shell Window


The Python Shell window in IDLE presents a >>> prompt. At this prompt, you can enter Python expressions or statements for evaluation. This window has a complete command history, so you can use the up arrow to select a previous statement and make changes.

You can refer back to Command-Line Interaction; those interactions will look and behave the same in IDLE as they do on the command line. The Shell Window is essentially the command-line interface wrapped in a scrolling window. The IDLE interface, however, provides a consistent working environment, which is independent of each operating system's command-line interface.

The Shell and Debug menus provide functions you'll use when developing larger programs. For our first steps with Python, we won't need either of these menus. We'll talk briefly about the functions, but can't really make use of them until we've learned more of the language.

The Shell Menu. The Shell menu is used to restart the Python interpreter, or scroll back through the shell's log to locate the most recent restart. This is important when you are developing a module that is used as a library. When you change that module, you need to reset the shell so that the previous version is forgotten and the new version can be imported into a fresh, empty interpreter.

Generally, being able to work interactively is the best way to develop working programs. It encourages you to create tidy, simple-looking components which you can exercise directly.

The Debug Menu. The Debug menu provides some handy tools for watching how Python executes a program.

The Go to File/Line item is used to locate the source file where an exception was raised. You click on the exception message which contains the file name and select the Go to File/Line menu item, and IDLE will open the file and highlight the selected line.

The Debugger item opens an interactive debugger window that allows you to step through the executing Python program.

The Stack Viewer item opens a window that displays the current Python stack. This shows the arguments and working variables in the Python interpreter. The stack is organized into local and global namespaces, a concept we need to delve into in Variables, Assignment and Input.

The Auto-open Stack Viewer option will open the Stack Viewer automatically when a program raises an unhandled exception. How exceptions are raised and handled is a concept we'll delve into in Exceptions.

4.2.6 The File Windows


Each file window in IDLE is a simple text editor with two additional menus. The Format menu has a series of items for fairly common source text manipulations. The formatting operations include indenting, commenting, handling tabs and formatting text paragraphs.

The Run menu makes it easy to execute the file you are editing.

The Python Shell menu item brings up the Python Shell window.

The Check Module item checks the syntax for your file. If there are any errors, IDLE will highlight the offending line so you can make changes. Additionally, this option will check for inconsistent use of tabs and spaces for indentation.

The Run Module item, F5, runs the entire file. You'll see the output in the Python Shell window.

4.3 Script Mode


In interactive mode, Python displays the results of expressions. In script mode, however, Python doesn't automatically display results.


In order to see output from a Python script, we'll introduce the print statement and the print() function. The print statement is the Python 2.6 legacy construct. The print() function is a new Python 3 construct that will replace the print statement. We'll visit this topic in depth in Seeing Output with the print() Function (or print Statement).

For now, you can use either one. We'll show both. In the future, the print statement will be removed from the language.

4.3.1 The print Statement


The print statement takes a list of values and prints their string representation on the standard output file. The standard output is typically directed to the Terminal window.
print "PI = ", 355.0/113.0

We can have the Python interpreter execute our script files. Application program scripts can be of any size or complexity. For the following examples, we'll create a simple, two-line script, called example1.py.

example1.py
print 65, "F"
print ( 65 - 32 ) * 5 / 9, "C"

4.3.2 The print() function


The print() function takes a list of values and prints their string representation on the standard output file. The standard output is typically directed to the Terminal window. Until Python 3, we have to request the print() function with a special introductory statement: from __future__ import print_function.
from __future__ import print_function
print( "PI = ", 355.0/113.0 )

We can have the Python interpreter execute our script files. Application program scripts can be of any size or complexity. For the following examples, we'll create a simple, two-line script, called example1.py.

example1.py
from __future__ import print_function
print( 65, "F" )
print( ( 65 - 32 ) * 5 / 9, "C" )
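One caveat when moving between the two forms: the / operator itself changes meaning in Python 3, so ( 65 - 32 ) * 5 / 9 displays 18 under Python 2.6 but 18.333... under Python 3. The following is a minimal sketch of one way to get the same answers under either version. It assumes you are willing to request the division feature from __future__ and to spell truncating division as //. The variable names are illustrative only and are not part of example1.py.

```python
from __future__ import print_function, division

fahrenheit = 65  # the temperature used in example1.py

# With the division feature requested, / always produces a float result.
celsius_exact = (fahrenheit - 32) * 5 / 9

# The // operator keeps the whole-number behavior of Python 2's integer /.
celsius_whole = (fahrenheit - 32) * 5 // 9

print(fahrenheit, "F")
print(celsius_whole, "C (whole degrees)")
print(round(celsius_exact, 2), "C (rounded to two places)")
```

The design choice here is to say explicitly which kind of division is wanted, rather than relying on whichever rule the running interpreter applies to /.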

4.3.3 Running a Script


There are several ways we can start the Python interpreter and have it evaluate our script file.


Explicitly from the command line. In this case we'll be running Python and providing the name of the script as an argument. We'll look at this in detail below.

Implicitly from the command line. In this case, we'll either use the GNU/Linux shell comment (sharp-bang marker) or we'll depend on the file association in Windows. This is slightly more complex, and we'll look at this in detail below.

Manually from within IDLE. It's important for newbies to remember that IDLE shouldn't be part of the final delivery of a working application. However, this is a great way to start development of an application program. We won't look at this in detail because it's so easy. Hit F5. MacBook users may have to hit fn and F5.

Running Python scripts from the command line applies to all operating systems. It is the core of delivering final applications. We may add an icon for launching the application, but under the hood, an application program is essentially a command-line start of the Python interpreter.

4.3.4 Explicit Command Line Execution


The simplest way to execute a script is to provide the script file name as a parameter to the python interpreter. In this style, we explicitly name both the interpreter and the input script. Here's an example.
python example1.py

This will provide the example1.py file to the Python interpreter for execution.

4.3.5 Implicit Command-Line Execution


We can streamline the command that starts our application. For POSIX-standard operating systems (GNU/Linux, UNIX and Mac OS), we make the script file itself executable and direct the shell to locate the Python interpreter for us. For Windows users, we associate our script file with the python.exe interpreter. There are one or two steps to this.

1. Associate your file with the Python interpreter. Except for Windows, you make sure the first line is the following: #!/usr/bin/env python. For Windows, you must ensure that .py files are associated with python.exe and .pyw files are associated with pythonw.exe. The whole file will look like this:
#!/usr/bin/env python
print 65, "F"
print ( 65 - 32 ) * 5 / 9, "C"

2. For POSIX-standard operating systems, do a chmod +x example1.py to make the file example1.py executable. You only do this once, typically the first time you try to run the file. For Windows, you don't need to do this.

Now you can run a script in most GNU/Linux environments by saying:
./example1.py


4.3.6 Windows Configuration


Windows users will need to be sure that python.exe is on their PATH. This is done with the System control panel. Click on the Advanced tab. Click on the Environment Variables... button. Click on the System variables Path line, and click the Edit... button. This will often have a long list of items, sometimes starting with %SystemRoot%. At the end of this list, add ";" and the directory location of Python.exe. On my machine, I put it in C:\Python26.

For Windows programmers, the Windows command interpreter uses the last letters of the file name to associate a file with an interpreter. You can have Windows run the python.exe program whenever you double-click a .py file. This is done with the Folder Options control panel. The File Types tab allows you to pair a file type with a program that processes the file.

4.3.7 GNU/Linux Configuration


We have to be sure that the Python interpreter is in the value of the PATH that our shell uses. We can't delve into the details of each of the available UNIX shells. However, the general rule is that the person who administers your POSIX computer should have installed Python and updated /etc/profile to make Python available to all users. If, for some reason, that didn't get done, you can update your own .profile to add Python to your PATH variable.

The Sharp-Bang (shebang) Comment

The #! technique depends on the way all of the POSIX shells handle scripting languages. When you enter a command that is the name of a file, the shell must first check the file for the x (execute) mode; this was the mode you added with chmod +x. When execute mode is true, the shell must then check the first few bytes to see what kind of file it is. The first few bytes are termed the magic number; deep in the bowels of GNU/Linux there is a database that shows what the magic number means, and how to work with the various kinds of files. Some files are binary executables, and the operating system handles these directly.

When an executable file's content begins with #!, it is a script file. The rest of the first line names the program that will interpret the script. In this case, we asked the env program to find the python interpreter. The shell finds the named program and runs it automatically, passing the name of the script file as the last argument to the interpreter it found.

The very cool part of this trick is that #! is a comment to Python. This first line is, in effect, directed at the shell, and ignored by Python. The shell glances at it to see what the language is, and Python studiously ignores it, since it was intended for the shell.
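The way the first line is turned into a command can be imitated in a few lines of Python. This is only a rough sketch of the idea described above, not the real mechanism: the function name interpreter_from_first_line is hypothetical, and splitting the line on whitespace is a simplifying assumption (real systems have their own rules for anything that follows the interpreter path).

```python
from __future__ import print_function


def interpreter_from_first_line(first_line):
    """Return the command words named by a #! line, or None if there is none.

    A loose imitation of the check described in the text; the name and
    the whitespace splitting are illustrative assumptions.
    """
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].split()


# The text of example1.py, held in a string so that no file is needed.
script_text = '#!/usr/bin/env python\nprint 65, "F"\n'
first_line = script_text.splitlines()[0]

command = interpreter_from_first_line(first_line)

# The script's own file name is passed as the last argument.
command_line = command + ["example1.py"]
print(command_line)
```

Under these assumptions, running ./example1.py amounts to running /usr/bin/env python example1.py, which is why the explicit and implicit forms behave the same way.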

4.3.8 Another Script Example


Throughout the rest of this book, we're going to use this script processing mode as the standard way to run Python programs. Many of the examples will be shown as though a file was sent to the interpreter.

For debugging and testing, it is sometimes useful to import the program definitions, and do some manipulations interactively. We'll touch on this in Hacking Mode.

Here's a second example. We'll create a new file and write another small Python program. We'll call it example2.py.


example2.py
#!/usr/bin/env python
"""Compute the odds of spinning red (or black) six times in a row
on an American roulette wheel.
"""
print (18.0/38.0)**6

This is a one-line Python program with a two-line module document string. That's a good ratio to strive for.

After we finish editing, we mark this as executable using chmod +x example2.py. Since this is a property of the file, this remains true no matter how many times we edit, copy or rename the file.

When we run this, we see the following.
$ ./example2.py
0.0112962280375

This says that spinning six reds in a row has a probability of about one in eighty-nine.
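The "one in eighty-nine" figure is the reciprocal of the probability that example2.py prints. The following sketch spells out that arithmetic. The variable names are illustrative, and the print() function is used so the script runs the same way under Python 2.6 and Python 3.

```python
from __future__ import print_function

# Probability of red on one spin: 18 red pockets out of 38.
p_red = 18.0 / 38.0

# Six independent spins in a row.
p_six_reds = p_red ** 6

# The "one in N" form is the reciprocal of the probability.
one_in = 1.0 / p_six_reds

print(p_six_reds)
print("about one in", round(one_in, 1))
```

The reciprocal comes out a little above 88.5, which rounds to the eighty-nine quoted in the text.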

4.4 Getting Help


Python has two closely-related help modes. One is the general help utility, the other is a help function that provides the documentation on a specific object, module, function or class.

4.4.1 The help() Utility


Help is available through the help() function. If you enter just help() you will enter the online help utility. This help utility allows you to explore the Python documentation. The interaction looks like this:
>>> help
Type help() for interactive help, or help(object) for help about object.
>>> help()

Welcome to Python 2.5!  This is the online help utility.

If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://www.python.org/doc/tut/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To quit this help utility and
return to the interpreter, just type "quit".

To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics".  Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".

help>


Note that the prompt changes from Python's standard >>> to a special help-mode prompt of help>. When you enter quit, you exit the help system and go back to Python's ordinary prompt.

To start, enter modules, keywords or topics to see the variety of information available.

4.4.2 Help on a specific topic


If you enter help( object ) for some object, you will be given help on that specific object. This help is displayed using a help viewer. You'll enter something like this:
>>> help("EXPRESSIONS")

You'll get a page of output, ending with a special prompt from the program that's helping to display the help messages. The prompt varies: Mac OS and GNU/Linux will show one prompt, Windows will show another.

Mac OS and GNU/Linux. In standard OS's, you're interacting with a program named less; it will prompt you with : for all but the last page of your document. For the last page it will prompt you with (END). This program is very sophisticated. The four most important commands you need to know are the following.

q  Quit the less help viewer.
h  Get help on all the commands which are available.
space  Enter a space to see the next page.
b  Go back one page.

Windows. In Windows, you're interacting with a program named more; it will prompt you with -- More --. The three important commands you'll need to know are the following.

q  Quit the more help viewer.
h  Get help on all the commands which are available.
space  Enter a space to see the next page.
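Much of what help( object ) displays comes from the document string stored on the object itself. The sketch below defines a small function with a document string and then prints that string directly. The function to_celsius is a made-up example, not something from the Python library.

```python
from __future__ import print_function


def to_celsius(fahrenheit):
    """Convert a Fahrenheit temperature to Celsius."""
    return (fahrenheit - 32) * 5.0 / 9.0


# help(to_celsius) would page through this text interactively;
# the text itself is stored on the object as __doc__.
print(to_celsius.__doc__)
```

Printing __doc__ avoids the help viewer entirely, which can be handy inside a script where there is nobody to answer the viewer's prompt.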

4.5 Syntax Formalities


What is a Statement?
Informally, we've seen that simple Python statements must be complete on a single line. As we will see in following chapters, compound statements are built from simple and compound statements.

Fundamentally, Python has a simple equivalence between the lexical line structure and the statements in a Python program. This forces us to write readable programs with one statement per line. There are nine formal rules for the lexical structure of Python.

1. Simple statements must be complete on a single Logical Line. Starting in Truth, Comparison and Conditional Processing we'll look at compound statements, which have indented suites of statements, and which span multiple Logical Lines. The rest of these rules will define how Logical Lines are built from Physical Lines through a few Line Joining rules.


2. Physical Lines are defined by the platform; they'll end in the standard \n or the Windows ASCII CR LF sequence (\r\n).

3. Comments start with the # character outside a quoted string; comments end at the end of the physical line. These are not part of a statement; they may occur on a line by themselves or at the end of a statement.

4. Coding-Scheme Comments. Special comments that are used by VIM or EMACS can be included in the first or second line of a Python file. For example, # -*- coding: latin1 -*-

5. Explicit Line Joining. A \ at the end of a physical line joins it to the next physical line to make a logical line. This escapes the usual meaning of the line end sequence. The two- or three-character sequence (\ followed by \n or by \r\n) is treated as a single space.

6. Implicit Line Joining. Expressions with ()'s, []'s or {}'s can be split into multiple physical lines.

7. Blank Lines. When entering statements interactively, an extra blank line is treated as the end of an indented block in a compound statement. Otherwise, blank lines have no significance.

8. Indentation. The embedded suite of statements in a compound statement must be indented by a consistent number of spaces or tabs. When entering statements interactively or in an editor that knows Python syntax (like IDLE), the indentation will happen automatically; you will outdent by typing a single backspace. When using another text editor, you will be most successful if you configure your editor to use four spaces in place of a tab. This gives your programs a consistent look and makes them portable among a wide variety of editors.

9. Whitespace at the beginning of a line is part of indentation, and is significant. Whitespace elsewhere within a line is not significant. Feel free to space things out so that they read more like English and less like computer-ese.
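Several of these rules can be seen together in one short script. The following is a small illustrative sketch, not an example from the text. The variable names are made up, and it uses assignment and an if statement, which are covered in later chapters.

```python
from __future__ import print_function

# Rule 3: a comment runs from # to the end of the physical line.
celsius = (65 - 32) * 5.0 / 9.0  # a comment may follow a statement

# Rule 5, explicit line joining: a trailing \ continues the logical line.
total = 5 + 6 * \
    7

# Rule 6, implicit line joining: an open (, [ or { continues the line.
temperatures = [
    32,
    65,
    212,
]

# Rule 8: the suite of a compound statement is indented consistently.
if total == 47:
    print("total is", total)
    print("there are", len(temperatures), "temperatures")
```

Implicit joining inside brackets is usually the more readable choice for long expressions, since a trailing \ is easy to break by accidentally leaving a space after it.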

4.6 Exercises
4.6.1 Command-Line Exercises
1. Simple Commands. Enter the following one-line commands to Python:

copyright
license
credits
help

2. Simple Expressions. Enter one-line commands to Python to compute the following:

12345 + 23456
98765 - 12345
128 * 256
22 / 7
355 / 113
(18-32)*5/9
-10*9/5+32


4.6.2 IDLE Exercises


1. Create an Exercises Directory. Create a directory (or folder) for keeping your various exercise scripts. Be sure it is not in the same directory in which you installed Python.

2. Use IDLE's Shell Window. Start IDLE. Refer back to the exercises in Command-Line Interaction. Run these exercises using IDLE.

3. Use the IDLE File Window. Start IDLE. Note the version number. Use New Window under the File menu to create a simple file. The file should have the following content.
""" My First File """
print __doc__

Save this file in your exercises directory; be sure the name ends with .py. Run your file with the Run Module menu item in the Run menu, usually F5.

4.6.3 Script Exercises


1. Print Script. Create and run a Python file with commands like the following examples:

print 12345 + 23456
print 98765 - 12345
print 128 * 256
print 22 / 7

Or, use the print function as follows.


from __future__ import print_function
print(12345 + 23456)
print(98765 - 12345)
print(128 * 256)
print(22 / 7)

2. Another Simple Print Script. Create and run a Python file with commands like the following examples:
print "one red", 18.0/38.0
print "two reds in a row", (18.0/38.0)**2

Or, use the print function as follows.


from __future__ import print_function
print("one red", 18.0/38.0)
print("two reds in a row", (18.0/38.0)**2)

3. Interactive Differences. First, run IDLE (or Python) interactively and enter the following commands: copyright, license, credits. These are special global objects that print interesting things on the interactive Python console.

Create a Python file with the three commands, each one on a separate line: copyright, license, credits. When you run this, it doesn't produce any output, nor does it produce an error.

Now create a Python file with three commands, each on a separate line: print copyright, print license, print credits.


Interestingly, these three global variables have different behavior when used in a script. This is rare. By default, there are just three more variables with this kind of behavior: quit, exit and help.

4. Numeric Types. Compare the results of 22/7 and 22.0/7. Explain the differences in the output.
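One way to picture the behavior in the Interactive Differences exercise: interactive mode displays the repr() of an expression's value, while a script evaluates a bare expression and discards it. The sketch below uses a hypothetical Notice class as a stand-in for objects like copyright. It assumes that such objects work by supplying their own repr() text, and it uses class definitions, which come much later in the book.

```python
from __future__ import print_function


class Notice(object):
    """Hypothetical stand-in for objects such as copyright or credits."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        # Interactive mode displays repr() of an expression's value.
        return self.text


notice = Notice("Type notice() for more information.")

notice                 # script mode: evaluated, then silently discarded
shown = repr(notice)   # the text interactive mode would have displayed
print(notice)          # print shows the text in script mode too
```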

4.7 Other Tools


This section lists some additional tools which are popular ways to create, maintain and execute Python programs. While IDLE is suitable for many purposes, you may prefer an IDE with a different level of sophistication.

4.7.1 Any Platform


Komodo Edit is an IDE that is considerably more sophisticated than IDLE. It is, in a way, too sophisticated for this book. Our focus is on the language, not high-powered IDEs. As with IDLE, this is a tool that runs everywhere, so you can move seamlessly from GNU/Linux to Windows to the Mac OS with a single, powerful tool.

See http://www.komodo.com for more information on ordering and downloading.

4.7.2 Windows
Windows programmers might want to use a tool like Textpad. See http://www.textpad.com for information on ordering and downloading. Be sure to also download the python.syn file from http://www.textpad.com/add-ons which has a number of Python syntax coloring configurations.

To use Textpad, you have two setup steps. First, you'll need to add the Python document class. Second, you'll need to tell Textpad about the Python tool.

The Python Document Class. You need to tell Textpad about the Python document class. Use the Configure menu; the New Document Class... menu item lets you add Python documents to Textpad. Name your new document class Python and click Next. Give your class members named *.py and click Next. Locate your python.syn file and click Next. Check the new Python document class, and click Next if everything looks right to create a new Textpad document class.

The Python Tool. You'll want to add the Python interpreter as a Textpad tool. Use the Configure menu again, this time selecting the Preferences... item. Scroll down the list of preferences on the left and click on Tools. On the right, you'll get a panel with the current set of tools and a prominent Add button on the top right-hand side. Click Add, and select Program... from the menu that appears. You'll get a dialog for locating a file; find the Python.exe file. Click Okay to save this program as a Textpad tool.

You can check this by using the Configure menu and Preferences... item again. Scroll down the list to find Tools. Click the + sign and open the list of tools. Click the Python tool and check the following:

The Command is the exact path to your copy of Python.exe
The Parameters contains $File
The Initial Folder contains $FileDir
The capture output option should be checked

You might also want to turn off the Sound Alert option; this will beep when a program finishes running. I find this makes things a little too noisy for most programs.


4.7.3 Macintosh
Macintosh programmers might want to use a tool like BBEdit. BBEdit can also run the programs, saving the output for you. See http://www.barebones.com for more information on BBEdit.

To use BBEdit, you have two considerations when writing Python programs.

You must be sure to decorate each Python file with the following line: #!/usr/bin/env python. This tells BBEdit that the file should be interpreted by Python. We'll mention this again, when we get to script-writing exercises.

The second thing is to be sure you set the chdir to Script's Folder option when you use the run... item in the #! (shebang) menu. Without this, scripts are run in the root directory, not in the directory that contains your script file.

4.8 Style Notes: Wise Choice of File Names


There is considerable flexibility in the language; two people can arrive at different presentations of Python source. Throughout this book we will present the guidelines for formatting, taken from the Python Enhancement Proposal (PEP) 8, posted on http://python.org/dev/peps/pep-0008/.

We'll include guidelines that will make your programming consistent with the Python modules that are already part of your Python environment. These guidelines should also make your programming look like other third-party programs available from vendors and posted on the Internet.

Python programs are meant to be readable. The language borrows a lot from common mathematical notation and from other programming languages. Many languages (C++ and Java, for instance) don't require any particular formatting; line breaks and indentation become merely conventions; bad-looking, hard-to-read programs are common. On the other hand, Python makes the line breaks and indentation part of the language, forcing you to create programs that are easier on the eyes.

General Notes. We'll touch on many aspects of good Python style as we introduce each piece of Python programming. We haven't seen much Python yet, but we do need some guidance to prevent a few tiny problems that could crop up.

First, Python (like all of Linux) is case sensitive. Some languages are either all uppercase, or insensitive to case. We have worked with programmers who actually find it helpful to use the Caps Lock key on their keyboard to expedite working in an all-upper-case world. Please don't do this. Python should look like English, where lower-case letters predominate.

Second, Python makes use of indentation. Most programmers indent very nicely, and the compiler or interpreter ignores this. Python doesn't ignore it. Indentation is useful for writing clear, meaningful documents, and programs are no different.

Finally, your operating system allows a fairly large number of characters to appear in a file name. Until we start writing modules and packages, we can call our files anything that the operating system will tolerate. Starting in Components, Modules and Packages, however, we'll have to limit ourselves to filenames that use only letters, digits and _'s. There can be just one ending for the filename: .py.

A file name like exercise_1.py is better than the name exercise-1.py. We can run both programs equally well from the command line, but the name with the hyphen limits our ability to write larger and more sophisticated programs.


CHAPTER

FIVE

SIMPLE NUMERIC EXPRESSIONS AND OUTPUT


The print Statement and Numeric Operations
Basic expressions are the most central and useful feature of modern programming languages. To see the results of expressions, we'll use the print statement. This chapter starts out with Seeing Output with the print() Function (or print Statement), which covers the print statement. Numeric Types and Operators covers the basic numeric data types and operators that are integral to writing expressions in Python. Numeric Conversion (or Factory) Functions covers conversions between the various numeric types. Built-In Math Functions covers some of the built-in functions that Python provides.

5.1 Seeing Output with the print() Function (or print Statement)
Before delving into expressions and numbers, we'll look at the print statement. We'll cover just the essential syntax of the print statement; it has some odd syntax quirks that are painful to explain.

Note: Python 3.0

Python 3.0 will replace the irregular print statement with a built-in print() function that is perfectly regular, making it simpler to explain and use. In order to use the print() function instead of the print statement, your script (or IDLE session) must start off with the following.
from __future__ import print_function

This replaces the print statement, with its irregular syntax, with the print() function.

5.1.1 print Statement Syntax Overview


The print statement takes a list of values and, well, prints them. Speaking strictly, it does two things:

1. it converts the objects to strings and
2. puts the characters of those strings on standard output.

Generally, standard output is the console window where Python was started, although there are ways to change this that are beyond the scope of this book. Here's a quick summary of the more important features of print statement syntax. In short, the keyword, print, is followed by a comma-separated list of expressions.
print expression , ...

Note: Syntax Summary

This syntax summary isn't completely correct because it implies that the list of expressions is terminated with a comma. Rather than fuss around with complex syntax diagrams (that's what the Python reference manual is for) we've shown an approximation that is close enough. The , in a print statement is used to separate the various expressions. A , can also be used at the end of the print statement to change the formatting; this is an odd-but-true feature that is unique to print statement syntax. It's hard to capture this subtlety in a single syntax diagram. Further, this is completely solved by using the print() function.

One of the simplest kinds of expressions is a quoted string. You can use either apostrophes (') or quotes (") to surround strings. This gives you some flexibility in your strings. You can put an apostrophe into a quoted string, and you can put quotes into an apostrophed string without the special escapes that some other languages require. The full set of quoting rules and alternatives, however, will have to wait for Strings. For example, the following trivial program prints three strings and two numbers.
print "Hi, Mom", "Isn't it lovely?", 'I said, "Hi".', 42, 91056

Multi-Line Output. Ordinarily, each print statement produces one line of output. You can end the print statement with a trailing , to combine the results of multiple print statements into a single line. Here are two examples.
print "335/113=",
print 335.0/113.0
print "Hi, Mom", "Isn't it lovely?",
print 'I said, "Hi".', 42, 91056

Since the first print statement ends with a , it does not produce a complete line of output. The second print statement finishes the line of output.

Redirecting Output. The print statement's output goes to the operating system's standard output file. How do we send output to the system's standard error file? This involves some more advanced concepts, so we'll introduce it with a two-part recipe that we need to look at in more depth. We'll revisit these topics in Components, Modules and Packages.

First, you'll need access to the standard error object; you get this via the following statement.
import sys

Second, there is an unusual piece of syntax called a chevron print which can be used to redirect output to standard error.

print >> file , expression , ...


Two common files are sys.stdout and sys.stderr. We'll return to files in Files. Here is an example of a small script which produces messages on both stderr and stdout.

mixedout.py
#!/usr/bin/env python
"""Mixed output in stdout and stderr."""
import sys
print >>sys.stderr, "This is an error message"
print "This is stdout"
print >>sys.stdout, "This is also stdout"

When you run this inside IDLE, you'll notice that the stderr is colored red, where the stdout is colored black. You'll also notice that the order of the output in IDLE doesn't match the order in our program. Most POSIX operating systems buffer stdout, but do not buffer stderr. Consequently, stdout messages don't appear until the buffer is full, or the program exits.

5.1.2 The print() Function


Python 3 replaces the relatively complex and irregular print statement with a simple and regular print() function. In Python 2.6 we can use this new function by doing the following:
from __future__ import print_function

This statement must be one of the first executable statements in your script file. It makes a small but profound change to Python syntax. The Python processor must be notified of this intended change up front. This provides us with the following:

print([object, ...], [sep=' '], [end='\n'], [file=sys.stdout])
This will convert each object to a string, and then write the characters on the given file. The separator between objects is by default a single space. Setting a value for sep will set a different separator. The end-of-line character is by default a single newline. Setting a value for end will set a different end-of-line character. To change output files, provide a value for file.

Multiline Output. To create multiline output, do the following:
from __future__ import print_function
print( "335/113=", end="" )
print( 335.0/113.0 )
print( "Hi, Mom", "Isn't it lovely?", end="" )
print( 'I said, "Hi".', 42, 91056 )
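As a small illustration of the sep and end parameters, the sketch below prints values three ways. It assumes print() is already a function (Python 3, or Python 2.6 with the from __future__ import as the first statement of the script). The helper name rendered() is hypothetical; it only models the joining rule and is not part of Python.

```python
# Assumes print() is a function: Python 3, or Python 2.6 with
# "from __future__ import print_function" as the script's first statement.

def rendered(values, sep=" ", end="\n"):
    """Hypothetical helper: the text print() writes for these values."""
    return sep.join(str(v) for v in values) + end

# Default separator (one space) and default end (one newline).
print("Hi, Mom", 42, 91056)

# A different separator between the objects.
print(2010, 4, 20, sep="-")

# A different end string keeps the next print() on the same line.
print("355/113 as an integer is", end=" ")
print(3)
```

The helper makes the rule explicit: each object is converted with str(), the pieces are joined with sep, and end is appended once.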

Redirecting Output. The print() function's output goes to the operating system's standard output file. How do we send output to the system's standard error file? This involves some more advanced concepts, so we'll introduce it with a two-part recipe that we need to look at in more depth. We'll revisit these topics in Components, Modules and Packages.

First, you'll need access to the standard error object. Second, you'll provide the file option to the print() function.
from __future__ import print_function
import sys
print( "This is an error message", file=sys.stderr )
print( "This is stdout" )
print( "This is also stdout", file=sys.stdout )
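The file option accepts any object that has a write() method, not only sys.stdout and sys.stderr. The following sketch, which again assumes print() is a function, sends one message to a small collecting object. The class name Collector is hypothetical and exists only for this illustration.

```python
# Assumes print() is a function (Python 3, or Python 2.6 with the
# __future__ import at the top of the script).
import sys

class Collector(object):
    """Hypothetical file-like object: it only needs a write() method."""
    def __init__(self):
        self.pieces = []
    def write(self, text):
        self.pieces.append(text)
    def text(self):
        return "".join(self.pieces)

log = Collector()
print("This is an error message", file=sys.stderr)  # standard error
print("This is stdout")                             # standard output
print("captured", 42, file=log)                     # our own object

captured = log.text()   # the separator and newline are included
```

Capturing output this way can be handy for checking what a script would have printed without reading the console.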

Adding Features. You can, with some care, add features to the print() function. When we look at function definitions, we'll look at how we can override the built-in print() function to add our own unique features.

5.1.3 print Notes and Hints


A program produces a number of kinds of output. The print() function (or print statement) is a handy jumping-off point. Generally, we'll replace this with more advanced techniques.

Final Reports. Our desktop applications may produce text-based report files. These are often done with print statements.

PDF or other format output files. A desktop application which produces PDF or other format files will need to use additional libraries to produce PDF files. For example, ReportLab offers PDF-production libraries. These applications won't make extensive use of print statements.

Error messages and processing logs. Logs and errors are often directed to the standard error file. You won't often use the print statement for this, but use the logging library.

Debugging messages. Debugging messages are often handled by the logging library.

The print statement (or print() function) is a very basic tool for debugging a complex Python program. Feel free to use print statements heavily to create a clear picture of what a program is actually doing. Ultimately, you are likely to replace print statements with other, more sophisticated methods.

5.2 Numeric Types and Operators


Python provides four built-in types of numbers: plain integers, long integers, floating point numbers and complex numbers.

Numbers all have several things in common. Principally, the standard arithmetic operators of +, -, *, /, % and ** are all available for all of these numeric types. Additionally, numbers can be compared, using comparison operators that we'll look at in Comparisons. Also, numbers can be coerced from one type to another.

More sophisticated math is separated into the math module, which we will cover later. However, a few advanced math functions are an integral part of Python, including abs() and pow().
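To make the idea of coercion concrete, here is a minimal sketch of mixed-type arithmetic. The variable names are illustrative only. It avoids the / operator, whose behavior depends on the Python version, and it is intended to behave the same under Python 2.6 and Python 3.

```python
# Mixed-type arithmetic: the result takes the wider of the two types.
whole = 7
real = 2.5
imaginary = 3j

total = whole + real            # int + float gives a float
shifted = real + imaginary      # float + complex gives a complex

print(type(total).__name__)     # float
print(type(shifted).__name__)   # complex

# The same operators apply across the numeric types.
remainder = 7 % 3               # remainder after division
power = 2 ** 10                 # exponentiation
```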

5.2.1 Integers
Plain integers are at least 32 bits long. The range is at least -2,147,483,648 to 2,147,483,647 (approximately ±2 billion).

Python represents integers as strings of decimal digits. A number does not include any punctuation, and cannot begin with a leading zero (0). Leading zeros are used for base 8 and base 16 numbers. We'll look at this below.

>>> 255+100
355
>>> 397-42
355
>>> 71*5
355
>>> 355/113
3

While most features of Python correspond with common expectations from mathematics and other programming languages, the division operator, /, poses certain problems. Specifically, the distinction between the algorithm and the data representation needs to be made explicit. Division can mean either exact floating-point results or integer results. Mathematicians have evolved a number of ways of describing precisely what they mean when discussing division. We need similar expressive power in Python. We'll look at more details of division operators in Division Operators.

Binary, Octal and Hexadecimal. For historical reasons, Python supports programming in octal and hexadecimal. I like to think that the early days of computing were dominated by people with 8 or 16 fingers.

A number with a leading 0 (zero) is octal, base 8, and uses the digits 0 to 7. 0123 is octal and equal to 83 decimal.

A number with a leading 0x or 0X is hexadecimal, base 16, and uses the digits 0 through 9, plus a, A, b, B, c, C, d, D, e, E, f, and F. 0x2BC8 is hexadecimal and equal to 11208.

A number with a leading 0b or 0B is binary, base 2, and uses digits 0 and 1.

Important: Leading Zeroes

When using Python 2.6, watch for leading zeros in numbers. If you simply transcribe programs from other languages, they may use leading zeros on decimal numbers, which Python 2.6 will read as octal.

Important: Python 3

In Python 3, the octal syntax will change. Octal constants will begin with 0o to match hexadecimal constants which begin with 0x. 0o123 will be octal and equal to 83 decimal.
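The base rules above can be checked with a short sketch. Because the octal literal syntax differs between Python 2.6 and Python 3, this example parses octal digits with int(text, base) rather than writing an octal literal; the variable names are illustrative only.

```python
# The same value written as decimal, hexadecimal and binary literals.
decimal_value = 11208
hex_value = 0x2BC8
binary_value = 0b10101111001000

# int(text, base) parses digits in a given base and sidesteps the
# version-specific octal literal syntax (0123 versus 0o123).
octal_83 = int("123", 8)       # 1*64 + 2*8 + 3
hex_11208 = int("2BC8", 16)    # 2*4096 + 11*256 + 12*16 + 8

print(octal_83)                # 83
print(hex_11208)               # 11208
```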

5.2.2 Long Integers


One of the useful data types that Python offers is the long integer. Unlike ordinary integers with a limited range, long integers have arbitrary length; they can have as many digits as necessary to represent an exact answer. However, these will operate more slowly than plain integers. Long integers end in L or l. Upper case L is preferred, since the lower-case l looks too much like the digit 1. Python is graceful about converting to long integers when it is necessary.

Important: Python 3

Python 3 will not require the trailing L. It will silently deduce if you need an integer or a long integer.

How many different combinations of 32 bits are there? The answer is there are $2^{32}$; 2**32 in Python. The answer is too large for ordinary integers, and we get the result as a long integer.

>>> 2**32
4294967296L
>>> 2**64
18446744073709551616L

There are about 4 billion ways to arrange 32 bits. How many bits in 1K of memory? $1024 \times 8$ bits. How many combinations of bits are possible in 1K of memory? $2^{1024 \times 8}$.
print 2L**(1024*8)

I won't attempt to reproduce the output from Python. It has 2,467 digits. There are a lot of different combinations of bits in only 1K of memory. The computer I'm using has $512 \times 1024$K bytes of memory; there are a lot of combinations of bits available in that memory.

Python will silently convert between ultra-fast integers and slow-but-large long integers. You can force a conversion using the int() or long() factory functions.
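The digit count quoted above can be confirmed without printing the whole number. This sketch relies only on ordinary integer arithmetic and str(), so it should run the same way on Python 2.6 (where the value is a long integer) and on Python 3; the variable names are illustrative.

```python
# Counting the bit patterns in 1K (1024 bytes) of memory.
bits = 1024 * 8
combinations = 2 ** bits          # far beyond a 32-bit plain integer

digits = len(str(combinations))   # number of decimal digits
print(digits)                     # 2467

# Arithmetic stays exact, however large the values get.
still_exact = (combinations - 1) + 1 == combinations
```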

5.2.3 Floating-Point Numbers


Python offers floating-point numbers, often implemented as double-precision numbers, typically using 64 bits. Floating-point numbers are written in two forms: a simple string of digits that includes a decimal point, and a more complex form that includes an explicit exponent.

.0625
0.0625
6.25E-2
625E-4

The last two examples are based on scientific notation, where numbers are written as a mantissa and an exponent. The E (or e) separates the mantissa from the exponent; powers of 10 are used with the exponent, giving us numbers that look like this: $6.25 \times 10^{-2}$ and $625 \times 10^{-4}$.

The last example isn't properly normalized, since the mantissa isn't between 1 and 10.

Generally, a number, n, is some mantissa, g, and an exponent of c. For human consumption, we use a base of 10. Internally, most computers use a base of 2, not 10.

$$n = g \times 10^{c}$$
$$n = h \times 2^{d}$$

This difference in the mantissa leads to slight errors in converting certain values, which are exact in base 10, to approximations in base 2. For example, 1/5th doesn't have a precise representation. This isn't generally a problem because we have string formatting operations which can make this tiny representation error invisible to users.

>>> 1./5.
0.20000000000000001
>>> .2
0.20000000000000001
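The representation error, and the way string formatting hides it, can be sketched as follows. The % formatting operator is used here because it works the same way in Python 2.6 and Python 3; the variable names and the tolerance of 1e-9 are illustrative choices, not fixed rules.

```python
# 1/5 has no exact base-2 representation, so a tiny error is stored.
one_fifth = 1.0 / 5.0
many_digits = "%.17f" % one_fifth    # shows the stored approximation
few_digits = "%.2f" % one_fifth      # formatting hides the error

print(many_digits)                   # 0.20000000000000001
print(few_digits)                    # 0.20

# The same kind of error makes this sum differ slightly from 0.3,
# so comparisons are usually made with a small tolerance instead.
exactly_equal = (0.1 + 0.2 == 0.3)
close_enough = abs((0.1 + 0.2) - 0.3) < 1e-9
```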


5.2.4 Complex Numbers


Besides plain integers, long integers and floating point numbers, Python also provides for imaginary and complex numbers. These use the European convention of ending with J or j. People who don't use complex numbers should skip this section.

3.14J is an imaginary number = $3.14 \times \sqrt{-1}$.

A complex number is created by adding a real and an imaginary number: 2 + 14j. Note that Python always prints these in ()'s; for example (2+14j).

The usual rules of complex math work perfectly with these numbers.

>>> (2+3j)*(4+5j)
(-7+22j)

Python even includes the complex conjugate operation on a complex number. This operation follows the complex number separated by a dot (.). This notation is used because the conjugate is treated like a method function of a complex number object (we'll return to this method and object terminology in Classes). Put the complex number in ()'s so that the method applies to the whole number, not just to the imaginary part. For example:

>>> (3+2j).conjugate()
(3-2j)
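A few more complex-number operations, sketched under the assumption that only built-in features are used. The attribute names real and imag and the abs() function are standard Python; the variable names are illustrative.

```python
# Multiplication follows the usual rule with j*j == -1.
product = (2 + 3j) * (4 + 5j)     # (8 - 15) + (10 + 12)j

z = 3 + 2j
mirror = z.conjugate()            # same real part, negated imaginary part
parts = (z.real, z.imag)          # both parts are stored as floats

# abs() of a complex number is its distance from zero.
magnitude = abs(3 + 4j)

print(product)                    # (-7+22j)
print(mirror)                     # (3-2j)
```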

5.3 Numeric Conversion (or Factory) Functions


We can convert a number from one type to another. A conversion may involve a loss of precision because we've reduced the number of bits available. A conversion may also add a false sense of precision by adding bits which don't have any real meaning.

We'll call these factory functions because they are a factory for creating new objects from other objects. The idea of factory function is a very general one, and these are just the first of many examples of this pattern.

5.3.1 Numeric Factory Function Definitions


There are a number of conversions from one numeric type to another.

int(x)
Generates an integer from the object x. If x is a floating point number, digits to the right of the decimal point are truncated as part of creating an integer. If the floating point number is more than about 10 digits, a long integer object is created to retain the precision. If x is a long integer that is too large to be represented as an integer, there's no conversion. Complex values can't be turned into integers directly. If x is a string, the string is parsed to create an integer value. It must be a string of digits with an optional sign (+ or -).

>>> int("1243")
1243
>>> int(3.14159)
3


float(x)
Generates a float from object x. If x is an integer or long integer, a floating point number is created. Note that long integers can have a large number of digits, but floating point numbers only have approximately 16 digits; there can be some loss of precision. Complex values can't be turned into floating point numbers directly. If x is a string, the string is parsed to create a float value. It must be a string of digits with an optional sign (+ or -). The digits can have a single decimal point (.). Also, a string can be in scientific notation and include e or E followed by the exponent as a simple signed integer value.

>>> float(23)
23.0
>>> float("6.02E24")
6.0200000000000004e+24
>>> float(22)/7
3.14285714286

long(x)
Generates a long integer from x. If x is a floating point number, digits to the right of the decimal point are truncated as part of creating a long integer.

>>> long(2)
2L
>>> long(6.02E23)
601999999999999995805696L
>>> long(2)**64
18446744073709551616L

complex(real, [imag])
Generates a complex number from real and imag. If the imaginary part is omitted, it is 0.0. Complex is not as simple as the others. A complex number has two parts, real and imaginary. Conversion to complex typically involves two parameters.

>>> complex(3,2)
(3+2j)
>>> complex(4)
(4+0j)
>>> complex("3+4j")
(3+4j)

Note that the second parameter, with the imaginary part of the number, is optional. This leads to a number of different ways to call this function. In the example above, we used three variations: two numeric parameters, one numeric parameter and one string parameter.
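The truncation and precision remarks above can be seen in a short sketch. It leaves out long(), which exists only in Python 2, so that the same lines are intended to run under Python 2.6 and Python 3; the variable names are illustrative.

```python
# int() drops the digits to the right of the decimal point,
# which means truncation toward zero for negative values too.
positive_case = int(3.99)
negative_case = int(-3.99)
parsed = int("-1243")             # a string of digits with a sign

# float() keeps roughly 16 significant digits, so two neighboring
# large integers can convert to the same floating point value.
big = 2 ** 64
precision_lost = float(big) == float(big + 1)

# complex() accepts two numbers, one number, or a string.
from_pair = complex(3, 4)
from_text = complex("3+4j")

print(positive_case)              # 3
print(negative_case)              # -3
```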

5.4 Built-In Math Functions


Python has a number of built-in functions, which are an integral part of the Python interpreter. We can't look at all of them because many are related to features of the language that we haven't addressed yet.

One of the built-in mathematical functions will have to wait for complete coverage until we've introduced the more complex data types, specifically tuples, in Tuples. The divmod() function returns a tuple object with the quotient and remainder in division.


5.4.1 Built-In Math Functions


The bulk of the math functions are in a separate module, called math, which we will cover in The math Module. The formal definitions of mathematical built-in functions are provided below.

abs(number)
Return the absolute value of the argument, |number|.

pow(x, y, [z])
Raise x to the y power, $x^y$. If z is present, this is done modulo z, $x^y \bmod z$.

round(number, [ndigits])
Round number to ndigits beyond the decimal point. If the ndigits parameter is given, this is the number of decimal places to round to. If ndigits is positive, this is decimal places to the right of the decimal point. If ndigits is negative, this is the number of places to the left of the decimal point.

Examples:

>>> print round(678.456,2)
678.46
>>> print round(678.456,-1)
680.0
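A brief sketch of these built-ins, including the three-argument form of pow() and the divmod() function mentioned earlier. The values are illustrative, and rounding cases that fall exactly on a half are avoided because their treatment differs between Python versions.

```python
distance = abs(-42)                  # absolute value

plain = pow(2, 10)                   # 2 to the 10th power
modular = pow(2, 10, 1000)           # the same power, modulo 1000
agrees = (2 ** 10) % 1000 == modular

cents = round(678.456, 2)            # two places right of the point
tens = round(678.456, -1)            # one place left of the point

# divmod() returns the quotient and remainder together.
quotient, remainder = divmod(355, 113)

print(modular)                       # 24
print(tens)                          # 680.0
```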

5.4.2 String Conversion Functions


The string conversion functions provide alternate representations for numeric values. This list expands on the function definitions in Numeric Conversion (or Factory) Functions.

hex(number)
Create a hexadecimal string representation of number. A leading 0x is placed on the string as a reminder that this is hexadecimal.

>>> hex(684)
'0x2ac'

oct(number)
Create an octal string representation of number. A leading 0 is placed on the string as a reminder that this is octal, not decimal.

>>> oct(509)
'0775'

bin(number)
Create a binary representation of number. A leading 0b is placed on the string as a reminder that this is binary and not decimal.

>>> bin(509)
'0b111111101'

int(string, [base])
Generates an integer from string. If base is supplied, string must be a string in the given base. If base is omitted, string must be decimal.

>>> int( '0775', 8 )
509
>>> int( '0x2ac', 16 )
684
>>> int( '101101101101', 2 )
2925

The int() function has two forms. The int(x) form converts a decimal string, x, to an integer. For example, int('25') is 25. The int(x,b) form converts a string, x, in base b to an integer. For example, int('25',8) is 21.

str(object)
Generate a string representation of the given object. This is a readable version of the value.

repr(object)
Generate a string representation of the given object. Generally, this is a Python expression that can reconstruct the value; it may be rather long and complex. For the numeric examples we've seen so far, the value of repr() is generally the same as the value of str().

The str() and repr() functions convert any Python object to a string. The str() version is typically more readable, where the repr() version is an internalized representation. For most garden-variety numeric values, there is no difference. For the more complex data types, however, the results of repr() and str() can be very different. For classes you write (see Classes), your class definition must provide these string representation functions.
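The conversions in this section are reversible, which the following sketch checks as a round trip. Because the prefix produced by oct() differs between Python 2.6 and Python 3, the octal digits here are produced with the "%o" formatting code instead; that substitution, and the variable names, are choices made for this illustration.

```python
as_hex = hex(684)                    # '0x2ac'
as_bin = bin(509)                    # '0b111111101'
as_oct_digits = "%o" % 509           # '775', octal digits with no prefix

# int(text, base) reverses each conversion.
round_trip = (int(as_hex, 16), int(as_bin, 2), int(as_oct_digits, 8))

# str() aims for readability; repr() aims for a Python expression.
text = "Hi, Mom"
readable = str(text)                 # the characters themselves
expression = repr(text)              # the characters plus quotes

print(readable)                      # Hi, Mom
print(expression)                    # 'Hi, Mom'
```

A string is used for the str() and repr() comparison because, as noted above, the two functions give the same result for most simple numeric values.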

5.4.3 Collection Functions


There are several built-in functions which operate on simple collections of data elements.

max(value, ...)
Return the largest value.

>>> max(1,2,3)
3

min(value, ...)
Return the smallest value.

>>> min(1,2,3)
1

Additionally, there are several other collection-handling functions, including any(), all() and sum(). These will have to wait until we can look at collection objects in Data Structures.

5.5 Expression Exercises


There are two sets of exercises. The first section, Basic Output and Functions, covers simpler exercises to reinforce Python basics. The second section, Numeric Types and Expressions, covers more complex numeric expressions.


5.5.1 Basic Output and Functions


1. Print Expression Results. In Command-Line Exercises, we entered some simple expressions into the Python interpreter. Change these simple expressions into print statements. Be sure to print a label or identifier with each answer. Here's a sample.
print "9-1's * 9-1's = ", 111111111*111111111

Here's an example using the print() function.

from __future__ import print_function
print( "9-1's * 9-1's = ", 111111111*111111111 )

2. Evaluate and Print Expressions. Write short scripts to print the results of the following expressions. In most places, changing integers to floating point produces a notably different result. For example (296/167)**2 and (296.0/167.0)**2. Use long as well as complex types to see the differences.

355/113 * ( 1 - 0.0003/3522 )
22/17 + 37/47 + 88/83
(553/312)**2

3. Numeric Conversion. Write a print statement to print the mixed fraction $3\frac{5}{8}$ as a floating point number and as an integer.

4. Numeric Truncation. Write a print statement to compute (22.0/7.0)-int(22.0/7.0). What is this value? Compare it with 22.0/7.0. What general principle does this illustrate?

5. Illegal Conversions. Try illegal conversions like int('A') or int( 3+4j ). Why are exceptions raised? Why can't a simple default value like zero or None be used instead?

6. Evaluate and Print Built-in Math Functions. Write short scripts to print the results of the following expressions.

pow( 2143/22, 0.25 )
pow(553/312,2)
pow( long(3), 64 )
long( pow(float(3), 64) )

Why do the last two produce different results? What does the difference between the two results tell us about the number of digits of precision in floating-point numbers?

7. Evaluate and Print Built-in Conversion Functions. Here are some more expressions for which you can print the results.

hex( 1234 )
int( hex(1234), 16 )
long( 0xab )
int( 0xab )
int( 0xab, 16 )
int( 'ab', 16 )
cmp( 2, 3 )


5.5.2 Numeric Types and Expressions


1. Stock Value. Compute value from number of shares × purchase price for a stock. Once upon a time, stock prices were quoted in fractions of a dollar, instead of dollars and cents. Create a simple print statement for 125 shares purchased at $3\frac{3}{8}$. Create a second simple print statement for 150 shares purchased at $2\frac{1}{4}$ plus an additional 75 shares purchased at $1\frac{7}{8}$. Don't manually convert $\frac{1}{4}$ to 0.25. Use a complete expression of the form 2+1/4.0, just to get more practice writing expressions.

2. Convert Between °C and °F. Convert temperatures from one system to another. Conversion Constants: 32 °F = 0 °C, 212 °F = 100 °C. The following two formulae convert between °C (Celsius) and °F (Fahrenheit).

$$F = 32 + \frac{212 - 32}{100} \times C$$

$$C = (F - 32) \times \frac{100}{212 - 32}$$

Create a print statement to convert 18 °C to °F. Create a print statement to convert -4 °F to °C.

3. Periodic Payment on a Loan. How much does a loan really cost? Here are three versions of the standard mortgage payment calculation, with m = payment, p = principal due, r = interest rate, n = number of payments. Don't be surprised by the sign of the results; they're opposite the sign of the principal. With a positive principal, you get negative numbers; you are paying down a principal.

$$m = p \times \left( \frac{r}{1 - (1 + r)^{-n}} \right)$$

Mortgage with payments due at the end of each period:

$$m = \frac{-rp(r + 1)^{n}}{(r + 1)^{n} - 1}$$

Mortgage with payments due at the beginning of each period:

$$m = \frac{-rp(r + 1)^{n}}{[(r + 1)^{n} - 1](r + 1)}$$

Use any of these forms to compute the mortgage payment, m, due with a principal, p, of $110,000, an interest rate, r, of 7.25% annually, and payments, n, of 30 years. Note that banks actually process things monthly. So you'll have to divide the interest rate by 12 and multiply the number of payments by 12.

4. Surface Air Consumption Rate. SACR is used by SCUBA divers to predict air used at a particular depth. For each dive, we convert our air consumption at that dive's depth to a normalized air consumption at the surface. Given depth (in feet), d, starting tank pressure (psi), s, final tank pressure (psi), f, and time (in minutes) of t, the SACR, c, is given by the following formula.

$$c = \frac{33(s - f)}{t(d + 33)}$$

Typical values for pressure are a starting pressure of 3000, final pressure of 500.


A medium dive might have a depth of 60 feet, time of 60 minutes. A deeper dive might be to 100 feet for 15 minutes. A shallower dive might be 30 feet for 60 minutes, but the ending pressure might be 1500. A typical c (consumption) value might be 12 to 18 for most people.

Write print statements for each of the three dive profiles given above: medium, deep and shallow.

Given the SACR, c, and a tank starting pressure, s, and final pressure, f, we can plan a dive to depth (in feet), d, for time (in minutes), t, using the following formula. Usually the $33(s - f)/c$ is a constant, based on your SACR and tanks.

$$\frac{33(s - f)}{c} = t(d + 33)$$

For example, tanks you own might have a starting pressure of 2500 and an ending pressure of 500; you might have a c (SACR) of 15.2. You can then find possible combinations of time and depth which you can comfortably dive.

Write two print statements that show how long one can dive at 60 feet and 70 feet.

5. Force on a Sail. How much force is on a sail? A sail moves a boat by transferring force to its mountings. The sail in the front (the jib) of a typical fore-and-aft rigged sailboat hangs from a stay. The sail in the back (the main) hangs from the mast. The forces on the stay (or mast) and sheets move the boat. The sheets are attached to the clew of the sail.

The force on a sail, f, is based on sail area, a (in square feet) and wind speed, $v_w$ (in miles per hour).

$$f = v_w^{2} \times 0.004 \times a$$

For a small racing dinghy, the smaller sail in the front might have 61 square feet of surface. The larger, main sail, might have 114 square feet.

Write a print statement to figure the force generated by a 61 square foot sail in 15 miles an hour of wind.

6. Craps Odds. What are the odds of winning on the first throw of the dice? There are 36 possible rolls on 2 dice that add up to values from 2 to 12. There is just 1 way to roll a 2, 6 ways to roll a 7, and 1 way to roll a 12. We'll take this as given until a later exercise where we have enough Python to generate this information.

Without spending a lot of time on probability theory, there are two basic rules we'll use time and again. If any one of multiple alternate conditions needs to be true, usually expressed as "or", we add the probabilities. When there are several conditions that must all be true, usually expressed as "and", we multiply the probabilities.

Rolling a 3, for instance, is rolling a 1-2 or rolling a 2-1. We add the probabilities: 1/36 + 1/36 = 2/36 = 1/18.

On a come out roll, we win immediately if 7 or 11 is rolled. There are two ways to roll 11 (2/36) and 6 ways to roll 7 (6/36).

Write a print statement to print the odds of winning on the come out roll. This means rolling 7 or rolling 11. Express this as a fraction, not as a decimal number; that means adding up the numerator of each number and leaving the denominator as 36.

7. Roulette Odds. How close are payouts and the odds? An American (double zero) roulette wheel has numbers 1-36, 0 and 00. 18 of the 36 numbers are red, 18 are black and the zeroes are green. The odds of spinning red, then, are 18/38. The odds of zero or double zero are 2/38.

Red pays 2 to 1, the real odds are 38/18. Write a print statement that shows the difference between the pay out and the real odds.

You can place a bet on 0, 00, 1, 2 and 3. This bet pays 6 to 1. The real odds are 5/38. Write a print statement that shows the difference between the pay out and the real odds.

5.6 Expression Style Notes


Spaces are used sparingly in expressions. Spaces are never used between a function name and the ()s that surround the arguments. It is considered poor form to write:
int (22.0/7)

The preferred form is the following:


int(22.0/7)

A long expression may be broken up with spaces to enhance readability. For example, the following separates the multiplication part of the expression from the subtraction part with a few wisely-chosen spaces.
b**2 - 4*a*c


CHAPTER

SIX

ADVANCED EXPRESSIONS
The math and random Modules, Bit-Level Operations, Division
This chapter covers some more advanced topics. The math Module covers the math module. The random Module covers elements of the random module. Division Operators covers the important distinction between the division operators. We also provide some supplemental information that is more specialized. Bit Manipulation Operators covers some additional bit-fiddling operators that work on the basic numeric types. Expression Style Notes has some notes on style.

6.1 Using Modules


A Python module extends the Python execution environment by adding new classes, functions and helpful constants. We tell the Python interpreter to fetch a module with a variation on the import statement. There are several variations on import, which we'll cover in depth in Components, Modules and Packages. For now, we'll use the simple import:
import m

This will import module m. Only the module's name, m, is made available. Every name inside the module m must be qualified by prepending the module name and a dot (.). So if module m had a function called spam(), we'd refer to it as m.spam().

There are dozens of standard Python modules. We'll get to the most important ones in Components, Modules and Packages. For now, we'll focus on extending the math capabilities of the basic expressions we've looked at so far.
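As a concrete instance of qualified names, the sketch below imports the standard math module and reaches its contents through the module name. The functions sqrt() and hypot() and the constant pi are part of the standard math module; the variable names and the radius value are illustrative.

```python
import math

# Every name inside the module is reached through the module name.
root = math.sqrt(2.0)             # a function in the module
circle = 2 * math.pi * 1.5        # a constant in the module
diagonal = math.hypot(3.0, 4.0)   # hypotenuse of a 3-4-5 triangle

print("%.4f" % root)              # 1.4142
print(diagonal)                   # 5.0
```

Writing sqrt(2.0) without the math. prefix would fail after this form of import, because only the name math has been made available.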

6.2 The math Module


The math module is made available to your programs with:
import math

The math module contains a number of common trigonometric functions.

acos(x) Arc cosine of x; result in radians.
asin(x) Arc sine of x; result in radians.
atan(x) Arc tangent of x; result in radians.
atan2(y, x) Arc tangent of y/x: $\arctan(\frac{y}{x})$; result in radians.
cos(x) Cosine of x, where x is in radians.
cosh(x) Hyperbolic cosine of x.
exp(x) $e^x$, inverse of log(x).
hypot(x, y) Euclidean distance, $\sqrt{x^2 + y^2}$; the length of the hypotenuse of a right triangle with height of y and length of x.
log(x) Natural logarithm (base e) of x. Inverse of exp(). $n = e^{\ln n}$.
log10(x) Common logarithm (base 10) of x, inverse of 10**x. $n = 10^{\log n}$.
pow(x, y) $x^y$.
sin(x) Sine of x, where x is in radians.
sinh(x) Hyperbolic sine of x.
sqrt(x) Square root of x. This version raises an error if you ask for sqrt(-1), even though Python understands complex and imaginary numbers. A second module, cmath, includes a version of sqrt() which correctly creates imaginary numbers.
tan(x) Tangent of x, where x is in radians.
tanh(x) Hyperbolic tangent of x.

Additionally, the following constants are also provided.

math.pi The value of $\pi$, 3.1415926535897931
math.e The value of e, 2.7182818284590451, used for the exp() and log() functions.

Conversion between radians, r, and degrees, d, is based on the following definition:

$360 \text{ degrees} = 2 \pi \text{ radians}$

From that, we get the following relationships:

$d \times \pi = r \times 180, \quad d = \frac{r \times 180}{\pi}, \quad r = \frac{d \times \pi}{180}$

The math module contains the following other functions for dealing with floating point numbers.


ceil(x) Next larger whole number.

>>> import math
>>> math.ceil(5.1)
6.0
>>> math.ceil(-5.1)
-5.0

fabs(x) Absolute value of the real x.

floor(x) Next smaller whole number.

>>> import math
>>> math.floor(5.9)
5.0
>>> math.floor(-5.9)
-6.0

fmod(x, y) Floating point remainder after division of x / y. This depends on the platform C library and may handle the signs differently than the Python x % y.

>>> math.fmod( -22, 7 )
-1.0
>>> -22 % 7
6

modf(x) Creates a tuple with the fractional and integer parts of x. Both results carry the sign of x so that x can be reconstructed by adding them. We'll return to tuples in Tuples.

>>> math.modf( 123.456 )
(0.45600000000000307, 123.0)

frexp(x) This function unwinds the usual base-2 floating point representation. A floating point number is $m \times 2^e$, where m is always a fraction, $\frac{1}{2} \le m < 1$, and e is an integer. This function returns a tuple with m and e. The inverse is ldexp(m, e).

ldexp(m, e) Calculates $m \times 2^e$, the inverse of frexp(x).
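The radian and degree relationships, and a few of the functions listed above, can be tried out with a short sketch. The helper names to_degrees() and to_radians() are hypothetical, written here only to mirror the conversion formulas in this section; the other calls are the math functions already described.

```python
import math

# Degrees <-> radians, using d = r * 180/pi and r = d * pi/180.
def to_degrees(r):
    return r * 180.0 / math.pi

def to_radians(d):
    return d * math.pi / 180.0

right_angle = to_radians(90.0)     # about pi/2 radians
sine = math.sin(right_angle)       # sine of 90 degrees, about 1.0
hyp = math.hypot(3.0, 4.0)         # 3-4-5 right triangle

# modf() splits a number; adding the two parts gives the number back.
frac_part, int_part = math.modf(-5.25)

# frexp() and ldexp() are inverses of each other.
m, e = math.frexp(80.0)
rebuilt = math.ldexp(m, e)

print(to_degrees(math.pi))
print(sine)
print(hyp)
print((frac_part, int_part))
print((m, e, rebuilt))
```

Because these are floating point calculations, results such as the sine of 90 degrees should be compared with a small tolerance rather than tested for exact equality.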

6.3 The random Module


The random module contains a large number of functions for working with distributions of random numbers. There are numerous functions available, but the later exercises will only use the functions described here. The random module is made available to your program with:
import random

Here are the definitions of some commonly-used functions.


choice(sequence) Chooses a random value from the sequence sequence.


>>> import random
>>> random.choice( ['red', 'black', 'green'] )
'red'

random() A random floating point number, r, such that $0 \le r < 1.0$.

randrange([start], stop, [step]) Choose a random element from range( start, stop, step ).

randrange(6) returns a number, r, such that $0 \le r < 6$. There are 6 values between 0 and 5.
randrange(1,7) returns a number, r, such that $1 \le r < 7$. There are 6 values between 1 and 6.
randrange(10,100,5) returns a number, 5k, such that $10 \le 5k < 100$ for some integer value of k. These are values 10, 15, 20, ..., 95.

randint(a, b) Choose a random number, r, such that $a \le r \le b$. Unlike randrange(), this function includes both end-point values.

uniform(a, b) Returns a random floating point number, r, such that $a \le r < b$.

The randrange() function has two optional values, making it particularly flexible. Here's an example of some of the alternatives.

demorandom.py
#!/usr/bin/env python
import random
# Simple Range 0 <= r < 6
print random.randrange(6), random.randrange(6)
# More complex range 1 <= r < 7
print random.randrange(1,7), random.randrange(1,7)
# Really complex range of even numbers between 2 and 36
print random.randrange(2,37,2)
# Odd numbers from 1 to 35
print random.randrange(1,36,2)

This demonstrates a number of ways of generating random numbers. It uses the basic random.randrange() with a variety of different kinds of arguments.
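The other functions defined above can be exercised in the same way. The following sketch is only an illustration (the variable names and the dice and roulette framing are assumptions, not part of demorandom.py): it rolls two dice, once with randint() and once with randrange(), to show the difference in how the end points are given.

```python
import random

die_1 = random.randint(1, 6)      # both end points are included
die_2 = random.randrange(1, 7)    # the stop value, 7, is excluded
total = die_1 + die_2
print(total)

# choice() picks one element of a sequence.
color = random.choice(['red', 'black', 'green'])
print(color)

# random() and uniform() produce floating point values.
fraction = random.random()            # 0 <= fraction < 1.0
temperature = random.uniform(-10, 10)
print(fraction)
print(temperature)
```

Each run prints different values, so the only thing that can be checked is that every result stays inside its stated range.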

6.4 Advanced Expression Exercises


1. Evaluate These Expressions. The following expressions are somewhat more complex, and use functions from the math module.

math.sqrt( 40.0/3.0 - math.sqrt(12.0) )
6.0/5.0*( (math.sqrt(5)+1) / 2 )**2
math.log( 2198 ) / math.sqrt( 6 )


2. Run demorandom.py. Run the demorandom.py script several times and save the results. Then add the following statement to the script and run it again several times. What happens when we set an explicit seed?
#!/usr/bin/env python
import random
random.seed(1)
...everything else the same

Try the following variation, and see what it does.

#!/usr/bin/env python
import random, time
random.seed(time.clock())
...everything else the same

3. Wind Chill. Wind chill is used by meteorologists to describe the effect of cold and wind combined. Given the wind speed in miles per hour, V, and the temperature in °F, T, the Wind Chill, w, is given by the formula below.

Wind Chill, new model

$w = 35.74 + 0.6215 \times T - 35.75 \times V^{0.16} + 0.4275 \times T \times V^{0.16}$

Wind Chill, old model

$w = 0.081 \times (3.71 \times \sqrt{V} + 5.81 - 0.25 \times V) \times (T - 91.4) + 91.4$

Wind speeds are for 0 to 40 mph; above 40, the difference in wind speed doesn't have much practical impact on how cold you feel. Write a print statement to compute the wind chill felt when it is -2 °F and the wind is blowing 15 miles per hour.

4. How Much Does The Atmosphere Weigh? Part 1. From Slicing Pizzas, Racing Turtles, and Further Adventures in Applied Mathematics, [Banks02].

Force is measured in Newtons, N, kg m/sec^2. Air pressure is measured in Newtons of force per square meter, N/m^2.

Air Pressure (at sea level) $P_0$. This is the long-term average.

$P_0 = 1.01325 \times 10^5$

Acceleration is measured in m/sec^2. Gravity acceleration (at sea level) g.

$g = 9.82$

We can use g to get the kg of mass from the force of air pressure $P_0$. Divide the air pressure (in N/m^2) by the acceleration of gravity (in m/sec^2). The result is the mass of the atmosphere in kilograms per square meter (kg/m^2).

$M_{m^2} = \frac{P_0}{g}$

Given the mass of air per square meter, we need to know how many square meters of surface to apply this mass to. Radius of Earth R in meters, m. This is an average radius; our planet isn't a perfect sphere.

$R = 6.37 \times 10^6$


The area of a sphere of radius r.

$A = 4 \pi r^2$

Mass of atmosphere (in kg) is the mass per square meter, times the number of square meters.

$M_a = \frac{P_0}{g} \times A$

Check: somewhere around $10^{18}$ kg.

5. How Much Does The Atmosphere Weigh? Part 2. From Slicing Pizzas, Racing Turtles, and Further Adventures in Applied Mathematics, [Banks02]. The exercise How Much Does the Atmosphere Weigh, Part 1 assumes the earth to be an entirely flat sphere. The average height of the land is actually 840 m. We can use the ideal gas law to compute the pressure at this elevation and refine the number a little further.

Pressure at a given elevation

$P = P_0 \times e^{-\frac{m g}{R T} z}$

Molecular weight of air m, in kg/mol.

$m = 28.96 \times 10^{-3}$

Gas constant R, in joule/(K mol).

$R = 8.314$

Gravity g, in m/sec^2.

$g = 9.82$

Temperature T, in K, based on temperature C, in °C. We'll just assume that C is 15 °C.

$T = 273 + C$

Elevation z, in meters, m.

$z = 840$

This pressure can be used for the air over land, and the pressure computed in How Much Does the Atmosphere Weigh, Part 1 can be used for the air over the oceans. How much land has this reduced pressure? Reference material gives the following areas in m^2, square meters.

ocean area: $A_o = 3.61 \times 10^{14}$
land area: $A_l = 1.49 \times 10^{14}$

Weight of Atmosphere, adjusted for land elevation

$M_l = \frac{P_0}{g} \times A_o + \frac{P}{g} \times A_l$

6.5 Bit Manipulation Operators


We've already seen the usual math operators: +, -, *, /, %, **; as well as the abs() and pow() functions. There are several other operators available to us. Principally, these are for manipulating the individual bits of an integer value. We'll look at ~, &, ^, |, << and >>.

The unary ~ operator flips all the bits in a plain or long integer. 1's become 0's and 0's become 1's. Since most hardware uses a technique called 2's complement, this is mathematically equivalent to adding 1 and switching the number's sign.

>>> print ~0x12345678
-305419897
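The "add 1 and switch the sign" rule can be checked directly. This is a minimal sketch; the names x and flipped are chosen only for the illustration.

```python
x = 0x12345678

flipped = ~x
print(flipped)

# Adding 1 and switching the sign gives the same number.
print(-(x + 1))

# Flipping the bits twice restores the original value.
print(~flipped == x)
```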

There are binary bit manipulation operators, also. These perform simple Boolean operations on all bits of the integer at once.

The binary & operator returns a 1-bit if the two input bits are both 1.

>>> print 0&0, 1&0, 1&1, 0&1
0 0 1 0

Here's the same kind of example, combining sequences of bits. This takes a bit of conversion to base 2 to understand what's going on.

>>> print 3&5
1

The number 3, in base 2, is 0011. The number 5 is 0101. Let's match up the bits from left to right:

  0 0 1 1
& 0 1 0 1
  -------
  0 0 0 1

The binary ^ operator returns a 1-bit if one of the two inputs is 1 but not both. This is sometimes called the exclusive or.

>>> print 3^5
6

Let's look at the individual bits.

  0 0 1 1
^ 0 1 0 1
  -------
  0 1 1 0

Which is the binary representation of the number 6.

The binary | operator returns a 1-bit if either of the two inputs is 1. This is sometimes called the inclusive or. Sometimes this is written and/or.

>>> print 3|5
7

Let's look at the individual bits.

  0 0 1 1
| 0 1 0 1
  -------
  0 1 1 1

Which is the binary representation of the number 7.

There are also bit shifting operations. These are mathematically equivalent to multiplying and dividing by powers of two. Often, machine hardware can execute these operations faster than the equivalent multiply or divide.


The << is the left-shift operator. The left argument is the bit pattern to be shifted, the right argument is the number of bits.
>>> print 0xA << 2
40

0xA is hexadecimal; the bits are 1-0-1-0. This is 10 in decimal. When we shift this two bits to the left, it's like multiplying by 4. We get bits of 1-0-1-0-0-0. This is 40 in decimal.

The >> is the right-shift operator. The left argument is the bit pattern to be shifted, the right argument is the number of bits. Python always behaves as though it is running on a 2's complement computer. The left-most bit is always the sign bit, so sign bits are shifted in.

>>> print 80 >> 3
10

The number 80, with bits of 1-0-1-0-0-0-0, shifted right 3 bits, yields bits of 1-0-1-0, which is 10 in decimal.

There are some other operators available, but, strictly speaking, they're not arithmetic operators, they're logic operations. We'll return to them in Truth, Comparison and Conditional Processing.
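The bit operators of this section can be seen side by side in a short sketch. It assumes the built-in bin() function (available from Python 2.6) to show the bit patterns; the variable names, and the nibble example at the end, are illustrations rather than anything from the text.

```python
a = 3    # bits 0011
b = 5    # bits 0101

both = a & b        # 1-bits where both inputs have a 1
one_only = a ^ b    # 1-bits where exactly one input has a 1
either = a | b      # 1-bits where at least one input has a 1
print(bin(both))
print(bin(one_only))
print(bin(either))

left = 0xA << 2     # same as 10 * 4
right = 80 >> 3     # same as 80 // 8
print(left)
print(right)

# Right-shifting a negative number shifts in sign bits, so the result
# rounds down (toward negative infinity), like the // operator.
negative = -81 >> 3
print(negative)

# Shifting and masking together pick out a group of bits.
value = 0xA5                        # bits 1010 0101
high_nibble = (value >> 4) & 0xF    # the upper four bits, 1010
low_nibble = value & 0xF            # the lower four bits, 0101
print((high_nibble, low_nibble))
```

The last three lines show a typical use of these operators: a shift moves the interesting bits to the right-hand end and an & with a mask discards everything else.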

6.6 Division Operators


In general, the data type of an expression depends on the types of the arguments. This rule meets our expectations for most operators: when we add two integers, the result should be an integer. However, this doesn't work out well for division because there are two different expectations. Sometimes we expect division to create precise answers, usually the floating-point equivalents of fractions. Other times, we want a rounded-down integer result.

The classical Python definition of / followed the pattern for other operators: the results depend entirely on the arguments. 685/252 was 2 because both arguments were integers. However, 685./252. was 2.7182539682539684 because the arguments were floating point.

This definition often caused problems for applications where data types were used that the author hadn't expected. For example, a simple program doing Celsius to Fahrenheit conversions will produce different answers depending on the input. If one user provides 18 and another provides 18.0, the answers are different, even though the inputs have equal numeric values.

>>> 18*9/5+32
64
>>> 18.0*9/5+32
64.400000000000006
>>> 18 == 18.0
True

This unexpected inaccuracy was generally due to the casual use of integers where floating-point numbers were more appropriate. (This can also occur using integers where complex numbers were implicitly expected.) An explicit conversion function (like float()) can help prevent this. The idea, however, is for Python to be a simple and sparse language, without a dense clutter of conversions to cover the rare case of an unexpected data type.

Starting with Python 2.2, a new division operator was added to clarify what is required. There are two division operators: / and //. The / operator should return floating-point results; the // operator will always return rounded-down results.
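One way to sidestep the ambiguity, sketched below, is to say explicitly which kind of division is wanted: // for a rounded-down result and a float() conversion for a floating-point result. The expressions are written so that they should give the same answers under the classical and the new rules; the variable names are illustrative only.

```python
celsius = 18

# Rounded-down division: 162 // 5 is 32, plus 32.
floor_result = celsius * 9 // 5 + 32

# Floating-point division, forced by an explicit conversion.
float_result = float(celsius) * 9 / 5 + 32

print(floor_result)
print(float_result)

# // rounds down (toward negative infinity), not toward zero.
positive = 7 // 2
negative = -7 // 2
print((positive, negative))

# // also rounds down floating-point arguments; the result is a float.
floor_of_float = 7.0 // 2
print(floor_of_float)
```

The float_result value is close to 64.4; as with any floating-point result, the last digits shown can vary with the version of Python doing the printing.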


In Python 2.5 and 2.6, the / operator can either use classical or old rules (results depend on the types of the arguments) or it can use the new rule (result is floating-point). In Python 3.x, this transitional meaning of / goes away and it always produces a floating-point result.

Important: Python 3
In Python 3, the / operator always produces a floating-point result. The // operator continues to produce a rounded-down result.

To help with the transition, two tools were made available. These give programmers a way to keep older applications running; they also give them a way to explicitly declare that their program uses the newer operator definition. There are two parts to this: a program statement that can be placed in a program, as well as command-line options that can be used when starting the Python interpreter.

Program Statements. To ease the transition from older to newer language features, there is a __future__ module available. This module includes a division definition that changes the definition of the / operator from classical to future. You can include the following import statement to state that your program depends on the future definition of division. We'll look at the import statement in depth in Components, Modules and Packages.

from __future__ import division
print 18*9/5+32
print 18*9//5+32

This produces the following output. The first line shows the new use of the / operator to produce floating point results, even if both arguments are integers. The second line shows the // operator, which produces rounded-down results.

64.4
64

The from __future__ statement will set the expectation that your script uses the new-style floating-point division operator. This allows you to start writing programs with version 2.6 that will work correctly with all future versions. In Python 3 this import statement is no longer necessary, although it is still accepted and has no effect.

Command Line Options. Another tool to ease the transition is a set of command-line options used when running the Python interpreter. These can force old-style interpretation of the / operator or warn about old-style use of the / operator between integers. They can also force new-style use of the / operator and report on all potentially incorrect uses of the / operator.

The Python interpreter command-line option of -Q will force the / operator to be treated classically (old), or with the future (new) semantics. If you run Python with -Qold, the / operator's result depends on the arguments. If you run Python with -Qnew, the / operator's result will be floating point. In either case, the // operator returns a rounded-down result.

You can use -Qold to force old modules and programs to work with version 2.2 and higher. In Python 3, however, this transition is no longer supported; by that time you should have fixed your programs and modules.

To make fixing easier, the -Q command-line option can take two other values: warn and warnall. If you use -Qwarn, then the / operator applied to integer arguments will generate a run-time warning. This will allow you to find and fix situations where the // operator might be more appropriate. If you use -Qwarnall, then all instances of the / operator generate a warning; this will give you a close look at your programs.

You can include the command line option when you run the Python interpreter. For Linux and MacOS users, you can also put this on the #! line at the beginning of your script file.


#!/usr/local/bin/python -Qnew


CHAPTER

SEVEN

VARIABLES, ASSIGNMENT AND INPUT


The = , augmented = and del Statements
Variables hold the state of our program. In Variables we'll introduce variables, then in The Assignment Statement we'll cover the basic assignment statement for changing the value of a variable. This is followed by an exercise section that refers back to exercises from Simple Numeric Expressions and Output. In Input Functions we introduce some primitive interactive input functions that are built-in. This is followed by some simple exercises that build on those from section The Assignment Statement. We'll cover the multiple assignment statement in Multiple Assignment Statement. We'll round out this section with the del statement, for removing variables, in The del Statement.

7.1 Variables
As a procedural program makes progress through the steps from launch to completion, it does so by undergoing changes of state. The state of our program as a whole is the state of all of the program's variables. When one variable changes, the overall state has changed.

Variables are the names your program assigns to the results of an expression. Every variable is created with an initial value. Variables will change to identify new objects and the objects identified by a variable can change their internal state. These three kinds of state changes (variable creation, object assignment, object change) happen as inputs are accepted and our program evaluates expressions. Eventually the state of the variables indicates that we are done, and our program can exit.

A Python variable name must begin with a letter or _, and can continue with a string of numbers, letters and _'s to any length. Names that start with _ or __ have special significance. Names that begin with _ are typically private to a module or class. We'll return to this notion of privacy in Classes and Modules. Names that begin with __ are part of the way the Python interpreter is built.

Example variable names:

a
pi
aVeryLongName
a_name
__str__
_hidden

Tip: Tracing Execution


We can trace the execution of a program by simply following the changes of value of all the variables in the program. For programming newbies, it helps to create a list of variables and write down their changes when studying a program. We'll show an example in the next section.

Python creates new objects as the result of evaluating an expression. Python assigns these objects to new variables with an assignment statement. Python removes variables with a del statement. The underlying object is later garbage-collected when there are no more variables referring to the object.

Some Consequences. A Python variable is little more than a name which refers to an object. The central issue is to recognize that the underlying object is the essential part of our program; a variable name is just a meaningful label. This has a number of important consequences.

One consequence of a variable being simply a label is that any number of variables can refer to the same object. In other languages (C, C++, Java) there are two kinds of values: primitive and objects, and there are distinct rules for handling the two kinds of values. In Python, every variable is a simple reference to an underlying object. When talking about simple immutable objects, like the number 3, multiple variables referring to a common object is functionally equivalent to having a distinct copy of a primitive value. When talking about mutable objects, like lists, mappings, or complex objects, distinct variable references can change the state of the common object.

Another consequence is that the Python object fully defines its own type. The object's type defines the representation, the range of values and the allowed operations on the object. The type is established when the object is created. For example, floating point and long integer objects have different representations, the operations for adding these kinds of numbers are different, and the objects created by addition are of distinct types. Python uses the type information to choose which addition operation to perform on two values. In the case of an expression with mixed types, Python uses the type information to coerce one or both values to a common type.

This also means that casting an object to match the declared type of a variable isn't meaningful in Python. You don't use C++ or Java-style casting.

We've already worked with the four numeric types: plain integers, long integers, floating point numbers and complex numbers. We've touched on the string type, also. There are several other built-in types that we will look at in detail in Data Structures. Plus, we can use class definitions to define new types to Python, something we'll look at in Data + Processing = Objects.

We commonly say that a static language associates the type information with the variable. Only values of a certain type can be assigned to a given variable. Python, in contrast, is a dynamic language; a variable is just a label or tag attached to the object. Any variable can be associated with an object of any type.

The final consequence of variables referring to objects is that a variable's scope can be independent of the object itself. This means that variables which are in distinct namespaces can refer to the same object. When a function completes execution and the namespace is deleted, the variables are deleted, and the number of variables referring to an object is reduced. Additional variables may still refer to an object, meaning that the object will continue to exist. When only one variable refers to an object, then removing that last variable removes the last reference to the object, and the object can be removed from memory.

Also note that expressions generally create new objects; if an object is not saved in a variable, it silently vanishes. We can safely ignore the results of a function.

Scope and Namespaces. A Python variable is a name which refers to an object. To be useful, each variable must have a scope of visibility. The scope is defined as the set of statements that can make use of this variable. A variable with global scope can be referenced anywhere. On the other hand, a variable with local scope can only be referenced in a limited suite of statements.

This notion of scope is essential to being able to keep an intellectual grip on a program. Programs of even moderate complexity need to keep pools of variables with separate scopes. This allows you to reuse variable names without risk of confusion from inadvertently changing the value of a variable used elsewhere in a program.


Python collects variables into pools called namespaces. A new namespace is created as part of evaluating the body of a function or module, or creating a new object. Additionally, there is one global namespace. This means that each variable (and the state that it implies) is isolated to the execution of a single function or module. By separating all locally scoped variables into separate namespaces, we don't have an endless clutter of global variables.

In the rare case that you need a global variable, the global statement is available to assign a variable to the global namespace.

When we introduce functions in Functions, classes in Classes and modules in Components, Modules and Packages, we'll revisit this namespace technique for managing scope. In particular, see Functions and Namespaces for a digression on this.
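The consequences described in this section (variables as labels, types that belong to objects, and separate namespaces) can be seen in a short sketch. It borrows lists and a function definition from later chapters, and every name in it (first, second, label, total, compute) is invented for the illustration.

```python
# Two names, one mutable object: a change made through one name
# is visible through the other.
first = [1, 2, 3]
second = first
second.append(4)
print(first)
print(first is second)

# The type belongs to the object, not to the variable name.
label = 150
kind_1 = type(label).__name__
label = 3.625
kind_2 = type(label).__name__
print((kind_1, kind_2))

# del removes a name; the list survives because "second"
# still refers to it.
del first
print(second)

# A function body gets its own namespace.
total = 10

def compute():
    total = 99      # a new local variable, not the global one
    return total

local_value = compute()
print((local_value, total))
```

Assigning to total inside compute() creates a local variable and leaves the global total untouched, which is the isolation that namespaces are meant to provide.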

7.2 The Assignment Statement


Assignment is fundamental to Python; it is how the objects created by an expression are preserved. We'll look at the basic assignment statement, plus the augmented assignment statement. Later, in Multiple Assignment Statement, we'll look at multiple assignment.

7.2.1 Basic Assignment


We create and change variables primarily with the assignment statement. This statement provides an expression and a variable name which will be used to label the value of the expression.
variable = expression

Here's a short script that contains some examples of assignment statements.

example3.py
#!/usr/bin/env python
# Compute the value of a block of stock
shares= 150
price= 3 + 5.0/8.0
value= shares * price
print value

1. We have an object, the number 150, which we assign to the variable shares.
2. We have an expression 3+5.0/8.0, which creates a floating-point number, which we save in the variable price.
3. We have another expression, shares * price, which creates a floating-point number; we save this in value so that we can print it.

This script created three new variables.

Since this file is new, we'll need to do the chmod +x example3.py once, after we create this file. Then, when we run this program, we see the following.

$ ./example3.py
543.75


7.2.2 Augmented Assignment


Any of the usual arithmetic operations can be combined with assignment to create an augmented assignment statement. For example, look at this augmented assignment statement:
a += v

This statement is a shorthand that means the same thing as the following:
a = a + v

Here's a larger example.

portfolio.py
#!/usr/bin/env python
# Total value of a portfolio made up of two blocks of stock
portfolio = 0
portfolio += 150 * 2 + 1/4.0
portfolio += 75 * 1 + 7/8.0
print portfolio

First, we'll do the chmod +x portfolio.py on this file. Then, when we run this program, we see the following.

$ ./portfolio.py
376.125

The other basic math operations can be used similarly, although the purpose gets obscure for some operations. These include -=, *=, /=, %=, &=, ^=, |=, <<= and >>=.

Here's a lengthy example. This is an extension of Craps Odds in Numeric Types and Expressions.

In craps, the first roll of the dice is called the come out roll. This roll can be won immediately if the number is 7 or 11. It can be lost immediately if the number is 2, 3 or 12. All of the remaining numbers will establish a point and the game continues.

craps.py
#!/usr/bin/env python
# Compute the odds of winning on the first roll
win = 0
win += 6/36.0 # ways to roll a 7
win += 2/36.0 # ways to roll an 11
print "first roll win", win
# Compute the odds of losing on the first roll
lose = 0
lose += 1/36.0 # ways to roll 2
lose += 2/36.0 # ways to roll 3
lose += 1/36.0 # ways to roll 12
print "first roll lose", lose
# Compute the odds of rolling a point number (4, 5, 6, 8, 9 or 10)
point = 1 # odds must total to 1
point -= win # remove odds of winning
point -= lose # remove odds of losing
print "first roll establishes a point", point

There's a 22.2% chance of winning, and an 11.1% chance of losing. What's the chance of establishing a point? One way is to figure that it's what's left after winning or losing. The total of all probabilities always adds to 1. Subtract the odds of winning and the odds of losing and what's left is the odds of setting a point.

Here's another way to figure the odds of rolling 4, 5, 6, 8, 9 or 10.

point = 0
point += 2*3/36.0 # ways to roll 4 or 10
point += 2*4/36.0 # ways to roll 5 or 9
point += 2*5/36.0 # ways to roll 6 or 8
print point

By the way, you can add the statement print win + lose + point to confirm that these odds all add to 1. This means that we have defined all possible outcomes for the come out roll in craps.

Tip: Tracing Execution

We can trace the execution of a program by simply following the changes of value of all the variables in the program. We can step through the planned execution of our Python source statements, writing down the variables and their values on a sheet of paper. From this, we can see the state of our calculation evolve.

When we encounter an assignment statement, we look on our paper for the variable. If we find the variable, we put a line through the old value and write down the new value. If we don't find the variable, we add it to our page with the initial value.

Here's our example from the craps.py script through the first part of the script. The win variable was created and set to 0, then the value was replaced with 0.16, and then replaced with 0.22. The lose variable was then created and set to 0. This is what our trace looks like so far. In each row, every value replaces the one to its left.

win: 0.0 0.16 0.22
lose: 0

Here's our example when the craps.py script is finished. We changed the variable lose several times. We also added and changed the variable point.

win: 0.0 0.16 0.22
lose: 0.0 0.027 0.083 0.111
point: 1.0 0.77 0.66

We can use this trace technique to understand what a program means and how it proceeds from its initial state to its final state.

As with many things Python, there is some additional subtlety to assignment, but we'll cover those topics later. For example, the multiple-assignment statement is something we'll look into more deeply in Tuples.
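The two ways of figuring the odds of establishing a point can be compared in one place. This is a sketch rather than a replacement for craps.py; the names point_by_subtraction and point_by_counting are made up here, and the final comparison uses a small tolerance because these are floating-point sums.

```python
win = 6/36.0 + 2/36.0                 # rolling 7 or 11
lose = 1/36.0 + 2/36.0 + 1/36.0       # rolling 2, 3 or 12

# What is left after winning or losing.
point_by_subtraction = 1 - win - lose

# Counting the ways to roll 4 or 10, 5 or 9, and 6 or 8.
point_by_counting = 2*3/36.0 + 2*4/36.0 + 2*5/36.0

# All three outcomes together should total (very nearly) 1.
print(win + lose + point_by_counting)

# The two methods should agree to within rounding error.
print(abs(point_by_subtraction - point_by_counting) < 1e-12)
```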

7.3 Input Functions


Python provides two simplistic built-in functions to accept input and set the value of variables. These are not really suitable for a complete application, but will do for our initial explorations.


Typically, interactive programs which run on a desktop use a complete graphic user interface (GUI), often written with the Tkinter module or the pyGTK module. Interactive programs which run over the Internet use HTML forms. The primitive interactions we're showing with input() and raw_input() are only suitable for very simple programs.

Important: Python 3.x
In Python 3, the raw_input() function will be renamed to input(). The Python 2 input() function will be removed. It's that useless.

Note that some IDEs buffer the program's output, making these functions appear to misbehave. For example, if you use Komodo, you'll need to use the Run in a New Console option. If you use BBEdit, you'll have to use the Run in Terminal option.

You can enhance these functions somewhat by including the statement import readline. This module silently and automatically enhances these input functions to give the user the ability to scroll backwards and reuse previous inputs. You can also import rlcompleter. This module allows you to define sophisticated keyword auto-completion for these functions.

7.3.1 The raw_input() Function


The first way to get interactive input is the raw_input() function. This function accepts a string parameter, which is the user's prompt, written to standard output. The next line available on standard input is returned as the value of the function.

raw_input([prompt]) If a prompt is present, it is written to sys.stdout. Input is read from sys.stdin and returned as a string.

The raw_input() function reads from a file often called sys.stdin. When running from the command-line, this will be the keyboard, and what you type will be echoed in the command window or Terminal window. If you try, however, to run these examples from Textpad, you'll see that Textpad doesn't have any place for you to type any input. In BBEdit, you'll need to use the Run In Terminal item in the #! menu.

Here's an example script that uses raw_input().

rawdemo.py
#!/usr/bin/env python
# show how raw_input works
a= raw_input( "yes?" )
print "you said", a

When we run this script from the shell prompt, it looks like the following.

MacBook-3:Examples slott$ python rawdemo.py
yes?why not?
you said why not?

1. This program begins by evaluating the raw_input() function. When raw_input() is applied to the parameter of "yes?", it writes the prompt on standard output, and waits for a line of input.


    (a) We entered why not?.
    (b) Once that line was complete, the input string was returned as the value of the function.
    (c) The raw_input() function's value was assigned to the variable a.

2. The second statement printed that variable along with some text.

If we want numeric input, we must convert the resulting string to a number.

stock.py
    #!/usr/bin/env python
    # Compute the value of a block of stock
    shares = int( raw_input("shares: ") )
    price = float( raw_input("dollars: ") )
    price += float( raw_input("eights: ") )/8.0
    print "value", shares * price

We'll use chmod +x stock.py to make this program executable; then we can run it as many times as we like to get results.
    MacBook-3:Examples slott$ ./stock.py
    shares: 150
    dollars: 24
    eights: 3
    value 3656.25
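The conversion step in stock.py can be tried without typing anything at a prompt. The following is a minimal sketch, not part of the original script: the function name stock_value and its three text parameters are hypothetical, standing in for the strings that raw_input() would return. It is written so that it should run unchanged on Python 2.6 and on Python 3.

```python
# Hypothetical helper: the three arguments stand in for the strings
# that raw_input() would return for each prompt in stock.py.
def stock_value(shares_text, dollars_text, eighths_text):
    shares = int(shares_text)            # whole number of shares
    price = float(dollars_text)          # dollar part of the price
    price += float(eighths_text) / 8.0   # fractional part, in eighths
    return shares * price

value = stock_value("150", "24", "3")
print(value)
```

With the same three inputs as the run above, this produces the same 3656.25.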

The raw_input() mechanism is very limited. If the string returned by raw_input() is not suitable for use by int(), an exception is raised and the program stops running. We'll cover exception handling in detail in Exceptions. As a teaser, here's what it looks like.
    MacBook-5:Examples slott$ python stock.py
    shares: a bunch
    Traceback (most recent call last):
      File "stock.py", line 3, in <module>
        shares = int( raw_input("shares: ") )
    ValueError: invalid literal for int() with base 10: 'a bunch'
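As a preview of what exception handling might look like here, the sketch below wraps int() in a try statement. The helper name to_int_or_none is hypothetical, and returning None for bad input is just one possible choice; the details of try and except are left to Exceptions.

```python
# Hypothetical helper: returns an int, or None when the text
# cannot be converted by int().
def to_int_or_none(text):
    try:
        return int(text)
    except ValueError:
        return None

good = to_int_or_none("150")
bad = to_int_or_none("a bunch")
print(good)
print(bad)
```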

7.3.2 The input() Function


In addition to the raw_input() function, which returns the exact string of characters, there is the input() function. This applies the eval() function to the input, which will typically convert numeric input to the appropriate objects.

Important: Python 3
This function will be removed. It's best not to make use of it.

The value of the input() function is eval( raw_input( prompt ) ).
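The effect of the eval() step can be seen by applying eval() directly to strings, without any keyboard input. This is only an illustrative sketch; the sample strings are made up. Note that eval() evaluates any expression it is given, not just numbers, which is one reason to be cautious about feeding it text typed by a user.

```python
# Each string stands in for a line of text a user might type.
whole = eval("42")       # becomes an integer object
fraction = eval("2.5")   # becomes a floating-point object
computed = eval("3+4")   # any expression is evaluated, not just literals

print(whole)
print(fraction)
print(computed)
```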


7.4 Multiple Assignment Statement


The basic assignment statement can do more than assign the result of a single expression to a single variable. The assignment statement can also assign multiple variables at one time. The essential rule is that the left and right side must have the same number of elements. For example, the following script has several examples of multiple assignment.

line.py
    #!/usr/bin/env python
    # Compute line between two points.
    x1,y1 = 2,3 # point one
    x2,y2 = 6,8 # point two
    m,b = float(y1-y2)/(x1-x2), y1-float(y1-y2)/(x1-x2)*x1
    print "y=",m,"*x+",b

When we run this program, we get the following output.

    MacBook-3:Examples slott$ ./line.py
    y= 1.25 *x+ 0.5

We set variables x1, y1, x2 and y2. Then we computed m and b from those four variables. Then we printed m and b.

The basic rule is that Python evaluates the entire right-hand side of the = statement. Then it matches values with destinations on the left-hand side. If the lists are different lengths, an exception is raised and the program stops.

Because of the complete evaluation of the right-hand side, the following construct works nicely to swap two variables. This is often quite a bit more complicated in other languages.
    a,b = 1,4
    b,a = a,b
    print a,b
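Both rules can be checked in a few lines. The sketch below repeats the swap and then deliberately supplies three values for two variables; the variable names x, y and mismatch are only for illustration, and the try statement used to capture the error is covered later, in Exceptions.

```python
# The whole right-hand side is evaluated before any assignment happens,
# so the swap needs no temporary variable.
a, b = 1, 4
b, a = a, b
print(a)
print(b)

# Different lengths on the two sides raise an exception.
try:
    x, y = 1, 2, 3
    mismatch = None
except ValueError as error:
    mismatch = str(error)
print(mismatch)
```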

We'll return to this in Tuples, where we'll see additional uses for this feature.

7.5 The del Statement


An assignment statement creates or locates a variable and then assigns a new object to the variable. This change in state is how our program advances from beginning to termination. Python also provides a mechanism for removing variables, the del statement. The del statement looks like this:
del object , ...

Each object is any kind of Python object. Usually these are variables, but they can be functions, modules or classes.


The del statement works by unbinding the name, removing it from the set of names known to the Python interpreter. If this variable was the last remaining reference to an object, the object will be removed from memory. If, on the other hand, other variables still refer to this object, the object won't be deleted.

C++ Comparison
Programmers familiar with C++ will be pleased to note that memory management is silent and automatic, making programs much more reliable with much less effort. This removal of objects is called garbage collection, something that can be rather difficult to manage in larger applications. When garbage collection is done incorrectly, it can lead to dangling references: a variable that refers to an object that was deleted prematurely. Poorly designed garbage collection can also lead to memory leaks, where unreferenced objects are not properly removed from memory. Because of the automated garbage collection in Python, it suffers from none of these memory management problems.

The del statement is typically used only in rare, specialized cases. Ordinary namespace management and garbage collection are generally sufficient for most purposes.
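A small sketch of the unbinding behavior follows. The names first, second and name_gone are made up for the example. After del, the name first is no longer known, but the list object survives because second still refers to it.

```python
first = [1, 2, 3]
second = first      # two names now refer to the same list object
del first           # unbind one name; the object itself is untouched

# Using the deleted name now raises a NameError.
try:
    first
    name_gone = False
except NameError:
    name_gone = True

print(second)
print(name_gone)
```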

7.6 Interactive Mode Revisited


When we first looked at interactive Python in Command-Line Interaction we noted that Python executes assignment statements silently, but prints the results of an expression statement. Consider the following example.
    >>> pi=355/113.0
    >>> area=pi*2.2**2
    >>> area
    15.205309734513278

The first two inputs are complete statements, so there is no response. The third input is just an expression, so there is a response.

It isn't obvious, but the value assigned to pi isn't correct. Because we didn't see anything displayed, we didn't get any feedback from our computation of pi.

Python, however, has a handy way to help us. When we type a simple expression in interactive Python, it secretly assigns the result to a temporary variable named _. This isn't a part of scripting, but is a handy feature of an interactive session.

This comes in handy when exploring something rather complex. Consider this interactive session. We evaluate a couple of expressions, each of which is implicitly assigned to _. We can then save the value of _ in a second variable with an easier-to-remember name, like pi or area.
    >>> 335/113.0
    2.9646017699115044
    >>> 355/113.0
    3.1415929203539825
    >>> pi=_
    >>> pi*2.2**2
    15.205309734513278
    >>> area=_
    >>> area
    15.205309734513278


Note that we created a floating point object (2.964...), and Python secretly assigned this object to _. Then, we computed a new floating point object (3.141...), which Python assigned to _. What happened to the first float, 2.964...? Python garbage-collected this object, removing it from memory.

The second float that we created (3.141...) was assigned to _. We then assigned it to pi, also, giving us two references to the object. When we computed another floating-point value (15.205...), this was assigned to _. Does this mean our second float, 3.141..., was garbage collected? No, it wasn't garbage collected; it was still referenced by the variable pi.

7.7 Variables, Assignment and Input Function Exercises


7.7.1 Variables and Assignment
1. Extend Previous Exercises. Rework the exercises in Numeric Types and Expressions. Each of the previous exercises can be rewritten to use variables instead of expressions using only constants. For example, if you want to tackle the Fahrenheit to Celsius problem, you might write something like this:
    #!/usr/bin/env python
    # Convert 8 C to F
    C=8
    F=32+C*(9./5.)
    print "celsius",C,"fahrenheit",F

You'll want to rewrite these exercises using variables to get ready to add input functions.

2. State Change. Is it true that all programs simply establish a state? It can be argued that a controller for a device (like a toaster or a cruise control) simply maintains a steady state. The notion of state change as a program moves toward completion doesn't apply because the software is always on. Is this the case, or does the software controlling a device have internal state changes? For example, consider a toaster with a thermostat, a brownness sensor and a single heating element. What are the inputs? What are the outputs? Are there internal states while the toaster is making toast?

7.7.2 Input Functions


Refer back to the exercises in Numeric Types and Expressions for formulas and other details. Each of these can be rewritten to use variables and an input conversion. For example, if you want to tackle the Fahrenheit to Celsius problem, you might write something like this:
    C = float( raw_input('Celsius: ') )
    F = 32+C*(9./5.)
    print "celsius",C,"fahrenheit",F

1. Stock Value. Input the number of shares, dollar price and number of 8ths. From these three inputs, compute the total dollar value of the block of stock.

2. Convert from °C to °F. Write a short program that will input C and output F. A second program will input F and output C.


3. Periodic Payment. Input the principal, annual percentage rate and number of payments. Compute the monthly payment. Be sure to divide rate by 12 and multiply payments by 12.

4. Surface Air Consumption Rate. Write a short program that will input the starting pressure, final pressure, time and maximum depth. Compute and print the SACR. A second program will input a SACR, starting pressure, final pressure and depth. It will print the time at that depth, and the time at 10 feet more depth.

5. Wind Chill. Input a temperature and a wind speed. Output the wind chill.

6. Force from a Sail. Input the height of the sail and the length. The surface area is 1/2 × h × l. For a wind speed of 25 MPH, compute the force on the sail. Small boat sails are 25-35 feet high and 6-10 feet long.

7.8 Variables and Assignment Style Notes


Spaces are used sparingly in Python. It is common to put spaces around the assignment operator. The recommended style is
c = (f-32)*5/9

Do not take great pains to line up assignment operators vertically. The following has too much space, and is hard to read, even though it is fussily aligned.
    a                 = 12
    b                 = a*math.log(a)
    aVeryLongVariable = 26
    d                 = 13

This is considered poor form because Python takes a lot of its look from natural languages and mathematics. This kind of horizontal whitespace is hard to follow: it can get difficult to be sure which expression lines up with which variable. Python programs are meant to be reasonably compact, more like reading a short narrative paragraph or short mathematical formula than reading a page-sized UML diagram.

Variable names are often given as mixedCase; variable names typically begin with lower-case letters. The lower_case_with_underscores style is also used, but is less popular.

In addition, the following special forms using leading or trailing underscores are recognized:

single_trailing_underscore_: used to avoid conflicts with Python keywords. For example: print_ = 42

__double_leading_and_trailing_underscore__: used for special objects or attributes, e.g. __init__, __dict__ or __file__. These names are reserved; do not use names like these in your programs unless you specifically mean a particular built-in feature of Python.

_single_underscore: means that the variable is private.


CHAPTER

EIGHT

TRUTH, COMPARISON AND CONDITIONAL PROCESSING


Truth, Comparison and the if Statement, pass and assert Statements.
This section leads up to the for and while statements, as well as the break and continue statements.

The elements of Python we've seen so far give us some powerful capabilities. We can write programs that implement a wide variety of requirements. State change is not always as simple as the examples we've seen in Variables, Assignment and Input. When we run a script, all of the statements are executed unconditionally. Our programs can't handle alternatives or conditions.

Python provides decision-making mechanisms similar to other programming languages. In Truth and Logic we'll look at truth, logic and the logic operators. The exercises that follow examine some subtleties of Python's evaluation rules. In Comparisons we'll look at the comparison operators. Then, Conditional Processing: the if Statement describes the if statement. In The assert Statement we'll introduce a handy diagnostic tool, the assert statement.

In the next chapter, Loops and Iterative Processing, we'll look at looping constructs.

8.1 Truth and Logic


Many times the exact change in state that our program needs to make depends on a condition. A condition is a Boolean expression; an expression that is either True or False. Generally conditions are based on comparisons among variables using the comparison operations. We'll look at the essential definitions of truth, the logic operations and the comparison operations. This will allow us to build conditions.

8.1.1 Truth
Python represents truth and falsity in a variety of ways.

False. Also 0, the special value None, zero-length strings "", zero-length lists [], zero-length tuples (), empty mappings {} are all treated as False.

True. Anything else that is not equivalent to False.

We try to avoid depending on relatively obscure rules for determining True vs. False. We prefer to use the two explicit keywords, True and False. Note that a previous version of Python didn't have the boolean literals, and some older open-source programs will define these values.


Python provides a factory function to collapse these various forms of truth into one of the two explicit boolean objects.

bool(object)
    Returns True when the argument object is one of the values equivalent to truth, False otherwise.
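The rules above can be checked by handing a few objects to bool(). This is just a sketch; the particular sample values and variable names are chosen for illustration.

```python
# Each of these is one of the forms of false listed above.
zero_is_false = bool(0)
none_is_false = bool(None)
empty_string_is_false = bool("")
empty_list_is_false = bool([])

# Anything else counts as true, even a string holding the word "False".
one_is_true = bool(1)
text_is_true = bool("False")
list_is_true = bool([0])

print(zero_is_false)
print(text_is_true)
```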

8.1.2 Logic
Python provides three basic logic operators that work on this Boolean domain. Note that this Boolean domain, with just two values, True and False, and these three operators form a complete algebraic system, sometimes called Boolean algebra, after the mathematician George Boole. The operators supported by Python are not, and and or. We can fully define these operators with rule statements or truth tables.

This truth table shows the evaluation of not x.
    print "x", "not x"
    print True, not True
    print False, not False

    x      not x
    True   False
    False  True

This table shows the evaluation of x and y for all combinations of True and False.
    print "x", "y", "x and y"
    print True, True, True and True
    print True, False, True and False
    print False, True, False and True
    print False, False, False and False

    x      y      x and y
    True   True   True
    True   False  False
    False  True   False
    False  False  False

An important feature of and is that it does not evaluate all of its parameters before it is applied. If the left-hand side is False or one of the equivalent values, the right-hand side is not evaluated, and the left-hand value is returned. We'll look at some examples of this later. For now, you can try things like the following.
    print False and 0
    print 0 and False

This will show you that the first false value is what Python returns for and.

This table shows the evaluation of x or y for all combinations of True and False.

    x      y      x or y
    True   True   True
    True   False  True
    False  True   True
    False  False  False

Parallel with the and operator, or does not evaluate the right-hand parameter if the left-hand side is True or one of the equivalent values.
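One way to watch this partial evaluation is to use a function that records each time it is called. The function noisy and the list calls are hypothetical names for this sketch (functions are covered in a later chapter); if an operand is skipped, its call never appears in the list.

```python
calls = []

def noisy(value):
    # Record that this operand was actually evaluated, then return it.
    calls.append(value)
    return value

first = False and noisy(1)       # left side is false: noisy(1) is skipped
second = 0 or noisy("right")     # left side is false: or must look at the right
third = noisy(0) and noisy(2)    # noisy(0) runs and is false: noisy(2) is skipped

print(first)
print(second)
print(third)
print(calls)
```

The list calls ends up holding only "right" and 0, the two operands that were actually evaluated.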


As a final note, and is a high priority operator (analogous to multiplication) and or is lower priority (analogous to addition). When evaluating expressions like a or b and c, the and operation is evaluated first, followed by the or operation.
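A quick sketch of the priority rule, using one set of values where the grouping changes the answer. The variable names are only for illustration.

```python
a, b, c = True, True, False

default_grouping = a or b and c    # Python groups this as a or (b and c)
and_first = a or (b and c)         # the same thing, with explicit parentheses
or_first = (a or b) and c          # forcing or to go first gives another answer

print(default_grouping)
print(and_first)
print(or_first)
```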

8.1.3 Exercises
1. Logic Short-Cuts. We have several versions of false: False, 0, None, '', (), [] and {}. We'll cover all of the more advanced versions of false in Data Structures. For each of the following, work out the value according to the truth tables and the evaluation rules. Since each truth or false value is unique, we can see which part of the expression was evaluated.

    False and None
    0 and None or () and []
    True and None or () and []
    0 or None and () or []
    True or None and () or []
    1 or None and 'a' or 'b'

8.2 Comparisons
We'll look at the basic comparison operators. We'll also look at the partial evaluation rules of the logic operators to show how we can build more useful expressions. Finally, we'll look at floating-point equality tests, which are sometimes done incorrectly.

8.2.1 Basic Comparisons


We compare values with the comparison operators. These correspond to the mathematical functions of <, ≤, >, ≥, = and ≠. Conditional expressions are often built using the Python comparison operators: <, <=, >, >=, == and != for less than, less than or equal to, greater than, greater than or equal to, equal to and not equal to.
    >>> p1 = 22./7.
    >>> p2 = 355/113.
    >>> p1
    3.1428571428571428
    >>> p2
    3.1415929203539825
    >>> p1 < p2
    False
    >>> p2 >= p2
    True

When applying a comparison operator, we see a number of steps.

1. Evaluate both argument values.

2. Apply the comparison to create a boolean result.

    (a) Convert both parameters to the same type. Numbers are converted to progressively longer types: plain integer to long integer to float to complex.


    (b) Do the comparison.
    (c) Return True or False.

We call out these three steps explicitly because there are some subtleties in comparison among unlike types of data; we'll come to this later when we cover sequences, mappings and classes in Data Structures. Generally, it doesn't make sense to compare unlike types of data. After all, you can't ask "Which is larger, the Empire State Building or the color green?"

Comparisons can be combined in Python, unlike most other programming languages. We can ask: 0 <= a < 6 which has the usual mathematical meaning. We're not forced to use the longer form: 0 <= a and a < 6. This is useful when a is actually some complex expression that we'd rather not repeat. Here is an example.
    >>> 3 < p1 < 3.2
    True
    >>> 3 < p1 and p1 < 3.2
    True

Note that the preceding example had a mixture of integers and floating-point numbers. The integers were coerced to floating-point in order to evaluate the expressions.
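The benefit of not repeating the middle expression can be made visible with a function that counts its own calls. The names middle and evaluations are hypothetical, introduced only for this sketch. In the combined form the middle expression is evaluated once; in the longer form it is written, and evaluated, twice.

```python
evaluations = []

def middle():
    # Stands in for "some complex expression that we'd rather not repeat".
    evaluations.append("called")
    return 22.0 / 7.0

chained = 3 < middle() < 3.2
calls_for_chained = len(evaluations)

longhand = 3 < middle() and middle() < 3.2
calls_for_longhand = len(evaluations) - calls_for_chained

print(chained)
print(calls_for_chained)
print(longhand)
print(calls_for_longhand)
```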

8.2.2 Partial Evaluation


We can combine the logic operators, comparisons and math. This allows us to use comparisons and logic to prevent common mathematical blunders like attempting to divide by zero, or attempting to take the square root of a negative number.

For example, let's start with this program that will figure the average of 95, 125 and 132.
    sum = 95 + 125 + 132
    count = 3
    average = float(sum)/count
    print average

Initially, we set the variables sum and count. Then we compute the average using sum and count.

Assume that the statement that computes the average (average=...) is part of a long and complex program. Sometimes that long program will try to compute the average of no numbers at all. This has the same effect as the following short example.
    sum, count = 0, 0
    average = float(sum)/count
    print average

In the rare case that we have no numbers to average we don't want to crash when we foolishly attempt to divide by zero. We'd prefer to have some more graceful behavior.

Recall from Truth and Logic that the and operator doesn't evaluate the right-hand side unless the left-hand side is True. Stated the other way, the and operator only evaluates the right side if the left side is True. We can guard the division like this:
    average = count != 0 and float(sum)/count
    print average


This is an example that can simplify certain kinds of complex processing. If the count is non-zero, the left side is true and the right side must be checked. If the count is zero, the left side is False, the result of the complete and operation is False.

This is a consequence of the meaning of the word and. The expression a and b means that a is true as well as b is true. If a is false, the value of b doesn't really matter, since the whole expression is clearly false. A similar analysis holds for the word or. The expression a or b means that one of the two is true; it also means that neither of the two is false. If a is true, then the value of b doesn't change the truth of the whole expression.

The statement "It's cold and rainy" is completely false when it is warm; rain doesn't matter to falsifying the whole statement. Similarly, "I'm stopping for coffee or a newspaper" is true if I've stopped for coffee, irrespective of whether or not I got a newspaper.
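The guard can be tried with both a normal count and a zero count. In this sketch the function name guarded_average and the parameter name total are hypothetical; total is used in place of sum only to avoid hiding Python's built-in function of that name.

```python
def guarded_average(total, count):
    # The division on the right is only attempted when count != 0 is True.
    return count != 0 and float(total) / count

normal = guarded_average(95 + 125 + 132, 3)
empty = guarded_average(0, 0)    # no ZeroDivisionError is raised

print(normal)
print(empty)
```

Note that with a count of zero the result is the value False rather than a number, so any later code that uses the average has to allow for that.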

8.2.3 Floating-Point Comparisons


Exact equality between floating-point numbers is a dangerous concept. After a lengthy computation, roundoff errors may leave infinitesimally small differences between floating-point numbers. The answers are close enough to equal for all practical purposes, but every single one of the 64 bits may not be identical.

The following technique is the appropriate way to do floating point comparisons.
abs(a-b)<0.0001

Rather than ask if the two floating point values are the same, we ask if they're close enough to be considered the same. For example, run the following tiny program.

floatequal.py
    #!/usr/bin/env python
    # Are two floating point values really completely equal?
    a,b = 1/3.0, .1/.3
    print a,b,a==b
    print abs(a-b)<0.00001

When we run this program, we get the following output.

    $ python floatequal.py
    0.333333333333 0.333333333333 False
    True

The two values appear the same when printed. Yet, on most platforms, the == test returns False. They are not precisely the same. This is a consequence of representing real numbers with only a finite amount of binary precision. Certain repeating decimals get truncated, and these truncation errors accumulate in our calculations.

There are ways to avoid this problem; one part of this avoidance is to do the algebra necessary to postpone doing division operations. Division introduces the largest number of erroneous bits onto the trailing edge of our numbers. The other part of avoiding the problem is never to compare floating point numbers for exact equality.
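The same comparison can be packaged so that the tolerance is stated once. This is a sketch only: the function name close_enough and its default tolerance of 0.00001 (taken from floatequal.py) are assumptions for illustration, and the right tolerance depends on the size of the numbers in your own problem.

```python
def close_enough(a, b, tolerance=0.00001):
    # True when the two values differ by less than the tolerance.
    return abs(a - b) < tolerance

a, b = 1 / 3.0, 0.1 / 0.3
exactly_equal = a == b
nearly_equal = close_enough(a, b)

print(exactly_equal)
print(nearly_equal)
```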


8.3 Conditional Processing: the if Statement


Many times the program's exact change in state depends on a condition. Conditional processing is done by setting statements apart in suites with conditions attached to the suites. The Python syntax for this is an if statement.

8.3.1 The if Statement


The basic form of an if statement provides a condition and a suite of statements that are executed when the condition is true. It looks like this:
    if expression :
        suite

The suite is an indented block of statements. Any statement is allowed in the block, including indented if statements. You can use either tabs or spaces for indentation. The usual style is four spaces.

This is our first compound statement. See Python Syntax Rules for some additional guidance on syntax for compound statements.

The if statement evaluates the condition expression first. When the result is True, the suite of statements is executed. Otherwise the suite is skipped.

For example, if two dice show a total of 7 or 11, the throw is a winner. In the following snippet, d1 and d2 are two dice values that range from 1 to 6.
    if d1+d2 == 7 or d1+d2 == 11:
        print "winner", d1+d2

Here we have a typically complex expression. The or operator evaluates the left side first. Python evaluates and applies the high-precedence arithmetic operator before the lower-precedence comparison operator. If the left side is true (d1 + d2 is 7), the or expression is true, and the suite is executed. If the left side is false, then the right side is evaluated. If it is true (d1 + d2 is 11), the or expression is true, and the suite is executed. Otherwise, the suite is skipped.


Python Syntax Rules

Python syntax is very simple. We've already seen how basic expressions and some simple statements are formatted. Here are some syntax rules and examples. Look at Syntax Formalities for an overview of the lexical rules.

Compound statements, including if, while, for, have an indented suite of statements. You have a number of choices for indentation; you can use tab characters or spaces. While there is a lot of flexibility, the most important thing is to be consistent. Further, the recommendation is to use spaces. That's what we'll show. The generally accepted way to format Python code is to set your editor to replace tabs with 4 spaces.

We'll show an example with spaces, shown via ␣.

a=0
if␣a==0:
␣␣␣␣print␣"a␣is␣zero"
else:
␣␣␣␣print␣"a␣is␣not␣zero"
if␣a%2==0:
␣␣␣␣print␣"a␣is␣even"
else:
␣␣␣␣print␣"a␣is␣odd"

IDLE uses four spaces for indentation automatically. If you're using another editor, you can set it to use four spaces, also.

8.3.2 The elif Clause


Often there are several conditions that need to be handled. This is done by adding elif clauses. This is short for else-if. We can add an unlimited number of elif clauses. The elif clause has almost the same syntax as the if clause.
    elif expression :
        suite

Here is a somewhat more complete rule for the come out roll in a game of craps:
    result= None
    if d1+d2 == 7 or d1+d2 == 11:
        result= "winner"
    elif d1+d2 == 2 or d1+d2 == 3 or d1+d2 == 12:
        result= "loser"
    print result

First, we checked the condition for winning; if the first condition is true, the first suite is executed and the entire if statement is complete. If the first condition is false, then the second condition is tested. If that condition is true, the second suite is executed, and the entire if statement is complete. If neither condition is true, the if statement has no effect.
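To try the three possible paths without rolling dice, the rule can be wrapped in a function and called with chosen values. The function name come_out is hypothetical, and functions themselves are covered in a later chapter; this is only a sketch of the if and elif behavior described above.

```python
def come_out(d1, d2):
    result = None
    if d1+d2 == 7 or d1+d2 == 11:
        result = "winner"
    elif d1+d2 == 2 or d1+d2 == 3 or d1+d2 == 12:
        result = "loser"
    # When neither condition is true, result is still None.
    return result

print(come_out(3, 4))
print(come_out(1, 1))
print(come_out(2, 2))
```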

8.3.3 The else Clause


Python also gives us the capability to put a catch-all suite at the end for all other conditions. This is done by adding an else clause. The else clause has the following syntax.


    else:
        suite

Here's the complete come-out roll rule, assuming two values d1 and d2.
    point= None
    if d1+d2 == 7 or d1+d2 == 11:
        print "winner"
    elif d1+d2 == 2 or d1+d2 == 3 or d1+d2 == 12:
        print "loser"
    else:
        point= d1+d2
        print "point is", point

Here, we use the else: suite to handle all of the other possible rolls. There are six different values (4, 5, 6, 8, 9, or 10), a tedious typing exercise if done using or. We summarize this with the else: clause.

While handy in one respect, this else: clause is also dangerous. By not explicitly stating the condition, it is possible to overlook simple logic errors.

Consider the following complete if statement that checks for a winner on a field bet. A field bet wins on 2, 3, 4, 9, 10, 11 or 12. The payout odds are different on 2 and 12.
    outcome= 0
    if d1+d2 == 2 or d1+d2 == 12:
        outcome= 2
        print "field pays 2:1"
    elif d1+d2==4 or d1+d2==9 or d1+d2==10 or d1+d2==11:
        outcome= 1
        print "field pays even money"
    else:
        outcome= -1
        print "field loses"

Here we test for 2 and 12 in the first clause; we test for 4, 9, 10 and 11 in the second. It's not obvious that a roll of 3 is missing from the even money pay out. This fragment incorrectly treats 3, 5, 6, 7 and 8 alike in the else:.

While the else: clause is used commonly as a catch-all, a more proper use for else: is to raise an exception because a condition was not matched by any of the if or elif clauses.
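One possible shape for this more defensive style is sketched below. Every outcome of the field bet gets an explicit condition, including the roll of 3 and the losing rolls, and the else: clause is reserved for rolls that should be impossible. The function name field_bet, the numeric return values and the choice of ValueError are assumptions for the example; the raise statement is covered in Exceptions.

```python
def field_bet(d1, d2):
    roll = d1 + d2
    if roll == 2 or roll == 12:
        return 2       # pays 2:1
    elif roll == 3 or roll == 4 or 9 <= roll <= 11:
        return 1       # pays even money
    elif 5 <= roll <= 8:
        return -1      # field loses
    else:
        # No legal pair of dice reaches this point.
        raise ValueError("impossible roll: %r" % roll)

print(field_bet(1, 2))
print(field_bet(6, 6))
print(field_bet(3, 4))
```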

8.4 The pass Statement


The pass statement does nothing. Sometimes we need a placeholder to fill the syntactic requirements of a compound statement. We use the pass statement to fill in the required suite of statements. The syntax is trivial.
pass

Here's an example of using the pass statement.

    if n%2 == 0:
        pass # Ignore even values
    else:
        count += 1 # Count the odd values


Yes, technically, we can invert the logic in the if-clause. However, sometimes it is more clear to provide the explicit "do nothing" than to determine the inverse of the condition in the if statement.

As programs grow and evolve, having a pass statement can be a handy reminder of places where a program can be expanded. Also, when we come to class declarations in Data + Processing = Objects, we'll see one other use for the pass statement.

8.5 The assert Statement


An assertion is a condition that we're claiming should be true at this point in the program. Typically, it summarizes the state of the program's variables. Assertions can help explain the relationships among variables, review what has happened so far in the program, and show that if statements and for or while loops have the desired effect.

When a program is correct, all of the assertions are true no matter what inputs are provided. When a program has an error, at least one assertion winds up false for some combination of inputs.

Python directly supports assertions through an assert statement. There are two forms:
    assert condition
    assert condition , expression

If the condition is False, the program is in error; this statement raises an AssertionError exception. If the condition is True, the program is correct and this statement does nothing more.

If the second form of the statement is used, and an expression is given, an exception is raised using the value of the expression. We'll cover exceptions in detail in Exceptions. If the expression is a string, it becomes the value associated with the AssertionError exception.

Note: Additional Features
The expression does not have to be a string. Any object can be given; it is passed to AssertionError as its argument, and the exception raised is still an AssertionError. This is not widely used, and depends on elements of the language we haven't covered yet.

Here's a typical example:
    max= 0
    if a < b: max= b
    if b < a: max= a
    assert (max == a or max == b) and max >= a and max >= b

If the assertion condition is true, the program continues. If the assertion condition is false, the program raises an AssertionError exception and stops, showing the line where the problem was found.

Run this program with a equal to b and not equal to zero; it will raise the AssertionError exception. Clearly, the if statements don't set max to the largest of a and b when a = b. There is a problem in the if statements, and the presence of the problem is revealed by the assertion.
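The experiment suggested above can be run as a short script. In this sketch the function name flawed_max, the variable largest (used in place of max to avoid hiding the built-in function) and the message text are all made up for illustration. The second form of assert is used so that the failure carries an explanation.

```python
def flawed_max(a, b):
    largest = 0
    if a < b:
        largest = b
    if b < a:
        largest = a
    # Neither if statement fires when a == b, so largest is still 0.
    assert (largest == a or largest == b) and largest >= a and largest >= b, \
        "largest was not set correctly"
    return largest

print(flawed_max(2, 5))

try:
    flawed_max(3, 3)
    caught = None
except AssertionError as error:
    caught = str(error)
print(caught)
```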


8.6 The if-else Operator


There are situations where an expression involves a simple condition and a full-sized if statement is distracting syntactic overkill. Python has a handy logic operator that evaluates a condition, then returns either of two values depending on that condition.

Ternary Operator
Most arithmetic and logic operators have either one or two values. An operation that applies to a single value is called unary. For example -a and abs(b) are examples of unary operations: unary negation and unary absolute value. An operation that applies to two values is called binary. For example, a*b shows the binary multiplication operator.

The if-else operator is trinary (or ternary). It involves a conditional expression and two alternative expressions. Consequently, it doesn't use a single special character, but uses two keywords: if and else.

Some folks will mistakenly call it the ternary operator as if this is the only possible ternary operator.

The basic form of the operator is
expression if condition else expression

Python evaluates the condition in the middle first. If the condition is True, then the left-hand expression is evaluated, and that's the value of the operation. If the condition is False, then the right-hand expression is evaluated, and that's the value of the operation.

Note that the condition is always evaluated. Only one of the other two expressions is evaluated, making this a kind of short-cut operator like and and or.

Here are a couple of examples.
    average = sum/count if count != 0 else None
    oddSum = oddSum + ( n if n % 2 == 1 else 0 )

The intent is to have an English-like reading of the statement: "The average is the sum divided by the count if the count is non-zero; else the average is None." The wordy alternative to the first example is the following.
if count != 0:
    average= sum/count
else:
    average= None

This seems like three extra lines of code to prevent an error in the rare situation of there being no values to average. Similarly, the wordy version of the second example is the following:
if n % 2 == 0:
    pass
else:
    oddSum = oddSum + n


For this second example, the original statement registered our intent very clearly: we were summing the odd values. The long-winded if-statement tends to obscure our goal by making it just one branch of the if-statement.
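As a small sketch of the two examples above, the if-else operator can be wrapped in functions and checked directly. The function names average and odd_sum are ours, introduced only for illustration. Note the parentheses in the second function: the if-else operator binds very loosely, so without them it would apply to the whole sum rather than to n alone.

```python
def average(total, count):
    """Return total/count, or None when there is nothing to average."""
    # Only one of the two outer expressions is evaluated,
    # so the division never happens when count is zero.
    return float(total) / count if count != 0 else None

def odd_sum(values):
    """Sum only the odd values, using the if-else operator to pick n or 0."""
    result = 0
    for n in values:
        result = result + (n if n % 2 == 1 else 0)
    return result

print(average(10, 4))
print(average(0, 0))
print(odd_sum([1, 2, 3, 4, 5]))
```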

8.7 Condition Exercises


1. Develop an or-guard. In the example above we showed the and-guard pattern:
average = count != 0 and float(sum)/count

Develop a similar technique using or. Compare this with the if-else operator.

2. Come Out Win. Assume d1 and d2 have the numbers on two dice. Assume this is the come out roll in Craps. Write the expression for winning (7 or 11). Write the expression for losing (2, 3 or 12). Write the expression for a point (4, 5, 6, 8, 9 or 10).

3. Field Win. Assume d1 and d2 have the numbers on 2 dice. The field pays on 2, 3, 4, 9, 10, 11 or 12. Actually there are two conditions: 2 and 12 pay at one set of odds (2:1) and the other 5 numbers pay at even money. Write the two conditions under which the field pays.

4. Hardways. Assume d1 and d2 have the numbers on 2 dice. A hardways proposition is 4, 6, 8, or 10 with both dice having the same value. It's the hard way to get the number. A hard 4, for instance, is d1+d2 == 4 and d1 == d2. An easy 4 is d1+d2 == 4 and d1 != d2. You win a hardways bet if you get the number the hard way. You lose if you get the number the easy way or you get a seven. Write the winning and losing condition for one of the four hard ways bets.

5. Sort Three Numbers. This is an exercise in constructing if-statements. Using only simple variables and if statements, you should be able to get this to work; a loop is not needed. Given 3 numbers (X, Y, Z), assign variables x, y, z so that $x \le y \le z$ and x, y, and z are from X, Y, and Z. Use only a series of if-statements and assignment statements.

Hint. You must define the conditions under which you choose between $x \leftarrow X$, $x \leftarrow Y$ or $x \leftarrow Z$. You will do a similar analysis for assigning values to y and z. Note that your analysis for setting y will depend on the value set for x; similarly, your analysis for setting z will depend on values set for x and y.

6. Come Out Roll. Accept d1 and d2 as input. First, check to see that they are in the proper range for dice. If not, print a message. Otherwise, determine the outcome if this is the come out roll. If the sum is 7 or 11, print "winner". If the sum is 2, 3 or 12, print "loser". Otherwise print the point.

7. Field Roll. Accept d1 and d2 as input. First, check to see that they are in the proper range for dice. If not, print a message. Otherwise, check for any field bet pay out. A roll of 2 or 12 pays 2:1, print "pays 2"; 3, 4, 9, 10 and 11 pay 1:1, print "pays even"; everything else loses, print "loses".

8. Hardways Roll. Accept d1 and d2 as input. First, check to see that they are in the proper range for dice. If not, print a message. Otherwise, check for a hard ways bet pay out. Hard 4 and 10 pay 7:1; hard 6 and 8 pay 9:1; easy 4, 6, 8 or 10, or any 7 loses. Everything else, the bet still stands.


9. Partial Evaluation. The partial evaluation of the and and or operators appears to violate the evaluate-apply principle espoused in The Evaluate-Apply Cycle. Instead of evaluating all parameters, these operators seem to evaluate only the left-hand parameter before they are applied. Is this special case a problem? Can these operators be removed from the language, and replaced with the simple if-statement? What are the consequences of removing the short-circuit logic operators?

8.8 Condition Style Notes


Now that we have introduced compound statements, you may need to make an adjustment to your editor. Set your editor to use spaces instead of tabs. Most Python is typed using four spaces instead of the ASCII tab character (^I). Most editors can be set so that when you hit the Tab key on your keyboard, the editor inserts four spaces. IDLE is set up this way by default. A good editor will follow the indents so that once you indent, the next line is automatically indented.

We'll show the spaces explicitly as ␣ in the following fragment.
if␣a␣>=␣b:
␣␣␣␣m␣=␣a
if␣b␣>=␣a:
␣␣␣␣m␣=␣b
This has typical spacing for a piece of Python programming. Note that the colon (:) immediately follows the condition. This is the usual format, and is consistent with the way natural languages (like English) are formatted.

These if statements can be collapsed to one-liners, in which case they would look like this:
if␣a␣>=␣b:␣m␣=␣a
if␣b␣>=␣a:␣m␣=␣b
It helps to limit your lines to 80 positions or less. You may need to break long statements with a \ at the end of a line. Also, parenthesized expressions can be continued onto the next line without a \. Some programmers will put in extra ()'s just to make line breaks neat.

While spaces are used sparingly, they are always used to set off comparison operators and boolean operators. Other mathematical operators may or may not be set off with spaces. This makes the comparisons stand out in an if statement or while statement.
if␣b**2-4*a*c␣<␣0:
␣␣␣␣print␣"no␣root"
This shows the space around the comparison, but not the other arithmetic operators.
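The two ways of continuing a long statement can be sketched as follows. The values of a, b and c are arbitrary, chosen only so the fragment runs; print is written with parentheses so the sketch also runs on newer releases of Python.

```python
a, b, c = 1.0, 2.0, 3.0   # arbitrary coefficients, for illustration only

# Continuation with a backslash at the end of the line.
discriminant = b**2 - \
    4*a*c

# Continuation inside parentheses; no backslash is needed.
same_value = (b**2
    - 4*a*c)

if discriminant < 0:
    print("no root")
```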


CHAPTER

NINE

LOOPS AND ITERATIVE PROCESSING


The for, while, break, continue Statements
The elements of Python we've seen so far give us some powerful capabilities. We can write programs that implement a wide variety of requirements. State change is not always as simple as the examples we've seen in Variables, Assignment and Input. In Truth, Comparison and Conditional Processing we saw how to make our programs handle alternatives or conditions. In this section, we'll see how to write programs which do their processing "for all" pieces of data. For example, when we compute an average, we compute a sum for all of the values.

Python provides iteration (sometimes called looping) similar to other programming languages. In Iterative Processing: For All and There Exists we'll describe the semantics of iterative statements in general. In Iterative Processing: The for Statement we'll describe the for statement. We'll cover the while statement in Iterative Processing: The while Statement. This is followed by some of the most interesting and challenging short exercises in this book. We'll add some iteration control in More Iteration Control: break and continue, describing the break and continue statements. We'll conclude this chapter with a digression on the correct ways to develop iterative and conditional statements in A Digression.

9.1 Iterative Processing: For All and There Exists


There are two common quantifiers used for logical conditions. These are sometimes called the universal and existential quantifiers. We can call them "for all" and "there exists". We can also call them the "all" and "any" quantifiers.

A program may involve a state that is best described as a "for all" state, where a number of repetitions of some task are required. For example, if we were to write a program to simulate 100 rolls of two dice, the terminating condition for our program would be that we had done the simulation for all 100 rolls.

Similarly, we may have a condition that looks for the existence of a single example. We might want to know if a file contains a line with "ERROR" in it. In this case, we want to write a program whose terminating condition is that there exists an error line in the log file.

It turns out that "all" and "any" are logical inverses. We can always rework a "for any" condition to be a "for all" condition. "There exists an error line" is true exactly when "all lines are not error lines" is false, so a program that answers one question also answers the other.
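The relationship between the two quantifiers can be checked with Python's built-in any() and all() functions. This is only a sketch: the list of log lines is made up for illustration, and the generator expressions used here are covered much later.

```python
# Hypothetical log lines, invented for this sketch.
lines = ["INFO start", "ERROR disk full", "INFO done"]

# "There exists" a line with ERROR in it.
exists_error = any("ERROR" in line for line in lines)

# "For all" lines, the line is not an error line.
all_clean = all("ERROR" not in line for line in lines)

# The two answers are always logical inverses of each other.
assert exists_error == (not all_clean)
print(exists_error)
print(all_clean)
```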


Any time we have a for all or for any condition, we have an iteration: we will be iterating through the set of values, evaluating the condition. We have a choice of two Python statements for expressing this iteration. One is the for statement and the other is the while statement.

9.2 Iterative Processing: The for Statement


The simplest for statement looks like this:
for variable in iterable :
    suite

The suite is an indented block of statements. Any statement is allowed in the block, including indented for statements. The variable is a variable name. The suite will be executed iteratively with the variable set to each of the values in the given iterable. Typically, the suite will use the variable, expecting it to have a distinct value on each pass.

There are a number of ways of creating the necessary iterable collection of values. The most common is to use the range() function to generate a suitable list. We can also create the list manually, using a sequence display; we'll show some examples here. We'll return to the details of sequences in Sequences: Strings, Tuples and Lists.

The range() function has 3 forms:

range(x) generates x distinct values, from 0 to x-1, incrementing by 1. Mathematicians describe this as a "half-open interval" and write it $[0, x)$.

range(x, y) generates y-x distinct values from x to y-1, incrementing by 1: $[x, y)$.

range(x, y, z) generates values from x up to, but not including, y, incrementing by z: $x, x + z, x + 2z, \ldots, x + kz$, where k is the largest integer for which $x + kz < y$.

A sequence display looks like this:
[ expression , ... ]
It's a list of expressions, usually simply numbers, separated by commas. The square brackets are essential for marking a sequence.

Here are some examples.
for i in range(6):
    print i+1

This first example uses range() to create a sequence of 6 values from 0 to just before 6. The for statement iterates through the sequence, assigning each value to the local variable i. The print statement has an expression that adds one to i and prints the resulting value.
for j in range(1,7):
    print j

This second example uses range() to create a sequence of 6 values from 1 to just before 7. The for statement iterates through the sequence, assigning each value to the local variable j. The print statement prints the value.


for o in range(1,36,2):
    print o

This example uses range() to create a sequence of 36/2=18 values from 1 to just before 36 stepping by 2. This will be a list of odd values from 1 to 35. The for statement iterates through the sequence, assigning each value to the local variable o. The print statement prints all 18 values.
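The three range() forms used so far can be checked directly. This is a small sketch rather than part of the original examples; the variable names are ours. The list() call is harmless in Python 2.6, where range() already builds a list, and it makes the values visible on newer releases where range() is lazy.

```python
one_arg = list(range(6))            # 0 up to just before 6
two_args = list(range(1, 7))        # 1 up to just before 7
three_args = list(range(1, 36, 2))  # odd values from 1 to 35

assert one_arg == [0, 1, 2, 3, 4, 5]
assert two_args == [1, 2, 3, 4, 5, 6]
assert len(three_args) == 18
assert three_args[0] == 1 and three_args[-1] == 35
```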
for r in [1,3,5,7,9,12,14,16,18,19,21,23,25,27,30,32,34,36]:
    print r, "red"

This example uses an explicit sequence of values. These are all of the red numbers on a standard roulette wheel. It then iterates through the sequence, assigning each value to the local variable r. The print statement prints all 18 values followed by the word "red".

Here's a more complex example, showing nested for statements. This enumerates all the 36 outcomes of rolling two dice.
for d1 in range(6):
    for d2 in range(6):
        print d1+1,d2+1,'=',d1+d2+2

1. The outer for statement uses range() to create a sequence of 6 values, and iterates through the sequence, assigning each value to the local variable d1.
2. For each value of d1, the inner loop creates a sequence of 6 values, and iterates through that sequence, assigning each value to d2.
3. The print statement will be executed 36 times.

Here's the example alluded to earlier, which does 100 simulations of rolling two dice.
import random
for i in range(100):
    d1= random.randrange(6)+1
    d2= random.randrange(6)+1
    print d1+d2

1. The for statement uses range() to create a sequence of 100 values, and assigns each value to the local variable i. Note that the suite of statements never actually uses the value of i. The value of i marks the state changes until the loop is complete, but isn't used for anything else.
2. For each value of i, two values are created, d1 and d2.
3. The sum of d1 and d2 is printed.

There are a number of more advanced forms of the for statement, which we'll cover in the section on sequences in Sequences: Strings, Tuples and Lists.
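Before moving on, here is a small variation on the nested for statements shown earlier. Instead of printing every outcome, this sketch counts how many of the 36 outcomes total seven. The names sevens and outcomes are ours, and range(1, 7) is used so that d1 and d2 are the face values directly.

```python
sevens = 0
outcomes = 0
for d1 in range(1, 7):
    for d2 in range(1, 7):
        outcomes += 1            # one more of the 36 combinations
        if d1 + d2 == 7:
            sevens += 1          # this combination rolls a seven
print(sevens)
print(outcomes)
```

Six of the 36 combinations add up to seven, which is why seven is the most common roll of two dice.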

9.3 Iterative Processing: The while Statement


The while statement looks like this:
while expression :
    suite


The suite is an indented block of statements. Any statement is allowed in the block, including indented while statements. As long as the expression is true, the suite is executed. This allows us to construct a suite that steps through all of the necessary tasks to reach a terminating condition.

It is important to note that the suite of statements must include a change to at least one of the variables in the while expression. When it is possible to execute the suite of statements without changing any of the variables in the while expression, the loop will not terminate.

Let's look at some examples.
t, s = 1, 0
while t != 9:
    t, s = t + 2, s + t
1. The loop is initialized with t set to 1 and s set to 0.
2. We specify that the loop continues while $t \ne 9$.
3. In the body of the loop, we increment t by 2, so that it will be an odd value; we increment s by the previous value of t, summing a sequence of odd values.

When this loop is done, t is 9, and s is the sum of odd numbers less than 9: 1+3+5+7. Also note that the while condition depends on t, so changing t is absolutely critical in the body of the loop.

Here's a more complex example. This sums 100 dice rolls to compute an average.
import random
s, r = 0, 0
while r != 100:
    d1,d2=random.randrange(6)+1,random.randrange(6)+1
    s,r = s + d1+d2, r + 1
print float(s)/r
1. We initialize the loop with s and r both set to zero.
2. The while statement specifies that during the loop r will not be 100; when the loop is done, r will be 100.
3. The body of the loop sets d1 and d2 to random numbers; it increments s by the sum of those dice, and it increments r by 1.

When the loop is over, s will be the sum of 100 rolls of two dice. When we print float(s)/r, we print the average rolled on two dice; the float() conversion keeps the division of two integers from being truncated. The loop condition depends on r, so each trip through the loop must update r.
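The claims made about the end of the dice loop can be turned into assert statements. The following is a sketch under a few assumptions of ours: a private random.Random generator with a fixed seed is used only so that the run is repeatable, and the variable average is introduced to hold the result.

```python
import random

rng = random.Random(42)   # fixed seed, only so the sketch is repeatable

s, r = 0, 0
while r != 100:
    d1, d2 = rng.randrange(6) + 1, rng.randrange(6) + 1
    s, r = s + d1 + d2, r + 1   # r changes on every pass, so the loop ends

assert r == 100                 # the while condition is now false
assert 2 * r <= s <= 12 * r     # each roll of two dice is between 2 and 12
average = float(s) / r
print(average)
```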

9.4 More Iteration Control: break and continue


Python offers several statements for more subtle loop control. The point of these statements is to permit two common simplifications of a loop. In each case, these statements can be replaced with if statements; however, those if statement versions might be considered rather complex for expressing some fairly common situations.

The break statement terminates a loop prematurely. The syntax is trivial:
break


A break statement is always found within an if statement within the body of a for or while loop. A break statement is typically used when the terminating condition is too complex to write as an expression in the while clause of a loop. A break statement is also used when a for loop must be abandoned before the end of the sequence has been reached.

The continue statement skips the rest of a loop's indented suite. The syntax is trivial:
continue

A continue statement is always found within an if statement within a for or while loop. The continue statement is used instead of deeply nested else clauses.

Here's an example that has a complex break condition. We are going to see if we get six odd numbers in a row, or spin the roulette wheel 100 times. We'll look at this in some depth because it pulls a number of features together in one program. This program shows both break and continue constructs. Most programs can actually be simplified by eliminating the break and continue statements. In this case, we didn't simplify, just to show how the statements are used.

Note that we have a two part terminating condition: 100 spins or six odd numbers in a row. The hundred spins is relatively easy to define using the range() function. The six odd numbers in a row requires testing and counting and then, possibly, ending the loop. The overall ending condition for the loop, then, is that the number of spins is 100 or the count of odd numbers in a row is six.

sixodd.py
from __future__ import print_function
import random
oddCount= 0
for s in range(100):
    spinCount= s
    n= random.randrange(38)
    # Zero
    if n == 0 or n == 37: # treat 37 as 00
        oddCount = 0
        continue
    # Odd
    if n%2 == 1:
        oddCount += 1
        if oddCount == 6: break
        continue
    # Even
    assert n%2 == 0 and 0 < n <= 36
    oddCount = 0
print( oddCount, spinCount )

1. We import print_function from the __future__ module to allow use of the print() function instead of the print statement.
2. We import the random module, so that we can generate a random sequence of spins of a roulette wheel.
3. We initialize oddCount, our count of odd numbers seen in a row. It starts at zero, because we haven't seen any odd numbers yet.
4. The for statement will assign 100 different values to s, such that $0 \le s < 100$. This will control our experiment to do 100 spins of the wheel.


5. We save the current value of s in a variable called spinCount, setting up part of our post condition for this loop. We need to know how many spins were done, since one of the exit conditions is that we did 100 spins and never saw six odd values in a row. This "never saw six in a row" exit condition is handled by the for statement itself.
6. We'll treat 37 as if it were 00, which is like zero. In Roulette, these two numbers are neither even nor odd. The oddCount is set to zero, and the loop is continued. This continue statement resumes the loop with the next value of s. It restarts processing at the top of the for statement suite.
7. We check the value of oddCount to see if it has reached six. If it has, one of the exit conditions is satisfied, and we can break out of the loop entirely. The break statement stops executing statements in the suite of the for statement. If oddCount is not six, we don't break out of the loop; we use the continue statement to restart the for statement suite from the top with a new value for s.
8. We threw in an assert statement (see The assert Statement for more information) to claim that the spin, n, is even and not 0 or 37. This is kind of a safety net. If either of the preceding if statements were incorrect, or a continue statement was omitted, this statement would uncover that fact. We could do this with another if statement, but we wanted to use the assert statement.

At the end of the loop, spinCount is the number of the last spin and oddCount is the most recent count of odd numbers in a row. Either oddCount is six or spinCount is 99. When spinCount is 99, that means that spins 0 through 99 were examined; there are 100 different numbers between 0 and 99.
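The text notes that most programs can be simplified by eliminating break and continue. As an illustration only, here is one way sixodd.py might be reworked with the two-part terminating condition written directly in a while clause. The function name six_odd, its parameters and the variable spins are our own inventions, not part of the original program. Note that spins counts the spins completed (up to 100), where spinCount in the original holds the number of the last spin (up to 99).

```python
import random

def six_odd(rng, max_spins=100):
    """Spin until six odd numbers in a row or max_spins spins,
    without using break or continue."""
    odd_count = 0
    spins = 0
    while spins != max_spins and odd_count != 6:
        n = rng.randrange(38)      # 37 stands for 00
        if n == 0 or n == 37:
            odd_count = 0          # zero and double zero are neither odd nor even
        elif n % 2 == 1:
            odd_count += 1         # one more odd number in a row
        else:
            odd_count = 0          # an even number breaks the run
        spins += 1
    return odd_count, spins

odd_count, spins = six_odd(random.Random(1))   # seeded only for repeatability
assert odd_count == 6 or spins == 100
print("%d %d" % (odd_count, spins))
```

The final assert states the overall ending condition described above: the number of spins is 100 or the count of odd numbers in a row is six.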

9.5 Iteration Exercises


1. Greatest Common Divisor. The greatest common divisor is the largest number which will evenly divide two other numbers. Examples: GCD( 5, 10 ) = 5, the largest number that evenly divides 5 and 10. GCD( 21, 28 ) = 7, the largest number that divides 21 and 28. GCDs are used to reduce fractions. Once you have the GCD of the numerator and denominator, they can both be divided by the GCD to reduce the fraction to simplest form. 21/28 reduces to 3/4.

Greatest Common Divisor of two integers, p and q


Loop. Loop until $p = q$.
    Swap. If $p < q$ then swap p and q, $p \leftrightarrow q$.
    Subtract. If $p > q$ then subtract q from p, $p \leftarrow p - q$.
Result. Print p.

2. Extracting the Square Root. This is a procedure for approximating the square root. It works by dividing the interval which contains the square root in half. Initially, we know the square root of the number is somewhere between 0 and the number. We locate a value in the middle of this interval and determine if the square root is more or less than this midpoint. We continually divide the intervals in half until we arrive at an interval which is small enough and contains the square root. If the interval is only 0.001 in width, then we have the square root accurate to 0.001.

Square Root of a number, n


Two Initial Guesses.


    $g_1 \leftarrow 0$
    $g_2 \leftarrow n$
    At this point, $g_1 \times g_1 - n \le 0 \le g_2 \times g_2 - n$.
Loop. Loop until $|g_1 \times g_1 - n| \div n < 0.001$.
    Midpoint. $mid \leftarrow (g_1 + g_2) \div 2$
    Midpoint Squared vs. Number. $cmp \leftarrow mid \times mid - n$
    Which Interval? If $cmp \le 0$ then $g_1 \leftarrow mid$. If $cmp \ge 0$ then $g_2 \leftarrow mid$. If $cmp = 0$, mid is the exact answer!
Result. Print $g_1$.

3. Sort Four Numbers. This is a challenging exercise in if-statement construction. For some additional insight, see [Dijkstra76], page 61. Given 4 numbers (W, X, Y, Z), assign variables w, x, y, z so that $w \le x \le y \le z$ and w, x, y, z are from W, X, Y, and Z. Do not use an array. One way to guarantee the second part of the above is to initialize w, x, y, z to W, X, Y, Z, and then use swapping to rearrange the variables.

Hint: There are only a limited number of combinations of out-of-order conditions among four variables. You can design a sequence of if statements, each of which fixes one of the out-of-order conditions. This sequence of if statements can be put into a loop. Once all of the out-of-order conditions are fixed, the numbers are in order and the loop can end.

4. Highest Power of 2. This can be used to determine how many bits are required to represent a number. We want the highest power of 2 which is less than or equal to our target number. For example, $64 \le 100 < 128$. The highest power of 2 is $2^6$: $2^6 \le 100 < 2^7$. Given a number n, find a number p such that $2^p \le n < 2^{p+1}$.

This can be done with only addition and multiplication by 2. Multiplication by 2, by the way, can be done with the << shift operator. Do not use the pow() function, or even the ** operator, as these are too slow for our purposes.

Consider using a variable c, which you keep equal to $2^p$. An initialization might be p = 1, c = 2. When you increment p by 1, you also double c. Develop your own loop. This is actually quite challenging, even though the resulting program is tiny. For additional insight, see [Gries81], page 147.

5. How Much Effort to Produce Software? The following equations are the basic COCOMO estimating model, described in [Boehm81]. The input, K, is the number of 1000's of lines of source; that is, total source lines divided by 1000.

Development Effort, where K is the number of 1000's of lines of source. E is effort in staff-months.

$E = 2.4 \times K^{1.05}$

Development Cost, where E is effort in staff-months, R is the billing rate. C is the cost in dollars (assuming 152 working hours per staff-month).

$C = E \times R \times 152$


Project Duration, where E is effort in staff-months. D is duration in calendar months.

$D = 2.5 \times E^{0.38}$

Staffing, where E is effort in staff-months, D is duration in calendar months. S is the average staff size.

$S = \frac{E}{D}$

Evaluate these functions for projects which range in size from 8,000 lines (K = 8) to 64,000 lines (K = 64) in steps of 8. Produce a table with lines of source, Effort, Duration, Cost and Staff size.

6. Wind Chill Table. Wind chill is used by meteorologists to describe the effect of cold and wind combined. Given the wind speed in miles per hour, V, and the temperature in °F, T, the Wind Chill, w, is given by the formula below. See Wind Chill in Numeric Types and Expressions for more information.

$w = 35.74 + 0.6215\,T - 35.75\,V^{0.16} + 0.4275\,T\,V^{0.16}$

Wind speeds are for 0 to 40 mph; above 40, the difference in wind speed doesn't have much practical impact on how cold you feel. Evaluate this for all values of V (wind speed) from 0 to 40 mph in steps of 5, and all values of T (temperature) from -10 to 40 in steps of 5.

7. Celsius to Fahrenheit Conversion Tables. We'll make two slightly different conversion tables. For values of Celsius from -20 to +30 in steps of 5, produce the equivalent Fahrenheit temperature. The following formula converts C (Celsius) to F (Fahrenheit).

$F = 32 + \frac{212 - 32}{100} \times C$

For values of Fahrenheit from -10 to 100 in steps of 5, produce the equivalent Celsius temperatures. The following formula converts F (Fahrenheit) to C (Celsius).

$C = (F - 32) \times \frac{100}{212 - 32}$

8. Dive Planning Table. Given a surface air consumption rate, c, and the starting, s, and final, f, pressure in the air tank, a diver can determine maximum depths and times for a dive. For more information, see Surface Air Consumption Rate in Numeric Types and Expressions. Accept c, s and f from input, then evaluate the following for d from 30 to 120 in steps of 10. Print a table of t and d. For each diver, c is pretty constant, and can be anywhere from 10 to 20; use 15 for this example. Also, s and f depend on the tank used; typical values are s = 2500 and f = 500.

$t = \frac{33(s - f)}{c(d + 33)}$

9. Computing π. Each of the following series computes increasingly accurate values of π (3.1415926...).

$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots$

$\frac{\pi^2}{6} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$

$\pi = \sum_{0 \le k < \infty} \left(\frac{1}{16}\right)^k \left( \frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6} \right)$

$\frac{\pi}{2} = 1 + \frac{1}{3} + \frac{1 \cdot 2}{3 \cdot 5} + \frac{1 \cdot 2 \cdot 3}{3 \cdot 5 \cdot 7} + \cdots$

10. Computing e. A logarithm is a power of some base. When we use logarithms, we can effectively multiply numbers using addition, and raise to powers using multiplication. Two functions in Python's math module are related to this: math.log() and math.exp(). Both of these work with what are called natural logarithms, that is, logarithms where the base is e. This constant, e, is available in the math module, and it has the following formal definition:

Definition of e.

$e = \sum_{0 \le k < \infty} \frac{1}{k!}$

For more information on the Σ operator, see Digression on The Sigma Operator. The n! operator is factorial. Interestingly, it's a postfix operator: it comes after the value it applies to.

$n! = n \times (n-1) \times (n-2) \times \cdots \times 1$

For example, $4! = 4 \times 3 \times 2 \times 1 = 24$. By definition, $0! = 1$.

If we add up the values $\frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots$ we get the value of e. Clearly, when we get to about 1/10!, the fraction is so small it doesn't contribute much to the total.

We can do this with two loops, an outer loop to sum up the $\frac{1}{k!}$ terms, and an inner loop to compute the $k!$.

However, if we have a temporary value of $k!$, then each time through the loop we can multiply this temporary by k, and then add 1/temp to the sum.

You can test by comparing your results against math.e, $e \approx 2.71828$, or math.exp(1.0).

11. Hailstone Numbers. For additional information, see [Banks02]. Start with a small number, n, $1 \le n < 30$. There are two transformation rules that we will use:

If n is odd, multiply by 3 and add 1 to create a new value for n.

If n is even, divide by 2 to create a new value for n.

Perform a loop with these two transformation rules until you get to n = 1. You'll note that when n = 1, you get a repeating sequence of 1, 4, 2, 1, 4, 2, ...

You can test for oddness using the % (remainder) operation. If n % 2 == 1, the number is odd; otherwise it is even.

The two interesting facts are the "path length", the number of steps until you get to 1, and the maximum value found during the process.

Tabulate the path lengths and maximum values for numbers 1..30. You'll need an outer loop that ranges from 1 to 30. You'll need an inner loop to perform the two steps for computing a new n until n == 1; this inner loop will also count the number of steps and accumulate the maximum value seen during the process.

Check: for 27, the path length is 111, and the maximum value is 9232.

9.6 Condition and Loops Style Notes


As additional syntax, the for and while statements permit an else clause. This is a suite of statements that is executed when the loop terminates normally. This suite is skipped if the loop is terminated by a break statement. The else clause on a loop might be used for some post-loop cleanup. This is so unlike other programming languages that it is hard to justify using it.

An else clause always raises a small problem when it is used. It's never perfectly clear what conditions lead to execution of an else clause. The condition that applies has to be worked out from context. For instance, in if statements, one explicitly states the exact condition for all of the if and elif clauses. The logical inverse of this condition is assumed as the else condition. It is, unfortunately, left to the person reading the program to work out what this condition actually is.

Similarly, the else clause of a while statement is the basic loop termination condition, with all of the conditions on any break statements removed. The following kind of analysis can be used to work out the condition under which the else clause is executed.
while not BB:
    if C1: break
    if C2: break
else:
    assert BB and not C1 and not C2
assert BB or C1 or C2

Because this analysis can be difficult, it is best to avoid the use of else clauses in loop constructs.
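For readers who meet a loop else clause in someone else's code, here is a small sketch of how it behaves. The function name first_error and the sample lines in the usage are hypothetical. The else suite runs only when the for statement runs out of values without executing break.

```python
def first_error(lines):
    """Return the first line containing ERROR, or None if there is none."""
    for line in lines:
        if "ERROR" in line:
            found = line
            break
    else:
        # Reached only when the loop examined every line without a break.
        found = None
    return found

print(first_error(["INFO start", "ERROR disk full", "ERROR again"]))
print(first_error(["INFO start", "INFO done"]))
```

In keeping with the advice above, the same effect can be had without the else clause by setting found to None before the loop begins.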

9.7 A Digression
For those new to programming, here's a short digression, adapted from chapter 8 of Edsger Dijkstra's book, A Discipline of Programming [Dijkstra76].

Let's say we need to set a variable, m, to the larger of two input values, a and b. We start with a state we could call "m undefined". Then we want to execute a statement after which we are in a state of $(m = a \text{ or } m = b) \text{ and } m \ge a \text{ and } m \ge b$.

Clearly, we need to choose correctly between two different assignment statements. We need to do either m=a or m=b. How do we make this choice? With a little logic, we can derive the condition by taking each of these statements' effects out of the desired end-state.

For the statement m=a to be the right statement to use, we show the effect of the statement by replacing m with the value a, and examining the end state: $(a = a \text{ or } a = b) \text{ and } a \ge a \text{ and } a \ge b$. Removing the parts that are obviously true, we're left with $a \ge b$. Therefore, the assignment m=a is only useful when a >= b.

For the statement m=b to be the right statement to establish the necessary condition, we do a similar replacement of b for m and examine the end state: $(b = a \text{ or } b = b) \text{ and } b \ge a \text{ and } b \ge b$. Again, we remove the parts that are obviously true and we're left with $b \ge a$. Therefore, the assignment m=b is only useful when b >= a.

Each assignment statement can be guarded by an appropriate condition.
if a>=b: m=a
elif b>=a: m=b

This is the correct statement to set m to the larger of a or b.

Note that the hard part is establishing the post condition. Once we have that stated correctly, it's relatively easy to figure the basic kind of statement that might make some or all of the post condition true. Then we do a little algebra to fill in any guards or loop conditions to make sure that only the correct statement is executed.

Successful Loop Design. There are several considerations when using the while statement. This list is taken from David Gries, The Science of Programming [Gries81].


1. The body condition must be initialized properly.
2. At the end of the suite, the body condition is just as true as it was after initialization. This is called the invariant, because it is always true during the loop.
3. When this body condition is true and the while condition is false, the loop will have completed properly.
4. When the while condition is true, there are more iterations left to do. If we wanted to, we could define a mathematical function based on the current state that computes how many iterations are left to do; this function must have a value greater than zero when the while condition is true.
5. Each time through the loop we change the state of our variables so that we are getting closer to making the while condition false; we reduce the number of iterations left to do.

While these conditions seem overly complex for something so simple as a loop, many programming problems arise from missing one of them.

Gries recommends putting comments around a loop showing the conditions before and after the loop. Python provides the assert statement, which formalizes these comments into actual tests to be sure the program is correct.

Designing a Loop. Let's put a particular loop under the microscope. This is a small example, but shows all of the steps to loop construction. We want to find the least power of 2 greater than or equal to some number greater than 1, call it x. This power of 2 will tell us how many bits are required to represent x, for example.

We can state this mathematically as looking for some number, n, such that $2^{n-1} < x \le 2^n$. If x is a power of 2, for example 64, we'd find $2^6$. If x is another number, for example 66, we'd find $2^6 < 66 \le 2^7$, which is $64 < 66 \le 128$.

We can start to sketch our loop already.
    assert x > 1
    ... initialize ...
    ... some loop ...
    assert 2**(n-1) < x <= 2**n

We work out the initialization to make sure that the invariant condition of the loop is initially true. Since x must be greater than 1, we can set n to 1, because $2^{1-1} = 2^0 = 1 < x$. This sets things up to satisfy rules 1 and 2.
    assert x > 1
    n= 1
    ... some loop ...
    assert 2**(n-1) < x <= 2**n

In loops, there must be a condition on the body that is invariant, and a terminating condition that changes. The terminating condition is written in the while clause. In this case, it is invariant (always true) that $2^{n-1} < x$. That means that the other part of our final condition is the part that changes.
    assert x > 1
    n= 1
    while not ( x <= 2**n ):
        n= n + 1
        assert 2**(n-1) < x
    assert 2**(n-1) < x <= 2**n

The next to last step is to show that when the while condition is true, there are more than zero trips through the loop possible. We know that x is finite and some power of 2 will satisfy this condition. There's some n such that $n - 1 < \log_2 x \le n$, which limits the trips through the loop.

The final step is to show that each cycle through the loop reduces the trip count. We can argue that increasing n gets us closer to the upper bound of $\log_2 x$. We should add this information on successful termination as comments in our loop.
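The finished design can be packaged as a small function, with the conditions from Gries's list written as assert statements and the termination argument written as comments. This is only a sketch; the function name least_power_of_2 is a hypothetical name chosen for illustration and does not appear in the text above.

```python
def least_power_of_2(x):
    """Return the least n such that 2**(n-1) < x <= 2**n, for x > 1."""
    assert x > 1                    # precondition
    n = 1                           # initialization: 2**(1-1) == 1 < x
    assert 2**(n-1) < x             # rule 1: the invariant is true at the start
    while not (x <= 2**n):          # terminating condition
        # Rule 4: x is finite, so some n has x <= 2**n; trips remain until then.
        n = n + 1                   # rule 5: each trip moves n toward that bound
        assert 2**(n-1) < x         # rule 2: the invariant still holds
    assert 2**(n-1) < x <= 2**n     # rule 3: invariant plus a false while condition
    return n
```

For the two values used in the discussion, least_power_of_2(64) gives 6 and least_power_of_2(66) gives 7.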


CHAPTER

TEN

FUNCTIONS
The heart of programming is the evaluate-apply cycle, where function arguments are evaluated and then the function is applied to those argument values. We'll review this in Semantics.

In Function Definition: The def and return Statements we introduce the syntax for defining a function. In Function Use, we'll describe using a function we've defined.

Some languages make distinctions between varieties of functions, separating them into functions and subroutines. We'll visit this from a Python perspective in Function Varieties.

We'll look at some examples in Some Examples. We'll look at ways to use IDLE in Hacking Mode.

We introduce some of the alternate argument forms available for handling optional and keyword parameters in More Function Definition Features. Further sophistication in how Python handles parameters has to be deferred to Advanced Parameter Handling For Functions, as it depends on a knowledge of dictionaries, introduced in Mappings and Dictionaries.

In Object Method Functions we will describe how to use method functions as a prelude to Data Structures; real details on method functions are deferred until Classes.

We'll also defer examination of the yield statement until Iterators and Generators. The yield statement creates a special kind of function, one that is most useful when processing complex data structures, something we'll look at in Data Structures.

10.1 Semantics
A function, in a mathematical sense, is often described as a mapping from domain values to range values. Given a domain value, the function returns the matching range value.

If we think of the square root function, it maps a positive number, n, to another number, s, such that $s^2 = n$. If we think of multiplication as a function, it maps a pair of values, a and b, to a new value, c, such that $c = a \times b$. When we memorize multiplication tables, we are memorizing these mappings.

In Python, this narrow definition is somewhat relaxed. Python lets us create functions which do not need a domain value, but create new objects. It also allows us to have functions that don't return values, but instead have some other effect, like reading user input, or creating a directory, or removing a file.

What We Provide. In Python, we create a new function by providing three pieces of information: the name of the function, a list of zero or more variables, called parameters, with the domain of input values, and a suite of statements that creates the output values. This definition is saved for later use. We'll show this first in Function Definition: The def and return Statements.


Typically, we create function definitions in script files because we don't want to type them more than once. Almost universally, we import a file with our function definitions so we can use them.

We use a function in an expression by following the function's name with (), which enclose any argument values. The Python interpreter evaluates the argument values in the (), then applies the function. We'll show this second in Function Use.

Applying a function means that the interpreter first evaluates all of the argument values, then assigns the argument values to the function parameter variables, and finally evaluates the suite of statements that are the function's body. In this body, any return statements define the resulting range value for the function. For more information on this evaluate-apply cycle, see The Evaluate-Apply Cycle.

Namespaces and Privacy. Note that the parameter variables used in the function definition, as well as any variables in a function, are private to that function's suite of statements. This is a consequence of the way Python puts all variables in a namespace. When a function is being evaluated, Python creates a temporary namespace. This namespace is deleted when the function's processing is complete. The namespace associated with application of a function is different from the global namespace, and different from all other function-body namespaces.

While you can change the standard namespace policy (see The global Statement) it generally will do you more harm than good. A function's interface is easiest to understand if it is only the parameters and return values and nothing more. If all other variables are local, they can be safely ignored.

Terminology: argument and parameter. We have to make a firm distinction between an argument value, an object that is created or updated during execution, and the defined parameter variable of a function. The argument is the object used in a particular application of a function; it may be referenced by other variables or objects.
The parameter is a variable name that is part of the function, and is a local variable within the function body.
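A short sketch may make the namespace rule concrete. The names total, add_one and scratch are hypothetical, chosen only for this illustration. The parameter total is a local variable that happens to share a name with a global variable; rebinding it inside the function leaves the global untouched.

```python
total = 10                    # a variable in the global namespace

def add_one(total):
    """The parameter 'total' is local; it hides the global of the same name."""
    total = total + 1         # rebinds only the local parameter variable
    scratch = total * 2       # another local; it vanishes when the function ends
    return total

result = add_one(total)       # the argument is the object 10; the parameter is 'total'
```

After this runs, result is 11, the global total is still 10, and the name scratch does not exist outside the function.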


The Evaluate-Apply Cycle

The evaluate-apply cycle shows how any programming language computes the value of an expression. Consider the following expression:
math.sqrt( abs( b*b-4*a*c ) )

What does Python do? For the purposes of analysis, we can restructure this from the various mathematical notation styles to a single, uniform notation. We call this prefix notation, because all of the operations prefix their operands. While useful for analysis, this is cumbersome to write for real programs.
math.sqrt( abs( sub( mul(b,b), mul(mul(4,a),c) ) ) )

We've replaced x*y with mul(x,y), and replaced x-y with sub(x,y). This allows us to more clearly see how evaluate-apply works. Each part of the expression is now written as a function with one or two arguments. First the arguments are evaluated, then the function is applied to those arguments.

In order for Python to evaluate this math.sqrt(...) expression, it evaluates the argument, abs(...), and then applies math.sqrt() to it. This leads Python to a nested evaluate-apply process for the abs(...) expression. We'll show the whole process, with indentation to make it clearer. We're going to show this as a list of steps, with > to show how the various operations nest inside each other.
    Evaluate the arg to math.sqrt:
    > Evaluate the arg to abs:
    > > Evaluate the args to sub:
    > > > Evaluate the args to mul:
    > > > > Get the value of b
    > > > Apply mul to b and b, creating r4=mul( b, b ).
    > > > Evaluate the args to mul:
    > > > > Evaluate the args to mul:
    > > > > > Get the value of a
    > > > > Apply mul to 4 and a, creating r6=mul( 4, a ).
    > > > > Get the value of c
    > > > Apply mul to r6 and c, creating r5=mul( mul( 4, a ), c ).
    > > Apply sub to r4 and r5, creating r3=sub( mul( b, b ), mul( mul( 4, a ), c ) ).
    > Apply abs to r3, creating r2=abs( sub( mul( b, b ), mul( mul( 4, a ), c ) ) ).
    Apply math.sqrt to r2, creating r1=math.sqrt( abs( sub( mul( b, b ), mul( mul( 4, a ), c ) ) ) ).

Notice that a number of intermediate results were created as part of this evaluation. If we were doing this by hand, we'd write these down as steps toward the final result.

The apply part of the evaluate-apply cycle is sometimes termed a function call. The idea is that the main procedure calls the body of a function; the function does its work and returns to the main procedure. This is also called a function invocation.
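The prefix form of the expression can actually be run, because the standard operator module supplies mul() and sub() as ordinary functions. The following sketch assumes sample values for a, b and c, and uses hypothetical names (r_bb, r_4a and so on) for the intermediate results; it applies one function at a time, innermost first, and compares the outcome with the ordinary infix expression.

```python
import math
from operator import mul, sub

a, b, c = 1, 5, 6            # sample values, assumed for illustration

# Ordinary infix notation.
infix = math.sqrt(abs(b*b - 4*a*c))

# The same computation, one apply step at a time, innermost first.
r_bb = mul(b, b)             # 25
r_4a = mul(4, a)             # 4
r_4ac = mul(r_4a, c)         # 24
r_sub = sub(r_bb, r_4ac)     # 1
r_abs = abs(r_sub)           # 1
prefix = math.sqrt(r_abs)    # 1.0
```

Both infix and prefix end up as 1.0 for these sample values.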

10.2 Function Definition: The def and return Statements


We create a function with a def statement. This provides the name, parameters and the suite of statements that creates the function's result.
    def name ( parameter , ... ):
        suite

The name is the name by which the function is known. The parameters are a list of variable names; these names are the local variables to which actual argument values will be assigned when the function is applied. The suite (which must be indented) is a block of statements that computes the value for the function.


The first line of a function's suite is expected to be a document string (generally a triple-quoted string) that provides basic documentation for the function. This is traditionally divided into two sections, a summary section of exactly one line and the detail section. We'll return to this style guide in Functions Style Notes.

The return statement specifies the result value of the function. This value will become the result of applying the function to the argument values. This value is sometimes called the effect of the function.
return expression

The yield statement specifies one of the result values of an iterable function. We'll return to this in Iterators and Generators.

Let's look at a complete example.
    def odd( spin ):
        """Return true if this spin is odd."""
        if spin % 2 == 1:
            return True
        return False

1. We name this function odd(), and define it to accept a single parameter, named spin.
2. We provide a docstring with a short description of the function.
3. In the body of the function, we test to see if the remainder of spin/2 is 1; if so, we return True.
4. Otherwise, we return False.

10.3 Function Use


When Python evaluates odd(n), the following things will happen.

1. It evaluates n. For a simple variable, the value is the object to which the variable refers. For an expression, the expression is evaluated to result in an object.
2. It assigns this argument value to the local parameter of odd() (named spin).
3. It applies odd(): the suite of statements is executed, ending with a return statement.
4. The value on the return statement is returned to the calling statement so that it can finish its execution.

We would use this odd() function like this.
    import random

    s = random.randrange(37) # 0 <= s <= 36, single-0 roulette
    if s == 0:
        print "zero"
    elif odd(s):
        print s, "odd"
    else:
        print s, "even"

1. We evaluate a function named random.randrange to create a random number, s.
2. The if clause handles the case where s is zero.
3. The first elif clause evaluates our odd() function. To do this evaluation, Python must set spin to the value of s and execute the suite of statements that are the body of odd(). The suite of statements will return either True or False.
4. Since the if and elif clauses handle zero and odd cases, all that is left is for s to be even.

10.4 Function Varieties


Some programming languages make a distinction between various types of functions or subprograms. There can be functions or subroutines or procedure functions. Python (like Java and C++) doesn't enforce this kind of distinction. Instead, Python imposes some distinction based on whether the function uses parameters and returns a value or yields a collection of values.

Ordinary Functions. Functions which follow the classic mathematical definitions will map input argument values to a resulting value. These are, perhaps, a common kind of function. They include a return statement to express the resulting value.

Procedure Functions. One common kind of function is one that doesn't return a result, but instead carries out some procedure. This function would omit any return statement. Or, if a return statement is used to exit from the function, the statement would have no value to return. Carrying out an action is sometimes termed a side-effect of the function. The primary effect is always the value returned.

Here's an example of a function that doesn't return a value, but carries out a procedure.
    from __future__ import print_function

    def report( spin ):
        """Report the current spin."""
        if spin == 0:
            print( "zero" )
            return
        if odd(spin):
            print( spin, "odd" )
            return
        print( spin, "even" )

This function, report(), has a parameter named spin, but doesn't return a value. Here, the return statements exit the function but don't return values.

This kind of function would be used as if it were a new Python language statement, for example:
    for i in range(10):
        report( random.randrange(37) )

Here we execute the report() function as if it were a new kind of statement. We don't evaluate it as part of an expression.

There's actually no real subtlety to this distinction. Any expression can be used as a Python statement. A function call is an expression, and an expression is a statement. This greatly simplifies Python syntax.

The docstring for a function will explain what kind of value the function returns, or if the function doesn't return anything useful.

The simple return statement, by the way, returns the special value None. A function that reaches the end of its suite without executing a return statement also returns None. This default value means that you can define your function like report(), above, use it in an expression, and everything works nicely because the function does return a value.


    for i in range(10):
        t= report( random.randrange(37) )
        print t

You'll see that t is None.

Factory Functions. Another common form is a function that doesn't take a parameter. This function is a factory which generates a value. Some factory functions work by accessing some object encapsulated in a module. In the following example, we'll access the random number generator encapsulated in the random module.
    def spinWheel():
        """Return a string result from a roulette wheel spin."""
        t= random.randrange(38)
        if t == 37:
            return "00"
        return str(t)

This function's evaluate-apply cycle is simplified to just the apply phase. To make 0 (zero) distinct from 00 (double zero), it returns a string instead of a number.

Generators. A generator function contains the yield statement. These functions look like conventional functions, but they have a different purpose in Python. We will examine this in detail in Iterators and Generators.

These functions have a persistent internal processing state; ordinary functions can't keep data around from any previous calls without resorting to global variables. Further, these functions interact with the for statement. Finally, these functions don't make a lot of sense until we've worked with sequences in Sequences: Strings, Tuples and Lists.
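As a brief preview only, here is a minimal sketch of a generator function and its interaction with the for statement. The name count_down is hypothetical and chosen for illustration; the full treatment is left to Iterators and Generators.

```python
def count_down(start):
    """Yield start, start-1, ..., 1, one value per request."""
    n = start
    while n > 0:
        yield n          # hand one value to the for statement, then pause here
        n = n - 1        # the local variable n survives between values

values = []
for v in count_down(3):  # the for statement asks for values until none are left
    values.append(v)
```

The local variable n is the persistent internal state: it keeps its value from one yield to the next without any global variable.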

10.5 Some Examples


Here's a big example of using the odd(), spinWheel() and report() functions.

functions.py
    #!/usr/bin/env python
    import random

    def odd( spin ):
        """odd(number) -> boolean."""
        return spin%2 == 1

    def report( spin ):
        """Reports the current spin on standard output.
        Spin is a String"""
        if int(spin) == 0:
            print "zero"
            return
        if odd(int(spin)):
            print spin, "odd"
            return
        print spin, "even"

    def spinWheel():
        """Returns a string result from a roulette wheel spin."""
        t= random.randrange(38)
        if t == 37:
            return "00"
        return str(t)

    for i in range(12):
        n= spinWheel()
        report( n )

1. We've defined a function named odd(). This function evaluates a simple expression; it returns True if the value of its parameter, spin, is odd.
2. The function called report() uses the odd() function to print a line that describes the value of the parameter, spin. Note that the parameter is private to the function, so this use of the variable name spin is technically distinct from the use in the odd() function. However, since the report() function provides the value of spin to the odd() function, their two variables often happen to have the same value.
3. The spinWheel() function creates a random number and returns the value as a string.
4. The main part of this program is the for loop at the bottom, which calls spinWheel(), and then report(). The spinWheel() function uses random.randrange(); the report() function uses the odd() function. This generates and reports on a dozen spins of the wheel.

For most of our exercises, this free-floating main script is acceptable. When we cover modules, in Components, Modules and Packages, we'll need to change our approach slightly to something like the following.
    def main():
        for i in range(12):
            n= spinWheel()
            report( n )

    main()

This makes the main operation of the script clear by packaging it as a function. Then the only free-floating statement in the script is the call to main().

10.6 Hacking Mode


On one hand we have interactive use of the Python interpreter: we type something and the interpreter responds immediately. We can do simple things, but when our statements get too long, this interaction can become a nuisance. We introduced this first, in Command-Line Interaction.

On the other hand, we have scripted use of the interpreter: we present a file as a finished program to execute. While handy for getting useful results, this isn't the easiest way to get a program to work in the first place. We described this in Script Mode.

In between the interactive mode and scripted mode, we have a third operating mode, that we might call hacking mode. The idea is to write most of our script and then exercise portions of our script interactively. In this mode, we'll develop script files, but we'll exercise them in an interactive environment. This is handy for developing and debugging function definitions.

The basic procedure is as follows.


1. In our favorite editor, write a script with our function definitions. We often leave this editor window open. IDLE, for example, leaves this window open for us to look at.
2. Open a Python shell. IDLE, for example, always does this for us.
3. In the Python Shell, import the script file. In IDLE, this is effectively what happens when we run the module with F5. This will execute the various def statements, creating our functions in our interactive shell.
4. In the Python Shell, test the function interactively. If it works, we're done.
5. If the functions in our module didn't work, we return to our editor window, make any changes and save the file.
6. In the Python Shell, clear out the old definition by restarting the shell. In IDLE, we can force this with F6. This happens automatically when we run the module using F5.
7. Go back to step 3, to import and test our definitions.

The interactive test results can be copied and pasted into the docstring for the file with our function definitions. We usually copy the contents of the Python Shell window and paste it into our module's or function's docstring. This record of the testing can be validated using the doctest module.

Example. Here's the sample function we're developing. If you look carefully, you might see a serious problem. If you don't see the problem, don't worry, we'll find it by doing some debugging.

In IDLE, we created the following file.

function1.py Initial Version


    def odd( number ):
        """odd(number) -> boolean

        Returns True if the given number is odd.
        """
        return number % 2 == "1"

We have two windows open: function1.py and Python Shell.

Here's our interactive testing session. In our function1.py window, we hit F5 to run the module. Note the line that shows that the Python interpreter was restarted, forgetting any previous definitions. Then we exercised our function with two examples.
    Python 2.5.1 (r251:54863, Oct 5 2007, 21:08:09)
    [GCC 4.0.1 (Apple Inc. build 5465)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.

    ************************************************************
    Personal firewall software may warn about the connection IDLE
    makes to its subprocess using this computer's internal loopback
    interface. This connection is not visible on any external
    interface and no data is sent to or received from the Internet.
    ************************************************************

    IDLE 1.1.4
    >>> ================================ RESTART ================================
    >>>
    >>> odd(2)
    False
    >>> odd(3)
    False

Clearly, it doesn't work, since 3 is odd. When we look at the original function, we can see the problem. The expression number % 2 == "1" should be number % 2 == 1. We need to fix function1.py.

Once the file is fixed, we need to remove the old stuff from Python, re-import our function and rerun our test. IDLE does this for us when we hit F5 to rerun the module. It shows this with the prominent restart message.

If you are not using IDLE, you will need to restart Python to clear out the old definitions. Python optimizes import operations; if it's seen the module once, it doesn't import it a second time. To remove this memory of which modules have been imported, you will need to restart Python.
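The doctest module mentioned in the procedure can check a pasted session automatically. The following is a sketch of what the corrected function1.py might look like once the expected results have been pasted into the docstring; the exact layout of the docstring is an assumption, not the author's file.

```python
import doctest

def odd(number):
    """odd(number) -> boolean

    Returns True if the given number is odd.

    >>> odd(2)
    False
    >>> odd(3)
    True
    """
    return number % 2 == 1

if __name__ == "__main__":
    # Re-run every >>> example found in the docstrings and report mismatches.
    doctest.testmod()
```

When this file is run as a script, doctest.testmod() prints nothing if every example produces the recorded output. With the original "1" bug still in place, it would report that odd(3) returned False where True was expected.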

10.7 More Function Definition Features


Python provides a mechanism for optional parameters. This allows us to create a single function which has several alternative forms. In other languages, like C++ or Java, these are called overloaded functions; they are actually separate function definitions with the same name but different parameter forms. In Python, we can write a single function that accepts several parameter forms.

Python has three mechanisms for dealing with optional parameters and a variable number of parameters. We'll cover the basics of optional parameters in this section. The other mechanisms for dealing with variable numbers of parameters will be deferred until Advanced Parameter Handling For Functions because these mechanisms use some more advanced data structures.

Python functions can return multiple values. We'll look at this, also.

10.7.1 Default Values for Parameters


The most common way to implement optional parameters is by providing a default value for the optional parameters. If no argument is supplied for the parameter, the default value is used.
    def report( spin, count=1 ):
        print spin, count, "times in a row"

This silly function can be used in two ways:


    report( n )
    report( n, 2 )

The first form uses the default value of 1 for the count parameter. The second form has an explicit argument value of 2 for the count parameter.

If a parameter has no default value, it is not optional. If a parameter has a default value, it is optional. In order to disambiguate the assignment of arguments to parameters, Python uses a simple rule: all required parameters must be first, all optional parameters must come after the required parameters.

The int() function does this. We can say int("23") to do decimal conversion and int("23",16) to do hexadecimal conversion. Clearly, the second argument to int() has a default value of 10.

Important: Red Alert

It's very, very important to note that default values should be immutable objects. Python evaluates a default value once, when the def statement is executed, so a mutable default object is shared by every call. We'll return to this concept of mutability in Data Structures.


For now, be aware that numbers, strings, None, and tuple objects are immutable. As we look at various data types, we'll find that lists, sets and dictionaries are mutable, and should not be used as default values for function parameters.

Fancy Defaults. When we look at the Python range() function, we see a more sophisticated version of this. range(x) is the same as range(0,x,1). range(x,y) is the same as range(x,y,1). It appears from these examples that the first parameter is optional. The authors of Python use a pretty slick trick for this that you can use also. The range() function behaves as though the following function is defined.
    def range(x, y=None, z=None):
        if y is None:
            start, stop, step = 0, x, 1
        elif z is None:
            start, stop, step = x, y, 1
        else:
            start, stop, step = x, y, z
        # Real work is done with start, stop and step

By providing a default value of None, the function can determine whether a value was supplied or not supplied. This allows for complex default handling within the body of the function.

Conclusion. Python must find a value for all parameters. The basic rule is that the values of parameters are set in the order in which they are declared. Any missing parameters will have their default values assigned. These are called positional parameters, since the position is the rule used for assigning argument values when the function is applied.

If a mandatory parameter (a parameter without a default value) is missing, Python raises a TypeError. For example:

badcall.py
    #!/usr/bin/env python
    def hack(a,b):
        print a+b
    hack(3)

When we run this example, we see the following.


    MacBook-5:Examples slott$ python badcall.py
    Traceback (most recent call last):
      File "badcall.py", line 4, in <module>
        hack(3)
    TypeError: hack() takes exactly 2 arguments (1 given)
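Returning to the Red Alert about mutable default values earlier in this section, the following sketch shows what goes wrong and one common way around it, using the same None trick that range() uses. The function names append_bad and append_good are hypothetical, chosen only for this illustration.

```python
def append_bad(item, history=[]):
    """Risky: the default list is created once, when def is executed."""
    history.append(item)
    return history

def append_good(item, history=None):
    """Safer: None is immutable; build a fresh list on each call that needs one."""
    if history is None:
        history = []
    history.append(item)
    return history

first = append_bad(1)
second = append_bad(2)      # the very same list object as first

fresh1 = append_good(1)
fresh2 = append_good(2)     # a new list each time
```

Here second is [1, 2] and is the same object as first, which is rarely what the caller expects. The results fresh1 and fresh2 are the separate lists [1] and [2].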

10.7.2 Providing Argument Values by Keyword


In addition to supplying argument values by position, Python also permits argument values to be specified by name. Using explicit keywords can make programs much easier to read.


First, we'll define a function with a simple parameter list:


    import random

    def averageDice( samples=100 ):
        """Return the average of a number of throws of 2 dice."""
        s = 0
        for i in range(samples):
            d1,d2 = random.randrange(6)+1,random.randrange(6)+1
            s += d1+d2
        return float(s)/float(samples)

Next, we'll show three different kinds of arguments: keyword, positional, and default.
    test1 = averageDice( samples=200 )
    test2 = averageDice( 300 )
    test3 = averageDice()

When the averageDice() function is evaluated to set test1, the keyword form is used. The second call of the averageDice() function uses the positional form. The final example relies on a default for the parameter.

Conclusion. This gives us a number of variations including positional parameters and keyword parameters, both with and without defaults. Positional parameters work well when there are few parameters and their meaning is obvious. Keyword parameters work best when there are a lot of parameters, especially when there are optional parameters.

Good use of keyword parameters mandates good selection of keywords. Single-letter parameter names or obscure abbreviations do not make keyword parameters helpfully informative.

Here are the rules we've seen so far:

1. Supply values for all parameters given by name, irrespective of position.
2. Supply values for all remaining parameters by position; in the event of duplicates, raise a TypeError.
3. Supply defaults for any parameters that have defaults defined; if any parameters still lack values, raise a TypeError.

There are still more options available for handling variable numbers of parameters. It's possible for additional positional parameters to be collected into a sequence object. Further, additional keyword parameters can be collected into a dictionary object. We'll get to them when we cover dictionaries in Advanced Parameter Handling For Functions.
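The rules above can be exercised with a small sketch. It uses a variant of the report() function that returns its text instead of printing it, so the results are easy to compare; that change, and the variable names, are assumptions made for illustration.

```python
def report(spin, count=1):
    """Variant of report() that returns its text instead of printing it."""
    return "%s %d times in a row" % (spin, count)

by_position = report("7", 2)             # both arguments by position
by_keyword = report(count=2, spin="7")   # both by name, in any order
mixed = report("7", count=2)             # positional first, then keyword
with_default = report("7")               # count falls back to its default, 1

# Supplying spin both by position and by name is a duplicate.
try:
    report("7", spin="9")
    duplicate_error = None
except TypeError as error:
    duplicate_error = error
```

The first three calls produce the same string. The last call never reaches the function body: Python raises a TypeError because spin would receive two values.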

10.7.3 Returning Multiple Values


One common desire among programmers is a feature that allows a function to return multiple values. Python has some built-in functions that have this property. For example, divmod() returns the quotient and remainder in division. We could imagine a function, rollDice(), that would return two values showing the faces of two dice.

In Python, it is done by returning a tuple. We'll wait for Tuples for complete information on tuples. The following is a quick example of how multiple assignment works with functions that return multiple values.

rolldice.py


    import random

    def rollDice():
        return ( 1 + random.randrange(6), 1 + random.randrange(6) )

    d1,d2=rollDice()
    print d1,d2

This shows a function that creates a two-valued tuple. You'll recall from Multiple Assignment Statement that Python is perfectly happy with multiple expressions on the right side of =, and multiple destination variables on the left side. This is one reason why multiple assignment is so handy.
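The built-in divmod() function mentioned earlier gives a result we can predict, unlike rollDice(). In this short sketch the numbers 355 and 113 are arbitrary sample values.

```python
# Multiple assignment unpacks the two-valued tuple that divmod() returns.
quotient, remainder = divmod(355, 113)   # 355 == 3*113 + 16

# Without unpacking, we simply get the tuple itself.
pair = divmod(355, 113)
```

Here quotient is 3, remainder is 16, and pair is the tuple (3, 16).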

10.8 Function Exercises


1. Fast exponentiation. This is a fast way to raise a number to an integer power. It requires the fewest multiplies, and does not use logarithms.

Fast Exponentiation of integers, raises n to the p power


    (a) Base Case. If $p = 0$: return 1.0.
    (b) Odd. If p is odd: return $n \times \text{fastexp}(n, p - 1)$.
    (c) Even. If p is even: compute $t \leftarrow \text{fastexp}(n, \frac{p}{2})$; return $t \times t$.

2. Greatest Common Divisor. The greatest common divisor is the largest number which will evenly divide two other numbers. You use this when you reduce fractions. See Greatest Common Divisor for an alternate example of this exercise's algorithm. This version can be slightly faster than the loop we looked at earlier.

Greatest Common Divisor of two integers, p and q


    (a) Base Case. If $p = q$: return p.
    (b) p < q. If $p < q$: return GCD(q, p).
    (c) p > q. If $p > q$: return GCD(p − q, q).

3. Factorial Function. Factorial of a number n is the number of possible arrangements of 0 through n things. It is computed as the product of the numbers 1 through n. That is, $1 \times 2 \times 3 \times \cdots \times n$. The formal definition is

    $n! = n \times (n-1) \times (n-2) \times \cdots \times 1$
    $0! = 1$

We touched on this in Computing e. This function definition can simplify the program we wrote for that exercise.


Factorial of an integer, n
    (a) Base Case. If $n = 0$, return 1.
    (b) Multiply. If $n > 0$: return $n \times \text{factorial}(n - 1)$.

4. Fibonacci Series. Fibonacci numbers have a number of interesting mathematical properties. The ratio of adjacent Fibonacci numbers approximates the golden ratio ($(1 + \sqrt{5})/2$, about 1.618), used widely in art and architecture.

The nth Fibonacci Number, $F_n$


    (a) F(0) Case. If $n = 0$: return 0.
    (b) F(1) Case. If $n = 1$: return 1.
    (c) F(n) Case. If $n > 1$: return $F(n - 1) + F(n - 2)$.

5. Ackermann's Function. An especially complex algorithm that computes some really big results. This is a function which is specifically designed to be complex. It cannot easily be rewritten as a simple loop. Further, it produces extremely large results because it describes extremely large exponents.

Ackermann's Function of two numbers, m and n


    (a) Base Case. If $m = 0$: return $n + 1$.
    (b) N Zero Case. If $m \ne 0$ and $n = 0$: return ackermann(m − 1, 1).
    (c) N Non-Zero Case. If $m \ne 0$ and $n \ne 0$: return ackermann(m − 1, ackermann(m, n − 1)).

Yes, this requires you to compute ackermann(m, n − 1) before you can compute ackermann(m − 1, ackermann(m, n − 1)).

6. Maximum Value of a Function. Given some integer-valued function f(), we want to know what value of x has the largest value for f() in some interval of values. For additional insight, see [Dijkstra76].

Imagine we have an integer function of an integer, call it f(). Here are some examples of this kind of function.

    def f1(x): return x
    def f2(x): return -5/3*x-3
    def f3(x): return -5*x*x+2*x-3

The question we want to answer is what value of x in some fixed interval returns the largest value for the given function? In the case of the first example, def f1(x): return x, the largest value of f1() in the interval $0 \le x < 10$ occurs when x is 9. What about f3() in the range $-10 \le x < 10$?

Max of a Function, F, in the interval low to high


    (a) Initialize.
        $x \leftarrow low$;
        $max \leftarrow x$;
        $maxF \leftarrow F(max)$.
    (b) Loop. While $low \le x < high$.
        i. New Max? If $F(x) > maxF$: $max \leftarrow x$; $maxF \leftarrow F(max)$.
        ii. Next X. Increment x by 1.
    (c) Return. Return max as the value at which F(x) had the largest value.

7. Integration. This is a simple rectangular rule for finding the area under a curve which is continuous on some closed interval.

We will define some function which we will integrate, call it f(). Here are some examples.

    from math import exp, sin
    def f1(x): return x*x
    def f2(x): return 0.5 * x * x
    def f3(x): return exp( x )
    def f4(x): return 5 * sin( x )

When we specify $y = f(x)$, we are specifying two dimensions. The y is given by the function's values. The x dimension is given by some interval. If you draw the function's curve, you put two limits on the x axis; this is one set of boundaries. The space between the curve and the x axis is the other boundary.
The x axis limits are a and b. We subdivide this interval into s rectangles; the width of each is h = (b - a)/s. We take the function's value at the corner as the average height of the curve over that interval. If the interval is small enough, this is reasonably accurate.

Integrate a Function, F, in the interval a to b in s steps


(a) Initialize. x ← a; h ← (b - a)/s; sum ← 0.0.
(b) Loop. While a ≤ x < b.
i. Update Sum. Increment sum by F(x) × h.
ii. Next X. Increment x by h.
(c) Return. Return sum as the area under the curve F() for a ≤ x < b.

8. Field Bet Results. In the dice game of Craps, the Field bet is a winner when any of the numbers 2, 3, 4, 9, 10, 11 or 12 are rolled. On 2 and 12 it pays 2:1; on any of the other numbers, it pays 1:1. Define a function win( dice, num, pays ). If the value of dice equals num, then the value of pays is returned, otherwise 0 is returned. Make the default for pays a 1, so we don't have to repeat this value over and over again. Define a function field( dice ). This will call win() 7 times: once with each of the values for which the field pays. If the value of dice is a 7, it returns -1 because the bet is a loss. Otherwise it returns 0 because the bet is unresolved.


It would start with


def field( dice ):
    win( dice, 2, pays=2 )
    win( dice, 3, pays=1 )
    ...

Create a function roll() that creates two dice values from 1 to 6 and returns their sum. The sum of two dice will be a value from 2 to 12. Create a main program that calls roll() to get a dice value, then calls field() with the value that is rolled to get the payout amount. Compute the average of several hundred experiments.

9. range() Function Keywords. Does the range function permit keywords for supplying argument values? What are the keywords?

10. Optional First Argument. Optional parameters must come last, yet range fakes this out by appearing to have an optional parameter that comes first. The most common situation is range(5), and having to type range(0,5) seems rather silly. In this case, convenience trumps strict adherence to the rules. Is this a good thing? Is strict adherence to the rules more or less important than convenience?

10.9 Object Method Functions


We've seen how we can create functions and use those functions in programs and other functions. Python has a related technique called methods or method functions. The functions we've used so far are globally available. A method function, on the other hand, belongs to an object. The object's class defines what methods and what properties the object has.

We'll cover method functions in detail, starting in Classes. For now, however, some of the Python data types we're going to introduce in Data Structures will use method functions. Rather than cover too many details, we'll focus on general principles of how you use method functions in this section. The syntax for calling a method function looks like this:
someObject.aMethod( argument list )

A single . separates the owning object (someObject) from the method name (aMethod()). We glanced at a simple example when we first looked at complex numbers. The complex conjugate function is actually a method function of the complex number object. The example is in Complex Numbers.

In the next chapter, we'll look at various kinds of sequences. Python defines some generic method functions that apply to any of the various classes of sequences. The string and list classes, both of which are special kinds of sequences, have several method functions that are unique to strings or lists. For example:
>>> "Hi Mom".lower()
'hi mom'

Here, we call the lower() method function, which belongs to the string object "Hi Mom".

When we describe modules in Components, Modules and Packages, we'll cover module functions. These are functions that are imported with the module. The array module, for example, has an array() function that creates array objects. An array object has several method functions. Additionally, an array object is a kind of sequence, so it also has all of the methods common to sequences.
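The difference between a module function and a method function can be seen in a few lines. The following is a minimal sketch, not an example from the original text: the variable names (samples, count, first, ones) are our own, and the 'i' type code simply asks for an array of integers.

```python
import array

# array.array() is a module function: it is reached through the module name.
samples = array.array('i', [3, 1, 4])

# append() is a method function: it belongs to the array object itself.
samples.append(1)

# An array is a kind of sequence, so the generic sequence tools apply too.
count = len(samples)
first = samples[0]
ones = samples.count(1)

print("count=%d first=%d ones=%d" % (count, first, ones))
```

Note that the module function is qualified by the module (array.array), while the method functions are qualified by the object (samples.append, samples.count).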


file objects have an interesting life cycle, also. A file object is created with a built-in function, file(). A file object has numerous method functions, many of which have side-effects of reading from and writing to external files or devices. We'll cover files in Files, listing most of the methods unique to file objects.

10.10 Functions Style Notes


The suite within a compound statement is typically indented four spaces. It is often best to set your text editor with tab stops every four spaces. This will usually yield the right kind of layout. We'll show the spaces explicitly, as ␣, in the following fragment.

def␣max(a,b):
␣␣␣␣if␣a>=b:
␣␣␣␣␣␣␣␣m=a
␣␣␣␣if␣b>=a:
␣␣␣␣␣␣␣␣m=b
␣␣␣␣return␣m

This has the typical spacing for a piece of Python programming. Also, limit your lines to 80 positions or less. You may need to break long statements with a \ at the end of a line. Also, parenthesized expressions can be continued onto the next line without a \. Some programmers will put in extra () just to make line breaks neat.

Names. Function names are typically mixedCase(). However, a few important functions were done in CapWords() style with a leading upper case letter. This can cause confusion with class names, and the recommended style is a leading lowercase letter for function names.

In some languages, many related functions will all be given a common prefix. Functions may be called inet_addr(), inet_network(), inet_makeaddr(), inet_lnaof(), inet_netof(), inet_ntoa(), etc. Because Python has classes (covered in Data + Processing = Objects) and modules (covered in Components, Modules and Packages), this kind of function-name prefix is not used in Python programs. The class or module name is the prefix. Look at the example of math and random for guidance on this.

Parameter names are also typically mixedCase. In the event that a parameter or variable name conflicts with a Python keyword or built-in name, the name is extended with an _. In the following example, we want our parameter to be named range, but this conflicts with the built-in function range(). We use a trailing _ to sort this out.
def integrate( aFunction, range_ ):
    """Integrate a function over a range."""
    ...

Blank lines are used sparingly in a Python file, generally to separate unrelated material. Typically, function definitions are separated by single blank lines. A long or complex function might have blank lines within the body. When this is the case, it might be worth considering breaking the function into separate pieces.

Docstrings. A string literal that is the first line of the body of a function is called a docstring. The recommended forms for docstrings are described in Python Enhancement Proposal (PEP) 257. Typically, the first line of the docstring is a pithy summary of the function. This may be followed by a blank line and more detailed information. The one-line summary should be a complete sentence.
def fact( n ):
    """fact( number ) -> number

    Returns the number of permutations of n things."""
    if n == 0: return 1L
    return n*fact(n-1L)

def bico( n, r ):
    """bico( number, number ) -> number

    Returns the number of combinations of n things
    taken in subsets of size r.

    Arguments:
    n -- size of domain
    r -- size of subset
    """
    return fact(n)/(fact(r)*fact(n-r))

The docstring can be retrieved with the help() function.

help(object)
Provides help about the given object.

Here's an example, based on our fact() shown above.
>>> help(fact)
Help on function fact in module __main__:

fact(n)
    fact( number ) -> number

    Returns the number of permutations of n things.

Note that you will be in the help reader, with a prompt of (END). Hit q to quit the help reader. For more information, see Getting Help.
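The docstring can also be examined without the help reader, because it is stored on the function object. The following is a small sketch under a couple of assumptions of our own: it uses plain integers rather than the long literals shown above so that it also runs on later Python releases, and the variable name summary is hypothetical.

```python
def fact(n):
    """fact( number ) -> number

    Returns the number of permutations of n things.
    """
    if n == 0:
        return 1
    return n * fact(n - 1)

# The docstring is kept on the function object as the __doc__ attribute.
# Its first line is the one-line summary.
summary = fact.__doc__.splitlines()[0]
print(summary)
```

This is one reason to keep the first line of a docstring short and complete: tools that show only the first line still say something useful.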


CHAPTER

ELEVEN

ADDITIONAL NOTES ON FUNCTIONS


The global Statement
In Functions and Namespaces we'll describe some of the internal mechanisms Python uses for storing variables. We'll introduce the global statement in The global Statement. We'll include a digression on the two common argument binding mechanisms, call by value and call by reference, in Call By Value and Call By Reference. Note that this is a distinction that doesn't apply to Python, but if you have experience in languages like C or C++, you may wonder where and how this is implemented. Finally, we'll cover some aspects of functions as first-class objects in Function Objects.

11.1 Functions and Namespaces


This is an overview of how Python determines the meaning of a name. We'll omit some details to hit the more important points. For more information, see section 4.1 of the Python Language Reference.

The important issue is that we want variables created in the body of a function to be private to that function. If all variables are global, then each function runs a risk of accidentally disturbing the value of a global variable. In the COBOL programming language (without using separate compilation or any of the modern extensions) all variables are globally declared in the data division, and great care is required to prevent accidental or unintended use of a variable.

To achieve privacy and separation, Python maintains several dictionaries of variables. These dictionaries define the context in which a variable name is understood. Because these dictionaries are used for resolution of variables, which name objects, they are called namespaces. A global namespace is available to all modules that are part of the currently executing Python script. Each module, class, function, lambda, or anonymous block of code given to the exec command has its own private namespace. Names are resolved using the nested collection of namespaces that define an execution environment. Python always checks the most-local dictionary first, ending with the global dictionary. Consider the following script.
def deep( a, b ):
    print "a=", a
    print "b=", b

def shallow( hows, things ):
    deep( hows, 1 )
    deep( things, coffee )

hows= 1
coffee= 2
shallow( "word", 3.1415926 )
shallow( hows, coffee )

1. The deep() function has a local namespace, where two variables are defined: a and b. When deep() is called from shallow(), three namespaces are active: the local namespace for deep(), the local namespace for shallow(), and the global namespace for the main script. Names used inside deep() are looked up in deep()'s local namespace first and then in the global namespace; the local namespace of the caller, shallow(), is not searched.

2. The shallow() function has a local namespace, where two variables are defined: hows and things. When shallow() is called from the main script, the local hows is resolved in the local namespace. It hides the global variable with the same name. The reference to coffee is not resolved in the local namespace, but is resolved in the global namespace. This is called a free variable, and is sometimes a symptom of poor software design.

3. The main script by definition executes in the global namespace, where two variables (hows and coffee) are defined, along with two functions, deep() and shallow().

Built-in Functions. If you evaluate the function globals(), you'll see the mapping that contains all of the global variables Python knows about. For these early programs, all of our variables are global. If you simply evaluate locals(), you'll see the same thing. However, if you call locals() from within the body of a function, you'll be able to see the difference between local and global variables.

The following example shows the creation of a global variable a, and a global function, q. It shows the local namespace in effect while the function is executing. In this local namespace we also have a variable named a.
>>> a=22.0
>>> globals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', '__doc__': None, 'a': 22.0}
>>> def q( x, y ):
...     a = x / y
...     print locals()
...
>>> locals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', 'q': <function q at 0x76830>, '__doc__': None, 'a': 22.0}
>>> globals()
{'__builtins__': <module '__builtin__' (built-in)>, '__name__': '__main__', 'q': <function q at 0x76830>, '__doc__': None, 'a': 22.0}
>>> q(22.0,7.0)
{'a': 3.1428571428571428, 'y': 7.0, 'x': 22.0}
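The name-resolution rules for the deep() and shallow() script can be checked directly. The following is a sketch rather than the original script: the functions return values instead of printing them, the lookup of things inside deep() is wrapped in a try statement (covered later, in Exceptions) so the script keeps running, and the name result is our own.

```python
hows = 1
coffee = 2

def deep(a, b):
    # 'things' is local to shallow(); the caller's names are not searched.
    try:
        things
        return "deep() can see things"
    except NameError:
        return "deep() cannot see things"

def shallow(hows, things):
    # The parameter 'hows' hides the global 'hows'; 'coffee' is a free
    # variable, resolved in the global namespace.
    return hows, coffee, deep(hows, things)

result = shallow("word", 3.1415926)
print(result)
```

The local hows inside shallow() is "word", while the global hows is still 1 after the call. The free variable coffee is found in the global namespace, and deep() has no access to the local variables of the function that called it.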

The function vars() accepts a parameter which is the name of a specific local context: a module, class, or object. It returns the local variables for that specific context. The local variables are kept in an attribute named __dict__. The vars() function retrieves this.

The dir() function examines the __dict__ of a specific object to locate all local variables as well as other features of the object.

Assignment statements, as well as def and class statements, create names in the local dictionary. The del statement removes a name from the local dictionary.
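As a quick illustration of vars() and dir(), here is a sketch that applies them to the math module and to the function q() from the earlier session. The variable names (math_names, same_dict, has_sqrt, local_names) are ours, chosen only for this example.

```python
import math

# vars() of a module returns that module's namespace dictionary (__dict__).
math_names = vars(math)
same_dict = math_names is math.__dict__

# dir() lists the names an object offers, as a list of strings.
has_sqrt = 'sqrt' in dir(math)

def q(x, y):
    a = x / y
    # With no argument, vars() behaves like locals().
    return sorted(vars().keys())

local_names = q(22.0, 7.0)
print("same_dict=%s has_sqrt=%s local_names=%s" % (same_dict, has_sqrt, local_names))
```

Here vars(math) hands back the very dictionary the module uses as its namespace, and calling vars() with no argument inside q() shows the three local names a, x and y.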


Some Consequences. Since each imported module exists in its own namespace, all functions and classes within that module must have their names qualified by the module name. We saw this when we imported math and random. To use the sqrt() function, we must say math.sqrt, providing the module name that is used to resolve the name sqrt(). This module namespace assures that everything in a module is kept separate from other modules. It makes our programs clear by qualifying the name with the module that defined the name.

The module namespace also allows a module to have relatively global variables. A module, for example, can have variables that are created when the module is imported. In a sense these are global to all the functions and classes in the module. However, because they are only known within the module's namespace, they won't conflict with variables in our program or other modules.

Having to qualify names within a module can become annoying when we are making heavy use of a module. Python has ways to put elements of a module into the global namespace. We'll look at these in Components, Modules and Packages.
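To see the separation that module namespaces provide, consider this small sketch. The local sqrt() function is deliberately silly and is our own invention; the point is only that it can coexist with math.sqrt() without either one disturbing the other.

```python
import math

# Our own sqrt lives in our namespace; it does not disturb math.sqrt.
def sqrt(x):
    return "not the real square root of %r" % (x,)

ours = sqrt(16)
theirs = math.sqrt(16)

# pi is a variable that is created when the math module is imported.
circumference = 2 * math.pi * 1.0

print(ours)
print(theirs)
```

The qualified name math.sqrt always reaches the module's function, and math.pi is an example of a variable that is global to the module but does not collide with names in our program.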

11.2 The global Statement


The suite of statements in a function definition executes with a local namespace that is different from the global namespace. This means that all variables created within a function are local to that function. When the suite finishes, these working variables are discarded.

The overall Python session works in the global namespace. Every other context (e.g. within a function's suite) is a distinct local namespace. Python offers us the global statement to change the namespace search rule.
global name

The global statement tells Python that the following names are part of the global namespace, not the local namespace. The following example shows two functions that share a global variable.
ratePerHour= 45.50
def cost( hours ):
    global ratePerHour
    return hours * ratePerHour
def laborMaterials( hours, materials ):
    return cost(hours) + materials
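The global statement matters most when a function assigns to a name. The following sketch is not from the original text; the names callCount, countWithoutGlobal() and countWithGlobal() are hypothetical, and the script is assumed to run at the top level of a module so that callCount really is a global variable.

```python
callCount = 0

def countWithoutGlobal():
    # Assignment creates a new local callCount; the global is untouched.
    callCount = 1
    return callCount

def countWithGlobal():
    # The global statement makes the assignment update the global name.
    global callCount
    callCount = callCount + 1
    return callCount

countWithoutGlobal()
afterLocal = callCount      # still 0

countWithGlobal()
afterGlobal = callCount     # now 1

print("afterLocal=%d afterGlobal=%d" % (afterLocal, afterGlobal))
```

A function that only reads a global name, as cost() does with ratePerHour, would find it through the ordinary search rule; the statement changes behavior only when the function assigns to the name.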

Warning: Global Warning
The global statement has a consequence of tightly coupling pieces of software. This can lead to difficulty in maintenance and enhancement of the program. Classes and modules provide better ways to assemble complex programs. As a general policy, we discourage use of the global statement.

11.3 Call By Value and Call By Reference


Beginning programmers can skip this section. This is a digression for experienced C and C++ programmers.

Most programming languages have a formal mechanism for determining if a parameter receives a copy of the argument (call by value) or a reference to the argument object (call by name or call by reference).


The distinction is important in languages with primitive types: data which is not a formal object. These primitive types can be efficiently passed by value, where ordinary objects are more efficiently passed by reference. Additionally, this allows a language like C or C++ to use a reference to a variable as input to a function and have the function update the variable without an obvious assignment statement.

Bad News. The following scenario is entirely hypothetical for Python programmers, but a very real problem for C and C++ programmers. Imagine we have a function to2(), with this kind of definition in C.
int to2( int *a ) {
    /* set parameter a's value to 2 */
    *a= 2;
    return 0;
}

This function changes the value of the variable a to 2. This would be termed a side-effect because it is in addition to any value the function might return normally. When we do the following in C
int x= 27;
int z= to2( &x );
printf( "x=%i, z=%i", x, z );

We get the unpleasant side-effect that our function to2() has changed the argument variable, x, and the variable wasn't in an assignment statement! We merely called a function, using x as an argument. In C, the & operator is a hint that a variable might be changed. Further, the function definition should contain the keyword const when the reference is properly read-only. However, these are burdens placed on the programmer to assure that the program compiles correctly.

Python Rules. In Python, the arguments to a function are always objects, never references to variables. Consider this Python version of the to2() function:
def to2( a ):
    a = 2
    return 0

x = 27
z = to2( x )
print "x=%d, z=%d" % ( x, z )

The variable x is a reference to an integer object with a value of 27. The parameter variable (a) in the to2() function is a reference to the same object, and a is local to the function's scope. The original variable, x, cannot be changed by the function, and the original argument object, the integer 27, is immutable, and can't be changed either.

If an argument value is a mutable object, the parameter is a reference to that object, and the function has access to methods of that object. The methods of the object can be called, but the original object cannot be replaced with a new object.

We'll look at mutable objects in Data Structures. For now, all the objects we've used (strings and numbers) are immutable and cannot be changed.

The Python rules also mean that, in general, all variable updates must be done explicitly via an assignment statement. This makes variable changes perfectly clear.
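Both halves of the rule can be seen side by side in a short sketch. It looks ahead to the list, a mutable object covered in Data Structures; the function appendTwo() and the variable items are hypothetical names introduced only for this illustration.

```python
def to2(a):
    # Rebinds the local name only; the caller's variable is unaffected.
    a = 2
    return 0

def appendTwo(aList):
    # Calls a method of the argument object; the caller sees the change.
    aList.append(2)
    return 0

x = 27
z = to2(x)

items = [27]
appendTwo(items)

print("x=%d, z=%d, items=%r" % (x, z, items))
```

After these calls x is still 27, because assignment inside to2() only rebinds the local name a. The list named items has grown, because appendTwo() called a method on the one list object that both names refer to; it did not replace that object.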


11.4 Function Objects


One interesting consequence of the Python world-view is that a function is an object of the class function, a subclass of callable. The common feature that all callable objects share is that they have a very simple interface: they can be called. Other callable objects include the built-in functions, generator functions (which have the yield statement instead of the return statement) and things called lambdas.

Sometimes we don't want to call and evaluate a function. Sometimes we want to do other things to or with a function. For example, the various factory functions (int(), long(), float(), complex()) can be used with the isinstance() function instead of being called to create a new object. For example, isinstance(2,int) has a value of True. It uses the int() function, but doesn't apply the int() function.

A function object is created with the def statement. Primarily, we want to evaluate the function objects we create. However, because a function is an object, it has attributes, and it can be manipulated to a limited extent.

From a syntax point of view, a name followed by () is a function call. You can think of the () as the call operator: they require evaluation of the arguments, then they apply the function.
name ( arguments )
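The difference between naming a function and calling it can be sketched as follows. The functions double() and applyTwice() and the variable names are our own, not part of the original text; callable() and isinstance() are the standard built-in functions.

```python
def double(x):
    return 2 * x

def applyTwice(aFunction, value):
    # aFunction is an ordinary object here; the () is what applies it.
    return aFunction(aFunction(value))

isInt = isinstance(2, int)     # uses int without calling it
canCall = callable(double)     # true for any callable object
result = applyTwice(double, 5)

print("isInt=%s canCall=%s result=%d" % (isInt, canCall, result))
```

In applyTwice( double, 5 ) the name double appears without (), so the function object itself is passed as an argument. It is only applied inside applyTwice(), where the call operator is used.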

There are a number of manipulations that you might want to do with a function object.

Call The Function. By far, the most common use for a function object is to call it. When we follow a function name with (), we are calling the function: evaluating the arguments, and applying the function. Calling the function is the most common manipulation.

Alias The Function. This is dangerous, because it can make a program obscure. However, it can also simplify the evolution and enhancement of software. Here's a scenario. Imagine that the first version of our program had two functions named rollDie() and rollDice(). The definitions might look like the following.
import random

def rollDie():
    return random.randrange(1,7)
def rollDice():
    return random.randrange(1,7) + random.randrange(1,7)

When we wanted to expand our program to handle five-dice games, we realized we could generalize the rollDice() function to cover both cases.
def rollNDice( n=2 ):
    t= 0
    for d in range(n):
        t += random.randrange( 1, 7 )
    return t

It is important to remove the duplicated algorithm in all three versions of our dice rolling function. Since rollDie() and rollDice() are just special cases of rollNDice(), we can replace our original two functions with something like the following.
def rollDie():
    return rollNDice( 1 )
def rollDice():
    return rollNDice()


However, we have an alternative. This revised definition of rollDice() is really just another name for rollNDice(). Because a function is an object assigned to a variable, we can have multiple variables assigned to the function. Here's how we create an alias to a function.
def rollDie():
    return rollNDice( 1 )
rollDice = rollNDice
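It can help to confirm what an alias really is. The sketch below repeats the rollNDice() definition so that it runs on its own; the variables sameObject, aliasName and roll are hypothetical names used only to record what we observe.

```python
import random

def rollNDice(n=2):
    t = 0
    for d in range(n):
        t += random.randrange(1, 7)
    return t

# Aliasing: bind a second name to the same function object.
rollDice = rollNDice

sameObject = rollDice is rollNDice
aliasName = rollDice.__name__    # still the name from the def statement
roll = rollDice()

print("sameObject=%s aliasName=%s" % (sameObject, aliasName))
```

The alias does not create a second function. Both names refer to one object, which still reports the name it was given in the def statement; this is part of why aliases can make a program harder to follow.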

Warning: Function Alias Confusion
Function alias definitions help maintain compatibility between old and new releases of software. It is not something that should be done as a general practice; we need to be careful providing multiple names for a given function. This can be a simplification. It can also be a big performance improvement for certain types of functions that are heavily used deep within nested loops.

Function Attributes. A function object has a number of attributes. We can interrogate those attributes, and to a limited extent, we can change some of these attributes. For more information, see section 3.2 of the Python Language Reference and section 2.3.9.3 of the Python Library Reference.

func_doc, __doc__
Docstring from the first line of the function's body.

func_name, __name__
Function name from the def statement.

__module__
Name of the module in which the function name was defined.

func_defaults
Tuple with default values to be assigned to each argument that has a default value. This is a subset of the parameters, starting with the first parameter that has a default value.

func_code
The actual code object that is the suite of statements in the body of this function.

func_globals
The dictionary that defines the global namespace for the module that defines this function. This is m.__dict__ of the module which defined this function.

func_dict, __dict__
The dictionary that defines the local namespace for the attributes of this function.

You can set and get your own function attributes, also. Here's an example.
def rollDie():
    return random.randrange(1,7)
rollDie.version= "1.0"
rollDie.author= "sfl"
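Here is a sketch of interrogating several of these attributes. One assumption should be stated clearly: it uses the double-underscore spelling __defaults__ in place of func_defaults, since that spelling is the one later Python releases keep. The docstring text and the variable names (name, doc, defaults, custom) are ours.

```python
import random

def rollNDice(n=2):
    """Return the total of n six-sided dice."""
    t = 0
    for d in range(n):
        t += random.randrange(1, 7)
    return t

# Our own attribute, stored in the function's __dict__.
rollNDice.version = "1.0"

name = rollNDice.__name__
doc = rollNDice.__doc__
defaults = rollNDice.__defaults__    # spelled func_defaults in the list above
custom = rollNDice.__dict__

print("%s %s %s" % (name, defaults, custom))
```

The tuple of defaults has one entry because only n has a default value, and the attribute we set ourselves shows up as an entry in the function's __dict__.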


Part III

Data Structures


The Data View


Computer programs are built with two essential features: data and processing. We started with processing elements of Python. We're about to start looking at data structures.

In Language Basics, we introduced almost all of the procedural elements of the Python language. We started with expressions, looking at the various operators and data types available. We described fourteen of the approximately 24 statements that make up the Python language.

Expression Statement. For example, a function evaluation where there is no return value. Examples include the print() function.
import. Used to include a module into another module or program.
print. Used to provide visible output. This is being replaced by the print() function.
assignment. This includes the simple and augmented assignment statements. This is how you create variables.
del. Used (rarely) to remove a variable, function, module or other object.
if. Used to conditionally perform suites of statements. This includes elif and else statements.
pass. This does nothing, but is a necessary syntactic placeholder for an if or while suite that is empty.
assert. Used to confirm the program is in the expected state.
for and while. Perform suites of statements using a sequence of values or while a condition is held true.
break and continue. Helpful statements for short-cutting loop execution.
def. Used to define a new function.
return. Used to exit a function. Provides the return value from the function.
global. Used to adjust the scoping rules, allowing local access to global names. We discourage its use in The global Statement.

The Other Side of the Coin. The next chapters focus on adding various data types to the basic Python language. The subject of data representation and data structures is possibly the most profound part of computer programming. Most of the killer applications (email, the world wide web, relational databases) are basically programs to create, read and transmit complex data structures.
We will make extensive use of the object classes that are built-in to Python. This experience will help us design our own object classes in Data + Processing = Objects. We'll work our way through the following data structures.

Sequences. In Sequences: Strings, Tuples and Lists we'll extend our knowledge of data types to include an overview of various kinds of sequences: strings, tuples and lists. Sequences are collections of objects accessed by their numeric position within the collection. In Strings we describe the string subclass of sequence. The exercises include some challenging string manipulations. We describe fixed-length sequences, called tuples, in Tuples. In Lists we describe the variable-length sequence, called a list. This list sequence is one of the powerful features that sets Python apart from other programming languages. The exercises at the end of the list section include both simple and relatively sophisticated problems.

Mappings. In Mappings and Dictionaries we describe mappings and dictionary objects, called dict. We'll show how dictionaries are part of some advanced techniques for handling arguments to functions. Mappings are collections of value objects that are accessed by key objects.


Sets. We'll cover set objects in Sets. Sets are simple collections of unique objects with no additional kind of access.

Exceptions. We'll cover exception objects in Exceptions. We'll also show the exception handling statements, including try, except, finally and raise statements. Exceptions are both simple data objects and events that control the execution of our programs.

Iterables. The yield statement is a variation on return that simplifies certain kinds of generator algorithms that process or create iterable data structures. We can iterate through almost any kind of data collection. We can also define our own unique or specialized iterations. We'll cover this in Iterators and Generators.

Files. The subject of files is so vast that we'll introduce file objects in Files. The with statement is particularly helpful when working with files. Files are so centrally important that we'll return to files in Components, Modules and Packages. We'll look at several of the file-related modules in File Handling Modules as well as File Formats: CSV, Tab, XML, Logs and Others.

In Functional Programming with Collections we describe more advanced sequence techniques, including multi-dimensional processing, additional sequence-processing functions, and sorting.

Deferred Topics. There are a few topics that need to be deferred until later.

try. We'll look at exceptions in Exceptions. This will include the except, finally and raise statements, also.
yield. We'll look at Generator Functions in Iterators and Generators.
class. We'll cover this in its own part, Classes.
with. We'll look at Context Managers in Managing Contexts: the with Statement.
import. We'll revisit import in detail in Components, Modules and Packages.
exec. Additionally, we'll cover the exec statement in The exec Statement.


CHAPTER

TWELVE

SEQUENCES: STRINGS, TUPLES AND LISTS


The Common Features of Sequences
Before digging into the details, we'll introduce the common features of three of the data types that are containers for sequences of values. In Sequence Semantics we will provide an overview of the semantics of sequences. We describe the common features of the sequences in Overview of Sequences.

The sequence is central to programming and central to Python. A number of statements and functions we have covered have sequence-related features that we have glossed over, danced around, and generally avoided. We'll revisit a number of functions and statements we covered in previous sections, and add the power of sequences to them. In particular, the for statement is something we glossed over in Iterative Processing: For All and There Exists.

In the chapters that follow we'll look at Strings, Tuples and Lists in detail. In Mappings and Dictionaries, we'll introduce another structured data type for manipulating mappings between keys and values.

12.1 Sequence Semantics


A sequence is a container of objects which are kept in a specific order. We can identify the individual objects in a sequence by their position or index. Positions are numbered from zero in Python; the element at index zero is the first element.

We call these containers because they are a single object which contains (or collects) any number of other objects. The "any number" clause means that they can contain zero other objects, meaning that an empty container is just as valid as a container with one or thousands of objects.

Important: Other Languages
In some programming languages, they use words like vector or array to refer to sequential containers. For example, in C or Java, the primitive array has a statically allocated number of positions. In Java, a reference outside that specific number of positions raises an exception. In C, however, a reference outside the defined positions of an array is an error that may never be detected. Really.

There are four commonly-used subspecies of sequence containers.

String, called str. A container of single-byte ASCII characters.
Unicode String, unicode. A container of multi-byte Unicode (or Universal Character Set) characters.


tuple. A container of anything with a fixed number of elements.
list. A container of anything with a dynamic number of elements.

Important: Python 3
This mix of types will change slightly. The String and Unicode types will merge into the str type. This will represent text. A new container, the byte array, will be introduced, named bytes. This will represent binary data. tuple and list won't change.

When we create a tuple or string, we've created an immutable, or static, object. We can examine the object, looking at specific characters or items. We can't change the object. This means that we can't put additional data on the end of a string. What we can do, however, is create a new string that is the concatenation of the two original string objects.

When we create a list, on the other hand, we've created a mutable object. A list can have additional objects appended to it or inserted in it. Objects can be removed from a list, also. A list can grow and shrink; the order of the objects in the list can be changed without creating a new list object.

One other note on strings. While strings are sequences of characters, there is no separate character data type. A character is simply a string of length one. This relieves programmers from the C or Java burden of remembering which quotes to use for single characters as distinct from multi-character strings. It also eliminates any problems when dealing with Unicode multi-byte characters.
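The contrast between immutable and mutable sequences can be sketched briefly. The list methods used here (append(), insert() and remove()) are covered in detail in Lists; the variable names are our own.

```python
greeting = "Hi"
original = greeting

# Concatenation builds a new string; the original object is unchanged.
greeting = greeting + " Mom"
stringUnchanged = (original == "Hi")

numbers = [1, 2, 3]
sameList = numbers

# append(), insert() and remove() change the list object in place.
numbers.append(4)
numbers.insert(0, 0)
numbers.remove(2)
listChanged = (sameList == [0, 1, 3, 4]) and (sameList is numbers)

# A character is just a string of length one.
firstChar = greeting[0]

print("%s %s %s %r" % (greeting, stringUnchanged, listChanged, firstChar))
```

The name original still refers to the two-character string, because the concatenation produced a separate object. The name sameList sees every change, because there is only one list object.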

12.2 Overview of Sequences


All the varieties of sequences (string, tuple and list) have some common characteristics. We'll identify the common features first, and then move on to cover these in detail for each individual type of sequence. This section is a road-map for the following three sections that cover string, tuple and list in detail.

Literal Values. Each sequence type has a literal representation. The details will be covered in separate sections, but the basics are these:

string uses quotes: "string".
tuple uses (): (1,'b',3.1).
list uses []: [1,'b',3.1].

Operations. Sequences have three common operations: + will concatenate sequences to make longer sequences. * is used with a number and a sequence to repeat the sequence several times. Finally, the [ ] operator is used to select elements from a sequence. The [ ] operator can extract a single item, or a subset of items by slicing.

There are two forms of []. The single item format is sequence [ index ]. Items are numbered from 0. The slice format is sequence [ start : end ]. Items from start to end - 1 are chosen to create a new sequence; it will be a slice of the original sequence. There will be end - start items in the resulting sequence.

Positions can be numbered from the end of the sequence as well as the beginning. Position -1 is the last item of the sequence, -2 is the next-to-last item. Here's how it works: each item has a positive number position that identifies the item in the sequence. We'll also show the negative position numbers for each item in the sequence. For this example, we're looking at a four-element sequence like the tuple (3.14159,"two words",2048,(1+2j)).


Chapter 12. Sequences: Strings, Tuples and Lists

Building Skills in Python, Release 2.6.5

forward position    0          1              2       3
reverse position    -4         -3             -2      -1
item                3.14159    "two words"    2048    (1+2j)

Why do we have two different ways of identifying each position in the sequence? If you want, you can think of it as a handy short-hand. The last item in any sequence, S, can be identified by the formula S[ len(S)-1 ]. For example, if we have a sequence with 4 elements, the last item is in position 3. Rather than write S[ len(S)-1 ], Python lets us simplify this to S[-1]. You can see how this works with the following example.
>>> a=(3.14159,"two words",2048,(1+2j))
>>> a[0]
3.1415899999999999
>>> a[-3]
'two words'
>>> a[2]
2048
>>> a[-1]
(1+2j)

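The two numbering schemes and the slice-length rule can be checked directly. The following is a small sketch using the same four-element tuple; the names last_long, last_short and piece are only illustrative.

```python
a = (3.14159, "two words", 2048, (1+2j))

last_long = a[len(a) - 1]     # explicit forward position of the last item
last_short = a[-1]            # the negative-position shorthand

start, end = 1, 3
piece = a[start:end]          # items at positions 1 and 2; position 3 is excluded
```

Both expressions for the last item select the same object, and the slice has end - start items.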
Built-in Functions. len(), max() and min() apply to all varieties of sequences. We'll provide the definitions here and refer to them in various class definitions.

len(sized_collection) Return the number of items of the collection. This can be any kind of sized collection. All sequences and mappings are subclasses of collections.Sized and provide a length. Here are some examples.
>>> len("Wednesday")
9
>>> len( (1,1,2,3) )
4

max(iterable_collection) Returns the largest value in the iterable collection. All sequences and mappings are subclasses of collections.Iterable; the max() function can iterate over elements and locate the largest.
>>> max( (1,2,3) )
3
>>> max('abstractly')
'y'

Note that max() can also work with a number of individual arguments instead of a single iterable collection argument value. We looked at this in Collection Functions.

min(iterable_collection) Returns the smallest value in the iterable collection. All sequences and mappings are subclasses of collections.Iterable; the min() function can iterate over elements and locate the smallest.
>>> min( (10,11,2) )
2
>>> min( ('10','11','2') )
'10'

Note that strings are compared alphabetically. The min() and max() functions can't determine that these are supposed to be evaluated as numbers.
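When strings of digits really should be ranked as numbers, one option is the key parameter that min() and max() accept (available since Python 2.5). This is a sketch of that approach, not something the text above prescribes; the name labels is made up for the example.

```python
labels = ('10', '11', '2')

smallest_text = min(labels)              # compared character by character
largest_text = max(labels)

smallest_number = min(labels, key=int)   # each item is ranked by int(item)
largest_number = max(labels, key=int)
```

The items returned are still strings; only the ranking uses the integer value.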


sum(iterable_collection, [start=0] ) Return the sum of the items in the iterable collection. All sequences and mappings are subclasses of collections.Iterable. If start is provided, this is the initial value for the sum, otherwise 0 is used. If the values being summed are not all numeric values, this will raise a TypeError exception.
>>> sum( (1,1,2,3,5,8) )
20
>>> sum( (), 3 )
3
>>> sum( (1,2,'not good') )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'

any(iterable_collection) Return True if there exists an item in the iterable collection which is True. All sequences and mappings are subclasses of collections.Iterable.

all(iterable_collection) Return True if all items in the iterable collection are True. All sequences and mappings are subclasses of collections.Iterable.

enumerate(iterable_collection) Iterates through the iterable collection returning 2-tuples of ( index, item ).
>>> for position, item in enumerate( ('word',3.1415629,(2+3j) ) ):
...     print position, item
...
0 word
1 3.1415629
2 (2+3j)

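The any() and all() functions described above have no example of their own, so here is a brief sketch. The scores tuple and the passing mark of 60 are invented for the illustration; the generator expressions work in Python 2.4 and later.

```python
scores = (72, 88, 95)

any_failing = any(s < 60 for s in scores)    # is there at least one low score?
all_passing = all(s >= 60 for s in scores)   # does every score meet the mark?

# The empty collection is the edge case worth remembering.
empty_any = any(())    # nothing in it is true
empty_all = all(())    # nothing in it is false
```

For an empty collection, any() is False and all() is True.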
sorted(sequence, [key=None], [reverse=False] ) This returns a new list that contains the elements of the iterable container in ascending order. If the reverse keyword parameter is provided and set to True, the list is in descending order. The key parameter is used when the items in the container aren't simply sorted using the default comparison operators. The key function must return the fields to be compared, selected from the underlying objects in the container. We'll look at this in detail in Functional Programming with Collections.

reversed(sequence) This returns an iterator that steps through the elements in the iterable container in reverse order.
>>> tuple( reversed( (9,1,8,2,7,3) ) )
(3, 7, 2, 8, 1, 9)

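A short sketch of the key and reverse parameters of sorted() may help. The list of (name, count) pairs is invented for the example, and the lambda is just one way to write a key function.

```python
pairs = [("plum", 3), ("apple", 7), ("fig", 5)]

by_name = sorted(pairs)                                 # default: item-by-item tuple comparison
by_count = sorted(pairs, key=lambda pair: pair[1])      # compare on the second field only
by_count_desc = sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

In each case sorted() builds a new list; the original pairs list keeps its order.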
Comparisons. The standard comparisons (<, <=, >, >=, ==, !=) apply to sequences. These all work by doing item-by-item comparison within the two sequences. The item-by-item rule results in strings being sorted alphabetically, and tuples and lists sorted in a way that is similar to strings.

There are two additional comparisons: in and not in. These check to see if a single value occurs in the sequence. The in operator returns True if the item is found, False if the item is not found. The not in operator returns True if the item is not found in the sequence.
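The item-by-item rule can be seen with a few small sequences. This sketch uses invented values; the results follow from comparing items left to right until a difference is found.

```python
first_difference = (1, 2, 3) < (1, 2, 4)    # decided by the third items: 3 < 4
prefix_is_smaller = (1, 2) < (1, 2, 0)      # a sequence sorts before any longer one it begins
strings_too = "apple" < "apricot"           # decided at the third character: 'p' < 'r'
lists_too = [2] > [1, 99]                   # decided by the first items: 2 > 1

a = (3.14159, "two words", 2048, (1+2j))
found = 2048 in a
absent = "missing" not in a
```

Every one of these names is bound to True.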


Methods. The string and list classes have method functions that operate on the object's value. For instance "abc".upper() executes the upper() method belonging to the string literal "abc". The result is a new string, 'ABC'. The exact dictionary of methods is unique to each class of sequences.

Statements. The tuple and list classes are central to certain Python statements, like the assignment statement and the for statement. These were details that we skipped over in The Assignment Statement and Iterative Processing: For All and There Exists.

Modules. There is a string module with several string-specific functions. Most of these functions are now member functions of the string type. Additionally, this module has a number of constants to define various subsets of the ASCII character set, including digits, printable characters, whitespace characters and others.

Factory Functions. There are also built-in factory (or conversion) functions for the sequence objects. We've looked at some of these already, when we looked at str() and repr().

12.3 Exercises
1. Tuples and Lists. What is the value in having both immutable sequences (tuple) and mutable sequences (list)? What are the circumstances under which you would want to change a string? What are the problems associated with a string that grows in length? How can storage for variable length strings be managed?

2. Unicode Strings. What is the value in making a distinction between Unicode strings and ASCII strings? Does it improve performance to restrict a string to single-byte characters? Should all strings simply be Unicode strings to make programs simpler? How should file reading and writing be handled?

3. Statements and Data Structures. In order to introduce the for statement in Iterative Processing: For All and There Exists, we had to dance around the sequence issue. Would it make more sense to introduce the various sequence data structures first, and then describe statements that process the data structure later? Something has to be covered first, and is therefore more fundamental. Is the processing statement more fundamental to programming, or is the data structure?

12.4 Style Notes


Try to avoid extraneous spaces in list and tuple displays. Python programs should be relatively compact. Prose writing typically keeps parentheses close to their contents, and puts spaces after commas, never before them. This should hold true for Python, also. The preferred formatting for a list or tuple, then, is [1, 2, 3] or (1, 2, 3). Spaces are not put after the initial [ or (. Spaces are not put before a comma.


CHAPTER

THIRTEEN

STRINGS
We'll look at the two string classes from a number of viewpoints: semantics, literal values, operations, comparison operators, built-in functions, methods and modules. Additionally, we have a digression on the immutability of string objects.

13.1 String Semantics


A String (the formal class name is str) is an immutable sequence of ASCII characters. A Unicode String (unicode) is an immutable sequence of Unicode characters.

Since a string (either str or unicode) is a sequence, all of the common operations on sequences apply. We can concatenate string objects together and select characters from a string. When we select a slice from a string, we've extracted a substring. An individual character is simply a string of length one.

Important: Python 3.0

The Python 2 str class, which is limited to single-byte ASCII characters, does two separate things: it represents text as well as a collection of bytes. The text features of str gain the features from the Unicode String class, unicode. The new str class will represent strings of text, irrespective of the underlying encoding. It can be ASCII, UTF-8, UTF-16 or any other encoding. The array of bytes features of the Python 2 str class will be moved into a new class, bytes. This new class will implement simple sequences of bytes and will support conversion between bytes and strings using encoding and decoding functions.

13.2 String Literal Values


A str is a sequence of ASCII characters. The literal value for a str is written by surrounding the value with quotes or apostrophes. There are several variations to provide some additional features.

Basic String. Strings are enclosed in matching quotes (") or apostrophes ('). A string enclosed in quotes (") can contain apostrophes ('); similarly, a string enclosed in apostrophes (') can contain quotes ("). A basic str must be completed on a single line, or continued with a \ as the very last character of a line. Examples:

"consultive"
'syncopated'
"don't do that"
'"Okay," he said.'

Multi-Line String. Also called Triple-Quoted String. A multi-line str is enclosed in triple quotes (""") or triple apostrophes ('''). It continues on across line boundaries until the concluding triple-quote or triple-apostrophe. Examples:
"""A very
long string"""
'''SELECT *
FROM THIS, THAT
WHERE THIS.KEY = THAT.FK
AND THIS.CODE = 'Active'
'''

Unicode String. A Unicode String uses the above quoting rules, but prefaces the quote with (u"), (u'), (u""") or (u'''). Unicode is the Universal Character Set; each character requires from 1 to 4 bytes of storage. ASCII is a single-byte character set; each of the 128 ASCII characters requires a single byte of storage. Unicode permits any character in any of the languages in common use around the world. A special \uxxxx escape sequence is used for Unicode characters that don't happen to occur on your ASCII keyboard. Examples:
u'\u65e5\u672c'
u"All ASCII"

Raw String. A Raw String uses the above quoting rules, but prefaces the quote with (r"), (r'), (r""") or (r'''). The backslash characters (\) are not interpreted as escapes by Python, but are left as is. This is handy for Windows file names that contain \. It is also handy for regular expressions that make extensive use of backslashes. Examples:
newline_literal= r'\n'
filename= r"C:\mumbo\jumbo"
pattern= r"(\*\S+\*)"

The newline_literal is a two-character string, not the newline character.

Outside of raw strings, non-printing characters and Unicode characters that aren't found on your keyboard are created using escapes. A table of escapes is provided below. These are Python representations for unprintable ASCII characters. They're called escapes because the \ is an escape from the usual meaning of the following character.
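The effect of the r prefix is easiest to see by comparing lengths. This is a small sketch with illustrative names; it behaves the same way in Python 2 and Python 3.

```python
newline = "\n"                     # one character: the ASCII linefeed
newline_literal = r"\n"            # two characters: a backslash and the letter n

raw_path = r"C:\mumbo\jumbo"       # backslashes are left exactly as typed
escaped_path = "C:\\mumbo\\jumbo"  # the same text, written with \\ escapes
```

The raw and escaped spellings of the path produce equal strings; the raw form is simply easier to read.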


Escape    Meaning
\\        Backslash (\)
\'        Apostrophe (')
\"        Quote (")
\a        Audible Signal; the ASCII code called BEL. Some OSs translate this to a screen flash or ignore it completely.
\b        Backspace (ASCII BS)
\f        Formfeed (ASCII FF). On a paper-based printer, this would move to the top of the next page.
\n        Linefeed (ASCII LF), also known as newline. This would move the paper up one line.
\r        Carriage Return (ASCII CR). On a paper-based printer, this returned the print carriage to the start of the line.
\t        Horizontal Tab (ASCII TAB)
\ooo      An ASCII character with the given octal value. The ooo is one to three octal digits.
\xhh      An ASCII character with the given hexadecimal value. The x is required. The hh is two hexadecimal digits.

Adjacent Strings. Note that adjacent string objects are automatically concatenated to make a single string. "ab" "cd" "ef" is the same as "abcdef". The most common use for this is the following:
msg = "A very long " \
    "message, which didn't fit on " \
    "one line."

Unicode Characters. For Unicode, a special \uxxxx escape is provided. This requires the four-digit Unicode character identification. For example, 日本 is made up of Unicode characters U+65e5 and U+672c. In Python, we write this string as u'\u65e5\u672c'. There are a variety of encoding schemes, for example, UTF-8, UTF-16 and LATIN-1. The codecs module provides mechanisms for encoding and decoding Unicode Strings.

13.3 String Operations


There are a number of operations on str objects, operations which create strs and operations which create other objects from strs. There are three operations (+, *, [ ]) that work with all sequences (including strs) and a unique operation, %, that can be performed only with str objects.

The + operator creates a new string as the concatenation of the arguments.
>>> "hi " + 'mom'
'hi mom'

The * operator between str and numbers (number * str or str * number) creates a new str that is a number of repetitions of the input str.
>>> print 3*"cool!"
cool!cool!cool!


The [ ] operator can extract a single character or a slice from the string. There are two forms: the single-item form and the slice form.

The single item format is string [ index ]. Characters are numbered from 0 to len(string)-1. Characters are also numbered in reverse from -len(string) to -1.

The slice format is string [ start : end ]. Characters from start to end-1 are chosen to create a new str as a slice of the original str; there will be end - start characters in the resulting str. If start is omitted, the slice begins at the beginning of the string (position 0). If end is omitted, the slice runs through the last character of the string. Yes, you can omit both (someString[:]) to make a copy of a string.
>>> s="adenosine"
>>> s[2]
'e'
>>> s[:5]
'adeno'
>>> s[5:]
'sine'
>>> s[-5:]
'osine'
>>> s[:-5]
'aden'

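A useful consequence of the start-to-end-1 rule is that two slices sharing a boundary never overlap and never leave a gap. The sketch below checks this for one string; the split point n is arbitrary.

```python
s = "adenosine"
n = 5

head, tail = s[:n], s[n:]     # everything before position n, everything from n on
rejoined = head + tail        # concatenating the two halves restores the original

middle = s[2:6]               # positions 2, 3, 4 and 5
copy = s[:]                   # both bounds omitted
```

Here middle is 'enos', four characters long, which is 6 - 2.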
The String Formatting Operation, %. The % operator is sometimes called string interpolation, since it interpolates literal text and converted values. We prefer to call it string formatting, since that is a more apt description. Much of the formatting is taken straight from the C library's printf() function.

This operator has three forms. You can use % with a str and a single value, a str and a tuple, or a str and a dict. We'll cover tuple and dict in detail later.

The string on the left-hand side of % contains a mixture of literal text plus conversion specifications. A conversion specification begins with %. For example, integers are converted with %i. Each conversion specification will use a corresponding value from the tuple. The first conversion uses the first value of the tuple, the second conversion uses the second value from the tuple. For example:
import random
# randrange() excludes its upper bound, so (1,7) yields the six faces 1 to 6.
d1, d2 = random.randrange(1,7), random.randrange(1,7)
r= "die 1 shows %i, and die 2 shows %i" % ( d1, d2 )

The first %i will convert the value for d1 to a string and insert the value, the second %i will convert the value for d2 to a string. The % operator returns the new string based on the format, with each conversion specification replaced with the appropriate values.

Conversion Specifications. Each conversion specification has from one to four elements following the %, in this pattern:
%[ flags ][ width ][ .precision ] code

The % and the final code in each conversion specification are required. The other elements are optional.

The optional flags element can have any combination of the following values:

- Left adjust the converted value in a field that has a length given by the width element. The default is right adjustment.


+ Show positive signs (sign will be + or -). The default is to show negative signs only.

(a space) Show positive signs with a space (sign will be a space or -). The default is negative signs only.

# Use the Python literal rules (0 for octal, 0x for hexadecimal, etc.). The default is decoration-free notation.

0 Zero-fill the field that has a length given by the width element. The default is to space-fill the field. This doesn't make a lot of sense with the - (left-adjust) flag.

The optional width element is a number that specifies the total number of characters for the field, including signs and decimal points. If omitted, the width is just big enough to hold the output number. If a * is used instead of a number, an item from the tuple of values is used as the width of the field. For example, "%*i" % ( 3, d1 ) uses the value 3 from the tuple as the field width and d1 as the value to convert to a string.

The optional precision element (which must be preceded by a dot, ., if it is present) has a few different purposes. For numeric conversions, this is the number of digits to the right of the decimal point. For string conversions, this is the maximum number of characters to be printed; longer strings will be truncated. If a * is used instead of a number, an item from the tuple of values is used as the precision of the conversion. For example, "%*.*f" % ( 6, 2, avg ) uses the value 6 from the tuple as the field width, the value 2 from the tuple as the precision and avg as the value.

The standard conversion rules also permit a long or short indicator: l or h. These are tolerated by Python so that these formats will be compatible with C, but they have no effect. They reflect internal representation considerations for C programming, not external formatting of the data.

The required one-letter code element specifies the conversion to perform. The codes are listed below.

% Not a conversion, this creates a % in the resulting str. Use %% to put a % in the output str.

c Convert a single-character str. This will also convert an integer value to the corresponding ASCII character. For example, "%c" % ( 65, ) results in "A".

s Convert a str. This will convert non-str objects by implicitly calling the str() function.

r Call the repr() function, and insert that value.

i d Convert a numeric value, showing ordinary decimal output. The code i stands for integer, d stands for decimal. They mean the same thing; but it's hard to reach a consensus on which is correct.

u Convert an unsigned number. While relevant to C programming, this is the same as the i or d format conversion.

o Convert a numeric value, showing the octal representation. %#o gets the Python-style value with a leading zero. This is similar to the oct() function.

x X Convert a numeric value, showing the hexadecimal representation. %#X gets the Python-style value with a leading 0X; %#x gets the Python-style value with a leading 0x. This is similar to the hex() function.

e E Convert a numeric value, showing scientific notation. %e produces d.ddde±xx, %E produces d.dddE±xx.

f F Convert a numeric value, using ordinary decimal notation. In case the number is gigantic, this will switch to %g or %G notation.

g G Generic floating-point conversion. For values with an exponent of -4 or larger, and smaller than the precision element, the %f format will be used. For values with an exponent smaller than -4, or an exponent not smaller than the precision element, the %e or %E format will be used.

Here are some examples.
"%i: %i win, %i loss, %6.3f " % (count,win,loss,float(win)/loss)


This example does four conversions: three simple integer conversions and one floating-point conversion that provides a width of 6 and 3 digits of precision. The expected format is -0.000: a sign, one digit, the decimal point and three more digits fill the six positions. The rest of the string is literally included in the output.
"Spin %3i: %2i, %s" % (spin,number,color)

This example does three conversions: one number is converted into a field with a width of 3, another converted with a width of 2, and a string is converted, using as much space as the string requires.
>>> a=6.02E23
>>> "%e" % a
'6.020000e+23'
>>> "%E" % a
'6.020000E+23'

This example shows simple conversion of a floating-point number to the default scientific notation, which has a width of 12 and a precision of 6.
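The flags, width and precision elements are easier to remember after seeing each one in isolation. The following sketch applies one element at a time; the value avg and the dictionary keys are made up for the illustration. The octal %#o form is left out because its output differs between Python 2 and Python 3.

```python
avg = 3.14159

right = "%6.2f" % avg               # width 6, precision 2, right-adjusted
left = "%-6.2f|" % avg              # the - flag left-adjusts within the field
signed = "%+d" % 7                  # the + flag shows the sign of positive values
zero_filled = "%05d" % 42           # the 0 flag pads with zeros instead of spaces
from_tuple = "%*.*f" % (6, 2, avg)  # width and precision taken from the tuple
hexadecimal = "%#x" % 255           # the # flag adds the 0x prefix
truncated = "%.3s" % "abcdef"       # precision limits the length of a string
percent = "%d%%" % 50               # %% produces a literal percent sign
character = "%c" % 65               # an integer becomes the matching character

# The dict form names each value instead of relying on position.
named = "%(name)s has %(count)d items" % {"name": "list", "count": 3}
```

For instance, right and from_tuple are both '  3.14', while left is '3.14  |'.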

13.4 String Comparison Operations


The standard comparisons (<, <=, >, >=, ==, !=) apply to str objects. These comparisons use the standard character-by-character comparison rules for ASCII or Unicode.

There are two additional comparisons: in and not in. These check to see if a substring occurs in a longer string. The in operator returns True when the substring is found, False if the substring is not found. The not in operator returns True if the substring is not found.
>>> 'a' in 'xyzzyabcxyzzy'
True
>>> 'abc' in 'xyzzyabc'
True

Don't be fooled by the fact that string representations of integers don't seem to sort properly. String comparison does not magically recognize that the strings are representations of numbers. It's simple alphabetical order rules applied to digits.
>>> '100' < '25'
True

This is true because '1' < '2'.
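Two common ways around this are to convert to numbers before comparing, or to pad the strings to a common width so that alphabetical order and numeric order agree. This is only a sketch of those two options; the codes list is invented, and zfill() is a str method that is not otherwise covered in this chapter.

```python
codes = ['100', '25', '9']

as_text = '100' < '25'                   # alphabetical: '1' comes before '2'
as_numbers = int('100') < int('25')      # numeric: 100 is not less than 25

text_order = sorted(codes)               # alphabetical order of the digits
number_order = sorted(codes, key=int)    # ranked by integer value

padded = [c.zfill(3) for c in codes]     # left-pad with zeros to three characters
padded_order = sorted(padded)            # now text order matches numeric order
```

Padding only works when every value fits in the chosen width; converting with int() has no such limit.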

13.5 String Statements


The for statement will step through all elements of a sequence. In the case of a string, it will step through each character of the string. For example:
for letter in "forestland":
    print letter

This will print each letter of the given string.


13.6 String Built-in Functions


The following built-in functions are relevant to str manipulation.

chr(i) Return a str of one character with ordinal i. Note that i must satisfy 0 <= i < 256; the ordinals below 128 are the proper ASCII characters.

unichr(u) Return a Unicode String (unicode) of one character with ordinal u. 0 <= u < 65536.

ord(c) Return the integer ordinal of a one-character str. This works for any character, including Unicode characters.

unicode(string, [encoding], [errors] ) Creates a new Unicode object from the given encoded string. encoding defaults to the current default string encoding. errors defines the error handling, defaults to strict.

The unicode() function decodes the string from the named external representation. Unless it has been changed, the default string encoding in Python 2 is ASCII, and the default error handling is strict. Choices for errors are strict, replace and ignore. Strict raises an exception for unrecognized characters, replace substitutes the Unicode replacement character ( \uFFFD ) and ignore skips over invalid characters. The codecs and unicodedata modules provide more functions for working with Unicode.
>>> unicode("hi mom","UTF-16")
u'\u6968\u6d20\u6d6f'
>>> unicode("hi mom","UTF-8")
u'hi mom'

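The relationship between characters, ordinals and encoded bytes can be sketched without unichr() or unicode(), so that the same lines should run under Python 2.6 and under Python 3.3 or later (where the u prefix is accepted again). The names below are illustrative only.

```python
code = ord("A")                    # character to ordinal
letter = chr(code)                 # ordinal back to character
next_letter = chr(ord("a") + 1)    # simple arithmetic on ordinals

text = u"\u65e5\u672c"             # two Unicode characters
encoded = text.encode("utf-8")     # text to bytes, using an explicit encoding
decoded = encoded.decode("utf-8")  # bytes back to text
```

The two characters become six bytes in UTF-8, three per character, and decoding restores the original text.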
Important: Python 3

The ord(), chr(), unichr() and unicode() functions will be simplified in Python 3. Python 3 no longer separates ASCII from Unicode strings. These functions will all implicitly work with Unicode strings. Note that the UTF-8 encoding of Unicode overlaps with ASCII, so this simplification to use Unicode will not significantly disrupt programs that work with ASCII files.

Several important functions were defined earlier in String Conversion Functions.

repr(). Returns a canonical string representation of the object. For most object types, eval(repr(object)) == object.

For simple numeric types, the result of repr() isn't very interesting. For more complex types, however, it often reveals details of their structure.
>>> a="""a very 
... long string 
... in multiple lines
... """
>>> repr(a)
"'a very \\nlong string \\nin multiple lines\\n'"

This representation shows the newline characters ( \n ) embedded within the triple-quoted string. Important: Python 3 The reverse quotes (`a`) work like repr(a). The reverse quote syntax is rarely used, and will be dropped in Python 3.


str(). Return a nice string representation of the object. If the argument is a string, the return value is the same object.
>>> a= str(355.0/113.0)
>>> a
'3.14159292035'
>>> len(a)
13

Some other functions apply to strings as well as other sequence objects.

len(). For strings, this function returns the number of characters.
>>> len("abcdefg")
7
>>> len(r"\n")
2
>>> len("\n")
1

max(). For strings, this function returns the maximum character.

min(). For strings, this function returns the minimum character.

sorted(). Iterate through the string's characters in sorted order. This expands the string into an explicit list of individual characters.
>>> sorted( "malapertly" )
['a', 'a', 'e', 'l', 'l', 'm', 'p', 'r', 't', 'y']
>>> "".join( sorted( "malapertly" ) )
'aaellmprty'

reversed(). Iterate through the string's characters in reverse order. This creates an iterator. The iterator can be used with a variety of functions or statements.
>>> reversed( "malapertly" )
<reversed object at 0x600230>
>>> "".join( reversed( "malapertly" ) )
'yltrepalam'

13.7 String Methods


A string object has a number of method functions. These can be grouped arbitrarily into transformations, which create new strings from old, and information, which returns a fact about a string.

The following string transformation functions create a new string object from an existing string.

capitalize() Create a copy of the string with only its first character capitalized.

center(width) Create a copy of the string centered in a string of length width. Padding is done using spaces.

encode(encoding, [errors] ) Return an encoded version of string. Default encoding is the current default string encoding. errors may be given to set a different error handling scheme. Default is strict, meaning that encoding errors raise a ValueError. Other possible values are ignore and replace.


expandtabs([tabsize] ) Return a copy of string where all tab characters are expanded using spaces. If tabsize is not given, a tab size of 8 characters is assumed.

join(sequence) Return a string which is the concatenation of the strings in the sequence. Each separator between elements is a copy of the given string object.

ljust(width) Return a copy of string left justified in a string of length width. Padding is done using spaces.

lower() Return a copy of string converted to lowercase.

lstrip() Return a copy of string with leading whitespace removed.

replace(old, new, [count] ) Return a copy of string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.

rjust(width) Return a copy of string right justified in a string of length width. Padding is done using spaces.

rstrip() Return a copy of string with trailing whitespace removed.

strip() Return a copy of string with leading and trailing whitespace removed.

swapcase() Return a copy of string with uppercase characters converted to lowercase and vice versa.

title() Return a copy of string with words starting with uppercase characters, all remaining characters in lowercase.

translate(table, [deletechars] ) Return a copy of the string, where all characters occurring in the optional argument deletechars are removed, and the remaining characters have been mapped through the given translation table. The table must be a string of length 256, providing a translation for each 1-byte ASCII character. The translation tables are built using the string.maketrans() function in the string module.

upper() Return a copy of string converted to uppercase.

The following accessor methods provide information about a string.

count(sub, [start], [end] ) Return the number of occurrences of substring sub in string. If start or end are present, these have the same meanings as a slice string[start:end].

endswith(suffix, [start], [end] ) Return True if string ends with the specified suffix, otherwise return False. The suffix can be a single string or a tuple of individual strings. If start or end are present, these have the same meanings as a slice string[start:end].

find(sub, [start], [end] ) Return the lowest index in string where substring sub is found. Return -1 if the substring is not found. If start or end are present, these have the same meanings as a slice string[start:end].


index(sub, [start], [end] ) Return the lowest index in string where substring sub is found. Raise ValueError if the substring is not found. If start or end are present, these have the same meanings as a slice string[start:end].

isalnum() Return True if all characters in string are alphanumeric and there is at least one character in string; False otherwise.

isalpha() Return True if all characters in string are alphabetic and there is at least one character in string; False otherwise.

isdigit() Return True if all characters in string are digits and there is at least one character in string; False otherwise.

islower() Return True if all cased characters in string are lowercase and there is at least one cased character in string; False otherwise.

isspace() Return True if all characters in string are whitespace and there is at least one character in string; False otherwise.

istitle() Return True if string is titlecased: uppercase characters may only follow uncased characters (whitespace, punctuation, etc.) and lowercase characters may only follow cased ones. Return False otherwise.

isupper() Return True if all cased characters in string are uppercase and there is at least one cased character in string; False otherwise.

rfind(sub, [start], [end] ) Return the highest index in string where substring sub is found. Return -1 if the substring is not found. If start or end are present, these have the same meanings as a slice string[start:end].

rindex(sub, [start], [end] ) Return the highest index in string where substring sub is found. Raise ValueError if the substring is not found. If start or end are present, these have the same meanings as a slice string[start:end].

startswith(prefix, [start], [end] ) Return True if string starts with the specified prefix, otherwise return False. The prefix can be a single string or a tuple of individual strings. If start or end are present, these have the same meanings as a slice string[start:end].

The following methods create another kind of object, usually a sequence, from a string.

partition(separator) Return three values: the text prior to the first occurrence of separator in string, the separator itself, and the text after the first occurrence of the separator. If the separator doesn't occur, all of the input string is in the first element of the 3-tuple; the other two elements are empty strings.

split(separator, [maxsplit] ) Return a list of the words in the string, using separator as the delimiter. If maxsplit is given, at most maxsplit splits are done. If separator is not specified, any whitespace character is a separator.

splitlines([keepends] ) Return a list of the lines in string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and set to True.


13.8 String Modules


There is an older module named string. Almost all of the functions in this module are directly available as methods of the string type. The one remaining function of value is the maketrans() function, which creates a translation table to be used by the translate() method of a string.

maketrans(from, to) Return a translation table (a string 256 characters long) suitable for use in str.translate(). The from and to parameters must be strings of the same length. The table will assure that each character in from is mapped to the character in the same position in to.

The following example shows how to make and then apply a translation table.
>>> import string
>>> t= string.maketrans("aeiou","xxxxx")
>>> phrase= "now is the time for all good men to come to the aid of their party"
>>> phrase.translate( t )
'nxw xs thx txmx fxr xll gxxd mxn tx cxmx tx thx xxd xf thxxr pxrty'

The codecs module takes a different approach and has a number of built-in translations.

More importantly, the string module contains a number of definitions of the characters in the ASCII character set. These definitions serve as a central, formal repository for facts about the character set. Note that there are general definitions, applicable to Unicode character sets, that are different from the ASCII definitions.

ascii_letters
    The set of all letters, essentially a union of ascii_lowercase and ascii_uppercase.

ascii_lowercase
    The lowercase letters in the ASCII character set: 'abcdefghijklmnopqrstuvwxyz'

ascii_uppercase
    The uppercase letters in the ASCII character set: 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

digits
    The digits used to make decimal numbers: '0123456789'

hexdigits
    The digits used to make hexadecimal numbers: '0123456789abcdefABCDEF'

letters
    This is the set of all letters, a union of lowercase and uppercase, which depends on the setting of the locale on your system.

lowercase
    This is the set of lowercase letters, and depends on the setting of the locale on your system.

octdigits
    The digits used to make octal numbers: '01234567'

printable
    All printable characters in the character set. This is a union of digits, letters, punctuation and whitespace.

punctuation
    All punctuation in the ASCII character set; this is !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

uppercase
    This is the set of uppercase letters, and depends on the setting of the locale on your system.

whitespace
    A collection of characters that cause spacing to happen. For ASCII this is '\t\n\x0b\x0c\r '
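One way to use these definitions is to test membership with the in operator. The sketch below counts the kinds of characters in a string; the function name classify and the sample text are assumptions made for this illustration. It relies only on the ASCII definitions (ascii_letters, digits, punctuation, whitespace), which do not depend on the locale.

```python
import string

def classify(text):
    """Count letters, digits, punctuation and whitespace in text."""
    counts = {"letters": 0, "digits": 0, "punctuation": 0, "whitespace": 0}
    for ch in text:
        if ch in string.ascii_letters:
            counts["letters"] += 1
        elif ch in string.digits:
            counts["digits"] += 1
        elif ch in string.punctuation:
            counts["punctuation"] += 1
        elif ch in string.whitespace:
            counts["whitespace"] += 1
    return counts

summary = classify("Room 101, now!")
```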


13.9 String Exercises


1. Check Amount Writing. Translate a number into the English phrase.

This example algorithm fragment is only to get you started. It shows how to pick off the digits from the right end of a number and assemble a resulting string from the left end of the string. Note that the right-most two digits have special names, requiring some additional cases above and beyond the simplistic loop shown below. For example, 291 is "two hundred ninety one", where 29 is "twenty nine". The word for 2 changes, depending on the context.

As a practical matter, you should analyze the number by taking off three digits at a time; the expression (number % 1000) does this. You would then format the three digit number with words like "million", "thousand", etc.

English Words For An Amount, n


(a) Initialization. Set result ← "" (the empty string). Set tc ← 0. This is the "tens counter" that shows what position we're examining.

(b) Loop. While n > 0.

    i. Get Right Digit. Set digit ← n % 10, the remainder when divided by 10.

    ii. Make Phrase. Translate digit to a string from "zero" to "nine". Translate tc to a string from "" to "thousand". This is tricky because the "teens" are special, where the "hundreds" and "thousands" are pretty simple.

    iii. Assemble Result. Prepend the digit string and the tc string to the left end of the result string.

    iv. Next Digit. Set n ← n ÷ 10. Be sure to use the // integer division operator, or you'll get floating-point results. Increment tc by 1.

(c) Result. Return result as the English translation of n.

2. Roman Numerals. This is similar to translating numbers to English. Instead we will translate them to Roman Numerals.

The algorithm is similar to Check Amount Writing (above). You will pick off successive digits, using % 10 and // 10 to gather the digits from right to left.

The rules for Roman Numerals involve using three pairs of symbols for ones and fives, tens and fifties, hundreds and five hundreds. An additional symbol for thousands covers all the relevant bases.

When a number is followed by the same or smaller number, it means addition. II is two 1's = 2. VI is 5 + 1 = 6.

When one number is followed by a larger number, it means subtraction. IX is 1 before 10 = 9. IIX isn't allowed; this would be VIII.

For numbers from 1 to 9, the symbols are I and V, and the coding works like this.

(a) I
(b) II
(c) III
(d) IV
(e) V
(f) VI
(g) VII
(h) VIII
(i) IX

The same rules work for numbers from 10 to 90, using X and L. For numbers from 100 to 900, using the symbols C and D. For numbers between 1000 and 4000, using M.

Here are some examples. 1994 = MCMXCIV, 1956 = MCMLVI, 3888 = MMMDCCCLXXXVIII

3. Word Lengths. Analyze the following block of text. You'll want to break it into words on whitespace boundaries. Then you'll need to discard all punctuation from before, after or within a word. What's left will be a sequence of words composed of ASCII letters. Compute the length of each word, and produce the sequence of digits (no word is 10 or more letters long). Compare the sequence of word lengths with the value of math.pi.
Poe, E.
Near a Raven

Midnights so dreary, tired and weary,
Silently pondering volumes extolling all by-now obsolete lore.
During my rather long nap - the weirdest tap!
An ominous vibrating sound disturbing my chamber's antedoor.
"This", I whispered quietly, "I ignore".

This is based on http://www.cadaeic.net/cadenza.htm.
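The first two exercises share one mechanical step: picking digits off the right end of a number with % 10 and // 10. The following is a minimal sketch of that step only, leaving the translation to words or numerals as the exercise; the function name digits_right_to_left is invented for this illustration.

```python
def digits_right_to_left(n):
    """Return the decimal digits of n, least significant digit first."""
    digits = []
    while n > 0:
        digits.append(n % 10)   # the right-most digit
        n = n // 10             # integer division drops that digit
    return digits

ones_tens_hundreds = digits_right_to_left(291)
```

The position of each digit in the resulting list plays the role of the tens counter, tc, in the algorithm above.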

13.10 Digression on Immutability of Strings


In Strings and Tuples we noted that string and tuple objects are immutable. They cannot be changed once they are created. Programmers experienced in other languages sometimes find this to be an odd restriction.

Two common questions that arise are how to expand a string and how to remove characters from a string.

Generally, we don't expand or contract a string; we create a new string that is the concatenation of the original string objects. For example:
>>> a="abc"
>>> a=a+"def"
>>> a
'abcdef'
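To make the point concrete, the sketch below shows that concatenation produces a new object, that item assignment on a string is refused, and that "removing" characters is done by assembling a new string from slices. The variable names are chosen only for this example.

```python
original = "abc"
a = original
a = a + "def"                 # builds a new string; original is untouched
same_object = a is original   # False: a now names a different object

# Attempting to change a character in place raises TypeError.
try:
    original[0] = "X"
    assigned = True
except TypeError:
    assigned = False

# "Removing" characters means building a new string from the pieces to keep.
shorter = a[:2] + a[4:]
```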

In effect, Python gives us string objects of arbitrary size. It does this by dynamically creating a new string instead of modifying an existing string.

Some programmers who have extensive experience in other languages will ask if creating a new string from the original string is the most efficient way to accomplish this. Or they suggest that it would be simpler to allow a mutable string for this kind of concatenation. The short answer is that Python's storage management makes this use of immutable strings the simplest and most efficient.

Responses to the immutability of tuple and mutability of list vary, including some of the following frequently asked questions.

Since a list does everything a tuple does and is mutable, why bother with tuple?

    Immutable tuple objects are more efficient than variable-length list objects for some operations. Once the tuple is created, it can only be examined. When it is no longer referenced, the normal Python garbage collection will release the storage for the tuple.

    Most importantly, a tuple can be reliably hashed to a single value. This makes it a usable key for a mapping.

    Many applications rely on fixed-length tuples. A program that works with coordinate geometry in two dimensions may use 2-tuples to represent (x, y) coordinate pairs. Another example might be a program that works with colors as 3-tuples, (r, g, b), of red, green and blue levels. A variable-length list is not appropriate for these kinds of fixed-length tuple.

Wouldn't it be more efficient to allow mutable strings?

    There are a number of axes for efficiency: the two most common are time and memory use.

    A mutable string could use less memory. However, this is only true in the benign special case where we are only replacing or shrinking the string within a fixed-size buffer. If the string expands beyond the size of the buffer, the program must either crash with an exception, or it must switch to dynamic memory allocation. Python simply uses dynamic memory allocation from the start. C programs often have serious security problems created by attempting to access memory outside of a string buffer. Python avoids this problem by using dynamic allocation of immutable string objects.

    Processing a mutable string could use less time. In the cases of changing a string in place or removing characters from a string, a fixed-length buffer would require somewhat less memory management overhead. Rather than indict Python for offering immutable strings, this leads to some productive thinking about string processing in general.

    In text-intensive applications we may want to avoid creating separate string objects. Instead, we may want to create a single string object (the input buffer) and work with slices of that buffer. Rather than create strings, we can create slice objects that describe starting and ending offsets within the one-and-only input buffer.

    If we then need to manipulate these slices of the input buffer, we can create new string objects only as needed. In this case, our application program is designed for efficiency. We use the Python string objects when we want flexibility and simplicity.
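The idea of describing pieces of one input buffer with slice objects can be sketched as follows. The buffer contents and the ";" delimiter are assumptions made for this illustration; the built-in slice() and str.find() are standard.

```python
# One input buffer; the field layout here is invented for the example.
buffer = "name=Hannah;age=14;height=66"

# Record where each field lives instead of copying it out.
fields = []
start = 0
while start < len(buffer):
    end = buffer.find(";", start)
    if end == -1:               # no more delimiters: field runs to the end
        end = len(buffer)
    fields.append(slice(start, end))
    start = end + 1

# A new string object is created only when a field is actually needed.
second = buffer[fields[1]]
```

Each slice object holds only two offsets, so the list of fields stays small no matter how long the individual fields are.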


CHAPTER

FOURTEEN

TUPLES
We'll look at tuple from a number of viewpoints: semantics, literal values, operations, comparison operators, statements, built-in functions and methods. Additionally, we have a digression on the Σ operator in Digression on The Sigma Operator.

14.1 Tuple Semantics


A tuple is a container for a fixed sequence of data objects. The name comes from the Latin suffix for multiples: double, triple, quadruple, quintuple.

Mathematicians commonly consider ordered pairs; for instance, most analytical geometry is done with Cartesian coordinates (x, y). An ordered pair can be generalized as a 2-tuple.

An essential ingredient here is that a tuple has a fixed and known number of elements. A 3-dimensional point is a 3-tuple. A CMYK color code is a 4-tuple. The size of the tuple can't change without fundamentally redefining the problem we're solving.

A tuple is an immutable sequence of Python objects. Since it is a sequence, all of the common operations to sequences apply. Since it is immutable, it cannot be changed. Two common questions that arise are how to expand a tuple and how to remove objects from a tuple.

When someone asks about changing an element inside a tuple, either adding, removing or updating, we have to remind them that the list, covered in Lists, is for dynamic sequences of elements. A tuple is generally applied when the number of elements is fixed by the nature of the problem.

This tuple processing even pervades the way functions are defined. We can have positional parameters collected into a tuple, something we'll cover in Advanced Parameter Handling For Functions.
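A short sketch may help show what immutability means in practice: assignment to an element is refused, a "changed" tuple is really a new tuple, and (as noted in the digression on immutability) a tuple can serve as a key in a mapping. The names point, moved and labels are invented for this example.

```python
point = (3, 4)

# Updating an element in place is not allowed.
try:
    point[0] = 10
    changed = True
except TypeError:
    changed = False

# Instead, build a new tuple from pieces of the old one.
moved = (10,) + point[1:]

# Because a tuple can be hashed, it can be used as a dictionary key.
labels = {(0, 0): "origin", point: "corner"}
```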

14.2 Tuple Literal Values


A tuple literal is created by surrounding objects with () and separating the items with commas (,). An empty tuple is simply ().

An interesting question is how Python tells an expression from a 1-tuple. A 1-element tuple has a single comma; for example, (1,). An expression lacks the comma: (1). A pleasant consequence of this is that an extra comma at the end of every tuple is legal; for example, (9, 10, 56, ).

Examples:


xy= (2, 3)
personal= ('Hannah',14,5*12+6)
singleton= ("hello",)

xy
    A 2-tuple with integers.

personal
    A 3-tuple with a string and two integers.

singleton
    A 1-tuple with a string. The trailing , ensures that this is a tuple, not an expression.

The elements of a tuple do not have to be the same type. A tuple can be a mixture of any Python data types, including other tuples.
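The difference between a parenthesized expression and a 1-tuple is easy to check with type(). This sketch uses made-up variable names and only built-in features.

```python
expression = (1)          # just the integer 1 in parentheses
singleton = (1,)          # the comma makes this a 1-tuple
trailing = (9, 10, 56, )  # a trailing comma is legal and ignored
empty = ()

kinds = (type(expression).__name__, type(singleton).__name__)
```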

14.3 Tuple Operations


There are three standard sequence operations (+, *, []) that can be performed with tuples as well as other sequences.

The + operator creates a new tuple as the concatenation of the arguments. Here's an example.
>>> ("chapter",8) + ("strings","tuples","lists")
('chapter', 8, 'strings', 'tuples', 'lists')

The * operator between tuple and number (number * tuple or tuple * number) creates a new tuple that is a number of repetitions of the input tuple.
>>> 2*(3,"blind","mice")
(3, 'blind', 'mice', 3, 'blind', 'mice')

The [] operator selects an item or a slice from the tuple. There are two forms: the single-item form and the slice form.

The single item format is tuple [ index ]. Items are numbered from 0 to len(tuple)-1. Items are also numbered in reverse from -len(tuple) to -1.

The slice format is tuple [ start : end ]. Items from start to end-1 are chosen to create a new tuple as a slice of the original tuple; there will be end - start items in the resulting tuple. If start is omitted it is the beginning of the tuple (position 0). If end is omitted it is the end of the tuple (position len(tuple)).

Yes, you can omit both (someTuple[:]) to make a copy of a tuple. This is a shallow copy: the original objects are now members of two distinct tuples.
>>> t=( (2,3), (2,"hi"), (3,"mom"), 2+3j, 6.02E23 )
>>> t[2]
(3, 'mom')
>>> t[:3]
((2, 3), (2, 'hi'), (3, 'mom'))
>>> t[3:]
((2+3j), 6.02e+23)
>>> t[-1]
6.02e+23
>>> t[-3:]
((3, 'mom'), (2+3j), 6.02e+23)
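The indexing rules can be checked directly. The sketch below reuses the tuple from the session above and confirms the size of a slice, the valid index range, and the fact that a slice shares the original objects rather than copying them. The names piece, head and out_of_range are chosen for this illustration.

```python
t = ((2, 3), (2, "hi"), (3, "mom"), 2+3j, 6.02E23)

start, end = 1, 4
piece = t[start:end]          # items at positions 1, 2 and 3

last = t[len(t) - 1]          # the highest valid index is len(t)-1

# Position len(t) is one past the end, so it is not a valid single index.
try:
    t[len(t)]
    out_of_range = False
except IndexError:
    out_of_range = True

# A slice is shallow: it refers to the same item objects.
head = t[:3]
shares_items = head[0] is t[0]
```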


14.4 Tuple Comparison Operations


The standard comparisons (<, <=, >, >=, ==, !=, in, not in) work exactly the same among tuple objects as they do among string and other sequences. The tuple objects are compared element by element. If the corresponding elements are the same type, ordinary comparison rules are used. If the corresponding elements are different types, the type names are compared, since there is almost no other rational basis for comparison.
>>> a = (1, 2, 3, 4, 5)
>>> b = (9, 8, 7, 6, 5)
>>> a < b
True
>>> 3 in a
True
>>> 3 in b
False

Here's a longer example.

redblack.py
#!/usr/bin/env python
import random
n= random.randrange(38)
if n == 0:
    print '0', 'green'
elif n == 37:
    print '00', 'green'
elif n in ( 1,3,5,7,9, 12,14,16,18, 19,21,23,25,27, 30,32,34,36 ):
    print n, 'red'
else:
    print n, 'black'

1. We import random.

2. We create a random number, n, in the range 0 to 37.

3. We check for 0 and 37 as special cases of single and double zero.

4. If the number is in the tuple of red spaces on the roulette layout, this is printed.

5. If none of the other rules are true, the number is in one of the black spaces.
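The same membership test can be wrapped in a function so that it can be checked without printing. This is a sketch under the same rules as redblack.py; the names RED and color_of are assumptions introduced here, not part of the original script.

```python
import random

# The red spaces, as listed in redblack.py.
RED = (1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36)

def color_of(n):
    """Return the color for spin n, where 0 is '0' and 37 stands for '00'."""
    if n in (0, 37):
        return "green"
    elif n in RED:
        return "red"
    else:
        return "black"

spin = random.randrange(38)
result = color_of(spin)

# Count how many of the numbered spaces 1..36 are red.
red_count = len([n for n in range(1, 37) if color_of(n) == "red"])
```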

14.5 Tuple Statements


There are a number of statements that have specific features related to tuple objects.

The Assignment Statement. There is a variation on the assignment statement called a multiple-assignment statement that works nicely with tuples. We looked at this in Multiple Assignment Statement. Multiple variables are set by decomposing the items in the tuple. For example:


>>> x, y = (1, 2)
>>> x
1
>>> y
2

An essential ingredient here is that a tuple has a fixed and known number of elements. For example, a 2-dimensional geometric point might have a tuple with x and y. A 3-dimensional point might be a tuple with x, y, and z.

This works well because the right side of the assignment statement is fully evaluated before the assignments are performed. This allows things like swapping the values in two variables with x,y=y,x.

The for Statement. The for statement will step through all elements of a sequence. For example:
s= 0
for i in ( 1,3,5,7,9, 12,14,16,18, 19,21,23,25,27, 30,32,34,36 ):
    s += i
print "total",s

This will step through each number in the given tuple. There are three built-in functions that will transform a tuple into another sequence. The enumerate(), sorted() and reversed() functions will provide the items of the tuple with their index, in sorted order or in reverse order.
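Multiple assignment and the for statement combine naturally when the items of a sequence are themselves tuples. The following sketch, with invented variable names, shows the swap idiom mentioned earlier and a loop that decomposes each 2-tuple as it goes.

```python
x, y = 1, 2
x, y = y, x               # the right side is evaluated first, then assigned

points = ((0, 0), (3, 4), (6, 8))
total_x = 0
total_y = 0
for px, py in points:     # each 2-tuple is decomposed on every pass
    total_x += px
    total_y += py
```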

14.6 Tuple Built-in Functions


The tuple() function creates a tuple out of another sequence object.

tuple(sequence)
    Create a tuple from another sequence. This will convert list or str to a tuple.

Functions which apply to tuples, but are defined elsewhere.

len(). For tuples, this function returns the number of items.

>>> len( (1,1,2,3) )
4
>>> len( () )
0

max(). For tuples, this function returns the maximum item.

>>> max( (1,9973,2) )
9973

min(). For tuples, this function returns the minimum item.

sum(). For tuples, this function sums the individual items.

>>> sum( (1,9973,2) )
9976

any(). For tuples, return True if there exists any item which is True.


>>> any( (0,None,False) )
False
>>> any( (0,None,False,42) )
True
>>> any( (1,True) )
True

all(). For tuples, return True if all items are True.

>>> all( (0,None,False,42) )
False
>>> all( (1,True) )
True

enumerate(). Iterate through the tuple returning 2-tuples of ( index, item ).

In effect, this function "enumerates" all the items in a sequence: it provides a number and each element of the original sequence in a 2-tuple.

for i, x in enumerate( someTuple ):
    print "position", i, " has value ", x

Consider the following.


>>> a = ( 3.1415926, "Words", (2+3j) )
>>> tuple( enumerate( a ) )
((0, 3.1415926000000001), (1, 'Words'), (2, (2+3j)))

We created a tuple from the enumeration. This shows that each item of the enumeration is a 2-tuple with the index number and an item from the original tuple.

sorted(). Iterate through the tuple in sorted order.

>>> tuple( sorted( (9,1,8,2,7,3) ))
(1, 2, 3, 7, 8, 9)
>>> tuple( sorted( (9,1,8,2,7,3), reverse=True ))
(9, 8, 7, 3, 2, 1)

reversed(). Iterate through the tuple in reverse order.

>>> tuple( reversed( (9,1,8,2,7,3) ) )
(3, 7, 2, 8, 1, 9)

The following function returns a tuple.

divmod(x, y) -> ( div, mod )
    Return a 2-tuple with ((x-x%y)/y, x%y). The return values have the invariant: div * y + mod = x. This is the quotient and the remainder in division.

The divmod() function is often combined with multiple assignment. For example:
>>> q,r = divmod(355,113)
>>> q
3
>>> r
16
>>> q*113+r
355
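A common use of divmod() with multiple assignment is breaking a quantity into mixed units. The sketch below converts a number of seconds into hours, minutes and seconds; the value 7384 is an arbitrary example.

```python
total_seconds = 7384

# First split off the seconds, then split the minutes into hours and minutes.
minutes, seconds = divmod(total_seconds, 60)
hours, minutes = divmod(minutes, 60)

# The invariant div*y + mod = x holds at each step, so the parts recombine.
recombined = hours * 3600 + minutes * 60 + seconds
```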

14.7 Tuple Exercises


These exercises implement some basic statistical algorithms. For some background on the Sigma operator, Σ, see Digression on The Sigma Operator.

1. Blocks of Stock. A block of stock has a number of attributes, including a purchase date, a purchase price, a number of shares, and a ticker symbol. We can record these pieces of information in a tuple for each block of stock and do a number of simple operations on the blocks.

Let's dream that we have the following portfolio.

Purchase Date   Purchase Price   Shares   Symbol   Current Price
25 Jan 2001     43.50            25       CAT      92.45
25 Jan 2001     42.80            50       DD       51.19
25 Jan 2001     42.10            75       EK       34.87
25 Jan 2001     37.58            100      GM       37.58

We can represent each block of stock as a 5-tuple with purchase date, purchase price, shares, ticker symbol and current price.
portfolio= [
    ( "25-Jan-2001", 43.50, 25, 'CAT', 92.45 ),
    ( "25-Jan-2001", 42.80, 50, 'DD', 51.19 ),
    ( "25-Jan-2001", 42.10, 75, 'EK', 34.87 ),
    ( "25-Jan-2001", 37.58, 100, 'GM', 37.58 )
]

Develop a function that examines each block, multiplies shares by purchase price and determines the total purchase price of the portfolio.

Develop a second function that examines each block, multiplies shares by purchase price and shares by current price to determine the total amount gained or lost.

2. Mean. Computing the mean of a list of values is relatively simple. The mean is the sum of the values divided by the number of values in the list. Since the statistical formula is so closely related to the actual loop, we'll provide the formula, followed by an overview of the code.

$$\mu_x = \frac{\sum_{0 \le i < n} x_i}{n}$$

[The cryptic-looking $\mu_x$ is a short-hand for mean of variable x.] The definition of the Σ mathematical operator leads us to the following method for computing the mean:

Computing Mean
(a) Initialize. Set sum, s, to zero.

(b) Reduce. For each value, i, in the range 0 to the number of values in the list, n: Set $s \leftarrow s + x_i$.

(c) Result. Return $s \div n$.

3. Standard Deviation. The standard deviation can be done a few ways, but we'll use the formula shown below. This computes a deviation measurement as the square of the difference between each sample and the mean. The sum of these measurements is then divided by the number of degrees of freedom, n - 1, to get a standardized deviation measurement. Again, the formula summarizes the loop, so we'll show the formula followed by an overview of the code.

$$\sigma_x = \sqrt{\frac{\sum_{0 \le i < n} (x_i - \mu_x)^2}{n - 1}}$$

[The cryptic-looking $\sigma_x$ is short-hand for standard deviation of variable x.] The definition of the Σ mathematical operator leads us to the following method for computing the standard deviation:

Computing Standard Deviation


(a) Initialize. Compute the mean, m. Initialize sum, s, to zero.

(b) Reduce. For each value, $x_i$, in the list: Compute the difference from the mean, $d \leftarrow x_i - \mu_x$, where $\mu_x$ is the mean, m, from the previous step. Set $s \leftarrow s + d^2$.

(c) Variance. Compute the variance as $\frac{s}{n - 1}$. The $n - 1$ factor reflects the statistical notion of "degrees of freedom", which is beyond the scope of this book.

(d) Standard Deviation. Return the square root of the variance.

The math module contains the math.sqrt() function. For some additional information, see The math Module.

14.8 Digression on The Sigma Operator


For those programmers new to statistics, this section provides background on the Sigma operator, Σ. The usual presentation of the summation operator looks like this.

$$\sum_{i=1}^{n} f(i)$$

The Σ operator has three parts to it. Below it is a bound variable, i, and the starting value for the range, written as i = 1. Above it is the ending value for the range, usually something like n. To the right is some function to execute for each value of the bound variable. In this case, a generic function, f(i). This is read as "sum f(i) for i in the range 1 to n".

This common definition of Σ uses a closed range; one that includes the end values of 1 and n. This, however, is not a helpful definition for software. It is slightly simpler to define Σ to start with zero and use a half-open interval. It still has exactly n elements, including 0 and n-1; mathematically, 0 ≤ i < n.

For software design purposes, we prefer the following notation, but it is not often used. Since most statistical and mathematical texts use 1-based indexing, some care is required when translating formulae to programming languages that use 0-based indexing.

$$\sum_{0 \le i < n} f(i)$$

This shows the bound variable (i) and the range below the operator. It shows the function to execute on the right of the operator.

Statistical Algorithms. Our two statistical algorithms have a form more like the following. In this we are applying some function, f, to each value, $x_i$, of an array.

$$\sum_{0 \le i < n} f(x_i)$$

When computing the mean, the function applied to each value does nothing. When computing standard deviation, the function applied involves subtracting and multiplying.

We can transform this definition directly into a for loop that sets the bound variable to all of the values in the range, and does some processing on each value of the sequence of values. This is the Python implementation of Σ. This computes two values, the sum, sum, and the number of elements, n.

Python Sigma Iteration


sum= 0
for x_i in aSequence:
    fx_i = some processing of x_i
    sum += fx_i
n= len(aSequence)

1. Execute the body of the loop for all values of x_i in the sequence aSequence. The sequence can be a tuple, list or other sequential container.

2. For simple mean calculation, the fx_i statement does nothing. For standard deviation, however, this statement computes the measure of deviation from the average.

3. We sum the x_i values for a mean calculation. We sum fx_i values for a standard deviation calculation.
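The template above can be made runnable by passing the per-item processing in as a function. The following is one possible sketch, not the only way to organize the exercise solutions; the names sigma and samples, and the sample values, are assumptions made for this illustration.

```python
import math

def sigma(f, values):
    """Sum f(x_i) over every x_i in values, following the template above."""
    total = 0
    for x_i in values:
        total += f(x_i)
    return total

samples = (2, 4, 4, 4, 5, 5, 7, 9)
n = len(samples)

# For the mean, the function applied to each value does nothing.
mean = sigma(lambda x: x, samples) / float(n)

# For the standard deviation, the function squares the difference from the mean.
variance = sigma(lambda x: (x - mean) ** 2, samples) / (n - 1)
std_dev = math.sqrt(variance)
```

The float() call guards against integer division truncating the mean under Python 2.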


CHAPTER

FIFTEEN

LISTS
We'll look at list from a number of viewpoints: semantics, literal values, operations, comparison operators, statements, built-in functions and methods.

15.1 List Semantics


A list is a container for a variable-length sequence of Python objects. A list is mutable, which means that items within the list can be changed. Also, items can be added to the list or removed from the list.

Since a list is a sequence, all of the common operations to sequences apply.

Sometimes we'll see a list with a fixed number of elements, like a two-dimensional point with two elements, x and y. A fixed-length list may not be the right choice; a tuple, covered in Tuples, is usually better for static sequences of elements.

A great deal of Python's internals are list-based. The for statement, in particular, expects a sequence, and we often create a list by using the range() function. When we split a string using the split() method, we get a list of substrings.

15.2 List Literal Values


A list literal is created by surrounding objects with [] and separating the items with commas (,). A list can be created, expanded and reduced. An empty list is simply []. As with tuple, an extra comma at the end of the list is gracefully ignored. Examples:
myList = [ 2, 3, 4, 9, 10, 11, 12 ]
history = [ ]

The elements of a list do not have to be the same type. A list can be a mixture of any Python data types, including lists, tuples, strings and numeric types.

A list permits a sophisticated kind of display called a comprehension. We'll revisit this in some depth in List Comprehensions. As a teaser, consider the following:

>>> [ 2*i+1 for i in range(6) ]
[1, 3, 5, 7, 9, 11]


This statement creates a list using a list comprehension. A comprehension starts with a candidate list (range(6), in this example) and derives the list values from the candidate list (using 2*i+1 in this example). A great deal of power is available in comprehensions, but we'll save the details for a later section.
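One way to read the teaser is as shorthand for a for loop that appends to an initially empty list. The sketch below builds the same values both ways; the variable names are invented for the comparison.

```python
# The comprehension from the teaser.
odds = [2*i + 1 for i in range(6)]

# The same values built with an explicit loop.
odds_loop = []
for i in range(6):
    odds_loop.append(2*i + 1)
```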

15.3 List Operations


The three standard sequence operations (+, *, []) can be performed with list, as well as other sequences like tuple and string. The + operator creates a new list as the concatenation of the arguments.
>>> ["field"] + [2, 3, 4] + [9, 10, 11, 12]
['field', 2, 3, 4, 9, 10, 11, 12]

The * operator between list and numbers (number * list or list * number) creates a new list that is a number of repetitions of the input list.

>>> 2*["pass","don't","pass"]
['pass', "don't", 'pass', 'pass', "don't", 'pass']

The [] operator selects an item or a slice from the list. There are two forms: the single-item form and the slice form.

The single item format is list [ index ]. Items are numbered from 0 to len(list)-1. Items are also numbered in reverse from -len(list) to -1.

The slice format is list [ start : end ]. Items from start to end-1 are chosen to create a new list as a slice of the original list; there will be end - start items in the resulting list. If start is omitted it is the beginning of the list (position 0). If end is omitted it is the end of the list (position len(list)).

Yes, you can omit both (someList[:]) to make a copy of a list. This is a shallow copy: the original objects are now members of two distinct lists.

In the following example, we've constructed a list where each element is a tuple. Each tuple could be a pair of dice.
>>> l=[(6, 2), (5, 4), (2, 2), (1, 3), (6, 5), (1, 4)]
>>> l[2]
(2, 2)
>>> l[:3]
[(6, 2), (5, 4), (2, 2)]
>>> l[3:]
[(1, 3), (6, 5), (1, 4)]
>>> l[-1]
(1, 4)
>>> l[-3:]
[(1, 3), (6, 5), (1, 4)]
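The phrase "shallow copy" deserves a small demonstration. In this sketch (with invented names), the copy is a distinct list, so appending to it leaves the original alone; but the items are shared, so a change made inside a mutable item shows up through both lists.

```python
rolls = [(6, 2), (5, 4), (2, 2)]
copy = rolls[:]
distinct = copy is not rolls      # two separate list objects
copy.append((1, 1))               # only the copy grows

# The items themselves are shared between the two lists.
shared = copy[0] is rolls[0]

# With mutable items, a change inside an item is visible through both lists.
grid = [[0, 0], [0, 0]]
shallow = grid[:]
shallow[0][0] = 9
```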

15.4 List Comparison Operations


The standard comparisons (<, <=, >, >=, ==, !=, in, not in) work exactly the same among list, tuple and string sequences. The list items are compared element by element. If the corresponding elements are the same type, ordinary comparison rules are used. If the corresponding elements are different types, the type names are compared, since there is no other rational basis for comparison.
import random
d1= random.randrange(6)+1
d2= random.randrange(6)+1
if d1+d2 in [2, 12] + [3, 4, 9, 10, 11]:
    print "field bet wins on ", d1+d2
else:
    print "field bet loses on ", d1+d2

This will create two random numbers, simulating a roll of dice. If the number is in the list of field bets, this is printed. Note that we assemble the final list of field bets from two other list objects. In a larger application program, we might separate the different winner list instances based on different payout odds.
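Separating the winner lists by payout might look something like the sketch below. The payout multipliers, the stake of 10, and the names DOUBLE_PAY, EVEN_PAY and field_bet are all assumptions made for illustration; actual table rules vary.

```python
import random

# Hypothetical payout groups, chosen only to illustrate the idea.
DOUBLE_PAY = [2, 12]
EVEN_PAY = [3, 4, 9, 10, 11]

def field_bet(total, stake):
    """Return the amount won (positive) or lost (negative) on a field bet."""
    if total in DOUBLE_PAY:
        return 2 * stake
    elif total in EVEN_PAY:
        return stake
    else:
        return -stake

d1 = random.randrange(6) + 1
d2 = random.randrange(6) + 1
outcome = field_bet(d1 + d2, 10)
```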

15.5 List Statements


There are a number of statements that have specic features related to list objects. The Assignment Statement. The variation on the assignment statement called multiple-assignment statement also works with lists. We looked at this in Multiple Assignment Statement . Multiple variables are set by decomposing the items in the list.
>>> x, y = [ 1, "hi" ]
>>> x
1
>>> y
'hi'

This will only work if the list has a fixed and known number of elements. This is more typical when working with a tuple, which is immutable, rather than a list, which can vary in length.

The for Statement. The for statement will step through all elements of a sequence.

s= 0
for i in [2,3,5,7,11,13,17,19]:
    s += i
print "total",s

When we introduced the for statement in Iterative Processing: The for Statement, we showed the range() function; this function creates a list. We can also create a list with a literal or comprehension. We've looked at simple literals above. We'll look at comprehensions below.

The del Statement. The del statement removes items from a list. For example:

>>> i = range(10)
>>> del i[0], i[2], i[4], i[6]
>>> i
[1, 2, 4, 5, 7, 8]

This example reveals how the del statement works.

The i variable starts as the list [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].

1. Remove i[0] and the variable is [1, 2, 3, 4, 5, 6, 7, 8, 9].

2. Remove i[2] (the value 3) from this new list, and get [1, 2, 4, 5, 6, 7, 8, 9].

3. Remove i[4] (the value 6) from this new list and get [1, 2, 4, 5, 7, 8, 9].

4. Finally, remove i[6] (the value 9) and get [1, 2, 4, 5, 7, 8].
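Because each deletion shifts the later items, the order of deletions matters. The sketch below repeats the example (wrapping range() in list() so it also runs on Python 3, where range() no longer returns a list), then shows two alternatives: deleting a slice, and deleting from the highest position down so that the earlier positions are not disturbed. The variable names j and k are invented here.

```python
i = list(range(10))
del i[0], i[2], i[4], i[6]    # positions shift after every deletion

# A slice removes a run of adjacent items in one step.
j = list(range(10))
del j[2:5]

# Deleting from the highest index down leaves the lower positions unchanged,
# so this removes the original values 6, 4, 2 and 0.
k = list(range(10))
for index in (6, 4, 2, 0):
    del k[index]
```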

15.6 List Built-in Functions


The list() function creates a list out of another sequence object.

list(sequence)
    Create a list from another sequence. This will convert tuple or str to a list.

Functions which apply to lists, but are defined elsewhere.

len(). For lists, this function returns the number of items.

>>> len( [1,1,2,3] )
4
>>> len( [] )
0

max(). For lists, this function returns the maximum item.

>>> max( [1,9973,2] )
9973

min(). For lists, this function returns the minimum item.

sum(). For lists, this function sums the individual items.

>>> sum( [1,9973,2] )
9976

any(). For lists, return True if there exists any item which is True.

>>> any( [0,None,False] )
False
>>> any( [0,None,False,42] )
True
>>> any( [1,True] )
True

all(). For lists, return True if all items are True.

>>> all( [0,None,False,42] )
False
>>> all( [1,True] )
True

enumerate(). Iterate through the list returning 2-tuples of ( index, item ).

In effect, this function "enumerates" all the items in a sequence: it provides a number and each element of the original sequence in a 2-tuple.

for i, x in enumerate( someList ):
    print "position", i, " has value ", x

Consider the following list of tuples.


>>> a = [ ("pi",3.1415946),("e",2.718281828),("mol",6.02E23) ]
>>> list( enumerate( a ) )
[(0, ('pi', 3.1415945999999999)), (1, ('e', 2.7182818279999998)), (2, ('mol', 6.02e+23))]
>>> for i, t in enumerate( a ):
...     print "item",i,"is",t
...
item 0 is ('pi', 3.1415945999999999)
item 1 is ('e', 2.7182818279999998)
item 2 is ('mol', 6.02e+23)

sorted(). Iterate through the list in sorted order.

>>> list( sorted( [9,1,8,2,7,3] ))
[1, 2, 3, 7, 8, 9]
>>> list( sorted( [9,1,8,2,7,3], reverse=True ))
[9, 8, 7, 3, 2, 1]

reversed(). Iterate through the list in reverse order.

>>> list( reversed( [9,1,8,2,7,3] ) )
[3, 7, 2, 8, 1, 9]

The following function returns a list.

range([start], stop, [step])
    The arguments must be plain integers. If the step argument is omitted, it defaults to 1. If the start argument is omitted, it defaults to 0. step must not be zero (or else ValueError is raised).

    The full form returns a list of plain integers [ start, start + step, start + 2*step, ...]. If step is positive, the last element is the largest start + i*step less than stop; if step is negative, the last element is the smallest start + i*step greater than stop.
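A few calls illustrate the rules for start, stop and step. The sketch wraps each call in list() so that it produces a list on Python 3 as well, where range() returns a lazy sequence rather than a list; the variable names are chosen for this example.

```python
up = list(range(5))                # start defaults to 0, step to 1
stepped = list(range(0, 10, 3))    # last element is the largest value below 10
down = list(range(10, 0, -3))      # last element is the smallest value above 0
empty = list(range(5, 0))          # a positive step cannot reach a smaller stop

# A step of zero is rejected.
try:
    range(0, 10, 0)
    zero_step_ok = True
except ValueError:
    zero_step_ok = False
```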

15.7 List Methods


A list object has a number of member methods. These can be grouped arbitrarily into mutators, which change the list, transformers, which create something new from the list, and accessors, which return a fact about a list.

The following list mutators update a list object. Generally, these do not return a value. The pop() method is the exception: it both returns information and mutates the list.

append(object)
    Update list by appending object to the end of the list.

extend(list)
    Extend list by appending the elements of the argument list. Note the difference from append(), which treats the argument as a single object.

insert(index, object)
    Update list by inserting object before position index. If index is greater than len(list), the object is simply appended. If index is negative, it counts back from the end of the list; an index before the start of the list prepends the object.


pop([index=-1])
    Remove and return the item at index (default last, -1) in the list. An exception is raised if the list is already empty.

remove(value)
    Remove the first occurrence of value from the list. An exception is raised if the value is not in the list.

reverse()
    Reverse the items of the list. This is done in place; it does not create a new list. There is no return value.

sort([key], [reverse=False])
    Sort the items of the list. This is done in place; it does not create a new list. If the reverse keyword parameter is provided and set to True, the list is sorted into descending order. The key parameter is used when the items in the list aren't simply sorted using the default comparison operators. The key function must return the fields to be compared, selected from the underlying items in the list. We'll look at this in detail in Functional Programming with Collections.

The following accessor methods provide information about a list.

count(value)
    Return the number of occurrences of value in the list.

index(value)
    Return the index of the first occurrence of value in the list.

Stacks and Queues. The list.append() and list.pop() methods can be used to create a standard push-down stack, or last-in-first-out (LIFO) list. The append() method places an item at the end of the list (or top of the stack), where the pop() method can remove it and return it.
>>> stack = []
>>> stack.append(1)
>>> stack.append( "word" )
>>> stack.append( ("a","2-tuple") )
>>> stack.pop()
('a', '2-tuple')
>>> stack.pop()
'word'
>>> stack.pop()
1
>>> len(stack)
0
>>> stack.pop()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: pop from empty list

The list.append() and list.pop() methods can also be used to create a standard queue, or first-in-first-out (FIFO) list. The append() method places an item at the end of the queue. A call to pop(0) removes the first item from the queue and returns it.
>>> queue = []
>>> queue.append( 1 )
>>> queue.append( "word" )
>>> queue.append( ("a","2-tuple") )
>>> queue.pop(0)
1
>>> queue.pop(0)
'word'
>>> queue.pop(0)
('a', '2-tuple')
>>> len(queue)
0
>>> queue.pop(0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: pop from empty list
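A list works as a queue, but pop(0) has to shift every remaining item down by one position, which can become noticeable for long queues. As an alternative that this chapter does not cover, the collections module provides deque, whose popleft() method plays the role of pop(0). The following is a minimal sketch of the same queue under that assumption; the names first, second and remaining are illustrative only.

```python
from collections import deque

# The same three items as the list-based queue above.
queue = deque()
queue.append(1)
queue.append("word")
queue.append(("a", "2-tuple"))

first = queue.popleft()    # removes and returns the oldest item
second = queue.popleft()   # then the next oldest
remaining = len(queue)     # one item is still waiting
```

For the short queues in this chapter, the plain list is perfectly adequate.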

15.8 Using Lists as Function Parameter Defaults


It's very, very important to note that default values should be immutable objects. Recall that numbers, strings, None, and tuple objects are immutable. Lists, as well as sets and dictionaries, are mutable, and should not be used as default values for function parameters. Python accepts a mutable default without complaint, but the results are surprising. Consider the following example of what not to do.
>>> def append2( someList=[] ):
...     someList.append(2)
...     return someList
...
>>> looks_good= []
>>> append2(looks_good)
[2]
>>> append2(looks_good)
[2, 2]
>>> looks_good
[2, 2]
>>>
>>> not_good= append2()
>>> not_good
[2]
>>> worse= append2()
>>> worse
[2, 2]
>>> not_good
[2, 2]

1. We defined a function which has a default value that's a mutable object. This is simply a bad programming practice in Python.
2. We used this function with a list object, looks_good. The function updated the list object as expected.
3. We used the function's default value to create not_good. The function appended to an empty list and returned this new list object. It turns out that the function updated the mutable default value, also.
4. When we use the function's default value again, with worse, the function uses the updated default value and updates it again.


Both not_good and worse are references to the same mutable object that is being updated. To avoid this, do not use mutable values as defaults. Do this instead.
def append2( someList=None ):
    if someList is None:
        someList= []
    someList.append(2)
    return someList

This creates a fresh new mutable object as needed.
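To see that the corrected function no longer shares state between calls, we can repeat the earlier experiment. This is a small sketch, not part of the original text; the names first, second and shared are made up for illustration.

```python
def append2(someList=None):
    if someList is None:
        someList = []          # a new list is built on each call that needs one
    someList.append(2)
    return someList

first = append2()              # uses a fresh list
second = append2()             # uses another fresh list, not the one above

shared = [9]
result = append2(shared)       # an explicit argument is still updated in place
```

Here first and second are two separate lists, each holding a single 2, while result is the very same object as shared.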

15.9 List Exercises


1. Accumulating Distinct Values. This uses the Bounded Linear Search algorithm to locate duplicate values in a sequence. This is a powerful technique to eliminate sorting from a wide variety of summary-type reports. Failure to use this algorithm leads to excessive processing in many types of applications.

Distinct Values of a Sequence, seq


(a) Initialize Distinct Values. Set dv ← list().
(b) Loop. For each value, v, in seq. We'll use the Bounded Linear Search to see if v occurs in dv.
    i. Initialize. Set i ← 0. Append v to the list dv.
    ii. Search. While dv[i] ≠ v: increment i.
        At this point dv[i] = v. The question is whether i = len(dv) − 1 (the position of the value we just appended) or not.
    iii. New Value? If i = len(dv) − 1: v is distinct.
    iv. Existing Value? If i ≠ len(dv) − 1: v is a duplicate of dv[i]. Delete dv[-1], the value we added.
(c) Result. Return array dv, which has distinct values from seq.

You may also notice that this fancy Bounded Linear Search is suspiciously similar to the index() method function of a list. Rewrite this using dv.index() instead of the Bounded Linear Search in step (b).

When we look at the set collection, you'll see another way to tackle this problem.

2. Binary Search. This is not as universally useful as the Bounded Linear Search (above) because it requires the data be sorted.

Binary Search a sorted Sequence, seq, for a target value, tgt


(a) Initialize. l, h ← 0, len(seq). m ← (l + h) ÷ 2. This is the midpoint of the sorted sequence.
(b) Divide and Conquer. While l + 1 < h and seq[m] ≠ tgt.
    If tgt < seq[m]: h ← m. Move h to the midpoint.
    If tgt > seq[m]: l ← m + 1. Move l just past the midpoint.
    m ← (l + h) ÷ 2. Compute a midpoint of the new, smaller sequence.
(c) Result. If l < h and tgt = seq[m]: return m. Otherwise, return -1 as a code for "not found". The l < h test guards against an empty region, where seq[m] may not exist.

3. Quicksort. The super-fast sort routine. As a series of loops it is rather complex. As a recursion it is quite short. This is the same basic algorithm in the C libraries.

Quicksort proceeds by partitioning the list into two regions: one has all of the high values, the other has all the low values. Each of these regions is then individually sorted into order using the quicksort algorithm. This means that each region will be subdivided and sorted.

For now, we'll sort an array of simple numbers. Later, we can generalize this to sort generic objects.

Quicksort a List, a between elements lo and hi


(a) Partition
    i. Initialize. ls, hs ← lo, hi. Setup for partitioning between ls and hs. middle ← (ls + hs) ÷ 2.
    ii. Swap To Partition. While ls < hs:
        If a[ls].key ≤ a[middle].key: increment ls by 1. Move the low boundary of the partitioning.
        If a[ls].key > a[middle].key: swap the values a[ls] ↔ a[middle].
        If a[hs].key ≥ a[middle].key: decrement hs by 1. Move the high boundary of the partitioning.
        If a[hs].key < a[middle].key: swap the values a[hs] ↔ a[middle].
(b) Quicksort Each Partition.
    QuickSort( a, lo, middle )
    QuickSort( a, middle+1, hi )

4. Recursive Search. This is also a binary search: it works using a design called divide and conquer. Rather than search the whole list, we divide it in half and search just half the list. This version, however, is defined with a recursive function instead of a loop. The recursive version is often shorter and easier to reason about than the looping version shown above.

Recursive Search a List, seq for a target, tgt, in the region between elements lo and hi.
(a) Empty Region? If lo + 1 > hi (the region from lo up to, but not including, hi is empty): return -1 as a code for "not found".
(b) Middle Element. m ← (lo + hi) ÷ 2.
(c) Found? If seq[m] = tgt: return m.
(d) Lower Half? If tgt < seq[m]: return recursiveSearch( seq, tgt, lo, m )
(e) Upper Half? If tgt > seq[m]: return recursiveSearch( seq, tgt, m+1, hi )

5. Sieve of Eratosthenes. This is an algorithm which locates prime numbers. A prime number can only be divided evenly by 1 and itself. We locate primes by making a table of all numbers, and then crossing out the numbers which are multiples of other numbers. What is left must be prime.

Sieve of Eratosthenes
(a) Initialize. Create a list, prime, of 5000 booleans, all True, initially. p ← 2.
(b) Iterate. While 2 ≤ p < 5000.
    i. Find Next Prime. While p < 5000 and not prime[p]: increment p by 1.
    ii. Remove Multiples. At this point, p is prime (or p has reached 5000 and there is nothing left to remove). Set k ← p + p. While k < 5000: prime[k] ← False. Increment k by p.
    iii. Next p. Increment p by 1.
(c) Report. At this point, for all p, if prime[p] is true, p is prime. While 2 ≤ p < 5000: if prime[p]: print p

The reporting step is a filter operation. We're creating a list from a source range and a filter rule. This is ideal for a list comprehension. We'll look at these in List Comprehensions.

Formally, we can say that the primes are the set of values defined by primes = { p | 2 ≤ p < 5000 if prime[p] }. This formalism looks a little bit like a list comprehension.

6. Polynomial Arithmetic. We can represent numbers as polynomials. We can represent polynomials as arrays of their coefficients. This is covered in detail in [Knuth73], section 2.2.4 algorithms A and M.

Example: 4x^3 + 3x + 1 has the following coefficients: ( 4, 0, 3, 1 ).
The polynomial 2x^2 − 3x − 4 is represented as ( 2, -3, -4 ).
The sum of these is 4x^3 + 2x^2 − 3; ( 4, 2, 0, -3 ).
The product of these is 8x^5 − 12x^4 − 10x^3 − 7x^2 − 15x − 4; ( 8, -12, -10, -7, -15, -4 ).

You can apply this to large decimal numbers. In this case, x is 10, and the coefficients must all be between 0 and x − 1. For example, 1987 = 1x^3 + 9x^2 + 8x + 7, when x = 10.

Add Polynomials, p, q
(a) Result Size. rsz ← the larger of len(p) and len(q).
(b) Pad P? If len(p) < rsz: Set p1 to a tuple of rsz − len(p) zeros + p. Else: Set p1 to p.
(c) Pad Q? If len(q) < rsz: Set q1 to a tuple of rsz − len(q) zeros + q. Else: Set q1 to q.
(d) Add. Add matching elements from p1 and q1 to create result, r.
(e) Result. Return r as the sum of p and q.
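The padding steps above can be sketched in a few lines. This is one possible reading of the outline, not a definitive solution; the function name add_poly is made up for illustration, and the polynomials are assumed to be tuples of coefficients with the highest power first, as in the examples above. Multiplying a tuple by zero gives an empty tuple, so the "Pad P?" and "Pad Q?" tests collapse into a single expression each.

```python
def add_poly(p, q):
    """Add two polynomials given as coefficient sequences, highest power first."""
    rsz = max(len(p), len(q))                  # (a) Result Size
    p1 = (0,) * (rsz - len(p)) + tuple(p)      # (b) pad p on the left with zeros
    q1 = (0,) * (rsz - len(q)) + tuple(q)      # (c) pad q on the left with zeros
    r = []
    for i in range(rsz):                       # (d) add matching elements
        r.append(p1[i] + q1[i])
    return tuple(r)                            # (e) Result

# The worked example from the text: (4x^3 + 3x + 1) + (2x^2 - 3x - 4)
total = add_poly((4, 0, 3, 1), (2, -3, -4))
```

With the sample polynomials, total matches the sum given in the example, ( 4, 2, 0, -3 ).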

Multiply Polynomials, x, y
(a) Result Size. rsz ← len(x) + len(y) − 1. (In the example above, 4 coefficients times 3 coefficients gives 6 coefficients.) Initialize the result list, r, to all zeros, with a size of rsz.
(b) For all elements of x. While 0 ≤ i < len(x):
    For all elements of y. While 0 ≤ j < len(y):
        Set r[i + j] = r[i + j] + x[i] × y[j].
(c) Result. Return a tuple made from r as the product of x and y.

7. Random Number Evaluation. Before using a new random number generator, it is wise to evaluate the degree of randomness in the numbers produced. A variety of clever algorithms look for certain types of expected distributions of numbers, pairs, triples, etc. This is one of many random number tests.

Use random.random() to generate an array of random samples. These numbers will be uniform over the interval 0..1.

Distribution test of a sequence of random samples, U


(a) Initialize. Initialize count to a list of 10 zeros.
(b) Examine Samples. For each sample value, v, in the original set of 1000 random samples, U.
    i. Coerce Into Range. Set x ← ⌊v × 10⌋. Multiply by 10 and truncate to an integer to get a new value in the range 0 to 9.
    ii. Count. Increment count[x] by 1.
(c) Report. We expect each count to be 1/10th of our available samples. We need to display the actual count and the % of the total. We also need to calculate the difference between the actual count and the expected count, and display this.

8. Dutch National Flag. A challenging problem, one of the hardest in this set. This is from Edsger Dijkstra's book, A Discipline of Programming [Dijkstra76].

Imagine a board with a row of holes filled with red, white, and blue pegs. Develop an algorithm which will swap pegs to make three bands of red, white, and blue (like the Dutch flag). You must also satisfy this additional constraint: each peg must be examined exactly once.

Without the additional constraint, this is a relatively simple sorting problem. The additional constraint requires that instead of a simple sort which passes over the data several times, we need a more clever sort.


Hint: You will need four partitions in the array. Initially, every peg is in the Unknown partition. The other three partitions (Red, White and Blue) are empty. As the algorithm proceeds, pegs are swapped into the Red, White or Blue partition from the Unknown partition. When you are done, the unknown partition is reduced to zero elements, and the other three partitions have known numbers of elements.


CHAPTER

SIXTEEN

MAPPINGS AND DICTIONARIES


Many algorithms need to map a key to a data value. This kind of mapping is supported by the Python dictionary, dict. We'll look at dictionaries from a number of viewpoints: semantics, literal values, operations, comparison operators, statements, built-in functions and methods.

We are then in a position to look at two applications of the dictionary. We'll look at how Python uses dictionaries along with sequences to handle arbitrary connections of parameters to functions in Advanced Parameter Handling For Functions. This is a very sophisticated set of tools that let us define functions that are very flexible and easy to use.

16.1 Dictionary Semantics


A dictionary, called a dict, maps a key to a value. The key can be any type of Python object that computes a consistent hash value. The value referenced by the key can be any type of Python object.

There is a subtle terminology issue here. Python has provisions for creating a variety of different types of mappings. Only one type of mapping comes built-in; that type is the dict. The terms are almost interchangeable. However, you may develop or download other types of mappings, so we'll be careful to focus on the dict class.

Working with a dict is similar to working with a sequence. Items are inserted into the dict, found in the dict and removed from the dict. A dict object has member methods that return a sequence of keys, or values, or ( key , value ) tuples suitable for use in a for statement.

Unlike a sequence, a dict does not preserve order. Instead of order, a dict uses a hashing algorithm to identify each item's place in the dict with a rapid calculation of the key's hash value. The built-in function, hash(), is used to do this calculation. Items in the dict are inserted in a position related to their key's apparently random hash values.

Some Alternate Terminology. A dict can be thought of as a container of ( key : value ) pairs. This can be a helpful way to imagine the information in a mapping. Each pair in the container is the mapping from a key to an associated value.

A dict can be called an associative array. Ordinary arrays, typified by sequences, use a numeric index, but a dict's index is made up of the key objects. Each key is associated with (or mapped to) the appropriate value.

Some Consequences. Each key object has a hash value, which is used to place the value in the dict. Consequently, the keys must have consistent hash values; they must be immutable objects. You can't use list, dict or set objects as keys. You can use tuple, string and frozenset objects, since they are immutable. Additionally, when we get to class definitions (in Classes), we can make arrangements for our objects to return an immutable hash value.
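The rule about keys can be checked directly. The following is a small sketch, not part of the original text; the dictionary lookup and the other names are made up for illustration. It shows that equal tuples produce equal hash values, that tuple and frozenset objects are accepted as keys, and that a list is rejected with a TypeError.

```python
point = (3, 4)
key_hash_same = hash(point) == hash((3, 4))    # equal tuples hash alike

lookup = {}
lookup[point] = "tuple key works"
lookup[frozenset([1, 2])] = "frozenset key works"

# A list is mutable, so it has no hash value and cannot be a key.
try:
    lookup[[3, 4]] = "list key"
    list_key_rejected = False
except TypeError:
    list_key_rejected = True
```

Because the hash depends only on the key's value, a different but equal tuple such as (3, 4) finds the same item.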


A common programming need is a heterogeneous container of data. Database records are an example. A record in a database might have a boat's name (as a string), the length overall (as a number) and an inventory of sails (a list of strings). Often a record like this will have each element (known as a field) identified by name. A C or C++ struct accomplishes this. This kind of named collection of data elements may be better handled with a class (something we'll cover in Classes) or a named tuple (see collections). However, a mapping is also useful for managing this kind of heterogeneous data with named fields.

Note that many languages make record definitions a statically-defined container of named fields. A Python dict is dynamic, allowing us to add field names at run-time.

A common alternative to hashing is using some kind of ordered structure to maintain the keys. This might be a tree or list, which would lead to other kinds of mappings. For example, there is an ordered dictionary in the Python collections module.
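As a rough illustration of a dict used as a record, the sketch below builds the boat record described above and then adds a field at run-time. The "BEAM" field and its value are hypothetical, invented only to show the dynamic behavior; they do not come from the original text.

```python
# A record with named fields of different types.
myBoat = {"NAME": "KaDiMa", "LOA": 18, "SAILS": ["main", "jib", "spinnaker"]}

# Unlike a statically-defined struct, a new field can be added at any time.
myBoat["BEAM"] = 7.5        # hypothetical field, for illustration only

field_names = sorted(myBoat.keys())
sail_count = len(myBoat["SAILS"])
```

Each field is retrieved by name, and the value keeps its own type: a string, a number or a list.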

16.2 Dictionary Literal Values


A dict literal is created by surrounding a key-value list with {}; a key is separated from its value with :, and the key : value pairs are separated with commas (,). An empty dict is simply {}. As with list and tuple, an extra , inside the {} is tolerated. Examples:
diceRoll = { (1,1): "snake eyes", (6,6): "box cars" }
myBoat = { "NAME":"KaDiMa", "LOA":18, "SAILS":["main","jib","spinnaker"] }
theBets = { }

diceRoll
    This is a dict with two elements. One element has a key of a tuple (1,1) and a value of a string, "snake eyes". The other element has a key of a tuple (6,6) and a value of a string "box cars".

myBoat
    This variable is a dict with three elements. One element has a key of the string "NAME" and a value of the string "KaDiMa". Another element has a key of the string "LOA" and a value of the integer 18. The third element has a key of the string "SAILS" and the value of a list ["main", "jib", "spinnaker"].

theBets
    An empty dict.

The values and keys in a dict do not have to be the same type. Keys must be a type that can produce a hash value. Since list and dict objects are mutable, they are not permitted as keys. All other immutable types (especially string, frozenset and tuple) are legal keys.

16.3 Dictionary Operations


A dict only permits a single operation: []. This is used to add, change or retrieve items from the dict. The slicing operations that apply to sequences don't apply to a dict.

Examples of dict operations.
>>> d= {}
>>> d[2] = [ (1,1) ]
>>> d[3] = [ (1,2), (2,1) ]
>>> d
{2: [(1, 1)], 3: [(1, 2), (2, 1)]}
>>> d[2]
[(1, 1)]
>>> d["2 or 3"] = d[2] + d[3]
>>> d
{'2 or 3': [(1, 1), (1, 2), (2, 1)], 2: [(1, 1)], 3: [(1, 2), (2, 1)]}

1. This example starts by creating an empty dict, d.
2. Into d[2] we insert a list with a single tuple.
3. Into d[3] we insert a list with two tuples.
4. When the entire dict is printed it shows the two key:value pairs, one with a key of 2 and another with a key of 3.
5. The entry with a key of the string "2 or 3" has a value which is computed from the values of d[2] + d[3]. Since these two entries are lists, the lists can be combined with the + operator. The resulting list is stored into the dict.
6. When we print d, we see that there are three key:value pairs: one with a key of 3, one with a key of 2 and one with a key of "2 or 3".

This ability to use any hashable object as a key is a powerful feature, and can eliminate some needlessly complex programming that might be done in other languages.

Here are some other examples of picking elements out of a dict.
>>> myBoat = { "NAME":"KaDiMa", "LOA":18,
... "SAILS":["main","jib","spinnaker"] }
>>> myBoat["NAME"]
'KaDiMa'
>>> myBoat["SAILS"].remove("spinnaker")
>>> myBoat
{'LOA': 18, 'NAME': 'KaDiMa', 'SAILS': ['main', 'jib']}

String Formatting with Dictionaries. The string formatting operator, %, can be applied between str and dict as well as str and sequence. When this operator was introduced in Strings, the format specifications were applied to a tuple or other sequence. When used with a dict, each format specification is given an additional option that specifies which dict element to use. The general format for each conversion specification is:
%( element ) [ flags ][ width [ . precision ]] code

The flags, width, precision and code elements are defined in Strings. The element field must be enclosed in ()'s; this is the key to be selected from the dict. For example:
print "%(NAME)s, %(LOA)d" % myBoat

This will find myBoat["NAME"] and use %s formatting; it will find myBoat["LOA"] and use %d number formatting.
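The same conversion can be captured in a variable instead of being printed, and the optional flags, width and precision fit between the closing parenthesis and the code. This is a minimal sketch, not part of the original text; the names line and padded are made up for illustration.

```python
myBoat = {"NAME": "KaDiMa", "LOA": 18, "SAILS": ["main", "jib", "spinnaker"]}

# Each %(key) selects one item from the dict; unused keys are simply ignored.
line = "%(NAME)s, %(LOA)d" % myBoat

# Flags, width and precision follow the key: a left-justified name in
# 10 columns, then the length as a float in 5 columns with 1 decimal place.
padded = "%(NAME)-10s|%(LOA)5.1f" % myBoat
```

Here line is 'KaDiMa, 18' and padded is 'KaDiMa    | 18.0'.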


16.4 Dictionary Comparison Operations


The ordering comparisons ( < , <= , > , >= ) don't have a lot of meaning between two dictionaries. Since there may be no common keys, nor even a common data type for keys, dictionaries are compared first by length. The dict with fewer elements is considered less than a dict with more elements. The equality comparisons ( == , != ) test whether two dictionaries have the same key:value pairs.

The membership comparisons (in, not in) apply to the keys of the dictionary.
>>> colors = { "blue": (0x30,0x30,0xff), "green": (0x30,0xff,0x97),
... "red": (0xff,0x30,0x97), "yellow": (0xff,0xff,0x30) }
>>> "blue" in colors
True
>>> (0x30,0x30,0xff) in colors
False
>>> "orange" not in colors
True

16.5 Dictionary Statements


There are a number of statements that have specific features related to dict objects.

The for Statement. The for statement iterates through the keys of the dictionary.
>>> colors = { "blue": (0x30,0x30,0xff), "green": (0x30,0xff,0x97),
... "red": (0xff,0x30,0x97), "yellow": (0xff,0xff,0x30) }
>>> for c in colors:
...     print c, colors[c]

It's common to use some slightly different techniques for iterating through the elements of a dict.

The key:value pairs. We can use the items() method to iterate through the sequence of 2-tuples that contain each key and the associated value.
for key, value in someDictionary.items():
    # process key and value

The values. We can use the values() method to iterate through the sequence of values in the dictionary.
for value in someDictionary.values():
    # process the value

Note that we can't easily determine the associated key. A dictionary only goes one way: from key to value.

The keys. We can use the keys() method to iterate through the sequence of keys. This is what happens when we simply use the dictionary object in the for statement.

Here's an example of using the key:value pairs.
>>> myBoat = { "NAME":"KaDiMa", "LOA":18,
... "SAILS":["main","jib","spinnaker"] }
>>> for key, value in myBoat.items():
...     print key, "=", value
...
LOA = 18
NAME = KaDiMa
SAILS = ['main', 'jib', 'spinnaker']

The del Statement. The del statement removes items from a dict. For example:
>>> i = { "two":2, "three":3, "quatro":4 }
>>> del i["quatro"]
>>> i
{'two': 2, 'three': 3}

In this example, we use the key to remove the item from the dict. The member function, pop(), does this also.
>>> i = { "two":2, "three":3, "quatro":4 }
>>> i.pop("quatro")
4
>>> i
{'two': 2, 'three': 3}
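The two techniques differ when the key is missing. The following sketch, which is not part of the original text, uses made-up variable names to show that pop() returns the removed value and can be given a default, while del on a missing key raises a KeyError.

```python
i = {"two": 2, "three": 3, "quatro": 4}

removed = i.pop("quatro")          # removes the item and hands back its value
missing = i.pop("quatro", None)    # key is already gone: the default is returned

# del has no default; a missing key is an error.
try:
    del i["quatro"]
    del_raised = False
except KeyError:
    del_raised = True
```

Use del when the key is known to be present, and pop() with a default when it might not be.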

16.6 Dictionary Built-in Functions


Here are the built-in functions that deal with dictionaries.

dict([values], [key=value...])
    Creates a new dictionary. If a positional parameter, values, is provided, each element must be a 2-tuple. The values pairs are used to populate the dictionary; the first element of each pair is the key and the second element is the value. Note that the zip() function produces a list of 2-tuples from two parallel lists. If any keyword parameters are provided, each keyword becomes a key in the dictionary and the keyword argument becomes the value for that key.
>>> dict( [('first',0), ('second',1),('third',2)] )
{'second': 1, 'third': 2, 'first': 0}
>>> dict( zip(['fourth','fifth','sixth'],[3,4,5]) )
{'sixth': 5, 'fifth': 4, 'fourth': 3}
>>> dict( seventh=7, eighth=8, ninth=9 )
{'seventh': 7, 'eighth': 8, 'ninth': 9}

Functions which apply to dicts, but are defined elsewhere.

len(). For dicts, this function returns the number of items.
>>> len( {1:'first',2:'second',3:'third'} )
3
>>> len( {} )
0

max(). For dicts, this function returns the maximum key.


>>> max( {1:'first',2:'second',3:'third'} )
3


min(). For dicts, this function returns the minimum key.

sum(). For dicts, this function sums the keys.
>>> sum( {1:'first',2:'second',3:'third'} )
6

any(). Equivalent to any( dictionary.keys() ). Return True if any key in the dictionary is True, or equivalent to True. This is almost always true except for empty dictionaries or a peculiar dictionary whose only keys are 0, False, None, etc.

all(). Equivalent to all( dictionary.keys() ). Return True if all keys in the dictionary are True, or equivalent to True.
>>> all( {1:'first',2:'second',3:'third'} )
True
>>> all( {1:'first',2:'second',None:'error'} )
False

enumerate(). Iterate through the dictionary returning 2-tuples of ( index, key ). This iterates through the key values. Since dictionaries have no explicit ordering to their keys, this enumeration is in an arbitrary order.

sorted(). Iterate through the dictionary keys in sorted order. The keys are actually a list, and this returns a list of the sorted keys.
>>> sorted( { "two":2, "three":3, "quatro":4 } )
['quatro', 'three', 'two']
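All of these functions see only the keys. When the order should depend on the values instead, one option is to pass the dictionary's own get() method as the key function of sorted(). This is a small sketch under that assumption, not part of the original text; the variable names are made up for illustration.

```python
i = {"two": 2, "three": 3, "quatro": 4}

by_key = sorted(i)                    # alphabetical order of the keys
by_value = sorted(i, key=i.get)       # keys ordered by their associated values
largest_first = sorted(i, key=i.get, reverse=True)

key_count = len(i)
biggest_key = max(i)                  # compares the keys, not the values
```

Note that max(i) is 'two' because the keys are compared as strings, even though "quatro" has the largest value.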

16.7 Dictionary Methods


A dict object has a number of member methods. Many of these maintain the values in a dict. Others retrieve parts of the dict as a sequence, for use in a for statement.

The following mutator methods update a dict object. Most of these do not return a value. The dict.pop() and dict.setdefault() methods both update the dictionary and return values.

clear()
    Remove all items from the dict.

pop(key, [default])
    Remove the given key from the dict, returning the associated value. If the key does not exist, return the default value provided. If the key does not exist and no default value is provided, raise a KeyError exception.

setdefault(key, [default])
    If the key is in the dictionary, return the associated value. If the key is not in the dictionary, set the given default as the value and return this value. If default is not given, it defaults to None.

update(new, [key=value...])
    Merge values from the mapping new into the original dict, adding or replacing as needed. It is equivalent to the following Python statement.

        for k in new.keys(): d[k]= new[k]

    If any keyword parameters are provided, each keyword becomes a key in the dictionary and the keyword argument becomes the value for that key.


>>> x= dict( seventh=7, eighth=8, ninth=9 )
>>> x
{'seventh': 7, 'eighth': 8, 'ninth': 9}
>>> x.update( first=1 )
>>> x
{'seventh': 7, 'eighth': 8, 'ninth': 9, 'first': 1}

The following transformer method transforms a dictionary into another object.

copy()
    Copy the dict to make a new dict. This is a shallow copy. All objects in the new dict are references to the same objects as the original dict.

The following accessor methods provide information about a dict.

get(key, [default])
    Get the item with the given key, similar to dict[key]. If the key is not present and default is given, supply default instead. If the key is not present and no default is given, return None; unlike dict[key], get() does not raise the KeyError exception.

items()
    Return all of the items in the dict as a sequence of (key,value) 2-tuples. Note that these are returned in no particular order.

keys()
    Return all of the keys in the dict as a sequence of keys. Note that these are returned in no particular order.

values()
    Return all the values from the dict as a sequence. Note that these are returned in no particular order.
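A short sketch can pull several of these methods together. It is not part of the original text, and the dictionary contents and variable names are made up for illustration. It contrasts get(), which never changes the dictionary, with setdefault(), which inserts the default, and shows what a shallow copy does and does not share.

```python
d = {"seventh": 7, "eighth": [8]}

absent = d.get("ninth")               # None: get() does not raise KeyError
fallback = d.get("ninth", 0)          # the supplied default; d is unchanged
inserted = d.setdefault("ninth", 9)   # inserts 9 under "ninth" and returns it

d.update({"seventh": 77}, first=1)    # replace one value, add one new key

snapshot = d.copy()                   # shallow copy: a new dict, same value objects
snapshot["tenth"] = 10                # adding a key affects only the copy
snapshot["eighth"].append(88)         # the list value is shared with d
```

After these statements d has no "tenth" key, but d["eighth"] has grown to [8, 88] because both dictionaries refer to the same list.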

16.8 Using Dictionaries as Function Parameter Defaults


It's very, very important to note that default values should be immutable objects. Recall that numbers, strings, None, and tuple objects are immutable. Dictionaries, as well as sets and lists, are mutable, and should not be used as default values for function parameters. Python accepts a mutable default without complaint, but the results are surprising. Consider the following example of what not to do.
>>> def default2( someDict={} ):
...     someDict['default']= 2
...     return someDict
...
>>> looks_good= {}
>>> default2(looks_good)
{'default': 2}
>>> default2(looks_good)
{'default': 2}
>>> looks_good
{'default': 2}
>>>
>>> not_good= default2()
>>> not_good
{'default': 2}
>>> worse= default2()
>>> worse
{'default': 2}
>>> not_good
{'default': 2}
>>>
>>> not_good['surprise']= 'what?'
>>> not_good
{'default': 2, 'surprise': 'what?'}
>>> worse
{'default': 2, 'surprise': 'what?'}

1. We defined a function which has a default value that's a mutable object. This is simply a bad programming practice in Python.
2. We used this function with a dictionary object, looks_good. The function updated the dictionary object as expected.
3. We used the function's default value to create not_good. The function inserted a value into an empty dictionary and returned this new dictionary object. It turns out that the function updated the mutable default value, also.
4. When we use the function's default value again, with worse, the function uses the updated default value and updates it again.

Both not_good and worse are references to the same mutable object that is being updated. To avoid this, do not use mutable values as defaults. Do this instead.
def default2( someDict=None ):
    if someDict is None:
        someDict= {}
    someDict['default']= 2
    return someDict

This creates a fresh new mutable object as needed.

16.9 Dictionary Exercises


1. Word Frequencies. Update the exercise in Accumulating Distinct Values to count each occurrence of the values in aSequence. Change the result from a simple sequence to a dict. The dict key is the value from aSequence. The dict value is the count of the number of occurrences.

If this is done correctly, the input sequence can be words, numbers or any other immutable Python object, suitable for a dict key.

For example, the program could accept a line of input, discarding punctuation and breaking it into words on space boundaries. The basic string operations should make it possible to create a simple sequence of words.

Iterate through this sequence, placing the words into a dict. The first time a word is seen, the frequency is 1. Each time the word is seen again, increment the frequency. Produce a frequency table.

To alphabetize the frequency table, extract just the keys. A sequence can be sorted (see section 6.2). This sorted sequence of keys can be used to extract the counts from the dict.

2. Stock Reports. A block of publicly traded stock has a variety of attributes; we'll look at a few of them. A stock has a ticker symbol and a company name. Create a simple dict with ticker symbols and company names.


For example:
stockDict = { 'GM': 'General Motors', 'CAT':'Caterpillar', 'EK':"Eastman Kodak" }

Create a simple list of blocks of stock. These could be tuples with ticker symbols, prices, dates and number of shares. For example:
purchases = [
    ( 'GE', 100, '10-sep-2001', 48 ),
    ( 'CAT', 100, '1-apr-1999', 24 ),
    ( 'GE', 200, '1-jul-1999', 56 ) ]

Create a purchase history report that computes the full purchase price (shares times dollars) for each block of stock and uses the stockDict to look up the full company name. This is the basic relational database join algorithm between two tables.

Create a second purchase summary which accumulates total investment by ticker symbol. In the above sample data, there are two blocks of GE. These can easily be combined by creating a dict where the key is the ticker and the value is the list of blocks purchased. The program makes one pass through the data to create the dict. A pass through the dict can then create a report showing each ticker symbol and all blocks of stock.

3. Date Decoder. A date of the form 8-MAR-85 includes the name of the month, which must be translated to a number. Create a dict suitable for decoding month names to numbers. Create a function which uses string operations to split the date into 3 items using the - character. Translate the month, and correct the year to include all of the digits.

The function will accept a date in the dd-MMM-yy format and respond with a tuple of ( y , m, d ).

4. Dice Odds. There are 36 possible combinations of two dice. A simple pair of loops over range(1,7) will enumerate all combinations. The sum of the two dice is more interesting than the actual combination. Create a dict of all combinations, using the sum of the two dice as the key. Each value in the dict should be a list of tuples; each tuple has the value of two dice. The general outline is something like the following:

Enumerate Dice Combinations


Initialize. combos ← dict()
For all d1. Iterate with 1 ≤ d1 < 7.
    For all d2. Iterate with 1 ≤ d2 < 7.
        Create a Tuple. t ← (d1, d2).
        In the Dictionary. Is d1 + d2 a key in combos?
            Append. Append the tuple, t, to the value for item d1 + d2 in combos.
        Not In the Dictionary. If d1 + d2 is not a key in combos, then
            Insert. Add a new element d1 + d2 to combos; the value is a 1-element list of the tuple, t.
Report. Display the resulting dictionary.


16.10 Advanced Parameter Handling For Functions


In More Function Definition Features we hinted that Python functions can handle a variable number of argument values in addition to supporting optional argument values.

When we define a function, we can have optional parameters. We define a fixed number of parameters, but some (or all) can be omitted because we provided default values for them. This allows us to provide too few positional argument values.

If we provide too many positional argument values to a function, however, Python raises an exception. It turns out that we can also handle this. Consider the following example. We defined a function of three positional parameters, and then evaluated it with more than three argument values.

>>> def avg(a,b,c): return (a+b+c)/3.0
...
>>> avg(10,11,12)
11.0
>>> avg(10,11,12,13)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: avg() takes exactly 3 arguments (4 given)

First, we'll look at handling an unlimited number of positional values. Then we'll look at handling an unlimited number of keyword values.

16.10.1 Unlimited Number of Positional Argument Values


Python lets us define a function that handles an unknown and unlimited number of argument values. Examples of built-in functions with an unlimited number of argument values are max() and min().

Rather than have Python raise an exception for extra argument values, we can request that the additional positional argument values be collected into a tuple. To do this, we provide a final parameter definition of the form *extras. The * indicates that this parameter variable is the place to capture the extra argument values. The variable, here called extras, will receive a sequence with all of the extra positional argument values.

You can only provide one such variable (if you provided two, how could Python decide which of these two got the extra argument values?). You must provide this variable after the ordinary positional parameters in the function definition.

The following function accepts an unlimited number of positional arguments; it collects these in a single tuple parameter, args.

def myMax( *args ):
    max= args[0]
    for a in args[1:]:
        if a > max:
            max= a
    return max

Here's another example. In this case we have a fixed parameter in the first position and all the extra parameters collected into a tuple called vals.

def printf( format, *vals ):
    print format % vals


This should look familiar to C programmers. Now we can write the following, which may help ease the transition from C to Python.
printf( "%s = %d", "some string", 2 )
printf( "%s, %s, %d %d", "thing1", "thing2", 3, 22 )
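As a small sketch of the rules above, the following shows what a starred parameter receives when zero or several extra values are supplied. The function name collect is a hypothetical helper introduced only for illustration; the code avoids the print statement so that it should run unchanged on Python 2.6 and on Python 3.

```python
def collect(first, *extras):
    """Return the fixed parameter and the tuple of extra positional values."""
    return first, extras

# With no extra argument values, extras is an empty tuple.
no_extras = collect("a")

# Extra positional values are gathered, in order, into a single tuple.
some_extras = collect("a", "b", "c", "d")

# The fixed parameter is still required; omitting it raises TypeError.
try:
    collect()
    missing_first_ok = True
except TypeError:
    missing_first_ok = False
```

Because extras is always a tuple, the body of the function can use len(), indexing and the for statement on it without first checking whether any extra values were given.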

16.10.2 Unlimited Number of Keyword Argument Values


In addition to collecting extra positional argument values into a single parameter, Python can also collect extra keyword argument values into a dict. If you want a container of keyword arguments, you provide a parameter of the form ** extras. Your variable, here called extras, will receive a dict with all of the keyword parameters. The following function accepts any number of keyword arguments; they are collected into a single parameter.
def rtd( **args ):
    if "rate" in args and "time" in args:
        args['distance'] = args['rate']*args['time']
    elif "rate" in args and "distance" in args:
        args['time']= args['distance']/args['rate']
    elif "time" in args and "distance" in args:
        args['rate']= args['distance']/args['time']
    else:
        raise Exception( "%r does not compute" % ( args, ) )
    return args

Here are two examples of using this rtd() function.

>>> rtd( rate=60.0, time= .75 )
{'distance': 45.0, 'rate': 60.0, 'time': 0.75}
>>> rtd( distance=173, time=2+50/60.0 )
{'distance': 173, 'rate': 61.058823529411761, 'time': 2.8333333333333335}

The keyword arguments are collected into a dict, named args. We check for combinations of rate, time and distance in the args dictionary. For each combination, we can solve for the remaining value and update the dict by inserting the additional key and value.
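The two collecting forms can appear together in one definition. The sketch below uses a hypothetical function, both, to show where each kind of argument value ends up; it is only an illustration and assumes nothing beyond the * and ** parameter forms described above.

```python
def both(required, *args, **kwargs):
    """Return the three places an argument value can land."""
    return required, args, kwargs

# 1 fills the ordinary parameter, 2 and 3 go to the tuple,
# and the keyword values go to the dict.
result = both(1, 2, 3, rate=60.0, time=0.75)

# With only the required value, the tuple and the dict are both empty.
minimal = both(1)
```

The ** parameter must come last, after the ordinary parameters and after any * parameter.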

16.10.3 Evaluation with a Container Instead of Individual Argument Values


When evaluating a function, we can provide a sequence instead of providing individual positional parameters. We do this with a special version of the * operator when evaluating a function. Here's an example of forcing a 3-tuple to be assigned to three positional parameters.

>>> def avg3( a, b, c ):
...     return (a+b+c)/3.0
...
>>> data= ( 4, 3, 2 )
>>> avg3( *data )
3.0

In this example, we told Python to assign each element of our 3-tuple, named data, to a separate parameter variable of the function avg3().


As with the * operator, we can use ** to make a dict become a series of keyword parameters to a function.

>>> d={ 'a':5, 'b':6, 'c':9 }
>>> avg3( **d )
6.666666666666667

In this example, we told Python to assign each element of the dict, d, to specific keyword parameters of our function.

We can mix and match this with ordinary parameter assignment, also. Here's an example.

>>> avg3( 2, b=3, **{'c':4} )
3.0

Here we've called our function with three argument values. The parameter a will get its value from a simple positional parameter. The parameter b will get its value from a keyword argument. The parameter c will get its value from having the dict {'c':4} turned into keyword parameter assignment.

We'll make more use of this in Inheritance.
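The following sketch gathers the three call forms in one place and adds one failure case: the container must supply exactly the values the function needs. The function avg3() is repeated from the text; the variable names are illustrative only.

```python
def avg3(a, b, c):
    return (a + b + c) / 3.0

data = (4, 3, 2)
from_tuple = avg3(*data)            # same as avg3(4, 3, 2)

named = {'a': 5, 'b': 6, 'c': 9}
from_dict = avg3(**named)           # same as avg3(a=5, b=6, c=9)

mixed = avg3(2, b=3, **{'c': 4})    # positional, keyword and dict together

# A 2-tuple cannot fill three parameters, so Python raises TypeError.
try:
    avg3(*(1, 2))
    short_tuple_ok = True
except TypeError:
    short_tuple_ok = False
```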


CHAPTER

SEVENTEEN

SETS
Many algorithms need to work with simple containers of data values, irrespective of order or any key. This is a simple set of objects, which is supported by the Python set container. We'll look at sets from a number of viewpoints: semantics, literal values, operations, comparison operators, statements, built-in functions and methods.

17.1 Set Semantics


A set is, perhaps, the simplest possible container, since it contains objects in no particular order with no particular identification. Objects stand for themselves. With a sequence, objects are identified by position. With a mapping, objects are identified by some key. With a set, objects stand for themselves.

Since each object stands for itself, elements of a set cannot be duplicated. A list or tuple, for example, can have any number of duplicate objects. For example, the tuple ( 1, 1, 2, 3 ) has four elements, which includes two copies of the integer 1; if we create a set from this tuple, the set will only have three elements.

A set has a large number of operations for unions, intersections, and differences. A common need is to examine a set to see if a particular object is a member of that set, or if one set is contained within another set.

A set is mutable, which means that it cannot be used as a key for a dict (see Mappings and Dictionaries for more information). In order to use a set as a dict key, we can create a frozenset, which is an immutable copy of a set. This allows us to accumulate a set of values to create a dict key.
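A brief sketch of these points: duplicates collapse when a set is built, a set is rejected as a dict key, and a frozenset is accepted. The variable names and the string "small numbers" are made up for illustration.

```python
values = (1, 1, 2, 3)
unique = set(values)               # the two copies of 1 become one element

# A mutable set is not hashable, so it cannot be a dict key.
try:
    {unique: "small numbers"}
    set_as_key_ok = True
except TypeError:
    set_as_key_ok = False

# A frozenset built from the same values works as a key, and any
# frozenset with equal contents finds the same entry.
lookup = {frozenset(unique): "small numbers"}
found = lookup[frozenset([3, 2, 1])]
```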

17.2 Set Literal Values


There are no literal values for set objects. A set value is created by using the set() or frozenset() factory functions. These can be applied to any iterable container, which includes any sequence, the keys of a dict, or even a file. We'll return to the general notion of iterable when we look at the yield statement in Iterators and Generators.

set(iterable)
    Transforms the given iterable (sequence, file, frozenset or set) into a set.

>>> set( ("hello", "world", "of", "words", "of", "world") )
set(['world', 'hello', 'words', 'of'])


Note that we provided a six-tuple sequence to the set() function, and we got a set with the four unique objects. The set is shown as a list literal, to remind us that a set is mutable.

You cannot provide a list of argument values to the set() function. You must provide an iterable object (usually a tuple). Trying set( "one", "two", "three" ) will result in a TypeError because you provided three arguments. You must provide a single argument which is iterable. All sequences are iterable, so a sequence literal is the easiest to provide.

frozenset(iterable)
    Transforms the given iterable (sequence, file or set) into an immutable frozenset.
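The following sketch contrasts the accepted and rejected ways of calling set(), and shows that a single string is itself an iterable of characters. The variable names are illustrative.

```python
# One iterable argument: a tuple of three strings.
words = set(("one", "two", "three"))

# Three separate arguments are rejected with TypeError.
try:
    set("one", "two", "three")
    three_args_ok = True
except TypeError:
    three_args_ok = False

# A string is a sequence of characters, so this is a set of letters,
# not a set containing one word.
letters = set("hello")

# frozenset() accepts the same kind of argument but has no mutators.
frozen = frozenset(words)
frozen_has_add = hasattr(frozen, "add")
```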

17.3 Set Operations


There are a large number of set operations, including union (|), intersection (&), difference (-), and symmetric difference (^). These are unusual operations, so we'll look at them in some detail. In addition to the operator notation, there are also method functions which do the same things. We'll look at the method function versions below.

We'll use the following two set objects to show these operators.

>>> fib=set( (1,1,2,3,5,8,13) )
>>> prime=set( (2,3,5,7,11,13) )

Union. |. The resulting set has elements from both source sets. An element is in the result if it is in one set or the other.

>>> fib | prime
set([1, 2, 3, 5, 7, 8, 11, 13])


$S_1 \cup S_2 = \{ e \mid e \in S_1 \text{ or } e \in S_2 \}$

Intersection. &. The resulting set has elements that are common to both source sets. An element is in the result if it is in one set and the other.

>>> fib & prime
set([2, 3, 5, 13])

$S_1 \cap S_2 = \{ e \mid e \in S_1 \text{ and } e \in S_2 \}$

Difference. -. The resulting set has elements of the left-hand set with all elements from the right-hand set removed. An element will be in the result if it is in the left-hand set and not in the right-hand set.

>>> fib - prime
set([8, 1])
>>> prime - fib
set([11, 7])

$S_1 \setminus S_2 = \{ e \mid e \in S_1 \text{ and } e \notin S_2 \}$

$S_2 \setminus S_1 = \{ e \mid e \notin S_1 \text{ and } e \in S_2 \}$

Symmetric Difference. ^. The resulting set has elements which are unique to each set. An element will be in the result set if either it is in the left-hand set and not in the right-hand set or it is in the right-hand set and not in the left-hand set. Whew!

>>> fib ^ prime
set([1, 7, 8, 11])

$S_1 \triangle S_2 = \{ e \mid e \in S_1 \text{ xor } e \in S_2 \}$
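The four operators are related to each other. As a quick check of the definitions, this sketch rebuilds the symmetric difference of the fib and prime sets from the other three operators in two different ways.

```python
fib = set((1, 1, 2, 3, 5, 8, 13))
prime = set((2, 3, 5, 7, 11, 13))

sym = fib ^ prime

# Everything in either set, minus what the two sets share.
via_union = (fib | prime) - (fib & prime)

# What is only in fib, together with what is only in prime.
via_differences = (fib - prime) | (prime - fib)
```

Unlike difference, the symmetric difference gives the same result whichever set is written on the left.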

17.4 Set Comparison Operators


There are a number of set comparisons. All of the standard comparisons (<, <=, >, >=, ==, !=, in, not in) work with sets, but the interpretation of the operators is based on set theory. The various operations from set theory are the subset and proper subset relationships. The mathematical comparison operations of $\subset$, $\subseteq$, $\supset$, $\supseteq$ are implemented by <, <=, >, >=.

In the following example, the set craps is all of the ways we can roll craps on a come out roll. Also, we've defined three to hold both of the dice rolls that total 3. When we compare three with craps, we see the expected relationships: three is a subset of craps as well as a proper subset of craps.

>>> craps= set( [ (1,1), (2,1), (1,2), (6,6) ] )
>>> three = set( [ (1,2), (2,1) ] )
>>> three < craps
True
>>> three <= craps
True

The in and not in operators implement the $\in$ and $\notin$ relationships.

In the following example, the set craps is all of the ways we can roll craps on a come out roll. We've modeled a throw of the dice as a 2-tuple. We can now test a specific throw to see if it is craps.

>>> craps= set( [ (1,1), (2,1), (1,2), (6,6) ] )
>>> (1,2) in craps
True
>>> (3,4) in craps
False
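One consequence of the set-theory interpretation is easy to miss: a set is a subset of itself but not a proper subset, and two sets can be unrelated, so that <, > and == are all false at once. The sketch below shows both cases; the sets a and b are invented for the illustration.

```python
craps = set([(1, 1), (2, 1), (1, 2), (6, 6)])
three = set([(1, 2), (2, 1)])

proper_subset = three < craps          # True: smaller and contained
self_subset = craps <= craps           # True: every set contains itself
self_proper = craps < craps            # False: not strictly smaller

# Two sets with nothing in common are not ordered at all.
a = set([1])
b = set([2])
unrelated = (a < b, a > b, a == b)     # all three comparisons are False
```

This is why the subset comparisons cannot be used to put an arbitrary collection of sets in order.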

17.5 Set Statements


The for statement works directly with set objects, because they are iterable. A set is not a sequence, but it is like a sequence because we can iterate through the elements using a for statement.

Here we create three set objects: even, odd, and zero to reflect some standard outcomes in Roulette. The union of all three sets is the complete set of possible spins. We can iterate through this resulting set.

>>> even= set( range(2,38,2) )
>>> odd= set( range(1,37,2) )
>>> zero= set( (0,'00') )
>>> for n in even|odd|zero:
...     print n

17.6 Set Built-in Functions


A number of built-in functions create or process set objects. The set() and frozenset() functions were described above, under Set Literal Values.

The following functions apply to sets, but are defined elsewhere.


len(). For sets, this function returns the number of items.

>>> len( set( [1,1,2,3] ) )
3
>>> len( set() )
0

Note that sets do not include duplicates; that's why the length in the first example is not 4.

max(). For sets, this function returns the maximum item.

>>> max( set( [1,1,2,3,5,8] ) )
8

min(). For sets, this function returns the minimum item.

sum(). For sets, this function sums the items.

>>> sum( set( [1,1,2,3,5,8] ) )
19

Note that sets do not include duplicates; that's why the sum is not 20.

any(). For sets, return True if there exists any item which is True.

>>> set( [0, None, False] )
set([0, None])
>>> any( _ )
False
>>> any( set( [0,None,False,42] ) )
True

Note that False and 0 have the same value when constructing a set, and are duplicates.

all(). For sets, return True if all items are True.

>>> all( set( [0,None,False,42] ) )
False
>>> all( set( [1,True] ) )
True

enumerate(). Iterate through the set returning 2-tuples of ( index, item ). Since sets have no explicit ordering to their items, this enumeration is in an arbitrary order.

sorted(). Iterate through the set elements in sorted order. This returns a list of elements.

>>> sorted( set( [1,1,2,3,5,8] ) )
[1, 2, 3, 5, 8]

17.7 Set Methods


A set object has a number of member methods.

The following mutators update a set object. Note that most of these methods don't return a value. The exception is pop().


clear()
    Remove all items from the set.

pop()
    Remove an arbitrary object from the set, returning the object. If the set was already empty, this will raise a KeyError exception.

add(new)
    Adds element new to the set. If the object is already in the set, nothing happens.

remove(old)
    Removes element old from the set. If the object old is not in the set, this will raise a KeyError exception.

discard(old)
    Removes element old from the set. Unlike remove(), this does not raise an exception if old is not in the set.

update(new)
    Merge values from the new set into the original set, adding elements as needed. It is equivalent to the following Python statement: s |= new.

intersection_update(new)
    Update set to have the intersection of set and new. In effect, this discards elements from set, keeping only elements which are common to new and set. It is equivalent to the following Python statement: s &= new.

difference_update(new)
    Update set to have the difference between set and new. In effect, this discards elements from set which are also in new. It is equivalent to the following Python statement: s -= new.

symmetric_difference_update(new)
    Update set to have the symmetric difference between set and new. In effect, this both discards elements from s which are common with new and also inserts elements into s which are unique to new. It is equivalent to the following Python statement: s ^= new.

The following transformers build a new object from one or more sets.

copy()
    Copy the set to make a new set. This is a shallow copy. All objects in the new set are references to the same objects as the original set.

union(new)
    If new is a proper set, return set | new. If new is a sequence or other iterable, make a new set from the value of new, then return the union, set | new. This does not update the original set.

>>> prime.union( (1, 2, 3, 4, 5) )
set([1, 2, 3, 4, 5, 7, 11, 13])

intersection(new)
    If new is a proper set, return set & new. If new is a sequence or other iterable, make a new set from the value of new, then return the intersection, set & new. This does not update set.

difference(new)
    If new is a proper set, return set - new. If new is a sequence or other iterable, make a new set from the value of new, then return the difference, set - new. This does not update set.


symmetric_difference(new)
    If new is a proper set, return s ^ new. If new is a sequence or other iterable, make a new set from the value of new, then return the symmetric difference, s ^ new. This does not update s.

The following accessor methods provide information about a set.

issubset(other)
    If set is a subset of other, return True, otherwise return False. Essentially, this is set <= other.

issuperset(other)
    If set is a superset of other, return True, otherwise return False. Essentially, this is set >= other.
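Two differences between similar-looking operations are worth seeing side by side, as the set type behaves in Python 2.6 and later: remove() versus discard() on a missing element, and the method form versus the operator form when the other operand is not a set. The values here are arbitrary.

```python
s = set([1, 2, 3])

# discard() ignores a missing element; remove() raises KeyError.
s.discard(99)
try:
    s.remove(99)
    remove_raised = False
except KeyError:
    remove_raised = True

# The method form accepts any iterable; the operator form needs a set.
merged = s.union([3, 4])
try:
    s | [3, 4]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False

# Transformers leave the original alone; mutators change it in place.
before_update = set(s)
s.update([4])
```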

17.8 Using Sets as Function Parameter Defaults


It's very, very important to note that default values should be immutable objects. Recall that numbers, strings, None, and tuple objects are immutable. Sets, as well as dictionaries and lists, are mutable. Python will accept a mutable object as a default value, but the default object is created once, when the function is defined, and is then shared by every evaluation of the function. Consider the following example of what not to do.

>>> def default2( someSet=set() ):
...     someSet.add(2)
...     return someSet
...
>>> looks_good= set()
>>> default2( looks_good )
set([2])
>>> looks_good
set([2])
>>>
>>> not_good= default2()
>>> not_good
set([2])
>>> worse= default2()
>>> worse
set([2])
>>>
>>> not_good.add(3)
>>> not_good
set([2, 3])
>>> worse
set([2, 3])

1. We defined a function which has a default value that's a mutable object. This is simply a bad programming practice in Python.

2. We used this function with a set object, looks_good. The function updated the set object as expected.

3. We used the function's default value to create not_good. The function inserted a value into an empty set and returned this new set object. It turns out that the function updated the mutable default value, also.

4. When we use the function's default value again, with worse, the function uses the updated default value and updates it again.


Both not_good and worse are references to the same mutable object that is being updated. To avoid this, do not use mutable values as defaults. Do this instead.
def default2( someSet=None ):
    if someSet is None:
        someSet= set()
    someSet.add( 2 )
    return someSet

This creates a fresh new mutable object as needed.
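To confirm that the None idiom removes the sharing, the following sketch repeats the earlier experiment against a corrected function. It assumes the empty container is created with set(), since the add() method belongs to sets; the variable names first, second and mine are illustrative.

```python
def default2(some_set=None):
    # A new, empty set is built on every call that omits the argument.
    if some_set is None:
        some_set = set()
    some_set.add(2)
    return some_set

first = default2()
second = default2()
first.add(3)            # changes first only

# A set supplied by the caller is still updated in place.
mine = set([7])
returned = default2(mine)
```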

17.9 Set Exercises


1. Dice Rolls. In Craps, each roll of the dice belongs to one of several sets of rolls that are used to resolve bets. There are only 36 possible dice rolls, but it's annoying to define the various sets manually. Here's a multi-step procedure that produces the various sets of dice rolls around which you can define the game of craps.

First, create a sequence with 13 empty sets, call it dice. Something like [ set() ]*13 doesn't work because it makes 13 references to a single set object. You'll need to use a for statement to evaluate the set constructor function 13 different times. What is the first index of this sequence? What is the last entry in this sequence?

Second, write two nested for-loops to iterate through all 36 combinations of dice, creating 2-tuples. The 36 2-tuples will begin with (1,1) and end with (6,6). The sum of the two elements is an index into dice. We want to add each 2-tuple to the appropriate set in the dice sequence.

When you're done, you should see results like the following:

>>> dice[7]
set([(5, 2), (6, 1), (1, 6), (4, 3), (2, 5), (3, 4)])

Now you can define the various rules as sets built from other sets.

lose
    On the first roll, you lose if you roll 2, 3 or 12. This is the set dice[2] | dice[3] | dice[12]. The game is over.

win
    On the first roll, you win if you roll 7 or 11. The game is over. This is dice[7] | dice[11].

point
    On the first roll, any other result (4, 5, 6, 8, 9, or 10) establishes a point. The game runs until you roll the point or a seven.

craps
    Once a point is established, you win if you roll the point's number. You lose if you roll a 7.

Once you have these three sets defined, you can simulate the first roll of a craps game with a relatively elegant-looking program. You can generate two random numbers to create a 2-tuple. You can then check to see if the 2-tuple is in the lose or win sets.

If the come-out roll is in the point set, then the sum of the 2-tuple will let you pick a set from the dice sequence. For example, if the come-out roll is (2,2), the sum is 4, and you'd assign dice[4] to the variable point; this is the set of winners for the rest of the game. The set of losers for the rest of the game is always the craps set.


The rest of the game is a simple loop, like the come-out roll loop, which uses two random numbers to create a 2-tuple. If the number is in the point set, the game is a winner. If the number is in the craps set, the game is a loser, otherwise it continues.

2. Roulette Results. In Roulette, each spin of the wheel has a number of attributes like even-ness, low-ness, red-ness, etc. You can bet on any of these attributes. If the attribute on which you placed a bet is in the set of attributes for the number, you win.

We'll look at a few simple attributes: red-black, even-odd, and high-low. The even-odd and high-low attributes are easy to compute. The red-black attribute is based on a fixed set of values.
redNumbers= set( [1,3,5,7,9,12,14,16,18,19,21,23,25,27,30,32,34,36] )

We have to distinguish between 0 and 00, which makes some of this decision-making rather complex. We can, for example, use ordinary integers for the numbers 0 to 36, and append the string '00' to this set of numbers. For example, set( range(37) ) | set( ['00'] ). This set is the entire Roulette wheel; we can call it wheel.

We can define a number of sets that stand for bets: red, black, even, odd, high and low. We can iterate through the values of wheel, and decide which sets that value belongs to.

If the spin is non-zero and spin % 2 == 0, add the spin to the even set.
If the spin is non-zero and spin % 2 != 0, add the spin to the odd set.
If the spin is non-zero and it's in the redNumbers set, add the spin to the red set.
If the spin is non-zero and it's not in the redNumbers set, add the value to the black set.
If the spin is non-zero and spin <= 18, add the value to the low set.
If the spin is non-zero and spin > 18, add the value to the high set.

Once you have these six sets defined, you can use them to simulate Roulette. Each round involves picking a random spin with something like random.choice( list(wheel) ). You can then see which set the spin belongs to. If the spin belongs to a set on which you've bet, the spin is a winner, otherwise it's a loser.

These six sets all pay 2:1. There are some sets which pay 3:1, including the 1-12, 13-24, 25-36 ranges, as well as the three columns, spin % 3 == 0, spin % 3 == 1 and spin % 3 == 2. There are still more bets on the Roulette table, but the sets of spins for those bets are rather complex to define.

3. Sieve of Eratosthenes. Look at Sieve of Eratosthenes. We created a list of candidate prime numbers, using a sequence with 5000 boolean flags. We can, without too much work, simplify this to use a set instead of a list.

Sieve of Eratosthenes - Set Version


(a) Initialize.
    Create a set, prime, which has the integers between 2 and 5000.
    Set p ← 2.

(b) Iterate. While 2 ≤ p < 5000:
    Find Next Prime. While p is not in prime and 2 ≤ p < 5000:
        Increment p by 1.
    Remove Multiples. At this point, p is prime.
        Set k ← p + p.
        While k < 5000:
            Remove k from the set prime.
            Set k ← k + p.
    Next p. Increment p by 1.

(c) Report. At this point, the set prime has the prime numbers. We can return the set.

In the Find Next Prime step, you're really looking for the minimum in the prime set which is greater than or equal to p.

In the Remove Multiples step, you can create the set of multiples, and use difference_update() to remove the multiples from prime. You can, also, use the range() function to create multiples of p, and create a set from this sequence of multiples.


CHAPTER

EIGHTEEN

EXCEPTIONS
The try, except, finally and raise statements

A well-written program should produce valuable results even when exceptional conditions occur. A program depends on numerous resources: memory, files, other packages, input-output devices, to name a few. Sometimes it is best to treat a problem with any of these resources as an exception, which interrupts the normal sequential flow of the program.

In Exception Semantics we introduce the semantics of exceptions. We'll show the basic exception-handling features of Python in Basic Exception Handling and the way exceptions are raised by a program in Raising Exceptions.

We'll look at a detailed example in An Exceptional Example. In Complete Exception Handling and The finally Clause, we cover some additional syntax that's sometimes necessary. In Exception Functions, we'll look at a few standard library functions that apply to exceptions.

We describe most of the built-in exceptions in Built-in Exceptions. In addition to exercises in Exception Exercises, we also include style notes in Style Notes and a digression on problems that can be caused by poor use of exceptions in A Digression.

18.1 Exception Semantics


An exception is an event that interrupts the ordinary sequential processing of a program. When an exception is raised, Python will handle it immediately. Python does this by examining except clauses associated with try statements to locate a suite of statements that can process the exception. If there is no except clause to handle the exception, the program stops running, and a message is displayed on the standard error file.

An exception has two sides: the dynamic change to the sequence of execution and an object that contains information about the exceptional situation. The dynamic change is initiated by the raise statement, and can finish with the handlers that process the raised exception. If no handler matches the exception, the program's execution effectively stops at the point of the raise.

In addition to the dynamic side of an exception, an object is created by the raise statement; this is used to carry any information associated with the exception.

Consequences. The use of exceptions has two important consequences.

First, we need to clarify where exceptions can be raised. Since various places in a program will raise exceptions, and these can be hidden deep within a function or class, their presence should be announced by specifying the possible exceptions in the docstring.

Second, multiple parts of a program will have handlers to cope with various exceptions. These handlers should handle just the meaningful exceptions. Some exceptions (like RuntimeError or MemoryError) generally can't be handled within a program; when these exceptions are raised, the program is so badly broken that there is no real recovery.

Exceptions are a powerful tool for dealing with rare, atypical conditions. Generally, exceptions should be considered as different from the expected or ordinary conditions that a program handles. For example, if a program accepts input from a person, exception processing is not appropriate for validating their inputs. There's nothing rare or uncommon about a person making mistakes while attempting to enter numbers or dates. On the other hand, an unexpected disconnection from a network service is a good candidate for an exception; this is a rare and atypical situation. Examples of good exceptions are those which are raised in response to problems with physical resources like files and networks.

Python has a large number of built-in exceptions, and a programmer can create new exceptions. Generally, it is better to create new exceptions rather than attempt to stretch or bend the meaning of existing exceptions.
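As a minimal sketch of creating a new exception rather than bending an existing one, the following defines an application-specific class for the network-disconnection case mentioned above. The names ServiceDisconnected and fetch, and the message text, are hypothetical and exist only for this illustration.

```python
class ServiceDisconnected(Exception):
    """Hypothetical exception: the network service went away unexpectedly."""

def fetch(connected):
    """Return some data. Raises ServiceDisconnected."""
    if not connected:
        # The exception object carries the details to whoever handles it.
        raise ServiceDisconnected("lost connection to service")
    return "data"

try:
    fetch(False)
    message = None
except ServiceDisconnected as ex:
    message = str(ex)
```

A handler that names ServiceDisconnected will not accidentally catch an unrelated built-in exception, which is the main reason for defining a new class. Note that the docstring of fetch() announces the exception, as recommended above.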

18.2 Basic Exception Handling


Exception handling is done with the try statement. The try statement encapsulates several pieces of information. Primarily, it contains a suite of statements and a group of exception-handling clauses. Each exception-handling clause names a class of exceptions and provides a suite of statements to execute in response to that exception. The basic form of a try statement looks like this:
try:
    suite
except exception , target :
    suite
except:
    suite

Each suite is an indented block of statements. Any statement is allowed in the suite. While this means that you can have nested try statements, that is rarely necessary, since you can have an unlimited number of except clauses on a single try statement.

If any of the statements in the try suite raise an exception, each of the except clauses is examined to locate a clause that matches the exception raised. If no statement in the try suite raises an exception, the except clauses are silently ignored.

The first form of the except clause provides a specific exception class which is used for matching any exception which might be raised. If a target variable name is provided, this variable will have the exception object assigned to it.

The second form of the except clause is the catch-all version. This will match all exceptions. If used, this must be provided last, since it will always match the raised exception.

We'll look at the additional finally clause in a later section.

Important: Python 3

The except statement can't easily handle a list of exception classes. The Python 2 syntax for this is confusing because it requires some additional () around the list of exceptions.
except ( exception, ... ) , target :


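The Python 3 syntax is slightly clearer. The keyword as replaces the comma in front of the target, so the comma is only used to separate exception classes. The () around a list of exceptions are still required. This form is also accepted by Python 2.6.

except ( exception, ... ) as target :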
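Here is a small sketch of one except clause that matches several exception classes, written with the as keyword, which Python 2.6 and Python 3 both accept; the parentheses around the classes are kept. The function name safe_ratio is made up for the example.

```python
def safe_ratio(a, b):
    """Return a/b as a float, or the name of the exception that was raised."""
    try:
        return a / float(b)
    except (ZeroDivisionError, TypeError, ValueError) as ex:
        # ex is the exception object; its class tells us what went wrong.
        return type(ex).__name__

ok = safe_ratio(1, 4)
by_zero = safe_ratio(1, 0)          # division by 0.0
bad_text = safe_ratio(1, "x")       # float("x") cannot be converted
bad_type = safe_ratio(1, None)      # float(None) is not allowed
```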

Overall Processing. The structure of the complete try statement summarizes the philosophy of exceptions. First, try the suite of statements, expecting them to work. In the unlikely event that an exception is raised, find an exception clause and execute that exception clause suite to recover from or work around the exceptional situation.

Except clauses include some combination of error reporting, recovery or work-around. For example, a recovery-oriented except clause could delete useless files. A work-around exception clause could return a complex result for the square root of a negative number.

First Example. Here's the first of several related examples. This will handle two kinds of exceptions, ZeroDivisionError and TypeError.

exception1.py
def avg( someList ):
    """Raises TypeError or ZeroDivisionError exceptions."""
    sum= 0
    for v in someList:
        sum = sum + v
    return float(sum)/len(someList)

def avgReport( someList ):
    try:
        m= avg(someList)
        print "Average+15%=", m*1.15
    except TypeError, ex:
        print "TypeError:", ex
    except ZeroDivisionError, ex:
        print "ZeroDivisionError:", ex

This example shows the avgReport() function; it contains a try statement that evaluates the avg() function. We expect that there will be a ZeroDivisionError exception if an empty list is provided to avg(). Also, a TypeError exception will be raised if the list has any non-numeric value. Otherwise, it prints the average of the values in the list.

In the try suite, we print the average. For certain kinds of inappropriate input, we will print the exceptions which were raised.

This design is generally how exception processing is handled. We have a relatively simple, clear function which attempts to do the job in a simple and clear way. We have an application-specific process which handles exceptions in a way that's appropriate to the overall application.

Nested try Statements. In more complex programs, you may have many function definitions. If more than one function has a try statement, the nested function evaluations will effectively nest the try statements inside each other.

This example shows a function avgReport(), which calls another function, avg(). Both of these functions have a try statement. An exception raised inside avg() could wind up in an exception handler in avgReport().


exception2.py
def sum( someList ):
    """Raises TypeError"""
    sum= 0
    for v in someList:
        sum = sum + v
    return sum

def avg( someList ):
    """Raises TypeError or ZeroDivisionError exceptions."""
    try:
        s= sum(someList)
        return float(s)/len(someList)
    except TypeError, ex:
        return "Non-Numeric Data"

def avgReport( someList ):
    try:
        m= avg(someList)
        print "Average+15%=", m*1.15
    except TypeError, ex:
        print "TypeError: ", ex
    except ZeroDivisionError, ex:
        print "ZeroDivisionError: ", ex

In this example, we have the same avgReport() function, which uses avg() to compute an average of a list. We've rewritten the avg() function to depend on a sum() function. Both avgReport() and avg() contain try statements. This creates a nested context for evaluation of exceptions.

Specifically, when the function sum() is being evaluated, an exception will be examined by avg() first, then examined by avgReport(). For example, if sum() raises a TypeError exception, it will be handled by avg(); the avgReport() function will not see the TypeError exception.

Function Design. Note that this example has a subtle bug that illustrates an important point regarding function design. We introduced the bug when we defined avg() to return either an answer or an error status code in the form of a string. When avg() returns the string "Non-Numeric Data", the expression m*1.15 in avgReport() fails with a TypeError of its own, one that describes the multiplication rather than the bad input. Generally, things are more complex when we try to mix return of valid results and return of error codes.

Status codes are the only way to report errors in languages that lack exceptions. C, for example, makes heavy use of status codes. The POSIX standard API definitions for operating system services are oriented toward C. A program making OS requests must examine the results to see if they are proper values or an indication that an error occurred. Python, however, doesn't have this limitation. Consequently many of the OS functions available in Python modules will raise exceptions rather than mix proper return values with status code values.

In our case, our design for avg() attempts to return either a valid numeric result or a string result. To be correct we would have to do two kinds of error checking in avgReport(). We would have to handle any exceptions and we would also have to examine the results of avg() to see if they are an error value or a proper answer.

Rather than return status codes, a better design is to simply use exceptions for all kinds of errors. Status codes have no real purpose in well-designed programs.
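As a minimal sketch of the exception-only design, the version below drops the status string from avg() and leaves all handling to the reporting function. The names total() and avg_report() are illustrative renamings (total() avoids shadowing the built-in sum()), and the report is returned as a string rather than printed; neither choice comes from the original listing. It uses the except ... as form, which Python 2.6 accepts, so the sketch also runs under Python 3.

```python
def total(some_list):
    """Raises TypeError for non-numeric items."""
    result = 0
    for v in some_list:
        result = result + v
    return result

def avg(some_list):
    """Raises TypeError or ZeroDivisionError; never returns a status string."""
    return float(total(some_list)) / len(some_list)

def avg_report(some_list):
    """Build a one-line report; all error handling lives here."""
    try:
        m = avg(some_list)
        return "Average+15%%= %s" % (m * 1.15,)
    except TypeError as ex:
        return "TypeError: %s" % (ex,)
    except ZeroDivisionError as ex:
        return "ZeroDivisionError: %s" % (ex,)

print(avg_report([10, 20, 30]))    # normal case
print(avg_report([]))              # empty list: ZeroDivisionError
print(avg_report([1, "two", 3]))   # bad data: TypeError raised in total()
```

With this arrangement the TypeError that reaches avg_report() is the one raised by the bad addition, so the message describes the actual problem in the data.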
In the next section, we'll look at how to define and raise our own exceptions.


18.3 Raising Exceptions


The raise statement does two things: it creates an exception object, and immediately leaves the expected program execution sequence to search the enclosing try statements for a matching except clause. The effect of a raise statement is to either divert execution to a matching except suite, or to stop the program because no matching except suite was found to handle the exception.

The Exception object created by raise can contain a message string that provides a meaningful error message. In addition to the string, it is relatively simple to attach additional attributes to the exception.

Here are the two forms for the raise statement.
raise exceptionClass , value
raise exception

The first form of the raise statement uses an exception class name. The optional parameter is the additional value that will be contained in the exception. Generally, this is a string with a message; however, any object can be provided. Here's an example of the raise statement.
raise ValueError, "oh dear me"

This statement raises the built-in exception ValueError with an amplifying string of "oh dear me". The amplifying string in this example, one might argue, is of no use to anybody. This is an important consideration in exception design. When using a built-in exception, be sure that the arguments provided pinpoint the error condition. The second form of the raise statement uses an object constructor to create the Exception object.
raise ValueError( "oh dear me" )

Here's a variation on the second form in which additional attributes are provided for the exception.
ex= MyNewError( "oh dear me" )
ex.myCode= 42
ex.myType= "O+"
raise ex

In this case a handler can make use of the message, as well as the two additional attributes, myCode and myType.

Defining Your Own Exception. You will rarely have a need to raise a built-in exception. Most often, you will need to define an exception which is unique to your application.

We'll cover this in more detail as part of the object oriented programming features of Python, in Classes. Here's the short version of how to create your own unique exception class.
class MyError( Exception ): pass

This single statement defines a subclass of Exception named MyError. You can then raise MyError in a raise statement and check for MyError in except clauses.

Here's an example of defining a unique exception and raising this exception with an amplifying string.


quadratic.py
import math

class QuadError( Exception ): pass

def quad(a,b,c):
    if a == 0:
        ex= QuadError( "Not Quadratic" )
        ex.coef= ( a, b, c )
        raise ex
    if b*b-4*a*c < 0:
        ex= QuadError( "No Real Roots" )
        ex.coef= ( a, b, c )
        raise ex
    x1= (-b+math.sqrt(b*b-4*a*c))/(2*a)
    x2= (-b-math.sqrt(b*b-4*a*c))/(2*a)
    return (x1,x2)
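A handler for QuadError can read both the message and the attached coef attribute. The sketch below repeats the quad() logic so that it stands alone, and adds a helper named describe(), which is a hypothetical name introduced only for illustration. It is written to run under Python 2.6 and Python 3.

```python
import math

class QuadError(Exception):
    pass

def quad(a, b, c):
    """Same logic as quadratic.py, repeated so the sketch is self-contained."""
    if a == 0:
        ex = QuadError("Not Quadratic")
        ex.coef = (a, b, c)
        raise ex
    disc = b * b - 4 * a * c
    if disc < 0:
        ex = QuadError("No Real Roots")
        ex.coef = (a, b, c)
        raise ex
    root = math.sqrt(disc)
    return ((-b + root) / (2.0 * a), (-b - root) / (2.0 * a))

def describe(a, b, c):
    """Hypothetical helper: report the roots or explain the failure."""
    try:
        return "roots: %s" % (quad(a, b, c),)
    except QuadError as ex:
        # The handler uses the message and the extra attribute together.
        return "%s for coefficients %s" % (ex, ex.coef)

print(describe(1, -3, 2))   # roots: (2.0, 1.0)
print(describe(0, 2, 1))    # Not Quadratic for coefficients (0, 2, 1)
print(describe(2, 2, 1))    # No Real Roots for coefficients (2, 2, 1)
```

Because both failure cases share one exception class, a single except clause covers them, and the message distinguishes the two.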

Additional raise Statements. Exceptions can be raised anywhere, including in an except clause of a try statement. We'll look at two examples of re-raising an exception.

We can use the simple raise statement in an except clause. This re-raises the original exception. We can use this to do standardized error handling. For example, we might write an error message to a log file, or we might have a standardized exception clean-up process.
try:
    attempt something risky
except Exception, ex:
    log_the_error( ex )
    raise

This shows how we might write the exception to a standard log in the function log_the_error() and then re-raise the original exception. This allows the overall application to choose whether to stop running gracefully or handle the exception.

The other common technique is to transform Python errors into our application's unique errors. Here's an example that does some local clean-up and transforms the built-in FloatingPointError into our application-specific error, MyError.
class MyError( Exception ): pass

try:
    attempt something risky
except FloatingPointError, e:
    do something locally, perhaps to clean up
    raise MyError("something risky failed: %s" % ( e, ) )

This allows us to have more consistent error messages, or to hide implementation details.
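Both techniques can be combined in one small, runnable sketch. Everything here apart from MyError and log_the_error() is an assumption made for illustration: the list error_log stands in for a real log file, the functions risky_divide() and app_divide() are hypothetical, and ZeroDivisionError is used in place of FloatingPointError because it is easy to trigger. It runs under Python 2.6 and Python 3.

```python
class MyError(Exception):
    pass

error_log = []  # stand-in for a real log file

def log_the_error(ex):
    error_log.append("%s: %s" % (type(ex).__name__, ex))

def risky_divide(a, b):
    """Log any failure, then re-raise the original exception unchanged."""
    try:
        return a / b
    except Exception as ex:
        log_the_error(ex)
        raise

def app_divide(a, b):
    """Hide the built-in error behind the application's own MyError."""
    try:
        return risky_divide(a, b)
    except ZeroDivisionError as e:
        raise MyError("something risky failed: %s" % (e,))

try:
    app_divide(1, 0)
except MyError as ex:
    print("Caught MyError: %s" % (ex,))
print(error_log)
```

The bare raise in risky_divide() leaves the exception's type untouched, which is why the outer function can still match it as ZeroDivisionError before converting it.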

18.4 An Exceptional Example


The following example uses a uniquely named exception to indicate that the user wishes to quit rather than supply input. We'll define our own exception, and define a function which rewrites a built-in exception to be our own exception.


We'll define a function, ckyorn(), which does a "Check for Y or N". This function has two parameters, prompt and help, that are used to prompt the user and print help if the user requests it. In this case, the return value is always a "Y" or "N". A request for help ("?") is handled automatically. A request to quit is treated as an exception, and leaves the normal execution flow. This function will accept "Q" or end-of-file (usually ctrl-D, but also ctrl-Z on Windows) as the quit signal.

interaction.py
class UserQuit( Exception ): pass

def ckyorn( prompt, help="" ):
    ok= 0
    while not ok:
        try:
            a=raw_input( prompt + " [y,n,q,?]: " )
        except EOFError:
            raise UserQuit
        if a.upper() in [ 'Y', 'N', 'YES', 'NO' ]: ok= 1
        if a.upper() in [ 'Q', 'QUIT' ]: raise UserQuit
        if a.upper() in [ '?' ]: print help
    return a.upper()[0]

We can use this function as shown in the following example.


import interaction
answer= interaction.ckyorn(
    help= "Enter Y if finished entering data",
    prompt= "All done?")

This function transforms an EOFError into a UserQuit exception, and also transforms a user entry of "Q" or "q" into this same exception. In a longer program, this exception permits a short-circuit of all further processing, omitting some potentially complex if statements.

Details of the ckyorn() Function. Our function uses a loop that will terminate when we have successfully interpreted an answer from the user. We may get a request for help or perhaps some uninterpretable input from the user. We will continue our loop until we get something meaningful. The post condition will be that the variable ok is set to 1 (true) and the answer, a, is one of Y, N, YES or NO, in upper or lower case.

Within the loop, we surround our raw_input() function with a try suite. This allows us to process any kind of input, including user inputs that raise exceptions. The most common example is the user entering the end-of-file character on their keyboard.

We handle the built-in EOFError by raising our UserQuit exception. When we get end-of-file from the user, we need to tidy up and exit the program promptly.

If no exception was raised, we examine the input to see if we can interpret it. Note that if the user enters Q or QUIT, we treat this exactly like an end-of-file; we raise the UserQuit exception so that the program can tidy up and exit quickly.

We return a single-character result only for ordinary, valid user inputs. A user request to quit is considered extraordinary, and we raise an exception for that.
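The short-circuit effect can be sketched without a keyboard by passing the input function in as a parameter. The read parameter, the scripted() helper and the session() function are all assumptions added for this illustration and are not part of interaction.py. The sketch runs under Python 2.6 and Python 3.

```python
class UserQuit(Exception):
    pass

def ckyorn(prompt, help, read):
    """Variant of ckyorn() that takes its input function as a parameter."""
    while True:
        try:
            a = read(prompt + " [y,n,q,?]: ").upper()
        except EOFError:
            raise UserQuit
        if a in ("Y", "N", "YES", "NO"):
            return a[0]
        if a in ("Q", "QUIT"):
            raise UserQuit
        if a == "?":
            print(help)

def scripted(answers):
    """Hypothetical stand-in for raw_input(): replays canned answers."""
    pending = list(answers)
    def read(prompt):
        if not pending:
            raise EOFError  # running out of answers acts like end-of-file
        return pending.pop(0)
    return read

def session(read):
    """Ask three questions; a quit at any point skips the rest."""
    replies = []
    try:
        for question in ("Save?", "Print?", "All done?"):
            replies.append(ckyorn(question, "Enter Y or N", read))
    except UserQuit:
        replies.append("quit")
    return replies

print(session(scripted(["yes", "?", "n", "q"])))   # ['Y', 'N', 'quit']
print(session(scripted(["y"])))                    # ['Y', 'quit']
```

The single except UserQuit clause in session() replaces a quit check after every question, which is the simplification the text describes.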


18.5 Complete Exception Handling and The finally Clause


A common use case is to have some final processing that must occur irrespective of any exceptions that may arise. The situation usually arises when an external resource has been acquired and must be released. For example, a file must be closed, irrespective of any errors that occur while attempting to read it.

With some care, we can be sure that all exception clauses do the correct final processing. However, this may lead to some redundant programming. The finally clause saves us the effort of trying to carefully repeat the same statement(s) in a number of except clauses. This final step will be performed before the try block is finished, either normally or by any exception.

The complete form of a try statement looks like this:
try:
    suite
except exception , target :
    suite
except:
    suite
finally:
    suite

Each suite is an indented block of statements. Any statement is allowed in the suite. While this means that you can have nested try statements, that is rarely necessary, since you can have an unlimited number of except clauses.

The finally clause is always executed. This includes all three possible cases: if the try block finishes with no exceptions; if an exception is raised and handled; and if an exception is raised but not handled. This last case means that every nested try statement with a finally clause will have that finally clause executed.

Use a finally clause to close files, release locks, close database connections, write final log messages, and other kinds of final operations. In the following example, we use the finally clause to write a final log message.
def avgReport( someList ):
    try:
        print "Start avgReport"
        m= avg(someList)
        print "Average+15%=", m*1.15
    except TypeError, ex:
        print "TypeError: ", ex
    except ZeroDivisionError, ex:
        print "ZeroDivisionError: ", ex
    finally:
        print "Finish avgReport"
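The three cases for the finally clause (no exception, a handled exception, an unhandled exception) can be traced with a short sketch. The function names run(), trace() and the three small actions are hypothetical names used only for this illustration; the code runs under Python 2.6 and Python 3.

```python
def run(action, log):
    """Record which clauses execute for a given action."""
    try:
        log.append("try")
        action()
    except ZeroDivisionError:
        log.append("handled")
    finally:
        log.append("finally")

def ok():
    return 1

def div_by_zero():
    return 1 / 0

def bad_type():
    return 1 + "x"   # TypeError: run() has no handler for this

def trace(action):
    """Call run() and note whether an exception escaped from it."""
    log = []
    try:
        run(action, log)
    except TypeError:
        log.append("propagated")
    return log

print(trace(ok))            # ['try', 'finally']
print(trace(div_by_zero))   # ['try', 'handled', 'finally']
print(trace(bad_type))      # ['try', 'finally', 'propagated']
```

In the third case the finally suite runs before the TypeError reaches the caller, which is what makes it a safe place for releasing resources.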

18.6 Exception Functions


The sys module provides one function that provides the details of the exception that was raised. Programs with exception handling will occasionally use this function.


The sys.exc_info() function returns a 3-tuple with the exception, the exception's parameter, and a traceback object that pinpoints the line of Python that raised the exception. This can be used something like the following not-very-good example.

exception2.py
import sys
import math
a= 2
b= 2
c= 1
try:
    x1= (-b+math.sqrt(b*b-4*a*c))/(2*a)
    x2= (-b-math.sqrt(b*b-4*a*c))/(2*a)
    print x1, x2
except:
    e,p,t= sys.exc_info()
    print e,p

This uses multiple assignment to capture the three elements of the sys.exc_info() tuple: the exception itself in e, the parameter in p and a Python traceback object in t.

The catch-all exception handler in this example is a bad policy. It may catch exceptions which are better left uncaught. We'll look at these kinds of exceptions in Built-in Exceptions. For example, a RuntimeError is something you should not bother catching.
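A narrower variant, sketched under the assumption that only the ValueError from math.sqrt() is of interest, names the exception in the except clause and still uses sys.exc_info() to reach the traceback. The function real_root() is a hypothetical name for this illustration; traceback.format_exception() is part of the standard traceback module. The sketch runs under Python 2.6 and Python 3.

```python
import math
import sys
import traceback

def real_root(x):
    """Return sqrt(x), or a short report if math.sqrt() rejects x."""
    try:
        return math.sqrt(x)
    except ValueError:
        # e is the exception class, p the exception instance,
        # t the traceback object.
        e, p, t = sys.exc_info()
        last_line = traceback.format_exception(e, p, t)[-1].strip()
        return "%s at line %d: %s" % (e.__name__, t.tb_lineno, last_line)

print(real_root(4))
print(real_root(-4))
```

Any other kind of exception passes through this handler untouched, so problems unrelated to the calculation are not silently swallowed.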

18.7 Exception Attributes


Exceptions have one interesting attribute. In the following example, we'll assume we have an exception object named e. This would happen inside an except clause that looked like except SomeException, e:.

Traditionally, exceptions had a message attribute as well as an args attribute. These were used inconsistently.

When you create a new Exception instance, the argument values provided are loaded into the args attribute. If you provide a single value, this will also be available as message; this is a property name that references args[0].

Here's an example where we provided multiple values as part of our Exception.
>>> a=Exception(1,2,3)
>>> a.args
(1, 2, 3)
>>> a.message
__main__:1: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
''

Here's an example where we provided a single value as part of our Exception; in this case, the message attribute is made available.
>>> b=Exception("Oh dear")
>>> b.message
'Oh dear'


>>> b.args
('Oh dear',)
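Since message is deprecated as of Python 2.6, a handler that relies only on args is the more durable choice. The sketch below shows a handler unpacking several values from args; the RangeError class and the check() function are hypothetical names introduced for illustration. It runs under Python 2.6 and Python 3.

```python
class RangeError(Exception):
    """Hypothetical application exception carrying several values."""
    pass

def check(value, low, high):
    """Return value if it lies in low..high, otherwise raise RangeError."""
    if not (low <= value <= high):
        raise RangeError("out of range", value, low, high)
    return value

try:
    check(42, 0, 10)
except RangeError as e:
    # All constructor arguments are available, in order, in e.args.
    text, value, low, high = e.args
    print("%s: %d is not in %d..%d" % (text, value, low, high))
    # prints: out of range: 42 is not in 0..10
```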

18.8 Built-in Exceptions


The following exceptions are part of the Python environment. There are three broad categories of exceptions.

Non-error Exceptions. These are exceptions that define events and change the sequence of execution.

Run-time Errors. These exceptions can occur in the normal course of events, and indicate typical program problems.

Internal or Unrecoverable Errors. These exceptions occur when compiling the Python program or are part of the internals of the Python interpreter; there isn't much recovery possible, since it isn't clear that our program can even continue to operate. Problems with the Python source are rarely seen by application programs, since the program isn't actually running.

Here are the non-error exceptions. Generally, you will never have a handler for these, nor will you ever raise them with a raise statement.

exception StopIteration
    This is raised by an iterator when there is no next value. The for statement handles this to end an iteration loop cleanly.

exception GeneratorExit
    This is raised when a generator is closed by having the close() method evaluated.

exception KeyboardInterrupt
    This is raised when a user hits ctrl-C to send an interrupt signal to the Python interpreter. Generally, this is not caught in application programs because it's the only way to stop a program that is misbehaving.

exception SystemExit
    This exception is raised by the sys.exit() function. Generally, this is not caught in application programs; this is used to force a program to exit.

Here are the errors which can be meaningfully handled when a program runs.

exception AssertionError
    Assertion failed. See the assert statement for more information in The assert Statement.

exception AttributeError
    Attribute not found in an object.

exception EOFError
    Read beyond end of file.

exception FloatingPointError
    Floating point operation failed.

exception IOError
    I/O operation failed.

exception IndexError
    Sequence index out of range.

exception KeyError
    Mapping key not found.


exception OSError
    OS system call failed.

exception OverflowError
    Result too large to be represented.

exception TypeError
    Inappropriate argument type.

exception UnicodeError
    Unicode related error.

exception ValueError
    Inappropriate argument value (of correct type).

exception ZeroDivisionError
    Second argument to a division or modulo operation was zero.

The following errors indicate serious problems with the Python interpreter. Generally, you can't do anything if these errors should be raised.

exception MemoryError
    Out of memory.

exception RuntimeError
    Unspecified run-time error.

exception SystemError
    Internal error in the Python interpreter.

The following exceptions are more typically raised at compile time, or indicate an extremely serious error in the basic construction of the program. While these exceptional conditions are a necessary part of the Python implementation, there's little reason for a program to handle these errors.

exception ImportError
    Import can't find module, or can't find name in module.

exception IndentationError
    Improper indentation.

exception NameError
    Name not found globally.

exception NotImplementedError
    Method or function hasn't been implemented yet.

exception SyntaxError
    Invalid syntax.

exception TabError
    Improper mixture of spaces and tabs.

exception UnboundLocalError
    Local name referenced but not bound to a value.

The following exceptions are part of the implementation of exception objects. Normally, these never occur directly. These are generic categories of exceptions. When you use one of these names in an except clause, a number of more specialized exceptions will match these.

exception Exception
    Common base class for all user-defined exceptions.


exception StandardError Bas