Practical Programming in Tcl & Tk, Third Edition

By Brent B. Welch

Publisher: Prentice Hall PTR

Pub Date: November 10, 1999
ISBN: 0-13-022028-0
Pages: 832

Tcl/Tk 8.2 is the first scripting language that can handle enterprise-wide integration tasks
that encompass Windows, Solaris, Macintosh, and other key platforms. Now, in this fully
updated Third Edition, Tcl/Tk development team member and best-selling author Brent
Welch presents all you need to know to achieve powerful results with Tcl/Tk 8.2 and the
new Tcl Web Server.
Coverage includes:
Tcl's fundamental mechanisms and operating system interfaces
Basic and advanced coding techniques and tools, including the Tcl script library facility
Tk and X Windows, with detailed examples and sample widgets
The new, extensible Tcl Web Server
New Tcl internationalization features and thread support
New techniques for working with regular expressions and namespaces
You'll find extensive coverage of user interface development, as well as application
integration techniques that leverage Tcl/Tk's powerful cross-platform scripting
capabilities. Welch covers Tcl's extensive network support, as well as Safe Tcl, C
programming with the Tk toolkit, the Tcl compiler, and Tcl/Tk plug-ins for Netscape and
Internet Explorer. Whether you're a current Tcl/Tk programmer, or a developer searching
for a convenient, powerful multiplatform scripting language, Practical Programming in
Tcl and Tk, Third Edition delivers exactly what you're looking for.
"This is an excellent book, loaded with useful examples. Newcomers to Tk will find the
widget descriptions particularly helpful." - John Ousterhout, CEO and founder of Scriptics
Corporation and the creator of Tcl/Tk
"Brent Welch fills an important need for an introduction to Tcl/Tk with an applied focus
and with coverage of many of the useful extensions available . . . I recommend this book
to my new students . . . and I keep a copy handy for my own use." - Joseph A. Konstan,
Professor of Computer Science, University of Minnesota

Copyright
List of Examples
List of Tables
Preface
Why Tcl?
Tcl and Tk Versions
Who Should Read This Book
How to Read This Book
Other Tcl Books
On-line Examples
Ftp Archives
World Wide Web
Newsgroups
Typographic Conventions
Hot Tips
Book Organization
What's New in the Third Edition
First Edition Thanks
Second Edition Thanks
Third Edition Thanks
Contact the Author
Part I. Tcl Basics
Chapter 1. Tcl Fundamentals
Tcl Commands
Hello, World!
Variables
Command Substitution
Math Expressions
Backslash Substitution
Grouping with Braces and Double Quotes
Procedures
A Factorial Example
More about Variables
More about Math Expressions
Comments
Substitution and Grouping Summary
Fine Points
Reference
Chapter 2. Getting Started
The source Command
UNIX Tcl Scripts
Windows 95 Start Menu
The Macintosh and ResEdit
The console Command
Command-Line Arguments
Predefined Variables
Chapter 3. The Guestbook CGI Application
A Quick Introduction to HTML
CGI for Dynamic Pages
The guestbook.cgi Script
Defining Forms and Processing Form Data
The cgi.tcl Package
Next Steps
Chapter 4. String Processing in Tcl
The string Command
The append Command
The format Command
The scan Command
The binary Command
Related Chapters
Chapter 5. Tcl Lists
Tcl Lists
Constructing Lists
Getting List Elements: llength, lindex, and lrange
Modifying Lists: linsert and lreplace
Searching Lists: lsearch
Sorting Lists: lsort
The split Command
The join Command
Related Chapters
Chapter 6. Control Structure Commands
If Then Else
Switch
While
Foreach
For
Break and Continue
Catch
Error
Return
Chapter 7. Procedures and Scope
The proc Command
Changing Command Names with rename
Scope
The global Command
Call by Name Using upvar
Variable Aliases with upvar
Chapter 8. Tcl Arrays
Array Syntax
The array Command
Building Data Structures with Arrays
Chapter 9. Working with Files and Programs
Running Programs with exec
The file Command
Cross-Platform File Naming
Manipulating Files and Directories
File Attributes
Input/Output Command Summary
Opening Files for I/O
Reading and Writing
The Current Directory: cd and pwd
Matching File Names with glob
The exit and pid Commands
Environment Variables
The registry Command
Part II. Advanced Tcl
Chapter 10. Quoting Issues and Eval
Constructing Code with the list Command
Exploiting the concat inside eval
The uplevel Command
The subst Command
Chapter 11. Regular Expressions
When to Use Regular Expressions
Regular Expression Syntax
Advanced Regular Expressions
Syntax Summary
The regexp Command
The regsub Command
Transforming Data to Program with regsub
Other Commands That Use Regular Expressions
Chapter 12. Script Libraries and Packages
Locating Packages: The auto_path Variable
Using Packages
Summary of Package Loading
The package Command
Libraries Based on the tclIndex File
The unknown Command
Interactive Conveniences
Tcl Shell Library Environment
Coding Style
Chapter 13. Reflection and Debugging
The clock Command
The info Command
Cross-Platform Support
Tracing Variable Values
Interactive Command History
Debugging
Scriptics' TclPro
Other Tools
Performance Tuning
Chapter 14. Namespaces
Using Namespaces
Namespace Variables
Command Lookup
Nested Namespaces
Importing and Exporting Procedures
Callbacks and Namespaces
Introspection
The namespace Command
Converting Existing Packages to use Namespaces
[incr Tcl] Object System
Notes
Chapter 15. Internationalization
Character Sets and Encodings
Message Catalogs
Chapter 16. Event-Driven Programming
The Tcl Event Loop
The after Command
The fileevent Command
The vwait Command
The fconfigure Command
Chapter 17. Socket Programming
Client Sockets
Server Sockets
The Echo Service
Fetching a URL with HTTP
The http Package
Basic Authentication
Chapter 18. TclHttpd Web Server
Integrating TclHttpd with your Application
Domain Handlers
Application Direct URLs
Document Types
HTML + Tcl Templates
Form Handlers
Programming Reference
Standard Application-Direct URLs
The TclHttpd Distribution
Server Configuration
Chapter 19. Multiple Interpreters and Safe-Tcl
The interp Command
Creating Interpreters
Safe Interpreters
Command Aliases
Hidden Commands
Substitutions
I/O from Safe Interpreters
The Safe Base
Security Policies
Chapter 20. Safe-Tk and the Browser Plugin
Tk in Child Interpreters
The Browser Plugin
Security Policies and Browser Plugin
Configuring Security Policies
Part III. Tk Basics
Chapter 21. Tk Fundamentals
Hello, World! in Tk
Naming Tk Widgets
Configuring Tk Widgets
Tk Widget Attributes and the Resource Database
Summary of the Tk Commands
Chapter 22. Tk by Example
ExecLog
The Example Browser
A Tcl Shell
Chapter 23. The Pack Geometry Manager
Packing toward a Side
Horizontal and Vertical Stacking
The Cavity Model
Packing Space and Display Space
Resizing and -expand
Anchoring
Packing Order
Choosing the Parent for Packing
Unpacking a Widget
Packer Summary
Window Stacking Order
Chapter 24. The Grid Geometry Manager
A Basic Grid
Spanning Rows and Columns
Row and Column Constraints
The grid Command
Chapter 25. The Place Geometry Manager
place Basics
The Pane Manager
The place Command
Chapter 26. Binding Commands to Events
The bind Command
The bindtags Command
Event Syntax
Modifiers
Event Sequences
Virtual Events
Event Keywords
Part IV. Tk Widgets
Chapter 27. Buttons and Menus
Button Commands and Scope Issues
Buttons Associated with Tcl Variables
Button Attributes
Button Operations
Menus and Menubuttons
Keyboard Traversal
Manipulating Menus and Menu Entries
Menu Attributes
A Menu by Name Package
Chapter 28. The Resource Database
An Introduction to Resources
Loading Option Database Files
Adding Individual Database Entries
Accessing the Database
User-Defined Buttons
User-Defined Menus
Chapter 29. Simple Tk Widgets
Frames and Toplevel Windows
The Label Widget
The Message Widget
The Scale Widget
The bell Command
Chapter 30. Scrollbars
Using Scrollbars
The Scrollbar Protocol
The Scrollbar Widget
Chapter 31. The Entry Widget
Using Entry Widgets
The Entry Widget
Chapter 32. The Listbox Widget
Using Listboxes
Listbox Bindings
Listbox Attributes
Chapter 33. The Text Widget
Text Indices
Text Marks
Text Tags
The Selection
Tag Bindings
Searching Text
Embedded Widgets
Embedded Images
Looking inside the Text Widget
Text Bindings
Text Operations
Text Attributes
Chapter 34. The Canvas Widget
Canvas Coordinates
Hello, World!
The Min Max Scale Example
Canvas Objects
Canvas Operations
Generating Postscript
Canvas Attributes
Hints
Part V. Tk Details
Chapter 35. Selections and the Clipboard
The Selection Model
The selection Command
The clipboard Command
Selection Handlers
Chapter 36. Focus, Grabs, and Dialogs
Standard Dialogs
Custom Dialogs
Animation with the update Command
Chapter 37. Tk Widget Attributes
Configuring Attributes
Size
Borders and Relief
The Focus Highlight
Padding and Anchors
Chapter 38. Color, Images, and Cursors
Colors
Colormaps and Visuals
Bitmaps and Images
The Text Insert Cursor
The Mouse Cursor
Chapter 39. Fonts and Text Attributes
Naming a Font
X Font Names
Font Metrics
The font Command
Text Attributes
Gridding, Resizing, and Geometry
A Font Selection Application
Chapter 40. Send
The send Command
The Sender Script
Communicating Processes
Remote eval through Sockets
Chapter 41. Window Managers and Window Information
The wm Command
The winfo Command
The tk Command
Chapter 42. Managing User Preferences
App-Defaults Files
Defining Preferences
The Preferences User Interface
Managing the Preferences File
Tracing Changes to Preference Variables
Improving the Package
Chapter 43. A User Interface to Bindings
A Pair of Listboxes Working Together
The Editing Interface
Saving and Loading Bindings
Part VI. C Programming
Chapter 44. C Programming and Tcl
Basic Concepts
Creating a Loadable Package
A C Command Procedure
The blob Command Example
Strings and Internationalization
Tcl_Main and Tcl_AppInit
The Event Loop
Invoking Scripts from C
Chapter 45. Compiling Tcl and Extensions
Standard Directory Structure
Building Tcl from Source
Using Stub Libraries
Using autoconf
The Sample Extension
Chapter 46. Writing a Tk Widget in C
Initializing the Extension
The Widget Data Structure
The Widget Class Command
The Widget Instance Command
Configuring and Reconfiguring Attributes
Specifying Widget Attributes
Displaying the Clock
The Window Event Procedure
Final Cleanup
Chapter 47. C Library Overview
An Overview of the Tcl C Library
An Overview of the Tk C Library
Part VII. Changes
Chapter 48. Tcl 7.4/Tk 4.0
wish
Obsolete Features
The cget Operation
Input Focus Highlight
Bindings
Scrollbar Interface
pack info
Focus
The send Command
Internal Button Padding
Radiobutton Value
Entry Widget
Menus
Listboxes
No geometry Attribute
Text Widget
Color Attributes
Color Allocation and tk colormodel
Canvas scrollincrement
The Selection
The bell Command
Chapter 49. Tcl 7.5/Tk 4.1
Cross-Platform Scripts
The clock Command
The load Command
The package Command
Multiple foreach loop variables
Event Loop Moves from Tk to Tcl
Network Sockets
Multiple Interpreters and Safe-Tcl
The grid Geometry Manager
The Text Widget
The Entry Widget
Chapter 50. Tcl 7.6/Tk 4.2
More file Operations
Virtual Events
Standard Dialogs
New grid Geometry Manager
Macintosh unsupported1 Command
Chapter 51. Tcl/Tk 8.0
The Tcl Compiler
Namespaces
Safe-Tcl
New lsort
tcl_precision Variable
Year 2000 Convention
Http Package
Serial Line I/O
Platform-Independent Fonts
The tk scaling Command
Application Embedding
Native Menus and Menubars
CDE Border Width
Native Buttons and Scrollbars
Images in Text Widgets
No Errors from destroy
grid rowconfigure
The Patch Releases
Chapter 52. Tcl/Tk 8.1
Unicode and Internationalization
Thread Safety
Advanced Regular Expressions
New String Commands
The DDE Extension
Miscellaneous
Chapter 53. Tcl/Tk 8.2
The Trf Patch
Faster String Operations
Empty Array Names
Browser Plugin Compatibility
Chapter 54. Tcl/Tk 8.3
Proposed Tcl Changes
Proposed Tk Changes
Chapter 55. About The CD-ROM
Technical Support
Index

Copyright
Library of Congress Cataloging-in-Publication Data
Welch, Brent B.
Practical programming in Tcl and Tk / Brent B. Welch.-- 3rd ed.
p. cm.
ISBN 0-13-022028-0
1. Tcl (Computer program language) 2. Tk toolkit. I. Title.
QA76.73.T44 W45 1999
005.13'3--dc21 99-047206

Credits
Editorial/Production Supervision: Joan L. McNamara
Acquisitions Editor: Mark Taub
Marketing Manager: Kate Hargett
Editorial Assistant: Michael Fredette
Cover Design Director: Jerry Votta
Cover Design: Design Source
Manufacturing Manager: Alexis R. Heydt
© 2000, 1997 by Prentice Hall PTR
Prentice-Hall, Inc.
Upper Saddle River, New Jersey 07458
Prentice Hall books are widely used by corporations and government agencies for training, marketing,
and resale. The publisher offers discounts on this book when ordered in bulk quantities. For more
information, contact:
Corporate Sales Department, Prentice Hall PTR, One Lake Street, Upper Saddle River, NJ 07458
Phone: 800-382-3419; Fax: 201-236-7141; email: [email protected]
All rights reserved. No part of this book may be reproduced, in any form or by any means, without
permission in writing from the publisher.
All product names mentioned herein are the trademarks of their respective owners.
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Prentice-Hall (Singapore) Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

Dedication
to Jody, Christopher, Daniel, and Michael

List of Examples
1.1 The "Hello, World!" example
1.2 Tcl variables
1.3 Command substitution
1.4 Simple arithmetic
1.5 Nested commands
1.6 Built-in math functions
1.7 Grouping expressions with braces
1.8 Quoting special characters with backslash
1.9 Continuing long lines with backslashes
1.10 Grouping with double quotes vs. braces
1.11 Embedded command and variable substitution
1.12 Defining a procedure
1.13 A while loop to compute factorial
1.14 A recursive definition of factorial
1.15 Using set to return a variable value
1.16 Embedded variable references
1.17 Using info to determine if a variable exists
1.18 Controlling precision with tcl_precision
2.1 A standalone Tcl script on UNIX
2.2 A standalone Tk script on UNIX
2.3 Using /bin/sh to run a Tcl script
2.4 The EchoArgs script
3.1 A simple CGI script
3.2 Output of Example 3-1
3.3 The guestbook.cgi script
3.4 The Cgi_Header procedure
3.5 The Link command formats a hypertext link
3.6 Initial output of guestbook.cgi
3.7 Output of guestbook.cgi
3.8 The newguest.html form
3.9 The newguest.cgi script
4.1 Comparing strings with string compare
4.2 Comparing strings with string equal
4.3 Mapping Microsoft Word special characters to ASCII
5.1 Constructing a list with the list command
5.2 Using lappend to add elements to a list
5.3 Using concat to splice lists together
5.4 Double quotes compared to the concat and list commands
5.5 Modifying lists with linsert and lreplace
5.6 Deleting a list element by value
5.7 Sorting a list using a comparison function
5.8 Use split to turn input data into Tcl lists
5.9 Implementing join in Tcl
6.1 A conditional if then else command
6.2 Chained conditional with elseif
6.3 Using switch for an exact match
6.4 Using switch with substitutions in the patterns
6.5 A switch with "fall through" cases
6.6 Comments in switch commands
6.7 A while loop to read standard input
6.8 Looping with foreach
6.9 Parsing command-line arguments
6.10 Using list with foreach
6.11 Multiple loop variables with foreach
6.12 Multiple value lists with foreach
6.13 A for loop
6.14 A standard catch phrase
6.15 A longer catch phrase
6.16 There are several possible return values from catch
6.17 Raising an error
6.18 Preserving errorInfo when calling error
6.19 Raising an error with return
7.1 Default parameter values
7.2 Variable number of arguments
7.3 Variable scope and Tcl procedures
7.4 A random number generator
7.5 Print variable by name
7.6 Improved incr procedure
8.1 Using arrays
8.2 Referencing an array indirectly
8.3 Referencing an array indirectly using upvar
8.4 ArrayInvert inverts an array
8.5 Using arrays for records, version 1
8.6 Using arrays for records, version 2
8.7 Using a list to implement a stack
8.8 Using an array to implement a stack
8.9 A list of arrays
8.10 A list of arrays
8.11 A simple in-memory database
9.1 Using exec on a process pipeline
9.2 Comparing file modify times
9.3 Determining whether pathnames reference the same file
9.4 Opening a file for writing
9.5 A more careful use of open
9.6 Opening a process pipeline
9.7 Prompting for input
9.8 A read loop using gets
9.9 A read loop using read and split
9.10 Copy a file and translate to native format
9.11 Finding a file by name
9.12 Printing environment variable values
10.1 Using list to construct commands
10.2 Generating procedures dynamically with a template
10.3 Using eval with $args
10.4 lassign: list assignment with foreach
10.5 The File_Process procedure applies a command to each line of a file
11.1 Expanded regular expressions allow comments
11.2 Using regular expressions to parse a string
11.3 A pattern to match URLs
11.4 An advanced regular expression to match URLs
11.5 The Url_Decode procedure
11.6 The Cgi_Parse and Cgi_Value procedures
11.7 Cgi_Parse and Cgi_Value store query data in the cgi array
11.8 Html_DecodeEntity
11.9 Html_Parse
12.1 Maintaining a tclIndex file
12.2 Loading a tclIndex file
13.1 Calculating clicks per second
13.2 Printing a procedure definition
13.3 Mapping form data onto procedure arguments
13.4 Finding built-in commands
13.5 Getting a trace of the Tcl call stack
13.6 A procedure to read and evaluate commands
13.7 Using info script to find related files
13.8 Tracing variables
13.9 Creating array elements with array traces
13.10 Interactive history usage
13.11 Implementing special history syntax
13.12 A Debug procedure
13.13 Time Stamps in log records
14.1 Random number generator using namespaces
14.2 Random number generator using qualified names
14.3 Nested namespaces
14.4 The code procedure to wrap callbacks
14.5 Listing commands defined by a namespace
15.1 MIME character sets and file encodings
15.2 Using scripts in nonstandard encodings
15.3 Three sample message catalog files
15.4 Using msgcat::mcunknown to share message catalogs
16.1 A read event file handler
16.2 Using vwait to activate the event loop
16.3 A read event file handler for a nonblocking channel
17.1 Opening a client socket with a timeout
17.2 Opening a server socket
17.3 The echo service
17.4 A client of the echo service
17.5 Opening a connection to an HTTP server
17.6 Opening a connection to an HTTP server
17.7 Http_Head validates a URL
17.8 Using Http_Head
17.9 Http_Get fetches the contents of a URL
17.10 HttpGetText reads text URLs
17.11 HttpCopyDone is used with fcopy
17.12 Downloading files with http::geturl
17.13 Basic Authentication using http::geturl
18.1 A simple URL domain
18.2 Application Direct URLs
18.3 Alternate types for Application Direct URLs
18.4 A sample document type handler
18.5 A one-level site structure
18.6 An HTML + Tcl template file
18.7 SitePage template procedure
18.8 SiteMenu and SiteFooter template procedures
18.9 The SiteLink procedure
18.10 Mail form results with /mail/forminfo
18.11 Mail message sent by /mail/forminfo
18.12 Processing mail sent by /mail/forminfo
18.13 A self-checking form procedure
18.14 A page with a self-checking form
18.15 The /debug/source application-direct URL implementation
19.1 Creating and deleting an interpreter
19.2 Creating a hierarchy of interpreters
19.3 A command alias for exit
19.4 Querying aliases
19.5 Dumping aliases as Tcl commands
19.6 Substitutions and hidden commands
19.7 Opening a file for an unsafe interpreter
19.8 The Safesock security policy
19.9 The Tempfile security policy
19.10 Restricted puts using hidden commands
19.11 A safe after command
21.1 "Hello, World!" Tk program
21.2 Looking at all widget attributes
22.1 Logging the output of a program run with exec
22.2 A platform-specific cancel event
22.3 A browser for the code examples in the book
22.4 A Tcl shell in a text widget
22.5 Macintosh look and feel
22.6 Windows look and feel
22.7 UNIX look and feel
23.1 Two frames packed inside the main frame
23.2 Turning off geometry propagation
23.3 A horizontal stack inside a vertical stack
23.4 Even more nesting of horizontal and vertical stacks
23.5 Mixing bottom and right packing sides
23.6 Filling the display into extra packing space
23.7 Using horizontal fill in a menu bar
23.8 The effects of internal padding (-ipady)
23.9 Button padding vs. packer padding
23.10 The look of a default button
23.11 Resizing without the expand option
23.12 Resizing with expand turned on
23.13 More than one expanding widget
23.14 Setup for anchor experiments
23.15 The effects of noncenter anchors
23.16 Animating the packing anchors
23.17 Controlling the packing order
23.18 Packing into other relatives
24.1 A basic grid
24.2 A grid with sticky settings
24.3 A grid with row and column specifications
24.4 A grid with external padding
24.5 A grid with internal padding
24.6 All combinations of -sticky settings
24.7 Explicit row and column span
24.8 Grid syntax row and column span
24.9 Row padding compared to widget padding
24.10 Gridding a text widget and scrollbar
25.1 Centering a window with place
25.2 Covering a window with place
25.3 Combining relative and absolute sizes
25.4 Positioning a window above a sibling with place
25.5 Pane_Create sets up vertical or horizontal panes
25.6 PaneDrag adjusts the percentage
25.7 PaneGeometry updates the layout
26.1 Bindings on different binding tags
26.2 Output from the UNIX xmodmap program
26.3 Emacs-like binding convention for Meta and Escape
26.4 Virtual events for cut, copy, and paste
27.1 A troublesome button command
27.2 Fixing the troublesome situation
27.3 A button associated with a Tcl procedure
27.4 Radiobuttons and checkbuttons
27.5 A command on a radiobutton or checkbutton
27.6 A menu sampler
27.7 A menu bar in Tk 8.0
27.8 A simple menu by name package
27.9 Using the Tk 8.0 menu bar facility
27.10 MenuGet maps from name to menu
27.11 Adding menu entries
27.12 A wrapper for cascade entries
27.13 Using the menu by name package
27.14 Keeping the accelerator display up to date
28.1 Reading an option database file
28.2 A file containing resource specifications
28.3 Using resources to specify user-defined buttons
28.4 Resource_ButtonFrame defines buttons based on resources
28.5 Using Resource_ButtonFrame
28.6 Specifying menu entries via resources
28.7 Defining menus from resource specifications
28.8 Resource_GetFamily merges user and application resources
29.1 Macintosh window styles
29.2 A label that displays different strings
29.3 The message widget formats long lines of text
29.4 Controlling the text layout in a message widget
29.5 A scale widget
30.1 A text widget and two scrollbars
30.2 Scroll_Set manages optional scrollbars
30.3 Listbox with optional scrollbars
31.1 A command entry
32.1 Choosing items from a listbox
33.1 Tag configurations for basic character styles
33.2 Line spacing and justification in the text widget
33.3 An active text button
33.4 Delayed creation of embedded widgets
33.5 Using embedded images for a bulleted list
33.6 Finding the current range of a text tag
33.7 Dumping the text widget
33.8 Dumping the text widget with a command callback
34.1 A large scrolling canvas
34.2 The canvas "Hello, World!" example
34.3 A min max scale canvas example
34.4 Moving the markers for the min max scale
34.5 Canvas arc items
34.6 Canvas bitmap items
34.7 Canvas image items
34.8 A canvas stroke drawing example
34.9 Canvas oval items
34.10 Canvas polygon items
34.11 Dragging out a box
34.12 Simple edit bindings for canvas text items
34.13 Using a canvas to scroll a set of widgets
34.14 Generating postscript from a canvas
35.1 Paste the PRIMARY or CLIPBOARD selection
35.2 Separate paste actions
35.3 Bindings for canvas selection
35.4 Selecting objects
35.5 A canvas selection handler
35.6 The copy and cut operations
35.7 Pasting onto the canvas
36.1 Procedures to help build dialogs
36.2 A simple dialog
36.3 A feedback procedure
37.1 Equal-sized labels
37.2 3D relief sampler
37.3 Padding provided by labels and buttons
37.4 Anchoring text in a label or button
37.5 Borders and padding
38.1 Resources for reverse video
38.2 Computing a darker color
38.3 Specifying an image for a widget
38.4 Specifying a bitmap for a widget
38.5 The built-in bitmaps
38.6 The Tk cursors
39.1 The FontWidget procedure handles missing fonts
39.2 Font metrics
39.3 A gridded, resizable listbox
39.4 Font selection dialog
40.1 The sender application
40.2 Hooking the browser to an eval server
40.3 Making the shell into an eval server
40.4 Remote eval using sockets
40.5 Reading commands from a socket
40.6 The client side of remote evaluation
41.1 Gridded geometry for a canvas
41.2 Telling other applications what your name is
42.1 Preferences initialization
42.2 Adding preference items
42.3 Setting preference variables
42.4 Using the preferences package
42.5 A user interface to the preference items
42.6 Interface objects for different preference types
42.7 Displaying the help text for an item
42.8 Saving preferences settings to a file
42.9 Read settings from the preferences file
42.10 Tracing a Tcl variable in a preference item
43.1 A user interface to widget bindings
43.2 Bind_Display presents the bindings for a widget or class
43.3 Related listboxes are configured to select items together
43.4 Controlling a pair of listboxes with one scrollbar
43.5 Drag-scrolling a pair of listboxes together
43.6 An interface to define bindings
43.7 Defining and saving bindings
44.1 The initialization procedure for a loadable package
44.2 The RandomCmd C command procedure
44.3 The RandomObjCmd C command procedure
44.4 The Tcl_Obj structure
44.5 The Plus1ObjCmd procedure
44.6 The Blob and BlobState data structures
44.7 The Blob_Init and BlobCleanup procedures
44.8 The BlobCmd command procedure
44.9 BlobCreate and BlobDelete
44.10 The BlobNames procedure
44.11 The BlobN and BlobData procedures
44.12 The BlobCommand and BlobPoke procedures
44.13 A canonical Tcl main program and Tcl_AppInit
44.14 A canonical Tk main program and Tk_AppInit
44.15 Calling C command procedure directly with Tcl_Invoke
46.1 The Clock_Init procedure
46.2 The Clock widget data structure
46.3 The ClockCmd command procedure
46.4 The ClockObjCmd command procedure
46.5 The ClockInstanceCmd command procedure
46.6 The ClockInstanceObjCmd command procedure
46.7 ClockConfigure allocates resources for the widget
46.8 ClockObjConfigure allocates resources for the widget
46.9 The Tk_ConfigSpec typedef
46.10 Configuration specs for the clock widget
46.11 The Tk_OptionSpec typedef
46.12 The Tk_OptionSpec structure for the clock widget
46.13 ComputeGeometry computes the widget's size
46.14 The ClockDisplay procedure
46.15 The ClockEventProc handles window events
46.16 The ClockDestroy cleanup procedure
46.17 The ClockObjDelete command

List of Tables
1-1 Backslash sequences
1-2 Arithmetic operators from highest to lowest precedence
1-3 Built-in math functions
1-4 Built-in Tcl commands
2-1 Wish command line options
2-2 Variables defined by tclsh and wish
3-1 HTML tags used in the examples
4-1 The string command
4-2 Matching characters used with string match
4-3 Character class names
4-4 Format conversions
4-5 Format flags
4-6 Binary conversion types
5-1 List-related commands
8-1 The array command
9-1 Summary of the exec syntax for I/O redirection
9-2 The file command options
9-3 Array elements defined by file stat
9-4 Platform-specific file attributes
9-5 Tcl commands used for file access
9-6 Summary of the open access arguments
9-7 Summary of POSIX flags for the access argument
9-8 The registry command
9-9 The registry data types
11-1 Basic regular expression syntax
11-2 Additional advanced regular expression syntax
11-3 Character classes
11-4 Backslash escapes in regular expressions
11-5 Embedded option characters used with the (?x) syntax
11-6 Options to the regexp command
11-7 Sample regular expressions
12-1 Options to the pkg_mkIndex command
12-2 The package command
13-1 The clock command
13-2 Clock formatting keywords
13-3 UNIX-specific clock formatting keywords
13-4 The info command
13-5 The history command
13-6 Special history syntax
14-1 The namespace command
15-1 The encoding command
15-2 The msgcat package
16-1 The after command
16-2 The fileevent command
16-3 I/O channel properties controlled by fconfigure
16-4 End of line translation modes
17-1 Options to the http::geturl command
17-2 Elements of the http::geturl state array
17-3 The http support procedures
18-1 Httpd support procedures
18-2 Url support procedures
18-3 Doc procedures for configuration
18-4 Doc procedures for generating responses
18-5 Doc procedures that support template processing
18-6 The form package
18-7 Elements of the page array
18-8 Elements of the env array
18-9 Status application-direct URLs
18-10 Debug application-direct URLs
18-11 Application-direct URLs that e-mail form results
18-12 Basic TclHttpd Parameters
19-1 The interp command
19-2 Commands hidden from safe interpreters
19-3 The safe base master interface
19-4 The safe base slave aliases
20-1 Tk commands omitted from safe interpreters
20-2 Plugin Environment Variables
20-3 Aliases defined by the browser package
20-4 The browser::getURL callbacks
21-1 Tk widget-creation commands
21-2 Tk widget-manipulation commands
21-3 Tk support procedures
23-1 The pack command
23-2 Packing options
24-1 The grid command
24-2 Grid widget options
25-1 The place command
25-2 Placement options
26-1 Event types
26-2 Event modifiers
26-3 The event command
26-4 A summary of the event keywords
27-1 Resource names of attributes for all button widgets
27-2 Button operations
27-3 Menu entry index keywords
27-4 Menu operations
27-5 Menu attribute resource names
27-6 Attributes for menu entries
29-1 Attributes for frame and toplevel widgets
29-2 Label Attributes
29-3 Message Attributes
29-4 Bindings for scale widgets
29-5 Attributes for scale widgets
29-6 Operations on the scale widget
30-1 Bindings for the scrollbar widget
30-2 Attributes for the scrollbar widget
30-3 Operations on the scrollbar widget
31-1 Entry bindings
31-2 Entry attribute resource names
31-3 Entry indices
31-4 Entry operations
32-1 Listbox indices
32-2 Listbox operations
32-3 The values for the selectMode of a listbox
32-4 Bindings for browse selection mode
32-5 Bindings for single selection mode
32-6 Bindings for extended selection mode
32-7 Bindings for multiple selection mode
32-8 Listbox scroll bindings
32-9 Listbox attribute resource names
33-1 Text indices
33-2 Index modifiers for text widgets
33-3 Attributes for text tags
33-4 Options to the search operation
33-5 Window and image alignment options
33-6 Options to the window create operation
33-7 Options to the image create operation
33-8 Bindings for the text widget
33-9 Operations for the text widget
33-10 Text attribute resource names
34-1 Arc attributes
34-2 Bitmap attributes
34-3 Image attributes
34-4 Line attributes
34-5 Oval attributes
34-6 Polygon attributes
34-7 Rectangle attributes
34-8 Indices for canvas text items
34-9 Canvas operations that apply to text items
34-10 Text attributes
34-11 Operations on a canvas widget
34-12 Canvas postscript options
34-13 Canvas attribute resource names
35-1 The selection command
35-2 The clipboard command
36-1 Options to tk_messageBox
36-2 Options to the standard file dialogs
36-3 Options to tk_chooseColor
36-4 The focus command
36-5 The grab command
36-6 The tkwait command
37-1 Size attribute resource names
37-2 Border and relief attribute resource names
37-3 Highlight attribute resource names
37-4 Layout attribute resource names
38-1 Color attribute resource names
38-2 Windows system colors
38-3 Macintosh system colors
38-4 Visual classes for displays
38-5 Summary of the image command
38-6 Bitmap image options
38-7 Photo image attributes
38-8 Photo image operations
38-9 Copy options for photo images
38-10 Read options for photo images
38-11 Write options for photo images
38-12 Cursor attribute resource names
39-1 Font attributes
39-2 X Font specification components
39-3 The font command
39-4 Layout attribute resource names
39-5 Selection attribute resource names
40-1 Options to the send command
41-1 Size, placement and decoration window manager operations
41-2 Window manager commands for icons
41-3 Session-related window manager operations
41-4 Miscellaneous window manager operations
41-5 send command information
41-6 Window hierarchy information
41-7 Window size information
41-8 Window location information
41-9 Virtual root window information
41-10 Atom and window ID information
41-11 Colormap and visual class information
45-1 The Tcl source directory structure
45-2 The installation directory structure
45-3 Standard configure flags
45-4 TEA standard Makefile targets
46-1 Configuration flags and corresponding C types
48-1 Changes in color attribute names
52-1 The testthread command
52-2 The dde command options

Top
Practical Programming in Tcl & Tk, Third Edition
By Brent B. Welch

Table of Contents

Preface
Tcl stands for Tool Command Language. Tcl is really two things: a scripting language, and an
interpreter for that language that is designed to be easy to embed into your application. Tcl and its
associated graphical user-interface toolkit, Tk, were designed and crafted by Professor John
Ousterhout of the University of California, Berkeley. You can find these packages on the Internet (as
explained on page lii) and use them freely in your application, even if it is commercial. The Tcl
interpreter has been ported from UNIX to DOS, Windows, OS/2, NT, and Macintosh environments.
The Tk toolkit has been ported from the X window system to Windows and Macintosh.
I first heard about Tcl in 1988 while I was Ousterhout's Ph.D. student at Berkeley. We were designing
a network operating system, Sprite. While the students hacked on a new kernel, John wrote a new
editor and terminal emulator. He used Tcl as the command language for both tools so that users could
define menus and otherwise customize those programs. This was in the days of X10, and he had plans
for an X toolkit based on Tcl that would help programs cooperate with each other by communicating
with Tcl commands. To me, this cooperation among tools was the essence of Tcl.
This early vision imagined that applications would be large bodies of compiled code and a small
amount of Tcl used for configuration and high-level commands. John's editor, mx, and the terminal
emulator, tx, followed this model. While this model remains valid, it has also turned out to be possible
to write entire applications in Tcl. This is because the Tcl/Tk shell, wish, provides access to other
programs, the file system, network sockets, plus the ability to create a graphical user interface. For
better or worse, it is now common to find applications that contain thousands of lines of Tcl script.
This book was written because, while I found it enjoyable and productive to use Tcl and Tk, there
were times when I was frustrated. In addition, working at Xerox PARC, with many experts in
languages and systems, I was compelled to understand both the strengths and weaknesses of Tcl and
Tk. Although many of my colleagues adopted Tcl and Tk for their projects, they were also just as
quick to point out its flaws. In response, I have built up a set of programming techniques that exploit
the power of Tcl and Tk while avoiding troublesome areas. This book is meant as a practical guide to
help you get the most out of Tcl and Tk and avoid some of the frustrations I experienced.
It has been about 10 years since I was introduced to Tcl, and about five years since the first edition of
this book. During the last several years I have been working under John Ousterhout, first at Sun
Microsystems and now at Scriptics Corporation. I have managed to remain mostly a Tcl programmer
while others in our group have delved into the C implementation of Tcl itself. I've been building
applications like HTML editors, e-mail user interfaces, Web servers, and the customer database we run
our business on. This experience is reflected in this book. The bulk of the book is about Tcl scripting,
and C programming to create Tcl extensions is given a lighter treatment. I have been
lucky to remain involved in the core Tcl development, and I hope I can pass along the insights I have
gained by working with Tcl.

Why Tcl?
As a scripting language, Tcl is similar to other UNIX shell languages such as the Bourne Shell (sh),
the C Shell (csh), the Korn Shell (ksh), and Perl. Shell programs let you execute other programs. They
provide enough programmability (variables, control flow, and procedures) to let you build complex
scripts that assemble existing programs into a new tool tailored for your needs. Shells are wonderful
for automating routine chores.
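A minimal sketch of this kind of programmability in Tcl follows. The procedure name CountLines and the sample data are assumptions made for illustration; they are not taken from the book's examples.

```tcl
# CountLines is a hypothetical procedure, not an example from the book.
# It returns the number of lines in a string, such as the output of a program.
proc CountLines {text} {
    set count 0
    foreach line [split $text \n] {
        incr count
    }
    return $count
}

# On a UNIX system, exec runs another program and returns its output:
#   set listing [exec ls]
# A fixed string stands in for program output so the sketch runs anywhere.
set listing "a.tcl\nb.tcl\nc.tcl"
puts "[CountLines $listing] entries"
```

The sketch uses a variable, a loop, and a procedure, which are the same building blocks a shell script uses to wrap existing programs.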
It is the ability to easily add a Tcl interpreter to your application that sets it apart from other shells. Tcl
fills the role of an extension language that is used to configure and customize applications. There is no
need to invent a command language for your new application, or struggle to provide some sort of user-
programmability for your tool. Instead, by adding a Tcl interpreter, you structure your application as a
set of primitive operations that can be composed by a script to best suit the needs of your users. It also
allows other programs to have programmatic control over your application, leading to suites of
applications that work well together.
The Tcl C library has clean interfaces and is simple to use. The library implements the basic interpreter
and a set of core scripting commands that implement variables, flow control, and procedures (see page
22). There is also a set of commands that access operating system services to run other programs,
access the file system, and use network sockets. Tk adds commands to create graphical user interfaces.
Tcl and Tk provide a "virtual machine" that is portable across UNIX, Windows, and Macintosh
environments.
The Tcl virtual machine is extensible because your application can define new Tcl commands. These
commands are associated with a C or C++ procedure that your application provides. The result is
applications that are split into a set of primitives written in a compiled language and exported as Tcl
commands. A Tcl script is used to compose the primitives into the overall application. The script layer
has access to shell-like capability to run other programs, has access to the file system, and can call
directly into the compiled part of the application through the Tcl commands you define. In addition,
from the C programming level, you can call Tcl scripts, set and query Tcl variables, and even trace the
execution of the Tcl interpreter.
There are many Tcl extensions freely available on the Internet. Most extensions include a C library
that provides some new functionality, and a Tcl interface to the library. Examples include database
access, telephone control, MIDI controller access, and expect, which adds Tcl commands to control
interactive programs.
The most notable extension is Tk, a toolkit for graphical user interfaces. Tk defines Tcl commands
that let you create and manipulate user interface widgets. The script-based approach to user interface
programming has three benefits:

Development is fast because of the rapid turnaround; there is no waiting for long compilations.
The Tcl commands provide a higher-level interface than most standard C library user-interface
toolkits. Simple user interfaces require just a handful of commands to define them. At the same
time, it is possible to refine the user interface in order to get every detail just so. The fast
turnaround aids the refinement process.
The user interface can be factored out from the rest of your application. The developer can
concentrate on the implementation of the application core and then fairly painlessly work up a
user interface. The core set of Tk widgets is often sufficient for all your user interface needs.
However, it is also possible to write custom Tk widgets in C, and again there are many
contributed Tk widgets available on the network.
There are other choices for extension languages that include Visual Basic, Scheme, Elisp, Perl,
Python, and JavaScript. Your choice between them is partly a matter of taste. Tcl has simple constructs
and looks somewhat like C. It is easy to add new Tcl primitives by writing C procedures. Tcl is very
easy to learn, and I have heard many great stories of users completing impressive projects in a short
amount of time (e.g., a few weeks), even though they never used Tcl before.
Java has exploded onto the computer scene since this book was first published. Java is a great systems
programming language that in the long run could displace C and C++. This is fine for Tcl, which is
designed to glue together building blocks written in any system programming language. Tcl was
designed to work with C, but has been adapted to work with the Java Virtual Machine. Where I say "C
or C++", you can now say "C, C++, or Java," but the details are a bit different with Java. This book
does not describe the Tcl/Java interface, but you can find TclBlend on the CD-ROM. TclBlend loads
the Java Virtual Machine into your Tcl application and lets you invoke Java methods. It also lets you
implement Tcl commands in Java instead of C or C++.
JavaScript is a language from Netscape that is designed to script interactions with Web pages.
JavaScript is important because Netscape is widely deployed. However, Tcl provides a more general
purpose scripting solution that can be used in a wide variety of applications. The Tcl/Tk Web browser
plugin provides a way to run Tcl in your browser. It turns out to be more of a Java alternative than a
JavaScript alternative. The plugin lets you run Tcl applications inside your browser, while JavaScript
gives you fine-grained control over the browser and HTML display. The plugin is described in Chapter
20.

Tcl and Tk Versions

Tcl and Tk continue to evolve. See http://www.beedub.com/book/ for updates and news about the
latest Tcl releases. Tcl and Tk have had separate version numbers for historical reasons, but they are
released in pairs that work together. The original edition of this book was based on Tcl 7.4 and Tk 4.0,
and there were a few references to features in Tk 3.6. This third edition has been updated to reflect
new features added through Tcl/Tk 8.2:

Tcl 7.5 and Tk 4.1 had their final release in May 1996. These releases feature the port of Tk to
the Windows and Macintosh environments. The Safe-Tcl security mechanism was introduced to
support safe execution of network applets. There is also network socket support and a new
Input/Output (I/O) subsystem to support high-performance event-driven I/O.
Tcl 7.6 and Tk 4.2 had their final release in October 1996. These releases include improvements
in Safe-Tcl, and improvements to the grid geometry manager introduced in Tk 4.1. Cross-
platform support includes virtual events (e.g., <<Copy>> as opposed to <Control-c>), standard
dialogs, and more file manipulation commands.
Tcl 7.7 and Tk 4.3 were internal releases used for the development of the Tcl/Tk plug-in for the
Netscape Navigator and Microsoft Internet Explorer Web browsers. Their development actually
proceeded in parallel to Tcl 7.6 and Tk 4.2. The plug-in has been released for a wide variety of
platforms, including Solaris/SPARC, Solaris/INTEL, SunOS, Linux, Digital UNIX, IRIX,
HP/UX, Windows 95, Windows NT, and the Macintosh. The browser plug-in supports Tcl
applets in Web pages and uses the sophisticated security mechanism of Safe-Tcl to provide
safety.
Tcl 8.0 features an on-the-fly compiler for Tcl that provides many-times faster Tcl scripts. Tcl
8.0 supports strings with embedded null characters. The compiler is transparent to Tcl scripts,
but extension writers need to learn some new C APIs to take advantage of its potential. The
release history of 8.0 spread out over a couple of years as John Ousterhout moved from Sun
Microsystems to Scriptics Corporation. The widely used 8.0p2 release was made in the fall of
1997, but the final patch release, 8.0.5, was made in the spring of 1999.
Tk changed its version to match Tcl at 8.0. Tk 8.0 includes a new platform-independent font
mechanism, native menus and menu bars, and more native widgets for better native look and feel
on Windows and Macintosh.
Tcl/Tk 8.1 features full Unicode support, a new regular expression engine that provides all the
features found in Perl 5, and thread safety so that you can embed Tcl into multithreaded
applications. Tk does a heroic job of finding the correct font to display your Unicode characters,
and it adds a message catalog facility so that you can write internationalized applications. The
release history of Tcl/Tk 8.1 also straddled the Sun to Scriptics transition. The first alpha release
was made in the fall of 1997, and the final patch release, 8.1.1, was made in May 1999.
Tcl/Tk 8.2 is primarily a bug fix and stabilization release. There are a few minor additions to the
Tcl C library APIs to support more extensions without requiring core patches. Tcl/Tk 8.2 went
rapidly into final release in the summer of 1999.
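Two of the 8.1 features listed above can be tried from a Tcl shell. This is a sketch that assumes Tcl 8.1 or later; the sample strings and variable names are made up for illustration.

```tcl
# Non-greedy quantifier (+?), part of the advanced regular expression
# syntax: match as little as possible between the angle brackets.
regexp {<(.+?)>} "<b>bold</b>" match tag
puts $tag

# The \u escape names a Unicode character by its hexadecimal code.
# String lengths are counted in characters, not bytes.
set word "caf\u00e9"
puts [string length $word]
```

Here tag is set to b rather than to everything up to the last angle bracket, and the length of word is 4.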

Who Should Read This Book

This book is meant to be useful to the beginner in Tcl as well as the expert. For the beginner and
expert alike, I recommend careful study of Chapter 1, Tcl Fundamentals. The programming model of
Tcl is designed to be simple, but it is different from many programming languages. The model is
based on string substitutions, and it is important that you understand it properly to avoid trouble in
complex cases. The remainder of the book consists of examples that demonstrate how to use Tcl and
Tk productively. For your reference, each chapter has tables that summarize the Tcl commands and Tk
widgets it describes.
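The substitution model can be seen in a few lines. This is only a sketch with arbitrary variable names; Chapter 1 gives the full rules.

```tcl
set name Tcl
# Command substitution: the bracketed command runs first and its
# result replaces the brackets.
set len [string length $name]
# Variable substitution: $name and $len are replaced inside double quotes.
set quoted "$name has $len characters"
# Braces group without substitution, so the dollar signs stay literal.
set braced {$name has $len characters}
puts $quoted
puts $braced
```

The first puts prints Tcl has 3 characters. The second prints the text between the braces unchanged.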
This book assumes that you have some programming experience, although you should be able to get
by even if you are a complete novice. Knowledge of UNIX shell programming will help, but it is not
required. Where aspects of window systems are relevant, I provide some background information.
Chapter 2 describes the details of using Tcl and Tk on UNIX, Windows, and Macintosh.

How to Read This Book

This book is best used in a hands-on manner, trying the examples at the computer. The book tries to
fill the gap between the terse Tcl and Tk manual pages, which are complete but lack context and
examples, and existing Tcl programs that may or may not be documented or well written.
I recommend the on-line manual pages for the Tcl and Tk commands. They provide a detailed
reference guide to each command. This book summarizes much of the information from the manual
pages, but it does not provide the complete details, which can vary from release to release. HTML
versions of the on-line manual pages can be found on the CD-ROM that comes with this book.

Other Tcl Books

This book was the second Tcl book after the original book by John Ousterhout, the creator of Tcl.
Since then, the number of Tcl books has increased remarkably. The following are just some of the
books currently available.
Tcl and the Tk Toolkit (Addison-Wesley, 1994) by John Ousterhout provides a broad overview of all
aspects of Tcl and Tk, even though it covers only Tcl 7.3 and Tk 3.6. The book provides a more
detailed treatment of C programming for Tcl extensions.
Exploring Expect (O'Reilly & Associates, Inc., 1995) by Don Libes is a great book about an extremely
useful Tcl extension. Expect lets you automate the use of interactive programs like ftp and telnet that
expect to interact with a user. By combining expect and Tk, you can create graphical user interfaces
for old applications that you cannot modify directly.
Graphical Applications with Tcl & Tk (M&T Press, 1996) by Eric Johnson is oriented toward
Windows users. The second edition is up-to-date with Tcl/Tk 8.0.
Tcl/Tk Tools (O'Reilly & Associates, Inc., 1997) by Mark Harrison describes many useful Tcl
extensions. These include Oracle and Sybase interfaces, object-oriented language enhancements,
additional Tk widgets, and much more. The chapters were contributed by the authors of the
extensions, so they provide authoritative information on some excellent additions to the Tcl toolbox.
CGI Developers Resource, Web Programming with Tcl and Perl (Prentice Hall, 1997) by John Ivler
presents Tcl-based solutions to programming Web sites.
Effective Tcl/Tk Programming (Addison Wesley, 1997) by Michael McLennan and Mark Harrison
illustrates Tcl and Tk with examples and application design guidelines.
Interactive Web Applications with Tcl/Tk (AP Professional, 1998) by Michael Doyle and Hattie
Schroeder describes Tcl programming in the context of the Web browser plugin.
Tcl/Tk for Programmers (IEEE Computer Society, 1998) by Adrian Zimmer describes Unix and
Windows programming with Tcl/Tk. This book also includes solved exercises at the end of each
chapter.
Tcl/Tk for Real Programmers (Academic Press, 1999) by Clif Flynt is another example-oriented book.
Tcl/Tk in a Nutshell (O'Reilly, 1999) by Paul Raines and Jeff Tranter is a handy reference guide. It
covers several popular extensions including Expect, [incr Tcl], Tix, TclX, BLT, SybTcl, OraTcl, and
TclODBC. There is a tiny pocket-reference guide for Tcl/Tk that may eliminate the need to thumb
through my large book to find the syntax of a particular Tcl or Tk command.
Web Tcl Complete (McGraw Hill, 1999) by Steve Ball describes programming with the Tcl Web
Server. It also covers Tcl/Java integration using TclBlend.
[incr Tcl] From The Ground Up (Osborne-McGraw Hill, 1999) by Chad Smith describes the [incr Tcl]
object-oriented extension to Tcl.

On-line Examples
The book comes with a CD-ROM that has source code for all of the examples, plus a selection of Tcl
freeware found on the Internet. The CD-ROM is created with the Linux mkhybrid program, so it is
readable on UNIX, Windows, and Macintosh. There, you will find the versions of Tcl and Tk that
were available as the book went to press. You can also retrieve the sources shown in the book from my
personal Web site:
http://www.beedub.com/book/

Ftp Archives
The primary site for the Tcl and Tk distributions is given below as a Uniform Resource Locator
(URL):
ftp://ftp.scriptics.com/pub/tcl
You can use FTP and log in to the host (e.g., ftp.scriptics.com) under the anonymous user name. Give
your e-mail address as the password. The directory is in the URL after the host name (e.g., /pub/tcl).
There are many sites that mirror this distribution. The mirror sites provide an archive site for
contributed Tcl commands, Tk widgets, and applications. There is also a set of Frequently Asked
Questions files. These are some of the sites that maintain Tcl archives:
ftp://ftp.neosoft.com/pub/tcl
ftp://ftp.syd.dit.csiro.au/pub/tk
ftp://ftp.ibp.fr/pub/tcl
ftp://src.doc.ic.ac.uk/packages/tcl/
ftp://ftp.luth.se/pub/unix/tcl/
ftp://sunsite.cnlab-switch.ch/mirror/tcl
ftp://ftp.sterling.com/programming/languages/tcl
ftp://ftp.sunet.se/pub/lang/tcl
ftp://ftp.cs.columbia.edu/archives/tcl
ftp://ftp.uni-paderborn.de/pub/unix/tcl
ftp://sunsite.unc.edu/pub/languages/tcl
ftp://ftp.funet.fi/pub/languages/tcl
You can use a World Wide Web browser like Mosaic, Netscape, Internet Explorer, or Lynx to access
these sites. Enter the URL as specified above, and you are presented with a directory listing of that
location. From there you can change directories and fetch files.
If you do not have direct FTP access, you can use an e-mail server for FTP. Send e-mail to
[email protected] with the message Help to get directions. If you are on BITNET, send e-mail
to [email protected].
You can search for FTP sites that have Tcl by using the Archie service that indexes the contents of
anonymous FTP servers. Information about using Archie can be obtained by sending mail to
[email protected] that contains the message Help.

World Wide Web

Start with these World Wide Web pages about Tcl:
http://www.scriptics.com/
http://www.sco.com/Technology/tcl/Tcl.html
http://www.purl.org/NET/Tcl-FAQ/
The home page for this book contains errata for all editions. This is the only URL I control personally,
and I plan to keep it up-to-date indefinitely:
http://www.beedub.com/book/
The Prentice Hall Web site has information about the book, but you must use its search facility to find
the exact location. Start at:
http://www.prenhall.com/

Newsgroups
The comp.lang.tcl newsgroup is very active. It provides a forum for questions and answers about
Tcl. Announcements about Tcl extensions and applications are posted to the
comp.lang.tcl.announce newsgroup.

Typographic Conventions
The more important examples are set apart with a title and horizontal rules, while others appear in-
line. The examples use courier for Tcl and C code. When interesting results are returned by a Tcl
command, those are presented below in oblique courier. The => is not part of the return value in the
following example.

expr 5 + 8
=> 13

The courier font is also used when naming Tcl commands and C procedures within sentences.
The usage of a Tcl command is presented in the following example. The command name and constant
keywords appear in courier. Variable values appear in courier oblique. Optional arguments are
surrounded with question marks.

set varname ?value?
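
As an illustration of the optional argument in this usage line (the variable name color is an arbitrary choice), set can be called with or without a value:

```tcl
# With a value, set assigns it to the variable and returns it.
set color blue
# Without a value, set returns the current value of the variable.
puts [set color]
```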

The name of a program is in italics:

xterm

Hot Tips
The icon in the margin marks a "hot tip" as judged by the reviewers of the book.
The visual markers help you locate the more useful sections in the book. These
are also listed in the index under Hot Tip.

Book Organization
The chapters of the book are divided into seven parts. The first part describes basic Tcl features. The
first chapter describes the fundamental mechanisms that characterize the Tcl language. This is an
important chapter that provides the basic grounding you will need to use Tcl effectively. Even if you
have programmed in Tcl already, you should review Chapter 1. Chapter 2 goes over the details of
using Tcl and Tk on UNIX, Windows, and Macintosh. Chapter 3 presents a sample application, a CGI
script, that illustrates typical Tcl programming. The rest of Part I covers the basic Tcl commands in
more detail, including string handling, data types, control flow, procedures, and scoping issues. Part I
finishes with a description of the facilities for file I/O and running other programs.
Part II describes advanced Tcl programming. It starts with eval, which lets you generate Tcl programs
on the fly. Regular expressions provide powerful string processing. If your data-processing application
runs slowly, you can probably boost its performance significantly with the regular expression facilities.
Namespaces partition the global scope of procedures and variables. Unicode and message catalogs
support internationalized applications. Libraries and packages provide a way to organize your code for
sharing among projects. The introspection facilities of Tcl tell you about the internal state of Tcl.
Event driven I/O helps server applications manage several clients simultaneously. Network sockets are
used to implement the HTTP protocol used to fetch pages on the World Wide Web. Safe-Tcl is used to
provide a secure environment to execute applets downloaded over the network. TclHttpd is an
extensible web server built in Tcl. You can build applications on top of this server, or embed it into
your existing applications to give them a web interface.
Part III introduces Tk. It gives an overview of the toolkit facilities. A few complete examples are
examined in detail to illustrate the features of Tk. Event bindings associate Tcl commands with events
like keystrokes and button clicks. Part III ends with three chapters on the Tk geometry managers that
provide powerful facilities for organizing your user interface.
Part IV describes the Tk widgets. These include buttons, menus, scrollbars, labels, text entries,
multiline and multifont text areas, drawing canvases, listboxes, and scales. The Tk widgets are highly
configurable and very programmable, but their default behaviors make them easy to use as well. The
resource database that can configure widgets provides an easy way to control the overall look of your
application.
Part V describes the rest of the Tk facilities. These include selections, keyboard focus, and standard
dialogs. Fonts, colors, images, and other attributes that are common to the Tk widgets are described in
detail. This part ends with a few larger Tk examples.
Part VI is an introduction to C programming and Tcl. The goal of this part is to get you started in the
right direction when you need to extend Tcl with new commands written in C or integrate Tcl into
custom applications.
Part VII provides a chapter for each of the Tcl/Tk releases covered by the book. These chapters
provide details about what features were changed and added. They also provide a quick reference if
you need to update a program or start to use a new version.

What's New in the Third Edition

The third edition is up-to-date with Tcl/Tk 8.2. The main new Tcl/Tk features are Internationalization,
which is covered in Chapter 15, a new regular expression engine, which is described in Chapter 11,
and thread-safety. There is a new chapter about compiling C extensions, and there is a more complete
C extension example. The chapters on Eval and the Web browser plugin received a thorough update. I
made a light sweep through the remainder of the book correcting errors and improving examples.
Perhaps the best addition for the reader is an all-new index.
My favorite addition to the book is Chapter 18, which describes TclHttpd, a Web server built in Tcl.
TclHttpd provides a number of nice ways to integrate a Web server with a Tcl application, replacing
the standard CGI interface with something that is much more flexible and efficient. I have been using
this server for the last year to build www.scriptics.com. This freely available server has been used to
build several other products, plus it provides an easy way for you to bring up your own Web server.

First Edition Thanks

I would like to thank my managers and colleagues at Xerox PARC for their patience with me as I
worked on this book. The tips and tricks in this book came partly from my own work as I helped lab
members use Tcl, and partly from them as they taught me. Dave Nichols' probing questions forced me
to understand the basic mechanisms of the Tcl interpreter. Dan Swinehart and Lawrence Butcher kept
me sharp with their own critiques. Ron Frederick and Berry Kercheval adopted Tk for their graphical
interfaces and amazed me with their rapid results. Becky Burwell, Rich Gold, Carl Hauser, John
Maxwell, Ken Pier, Marvin Theimer, and Mohan Vishwanath made use of my early drafts, and their
questions pointed out large holes in the text. Karin Petersen, Bill Schilit, and Terri Watson kept life
interesting by using Tcl in very nonstandard ways. I especially thank my managers, Mark Weiser and
Doug Terry, for their understanding and support.
I thank John Ousterhout for Tcl and Tk, which are wonderful systems built with excellent
craftsmanship. John was kind enough to provide me with an advance version of Tk 4.0 so that I could
learn about its new features well before its first beta release.
Thanks to the Tcl programmers out on the Net, from whom I learned many tricks. John LoVerso and
Stephen Uhler are the hottest Tcl programmers I know.
Many thanks to the patient reviewers of early drafts: Pierre David, Clif Flynt, Simon Kenyon, Eugene
Lee, Don Libes, Lee Moore, Joe Moss, Hador Shemtov, Frank Stajano, Charles Thayer, and Jim
Thornton.
Many folks contributed suggestions by e-mail: Miguel Angel, Stephen Bensen, Jeff Blaine, Tom
Charnock, Brian Cooper, Patrick D'Cruze, Benoit Desrosiers, Ted Dunning, Mark Eichin, Paul
Friberg, Carl Gauthier, David Gerdes, Klaus Hackenberg, Torkle Hasle, Marti Hearst, Jean-Pierre
Herbert, Jamie Honan, Norman Klein, Joe Konstan, Susan Larson, Håkan Liljegren, Lionel Mallet,
Dejan Milojicic, Greg Minshall, Bernd Mohr, Will Morse, Heiko Nardmann, Gerd Neugebauer, TV
Raman, Cary Renzema, Rob Riepel, Dan Schenk, Jean-Guy Schneider, Elizabeth Scholl, Karl
Schwamb, Rony Shapiro, Peter Simanyi, Vince Skahan, Bill Stumbo, Glen Vanderburg, Larry Virden,
Reed Wade, and Jim Wight. Unfortunately, I could not respond to every suggestion, even some that
were excellent.
Thanks to the editors and staff at Prentice Hall. Mark Taub has been very helpful as I progressed
through my first book. Lynn Schneider and Kerry Reardon were excellent copy and production editors,
respectively.

Second Edition Thanks

I get to thank John Ousterhout again, this time for supporting me as I worked in the Tcl/Tk group at
Sun Microsystems. The rest of the group deserve a lot of credit for turning Tcl and Tk into a dynamite
cross-platform solution. Scott Stanton led the Tk port to the PC. Ray Johnson led the Tk port to the
Macintosh. Jacob Levy implemented the event-driven I/O system, Safe-Tcl, and the browser plug-in.
Brian Lewis built the Tcl compiler. Ken Corey worked on Java integration and helped with the
SpecTcl user interface builder. Syd Polk generalized the menu system to work with native widgets on
the Macintosh and Windows. Colin Stevens generalized the font mechanism and worked on
internationalization for Tk.
Stephen Uhler deserves special thanks for inspiring many of the cool examples I use in this book. He
was the lead on the SpecTcl user interface builder. He built the core HTML display library on which I
based an editor. We worked closely together on the first versions of TclHttpd. He taught me how to
write compact, efficient Tcl code and to use regular expression substitutions in amazing ways. I hope
he has learned at least a little from me.
Thanks again to Mark Taub, Eileen Clark, and Martha Williams at Prentice Hall. George Williams
helped me assemble the files for the CD-ROM.

Third Edition Thanks

John Ousterhout continues his wonderful role as Tcl benefactor, now as founder of Scriptics
Corporation. I'd like to thank every one of the great folks that I work with at Scriptics, especially the
pioneering crew of Sarah Daniels, Scott Stanton, Ray Johnson, Bryan Surles, Melissa Hirschl, Lee
Bernhard, Suresh Sastry, Emil Scaffon, Pat P., Scott Redman, and Berry Kercheval. The rest of the
gang deserves a big thanks for making Scriptics such an enjoyable place to work. Jerry Peek, who is a
notable author himself, provided valuable advice and wonderfully detailed comments! Ken Jones told
me about a great indexing tool.
I'd like to thank all the readers that drop me the encouraging note or probing question via e-mail. I am
always interested in new and interesting uses of Tcl!
Thanks to the editors at Prentice Hall: Mark Taub, Joan McNamara, and Joan Eurell. Mark continues
to encourage me to come out with new editions, and the Joans helped me complete this third edition
on time.
Finally, I thank my wonderful wife Jody for her love, kindness, patience, wit, and understanding as I
worked long hours. Happily, many of those hours were spent working from home. I now have three
sons, Christopher, Daniel, and Michael, who get the credit for keeping me from degenerating into a
complete nerd.

Contact the Author

I am always open to comments about this book. My e-mail address is [email protected]. It helps me sort
through my mail if you put the word "book" or the title of the book into the e-mail subject line. Visit
my Web site at:
http://www.beedub.com/
for current news about the book and my other interests.

Part I: Tcl Basics

Part I introduces the basics of Tcl. Everyone should read Chapter 1, which describes the
fundamental properties of the language. Tcl is really quite simple, so beginners can pick it up
quickly. The experienced programmer should review Chapter 1 to eliminate any misconceptions
that come from using other languages.
Chapter 2 is a short introduction to running Tcl and Tk on UNIX, Windows, and Macintosh
systems. You may want to look at this chapter first so you can try out the examples as you read
Chapter 1.
Chapter 3 presents a sample application, a CGI script, that implements a guestbook for a Web
site. The example uses several facilities that are described in detail in later chapters. The goal is
to provide a working example that illustrates the power of Tcl.
The rest of Part I covers basic programming with Tcl. Simple string processing is covered in
Chapter 4. Tcl lists, which share the syntax rules of Tcl commands, are explained in Chapter 5.
Control structures like loops and if statements are described in Chapter 6. Chapter 7 describes Tcl
procedures, which are new commands that you write in Tcl. Chapter 8 discusses Tcl arrays.
Arrays are the most flexible and useful data structure in Tcl. Chapter 9 describes file I/O and
running other programs. These facilities let you build Tcl scripts that glue together other
programs and process data in files.
After reading Part I you will know enough Tcl to read and understand other Tcl programs, and to
write simple programs yourself.

Chapter 1. Tcl Fundamentals

This chapter describes the basic syntax rules for the Tcl scripting language. It describes the basic
mechanisms used by the Tcl interpreter: substitution and grouping. It touches lightly on the following
Tcl commands: puts, format, set, expr, string, while, incr, and proc.
Tcl is a string-based command language. The language has only a few fundamental constructs and
relatively little syntax, which makes it easy to learn. The Tcl syntax is meant to be simple. Tcl is
designed to be a glue that assembles software building blocks into applications. A simpler glue makes
the job easier. In addition, Tcl is interpreted when the application runs. The interpreter makes it easy to
build and refine your application in an interactive manner. A great way to learn Tcl is to try out
commands interactively. If you are not sure how to run Tcl on your system, see Chapter 2 for
instructions for starting Tcl on UNIX, Windows, and Macintosh systems.
This chapter takes you through the basics of the Tcl language syntax. Even if you are an expert
programmer, it is worth taking the time to read these few pages to make sure you understand the
fundamentals of Tcl. The basic mechanisms are all related to strings and string substitutions, so it is
fairly easy to visualize what is going on in the interpreter. The model is a little different from some
other programming languages with which you may already be familiar, so it is worth making sure you
understand the basic concepts.

Tcl Commands
Tcl stands for Tool Command Language. A command does something for you, like output a string,
compute a math expression, or display a widget on the screen. Tcl casts everything into the mold of a
command, even programming constructs like variable assignment and procedure definition. Tcl adds a
tiny amount of syntax needed to properly invoke commands, and then it leaves all the hard work up to
the command implementation.
The basic syntax for a Tcl command is:

command arg1 arg2 arg3 ...

The command is either the name of a built-in command or a Tcl procedure. White space (i.e., spaces or
tabs) is used to separate the command name and its arguments, and a newline (i.e., the end of line
character) or semicolon is used to terminate a command. Tcl does not interpret the arguments to the
commands except to perform grouping, which allows multiple words in one argument, and
substitution, which is used with programming variables and nested command calls. The behavior of
the Tcl command processor can be summarized in three basic steps:

1. Argument grouping.
2. Value substitution of nested commands, variables, and backslash escapes.
3. Command invocation. It is up to the command to interpret its arguments.

This model is described in detail in this chapter.

Hello, World!
Example 1-1 The "Hello, World!" example.

puts stdout {Hello, World!}

=> Hello, World!

In this example, the command is puts, which takes two arguments: an I/O stream identifier and a
string. puts writes the string to the I/O stream along with a trailing newline character. There are two
points to emphasize:

Arguments are interpreted by the command. In the example, stdout is used to identify the
standard output stream. The use of stdout as a name is a convention employed by puts and the
other I/O commands. Also, stderr is used to identify the standard error output, and stdin is
used to identify the standard input. Chapter 9 describes how to open other files for I/O.
Curly braces are used to group words together into a single argument. The puts command
receives Hello, World! as its second argument.

The braces are not part of the value.

The braces are syntax for the interpreter, and they get stripped off before the value is passed to the
command. Braces group all characters, including newlines and nested braces, until a matching brace is
found. Tcl also uses double quotes for grouping. Grouping arguments will be described in more detail
later.
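As a small sketch of the two points above (the message text is made up for illustration), the following commands show braces grouping nested braces and an embedded newline into a single argument, and stderr used in place of stdout. The string command is only touched on in this chapter; it is used here just to count the characters that survive after the braces are stripped.

```tcl
# The braces group everything up to the matching close brace into
# one argument; the inner braces are kept as part of the value.
puts stdout {Hello, {nested} World!}

# A newline inside braces is part of the group and does not
# terminate the command, so puts prints two lines here.
puts stdout {first line
second line}

# stderr names the standard error stream instead of standard output.
puts stderr {This message goes to standard error.}

# The outer braces are stripped before the command sees the value,
# so only the 22 characters between them are counted.
string length {Hello, {nested} World!}
# => 22
```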

Variables
The set command is used to assign a value to a variable. It takes two arguments: The first is the name
of the variable, and the second is the value. Variable names can be any length, and case is significant.
In fact, you can use any character in a variable name.

It is not necessary to declare Tcl variables before you use them.

The interpreter will create the variable when it is first assigned a value. The value of a variable is
obtained later with the dollar-sign syntax, illustrated in Example 1-2:

Example 1-2 Tcl variables.

set var 5
=> 5
set b $var
=> 5

The second set command assigns to variable b the value of variable var. The use of the dollar sign is
our first example of substitution. You can imagine that the second set command gets rewritten by
substituting the value of var for $var to obtain a new command.

set b 5

The actual implementation of substitution is more efficient, which is important when the value is
large.
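The following sketch pulls these rules together. The variable names count, Count, and total are arbitrary choices for illustration.

```tcl
# No declaration is needed: set creates each variable on first assignment.
set count 5

# Case is significant, so Count is a different variable from count.
set Count 10

# $count is replaced by its value, 5, before set is invoked.
set total $count

puts stdout $total
# prints 5
```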

Command Substitution
The second form of substitution is command substitution. A nested command is delimited by square
brackets, [ ]. The Tcl interpreter takes everything between the brackets and evaluates it as a
command. It rewrites the outer command by replacing the square brackets and everything between
them with the result of the nested command. This is similar to the use of backquotes in other shells,
except that it has the additional advantage of supporting arbitrary nesting of commands.

Example 1-3 Command substitution.

set len [string length foobar]

=> 6

In Example 1-3, the nested command is:

string length foobar

This command returns the length of the string foobar. The string command is described in detail
starting on page 45. The nested command runs first.
Then, command substitution causes the outer command to be rewritten as if it were:

set len 6

If there are several cases of command substitution within a single command, the interpreter processes
them from left to right. As each right bracket is encountered, the command it delimits is evaluated.
This results in a sensible ordering in which nested commands are evaluated first so that their result can
be used in arguments to the outer command.
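A short sketch of both rules follows. It assumes the incr command, which returns the new value of the variable it increments, and the string toupper and string range operations, which are not introduced in this chapter.

```tcl
set n 0

# Three nested commands in one word are evaluated left to right,
# so the incr calls return 1, then 2, then 3.
set order [incr n],[incr n],[incr n]
# => 1,2,3

# With nesting, the innermost command runs first and its result
# becomes an argument of the enclosing command.
set upper [string toupper [string range foobar 0 2]]
# => FOO
```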

Math Expressions
The Tcl interpreter itself does not evaluate math expressions. Tcl just does grouping, substitutions and
command invocations. The expr command is used to parse and evaluate math expressions.

Example 1-4 Simple arithmetic.

expr 7.2 / 4
=> 1.8

The math syntax supported by expr is the same as the C expression syntax. The expr command deals
with integer, floating point, and boolean values. Logical operations return either 0 (false) or 1 (true).
Integer values are promoted to floating point values as needed. Octal values are indicated by a leading
zero (e.g., 033 is 27 decimal). Hexadecimal values are indicated by a leading 0x. Scientific notation
for floating point numbers is supported. A summary of the operator precedence is given on page 20.
You can include variable references and nested commands in math expressions. The following
example uses expr to add the value of x to the length of the string foobar. As a result of the innermost
command substitution, the expr command sees 6 + 7, and len gets the value 13:

Example 1-5 Nested commands.

set x 7
set len [expr [string length foobar] + $x]
=> 13

The expression evaluator supports a number of built-in math functions. (For a complete listing, see
page 21.) Example 1-6 computes the value of pi:

Example 1-6 Built-in math functions.

set pi [expr 2*asin(1.0)]
=> 3.1415926535897931

The implementation of expr is careful to preserve accurate numeric values and avoid conversions
between numbers and strings. However, you can make expr operate more efficiently by grouping the
entire expression in curly braces. The explanation has to do with the byte code compiler that Tcl uses
internally, and its effects are explained in more detail on page 15. For now, you should be aware that
these expressions are all valid and run a bit faster than the examples shown above:

Example 1-7 Grouping expressions with braces.

expr {7.2 / 4}
set len [expr {[string length foobar] + $x}]
set pi [expr {2*asin(1.0)}]
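
The value rules described earlier in this section can be checked with a few one-line expressions. This is only a sketch; the variable names are arbitrary, and the expressions are grouped with braces as recommended above.

```tcl
set hex   [expr {0x1b + 1}]   ;# hexadecimal 0x1b is 27, so the result is 28
set idiv  [expr {7 / 2}]      ;# both operands are integers: 3
set fdiv  [expr {7 / 2.0}]    ;# 7 is promoted to floating point: 3.5
set truth [expr {3 > 2}]      ;# logical operations return 0 or 1: 1
set sci   [expr {1e3 + 1}]    ;# scientific notation: 1001.0
```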

Backslash Substitution
The final type of substitution done by the Tcl interpreter is backslash substitution. This is used to
quote characters that have special meaning to the interpreter. For example, you can specify a literal
dollar sign, brace, or bracket by quoting it with a backslash. As a rule, however, if you find yourself
using lots of backslashes, there is probably a simpler way to achieve the effect you are striving for. In
particular, the list command described on page 61 will do quoting for you automatically. In Example
1-8 backslash is used to get a literal $:

Example 1-8 Quoting special characters with backslash.

set dollar \$foo
=> $foo
set x $dollar
=> $foo

Only a single round of interpretation is done.

The second set command in the example illustrates an important property of Tcl. The value of dollar
does not affect the substitution performed in the assignment to x. In other words, the Tcl parser does
not care about the value of a variable when it does the substitution. In the example, the value of x and
dollar is the string $foo. In general, you do not have to worry about the value of variables until you
use eval, which is described in Chapter 10.
You can also use backslash sequences to specify characters with their Unicode, hexadecimal, or octal
value:

set escape \u001b
set escape \x1b
set escape \033

The value of variable escape is the ASCII ESC character, which has character code 27. The table on
page 20 summarizes backslash substitutions.
A common use of backslashes is to continue long commands on multiple lines. This is necessary
because a newline terminates a command. The backslash in the next example is required; otherwise
the expr command gets terminated by the newline after the plus sign.

Example 1-9 Continuing long lines with backslashes.

set totalLength [expr [string length $one] + \
    [string length $two]]

There are two fine points to escaping newlines. First, if you are grouping an argument as described in
the next section, then you do not need to escape newlines; the newlines are automatically part of the
group and do not terminate the command. Second, a backslash as the last character in a line is
converted into a space, and all the white space at the beginning of the next line is replaced by this
substitution. In other words, the backslash-newline sequence also consumes all the leading white space
on the next line.
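The following sketch illustrates the second point, along with the character-code escapes. The variable names are arbitrary, and the double quotes used in the first command are explained in the next section.

```tcl
# Backslash-newline plus the leading white space on the next line
# collapses to a single space, even inside double quotes.
set s "one\
        two"
# => one two

# Outside any group, the same sequence simply separates two words
# of the expr command.
set total [expr 1 + \
                2]
# => 3

# Character codes: \x41 and \u0041 both name the letter A.
set letters \x41\u0041
# => AA
```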

Grouping with Braces and Double Quotes

Double quotes and curly braces are used to group words together into one argument. The difference
between double quotes and curly braces is that quotes allow substitutions to occur in the group, while
curly braces prevent substitutions. This rule applies to command, variable, and backslash substitutions.

Example 1-10 Grouping with double quotes vs. braces.

set s Hello
=> Hello
puts stdout "The length of $s is [string length $s]."
=> The length of Hello is 5.
puts stdout {The length of $s is [string length $s].}
=> The length of $s is [string length $s].

In the second command of Example 1-10, the Tcl interpreter does variable and command substitution
on the second argument to puts. In the third command, substitutions are prevented, so the string is
printed as is.
In practice, grouping with curly braces is used when substitutions on the argument must be delayed
until a later time (or never done at all). Examples include loops, conditional statements, and procedure
declarations. Double quotes are useful in simple cases like the puts command previously shown.
Another common use of quotes is with the format command. This is similar to the C printf function.
The first argument to format is a format specifier that often includes special characters like newlines,
tabs, and spaces. The easiest way to specify these characters is with backslash sequences (e.g., \n for
newline and \t for tab). The backslashes must be substituted before the format command is called, so
you need to use quotes to group the format specifier.

puts [format "Item: %s\t%5.3f" $name $value]

Here format is used to align a name and a value with a tab. The %s and %5.3f indicate how the
remaining arguments to format are to be formatted. Note that the trailing \n usually found in a C
printf call is not needed because puts provides one for us. For more information about the format
command, see page 52.
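To see why the quotes matter, compare the two groupings in the sketch below. The values of name and value are made up for illustration. With double quotes, Tcl replaces \t with a tab before format runs; with braces, format receives the backslash and the letter t unchanged.

```tcl
set name Widget
set value 3.14159

# Double quotes: \t becomes a real tab character.
set quoted [format "Item: %s\t%5.3f" $name $value]
# => Item: Widget<tab>3.142    (18 characters)

# Braces: no backslash substitution, so the result contains a
# literal backslash followed by t.
set braced [format {Item: %s\t%5.3f} $name $value]
# => Item: Widget\t3.142       (19 characters)
```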

Square Brackets Do Not Group

The square bracket syntax used for command substitution does not provide grouping. Instead, a nested
command is considered part of the current group. In the command below, the double quotes group the
last argument, and the nested command is just part of that group.

puts stdout "The length of $s is [string length $s]."

If an argument is made up only of a nested command, you do not need to group it with double-quotes
because the Tcl parser treats the whole nested command as part of the group.

puts stdout [string length $s]

The following is a redundant use of double quotes:

puts stdout "[expr $x + $y]"

Grouping before Substitution

The Tcl parser makes a single pass through a command as it makes grouping decisions and performs
string substitutions. Grouping decisions are made before substitutions are performed, which is an
important property of Tcl. This means that the values being substituted do not affect grouping because
the grouping decisions have already been made.
The following example demonstrates how nested command substitution affects grouping. A nested
command is treated as an unbroken sequence of characters, regardless of its internal structure. It is
included with the surrounding group of characters when collecting arguments for the main command.

Example 1-11 Embedded command and variable substitution.

set x 7; set y 9
puts stdout $x+$y=[expr $x + $y]
=> 7+9=16

In Example 1-11, the second argument to puts is:

$x+$y=[expr $x + $y]

The white space inside the nested command is ignored for the purposes of grouping the argument. By
the time Tcl encounters the left bracket, it has already done some variable substitutions to obtain:

7+9=

When the left bracket is encountered, the interpreter calls itself recursively to evaluate the nested
command. Again, the $x and $y are substituted before calling expr. Finally, the result of expr is
substituted for everything from the left bracket to the right bracket. The puts command gets the
following as its second argument:

7+9=16

Grouping before substitution.

The point of this example is that the grouping decision about puts's second argument is made before
the command substitution is done. Even if the result of the nested command contained spaces or other
special characters, they would be ignored for the purposes of grouping the arguments to the outer
command. Grouping and variable substitution interact the same as grouping and command
substitution. Spaces or special characters in variable values do not affect grouping decisions because
these decisions are made before the variable values are substituted.
If you want the output to look nicer in the example, with spaces around the + and =, then you must use
double quotes to explicitly group the argument to puts:

puts stdout "$x + $y = [expr $x + $y]"

The double quotes are used for grouping in this case to allow the variable and command substitution
on the argument to puts.
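The same rule can be seen with values that contain spaces. In this sketch the variable names are arbitrary; the point is that each set command below receives exactly two arguments, no matter what the substituted value looks like.

```tcl
set greeting {Hello, World!}

# The value contains a space, yet $greeting is still a single word
# because grouping was decided before the substitution.
set copy $greeting
# => Hello, World!

# The result of a nested command may contain spaces too; it still
# forms one argument of the outer set command.
set joined [format {%s %s} Hello World]
# => Hello World
```

Typing the words without grouping, as in set copy Hello, World!, would instead pass three arguments to set and raise an error.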

Grouping Math Expressions with Braces

It turns out that expr does its own substitutions inside curly braces. This is explained in more detail on
page 15. This means you can write commands like the one below and the substitutions on the variables
in the expression still occur:

puts stdout "$x + $y = [expr {$x + $y}]"

More Substitution Examples
If you have several substitutions with no white space between them, you can avoid grouping with
quotes. The following command sets concat to the value of variables a, b, and c all concatenated
together:

set concat $a$b$c

Again, if you want to add spaces, you'll need to use quotes:

set concat "$a $b $c"

In general, you can place a bracketed command or variable reference anywhere. The following
computes a command name:

[findCommand $x] arg arg

When you use Tk, you often use widget names as command names:

$text insert end "Hello, World!"
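
A sketch of a computed command name that can be tried without Tk follows. The variables cmd and sub are hypothetical names chosen for illustration; the built-in string command stands in for a widget command.

```tcl
set cmd string
set sub length

# The first word is a variable reference, so its value names the command.
set n [$cmd $sub foobar]
# => 6

# A nested command can supply the command name as well.
set m [[format %s%s str ing] length foobar]
# => 6
```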

Procedures
Tcl uses the proc command to define procedures. Once defined, a Tcl procedure is used just like any
of the other built-in Tcl commands. The basic syntax to define a procedure is:

proc name arglist body

The first argument is the name of the procedure being defined. The second argument is a list of
parameters to the procedure. The third argument is a command body that is one or more Tcl
commands.
The procedure name is case sensitive, and in fact it can contain any characters. Procedure names and
variable names do not conflict with each other. As a convention, this book begins procedure names
with uppercase letters and it begins variable names with lowercase letters. Good programming style is
important as your Tcl scripts get larger. Tcl coding style is discussed in Chapter 12.

Example 1-12 Defining a procedure.

proc Diag {a b} {
    set c [expr sqrt($a * $a + $b * $b)]
    return $c
}
puts "The diagonal of a 3, 4 right triangle is [Diag 3 4]"
=> The diagonal of a 3, 4 right triangle is 5.0

The Diag procedure defined in the example computes the length of the diagonal side of a right triangle
given the lengths of the other two sides. The sqrt function is one of many math functions supported
by the expr command. The variable c is local to the procedure; it is defined only during execution of
Diag. Variable scope is discussed further in Chapter 7. It is not really necessary to use the variable c in
this example. The procedure can also be written as:

proc Diag {a b} {
    return [expr sqrt($a * $a + $b * $b)]
}

The return command is used to return the result of the procedure. The return command is optional in
this example because the Tcl interpreter returns the value of the last command in the body as the value
of the procedure. So, the procedure could be reduced to:

proc Diag {a b} {
    expr sqrt($a * $a + $b * $b)
}

Note the stylized use of curly braces in the example. The curly brace at the end of the first line starts
the third argument to proc, which is the command body. In this case, the Tcl interpreter sees the
opening left brace, causing it to ignore newline characters and scan the text until a matching right
brace is found. Double quotes have the same property. They group characters, including newlines,
until another double quote is found. The result of the grouping is that the third argument to proc is a
sequence of commands. When they are evaluated later, the embedded newlines will terminate each
command.
The other crucial effect of the curly braces around the procedure body is to delay any substitutions in
the body until the time the procedure is called. For example, the variables a, b, and c are not defined
until the procedure is called, so we do not want to do variable substitution at the time Diag is defined.
The proc command supports additional features such as having variable numbers of arguments and
default values for arguments. These are described in detail in Chapter 7.
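The sketch below illustrates two of the points made in this section. Square is a hypothetical procedure written for this illustration; it shows that the body is substituted only when the procedure is called, and that a variable and a procedure can share a name.

```tcl
proc Square {x} {
    # $x is substituted each time Square is called, not when proc
    # defines it, because the braces delay substitution.
    expr {$x * $x}
}

# A variable may share a name with a procedure without conflict.
set Square 5
set result [Square $Square]
# => 25
```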

A Factorial Example
To reinforce what we have learned so far, below is a longer example that uses a while loop to
compute the factorial function:

Example 1-13 A while loop to compute factorial.

proc Factorial {x} {
    set i 1; set product 1
    while {$i <= $x} {
        set product [expr $product * $i]
        incr i
    }
    return $product
}
Factorial 10
=> 3628800

The semicolon is used on the first line to remind you that it is a command terminator just like the
newline character. The while loop is used to multiply all the numbers from one up to the value of x.
The first argument to while is a boolean expression, and its second argument is a command body to
execute. The while command and other control structures are described in Chapter 6.
The same math expression evaluator used by the expr command is used by while to evaluate the
boolean expression. There is no need to explicitly use the expr command in the first argument to
while, even if you have a much more complex expression.
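For instance, a compound test can be written directly as the first argument. In this sketch, which uses made-up variable names, the loop adds 1, 2, 3, and so on until either the counter reaches 10 or the running sum reaches 20.

```tcl
set i 0
set sum 0

# The && operator is handled by the expression evaluator inside
# while; no explicit expr command is needed for the test.
while {$i < 10 && $sum < 20} {
    incr i
    set sum [expr {$sum + $i}]
}
# The loop stops with i equal to 6 and sum equal to 21.
```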

The loop body and the procedure body are grouped with curly braces in the same way. The opening
curly brace must be on the same line as proc and while. If you like to put opening curly braces on the
line after a while or if statement, you must escape the newline with a backslash:

while {$i < $x}\
{
    set product ...
}

Always group expressions and command bodies with curly braces.

Curly braces around the boolean expression are crucial because they delay variable substitution until
the while command implementation tests the expression. The following example is an infinite loop:

set i 1; while $i<=10 {incr i}

The loop will run indefinitely.[*] The reason is that the Tcl interpreter will substitute for $i before
while is called, so while gets a constant expression 1<=10 that will always be true. You can avoid
these kinds of errors by adopting a consistent coding style that groups expressions with curly braces:

set i 1; while {$i<=10} {incr i}

[*] Ironically, Tcl 8.0 introduced a byte-code compiler, and the first releases of Tcl 8.0 had a bug in
the compiler that caused this loop to terminate! This bug is fixed in the 8.0.5 patch release.

The incr command is used to increment the value of the loop variable i. This is a handy command
that saves us from the longer command:

set i [expr $i + 1]

The incr command can take an additional argument, a positive or negative integer by which to change
the value of the variable. Using this form, it is possible to eliminate the loop variable i and just modify
the parameter x. The loop body can be written like this:

while {$x > 1} {
    set product [expr $product * $x]
    incr x -1
}
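
Put together, the whole procedure might look like the following sketch. It assumes product is initialized to 1 as in Example 1-13, and it groups the expression with braces in line with the advice above.

```tcl
proc Factorial {x} {
    set product 1
    # Count x down to 1, multiplying as we go. The parameter x is
    # local to the procedure, so modifying it is safe.
    while {$x > 1} {
        set product [expr {$product * $x}]
        incr x -1
    }
    return $product
}

Factorial 10
# => 3628800
```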

Example 1-14 shows factorial again, this time using a recursive definition. A recursive function is one
that calls itself to complete its work. Each recursive call decrements x by one, and when x is one, then
the recursion stops.

Example 1-14 A recursive definition of factorial.

proc Factorial {x} {
    if {$x <= 1} {
        return 1
    } else {
        return [expr $x * [Factorial [expr $x - 1]]]
    }
}

More about Variables

The set command will return the value of a variable if it is only passed a single argument. It treats that
argument as a variable name and returns the current value of the variable. The dollar-sign syntax used
to get the value of a variable is really just an easy way to use the set command. Example 1-15 shows a
trick you can play by putting the name of one variable into another variable:

Example 1-15 Using set to return a variable value.

set var {the value of var}
=> the value of var
set name var
=> var
set name
=> var
set $name
=> the value of var

This is a somewhat tricky example. In the last command, $name gets substituted with var. Then, the
set command returns the value of var, which is the string "the value of var". Nested set commands provide
another way to achieve a level of indirection. The last set command above can be written as follows:

set [set name]
=> the value of var

Using a variable to store the name of another variable may seem overly complex. However, there are
some times when it is very useful. There is even a special command, upvar, that makes this sort of
trick easier. The upvar command is described in detail in Chapter 7.
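The same indirection works for assignment as well as for reading. In this sketch the variable names fruit and ref are made up for illustration; ref holds the name of the variable to operate on.

```tcl
set fruit apple
set ref fruit

# Read through the reference: $ref is replaced by fruit, and
# one-argument set returns the value of that variable.
set before [set $ref]
# => apple

# Write through the reference: this assigns to fruit, not to ref.
set $ref banana
set after $fruit
# => banana
```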

Funny Variable Names

The Tcl interpreter makes some assumptions about variable names that make it easy to embed variable
references into other strings. By default, it assumes that variable names contain only letters, digits, and
the underscore. The construct $foo.o represents a concatenation of the value of foo and the literal
".o".

If the variable reference is not delimited by punctuation or white space, then you can use curly braces
to explicitly delimit the variable name (e.g., ${x}). You can also use this to reference variables with
funny characters in their name, although you probably do not want variables named like that. If you
find yourself using funny variable names, or computing the names of variables, then you may want to
use the upvar command.

Example 1-16 Embedded variable references.

set foo filename
set object $foo.o
=> filename.o
set a AAA
set b abc${a}def
=> abcAAAdef
set .o yuk!
set x ${.o}y
=> yuk!y

The unset Command

You can delete a variable with the unset command:

unset varName varName2 ...

Any number of variable names can be passed to the unset command. However, unset will raise an
error if a variable is not already defined.
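A brief sketch of both behaviors follows. It borrows info exists from the next section and the catch command, which is not covered in this chapter; catch returns 1 when the command it runs raises an error.

```tcl
set a 1; set b 2

# Several variables can be deleted with one command.
unset a b
set gone [expr {![info exists a] && ![info exists b]}]
# => 1

# Deleting a again fails because it is no longer defined; catch
# traps the error and stores the error text in message.
set failed [catch {unset a} message]
# => 1
```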

Using info to Find Out about Variables

The existence of a variable can be tested with the info exists command. For example, because incr
requires that a variable exist, you might have to test for the existence of the variable first.

Example 1-17 Using info to determine if a variable exists.

if {![info exists foobar]} {
    set foobar 0
} else {
    incr foobar
}

Example 7-6 on page 86 implements a new version of incr which handles this case.

More about Math Expressions

This section describes a few fine points about math in Tcl scripts. In Tcl 7.6 and earlier versions math
is not that efficient because of conversions between strings and numbers. The expr command must
convert its arguments from strings to numbers. It then does all its computations with double precision
floating point values. The result is formatted into a string that has, by default, 12 significant digits.
This number can be changed by setting the tcl_precision variable to the number of significant digits
desired. Seventeen digits of precision are enough to ensure that no information is lost when converting
back and forth between a string and an IEEE double precision number:

Example 1-18 Controlling precision with tcl_precision.

expr 1 / 3
=> 0
expr 1 / 3.0
=> 0.333333333333
set tcl_precision 17
=> 17
expr 1 / 3.0
# The trailing 1 is the IEEE rounding digit
=> 0.33333333333333331

In Tcl 8.0 and later versions, the overhead of conversions is eliminated in most cases by the built-in
compiler. Even so, Tcl was not designed to support math-intensive applications. You may want to
implement math-intensive code in a compiled language and register the function as a Tcl command as
described in Chapter 44.

There is support for string comparisons by expr, so you can test string values in if statements. You
must use quotes so that expr knows to do string comparisons:

if {$answer == "yes"} {... }

However, the string compare and string equal commands described in Chapter 4 are more
reliable because expr may do conversions on strings that look like numbers. The issues with string
operations and expr are discussed on page 48.
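
The difference shows up with values that look like numbers. This is a small sketch with made-up variable names, not an example from the book:

```tcl
set a "1.0"
set b "1"

# expr converts both values to numbers, so they compare as equal.
set numeric [expr {$a == $b}]

# string equal compares the characters, so the values differ.
set textual [string equal $a $b]

puts "expr ==      : $numeric"
puts "string equal : $textual"
```

If the values are answers typed by a user, string equal is probably the comparison you want.
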
Expressions can include variable and command substitutions and still be grouped with curly braces.
This is because an argument to expr is subject to two rounds of substitution: one by the Tcl interpreter,
and a second by expr itself. Ordinarily this is not a problem because math values do not contain the
characters that are special to the Tcl interpreter. The second round of substitutions is needed to support
commands like while and if that use the expression evaluator internally.

Grouping expressions can make them run more efficiently.

You should always group expressions in curly braces and let expr do command and variable
substitutions. Otherwise, your values may suffer extra conversions from numbers to strings and back
to numbers. Not only is this process slow, but the conversions can lose precision in certain
circumstances. For example, suppose x is computed from a math function:

set x [expr {sqrt(2.0)}]

At this point the value of x is a double-precision floating point value, just as you would expect. If you
do this:

set two [expr $x * $x]

then you may or may not get 2.0 as the result! This is because Tcl will substitute $x and expr will
concatenate all its arguments into one string, and then parse the expression again. In contrast, if you do
this:

set two [expr {$x * $x}]

then expr will do the substitutions, and it will be careful to preserve the floating point value of x. The
expression will be more accurate and run more efficiently because no string conversions will be done.
The story behind Tcl values is described in more detail in Chapter 44 on C programming and Tcl.
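
One way to see the cost of the extra conversions is to measure both forms with the time command. This is only a sketch: the iteration count is arbitrary, and the numbers you get depend on your Tcl version and machine.

```tcl
set x [expr {sqrt(2.0)}]

# time runs a script repeatedly and reports the average cost per iteration.
# Unbraced: Tcl substitutes $x, and expr parses the resulting string each time.
set unbraced [time {expr $x * $x} 10000]

# Braced: expr does the substitution itself and keeps the floating point value.
set braced [time {expr {$x * $x}} 10000]

puts "unbraced: $unbraced"
puts "braced:   $braced"
```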

Comments
Tcl uses the pound character, #, for comments. Unlike in many other languages, the # must occur at
the beginning of a command. A # that occurs elsewhere is not treated specially. An easy trick to
append a comment to the end of a command is to precede the # with a semicolon to terminate the
previous command:

# Here are some parameters

set rate 7.0 ;# The interest rate
set months 60 ;# The loan term

One subtle effect to watch for is that a backslash effectively continues a comment line onto the next
line of the script. In addition, a semicolon inside a comment is not significant. Only a newline
terminates comments:

# Here is the start of a Tcl comment \
and some more of it; still in the comment

The behavior of a backslash in comments is pretty obscure, but it can be exploited as shown in
Example 2-3 on page 27.

A surprising property of Tcl comments is that curly braces inside comments are still counted when
the interpreter looks for matching braces. I think the motivation for this mis-feature was to keep the
original Tcl parser simpler. However, it means that the following will not work as expected to
comment out an alternate version of an if expression:

# if {boolean expression1} {
if {boolean expression2} {
    some commands
}

The previous sequence results in an extra left curly brace, and probably a complaint about a missing
close brace at the end of your script! A technique I use to comment out large chunks of code is to put
the code inside an if block that will never execute:

if {0} {
    unused code here
}

Substitution and Grouping Summary

The following rules summarize the fundamental mechanisms of grouping and substitution that are
performed by the Tcl interpreter before it invokes a command:

Command arguments are separated by white space, unless arguments are grouped with curly
braces or double quotes as described below.

Grouping with curly braces, { }, prevents substitutions. Braces nest. The interpreter includes all
characters between the matching left and right brace in the group, including newlines,
semicolons, and nested braces. The enclosing (i.e., outermost) braces are not included in the
group's value.

Grouping with double quotes, " ", allows substitutions. The interpreter groups everything until
another double quote is found, including newlines and semicolons. The enclosing quotes are not
included in the group of characters. A double-quote character can be included in the group by
quoting it with a backslash, (e.g., \").

Grouping decisions are made before substitutions are performed, which means that the values of
variables or command results do not affect grouping.

A dollar sign, $, causes variable substitution. Variable names can be any length, and case is
significant. If variable references are embedded into other strings, or if they include characters
other than letters, digits, and the underscore, they can be distinguished with the ${varname}
syntax.

Square brackets, [ ], cause command substitution. Everything between the brackets is treated as
a command, and everything including the brackets is replaced with the result of the command.
Nesting is allowed.

The backslash character, \, is used to quote special characters. You can think of this as another
form of substitution in which the backslash and the next character or group of characters are
replaced with a new character.

Substitutions can occur anywhere unless prevented by curly brace grouping. Part of a group can
be a constant string, and other parts of it can be the result of substitutions. Even the command
name can be affected by substitutions.

A single round of substitutions is performed before command invocation. The result of a
substitution is not interpreted a second time. This rule is important if you have a variable value or
a command result that contains special characters such as spaces, dollar signs, square brackets, or
braces. Because only a single round of substitution is done, you do not have to worry about
special characters in values causing extra substitutions.
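
Two of these rules, the single round of substitution and grouping before substitution, can be checked with a short script. It is a sketch with made-up values; there is no command named cheap, which is the point of the example.

```tcl
set price {$5 [cheap]}

# $price is substituted once. Its value is not scanned again,
# so the $ and the [ ] inside it are left alone.
set label "Cost: $price"
puts $label

# Grouping is decided before substitution, so the space in the
# value does not split the argument: list receives one argument.
set count [llength [list $price]]
puts $count
```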

Fine Points
A common error is to forget a space between arguments when grouping with braces or quotes.
This is because white space is used as the separator, while the braces or quotes only provide
grouping. If you forget the space, you will get syntax errors about unexpected characters after the
closing brace or quote. The following is an error because of the missing space between } and {:

if {$x > 1}{puts "x = $x"}

A double quote is only used for grouping when it comes after white space. This means you can
include a double quote in the middle of a group without quoting it with a backslash. This requires
that curly braces or white space delimit the group. I do not recommend using this obscure
feature, but this is what it looks like:

set silly a"b

When double quotes are used for grouping, the special effect of curly braces is turned off.
Substitutions occur everywhere inside a group formed with double quotes. In the next command,
the variables are still substituted:

set x xvalue
set y "foo {$x}bar"
=> foo {xvalue}bar

When double quotes are used for grouping and a nested command is encountered, the nested
command can use double quotes for grouping, too.

puts "results [format "%f %f" $x $y]"

Spaces are not required around the square brackets used for command substitution. For the
purposes of grouping, the interpreter considers everything between the square brackets as part of
the current group. The following sets x to the concatenation of two command results because
there is no space between ] and [.

set x [cmd1][cmd2]

Newlines and semicolons are ignored when grouping with braces or double quotes. They get
included in the group of characters just like all the others. The following sets x to a string that
contains newlines:

set x "This is line one.
This is line two.
This is line three."

During command substitution, newlines and semicolons are significant as command terminators.
If you have a long command that is nested in square brackets, put a backslash before the newline
if you want to continue the command on another line. This was illustrated in Example 1-9 on
page 8.
A dollar sign followed by something other than a letter, digit, underscore, or left parenthesis is
treated as a literal dollar sign. The following sets x to the single character $.

set x $
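
Several of these fine points can be tried out together. In this sketch, first and second are hypothetical procedures that stand in for cmd1 and cmd2:

```tcl
# Two hypothetical commands that each return a short string.
proc first {} {return abc}
proc second {} {return def}

# No space between ] and [, so the two results form one argument.
set x [first][second]
puts $x

# A double quote in the middle of a word is an ordinary character.
set silly a"b
puts $silly

# A $ that is not followed by a variable name is a literal $.
set dollar $
puts $dollar
```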

Reference

Backslash Sequences

Table 1-1. Backslash sequences.

\a Bell. (0x7)
\b Backspace. (0x8)
\f Form feed. (0xc)
\n Newline. (0xa)
\r Carriage return. (0xd)
\t Tab. (0x9)
\v Vertical tab. (0xb)
\<newline> Replace the newline and the leading white space on the next line with a space.
\\ Backslash. ('\')
\ooo Octal specification of character code. 1, 2, or 3 digits.
\xhh Hexadecimal specification of character code. 1 or 2 digits.
\uhhhh Hexadecimal specification of a 16-bit Unicode character value. 4 hex digits.
\c Replaced with literal c if c is not one of the cases listed above. In particular, \$, \", \{,
\}, \], and \[ are used to obtain these characters.
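
A few of the sequences in Table 1-1 are shown in this sketch. The variable names are made up for illustration.

```tcl
# \t inserts a tab between the two column names.
set header "name\tvalue"
puts $header

# Octal, hexadecimal, and Unicode forms of the same character, A.
set letters "\101 \x41 \u0041"
puts $letters

# Backslash-newline and the leading white space that follows
# are replaced with a single space.
set long "one\
          two"
puts $long
```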

Arithmetic Operators
Table 1-2. Arithmetic operators from highest to lowest precedence.

- ~ ! Unary minus, bitwise NOT, logical NOT.

* / % Multiply, divide, remainder.
+ - Add, subtract.
<< >> Left shift, right shift.
< > <= >= Comparison: less, greater, less or equal, greater or equal.
== != Equal, not equal.
& Bitwise AND.
^ Bitwise XOR.
| Bitwise OR.
&& Logical AND.
|| Logical OR.
x?y:z If x then y else z.
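
The precedence order in Table 1-2 decides how an expression without parentheses is grouped. A short sketch, with the implied grouping written out in the comments:

```tcl
# * binds tighter than +, so this is 2 + (3 * 4).
set a [expr {2 + 3 * 4}]

# + binds tighter than <<, so this is 1 << (2 + 1).
set b [expr {1 << 2 + 1}]

# & binds tighter than |, so this is (6 & 3) | 8.
set c [expr {6 & 3 | 8}]

# Comparison binds tighter than ?:, so the test is ($a > $b).
set d [expr {$a > $b ? "larger" : "smaller"}]

puts "$a $b $c $d"
```

When in doubt, add parentheses; they cost nothing and make the intent plain.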

Built-in Math Functions

Table 1-3. Built-in math functions.

acos(x) Arc cosine of x.

asin(x) Arc sine of x.
atan(x) Arc tangent of x.
atan2(y,x) Rectangular (x,y) to polar (r,th). atan2 gives th.
ceil(x) Least integral value greater than or equal to x.
cos(x) Cosine of x.
cosh(x) Hyperbolic cosine of x.
exp(x) Exponential, e^x.
floor(x) Greatest integral value less than or equal to x.
fmod(x,y) Floating point remainder of x/y.
hypot(x,y) Returns sqrt(x*x + y*y). r part of polar coordinates.
log(x) Natural log of x.
log10(x) Log base 10 of x.
pow(x,y) x to the y power, x^y.
sin(x) Sine of x.
sinh(x) Hyperbolic sine of x.
sqrt(x) Square root of x.
tan(x) Tangent of x.
tanh(x) Hyperbolic tangent of x.
abs(x) Absolute value of x.
double(x) Promote x to floating point.
int(x) Truncate x to an integer.
round(x) Round x to an integer.
rand() Return a random floating point value between 0.0 and 1.0.
srand(x) Set the seed for the random number generator to the integer x.
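
The following sketch exercises a few of the functions in Table 1-3. The values are arbitrary, and the exact formatting of floating point results depends on your Tcl version and on tcl_precision.

```tcl
# hypot and atan2 convert rectangular (x,y) to polar (r,th).
set x 3.0
set y 4.0
set r  [expr {hypot($x, $y)}]
set th [expr {atan2($y, $x)}]
puts "r = $r, th = $th radians"

# int truncates, while round goes to the nearest integer.
puts "int(3.7)   = [expr {int(3.7)}]"
puts "round(3.7) = [expr {round(3.7)}]"

# double promotes an integer so that division keeps the fraction.
puts "7 / 2         = [expr {7 / 2}]"
puts "double(7) / 2 = [expr {double(7) / 2}]"
```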

Core Tcl Commands

The pages listed in Table 1-4 give the primary references for the command.

Table 1-4. Built-in Tcl commands.

Command Pg. Description

after 218 Schedule a Tcl command for later execution.
append 51 Append arguments to a variable's value. No spaces added.
array 91 Query array state and search through elements.
binary 54 Convert between strings and binary data.
break 77 Exit loop prematurely.
catch 77 Trap errors.
cd 115 Change working directory.
clock 173 Get the time and format date strings.
close 115 Close an open I/O stream.
concat 61 Concatenate arguments with spaces between. Splices lists.
console 28 Control the console used to enter commands interactively.
continue 77 Continue with next loop iteration.
error 79 Raise an error.
eof 109 Check for end of file.
eval 122 Concatenate arguments and evaluate them as a command.
exec 99 Fork and execute a UNIX program.
exit 116 Terminate the process.
expr 6 Evaluate a math expression.
fblocked 223 Poll an I/O channel to see if data is ready.
fconfigure 221 Set and query I/O channel properties.
fcopy 239 Copy from one I/O channel to another.
file 102 Query the file system.
fileevent 219 Register callback for event-driven I/O.
flush 109 Flush output from an I/O stream's internal buffers.
for 76 Loop construct similar to C for statement.
foreach 73 Loop construct over a list, or lists, of values.
format 52 Format a string similar to C sprintf.
gets 112 Read a line of input from an I/O stream.
glob 115 Expand a pattern to matching file names.
global 84 Declare global variables.
history 185 Use command-line history.
if 70 Conditional command. Allows else and elseif clauses.
incr 12 Increment a variable by an integer amount.
info 176 Query the state of the Tcl interpreter.
interp 276 Create additional Tcl interpreters.
join 65 Concatenate list elements with a given separator string.
lappend 61 Add elements to the end of a list.
lindex 63 Fetch an element of a list.
linsert 64 Insert elements into a list.
list 61 Create a list out of the arguments.
llength 63 Return the number of elements in a list.
load 609 Load shared libraries that define Tcl commands.
lrange 63 Return a range of list elements.
lreplace 64 Replace elements of a list.
lsearch 64 Search for an element of a list that matches a pattern.
lsort 65 Sort a list.
namespace 203 Create and manipulate namespaces.
open 110 Open a file or process pipeline for I/O.
package 165 Provide or require code packages.
pid 116 Return the process ID.
proc 81 Define a Tcl procedure.
puts 112 Output a string to an I/O stream.
pwd 115 Return the current working directory.
read 113 Read blocks of characters from an I/O stream.
regexp 148 Match regular expressions.
regsub 152 Substitute based on regular expressions.
rename 82 Change the name of a Tcl command.
return 80 Return a value from a procedure.
scan 54 Parse a string according to a format specification.
seek 114 Set the seek offset of an I/O stream.
set 5 Assign a value to a variable.
socket 228 Open a TCP/IP network connection.
source 26 Evaluate the Tcl commands in a file.
split 65 Chop a string up into list elements.
string 45 Operate on strings.
subst 132 Substitute embedded commands and variable references.
switch 71 Multi-way branch.
tell 114 Return the current seek offset of an I/O stream.
time 191 Measure the execution time of a command.
trace 183 Monitor variable assignments.
unknown 167 Handle unknown commands.
unset 13 Delete variables.
uplevel 130 Execute a command in a different scope.
upvar 85 Reference a variable in a different scope.
variable 197 Declare namespace variables.
vwait 220 Wait for a variable to be modified.
while 73 Loop until a boolean expression is false.

Part I. Tcl Basics

Chapter 2. Getting Started

This chapter explains how to run Tcl and Tk on different operating system platforms: UNIX,
Windows, and Macintosh. Tcl commands discussed are: source, console and info.

This chapter explains how to run Tcl scripts on different computer systems. While you can write Tcl
scripts that are portable among UNIX, Windows, and Macintosh, the details about getting started are
different for each system. If you are looking for a current version of Tcl/Tk, check the Internet sites
listed in the Preface on page lii.

The main Tcl/Tk program is wish. Wish stands for windowing shell, and with it you can create
graphical applications that run on all these platforms. The name of the program is a little different on
each of the UNIX, Windows, and Macintosh systems. On UNIX it is just wish. On Windows you will
find wish.exe, and on the Macintosh the application name is Wish. A version number may also be part
of the name, such as wish4.2, wish80.exe, or Wish 8.2. The differences among versions are introduced
on page xlviii, and described in more detail in Part VII of the book. This book will use wish to refer to
all of these possibilities.

Tk adds Tcl commands that are used to create graphical user interfaces, and Tk is described in Part III.
You can run Tcl without Tk if you do not need a graphical interface, such as with the CGI script
discussed in Chapter 3. In this case the program is tclsh, tclsh.exe or Tclsh.

When you run wish, it displays an empty window and prompts for a Tcl command with a % prompt.
You can enter Tcl commands interactively and experiment with the examples in this book. On
Windows and Macintosh, a console window is used to prompt for Tcl commands. On UNIX, your
terminal window is used. As described later, you can also set up stand alone Tcl/Tk scripts that are
self-contained applications.

The source Command

You can enter Tcl commands interactively at the % prompt. It is a good idea to try out the examples in
this book as you read along. The highlighted examples from the book are on the CD-ROM in the
exsource folder. You can edit these scripts in your favorite editor. Save your examples to a file and
then execute them with the Tcl source command:

source filename

The source command reads Tcl commands from a file and evaluates them just as if you had typed
them interactively.
Chapter 3 develops a sample application. To get started, just open an editor on a file named cgi1.tcl.
Each time you update this file you can save it, reload it into Tcl with the source command, and test it
again. Development goes quickly because you do not wait for things to compile!
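
The edit, save, and source cycle can be tried without leaving Tcl. This sketch writes a one-line script to a file and then sources it; the file name demo_source.tcl and the variable greeting are made up for illustration.

```tcl
# Create a small script file in the current directory.
set path [file join [pwd] demo_source.tcl]
set out [open $path w]
puts $out {set greeting "Hello from a sourced file"}
close $out

# source evaluates the commands in the file as if they were typed here.
source $path
puts $greeting

# Clean up the demonstration file.
file delete $path
```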

UNIX Tcl Scripts

On UNIX you can create a stand alone Tcl or Tcl/Tk script much like an sh or csh script. The trick is
in the first line of the file that contains your script. If the first line of a file begins with #!pathname,
then UNIX uses pathname as the interpreter for the rest of the script. The "Hello, World!" program
from Chapter 1 is repeated in Example 2-1 with the special starting line:

Example 2-1 A standalone Tcl script on UNIX.

#!/usr/local/bin/tclsh
puts stdout {Hello, World!}

Similarly, the Tk hello world program from Chapter 21 is shown in Example 2-2:

Example 2-2 A standalone Tk script on UNIX.

#!/usr/local/bin/wish
button .hello -text Hello -command {puts "Hello, World!"}
pack .hello -padx 10 -pady 10

The actual pathnames for tclsh and wish may be different on your system. If you type the pathname for
the interpreter wrong, you receive a confusing "command not found" error. You can find out the
complete pathname of the Tcl interpreter with the info nameofexecutable command. This is what
appears on my system:

info nameofexecutable
=> /home/welch/install/solaris/bin/tclsh8.2

Watch out for long pathnames.

On most UNIX systems, this special first line is limited to 32 characters, including the #!. If the
pathname is too long, you may end up with /bin/sh trying to interpret your script, giving you syntax
errors. You might try using a symbolic link from a short name to the true, long name of the interpreter.
However, watch out for systems like Solaris in which the script interpreter cannot be a symbolic link.
Fortunately, Solaris doesn't impose a 32-character limit on the pathname, so you can just use a long
pathname.
The next example shows a trick that works around the pathname length limitation in all cases. The
trick comes from a posting to comp.lang.tcl by Kevin Kenny. It takes advantage of a difference
between comments in Tcl and the Bourne shell. Tcl comments are described on page 16. In the
example, the Bourne shell command that runs the Tcl interpreter is hidden in a comment as far as Tcl
is concerned, but it is visible to /bin/sh:

Example 2-3 Using /bin/sh to run a Tcl script.

#!/bin/sh
# The backslash makes the next line a comment in Tcl \
exec /some/very/long/path/to/wish "$0" ${1+"$@"}
# ... Tcl script goes here ...

You do not even have to know the complete pathname of tclsh or wish to use this trick. You can just
do the following:

#!/bin/sh
# Run wish from the users PATH \
exec wish -f "$0" ${1+"$@"}

The drawback of an incomplete pathname is that many sites have different versions of wish and tclsh
that correspond to different versions of Tcl and Tk. In addition, some users may not have these
programs in their PATH.
If you have Tk version 3.6 or earlier, its version of wish requires a -f argument to make it read the
contents of a file. The -f switch is ignored in Tk 4.0 and higher versions. The -f, if required, is also
counted in the 32-character limit on #! lines.

#!/usr/local/bin/wish -f

Windows 95 Start Menu

You can add your Tcl/Tk programs to the Windows start menu. The command is the complete name of
the wish.exe program and the name of the script. The trick is that the pathname of wish.exe has a space in
it in the default configuration, so you must use quotes. Your start command will look something like
this:

"c:\Program Files\TCL82\wish.exe" "c:\My Files\script.tcl"

This starts c:\My Files\script.tcl as a stand alone Tcl/Tk program.

The Macintosh and ResEdit

If you want to create a self-contained Tcl/Tk application on Macintosh, you must copy the Wish
program and add a Macintosh resource named tclshrc that has the start-up Tcl code. The Tcl code
can be a single source command that reads your script file. Here are step-by-step instructions to create
the resource using ResEdit:

First, make a copy of Wish and open the copy in ResEdit.

Pull down the Resource menu and select Create New Resource operation to make a new TEXT
resource.
ResEdit opens a window and you can type in text. Type in a source command that names your
script:

source "Hard Disk:Tcl/Tk 8.1:Applications:MyScript.tcl"

Set the name of the resource to be tclshrc. You do this through the Get Resource Info dialog
under the Resources menu in ResEdit.
This sequence of commands is captured in an application called "Drag n Drop Tclets", which comes
with the Macintosh Tcl distribution. If you drag a Tcl script onto this icon, it will create a copy of Wish
and create the tclshrc text resource that has a source command that will load that script.
If you have a Macintosh development environment, you can build a version of Wish that has additional
resources built right in. You add the resources to the applicationInit.r file. If a resource contains
Tcl code, you use it like this:

source -rsrc resource

If you don't want to edit resources, you can just use the Wish Source menu to select a script to run.

The console Command

The Windows and Macintosh platforms have a built-in console that is used to enter Tcl commands
interactively. You can control this console with the console command. The console is visible by
default. Hide the console like this:

console hide

Display the console like this:

console show

The console is implemented by a second Tcl interpreter. You can evaluate Tcl commands in that
interpreter with:

console eval command

There is an alternate version of this console called TkCon. It is included on the CD-ROM, and you can
find current versions on the Internet. TkCon was created by Jeff Hobbs and has lots of nice features.
You can use TkCon on UNIX systems, too.
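
Because the console command exists only on platforms where wish provides a console, a portable script may want to test for it first. This is a sketch of one way to do that; the variable name haveConsole is made up, and info commands is used here only to check whether the command is defined.

```tcl
# console exists only where wish provides one (Windows and Macintosh).
set haveConsole [expr {[llength [info commands console]] > 0}]
if {$haveConsole} {
    console hide
}
puts "console available: $haveConsole"
```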

Command-Line Arguments
If you run a script from the command line, for example from a UNIX shell, you can pass the script
command-line arguments. You can also specify these arguments in the shortcut command in
Windows. For example, under UNIX you can type this at a shell:

% myscript.tcl arg1 arg2 arg3

In Windows, you can have a shortcut that runs wish on your script and also passes additional
arguments:

"c:\Program Files\TCL82\wish.exe" c:\your\script.tcl arg1

The Tcl shells pass the command-line arguments to the script as the value of the argv variable. The
number of command-line arguments is given by the argc variable. The name of the program, or script,
is not part of argv nor is it counted by argc. Instead, it is put into the argv0 variable. Table 2-2 lists all
the predefined variables in the Tcl shells. argv is a list, so you can use the lindex command, which is
described on page 59, to extract items from it:

set arg1 [lindex $argv 0]

The following script prints its arguments (foreach is described on page 73):

Example 2-4 The EchoArgs script.

# Tcl script to echo command line arguments
puts "Program: $argv0"
puts "Number of arguments: $argc"
set i 0
foreach arg $argv {
    puts "Arg $i: $arg"
    incr i
}
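
A script often needs a default when an argument is left out. The following sketch checks the length of argv before using lindex; the variable name and the default value are made up for illustration.

```tcl
# argv is set by tclsh and wish; fall back to an empty list elsewhere.
if {![info exists argv]} {
    set argv {}
}

# Use the first argument if there is one, otherwise a default.
if {[llength $argv] < 1} {
    set name "World"
} else {
    set name [lindex $argv 0]
}
puts "Hello, $name!"
```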

Command-Line Options to Wish

Some command-line options are interpreted by wish, and they do not appear in the argv variable. The
general form of the wish command line is:

wish ?options? ?script? ?arg1 arg2?

If no script is specified, then wish just enters an interactive command loop. Table 2-1 lists the options
that wish supports:

Table 2-1. Wish command line options.

-colormap new Use a new private colormap. See page 540.

-display display Use the specified X display. UNIX only.
-geometry geometry The size and position of the window. See page 572.
-name name Specify the Tk application name. See page 562.
-sync Run X synchronously. UNIX only.
-use id Use the window specified by id for the main window. See page 580.
-visual visual Specify the visual for the main window. See page 540.
-- Terminate options to wish.

Predefined Variables

Table 2-2. Variables defined by tclsh and wish.

argc The number of command-line arguments.

argv A list of the command-line arguments.
argv0 The name of the script being executed. If being used interactively, argv0 is the
name of the shell program.
embed_args The list of arguments in the <EMBED> tag. Tcl applets only. See page 298.
env An array of the environment variables. See page 117.
tcl_interactive True (one) if the tclsh is prompting for commands.
tcl_library The script library directory.
tcl_patchLevel Modified version number, e.g., 8.0b1.
tcl_platform Array containing operating system information. See page 182.
tcl_prompt1 If defined, this is a command that outputs the prompt.
tcl_prompt2 If defined, this is a command that outputs the prompt if the current command is
not yet complete.
tcl_version Version number.
auto_path The search path for script library directories. See page 162.
auto_index A map from command name to a Tcl command that defines it.
auto_noload If set, the library facility is disabled.
auto_noexec If set, the auto execute facility is disabled.
geometry (wish only). The value of the -geometry argument.
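
A script can print some of the variables in Table 2-2 to see how it was started. This is a sketch; the info exists tests are there because not every variable is defined in every environment.

```tcl
puts "Tcl version: $tcl_version"
puts "Patch level: $tcl_patchLevel"
puts "Platform:    $tcl_platform(platform)"

# argv0 is set by tclsh and wish, but may be absent in an embedded interpreter.
if {[info exists argv0]} {
    puts "Started as:  $argv0"
}

# env is an array, so test for an element before reading it.
if {[info exists env(PATH)]} {
    puts "PATH:        $env(PATH)"
}
```
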

Part I. Tcl Basics

Chapter 3. The Guestbook CGI Application

This chapter presents a simple Tcl program that computes a Web page. The chapter provides a brief
background to HTML and the CGI interface to Web servers.

This chapter presents a complete, but simple guestbook program that computes an HTML document,
or Web page, based on the contents of a simple database. The basic idea is that a user with a Web
browser visits a page that is computed by the program. The details of how the page gets from your
program to the user with the Web browser vary from system to system. The Tcl Web Server described
in Chapter 18 comes with this guestbook example already set up. You can also use these scripts on
your own Web server, but you will need help from your Webmaster to set things up.

The chapter provides a very brief introduction to HTML and CGI programming. HTML is a way to
specify text formatting, including hypertext links to other pages on the World Wide Web. CGI is a
standard for communication between a Web server that delivers documents and a program that
computes documents for the server. There are many books on these subjects alone. CGI Developers
Resource, Web Programming with Tcl and Perl by John Ivler (Prentice Hall, 1997) is a good reference
for details that are left unexplained here.

A guestbook is a place for visitors to sign their name and perhaps provide other information. We will
build a guestbook that takes advantage of the World Wide Web. Our guests can leave their address as
a Uniform Resource Locator (URL). The guestbook will be presented as a page that has hypertext
links to all these URLs so that other guests can visit them. The program works by keeping a simple
database of the guests, and it generates the guestbook page from the database.

The Tcl scripts described in this chapter use commands and techniques that are described in more
detail in later chapters. The goal of the examples is to demonstrate the power of Tcl without
explaining every detail. If the examples in this chapter raise questions, you can follow the references to
examples in other chapters that do go into more depth.

A Quick Introduction to HTML

Web pages are written in a text markup language called HTML (HyperText Markup Language). The
idea of HTML is that you annotate, or mark up, regular text with special tags that indicate structure
and formatting. For example, the title of a Web page is defined like this:

<TITLE>My Home Page</TITLE>

The tags provide general formatting guidelines, but the browsers that display HTML pages have
freedom in how they display things. This keeps the markup simple. The general syntax for HTML tags
is:

<tag parameters>normal text</tag>

As shown here, the tags usually come in pairs. The open tag may have some parameters, and the close
tag name begins with a slash. The case of a tag is not considered, so <title>, <Title>, and <TITLE>
are all valid and mean the same thing. The corresponding close tag could be </title>, </Title>,
</TITLE>, or even </TiTlE>.

The <A> tag defines hypertext links that reference other pages on the Web. The hypertext links connect
pages into a Web so that you can move from page to page to page and find related information. It is
the flexibility of the links that make the Web so interesting. The <A> tag takes an HREF parameter that
defines the destination of the link. If you wanted to link to my home page, you would put this in your
page:

<A HREF="http://www.beedub.com/">Brent Welch</A>

When this construct appears in a Web page, your browser typically displays "Brent Welch" in blue
underlined text. When you click on that text, your browser switches to the page at the address
"http://www.beedub.com/". There is a lot more to HTML, of course, but this should give you a basic
idea of what is going on in the examples. The following list summarizes the HTML tags that will be
used in the examples:

Table 3-1. HTML tags used in the examples.

Tag        Description
HTML       Main tag that surrounds the whole document.
HEAD       Delimits head section of the HTML document.
TITLE      Defines the title of the page.
BODY       Delimits the body section. Lets you specify page colors.
H1 - H6    HTML defines 6 heading levels: H1, H2, H3, H4, H5, H6.
P          Start a new paragraph.
BR         One blank line.
B          Bold text.
I          Italic text.
A          Used for hypertext links.
IMG        Specify an image.
DL         Definition list.
DT         Term clause in a definition list.
DD         Definition clause in a definition list.
UL         An unordered list.
LI         A bulleted item within a list.
TABLE      Create a table.
TR         A table row.
TD         A cell within a table row.
FORM       Defines a data entry form.
INPUT      A one-line entry field, checkbox, radio button, or submit button.
TEXTAREA   A multiline text field.

CGI for Dynamic Pages

There are two classes of pages on the Web, static and dynamic. A static page is written and stored on a
Web server, and the same thing is returned each time a user views the page. This is the easy way to
think about Web pages. You have some information to share, so you compose a page and tinker with
the HTML tags to get the information to look good. If you have a home page, it is probably in this
class.
In contrast, a dynamic page is computed each time it is viewed. This is how pages that give
up-to-the-minute stock prices work, for example. A dynamic page does not necessarily include
animations; the term just means that a program computes the page contents when a user visits the page.
The advantage of this approach is that a user might see something different each time he or she visits
the page. As we shall see, it is also easier to maintain information in a database of some sort and
generate the HTML formatting for the data with a program.
A CGI (Common Gateway Interface) program is used to compute Web pages. The CGI standard
defines how inputs are passed to the program as well as a way to identify different types of results,
such as images, plain text, or HTML markup. A CGI program simply writes the contents of the
document to its standard output, and the Web server takes care of delivering the document to the user's
Web browser. The following is a very simple CGI script:

Example 3-1 A simple CGI script.

puts "Content-Type: text/html"
puts ""
puts "<TITLE>The Current Time</TITLE>"
puts "The time is <B>[clock format [clock seconds]]</B>"

The program computes a simple HTML page that has the current time. Each time a user visits the page
they will see the current time on the server. The server that has the CGI program and the user viewing
the page might be on different sides of the planet. The output of the program starts with a Content-
Type line that tells your Web browser what kind of data comes next. This is followed by a blank line
and then the contents of the page.
The clock command is used twice: once to get the current time in seconds, and a second time to
format the time into a nice looking string. The clock command is described in detail on page 173.
Fortunately, there is no conflict between the markup syntax used by HTML and the Tcl syntax for
embedded commands, so we can mix the two in the argument to the puts command. Double quotes
are used to group the argument to puts so that the clock commands will be executed. When run, the
output of the program will look like this:

Example 3-2 Output of Example 3-1.

Content-Type: text/html

<TITLE>The Current Time</TITLE>
The time is <B>Wed Oct 16 11:23:43 1996</B>

This example is a bit sloppy in its use of HTML, but it should display properly in most Web browsers.
Example 3-3 includes all the required tags for a proper HTML document.
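
As a rough sketch of how Example 3-1 could be tidied up, the following version wraps the same output in HTML, HEAD, and BODY tags. The TimePage procedure name is made up for this illustration and is not part of cgilib.tcl. It returns the response as a string so that the caller decides where to write it:

```tcl
# TimePage is a hypothetical helper, not part of cgilib.tcl.
# It returns the CGI response as one string: the Content-Type
# header, a blank line, and then a complete HTML document.
proc TimePage {} {
    set now [clock format [clock seconds]]
    return "Content-Type: text/html

<HTML>
<HEAD>
<TITLE>The Current Time</TITLE>
</HEAD>
<BODY>
The time is <B>$now</B>
</BODY>
</HTML>"
}

puts [TimePage]
```

The blank line inside the quoted string matters: it is what separates the Content-Type line from the page contents.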

The guestbook.cgi Script

The guestbook.cgi script computes a page that lists all the registered guests. The example is shown
first, and then each part of it is discussed in more detail later. One thing to note right away is that the
HTML tags are generated by procedures that hide the details of the HTML syntax. The first lines of
the script use the UNIX trick to have tclsh interpret the script. This trick is described on page 26:

Example 3-3 The guestbook.cgi script.

#!/bin/sh
# guestbook.cgi
# Implement a simple guestbook page.
# The set of visitors is kept in a simple database.
# The newguest.cgi script will update the database.
# \
exec tclsh "$0" ${1+"$@"}

# The cgilib.tcl file has helper procedures
# The guestbook.data file has the database
# Both files are in the same directory as the script

set dir [file dirname [info script]]
source [file join $dir cgilib.tcl]
set datafile [file join $dir guestbook.data]

Cgi_Header "Brent's Guestbook" {BGCOLOR=white TEXT=black}
P
if {![file exists $datafile]} {
    puts "No registered guests, yet."
    P
    puts "Be the first [Link {registered guest!} newguest.html]"
} else {
    puts "The following folks have registered in my GuestBook."
    P
    puts [Link Register newguest.html]
    H2 Guests
    catch {source $datafile}
    foreach name [lsort [array names Guestbook]] {
        set item $Guestbook($name)
        set homepage [lindex $item 0]
        set markup [lindex $item 1]
        H3 [Link $name $homepage]
        puts $markup
    }
}
Cgi_End

Using a Script Library File

The script uses a number of Tcl procedures that make working with HTML and the CGI interface
easier. These procedures are kept in the cgilib.tcl file, which is kept in the same directory as the
main script. The script starts by sourcing the cgilib.tcl file so that these procedures are available.
The following command determines the location of the cgilib.tcl file based on the location of the
main script. The info script command returns the file name of the script. The file dirname and
file join commands manipulate file names in a platform-independent way. They are described on
page 102. I use this trick to avoid putting absolute file names into my scripts, which would have to be
changed if the program moves later:

set dir [file dirname [info script]]
source [file join $dir cgilib.tcl]
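
To see what these commands compute, here is a small sketch that substitutes a fixed path for the result of info script. The path is invented for the example; on a real server it would be wherever the CGI script is installed:

```tcl
# The path below is a made-up example of what [info script]
# might return when the Web server runs the CGI script.
set script /usr/local/www/cgi-bin/guestbook.cgi

set dir [file dirname $script]
puts $dir
# => /usr/local/www/cgi-bin

puts [file join $dir cgilib.tcl]
# => /usr/local/www/cgi-bin/cgilib.tcl
```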

Beginning the HTML Page

The following command generates the standard information that comes at the beginning of an HTML
page:

Cgi_Header {Brent's GuestBook} {bgcolor=white text=black}

The Cgi_Header procedure is shown in Example 3-4:

Example 3-4 The Cgi_Header procedure.

proc Cgi_Header {title {bodyparams {}}} {
    puts stdout \
"Content-Type: text/html

<HTML>
<HEAD>
<TITLE>$title</TITLE>
</HEAD>
<BODY $bodyparams>
<H1>$title</H1>"
}

The Cgi_Header procedure takes as arguments the title for the page and some optional parameters for
the HTML <BODY> tag. The guestbook.cgi script specifies black text on a white background to avoid
the standard gray background of most browsers. The procedure definition uses the syntax for an
optional parameter, so you do not have to pass bodyparams to Cgi_Header. Default values for
procedure parameters are described on page 81.
The Cgi_Header procedure just contains a single puts command that generates the standard
boilerplate that appears at the beginning of the output. Note that several lines are grouped together
with double quotes. Double quotes are used so that the variable references mixed into the HTML are
substituted properly.
The output begins with the CGI content-type information, a blank line, and then the HTML. The
HTML is divided into a head and a body part. The <TITLE> tag goes in the head section of an HTML
document. Finally, browsers display the title in a different place than the rest of the page, so I always
want to repeat the title as a level-one heading (i.e., H1) in the body of the page.
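
The effect of the optional parameter can be seen in isolation with a small sketch. The BodyTag procedure here is hypothetical and exists only to illustrate the default value syntax; it uses the same {bodyparams {}} declaration as Cgi_Header:

```tcl
# BodyTag is a hypothetical procedure used only to show how a
# default parameter value behaves; it is not part of cgilib.tcl.
proc BodyTag {{bodyparams {}}} {
    return "<BODY $bodyparams>"
}

# Called with no arguments, bodyparams takes its default value,
# which is the empty string.
puts [BodyTag]
# => <BODY >

puts [BodyTag {BGCOLOR=white TEXT=black}]
# => <BODY BGCOLOR=white TEXT=black>
```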

Simple Tags and Hypertext Links

The next thing the program does is to see whether there are any registered guests or not. The file
command, which is described in detail on page 102, is used to see whether there is any data:

if {![file exists $datafile]} {

If the database file does not exist, a different page is displayed to encourage a registration. The page
includes a hypertext link to a registration page. The newguest.html page will be described in more
detail later:

puts "No registered guests, yet."
P
puts "Be the first [Link {registered guest!} newguest.html]"

The P command generates the HTML for a paragraph break. This trivial procedure saves us a few
keystrokes:

proc P {} {
    puts <P>
}

The Link command formats and returns the HTML for a hypertext link. Instead of printing the HTML
directly, it is returned, so you can include it in-line with other text you are printing:

Example 3-5 The Link command formats a hypertext link.

proc Link {text url} {
    return "<A HREF=\"$url\">$text</A>"
}
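
Because Link returns its result instead of printing it, the call can be nested inside a quoted string. The following sketch repeats the definition so that it can be run on its own:

```tcl
proc Link {text url} {
    return "<A HREF=\"$url\">$text</A>"
}

# The braces group the two-word link text into one argument.
set html "Be the first [Link {registered guest!} newguest.html]"
puts $html
# => Be the first <A HREF="newguest.html">registered guest!</A>
```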

The output of the program would be as below if there were no data:

Example 3-6 Initial output of guestbook.cgi.

Content-Type: text/html

<HTML>
<HEAD>
<TITLE>Brent's Guestbook</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black>
<H1>Brent's Guestbook</H1>
<P>
No registered guests, yet.
<P>
Be the first <A HREF="newguest.html">registered guest!</A>
</BODY>
</HTML>

If the database file exists, then the real work begins. We first generate a link to the registration page,
and a level-two header to separate that from the guest list:

puts [Link Register newguest.html]
H2 Guests

The H2 procedure handles the detail of including the matching close tag:

proc H2 {string} {
    puts "<H2>$string</H2>"
}

Using a Tcl Array for the Database

The datafile contains Tcl commands that define an array that holds the guestbook data. If this file is
kept in the same directory as the guestbook.cgi script, then you can compute its name:

set dir [file dirname [info script]]
set datafile [file join $dir guestbook.data]

By using Tcl commands to represent the data, we can load the data with the source command. The
catch command is used to protect the script from a bad data file, which will show up as an error from
the source command. Catching errors is described in detail on page 79:

catch {source $datafile}

The Guestbook variable is the array defined in guestbook.data. Array variables are the topic of
Chapter 8. Each element of the array is defined with a Tcl command that looks like this:

set Guestbook(key) {url markup}

The person's name is the array index, or key. The value of the array element is a Tcl list with two
elements: their URL and some additional HTML markup that they can include in the guestbook. Tcl
lists are the topic of Chapter 5. The following example shows what the command looks like with real
data:

set {Guestbook(Brent Welch)} {
    http://www.beedub.com/
    {<img src=http://www.beedub.com/welch.gif>}
}

The spaces in the name result in additional braces to group the whole variable name and each list
element. This syntax is explained on page 90. Do not worry about it now. We will see on page 42 that
all the braces in the previous statement are generated automatically. The main point is that the person's
name is the key, and the value is a list with two elements.
The array names command returns all the indices, or keys, in the array, and the lsort command sorts
these alphabetically. The foreach command loops over the sorted list, setting the loop variable name to
each key in turn:

foreach name [lsort [array names Guestbook]] {

Given the key, we get the value like this:

set item $Guestbook($name)

The two list elements are extracted with lindex, which is described on page 63.

set homepage [lindex $item 0]
set markup [lindex $item 1]

We generate the HTML for the guestbook entry as a level-three header that contains a hypertext link to
the guest's home page. We follow the link with any HTML markup text that the guest has supplied to
embellish his or her entry. The H3 procedure is similar to the H2 procedure already shown, except it
generates <H3> tags:

H3 [Link $name $homepage]
puts $markup
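
The steps above can be tried out without a data file or a Web server. This sketch defines two array elements directly, using made-up names and URLs in the same format as guestbook.data, and then walks the array in sorted order:

```tcl
# Sample data with made-up names and URLs, in the same format
# that the guestbook.data file uses.
set {Guestbook(Zoe Zebra)} {http://www.example.com/zoe/ {<B>Hi!</B>}}
set {Guestbook(Al Able)}   {http://www.example.com/al/ {}}

foreach name [lsort [array names Guestbook]] {
    set item $Guestbook($name)
    set homepage [lindex $item 0]
    set markup [lindex $item 1]
    puts "$name: $homepage ($markup)"
}
# => Al Able: http://www.example.com/al/ ()
# => Zoe Zebra: http://www.example.com/zoe/ (<B>Hi!</B>)
```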

Sample Output
The last thing the script does is call Cgi_End to output the proper closing tags. Example 3-7 shows the
output of the guestbook.cgi script:

Example 3-7 Output of guestbook.cgi.

Content-Type: text/html

<HTML>
<HEAD>
<TITLE>Brent's Guestbook</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black>
<H1>Brent's Guestbook</H1>
<P>
The following folks have registered in my GuestBook.
<P>
<A HREF="newguest.html">Register</A>
<H2>Guests</H2>
<H3><A HREF="http://www.beedub.com/">Brent Welch</A></H3>
<img src=http://www.beedub.com/welch.gif>
</BODY>
</HTML>
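
The H3 and Cgi_End procedures are not listed in this chapter. Judging from the H2 procedure and from the closing tags in the sample output, they might look something like the following sketch. These definitions are assumptions, and the actual ones in cgilib.tcl may differ:

```tcl
# Assumed definition, modeled on the H2 procedure shown earlier.
proc H3 {string} {
    puts "<H3>$string</H3>"
}

# Assumed definition: emit the closing tags that match the
# <BODY> and <HTML> tags opened by Cgi_Header.
proc Cgi_End {} {
    puts "</BODY>\n</HTML>"
}
```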

Defining Forms and Processing Form Data

The guestbook.cgi script only generates output. The other half of CGI deals with input from the user. Input
is more complex for two reasons. First, we have to define another HTML page that has a form for the user to
fill out. Second, the data from the form is organized and encoded in a standard form that must be decoded by
the script. Example 3-8 on page 40 defines a very simple form, and the procedure that decodes the form data
is shown in Example 11-6 on page 155.

The guestbook page contains a link to newguest.html. This page contains a form that lets a user register his
or her name, home page URL, and some additional HTML markup. The form has a submit button. When a
user clicks that button in their browser, the information from the form is passed to the newguest.cgi script.
This script updates the database and computes another page for the user that acknowledges the user's
contribution.

The newguest.html Form

An HTML form contains tags that define data entry fields, buttons, checkboxes, and other elements that let
the user specify values. For example, a one-line entry field that is used to enter the home page URL is defined
like this:

<INPUT TYPE=text NAME=url>

The INPUT tag is used to define several kinds of input elements, and its TYPE parameter indicates what kind.
In this case, TYPE=text creates a one-line text entry field. The submit button is defined with an INPUT tag that
has TYPE=submit, and the VALUE parameter becomes the text that appears on the button:

<INPUT TYPE=submit NAME=submit VALUE=Register>

A general type-in window is defined with the TEXTAREA tag. This creates a multiline, scrolling text field that
is useful for specifying lots of information, such as a free-form comment. In our case we will let guests type
in HTML that will appear with their guestbook entry. The text between the open and close TEXTAREA tags is
inserted into the type-in window when the page is first displayed.

<TEXTAREA NAME=markup ROWS=10 COLS=50>Hello.</TEXTAREA>

A common parameter to the form tags is NAME=something. This name identifies the data that will come back
from the form. The tags also have parameters that affect their display, such as the label on the submit button
and the size of the text area. Those details are not important for our example. The complete form is shown in
Example 3-8:

Example 3-8 The newguest.html form.

<!Doctype HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

<HTML>
<HEAD>
<TITLE>Register in my Guestbook</TITLE>

<META HTTP-Equiv=Editor Content="SunLabs WebTk 1.0beta 10/11/96">
</HEAD>
<BODY>

<FORM ACTION="newguest.cgi" METHOD="POST">

<H1>Register in my Guestbook</H1>
<UL>
<LI>Name <INPUT TYPE="text" NAME="name" SIZE="40">
<LI>URL <INPUT TYPE="text" NAME="url" SIZE="40">
<P>
If you don't have a home page, you can use an email URL like "mailto:[email protected]
<LI>Additional HTML to include after your link:
<BR>

<TEXTAREA NAME="html" COLS="60" ROWS="15">

</TEXTAREA>
<LI><INPUT TYPE="submit" NAME="new" VALUE="Add me to your guestbook">
<LI><INPUT TYPE="submit" NAME="update" VALUE="Update my guestbook entry">
</UL>
</FORM>

</BODY>
</HTML>

The newguest.cgi Script

When the user clicks the Submit button in their browser, the data from the form is passed to the program
identified by the ACTION parameter of the FORM tag. That program takes the data, does something useful with
it, and then returns a new page for the browser to display. In our case the FORM tag names newguest.cgi as
the program to handle the data:

<FORM ACTION=newguest.cgi METHOD=POST>

The CGI specification defines how the data from the form is passed to the program. The data is encoded and
organized so that the program can figure out the values the user specified for each form element. The
decoding is handled rather nicely with some regular expression tricks that are done in Cgi_Parse. Cgi_Parse
saves the form data, and Cgi_Value gets a form value in the script. These procedures are described in
Example 11-6 on page 155. Example 3-9 starts out by calling Cgi_Parse:

Example 3-9 The newguest.cgi script.

#!/bin/sh
# \
exec tclsh "$0" ${1+"$@"}
# source cgilib.tcl from the same directory as newguest.cgi

set dir [file dirname [info script]]
source [file join $dir cgilib.tcl]
set datafile [file join $dir guestbook.data]

Cgi_Parse

# Open the datafile in append mode

if [catch {open $datafile a} out] {
    Cgi_Header "Guestbook Registration Error" \
        {BGCOLOR=black TEXT=red}
    P
    puts "Cannot open the data file"
    P
    puts $out ;# the error message
    exit 0
}

# Append a Tcl set command that defines the guest's entry

puts $out ""
puts $out [list set Guestbook([Cgi_Value name]) \
    [list [Cgi_Value url] [Cgi_Value html]]]
close $out

# Return a page to the browser

Cgi_Header "Guestbook Registration Confirmed" \
    {BGCOLOR=white TEXT=black}

puts "
<DL>
<DT>Name
<DD>[Cgi_Value name]
<DT>URL
<DD>[Link [Cgi_Value url] [Cgi_Value url]]
</DL>
[Cgi_Value html]
"

Cgi_End

The main idea of the newguest.cgi script is that it saves the data to a file as a Tcl command that defines an
element of the Guestbook array. This lets the guestbook.cgi script simply load the data by using the Tcl
source command. This trick of storing data as a Tcl script saves us from the chore of defining a new file
format and writing code to parse it. Instead, we can rely on the well-tuned Tcl implementation to do the hard
work for us efficiently.

The script opens the datafile in append mode so that it can add a new record to the end. Opening files is
described in detail on page 110. The script uses a catch command to guard against errors. If an error occurs,
a page explaining the error is returned to the user. Working with files is one of the most common sources of
errors (permission denied, disk full, file-not-found, and so on), so I always open the file inside a catch
statement:

if [catch {open $datafile a} out] {
    # an error occurred
} else {
    # open was ok
}

In this command, the variable out gets the result of the open command, which is either a file descriptor or an
error message. This style of using catch is described in detail in Example 6-14 on page 77.
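
Both branches can be exercised with a short sketch. The path used here is an assumption: it is chosen only because it should not exist on most systems, so that open fails and the error branch runs:

```tcl
# The path is deliberately one that should not exist, so that
# open fails and catch returns a nonzero result.
set datafile /no/such/directory/guestbook.data

if [catch {open $datafile a} out] {
    # $out holds the error message from open
    puts "Cannot open the data file: $out"
} else {
    # $out holds the file descriptor returned by open
    puts $out "# opened for append"
    close $out
}
```

Using one variable for both outcomes keeps the code short, but the meaning of out depends on which branch is taken.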

The script writes the data as a Tcl set command. The list command is used to format the data properly:

puts $out [list set Guestbook([Cgi_Value name]) \
    [list [Cgi_Value url] [Cgi_Value html]]]

There are two lists. First the url and html values are formatted into one list. This list will be the value of the
array element. Then, the whole Tcl command is formed as a list. In simplified form, the command is
generated from this:

list set variable value

Using the list command ensures that the result will always be a valid Tcl command that sets the variable to
the given value. The list command is described in more detail on page 61.
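
The round trip can be checked without any form data. In this sketch, plain variables with made-up values stand in for the results of Cgi_Value, and eval stands in for the source command that guestbook.cgi uses to load the file:

```tcl
# Made-up form values; the name contains a space and the
# markup contains double quotes, so both need quoting.
set name "Brent Welch"
set url http://www.beedub.com/
set html {<B>Hello, "world"</B>}

# Build the command the same way newguest.cgi does.
set cmd [list set Guestbook($name) [list $url $html]]
puts $cmd

# Evaluating the generated command defines the array element,
# which is what source does when it reads guestbook.data.
eval $cmd
puts [lindex $Guestbook($name) 0]
# => http://www.beedub.com/
```

The values come back exactly as they went in, which is the property that makes list a safe way to generate Tcl commands from user input.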

The cgi.tcl Package

The cgilib.tcl file included with this book just barely scratches the surface of things you might like
to do in a CGI script. Don Libes has created a comprehensive package for CGI scripts known as
cgi.tcl. You can find it on the web at

http://expect.nist.gov/cgi.tcl/

One of Don's goals in cgi.tcl was to eliminate the need to directly write any HTML markup at all.
Instead, he has defined a whole suite of Tcl commands similar to the P and H2 procedures shown in
this chapter that automatically emit the matching close tags. He also has support procedures to deal
with browser cookies, page redirects, and other CGI features.

Next Steps
There are a number of details that can be added to this example. A user may want to update their entry,
for example. They could do that now, but they would have to retype everything. They might also like a
chance to check the results of their registration and make changes before committing them. This
requires another page that displays their guest entry as it would appear on a page, and also has the
fields that let them update the data.
The details of how a CGI script is hooked up with a Web server vary from server to server. You
should ask your local Webmaster for help if you want to try this out on your local web site. The Tcl
Web Server comes with this guestbook example already set up, plus it has a number of other very
interesting ways to generate pages. My own taste in web page generation has shifted from CGI to a
template-based approach supported by the Tcl Web Server. This is the topic of Chapter 18.
The next few chapters describe basic Tcl commands and data structures. We return to this example in
Chapter 11 on regular expressions.

Part I. Tcl Basics

Chapter 4. String Processing in Tcl

This chapter describes string manipulation and simple pattern matching. Tcl commands described are:
string, append, format, scan, and binary. The string command is a collection of several useful
string manipulation operations.
Strings are the basic data item in Tcl, so it should not be surprising that there are a large number of
commands to manipulate strings. A closely related topic is pattern matching, in which string
comparisons are made more powerful by matching a string against a pattern. This chapter describes a
simple pattern matching mechanism that is similar to that used in many other shell languages. Chapter
11 describes a more complex and powerful regular expression pattern matching mechanism.

The string Command

The string command is really a collection of operations you can perform on strings. The following
example calculates the length of the value of a variable.

set name "Brent Welch"

string length $name
=> 11

The first argument to string determines the operation. You can ask string for valid operations by
giving it a bad one:

string junk
=> bad option "junk": should be bytelength, compare,
equal, first, index, is, last, length, map, match, range,
repeat, replace, tolower, totitle, toupper, trim, trimleft,
trimright, wordend, or wordstart

This trick of feeding a Tcl command bad arguments to find out its usage is common across many
commands. Table 4-1 summarizes the string command.

Table 4-1. The string command.

string bytelength str
    Returns the number of bytes used to store a string, which may be different from the
    character length returned by string length because of UTF-8 encoding. See page 210 of
    Chapter 15 about Unicode and UTF-8.
string compare ?-nocase? ?-length len? str1 str2
    Compares strings lexicographically. Use -nocase for case insensitive comparison. Use
    -length to limit the comparison to the first len characters. Returns 0 if equal, -1 if
    str1 sorts before str2, else 1.
string equal ?-nocase? str1 str2
    Compares strings and returns 1 if they are the same. Use -nocase for case insensitive
    comparison.
string first str1 str2
    Returns the index in str2 of the first occurrence of str1, or -1 if str1 is not found.
string index string index
    Returns the character at the specified index. An index counts from zero. Use end for the
    last character.
string is class ?-strict? ?-failindex varname? string
    Returns 1 if string belongs to class. If -strict, then empty strings never match,
    otherwise they always match. If -failindex is specified, then varname is assigned the
    index of the character in string that prevented it from being a member of class. See
    Table 4-3 on page 50 for character class names.
string last str1 str2
    Returns the index in str2 of the last occurrence of str1, or -1 if str1 is not found.
string length string
    Returns the number of characters in string.
string map ?-nocase? charMap string
    Returns a new string created by mapping characters in string according to the input,
    output list in charMap. See page 51.
string match pattern str
    Returns 1 if str matches the pattern, else 0. Glob-style matching is used. See page 48.
string range str i j
    Returns the range of characters in str from i to j.
string repeat str count
    Returns str repeated count times.
string replace str first last ?newstr?
    Returns a new string created by replacing characters first through last with newstr, or
    nothing.
string tolower string ?first? ?last?
    Returns string in lower case. first and last determine the range of string on which to
    operate.
string totitle string ?first? ?last?
    Capitalizes string by replacing its first character with the Unicode title case, or upper
    case, and the rest with lower case. first and last determine the range of string on which
    to operate.
string toupper string ?first? ?last?
    Returns string in upper case. first and last determine the range of string on which to
    operate.
string trim string ?chars?
    Trims the characters in chars from both ends of string. chars defaults to whitespace.
string trimleft string ?chars?
    Trims the characters in chars from the beginning of string. chars defaults to whitespace.
string trimright string ?chars?
    Trims the characters in chars from the end of string. chars defaults to whitespace.
string wordend str ix
    Returns the index in str of the character after the word containing the character at
    index ix.
string wordstart str ix
    Returns the index in str of the first character in the word containing the character at
    index ix.

These are the string operations I use most:

The equal operation, which is shown in Example 4-2 on page 48.
The match operation. This pattern matching operation is described on page 48.
The tolower, totitle, and toupper operations, which convert case.
The trim, trimright, and trimleft operations, which are handy for cleaning up strings.

These new operations were added in Tcl 8.1 (actually, they first appeared in the 8.1.1 patch release):

The equal operation, which is simpler than using string compare.
The is operation, which tests for kinds of strings. String classes are listed in Table 4-3 on page 50.
The map operation, which translates characters (e.g., like the UNIX tr command).
The repeat and replace operations.
The totitle operation, which is handy for capitalizing words.
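
As a quick illustration of several of the operations listed above, the following sketch runs them on small sample strings. The sample values are arbitrary:

```tcl
set s "  Hello, World  "
puts [string trim $s]                    ;# => Hello, World
puts [string toupper hello]              ;# => HELLO
puts [string totitle hello]              ;# => Hello
puts [string first l hello]              ;# => 2
puts [string last l hello]               ;# => 3
puts [string repeat ab 3]                ;# => ababab
puts [string replace hello 1 3 ipp]      ;# => hippo
puts [string compare -nocase ABC abc]    ;# => 0
```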

String Indices
Several of the string operations involve string indices that are positions within a string. Tcl counts
characters in strings starting with zero. The special index end is used to specify the last character in a
string:

string range abcd 2 end

=> cd

Tcl 8.1 added syntax for specifying an index relative to the end. Specify end-N to get the Nth character
before the end. For example, the following command returns a new string that drops the first and last
characters from the original:

string range $string 1 end-1

There are several operations that pick apart strings: first, last, wordstart, wordend, index, and
range. If you find yourself using combinations of these operations to pick apart data, it will be faster if
you can do it with the regular expression pattern matcher described in Chapter 11.
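
The index forms described in this section can be compared side by side in a short sketch. The sample string is arbitrary:

```tcl
set s abcdef
puts [string index $s 0]         ;# => a
puts [string index $s end]       ;# => f
puts [string index $s end-1]     ;# => e
puts [string range $s 1 end-1]   ;# => bcde
puts [string range $s 2 end]     ;# => cdef
```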

Strings and Expressions
Strings can be compared with expr, if, and while using the comparison operators ==, !=, < and >.
However, there are a number of subtle issues that can cause problems. First, you must quote the string
value so that the expression parser can identify it as a string type. Then, you must group the expression
with curly braces to prevent the double quotes from being stripped off by the main interpreter:

if {$x == "foo"} command

expr is unreliable for string comparison.

Ironically, despite the quotes, the expression evaluator first converts items to numbers if possible, and
then converts them back if it detects a case of string comparison. The conversion back is always done
as a decimal number. This can lead to unexpected conversions between strings that look like
hexadecimal or octal numbers. The following boolean expression is true!

if {"0xa" == "10"} {puts stdout ack! }

=> ack!

A safe way to compare strings is to use the string compare and equal operations. These operations
work faster because the unnecessary conversions are eliminated. Like the C library strcmp function,
string compare returns 0 if the strings are equal, -1 if the first string is lexicographically less
than the second, or 1 if the first string is greater than the second:

Example 4-1 Comparing strings with string compare.

if {[string compare $s1 $s2] == 0} {
    # strings are equal
}

The string equal command added in Tcl 8.1 makes this simpler:

Example 4-2 Comparing strings with string equal.

if {[string equal $s1 $s2]} {
    # strings are equal
}
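
The difference between the two kinds of comparison shows up clearly with the hexadecimal example from earlier in this section. The following sketch applies both to the same pair of values:

```tcl
set s1 0xa
set s2 10

# Numeric comparison: both values look like numbers.
puts [expr {$s1 == $s2}]                 ;# => 1
# String comparison: the characters differ.
puts [string equal $s1 $s2]              ;# => 0
puts [string compare apple banana]       ;# => -1
puts [string equal -nocase Tcl TCL]      ;# => 1
```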

String Matching
The string match command implements glob-style pattern matching that is modeled after the file
name pattern matching done by various UNIX shells.
The heritage of the word "glob" is rooted in UNIX, and Tcl preserves this historical oddity in the glob
command that does pattern matching on file names. The glob command is described on page 115.
Table 4-2 shows the three constructs used in string match patterns:

Table 4-2. Matching characters used with string match.

*          Match any number of any characters.
?          Match exactly one character.
[chars]    Match any character in chars.

Any other characters in a pattern are taken as literals that must match the input exactly. The following
example matches all strings that begin with a:

string match a* alpha

=> 1

To match all two-letter strings:

string match ?? XY
=> 1

To match all strings that begin with either a or b:

string match {[ab]*} cello

=> 0

Be careful! Square brackets are also special to the Tcl interpreter, so you will need to wrap the pattern
up in curly braces to prevent it from being interpreted as a nested command. Another approach is to
put the pattern into a variable:

set pat {[ab]*x}

string match $pat box
=> 1
You can specify a range of characters with the syntax [x-y]. For example, [a-z] represents the set of
all lower-case letters, and [0-9] represents all the digits. You can include more than one range in a set.
Any letter, digit, or the underscore is matched with:

string match {[a-zA-Z0-9_]} $char

The set matches only a single character. To match more complicated patterns, like one or more
characters from a set, you need to use regular expression matching, which is described on page
148.
If you need to include a literal *, ?, or bracket in your pattern, preface it with a backslash:

string match {*\?} what?

=> 1

In this case the pattern is quoted with curly braces because the Tcl interpreter is also doing backslash
substitutions. Without the braces, you would have to use two backslashes. They are replaced with a
single backslash by Tcl before string match is called.

string match *\\? what?
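
The patterns from this section can be collected into one short sketch, with a few extra sample strings added for contrast. Every pattern that contains brackets or a backslash is wrapped in braces:

```tcl
puts [string match a* alpha]             ;# => 1
puts [string match {[ab]*} bravo]        ;# => 1
puts [string match {[ab]*} cello]        ;# => 0
puts [string match {[0-9][0-9]} 42]      ;# => 1
puts [string match {*.tcl} cgilib.tcl]   ;# => 1
puts [string match {*\?} what?]          ;# => 1
puts [string match {*\?} what]           ;# => 0
```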

Character Classes
The string is command tests a string to see whether it belongs to a particular class. This is useful
for input validation. For example, to make sure something is a number, you do:

if {![string is integer $input]} {
    error "Invalid input. Please enter a number."
}

Classes are defined in terms of the Unicode character set, which means they are more general than
specifying character sets with ranges over the ASCII encoding. For example, alpha includes many
characters outside the range of [A-Za-z] because of different characters in other alphabets. The
classes are listed in Table 4-3.

Table 4-3. Character class names.

alnum Any alphabet or digit character.
alpha Any alphabet character.
ascii Any character with a 7-bit character code (i.e., less than 128.)
boolean 0, 1, true, false (in any case).
control Character code less than 32, and not NULL.
digit Any digit character.
double A valid floating point number.
false 0 or false (in any case).
graph Any printing characters, not including space characters.
integer A valid integer.
lower A string in all lower case.
print Any printing character, including space characters.
punct Any punctuation character.
space Space, tab, newline, carriage return, vertical tab, form feed.
true 1 or true (in any case).
upper A string all in upper case.
wordchar Alphabet, digit, and the underscore.
xdigit Valid hexadecimal digits.
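
The -strict and -failindex options from Table 4-1 are easiest to understand by example. In this sketch the variable name idx is arbitrary:

```tcl
puts [string is integer -strict 42]      ;# => 1
# With -strict, an empty string does not match.
puts [string is integer -strict ""]      ;# => 0
puts [string is double 3.14]             ;# => 1
puts [string is alpha abc123]            ;# => 0

# -failindex stores the index of the first offending character.
puts [string is digit -failindex idx 12a4]   ;# => 0
puts $idx                                    ;# => 2
```

For input validation, -strict is usually what you want, because an empty entry field would otherwise be accepted as a valid number.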

Mapping Strings
The string map command translates a string based on a character map. The map is in the form of an
input, output list. Wherever the string contains an input sequence, it is replaced with the
corresponding output. For example:

string map {f p d l} food

=> pool

The inputs and outputs can be more than one character and do not have to be the same length:

string map {f p d ll oo u} food

=> pull
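
As another sketch of multi-character outputs, string map can quote the characters that are special in HTML. The procedure name HtmlQuote is a hypothetical helper, not something defined in this chapter:

```tcl
# HtmlQuote is a hypothetical helper that replaces &, <, and >
# with the corresponding HTML entities.
proc HtmlQuote {text} {
    return [string map {& &amp; < &lt; > &gt;} $text]
}

puts [HtmlQuote "a < b && c > d"]
# => a &lt; b &amp;&amp; c &gt; d
```

The replacement text is not scanned again, so the & in an inserted &lt; is not mapped a second time.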

Example 4-3 is more practical. It uses string map to replace fancy quotes and hyphens produced by
Microsoft Word with ASCII equivalents. It uses the open, read, and close file operations that are
described in Chapter 9, and the fconfigure command described on page 223 to ensure that the file
format is UNIX friendly.

Example 4-3 Mapping Microsoft Word special characters to ASCII.

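proc Dos2Unix {filename} {
    set input [open $filename]
    # Open the new file for writing; open defaults to read-only.
    set output [open $filename.new w]
    fconfigure $output -translation lf
    # The map is parsed as a list, so each double quote is written
    # as \" to make it a list element by itself.
    puts $output [string map {
        \223 \"
        \224 \"
        \222 '
        \226 -
    } [read $input]]
    close $input
    close $output
}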

The append Command

The append command takes a variable name as its first argument and concatenates its remaining
arguments onto the current value of the named variable. The variable is created if it does not already
exist:

set foo z
append foo a b c
set foo
=> zabc

The append command is efficient with large strings.

The append command provides an efficient way to add items to the end of a string. It modifies a
variable directly, so it can exploit the memory allocation scheme used internally by Tcl. Using the
append command like this:

append x " some new stuff"

is always faster than this:

set x "$x some new stuff"

The lappend command described on page 61 has similar performance benefits when working with Tcl
lists.
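
A typical use of append is building up a string inside a loop. This is a minimal sketch; the variable names csv and field are illustrative:

```tcl
set csv ""
foreach field {name uid gid shell} {
    # Add a separator before every field except the first.
    if {[string length $csv] > 0} {
        append csv ,
    }
    append csv $field
}
puts $csv
# => name,uid,gid,shell
```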

The format Command

The format command is similar to the C printf function. It formats a string according to a format
specification:

format spec value1 value2 ...

The spec argument includes literals and keywords. The literals are placed in the result as is, while
each keyword indicates how to format the corresponding argument. The keywords are introduced with
a percent sign, %, followed by zero or more modifiers, and terminate with a conversion specifier.
Example keywords include %f for floating point, %d for integer, and %s for string format. Use %% to
obtain a single percent character. The most general keyword specification for each argument contains
up to six parts:

position specifier
flags
field width
precision
word length
conversion character
These components are explained by a series of examples. The examples use double quotes around the
format specification. This is because often the format contains white space, so grouping is required, as
well as backslash substitutions like \t or \n, and the quotes allow substitution of these special
characters. Table 4-4 lists the conversion characters:

Table 4-4. Format conversions.

d Signed integer.
u Unsigned integer.
i Signed integer. The argument may be in hex (0x) or octal (0) format.
o Unsigned octal.
x or X Unsigned hexadecimal. 'x' gives lowercase results.
c Map from an integer to the ASCII character it represents.
s A string.
f Floating point number in the format a.b.
e or E Floating point number in scientific notation, a.bE+-c.
g or G Floating point number in either %f or %e format, whichever is shorter.

A position specifier is i$, which means take the value from argument i as opposed to the normally
corresponding argument. The position counts from 1. If a position is specified for one format keyword,
the position must be used for all of them. If you group the format specification with double quotes, you
need to quote the $ with a backslash:

set lang 2
format "%${lang}\$s" one un uno
=> un

The position specifier is useful for picking a string from a set, such as this simple language-specific
example. The message catalog facility described in Chapter 15 is a much more sophisticated way to
solve this problem. The position is also useful if the same value is repeated in the formatted string.
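
Here is a short sketch of reusing and reordering values with position specifiers. The sample values are illustrative:

```tcl
# With double quotes, the $ in each position specifier needs a backslash.
puts [format "%1\$s is %2\$d; %1\$s in hex is %2\$x" value 255]
# => value is 255; value in hex is ff

# With braces, no backslash is needed.
puts [format {%2$s, %1$s} Brent Welch]
# => Welch, Brent
```
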
The flags in a format are used to specify padding and justification. In the following examples, the #
causes a leading 0x to be printed in the hexadecimal value. The zero in 08 causes the field to be
padded with zeros. Table 4-5 summarizes the format flag characters.

format "%#x" 20
=> 0x14
format "%#08x" 10
=> 0x00000a

Table 4-5. Format flags.

- Left justify the field.
+ Always include a sign, either + or -.
space Precede a number with a space, unless the number has a leading sign. Useful for packing numbers close together.
0 Pad with zeros.
# Leading 0 for octal. Leading 0x for hex. Always include a decimal point in floating point. Do not remove trailing zeros (%g).

After the flags you can specify a minimum field width value. The value is padded to this width with
spaces, or with zeros if the 0 flag is used:

format "%-20s %3d" Label 2
=> Label                  2

You can compute a field width and pass it to format as one of the arguments by using * as the field
width specifier. In this case the next argument is used as the field width instead of the value, and the
argument after that is the value that gets formatted.

set maxl 8
format "%-*s = %s" $maxl Key Value
=> Key      = Value
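
The * width is most useful when the width is computed from the data. The following sketch lines up a small table; the table contents and variable names are illustrative:

```tcl
set table {{name welch} {shell /bin/csh} {uid 28405}}

# Find the longest key so that the equal signs line up.
set maxl 0
foreach pair $table {
    set len [string length [lindex $pair 0]]
    if {$len > $maxl} {
        set maxl $len
    }
}
foreach pair $table {
    puts [format "%-*s = %s" $maxl [lindex $pair 0] [lindex $pair 1]]
}
# name  = welch
# shell = /bin/csh
# uid   = 28405
```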

The precision comes next, and it is specified with a period and a number. For %f and %e it indicates
how many digits come after the decimal point. For %g it indicates the total number of significant digits
used. For %d and %x it indicates how many digits will be printed, padding with zeros if necessary.

format "%6.2f %6.2d" 1 1
=>   1.00     01

The storage length part comes last but it is rarely useful because Tcl maintains all floating point values
in double-precision, and all integers as long words.
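
A few more precision examples, as a sketch with illustrative values. The trailing | only makes the field width visible:

```tcl
puts [format "%.3f" 3.14159]      ;# 3.142
puts [format "%8.2f|" 3.14159]    ;# "    3.14|"
puts [format "%.3d" 7]            ;# 007
puts [format "%.3s" abcdef]       ;# abc, precision truncates a string
```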

The scan Command

The scan command parses a string according to a format specification and assigns values to variables.
It returns the number of successful conversions it made. The general form of the command is:

scan string format var ?var? ?var? ...

The format for scan is nearly the same as in the format command. There is no %u scan format. The %c
scan format converts one character to its decimal value.
The scan format includes a set notation. Use square brackets to delimit a set of characters. The set
matches one or more characters that are copied into the variable. A dash is used to specify a range. The
following scans a field of all lowercase letters.

scan abcABC {%[a-z]} result
=> 1
set result
=> abc

If the first character in the set is a right square bracket, then it is considered part of the set. If the first
character in the set is ^, then characters not in the set match. Again, put a right square bracket
immediately after the ^ to include it in the set. Nothing special is required to include a left square
bracket in the set. As in the previous example, you will want to protect the format with braces, or use
backslashes, because square brackets are special to the Tcl parser.
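
The following sketch combines several conversions, the %c format, and a negated set. The input strings and variable names are illustrative:

```tcl
# Four integer conversions; scan returns how many succeeded.
set n [scan "192.168.1.10" "%d.%d.%d.%d" a b c d]
puts "$n fields: $a $b $c $d"    ;# 4 fields: 192 168 1 10

# %c yields the decimal character code.
scan A %c code
puts $code                       ;# 65

# A negated set collects everything up to the first "=".
# The %s conversion stops at white space.
scan "color=dark blue" {%[^=]=%s} key value
puts "$key -> $value"            ;# color -> dark
```

Checking the return value of scan against the number of variables is a simple way to detect input that does not fit the format.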

The binary Command

Tcl 8.0 added support for binary strings. Previous versions of Tcl used null-terminated strings
internally, which foils the manipulation of some types of data. Tcl now uses counted strings, so it can
tolerate a null byte in a string value without truncating it.
This section describes the binary command that provides conversions between strings and packed
binary data representations. The binary format command takes values and packs them according to a
template. For example, this can be used to format a floating point vector in memory suitable for
passing to Fortran. The resulting binary value is returned:

binary format template value ?value ...?

The binary scan command extracts values from a binary string according to a similar template. For
example, this is useful for extracting data stored in binary format. It assigns values to a set of Tcl
variables:

binary scan value template variable ?variable ...?

Format Templates
The template consists of type keys and counts. The types are summarized in Table 4-6. In the table,
count is the optional count following the type letter.

Table 4-6. Binary conversion types.

a A character string of length count. Padded with nulls in binary format.
A A character string of length count. Padded with spaces in binary format. Trailing nulls and blanks are discarded in binary scan.
b A binary string of length count. Low-to-high order.
B A binary string of length count. High-to-low order.
h A hexadecimal string of length count. Low-to-high order.
H A hexadecimal string of length count. High-to-low order. (More commonly used than h.)
c An 8-bit character code. The count is for repetition.
s A 16-bit integer in little-endian byte order. The count is for repetition.
S A 16-bit integer in big-endian byte order. The count is for repetition.
i A 32-bit integer in little-endian byte order. The count is for repetition.
I A 32-bit integer in big-endian byte order. The count is for repetition.
f Single-precision floating point value in native format. count is for repetition.
d Double-precision floating point value in native format. count is for repetition.
x Pack count null bytes with binary format. Skip count bytes with binary scan.
X Back up count bytes.
@ Skip to absolute position specified by count. If count is *, skip to the end.

The count is interpreted differently depending on the type. For types like integer (i) and double (d),
the count is a repetition count (e.g., i3 means three integers). For strings, the count is a length (e.g., a3
means a three-character string). If no count is specified, it defaults to 1. If count is *, then binary
scan uses all the remaining bytes in the value.

Several type keys can be specified in a template. Each key-count combination moves an imaginary
cursor through the binary data. There are special type keys to move the cursor. The x key generates
null bytes in binary format, and it skips over bytes in binary scan. The @ key uses its count as an
absolute byte offset to which to set the cursor. As a special case, @* skips to the end of the data. The X
key backs up count bytes.
Numeric types have a particular byte order that determines how their value is laid out in memory. The
type keys are lowercase for little-endian byte order (e.g., Intel) and uppercase for big-endian byte order
(e.g., SPARC and Motorola). Different integer sizes are 16-bit (s or S), 32-bit (i or I), and possibly
64-bit (l or L) on those machines that support 64-bit integers. Note that the official byte order for data
transmitted over a network is big-endian. Floating point values are always machine-specific, so it only
makes sense to format and scan these values on the same machine.
There are three string types: character (a or A), binary (b or B), and hexadecimal (h or H). With these
types the count is the length of the string. The a type pads its value to the specified length with null
bytes in binary format and the A type pads its value with spaces. If the value is too long, it is
truncated. In binary scan, the A type strips trailing blanks and nulls.
A binary string consists of zeros and ones. The b type specifies bits from low-to-high order, and the B
type specifies bits from high-to-low order. A hexadecimal string specifies 4 bits (i.e., nybbles) with
each character. The h type specifies nybbles from low-to-high order, and the H type specifies nybbles
from high-to-low order. The B and H formats match the way you normally write out numbers.
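
As a sketch of how the cursor moves through a template, the following packs a small record and then reads it back. The record layout and variable names are illustrative assumptions:

```tcl
# Pack a 16-bit and a 32-bit big-endian integer, then a 4-byte string.
# The string abcdef is truncated to fit the a4 field.
set packed [binary format "SIa4" 258 1 abcdef]
puts [string length $packed]    ;# 10 bytes: 2 + 4 + 4

binary scan $packed "SIa4" short int tag
puts "$short $int $tag"         ;# 258 1 abcd

# View the first two bytes (0x01 0x02) as hex digits and as bits.
binary scan $packed "H4" hex
puts $hex                       ;# 0102
binary scan $packed "B8b8" high low
puts "$high $low"               ;# 00000001 01000000
```

The last line shows the difference in bit order: B8 writes the byte 0x01 high bit first, while b8 writes the byte 0x02 low bit first.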

Examples
When you experiment with binary format and binary scan, remember that Tcl treats things as
strings by default. A "6", for example, is the character 6 with character code 54 or 0x36. The c type
returns these character codes:

set input 6
binary scan $input "c" 6val
set 6val
=> 54

You can scan several character codes at a time:

binary scan abc "c3" list

=> 1
set list
=> 97 98 99

The previous example uses a single type key, so binary scan sets one corresponding Tcl variable. If
you want each character code in a separate variable, use separate type keys:

binary scan abc "ccc" x y z

=> 3
set z
=> 99

Use the H format to get hexadecimal values:

binary scan 6 "H2" 6val

set 6val
=> 36

Use the a and A formats to extract fixed width fields. Here the * count is used to get all the rest of the
string. Note that A trims trailing spaces:

binary scan "hello world " a3x2A* first second

puts "\"$first\" \"$second\""
=> "hel" " world"

Use the @ key to seek to a particular offset in a value. The following command gets the second double-
precision number from a vector. Assume the vector is read from a binary data file:

binary scan $vector "@8d" double

With binary format, the a and A types create fixed width fields. A pads its field with spaces, if
necessary. The value is truncated if the string is too long:

binary format "A9A3" hello world
=> hello    wor

An array of floating point values can be created with this command:

binary format "f*" 1.2 3.45 7.43 -45.67 1.03e4

Remember that floating point values are always in native format, so you have to read them on the
same type of machine on which they were created. With integer data you specify either big-endian or
little-endian formats. The tcl_platform variable described on page 182 can tell you the byte order of the
current platform.
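
A minimal sketch of choosing the integer type key from the platform byte order follows. The variable names int32, data, and values are illustrative:

```tcl
# tcl_platform(byteOrder) is either littleEndian or bigEndian.
if {[string equal $tcl_platform(byteOrder) littleEndian]} {
    set int32 i
} else {
    set int32 I
}

# Pack and unpack three 32-bit integers in the machine's own byte order.
# With a count, the type key takes one argument that is a list.
set data [binary format "${int32}3" {10 20 30}]
binary scan $data "${int32}3" values
puts $values    ;# 10 20 30
```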

Binary Data and File I/O

When working with binary data in files, you need to turn off the newline translations and character set
encoding that Tcl performs automatically. These are described in more detail on pages 114 and 209.
For example, if you are generating binary data, the following command puts your standard output in
binary mode:

fconfigure stdout -translation binary -encoding binary

puts [binary format "B8" 11001010]
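
The same setting applies to files. The sketch below writes three double-precision values and reads the second one back. The file name vector.bin is hypothetical, and the sketch relies on -translation binary alone to put each channel in binary mode:

```tcl
# vector.bin is a hypothetical file name in the current directory.
set out [open vector.bin w]
fconfigure $out -translation binary
# Use -nonewline so that puts does not add a byte to the data.
puts -nonewline $out [binary format "d3" {1.5 2.5 3.5}]
close $out

set in [open vector.bin r]
fconfigure $in -translation binary
set vector [read $in]
close $in

# Each double occupies 8 bytes, so the second one starts at offset 8.
binary scan $vector "@8d" second
puts $second    ;# 2.5
```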

Related Chapters
To learn more about manipulating data in Tcl, read about lists in Chapter 5 and arrays in Chapter 8.
For more about pattern matching, read about regular expressions in Chapter 11.
For more about file I/O, see Chapter 9.
For information on Unicode and other internationalization issues, see Chapter 15.

Part I. Tcl Basics

Chapter 5. Tcl Lists

This chapter describes Tcl lists. Tcl commands described are: list, lindex, llength, lrange,
lappend, linsert, lreplace, lsearch, lsort, concat, join, and split.

Lists in Tcl have the same structure as Tcl commands. All the rules you learned about grouping
arguments in Chapter 1 apply to creating valid Tcl lists. However, when you work with Tcl lists, it is
best to think of lists in terms of operations instead of syntax. Tcl commands provide operations to put
values into a list, get elements from lists, count the elements of lists, replace elements of lists, and so
on. The syntax can sometimes be confusing, especially when you have to group arguments to the list
commands themselves.
Lists are used with commands such as foreach that take lists as arguments. In addition, lists are
important when you are building up a command to be evaluated later. Delayed command evaluation
with eval is described in Chapter 10, and similar issues with Tk callback commands are described in
Chapter 27.
However, Tcl lists are not often the right way to build complicated data structures in scripts. You may
find Tcl arrays more useful, and they are the topic of Chapter 8. List operations are also not right for
handling unstructured data such as user input. Use regular expressions instead, which are described in
Chapter 11.

Tcl Lists
A Tcl list is a sequence of values. When you write out a list, it has the same syntax as a Tcl command.
A list has its elements separated by white space. Braces or quotes can be used to group words with
white space into a single list element. Because of the relationship between lists and commands, the
list-related commands described in this chapter are used often when constructing Tcl commands.

Big lists were often slow before Tcl 8.0.

Unlike list data structures in other languages, Tcl lists are just strings with a special interpretation. The
string representation must be parsed on each list access, so be careful when you use large lists. A list
with a few elements will not slow down your code much. A list with hundreds or thousands of
elements can be very slow. If you find yourself maintaining large lists that must be frequently
accessed, consider changing your code to use arrays instead.
The performance of lists was improved by the Tcl compiler added in Tcl 8.0. The compiler stores lists
in an internal format that requires constant time to access. Accessing the first element costs the same
as accessing any other element in the list. Before Tcl 8.0, the cost of accessing an element was
proportional to the number of elements before it in the list. The internal format also records the
number of list elements, so getting the length of a list is cheap. Before Tcl 8.0, computing the length
required reading the whole list.
Table 5-1 briefly describes the Tcl commands related to lists.

Table 5-1. List-related commands.

list arg1 arg2 ...              Creates a list out of all its arguments.
lindex list i                   Returns the ith element from list.
llength list                    Returns the number of elements in list.
lrange list i j                 Returns the ith through jth elements from list.
lappend listVar arg arg ...     Appends elements to the value of listVar.
linsert list index arg arg ...  Inserts elements into list before the element at position index. Returns a new list.
lreplace list i j arg arg ...   Replaces elements i through j of list with the args. Returns a new list.
lsearch ?mode? list value       Returns the index of the element in list that matches the value according to the mode, which is -exact, -glob, or -regexp. -glob is the default. Returns -1 if not found.
lsort ?switches? list           Sorts elements of the list according to the switches: -ascii, -integer, -real, -dictionary, -increasing, -decreasing, -index ix, -command command. Returns a new list.
concat list list ...            Joins multiple lists together into one list.
join list joinString            Merges the elements of a list together by separating them with joinString.
split string splitChars         Splits a string up into list elements, using the characters in splitChars as boundaries between list elements.

Constructing Lists
Constructing a list can be tricky because you must maintain proper list syntax. In simple cases, you can
do this by hand. In more complex cases, however, you should use Tcl commands that take care of
quoting so that the syntax comes out right.

The list command

The list command constructs a list out of its arguments so that there is one list element for each
argument. If any of the arguments contain special characters, the list command adds quoting to
ensure that they are parsed as a single element of the resulting list. The automatic quoting is very
useful, and the examples in this book use the list command frequently. The next example uses list
to create a list with three values, two of which contain special characters.

Example 5-1 Constructing a list with the list command.

set x {1 2}
=> 1 2
set y foo
=> foo
set l1 [list $x "a b" $y]
=> {1 2} {a b} foo
set l2 "\{$x\} {a b} $y"
=> {1 2} {a b} foo

The list command does automatic quoting.

Compare the use of list with doing the quoting by hand in Example 5-1. The assignment of l2
requires carefully constructing the first list element by using quoted braces. The braces must be turned
off so that $x can be substituted, but we need to group the result so that it remains a single list element.
We also have to know in advance that $x contains a space, so quoting is required. We are taking a risk
by not quoting $y because we know it doesn't contain spaces. If its value changes in the future, the
structure of the list can change and even become invalid. In contrast, the list command takes care of
all these details automatically.
When I first experimented with Tcl lists, I became confused by the treatment of curly braces. In the
assignment to x, for example, the curly braces disappear. However, they come back again when $x is
put into a bigger list. Also, the double quotes around a b get changed into curly braces. What's going
on? Remember that there are two steps. In the first step, the Tcl parser groups arguments. In the
grouping process, the braces and quotes are syntax that define groups. These syntax characters get
stripped off. The braces and quotes are not part of the value. In the second step, the list command
creates a valid Tcl list. This may require quoting to get the list elements into the right groups. The
list command uses curly braces to group values back into list elements.
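
The two steps can be seen by building a list and then taking it apart again. This is a small sketch with illustrative values:

```tcl
set name "Brent Welch"
set dir {/usr/local/my files}

# list adds braces around values that contain spaces or are empty.
set record [list $name $dir {}]
puts $record               ;# {Brent Welch} {/usr/local/my files} {}
puts [llength $record]     ;# 3

# Extracting an element strips the grouping braces again.
puts [lindex $record 0]    ;# Brent Welch
```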

The lappend Command

The lappend command is used to append elements to the end of a list. The first argument to lappend
is the name of a Tcl variable, and the rest of the arguments are added to the variable's value as new list
elements. Like list, lappend preserves the structure of its arguments. It may add braces to group the
values of its arguments so that they retain their identity as list elements when they are appended onto
the string representation of the list.

Example 5-2 Using lappend to add elements to a list.

lappend new 1 2
=> 1 2
lappend new 3 "4 5"
=> 1 2 3 {4 5}
set new
=> 1 2 3 {4 5}

The lappend command is unique among the list-related commands because its first argument is the
name of a list-valued variable, while all the other commands take list values as arguments. You can
call lappend with the name of an undefined variable and the variable will be created.
The lappend command is implemented efficiently to take advantage of the way that Tcl stores lists
internally. It is always more efficient to use lappend than to try and append elements by hand.
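
A common pattern is to collect results in a loop with lappend. In this sketch the variable squares is illustrative and does not need to exist before the loop:

```tcl
# Collect the squares of 1 through 5.
for {set i 1} {$i <= 5} {incr i} {
    lappend squares [expr {$i * $i}]
}
puts $squares              ;# 1 4 9 16 25

# A value with a space is still added as one element.
lappend squares "thirty six"
puts [llength $squares]    ;# 6
```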

The concat Command

The concat command is useful for splicing lists together. It works by concatenating its arguments,
separating them with spaces. This joins multiple lists into one list where the top-level list elements in
each input list become top-level list elements in the resulting list:

Example 5-3 Using concat to splice lists together.

set x {4 5 6}
set y {2 3}
set z 1
concat $z $y $x
=> 1 2 3 4 5 6

Double quotes behave much like the concat command. In simple cases, double quotes behave exactly
like concat. However, the concat command trims extra white space from the end of its arguments
before joining them together with a single separating space character. Example 5-4 compares the use
of list, concat, and double quotes:

Example 5-4 Double quotes compared to the concat and list commands.

set x {1 2}
=> 1 2
set y "$x 3"
=> 1 2 3
set y [concat $x 3]
=> 1 2 3
set s { 2 }
=> 2
set y "1 $s 3"
=> 1  2  3
set y [concat 1 $s 3]
=> 1 2 3
set z [list $x $s 3]
=> {1 2} { 2 } 3

The distinction between list and concat becomes important when Tcl commands are built
dynamically. The basic rule is that list and lappend preserve list structure, while concat (or double
quotes) eliminates one level of list structure. The distinction can be subtle because there are examples
where list and concat return the same results. Unfortunately, this can lead to data-dependent bugs.
Throughout the examples of this book, you will see the list command used to safely construct lists.
This issue is discussed more in Chapter 10.
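
The data-dependent behavior shows up as soon as a value contains a space. This sketch uses illustrative values and only counts list elements:

```tcl
set word hello
set phrase "hello world"

# With a simple value, list and concat give the same result.
puts [string equal [list puts $word] [concat puts $word]]    ;# 1

# With a space in the value, only list keeps it as one element.
puts [llength [list puts $phrase]]      ;# 2
puts [llength [concat puts $phrase]]    ;# 3
```

If these lists were later evaluated as commands, the concat version would pass puts two arguments instead of one.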

Getting List Elements: llength, lindex, and lrange

The llength command returns the number of elements in a list.

llength {a b {c d} "e f g" h}

=> 5
llength {}
=> 0

The lindex command returns a particular element of a list. It takes an index; list indices count from
zero.

set x {1 2 3}
lindex $x 1
=> 2

You can use the keyword end to specify the last element of a list, or the syntax end-N to count back
from the end of the list. The following commands are equivalent ways to get the element just before
the last element in a list.

lindex $list [expr {[llength $list] - 2}]

lindex $list end-1

The lrange command returns a range of list elements. It takes a list and two indices as arguments.
Again, end or end-N can be used as an index:

lrange {1 2 3 {4 5}} 2 end

=> 3 {4 5}
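
These commands are often combined to pick apart a list of words. A short sketch with illustrative values:

```tcl
set words {cmd -verbose file1 file2}

set first [lindex $words 0]            ;# cmd
set rest  [lrange $words 1 end]        ;# -verbose file1 file2
set last2 [lrange $words end-1 end]    ;# file1 file2

# An index beyond the end of the list is not an error.
# lindex returns an empty string.
puts "[string length [lindex $words 10]]"    ;# 0
```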

Modifying Lists: linsert and lreplace

The linsert command inserts elements into a list value at a specified index. If the index is zero or
less, then the elements are added to the front. If the index is equal to or greater than the length of the
list, then the elements are appended to the end. Otherwise, the elements are inserted before the element
that is currently at the specified index.
lreplace replaces a range of list elements with new elements. If you don't specify any new elements,
you effectively delete elements from a list.
Note: linsert and lreplace do not modify an existing list. Instead, they return a new list value. In the
following example, the lreplace command does not change the value of x:

Example 5-5 Modifying lists with linsert and lreplace.

linsert {1 2} 0 new stuff
=> new stuff 1 2
set x [list a {b c} e d]
=> a {b c} e d
lreplace $x 1 2 B C
=> a B C d
lreplace $x 0 0
=> {b c} e d
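
To change a list variable, assign the result back to it. A minimal sketch with illustrative values:

```tcl
set x {a b c d}

lreplace $x 1 2             ;# returns "a d", but x is unchanged
puts $x                     ;# a b c d

set x [lreplace $x 1 2]     ;# x is now: a d
set x [linsert $x 1 B C]    ;# x is now: a B C d
set x [linsert $x end z]    ;# x is now: a B C d z
puts $x
```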

Searching Lists: lsearch

lsearch returns the index of a value in the list, or -1 if it is not present. lsearch supports pattern
matching in its search. Glob-style pattern matching is the default, and this can be disabled with the
-exact flag. The semantics of glob pattern matching are described in Chapter 4. The -regexp option lets
you specify the list value with a regular expression. Regular expressions are described in Chapter 11.
In the following example, the glob pattern l* matches the value list.

lsearch {here is a list} l*

=> 3

Example 5-6 uses lreplace and lsearch to delete a list element by value. The value is found with
lsearch. The value is removed with an lreplace that does not specify any replacement list elements:

Example 5-6 Deleting a list element by value.

proc ldelete {list value} {
    set ix [lsearch -exact $list $value]
    if {$ix >= 0} {
        return [lreplace $list $ix $ix]
    } else {
        return $list
    }
}
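
The sketch below repeats the ldelete procedure from Example 5-6 so that it can be run by itself, and then shows why the -exact flag matters. The sample lists are illustrative:

```tcl
# ldelete, as in Example 5-6.
proc ldelete {list value} {
    set ix [lsearch -exact $list $value]
    if {$ix >= 0} {
        return [lreplace $list $ix $ix]
    } else {
        return $list
    }
}

puts [ldelete {a b c b} b]    ;# a c b, only the first match is removed
puts [ldelete {a b c} z]      ;# a b c, unchanged when the value is absent

# Without -exact, the value a* is a glob pattern that matches abc first.
puts [lsearch {abc a*} a*]           ;# 0
puts [lsearch -exact {abc a*} a*]    ;# 1
```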

Sorting Lists: lsort

You can sort a list in a variety of ways with lsort. The list is not sorted in place. Instead, a new list
value is returned. The basic types of sorts are specified with the -ascii, -dictionary, -integer, or
-real options. The -increasing or -decreasing option indicates the sorting order. The default option
set is -ascii -increasing. An ASCII sort uses character codes, and a dictionary sort folds together
case and treats digits like numbers. For example:

lsort -ascii {a Z n2 n100}

=> Z a n100 n2
lsort -dictionary {a Z n2 n100}
=> a n2 n100 Z

You can provide your own sorting function for special-purpose sorting. For example, suppose you
have a list of names, where each element is itself a list containing the person's first name, middle name
(if any), and last name. The default sorts by everyone's first name. If you want to sort by their last
name, you need to supply a sorting command.

Example 5-7 Sorting a list using a comparison function.

proc NameCompare {a b} {
    set alast [lindex $a end]
    set blast [lindex $b end]
    set res [string compare $alast $blast]
    if {$res != 0} {
        return $res
    } else {
        return [string compare $a $b]
    }
}
set list {{Brent B. Welch} {John Ousterhout} {Miles Davis}}
=> {Brent B. Welch} {John Ousterhout} {Miles Davis}
lsort -command NameCompare $list
=> {Miles Davis} {John Ousterhout} {Brent B. Welch}

The NameCompare procedure extracts the last element from each of its arguments and compares those.
If they are equal, then it just compares the whole of each argument.
Tcl 8.0 added a -index option to lsort that can be used to sort lists on an index. Instead of using
NameCompare, you could do this:

lsort -index end $list
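
The -index option can be combined with the sort type and order switches. The following sketch sorts a list of name and score pairs; the data is illustrative:

```tcl
set scores {{welch 72} {ousterhout 95} {davis 88}}

# Sort by the first element of each sublist, in ASCII order.
puts [lsort -index 0 $scores]
# => {davis 88} {ousterhout 95} {welch 72}

# Sort by the second element as an integer, largest first.
puts [lsort -integer -decreasing -index 1 $scores]
# => {ousterhout 95} {davis 88} {welch 72}
```

Unlike NameCompare, a sort with -index compares only the indexed element and does not fall back to comparing the whole sublist.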

The split Command

The split command takes a string and turns it into a list by breaking it at specified characters and
ensuring that the result has the proper list syntax. The split command provides a robust way to turn
input lines into proper Tcl lists:

set line {welch:*:28405:100:Brent Welch:/usr/welch:/bin/csh}

split $line :
=> welch * 28405 100 {Brent Welch} /usr/welch /bin/csh
lindex [split $line :] 4
=> Brent Welch
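
Once the line is split, the fields can be assigned to separate variables. This sketch uses foreach with a break in the body, which assigns one round of variables and then stops. The variable names are illustrative:

```tcl
set line {welch:*:28405:100:Brent Welch:/usr/welch:/bin/csh}
set fields [split $line :]

# Assign the seven fields to seven variables in one step.
foreach {login passwd uid gid name home shell} $fields break

puts "$login ($name) uses $shell"
# => welch (Brent Welch) uses /bin/csh
```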

Do not use list operations on arbitrary data.

Even if your data has space-separated words, you should be careful when using list operators on
arbitrary input data. Otherwise, stray double quotes or curly braces in the input can result in invalid list
structure and errors in your script. Your code will work with simple test cases, but when invalid list
syntax appears in the input, your script will raise an error. The next example shows what happens
when input is not a valid list. The syntax error, an unmatched quote, occurs in the middle of the list.
However, you cannot access any of the list because the lindex command tries to convert the value to a
list before returning any part of it.

Example 5-8 Use split to turn input data into Tcl lists.

set line {this is "not a tcl list}

lindex $line 1
=> unmatched open quote in list
lindex [split $line] 2
=> "not

The default separator character for split is white space, which includes spaces, tabs, and newlines. If
there are multiple separator characters in a row, these result in empty list elements; the separators are
not collapsed. The following command splits on commas, periods, spaces, and tabs. The
backslash-space sequence is used to include a space in the set of characters. You could also group the
argument to split with double quotes:

set line "\tHello, world."
split $line \ ,.\t
=> {} Hello {} world {}

A trick that splits each character into a list element is to specify an empty string as the split character.
This lets you get at individual characters with list operations:

split abc {}
=> a b c

However, if you write scripts that process data one character at a time, they may run slowly. Read
Chapter 11 about regular expressions for hints on really efficient string processing.

The join Command

The join command is the inverse of split. It takes a list value and reformats it with specified
characters separating the list elements. In doing so, it removes any curly braces from the string
representation of the list that are used to group the top-level elements. For example:

join {1 {2 3} {4 5 6}} :
=> 1:2 3:4 5 6

If the treatment of braces is puzzling, remember that the first value is parsed into a list. The braces
around element values disappear in the process. Example 5-9 shows a way to implement join in a Tcl
procedure, which may help to understand the process:

Example 5-9 Implementing join in Tcl.

proc join {list sep} {
    set s {} ;# s is the current separator
    set result {}
    foreach x $list {
        append result $s $x
        set s $sep
    }
    return $result
}
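
To see how split and join complement each other, here is a small sketch that converts a colon-separated record into a comma-separated one. The record contents are made up for the example:

```tcl
# split turns the raw string into a proper Tcl list;
# join reformats that list with a different separator.
set record "alice:x:1001:/home/alice"
set fields [split $record :]
set csv [join $fields ,]
puts $csv
```

Going through a real list in the middle means you can also use lindex, lsort, and the other list commands on fields before you join it back together.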

Related Chapters
Arrays are the other main data structure in Tcl. They are described in Chapter 8.
List operations are used when generating Tcl code dynamically. Chapter 10 describes these
techniques when using the eval command.
The foreach command loops over the values in a list. It is described on page 73 in Chapter 6.

Part I. Tcl Basics

Chapter 6. Control Structure Commands

This chapter describes the Tcl commands that implement control structures: if, switch, foreach,
while, for, break, continue, catch, error, and return.

Control structure in Tcl is achieved with commands, just like everything else. There are looping
commands: while, foreach, and for. There are conditional commands: if and switch. There is an
error handling command: catch. Finally, there are some commands to fine-tune control structures:
break, continue, return, and error.

A control structure command often has a command body that is executed later, either conditionally or
in a loop. In this case, it is important to group the command body with curly braces to avoid
substitutions at the time the control structure command is invoked. Group with braces, and let the
control structure command trigger evaluation at the proper time. A control structure command returns
the value of the last command it chose to execute.
Another pleasant property of curly braces is that they group things together while including newlines.
The examples use braces in a way that is both readable and convenient for extending the control
structure commands across multiple lines.
Commands like if, for, and while involve boolean expressions. They use the expr command
internally, so there is no need for you to invoke expr explicitly to evaluate their boolean test
expressions.

If Then Else
The if command is the basic conditional command. If an expression is true, then execute one
command body; otherwise, execute another command body. The second command body (the else
clause) is optional. The syntax of the command is:

if expression ?then? body1 ?else? ?body2?

The then and else keywords are optional. In practice, I omit then but use else as illustrated in the
next example. I always use braces around the command bodies, even in the simplest cases:

Example 6-1 A conditional if then else command.

if {$x == 0} {
    puts stderr "Divide by zero!"
} else {
    set slope [expr $y/$x]
}

Curly brace positioning is important.

The style of this example takes advantage of the way the Tcl interpreter parses commands. Recall that
newlines are command terminators, except when the interpreter is in the middle of a group defined by
braces or double quotes. The stylized placement of the opening curly brace at the end of the first and
third lines exploits this property to extend the if command over multiple lines.
The first argument to if is a boolean expression. As a matter of style this expression is grouped with
curly braces. The expression evaluator performs variable and command substitution on the expression.
Using curly braces ensures that these substitutions are performed at the proper time. It is possible to be
lax in this regard, with constructs such as:

if $x break continue

This is a sloppy, albeit legitimate, if command that will either break out of a loop or continue with the
next iteration depending on the value of variable x. This style is fragile and error prone. Instead,
always use braces around the command bodies to avoid trouble later when you modify the command.
The following is much better (use then if it suits your taste):

if {$x} {
    break
} else {
    continue
}

When you are testing the result of a command, you can get away without using curly braces around the
command, like this:

if [command] body1

However, it turns out that you can execute the if statement more efficiently if you always group the
expression with braces, like this:

if {[command]} body1

You can create chained conditionals by using the elseif keyword. Again, note the careful placement
of curly braces that create a single if command:

Example 6-2 Chained conditional with elseif.

if {$key < 0} {
    incr range 1
} elseif {$key == 0} {
    return $range
} else {
    incr range -1
}

Any number of conditionals can be chained in this manner. However, the switch command provides a
more powerful way to test multiple conditions.

Switch
The switch command is used to branch to one of many command bodies depending on the value of an
expression. The choice can be made on the basis of pattern matching as well as simple comparisons.
Pattern matching is discussed in more detail in Chapter 4 and Chapter 11. The general form of the
command is:

switch flags value pat1 body1 pat2 body2 ...

Any number of pattern-body pairs can be specified. If multiple patterns match, only the body of the
first matching pattern is evaluated. You can also group all the pattern-body pairs into one argument:

switch flags value {pat1 body1 pat2 body2 ... }

The first form allows substitutions on the patterns but will require backslashes to continue the
command onto multiple lines. This is shown in Example 6-4 on page 72. The second form groups all
the patterns and bodies into one argument. This makes it easy to group the whole command without
worrying about newlines, but it suppresses any substitutions on the patterns. This is shown in Example
6-3. In either case, you should always group the command bodies with curly braces so that substitution
occurs only on the body with the pattern that matches the value.
There are four possible flags that determine how value is matched.

-exact Matches the value exactly to one of the patterns. This is the default.
-glob Uses glob-style pattern matching. See page 48.
-regexp Uses regular expression pattern matching. See page 134.
-- No flag (or end of flags). Necessary when value can begin with -.

The switch command raises an error if any other flag is specified or if the value begins with -. In
practice I always use the -- flag before value so that I don't have to worry about that problem.
If the pattern associated with the last body is default, then this command body is executed if no other
patterns match. The default keyword works only on the last pattern-body pair. If you use the default
pattern on an earlier body, it will be treated as a pattern to match the literal string default:

Example 6-3 Using switch for an exact match.

switch -exact -- $value {
    foo { doFoo; incr count(foo) }
    bar { doBar; return $count(foo)}
    default { incr count(other) }
}

If you have variable references or backslash sequences in the patterns, then you cannot use braces
around all the pattern-body pairs. You must use backslashes to escape the newlines in the command:

Example 6-4 Using switch with substitutions in the patterns.

switch -regexp -- $value \
    ^$key { body1 } \
    \t### { body2 } \
    {[0-9]*} { body3 }

In this example, the first and second patterns have substitutions performed to replace $key with its
value and \t with a tab character. The third pattern is quoted with curly braces to prevent command
substitution; square brackets are part of the regular expression syntax, too. (See Chapter 11.)
If the body associated with a pattern is just a dash, -, then the switch command "falls through" to the
body associated with the next pattern. You can tie together any number of patterns in this manner.

Example 6-5 A switch with "fall through" cases.

switch -glob -- $value {
    X* -
    Y* { takeXorYaction $value }
}
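
The flags, the default pattern, and fall-through can be combined in one command. The following sketch is only an illustration; the procedure name YesNo and its error message are assumptions, not part of any library:

```tcl
# Map a free-form yes/no answer to 1 or 0.
# Patterns whose body is "-" fall through to the next body.
proc YesNo {answer} {
    switch -glob -- $answer {
        y* -
        Y* { return 1 }
        n* -
        N* { return 0 }
        default { error "expected yes or no, got: $answer" }
    }
}

puts [YesNo yes]
puts [YesNo No]
```

The -- flag matters here because an answer typed by a user might begin with a dash.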

Comments in switch Commands

A comment can occur only where the Tcl parser expects a command to begin.
This restricts the location of comments in a switch command. You must put them
inside the command body associated with a pattern, as shown in Example 6-6. If
you put a comment at the same level as the patterns, the switch command will try
to interpret the comment as one or more pattern-body pairs.

Example 6-6 Comments in switch commands.

switch -- $value {
    # this comment confuses switch
    pattern { # this comment is ok }
}

While
The while command takes two arguments, a test and a command body:

while booleanExpr body

The while command repeatedly tests the boolean expression and then executes the body if the
expression is true (nonzero). Because the test expression is evaluated again before each iteration of the
loop, it is crucial to protect the expression from any substitutions before the while command is
invoked. The following is an infinite loop (see also Example 1-13 on page 12):

set i 0 ; while $i<10 {incr i}

The following behaves as expected:

set i 0 ; while {$i<10} {incr i}

It is also possible to put nested commands in the boolean expression. The following example uses
gets to read standard input. The gets command returns the number of characters read, returning -1
upon end of file. Each time through the loop, the variable line contains the next line in the file:

Example 6-7 A while loop to read standard input.

set numLines 0 ; set numChars 0
while {[gets stdin line] >= 0} {
    incr numLines
    incr numChars [string length $line]
}

Foreach
The foreach command loops over a command body assigning one or more loop variables to each of
the values in one or more lists. Multiple loop variables were introduced in Tcl 7.5. The syntax for the
simple case of a single variable and a single list is:

foreach loopVar valueList commandBody

The first argument is the name of a variable, and the command body is executed once for each element
in the list with the loop variable taking on successive values in the list. The list can be entered
explicitly, as in the next example:

Example 6-8 Looping with foreach.

set i 1
foreach value {1 3 5 7 11 13 17 19 23} {
    set i [expr $i*$value]
}
set i
=> 111546435

It is also common to use a list-valued variable or command result instead of a static list value. The
next example loops through command-line arguments. The variable argv is set by the Tcl interpreter
to be a list of the command-line arguments given when the interpreter was started:

Example 6-9 Parsing command-line arguments.

# argv is set by the Tcl shells
# possible flags are:
# -max integer
# -force
# -verbose
set state flag
set force 0
set verbose 0
set max 10
foreach arg $argv {
    switch -- $state {
        flag {
            switch -glob -- $arg {
                -f* {set force 1}
                -v* {set verbose 1}
                -max {set state max}
                default {error "unknown flag $arg"}
            }
        }
        max {
            set max $arg
            set state flag
        }
    }
}

The loop uses the state variable to keep track of what is expected next, which in this example is
either a flag or the integer value for -max. The -- flag to switch is required in this example because
the switch command complains about a bad flag if the value being matched begins with a - character.
The -glob option lets the user abbreviate the -force and -verbose options.
If the list of values is to contain variable values or command results, then the list
command should be used to form the list. Avoid double quotes because if any
values or command results contain spaces or braces, the list structure will be
reparsed, which can lead to errors or unexpected results.

Example 6-10 Using list with foreach.

foreach x [list $a $b [foo]] {
    puts stdout "x = $x"
}

The loop variable x will take on the value of a, the value of b, and the result of the foo command,
regardless of any special characters or whitespace in those values.
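
To make the difference concrete, the following sketch runs the same loop twice, once over a double-quoted string and once over a value built with list. The values of a and b are made up for the example:

```tcl
set a {two words}
set b single

# Double quotes: the string "two words single" is reparsed as a list.
set quoted {}
foreach x "$a $b" {
    lappend quoted $x
}

# list: each value stays a single element.
set listed {}
foreach x [list $a $b] {
    lappend listed $x
}

puts [llength $quoted]
puts [llength $listed]
```

The first loop runs three times and the second runs twice, because only list preserves the space inside the value of a.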

Multiple Loop Variables

You can have more than one loop variable with foreach. Suppose you have two loop variables x and
y. In the first iteration of the loop, x gets the first value from the value list and y gets the second value.
In the second iteration, x gets the third value and y gets the fourth value. This continues until there are
no more values. If there are not enough values to assign to all the loop variables, the extra variables get
the empty string as their value.

Example 6-11 Multiple loop variables with foreach.

foreach {key value} {orange 55 blue 72 red 24 green} {
    puts "$key: $value"
}
orange: 55
blue: 72
red: 24
green:

If you have a command that returns a short list of values, then you can abuse the foreach command to
assign the result of the command to several variables all at once. For example, suppose the command
MinMax returns two values as a list: the minimum and maximum values. Here is one way to get the
values:

set result [MinMax $list]

set min [lindex $result 0]
set max [lindex $result 1]

The foreach command lets us do this much more compactly:

foreach {min max} [MinMax $list] {break}

The break in the body of the foreach loop guards against the case where the command returns more
values than we expected. This trick is encapsulated into the lassign procedure in Example 10-4 on
page 131.
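
The text does not define MinMax, so the following is only one possible sketch of such a procedure, written so that the foreach trick can be tried out. It assumes a non-empty list of numbers:

```tcl
# Hypothetical MinMax: returns a two-element list {min max}.
proc MinMax {list} {
    set min [lindex $list 0]
    set max $min
    foreach x $list {
        if {$x < $min} { set min $x }
        if {$x > $max} { set max $x }
    }
    return [list $min $max]
}

foreach {min max} [MinMax {7 3 9 4}] {break}
puts "min = $min, max = $max"
```

Returning the pair with list keeps the result a well-formed Tcl list no matter what the values are.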

Multiple Value Lists

The foreach command has the ability to loop over multiple value lists in parallel. In this case, each
value list can also have one or more variables. The foreach command keeps iterating until all values
are used from all value lists. If a value list runs out of values before the last iteration of the loop, its
corresponding loop variables just get the empty string for their value.

Example 6-12 Multiple value lists with foreach.

foreach {k1 k2} {orange blue red green black} value {55 72 24} {
    puts "$k1 $k2: $value"
}
orange blue: 55
red green: 72
black : 24

For
The for command is similar to the C for statement. It takes four arguments:

for initial test final body

The first argument is a command to initialize the loop. The second argument is a boolean expression
that determines whether the loop body will execute. The third argument is a command to execute after
the loop body:

Example 6-13 A for loop.

for {set i 0} {$i < 10} {incr i 3} {
    lappend aList $i
}
set aList
=> 0 3 6 9

You could use for to iterate over a list, but you should really use foreach instead. Code like the
following is slow and cluttered:

for {set i 0} {$i < [llength $list]} {incr i} {
    set value [lindex $list $i]
}

This is the same as:

foreach value $list {
}

Break and Continue

You can control loop execution with the break and continue commands. The break command causes
immediate exit from a loop, while the continue command causes the loop to continue with the next
iteration. There is no goto command in Tcl.
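
As a small illustration, the following sketch sums the positive numbers in a list up to the first zero. The data is made up for the example:

```tcl
set sum 0
foreach n {4 -1 7 0 9} {
    if {$n < 0} {
        continue   ;# skip negative values
    }
    if {$n == 0} {
        break      ;# stop at the first zero
    }
    incr sum $n
}
puts $sum
```

The loop adds 4 and 7, skips -1, and never reaches 9, so the result is 11.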

Catch
Until now we have ignored the possibility of errors. In practice, however, a command will raise an
error if it is called with the wrong number of arguments, or if it detects some error condition particular
to its implementation. An uncaught error aborts execution of a script.[*] The catch command is used
to trap such errors. It takes two arguments:

catch command ?resultVar?

[*] More precisely, the Tcl script unwinds and the current Tcl_Eval procedure in the C runtime library returns TCL_ERROR. There are three
cases. In interactive use, the Tcl shell prints the error message. In Tk, errors that arise during event handling trigger a call to bgerror, a Tcl
procedure you can implement in your application. In your own C code, you should check the result of Tcl_Eval and take appropriate action
in the case of an error.

The first argument to catch is a command body. The second argument is the name of a variable that
will contain the result of the command, or an error message if the command raises an error. catch
returns zero if there was no error caught, or a nonzero error code if it did catch an error.
You should use curly braces to group the command instead of double quotes because catch invokes
the full Tcl interpreter on the command. If double quotes are used, an extra round of substitutions
occurs before catch is even called. The simplest use of catch looks like the following:

catch { command }

A more careful catch phrase saves the result and prints an error message:

Example 6-14 A standard catch phrase.

if {[catch { command arg1 arg2 ... } result]} {
    puts stderr $result
} else {
    # command was ok, result contains the return value
}

A more general catch phrase is shown in the next example. Multiple commands are grouped into a
command body. The errorInfo variable is set by the Tcl interpreter after an error to reflect the stack
trace from the point of the error:

Example 6-15 A longer catch phrase.

if {[catch {
    command1
    command2
    command3
} result]} {
    global errorInfo
    puts stderr $result
    puts stderr "*** Tcl TRACE ***"
    puts stderr $errorInfo
} else {
    # command body ok, result of last command is in result
}

These examples group the call to catch with curly braces, like any other if expression. Strictly
speaking, the braces could be omitted here because catch always returns an integer, so the if command
would still parse correctly. However, if we had used while instead of if, then curly braces would be
necessary to ensure that the catch phrase was evaluated repeatedly.
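
Here is the standard catch phrase applied to a concrete command. This is a sketch only; the procedure name SafeDivide and the format of its return value are assumptions made for the example:

```tcl
# Return the quotient, or a string describing the error.
proc SafeDivide {a b} {
    if {[catch {expr {$a / $b}} result]} {
        # result holds the error message
        return "error: $result"
    }
    # result holds the value of the expr command
    return $result
}

puts [SafeDivide 6 3]
puts [SafeDivide 1 0]
```

The first call prints the quotient. The second prints the interpreter's divide-by-zero message instead of aborting the script.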

Catching More Than Errors

The catch command catches more than just errors. If the command body contains return, break, or
continue commands, these terminate the command body and are reflected by catch as nonzero return
codes. You need to be aware of this if you try to isolate troublesome code with a catch phrase. An
innocent looking return command will cause the catch to signal an apparent error. The next example
uses switch to find out exactly what catch returns. Nonerror cases are passed up to the surrounding
code by invoking return, break, or continue:

Example 6-16 There are several possible return values from catch.

switch [catch {
    command1
    command2
    ...
} result] {
    0 { # Normal completion }
    1 { # Error case }
    2 { return $result ;# return from procedure}
    3 { break ;# break out of the loop}
    4 { continue ;# continue loop}
    default { # User-defined error codes }
}
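
You can observe these return codes directly. The following sketch runs catch on five small scripts, chosen only for illustration, and collects the codes:

```tcl
set codes {}
foreach script {{set x 1} {error oops} {return done} break continue} {
    # catch traps every kind of completion, so the break and
    # continue scripts do not affect this foreach loop.
    lappend codes [catch $script]
}
puts $codes
```

The list comes out as 0 1 2 3 4, matching the cases in Example 6-16.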

Error
The error command raises an error condition that terminates a script unless it is trapped with the
catch command. The command takes up to three arguments:

error message ?info? ?code?

The message becomes the error message stored in the result variable of the catch command.
If the info argument is provided, then the Tcl interpreter uses this to initialize the errorInfo global
variable. That variable is used to collect a stack trace from the point of the error. If the info argument
is not provided, then the error command itself is used to initialize the errorInfo trace.

Example 6-17 Raising an error.

proc foo {} {
    error bogus
}
foo
=> bogus
set errorInfo
=> bogus
    while executing
"error bogus"
    (procedure "foo" line 2)
    invoked from within
"foo"

In the previous example, the error command itself appears in the trace. One common use of the info
argument is to preserve the errorInfo that is available after a catch. In the next example, the
information from the original error is preserved:

Example 6-18 Preserving errorInfo when calling error.

if {[catch {foo} result]} {
    global errorInfo
    set savedInfo $errorInfo
    # Attempt to handle the error here, but cannot...
    error $result $savedInfo
}

The code argument specifies a concise, machine-readable description of the error. It is stored into the
global errorCode variable. It defaults to NONE. Many of the file system commands return an
errorCode that has three elements: POSIX, the error name (e.g., ENOENT), and the associated error
message:

POSIX ENOENT {No such file or directory}

In addition, your application can define error codes of its own. Catch phrases can examine the code in
the global errorCode variable and decide how to respond to the error.
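
As a sketch of an application-defined error code, the following procedure raises an error whose code is a list beginning with BANK. The procedure name Withdraw and the layout of the code are assumptions made for the example, not a Tcl convention:

```tcl
proc Withdraw {balance amount} {
    if {$amount > $balance} {
        # Empty info argument: let Tcl build errorInfo as usual.
        error "insufficient funds" {} [list BANK OVERDRAFT $amount]
    }
    return [expr {$balance - $amount}]
}

if {[catch {Withdraw 10 25} msg]} {
    global errorCode
    if {[lindex $errorCode 0] == "BANK"} {
        puts "bank error [lindex $errorCode 1]: $msg"
    } else {
        puts "unexpected error: $msg"
    }
}
```

Building the code with list makes it easy to pick apart with lindex, in the same way as the POSIX codes.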

Return
The return command is used to return from a procedure. It is needed if return is to occur before the
end of the procedure body, or if a constant value needs to be returned. As a matter of style, I also use
return at the end of a procedure, even though a procedure returns the value of the last command
executed in the body.
Exceptional return conditions can be specified with some optional arguments to return. The complete
syntax is:

return ?-code c? ?-errorinfo i? ?-errorcode ec? string

The -code option value is one of ok, error, return, break, continue, or an integer. ok is the default
if -code is not specified.
The -code error option makes return behave much like the error command. The -errorcode
option sets the global errorCode variable, and the -errorinfo option initializes the errorInfo global
variable. When you use return -code error, there is no error command in the stack trace. Compare
Example 6-17 with Example 6-19:

Example 6-19 Raising an error with return.

proc bar {} {
    return -code error bogus
}
catch {bar} result
=> 1
set result
=> bogus
set errorInfo
=> bogus
    while executing
"bar"

The return, break, and continue code options take effect in the caller of the procedure doing the
exceptional return. If -code return is specified, then the calling procedure returns. If -code break is
specified, then the calling procedure breaks out of a loop, and if -code continue is specified, then the
calling procedure continues to the next iteration of the loop. These -code options to return enable the
construction of new control structures entirely in Tcl. The following example implements the break
command with a Tcl procedure:

proc break {} {
    return -code break
}
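
The -code return option works the same way. In the following sketch, the hypothetical helper ReturnIfEmpty forces its caller to return early; both procedure names are made up for the example:

```tcl
# Make the calling procedure return "empty" when value is empty.
proc ReturnIfEmpty {value} {
    if {[string length $value] == 0} {
        return -code return "empty"
    }
}

proc Process {value} {
    ReturnIfEmpty $value
    # Reached only when value is not empty.
    return "processed $value"
}

puts [Process ""]
puts [Process abc]
```

The first call prints empty because the exceptional return takes effect in Process, the caller of ReturnIfEmpty.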

Part I. Tcl Basics

Chapter 7. Procedures and Scope

Procedures encapsulate a set of commands, and they introduce a local scope for variables. Commands
described are: proc, global, and upvar.
Procedures parameterize a commonly used sequence of commands. In addition, each procedure has a
new local scope for variables. The scope of a variable is the range of commands over which it is
defined. Originally, Tcl had one global scope for shared variables, local scopes within procedures, and
one global scope for procedures. Tcl 8.0 added namespaces that provide new scopes for procedures
and global variables. For simple applications you can ignore namespaces and just use the global scope.
Namespaces are described in Chapter 14.

The proc Command

A Tcl procedure is defined with the proc command. It takes three arguments:

proc name params body

The first argument is the procedure name, which is added to the set of commands understood by the
Tcl interpreter. The name is case sensitive and can contain any characters. Procedure names do not
conflict with variable names. The second argument is a list of parameter names. The last argument is
the body of the procedure.
Once defined, a Tcl procedure is used just like any other Tcl command. When it is called, each
argument is assigned to the corresponding parameter and the body is evaluated. The result of the
procedure is the result returned by the last command in the body. The return command can be used to
return a specific value.
Procedures can have default parameters so that the caller can leave out some of the command
arguments. A default parameter is specified with its name and default value, as shown in the next
example:

Example 7-1 Default parameter values.

proc P2 {a {b 7} {c -2}} {
    expr $a / $b + $c
}
P2 6 3
=> 0

Here the procedure P2 can be called with one, two, or three arguments. If it is called with only one
argument, then the parameters b and c take on the values specified in the proc command. If two
arguments are provided, then only c gets the default value, and the arguments are assigned to a and b.
At least one argument and no more than three arguments can be passed to P2.
A procedure can take a variable number of arguments by specifying the args keyword as the last
parameter. When the procedure is called, the args parameter is a list that contains all the remaining
values:

Example 7-2 Variable number of arguments.

proc ArgTest {a {b foo} args} {
    foreach param {a b args} {
        puts stdout "\t$param = [set $param]"
    }
}
set x one
set y {two things}
set z \[special\$
ArgTest $x
=> a = one
   b = foo
   args =
ArgTest $y $z
=> a = two things
   b = [special$
   args =
ArgTest $x $y $z
=> a = one
   b = two things
   args = {[special$}
ArgTest $z $y $z $x
=> a = [special$
   b = two things
   args = {[special$} one

The effect of the list structure in args is illustrated by the treatment of variable z in Example 7-2. The
value of z has special characters in it. When $z is passed as the value of parameter b, its value comes
through to the procedure unchanged. When $z is part of the optional parameters, quoting is
automatically added to create a valid Tcl list as the value of args. Example 10-3 on page 127
illustrates a technique that uses eval to undo the effect of the added list structure.

Changing Command Names with rename

The rename command changes the name of a command. There are two main uses for rename. The first
is to augment an existing procedure. Before you redefine it with proc, rename the existing command:

rename foo foo.orig

From within the new implementation of foo you can invoke the original command as foo.orig.
Existing users of foo will transparently use the new version.
The other thing you can do with rename is completely hide a command by renaming it to the empty
string. For example, you might not want users to execute UNIX programs, so you could disable exec
with the following command:

rename exec {}
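
The first use, augmenting an existing procedure, can be sketched as follows. The procedure greet is made up for the example:

```tcl
proc greet {name} {
    return "hello, $name"
}

# Keep the original under a new name, then redefine greet
# as a wrapper that calls it.
rename greet greet.orig
proc greet {name} {
    return [string toupper [greet.orig $name]]
}

puts [greet world]
```

Callers keep using the name greet and pick up the new behavior without any changes.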

Scope
By default there is a single, global scope for procedure names. This means that you can use a
procedure anywhere in your script. Variables defined outside any procedure are global variables.
However, as described below, global variables are not automatically visible inside procedures. There is
a different namespace for variables and procedures, so you could have a procedure and a global
variable with the same name without conflict. You can use the namespace facility described in Chapter
14 to manage procedures and global variables.
Each procedure has a local scope for variables. That is, variables introduced in the procedure live only
for the duration of the procedure call. After the procedure returns, those variables are undefined.
Variables defined outside the procedure are not visible to a procedure unless the upvar or global
scope commands are used. You can also use qualified names to name variables in a namespace scope.
The global and upvar commands are described later in this chapter. Qualified names are described on
page 198. If the same variable name exists in an outer scope, it is unaffected by the use of that variable
name inside a procedure.
In Example 7-3, the variable a in the global scope is different from the parameter a to P1. Similarly,
the global variable b is different from the variable b inside P1:

Example 7-3 Variable scope and Tcl procedures.

set a 5
set b -8
proc P1 {a} {
    set b 42
    if {$a < 0} {
        return $b
    } else {
        return $a
    }
}
P1 $b
=> 42
P1 [expr $a*2]
=> 10

The global Command

Global scope is the toplevel scope. This scope is outside of any procedure. Variables defined at the
global scope must be made accessible to the commands inside a procedure by using the global
command. The syntax for global is:

global varName1 varName2 ...

The global command goes inside a procedure.

The global command adds a global variable to the current scope. A common mistake is to have a
single global command and expect that to apply to all procedures. However, a global command in
the global scope has no effect. Instead, you must put a global command in all procedures that access
the global variable. The variable can be undefined at the time the global command is used. When the
variable is defined, it becomes visible in the global scope.
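
The following sketch shows the mistake and the fix side by side. The names counter, Bump, and BumpBroken are made up for the example:

```tcl
set counter 0

proc Bump {} {
    global counter
    set counter [expr {$counter + 1}]
}

proc BumpBroken {} {
    # No global command: counter here is an unset local variable,
    # so reading it raises an error.
    set counter [expr {$counter + 1}]
}

Bump
Bump
puts $counter
puts [catch {BumpBroken} message]
puts $message
```

After two calls to Bump the global counter is 2. The call to BumpBroken fails, and catch returns 1, because the procedure never sees the global variable.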
Example 7-4 shows a random number generator. Before we look at the example, let me point out that
the best way to get random numbers in Tcl is to use the rand() math function:

expr rand()
=> .137287362934

The point of the example is to show a state variable, the seed, that has to persist between calls to
random, so it is kept in a global variable. The choice of randomSeed as the name of the global variable
associates it with the random number generator. It is important to pick names of global variables
carefully to avoid conflict with other parts of your program. For comparison, Example 14-1 on page
196 uses namespaces to hide the state variable:

Example 7-4 A random number generator.[*]

proc RandomInit { seed } {
    global randomSeed
    set randomSeed $seed
}
proc Random {} {
    global randomSeed
    set randomSeed [expr ($randomSeed*9301 + 49297) % 233280]
    return [expr $randomSeed/double(233280)]
}
proc RandomRange { range } {
    expr int([Random]*$range)
}
RandomInit [pid]
=> 5049
Random
=> 0.517686899863
Random
=> 0.217176783265
RandomRange 100
=> 17

[*] Adapted from Exploring Expect by Don Libes, O'Reilly & Associates, Inc., 1995, and from Numerical Recipes in C by Press et al.,
Cambridge University Press, 1988.

Call by Name Using upvar

Use the upvar command when you need to pass the name of a variable, as opposed to its value, into a
procedure. The upvar command associates a local variable with a variable in a scope up the Tcl call
stack. The syntax of the upvar command is:

upvar ?level? varName localvar

The level argument is optional, and it defaults to 1, which means one level up the Tcl call stack. You
can specify some other number of frames to go up, or you can specify an absolute frame number with a
#number syntax. Level #0 is the global scope, so the global foo command is equivalent to:

upvar #0 foo foo

The variable in the uplevel stack frame can be a scalar variable, an array element, or an array
name. In the first two cases, the local variable is treated like a scalar variable. In the case of an array
name, the local variable is treated like an array. The use of upvar and arrays is discussed further
in Chapter 8 on page 92. The following procedure uses upvar to print the value of a variable given its
name.

Example 7-5 Print variable by name.

proc PrintByName { varName } {
    upvar 1 $varName var
    puts stdout "$varName = $var"
}

You can use upvar to fix the incr command. One drawback of the built-in incr is that it raises an
error if the variable does not exist. We can define a new version of incr that initializes the variable if
it does not already exist:

Example 7-6 Improved incr procedure.

proc incr { varName {amount 1}} {
    upvar 1 $varName var
    if {[info exists var]} {
        set var [expr $var + $amount]
    } else {
        set var $amount
    }
    return $var
}

Variable Aliases with upvar

The upvar command is useful in any situation where you have the name of a variable stored in another
variable. In Example 7-2 on page 82, the loop variable param holds the names of other variables. Their
value is obtained with this construct:

puts stdout "\t$param = [set $param]"

Another way to do this is to use upvar. It eliminates the need to use awkward constructs like [set
$param] . If the variable is in the same scope, use zero as the scope number with upvar. The following
is equivalent:

upvar 0 $param x
puts stdout "\t$param = $x"
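Here is a self-contained sketch of the same idea inside a procedure. The procedure ShowParams and its local variables are made up for the example; it relies on upvar being allowed to retarget an existing alias on each pass through the loop.

```tcl
proc ShowParams {} {
   set width 40
   set height 12
   set shown {}
   foreach param {width height} {
      upvar 0 $param x     ;# x now aliases the local named by $param
      lappend shown "$param = $x"
   }
   return $shown
}
puts [join [ShowParams] \n]
```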

Associating State with Data

Suppose you have a program that maintains state about a set of objects like files, URLs, or people.
You can use the name of these objects as the name of a variable that keeps state about the object. The
upvar command makes this more convenient:

upvar #0 $name state

Using the name directly like this is somewhat risky. If there were an object named x, then this trick
might conflict with an unrelated variable named x elsewhere in your program. You can modify the
name to make this trick more robust:

upvar #0 state$name state

Your code can pass name around as a handle on an object, then use upvar to get access to the data
associated with the object. Your code is just written to use the state variable, which is an alias to the
state variable for the current object. This technique is illustrated in Example 17-7 on page 232.
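A minimal sketch of this handle technique follows. The accessor names Obj_Set and Obj_Get, the field name size, and the object names are assumptions chosen for illustration; they are not taken from Example 17-7.

```tcl
# Hypothetical accessors. Each object gets its own global array
# named state<name>, reached through the local alias "state".
proc Obj_Set { name field value } {
   upvar #0 state$name state
   set state($field) $value
}
proc Obj_Get { name field } {
   upvar #0 state$name state
   return $state($field)
}

Obj_Set index.html size 2048
Obj_Set logo.gif size 512
puts [Obj_Get index.html size]   ;# 2048
```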

Namespaces and upvar

You can use upvar to create aliases for namespace variables, too. Namespaces are described in
Chapter 14. For example, as an alternative to reserving all global variables beginning with state, you
can use a namespace to hide these variables:

upvar #0 state::$name state

Now state is an alias to the namespace variable. This upvar trick works from inside any namespace.
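The namespace variant might look like the sketch below. It assumes the state namespace has been created beforehand with namespace eval, and the accessor names State_Set and State_Get are hypothetical.

```tcl
namespace eval state {}   ;# assumption: the namespace exists before aliasing into it

# Hypothetical accessors that keep per-object data in state::<name>.
proc State_Set { name field value } {
   upvar #0 state::$name state
   set state($field) $value
}
proc State_Get { name field } {
   upvar #0 state::$name state
   return $state($field)
}

State_Set logo.gif size 512
puts [State_Get logo.gif size]   ;# 512
```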

Commands That Take Variable Names

Several Tcl commands involve variable names. For example, the Tk widgets can be associated with a
global Tcl variable. The vwait and tkwait commands also take variable names as arguments.

Upvar aliases do not work with text variables.

The aliases created with upvar do not work with these commands, nor do they work if you use trace,
which is described on page 183. Instead, you must use the actual name of the global variable. To
continue the above example where state is an alias, you cannot use:

vwait state(foo)
button .b -textvariable state(foo)

Instead, you must use:

vwait state$name\(foo)
button .b -textvariable state$name\(foo)

The backslash turns off the array reference so Tcl does not try to access name as an array. You do not
need to worry about special characters in $name, except parentheses. Once the name has been passed
into the Tk widget it will be used directly as a variable name.
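The sketch below shows the pattern with vwait in a plain tclsh, so no Tk is needed. The object name job1 and the field done are hypothetical, and the ${name} form is used here as an alternative to the backslash for keeping (done) out of the variable substitution.

```tcl
set name job1                  ;# hypothetical object name
upvar #0 state$name state      ;# alias, convenient for ordinary reads and writes
set state(done) 0

# Hand the real variable name, not the alias, to after and vwait.
after 50 [list set state${name}(done) 1]
vwait state${name}(done)

puts "done = $state(done)"     ;# the alias sees the new value
```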

Chapter 8. Tcl Arrays

This chapter describes Tcl arrays, which provide a flexible mechanism to build many other data
structures in Tcl. The Tcl command described is: array.

An array is a Tcl variable with a string-valued index. You can think of the index as a key, and the array
as a collection of related data items identified by different keys. The index, or key, can be any string
value. Internally, an array is implemented with a hash table, so the cost of accessing each array
element is about the same. Before Tcl 8.0, arrays had a performance advantage over lists, which took
time to access proportional to the size of the list.

The flexibility of arrays makes them an important tool for the Tcl programmer. A common use of
arrays is to manage a collection of variables, much as you use a C struct or Pascal record. This chapter
shows how to create several simple data structures using Tcl arrays.

Array Syntax
The index of an array is delimited by parentheses. The index can have any string value, and it can be
the result of variable or command substitution. Array elements are defined with set:

set arr(index) value

The value of an array element is obtained with $ substitution:

set foo $arr(index)

Example 8-1 uses the loop variable value $i as an array index. It sets arr(x) to the product of 1 * 2
* ... * x:

Example 8-1 Using arrays.

set arr(0) 1
for {set i 1} {$i <= 10} {incr i} {
   set arr($i) [expr {$i * $arr([expr {$i-1}])}]
}

Complex Indices
An array index can be any string, like orange, 5, 3.1415, or foo,bar. The examples in this chapter,
and in this book, often use indices that are pretty complex strings to create flexible data structures. As
a rule of thumb, you can use any string for an index, but avoid using a string that contains spaces.

Parentheses are not a grouping mechanism.

The main Tcl parser does not know about array syntax. All the rules about grouping and substitution
described in Chapter 1 are still the same in spite of the array syntax described here. Parentheses do not
group like curly braces or quotes, which is why a space causes problems. If you have complex indices,
use a comma to separate different parts of the index. If you use a space in an index instead, then you
have a quoting problem. The space in the index needs to be quoted with a backslash, or the whole
variable reference needs to be grouped:

set {arr(I'm asking for trouble)} {I told you so.}

set arr(I'm\ asking\ for\ trouble) {I told you so.}

If the array index is stored in a variable, then there is no problem with spaces in the variable's value.
The following works well:

set index {I'm asking for trouble}

set arr($index) {I told you so.}

Array Variables
You can use an array element as you would a simple variable. For example, you can test for its
existence with info exists, increment its value with incr, and append elements to it with lappend:

if {[info exists stats($event)]} {incr stats($event)}

You can delete an entire array, or just a single array element, with unset. Using unset on an array is a
convenient way to clear out a big data structure.

It is an error to use a variable as both an array and a normal variable. The following is an error:

set arr(0) 1
set arr 3
=> can't set "arr": variable is array

The name of the array can be the result of a substitution. This is a tricky situation, as shown in
Example 8-2:

Example 8-2 Referencing an array indirectly.

set name TheArray
=> TheArray
set ${name}(xyz) {some value}
=> some value
set x $TheArray(xyz)
=> some value
set x ${name}(xyz)
=> TheArray(xyz)
set x [set ${name}(xyz)]
=> some value

A better way to deal with this situation is to use the upvar command, which is introduced on page 85.
The previous example is much cleaner when upvar is used:

Example 8-3 Referencing an array indirectly using upvar.

set name TheArray
=> TheArray
upvar 0 $name a
set a(xyz) {some value}
=> some value
set x $TheArray(xyz)
=> some value

The array Command

The array command returns information about array variables. The array names command returns
the index names that are defined in the array. If the array variable is not defined, then array names
just returns an empty list. It allows easy iteration through an array with a foreach loop:

foreach index [array names arr pattern] {
   # use arr($index)
}

The order of the names returned by array names is arbitrary. It is essentially determined by the hash
table implementation of the array. You can limit what names are returned by specifying a pattern that
matches indices. The pattern is the kind supported by the string match command, which is described
on page 48.
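Because the order is arbitrary, a common idiom is to sort the names before looping. The sketch below combines a pattern with lsort; the array color and its contents are invented for the example.

```tcl
array set color {
   apple,skin  red
   apple,flesh white
   kiwi,skin   brown
}

# Select only the apple entries, then sort them for a stable order.
set report {}
foreach index [lsort [array names color apple,*]] {
   lappend report "$index = $color($index)"
}
puts [join $report \n]
```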
It is also possible to iterate through the elements of an array one at a time using the search-related
commands listed in Table 8-1. The ordering is also arbitrary, and I find the foreach over the results of
array names much more convenient. If your array has an extremely large number of elements, or if
you need to manage an iteration over a long period of time, then the array search operations might be
more appropriate. Frankly, I never use them. Table 8-1 summarizes the array command:

Table 8-1. The array command.

array exists arr              Returns 1 if arr is an array variable.
array get arr ?pattern?       Returns a list that alternates between an index and the
                              corresponding array value. pattern selects matching indices.
                              If not specified, all indices and values are returned.
array names arr ?pattern?     Returns the list of all indices defined for arr, or those
                              that match the string match pattern.
array set arr list            Initializes the array arr from list, which has the same form
                              as the list returned by array get.
array size arr                Returns the number of indices defined for arr.
array startsearch arr         Returns a search token for a search through arr.
array nextelement arr id      Returns the index of the next element in arr in the search
                              identified by the token id. Returns an empty string if no
                              more elements remain in the search.
array anymore arr id          Returns 1 if more elements remain in the search.
array donesearch arr id       Ends the search identified by id.
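For completeness, here is a sketch of a loop built from the search commands. The array fruit and the variable seen are example names; the indices are sorted afterward only because the search order is arbitrary.

```tcl
array set fruit {best kiwi worst peach ok banana}

set seen {}
set id [array startsearch fruit]
while {[array anymore fruit $id]} {
   # nextelement yields an index; use it to fetch the value.
   set index [array nextelement fruit $id]
   lappend seen $index
   puts "$index = $fruit($index)"
}
array donesearch fruit $id

set seen [lsort $seen]
```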

Converting Between Arrays and Lists

The array get and array set operations are used to convert between an array and a list. The list
returned by array get has an even number of elements. The first element is an index, and the next is
the corresponding array value. The list elements continue to alternate between index and value. The
list argument to array set must have the same structure.

array set fruit {
   best kiwi
   worst peach
   ok banana
}
array get fruit
=> ok banana best kiwi worst peach

Another way to loop through the contents of an array is to use array get and the two-variable form of
the foreach command.

foreach {key value} [array get fruit] {
   # key is ok, best, or worst
   # value is some fruit
}

Passing Arrays by Name
The upvar command works on arrays. You can pass an array name to a procedure and use the upvar
command to get an indirect reference to the array variable in the caller's scope. This is illustrated in
Example 8-4, which inverts an array. As with array names, you can specify a pattern to array get to
limit what part of the array is returned. This example uses upvar because the array names are passed
into the ArrayInvert procedure. The inverse array does not need to exist before you call
ArrayInvert.

Example 8-4 ArrayInvert inverts an array.

proc ArrayInvert {arrName inverseName {pattern *}} {
   upvar $arrName array $inverseName inverse
   foreach {index value} [array get array $pattern] {
      set inverse($value) $index
   }
}
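A short usage sketch follows. It repeats ArrayInvert so that it runs on its own; the array names fruit and rank are chosen for the example.

```tcl
proc ArrayInvert {arrName inverseName {pattern *}} {
   upvar $arrName array $inverseName inverse
   foreach {index value} [array get array $pattern] {
      set inverse($value) $index
   }
}

array set fruit {best kiwi worst peach ok banana}

# rank does not exist yet; ArrayInvert creates it in this scope.
ArrayInvert fruit rank
puts $rank(kiwi)      ;# best
puts $rank(banana)    ;# ok
```

If two indices share the same value, the inverse array keeps only one of them, so this works best when the values are unique.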

Building Data Structures with Arrays

This section describes several data structures you can build with Tcl arrays. These examples are
presented as procedures that implement access functions to the data structure. Wrapping up your data
structures in procedures is good practice. It shields the user of your data structure from the details of
its implementation.

Use arrays to collect related variables.

A good use for arrays is to collect together a set of related variables for a module, much as one would
use a record in other languages. By collecting these together in an array that has the same name as the
module, name conflicts between different modules are avoided. Also, in each of the module's
procedures, a single global statement will suffice to make all the state variables visible. You can also
use upvar to manage a collection of arrays, as shown in Example 8-8 on page 95.

Simple Records
Suppose we have a database of information about people. One approach uses a different array for each
class of information. The name of the person is the index into each array:

Example 8-5 Using arrays for records, version 1.

proc Emp_AddRecord {id name manager phone} {
   global employeeID employeeManager \
      employeePhone employeeName
   set employeeID($name) $id
   set employeeManager($name) $manager
   set employeePhone($name) $phone
   set employeeName($id) $name
}
proc Emp_Manager {name} {
   global employeeManager
   return $employeeManager($name)
}

Simple procedures are defined to return fields of the record, which hides the implementation so that
you can change it more easily. The employeeName array provides a secondary key. It maps from the
employee ID to the name so that the other information can be obtained if you have an ID instead of a
name. Another way to implement the same little database is to use a single array with more complex
indices:

Example 8-6 Using arrays for records, version 2.

proc Emp_AddRecord {id name manager phone} {
   global employee
   set employee(id,$name) $id
   set employee(manager,$name) $manager
   set employee(phone,$name) $phone
   set employee(name,$id) $name
}
proc Emp_Manager {name} {
   global employee
   return $employee(manager,$name)
}

The difference between these two approaches is partly a matter of taste. Using a single array can be
more convenient because there are fewer variables to manage. In any case, you should hide the
implementation in a small set of procedures.
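The sketch below exercises the single-array version and adds one more accessor for the secondary key. Emp_NameById and the sample record are hypothetical additions for illustration.

```tcl
proc Emp_AddRecord {id name manager phone} {
   global employee
   set employee(id,$name) $id
   set employee(manager,$name) $manager
   set employee(phone,$name) $phone
   set employee(name,$id) $name
}
proc Emp_Manager {name} {
   global employee
   return $employee(manager,$name)
}
# Hypothetical accessor that uses the secondary key.
proc Emp_NameById {id} {
   global employee
   return $employee(name,$id)
}

# Made-up sample data. The space in the name is safe because the
# index comes from a variable substitution.
Emp_AddRecord 1001 "Ann Lee" "Raj Patel" 555-0100
puts [Emp_Manager "Ann Lee"]             ;# Raj Patel
puts [Emp_Manager [Emp_NameById 1001]]   ;# Raj Patel
```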

A Stack
A stack can be implemented with either a list or an array. If you use a list, then the push and pop
operations have a runtime cost that is proportional to the size of the stack. If the stack has a few
elements this is fine. If there are a lot of items in a stack, you may wish to use arrays instead.

Example 8-7 Using a list to implement a stack.

proc Push { stack value } {
   upvar $stack list
   lappend list $value
}
proc Pop { stack } {
   upvar $stack list
   set value [lindex $list end]
   set list [lrange $list 0 [expr {[llength $list]-2}]]
   return $value
}

In these examples, the name of the stack is a parameter, and upvar is used to convert that into the data
used for the stack. The variable is a list in Example 8-7 and an array in Example 8-8. The user of the
stack module does not have to know.

The array implementation of a stack uses one array element to record the number of items in the stack.
The other elements of the array have the stack values. The Push and Pop procedures both guard against
a nonexistent array with the info exists command. When the first assignment to S(top) is done by
Push, the array variable is created in the caller's scope. The example uses array indices in two ways.
The top index records the depth of the stack. The other indices are numbers, so the construct
$S($S(top)) is used to reference the top of the stack.

Example 8-8 Using an array to implement a stack.

proc Push { stack value } {
   upvar $stack S
   if {![info exists S(top)]} {
      set S(top) 0
   }
   set S($S(top)) $value
   incr S(top)
}
proc Pop { stack } {
   upvar $stack S
   if {![info exists S(top)]} {
      return {}
   }
   if {$S(top) == 0} {
      return {}
   } else {
      incr S(top) -1
      set x $S($S(top))
      unset S($S(top))
      return $x
   }
}
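A brief usage sketch of the array-based stack follows, with the two procedures repeated so that it runs on its own. The stack name todo and the pushed values are arbitrary.

```tcl
proc Push { stack value } {
   upvar $stack S
   if {![info exists S(top)]} {
      set S(top) 0
   }
   set S($S(top)) $value
   incr S(top)
}
proc Pop { stack } {
   upvar $stack S
   if {![info exists S(top)]} {
      return {}
   }
   if {$S(top) == 0} {
      return {}
   } else {
      incr S(top) -1
      set x $S($S(top))
      unset S($S(top))
      return $x
   }
}

Push todo first        ;# creates the array todo in this scope
Push todo second
set a [Pop todo]       ;# second (last in, first out)
set b [Pop todo]       ;# first
set c [Pop todo]       ;# empty string: the stack is empty
puts "$a $b <$c>"
```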

A List of Arrays
Suppose you have many arrays, each of which stores some data, and you want to maintain an overall
ordering among the data sets. One approach is to keep a Tcl list with the name of each array in order.
Example 8-9 defines RecordAppend to add an array to the list, and an iterator function,
RecordIterate, that applies a script to each array in order. The iterator uses upvar to make data an
alias for the current array. The script is executed with eval, which is described in detail in Chapter 10.
The Tcl commands in script can reference the arrays with the name data:

Example 8-9 A list of arrays.

proc RecordAppend {listName arrayName} {
   upvar $listName list
   lappend list $arrayName
}
proc RecordIterate {listName script} {
   upvar $listName list
   foreach arrayName $list {
      upvar #0 $arrayName data
      eval $script
   }
}
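The following usage sketch repeats the two procedures and runs a script over two records. The record arrays rec1 and rec2, the list order, and the collector titles are invented names. Note that the script runs inside RecordIterate, so it writes to the global ::titles explicitly, and the record arrays must be global because the iterator uses upvar #0.

```tcl
proc RecordAppend {listName arrayName} {
   upvar $listName list
   lappend list $arrayName
}
proc RecordIterate {listName script} {
   upvar $listName list
   foreach arrayName $list {
      upvar #0 $arrayName data
      eval $script
   }
}

array set rec1 {title Alpha}
array set rec2 {title Beta}
RecordAppend order rec1
RecordAppend order rec2

set titles {}
RecordIterate order {lappend ::titles $data(title)}
puts $titles   ;# Alpha Beta
```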

Another way to implement this list-of-records structure is to keep references to the arrays that come
before and after each record. Example 8-10 shows the insert function and the iterator function when
using this approach. Once again, upvar is used to set up data as an alias for the current array in the
iterator. In this case, the loop is terminated by testing for the existence of the next array. It is perfectly
all right to make an alias with upvar to a nonexistent variable. It is also all right to change the target of
the upvar alias. One detail that is missing from the example is the initialization of the very first record
so that its next element is the empty string:

Example 8-10 A list of arrays.

proc RecordInsert {recName afterThis} {
   upvar $recName record $afterThis after
   set record(next) $after(next)
   set after(next) $recName
}
proc RecordIterate {firstRecord body} {
   upvar #0 $firstRecord data
   while {[info exists data]} {
      eval $body
      upvar #0 $data(next) data
   }
}
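The sketch below supplies the missing initialization and walks a two-record chain. The record names head and second, the title field, and the collector titles are assumptions made for the example. The walk ends when the alias is pointed at the variable named by the empty string, which does not exist.

```tcl
proc RecordInsert {recName afterThis} {
   upvar $recName record $afterThis after
   set record(next) $after(next)
   set after(next) $recName
}
proc RecordIterate {firstRecord body} {
   upvar #0 $firstRecord data
   while {[info exists data]} {
      eval $body
      upvar #0 $data(next) data
   }
}

# Initialize the very first record so that its next element is empty.
set head(next) {}
set head(title) Alpha

set second(title) Beta
RecordInsert second head    ;# head -> second

set titles {}
RecordIterate head {lappend ::titles $data(title)}
puts $titles
```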

A Simple In-Memory Database

Suppose you have to manage a lot of records, each of which contains a large chunk of data and one or
more key values you use to look up those values. The procedure to add a record is called like this:

Db_Insert keylist datablob

The datablob might be a name, value list suitable for passing to array set, or simply a large chunk
of text or binary data. One implementation of Db_Insert might just be:

foreach key $keylist {
   lappend Db($key) $datablob
}

The problem with this approach is that it duplicates the data chunks under each key. A better approach
is to use two arrays. One stores all the data chunks under a simple ID that is generated automatically.
The other array stores the association between the keys and the data chunks. Example 8-11, which uses
the namespace syntax described in Chapter 14, illustrates this approach. The example also shows how
you can easily dump data structures by writing array set commands to a file, and then load them
later with a source command:

Example 8-11 A simple in-memory database.

namespace eval db {
   variable data      ;# Array of data blobs
   variable uid 0     ;# Index into data
   variable index     ;# Cross references into data
}
proc db::insert {keylist datablob} {
   variable data
   variable uid
   variable index
   set data([incr uid]) $datablob
   foreach key $keylist {
      lappend index($key) $uid
   }
}
proc db::get {key} {
   variable data
   variable index
   set result {}
   if {![info exists index($key)]} {
      return {}
   }
   foreach uid $index($key) {
      lappend result $data($uid)
   }
   return $result
}
proc db::save {filename} {
   variable uid
   set out [open $filename w]
   puts $out [list namespace eval db \
      [list variable uid $uid]]
   puts $out [list array set db::data [array get db::data]]
   puts $out [list array set db::index [array get db::index]]
   close $out
}
proc db::load {filename} {
   source $filename
}
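A short usage sketch follows. It repeats only the insert and get procedures so that it runs on its own, and the keys and data blobs are made-up sample data. Each blob is stored once, however many keys refer to it.

```tcl
namespace eval db {
   variable data      ;# Array of data blobs
   variable uid 0     ;# Index into data
   variable index     ;# Cross references into data
}
proc db::insert {keylist datablob} {
   variable data
   variable uid
   variable index
   set data([incr uid]) $datablob
   foreach key $keylist {
      lappend index($key) $uid
   }
}
proc db::get {key} {
   variable data
   variable index
   set result {}
   if {![info exists index($key)]} {
      return {}
   }
   foreach uid $index($key) {
      lappend result $data($uid)
   }
   return $result
}

db::insert {welch tcl}        "Practical Programming in Tcl & Tk"
db::insert {libes expect tcl} "Exploring Expect"

puts [llength [db::get tcl]]   ;# two records share the key tcl
puts [db::get expect]          ;# a one-element list
puts [array size db::data]     ;# only two blobs are stored
```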

Chapter 9. Working with Files and Programs

This chapter describes how to run programs, examine the file system, and access environment
variables through the env array. Tcl commands described are: exec, file, open, close, read,
puts, gets, flush, seek, tell, glob, pwd, cd, exit, pid, and registry.

This chapter describes how to run programs and access the file system from Tcl. These commands
were designed for UNIX. In Tcl 7.5 they were implemented in the Tcl ports to Windows and
Macintosh. There are facilities for naming files and manipulating file names in a platform-independent
way, so you can write scripts that are portable across systems. These capabilities enable your Tcl script
to be a general-purpose glue that assembles other programs into a tool that is customized for your
needs.

Running Programs with exec

The exec command runs programs from your Tcl script.[*] For example:

set d [exec date]

[*] Unlike other UNIX shell exec commands, the Tcl exec does not replace the current process with the new one. Instead, the Tcl library
forks first and executes the program as a child process.

The standard output of the program is returned as the value of the exec command. However, if the
program writes to its standard error channel or exits with a nonzero status code, then exec raises an
error. If you do not care about the exit status, or you use a program that insists on writing to standard
error, then you can use catch to mask the errors:

catch {exec program arg arg} result
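One way to package this is a small wrapper that reports whether exec succeeded. The procedure TryExec is a hypothetical helper, and the program name below is deliberately one that should not exist on your system.

```tcl
# Hypothetical wrapper: returns {ok output} or {error message}.
proc TryExec { args } {
   # Build the exec command as a list so arguments keep their boundaries.
   if {[catch {eval [linsert $args 0 exec]} result]} {
      return [list error $result]
   }
   return [list ok $result]
}

set status [TryExec no-such-program-xyz]
puts [lindex $status 0]    ;# error
puts [lindex $status 1]    ;# the message produced by exec
```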

The exec command supports a full set of I/O redirection and pipeline syntax. Each process normally
has three I/O channels associated with it: standard input, standard output, and standard error. With I/O
redirection, you can divert these I/O channels to files or to I/O channels you have opened with the Tcl
open command. A pipeline is a chain of processes that have the standard output of one command
hooked up to the standard input of the next command in the pipeline. Any number of programs can be
linked together into a pipeline.

Example 9-1 Using exec on a process pipeline.

set n [exec sort < /etc/passwd | uniq | wc -l 2> /dev/null]

Example 9-1 uses exec to run three programs in a pipeline. The first program is sort, which takes its
input from the file /etc/passwd. The output of sort is piped into uniq, which suppresses duplicate
lines. The output of uniq is piped into wc, which counts the lines. The error output of the command is
diverted to the null device to suppress any error messages. Table 9-1 provides a summary of the syntax
understood by the exec command.

Table 9-1. Summary of the exec syntax for I/O redirection.

-keepnewline (First argument.) Do not discard trailing newline from the result.
| Pipes standard output from one process into another.
|& Pipes both standard output and standard error output.
< fileName Takes input from the named file.
<@ fileId Takes input from the I/O channel identified by fileId.
<< value Takes input from the given value.
> fileName Overwrites fileName with standard output.
2> fileName Overwrites fileName with standard error output.
>& fileName Overwrites fileName with both standard error and standard out.
>> fileName Appends standard output to the named file.
2>> fileName Appends standard error to the named file.
>>& fileName Appends both standard error and standard output to the named file.
>@ fileId Directs standard output to the I/O channel identified by fileId.
2>@ fileId Directs standard error to the I/O channel identified by fileId.
>&@ fileId Directs both standard error and standard output to the I/O channel.
& As the last argument, indicates pipeline should run in background.

A trailing & causes the program to run in the background. In this case, the process identifier is returned
by the exec command. Otherwise, the exec command blocks during execution of the program, and the
standard output of the program is the return value of exec. The trailing newline in the output is
trimmed off, unless you specify -keepnewline as the first argument to exec.

If you look closely at the I/O redirection syntax, you'll see that it is built up from a few basic building
blocks. The basic idea is that | stands for pipeline, > for output, and < for input. The standard error is
joined to the standard output by &. Standard error is diverted separately by using 2>. You can use your
own I/O channels by using @.
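The sketch below combines two of these building blocks, << and |. It assumes a UNIX-like system with sort, uniq, and wc on the search path; the input strings are arbitrary.

```tcl
# << feeds a string to standard input; | pipes sort into uniq.
set unique [exec sort << "pear\napple\npear\n" | uniq]
puts $unique

# Count the lines of a string. The string trim removes the padding
# that some versions of wc put around the number.
set count [string trim [exec wc -l << "a\nb\nc\n"]]
puts $count
```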

The auto_noexec Variable

The Tcl shell programs are set up during interactive use to attempt to execute unknown Tcl commands
as programs. For example, you can get a directory listing by typing:

ls

instead of:

exec ls

This is handy if you are using the Tcl interpreter as a general shell. It can also cause unexpected
behavior when you are just playing around. To turn this off, define the auto_noexec variable:

set auto_noexec anything

Limitations of exec on Windows

Windows 3.1 has an unfortunate combination of special cases that stem from console-mode programs,
16-bit programs, and 32-bit programs. In addition, pipes are really just simulated by writing output
from one process to a temporary file and then having the next process read from that file. If exec or a
process pipeline fails, it is because of a fundamental limitation of Windows. The good news is that
Windows 95 and Windows NT cleaned up most of the problems with exec. Windows NT 4.0 is the
most robust.
Tcl 8.0p2 was the last release to officially support Windows 3.1. That release includes Tcl1680.dll,
which is necessary to work with the win32s subsystem. If you copy that file into the same directory as
the other Tcl DLLs, you may be able to use later releases of Tcl on Windows 3.1. However, there is no
guarantee this trick will continue to work.

AppleScript on Macintosh

The exec command is not provided on the Macintosh. Tcl ships with an AppleScript extension that
lets you control other Macintosh applications. You can find documentation in the AppleScript.html
file that goes with the distribution. You must use package require to load the AppleScript command:

package require Tclapplescript

AppleScript junk
=> bad option "junk": must be compile, decompile, delete,
execute, info, load, run, or store.

The file Command

The file command provides several ways to check the status of files in the file system. For example,
you can find out if a file exists, what type of file it is, and other file attributes. There are facilities for
manipulating files in a platform-independent manner. Table 9-2 provides a summary of the various
forms of the file command. They are described in more detail later. Note that the split, join, and
pathtype operations were added in Tcl 7.5. The copy, delete, mkdir, and rename operations were
added in Tcl 7.6. The attributes operation was added in Tcl 8.0.

Table 9-2. The file command options.

file atime name                             Returns access time as a decimal string.
file attributes name ?option? ?value? ...   Queries or sets file attributes. (Tcl 8.0)
file copy ?-force? source destination       Copies file source to file destination. The source and
                                            destination can be directories. (Tcl 7.6)
file delete ?-force? name                   Deletes the named file. (Tcl 7.6)
file dirname name                           Returns parent directory of file name.
file executable name                        Returns 1 if name has execute permission, else 0.
file exists name                            Returns 1 if name exists, else 0.
file extension name                         Returns the part of name from the last dot (i.e., .) to the
                                            end. The dot is included in the return value.
file isdirectory name                       Returns 1 if name is a directory, else 0.
file isfile name                            Returns 1 if name is not a directory, symbolic link, or
                                            device, else 0.
file join path path...                      Joins pathname components into a new pathname. (Tcl 7.5)
file lstat name var                         Places attributes of the link name into var.
file mkdir name                             Creates directory name. (Tcl 7.6)
file mtime name                             Returns modify time of name as a decimal string.
file nativename name                        Returns the platform-native version of name. (Tcl 8.0)
file owned name                             Returns 1 if current user owns the file name, else 0.
file pathtype name                          Returns relative, absolute, or volumerelative. (Tcl 7.5)
file readable name                          Returns 1 if name has read permission, else 0.
file readlink name                          Returns the contents of the symbolic link name.
file rename ?-force? old new                Changes the name of old to new. (Tcl 7.6)
file rootname name                          Returns all but the extension of name (i.e., up to but not
                                            including the last . in name).
file size name                              Returns the number of bytes in name.
file split name                             Splits name into its pathname components. (Tcl 7.5)
file stat name var                          Places attributes of name into array var. The elements
                                            defined for var are listed in Table 9-3.
file tail name                              Returns the last pathname component of name.
file type name                              Returns type identifier, which is one of: file, directory,
                                            characterSpecial, blockSpecial, fifo, link, or socket.
file writable name                          Returns 1 if name has write permission, else 0.
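As a quick sketch of a few of these operations, the script below inspects the running Tcl interpreter's own executable, which is a convenient file that is known to exist. The variable names path and st are arbitrary, and only the size and type elements of the stat array are used.

```tcl
set path [info nameofexecutable]   ;# the tclsh (or wish) running this script

puts "exists:      [file exists $path]"
puts "isfile:      [file isfile $path]"
puts "isdirectory: [file isdirectory $path]"
puts "executable:  [file executable $path]"
puts "tail:        [file tail $path]"

# file stat fills in an array with the attributes of the file.
file stat $path st
puts "size:        $st(size) bytes"
puts "type:        $st(type)"
```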

Cross-Platform File Naming

Files are named differently on UNIX, Windows, and Macintosh. UNIX separates file name
components with a forward slash (/), Macintosh separates components with a colon (:), and Windows
separates components with a backslash (\). In addition, the way that absolute and relative names are
distinguished is different. For example, these are absolute pathnames for the Tcl script library (i.e.,
$tcl_library) on Macintosh, Windows, and UNIX, respectively:

Disk:System Folder:Extensions:Tool Command Language:tcl7.6

c:\Program Files\Tcl\lib\Tcl7.6
/usr/local/tcl/lib/tcl7.6

The good news is that Tcl provides operations that let you deal with file pathnames in a platform-
independent manner. The file operations described in this chapter allow either native format or the
UNIX naming convention. The backslash used in Windows pathnames is especially awkward because
the backslash is special to Tcl. Happily, you can use forward slashes instead:

c:/Program Files/Tcl/lib/Tcl7.6

There are some ambiguous cases that can be specified only with native pathnames. On my Macintosh,
Tcl and Tk are installed in a directory that has a slash in it. You can name it only with the native
Macintosh name:

Disk:Applications:Tcl/Tk 4.2

Another construct to watch out for is a leading // in a file name. This is the Windows syntax for
network names that reference files on other computers. You can avoid accidentally constructing a
network name by using the file join command described next. Of course, you can use network
names to access remote files.

If you must communicate with external programs, you may need to construct a file name in the native
syntax for the current platform. You can construct these names with file join described later. You
can also convert a UNIX-like name to a native name with file nativename.

Several of the file operations operate on pathnames as opposed to returning information about the
file itself. You can use the dirname, extension, join, pathtype, rootname, split, and tail
operations on any string; there is no requirement that the pathnames refer to an existing file.

Building up Pathnames: file join

You can get into trouble if you try to construct file names by simply joining components with a slash.
If part of the name is in native format, joining things with slashes will result in incorrect pathnames on
Macintosh and Windows. The same problem arises when you accept user input. The user is likely to
provide file names in native format. For example, this construct will not create a valid pathname on
the Macintosh because $tcl_library is in native format:

set file $tcl_library/init.tcl

Use file join to construct file names.

The platform-independent way to construct file names is with file join. The following command
returns the name of the init.tcl file in native format:

set file [file join $tcl_library init.tcl]
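A small sketch you can paste into tclsh to see what this produces on your own platform. It only assumes the standard tcl_library variable; the exact pathname printed depends on where Tcl is installed.

```tcl
set file [file join $tcl_library init.tcl]
puts $file                    ;# location depends on your installation
puts [file tail $file]        ;# init.tcl
puts [file pathtype $file]    ;# usually absolute
```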

The file join operation can join any number of pathname components. In addition, it has the feature
that an absolute pathname overrides any previous components. For example (on UNIX), /b/c is an
absolute pathname, so it overrides any paths that come before it in the arguments to file join:

file join a b/c d
=> a/b/c/d
file join a /b/c d
=> /b/c/d

On Macintosh, a relative pathname starts with a colon, and an absolute pathname does not. To specify
an absolute path, you put a trailing colon on the first component so that it is interpreted as a volume
specifier. These relative components are joined into a relative pathname:

file join a :b:c d

=> :a:b:c:d

In the next case, b:c is an absolute pathname with b: as the volume specifier. The absolute name
overrides the previous relative name:

file join a b:c d

=> b:c:d

The file join operation converts UNIX-style pathnames to native format. For example, on Macintosh
you get this:

file join /usr/local/lib

=> usr:local:lib

Chopping Pathnames: split, dirname, tail

The file split command divides a pathname into components. It is the inverse of file join. The
split operation detects automatically if the input is in native or UNIX format. The results of file
split may contain some syntax to help resolve ambiguous cases when the results are passed back to
file join. For example, on Macintosh a UNIX-style pathname is split on slash separators. The
Macintosh syntax for a volume specifier ( Disk:) is returned on the leading component:

file split "/Disk/System Folder/Extensions"

=> Disk: {System Folder} Extensions

A common reason to split up pathnames is to divide a pathname into the directory part and the file
part. This task is handled directly by the dirname and tail operations. The dirname operation returns
the parent directory of a pathname, while tail returns the trailing component of the pathname:

file dirname /a/b/c

=> /a/b
file tail /a/b/c
=> c

For a pathname with a single component, the dirname operation returns "." on UNIX and Windows, or
":" on Macintosh. This is the name of the current directory.
The extension and rootname operations are also complementary. The extension operation returns everything
from the last period in the name to the end (i.e., the file suffix including the period). The rootname
operation returns everything up to, but not including, the last period in the pathname:

file rootname /a/b.c
=> /a/b
file extension /a/b.c
=> .c
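
The pathname operations combine naturally. As a small sketch (SwapExtension and PathParts are hypothetical helper names, and the results in the comments assume UNIX-style pathnames), one procedure replaces a file suffix and the other breaks a pathname into its directory, base name, and suffix:

```tcl
# SwapExtension is a hypothetical helper: it replaces the suffix of
# the last pathname component and leaves the directory part alone.
proc SwapExtension {path newExt} {
    return [file rootname $path]$newExt
}

# PathParts is a hypothetical helper: it returns a three-element
# list of the directory, the base name, and the suffix.
proc PathParts {path} {
    set tail [file tail $path]
    return [list [file dirname $path] \
        [file rootname $tail] [file extension $tail]]
}

SwapExtension /a/b.c .o   ;# returns /a/b.o
PathParts /a/b.c          ;# returns the list: /a b .c
```

Neither procedure touches the file system, so the pathnames do not need to exist.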

Top
Practical Programming in Tcl & Tk, Third Edition
By Brent B. Welch

Table of Contents

Chapter 9. Working with Files and Programs

Manipulating Files and Directories

Tcl 7.6 added file operations to copy files, delete files, rename files, and create directories. In earlier
versions it was necessary to exec other programs to do these things, except on Macintosh, where cp,
rm, mv, mkdir, and rmdir were built in. These commands are no longer supported on the Macintosh.
Your scripts should use the file command operations described below to manipulate files in a
platform-independent way.
File name patterns are not directly supported by the file operations. Instead, you can use the glob
command described on page 115 to get a list of file names that match a pattern.

Copying Files
The file copy operation copies files and directories. The following example copies file1 to file2.
If file2 already exists, the operation raises an error unless the -force option is specified:

file copy ?-force? file1 file2

Several files can be copied into a destination directory. The names of the source files are preserved.
The -force option indicates that files under directory can be replaced:

file copy ?-force? file1 file2 ... directory

Directories can be recursively copied. The -force option indicates that files under dir2 can be
replaced:

file copy ?-force? dir1 dir2

Creating Directories
The file mkdir operation creates one or more directories:

file mkdir dir dir ...

It is not an error if the directory already exists. Furthermore, intermediate directories are created if
needed. This means that you can always make sure a directory exists with a single mkdir operation.
Suppose /tmp has no subdirectories at all. The following command creates /tmp/sub1 and
/tmp/sub1/sub2:

file mkdir /tmp/sub1/sub2

The -force option is not understood by file mkdir, so the following command accidentally creates
a folder named -force, as well as one named oops:

file mkdir -force oops

Deleting Files
The file delete operation deletes files and directories. It is not an error if the files do not exist. A
non-empty directory is not deleted unless the -force option is specified, in which case it is recursively
deleted:

file delete ?-force? name name ...

To delete a file or directory named -force, you must specify a nonexistent file before the -force to
prevent it from being interpreted as a flag (-force -force won't work):

file delete xyzzy -force

Renaming Files and Directories

The file rename operation changes a file's name from old to new. The -force option causes new to
be replaced if it already exists.

file rename ?-force? old new

Using file rename is the best way to update an existing file. First, generate the new version of the file
in a temporary file. Then, use file rename to replace the old version with the new version. This
ensures that any other programs that access the file will not see the new version until it is complete.
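
The update pattern just described can be sketched as follows. UpdateFile is a hypothetical helper, and the temporary file name (the target name plus the process ID) is only an assumed naming scheme. Keeping the temporary file in the same directory as the target is a deliberate choice, because a rename within one file system is typically a single step on UNIX.

```tcl
# UpdateFile is a hypothetical helper. It writes the new contents to
# a temporary file next to the target, then renames the temporary
# file over the target so readers never see a half-written file.
proc UpdateFile {path contents} {
    set tmp $path.[pid].tmp   ;# assumed naming scheme
    set out [open $tmp w]
    if {[catch {
        puts -nonewline $out $contents
        close $out
    } err]} {
        # Clean up the partial temporary file and report the error.
        catch {close $out}
        file delete $tmp
        error $err
    }
    file rename -force $tmp $path
}
```

A call such as UpdateFile /tmp/config "new contents\n" leaves the old version in place until the new one is completely written.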

File Attributes
There are several file operations that return specific file attributes: atime, executable, exists,
isdirectory, isfile, mtime, owned, readable, readlink, size and type. Refer to Table 9-2 on page
102 for their function. The following command uses file mtime to compare the modify times of two
files. If you have ever resorted to piping the results of ls -l into awk in order to derive this information
in other shell scripts, you will appreciate this example:

Example 9-2 Comparing file modify times.

proc newer { file1 file2 } {
    if {![file exists $file2]} {
        return 1
    } else {
        # Assume file1 exists
        expr {[file mtime $file1] > [file mtime $file2]}
    }
}

The stat and lstat operations return a collection of file attributes. They take a third argument that is
the name of an array variable, and they initialize that array with elements that contain the file
attributes. If the file is a symbolic link, then the lstat operation returns information about the link
itself and the stat operation returns information about the target of the link. The array elements are
listed in Table 9-3. All the element values are decimal strings, except for type, which can have the
values returned by the type option. The element names are based on the UNIX stat system call. Use
the file attributes command described later to get other platform-specific attributes:

Table 9-3. Array elements defined by file stat.

atime The last access time, in seconds.
ctime The last change time (not the create time), in seconds.
dev The device identifier, an integer.
gid The group owner, an integer.
ino The file number (i.e., inode number), an integer.
mode The permission bits.
mtime The last modify time, in seconds.
nlink The number of links, or directory references, to the file.
size The number of bytes in the file.
type file, directory, characterSpecial, blockSpecial, fifo, link, or socket.
uid The owner's user ID, an integer.
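
As a brief illustration of how the array is used, the sketch below formats a few of the elements from Table 9-3. DescribeFile is a hypothetical helper name, and the date format is an arbitrary choice.

```tcl
# DescribeFile is a hypothetical helper that formats a few of the
# elements that file stat stores in the named array.
proc DescribeFile {path} {
    file stat $path info
    return [format "%s: type=%s size=%d bytes, modified %s" \
        $path $info(type) $info(size) \
        [clock format $info(mtime) -format "%Y-%m-%d %H:%M:%S"]]
}
```

Substituting file lstat for file stat would describe a symbolic link itself rather than its target.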

Example 9-3 uses the device (dev) and inode (ino) attributes of a file to determine whether two
pathnames reference the same file. The attributes are UNIX specific; they are not well defined on
Windows and Macintosh.

Example 9-3 Determining whether pathnames reference the same file.

proc fileeq { path1 path2 } {
    file stat $path1 stat1
    file stat $path2 stat2
    expr {$stat1(ino) == $stat2(ino) && \
          $stat1(dev) == $stat2(dev)}
}

The file attributes operation was added in Tcl 8.0 to provide access to platform-specific
attributes. The attributes operation lets you set and query attributes. The interface uses option-value
pairs. With no options, all the current values are returned.

file attributes book.doc

=> -creator FRAM -hidden 0 -readonly 0 -type MAKR

These Macintosh attributes are explained in Table 9-4. The four-character type codes used on
Macintosh are illustrated on page 516. With a single option, only that value is returned:

file attributes book.doc -readonly

=> 0

The attributes are modified by specifying one or more option–value pairs. Setting attributes can raise
an error if you do not have the right permissions:

file attributes book.doc -readonly 1 -hidden 0

Table 9-4. Platform-specific file attributes.

-permissions mode File permission bits. mode is a number with bits defined by the chmod system
call. (UNIX)
-group ID The group owner of the file. (UNIX)
-owner ID The owner of the file. (UNIX)
-archive bool The archive bit, which is set by backup programs. (Windows)
-hidden bool If set, then the file does not appear in listings. (Windows, Macintosh)
-readonly bool If set, then you cannot write the file. (Windows, Macintosh)
-system bool If set, then you cannot remove the file. (Windows)
-creator type type is 4-character code of creating application. (Macintosh)
-type type type is 4-character type code. (Macintosh)
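
Because the set of attributes differs by platform, a script that runs in more than one place may want to check whether an attribute exists before relying on it. A minimal sketch, assuming a hypothetical helper named GetAttribute, loads the option-value pairs into an array and falls back to a default:

```tcl
# GetAttribute is a hypothetical helper. It returns the value of one
# attribute, or the supplied default if this platform does not
# define that attribute for the file.
proc GetAttribute {path option {default ""}} {
    array set attrs [file attributes $path]
    if {[info exists attrs($option)]} {
        return $attrs($option)
    }
    return $default
}
```

For example, GetAttribute book.doc -readonly 0 returns the read-only flag on Windows and Macintosh, and 0 on a platform that has no such attribute.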

Input/Output Command Summary

The following sections describe how to open, read, and write files. The basic model is that you open a
file, read or write it, then close the file. Network sockets also use the commands described here.
Socket programming is discussed in Chapter 17, and more advanced event-driven I/O is described in
Chapter 16. Table 9-5 lists the basic commands associated with file I/O:

Table 9-5. Tcl commands used for file access.

open what ?access? ?permissions? Returns channel ID for a file or pipeline.
puts ?-nonewline? ?channel? string Writes a string.
gets channel ?varname? Reads a line.
read channel ?numBytes? Reads numBytes bytes, or all data.
read -nonewline channel Reads all bytes and discards the last \n.
tell channel Returns the seek offset.
seek channel offset ?origin? Sets the seek offset. origin is one of start, current, or
end.
eof channel Queries end-of-file status.
flush channel Writes buffers of a channel.
close channel Closes an I/O channel.

Opening Files for I/O

The open command sets up an I/O channel to either a file or a pipeline of processes. The return value
of open is an identifier for the I/O channel. Store the result of open in a variable and use the variable as
you used the stdout, stdin, and stderr identifiers in the examples so far. The basic syntax is:

open what ?access? ?permissions?

The what argument is either a file name or a pipeline specification similar to that used by the exec
command. The access argument can take two forms, either a short character sequence that is
compatible with the fopen library routine, or a list of POSIX access flags. Table 9-6 summarizes the
first form, while Table 9-7 summarizes the POSIX flags. If access is not specified, it defaults to read.

Example 9-4 Opening a file for writing.

set fileId [open /tmp/foo w 0600]

puts $fileId "Hello, foo!"
close $fileId

Table 9-6. Summary of the open access arguments.

r Opens for reading. The file must exist.

r+ Opens for reading and writing. The file must exist.
w Opens for writing. Truncate if it exists. Create if it does not exist.
w+ Opens for reading and writing. Truncate or create.
a Opens for writing. Data is appended to the file.
a+ Opens for reading and writing. Data is appended.
Table 9-7. Summary of POSIX flags for the access argument.

RDONLY Opens for reading.

WRONLY Opens for writing.
RDWR Opens for reading and writing.
APPEND Opens for append.
CREAT Creates the file if it does not exist.
EXCL If CREAT is also specified, then the file cannot already exist.
NOCTTY Prevents terminal devices from becoming the controlling terminal.
NONBLOCK Does not block during the open.
TRUNC Truncates the file if it exists.

The permissions argument is a value used for the permission bits on a newly created file. UNIX uses
three bits each for the owner, group, and everyone else. The bits specify read, write, and execute
permission. These bits are usually specified with an octal number, which has a leading zero, so that
there is one octal digit for each set of bits. The default permission bits are 0666, which grant read/write
access to everybody. Example 9-4 specifies 0600 so that the file is readable and writable only by the
owner. 0775 would grant read, write, and execute permissions to the owner and group, and read and
execute permissions to everyone else. You can set other special properties with additional high-order
bits. Consult the UNIX manual page on the chmod command for more details.
The following example illustrates how to use a list of POSIX access flags to open a file for reading
and writing, creating it if needed, and not truncating it. This is something you cannot do with the
simpler form of the access argument:

set fileId [open /tmp/bar {RDWR CREAT}]
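
Another combination that the short access strings cannot express is exclusive creation. In the sketch below, CreateExclusive is a hypothetical helper: with CREAT and EXCL together, open fails if the file already exists, which is one way to create a simple lock file.

```tcl
# CreateExclusive is a hypothetical helper. It returns a channel
# open for writing if it created the file, or an empty string if
# the file already exists or cannot be created.
proc CreateExclusive {path} {
    if {[catch {open $path {WRONLY CREAT EXCL}} result]} {
        return ""
    }
    return $result
}
```

The caller is expected to close the returned channel, and to delete the file when the lock is no longer needed.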

Catch errors from open.

In general, you should check for errors when opening files. The following example illustrates a catch
phrase used to open files. Recall that catch returns 1 if it catches an error; otherwise, it returns zero. It
treats its second argument as the name of a variable. In the error case, it puts the error message into the
variable. In the normal case, it puts the result of the command into the variable:

Example 9-5 A more careful use of open.

if {[catch {open /tmp/data r} fileId]} {
    puts stderr "Cannot open /tmp/data: $fileId"
} else {
    # Read and process the file, then...
    close $fileId
}

Opening a Process Pipeline

You can open a process pipeline by specifying the pipe character, |, as the first character of the first
argument. The remainder of the pipeline specification is interpreted just as with the exec command,
including input and output redirection. The second argument determines which end of the pipeline
open returns. The following example runs the UNIX sort program on the password file, and it uses the
split command to separate the output lines into list elements:

Example 9-6 Opening a process pipeline.

set input [open "|sort /etc/passwd" r]

set contents [split [read $input] \n]
close $input

You can open a pipeline for both read and write by specifying the r+ access mode. In this case, you
need to worry about buffering. After a puts, the data may still be in a buffer in the Tcl library. Use the
flush command to force the data out to the spawned processes before you try to read any output from
the pipeline. You can also use the fconfigure command described on page 223 to force line buffering.
Remember that read-write pipes will not work at all with Windows 3.1 because pipes are simulated
with files. Event-driven I/O is also very useful with pipes. It means you can do other processing while
the pipeline executes, and simply respond when the pipe generates data. This is described in Chapter
16.
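
The buffering issue with read-write pipelines can be seen in a short sketch. It assumes a UNIX system with the cat program available, which simply echoes its input; without the flush, the gets would wait forever because the line would still be sitting in a Tcl buffer.

```tcl
# Assumes a UNIX system with the cat program on the PATH.
set pipe [open "|cat" r+]
puts $pipe "Hello, pipeline"
flush $pipe               ;# push the line out of Tcl's buffer to cat
set reply [gets $pipe]    ;# cat echoes the line back
close $pipe
```

This works with cat because it writes each chunk of input as soon as it reads it. A program that buffers its own output may not respond until its input is closed.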

Expect
If you are trying to do sophisticated things with an external application, you will find that the Expect
extension provides a much more powerful interface than a process pipeline. Expect adds Tcl
commands that are used to control interactive applications. It is extremely useful for automating FTP,
Telnet, and programs under test. It comes as a Tcl shell named expect, and it is also an extension that
you can dynamically load into other Tcl shells. It was created by Don Libes at the National Institute of
Standards and Technology (NIST). Expect is described in Exploring Expect (Libes, O'Reilly &
Associates, Inc., 1995). You can find the software on the CD and on the web at:
http://expect.nist.gov/

Reading and Writing

The standard I/O channels are already open for you. There is a standard input channel, a standard
output channel, and a standard error output channel. These channels are identified by stdin, stdout,
and stderr, respectively. Other I/O channels are returned by the open command, and by the socket
command described on page 228.
There may be cases when the standard I/O channels are not available. Windows has no standard error
channel. Some UNIX window managers close the standard I/O channels when you start programs from
window manager menus. You can also close the standard I/O channels with close.

The puts and gets Commands

The puts command writes a string and a newline to the output channel. There are a couple of details
about the puts command that we have not yet used. It takes a -nonewline argument that prevents the
newline character that is normally appended to the output channel. This is used in the prompt example
below. The second feature is that the channel identifier is optional, defaulting to stdout if not
specified. Note that you must use flush to force output of a partial line. This is illustrated in Example
9-7.

Example 9-7 Prompting for input.

puts -nonewline "Enter value: "

flush stdout ;# Necessary to get partial line output
set answer [gets stdin]

The gets command reads a line of input, and it has two forms. In the previous example, with just a
single argument, gets returns the line read from the specified I/O channel. It discards the trailing
newline from the return value. If end of file is reached, an empty string is returned. You must use the
eof command to tell the difference between a blank line and end-of-file. eof returns 1 if there is end
of file. Given a second varName argument, gets stores the line into a named variable and returns the
number of bytes read. It discards the trailing newline, which is not counted. A -1 is returned if the
channel has reached the end of file.

Example 9-8 A read loop using gets.

while {[gets $channel line] >= 0} {

# Process line
}
close $channel
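
For comparison, here is a sketch of a loop built on the one-argument form of gets, which needs eof to tell a blank line from the end of the file. CountBlankLines is a hypothetical helper name.

```tcl
# CountBlankLines is a hypothetical helper that uses the
# one-argument form of gets, so it must call eof to tell a blank
# line from end of file.
proc CountBlankLines {channel} {
    set blank 0
    while {1} {
        set line [gets $channel]
        if {[eof $channel]} {
            # End of file. Any unterminated last line returned here
            # is not empty, so it cannot be a blank line.
            break
        }
        if {[string length $line] == 0} {
            incr blank
        }
    }
    return $blank
}
```

The two-argument form in Example 9-8 is usually more convenient because a single test covers both the data and the end-of-file condition.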

The read Command

The read command reads blocks of data, and this capability is often more efficient. There are two
forms for read: You can specify the -nonewline argument or the numBytes argument, but not both.
Without numBytes, the whole file (or what is left in the I/O channel) is read and returned. The
-nonewline argument causes the trailing newline to be discarded. Given a byte count argument, read
returns that amount, or less if there is not enough data in the channel. The trailing newline is not
discarded in this case.

Example 9-9 A read loop using read and split.

foreach line [split [read $channel] \n] {

# Process line
}
close $channel

For moderate-sized files, it is about 10 percent faster to loop over the lines in a file using the read loop
in the second example. In this case, read returns the whole file, and split chops the file into list
elements, one for each line. For small files (less than 1K) it doesn't really matter. For large files
(megabytes) you might induce paging with this approach.
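
One detail of the read and split loop is worth a short sketch: when the file ends with a newline, split produces an extra empty element at the end of the list, and read -nonewline avoids it. The file name below is an assumption made only for the demonstration.

```tcl
# Assumed demo file name; any writable path would do.
set path [file join [pwd] lines_demo_[pid].txt]
set out [open $path w]
puts $out "first"
puts $out "second"
close $out

set in [open $path]
set withExtra [split [read $in] \n]          ;# first second {}
seek $in 0 start
set exact [split [read -nonewline $in] \n]   ;# first second
close $in
file delete $path
```

If the loop body cannot tolerate an empty last line, the -nonewline form is the simpler fix.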

Platform-Specific End of Line Characters

Tcl automatically detects different end of line conventions. On UNIX, text lines are ended with a
newline character (\n). On Macintosh, they are terminated with a carriage return (\r). On Windows,
they are terminated with a carriage return, newline sequence (\r\n). Tcl accepts any of these, and the
line terminator can even change within a file. All these different conventions are converted to the
UNIX style so that once read, text lines are always terminated with a newline character (\n). Both the
read and gets commands do this conversion.

During output, text lines are generated in the platform-native format. The automatic handling of line
formats means that it is easy to convert a file to native format. You just need to read it in and write it
out:

puts -nonewline $out [read $in]

To suppress conversions, use the fconfigure command, which is described in more detail on page
223.
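
As a minimal sketch of suppressing the conversions, the procedure below puts both channels into binary translation mode before copying, so the bytes in the destination match the source exactly. CopyExact is a hypothetical helper name.

```tcl
# CopyExact is a hypothetical helper. With -translation binary on
# both channels, the data is copied with no end-of-line conversion.
proc CopyExact {src dest} {
    set in [open $src]
    set out [open $dest w]
    fconfigure $in -translation binary
    fconfigure $out -translation binary
    puts -nonewline $out [read $in]
    close $out
    close $in
}
```

This is the choice to make for data such as images or archives, where a converted byte would corrupt the file.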
Example 9-10 demonstrates a File_Copy procedure that translates files to native format. It is
complicated because it handles directories:

Example 9-10 Copy a file and translate to native format.

proc File_Copy {src dest} {
    if {[file isdirectory $src]} {
        file mkdir $dest
        foreach f [glob -nocomplain [file join $src *]] {
            File_Copy $f [file join $dest [file tail $f]]
        }
        return
    }
    if {[file isdirectory $dest]} {
        set dest [file join $dest [file tail $src]]
    }
    set in [open $src]
    set out [open $dest w]
    puts -nonewline $out [read $in]
    close $out ; close $in
}

Random Access I/O

The seek and tell commands provide random access to I/O channels. Each channel has a current
position called the seek offset. Each read or write operation updates the seek offset by the number of
bytes transferred. The current value of the offset is returned by the tell command. The seek
command sets the seek offset by an amount, which can be positive or negative, from an origin which is
either start, current, or end.
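
A short sketch shows seek and tell working together. LastBytes is a hypothetical helper that returns the final bytes of a file; it uses binary translation so that offsets correspond exactly to bytes in the file.

```tcl
# LastBytes is a hypothetical helper that returns up to count bytes
# from the end of a file.
proc LastBytes {path count} {
    set in [open $path]
    fconfigure $in -translation binary
    seek $in 0 end
    set size [tell $in]      ;# the offset at the end is the file size
    if {$count > $size} {
        set count $size
    }
    seek $in -$count end     ;# a negative offset counts back from the end
    set data [read $in $count]
    close $in
    return $data
}
```

Seeking relative to end avoids reading the whole file just to reach its last few bytes.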

Closing I/O channels

The close command is just as important as the others because it frees operating system resources
associated with the I/O channel. If you forget to close a channel, it will be closed when your process
exits. However, if you have a long-running program, like a Tk script, you might exhaust some
operating system resources if you forget to close your I/O channels.

The close command can raise an error.

If the channel was a process pipeline and any of the processes wrote to their standard error channel,
then Tcl believes this is an error. The error is raised when the channel to the pipeline is finally closed.
Similarly, if any of the processes in the pipeline exit with a nonzero status, close raises an error.
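
The following sketch shows one way to handle an error from close on a pipeline. It assumes a UNIX system with sh available; the shell command is made up for the demonstration and exits with status 3 after producing some output.

```tcl
# Assumes a UNIX system with sh on the PATH.
set pipe [open "|sh -c {echo partial; exit 3}" r]
set output [read $pipe]
set code {}
if {[catch {close $pipe} err]} {
    # errorCode is a list such as: CHILDSTATUS pid exitStatus
    set code $errorCode
    puts stderr "pipeline failed: $err ($code)"
}
```

The output that was read before the close is still valid, so a script can decide for itself whether a nonzero exit status is really a failure.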

The Current Directory: cd and pwd

Every process has a current directory that is used as the starting point when resolving a relative
pathname. The pwd command returns the current directory, and the cd command changes the current
directory. Example 9-11 uses these commands.
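
A common pattern with these two commands is to save the current directory, work somewhere else, and then return. The sketch below wraps that pattern in a hypothetical helper named InDirectory. It relies on catch and on the uplevel command, and it is simplified in that it only propagates ordinary errors from the script.

```tcl
# InDirectory is a hypothetical helper: it evaluates a script with
# dir as the current directory, then restores the previous current
# directory even if the script raises an error.
proc InDirectory {dir script} {
    set saved [pwd]
    cd $dir
    set code [catch {uplevel 1 $script} result]
    cd $saved
    if {$code == 1} {
        error $result
    }
    return $result
}
```

For example, InDirectory /tmp {glob -nocomplain *} lists the files in /tmp without changing the caller's current directory.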

Matching File Names with glob

The glob command expands a pattern into the set of matching file names. The general form of the
glob command is:

glob ?flags? pattern ?pattern? ...

The pattern syntax is similar to the string match patterns:

* matches zero or more characters.

? matches a single character.
[abc] matches a set of characters.
{a,b,c} matches any of a, b, or c.
All other characters must match themselves.
The -nocomplain flag causes glob to return an empty list if no files match the pattern. Otherwise,
glob raises an error if no files match.

The -- flag must be used if the pattern begins with a -.

Unlike the glob matching in csh, the Tcl glob command matches only the names of existing files. In
csh, the {a,b} construct can match nonexistent names. In addition, the results of glob are not sorted.
Use the lsort command to sort its result if you find it important.
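
As a small sketch of these points, the hypothetical helper SortedGlob combines -nocomplain, the -- flag, and lsort. It assumes that dir itself contains no glob special characters.

```tcl
# SortedGlob is a hypothetical helper that returns the names in dir
# that match pattern, in sorted order, or an empty list if nothing
# matches.
proc SortedGlob {dir pattern} {
    return [lsort [glob -nocomplain -- [file join $dir $pattern]]]
}
```

A call such as SortedGlob /usr/local/lib *.tcl returns full pathnames, because the directory is part of the pattern given to glob.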
Example 9-11 shows the FindFile procedure, which traverses the file system hierarchy using
recursion. At each iteration it saves its current directory and then attempts to change to the next
subdirectory. A catch guards against bogus names. The glob command matches file names:

Example 9-11 Finding a file by name.

proc FindFile { startDir namePat } {
    set pwd [pwd]
    if {[catch {cd $startDir} err]} {
        puts stderr $err
        return
    }
    foreach match [glob -nocomplain -- $namePat] {
        puts stdout [file join $startDir $match]
    }
    foreach file [glob -nocomplain *] {
        if {[file isdirectory $file]} {
            FindFile [file join $startDir $file] $namePat
        }
    }
    cd $pwd
}

Expanding Tilde in File Names

The glob command also expands a leading tilde (~) in filenames. There are two cases:

~/ expands to the current user's home directory.

~user expands to the home directory of user.
If you have a file that starts with a literal tilde, you can avoid the tilde expansion by adding a leading
./ (e.g., ./~foobar).

The exit and pid Commands

The exit command terminates your script. Note that exit causes termination of the whole process
that was running the script. If you supply an integer-valued argument to exit, then that becomes the
exit status of the process.
The pid command returns the process ID of the current process. This can be useful as the seed for a
random number generator because it changes each time you run your script. It is also common to
embed the process ID in the name of temporary files.
You can also find out the process IDs associated with a process pipeline with pid:

set pipe [open "|command"]

set pids [pid $pipe]

There is no built-in mechanism to control processes in Tcl. On UNIX systems you can exec the kill
program to terminate a process:

exec kill $pid

Environment Variables
Environment variables are a collection of string-valued variables associated with each process. The
process's environment variables are available through the global array env. The name of the
environment variable is the index, (e.g., env(PATH)), and the array element contains the current value
of the environment variable. If assignments are made to env, they result in changes to the
corresponding environment variable. Environment variables are inherited by child processes, so
programs run with the exec command inherit the environment of the Tcl script. The following
example prints the values of environment variables.

Example 9-12 Printing environment variable values.

proc printenv { args } {
    global env
    set maxl 0
    if {[llength $args] == 0} {
        set args [lsort [array names env]]
    }
    foreach x $args {
        if {[string length $x] > $maxl} {
            set maxl [string length $x]
        }
    }
    incr maxl 2
    foreach x $args {
        puts stdout [format "%*s = %s" $maxl $x $env($x)]
    }
}
printenv USER SHELL TERM
=>
   USER = welch
  SHELL = /bin/csh
   TERM = tx

Note: Environment variables can be initialized for Macintosh applications by editing a resource of type
STR# whose name is Tcl Environment Variables. This resource is part of the tclsh and wish
applications. Follow the directions on page 28 for using ResEdit. The format of the resource values is
NAME=VALUE.
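
The inheritance of environment variables by child processes can be seen in a short sketch. It assumes a UNIX system with sh available, and GREETING is a made-up variable name used only for this demonstration.

```tcl
# Assumes a UNIX system with sh on the PATH. GREETING is a made-up
# variable name used only for this demonstration.
set ::env(GREETING) "hello from Tcl"
set seen [exec sh -c {echo $GREETING}]   ;# the child sees the new value
unset ::env(GREETING)                    ;# removes it from the environment
```

The braces around the shell command keep Tcl from substituting $GREETING itself, so the expansion is done by the child process.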

The registry Command

Windows uses the registry to store various system configuration information. The Windows tool to
browse and edit the registry is called regedit. Tcl provides a registry command. It is a loadable
package that you must load by using:

package require registry

The registry structure has keys, value names, and typed data. The value names are stored under a key,
and each value name has data associated with it. The keys are organized into a hierarchical naming
system, so another way to think of the value names is as an extra level in the hierarchy. The main point
is that you need to specify both a key name and a value name in order to get something out of the
registry. The key names have one of the following formats:

\\hostname\rootname\keypath
rootname\keypath
rootname

The rootname is one of HKEY_LOCAL_MACHINE, HKEY_PERFORMANCE_DATA, HKEY_USERS,
HKEY_CLASSES_ROOT, HKEY_CURRENT_USER, HKEY_CURRENT_CONFIG, or HKEY_DYN_DATA. Tables 9-8
and 9-9 summarize the registry command and data types:

Table 9-8. The registry command.

registry delete key ?valueName? Deletes the key and the named value, or it deletes all values under
the key if valueName is not specified.
registry get key valueName Returns the value associated with valueName under key.
registry keys key ?pat? Returns the list of keys or value names under key that match pat,
which is a string match pattern.
registry set key Creates key.
registry set key valueName data ?type? Creates valueName under key with value data of the given type.
Types are listed in Table 9-9.
registry type key valueName Returns the type of valueName under key.
registry values key ?pat? Returns the names of the values stored under key that match pat,
which is a string match pattern.

Table 9-9. The registry data types.

binary Arbitrary binary data.

none Arbitrary binary data.
expand_sz A string that contains references to environment variables with the %VARNAME%
syntax.
dword A 32-bit integer.
dword_big_endian A 32-bit integer in the other byte order. It is represented in Tcl as a decimal
string.
link A symbolic link.
multi_sz An array of strings, which are represented as a Tcl list.
resource_list A device driver resource list.
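
A script that must also run on UNIX or Macintosh should check the platform before it loads the registry package. The following is a sketch under that assumption. ProgramFilesDir is a hypothetical helper name, and it assumes the usual Windows key and value name for the Program Files directory are present.

```tcl
# ProgramFilesDir is a hypothetical helper. On Windows it looks up
# one value in the registry. On other platforms, or if the lookup
# fails, it returns an empty string.
proc ProgramFilesDir {} {
    global tcl_platform
    if {![string equal $tcl_platform(platform) windows]} {
        return ""
    }
    package require registry
    set key {HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion}
    if {[catch {registry get $key ProgramFilesDir} dir]} {
        return ""
    }
    return $dir
}
```

The key is written in braces so that the backslashes in the key path are passed to registry get unchanged.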

Part II: Advanced Tcl

Part II describes advanced programming techniques that support sophisticated applications. The
Tcl interfaces remain simple, so you can quickly construct powerful applications.
Chapter 10 describes eval, which lets you create Tcl programs on the fly. There are tricks with
using eval correctly, and a few rules of thumb to make your life easier.
Chapter 11 describes regular expressions. This is the most powerful string processing facility in
Tcl. This chapter includes a cookbook of useful regular expressions.
Chapter 12 describes the library and package facility used to organize your code into reusable
modules.
Chapter 13 describes introspection and debugging. Introspection provides information about the
state of the Tcl interpreter.
Chapter 14 describes namespaces that partition the global scope for variables and procedures.
Namespaces help you structure large Tcl applications.
Chapter 15 describes the features that support Internationalization, including Unicode, other
character set encodings, and message catalogs.
Chapter 16 describes event-driven I/O programming. This lets you run process pipelines in the
background. It is also very useful with network socket programming, which is the topic of
Chapter 17.
Chapter 18 describes TclHttpd, a Web server built entirely in Tcl. You can build applications on
top of TclHttpd, or integrate the server into existing applications to give them a web interface.
TclHttpd also supports regular Web sites.
Chapter 19 describes Safe-Tcl and using multiple Tcl interpreters. You can create multiple Tcl
interpreters for your application. If an interpreter is safe, then you can grant it restricted
functionality. This is ideal for supporting network applets that are downloaded from untrusted
sites.

Chapter 10. Quoting Issues and Eval

This chapter describes explicit calls to the interpreter with the eval command. An extra round of
substitutions is performed that results in some useful effects. The chapter describes the quoting
problems with eval and the ways to avoid them. The uplevel command evaluates commands in a
different scope. The subst command does substitutions but no command invocation.
Dynamic evaluation makes Tcl flexible and powerful, but it can be tricky to use properly. The basic
idea is that you create a string and then use the eval command to interpret that string as a command or
a series of commands. Creating program code on the fly is easy with an interpreted language like Tcl,
and very hard, if not impossible, with a statically compiled language like C++ or Java. There are
several ways that dynamic code evaluation is used in Tcl:

In some cases, a simple procedure isn't quite good enough, and you need to glue together a
command from a few different pieces and then execute the result using eval. This often occurs
with wrappers, which provide a thin layer of functionality over existing commands.
Callbacks are script fragments that are saved and evaluated later in response to some event.
Examples include the commands associated with Tk buttons, fileevent I/O handlers, and after
timer handlers. Callbacks are a flexible way to link different parts of an application together.
You can add new control structures to Tcl using the uplevel command. For example, you can
write a function that applies a command to each line in a file or each node in a tree.
You can have a mixture of code and data, and just process the code part with the subst
command. For example, this is useful in HTML templates described in Chapter 18. There are
also some powerful combinations of subst and regsub described in Chapter 11.

Constructing Code with the list Command

It can be tricky to assemble a command so that it is evaluated properly by eval. The same difficulties
apply to commands like after, uplevel, and the Tk send command, all of which have similar
properties to eval, except that the command evaluation occurs later or in a different context.
Constructing commands dynamically is a source of many problems. The worst part is that you can
write code that works sometimes but not others, which can be very confusing.

Use list when constructing commands.

The root of the quoting problems is the internal use of concat by eval and similar commands to
concatenate their arguments into one command string. The concat can lose some important list
structure so that arguments are not passed through as you expect. The general strategy to avoid these
problems is to use list and lappend to explicitly form the command callback as a single, well-
structured list.

The eval Command

The eval command results in another call to the Tcl interpreter. If you construct a command
dynamically, you must use eval to interpret it. For example, suppose we want to construct the
following command now but execute it later:

puts stdout "Hello, World!"

In this case, it is sufficient to do the following:

set cmd {puts stdout "Hello, World!"}
=> puts stdout "Hello, World!"
# sometime later...
eval $cmd
=> Hello, World!

In this case, the value of cmd is passed to Tcl. All the standard grouping and substitution are done
again on the value, which is a puts command.
However, suppose that part of the command is stored in a variable, but that variable will not be
defined at the time eval is used. We can artificially create this situation like this:

set string "Hello, World!"

set cmd {puts stdout $string}
=> puts stdout $string
unset string
eval $cmd
=> can't read "string": no such variable

In this case, the command contains $string. When this is processed by eval, the interpreter looks for
the current value of string, which is undefined. This example is contrived, but the same problem
occurs if string is a local variable, and cmd will be evaluated later in the global scope.
A common mistake is to use double quotes to group the command. That will let $string be
substituted now. However, this works only if string has a simple value, but it fails if the value of
string contains spaces or other Tcl special characters:

set cmd "puts stdout $string"

=> puts stdout Hello, World!
eval $cmd
=> bad argument "World!": should be "nonewline"

The problem is that we have lost some important structure. The identity of $string as a single
argument gets lost in the second round of parsing by eval. The solution to this problem is to construct
the command using list, as shown in the following example:

Example 10-1 Using list to construct commands.

set string "Hello, World!"

set cmd [list puts stdout $string]
=> puts stdout {Hello, World!}
unset string
eval $cmd
=> Hello, World!

The trick is that list has formed a list containing three elements: puts, stdout, and the value of
string. The substitution of $string occurs before list is called, and list takes care of grouping that
value for us. In contrast, using double quotes is equivalent to:

set cmd [concat puts stdout $string]

Double quotes lose list structure.

The problem here is that concat does not preserve list structure. The main lesson is that you should
use list to construct commands if they contain variable values or command results that must be
substituted now. If you use double quotes, the values are substituted but you lose proper command
structure. If you use curly braces, then values are not substituted until later, which may not be in the
right context.
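The difference is easy to check by counting words. The following is a minimal sketch (the variable names quoted and listed are only for illustration) that builds the same command both ways and asks llength how many elements each form contains:

```tcl
set string "Hello, World!"

# Double quotes paste the value in as text, so its space splits it in two.
set quoted "puts stdout $string"

# list keeps the value as one element, adding braces where needed.
set listed [list puts stdout $string]

puts [llength $quoted]   ;# 4 elements
puts [llength $listed]   ;# 3 elements
```

Only the second form still has the three-word shape that puts expects.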

Commands That Concatenate Their Arguments

The uplevel, after and send commands concatenate their arguments into a command and execute it
later in a different context. The uplevel command is described on page 130, after is described on
page 218, and send is described on page 560. Whenever I discover such a command, I put it on my
danger list and make sure I explicitly form a single command argument with list instead of letting the
command concat items for me. Get in the habit now:

after 100 [list doCmd $param1 $param2]

send $interp [list doCmd $param1 $param2];# Safe!

The danger here is that concat and list can result in the same thing, so you can be led down the rosy
garden path only to get errors later when values change. The two previous examples always work. The
next two work only if param1 and param2 have values that are single list elements:

after 100 doCmd $param1 $param2

send $interp doCmd $param1 $param2;# Unsafe!

If you use other Tcl extensions that provide eval-like functionality, carefully check their
documentation to see whether they contain commands that concat their arguments into a command.
For example, Tcl-DP, which provides a network version of send, dp_send, also uses concat.
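Here is a small sketch of the safe form that you can run in tclsh. The doCmd procedure and the parameter values are made up for illustration; the second value is chosen to look like Tcl syntax so that any extra round of substitution would be visible:

```tcl
# doCmd is a stand-in callback that records its arguments.
proc doCmd {a b} {
    global done
    set done "a=<$a> b=<$b>"
}

set param1 "two words"
set param2 {$notAVariable [notACommand]}

# list quotes each value, so the timer script is a well-formed command.
after 10 [list doCmd $param1 $param2]

# Enter the event loop until the callback sets the global variable done.
vwait done
puts $done
```

The callback receives both values unchanged, even though one contains a space and the other contains a dollar sign and brackets.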

Commands That Use Callbacks

The general strategy of passing out a command or script to call later is a flexible way to assemble
different parts of an application, and it is widely used by Tcl commands. Examples include commands
that are called when users click on Tk buttons, commands that are called when I/O channels have data
ready, or commands that are called when clients connect to network servers. It is also easy to write
your own procedures or C extensions that accept scripts and call them later in response to some event.
These other callback situations may not appear to have the "concat problem" because they take a
single script argument. However, as soon as you use double quotes to group that argument, you have
created the concat problem all over again. So, all the caveats about using list to construct these
commands still apply.

Command Prefix Callbacks

There is a variation on command callbacks called a command prefix. In this case, the command is
given additional arguments when it is invoked. In other words, you provide only part of the command,
the command prefix, and the module that invokes the callback adds additional arguments before using
eval to invoke the command.

For example, when you create a network server, you supply a procedure that is called when a client
makes a connection. That procedure is called with three additional arguments that indicate the client's
socket, IP address, and port number. This is described in more detail on page 227. The tricky thing is
that you can define your callback procedure to take four (or more) arguments. In this case you specify
some of the parameters when you define the callback, and then the socket subsystem specifies the
remaining arguments when it makes the callback. The following command creates the server side of a
socket:

set virtualhost www.beedub.com

socket -server [list Accept $virtualhost] 8080

However, you define the Accept procedure like this:

proc Accept {myname sock ipaddr port} { ... }

The myname parameter is set when you construct the command prefix. The remaining parameters are
set when the callback is invoked. The use of list in this example is not strictly necessary because "we
know" that virtualhost will always be a single list element. However, using list is just a good habit
when forming callbacks, so I always write the code this way.
There are many other examples of callback arguments that are really command prefixes. Some of these
include the scrolling callbacks between Tk scrollbars and their widgets, the command aliases used
with Safe Tcl, the sorting functions in lsort, and the completion callback used with fcopy. Example
13-6 on page 181 shows how to use eval to make callbacks from Tcl procedures.
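The following sketch imitates what a module does when it invokes a command prefix. FakeConnect is a hypothetical stand-in for the socket subsystem, and the socket name, address, and port are invented values; the point is only to show how the extra arguments are appended as list elements:

```tcl
proc Accept {myname sock ipaddr port} {
    return "${myname}: client ${ipaddr}:${port} on ${sock}"
}

# FakeConnect is a hypothetical stand-in for the socket subsystem.
proc FakeConnect {prefix sock ipaddr port} {
    # Concatenate the prefix with a list of the extra arguments,
    # then evaluate the result at the global level.
    return [uplevel #0 $prefix [list $sock $ipaddr $port]]
}

set virtualhost www.beedub.com
puts [FakeConnect [list Accept $virtualhost] sock5 192.0.2.10 4321]
```

The caller fixes the first argument when it builds the prefix, and the invoking module supplies the other three.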

Constructing Procedures Dynamically

The previous examples have all focused on creating single commands by using list operations.
Suppose you want to create a whole procedure dynamically. Unfortunately, this can be particularly
awkward because a procedure body is not a simple list. Instead, it is a sequence of commands that are
each lists, but they are separated by newlines or semicolons. In turn, some of those commands may be
loops and if commands that have their own command bodies. To further compound the problem, you
typically have two kinds of variables in the procedure body: some that are to be used as values when
constructing the body, and some that are to be used later when executing the procedure. The result can
be very messy.
The main trick to this problem is to use either format or regsub to process a template for your
dynamically generated procedure. If you use format, then you can put %s into your templates where
you want to insert values. You may find the positional notation of the format string (e.g., %1$s and
%2$s) useful if you need to repeat a value in several places within your procedure body. The following
example is a procedure that generates a new version of other procedures. The new version includes
code that counts the number of times the procedure was called and measures the time it takes to run:

Example 10-2 Generating procedures dynamically with a template.

proc TraceGen {procName} {
    rename $procName $procName-orig
    set arglist {}
    foreach arg [info args $procName-orig] {
        append arglist "\$$arg "
    }
    proc $procName [info args $procName-orig] [format {
        global _trace_count _trace_msec
        incr _trace_count(%1$s)
        incr _trace_msec(%1$s) [lindex [time {
            set result [%1$s-orig %2$s]
        } 1] 0]
        return $result
    } $procName $arglist]
}

Suppose that we have a trivial procedure foo:

proc foo {x y} {
return [expr $x * $y]
}

If you run TraceGen on it and look at the results, you see this:

TraceGen foo
info body foo
=>
global _trace_count _trace_msec
incr _trace_count(foo)
incr _trace_msec(foo) [lindex [time {
set result [foo-orig $x $y]
} 1] 0]
return $result
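The positional notation is easier to see in isolation. This is a minimal sketch with a one-line template of my own (it is not part of TraceGen); %1$s is used twice and %2$s once:

```tcl
# %1$s is the first argument after the template, %2$s is the second.
set template {set result [%1$s-orig %2$s] ;# wraps %1$s}
puts [format $template foo {$x $y}]
```

Because the template is in braces, the dollar signs and brackets in it reach format untouched, and only the % fields are replaced.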

Exploiting the concat inside eval

The previous section warns about the danger of concatenation when forming commands. However,
there are times when concatenation is done for good reason. This section illustrates cases where the
concat done by eval is useful in assembling a command by concatenating multiple lists into one list.
A concat is done internally by eval when it gets more than one argument:

eval list1 list2 list3 ...

The effect of concat is to join all the lists into one list; a new level of list structure is not added. This
is useful if the lists are fragments of a command. It is common to use this form of eval with the args
construct in procedures. Use the args parameter to pass optional arguments through to another
command. Invoke the other command with eval, and the values in $args get concatenated onto the
command properly. The special args parameter is illustrated in Example 7-2 on page 82.
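As a quick sketch of the joining behavior, the following gives eval three lists that are each a fragment of an lsort command. The variable names are only for illustration:

```tcl
set cmd  {lsort}
set opts {-integer -decreasing}
set data [list {10 9 100}]   ;# a one-element list holding the values

# eval joins the three lists into: lsort -integer -decreasing {10 9 100}
puts [eval $cmd $opts $data]
```

Note that data is built with list so that the values to sort stay together as one argument after the concatenation.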

Using eval in a Wrapper Procedure.

Here, we illustrate the use of eval and $args with a simple Tk example. In Tk, the button command
creates a button in the user interface. The button command can take many arguments, and commonly
you simply specify the text of the button and the Tcl command that is executed when the user clicks on
the button:

button .foo -text Foo -command foo

After a button is created, it is made visible by packing it into the display. The pack command can also
take many arguments to control screen placement. Here, we just specify a side and let the packer take
care of the rest of the details:

pack .foo -side left

Even though there are only two Tcl commands to create a user interface button, we will write a
procedure that replaces the two commands with one. Our first version might be:

proc PackedButton {name txt cmd} {

button $name -text $txt -command $cmd
pack $name -side left
}

This is not a very flexible procedure. The main problem is that it hides the full power of the Tk button
command, which can really take about 20 widget configuration options, such as -background,
-cursor, -relief, and more. They are listed on page 391. For example, you can easily make a red
button like this:

button .foo -text Foo -command foo -background red

A better version of PackedButton uses args to pass through extra configuration options to the button
command. The args parameter is a list of all the extra arguments passed to the Tcl procedure. My first
attempt to use $args looked like this, but it was not correct:

proc PackedButton {name txt cmd args} {

button $name -text $txt -command $cmd $args
pack $name -side left
}
PackedButton .foo "Hello, World!" {exit} -background red
=> unknown option "-background red"

The problem is that $args is a list value, and button gets the whole list as a single argument. Instead,
button needs to get the elements of $args as individual arguments.

Use eval with $args

In this case, you can use eval because it concatenates its arguments to form a single list before
evaluation. The single list is, by definition, the same as a single Tcl command, so the button
command parses correctly. Here we give eval two lists, which it joins into one command:

eval {button $name -text $txt -command $cmd} $args

The use of the braces in this command is discussed in more detail below. We also generalize our
procedure to take some options to the pack command. This argument, pack, must be a list of packing
options. The final version of PackedButton is shown in Example 10-3:

Example 10-3 Using eval with $args.

# PackedButton creates and packs a button.
proc PackedButton {path txt cmd {pack {-side right}} args} {
    eval {button $path -text $txt -command $cmd} $args
    eval {pack $path} $pack
}

In PackedButton, both pack and args are list-valued parameters that are used as parts of a command.
The internal concat done by eval is perfect for this situation. The simplest call to PackedButton is:

PackedButton .new "New" { New }

The quotes and curly braces are redundant in this case but are retained to convey some type
information. The quotes imply a string label, and the braces imply a command. The pack argument
takes on its default value, and the args variable is an empty list. The two commands executed by
PackedButton are:

button .new -text New -command New

pack .new -side right

PackedButton creates a horizontal stack of buttons by default. The packing can be controlled with a
packing specification:

PackedButton .save "Save" { Save $file } {-side left}

The two commands executed by PackedButton are:

button .save -text Save -command { Save $file }

pack .save -side left

The remaining arguments, if any, are passed through to the button command. This lets the caller fine-
tune some of the button attributes:

PackedButton .quit Quit { Exit } {-side left -padx 5} \
    -background red

The two commands executed by PackedButton are:

button .quit -text Quit -command { Exit } -background red

pack .quit -side left -padx 5

You can see a difference between the pack and args argument in the call to PackedButton. You need
to group the packing options explicitly into a single argument. The args parameter is automatically
made into a list of all remaining arguments. In fact, if you group the extra button parameters, it will be
a mistake:

PackedButton .quit Quit { Exit } {-side left -padx 5} \
    {-background red}
=> unknown option "-background red"

Correct Quoting with eval

What about the peculiar placement of braces in PackedButton?

eval {button $path -text $txt -command $cmd} $args

By using braces, we control the number of times different parts of the command are seen by the Tcl
evaluator. Without any braces, everything goes through two rounds of substitution. The braces prevent
one of those rounds. In the above command, only $args is substituted twice. Before eval is called, the
$args is replaced with its list value. Then, eval is invoked, and it concatenates its two list arguments
into one list, which is now a properly formed command. The second round of substitutions done by
eval replaces the txt and cmd values.
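You can try the same pattern without Tk. SortNames below is a hypothetical wrapper around lsort, written only to show the two rounds of substitution; it is not part of the PackedButton example:

```tcl
# SortNames is a hypothetical wrapper that passes extra options to lsort.
proc SortNames {names args} {
    # {lsort} and {$names} are braced, so $names is substituted only
    # by eval, as a single word. $args is expanded into separate words.
    eval {lsort} $args {$names}
}

puts [SortNames {bob Alice carol}]
puts [SortNames {bob Alice carol} -dictionary -decreasing]
```

If $names were left unbraced, its three elements would be spliced into the command and lsort would treat the first two as options.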

Do not use double quotes with eval.

You may be tempted to use double quotes instead of curly braces in your uses of eval. Don't give in!
Using double quotes is, most likely, wrong. Suppose the first eval command is written like this:

eval "button $path -text $txt -command $cmd $args"

Incidentally, the previous is equivalent to:

eval button $path -text $txt -command $cmd $args

These versions happen to work with the following call because txt and cmd have one-word values
with no special characters in them:

PackedButton .quit Quit { Exit }

The button command that is ultimately evaluated is:

button .quit -text Quit -command { Exit }

In the next call, an error is raised:

PackedButton .save "Save As" [list Save $file]

=> unknown option "As"

This is because the button command is this:

button .save -text Save As -command Save /a/b/c

But it should look like this instead:

button .save -text {Save As} -command {Save /a/b/c}

The problem is that the structure of the button command is now wrong. The values of txt and cmd are
substituted first, before eval is even called, and then the whole command is parsed again. The worst
part is that sometimes using double quotes works, and sometimes it fails. The success of using double
quotes depends on the value of the parameters. When those values contain spaces or special characters,
the command gets parsed incorrectly.

Braces: the one true way to group arguments to eval.

To repeat, the safe construct is:

eval {button $path -text $txt -command $cmd} $args

The following variations are also correct. The first uses list to do quoting automatically, and the
others use backslashes or braces to prevent the extra round of substitutions:

eval [list button $path -text $txt -command $cmd] $args

eval button \$path -text \$txt -command \$cmd $args
eval button {$path} -text {$txt} -command {$cmd} $args

Finally, here is one more incorrect approach that tries to quote by hand:

eval "button {$path} -text {$txt} -command {$cmd} $args"

The problem above is that quoting is not always done with curly braces. If a value contains an
unmatched curly brace, Tcl would have used backslashes to quote it, and the above command would
raise an error:

set blob "foo\{bar space"

=> foo{bar space
eval "puts {$blob}"
=> missing close brace
eval puts {$blob}
=> foo{bar space
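For comparison, here is a short sketch of what list does with the same awkward value. The variable names cmd and copy are mine; the command being built is a plain set so the result can be checked:

```tcl
set blob "foo\{bar space"

# list picks backslash quoting because braces cannot protect this value.
set cmd [list set copy $blob]
puts $cmd

eval $cmd
puts [string equal $copy $blob]   ;# 1
```

The quoting that list chooses is not pretty to look at, but the value survives the extra round of parsing intact.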

The uplevel Command

The uplevel command is similar to eval, except that it evaluates a command in a different scope than
the current procedure. It is useful for defining new control structures entirely in Tcl. The syntax for
uplevel is:

uplevel ?level? command ?list1 list2 ...?

As with upvar, the level parameter is optional and defaults to 1, which means to execute the
command in the scope of the calling procedure. The other common use of level is #0, which means to
evaluate the command in the global scope. You can count up farther than one (e.g., 2 or 3), or count
down from the global level (e.g., #1 or #2), but these cases rarely make sense.
When you specify the command argument, you must be aware of any substitutions that might be
performed by the Tcl interpreter before uplevel is called. If you are entering the command directly,
protect it with curly braces so that substitutions occur in the other scope. The following affects the
variable x in the caller's scope:

uplevel {set x [expr $x + 1]}

However, the following will use the value of x in the current scope to define the value of x in the
calling scope, which is probably not what was intended:

uplevel "set x [expr $x + 1]"
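To see the braced form at work, here is a minimal sketch. Bump and Demo are hypothetical procedures written for this illustration:

```tcl
# Bump is a hypothetical procedure that increments x in its caller.
proc Bump {} {
    uplevel 1 {set x [expr {$x + 1}]}
}

proc Demo {} {
    set x 10
    Bump
    return $x
}

puts [Demo]   ;# 11
```

Bump has no variable named x of its own. With double quotes instead of braces, the $x would be substituted inside Bump and raise an error before uplevel was called.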

If you are constructing the command dynamically, again use list. This fragment is used later in
Example 10-4:

uplevel [list foreach $args $valueList {break}]

It is common to have the command in a variable. This is the case when the command has been passed
into your new control flow procedure as an argument. In this case, you should evaluate the command
one level up. Put the level in explicitly to avoid cases where $cmd looks like a number!

uplevel 1 $cmd

Another common scenario is reading commands from users as part of an application. In this case, you
should evaluate the command at the global scope. Example 16-2 on page 220 illustrates this use of
uplevel:

uplevel #0 $cmd

If you are assembling a command from a few different lists, such as the args parameter, then you can
use concat to form the command:

uplevel [concat $cmd $args]

The lists in $cmd and $args are concatenated into a single list, which is a valid Tcl command. Like
eval, uplevel uses concat internally if it is given extra arguments, so you can leave out the explicit
use of concat. The following commands are equivalent:

uplevel [concat $cmd $args]

uplevel "$cmd $args"
uplevel $cmd $args

Example 10-4 shows list assignment using the foreach trick described on page 75. List assignment is
useful if a command returns several values in a list. The lassign procedure assigns the list elements to
several variables. The lassign procedure hides the foreach trick, but it must use the uplevel
command so that the loop variables get assigned in the correct scope. The list command is used to
construct the foreach command that is executed in the caller's scope. This is necessary so that
$args and $valueList get substituted before the command is evaluated in the other scope.

Example 10-4 lassign: list assignment with foreach.

# Assign a set of variables from a list of values.
# If there are more values than variables, they are returned.
# If there are fewer values than variables,
# the variables get the empty string.

proc lassign {valueList args} {
    if {[llength $args] == 0} {
        error "wrong # args: lassign list varname ?varname..?"
    }
    if {[llength $valueList] == 0} {
        # Ensure one trip through the foreach loop
        set valueList [list {}]
    }
    uplevel 1 [list foreach $args $valueList {break}]
    return [lrange $valueList [llength $args] end]
}
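A short usage sketch follows. It assumes lassign is available, either as the procedure above or as the built-in command of the same name that Tcl 8.5 and later provide with the same calling convention:

```tcl
# Assumes lassign is available: either the procedure from Example 10-4
# or the built-in command of the same name in Tcl 8.5 and later.
set rest [lassign {a b c d} first second]
puts "first=$first second=$second rest=$rest"

# Fewer values than variables: the extra variable gets the empty string.
lassign {1} x y
puts "x=$x y=<$y>"
```

The variables are created in the scope of the caller, which is the whole reason for the uplevel.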

Example 10-5 illustrates a new control structure with the File_Process procedure that applies a
callback to each line in a file. The call to uplevel allows the callback to be concatenated with the
line to form the command. The list command is used to quote any special characters in line, so it
appears as a single argument to the command.

Example 10-5 The File_Process procedure applies a command to each line of a file.

proc File_Process {file callback} {
    set in [open $file]
    while {[gets $in line] >= 0} {
        uplevel 1 $callback [list $line]
    }
    close $in
}

What is the difference between these two commands?

uplevel 1 [list $callback $line]

uplevel 1 $callback [list $line]

The first form limits callback to be the name of the command, while the second form allows
callback to be a command prefix. Once again, what is the bug with this version?

uplevel 1 $callback $line

The arbitrary value of $line is concatenated to the callback command, and it is likely to be a
malformed command when executed.
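The safe form can be tried without a file. In this sketch, ApplyToLine is a hypothetical one-line version of File_Process and Show is a made-up callback; the line is chosen to contain characters that Tcl would otherwise interpret:

```tcl
# ApplyToLine is a hypothetical one-line version of File_Process.
proc ApplyToLine {callback line} {
    uplevel 1 $callback [list $line]
}

proc Show {label line} {
    return "${label}: ${line}"
}

# The callback is a command prefix: Show plus its first argument.
puts [ApplyToLine [list Show "line 1"] {cost is $5 [approx]}]
```

The prefix contributes two words and the quoted line contributes exactly one, so Show gets the two arguments it expects.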

The subst Command

The subst command is useful when you have a mixture of Tcl commands, Tcl variable references,
and plain old data. The subst command looks through the data for square brackets, dollar signs, and
backslashes, and it does substitutions on those. It leaves the rest of the data alone:

set a "foo bar"

subst {a=$a date=[exec date]}
=> a=foo bar date=Thu Dec 15 10:13:48 PST 1994

The subst command does not honor the quoting effect of curly braces. It does substitutions regardless
of braces:

subst {a=$a date={[exec date]}}

=> a=foo bar date={Thu Dec 15 10:15:31 PST 1994}

You can use backslashes to prevent variable and command substitution.

subst {a=\$a date=\[exec date]}

=> a=$a date=[exec date]

You can use other backslash substitutions like \uXXXX to get Unicode characters, \n to get newlines, or
\-newline to hide newlines.

The subst command takes flags that limit the substitutions it will perform. The flags are
-nobackslashes, -nocommands, and -novariables. You can specify one or more of these flags before
the string that needs to be substituted:

subst -novariables {a=$a date=[exec date]}

=> a=$a date=Thu Dec 15 10:15:31 PST 1994
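The same behavior can be checked with commands whose output does not change from run to run. This is a small sketch; the strings are my own and clock seconds is used only as a command that is never actually run:

```tcl
set a "foo bar"

# All substitutions are performed.
puts [subst {a=$a len=[string length $a]}]

# Brackets are left alone, but the variable is still replaced.
puts [subst -nocommands {a=$a now=[clock seconds]}]

# With both flags, the text comes back unchanged.
puts [subst -novariables -nocommands {a=$a now=[clock seconds]}]
```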

String Processing with subst
The subst command can be used with the regsub command to do efficient, two-step string
processing. In the first step, regsub is used to rewrite an input string into data with embedded Tcl
commands. In the second step, subst or eval replaces the Tcl commands with their result. By artfully
mapping the data into Tcl commands, you can dynamically construct a Tcl script that processes the
data. The processing is efficient because the Tcl parser and the regular expression processor have been
highly tuned. Chapter 11 has several examples that use this technique.
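As a rough sketch of the two steps, the following expands %XX hexadecimal escapes. ExpandHex is a hypothetical procedure of my own, not one of the Chapter 11 examples, and it adds a preliminary step that backslash-quotes any Tcl special characters already in the data so that subst leaves them alone:

```tcl
# ExpandHex is a hypothetical example of the regsub-then-subst technique.
proc ExpandHex {s} {
    # Protect brackets, dollar signs, and backslashes already in the data.
    regsub -all {[][$\\]} $s {\\&} s
    # Step 1: rewrite each %XX escape into an embedded format command.
    regsub -all {%([0-9a-fA-F][0-9a-fA-F])} $s {[format %c 0x\1]} s
    # Step 2: let subst run the embedded commands.
    return [subst $s]
}

puts [ExpandHex {a%20b%21 $x [y]}]
```

Without the protective first regsub, the $x and [y] in the input would be evaluated by subst, which is rarely what you want with data from outside your program.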

Chapter 11. Regular Expressions

This chapter describes regular expression pattern matching and string processing based on regular
expression substitutions. These features provide the most powerful string processing facilities in Tcl.
Tcl commands described are: regexp and regsub.
Regular expressions are a formal way to describe string patterns. They provide a powerful and
compact way to specify patterns in your data. Even better, there is a very efficient implementation of
the regular expression mechanism due to Henry Spencer. If your script does much string processing, it
is worth the effort to learn about the regexp command. Your Tcl scripts will be compact and efficient.
This chapter uses many examples to show you the features of regular expressions.
Regular expression substitution is a mechanism that lets you rewrite a string based on regular
expression matching. The regsub command is another powerful tool, and this chapter includes several
examples that do a lot of work in just a few Tcl commands. Stephen Uhler has shown me several ways
to transform input data into a Tcl script with regsub and then use subst or eval to process the data.
The idea takes a moment to get used to, but it provides a very efficient way to process strings.
Tcl 8.1 added a new regular expression implementation that supports Unicode and advanced regular
expressions (ARE). This implementation adds more syntax and escapes that make it easier to write
patterns, once you learn the new features! If you know Perl, then you are already familiar with these
features. The Tcl advanced regular expressions are almost identical to the Perl 5 regular expressions.
The new features include a few very minor incompatibilities with the regular expressions implemented
in Tcl 8.0 and earlier versions, but these rarely occur in practice. The new regular expression package
supports Unicode, of course, so you can write patterns to match Japanese or Hindi documents!

When to Use Regular Expressions

Regular expressions can seem overly complex at first. They introduce their own syntax and their own
rules, and you may be tempted to use simpler commands like string first, string range, or
string match to process your strings. However, often a single regular expression command can
replace a sequence of several string commands. Replacing several Tcl commands with one
usually improves performance. Furthermore, the regular expression matcher is
implemented in optimized C code, so pattern matching is fast.
The regular expression matcher does more than test for a match. It also tells you what part of your
input string matches the pattern. This is useful for picking data out of a large input string. In fact, you
can capture several pieces of data in just one match by using subexpressions. The regexp Tcl
command makes this easy by assigning the matching data to Tcl variables. If you find yourself using
string first and string range to pick out data, remember that regexp can do it in one step
instead.
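For instance, here is a minimal sketch that picks two numbers out of a line in one step. The input string and the variable names are invented for illustration:

```tcl
set line "width=640 height=480"

# Each parenthesized subexpression is stored in its own variable.
if {[regexp {width=([0-9]+) height=([0-9]+)} $line match w h]} {
    puts "$w by $h"
}
```

The variable match receives the text matched by the whole pattern, and w and h receive the two subexpressions.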
The regular expression matcher is structured so that patterns are first compiled into a form that is
efficient to match. If you use the same pattern frequently, then the expensive compilation phase is
done only once, and all your matching uses the efficient form. These details are completely hidden by
the Tcl interface. If you use a pattern twice, Tcl will nearly always be able to retrieve the compiled
form of the pattern. As you can see, the regular expression matcher is optimized for lots of heavy-duty
string processing.

Avoiding a Common Problem

Group your patterns with curly braces.

One of the stumbling blocks with regular expressions is that they use some of the same special
characters as Tcl. Any pattern that contains brackets, dollar signs, or spaces must be quoted when used
in a Tcl command. In many cases you can group the regular expression with curly braces, so Tcl pays
no attention to it. However, when using Tcl 8.0 (or earlier) you may need Tcl to do backslash
substitutions on part of the pattern, and then you need to worry about quoting the special characters in
the regular expression.
Advanced regular expressions eliminate this problem because backslash substitution is now done by
the regular expression engine. Previously, to get \n to mean the newline character (or \t for tab) you
had to let Tcl do the substitution. With Tcl 8.1, \n and \t inside a regular expression mean newline
and tab. In fact, there are now about 20 backslash escapes you can use in patterns. Now more than
ever, remember to group your patterns with curly braces to avoid conflicts between Tcl and the regular
expression engine.
The patterns in the first sections of this chapter ignore this problem. The sample expressions in Table
11-7 on page 151 are quoted for use within Tcl scripts. Most are quoted simply by putting the whole
pattern in braces, but some are shown without braces for comparison.
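A short sketch shows what goes wrong without braces. The pattern and input are my own; catch is used so that the failing form does not stop the script:

```tcl
# Braces: Tcl passes the pattern to regexp untouched.
puts [regexp {^[0-9]+$} 12345]

# Double quotes: Tcl treats [0-9] as a command substitution and fails
# before regexp is ever called, so catch returns 1.
puts [catch {regexp "^[0-9]+$" 12345} msg]
puts $msg
```

The error message comes from the Tcl parser, not from the regular expression engine, which is a useful clue when you run into this.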

Regular Expression Syntax

This section describes the basics of regular expression patterns, which are found in all versions of Tcl.
There are occasional references to features added by advanced regular expressions, but they are
covered in more detail starting on page 138. There is enough syntax in regular expressions that there
are five tables that summarize all the options. These tables appear together starting at page 145.
A regular expression is a sequence of the following items:

A literal character.
A matching character, character set, or character class.
A repetition quantifier.
An alternation clause.
A subpattern grouped with parentheses.

Matching Characters
Most characters simply match themselves. The following pattern matches an a followed by a b:

ab

The general wild-card character is the period, ".". It matches any single character. The following
pattern matches an a followed by any character:

a.

Remember that matches can occur anywhere within a string; a pattern does not have to match the
whole string. You can change that by using anchors, which are described on page 137.

Character Sets
The matching character can be restricted to a set of characters with the [xyz] syntax. Any of the
characters between the two brackets is allowed to match. For example, the following matches either
Hello or hello:

[Hh]ello

The matching set can be specified as a range over the character set with the [x-y] syntax. The
following matches any digit:

[0-9]

There is also the ability to specify the complement of a set. That is, the matching character can be
anything except what is in the set. This is achieved with the [^xyz] syntax. Ranges and complements
can be combined. The following matches anything except the uppercase and lowercase letters:

[^a-zA-Z]

Using special characters in character sets.

If you want a ] in your character set, put it immediately after the initial opening bracket. You do not
need to do anything special to include [ in your character set. The following matches any square
brackets or curly braces:

[][{}]

Most regular expression syntax characters are no longer special inside character sets. This means you
do not need to backslash anything inside a bracketed character set except for backslash itself. The
following pattern matches several of the syntax characters used in regular expressions:

[][+*?()|\\]

Advanced regular expressions add names and backslash escapes as shorthand for common sets of
characters like white space, alpha, alphanumeric, and more. These are described on page 139 and
listed in Table 11-3 on page 146.
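
The character set forms above can be tried out with a short script. This is only a sketch: the input strings and variable names are illustrative, and regsub -all is used simply because it reports how many matches it replaced.

```tcl
# [Hh]ello accepts either capitalization of the first letter.
set r1 [regexp {[Hh]ello} "Hello, world"]
set r2 [regexp {[Hh]ello} "jello"]

# [^a-zA-Z] matches the first character that is not a letter.
regexp {[^a-zA-Z]} "abc7def" nonletter

# Delete every square bracket and curly brace. The ] comes first
# inside the set so that it is taken literally.
set count [regsub -all {[][{}]} {a[1] {b}} {} stripped]

puts "$r1 $r2 $nonletter $count $stripped"   ;# prints: 1 0 7 4 a1 b
```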

Quantifiers
Repetition is specified with *, for zero or more, +, for one or more, and ?, for zero or one. These
quantifiers apply to the previous item, which is either a matching character, a character set, or a
subpattern grouped with parentheses. The following matches a string that contains b followed by zero
or more a's:

ba*

You can group part of the pattern with parentheses and then apply a quantifier to that part of the
pattern. The following matches a string that has one or more sequences of ab:

(ab)+

The pattern that matches anything, even the empty string, is:

.*

These quantifiers have a greedy matching behavior: They match as many characters as possible.
Advanced regular expressions add nongreedy matching, which is described on page 140. For example,
a pattern to match a single line might look like this:

.*\n

However, as a greedy match, this will match all the lines in the input, ending with the last newline in
the input string. The following pattern matches up through the first newline.

[^\n]*\n

We will shorten this pattern even further on page 140 by using nongreedy quantifiers. There are also
special newline sensitive modes you can turn on with some options described on page 143.
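
The difference between the two line-matching patterns can be seen in a small experiment. This sketch assumes Tcl 8.1 or later, where \n inside a braced pattern is understood by the regular expression engine; the three-line input is invented for the example.

```tcl
set input "line one\nline two\nline three\n"

# Greedy: . also matches newlines, so the match runs to the last newline.
regexp {.*\n} $input greedy

# Excluding newline from the repeated set stops at the first newline.
regexp {[^\n]*\n} $input first

puts [string length $greedy]   ;# the length of the whole input
puts [string length $first]    ;# the length of "line one" plus the newline
```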

Alternation
Alternation lets you test more than one pattern at the same time. The matching engine is designed to be
able to test multiple patterns in parallel, so alternation is efficient. Alternation is specified with |, the
pipe symbol. Another way to match either Hello or hello is:

hello|Hello

You can also write this pattern as:

(h|H)ello

or as:

[hH]ello

Anchoring a Match
By default a pattern does not have to match the whole string. There can be unmatched characters
before and after the match. You can anchor the match to the beginning of the string by starting the
pattern with ^, or to the end of the string by ending the pattern with $. You can force the pattern to
match the whole string by using both. All strings that begin with spaces or tabs are matched with:

^[ \t]+

If you have many text lines in your input, you may be tempted to think of ^ as meaning "beginning of
line" instead of "beginning of string." By default, the ^ and $ anchors are relative to the whole input,
and embedded newlines are ignored. Advanced regular expressions support options that make the ^
and $ anchors line-oriented. They also add the \A and \Z anchors that always match the beginning and
end of the string, respectively.
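
Here is a brief sketch of anchoring in practice. The strings are illustrative only, and the last test shows that, without the line-oriented options, ^ does not match after an embedded newline.

```tcl
# ^ ties the match to the start of the string.
set r1 [regexp {^[ \t]+} "   indented"]    ;# 1: starts with spaces
set r2 [regexp {^[ \t]+} "not indented"]   ;# 0: the space is in the middle

# Using both anchors forces the pattern to cover the whole string.
set r3 [regexp {[0-9]+} "12345 apples"]    ;# 1: digits appear somewhere
set r4 [regexp {^[0-9]+$} "12345 apples"]  ;# 0: not all digits

# By default ^ means "start of string", not "start of line".
set r5 [regexp {^second} "first\nsecond"]  ;# 0

puts "$r1 $r2 $r3 $r4 $r5"
```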

Backslash Quoting
Use the backslash character to turn off these special characters :

. * ? + [ ] ( ) ^ $ | \

For example, to match the plus character, you will need:

\+

Remember that this quoting is not necessary inside a bracketed expression (i.e., a character set
definition.) For example, to match either plus or question mark, either of these patterns will work:

(\+|\?)
[+?]

To match a single backslash, you need two. You must do this everywhere, even inside a bracketed
expression. Or you can use \B, which was added as part of advanced regular expressions. Both of these
match a single backslash:

\\
\B

Unknown backslash sequences are an error.

Versions of Tcl before 8.1 ignored unknown backslash sequences in regular expressions. For example,
\= was just =, and \w was just w. Even \n was just n, which was probably frustrating to many
beginners trying to get a newline into their pattern. Advanced regular expressions add backslash
sequences for tab, newline, character classes, and more. This is a convenient improvement, but in rare
cases it may change the semantics of a pattern. Usually these cases are where an unneeded backslash
suddenly takes on meaning, or causes an error because it is unknown.
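
The quoting rules can be checked with a few one-line tests. This is a sketch with made-up strings; each pattern is in braces so that Tcl passes the backslashes through to the regular expression engine unchanged.

```tcl
set r1 [regexp {\+} "1+1"]      ;# 1: a literal plus sign
set r2 [regexp {\+} "1-1"]      ;# 0: no plus sign in the string
set r3 [regexp {[+?]} "why?"]   ;# 1: no backslash needed inside brackets

# A single backslash always takes two, even inside brackets.
set path {C:\temp}
set r4 [regexp {\\} $path]      ;# 1
set r5 [regexp {[\\]} $path]    ;# 1

puts "$r1 $r2 $r3 $r4 $r5"
```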

Matching Precedence
If a pattern can match several parts of a string, the matcher takes the match that occurs earliest in the
input string. Then, if there is more than one match from that same point because of alternation in the
pattern, the matcher takes the longest possible match. The rule of thumb is: first, then longest. This
rule gets changed by nongreedy quantifiers that prefer a shorter match.

Watch out for *, which means zero or more, because zero of anything is pretty easy to match. Suppose
your pattern is:

[a-z]*

This pattern will match against 123abc, but not how you expect. Instead of matching on the letters in
the string, the pattern will match on the zero-length substring at the very beginning of the input string!
This behavior can be seen by using the -indices option of the regexp command described on page
148. This option tells you the location of the matching string instead of the value of the matching
string.
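
A quick way to see the zero-length match is to ask for indices. In this sketch the variable names are arbitrary; an empty match at the start of the string is reported as the index pair 0 -1.

```tcl
# * is satisfied by zero letters at the very start of the string.
regexp -indices {[a-z]*} 123abc zeroOrMore

# + insists on at least one letter, so the matcher moves on to abc.
regexp -indices {[a-z]+} 123abc oneOrMore

puts $zeroOrMore   ;# prints: 0 -1
puts $oneOrMore    ;# prints: 3 5
```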

Capturing Subpatterns
Use parentheses to capture a subpattern. The string that matches the pattern within parentheses is
remembered in a matching variable, which is a Tcl variable that gets assigned the string that matches
the pattern. Using parentheses to capture subpatterns is very useful. Suppose we want to get everything
between the <td> and </td> tags in some HTML. You can use this pattern:

<td>([^<]*)</td>

The matching variable gets assigned the part of the input string that matches the pattern inside the
parentheses. You can capture many subpatterns in one match, which makes it a very efficient way to
pick apart your data. Matching variables are explained in more detail on page 148 in the context of the
regexp command.
Sometimes you need to introduce parentheses but you do not care about the match that occurs inside
them. The pattern is slightly more efficient if the matcher does not need to remember the match.
Advanced regular expressions add noncapturing parentheses with this syntax:

(?:pattern)
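
The following sketch shows both kinds of parentheses at work. It assumes a pattern such as <td>([^<]*)</td> for the table cell; the HTML fragment, the salutation string, and the variable names are invented for the example.

```tcl
# Capturing parentheses: cell receives the text between the tags.
set html {<tr><td>first</td><td>second</td></tr>}
regexp {<td>([^<]*)</td>} $html match cell
puts $match   ;# prints: <td>first</td>
puts $cell    ;# prints: first

# Noncapturing parentheses group the alternation without using up
# a match variable, so name is the first (and only) subpattern.
regexp {(?:Mr|Ms)\. (\w+)} "Dear Ms. Welch" whole name
puts $name    ;# prints: Welch
```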

Advanced Regular Expressions

The syntax added by advanced regular expressions is mostly just shorthand notation for constructs
you can make with the basic syntax already described. There are also some new features that add
additional power: nongreedy quantifiers, back references, look-ahead patterns, and named character
classes. If you are just starting out with regular expressions, you can ignore most of this section, except
for the one about backslash sequences. Once you master the basics, or if you are already familiar with
regular expressions in Tcl (or the UNIX vi editor or grep utility), then you may be interested in the
new features of advanced regular expressions.

Compatibility with Patterns in Tcl 8.0

Advanced regular expressions add syntax in an upward compatible way. Old patterns continue to work
with the new matcher, but advanced regular expressions will raise errors if given to old versions of
Tcl. For example, the question mark is used in many of the new constructs, and it is artfully placed in
locations that would not be legal in older versions of regular expressions. The added syntax is
summarized in Table 11-2 on page 145.
If you have unbraced patterns from older code, they are very likely to be correct in Tcl 8.1 and later
versions. For example, the following pattern picks out everything up to the next newline. The pattern
is unbraced, so Tcl substitutes the newline character for each occurrence of \n. The square brackets are
quoted so that Tcl does not think they delimit a nested command:

regexp "(\[^\n\]+)\n" $input

The above command behaves identically when using advanced regular expressions, although you can
now also write it like this:

regexp {([^\n]+)\n} $input

The curly braces hide the brackets from the Tcl parser, so they do not need to be escaped with
backslash. This saves us two characters and looks a bit cleaner.
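
To confirm that the two quoting styles behave the same in Tcl 8.1 and later, you can run both against one input. This is a sketch; the input string and the variable names quoted and braced are just for illustration.

```tcl
set input "first line\nsecond line\n"

# Double quotes: Tcl substitutes \n and the brackets must be escaped.
regexp "(\[^\n\]+)\n" $input m1 quoted

# Braces: the regular expression engine interprets \n itself.
regexp {([^\n]+)\n} $input m2 braced

puts $quoted                          ;# prints: first line
puts [string equal $quoted $braced]   ;# prints: 1
```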

Backslash Escape Sequences

The most significant change in advanced regular expression syntax is backslash substitutions. In Tcl
8.0 and earlier, a backslash was only used to turn off special characters such as: . + * ? [ ].
Otherwise it was ignored. For example, \n was simply n to the Tcl 8.0 regular expression engine. This
was a source of confusion, and it meant you could not always quote patterns in braces to hide their
special characters from Tcl's parser. In advanced regular expressions, \n now means the newline
character to the regular expression engine, so you should never need to let Tcl do backslash processing.
Again, always group your pattern with curly braces to avoid confusion.
Advanced regular expressions add a lot of new backslash sequences. They are listed in Table 11-4 on
page 146. Some of the more useful ones include \s, which matches space-like characters, \w, which
matches letters, digits, and the underscore, \y, which matches the beginning or end of a word, and \B,
which matches a backslash.

Character Classes
Character classes are names for sets of characters. The named character class syntax is valid only
inside a bracketed character set. The syntax is

[:identifier:]

For example, alpha is the name for the set of uppercase and lowercase letters. The following two
patterns are almost the same:

[A-Za-z]
[[:alpha:]]

The difference is that the alpha character class also includes accented characters like è. If you match
data that contains non-ASCII characters, the named character classes are more general than trying to
name the characters explicitly.
There are also backslash sequences that are shorthand for some of the named character classes. The
following patterns to match digits are equivalent:

[0-9]
[[:digit:]]
\d

The following patterns match space-like characters including space, form feed, newline, carriage
return, tab, and vertical tab:

[ \f\n\r\t\v]
[[:space:]]
\s

The named character classes and the associated backslash sequence are listed in Table 11-3 on page
146.
You can use character classes in combination with other characters or character classes inside a
character set definition. The following patterns match letters, digits, and underscore:

[[:digit:][:alpha:]_]
[\d[:alpha:]_]
[[:alnum:]_]
\w

Note that \d, \s and \w can be used either inside or outside character sets. When used outside a
bracketed expression, they form their own character set. There are also \D, \S, and \W, which are the
complement of \d, \s, and \w. These escapes (i.e., \D for not-a-digit) cannot be used inside a
bracketed character set.
There are two special character classes, [[:<:]] and [[:>:]], that match the beginning and end of a
word, respectively. A word is defined as one or more characters that match \w.
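
The sketch below compares explicit ranges with the named classes and their backslash shorthands. The test strings are invented; \u00e9 in the double-quoted string is the Unicode escape for an accented e.

```tcl
# An accented letter falls outside A-Za-z but inside [:alpha:].
set word "caf\u00e9"
set r1 [regexp {^[A-Za-z]+$} $word]      ;# 0
set r2 [regexp {^[[:alpha:]]+$} $word]   ;# 1

# \d and [[:digit:]] are interchangeable.
set r3 [regexp {^\d+$} 1999]             ;# 1
set r4 [regexp {^[[:digit:]]+$} 1999]    ;# 1

# \w covers letters, digits, and the underscore, but not spaces.
set r5 [regexp {^\w+$} tcl_version]      ;# 1
set r6 [regexp {^\w+$} "tcl version"]    ;# 0

puts "$r1 $r2 $r3 $r4 $r5 $r6"
```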

Nongreedy Quantifiers
The *, +, and ? characters are quantifiers that specify repetition. By default these match as many
characters as possible, which is called greedy matching. A nongreedy match will match as few
characters as possible. You can specify nongreedy matching by putting a question mark after these
quantifiers. Consider the pattern to match "one or more of not-a-newline followed by a newline." The
not-a-newline must be explicit with the greedy quantifier, as in:

[^\n]+\n

Otherwise, if the pattern were just

.+\n

then the "." could well match newlines, so the pattern would greedily consume everything until the
very last newline in the input. A nongreedy match would be satisfied with the very first newline
instead:

.+?\n

By using the nongreedy quantifier we've cut the pattern from eight characters to five. Another example
that is shorter with a nongreedy quantifier is the HTML example from page 138. The following pattern
also matches everything between <td> and </td>:

<td>(.*?)</td>

Even ? can be made nongreedy, ??, which means it prefers to match zero instead of one. This only
makes sense inside the context of a larger pattern. Send me e-mail if you have a compelling example
for it!
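
To see greedy and nongreedy matching side by side, try both quantifiers on a row with two cells. This is a sketch with an invented HTML fragment and arbitrary variable names.

```tcl
set html {<td>a</td><td>b</td>}

# Greedy .* runs on to the last </td> in the input.
regexp {<td>(.*)</td>} $html match greedy

# Nongreedy .*? stops at the first </td>.
regexp {<td>(.*?)</td>} $html match lazy

puts $greedy   ;# prints: a</td><td>b
puts $lazy     ;# prints: a
```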

Bound Quantifiers
The {m,n} syntax is a quantifier that means match at least m and at most n of the previous matching
item. There are two variations on this syntax. A simple {m} means match exactly m of the previous
matching item. A {m,} means match m or more of the previous matching item. All of these can be
made nongreedy by adding a ? after them.
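
A few bound quantifiers in action, as a sketch with illustrative strings. Note that the braces of the quantifier nest safely inside the braces that quote the pattern for Tcl.

```tcl
# {m}: exactly three digits, a dash, exactly four digits.
set r1 [regexp {^\d{3}-\d{4}$} 555-1212]   ;# 1

# {m,n}: between two and four letters; five is too many.
set r2 [regexp {^[a-z]{2,4}$} abcde]       ;# 0

# {m,}: two or more; the greedy form takes all three a's,
# the nongreedy form is satisfied with two.
regexp {a{2,}} caaandy greedy
regexp {a{2,}?} caaandy lazy

puts "$r1 $r2 $greedy $lazy"   ;# prints: 1 0 aaa aa
```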

Back References
A back reference is a feature you cannot easily get with basic regular expressions. A back reference
matches the value of a subpattern captured with parentheses. If you have several sets of parentheses
you can refer back to different captured expressions with \1, \2, and so on. You count by left
parentheses to determine the reference.
For example, suppose you want to match a quoted string, where you can use either single or double
quotes. You need to use an alternation of two patterns to match strings that are enclosed in double
quotes or in single quotes:

("[^"]*"|'[^']*')

With a back reference, \1, the pattern becomes simpler:

('|").*?\1

The first set of parentheses matches the leading quote, and then the \1 refers back to that particular
quote character. The nongreedy quantifier ensures that the pattern matches up to the first occurrence of
the matching quote.
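
Here is the back reference pattern applied to a string that mixes both kinds of quote characters. It is a sketch: the sample sentence is invented, and quote is simply the match variable for the first set of parentheses.

```tcl
# The apostrophe inside the double-quoted part does not end the
# match, because \1 must equal the quote that opened it.
set text {He said "it's late" and left}
regexp {('|").*?\1} $text match quote

puts $match   ;# prints: "it's late"
puts $quote   ;# prints: "
```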

Look-ahead
Look-ahead patterns are subexpressions that are matched but do not consume any of the input. They
act like constraints on the rest of the pattern, and they typically occur at the end of your pattern. A
positive look-ahead causes the pattern to match if it also matches. A negative look-ahead causes the
pattern to match if it would not match. These constraints make more sense in the context of matching
variables and in regular expression substitutions done with the regsub command. For example, the
following pattern matches a filename that begins with A and ends with .txt:

^A.*\.txt$

The next version of the pattern adds parentheses to group the file name suffix.

^A.*(\.txt)$

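The parentheses are not strictly necessary, but they are introduced so that we can compare the pattern
to one that uses look-ahead. A version of the pattern that uses look-ahead looks like this:

^A.*(?=\.txt$)

The pattern with the look-ahead constraint matches only the part of the filename before the .txt, but
only if the .txt is present at the end of the string. In other words, the .txt is not consumed by the
match. Note that the $ anchor goes inside the look-ahead. A $ placed after (?=\.txt) could never
match, because the .txt required by the look-ahead would still lie between the current position and
the end of the string. The effect is visible in the value of the matching variables used with the
regexp command. It would also affect the substitutions done in the regsub command.

There is negative look-ahead too. The following pattern matches a filename that begins with A and
does not end with .txt. The constraint is placed immediately after the A so that it can examine the
rest of the name; placed just before the final $, it would always succeed, because nothing follows the
end of the string:

^A(?!.*\.txt$).*$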

Writing this pattern without negative look-ahead is awkward.
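
The sketch below exercises both kinds of look-ahead on two made-up file names. It uses variants of the patterns in which the $ anchor sits inside the look-ahead, so the constraint reads "followed by .txt and then the end of the string"; the variable names are arbitrary.

```tcl
# Positive look-ahead: the .txt must be there, but it is not
# included in the matched text.
set r1 [regexp {^A.*(?=\.txt$)} Alpha.txt stem]
set r2 [regexp {^A.*(?=\.txt$)} Alpha.doc]

# Negative look-ahead: succeed only if the rest of the name
# does not end in .txt.
set r3 [regexp {^A(?!.*\.txt$)} Alpha.doc]
set r4 [regexp {^A(?!.*\.txt$)} Alpha.txt]

puts "$r1 $stem $r2 $r3 $r4"   ;# prints: 1 Alpha 0 1 0
```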

Character Codes
The \nn and \mmm syntax, where n and m are digits, can also mean an 8-bit character code
corresponding to the octal value nn or mmm. This has priority over a back reference. However, I just
wouldn't use this notation for character codes. Instead, use the Unicode escape sequence, \unnnn,
which specifies a 16-bit value. The \xnn sequence also specifies an 8-bit character code.
Unfortunately, the \x escape consumes all hex digits after it (not just two!) and then truncates the
hexadecimal value down to 8 bits. This misfeature of \x is not considered a bug and will probably not
change even in future versions of Tcl.
The \Uyyyyyyyy syntax is reserved for 32-bit Unicode, but I don't expect to see that implemented
anytime soon.

Collating Elements
Collating elements are characters or long names for characters that you can use inside character sets.
Currently, Tcl only has some long names for various ASCII punctuation characters. Potentially, it
could support names for every Unicode character, but it doesn't because the mapping tables would be
huge. This section will briefly mention the syntax so that you can understand it if you see it. But its
usefulness is still limited.
Within a bracketed expression, the following syntax is used to specify a collating element:

[.identifier.]

The identifier can be a character or a long name. The supported long names can be found in the
generic/regc_locale.c file in the Tcl source code distribution. A few examples are shown below:

[.c.]
[.#.]
[.number-sign.]

Equivalence Classes
An equivalence class is all characters that sort to the same position. This is another feature that has
limited usefulness in the current version of Tcl. In Tcl, characters sort by their Unicode character
value, so there are no equivalence classes that contain more than one character! However, you could
imagine a character class for 'o', 'ò', and other accented versions of the letter o. The syntax for
equivalence classes within bracketed expressions is:

[=char=]

where char is any one of the characters in the character class. This syntax is valid only inside a
character class definition.

Newline Sensitive Matching

By default, the newline character is just an ordinary character to the matching engine. You can make
the newline character special with two options: lineanchor and linestop. You can set these options
with flags to the regexp and regsub Tcl commands, or you can use the embedded options described
later in Table 11-5 on page 147.
The lineanchor option makes the ^ and $ anchors work relative to newlines. The ^ matches
immediately after a newline, and $ matches immediately before a newline. These anchors continue to
match the very beginning and end of the input, too. With or without the lineanchor option, you can
use \A and \Z to match the beginning and end of the string.
The linestop option prevents . (i.e., period) and character sets that begin with ^ from matching a
newline character. In other words, unless you explicitly include \n in your pattern, it will not match
across newlines.
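
The two options can be tried separately with the -lineanchor and -linestop flags of regexp. This is a sketch with a made-up three-line string.

```tcl
set text "alpha\nbeta\ngamma"

# lineanchor: ^ and $ also match next to embedded newlines.
set r1 [regexp {^beta$} $text]                 ;# 0
set r2 [regexp -lineanchor {^beta$} $text]     ;# 1

# linestop: the period no longer matches a newline.
set r3 [regexp {alpha.beta} $text]             ;# 1
set r4 [regexp -linestop {alpha.beta} $text]   ;# 0

# An explicit \n in the pattern still matches across lines.
set r5 [regexp -linestop {alpha\nbeta} $text]  ;# 1

puts "$r1 $r2 $r3 $r4 $r5"
```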

Embedded Options
You can start a pattern with embedded options to turn on or off case sensitivity, newline sensitivity,
and expanded syntax, which is explained in the next section. You can also switch from advanced
regular expressions to a literal string, or to older forms of regular expressions. The syntax is a leading:

(?chars)

where chars is any number of option characters. The option characters are listed in Table 11-5 on
page 147.
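
As a short sketch, here are three of the option letters from Table 11-5 used at the front of a pattern. The test strings are illustrative only.

```tcl
# i: case insensitive matching.
set r1 [regexp {(?i)^tcl$} TCL]              ;# 1

# n: newline sensitive matching, so ^ and $ work on lines.
set r2 [regexp {(?n)^beta$} "alpha\nbeta"]   ;# 1

# q: the rest of the pattern is a literal string, so the
# period matches only a period.
set r3 [regexp {(?q)a.c} "a.c"]              ;# 1
set r4 [regexp {(?q)a.c} "abc"]              ;# 0

puts "$r1 $r2 $r3 $r4"
```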

Expanded Syntax
Expanded syntax lets you include comments and extra white space in your patterns. This can greatly
improve the readability of complex patterns. Expanded syntax is turned on with a regexp command
option or an embedded option.
Comments start with a # and run until the end of line. Extra white space and comments can occur
anywhere except inside bracketed expressions (i.e., character sets) or within multicharacter syntax
elements like (?=. When you are in expanded mode, you can turn off the comment character or
include an explicit space by preceding them with a backslash. Example 11-1 shows a pattern to match
URLs. The leading (?x) turns on expanded syntax. The whole pattern is grouped in curly braces to
hide it from Tcl. This example is considered again in more detail in Example 11-3 on page 150:

Example 11-1 Expanded regular expressions allow comments.

regexp {(?x) # A pattern to match URLs

([^:]+): # The protocol before the initial colon
//([^:/]+) # The server name
(:([0-9]+))? # The optional port number
(/.*) # The trailing pathname
} $input
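
Expanded syntax can also be turned on with the -expanded flag instead of a leading (?x). The following sketch uses a smaller, hypothetical date pattern so the match variables are easy to follow; the input string and variable names are invented.

```tcl
# White space and # comments inside the pattern are ignored.
set ok [regexp -expanded {
    (\d{4})   # four-digit year
    -
    (\d{2})   # two-digit month
    -
    (\d{2})   # two-digit day
} "Pub Date: 1999-11-10" match year month day]

puts "$ok $year $month $day"   ;# prints: 1 1999 11 10
```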

Syntax Summary
Table 11-1 summarizes the syntax of regular expressions available in all versions of Tcl:

Table 11-1. Basic regular expression syntax.

. Matches any character.
* Matches zero or more instances of the previous pattern item.
+ Matches one or more instances of the previous pattern item.
? Matches zero or one instances of the previous pattern item.
( ) Groups a subpattern. The repetition and alternation operators apply to the preceding subpattern.
| Alternation.
[ ] Delimit a set of characters. Ranges are specified as [x-y]. If the first character in the set is ^, then there is a match if the remaining characters in the set are not present.
^ Anchor the pattern to the beginning of the string. Only when first.
$ Anchor the pattern to the end of the string. Only when last.

Advanced regular expressions, which were introduced in Tcl 8.1, add more syntax that is summarized
in Table 11-2:

Table 11-2. Additional advanced regular expression syntax.

{m} Matches m instances of the previous pattern item.
{m}? Matches m instances of the previous pattern item. Nongreedy.
{m,} Matches m or more instances of the previous pattern item.
{m,}? Matches m or more instances of the previous pattern item. Nongreedy.
{m,n} Matches m through n instances of the previous pattern item.
{m,n}? Matches m through n instances of the previous pattern item. Nongreedy.
*? Matches zero or more instances of the previous pattern item. Nongreedy.
+? Matches one or more instances of the previous pattern item. Nongreedy.
?? Matches zero or one instances of the previous pattern item. Nongreedy.
(?:re) Groups a subpattern, re, but does not capture the result.
(?=re) Positive look-ahead. Matches the point where re begins.
(?!re) Negative look-ahead. Matches the point where re does not begin.
(?abc) Embedded options, where abc is any number of option letters listed in Table 11-5.
\c One of many backslash escapes listed in Table 11-4.
[: :] Delimits a character class within a bracketed expression. See Table 11-3.
[. .] Delimits a collating element within a bracketed expression.
[= =] Delimits an equivalence class within a bracketed expression.

Table 11-3 lists the named character classes defined in advanced regular expressions and their
associated backslash sequences, if any. Character class names are valid inside bracketed character sets
with the [:class:] syntax.

Table 11-3. Character classes.

alnum Upper and lower case letters and digits.
alpha Upper and lower case letters.
blank Space and tab.
cntrl Control characters: \u0001 through \u001F.
digit The digits zero through nine. Also \d.
graph Printing characters that are not in cntrl or space.
lower Lowercase letters.
print The same as alnum.
punct Punctuation characters.
space Space, newline, carriage return, tab, vertical tab, form feed. Also \s.
upper Uppercase letters.
xdigit Hexadecimal digits: zero through nine, a-f, A-F.

Table 11-4 lists backslash sequences supported in Tcl 8.1.

Table 11-4. Backslash escapes in regular expressions.

\a Alert, or "bell", character.

\A Matches only at the beginning of the string.
\b Backspace character, \u0008.
\B Synonym for backslash.
\cX Control-X.
\d Digits. Same as [[:digit:]]
\D Not a digit. Same as [^[:digit:]]
\e Escape character, \u001B.
\f Form feed, \u000C.
\m Matches the beginning of a word.
\M Matches the end of a word.
\n Newline, \u000A.
\r Carriage return, \u000D.
\s Space. Same as [[:space:]]
\S Not a space. Same as [^[:space:]]
\t Horizontal tab, \u0009.
\uXXXX A 16-bit Unicode character code.
\v Vertical tab, \u000B.
\w Letters, digits, and underscore. Same as [[:alnum:]_]
\W Not a letter, digit, or underscore. Same as [^[:alnum:]_]
\xhh An 8-bit hexadecimal character code. Consumes all hex digits after \x.
\y Matches the beginning or end of a word.
\Y Matches a point that is not the beginning or end of a word.
\Z Matches the end of the string.
\0 NULL, \u0000
\x Where x is a digit, this is a back-reference.
\xy Where x and y are digits, either a decimal back-reference, or an 8-bit octal character code.
\xyz Where x, y and z are digits, either a decimal back-reference or an 8-bit octal character code.

Table 11-5 lists the embedded option characters used with the (?abc) syntax.

Table 11-5. Embedded option characters used with the (?x) syntax.

b The rest of the pattern is a basic regular expression (a la vi or grep).

c Case sensitive matching. This is the default.
e The rest of the pattern is an extended regular expression (a la Tcl 8.0).
i Case insensitive matching.
m Synonym for the n option.
n Newline sensitive matching. Both lineanchor and linestop mode.
p Partial newline sensitive matching. Only linestop mode.
q The rest of the pattern is a literal string.
s No newline sensitivity. This is the default.
t Tight syntax; no embedded comments. This is the default.
w Inverse partial newline-sensitive matching. Only lineanchor mode.
x Expanded syntax with embedded white space and comments.

The regexp Command

The regexp command provides direct access to the regular expression matcher. Not only does it tell
you whether a string matches a pattern, it can also extract one or more matching substrings. The return
value is 1 if some part of the string matches the pattern; it is 0 otherwise. Its syntax is:

regexp ?flags? pattern string ?match sub1 sub2...?

The flags are described in Table 11-6:

Table 11-6. Options to the regexp command.

-nocase Lowercase characters in pattern can match either lowercase or uppercase letters in string.
-indices The match variables each contain a pair of numbers that are indices delimiting the match within string. Otherwise, the matching string itself is copied into the match variables.
-expanded The pattern uses the expanded syntax discussed on page 144.
-line The same as specifying both -lineanchor and -linestop.
-lineanchor Change the behavior of ^ and $ so they are line-oriented as discussed on page 143.
-linestop Change matching so that . and character classes do not match newlines as discussed on page 143.
-about Useful for debugging. It returns information about the pattern instead of trying to match it against the input.
-- Signals the end of the options. You must use this if your pattern begins with -.

The pattern argument is a regular expression as described earlier. If string matches pattern, then
the results of the match are stored in the variables named in the command. These match variable
arguments are optional. If present, match is set to be the part of the string that matched the pattern. The
remaining variables are set to be the substrings of string that matched the corresponding subpatterns
in pattern. The correspondence is based on the order of left parentheses in the pattern to avoid
ambiguities that can arise from nested subpatterns.
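
A few of the flags from Table 11-6 are shown in the sketch below. The strings are made up, and the variable names are arbitrary.

```tcl
# -nocase: a lowercase letter in the pattern also matches uppercase.
set r1 [regexp -nocase {^y} "Yes"]

# -indices: the match variables get index pairs, not substrings.
regexp -indices {([^:]*):} sage:0.1 match host
puts $match   ;# prints: 0 4
puts $host    ;# prints: 0 3

# --: needed here because the pattern itself starts with a dash.
set r2 [regexp -- {-v} "cc -v file.c"]

puts "$r1 $r2"   ;# prints: 1 1
```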
Example 11-2 uses regexp to pick the hostname out of the DISPLAY environment variable, which has
the form:

hostname:display.screen

Example 11-2 Using regular expressions to parse a string.

set env(DISPLAY) sage:0.1

regexp {([^:]*):} $env(DISPLAY) match host
=> 1
set match
=> sage:
set host
=> sage

The pattern involves a complementary set, [^:], to match anything except a colon. It uses repetition,
*, to repeat that zero or more times. It groups that part into a subexpression with parentheses. The
literal colon ensures that the DISPLAY value matches the format we expect. The part of the string that
matches the complete pattern is stored into the match variable. The part that matches the subpattern is
stored into host. The whole pattern has been grouped with braces to quote the square brackets.
Without braces it would be:

regexp (\[^:\]*): $env(DISPLAY) match host

With advanced regular expressions the nongreedy quantifier *? can replace the complementary set:

regexp (.*?): $env(DISPLAY) match host

This is quite a powerful statement, and it is efficient. If we had only had the string command to work
with, we would have needed to resort to the following, which takes roughly twice as long to interpret:

set i [string first : $env(DISPLAY)]

if {$i >= 0} {
    set host [string range $env(DISPLAY) 0 [expr {$i-1}]]
}

A Pattern to Match URLs
Example 11-3 demonstrates a pattern with several subpatterns that extract the different parts of a URL.
There are lots of subpatterns, and you can determine which match variable is associated with which
subpattern by counting the left parentheses. The pattern will be discussed in more detail after the
example:

Example 11-3 A pattern to match URLs.

set url http://www.beedub.com:80/index.html

regexp {([^:]+)://([^:/]+)(:([0-9]+))?(/.*)} $url \
    match protocol server x port path
=> 1
set match
=> http://www.beedub.com:80/index.html
set protocol
=> http
set server
=> www.beedub.com
set x
=> :80
set port
=> 80
set path
=> /index.html

Let's look at the pattern one piece at a time. The first part looks for the protocol, which is separated by
a colon from the rest of the URL. The first part of the pattern is one or more characters that are not a
colon, followed by a colon. This matches the http: part of the URL:

[^:]+:

Using the nongreedy +? quantifier, you could also write that as:

.+?:

The next part of the pattern looks for the server name, which comes after two slashes. The server name
is followed either by a colon and a port number, or by a slash. The pattern uses a complementary set
that specifies one or more characters that are not a colon or a slash. This matches the
//www.beedub.com part of the URL:

//[^:/]+

The port number is optional, so a subpattern is delimited with parentheses and followed by a question
mark. An additional set of parentheses is added to capture the port number without the leading colon.
This matches the :80 part of the URL:

(:([0-9]+))?

The last part of the pattern is everything else, starting with a slash. This matches the /index.html part
of the URL:

/.*

Use subpatterns to parse strings.

To make this pattern really useful, we delimit several subpatterns with parentheses:

([^:]+)://([^:/]+)(:([0-9]+))?(/.*)

These parentheses do not change the way the pattern matches. Only the optional port number really
needs the parentheses in this example. However, the regexp command gives us access to the strings
that match these subpatterns. In one step regexp can test for a valid URL and divide it into the
protocol part, the server, the port, and the trailing path.
The parentheses around the port number include the : before the digits. We've used a dummy variable
that gets the : and the port number, and another match variable that just gets the port number. By using
noncapturing parentheses in advanced regular expressions, we can eliminate the unused match
variable. We can also replace both complementary character sets with a nongreedy .+? match.
Example 11-4 shows this variation:

Example 11-4 An advanced regular expression to match URLs.

set url http://www.beedub.com:80/book/

regexp {(.+?)://(.+?)(?::([0-9]+))?(/.*)} $url \
    match protocol server port path
=> 1
set match
=> http://www.beedub.com:80/book/
set protocol
=> http
set server
=> www.beedub.com
set port
=> 80
set path
=> /book/
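
One way to package this up is a small procedure that returns the parts as a list. The sketch below is not from the Tcl library: Url_Split is a hypothetical name, and the pattern is the complementary-set version from Example 11-3 with noncapturing parentheses around the port and with anchors added so that the whole string must be a URL.

```tcl
# Url_Split is a hypothetical helper used only for this sketch.
# It returns a list of protocol, server, port, and path.
proc Url_Split {url} {
    set pattern {^([^:]+)://([^:/]+)(?::([0-9]+))?(/.*)$}
    if {![regexp $pattern $url match protocol server port path]} {
        error "not a URL: $url"
    }
    return [list $protocol $server $port $path]
}

puts [Url_Split http://www.beedub.com:80/index.html]
# When the port is omitted, its match variable is the empty string.
puts [Url_Split http://www.beedub.com/book/]
```

Checking the return value of regexp before using the match variables matters here, because the variables are only set when the pattern matches.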

Sample Regular Expressions

The table in this section lists regular expressions as you would use them in Tcl commands. Most are
quoted with curly braces to turn off the special meaning of square brackets and dollar signs. Other
patterns are grouped with double quotes and use backslash quoting because the patterns include
backslash sequences like \n and \t. In Tcl 8.0 and earlier, these must be substituted by Tcl before the
regexp command is called. In these cases, the equivalent advanced regular expression is also shown.

Table 11-7. Sample regular expressions.

{^[yY]} Begins with y or Y, as in a Yes answer.

{^(yes|YES|Yes)$} Exactly "yes", "Yes", or "YES".
"^\[^ \t:\]+:" Begins with colon-delimited field that has no spaces or tabs.
{^\S+:} Same as above, using \S for "not space".
"^\[ \t]*$" A string of all spaces or tabs.
{(?n)^\s*$} A blank line using newline sensitive mode.
"(\n|^)\[^\n\]*(\n|$)" A blank line, the hard way.
{^[A-Za-z]+$} Only letters.
{^[[:alpha:]]+$} Only letters, the Unicode way.
{[A-Za-z0-9_]+} Letters, digits, and the underscore.
{\w+} Letters, digits, and the underscore using \w.
{[][${}\\]} The set of Tcl special characters: ] [ $ { } \
"\[^\n\]*\n" Everything up to a newline.
{.*?\n} Everything up to a newline using nongreedy *?
{\.} A period.
{[][$^?+*()|\\]} The set of regular expression special characters: ] [ $ ^ ? + * ( ) | \
<H1>(.*?)</H1> An H1 HTML tag. The subpattern matches the string between the tags.
<!--.*?--> HTML comments.
{[0-9a-fA-F][0-9a-fA-F]} 2 hex digits.
{[[:xdigit:]]{2}} 2 hex digits, using advanced regular expressions.
{\d{1,3}} 1 to 3 digits, using advanced regular expressions.
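
A few of the patterns in Table 11-7 can be tried directly with regexp. The following is only a minimal sketch; the sample strings and variable names are made up for illustration:

```tcl
# Begins with y or Y, as in a Yes answer.
set yes [regexp {^[yY]} "Yes, please"]

# A colon-delimited field with no white space, using \S.
set hasField [regexp {^\S+:} "Subject: regular expressions"]

# Capture the text between H1 tags with a nongreedy match.
regexp {<H1>(.*?)</H1>} "<H1>My Home</H1><H1>Other</H1>" match title

# Exactly two hex digits; note that g is not a hex digit.
set hexOk  [regexp {^[[:xdigit:]]{2}$} 7e]
set hexBad [regexp {^[[:xdigit:]]{2}$} 7g]
```

Because the quantifier in the H1 pattern is nongreedy, title holds only the text of the first heading.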


The regsub Command

The regsub command does string substitution based on pattern matching. It is very useful for
processing your data. It can perform simple tasks like replacing sequences of spaces and tabs with a
single space. It can perform complex data transforms, too, as described in the next section. Its syntax
is:

regsub ?switches? pattern string subspec varname

The regsub command returns the number of matches and replacements, or 0 if there was no match.
regsub copies string to varname, replacing occurrences of pattern with the substitution specified by
subspec. If the pattern does not match, then string is copied to varname without modification. The
optional switches include:

-all, which means to replace all occurrences of the pattern. Otherwise only the first occurrence
is replaced.
The -nocase, -expanded, -line, -linestop, and -lineanchor switches are the same as in the
regexp command. They are described on page 148.
The -- switch separates the pattern from the switches, which is necessary if your pattern begins
with a -.
The replacement pattern, subspec, can contain literal characters as well as the following special
sequences:

& is replaced with the string that matched the pattern.

\x, where x is a number, is replaced with the string that matched the corresponding subpattern in
pattern. The correspondence is based on the order of left parentheses in the pattern
specification.
The following replaces a user's home directory with a ~:
regsub ^$env(HOME)/ $pathname ~/ newpath

The following constructs a C compile command line given a filename:

set file tclIO.c

regsub {([^\.]*)\.c$} $file {cc -c & -o \1.o} ccCmd

The matching pattern captures everything before the trailing .c in the file name. The & is replaced with
the complete match, tclIO.c, and \1 is replaced with tclIO, which matches the pattern between the
parentheses. The value assigned to ccCmd is:

cc -c tclIO.c -o tclIO.o

We could execute that with:

eval exec $ccCmd

The following replaces sequences of multiple space characters with a single space:

regsub -all {\s+} $string " " string

It is perfectly safe to specify the same variable as the input value and the result. Even if there is no
match on the pattern, the input string is copied into the output variable.
The regsub command can count things for us. The following command counts the newlines in some
text. In this case the substitution is not important:

set numLines [regsub -all \n $text {} ignore]
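
The return value and the special sequences in subspec can be seen together in a short sketch. The sample text and the variable names below are assumptions chosen for illustration:

```tcl
set text "one  two\t\tthree"

# Without -all only the first run of white space is replaced.
set n1 [regsub {\s+} $text " " first]

# With -all every run is replaced, and the count is returned.
set n2 [regsub -all {\s+} $text " " all]

# & is the whole match and \1 is the first subpattern.
regsub {(\w+)\.c$} tclIO.c {& -> \1.o} mapped

# No match: the input is copied unchanged and 0 is returned.
set n3 [regsub {xyz} $text {} unchanged]
```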


Transforming Data to Program with regsub

One of the most powerful combinations of Tcl commands is regsub and subst. This section describes
a few examples that use regsub to transform data into Tcl commands, and then use subst to replace
those commands with a new version of the data. This technique is very efficient because it relies on
two subsystems that are written in highly optimized C code: the regular expression engine and the Tcl
parser. These examples are primarily written by Stephen Uhler.

URL Decoding
When a URL is transmitted over the network, it is encoded by replacing special characters with a %xx
sequence, where xx is the hexadecimal code for the character. In addition, spaces are replaced with a
plus (+). It would be tedious and very inefficient to scan a URL one character at a time with Tcl
statements to undo this encoding. It would be more efficient to do this with a custom C program, but
still very tedious. Instead, a combination of regsub and subst can efficiently decode the URL in just a
few Tcl commands.
Replacing the + with spaces requires quoting the + because it is the one-or-more special character in
regular expressions:

regsub -all {\+} $url { } url

The %xx are replaced with a format command that will generate the right character:

regsub -all {%([0-9a-fA-F][0-9a-fA-F])} $url \
   {[format %c 0x\1]} url

The %c directive to format tells it to generate the character from a character code number. We force a
hexadecimal interpretation with a leading 0x. Advanced regular expressions let us write the "2 hex
digits" pattern a bit more cleanly:
regsub -all {%([[:xdigit:]]{2})} $url \
{[format %c 0x\1]} url

The resulting string is passed to subst to get the format commands substituted:

set url [subst $url]

For example, if the input is %7ewelch, the result of the regsub is:

[format %c 0x7e]welch

And then subst generates:

~welch

Example 11-5 encapsulates this trick in the Url_Decode procedure.

Example 11-5 The Url_Decode procedure.

proc Url_Decode {url} {
   regsub -all {\+} $url { } url
   regsub -all {%([[:xdigit:]]{2})} $url \
      {[format %c 0x\1]} url
   return [subst $url]
}
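
Url_Decode hands its input to subst, so it relies on square brackets, dollar signs, and backslashes arriving in their %xx form. If you cannot be sure of that, one possible variation is to quote those characters first. This is a sketch only; the name Url_DecodeQuoted is hypothetical and not part of the book's library:

```tcl
proc Url_DecodeQuoted {url} {
   # Plus signs encode spaces.
   regsub -all {\+} $url { } url
   # Quote Tcl special characters so subst leaves them alone.
   regsub -all {[][$\\]} $url {\\&} url
   # Turn each %xx into a format command, then run the commands.
   regsub -all {%([[:xdigit:]]{2})} $url {[format %c 0x\1]} url
   return [subst $url]
}

set a [Url_DecodeQuoted "%7ewelch/my+book"]
set b [Url_DecodeQuoted {price=$5&cmd=[pwd]}]
```

The second call returns its input unchanged: the unencoded [pwd] is treated as text rather than being run as a command.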

CGI Argument Parsing

Example 11-6 builds upon Url_Decode to decode the inputs to a CGI program that processes data
from an HTML form. Each form element is identified by a name, and the value is URL encoded. All
the names and encoded values are passed to the CGI program in the following format:

name1=value1&name2=value2&name3=value3

Example 11-6 shows Cgi_List and Cgi_Query. Cgi_Query receives the form data from the standard
input or the QUERY_STRING environment variable, depending on whether the form data is transmitted
with a POST or GET request. These HTTP operations are described in detail in Chapter 17. Cgi_List
uses split to get back a list of names and values, and then it decodes them with Url_Decode. It
returns a Tcl-friendly name, value list that you can either iterate through with a foreach command, or
assign to an array with array set:
Example 11-6 The Cgi_List and Cgi_Query procedures.

proc Cgi_List {} {
   set query [Cgi_Query]
   regsub -all {\+} $query { } query
   set result {}
   foreach {x} [split $query &=] {
      lappend result [Url_Decode $x]
   }
   return $result
}
proc Cgi_Query {} {
   global env
   if {![info exists env(QUERY_STRING)] ||
         [string length $env(QUERY_STRING)] == 0} {
      if {[info exists env(CONTENT_LENGTH)] &&
            [string length $env(CONTENT_LENGTH)] != 0} {
         set query [read stdin $env(CONTENT_LENGTH)]
      } else {
         gets stdin query
      }
      set env(QUERY_STRING) $query
      set env(CONTENT_LENGTH) 0
   }
   return $env(QUERY_STRING)
}
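
The split-and-decode step can be tried without a Web server. The sketch below works on a made-up query string instead of calling Cgi_Query, and it carries a compact copy of Url_Decode so that it runs on its own:

```tcl
# Compact copy of Url_Decode so that the sketch is self-contained.
proc Url_Decode {url} {
   regsub -all {\+} $url { } url
   regsub -all {%([[:xdigit:]]{2})} $url {[format %c 0x\1]} url
   return [subst $url]
}

# Hypothetical query string, as a GET request would supply it.
set query "name=Brent+Welch&home=%7ewelch"

set result {}
foreach x [split $query &=] {
   lappend result [Url_Decode $x]
}

# Either walk the name, value pairs ...
foreach {name value} $result {
   puts "$name = $value"
}
# ... or load them into an array.
array set form $result
```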

An HTML form can have several form elements with the same name, and this can result in more than
one value for each name. If you blindly use array set to map the results of Cgi_List into an array,
you will lose the repeated values. Example 11-7 shows Cgi_Parse and Cgi_Value, which store the query
data in a global cgi array. Cgi_Parse adds list structure whenever it finds a repeated form value. The
global cgilist array keeps a record of how many times a form value is repeated. The Cgi_Value
procedure returns elements of the global cgi array, or the empty string if the requested value is not
present.
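
The loss of repeated values is easy to see in isolation. In this sketch the name, value list is made up, and collecting with lappend is shown only as a contrast; it is not the scheme the cgi array uses, which adds list structure only when a name is repeated:

```tcl
# Hypothetical name, value list with a repeated form element.
set pairs {color red size large color blue}

array set form $pairs
# Only the last value for "color" survives in the array.
set last $form(color)

# Collecting with lappend keeps every value.
foreach {name value} $pairs {
   lappend all($name) $value
}
```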

Example 11-7 Cgi_Parse and Cgi_Value store query data in the cgi array.

proc Cgi_Parse {} {
   global cgi cgilist
   catch {unset cgi cgilist}
   set query [Cgi_Query]
   regsub -all {\+} $query { } query
   foreach {name value} [split $query &=] {
      set name [Url_Decode $name]
      if {[info exists cgilist($name)] &&
            ($cgilist($name) == 1)} {
         # Add second value and create list structure
         set cgi($name) [list $cgi($name) \
            [Url_Decode $value]]
      } elseif {[info exists cgi($name)]} {
         # Add additional list elements
         lappend cgi($name) [Url_Decode $value]
      } else {
         # Add first value without list structure
         set cgi($name) [Url_Decode $value]
         set cgilist($name) 0 ;# May need to listify
      }
      incr cgilist($name)
   }
   return [array names cgi]
}
proc Cgi_Value {key} {
   global cgi
   if {[info exists cgi($key)]} {
      return $cgi($key)
   } else {
      return {}
   }
}
proc Cgi_Length {key} {
   global cgilist
   if {[info exists cgilist($key)]} {
      return $cgilist($key)
   } else {
      return 0
   }
}

Decoding HTML Entities

The next example is a decoder for HTML entities. In HTML, special characters are encoded as
entities. If you want a literal < or > in your document, you encode them as the entities &lt; and &gt;,
respectively, to avoid conflict with the <tag> syntax used in HTML. HTML syntax is briefly described
in Chapter 3 on page 32. Characters with codes above 127 like copyright © and egrave è are also
encoded. There are named entities, like &lt; for < and &egrave; for è. You can also use decimal-valued
entities such as &#169; for ©. Finally, the trailing semicolon is optional, so &lt or &lt; can
both be used to encode <.
The entity decoder is similar to Url_Decode. In this case, however, we need to be more careful with
subst. The text passed to the decoder could contain special characters like a square bracket or dollar
sign. With Url_Decode we can rely on those special characters being encoded as, for example, %24.
Entity encoding is different (do not ask me why URLs and HTML have different encoding standards),
and dollar signs and square brackets are not necessarily encoded. This requires an additional pass to
quote these characters. This regsub puts a backslash in front of all the brackets, dollar signs, and
backslashes.
regsub -all {[][$\\]} $text {\\&} new

The decimal encoding (e.g., &#169;) is also more awkward than the hexadecimal encoding used in
URLs. We cannot force a decimal interpretation of a number in Tcl. In particular, if the entity has a
leading zero (e.g., &#010;) then Tcl interprets the value (e.g., 010) as octal. The scan command is
used to do a decimal interpretation. It scans into a temporary variable, and set is used to get that
value:

regsub -all {&#([0-9][0-9]?[0-9]?);?} $new \
   {[format %c [scan \1 %d tmp; set tmp]]} new

With advanced regular expressions, this could be written as follows using bound quantifiers to specify
one to three digits:

regsub -all {&#(\d{1,3});?} $new \
   {[format %c [scan \1 %d tmp;set tmp]]} new
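
The scan step can be checked on its own. This is a small sketch with made-up values; it shows that %d reads the digits as decimal even when there is a leading zero:

```tcl
# scan with %d always reads the digits as decimal,
# even when there is a leading zero.
scan 010 %d tmp
set ch [format %c $tmp]    ;# character code 10, a newline

# The same two-command idiom that appears in the substitution:
# the value of the bracketed script is the value of set tmp.
set code [scan 169 %d tmp; set tmp]
```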

The named entities are converted with an array that maps from the entity names to the special
character. The only detail is that unknown entity names (e.g., &foobar;) are not converted. This
mapping is done inside HtmlMapEntity, which guards against invalid entities.

regsub -all {&([a-zA-Z]+)(;?)} $new \
   {[HtmlMapEntity \1 \\\2 ]} new

If the input text contained:

[x &lt; y]

then the regsub would transform this into:

\[x [HtmlMapEntity lt \; ] y\]

Finally, subst will result in:

[x < y]

Example 11-8 Html_DecodeEntity.

proc Html_DecodeEntity {text} {
   if {![regexp & $text]} {return $text}
   regsub -all {[][$\\]} $text {\\&} new
   regsub -all {&#([0-9][0-9]?[0-9]?);?} $new \
      {[format %c [scan \1 %d tmp;set tmp]]} new
   regsub -all {&([a-zA-Z]+)(;?)} $new \
      {[HtmlMapEntity \1 \\\2 ]} new
   return [subst $new]
}
proc HtmlMapEntity {text {semi {}}} {
   global htmlEntityMap
   if {[info exists htmlEntityMap($text)]} {
      return $htmlEntityMap($text)
   } else {
      # Unknown entity: put back the & so the text is unchanged
      return &$text$semi
   }
}
# Some of the htmlEntityMap
array set htmlEntityMap {
lt < gt > amp &
aring \xe5 atilde \xe3
copy \xa9 ecirc \xea egrave \xe8
}
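
It can help to run the decoding passes one at a time and look at the intermediate strings. The sketch below does that on a made-up input. It uses a cut-down entity map and a one-line stand-in for HtmlMapEntity that handles only known names, so it is an illustration of the passes rather than a replacement for Example 11-8:

```tcl
# Cut-down map and lookup; the stand-in assumes the name is known.
array set htmlEntityMap {lt < gt > amp & copy \xa9}
proc HtmlMapEntity {text {semi {}}} {
   global htmlEntityMap
   return $htmlEntityMap($text)
}

set text {[x &lt; y] costs $5 &copy; &#169;}

# Pass 1: quote brackets, dollar signs, and backslashes.
regsub -all {[][$\\]} $text {\\&} new
set quoted $new

# Pass 2: decimal entities become format commands.
regsub -all {&#(\d{1,3});?} $new \
   {[format %c [scan \1 %d tmp;set tmp]]} new

# Pass 3: named entities become HtmlMapEntity calls.
regsub -all {&([a-zA-Z]+)(;?)} $new {[HtmlMapEntity \1 \\\2 ]} new

# Pass 4: subst runs the generated commands.
set decoded [subst $new]
```

After the first pass the brackets and the dollar sign carry backslashes, which is what lets subst pass them through as ordinary text in the last step.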

A Simple HTML Parser

The following example is the brainchild of Stephen Uhler. It uses regsub to transform HTML into a
Tcl script. When it is evaluated the script calls a procedure to handle each tag in an HTML document.
This provides a general framework for processing HTML. Different callback procedures can be
applied to the tags to achieve different effects. For example, the html_library-0.3 package on the
CD-ROM uses Html_Parse to display HTML in a Tk text widget.

Example 11-9 Html_Parse.

proc Html_Parse {html cmd {start {}}} {

   # Map braces and backslashes into HTML entities
   regsub -all \{ $html {\&ob;} html
   regsub -all \} $html {\&cb;} html
   regsub -all {\\} $html {\&bsl;} html

   # This pattern matches the parts of an HTML tag
   set s " \t\r\n" ;# white space
   set exp <(/?)(\[^$s>]+)\[$s]*(\[^>]*)>

   # This generates a call to cmd with HTML tag parts
   # \1 is the leading /, if any
   # \2 is the HTML tag name
   # \3 is the parameters to the tag, if any
   # The curly braces at either end group all of the text
   # after the HTML tag, which becomes the last arg to $cmd.
   set sub "\}\n$cmd {\\2} {\\1} {\\3} \{"
   regsub -all $exp $html $sub html

   # This balances the curly braces,
   # and calls $cmd with $start as a pseudo-tag
   # at the beginning and end of the script.
   eval "$cmd {$start} {} {} {$html}"
   eval "$cmd {$start} / {} {}"
}

The main regsub pattern can be written more simply with advanced regular expressions:

set exp {<(/?)(\S+?)\s*(.*?)>}

An example will help visualize the transformation. Given this HTML:

<Title>My Home Page</Title>

<Body bgcolor=white text=black>
<H1>My Home</H1>
This is my <b>home</b> page.

and a call to Html_Parse that looks like this:

Html_Parse $html {Render .text} hmstart

then the generated program is this:

Render .text {hmstart} {} {} {}
Render .text {Title} {} {} {My Home Page}
Render .text {Title} {/} {} {
}
Render .text {Body} {} {bgcolor=white text=black} {
}
Render .text {H1} {} {} {My Home}
Render .text {H1} {/} {} {
This is my }
Render .text {b} {} {} {home}
Render .text {b} {/} {} { page.
}
Render .text {hmstart} / {} {}

One overall point to make about this example is the difference between using eval and subst with the
generated script. The decoders shown in Examples 11-5 and 11-8 use subst to selectively replace
encoded characters while ignoring the rest of the text. In Html_Parse we must process all the text. The
main trick is to replace the matching text (e.g., the HTML tag) with some Tcl code that ends in an
open curly brace and starts with a close curly brace. This effectively groups all the unmatched text.
When eval is used this way you must do something with any braces and backslashes in the unmatched
text. Otherwise, the resulting script does not parse correctly. In this case, these special characters are
encoded as HTML entities. We can afford to do this because the cmd that is called must deal with
encoded entities already. It is not possible to quote these special characters with backslashes because
all this text is inside curly braces, so no backslash substitution is performed. If you try that the
backslashes will be seen by the cmd callback.
Finally, I must admit that I am always surprised that this works:

eval "$cmd {$start} {} {} {$html}"

I always forget that $start and $html are substituted in spite of the braces. This is because double
quotes are being used to group the argument, so the quoting effect of braces is turned off. Try this:

set x hmstart
set y "foo {$x}bar"
=> foo {hmstart}bar
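
To watch Html_Parse drive a callback without Tk, you can substitute a procedure that simply records its arguments. In this sketch Collect and the seen variable are hypothetical names, and a compact copy of Html_Parse is included so that the fragment runs on its own:

```tcl
# Compact copy of Html_Parse (Example 11-9).
proc Html_Parse {html cmd {start {}}} {
   regsub -all \{ $html {\&ob;} html
   regsub -all \} $html {\&cb;} html
   regsub -all {\\} $html {\&bsl;} html
   set s " \t\r\n"
   set exp <(/?)(\[^$s>]+)\[$s]*(\[^>]*)>
   set sub "\}\n$cmd {\\2} {\\1} {\\3} \{"
   regsub -all $exp $html $sub html
   eval "$cmd {$start} {} {} {$html}"
   eval "$cmd {$start} / {} {}"
}

# Hypothetical callback: record each tag and the text after it.
proc Collect {tag slash params text} {
   global seen
   lappend seen $slash$tag [string trim $text]
}

set seen {}
Html_Parse {<H1>My Home</H1>This is my <b>home</b> page.} Collect hmstart
```

Each tag produces one call, and the hmstart pseudo-tag brackets the whole document, so seen ends up as a flat list of tag and text pairs.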

Stripping HTML Comments

The Html_Parse procedure does not correctly handle HTML comments. The problem is that the
syntax for HTML comments allows tags inside comments, so there can be > characters inside the
comment. HTML comments are also used to hide Javascript inside pages, which can also contain >.
We can fix this with a pass that eliminates the comments.
The comment syntax is this:

<!-- HTML comment -->

Using nongreedy quantifiers, we can strip comments with a single regsub:

regsub -all <!--.*?--> $html {} html

Using only greedy quantifiers, it is awkward to match the closing --> without getting stuck on
embedded > characters, or without matching too much and going all the way to the end of the last
comment. Time for another trick:
regsub -all -- --> $html \x81 html

This replaces all the end comment sequences with a single character that is not allowed in HTML.
Now you can delete the comments like this:

regsub -all "<!--\[^\x81\]*\x81" $html {} html
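
Both approaches can be compared on a small sample. The HTML string below is made up, and the pattern that begins with a dash is preceded by -- so that regsub does not mistake it for a switch:

```tcl
set html "<p>one</p><!-- a > b --><p>two</p><!-- <b>x</b> -->"

# Nongreedy: stop at the first --> after each <!--.
regsub -all {<!--.*?-->} $html {} stripped1

# Greedy only: mark each --> with \x81, then delete up to the marker.
regsub -all -- {-->} $html \x81 marked
regsub -all "<!--\[^\x81\]*\x81" $marked {} stripped2
```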


Other Commands That Use Regular Expressions

Several Tcl commands use regular expressions.

lsearch takes a -regexp flag so that you can search for list items that match a regular
expression. The lsearch command is described on page 64.
switch takes a -regexp flag, so you can branch based on a regular expression match instead of
an exact match or a string match style match. The switch command is described on page 71.
The Tk text widget can search its contents based on a regular expression match. Searching in the
text widget is described on page 463.
The Expect Tcl extension can match the output of a program with regular expressions. Expect is
the subject of its own book, Exploring Expect (O'Reilly, 1995) by Don Libes.
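
As a brief illustration of the first two items, here is a sketch with a made-up file list. The Classify procedure is a hypothetical helper:

```tcl
set files {main.tcl README tclIO.c tclIO.h}

# Index of the first list element that matches the pattern.
set idx [lsearch -regexp $files {\.[ch]$}]

# Branch on a regular expression instead of an exact match.
proc Classify {name} {
   switch -regexp -- $name {
      {\.tcl$}  { return script }
      {\.[ch]$} { return source }
      default   { return other }
   }
}
```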

Part II. Advanced Tcl

Chapter 12. Script Libraries and Packages

Collections of Tcl commands are kept in libraries and organized into packages. Tcl automatically
loads libraries as an application uses their commands. Tcl commands discussed are: package,
pkg_mkIndex, auto_mkindex, unknown, and tcl_findLibrary.

Libraries group useful sets of Tcl procedures so that they can be used by multiple applications. For
example, you could use any of the code examples that come with this book by creating a script library
and then directing your application to check in that library for missing procedures. One way to
structure a large application is to have a short main script and a library of support scripts. The
advantage of this approach is that not all the Tcl code needs to be loaded to start the application.
Applications start up quickly, and as new features are accessed, the code that implements them is
loaded automatically.
The Tcl package facility supports version numbers and has a provide/require model of use. Typically,
each file in a library provides one package with a particular version number. Packages also work with
shared object libraries that implement Tcl commands in compiled code, which are described in
Chapter 44. A package can be provided by a combination of script files and object files. Applications
specify which packages they require and the libraries are loaded automatically. The package facility is
an alternative to the auto loading scheme used in earlier versions of Tcl. You can use either
mechanism, and this chapter describes them both.
If you create a package you may wish to use the namespace facility to avoid conflicts between
procedures and global variables used in different packages. Namespaces are the topic of Chapter 14.
Before Tcl 8.0 you had to use your own conventions to avoid conflicts. This chapter explains a simple
coding convention for large Tcl programs. I use this convention in exmh, a mail user interface that has
grown from about 2,000 to over 35,000 lines of Tcl code. A majority of the code has been contributed
by the exmh user community. Such growth might not have been possible without coding conventions.


Locating Packages: The auto_path Variable

The package facility assumes that Tcl libraries are kept in well-known directories. The list of well-
known directories is kept in the auto_path Tcl variable. This is initialized by tclsh and wish to include
the Tcl script library directory, the Tk script library directory (for wish), and the parent directory of the
Tcl script library directory. For example, on my Macintosh auto_path is a list of these three
directories:

Disk:System Folder:Extensions:Tool Command Language:tcl8.2

Disk:System Folder:Extensions:Tool Command Language
Disk:System Folder:Extensions:Tool Command Language:tk8.2

On my Windows 95 machine the auto_path lists these directories:

c:\Program Files\Tcl\lib\Tcl8.2
c:\Program Files\Tcl\lib
c:\Program Files\Tcl\lib\Tk8.2

On my UNIX workstation the auto_path lists these directories:

/usr/local/tcl/lib/tcl8.2
/usr/local/tcl/lib
/usr/local/tcl/lib/tk8.2

The package facility searches these directories and their subdirectories for packages. The easiest way
to manage your own packages is to create a directory at the same level as the Tcl library:

/usr/local/tcl/lib/welchbook
Packages in this location, for example, will be found automatically because the auto_path list
includes /usr/local/tcl/lib. You can also add directories to the auto_path explicitly:

lappend auto_path directory

One trick I often use is to put the directory containing the main script into the auto_path. The
following command sets this up:

lappend auto_path [file dirname [info script]]

If your code is split into bin and lib directories, then scripts in the bin directory can add the adjacent
lib directory to their auto_path with this command:

lappend auto_path \
[file join [file dirname [info script]] ../lib]
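
If a script may be sourced more than once, you might prefer to add the directory only when it is missing. This is a small sketch under the same bin and lib layout assumption; the libdir variable is just a convenient name:

```tcl
# Assumed layout: this script lives in bin, libraries in ../lib.
set libdir [file join [file dirname [info script]] .. lib]

# Append the directory only if it is not already listed.
if {[lsearch -exact $auto_path $libdir] < 0} {
   lappend auto_path $libdir
}
```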


Using Packages
Each script file in a library declares what package it implements with the package provide
command:

package provide name version

The name identifies the package, and the version has a major.minor format. The convention is that
the minor version number can change and the package implementation will still be compatible. If the
package changes in an incompatible way, then the major version number should change. For example,
Chapter 17 defines several procedures that use the HTTP network protocol. These include Http_Open,
Http_Get, and Http_Validate. The file that contains the procedures starts with this command:

package provide Http 1.0

Case is significant in package names. In particular, the package that comes with Tcl is named http, all
lowercase.
More than one file can contribute to the same package simply by specifying the same name and
version. In addition, different versions of the same package can be kept in the same directory but in
different files.
An application specifies the packages it needs with the package require command:

package require ?-exact? name ?version?

If the version is left off, then the highest available version is loaded. Otherwise the highest version
with the same major number is loaded. For example, if the client requires version 1.1, version 1.2
could be loaded if it exists, but versions 1.0 and 2.0 would not be loaded. You can restrict the package
to a specific version with the -exact flag. If no matching version can be found, then the package
require command raises an error.
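
The version rules can be tried in a single interpreter by providing a package and then requiring it in different ways. Demo is a made-up package name. Note that the -exact flag is written before the package name:

```tcl
# Hypothetical package name; provide it in this interpreter.
package provide Demo 1.2

# Version 1.2 satisfies a request for 1.0 (same major number).
set v1 [package require Demo 1.0]

# -exact insists on one particular version.
set v2 [package require -exact Demo 1.2]

# A different major number is an error, which catch can trap.
set failed [catch {package require Demo 2.0} msg]
```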
Loading Packages Automatically
The package require command depends on an index to record which files implement which
packages. The index must be maintained by you, your project librarian, or your system administrator
when packages change. The index is computed by the pkg_mkIndex command that puts the results into
the pkgIndex.tcl file in each library directory. The pkg_mkIndex command takes the name of a
directory and one or more glob patterns that specify files within that directory. File name patterns are
described on page 115. The syntax is:

pkg_mkIndex ?options? directory pattern ?pattern ...?

For example:

pkg_mkIndex /usr/local/lib/welchbook *.tcl

pkg_mkIndex -direct /usr/local/lib/Sybtcl *.so

The pkg_mkIndex command sources or loads all the files matched by the pattern, detects what
packages they provide, and computes the index. You should be aware of this behavior because it
works well only for libraries. If the pkg_mkIndex command hangs or starts random applications, it is
because it sourced an application file instead of a library file.
By default, the index created by pkg_mkIndex contains commands that set up the auto_index array
used to automatically load commands when they are first used. This means that code does not get
loaded when your script does a package require. If you want the package to be loaded right away,
specify the -direct flag to pkg_mkIndex so that it creates an index file with source and load
commands. The pkg_mkIndex options are summarized in Table 12-1.

Table 12-1. Options to the pkg_mkIndex command.

-direct          Generates an index with source and load commands in it. This results in
                 packages being loaded directly as a result of package require.
-load pattern    Dynamically loads packages that match pattern into the slave interpreter
                 used to compute the index. A common reason to need this is with the tbcload
                 package needed to load .tbc files compiled with TclPro Compiler.
-verbose         Displays the name of each file processed and any errors that occur.
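
The whole cycle of writing a library file, indexing it, and requiring the package can be sketched in a few lines. The directory name, package name, and procedure below are all made up for illustration, and the script creates and then deletes a scratch directory under the current directory:

```tcl
# Scratch directory and package name are assumptions.
set dir [file join [pwd] demo_pkg_lib]
file mkdir $dir

# Write a small library file that provides the package.
set out [open [file join $dir hello.tcl] w]
puts $out {package provide Hello 1.0}
puts $out {proc Hello_Say {} { return "hello, world" }}
close $out

# Build pkgIndex.tcl; -direct makes package require source the file.
pkg_mkIndex -direct $dir *.tcl

# Make the directory visible and load the package.
lappend auto_path $dir
set version [package require Hello]
set greeting [Hello_Say]

file delete -force $dir
```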

Packages Implemented in C Code

The files in a library can be either script files that define Tcl procedures or binary files in shared
library format that define Tcl commands in compiled code (i.e., a Dynamic Link Library (DLL)).
Chapter 44 describes how to implement Tcl commands in C. There is a C API to the package facility
that you use to declare the package name for your commands. This is shown in Example 44-1 on page
610. Chapter 37 also describes the Tcl load command that is used instead of source to link in shared
libraries. The pkg_mkIndex command also handles shared libraries:

pkg_mkIndex directory *.tcl *.so *.shlib *.dll

In this example, .so, .shlib, and .dll are file suffixes for shared libraries on UNIX, Macintosh, and
Windows systems, respectively. You can have packages that have some of their commands
implemented in C, and some implemented as Tcl procedures. The script files and the shared library
must simply declare that they implement the same package. The pkg_mkIndex procedure will detect
this and set up the auto_index, so some commands are defined by sourcing scripts, and some are
defined by loading shared libraries.
If your file servers support more than one machine architecture, such as Solaris and Linux systems,
you probably keep the shared library files in machine-specific directories. In this case the auto_path
should also list the machine-specific directory so that the shared libraries there can be loaded
automatically. If your system administrator configured the Tcl installation properly, this should already
be set up. If not, or you have your shared libraries in a nonstandard place, you must append the
location to the auto_path variable.


Summary of Package Loading

The basic structure of package loading works like this:

An application does a package require command. If the package is already loaded, the
command just returns the version number of the already loaded package. If it is not loaded, the
following steps occur.
The package facility checks to see if it knows about the package. If it does, then it runs the Tcl
scripts registered with the package ifneeded command. These commands either load the
package or set it up to be loaded automatically when its commands are first used.
If the package is unknown, the tclPkgUnknown procedure is called to find it. Actually, you can
specify what procedure to call to do the lookup with the package unknown command, but the
standard one is tclPkgUnknown.
The tclPkgUnknown procedure looks through the auto_path directories and their subdirectories
for pkgIndex.tcl files. It sources those to build an internal database of packages and version
information. The pkgIndex.tcl files contain calls to package ifneeded that specify what to do
to define the package. The standard action is to call the tclPkgSetup procedure that sets up the
auto_index so that the commands in the package will be automatically loaded. If you use
-direct with pkg_mkIndex, the script contains source and load commands instead.

The tclPkgSetup procedure defines the auto_index array to contain the correct source or load
commands to define each command in the package. Automatic loading and the auto_index array
are described in more detail later.
As you can see, there are several levels of processing involved in finding packages. The system is
flexible enough that you can change the way packages are located and how packages are loaded. The
default scenario is complicated because it uses the delayed loading of source code that is described in
the next section. Using the -direct flag to pkg_mkIndex simplifies the situation somewhat. In any
case it all boils down to three key steps:

Use pkg_mkIndex to maintain your index files. Decide at this time whether or not to use direct
package loading.
Put the appropriate package require and package provide commands in your code.
Ensure that your library directories, or their parent directories, are listed in the auto_path
variable.
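
The role of package ifneeded in the steps above can be seen by making the registration by hand, which is what a pkgIndex.tcl file normally does for you. Lazy and Lazy_Hello are made-up names, and the registered script defines the package inline instead of sourcing a file:

```tcl
# Register what to do when the hypothetical package Lazy is needed.
package ifneeded Lazy 1.0 {
   package provide Lazy 1.0
   proc Lazy_Hello {} { return hi }
}

# Nothing has been defined yet; only the script is registered.
set before [llength [info commands Lazy_Hello]]

# package require runs the registered script.
set version [package require Lazy]
set after [llength [info commands Lazy_Hello]]
```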


The package Command

The package command has several operations that are used primarily by the pkg_mkIndex procedure
and the automatic loading facility. These operations are summarized in Table 12-2.

Table 12-2. The package command.

package forget package          Deletes registration information for package.
package ifneeded package        Queries or sets the command used to set up automatic loading
   ?command?                    of a package.
package names                   Returns the set of registered packages.
package provide package         Declares that a script file defines commands for package with
   version                      the given version.
package require ?-exact?        Declares that a script uses package. The -exact flag specifies
   package ?version?            that the exact version must be loaded. Otherwise, the highest
                                matching version is loaded.
package unknown ?command?       Queries or sets the command used to locate packages.
package vcompare v1 v2          Compares version v1 and v2. Returns 0 if they are equal, minus 1
                                if v1 is less than v2, or 1 if v1 is greater than v2.
package versions package        Returns which versions of the package are registered.
package vsatisfies v1 v2        Returns 1 if v1 is greater or equal to v2 and still has the same
                                major version number. Otherwise returns 0.
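
The version comparison operations are easy to try interactively. A short sketch with made-up version numbers:

```tcl
# Versions compare field by field, so 1.10 is newer than 1.2.
set cmp [package vcompare 1.2 1.10]

# Same major number and at least as new.
set ok [package vsatisfies 1.2 1.0]

# A different major number never satisfies.
set bad [package vsatisfies 2.0 1.0]

# With no version argument, package provide reports the version
# that is currently provided; the Tcl core is itself a package.
set tclVersion [package provide Tcl]
```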

Libraries Based on the tclIndex File

You can create libraries without using the package command. The basic idea is that a directory has a
library of script files, and an index of the Tcl commands defined in the library is kept in a tclIndex
file. The drawback is that versions are not supported and you may need to adjust the auto_path to list
your library directory. The main advantage of this approach is that this mechanism has been part of Tcl
since the earliest versions. If you currently maintain a library using tclIndex files, it will still work.
You must generate the index that records what procedures are defined in the library. The
auto_mkindex procedure creates the index, which is stored in a file named tclIndex that is kept in the
script library directory. (Watch out for the difference in capitalization between auto_mkindex and
pkg_mkIndex!) Suppose all the examples from this book are in the directory
/usr/local/tcl/welchbook. You can make the examples into a script library by creating the
tclIndex file:

auto_mkindex /usr/local/tcl/welchbook *.tcl

You will need to update the tclIndex file if you add procedures or change any of their names. A
conservative approach to this is shown in the next example. It is conservative because it re-creates the
index if anything in the library has changed since the tclIndex file was last generated, whether or not
the change added or removed a Tcl procedure.

Example 12-1 Maintaining a tclIndex file.

proc Library_UpdateIndex { libdir } {
   set index [file join $libdir tclIndex]
   if {![file exists $index]} {
      set doit 1
   } else {
      set age [file mtime $index]
      set doit 0
      # Changes to directory may mean files were deleted
      if {[file mtime $libdir] > $age} {
         set doit 1
      } else {
         # Check each file for modification
         foreach file [glob [file join $libdir *.tcl]] {
            if {[file mtime $file] > $age} {
               set doit 1
               break
            }
         }
      }
   }
   if { $doit } {
      auto_mkindex $libdir *.tcl
   }
}

Tcl uses the auto_path variable to record a list of directories to search for unknown commands. To
continue our example, you can make the procedures in the book examples available by putting this
command at the beginning of your scripts:

lappend auto_path /usr/local/tcl/welchbook

This has no effect if you have not created the tclIndex file. If you want to be extra careful, you can
call Library_UpdateIndex. This will update the index if you add new things to the library.

lappend auto_path /usr/local/tcl/welchbook

Library_UpdateIndex /usr/local/tcl/welchbook

This will not work if there is no tclIndex file at all because Tcl won't be able to find the
implementation of Library_UpdateIndex. Once the tclIndex file has been created for the first time,
this ensures that any new procedures added to the library are recorded in tclIndex. In
practice, if you want this sort of automatic update, it is wise to include something like the
Library_UpdateIndex procedure directly in your application as opposed to loading it from the
library it is supposed to be maintaining.
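
One way to follow that advice is sketched below. It is only an illustration: the procedure name App_UpdateIndex, the scratch directory (TMPDIR, or /tmp if that is not set), and the Demo_Hello procedure are all made up, and the update check is a compact variant that only compares file modification times.

```tcl
# Compact variant of Library_UpdateIndex, defined in the application
# itself rather than loaded from the library it maintains.
proc App_UpdateIndex {libdir} {
    set index [file join $libdir tclIndex]
    set doit [expr {![file exists $index]}]
    if {!$doit} {
        set age [file mtime $index]
        foreach f [glob -nocomplain [file join $libdir *.tcl]] {
            if {[file mtime $f] > $age} {
                set doit 1
                break
            }
        }
    }
    if {$doit} {
        auto_mkindex $libdir *.tcl
    }
    return $doit
}

# Build a scratch library (hypothetical location and contents).
if {[info exists env(TMPDIR)]} {
    set tmp $env(TMPDIR)
} else {
    set tmp /tmp
}
set libdir [file join $tmp welchdemo[pid]]
file mkdir $libdir
set out [open [file join $libdir hello.tcl] w]
puts $out {proc Demo_Hello {} { return "hello from the library" }}
close $out

set rebuilt [App_UpdateIndex $libdir]   ;# creates tclIndex the first time
lappend auto_path $libdir
set greeting [Demo_Hello]               ;# loaded on demand through unknown
puts "rebuilt=$rebuilt greeting=$greeting"
file delete -force $libdir
```

Because App_UpdateIndex is defined before auto_path is touched, it works even when no tclIndex file exists yet.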

The unknown Command

Automatic loading of Tcl commands is implemented by the unknown command. Whenever the Tcl
interpreter encounters a command that it does not know about, it calls the unknown command with the
name of the missing command. The unknown command is implemented in Tcl, so you are free to
provide your own mechanism to handle unknown commands. This chapter describes the behavior of
the default implementation of unknown, which can be found in the init.tcl file in the Tcl library. The
location of the library is returned by the info library command.
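
As a minimal sketch of providing your own mechanism, the following wraps the default implementation instead of replacing it. The names Default_Unknown and unknownLog are invented for this illustration; the wrapper records the missing command name and then defers to the original procedure so that auto loading keeps working.

```tcl
# Keep the default implementation under another (hypothetical) name.
rename unknown Default_Unknown

proc unknown {args} {
    global unknownLog
    # Record the name of the missing command, then defer to the
    # default handler so auto loading still works.
    lappend unknownLog [lindex $args 0]
    return [uplevel 1 [linsert $args 0 Default_Unknown]]
}

catch {NoSuchCommand 1 2 3} msg
puts "logged: $unknownLog"
puts "error:  $msg"
```

The uplevel call matters: the default unknown procedure expects to run one level below the code that issued the missing command.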

How Auto Loading Works

The unknown command uses an array named auto_index. One element of the array is defined for each
procedure that can be automatically loaded. The auto_index array is initialized by the package
mechanism or by tclIndex files. The value of an auto_index element is a command that defines the
procedure. Typical commands are:

source [file join $dir bind_ui.tcl]

load [file join $dir mime.so] Mime

The $dir gets substituted with the name of the directory that contains the library file, so the result is a
source or load command that defines the missing Tcl command. The substitution is done with eval,
so you could initialize auto_index with any commands at all. Example 12-2 is a simplified version of
the code that reads the tclIndex file.

Example 12-2 Loading a tclIndex file.

# This is a simplified part of the auto_load_index procedure.

# Go through auto_path from back to front.
set i [expr {[llength $auto_path] - 1}]
for {} {$i >= 0} {incr i -1} {
    set dir [lindex $auto_path $i]
    if {[catch {open [file join $dir tclIndex]} f]} {
        # No index
        continue
    }
    # eval the file as a script. Because eval is
    # used instead of source, an extra round of
    # substitutions is performed and $dir gets expanded
    # The real code checks for errors here.
    eval [read $f]
    close $f
}
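
To see the effect of that substitution in isolation, the sketch below evaluates one index line by hand. The line follows the style that auto_mkindex writes, but the procedure name Pref_Add and the file name prefs.tcl are only assumed for the example, and the array is kept local to a procedure so the real auto_index is not disturbed.

```tcl
proc Index_Demo {dir} {
    # One line in the style of a tclIndex file (hypothetical names).
    set line {set auto_index(Pref_Add) [list source [file join $dir prefs.tcl]]}
    # eval performs the substitutions, so $dir is expanded now.
    eval $line
    return $auto_index(Pref_Add)
}
puts [Index_Demo /usr/local/tcl/welchbook]
# => source /usr/local/tcl/welchbook/prefs.tcl
```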

Disabling the Library Facility: auto_noload

If you do not want the unknown procedure to try to load procedures, you can set the auto_noload
variable to disable the mechanism:

set auto_noload anything

Auto loading is quite fast. I use it regularly on applications both large and small. A large application
will start faster if you only need to load the code necessary to start it up. As you access more features
of your application, the code will load automatically. Even a small application benefits from auto
loading because it encourages you to keep commonly used code in procedure libraries.

Interactive Conveniences
The unknown command provides a few other conveniences. These are used only when you are typing
commands directly. They are disabled once execution enters a procedure or if the Tcl shell is not being
used interactively. The convenience features are automatic execution of programs, command history,
and command abbreviation. These options are tried, in order, if a command implementation cannot be
loaded from a script library.

Auto Execute
The unknown procedure implements a second feature: automatic execution of external programs. This
makes a Tcl shell behave more like other UNIX shells that are used to execute programs. The search
for external programs is done using the standard PATH environment variable that is used by other shells
to find programs. If you want to disable the feature altogether, set the auto_noexec variable:

set auto_noexec anything

History
The history facility described in Chapter 13 is implemented by the unknown procedure.

Abbreviations
If you type a unique prefix of a command, unknown recognizes it and executes the matching command
for you. This is done after automatic program execution is attempted and history substitutions are
performed.

Tcl Shell Library Environment

Tcl searches for its script library directory when it starts up. In early versions of Tcl you had to
compile in the correct location, set a Windows registry value, or set the TCL_LIBRARY environment
variable to the correct location. Recent versions of Tcl use a standard searching scheme to locate the
script library. The search understands the standard installation and build environments for Tcl, and it
should eliminate the need to use the TCL_LIBRARY environment variable. On Windows the search
for the library used to depend on registry values, but this has also been discontinued in favor of a
standard search. In summary, "it should just work." However, this section explains how Tcl finds its
script library so that you can troubleshoot problems.

Locating the Tcl Script Library

The default library location is defined when you configure the source distribution, which is explained
on page 644. At this time an initial value for the auto_path variable is defined. (This default value
appears in tcl_pkgPath, but changing this variable has no effect once Tcl has started. I just pretend
tcl_pkgPath does not exist.) These values are just hints; Tcl may use other directories depending on
what it finds in the file system.
When Tcl starts up, it searches for a directory that contains its init.tcl startup script. You can short-
circuit the search by defining the TCL_LIBRARY environment variable. If this is defined, Tcl uses it only
for its script library directory. However, you should not need to define this with normal installations of
Tcl 8.0.5 or later. In my environment I'm often using several different versions of Tcl for various
applications and testing purposes, so setting TCL_LIBRARY is never correct for all possibilities. If I find
myself setting this environment variable, I know something is wrong with my Tcl installations!
The standard search starts with the default value that is compiled into Tcl (e.g.,
/usr/local/lib/tcl8.1). After that, the following directories are examined for an init.tcl file.
These example values assume Tcl version 8.1 and patch level 8.1.1:

../lib/tcl8.1
../../lib/tcl8.1
../library
../../tcl8.1.1/library
../../../tcl8.1.1/library
The first two directories correspond to the standard installation directories, while the last three
correspond to the standard build environment for Tcl or Tk. The first directory in the list that contains
a valid init.tcl file becomes the Tcl script library. This directory location is saved in the
tcl_library global variable, and it is also returned by the info library command.

The primary thing defined by init.tcl is the implementation of the unknown procedure. It also
initializes auto_path to contain $tcl_library and the parent directory of $tcl_library. There may
be additional directories added to auto_path depending on the compiled-in value of tcl_pkgPath.
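
When troubleshooting, it can help to print what the search actually found. This short sketch uses only the commands and variables named above:

```tcl
puts "Script library: [info library]"
puts "tcl_library:    $tcl_library"
puts "Has init.tcl:   [file exists [file join [info library] init.tcl]]"
puts "auto_path:"
foreach dir $auto_path {
    puts "    $dir"
}
```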

tcl_findLibrary

A generalization of this search is implemented by tcl_findLibrary. This procedure is designed for
use by extensions like Tk and [incr Tcl]. Of course, Tcl cannot use tcl_findLibrary itself because it
is defined in init.tcl!
The tcl_findLibrary procedure searches relative to the location of the main program (e.g., tclsh or
wish) and assumes a standard installation or a standard build environment. It also supports an override
by an environment variable, and it takes care of sourcing an initialization script. The usage of
tcl_findLibrary is:

tcl_findLibrary base version patch script enVar varName

The base is the prefix of the script library directory name. The version is the main version number
(e.g., "8.0"). The patch is the full patch level (e.g., "8.0.3"). The script is the initialization script to
source from the directory. The enVar names an environment variable that can be used to override the
default search path. The varName is the name of a variable to set to the name of the directory found by
tcl_findLibrary. A side effect of tcl_findLibrary is to source the script from the directory. An
example call is:

tcl_findLibrary tk 8.0 8.0.3 tk.tcl TK_LIBRARY tk_library

This call first checks to see whether TK_LIBRARY is defined in the environment. If so, it uses its value.
Otherwise, it searches the following directories for a file named tk.tcl. It sources the script and sets
the tk_library variable to the directory containing that file. The search is relative to the value
returned by info nameofexecutable:

../lib/tk8.0
../../lib/tk8.0
../library
../../tk8.0.3/library
../../../tk8.0.3/library

Tk also adds $tk_library to the end of auto_path, so the other script files in that directory are
available to the application:

lappend auto_path $tk_library
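
The same call can be tried with a made-up extension. In the sketch below, the extension name demo, the DEMO_LIBRARY environment variable, and the scratch directory are all hypothetical; the environment variable override is used so that the search succeeds without a real installation.

```tcl
# Create a scratch library directory with an initialization script.
if {[info exists env(TMPDIR)]} {
    set tmp $env(TMPDIR)
} else {
    set tmp /tmp
}
set libdir [file join $tmp demolib[pid]]
file mkdir $libdir
set out [open [file join $libdir demo.tcl] w]
puts $out {set demo_initialized 1}
close $out

# Point the override variable at the directory, then search.
# tcl_findLibrary sources demo.tcl and sets demo_library.
set env(DEMO_LIBRARY) $libdir
tcl_findLibrary demo 1.0 1.0.0 demo.tcl DEMO_LIBRARY demo_library

puts "demo_library = $demo_library"
puts "initialized  = $demo_initialized"
file delete -force $libdir
```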

Coding Style
If you supply a package, you need to follow some simple coding conventions to make your library
easier to use by other programmers. You can use the namespace facility introduced in Tcl 8.0. You can
also use conventions to avoid name conflicts with other library packages and the main application.
This section describes the conventions I developed before namespaces were added to Tcl.

A Module Prefix for Procedure Names

The first convention is to choose an identifying prefix for the procedures in your package. For
example, the preferences package in Chapter 42 uses Pref as its prefix. All the procedures provided
by the library begin with Pref. This convention is extended to distinguish between private and
exported procedures. An exported procedure has an underscore after its prefix, and it is acceptable to
call this procedure from the main application or other library packages. Examples include Pref_Add,
Pref_Init, and Pref_Dialog. A private procedure is meant for use only by the other procedures in
the same package. Its name does not have the underscore. Examples include PrefDialogItem and
PrefXres.

This naming convention precludes casual names like doit, setup, layout, and so on. Without using
namespaces, there is no way to hide procedure names, so you must maintain the naming convention
for all procedures in a package.

A Global Array for State Variables

You should use the same prefix on the global variables used by your package. You can alter the
capitalization; just keep the same prefix. I capitalize procedure names and use lowercase letters for
variables. By sticking with the same prefix you identify what variables belong to the package and you
avoid conflict with other packages.

Collect state in a global array.

In general, I try to use a single global array for a package. The array provides a convenient place to
collect a set of related variables, much as a struct is used in C. For example, the preferences package
uses the pref array to hold all its state information. It is also a good idea to keep the use of the array
private. It is better coding practice to provide exported procedures than to let other modules access
your data structures directly. This makes it easier to change the implementation of your package
without affecting its clients.
If you do need to export a few key variables from your module, use the underscore convention to
distinguish exported variables. If you need more than one global variable, just stick with the prefix
convention to avoid conflicts.
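
The two conventions can be sketched together with a small, hypothetical counter module. Procedures named Counter_* are exported, CounterCheck (no underscore) is private, and all state lives in the global array counter:

```tcl
proc Counter_Init {{start 0}} {
    global counter
    set counter(value) $start
    set counter(step) 1
}
proc Counter_Next {} {
    global counter
    CounterCheck
    incr counter(value) $counter(step)
    return $counter(value)
}
# Private helper: not meant to be called from outside the module.
proc CounterCheck {} {
    global counter
    if {![info exists counter(value)]} {
        Counter_Init
    }
}

Counter_Init 10
puts [Counter_Next]    ;# => 11
puts [Counter_Next]    ;# => 12
```

Clients call Counter_Next and never touch the counter array, so the module is free to change how it stores its state.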

The Official Tcl Style Guide

John Ousterhout has published two programming style guides, one for C programming known as "The
Engineering Manual" and one for Tcl scripts known as "The Style Guide". These describe details
about file structure as well as naming conventions for modules, procedures, and variables. The Tcl
Style Guide conventions use Tcl namespaces to separate packages. Namespaces automatically provide
a way to avoid conflict between procedure names. Namespaces also support collections of variables
without having to use arrays for grouping.
You can find these style guides on the CD-ROM and also in ftp://ftp.scriptics.com/pub/tcl/doc. The
Engineering Manual is distributed as a compressed tar file, engManual.tar.Z, that contains sample
files as well as the main document. The Style Guide is distributed as styleGuide.ps (or .pdf).
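
For comparison, here is a rough namespace-based sketch of a small counter module. The namespace name and its procedures are invented for illustration, and this is not taken from the Style Guide itself:

```tcl
namespace eval counter {
    # Namespace variables take the place of a global state array.
    variable value 0
    variable step 1
    namespace export init bump
}
proc counter::init {{start 0}} {
    variable value
    set value $start
}
proc counter::bump {} {
    variable value
    variable step
    incr value $step
    return $value
}

counter::init 10
puts [counter::bump]    ;# => 11
```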

Part II. Advanced Tcl

Chapter 13. Reflection and Debugging

This chapter describes commands that give you a view into the interpreter. The history command and
a simple debugger are useful during development and debugging. The info command provides a
variety of information about the internal state of the Tcl interpreter. The time command measures the
time it takes to execute a command. Tcl commands discussed are: clock, info, history, and time.
Reflection provides feedback to a script about the internal state of the interpreter. This is useful in a
variety of cases, from testing to see whether a variable exists to dumping the state of the interpreter.
The info command provides lots of different information about the interpreter.
The clock command is useful for formatting dates as well as parsing date and time values. It also
provides high-resolution timer information for precise measurements.
Interactive command history is the third topic of the chapter. The history facility can save you some
typing if you spend a lot of time entering commands interactively.
Debugging is the last topic. The old-fashioned approach of adding puts commands to your code is
often quite useful. For tough problems, however, a real debugger is invaluable. The TclPro tools from
Scriptics include a high quality debugger and static code checker. The tkinspect program is an
inspector that lets you look into the state of a Tk application. It can hook up to any Tk application
dynamically, so it proves quite useful.

The clock Command

The clock command has facilities for getting the current time, formatting time values, and scanning
printed time strings to get an integer time value. The clock command was added in Tcl 7.5. Table 13-1
summarizes the clock command:

Table 13-1. The clock command.

clock clicks                                      A system-dependent high resolution counter.
clock format value ?-format str?                  Formats a clock value according to str.
clock scan string ?-base clock? ?-gmt boolean?    Parses date string and returns seconds value. The clock value determines the date.
clock seconds                                     Returns the current time in seconds.

The following command prints the current time:

clock format [clock seconds]

=> Sun Nov 24 14:57:04 1996

The clock seconds command returns the current time, in seconds since a starting epoch. The clock
format command formats an integer value into a date string. It takes an optional argument that
controls the format. The format string contains % keywords that are replaced with the year, month,
day, date, hours, minutes, and seconds, in various formats. The default string is:

%a %b %d %H:%M:%S %Z %Y

Tables 13-2 and 13-3 summarize the clock formatting strings:

Table 13-2. Clock formatting keywords.

%% Inserts a %.
%a Abbreviated weekday name (Mon, Tue, etc.).
%A Full weekday name (Monday, Tuesday, etc.).
%b Abbreviated month name (Jan, Feb, etc.).
%B Full month name.
%c Locale specific date and time (e.g., Nov 24 16:00:59 1996).
%d Day of month (01-31).
%H Hour in 24-hour format (00-23).
%I Hour in 12-hour format (01-12).
%j Day of year (001-366).
%m Month number (01-12).
%M Minute (00-59).
%p AM/PM indicator.
%S Seconds (00-59).
%U Week of year (00-52) when Sunday starts the week.
%w Weekday number (Sunday = 0).
%W Week of year (01-52) when Monday starts the week.
%x Locale specific date format (e.g., Feb 19 1997).
%X Locale specific time format (e.g., 20:10:13).
%y Year without century (00-99).
%Y Year with century (e.g. 1997).
%Z Time zone name.

Table 13-3. UNIX-specific clock formatting keywords.

%D Date as %m/%d/%y (e.g., 02/19/97).
%e Day of month (1-31), no leading zeros.
%h Abbreviated month name.
%n Inserts a newline.
%r Time as %I:%M:%S %p (e.g., 02:39:29 PM).
%R Time as %H:%M (e.g., 14:39).
%t Inserts a tab.
%T Time as %H:%M:%S (e.g., 14:34:29).
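
A few of these keywords are combined below. The examples use fixed time values and the -gmt option, described later in this section, so that the output does not depend on the current time or the local time zone:

```tcl
# Time value 0 is the start of the epoch on most systems.
puts [clock format 0 -format "%Y-%m-%d %H:%M:%S" -gmt true]
# => 1970-01-01 00:00:00
puts [clock format 0 -format "%a %b %d, day %j of %Y" -gmt true]
# => Thu Jan 01, day 001 of 1970
puts [clock format 86400 -format "%A %B %d at %I:%M %p" -gmt true]
# => Friday January 02 at 12:00 AM
```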

The clock clicks command returns the value of the system's highest resolution clock. The units of
the clicks are not defined. The main use of this command is to measure the relative time of different
performance tuning trials. The following command counts the clicks per second over 10 seconds,
which will vary from system to system:

Example 13-1 Calculating clicks per second.

set t1 [clock clicks]
after 10000 ;# See page 218
set t2 [clock clicks]
puts "[expr {($t2 - $t1)/10}] Clicks/second"
=> 1001313 Clicks/second

The clock scan command parses a date string and returns a seconds value. The command handles a
variety of date formats. If you leave off the year, the current year is assumed.

Year 2000 Compliance

Tcl implements the standard interpretation of two-digit year values, which is that 70-99 are 1970-1999,
00-69 are 2000-2069. Versions of Tcl before 8.0 did not properly deal with two-digit years in all cases.
Note, however, that Tcl is limited by your system's time epoch and the number of bits in an integer. On
Windows, Macintosh, and most UNIX systems, the clock epoch is January 1, 1970. A 32-bit integer
can count enough seconds to reach forward into January 2038, and backward to December 1901. If you
try to clock scan a date outside that range, Tcl will raise an error because the seconds counter will
overflow or underflow. In this case, Tcl is just reflecting limitations of the underlying system.
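
The arithmetic behind the limit can be checked directly. This sketch assumes a Tcl build whose clock command accepts the extreme 32-bit values, which is true of current releases:

```tcl
set max 2147483647          ;# 2^31 - 1, largest signed 32-bit value
puts [clock format $max -format "%Y-%m-%d %H:%M:%S" -gmt true]
# => 2038-01-19 03:14:07
puts [clock format -2147483648 -format "%Y-%m-%d %H:%M:%S" -gmt true]
# => 1901-12-13 20:45:52
puts [format "%.2f years" [expr {$max / (365.2425 * 24 * 3600)}]]
# => 68.05 years
```
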
If you leave out a date, clock scan assumes the current date. You can also use the -base option to
specify a date. The following example uses the current time as the base, which is redundant:

clock scan "10:30:44 PM" -base [clock seconds]
=> 2931690644

The date parser allows these modifiers: year, month, fortnight (two weeks), week, day, hour,
minute, second. You can put a positive or negative number in front of a modifier as a multiplier. For
example:

clock format [clock scan "10:30:44 PM 1 week"]

=> Sun Dec 01 22:30:44 1996
clock format [clock scan "10:30:44 PM -1 week"]
=> Sun Nov 17 22:30:44 1996

You can also use tomorrow, yesterday, today, now, last, this, next, and ago, as modifiers.

clock format [clock scan "3 years ago"]

=> Wed Nov 24 17:06:46 1993

Both clock format and clock scan take a -gmt option that uses Greenwich Mean Time. Otherwise,
the local time zone is used.

clock format [clock seconds] -gmt true

=> Sun Nov 24 09:25:29 1996
clock format [clock seconds] -gmt false
=> Sun Nov 24 17:25:34 1996
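
Combining -base with -gmt makes the result of a scan reproducible, which is handy for testing. The base values below are arbitrary choices for this sketch:

```tcl
# Base 0 is 1970-01-01 00:00:00 GMT.
set t [clock scan "10:30:44 PM 1 week" -base 0 -gmt true]
puts [clock format $t -format "%Y-%m-%d %H:%M:%S" -gmt true]
# => 1970-01-08 22:30:44

# Base 259200 is 1970-01-04 00:00:00 GMT.
set t [clock scan "2 days ago" -base 259200 -gmt true]
puts [clock format $t -format "%Y-%m-%d" -gmt true]
# => 1970-01-02
```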

The info Command

Table 13-4 summarizes the info command. The operations are described in more detail later.

Table 13-4. The info command.

info args procedure          A list of procedure's arguments.
info body procedure          The commands in the body of procedure.
info cmdcount                The number of commands executed so far.
info commands ?pattern?      A list of all commands, or those matching pattern. Includes built-ins and Tcl procedures.
info complete string         True if string contains a complete Tcl command.
info default proc arg var    True if arg has a default parameter value in procedure proc. The default value is stored into var.
info exists variable         True if variable is defined.
info globals ?pattern?       A list of all global variables, or those matching pattern.
info hostname                The name of the machine. This may be the empty string if networking is not initialized.
info level                   The stack level of the current procedure, or 0 for the global scope.
info level number            A list of the command and its arguments at the specified level of the stack.
info library                 The pathname of the Tcl library directory.
info loaded ?interp?         A list of the libraries loaded into the interpreter named interp, which defaults to the current one.
info locals ?pattern?        A list of all local variables, or those matching pattern.
info nameofexecutable        The file name of the program (e.g., of tclsh or wish).
info patchlevel              The release patch level for Tcl.
info procs ?pattern?         A list of all Tcl procedures, or those that match pattern.
info script                  The name of the file being processed, or the empty string.
info sharedlibextension      The file name suffix of shared libraries.
info tclversion              The version number of Tcl.
info vars ?pattern?          A list of all visible variables, or those matching pattern.

Variables
There are three categories of variables: local, global, and visible. Information about these categories is
returned by the locals, globals, and vars operations, respectively. The local variables include
procedure arguments as well as locally defined variables. The global variables include all variables
defined at the global scope. The visible variables include locals, plus any variables made visible via
global or upvar commands. A pattern can be specified to limit the returned list of variables to those
that match the pattern. The pattern is interpreted according to the rules of string match, which is
described on page 48:

info globals auto*

=> auto_index auto_noexec auto_path

Namespaces, which are the topic of the next chapter, partition global variables into different scopes.
You query the variables visible in a namespace with:

info vars namespace::*

Remember that a variable may not be defined yet even though a global or upvar command has
declared it visible in the current scope. Use the info exists command to test whether a variable or an
array element is defined or not. An example is shown on page 90.
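
A small sketch of that situation follows. The procedure and variable names are invented for the example:

```tcl
proc Var_Demo {} {
    global demoSetting            ;# visible here, but not yet defined
    set before [info exists demoSetting]
    set demoSetting 42
    set after [info exists demoSetting]
    # Array elements are tested the same way.
    set opts(color) red
    return [list $before $after \
        [info exists opts(color)] [info exists opts(size)]]
}
catch {unset demoSetting}
puts [Var_Demo]    ;# => 0 1 1 0
```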

Procedures
You can find out everything about a Tcl procedure with the args, body, and default operations. This
is illustrated in the following Proc_Show example. The puts commands use the -nonewline flag
because the newlines in the procedure body, if any, are retained:

Example 13-2 Printing a procedure definition.

proc Proc_Show {{namepat *} {file stdout}} {
    foreach proc [info procs $namepat] {
        set space ""
        puts -nonewline $file "proc $proc {"
        foreach arg [info args $proc] {
            if [info default $proc $arg value] {
                puts -nonewline $file "$space{$arg $value}"
            } else {
                puts -nonewline $file $space$arg
            }
            set space " "
        }
        # No newline needed because info body may return a
        # value that starts with a newline
        puts -nonewline $file "} {"
        puts -nonewline $file [info body $proc]
        puts $file "}"
    }
}

Example 13-3 is a more elaborate example of procedure introspection that comes from the
direct.tcl file, which is part of the Tcl Web Server described in Chapter 18. This code is used to
map URL requests and the associated query data directly into Tcl procedure calls. This is discussed in
more detail on page 247. The Web server collects Web form data into an array called form. Example
13-3 matches up elements of the form array with procedure arguments, and it collects extra elements
into an args parameter. If a form value is missing, then the default argument value or the empty string
is used:

Example 13-3 Mapping form data onto procedure arguments.

# cmd is the name of the procedure to invoke
# form is an array containing form values

set cmdOrig $cmd
set params [info args $cmdOrig]

# Match elements of the form array to parameters

foreach arg $params {
    if {![info exists form($arg)]} {
        if {[info default $cmdOrig $arg value]} {
            lappend cmd $value
        } elseif {[string compare $arg "args"] == 0} {
            set needargs yes
        } else {
            lappend cmd {}
        }
    } else {
        lappend cmd $form($arg)
    }
}
# If args is a parameter, then append the form data
# that does not match other parameters as extra parameters

if {[info exists needargs]} {
    foreach {name value} $valuelist {
        if {[lsearch $params $name] < 0} {
            lappend cmd $name $value
        }
    }
}
# Eval the command

set code [catch $cmd result]
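
The fragment above depends on variables set up elsewhere in the server. A self-contained sketch of the same idea, using a made-up Greet procedure and form array and taking the extra elements straight from the array, might look like this:

```tcl
# Hypothetical target procedure and form data.
proc Greet {name {greeting Hello} args} {
    return "$greeting, $name! extras: $args"
}
array set form {name Tcl color blue}

set cmd Greet
set params [info args Greet]
foreach arg $params {
    if {[string compare $arg "args"] == 0} {
        # Pass unmatched form elements as extra name-value pairs.
        foreach name [lsort [array names form]] {
            if {[lsearch -exact $params $name] < 0} {
                lappend cmd $name $form($name)
            }
        }
    } elseif {[info exists form($arg)]} {
        lappend cmd $form($arg)
    } elseif {[info default Greet $arg value]} {
        lappend cmd $value
    } else {
        lappend cmd {}
    }
}
puts [eval $cmd]
# => Hello, Tcl! extras: color blue
```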

The info commands operation returns a list of all commands, which includes both built-in commands
defined in C and Tcl procedures. There is no operation that just returns the list of built-in commands.
Example 13-4 finds the built-in commands by removing all the procedures from the list of commands.

Example 13-4 Finding built-in commands.

proc Command_Info {{pattern *}} {
    # Create a table of procedures for quick lookup
    foreach p [info procs $pattern] {
        set isproc($p) 1
    }
    # Look for command not in the procedure table
    set result {}
    foreach c [info commands $pattern] {
        if {![info exists isproc($c)]} {
            lappend result $c
        }
    }
    return [lsort $result]
}

The Call Stack

The info level operation returns information about the Tcl evaluation stack, or call stack. The global
level is numbered zero. A procedure called from the global level is at level one in the call stack. A
procedure it calls is at level two, and so on. The info level command returns the current level
number of the stack if no level number is specified.
If a positive level number is specified (e.g., info level 3), then the command returns the procedure
name and argument values at that level in the call stack. If a negative level is specified, then it is
relative to the current call stack. Relative level -1 is the level of the current procedure's caller, and
relative level 0 is the current procedure. The following example prints the call stack. The Call_Trace
procedure avoids printing information about itself by starting at one less than the current call stack
level:

Example 13-5 Getting a trace of the Tcl call stack.

proc Call_Trace {{file stdout}} {
    puts $file "Tcl Call Trace"
    for {set x [expr {[info level] - 1}]} {$x > 0} {incr x -1} {
        puts $file "$x: [info level $x]"
    }
}
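
Relative levels are easiest to see with two small procedures. The names Outer and Inner are only for this sketch:

```tcl
proc Outer {a} {
    return [Inner [expr {$a * 2}]]
}
proc Inner {b} {
    # Level 0 is this invocation; level -1 is the caller's.
    return [list [info level 0] [info level -1]]
}
puts [Outer 5]
# => {Inner 10} {Outer 5}
```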

Command Evaluation
If you want to know how many Tcl commands are executed, use the info cmdcount command. This
counts all commands, not just top-level commands. The counter is never reset, so you need to sample
it before and after a test run if you want to know how many commands are executed during a test.
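
A sampling helper along these lines is sketched below. Count_Commands is a hypothetical name, and the exact count varies between Tcl versions because the helper's own commands are included:

```tcl
proc Count_Commands {script} {
    set before [info cmdcount]
    uplevel 1 $script
    return [expr {[info cmdcount] - $before}]
}
set n [Count_Commands {
    for {set i 0} {$i < 100} {incr i} {}
}]
puts "commands executed: $n"
```
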
The info complete operation figures out whether a string is a complete Tcl command. This is useful
for command interpreters that need to wait until the user has typed in a complete Tcl command before
passing it to eval. Example 13-6 defines Command_Process that gets a line of input and builds up a
command. When the command is complete, the command is executed at the global scope.
Command_Process takes two callbacks as arguments. The inCmd is evaluated to get the line of input,
and the outCmd is evaluated to display the results. Chapter 10 describes callbacks and why the curly
braces are used with eval as they are in this example:

Example 13-6 A procedure to read and evaluate commands.

proc Command_Process {inCmd outCmd} {
    global command
    append command(line) [eval $inCmd]
    if [info complete $command(line)] {
        set code [catch {uplevel #0 $command(line)} result]
        eval $outCmd {$result $code}
        set command(line) {}
    }
}
proc Command_Read {{in stdin}} {
    if [eof $in] {
        if {$in != "stdin"} {
            close $in
        }
        return {}
    }
    return [gets $in]
}
proc Command_Display {file result code} {
    puts stdout $result
}
while {![eof stdin]} {
    Command_Process {Command_Read stdin} \
        {Command_Display stdout}
}
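
The info complete test itself can be tried on its own. In this sketch the lines of a procedure definition are added to a buffer one at a time, with a newline after each, and the command only becomes complete when the closing brace arrives:

```tcl
puts [info complete "set x \{a b"]          ;# => 0, open brace
puts [info complete "set x {a b}"]          ;# => 1
puts [info complete "puts \[clock seconds"] ;# => 0, open bracket

set pending ""
foreach line [list "proc Hi {} \{" "    return hi" "\}"] {
    append pending $line \n
    puts "[info complete $pending]: $line"
}
```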

Scripts and the Library

The name of the current script file is returned with the info script command. For example, if you
use the source command to read commands from a file, then info script returns the name of that
file if it is called during execution of the commands in that script. This is true even if the info script
command is called from a procedure that is not defined in the script.

Use info script to find related files.

I often use info script to source or process files stored in the same directory as the script that is
running. A few examples are shown in Example 13-7.

Example 13-7 Using info script to find related files.

# Get the directory containing the current script.
set dir [file dirname [info script]]

# Source a file in the same directory
source [file join $dir helper.tcl]

# Add an adjacent script library directory to auto_path
# The use of ../lib with file join is cross-platform safe.
lappend auto_path [file join $dir ../lib]

The pathname of the Tcl library is stored in the tcl_library variable, and it is also returned by the
info library command. While you could put scripts into this directory, it might be better to have a
separate directory and use the script library facility described in Chapter 12. This makes it easier to
deal with new releases of Tcl and to package up your code if you want other sites to use it.

Version Numbers
Each Tcl release has a version number such as 7.4 or 8.0. This number is returned by the info
tclversion command. If you want your script to run on a variety of Tcl releases, you may need to test
the version number and take different actions in the case of incompatibilities between releases.
The Tcl release cycle starts with one or two alpha and beta releases before the final release, and there
may even be a patch release after that. The info patchlevel command returns a qualified version
number, like 8.0b1 for the first beta release of 8.0. We switched from using "p" (e.g., 8.0p2) to a three-
level scheme (e.g., 8.0.3) for patch releases. The patch level is zero for the final release (e.g., 8.2.0). In
general, you should be prepared for feature changes during the beta cycle, but there should only be bug
fixes in the patch releases. Another rule of thumb is that the Tcl script interface remains quite
compatible between releases; feature additions are upward compatible.
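
One way to test the version is with package vcompare, which compares each numeric field separately and so avoids the trap of treating 8.10 as smaller than 8.2. The messages printed here are only placeholders:

```tcl
puts "version:    [info tclversion]"
puts "patchlevel: [info patchlevel]"
# package vcompare returns -1, 0, or 1.
if {[package vcompare [info tclversion] 8.0] >= 0} {
    puts "namespaces are available"
} else {
    puts "fall back to naming conventions"
}
puts [package vcompare 8.10 8.2]    ;# => 1
```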

Execution Environment
The file name of the program being executed is returned with info nameofexecutable. This is more
precise than the name in the argv0 variable, which could be a relative name or a name found in a
command directory on your command search path. It is still possible for info nameofexecutable to
return a relative pathname if the user runs your program as ./foo, for example. The following
construct always returns the absolute pathname of the current program. If info nameofexecutable
returns an absolute pathname, then the value of the current directory is ignored. The pwd command is
described on page 115:

file join [pwd] [info nameofexecutable]

A few operations support dynamic loading of shared libraries, which are described in Chapter 44. The
info sharedlibextension command returns the file name suffix of dynamic link libraries. The info loaded
command returns a list of libraries that have been loaded into an interpreter. Multiple interpreters are
described in Chapter 19.

Cross-Platform Support
Tcl is designed so that you can write scripts that run unchanged on UNIX, Macintosh, and Windows
platforms. In practice, you may need a small amount of code that is specific to a particular platform.
You can find out information about the platform via the tcl_platform variable. This is an array with
these elements defined:

tcl_platform(platform) is one of unix, macintosh, or windows.

tcl_platform(os) identifies the operating system. Examples include MacOS, Solaris, Linux,
Win32s (Windows 3.1 with the Win32 subsystem), Windows 95, Windows NT, and SunOS.

tcl_platform(osVersion) gives the version number of the operating system.

tcl_platform(machine) identifies the hardware. Examples include ppc (Power PC), 68k
(68000 family), sparc, intel, mips, and alpha.

tcl_platform(isWrapped) indicates that the application has been wrapped up into a single
executable with TclPro Wrapper. This is not defined in normal circumstances.

tcl_platform(user) gives the login name of the current user.

tcl_platform(debug) indicates that Tcl was compiled with debugging symbols.

tcl_platform(thread) indicates that Tcl was compiled with thread support enabled.

On some platforms a hostname is defined. If available, it is returned with the info hostname
command. This command may return an empty string.
One of the most significant areas affected by cross-platform portability is the file system and the way
files are named. This topic is discussed on page 103.
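
A small amount of platform-specific code could look like the following sketch. The pathSep variable is a hypothetical name introduced here; it holds the separator used between directories in a command search path.

```tcl
# Choose a platform-specific value. pathSep is an illustrative
# variable: the separator used in a command search path.
switch -- $tcl_platform(platform) {
    unix    { set pathSep ":" }
    windows { set pathSep ";" }
    default { set pathSep "" }
}

# isWrapped is only defined in wrapped applications, so test for it.
set wrapped [info exists tcl_platform(isWrapped)]

puts "$tcl_platform(os) $tcl_platform(osVersion) on $tcl_platform(machine)"
puts "host: [info hostname], wrapped: $wrapped, separator: '$pathSep'"
```

Testing optional elements with info exists avoids an error on interpreters where they are not defined.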

Tracing Variable Values

The trace command registers a command to be called whenever a variable is accessed, modified, or
unset. This form of the command is:

trace variable name ops command

The name is a Tcl variable name, which can be a simple variable, an array, or an array element. If a
whole array is traced, the trace is invoked when any element is used according to ops. The ops
argument is one or more of the letters r, for read traces, w, for write traces, and u, for unset traces. The
command is executed when one of these events occurs. It is invoked as:

command name1 name2 op

The name1 argument is the variable or array name. The name2 argument is the name of the array index,
or an empty string if the trace is on a simple variable. If there is an unset trace on an entire array and
the array is unset, name2 is also an empty string. The value of the variable is not passed to the
procedure. The traced variable is one level up the Tcl call stack. The upvar, uplevel, or global
commands need to be used to make the variable visible in the scope of command. These commands are
described in more detail in Chapter 7.
A read trace is invoked before the value of the variable is returned, so if it changes the variable itself,
the new value is returned. A write trace is called after the variable is modified. The unset trace is
called after the variable is unset.
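
The calling convention can be seen with a minimal sketch that records each invocation. It uses the trace variable form described in this section; LogTrace, traceLog, and counter are hypothetical names.

```tcl
# Record every trace callback as a list of {name1 name2 op}.
set traceLog {}
proc LogTrace {name1 name2 op} {
    global traceLog
    lappend traceLog [list $name1 $name2 $op]
}

set counter 0
trace variable counter rwu LogTrace

set counter 1        ;# triggers the write trace
set copy $counter    ;# triggers the read trace
unset counter        ;# triggers the unset trace

puts $traceLog
```

Because counter is a simple variable, name2 is empty in every recorded entry.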

Read-Only Variables
Example 13-8 uses traces to implement a read-only variable. A variable is modified before the trace
procedure is called, so the ReadOnly variable is needed to preserve the original value. When a variable
is unset, the traces are automatically removed, so the unset trace action reestablishes the trace
explicitly. Note that the upvar alias (e.g., var) cannot be used to set up the trace:

Example 13-8 Tracing variables.

proc ReadOnlyVar {varName} {
    upvar 1 $varName var
    global ReadOnly
    set ReadOnly($varName) $var
    trace variable $varName wu ReadOnlyTrace
}
proc ReadOnlyTrace { varName index op } {
    global ReadOnly
    upvar 1 $varName var
    switch $op {
        w {
            set var $ReadOnly($varName)
        }
        u {
            set var $ReadOnly($varName)
            # Re-establish the trace using the true name
            trace variable $varName wu ReadOnlyTrace
        }
    }
}

This example merely overrides the new value with the saved value. Another alternative is to raise an
error with the error command. This will cause the command that modified the variable to return the
error. Another common use of trace is to update a user interface widget in response to a variable
change. Several of the Tk widgets have this feature built into them.
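
The error alternative might be sketched as follows. Because a write trace runs after the variable has been modified, this sketch restores the saved value before raising the error. The names limit, savedLimit, and RejectWrite are hypothetical.

```tcl
set limit 10
set savedLimit $limit

# Undo the write, then make the modifying command fail.
proc RejectWrite {name1 name2 op} {
    global savedLimit
    upvar 1 $name1 var
    set var $savedLimit
    error "$name1 is read-only"
}
trace variable limit w RejectWrite

# The set command now returns an error, and limit keeps its old value.
set status [catch {set limit 20} message]
puts "status=$status limit=$limit"
puts $message
```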
If more than one trace is set on a variable, then they are invoked in the reverse order; the most recent
trace is executed first. If there is a trace on an array and on an array element, then the trace on the array
is invoked first.
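
The ordering rule for multiple traces on one variable can be checked with a short sketch. The procedure and variable names are made up for the illustration.

```tcl
set order {}
proc FirstTrace  {name1 name2 op} { global order; lappend order first }
proc SecondTrace {name1 name2 op} { global order; lappend order second }

set v 0
trace variable v w FirstTrace
trace variable v w SecondTrace   ;# registered last, so it runs first

set v 1
puts $order
```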

Creating an Array with Traces

Example 13-9 uses an array trace to dynamically create array elements:

Example 13-9 Creating array elements with array traces.

# make sure variable is an array
set dynamic() {}
trace variable dynamic r FixupDynamic
proc FixupDynamic {name index op} {
    upvar 1 $name dynArray
    if {![info exists dynArray($index)]} {
        set dynArray($index) 0
    }
}

Information about traces on a variable is returned with the vinfo option:

trace vinfo dynamic

=> {r FixupDynamic}

A trace is deleted with the vdelete option, which has the same form as the variable option. The
trace in the previous example can be removed with the following command:

trace vdelete dynamic r FixupDynamic
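
Putting the pieces together, the following sketch repeats the array trace from Example 13-9 so that it is self-contained, reads an element that does not exist yet, and then removes the trace. The element names hits and misses are arbitrary.

```tcl
set dynamic() {}
proc FixupDynamic {name index op} {
    upvar 1 $name dynArray
    if {![info exists dynArray($index)]} {
        set dynArray($index) 0
    }
}
trace variable dynamic r FixupDynamic

# Reading a missing element creates it with the value 0.
set value $dynamic(hits)

set before [trace vinfo dynamic]
trace vdelete dynamic r FixupDynamic
set after [trace vinfo dynamic]

# Without the trace, reading a missing element is an error again.
set failed [catch {set dynamic(misses)}]
puts "value=$value before=$before after=$after failed=$failed"
```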


Interactive Command History

The Tcl shell programs keep a log of the commands that you type by using a history facility. The log is
controlled and accessed via the history command. The history facility uses the term event to mean an
entry in its history log. The events are just commands, and they have an event ID that is their index in
the log. You can also specify an event with a negative index that counts backwards from the end of the
log. Event -1 is the previous event. Table 13-5 summarizes the Tcl history command. In the table,
event defaults to -1.

In practice you will want to take advantage of the ability to abbreviate the history options and even the
name of the history command itself. For the command, you need to type a unique prefix, and this
depends on what other commands are already defined. For the options, there are unique one-letter
abbreviations for all of them. For example, you could reuse the last word of the previous command
with [hist w $]. This works because a $ that is not followed by alphanumerics or an open brace is
treated as a literal $.
Several of the history operations update the history list. They remove the actual history command
and replace it with the command that resulted from the history operation. The event and redo
operations both behave in this manner. This makes sense because you would rather have the
actual command in the history than the history command used to retrieve it.

Table 13-5. The history command.

history                       Short for history info with no count.
history add command ?exec?    Adds the command to the history list. If exec is specified, then executes the command.
history change new ?event?    Changes the command specified by event to new in the command history.
history event ?event?         Returns the command specified by event.
history info ?count?          Returns a formatted history list of the last count commands, or of all commands.
history keep count            Limits the history to the last count commands.
history nextid                Returns the number of the next event.
history redo ?event?          Repeats the specified command.
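
The history command can also be driven from a script, which is a convenient way to experiment with the operations in Table 13-5. The following is only a sketch under that assumption. It uses an absolute event ID taken from history nextid so that it does not depend on what is already in the log.

```tcl
# Remember the ID that the next logged command will receive.
set id [history nextid]

# Log two commands and execute each one as it is added.
history add {set a 5} exec
history add {set b [expr {$a + 7}]} exec

# Fetch the first of the two commands back by its event ID.
set first [history event $id]
puts "event $id was: $first"
puts [history info 2]
```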

History Syntax
Some extra syntax is supported when running interactively to make the history facility more
convenient to use. Table 13-6 shows the special history syntax supported by tclsh and wish.

Table 13-6. Special history syntax.

!!          Repeats the previous command.
!n          Repeats command number n. If n is negative it counts backward from the current command. The previous command is event -1.
!prefix     Repeats the last command that begins with prefix.
!pattern    Repeats the last command that matches pattern.
^old^new    Globally replaces old with new in the last command.

The next example shows how some of the history operations work:

Example 13-10 Interactive history usage.

% set a 5
5
% set a [expr $a+7]
12
% history
1 set a 5
2 set a [expr $a+7]
3 history
% !2
19
% !!
26
% ^7^13
39
% !h
1 set a 5
2 set a [expr $a+7]
3 history
4 set a [expr $a+7]
5 set a [expr $a+7]
6 set a [expr $a+13]
7 history

A Comparison to C Shell History Syntax

The history syntax shown in the previous example is simpler than the history syntax provided by the C
shell. Not all of the history operations are supported with special syntax. The substitutions (using
^old^new) are performed globally on the previous command. This is different from the quick-history
of the C shell. Instead, it is like the !:gs/old/new/ history command. So, for example, if the example
had included ^a^b in an attempt to set b to 39, an error would have occurred because the command
would have used b before it was defined:

set b [expr $b+7]
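
The effect of the global substitution can be reproduced directly with regsub -all, which is the call used in the excerpt from unknown in Example 13-11. This sketch applies it to the command from the example. The variable names previous and newcmd are chosen for illustration.

```tcl
# The previous command, as it would be recorded in the history log.
set previous {set a [expr $a+7]}

# ^a^b replaces every a, including the one inside $a.
set count [regsub -all -- a $previous b newcmd]
puts "$count replacements: $newcmd"

# Running the result fails because b has never been set.
catch {unset b}
set failed [catch {eval $newcmd} message]
puts $message
```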

If you want to improve the history syntax, you will need to modify the unknown command, which is
where it is implemented. This command is discussed in more detail in Chapter 12. Here is the code
from the unknown command that implements the extra history syntax. The main limitation in
comparison with the C shell history syntax is that the ! substitutions are performed only when ! is at
the beginning of the command:

Example 13-11 Implementing special history syntax.

# Excerpts from the standard unknown command
# uplevel is used to run the command in the right context
if {$name == "!!"} {
    set newcmd [history event]
} elseif {[regexp {^!(.+)$} $name dummy event]} {
    set newcmd [history event $event]
} elseif {[regexp {^\^([^^]*)\^([^^]*)\^?$} $name x old new]} {
    set newcmd [history event -1]
    catch {regsub -all -- $old $newcmd $new newcmd}
}
if {[info exists newcmd]} {
    history change $newcmd 0
    return [uplevel $newcmd]
}

Debugging
The rapid turnaround with Tcl coding means that it is often sufficient to add a few puts statements to
your script to gain some insight about its behavior. This solution doesn't scale too well, however. A
slight improvement is to add a Debug procedure that can have its output controlled better. You can log
the information to a file, or turn it off completely. In a Tk application, it is simple to create a text
widget to hold the contents of the log so that you can view it from the application. Here is a simple
Debug procedure. To enable it you need to set the debug(enabled) variable. To have its output go to
your terminal, set debug(file) to stderr.

Example 13-12 A Debug procedure.

proc Debug { args } {
    global debug
    if {![info exists debug(enabled)]} {
        # Default is to do nothing
        return
    }
    puts $debug(file) [join $args " "]
}
proc DebugOn {{file {}}} {
    global debug
    set debug(enabled) 1
    if {[string length $file] == 0} {
        set debug(file) stderr
    } else {
        if {[catch {open $file w} fileID]} {
            puts stderr "Cannot open $file: $fileID"
            set debug(file) stderr
        } else {
            puts stderr "Debug info to $file"
            set debug(file) $fileID
        }
    }
}