CS 2110: Parallel Programming
A brief intro to: Parallelism, Threads, and Concurrency
Our Goals
Appreciate the (increasing) importance of parallel programming Understand fundamental concepts:
Parallelism, threads, multi-threading, concurrency, locks, etc.
See some basics of this is done in Java See some common uses:
Divide and conquer, e.g. mergesort Worker threads in Swing
Notes!
An area of rapid change!
1990s: parallel computers were $$$$ Now: 4 core machines are commodity
Variations between languages Old dogs and new tricks? Not so good
Educators, Java books, web pages
Evolving frameworks, models, etc.
E.g. Javas getting Fork/Join in Java 1.7 (summer 11) MAP/REDUCE
(Multi)Process vs (Multi)Thread
Assume a computer has one CPU Can only execute one statement at a time
Thus one program at a time
Process: an operating-system level unit of execution Multi-processing
Op. Sys. time-slices between processes Computer appears to do more than one program (or background process) at a time
Tasks and Threads
Thread: a thread of execution
Smaller, lighter than a process smallest unit of processing that can be scheduled by an operating system Has its own run-time call stack, copies of the CPUs registers, its own program counter, etc. Process has its own memory address space, but threads share one address space
A single program can be multi-threaded
Time-slicing done just like in multiprocessing Repeat: the threads share the same memory
Task
A task is an abstraction of a series of steps
Might be done in a separate thread
In Java, there are a number of classes / interfaces that basically correspond to this
Example (details soon): Runnable
work done by method run()
Java: Statements Tasks
Consecutive lines of code:
Foo tmp = f1; f1 = f2; f2 = tmp;
A method:
swap(f1, f2);
A task object:
SwapTask task1= new SwapTask(f1,f2); task1.run();
Huh? Why a task object?
Actions, functions vs. objects. Whats the difference?
Huh? Why a task object?
Actions, functions vs. objects. Whats the difference? Objects:
Are persistent. Can be stored. Can be created and then used later. Can be attached to other things. Put in Collections. Contain state.
Functions:
Called, return (not permanent)
Java Library Classes
For task-like things:
Runnable, Callable SwingWorker, RecursiveAction, etc.
Thread class Managing tasks and threads
Executor, ExecutorService ForkJoinPool
In Swing
The Event-Dispatch Thread SwingUtilities.invokeLater()
Javas Nested Classes
You can declare a class inside another
http://download.oracle.com/javase/tutorial/java/javaOO/nested.html
If declared static, can use just like any class If not static
Can only define objects of that type from within non-static code of the enclosing class Object of inner-class type can access all fields of the object that created it. (Useful!) Often used for helper classes, e.g. a node object used in a list or tree.
See demo done in Eclipse: TaskDemo.java
Possible Needs for Task Objects
Can you think of any?
Possible Needs for Task Objects
Can you think of any? Storing tasks for execution later
Re-execution
Undo and Redo Threads
Undo Operations
A task object should:
Be able to execute and undo a function Therefore will need to be able to save enough state to go back
When application executes a task:
Create a task object and make it execute Store that object on a undo stack
Undo
Get last task object stored on stack, make it undo
Calculator App Example
We had methods to do arithmetic operations:
public void addToMemory(double inputVal) { memory = memory + inputVal; }
Instead:
public void addToMemory(double inputVal) { AddTask task = new AddTask(inputVal); task.run(); undoStack.add(task); }
Stack, Undo Stack
A Stack is an important ADT
A linear sequence of data Can only add to the end, remove item at the end LIFO organization: last in, first out Operations: push(x), pop(), sometimes top()
Stacks important for storing delayed things to return turn
Run-time stack (with activation records) An undo stack (and a separate redo stack)
Nested class for Adding
private class AddTask implements UndoableRunnable { private double param; public AddTask(double inputVal) { this.param = inputVal; } public void run() { // memory is field in CalcApp memory = memory + this.param; } public boolean undo() { memory = memory - this.param; return true; } }
Undo operation
In the Calc app:
public boolean undo() { boolean result = false; int last = undoStack.size()-1; if ( last >= 0 ) { UndoableRunnable task = undoStack.get(last); result = task.undo(); undoStack.remove(last); } return result; }
Java Thread Classes and Methods
Java has some primitives for creating and using threads
Most sources teach these, but in practice theyre hard to use well Now, better frameworks and libraries make using them directly less important.
But lets take a quick look
Javas Thread Class
Class Thread: its method run() does its business when that thread is run But you never call run(). Instead, you call start() which lets Java start it and call run() To use Thread class directly (not recommended now):
define a subclass of Thread and override run() not recommended! Create a task as a Runnable, link it with a Thread, and then call start() on the Thread.
The Thread will run the Runnables run() method.
Creating a Task and Thread
Again, the first of the two old ways Get a thread object, then call start() on that object
Makes it available to be run When its time to run it, Threads run() is called
So, create a thread using inheritance
Write class that extends Thread, e.g. MyThread Define your own run() Create a MyThread object and call start() on it
We wont do this! Not good design!
Runnables and Thread
Use the task abstraction and create a class that implements Runnable interface
Define the run() method to do the work you want
Now, two ways to make your task run in a separate thread
First way:
Create a Thread object and pass a Runnable to the constructor As before, call start() on the Thread object
Second way: hand your Runnable to a thread manager object
Several options here! These are the new good ways. More soon.
Join (not the most descriptive word)
The Thread class defines various primitive methods you could not implement on your own
For example: start, which calls run in a new thread
The join() method is one such method, essential for coordination in this kind of computation
Caller blocks until/unless the receiver is done executing (meaning its run returns) E.g. in method foo() running in main thread, we call: myThread.start(); myThread.join(); Then this code waits (blocks) until myThreads run() completes
This style of parallel programming is often called fork/join
Warning: well soon see a library called fork/join which simplifies things. In that, you never call join()
24
Threading in Swing
Threading matters a lot in Swing GUIs
You know: mains thread ends early JFrame.setvisible(true) starts the GUI thread
Swing methods run in a separate thread called the Event-Dispatching Thread (EDT)
Why? GUIs need to be responsive quickly Important for good user interaction
But: slow tasks can block the EDT
Makes GUI seem to hang Doesnt allow parallel things to happen
Thread Rules in Swing
All operations that update GUI components must happen in the EDT
These components are not thread-safe (later) SwingUtilities.invokeLater(Runnable r) is a method that runs a task in the EDT when appropriate
But execute slow tasks in separate worker threads To make common tasks easier, use a SwingWorker task
SwingWorker
A class designed to be extended to define a task for a worker thread
Override method doInBackground() This is like run() its what you want to do Override method done() This method is for updating the GUI afterwards
It will be run in the EDT
For more info, see:
http://download.oracle.com/javase/tutorial/uiswing/concurrency/
Note you can get interim results too
Code Example
We have a fibonacci demo that runs this method both recursively and with a loop Original version
Unresponsive until it completes all its calculations
Need to run calls to the recursive fibonacci in a separate thread
See Fib2.java that uses SwingWorker to define a task
New Java ForkJoin Framework
Designed to support a common need
Recursive divide and conquer code Look for small problems, solve without parallelism For larger problems
Define a task for each subproblem Library provides
a Thread manager, called a ForkJoinPool Methods to send your subtask objects to the pool to be run, and your call waits until their done The pool handles the multithreading well
Turns out that Javas threads are still too heavyweight
Will be in Java 7 standard libraries, but available in Java 6 as a downloaded .jar file
Get jsr166y.jar from http://gee.cs.oswego.edu/dl/concurrencyinterest/index.html
More info here
http://www.cs.washington.edu/homes/djg/teachingMaterials/grossm anSPAC_forkJoinFramework.html
Screenshots: For single- and multi-threaded Mergesort: Threads in Eclipse Debug window, and Macs CPU usage display
text
The ForkJoinPool
The thread manager
Used when calls are made to RecursiveTasks methods fork(), invokeAll(), etc. When created, knows how many processors are available Pretty sophisticated
Steals time from threads that have nothing to do
Overview of How To
Create a ForkJoinPool thread-manager object Create a task object that extends RecursiveTask
Well ignore use of generics with this (see docs) Create a task-object for entire problem and call invoke(task) on your ForkJoinPool
Your task class compute() is like Thread.run()
It has the code to do the divide and conquer First, it must check if small problem dont use parallelism, solve without it Then, divide and create >1 new task-objects. Run them:
Either with invokeAll(task1, task2, ). Waits for all to complete. Or calling fork() on first, then compute() on second, then join()
Same Ideas as Thread But...
To use the ForkJoin Framework: A little standard set-up code (e.g., create a ForkJoinPool) Dont subclass Thread Dont override run Dont call start Dont just call join or Do subclass RecursiveTask<V> Do override compute Do call invoke, invokeAll, fork Do call join which returns answer Do call invokeAll on multiple tasks
Mergesort Example
Top-level call. Create main task and submit
public static void mergeSortFJRecur(Comparable[] list, int first, int last) { if (last - first < RECURSE_THRESHOLD) { MergeSort.insertionSort(list, first, last); return; } Comparable[] tmpList = new Comparable[list.length]; threadPool.invoke(new SortTask(list, tmpList, first, last)); }
Mergesorts Task-Object Nested Class
static class SortTask extends RecursiveAction { Comparable[] list; Comparable[] tmpList; int first, last; public SortTask(Comparable[] a, Comparable[] tmp, int lo, int hi) { this.list = a; this.tmpList = tmp; this.first = lo; this.last = hi; } // continued next slide
compute() Does Task Recursion
protected void compute() { // in SortTask, continued from previous slide if (last - first < RECURSE_THRESHOLD) MergeSort.insertionSort(list, first, last); else { int mid = (first + last) / 2; // the two recursive calls are replaced by a call to invokeAll SortTask task1 = new SortTask(list, tmpList, first, mid); SortTask task2 = new SortTask(list, tmpList, mid+1, last); invokeAll(task1, task2); MergeSort.merge(list, first, mid, last); }
Leaving new ForkJoin framework
Java since 1.5 has a more general set of classes for task managers
Nice to Have a Thread Manager
If your code is responsible for creating a bunch of tasks, linking them with Threads, and starting them all, then you have muchto worry about:
What if you start too many threads? Can you manage the number of running threads? Enough processors? Can you shutdown all the threads? If one fails, can you restart it?
Executors
An Executor is an object that manages running tasks
Submit a Runnable to be run with Executors execute() method So, instead of creating a Thread for your Runnable and calling start() on that, do this:
Get an Executor object, say called exec Create a Runnable, say called myTask Submit for running: exec.execute(myTask)
How to Get an Executor
Use static methods in Executors library.
Fixed thread pool: at most N threads running at one time Executor exec = Executors.newFixedThreadPool(MAX_THREADS); Unlimited number of threads Executor exec = Executors.newCachedThreadPool();
Summary So Far
Create a class that implements a Runnable to be your task object
Or if ForkJoin framework, extend RecursiveTask
Create your task objects Create an Executor
Or a ForkJoinPool
Submit each task-object to the Executor which starts it up in a separate thread
Concurrency and Synchronization
Concurrency: Issues related to multiple-threads accessing shared data Synchronization: Methods to manage and control concurrent access to shared data by multiple-threads Note: Our book defines concurrent programming and concurrency to be what more people now call parallel programming
Possible Bugs in Multithreaded Code
Possible bug #1 i=1; x=10; x = i + x; // x could be 12 here Possible bug #2 if ( ! myList.contains(x) ) myList.add(x); // x could be in list twice Why could these cause unexpected results?
Heres Why
See MSD text pp. 759-762 Multiple threads executing same lines of code at same time, accessing same data values
How 1 + 10 might be 12
Thread 1 executes:
(x is 10, i is 1) Get i (1) into register 1 Get x (10) into its register 2 (other thread has CPU) Add registers Store result (11) into x (x is now 11) (other thread has CPU) (other thread has CPU) Do next line of code (x changes to 12 even though no code in this thread has touched x)
Thread 2 executes:
(x is 10, i is 1) (other thread has CPU) (other thread has CPU) Get i (1) into its register 1 (other thread has CPU) (other thread has CPU) Get x (11) into is register 2 Add registers Store result (12) into x (x is now 12)
Synchronization
Understand the issue with concurrent access to shared data?
Data could be a counter (int) or a data structure (e.g. a Map or List or Set)
A race condition: Two threads will access something. They compete causing a problem A critical section: a block of code that can only be safely executed by one thread at a time A lock: an object that is held by one thread at a time, then released
Synchronization in Java (1)
Any object can serve as a lock
Separate object: Object myLock = new Object(); Current instance: the this object
Enclose lines of code in a synchronized block synchronized(myLock) { // code here } More than one thread could try to execute this code, but one acquires the lock and the others block or wait until the first thread releases the lock
Synchronized Methods
Common situation: all the code in a method is a critical section
I.e. only one thread at a time should execute that method E.g. a getter or setter or mutator, or something that changes shared state info (e.g. a Map of important data)
Java makes it easy: add synchronized keyword to method signature. E.g.
public synchronized void update() {
Summary So Far
Concurrent access to shared data
Can lead to serious, hard-to-find problems E.g. race conditions
The concept of a lock Synchronized blocks of code or methods
One thread at a time While first thread is executing it, others block
Some Java Solutions
There are some synchronized collections Classes like AtomicInteger
Stores an int Has methods to operate on it in a thread-safe manner
int getAndAdd(int delta) instead of i=i+1
More Advanced Synchronization
A semaphore object
Allows simultaneous access by N threads If N==1, then this is known as a mutex (mutual exclusion) Java has a class Semaphore
Other Java classes
CountDownLatch, Barriers, etc. No more on these in CS2110 this term
Unused slides for Spring 2011
Barriers
Java class CyclicBarrier
A rendezvous point or barrier point Worker threads wait at a spot until all get there Then all proceed
Using CountDownLatch
Here are some common scenarios and demo programs for them Youll use the last of these for the War cardgame program!
Scenario #1
A manager thread and N worker threads Manager starts workers but then must wait for them to finish before doing follow-up work Solution:
Manager creates a CountDownLatch with value N After workers starts, manager calls await() on that When each worker completes its work, it calls countDown() on the latch After all N call countDown(), manager is un-blocked and does follow-up work
Example use: parallel divide and conquer like mergesort Code example: SyncDemo0.java
Scenario #2
A manager thread and N worker threads Manager starts workers but wants them to hold before doing real work until it says go Solution:
Manager creates a CountDownLatch with value 1 After each workers start, it calls await() on that Latch At some point, when ready, the manager calls countDown() on that Latch Now Workers free to continue with their work
Code example: SyncDemo1.java
Scenario #3
Work done in rounds where:
All workers wait for manager to say go Each worker does its job and then waits for next round Manager waits for all workers to complete a round, then does some follow-up work When thats done, manager starts next round by telling workers go
Solution: combine the two previous solutions
First Latch: hold workers until manager is ready Second Latch: manager waits until workers finish a round Workers run() has loop to repeat Manager must manage Latches, recreating them at end of round
Example use: a card game or anything that has that kind of structure Code example: SyncDemo2.java
Summary of last section
Multiple threads may need to cooperate
Common situation: some workers and a manager One thread may need to wait for one or more thread to complete One or more threads may need to wait to be released Or a combination of these situations
Threads all access a CountDownLatch
await() used to wait for enough calls to countDown()
End
Unused slides follow
Thr A Thr A Thr A
work work
Work to do
await
dfafaf