An Intro to Threading in Python
Python threading allows you to have different parts of your program run concurrently and can simplify your design. If you’ve got some experience in Python and want to speed up your program using threads, then this tutorial is for you!
This article assumes you’ve got the Python basics down pat and that you’re using at least version 3.6 to run the examples. If you need a refresher, you can start with the Python Learning Paths and get up to speed.
What Is a Thread?
A thread is a separate flow of execution. This means that your program will have two things happening at once. But for most Python 3 implementations the different threads do not actually execute at the same time: they merely appear to.
It’s tempting to think of threading as having two (or more) different processors running on your program, each one doing an independent task at the same time. That’s almost right. The threads may be running on different processors, but they will only be running one at a time.
Getting multiple tasks running simultaneously requires a non-standard implementation of Python, writing some of your code in a different language, or using multiprocessing, which comes with some extra overhead.
Because of the way the CPython implementation of Python works, threading may not speed up all tasks. This is due to interactions with the GIL (the global interpreter lock), which essentially allows only one Python thread to run at a time.
Tasks that spend much of their time waiting for external events are generally good candidates for threading. Problems that require heavy CPU computation and spend little time waiting for external events might not run faster at all.
This is true for code written in Python and running on the standard CPython implementation. If your threads are written in C, they have the ability to release the GIL and run concurrently. If you are running on a different Python implementation, check its documentation to see how it handles threads.
If you are running a standard Python implementation, writing in only Python, and have a CPU-bound problem, you should check out the multiprocessing module instead.
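To make the I/O-bound case concrete, here is a minimal sketch that compares running four waiting tasks one after another with running them in threads. It assumes time.sleep() is a fair stand-in for a blocking network or disk call; the helper names wait_for_io(), run_sequentially(), and run_threaded() are made up for this illustration.

```python
import threading
import time


def wait_for_io(seconds):
    # time.sleep() stands in for a blocking external call (an assumption
    # for this sketch). A sleeping thread does not hold the GIL.
    time.sleep(seconds)


def run_sequentially(count, seconds):
    # Run the waits one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(count):
        wait_for_io(seconds)
    return time.perf_counter() - start


def run_threaded(count, seconds):
    # Run each wait in its own thread and return the elapsed time.
    start = time.perf_counter()
    threads = [
        threading.Thread(target=wait_for_io, args=(seconds,))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


sequential = run_sequentially(4, 0.2)
threaded = run_threaded(4, 0.2)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The threaded run should take roughly as long as a single wait, because the waits overlap. Replacing the sleep with a pure-Python computation would typically remove that advantage on CPython.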
Architecting your program to use threading can also provide gains in design clarity. Most of the examples you’ll learn about in this tutorial are not necessarily going to run faster because they use threads. Using threading in them helps to make the design cleaner and easier to reason about.
So, let’s stop talking about threading and start using it!
Starting a Thread
Now that you’ve got an idea of what a thread is, let’s learn how to make one. The Python standard library provides threading, which contains most of the primitives you’ll see in this article. Thread, in this module, nicely encapsulates threads, providing a clean interface to work with them.
When you create a Thread, you pass it a function and a list or tuple containing the arguments to that function. In this case, you’re telling the Thread to run thread_function() and to pass it 1 as an argument.
For this article, you’ll use sequential integers as names for your threads. There is threading.get_ident(), which returns an integer identifier that is unique among the threads currently alive, but these are usually neither short nor easily readable.
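As a quick illustration of why plain integers make friendlier names, the sketch below records what threading.get_ident() returns inside three threads. The report() helper and the use of threading.Barrier to keep all three threads alive at the same moment are choices made for this example only.

```python
import threading

results = {}
# The barrier keeps all three threads alive together, so that their
# identifiers are guaranteed to be distinct from one another.
barrier = threading.Barrier(3)


def report(name):
    # Record the identifier the interpreter assigned to this thread.
    results[name] = threading.get_ident()
    barrier.wait()


threads = [threading.Thread(target=report, args=(i,)) for i in range(3)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

for name, ident in sorted(results.items()):
    print(f"Thread {name}: get_ident() returned {ident}")
```

The exact values depend on the platform and change from run to run, which is why the article sticks with 0, 1, 2 and so on.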
thread_function() itself doesn’t do much. It simply logs some messages with a time.sleep() in between them.
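Putting the pieces described above together, a program along these lines would create and start a single thread. This is a sketch reconstructed from the description and from the log messages quoted elsewhere in this article, so treat the exact layout as an assumption rather than the original listing.

```python
import logging
import threading
import time


def thread_function(name):
    # Log a message, wait two seconds, then log again.
    logging.info("Thread %s: starting", name)
    time.sleep(2)
    logging.info("Thread %s: finishing", name)


if __name__ == "__main__":
    log_format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=log_format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    logging.info("Main    : before creating thread")
    # target is the function to run; args holds the arguments for it.
    x = threading.Thread(target=thread_function, args=(1,))
    logging.info("Main    : before running thread")
    x.start()
    logging.info("Main    : wait for the thread to finish")
    # x.join()
    logging.info("Main    : all done")
```

Note that creating the Thread object does not run anything; the function only begins executing once .start() is called.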
Daemon Threads
In computer science, a daemon is a process that runs in the background.
Python threading has a more specific meaning for daemon. A daemon thread will shut down immediately when the program exits. One way to think about these definitions is to consider the daemon thread a thread that runs in the background without worrying about shutting it down.
If a program is running Threads that are not daemons, then the program will wait for those threads to complete before it terminates. Threads that are daemons, however, are just killed wherever they are when the program is exiting.
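One way to observe this difference is to run the same small program twice in a child interpreter, once with a regular thread and once with a daemon thread, and compare what each prints before exiting. The sketch below does that with subprocess; the CHILD_TEMPLATE script and the run_child() helper are hypothetical names for this illustration, and the short sleeps are only there to keep the demo quick.

```python
import subprocess
import sys
import textwrap

# Source of the child program; {daemon} is filled in with True or False.
CHILD_TEMPLATE = textwrap.dedent("""
    import threading, time

    def worker():
        print("Thread: starting", flush=True)
        time.sleep(0.5)
        print("Thread: finishing", flush=True)

    threading.Thread(target=worker, daemon={daemon}).start()
    time.sleep(0.1)  # give the worker time to print its first line
    print("Main: all done", flush=True)
""")


def run_child(daemon):
    # Run the child program in a fresh interpreter and return its output lines.
    code = CHILD_TEMPLATE.format(daemon=daemon)
    completed = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return completed.stdout.splitlines()


regular_output = run_child(daemon=False)
daemon_output = run_child(daemon=True)
print("regular:", regular_output)
print("daemon: ", daemon_output)
```

With the regular thread, the child waits and the finishing line appears. With the daemon thread, the child exits as soon as the main code ends, so that line never gets printed.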
Let’s look a little more closely at the output of your program above. The last two lines are the interesting bit. When you run the program, you’ll notice that there is a pause (of about 2 seconds) after __main__ has printed its all done message and before the thread is finished.
This pause is Python waiting for the non-daemonic thread to complete. When your Python program ends, part of the shutdown process is to clean up the threading routine.
If you look at the source for Python threading, you’ll see that threading._shutdown() walks through all of the running threads and calls .join() on every one that does not have the daemon flag set.
So your program waits to exit because the thread itself is waiting in a sleep. As soon as it has completed and printed the message, .join() will return and the program can exit.
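The idea behind that shutdown step can be approximated in a few lines using only public functions. This is a rough sketch and not the actual CPython source, whose details differ; join_non_daemon_threads() and the two demo threads are hypothetical names for this illustration.

```python
import threading
import time


def join_non_daemon_threads():
    # Wait for every live thread that is not a daemon, skipping the
    # calling thread and the main thread, and return the joined names.
    skip = {threading.current_thread(), threading.main_thread()}
    joined = []
    for thread in threading.enumerate():
        if thread in skip or thread.daemon:
            continue
        thread.join()
        joined.append(thread.name)
    return joined


stop = threading.Event()
regular = threading.Thread(target=time.sleep, args=(0.2,), name="regular")
# The daemon thread blocks until stop is set, so it outlives the join loop.
background = threading.Thread(target=stop.wait, name="background", daemon=True)
regular.start()
background.start()

joined_names = join_non_daemon_threads()
print("joined:", joined_names)
print("daemon still alive:", background.is_alive())
stop.set()  # let the daemon thread finish cleanly
```

The loop returns once the regular thread is done, while the daemon thread is simply left running, which mirrors the pause-then-exit behavior described above.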
Frequently, this behavior is what you want, but there are other options available to us. Let’s first repeat the program with a daemon thread. You do that by changing how you construct the Thread, adding the daemon=True flag:
x = threading.Thread(target=thread_function, args=(1,), daemon=True)
When you run the program now, you should see this output:
$ ./daemon_thread.py
Main : before creating thread
Main : before running thread
Thread 1: starting
Main : wait for the thread to finish
Main : all done
The difference here is that the final line of the output is missing. thread_function() did not get a chance to complete. It was a daemon thread, so when __main__ reached the end of its code and the program wanted to finish, the daemon was killed.
join() a Thread
Daemon threads are handy, but what about when you want to wait for a thread to stop? What about when you want to do that and not exit your program? Now let’s go back to your original program and look at that commented out line twenty:
# x.join()
To tell one thread to wait for another thread to finish, you call .join(). If you uncomment that line, the main thread will pause and wait for the thread x to complete running.
Did you test this on the code with the daemon thread or the regular thread? It turns out that it doesn’t matter. If you .join() a thread, that statement will wait until either kind of thread is finished.
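A small experiment can confirm that .join() treats both kinds of thread the same way. In this sketch, run_and_join() is a hypothetical helper that starts one thread, joins it, and reports whether the thread's work completed; the 0.2 second sleep is an arbitrary stand-in for real work.

```python
import threading
import time


def run_and_join(daemon):
    # Start one thread (daemon or regular), wait for it with .join(),
    # and return what it recorded plus whether it is still alive.
    finished = []

    def thread_function(name):
        time.sleep(0.2)
        finished.append(name)

    x = threading.Thread(target=thread_function, args=(1,), daemon=daemon)
    x.start()
    x.join()  # blocks until x has finished, daemon or not
    return finished, x.is_alive()


for daemon in (False, True):
    finished, alive = run_and_join(daemon)
    print(f"daemon={daemon}: finished={finished}, still alive={alive}")
```

In both cases the thread has recorded its name by the time .join() returns, so the daemon flag only matters when nothing waits for the thread explicitly.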
Working With Many Threads
The example code so far has only been working with two threads: the main thread and one you started with the threading.Thread object.
Frequently, you’ll want to start a number of threads and have them do interesting work. Let’s start by looking at the harder way of doing that, and then you’ll move on to an easier method.
The harder way of starting multiple threads is the one you already know:
import logging
import threading
import time

def thread_function(name):
    logging.info("Thread %s: starting", name)
    time.sleep(2)
    logging.info("Thread %s: finishing", name)

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    threads = list()
    for index in range(3):
        logging.info("Main : create and start thread %d.", index)
        x = threading.Thread(target=thread_function, args=(index,))
        threads.append(x)
        x.start()

    for index, thread in enumerate(threads):
        logging.info("Main : before joining thread %d.", index)
        thread.join()
        logging.info("Main : thread %d done", index)
This code uses the same mechanism you saw above to start a thread: create a Thread object and then call .start(). The program keeps a list of Thread objects so that it can wait for them later using .join().
Running this code multiple times will likely produce some interesting results. Here’s an example output from my machine:
$ ./multiple_threads.py
Main : create and start thread 0.
Thread 0: starting
Main : create and start thread 1.
Thread 1: starting
Main : create and start thread 2.
Thread 2: starting
Main : before joining thread 0.
Thread 2: finishing
Thread 1: finishing
Thread 0: finishing
Main : thread 0 done
Main : before joining thread 1.
Main : thread 1 done
Main : before joining thread 2.
Main : thread 2 done
If you walk through the output carefully, you’ll see all three threads getting started in the order you might expect, but in this case they finish in the opposite order! Multiple runs will produce different orderings. Look for the Thread x: finishing message to tell you when each thread is done.
The order in which threads are run is determined by the operating system and can be quite hard to predict. It may (and likely will) vary from run to run, so you need to be aware of that when you design algorithms that use threading.
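To see the ordering vary without reading log timestamps, you could have each thread record its name when it finishes. The sketch below does that with random sleep times standing in for real work; finish_order and the sleep range are choices made for this illustration, and it relies on list.append() being safe to call from several threads in CPython.

```python
import random
import threading
import time

finish_order = []


def thread_function(name):
    # A random sleep stands in for work of unpredictable duration.
    time.sleep(random.uniform(0.05, 0.2))
    # In CPython a single list.append() call is safe across threads.
    finish_order.append(name)


threads = []
for index in range(3):
    x = threading.Thread(target=thread_function, args=(index,))
    threads.append(x)
    x.start()

# Joining in creation order still waits for every thread,
# whatever order they actually finish in.
for thread in threads:
    thread.join()

print("start order: ", [0, 1, 2])
print("finish order:", finish_order)
```

Every thread always shows up in finish_order once all the joins have returned, but the sequence itself can change between runs, so code that depends on a particular order needs explicit coordination.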
Fortunately, Python gives you several primitives that you’ll look at later to help coordinate threads and get them running together. Before that, let’s look at how to make managing a group of threads a bit easier.