Generators in Python
Hi. In this article, we'll see how Generators work in Python.
Before looking at Generators, there is another Python concept you should be familiar with: Iterators.
An Iterator is an object which traverses the elements of an iterable object. The iterable could be any sequence or a file object. Python has built-in syntax support for iterators, which is why the for loop works with sequences (strings, lists etc.), dictionaries and file objects.
Here's a simple example of iterators:
>>> numbersList = [1, 2, 3]
>>> for number in numbersList:
...     print(number)
...
1
2
3
Sequences and file objects in Python have an __iter__() magic method. When a sequence is used in a for loop, Python calls the builtin iter() function on it, which in turn calls the sequence's __iter__() method and returns an iterator object. The iterator object has a __next__() method, which returns the next element of the sequence. When no element is left and __next__() is called again, the iterator object raises a StopIteration exception. Together, __iter__() and __next__() make up the Iterator Protocol.
>>> numbersListIterator = iter([1, 2, 3])
>>> numbersListIterator
<list_iterator object at 0x03040830>
>>> numbersListIterator.__next__()
1
>>> numbersListIterator.__next__()
2
>>> numbersListIterator.__next__()
3
>>> numbersListIterator.__next__()
Traceback (most recent call last):
    numbersListIterator.__next__()
StopIteration
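As a rough sketch of what a for loop does with the protocol described above, the loop can be written out by hand. The helper name manual_for is hypothetical and exists only for this illustration; it is not how the interpreter is actually implemented.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using the iterator protocol."""
    collected = []
    iterator = iter(iterable)      # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break                  # the loop ends quietly when items run out
        collected.append(item)
    return collected


print(manual_for([1, 2, 3]))  # prints [1, 2, 3]
```

This is why a for loop never shows the StopIteration exception: the loop treats it as the signal to stop.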
Once you have an iterator object, you can iterate over the values in the following 3 ways:
- Using the __next__() magic method of the iterator object, as done above. This is the method that the builtin next() function calls behind the scenes.
- Using the builtin next() function explicitly, such as next(numbersListIterator).
- Using a for loop, such as for number in numbersListIterator: print(number).
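To make the three ways above concrete, here is a minimal hand-written iterator. The class name CountUpTo and its attributes are made up for this sketch; any object with __iter__() and __next__() methods would behave the same way.

```python
class CountUpTo:
    """A hand-written iterator that produces 1, 2, ..., limit."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current


counter = CountUpTo(3)
first = counter.__next__()   # way 1: calling __next__() directly
second = next(counter)       # way 2: the builtin next() function
rest = []
for number in counter:       # way 3: a for loop picks up where we left off
    rest.append(number)

print(first, second, rest)   # prints: 1 2 [3]
```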
Generators
A Generator is an object in Python which returns a sequence of elements, one at a time. It is created by calling a Generator function, which is any function that has the yield keyword in its body. Basic usage example:
>>> def generateFamousDetectives():
...     print("Famous Detective #1:", end = " ")
...     yield "Sherlock Holmes"
...     print("Famous Detective #2:", end = " ")
...     yield "Hercule Poirot"
...     print("Famous Detective #3:", end = " ")
...     yield "Nancy Drew"
...
>>> generateFamousDetectives
<function generateFamousDetectives at 0x030303D8>
>>> generateFamousDetectives()
<generator object generateFamousDetectives at 0x030290F8>
>>> generatorObjectOne = generateFamousDetectives()
>>> generatorObjectOne.__next__()
Famous Detective #1: 'Sherlock Holmes'
>>> generatorObjectOne.__next__()
Famous Detective #2: 'Hercule Poirot'
>>> generatorObjectOne.__next__()
Famous Detective #3: 'Nancy Drew'
>>> generatorObjectOne.__next__()
Traceback (most recent call last):
    generatorObjectOne.__next__()
StopIteration
The generator function generateFamousDetectives returns a generator object, which we can assign to a variable, such as generatorObjectOne. Once we have this object, there are 3 ways to fetch elements from it, just like with iterators:
- Using the __next__() magic method of the generator object, as done above. This is the method that the builtin next() function calls behind the scenes.
- Using the builtin next() function explicitly, such as next(generatorObjectOne).
- Using a for loop, such as for detective in generatorObjectOne: print(detective).
How it works:
The print statements are not essential to the code; they are there only to demonstrate how the generator works. When the generator function is called (generatorObjectOne = generateFamousDetectives()), it returns a generator object without triggering the execution of the function body. When the __next__() method is called for the first time, execution runs up to the first yield statement and pauses there. With each further call to __next__(), execution resumes from where it paused and runs until the next yield statement. This goes on until all the values have been produced. When the values are exhausted, a StopIteration exception is raised.
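The pause-and-resume behaviour described above can be observed directly by recording which parts of the function body have run. The names events and generate_two_values are invented for this sketch.

```python
events = []  # records which parts of the function body have run


def generate_two_values():
    events.append("started")
    yield "first"
    events.append("resumed")
    yield "second"
    events.append("finished")


generator = generate_two_values()
print(events)           # [] : calling the function has not run any of its body
print(next(generator))  # first
print(events)           # ['started']
print(next(generator))  # second
print(events)           # ['started', 'resumed']
print(list(generator))  # [] : no values left, but the body runs to its end
print(events)           # ['started', 'resumed', 'finished']
```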
The word Generator can be interpreted in two ways. It can be understood to mean the function that is generating the values one by one i.e. generateFamousDetectives(). And it can also be understood to mean the generator object that the generator function is returning i.e. generatorObjectOne. The latter is the correct one. You can make the distinction by using the terms generator function and generator.
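If you want to check the distinction in code, the standard library's inspect module can tell the two apart. A small sketch, with generate_letters as a made-up example function:

```python
import inspect


def generate_letters():
    yield "a"
    yield "b"


letters = generate_letters()

print(inspect.isgeneratorfunction(generate_letters))  # True
print(inspect.isgenerator(generate_letters))          # False
print(inspect.isgenerator(letters))                   # True
print(type(letters).__name__)                         # generator
```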
So, a generator is similar to other iterators in the sense that it has a __next__() method, can be passed to the builtin next() function and can be used in a for loop. In fact, a generator is itself an iterator. There is a major difference in where the values come from, though. A generator runs the generator function up to the next yield statement and computes each element only when it is requested, so the entire list of elements never has to be stored in memory. An iterator over a container such as a list, on the other hand, walks over elements that already exist in memory and returns them one at a time.
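A small sketch of how the two relate: a generator passes the same iterator checks as a list iterator, but a list iterator walks a list that already exists in memory, while the generator computes each value on request. The function name count_to is illustrative only.

```python
from collections.abc import Iterator


def count_to(limit):
    number = 0
    while number <= limit:
        yield number
        number += 1


numbers = [0, 1, 2, 3]
list_iterator = iter(numbers)  # walks a list that already exists in memory
generator = count_to(3)        # computes each value only when asked

print(isinstance(list_iterator, Iterator))     # True
print(isinstance(generator, Iterator))         # True
print(iter(generator) is generator)            # True
print(list(list_iterator) == list(generator))  # True
```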
The above example is a fairly simple one. In practice, generators more commonly appear in forms like these:
>>> def countTo3():
...     for number in range(4):
...         yield number
...
>>> generatorObjectTwo = countTo3()
>>> generatorObjectTwo.__next__()
0
>>> generatorObjectTwo.__next__()
1
>>> generatorObjectTwo.__next__()
2
>>> generatorObjectTwo.__next__()
3
>>> generatorObjectTwo.__next__()
Traceback (most recent call last):
  File "<pyshell#118>", line 1, in <module>
    generatorObjectTwo.__next__()
StopIteration
>>> generatorObjectThree = countTo3()
>>> for number in generatorObjectThree:
...     print(number)
...
0
1
2
3
>>> sum(countTo3())
6
>>> def generateNumbers(upperLimit):
...     generatedNumber = 0
...     while generatedNumber <= upperLimit:
...         yield generatedNumber
...         generatedNumber += 1
...
>>> generateNumbers
<function generateNumbers at 0x0318D0C0>
>>> generateNumbers(7)
<generator object generateNumbers at 0x0317DF08>
>>> myGenerator = generateNumbers(7)
>>> myGenerator.__next__()
0
>>> myGenerator.__next__()
1
>>> next(myGenerator)
2
>>> next(myGenerator)
3
>>> next(myGenerator)
4
>>> next(myGenerator)
5
>>> next(myGenerator)
6
>>> next(myGenerator)
7
>>> next(myGenerator)
Traceback (most recent call last):
  File "<pyshell#52>", line 1, in <module>
    next(myGenerator)
StopIteration
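Because values are computed only on request, a generator does not even need an upper limit. The sketch below is an assumption-level variation on generateNumbers, not something from the examples above: generate_even_numbers is a made-up name, and itertools.islice from the standard library is used to take just the first few values.

```python
from itertools import islice


def generate_even_numbers():
    """Produce 0, 2, 4, ... without end."""
    number = 0
    while True:
        yield number
        number += 2


# islice stops asking for values after five, so the loop above never runs away.
first_five = list(islice(generate_even_numbers(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```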
Generator Expressions
If you have ever tried to make a tuple comprehension, like I did, you would have ended up making a generator object instead. There are no tuple comprehensions in Python. Generator expressions look like list comprehensions, but are surrounded by round brackets/parentheses instead of square brackets.
>>> generatorObjectOne = (x for x in range(5))
>>> generatorObjectOne
<generator object <genexpr> at 0x03031738>
>>> for number in generatorObjectOne:
...     print(number)
...
0
1
2
3
4
So, these expressions work just like comprehensions, and the difference again is in the way the elements are produced. A list comprehension builds the entire list in memory at once and returns it whole, whereas the equivalent generator expression produces one element at a time and hence does not store the entire sequence in memory. A generator expression can be assigned to a variable, which then holds the generator object and can be used like any other generator. Here is a demonstration:
>>> listOf10Numbers = [number for number in range(1, 11)]
>>> listOf10Numbers
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> generatorOf10Numbers = ( number for number in range(1, 11) )
>>> generatorOf10Numbers.__next__()
1
>>> for number in generatorOf10Numbers:
...     print(number)
...
2
3
4
5
6
7
8
9
10
The above example only has 10 numbers to output. With thousands or millions of numbers, the generator is the wiser choice for memory, since it never holds more than one element at a time.
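The memory difference can be checked roughly with sys.getsizeof from the standard library. This is only a sketch: the exact byte counts depend on the Python version and platform, and for a list getsizeof reports the list object itself without the integers it refers to, so the real gap is larger than what this shows.

```python
import sys

list_of_numbers = [number for number in range(1, 100_001)]
generator_of_numbers = (number for number in range(1, 100_001))

list_size = sys.getsizeof(list_of_numbers)
generator_size = sys.getsizeof(generator_of_numbers)
print(list_size > generator_size)  # True

# Both give the same total; the generator just never builds the full list.
print(sum(list_of_numbers) == sum(generator_of_numbers))  # True
```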
Here's another interesting example of generator expressions:
>>> def computePythagorasTripletsTill(upperBound):
...     '''This function returns a generator of Pythagoras triplets lying between 0 and the number provided.'''
...     return ( (x, y, z) for z in range(upperBound) for y in range(1, z) for x in range(1, y) if x*x + y*y == z*z )
...
>>> for triplet in computePythagorasTripletsTill(20):
...     print(triplet)
...
(3, 4, 5)
(6, 8, 10)
(5, 12, 13)
(9, 12, 15)
(8, 15, 17)
The above generator expression is equivalent to these nested loops:
>>> for z in range(20):
...     for y in range(1, z):
...         for x in range(1, y):
...             if x*x + y*y == z*z:
...                 print( (x, y, z ) )
...
(3, 4, 5)
(6, 8, 10)
(5, 12, 13)
(9, 12, 15)
(8, 15, 17)
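The same nested loops can also be turned into a generator function by replacing the print call with yield. This is a sketch of one way to write it; the name pythagorean_triplets_below is made up for the illustration.

```python
def pythagorean_triplets_below(upper_bound):
    """Yield (x, y, z) with x < y < z < upper_bound and x*x + y*y == z*z."""
    for z in range(upper_bound):
        for y in range(1, z):
            for x in range(1, y):
                if x * x + y * y == z * z:
                    yield (x, y, z)


print(list(pythagorean_triplets_below(20)))
# [(3, 4, 5), (6, 8, 10), (5, 12, 13), (9, 12, 15), (8, 15, 17)]
```

Unlike the plain loops, this version hands each triplet to the caller as soon as it is found, so the caller can stop early without computing the rest.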
So, the main purpose of Generators is to produce values lazily, one at a time, so that the whole sequence never needs to be stored in memory.
That's it for this one. I hope you have gained a working knowledge of Generators in Python. It is a great language-specific feature, and can work wonders when mastered. Cheers!