Hi, and welcome back to Python 101. So, you have become well versed with strings and numbers, and now you are curious about more complex variables that contain more than one value. If you are not so curious, you probably haven't yet run into programs where you have to store and retrieve a lot of data. Either way, you will find that these complex variables are an integral part of programming languages in general, not just Python. In this chapter, we'll look at data structures in Python, i.e. collections of strings, integers and other values.
Data Structures in Python
Python data structures are very intuitive, and each of them has a multitude of operations to get the job done. You will also get some pointers on how to decide which data structure to use in a given situation. We will be going over a large number of new things in this chapter, but most of it is intuitive and fairly repetitive. Once you get through Lists, you will notice that only a few functions and operations change in the other data structures. So, don't be intimidated by the length of this chapter. You don't need to memorize all of it: the method names are descriptive, and there is a function for almost everything. Here is what we will be covering in this chapter:
- Programs you will find a piece of cake (more or less) once you are through this webpage: Shuffle a deck of cards (using lists), Scrabble Score (using dictionaries)
- Lists
- Tuples
- Dictionaries
- Sets
- Making the correct choice: which data structure to choose in a situation
- Other less used data types: array, defaultdict, deque, heapq, queue
- Review: Comparison of different data structures in terms of mutability, orderliness, sortability, reversibility, slice-ability, comprehensions, accessibility using index operators, merging using + operator
- Review: Operations common to all data structures
- A Few General Things: max() and min() in a dictionary, hashing in Python, interchanging data structures, using builtin help() and dir() functions, comparing sequences in Python: Lexicographical ordering
- On the agenda in next chapter
- Exercises
Programs to whet your appetite
Here's what you will be able to do by the end of this chapter:
#1 Shuffle a deck of cards
## SHUFFLING A DECK OF CARDS
## A deck of cards has 13 cards each of 4 suits: heart(♥), spade(♠), diamond(♦), club(♣).
## THOUGHT PROCESS: Construct an unshuffled deck by using two lists, one for suits, another for cardValues -> Shuffle the deck by iterating over all cards one by one using their indexes, and swapping the card at each index with the card at a random index.

import random

suits = ["♠", "♥", "♦", "♣"]
cardValues = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]

# initializing an empty deck
deck = []

# CONSTRUCTING AN UNSHUFFLED DECK
for suit in suits:
    for value in cardValues:
        deck.append(value + suit)

# printing the unshuffled deck
print("The original deck of cards:\n\n", deck)

# SHUFFLING CARDS: iterate over all cards one by one using their indexes,
# swapping the card at each index with the card at a random index
for index in range(0, len(deck)):
    randomCardForSwitching = random.randrange(len(deck))
    # Swapping the two cards
    temporaryCard = deck[index]
    deck[index] = deck[randomCardForSwitching]
    deck[randomCardForSwitching] = temporaryCard

# printing the shuffled deck
print("\nThe shuffled deck of cards:\n\n", deck)
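For comparison, the standard library's random module also provides shuffle(), which shuffles a list in place. The sketch below rebuilds the same deck with a list comprehension (covered later in this chapter) and shuffles a copy of it. The names build_deck and shuffled are illustrative only and are not part of the program above.

```python
import random

SUITS = ["♠", "♥", "♦", "♣"]
CARD_VALUES = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]


def build_deck():
    """Return an unshuffled 52-card deck as a list of strings such as '2♠'."""
    return [value + suit for suit in SUITS for value in CARD_VALUES]


deck = build_deck()
shuffled = deck[:]          # slice copy, so the original order is kept for comparison
random.shuffle(shuffled)    # shuffles the list in place and returns None

print(len(shuffled))                      # 52
print(sorted(shuffled) == sorted(deck))   # True: same cards, only the order differs
```

Writing the swap loop by hand is a good exercise in list indexing; in everyday code, random.shuffle() is the usual choice.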
#2 Scrabble Score
## SCRABBLE SCORE
## Scrabble is a board game in which you get points for making words. Players get 7 tiles, which have letters printed on them, along with points. Each letter has a fixed number of points: letters used more often in the English language are worth fewer points ('S', 'T' and the vowels add 1 to the score), and letters used less often are worth more points, based on how rarely they are used ('Y' adds 4 points, 'X' adds 8 points, 'Q' adds 10). So, if you make the word 'say', you get 1 + 1 + 4 i.e. 6 points. The player with the most points when the game ends wins. Following are the points associated with each letter: A: 1, B: 3, C: 3, D: 2, E: 1, F: 4, G: 2, H: 4, I: 1, J: 8, K: 5, L: 1, M: 3, N: 1, O: 1, P: 3, Q: 10, R: 1, S: 1, T: 1, U: 1, V: 4, W: 4, X: 8, Y: 4, Z: 10
## THOUGHT PROCESS: Create a dictionary of letters (in uppercase or lowercase) along with their corresponding point values -> Prompt the user for a word -> Convert the word to uppercase or lowercase, as per the case of the letters in the dictionary -> Initialize the score variable to 0 -> Iterate over each character of the entered word, get the number of points it is worth and add it to the score variable -> Output the score of the entered word.

# Create a dictionary of letters along with their corresponding point values
points = {"A": 1, "B": 3, "C": 3, "D": 2, "E": 1, "F": 4, "G": 2, "H": 4, "I": 1,
          "J": 8, "K": 5, "L": 1, "M": 3, "N": 1, "O": 1, "P": 3, "Q": 10, "R": 1,
          "S": 1, "T": 1, "U": 1, "V": 4, "W": 4, "X": 8, "Y": 4, "Z": 10}

# Prompt the user for a word
word = input("Enter a word: ")

# Convert the word to uppercase and initialize the score variable to 0
word = word.upper()
score = 0

# Iterate over each character of the entered word, get the number of points it is worth and add it to the score variable
for letter in word:
    score = score + points[letter]

# Print the accumulated score
print(word, "is worth", score, "points")
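The same idea can be wrapped in a reusable function. This is a minimal sketch, assuming the standard English Scrabble tile values; the names TILE_GROUPS, POINTS and scrabble_score are made up for this illustration. It uses get() with a default of 0, so characters that have no tile (spaces, apostrophes, digits) are ignored instead of raising a KeyError.

```python
# Standard English Scrabble tile values, grouped by score (illustrative layout).
TILE_GROUPS = {
    1: "AEILNORSTU",
    2: "DG",
    3: "BCMP",
    4: "FHVWY",
    5: "K",
    8: "JX",
    10: "QZ",
}

# Flatten the groups into a letter -> points lookup dictionary.
POINTS = {letter: value for value, letters in TILE_GROUPS.items() for letter in letters}


def scrabble_score(word):
    """Return the total tile value of word; characters without a tile score 0."""
    return sum(POINTS.get(letter, 0) for letter in word.upper())


print(scrabble_score("say"))    # 6
print(scrabble_score("quiz"))   # 22
print(scrabble_score("it's"))   # 3
```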
Lists [ ]
Lists are ordered groups of elements/values/items. These elements need not be of the same type: they can be a mixture of characters, numbers, strings, lists and other data structures. Lists are denoted by square brackets, i.e. [ ]
An overview of everything related to lists:
- CREATING A LIST: builtin list() or []
- ADDING AN ELEMENT TO AN EXISTING LIST: append()
- ADDING SEVERAL MEMBERS TO AN EXISTING LIST: extend()
- ADDING AN ELEMENT AT A GIVEN POSITION: insert()
- REMOVING AN ELEMENT FROM AN EXISTING LIST USING ITEM VALUE: remove()
- REMOVING THE ELEMENTS USING INDEXES: pop()
- SLICING A LIST: [:]
- SORTING A LIST: sort()
- REVERSING A LIST: reverse()
- ITERATING OVER EACH ELEMENT IN A LIST: a for loop
- MEMBERSHIP TESTS IN LISTS: the membership operators in and not in
- LENGTH OF A LIST: builtin len() function
- ORDERLINESS OF A LIST: lists are ordered
- MERGING LISTS: using the + operator
- CLEARING A LIST: clear()
- MUTABILITY OF LISTS
- PACKING AND UNPACKING A LIST
- LIST COMPREHENSIONS
- GETTING MAXIMUM AND MINIMUM VALUES FROM A LIST
- JOINING THE ELEMENTS OF A LIST WITH THE PROVIDED SYMBOL/STRING: join()
- COUNTING THE NUMBER OF OCCURRENCES OF AN ELEMENT IN A LIST: count()
- GETTING THE INDEX OF FIRST OCCURRENCE OF AN ELEMENT: index()
- REMOVING A LIST FROM MEMORY: using the del keyword
#########################################
# CREATING A LIST: builtin list() or [] #
#########################################
list1 = ["one", 2, "three"]
print(type(list1))    # <class 'list'>

# ANOTHER WAY TO CREATE A LIST: using list(). You can create an empty list with 'list()', or you can initialize it from another list with 'list(list1)'.
list2 = list()        # creates an empty list
print(list2)          # []
list3 = list(list1)   # creates a list 'list3' and initializes it with the values in 'list1'

###################################################
# ADDING AN ELEMENT TO AN EXISTING LIST: append() #
###################################################
list1 = ["one", 2, "three"]
list1.append("IV")
print(list1)          # ['one', 2, 'three', 'IV']

########################################################
# ADDING SEVERAL MEMBERS TO AN EXISTING LIST: extend() #
########################################################
# extend() takes a single argument: you can pass an existing list, or write out a new list of comma-separated values inside square brackets.
list1 = ['one', 2, 'three']
list2 = list()
list2.extend(["integers", "numbers"])
print(list2)          # ['integers', 'numbers']
list2.extend(list1)   # ANOTHER WAY OF ADDING SEVERAL MEMBERS TO AN EXISTING LIST
print(list2)          # ['integers', 'numbers', 'one', 2, 'three']

###################################################
# ADDING AN ELEMENT AT A GIVEN POSITION: insert() #
###################################################
numbers = ["zero", "one", "two", "four"]
numbers.insert(3, "three")
print(numbers)        # ['zero', 'one', 'two', 'three', 'four']

########################################################################
# REMOVING AN ELEMENT FROM AN EXISTING LIST USING ITEM VALUE: remove() #
########################################################################
list2 = ['integers', 'numbers', 'one', 2, 'three']
list2.remove("integers")
print(list2)          # ['numbers', 'one', 2, 'three']

##############################################
# REMOVING THE ELEMENTS USING INDEXES: pop() #
##############################################
list2 = ['integers', 'numbers', 'one', 2, 'three']
list2.pop(2)
print(list2)          # ['integers', 'numbers', 2, 'three']
# If we don't specify any index, the last item in the list will be removed.
list2.pop()
print(list2)          # ['integers', 'numbers', 2]

#######################
# SLICING A LIST: [:] #
#######################
# Slicing in lists works the same way as it does in strings. The indices start from 0. Slicing DOES NOT alter the original list, unless you assign the sliced list back to the original variable.
list1 = ["one", 2, "three", "IV"]
# [:] gives a copy of the whole list
print(list1[:])       # ['one', 2, 'three', 'IV']
# [x:] gives a list starting from the element at index x till the end of the list
print(list1[1:])      # [2, 'three', 'IV']
# [:x] gives a list from the beginning of the original list till the element at index (x - 1).
print(list1[:3])      # ['one', 2, 'three']
# [x:y] gives a list from the element at index x till the element at index (y - 1).
print(list1[1:3])     # [2, 'three']
# [x:y:z] gives a list from the element at index x till the element at index (y - 1), picking every zth element.
list10 = list(range(0, 10))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list10[::3])    # [0, 3, 6, 9]
# A negative step size also works. In this case, the element at the first index you specify has to be to the right of the element at the second index in the original list. Otherwise, you get an empty list. An exception to this is [::-z], which walks the list from the rear end by default.
print(list10[10:0:-3])   # [9, 6, 3]
print(list10[::-3])      # [9, 6, 3, 0]
# When the second index is omitted and the step is negative, the slice runs all the way through index 0 (the same as a second index of -(len + 1), i.e. -11 here). With an explicit second index of 0, as in list10[10:0:-3], the slice stops at index 1, so element 0 is left out.
# You can of course use negative indexes as well, I'll leave that to you to explore. Comment below if you run into any problem.

##########################
# SORTING A LIST: sort() #
##########################
# Ascending order by default. sort() sorts the list in place and returns None, so print the list itself, not the return value of sort().
list3 = [4, 5, 2, 4, 1]
list3.sort()
print(list3)          # [1, 2, 4, 4, 5]
list3 = ["apple", "ball", "cat", "dog", "dark"]
list3.sort()
print(list3)          # ['apple', 'ball', 'cat', 'dark', 'dog']
# Strings are sorted starting from the first character onwards. If the first characters of two strings are the same, the next characters are compared, and so on. For example, 'dark' comes before 'dog'.
# If you try to sort a list consisting of mixed types, say strings and numbers, Python will raise a TypeError because those types cannot be compared with each other.

###############################
# REVERSING A LIST: reverse() #
###############################
# Reverses a list in place. Used after sort(), it gives you the list in descending order (sort(reverse=True) does the same in one step).
list4 = [1, 4, 6, 9]
list4.reverse()       # reverse() works in place and returns None
print(list4)          # [9, 6, 4, 1]

#####################################################
# ITERATING OVER EACH ELEMENT IN A LIST: a for loop #
#####################################################
list3 = ["apple", "ball", "cat", 10, "dog"]
for item in list3:
    print(item)
## OUTPUT
# apple
# ball
# cat
# 10
# dog

#########################################################################
# MEMBERSHIP TESTS IN LISTS: the membership operators 'in' and 'not in' #
#########################################################################
>>> animals = ['cat', 'dog', 'snake', 'elephant']
>>> 'cat' in animals
True
>>> 'cheetah' not in animals
True

############################################
# LENGTH OF A LIST: builtin len() function #
############################################
list5 = list(range(11))
print(len(list5))     # 11

############################################
# ORDERLINESS OF A LIST: lists are ordered #
############################################
# Lists are ordered, meaning that elements are stored in a list in the order you entered them in. This is in contrast with sets (and with dictionaries before Python 3.7); we'll get to that.
list4 = ["apple", "ball", "cat", 10, "dog"]
print(list4)          # ['apple', 'ball', 'cat', 10, 'dog']

#######################################
# MERGING LISTS: using the + operator #
#######################################
list1 = ["ted", "robin"]
list2 = ["marshall", "lily"]
list3 = ["barney"]
list4 = list1 + list2 + list3
print(list4)          # ['ted', 'robin', 'marshall', 'lily', 'barney']

############################
# CLEARING A LIST: clear() #
############################
alphabet = ['a', 'b', 'c', 'd']
alphabet.clear()
print(alphabet)       # []

#######################
# MUTABILITY OF LISTS #
#######################
# Unlike strings, lists are mutable. Once a list is declared, you can assign a new value to any of its elements using indexes.
sentence = ["You", "cannot", "change", "me."]
sentence[1] = "can"
print(sentence)       # ['You', 'can', 'change', 'me.']

################################
# PACKING AND UNPACKING A LIST #
################################
# Much like strings, we can obtain individual elements from a list and assign them to separate variables for our use. We can also do the converse, by assigning multiple variables to a single name. If you specify an incorrect number of variables while unpacking, Python will raise a ValueError.
# Unpacking
sentence = ["You", "can", "change", "me."]
word1, word2, word3, word4 = sentence
print(word1)          # You
print(word3)          # change
# Packing
newSentence = word1, word2, word3, word4
print(newSentence)    # ('You', 'can', 'change', 'me.')
# NOTE that packing variables this way doesn't actually produce a list. To confirm this, type in `print(type(newSentence))`. It produces a tuple. We will see what tuples are, in due time.

#######################
# LIST COMPREHENSIONS #
#######################
# The third and final way to create a list in Python is by using comprehensions. Comprehension is a generic concept, and applies to several data structures. In a comprehension, we specify how a data structure will be populated. For example, in a list comprehension, you specify which elements should form the list. An example will make it much clearer.
# Make a list of letters in the string 'abcdefghijkl'
alphabet = [letter for letter in 'abcdefghijkl']
print(alphabet)       # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']
# Make a list of squares of numbers from 1 to 9
squares = [x**2 for x in range(1, 10)]
print(squares)        # [1, 4, 9, 16, 25, 36, 49, 64, 81]
## It's amazing what all can be done using comprehensions. They are immensely powerful. We have barely scratched the surface here. But it's enough to get you up and running. You can explore them in your leisure time.
## Here's another example, a little more complex, see if you can make sense of it.
# Making 2 letter strings from two words, taking one letter at a time from each word.
permutations = [letter1 + letter2 for letter1 in "robin" for letter2 in "hood"]
print(permutations)
# OUTPUT: ['rh', 'ro', 'ro', 'rd', 'oh', 'oo', 'oo', 'od', 'bh', 'bo', 'bo', 'bd', 'ih', 'io', 'io', 'id', 'nh', 'no', 'no', 'nd']

##################################################
# GETTING MAXIMUM AND MINIMUM VALUES FROM A LIST #
##################################################
# Using the builtin max() and min() functions.
# NOTE that for lists having mixed types of values, say strings and numbers, Python will raise a TypeError while trying to fetch the maximum or minimum value.
numbers = [223, 999, 321, 683]
print(max(numbers))   # 999
print(min(numbers))   # 223
strings = ["abc", "bcd", "dce"]
print(max(strings))   # dce   # recall the ord() function
print(min(strings))   # abc

##########################################################################
# JOINING THE ELEMENTS OF A LIST WITH THE PROVIDED SYMBOL/STRING: join() #
##########################################################################
# join() is a string method that accepts any sequence of strings (lists, tuples, and so on).
# "!".join(list1) produces a string by joining all members of list1, placing the ! symbol between the elements. Note that join() will raise a TypeError if there is any numeric value inside the list, because it only knows how to join strings. If you turn these numeric values into strings first (by wrapping them in quotes, or with str()), join() works without error.
address = ["221B", "Baker Street", "London", "U.K."]
print("-".join(address))   # 221B-Baker Street-London-U.K.
#######################################################################
# COUNTING THE NUMBER OF OCCURRENCES OF AN ELEMENT IN A LIST: count() #
#######################################################################
# count() is available on lists, tuples and strings.
# list1.count('abc') returns the number of occurrences of the element 'abc' in list1
web = [".org", ".com", ".gov", ".org", ".gov"]
print(web.count('.gov'))   # 2

################################################################
# GETTING THE INDEX OF FIRST OCCURRENCE OF AN ELEMENT: index() #
################################################################
list1 = [9, 9, 9, 8, 3, 3, 8]
print(list1.index(8))      # 3

########################################################
# REMOVING A LIST FROM MEMORY: using the 'del' keyword #
########################################################
# If you are conscious of the memory occupied by your program, you may want to consider deleting your variables yourself. del removes the name; the memory is reclaimed once nothing else refers to the list.
# FYI: By default, the interpreter deletes your variables for you as soon as you exit it, or after your program has run its course.
>>> someList = [num for num in range(1, 100)]
>>> someList
[1, 2, 3, 4, 5, ..., 98, 99]   (output abbreviated here; all 99 numbers are printed)
>>> del someList
>>> someList
# Traceback information
NameError: name 'someList' is not defined
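One detail from the listing above deserves its own small sketch: sort() and reverse() modify the list in place and return None, whereas the builtin sorted() returns a new list and leaves the original untouched. The variable names below are illustrative.

```python
scores = [4, 5, 2, 4, 1]

# sorted() builds a new list; the original is unchanged.
ascending = sorted(scores)
descending = sorted(scores, reverse=True)
print(ascending)    # [1, 2, 4, 4, 5]
print(descending)   # [5, 4, 4, 2, 1]
print(scores)       # [4, 5, 2, 4, 1]

# sort() reorders the list itself and returns None.
result = scores.sort()
print(result)       # None
print(scores)       # [1, 2, 4, 4, 5]

# join() needs strings, so convert the numbers with str() first.
joined = "-".join(str(n) for n in scores)
print(joined)       # 1-2-4-4-5
```

Use sort() when you no longer need the original order, and sorted() when you do.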
When to use lists:
- When you need a heterogeneous collection of elements i.e. not of the same data type
- When ordering of the data matters
- When the items are subject to change
- When the items don't need to be unique, i.e. duplicates are allowed
- When you want the items to be referenced using indexes
Tuples ()
Tuples are also an ordered collection of heterogeneous elements, like lists, except that tuples are immutable and have far fewer methods than lists. Tuples are denoted by round brackets ().
Immutability of tuples means that once you declare a tuple and fill it with values, there is not much you can do to alter it except discarding it altogether, and even that not with a clear() method. That said, you can have mutable objects inside a tuple, such as a list, and those objects can still be changed.
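A short sketch of what immutability does and does not cover (the name record is made up for this example): the tuple's own slots cannot be reassigned, but a list stored in one of those slots can still be modified.

```python
record = ("alice", [90, 85])

# Reassigning a slot of the tuple is not allowed.
try:
    record[0] = "bob"
except TypeError as error:
    print("TypeError:", error)

# The list inside the tuple is still a normal, mutable list.
record[1].append(77)
print(record)   # ('alice', [90, 85, 77])
```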
Here's an overview of tuples:
- CREATING A TUPLE: () or builtin tuple() or comma-separated values
- CREATING ONE ITEM TUPLE: with a comma after the first and only element
- MERGING TWO OR MORE TUPLES INTO ONE: using the + operator
- COUNTING THE NUMBER OF OCCURRENCES OF AN ELEMENT: count()
- GETTING THE INDEX OF FIRST OCCURRENCE OF AN ELEMENT: index()
- PACKING AND UNPACKING A TUPLE
- LENGTH OF A TUPLE: builtin len() function
- IMMUTABILITY OF TUPLES
- INTERCHANGING TUPLES WITH LISTS: list() and tuple()
- TUPLE COMPREHENSIONS: do not exist.
- RETRIEVING MINIMUM AND MAXIMUM VALUED ELEMENTS IN A TUPLE: max() & min()
- SORTING A TUPLE: can be done, but not directly
- REVERSING A TUPLE: doable, yes, but no explicit function for it
- JOINING ELEMENTS OF A TUPLE WITH A PROVIDED STRING: join()
- SLICING A TUPLE: [:]
- ORDERLINESS OF A TUPLE: tuples are ordered
- MEMBERSHIP TESTS IN TUPLES: the membership operators in and not in
- REMOVING A TUPLE FROM MEMORY: using the del keyword
#####################################################################
# CREATING A TUPLE: () or builtin tuple() or comma-separated values #
#####################################################################
tup1 = ("one", 2, "three", [1, 2, 3])
print(type(tup1))     # <class 'tuple'>
print(tup1)           # ('one', 2, 'three', [1, 2, 3])
tup2 = tuple()        # creates an empty tuple
tup3 = tuple(tup1)    # creates a tuple with the values of tup1
tup4 = 'value1', 'value2', 'value3', 'value4'
print(tup4)           # ('value1', 'value2', 'value3', 'value4')

##########################################################################
# CREATING ONE ITEM TUPLE: with a comma after the first and only element #
##########################################################################
testTuple = ('hi')
print(type(testTuple))   # <class 'str'>
testTuple = ('hi',)
print(type(testTuple))   # <class 'tuple'>

#############################################################
# MERGING TWO OR MORE TUPLES INTO ONE: using the + operator #
#############################################################
oneHalf = (1, 2, 3)
theOtherHalf = (4, 5, 6)
print(oneHalf + theOtherHalf)   # (1, 2, 3, 4, 5, 6)

#############################################################
# COUNTING THE NUMBER OF OCCURRENCES OF AN ELEMENT: count() #
#############################################################
tup1 = (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89)
print(tup1.count(1))   # 2

################################################################
# GETTING THE INDEX OF FIRST OCCURRENCE OF AN ELEMENT: index() #
################################################################
tup1 = (3, 3, 4, 7, 9, 10, 7)
print(tup1.index(7))   # 3

#################################
# PACKING AND UNPACKING A TUPLE #
#################################
# You can assign each constituent element in a tuple to a variable, and also create a tuple from variables. The former is called unpacking, and the latter is called packing.
# Unpacking a tuple
coordinates = (20, 50, 100)
x, y, z = coordinates
print(x)   # 20
print(y)   # 50
print(z)   # 100
# Packing a tuple
x, y, z = 10, 20, 30   # multiple variable assignment is supported in Python.
coordinates = x, y, z
print(coordinates)     # (10, 20, 30)

#############################################
# LENGTH OF A TUPLE: builtin len() function #
#############################################
# Use the builtin len() function to determine the length of a tuple, like any other sequence.
coordinates = (20, 50, 100)
print(len(coordinates))   # 3

##########################
# IMMUTABILITY OF TUPLES #
##########################
# A tuple variable has only two methods associated with it: index() and count(). There is no insert() because tuples don't allow for addition of elements, nor do they allow for their removal. You can refer to tuple elements by their indexes, but you can't assign a new value using the [] notation; that's what we mean by tuples being immutable. Here's what happens if you try:
coordinates = 10, 20, 30
print(coordinates[1])     # 20
coordinates[1] = 40
# Traceback (most recent call last):
#   traceback information followed by:
# TypeError: 'tuple' object does not support item assignment

#######################################################
# INTERCHANGING TUPLES WITH LISTS: list() and tuple() #
#######################################################
# You may encounter situations where you wish to overcome the immutability of tuples, but still want the end result to be a tuple, to prevent it from being changed by anyone else in future, or for any other reason. The thing is, you can transform a tuple into a list and then back into a tuple. In fact, there are ways to take the elements of any data structure and make any other data structure with them. We'll get to that later. First, let's interchange tuples and lists using the builtin tuple() and list() functions.
tupleCoordinates = (10, 20, 30)
listCoordinates = list(tupleCoordinates)
print(listCoordinates)    # [10, 20, 30]
listCoordinates[1] = 100
tupleCoordinates = tuple(listCoordinates)
print(tupleCoordinates)   # (10, 100, 30)

#######################################
# TUPLE COMPREHENSIONS: do not exist. #
#######################################
# If you try to make a tuple comprehension, you will end up making a generator instead. You can learn more about generators at https://www.djangospin.com/python-generators/
numbers = (i for i in range(1, 10))
print(numbers)            # <generator object <genexpr> at 0x02E43800> (the address will differ on your machine)
# To get an actual tuple, pass the generator to tuple(): tuple(i for i in range(1, 10))

############################################################################
# RETRIEVING MINIMUM AND MAXIMUM VALUED ELEMENTS IN A TUPLE: max() & min() #
############################################################################
stringTuple = ("abc", "dce", "dab")
print(min(stringTuple))   # abc
print(max(stringTuple))   # dce   # ord() function behind the scenes
numberTuple = (100, 200, 300, 400, 500)
print(min(numberTuple))   # 100
print(max(numberTuple))   # 500

##################################################
# SORTING A TUPLE: can be done, but not directly #
##################################################
# Since tuples are immutable, you cannot sort tuples per se, but what you can do is create a list from the tuple, sort the list, and then create a new tuple out of the sorted list.
# The method involves interchanging data structures, which we touched on above and will cover more generally in the 'A Few General Things' section down below. Here is the method.
>>> tupleOfNumbers = (5, 4, 2, 3, 1)
>>> listOfNumbers = list(tupleOfNumbers)        # creating a list out of a pre-defined sequence
>>> listOfNumbers.sort()
>>> listOfNumbers
[1, 2, 3, 4, 5]
>>> newTupleOfNumbers = tuple(listOfNumbers)    # creating a tuple out of a pre-defined sequence
>>> newTupleOfNumbers
(1, 2, 3, 4, 5)
# You can condense the method a bit by using the builtin sorted() function, which returns a sorted list built from any sequence that's provided to it as an argument.
>>> tupleOfNumbers = (5, 4, 2, 3, 1)
>>> newTupleOfNumbers = tuple( sorted( tupleOfNumbers ) )
>>> newTupleOfNumbers
(1, 2, 3, 4, 5)
# Note that you cannot sort a tuple with mixed data in it. Python raises a TypeError in this case.
>>> tupleOfNumbers = (5, 4, 2, 'one', 1)
>>> newTupleOfNumbers = tuple( sorted( tupleOfNumbers ) )
# Traceback information
TypeError: '<' not supported between instances of 'str' and 'int'
# (Python 3.5 and earlier word this as: TypeError: unorderable types: str() < int())

###################################################################
# REVERSING A TUPLE: doable, yes, but no explicit function for it #
###################################################################
# Just like sorting a tuple, you can use a list to reverse the tuple elements, and then use the list to create a new tuple.
>>> tupleOfNumbers = (5, 4, 2, 'one', 1)
>>> intermediateList = list(tupleOfNumbers)
>>> intermediateList
[5, 4, 2, 'one', 1]
>>> intermediateList.reverse()
>>> intermediateList
[1, 'one', 2, 4, 5]
>>> newTupleOfNumbers = tuple(intermediateList)
>>> newTupleOfNumbers
(1, 'one', 2, 4, 5)

##############################################################
# JOINING ELEMENTS OF A TUPLE WITH A PROVIDED STRING: join() #
##############################################################
# join() is a string method that accepts any sequence of strings.
# We have seen how it works with strings and lists, let's see how it works with tuples (the same way, actually):
stringTuple = ("one", "two", "three")
print("_".join(stringTuple))   # one_two_three
# Note that just like with lists, join() will not work on tuples containing numeric values, unless you turn them into strings first.

########################
# SLICING A TUPLE: [:] #
########################
# Slicing DOES NOT alter the original tuple, unless you assign the sliced tuple back to the original variable.
tuple1 = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
# [:] gives the whole tuple
print(tuple1[:])      # (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
# [x:] gives a tuple starting from the element at index x till the end of the tuple
print(tuple1[1:])     # (1, 2, 3, 4, 5, 6, 7, 8, 9)
# [:x] gives a tuple from the beginning of the original tuple till the element at index (x - 1).
print(tuple1[:3])     # (0, 1, 2)
# [x:y] gives a tuple from the element at index x till the element at index (y - 1).
print(tuple1[1:3])    # (1, 2)
# [x:y:z] gives a tuple from the element at index x till the element at index (y - 1), picking every zth element.
print(tuple1[::3])    # (0, 3, 6, 9)
# A negative step size also works. In this case, the element at the first index you specify has to be to the right of the element at the second index in the original tuple. Otherwise, you get an empty tuple. An exception to this is [::-z], which walks the tuple from the rear end by default.
print(tuple1[10:0:-3])   # (9, 6, 3)
print(tuple1[::-3])      # (9, 6, 3, 0)
# When the second index is omitted and the step is negative, the slice runs all the way through index 0 (the same as a second index of -(len + 1), i.e. -11 here). With an explicit second index of 0, as in tuple1[10:0:-3], the slice stops at index 1, so element 0 is left out.
# You can of course use negative indexes as well, I'll leave that to you to explore. Comment below if you run into any problem.
##############################################
# ORDERLINESS OF A TUPLE: tuples are ordered #
##############################################
# Elements in a tuple are preserved in the order in which you add them.
tuple1 = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
print(tuple1[2])      # 2

##########################################################################
# MEMBERSHIP TESTS IN TUPLES: the membership operators 'in' and 'not in' #
##########################################################################
>>> tuple1 = "Sue", "Bob"
>>> 'Bob' in tuple1
True
>>> 'Peggy' not in tuple1
True

#########################################################
# REMOVING A TUPLE FROM MEMORY: using the 'del' keyword #
#########################################################
# If you are conscious of the memory occupied by your program, you may want to consider deleting your variables yourself. del removes the name; the memory is reclaimed once nothing else refers to the tuple.
# FYI: By default, the interpreter deletes your variables for you as soon as you exit it, or after your program has run its course.
>>> someList = [num for num in range(1, 100)]
>>> someTuple = tuple(someList)
>>> someTuple
(1, 2, 3, 4, 5, ..., 98, 99)   (output abbreviated here; all 99 numbers are printed)
>>> del someTuple
>>> someTuple
# Traceback information
NameError: name 'someTuple' is not defined
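As a shorter alternative to the list round trip shown above, a tuple can also be reversed with a slice or with the builtin reversed(), and sorted in descending order with sorted(..., reverse=True). This is a sketch with illustrative variable names; each expression creates a new tuple and leaves the original untouched.

```python
numbers = (5, 4, 2, 3, 1)

reversed_by_slice = numbers[::-1]                          # slice with a step of -1
reversed_by_builtin = tuple(reversed(numbers))             # reversed() returns an iterator
sorted_descending = tuple(sorted(numbers, reverse=True))   # sorted() returns a list

print(reversed_by_slice)     # (1, 3, 2, 4, 5)
print(reversed_by_builtin)   # (1, 3, 2, 4, 5)
print(sorted_descending)     # (5, 4, 3, 2, 1)
print(numbers)               # (5, 4, 2, 3, 1)
```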
When to use a Tuple:
- Make a tuple instead of a list when the number of items is known and small.
- Make a tuple while returning multiple values from a function.
- Make a tuple when your data doesn't have to change.
- Make a tuple when the performance of your script is important. Tuples are generally a little lighter than lists in memory use and creation time, because they never need room to grow.
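To illustrate the point about returning multiple values: a function that returns comma-separated values actually returns a single tuple, which the caller can keep whole or unpack. The function name min_max_mean is hypothetical, chosen only for this sketch.

```python
def min_max_mean(values):
    """Return the smallest value, the largest value and the mean as one tuple."""
    return min(values), max(values), sum(values) / len(values)


# Keep the result as a single tuple...
stats = min_max_mean([223, 999, 321, 683])
print(type(stats))   # <class 'tuple'>
print(stats)         # (223, 999, 556.5)

# ...or unpack it straight into separate variables.
lowest, highest, mean = min_max_mean([2, 4, 6])
print(lowest, highest, mean)   # 2 6 4.0
```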
Dictionaries {}
Dictionaries are collections of objects in which the items are stored and fetched by keys, rather than by positional indexes. They have traditionally been described as unordered; since Python 3.7 a dictionary does remember the order in which keys were inserted, but you still cannot fetch an item by its position. Dictionaries are one of the most flexible data structures, along with lists. You may relate lists to arrays in other languages, and dictionaries to lookup tables, where item names are more meaningful than item positions. Consider the following example:
football = {"goalkeeper" : "David De Gea", "striker" : "Wayne Rooney"}
print( football["goalkeeper"] )   # David De Gea
print( football["striker"] )      # Wayne Rooney
Keys and Values: Dictionaries are collections of key-value pairs. In the above example, "goalkeeper" is a key, and "David De Gea" is the value corresponding to the key "goalkeeper".
Dictionaries are characterized by:
- collection of arbitrary objects (unordered up to Python 3.6, insertion-ordered since Python 3.7)
- accessed by keys, and not by indexes
- mutable
- serve as search tables
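To see the "search table" idea in action, here is a minimal sketch that counts how often each word occurs in a sentence. The sentence and the variable names are assumptions chosen for illustration.

```python
sentence = "the quick brown fox jumps over the lazy dog the end"

word_counts = {}
for word in sentence.split():
    # get() returns the second argument (0 here) when the key is missing,
    # so the first sighting of a word does not raise a KeyError.
    word_counts[word] = word_counts.get(word, 0) + 1

print(word_counts["the"])         # 3
print(word_counts.get("cat", 0))  # 0
print("fox" in word_counts)       # True
```

Every lookup goes straight to the key, so there is no need to scan through the data the way you would with a list of words.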
Let's have an overview of dictionaries:
- CREATING A DICTIONARY: {} or dict() or from a list of tuples
- GETTING THE LIST OF KEYS IN A DICTIONARY: keys()
- GETTING THE LIST OF VALUES IN A DICTIONARY: values()
- GETTING LIST OF KEY-VALUE PAIRS FROM A DICTIONARY: items()
- ADDING ELEMENTS TO A DICTIONARY: update()
- RETRIEVING A VALUE FROM A DICTIONARY: get() or [key]
- REMOVING ELEMENT FROM A DICTIONARY: the del keyword, pop(), popitem()
- CLEARING A DICTIONARY: clear()
- LENGTH OF A DICTIONARY: builtin len() function
- MUTABILITY OF A DICTIONARY: mutable
- MAXIMUM AND MINIMUM VALUED ELEMENTS IN A DICTIONARY
- SLICING: n/a
- ORDERLINESS OF A DICTIONARY: insertion-ordered since Python 3.7
- MEMBERSHIP TESTS IN DICTIONARIES: the membership operators in and not in
- JOINING KEYS OF A DICTIONARY WITH A PROVIDED STRING: join()
- ITERATING OVER A DICTIONARY: a for loop
- SORTING A DICTIONARY: irrelevant
- REVERSING A DICTIONARY: irrelevant
- OTHER FUNCTIONS RELATED TO DICTIONARIES: .copy() .fromkeys() .pop() .popitem() .setdefault()
- PACKING AND UNPACKING A DICTIONARY
- MERGING USING + OPERATOR: not supported
- DICTIONARY COMPREHENSIONS
- REMOVING A DICTIONARY FROM MEMORY: using the del keyword
################################################################
# CREATING A DICTIONARY: {} or dict() or from a list of tuples #
################################################################
IndianCricketTeam = {"batsman": "V. Kohli", "bowler": "B. Kumar"}
print(type(IndianCricketTeam)) # <class 'dict'>
print(IndianCricketTeam) # {'batsman': 'V. Kohli', 'bowler': 'B. Kumar'}

# dict() accepts an existing dictionary, so the key-value pairs go inside curly braces.
IndianCricketTeam2 = dict({"batsman": "V. Kohli", "bowler": "B. Kumar"})
print(type(IndianCricketTeam2)) # <class 'dict'>

IndianCricketTeam3 = dict( [("batsman","V. Kohli"), ("bowler","B. Kumar")] )
print(IndianCricketTeam3) # {'batsman': 'V. Kohli', 'bowler': 'B. Kumar'}
# Note that you have to wrap your sequence of tuples with round brackets or square brackets for the above to work.

IndianCricketTeam4 = dict(batsman = "V. Kohli", bowler = "B. Kumar")
print(IndianCricketTeam4) # {'batsman': 'V. Kohli', 'bowler': 'B. Kumar'}

# Nesting a dictionary within a dictionary
IndianCricketTeam = {
    "batsmen": {1: "S. Tendulkar", 2: "V. Kohli"},
    "bowlers": {1: "B.Kumar", 2: "M. Shami"}
}
print( IndianCricketTeam['batsmen'][1] ) # S. Tendulkar

####################################################
# GETTING THE LIST OF KEYS IN A DICTIONARY: keys() #
####################################################
IndianCricketTeam = {"batsman": "V. Kohli", "bowler": "B. Kumar"}
print( IndianCricketTeam.keys() ) # dict_keys(['batsman', 'bowler'])
# You can iterate over these keys using a for loop.
# You would have noticed that the keys() function actually returns an object of type <class 'dict_keys'>, and not of type <class 'list'>. This is because in Python 2, keys(), items() and values() returned lists, which wasted memory. Python 2 also offered iterkeys(), iteritems() and itervalues(), which didn't waste computer memory, but didn't offer many features either (you could only iterate over their contents, and that too only ONCE).
# Hence, in Python 3, the developers went back to keys(), items() and values(), which now return special view objects ('dict_keys', 'dict_items', 'dict_values') that offer features like iteration over contents multiple times, efficient comparison and set operations.

########################################################
# GETTING THE LIST OF VALUES IN A DICTIONARY: values() #
########################################################
IndianCricketTeam = {"batsman": "V. Kohli", "bowler": "B. Kumar"}
print( IndianCricketTeam.values() ) # dict_values(['V. Kohli', 'B. Kumar'])
# You can iterate over these values using a for loop.
# Like keys(), values() returns a view object (of type <class 'dict_values'>) and not a list, for the reasons given in the keys() section.

##############################################################
# GETTING LIST OF KEY-VALUE PAIRS FROM A DICTIONARY: items() #
##############################################################
IndianCricketTeam = {"batsman": "V. Kohli", "bowler": "B. Kumar"}
print( IndianCricketTeam.items() ) # dict_items([('batsman', 'V. Kohli'), ('bowler', 'B. Kumar')])
# You can iterate over these items using a for loop with two variables, like so:
for role, player in IndianCricketTeam.items():
    print( player, "is an Indian", role )
# OUTPUT
# V. Kohli is an Indian batsman
# B. Kumar is an Indian bowler
# Like keys() and values(), items() returns a view object (of type <class 'dict_items'>) and not a list.

#############################################
# ADDING ELEMENTS TO A DICTIONARY: update() #
#############################################
# Actually, the right term to use here is 'update', since we update a dictionary with new key-value pair(s), rather than adding to it. The way we do it is by supplying the new key-value pair(s) to the update() method, in the form of another dictionary.
# A single pair can also be added with a plain assignment, e.g. IndianCricketTeam["finisher"] = "M.S. Dhoni".
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
IndianCricketTeam.update( {"finisher": "M.S. Dhoni"} )
print( IndianCricketTeam ) # {'batsman': 'V. Kohli', 'bowler': 'B. Kumar', 'finisher': 'M.S. Dhoni'}
IndianCricketTeam.update( dict(allrounder1 = "R. Jadeja", allrounder2 = "Y. Singh") )
print( IndianCricketTeam ) # {'batsman': 'V. Kohli', 'bowler': 'B. Kumar', 'finisher': 'M.S. Dhoni', 'allrounder1': 'R. Jadeja', 'allrounder2': 'Y. Singh'}
# Note that the + operator doesn't work on dictionaries.
print( {1: "one", 2: "two"} + {3: "three", 4: "four"} ) # TypeError: unsupported operand type(s) for +: 'dict' and 'dict'

########################################################
# RETRIEVING A VALUE FROM A DICTIONARY: get() or [key] #
########################################################
# The get(key) function returns the value corresponding to the passed key.
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
print( IndianCricketTeam.get("batsman") ) # V. Kohli
print( IndianCricketTeam["batsman"] ) # V. Kohli
# Note that if you use [key] with a key that is not in the dictionary, Python raises a KeyError. get() returns None instead (or a default value, if you pass one as the second argument).
print( IndianCricketTeam.get("wicket-keeper") ) # None
print( IndianCricketTeam["wicket-keeper"] )
# Traceback (most recent call last):
#   Other traceback information
# KeyError: 'wicket-keeper'

#########################################################################
# REMOVING ELEMENT FROM A DICTIONARY: the del keyword, pop(), popitem() #
#########################################################################
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
del IndianCricketTeam['batsman']
print( IndianCricketTeam ) # {'bowler': 'B. Kumar'}
# To delete the dictionary from the memory
del IndianCricketTeam
print( IndianCricketTeam ) # NameError: name 'IndianCricketTeam' is not defined

# pop()
# deletes the key-value pair from the dictionary, and returns the value corresponding to the key you have supplied to the pop() function.
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar", finisher = "M.S. Dhoni")
IndianCricketTeam.pop("batsman")
print( IndianCricketTeam ) # {'bowler': 'B. Kumar', 'finisher': 'M.S. Dhoni'}
# To re-emphasize, the pop() function not only deletes the key-value pair, it also returns the 'value' in the key-value pair. If you run the pop statement above in the interactive shell, you will see the output as 'V. Kohli' (i.e. the value associated with the key 'batsman').
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar", finisher = "M.S. Dhoni")
print( IndianCricketTeam.pop("batsman") ) # V. Kohli
print( IndianCricketTeam.pop("bowler") == 'B. Kumar' ) # True
# Popping a key that is no longer in the dictionary raises a KeyError, unless you pass a default value as the second argument.

# popitem()
# deletes the most recently inserted item (Python 3.7+; in earlier versions, an arbitrary item), and returns the key-value pair.
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar", finisher = "M.S. Dhoni")
IndianCricketTeam.popitem() # ('finisher', 'M.S. Dhoni')
IndianCricketTeam.popitem() # ('bowler', 'B. Kumar')
IndianCricketTeam.popitem() # ('batsman', 'V. Kohli')
IndianCricketTeam.popitem() # KeyError: 'popitem(): dictionary is empty'

##################################
# CLEARING A DICTIONARY: clear() #
##################################
# The clear() function wipes off the contents of the dictionary.
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
IndianCricketTeam.clear()
print( IndianCricketTeam ) # {}

##################################################
# LENGTH OF A DICTIONARY: builtin len() function #
##################################################
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
print( len(IndianCricketTeam) ) # 2

#######################################
# MUTABILITY OF A DICTIONARY: mutable #
#######################################
# Dictionaries are mutable i.e. once declared, you can tweak individual keys by assigning them new values on the fly without having to declare the dictionary all over again (i.e. the identity of the dictionary, obtainable by id(), stays intact).
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
IndianCricketTeam["batsman"] = "S. Tendulkar"
print( IndianCricketTeam ) # {'batsman': 'S. Tendulkar', 'bowler': 'B. Kumar'}

#######################################################
# MAXIMUM AND MINIMUM VALUED ELEMENTS IN A DICTIONARY #
#######################################################
# Since there are two sets of values in a dictionary, namely, keys and values, min() and max() are somewhat of an advanced thing to understand. Don't fret though, I have shed some light on how to retrieve keys and values with minimum and maximum values, in the 'Few General Things' section down below. Look for 'max() and min() in dictionary'.

################
# SLICING: n/a #
################
# Slicing a Python dictionary is not a valid operation, since dictionaries are not index-based data structures like lists. If you try to slice a dictionary, Python throws: TypeError: unhashable type: 'slice' (Python 3.12 and later raise a KeyError instead, because slice objects became hashable).

###################################################################
# ORDERLINESS OF A DICTIONARY: insertion-ordered since Python 3.7 #
###################################################################
# Up to Python 3.6, the language made no promise about the order in which a dictionary stored its elements. Since Python 3.7, a dictionary is guaranteed to keep its elements in the order in which you inserted them.
# Even so, elements are fetched by key and not by position, so operations like finding the index of a key, sorting a dictionary or reversing a dictionary have a silly ring to them.
# You can however print the sorted keys and values using a simple for loop. I'll leave that to you to explore.

################################################################################
# MEMBERSHIP TESTS IN DICTIONARIES: the membership operators 'in' and 'not in' #
################################################################################
# Membership tests on a dictionary look at its keys, not its values.
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
print( "batsman" in IndianCricketTeam ) # True
print( "coach" not in IndianCricketTeam ) # True

###############################################################
# JOINING KEYS OF A DICTIONARY WITH A PROVIDED STRING: join() #
###############################################################
# The join() method simply concatenates the keys of the dictionary (which must all be strings), separated by the string you specify before the join() method.
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar", director = "R. Shastri")
"_".join(IndianCricketTeam) # 'batsman_bowler_director'

###########################################
# ITERATING OVER A DICTIONARY: a for loop #
###########################################
# for loop with two variables
# We have looked at this before when we were testing the .items() method. To reiterate (pun unintended), you can iterate over these items using a for loop with two variables like this:
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
for role, player in IndianCricketTeam.items():
    print( player, "is an Indian", role )
# OUTPUT
# V. Kohli is an Indian batsman
# B. Kumar is an Indian bowler

# for loop with a single variable
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
for role in IndianCricketTeam:
    print( IndianCricketTeam[role], "is an Indian", role )
# OUTPUT
# V. Kohli is an Indian batsman
# B. Kumar is an Indian bowler

####################################
# SORTING A DICTIONARY: irrelevant #
####################################
# Firstly, a dictionary keeps its elements in insertion order (Python 3.7+), not in sorted order. In fact, the custom indexes (keys) of a dictionary make its order seem futile.
# If you, for some reason, want to sort the keys and values of a dictionary, you can always take them out of the dictionary and sort them.
>>> IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar", director = "R. Shastri")
>>> IndianCricketTeam
{'batsman': 'V. Kohli', 'bowler': 'B. Kumar', 'director': 'R. Shastri'}
>>> keys = IndianCricketTeam.keys()
>>> keys
dict_keys(['batsman', 'bowler', 'director'])
>>> sortedKeys = sorted(keys)
>>> sortedKeys
['batsman', 'bowler', 'director']
>>> values = IndianCricketTeam.values()
>>> values
dict_values(['V. Kohli', 'B. Kumar', 'R. Shastri'])
>>> sortedValues = sorted(values)
>>> sortedValues
['B. Kumar', 'R. Shastri', 'V. Kohli']

######################################
# REVERSING A DICTIONARY: irrelevant #
######################################
# Reversing a dictionary doesn't make much sense, since dictionaries are based on custom indexes. Hence, the order of elements in a dictionary is rarely relevant.

################################################################################################
# OTHER FUNCTIONS RELATED TO DICTIONARIES: .copy() .fromkeys() .pop() .popitem() .setdefault() #
################################################################################################
# copy()
# copies the contents of a dictionary to another dictionary.
names1 = "V. Kohli", "S. Tendulkar"
names2 = "B. Kumar"
IndianCricketTeam = dict( batsmen = names1, bowler = names2 )
IndiaSideB = IndianCricketTeam.copy()
print( IndiaSideB ) # {'batsmen': ('V. Kohli', 'S. Tendulkar'), 'bowler': 'B. Kumar'}
# This is known as a 'shallow' copy, and it is different from the simple a = b assignment. In fact, there is something known as a 'deep' copy. But wrapping your head around these three kinds of copy is a little difficult, especially for beginners. However, if you are interested, I'll refer you to a stack overflow link, which explains everything impeccably: http://stackoverflow.com/questions/17246693/what-exactly-is-the-difference-between-shallow-copy-deepcopy-and-normal-assignm Use the above example to verify the results.

# fromkeys()
# useful in particular for creating a dictionary with keys of an existing dictionary.
# dict.fromkeys(sequence[, default_value]) creates a new dictionary with keys from a provided sequence (string, tuple, list, set, another dictionary), and initializes all the keys to default_value, if provided. If default_value is not provided, the keys are initialized to None.

# 1: using fromkeys() of the dict class. Used when the dictionary doesn't exist already.
sequence = 'one', 'two', 'three'
dictFromKeys = dict.fromkeys(sequence)
print( dictFromKeys ) # {'one': None, 'two': None, 'three': None}

# 2: using fromkeys() of an existing dictionary.
sequence = ['four', 'five', 'six']
dictFromKeys = {}
dictFromKeys = dictFromKeys.fromkeys(sequence)
print( dictFromKeys ) # {'four': None, 'five': None, 'six': None}

# Using the optional 'default' parameter
sequence = 'seven', 'eight', 'nine'
dictFromKeys = dict.fromkeys(sequence, 'a number')
print( dictFromKeys ) # {'seven': 'a number', 'eight': 'a number', 'nine': 'a number'}

# The dictionaries we spawned so far were mostly from tuples, let's see how other sequences can be employed as well.
sequence = 'hello'
dictFromKeys = dict.fromkeys(sequence, 'from str')
print( dictFromKeys ) # {'h': 'from str', 'e': 'from str', 'l': 'from str', 'o': 'from str'}
sequence = ['keyOne', 'keyTwo', 'keyThree']
dictFromKeys = dict.fromkeys(sequence, 'from list')
print( dictFromKeys ) # {'keyOne': 'from list', 'keyTwo': 'from list', 'keyThree': 'from list'}
sequence = {'keyOne': 'valueOne', 'keyTwo': 'valueTwo', 'keyThree': 'valueThree'}
dictFromKeys = dict.fromkeys(sequence)
print( dictFromKeys ) # {'keyOne': None, 'keyTwo': None, 'keyThree': None}

# setdefault()
# dict1.setdefault(key[,value]) -> dict1.get(key,value), also set dict1[key]=value if key not in dict1
# If the key is present in the dictionary, then return the corresponding value. Else, insert the key into the dictionary with its value as provided, and return the value. An example should make it clear.
>>> IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
>>> IndianCricketTeam.setdefault('batsman', 'S. Tendulkar')
'V. Kohli'
>>> IndianCricketTeam.setdefault('finisher', 'M.S. Dhoni')
'M.S. Dhoni'

######################################
# PACKING AND UNPACKING A DICTIONARY #
######################################
# Like all other data structures, a dictionary can be deconstructed into individual variables, and likewise, a new dictionary can be constructed using a list of keys, as we have already seen above.
# Unpacking a dictionary
>>> IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
>>> role1, role2 = IndianCricketTeam.keys()
>>> role1
'batsman'
>>> role2
'bowler'
>>> player1, player2 = IndianCricketTeam.values()
>>> player1
'V. Kohli'
>>> player2
'B. Kumar'
# Packing a dictionary from a sequence of keys
>>> roles = 'batsman', 'bowler'
>>> IndianCricketTeamA = dict.fromkeys(roles)
>>> IndianCricketTeamA
{'batsman': None, 'bowler': None}

###########################################
# MERGING USING + OPERATOR: not supported #
###########################################
>>> IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
>>> IndianCricketTeamAddition = dict(finisher= "M.S. Dhoni", director= "R. Shastri")
>>> IndianCricketTeam + IndianCricketTeamAddition
# Traceback information
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'

#############################
# DICTIONARY COMPREHENSIONS #
#############################
# Just like lists, there is another way to create a dictionary, using comprehensions.
# Comprehensions are a lot more 'pythonic', that is to say, a lot more intuitive and natural to the language of Python.
squaresDict = {number: number*number for number in range(1, 11)}
print( squaresDict ) # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}
for number, square in {x: x*x for x in range(1, 11)}.items():
    print(number, "times", number, "is", square)
# OUTPUT
# 1 times 1 is 1
# 2 times 2 is 4
# 3 times 3 is 9
# 4 times 4 is 16
# 5 times 5 is 25
# 6 times 6 is 36
# 7 times 7 is 49
# 8 times 8 is 64
# 9 times 9 is 81
# 10 times 10 is 100
# Needless to say, there is a lot more power in comprehensions than making them square a few numbers, I'll leave that for you to explore. If you have come up with a 'wow' comprehension, like 'I didn't know you could do that in Python', please share the same in the comments below.

##############################################################
# REMOVING A DICTIONARY FROM MEMORY: using the 'del' keyword #
##############################################################
# If you are conscious of the memory occupied by your program, you may want to consider deleting your variables yourself.
# FYI: By default, the interpreter deletes your variables for you as soon as you exit it, or after your program has run its course.
>>> someDict = {num: num*num for num in range(1, 20)}
>>> someDict
{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100, 11: 121, 12: 144, 13: 169, 14: 196, 15: 225, 16: 256, 17: 289, 18: 324, 19: 361}
>>> del someDict
>>> someDict
# Traceback information
NameError: name 'someDict' is not defined
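Since the + operator does not merge dictionaries, and a dictionary has no sort() method of its own, here is a hedged sketch of how both jobs are commonly done. The names team, additions and scores, and the score figures, are invented for illustration.

```python
team = {"batsman": "V. Kohli", "bowler": "B. Kumar"}
additions = {"finisher": "M.S. Dhoni", "bowler": "M. Shami"}

# Option 1: copy first, then update, so the original stays untouched.
merged = team.copy()
merged.update(additions)

# Option 2: unpack both dictionaries into a new one (Python 3.5+).
merged_again = {**team, **additions}

print(merged == merged_again)  # True
print(merged["bowler"])        # M. Shami (the later value wins)
print(team["bowler"])          # B. Kumar (the original is unchanged)

# Sorting by value: sort the (key, value) pairs taken out of the dictionary.
scores = {"Kohli": 82, "Dhoni": 45, "Sharma": 64}  # made-up numbers
by_score = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
print(by_score)  # [('Kohli', 82), ('Sharma', 64), ('Dhoni', 45)]
```

On Python 3.9 and later, team | additions gives the same merged result as the two options above. When a key appears in both dictionaries, the value from the right-hand one is kept in every variant.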
Sets {}
Sets (denoted by {} ) in Python are an implementation of mathematical sets. They are like lists, except that they are unordered, do not allow duplicates, and support set-specific operations like union, intersection, difference etc.
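The "no duplicates" property is handy on its own. Here is a minimal sketch, assuming a made-up list of votes, that uses a set to find out how many distinct animals were voted for.

```python
votes = ["cat", "dog", "cat", "elephant", "dog", "cat"]

# Converting the list to a set drops the repeated entries.
unique_votes = set(votes)

print(len(votes))             # 6
print(len(unique_votes))      # 3
print("dog" in unique_votes)  # True

# A set has no fixed order, so sort it when you need a predictable listing.
print(sorted(unique_votes))   # ['cat', 'dog', 'elephant']
```

Note that sorted() returns a new list; the set itself stays unordered.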
An overview of sets:
- CREATING SETS: set() or {}
- ADDING MEMBERS TO A SET: add() & update() or |=
- REMOVING MEMBERS FROM A SET: remove(), discard(), pop()
- CLEARING A SET: clear()
- MEMBERSHIP TESTS IN SETS: the membership operators in and not in
- ITERATING OVER ELEMENTS OF A SET: for loop
- LENGTH OF A SET: len()
- PACKING AND UNPACKING A SET
- JOINING THE ELEMENTS OF A SET WITH THE PROVIDED SYMBOL/STRING: join()
- SUBSET AND SUPERSET TESTS: issubset() or <= & issuperset() or >=
- COPYING A SET: copy()
- ORDERLINESS OF A SET: sets are unordered
- GETTING MAXIMUM AND MINIMUM VALUES FROM A SET
- REVERSING A SET: not a valid operation, since sets are unordered.
- SORTING A SET: not a valid operation, since sets are unordered.
- MERGING USING + OPERATOR: not supported
- MUTABILITY OF A SET
- SET OPERATIONS: intersection() &, difference() -, symmetric_difference() ^, union() |
- SET UPDATE METHODS: difference_update() -=, intersection_update() &=, symmetric_difference_update() ^=
- CHECKING TO SEE IF TWO SETS ARE MUTUALLY EXCLUSIVE OR DISJOINT: isdisjoint()
- FROZEN SETS: immutable sets
- SLICING A SET: not supported since a set is unordered
- SET COMPREHENSIONS
- REMOVING A SET FROM MEMORY: using the del keyword
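Before going through each item one by one, here is a small sketch of what a few of these operations look like together. The two lists of programming languages are invented for illustration, and sorted() is used only to make the printed order predictable.

```python
alice_langs = ["python", "c", "java", "python"]
bob_langs = ["java", "go", "python"]

alice, bob = set(alice_langs), set(bob_langs)

print(sorted(alice & bob))  # known by both: ['java', 'python']
print(sorted(alice | bob))  # known by either: ['c', 'go', 'java', 'python']
print(sorted(alice - bob))  # known only by Alice: ['c']
print(sorted(alice ^ bob))  # known by exactly one of them: ['c', 'go']

print(alice.isdisjoint({"rust"}))  # True, nothing in common
print({"c"} <= alice)              # True, {'c'} is a subset of alice
```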
##############################
# CREATING SETS: set() or {} #
##############################
# { }
animals = {"cat", "dog", "cat", "elephant"}
print( animals ) # {'cat', 'dog', 'elephant'}
print( type(animals) ) # <class 'set'>

# set()
# The set() function takes one argument i.e. a sequence.
listOfAnimals = ['cat', 'dog', 'cat', 'elephant']
print( listOfAnimals ) # ['cat', 'dog', 'cat', 'elephant']
setOfAnimals = set(listOfAnimals)
print( setOfAnimals ) # {'cat', 'dog', 'elephant'}
print( type(setOfAnimals) ) # <class 'set'>

tupleOfAnimals = 'cat', 'dog', 'cat', 'elephant'
print( tupleOfAnimals ) # ('cat', 'dog', 'cat', 'elephant')
setOfAnimals = set(tupleOfAnimals)
print( setOfAnimals ) # {'cat', 'dog', 'elephant'}
print( type(setOfAnimals) ) # <class 'set'>

dictOfAnimals = {'cat': 'c', 'dog': 'd', 'elephant': 'e'}
print( dictOfAnimals ) # {'cat': 'c', 'dog': 'd', 'elephant': 'e'}
setOfAnimals = set(dictOfAnimals)
print( setOfAnimals ) # {'dog', 'cat', 'elephant'}
print( type(setOfAnimals) ) # <class 'set'>
# Note that only the keys of the dictionary make up the set, much like iterating over a dictionary yields only its keys.

# Note that if you initialize a variable to { } with the intention of creating an empty set, Python actually creates an empty dictionary. For creating an empty set, use the builtin set() function.
animals = {}
print( type(animals) ) # <class 'dict'>
animals = set()
print( type(animals) ) # <class 'set'>

###################################################
# ADDING MEMBERS TO A SET: add() & update() or |= #
###################################################
animals = {"cat", "dog", "elephant"}
animals.add('snake')
print( animals ) # {'cat', 'dog', 'elephant', 'snake'}
# Note that you can add the same element again without Python raising an error.
animals = {"cat", "dog", "elephant"}
animals.add('snake')
animals.add('snake')
animals.add('snake')
print( animals ) # {'cat', 'dog', 'elephant', 'snake'}

# update() or |=
# set1.update(sequence) or set1 |= sequence adds the members of the sequence (string, tuple, another set, dictionary, list) to those already present in the set. Note that the |= operator only works when the sequence is another set (or a frozenset). Otherwise, Python throws a TypeError.
>>> # dict
>>> animals = {"cat", "dog", "elephant"}
>>> sequence = {"hamster": 'h', "lion": 'l'}
>>> animals.update(sequence)
>>> animals
{'dog', 'lion', 'cat', 'elephant', 'hamster'}
>>> # tuple
>>> animals = {"cat", "dog", "elephant"}
>>> sequence = ("hamster", "lion")
>>> animals.update(sequence)
>>> animals
{'dog', 'lion', 'cat', 'elephant', 'hamster'}
>>> # list
>>> animals = {"cat", "dog", "elephant"}
>>> sequence = ["hamster", "lion"]
>>> animals.update(sequence)
>>> animals
{'dog', 'lion', 'cat', 'elephant', 'hamster'}
>>> # string
>>> animals = {"cat", "dog", "elephant"}
>>> sequence = 'hen' # makes no sense, just to show that it is there in Python.
>>> animals.update(sequence)
>>> animals
{'e', 'cat', 'dog', 'elephant', 'n', 'h'}
>>> # another set
>>> animals = {"cat", "dog", "elephant"}
>>> wildAnimals = {'lion', 'boar'}
>>> animals.update(wildAnimals)
>>> animals
{'dog', 'lion', 'cat', 'elephant', 'boar'}
>>> # |= operator
>>> animals = {"cat", "dog", "elephant"}
>>> wildAnimals = {'lion', 'boar'}
>>> animals |= wildAnimals
>>> animals
{'dog', 'elephant', 'boar', 'cat', 'lion'}

###########################################################
# REMOVING MEMBERS FROM A SET: remove(), discard(), pop() #
###########################################################
# remove()
# remove() removes an element from a set; it must be a member.
# If it is not a member, a KeyError is raised.
animals = {"cat", "dog", "elephant", "snake"}
animals.remove('snake')
print( animals ) # {'cat', 'dog', 'elephant'}
animals.remove('hamster')
# Traceback info
# KeyError: 'hamster'

# discard()
# discard() removes an element from a set if it is a member. If it is not a member, it does nothing.
animals = {"cat", "dog", "elephant", "snake"}
animals.discard('snake')
print( animals ) # {'cat', 'dog', 'elephant'}
animals.discard('hamster') # no error thrown

# pop()
# removes and returns an arbitrary set element (you cannot predict which one)
# raises a KeyError if the set is empty
>>> animals = {"cat", "dog", "elephant"}
>>> animals.pop()
'cat'
>>> animals.pop()
'dog'
>>> animals.pop()
'elephant'
>>> animals.pop()
# traceback information
KeyError: 'pop from an empty set'

###########################
# CLEARING A SET: clear() #
###########################
animals = {"cat", "dog", "elephant", "snake"}
animals.clear()
print(animals) # set()

########################################################################
# MEMBERSHIP TESTS IN SETS: the membership operators 'in' and 'not in' #
########################################################################
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> 'cat' in animals
True
>>> 'cheetah' not in animals
True

##############################################
# ITERATING OVER ELEMENTS OF A SET: for loop #
##############################################
animals = {"cat", "dog", "elephant", "snake"}
for animal in animals:
    print(animal.title(), "is an animal.") # .title() capitalizes the first letter of each word in the string.
# OUTPUT (the order may differ on your machine, since sets are unordered)
# Cat is an animal.
# Dog is an animal.
# Snake is an animal.
# Elephant is an animal.
##########################
# LENGTH OF A SET: len() #
##########################
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> len(animals)
4

####################################################################
# SUBSET AND SUPERSET TESTS: issubset() or <= & issuperset() or >= #
####################################################################
# Every element in subset is present in superset.
# subset <= superset where subset and superset are two sets.
# set1.issubset(set2) returns True if set2 contains all elements in set1 i.e. set1 is a subset of set2
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> domesticAnimals = {"cat", "dog"}
>>> domesticAnimals.issubset(animals)
True
>>> domesticAnimals <= animals # the same test
True
# set1.issuperset(set2) returns True if set1 contains all elements in set2 i.e. set1 is a superset of set2
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> domesticAnimals = {"cat", "dog"}
>>> animals.issuperset(domesticAnimals)
True
>>> animals >= domesticAnimals
True
# Note that if two sets are equal i.e. they have same number of elements and same elements, then both these functions return True. This is in conformance to mathematical properties of sets: "Any set is a subset of itself."
>>> domesticAnimals = {"cat", "dog"}
>>> domesticAnimals2 = {"dog", "cat", "dog"}
>>> domesticAnimals.issubset( domesticAnimals2 )
True
>>> domesticAnimals.issuperset( domesticAnimals2 )
True
>>> domesticAnimals2.issubset( domesticAnimals )
True
>>> domesticAnimals2.issuperset( domesticAnimals )
True

#########################
# COPYING A SET: copy() #
#########################
# copies the contents of a set to another set.
c, d, e = 'cat', 'dog', 'elephant'
animals = {c, d, e}
domesticAnimals = animals.copy()
print(domesticAnimals) # {'cat', 'dog', 'elephant'}
# Just like we saw in dictionaries, this is a shallow copy. We went over three terms: shallow copy, deep copy and object assignment(=).
# Comprehending these three kinds of copy is, to some extent, beyond greenhorns; it was for me as well. If you are curious as to how all these work, feel free to explore it. I have included a handy reference to a stack overflow link in the dictionaries section.

############################################
# ORDERLINESS OF A SET: sets are unordered #
############################################
# Set elements are not stored sequentially like lists and tuples, meaning that you cannot access the elements of a set using indexes.
>>> mixedSet = {'cat', 10, 'dog', 20}
>>> mixedSet[1]
Traceback (most recent call last):
  File "<pyshell#41>", line 1, in <module>
    mixedSet[1]
TypeError: 'set' object is not subscriptable
# (Older Python 3 versions word this as: 'set' object does not support indexing)

#################################################
# GETTING MAXIMUM AND MINIMUM VALUES FROM A SET #
#################################################
# Using the builtin max() and min() functions.
# NOTE that for sets having mixed type of values, say strings and numbers, Python will raise an error while trying to fetch the maximum or minimum value from it.
setOfAnimals = {"cat", "dog", "elephant", "snake"}
print( max(setOfAnimals) ) # snake
print( min(setOfAnimals) ) # cat
setOfNumbers = {10, 20, 30, 40}
print( max(setOfNumbers) ) # 40
print( min(setOfNumbers) ) # 10
>>> print( min(mixedSet) )
# Traceback info
TypeError: '<' not supported between instances of 'int' and 'str'
>>> print( max(mixedSet) )
# Traceback info
TypeError: '>' not supported between instances of 'int' and 'str'
# (The exact wording of these messages depends on your Python version and on the order in which the elements get compared.)

#####################################################################
# REVERSING A SET: not a valid operation, since sets are unordered. #
#####################################################################

###################################################################
# SORTING A SET: not a valid operation, since sets are unordered. #
###################################################################
# A set has no sort() method. If you need its elements in sorted order, the builtin sorted() function accepts a set and returns a new sorted list.

###########################################
# MERGING USING + OPERATOR: not supported #
###########################################
>>> domesticAnimals = {'cat', 'dog'}
>>> wildAnimals = {'fox', 'lion'}
>>> domesticAnimals + wildAnimals
# Traceback information
TypeError: unsupported operand type(s) for +: 'set' and 'set'

#######################
# MUTABILITY OF A SET #
#######################
# Sets are mutable i.e. you can add or remove elements to and from it, once you declare it. We have seen evidence of it already, while going over update(), remove(), add(), discard() etc.

#########################################################################################
# SET OPERATIONS: intersection() &, difference() -, symmetric_difference() ^, union() | #
#########################################################################################
# union() or |
# set1.union(set2) returns all elements that are in either set.
# alternative syntax: set1 | set2
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> domesticAnimals = {"cat", "dog", "hamster"}
>>> animals.union(domesticAnimals)
{'cat', 'elephant', 'snake', 'dog', 'hamster'}
>>> animals | domesticAnimals
{'cat', 'elephant', 'snake', 'dog', 'hamster'}

# intersection() or &
# set1.intersection(set2) returns a set giving the elements that are common in both the sets.
# alternative syntax: set1 & set2
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> domesticAnimals = {"cat", "dog"}
>>> animals.intersection(domesticAnimals)
{'dog', 'cat'}
>>> animals & domesticAnimals
{'dog', 'cat'}

# difference() or -
# set1.difference(set2) returns all elements that are in set1 but not in set2.
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> domesticAnimals = {"cat", "dog", "hamster"}
>>> animals.difference(domesticAnimals)
{'elephant', 'snake'}
>>> animals - domesticAnimals
{'elephant', 'snake'}
>>> domesticAnimals - animals
{'hamster'}
# symmetric_difference() or ^
# set1.symmetric_difference(set2) returns all elements that are in exactly one of the sets i.e. those elements that do not form the intersection of the sets.
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> domesticAnimals = {"cat", "dog", "hamster"}
>>> animals.symmetric_difference(domesticAnimals)
{'hamster', 'elephant', 'snake'}
>>> animals ^ domesticAnimals
{'hamster', 'elephant', 'snake'}

##########################################################################################################
# SET UPDATE METHODS: difference_update() -=, intersection_update() &=, symmetric_difference_update() ^= #
##########################################################################################################
# difference_update() or -=
# set1.difference_update(set2) removes all elements of set2 from set1. In difference(), the original set is not altered, whereas in difference_update(), the original set is altered.
>>> animals = {"cat", "dog", "elephant", "lion", "boar"}
>>> wildAnimals = {"elephant", "lion", "boar"}
>>> animals.difference_update(wildAnimals)
>>> animals
{'cat', 'dog'}
>>>
>>> animals = {"cat", "dog", "elephant", "lion", "boar"}
>>> wildAnimals = {"elephant", "lion", "boar"}
>>> animals -= wildAnimals
>>> animals
{'dog', 'cat'}
# intersection_update() or &=
# set1.intersection_update(set2) updates set1 with the intersection of set1 and set2. Again, intersection() gives you the intersection as a new set, whereas intersection_update() replaces the contents of the original set with that intersection.
>>> animals = {"cat", "dog", "elephant", "lion", "boar"}
>>> wildAnimals = {"elephant", "lion", "boar"}
>>> animals.intersection_update(wildAnimals)
>>> animals
{'lion', 'elephant', 'boar'}
>>>
>>> animals = {"cat", "dog", "elephant", "lion", "boar"}
>>> wildAnimals = {"elephant", "lion", "boar"}
>>> animals &= wildAnimals
>>> animals
{'elephant', 'boar', 'lion'}
# symmetric_difference_update() or ^=
# set1.symmetric_difference_update(set2) updates set1 with the symmetric difference of set1 and set2. set1 itself is modified in the process.
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> domesticAnimals = {"cat", "dog", "hamster"}
>>> animals.symmetric_difference_update(domesticAnimals)
>>> animals
{'elephant', 'hamster', 'snake'}
>>>
>>> animals = {"cat", "dog", "elephant", "snake"}
>>> domesticAnimals = {"cat", "dog", "hamster"}
>>> animals ^= domesticAnimals
>>> animals
{'snake', 'elephant', 'hamster'}

#################################################################################
# CHECKING TO SEE IF TWO SETS ARE MUTUALLY EXCLUSIVE OR DISJOINT: isdisjoint () #
#################################################################################
# set1.isdisjoint(set2) returns True if the two sets have no elements in common.
>>> supersetAnimals = {'lion', 'tiger', 'boar', 'cat', 'dog', 'hamster'}
>>> wildAnimals = {'lion', 'tiger', 'boar'}
>>> domesticAnimals = {'cat', 'dog', 'hamster'}
>>> wildAnimals.isdisjoint(domesticAnimals)
True
>>> supersetAnimals.isdisjoint(wildAnimals)
False
>>> supersetAnimals.isdisjoint(domesticAnimals)
False

###############################
# FROZEN SETS: immutable sets #
###############################
# We know by now that sets are mutable i.e. you can add or remove elements to and from them. However, Python has a variant of sets for when you want the set to be immutable. This data structure is known as a frozen set.
# A frozen set does not allow addition or removal of elements once it is declared.
# Let's review the normal behavior of sets.
>>> setOfAnimals = {"cat", "dog", "elephant", "snake"}
>>> setOfAnimals.add("hamster")
>>> setOfAnimals
{'dog', 'hamster', 'cat', 'snake', 'elephant'}
>>> setOfAnimals.update(["lion", "tiger"])
>>> setOfAnimals
{'hamster', 'cat', 'snake', 'elephant', 'dog', 'tiger', 'lion'}
>>> setOfAnimals.remove('snake')
>>> setOfAnimals
{'hamster', 'cat', 'elephant', 'dog', 'tiger', 'lion'}
# FROZEN SETS
>>> frozenSetOfAnimals = frozenset({"cat", "dog", "elephant", "snake"}) # note the builtin frozenset() function
>>> frozenSetOfAnimals.add("hamster")
# Traceback info
AttributeError: 'frozenset' object has no attribute 'add'
>>> frozenSetOfAnimals.update(["lion", "tiger"])
# Traceback info
AttributeError: 'frozenset' object has no attribute 'update'
>>> frozenSetOfAnimals.remove('snake')
# Traceback info
AttributeError: 'frozenset' object has no attribute 'remove'
# It's evident that the frozen set doesn't have any method that updates/alters the originally declared set. So, methods like clear(), pop(), remove(), discard() and the update methods will not work.
# Conventional set methods like difference(), intersection(), symmetric_difference(), issubset(), issuperset(), isdisjoint(), copy(), union() will work, since frozensets are sets after all.
# Here's a brief rundown of which methods are supported by sets and frozensets.
Methods supported by SETS | Methods supported by FROZENSETS |
add() | - |
clear() | - |
copy() | copy() |
difference() | difference() |
difference_update() | - |
discard() | - |
intersection() | intersection() |
intersection_update() | - |
isdisjoint() | isdisjoint() |
issubset() | issubset() |
issuperset() | issuperset() |
pop() | - |
remove() | - |
symmetric_difference() | symmetric_difference() |
symmetric_difference_update() | - |
union() | union() |
update() | - |

#########################################################
# SLICING A SET: not supported since a set is unordered #
#########################################################
>>> setOfAnimals = {"cat", "dog", "elephant", "snake"}
>>> setOfAnimals[:]
# Traceback info
TypeError: 'set' object is not subscriptable

###############################
# PACKING AND UNPACKING A SET #
###############################
# As with lists and tuples, it is possible to assign the elements of a set to individual variables; this is known as unpacking a set. You can also use individual variables to pack a set. Because a set is unordered, you cannot predict which element lands in which variable.
# Unpacking a set
>>> setOfAnimals = {"cat", "dog", "elephant", "snake"}
>>> a, b, c, d = setOfAnimals
>>> a
'snake'
>>> b
'dog'
>>> c
'cat'
>>> d
'elephant'
# Packing a set
>>> newSetOfAnimals = {a, b, c}
>>> newSetOfAnimals
{'dog', 'cat', 'snake'}

#########################################################################
# JOINING THE ELEMENTS OF A SET WITH THE PROVIDED SYMBOL/STRING: join() #
#########################################################################
# join() is a string method that accepts any iterable of strings (strings themselves, lists, tuples, sets, dictionaries).
# "!".join(set1) produces a string by joining all members of set1, placing the ! symbol between the elements. Note that join() will raise a TypeError if there is any numeric value inside the set, because it can only join strings. If you wrap these numeric values in quotes (thereby making them strings), join() works with no errors.
>>> setOfAnimals = {"cat", "dog", "elephant", "snake"}
>>> " -> ".join(setOfAnimals)
'snake -> dog -> cat -> elephant'

######################
# SET COMPREHENSIONS #
######################
# Another way to create sets is by using comprehensions. Let's create a set from a list using a comprehension.
>>> listOfAnimals = ['snake', 'snake', 'cat', 'dog', 'elephant']
>>> setOfAnimals = {animal for animal in listOfAnimals}
>>> setOfAnimals
{'cat', 'snake', 'dog', 'elephant'}
>>> type(setOfAnimals)
<class 'set'>
# Again, comprehensions have way more power than creating a set out of a list. It's up to you to find that out! If you have an amazing comprehension up your sleeve, please share it below in the comments.

#######################################################
# REMOVING A SET FROM MEMORY: using the 'del' keyword #
#######################################################
# If you are conscious of the memory occupied by your program, you may want to consider deleting your variables yourself.
# FYI: By default, the interpreter deletes your variables for you as soon as you exit it, or after your program has run its course.
>>> someSet = {num for num in range(1, 100)}
>>> someSet
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99}
>>> del someSet
>>> someSet
# Traceback information
NameError: name 'someSet' is not defined
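As a small taste of what set comprehensions can do beyond copying a list, here is a minimal sketch that filters and transforms in a single step. The word list and the variable names are made up for illustration.

```python
# Hypothetical input data, used only for this illustration.
words = ["Cat", "dog", "DOG", "elephant", "cat", "ox"]

# Lower-case every word and keep only those longer than two letters.
# Duplicates collapse automatically because the result is a set.
long_words = {word.lower() for word in words if len(word) > 2}

# sorted() returns a list, which gives a predictable printing order.
print(sorted(long_words))
```

The `if` clause at the end of the comprehension acts as a filter, and the expression at the front (`word.lower()`) is applied to every element that passes it.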
When to use sets
Choose sets when:
- You need a unique set of values
- Your data does not need nested mutable values (lists, dictionaries or other sets), since a set cannot hold them
- Your data may change, since sets are mutable
- Your data may undergo set operations like union, difference, intersection etc.
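To make these criteria concrete, here is a short sketch with invented sign-up data: duplicates vanish when the lists become sets, and the set operators then answer typical questions in one line each.

```python
# Hypothetical sign-up lists; the names are invented for this example.
signups_monday = ["ann", "bob", "ann", "cid"]
signups_tuesday = ["bob", "dee", "dee"]

# Converting to sets removes the duplicate entries.
monday = set(signups_monday)
tuesday = set(signups_tuesday)

both_days = monday & tuesday      # intersection: signed up on both days
either_day = monday | tuesday     # union: signed up on at least one day
only_monday = monday - tuesday    # difference: Monday but not Tuesday

print(sorted(both_days), sorted(either_day), sorted(only_monday))
```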
Making the correct choice: When to use what
Each data structure has an array of associated methods and attributes, and each one of them can do different things. Here are a few tips on how to choose the right one to suit your needs:
- Use a list if you have a mixed collection of data, capable of being modified and added to, that you want to be able to refer to using indexes.
- Use a set if you want a collection of unique yet mutable elements, and you need to perform mathematical operations like union, intersection etc. on the elements. Also, keep in mind that sets cannot hold mutable types such as dictionaries, sets or lists. Frozen sets work, though.
- Use a tuple if you know that your data is not going to change, especially if you care about the performance of your programs. Because a tuple is immutable, Python knows exactly how much memory to allocate for it, which makes tuples a little lighter than lists.
- Use a dictionary if you want to store key-value pairs, which not only implement logical associations, but also are mutable and offer a fast lookup i.e. fast retrieval of values because of custom keys. Keep in mind that sets, lists and dictionaries(all mutable types) cannot assume the roles of a dictionary key. Frozen sets can work as a dictionary key.
- Use a frozen set if you want an immutable collection of unique elements.
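The rule about dictionary keys and set elements from the list above can be tried out directly. The sketch below uses made-up values; it shows a tuple and a frozen set working as dictionary keys, and a list being rejected.

```python
# Immutable values are fine as dictionary keys.
point = (2, 3)                          # a tuple
pets = frozenset({"cat", "dog"})        # a frozen set
lookup = {point: "a point", pets: "household pets"}

# A list is mutable, so Python refuses to use it as a key.
try:
    lookup[["cat", "dog"]] = "will not work"
except TypeError as error:
    message = str(error)

print(lookup[point], "|", lookup[pets], "|", message)
```

The same TypeError appears if you try to put a list, a set or a dictionary inside a set.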
You can always see the methods and attributes of each of these by using the builtin dir() function:
>>> dir(set) ['__and__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__iand__', '__init__', '__ior__', '__isub__', '__iter__', '__ixor__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update'] >>> dir(list) ['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort'] >>> dir(tuple) ['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index'] >>> dir(dict) ['__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', 
'__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values'] # Ignore the double underscore(a.k.a. dunder) methods and attributes for now. We will touch on them as we proceed in the course.
You can view the help text on each of the methods and attributes using the builtin help() function:
>>> help(list.index)
Help on method_descriptor:

index(...)
    L.index(value, [start, [stop]]) -> integer -- return first index of value.
    Raises ValueError if the value is not present.

>>> help(set.difference_update)
Help on method_descriptor:

difference_update(...)
    Remove all elements of another set from this set.
Other less used data types
Apart from the four (five, if you count frozensets) we have thoroughly covered so far, the standard library offers a few more data structures that are used less frequently. Namely, these are:
- Array
- Queue
- Deque
- Heapq
- Defaultdict
Feel free to research these in your own time.
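If you would like a quick preview before researching them, here is a minimal sketch of three of these: deque and defaultdict from the standard collections module, and the heapq module. The data is invented, and the sketch only scratches the surface of each.

```python
from collections import defaultdict, deque
import heapq

# deque: a double-ended queue with fast appends and pops at both ends.
line = deque(["b", "c"])
line.appendleft("a")
line.append("d")
first = line.popleft()

# defaultdict: a dictionary that creates a default value for missing keys.
counts = defaultdict(int)          # missing keys start at int() == 0
for letter in "banana":
    counts[letter] += 1

# heapq: functions that keep a plain list arranged as a min-heap.
numbers = [5, 1, 4, 2]
heapq.heapify(numbers)
smallest = heapq.heappop(numbers)

print(first, list(line), dict(counts), smallest)
```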
Review: Comparison of different data structures
Property | List | Tuple | Dict | Set |
Mutable | Yes | No | Yes | Yes |
Ordered | Yes | Yes | No (insertion order is kept from Python 3.7 onward) | No |
Sortable | Yes | Yes | No | No |
Reversible | Yes | Yes | No | No |
Slice-able [ : ] | Yes | Yes | No | No |
Comprehensions | Yes | No | Yes | Yes |
Accessible using index operators [ ] | Yes | Yes | No | No |
Merging using + operator | Yes | Yes | No | No |
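A few rows of the table can be checked with a short script. This is only a sketch with made-up data: it contrasts a list with a set, and collects the operations that the set rejects.

```python
animals_list = ["dog", "cat", "snake"]
animals_set = set(animals_list)

# Lists support slicing and merging with +.
first_two = animals_list[:2]
merged = animals_list + ["fox"]

# sorted() accepts any iterable, but it always hands back a list.
sorted_animals = sorted(animals_set)

# Sets reject indexing, slicing and the + operator with a TypeError.
failures = []
attempts = [
    ("set index", lambda: animals_set[0]),
    ("set slice", lambda: animals_set[:2]),
    ("set + set", lambda: animals_set + {"fox"}),
]
for label, action in attempts:
    try:
        action()
    except TypeError:
        failures.append(label)

print(first_two, merged, sorted_animals, failures)
```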
Review: Operations common to all data structures
In brief, the following operations are common to all data structures, as we have seen in this chapter:
- packing-unpacking
- max() & min()
- len()
- join()
- membership tests
- iterating using a for loop
- deleting from memory
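The list above can be exercised in one loop. The following sketch (container contents invented for illustration) runs the same operations on a list, a tuple, a set and a dictionary; for the dictionary, every operation acts on the keys.

```python
containers = {
    "list": ["cat", "dog", "elephant"],
    "tuple": ("cat", "dog", "elephant"),
    "set": {"cat", "dog", "elephant"},
    "dict": {"cat": 1, "dog": 2, "elephant": 3},   # operations act on the keys
}

summary = {}
for kind, container in containers.items():          # iterating with a for loop
    first, second, third = container                # unpacking
    summary[kind] = (
        len(container),                             # len()
        max(container),                             # max()
        min(container),                             # min()
        "dog" in container,                         # membership test
        "-".join(sorted(container)),                # join(), sorted for a stable order
    )
    print(kind, summary[kind])

del containers                                      # deleting from memory
```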
A Few General Things
max() and min() in dictionary
There are two sets of objects in a dictionary, namely, keys and values. Let's talk about keys first.
We know that when we iterate over a dictionary like this: for key in dictionary_name:, then the for loop iterates only over the keys of the dictionary. For example:
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
for key in IndianCricketTeam:
    print(key)
# Output
batsman
bowler
The builtin min() and max() functions do the same. When you pass the dictionary to these functions, say min(IndianCricketTeam), they compare the keys and return the smallest (or largest) key. Bear in mind that if your keys are of different types, such as strings and numbers, you will see something like
TypeError: unorderable types: int() < str()
Now the question arises: how do we find the keys that hold the minimum and maximum values? To answer that, we must first look at the key argument of the builtin min() and max() functions.
min( ["apple", "blackberry", "samsung", "lg"]) # 'apple'
min( ["apple", "blackberry", "samsung", "lg"], key = len) # 'lg'
min( [ (1,2), (3,4), (4,5) ], key = sum) # (1, 2)
max( [ (1,2), (3,4), (4,5) ], key = sum) # (4, 5)
# For starters, sum() is a builtin function, just like len(). We pass the function itself as the key argument i.e. its name without parentheses. The function must accept each element of the first argument, and it must return a value that can be compared.
# In the len example, the strings are compared by length. Hence, "lg" (2) < "apple" (5) < "samsung" (7) < "blackberry" (10).
# In the sum examples, sum() totals each tuple, and min()/max() return the tuple with the smallest/largest total.
# Proceeding, see if you can make sense of the following.
IndianCricketTeam = dict(batsman = "V. Kohli", bowler = "B. Kumar")
min( IndianCricketTeam, key = IndianCricketTeam.get ) # 'bowler'
# WALKTHROUGH: Iterating over the first argument yields the keys 'batsman' and 'bowler'. The min() function compares the values returned by IndianCricketTeam.get('batsman') and IndianCricketTeam.get('bowler') i.e. 'V. Kohli' and 'B. Kumar'. The latter comes first alphabetically, so min() returns its key, 'bowler'.
# To get the corresponding minimum value:
print( IndianCricketTeam[ min( IndianCricketTeam, key = IndianCricketTeam.get ) ] ) # B. Kumar
# To get the corresponding maximum value:
print( IndianCricketTeam[ max( IndianCricketTeam, key = IndianCricketTeam.get ) ] ) # V. Kohli
This is how min() and max() work with dictionaries.
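The same technique is perhaps easier to follow with numeric values. Here is a small sketch using an invented dictionary of scores; it also shows two alternatives, max() over values() and max() over items().

```python
# Hypothetical scores, for illustration only.
scores = {"ann": 72, "bob": 91, "cid": 58}

# key=scores.get makes max()/min() compare the values but return the key.
top_scorer = max(scores, key=scores.get)
lowest_scorer = min(scores, key=scores.get)

# If only the value matters, compare the values directly.
highest_score = max(scores.values())

# To get the key and the value together, compare (key, value) pairs by value.
top_pair = max(scores.items(), key=lambda pair: pair[1])

print(top_scorer, lowest_scorer, highest_score, top_pair)
```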
Hashing in Python
A hash, in Python, is an integer returned by the builtin hash() function; it never changes during the lifetime of the object. Objects that compare equal have the same hash value. The converse is not guaranteed: two different objects can share a hash, although this is rare.
Only a hashable object can serve as a set element or as a dictionary key. The built-in immutable types (numbers, strings, frozen sets, and tuples whose elements are all hashable) are hashable, whereas the mutable types (lists, sets, dictionaries) are not. Hence, you cannot use a list/set/dictionary as a dictionary key or as a set element (when tried, Python raises a TypeError). Since frozen sets are immutable, they can be used as both.
>>> myStr = 'str'
>>> myTup = 1, 2, 3
>>> myList = [1, 2, 3, 4]
>>> myDict = {1: 'one', 2: 'two'}
>>> mySet = {'one', 'two'}
>>> myFrozenSet = frozenset(mySet)
>>> hash(myStr) # the exact numbers will likely differ on your machine
-1501539149
>>> hash(myTup)
-378539185
>>> hash(myList)
# Traceback info
TypeError: unhashable type: 'list'
>>> hash(myDict)
# Traceback info
TypeError: unhashable type: 'dict'
>>> hash(mySet)
# Traceback info
TypeError: unhashable type: 'set'
>>> hash(myFrozenSet)
-1281994893
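Two of the claims above can be verified with a few lines. In this sketch, is_hashable() is a helper written just for this illustration (it is not a builtin); it simply reports whether hash() accepts an object.

```python
def is_hashable(obj):
    """Return True if hash() accepts obj (helper invented for this example)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

# Objects that compare equal share a hash: 1, 1.0 and True are all equal.
same_hash = hash(1) == hash(1.0) == hash(True)

# A tuple is hashable only if everything inside it is hashable.
nested_ok = is_hashable((1, (2, 3)))
nested_bad = is_hashable((1, [2, 3]))

print(same_hash, nested_ok, nested_bad, is_hashable(frozenset({1, 2})))
```

This is why a tuple that contains a list cannot be used as a dictionary key, even though the tuple itself is immutable.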
Interchanging Data Structures: using list(), tuple(), dict(), set(), frozenset()
It's easy to turn a list into a tuple to make it immutable, to extract the keys of a dictionary into a list that you can use to initialize another dictionary (using fromkeys()), to make a set immutable by initializing a frozenset from it, and so on.
In fact, you can build most of these data structures from one another using the builtin functions list(), tuple(), dict(), set() and frozenset(). The one exception is dict(), which needs key-value pairs rather than single values. You can view this as another way of creating a list, tuple, dictionary, set or frozenset. Let's see all this in action:
# list to tuple
# transforming a mutable list into an immutable tuple
>>> myList = [number for number in range(1, 11)]
>>> myTuple = tuple( myList )
>>> print("Type: ", type( myTuple ), ";", "Contents: ", myTuple)
Type: <class 'tuple'> ; Contents: (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# tuple to list
# sorting a tuple
>>> tupleOfNumbers = (5, 4, 2, 3, 1)
>>> listOfNumbers = list(tupleOfNumbers)
>>> listOfNumbers.sort()
>>> listOfNumbers
[1, 2, 3, 4, 5]
>>> newTupleOfNumbers = tuple(listOfNumbers)
>>> newTupleOfNumbers
(1, 2, 3, 4, 5)

# list to set
# removing duplicates
>>> myList = [number for number in range(1, 11)]
>>> myList2 = [number for number in range(5, 16)]
>>> myList + myList2
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
>>> mySet = set( myList + myList2 )
>>> mySet
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}

# set to frozenset
# making a set immutable
>>> mySet = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
>>> mySet.add(16)
>>> mySet
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}
>>> myImmutableSet = frozenset( mySet )
>>> myImmutableSet.add(17)
# Traceback information
AttributeError: 'frozenset' object has no attribute 'add'

# dict to list
# initializing a new dictionary from the keys of an existing dictionary
>>> myDictionary = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5}
>>> myListOfKeys = list( myDictionary.keys() )
>>> myListOfKeys
['one', 'four', 'five', 'three', 'two']
>>> myNewDictionary = dict.fromkeys( myListOfKeys )
>>> myNewDictionary
{'one': None, 'four': None, 'two': None, 'three': None, 'five': None}

# This interchanging is incredibly helpful in certain situations e.g. while sorting a tuple, while removing duplicates from a list, extracting keys and values from a dictionary, making a set immutable by initializing a frozenset from it etc.
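One conversion not shown above is building a dictionary from other structures. Unlike list() or set(), dict() expects key-value pairs, so a common approach is to pair up two sequences with the builtin zip() first. A minimal sketch, with invented names and values:

```python
names = ["one", "two", "three"]
values = [1, 2, 3]

# zip() pairs the elements up; dict() turns the pairs into a dictionary.
pairs = list(zip(names, values))
number_map = dict(pairs)

# items() goes the other way: dictionary back to (key, value) pairs.
back_to_pairs = list(number_map.items())

# A plain list of single values cannot become a dictionary.
try:
    dict(names)
except ValueError:
    plain_list_rejected = True

print(number_map, back_to_pairs, plain_list_rejected)
```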
Using builtin dir() and help() functions
The dir() and help() functions are extremely helpful and will help you gain better understanding of the data types in Python.
The dir(object) call returns a list of all the methods and attributes associated with the given object. Everything in Python is an object, be it a variable, class or function. The built-in data types are also objects, so if you pass str to this function, it will list every method and attribute associated with strings. To find out what each of them does, use the help() function.
>>> dir(str) ['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill'] # Ignore the double-underscore bits for now. >>> help(str.replace) Help on method_descriptor: replace(...) S.replace(old, new[, count]) -> str Return a copy of S with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced. # The usage is clear, won't you say? # dir() without any argument gives you all the objects in the current scope i.e. the variables you have declared after you opened the shell. We will cover scopes in chapter 5. >>> dir() ['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'a', 'b', 'c', 'coordinates', 'd', 'dimension', 'n', 'newSetOfAnimals', 'primes', 'set2', 'set3', 'setOfAnimals', 't'] # help() without any argument gives you the help console, where you type the object you seek help on. >>> help() Welcome to Python 3.4's help utility! If this is your first time using Python, you should definitely check out the tutorial on the Internet at http://docs.python.org/3.4/tutorial/. 
Enter the name of any module, keyword, or topic to get help on writing Python programs and using Python modules. To quit this help utility and return to the interpreter, just type "quit". To get a list of available modules, keywords, symbols, or topics, type "modules", "keywords", "symbols", or "topics". Each module also comes with a one-line summary of what it does; to list the modules whose name or summary contain a given string such as "spam", type "modules spam". help> str.replace Help on method_descriptor in str: str.replace = replace(...) S.replace(old, new[, count]) -> str Return a copy of S with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
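Since dir() returns an ordinary list of strings, you can process its output like any other list. The sketch below hides the double-underscore names and then uses dir() to work out which public set methods a frozenset lacks; the variable names are my own.

```python
# Keep only the names that do not start with a double underscore.
public_set_methods = [name for name in dir(set) if not name.startswith("__")]

# Public methods that exist on set but are missing from frozenset:
# these are exactly the ones that would modify the set in place.
mutating_only = [name for name in public_set_methods if name not in dir(frozenset)]

print(public_set_methods)
print(mutating_only)
```

The second list should match the rows of the sets versus frozensets table from earlier in this chapter where the frozenset column is empty.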
Comparing sequences in Python: Lexicographical ordering
Objects belonging to the same sequence type can be compared. The comparison uses lexicographical ordering: the first items of the two sequences are compared, and if they differ, that decides the outcome; if they are equal, the next pair of items is compared, and so on, until either sequence is exhausted. If one sequence runs out first, the shorter one is the smaller. Let's view a few examples:
(1, 2, 3, 4, 5) < (1, 2, 4, 5, 6) # since 3 < 4
[1, 2, 3, 4, 5] < [1, 2, 4, 5, 6]
'ABC' < 'C++' < 'Perl' < 'Python'
(1, 2, 3, 4, 5) < (1, 2, 3, 5)
(1, 2, 3) < (1, 2, 3, -1)
(1, 2, 3, 4) == (1.0, 2.0, 3.0, 4.0) # Compared on numeric values i.e. 0 equals 0.0
(1, 2, 3, ('aaa', 'ab'), 4) < (1, 2, 3, ('abc', 'd'))
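To see the rule at work, here is a hand-written version of the "less than" comparison. The function lexicographic_less() is a sketch invented for this illustration; Python's builtin < operator already does this for you, and the loop at the end checks that the two agree on the examples above.

```python
def lexicographic_less(first, second):
    """Sketch of how < compares two sequences of the same type."""
    for a, b in zip(first, second):
        if a != b:
            # The first pair that differs decides the result.
            return a < b
    # No pair differed: the shorter sequence is the smaller one.
    return len(first) < len(second)

examples = [
    ((1, 2, 3, 4, 5), (1, 2, 4, 5, 6)),
    ((1, 2, 3, 4, 5), (1, 2, 3, 5)),
    ((1, 2, 3), (1, 2, 3, -1)),
    ("ABC", "C++"),
    ((1, 2, 3, ("aaa", "ab"), 4), (1, 2, 3, ("abc", "d"))),
    ((1, 2, 3), (1, 2, 3)),        # equal sequences: neither is smaller
]

results = [lexicographic_less(a, b) for a, b in examples]
for (a, b), result in zip(examples, results):
    print(a, "<", b, "is", result, "| builtin agrees:", result == (a < b))
```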
On the agenda in next chapter
That was a huge chunk of information, so take your time digesting it. Practice the code yourself, and you'll eventually get the dynamics of data structures.
In the next chapter, we will have a glance at functions and importing modules. Till next time!
Further Reading
Exercises
- Write a Python script which prompts the user for a sequence of integer values, and once they are finished, displays the list of positives, the list of negatives and the number of zeroes they entered. Hint: Use a while loop with a breaking condition for when the user decides to end the sequence of integers. [ Solution ]
- Write a Python program which asks the user for a string, and outputs number of unique characters in it. For example: 'Ethan' has 5 unique characters, 'Hello' has 4. Use a dictionary to achieve the functionality, not a set. Hint: Once a key is registered in a dictionary, you can assign values to it any number of times. [ Solution ]
- Write a Python script which prompts the user for a sequence of space-separated words, and outputs the sorted sequence with no duplicates. For example, if the user inputs 'apple aardvark apple ball cat', then the output should be 'aardvark apple ball cat'. You may use the builtin sorted() function to sort the words. [ Solution ]
- Write a Python program which accepts end bounds of a range from the user, and outputs a tuple containing only odd numbers lying in the range. For example, if the user inputs 10 & 20, then the output should be (11, 13, 15, 17, 19). [ Solution ]
See also:
- Object Oriented Python
- Design Patterns in Python
- 50+ Handy Standard Library Modules
- 60+ Handy Third Library Modules
- 50+ Tips & Tricks for Python Developers
- 50+ Know-How(s) Every Pythonista Must Know