Python @ DjangoSpin

50+ Tips & Tricks for Python Developers

Reading Time: 25 minutes

Page #7


Serialization in Python: pickle, shelve & json modules

Serialization is the process of converting native language objects into a sequence of bytes, or into objects of an interchange format, which are then stored in a file or in a string so that the data can be loaded in a different session or transmitted over the network. Saving the state of a game when you exit it, and loading it when you launch the game next time, is one example of serialization at work. We’ll discuss 3 Python modules which implement serialization: pickle, shelve and json.

The following code snippet shows basic examples of serialization using these 3 modules. For further details, have a look at this post.

####### pickle MODULE #######
### Serialization: pickle.dump(obj, file, protocol=None, *, fix_imports=True)
>>> import pickle
>>> saveGameDetails = {
    'playerName': 'Ethan',
    'level': 3,
    'arrowCount': 12
    }
>>> with open('saveGame01.robinhood', 'wb') as pickleFileHandler:
    pickle.dump(saveGameDetails, pickleFileHandler)
     
### Contents of 'saveGame01.robinhood' are in binary format.
 
### Deserialization: pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")
>>> with open('saveGame01.robinhood', 'rb') as pickleFileHandler:
    loadGameDetails = pickle.load(pickleFileHandler)
>>> loadGameDetails
{'level': 3, 'playerName': 'Ethan', 'arrowCount': 12}
 
### Pickling without a file: loads() and dumps()
>>> saveGameDetails = {
    'playerName': 'Ethan',
    'level': 3,
    'arrowCount': 12
    }
 
>>> saveGameDetailsBinary = pickle.dumps(saveGameDetails)
>>> saveGameDetailsBinary
b'\x80\x03}q\x00(X\n\x00\x00\x00arrowCountq\x01K\x0cX\n\x00\x00\x00playerNameq
\x02X\x05\x00\x00\x00Ethanq\x03X\x05\x00\x00\x00levelq\x04K\x03u.'
 
>>> loadGameDetails = pickle.loads(saveGameDetailsBinary)
>>> loadGameDetails
{'level': 3, 'playerName': 'Ethan', 'arrowCount': 12}
 
 
 
 
 
 
####### shelve MODULE #######
### Serialization ###
>>> import shelve
>>> saveGameOneDetails = {'playerName': 'Ethan', 'level': 3, 'arrowCount': 12}
>>> saveGameTwoDetails= {'playerName': 'Ethan', 'level': 5, 'arrowCount': 6}
 
>>> with shelve.open('save_games.robinhood') as saveGames:                         # like saveGames = shelve.open('save_games.robinhood'), but the shelf is closed automatically
    saveGames['saveGame001'] = saveGameOneDetails
     
>>> with shelve.open('save_games.robinhood') as saveGames:
    saveGames['saveGame002'] = saveGameTwoDetails
     
### On this system, this creates 2 files: save_games.robinhood.dat (containing serialized bytes, like the output file of a pickle) and save_games.robinhood.dir (containing records of individual pickles, like a register). The exact files depend on the dbm backend that shelve uses on your platform. ###
# contents of .dir:
'saveGame002', (512, 70)
'saveGame001', (0, 70)
 
### Deserialization ###
>>> with shelve.open('save_games.robinhood') as saveGames:
    loadGameOneDetails = saveGames['saveGame001']
    print(loadGameOneDetails)
 
     
{'level': 3, 'arrowCount': 12, 'playerName': 'Ethan'}
 
>>> loadGameOneDetails == saveGameOneDetails
True
 
 
 
 
####### json MODULE #######
### Serialization: json.dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
>>> import json
>>> jsonFileHandler = open('saveGame.robinhood', mode = 'w', encoding = 'utf-8')
>>> saveGameDetails = {
    'playerName': 'Ethan',
    'level': 3,
    'livesSpared': 0.88,
    'easyDifficultyLevel': False,
    'intermediateDifficultyLevel': True,
    'hardDifficultyLevel': False,
    'visitsToHolyLand': None,
    'merryMen': {
        'Healer': 'Maid Marian',
        'SwordsMan': 'Will Scarlet',
        'FistFighter': 'Little John'
        },
    'locations': [
        'Nottinghamshire', 'Yorkshire', 'Sherwood'
        ]
    }
>>> json.dump(saveGameDetails, jsonFileHandler, indent = 4)
>>> jsonFileHandler.close()
 
### contents of 'saveGame.robinhood' in text format
{
    "locations": [
        "Nottinghamshire",
        "Yorkshire",
        "Sherwood"
    ],
    "level": 3,
    "intermediateDifficultyLevel": true,
    "merryMen": {
        "FistFighter": "Little John",
        "Healer": "Maid Marian",
        "SwordsMan": "Will Scarlet"
    },
    "easyDifficultyLevel": false,
    "hardDifficultyLevel": false,
    "livesSpared": 0.88,
    "visitsToHolyLand": null,
    "playerName": "Ethan"
}
 
 
### Deserialization: json.load(fp, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
>>> jsonFileHandler = open('saveGame.robinhood', mode = 'r', encoding = 'utf-8')
>>> loadGameDetails = json.load(jsonFileHandler)
>>> jsonFileHandler.close()
 
>>> loadGameDetails
{'locations': ['Nottinghamshire', 'Yorkshire', 'Sherwood'], 'easyDifficultyLevel': False, 'hardDifficultyLevel': False, 'visitsToHolyLand': None, 'playerName': 'Ethan', 'intermediateDifficultyLevel': True, 'level': 3, 'merryMen': {'FistFighter': 'Little John', 'SwordsMan': 'Will Scarlet', 'Healer': 'Maid Marian'}, 'livesSpared': 0.88}
 
>>> for key in loadGameDetails:
    print(key, ": ", loadGameDetails[key])
 
     
locations :  ['Nottinghamshire', 'Yorkshire', 'Sherwood']
easyDifficultyLevel :  False
hardDifficultyLevel :  False
visitsToHolyLand :  None
playerName :  Ethan
intermediateDifficultyLevel :  True
level :  3
merryMen :  {'FistFighter': 'Little John', 'SwordsMan': 'Will Scarlet', 'Healer': 'Maid Marian'}
livesSpared :  0.88
 
### Serialization without files using JSON: dumps() and loads()
>>> configDetails = {
    "dbPass": "DEF",
    "dbUser": "ABC",
    "dbSID": "GHI"
    }
>>> configDetailsJSON = json.dumps(configDetails)
>>> configDetailsJSON
'{"dbUser": "ABC", "dbPass": "DEF", "dbSID": "GHI"}'
>>> type(configDetailsJSON)
<class 'str'>
 
>>> configDetailsDeserialized = json.loads(configDetailsJSON)
>>> configDetailsDeserialized
{'dbUser': 'ABC', 'dbPass': 'DEF', 'dbSID': 'GHI'}
 
 
### NOTE: JSON has no tuple or bytes type. Tuples are serialized as JSON arrays (and load back as lists), while bytes objects raise a TypeError.
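
The note above can be checked directly. Here is a minimal sketch (the dictionary keys and variable names are made up for illustration): a tuple is written out as a JSON array and comes back as a list, while a bytes object is rejected.

```python
import json

# A tuple is serialized as a JSON array...
encoded = json.dumps({"position": (3, 4)})
print(encoded)                      # {"position": [3, 4]}

# ...so it comes back as a list, not a tuple.
decoded = json.loads(encoded)
print(decoded["position"])          # [3, 4]

# bytes objects are not JSON serializable and raise a TypeError.
try:
    json.dumps({"payload": b"abc"})
    bytesSupported = True
except TypeError:
    bytesSupported = False
print(bytesSupported)               # False
```

If you need to round-trip tuples or bytes unchanged, pickle handles both, at the cost of a Python-only binary format.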


Decorators

A decorator is a function that takes an object (typically a function or a class), wraps or modifies it, and returns the resulting object, which then replaces the original. Python provides dedicated syntax (@) for applying decorators. For example, in the event of decorating a function:

def decoratorFunction(inputFunction):
    def manipulateInputFunction():
        # capture the return value of inputFunction
        returnValue = inputFunction()
        # manipulate it (here: convert it to upper case) and return the result
        return returnValue.upper()
    return manipulateInputFunction
 
@decoratorFunction
def functionToBeDecorated():
    # body of function; returns an object, say a string
    return "a string"

Significance of @ Notation

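The @ notation is shorthand for the assignment functionToBeDecorated = decoratorFunction(functionToBeDecorated), which runs once, when the function is defined. From then on, the name functionToBeDecorated refers to the function returned by the decorator, so a call to functionToBeDecorated() behaves like decoratorFunction(functionToBeDecorated)() applied to the undecorated function. For example: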

stringOne = functionToBeDecorated()
BECOMES
stringOne = decoratorFunction(functionToBeDecorated)()
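
To make the equivalence concrete, here is a small sketch that decorates one function by hand and another with the @ notation. The function names and the upper-casing behaviour are illustrative choices, not part of the original example.

```python
def decoratorFunction(inputFunction):
    def manipulateInputFunction():
        # Call the original function and transform its return value.
        return inputFunction().upper()
    return manipulateInputFunction

def plainFunction():
    return "hello"

# Manual decoration: call the decorator ourselves and keep the result.
manuallyDecorated = decoratorFunction(plainFunction)

# The @ notation performs the same call and rebinds the function name.
@decoratorFunction
def decoratedFunction():
    return "hello"

print(manuallyDecorated())           # HELLO
print(decoratedFunction())           # HELLO
# The name now refers to the inner function returned by the decorator.
print(decoratedFunction.__name__)    # manipulateInputFunction
```
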

Example

Let’s make the output of a function fancier by wrapping its return value with additional text. We will make use of the decorator annotation (@) provided by Python.

def decorateMyFunction(originalFunction):
    '''Decorates a function by wrapping its return value in a pair of HTML paragraph tags.'''
    def addAdditionalText():
        # Obtain string returned by original function
        textFromOriginalFunction = originalFunction()
        # Adding new functionality to the function being decorated
        return "<p>" + textFromOriginalFunction + "</p>"
    return addAdditionalText
 
@decorateMyFunction
def functionToBeDecorated():
    '''A simple function that returns a string.'''
    return "Hi there!"
 
print( functionToBeDecorated() )                  # OUTPUT: <p>Hi there!</p>
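
The decorator above only works for functions that take no arguments, and the decorated function loses its own name and docstring. A common way around both, sketched here with hypothetical names (wrapInParagraph, greet), is to accept *args and **kwargs in the wrapper and apply functools.wraps from the standard library.

```python
import functools

def wrapInParagraph(originalFunction):
    '''Wraps the return value of any function in HTML paragraph tags.'''
    @functools.wraps(originalFunction)      # copy name, docstring etc. onto the wrapper
    def wrapper(*args, **kwargs):
        # Forward whatever arguments the caller supplied.
        return "<p>" + originalFunction(*args, **kwargs) + "</p>"
    return wrapper

@wrapInParagraph
def greet(name, punctuation="!"):
    '''Returns a greeting for the given name.'''
    return "Hi " + name + punctuation

print(greet("Ethan"))                       # <p>Hi Ethan!</p>
print(greet("Ethan", punctuation="?"))      # <p>Hi Ethan?</p>
print(greet.__name__)                       # greet
print(greet.__doc__)                        # Returns a greeting for the given name.
```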

Comparing two objects: Equality & Identity

While comparing two objects, there are two things to consider. First, whether the two names refer to the same object in memory. Second, whether the values held by the two objects are the same.

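To verify whether two names refer to the same object in memory, you can use the is operator or compare the values returned by the builtin id() function. id() gives the "identity" of an object, i.e. an integer which is guaranteed to be unique and constant for this object during its lifetime. In CPython, this integer is the object's memory address, as help(id) notes. Two objects whose lifetimes overlap never share an id() value, although an id may be reused once an object has been destroyed. Note that the result of is on equal string literals depends on implementation details: the session below was run in the interactive interpreter, and the same literals placed in a single script may well end up as one shared object.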

>>> objOne = 'This is a string.'
>>> objTwo = objOne
>>> objThree = 'This is a string.'


>>> objOne is objTwo
True
>>> objOne is objThree
False


>>> id(objOne)
49223224
>>> id(objTwo)
49223224
>>> id(objThree)
49218208


>>> id(objOne) == id(objTwo)
True
>>> id(objOne) == id(objThree)
False
>>> 

To check whether the values held by the two objects are equal, you can use the == operator.

>>> objOne = 'This is a string.'
>>> objTwo = objOne
>>> objThree = 'This is a string.'
>>> 
>>> objOne == objTwo
True
>>> objOne == objThree
True
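
Because strings are immutable and may be shared by the interpreter, the difference is easier to see with mutable objects. The following sketch uses lists (the variable names are illustrative) and also shows the one place where is is the idiomatic choice: comparing against None.

```python
listOne = [1, 2, 3]
listTwo = listOne            # a second name for the same object
listThree = [1, 2, 3]        # a distinct object holding equal values

print(listOne == listThree)  # True: same values
print(listOne is listThree)  # False: different objects
print(listOne is listTwo)    # True: same object

# A change made through one name is visible through the other name,
# but not through the equal-but-distinct object.
listTwo.append(4)
print(listOne)               # [1, 2, 3, 4]
print(listThree)             # [1, 2, 3]

# None is a singleton, so identity is the usual way to test for it.
result = None
print(result is None)        # True
```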

Executing Python statements dynamically with compile(), exec() & eval()

Python offers two builtin functions to execute pieces of code put together as a string: eval() & exec(). These functions can be used to execute command inputs from a user, like in a custom interpreter. Optionally, the string can be fed to the builtin compile() function first, to create a code object (Python bytecode), which can then be handed over to eval() or exec() for execution.

The builtin functions eval() and exec() have 2 similarities and 2 differences. Similarities:

  1. Both eval() and exec() run Python code supplied either as a string or as a code object returned by the compile() function.
  2. Both eval() and exec() take two optional arguments: globals and locals.

These functions differ in the following aspects:

  1. eval() returns the result of the expression, while exec() always returns None.
  2. eval() only evaluates a single expression (anything that can appear on the right hand side of an assignment operation), whereas exec() can take code blocks having loops, try/except, def clauses.

Let's take a look at basic usage of these functions:

>>> a = 10
>>> eval('a + 5')				
15
>>> a
10
>>> exec('a + 10')				
>>> a
10
>>> exec('a = 20')
>>> a
20
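
The optional globals and locals arguments mentioned earlier are not used in the session above. As a rough sketch (the dictionary and variable names are illustrative), they let you run code against a namespace of your choosing instead of the caller's own variables.

```python
# Use a plain dictionary as the namespace for the evaluated code.
settings = {'a': 10}
print(eval('a + 5', settings))                  # 15

# exec() stores any names it creates in that same dictionary.
exec('b = a * 2', settings)
print(settings['b'])                            # 20

# globals and locals can also be supplied separately.
total = eval('a + c', {'a': 1}, {'c': 2})
print(total)                                    # 3
```

Passing an explicit dictionary keeps the executed code from reading or overwriting variables in the calling scope.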

Here are some more examples.

# eval() only evaluates an expression and returns the result. An expression is any combination of operators and operands that you can write on the right hand side of an assignment operation. 
>>> a = 10
>>> eval('a + 5')				
15
>>> eval('a = 20')									# does not evaluate anything other than an expression
Traceback (most recent call last):
    eval('a = 20')
  File "<string>", line 1
    a = 20
      ^
SyntaxError: invalid syntax


>>> eval( 'def anyFunction(): print(50)' )			# does not evaluate anything other than an expression
Traceback (most recent call last):
    eval( 'def anyFunction(): print(50)' )
  File "<string>", line 1
    def anyFunction(): print(50)
      ^
SyntaxError: invalid syntax

>>> eval('"Hi"')									# has a return value
'Hi'


>>> eval('if 1: print("Hi")')						# does not evaluate anything other than an expression
Traceback (most recent call last):
    eval('if 1: print("Hi")')
  File "<string>", line 1
    if 1: print("Hi")
     ^
SyntaxError: invalid syntax



# exec() supports execution of code-blocks, including statements such as assignment. It does not return any value.
>>> exec('a = 20')
>>> a
20

>>> exec( 'def anyFunction(): print(50)' )
>>> anyFunction()
50

>>> exec('print(1) \nprint(2)')
1
2

>>> exec('if 1: print("Hi")')
Hi

>>> exec('"Hi"')									# No output; no return value

In addition to the string form, the statements being passed to exec() & eval() can be in the form of a code object. The compile() function compiles the provided module/statement/expression, and creates a code object (contains Python bytecode) which can be passed to exec() & eval(). This is particularly useful when the same piece of code is being evaluated repeatedly. The following code snippet shows usage of the compile() function.


# compile(source, '<string>', 'eval') returns the code object which would have been executed if you had executed eval(source).
# source can only be a single expression in this mode.
>>> a = 5
>>> evalCodeObject = compile('a + 10', '<string>', 'eval')
>>> evaluatedValueOfa = eval(evalCodeObject)
>>> evaluatedValueOfa
15


# compile(source, '<string>', 'exec') returns the code object which would have been executed if you had executed exec(source).
# source can be code blocks having loops, try/except, def clauses.
>>> execCodeObject = compile('a = 8; a = a + 10; print(a)', '<string>', 'exec')
>>> executeCodeBlock = exec(execCodeObject)			# exec() always returns None, so executeCodeBlock is None
18


# compile(source, '<string>', 'single') compiles a single interactive statement: either one line of simple statements separated by semi-colons, or exactly one compound statement (a loop, an if-elif-else construct, a try-except-else-finally construct, a function definition). As in the interactive interpreter, expression statements that evaluate to something other than None have their value printed.

>>> singleCodeObject = compile('a = 50; print(a + 4); print(a + 10)', '<string>', 'single')			# multiple statements delimited by ;
>>> executeSingleCodeObject = exec(singleCodeObject)
54
60


>>> codeBlock = '''																					# an if-else construct 
if True:
	print("TRUE!")
else:
	pass
'''
>>> compiledCodeBlock = compile(codeBlock, '<string>', 'single')
>>> exec(compiledCodeBlock)
TRUE!



>>> codeBlock = '''																					# INVALID: two constructs
if True:
	print("TRUE!")
else:
	pass

if True:
	print("TRUE AGAIN!")
else:
	pass
'''
>>> compiledCodeBlock = compile(codeBlock, '<string>', 'single')
Traceback (most recent call last):
    compiledCodeBlock = compile(codeBlock, '<string>', 'single')
  File "<string>", line 7																			# referring to the second construct
    if True:
     ^
SyntaxError: invalid syntax



>>> codeBlock = '''																					# a function definition with semi-colon delimited statements
def functionOne(): print(6); print(12)
'''
>>> compiledCodeBlock = compile(codeBlock, '<string>', 'single')
>>> exec(compiledCodeBlock)
>>> functionOne()
6
12

>>> codeBlock = '''																					# INVALID: two constructs
def functionOne(): print(6); print(12)
def functionTwo(): print(18)
'''
>>> compiledCodeBlock = compile(codeBlock, '<string>', 'single')
Traceback (most recent call last):
    compiledCodeBlock = compile(codeBlock, '<string>', 'single')
  File "<string>", line 3
    def functionTwo(): print(18)
      ^
SyntaxError: invalid syntax
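
As mentioned earlier, compiling pays off when the same code is evaluated repeatedly. A minimal sketch of that pattern, with a made-up expression and data: the expression is compiled once and then evaluated against several namespaces.

```python
# Compile the expression once...
areaCode = compile('width * height', '<string>', 'eval')

rectangles = [
    {'width': 2, 'height': 3},
    {'width': 4, 'height': 5},
    {'width': 6, 'height': 7},
]

# ...and evaluate the resulting code object once per namespace.
# Each dictionary is passed as the locals argument of eval().
areas = [eval(areaCode, {}, rectangle) for rectangle in rectangles]
print(areas)                                    # [6, 20, 42]
```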

For further details on arguments of the compile() function, check out this post on Executing Python statements dynamically with compile(), exec() & eval().


Class Methods in Python: The @classmethod decorator

In object-oriented Python, an instance method receives the object on which it is called as its first argument (conventionally named self). It can operate on class attributes as well as instance attributes. Here's an example:

>>> class Toy:
	'''Toy class'''
	count = 0

	def __init__(self, name, color):
		'''sets instance attributes to provided values; increments counter and prints it.'''
		self.name = name
		self.color = color
		Toy.count += 1
		print("Toys manufactured so far:", Toy.count)

		
>>> woody = Toy('Woody', 'Brown')
Toys manufactured so far: 1
>>> buzz = Toy('Buzz Lightyear', 'White & Purple')
Toys manufactured so far: 2

@classmethod: A method following the @classmethod decorator receives the class, and not the instance, as its first argument. The decorator thus marks a method that operates on class attributes rather than instance attributes. Let's separate the initialization and the count increment into two different methods to demonstrate this.

>>> class Toy:
	'''Toy class'''
	count = 0

	def __init__(self, name, color):
		'''sets instance attributes to provided values; increments counter and prints it.'''
		self.name = name
		self.color = color
		self.incrementCount()
		
	@classmethod
	def incrementCount(cls):
		cls.count += 1
		print("Toys manufactured so far:", cls.count)

		
>>> woody = Toy('Woody', 'Brown')
Toys manufactured so far: 1
>>> buzz = Toy('Buzz Lightyear', 'White & Purple')
Toys manufactured so far: 2

Note that a class method can be invoked on an instance, as self.incrementCount() does above. We could have called it using the Toy.incrementCount() notation as well, but that would defeat the purpose of the example. Also, the first argument does not have to be named 'cls'; that is only the convention. It makes sense to pick a name that is synonymous with the word 'class', since class itself is a keyword and cannot be used.
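
The points in this note can be tried out with a slightly trimmed version of the class. In this sketch, the first argument of incrementCount is deliberately named klass, and manufactured() is a hypothetical helper added only to show a class method being called on both the class and an instance.

```python
class Toy:
    '''Toy class'''
    count = 0

    def __init__(self, name):
        self.name = name
        self.incrementCount()

    @classmethod
    def incrementCount(klass):          # any name works; 'cls' is the convention
        klass.count += 1

    @classmethod
    def manufactured(cls):              # hypothetical helper, not in the original class
        return cls.count

woody = Toy('Woody')
buzz = Toy('Buzz Lightyear')

print(Toy.manufactured())               # 2, called on the class
print(woody.manufactured())             # 2, called on an instance
```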


Static Methods in Python: The @staticmethod decorator

In object-oriented Python, an instance method receives the object on which it is called as its first argument (conventionally named self). It can operate on class attributes as well as instance attributes. I'll cite the example I used while demonstrating the @classmethod decorator.

>>> class Toy:
	'''Toy class'''
	count = 0

	def __init__(self, name, color):
		'''sets instance attributes to provided values; increments counter and prints it.'''
		self.name = name
		self.color = color
		Toy.count += 1
		print("Toys manufactured so far:", Toy.count)

		
>>> woody = Toy('Woody', 'Brown')
Toys manufactured so far: 1
>>> buzz = Toy('Buzz Lightyear', 'White & Purple')
Toys manufactured so far: 2

@staticmethod: A method following the @staticmethod decorator does NOT receive the class or the instance as its first argument. Rather, its first argument is a regular positional argument. This is used to denote utility methods, which belong in the class code, but don't operate on the instance or the class. In the example below, checkForName() performs a validation, something like a utility method. It does not operate on the instance or the class, and hence neither of these needs to be passed as an argument to it.

>>> class Toy:
	'''Toy class'''
	count = 0

	def __init__(self, name, color):
		'''sets instance attributes to provided values; runs a check on the name.'''
		self.name = name
		self.color = color
		self.checkForName(self.name)

	@staticmethod
	def checkForName(name):
		if name.startswith('w') or name.startswith('W'):
			print("Hey! This is a random check to see if your name begins with 'W', and it does!")
		else:
			print("Oh no! Your name does not start with W, you just missed out on a goodie!")

			
>>> woody = Toy('Woody', 'Brown')
Hey! This is a random check to see if your name begins with 'W', and it does!
>>> buzz = Toy('Buzz Lightyear', 'White & Purple')
Oh no! Your name does not start with W, you just missed out on a goodie!
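
Since a static method needs neither the class nor an instance, it can also be called directly on the class without creating any object. A small sketch, using a hypothetical startsWithW() method that returns a boolean instead of printing:

```python
class Toy:
    '''Toy class'''
    def __init__(self, name):
        self.name = name

    @staticmethod
    def startsWithW(name):
        # No self or cls here: 'name' is an ordinary positional argument.
        return name.lower().startswith('w')

# Called on the class, no instance required.
print(Toy.startsWithW('Woody'))                 # True

# Called on an instance; the instance is not passed in implicitly.
buzz = Toy('Buzz Lightyear')
print(buzz.startsWithW(buzz.name))              # False
```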

Using semi-colon (;) to delimit statements in a suite

Python uses newlines as statement delimiters, in contrast to many languages which use semi-colons to delimit statements. However, Python also allows semi-colons between simple statements on the same line. A clause in Python consists of a header and a suite. For example, the def clause, used for defining functions, will have 'def functionName():' as header while the statements constituting the function body will serve as the associated suite. You can put semi-colon delimited statements in suites to put multiple statements on the same line.

>>> def anyFunction(): print(5); print(6); print(7)

>>> anyFunction()
5
6
7

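Allowing this is a design decision on the part of the Python developers. It is not a necessary feature of Python, but it is handy for short one-liners. For anything longer, one statement per line is usually more readable, and PEP 8 generally discourages putting multiple statements on the same line.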
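
Semi-colons work outside function definitions too, but only between simple statements. The sketch below (with illustrative variable names) shows both, and uses compile() to confirm that a compound statement such as if cannot follow a semi-colon.

```python
# Several simple statements on one line.
x = 1; y = 2; print(x + y)                      # 3

# A suite of simple statements after a loop header.
for i in range(2): print(i); print(i * 10)      # prints 0, 0, 1, 10 on separate lines

# A compound statement may not follow a semi-colon.
try:
    compile('a = 1; if a: print(a)', '<string>', 'exec')
    compoundAfterSemicolon = True
except SyntaxError:
    compoundAfterSemicolon = False
print(compoundAfterSemicolon)                   # False
```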


Augmented Assignments

There are cases when you store the result of a binary operation back into one of the operands, for example a = a + b or a = a + 1. In such cases, the operand a is evaluated twice. Python offers an alternative, known as augmented assignment, in which the target a is evaluated only once and, where the type supports it (lists, for instance), the operation is performed in place instead of creating a new object. This makes it a convenient shorthand for binary operations, and it can be more efficient as well.

# REGULAR ASSIGNMENT
>>> x = 10
>>> x = x + 5
>>> x
15

# AUGMENTED ASSIGNMENT
>>> x = 10
>>> x += 5
>>> x
15

The operator is not limited to +; it can be any of the following binary operators: +, -, *, /, //, %, **, >>, <<, &, ^, |, and (since Python 3.5) @.
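
With integers, the two forms are indistinguishable. The difference shows up with mutable objects and with targets that are expensive or have side effects. A minimal sketch, using made-up names (alias, nextIndex) for illustration:

```python
# With a list, += extends the existing object in place...
original = [1, 2]
alias = original
original += [3]
print(alias)                    # [1, 2, 3]: alias sees the change

# ...while the regular form builds a new list and rebinds the name.
original = original + [4]
print(alias)                    # [1, 2, 3]: alias still refers to the old list
print(original)                 # [1, 2, 3, 4]

# The target expression is evaluated only once.
calls = []
def nextIndex():
    calls.append('called')      # record every call
    return 0

counts = [10]
counts[nextIndex()] += 1
print(counts, len(calls))       # [11] 1
```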


Multiple Assignments

Python allows you to make multiple assignments in a single statement. This makes for compact and concise code.

>>> a, b, c = 1, 2, 3
>>> a
1
>>> b
2
>>> c
3

If this compromises the readability of your code, feel free to drop it.
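
Two common uses of this feature are swapping values without a temporary variable and splitting a sequence with a starred target. The sketch below also shows what happens when the number of targets and values does not match; the variable names are illustrative.

```python
# Swap two values without a temporary variable.
a, b = 1, 2
a, b = b, a
print(a, b)                     # 2 1

# A starred target collects the remaining items into a list.
first, *rest = [10, 20, 30, 40]
print(first, rest)              # 10 [20, 30, 40]

# The number of targets must match the number of values.
try:
    p, q = 1, 2, 3
    mismatchAllowed = True
except ValueError:
    mismatchAllowed = False
print(mismatchAllowed)          # False
```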


Most Frequent Elements in an Iterable Object

The Counter class of the standard library module collections is a dictionary subclass that maps each element of a given iterable object to its number of occurrences. It provides a few useful methods, one of which is most_common(). This method takes an integer, say n, and returns a list of tuples holding the n most common elements along with their number of occurrences.

import collections

words = [
    '!', '@', '#', '$', '%',
    '@', '#', '$', '%',
    '#', '$', '%',
    '$', '%',
    '%',
]

characterCounts = collections.Counter(words)
threeMostFrequentCharacters = characterCounts.most_common(3)
print(threeMostFrequentCharacters)              # [('%', 5), ('$', 4), ('#', 3)]
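
Since a Counter is a dictionary, individual counts can be looked up by key, and it can be updated with more elements later on. A short sketch with made-up data:

```python
import collections

inventory = collections.Counter(['arrow', 'bow', 'arrow', 'sword', 'arrow', 'bow'])
print(inventory.most_common(2))         # [('arrow', 3), ('bow', 2)]

# Individual counts are looked up like dictionary values;
# a missing element counts as 0 instead of raising a KeyError.
print(inventory['arrow'])               # 3
print(inventory['shield'])              # 0

# update() adds the counts of further elements to the existing ones.
inventory.update(['sword', 'sword', 'sword'])
print(inventory.most_common(1))         # [('sword', 4)]
```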

See also: 50+ Know-How(s) Every Pythonista Must Know

