My DS Coding Bolg: Learning Python 2

-- Everything is an object

-- Mutable or immutable? That is the question

If the value can change, the object is called mutable, while if the value cannot change, the object is called immutable.

age=42

id(age)

age=43

id(age)

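An int is immutable: age = 43 does not change the object 42, it binds the name age to a different object, so id(age) changes. By contrast, an instance of a custom class is mutable: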

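class Person: # minimal class so the example runs (assumed, not shown in the original notes)
    def __init__(self, age):
        self.age = age

fab = Person(age=39)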

fab.age

id(fab)

fab.age=29

id(fab)

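id(fab) is the same before and after fab.age changes, so the Person object is mutable: it was modified in place.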
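The same contrast can be checked with built-in types only. This is a minimal sketch; the names `n` and `items` are illustrative and not from the original notes.

```python
# Immutable: "changing" an int really rebinds the name to another object.
n = 42
id_before = id(n)
n = 43
int_rebound = id(n) != id_before  # the name now points to a different object

# Mutable: a list can change in place and keep its identity.
items = [1, 2, 3]
list_id_before = id(items)
items.append(4)
list_same_object = id(items) == list_id_before

print(int_rebound, list_same_object)
```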

-- Numbers

Numbers are immutable objects.

Integers

a = 12

b = 3

a + b # addition

b - a # subtraction

a // b # integer division

a / b # true division

a * b # multiplication

b ** a # power operator

2 ** 1024 # a very big number, Python handles it gracefully

7 / 4 # true division

7 // 4 # integer division, flooring returns 1

-7 / 4 # true division again, result is opposite of previous

-7 // 4 # integer div., result not the opposite of previous

int(1.75)

int(-1.75)

10 % 3 # remainder of the division 10 // 3

10 % 4 # remainder of the division 10 // 4
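Floor division and the remainder operator are tied together by the identity a == (a // b) * b + a % b, which also holds for negative operands. A small sketch of that relationship follows; the helper name `check_division_identity` is made up for illustration.

```python
def check_division_identity(a, b):
    """Return True if a == (a // b) * b + a % b."""
    quotient, remainder = divmod(a, b)  # same as (a // b, a % b)
    return a == quotient * b + remainder

floor_result = -7 // 4      # floors toward negative infinity
trunc_result = int(-7 / 4)  # int() truncates toward zero
mod_result = -7 % 4         # the sign of the result follows the divisor

print(floor_result, trunc_result, mod_result)
print(check_division_identity(-7, 4), check_division_identity(10, 3))
```

This is why -7 // 4 and int(-7 / 4) disagree: the first floors, the second truncates.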

Booleans

int(True) # True behaves like 1

int(False) # False behaves like 0

bool(1) # 1 evaluates to True in a boolean context

bool(-42) # and so does every non-zero number

bool(0) # 0 evaluates to False

# quick peek at the operators (and, or, not)

not True

not False

True and True

False or True

1 + True

False + 42

7 - True
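Because bool is a subclass of int, booleans can be summed directly, which gives a compact way to count how many items satisfy a condition. A short sketch with made-up sample data:

```python
values = [3, -1, 0, 8, 5]  # illustrative data, not from the original notes

# Each comparison yields True or False; sum() treats them as 1 and 0.
positives = sum(v > 0 for v in values)

bool_is_int = issubclass(bool, int)

print(positives, bool_is_int)
```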

Reals

http://en.wikipedia.org/wiki/Double-precision_floating-point_format

pi = 3.1415926536 # how many digits of PI can you remember?

radius = 4.5

area = pi * (radius ** 2)

area

import sys

sys.float_info

3 * 0.1 - 0.3

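Double-precision numbers suffer from approximation issues even for simple values like 0.1 or 0.3, because most decimal fractions have no exact binary representation. That is why the expression above does not evaluate to exactly 0.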
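A common way to live with this is to compare floats with a tolerance instead of ==. The sketch below uses math.isclose from the standard library with its default tolerances; whether those defaults suit a given application is an assumption to verify.

```python
import math

diff = 3 * 0.1 - 0.3

exactly_zero = diff == 0                   # False: a tiny error remains
close_enough = math.isclose(3 * 0.1, 0.3)  # True with the default relative tolerance

print(diff, exactly_zero, close_enough)
```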

Complex numbers

c = 3.14 + 2.73j

c.real # real part

c.imag # imaginary part

c.conjugate() # conjugate of A + Bj is A - Bj

c * 2 # multiplication is allowed

c ** 2 # power operation as well

d = 1 + 1j # addition and subtraction as well

c - d

Fractions and decimals

from fractions import Fraction

Fraction(10, 6) # mad hatter? # notice it's been reduced to lowest terms

Fraction(1, 3) + Fraction(2, 3) # 1/3 + 2/3 = 3/3 = 1/1

f = Fraction(10, 6)

f.numerator

f.denominator

from decimal import Decimal as D # rename for brevity

D(3.14) # pi, from float, so approximation issues

D('3.14') # pi, from a string, so no approximation issues

D(0.1) * D(3) - D(0.3) # from float, we still have the issue

D('0.1') * D(3) - D('0.3') # from string, all perfect
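The difference between building from a float and building from a string can be made visible by comparing the resulting objects. This is a small sketch: the value built from a float carries the binary approximation of 0.1, while the value built from a string is exactly one tenth.

```python
from decimal import Decimal
from fractions import Fraction

fraction_from_float = Fraction(0.1)     # exact value of the binary float 0.1
fraction_from_string = Fraction('0.1')  # exactly 1/10

decimal_from_float = Decimal(0.1)
decimal_from_string = Decimal('0.1')

print(fraction_from_float == fraction_from_string)  # the two differ
print(fraction_from_string == Fraction(1, 10))
print(decimal_from_float == decimal_from_string)    # the two differ
```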

-- Immutable sequence

Strings and bytes

# 4 ways to make a string

str1 = 'This is a string. We built it with single quotes.'

str2 = "This is also a string, but built with double quotes."

str3 = '''This is built using triple quotes,
so it can span multiple lines.'''

str4 = """This too
is a multiline one
built with triple double-quotes."""

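str4 # echoing the name in the console shows the repr, with newlines written as \n:

'This too\nis a multiline one\nbuilt with triple double-quotes.'

print(str4) # print renders the newlines as actual line breaks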

len(str1)

Encoding and decoding strings

s = "This is üŋíc0de" # unicode string: code points

type(s)

encoded_s = s.encode('utf-8') # utf-8 encoded version of s

encoded_s

type(encoded_s) # another way to verify it

encoded_s.decode('utf-8') # let's revert to the original

bytes_obj = b"A bytes object" # a bytes object

type(bytes_obj)
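One practical consequence of the str/bytes split is that the two can have different lengths: a str counts code points, while bytes counts encoded bytes. A minimal sketch with an illustrative word (written with an escape so the source stays ASCII):

```python
text = "caf\u00e9"              # 'café': four code points
encoded = text.encode('utf-8')  # the accented letter takes two bytes in UTF-8

chars = len(text)
size_in_bytes = len(encoded)
round_trip = encoded.decode('utf-8') == text

print(chars, size_in_bytes, encoded, round_trip)
```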

Indexing and slicing strings

my_sequence[start:stop:step]

s = "The trouble is you think you have time."

s[0] # indexing at position 0, which is the first char

s[5] # indexing at position 5, which is the sixth char

s[:4] # slicing, we specify only the stop position

s[4:] # slicing, we specify only the start position

s[2:14] # slicing, both start and stop positions

s[2:14:3] # slicing, start, stop and step (every 3 chars)

s[:] # quick way of making a copy
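A few more slicing idioms follow from the start:stop:step form. The sketch below reuses the sentence from the examples above; negative values count from the end, and a negative step walks the sequence backwards.

```python
s = "The trouble is you think you have time."

first_word = s[:3]     # 'The'
last_word = s[-5:]     # 'time.' (negative start counts from the end)
reversed_s = s[::-1]   # a negative step reverses the string
too_far = s[:1000]     # out-of-range slice bounds are clipped, no error

print(first_word, last_word)
print(reversed_s)
```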

Tuples

A tuple is a sequence of arbitrary Python objects. In a tuple, items are separated by commas.

t = () # empty tuple

type(t)

one_element_tuple = (42, ) # you need the comma!

three_elements_tuple = (1, 3, 5)

a, b, c = 1, 2, 3 # tuple for multiple assignment

a, b, c # implicit tuple to print with one instruction

3 in three_elements_tuple # membership test

one-line swaps

a, b = 1, 2

c = a # we need three lines and a temporary var c

a = b

b = c

a, b # a and b have been swapped

a, b = b, a # this is the Pythonic way to do it

a, b

-- Mutable sequences

There are two mutable sequence types in Python: lists and byte arrays.

Lists

[] # empty list

list()

[1,2,3]

[x + 5 for x in [2, 3, 4]]

list((1,3,5,7,9))

list('hello')

The bracketed for expression above is a list comprehension, a very powerful functional feature of Python.

a = [1, 2, 1, 3]

a.append(13) # we can append anything at the end

a.count(1) # how many `1` are there in the list?

a.extend([5, 7]) # extend the list by another (or sequence)

a.index(13) # position of `13` in the list (0-based indexing)

a.insert(0, 17) # insert `17` at position 0

a.pop() # pop (remove and return) last element

a.pop(3) # pop element at position 3

a.remove(17) # remove `17` from the list

a.reverse() # reverse the order of the elements in the list

a.sort() # sort the list

a.clear() # remove all elements from the list

a = list('hello') # makes a list from a string

a.append(100) # append 100, heterogeneous type

a.extend((1, 2, 3)) # extend using tuple

a.extend('...') # extend using string

a = [1, 3, 5, 7]

min(a) # minimum value in the list

max(a) # maximum value in the list

sum(a) # sum of all values in the list

len(a) # number of elements in the list

b = [6, 7, 8]

a + b # `+` with list means concatenation

a * 2 # `*` has also a special meaning

This is operator overloading: operators such as +, -, *, and % may represent different operations depending on the context they are used in. Adding two lists arithmetically makes no sense, so + concatenates them. Likewise, * concatenates the list to itself as many times as the right operand says.
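User-defined classes can take part in operator overloading by implementing special methods such as __add__ and __mul__. The following is only a sketch: `Vector2D` is a hypothetical class invented for this illustration, not something from the notes or the standard library.

```python
class Vector2D:
    """Hypothetical 2D vector, used only to illustrate operator overloading."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # `v + w` adds component by component
        return Vector2D(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):
        # `v * n` scales both components
        return Vector2D(self.x * factor, self.y * factor)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)


total = Vector2D(1, 2) + Vector2D(3, 4)
doubled = Vector2D(1, 2) * 2

# The same operators mean concatenation and repetition for lists.
joined = [1, 2] + [3]
repeated = [1, 2] * 2
```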

from operator import itemgetter

a = [(5, 3), (1, 3), (1, 2), (2, -1), (4, 9)]

sorted(a)

sorted(a, key=itemgetter(0))

sorted(a, key=itemgetter(0, 1))

sorted(a, key=itemgetter(1))

sorted(a, key=itemgetter(1), reverse=True)
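Two details of the sorted calls above are worth spelling out: itemgetter(0) behaves like a small lambda, and Python's sort is stable, so items with equal keys keep their original relative order (also with reverse=True). A sketch using the same data:

```python
from operator import itemgetter

a = [(5, 3), (1, 3), (1, 2), (2, -1), (4, 9)]

by_first = sorted(a, key=itemgetter(0))
by_first_lambda = sorted(a, key=lambda pair: pair[0])  # equivalent key function

# (1, 3) stays ahead of (1, 2): they tie on the key and the sort is stable.
print(by_first)

by_second_desc = sorted(a, key=itemgetter(1), reverse=True)
print(by_second_desc)
```

Note that sorted returns a new list and leaves `a` untouched, unlike a.sort().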

Byte arrays

bytearray() # empty bytearray object

bytearray(10) # zero-filled instance with given length

bytearray(range(5)) # bytearray from iterable of integers

name = bytearray(b'Lina') # bytearray from bytes

name.replace(b'L', b'l')

name.endswith(b'na')

name.upper()

name.count(b'L')
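The methods shown above, such as replace and upper, return new objects rather than modifying name. What makes a bytearray mutable is item assignment and methods like extend, as this short sketch shows:

```python
name = bytearray(b'Lina')

replaced = name.replace(b'L', b'l')  # returns a new bytearray
name_unchanged = name == bytearray(b'Lina')

name[0] = ord('l')   # in-place change; items are integers in range(256)
name.extend(b'!')    # in-place growth

print(replaced, name_unchanged, name)
```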

-- Set types

Python also provides two set types, set and frozenset. The set type is mutable, while frozenset is immutable. Both are unordered collections of hashable objects (in practice, usually immutable ones).

Hashability is a characteristic that allows an object to be used as a set member as well as a key for a dictionary.

small_primes = set() # empty set

small_primes.add(2) # adding one element at a time

small_primes.add(3)

small_primes.add(5)

small_primes

small_primes.add(1) # Look what I've done, 1 is not a prime!

small_primes

small_primes.remove(1) # so let's remove it

3 in small_primes # membership test

4 in small_primes

4 not in small_primes # negated membership test

small_primes.add(3) # trying to add 3 again

small_primes

bigger_primes = set([5, 7, 11, 13]) # faster creation

small_primes | bigger_primes # union operator `|`

small_primes & bigger_primes # intersection operator `&`

small_primes - bigger_primes # difference operator `-`

small_primes = {2, 3, 5, 5, 3}

small_primes

The immutable counterpart of the set type is frozenset.

small_primes = frozenset([2, 3, 5, 7])

bigger_primes = frozenset([5, 7, 11])

small_primes.add(11) # AttributeError: we cannot add to a frozenset

small_primes.remove(2) # AttributeError: nor can we remove from it

small_primes & bigger_primes # intersect, union, etc. allowed
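Hashability, mentioned at the start of this section, can be probed with the built-in hash function. In the sketch below, `is_hashable` is a helper written for this illustration; it shows that tuples and frozensets can be set members or dictionary keys, while lists and sets cannot.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (illustrative helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


results = {
    'tuple': is_hashable((1, 2)),
    'frozenset': is_hashable(frozenset([1, 2])),
    'list': is_hashable([1, 2]),
    'set': is_hashable({1, 2}),
}
print(results)

# A frozenset works as a dictionary key; element order does not matter.
groups = {frozenset([2, 3]): 'small primes'}
label = groups[frozenset([3, 2])]
```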

-- Mapping types - dictionaries

A dictionary maps keys to values. Keys need to be hashable objects, while values can be of any arbitrary type. Dictionaries are mutable objects.

a = dict(A=1, Z=-1)

b = {'A': 1, 'Z': -1}

c = dict(zip(['A', 'Z'], [1, -1]))

d = dict([('A', 1), ('Z', -1)])

e = dict({'Z': -1, 'A': 1})

a == b == c == d == e # are they all the same?

list(zip(['h', 'e', 'l', 'l', 'o'], [1, 2, 3, 4, 5]))

list(zip('hello', range(1, 6))) # equivalent, more Pythonic

d = {}

d['a'] = 1 # let's set a couple of (key, value) pairs

d['b'] = 2

len(d) # how many pairs?

d['a'] # what is the value of 'a'?

d # how does `d` look now?

del d['a'] # let's remove `a`

d['c'] = 3 # let's add 'c': 3

'c' in d # membership is checked against the keys

3 in d # not the values

'e' in d

d.clear() # let's clean everything from this dictionary

Three special objects called dictionary views: keys, values, and items.

keys() returns all the keys in the dictionary

values() returns all the values in the dictionary

items() returns all the (key, value) pairs in the dictionary

d = dict(zip('hello', range(5)))

d.keys()

d.values()

d.items()

3 in d.values()

('o', 4) in d.items()
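Dictionary views are not snapshots: they reflect later changes to the dictionary, and keys views support set operations. A brief sketch, reusing the dictionary built above:

```python
d = dict(zip('hello', range(5)))  # 'l' appears twice, so the later value wins

keys = d.keys()
count_before = len(keys)

d['x'] = 99                # the existing view sees the new key
count_after = len(keys)

common = d.keys() & {'h', 'z'}  # set intersection on a keys view

print(count_before, count_after, common)
```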

d.popitem() # removes and returns a (key, value) pair: the last inserted one since Python 3.7, an arbitrary one before

d.pop('l') # remove item with key `l`

d.pop('not-a-key') # remove a key not in dictionary: KeyError

d.pop('not-a-key', 'default-value') # with a default value?

d.update({'another': 'value'}) # we can update dict this way

d.update(a=13) # or this way (like a function call)

d.get('a') # same as d['a'] but if key is missing no KeyError

d.get('a', 177) # default value used if key is missing

d.get('b', 177) # like in this case

d.get('b') # key is not there, so None is returned

d = {}

d.setdefault('a', 1) # 'a' is missing, we get default value

d.setdefault('a', 5) # let's try to override the value

d = {}

d.setdefault('a', {}).setdefault('b', []).append(1)
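To see what the setdefault calls above return and build, here is a small sketch that stores the results. setdefault returns the existing value when the key is present and only inserts the default when it is missing, which is what makes the chained form safe to repeat.

```python
d = {}
first = d.setdefault('a', 1)   # 'a' is missing: 1 is stored and returned
second = d.setdefault('a', 5)  # 'a' exists: the stored 1 is returned, 5 is ignored

nested = {}
nested.setdefault('a', {}).setdefault('b', []).append(1)
nested.setdefault('a', {}).setdefault('b', []).append(2)  # reuses the same inner list

print(first, second, d)
print(nested)
```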

-- The collections module

When Python's general-purpose built-in containers (tuple, list, set, and dict) aren't enough, we can find specialized container data types in the collections module.

Three of them are covered here: namedtuple, defaultdict, and ChainMap.

Named tuples

A namedtuple is a tuple-like object that has fields accessible by attribute lookup as well as being indexable and iterable (it's actually a subclass of tuple).

vision = (9.5, 8.8)

vision

vision[0] # left eye (implicit positional reference)

vision[1] # right eye (implicit positional reference)

from collections import namedtuple

Vision = namedtuple('Vision', ['left', 'right'])

vision = Vision(9.5, 8.8)

vision[0]

vision.left # same as vision[0], but explicit

vision.right # same as vision[1], but explicit

Vision = namedtuple('Vision', ['left', 'combined', 'right'])

vision = Vision(9.5, 9.2, 8.8)

vision.left # still perfect

vision.right # still perfect (though now is vision[2])

vision.combined # the new vision[1]
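Since a namedtuple is still a tuple, it unpacks like one and cannot be modified in place. The sketch below also uses the _replace and _asdict helper methods that namedtuple classes provide; the two-field Vision definition is the earlier one from this section.

```python
from collections import namedtuple

Vision = namedtuple('Vision', ['left', 'right'])
vision = Vision(9.5, 8.8)

left, right = vision                  # unpacks like a plain tuple
updated = vision._replace(right=9.0)  # returns a new instance
as_dict = vision._asdict()            # field name -> value mapping

try:
    vision.left = 10.0                # fields are read-only
    assignment_failed = False
except AttributeError:
    assignment_failed = True

print(left, right, updated, assignment_failed)
```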

Defaultdict

d = {}

d['age'] = d.get('age', 0) + 1 # age not there, we get 0 + 1

d = {'age': 39}

d['age'] = d.get('age', 0) + 1 # age is there, we get 40

from collections import defaultdict

dd = defaultdict(int) # int is the default type (0 the value)

dd['age'] += 1 # short for dd['age'] = dd['age'] + 1

dd['age'] = 39

dd['age'] += 1
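The default factory does not have to be int. A frequent pattern, sketched here with made-up sample words, is defaultdict(list) for grouping. One caveat worth knowing: merely reading a missing key inserts it.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']  # illustrative data

by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)  # a missing key starts as an empty list

counts = defaultdict(int)
_ = counts['missing']                # reading a missing key creates it with value 0
key_was_created = 'missing' in counts

print(dict(by_letter), key_was_created)
```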

ChainMap

from collections import ChainMap

default_connection = {'host': 'localhost', 'port': 4567}

connection = {'port': 5678}

conn = ChainMap(connection, default_connection) # map creation

conn['port'] # port is found in the first dictionary

conn['host'] # host is fetched from the second dictionary

conn.maps # we can see the mapping objects

conn['host'] = 'packtpub.com' # let's add host

conn.maps

del conn['port'] # let's remove the port information

conn.maps

conn['port'] # now port is fetched from the second dictionary

dict(conn) # easy to merge and convert to regular dictionary
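The behaviour seen in conn.maps above follows from one rule: lookups search every mapping in order, but writes and deletions only touch the first one. A compact sketch, where `defaults` and `overrides` are illustrative names and the host value is a placeholder:

```python
from collections import ChainMap

defaults = {'host': 'localhost', 'port': 4567}
overrides = {'port': 5678}
conn = ChainMap(overrides, defaults)

conn['host'] = 'example.com'  # written to `overrides`, the first mapping
del conn['port']              # removed from `overrides` only
port_now = conn['port']       # lookup falls through to `defaults`

try:
    del conn['port']          # 'port' is no longer in the first mapping
    second_delete_failed = False
except KeyError:
    second_delete_failed = True

print(overrides, defaults, port_now, second_delete_failed)
```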

-- Final considerations

Small values caching

a = 1000000

b = 1000000

id(a) == id(b) # typically False when typed line by line in the CPython console: two distinct int objects

a = 5

b = 5

id(a) == id(b) # True in CPython, which caches small integers (-5 to 256)
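The practical lesson is to compare values with == and keep identity checks (is, or comparing id results) for cases where object identity really matters. The sketch below builds the integers at run time; the identity result is an implementation detail of the interpreter, so it is printed rather than relied upon.

```python
a = int('1000000')
b = int('1000000')

equal_values = a == b  # value equality: always True here
same_object = a is b   # identity: implementation-dependent, do not rely on it

print(equal_values, same_object)
```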

How to choose data structures

# example customer objects

customer1 = {'id': 'abc123', 'full_name': 'Master Yoda'}

customer2 = {'id': 'def456', 'full_name': 'Obi-Wan Kenobi'}

customer3 = {'id': 'ghi789', 'full_name': 'Anakin Skywalker'}

# collect them in a tuple

customers = (customer1, customer2, customer3)

# or collect them in a list

customers = [customer1, customer2, customer3]

# or maybe within a dictionary, they have a unique id after all

customers = {

'abc123': customer1,

'def456': customer2,

'ghi789': customer3,

}
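One way to weigh these options is to look at how a customer is found by id in each structure. This is a sketch under the assumption that lookup by id is the main operation; `find_in_list` is a helper written for the illustration.

```python
customer1 = {'id': 'abc123', 'full_name': 'Master Yoda'}
customer2 = {'id': 'def456', 'full_name': 'Obi-Wan Kenobi'}
customer3 = {'id': 'ghi789', 'full_name': 'Anakin Skywalker'}

customers_list = [customer1, customer2, customer3]
customers_by_id = {c['id']: c for c in customers_list}


def find_in_list(customers, customer_id):
    """Linear scan: checks customers one by one until the id matches."""
    for customer in customers:
        if customer['id'] == customer_id:
            return customer
    return None


from_list = find_in_list(customers_list, 'def456')
from_dict = customers_by_id['def456']         # direct lookup by key
missing = customers_by_id.get('zzz000')       # None instead of KeyError

print(from_list is from_dict, missing)
```

The list keeps insertion order and allows duplicates; the dictionary gives direct access by id, at the cost of requiring unique keys.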

About indexing and slicing

Slicing in general applies to a sequence, so tuples, lists, strings, etc.

With lists, slicing can also be used for assignment.
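A short sketch of slice assignment on a list: the replacement does not need to have the same length as the slice it replaces, so a list can grow or shrink this way.

```python
a = list(range(10))

a[2:5] = ['x', 'y']  # replace three items with two
after_replace = a[:]

a[:2] = []           # assigning an empty list deletes the slice
after_delete = a[:]

del a[-2:]           # del also works with slices

print(after_replace)
print(after_delete)
print(a)
```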

Could you slice dictionaries or sets? I hear you scream "Of course not! They are not ordered!".

a = list(range(10)) # `a` has 10 elements. Last one is 9.

len(a) # its length is 10 elements

a[len(a) - 1] # position of last one is len(a) - 1

a[-1] # but we don't need len(a)! Python rocks!

a[-2] # equivalent to a[len(a) - 2]

a[-3] # equivalent to a[len(a) - 3]

This is called negative indexing: index -1 is the last element, -2 the one before it, and so on.

About the names


Wednesday, March 16, 2016

Learning Python 2 - Built-in Data Types
