My DS Coding Blog: Learning Python 7 - Testing, Profiling, and Dealing with Exceptions

test-driven development (TDD)

-- Testing your application

White-box tests exercise the internals of the code and inspect it down to a very fine level of granularity.
Black-box tests, on the other hand, treat the software under test as a closed box whose internals are ignored.

-- The anatomy of a test

Preparation
Execution
Verification

-- Testing guidelines

Keep them as simple as possible
Tests should verify one thing and one thing only
Tests should not make any unnecessary assumptions when verifying data
Tests should exercise the what, rather than the how
Tests should assume as little as possible in the preparation phase
Tests should use up the least possible amount of resources

Unit testing

Writing a unit test

data.py

def get_clean_data(source):
    data = load_data(source)
    cleaned_data = clean_data(data)
    return cleaned_data

Mock objects and patching

The fake objects are called mocks. Up to Python 3.3, the mock library was a third-party package that basically every project would install via pip; since version 3.3 it has shipped in the standard library as unittest.mock.

The act of replacing a real object or function with a mock is called patching. Once you have replaced everything that must not run with suitable mocks, you can move on to the second phase of the test and run the code you are exercising. After the execution, you can inspect the mocks to verify that your code called them as expected.
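
A minimal sketch of patching, applied to a get_clean_data function like the one in data.py. The bodies of load_data and clean_data are hypothetical stand-ins (the notes never show them), and because the sketch lives in a single file it swaps the module's global names with patch.dict; in a package you would more commonly write something like patch('package.module.load_data').

```python
from unittest.mock import Mock, patch


def load_data(source):
    # Hypothetical stand-in: the real function might read a file or a database.
    raise RuntimeError('the real load_data must not run in a unit test')


def clean_data(data):
    # Hypothetical stand-in for the real cleaning step.
    raise RuntimeError('the real clean_data must not run in a unit test')


def get_clean_data(source):
    data = load_data(source)
    cleaned_data = clean_data(data)
    return cleaned_data


def run_patched_example():
    # preparation: build the mocks and decide what they return
    load_mock = Mock(return_value=[1, None, 2])
    clean_mock = Mock(return_value=[1, 2])

    # execution: the two global names are replaced only inside the with block
    with patch.dict(globals(), {'load_data': load_mock,
                                'clean_data': clean_mock}):
        result = get_clean_data('some-source')

    # verification: check how the mocks were called
    load_mock.assert_called_once_with('some-source')
    clean_mock.assert_called_once_with([1, None, 2])
    return result
```

Calling run_patched_example() returns the value produced by the clean_data mock, and the real functions are restored as soon as the with block exits.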

Assertions

An assertion is a function (or method) that you can use to verify equality between objects, as well as other conditions.
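
As an illustration, here is a small sketch using the assertion methods that unittest.TestCase provides. The class name and the values being checked are made up for this example.

```python
import io
import unittest


class AssertionExamples(unittest.TestCase):

    def test_equality(self):
        self.assertEqual(4, 2 + 2)
        self.assertListEqual([1, 2], sorted([2, 1]))

    def test_other_conditions(self):
        self.assertTrue('abc'.islower())
        self.assertIn(3, [1, 2, 3])
        self.assertIsNone({}.get('missing'))

    def test_expected_exception(self):
        # The assertion passes only if the block raises the given exception.
        with self.assertRaises(ZeroDivisionError):
            1 / 0


def run_assertion_examples():
    # Run the test case programmatically and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertionExamples)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

When an assertion fails, the test is reported as a failure together with a message describing the two values that differed.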


A classic unit test example

Mocks, patches, and assertions are the basic tools we'll be using to write tests. 

filter_funcs.py

def filter_ints(v):
    return [num for num in v if is_positive(num)]

def is_positive(n):
    return n > 0

tests/test_ch7/test_filter_funcs.py

from unittest import TestCase
from unittest.mock import patch, call
from nose.tools import assert_equal
from ch7.filter_funcs import filter_ints

class FilterIntsTestCase(TestCase):

    @patch('ch7.filter_funcs.is_positive')  # swap is_positive for a mock
    def test_filter_ints(self, is_positive_mock):  # the mock arrives as an argument
        # preparation
        v = [3, -4, 0, 5, 8]

        # execution
        filter_ints(v)

        # verification: is_positive was called once per element, in order
        assert_equal(
            [call(3), call(-4), call(0), call(5), call(8)],
            is_positive_mock.call_args_list
        )

$ pip install nose

$ nosetests tests/test_ch7/

Making a test fail

$ nosetests tests/test_ch7/

Interface testing

tests/test_ch7/test_filter_funcs.py

def test_filter_ints_return_value(self):
    v = [3, -4, 0, -2, 5, 0, 8, -1]

    result = filter_ints(v)

    assert_list_equal([3, 5, 8], result)

$ nosetests tests/test_ch7/

Comparing tests with and without mocks

filter_funcs_refactored.py

def filter_ints(v):
    v = [num for num in v if num != 0]  # drop zeros before the positivity check
    return [num for num in v if is_positive(num)]

$ nosetests tests/test_ch7/test_filter_funcs_refactored.py 

You must keep your mocks up-to-date and in sync with the code they are replacing, otherwise you risk having issues like the preceding one, or even worse. 
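
The sketch below reproduces the kind of mismatch this warns about. It assumes the mock-based test still expects is_positive to be called for every element, including 0, while the refactored filter_ints now drops zeros first. The helper recorded_calls is hypothetical, and the single-file layout is why it patches the global name with patch.dict.

```python
from unittest.mock import Mock, call, patch


def is_positive(n):
    return n > 0


def filter_ints(v):
    # Refactored version: zeros are removed before is_positive is consulted.
    v = [num for num in v if num != 0]
    return [num for num in v if is_positive(num)]


def recorded_calls(v):
    # The mock records its calls but still answers like the real function.
    is_positive_mock = Mock(side_effect=lambda n: n > 0)
    with patch.dict(globals(), {'is_positive': is_positive_mock}):
        result = filter_ints(v)
    return is_positive_mock.call_args_list, result


calls, result = recorded_calls([3, -4, 0, 5, 8])

# What the mock-based test was written to expect before the refactoring.
old_expectation = [call(3), call(-4), call(0), call(5), call(8)]

print(calls == old_expectation)  # False: call(0) never happens any more
print(result)                    # [3, 5, 8]: the visible behaviour is unchanged
```

The return value is still correct, so a test on the interface keeps passing, while the test that inspects the mock's calls breaks even though nothing observable changed.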

tests/test_ch7/test_filter_funcs_final.py

from unittest import TestCase
from nose.tools import assert_list_equal
from ch7.filter_funcs import filter_ints

class FilterIntsTestCase(TestCase):
    def test_filter_ints_return_value(self):
        v = [3, -4, 0, -2, 5, 0, 8, -1]
        result = filter_ints(v)
        assert_list_equal([3, 5, 8], result)

filter_funcs_triangulation.py

def filter_ints(v):
    return [3, 5, 8]

tests/test_ch7/test_filter_funcs_final_triangulation.py

def test_filter_ints_return_value(self):
    v1 = [3, -4, 0, -2, 5, 0, 8, -1]
    v2 = [7, -3, 0, 0, 9, 1]

    assert_list_equal([3, 5, 8], filter_ints(v1))
    assert_list_equal([7, 9, 1], filter_ints(v2))

Boundaries and granularity

tests/test_ch7/test_filter_funcs_is_positive_loose.py

def test_is_positive(self):
    assert_equal(False, is_positive(-2))  # before boundary
    assert_equal(False, is_positive(0))  # on the boundary
    assert_equal(True, is_positive(2))  # after the boundary

tests/test_ch7/test_filter_funcs_is_positive_correct.py

def test_is_positive(self):
    assert_equal(False, is_positive(-1))
    assert_equal(False, is_positive(0))
    assert_equal(True, is_positive(1))

tests/test_ch7/test_filter_funcs_is_positive_better.py

def test_is_positive(self):
    assert_equal(False, is_positive(0))
    for n in range(1, 10 ** 4):
        assert_equal(False, is_positive(-n))
        assert_equal(True, is_positive(n))

A more interesting example

data_flatten.py

nested = {
    'fullname': 'Alessandra',
    'age': 41,
    'phone-numbers': ['+447421234567', '+447423456789'],
    'residence': {
        'address': {
            'first-line': 'Alexandra Rd',
            'second-line': '',
        },
        'zip': 'N8 0PP',
        'city': 'London',
        'country': 'UK',
    },
}

flat = {
    'fullname': 'Alessandra',
    'age': 41,
    'phone-numbers': ['+447421234567', '+447423456789'],
    'residence.address.first-line': 'Alexandra Rd',
    'residence.address.second-line': '',
    'residence.zip': 'N8 0PP',
    'residence.city': 'London',
    'residence.country': 'UK',
}

data_flatten.py

def flatten(data, prefix='', separator='.'):
    """Flattens a nested dict structure. """
    if not isinstance(data, dict):
        return {prefix: data} if prefix else data

    result = {}
    for (key, value) in data.items():
        result.update(
            flatten(
                value,
                _get_new_prefix(prefix, key, separator),
                separator=separator))
    return result

def _get_new_prefix(prefix, key, separator):
    return (separator.join((prefix, str(key)))
            if prefix else str(key))

tests/test_ch7/test_data_flatten.py

# ... imports omitted ...
class FlattenTestCase(TestCase):

    def test_flatten(self):
        test_cases = [
            ({'A': {'B': 'C', 'D': [1, 2, 3], 'E': {'F': 'G'}},
              'H': 3.14,
              'J': ['K', 'L'],
              'M': 'N'},
             {'A.B': 'C',
              'A.D': [1, 2, 3],
              'A.E.F': 'G',
              'H': 3.14,
              'J': ['K', 'L'],
              'M': 'N'}),
            (0, 0),
            ('Hello', 'Hello'),
            ({'A': None}, {'A': None}),
        ]
        for (nested, flat) in test_cases:
            assert_equal(flat, flatten(nested))

    def test_flatten_custom_separator(self):
        nested = {'A': {'B': {'C': 'D'}}}
        assert_equal(
            {'A#B#C': 'D'}, flatten(nested, separator='#'))
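
Two edge cases are not covered by the tests above and may be worth pinning down, depending on your requirements: an empty nested dict disappears from the output, and a key that already contains the separator can collide with a flattened path. The sketch repeats flatten so that it runs on its own; the inputs are made up for illustration.

```python
def flatten(data, prefix='', separator='.'):
    """Flattens a nested dict structure."""
    if not isinstance(data, dict):
        return {prefix: data} if prefix else data

    result = {}
    for key, value in data.items():
        result.update(
            flatten(
                value,
                _get_new_prefix(prefix, key, separator),
                separator=separator))
    return result


def _get_new_prefix(prefix, key, separator):
    return (separator.join((prefix, str(key)))
            if prefix else str(key))


# An empty dict contributes no keys, so 'A' vanishes from the result.
empty_branch = flatten({'A': {}, 'B': 1})

# 'A.B' exists both as a literal key and as a flattened path;
# the value processed last silently overwrites the first one.
collision = flatten({'A.B': 1, 'A': {'B': 2}})

print(empty_branch)  # {'B': 1}
print(collision)     # {'A.B': 2}
```

Whether these results are acceptable is a design decision; a test for each makes the chosen behaviour explicit.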

-- Test-driven development

TDD is a software development methodology that is based on the continuous repetition of a very short development cycle.
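
One turn of that cycle can be sketched as follows, assuming the common red/green/refactor reading of TDD. The slugify function and its requirement are hypothetical and exist only for this illustration.

```python
import io
import unittest


# Step 1 (red): the test is written first. Run before slugify exists,
# it fails with a NameError, which proves the test can fail.
class SlugifyTestCase(unittest.TestCase):

    def test_spaces_become_hyphens(self):
        self.assertEqual('learning-python', slugify('Learning Python'))


# Step 2 (green): the simplest implementation that makes the test pass.
def slugify(title):
    return title.lower().replace(' ', '-')


# Step 3 (refactor): clean up the code while the test stays green,
# then start again with the next failing test.
def run_cycle():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTestCase)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

Each new requirement goes through the same three steps, which keeps every cycle short.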

Pros:
You will refactor with much more confidence.
The code will be more readable.
The code will be more loosely coupled and easier to test and maintain.
Writing tests first requires you to have a better understanding of the business requirements.
Having everything unit tested means the code will be easier to debug.

Cons:
The whole company needs to believe in it.
If you fail to understand the business requirements, this will be reflected in the tests you write, and therefore in the code too.
Badly written tests are hard to maintain.

-- Exceptions

exceptions/first.example.py

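# Each statement marked "raises" would stop a script, so try them one at a
# time in an interactive session.
gen = (n for n in range(2))
next(gen)  # 0
next(gen)  # 1
next(gen)  # raises StopIteration
print(undefined_var)  # raises NameError
mylist = [1, 2, 3]
mylist[5]  # raises IndexError
mydict = {'a': 'A', 'b': 'B'}
mydict['c']  # raises KeyError
1 / 0  # raises ZeroDivisionError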

exceptions/try.syntax.py

def try_syntax(numerator, denominator):
    try:
        print('In the try block: {}/{}'
              .format(numerator, denominator))
        result = numerator / denominator
    except ZeroDivisionError as zde:
        print(zde)
    else:
        print('The result is:', result)
        return result
    finally:
        print('Exiting')

print(try_syntax(12, 4))
print(try_syntax(11, 0))

$ python exceptions/try.syntax.py 
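
To make the order of the blocks explicit, here is a sketch that records which blocks run instead of printing. The function name order_of_blocks and the steps list are illustrative additions, not part of the original script.

```python
def order_of_blocks(numerator, denominator):
    steps = []

    def attempt():
        try:
            steps.append('try')
            result = numerator / denominator
        except ZeroDivisionError:
            steps.append('except')
        else:
            steps.append('else')
            return result
        finally:
            # Runs in every case, even when the else block returns.
            steps.append('finally')

    value = attempt()
    return steps, value


print(order_of_blocks(12, 4))  # (['try', 'else', 'finally'], 3.0)
print(order_of_blocks(11, 0))  # (['try', 'except', 'finally'], None)
```

The else block runs only when no exception was raised, and finally runs last in both cases. When the division fails the function falls off the end and returns None.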

exceptions/json.example.py

import json
json_data = '{}'
try:
    data = json.loads(json_data)
except (ValueError, TypeError) as e:
    print(type(e), e)
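
Since '{}' is valid JSON, the except clause above never fires. The sketch below feeds the same try/except some inputs that do fail; the helper name parse_or_report and the sample inputs are assumptions made for this illustration.

```python
import json


def parse_or_report(json_data):
    """Return the parsed data, or the name of the exception that was caught."""
    try:
        return json.loads(json_data)
    except (ValueError, TypeError) as e:
        return type(e).__name__


print(parse_or_report('{}'))       # {}
print(parse_or_report('{"a": }'))  # JSONDecodeError
print(parse_or_report(None))       # TypeError
```

json.JSONDecodeError is a subclass of ValueError, which is why malformed text is caught by the ValueError part of the tuple.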

exceptions/multiple.except.py

try:
    # some code
except Exception1:
    # react to Exception1
except (Exception2, Exception3):
    # react to Exception2 or Exception3
except Exception4:
    # react to Exception4
...
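
A runnable sketch of the same shape, using built-in exceptions in place of the placeholder names. The function describe_failure and the handler labels are made up for this example.

```python
def describe_failure(action):
    """Call action() and report which handler dealt with its exception."""
    try:
        action()
    except KeyError:
        return 'KeyError handler'
    except (IndexError, ZeroDivisionError):
        return 'IndexError/ZeroDivisionError handler'
    except Exception:
        # Most general handler last: clauses are tried top to bottom and
        # the first one that matches wins.
        return 'generic handler'
    return 'no exception'


print(describe_failure(lambda: {}['missing']))  # KeyError handler
print(describe_failure(lambda: [][0]))          # IndexError/ZeroDivisionError handler
print(describe_failure(lambda: 1 / 0))          # IndexError/ZeroDivisionError handler
print(describe_failure(lambda: int('x')))       # generic handler
print(describe_failure(lambda: None))           # no exception
```

Because only the first matching clause runs, a handler placed after a more general one that already covers the same exception is never reached.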

exceptions/for.loop.py

n = 100
found = False
for a in range(n):
    if found: break
    for b in range(n):
        if found: break
        for c in range(n):
            if 42 * a + 17 * b + c == 5096:
                found = True
                print(a, b, c)  # 79 99 95

exceptions/for.loop.py

class ExitLoopException(Exception):
    pass

try:
    n = 100
    for a in range(n):
        for b in range(n):
            for c in range(n):
                if 42 * a + 17 * b + c == 5096:
                    raise ExitLoopException(a, b, c)
except ExitLoopException as ele:
    print(ele)  # (79, 99, 95)
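
For comparison, a different technique from the one above: if the search can live inside a function, a plain return leaves all the loops at once, and itertools.product collapses the three loops into one. This is only an alternative sketch, and the name find_solution is hypothetical.

```python
from itertools import product


def find_solution(n=100):
    # product yields (a, b, c) in the same order as the three nested loops.
    for a, b, c in product(range(n), repeat=3):
        if 42 * a + 17 * b + c == 5096:
            return a, b, c  # return exits every loop level at once
    return None


print(find_solution())  # (79, 99, 95)
```

The exception-based version remains useful when the loops cannot easily be moved into a function of their own.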

-- Profiling Python

Profiling means having the application run while keeping track of several different parameters, like the number of times a function is called, the amount of time spent inside it, and so on. Profiling can help us find the bottlenecks in our application, so that we can improve only what is really slowing us down.

profiling/triples.py

def calc_triples(mx):
    triples = []
    for a in range(1, mx + 1):
        for b in range(a, mx + 1):
            hypotenuse = calc_hypotenuse(a, b)
            if is_int(hypotenuse):
                triples.append((a, b, int(hypotenuse)))
    return triples

def calc_hypotenuse(a, b):
    return (a**2 + b**2) ** .5

def is_int(n):  # n is expected to be a float
    return n.is_integer()

triples = calc_triples(1000)

$ python -m cProfile profiling/triples.py
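
The same profiler can also be driven from code, which makes it possible to sort the report. The sketch below repeats the functions from triples.py so that it runs on its own; the helper profile_report and the smaller mx value are choices made for this example.

```python
import cProfile
import io
import pstats


def calc_triples(mx):
    triples = []
    for a in range(1, mx + 1):
        for b in range(a, mx + 1):
            hypotenuse = calc_hypotenuse(a, b)
            if is_int(hypotenuse):
                triples.append((a, b, int(hypotenuse)))
    return triples


def calc_hypotenuse(a, b):
    return (a**2 + b**2) ** .5


def is_int(n):  # n is expected to be a float
    return n.is_integer()


def profile_report(mx=200):
    profiler = cProfile.Profile()
    # runcall profiles a single call and hands back its return value.
    triples = profiler.runcall(calc_triples, mx)

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats('cumulative').print_stats()
    return triples, stream.getvalue()


triples, report = profile_report()
print(report)
```

The report lists, for each function, the number of calls (ncalls), the time spent in the function itself (tottime), and the time including everything it calls (cumtime).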

profiling/triples.py

def calc_hypotenuse(a, b):
    return (a*a + b*b) ** .5

profiling/triples.py

def is_int(n):
    return n == int(n)
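
Before keeping either change it is worth checking that the variants agree and then measuring them. The sketch below does both with timeit; the function names are invented to keep the variants apart, and which one is faster depends on the Python version and the machine, so no timings are claimed here.

```python
import timeit


def hypotenuse_pow(a, b):
    return (a**2 + b**2) ** .5


def hypotenuse_mul(a, b):
    return (a*a + b*b) ** .5


def is_int_method(n):
    return n.is_integer()


def is_int_compare(n):
    return n == int(n)


def same_results(mx=50):
    # Both pairs of variants must agree before speed is even considered.
    for a in range(1, mx + 1):
        for b in range(a, mx + 1):
            h = hypotenuse_pow(a, b)
            if h != hypotenuse_mul(a, b):
                return False
            if is_int_method(h) != is_int_compare(h):
                return False
    return True


def time_variants(number=100000):
    # Total seconds for `number` calls of each variant.
    return {
        'hypotenuse_pow': timeit.timeit(lambda: hypotenuse_pow(3, 4), number=number),
        'hypotenuse_mul': timeit.timeit(lambda: hypotenuse_mul(3, 4), number=number),
        'is_int_method': timeit.timeit(lambda: is_int_method(5.0), number=number),
        'is_int_compare': timeit.timeit(lambda: is_int_compare(5.0), number=number),
    }
```

Run same_results() first, then compare the entries returned by time_variants(), and rerun the profiler on the full script to confirm that the bottleneck actually moved.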

Thursday, March 24, 2016
