
The power of duck typing and linear algebra

    Track: Python devroom
    Room: H.1308 (Rolin)
    Day: Saturday
    Start: 11:30
    End: 12:00
(Me too, I love to have the slides during a presentation, so here is the URL.)
Every great story begins with a book! And too much free time.

Part1: The magic of algebra

It's all about behaviours
(and so is duck typing)
Legal disclosure: my linear algebra book is the French version of
Lipschutz's Linear Algebra (Schaum's Outline edition)

Linear algebra 101:

If you behave nicely and consistently, you get super cow powers, including:

Theorem 1: given simple enough rules, you can do TDD

The definition of linear algebra is basically: if you pass the tests, then you get superpowers. Nota bene:

Step 0: Writing your own test framework

Doesn't this look like a perfect fit for a unit test?
Consistent algebra test
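Such a test can be sketched as an ordinary function (a hypothetical sketch; the function and axiom names are mine, not the talk's actual framework):

```python
# Hypothetical sketch of an algebra-axiom checker; names are assumptions.
def check_vector_axioms(a, b, c, scalar, other_scalar, zero):
    """Map each axiom name to True/False for the given sample values."""
    return {
        "commutativity": a + b == b + a,
        "associativity": (a + b) + c == a + (b + c),
        "neutral_element": a + zero == a,
        "distributivity": scalar * (a + b) == scalar * a + scalar * b,
        "scalar_associativity": (scalar * other_scalar) * a == scalar * (other_scalar * a),
    }

# The complex numbers from the slide pass every axiom:
results = check_vector_axioms((1 + 0j), (1 + 1j), (1 - 4j), 2, 3, 0j)
```

Feeding the same function samples of another type immediately shows which rules that type breaks.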

Which makes you curious about which types are compliant by default.

(yes, I purposely did not go into edge cases)
testing for  'complex' class
 a = (1+0j)
 b = (1+1j)
 c = (1-4j)
 an_int = 2 
 other_scalar = 3 
 neutral element for addition is 0j 


test #1
 a + b = b + a 
'test_commutativity' is 'ok'

test #2
 a + neutral = a 
'test_neutral' is 'ok'

... other tests ....

test #12
 an_int * (a + b) = an_int * a + an_int * b 
'test_distributivity' is 'ok'


<type 'complex'> respects the linear algebra standard rules 


Just for fun, which default types can be considered scalars?

Type          Neutral of add.      Neutral of mul.      Result
NumPy array   zeros(dim)           ones(dim)            OK
float         0.0                  1.0                  OK
int           0                    1                    OK
Decimal       Decimal(0.000000)    Decimal(1.000000)    KO
complex       0+0j                 1+0j                 OK
list          []                   1                    KO
You can also test strings, quaternions....
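The KO in the `list` row, for instance, comes from distributivity: for lists `*` means repetition, not scalar multiplication, so the axiom fails. A quick check:

```python
# For lists, + is concatenation and * is repetition, so distributivity breaks:
lhs = 2 * ([1] + [2])    # [1, 2, 1, 2]
rhs = 2 * [1] + 2 * [2]  # [1, 1, 2, 2]
print(lhs == rhs)        # False
```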

Yes, math rocks: you are free to code whatever you want and you don't have to justify yourself, as long as you pass the tests

Note: this has nothing to do with the algebraic data types of Coconut/Rust as explained on Wikipedia. Or so I would say, if I understood them.

Let's make a simple observation

On a MutableMapping, anything that is not a leaf is part of a key.
So let's turn
{ 'a': { 'b': 1 }, 'b': 2 } into
{ ('a', 'b'): 1, 'b': 2 }
All keys are immutable, so a tuple of keys is also... immutable.

Every distinct immutable path to a value is a key, so we can reduce a recursive problem to a one-level problem.
Every MutableMapping can ultimately be put in a canonical one-level form.
Every distinct immutable path to a value is equivalent to a dimension.
We have an isomorphism between one-level MutableMappings and MutableMappings of arbitrary depth.

We transformed a recursive/deep problem into a one-level problem.
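The round trip between the two forms can be sketched with two small helpers (`flatten` and `unflatten` are my names; plain dicts stand in for MutableMappings):

```python
def flatten(tree, path=()):
    """Canonical one-level form: keys become tuples of keys (paths)."""
    flat = {}
    for k, v in tree.items():
        if isinstance(v, dict) and v:
            flat.update(flatten(v, path + (k,)))
        else:
            flat[path + (k,)] = v
    return flat

def unflatten(flat):
    """Inverse direction: rebuild the nested dict from the tuple keys."""
    tree = {}
    for path, v in flat.items():
        node = tree
        for k in path[:-1]:
            node = node.setdefault(k, {})
        node[path[-1]] = v
    return tree

nested = {'a': {'b': 1}, 'b': 2}
flat = flatten(nested)  # {('a', 'b'): 1, ('b',): 2}
```

The round trip `unflatten(flatten(tree)) == tree` is exactly the isomorphism: no information is lost in either direction.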

Math books then give you the cookbook

(did I say meth books?)

Part 2: making dict follow linear algebra behaviour

Arbitrary choices:

Abstract base classes* (types), duck typing and operator overloading are the only two or three tricks used.

The exact same trick can probably be implemented in C++, Ruby, Perl, PHP, JavaScript...
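Those tricks can be sketched in a toy `mdict` (my own minimal sketch, mirroring the spirit but not the code of the actual implementation): addition merges keys, and `*` is either scalar multiplication or an elementwise product.

```python
class mdict(dict):
    """Toy dict-as-vector via operator overloading (a sketch, not VectorDict)."""

    def __add__(self, other):
        # merge keys, summing values on shared keys (absent keys act as 0)
        result = mdict(self)
        for k, v in other.items():
            result[k] = result.get(k, 0) + v
        return result

    def __sub__(self, other):
        return self + (-1) * other

    def __mul__(self, other):
        if isinstance(other, dict):
            # elementwise product on shared keys (absent keys contribute 0)
            return mdict((k, self[k] * other[k]) for k in self.keys() & other.keys())
        # otherwise treat `other` as a scalar
        return mdict((k, v * other) for k, v in self.items())

    __rmul__ = __mul__

a = mdict(x=1, y=1)
b = mdict(y=2, z=3)
a + b   # mdict: x=1, y=3, z=3
2 * a   # mdict: x=2, y=2
a * b   # mdict: y=2
```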

Part 3: And? Does it pass the tests?

At least for the simple case...

testing for  'Daikyu' class

a = {'one_and_two': 3, 'one': 1}
b = {'two': 2, 'one_and_two': -1}
c = {'one': 3, 'three': 1, 'two': 2}
an_int = 2 
other_scalar = 3
neutral element for addition is {} 

... 16 tests later ...

 'Daikyu' respects the linear algebra standard rules 


We have a one-level dict behaving as a vector...

Part 4: Let's use the isomorphism to turn nested dicts into vectors
Let's implement cos and dot

from collections.abc import MutableMapping

MARKER = object()  # sentinel: distinguishes "no path given yet" from an empty path


def paired_row_iter(tree, path=MARKER):
    if path is MARKER:
        path = tuple()

    for k, v in tree.items():
        if isinstance(v, MutableMapping) and len(v):
            # non-empty mapping: recurse, extending the path with this key
            for child in paired_row_iter(v, path + (k,)):
                yield child
        else:
            # leaf (or empty mapping): yield its full path as a tuple key
            yield (path + (k,), v)

dict(paired_row_iter({'a': {'b': 1}, 'c': 2, 'd': {'e': []}, 'e': {}}))
# {('a', 'b'): 1, ('c',): 2, ('d', 'e'): [], ('e',): {}}

Let's read the specs of the dot product

Let's implement them

NB: purists would want the measure u·u to fit with the Lebesgue measure.
My dot product, however, yields a complex number if the scalars are complex.
dot = lambda u, v: sum(value for path, value in paired_row_iter(u * v))

dot(mdict(x=1,y=1, z=1), mdict(x=1, y=1, z=1))
# 3
dot(mdict(x=1,), mdict( y=1, z=1))
# 0
dot(mdict(x=1,), mdict(x=-1, y=1, z=1))
# -1 
For int and float, the results match expectations.
For complex, I am pretty clueless as to whether this is normal.

Please stop the roto-zoom slides

Yes, I noticed the people in the last row getting seasick.
Social experiment #154 over: roto-zoom presentations make people uncomfortable. Lol

Dot product is one step away from distance

Now we can define abs (which is a mathematical shortcut for the L2 distance/norm) and distance as
vabs = lambda v: dot(v, v) ** .5
distance = lambda u, v: vabs(u - v)
It would be more aesthetic to write the general p-norm: $$ \left\| x \right\| _p = \bigg( \sum_{i \in \mathbb N} \left| x_i \right| ^p \bigg) ^{1/p}$$
However, the L2 norm makes coding way easier, so I totally agree with it.
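For reference, the p-norm formula above translates directly to a one-level dict (the `pnorm` helper is my sketch, assuming real scalars):

```python
def pnorm(v, p=2):
    """p-norm of a one-level dict's values; p=2 gives the usual L2 norm."""
    return sum(abs(x) ** p for x in v.values()) ** (1.0 / p)

pnorm({'x': 3, 'y': 4})        # 5.0  (L2)
pnorm({'x': 3, 'y': 4}, p=1)   # 7.0  (L1, Manhattan)
```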

And not far from the... cosine

(math puns are a sine)
vcos = lambda u, v: dot(u, v) / vabs(u) / vabs(v)

vcos(mdict(x=1, y=1), mdict(x=1, y=0))
# 0.7071067811865475  (cos 45°)

vcos(mdict(x=1), mdict(x=2, y=4, z=4))
# 0.3333333333333333

vcos(mdict(x=1), mdict())
# ZeroDivisionError: float division by zero
So far so good.

We have something that is built for infinitely many nested dimensions, and that already works for 2D & 3D Euclidean space without even trying.
And we even get a meaningful exception: the cosine with a null vector does not mean a thing.

Please stop abusing the zoom and random directions

Gosh :) Sorry

But does it have the promised behaviour?

Let's check.
Can we mathematically expose an API that follows the principle of least surprise? Especially: is it useful?
Does it work correctly?

Let's watch the behaviour of more complex data

Time to switch to ipython notebook \o/

Use case?

An orthogonal rant on introducing OOP with the 2D Point class

(oh! a geometrical joke)
The complex type does EVERYTHING better than any 2D Point class I have seen!
Maths are powerful, use them: check here
The Python 3 documentation has considerably reduced the Point example in the collections docs
I cannot overstate how much I consider the 2D Point class as misleading as Hello World

Parallelism, you said?

(Another geometrical joke)

Our boring data job? Take a dict (like after parsing a log line or a CSV row) and aggregate the data into another dict.

log_line = mdict(ip="", user_agent="moz", country="BE", url="/toto")
data = mdict(by_country=mdict(GR=12, CA=10), user_agent=mdict(chrome=1, ie=1000))
data["user_agent"] += log_line * mdict(user_agent=1)
What if we don't have a dict, but just a MutableMapping that delegates __getitem__ and __setitem__ to a remote instance?
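A minimal sketch of such a delegating MutableMapping (the `RemoteDict` name and the plain-dict backend are my assumptions; a real backend could be redis, a REST API, ...):

```python
from collections.abc import MutableMapping

class RemoteDict(MutableMapping):
    """Delegates item access to a backend object; a plain dict stands in here."""

    def __init__(self, backend):
        self.backend = backend

    def __getitem__(self, key):
        return self.backend[key]       # would be a network call in real life

    def __setitem__(self, key, value):
        self.backend[key] = value

    def __delitem__(self, key):
        del self.backend[key]

    def __iter__(self):
        return iter(self.backend)

    def __len__(self):
        return len(self.backend)

store = {}
rd = RemoteDict(store)
rd['hits'] = 1
```

Because the algebra only relies on MutableMapping behaviour (duck typing again), such an object could be dropped into the aggregation job unchanged.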

Should I use your library?

VectorDict is used O_o (I still have to push an update to PyPI). I do NOT recommend it, except for seeing the features I initially had in mind.
Archery, which implements the mixins separately, is in my opinion quite mature. BUT I have trust issues, because I lack feedback.
I hate the naming and how pedantically I javafied the module layout.
I strongly advise treating this code as academic work and a proof of concept that is delivered AS IS

Other mad scientist project with MutableMappings?

And have you tried proposing it to python-ideas?

I tried.
I am not a great communicator.

Thanks to:
