__init__.py

Python basics: functions

This is the post version of a talk I had the opportunity to give to my colleagues at CAPSiDE. Like any talk, it serves not only to transmit knowledge and passion for something, but also to rehearse and relearn some concepts myself.

I once was a physics student. As such, I had to learn some skills like integrating on the complex plane, or solving some differential equations. At least I had to pass my exams!

A few years later, I had to teach those mathematical methods to other physics students. It was then that I really had to understand all the cases. It was then that I finally did all the assignments before going to class. It was then that I learnt, and it was then that I was able to teach.

Here I just wanted to relearn the basics, so I have started with functions.

To motivate the talk, I needed some storytelling. The news is always good material. Judge for yourself.

The future goes through making web pages

Politicians (at least in the country where I live) know better than us what is good for us. Sometimes they know exactly what we need. We just need to listen to their wisdom.

The future goes through making web pages
—Méndez de Vigo

So, the goal here is to learn how to create web pages. As we know, web pages are made of a special markup language called HTML.

So, to set the stage, we want to be obedient and build what politicians think we need, a web page:

<html><body><p>ola ke ase</p></body></html>

Functions

One of the best things about python is that it has a REPL. As with other brilliant languages, you can play with it.

And playing is probably the most serious thing you can do, at least while learning (which is mostly what programmers do until they construct programs).

Let’s just fire up an interactive session and see what we can do together:

>>> def p(text):
...     return f'<p>{text}</p>'
...
>>> def body(contents):
...     return f'<body>{contents}</body>'
...
>>> def html(body):
...     return f'<html>{body}</html>'
...
>>> html(body(p('ola ke ase')))
'<html><body><p>ola ke ase</p></body></html>'
>>>
python interactive session

Functions are the simplest building blocks in python that allow for abstraction.

By the way, here we’re using python3 (3.6 or later) and the f-string syntax, which allows for nice string interpolation.
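
As a small aside, here is the same paragraph tag built in three ways, just to show what the f-string buys us. The variable names are only illustrative.

```python
# Three equivalent ways of building the same paragraph tag.
text = 'ola ke ase'

with_percent = '<p>%s</p>' % text        # old printf-style formatting
with_format = '<p>{}</p>'.format(text)   # str.format
with_fstring = f'<p>{text}</p>'          # f-string: the expression sits inside the braces

print(with_fstring)  # <p>ola ke ase</p>
```

All three produce the same string; the f-string simply keeps the value next to the place where it ends up.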

Arguments

The simple functions we’ve written have only one parameter, but we can define functions with more.

>>> def div(a,b):
...     return a / b
...
>>> div(84, 2)
42.0
>>> div(42, 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in div
ZeroDivisionError: division by zero
function parameters

This type of argument is called a positional argument. When the function is called, argument values are bound to parameter names by position: the first value in the call goes to the first parameter in the function signature, the second value to the second parameter, and so on.
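
A quick way to convince yourself that position is all that matters is to swap the values. This is a minimal sketch that reuses the div function from above.

```python
def div(a, b):
    return a / b

# Same two values, different positions: a and b are bound by position.
first = div(84, 2)    # a=84, b=2
second = div(2, 84)   # a=2,  b=84

print(first)   # 42.0
print(second)  # a small number (roughly 0.0238), not 42.0
```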

*args

One handy thing about python functions is that you can collect all the positional arguments of a function with the *args (pronounced starargs) notation:

>>> def f(*args):
...     for i, a in enumerate(args):
...             print(f'args[{i}] -> {a}')
...
>>> f('the', 'meaning', 'of', 'life:', 42)
args[0] -> the
args[1] -> meaning
args[2] -> of
args[3] -> life:
args[4] -> 42
starargs

With this we can redefine body to accept more than one element:

>>> def body(*contents):
...     return f'<body>{"".join(contents)}</body>'
...
>>> html(body(p('ola'),p('ke'),p('ase')))
'<html><body><p>ola</p><p>ke</p><p>ase</p></body></html>'

You can also use the *args notation to pass several arguments to a function from a list-like object:

>>> args = [84, 2]
>>> div(*args)
42.0

When you call a function with *args, the arguments are spliced into the call, like with apply in Lisp or JavaScript.
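
To make the splicing a bit more concrete, here is a small sketch (again with the div function from above) showing that the starred call is the same as writing the elements out by hand, and what happens when the number of elements does not match the signature.

```python
def div(a, b):
    return a / b

args = [84, 2]

# div(*args) is the same call as div(args[0], args[1])
spliced = div(*args)
by_hand = div(args[0], args[1])
print(spliced == by_hand)  # True

# Any iterable works, for instance a tuple
print(div(*(84, 2)))  # 42.0

# Too many elements means too many positional arguments
try:
    div(*[84, 2, 1])
    caught = False
except TypeError as error:
    caught = True
    print(f'TypeError: {error}')
```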

Keyword arguments

We can also specify the arguments by name:

>>> def p(text, _class=None):
...     if _class is not None:
...             return f'<p class="{_class}">{text}</p>'
...     else:
...             return f'<p>{text}</p>'
...
>>> p('hello', _class='blink')
'<p class="blink">hello</p>'

This type of argument is called a keyword argument. When the function is called, values are bound by name and not by the positions in which they appear.
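
The p example has a single keyword argument, so it cannot show that the order stops mattering. A minimal sketch with the two-parameter div function from before makes the point:

```python
def div(a, b):
    return a / b

# Bound by name: the order in the call no longer matters.
print(div(a=84, b=2))  # 42.0
print(div(b=2, a=84))  # 42.0

# Positional and keyword arguments can be mixed, positional ones first.
print(div(84, b=2))    # 42.0
```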

**kwargs

And python also collects them all into **kwargs:

>>> def f(**kwargs):
...     for k,v in kwargs.items():
...             print(f'kwarg[{k}] -> {v}')
...
>>> f(_class='blink', name='paudirac', disabled=False)
kwarg[_class] -> blink
kwarg[name] -> paudirac
kwarg[disabled] -> False

Again, you can also use the **kwargs syntax to call a function with keyword arguments from a dictionary.

>>> kwargs = {
...     '_class': 'blink',
... }
>>> p('ola ke ase', **kwargs)
'<p class="blink">ola ke ase</p>'

Note that the names *args and **kwargs are arbitrary. They are the ones the python community normally uses to refer to the starargs and star-starargs of a generic function. Later we'll see some uses of this idiom.
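
To see that the names really are arbitrary, here is a small sketch with made-up names (show, positional and named are just for illustration). What matters is the single and double star, not what follows them.

```python
def show(*positional, **named):
    # positional is a tuple and named is a dict, whatever we decide to call them
    return positional, named

print(show('ola', 'ke', 'ase', _class='blink'))
# (('ola', 'ke', 'ase'), {'_class': 'blink'})
```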

Modules

The interpreter is handy for exploration, but if you want something to persist across sessions, you’ll need to put it in a file.

Open an Emacs buffer with the following python code

print(f"I am loading the module {__name__}")

def greet(name):
    print(f'Hello, {name}')

if __name__ == '__main__':
    greet(input('what is your name? '))
first.py

and save it to a file called first.py.

As before, we can start a python interpreter and load the first module:

>>> import first
I am loading the module first
>>> first.__name__
'first'

but we can also run the module as a python script

$ python first.py
I am loading the module __main__
what is your name?

We see various things here:

  • the semantics of importing a module is the same as the semantics of executing a script
  • the module is an object and has properties, such as __name__
  • the module itself has access to the __name__ property and this property gets the name of the module when imported, but the name __main__ when executed as a script

This explains why the pattern or idiom

if __name__ == '__main__':
   do_something()

is so ubiquitous: it lets you define a reusable module, and—at the same time—allows you to have an entry point for an executable script.

Import

Before going back to semantics, we have to say that import is python’s silver bullet. With import you can access the whole standard library (which is huge) and any other python library on your import path (sys.path, which PYTHONPATH extends).

This is a theme for a post in itself, so let me show you only two important imports you can do on the REPL:

>>> import this

and

>>> import antigravity

As we’ve seen, the semantics of importing a module or executing it are similar.

Import semantics

Python was born as a scripting language. This is why the semantics of a module are the same as those of a script.

  • The system reads line after line and executes one line after the other.
  • The semantics of any line depend on the code in that line.

For instance: the semantics of a print(message) call is to send the message to stdout, but the semantics of def life(): return 42 is to create a function object from the body of the def and bind it to the name life.
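
The fact that def is just another statement that gets executed can be checked directly. In this sketch the name life does not exist until the def line has run, and afterwards the function object can be bound to a second name (meaning is a made-up name) like any other value.

```python
# Before the def statement has run, the name is not bound to anything.
try:
    life()
    defined_before = True
except NameError:
    defined_before = False

print(defined_before)  # False

def life():
    return 42

# Executing the def created a function object and bound it to the name life.
print(life())  # 42

# The function object is a value like any other: we can give it another name.
meaning = life
print(meaning is life)  # True
print(meaning())        # 42
```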

So, when you’re importing a module from another module or from the REPL, you’re actually executing the script line-by-line. To prevent some part of the script from executing on import, people use the idiom:

if __name__ == '__main__':
   do_something()

Another thing that happens when importing a module is that the module name becomes available at the importing site. We can see this with a little experiment (all experiments are done with the REPL) using the dir built-in:

>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__']
>>> import first
I am loading the module first
>>> dir()
['__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'first']
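
The same experiment can be repeated with a standard library module to compare the two flavours of import. This is only a sketch; I use math because it is always available.

```python
import math           # binds the name math to the module object
from math import pi   # binds only the name pi, nothing else from math

names = dir()
print('math' in names, 'pi' in names)  # True True
print('sqrt' in names)                 # False in a fresh session: we reach it as math.sqrt
print(math.sqrt(4))                    # 2.0
```

In both cases the module is executed once; what changes is which names end up bound at the importing site.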

Namespaces

It’s all about names.

Functions allow us to abstract functionality and hide it under a name. Our minds can then get rid of the details of how the function does its thing and trust the name itself. Then we create modules to put several functions (and other stuff) in them in order to orchestrate them once, but execute them several times (like in a script) or make them available for importing and reuse.

When importing we use the same trick as with functions: we trust the names.

Packages

—The Zen of Python

If the complexity grows (and it will), we need more tools to help us keep it under control. The modules are not enough: enter packages.

A package allows us to import things from dotted module names, like in

from paudirac.html import html, body, p

Here the purpose is twofold:

  • on one side, the dotting of namespaces allows us to structure code in a hierarchical way,
  • on the other side, the namespacing serves as a mechanism to prevent name collisions.

As the import semantics populates the current (importing) module with names from the imported module, the dot notation allows for having different modules with the same names. Actually, while developing this talk/tutorial I collided with the built-in `html` module, so I had to put my stuff in the paudirac package.
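
Here is one way such a collision can show up. It is a sketch of the general problem, not necessarily the exact clash I ran into: the standard library already ships a module called html, and our tag function wants the very same name.

```python
import html  # the standard library module

escaped = html.escape('<p>ola ke ase</p>')
print(escaped)  # &lt;p&gt;ola ke ase&lt;/p&gt;

# Defining our own html rebinds the name: the module is now shadowed here.
def html(body):
    return f'<html>{body}</html>'

print(hasattr(html, 'escape'))  # False, html is our function now
print(html('<body></body>'))    # <html><body></body></html>
```

With the function living in paudirac.html, both can coexist: the dotted name tells them apart.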

Folders

Packages are implemented as folders on the filesystem. A package is an importable folder that contains a (possibly empty) file named __init__.py in it.
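
If you want to check this without touching your project, the following sketch builds a throwaway package in a temporary folder and imports a module from it. The names demo_pkg and greetings are made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / 'demo_pkg'           # hypothetical package name
    package_dir.mkdir()
    (package_dir / '__init__.py').write_text('')   # the (empty) marker file
    (package_dir / 'greetings.py').write_text(
        "def greet(name):\n    return f'Hello, {name}'\n"
    )

    sys.path.insert(0, tmp)          # make the parent folder importable
    importlib.invalidate_caches()    # the files were created after startup
    try:
        greetings = importlib.import_module('demo_pkg.greetings')
        message = greetings.greet('world')
    finally:
        sys.path.remove(tmp)

print(greetings.__name__)  # demo_pkg.greetings
print(message)             # Hello, world
```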

Building the html library

We are now in the position to build our html library. First we create the package `paudirac`

$ mkdir paudirac
$ touch paudirac/__init__.py

and then we create the module html on that package

$ touch paudirac/html.py

This will allow us to do things like

from paudirac.html import html, body, p

h = html(
    body(
        p('ola ke ase', _class='blink')
    )
)
print(h)

to the great amusement of all our friends and family.

DRY

All html tag functions (tags for short) we’ve been creating are too similar, so we’re repeating ourselves. This is not bad per se, but if a bug arises in one of the tags then it will probably arise in all of them. This is bad per se.

And, because we’ve already seen 3 cases, maybe it’s time we generalize a bit. The first observation is that you don’t actually need to have a function for every tag. We can use a single function for all the tags and pass the tag name as a parameter:

>>> def tag(name, *contents):
...     if len(contents) > 0:
...             return f'<{name}>{"".join(contents)}</{name}>'
...     else:
...             return f'<{name} />'
...
>>> tag('p', 'ola ke ase')
'<p>ola ke ase</p>'
>>> tag('br')
'<br />'

But this is not as nice as the previous API. To recover the previous API, first we add one level of indirection and create a factory of tag functions

>>> def make_tag(name):
...     def _tag(*contents):
...             return tag(name, *contents)
...     _tag.__qualname__ = name # we don't want the function to be called _tag!
...     return _tag
...
>>> make_tag('p')('ola ke ase')
'<p>ola ke ase</p>'

We can also bind the function to a variable and use the function afterwards

>>> p = make_tag('p')
>>> div = make_tag('div')
>>> div(p('ola ke ase'))
'<div><p>ola ke ase</p></div>'

But here we’re repeating ourselves too, because we’re using the names div and p twice.
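
A side note on why the factory works at all: each function it returns is a closure that remembers the name it was created with. The sketch below uses a simplified make_tag (without the self-closing case) and the standard inspect module to peek at what each function captured.

```python
import inspect

def make_tag(name):
    def _tag(*contents):
        # name is not a parameter of _tag: it comes from the enclosing make_tag call
        return f'<{name}>{"".join(contents)}</{name}>'
    return _tag

p = make_tag('p')
div = make_tag('div')

# Each closure keeps its own binding of name.
print(inspect.getclosurevars(p).nonlocals)    # {'name': 'p'}
print(inspect.getclosurevars(div).nonlocals)  # {'name': 'div'}
print(div(p('ola ke ase')))                   # <div><p>ola ke ase</p></div>
```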

Everything is an object

Then we remember that everything in python is an object. A module too. We can actually access the module we are writing code in (the current module). Just ask python:

import sys
self = sys.modules[__name__]

Here in self, we have the current python module.
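
Since a module is an object, we can set attributes on it with setattr, and that is the same as defining names inside it. To try this safely, the sketch below uses a throwaway module object created with types.ModuleType instead of the current module (demo is a made-up name, and make_tag is simplified).

```python
import types

demo = types.ModuleType('demo')  # hypothetical throwaway module

def make_tag(name):
    def _tag(*contents):
        return f'<{name}>{"".join(contents)}</{name}>'
    return _tag

# Each tag name is written only once: setattr binds it inside the module.
for name in ['p', 'div']:
    setattr(demo, name, make_tag(name))

print(demo.div(demo.p('ola ke ase')))  # <div><p>ola ke ase</p></div>
print(sorted(n for n in vars(demo) if not n.startswith('__')))  # ['div', 'p']
```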

At this point, we’ve got all the ingredients to create our html library. In paudirac/html.py we write:

def attr_name(name):
    return name[1:] if name.startswith('_') else name

def make_attrs(attrs_dict):
    if attrs_dict:
        return ' ' + ' '.join(f'{attr_name(k)}="{v}"' for k,v in attrs_dict.items())
    else:
        return ''

def make_tag_fn(name):
    """Creates the tag <name>"""
    def _tag(*contents, **kwargs):
        attrs = make_attrs(kwargs)
        if len(contents) > 0:
            contents = ''.join(contents)
            return f'<{name}{attrs}>{contents}</{name}>'
        else:
            return f'<{name}{attrs} />'
    _tag.__qualname__ = name
    return _tag

TAGS = [
    'a',
    'body',
    'html',
    'p',
    'input',
    'div',
    'br',
]

import sys
self = sys.modules[__name__]
for name in TAGS:
    setattr(self, name, make_tag_fn(name))
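
The attribute helpers are the part that turns keyword arguments into HTML attributes. Here they are again on their own, with a couple of calls to show what they return. Note that the values are inserted as they are, without any escaping.

```python
def attr_name(name):
    # strip one leading underscore, so that _class becomes class
    return name[1:] if name.startswith('_') else name

def make_attrs(attrs_dict):
    if attrs_dict:
        # the leading space separates the attributes from the tag name
        return ' ' + ' '.join(f'{attr_name(k)}="{v}"' for k, v in attrs_dict.items())
    else:
        return ''

print(repr(make_attrs({'_class': 'blink', 'id': 42})))  # ' class="blink" id="42"'
print(repr(make_attrs({})))                             # ''
```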

That’s it. We can now create a simple web server from the reference WSGI implementation

from wsgiref.simple_server import make_server

from functools import partial
encode = partial(str.encode, encoding='utf-8')

from paudirac.html import *

def simple_app(environ, start_response):
    """A simple web application"""
    status = '200 OK'
    headers = [('Content-type', 'text/html; charset=utf-8')]
    start_response(status, headers)
    res = html(
        body(
            p('Hello, world', _class='my-class'),
            p(a('Python is awesome!', href='http://www.python.org'))
        )
    )
    return map(encode, [res])

PORT = 9000
with make_server('', PORT, simple_app) as httpd:
    print(f'Serving on {PORT}...')
    httpd.serve_forever()

and we’re done.
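
A WSGI application is just a function, so it can be called by hand without starting any server. The sketch below uses a stand-in simple_app with a hard-coded body (so that it does not depend on the paudirac package) and the setup_testing_defaults helper from wsgiref.util. The captured dict and the fake start_response are made up for the example.

```python
from wsgiref.util import setup_testing_defaults

def simple_app(environ, start_response):
    """A stand-in for the application above, with a hard-coded body"""
    start_response('200 OK', [('Content-type', 'text/html; charset=utf-8')])
    return [b'<html><body><p>Hello, world</p></body></html>']

captured = {}

def start_response(status, headers):
    # a fake start_response that only records what the app sends
    captured['status'] = status
    captured['headers'] = headers

environ = {}
setup_testing_defaults(environ)  # fills in a minimal WSGI environ

body = b''.join(simple_app(environ, start_response))
print(captured['status'])    # 200 OK
print(body.decode('utf-8'))  # <html><body><p>Hello, world</p></body></html>
```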

The future goes through making web pages

We’ve just accomplished our goal, but, during the journey, we’ve already seen several things:

  • python functions and arguments
  • modules
  • import semantics
  • the if __name__ == '__main__': idiom
  • python data structures: lists & dicts
  • python comprehensions

Next time we will see that the html this code generates is ugly. We would like to have well indented code. And this will be another opportunity to learn more basic python stuff.

But..., what does basic mean?