GAMR1520: Markup languages and scripting

python logo

Lab 2.2: Functions and modules

Part of Week 2: Files, functions and classes

General setup

For all lab exercises you should create a folder for the lab somewhere sensible.

Assuming you have a GAMR1520-labs folder, you should create a GAMR1520-labs/week_2 folder for this week and a GAMR1520-labs/week_2/lab_2.2 folder inside that.

Create a collection of scripts. If necessary, use folders to group related examples.

GAMR1520-labs
└─ week_2
    └─ lab_2.2
        ├─ experiment1.py
        └─ experiment2.py

Try to name your files better than this, the filename should reflect their content. For example, string_methods_.py, conditionals.py or while_loops.py.

Make sure your filenames give clues about what is inside each file. A future version of you will be looking back at these, trying to remember where a particular example was.

General approach

As you encounter new concepts, try to create examples for yourself that prove you understand what is going on. Try to break stuff, its a good way to learn. But always save a working version.

Modifying the example code is a good start, but try to write your own programmes from scratch, based on the example code. They might start very simple, but over the weeks you can develop them into more complex programmes.

Think of a programme you would like to write (don't be too ambitious). Break the problem down into small pieces and spend some time each session trying to solve a small problem.

So far we have been using a procedural approach to programming. Our programmes have been all in one file with no particular structure.

In this set of exercises we will expand to a functional approach. We will develop some structure in our code using functions and build our own python module.

In this module we are not functional programming zealots, we take a pragmatic approach which mixes procedural, functional and object-oriented styles. Python allows us to do this very naturally.

First we will look at a few simple examples of functions. We will look at how to add arguments that define the interface our functions expose to the rest of our code.

Then we will start building something moderately complex. Starting with a simple function to wrap content in properly formatted HTML tags, we will expand into a simple but functional set of tools for converting a limited markdown format into HTML.

The idea here is for us to get some practice with functions and to see how they can be used to extract reusable pieces of code and break problems down into smaller parts.

In a complex programme, the functions we choose to write should reflect our understanding of the problem. However, our initial intuition may be limited, sometimes we discover the nature of a problem by working on it. Our code, including the way we use functions, is not fixed in stone. It should be refactored as our understanding of a problem increases.

By using functions, we can keep our code simple and readable. Breaking the problem down into steps and writing a function for each step helps us to avoid the large problems by solving lots of small problems instead.

Extracting a working function takes a bite out of a larger problem. But it may still need to be chewed.

Table of contents

Functions

We can think of functions as saved recipes. We often need to do the same things multiple times with different data. We create functions as reusable bits of code that can do a useful operation for us. Usually a function should perform an operation that we need to use more than once using different data.

We define a function using the def keyword followed by a suitable function name. Functions follow the familiar form of all compound statements. The function body can contain any number of valid python statements.

def my_function(arg1, arg2, kwarg3=None):
    return "ignoring all arguments"

We specify what values the function can receive as arguments within the parentheses. The above function takes two positional arguments (arg1 and arg2) and one keyword argument (kwarg3 with a default value of None). It then ignores these arguments and simply returns a string.

Whenever we call the function, it will return the same string every time, no matter what arguments are provided. This is not a very useful function.

a = my_function(1, 2, kwarg3="hello")
print(a)
ignoring all arguments

Read more about functions before working through the following exercises.

Time to experiment

Spend some time understanding the following function.

def formatted_list(items, title="list", ch='*'):
    width = max([len(i) for i in items + [title]]) + 4
    hline = ch * width
    result = [hline, title, hline] + items + [hline]
    result = [f"{ch}{i.center(width)}{ch}" for i in result]
    return "\n".join(result)

It may be short, but there is a lot going on. It was developed using a process of continuous refactoring. Read about how it was put together.

The function can be used like this:

f = ['apples', 'bananas', 'cherries']
print(formatted_list(f, title='fruit'))

It produces an output like this:

**************
*   fruit    *
**************
*   apples   *
*  bananas   *
*  cherries  *
**************

Notice there is still one literal value inside the function body. The additional width added to each row is set to 4.

Modify the function to include this as an optional argument pad with a default value of 4. Try to generate something like this with a function call.

++++++++++++++++++++++++++++++
+           fruit            +
++++++++++++++++++++++++++++++
+           apples           +
+          bananas           +
+          cherries          +
++++++++++++++++++++++++++++++

The code to create it might look something like this.

f = ['apples', 'bananas', 'cherries']
l = formatted_list(f, title='fruit', ch="+", pad=20)
print(l)

The required change is actually very small.

A function signature is the name and the order/type of arguments. Consider how it could be confusing to write a function with a meaningless name or ambiguous arguments.

value = inv(1, 2)

If you saw this in a long piece of code, what does this even do? Compare to this:

item = get_inventory_item(user_id=1, inventory_position=2)

It may not be a good function, but at least we can guess roughly what its doing.

A function to generate HTML fragments

We are going to spend some time developing a function. Imagine we are writing a programme to convert markdown files into HTML. We need a function for part of the rendering code, to generate HTML fragments as strings.

For example, a paragraph of text in markdown is just the text. Imagine we had the following markdown code:

This is markdown!

We want to convert this into an HTML <p> element like this:

<p>This is markdown!</p>

The following example function implements a simple f-string which adds HTML tags around some provided arguments.

If you haven’t yet, read about f-strings

def html_element(content, tag):
    return f"<{tag}>{content}</{tag}>"

The function containing a return statement. It will return a formatted string containing the content argument wrapped in opening and closing HTML tags with the given tag name. This is a useful function that we may want to call many times when generating an HTML file.

Arguments

When we call our function, we pass arguments within parentheses. For example, if we want a level 1 heading element <h1>hello world</h1> then we can use our function like this:

heading = html_element('hello world', 'h1') 

Within the function, the arguments are passed into the f-string and returned. The function call evaluates to the returned value which is then assigned to the heading variable.

<h1>hello world</h1>

We can use the function again, with different arguments.

subheading = html_element('reusable functions', 'p')

In this case we are passing the string 'reusable functions' as the content argument and 'p' as the tag argument. The function returns the formatted string which is assigned to the subheading variable.

<p>reusable functions</p>

This simple function could prove very useful. It will avoid the need to write lots of similar code and can be tested thoroughly to reduce the potential for mistakes.

Defining a library module

Our function works so well that we find we have multiple scripts that need to use it. Rather than copying and pasting our function from one script to another (creating a maintenance nightmare), we can convert it into a module and import it into our scripts.

Move the function into a new file called my_library.py in the same folder as your working file.

That’s it, we’ve created our first python module.

A module is just a file with a .py extension. We can choose any filename we like, (though we should follow the PEP8 style guide) and we need to be careful to avoid clashes with existing modules from the standard library.

Importing a module

Now, back in our working file we can import the my_library module and access the html_element function using dot notation.

import my_library

print(my_library.html_element("emphasis", "em"))
print(my_library.html_element("bold", "strong"))
print(my_library.html_element("heading", "h2"))

This generates the HTML output we want.

<em>emphasis</em>
<strong>bold</strong>
<h2>heading</h2>

Notice that modules are objects.

import my_library

print(type(my_library))
<class 'module'>

We can pass them to type() just like any other value.

Importing a function from our module

This additional module reference and dot notation makes our code a bit awkward. In a simple case like this, we can use the from keyword to directly import our html_element function rather than importing the entire my_library module.

from my_library import html_element

print(html_element("emphasis", "em"))
print(html_element("bold", "strong"))
print(html_element("heading", "h2"))

We can now easily share the functionality between modules whilst only needing to maintain one copy of the code.

Sometimes it can be useful to keep the module name in the code so its clear where everything comes from. This is especially true if we are importing stuff from many different modules.

Upgrading our function

When creating a new programme it is rare for code to remain stable for long. It is a natural aspect of software development that, as we encounter and solve problems, our code will become more complex (and hopefully more useful) over time.

If you are presented with a large problem to solve, break it down into smaller pieces and solve the easiest part first. A productive way to work is to discover and implement the problem in stages, beginning with a simple proof-of-concept and adding elements of functionality incrementally. For more examples of this, see refactoring.

Our html_element function works well for the simple cases we have looked at so far. However, it doesn’t allow us to create HTML elements with attributes.

HTML attributes are specified as space-separated attribute="value" pairs within the opening tag of an element, after the tag name.

Perhaps we want to generate an HTML fragment like this:

<a href="#html-attributes">HTML attributes</a> 

This is an anchor element (<a>) which denotes a hyperlink in HTML. The href attribute is pointing to an element with the specified id attribute. The other element might be a section heading that looks something like this:

<h2 id="html-attributes">HTML attributes</h2>

In both cases, the elements need their attributes. The link won’t work without them.

We need to upgrade our html_element function to accept attribute names and values. Let’s say we wanted to add an href argument. We could do something like this:

def html_element(content, tag, href=None):
    attrs = ""
    if href:
        attrs = f' href="{href}"'
    return f"<{tag}{attrs}>{content}</{tag}>"

This allows for creating a link like this.

html_element("link", "a", href="#something")
<a href="#something">link</a>

However, if we want our function to be general purpose then we would need to add keyword argument options for every possible HTML attribute. The above solution works as a proof-of-concept for the formatting, but it doesn’t scale up or generalise in the way we want.

Attributes are essentially {key: value} pairs and so they map nicely onto keyword arguments. We can use the ability to take arbitrary keyword arguments to provide our function with a much more general purpose interface.

def html_element(content, tag, **attributes):
    print(attributes)
    return f"<{tag}>{content}</{tag}>"

The double asterisk on the **attributes argument merges the keyword arguments into a single dictionary. So the calling code can add as many arbitrary arguments as it needs and these will be merged into a dictionary of requested attributes.

The current implementation simply prints the dictionary as a proof-of-concept.

Now we can add keyword arguments like this:

html_element("link", "a", href="something")

This generates output without attributes, but it prints the dictionary of keyword arguments as a side-effect.

{'href': 'something'}

We can add an id attribute and see that both attributes are being collected into the dictionary.

html_element("link", "a", href="something", id="hi")
{'href': 'something', 'id': 'hi'}

We need to convert this into the appropriate format and insert into our result. We can start by using a list comprehension to convert the attributes dictionary into a list of HTML-formatted attributes.

list comprehensions are amazing, use them!

def html_element(content, tag, **attributes):
    attrs = [f'{k}="{v}"' for k, v in attributes.items()]
    print(attrs)
    return f"<{tag}>{content}</{tag}>"

Notice we are looping over the dict.items() iterable.

calling dict.items() on any dictionary returns an iterable that yields the {key: value} pairs as tuples in the form (key, value). Our code unpacks these into the variables k and v within the list comprehension.

Again, the output is unchanged but we can see the dictionary has been transformed into a list of strings.

html_element("link", "a", href="something", id="hi")
['href="something"', 'id="hi"']

This is close to what we need in the final output (the HTML attributes inside the opening tag). The next step is to convert the list into a space-separated string (using str.join()). We can now add this new string into the formatted output.

str.join() often comes in handy like this.

def html_element(content, tag, **attributes):
    attrs = [f'{k}="{v}"' for k, v in attributes.items()]
    attrs = " ".join(attrs)
    return f"<{tag}{attrs}>{content}</{tag}>"

Oops, we can see that there is a missing space between the attributes and the opening tag name.

html_element("link", "a", href="something", id="hi")
<ahref="something" id="hi">HTML attributes</a>

This space should not be added if there are no attributes, so we do this within a simple if statement that adds two lines to our function.

def html_element(content, tag, **attributes):
    attrs = [f'{k}="{v}"' for k, v in attributes.items()]
    attrs = " ".join(attrs)
    if attrs:
        attrs = f" {attrs}"
    return f"<{tag}{attrs}>{content}</{tag}>"

This function is combining a number of common techniques. Make sure you understand each step.

The code now works well, we can generate what we want.

We can make the anchor tag (hyperlink).

a = html_element(
    "HTML attributes", "a", 
    href="#html-attributes"
)
print(a)
<a href="#html-attributes">HTML attributes</a>

We can also make the section heading.

h2 = html_element(
    "HTML attributes", "h2", 
    id="html-attributes"
)
print(h2)
<h2 id="html-attributes">HTML attributes</h2>

It produces exactly the output we wanted.

A bug

Notice our function has a problem, we cannot add a class="my-class" attribute because class is a python keyword.

p = html_element("test", "p", class="my-class")
  File "path/to/my_library.py", line 11
    p = html_element(item, tag, class="my-class")
                                ^^^^^
SyntaxError: invalid syntax

This is unavoidable and no solution to this problem is ideal.

One idea is to allow the calling code to pass _class="my-class" and convert it inside the function.

def html_element(content, tag, _class=None, **attributes):
    if _class:
        attributes['class'] = _class
    attrs = [f'{k}="{v}"' for k, v in attributes.items()]
    attrs = " ".join(attrs)
    if attrs:
        attrs = f" {attrs}"
    return f"<{tag}{attrs}>{content}</{tag}>"

html_element("hello", "p", _class="my-class")

Or, more generally, we can strip all underscores from the beginning of attribute names.

def html_element(content, tag, **attributes):
    attrs = [f'{k.lstrip("_")}="{v}"' for k, v in attributes.items()]
    attrs = " ".join(attrs)
    if attrs:
        attrs = f" {attrs}"
    return f"<{tag}{attrs}>{content}</{tag}>"

html_element("hello", "p", _class="my-class", _def="not a real HTML attribute")

In this way, keyword clashes can be handled in a general way.

We could also remove the double asterisks from the attributes argument and change the function to take an attributes dictionary instead. The point is to provide your users (which may be just you) with an interface that is convenient and intuitive. If you’re not sure, try both. Select whichever feels most natural.

Create a markdown parser

Imagine a text format (a simplified markdown) where input files look like this:

# The Python Tutorial

Python is an easy to learn, powerful programming language.
It has efficient high-level data structures and a simple but effective approach to object-oriented programming.
Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.

The Python interpreter and the extensive standard library are freely available in source or binary form for all major platforms from the Python web site, and may be freely distributed.
The same site also contains distributions of and pointers to many free third party Python modules, programs and tools, and additional documentation.

We have three basic rules.

  1. Elements must be separated by empty lines (i.e. '\n\n').
  2. The default element is a <p>.
  3. Headings are signified by adding a single hash and a space before the content.

Can we design and build a system that can convert this into HTML formatted output?

Does this sound like a complicated challenge? How much code would you expect to write for this?

The output we want looks something like this:

<h1>The Python Tutorial</h1>

<p>Python is an easy to learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.</p>

<p>The Python interpreter and the extensive standard library are freely available in source or binary form for all major platforms from the Python web site, and may be freely distributed. The same site also contains distributions of and pointers to many free third party Python modules, programs and tools, and additional documentation.</p>

We want to replicate our my_library module and upgrade it into a very simple proof-of-concept for a basic markdown-to-HTML converter.

Prepare a new module

To keep it separate from previous experiments, create a new folder markdown (under this week) for this project and place the following three new files into it.

We need a new module called markdown.py. This will contain all our library code which does the hard work. Copy the existing html_element function into it.

Code should be tested and proven before it is moved into the markdown module.

Save a copy of test.md into your working folder. This will be our test data.

Then create a new script main.py. This will be where we import our library functions and try things. It will be the entry point for our code (i.e. the file we execute) and will change a lot as we develop experiments.

We will break the problem down into its component parts.

  1. Load the raw data from a file
  2. Parse the source data into chunks
  3. Generate the HTML output for each chunk
  4. Write the result into a file

Loading and parsing the source data

The first thing we need to do is load the data from our test file.

The main.py script can begin like this, just loading the test data and printing it.

from pathlib import Path
import markdown

inpath = Path(__file__).parent / 'test.md'
with inpath.open('r') as f:
    print(f.read())

You should see the full original file as output.

But we want to break the problem into steps, so let’s create a function for this bit.

from pathlib import Path
import markdown

def parse_md_file(path):
    with path.open('r') as f:
        return f.read()

inpath = Path(__file__).parent / 'test.md'
data = parse_md_file(inpath)
print(data)

This additional structure allows us to work on the parse_md_file() function in isolation. When it’s working, we can move it into the markdown module and import it.

Parse the sample data

Complete the parse_md_file function so it returns a list of strings.

It should split the raw data into a list based on double newline characters ('\n\n').

The output should be a list containing the three chunks of content to be wrapped in HTML tags.

['# The Python Tutorial', 'Python is an easy to learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.', 'The Python interpreter and the extensive standard library are freely available in source or binary form for all major platforms from the Python web site, and may be freely distributed. The same site also contains distributions of and pointers to many free third party Python modules, programs and tools, and additional documentation.\n']

Once you have a working function, move it into markdown.py and update your main.py to use the library function markdown.parse_md_file() as follows.

from pathlib import Path
import markdown

inpath = Path(__file__).parent / 'test.md'
data = markdown.parse_md_file(inpath)
print(data)

Converting individual chunks

Hopefully its clear where we are going with this. Having these three chunks of text as strings gets us pretty close to being able to pass them individually through the html_element() function as content.

It feels like a good idea to use a list comprehension for this with a new function. We are converting a three-element array of raw data into a three-element array of formatted data.

So, something like this is needed:

data = [format_md_chunk(chunk) for chunk in data]

Where the format_md_chunk() function should detect what tag name to use and call html_element() accordingly.

Here’s a naive implementation where format_md_chunk doesn’t look for <h1> elements.

from pathlib import Path
import markdown

def format_md_chunk(chunk):
    return markdown.html_element(chunk, 'p')

inpath = Path(__file__).parent / 'test.md'
data = markdown.parse_md_file(inpath)
data = [format_md_chunk(chunk) for chunk in data]
print(data)

The output generated is an array of strings where each string is a formatted <p> element. This is very close. But before we merge it together and write out to the file, we need to differentiate between <p> elements and <h1> elements.

Detect headings

Complete the format_md_chunk() function.

  1. It should define a tag variable and set it to the value 'p'.
  2. It should look for '# ' at the beginning of the chunk using str.startswith()
  3. Within a conditional statement it should set the tag variable to 'h1' and strip off the first two characters of the content (maybe using slicing) only if it detects a heading.
  4. It should pass the tag variable into html_element() along with the chunk content and return the result.

Once you have a working function, move it into markdown.py.

Make sure you update the function to use html_element() rather than markdown.html_element().

Update your main.py to use the library function markdown.format_md_chunk() as follows.

from pathlib import Path
import markdown

inpath = Path(__file__).parent / 'test.md'
data = markdown.parse_md_file(inpath)
data = [markdown.format_md_chunk(chunk) for chunk in data]
print(data)

Converting all the parsed data

We can now move the list comprehension into the library too. Create a new function in markdown.py like this.

def format_md_chunks(data):
    return [format_md_chunk(chunk) for chunk in data]

…and update main.py like this.

from pathlib import Path
import markdown

inpath = Path(__file__).parent / 'test.md'
data = markdown.parse_md_file(inpath)
data = markdown.format_md_chunks(data)
print(data)

Merging into the final result

The last step in the process is to merge the chunks back together by replacing the '\n\n' which we removed in the initial parsing step.

This is a simple change.

from pathlib import Path
import markdown

inpath = Path(__file__).parent / 'test.md'
data = markdown.parse_md_file(inpath)
data = markdown.format_md_chunks(data)
data = '\n\n'.join(data)
print(data)

We can now add a final convert() function into markdown.py to execute the three steps in one function.

def convert(path):
    data = parse_md_file(path)
    data = format_md_chunks(data)
    return '\n\n'.join(data)

Notice the minor changes renaming the inpath variable and removing the module prefixes.

and finally, update our main.py to use this new library function and write the data out into a file of our choice.

from pathlib import Path
import markdown

inpath = Path(__file__).parent / 'test.md'
outpath = Path(__file__).parent / 'test.html'
outpath.open('w') as f:
    f.write(markdown.convert(inpath))

The final version of the markdown.py module looks like this.

def html_element(content, tag, **attrs):
    attrs = [f'{k.lstrip("_")}="{v}"' for k, v in attrs.items()]
    attrs = " ".join(attrs)
    if attrs:
        attrs = f" {attrs}"
    return f"<{tag}{attrs}>{content}</{tag}>"

def parse_md_file(path):
    with path.open('r') as f:
        return f.read().split('\n\n')

def format_md_chunk(chunk):
    tag = "p"
    if chunk.startswith('# '):
        chunk = chunk[2:]
        tag = "h1"
    return html_element(chunk, tag)

def format_md_chunks(data):
    return [format_md_chunk(chunk) for chunk in data]

def convert(path):
    data = parse_md_file(path)
    data = format_md_chunks(data)
    return '\n\n'.join(data)

And one final (final!) update you may want to make to main.py is to insert the formatted output into a valid HTML template like this.

from pathlib import Path
import markdown

inpath = Path(__file__).parent / 'test.md'
outpath = Path(__file__).parent / 'test.html'

with outpath.open('w') as f:
    html = f"""
    <!doctype html>
    <html lang="en">
    <head>
        <meta charset="utf-8">
        <title>Markdown converted output</title>
    </head>
    <body>
{markdown.convert(inpath)}
    </body>
    </html>
    """
    f.write(html)

A note on concision

Python can be very concise. This is the equivalent convert function all in one.

We don’t add any attributes, so we have removed this feature.

def convert(path):
    with path.open('r') as f:
        data = f.read().split('\n\n')
    result = []
    for chunk in data:
        tag = "p"
        if chunk.startswith('# '):
            chunk = chunk[2:]
            tag = "h1"
        result.append(f"<{tag}>{chunk}</{tag}>")
    return '\n\n'.join(result)

So we can do the basic task in 11 lines of code.

A note on readability

After a bit of refactoring and renaming, I ended up with this.

"""
markdown
a module for parsing markdown files and converting them to HTML
"""

def html_element(content, tag, **attr):
    """
    html_element creates a valid HTML element
    with given content, tag and attributes
    """
    attrs = [f' {k.lstrip("_")}="{v}"' for k, v in attr.items()]
    attrs = "".join(attrs)
    return f"<{tag}{attrs}>{content}</{tag}>"

def md_to_element(chunk):
    """
    md_to_element converts a single chunk of md data
    into an HTML element
    """
    tag = "p"
    if chunk.startswith('# '):
        chunk = chunk[2:]
        tag = "h1"
    return html_element(chunk, tag, _class="from-markdown")

def doc_to_html(data):
    """
    doc_to_html breaks a string of md data into chunks, 
    passes them to md_to_element 
    and puts the result back together
    """
    sep = '\n\n'
    html = [md_to_element(md) for md in data.split(sep)]
    return sep.join(html)

Notice I’ve added docstrings plus a few upgrades and refactors

Then conversion looks like this:

from pathlib import Path
import markdown2 as markdown

inpath = Path(__file__).parent / 'test.md'
outpath = inpath.with_suffix('.html')

template = """<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Markdown converted output</title>
</head>
<body>
**BODY GOES HERE**
</body>
</html>"""

with inpath.open('r') as infile, outpath.open('w') as outfile:
    md = infile.read()
    html = markdown.doc_to_html(md)
    outfile.write(template.replace('**BODY GOES HERE**', html))

When developing a complex piece of functionality, the aim should be to develop code which is readable. It should be clear what the code is doing.

“A programming language is more than how you tell a computer what to do. It is how programmers express and communicate ideas. The audience for your code is other programmers, not computers.” - Guido van Rossum

We also need to think about the interface we present to other programmers (which includes ourselves). The above functions can be used independently as a simple toolkit for working with markdown formatted data. The file handling side of things is not really a markdown issue so I decided to leave it out for now so it is up to the user to handle this as in the example main.py file.

A note on expanding this implementation

I’ve been considering the following approach to continue this simplified parser. It is a drop-in replacement for the md_to_element function which handles the actual conversion from markdown to HTML. We use a dictionary to map markdown syntax to the HTML elements. The function can look to see if the chunk follows any of the known patterns.

valid_elements = {
    '#': 'h1',
    '##': 'h2',
    '###': 'h3',
    '####': 'h4',
    '#####': 'h5',
    '######': 'h6',
    '>': 'blockquote',
}

def md_to_element(chunk):
    """
    md_to_element converts a single chunk of md data
    into an HTML element
    """
    md, md_chunk = chunk.split(maxsplit=1)
    if md in valid_elements:
        return html_element(md_chunk, valid_elements[md])
    else:
        return html_element(chunk, 'p')

This works fairly well for these simple cases, but the actual markdown syntax is not always so simple.

In a real implementation, the best way would probably be to use regular expressions. This is a whole other subject and out of the scope of this module.