GAMR1520: Markup languages and scripting

Welcome to GAMR1520!

Dr Graeme Stuart


Python and Javascript

In this module we will learn the basics of two dynamically typed languages.

[Images: Python logo and Javascript logo]

Assessment

The module will be assessed in three parts.

  1. A Python phase test (20%)

    • In the third week of the module (14th December).
    • Covering the basics we will learn in the next two weeks
    • Automatically marked within Blackboard
  2. A Python coding assignment (40%)

    • Deadline January 19th
    • You will be provided with automated unit tests
    • 80% of the marks will be awarded for passing the tests
    • 20% of the marks for style/efficiency
  3. A Javascript coding assignment (40%)

    • Deadline February 2nd.
    • A more creative coding challenge
    • Create an animated HTML canvas
    • Marked on functionality and code quality/readability

2022 results

[Figures: 2022 results for the phase test, the Python assignment and the Javascript assignment]

Python origins

Python was initially designed by Guido van Rossum with an emphasis on code readability. Its syntax allows programmers to express concepts in fewer lines of code.

[Images: a young Guido; Guido in 2014]

Readability counts

“A programming language is more than how you tell a computer what to do. It is how programmers express and communicate ideas. The audience for your code is other programmers, not computers.”

Guido van Rossum


The Zen of Python (1999)

import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Development

The first official release was in 1991. Ten years later, in 2001, the Python Software Foundation was launched.

Python has evolved over decades and has been through some major changes in that time, in particular the traumatic transition to Python 3.

Python 0.9 Feb 1991
Python 1.0 Jan 1994
Python 2.0 Oct 2000
Python 2.7 July 2010
Python 3.0 Dec 2008
Python 3.7 Jun 2018
Python 3.8 Oct 2019
Python 3.9 Oct 2020
Python 3.10 Oct 2021
Python 3.11 Oct 2022
Python 3.12 Oct 2023
Python 3.13 Oct 2024

Open source tools


Python as a dynamically typed language

TL;DR: In dynamically typed languages, type information is associated with the data, not the variable.

Python has dynamically typed variables. In practice, this means that a variable can be reassigned from a value of one type to a value of another.

a = 1
a = 'two'

The remainder of this lecture provides a sketch of what is happening in this simple programme.

There is a lot going on under the hood with python, even with this simple programme. By relinquishing some control to python, your code can be simple and expressive.
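As a minimal sketch of the idea, using only built-ins, the name a stays the same while the type reported by type() follows whichever value it currently refers to. The names first_type and second_type are only there for illustration.

```python
# The same name can refer to values of different types over time.
a = 1
first_type = type(a)    # the type belongs to the value 1

a = 'two'
second_type = type(a)   # the name now refers to a string

print(first_type.__name__)   # int
print(second_type.__name__)  # str
```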


A simple model of computation

A basic model of a computer has memory which holds data and a CPU which manages instructions and performs calculations.

Von Neumann architecture (CC BY-SA 3.0)

Modern computers are said to be computationally universal because, given enough time and memory, they can be programmed to carry out any computation that can be expressed as an algorithm.


Instruction set

Every processor has an instruction set which defines the primitive operations that it can perform. These are the building blocks from which all computer programmes are constructed.

Higher level languages have more abstraction

Such instructions are provided to the processor in machine code, the lowest-level of computer ‘language’, in which each instruction is represented by a unique combination of bits.


Compilation

Some computer languages (such as C and Rust) are compiled into a format suitable for the CPU to consume directly.

flowchart LR;
  subgraph compiled ["Compiled languages (e.g. C, Rust)"]
    source["Source code"]-->compiler[Compiler]
    compiler[Compiler]-->assembler["Assembler"]
  end
  assembler-->machinecode["Machine code"]
  input[Input]-->machinecode
  machinecode-->output[Output]

This compilation step can produce very efficient code through careful analysis and optimisation. However, it needs to be performed specifically for a given hardware architecture and often requires the programmer to handle the details of memory management.

Compilers are often bootstrapped in another language and then self-compiled. Assembly languages are said to be assembled rather than compiled. Each assembly language is typically restricted to one processor architecture.


Interpretation

Other languages (such as python and javascript) are compiled to an intermediate form that is interpreted by a special programme that actually interacts with the processor.

flowchart LR;
  subgraph interpreted ["Interpreted languages (e.g. Python, Javascript)"]
    source["Source code"]-->compiler["Compiler"]
    compiler-->bytecode["Byte code"]
    bytecode-->virtualmachine["Virtual machine"];
  end
  input[Input]-->virtualmachine
  virtualmachine-->output[Output]

Interpreted languages define their own byte code specifications, which can be used on any machine that has a suitable virtual machine. This makes them portable and very convenient.

The canonical Python compiler/interpreter implementation is CPython (written in C), though there are many other implementations (e.g. IronPython, Jython, PyPy).
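The compile-then-interpret pipeline can be observed from inside Python itself. The sketch below assumes CPython and uses the standard library dis module to list the byte code instructions generated for a two-line programme. The exact instructions differ between Python versions, so no particular output is shown.

```python
import dis

source = "a = 1\na = 'two'\n"

# compile() turns source text into a code object holding byte code.
code = compile(source, "<example>", "exec")

# The raw byte code is just a sequence of bytes.
raw = code.co_code

# dis decodes those bytes into named instructions for a human reader.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)
```

Each assignment ends in a STORE_NAME instruction, which binds the name a to whatever object is currently on top of the virtual machine's stack.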


Memory and data types

Binary data in memory can represent integers.

flowchart LR;
  subgraph memory
    0b00000001
    0b00000010
    0b00000011
  end

Complex data can be represented in binary format through standards such as ASCII and Unicode for characters and IEEE 754 for floating point numbers (i.e. non-integers).

a = 97   b = 98   c = 99   d = 100  e = 101  f = 102  g = 103
h = 104  i = 105  j = 106  k = 107  l = 108  m = 109  n = 110
o = 111  p = 112  q = 113  r = 114  s = 115  t = 116  u = 117
v = 118  w = 119  x = 120  y = 121  z = 122

This is what we mean by the type of data. All data in computers is ultimately stored as zeroes and ones, but what those zeroes and ones mean depends on how they are interpreted.
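The mapping between characters, numbers and raw bytes can be explored with built-ins and the standard library struct module. This is a small sketch rather than a full treatment of encodings. The variable names are illustrative.

```python
import struct

# Characters map to integers (code points) and back again.
code_point = ord('a')      # 97
letter = chr(98)           # 'b'

# Text becomes bytes once an encoding is chosen.
encoded = 'abc'.encode('ascii')   # b'abc'
as_numbers = list(encoded)        # [97, 98, 99]

# The same four bytes mean different things under different types.
raw = struct.pack('<f', 1.0)            # IEEE 754 single precision, little-endian
as_int = struct.unpack('<I', raw)[0]    # the same bytes read as an unsigned integer

print(code_point, letter, as_numbers)
print(raw.hex(), as_int)                # 0000803f 1065353216
```

The last line shows that the bit pattern for the float 1.0 is also a perfectly valid integer. Only the type tells us which reading is intended.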


Pointers

An important type of data is the pointer, which holds the value of a memory address. Pointers identify the location of some other data in memory. Every location in memory has a unique address.

flowchart LR;
  subgraph somewhere in memory
    0b00000001
    0b00000010
  end
  0x7f3b40df00f0 ---> 0b00000001
  0x7f3b40df0110 ---> 0b00000010

32 bits of data are enough to uniquely identify a single byte within 2^32 bytes (4,294,967,296 bytes or 4 GiB) of memory.

64 bits of data are in principle enough to uniquely identify a single byte within 2^64 bytes (18,446,744,073,709,551,616 bytes or 16 EiB, roughly 17 billion GiB) of memory, though real systems do not use all 64 bits for directly addressing memory.
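The arithmetic behind these figures is easy to check in Python, whose integers have no fixed size limit. The helper addressable_bytes and the constant GIB are names made up for this sketch.

```python
def addressable_bytes(bits):
    """Number of distinct byte addresses available with a pointer of this width."""
    return 2 ** bits

GIB = 2 ** 30  # one gibibyte, in bytes

print(f"{addressable_bytes(32):,} bytes = {addressable_bytes(32) // GIB:,} GiB")
print(f"{addressable_bytes(64):,} bytes = {addressable_bytes(64) // GIB:,} GiB")
# 4,294,967,296 bytes = 4 GiB
# 18,446,744,073,709,551,616 bytes = 17,179,869,184 GiB
```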


Statically typed variables

In languages with statically typed variables, each variable must be declared with a type that cannot change. This code in C demonstrates the idea. We are declaring a variable a of type int and assigning it the value 1. Then the code updates the value of a, setting it to 2.

int a = 1;
a = 2;

Both operations will write data into the same chunk of memory.

Since C is a statically typed language, the type of a variable cannot be changed.

int b = 1;
b = "two";

This code violates C's type rules. Modern compilers reject it, and older ones at least warn about it.


Dynamically typed languages

In dynamically typed languages, types are associated with values, not variables. All data in python are represented internally as pyObjects. The following code creates a pyObject of type <int> with the value 1.

a = 1

In python, variables are simply names, pointing to these complex objects.

pyObject
id     0x7fa8655e00f0
type   <class 'int'>
value  1
refs   1
A pyObject

On a typical 64-bit CPython build, the integer 1 occupies 28 bytes in memory and the type <class 'int'> occupies around 408 bytes (the exact figures vary between versions). Both are pyObjects.
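Sizes and reference counts can be inspected with the standard library sys module. The sketch below assumes CPython, where both are implementation details, so the printed numbers will vary. A fresh object() is used for the reference count because small integers are shared by the interpreter and their counts are not meaningful.

```python
import sys

a = 1

# Sizes are implementation details and differ between versions and platforms.
size_of_value = sys.getsizeof(a)     # bytes used by the int object
size_of_type = sys.getsizeof(int)    # bytes used by the type object itself

# getrefcount reports how many references exist to an object
# (the call itself temporarily adds one).
x = object()
refs_before = sys.getrefcount(x)
y = x                                # a second name for the same object
refs_after = sys.getrefcount(x)

print(size_of_value, size_of_type)
print(refs_before, refs_after)
```

Binding the second name y creates no new object. It only increases the reference count of the existing one.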


Inspecting a pyObject’s identifier

We can view the identifier of an object in Python using the built-in function id().

id(a)
140361231892720

The identifier is an integer.

In the CPython implementation, the id is the object's memory address. We can express it in hexadecimal using an f-string.

f'{id(a):#x}'
0x7fa8655e00f0
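Because variables are names pointing at objects, two names can share one identifier. The following sketch uses a list, chosen only because lists can be modified in place, to show the difference between being the same object (is) and having an equal value (==).

```python
a = [1, 2, 3]
b = a            # b is a second name for the same object
c = [1, 2, 3]    # c refers to a different object with an equal value

same_object = a is b                             # True: identical ids
equal_but_distinct = (a == c) and (a is not c)   # True

b.append(4)      # changing the object through one name...
print(a)         # ...is visible through the other: [1, 2, 3, 4]
print(id(a) == id(b), id(a) == id(c))            # True False
```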

Inspecting object type

The built-in function type will allow us to inspect the type of a.

type(a)
<class 'int'>

The type, in this case, is the class int. Integers are instances of the class int.

Types also have a type!

type(type(a))
<class 'type'>

Types are instances of the class type!
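The same inspection works for any value. This short sketch looks up the types of a few arbitrary example values at run time and confirms that each of those types is itself an instance of type.

```python
values = [1, 'two', 3.0, [4], None]

# Every value carries its own type, which we can look up at run time.
type_names = [type(value).__name__ for value in values]
print(type_names)   # ['int', 'str', 'float', 'list', 'NoneType']

# Types are objects too: each one is an instance of the class type.
all_types_are_types = all(isinstance(type(value), type) for value in values)
print(all_types_are_types)   # True

# Even type is an instance of type.
print(type(type) is type)    # True
```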


Inspecting attributes

Depending on their type, objects sometimes have methods which we can use. To get a list of all attributes, we can use the built-in dir() function. The result can be confusing. String objects have lots of methods, many of which are so-called __dunder__ methods.

dir('hello')
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

Ignore the __dunder__ methods

In python, methods (and other attributes) with double underscores are special. For now, we can ignore them with a list comprehension that filters the list to only those without the double underscores.

[method for method in dir('hello') if not method.startswith('__')]
['capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

Notice that the dir function returned a list of attribute names as strings. Also notice that we used the string method startswith in the above list comprehension.
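The filtering idea can be wrapped in a small reusable function. The helper public_methods below is a hypothetical name for this sketch. It additionally uses the built-ins getattr and callable to keep only attributes that can actually be called.

```python
def public_methods(obj):
    """Return the names of callable attributes that do not start with '__'."""
    return [
        name for name in dir(obj)
        if not name.startswith('__') and callable(getattr(obj, name))
    ]

names = public_methods('hello')

# Because the names are strings, getattr can look a method up by name.
shout = getattr('hello', 'upper')
print(shout())            # HELLO
print('upper' in names)   # True
```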


Python comes with batteries included

xkcd comic

Thanks for listening

Any questions?

We should have plenty of time for questions and answers.

Just ask, you are probably not the only one who wants to know.