Python Programming Language

Python is a scripting language that runs on an interpreter. Under the hood, Python scripts are compiled into byte code and run by the Python Virtual Machine (PVM) - the runtime engine that loops through byte code instructions individually.
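As a rough illustration of the compile step, the standard dis module can list the byte code instructions CPython generates for a function. The function add below is a made-up example, and the exact instruction names vary between CPython versions, so no output is shown.

```python
import dis

def add(a, b):
    # Hypothetical example function, used only to have something to compile
    return a + b

# The compiled byte code is stored on the function's code object
raw_bytecode = add.__code__.co_code

# Human-readable names of the instructions the virtual machine will execute
instruction_names = [ins.opname for ins in dis.get_instructions(add)]
```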

Everything in Python is an object, and every object can be classified as mutable or immutable. Numbers, strings and tuples are immutable, while lists, dictionaries and sets are mutable. Because Python variables hold references to objects (similar to pointers in C) and assignment copies only the reference, changing a mutable object in place affects every variable or container that references it.
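A minimal sketch of this shared-reference behaviour, using made-up variable names:

```python
# Two names bound to the same list object
a = [1, 2, 3]
b = a              # no copy is made; b refers to the same object as a
b.append(4)        # in-place change made through b

same_object = a is b   # both names still refer to one list

# With an immutable value, "changing" it rebinds the name to a new object
x = 10
y = x
y += 1             # y now refers to a different int; x is untouched
```

After this runs, a is [1, 2, 3, 4] even though only b was modified, while x is still 10.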

Different mutable types have different built-in ways to make copies. For example, slicing a list yields a new list, and dictionaries have a copy() method. All copies made in this way are shallow, meaning that only the top-level references are copied. To make a deep copy of a nested object recursively, use the deepcopy() function from the copy module.
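The difference between a shallow and a deep copy only shows up with nested objects. A small sketch, with illustrative names:

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested[:]             # new outer list, same inner lists
deep = copy.deepcopy(nested)    # new outer list and new inner lists

nested[0].append(99)            # mutate an inner list in place

shares_inner = shallow[0] is nested[0]   # the shallow copy sees the change
deep_is_independent = deep[0] == [1, 2]  # the deep copy does not
```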

Python is dynamically and strongly typed. There are no type declarations in Python. Instead, the syntax of code expressions determines the types of objects. For example, square brackets represent lists, and curly braces define dictionaries. A variable is created when you assign it a value, and a variable must be assigned a value before you can use it.
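The following sketch illustrates these points; the name not_yet_assigned is deliberately left undefined for the purpose of the example.

```python
# Literal syntax determines the type of the object created
kinds = (type([1, 2]), type({'a': 1}), type('text'), type(3.14))

# Strong typing: unrelated types are not silently coerced
try:
    'Python' + 3
except TypeError:
    strongly_typed = True

# A name must be assigned before it is used
try:
    not_yet_assigned
except NameError:
    must_assign_first = True
```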

Dynamic Typing in Python

In Python, the type information is associated with objects, not variables. Each Python object contains a header field that tags the object with a type (implemented as a pointer to the type object). For example, the assignment a = 'Python' creates a ‘typeless’ variable a that points to a string object. That object has a type tag pointing to str, the type object for strings.
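Because the type lives on the object, the same variable name can be rebound to objects of different types. A short sketch:

```python
a = 'Python'
first_type = type(a)     # the str type object

a = 42                   # the same name now refers to an int object
second_type = type(a)

a = [1, 2]               # and now to a list object
third_type = type(a)
```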

Garbage Collection & References

Certain frequently used immutable objects, such as the number 1 or the string 'a', may not be reclaimed by the garbage collector immediately after they lose all references from user-defined variables (that is, when their reference count would otherwise drop to 0). The Python runtime may cache them, keep the memory around, and reuse the objects later. In addition, the Python runtime itself often references certain objects and keeps them in memory. For example, the number 1 will usually have a much larger reference count than one might expect (the exact figure depends on the interpreter version and session).

import sys
sys.getrefcount(1)
 
# 5854

Built-in Data Types

In Python, types in the same category share the same set of operations. For example, all sequence types, such as strings, lists and tuples, support indexing, slicing and concatenation operations. Strings are also immutable and, like other immutable types (frozen sets and tuples), do not support in-place changes.
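As a quick illustration, the same indexing, slicing and concatenation expressions work unchanged on a string, a list and a tuple, while only the mutable list accepts an in-place assignment. The sample values are arbitrary.

```python
text = 'spam'
items = ['s', 'p', 'a', 'm']
pair = ('s', 'p', 'a', 'm')

# The same three operations work on every sequence type
firsts = [seq[0] for seq in (text, items, pair)]       # indexing
middles = [seq[1:3] for seq in (text, items, pair)]    # slicing
doubled = [seq + seq for seq in (text, items, pair)]   # concatenation

# Immutable sequences reject in-place changes
try:
    text[0] = 'S'
except TypeError:
    string_is_immutable = True

items[0] = 'S'   # lists are mutable, so this succeeds
```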

The three main data type categories are:

Numbers (integer, floating point, decimal, fraction…) support addition and multiplication
Sequences (strings, lists and tuples) support iteration, indexing, slicing, and concatenation
- Slicing assignments can be considered a combination of two steps: deletion and insertion of an entire section of the original object. The length of the deletion and insertion does not have to be the same; hence, it can be used to replace, expand or shrink the object at hand. For example:
```
L = ['eggs', 'bacon', 'spam']
L[0:2] = ['eat', 'more']
 
L 
# ['eat', 'more', 'spam']
 
L[0:1] = ['I', 'would', 'like', 'to']
L
# ['I', 'would', 'like', 'to', 'more', 'spam']
```
Mappings (dictionaries) support indexing by key

Boolean

As in many scripting languages, Booleans have a dedicated notation: they are written and printed as the words True and False. Internally, however, the Boolean type (bool) is a subtype of int whose two values behave like 1 and 0. You can even write semantically odd code such as True + 4.

All objects in Python have an intrinsic true or false characteristic. For example, 0 is false, but other numbers are true; the empty string '' is false, but the string 'Python' is true. This is similar to the truthy and falsy values of JavaScript.
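Both points can be checked directly with bool() and issubclass(). The sample values below are arbitrary:

```python
# bool is a subclass of int, so Booleans take part in arithmetic
bool_is_int = issubclass(bool, int)
total = True + 4                 # True behaves like 1

# Zero, empty containers and None are false; most other objects are true
falsy_values = [0, 0.0, '', [], {}, set(), None]
truthy_values = [1, -1, 'Python', [0], {'k': 0}]

all_falsy = not any(bool(v) for v in falsy_values)
all_truthy = all(bool(v) for v in truthy_values)
```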

Numbers

Python supports a comprehensive range of number types, including integers, floating-point numbers, complex numbers, decimals (the Decimal type) and rationals (the Fraction type). The Python ecosystem also offers a wealth of libraries and packages for advanced mathematical and scientific computation, such as matrix and vector processing and sophisticated graphics and plotting.
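A brief sketch of how these number types differ in practice, using only the standard library; the values are chosen purely for illustration.

```python
from decimal import Decimal
from fractions import Fraction

big = 2 ** 100                                   # integers have arbitrary precision
float_sum = 0.1 + 0.2                            # binary floating point, not exactly 0.3
decimal_sum = Decimal('0.1') + Decimal('0.2')    # exact decimal arithmetic
fraction_sum = Fraction(1, 3) + Fraction(1, 6)   # exact rational arithmetic
product = (1 + 2j) * (3 - 1j)                    # complex numbers are built in
```

Here decimal_sum equals Decimal('0.3') and fraction_sum equals Fraction(1, 2), whereas float_sum differs slightly from 0.3.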

String

The String type in Python can be seen as a sequence of one-character strings. Like all other sequence types, it supports positional ordering operations such as len() and the indexing expression s[i]. The String type has a rich set of type-specific methods such as splitlines, encode, endswith and isalpha. The full list can be viewed with dir(str). The String type also supports + (concatenation) and * (repeat) operations through operator overloading (a form of polymorphism, meaning the same operators behave differently depending on the objects being operated on).
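The operations and methods mentioned above, applied to a sample string:

```python
s = 'Python'

length = len(s)                      # 6
first, last = s[0], s[-1]            # 'P' and 'n'
joined = s + '3'                     # concatenation
repeated = s * 2                     # repetition

lines = 'one\ntwo'.splitlines()      # split on line boundaries
encoded = s.encode('utf-8')          # str -> bytes
ends_with_on = s.endswith('on')
is_alphabetic = s.isalpha()
```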

List

Python Lists are positionally ordered collections of objects of arbitrary types.

Lists are of sequence type in Python. Unlike strings, which are also sequences, lists are mutable, providing a flexible data structure for any arbitrary collection, such as files in a directory or to-do items in a task list.

Lists are analogous to the array type in other programming languages, with the difference that Python lists can hold content of arbitrary types. Similar to multidimensional arrays, Python Lists allow arbitrary nesting.

Many built-in Python List methods modify the list in place: they can extend or shrink it (pop, insert, extend, etc.) or rearrange the entire list (such as sort). However, a list cannot be grown by assigning to an index that is out of bounds. Python performs bounds checking and raises an IndexError instead, whereas C performs no such check on array accesses.
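A short sketch of bounds checking and of the in-place methods, with an arbitrary sample list:

```python
L = ['a', 'b', 'c']

# Assigning past the end raises IndexError instead of growing the list
try:
    L[3] = 'd'
except IndexError:
    out_of_bounds = True

# Grow, shrink and reorder the list in place with methods instead
L.append('d')            # ['a', 'b', 'c', 'd']
L.insert(0, 'z')         # ['z', 'a', 'b', 'c', 'd']
L.extend(['e', 'f'])     # ['z', 'a', 'b', 'c', 'd', 'e', 'f']
popped = L.pop()         # removes and returns 'f'
L.sort()                 # ['a', 'b', 'c', 'd', 'e', 'z']
```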

Lists support list comprehension expressions, a powerful feature that simplifies processing complex data structures such as matrices. List comprehension expressions always build new lists by iterating over iterable objects without altering the existing source objects.

For example:

[row[1] for row in M if row[1] % 2 == 0]

to filter out the odd items in column 2 of a matrix or

[M[i][i] for i in [0, 1, 2]]

to collect a diagonal from a 3x3 matrix.
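For a runnable version of these two comprehensions, assume M is the following 3x3 matrix stored as a list of rows (the values are made up for illustration):

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# Keep only the even items of column 2 (index 1)
even_in_col2 = [row[1] for row in M if row[1] % 2 == 0]

# Collect the main diagonal
diagonal = [M[i][i] for i in range(len(M))]
```

This gives [2, 8] and [1, 5, 9] respectively, and M itself is left unchanged.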

Tuple

Tuple is a sequence type in Python. It is the immutable version of the List. Tuples are ordered collections of arbitrary objects. Tuple literals are coded in parentheses (as opposed to square brackets for Lists). Tuple supports arbitrary types, arbitrary nesting and the usual sequence operations.

t1 = (1, 2, 3, 4)
 
# The parentheses can be omitted when creating a tuple (when the context is not ambiguous to do so)
 
t2 = 'python', 3.1415926, [1, 2, 3, 5, 8]

As with all immutable collections in Python, tuples store the references to objects and not the objects themselves, which means the referenced objects can still be altered.

t = (1, 2, [3, 4])
t
# (1, 2, [3, 4])
 
t[2][1] = 5
t
# (1, 2, [3, 5])

Dictionary

The Dictionary type is the only built-in mapping type in Python. Dictionaries are collections of arbitrary objects stored and retrieved by key instead of by positional offset like the sequence types. They were unordered in older versions of Python; since Python 3.7, they preserve insertion order. The literal syntax for dictionaries is curly brackets {key1: value1, key2: value2, ...}. Like lists, dictionaries are mutable and arbitrarily nestable.

Python dictionaries are similar to associative arrays or hashes in other programming languages. Internally, Python dictionaries are implemented as hash tables optimised for speed and efficiency. That means any hashable object, and only a hashable object, can be used as a dictionary key.

Setting a value for a non-existing key simply creates the entry rather than raising an out-of-range error, which makes dictionaries ideal for sparse data structures. Reading a missing key with the indexing syntax d[key] raises a KeyError, but the dict.get() method returns None (or a supplied default) instead. A dictionary can even be used to simulate a ‘flexible list’ when integers are used as keys. Another common use case is to use tuples as keys to represent sparse matrices.

movies = { 1975: 'Holy Grail', 
          1979: 'Life of Brian', 
          1983: 'The Meaning of Life'}
movies[1979]
 
# Output: 'Life of Brian'
 
matrix = {}
matrix[(2, 3, 4)] = 88
matrix[(7, 8, 9)] = 99
 
x, y, z = 2, 3, 4
matrix[(x, y, z)]
 
# Output: 88
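The behaviour for missing keys can be sketched as follows, using a small made-up dictionary that acts as a sparse, integer-keyed ‘flexible list’:

```python
sparse = {}
sparse[99] = 'spam'          # setting a new key never raises an error

# Indexing a missing key raises KeyError
try:
    sparse[0]
except KeyError:
    missing_key_raised = True

# get() returns None, or a supplied default, instead of raising
no_value = sparse.get(0)
fallback = sparse.get(0, 'empty')
present = sparse.get(99)
```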

Set

A set in Python is an unordered collection of unique and immutable objects that support mathematical set theory operations. Since unordered sets do not map values to keys, sets are neither sequence nor mapping types [@lutzLearningPython2003]. Sets can only contain hashable (immutable) objects; mutable objects such as lists or dictionaries cannot be embedded in sets. To store compound values, use Tuples.

s = {1, 2, 3}
 
s.add([1, 2, 3])
# TypeError: unhashable type: 'list'
 
s.add((4, 5, 6))
# {1, 2, 3, (4, 5, 6)}

Sets themselves are mutable and therefore unhashable, so they cannot be nested in other sets directly. To store sets inside other sets, use the built-in frozenset to create an immutable set.
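A short sketch of the set-theory operators and of nesting with frozenset; the sample values are arbitrary.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b                  # {1, 2, 3, 4}
intersection = a & b           # {3}
difference = a - b             # {1, 2}
symmetric_diff = a ^ b         # {1, 2, 4}

# A plain set is unhashable and cannot be added to another set
outer = set()
try:
    outer.add({1, 2})
except TypeError:
    plain_set_rejected = True

# A frozenset is immutable and hashable, so it can be nested
outer.add(frozenset({1, 2}))
contains_frozen = frozenset({2, 1}) in outer
```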

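Similar to the List type, Set supports comprehension expressions. The only difference is that a set comprehension uses curly brackets {...} instead of square brackets [...]. Note that an empty pair of curly brackets {} creates a dictionary, not a set; use set() for an empty set.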

{ x ** 2 for x in [1, 2, 3, 4] }
# {1, 4, 9, 16}
 
{ x for x in 'Python' }
# {'P', 'h', 'n', 'o', 't', 'y'}
 
{ x * 3 for x in 'Pythony' }
# {'PPP', 'hhh', 'nnn', 'ooo', 'ttt', 'yyy'}
# Notice that a set can only contain unique objects

File

Unlike many other programming languages, Python treats the file type as a built-in core type; a file object is obtained by calling the built-in open() function. In Python 3, files opened in text mode (the default) are read and written as Unicode strings, which are encoded and decoded using a default encoding, unless you specifically pass in the b (for binary) flag, in which case the content is handled as raw bytes.

Python has many built-in utilities for handling files. For example, the pickle module serialises objects, the struct module packs and unpacks binary data, the json module converts Python objects to and from JSON syntax, and the shelve module provides keyed storage and access to Python objects.
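A minimal sketch of text versus binary mode and of three of these modules. The record, the file name demo.txt and the struct format string are assumptions made for the example, and a temporary directory is used so nothing is left on disk.

```python
import json
import os
import pickle
import struct
import tempfile

record = {'name': 'spam', 'tags': ['a', 'b'], 'count': 3}

# json and pickle both round-trip simple Python objects
from_json = json.loads(json.dumps(record))
from_pickle = pickle.loads(pickle.dumps(record))

# struct packs values into a fixed binary layout:
# big-endian, one 4-byte int and one 2-byte unsigned short
packed = struct.pack('>iH', 7, 8)
unpacked = struct.unpack('>iH', packed)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')
    with open(path, 'w', encoding='utf-8') as f:    # text mode: str in
        f.write('café')
    with open(path, 'r', encoding='utf-8') as f:    # text mode: str out
        text = f.read()
    with open(path, 'rb') as f:                     # binary mode: raw bytes
        raw = f.read()
```

The text read returns the four-character string 'café', while the binary read returns the five UTF-8 bytes that encode it.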

Comparison and Equality Tests for Built-in Types

In Python, the equality test (==) checks value equivalence, recursively for nested objects. The is operator performs an identity (reference) test. Because Python caches some immutable objects internally, two separately assigned short strings a = 'Python', b = 'Python' may pass the is test a is b. However, users should not rely on this implementation detail and should always perform the test that matches the intended semantics.

Different types have different definitions of equality and relative magnitude. For example, sets are equal if they contain the same items, dictionaries are equal if their sorted (key, value) lists are equal, and lists and tuples are compared component by component from left to right.

Unlike JavaScript, Python does not implicitly convert types for comparison (apart from numbers, which are converted to the widest type involved). In Python 3, an ordering comparison such as < between incompatible types raises a TypeError, while an equality test between them simply returns False.
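The rules in this section can be checked with a few small comparisons. In Python 3, ordering comparisons between unrelated types raise a TypeError, while == between them evaluates to False. The sample values are arbitrary.

```python
a = [1, ('x', 2)]
b = [1, ('x', 2)]

same_value = a == b            # True: compared recursively by value
same_object = a is b           # False: two distinct list objects

sets_equal = {1, 2} == {2, 1}              # same items
dicts_equal = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
list_order = [1, 2, 3] < [1, 3]            # decided at the second item

mixed_numbers = 1 == 1.0       # numbers are converted before comparing
mixed_equal = 1 == '1'         # different types: simply not equal

try:
    1 < '1'                    # ordering across int and str is an error
except TypeError:
    ordering_failed = True
```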

Python Virtual Environment

An environment in Python is an isolated context in which a Python program can be run. It consists of the Python interpreter and other required packages for the program. Using an environment is similar to the idea of Docker, which allows you to run an application in an isolated environment without polluting the host machines. For Python, this is especially important for operating systems that use multiple package managers for Python utilities, where the package managers may install conflicting packages and interfere with each other.

The venv module can be used to create a lightweight virtual environment on top of an existing Python installation.

python -m venv /path/to/new/virtual/environment

A virtual environment can be activated by running:

source <venv>/bin/activate

It prepends the venv to your PATH, so running regular Python commands, including pip and python, will invoke the virtual environment’s interpreter without being explicitly told so.
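To confirm from inside Python which interpreter is actually running, one option is to compare sys.prefix with sys.base_prefix; the two differ when the interpreter belongs to an environment created by venv. A small sketch:

```python
import sys

# True when running inside a venv-created virtual environment
in_virtual_env = sys.prefix != sys.base_prefix

# Path of the interpreter executing this code
interpreter_path = sys.executable
```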

The Python Interpreter

The Python interpreter comes with a collection of __builtins__ - a set of built-in functions, exceptions and other objects. Use help(__builtins__) to learn more about them.
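The same names can also be inspected programmatically through the standard builtins module, as in this short sketch:

```python
import builtins

names = dir(builtins)

# Built-in functions and exceptions live in the same namespace
has_len = 'len' in names
has_key_error = 'KeyError' in names

# Names that refer to built-in exception classes
exception_names = [
    n for n in names
    if isinstance(getattr(builtins, n), type)
    and issubclass(getattr(builtins, n), BaseException)
]
```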

Statement vs. Expressions

In Python, an expression is a programming construct that can be reduced to a value, or a collection of symbols that jointly express a quantity. For example: 3 + 5 or [a.x for a in some_iterable]. A simple way to classify a piece of code is to ask whether you can feed it to eval() - a valid expression can always be evaluated by eval(). Expressions can only contain identifiers, literals, and operators - including the call operator (), subscription operator [], arithmetic, and boolean operators.
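The eval() test can be tried directly: eval() accepts an expression but rejects an assignment statement, which has to go through exec() instead. A small sketch:

```python
# An expression reduces to a value
value = eval('3 + 5')

# An assignment is a statement, so eval() refuses it
try:
    eval('x = 3 + 5')
except SyntaxError:
    assignment_is_statement = True

# exec() runs statements; results land in the supplied namespace
namespace = {}
exec('x = 3 + 5', namespace)
x_value = namespace['x']
```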

Statements are everything else that can make up a line. They perform actions, that is, they do something. Statements are the smallest standalone element in an imperative programming language. The distinction between an expression and a statement is an important one in avoiding common coding mistakes:

# This is a common mistake in Python

L = [1, 2, 3]
L = L.append(4)
# append() changes the list in place and returns None
# `=` forms an assignment statement, so L is rebound to None

print(L)        # None
# By assigning the result of L.append() to L, you have lost the
# reference to the list for the reasons above

# The correct way to do this is:

L = [1, 2, 3]
L.append(4)
print(L)        # [1, 2, 3, 4], as expected

Every valid expression can be used as a statement (called an expression statement).

References

Lutz, M. & Ascher, D. (2003). Learning Python. O'Reilly Media, Inc.
Wikipedia Contributors (2019). Python (programming language). [online] Wikipedia. Available at: https://en.wikipedia.org/wiki/Python_(programming_language).
