Basic data types¶

In code you can define variables, and each variable has a type: the kind of thing that the variable represents. For example, a variable might hold an integer number, or a sequence of characters (known as a string).

A variable is the name you give to an object that you create in your code, e.g., x = 2 defines a variable called x, which can be reused later in the code.

Note

Variable names must start with a letter or an underscore, and can then contain letters, numbers or underscores. They cannot be one of Python's reserved keywords (such as class or for). Variable names are case sensitive, i.e., a = 2 and A = 2 define different variables. It is useful to have descriptive variable names.
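As a rough check of these naming rules, the sketch below uses the built-in str.isidentifier() method together with the standard-library keyword module. The candidate names are made up purely for illustration. Note that a leading underscore is accepted, while a reserved keyword such as class is not usable as a variable name.

```python
import keyword

# hypothetical candidate names, chosen only for illustration
candidates = ["speed", "speed_2", "_hidden", "2fast", "my-name", "class"]

valid = {}
for name in candidates:
    # a usable variable name is a valid identifier that is not a reserved keyword
    valid[name] = name.isidentifier() and not keyword.iskeyword(name)

for name, ok in valid.items():
    print(f"{name!r}: {ok}")
```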

Some languages have static typing, where a variable's type is fixed when it is declared and often has to be stated explicitly. In the C language you would define a variable that holds an integer with, e.g.,

int myvariable = 2;

Python is instead dynamically typed, and is often described as a "duck typing" language - "If it walks like a duck and it quacks like a duck, then it must be a duck". You do not declare types; for the basic data types Python works out the type from what the value "looks" like:

x = 5
type(x)
<class 'int'>

It has determined that the variable x is of the integer (or int) type. In this example type() is a built-in Python function that returns the type of a variable.
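Because the type belongs to the object rather than to the name, the same variable name can later be given an object of a different type. A minimal sketch (the names first_type and second_type are only for illustration):

```python
x = 5
first_type = type(x)   # the name x currently refers to an int

x = "five"
second_type = type(x)  # the same name now refers to a str

print(first_type)
print(second_type)
```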

Note

Everything in Python is an object (hence object oriented programming, or OOP). In OOP an object is a thing that contains data in the form of variables and/or functions to act on that data. All variables are objects and therefore the type refers to the "type" of object. "Type" is sometimes used interchangeably with "class", where a class defines a type.

The main basic data types (in Python and many languages) are:

int: represents a positive or negative integer number (or zero);
float: represents a "floating point number", i.e., a number with a decimal point;
str: represents some text, known as a "string";
bool: represents a boolean value, i.e., "True" or "False".

# defining an integer
myInteger = 2

# defining a float (note the decimal point)
myFloat = -59.87534

# define a string
myString = "Hello"

# define a boolean
myBool = True
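To confirm which type each of these values has, one option is to combine type() with the built-in isinstance() function. This is only a sketch, and the list of values simply repeats the examples above:

```python
values = [2, -59.87534, "Hello", True]

# collect the name of each value's type
type_names = []
for value in values:
    type_names.append(type(value).__name__)
print(type_names)

# isinstance() checks whether an object is of a given type
print(isinstance(2, int))
print(isinstance("Hello", float))
```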

The basic data types are objects, and as such have data attributes and methods:

data attributes are variables that are contained within an object;
methods are functions within an object.

Examples:

x = 1
# int and float objects contain real and imag attributes
x.real
1

y = 1.5
# floats contain an is_integer() method
y.is_integer()
False

# the float can be returned as an integer ratio
y.as_integer_ratio()
(3, 2)
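One way to see the difference between a data attribute and a method is the built-in callable() function, which reports whether something can be called like a function. A small sketch, with variable names chosen for illustration:

```python
y = 1.5

# a data attribute is accessed without brackets
print(y.real)

# a method is a function, so it is called with brackets
print(y.is_integer())

# callable() tells the two apart
attribute_is_callable = callable(y.real)
method_is_callable = callable(y.is_integer)
print(attribute_is_callable, method_is_callable)
```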

Note

Floating point numbers are stored in binary using a finite amount of computer memory, so many decimal values cannot be represented exactly. This can show up as a small loss of precision in the last significant figures, e.g.:

x = 4.1 - 1.2
print(x)
2.8999999999999995

The above calculation has produced a number that is very close to, but not quite, 2.9. This can mean that you have to be careful if doing comparisons with floats, as:

print(x == 2.9)
False

You might instead do:

print(abs((x - 2.9)/x) < 1e-15)  # difference has very small relative error
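An alternative to writing the tolerance check by hand is the standard library's math.isclose() function, which compares two floats using a relative tolerance (1e-09 by default). A minimal sketch using the same calculation:

```python
import math

x = 4.1 - 1.2

exactly_equal = (x == 2.9)           # exact comparison
close_enough = math.isclose(x, 2.9)  # comparison within a relative tolerance

print(exactly_equal)
print(close_enough)
```

If the values being compared can be very close to zero, the abs_tol argument of math.isclose() may also be needed, since a relative tolerance alone is not meaningful there.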

Strings¶

Strings can be defined in three different, but equivalent, ways:

z1 = "Hello"
z2 = 'Hello'
z3 = """Hello"""  # you could also use three consecutive apostrophes
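As a quick check that the three forms really are equivalent, the sketch below compares them. It also shows one practical difference: a triple-quoted string can run over several lines. The names all_equal, multiline and lines are only for illustration.

```python
z1 = "Hello"
z2 = 'Hello'
z3 = """Hello"""

# all three forms create the same string
all_equal = (z1 == z2 == z3)
print(all_equal)

# triple quotes also allow a string to span more than one line
multiline = """Hello
World"""
lines = multiline.splitlines()
print(lines)
```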

String methods¶

Strings have a lot of methods, for example:

z = "Hello"
# show the string in upper case
z.upper()
'HELLO'
# replace l's with x's
z.replace("l", "x")
'Hexxo'

Some particularly useful methods (at least ones I use regularly!) are the split and strip methods.

The split method allows you to split a string into a list of values based on a particular separator (lists are covered in another tutorial). By default split will split a string based on whitespace, i.e., spaces, tabs or new lines, e.g.:

name = "Matthew Pitkin"
# split this into a list containing the first name and surname
names = name.split()
print(names)
['Matthew', 'Pitkin']

You can also split on a particular character, e.g.:

# split comma separated values
x = "1,2,3,4"
vals = x.split(",")
print(vals)
['1', '2', '3', '4']

The strip method will strip off leading and trailing characters from a string. By default it strips off whitespace. This is useful if, for example, you have a set of comma separated values (maybe read in from a file) and they contain superfluous spaces:

namelist = "Matthew, Helen, Tom, Tracy, Steve"
# split these names on commas
names = namelist.split(",")
print(names)  # they'll still contain extra space
['Matthew', ' Helen', ' Tom', ' Tracy', ' Steve']
# so instead strip each name after splitting:
names = [name.strip() for name in namelist.split(",")]
print(names)
['Matthew', 'Helen', 'Tom', 'Tracy', 'Steve']

The above example uses a list comprehension, which is covered in another tutorial.
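strip has two close relatives, lstrip and rstrip, which only act on the left or right end of the string, and all three accept an optional argument giving the characters to remove. A short sketch with made-up strings:

```python
padded = "   Matthew   "

stripped_both = padded.strip()    # removes whitespace from both ends
stripped_left = padded.lstrip()   # removes whitespace from the left only
stripped_right = padded.rstrip()  # removes whitespace from the right only

# passing characters to strip() removes those instead of whitespace
dashed = "--Helen--"
undashed = dashed.strip("-")

print(repr(stripped_both), repr(stripped_left), repr(stripped_right), repr(undashed))
```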

String concatenation¶

To join strings together you can just use the addition operator +, e.g.,

x = "Hello"
y = " "  # a space
z = "World!"
phrase = x + y + z
print(phrase)
Hello World!
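When there are several strings to join, the join method of a separator string is a common alternative to repeated use of +. The sketch below also shows that + only joins strings to strings, so other types need converting with str first. The variable names are only for illustration.

```python
words = ["Hello", "World!"]

# join() places the separator string between each item of the list
phrase = " ".join(words)
print(phrase)

# + cannot join a string to an int: this raises a TypeError
try:
    label = "Age: " + 21
except TypeError:
    label = "Age: " + str(21)  # convert the int to a string first
print(label)
```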

String formatting¶

The basic Python types have a representation of themselves in a string format. So, for example, if you define an integer and print it you will see:

x = 34
print(x)
34

You can place an object's string representation into another string using the format() method. For example:

x = 12
y = 14
z = x + y
print("The sum of {} + {} = {}".format(x, y, z))
The sum of 12 + 14 = 26

The format() method replaces the curly brackets with the string representations for x, y and z.

You can have even more control over how numbers are displayed, for example:

x = 12.9627459845
y = 13.9875284843
z = x + y
print("The sum of {0:.2f} + {1:.2f} = {2:.2f}".format(x, y, z))
The sum of 12.96 + 13.99 = 26.95

In this case the 0 represents the first argument to format (arguments are counted from zero, so 1 and 2 are the second and third), which is followed by the formatting type :.2f that means "show a floating point number (f) to two decimal places (.2)."
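The same mini-language supports other layouts besides a fixed number of decimal places. The sketch below shows a few commonly used format specifications applied to an arbitrary example value:

```python
value = 1234.56789

fixed = "{:.1f}".format(value)       # one decimal place
scientific = "{:.3e}".format(value)  # scientific notation, three decimal places
padded = "{:>12.2f}".format(value)   # right-aligned in a field 12 characters wide
grouped = "{:,.2f}".format(value)    # comma as a thousands separator

print(fixed)
print(scientific)
print(repr(padded))
print(grouped)
```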

In Python version 3.6, another way to use string formatting was introduced. If you define a string with an f before the opening quotes (known as an f-string) you can use variable names within curly brackets to show their values, e.g.:

firstname = "Matthew"
age = 21

mystring = f"My name is {firstname} and my age is {age}."
print(mystring)
My name is Matthew and my age is 21.
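The curly brackets in an f-string are not limited to plain variable names: they can hold an expression, optionally followed by the same kind of format specification used with format(). A small sketch, with an approximate value of pi and a made-up radius:

```python
radius = 2.5
pi = 3.14159  # approximate value, for illustration only

# an expression inside the brackets, followed by a format specification
message = f"A circle of radius {radius} has area {pi * radius**2:.2f}."
print(message)
```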

Note

There are multiple equivalent ways of getting variables into strings, and (at least) a couple of other options exist. The first of these is the older "printf-style" syntax, which was common in Python 2 and still works in Python 3:

mystring = "My name is %s and my age is %d." % (firstname, age)

where the "%s" is a placeholder for inserting a string and the "%d" is a placeholder for inserting an integer (%f would be used for a float, or %e if you wanted it in scientific notation; the full range of conversion types is listed in the Python documentation on printf-style string formatting). After the string a % character is used, followed by a tuple containing the variables to insert, given in the order they are to be inserted.

Another option is to just concatenate multiple strings together, e.g.,

mystring = "My name is " + firstname + " and my age is " + str(age) + "."

Note that str has to be used here to convert the integer valued age variable into a string before it can be concatenated.

We recommend using the f-string method of string formatting, although it's more important to pick one method and be consistent throughout your code.
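To check that these approaches really do give the same result, the sketch below builds the sentence in each of the four ways and compares them. The with_* variable names are only for illustration.

```python
firstname = "Matthew"
age = 21

with_fstring = f"My name is {firstname} and my age is {age}."
with_format = "My name is {} and my age is {}.".format(firstname, age)
with_percent = "My name is %s and my age is %d." % (firstname, age)
with_concat = "My name is " + firstname + " and my age is " + str(age) + "."

# chained comparison: True only if all four strings are identical
all_same = with_fstring == with_format == with_percent == with_concat
print(all_same)
```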

Unicode, escape characters and "raw" strings¶

In Python 3 strings are Unicode strings by default. This means that in addition to the standard Roman alphabet keyboard characters, they can include characters such as accented letters, non-Roman alphabet characters, and various emoji symbols:

sentence = "Dr Müller likes 𝜋 🙂"

In fact, you can even have variable names that use (some) Unicode characters:

𝜋 = 3.14
print(𝜋)
3.14
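One consequence of Unicode strings is that the number of characters in a string is not necessarily the number of bytes needed to store it. The sketch below uses the Greek letter π (chosen here for illustration) with the built-in len(), ord() and chr() functions and the encode() string method:

```python
symbol = "π"

# len() counts characters, not bytes
n_chars = len(symbol)
n_bytes = len(symbol.encode("utf-8"))  # this character needs two bytes in UTF-8
print(n_chars, n_bytes)

# ord() gives a character's Unicode code point, and chr() reverses it
code_point = ord(symbol)
round_trip = chr(code_point)
print(code_point, round_trip)
```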

Strings are defined using quotation marks or apostrophes, so what if you want to use a quotation mark or apostrophe within a string?

If you have a string that you define using quotation marks then you can use apostrophes within it, and vice versa:

a = "It was Matthew's birthday"
b = '"Be quiet!", said the lecturer.'

However, another way is to use the "escape character" \. For example using a \" in a string that is defined with quotation marks means that you want to display a quotation mark:

a = "\"Thank you\", said the lecturer."
print(a)
"Thank you", said the lecturer.

The escape character followed by certain letters can be used to add additional formatting within a string. You can add tabs or new lines within a string with the "\t" and "\n" escape sequences:

listings = "1.\tApples\n2.\tPears\n3.\tOranges\n"
print(listings)
1.  Apples
2.  Pears
3.  Oranges

If you want to show a \ itself within a string you need to use \\. This is particularly important for file paths within Windows, which use \ as the separator:

# rather than the following, where \t would be read as a tab
# (in Python 3 the \U also starts a Unicode escape, so this line is a syntax error)
# path = "C:\Users\mydirectory\textfile.txt"
# use
path = "C:\\Users\\mydirectory\\textfile.txt"
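It can help to remember that each escape sequence is stored as a single character, even though it is typed as two. A short sketch using len() and repr(), where repr() shows a string in its escaped form rather than its printed form:

```python
tab = "\t"
newline = "\n"
backslash = "\\"

# each escape sequence is one character long
lengths = [len(tab), len(newline), len(backslash)]
print(lengths)

# repr() shows the escaped form of a string
shown = repr("1.\tApples\n")
print(shown)
```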

Raw strings¶

If you want to ignore escaped characters you can make a string use "raw" text. To do this you add an r before the opening quotation mark/apostrophe. For example to actually include "\n" in a string, rather than have it interpreted as a new line, you would write:

rawstring = r"Ignore the \n escape characters"
print(rawstring)
Ignore the \n escape characters

The above Windows file path could instead be written as a raw string to avoid needing the double \'s:

path = r"C:\Users\mydirectory\textfile.txt"
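As a quick check, the raw string and the version with doubled backslashes produce exactly the same string. The sketch below compares them; the names raw, escaped and same_path are only for illustration.

```python
raw = r"C:\Users\mydirectory\textfile.txt"
escaped = "C:\\Users\\mydirectory\\textfile.txt"

# both spellings give the same string
same_path = (raw == escaped)
print(same_path)

# in a raw string \n stays as two characters: a backslash and an n
raw_length = len(r"\n")
normal_length = len("\n")
print(raw_length, normal_length)
```

One limitation worth knowing is that a raw string cannot end in a single backslash, so a path with a trailing separator still needs the doubled form.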

Booleans¶

A boolean value just represents True or False, so can only take two values:

x = True
y = False

True and False must have an uppercase first letter to be recognised by Python. Booleans are generally used for comparison in logical expressions, which are covered in another tutorial. For example, evaluating an equality will return a boolean value:

x = 2

testequality = (x == 2)  # == tests whether two values are equal

print(type(testequality))
<class 'bool'>
print(testequality)
True
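Other comparison operators return booleans in the same way, and the built-in bool() function converts other values to True or False. A brief sketch with illustrative variable names:

```python
x = 2

is_equal = (x == 2)
is_greater = (x > 5)
print(is_equal, is_greater)

# bool() converts other values: zero and empty strings count as False
converted = [bool(0), bool(3), bool(""), bool("Hello")]
print(converted)
```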