Functions

When we write code, we want it to be as concise as possible while retaining readability. Sometimes we find ourselves copying and pasting the same block of code over and over again. Instead of repeating this block of code, we can bundle it up into a function. We’ve encountered some built-in functions already:

  • print()

  • len()

  • type()

But we can also define our own functions.

Anatomy of a function

Examples can be very useful for communicating new concepts so let’s get straight to it and define our first function:

def fahr_to_celsius(temp_fahrenheit):
    
    temp_celsius = (temp_fahrenheit - 32) * (5/9)
    
    return temp_celsius

So what is going on here?

  • The function definition opens with the keyword def (short for define) followed by the name of the function, a list of parameters in parentheses (()) and a colon (:).

  • The body of the function — the code that is executed when it runs — is indented below the definition line.

  • At the end of the function, we use a return statement to define the value that should be output when the function is called.

Defining a function does nothing other than make it available for use in our notebooks. To use the function, we call it by name with the input value(s) inside parentheses. As you can probably tell, this function converts temperatures from Fahrenheit to Celsius.

fahr_to_celsius(90)
32.22222222222222
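As a small illustration of calling a function, the sketch below passes the argument first by position and then by the parameter's name. The variable names boiling_c and freezing_c are made up for this example; the function is the one defined above, repeated so the cell runs on its own.

```python
def fahr_to_celsius(temp_fahrenheit):
    # Same conversion as defined above
    temp_celsius = (temp_fahrenheit - 32) * (5/9)
    return temp_celsius

# Pass the argument by position
boiling_c = fahr_to_celsius(212)

# Pass the argument by the parameter's name (a keyword argument)
freezing_c = fahr_to_celsius(temp_fahrenheit=32)

print(boiling_c)
print(freezing_c)
```

Both calls run the same function body; naming the parameter simply makes the call easier to read.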

We can also write a function to convert from Celsius to Kelvin:

def celsius_to_kelvin(temp_celsius):
    return temp_celsius + 273.15

print("The freezing point of water is {} K".format(celsius_to_kelvin(0)))
The freezing point of water is 273.15 K

Functions within a function

Now say we wanted to convert from Fahrenheit to Kelvin. We could simply call the two functions we have already defined from within a new function.

def fahr_to_kelvin(temp_fahr):
    temp_celsius = fahr_to_celsius(temp_fahr)
    temp_kelvin = celsius_to_kelvin(temp_celsius)
    return temp_kelvin

print("The boiling point of water is {} K".format(fahr_to_kelvin(212)))
The boiling point of water is 373.15 K
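The intermediate variables are optional. As a sketch of an equivalent form, the hypothetical fahr_to_kelvin_nested below passes the result of one call directly into the next; the two helper functions are repeated so the cell runs on its own.

```python
def fahr_to_celsius(temp_fahrenheit):
    return (temp_fahrenheit - 32) * (5/9)

def celsius_to_kelvin(temp_celsius):
    return temp_celsius + 273.15

def fahr_to_kelvin_nested(temp_fahr):
    # Hypothetical variant: the inner call runs first and its
    # return value becomes the argument of the outer call
    return celsius_to_kelvin(fahr_to_celsius(temp_fahr))

print("The boiling point of water is {} K".format(fahr_to_kelvin_nested(212)))
```

Which form to prefer is a matter of readability: named intermediate values are easier to inspect while debugging, whereas the nested call is shorter.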

Variable scope

When we composed our temperature conversion functions, we created variables inside of those functions called temp_celsius and temp_kelvin. We refer to these variables as local variables because they no longer exist once the function has executed. If we try to access their values outside of the function, we will encounter an error:

temp_celsius
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [12], in <module>
----> 1 temp_celsius

NameError: name 'temp_celsius' is not defined

If we want to reuse the temperature in Kelvin after converting it with fahr_to_kelvin, we can store the result of the function in a variable.

kelvin = fahr_to_kelvin(60)
kelvin
288.7055555555555

Since the variable is defined outside the function, it is called a global variable. Understanding the difference between local and global variables is crucial because many bugs come from confusing the two. So to recap:

  • A global variable is visible everywhere in a notebook.

  • A local variable is visible only within a function.

Generally, we want to avoid using too many global variables when we are programming because they can make debugging more difficult.
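The recap above can be tried out in a single cell. In this minimal sketch, scale_factor and offset are hypothetical names chosen only to show one global and one local variable; the function is a variant of fahr_to_celsius written for this illustration.

```python
# A global variable: defined outside any function
scale_factor = 5/9

def fahr_to_celsius(temp_fahrenheit):
    # A local variable: exists only while the function runs
    offset = 32
    # The function can read the global scale_factor
    return (temp_fahrenheit - offset) * scale_factor

result = fahr_to_celsius(50)
print(result)

# Outside the function, the local name is not available
try:
    print(offset)
except NameError as error:
    print("NameError:", error)
```

Note that the function silently depends on scale_factor being defined elsewhere in the notebook, which is the kind of hidden dependency that makes heavy use of global variables harder to debug.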

Functions with multiple parameters

It is also possible to define a function with multiple parameters. Here we will define a simple temperature calculator function that accepts temperatures in Fahrenheit and returns the temperature in either Celsius or Kelvin. The function will have two input parameters:

  • temp_fahr = the temperature in Fahrenheit

  • convert_to = the output units in Celsius or Kelvin (using the string C or K accordingly)

def temp_calculator(temp_fahr, convert_to):
    
    if convert_to == "C":
        
        converted_temp = fahr_to_celsius(temp_fahr)
        
    elif convert_to == "K":
        
        converted_temp = fahr_to_kelvin(temp_fahr)
        
    return converted_temp

temp_calculator(75, 'C')
23.88888888888889

temp_calculator(75, 'K')
297.0388888888889
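One thing to watch for: if convert_to is neither "C" nor "K", temp_calculator never assigns converted_temp, so the return statement fails with an UnboundLocalError. The sketch below shows one possible way to handle that case, assuming we would rather stop with a clear message. The name temp_calculator_checked and the choice of ValueError are assumptions made for this example, and the helper functions are repeated in compact form so the cell runs on its own.

```python
def fahr_to_celsius(temp_fahrenheit):
    return (temp_fahrenheit - 32) * (5/9)

def fahr_to_kelvin(temp_fahr):
    return fahr_to_celsius(temp_fahr) + 273.15

def temp_calculator_checked(temp_fahr, convert_to):
    # Hypothetical variant of temp_calculator with input checking
    if convert_to == "C":
        return fahr_to_celsius(temp_fahr)
    elif convert_to == "K":
        return fahr_to_kelvin(temp_fahr)
    # Assumption: an unrecognised unit should raise an error
    raise ValueError("convert_to must be 'C' or 'K', got {!r}".format(convert_to))

print(temp_calculator_checked(75, "C"))

try:
    temp_calculator_checked(75, "F")
except ValueError as error:
    print("ValueError:", error)
```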

Documenting code

It is important to document code, either for our future selves or for collaborators who might want to use and adapt our code. We can do that using block comments, which start with a single hash sign (#) followed by a single space and the comment text. The Python interpreter ignores these lines, and most editors and notebooks display them in a different color.

def temp_calculator(temp_fahr, convert_to):
    
    # Check if user wants the temperature in Celsius
    if convert_to == "C":
        
        # Convert the value to Celsius
        converted_temp = fahr_to_celsius(temp_fahr)
    
    # Check if user wants the temperature in Kelvin
    elif convert_to == "K":
        
        # Convert the value to Kelvin
        converted_temp = fahr_to_kelvin(temp_fahr)
    
    # Return converted temperature
    return converted_temp

When writing functions, it is good practice to include even more documentation, such as the data type of the expected input/output and example usage. For example, it is not clear whether convert_to is expecting the letter C or the word Celsius. We can record this in a docstring, a string enclosed in triple quotes (""") and placed as the first statement in the function body.

def temp_calculator(temp_fahr, convert_to):
    
    """
    Function for converting temperature in Fahrenheit to Celsius or Kelvin.

    Parameters
    ----------
    temp_fahr: <numerical>
        Temperature value in Fahrenheit
    convert_to: <str>
        Temperature unit in either Celsius ('C') or Kelvin ('K').

    Returns
    -------
    converted_temp: <float>
        Converted temperature.
        
    Examples
    --------
    >>> temp_calculator(75, 'K')
    297.0388888888889    
    
    """
    # Check if user wants the temperature in Celsius
    if convert_to == "C":
        
        # Convert the value to Celsius
        converted_temp = fahr_to_celsius(temp_fahr)
    
    # Check if user wants the temperature in Kelvin
    elif convert_to == "K":
        
        # Convert the value to Kelvin
        converted_temp = fahr_to_kelvin(temp_fahr)
    
    # Return converted temperature
    return converted_temp

Now it would be much easier for another person to use our function. In fact, we can use the help() function in Python to find out how our function should be used.

help(temp_calculator)
Help on function temp_calculator in module __main__:

temp_calculator(temp_fahr, convert_to)
    Function for converting temperature in Fahrenheit to Celsius or Kelvin.
    
    Parameters
    ----------
    temp_fahr: <numerical>
        Temperature value in Fahrenheit
    convert_to: <str>
        Temperature unit in either Celsius ('C') or Kelvin ('K').
    
    Returns
    -------
    converted_temp: <float>
        Converted temperature.
        
    Examples
    --------
    >>> temp_calculator(75, 'K')
    297.0388888888889
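The text that help() displays comes from the function's docstring, which Python also stores on the function object as the __doc__ attribute. As a minimal sketch, here is celsius_to_kelvin with a one-line docstring added; the docstring wording is made up for this example.

```python
def celsius_to_kelvin(temp_celsius):
    """Convert a temperature from Celsius to Kelvin."""
    return temp_celsius + 273.15

# The docstring is stored on the function object itself
print(celsius_to_kelvin.__doc__)
```

A short one-line docstring like this is often enough for a small helper function; the longer layout with Parameters, Returns and Examples sections pays off for functions with several inputs.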

Producing our own functions with docstrings provides an insight into how larger software is developed in Python. We first define some basic operations, then we combine them into bigger chunks to compute what we want. In Week 2, we used several functions available in the NumPy library. If we call help on one of these functions, we will see that it has a docstring just like the one we just wrote.

import numpy as np
help(np.mean)
Help on function mean in module numpy:

mean(a, axis=None, dtype=None, out=None, keepdims=<no value>, *, where=<no value>)
    Compute the arithmetic mean along the specified axis.
    
    Returns the average of the array elements.  The average is taken over
    the flattened array by default, otherwise over the specified axis.
    `float64` intermediate and return values are used for integer inputs.
    
    Parameters
    ----------
    a : array_like
        Array containing numbers whose mean is desired. If `a` is not an
        array, a conversion is attempted.
    axis : None or int or tuple of ints, optional
        Axis or axes along which the means are computed. The default is to
        compute the mean of the flattened array.
    
        .. versionadded:: 1.7.0
    
        If this is a tuple of ints, a mean is performed over multiple axes,
        instead of a single axis or all the axes as before.
    dtype : data-type, optional
        Type to use in computing the mean.  For integer inputs, the default
        is `float64`; for floating point inputs, it is the same as the
        input dtype.
    out : ndarray, optional
        Alternate output array in which to place the result.  The default
        is ``None``; if provided, it must have the same shape as the
        expected output, but the type will be cast if necessary.
        See :ref:`ufuncs-output-type` for more details.
    
    keepdims : bool, optional
        If this is set to True, the axes which are reduced are left
        in the result as dimensions with size one. With this option,
        the result will broadcast correctly against the input array.
    
        If the default value is passed, then `keepdims` will not be
        passed through to the `mean` method of sub-classes of
        `ndarray`, however any non-default value will be.  If the
        sub-class' method does not implement `keepdims` any
        exceptions will be raised.
    
    where : array_like of bool, optional
        Elements to include in the mean. See `~numpy.ufunc.reduce` for details.
    
        .. versionadded:: 1.20.0
    
    Returns
    -------
    m : ndarray, see dtype parameter above
        If `out=None`, returns a new array containing the mean values,
        otherwise a reference to the output array is returned.
    
    See Also
    --------
    average : Weighted average
    std, var, nanmean, nanstd, nanvar
    
    Notes
    -----
    The arithmetic mean is the sum of the elements along the axis divided
    by the number of elements.
    
    Note that for floating-point input, the mean is computed using the
    same precision the input has.  Depending on the input data, this can
    cause the results to be inaccurate, especially for `float32` (see
    example below).  Specifying a higher-precision accumulator using the
    `dtype` keyword can alleviate this issue.
    
    By default, `float16` results are computed using `float32` intermediates
    for extra precision.
    
    Examples
    --------
    >>> a = np.array([[1, 2], [3, 4]])
    >>> np.mean(a)
    2.5
    >>> np.mean(a, axis=0)
    array([2., 3.])
    >>> np.mean(a, axis=1)
    array([1.5, 3.5])
    
    In single precision, `mean` can be inaccurate:
    
    >>> a = np.zeros((2, 512*512), dtype=np.float32)
    >>> a[0, :] = 1.0
    >>> a[1, :] = 0.1
    >>> np.mean(a)
    0.54999924
    
    Computing the mean in float64 is more accurate:
    
    >>> np.mean(a, dtype=np.float64)
    0.55000000074505806 # may vary
    
    Specifying a where argument:
    >>> a = np.array([[5, 9, 13], [14, 10, 12], [11, 15, 19]])
    >>> np.mean(a)
    12.0
    >>> np.mean(a, where=[[True], [False], [False]])
    9.0