Assigning a value to a tuple#

Once I turned on type checking for our code base, the first thing it found was a wide variety of errors where we were dealing with tuples. Most of these occurred in manual assignments. While some definitions seem straightforward, I found that many of them behave in ways that may be counter-intuitive. If you are able to know all of these without looking them up, you did better than I did! Because I don’t know all of these without thinking, I let my type checker help me out :-)

# Empty attempts
a = ()
b = tuple()
c = ,
d = (,)
e = tuple(,)
# Single item attempts
f = "sauron"
g = ("gollum")
h = tuple("frodo")
i = "legolas", 
j = ("gandalf",)
k = tuple("bilbo", )
# Two item attempts
l = "pippin", "arwen"
m = ("aragorn", "gimli")
n = tuple("samwise", "sarumon")
o = "merry", "boromir",
p = ("faramir", "treebeard",)
q = tuple("elrond", "galadriel",)

This code is in tuple_puzzle.py on GitHub, but I would encourage not using an IDE to open the file as it will start to give you hints as to what is wrong.

How are tuples typed?#

Before diving into each of the above examples, it is necessary to talk about how a tuple is typed in Python. Typing of tuples is complicated enough that it has its own section in the typing module documentation. While that has more details and examples, the most important thing is that tuples can be typed 3 ways:

  • Empty tuple
    • eg: tuple[()]
  • Tuple of specific length with possibly different data types
    • eg: tuple[str], tuple[str, int], or tuple[str, str]
    • An empty tuple is not valid for any of these since they define a specific length
  • Tuple of unknown length with only one data type
    • eg: tuple[int, ...] or tuple[str, ...]
    • An empty tuple is a valid value for any tuple typed this way

Knowing this will ensure that we have the appropriate nomenclature when discussing each of the above examples.
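As a quick illustration of the three forms, here is a minimal sketch. The variable names are made up for this example, and it assumes Python 3.9 or later, where the built-in tuple can be subscripted in annotations.

```python
# Hypothetical variable names, used only to illustrate the three forms.

# 1. Empty tuple: only () satisfies this annotation.
nobody: tuple[()] = ()

# 2. Fixed length, possibly mixed types: exactly one str followed by one int.
hobbit_and_age: tuple[str, int] = ("frodo", 50)

# 3. Unknown length, single element type: any number of ints, including zero.
ring_counts: tuple[int, ...] = (3, 7, 9, 1)
no_rings: tuple[int, ...] = ()
```

At runtime all four values are plain tuples; the difference only matters to the type checker.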

The solutions#

If you are interested in just diving in and looking at the solutions, feel free to head over to tuple_solution.py on GitHub. While I go into some detail of the cases there, I have more explanations below.

The string#

If we are assigning to just a string, it will clearly just be a string.

f: str = "sauron"

But as we will see later, how we add commas, parentheses, and tuple calls to this will drastically change its behavior. This is just setting the bar that this is a simple string.

Syntactically incorrect statements#

Several of the attempts to create empty tuples are not valid syntactically, specifically the following:

c = ,
d = (,)
e = tuple(,)

The comma requires there to be a valid value before it. Since none of these have a value before the comma, they do not parse, and cause SyntaxError.
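Because these lines never get past the parser, they cannot sit in a normal file next to working code. One way to check them safely is to hand each one to the built-in compile() as a string. This is only a sketch; the results dictionary and the "<puzzle>" filename are names I picked for the example.

```python
# Each snippet is compiled on its own so one failure does not hide the others.
snippets = ["c = ,", "d = (,)", "e = tuple(,)"]

results = {}
for source in snippets:
    try:
        compile(source, "<puzzle>", "exec")
        results[source] = "ok"
    except SyntaxError:
        results[source] = "SyntaxError"

for source, outcome in results.items():
    print(f"{source!r}: {outcome}")
```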

Making an empty tuple#

After the failure of the above attempts to create a tuple, that leaves us with the two examples which do create valid empty tuples:

a: tuple[()] = () # => ()
b: tuple[str, ...] = tuple() # => ()

As in the discussion above, how to type an empty tuple depends on how it is being used elsewhere, but the two above are valid.

Let’s add a value#

The next logical progression is to make a tuple with a single value within it. Let’s first take a look at a logical combination of an empty tuple definition of () and a string. In our above puzzle, we did this when g = ("gollum"). Many would believe that this creates a tuple, but rather it generates a string, thus making the following the proper annotation.

g: str = ("gollum")   # => "gollum"

The reason is that parentheses can be used to group a single value for clarity. In this case, that means the parentheses are effectively removed. The () is treated like a special case, and adding a single element doesn’t make it a tuple. Imagine if a list worked the same way, where [123] was not a list! This starts our edge cases when there is just a single input value.

As an aside, at the top I use the # fmt: off declaration. This is because the formatter I am using, black, will automatically reformat g: str = ("gollum") into g: str = "gollum", removing the parentheses and making it very clear that this is a string. While the formatter won’t warn us of errors like type checking will, it can still be a powerful tool in making code more readable.

To properly define a tuple in this format, we must add a comma after the value, so the following is valid syntax:

j: tuple[str] = ("gandalf",) # => ("gandalf",)

With a single value in a tuple, the , is not optional. Adding it here will generate a tuple as we expect. Also note that tuple[str] is a valid annotation here because the type checker can determine there is a single string value. We will see in some later examples where it cannot determine the length until runtime.
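A small runtime check makes the difference concrete. This is just a sketch; the grouped variable is an extra I added to show that the parentheses play the same grouping role they do in arithmetic.

```python
g = ("gollum")         # parentheses only group the value
j = ("gandalf",)       # the comma is what builds the tuple
grouped = (1 + 2) * 3  # same grouping role as in arithmetic

print(type(g).__name__)  # str
print(type(j).__name__)  # tuple
print(len(j))            # 1
```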

Interestingly, the parentheses are actually optional in this declaration! The following is also valid syntax to create a tuple with a single string.

i: tuple[str, ...] = "legolas", # => ("legolas",)

While a type of tuple[str] is valid, this also satisfies the type of tuple[str, ...].

The immediate question is why would this be a valid way to generate a tuple? Spoiling a later answer, let’s look at the following function.

def get_next_to_fibonacci(val_1: int, val_2: int) -> tuple[int, int]:
	val_3: int = val_1 + val_2
	val_4: int = val_2 + val_3
	return val_3, val_4

To make this easy and allow people to “forget” the parentheses, this is considered a valid way to return a tuple. Another valid return would be val_3, val_4, with a trailing comma. Python treats the single case of "legolas", as having a trailing comma, and thus it is a valid tuple.
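To see that the bare commas really do build the tuple, here is the same function called and unpacked. The call values 1 and 2 and the names result, third, and fourth are my own choices for the sketch.

```python
def get_next_to_fibonacci(val_1: int, val_2: int) -> tuple[int, int]:
    val_3: int = val_1 + val_2
    val_4: int = val_2 + val_3
    return val_3, val_4  # no parentheses: the commas build the tuple


result = get_next_to_fibonacci(1, 2)
third, fourth = get_next_to_fibonacci(1, 2)  # unpacking also needs no parentheses

i = "legolas",  # the single-value version of the same rule

print(result)  # (3, 5)
print(i)       # ('legolas',)
```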

Creating single item tuples with tuple()#

As we’ve seen, following logical conclusions can lead to confusing results, and this is very true now that we can make calls with tuple().

We have just stated that "legolas", is a valid tuple, so it would make sense that putting this into a tuple() call would also generate a valid tuple. It does, but just not how we expect.

k: tuple[str, ...] = tuple("bilbo",) # => ("b", "i", "l", "b", "o")

On its own, we may expect that the above would return ("bilbo",). Logically, it seems like the trailing , would make a tuple, and making a tuple of a tuple would be a no-op. Even our type checker is happy with the type of tuple[str, ...], although this is actually hiding a problem, since that annotation allows any number of strings in the tuple. The key difference is that the , in a simple definition, like in i = "legolas", is not parsed the same way as in a function call, like tuple("bilbo",). In the function call case, the , is treated as an optional trailing comma in the argument list and is ignored. This is therefore the same as calling it without a comma, so the following behaves in the same way:

h: tuple[str, ...] = tuple("frodo") # => ("f", "r", "o", "d", "o")

But why is it behaving this way? If we look at the help for tuple, the definition is tuple(iterable=(), /). While this will come up again in another post I have planned with more detail, the / just means that all the parameters before the / are positional-only and cannot be passed as keyword arguments. So tuple(iterable="frodo") is not valid.

In this case, "frodo" and "bilbo" (now stripped of the comma) are both iterables. They will generate single character strings for each value in the full string. Since Python doesn’t have the concept of a char or similar data type, this means it will result in a tuple of strings, thus passing the tuple[str, ...] check.
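Both points can be checked at runtime. The sketch below assumes a recent Python 3 (the document already relies on 3.9+ for tuple[...]); the keyword_rejected flag is a name I made up for the example.

```python
h = tuple("frodo")
k = tuple("bilbo",)  # the trailing comma in the call changes nothing

print(h)  # ('f', 'r', 'o', 'd', 'o')
print(k)  # ('b', 'i', 'l', 'b', 'o')

# Every element is itself a str of length 1, not a separate char type.
all_single_chars = all(isinstance(ch, str) and len(ch) == 1 for ch in h + k)

# The parameter is positional-only, so passing it by keyword fails.
try:
    tuple(iterable="frodo")
except TypeError:
    keyword_rejected = True
else:
    keyword_rejected = False
```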

This means that in order for this to behave as we would expect, we must pass an iterable whose only element is our single value. First, all of these would result in the single character outputs:

tuple("frodo")    # iterable is just "frodo"
tuple("frodo",)   # iterable is just "frodo" as comma is ignored
tuple(("frodo"))  # iterable is just "frodo" as parentheses are ignored
# All yield
("f", "r", "o", "d", "o")

Here are some valid inputs:

tuple(("frodo",))   # with the trailing comma, it creates a tuple which
                    # is iterable, and `tuple` recreates it
# Other types without trailing comma requirement can
# be used to pass in the iterable as we expect
tuple(["frodo"])
tuple({"frodo": 1})
# All of these generate the following with type `tuple[str, ...]`
("frodo", )

One thing to note is all of these examples have the type of tuple[str, ...]. This is because the iteration is not performed until runtime. Thus, the type checking cannot know the length ahead of time, and therefore must use the indeterminate length form of tuple[str, ...].
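If the fixed-length type matters, one option is to build the tuple from a literal instead of calling tuple(). The single() helper below is hypothetical and not part of the puzzle; it is only a sketch of that idea.

```python
def single(value: str) -> tuple[str]:
    # Hypothetical helper: the tuple literal has a length that is visible
    # in the source, so the fixed-length annotation tuple[str] fits.
    return (value,)


from_literal = single("frodo")

# The tuple() call is annotated with the unknown-length form, as discussed above.
from_call: tuple[str, ...] = tuple(["frodo"])

# At runtime the two values are identical; only the static types differ.
print(from_literal == from_call)  # True
```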

Creating two item tuples#

First, we can look at the two cases that result in runtime errors. Both of the following result in TypeError exceptions.

n = tuple("samwise", "sarumon")
q = tuple("elrond", "galadriel",)

These exceptions occur because tuple expects at most a single argument. Type checking will catch this error, so it can be easily avoided. To fix this, each would need to be surrounded in another set of parentheses, or simply remove the tuple call altogether, such as:

n: tuple[str, ...] = tuple(("samwise", "sarumon"))
q: tuple[str, str] = ("elrond", "galadriel",)

Note that for the call with tuple, we cannot determine the size until runtime when the iterator is evaluated, so it is necessary to use the general tuple[str, ...] for a tuple of unknown size. Thus, in general I prefer not to call tuple where possible since it reduces the amount of analysis that can be done with static type checkers.
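For completeness, here is a sketch that triggers both failures and then applies the two fixes. The errors list is a name I introduced for the example, and a type checker will flag the two calls inside the try blocks, which is the point.

```python
errors = []

try:
    tuple("samwise", "sarumon")  # two positional arguments
except TypeError as exc:
    errors.append(type(exc).__name__)

try:
    tuple("elrond", "galadriel",)  # the trailing comma does not help
except TypeError as exc:
    errors.append(type(exc).__name__)

# Fix 1: wrap the values in their own parentheses so tuple() gets one iterable.
n: tuple[str, ...] = tuple(("samwise", "sarumon"))

# Fix 2: drop the tuple() call and keep the fixed-length type.
q: tuple[str, str] = ("elrond", "galadriel",)

print(errors)  # ['TypeError', 'TypeError']
```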

Finally, let’s look at the cases that allow us to successfully generate a two item tuple. All of these are valid, and can use either the tuple[str, ...] or tuple[str, str] typing. I use a combination of both just for example.

l: tuple[str, ...] = "pippin", "arwen" # => ("pippin", "arwen")
m: tuple[str, str] = ("aragorn", "gimli") # => ("aragorn", "gimli")
o: tuple[str, ...] = "merry", "boromir", # => ("merry", "boromir")
p: tuple[str, str] = ("faramir","treebeard",) # => ("faramir","treebeard")

When there are at least two items, the trailing comma is optional. After all of the strange edge cases above, all of these feel rather pedestrian, which is how I feel it should be.
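A short comparison shows the contrast with the single-item case. The variable names here are mine and only serve the sketch.

```python
bare = "pippin", "arwen"
bare_trailing = "pippin", "arwen",
wrapped = ("pippin", "arwen")
wrapped_trailing = ("pippin", "arwen",)

# With two items, all four spellings produce the same tuple.
print(bare == bare_trailing == wrapped == wrapped_trailing)  # True

# With one item, the trailing comma changes the result entirely.
print(("merry",) == ("merry"))  # False
```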

Conclusions#

Having both formatting and type checking can help to eliminate several of the errors encountered when trying to make a tuple. But even these will not prevent all possible issues, since if you are trying to match a tuple[str, ...], tuple("frodo") would be valid but likely is not doing what you want. Personally, I do avoid the tuple() function if there are any pre-defined values, although it is necessary for generating tuples based on user or other dynamic inputs. In the end, tooling will not solve all of the issues and a robust code review process can be crucial to find some of the more obscure ways that generating a tuple can fail.