Slicing Sequences in Python

Efficiently slice sequences, including lists, tuples, and strings, into a sequence subset in Python.

Slicing Basics

Python includes both a built-in slice function and special syntax to efficiently split sequences into pieces. Python's built-in sequence types include lists, tuples, strings, bytearrays, bytes, and range objects. Slicing extracts a subset of a sequence, like cutting off the end of a carrot and keeping the edible part.

The slice function can be called as slice(stop) or slice(start, stop, step). The start and stop parameters specify the index positions to extract. Given the string 'ABC', a start value of 0 begins the extraction at 'A', the item at the first index. A stop value of 2 excludes the item at index 2 and everything after it; in other words, the slice keeps only the indices before 2.

sequence = 'ABC'               # A is at index 0, B is at 1, C is at 2
slice_func = slice(0, 2, 1)    # Slice from index 0 up to before index 2
subset = sequence[slice_func]  # Sliced 'AB' from the string sequence

The special syntax for slicing does the same job as the slice function but uses bracket notation: sequence[start:end:stride]. This kind of shorthand is known as syntactic sugar, since it makes the same result easier to write and read.
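As a quick check of that equivalence, the sketch below compares bracket notation with an explicit slice object on the same 'ABC' string. The variable names are illustrative only.

```python
sequence = 'ABC'

# Bracket notation and an explicit slice object select the same items
with_brackets = sequence[0:2:1]
with_slice_obj = sequence[slice(0, 2, 1)]

# slice(stop) is the one-argument form; start and step default to None
first_two = sequence[slice(2)]

print(with_brackets, with_slice_obj, first_two)  # AB AB AB
```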

For instance, phone numbers in the United States typically consist of 11 digits: a 1-digit country code, a 3-digit area code, and a 7-digit telephone number, e.g. 1-202-456-1111, a contact number for the White House. The phone number could be stored as the string value "1-202-456-1111".

Slicing works well when the layout of a sequence is known in advance, like a 14-character phone number. The most common form of the slicing syntax is sequence[start:end]. For instance, extracting the area code from the phone number string is a simple task, demonstrated below.

phone_number = '1-202-456-1111'
area_code = phone_number[2:5]    # Sliced "202"

Python strings can be indexed like a list, starting from 0, since lists and strings are both sequence types. The area code characters are at indexes 2, 3, and 4. The colon (:) in a slicing operation separates the start of the slice from the end. The end index, 5, is excluded, so the slice returns the area code value "202".
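To make the link between indexing and slicing concrete, the short sketch below reads the three area code characters one at a time and compares them with the slice. The helper variable names are illustrative.

```python
phone_number = '1-202-456-1111'

# Index each character of the area code individually
first, second, third = phone_number[2], phone_number[3], phone_number[4]

# The slice stops before index 5, so it covers indexes 2, 3, and 4
area_code = phone_number[2:5]

print(first + second + third == area_code)  # True
```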

Slicing other sequences of data, such as a list or tuple, is also possible.

numbers_list = [100, 92, 50, 33, 0]
numbers_tuple = (100, 92, 50, 33, 0)
sliced_list = numbers_list[1:4]       # Sliced [92, 50, 33]
sliced_tuple = numbers_tuple[1:4]     # Sliced (92, 50, 33)

Python offers many powerful slice features to generate a variety of subsets. Using the same phone number from the initial string example, there are a few other tricks we can use to extract various values.
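A minimal sketch of a few such slices, assuming the 14-character format shown earlier. The variable names and the particular pieces extracted are illustrative choices, not part of the original example.

```python
phone_number = '1-202-456-1111'

country_code = phone_number[:1]      # '1'
area_code = phone_number[2:5]        # '202'
local_number = phone_number[6:]      # '456-1111'
without_country = phone_number[2:]   # '202-456-1111'

# Negative indexes count back from the end of the sequence
last_four = phone_number[-4:]        # '1111'
```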

Note: When a slice starts at the first item of a sequence or ends at the last item, it is considered best practice to leave that index out of the slice syntax, since it is redundant, e.g. phone_number[0:5] can be written as phone_number[:5].
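The note can be checked with the list from the earlier example. This is only a sketch, and the variable names are illustrative.

```python
numbers_list = [100, 92, 50, 33, 0]

# An omitted start defaults to the beginning of the sequence
explicit_start = numbers_list[0:3]
omitted_start = numbers_list[:3]

# An omitted end defaults to the end of the sequence
explicit_end = numbers_list[2:len(numbers_list)]
omitted_end = numbers_list[2:]

print(explicit_start == omitted_start)  # True
print(explicit_end == omitted_end)      # True
```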

Slicing and Striding

Slicing accepts a third parameter called the stride, which specifies the cadence to move forward during a slice after the first element is extracted. Striding can be thought of as an index step or incrementer. Given a list of integers from 0 to 9, a stride value of 2 makes it easy to pull out only even integers by extracting every other digit. Alternatively, a stride value of 3 will extract every 3rd digit.

digits = list(range(0, 10))  # Generate a list of integers 0-9
even = digits[::2]           # Sliced [0, 2, 4, 6, 8]
third = digits[::3]          # Sliced [0, 3, 6, 9]
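The stride can also be combined with explicit start and end values. As a hedged illustration that goes slightly beyond the examples above, the sketch below pulls out the odd integers by starting at index 1, and takes a strided slice from the middle of the list.

```python
digits = list(range(0, 10))  # Generate a list of integers 0-9

# Start at index 1, then take every other item
odd = digits[1::2]           # Sliced [1, 3, 5, 7, 9]

# Start at index 2, stop before index 8, and step by 3
middle = digits[2:8:3]       # Sliced [2, 5]
```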

However, the syntax above is easily misunderstood: it may not be immediately clear what ::2, for instance, does to the sequence. A comment could be used to clarify the operation, such as "Slice the whole sequence with a stride of 2". An alternative solution is to use islice from the itertools standard library module.

itertools islice

According to the Python docs, itertools standardizes a core set of fast, memory-efficient tools.

islice generates an iterator that returns the selected elements from an iterable without copying. It's okay if you aren't familiar with the terms iterator or iterable. Following from the even digits example, a list of digits can be passed in as the iterable and the iterator can be converted into a list containing only even numbers.

from itertools import islice

# Syntax:
# islice(iterable, stop)
# islice(iterable, start, stop, step)  # step is optional

digits = list(range(0, 10))
even = list(islice(digits, None, None, 2))  # Sliced [0, 2, 4, 6, 8]
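Because islice works on any iterable, it can also take elements from sources that do not support bracket slicing at all. The sketch below assumes itertools.count as an example of such a source; it is not part of the original example.

```python
from itertools import count, islice

# count(0) yields 0, 1, 2, ... without end, so it cannot be sliced with brackets
naturals = count(0)

# Look at positions 0 through 9 only, keeping every other value
first_evens = list(islice(naturals, 0, 10, 2))

print(first_evens)  # [0, 2, 4, 6, 8]
```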

islice accepts any iterable, such as a list, followed by either a single stop argument or start and stop arguments with an optional step size. The examples from the docs are a good way to experiment and get a better feel for how islice works with different parameters.
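The calls below follow the islice examples given in the itertools documentation, wrapped in list() so the results can be inspected. The variable names are illustrative.

```python
from itertools import islice

letters = 'ABCDEFG'

stop_only = list(islice(letters, 2))            # ['A', 'B']
start_stop = list(islice(letters, 2, 4))        # ['C', 'D']
start_to_end = list(islice(letters, 2, None))   # ['C', 'D', 'E', 'F', 'G']
with_step = list(islice(letters, 0, None, 2))   # ['A', 'C', 'E', 'G']
```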

In summary, slicing in Python is achieved in an efficient and simple way through the slice function, bracket notation, or itertools islice for more advanced scenarios. Slicing extracts a subset of a sequence, keeping only the values that are relevant to an operation or dataset.
