Overview
We’re going to take a closer look at lists and strings, particularly the methods that each data type has. Methods are like functions, but they belong to the data object, and they have access to the object’s information. Python provides many useful and powerful methods for manipulating strings and lists.
Key idea for strings: Strings are immutable. You cannot modify an existing string, but you can construct a modified copy of it. Many string methods build new, modified versions of the original string.
Key idea for lists: Lists are mutable, so you can modify an existing list. They can hold any kind of data, including numbers, strings, other lists and tuples, and even functions. Many list methods actually modify the list they apply to.
This activity will first ask you to try out some of the operations described in the reading, and then it will ask you to write a series of functions that operate on strings and lists.
The GitHub repository contains a starter code file, activ8.py. Add all your work to this file, as directed below and by the TODO comments in the file.
Review of string and list operators
String review: operators, selection, slicing, functions, loops
Earlier in the term you learned a number of tools to manipulate strings: the + and * operators, the selection operator and slicing (square bracket operations), the len function, and the accumulator pattern for strings. Here you will practice writing functions that operate on strings, using the tools you already know.
String operators & functions
| Example | Meaning |
|---|---|
| len('foo') | Returns the number of characters in its argument |
| 'foo' + 'bar' | Concatenates the two strings together |
| 'foo' * 3 | Concatenates the string with itself the number of times given |
| 'mom' in s | Checks if first string occurs in second string |
| s[3] | Returns the character at the given position, zero-based |
| s[3:5] | Returns a substring starting at 3 and ending before 5 |
Loops with strings
for loop over characters
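As a refresher, here is a minimal sketch of looping directly over the characters of a string. The variable name word and its value are illustrative choices, not taken from the starter file.

```python
word = "frog"

# ch takes on each character of word in turn, left to right
for ch in word:
    print(ch)   # prints f, r, o, g on separate lines
```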
for loop over indices
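A loop over indices might look like the sketch below, which is useful when you need the position as well as the character. The variable name word is an illustrative choice.

```python
word = "frog"

# i runs over 0, 1, ..., len(word) - 1, so word[i] is always a valid selection
for i in range(len(word)):
    print(i, word[i])
```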
String accumulator pattern
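The accumulator pattern starts with an empty string and builds up a result one piece at a time. The sketch below uses it to reverse a string; the names word and reversedWord are illustrative assumptions.

```python
word = "frog"

reversedWord = ""                     # the accumulator starts out empty
for ch in word:
    reversedWord = ch + reversedWord  # put each new character at the front
print(reversedWord)                   # prints: gorf
```

Because strings are immutable, each pass through the loop builds a brand new string and reassigns the accumulator variable to it.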
Practice
Choose one of these functions to complete for now. Once you have worked through the whole activity, return and use the remaining functions for extra practice.
Try this to hand in: Practice using string operators, functions, and loops by completing one of the following two exercises. Do not use string methods here.
- Create a Python function symbolPattern that takes 2 input parameters. The first, str, should be a string; the second, num, should be an integer. The function should build a new string where each character in the input string, str, is repeated num times. The new string should be returned. Use the accumulator pattern for strings as the base for this function. You will also need the string operators * and + to accomplish this task.
*****!!!!!*****!!!!!
@@@$$$&&&
((((****))))
- Create a Python function firstVowel that takes a string as its input. It should use a loop to iterate over the string looking for vowels ('aeiou'). It should return the index of the first vowel in the string. The in operator is helpful here.
Review of basic list operations
Earlier in the term you learned a number of tools to manipulate lists: the + and * operators, the selection operator (square brackets, used to select a value from the list) and list slicing, and the accumulator pattern for lists. Here you will practice writing functions that operate on lists, using the tools you already know.
Remember that strings and lists may be indexed from right to left with negative integers: -1 is the index of the rightmost element in a string or list.
| Example | Meaning |
|---|---|
| len([25, 12, 19]) | Returns the number of elements in its argument |
| ['a', 'b'] + ['c', 'd', 'e'] | Concatenates the two lists together |
| [5, 3] * 3 | Concatenates the list with itself the number of times given |
| 'mom' in lst | Checks if the value is an element of the list |
| lst[3] | Returns the element at the given position, zero-based |
| lst[3:5] | Returns a sublist starting at 3 and ending before 5 |
Loops with lists
for loop over list elements
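Looping over the elements of a list works just like looping over the characters of a string. This is a minimal sketch; the list name values is an illustrative choice.

```python
values = [4, 7, 10]

# v takes on each element of the list in turn
for v in values:
    print(v)
```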
for loop over indices
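When the position matters, a loop over indices can be sketched like this (values is again an illustrative name):

```python
values = [4, 7, 10]

# i runs over 0, 1, ..., len(values) - 1
for i in range(len(values)):
    print(i, values[i])

print(values[-1])   # negative indices count from the right; prints: 10
```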
List accumulator pattern
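The list accumulator pattern starts with an empty list and adds one element per loop pass. The sketch below uses only the + operator, since list methods come later in this activity; the names values and doubled are illustrative assumptions.

```python
values = [4, 7, 10]

doubled = []                      # the accumulator starts out empty
for v in values:
    doubled = doubled + [v * 2]   # concatenate a one-element list
print(doubled)                    # prints: [8, 14, 20]
print(values)                     # the input is unchanged: [4, 7, 10]
```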
Practice
Try this to hand in: For these problems use only list functions (len), operators (+, *, [ ], in). Do not use list methods yet.
Choose one of these functions to complete for now. Once you have worked through the whole activity, return and use the remaining functions for extra practice.
- Create a Python function everyOther that takes a list as its input. It should build a new list that contains every other value from the original input list. The easiest solution is to use list slicing (look up list slicing to recall how it works), but you could also loop over the list or its indices and use an accumulator variable.
[1, 3, 5]
['d', 'f']
[]
- Create a Python function sumPositive that takes a list of numbers as its input. Your function should use a for loop to iterate over the values in the list. If the number is positive then it should be added to an accumulator variable that holds the number that is the sum.
Modifying lists
Unlike strings, lists can be modified. This is powerful, but can also be dangerous. It means you have to be more careful with lists. If you pass a list to a function, and that function changes the contents of the list, that change is permanent and visible everywhere the list is visible.
NOTE: using a for loop to loop over a changing list is a bad idea!
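To make both warnings concrete, here is a small sketch. The function zeroOut and all variable names are hypothetical, made up for this illustration. The second part uses the remove list method, which is covered later in this activity.

```python
def zeroOut(lst):
    """Hypothetical helper: set every element of lst to 0, in place."""
    for i in range(len(lst)):
        lst[i] = 0

scores = [90, 85, 70]
alias = scores        # alias refers to the same list object, not a copy
zeroOut(scores)
print(scores)         # [0, 0, 0]
print(alias)          # [0, 0, 0], the change is visible here too

# Looping over a list while changing its length skips elements
nums = [1, 2, 2, 3]
for n in nums:
    if n == 2:
        nums.remove(n)
print(nums)           # [1, 2, 3], one of the 2s was never examined
```

After the first 2 is removed, the remaining values shift left, so the loop's internal position moves past the second 2 without testing it.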
The basic method for changing lists uses the selection and slicing operators. Try each of the following, and experiment with your own examples to change testList:
testList = [1, 2, 3, 4, 5, 6]
print("line 2:", testList)
testList[3] = 105
print("line 4:", testList)
testList[0] = 99
print("line 6:", testList)
testList[4:6] = [25, 26]
print("line 8:", testList)
testList[1:3] = [-5]
print("line 10:", testList)

line 2: [1, 2, 3, 4, 5, 6]
line 4: [1, 2, 3, 105, 5, 6]
line 6: [99, 2, 3, 105, 5, 6]
line 8: [99, 2, 3, 105, 25, 26]
line 10: [99, -5, 105, 25, 26]
Note that the last modification actually changes the length of the list. You can also delete elements from a list using the del statement:
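A minimal sketch of del, starting from the final value of testList in the example above (the specific indices deleted are illustrative choices):

```python
testList = [99, -5, 105, 25, 26]

del testList[0]       # delete a single element by index
print(testList)       # [-5, 105, 25, 26]

del testList[1:3]     # delete a slice: the elements at indices 1 and 2
print(testList)       # [-5, 26]
```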
Practice
Try this to hand in: Create a function changeStart that takes a value and a list as inputs. It should not build a new list. It should modify the input list by changing the zeroth value to be the input value. Below is an example of what should happen:
String and list methods
String methods
Your reading had a partial list of string methods. I recommend looking at the Python Documentation for String Methods for the complete list. I’ve broken down the list of (most) string methods by category below. Define the example strings below, and then try each of the string methods in the examples to see what it does.
Adjusting string spacing
These methods are used to add or remove spaces (or other characters) from the front or end of the string to change the spacing around the main contents of the string.
| Method | Description |
|---|---|
| center | Takes in a new width and an optional fill character, and it returns a new string where the original string is centered in the new one. By default extra characters are filled with spaces, or with the fill character if given. If the width is not larger than the length of the string, then the original string is returned. |
| ljust | Similar to center, but it adds spaces/fill chars to the right end |
| rjust | Similar to center, but it adds spaces/fill chars to the left end |
| strip | Takes in an optional string. If given no input, it removes whitespace from front and end of string. If input string is provided, then it removes those characters, if they occur, from front and end of string. |
| lstrip | Similar to strip, but only removes from front of string |
| rstrip | Similar to strip, but only removes from end of string |
Here are some examples that show how to call these methods, assuming the four strings defined above.
print("*" + s1.center(20) + "*")
print(s3.center(5, "X"))
print(s2.ljust(10, '-'))
print(s1.rjust(30))
s1Stripped = s1.strip("ab")
print(s1Stripped)
s5=" foobar "
s5S = s5.strip()
s5LS = s5.lstrip()
s5RS = s5.rstrip()
print("*" + s5S + "*" + s5LS + "*" + s5RS + "*")

*       banana       *
Glimmer
FROG------
                        banana
nan
*foobar*foobar * foobar*
Check string contents
The following set of methods ask about the contents of the string, and are useful for determining whether a string is in a useful format or not. None of these methods take in any inputs.
| Method | Description |
|---|---|
| isalpha | Returns True if all the characters in its string are alphabetic |
| isalnum | Returns True if all the characters in its string are either alphabetic or numerical |
| isdigit | Returns True if all the characters in its string are digits |
| isspace | Returns True if all the characters in its string are “whitespace”: space, tab, or newline |
Here are some examples that show how to call these methods, assuming the four strings defined above.
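The following sketch shows these methods in action. It defines its own sample strings (word, code, digits, blank), which are assumptions for illustration rather than the activity's original example strings.

```python
word = "Glimmer"
code = "abc123"
digits = "2024"
blank = " \t\n"

print(word.isalpha())     # True: every character is a letter
print(code.isalpha())     # False: contains digits
print(code.isalnum())     # True: only letters and digits
print(digits.isdigit())   # True
print(blank.isspace())    # True: space, tab, and newline are all whitespace
print("".isalpha())       # False: these methods return False for an empty string
```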
Searching and replacing in strings
These methods look for certain substrings or characters in a string, and sometimes replace them with new strings. Remember that they always generate a new string to do a replacement. They never change the input string, because strings are immutable.
| Method | Description |
|---|---|
| find | Takes in a string, and returns the index of the leftmost occurrence of that string inside its string or -1 if it isn’t there |
| rfind | Similar to find, but returns the index of the rightmost occurrence |
| endswith | Basic version takes in a string and returns True if its string ends with the input string. For options, see documentation. |
| startswith | Basic version takes in a string and returns True if its string starts with the input string. For options, see documentation. |
| index | Similar to find, but raises a ValueError instead of returning -1 if the input string isn’t there (this matches the behavior of the list method index) |
| rindex | Similar to rfind, but raises a ValueError instead of returning -1 if the input string isn’t there |
| count | Takes in a string, and returns the number of occurrences of that string in its string (no overlaps) |
| replace | Takes in two strings, and it builds a new string where every occurrence in its string of the first input has been replaced by the second input |
Here are some examples that show how to call these methods, assuming the four strings defined above.
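The sketch below exercises the searching and replacing methods on a sample string of its own. The variable text and its value are assumptions for illustration, not the activity's original example strings.

```python
text = "banana bandana"

print(text.find("an"))         # 1: leftmost occurrence
print(text.rfind("an"))        # 11: rightmost occurrence
print(text.find("xyz"))        # -1: not found
print(text.startswith("ban"))  # True
print(text.endswith("ana"))    # True
print(text.count("ana"))       # 2: overlapping matches are not counted

# index behaves like find, except that a missing substring raises an error
try:
    text.index("xyz")
except ValueError:
    print("index raised ValueError")

# replace builds a new string; text itself is unchanged
shouted = text.replace("an", "AN")
print(shouted)                 # bANANa bANdANa
print(text)                    # banana bandana
```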
Breaking strings into pieces
Often when we have a piece of text, we want to break it up into words or lines. Or if we have other structured data like CSV files, we might want to split it by commas or tabs. These methods let you separate a string into parts.
| Method | Description |
|---|---|
| split | Takes an optional input, a separator string (often a single character). Returns a list of strings, created by splitting up its string. By default it splits on whitespace, but with an input it splits on that separator. |
| splitlines | Similar to split but it only splits at line breaks such as newline, returning a list of strings, one for each line |
Here are some examples that show how to call these methods, assuming the four strings defined above.
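Here is a short sketch of both methods. The sample strings (sentence, row, poem) are made up for this illustration and are not the activity's original example strings.

```python
sentence = "the quick  brown fox"
row = "Sallie,42,award"
poem = "line one\nline two\nline three"

# With no input, split treats any run of whitespace as one separator
print(sentence.split())    # ['the', 'quick', 'brown', 'fox']

# With an input, split breaks the string at each occurrence of it
print(row.split(","))      # ['Sallie', '42', 'award']

# splitlines returns one string per line, without the newline characters
print(poem.splitlines())   # ['line one', 'line two', 'line three']
```

Note that every piece is a string, so the 42 above would need int() before being used as a number.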
Practice
Try this to hand in:
Choose one of these functions to complete for now. Once you have worked through the whole activity, return and use the remaining functions for extra practice.
The first two functions here are very simple and can be done with just one or two applications of string methods. The third example is a bit more complex.
- Define a function shout that takes one input, a string. The function should return a new string that is the same as the input string, but with all letters in uppercase. See examples below.
WHAT ARE YOU DOING?
I SAW A FROG IN MY BATHTUB
- Define a function nameSubst that takes two inputs. The first input is a string representing a name, and the second input is a piece of text (also a string). The function should look for an occurrence of the special string "ZZZ" in the text, and it should build a new string that has substituted the input name for "ZZZ" in the text. It should return this new string.
sallie = nameSubst("Sallie", "My friend, ZZZ, won an award.")
print(sallie)
print(nameSubst("Fred", "Jamie and ZZZ flew over the trees."))

My friend, Sallie, won an award.
Jamie and Fred flew over the trees.
- (Optional challenge question) Create a function countWords that takes two strings as input. One string is a word, the other is a longer piece of text. It should break up the text into words (using the split method), and then loop over the list of words. Use an accumulator variable to count how many times the first input string is a whole word in the text (not a part of a larger word; for example, if you are counting the occurrences of "ban" then don’t count it when it is in "banana"). Hint: don’t use the in operator to look for the word; use == instead.
  - For an extra challenge, make the function ignore capitalization.
  - For even more of a challenge, remove punctuation from the start and end of each word before comparing it to the input.
List Methods
Lists, like strings, have useful methods for manipulating them. The Python documentation for lists has a nice short summary of list operations.
Note: Because lists are mutable, changeable, most list methods, unlike string methods, modify the list, rather than returning a new list.
| Method | Description |
|---|---|
| append | Takes an item as input, adds it to the end of the list (modifying the list!) |
| extend | Takes a list (or similar) as input, and adds the list contents to the end of the original list |
| insert | Takes an index and a value, and inserts the value into the list at the given index, shifting the old values from that index through the end of the list one position to the right. |
| remove | Takes a value and removes the first occurrence of that value from the list, raising an error if no such value exists in the list. |
| pop | Takes in an optional position, removes the value at that position from the list, and returns it. If no position is specified, it removes and returns the last value in the list. |
| clear | Removes all items from the list. |
| index | Takes in a value, and optionally a starting position and an ending position. The starting and ending positions define the range of the list to be searched: if not specified, then the whole list is searched. It returns the index of the leftmost occurrence of the value in the search range. If no occurrence is found, then an error is raised. |
| count | Takes in a value, and returns the number of times that value appears in the list (as a full element, not inside another list element). |
| sort | Sorts the data in the list, rearranging the list elements. By default, data is sorted in increasing order, but optional arguments allow us to change the sort criteria and ordering. |
| reverse | Reverses the elements in the list, modifying the list. |
| copy | Returns a new list that contains the same data items as the original. |
Try out the examples below in the Python console, to see how these methods work. Look at the list whose method has been called after each call to see how it has changed.
list1 = [5, 6, 7]
list2 = [4, 3, 2, 1]
list1.append(8)
print("line 4:", list1, list2)
list2.extend([0, -1])
print("line 6:", list1, list2)
# list1.extend(5) # this should generate an error, why?
list1.insert(0, 4.5)
print("line 9:", list1, list2)
list2.insert(2, 3.5)
print("line 11:", list1, list2)
list1.remove(7)
print("line 13:", list1, list2)
list2.pop(1)
print("line 15:", list1, list2)
print("line 16:", list1.index(8), list2.index(2))
print("line 17:", list1.count(5), list2.count(0))
list1.reverse()
print("line 19:", list1, list2)
list2.sort()
print("line 21:", list1, list2)
list3 = list2.copy()
print("line 23:", list2, list3, list2 == list3, list2 is list3)

line 4: [5, 6, 7, 8] [4, 3, 2, 1]
line 6: [5, 6, 7, 8] [4, 3, 2, 1, 0, -1]
line 9: [4.5, 5, 6, 7, 8] [4, 3, 2, 1, 0, -1]
line 11: [4.5, 5, 6, 7, 8] [4, 3, 3.5, 2, 1, 0, -1]
line 13: [4.5, 5, 6, 8] [4, 3, 3.5, 2, 1, 0, -1]
line 15: [4.5, 5, 6, 8] [4, 3.5, 2, 1, 0, -1]
line 16: 3 2
line 17: 1 1
line 19: [8, 6, 5, 4.5] [4, 3.5, 2, 1, 0, -1]
line 21: [8, 6, 5, 4.5] [-1, 0, 1, 2, 3.5, 4]
line 23: [-1, 0, 1, 2, 3.5, 4] [-1, 0, 1, 2, 3.5, 4] True False
Can you answer these questions:
- What is the difference between append and extend?
- What is the difference between remove and pop?
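One consequence of list methods modifying the list in place, as noted at the start of this section, is a common pitfall: methods such as sort return None rather than a new list. The sketch below shows the pitfall and one alternative, the built-in sorted function, which does build a new list. All variable names are illustrative.

```python
nums = [3, 1, 2]
result = nums.sort()     # sort rearranges nums itself and returns None
print(result)            # None
print(nums)              # [1, 2, 3]

original = [3, 1, 2]
ordered = sorted(original)   # sorted builds and returns a new list
print(ordered)               # [1, 2, 3]
print(original)              # [3, 1, 2], unchanged
```

Writing nums = nums.sort() is therefore a bug: it replaces the list with None.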
Practice
Try this to hand in: Create a function onBeyond that takes in a number, and a list of numbers in sorted order. The input number must be larger than the last number in the list. The function should modify the list to add to the end of the list all the integers between the last number in the list and the input number. See the examples below for help:
Hint: The last number in the list might not be an integer. The first number to be added to the list, then, should be calculated using this more general form, where \(x\) is the last number in the list: \(\lfloor x + 1 \rfloor\).
Also, if you are clever, the function can be defined with 2-4 lines of code in total.
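To check the formula in the hint, the sketch below evaluates it for the last number of each example list used in the test calls for this exercise. It uses math.floor from the standard library; the variable names are illustrative, and this is only the arithmetic step, not a solution.

```python
import math

lastValues = [9, 4.7, 15.2]   # last numbers of the three example lists
firstToAdd = []
for x in lastValues:
    firstToAdd = firstToAdd + [math.floor(x + 1)]
    print(x, "->", math.floor(x + 1))

# prints:
# 9 -> 10
# 4.7 -> 5
# 15.2 -> 16
```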
nl1 = [1, 3, 5, 6, 9]
nl2 = [-2.5, 0, 1.372, 4.7]
nl3 = [15.2]
print("Before:")
print(nl1)
print(nl2)
print(nl3)
onBeyond(15, nl1)
onBeyond(7, nl2)
onBeyond(20, nl3)
print("After:")
print(nl1)
print(nl2)
print(nl3)

Before:
[1, 3, 5, 6, 9]
[-2.5, 0, 1.372, 4.7]
[15.2]
After:
[1, 3, 5, 6, 9, 10, 11, 12, 13, 14, 15]
[-2.5, 0, 1.372, 4.7, 5, 6, 7]
[15.2, 16, 17, 18, 19, 20]
Tuples
A tuple is very similar to a list, superficially. It is a linear collection of data, and the data may be of any type. Unlike a list, however, a tuple is immutable: once created, it cannot be changed in any way. We write a tuple as a series of values separated by commas. For convenience, we usually surround the values with parentheses.
Operators like + and * work on tuples just as on lists, and values may be accessed and sliced just as in a list. Functions that operate on lists, such as sum, max, min, and len, also work on tuples. Tuples have only two methods, count and index, neither of which modifies the tuple.
Tuples are often used to temporarily collect together data values. For example, when returning multiple values from a function they are collected into a tuple. You can at any point collect data into a tuple and assign a variable to hold it. There is also a special “unpacking” assignment format that lets you unpack a tuple and assign each value to a separate variable. See examples below.
mainPt = (25.2, 99.1)
print(mainPt, mainPt[0], mainPt[1])
longTup = ('bee', 'fly', 'beetle', 'wasp', 'moth')
print(longTup[3:5])
(x, y) = mainPt
print(x, "and", y)
# Uncomment the following line to see what happens
# longTup[0] = 'butterfly'

(25.2, 99.1) 25.2 99.1
('wasp', 'moth')
25.2 and 99.1
Try this to hand in: Create a function endPoints that takes a list of numbers as its input. It should find the minimum and maximum values from the list, and return both of them as a tuple. Read the test calls below carefully to see the different ways we can handle a tuple returned from a function call.
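As a sketch of the different ways to handle a tuple returned from a function, the example below uses a hypothetical helper, firstAndLast, rather than endPoints itself. The function and variable names are made up for this illustration.

```python
def firstAndLast(seq):
    """Hypothetical helper: return the first and last items as a tuple."""
    return (seq[0], seq[-1])

data = [12, 7, 30, 4]

# Option 1: keep the whole tuple in one variable and select from it
ends = firstAndLast(data)
print(ends)               # (12, 4)
print(ends[0], ends[1])   # 12 4

# Option 2: unpack the tuple into separate variables
(first, last) = firstAndLast(data)
print(first, "and", last)   # 12 and 4
```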
List Comprehensions (Recommended, but optional)
Because we so often need to do something to every element of a list, or to combine together every pair of elements from two lists, Python provides some shorthand notation called list comprehensions. List comprehensions always build new lists; they are “pure” and don’t modify the list they are given.
A list comprehension has square brackets to indicate that we are building a list. At the end the list comprehension has one or more for-like forms, and possibly an if as well. At the start of the comprehension there is an expression that (usually) refers to the loop variable from the for-like form.
The basic forms of a list comprehension are shown below. Note that the expr forms usually involve the loop variable val in some way. Also note that you can use any variable name in the for part of a comprehension; I just chose to use val for simplicity.
The effect of a list comprehension is to build a new list. The values in the new list are the result of computing the expression at the front for each value in the list in the for loop part. If there is an if-like part, then only those values that pass the test in the if part are included in the answer.
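In outline, the two shapes described here are [expr for val in lst] and [expr for val in lst if test]. The sketch below, with illustrative names and values, shows each shape and the accumulator loop that the second one is equivalent to.

```python
vals = [8, 5, 1, 2]

# [expr for val in lst]: one result per value
squares = [val * val for val in vals]

# [expr for val in lst if test]: only values that pass the test
bigSquares = [val * val for val in vals if val > 3]

# The same result as bigSquares, written as an accumulator loop
loopResult = []
for val in vals:
    if val > 3:
        loopResult.append(val * val)

print(squares)                    # [64, 25, 1, 4]
print(bigSquares)                 # [64, 25]
print(loopResult == bigSquares)   # True
print(vals)                       # [8, 5, 1, 2], the input list is unchanged
```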
Here are some examples of list comprehensions and what they produce:
lst1 = [8, 5, 1, 2]
lst2 = [x + 3 for x in lst1]
lst3 = [[x, -x] for x in lst1 if x > 3]
print(lst1)
print(lst2)
print(lst3)
lst4 = [a + b for a in lst1 for b in [10, 30, 50, 20]]
print(lst4)

[8, 5, 1, 2]
[11, 8, 4, 5]
[[8, -8], [5, -5]]
[18, 38, 58, 28, 15, 35, 55, 25, 11, 31, 51, 21, 12, 32, 52, 22]
Try to predict what each comprehension below will produce, then try them to see what they do.
lst1 = [5, 20, 13, 19]
lst2 = [ 3 * x for x in lst1 ]
lst3 = [ [x, x] for x in lst1 ]
lst4 = ['haha' for x in lst1 ]
lst5 = [ x for x in lst1 if (x % 5) != 0 ]
lst6 = ['amy', 'rachel', 'barney', 'filip']
lst7 = [ (x, y) for x in lst1 for y in lst6 ]
lst8 = [ (x, y) for x in lst1 for y in lst1 if x < y ]
Practice
Try this to hand in:
Choose one of these to complete for now. Once you have worked through the whole activity, return and use the remaining ones for extra practice.
- Create a function timesN that takes a number and a list of numbers. Use a list comprehension to build a new list that has multiplied each of the original numbers in the list by the input number. Return the new list.
- Create a function removeAll that takes in an item and a list, and it builds a new list with all occurrences of the item removed from the list. Use a list comprehension, and use the if extension to keep only those values not equal to the item.
Optional Challenge Activities
- Create a Python function randomAverage that takes one input (size), the size of the list. Your function should use a for loop to generate a list of size random integers between 0 and 100. Use a second for loop to calculate the average of the list of numbers. Return a tuple containing the list and the average. Below is an example function call.
[76, 64, 24, 0, 87, 55, 87, 95, 73, 98]
65.9
- Create a function parentheses that takes in a string of parentheses, curly braces, and brackets. It should then determine whether the line is well-formed (every open bracket, brace, or paren has a matching close bracket, brace, or paren of the correct type). I recommend storing each open parenthesis/brace/bracket in a list and then removing it when the corresponding close parenthesis/brace/bracket is read.
print(parentheses('{()[]}'))
print(parentheses('{(())'))
print(parentheses('{()()({(())}{}())}'))
print(parentheses('{()()({(())}{}()})'))

True
False
True
False
- Create a function centeredAverage that returns the “centered” average of an array of ints. The centered average is the mean average of the values, ignoring the largest and smallest values in the array. If there are multiple copies of the smallest value, ignore just one copy, and likewise for the largest value. Use integer division to produce the final average. You may assume that the array is length 3 or more.
What to hand in
Submit all your work through the activ8.py file.
Use commit and push to copy your code to GitHub to submit this work.