Strings in Python

1. Python strings are immutable sequences and can be indexed, sliced, and iterated like any other sequence, as well as being subject to the in and not in operators. There are two kinds of strings in Python:

  • one-line strings, which cannot cross line boundaries – we denote them using either apostrophes ('string') or quotes ("string")
  • multi-line strings, which occupy more than one line of source code, delimited by triple apostrophes ('''string''') or triple quotes ("""string""").
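A minimal sketch of both literal forms and of the sequence operations mentioned above. The variable names and sample values are illustrative assumptions, not taken from the original.

```python
# One-line strings: apostrophes and quotes are interchangeable.
single = 'string'
double = "string"

# A multi-line string delimited by triple quotes; the line break
# inside the literal becomes a newline character (\n).
multi = """first line
second line"""

# Strings behave like other sequences: indexing, slicing, membership.
first_char = single[0]        # 's'
middle = single[1:4]          # 'tri'
has_ring = 'ring' in single   # True

# Strings are immutable: assigning to an index raises TypeError.
try:
    single[0] = 'S'
    mutated = True
except TypeError:
    mutated = False
```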

 

2. The length of a string is determined by the len() function. The backslash that introduces an escape sequence is not counted separately: a sequence such as \n or \\ counts as one character. For example:

2

2

 x="\\\"
      ^
SyntaxError: unterminated string literal (detected at line 1)

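A backslash escapes only one character: in "\\\" the first two backslashes form one escaped backslash and the third escapes the closing quote, so the string literal is never terminated.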
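The two outputs of 2 shown above can be reproduced with a sketch like the following. The two literals are assumptions chosen to match those outputs; the original snippet is not available.

```python
# Each escape sequence counts as one character.
newlines = "\n\n"      # two newline characters
backslashes = "\\\\"   # two backslash characters, each written as \\

print(len(newlines))     # 2
print(len(backslashes))  # 2

# A literal with three backslashes before the closing quote would not
# compile: the first two form one backslash, the third escapes the quote.
```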

 

3. Strings can be concatenated using the + operator, and replicated using the * operator. For example:

outputs *+*+*+*+*.
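One snippet that produces this output is sketched below; the variable names are hypothetical.

```python
asterisk = '*'
plus = '+'

# (asterisk + plus) is '*+'; repeating it 4 times and appending
# one more asterisk gives '*+*+*+*+*'.
decoration = (asterisk + plus) * 4 + asterisk
print(decoration)
```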

 

4. The pair of functions chr() and ord() can be used to create a character using its codepoint, and to determine a codepoint corresponding to a character. Both of the following expressions are always true:
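The two identities can be written as follows. The sample character and code point are arbitrary choices for illustration.

```python
character = 'a'
codepoint = 97

# chr() and ord() are inverses of each other.
round_trip_char = chr(ord(character)) == character
round_trip_code = ord(chr(codepoint)) == codepoint
print(round_trip_char, round_trip_code)  # True True
```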

 

5. Some other functions that can be applied to strings are:

list() – creates a list consisting of all the string’s characters;
max() – finds the character with the maximal codepoint;
min() – finds the character with the minimal codepoint.
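A short sketch of the three functions applied to a hypothetical sample string:

```python
word = 'hello'         # sample value, chosen for illustration

letters = list(word)   # ['h', 'e', 'l', 'l', 'o']
highest = max(word)    # 'o' (code point 111)
lowest = min(word)     # 'e' (code point 101)
print(letters, highest, lowest)
```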

 

6. The method named index() finds the index of a given substring inside the string and raises an exception if the value is not found.

The syntax is as follows :
string.index(value, start, end)  where:
– value [required] : the string/character to search for
– start [optional] : the index to start the search (default is 0)
– end [optional] : the index to end the search (default is the end of the string) – note that this index is excluded from the search.

For example, given my_str = 'Luke !!\n', the call my_str.index('n') looks for the first occurrence of the letter 'n' between position 0 and the end of the string (i.e. the whole string). In 'Luke !!\n', the sequence '\n' is the escape code for a newline (note the escape character '\'); it does not represent the letter 'n'. So the character 'n' is not in the string, and my_str.index('n') raises a ValueError exception.
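The behaviour described here can be checked with a small sketch. The use of try/except and the extra calls with start and end are additions for illustration.

```python
my_str = 'Luke !!\n'

# The newline is a single character at the last position.
newline_at = my_str.index('\n')    # 7

# The letter 'n' does not occur, so index() raises ValueError.
try:
    my_str.index('n')
    found_n = True
except ValueError:
    found_n = False

# start and end restrict the search; end itself is excluded.
e_at = my_str.index('e', 2, 4)     # 3
print(newline_at, found_n, e_at)
```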

 

7. Some of the methods offered by strings are:

  • capitalize() – makes the first character upper-case and all the remaining letters lower-case;

Abcd
Alpha
Alpha
alpha
123
Αβγδ
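Outputs like those above can be reproduced with the sketch below. The input strings are assumptions, since the original snippet is missing.

```python
samples = ['aBCD', 'ALPHA', ' Alpha', '123', 'αβγδ']

# Only the very first character is upper-cased; every other letter is
# lower-cased. A leading space or digit is left as it is.
results = [s.capitalize() for s in samples]
for r in results:
    print(r)
```

Note that ' Alpha' becomes ' alpha': the first character is a space, so no letter ends up capitalized.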

  • center() – centers the string inside the field of a known length;

[   alpha   ]
[Beta]
[Beta]
[ Beta ]

The two-parameter variant of center() makes use of the character from the second argument, instead of a space. Analyze the example below:

[*******gamma********]
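A sketch of center() with hypothetical field widths; the brackets only make the padding visible.

```python
# Field wider than the string: padding is added on both sides.
framed = '[' + 'Beta'.center(6) + ']'           # '[ Beta ]'

# Field narrower than the string: the string is returned unchanged.
too_narrow = '[' + 'Beta'.center(2) + ']'       # '[Beta]'

# The second argument replaces the space as the fill character.
starred = '[' + 'gamma'.center(20, '*') + ']'   # '[*******gamma********]'
print(framed, too_narrow, starred)
```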

 

  • count() – counts the non-overlapping occurrences of a given substring;

  • join() – joins all items of a tuple/list into one string;

omicron,pi,rho

John#Peter#Vicky
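The two join() outputs above, plus a count() call, can be sketched as follows. The input values are assumptions chosen to match the outputs.

```python
occurrences = 'abcabc'.count('b')               # 2

# join() is called on the separator and receives the items to join.
greek = ','.join(['omicron', 'pi', 'rho'])      # list argument
names = '#'.join(('John', 'Peter', 'Vicky'))    # tuple argument
print(occurrences, greek, names)
```

All items passed to join() must be strings; other types raise a TypeError.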

 

lower() – converts all the string’s letters into lower-case letters;

sigma=60

 

lstrip() – returns a new string with the leading whitespace removed;

[tau ]

The one-parameter lstrip() method does the same as its parameterless version, but removes any leading characters listed in its argument (a string), not just whitespace:

cisco.com

 

rstrip() – removes trailing whitespace from the end of the string;

The two variants of rstrip() do nearly the same as those of lstrip(), but affect the opposite side of the string.

[ upsilon]

cis

 

strip() – removes the leading and trailing whitespace;

It combines the effects of rstrip() and lstrip(): it makes a new string lacking all the leading and trailing whitespace.

[aleph]

of all fruits banana is my favorite

banana
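The three stripping methods can be compared side by side in a sketch like this one. The input strings are assumptions chosen to reproduce the outputs shown above.

```python
left = '[' + ' tau '.lstrip() + ']'         # '[tau ]'
right = '[' + ' upsilon '.rstrip() + ']'    # '[ upsilon]'
both = '[' + '   aleph   '.strip() + ']'    # '[aleph]'

# With an argument, the methods strip any of the listed characters,
# in any order; the argument is a set of characters, not a prefix
# or suffix to match as a whole.
no_www = 'www.cisco.com'.lstrip('w.')       # 'cisco.com'
no_com = 'cisco.com'.rstrip('.com')         # 'cis'
fruit = ',,,,,rrttgg.....banana....rrr'.strip(',.grt')   # 'banana'
print(left, right, both, no_www, no_com, fruit)
```

The 'cis' result is a common surprise: rstrip('.com') also removes the trailing 'co' of 'cisco', because 'c' and 'o' are in the character set.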

 

split() – splits the string into a list of substrings, using whitespace or a given delimiter;

['phi', 'chi', 'psi']

['hello', 'my name is Peter', 'I am 26 years old']

Note: the reverse operation can be performed by the join() method.
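A sketch of split() with and without a delimiter, followed by the reverse operation. The input strings are assumptions that match the outputs above.

```python
# Without an argument, any run of whitespace (spaces, newlines, tabs)
# acts as a single separator.
words = 'phi       chi\npsi'.split()

# With an argument, the string is cut at each occurrence of it.
sentence = 'hello, my name is Peter, I am 26 years old'
parts = sentence.split(', ')

# join() with the same delimiter restores the original string.
rejoined = ', '.join(parts)
print(words, parts, rejoined)
```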

 

replace() – replaces a given substring with another;

www.pythoninstitute.org
Thare are it!
Apple

The three-parameter replace() variant uses the third argument (a number) to limit the number of replacements.

three three was a race horse, two two was one too.
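The outputs above can be reproduced with the following sketch; the input strings are assumptions.

```python
site = 'www.netacad.com'.replace('netacad.com', 'pythoninstitute.org')

# Every occurrence is replaced, including the 'is' inside 'This'.
odd = 'This is it!'.replace('is', 'are')

# Replacing with an empty string deletes the substring; the space
# before it remains.
trimmed = 'Apple juice'.replace('juice', '')

# The third argument limits the number of replacements to 2.
sentence = 'one one was a race horse, two two was one too.'
limited = sentence.replace('one', 'three', 2)
print(site, odd, trimmed, limited, sep='\n')
```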

 

find() – looks for a substring and returns the index of its first occurrence; if the substring does not occur in the string, it returns -1 instead of raising an exception.

1
-1

 

rfind() – looks for a substring starting from the end of the string and returns the index of its last occurrence (or -1); the syntax is string.rfind(value, start, end).

8
-1
4

12

8
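A sketch of find() and rfind(); the input strings are assumptions chosen to match the first few outputs above.

```python
first = 'Eta'.find('ta')                   # 1
missing = 'Eta'.find('mma')                # -1, no exception

# rfind() reports the last occurrence.
last = 'tau tau tau'.rfind('ta')           # 8

# start=9 leaves only 'au' to search, so nothing is found.
beyond = 'tau tau tau'.rfind('ta', 9)      # -1

# Searching positions 3..8 finds the middle 'ta'.
middle = 'tau tau tau'.rfind('ta', 3, 9)   # 4
print(first, missing, last, beyond, middle)
```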

 

swapcase() – swaps the letters’ cases (lower to upper and vice versa)

i KNOW THAT i KNOW NOTHING.

 

title() – makes the first letter in each word upper-case;

I Know That I Know Nothing. Part 1.

 

upper() – converts all the string’s letters into upper-case letters.

I KNOW THAT I KNOW NOTHING. PART 2.
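The four case-changing methods can be sketched together; the input strings are assumptions that reproduce the outputs shown above.

```python
lowered = 'SiGmA=60'.lower()
swapped = 'I know that I know nothing.'.swapcase()
titled = 'I know that I know nothing. Part 1.'.title()
uppered = 'I know that I know nothing. Part 2.'.upper()

# Characters without a case (digits, '=', '.') are left untouched.
print(lowered, swapped, titled, uppered, sep='\n')
```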

 

8. String content can be determined using the following methods (all of them return Boolean values):

endswith() – does the string end with a given substring?

yes
True
False
False
True

startswith() – does the string begin with a given substring?

The startswith() method is a mirror reflection of endswith() – it checks if a given string starts with the specified substring.

False
True

 

isalnum() – does the string consist only of letters and digits?

True
True
True
False
False
False

False (the cause of the first result is a space – it’s neither a digit nor a letter.)
True
True

isalpha() – does the string consist only of letters?

True
False

 

isdigit() – does the string consist only of digits? Any other character produces False as the result.

True
False

 

islower() – are all the string’s letters lower-case (with at least one letter present)?

False
True

isupper() – are all the string’s letters upper-case (with at least one letter present)?

False
False
True

isspace() – does the string consist only of whitespace characters?

True
True
False
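The content checks can be summarized in one sketch. The sample strings are assumptions, not the ones from the missing original snippets.

```python
alnum_ok = 'lambda30'.isalnum()       # True
alnum_space = 'lambda 30'.isalnum()   # False: a space is neither kind
alpha_ok = 'Moooo'.isalpha()          # True
alpha_digit = 'Mu40'.isalpha()        # False
digit_ok = '2018'.isdigit()           # True
digit_mixed = 'Year2019'.isdigit()    # False
lower_mixed = 'Moooo'.islower()       # False: 'M' is upper-case
lower_ok = 'moooo'.islower()          # True
upper_ok = 'MOOOO'.isupper()          # True
space_ok = ' \n '.isspace()           # True
space_empty = ''.isspace()            # False: no characters at all
```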

 

 

Exercise 1

What is the length of the following string, assuming there are no whitespace characters between the quotes?

 

1

Exercise 2

What is the expected output of the following code?

 

['t', 'e', 'r']

Exercise 3

What is the expected output of the following code?

 

bcd

Exercise 4

What is the expected output of the following code?

 

ABC123xyx

Exercise 5

What is the expected output of the following code?

 

of

Exercise 6

What is the expected output of the following code?

 

Where*are*the*snows?

Exercise 7

What is the expected output of the following code?

 

It is either hard or possible

Slicing

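The syntax is a[start:stop], which selects the items from index start up to, but not including, index stop. Either bound may be omitted: a[start:] runs to the end, a[:stop] starts from the beginning, and a[:] copies the whole sequence.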

There is also the step value, which can be used with any of the above: a[start:stop:step] selects every step-th item from start up to, but not including, stop.

The key point to remember is that the :stop value represents the first value that is not in the selected slice. So, the difference between stop and start is the number of elements selected (if step is 1, the default).

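The other feature is that start or stop may be a negative number, which means it counts from the end of the sequence instead of the beginning. So a[-1] is the last item, a[-2:] is the last two items, and a[:-2] is everything except the last two items.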

Similarly, step may be a negative number: a[::-1] is the whole sequence reversed, and a[1::-1] is the first two items, reversed.

Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
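The slicing rules can be tried on a string, which slices the same way as a list. The sample value below is an arbitrary choice for illustration.

```python
a = 'abcdefgh'

from_to = a[2:5]        # 'cde': stop index 5 is excluded
to_end = a[2:]          # 'cdefgh'
from_start = a[:5]      # 'abcde'
stepped = a[1:7:2]      # 'bdf': indices 1, 3, 5
last_item = a[-1]       # 'h'
last_two = a[-2:]       # 'gh'
all_but_two = a[:-2]    # 'abcdef'
reversed_a = a[::-1]    # 'hgfedcba'

# Asking for more than is available gives an empty result, not an error.
too_short = 'x'[:-2]    # ''
```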

 

 

Comparing strings

1. Strings can be compared to strings using general comparison operators, but comparing them to numbers gives no reasonable result, because no string can be equal to any number. Comparing strings against numbers is generally a bad idea. For example:

    • string == number is always False;
    • string != number is always True;
    • string >= number always raises a TypeError exception.

True

String comparison is always case-sensitive (upper-case letters are taken as lesser than lower-case).

True

Let’s check it:

The results in this case are:
False
True
False
True
TypeError exception
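A sketch that yields this sequence of results; the operands are assumptions, since the original snippet is missing.

```python
eq = '10' == 10      # False: a string never equals a number
ne = '10' != 10      # True
eq1 = '10' == 1      # False
ne1 = '10' != 1      # True

# Ordering a string against a number is not defined.
try:
    '10' > 10
    ordered = True
except TypeError:
    ordered = False

# String-to-string comparisons work character by character.
longer_wins = 'alpha' < 'alphabet'   # True: the longer string is greater
case_matters = 'beta' > 'Beta'       # True: 'b' has a higher code point
```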

 

2. Sorting lists of strings can be done by:

    • a function named sorted(), creating a new, sorted list;

['omega', 'alpha', 'pi', 'gamma']

['alpha', 'gamma', 'omega', 'pi']

 

    • a method named sort(), which sorts the list in place

['omega', 'alpha', 'pi', 'gamma']

['alpha', 'gamma', 'omega', 'pi']
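The difference between the two approaches can be sketched as follows; the variable names are hypothetical.

```python
first_greek = ['omega', 'alpha', 'pi', 'gamma']
first_greek_sorted = sorted(first_greek)   # a new list is returned
# first_greek itself keeps its original order.

second_greek = ['omega', 'alpha', 'pi', 'gamma']
returned = second_greek.sort()             # sorts in place, returns None
print(first_greek, first_greek_sorted, second_greek, returned, sep='\n')
```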

 

3. A number can be converted to a string using the str() function.

The code outputs:
13 1.3
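A snippet along these lines produces that output; the variable names are assumptions.

```python
itg = 13
flt = 1.3

si = str(itg)   # '13'
sf = str(flt)   # '1.3'

# Both values are strings now, so + concatenates them.
line = si + ' ' + sf
print(line)
```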

 

4. A string can be converted to a number (although not every string) using either the int() or float() function. The conversion fails if the string doesn’t contain a valid representation of a number (a ValueError exception is raised then).

This is what you’ll see in the console:
14.3
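A sketch of the conversion in both its successful and failing forms. The variable names are assumptions, and the failing case is an added illustration.

```python
si = '13'
sf = '1.3'

itg = int(si)      # 13
flt = float(sf)    # 1.3
total = itg + flt
print(total)

# int() accepts only an integer image; '12.8' raises ValueError.
try:
    int('12.8')
    converted = True
except ValueError:
    converted = False
```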

 

Exercise 1

Which of the following lines describe a true condition?
'smith' > 'Smith'
'Smiths' < 'Smith'
'Smith' > '1000'
'11' < '8'

1, 3 and 4

 

Exercise 2

What is the expected output of the following code?

are

['Where', 'are', 'of', 'snows', 'the', 'yesteryear?']

 

Exercise 3

What is the expected result of the following code?

The code raises a ValueError exception because int() cannot convert the string '12.8', which contains a decimal point, to an integer.

 

Exercise 4

What will be printed to the monitor ?

This question is all about string comparison.

As a reminder :

-> when comparing strings, Python compares code point values, character by character.

-> string comparison is case-sensitive : upper-case letters are taken as lesser than lower-case

-> when comparing two strings of different lengths and the shorter one is identical to the longer one’s beginning, the longer string is considered greater.

-> when comparing a string to a number (integer or float), the only comparisons you can perform are with the == and != operators; the ordering operators (>, <, >=, <=) raise a TypeError exception.

-> code points of number characters (0..9) are less than code points of upper-case characters (A..Z)

-> code points of upper-case characters (A..Z) are less than code points of lower-case characters (a..z)

Note : function ord()  returns the code point of a given character.

So, in the above question :

'Luke' < 'luke'   is True  ( upper-case 'L'  has a smaller code point than lower-case 'l' )

'10' < '5'   is True ('1' has a smaller code point than  '5' )

So : print('Luke' < 'luke' and '10' < '5')  will print True

'luke' > '10'   is True  ( lower-case 'l'  has a greater code point than '1' )

'Luke' > '5'   is True (upper-case 'L' has a greater code point than '5' )

So : print('luke' > '10' and 'Luke' > '5')  will print True

 

Exercise 5

What is the output of the following code snippet?

print(str(2/3)[-1])

6

Explanation:

2/3 is a valid operation and evaluates to 0.6666666666666666  (a float).

The str() function will convert the float to a string.

str(2/3)[-1]    is an indexing operation on the string and returns its last character, which is 6 .

 

Ex. 6

What is the expected output of the following snippet ?

 

4

Explanation:

isalpha() checks if the string contains only alphabetical characters (letters), and returns True or False according to the result.

isdigit() checks if the string contains only digits, and returns True or False according to the result.

isalnum() checks if the string contains only digits or alphabetical characters (letters), and returns True or False according to the result.

Note that a space is neither a digit nor a letter.

So, the above code returns 4.

 

Ex. 7

What is the expected output of the following code snippet ?

Explanation:

sort() is a method of the list class. It is similar to sorted() but with a few differences :

sorted() returns a new sorted list, while sort() sorts the list in place

sorted() can be used with an iterable (like a string for example), while sort() can only be used with lists.

Both sort() and sorted() can be used with one of these optional keyword arguments:

reverse : if set to True, sorting is done in a descending order – if set to False, sorting is done in ascending order.

key : this argument is a function which will be used on each value in the list being sorted to determine the resulting order.

In the above code, the key argument is used in the sort() method, with key = lambda x: x[::-1]. This lambda function is applied to each element of my_list and returns the string with its characters in reverse order. For example: 'apple' becomes 'elppa', 'koala' becomes 'alaok', etc. The sort() method then orders the elements by comparing these reversed strings ('elppa', 'alaok', etc.), but the list still holds the original strings (they are not modified by the key function; it is only used to work out the order).

Based on this, 'alaok' would be the first string (because it starts with character ‘a’ which has the smallest Unicode value among lowercase letters). The second one will be 'elppa'. Third one is 'oloP' and last one is 'zB4321'.

And so, after my_list.sort(key = lambda x: x[::-1]), which sorts in place and returns None, my_list is:

['koala', 'apple', 'Polo', '1234Bz']  (remember that the strings are not modified by the function in key; it is just used for ordering).
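The explanation can be verified with a short sketch. The starting order of my_list is an assumption, because the original snippet is missing; the four strings are the ones named in the explanation.

```python
my_list = ['1234Bz', 'Polo', 'apple', 'koala']   # assumed starting order

# The values the key function produces, shown only for inspection.
keys = [s[::-1] for s in my_list]

# sort() compares the reversed strings but moves the original ones.
returned = my_list.sort(key=lambda x: x[::-1])
print(keys)
print(my_list)
print(returned)
```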

 

Ex. 8

Which statement is true about the two objects below ? (Pick two)

A. id(str1) == id(str2)
B. str1 is not str2
C. str1[::1] == str2

D. id(str1) != id(str2)

AC

 

Ex. 9

What is the expected output of the following code, assuming there is no whitespace after the 3rd quote of the first line of code?

A. 2

B. 7

C. 8

D. 4

Explanation:

''' delimits a multi-line string.

my_str is a two-line string: the first line contains only the newline character (\n), which counts as one character, and the second line contains only the # character (which does not start a comment inside a string literal).

So, the total length of the string is : 2.

 

Ex.10

Knowing that ord('a') has a value of  97, which code below will print bee to the monitor ?

A. print(chr(98), chr(101), chr(101), sep=',')

B. print(''.join([str(ord('b')), str(ord('e')), str(ord('e'))]))

C. print(chr(98) + 2*chr(101))

D. print(chr(ord('a')+1), chr(ord('a') + 4), chr(ord('a') + 4))
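One way to check the options is to build the text each print() call would emit, remembering that print() joins its arguments with sep (a space by default). The option_* names are hypothetical.

```python
# A: three arguments joined with sep=','
option_a = ','.join([chr(98), chr(101), chr(101)])

# B: the code points are converted to digit strings, not characters
option_b = ''.join([str(ord('b')), str(ord('e')), str(ord('e'))])

# C: one string built by concatenation and replication
option_c = chr(98) + 2 * chr(101)

# D: three arguments joined with the default separator, a space
option_d = ' '.join([chr(ord('a') + 1), chr(ord('a') + 4), chr(ord('a') + 4)])

print(option_a, option_b, option_c, option_d, sep='\n')
```

Under this check only option C produces bee.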

 

The Caesar Cipher: encrypting a message

This cipher was (probably) invented and used by Gaius Julius Caesar and his troops during the Gallic Wars. The idea is rather simple: every letter of the message is replaced by the letter that follows it in the alphabet (A becomes B, B becomes C, and so on). The only exception is Z, which becomes A.
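A minimal sketch of the encryption step, written as a hypothetical function named encrypt(). Converting the message to upper case and dropping everything that is not a Latin letter are simplifying assumptions, not requirements stated above.

```python
import string


def encrypt(text):
    """Shift every Latin letter one place forward; Z wraps to A."""
    cipher = ''
    for char in text.upper():
        if char not in string.ascii_uppercase:
            continue                  # assumption: non-letters are dropped
        if char == 'Z':
            cipher += 'A'             # the only exception: Z becomes A
        else:
            cipher += chr(ord(char) + 1)
    return cipher


message = encrypt('Ave Caesar')
print(message)
```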

 

The Caesar Cipher: decrypting a message
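Decryption reverses the shift: each letter is replaced by the one that precedes it, and A wraps around to Z. The sketch below uses a hypothetical function named decrypt() and assumes the cryptogram consists of Latin letters; anything else is skipped.

```python
import string


def decrypt(cipher):
    """Shift every Latin letter one place back; A wraps to Z."""
    text = ''
    for char in cipher.upper():
        if char not in string.ascii_uppercase:
            continue                  # assumption: non-letters are dropped
        if char == 'A':
            text += 'Z'               # mirror of the Z-to-A exception
        else:
            text += chr(ord(char) - 1)
    return text


plain = decrypt('BWFDBFTBS')
print(plain)
```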

 

The Numbers Processor

The third program shows a simple method allowing you to input a line filled with numbers, and to process them easily. The processing will be extremely easy – we want the numbers to be summed.

Enter a line of numbers - separate them with spaces: 12 4

The total is: 16.0
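The processing can be sketched with split() and float(). The function name sum_line() is hypothetical, and the line is passed as an argument here; the original program presumably reads it with input().

```python
def sum_line(line):
    """Sum the whitespace-separated numbers found in one line of text."""
    total = 0.0
    for token in line.split():
        # float() raises ValueError if a token is not a valid number.
        total += float(token)
    return total


result = sum_line('12 4')
print('The total is:', result)   # The total is: 16.0
```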

 

The IBAN Validator

The fourth program implements (in a slightly simplified form) an algorithm used by European banks to specify account numbers. The standard named IBAN (International Bank Account Number) provides a simple and fairly reliable method for validating account numbers against simple typos that can occur during rewriting of the number.

The standard says that validation requires the following steps (according to Wikipedia):

  • (step 1) Check that the total IBAN length is correct as per the country (this program won’t do that, but you can modify the code to meet this requirement if you wish; note: you have to teach the code all the lengths used in Europe)
  • (step 2) Move the four initial characters to the end of the string (i.e., the country code and the check digits)
  • (step 3) Replace each letter in the string with two digits, thereby expanding the string, where A = 10, B = 11 … Z = 35;
  • (step 4) Interpret the string as a decimal integer and compute the remainder of dividing that number by 97; if the remainder is 1, the check digit test is passed and the IBAN might be valid.
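Steps 2 to 4 can be sketched as below. The function name iban_check_passes() is hypothetical, removing spaces and rejecting non-alphanumeric input are added assumptions, and step 1 (the per-country length check) is left out, as in the description above.

```python
import string


def iban_check_passes(iban):
    """Return True if the IBAN passes the mod-97 check digit test."""
    iban = iban.replace(' ', '').upper()
    allowed = string.ascii_uppercase + string.digits
    if len(iban) < 5 or any(ch not in allowed for ch in iban):
        return False
    # Step 2: move the country code and check digits to the end.
    rearranged = iban[4:] + iban[:4]
    # Step 3: replace letters with two digits (A = 10 ... Z = 35).
    digits = ''
    for ch in rearranged:
        if ch in string.digits:
            digits += ch
        else:
            digits += str(10 + ord(ch) - ord('A'))
    # Step 4: the remainder of division by 97 must be 1.
    return int(digits) % 97 == 1


print(iban_check_passes('GB82 WEST 1234 5698 7654 32'))
```

The sample number is the example IBAN commonly used in documentation of the standard; changing any single digit makes the check fail.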