String best practices with Python

2019-02-16
Python



There are a lot of ways to work with strings in Python, and there are some cool tricks I want to share that make strings easier to deal with.

1. Formatting strings

1.1. Raw strings with r" "

A raw string lets you write a literal string without escaping backslashes: sequences such as \n are kept as they are instead of being interpreted as special characters.

Do

print(r"C:\some\name")
Out: C:\some\name

Don't do

print("C:\some\name")  # \n is interpreted as a newline
Out:
C:\some
ame
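A quick way to convince yourself is to compare a raw string with its hand-escaped equivalent. This is only a small sketch; the variable names `raw_path` and `escaped_path` are mine, chosen for illustration.

```python
raw_path = r"C:\some\name"
escaped_path = "C:\\some\\name"  # same text, each backslash escaped by hand

print(raw_path == escaped_path)  # both literals describe the same string
print(len(r"\n"))  # 2: a backslash followed by the letter n
print(len("\n"))   # 1: a single newline character
```

So a raw string is not a different type of string, just a more convenient way to write the literal.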

1.2. Formatting with repeated occurrences

You can use "".format() with a named placeholder for each value you want to insert. A name that appears several times in the text is filled in every time. For example:

from datetime import date

print("""
    Hello {name},
    Welcome to {company}. Your new email is: {name}@{company}.
    Regards,

    {date:%Y-%m-%d}
    """.format(
        name="john",
        company="awesomecompany",
        date=date.today()
    )
)

Triple quotation marks (""") allow a string to span more than one line.
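If the values already live in a dictionary, they can probably be passed to format() in one go with `**`. The sketch below assumes a made-up `fields` dictionary and `template` string; neither comes from a real project.

```python
from datetime import date

# Hypothetical values, chosen only for illustration.
fields = {
    "name": "john",
    "company": "awesomecompany",
    "date": date(2019, 2, 16),
}

# {name} appears twice and is filled in both times.
template = "{name}@{company}, created on {date:%Y-%m-%d} for {name}"
message = template.format(**fields)
print(message)
```

Keeping the template separate from the values makes it easy to reuse the same text with different data.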

1.3. Formatting with f" " (Python 3.6+)

Imagine you need to build the filename src/data/2019-02-16.xlsx from the following parameters:

from datetime import date

# This will probably change in a for loop or something similar
path = "src/data"
mdate = date(2019, 2, 16)

Do (Python 3.6+)

filename = f"{path}/{mdate:%Y-%m-%d}.xlsx"

Don't do

filename = path + "/" + mdate.strftime("%Y-%m-%d") + ".xlsx"

With older versions of Python

filename = "{}/{:%Y-%m-%d}.xlsx".format(path, mdate)

All three give the same result, but the f" " version is more compact and easier to read.
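Since the date will probably change inside a loop, here is a minimal sketch of that case. The three-day range and the names `start` and `filenames` are assumptions made for the example.

```python
from datetime import date, timedelta

path = "src/data"
start = date(2019, 2, 16)

# Build one filename per day for three consecutive days.
# Any expression can go inside the braces, followed by its format spec.
filenames = [
    f"{path}/{start + timedelta(days=i):%Y-%m-%d}.xlsx"
    for i in range(3)
]

for filename in filenames:
    print(filename)
```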

2. Concatenating strings

2.1. Concat a few strings

For small concatenations you can simply put two string literals next to each other:

"hello " "world"
Out: "hello world"

You can also repeat strings with:

"hello_" * 4
Out: "hello_hello_hello_hello_"
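One detail worth knowing: putting strings side by side only works with literals, not with variables. The sketch below shows both cases; the names `message`, `greeting`, `target` and `separator` are made up for the example.

```python
# Adjacent literals are merged into a single string, which is handy
# for splitting a long literal across several lines inside parentheses.
message = (
    "hello "
    "world"
)

# With variables an explicit operator is needed.
greeting = "hello "
target = "world"
combined = greeting + target

# Repetition also works nicely for separators.
separator = "-" * 10

print(message)
print(combined)
print(separator)
```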

2.2. Concat a lot of strings

To concatenate many strings, collect them in a list and join them at the end. It's faster and cleaner. In the timings below, mlist is a list of strings.

Do

%%timeit
" ".join(mlist)

7.53 µs ± 53.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Don't do

%%timeit
out = ""

for x in mlist:
    out += x + " "

105 µs ± 3.92 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
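The %%timeit magic only works in IPython or Jupyter. Outside of a notebook, a similar comparison can be sketched with the standard timeit module. The contents of `mlist` here are an assumption (1000 short strings), so the numbers will not match the ones above.

```python
import timeit

# Assumed input: 1000 short strings, used only to make the example runnable.
mlist = [str(i) for i in range(1000)]


def with_join():
    return " ".join(mlist)


def with_loop():
    out = ""
    for x in mlist:
        out += x + " "
    return out


# Run each version 1000 times and report the total time in seconds.
join_time = timeit.timeit(with_join, number=1000)
loop_time = timeit.timeit(with_loop, number=1000)
print(f"join: {join_time:.4f} s, loop: {loop_time:.4f} s")
```

Note that the loop version also leaves an extra space at the end of the result, which join avoids.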

3. Slicing strings

text = "hello world"

text[0]:          "h"
text[:4]:         "hell"
text[-1]:         "d"
text[-5: -1]:     "worl"
text[1:-1]:       "ello worl"
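As a more practical case, slicing can pull fixed-position pieces out of a filename like the one used earlier. This is just a sketch that assumes the name always follows the YYYY-MM-DD.xlsx pattern; the variable names are illustrative.

```python
filename = "2019-02-16.xlsx"

year = filename[:4]        # first four characters
month = filename[5:7]      # characters at positions 5 and 6
extension = filename[-5:]  # last five characters
stem = filename[:-5]       # everything except the last five characters

print(year, month, extension, stem)
```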

4. String built-in methods

The upper, lower and title methods:

text = "heLLo world"

text.upper():     "HELLO WORLD"
text.lower():     "hello world"
text.title():     "Hello World"

Remove leading and trailing whitespace (spaces, tabs, newlines and similar characters):

text = "\n hello\r\t"

text.strip():     "hello"
text.lstrip():    "hello\r\t"
text.rstrip():    "\n hello"
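A typical use is cleaning lines that come from a file or from user input. The sketch below uses a made-up `raw_lines` list to show it; note that strip only touches the ends of the string, never the middle.

```python
# Hypothetical input, as it might come from reading a text file.
raw_lines = ["  alice \n", "\tbob\r\n", "carol"]

clean_lines = [line.strip() for line in raw_lines]
print(clean_lines)

# Whitespace inside the string is left untouched.
print("  a b  ".strip())
```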

Counting the number of times a char appears in a string:

ip = "192.168.1.1"
ip.count(".")
Out: 3
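count also accepts longer substrings, and it counts occurrences that do not overlap. A minimal sketch with made-up inputs:

```python
ip = "192.168.1.1"
dots = ip.count(".")

# Substrings longer than one character work too.
an_count = "banana".count("an")

# Matches do not overlap: "aaaa" contains "aa" twice, not three times.
aa_count = "aaaa".count("aa")

print(dots, an_count, aa_count)
```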