Python Best Practices and Tips by Toptal Developers

This resource contains a collection of Python best practices and Python tips provided by our Toptal network members. As such, this page will be updated on a regular basis to include additional information and cover emerging Python techniques. This is a community driven project, so you are encouraged to contribute as well, and we are counting on your feedback.

Python is a high-level language used in many areas of development, such as web development (Django, Flask), data analysis (SciPy, scikit-learn), desktop UI (wxPython, PyQt), and system administration (Ansible, OpenStack). Its main advantage is development speed: Python comes with a rich standard library, a large ecosystem of third-party libraries, and a clean syntax. All of this lets developers focus on the problem they want to solve rather than on language details or reinventing the wheel.

Check out the Toptal resource pages for additional information on Python. There is a Python hiring guide, Python job description, common Python mistakes, and Python interview questions.

Dealing With Pyc Files When Working With Git.

Python developers know that Python automatically compiles each .py file into bytecode, which the interpreter then executes. This bytecode is cached in a .pyc file, either next to its source file (Python 2) or in a __pycache__ subdirectory (Python 3). The .pyc file is written when a module is imported for the first time, or when its source has changed since the last compilation; the script that is run directly is not cached.

As an example, in many Python web frameworks when we create a views.py, a file containing the logic for our views, we will most probably get a views.pyc file in the same directory after running an instance of our application.

As developers, we often work with big codebases utilizing Git, and on projects with many developers in teams. This means a bunch of features are being developed at the same time, and thus we have to switch branches frequently to pair with other developers, or to review and test someone’s code. Depending on the differences between two branches, we can end up with .pyc files from the other branch, which can lead to unexpected behaviors.

Git is a well known source code management tool which cleverly provides hooks, a way to fire custom scripts when certain important actions occur. We can include hooks to the most used actions, like before or after committing, pushing, rebasing, merging, and similar.

One of the available hooks is the “post-checkout” hook, which fires after we check out another branch or a specific commit. We can put our code for cleaning the .pyc files in this hook. Every Git project has a .git folder in the project’s root; from there we just need to edit (or create) the file .git/hooks/post-checkout and add the following code:

#!/bin/sh
find . -type f -name "*.py[co]" -delete
find . -type d -name "__pycache__" -delete

This is for Linux and Mac. On Windows, first we need to install Cygwin, so Git can execute the hook scripts as shell. Then we need to change the commands a little bit:

#!/bin/sh
find . -type f -name "*.py[co]" | xargs rm
find . -type d -name "__pycache__" | xargs rm -rf

When the Python interpreter is invoked with the -O flag, optimized code is generated and stored in .pyo files. For that reason, our command checks for .pyc or .pyo files and removes them (*.py[co]). Python 3 stores its compiled bytecode in __pycache__ directories, so we included a command that will remove them as well.

In the end, after saving our hook file on Linux and Mac, we need to make sure we add execution permissions to it:

$ chmod +x .git/hooks/post-checkout
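
For teams that prefer to keep the hook logic in Python, here is a minimal sketch of the same cleanup. The function name remove_bytecode is made up for this illustration, and the code assumes Python 3; it targets the same files as the shell commands above.

```python
#!/usr/bin/env python3
"""Illustrative sketch: remove compiled Python files below a directory."""
import shutil
from pathlib import Path


def remove_bytecode(root="."):
    """Delete __pycache__ directories and stray *.pyc / *.pyo files under root.

    Returns the number of directories and files that were removed.
    """
    removed = 0
    root = Path(root)
    # Materialize the matches first so nothing is deleted while still walking.
    for cache_dir in list(root.rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir, ignore_errors=True)
            removed += 1
    for pattern in ("*.pyc", "*.pyo"):
        for path in list(root.rglob(pattern)):
            if path.is_file():
                path.unlink()
                removed += 1
    return removed
```

To use it as a hook, the script would end with a call to remove_bytecode() and be saved as .git/hooks/post-checkout with execute permission, exactly like the shell version.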

Contributors

Ivan Neto

Freelance Python Developer
Brazil

Ivan is an engineer with eight years of experience in software and web development. Primarily focused on the web, he has solved code optimization problems arising from growing systems, and has migrated production apps and services on the fly with zero downtime. While a major part of his experience has been on the back-end, he can also work with the front-end, databases, networking, and infrastructure.

Why Should You Avoid Using The `try: except: pass` Pattern?

What is the complete opposite of a design pattern? An antipattern: a pattern that silently destroys the quality of your code. The pattern we are about to describe is one of them. Aaron Maxwell calls it the most diabolical Python antipattern:

try: 
    subtle_buggy_operation() # most possibly some I/O or DB ops
except:
    pass

You may think you will save some development time by pass-ing over exceptions “just for now”. But it will take hours, if not days, to track down bugs inside this block later: every exception is swallowed by the pass, so the failure surfaces somewhere else, outside the try/except block, in code that may look perfectly innocent.

A quote from Aaron:

In my nearly ten years of experience writing applications in Python, both individually and as part of a team, this pattern has stood out as the single greatest drain on developer productivity and application reliability, especially over the long term.

If you really want to pass over one or two well-expected exceptions, then name them explicitly instead of passing on everything. In Python, “explicit is better than implicit”, as in the code example below:

try: 
    subtle_buggy_operation() # most possibly some I/O or DB ops
except ThouShaltPassError:
    pass

Again, referencing “Zen of Python”:

Errors should never pass silently

Unless explicitly silenced
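
As a sketch of what “explicitly silenced” can look like in practice, the hypothetical function below (read_port and its default value are invented for this example) silences only the one exception it expects, and logs and re-raises anything it does not know how to handle:

```python
import logging

logger = logging.getLogger(__name__)


def read_port(config):
    """Return config['port'] as an int, or a default when it is absent."""
    try:
        return int(config["port"])
    except KeyError:
        # Expected and harmless: the setting is optional, so silence it explicitly.
        return 8080
    except ValueError:
        # Unexpected data: record the traceback, then let the error propagate.
        logger.exception("Invalid port value: %r", config["port"])
        raise
```

The bare except: pass version would have returned None for a malformed value, and the failure would have shown up much later, far from its cause.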

Contributors

Khaled Monsoor

Backend Engineer (Freelancer)

Python-based software developer. Big-data enthusiast. Photography and coffee aficionado.

How to Deal with Hash Table Based Operations?

Many people are confused by the fact that almost all hash-table-based operations in Python are done in place and return None rather than the modified object. Typical examples are dict.update() and almost all operations on sets:

>>> a = {1,2,3}
>>> a.update({5,6,7})  # returns None
>>> a
{1, 2, 3, 5, 6, 7}  # object changed in place
>>> a.add(8)  # returns None
>>> a
{1, 2, 3, 5, 6, 7, 8}  # object changed in place
>>> d = {"foo": "bar"}
>>> d.update({"zzz": "derp"})  # returns None
>>> d
{'foo': 'bar', 'zzz': 'derp'}  # object changed in place

Thinking of a more practical example, you can’t write something like this:

requests = [
    web.Request(
        body=json.dumps(
            item.update({"foo": "bar"})
        )
    ) for item in my_long_iterable
]

The code above sets the body of each request to the string "null" (the JSON encoding of None), which can lead to weird bugs or even worse surprises.

Usually, most programmers will rewrite it as a simple loop after some thinking:

requests = []
for item in my_long_iterable:
    item.update({"foo": "bar"})  # mutates item in place and returns None
    requests.append(
        web.Request(body=json.dumps(item))
    )

The fun thing here is there is still a possibility to write it as a single list comprehension. Here is how:

requests = [
    web.Request(
        body=json.dumps(
            dict(foo="bar", **item)
        )
    ) for item in my_long_iterable
]

Simple and understandable, right?
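
One caveat, sketched below with made-up data: dict(foo="bar", **item) raises a TypeError when item already contains the key "foo", and it requires every key in item to be a string. Passing the mapping first, or using dictionary unpacking (Python 3.5+), avoids both problems and keeps the override behavior of update(). Neither form mutates the original item.

```python
item = {"foo": "old", "id": 7}

# Keyword arguments override keys copied from the positional mapping.
merged = dict(item, foo="bar")

# Equivalent with dictionary unpacking (Python 3.5+); later keys win.
merged_unpacked = {**item, "foo": "bar"}

try:
    dict(foo="bar", **item)  # "foo" is supplied twice
except TypeError as exc:
    duplicate_key_error = exc
```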

Some people will say the loop version is more obvious and readable. However, when we are dealing with huge amounts of data, we can simply replace the square brackets with round ones (as David Beazley usually suggests), and our code becomes a lazy generator expression. To get the same laziness from the loop version, we would need to move it into a separate generator function using yield, which can sometimes hurt readability even more. You can read more from David himself on this whole topic.

Contributors

Nikolay Markov

Freelance Python Developer
Russia

Nikolay is a software engineer with a good knowledge of Python, algorithms and data structures. He has experience with scalable and highly loaded systems architecture - Web technologies, NoSQL, OpenStack - as well as experience leading groups of developers.

Should I Use Exceptions or Conditional Handling?

A Python best practice is to use exceptions to handle truly “exceptional” cases. Unnecessary if checks on every call may slow down your code. Keep in mind, however, that an exception which is actually raised and caught costs more than a plain if check, so you must use exceptions wisely.

To summarize, exceptions are good for rare cases, and conditions are better for frequent use cases. In the end, of course “it’s better to ask for forgiveness than permission”.
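
The two styles can be sketched side by side. The function names and the price table are hypothetical; both functions return the same results and differ only in where the cost is paid.

```python
def get_price_lbyl(prices, name):
    """Look before you leap: the membership test runs on every call."""
    if name in prices:
        return prices[name]
    return 0.0


def get_price_eafp(prices, name):
    """Ask forgiveness: no extra check when the key exists."""
    try:
        return prices[name]
    except KeyError:
        # Only the (hopefully rare) missing-key case pays for the exception.
        return 0.0
```

If missing keys are the common case rather than the exception, the first version is usually the better choice.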

Contributors

Nikolai Golub

Freelance Python Developer
Netherlands

Nikolai is a self-directed and organized professional. He is a results-oriented problem solver with sharp analytical abilities and excellent communication skills. He is a highly competent software engineer with four years of experience in Python programming and web technologies.

Use Python Properties Rather Than Explicit Getters and Setters

Python properties (@property) are much cleaner than getters and setters. Let’s take a look at the following code snippet, written in a Java style, with getters and setters:

class A:

  def __init__(self, some):
    self._b = some

  def get_b(self):
    return self._b

  def set_b(self, val):
    self._b = val

a = A('123')
print(a.get_b())
a.set_b('444')

Now compare this with code written using Python-style properties:

class A:

  def __init__(self, some):
    self._b = some

  @property
  def b(self):
    return self._b

  @b.setter
  def b(self, val):  # the setter must reuse the property's name
    self._b = val

a = A('123')
print(a.b)
a.b = '444'

Please be aware, if you encapsulate complicated logic or heavy calculations behind the property, you may confuse other developers who will use your class.
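
Where properties pay off is when an attribute later needs validation or a cheap derived value, because callers keep using plain attribute syntax. The Temperature class below is a made-up illustration of that idea, not part of the original tip:

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):
        # Cheap, derived, read-only value: a good fit for a property.
        return self._celsius * 9 / 5 + 32
```

Assigning to fahrenheit raises AttributeError because no setter is defined for it.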

Contributors

Nikolai Golub

Freelance Python Developer
Netherlands

Nikolai is a self-directed and organized professional. He is a results-oriented problem solver with sharp analytical abilities and excellent communication skills. He is a highly competent software engineer with four years of experience in Python programming and web technologies.

How to Compare Two Objects Using `==` Operator?

If a class doesn’t provide an __eq__ method, Python falls back to comparing identity: == returns True only if both operands are actually the same object:

class A:

  def __init__(self, i):
    self.i = i

a = A(1)
b = a
c = A(1)

a == b # True
a == c # False

In this example, a == c is False, even though both objects store the same value inside.

If you want to enable your own comparison logic for two objects with the == operator, you can implement the __eq__ method:

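class Car(object):

  def __init__(self, horse_power, color):
    self.horse_power = horse_power
    self.color = color

  def __eq__(self, other):
    # Let Python handle comparisons with unrelated types
    if not isinstance(other, Car):
      return NotImplemented
    return self.horse_power == other.horse_power and self.color == other.color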
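
One related detail worth knowing: in Python 3, a class that defines __eq__ without __hash__ becomes unhashable, so its instances can no longer be put in sets or used as dictionary keys. The sketch below extends the Car idea with a matching __hash__; the _key helper is an assumption made for this example.

```python
class Car:
    def __init__(self, horse_power, color):
        self.horse_power = horse_power
        self.color = color

    def _key(self):
        # The values that define equality, gathered in one place.
        return (self.horse_power, self.color)

    def __eq__(self, other):
        if not isinstance(other, Car):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Objects that compare equal must also hash equally.
        return hash(self._key())
```

This only makes sense when the attributes used for equality do not change while the object sits in a set or dictionary.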

Contributors

Nikolai Golub

Freelance Python Developer
Netherlands

Nikolai is a self-directed and organized professional. He is a results-oriented problem solver with sharp analytical abilities and excellent communication skills. He is a highly competent software engineer with four years of experience in Python programming and web technologies.

Feel the Power of the Python Logical Operators.

Python’s logical operators and and or don’t just return the Boolean values True or False; they return one of their operands.

This behavior can be used to speed up development and to increase code readability. In the following example, an object is fetched from the cache, or, if it is missing from the cache, from the database:

# check_in_cache returns object or None
def get_obj():
    return check_in_cache() or pull_from_db()

A quick explanation of the provided example: it first tries to get the object from the cache (the check_in_cache() function). If that returns None instead of an object, it gets the object from the database (the pull_from_db() function). This is more concise than the following code snippet, written in the standard way:

def get_obj():
    result = check_in_cache()
    if result is None:
        result = pull_from_db()
    return result

Our first code example solves the problem in one line of code instead of four, and it is arguably more expressive and readable.

Just one thing to watch for: objects that are logically equivalent to False, such as an empty list. If check_in_cache() returns such an object, it will be treated as missing, and your app will call pull_from_db() anyway. So, in cases where your functions could return these kinds of objects, consider using an explicit is None check instead.
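
The difference is easy to see in a small sketch. Unlike the zero-argument functions in the text, the stand-ins below take a key and a cache dictionary, and the calls list exists only to record database hits; all of these are assumptions made for illustration.

```python
calls = []  # records every simulated database hit


def check_in_cache(key, cache):
    return cache.get(key)  # None when the key is not cached


def pull_from_db(key):
    calls.append(key)
    return ["fresh", key]


def get_obj_or(key, cache):
    # Treats any falsy cached value (such as []) as a cache miss.
    return check_in_cache(key, cache) or pull_from_db(key)


def get_obj_explicit(key, cache):
    # Only a real None counts as a cache miss.
    result = check_in_cache(key, cache)
    if result is None:
        result = pull_from_db(key)
    return result
```

With an empty list stored in the cache, get_obj_or goes to the database while get_obj_explicit returns the cached empty list.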

Contributors

Nikolai Golub

Freelance Python Developer
Netherlands

Nikolai is a self-directed and organized professional. He is a results-oriented problem solver with sharp analytical abilities and excellent communication skills. He is a highly competent software engineer with four years of experience in Python programming and web technologies.

Why Is It Important to Write Documentation or Inline Comments?

Many developers think they save time by skipping comments and leaving semi-obfuscated code, or that avoiding docstrings will help them meet deadlines. Rest assured that before long you will hate yourself when, reading your own code, you can no longer remember what you did or why you did it that way.

Eventually you will probably leave the company, and your code will haunt every team member who comes across this zombie-like code. There is just no excuse for not writing documentation. Writing docstrings and comments for complex code sections doesn’t take much time. A complementary approach is to name your functions, methods, and variables so that they reflect their purpose, making the code “self-documenting”.

Here is a nice guide on how to properly document your Python code.

Contributors

Khaled Monsoor

Backend Engineer (Freelancer)

Python-based software developer. Big-data enthusiast. Photography and coffee aficionado.

Why You Should Avoid Using `from module import *` in Your Projects?

The practice of using from module import * in your code can turn nice, clean modules into a nightmare. It rarely causes trouble in small projects of up to ten or so modules developed by a small team. But when the project grows to mid-size and the work is spread across multiple teams in multiple locations, code using this practice will start to produce bewildering errors due to circular references.

According to a core Python developer David Goodger:

The from module import * wild-card style leads to namespace pollution. You’ll get things in your local namespace that you didn’t expect to get. You may see imported names obscuring module-defined local names. You won’t be able to figure out where certain names come from. Although a convenient shortcut, this should not be in production code.

Let’s illustrate this with examples. As mentioned, the worst option is the following code:

from module_name import *
# ...
spam = function(foo, bar)

A better use case would be something in line with the following code:

from module_name import function
# ...
spam = function(foo, bar)

The best option is to import the module itself and qualify each name, as follows:

import module_name as mn
spam = mn.function(foo, bar)
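
The namespace pollution described in the quote can be reproduced with two standard-library modules. In this sketch both math and cmath export a function called sqrt, and the second wildcard import silently replaces the first:

```python
from math import *   # brings in sqrt, pi, floor, ...
from cmath import *  # silently rebinds sqrt, pi, ... to the cmath versions

# math.sqrt(-1) would raise ValueError; the name now refers to cmath.sqrt,
# which returns a complex number instead.
root = sqrt(-1)
```

Nothing in the line that calls sqrt reveals which module it came from, which is exactly the problem. Note that wildcard imports are only allowed at module level in Python 3.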

Contributors

Khaled Monsoor

Backend Engineer (Freelancer)

Python-based software developer. Big-data enthusiast. Photography and coffee aficionado.

Don’t Make Everything a Class.

In Python, overusing classes and making everything a class is considered bad practice. Jack Diederich, in his talk at PyCon 2012, pointed out that developers should stop creating classes and modules at every opportunity. Before creating one, developers should think hard; most likely, they would be much better off with a function.

As mentioned before, read Zen of Python:

  • Beautiful is better than ugly
  • Explicit is better than implicit
  • Simple is better than complex
  • Flat is better than nested
  • Readability counts
  • If the implementation is hard to explain, it’s a bad idea

Let’s show this with a few examples. The code snippet below is perfectly valid, yet it is a perfect example of a useless class:

class Greeting(object):
    def __init__(self, greeting='hello'):
        self.greeting = greeting
 
    def greet(self, name):
        return '%s! %s' % (self.greeting, name)
 
greeting = Greeting('hola')
print(greeting.greet('bob'))

That code could be rewritten more simply as a function, as shown in the code below:

def greet(greeting, target):
    return '%s! %s' % (greeting, target)
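
If the only reason for the class was to remember the greeting, functools.partial from the standard library covers that too. A minimal sketch, reusing the greet function from the example above (the name hola is invented here):

```python
from functools import partial


def greet(greeting, target):
    return '%s! %s' % (greeting, target)


# partial() pre-fills the first argument, replacing the class instance
# that only existed to remember the greeting.
hola = partial(greet, 'hola')
```

Calling hola('bob') then behaves like Greeting('hola').greet('bob') in the class-based version.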

In his talk, Jack Diederich showed a practical example of how he simplified, or rather refactored, an API’s complete code, consisting of:

  • 1 Package, 22 Modules
  • 20 Classes
  • 660 Source Lines of Code

into the code below, a grand total of 8 lines. Yes, you read it right, into just 8 lines of code:

MUFFIN_API = url='https://api.wbsrvc.com/%s/%s/'
MUFFIN_API_KEY = 'SECRET-API-KEY'
 
def request(noun, verb, **params):
    headers = {'apikey' : MUFFIN_API_KEY}
    request = urllib2.Request(MUFFIN_API % (noun, verb),
                              urllib.urlencode(params), headers)
    return json.loads(urllib2.urlopen(request).read())

In the end: stop reinventing the wheel, use more built-in library functions, and use far fewer long class hierarchies.

Contributors

Khaled Monsoor

Backend Engineer (Freelancer)

Python-based software developer. Big-data enthusiast. Photography and coffee aficionado.

Please, Write Unit Tests and Doctests.

Skipping automated tests just for the sake of shipping quickly is a bad practice. Sooner or later a bug will surface, and it will happen on the production server, causing downtime for customers, all because a “completely” manually tested new feature broke something “almost” unrelated. Maybe after many sleepless nights the development team will find the bug, but by then it will be too late.

The consequences can be catastrophic: a company can lose millions and even go out of business.

This whole mess could be avoided if developers followed the best practice of writing unit tests and doctests, and ran the tests across the whole project after implementing each new feature.

The online book Dive into Python 3 has an excellent introduction on unit-testing. Another good start is in The Hitchhiker’s Guide to Python!.
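
To give a feel for how little code this takes, here is a small sketch that combines both styles. The slugify function is invented for the example; the doctest lives in its docstring and the unit test in a unittest.TestCase subclass.

```python
import doctest
import unittest


def slugify(title):
    """Turn a title into a lowercase, dash-separated slug.

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  Python   Tips ")
    'python-tips'
    """
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    def test_empty_title(self):
        self.assertEqual(slugify(""), "")
```

Assuming the code is saved as a module, the docstring examples can be run with python -m doctest on the file, and the test case with python -m unittest.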

Contributors

Khaled Monsoor

Backend Engineer (Freelancer)

Python-based software developer. Big-data enthusiast. Photography and coffee aficionado.

In Python-Verse, `try: except: else` Construct Is a Natural Control Flow.

If you are coming from the C++ or Java world, confusion around try: except: else is natural. Python, however, uses this construct very differently from C++ or Java. In Python, the try: except: else construct is considered good practice, as it helps to realize one of the core Python philosophies: “It is easier to ask for forgiveness than permission”, also known as the “EAFP paradigm”.

Trying to avoid this practice results in messy, unpythonic code. On StackOverflow, core Python developer Raymond Hettinger described the philosophy behind it:

In the Python world, using exceptions for flow control is common and normal. Even the Python core developers use exceptions for flow-control and that style is heavily baked into the language (i.e. the iterator protocol uses StopIteration to signal loop termination). In addition, the try-except-style is used to prevent the race-conditions inherent in some of the “look-before-you-leap” constructs.

For example, testing os.path.exists results in information that may be out-of-date by the time you use it. Likewise, Queue.full returns information that may be stale. The try-except-else style will produce more reliable code in these cases. In some other languages, that rule reflects their cultural norms as reflected in their libraries. The “rule” is also based in-part on performance considerations for those languages.

You can also consider checking out this Q&A on StackOverflow on the same premise.
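
A minimal sketch of the construct, assuming Python 3 and a hypothetical load_settings helper: the try block contains only the call that is expected to fail, and the else block holds the code that should run when it did not.

```python
import json


def load_settings(path):
    """Return the parsed JSON settings in path, or {} if the file is missing."""
    try:
        handle = open(path)  # only this line is expected to fail
    except FileNotFoundError:
        return {}
    else:
        # Runs only when open() succeeded. Errors raised here (for example
        # invalid JSON) are not mistaken for a missing file.
        with handle:
            return json.load(handle)
```

Compared with checking os.path.exists first, there is no window between the check and the open in which the file could disappear.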

Contributors

Khaled Monsoor

Backend Engineer (Freelancer)

Python-based software developer. Big-data enthusiast. Photography and coffee aficionado.

Why Should I Use Generator Comprehensions?

By using parentheses () instead of square brackets [] for comprehensions, we tell Python to create a generator rather than a list. This can be very useful if the full list is not needed, or if it is expensive to build because the list is long, each object is big, or the conditional is expensive to compute.

My main use case for generator comprehensions, at the time of writing, is picking a single object from a group of objects under some conditional, when I expect many objects to satisfy the conditional but only need one, as in the example below:

short_list = range(10)
%%timeit
y = [x for x in short_list if x > 5][0]

1000000 loops, best of 3: 601 ns per loop

The square brackets tell Python to build the whole list [6, 7, 8, 9] and then pick off the 0th element. With a list this small, it is not a big deal.

Now let’s take a look at the following example:

short_list = range(10)
%%timeit
y = next(x for x in short_list if x > 5)

1000000 loops, best of 3: 910 ns per loop

The parentheses tell Python it is another comprehension, but to create a generator for it instead. That is, an iterable is set up so that it can be used to generate the same list as above, but only as far as you tell it. The built-in next() (spelled as the method .next() on the generator in Python 2) asks the generator for its first value. Since only the first value (6 in this case) is pulled off and saved, the generator goes away and the remaining 7, 8, and 9 are never computed.

Given the smallness of the data and the extra overhead, this works out to be slower than simply using the list. But let’s take a look at what happens with a longer list:

long_list = range(100000)
%%timeit
y = [x for x in long_list if x > 5][0]

100 loops, best of 3: 6.35 ms per loop

Again, this list comprehension first generates the full list [6, 7, 8, … 99998, 99999], then picks off the first value.

The same example using generator comprehension:

long_list = range(100000)
%%timeit
y = next(x for x in long_list if x > 5)

1000000 loops, best of 3: 931 ns per loop

As before, the generator comprehension just sets up an iterator, and next() (the method .next() in Python 2) asks it for its first value. The generator then runs until it can deliver one: it checks 0, which fails the conditional, then 1, which fails again, and keeps going until it finds the first value that passes the conditional (again, 6), which is returned and saved into y. The remaining 99993 values are never checked. The result is only slightly slower than the short_list of length 10 and several thousand times faster than the list comprehension over long_list.
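
The early exit can be verified without timing anything. In the sketch below, the helper is_big and the list checked are made up to record which values the condition was asked about; the second call shows the optional default argument of next(), which avoids a StopIteration when nothing matches.

```python
checked = []  # every value the condition has been evaluated for


def is_big(x):
    checked.append(x)
    return x > 5


# Stops at the first match instead of scanning all 100000 values.
first = next((x for x in range(100000) if is_big(x)), None)

# With a default, an exhausted generator yields None instead of raising.
none_found = next((x for x in range(10) if x > 50), None)
```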

Contributors

Allen Gary Grimm

Freelance Python Developer
United States

Fascinated by the intersection of abstraction and reality, Allen found his calling in data science. Formally trained in machine learning plus a breadth of experience in applying ML as prototypes up through production, his specialty is in finding and implementing tractable solutions to complex data modeling problems: e.g., user behavior prediction, recommender systems, NLP, spam filters, deduplication, or feature engineering.

Be Consistent About Indentation in the Same Python File.

Indentation level in Python is really important, and mixing tabs and spaces is neither smart nor recommended. Python 3 goes further and raises a TabError for a file that mixes tabs and spaces inconsistently, while Python 2 interprets tabs as if they were converted to spaces using 8-space tab stops. So, in Python 2, you may have no clue at which indentation level a specific line is actually being considered.

For any code you think someone else will someday read or use, you should stick with PEP 8, or your team-specific coding style, to avoid confusion. PEP 8 strongly discourages mixing tabs and spaces in the same file.

For further information, check out this Q&A on StackExchange:

  1. The first downside is that it quickly becomes a mess

… Formatting should be the task of the IDE. Developers have already enough work to care about the size of tabs, how much spaces will an IDE insert, etc. The code should be formatted correctly, and displayed correctly on other configurations, without forcing developers to think about it.

Also, remember this:

Furthermore, it can be a good idea to avoid tabs altogether, because the semantics of tabs are not very well-defined in the computer world, and they can be displayed completely differently on different types of systems and editors. Also, tabs often get destroyed or wrongly converted during copy-paste operations, or when a piece of source code is inserted into a web page or other kind of markup code.
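
Python 3’s behavior is easy to check with the built-in compile() function. In this sketch the source string is made up so that one line is indented with a tab and the next with eight spaces, which look identical in many editors:

```python
mixed_source = (
    "if True:\n"
    "\tx = 1\n"        # indented with one tab
    "        y = 2\n"  # indented with eight spaces
)

try:
    compile(mixed_source, "<example>", "exec")
except TabError as exc:
    tab_error = exc  # inconsistent use of tabs and spaces
else:
    tab_error = None
```

TabError is a subclass of IndentationError, so the problem is reported before any of the code runs.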

Contributors

Khaled Monsoor

Backend Engineer (Freelancer)

Python-based software developer. Big-data enthusiast. Photography and coffee aficionado.

Learn to Use Python Dictionary

The dictionary is a powerful and easy-to-use data structure for storing data in Python. Dictionaries are found in other languages as “associative memories” or “associative arrays”. In Python, a dictionary is a collection of key: value pairs, with the requirement that each key is unique. Historically dictionaries were unordered; since Python 3.7 they preserve insertion order.

How should we use a dictionary in everyday development?

Let’s examine some common use cases with accompanying code examples. Let’s say you use many if/else clauses in your code:

if name == "John":
   print("This is John, he is an artist")
elif name == "Ted":
   print("This is Ted, he is an engineer")
elif name == "Kennedy":
   print("This is Kennedy, he is a teacher")

By using a dictionary, we can write the same code like this:

name_job_dict = {
   "John": "This is John, he is an artist",
   "Ted": "This is Ted, he is an engineer",
   "Kennedy": "This is Kennedy, he is a teacher",
}
print(name_job_dict[name])
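
One behavioral difference is worth noting: the if/elif chain silently does nothing for an unknown name, whereas indexing the dictionary raises KeyError. A sketch using dict.get() with a fallback message (the describe function and the message text are invented for this example):

```python
name_job_dict = {
    "John": "This is John, he is an artist",
    "Ted": "This is Ted, he is an engineer",
    "Kennedy": "This is Kennedy, he is a teacher",
}


def describe(name):
    # get() returns its second argument instead of raising KeyError.
    return name_job_dict.get(name, "Sorry, I don't know " + name)
```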

The second use case is when we need a default value for a key:

def count_duplicates(numbers):
   result = {}
   for number in numbers:
       if number not in result:  # explicit check for a missing key
           result[number] = 0
       result[number] += 1
   return result

By using setdefault, we get cleaner code:

def count_duplicates(numbers):
   result = {}
   for number in numbers:
       result.setdefault(number, 0)  # this is clearer
       result[number] += 1
   return result
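
For this particular counting task, the standard library already provides a ready-made dictionary subclass, collections.Counter. A minimal sketch of the same function built on it:

```python
from collections import Counter


def count_duplicates(numbers):
    # Counter is a dict subclass that tallies hashable items.
    return Counter(numbers)
```

The result supports normal dictionary access and adds helpers such as most_common().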

We can also use the dictionary to manipulate lists:

>>> characters = {'a': 1, 'b': 2}
>>> list(characters.items())  # a list of the dictionary’s (key, value) pairs (https://docs.python.org/2/library/stdtypes.html#dict.items)
[('a', 1), ('b', 2)]
>>> characters = [['a', 1], ['b', 2], ['c', 3]]
>>> dict(characters)  # return a new dictionary initialized from a list (https://docs.python.org/2/library/stdtypes.html#dict)
{'a': 1, 'b': 2, 'c': 3}

If necessary, it is easy to change a dictionary form by switching the keys and values – changing {key: value} to a new dictionary {value: key} – also known as inverting the dictionary:

>>> characters = {'a': 1, 'b': 2, 'c': 3}
>>> invert_characters = {v: k for k, v in characters.items()}
>>> invert_characters
{1: 'a', 2: 'b', 3: 'c'}

The final tip is related to exceptions, one of the most annoying of which is KeyError. To avoid it, developers must first check whether a key exists in the dictionary, or catch the exception:

>>> character_occurrences = {'a': [], 'b': []}
>>> character_occurrences['c']
KeyError: 'c'
>>> if 'c' not in character_occurrences:
...     character_occurrences['c'] = []
>>> character_occurrences['c']
[]
>>> try:
...     print(character_occurrences['d'])
... except KeyError:
...     print("There is no character `d` in the string")

However, clean and easily testable code is easier to achieve when such exception handling and conditions are avoided where possible. The same logic is cleaner and easier to understand with defaultdict from the collections module:

>>> from collections import defaultdict
>>> character_occurrences = defaultdict(list)
>>> character_occurrences['a']
[]
>>> character_occurrences['b'].append(10)
>>> character_occurrences['b'].append(11)
>>> character_occurrences['b']
[10, 11]

Contributors

Vu Quang Hoa

Freelance Python Developer
Vietnam

Hoa, nicknamed Joe, is a brilliant engineer who is capable of grasping new concepts very quickly. His most striking quality is the commitment he shows in whatever he does. He specializes in full-stack, highly scalable Python-Django applications, with experience in Java and PHP. He has over two years of experience developing applications on the Django framework at the StoryTree company—one of the top 500 startups in the US.
