Optimize Your Python Code Even If You’re a Beginner

Image by Author | Ideogram

Let’s be honest. When you’re learning Python, you’re probably not thinking about performance. You’re just trying to get your code to work! But here’s the thing: making your Python code faster doesn’t require you to become an expert programmer overnight.

With a few simple techniques that I’ll show you today, you can improve your code’s speed and memory usage significantly.

In this article, we’ll walk through 5 practical, beginner-friendly optimization techniques together. For each one, I’ll show you the “before” code (the way many beginners write it), the “after” code (the optimized version), and explain exactly why the improvement works and how much faster it gets.

🔗 Link to the code on GitHub

1. Replace Loops with List Comprehensions

Let’s start with something you probably do all the time: creating new lists by transforming existing ones. Most beginners reach for a for loop, but Python has a much faster way to do this.

Before Optimization

Here’s how most beginners would square a list of numbers:

import time

def square_numbers_loop(numbers):
    result = []
    for num in numbers:
        result.append(num ** 2)
    return result

# Let's test this with 1,000,000 numbers to see the performance
test_numbers = list(range(1000000))

start_time = time.time()
squared_loop = square_numbers_loop(test_numbers)
loop_time = time.time() - start_time
print(f"Loop time: {loop_time:.4f} seconds")

This code creates an empty list called result, then loops through each number in our input list, squares it, and appends it to the result list. Pretty straightforward, right?

After Optimization

Now let’s rewrite this using a list comprehension:

def square_numbers_comprehension(numbers):
    return [num ** 2 for num in numbers]  # Create the entire list in one line

start_time = time.time()
squared_comprehension = square_numbers_comprehension(test_numbers)
comprehension_time = time.time() - start_time
print(f"Comprehension time: {comprehension_time:.4f} seconds")
print(f"Improvement: {loop_time / comprehension_time:.2f}x faster")

This single line [num ** 2 for num in numbers] does exactly the same thing as our loop, but it’s telling Python “create a list where each element is the square of the corresponding element in numbers.”

Output:

Loop time: 0.0840 seconds
Comprehension time: 0.0736 seconds
Improvement: 1.14x faster

Performance improvement: List comprehensions are often cited as 30-50% faster than equivalent loops, though the run above shows a more modest gain of about 14%. The exact figure depends on your Python version and on how much work each iteration does. The improvement is more noticeable when you work with very large iterables.

Why does this work? A list comprehension is compiled to specialized bytecode that appends each value directly to the new list. That avoids a lot of the overhead of an explicit loop, such as looking up result.append and calling it as a method on every iteration.
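To make the explanation above a little more concrete, here is a minimal sketch that inspects the bytecode of both versions with the standard dis module. The helper opnames is a hypothetical name introduced for this illustration, and opcode names other than LIST_APPEND differ between CPython versions, so treat the details as an assumption about CPython rather than a guarantee.

```python
import dis
import types

def square_loop(numbers):
    result = []
    for num in numbers:
        result.append(num ** 2)
    return result

def square_comprehension(numbers):
    return [num ** 2 for num in numbers]

def opnames(code):
    """Collect opcode names from a code object and any nested code objects.

    Hypothetical helper: older CPython versions compile a comprehension
    into a nested code object, so we walk co_consts as well.
    """
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names

loop_ops = opnames(square_loop.__code__)
comp_ops = opnames(square_comprehension.__code__)

# The comprehension uses the dedicated LIST_APPEND opcode, while the
# explicit loop looks up and calls the append method instead.
print("LIST_APPEND in loop version:", "LIST_APPEND" in loop_ops)
print("LIST_APPEND in comprehension:", "LIST_APPEND" in comp_ops)
```

Both functions return the same list; only the instructions used to build it differ.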

2. Choose the Right Data Structure for the Job

This one’s huge, and it’s something that can make your code hundreds of times faster with just a small change. The key is understanding when to use lists versus sets versus dictionaries.

Before Optimization

Let’s say you want to find common elements between two lists. Here’s the intuitive approach:

def find_common_elements_list(list1, list2):
    common = []
    for item in list1:  # Go through each item in the first list
        if item in list2:  # Check if it exists in the second list
            common.append(item)  # If yes, add it to our common list
    return common

# Test with reasonably large lists
large_list1 = list(range(10000))
large_list2 = list(range(5000, 15000))

start_time = time.time()
common_list = find_common_elements_list(large_list1, large_list2)
list_time = time.time() - start_time
print(f"List approach time: {list_time:.4f} seconds")

This code loops through the first list, and for each item, it checks if that item exists in the second list using if item in list2. The problem? When you do item in list2, Python has to scan the second list element by element until it finds the item, and it scans the whole list when the item isn’t there. That’s slow!
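To see how much work a single list membership test can involve, here is a small sketch that mimics the scan and counts the comparisons. The function count_comparisons is a hypothetical helper written for this illustration; it is not how Python implements the in operator internally, only a model of the same linear search.

```python
def count_comparisons(items, target):
    """Mimic `target in items` for a list, counting equality checks."""
    comparisons = 0
    for candidate in items:
        comparisons += 1
        if candidate == target:
            break  # Found it, so stop scanning
    return comparisons

haystack = list(range(5000, 15000))  # Same shape as large_list2

first_hit = count_comparisons(haystack, 5000)   # Target is the first element
last_hit = count_comparisons(haystack, 14999)   # Target is the last element
miss = count_comparisons(haystack, 0)           # Target is not in the list

print(first_hit, last_hit, miss)  # 1 10000 10000
```

A missing item costs a full pass over the list, and the original function repeats that pass for every item in the first list.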

After Optimization

Here’s the same logic, but using a set for faster lookups:

def find_common_elements_set(list1, list2):
    set2 = set(list2)  # Convert list to a set (one-time cost)
    return [item for item in list1 if item in set2]  # Check membership in set

start_time = time.time()
common_set = find_common_elements_set(large_list1, large_list2)
set_time = time.time() - start_time
print(f"Set approach time: {set_time:.4f} seconds")
print(f"Improvement: {list_time / set_time:.2f}x faster")

First, we convert the list to a set. Then, instead of checking if item in list2, we check if item in set2. This tiny change makes membership testing nearly instantaneous.

Output:

List approach time: 0.8478 seconds
Set approach time: 0.0010 seconds
Improvement: 863.53x faster

Performance improvement: This can be on the order of 100x faster for large datasets, and the gap widens as the lists grow (it was over 800x in the run above).

Why does this work? Sets use hash tables under the hood. When you check if an item is in a set, Python doesn’t search through every element; it uses the item’s hash to jump directly to where the item should be. It’s like having a book’s index instead of reading every page to find what you want.
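The same hash-based lookup also powers set operators and dictionaries. The sketch below is an illustration under a couple of assumptions: you don’t need to preserve the original order or duplicates (sets drop both), and the prices dictionary is made-up sample data.

```python
list1 = list(range(10))
list2 = list(range(5, 15))

# Set intersection: concise, but the result is unordered and de-duplicated
common_unordered = set(list1) & set(list2)
print(sorted(common_unordered))  # [5, 6, 7, 8, 9]

# Dictionaries use the same hashing idea to map keys to values
prices = {"apple": 0.5, "banana": 0.25}  # Hypothetical sample data
print("banana" in prices)         # True
print(prices.get("cherry", 0.0))  # 0.0 (default when the key is missing)
```

If order matters, the list comprehension with a set lookup shown earlier is the safer choice, because it walks the first list in its original order.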

3. Use Python’s Built-in Functions Whenever Possible

Python comes with tons of built-in functions that are heavily optimized. Before you write your own loop or custom function to do something, check if Python already has a function for it.

Before Optimization

Here’s how you might calculate the sum and maximum of a list if you didn’t know about built-ins:

def calculate_sum_manual(numbers):
    total = 0
    for num in numbers:
        total += num
    return total

def find_max_manual(numbers):
    max_val = numbers[0]
    for num in numbers[1:]:
        if num > max_val:
            max_val = num
    return max_val

test_numbers = list(range(1000000))

start_time = time.time()
manual_sum = calculate_sum_manual(test_numbers)
manual_max = find_max_manual(test_numbers)
manual_time = time.time() - start_time
print(f"Manual approach time: {manual_time:.4f} seconds")

The sum function starts with a total of 0, then adds each number to that total. The max function starts by assuming the first number is the maximum, then compares every other number to see if it’s bigger.

After Optimization

Here’s the same thing using Python’s built-in functions:

start_time = time.time()
builtin_sum = sum(test_numbers)
builtin_max = max(test_numbers)
builtin_time = time.time() - start_time
print(f"Built-in approach time: {builtin_time:.4f} seconds")
print(f"Improvement: {manual_time / builtin_time:.2f}x faster")

That’s it! sum() gives the total of all numbers in the list, and max() returns the largest number. Same result, much faster.

Output:

Manual approach time: 0.0805 seconds
Built-in approach time: 0.0413 seconds
Improvement: 1.95x faster

Performance improvement: Built-in functions are often faster than manual implementations (about 2x in the run above).

Why does this work? In CPython, the built-in functions are written in C and heavily optimized, so the loop over the elements runs without interpreting Python bytecode for each step.
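sum() and max() are only two of the built-ins worth knowing. As a quick sketch, here are a few others that replace common hand-written loops. The values list is sample data chosen for this illustration.

```python
values = [3, 1, 4, 1, 5, 9, 2, 6]  # Sample data for illustration

smallest = min(values)                         # Instead of a manual "find minimum" loop
ordered = sorted(values)                       # Returns a new sorted list
has_large = any(v > 8 for v in values)         # True if at least one value matches
all_positive = all(v > 0 for v in values)      # True only if every value matches
sum_of_squares = sum(v * v for v in values)    # sum() accepts any iterable

print(smallest)        # 1
print(ordered)         # [1, 1, 2, 3, 4, 5, 6, 9]
print(has_large)       # True
print(all_positive)    # True
print(sum_of_squares)  # 173
```

any() and all() also stop as soon as the answer is known, which a hand-written loop only does if you remember to add a break.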

4. Perform Efficient String Operations with Join

String concatenation is something every programmer does, but most beginners do it in a way that can get much slower as strings get longer, because each step may copy everything built so far.

Before Optimization

Here’s how you might build a CSV string by concatenating with the + operator:

def create_csv_plus(data):
    result = ""  # Start with an empty string
    for row in data:  # Go through each row of data
        for i, item in enumerate(row):  # Go through each item in the row
            result += str(item)  # Add the item to our result string
            if i < len(row) - 1:  # If it's not the last item
                result += ","     # Add a comma
        result += "\n"  # Add a newline after each row
    return result

# Test data: 1000 rows with 10 columns each
test_data = [[f"item_{i}_{j}" for j in range(10)] for i in range(1000)]

start_time = time.time()
csv_plus = create_csv_plus(test_data)
plus_time = time.time() - start_time
print(f"String concatenation time: {plus_time:.4f} seconds")

This code builds our CSV string piece by piece. For each row, it goes through each item, converts it to a string, and adds it to our result. It adds commas between items and a newline after each row.

After Optimization

Here’s the same code using the join method:

def create_csv_join(data):
    # For each row, join the items with commas, then join all rows with newlines
    return "\n".join(",".join(str(item) for item in row) for row in data)

start_time = time.time()
csv_join = create_csv_join(test_data)
join_time = time.time() - start_time
print(f"Join method time: {join_time:.4f} seconds")
print(f"Improvement: {plus_time / join_time:.2f}x faster")

This single line does a lot! The inner part ",".join(str(item) for item in row) takes each row and joins all items with commas. The outer part "\n".join(...) takes all those comma-separated rows and joins them with newlines. Note that, unlike the loop version, it does not add a trailing newline after the last row.

Output:

String concatenation time: 0.0043 seconds
Join method time: 0.0022 seconds
Improvement: 1.94x faster

Performance improvement: String joining is much faster than concatenation for large strings.

Why does this work? When you use += to concatenate strings, Python may have to create a new string object each time because strings are immutable (CPython can sometimes avoid the copy, but you shouldn’t rely on it). With large strings, this becomes incredibly wasteful. The join method figures out exactly how much memory it needs upfront and builds the string once.
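When the logic is too involved for a one-liner, a common pattern is to collect the pieces in a list and call join once at the end. The sketch below assumes a made-up rule (skip empty rows) purely to justify the explicit loop; create_csv_parts is a hypothetical name, not part of the benchmark above.

```python
def create_csv_parts(data):
    """Build CSV text by collecting lines in a list and joining once."""
    lines = []
    for row in data:
        if not row:
            continue  # Hypothetical rule: skip empty rows
        # Appending to a list is cheap; no large string is copied here
        lines.append(",".join(str(item) for item in row))
    # One join at the end builds the final string in a single pass
    return "\n".join(lines)

sample = [[1, 2], [], ["a", "b"]]
print(repr(create_csv_parts(sample)))  # '1,2\na,b'
```

This keeps the readability of a normal loop while still building the final string only once.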

5. Use Generators for Memory-Efficient Processing

Sometimes you don’t need to store all your data in memory at once. Generators let you create data on-demand, which can save massive amounts of memory.

Before Optimization

Here’s how you might process a large dataset by storing everything in a list:

import sys

def process_large_dataset_list(n):
    processed_data = []
    for i in range(n):
        # Simulate some data processing
        processed_value = i ** 2 + i * 3 + 42
        processed_data.append(processed_value)  # Store each processed value
    return processed_data

# Test with 100,000 items
n = 100000
list_result = process_large_dataset_list(n)
list_memory = sys.getsizeof(list_result)
print(f"List memory usage: {list_memory:,} bytes")

This function processes numbers from 0 to n-1, applies some calculation to each one (squaring it, adding three times the number, and adding 42), and stores all results in a list. The problem is that we’re keeping all 100,000 processed values in memory at once.
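One caveat about this measurement: sys.getsizeof() reports only the list object itself (essentially its array of references), not the integers it refers to. The sketch below estimates a fuller figure. deep_size is a hypothetical helper written for this illustration, and because it counts shared objects more than once, its result is best read as a rough upper bound.

```python
import sys

def deep_size(values):
    """Rough size of a list plus the objects it references (hypothetical helper)."""
    # Shallow size: the list object and its array of references
    shallow = sys.getsizeof(values)
    # Add each referenced object's size; shared objects are counted
    # repeatedly, so this is an upper-bound estimate
    return shallow + sum(sys.getsizeof(v) for v in values)

data = [i ** 2 + i * 3 + 42 for i in range(100000)]
shallow_bytes = sys.getsizeof(data)
deep_bytes = deep_size(data)

print(f"List object only: {shallow_bytes:,} bytes")
print(f"List plus its integers (estimate): {deep_bytes:,} bytes")
```

So the real cost of holding every result in a list is higher than the single getsizeof() figure suggests.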

After Optimization

Here’s the same processing using a generator:

def process_large_dataset_generator(n):
    for i in range(n):
        # Simulate some data processing
        processed_value = i ** 2 + i * 3 + 42
        yield processed_value  # Yield each value instead of storing it

# Create the generator (this doesn't process anything yet!)
gen_result = process_large_dataset_generator(n)
gen_memory = sys.getsizeof(gen_result)
print(f"Generator memory usage: {gen_memory:,} bytes")
print(f"Memory improvement: {list_memory / gen_memory:.0f}x less memory")

# Now we can process items one at a time
total = 0
for value in process_large_dataset_generator(n):
    total += value
    # Each value is produced on demand and can be garbage collected afterwards

The key difference is yield instead of append. The yield keyword makes this a generator function: it produces values one at a time instead of creating them all at once.

Output:

List memory usage: 800,984 bytes
Generator memory usage: 224 bytes
Memory improvement: 3576x less memory

Performance improvement: Generators can use far less memory for large datasets.

Why does this work? Generators use lazy evaluation: they only compute values when you ask for them. The generator object itself is tiny; it just remembers where it is in the computation.
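Lazy evaluation is easy to observe directly. In this small sketch, tracked_squares is a hypothetical generator that records each index in the computed list at the moment it actually does the work, so you can see that nothing runs until a value is requested.

```python
computed = []  # Records which inputs have actually been processed

def tracked_squares(n):
    """Hypothetical generator that logs each index as it is computed."""
    for i in range(n):
        computed.append(i)
        yield i * i

gen = tracked_squares(5)
print(computed)        # [] (creating the generator runs no code yet)

first = next(gen)      # Runs the body up to the first yield
second = next(gen)     # Resumes where it left off
print(first, second)   # 0 1
print(computed)        # [0, 1] (only two values computed so far)

rest = list(gen)       # Consume whatever is left
print(rest)            # [4, 9, 16]

# A generator expression gives the same laziness without a function
total = sum(x * x for x in range(5))
print(total)           # 30
```

Keep in mind that a generator can only be consumed once; after list(gen), a further next(gen) raises StopIteration.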

Conclusion

Optimizing Python code doesn’t have to be intimidating. As we’ve seen, small changes in how you approach common programming tasks can yield dramatic improvements in both speed and memory usage. The key is developing an intuition for choosing the right tool for each job.

Remember these core principles: use built-in functions when they exist, choose appropriate data structures for your use case, avoid unnecessary repeated work, and be mindful of how Python handles memory. List comprehensions, sets for membership testing, string joining, and generators for large datasets are all tools that should be in every beginner Python programmer’s toolkit. Keep learning, keep coding!

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she’s working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.
