Advanced Django ORM Concepts

30. The N+1 Problem and Query Optimization

What is the N+1 Problem?

The N+1 problem is a common performance issue that occurs when fetching a list of objects (N items), where each object needs related field data, causing Django to run:

  • 1 query for the base list
  • N queries for related fields

Total = N+1 queries

This happens because Django fetches related fields lazily by default.

Using django-debug-toolbar

To identify N+1 problems, install and configure django-debug-toolbar. It shows the number of queries executed and their details.

Example 1: Forward ForeignKey

class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    title = models.CharField(max_length=100)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

# This causes N+1 problem
books = Book.objects.all()
for book in books:
    print(book.title, book.author.name)  # N+1 queries!

SQL Generated:

SELECT * FROM book;                     -- 1 query
SELECT * FROM author WHERE id=...;      -- repeated for each book

If you have 1000 books → 1001 queries!

Example 2: Reverse ForeignKey

authors = Author.objects.all()
for author in authors:
    print(author.name, [b.title for b in author.book_set.all()])

SQL Generated:

SELECT * FROM author;                        -- 1 query
SELECT * FROM book WHERE author_id=...;      -- repeated for each author

If you have 200 authors → 201 queries!

Example 3: ManyToMany

class Category(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    title = models.CharField(max_length=100)
    categories = models.ManyToManyField(Category)

books = Book.objects.all()
for book in books:
    print(book.title, [c.name for c in book.categories.all()])

SQL Generated:

SELECT * FROM book;                            -- 1 query
SELECT * FROM category JOIN book_categories... -- repeated for each book

If you have 500 books → 501 queries!

Fast Facts About Django ORM

When using operations like:

Book.objects.filter(author__name='John Doe')

Consider the one-to-many relationship between Author and Book. Django will use JOIN clause to check the condition.

  • Works with ForeignKey and OneToOneField
  • Uses SQL JOIN to fetch related objects in the same query
  • Fetches related object in the same query

Basic Usage

# Solution: Use select_related
books = Book.objects.select_related("author")
for book in books:
    print(book.title, book.author.name)  # Only 1 query!

SQL Generated:

SELECT book.id, book.title,
       author.id, author.name
FROM book
INNER JOIN author ON book.author_id = author.id;

Still 1 query total even for 1000 books!

# Multiple fields - still 1 query with multiple JOINs
Book.objects.select_related("author", "publisher", "editor")

Efficient when all are FK/O2O relationships.

  • Works with ManyToMany and reverse ForeignKey relationships
  • Django runs 2 queries (base + related), then links results in Python
  • Avoids row explosion (cartesian product)

Basic Usage

# Solution: Use prefetch_related
authors = Author.objects.prefetch_related("book_set")
for author in authors:
    print(author.name, [b.title for b in author.book_set.all()])

SQL Generated:

SELECT * FROM author;                      -- 1st query
SELECT * FROM book WHERE author_id IN (...); -- 2nd query

Always 2 queries, regardless of author count.

ManyToMany Example

books = Book.objects.prefetch_related("categories")
for book in books:
    print(book.title, [c.name for c in book.categories.all()])

SQL Generated:

SELECT * FROM book;                                    -- 1st query
SELECT * FROM category 
JOIN book_categories ON category.id = book_categories.category_id 
WHERE book_categories.book_id IN (...);                -- 2nd query
# Each gets its own extra query
Book.objects.prefetch_related("categories", "tags", "languages")

1 base query + 3 prefetch queries = 4 queries total.

33. Query Optimization Summary Table

Relation TypeDefault (lazy)With Optimization
FK (forward)N+1 queriesselect_related → 1 query
O2ON+1 queriesselect_related → 1 query
FK (reverse)N+1 queriesprefetch_related → 2 queries
M2MN+1 queriesprefetch_related → 2 queries

Important Notes

  • The order of filter() and select_related() chaining isn’t important:
# These are equivalent
Entry.objects.filter(pub_date__gt=timezone.now()).select_related("blog")
Entry.objects.select_related("blog").filter(pub_date__gt=timezone.now())

You can use both select_related and prefetch_related in the same query to optimize different relationships:

entries = Entry.objects.select_related("blog").prefetch_related("authors")

35. GROUP BY

We can perform GROUP BY operation using the values() along with annotate() method.

Models.objects.values('field1', 'field2').annotate(aggregation_function('field3'))

This means we are grouping by field1 and field2 and applying the aggregation function on field3.

from django.db.models import Sum
sales_data = Sales.objects.values('name').annotate(total_sales=Sum('amount'))

This will group the sales data by the ’name’ field and calculate the total sales for each name.

from django.db.models import Avg
average_scores = Student.objects.values('grade').annotate(avg_score=Avg('marks__obtained'))

This will group the students by their grade and calculate the average score for each grade.

from django.db.models import Count
customer_countryCount = Customer.objects.values('country').annotate(count=Count('profile__id'))

This will group the customers by their country and count the number of profiles for each country.

36. Transactions in Django ORM

Django provides powerful tools to manage database integrity in the presence of concurrent access. Two of the most important concepts are transactions and row-level locking. Using them correctly prevents race conditions, ensures data consistency, and allows safe concurrent updates.

Understanding Transactions

A transaction is a logical unit of work treated as a single “all-or-nothing” operation in the database. Django exposes transactions through the transaction module.

The ACID Principles

Every transaction follows the ACID principles:

  • Atomicity: All operations succeed, or none do.
  • Consistency: Data remains consistent after the transaction.
  • Isolation: Concurrent transactions do not interfere with each other.
  • Durability: Committed changes persist even if the system crashes.

In Django, you use transactions via transaction.atomic():

from django.db import transaction

with transaction.atomic():
    # All operations here form one transaction
    wallet.balance -= 100
    wallet.save()

If any line inside the block raises an exception, all changes are rolled back.

The Need for Row-Level Locks

Imagine an e-commerce website where two customers try to purchase the last unit of a product simultaneously. Without proper locking:

# BAD: No lock — race condition!
product = Product.objects.get(id=1)
if product.stock > 0:
    product.stock -= 1
    product.save()

Two requests may read stock = 1 at the same time. Both subtract 1, resulting in stock = -1. This is a race condition, a classic problem in concurrent systems.

Django’s select_for_update

Django provides row-level locking using select_for_update(). When combined with transaction.atomic(), it ensures that the selected rows are locked until the transaction is complete.

from django.db import transaction

with transaction.atomic():
    product = Product.objects.select_for_update().get(id=1)
    if product.stock <= 0:
        raise ValueError("Out of stock")
    product.stock -= 1
    product.save()

How it Works:

  1. Transaction begins (atomic)
  2. Row is locked (SELECT ... FOR UPDATE)
  3. Any other transaction trying to lock the same row waits until the first finishes
  4. Transaction commits → lock released

When to Use select_for_update

Use row-level locks when:

  • Updating counters (like likes or votes)
  • Managing financial transactions
  • Reserving inventory
  • Booking tickets or seats
  • Scheduling tasks (to prevent double assignments)

Practical Example: Wallet Transfer

A classic example is transferring money between two users:

from django.db import transaction

@transaction.atomic
def transfer(sender_id, receiver_id, amount):
    sender = Wallet.objects.select_for_update().get(id=sender_id)
    receiver = Wallet.objects.select_for_update().get(id=receiver_id)

    if sender.balance < amount:
        raise ValueError("Insufficient balance")

    sender.balance -= amount
    receiver.balance += amount

    sender.save()
    receiver.save()

Without select_for_update, concurrent transfers could overwrite each other, causing lost updates.

Concurrency Behavior

Consider two concurrent transactions:

T1: Locks row (wallet id=1)
T2: Tries to lock same row → waits
T1: Updates & commits → lock released
T2: Continues → reads updated balance

This ensures data integrity even under heavy concurrent load.

37. Bulk Create

Django’s bulk_create() method allows you to efficiently insert multiple records into the database in a single query. This is much faster than creating and saving each object individually.

# Example of bulk_create
new_books = [
    Book(title='Book 1', author=author1),
    Book(title='Book 2', author=author2),
    Book(title='Book 3', author=author3),
]
Book.objects.bulk_create(new_books)

This will generate a single SQL INSERT statement that inserts all three books at once, significantly improving performance compared to multiple individual inserts.

38. Bulk Update

Django’s bulk_update() method allows you to efficiently update multiple records in the database in a single query. This is much faster than updating each object individually.

# Example of bulk_update
books_to_update = Book.objects.filter(author=author1)