Advanced Django ORM Concepts

30. The N+1 Problem and Query Optimization

What is the N+1 Problem?

The N+1 problem is a common performance issue that occurs when fetching a list of objects (N items), where each object needs related field data, causing Django to run:

  • 1 query for the base list
  • N queries for related fields

πŸ‘‰ Total = N+1 queries

This happens because Django fetches related fields lazily by default.

Example 1: Forward ForeignKey

class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    title = models.CharField(max_length=100)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

# This causes N+1 problem
books = Book.objects.all()
for book in books:
    print(book.title, book.author.name)  # N+1 queries!

SQL Generated:

SELECT * FROM book;                     -- 1 query
SELECT * FROM author WHERE id=...;      -- repeated for each book

If you have 1000 books β†’ 1001 queries! 😱

Example 2: Reverse ForeignKey

authors = Author.objects.all()
for author in authors:
    print(author.name, [b.title for b in author.book_set.all()])

SQL Generated:

SELECT * FROM author;                        -- 1 query
SELECT * FROM book WHERE author_id=...;      -- repeated for each author

If you have 200 authors β†’ 201 queries!

Example 3: ManyToMany

class Category(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    title = models.CharField(max_length=100)
    categories = models.ManyToManyField(Category)

books = Book.objects.all()
for book in books:
    print(book.title, [c.name for c in book.categories.all()])

SQL Generated:

SELECT * FROM book;                            -- 1 query
SELECT * FROM category JOIN book_categories... -- repeated for each book

If you have 500 books β†’ 501 queries!

Fast Facts About Django ORM

When using operations like:

Book.objects.filter(author__name='John Doe')

Consider the one-to-many relationship between Author and Book. Django will use JOIN clause to check the condition.

  • Works with ForeignKey and OneToOneField
  • Uses SQL JOIN to fetch related objects in the same query
  • Fetches related object in the same query

Basic Usage

# Solution: Use select_related
books = Book.objects.select_related("author")
for book in books:
    print(book.title, book.author.name)  # Only 1 query!

SQL Generated:

SELECT book.id, book.title,
       author.id, author.name
FROM book
INNER JOIN author ON book.author_id = author.id;

πŸ‘‰ Still 1 query total even for 1000 books!

# Multiple fields - still 1 query with multiple JOINs
Book.objects.select_related("author", "publisher", "editor")

πŸ‘‰ Efficient when all are FK/O2O relationships.

  • Works with ManyToMany and reverse ForeignKey relationships
  • Django runs 2 queries (base + related), then links results in Python
  • Avoids row explosion (cartesian product)

Basic Usage

# Solution: Use prefetch_related
authors = Author.objects.prefetch_related("book_set")
for author in authors:
    print(author.name, [b.title for b in author.book_set.all()])

SQL Generated:

SELECT * FROM author;                      -- 1st query
SELECT * FROM book WHERE author_id IN (...); -- 2nd query

πŸ‘‰ Always 2 queries, regardless of author count.

ManyToMany Example

books = Book.objects.prefetch_related("categories")
for book in books:
    print(book.title, [c.name for c in book.categories.all()])

SQL Generated:

SELECT * FROM book;                                    -- 1st query
SELECT * FROM category 
JOIN book_categories ON category.id = book_categories.category_id 
WHERE book_categories.book_id IN (...);                -- 2nd query
# Each gets its own extra query
Book.objects.prefetch_related("categories", "tags", "languages")

πŸ‘‰ 1 base query + 3 prefetch queries = 4 queries total.

34. What Happens After Query Execution

Without Optimization

  • Related objects are not loaded
  • Each .author, .book_set.all(), .categories.all() triggers new SQL
  • Related objects are joined at query time
  • Accessing .author does not trigger new SQL
  • Django stores a cache of related objects
  • First .book_set.all() comes from memory, not database

35. Query Optimization Summary Table

Relation TypeDefault (lazy)With Optimization
FK (forward)N+1 queriesselect_related β†’ 1 query
O2ON+1 queriesselect_related β†’ 1 query
FK (reverse)N+1 queriesprefetch_related β†’ 2 queries
M2MN+1 queriesprefetch_related β†’ 2 queries
Multiple FKs/O2OsN+1 queriesselect_related (many JOINs) β†’ 1 query
Multiple M2MsN+1 queriesprefetch_related (extra per field)

Important Notes

  • The order of filter() and select_related() chaining isn’t important:
# These are equivalent
Entry.objects.filter(pub_date__gt=timezone.now()).select_related("blog")
Entry.objects.select_related("blog").filter(pub_date__gt=timezone.now())
  • Caching behavior: Once you use select_related, the related objects are cached:
# Hits the database with joins to the author and hometown tables.
b = Book.objects.select_related("author").get(id=4)
p = b.author         # Doesn't hit the database.

# Without select_related()...
b = Book.objects.get(id=4)  # Hits the database.
p = b.author                # Hits the database.