
SEO friendly urls in Django

Example sites built with Django usually use primary keys as the URL identifier for a resource. While this is the simplest and cleanest approach to resource urls, it's not optimized for SEO.

It's debatable whether keywords in urls matter for Google, but most of the major players have clearly adopted this technique. Below are five examples of popular sites that all use keywords in urls to improve their SEO.

If you poke around the top 100 sites as ranked by Alexa, you will notice that most sites with public-facing content use some sort of keyword approach in their urls.

Besides being a possible SEO strategy, it also improves the chances that a user will click on your link over other links. Both Chrome and Firefox match and highlight the parts of urls that match the typed keywords if the user has visited the site previously.

Google search bar

Show me the code

Django doesn't have views that use keywords in urls out of the box, but it's trivial to add. I'll go through a basic example with job postings that we use at findwork.dev.

Suppose we have a model for job postings that has a company name and a role.

# models.py
from django.db import models

class JobPosting(models.Model):
    company_name = models.CharField(max_length=50)
    role = models.CharField(max_length=50)

We have a view which shows each job by primary key in the url:

# views.py

from django.http import HttpResponse
from django.shortcuts import get_object_or_404

from .models import JobPosting


def job_posting_view(request, pk):
    job = get_object_or_404(JobPosting, pk=pk)
    html = f"""
    <html>
    <body>
    <h1>{job.role} at {job.company_name}</h1>
    </body>
    </html>
    """
    return HttpResponse(html)

In urls.py:

# urls.py

from django.urls import path
from web import views

urlpatterns = [
    path("<int:pk>", views.job_posting_view, name="job-posting"),
]

The urls we now use identify a resource by primary key. We want to add keywords to the url but still identify the resource by primary key, both so that we don't break old urls and because the primary key guarantees uniqueness. Otherwise two job postings with the same role and company_name would conflict with each other.
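The uniqueness argument can be checked with a tiny, framework-free sketch. The sample postings and the url shapes below are made up for illustration; the keywords are assumed to be slugified already.

```python
# Hypothetical sample data: (primary key, already-slugified keywords).
postings = [
    (1, "python-developer-at-acme"),
    (2, "python-developer-at-acme"),  # same role and company, different posting
]

# Urls built from keywords alone vs. urls prefixed with the primary key.
keyword_only_urls = {f"/{slug}" for _, slug in postings}
pk_and_keyword_urls = {f"/{pk}/{slug}" for pk, slug in postings}

# Keyword-only urls collide: both postings collapse into a single url.
print(len(keyword_only_urls))    # 1
# Prefixing the primary key keeps every url unique.
print(len(pk_and_keyword_urls))  # 2
```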

The solution is straightforward. We'll generate keywords from the company_name and role and append them to the end of the url. Running Django's slugify on the two fields eliminates whitespace and other characters that are not permitted in urls. Then we use the <slug> path converter to match the slugified company_name + role. It's better explained with code:

urls.py

from django.urls import path
from web import views

urlpatterns = [
    # Match urls based on ID
    path("<int:pk>", views.job_posting_view, name="job-posting-id"),
    # Match urls based on ID + keywords
    path("<int:pk>/<slug:slug>", views.job_posting_view, name="job-posting-keywords"),
]

views.py

from django.http import HttpResponse
from django.shortcuts import get_object_or_404

from .models import JobPosting


# slug is optional so that the ID-only url pattern can reuse this view.
def job_posting_view(request, pk, slug=None):
    job = get_object_or_404(JobPosting, pk=pk)
    html = f"""
    <html>
    <body>
    <h1>{job.role} at {job.company_name}</h1>
    </body>
    </html>
    """
    return HttpResponse(html)

That's the simplest solution I've come up with. However, it suffers from one serious drawback for SEO. We now have two different urls that identify a single resource. Search engines will pick one version at random to display in search results, and they will attribute backlinks and link equity to one of the urls at random. The same goes for other sites that link to you: they may link to either url. Your content becomes diluted with regard to "SEO points", which hurts your page rank.

To circumvent this we need to ensure that there exists only one way to identify a resource. Since we want search engines to index the keyword-based url we want that to be considered the single source of truth.

In short:

  • we are going to use two urls - one identifiable by primary key and one by primary key + keywords
  • the primary-key based url will redirect to the keyword based url to ensure the search engines attribute all the link juice to our keyword based url.
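The two rules above boil down to one comparison between the requested path and the canonical path. Here is a minimal, framework-free sketch of that decision; the function name redirect_target and the sample paths are hypothetical and only serve as illustration.

```python
from typing import Optional


def redirect_target(request_path: str, canonical_path: str) -> Optional[str]:
    """Return the path to 301-redirect to, or None if no redirect is needed."""
    if request_path != canonical_path:
        return canonical_path
    return None


# Hypothetical canonical url for the posting with primary key 42.
canonical = "/42/python-developer-at-acme"

print(redirect_target("/42", canonical))                       # ID-only url: redirect
print(redirect_target("/42/old-title-at-acme", canonical))     # outdated keywords: redirect
print(redirect_target(canonical, canonical))                   # already canonical: None
```

A side effect of comparing whole paths is that urls with outdated or mistyped keywords are also redirected to the canonical url, not just the ID-only ones.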

Django has the get_absolute_url convention: a model method that it uses internally to simplify linking to a specific model instance. We'll add it to make the keyword based url the definitive url for the resource. Whenever a model instance is passed to the redirect shortcut, Django calls get_absolute_url and resolves it to our keyword based url.

models.py

from django.db import models
from django.urls import reverse
from django.utils.text import slugify


class JobPosting(models.Model):
    company_name = models.CharField(max_length=50)
    role = models.CharField(max_length=50)

    def get_absolute_url(self):
        # Slugify the combination of role and company_name as these may contain
        # whitespace or other characters that are not permitted in urls.
        slug = slugify(f"{self.role}-at-{self.company_name}")
        # Reverse the keyword based pattern, which is named "job-posting-keywords" in urls.py.
        return reverse("job-posting-keywords", kwargs={"pk": self.id, "slug": slug})
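
To get a feel for the urls this produces without setting up a Django project, here is a stdlib-only sketch. The helper approximate_slugify is a stand-in that only approximates django.utils.text.slugify for typical input, and job_posting_path is a hypothetical helper that assumes the url patterns are mounted at the site root.

```python
import re
import unicodedata


def approximate_slugify(value: str) -> str:
    """Stdlib-only stand-in that approximates django.utils.text.slugify."""
    # Decompose accented characters and drop anything that is not ASCII.
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    # Lowercase, then remove everything except word characters, whitespace and hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


def job_posting_path(pk: int, role: str, company_name: str) -> str:
    """Build the keyword based path (assumes the patterns live at the site root)."""
    slug = approximate_slugify(f"{role}-at-{company_name}")
    return f"/{pk}/{slug}"


print(job_posting_path(42, "Senior Python Developer", "Acme & Co."))
# /42/senior-python-developer-at-acme-co
```

In the real project you would keep using Django's own slugify and reverse; the sketch only shows what happens to whitespace and punctuation.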

views.py

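from django.http import HttpResponse
from django.shortcuts import get_object_or_404, redirect

from .models import JobPosting


# slug defaults to None because the ID-only url pattern does not capture it.
def job_posting_view(request, pk, slug=None):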
    job = get_object_or_404(JobPosting, pk=pk)

    # Redirect with a 301 in case someone uses the ID based url. This ensures
    # that the canonical url is the keyword based one and will attribute
    # backlinks and search rankings to our keyword based url.
    if request.path != job.get_absolute_url():
        return redirect(job, permanent=True)

    html = f"""
    <html>
    <body>
    <h1>{job.role} at {job.company_name}</h1>
    </body>
    </html>
    """
    return HttpResponse(html)

Class based views

For a class-based solution we need to override the get method to perform the redirect if the requested url is not the canonical keyword based url.

from django.shortcuts import redirect
from django.views.generic.detail import DetailView

from .models import JobPosting


class JobsDetailView(DetailView):
    model = JobPosting

    def get(self, request, *args, **kwargs):
        self.object = self.get_object()

        if self.request.path != self.object.get_absolute_url():
            return redirect(self.object, permanent=True)

        # super() already binds self, and the arguments must be unpacked again.
        return super().get(request, *args, **kwargs)

The full code example can be found on GitHub.

Resources:

  • https://stackoverflow.com/questions/820493/can-an-seo-friendly-url-contain-a-unique-id/820529#820529
  • https://webmasters.stackexchange.com/a/74639/45262
  • https://webmasters.stackexchange.com/questions/47342/are-keywords-in-urls-good-seo-or-needlessly-redundant
  • https://stackoverflow.com/questions/910683/why-is-just-an-id-in-the-url-path-a-bad-idea-for-seo
  • https://stackoverflow.com/questions/820493/can-an-seo-friendly-url-contain-a-unique-id
  • https://moz.com/learn/seo/backlinks
  • https://moz.com/learn/seo/what-is-link-equity
  • https://moz.com/blog/301-redirect-or-relcanonical-which-one-should-you-use