Drupal Pathauto Tutorial

Drupal's Pathauto Module is a great module for SEO. It automatically generates URL aliases for nodes based on highly-configurable rules.

This tutorial starts by introducing Drupal's URL aliases, and then moves on to configuring Pathauto. I'm writing this tutorial for Pathauto version 5.x-2.1, but the basic concepts should apply to other versions as well.

First, download the Pathauto module and the Token Module. Both are needed to make Pathauto work. I also recommend installing the related Global Redirect module for reasons mentioned in my Drupal SEO tutorial.

Drupal URL Terminology

Drupal has internal URLs with structures like node/123, user/5, and taxonomy/term/1. If you build a Drupal site and don't use the Path or Pathauto modules, your URLs will look like example.com/node/123.

Drupal's built-in Path module lets you override those default URLs with URL aliases. A URL alias is just an alternate URL that will load an internal Drupal URL.

For example, if you have a node (page) at http://example.com/node/123, you can manually create a URL alias with the Path Module so that http://example.com/green-cars is used instead of http://example.com/node/123. (Side note: the reason for installing the Global Redirect Module above is to prevent duplicate content by creating redirects to the URL aliases.)
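To make the alias idea concrete, here is a minimal conceptual sketch in Python. It is not Drupal code: Drupal keeps aliases in its database, and the dictionary and function names below are assumptions made up for illustration. It shows an alias resolving to an internal path, and the kind of reverse lookup a redirect to the alias would rely on.

```python
# Conceptual model only: a dict stands in for Drupal's alias storage.
ALIASES = {"green-cars": "node/123"}  # alias -> internal path (example data)


def resolve(path):
    """Return the internal path behind an alias, or the path unchanged."""
    return ALIASES.get(path, path)


def preferred_path(path):
    """Return the alias for an internal path if one exists.

    This is the target a redirect would use so that only one URL
    serves the content.
    """
    for alias, internal in ALIASES.items():
        if internal == path:
            return alias
    return path


print(resolve("green-cars"))       # node/123
print(preferred_path("node/123"))  # green-cars
```

Both URLs load the same node, which is why redirecting the internal path to its alias avoids duplicate content.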

The Pathauto Module automatically generates URL aliases based on your custom Pathauto settings.

Pathauto Settings

After you install Pathauto, go to the Pathauto settings page at admin –> pathauto.

Your Pathauto settings page may look different, depending on which modules you have installed:

[Screenshot: The settings page for the Drupal Pathauto Module]

Click on General Settings to open that section. In this basic Pathauto tutorial, you can leave everything at the default setting except for this section:

[Screenshot: General Pathauto Module settings]

For Update action, be sure to choose "Do nothing. Leave the old alias intact." That will prevent your URLs from accidentally changing if you change the title of an already published node.
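The effect of that setting can be sketched as a small decision function. This is only an illustration of the policy, not Pathauto's implementation; the function name and the "keep" and "replace" action names are hypothetical.

```python
def alias_on_save(existing_alias, generated_alias, update_action="keep"):
    """Pick the alias a node ends up with when it is saved.

    existing_alias is None for a node that has no alias yet.
    "keep" models "Do nothing. Leave the old alias intact."
    """
    if existing_alias is None:
        return generated_alias  # first save: use the generated alias
    if update_action == "keep":
        return existing_alias   # title changed, but the URL stays stable
    return generated_alias      # any other action: the URL follows the title


# A published node is retitled from "Green Cars" to "Greener Cars":
print(alias_on_save("green-cars", "greener-cars"))  # green-cars
```

With the "keep" policy, links that already point at the old URL continue to work after a title edit.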

Some of my sites use non-English characters, so I check the box that says "Transliterate prior to creating alias". That means that if you use non-English characters in your post titles, Pathauto will convert them to their closest ASCII equivalents instead of replacing them with a dash.

You won't be able to check the box until you make an i18n-ascii.txt file, which is as easy as renaming the one that comes in the Pathauto directory (just remove the word "sample" from the existing file's filename). There is a page on Drupal.org with information about customizing your transliteration file if you need something else from it.

I also check the box that says "Reduce strings to letters and numbers from ASCII-96" because search engine crawlers are not very smart and I like to keep things simple for them.
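The difference transliteration makes can be shown with a short Python sketch. The mapping table below is a tiny made-up sample in the spirit of a transliteration file, and the function is an assumption for illustration, not how Pathauto is implemented.

```python
import re

# Hypothetical sample mapping; a real transliteration file covers far more.
TRANSLIT = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "é": "e", "ñ": "n"}


def make_alias_part(title, transliterate=True, separator="-"):
    """Turn a title into a URL-safe string of ASCII letters and digits."""
    text = title.lower()
    if transliterate:
        # Swap each mapped character for its ASCII stand-in.
        text = "".join(TRANSLIT.get(ch, ch) for ch in text)
    # Anything that is not an ASCII letter or digit becomes the separator.
    text = re.sub(r"[^a-z0-9]+", separator, text)
    return text.strip(separator)


print(make_alias_part("Grüne Autos"))                       # gruene-autos
print(make_alias_part("Grüne Autos", transliterate=False))  # gr-ne-autos
```

Without transliteration the non-ASCII character is lost to a dash, which leaves a less readable URL.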

The only setting that you need to change in the Punctuation Settings section is the one for Hyphen. Change it to No action (do not replace) as shown below:

[Screenshot: Pathauto hyphen setting]

The only reason for that is to remove an error message that will otherwise appear.

Pathauto Rules

Each content type in Drupal can have a different set of Pathauto URL rules. The URL rules can be based on things like the taxonomy term ("category"), the node ID number, the title of the node, the date or time, the author name, etc.

Click on "Replacement patterns" on the Pathauto configuration page to see the tokens that you can use in URLs. Here is a sample list:

[Screenshot: A list of available Pathauto tokens]

You then enter the above-mentioned replacement patterns in your Pathauto rules for each content type:

[Screenshot: Pathauto rules settings for Drupal]

In the screenshot above I'm creating very "non-Drupal" URL paths. More common would be to leave out the [nid] and the .html file extension.
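As a rough illustration of how a pattern with a node ID and a file extension turns into a URL, here is a small Python sketch of token replacement. It is a simplified stand-in, not Pathauto's or Token's actual code: the helper names and the sample node data are assumptions, and real Pathauto applies further settings (such as removing certain common words) that are not modeled here.

```python
import re


def clean(value):
    """Lowercase a value and join its ASCII letter/digit runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", str(value).lower()))


def apply_pattern(pattern, tokens):
    """Replace each [token] in the pattern with its cleaned value."""
    def substitute(match):
        # Raises KeyError if the pattern uses a token we have no value for.
        return clean(tokens[match.group(1)])
    return re.sub(r"\[([a-z-]+)\]", substitute, pattern)


# Sample node data, invented for this example.
node = {"nid": 123, "title-raw": "Electric Hatchbacks", "term-raw": "Green Cars"}

print(apply_pattern("[term-raw]/[title-raw]-[nid].html", node))
# green-cars/electric-hatchbacks-123.html
```

Text outside the square brackets, such as the slash and the .html extension, passes through unchanged.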

Pathauto Warnings

There is a page on Drupal.org about dangerous Pathauto patterns. You can avoid all of those potential problems by never using [title-raw] alone as a Pathauto pattern. The URL should always contain more than just the title. Otherwise, people could authenticate your site in Google Webmaster Tools or create a URL alias that conflicts with an existing system path.

The thing you don't want is for someone to be able to create a node called "This is my post" and have the resulting URL be example.com/this-is-my-post. You should have a little more data in there.

Examples of extra data that you could add to the Pathauto rules are:

  • Node: content/[title-raw] which would result in: example.com/content/this-my-post
  • Node: even better would be [term-raw]/[title-raw] because then it would put your content in keyword-themed directories.
  • Node: [nid]-[title-raw] which would result in: example.com/123-this-my-post. The node ID number is good in certain situations, like when trying to get your site into Google News, which requires a minimum 3-digit number in the URLs.
  • Node: you can even create pages with file extensions like this: [term-raw]/[title-raw]-[nid].html which would look like this: example.com/green-cars/this-my-post-123.html
  • Taxonomy term: a pattern with a fixed prefix such as tags/ would result in: example.com/tags/green-cars
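To see why a title-only pattern is risky and why a prefix helps, consider this minimal Python sketch. The list of system paths is a small assumed subset chosen for illustration, and the check is a simplification of what a real site would need.

```python
import re

# Assumed, incomplete sample of paths that Drupal itself uses.
SYSTEM_PATHS = {"admin", "user", "node", "taxonomy"}


def slug(title):
    """Lowercase a title and join its ASCII letter/digit runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def shadows_system_path(alias):
    """True if the first segment of the alias is a known system path."""
    return alias.split("/", 1)[0] in SYSTEM_PATHS


title = "Admin"  # a node title chosen by any user who can post content
print(shadows_system_path(slug(title)))               # True:  alias "admin"
print(shadows_system_path("content/" + slug(title)))  # False: alias "content/admin"
print(shadows_system_path("123-" + slug(title)))      # False: alias "123-admin"
```

Any extra data in the pattern, whether a directory prefix or a node ID, keeps a user-chosen title from landing exactly on a path the site depends on.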
