How Draft API Uses Format Rules to Generate Content URLs

A key piece of Draft API is its ability to generate URLs according to URL Format Rules for you. With URL generation in place, editorial users usually do not need to worry about exactly what URL their document belongs to.

Because URL generation usually happens behind the scenes, it is not always obvious where a URL came from. This document explains how your story’s URL is generated.

What makes up a URL?

First, let’s take a look at a URL.

https://the-herald-news.com/sports/baseball/2021/05/03/team-wins-series/

In Arc XP, when we talk about URL generation, we focus on the piece of the URL after your base path. In this example, we’re looking specifically at how we got to /sports/baseball/2021/05/03/team-wins-series/.

Generating this URL is possible due to URL Format Rules, which are a set of rules that your organization creates to determine how the Draft API should generate a URL. URL Format Rules are composed of two parts: criteria and format.

The criteria determines when the rules use a specific format. Often, your organization may want to generate URLs differently depending on the type of content. Author pages, for example, do not have circulations and, therefore, shouldn’t use a section field.

You define the criteria as a JSON object. If the piece of content that the Draft API is evaluating has fields that fully match the criteria, the Draft API uses that format. If the content matches multiple criteria, then the Draft API uses their priority to determine which format to use. For example, if the criteria is:

{
    "type":"story",
    "subtype":"blog-post"
}

Then any piece of content that has a type of story and a subtype of blog-post matches this criteria and uses its format.
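The matching behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the Draft API's implementation: the rule record layout (priority, criteria, format), the sample formats, and the assumption that a lower priority number wins are all hypothetical, and only flat top-level fields are compared.

```python
from typing import Any, Optional

# Hypothetical rule records. The keys "priority", "criteria", and "format"
# are illustrative and are not the Draft API's storage schema.
RULES = [
    {
        "priority": 1,
        "criteria": {"type": "story", "subtype": "blog-post"},
        "format": "/blog/%headlines.basic|slugify()%/",
    },
    {
        "priority": 2,
        "criteria": {"type": "story"},
        "format": "/%display_date|year()%/%headlines.basic|slugify()%/",
    },
]


def matches(criteria: dict[str, Any], content: dict[str, Any]) -> bool:
    """True when every key in the criteria has an equal value in the content."""
    return all(
        key in content and content[key] == value
        for key, value in criteria.items()
    )


def pick_format(rules: list[dict[str, Any]], content: dict[str, Any]) -> Optional[str]:
    """Return the format of the best matching rule, or None if nothing matches.

    Assumption: a lower priority number wins when several rules match.
    """
    candidates = [rule for rule in rules if matches(rule["criteria"], content)]
    if not candidates:
        return None
    return min(candidates, key=lambda rule: rule["priority"])["format"]


blog_post = {"type": "story", "subtype": "blog-post"}
print(pick_format(RULES, blog_post))  # /blog/%headlines.basic|slugify()%/
```

A blog post matches both rules here, so the priority decides; a plain story matches only the second rule.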

The format is the actual rule to follow when generating the URL. The format describes which pieces of ANS fields will be picked to be part of the generated URL. Any ANS field that exists on a published piece of content can be selected as a piece of the URL format. The URL format can also apply several built-in functions to the ANS fields that can change the field's value on the fly, for example by applying a slugify function to a headline. Most URL formats use a combination of dates, sections, headline, or ANS id to form a unique URL.

Let’s take a look at some example formats:

/%display_date|year()%/%display_date|month()%/%display_date|day()%/%headlines.basic|slugify()%/

This example uses a combination of display_date and headline to ensure a unique URL. First, the format uses functions to specifically select the display date value, and pick out its year, month, and day. Then, it applies the slugify() method on the article's headline. This example results in a URL like /2020/05/03/my-headline-here/.
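To make the effect of each function concrete, here is a rough Python sketch of what year(), month(), day(), and slugify() might do to the field values. The exact slugify rules and time zone handling of the Draft API are not described in this document, so the helpers below are assumptions: the date is read as-is from an ISO 8601 string, and slugify lowercases the text and replaces runs of non-alphanumeric characters with hyphens.

```python
import re
from datetime import datetime


def slugify(text: str) -> str:
    """Lowercase, replace runs of non-alphanumerics with hyphens, trim hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def date_parts(iso_timestamp: str) -> tuple[str, str, str]:
    """Split an ISO 8601 timestamp into zero-padded year, month, and day strings."""
    parsed = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    return f"{parsed.year:04d}", f"{parsed.month:02d}", f"{parsed.day:02d}"


# Illustrative story with only the two fields the format references.
story = {
    "display_date": "2020-05-03T14:30:00Z",
    "headlines": {"basic": "My Headline Here"},
}

year, month, day = date_parts(story["display_date"])
url = f"/{year}/{month}/{day}/{slugify(story['headlines']['basic'])}/"
print(url)  # /2020/05/03/my-headline-here/
```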

Now let's consider a more complicated example.

/video%websites.the-herald.website_section%/%headlines.basic|slugify()%/%publish_date|year()%/%publish_date|month()%/%publish_date|day()%/%_id%_video.html

In this example, the format uses a combination of ANS fields and static text to make a URL. It also makes use of the websites object, which can pull circulation information and insert it into the URL. This example results in a URL like /video/sports/baseball/my-headline-here/2020/05/01/ABC123_video.html.
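The way a format mixes static text with %field|function()% tokens can be illustrated with a small renderer. This is a sketch under stated assumptions rather than the Draft API's own parser: the token grammar is inferred from the examples above, the function table is hypothetical, and website_section is simplified to a plain path string, which may not match the real ANS shape.

```python
import re
from datetime import datetime
from typing import Any


def slugify(value: str) -> str:
    """Assumed slug rules: lowercase, hyphens for non-alphanumeric runs."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def _date(value: str) -> datetime:
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Hypothetical function table keyed by the spelling used in the format.
FUNCTIONS = {
    "slugify()": slugify,
    "year()": lambda v: f"{_date(v).year:04d}",
    "month()": lambda v: f"{_date(v).month:02d}",
    "day()": lambda v: f"{_date(v).day:02d}",
}


def lookup(document: dict[str, Any], path: str) -> Any:
    """Walk a dotted path such as websites.the-herald.website_section."""
    node: Any = document
    for key in path.split("."):
        node = node[key]  # raises KeyError when a field is missing
    return node


def render(url_format: str, document: dict[str, Any]) -> str:
    """Replace every %path|function()% token; static text passes through."""

    def replace_token(match: re.Match) -> str:
        path, *function_names = match.group(1).split("|")
        value = lookup(document, path)
        for name in function_names:
            value = FUNCTIONS[name](value)
        return str(value)

    return re.sub(r"%([^%]+)%", replace_token, url_format)


VIDEO_FORMAT = (
    "/video%websites.the-herald.website_section%/%headlines.basic|slugify()%/"
    "%publish_date|year()%/%publish_date|month()%/%publish_date|day()%/"
    "%_id%_video.html"
)

video = {
    "_id": "ABC123",
    "headlines": {"basic": "My Headline Here"},
    "publish_date": "2020-05-01T09:00:00Z",
    # Simplified: a plain path string stands in for the section data.
    "websites": {"the-herald": {"website_section": "/sports/baseball"}},
}

print(render(VIDEO_FORMAT, video))
# /video/sports/baseball/my-headline-here/2020/05/01/ABC123_video.html
```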

When does my URL generate?

In the most basic case, your URL generates when you have both circulated and published a document.

[Figure: basic URL generation flow]

Draft API also offers a /resolve endpoint that can mimic URL generation ahead of publish time. In those cases, your flow may look more like:

[Figure: URL generation flow using the /resolve endpoint]

Some risk exists with this method, as many URL formats use date fields that may not be correctly populated until publish time. You can mimic these fields ahead of time if you know when your story will be published. For this reason, this methodology is often used with Scheduled Publish workflows where the story URL needs to be known before the publication actually occurs.
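One way to "mimic" date fields ahead of time is to build a preview copy of the document with the scheduled publish time filled in before asking for a URL preview. The sketch below only prepares that copy; it does not call /resolve, because the request shape is not described in this document. The helper name, the list of date fields, and the timestamp format are assumptions.

```python
import copy
from datetime import datetime, timezone
from typing import Any

# Assumption: these are the date fields your URL format references.
DATE_FIELDS = ("display_date", "publish_date")


def with_scheduled_dates(document: dict[str, Any], scheduled_for: datetime) -> dict[str, Any]:
    """Return a copy of the document with empty date fields set to the
    scheduled publish time. The original document is left untouched."""
    if scheduled_for.tzinfo is None:
        raise ValueError("scheduled_for must be timezone-aware")
    preview = copy.deepcopy(document)
    stamp = scheduled_for.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    for field in DATE_FIELDS:
        if not preview.get(field):
            preview[field] = stamp
    return preview


draft = {"type": "story", "headlines": {"basic": "Team Wins Series"}}
preview = with_scheduled_dates(draft, datetime(2021, 5, 3, 6, 0, tzinfo=timezone.utc))
print(preview["display_date"])  # 2021-05-03T06:00:00Z
```

If the story ends up publishing on a different day than the one used for the preview, a date-based format would produce a different URL, which is the risk described above.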

Adding Sections to URLs

Depending on which version of ANS your stories adhere to, you can use several ANS fields to add a section to a URL.

ANS Versions 0.10.6 and later

If a customer onboarded to Arc XP after January 2021, the ANS version of story and gallery content is at least version 0.10.6. The ANS version number for videos is locked to a historical ANS version. However, for the purposes of understanding URL format rules referencing section information, video content follows the rules of modern ANS versions.

websites.{website id}.website_section is an ANS field where the document's section information is stored for each of its circulated websites. Content circulated to multiple websites, like The Herald and The Gazette, has two fields in the ANS, for example, websites.the-herald.website_section and websites.the-gazette.website_section.

For story or gallery content on modern ANS versions, and for video content, websites.{website id}.website_section is the correct field to use in all URL formats where the section name needs to be represented in the resulting URL.

There are other fields under the taxonomy space that hold website and section values. In modern ANS these taxonomy fields are maintained for backwards compatibility and also to keep section information searchable in Content API queries.
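As an illustration of the per-website field, the following sketch reads the section for each website a document is circulated to. The website ids come from the example above; the section values and the helper name are hypothetical, and website_section is again simplified to a plain path string.

```python
from typing import Any, Optional

# A document circulated to two websites, each with its own section.
story = {
    "websites": {
        "the-herald": {"website_section": "/sports/baseball"},
        "the-gazette": {"website_section": "/local/sports"},
    }
}


def section_for(document: dict[str, Any], website_id: str) -> Optional[str]:
    """Return the section for one website, or None when the document
    is not circulated to that website."""
    circulation = document.get("websites", {}).get(website_id)
    if circulation is None:
        return None
    return circulation.get("website_section")


for site in ("the-herald", "the-gazette", "the-tribune"):
    print(site, section_for(story, site))
```

Because the lookup is keyed by website id, each site can resolve its own section without depending on which website is canonical.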

Note

The Video ANS version number is locked to 0.8.0.

ANS Versions prior to 0.10.6

If content was migrated onto Arc XP prior to January 2021, it may have an older ANS version, and its ANS structure may be slightly different than more modern content. URL Format Rules that were created to support this older content may use the taxonomy ANS keys to pull the section information into the URL.

  • taxonomy.primary_section is the document’s primary section on the document’s primary (canonical) website. It does not vary based on the website parameter. In Draft API, this field can end up empty if you do not circulate your document to its canonical website. For this reason, we do not recommend using this field in modern URL Formats.

  • taxonomy.sections[*] is a list of all sections to which this document is circulated. This list includes all sections regardless of website. Historically, ANS versions older than 0.10.6 guaranteed that the first item in this list, taxonomy.sections[0], would be equivalent to taxonomy.primary_section. That guarantee no longer always holds, which makes taxonomy.sections[0] a risky field to use in a URL format.

Multiple Websites and URL Format Rules

If your organization has multiple websites within Arc XP, there are a few more considerations you may want to make.

  • Each website can have its own URL Format Rules.

  • A story’s canonical URL generates based on the rules of its canonical website.

  • Certain deprecated taxonomy ANS fields are not suited for URL Generation. For multi-site organizations, we always recommend using the ANS websites object to generate URLs.
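The considerations above can be sketched as a lookup from website id to format, with the canonical URL taken from the canonical website's entry. The canonical_website key, the per-website format table, and the guard that raises when the document is not circulated to its canonical website are assumptions made for this illustration.

```python
from typing import Any

# Hypothetical formats, one per website.
FORMATS_BY_WEBSITE = {
    "the-herald": "/%display_date|year()%/%headlines.basic|slugify()%/",
    "the-gazette": "/news/%headlines.basic|slugify()%/",
}


def canonical_format(document: dict[str, Any], formats: dict[str, str]) -> str:
    """Pick the format that belongs to the document's canonical website."""
    website_id = document["canonical_website"]  # assumed field name
    if website_id not in document.get("websites", {}):
        # Guard added by this sketch, not documented Draft API behavior.
        raise ValueError(
            f"document is not circulated to canonical website {website_id!r}"
        )
    return formats[website_id]


story = {
    "canonical_website": "the-gazette",
    "websites": {
        "the-herald": {"website_section": "/sports/baseball"},
        "the-gazette": {"website_section": "/local/sports"},
    },
}
print(canonical_format(story, FORMATS_BY_WEBSITE))  # /news/%headlines.basic|slugify()%/
```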

Use Cases

Use Case | Steps to Take | Why would I do this?

I want my URL to “just work” without changing anything.

  1. Create my document.

  2. Circulate the document without setting a URL explicitly.

  3. Publish the document. URL Generation occurs after publish, and the story is now live at the generated URL. In this example, Circulate and Publish can happen in either order.

This is considered the happy path of URL generation in Draft API.

I want to know my URL ahead of publish time.

  1. Create my document.

  2. Circulate the document without setting a URL explicitly.

  3. Use Draft API’s /resolve endpoint to generate a preview of the URL. This step may require “filling in” some missing ANS information that would normally not exist until publish time.

  4. Circulate the document with this preview URL filled into the circulation.

  5. Later, the document publishes.

You may have scenarios where a story is not set to publish until a future day, an overnight time, etc. In these cases, you may also want to schedule social media posts about the story that include a URL. This example gives you the ability to know your URL ahead of time.

I want to explicitly set my URL without using URL Format Rules.

  1. Create my document.

  2. Circulate the document with a URL explicitly filled into the circulation.

  3. Publish the document. In this example, URL Generation never occurs and is bypassed in favor of the URL you attached to the circulation.

  4. Create a URL format rule whose criteria match only migrated content and whose format explicitly sets the URL to %websites.{website id}.website_section%

Most often this is required when migrating stories into Arc that already have URLs from a previous CMS.

The URL format rule will ensure that the explicitly set URL is never re-generated when the Composer UI's Regenerate URL button is used.
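The difference between the use cases above comes down to whether a URL is already attached to the circulation at publish time. A minimal sketch of that decision follows; the website_url key and the generate callback are hypothetical stand-ins, not Draft API names confirmed by this document.

```python
from typing import Any, Callable


def url_for_circulation(circulation: dict[str, Any], generate: Callable[[], str]) -> str:
    """Use the explicitly attached URL when present; otherwise fall back
    to URL generation from the format rules."""
    explicit = circulation.get("website_url")  # assumed key name
    if explicit:
        return explicit  # generation is bypassed entirely
    return generate()


def generate_from_rules() -> str:
    # Stand-in for real URL generation from the format rules.
    return "/2021/05/03/team-wins-series/"


migrated = {"website_url": "/old-cms/12345/team-wins-series.html"}
fresh = {"website_section": "/sports/baseball"}

print(url_for_circulation(migrated, generate_from_rules))  # /old-cms/12345/team-wins-series.html
print(url_for_circulation(fresh, generate_from_rules))     # /2021/05/03/team-wins-series/
```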

Troubleshooting

After setting up your URL Format Rules, you may still run into URL Generation Failure errors in Composer. If so, you can follow the steps here before creating an ACS Ticket to see if you can resolve the issue.

If you are getting this error | Try

“...missing or invalid values for field(s) [taxonomy.primary_section.id] caused URL generation failure”

taxonomy.primary_section is equivalent to the primary section of the primary (canonical) website. If you want the primary section of a different website, use: websites.{website id}.website_section. If you want the primary section of the canonical website, ensure your story is circulated to the canonical website.

“...missing or invalid values for field(s) [taxonomy.sections[0]] caused URL generation failure”

taxonomy.sections[0] refers to the first section in the list of sections returned in the taxonomy information. If you want the primary section of a different website, use: websites.{website id}.website_section. If you want the primary section of the canonical website, ensure your story is circulated to the canonical website.

“...missing or invalid values for field(s) [taxonomy.sites[0]] caused URL generation failure”

taxonomy.sites is a deprecated ANS field. If you want the primary section of a different website, use: websites.{website id}.website_section. If you want the primary section of the canonical website, ensure your story is circulated to the canonical website.
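Before publishing, it can help to check which fields a format references and whether the document actually has values for them. The sketch below is a hypothetical pre-check, not part of Draft API or Composer: it handles dotted paths only (no list indexes such as sections[0]), and the sample format is illustrative.

```python
import re
from typing import Any


def referenced_fields(url_format: str) -> list[str]:
    """List the field paths named in %path|function()% tokens."""
    return [token.split("|")[0] for token in re.findall(r"%([^%]+)%", url_format)]


def missing_fields(url_format: str, document: dict[str, Any]) -> list[str]:
    """Return the referenced paths that are absent or empty in the document."""
    missing = []
    for path in referenced_fields(url_format):
        node: Any = document
        for key in path.split("."):
            if isinstance(node, dict) and node.get(key) not in (None, "", [], {}):
                node = node[key]
            else:
                missing.append(path)
                break
    return missing


# Illustrative format that depends on a taxonomy field.
LEGACY_FORMAT = "/%taxonomy.primary_section._id%/%headlines.basic|slugify()%/"

uncirculated = {"headlines": {"basic": "Team Wins Series"}, "taxonomy": {}}
print(missing_fields(LEGACY_FORMAT, uncirculated))  # ['taxonomy.primary_section._id']
```

An empty result means every referenced field has a value; a non-empty result points at the fields to fix or the format to revisit.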