How to Create a Content Strategy
(In Only 652 Steps)

By

This is a really, really, really long piece. There's no TL;DR version. That's because a content audit has a lot of steps. I've tried to break them down and explain each one, so that a writer who knows their way around a computer or a marketing geek who knows how to write can take this and do a complete audit.

First, a content strategy is not an inventory or an audit.

An inventory is a simple list. A strategy:

  1. Inventories existing content
  2. Analyzes the competition
  3. Draws conclusions
  4. Lays out a strategy based on those conclusions

#1 is the most mechanically-involved task, because you have to grab a lot of data and mush it all together. #2 is the shortest. #3 and #4 are the most demanding (for me, anyway) because I have to suss out impossible-to-automate marketing stuff that's essential to success.

Before you start

A few basic assumptions, and some sacred cows to slay:

What you're auditing

One of the team here at Portent read a draft of this and said, "This is actually a conversation audit!"

It's true. Conversations are made of content. Content is how we converse. That's why some guy wrote a book in 2003 or so about Conversation Marketing. See what I did there?

In all seriousness, content drives every exchange you have with a potential customer: Product descriptions are content. So are photos, blog posts, podcasts, your company's "About Us" page, those 40 ridiculous links you stuff at the bottom of your home page, and every other scrap of information you put, anywhere.

The goal of a content strategy

Content strategies have three goals:

  1. Call out goals, and how you'll accomplish them.
  2. Persuade the reader (your client/boss) that they should believe you, stop barfing keyword essays out on their site, and become a true resource for potential customers.
  3. Create a 'profile' of particularly successful content types that others can apply for the foreseeable future.

Trust me, none of these are easy. First, though, you have to break the connection in your brain between SEO and content.

Forget the rankings

Whatdidyousay?

I said: Forget about SEO and rankings. This is about content, which affects every inch of your marketing strategy. Content influences everything in marketing:

[Infographic: relative influence of content on marketing]

Did I set out to create an infographic? No. It just happened.

SEO is one reason to do a fantastic job on content. It's certainly not the only one. I see SEO as 20% of the reason to create great content.

This whole step-by-step is meant to support all of your organization's communications efforts. It'll only support those efforts if you use the data you collect to define 'good' content for your site. That will only happen if you look at content as a marketing strategist, rather than an SEO. OK - start sending me flaming e-mails.

Step 1: Do a content inventory

You could go through your entire web site, by hand, and find every last piece of content. You could.

Or, you can automate the inventory. Which is what I strongly suggest. Here's what I do, and how I do it:

The site crawl: Break it up, before you go-go

Yikes. Awful, I know, but that's the music I grew up with.

The first step in any inventory is getting a list of your stuff. Here, the first step is getting a list of site pages. All you need is a list of page URLs. If you have a large site (say, more than 1000 pages) you should group those URLs by category.

Wait, what? Grouped by category? Ian, WTF?!!! I'll have to do that by hand!!!

Nope. Assuming your site has any semblance of structure, you'll have categories, and those categories will have a hub page, like this:

[Image: sample sitemap. Example of a logical URL group]

On this site, you'd probably create an 'Outerwear' URL group that included everything in the Outerwear/ folder, including the outerwear page itself. Then you'd create another one for underwear, and so on. If there's a blog, you break that up by category, too.
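If you'd rather not eyeball the folders, here's a rough sketch of the grouping in Python. It assumes your URL groups really do map to the first folder in the path, and that a last path segment with a dot in it is a file rather than a folder. The function name `group_urls_by_folder` and the example.com URLs are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_urls_by_folder(urls):
    """Group URLs by the first folder in their path (an assumed grouping rule)."""
    groups = defaultdict(list)
    for url in urls:
        path = urlparse(url).path
        segments = [s for s in path.split("/") if s]
        # Assumption: a final segment containing a dot is a file name
        # (jackets.html), not a folder.
        folders = segments[:-1] if segments and "." in segments[-1] else segments
        # Pages that sit directly under the site root get their own group.
        key = folders[0].lower() if folders else "(root)"
        groups[key].append(url)
    return dict(groups)


# Made-up URLs, for illustration only.
sample_urls = [
    "http://www.example.com/",
    "http://www.example.com/about.html",
    "http://www.example.com/outerwear/",
    "http://www.example.com/outerwear/jackets.html",
    "http://www.example.com/underwear/",
    "http://www.example.com/underwear/socks.html",
]

url_groups = group_urls_by_folder(sample_urls)
for name, members in url_groups.items():
    print(name, len(members))
```

If your categories don't line up with folders, this won't help much. You'd need a different rule for picking the group key.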

It doesn't matter if you're an e-commerce site, by the way. Most sites have category structures. If you don't, look really carefully—you may have found the first step in your content strategy.

Here's an easy way to see if you've got the right URL groups. Each URL group should:

  1. Share a common desired visitor emotional response, such as "I love this. I want it." or "This guy totally answered my question. He rocks." or even "Good god, what the hell were they thinking?!"
  2. Make sense. Are you pairing services information with today's funny cartoon? Better be a good reason.
  3. Be manageable. The URL group has to be small enough that you can do reasonable measurement, and large enough to give you a decent sampling. There's no hard-and-fast rule for this, I'm afraid. Usually it's obvious: On a 10,000 page site, for example, a 10 page URL group will have little value, but a 5,000 page group may be impossible to work with.

You can get these URL groups fairly easily using a standard site crawler like Screaming Frog. Keep reading and I'll show you how.

Running the site crawl

Crawl each category list of links separately. My favorite desktop tool for this is Screaming Frog. Go get it. I'll wait.

Screaming Frog is so full of awesome, it'd take another 10,000 words to describe it. If you want to learn more about using it, check out SEER's incredibly complete guide.

First. Before you do anything else, click Configuration > Spider and uncheck every box. That will set the spider to skip external addresses, javascript and css links and images. We don't need those for a basic audit.

Second. If you're lucky, and each URL group matches a folder on your site, you can enter your web address plus the folder, like this:

[Screenshot: sticking to one folder in Screaming Frog]

...and Screaming Frog will only crawl links within that folder. Easy-peasy.

If your site doesn't use folders, try using the Include filter instead:

  1. Click "Configuration > Include". [Screenshot: using a Screaming Frog include folder]
  2. Then, type in the URL pattern you want to match. Don't let the mention of regular expressions scare you. A plain old pattern like this works brilliantly. Just remember the '*': [Screenshot: setting the filter]
  3. Start your crawl.
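If you want to test a pattern before you kick off a crawl, you can try it against a handful of URLs in Python first. This is only a sketch: the pattern below is a made-up example, not the one from the screenshot, and I'm assuming the pattern has to match the whole URL, which is why the '.*' on each end matters.

```python
import re

# Hypothetical include pattern: anything with /blog/ somewhere in the URL.
include_pattern = re.compile(r".*/blog/.*")

# Made-up URLs, for illustration only.
urls = [
    "http://www.example.com/blog/content-strategy",
    "http://www.example.com/services/seo",
    "http://www.example.com/blog/",
    "http://www.example.com/about",
]

# fullmatch() requires the pattern to cover the entire URL.
included = [url for url in urls if include_pattern.fullmatch(url)]

for url in included:
    print(url)
```

Drop the leading or trailing '.*' and `fullmatch()` stops matching anything here, which is a quick way to see why the '*' is worth remembering.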

You can also filter the URLs after the fact, using the Filter tool, or by getting creative with Excel. I'm not going to write about all that here, 'cause frankly this post is long enough, yes?

Third. Save your crawl to a CSV, so you can import it into a spreadsheet, like so: [Screenshot: exporting your crawl]

But what about monster sites? 1 million+ pages? My favorite publicly-available tool is 80Legs. It's not a cinch to set up, but it can crawl scads of pages.

Fourth. Take each URL group crawl result and import it into Excel. Keep the Status Code, H1/H2s, Page Title, Meta Description and Word Count. That gives you a pretty complete list of URLs and the basic data you'll need for page quality.

Finally. Get rid of duplicate source URLs! Don't forget!!!

You can use a different tool than Excel. When I do this myself, I dump the list of URLs into Sublime Text and filter for unique URLs, and then use a regular expression to dump any offsite stuff that somehow got into the crawl. But Excel is a great starting point.
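If you'd rather script that cleanup, here's a minimal sketch of the same two steps in Python: keep unique URLs, and use a regular expression to drop anything offsite. The function name `clean_url_list` and the example.com host are placeholders, and it only treats exact string matches as duplicates.

```python
import re


def clean_url_list(urls, site_host):
    """Return on-site URLs only, with exact duplicates removed, in original order."""
    # Match http or https, then the host, then a slash or end of string,
    # so www.example.com.evil.com doesn't sneak through.
    onsite = re.compile(
        r"^https?://" + re.escape(site_host) + r"(/|$)", re.IGNORECASE
    )
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        if not url or not onsite.match(url):
            continue  # blank line or offsite URL
        if url in seen:
            continue  # duplicate source URL
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Made-up crawl output, for illustration only.
raw_urls = [
    "http://www.example.com/a",
    "http://www.example.com/a",
    "http://twitter.com/share?u=x",
    " http://www.example.com/b ",
    "https://www.example.com/c",
    "",
]

clean_urls = clean_url_list(raw_urls, "www.example.com")
print(clean_urls)
```

Note that this treats `/a` and `/a/` as different pages. Whether that's right depends on how your site handles trailing slashes.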

Grab your performance data

Now, you've got your list(s) of links: the 'stuff' part of the inventory. Time to fetch your performance data. This is the 'information about the stuff' part of the inventory. You'll use this data as part of your strategy.

What I do not measure in a content audit (I use inventory and audit interchangeably. Castigate in the comments if necessary.):

  1. Pageviews: Too deceptive.
  2. Time on page: Panic-inducing, and may not indicate the real 'quality' of the page.
  3. Bounce rate: See time on page. If folks read your site through feeds, bounce rate may be really high. But that means nothing.
  4. Visitors or unique visitors.

None of these stats show me whether the content had impact. They show me whether someone came and looked at the page, and they show me if you left your browser open for 10 minutes. That's about it.

What I do measure in a content audit:

  1. Facebook likes. If someone 'likes' a page on Facebook, something about that page got their juices flowing.
  2. Facebook shares. Ditto.
  3. Facebook comments. Ditto Ditto.
  4. Twitter posts. Ditto... you get it.
  5. LinkedIn posts, if this is a business site.
  6. Reddit votes, if relevant.
  7. Google +1s.
  8. If the page has commenting, the number of comments.
  9. If the page has reviews, the number of reviews.
  10. Authority, based on a tool like Moz's Open Site Explorer or Majestic SEO (or both!). I don't care whether you're doing SEO or not - this is another good measure of audience response.
  11. Revenue/conversions generated by visitors to those pages. Only sometimes. Use with caution, because great content typically gets you conversions later, and attribution is nearly impossible.

Read on if you want some tips on grabbing all of this data without going insane.

Language/quality data

Performance data tells you how a specific piece of content helps your overall strategy. Language & quality data provides a snapshot of subject matter and best practices:

  1. (Required) Words/page. You already have this from your Screaming Frog crawl, remember?
  2. Paragraph tags. Clearly you should have them. But do you?
  3. (Required) Title tag. Because it's important. You already have this. This is not for SEO evaluation, although you can certainly do that. You'll examine titles to see if one style of title tag gets more shares/authority/good stuff than another. Forget the rankings.
  4. (Required) Description tag. You have this already, too. Again, this is not an SEO thing! It's about looking at trends: On your site, do pages with good description tags get more shares/authority than those that do not? You want to know.
  5. (Optional) Term Frequency-Inverse Document Frequency (TF-IDF). The top 5 or so terms by document and by URL group, based on a score that combines the term frequency of a phrase within a document, and the inverse document frequency of the term within the entire URL group (aka, the corpus, in natural language processing-speak). If the idea of calculating TF-IDF makes your eyes roll back in your head, then show the top 5 terms per document, based on frequency. Again, NOT SEO. This is to help you figure out what each page/section is about, without requiring you to read every one. It's not perfect, but it's better than nothing.
  6. (Optional) Flesch-Kincaid grade level and reading ease. These numbers give you a basic look (very basic) at writing complexity.
  7. (Optional) Heading element usage. Track how many heading elements each page has. To a point, more headings may show a well-thought-out document structure.
  8. (Optional) OGP markup. OGP markup enhances Facebook sharing. If some pages have it and some don't, you can track the impact.
  9. (Optional) Twitter markup. Same as OGP markup, but for Twitter.
  10. (Optional) Count of possible spelling errors. Tragic that we have to test for this, but there you have it.
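TF-IDF sounds scarier than it is. Here's a bare-bones sketch for one URL group, using single words as terms. There are several TF-IDF variants. This one uses term count divided by document length, times the natural log of (documents in the group / documents containing the term). The function names and the three tiny 'pages' are invented for illustration, and a real run would want stop-word handling and multi-word phrases.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms_by_tfidf(docs, top_n=5):
    """docs maps URL -> page text. Returns URL -> list of (term, score)."""
    tokenized = {url: tokenize(text) for url, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: how many pages in the group contain each term.
    doc_freq = Counter()
    for words in tokenized.values():
        doc_freq.update(set(words))

    results = {}
    for url, words in tokenized.items():
        counts = Counter(words)
        total = len(words) or 1
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results[url] = ranked[:top_n]
    return results


# Made-up page text for one URL group, for illustration only.
group_docs = {
    "/outerwear/jackets": "rain jackets keep you dry our jackets are waterproof",
    "/outerwear/coats": "wool coats keep you warm our coats are long",
    "/outerwear/vests": "down vests keep you warm our vests are light",
}

top_terms = top_terms_by_tfidf(group_docs)
for url, terms in top_terms.items():
    print(url, [term for term, _ in terms])
```

Words that show up on every page in the group ('keep', 'you', 'our') score zero, which is the point: what's left hints at what each page is about.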

There is no ideal number of words per page, by the way. Or reading ease. Or anything else. You're looking at this data to build a profile of particularly successful content within this one category on your site. In one URL group, that may mean 500 words/page and a reading grade level 12. In another, that may mean 100 words and reading grade level 7. It's up to you to look at the data and draw conclusions.
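For the reading grade level and reading ease numbers, the standard Flesch formulas only need three counts: sentences, words and syllables. Here's a rough sketch. The formulas are the published ones, but the syllable counter is a crude vowel-group guess I'm assuming is good enough for spotting trends, so expect the scores to drift a bit from what a dedicated tool reports.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups, discount a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text):
    """Return (Flesch reading ease, Flesch-Kincaid grade level) for the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    reading_ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade_level = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return reading_ease, grade_level


ease, grade = readability("The cat sat on the mat.")
print(round(ease, 1), round(grade, 1))
```

Run it on the visible text of each page, not the raw HTML, or the markup will wreck the word counts.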

But still - how do you grab all of this data?!!! There's the rub. Keep reading:

Mechanize data collection

Collecting all of this by hand could take weeks. Or you can automate it. You've got a few options, from super-technical to most accessible:

  1. You can write your own data grabbing script. To me, all marketers should be programmers. If you believe that, writing a little script to fetch this data, page by page, isn't that hard. I did it, and I code about as well as I dance. Use this option if you'll be doing repeated audits.
  2. Fancier tools like Content Insight appear to fetch all the data you want and then some. I haven't tried any of them yet, so I can't say for sure. If you've tried them, let me know.
  3. Use a service like Smartsheet. Set up a blank spreadsheet with all the columns you want, and paste in all the URLs. Write instructions on how to grab each metric. Then use Smartsheet to set up Amazon Mechanical Turk jobs for each URL. Poof. You're done. Great if you're only doing occasional content audits.
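To give you a feel for option 1, here's a small sketch of the page-parsing half of a data-grabbing script, using only Python's standard library. It pulls the title tag, description tag, heading and paragraph counts, and checks for OGP and Twitter markup. The class name `PageFacts` and the sample HTML are made up. This is not the script I use, and it skips the fetching step entirely (something like `urllib.request.urlopen` could supply the HTML).

```python
from html.parser import HTMLParser


class PageFacts(HTMLParser):
    """Collect a few language/quality facts from one page's HTML."""

    HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.heading_count = 0
        self.paragraph_count = 0
        self.has_ogp = False
        self.has_twitter_markup = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in self.HEADINGS:
            self.heading_count += 1
        elif tag == "p":
            self.paragraph_count += 1
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            prop = (attrs.get("property") or "").lower()
            if name == "description":
                self.description = attrs.get("content") or ""
            if prop.startswith("og:"):
                self.has_ogp = True
            if name.startswith("twitter:"):
                self.has_twitter_markup = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Made-up page, for illustration only.
sample_html = """<html><head>
<title>Rain Jackets</title>
<meta name="description" content="Waterproof jackets for wet days.">
<meta property="og:title" content="Rain Jackets">
</head><body>
<h1>Rain Jackets</h1>
<p>Stay dry.</p>
<h2>Sizing</h2>
<p>Runs large.</p>
</body></html>"""

facts = PageFacts()
facts.feed(sample_html)
print(facts.title, facts.heading_count, facts.paragraph_count, facts.has_ogp)
```

Loop that over your URL list, write one row per page, and you've got most of the language/quality columns without touching a browser.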

More likely, you'll use a mix of tools and manual labor. Here's how you pull it all together:

Aggregate your data

Tools like Open Site Explorer will let you pull a lot of the content performance numbers. If you can't get every datapoint, work with what you can. In spite of the numbers, this is more art than science.

But data like authority numbers and revenue/conversions require logins, and you don't want to share those on AMT or Smartsheet. Use Excel VLOOKUP, instead:

  1. Add 2 tabs to your content data spreadsheet.
  2. In your web analytics software, set up a content report that includes page URLs and revenue or conversions. Download a CSV export of that report.
  3. Get SEOGadget's link API Excel plugin. Paste your URLs into one of the spreadsheet tabs you just created. That'll give you a bulk lookup of all of your URLs.
  4. Paste the web analytics content report into the other tab.
  5. Use VLOOKUP to grab relevant data.

If you need to learn VLOOKUP, check out this excellent tutorial from Distilled.
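If you end up scripting your audits, the same lookup is a few lines of Python. This sketch mimics what VLOOKUP does with an exact match: for each URL in the inventory, find the first matching row in the other report and copy one column across. The CSV contents, column names and the `lookup_join` function are all invented for the example.

```python
import csv
import io

# Made-up stand-ins for the inventory tab and the analytics export.
inventory_csv = """url,word_count
http://www.example.com/a,1200
http://www.example.com/b,300
"""

analytics_csv = """url,conversions
http://www.example.com/a,14
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def lookup_join(inventory_rows, lookup_rows, key, column, default=""):
    """Copy `column` from lookup_rows onto each inventory row, matched on `key`."""
    lookup = {}
    for row in lookup_rows:
        # Like VLOOKUP with an exact match, the first matching row wins.
        lookup.setdefault(row[key], row[column])

    joined = []
    for row in inventory_rows:
        merged = dict(row)
        merged[column] = lookup.get(row[key], default)
        joined.append(merged)
    return joined


combined = lookup_join(
    read_rows(inventory_csv), read_rows(analytics_csv), "url", "conversions"
)
for row in combined:
    print(row)
```

URLs with no match get an empty cell instead of Excel's #N/A. Watch for mismatched formats (full URLs in one report, paths only in the other), because an exact-match lookup will quietly miss those.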

Note: I try to avoid self-promotion in these posts. But we've built a pretty nice inventory tool that looks up and generates a lot of the data I've talked about here. Because it pounds the crap out of our servers and APIs, we can't make it publicly available. If you want a report, though, we can generate it for you. Yes, it costs money, or a donated left knee—mine's on the fritz.

One last bit for data: Any disasters?

Record any catastrophic events: A major PR gaffe, or a governance failure, or something similar. These kinds of events may point out the need for a stricter content policy. Or, they may point out how well a particular style of response worked to correct the problem and move on. Either way, there are lessons to be learned.

Strategy Step 2: Spy on the Competition

To be honest, I rarely do a deep competitive analysis. We're not going to imitate the competition, because that probably won't work. And being a copycat is really, really bad for your brand, as Adecco found out. And we're not going to learn much from them, because we have no information about their process/challenges/resources.

However, there are times when competitive research and comparisons make sense:

  1. Prompt a response. Nothing gets a team motivated faster than "our biggest competitor is using this strategy, and they're kicking our butt." This works well when you're justifying everything from fully descriptive title tags to content that's not marketing-focused.
  2. Give you a goal. If you have no idea what to set as a goal, you might check share data for competitor content. That assumes the competitor is doing well, of course.

And, since you've already automated so much of the data collection process, it's easy enough to run the process above for a few dozen pages on your top competitors. So you may want to take a look.

Strategy Step 3: Draw conclusions

You now have scads of data on every page in each URL group. Maybe you grabbed every possible datapoint. Maybe you didn't. It doesn't matter. This is the part that really, really matters. You need to look at all of this data and start to build a profile for 'great' content in each URL group.

You could try to use a formula or a statistical technique like Pearson Correlation, but I don't recommend it. Content is largely about emotional response: Agreement. Satisfaction. Dislike. Sense of security/lack thereof. Also known as significance, by the way. You can't calculate that.

You can, however, look at each URL group, determine what emotional response each group was trying to obtain, and then use the data to see whether you succeeded. Here's an example:

Portent's content report

I ran a small report—about 150 pages—on the Portent site, then trimmed it even more. You can download it here to follow along, if you want.

I need to see if any content 'sticks out' as particularly successful. The easiest way? Just sort by various columns, looking for pages with the most tweets, Facebook likes, +1s, etc. If I find something significant, I'll create a pretty report for it later. Have a look at content by tweets (we get this statistic from the Topsy API).

[Screenshot: Heeyyyyy nice Tweets]

The top three or four are either super-specific 'how-tos' or super-general philosophical, rabble-rousing stuff. They're also all well over 1500 words, with two of them over 5,000. In fact, just about all blog content that got tweeted falls into those two categories, and has over 1500 words.
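If your report lives in a script instead of a spreadsheet, the sort-and-scan above looks something like this sketch. The rows, titles and numbers are invented, not taken from the Portent report, and `top_pages` is just a name I picked.

```python
def top_pages(rows, metric, n=10):
    """Return the n rows with the highest value for the given metric."""
    return sorted(rows, key=lambda row: row[metric], reverse=True)[:n]


# Made-up inventory rows, for illustration only.
pages = [
    {"title": "How to write outreach e-mails", "tweets": 410, "words": 2100},
    {"title": "Company picnic photos", "tweets": 3, "words": 150},
    {"title": "Why marketers should learn to code", "tweets": 260, "words": 5200},
    {"title": "New office hours", "tweets": 0, "words": 90},
]

most_tweeted = top_pages(pages, "tweets", n=2)

# What share of the most-tweeted pages run longer than 1500 words?
long_share = sum(1 for p in most_tweeted if p["words"] > 1500) / len(most_tweeted)

for page in most_tweeted:
    print(page["tweets"], page["words"], page["title"])
print(long_share)
```

Swap "tweets" for any other column to repeat the scan. The code only surfaces candidates. Deciding what they have in common is still your job.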

If one of my strategic goals is to build share of voice through social shares (and it is) then I can safely assume that:

  1. Longer articles are at least OK, and may be a really good idea.
  2. We need to either hit major, current issues (like Google and marketing, or Tom Cruise 2 weeks before Oblivion's first screening tied to content marketing) or truly useful how-tos (Effective outreach e-mails).
  3. Titles with 'Why' or 'How' at the beginning may somehow get more sharing and authority, which means folks are engaging a bit better. Make a note to test this.

These may seem like "well, duh" discoveries. But now you can back them up. Plus, I'm not sure the newsworthy part would've jumped out at me without looking at these top 10 pages.

Still, I don't want to base all of my assumptions on one statistic. And there's probably more to learn. If I dig deeper, I find:

  1. Majestic SEO's Citation Flow and Trust Flow seem to back up the Twitter numbers. So these posts aren't just Twitter phenomena. They're getting shared via blogs too.
  2. No one topic is the big driver. Everything from responsive design to PPC to SEO generates attention. That's good—part of our strategy is to emphasize we're full-service.
  3. Reading grade level/ease aren't super-significant, but nothing over grade 9 made it into the most-tweeted content.

Here's the warning about all this: It's just numbers, and you're dealing with people. It may be that the top two posts performed so well because we used the word 'flibbergibbet' in our tweets, or we tweeted on a Saturday (I checked the tweets we sent, and when we sent them, and didn't see any huge differences).

So yes, you need to think it through, and explore other information as necessary. That's research. It's also why we all still have a job, and aren't likely to be replaced any time soon by Python scripts that fetch numbers from 10 different APIs.

Strategy Step 4: Build the strategy

A quick disclaimer here: This next section lists the things I feel must be in the strategic portion of a content audit. But the whole concept of 'content strategy' is a new one, and there are very different opinions of what goes where. Feel free to use my structure, or find another, or make your own if it better addresses your needs.

This is the part that brings everything together: Your data, conclusions, competitive analysis, all of it.

Here's what I put into a strategy:

The role of content in the marketing plan

First things first. What's the content supposed to do? You may know. I may know. But I guarantee at least one critical decision maker who reads your strategy will not. This is not something most people spend a lot of time thinking about - we're kind of weird. So no matter how repetitive it feels, write out exactly what you hope content will accomplish.

This is yet another place to emphasize that adding-keywords-to-the-website-so-the-Googles-rank-us is not part of the strategy. It's also a good place to point out content's impact on the entire marketing plan. Get buy-in on these concepts here, and opportunities for great, creative, useful stuff start popping up like sun-starved Seattleites on a clear day.

What you did

Where did you get your data? How did you draw your conclusions? Again, someone's going to ask. Nothing fancy - a simple list of sources and methods will do.

Goals and metrics

What should we measure going forward? The Goals and Analysis plan, below, will talk about how often to measure it. But you also need to set out the metrics, and the goals, right away. Like the role of content, putting this into writing really gets folks on the right track.

For Portent, the goal is more shareable content and ultimately, reaching beyond the search community into the marketing community as a whole. We'll measure that by watching:

  1. Twitter and Facebook sharing, likes and comments
  2. Topics - we want to make sure we don't over-focus in one area, and we want to talk more about marketing as a whole
  3. Comments generated per page

Goals: I'd like to see comments per page increase by 25%, and see an even topic distribution between PPC, SEO, social media, analytics, development and overall strategy (I'm making these up).

Essential topics

The 2-3 guiding principles behind the organization, translated into central themes for every piece of content you write. This is one of the critical bits that, if you get it right, cause the rest of the audit to fall nicely into place.

The 'essential topics' might be 'keywords' for some folks. I don't think keywords are very strategic, and I know what my writing looks like if someone tells me "Hey, can you write a blog post about tires?" So I'd rather stick to topics.

A good place to start is with the organization's 'Why.' http://www.startwithwhy.com/

For me (and therefore my company, HAH - benefits of being CEO) the 'why' is "Help everyone communicate better, because great communications will save the world."

That doesn't mean every blog post we write has to talk about communications and a world-ending lack thereof. Instead, break the 'why' up into essential topics like this. Every post we write should probably touch on one of these points:

  1. Learn to be a better marketing communicator
  2. Be heard
  3. What's good, what's bad
  4. Linking marketing to communications

We don't have to literally write about each of these topics, either. But any content we create should probably somehow support, teach to or learn from one of the essential topics. For example:

  1. How to: Write clearly.
  2. Choose the right images for your blog post.
  3. Great communications can save the world (it's OK to write the why now and then)
  4. Use Python for sentiment analysis
  5. Machine learning and spam link detection
  6. Adecco: Anatomy of a social media meltdown

Every one of those posts either offers advice on better communications, calls out great or awful communications, or otherwise trots alongside the communications bandwagon.

Essential topics are a very 'soft' concept. That makes it even more critical that you make them as concrete as possible, with lots of examples and very clear topics. Still, anything this touchy-feely in a world of give-me-my-roi-and-meta-tags-now-dammit spells nightmare. Here are the properties of a good essential topic:

  1. It works for a how-to, a rant and a philosophical discussion. aka: ?WTF*. Any essential topic should be at least that flexible. So "how to program Python" won't work. But "Program Python better" might, and "Programming to learn better" is perfect.
  2. It travels on its own: Seeing this essential topic, anyone will understand what it means. "Writing Python list iterators" is a total fail. It's fine as a blog post. It's just not an essential topic. A better one might be "Process data faster" or "Do more with data."
  3. It lets you be funny, serious or mad. Similar to ?WTF*, but this is more about pure tone. "Taking better photographs" probably doesn't work. I can't get terribly angry about that. "Better visual storytelling" works.

Some of these look like narrow semantic distinctions, right? You probably rolled your eyes at "Better visual storytelling." It's ok, you can admit it. I backspaced over it three times myself. But it's not just me making up fancy phrases. 2 years from now, someone's going to have to look at this essential topic and be able to apply it to whatever content they're dreaming up.

Tone & Audience

If you're going to write personas, this is not the place to do it. You should have gotten everyone's buy-in regarding personas before you started writing strategy. If you didn't, no big deal. Just don't create the gigantic kerfuffle that'll result if you throw personas into the mix now. Skip 'em.

Instead, in this section, write little 2-sentence descriptions of each audience type and do the same for a few different tones you think are appropriate.

Content Types

Easiest section in the whole audit. Just explain the kinds of content you think can work. If video and text are the only types, explain why. If an audio podcast makes sense, go with that. Just think it through, so that this is a plan the reader can stick to.

Hierarchy

Now, explain how you're going to rank content. I don't mean good to bad - I mean "Stuff we should write a lot" versus "Stuff we should write a little," or "Stuff that scales" and "Stuff that doesn't." For Portent, that's the 70/20/10 rubric. I won't bother going into it here - you can read about it in Katie Fetting's post here.

You can't just cut-and-paste a standard description, though. You have to make this hierarchy make sense for the client. I usually do that by:

  1. Showing content they've already done that fits each position in the hierarchy
  2. Showing examples of content others produced that fits
  3. Providing lots of example titles and content types

Content Calendar

Don't write out a list of titles. This is a strategic document, remember. Instead, create a calendar showing how often to produce content that fits each position in the hierarchy. For example:

  1. Write 70% content 7 times per month.
  2. Write 20% content 2 times per month.
  3. Write 10% content 1 time per month.
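The arithmetic behind that list is simple enough when you publish 10 pieces a month. When the total is 4 or 13, the split gets awkward. Here's a small sketch that divides any monthly total across the hierarchy and hands leftover pieces to whichever slots have the largest remainder. The function name and the largest-remainder rule are my assumptions, not part of the 70/20/10 rubric itself.

```python
def monthly_plan(posts_per_month, mix=None):
    """Split a monthly post count across hierarchy slots by percentage."""
    if mix is None:
        mix = {"70% content": 70, "20% content": 20, "10% content": 10}

    # Whole pieces each slot earns outright.
    plan = {name: posts_per_month * pct // 100 for name, pct in mix.items()}

    # Hand out what's left, largest fractional remainder first.
    leftovers = posts_per_month - sum(plan.values())
    by_remainder = sorted(
        mix, key=lambda name: (posts_per_month * mix[name]) % 100, reverse=True
    )
    for name in by_remainder[:leftovers]:
        plan[name] += 1
    return plan


ten_a_month = monthly_plan(10)
four_a_month = monthly_plan(4)
print(ten_a_month)
print(four_a_month)
```

At low volumes the 10% slot can round down to zero pieces in a month. In that case, plan it quarterly instead of monthly.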

I'd use an actual calendar format if I were you, though. People seem to absorb the information more easily that way. Here's a fake example of what we often do:

[Image: a strategic content calendar]

Best Practices

This is much more tactical. Take a look at the content inventory report you created. If there are any mistakes/omissions/great things that happen more than 20% of the time, write a best practice to address it.
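Here's one way to apply that 20% rule to the inventory without counting by hand. It's a sketch: the checks, the 20-character title cutoff and the sample rows are hypothetical, so swap in whatever mistakes and omissions matter for your site.

```python
def frequent_issues(pages, checks, threshold=0.20):
    """Return {issue name: share of pages} for issues above the threshold."""
    flagged = {}
    for name, check in checks.items():
        share = sum(1 for page in pages if check(page)) / len(pages)
        if share > threshold:  # "more than 20% of the time"
            flagged[name] = share
    return flagged


# Hypothetical checks. Each takes one inventory row and returns True or False.
checks = {
    "missing description": lambda page: not page["description"].strip(),
    "title under 20 characters": lambda page: len(page["title"]) < 20,
}

# Made-up inventory rows, for illustration only.
inventory = [
    {"title": "Home", "description": "Outdoor clothing for wet climates."},
    {"title": "Choosing a rain jacket", "description": ""},
    {"title": "How we test waterproof fabric", "description": "Our lab process."},
    {"title": "Sizing guide for outerwear", "description": " "},
    {"title": "Care instructions for wool coats", "description": "Washing tips."},
]

needs_best_practice = frequent_issues(inventory, checks)
print(needs_best_practice)
```

Here, missing descriptions show up on 2 of 5 pages, so they earn a best practice. Short titles show up on exactly 1 in 5, which doesn't clear "more than 20%."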

Then, picture a content team sitting in your spot, six months from now. If you're not there, what might they not think of? Write best practices for each of those, too.

These are often little things, like writing fully-descriptive titles. But sometimes big stuff ends up here. For example:

  1. Don't copy content from your other sites. Seems obvious to me, maybe, but put yourself in the team's shoes: Six months from now, under deadline pressure, with a VP telling them she just spent 50% of the year's budget on that research piece for the other web site, is there any chance the team might forget? Or be pressured into duplicating? Yes.
  2. Write a fully-descriptive article title. Again, it seems obvious, but will it be obvious in six months?
  3. Put links where they matter, not in a list at the end of an article. Reason: No one clicks the links at the end. Links in the article have higher utility.
  4. Make sure no image is larger than NNkb in size. Reason: Faster content page load times.
  5. Always use the singular form of taxonomical tags. Reason: It avoids a common form of tag duplication.
  6. Place a full transcription of all videos on the video page. Place all videos on their own, independent pages within the site. Reason: Content discoverability, search, and skim-able versions for people in a hurry.

Understand that these are not laws. Sometimes the team has to bypass a best practice. Your job here is to provide the information they need to make an intelligent, informed decision when they do. If they decide to place an image that's larger than NNkb, fine, just understand that it increases article load time. If they skip the transcript, they may get fewer video views because people can't get a quick preview, and because search engines may not properly categorize the page.

Examples

Lots and lots of examples. Good, bad, edits you'd make, edits you wouldn't. The more examples you provide, the easier it is for the reader to figure out best practices and essential topics. We often provide marked-up screen captures showing what we'd change and how:

[Screenshot: an annotated page with edits]

Outreach

Depending on the reader, I might write out basic guidelines for outreach: How much to do and when, who to reach and how. Sorry, but there's absolutely zero chance I'm going to cover that in one or two paragraphs here.

Governance

Some clients really want governance to be part of their strategy and audit. To me, it's a whole different thing. I know that's not the norm, so I'll explain:

  1. A content strategy is about what you create, when you create it and how.
  2. Governance is about who's allowed to create it and what they need to do to get it approved. It's also about what you can and cannot do. So it actually needs to be in place before the strategy and audit, or you need to do it based on the strategy and audit findings. You can't do it at the same time.
  3. I can count the number of organizations that have stuck to a content governance framework on one pseudopod, and despite Dr. Pete's claims to the contrary, I do not have any pseudopods. Unless there was a major governance catastrophe, focus on strategy, first.

Review & analysis plan

Finally, write down your plan for measuring and tracking content performance over time. Be really, really specific: Exactly what would you do every time you publish a new piece of content? Exactly what would you regularly check, and exactly how often?

You might even want to provide basic guidelines for interpreting data: If the numbers go up, do X. If they go down, do Y. But I'm really cautious with this kind of information in a strategy. It's bound to get transformed from strategic advice to tactical musts. Next thing you know, every time a post gets 3 Facebook likes, the entire content team has to write on the same topic for three months.

Whoa.

This sounds really hard. And really complicated. That's because it is. Content strategies drive communications policy which, ironically, is really hard to communicate. But it's priceless information for your organization. A good strategy forms a long-term framework for content, and it keeps everyone honest. It guides an organization's entire marketing plan. So it's well worth the effort.

A few credits

We created this page using the Skeleton responsive framework. It's nifty. We also used a tooltip thingamajig by Caleb Jacob.

Oh, and you can follow Ian on Twitter, at @portentint, and follow Portent at @portent. Or, track him down on Google Plus, at Google+
