This is a really, really, really long piece. There's no TL;DR version. That's because a content audit has a lot of steps. I've tried to break them down and explain each one, so that a writer who knows their way around a computer or a marketing geek who knows how to write can take this and do a complete audit.
First, a content strategy is not an inventory or an audit.
An inventory is a simple list. A strategy:
#1 is the most mechanically involved task, because you have to grab a lot of data and mush it all together. #2 is the shortest. #3 and #4 are the most demanding (for me, anyway) because I have to suss out impossible-to-automate marketing stuff that's essential to success.
A few basic assumptions, and some sacred cows to slay:
One of the team here at Portent read a draft of this and said, "This is actually a conversation audit!"
It's true. Conversations are composed of content. Content is how we converse. That's why some guy wrote a book in 2003 or so about Conversation Marketing. See what I did there?
In all seriousness, content drives every exchange you have with a potential customer: Product descriptions are content. So are photos, blog posts, podcasts, your company's "About Us" page, those 40 ridiculous links you stuff at the bottom of your home page, and every other scrap of information you put, anywhere.
Content strategies have three goals:
Trust me, none of these are easy. First, though, you have to break the connection in your brain between SEO and content.
I said: Forget about SEO and rankings. This is about content, which affects every inch of your marketing strategy. Content influences everything in marketing:
SEO is one reason to do a fantastic job on content. It's certainly not the only one. I see SEO as 20% of the reason to create great content.
This whole step-by-step is meant to support all of your organization's communications efforts. It'll only support those efforts if you use the data you collect to define 'good' content for your site. That will only happen if you look at content as a marketing strategist, rather than an SEO. OK - start sending me flaming e-mails.
You could go through your entire web site, by hand, and find every last piece of content. You could.
Or, you can automate the inventory. Which is what I strongly suggest. Here's what I do, and how I do it:
Yikes. Awful, I know, but that's the music I grew up with.
The first step in any inventory is getting a list of your stuff. Here, the first step is getting a list of site pages. All you need is a list of page URLs. If you have a large site (say, more than 1000 pages) you should group those URLs by category.
Wait, what? Grouped by category? Ian, WTF?!!! I'll have to do that by hand!!!
Nope. Assuming your site has any semblance of structure, you'll have categories, and those categories will have a hub page, like this:
On this site, you'd probably create an 'Outerwear' URL group that included everything in the Outerwear/ folder, including the outerwear page itself. Then you'd create another one for underwear, and so on. If there's a blog, you break that up by category, too.
It doesn't matter if you're an e-commerce site, by the way. Most sites have category structures. If you don't, look really carefully—you may have found the first step in your content strategy.
Here's an easy way to see if you've got the right URL groups. Each URL group should:
You can get these URL groups fairly easily using a standard site crawler like Screaming Frog. Keep reading and I'll show you how.
Crawl each category list of links separately. My favorite desktop tool for this is Screaming Frog. Go get it. I'll wait.
Screaming Frog is so full of awesome, it'd take another 10,000 words to describe it. If you want to learn more about using it, check out SEER's incredibly complete guide.
Second. If you're lucky, and each URL group matches a folder on your site, you can enter your web address plus the folder, like this:
...and Screaming Frog will only crawl links within that folder. Easy-peasy.
If your site doesn't use folders, try using the Include filter instead:
You can also filter the URLs after the fact, using the Filter tool, or by getting creative with Excel. I'm not going to write about all that here, 'cause frankly this post is long enough, yes?
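If you'd rather script the after-the-fact grouping than wrestle with Excel, here's a minimal Python sketch. It assumes your URL groups map to the first folder in each URL's path, which won't be true of every site. The function name `group_by_folder`, the `depth` parameter, and the example.com URLs are all made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_folder(urls, depth=1):
    """Bucket URLs by the first `depth` folders in their path."""
    groups = defaultdict(list)
    for url in urls:
        # "/outerwear/rain-jackets/" -> ["outerwear", "rain-jackets"]
        segments = [s for s in urlparse(url).path.split("/") if s]
        # Pages with an empty path (like the home page) land in "(root)".
        key = "/".join(segments[:depth]).lower() if segments else "(root)"
        groups[key].append(url)
    return dict(groups)


# Hypothetical URLs, for illustration only.
urls = [
    "https://www.example.com/",
    "https://www.example.com/outerwear/",
    "https://www.example.com/outerwear/rain-jackets/",
    "https://www.example.com/underwear/",
    "https://www.example.com/blog/seo/some-post/",
]

for folder, members in group_by_folder(urls).items():
    print(folder, len(members))
```

Passing `depth=2` would split a blog into groups like `blog/seo`, which is one way to break the blog up by category. Note that a top-level file such as `/about.html` ends up as its own group here, so expect to tidy the result by hand.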
Third. Save your crawl to a CSV, so you can import it into a spreadsheet, like so:
But what about monster sites? 1 million+ pages? My favorite publicly available tool is 80Legs. It's not a cinch to set up, but it can crawl scads of pages.
Fourth. Take each URL group's crawl result and import it into Excel. Keep the Status Code, H1/H2s, Response Codes, Page Title, Meta Description, and Word Count. That gives you a pretty complete list of URLs, plus the basic data you'll need for page quality.
Finally. Get rid of duplicate source URLs! Don't forget!!!
You can use a different tool than Excel. When I do this myself, I dump the list of URLs into Sublime Text and filter for unique URLs, and then use a regular expression to dump any offsite stuff that somehow got into the crawl. But Excel is a great starting point.
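For the script-inclined, here's a rough Python version of the same dedupe-and-filter idea. It's a sketch, not the Sublime Text workflow described above: `clean_urls`, the `site_host` parameter, and the sample URLs are hypothetical, and the regular expression assumes your site lives on one host with an optional `www.` prefix.

```python
import re


def clean_urls(urls, site_host):
    """Return unique on-site URLs, keeping their first-seen order."""
    # http or https, an optional "www.", then the site's own host,
    # followed by a path, query, fragment, port, or the end of the string.
    onsite = re.compile(
        r"^https?://(?:www\.)?" + re.escape(site_host) + r"(?:[/?#:]|$)",
        re.IGNORECASE,
    )
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        if not onsite.match(url):
            continue  # offsite link that leaked into the crawl
        if url in seen:
            continue  # duplicate source URL
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Hypothetical crawl output, for illustration only.
crawled = [
    "https://www.example.com/outerwear/",
    "https://www.example.com/outerwear/",
    "https://twitter.com/example",
    "https://www.example.com/blog/",
    "https://www.example.com.evil.test/blog/",
]

print(clean_urls(crawled, "example.com"))
```

The comparison is exact, so `/blog` and `/blog/` count as two different URLs. Whether those should be merged depends on how your site handles trailing slashes.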
Now you've got your list(s) of links: the 'stuff' part of the inventory. Time to fetch your performance data, which is the 'information about the stuff' part of the inventory. You'll use this data as part of your strategy.
What I do not measure in a content audit (I use inventory and audit interchangeably. Castigate in the comments if necessary.):
None of these stats show me whether the content had impact. They show me whether someone came and looked at the page, and whether they left their browser open for 10 minutes. That's about it.
What I do measure in a content audit:
Read on if you want some tips on grabbing all of this data without going insane.
Performance data tells you how a specific piece of content helps your overall strategy. Language & quality data provides a snapshot of subject matter and best practices:
There is no ideal number of words per page, by the way. Or reading ease. Or anything else. You're looking at this data to build a profile of particularly successful content within this one category on your site. In one URL group, that may mean 500 words/page and a reading grade level 12. In another, that may mean 100 words and reading grade level 7. It's up to you to look at the data and draw conclusions.
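To make the "profile per URL group" idea concrete, here's a small Python sketch. It takes the best-performing pages in each group and reports their median word count and reading grade level. Everything in it is an assumption for illustration: the field names (`group`, `words`, `grade`, `tweets`), the choice of tweets as the performance metric, and the sample numbers, which are invented.

```python
from collections import defaultdict
from statistics import median


def profile_top_pages(rows, metric, top_n=3):
    """For each URL group, summarize the top_n pages ranked by `metric`."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row["group"]].append(row)

    profiles = {}
    for group, pages in by_group.items():
        # Highest value of the chosen metric first, then keep the top few.
        best = sorted(pages, key=lambda p: p[metric], reverse=True)[:top_n]
        profiles[group] = {
            "pages": len(best),
            "median_words": median(p["words"] for p in best),
            "median_grade": median(p["grade"] for p in best),
        }
    return profiles


# Invented numbers, for illustration only.
rows = [
    {"group": "blog", "words": 5200, "grade": 11, "tweets": 310},
    {"group": "blog", "words": 1800, "grade": 12, "tweets": 140},
    {"group": "blog", "words": 400, "grade": 9, "tweets": 3},
    {"group": "services", "words": 150, "grade": 7, "tweets": 12},
    {"group": "services", "words": 90, "grade": 8, "tweets": 9},
    {"group": "services", "words": 700, "grade": 13, "tweets": 0},
]

for group, profile in profile_top_pages(rows, "tweets", top_n=2).items():
    print(group, profile)
```

Treat the output as a starting point for looking at the actual pages, not as a target to write to. A median of a handful of pages says very little on its own.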
But still - how do you grab all of this data?!!! There's the rub. Keep reading:
Collecting all of this by hand could take weeks. Or you can automate it. You've got a few options, from super-technical to most accessible:
More likely, you'll use a mix of tools and manual labor. Here's how you pull it all together:
Tools like Open Site Explorer will let you pull a lot of the content performance numbers. If you can't get every datapoint, work with what you can. In spite of the numbers, this is more art than science.
But data like authority numbers and revenue/conversions require logins, and you don't want to share those on AMT or Smartsheet. Use Excel VLOOKUP, instead:
If you need to learn VLOOKUP, check out this excellent tutorial from Distilled.
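If your data outgrows Excel, the same lookup is easy to sketch in Python. This is an illustration of the VLOOKUP idea, not a replacement for the tutorial: it matches rows on the URL and copies the columns you ask for onto each page. The function name `vlookup_join`, the column names, and the inline CSV data are all hypothetical.

```python
import csv
import io


def vlookup_join(pages, lookup_rows, key="url", fields=("conversions",), default=""):
    """Copy `fields` from lookup_rows onto each page row with a matching key."""
    index = {}
    for row in lookup_rows:
        # Like VLOOKUP, keep the first match if a key appears more than once.
        index.setdefault(row[key], row)

    joined = []
    for page in pages:
        match = index.get(page[key], {})
        merged = dict(page)
        for field in fields:
            # Pages with no match get the default instead of an error.
            merged[field] = match.get(field, default)
        joined.append(merged)
    return joined


# Hypothetical exports, inlined so the example runs on its own.
crawl_csv = (
    "url,word_count\n"
    "https://www.example.com/a/,1200\n"
    "https://www.example.com/b/,300\n"
)
analytics_csv = "url,conversions\nhttps://www.example.com/a/,14\n"

pages = list(csv.DictReader(io.StringIO(crawl_csv)))
analytics = list(csv.DictReader(io.StringIO(analytics_csv)))

for row in vlookup_join(pages, analytics):
    print(row)
```

The match is exact, just like VLOOKUP with an exact-match lookup, so make sure both files write URLs the same way (protocol, `www.`, trailing slash) before you join them.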
Note: I try to avoid self-promotion in these posts. But we've built a pretty nice inventory tool that looks up and generates a lot of the data I've talked about here. Because it pounds the crap out of our servers and APIs, we can't make it publicly available. If you want a report, though, we can generate it for you. Yes, it costs money, or a donated left knee—mine's on the fritz.
Record any catastrophic events: A major PR gaffe, or a governance failure, or something similar. These kinds of events may point out the need for a stricter content policy. Or, they may point out how well a particular style of response worked to correct the problem and move on. Either way, there are lessons to be learned.
To be honest, I rarely do a deep competitive analysis. We're not going to imitate the competition, because that probably won't work. And being a copycat is really, really bad for your brand, as Adecco found out. And we're not going to learn much from them, because we have no information about their process/challenges/resources.
However, there are times when competitive research and comparisons make sense:
You now have scads of data on every page in each URL group. Maybe you grabbed every possible datapoint. Maybe you didn't. It doesn't matter. This is the part that really, really matters. You need to look at all of this data and start to build a profile for 'great' content in each URL group.
You could try to use a formula or a statistical technique like Pearson Correlation, but I don't recommend it. Content is largely about emotional response: Agreement. Satisfaction. Dislike. Sense of security/lack thereof. Also known as significance, by the way. You can't calculate that.
You can, however, look at each URL group, determine what emotional response each group was trying to obtain, and then use the data to see whether you succeeded. Here's an example:
I ran a small report—about 150 pages—on the Portent site, then trimmed it even more. You can download it here to follow along, if you want.
I need to see if any content 'sticks out' as particularly successful. The easiest way? Just sort by various columns, looking for pages with the most tweets, Facebook likes, +1s, etc. If I find something significant, I'll create a pretty report for it later. Have a look at content by tweets (we get this statistic from the Topsy API).
The top three or four are either super-specific 'how-tos' or super-general philosophical, rabble-rousing stuff. They're also all well over 1,500 words, with two of them over 5,000. In fact, just about all blog content that got tweeted falls into those two categories, and has over 1,500 words.
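The sort-and-eyeball step described above is a one-liner in a spreadsheet. If your report lives in a script instead, a minimal Python sketch looks like this. The column names (`tweets`, `likes`, `plus_ones`) and the sample rows are invented for illustration and are not the Portent report.

```python
def top_pages(rows, column, n=10):
    """Return the n rows with the highest value in `column`, largest first."""
    return sorted(rows, key=lambda r: r[column], reverse=True)[:n]


# Invented rows, for illustration only.
rows = [
    {"url": "/blog/how-to-a/", "words": 5400, "tweets": 220, "likes": 40, "plus_ones": 12},
    {"url": "/blog/rant-b/", "words": 1600, "tweets": 180, "likes": 95, "plus_ones": 8},
    {"url": "/blog/news-c/", "words": 450, "tweets": 6, "likes": 2, "plus_ones": 0},
    {"url": "/services/seo/", "words": 300, "tweets": 1, "likes": 0, "plus_ones": 1},
]

# Sort by each social column in turn and look at what floats to the top.
for column in ("tweets", "likes", "plus_ones"):
    print(column, [r["url"] for r in top_pages(rows, column, n=2)])
```

The code only does the sorting. Spotting that the winners are all how-tos or rabble-rousers is still your job.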
If one of my strategic goals is to build share of voice through social shares (and it is) then I can safely assume that:
These may seem like "well, duh" discoveries. But now you can back them up. Plus, I'm not sure the newsworthy part would've jumped out at me without looking at these top 10 pages.
Still, I don't want to base all of my assumptions on one statistic. And there's probably more to learn. If I dig deeper, I find:
Here's the warning about all this: It's just numbers, and you're dealing with people. It may be that the top two posts performed so well because we used the word 'flibbergibbet' in our tweets, or we tweeted on a Saturday (I checked the tweets we sent, and when we sent them, and didn't see any huge differences).
So yes, you need to think it through, and explore other information as necessary. That's research. It's also why we all still have a job, and aren't likely to be replaced any time soon by Python scripts that fetch numbers from 10 different APIs.
A quick disclaimer here: This next section lists the things I feel must be in the strategic portion of a content audit. But the whole concept of 'content strategy' is a new one, and there are very different opinions of what goes where. Feel free to use my structure, or find another, or make your own if it better addresses your needs.
This is the part that brings everything together: Your data, conclusions, competitive analysis, all of it.
Here's what I put into a strategy:
First things first. What's the content supposed to do? You may know. I may know. But I guarantee at least one critical decision maker who reads your strategy will not. This is not something most people spend a lot of time thinking about - we're kind of weird. So no matter how repetitive it feels, write out exactly what you hope content will accomplish.
This is yet another place to emphasize that adding-keywords-to-the-website-so-the-Googles-rank-us is not part of the strategy. It's also a good place to point out content's impact on the entire marketing plan. Get buy-in on these concepts here, and opportunities for great, creative, useful stuff start popping up like sun-starved Seattleites on a clear day.
Where did you get your data? How did you draw your conclusions? Again, someone's going to ask. Nothing fancy - a simple list of sources and methods will do.
What should we measure going forward? The Goals and Analysis plan, below, will talk about how often to measure it. But you also need to set out the metrics, and the goals, right away. Like the role of content, putting this into writing really gets folks on the right track.
For Portent, the goal is more shareable content and ultimately, reaching beyond the search community into the marketing community as a whole. We'll measure that by watching:
The 2-3 guiding principles behind the organization, translated into central themes for every piece of content you write. This is one of the critical bits that, if you get it right, cause the rest of the audit to fall nicely into place.
The 'essential topics' might be 'keywords' for some folks. I don't think keywords are very strategic, and I know what my writing looks like if someone tells me "Hey, can you write a blog post about tires?" So I'd rather stick to topics.
A good place to start is with the organization's 'Why.' http://www.startwithwhy.com/
For me (and therefore my company, HAH - benefits of being CEO) the 'why' is "Help everyone communicate better, because great communications will save the world."
That doesn't mean every blog post we write has to talk about communications and a world-ending lack thereof. Instead, break the 'why' up into essential topics like this. Every post we write should probably touch on one of these points:
We don't have to literally write about each of these topics, either. But any content we create should probably somehow support, teach to or learn from one of the essential topics. For example:
Every one of those posts either offers advice on better communications, calls out great or awful communications, or otherwise trots alongside the communications bandwagon.
Essential topics are a very 'soft' concept. That makes it even more critical that you make them as concrete as possible, with lots of examples and very clear topics. Still, anything this touchy-feely in a world of give-me-my-roi-and-meta-tags-now-dammit spells nightmare. Here are the properties of a good essential topic:
Some of these look like narrow semantic distinctions, right? You probably rolled your eyes at "Better visual storytelling." It's OK, you can admit it. I backspaced over it three times myself. But it's not just me making up fancy phrases. Two years from now, someone's going to have to look at this essential topic and be able to apply it to whatever content they're dreaming up.
If you're going to write personas, this is not the place to do it. You should have gotten everyone's buy-in regarding personas before you started writing strategy. If you didn't, no big deal. Just don't create the gigantic kerfuffle that'll result if you throw personas into the mix now. Skip 'em.
Instead, in this section, write little 2-sentence descriptions of each audience type and do the same for a few different tones you think are appropriate.
Easiest section in the whole audit. Just explain the kinds of content you think can work. If video and text are the only types, explain why. If an audio podcast makes sense, go with that. Just think it through, so that this is a plan the reader can stick to.
Now, explain how you're going to rank content. I don't mean good to bad - I mean "Stuff we should write a lot" versus "Stuff we should write a little," or "Stuff that scales" and "Stuff that doesn't." For Portent, that's the 70/20/10 rubric. I won't bother going into it here - you can read about it in Katie Fetting's post here.
You can't just cut-and-paste a standard description, though. You have to make this hierarchy make sense for the client. I usually do that by:
Don't write out a list of titles. This is a strategic document, remember. Instead, create a calendar showing how often to produce content that fits each position in the hierarchy. For example:
I'd use an actual calendar format if I were you, though. People seem to absorb the information more easily that way. Here's a fake example of what we often do:
This is much more tactical. Take a look at the content inventory report you created. If there are any mistakes/omissions/great things that happen more than 20% of the time, write a best practice to address it.
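Here's one way to sketch that 20% rule of thumb in Python, assuming your inventory is a list of rows with the crawl columns you kept. The function name `frequent_issues`, the two checks, and the field names are hypothetical. Swap in whatever mistakes, omissions, or great things you actually care about.

```python
def frequent_issues(pages, checks, threshold=0.20):
    """Return {issue name: share of pages} for issues on more than `threshold` of pages."""
    total = len(pages)
    if total == 0:
        return {}
    flagged = {}
    for name, has_issue in checks.items():
        share = sum(1 for page in pages if has_issue(page)) / total
        if share > threshold:
            flagged[name] = share
    return flagged


# Hypothetical checks; each takes a page row and returns True if the issue is present.
checks = {
    "missing meta description": lambda p: not p["meta_description"].strip(),
    "missing h1": lambda p: not p["h1"].strip(),
}

# Invented inventory rows, for illustration only.
pages = [
    {"url": "/a/", "h1": "Rain Jackets", "meta_description": ""},
    {"url": "/b/", "h1": "Parkas", "meta_description": "Warm parkas for cold days."},
    {"url": "/c/", "h1": "", "meta_description": "Our story."},
    {"url": "/d/", "h1": "Boots", "meta_description": ""},
    {"url": "/e/", "h1": "Hats", "meta_description": "Hats, hats, hats."},
]

print(frequent_issues(pages, checks))
```

In this made-up data, missing meta descriptions show up on 2 of 5 pages and get flagged. The missing H1 shows up on exactly 1 of 5, which is not more than 20%, so it stays off the list.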
Then, picture a content team sitting in your spot, six months from now. If you're not there, what might they not think of? Write best practices for each of those, too.
These are often little things, like writing fully-descriptive titles. But sometimes big stuff ends up here. For example:
Understand that these are not laws. Sometimes the team has to bypass a best practice. Your job here is to provide the information they need to make an intelligent, informed decision when they do. If they decide to place an image that's larger than NNkb, fine, just understand that it increases article load time. If they skip the transcript, they may get fewer video views because people can't get a quick preview, and because search engines may not properly categorize the page.
Lots and lots of examples. Good, bad, edits you'd make, edits you wouldn't. The more examples you provide, the easier it is for the reader to figure out best practices and essential topics. We often provide marked-up screen captures showing what we'd change and how:
Depending on the reader, I might write out basic guidelines for outreach: How much to do and when, who to reach and how. Sorry, but there's absolutely zero chance I'm going to cover that in one or two paragraphs here.
Some clients really want governance to be part of their strategy and audit. To me, it's a whole different thing. I know that's not the norm, so I'll explain:
Finally, write down your plan for measuring and tracking content performance over time. Be really, really specific: Exactly what would you do every time you publish a new piece of content? Exactly what would you regularly check, and exactly how often?
You might even want to provide basic guidelines for interpreting data: If the numbers go up, do X. If they go down, do Y. But I'm really cautious with this kind of information in a strategy. It's bound to get transformed from strategic advice to tactical musts. Next thing you know, every time a post gets 3 Facebook likes, the entire content team has to write on the same topic for three months.
This sounds really hard. And really complicated. That's because it is. Content strategies drive communications policy which, ironically, is really hard to communicate. But it's priceless information for your organization. A good strategy forms a long-term framework for content, and it keeps everyone honest. It guides an organization's entire marketing plan. So it's well worth the effort.