A Copywriter's Guide to Semantic Markup

Ian Lurie

Copywriters are a weird, weird bunch. They fearlessly crank out pages of great prose – an activity that terrifies most people – but are horrified when I mention something like ‘semantic markup’.
The concept of a heading tag reduces the best writers to a mass of quivering jelly.
Well, for all you writers out there, here’s a guide to semantic markup – the use of heading tags and basic HTML to make your writing easier to format, read and optimize for the internet. It can also help you improve sales or whatever it is you’re trying to accomplish on your site.
The truth is, you’ve always known about semantic markup. Your word processor uses it. So this guide uses MS Word to introduce the concepts:

Step 1: What a real paragraph looks like

In Word, you can create the appearance of a paragraph by using line breaks, like this:
The two line breaks create the appearance of a paragraph. But my computer doesn’t know that ‘canned goods?’ is the end of one paragraph, and ‘Then’ is the start of another. If I use Word’s formatting tools to increase paragraph spacing, nothing happens.
That’s the same as using a ‘<br>’ (a line break) to create a new paragraph in HTML. It’s not semantic – the formatting doesn’t reflect the structure of the document.
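A minimal sketch of that fake paragraph in HTML. Only ‘canned goods?’ and ‘Then’ come from the Word example above; the rest of the wording is placeholder text made up for illustration:

```html
<!-- Not semantic: two line breaks fake the gap between paragraphs. -->
<!-- Placeholder sentences; only "canned goods?" and "Then" come from the example. -->
Who ate all the canned goods?<br>
<br>
Then the lights went out.
```

The browser draws a gap, but nothing in the markup says where one paragraph ends and the next begins.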
If, on the other hand, I use a single carriage return (different from a line break!) to create a new paragraph, Word creates a real paragraph. My computer can tell where each paragraph stops and starts. And, if I use Word’s style editor to change the distance between paragraphs, it actually works:
That’s semantically correct. The formatting reflects the structure of the document.
In HTML, you can get the same semantically correct paragraphs by surrounding each paragraph with a paragraph element: <p> and </p>. The raw HTML code looks like this:
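As a minimal sketch (the sentences are placeholders built around the ‘canned goods?’ and ‘Then’ from the Word example), two real paragraphs might look like this:

```html
<!-- Each paragraph is wrapped in its own <p> element. -->
<p>Who ate all the canned goods?</p>

<p>Then the lights went out.</p>
```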
Note that the blank line is just for easier reading on our part. The <p> tags are what actually matter.

HAH! I just snuck in some geekinese. The <p> and </p> are called tags. Together, they turn all the text between them into a paragraph element.

So, rule 1: Create real paragraphs. If you’re using a blog editor and you have an option to set ‘Convert Line Breaks’, turn that on, and use carriage returns just like you would in a regular word processor. If you’re using a plain old HTML editor that doesn’t do this work for you, surround each paragraph in <p> and </p>, creating a paragraph element.
Next, I’m really going to bake your noodle with headings. Take a second to wipe the sweat from your aching brow…

Use Real Headings

Headings are the real bane of online copywriters. You can make something that looks like a heading but isn’t one. Then you get anal-retentive search marketers like me tsk-tsking you and making you go back and edit the last 3 years’ articles.
Better to do it right from the start. We’ll start with MS Word again.
In Word, you can create what looks like a heading by making some text bold, then setting the font size larger, like this:
Pretty on the outside, but like a Barbie, lacking substance. Nothing tells MS Word that ‘Ian’s Nightmare’ is actually a heading. Convert the page to an outline, and you get this:
A relatively featureless, unstructured mass of text.
If you used the same technique in HTML, and used styling or (ack) font tags to make the text larger and bold, you’d run into the same problem. But the consequences are more severe: Search engines use headings to categorize content. No headings? No rankings for you, monkey boy.
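A sketch of what that fake heading might look like in HTML, assuming the ‘Ian’s Nightmare’ title from the Word example and a placeholder sentence underneath. The font tag appears here only as the thing to avoid:

```html
<!-- Looks like a heading, but it is just big bold text inside a paragraph. -->
<p><font size="5"><b>Ian’s Nightmare</b></font></p>
<p>It was a dark and stormy night.</p>
```

Nothing in that markup says "heading", so a search engine or screen reader sees two ordinary paragraphs.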
Luckily, MS Word has a solution: Styles. I can assign the style ‘Heading 1’ to my title, and Word will automatically make it look the way I want:
Even better, it marks that text as a heading. If I edit the style ‘Heading 1’ using Word’s style editor, I can change the formatting of every heading 1 in the entire document, all at once. Magic! And, if I view it as an outline, I get this:
See that? By using the style heading 1, I actually create a structure for the document, not just a look and feel.
In HTML, you get the same effect using the heading tags, <h1> through <h6>. <h1></h1> indicates a level one heading. <h2></h2> indicates a level two heading. And so on. In my epic tale, it would look like this:
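A minimal sketch, using the ‘Ian’s Nightmare’ title from the Word example (the sentence under it is a placeholder):

```html
<!-- The h1 element marks the title as a real level one heading. -->
<h1>Ian’s Nightmare</h1>
<p>It was a dark and stormy night.</p>
```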
That tells visiting web browsers “This is a level one heading”.
Headings should nest in an outline structure: The level 1 heading (H1) should indicate the ‘top level’ of ideas on the page. Level 2 headings (H2) then show related sub-topics for that H1. Level 3 headings (H3) show sub-topics for each H2, and so on.
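To illustrate the nesting, here is a sketch of a heading outline. Apart from ‘Ian’s Nightmare’, every heading below is invented for the example:

```html
<!-- Indentation is only for readability; the h1/h2/h3 levels carry the structure. -->
<h1>Ian’s Nightmare</h1>
  <h2>The Pantry</h2>
    <h3>Canned Goods</h3>
    <h3>Dry Goods</h3>
  <h2>The Basement</h2>
```

Read as an outline, that gives one top-level idea, two sub-topics, and two sub-sub-topics under the first of them.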
Rule 2: Make sure you’re creating real headings, not just text that looks like headings. If you’re not sure, find a geek, and before they scuttle away in panic, ask them if your headings are semantically correct. They’ll practically swoon when you do.

Last One: All That Other Stuff

There are a lot of semantic tags, and almost all of them have analogs in word processors:
<em></em> shows text that’s emphasized. You can make it look any way you like, using styles. But a computer will recognize this text as semantically more important than the text immediately around it.
<li></li> indicates the text inside the tags is an item in a list. If you’re creating a numbered or bulleted list, read up on ordered (<ol>) and unordered (<ul>) lists first. It’s safe, I promise. You won’t get hurt. At least not as badly as if I find out you’re just slapping a ‘1’, ‘2’, ‘3’… in front of sentences and throwing that up on your site as a list.
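A short sketch of both tags in use. The sentences and list items are made up for illustration:

```html
<!-- em marks the emphasized word; styles decide how it looks. -->
<p>Please do <em>not</em> type the numbers yourself.</p>

<!-- Ordered list: the browser supplies 1, 2, 3. -->
<ol>
  <li>Write the draft.</li>
  <li>Mark up the headings.</li>
  <li>Publish.</li>
</ol>

<!-- Unordered list: the browser supplies the bullets. -->
<ul>
  <li>Paragraphs</li>
  <li>Headings</li>
  <li>Lists</li>
</ul>
```

Each li element sits inside an ol or ul element, which tells the browser what kind of list it belongs to.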
The possibilities are nearly endless. If you want to learn more, read Digital Web Magazine’s excellent post on the subject. Also check out the post on Pearsonified.
Start with paragraphs and headings, though, and you’re ahead of 99% of the pack.

The Benefits

By using semantic markup, you make a lot of things easier:

  • Search engine optimization gets easier, because content is already structured as it should be, and search engines like that.
  • Design is far easier, because designers can apply styles to the various semantic elements, site-wide, and quickly adjust the look and feel.
  • Editing is easier, because it just makes more sense to others when they look at the document.
  • Distribution is easier, because cell phones, feed readers, screen readers and the many other devices people use to view web sites can all make sense of what you wrote.
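To make the design point concrete, here is a sketch of a complete page with a small style block. The fonts, colors, spacing, and text are arbitrary choices for illustration; in practice a designer would more likely keep these rules in a shared stylesheet so they apply site-wide:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Ian’s Nightmare</title>
  <style>
    /* One rule restyles every level one heading on the page. */
    h1 { font-family: Georgia, serif; font-size: 2em; }
    /* One rule sets the spacing below every paragraph. */
    p { margin: 0 0 1.5em 0; }
    /* Emphasized text can look any way you like. */
    em { font-style: italic; color: #a00; }
  </style>
</head>
<body>
  <h1>Ian’s Nightmare</h1>
  <p>Who ate <em>all</em> the canned goods?</p>
  <p>Then the lights went out.</p>
</body>
</html>
```

Because the rules target elements rather than individual bits of text, changing one line in the style block changes every matching element at once.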

So go learn this stuff. Start with MS Word, and practice using headings and paragraphs and lists. Then find the analog to each one in XHTML. Within an hour you’ll be writing semantically correct documents.
Imagine the shock on your webmaster’s face…


Ian Lurie is CEO and founder of Portent and the EVP of Marketing Services at Clearlink. He's been a digital marketer since the days of AOL and Compuserve (25 years, if you're counting). He's recorded training for Lynda.com, writes regularly for the Portent Blog and has been published on AllThingsD, Smashing Magazine, and TechCrunch. Ian speaks at conferences around the world, including SearchLove, MozCon, Seattle Interactive Conference and ad:Tech. He has published several books about business and marketing: One Trick Ponies Get Shot, available on Kindle, The Web Marketing All-In-One Desk Reference for Dummies, and Conversation Marketing. Follow him on Twitter at portentint, and on LinkedIn at LinkedIn.com/in/ianlurie.



  1. @Janine That may be part of it, but it may also be that you’re pasting into a plain text editor. Plain text editors don’t read any formatting at all.

  2. I have to agree! I have been using semantically correct markup on my bliki (including the use of headings) and after posting twice a week for three and one-half months, my bliki has a Google PageRank of 2, and about 50 subscribers.
    Not to mention that search engines just adore correctly coded pages and will really shoot them up much higher than poorly coded pages.
    The third benefit, which you don’t mention, is accessibility. People who use assistive devices rely heavily on pages being correctly formatted and can really run into snarls when presented with a lot of pseudo-markup.
    Thanks for a great article!

  3. Thanks Ian, that was easy to follow and concise. I wonder what extra importance the tag is given.
    Also, it’s worth pointing out that for semantic markup, H1 tags should be kept to the headline of the page only… no multiple H1s on a page.
