by Megan McDermott, 8 March 2008 - 5:11pm
Which Doctype should I use? This is one of the first questions people ask
when they start using web standards. There are four main doctypes in use
today. This article will first define what a doctype is and how it works,
and then go on to explain the four types and help you to decide which one to
use.
What is a doctype?
A doctype (a document type declaration, which points to a DTD, or document
type definition) is a declaration at the very
beginning of an HTML document that tells the browser what type of document it
is. This way the browser knows which specification was used to code the
document and, therefore, how to display it.
A typical doctype looks like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
This says that this is an html document written under the html 4.01
specification, followed by a link to the DTD for that specification on the W3C
website. This is complex and difficult to understand and remember. The W3C has
recognized this and greatly simplified the format for future specifications.
The proposed html 5 doctype looks like this:
<!DOCTYPE html>
Much better!
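To make the parts of the longer html 4.01 declaration easier to see, here is a minimal Python sketch that pulls a doctype apart into its root element, public identifier and DTD link. The pattern and the name `parse_doctype` are assumptions made for this illustration; this is not a full SGML parser and it is not how browsers actually read the declaration.

```python
import re

# Illustrative pattern only: the root element name, then an optional
# PUBLIC identifier and an optional system identifier (the DTD URL).
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[^\s>]+)'
    r'(?:\s+PUBLIC\s+"(?P<public_id>[^"]*)"'
    r'(?:\s+"(?P<system_id>[^"]*)")?)?\s*>',
    re.IGNORECASE,
)


def parse_doctype(text):
    """Return the parts of the first doctype found in text, or None."""
    match = DOCTYPE_RE.search(text)
    if match is None:
        return None
    return match.groupdict()


html401_strict = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
    '"http://www.w3.org/TR/html4/strict.dtd">'
)
parts = parse_doctype(html401_strict)
print(parts["root"])       # the root element the document starts with
print(parts["public_id"])  # names the specification
print(parts["system_id"])  # link to the DTD on the W3C website
```

A page with no doctype at all makes `parse_doctype` return `None`, which is the case that matters for quirks mode later in this article.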
The four choices
Currently, four doctypes are commonly used:
- html 4.01 transitional
- html 4.01 strict
- xhtml 1.0 transitional
- xhtml 1.0 strict
There are two major differences between these doctypes: html vs. xhtml and
strict vs. transitional. Your decision on these two points will determine
which doctype you use.
html vs. xhtml
This is actually a trickier decision than you'd think. xhtml was developed
as a bridge between html and the much stricter xml. It has a stricter syntax,
which is good because it encourages better coding habits. However, the
problem is that some browsers have never properly supported xhtml.
What is a mime type?
A mime type simply identifies the format of the page. A mime type
declaration in a web page looks like this:
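<meta http-equiv="content-type" content="application/xhtml+xml;
charset=UTF-8" />
This says that this is an xhtml document. In order to correctly serve an
xhtml document, the page needs to be identified with this mime type so that
the browser treats it as xhtml. In practice, browsers take the mime type from
the Content-Type header that the web server sends with the page, so the server
has to send the same value.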
The mime type declaration also includes the character encoding, which
tells the browser how to turn the bytes of the page into text. HTML documents
always use the Unicode character set, which is a universal standard for
representing the characters of written languages on computers. The W3C's Character
sets & encodings in XHTML, HTML and CSS tutorial has more details on
character encoding. All you really need to know is that UTF-8 is usually the
best character set to use for html documents. If you're having problems with
certain characters showing up as question marks or square boxes it may be a
problem with the character encoding.
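The effect of a wrong character encoding is easy to reproduce outside the browser. The following Python sketch is only an illustration: the helper name `decode_page` and the sample word are made up for this example. It encodes some text as UTF-8 and then decodes the same bytes three different ways.

```python
def decode_page(raw_bytes, encoding):
    """Decode page bytes, substituting U+FFFD for bytes that cannot be decoded."""
    return raw_bytes.decode(encoding, errors="replace")


# The bytes a server would send for a page saved as UTF-8.
page_bytes = "café".encode("utf-8")

# Declared encoding matches the real one: the text comes back intact.
print(decode_page(page_bytes, "utf-8"))

# Declared as Latin-1: no error, but the accented letter turns into two wrong characters.
print(decode_page(page_bytes, "latin-1"))

# Declared as ASCII: the undecodable bytes become replacement characters,
# which many systems draw as question marks or boxes.
print(decode_page(page_bytes, "ascii"))
```

The bytes are identical in all three cases; only the encoding that the reader assumes is different, which is why the declaration matters.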
Xhtml documents may also use an xml declaration at the top of the document
(before the doctype) to set the character encoding:
<?xml version="1.0" encoding="UTF-8"?>
The only thing we need to know about this right now is that it puts
Internet Explorer 6 into quirks mode (see below). Some html editors will
insert this declaration along with an xhtml doctype. If you're having rendering
problems in IE 6, check to see if this declaration is present and, if it is,
remove it. The W3C notes that:
... if you decide to omit the XML declaration you should choose either
UTF-8 or UTF-16 as the encoding for the page.
What does this mean?
There are two problems with the xhtml mime type. The first is that the
specification says that the browser should break and return an error message
whenever it encounters a problem in the xhtml code. This is the way xml
works. So, for example, if you had forgotten to close a <li> tag, the
browser would stop and return an error. This means that when serving
a document as xhtml you have to make sure that it's correct.
Otherwise users will not be able to see that page at all; they will just see
an error message.
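You can see the same stop-at-the-first-error behaviour with any xml parser. The sketch below uses the one in Python's standard library as a stand-in for a browser's xml parser; the helper name `is_well_formed` and the sample lists are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as xml, False at the first error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


closed_list = "<ul><li>one</li><li>two</li></ul>"
unclosed_list = "<ul><li>one<li>two</li></ul>"  # the first <li> is never closed

print(is_well_formed(closed_list))
print(is_well_formed(unclosed_list))
```

An html parser would quietly recover from the second list. An xml parser rejects the whole document, which is what the user would experience as an error page.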
The second problem is that Internet Explorer doesn't support xml mime
types and has no plans to do so in the future. That means that if you
included the application/xhtml+xml content-type specification, Internet
Explorer would give the user a download dialog since it doesn't know how to
display the page. There are ways that you can get around this (use scripting
to serve xhtml only to browsers that support it) but that's a bit
complicated. The bottom line is that since IE doesn't support the
xhtml mime type, you really can't serve pages as xhtml.
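For reference, the scripting workaround mentioned above usually amounts to looking at the Accept header that the browser sends with its request. The Python sketch below shows the idea under stated assumptions: the function name `choose_content_type` is made up, the header parsing is deliberately simplified, and a real site would run this logic in whatever server-side language it already uses.

```python
def choose_content_type(accept_header):
    """Pick a mime type for an xhtml page from the request's Accept header."""
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        # A missing q value means full preference; q=0 means "not acceptable".
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return "application/xhtml+xml"
    # The browser never said it accepts xhtml, so fall back to html.
    return "text/html"


# A header that lists the xhtml type explicitly.
print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
# A header that only lists image types and a wildcard.
print(choose_content_type("image/gif, image/jpeg, */*"))
```

The wildcard `*/*` is intentionally not treated as support for xhtml, since a browser that sends only a wildcard may still be unable to display the page.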
Since that's the case, what is the benefit of using an xhtml doctype? If
the page can't be served with an xhtml mime type it's really the same as
html. However, xhtml was developed with the intention that it would also work
as text/html. There is really nothing wrong with serving an xhtml
document as html. According to a W3C background document:
In addition, [XHTML1] defines a profile of use of XHTML which is
compatible with HTML 4.01 and which may also be labeled as text/html.
Just keep in mind that if/when you do change the mime type on your page
to xml you will need to make sure that everything is valid. Since you're
unlikely to go back and do that on old pages, there really is no danger in
using an xhtml doctype with a text/html mime type.
xhtml is stricter than html
The one caveat here is that xhtml does encourage better quality coding
practices than html. In xhtml:
- all tags and attributes must be lower case
- all tags must be closed, including non-enclosing tags such as <img /> and <br />
- all attributes must be quoted
- attribute minimization is not allowed (e.g. checked="checked" not just checked)
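Most of the rules in this list are plain xml well-formedness rules, so an xml parser can demonstrate them. The Python sketch below is an illustration only: the `SAMPLES` table and the function name `check_samples` are made up for this example, and the standard-library parser checks well-formedness, not validity against the xhtml DTD.

```python
import xml.etree.ElementTree as ET

# Pairs of (html-style markup, xhtml-style markup) for three of the rules above.
SAMPLES = {
    "tags must be closed": ("<p>one<br>two</p>", "<p>one<br />two</p>"),
    "attributes must be quoted": ("<p class=intro>hi</p>", '<p class="intro">hi</p>'),
    "no attribute minimization": (
        '<input type="checkbox" checked />',
        '<input type="checkbox" checked="checked" />',
    ),
}


def parses_as_xml(markup):
    """Return True if the markup is well-formed xml."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


def check_samples(samples):
    """Map each rule to (html-style result, xhtml-style result)."""
    return {
        rule: (parses_as_xml(loose), parses_as_xml(strict))
        for rule, (loose, strict) in samples.items()
    }


for rule, (loose_ok, strict_ok) in check_samples(SAMPLES).items():
    print(rule, "| html style:", loose_ok, "| xhtml style:", strict_ok)
```

The lower case rule is different: `<P>hi</P>` is still well-formed xml, so catching it takes a validator that checks the document against the xhtml DTD.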
As the Web Standards Project notes:
The margin for errors in HTML is much broader than in XHTML, where the
rules are very clear. As a result, XHTML is easier to author and to
maintain, since the structure is more apparent and problem syntax is easier
to spot.
For this reason, it may be a good idea to use an xhtml
doctype. There really is no harm in serving an xhtml document as
html and it will help you to learn better coding practices.
What about xhtml 1.1?
The W3C guidelines say that xhtml 1.1 documents should be served as
xhtml with an xml mime type. This means that you shouldn't serve
xhtml 1.1 documents as text/html and the previously discussed problems with
xhtml mime types apply.
Transitional vs. strict
Transitional doctypes were invented to provide a way for webmasters to
transition to the new, stricter specification (html 4.01 from html 3.2). They
allow you to get away with older, deprecated tags such as <font> tags. If you
are working on moving to a new specification this would be an appropriate
doctype to use.
Strict doctypes don't let you use these older tags. That doesn't mean that
browsers won't display them; the page may look the same as it would with a
transitional doctype. The difference is that when you run the validator all
of those deprecated tags and other errors will be reported. A strict
doctype helps you to write better html by reporting all of these
errors, including:
- deprecated tags and attributes (see 'Depr' column)
- improperly nested tags (i.e. <p><strong>some words</strong></p> not <p><strong>some words</p></strong>)
- inline elements that are not contained by a block-level element (e.g. <strong> must be inside a <p> or other block-level element such as <div> or <blockquote>)
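The nesting rule in particular is simple to check mechanically: every closing tag has to match the most recently opened tag. Here is a rough Python sketch of that idea using a stack. The class name `NestingChecker` and the function `nesting_errors` are made up for this example, and it only handles markup in which every element has an explicit closing tag, so it is no substitute for the validator.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Record closing tags that do not match the most recently opened tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags that are currently open
        self.errors = []     # closing tags that arrived out of order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(tag)


def nesting_errors(markup):
    """Return the list of closing tags that were improperly nested."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors


print(nesting_errors("<p><strong>some words</strong></p>"))
print(nesting_errors("<p><strong>some words</p></strong>"))
```

For the second string the checker reports the `</p>`, because it arrives while `<strong>` is still the innermost open tag.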
What is quirks mode?
Quirks mode was created by browsers to ensure that old documents wouldn't
be broken by changes to their rendering engines. This was because older
versions of browsers didn't get CSS quite right. Many old documents that were
designed for the old implementation would break if the implementation was
corrected in a new version of the browser.
To ensure that those old documents would still be displayed as intended,
browser makers decided to use doctypes to decide whether documents should be
displayed in the old way or the new way. The table towards the bottom of this page shows you which doctypes
trigger quirks mode in which browsers. In that table you'll also notice an
“almost standards mode” for some browsers. This is just what it
says – almost like standards mode but not quite. Either way, you can
end up with an unreliable display if you use a “quirks mode”
doctype.
One of the reasons to use a strict doctype is that you'll always
get the standards-compliant rendering mode. This way your page will
always render according to the specification. It will also be
more likely to look the same in different browsers. Consistent rendering is
one of the reasons to use a doctype in the first place.
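As a rough picture of how doctype switching works, here is a Python sketch that maps the four doctypes from this article to a rendering mode. Treat it as an assumption-laden simplification: the function name `rendering_mode` and the mode labels are made up, the rules follow the doctype-sniffing behaviour that browsers later standardized rather than any one browser's table, and older browsers (IE 6 in particular) differ in the details.

```python
def rendering_mode(doctype):
    """Guess the rendering mode for a page served as text/html.

    doctype is a (public_id, system_id) tuple, or None if the page has no
    doctype. Only the four doctypes discussed in this article are covered.
    """
    if doctype is None:
        return "quirks"  # no doctype at all
    public_id, system_id = doctype
    pid = public_id.lower()
    if pid.startswith("-//w3c//dtd html 4.01 transitional//"):
        # Without the link to the DTD this doctype falls back to quirks mode.
        return "almost standards" if system_id else "quirks"
    if pid.startswith("-//w3c//dtd xhtml 1.0 transitional//"):
        return "almost standards"
    if pid.startswith(("-//w3c//dtd html 4.01//", "-//w3c//dtd xhtml 1.0 strict//")):
        return "standards"
    return "unknown"  # not one of the four doctypes covered here


strict = ("-//W3C//DTD HTML 4.01//EN", "http://www.w3.org/TR/html4/strict.dtd")
loose = ("-//W3C//DTD HTML 4.01 Transitional//EN", "http://www.w3.org/TR/html4/loose.dtd")
print(rendering_mode(strict))
print(rendering_mode(loose))
print(rendering_mode(None))
```

The sketch reflects the point made above: only the two strict doctypes land in full standards mode in every case it covers.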
Which doctype to choose?
When you're just starting out with web standards you'll probably want to
start with an html 4.01 transitional doctype. Once you have your pages
validating with that doctype you should change to html 4.01 strict and
continue to work on any errors that are detected. From there you may choose
to move on to xhtml transitional and finally xhtml strict.
Doctype codes to copy & paste
Whether you choose html or xhtml, transitional or strict, below is the
doctype code that you will need to copy and paste into your pages. Mime type
declarations with the utf-8 character encoding are included.
Html 4.01 Transitional
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<meta http-equiv="Content-Type"
content="text/html;charset=utf-8">
Html 4.01 Strict
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<meta http-equiv="Content-Type"
content="text/html;charset=utf-8">
Xhtml 1.0 Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
Xhtml 1.0 Strict
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
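If you generate pages from a script or template, the two decisions from this article (html or xhtml, transitional or strict) can be turned into a simple lookup. The Python sketch below is one hypothetical way to do that: the doctype strings are copied from the listings above, while the names `DOCTYPES` and `page_header` are made up for this example.

```python
# Doctypes copied from the listings above, keyed by the two decisions.
DOCTYPES = {
    ("html", "transitional"): (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n'
        '"http://www.w3.org/TR/html4/loose.dtd">'
    ),
    ("html", "strict"): (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"\n'
        '"http://www.w3.org/TR/html4/strict.dtd">'
    ),
    ("xhtml", "transitional"): (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
    ),
    ("xhtml", "strict"): (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    ),
}


def page_header(language, strictness):
    """Return the doctype plus a utf-8 text/html meta tag for the given choice."""
    doctype = DOCTYPES[(language, strictness)]
    # In xhtml the empty meta element has to be closed with " />".
    closing = " />" if language == "xhtml" else ">"
    meta = '<meta http-equiv="Content-Type" content="text/html;charset=utf-8"' + closing
    return doctype + "\n" + meta


print(page_header("xhtml", "strict"))
```

Asking for a combination that is not in the table, such as `page_header("xhtml", "1.1")`, raises a `KeyError`, which is a reasonable reminder that only these four doctypes are covered here.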
References