I was recently dragged into an argument about whether or not it’s OK to send XHTML files with a MIME type of text/html instead of application/xhtml+xml. The argument goes like this:
Dude[ette] sends an XHTML document as text/html instead of application/xhtml+xml and therefore is gaining no advantage over plain old HTML 4.01. Since XHTML ‘should’ be sent as application/xhtml+xml, doing it any other way is wrong and you will burn in hell. Instead you should use HTML 4.01 because more browsers understand it, and when the time comes to switch, you can just use HTML Tidy or make the small changes your pages need by hand.
Ok, I added the ‘burn in hell’ part, but you get the general idea.
Now I’m not going to go into why you should use application/xhtml+xml or all the benefits and differences between that and text/html. Instead, I’m going to let you read up on the subject yourself, and then tell you why it’s quite acceptable to keep on using text/html for the time being.
All of these links are full of excellent information, and everyone building XHTML websites should know what they say, whether they agree with it or not. Go ahead and take a few moments to read them. I’ll wait here.
- Sending XHTML as text/html Considered Harmful
- RFC 3236: The ‘application/xhtml+xml’ Media Type
- Pretending to Use XHTML
Ok. Wasn’t that fun? Now here are some reasons why sending your XHTML pages as text/html is just fine. It is important to realize that the real argument here is not whether they should be sent as application/xhtml+xml or not, but what to serve as an alternative to user agents that can’t support it, such as IE.
1) XHTML 1.0 is like a gateway drug. It’s not HTML, and it’s not quite XML, but it’s a nice middle ground. It teaches you to close your tags, use lowercase element and attribute names, always quote your attribute values, never ‘minimize’ your attributes, and follow a few more rules. All of these are outlined here: The difference between XHTML 1 and HTML 4. This is a great benefit: it teaches developers to follow stricter rules when building pages, and when the time eventually comes to switch to application/xhtml+xml, there will be fewer changes to worry about than with HTML 4.
2) The XHTML standard allows you to use text/html. A short excerpt:
XHTML Documents which follow the guidelines set forth in Appendix C, “HTML Compatibility Guidelines” may be labeled with the Internet Media Type “text/html” [RFC2854], as they are compatible with most HTML browsers.
So the way I see things, sending XHTML as text/html isn’t a bad thing at all. Using XHTML teaches developers to write well-formed documents, and teaches them the basics of XML, which will be more and more valuable as we progress from HTML to XML documents on the web. I think it’s important that people know the differences when sending documents with different MIME types, but discouraging the use of XHTML sent as text/html doesn’t help anyone, and telling them there are no advantages at all is misleading.
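To make the ‘well-formed’ part concrete, here is a minimal sketch in Python that feeds two fragments to a stock XML parser. The helper name is_well_formed and the sample markup are my own, made up for illustration. Keep in mind this only checks XML syntax (closed tags, quoted and non-minimized attributes); it does not validate against the XHTML DTD, so it won’t complain about things like uppercase element names.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Written the XHTML way: tags closed, attribute value quoted.
xhtml_style = '<p class="note">Line one<br />Line two</p>'

# Written the loose HTML 4 way: unquoted attribute, unclosed <br> and <p>.
html_style = '<p class=note>Line one<br>Line two'

print(is_well_formed(xhtml_style))           # True
print(is_well_formed(html_style))            # False
print(is_well_formed('<input disabled />'))  # False: minimized attribute
```

A browser treating the page as text/html will forgive all three fragments, which is exactly why habits learned under XHTML’s rules pay off later.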
UPDATE (11-10-2004): I just came across this post about content negotiation and saw an excellent conversation in the comments about converting your XHTML to HTML and the issues you will encounter.
One of the readers makes this point:
Your situation is different, though. I totally agree that XHTML should be served as application/xhtml+xml to browsers that support it. With respect to IE, you have some choices.
You could just serve it the XHTML content as text/html. Provided you’ve authored it in HTML-compatibility mode, IE will handle that perfectly well (or, at least, no worse than it handles HTML4).
Alternatively, you could decide, as you apparently have done, to serve IE HTML4, using a conversion that
1) only works with XHTML written in compatibility mode
2) runs the risk of seriously munging your text. (This is, after all, a weblog about web design!)
What’s the benefit of doing this instead of just sending IE the unaltered XHTML document (as text/html)?
And the response:
Nothing, probably. I’m reworking the “CMS” at the moment, and if I ever finish it, I will use a better system for converting to HTML. In fact, since XHTML is currently useless unless you actually need it (like your blog), I’m thinking about serving HTML 4.01 Strict to all. I’ll still store the posts as XML, though, for various reasons, so some kind of rewriting will be necessary.
So it seems to me that the optimal solution for most websites these days would be this:
- Author your pages as XHTML 1.0 Strict or Transitional in ‘compatibility mode’
- Use content negotiation to send application/xhtml+xml pages to user agents that favor XHTML content
- For user agents that don’t like application/xhtml+xml, send them ‘compatible’ XHTML 1.0 as text/html
This way, user agents that can handle XHTML get it, and if the user agent doesn’t understand application/xhtml+xml, you will have only a minimal amount of work on the server side to convert your XML content to ‘compatible’ XHTML.
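For the curious, the negotiation step in the list above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not a drop-in implementation: the function names parse_accept and choose_media_type are made up, the parser handles only the q parameter, and wildcards such as */* are deliberately not counted as XHTML support, since a browser that sends only a wildcard may not actually be able to handle application/xhtml+xml.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header: str) -> dict:
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_range] = q
    return prefs


def choose_media_type(accept_header: str) -> str:
    """Pick the Content-Type for a page authored as 'compatible' XHTML 1.0."""
    prefs = parse_accept(accept_header or "")
    # Only an explicit application/xhtml+xml entry counts; wildcards are ignored.
    xhtml_q = prefs.get(XHTML, 0.0)
    html_q = prefs.get(HTML, 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return XHTML
    return HTML


# A user agent that lists XHTML explicitly gets it.
print(choose_media_type("text/html,application/xhtml+xml,*/*;q=0.8"))
# A user agent that only sends a wildcard falls back to text/html.
print(choose_media_type("image/gif, image/jpeg, */*"))
```

Whatever the server decides, the same document goes out either way; only the Content-Type header changes. Responses negotiated like this should also carry a Vary: Accept header so caches don’t hand the wrong variant to the next visitor.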