Overview
Microsoft programs create HTML documents that are full of non-standard code, sometimes known as “MS-Bloat,” that can severely increase your web page’s file size and affect its functionality.
So why does Microsoft include such useless code and proprietary tags in its HTML documents? If the page is served from a properly configured Microsoft server and viewed in Internet Explorer, the web server can read those tags and provide additional functionality for the user. The extra markup also allows the HTML document to be read back into MS Word.
This is good in theory but bad in practice, since not everyone uses Microsoft’s server software. The College of Engineering web pages are served on a Unix platform via the Apache server software, so the added non-standard code serves no purpose except to increase the file size and sometimes cause problems in browsers other than Internet Explorer.
Here are some additional reasons not to use Microsoft software to create HTML code:
- Browser Incompatibility – An Excel spreadsheet exported as HTML will only work perfectly in Internet Explorer 5.x or later with the Excel Web site component installed. Other MS programs cause similar problems when the page is viewed in any browser other than Internet Explorer.
- PowerPoint – Exporting to HTML in PowerPoint usually creates extremely large image files that will take a very long time for your viewers to download.
- Display Problems – A page that looks beautiful in your Word document will probably end up looking horrible online. Internet browsers do not have the ability to display fonts and font sizes as accurately as a page layout program. The browser incompatibility mentioned above only exacerbates the problem.
- MetaData – Microsoft automatically includes hidden information at the beginning of your HTML documents, including your name, the name of your computer, any document revisions and comments, and much more. This possibly damaging data can be viewed by anyone looking at the source code of your web page.
Here’s an example of a very simple web page’s HTML code created in Microsoft Word. The file size for this document is 2408 bytes.
<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 9">
<meta name=Originator content="Microsoft Word 9">
<link rel=File-List href="./MS_files/filelist.xml">
<title>MS-Bloat Example</title>
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Author>James Davis</o:Author>
<o:LastAuthor>James Davis</o:LastAuthor>
<o:Revision>1</o:Revision>
<o:TotalTime>3</o:TotalTime>
<o:Created>2009-03-24T18:39:00Z</o:Created>
<o:LastSaved>2009-03-24T18:42:00Z</o:LastSaved>
<o:Pages>1</o:Pages>
<o:Company>Engineering Network Services</o:Company>
<o:Lines>1</o:Lines>
<o:Paragraphs>1</o:Paragraphs>
<o:Version>9.2720</o:Version>
</o:DocumentProperties>
</xml><![endif]-->
<style>
<!--
/* Font Definitions */
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;
mso-font-charset:0;
mso-generic-font-family:swiss;
mso-font-pitch:variable;
mso-font-signature:1627421319 -2147483648 8 0 66047 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-parent:"";
margin:0in;
margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:"Times New Roman";
mso-fareast-font-family:"Times New Roman";}
h1
{mso-style-next:Normal;
margin:0in;
margin-bottom:.0001pt;
mso-pagination:widow-orphan;
page-break-after:avoid;
mso-outline-level:1;
font-size:16.0pt;
mso-bidi-font-size:12.0pt;
font-family:Tahoma;
mso-bidi-font-family:"Times New Roman";
mso-font-kerning:0pt;
mso-bidi-font-weight:normal;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;
mso-header-margin:.5in;
mso-footer-margin:.5in;
mso-paper-source:0;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=EN-US style='tab-interval:.5in'>
<div class=Section1>
<h1>MS-Bloat Example</h1>
<p class=MsoNormal><![if !supportEmptyParas]> <![endif]><o:p></o:p></p>
<p class=MsoNormal>This is an example of the “MS-Bloat” that is created by
Microsoft Word. As you can see, the majority of the code is not standard HTML
and will be ignored by browsers.</p>
</div>
</body>
</html>
Here is the same page reduced to simple HTML code. The file size for this document is 372 bytes, less than one-sixth the size of the file exported from Word.
<html>
<head>
<meta http-equiv=Content-Type content="text/html">
<title>MS-Bloat Example</title>
</head>
<body>
<h1>MS-Bloat Example</h1>
<p>This is an example of the “MS-Bloat” that is created by Microsoft Word. As you can see, the majority of the code is not standard HTML and will be ignored by browsers.</p>
</body>
</html>
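The reduced page above leaves out a document type declaration and a character encoding, both of which help browsers interpret the page consistently. The following is a minimal sketch of a more complete version, not the original author's file. It assumes the HTML 4.01 Strict document type and a file saved as UTF-8 (the Word export declared windows-1252), and it writes the curly quotes as character entities so they display correctly regardless of how the file is saved.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en-US">
<head>
<!-- Assumption: the file is saved as UTF-8; adjust charset to match the actual encoding -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>MS-Bloat Example</title>
</head>
<body>
<h1>MS-Bloat Example</h1>
<!-- &ldquo; and &rdquo; are the HTML entities for the left and right curly quotes -->
<p>This is an example of the &ldquo;MS-Bloat&rdquo; that is created by Microsoft Word. As you can see, the majority of the code is not standard HTML and will be ignored by browsers.</p>
</body>
</html>
```

The extra declarations add a little to the file size, but the result is still far smaller than the Word export and contains none of the Office-specific namespaces, conditional comments, or `mso-` style properties.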