Search engine optimization is one of the most popular subjects when nerds sit around and talk about Flash. “Does Google index your swf files?” seems to be the most popular question, usually garnering plenty of ‘yes’ and ‘no’ and ‘maybe’ answers. The real answer to this question, once and for all, is this:
It doesn’t matter.
To understand this answer, you need to understand what Flash is. And to do that, you need to understand modern web development philosophy. First off, you need to embrace web standards. Using semantic markup and separating content from style and behavior is the only way you should be building your sites. Many web standardistas have been recommending this method of web development for years, and rightly so. However, this post isn’t the place to go into the whys of this type of development, so I’ll skip that part and just say this about how it’s done: There are three areas of front-end web development: Content, Style, and Behavior. You should always keep these three things separated as much as possible.
That brings up the question: “Where does Flash fit into this three-pillar method of web development?” Is it content? Is it behavior? Is it style? While it could be considered all three, most professional Flash developers will remove the content from their Flash movies and load it in using Flash Remoting or XML files. That leaves us with style and behavior.
Style is added using CSS. Generally when you add images to your HTML that are purely presentational (no text or required content in them) you should add them in using CSS. In most cases you don’t want Google to index them because people don’t search the web for ‘top left rounded corner gif.’ They search for content. Even if Google upgrades their crawler someday to read CSS files and index the images, they probably wouldn’t use the information for more than statistical analysis because of this.
Behavior is generally added using Javascript. Maybe you want a new window to open set to a certain size, or you want to use some fancy Ajax to let users rate something without refreshing the page. This should all be added unobtrusively, so that if the browser doesn’t support Javascript, the page will hopefully still work. Unfortunately, not everyone considers this, and these days Javascript is becoming more and more of a requirement to use most websites. So you should always provide some sort of alternative for non-Javascript users. When it comes to indexing behavior, Google will for the most part not index your Javascript files. Even if it did, most web users would have no idea what the .js file they are looking at actually does. When using Javascript to change your document, Google will not read the ‘final’ page, but only the raw HTML file. Google does not render Javascript (see footnote 1).
Now that you know all of this, it’s time to look at how to treat your Flash content. Since we’ve determined we don’t want Google to index our swf files, but we do want it to index the content displayed inside them, what is the best way to go about this?
As stated before, if you are building Flash sites professionally, you probably move all your content out of your Flash movie and into an XML file or keep it in a database. This makes it much easier to allow Google to index this content by using progressive enhancement.
Progressive enhancement is a method of web development that goes hand in hand with Web Standards. You start with your HTML (your content), then add CSS (your look and feel), then add in additional behavior (Javascript, Ajax, Flash, any other interactivity that isn’t handled automatically by the browser).
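As a rough sketch of that first layer, the snippet below turns a few content records into the plain HTML that would sit where the Flash movie goes. The `articles` records, their field names, and both function names are made up for illustration. In practice this step would likely run on the server in whatever language your site already uses; it is written in Javascript here only to match the rest of this post.

```javascript
// Hypothetical content records, standing in for rows from a database
// or nodes from the XML file your Flash movie loads.
var articles = [
  { title: "Spring Menu", url: "/menu/spring", summary: "New dishes for the season." },
  { title: "Opening Hours", url: "/hours", summary: "When to find us." }
];

// Escape characters that have special meaning in HTML so that
// content from the data source cannot break the surrounding markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a plain list of links and summaries. This is the markup a
// search engine (or a visitor without Flash) would actually see.
function renderFallbackHtml(items) {
  var html = "<ul>\n";
  for (var i = 0; i < items.length; i++) {
    var item = items[i];
    html += '  <li><a href="' + escapeHtml(item.url) + '">' +
      escapeHtml(item.title) + "</a> " +
      escapeHtml(item.summary) + "</li>\n";
  }
  html += "</ul>";
  return html;
}

var fallbackHtml = renderFallbackHtml(articles);
```

The resulting string is what you would place inside the element that the Flash movie later replaces, so the same data feeds both the HTML and the swf.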
The best way to add Flash progressively is by using Javascript, or more specifically, a script like FlashObject. First you lay out your page as if you aren’t using Flash. If you are using a database for your content, you can spit out that data as HTML where the Flash movie will go on the page (or maybe just a preview of the content; it’s up to you to show Google the content you would like indexed). Then you use FlashObject to replace this content only if the user has Javascript enabled and the required Flash plugin version.
Here’s a small example of what that might look like:
<div id="flashcontent">
This is replaced by the Flash content if the user has the correct version of the Flash plugin installed.
Place your HTML content in here and Google will index it just as it would normal HTML content (because it is HTML content!)
Use HTML, embed images, anything you would normally place on an HTML page is fine.
</div>
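<!-- Assumes flashobject.js from the FlashObject download sits next to this page; adjust the path to match your site. -->
<script type="text/javascript" src="flashobject.js"></script>
<script type="text/javascript">
// <![CDATA[
// Arguments: swf path, id for the embedded movie, width, height, required Flash version, background color.
var fo = new FlashObject("flashmovie.swf", "flashmovie", "300", "300", "8", "#FF6600");
// Replace the contents of the element with id "flashcontent" if the plugin check passes.
fo.write("flashcontent");
// ]]>
</script>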
This causes Google to skip the Flash swf files and only index the HTML (the content!) you place on the page. You can place links to other pages, images, whatever you want Google to index, and when a viewer with a browser that supports Flash visits your site, they will then see the Flash content. This gives you full control and much greater predictability over what content Google will index. And if your content is pulled from a database that is editor controlled, your pages will update and be re-indexed as the content changes without the need to re-publish all your swf files.
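The decision behind that swap can be sketched roughly as follows. This is not FlashObject’s actual source, just a minimal illustration of the idea, and the function names (`parseVersion`, `meetsRequiredVersion`, `chooseContent`) are my own. It assumes the installed plugin version has already been detected and is passed in as a string, or as `null` when no plugin was found.

```javascript
// Parse a version string such as "8" or "9.0.28" into
// [major, minor, revision], treating missing parts as 0.
function parseVersion(version) {
  var parts = String(version).split(".");
  var numbers = [];
  for (var i = 0; i < 3; i++) {
    var n = parseInt(parts[i], 10);
    numbers.push(isNaN(n) ? 0 : n);
  }
  return numbers;
}

// True when the installed version is equal to or newer than the
// required one, comparing major, then minor, then revision.
function meetsRequiredVersion(installed, required) {
  var have = parseVersion(installed);
  var need = parseVersion(required);
  for (var i = 0; i < 3; i++) {
    if (have[i] > need[i]) { return true; }
    if (have[i] < need[i]) { return false; }
  }
  return true; // versions are identical
}

// Decide what the visitor ends up seeing. installedVersion is null
// when no Flash plugin was detected (an assumption of this sketch).
function chooseContent(installedVersion, requiredVersion) {
  if (installedVersion !== null &&
      meetsRequiredVersion(installedVersion, requiredVersion)) {
    return "flash";
  }
  return "html";
}
```

With Javascript turned off none of this runs at all, so the HTML simply stays where it is, which is exactly what a crawler reading the raw page gets.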
1 Currently Google does not render the Javascript on a page, but there are rumors that they are developing a new crawler based on Firefox (they employ a number of Mozilla Foundation members) that will index pages based on how the browser sees them, instead of the raw HTML content. This means HTML hidden by CSS may not be indexed, and pages that are altered by Javascript after they load will be indexed how they appear to the user. However, these are only rumors, and until that happens Google will ignore your Javascript content.
Note: In this article I use the ‘Google’ name often, but it can be interchanged with any search engine, as they all work roughly the same way.