Monday, September 17, 2007

IE's Memory Leak Fix Greatly Exaggerated

So Microsoft (as reported here and here) recently released a "cumulative security update" for IE that fixes its egregious memory leaks. Sounds great. Even if it takes a while to get everybody updated, at least the problem is fixed and we can all stop bending over backwards to work around it in our libraries, right?

Not So Fast
Let's have a look at the actual knowledge-base article to see exactly what it says:
"... a Web page that uses JScript scripting code, a memory leak occurs in Internet Explorer. When you visit a different Web page, the leaked memory is not released."
So far so good. It even references the original "circular-reference" knowledge-base article, implying that this is indeed what is fixed.

When I saw this article, I nearly spilled tea all over the keyboard. They really fixed this issue? You mean I can untangle all the painful code in GWT that works around this issue, diligently cleaning up all its circular DOM references under all sorts of circumstances?

Settle Down, Beavis
Before I got too excited, I had to do a little gut-check. Did they really go back and make it possible for their garbage collector to chase references through COM objects? That would be wonderful, but I'm not holding my breath.

And it's a good thing, because there's basically no way in hell they did that. In fact, it turns out that all they did was write a little code to sweep the DOM on unload and clean up all the extant circular references on those elements. This means that *all elements not still attached on unload are still leaked, along with the transitive closure over all referenced JavaScript objects*. In even marginally complex applications, that means you're still going to leak like a bloody sieve!

I put together a little test script to show this in action. Have a look in any version of IE, and watch it spew memory!
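
The original test script isn't reproduced here. As a rough sketch of the kind of pattern that should still leak after the update (the function name makeDetachedCycle and its doc parameter are made up for illustration; in a browser you would pass document):

```javascript
// Hypothetical illustration, not the original test script.
// Builds an element that takes part in a DOM <-> JavaScript cycle and
// is never attached to the document, so an unload-time sweep of the
// attached DOM would never visit it.
function makeDetachedCycle(doc) {
  var elem = doc.createElement('div');

  // A JavaScript-side wrapper holding a reference to the element.
  var wrapper = { element: elem, payload: new Array(10001).join('x') };

  // Expando property: element -> wrapper -> element closes the cycle.
  elem.wrapper = wrapper;

  // The handler is a closure over `elem` and `wrapper`, which adds a
  // second cycle: element -> handler -> closure scope -> element.
  elem.onclick = function () {
    return wrapper.payload.length;
  };

  // Deliberately no appendChild(): the element stays detached.
  return elem;
}
```

Nothing here is exotic. It's the same wrap-the-element pattern that most non-trivial libraries use, minus the step where the element gets attached to the page.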

I'm With Alex ...
... on this one. This is more like a bad joke than anything else. I recognize that fixing IE's memory leaks is a really complex problem, but the fact that it's not being done is still more evidence that Microsoft is abandoning IE, at least as far as any real progress is concerned. I just wish they would come out and say it.

In the Meantime
Don't go ripping out that memory-leak cleanup code. And keep checking for leaks (perhaps with Drip).

Sunday, January 07, 2007

This Old Blog

I think I may have broken the 'longest period of time between two blog posts' record, at roughly two years. The few people who remember it will probably be asking themselves, 'what the hell happened to it, anyway?'

Well, you may not be terribly surprised to learn that I took a job at Google, and have been incredibly busy in the intervening time, working on the Google Web Toolkit. Initially, I tried to keep up with the blog and related flood of random email, but it got to be a bit more than I had time to deal with, so I just decided to nuke it so I could concentrate. This worked, of course, but seemed to irritate a few people who saw a little value in some of the posts.

Now that I've got a little more time, I've decided to resurrect it. I kept copies of the old articles around, some of which I've re-posted. I apologize for the loss of the old comments -- some of them were really helpful, but they're really a pain to get back into Blogger (maybe I'll find a way to do this easily sometime).

I do plan on covering largely the same subjects, and please feel free to email me with questions or subjects you'd like me to cover.

Monday, July 11, 2005

Google Maps Information

I've been getting a slow but steady stream of requests for more information on how to work with Google Maps. I would love to respond to each of these individually, but honestly haven't been doing much with it since I wrote my first few articles on the subject.

Fortunately, I just ran across the Google Maps Mania blog. It seems to be doing a pretty good job of aggregating all of the stuff going on with Maps at the moment. I'm really quite amazed at the array of different things people are building with it (and I doubt there's any way I could keep up with it all anyway).

Happy mapping to all!

Wednesday, June 22, 2005

Another Word or Two on Memory Leaks

Ok, I promised to explain in more detail how to get rid of memory leaks once you've found them. Though I haven't had time to gather all of the information and examples I would have liked, I have run across a few external resources that might be of help.

The first of these is a new Microsoft Technical Article that discusses the various forms that IE memory leaks can take in some detail. Particularly interesting is the fact that it discusses an even more obscure type of leak in which the leaked object isn't even a DOM element. It's definitely worth a read.

A bit more information on JavaScript closures can be found on Eric Lippert's blog (which I highly recommend) here.

For a nice, straightforward library that does an excellent job helping you avoid the problem altogether, take a look at Mark Wubben's Event Cache. I particularly like the fact that if you follow a simple set of rules, then you cannot easily leak elements.
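
To give a feel for the approach, here is a minimal sketch of an event cache. This is not Mark's code or its API; the names HandlerCache, add, and flush are invented for this example, and it assumes handlers are attached as plain on... properties rather than through attachEvent.

```javascript
// Hypothetical sketch of the idea behind an event cache: every handler
// is attached through one place, so every handler can be detached again.
function HandlerCache() {
  this.entries = [];
}

// Attach `handler` as node['on' + eventName] and remember the pair.
HandlerCache.prototype.add = function (node, eventName, handler) {
  node['on' + eventName] = handler;
  this.entries.push({ node: node, name: eventName });
};

// Detach everything that was registered, breaking the element -> handler
// references, and drop the cache's own references to the nodes.
HandlerCache.prototype.flush = function () {
  for (var i = this.entries.length - 1; i >= 0; i--) {
    var entry = this.entries[i];
    entry.node['on' + entry.name] = null;
  }
  this.entries.length = 0;
};
```

The simple rule is then: only ever attach handlers through add(), and call flush() from the page's unload handler.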

On Another Note
I suggested earlier that the slowdown associated with leaking large amounts of memory in IE might be associated with hash tables or something similar getting full and therefore more inefficient. Eric Lippert left the following comment, which makes perfect sense to me and seems more likely to characterize the problem:
The symbol tables are very search-efficient. What's more likely is that the non-generational mark and sweep garbage collector is getting more and more full, and therefore taking longer and longer to walk each time a collection happens. A generational GC, like the .NET framework's GC, solves this problem by not GCing long-lived networks of objects very often.

And don't worry, I haven't forgotten about Drip at all. As time allows, I will be adding the features that I mentioned earlier. Of course, if anyone else wants to play with the source and make their own additions, please feel free!

Monday, June 06, 2005

Drip 0.2

Happy Monday morning to everyone (or, depending upon where you may be, evening). This is just a quick note to announce Drip 0.2! Here is a quick list of changes in this version:

  • The main window is now resizeable.

  • The property list is sorted.

  • Property lists are now separate from the leak dialog. You can double-click on an element to see its properties. And you can double-click on any object property to see its properties. Think of it as a poor-man's expandable property list.

  • The source is also available here.


My current list of definitely known issues is as follows:

  • Still need to hook node.cloneNode() to catch all possible leaks.

  • Still need to hook new windows as they are created.

  • It sometimes reports that leaks are coming from about:blank rather than their actual source.


And my current list of possible issues is:

  • A couple of people have mentioned crashes occurring, which I have not yet been able to reproduce. If anyone having such a problem has a chance to build the source and catch this in a debugger, that would be wonderful.

  • I've also heard mention of issues with deeply-nested frames. My demo leak test page should exhibit this issue, but seems to work fine. Again, any help appreciated.


As always, please let me know of any other issues you discover, suggestions, and (even better) patches. And I haven't forgotten about my promise to provide a solid overview of how to deal with leak issues. I'm still doing a bit of research on the subject, but this will be forthcoming soon!

Saturday, June 04, 2005

Drip Redux

Wow. Thanks for all the excellent feedback on Drip. It was really just a tool that I needed for myself, but I'm glad that it may prove useful for others as well.

There were a lot of comments, both here and on Slashdot, so I'm going to try to put as many of my thoughts and responses as possible in this post. As such, it may be a bit of a grab-bag.

Exacerbating the problem
The first point I want to make is in response to one or two comments here, and many on Slashdot: That is, that I am not particularly concerned about whether or not I am exacerbating the problem by helping developers to "work around" IE's issues. Don't get me wrong; I find it just as unfortunate as everyone else that these problems exist in the first place. It is truly awful that developers using such a high-level tool as a web browser have to take memory allocation issues into account. Particularly given the fact that they're not really given the tools to effectively deal with them (window.CollectGarbage() doesn't count, since it won't really fix the problem).

Anyone who's spent a significant amount of time developing software has to realize that they will always be dealing with inadequacies of their tools and platforms. This has always been the case. It doesn't mean that vendors shouldn't fix their mistakes, but it does mean that you can't usually bitch at your customers for their choice of platform. If you are going to make software development your profession, then you must generally accept this responsibility. Certainly there are cases where you can dictate the details of the client's platform, but this is not the case for most vendors.

I also want to point out two things about this specific problem. First, IE's memory leak issues stem largely from the underlying model that allows scripting languages to interface with native COM objects (that is, every object accessible to scripting languages is a COM object deriving from IDispatch). While imperfect, this model is also quite efficient -- and given that it was developed in the mid-'90s, not an unreasonable compromise at the time. The second point I want to make is that IE is not the only browser with this problem. Mozilla had fairly severe memory leak issues until recently, and I've been told that Safari does as well. So let's not use this as an excuse to jump all over Microsoft.

When do leaks matter?
This is another point that I think bears some discussion. If you've spent a little time pointing Drip at existing sites, you've probably found that most sites exhibit no issues at all. This is simply because most sites don't use enough complex DHTML (with complex object graphs and the like) to create the specific sort of circular references that cause leaks. On most sites that do have a few leaks, the leaked elements seem to be PARAM objects passed to Java and/or Flash components. I've gotten mixed reports on when this happens, and when it causes a significant leak, so the jury's still out on whether this matters.

On the other hand, I saw one comment to the effect that Google Maps leaks a lot of elements. This is exactly the sort of application that is in danger of leaking enough to matter. If you look at the Maps code, you'll discover that they've done an excellent job of abstracting the components that comprise the application, and it's quite easy to follow (if you de-obfuscate it, anyway). And I believe that the fact that it leaks so much is actually an indication that its developers have done a good job. The problem is that the very abstractions that make a code base of that size manageable make it really easy to create leaks. Because there are a lot of references among all of its objects, and most DOM elements are wrapped in one way or another, even a single leak can cause the entire reference graph to leak. Nasty, huh?
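
To make that last point concrete, here is a small sketch that walks everything reachable from a single object. The function reachableFrom and the toy object graph are invented for illustration and have nothing to do with the actual Maps code. It only follows enumerable properties, so anything held solely by a closure is not counted.

```javascript
// Hypothetical illustration: collect every object reachable from `root`
// by following enumerable properties. If `root` is a leaked element,
// everything in the result is kept alive along with it.
function reachableFrom(root) {
  var seen = [];
  var pending = [root];
  while (pending.length > 0) {
    var obj = pending.pop();
    if (obj === null || typeof obj !== 'object' || contains(seen, obj)) {
      continue;
    }
    seen.push(obj);
    for (var key in obj) {
      pending.push(obj[key]);
    }
  }
  return seen;
}

// Linear identity search; fine for a toy graph.
function contains(list, item) {
  for (var i = 0; i < list.length; i++) {
    if (list[i] === item) return true;
  }
  return false;
}

// Toy graph: a wrapped element whose wrapper points back into the rest
// of the application.
var app = { name: 'app', widgets: [] };
var widget = { name: 'widget', app: app, element: null };
var element = { tagName: 'DIV', wrapper: widget };
widget.element = element;
app.widgets.push(widget);

var kept = reachableFrom(element);
```

Here kept ends up holding four objects: the element, its wrapper, the app, and the app's widget list. Leak the one element and the whole lot goes with it.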

How do I fix leaks?
This is a pretty complex question. So I've decided to punt this to a forthcoming post. There are a lot of resources out there on this subject, but I hope to gather as much of it as possible into one post so that I can provide a reasonable framework for finding and dealing with them.

What now?
I've gotten a lot of helpful suggestions and a couple of bug reports. What I would like to do now is to list all of the fixes and enhancements that I can think of, and solicit advice on how to prioritize them. Once I've had another pass at the code, I will release the source as well so that you can all help maintain it! This is my current list:

  • Deal with deeply nested frames. This is a real issue for a lot of sites -- apparently Drip only hooks one level of nested frames, but fails to hook deeper windows.

  • Hook the cloneNode() method. This is simply an oversight on my part, but it's necessary to catch all possible leaks.

  • Resizable window. This was just me being lazy. I've gotten really used to constraint-based layout in the Java world, and to be honest, I just didn't want to deal with doing this by hand in MFC.

  • Sorted and expandable element properties. 'Nuff said.

  • Hook new windows (via window.open). I think this is feasible, and will do my best.

  • Anything else you guys can think of!

Tuesday, May 31, 2005

Drip: IE Leak Detector

To anyone still following this site, my apologies for taking a millennium or two between posts recently. Things have been a bit crazy of late, but I have something to introduce that will hopefully make up for the radio silence:

Drip -- an Internet Explorer leak detector.

Over the last few months, a number of people have written to me or left comments asking questions about their memory leak issues with DHTML (or AJAX or whatever-you-want-to-call-it-this-week) applications. Unfortunately, there's not much I could offer in the way of advice that most people don't already know. Get rid of closures, unhook your event handlers, etc. This advice just isn't all that helpful when you've got a giant mess of JavaScript (often inherited) and visually detecting leak scenarios can be maddeningly subtle.

I did, however, find it quite surprising that no one had ever built a leak detector for Internet Explorer (or apparently for any other browser with leak problems; Mozilla has some, but they seem to be more for developers working on Mozilla itself, and the browser does a pretty good job of cleaning up leaks anyway). So I built one.

What it Does

It's a pretty simple application. Basically, it lets you open an HTML page (or pages, in succession) in a dialog box, mess around with it, then check for any elements that were leaked.

The interface is currently rather spartan. Here's what the main app looks like:

[Sorry, I lost the image somewhere along the way]

On the top you'll notice what looks like a crude version of Explorer's navigation bar. You've got the standard back and forward buttons, the URL box, and the 'go' button. These behave exactly as you might expect. To the right of it, however, is a 'check leaks' button, which will be grayed out when you first run the app. In order to try it out, you will first need to go to an HTML page (preferably one that you suspect leaks). The test page at [sorry I lost this page] will work. When you load this page, the 'check leaks' button will become enabled. Click it to see the following report:

[Yet again, I lost the image somewhere along the way]

This simple page leaks two DOM elements, a DIV and a BUTTON. These two elements are displayed in the top list, along with their source documents (useful if you've loaded more than one document between leak tests, or if you have more than one frame), the number of outstanding references on them, and their ID and CLASS attributes.

If you click on one, you'll see a list of its enumerable attributes in the bottom list. A particularly useful attribute for identifying the elements is 'innerHTML'.

Blowing Memory

Back to the main dialog for a moment. You might also have noticed the interestingly-titled 'blow memory' button. Its function is simple: to constantly reload a page as fast as it can, and to report the process' memory usage in the list box below. This is a helluva lot easier than pressing F5 for hours to determine how fast a page leaks memory.

How it Works

Fortunately, Internet Explorer's architecture made this app fairly easy to build. It's basically a simple MFC app with a browser COM component in it. The strategy for catching leaked elements is as follows:

  • When a document has been downloaded, sneakily override the document.createElement() function so that the application is notified of all dynamically-created elements.

  • When the document is fully loaded, snag a reference to all static HTML elements.

  • To detect leaks: navigate to a blank HTML page (so that IE attempts to release all of the document's elements),

  • force a garbage-collection pass (by calling window.CollectGarbage()),

  • and look at each element to see if it has any outstanding references (by calling AddRef() and Release() in succession on it).


Within the leak dialog, each element's attributes are discovered and enumerated using the appropriate IDispatch/ITypeInfo methods.
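
The first step of the strategy can be sketched from the script side as well. Drip itself does its hooking from native code; the following only illustrates the idea, and hookCreateElement, doc, and created are hypothetical names. It also assumes doc.createElement can be wrapped and invoked via call(), which may not hold for host objects in every browser.

```javascript
// Hypothetical sketch: wrap doc.createElement so that every dynamically
// created element is also recorded in the `created` array.
function hookCreateElement(doc, created) {
  var original = doc.createElement;

  doc.createElement = function (tagName) {
    var elem = original.call(doc, tagName);
    created.push(elem); // the tracker now holds its own reference
    return elem;
  };

  // Hand back a function that restores the original method.
  return function () {
    doc.createElement = original;
  };
}
```

Holding on to each created element is what makes the later check possible: once the page has been navigated away and a collection pass forced, the tracker can ask each element whether anything else still references it.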

Caveats

This is basically an alpha release. The interface more or less blows, and I may have left glaring holes in the leak-detection strategy or in the code itself. It seems to work for me, but I would really like for anyone using it to keep an eye out for any problems so that I can fix them. And please don't hesitate to contact me, of course, if you have any ideas, praise, criticism, or even rants to offer. I really want this to help people to stop dealing with these god-awful leaks, and since Microsoft doesn't seem inclined to fix this design flaw, we can at least try to make it more bearable.

What Next?

Obviously, I would like any feedback I can get. There are definitely some interface quirks I need to iron out. And I would like to do more to help determine the actual cause of each leak. There are a few things that I would like to find out, and if anyone has any pointers, please share them:

  • Can you perform similar tricks with Safari/KHTML or Opera? (I know you can with Mozilla, but since it doesn't really leak much, that seems rather pointless)

  • Does anyone know if it's possible to enumerate variables on one of IE's JavaScript closures? (meaning the stack frame hanging off of the function reference)

  • How about enumerating expandos on IE DOM objects from C++? (I only seem to get built-in properties from ITypeInfo)


I'm sure other questions will come up in the near future. Oh, and I will be releasing the source before too long, as soon as I get a few things cleaned up.

Happy leak hunting!