Sunday, April 11, 2010

Yapeal, memory, and XML part 3

Hi all, sorry it took so long to get to this post, but while writing the last one I got some ideas and decided to try them out in Yapeal.

OK, let's start with a summary of the earlier posts. First, there are several ways to work with XML in PHP. Some work with the whole document at once, like DOM, SimpleXML, and most versions of XSL. Others work with small pieces at a time, like SAX and XMLReader.
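To make the difference concrete, here is a minimal sketch that totals the same value both ways. The sample XML is made up for illustration, and its element and attribute names (`rowset`, `row`, `qty`) are assumptions, not a real API response.

```php
<?php
// Hypothetical sample document; element and attribute names are
// assumptions for illustration only.
$xml = '<result><rowset name="items">'
     . '<row id="1" qty="10"/><row id="2" qty="20"/><row id="3" qty="30"/>'
     . '</rowset></result>';

// Whole-document approach: SimpleXML parses everything before you
// can look at any of it.
$doc = simplexml_load_string($xml);
$totalWhole = 0;
foreach ($doc->rowset->row as $row) {
    $totalWhole += (int) $row['qty'];
}

// Small-pieces approach: XMLReader visits one node at a time.
$reader = new XMLReader();
$reader->XML($xml);
$totalStream = 0;
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'row') {
        $totalStream += (int) $reader->getAttribute('qty');
    }
}
$reader->close();

print $totalWhole . ' ' . $totalStream . PHP_EOL;
```

Both loops end up with the same total. The difference is that the first one needs the whole tree built first, while the second only ever holds the current node.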

Next, we know some APIs like Account Balance are small, some are large but limited like Wallet Journal, and others like Asset List aren't limited and can become very large. We also know that as Yapeal now stands it does a lot of converting, copying, and duplicating of the XML, which multiplies the memory used. When Yapeal was first started it mostly handled the small or large-but-limited APIs, so memory never became much of a problem and SimpleXML made it just that, simple. Since then Yapeal has gone through a lot of changes, but the logic used to process the XML has largely stayed the same, even after adding some of the largest unlimited APIs. PHP has also changed: PHP 5.3 now shows the memory use for some extensions that was hidden in prior versions. Put all of that together with my only being able to test with a small number of accounts at a time (fewer than 5, and often only a couple) and it's not too surprising that there were a few issues when we did some testing with a few hundred accounts. The biggest one was that it used around 128MB of memory. There were also a few issues with the ordering of the APIs, which let earlier APIs keep later ones from having a chance to get their data. I'm not going to get into the ordering problems here, but I do plan on covering them in a future blog.

One of the great things about SimpleXML, as well as DOM and XSL, is that you have XPath to help you cut a small part out of well-designed XML like a skilled surgeon with a scalpel. But because of some poor design in many of the APIs, IMHO, trying to use it is more like trying to hack a piece of meat off a charging wild animal with a dull broadsword without getting trampled. Often in Yapeal, trying to use it ends up doing little more than cutting off the head and maybe cutting the rest into quarters that still have the skin on them. As I said, XPath lets you get at small parts, but in SimpleXML and DOM it does this by making a copy, or at least you end up having to make one yourself to use the result, and when that is added to the XML design of the larger APIs it becomes a problem. Just to be clear, I really like XPath and SimpleXML, and most of the larger APIs actually don't have as many of the bad design issues as some of the other APIs, but it's very hard to work with them without using a lot of memory.
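As a rough illustration of that copying pattern, here is a small sketch using SimpleXML's `xpath()`. The sample XML and its names (`row`, `id`, `qty`) are made up for this example and are not taken from a real API or from Yapeal's code.

```php
<?php
// Hypothetical sample document; names are assumptions for illustration.
$xml = '<result><rowset name="items">'
     . '<row id="1" qty="10"/><row id="2" qty="20"/><row id="3" qty="30"/>'
     . '</rowset></result>';

// The whole document has to be loaded before XPath can be used.
$doc = simplexml_load_string($xml);

// xpath() hands back an array of SimpleXMLElement objects.
$rows = $doc->xpath('//row[@qty > 15]');

// To use the result as plain data, the values get copied out again
// into a PHP array.
$picked = array();
foreach ($rows as $row) {
    $picked[] = array(
        'id'  => (string) $row['id'],
        'qty' => (int) $row['qty'],
    );
}

print count($picked) . ' rows picked' . PHP_EOL;
```

At the end of this the data exists as the original string, as the parsed document, and as the plain array, which is the kind of duplication that adds up on a large API.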

So given the above, the question becomes: would one of the other extensions work better in Yapeal? DOM would have the same problems as now but be harder to work with. XSL would work if it used a SAX-type backend, but the one in PHP doesn't, so it's not going to be helpful either. SAX could improve the memory issues, but it's very hard to use, I believe it would make Yapeal unmaintainable, and I don't see converting over to it. That leaves just two extensions, and only one of them is made for reading XML rather than writing it.

Let's look at XMLReader some more. It uses much less memory than SimpleXML, DOM, or XSL because it only deals with the XML in small pieces, and it's easier to use than SAX. So far it sounds like it could be a better fit for Yapeal, but "the truth is in the code", so to speak. To keep this blog from getting any longer than needed, I'll just say that after working with XMLReader for almost two weeks now, it does seem to be a better fit for Yapeal. So far, trying it together with some other changes I've made, I've seen memory use drop to less than half of what it was before.
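For anyone who hasn't used XMLReader, here is a minimal sketch of the row-at-a-time style it allows. This is not Yapeal's actual code: the helper name `eachRow()`, the sample XML, and the `row` element name are all assumptions made for the example.

```php
<?php
// Hypothetical helper, not part of Yapeal: calls $callback once per
// <row> element with that row's attributes as an array, and returns
// how many rows were seen.
function eachRow($xml, $callback)
{
    $reader = new XMLReader();
    $reader->XML($xml);
    $count = 0;
    while ($reader->read()) {
        if ($reader->nodeType !== XMLReader::ELEMENT || $reader->name !== 'row') {
            continue;
        }
        // Collect the attributes of the current row only.
        $attrs = array();
        while ($reader->moveToNextAttribute()) {
            $attrs[$reader->name] = $reader->value;
        }
        // Step back from the last attribute to the element itself.
        $reader->moveToElement();
        call_user_func($callback, $attrs);
        ++$count;
    }
    $reader->close();
    return $count;
}

// Made-up sample data for illustration.
$xml = '<result><rowset name="items">'
     . '<row id="1" qty="10"/><row id="2" qty="20"/><row id="3" qty="30"/>'
     . '</rowset></result>';

$seen = array();
$rowCount = eachRow($xml, function ($attrs) use (&$seen) {
    // A real callback might write the row to the database here.
    $seen[] = $attrs;
});

print $rowCount . ' rows processed' . PHP_EOL;
```

The sketch reads from a string to stay self-contained. For a large response, `XMLReader::open()` can read from a file or URI instead, so the full XML doesn't have to sit in a PHP string first.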

The only other thing that may be a problem, but shouldn't be, is whether most hosting sites include XMLReader. They should, because it has been a standard extension that is included and enabled by default since PHP 5.1.0, but as I found out after releasing Yapeal, some hosting sites don't even have SPL (Standard PHP Library) available. So I'd like to hear from some people to get an idea of how common it is, or what they needed to do to get their host to make it available. Anyone who wants to check for it can try this:
php -r 'echo extension_loaded("xmlreader") ? "YES!" : "NO", PHP_EOL;'
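If you can't get to a command line on your host, the same check can be done from a script. This is just a sketch: the list of extensions is my assumption of what you would want to look for based on this post, not an official Yapeal requirements list.

```php
<?php
// Extensions to look for; this list is an assumption for illustration.
$wanted = array('xmlreader', 'spl');

$missing = array();
foreach ($wanted as $ext) {
    if (!extension_loaded($ext)) {
        $missing[] = $ext;
    }
}

if (count($missing) === 0) {
    print 'All extensions found.' . PHP_EOL;
} else {
    print 'Missing: ' . implode(', ', $missing) . PHP_EOL;
}
```

Upload it as a small PHP file and open it in a browser, then delete it again when you're done.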

Other than needing a different extension, this change shouldn't be visible outside of Yapeal itself in any way. Some of the other changes, made to work around the issues with the order in which Yapeal does the APIs, may be more visible, as they may require some database changes in the util* tables that may have to be done manually. If they do, I'll write up the instructions with the SQL and post them.

That's it for now. See you down the blog.
