ArcGIS Server Performance and Scalability

Posted by Dave Bouwman | Posted in Uncategorized | Posted on 10-09-2005

One reader asked for my thoughts on ArcGIS Server Performance and Scalability. After writing up a long email, I thought I’d blog it instead.

In my opinion, performance and scalability really come down to how you architect the solution and control the user experience. That said, even with the best design, there are some things that are just not going to scale well.

Architecture

Perhaps the best way to sum this up is that without a good architecture, your site will be slow. Period. OK, super simple sites working with really small datasets (i.e. the samples) will be fine. To repeat the mantra: avoid looping over fine-grained ArcObjects in the “client” application. This will kill performance, because every fine-grained call made from the client application is a remote call to the server. Chapter 4 of the Developer Guide has all kinds of info on this. Any tight loops over ArcObjects must be done on the SOC (Server Object Container), using COM objects to encapsulate the fine-grained ArcObjects calls. If you create COM classes which actually run in-process on the SOC systems (see previous posts here, and here on debugging these classes), you’ll see a significant performance improvement.
Next up, use pooled server objects. This saves a LOT of time when first hitting the site, because a non-pooled server object must be created for each session (user), which takes about the same amount of time as opening ArcMap. As far as I can see, the only reason you would want to use a non-pooled server object is for an editing application which requires the ability to undo an edit. Contrary to what may be popular belief, you can edit data using a pooled server object. It’s just that the edit operation becomes atomic within one client-server round trip. At this year’s ESRI User Conference, Jithan Singh showed a demo of a huge application for New Zealand Post that allows a user to create a postal address location by digitizing on-screen, using pooled server objects.
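To make the pooling argument concrete, here is a minimal Python sketch of the general object-pool idea. It is not ArcGIS Server code: `ExpensiveServerObject`, `ServerObjectPool`, and the creation counter are hypothetical stand-ins, and the counter simply takes the place of the real startup cost.

```python
import queue
from contextlib import contextmanager


class ExpensiveServerObject:
    """Hypothetical stand-in for a server object that is slow to create."""

    created = 0  # counts how many instances have been constructed

    def __init__(self):
        ExpensiveServerObject.created += 1

    def handle(self, request):
        # Stateless handling: nothing is kept between calls, which is
        # what makes it safe to share one instance between users.
        return f"handled {request}"


class ServerObjectPool:
    """Creates a fixed number of instances up front and lends them out."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(ExpensiveServerObject())

    @contextmanager
    def borrow(self):
        obj = self._idle.get()   # blocks if every instance is busy
        try:
            yield obj
        finally:
            self._idle.put(obj)  # always hand the instance back


def serve_non_pooled(sessions):
    # One new instance per session: the creation cost is paid every time.
    return [ExpensiveServerObject().handle(s) for s in sessions]


def serve_pooled(pool, sessions):
    # Each session borrows an existing instance for a single round trip.
    results = []
    for s in sessions:
        with pool.borrow() as obj:
            results.append(obj.handle(s))
    return results
```

In this sketch a pool of two instances can serve any number of sessions without constructing anything new, while the non-pooled path constructs one object per session. The trade-off is the one described above: a borrowed instance cannot carry state from one round trip to the next.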
Another thing to mention in this area is that all those nifty design patterns you may have read about, and lots of saucy object-oriented goodness, need to be secondary to the architectural requirements of ArcGIS Server itself. While it may make great OOP sense to have a “Report” object that actually does the interaction with ArcGIS Server, for performance reasons you may have to chop it in half and have a ReportSOC COM class which runs in the SOC. This is likely one of the biggest hurdles for experienced developers who jump into ArcGIS Server.
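The difference between a chatty client-side loop and a single coarse-grained call can be sketched in a few lines of Python. This is only an illustration of the pattern, not ArcObjects code: `FakeServer` and its methods are hypothetical, and a counter stands in for the cost of each remote call.

```python
class FakeServer:
    """Hypothetical server: every public method call counts as one round trip."""

    def __init__(self, areas):
        self._areas = list(areas)
        self.round_trips = 0

    # Fine-grained calls: each one returns a single small piece of data.
    def feature_count(self):
        self.round_trips += 1
        return len(self._areas)

    def feature_area(self, index):
        self.round_trips += 1
        return self._areas[index]

    # Coarse-grained call: the loop runs next to the data.
    def total_area(self):
        self.round_trips += 1
        return sum(self._areas)


def total_area_chatty(server):
    # Client-side loop: one round trip per feature, plus one for the count.
    total = 0.0
    for i in range(server.feature_count()):
        total += server.feature_area(i)
    return total


def total_area_coarse(server):
    # Server-side loop: a single round trip regardless of feature count.
    return server.total_area()
```

With three features, the chatty version makes four round trips and the coarse version makes one. The gap grows linearly with the number of features, which is why the loop belongs in a class like ReportSOC that runs in the SOC.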

Controlling the User (experience)

By this, I mean that you need to keep the user from doing something that will effectively crash the server. For example, the project that I have been working on is for a state agency that wanted to create land cover statistics reports on the fly. To generate these reports, a portion of a raster land cover map is extracted, its attribute table is read, and stats are computed and dumped into a report.

If the user selected the entire state as their area of interest (AOI), the application would go ahead and extract the entire state, then extract the entire raster table, and calculate statistics on it. Meanwhile, the server would be pinned for about 30 minutes or more. Clearly this is not good for a multi-user system. The solution is to enforce limits on the user. My application has a configurable maximum AOI size. For anything above that size, it simply informs the user that they need to make a smaller selection. Code defensively.
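A guard like this can be very small. The following Python sketch shows one way to reject an oversized AOI before any extraction starts. The `Envelope` type, the `AoiTooLargeError` exception, and the `check_aoi` function are all hypothetical names for illustration, and using the bounding-box area as the size measure is an assumption rather than how my application necessarily does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Axis-aligned bounding box in map units (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def area(self):
        # Degenerate or inverted boxes are treated as having zero area.
        return max(0.0, self.xmax - self.xmin) * max(0.0, self.ymax - self.ymin)


class AoiTooLargeError(ValueError):
    """Raised when the requested area of interest exceeds the configured limit."""


def check_aoi(aoi, max_area):
    """Reject oversized requests before any expensive work starts."""
    if aoi.area > max_area:
        raise AoiTooLargeError(
            f"Selected area ({aoi.area:,.0f}) exceeds the maximum "
            f"({max_area:,.0f}); please make a smaller selection."
        )
    return aoi
```

The check runs first, so the expensive extraction is never started for a request that would pin the server. Keeping `max_area` as a parameter (read from configuration in a real application) lets the limit be tuned without touching the code.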

Things that (Likely) Won’t Scale

Writing something like this is definitely sticking my neck out, so treat these as guidelines based on my experience.

You want to stay away from high-intensity analysis. If it takes 10 minutes to run in ArcMap on your workstation, then all other things being equal (same datasets, same data sources, etc.), putting it in ArcGIS Server is not going to make it any faster. However, there may be ways to reduce the processing time. Working on the assumption that similar types of analysis will be run repeatedly, you should look for ways to pre-compute as much data as possible.

For example, one report I had to build was very simple: report back the number of intersections between roads and streams in the area of interest. This is another one that sounds simple until you try to do it without writing out a dataset on the fly. Essentially, you select all the roads and all the streams in the AOI, then do a double loop over both sets of features and intersect each pair to get the points. Then you need to simplify the point set to remove duplicates, and then get the count. Even running this on the SOC (roughly equivalent to running it on your workstation), it took about 5 minutes to complete and pinned the server at 100% CPU usage. Not a good solution. However, if I simply intersected all the roads with all the streams ahead of time and stored the resulting point layer in ArcSDE, this report would run in a fraction of a second.
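The roads-and-streams example can be sketched in plain Python to show where the work moves. This is a simplified illustration under stated assumptions, not the ArcObjects or ArcSDE workflow: each feature is a single straight segment, the AOI is a bounding box given as `(xmin, ymin, xmax, ymax)`, parallel and collinear segments are ignored, and all function names are hypothetical.

```python
def segment_intersection(a, b):
    """Return the point where two segments cross, or None.

    Parallel and collinear segments are treated as non-intersecting
    to keep the sketch short.
    """
    (x1, y1), (x2, y2) = a
    (x3, y3), (x4, y4) = b
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        # Rounding lets near-identical points collapse when deduplicated.
        return (round(x1 + t * (x2 - x1), 9), round(y1 + t * (y2 - y1), 9))
    return None


def all_intersections(roads, streams):
    """Double loop over both feature sets; the set removes duplicate points."""
    points = set()
    for road in roads:
        for stream in streams:
            p = segment_intersection(road, stream)
            if p is not None:
                points.add(p)
    return points


def in_aoi(point, aoi):
    xmin, ymin, xmax, ymax = aoi
    return xmin <= point[0] <= xmax and ymin <= point[1] <= ymax


def count_brute_force(roads, streams, aoi):
    # Per-request work: repeats the expensive double loop on every call.
    return sum(in_aoi(p, aoi) for p in all_intersections(roads, streams))


def count_precomputed(stored_points, aoi):
    # Per-request work: only a filter over points computed ahead of time.
    return sum(in_aoi(p, aoi) for p in stored_points)
```

The idea is to call `all_intersections` once, offline, and keep the result (the stored point layer in the real system), so each report request only runs `count_precomputed`. Both paths return the same count for a given AOI; only the per-request cost differs. A real point layer would also be spatially indexed, which this sketch leaves out.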

So – when attacking an ArcGIS Server project, be sure to think outside the brute force box.