Monday, December 10, 2007

Blame NHibernate, Why Not?

I’m currently assisting with another project that is experiencing some performance issues.  A lot of finger-pointing and blaming has been going on, and most of the parties involved just want to jump in and blame NHibernate.

I figured that we could optimize the way NHibernate is being used to win back some performance in the application and hopefully restore its reputation at the client (and frankly, at my company).

Some info about the application:

It’s a WinForms application that communicates via Web Services with a service layer.  The service layer utilizes NHibernate over a domain model for persistence.  The domain types are serialized and used directly on the client.  (More on this later.)

There are roughly 100 tables and almost as many mapped entities.  It is using NHibernate 1.0.2.

We did some basic benchmarking of the application so that we could quantify how effective various fixes were.

Startup time: 14 seconds.  Ouch!  I turned on the hibernate.show_sql setting in the config file and watched the SQL queries fly.  They were covering the screen!  I wrote a simple log4net IAppender that counted the number of log statements that contained a query and found out that 304 queries were being executed at startup.  Obviously something was wrong here, so I copied the queries to Notepad and began to group them.
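The counting trick is easy to reproduce in any logging framework.  Here is a minimal sketch of the idea in Python's standard logging module rather than log4net, so none of it is the appender I actually wrote.  The class name, the logger name, and the "starts with a SQL keyword" rule are all assumptions made for illustration.

```python
import logging


class QueryCountingHandler(logging.Handler):
    """Counts log records whose message looks like a SQL statement."""

    # Assumption: a message is "a query" if it starts with one of these keywords.
    SQL_PREFIXES = ("SELECT", "INSERT", "UPDATE", "DELETE")

    def __init__(self):
        super().__init__()
        self.query_count = 0

    def emit(self, record):
        message = record.getMessage().lstrip().upper()
        if message.startswith(self.SQL_PREFIXES):
            self.query_count += 1


def count_queries(messages):
    """Send messages through a throwaway logger and return how many were queries."""
    logger = logging.getLogger("sql_demo")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the demo output off the root logger
    handler = QueryCountingHandler()
    logger.addHandler(handler)
    try:
        for message in messages:
            logger.debug(message)
    finally:
        logger.removeHandler(handler)
    return handler.query_count
```

Attach a handler like this before startup, read the counter afterwards, and you have a number you can compare before and after each fix.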

The domain of the application is shipping (as in ships crossing the ocean).  So there is a concept of Terminals, Ports, Port Locations, and Geographic Locations.  If we load all of the terminals, we need to load the port for each one, the port location for each one of those, and the geographic location for each one of those.  It turns out that this many-to-one chain is not that bad.  You can get most of the data you need in one query using left joins.  The problem we faced was that GeographicLocation held an IList of all of its PortLocations.  And furthermore, PortLocation held an IList of all of its Ports.

Picture trying to build up just a single terminal…

  • select terminal details and build up a Terminal
    • need a port, so build it up
      • need a port location, so build that up
        • need a geographic location, so build that up
          • not done?  Now we need all of the port locations for the geographic location
            • you guessed it, now we need all of the ports for all of the port locations you just loaded
              • now we need to get all of the terminals for all of the ports for all of the port locations…

This is starting to sound like that children’s story The House That Jack Built.
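To see how quickly this snowballs, here is a toy model in Python.  It is not NHibernate code; it just treats every entity as a node, every association as an edge, and counts each newly visited node as one load.  The entity ids and the tiny graph are made up for illustration.

```python
def eager_load(start, edges):
    """Follow every association reachable from `start`; each node visited is one load."""
    loaded, pending = set(), [start]
    while pending:
        node = pending.pop()
        if node in loaded:
            continue
        loaded.add(node)
        pending.extend(edges.get(node, ()))
    return loaded


# Many-to-one references: terminal -> port -> port location -> geographic location.
MANY_TO_ONE = {
    "terminal:1": ["port:1"],
    "terminal:2": ["port:1"],
    "terminal:3": ["port:2"],
    "port:1": ["port_location:1"],
    "port:2": ["port_location:1"],
    "port_location:1": ["geo:1"],
}

# The collections pointing back down the chain (the IList properties).
COLLECTIONS = {
    "geo:1": ["port_location:1"],
    "port_location:1": ["port:1", "port:2"],
    "port:1": ["terminal:1", "terminal:2"],
    "port:2": ["terminal:3"],
}


def with_collections(many_to_one, collections):
    """Return a new edge map containing both directions of every association."""
    merged = {node: list(targets) for node, targets in many_to_one.items()}
    for node, targets in collections.items():
        merged.setdefault(node, []).extend(targets)
    return merged
```

With only the many-to-one references, loading terminal:1 touches 4 objects.  Add the collections and the same load drags in all 7 objects in the graph, including the other two terminals nobody asked for.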

This is EXACTLY the reason that ORMs have lazy-loading.  If you haven’t used it before, here’s how it works:  You load up the terminal, but it doesn’t fire a query for the port.  It gives you a stand-in object instead.  The second you access the terminal.Port property, *boom* – a database query is executed, the port is loaded, and you are returned a port.  This allows us to select a minimal amount of data, and is generally favorable unless you know ahead of time that you’ll need the port, in which case it’s faster to just query for it all at once.  Needless to say, you need to at least think about this issue while developing.
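If the stand-in idea is new to you, here is a bare-bones sketch of it in Python.  NHibernate generates its proxies for you, so this is not how you would write it in practice; the LazyReference class, the loader callback, and the sample data are all hypothetical.

```python
class LazyReference:
    """Stand-in that runs its loader on first access and caches the result."""

    def __init__(self, loader):
        self._loader = loader
        self._loaded = False
        self._value = None

    def get(self):
        if not self._loaded:
            self._value = self._loader()  # this is where the query would fire
            self._loaded = True
        return self._value


class Terminal:
    def __init__(self, name, port_loader):
        self.name = name
        self._port = LazyReference(port_loader)

    @property
    def port(self):
        # Looks like a plain property to the caller; loads on demand.
        return self._port.get()


def make_demo_terminal(query_log):
    """Build a terminal whose port loader records each (fake) query it runs."""

    def load_port():
        query_log.append("select ... from Port where id = 1")  # pretend query
        return {"id": 1, "name": "Port 1"}  # made-up row

    return Terminal("Terminal 1", load_port)
```

Creating the terminal runs no query.  The first read of terminal.port runs exactly one, and later reads reuse the cached port.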

Lesson #1:  Understand Lazy Loading Or Be Doomed

But remember that our domain entities are being returned from a web service.  There is no way for NHibernate to open a database connection on the client.  Thus, we cannot utilize lazy loading at all.

I decided to remove the IList of port locations from the GeographicLocation class.  As it turns out, there was no code using this collection, so it was a simple change.  I removed the list, removed the relevant mapping xml, and rebuilt.

The queries were reduced to 94.  That 5-minute change improved the application load time to about 6 seconds.  This is not the end of the story on improving startup time, but this was a big win at a very low cost.

There are likely other areas like this where we can break up collections, but doing so will probably break client code.  In those cases I will have to add a service method that uses an NHibernate query to return the objects that used to live in the collection, rather than relying on object graph traversal.

Lesson #2:  Don’t go crazy with your collection associations.  Possibly favor querying over object traversal for problem areas.

As a side note to lesson 2, read Eric Evans’ Domain-Driven Design for a much better understanding of aggregates, aggregate roots, and how to partition your domain model.

Then I started looking at the logs.  These were turned off in development, so nobody ever saw this.  I noticed that there were a ton of messages like “CodeDOM failed for class …..   unknown character `”.  This was the reflection optimizer trying to compile some code on the fly to assist with getting/setting property values.  The code was listed along with the error, and you could clearly see that this happened only on types that had a generic parameter.  I searched around and found that this was a bug in 1.0.2, which didn’t understand how to read the string representation of generic types.  It was fixed in 1.2.0, so I decided to upgrade.

The upgraded version reduced the startup time to 3 seconds, and I’m pretty confident that we won’t get much better than that.  Sure we can reduce the number of queries a bit more, but we’re getting into the realm of acceptable values here.

How did we go from 6 seconds to 3 seconds?  Well, the reflection optimizer had previously been turned on and was failing… once for every entity.  Each failure threw an exception, which noticeably slowed down startup.  Additionally, the reflection optimizer wasn’t helping with getting and setting property values, so those fell back on plain reflection, which can contribute to overall runtime performance degradation.  The reflection optimizer is FAST, so let it do its job!  (See how fast in this article by Jay Chapman.)

Lesson #3: Turn on logging (even in Dev!)

Lesson #4:  Take advantage of the reflection optimizer!

Next we took a look at one of the trouble screens and discussed how to make it faster.  The screen needed a lot of different types of data.  It had details at the top of the page, most of which were dropdown selections loaded by another entity, and the grid needed data from 3 different entities, flattened out.

The issue here is that the UI needs SO many different entities that a significant portion of the entities are being loaded when only a subset of the data is actually being displayed.  This is an excellent opportunity for applying Screen-Bound DTOs.

A screen-bound DTO is a custom type that contains a flattened view of only the data I need to fulfill the needs of the screen.  I can utilize Projections to create slim entities that only contain name/value pairs for example.  I can accomplish this all with HQL.

This means fewer queries being sent to SQL Server, and less data going over the wire.
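Here is a rough sketch of the shape I have in mind, written in Python against an in-memory SQLite database instead of HQL against SQL Server.  The table layout, column names, and sample rows are invented for the example, not taken from the real schema.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class TerminalRow:
    """Flattened, screen-shaped view: only what the grid displays."""
    terminal_name: str
    port_name: str
    location_name: str


def build_demo_database():
    """Create a tiny made-up schema with a couple of rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE port_location (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE port (id INTEGER PRIMARY KEY, name TEXT,
                           port_location_id INTEGER REFERENCES port_location(id));
        CREATE TABLE terminal (id INTEGER PRIMARY KEY, name TEXT, notes TEXT,
                               port_id INTEGER REFERENCES port(id));
        INSERT INTO port_location VALUES (1, 'Location 1');
        INSERT INTO port VALUES (1, 'Port 1', 1);
        INSERT INTO terminal VALUES (1, 'Terminal 1', 'not shown on screen', 1);
        INSERT INTO terminal VALUES (2, 'Terminal 2', 'not shown on screen', 1);
    """)
    return conn


def load_terminal_rows(conn):
    """One query that selects only the columns the screen needs."""
    cursor = conn.execute("""
        SELECT t.name, p.name, pl.name
        FROM terminal t
        LEFT JOIN port p ON p.id = t.port_id
        LEFT JOIN port_location pl ON pl.id = p.port_location_id
        ORDER BY t.id
    """)
    return [TerminalRow(*row) for row in cursor]
```

The screen gets a flat list of TerminalRow objects from a single round trip, and columns it never shows (like notes here) never leave the database.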

Lesson #5:  Consider Screen-bound DTOs instead of consuming your entities on the UI

On the database side, we can probably get by with a friendlier transaction isolation level, such as ReadCommitted.  This allows more concurrent reads than the stricter levels, such as RepeatableRead or Serializable.  This is accomplished through configuration.

Lesson #6:  Consider using ReadCommitted as your isolation level

We should also configure the default_schema setting to be databasename.dbo because that will be used to fully qualify the objects in all database queries.  Without this, SQL Server may not be able to reuse cached query plans for these queries.

Lesson #7:  Always make sure you’ve set hibernate.default_schema

There are still more optimizations that we might make by analyzing opportunities for caching on both the server side and the client, tweaking the mappings, and loading large data sets on a background thread to keep the UI responsive.

I believe that we will be able to achieve very acceptable performance with a lot of analysis and a tweak here and there.  Maybe then we can restore some faith in NHibernate, as it is a truly powerful persistence framework and it would be a shame to yank it because it was implemented poorly.

An excellent resource is Billy McCafferty’s article on NHibernate Best Practices with ASP.NET.  Be sure to read it if you haven’t already (it’s full of useful advanced tips on NHibernate).

Are there any perf. considerations I’ve missed?
