April 2008 - Posts - Doron's .NET Space


InProc Cache or Session Are Bad for Your Health

I've been doing a lot of performance tuning for our application lately, and the headline above is my number one tip for you. Unless you're writing a tiny application that will never ever have to scale, relying on the default ASP.NET Cache and Session modes - in process mode, that is - is a very bad idea.

If you use in-process memory, you will be stuck without the ability to scale, that is, to move to a web garden (more than one worker process on the same server) or a web farm (more than one server serving requests for your app). Either way, the idea is the same: most applications will need more than one process to run them.

Why is that, you might ask? Well, consider that on a 32-bit machine every process can use up to 2GB of memory by default. 2GB is not a lot for ASP.NET applications, which are quite the memory hogs: all the server controls that get created on each postback or UpdatePanel Ajax callback don't come cheap, and if you have enough users, well, those OutOfMemoryExceptions are right around the corner. Even on a 64-bit machine, using more than one worker process can help you make better use of your server resources. Not to mention web-farm scenarios, where at least two worker processes are a given (one for each server).

Why would InProc bother you when you try to use more processes, you might ask? Session is the easiest example, and also the easiest one to fix. If you use InProc session, the user's data is stored in one process; when a later request is served by another process, nothing will be there, and the user will be treated as if they had never logged in. Why is it easy to fix? All you need is to change the session-state mode in your web.config to either StateServer or SQLServer. Of course, you'll soon be getting some SerializationExceptions if you stored anything in the session that is not serializable, so you'll have to fix that as well.

The bigger problem is caching. In a scenario with two or more worker processes, each process has its own Cache object. Suppose you have the following code (not best-practice code, mind you):
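A minimal sketch of that web.config change, assuming the ASP.NET state service runs on the same machine on its default port (42424). The connection string and timeout are placeholder values, not settings taken from our application:

```xml
<configuration>
  <system.web>
    <!-- mode="InProc" is the default. StateServer keeps session data in the
         separate ASP.NET state service, so every worker process can read it. -->
    <sessionState mode="StateServer"
                  stateConnectionString="tcpip=127.0.0.1:42424"
                  timeout="20" />
    <!-- Alternative: mode="SQLServer" together with a sqlConnectionString
         attribute that points at the session-state database. -->
  </system.web>
</configuration>
```

With either out-of-process mode, every type you put in the session has to be marked [Serializable].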

public List<User> GetAllUsers()
{
    // Try this process' cache first.
    List<User> users = HttpContext.Current.Cache["AllUsers"] as List<User>;
    if (users == null)
    {
        // Cache miss: load from the database and cache the result.
        users = UsersRepository.GetAllUsers();
        HttpContext.Current.Cache["AllUsers"] = users;
    }
    return users;
}

public void Add(User u)
{
    UsersRepository.Add(u);
    // Invalidate the cached list (only in the current process!).
    HttpContext.Current.Cache.Remove("AllUsers");
}

You're storing the database outcome of UsersRepository.GetAllUsers() in the cache. The process that called the method will put the objects in the cache, and the other processes will have nothing. But that's OK, right? They'll just go to the database again and put the objects in their own cache. Thing is, adding another user is also done by a single process. And it will remove the cache-key only from its own cache. The other processes will not know that the data has been updated, and requests routed to them will have data that is obsolete, that is, cached data without the newly added user. In some applications that might not be the worst thing in the world, but you should definitely take this into consideration.
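The staleness problem can be reproduced without ASP.NET at all. The sketch below is only a simulation: FakeDatabase and WorkerProcess are hypothetical stand-ins for the shared database and for two worker processes, each holding its own private cache.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the shared database.
static class FakeDatabase
{
    public static readonly List<string> Users = new List<string> { "alice", "bob" };
}

// Hypothetical stand-in for one worker process with its own private cache.
class WorkerProcess
{
    private readonly Dictionary<string, List<string>> cache =
        new Dictionary<string, List<string>>();

    public List<string> GetAllUsers()
    {
        List<string> users;
        if (!cache.TryGetValue("AllUsers", out users))
        {
            // Cache miss: copy the current database contents into this cache.
            users = new List<string>(FakeDatabase.Users);
            cache["AllUsers"] = users;
        }
        return users;
    }

    public void Add(string user)
    {
        FakeDatabase.Users.Add(user);
        // Only this process' cache entry is invalidated.
        cache.Remove("AllUsers");
    }
}

static class StaleCacheDemo
{
    public static void Main()
    {
        var a = new WorkerProcess();
        var b = new WorkerProcess();

        // Both processes warm their caches with the two initial users.
        a.GetAllUsers();
        b.GetAllUsers();

        // The "add user" request happens to be served by process a.
        a.Add("carol");

        Console.WriteLine(a.GetAllUsers().Count); // 3: reloaded from the database
        Console.WriteLine(b.GetAllUsers().Count); // 2: stale, b never saw the removal
    }
}
```

Process b keeps serving the old list until its own cache entry expires or is removed.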

ASP.NET doesn't have a built-in out-of-process solution for caching, so if you're going to use caching in your application you will have to look at third-party providers. There is a pretty good Memcached provider for .NET that we've started to use. Memcached is an open source distributed caching solution. It gives your application an out-of-process cache, and the cache processes can even run on different servers, allowing your application to scale if it needs to. You can read more about that here.

You can also look into Cacheman, which is written in managed code, but I haven't checked that out yet. There are also many commercial solutions, which tend to be rather expensive.

Whatever solution you choose, the code you write should be ignorant of it, so you should probably wrap it in your own ICacheProvider or something similar. The underlying service shouldn't matter much in the end. The only thing client code should be aware of is the fact that the objects will be stored in a different process. So all items must be serializable, and storing and retrieving items is bound to be slower than with the default ASP.NET cache, so it should be done sparingly. Once the client code is aware of that and behaves accordingly, you should be able to easily switch your cache provider to a more sophisticated one if you see the need to.

This is something you should take into account early on in the project. Assume that your application will run in more than one process, and test it for that. Session and cache are just the beginning; you should also be careful with static variables, the ASP.NET Application hash table and anything else that is stored in-process or runs once per AppDomain. Remember, practically any web application will have more than one process, so that means more than one AppDomain, more than one HttpApplication, more than one HttpContext.Current.Cache, more than one anything.
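To make the wrapping idea concrete, here is one possible shape for such an abstraction. Only the ICacheProvider name comes from the paragraph above; the members (TryGet, Set, Remove) and the InMemoryCacheProvider class are assumptions made for illustration. A Memcached-backed class would implement the same interface and serialize the values on the way in and out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction; the member names are illustrative only.
public interface ICacheProvider
{
    bool TryGet<T>(string key, out T value);
    void Set<T>(string key, T value);
    void Remove(string key);
}

// Simple in-memory implementation, good enough for tests or a single process.
public class InMemoryCacheProvider : ICacheProvider
{
    private readonly Dictionary<string, object> items = new Dictionary<string, object>();
    private readonly object sync = new object();

    public bool TryGet<T>(string key, out T value)
    {
        lock (sync)
        {
            object stored;
            if (items.TryGetValue(key, out stored) && stored is T)
            {
                value = (T)stored;
                return true;
            }
        }
        value = default(T);
        return false;
    }

    public void Set<T>(string key, T value)
    {
        lock (sync) { items[key] = value; }
    }

    public void Remove(string key)
    {
        lock (sync) { items.Remove(key); }
    }
}

public static class CacheProviderDemo
{
    public static void Main()
    {
        // Client code depends only on the interface, not on the backing store.
        ICacheProvider cache = new InMemoryCacheProvider();
        cache.Set("AllUsers", new List<string> { "alice", "bob" });

        List<string> users;
        if (cache.TryGet("AllUsers", out users))
        {
            Console.WriteLine(users.Count); // 2
        }

        cache.Remove("AllUsers");
        Console.WriteLine(cache.TryGet("AllUsers", out users)); // False
    }
}
```

Because callers never touch HttpContext.Current.Cache directly, swapping the backing store becomes a one-line change where the provider is constructed.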

I think the main problem is that everything defaults to InProc in ASP.NET, and we have a tendency to leave defaults as they are. This is one case where leaving the defaults as they are is a really bad idea.

Posted by dorony | 2 comment(s)

What Every .NET Developer Should Know To Do With Sos.dll

This week has been a debugging week. Up until now my only debugging tool had been Visual Studio, but in some cases it just isn't enough. For instance, suppose you've noticed that your web application, in production or in development, is using a suspicious amount of memory. You want to know exactly what's taking up all this memory. Visual Studio does not normally let you find out, and that is where sos.dll, a debugging extension for managed code, comes in.

You can use sos.dll with windbg or from within Visual Studio. If this is production code we're talking about, you'll probably use windbg. But learning yet another tool might be discouraging for some developers, so I believe it's better to learn it via Visual Studio first, and then move to windbg and find out that it's pretty much the same, just more powerful and comfortable.

So, the basic scenario we're talking about is finding out which types of objects are in our memory. First we need to attach the debugger to our application process (making sure we enable both managed and native debugging, otherwise sos.dll won't work) and set a breakpoint somewhere. Once the breakpoint is hit, we can start the action.

[Screenshot: the Immediate window after running !dumpheap -stat]

As you can see in the picture, the first command we ran was !dumpheap -stat. This gives you a summary of all the types currently in the managed heap and their memory consumption. If we go all the way down, we will see the most "problematic" types, that is, the ones that consume the most memory.

The next step is to pick one of these types and investigate why its instances are there. For the sake of the example, let's pick System.CodeDom.Compiler.CodeDomConfigurationHandler, which you can see in the image. Its memory consumption is insignificant, but it may seem suspicious to us, as our application doesn't use CodeDom at all. So let's find out the address of this object (we can see in the Count column that there is only one of those).

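[Screenshot: the Immediate window after running !dumpheap -type System.CodeDom.Compiler.CodeDomConfigurationHandler]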

Our second command was !dumpheap -type System.CodeDom.Compiler.CodeDomConfigurationHandler. This gives us the memory addresses of all objects of a certain type (well, of every type whose name contains the specified string). So we can see that our object resides in Heap 0 at the address 015b0f08. Now, let's find out the reason for its existence, or in other words: who points to this object?

[Screenshot: the Immediate window after running !gcroot on the object's address]

We ran the command !gcroot [address]. This tells us the exact route that leads to our object. In other words, the reason this instance of CodeDomConfigurationHandler isn't garbage collected is that it is referenced by RuntimeConfigurationFactory, which is in turn referenced all the way up to System.Web.HttpContext. Ah. So that is where our object belongs: it is required by one of the members of HttpContext. Nothing wrong with that, then.

Of course this was a bit of a contrived example, but this process in general is useful and not that difficult. Remember the steps:

  1. Attach to the process (allow native mode) and hit a breakpoint.
  2. Bring up the immediate window and hit .load sos.dll.
  3. !dumpheap -stat and look for something suspicious, usually memory consumption that seems too high.
  4. !dumpheap -type [suspicious type name] -> get addresses.
  5. !gcroot [suspicious type address].

There is a lot more you can do with sos.dll and windbg, and I would like to recommend Tess' great blog on the subject. That girl is quite the hacker and a real debugging detective, I tell you (a genuine Gashash Balash as we would call her in Israel). Her blog contains tons of walkthroughs for many different scenarios.

Happy Passover folks!

Posted by dorony | with no comments

Gimme Those VB9 XML Features

I was at Tech-Ed Israel in Eilat this week, and it was awesome. Tons of interesting lectures, food, parties and smart people to talk with.

The first lecture I attended was "A Lap Around Visual Studio 2008 IDE and VB 9.0" by Lisa Feigenbaum of Microsoft. Now, I'm not much of a Visual Basic user (in fact, I rather dislike the syntax), but I'd heard about some new XML support in VB and I wanted to hear what it was about.

Lisa started by showing some of the great Visual Studio support for Visual Basic 9.0, including IntelliSense features that seem trivial to anyone who's using C#/ReSharper ('filter as you type' is really new in VB? weirdness), and also the refactoring support for VB in Visual Studio 2008. It seems a lot better than the C# support, as the VB team seems to have teamed up with DevExpress to incorporate their Refactor! tool into the IDE. The tool is not freely provided for C# users, but again, as I use ReSharper, I didn't care that much.

But when Lisa got to the XML support for VB I was stunned. This is some really awesome stuff they have there. It allows you to write this:

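    Public Sub CreateBookXml(ByVal books As IList(Of Book), ByVal file As String)
        ' Build one <Book> element per book. The underscore continues the query
        ' onto the next line; no continuation is needed inside an XML literal.
        Dim bookElements = From book In books _
                           Select <Book author=<%= book.Author %>>
                                      <%= book.Title %>
                                  </Book>

        ' Embed the elements in a complete XML document and save it.
        Dim document = <?xml version="1.0" encoding="utf-8"?>
                       <Books>
                           <%= bookElements %>
                       </Books>
        document.Save(file)
    End Sub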

I won't go into all the details, but you can read more about these features right here. In short, this is extremely neat (and useful!) native support for XML that makes me a bit jealous of VB programmers. The same code in C# would look a lot uglier, as it would be littered with XDocument and XElement stuff. Now, I know the C# team had some vague reasons for not including this, but I really don't get it.

They're saying "What if XML won't be that useful in the future? What then, ha?". My answer is that it is a big 'what if' right there. In Israel we say "What if my grandma had wheels?" to that kind of remark. XML has been very prominent for the last decade or so, and it probably won't be going away any time soon. And when it does, well, we'll stop using these features and do something else. And if SQL becomes obsolete as well, we might stop using LINQ, but that's no reason not to let it into the language in the first place. XML is the de facto standard for representing data in textual form nowadays, and if VB can admit to that, why can't C# as well?
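For comparison, here is a rough sketch of what a C# counterpart might look like with LINQ to XML. The Book class is a hypothetical stand-in (the VB sample only shows its Author and Title properties), and unlike the VB version this method returns the document instead of saving it, so the caller would still call Save(file).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical Book type with the two properties used by the VB sample.
public class Book
{
    public string Author { get; set; }
    public string Title { get; set; }
}

public static class BookXml
{
    public static XDocument CreateBookXml(IList<Book> books)
    {
        // One <Book author="..."> element per book, with the title as its text.
        var bookElements = from book in books
                           select new XElement("Book",
                               new XAttribute("author", book.Author),
                               book.Title);

        // Wrap the elements in a <Books> root with an XML declaration.
        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("Books", bookElements));
    }

    public static void Main()
    {
        var books = new List<Book>
        {
            new Book { Author = "Example Author", Title = "Example Title" }
        };

        XDocument document = CreateBookXml(books);
        Console.WriteLine(document.Root); // prints the <Books> element and its children
    }
}
```

The structure is the same as in the VB version, but every element and attribute has to be spelled out as a constructor call.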

Posted by dorony | 1 comment(s)

Why Too Many Overloads Can Be a Bad Thing

At work we're using code generation to create our data access layer. So for the users table we have a UsersDalBase with many overloads of methods such as Add(User) and Add(IList<User>). Thing is, many times we need to override the default behavior of a method. For instance, we might want to change Add(User) to notify the user by mail, write something to a log, use caching, etc. We do that in a UsersDal class that inherits from UsersDalBase.

But hmm, that other overload that accepts a list of users is still there. No one uses it for now, but they might try to in the future. Still, it probably wouldn't have been written if it hadn't been for code-generation. So should we also update Add(IList<User>) with all the changes (mails, caching...)? The implementation is more difficult than the single entity case and we would have to add tests for that too.

I had a bitter argument with a member of my team on this very subject the other day. He claimed that we must update both methods since they're there and might be used, and that if we don't, we won't get notification and caching in that case. I claimed that it is foolish to waste time on methods that will probably never be used. Nowhere in the system do we add more than one user at a time. The purpose of the many overloads that code generation gives us is to help us by giving us many options, not to hinder us.

In the end we agreed to throw a NotSupportedException when someone tries to add more than one user at a time. This way we don't have to write unneeded code, but we also don't have to worry that someone will misuse the class in the future.
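A minimal sketch of that compromise, assuming the generated methods are virtual. The bodies of UsersDalBase and the NotificationsSent counter are placeholders for illustration, not our real generated code:

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public string Email { get; set; }
}

// Hypothetical stand-in for the generated base class.
public class UsersDalBase
{
    public virtual void Add(User user)
    {
        // The generated single-row insert would go here.
    }

    public virtual void Add(IList<User> users)
    {
        // The generated batch insert would go here.
    }
}

public class UsersDal : UsersDalBase
{
    // Stands in for the mail, logging and caching side effects.
    public int NotificationsSent { get; private set; }

    public override void Add(User user)
    {
        base.Add(user);
        NotificationsSent++;
    }

    public override void Add(IList<User> users)
    {
        // Fail loudly instead of silently skipping notifications and caching.
        throw new NotSupportedException(
            "Adding several users at once is not supported; call Add(User) for each user.");
    }
}

public static class OverloadDemo
{
    public static void Main()
    {
        var dal = new UsersDal();
        dal.Add(new User { Email = "someone@example.com" });
        Console.WriteLine(dal.NotificationsSent); // 1

        try
        {
            dal.Add(new List<User> { new User(), new User() });
        }
        catch (NotSupportedException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

A caller who later reaches for the list overload finds out immediately that it was never given the custom behavior.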

Posted by dorony | 7 comment(s)