Tech-Ed Israel 2010 - 3-Tier Remediation using Opalis
Hi,
As I wrote in a previous post, one of my Tech-Ed sessions this year is on Opalis.
As a quick preview, I wanted to tell you about one of the many demos I’m going to show in the session.
Consider this scenario: an application uses multiple servers – an IIS server, an application server and a SQL server. I’m sure you’ll agree that most (if not all) monitoring solutions (like SCOM, Patrol, Unicenter etc.) can perform some kind of basic recovery operation (like restarting a stopped service) when they detect an error. But what if a more complex recovery process is needed? What if the detected problem requires running tasks on the IIS server, checking their outcome, and then running tasks on the SQL server, or something like that?
This recovery scenario (which is relevant for most problems, if you ask me) is where Opalis really shines. Opalis can watch a monitoring product like Operations Manager for incoming alerts that match a specific pattern known to be associated with an application outage. Normally, an operator would manually run procedures that attempt to “fix” each tier of the application. Opalis automates this whole problem-solving process.
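To make the "alerts that match a specific pattern" idea concrete, here is a minimal Python sketch. It is not how Opalis or Operations Manager implement alert monitoring; the `Alert` fields, the `OUTAGE_PATTERN` regular expression and the `is_outage_alert` helper are all hypothetical names made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Alert:
    """Hypothetical, simplified view of a monitoring alert."""
    source: str       # application the alert was raised for
    description: str  # free-text alert description


# Assumed pattern: phrases we pretend are known to indicate an outage.
OUTAGE_PATTERN = re.compile(r"unavailable|timed out|http 503", re.IGNORECASE)


def is_outage_alert(alert: Alert, app_name: str) -> bool:
    """Return True if the alert belongs to app_name and looks like an outage."""
    return alert.source == app_name and bool(OUTAGE_PATTERN.search(alert.description))
```

Only alerts that pass a filter like this would kick off the remediation workflow; everything else is left for the monitoring product's normal handling.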
First, the alert is annotated to indicate that remediation is proceeding. This is important to people watching the alert, since it lets them know that action is being taken on the issue. Then the service is “re-tested” several times to rule out a false alarm or an intermittent issue. If the application is still down after retesting, the workflow moves on to remediation. The “Get Server Names” activity is a Map Published Data foundation activity. It looks at the name of the application and returns the web, SQL Server and application servers associated with that application. In some environments this might be replaced with a query to a CMDB. Once the servers to be targeted are identified, remediation starts in parallel on each tier.
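The re-test and the “Get Server Names” lookup can be sketched in a few lines of Python. This is only an illustration of the logic, not Opalis code: the `APP_SERVERS` table, the server names, and the `get_server_names` and `confirmed_down` helpers are assumptions invented for the example, and the probe is whatever health check you would use for your application.

```python
import time

# Hypothetical static mapping, standing in for the map lookup
# (or for a CMDB query in other environments).
APP_SERVERS = {
    "OrderApp": {"web": "web01", "app": "app01", "sql": "sql01"},
}


def get_server_names(app_name):
    """Return the tier -> server mapping for the given application."""
    return APP_SERVERS[app_name]


def confirmed_down(probe, attempts=3, delay_seconds=0.0):
    """Return True only if every probe attempt fails.

    probe is a callable that returns True when the application responds.
    A single success is treated as a false alarm or intermittent issue.
    """
    for attempt in range(attempts):
        if probe():
            return False
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return True
```

Remediation would only start when `confirmed_down(...)` returns True, and it would target the servers returned by `get_server_names(...)`.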

The Junction activity guarantees that the post-remediation re-test won’t run until all three tiers have finished the remediation process. Once remediation has finished for each tier (successfully or otherwise), the application is re-tested. If the re-test fails, a new alert isn’t created. Rather, we update the existing alert to indicate that manual intervention is required.
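For readers who think in code, the parallel branches and the Junction behave roughly like the Python sketch below. Again, this is an analogy under my own assumptions rather than anything Opalis runs: `remediate_all_tiers`, `finish`, the `remediate_tier` callback, the alert dictionary and the status strings are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def remediate_all_tiers(servers, remediate_tier):
    """Run remediate_tier(tier, host) for every tier in parallel.

    Plays the role of the Junction: it returns only after every tier has
    finished, whether it succeeded or failed.
    """
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = {
            tier: pool.submit(remediate_tier, tier, host)
            for tier, host in servers.items()
        }
        results = {}
        for tier, future in futures.items():
            try:
                results[tier] = bool(future.result())
            except Exception:
                # A failed branch must not block the others.
                results[tier] = False
    return results


def finish(alert, app_is_up):
    """Re-test the application and update the existing alert in place."""
    if app_is_up():
        alert["status"] = "Remediated automatically"
    else:
        alert["status"] = "Manual intervention required"
    return alert
```

Note that `finish` never creates a second alert. It only changes the one that triggered the workflow, which matches the behaviour described above.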
And yeah, it can even make coffee 
More information can be found on Dario Blog - Tech-Ed Israel 2010 – 3-Tier Remediation using Opalis