Why You Should Care About Boot Performance
This blog post is slightly off topic. It is more about the performance of operating systems in the enterprise and why you should care about the boot performance of your machines. It comes from some work I have been doing with a customer recently, analyzing their new image during a Windows 7 migration, and it details some of my findings.
All too often when visiting customers, I hear users and even IT staff complaining about logon times to the desktop environment, but nothing ever seems to be done. The simple fact is that you should care about the performance hit you are taking on your machines. I recently ran some tests using the Windows Performance Toolkit and the excellent tool XPerf. I ran the tests using the same test data on multiple machines, 20 times on each (one desktop and one laptop), then averaged out the data and started to produce some analysis.
For these tests I will show you the timings for the boot phases. For those of you not familiar with them, these are the boot phases measured in XPerf:
- Pre Session Init
- Session Init
- Winlogon Init
- Explorer Init
- Post Boot
For the desktop testing, as I mentioned, I used two different machines of the same desktop model and ran the tests 20 times on each. Averaging the numbers gave the following results:
- Pre Session Init: 3.646 sec
- Session Init: 18.282 sec
- Winlogon Init: 77.469 sec
- Explorer Init: 54.654 sec
- Post Boot: 46.700 sec
- Total: 200.751 sec
Looking at the figures, we are experiencing a boot time of around 200 seconds, or around 3.33 minutes. This might not seem long, or it might not seem relevant, but after some optimising (which I will cover shortly) the figures come down considerably. Here are the results:
- Pre Session Init: 4.046 sec
- Session Init: 5.086 sec
- Winlogon Init: 40.602 sec
- Explorer Init: 4.223 sec
- Post Boot: 15.800 sec
- Total: 69.757 sec
So there you have it: we have reduced the boot time experienced by our users by 130.994 seconds, or around 2.18 minutes (65.25%). This is a massive saving, but it still doesn’t seem like that much, does it? Let’s put some figures behind this to make it a bit more real.
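As a quick check of the arithmetic, here is a minimal Python sketch that totals the two sets of phase timings listed above and works out the saving. The dictionary and function names are my own for illustration; only the timings come from the test results.

```python
# Averaged boot phase timings in seconds, copied from the two result lists above.
BEFORE = {
    "Pre Session Init": 3.646,
    "Session Init": 18.282,
    "Winlogon Init": 77.469,
    "Explorer Init": 54.654,
    "Post Boot": 46.700,
}
AFTER = {
    "Pre Session Init": 4.046,
    "Session Init": 5.086,
    "Winlogon Init": 40.602,
    "Explorer Init": 4.223,
    "Post Boot": 15.800,
}


def summarize(before, after):
    """Return totals, absolute saving, percentage saving and per-phase deltas."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    saved = total_before - total_after
    return {
        "total_before": total_before,
        "total_after": total_after,
        "saved_seconds": saved,
        "saved_percent": 100.0 * saved / total_before,
        # Positive delta means the phase got faster after optimising.
        "per_phase": {name: before[name] - after[name] for name in before},
    }


summary = summarize(BEFORE, AFTER)
print(f"Before: {summary['total_before']:.3f} s")
print(f"After:  {summary['total_after']:.3f} s")
print(f"Saved:  {summary['saved_seconds']:.3f} s ({summary['saved_percent']:.2f}%)")
for phase, delta in summary["per_phase"].items():
    print(f"  {phase}: {delta:+.3f} s")
```

The per-phase deltas show that most of the gain comes from Explorer Init and Winlogon Init, while Pre Session Init is marginally slower after optimising.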
In The Real World
Let’s assume that each user works 244 days a year. This is the number of working days in a year minus 14 days of leave, and it doesn’t include public/federal holidays. Let’s also assume everyone earns the minimum wage, which as of October 1st, 2013 is £6.13 per hour. It’s important to note that these are just indicative figures. It’s difficult to calculate how much staff actually cost and how much they actually work, but it gives you an idea.
First of all, for a single user, here are the savings based on time:
As you can see, over a single week there is nothing much to shout about. However, when you scale this up to a year, one person can save you 22.20 hours. That is almost one day saved per year from just a small amount of effort (it’s coming, don’t worry). If you scale that up to a full company, say 8,500 employees, then you are looking at the following:
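To make the scaling explicit, here is a minimal Python sketch of the per-user calculation. Note that the number of boots per day is not stated in this post. The value of 2.5 used below is purely an assumption, chosen because it happens to reproduce the 22.20 hours per year quoted above. The function and constant names are hypothetical.

```python
SECONDS_SAVED_PER_BOOT = 130.994   # measured saving per boot, from the results above
WORKING_DAYS_PER_YEAR = 244        # working days assumed in this post
ASSUMED_BOOTS_PER_DAY = 2.5        # ASSUMPTION: not stated in the post


def hours_saved(seconds_per_boot, boots_per_day, days, users=1):
    """Total hours saved across `users` people over `days` working days."""
    total_seconds = seconds_per_boot * boots_per_day * days * users
    return total_seconds / 3600.0


per_user_year = hours_saved(
    SECONDS_SAVED_PER_BOOT, ASSUMED_BOOTS_PER_DAY, WORKING_DAYS_PER_YEAR
)
print(f"Hours saved per user per year: {per_user_year:.2f}")  # 22.20
```

Changing `boots_per_day` or `users` lets you plug in figures that match your own environment rather than relying on my assumptions.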
Now we are talking: we can save 161.09 hours a week for the whole company. That’s a lot of time, and over a month we are looking at 644.36 hours. The next step is to use the minimum wage rate I mentioned earlier to figure out how much we are actually saving. Some employees earn more than this, so treat it as a floor, as if everyone were on the same wage. I have also built in error rates of 5% to 30% at 5% intervals and taken the average of the resulting figures. Taking something away for error makes the data a bit more believable.
As I mentioned, it’s difficult to gauge this type of information, but you can see how management, especially the budget holder and the CIO, could find it attractive and push for more maintenance and for monitoring of boot performance.
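Here is a minimal Python sketch of one way to apply the wage and error rates described above. It assumes that each error rate is applied as a straight discount on the gross saving and that the discounted figures are then averaged. That is my reading of the method rather than a definitive formula, and the names are hypothetical.

```python
HOURLY_WAGE_GBP = 6.13  # minimum wage figure quoted in this post
# Error rates from 5% to 30% at 5% intervals, as described above.
ERROR_RATES = (0.05, 0.10, 0.15, 0.20, 0.25, 0.30)


def discounted_saving(hours, hourly_wage=HOURLY_WAGE_GBP, error_rates=ERROR_RATES):
    """Average cash saving after discounting the gross figure by each error rate.

    ASSUMPTION: an error rate of r reduces the gross saving to gross * (1 - r).
    """
    gross = hours * hourly_wage
    discounted = [gross * (1.0 - rate) for rate in error_rates]
    return sum(discounted) / len(discounted)


# Example: the 22.20 hours per user per year quoted earlier.
per_user = discounted_saving(22.20)
print(f"Indicative saving per user per year: £{per_user:.2f}")
```

Because the rates are evenly spaced, averaging them is the same as applying a single 17.5% discount, so the margin is easy to tighten or loosen by changing `ERROR_RATES`.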
ReadyBoot is a boot technology that maintains a RAM cache used to service reads faster than a disk drive. ReadyBoot prefetches data into the cache before it is requested. Prefetching optimizes disk access patterns by taking data locality and the hard drive’s performance characteristics into account. Read requests from system processes, services and user applications are then serviced out of the ReadyBoot RAM cache.
This plays a massive role in how fast Windows starts. If ReadyBoot is broken, any other analysis can be incorrect or give a false picture of the state of your performance. When using XPerf and looking at a trace, we are interested in three colours:
- Blue – Write requests to the disk
- Black – Misses, meaning read requests are serviced from the disk instead of ReadyBoot
- Green – Hits, meaning the request was serviced from the cache
Here is a trace taken from a machine experiencing slow boot performance:
The image above isn’t the worst I have seen, but it could be much better. It starts off OK and gets worse on this machine. You can see the pile of black is fairly evident and the green is very high. To resolve this issue we can use the prep system feature of XBootMgr to optimise the boot process. This is done by issuing the following command:
xbootmgr.exe -trace boot -prepsystem
This will reboot your system six times, collect trace information and order files for the best performance. Once this had completed, I ran the trace again and got the following graph back.
Here you will notice much less black and much more green, which indicates a much better performing boot sequence and a much happier user. Also notice that ReadyBoot finishes at around 105 seconds in the first screenshot and at around 64 seconds in the second. Given our boot was around 69 seconds, this shows how effective the tool is.
You can also run a quick command to get a summary of the ReadyBoot information. The original information on this is posted in a TechNet Wiki article. One way to analyze the prefetcher activities is to run xperf.exe from the Windows Performance Toolkit:
xperf.exe -i <trace_file.etl> -o prefetcher.txt -a bootprefetch -summary
Of course, none of this should replace regular maintenance such as updating the BIOS and making sure your drivers are correct and up to date. I have seen plenty of cases where slow performance was improved by a simple driver update.
I hope this has given you some insight into how this can save your users some time, ultimately increase productivity and save your company some money.