27 April 2010

Series – Building your business on Microsoft technologies (Part 0 – Roadmap)

Many people launch new businesses, or expand small ones, to the point where IT starts to play a role.  It is at that juncture that the question of which software to use and build on becomes evident.  As happens in most companies, someone installs the software package that most closely does what is most urgently needed, and it starts to gain user traction.  This repeats over and over until, at some point, someone has to figure out how to untangle the resulting spaghetti mess.
If only someone had planned the expansion and use of software beforehand, it would have saved tons of time for whoever ends up with that project.  And that… is where this series comes into play…
I’ve built my career over the last 10 years or so on Microsoft technologies.  There’s always someone out there who’s done what you need, IF you understand what you need.  That’s what Enterprise Architects do best: understanding the business need and marrying it up with technology decisions that will help drive the business forward.  I intend this series to provide a road map for anyone who needs to build a business on technology, one that’ll allow less rework down the road.  I will cover the topics as one may encounter them, from the perspective of small (or even one man/woman) IT departments where budgets are tight (especially in the current economy) and high priced consultants aren’t always an option.  The most expensive thing that’s done in IT is rework: doing the same thing over and over again because it wasn’t done properly the first time.
My vision for this series is a guide that most IT personnel could follow to deploy technologies within their company, properly positioned to support future company growth and requiring little to no rework at any point.  So without any further delay… here is my Roadmap for this series.  Please note that I’ll be updating the Roadmap as time goes on, as I write the corresponding articles and link to them.  It may be a good idea to Bookmark/Favorite this post for future reference.
  1. Installation – Windows Server 2008 R2.  Since Windows Server 2008 R2 is the latest and greatest server operating system from Microsoft, we’ll use it as the basis for all our servers.
  2. Configuration – Creating the Primary Domain Controller – Enabling the Active Directory Domain Services Role on Windows Server 2008 R2.  Once we have our first server with an operating system installed, it’s time to create our company domain.  We’ll be using Active Directory authentication for our environment.
  3. Business Continuity – Enabling and Testing the Windows Server Backup Feature on Windows Server 2008 R2.  No progress can or should be made until we’re sure we can recover from absolute disaster.  That means our server is completely dead and we have to restore onto new metal.  Backup and Restore functionality must be tested before we do anything else.
  4. Configuration – Enabling the Hyper-V Role on Windows Server 2008 R2.  Getting ready for virtualization is a key action here.  In a small business, there is seldom money for multiple servers so we have to stretch our resources to the max by employing virtualization.  Since running absolutely everything on one single server is not only NOT recommended as a Best Practice but also detrimental to scaling with business growth, virtualization is a perfect solution.  We will be using Microsoft’s Hyper-V technology to host all our servers on the same physical box.
  5. Installation – SQL Server 2008 R2 on Hyper-V.  Since absolutely everything we’ll do requires a SQL Server database, and since SQL Server 2008 R2 is Microsoft’s latest and greatest database server product, we’ll build on it.  Initially we’re not going to cluster or scale the SQL Server, but that will be the first point of scaling once volume and traffic increase.
  6. Business Continuity – Configuring and Testing Disaster Recovery for Hyper-V Servers.  Since our SQL Server was the first Hyper-V server we built, we have to test the Backup and Restore of our Hyper-V server before proceeding.
  7. Installation – Exchange Server 2010 on Hyper-V.  Now that we have a domain and a database server, we need email.  We’ll be building on Microsoft’s latest email server for that.
  8. Business Continuity – Testing Disaster Recovery for the Exchange Server 2010 server on Hyper-V.
  9. Installation – SharePoint Server 2010 on Hyper-V.  After establishing email for the company, we need to work on the web site and collaboration between employees.  We’ll use the latest version of SharePoint for that.
  10. Business Continuity – Testing Disaster Recovery for the SharePoint Server 2010 server on Hyper-V.
  11. Etc.
And so the list will grow and continue over time.  I am going to endeavor to post a new chapter in the series every one to two weeks, so stay tuned.


Cheers
C

22 April 2010

SharePoint 2010 on MSDN, SQL Server 2008 R2 RTM

SharePoint 2010 is finally available for download on MSDN.  Come and get it! Elsewhere, SQL Server 2008 R2 also finally reached RTM so it’ll be on MSDN shortly. Good times… lots of servers to build…

Cheers
C

19 April 2010

SharePoint, IIS, w3wp.exe, threads and app pools… Sometimes, it IS the end user after all!

I’ve been helping a good client of mine troubleshoot some performance issues with their SharePoint environment.  They have a single 32 bit MOSS 2007 server, so with their 1,000+ active users (though not concurrent), it’s stretched about as thin as it will go.  Recently, the server started having issues where the app pool would get locked up and take all the users down.  Now, IIS app pools are designed to recycle when certain limits are reached, so that recycling is seamless to the end user.  The app pool was set to recycle when memory consumption under the worker process (w3wp.exe) reached 1 GB, or when virtual memory consumption for the app pool reached 1.9 GB.  We were not seeing the overlapped recycle take place automatically because the app pool would get locked up when memory reached around 940 MB.  It was not consistent, though, so it couldn’t be readily identified.  We eventually trimmed the values back to 800 MB physical and 1.5 GB virtual memory before triggering a recycle.
Once the app pool reached either of those limits in its memory consumption, IIS would spin up a new w3wp.exe worker process with a fresh app pool, and all new SharePoint requests would be directed to that process instead.  All pending requests on the current worker process/app pool would complete, or be terminated once the timeout configured in IIS was reached.  After all requests completed and released their execution threads, the worker process would terminate and release its memory back for IIS to use.
If you are seeing similar behavior in your SharePoint environment, there are a couple of things you need to pay attention to:
  1. IIS Timeout setting.
  2. Runaway/locked up threads.
  3. Time between recycles.
  4. Physical server memory.
  5. Bit architecture of the server and SharePoint.
Once your server isn’t crashing for end users any more, it’s time to tune its health more closely.  Identify what the IIS timeout setting is for your server/app.  If your server is still on IIS6, you will want to ensure that the LogEventOnRecycle property in the IIS metabase is set to true.  Next, look in the Event Log under System for message 1077, which indicates that an overlapped recycle took place for the app pool.  Make sure to note the time between these messages.  It’s best to use the smallest interval, which should correspond to your server’s peak volume time of day.  Lastly, determine how much physical memory the server has and what the bit architecture of the server, the OS and SharePoint is, i.e. are you running 32 bit or 64 bit.
Now it’s time for some math.  If you have a 32 bit server running 32 bit Windows Server and 32 bit SharePoint, this is a much more crucial issue than if you were running all 64 bit.  The issue deals with memory.  You have to figure that the server will not realistically have available to your worker processes more than half of its actual ADDRESSABLE memory.  I highlight addressable here because, remember, under 32 bit architecture your server cannot address more than about 3.2 GB of memory, even if it has 8 GB of physical memory!
Thus in our all 32 bit example, even though the server physically has 4 GB of memory, the OS can only address 3.2 GB, which means, by my math, about 1.6 GB would be available to our worker processes in IIS.  You may be tempted to use something just below that as your recycle point, but remember that we have OVERLAPPED RECYCLE going on, which means that IIS is managing two worker processes at the same time, so each requires its own memory in order to function properly.
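To put numbers to that, here’s a minimal sketch of the arithmetic (the helper name and the halving rules of thumb are mine, lifted straight from the reasoning above):

```python
# Rough rule-of-thumb math for a recycle threshold (my own sketch, not a
# Microsoft tool).  Assumes the ~3.2 GB addressable ceiling on 32 bit and
# that about half of addressable memory is realistically available to IIS.

def recycle_threshold_mb(physical_gb, is_32_bit=True):
    """Suggest a per-worker-process memory recycle limit in MB."""
    addressable_gb = min(physical_gb, 3.2) if is_32_bit else physical_gb
    available_gb = addressable_gb / 2   # half goes to the OS and everything else
    per_process_gb = available_gb / 2   # halve again: an overlapped recycle
                                        # runs TWO worker processes at once
    return int(per_process_gb * 1024)

print(recycle_threshold_mb(4))   # 819 -- right around the 800 MB we settled on
print(recycle_threshold_mb(8))   # still 819: 32 bit can't address the extra 4 GB
```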
That was the problem we ran into when the recycle threshold was set at 1 GB.  The worker process would trip the limit and IIS would attempt to spin up an overlapping worker process, but since there wasn’t enough memory available to do so, it took no time at all to completely lock up IIS and bring down end users.  Only a forced recycle of the app pool, which forcibly releases all threads and memory pages (thus also dropping users) before spinning up a new worker process, could restore the server to a working state.
By dropping the memory recycle trigger down to 800 MB instead, each worker process consumed at most half of our available memory, or 25% of the addressable memory.  When the worker process tripped the trigger, IIS would spin up a second worker process and direct traffic to it while the first finished up its requests.  Provided none of those requests had runaway threads, the first worker process would typically shut down and release its memory within a minute or two.
This gets the server into a usable state as far as the end user is concerned, because they no longer see crashes or get locked up.  On the server side, you will see the app pools recycle much more frequently, and you run the risk that a runaway thread will lock up the first worker process until the IIS timeout is reached.  That setting is 15 minutes under IIS by default, but most SharePoint shops have upped it to 30 minutes, especially where low bandwidth or VPN users are in play.  As a result, a runaway thread would keep the first worker process alive for 30 minutes.  You can see how the time between recycles now becomes super CRITICAL!  If your overlapped recycles happen more frequently than your IIS timeout interval, change something.
RECOMMENDATION:  Ensure that your IIS timeout value is always LESS than your overlapped recycle time at its shortest interval.
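If you want to sanity check that on your own server, here’s a minimal sketch (the sample timestamps and variable names are mine; it assumes you’ve already pulled the Event ID 1077 timestamps out of the System log):

```python
# Compare the shortest observed recycle interval against the IIS timeout.
from datetime import datetime

recycle_times = [                 # illustrative Event ID 1077 timestamps
    datetime(2010, 4, 14, 9, 5),
    datetime(2010, 4, 14, 9, 52),
    datetime(2010, 4, 14, 10, 18),
]
iis_timeout_minutes = 30          # the connection timeout configured in IIS

gaps = [(b - a).total_seconds() / 60
        for a, b in zip(recycle_times, recycle_times[1:])]
shortest = min(gaps)

print(f"Shortest recycle interval: {shortest:.0f} minutes")
if iis_timeout_minutes >= shortest:
    print("IIS timeout >= shortest recycle interval -- change something!")
```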
Of course the real answer is to solve the memory leak problems so that the app pools don’t have to recycle, but if you’ve ever tried to track down memory leaks, you know it’s HELL!  If you’ve never had the misfortune of having to do so, consider yourself truly blessed.
It’s also not always realistic to bring the IIS timeout value down.  If your server is recycling worker processes every 15 minutes, that’s certainly not doable.  That’s when it becomes mission critical to hunt down any runaway threads and determine their cause.  Anything that may cause the worker process to remain alive needs to be addressed in order to keep your server up and running.  At my client’s site, we were still getting runaway threads that could potentially put us in a state where a third worker process needed to be spun up, which would bring the whole thing to a screeching halt.
As an Enterprise Architect I get to see all sides of the fence.  I work with and talk to everyone involved.  When talking to developers, the feeling is usually that Ops people must have done “something” to the servers which is causing the instability.  When talking to operations personnel, the feeling is usually that Devs are writing bad code that’s causing the instability.  I’ve been in many SharePoint shops and have seen both sides of this argument be true, but not this time.
We had an awesome traffic profiling tool available for the job and that’s where we discovered two items that would cause runaway threads.
  1. SQL Server Reporting Services Integrated Mode.  If you’re a SharePoint Architect, you probably just had a cold shiver run over your entire body as you read that line.  Yes, every SharePoint shop dabbles with SSRS Integrated Mode at some point.  Most come to the conclusion that performance is a problem and usually deploy a dedicated server to run SSRS.  That was also the case here.  Unfortunately, there were a couple of Integrated Mode reports that could not be moved over to the dedicated server, so Integrated Mode was left active.  What we discovered was a series of reports developed and built by end users (as SSRS empowers them to do).  Of course end users are not going to know how to write optimized queries, so these reports performed poorly.  Some reports would take upward of 30 seconds to load, and that was while local to the servers on a gigabit Ethernet connection.  The reports have very large amounts of tabular data, and we know how well IE renders tables.  Imagine being a user on a remote VPN connection: your wait time on the report could easily go over 2 minutes.  The problem with that is the thread requesting data is locked up while all this data is transferred and interpreted for rendering in the browser.  Additionally, a user could easily lose patience and simply close their browser, fearing that it may be “locked up”.  When a user does that, the thread still remains alive in the background until the download is complete, and the loss of the end point on the client side could very well cause the thread to become a runaway thread that never releases its resources.  No matter how you slice or dice it, it’s bad.
  2. Image Rotating Banner.  We have a nifty little web part that adds pizzazz to user created pages by rotating through images chosen by the designer/user.  Now, as I said, any time we empower end users to do design of content delivery, a LOT of thought has to go into it.  In this case, the web part was designed for ease of use: all the designer/user had to do was drop it on the page, set the Title and point it to an Image Library on the site.  Then, when the page loaded, the web part would start rotating images using JavaScript.  Nice.  But wait, there’s more.  Using our awesome traffic profiling tool, we discovered pages, like main departmental home pages, that were loading literally dozens of images.  Taking a closer look at those pages, they appeared to load rather slowly.  If you’ve ever dissected the loading sequence of a SharePoint page, you’d know that, even if you set pages to display partially while downloading content, the JavaScript is usually the last part to be downloaded.  As a SharePoint developer would understand, a SharePoint page isn’t really functional until that JavaScript has loaded; none of the dropdowns work, etc.  But I digress.  Needless to say, until the page is completely loaded, you can’t really do too much.  What we saw was all these images loading with the page before the script would load, making page load times very slow.  To make matters worse, we looked at some of the pictures being loaded, and most were not resized to the 100 pixel banner size they displayed at.  On the contrary, the images were in their original 9 megapixel JPG format!  Cracking open the code for the web part, we discovered that it did exactly what I just described: it showed ALL of the pictures in the picture library, regardless of SIZE.  Though that design is OK where experienced developers or web designers would be using the web part, it unfortunately does not work well for end users or inexperienced designers, because it’s not realistic to expect an end user to think about the number of pictures being displayed, all preloaded on the page, or the size of those images.  Considering some images were up to 5 MB in size and libraries easily contained in excess of 20 images, you can see how a 1 MB page, now having to preload all these images, suddenly became a 100 MB+ page.  That’s never good for performance.  Now granted, the web part should probably use AJAX to load its images and not preload them on the page, but this was the design that was available.  We implemented a hot fix to the code (sketched below) whereby we simply leveraged SharePoint’s built in thumbnails for image libraries, since it’s just a banner anyway.  In addition, we display only a random 10 images from the library each time.  That meant no more than 100 KB in extra page size.  Again, you can see how a user could easily give up and close the browser, leaving a thread locked as it processes.
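Here’s a rough illustration of that hot fix.  I’m sketching it in Python for brevity (the real web part is server side SharePoint code, so treat this as pseudocode); the library contents and helper name are mine, but the _t/ thumbnail naming convention is SharePoint’s own, described in the 12 April post below:

```python
# Pick up to 10 random images from a library and point at SharePoint's
# auto-generated thumbnails instead of the full-size originals.
import random

def banner_images(image_urls, count=10):
    """Return thumbnail URLs for a random subset of an image library."""
    chosen = random.sample(image_urls, min(count, len(image_urls)))
    thumbs = []
    for url in chosen:
        folder, _, name = url.rpartition("/")
        # SharePoint keeps thumbnails in a _t/ subfolder, with periods in
        # the file name replaced by underscores and .jpg appended.
        thumbs.append(f"{folder}/_t/{name.replace('.', '_')}.jpg")
    return thumbs

print(banner_images(["http://intranet/hr/Photos/team.jpg"]))
# ['http://intranet/hr/Photos/_t/team_jpg.jpg']
```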
As we’ve seen in this case, as developers and architects, we always have to be conscious of our end users.  Tools we provide them in order to empower them can often come back to haunt us at the most inopportune times.


Cheers
C

SharePoint sheds the “Office” label with Microsoft SharePoint 2010

Tom Rizzo announced yesterday on the SharePoint Team Blog that SharePoint 14 will be known as Microsoft SharePoint 2010.  I’m happy to see SharePoint shake the “Office” label and stand on its own!  I predicted this eventuality three years ago in May 2006 while at the SharePoint Conference.  The beta of Exchange 2010 is also now available.  Can’t wait for the beta bits of MSP 2010!


Cheers
C

18 April 2010

SharePoint, Office, Visio, Project 2010 RTM – A BIG week for Microsoft

Well, it’s official.  SharePoint 2010, Office 2010, Visio 2010 and Project 2010 all RTMed on Friday.  They join the ranks of Visual Studio 2010 and .NET 4.0, which RTMed on Monday, and Silverlight 4.0, which RTMed on Tuesday.  So, all in all, a HUGE week for Redmond!
Get those MSDN downloads cranking!


Cheers
C

16 April 2010

Things that annoy

Have you ever wondered why:
  • The windows on planes are set 6 inches below a useful height for anyone amongst us, save the shortest?  It’s almost like they don’t want us to look outside.  As a pilot, a glass canopy and great visibility are always raved-about features.  So why rob passengers of the experience?  Instead, you end up having to bend over to peek outside.  Hmm… maybe that’s it…
  • Families on vacation completely ignore the “Expert Traveler” sign and clog up the fast lane at airport security?
  • People park their shopping cart broadside in the aisle, completely blocking traffic, and then wander off to look for something?  Of course, you get the stink eye if you dare move the cart out of the way.  Like I wanted your groceries anyway…
  • Security at the local city/county building won’t allow a garage clicker in, but cell phones breeze right through?  Supposedly it’s a one button detonator… with about a 200 foot range…
  • Marriage licenses are only valid for a short time, like 60 days?  Anyone who’s ever planned a wedding will tell you it takes a YEAR to do.  Why not dispense with the paperwork early and be done with it?  Oh right, that would be efficient…
I should probably put these in a running list…


Cheers
C

12 April 2010

SharePoint built in automatic thumbnails for an image library

SharePoint maintains built in thumbnails that are automatically generated when images are uploaded to an image library.  These exist to enable the thumbnail view of the image library.  Here’s an example from my blog.
Below, the first link is the original image as it exists in the image library; click it to see the image at its full size.  The second link is the SharePoint thumbnail; click it to see the size of that image.

The formula is pretty simple…  simply change the URL as follows:
Original:  http://www.cjvandyk.com/blog/Lists/Photos/042308_1817_VMWareSnaps3.png
Thumb:   http://www.cjvandyk.com/blog/Lists/Photos/_t/042308_1817_VMWareSnaps3_png.jpg
As you can see, simply:
  • Insert “_t/” in front of the file name,
  • Replace any periods (.) with underscores (_) and finally,
  • Append a “.jpg” at the end of the file name.
As you can see, my example was actually a .png file, but the thumbnail is still a .jpg file.
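Here’s a minimal sketch of that rewrite in code (the function name is mine):

```python
def thumbnail_url(image_url):
    """Map an image library URL to its auto-generated SharePoint thumbnail."""
    folder, _, name = image_url.rpartition("/")
    return f"{folder}/_t/{name.replace('.', '_')}.jpg"

print(thumbnail_url(
    "http://www.cjvandyk.com/blog/Lists/Photos/042308_1817_VMWareSnaps3.png"))
# http://www.cjvandyk.com/blog/Lists/Photos/_t/042308_1817_VMWareSnaps3_png.jpg
```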


Cheers
C

08 April 2010

Want some cores with that?

OK, so I’ve been starting to look at new hardware… new iron to run my life on.  I recently acquired a new Intel i7 Quad Core laptop for my mobile VM needs and am currently writing this article on my Asus Eee 1005HAB Netbook in flight to Boston, MA.  Unfortunately, my servers at home, including the one hosting this article, are aging fast and will be in need of upgrade within the next 12 months.
I have therefore started looking at hardware, specifically server type hardware.  I’ve only just begun this process, but I ran across something so sweet, I just had to share.  Now, I know it’s not in production yet, and I know even if it were, I would certainly NOT be able to afford it, but that doesn’t mean I can’t put it on my Dream Hardware Wishlist!  Just imagine what you could do with a 100-core processor!  That’s right!  100 cores, baby!
A new startup called Tilera has unveiled a series of 4 processors, of which the Tile-Gx100 is the flagship.  There’s also the Tile-Gx64, the Tile-Gx36 (no, that’s not a typo; I can’t quite figure out why 36 and not 32 cores either) and the Tile-Gx16.  Hey, maybe I’d be able to afford the baby of the family.
The CPUs are built on 40nm technology and top out at 1.5 GHz, which may be a little low unless your OS and apps were specifically designed and written to take advantage of multiple cores, which is mostly not the case today.
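For a back-of-the-napkin feel for why that matters, Amdahl’s law (my illustration, nothing from Tilera’s spec sheets) shows how quickly serial code eats up the benefit of 100 cores:

```python
# Amdahl's law: speedup is capped by the fraction of code that can't run
# in parallel, no matter how many cores you throw at it.
def speedup(parallel_fraction, cores):
    return 1 / ((1 - parallel_fraction) + parallel_fraction / cores)

for p in (0.50, 0.90, 0.99):
    print(f"{p:.0%} parallel on 100 cores: {speedup(p, 100):.1f}x")
# 50% parallel -> 2.0x, 90% -> 9.2x, 99% -> 50.3x
```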
Nevertheless, it’s nice to dream.  And don’t worry, with mainstream octa-cores on our doorstep, mainstream centu-cores should only be about 6 years away from reality, according to Moore’s Law.


Cheers
C

07 April 2010

Inline hardware disk encryption

If you’re like me, you’ve probably not given data encryption on your home PC a second thought.  Sure, most employers use some form of data encryption for our corporate laptops, but at home it’s a totally different story.  As I’m in the process of evaluating my server hardware at home, the topic of encryption came back up.
The problem with encryption is that it’s a pain to implement and use. If you’re doing file level encryption, you have to remember to encrypt your files or you have to remember to save your files in an encrypted folder. That sounds too much like work, so most of us just won’t even bother.
The other alternative is whole disk encryption.  The down side is that it adds a software abstraction layer between the hardware and the operating system, which takes CPU cycles to process, thus taking away from your system horsepower… i.e. it slows the computer down.  If it’s implemented in conjunction with a hardware upgrade, you may not notice it and it might be OK.  Mostly, though, it’s not.  Nobody wants to give up CPU cycles.
The only true solution is actual hardware based encryption.  Something that can encrypt the data on the fly as it’s being written to the disk, but without taking any of your CPU cycles to do it.  It must read, write, cache and encrypt completely self sufficiently.
Enter Addonics with their new Dual CipherChain (CCM35MK2).  This little beauty lives in one of your 5.25” drive bays and configuration is dead simple.  Connect your SATA drives (it supports two) to the card.  Connect the output port of the card to the motherboard.  Insert the encryption keys and you’re good to go!  The device provides real time 256-bit AES encryption and, at just over $150, it’s a small price to pay for the safety of your data.
I’ll report back in the future on my experience with this device.


Cheers
C

01 April 2010

Moral responsibility – Doing the –RIGHT– thing…

A friend of mine went to see her family doctor.  He has been her family doctor for decades, ever since she was a child.  This makes for wonderful continuity in patient history and allows the doctor to get a more holistic view of the patient and the patient’s family.  She went to see him because she was having major lower back muscle spasms and pain.  This had been going on for about a week, and his response was to prescribe some muscle relaxers for her… Flexeril, to be exact.  Having a pharmacist for a father, she’s always been keenly aware of drug reactions and side effects, so she specifically asked the doctor if there were any side effects to the drug he prescribed.  His reply was “No, not really.”
She filled the prescription and called her dad to double check on possible side effects of the drug.  He said that it could cause extreme drowsiness, so it should only be taken at night.  She followed her dad’s advice.  What transpired over the course of the next two days was nothing short of mind blowing.  The drowsiness was certainly there, as her dad had warned (yet the doctor failed to note), but there was also a change in mood that was just unnatural.  Thankfully she recognized the sudden, irrational mood swings, as well as the acute anger and sensitivity that accompanied them.  She began to do some research on the topic, and within 10 minutes on Google found information that could only be described as disturbing.
She immediately stopped taking the drug and disposed of the remaining pills, vowing never to take it again.  That’s all well and good, but what if she had NOT been so acutely aware of side effects?  I don’t want to play “what if” games, because they don’t lead anywhere good.
In my profession, I don’t have any legal or ethical responsibility to keep up to date with changes in the marketplace, yet I do it diligently because I see it as my moral responsibility to provide the best possible service to my clients and to have as much information available and processed as possible before making one recommendation over another.  I know we live in an “it’s not my fault” society and yes, I despise that mode of thinking, but with a medical doctor, especially one you’ve trusted your well being to for so many years, there’s a very deep level of trust associated.
Shouldn’t this doctor KNOW what the potential side effects of the drugs he prescribes are?
Shouldn’t this doctor ASK QUESTIONS of the pharmaceutical rep who pawns this crap off for him to prescribe?
Shouldn’t this doctor at least DO SOME RESEARCH on the kinds of drugs he prescribes?
To me the answers are, and always will be, YES, YES and YES!!!  But alas, that wasn’t the case here.  It seems to me this doctor has been in the business too long and has become complacent.  Maybe it’s easier to improve your swing and make your tee time than it is to spend some time researching facts that may affect the lives of those who entrust their care to your hands.


Cheers
C
