What I learnt about managing 300 sites and the transition to Azure

I co-founded a company in 2013. The company ran a medical directory, and we also managed the online presence for practitioners, which in layman's terms meant we built and hosted practitioners' websites.

We managed to achieve something quite special: we hosted both the Directory application and more than 250 sites on a budget of less than USD 150 a month. We did this out of necessity, as the company was self-funded. We had secured enrollment in the BizSpark program, which gave us free Azure credits to use in production. The program gave us the following Azure benefits:

  • A 25% discount on App Service plans
  • A 15% discount on Virtual Machines
  • USD 150 of free Azure credits a month

Initial hosting and the need for a fail-over

We initially only had the Directory application, and it was hosted on a virtual machine rented from a South African hosting company called Afrihost. It was cheap; however, it was not reliable, and in a three-month period we had three outages, one lasting more than three days.

We started looking at other hosting platforms and decided to host overseas, since South African hosting was not up to scratch. The choice to move to Azure was a natural one given our BizSpark benefit.

We decided to keep our Virtual Machine with Afrihost and add a fail-over in the Azure cloud. We used Azure Virtual Machines for the fail-over and Azure Traffic Manager to route traffic between the two. It was easy and painless to set up.

The arrival of client websites

We decided to expand the business into managing the online presence of medical practitioners. This presented a problem when it came to hosting. We wanted to keep the client sites off the Directory's infrastructure to minimize risk, so we isolated the two. We only had a small number of websites, so we decided an Extra Small Azure Virtual Machine would suffice, and we could scale it up when the time came. We ran a second Azure Virtual Machine in a different region as a fail-over for this.

The transition to Azure Web Apps

We eventually decided to retire the Afrihost server, as it was hardly ever used and the Azure cloud offered the performance we wanted. We now had over 100 websites and were already on a Small Azure Virtual Machine. The website business was taking off, and as a result we had hired two web developers to assist in creating the websites. This meant we needed a more formal and secure process for creating client sites. We decided to move to App Services for the following reasons:

  • We didn't have to manage server configuration
  • Web developers could create and provision new sites from within Visual Studio
  • Publishing from within Visual Studio was easy to set up and configure
  • We could set up autoscaling based on traffic
  • App Services had a good reliability record

This meant we had to migrate our existing services to their Azure equivalents.

Directory Migration

Our Directory service was almost Azure-ready. We had to deal with our Windows Service, session management, and the database.

The Windows Service contained many functions that were fired off at certain times of the day. We split each of these functions into its own WebJob and set them up as scheduled WebJobs.
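
A scheduled WebJob is essentially just a console application deployed with the Web App, plus a schedule alongside it. A minimal sketch of what one of ours might have looked like (the job name, schedule, and work are hypothetical):

```csharp
// Deployed under App_Data/jobs/triggered/<job-name> in the Web App.
// A settings.job file next to the binary can hold the schedule, e.g.
// { "schedule": "0 0 2 * * *" }  -> run daily at 02:00 UTC.
using System;

class NightlyCleanupJob
{
    static void Main()
    {
        // The body of a former Windows Service timer callback goes here.
        Console.WriteLine("{0:u} starting nightly cleanup", DateTime.UtcNow);
        // ... do the work the Windows Service used to do ...
        Console.WriteLine("{0:u} finished", DateTime.UtcNow);
    }
}
```

Because each function became its own console project, each one could be deployed, scheduled, and monitored independently.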

Session state was being handled by the ASP.NET State Service. We moved this to SQL Server-based session management. There was a slight reduction in performance, but the Azure Redis Cache was too expensive for the value it offered us.
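
The change itself lives in web.config. A minimal sketch of the standard SQL Server session-state setting (the connection details are placeholders, and on SQL Azure the session schema also has to be provisioned with Microsoft's scripts or providers):

```xml
<!-- web.config: swap the State Service for SQL Server session state -->
<sessionState mode="SQLServer"
              allowCustomSqlDatabase="true"
              sqlConnectionString="Data Source=tcp:example.database.windows.net;Initial Catalog=SessionState;User ID=app;Password=..."
              timeout="20" />
```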

The SQL database was 99% compatible with SQL Azure, and we only had to move a few stored procedures into code.
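
Moving a stored procedure into code generally means replacing the stored-procedure call with the equivalent parameterized SQL. A hypothetical sketch of the pattern (the table and query are invented for illustration):

```csharp
using System.Data.SqlClient;

static class ListingQueries
{
    // Before: a command with CommandType.StoredProcedure calling a proc
    // that SQL Azure couldn't take as-is. After: the same query inline.
    public static int CountPractitioners(SqlConnection conn, string town)
    {
        using (var cmd = conn.CreateCommand())
        {
            cmd.CommandText =
                "SELECT COUNT(*) FROM Practitioners WHERE Town = @town";
            cmd.Parameters.AddWithValue("@town", town);
            return (int)cmd.ExecuteScalar();
        }
    }
}
```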

Website Migration

There was no easy way to migrate 100 sites from traditional hosting to Azure Web Apps; it was a painful, manual process. Some of our clients owned their own domains, and these were spread across different providers, so automating the migration would not have worked. However, doing it manually was a fruitful process. We discovered a few worrying things:

  • Some pages exceeded 3 MB in size
  • Most pages made over 100 requests
  • Some page load times exceeded 10 seconds
  • Some pages produced as many as ten 404 errors

After some investigation we decided that if this continued we would be spending a lot more on hosting than we would have liked. We discovered that the developers we had hired were very good at making things work well and look professional, but were inexperienced when it came to performance testing.

We decided to use GTmetrix. The site uses both Yahoo's YSlow and Google's PageSpeed tests to analyze the performance of a site. We discovered the developers weren't doing the following:

  • Minifying and bundling JavaScript and CSS (see the sketch after this list)
  • Ensuring images matched the size of their containers
  • Optimising images
  • Ensuring no inline styles or JavaScript
  • Moving JavaScript to the bottom of the page
  • Using CSS sprites
  • Using a Content Distribution Network for common 3rd-party resources
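
Bundling and minification were the easiest wins. In ASP.NET the runtime can do this with System.Web.Optimization; a minimal sketch, with hypothetical file paths (Web Essentials, covered below, later gave the developers a design-time equivalent):

```csharp
using System.Web.Optimization;

public static class BundleConfig
{
    // Call from Application_Start; render in the page with
    // Scripts.Render("~/bundles/site") and Styles.Render("~/content/css").
    public static void RegisterBundles(BundleCollection bundles)
    {
        // Many raw files collapse into one minified, versioned request each.
        bundles.Add(new ScriptBundle("~/bundles/site").Include(
            "~/Scripts/menu.js",
            "~/Scripts/contact.js"));
        bundles.Add(new StyleBundle("~/content/css").Include(
            "~/Content/site.css"));
        BundleTable.EnableOptimizations = true; // minify outside debug too
    }
}
```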

We implemented a new testing benchmark: a site had to exceed 80% in both the PageSpeed and YSlow benchmarks, and the page size had to be below 1 MB.

The pages were all plain HTML files, and we had no way of handling shared parts of a page. This meant that when a client wanted a simple change to a telephone number in the footer, it would take a good 30 minutes to update all 20 pages. We constantly had clients coming back complaining that a change had been applied everywhere except in one place. It also meant any page optimisation had to be done per page, rather than being a simple update in one place that cascaded. In short, it was not sustainable. We needed to make parts of the pages shared.

Master Template Migration

We decided to create a custom Visual Studio template for new websites. This would contain all the common code used to run the sites. It would also be the start of getting the developers to use Master Pages for the shared parts of each webpage (see the sketch below).
We added the following to our template:

  • Standardised web configuration
  • Caching rules
  • URL rewriting rules
  • Security settings
  • Contact form logic
  • Google Analytics template

This meant updates would be much quicker and more accurate. We could add a new feature or configuration fairly easily, and any performance optimisations could now be applied easily after the design of the website.
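
A minimal sketch of the Master Page idea, with hypothetical file names and content. Shared markup such as the footer lives once in Site.master, so the telephone-number change from earlier becomes a single edit:

```aspx
<%-- Site.master: the shared layout, edited in one place --%>
<%@ Master Language="C#" %>
<html>
<body>
    <asp:ContentPlaceHolder ID="Main" runat="server" />
    <footer>Call us: 021 555 0100</footer> <%-- one edit updates every page --%>
</body>
</html>

<%-- contact.aspx: each page supplies only its own content --%>
<%@ Page Language="C#" MasterPageFile="~/Site.master" %>
<asp:Content ContentPlaceHolderID="Main" runat="server">
    <h1>Contact us</h1>
</asp:Content>
```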

Web Essentials

The trick to getting people to stick to standards is to make it quick and easy for them to adhere to them. We discovered a very useful Visual Studio extension called Web Essentials. It enabled us to fix the following issues at the click of a button:

  • Minifying and bundling JavaScript and CSS
  • Optimising images
  • Using CSS sprites

We had a few teething issues but the results were worth it.

The Results

Once we got past the resistance to the change, the results were astounding. With little more than 15 minutes of post-design work, the developers were able to easily achieve the 80% score on both benchmarks. They started to improve on their own and take ownership of their work. It didn't take long for the benchmark to become a competition. There was the odd clown who decided he was going to try and achieve the lowest score.

Content Distribution Network

We still had not tackled the Content Distribution Network (CDN) issue. We had to reduce the number of server requests we processed per site, and we had to reduce the size of those requests. The average site was still serving up 1 MB of data per page, and at an average of 50 visits a month across 300 sites (on the order of tens of gigabytes a month) the bandwidth was not going to kill us, but the HTTP request queues might.
The App Service plan was also using more and more RAM with each site we loaded onto it. We had to find a way to reduce that, as scaling up was not really an answer given our budgetary constraints. We decided we had to do two things:

  1. Ensure all third party JS and CSS libraries are served up by a CDN.
  2. Implement a CDN for all the sites.

We had a goal: other than images, we only wanted to serve up three files per page: the document, one JS file, and one CSS file. The rest had to be served up by CDNs. We had a lot of resistance to changing all the third-party libraries to CDNs. It was a manual process of replacing the references in each web page with links from CDNJS.
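
The swap itself was mechanical. A hypothetical before and after for one library (the cdnjs URL is illustrative):

```html
<!-- Before: the library is served from our own App Service -->
<script src="/Scripts/jquery-1.11.1.min.js"></script>

<!-- After: the same library served by cdnjs -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
```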

This would increase our GTmetrix score and improve the performance of the sites, as the CDNs would likely have a node closer to the client than Azure did. We increased the benchmark pass score to 90% on PageSpeed and 85% on YSlow to ensure this process was followed.

The result was a saving of about 230 KB per page, as many of the designers had been adding references to every 3rd-party library under the sun. Images still accounted for about 60-90% of total page size. We needed to get them onto a CDN.

Azure had a CDN option that seemed very easy to implement, charged per GB served at a slightly higher rate than standard traffic. We weighed the cost of scaling up to a Large instance against the increased traffic cost: it was a no-brainer. We implemented the CDN for every site. It was a bit of a long process changing all the references on each site to point to the CDN, but we went from three files to one being served up each time a page loaded. There was the initial pull of resources by the CDN, but that happened once a month (cache expiry was set to 30 days).
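
The 30-day expiry is the standard IIS static-content cache header, which App Service honours from web.config; a minimal sketch (our exact rules may have differed):

```xml
<!-- web.config: tell clients and the CDN to cache static files for 30 days -->
<system.webServer>
  <staticContent>
    <clientCache cacheControlMode="UseMaxAge"
                 cacheControlMaxAge="30.00:00:00" />
  </staticContent>
</system.webServer>
```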

We had come a long way, from over 100 resources being pulled per page to between 30 and 40, with only one served up by our server. We then decided we wanted to get rid of that one file as much as possible. We had researched CloudFlare and decided it was worth a try. The Azure CDN didn't have a node in South Africa, where our clients' clients were based; CloudFlare did, and it was free. CloudFlare also offered us a decent solution to protect against DDoS attacks, which worried us as our attack surface grew larger and larger with our client base.

Integrating with CloudFlare was very easy: it was a matter of changing the NS records on each domain, and CloudFlare took care of the rest. We again had the issue of clients who owned their domains taking a long time to change their records. It reduced our server request count by about 25%, which meant that in a typical user session viewing four pages we were only serving up three requests. Previously we would have had to serve up 400+ requests.

We had come a long way from hosting a few sites on a virtual machine. I think we managed to squeeze everything we could out of Azure App Service, and I would not recommend any other platform for this kind of hosting.

We learnt a lot along the way. The most important thing I took away from this was that you should offload as much as possible to the bigger guys. Third party libraries should not be served by you but by someone else. Make someone else pay!

Richard Pilkington

I am a .NET developer who is always seeking out new things and loves trying anything new. "There has to be a quicker way" tends to be how I approach things.

Cape Town, South Africa