
Welcome to the High Availability Hosting Blog

The blog for HA Hosting, home of The Sheffield Data Centre


Managed Firewall / vRouter server reboot

Over the last 2 days, one of our Managed Firewall / vRouter server nodes has rebooted, forcing firewalls and virtual routers to move to other nodes in the cluster.

We’re currently looking into why the server rebooted; any relevant updates will be posted here.  All firewalls and vRouters reloaded on other nodes as expected.

UPDATE @ 16:00

The server didn’t reboot, but there may be a fault with the LACP NIC Teaming, which caused a network card to “drop off”, making the cluster management believe the server was down.  We are implementing a fix.

Virgin Media maintenance (outage?) 20/09/2016

In the early hours of Tuesday morning (20th September), Virgin Media stopped routing traffic over their network. _Our_ BGP was automatically rerouted with minimal impact to customer services. Any customer who is directly connected to Virgin Media may have suffered other problems, which would have been a direct result of their network and not ours.

We tested connectivity throughout the night, finding inbound traceroutes stopping at Leeds, not getting to Sheffield. Our outbound traceroutes also stopped at Leeds, indicating a problem or scheduled(??) maintenance there.
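For anyone curious how we read those traces, here is a minimal sketch of the idea: take the hops from each direction and find the last one that answered. The hop names below are hypothetical placeholders (not our real traceroute output), and `last_responding_hop` is a name made up for this example.

```python
from typing import Optional, Sequence


def last_responding_hop(hops: Sequence[Optional[str]]) -> Optional[str]:
    """Return the last hop that replied, or None if nothing replied.

    Each entry is a hop's name, or None where traceroute printed "* * *".
    """
    for hop in reversed(hops):
        if hop is not None:
            return hop
    return None


# Hypothetical hop names, for illustration only.
inbound = ["transit-edge.example.net", "carrier-leeds.example.net", None, None, None]
outbound = ["sheffield-edge.example.net", "carrier-leeds.example.net", None, None]

# If both directions go quiet after the same site, that site is the
# likely location of the fault (or the maintenance).
print(last_responding_hop(inbound))
print(last_responding_hop(outbound))
```

When the inbound and outbound traces both end at the same place, the problem most likely sits there rather than at either end of the path.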

BGP was re-established around 05:00, and we allowed routes to propagate again at the start of the working day yesterday.

HA Hosting would like to apologise again for our carriers’ issues. Whether planned or unplanned, we only found out at the same time everyone else did. Thankfully our redundant connectivity “kept the lights on” yet again.

We do have some exciting news to share regarding redundant connectivity, but that will have to wait until October *tease*.

Service Interruption – Sunday 4th 08:25

We’re being made aware of a service interruption yesterday around 08:25.  Customers who connect to us via Virgin Media (because their internet traffic is logically closer, so favours that link) seem most affected.

We are gathering more info, but traffic appears to have re-routed via Level 3 for the duration, and then re-balanced around 10 minutes later.

Updates to follow as we get them.

Update 05/09/2016 @ 09:00

It seems our BGP connection to Virgin Media did reset uncleanly, meaning traffic will have taken up to 3 minutes to fully re-route via Level 3.  This is how BGP works and is unfortunately beyond our control (it takes up to 3 minutes for the routes to “drop out” of Virgin Media’s network).  The peer came back online around 5 minutes later (~08:30).
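To put a rough number on that delay: when a session dies without a clean shutdown, the far side only gives up on it once its hold timer runs out with no KEEPALIVE or UPDATE received. The sketch below assumes a 180-second hold time (a common vendor default that would match the 3 minutes above; the timers actually negotiated on this session are not stated here) and keepalives sent every third of the hold time, as RFC 4271 suggests. `detection_window` is a name made up for this example.

```python
from typing import Tuple


def detection_window(hold_time: int) -> Tuple[int, int]:
    """Return (shortest, longest) seconds before a silent peer is declared down.

    Assumes keepalives are sent every hold_time // 3 seconds. The hold timer
    restarts on each keepalive, so the wait depends on how recently the last
    one arrived before the peer went quiet.
    """
    if hold_time < 3:
        # RFC 4271 allows 0 (timers disabled) or at least 3 seconds;
        # with timers disabled there is no detection window to compute.
        raise ValueError("hold_time must be at least 3 seconds")
    keepalive = hold_time // 3
    # Longest: the peer died right after sending a keepalive.
    # Shortest: the peer died just before its next keepalive was due.
    return hold_time - keepalive, hold_time


shortest, longest = detection_window(180)  # 180 s is an assumed value
print(f"Routes withdrawn after {shortest} to {longest} seconds")
```

Under those assumptions the stale routes linger for somewhere between 2 and 3 minutes, which is in line with what we saw.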

IPv6 traffic was unaffected, as it comes in wholly via Level 3.

Short DDoS

One of our customers has experienced a short DDoS, from around 22:52 until 00:14. We have intervened and removed that IP from the network, and all traffic is back to normal.  Apologies for any disruption to connected customers.

New HA Website Live!

At 12:00 today (06/07/2016) we made our new website live @ https://www.hahosting.com

It’s taken many months of development, both in-house and with our friends at Impelling Solutions, to not only come up with a new user experience for the website, but also to integrate it more with our backend customer management software.

You can read the full article, and browse the new website, on our new website!

New Hosting Control Panel live

As recently announced, we have ripped out our Plesk Automation based Hosting Control Panel and replaced it.  Now that we’ve completed the migrations, customer access has been restored to https://hcp.hahosting.com

So why?  What happened to Plesk Automation?

Last year we engaged with Odin to use their Plesk Automation control panel, which allows multiple hosting accounts on multiple servers to be controlled via a single login.  This was live for DNS hosting, and any new web hosting (within the last six months) has also been on there.

Odin now belongs to Ingram Micro, who are making Plesk Automation End-of-Life, and are no longer actively supporting our licence or application for their new partner program, essentially leaving us up a smelly creek without a Plesk-Paddle.  Whilst Odin (Ingram) and Parallels (Plesk) both have alternatives “soon”, we felt disheartened and have decided to cease using Odin products at this time.

Enter MSPControl…

We have been deploying MSPControl (formerly WebsitePanel) for our new Hosted Exchange 2016 service, so it was a no-brainer to extend it for DNS and Web hosting, as it shares much of the ethos of what Plesk Automation had (single login, multiple servers, multiple services).

Last week we were able to migrate 400+ DNS zones from Plesk Automation to MSPControl without interruption, and this week we’ve migrated a dozen-or-so hosting customers, allowing us to switch off Plesk Automation for good.

MSPControl allows us to deploy DNS, web hosting, email hosting, Exchange, SharePoint, MySQL and MSSQL to customers all from a single login.  Since being re-released by Virtuworks, the panel is getting constant updates, and new features are being added all the time.

We have also taken the opportunity to switch from MySQL to MariaDB – the true Open Source fork of MySQL.  In addition, MS SQL Server 2014 databases are now available.

Finally, with MSPControl, we are able to offer Hybrid Hosting, where you have a dedicated web server, but perhaps use our cloud for DNS – again all under a single control panel login.

Just so you know – if your web hosting was previously on “linuxplesk”, “picard”, “janeway” or “serverb”, it will still be there for the time being.  We are looking to slurp all these hosting accounts into MSPControl over the coming months.  DNS for all customers is on the new panel now.

“But I have a dedicated Plesk server!”

If you have a dedicated server running Plesk Panel, that remains unaffected – our relationship with Parallels stays the same, and the standalone Plesk Panel will continue to be available and supported by us.

In closing…

We’d like to thank everyone for their patience while we replaced what is essentially an entire hosting platform, especially those customers who had to raise tickets to change DNS entries.  We’re very excited about what MSPControl can do – we’re already looking at its integration with Hosted Skype for Business, Hyper-V servers, and Hosted Desktops.

An enormous amount of work has gone into this project.  Thanks for your support, and thanks for choosing HA Hosting.

Cheers,
Stuart.

Managed Firewall failover

At 16:55 today, a small number of Managed Firewalls failed over to other hosts within the cluster.  Only a handful of customers would have been affected, and the firewalls were back online within a few minutes.  The cluster worked as designed and recovered on its own.

A new replacement cluster has been built for our Managed Firewalls and vRouters, and will be installed once additional power has been run to the cloud racks – due this month.

Hosting Control Panel upgrade/replacement

We are currently progressing an urgent change to our Hosting Control Panel and associated hosting servers.  All will be revealed over the coming days; however, we are moving away from Odin Plesk Automation to an alternative panel, one which will allow us more flexibility in hosting services and let us bring Exchange 2016 into live service.

The team is currently busy migrating DNS, websites, and email from the old servers to the new servers.  Updated credentials will be sent to every user soon.  In the meantime, if you have a DNS change you need to make, email support <at> hahosting.com

Regards,
Stuart.

New DNS Servers being deployed

As part of planned infrastructure upgrades, we’re deploying two new DNS Servers to supplement our zone hosting servers (name servers).  Due to software requirements, we’re deploying these servers as Windows 2012R2 rather than CentOS.


It’s the first time in years we’ve built new Windows based DNS servers.  Exciting stuff!

Managed Firewall node failover 01/06/2016 UPDATED 17:00

Some customers may have noticed a brief interruption to service during the early hours today. This was caused by Managed Firewalls running on a Hyper-V node failing over to another node.

We’ve confirmed all services are back to normal.  Most customers wouldn’t have been affected; a small number would have seen a few minutes’ interruption whilst firewalls rebooted on alternative nodes.

We don’t know the root cause of the failover yet; we will update the blog when we know more.

UPDATE 01/06/2016 10:00

The same problem we had during the early hours has happened again this morning.  Managed Firewalls and vRouters have “jumped ship” from one cluster node to another.  This time we were able to catch some LACP NIC Team errors, so we’re currently investigating this avenue…

UPDATE 01/06/2016 10:49

Two out of four cluster nodes have been rebooted cleanly and are accepting firewall/router traffic again.  We are performing controlled failovers from the remaining two nodes right now (not service affecting).

UPDATE 01/06/2016 12:00

We have manually balanced the Firewall / vRouter cluster to ascertain if a particular firewall is the cause of the fail-overs.  At time of writing all is stable.  We will update this again at 13:00.

UPDATE 01/06/2016 14:30

No issues in the last 2 hours.  We’ll now try to rebalance the cluster back to how it should be, one firewall at a time, to see if we can reproduce the fault.

(OK we missed the 13:00 update, the team had to have an Emergency KFC)

UPDATE 01/06/2016 17:00

Still no problems following rebalancing of the cluster.  We’re unable to reproduce the circumstances which triggered the failover.  Continuing to monitor.  Dan and Stuart are on-call this evening.
