Faciotech’s managed uptime monitoring service watches your websites and hosting environment so that availability and security problems are detected and handled quickly. This article explains our incident response and notification process, from initial detection through resolution and reporting. Because this is a fully managed service, you do not need to perform any technical actions; our team handles the response and keeps you informed through multiple channels.

Continuous Detection
Our monitoring service runs automated checks on your hosted websites and services every five minutes. These checks verify that web pages load correctly, critical ports respond as expected and SSL certificates remain valid. If an anomaly is detected (such as repeated connection failures, missing keywords on a login page or an expired certificate), the system immediately flags the event.
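
As a rough illustration of the checks described above, the following Python sketch evaluates the outcome of a single probe. The `ProbeResult` structure, the `evaluate_probe` function and the anomaly labels are hypothetical names chosen for this example; they do not describe the actual monitoring platform.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ProbeResult:
    """Outcome of one monitoring probe (hypothetical structure)."""
    status_code: Optional[int]   # None means the connection failed
    body: str                    # page content returned, if any
    cert_expires_at: datetime    # expiry time of the SSL certificate


def evaluate_probe(result: ProbeResult, keyword: str, now: datetime) -> List[str]:
    """Return anomaly labels for one probe; an empty list means it passed."""
    anomalies = []
    if result.status_code is None:
        anomalies.append("connection_failure")
    elif result.status_code >= 400:
        anomalies.append("http_error")
    elif keyword not in result.body:
        anomalies.append("missing_keyword")
    if result.cert_expires_at <= now:
        anomalies.append("certificate_expired")
    return anomalies


# Example: the page loads, but the expected login keyword is absent.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
probe = ProbeResult(200, "Service unavailable", datetime(2024, 6, 1, tzinfo=timezone.utc))
print(evaluate_probe(probe, "Log in", now))  # prints ['missing_keyword']
```

Returning a list rather than a single flag lets one probe report several problems at once, for example a missing keyword together with an expired certificate.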

Initial Alert and Acknowledgement
When a potential issue is detected, our monitoring platform first runs a second confirmation check to minimise false positives. If the issue persists, the platform sends an internal alert to the on-call engineer via email and Slack. The on-call engineer acknowledges the alert within 15 minutes and begins investigating the cause.
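
The confirmation step can be sketched as follows. This is a minimal illustration that assumes the "second confirmation check" is a single re-run of the failed check; `should_raise_alert` and `run_check` are hypothetical names, not part of any real platform.

```python
from typing import Callable


def should_raise_alert(run_check: Callable[[], bool]) -> bool:
    """Decide whether to alert the on-call engineer.

    `run_check` returns True when the service looks healthy. An alert is
    raised only if the first check and the confirmation re-check both fail.
    """
    if run_check():
        return False           # first check passed, nothing to report
    return not run_check()     # alert only if the confirmation also fails
```

A transient glitch that clears before the re-check therefore never reaches the on-call engineer.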

Investigation and Triage
Our technical team uses diagnostic tools such as server logs, network tests and hosting control panels to determine whether the issue is a transient glitch, the result of a configuration change or a genuine service outage. Where possible, remediation is initiated immediately, for example by restarting a service or renewing an SSL certificate.

Escalation and Resolution
If the issue cannot be resolved quickly, it is escalated to a Tier 2 engineer or infrastructure lead. The escalation path ensures that more specialised resources are available to address hardware failures, network disruptions or security incidents. Throughout the investigation, the on-call engineer documents actions taken and the current status.

Client Communication
Once an outage affecting a user-facing service is confirmed, we notify clients through multiple channels to ensure timely awareness:
- Mailchimp email notifications: An incident notice is sent to the email address on record, summarising the problem, affected services and our remediation steps.
- SMS alerts: For critical issues, a text message is sent to your registered mobile number so that you receive immediate notification even if you are away from email.
- Status page updates: Our public status page provides real-time information about ongoing incidents, maintenance windows and resolved issues. Clients can visit this page at any time to check the latest service status.
- Social media updates: Brief updates are posted on our official social channels to inform a wider audience and reassure stakeholders that we are aware of the problem and working on a solution.
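
The channel selection listed above can be sketched as a small function. It assumes that email, the status page and social media are used for every confirmed outage and that SMS is added only for critical issues when a mobile number is registered; the function name and channel labels are hypothetical.

```python
from typing import List


def channels_for(severity: str, has_mobile_number: bool) -> List[str]:
    """Pick notification channels for a confirmed, user-facing outage."""
    channels = ["email", "status_page", "social_media"]
    if severity == "critical" and has_mobile_number:
        channels.insert(1, "sms")   # SMS is reserved for critical issues
    return channels
```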

If the outage is part of scheduled maintenance that you told us about in advance, monitoring is paused for that window and no alerts are issued.

Post-Incident Reporting
After service is restored, an incident report is generated outlining the root cause, the corrective actions taken and any preventive measures identified. For major incidents, a more detailed post-mortem is produced within 48 hours. Monthly uptime reports are also provided so you have a transparent record of service availability over time.
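
For readers who want to interpret the figures in a monthly report, uptime is conventionally expressed as the share of the period during which the service was available. The sketch below uses that standard formula; the exact method behind our reports is not specified in this article, and `uptime_percentage` is a hypothetical name.

```python
def uptime_percentage(total_minutes: float, downtime_minutes: float) -> float:
    """Return availability over a period as a percentage."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes
```

For example, a 30-day month has 43,200 minutes, so 43.2 minutes of downtime corresponds to roughly 99.9% uptime.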

Keeping Us Informed
Although the monitoring service is fully managed, we ask that you notify us of any planned maintenance or changes to your infrastructure. This allows us to temporarily suspend monitoring during maintenance windows and avoid unnecessary alerts. If you add new domains or services that require monitoring, please contact our support team so we can update your monitoring profile.
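
Suspending monitoring during a maintenance window amounts to checking whether the current time falls inside any window you have reported. A minimal sketch, assuming each window is stored as a start and end time (the `Window` alias and `monitoring_paused` function are hypothetical):

```python
from datetime import datetime
from typing import List, Tuple

# A maintenance window as (start, end); the end time is exclusive.
Window = Tuple[datetime, datetime]


def monitoring_paused(now: datetime, windows: List[Window]) -> bool:
    """Return True if `now` falls inside any reported maintenance window."""
    return any(start <= now < end for start, end in windows)
```

Treating the end time as exclusive means checks resume at the exact moment the window closes.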

Visual Overview of the Workflow
To make the incident lifecycle easier to understand, our knowledge base includes a simple diagram that outlines the flow from detection to resolution. The steps are as follows:
1. Detection of anomaly.
2. Internal alert and acknowledgement.
3. Investigation and triage.
4. Resolution or escalation.
5. Client notification (Mailchimp, SMS, status page, social media).
6. Post-incident reporting and follow-up.

Maintaining Current Contact Details
To ensure that you receive incident notifications without delay, please keep your contact information up to date. Notify our support team if your email address or mobile number changes so that Mailchimp and SMS alerts continue to reach you.

Service Scope and Boundaries
Our monitoring covers the availability of your hosted websites, control panels, mail services and associated SSL certificates. It does not monitor client-side devices, third-party applications or network conditions outside of our hosting environment. Issues outside the monitoring scope may still affect availability; in such cases we will assist with diagnosis to the extent possible but cannot control external systems.

Severity and Prioritisation
Incidents are classified based on their impact and urgency. Critical incidents involve a complete outage or security risk and trigger immediate escalation. Warnings indicate potential issues, such as high latency or certificate warnings, and are investigated promptly. Informational alerts, for example a temporary spike in response time, are monitored for further signs of degradation.
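
The three levels can be pictured as a simple decision order, shown in the sketch below. The `Severity` enum, the `classify` function and its boolean inputs are hypothetical simplifications; real classification also involves engineer judgement about impact and urgency.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    WARNING = "warning"
    INFORMATIONAL = "informational"


def classify(complete_outage: bool = False,
             security_risk: bool = False,
             sustained_high_latency: bool = False,
             certificate_warning: bool = False) -> Severity:
    """Map observed symptoms to a severity level, most serious first."""
    if complete_outage or security_risk:
        return Severity.CRITICAL        # triggers immediate escalation
    if sustained_high_latency or certificate_warning:
        return Severity.WARNING         # investigated promptly
    # Anything else, such as a temporary response-time spike, is only watched.
    return Severity.INFORMATIONAL
```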

Managing Notification Preferences
If you wish to adjust how you receive notifications (for example, by adding or removing a mobile number for SMS alerts or opting in to RSS feeds from our status page), please contact our support team. We will update your preferences so that notifications reach you through the channels you prefer.
