Maximizing Efficiency in SaaS Integrations at SSENSE — Our Guide to Monitoring Enterprise SaaS Products

Published in SSENSE-TECH, Aug 25, 2023

Most digital transformations today use Software-as-a-Service (SaaS) products to accelerate the journey. Monitoring the availability and performance of SaaS products in enterprise environments is crucial not only for maintaining uninterrupted operations but also for maximizing efficiency and delivering exceptional user experiences.

When assessing the health of SaaS systems that integrate with enterprise applications, it’s important to understand both the internal architecture of the third-party system and the nature of the SaaS product. Most of these products run on one of the public cloud infrastructures.

Understanding how tenants are isolated is critical to designing the monitoring systems accordingly. There are several ways to isolate the tenants in a SaaS system:

Single Tenancy
Multi-Tenancy
Hybrid
Serverless

Let’s explore each one and its impact on monitoring.

Single Tenancy

In a single tenancy architecture, each customer or tenant has their own dedicated instance of the application and infrastructure. This means that the software is deployed separately for each customer, providing them with isolated environments and dedicated resources. Monitoring a single tenancy environment is generally straightforward since each customer’s instance can be monitored independently.

Impact on Monitoring:

Monitoring is focused on individual instances, allowing for granular visibility into the performance, availability, and security of each customer’s environment.
Resource utilization in the single-tenant stack can be tracked easily, as long as the SaaS product exposes its monitoring data.
The isolation between instances reduces the risk of performance interference or security breaches affecting multiple customers simultaneously.
Typically, when SaaS products support single tenancy, they expose more monitoring capabilities that the enterprise can directly consume.

Multi-Tenancy

In a multi-tenancy architecture, multiple customers or tenants share a single instance of the application and infrastructure. Tenants are logically isolated from one another through data partitioning and access controls. Monitoring in a multi-tenancy environment requires a gray-box approach, as SaaS products don’t expose fine-grained monitoring capabilities. Some providers expose health check endpoints, but these are usually high level. Although there are cost advantages to selecting multi-tenant SaaS products, the opacity of the monitoring systems can become a major concern.

Impact on Monitoring:

Resources are usually pooled and shared between multiple tenants, so a single tenant can end up consuming a dominant share of them.
Typically, compliance and security frameworks are leveraged to ensure that the data is isolated and encrypted. For example, SOC compliance and periodic audits help in this case.
Monitoring should assess the scalability of the infrastructure to handle increasing tenant loads and identify any bottlenecks that may affect all tenants.
Because a noisy neighbor in a multi-tenant system is bound to affect the non-functional aspects of the system, we should be extra vigilant in setting up proper monitoring of the SLAs for non-functional requirements (NFRs).

Hybrid

A hybrid architecture combines elements of both single tenancy and multi-tenancy. It allows for a mix of shared infrastructure and dedicated instances based on the specific needs of the customer. Certain components may be shared among tenants, while others can be customized or isolated for specific customers. For instance, most companies have separate compute instances for tenants and use shared data persistence. Monitoring in a hybrid environment therefore covers both the shared resources and the dedicated instances.

Impact on Monitoring:

As consumers of these services, enterprises should identify the shared resources and dedicated ones. Even in hybrid tenancy products, access to logs and SLA metrics can become a challenge as internal monitoring infrastructures are shared resources. Hence, all the challenges of monitoring a multi-tenant SaaS apply here as well.
When using a hybrid SaaS product, it’s important to note that the security of shared components, such as databases, may be covered by the vendor. However, instance-specific compute infrastructure, such as containers and virtual machines, should be monitored for security compliance.

Serverless

Serverless is an emerging isolation model that is used by enterprise SaaS products depending on use cases. Twilio, for example, uses serverless architecture for its Functions. Twilio is a communication platform as a service and Twilio Functions is a serverless environment used to deploy event-driven Twilio applications. This is essentially a single-tenant model, but elastic, which uses serverless architectural paradigms at its foundation. Here monitoring is typically focused on the individual functions or serverless services.

Impact on Monitoring:

The granularity of monitoring is focused on individual functions or services, tracking their performance, execution times, and resource usage.
A key factor to consider is invocation monitoring, which should track the invocation of functions and identify any errors or latency issues in real time.
These services usually auto-scale seamlessly; however, the time taken to handle burst traffic should be monitored.
Distributed tracing is challenging in FaaS-based SaaS products due to the nature of the architecture. However, as these compute instances are fairly isolated, we should expect SaaS providers to expose the logs and monitoring of these services.
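
To make invocation monitoring more concrete, here is a minimal Python sketch that aggregates invocation records per function. The record fields (function, ok, duration_ms) and the function names are hypothetical; a real provider's log format will differ.

```python
from collections import defaultdict


def summarize_invocations(invocations):
    """Aggregate count, error count, and slowest run per function."""
    stats = defaultdict(lambda: {"count": 0, "errors": 0, "max_ms": 0})
    for inv in invocations:
        entry = stats[inv["function"]]
        entry["count"] += 1
        entry["errors"] += 0 if inv["ok"] else 1
        entry["max_ms"] = max(entry["max_ms"], inv["duration_ms"])
    return dict(stats)


# Hypothetical invocation records, e.g. parsed from exported function logs.
invocations = [
    {"function": "send-sms", "ok": True, "duration_ms": 120},
    {"function": "send-sms", "ok": False, "duration_ms": 900},
    {"function": "verify-otp", "ok": True, "duration_ms": 80},
]
summary = summarize_invocations(invocations)
```

Tracking the slowest run per function is one simple way to spot slow responses during traffic bursts; percentiles would give a fuller picture.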

Techniques for Availability Monitoring

Heartbeat Monitoring: This entails sending regular signals or status updates from a system or component, indicating its operational state and proper functionality. The main objective of heartbeat monitoring is to ensure the continuous availability of the system or component by receiving these periodic heartbeat signals. While these signals confirm that the system is active and capable of sending and receiving requests, they may not offer in-depth insights into the internal health or functionality of the system. Heartbeat monitoring is commonly employed to detect disruptions, failures, or instances of downtime, serving as a high-level indicator of the overall operational status of the system.
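
As an illustration, the receiving side of heartbeat monitoring could be sketched in Python as follows. The 30-second interval, the grace factor, and the component names are assumptions made for the example, not values from any specific product.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat per component and flags missed ones."""
    interval_s: float          # expected seconds between heartbeats (assumed)
    grace_factor: float = 2.0  # intervals tolerated before a beat counts as missed
    last_seen: dict = field(default_factory=dict)

    def record(self, component: str, now: float) -> None:
        """Store the time at which a heartbeat was received."""
        self.last_seen[component] = now

    def missed(self, now: float) -> list:
        """Return components whose last heartbeat is older than the allowed gap."""
        allowed_gap = self.interval_s * self.grace_factor
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > allowed_gap
        )


monitor = HeartbeatMonitor(interval_s=30)
monitor.record("order-sync", now=0.0)       # hypothetical component names
monitor.record("inventory-sync", now=50.0)
late = monitor.missed(now=70.0)             # only "order-sync" exceeds the 60 s gap
```

Timestamps are passed in explicitly so the logic can be tested without waiting; in practice they would come from a clock such as time.time().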

Health Checks: These are systematic verifications that evaluate the health and optimal operation of essential components within a system. These checks entail active probing or testing of the components to validate their availability. In contrast to heartbeat monitoring, health checks offer more comprehensive insights into the internal health and functionality of a system or component. By conducting these checks, organizations can identify any issues or anomalies, enabling proactive maintenance, troubleshooting, or remedial actions based on the results obtained.

{
  "status": "healthy",
  "timestamp": "2023-07-05T10:15:30Z",
  "components": [
    {
      "name": "Database",
      "status": 200,
      "responseTime": 5
    },
    {
      "name": "LegacyAPIConnectivity",
      "status": 200,
      "responseTime": 20
    },
    {
      "name": "File Storage",
      "status": 200,
      "responseTime": 8
    }
  ]
}
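
A consumer of such a response might evaluate it along these lines. This is a sketch only: it assumes that responseTime is in milliseconds and that a 100 ms threshold is acceptable, neither of which is stated by the payload itself, and the sample is modified so that one component fails.

```python
import json


def unhealthy_components(payload: str, max_response_ms: int = 100) -> list:
    """Return names of components that are failing or slower than the threshold."""
    report = json.loads(payload)
    problems = []
    for component in report.get("components", []):
        failing = component.get("status") != 200
        too_slow = component.get("responseTime", 0) > max_response_ms
        if failing or too_slow:
            problems.append(component["name"])
    return problems


# Same shape as the sample above, with one component reporting an error.
sample = """
{
  "status": "degraded",
  "timestamp": "2023-07-05T10:15:30Z",
  "components": [
    {"name": "Database", "status": 200, "responseTime": 5},
    {"name": "LegacyAPIConnectivity", "status": 503, "responseTime": 20},
    {"name": "File Storage", "status": 200, "responseTime": 8}
  ]
}
"""
problems = unhealthy_components(sample)
```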

Synthetic Transactions: This technique creates synthetic transactions that simulate user interactions and monitor the availability of key workflows or critical paths within the system. By regularly executing synthetic transactions, you can detect any availability issues and ensure that critical functionality is operational. Synthetic transactions come at a cost, both in triggering the calls and in the compute power they consume.
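
A synthetic transaction runner can be sketched as a sequence of timed steps that stops at the first failure. The steps below (login, add_to_cart, checkout) are hypothetical stand-ins; in a real setup each would call the SaaS product's API.

```python
import time


def run_synthetic_transaction(steps, clock=time.perf_counter):
    """Run each (name, callable) step in order and stop at the first failure."""
    results = []
    for name, step in steps:
        started = clock()
        try:
            step()
            ok, error = True, None
        except Exception as exc:  # any failure marks the workflow as unavailable
            ok, error = False, str(exc)
        results.append(
            {"step": name, "ok": ok, "seconds": clock() - started, "error": error}
        )
        if not ok:
            break
    return results


# Hypothetical steps standing in for real API calls.
def login():
    pass


def add_to_cart():
    pass


def checkout():
    raise TimeoutError("payment provider timed out")


results = run_synthetic_transaction(
    [("login", login), ("add_to_cart", add_to_cart), ("checkout", checkout)]
)
```

Stopping at the first failed step avoids spending further calls on a workflow that is already known to be broken, which helps contain the cost mentioned above.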

Log Monitoring: This involves collecting and analyzing log data generated by various components of the SaaS product. HTTP 4xx response codes are symptomatic of client errors, including data issues, whereas HTTP 5xx response codes reflect issues on the server side and should be closely monitored. Timeouts and network connectivity exceptions can be used specifically to monitor the health of the SaaS product. This pattern has a lower cost implication. Its drawback is that monitoring data is only available when there are live transactions. Typically, the SLO provided by a SaaS product is based on 24x7 operation and excludes planned maintenance windows. Hence, relying on log monitoring for health checks makes it difficult to compare actual performance against the SLO promise.
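
The distinction between client-side and server-side errors can be sketched in Python as follows. The input is assumed to be a list of HTTP status codes already extracted from integration logs; how they are extracted depends on the logging setup.

```python
from collections import Counter


def classify(status_code: int) -> str:
    """Bucket an HTTP status code into ok, client_error, or server_error."""
    if 500 <= status_code <= 599:
        return "server_error"
    if 400 <= status_code <= 499:
        return "client_error"
    return "ok"


def summarize(status_codes):
    """Count error classes and compute the share of server-side errors."""
    counts = Counter(classify(code) for code in status_codes)
    total = len(status_codes)
    return {
        "total": total,
        "client_errors": counts["client_error"],
        "server_errors": counts["server_error"],
        "server_error_rate": counts["server_error"] / total if total else 0.0,
    }


# Hypothetical status codes taken from one monitoring window.
summary = summarize([200, 201, 404, 500, 503, 200, 200, 422, 200, 200])
```

Note that an empty window yields a rate of 0.0, which reflects the drawback described above: without live traffic, the logs say nothing about availability.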

Log Integration: Recently, we have been seeing SaaS products providing log integrations from their platforms to cloud-native logging platforms such as Datadog. This is achieved through one of the following internal techniques:

Log Forwarding: This involves forwarding logs from the SaaS product to a central log management system.
Log Ingestion: This involves ingesting logs from the SaaS product into a cloud-based log management system.

We could also use APIs provided by the SaaS product to collect logs and ingest them into logging platforms for critical metrics/logs.
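
The API-based approach could look roughly like the sketch below, which pulls pages of log entries and hands each entry to a forwarder. Both fetch_page and forward are hypothetical callables supplied by the caller, and cursor-based pagination is an assumption; an actual SaaS log API may paginate differently.

```python
def forward_new_logs(fetch_page, forward, cursor=None):
    """Pull log pages until no further cursor is returned.

    fetch_page(cursor) must return (entries, next_cursor); both callables
    are placeholders for a real SaaS log API and a logging platform client.
    Returns the last cursor used and the number of entries forwarded.
    """
    forwarded = 0
    while True:
        entries, next_cursor = fetch_page(cursor)
        for entry in entries:
            forward(entry)
            forwarded += 1
        if next_cursor is None:
            return cursor, forwarded
        cursor = next_cursor


# In-memory stand-ins for the SaaS API and the logging platform.
pages = {None: (["log-a", "log-b"], "c1"), "c1": (["log-c"], None)}
sink = []
last_cursor, count = forward_new_logs(lambda c: pages[c], sink.append)
```

Persisting the returned cursor between runs would let the next poll resume where the previous one stopped.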

Techniques for Performance SLA Monitoring

Log Monitoring: This involves implementing comprehensive performance monitoring to track key metrics such as response times, throughput, latency, and error rates. It includes using APM (Application Performance Monitoring) tools, custom instrumentation, or cloud-native monitoring solutions to gain real-time insights into the performance of critical components and services.
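
As a rough illustration of these metrics, the following Python sketch derives a p95 latency and throughput from a window of request latencies and compares the result against an SLA target. The sample latencies, the 5-second window, and the 300 ms target are all made up for the example.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies (ms) observed in a 5-second window.
latencies_ms = [120, 95, 110, 480, 105, 99, 130, 101, 97, 115]
window_seconds = 5
sla_p95_ms = 300  # assumed SLA target

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
throughput_rps = len(latencies_ms) / window_seconds
within_sla = p95 <= sla_p95_ms
```

Percentiles are used here instead of averages because a single slow request, like the 480 ms one above, can hide behind a healthy-looking mean.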

Event Response Time Monitoring: This is a valuable technique for measuring the elapsed time from event creation to the generation of a response by the SaaS system. It provides insights into system performance and responsiveness, allowing enterprises to assess efficiency, identify bottlenecks, and meet performance objectives and SLAs.
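
A minimal Python sketch of this measurement is shown below. It assumes that each event carries ISO 8601 timestamps for creation and response; the field names, event IDs, and the 2-second target are hypothetical.

```python
from datetime import datetime


def response_seconds(created_at: str, responded_at: str) -> float:
    """Elapsed seconds between event creation and the SaaS response."""
    created = datetime.fromisoformat(created_at)
    responded = datetime.fromisoformat(responded_at)
    return (responded - created).total_seconds()


def sla_breaches(events, max_seconds: float) -> list:
    """Return IDs of events whose response took longer than the target."""
    return [
        event["id"]
        for event in events
        if response_seconds(event["created_at"], event["responded_at"]) > max_seconds
    ]


# Hypothetical events with explicit UTC offsets.
events = [
    {"id": "evt-1", "created_at": "2023-07-05T10:15:30+00:00",
     "responded_at": "2023-07-05T10:15:31+00:00"},
    {"id": "evt-2", "created_at": "2023-07-05T10:16:00+00:00",
     "responded_at": "2023-07-05T10:16:04.500+00:00"},
]
breaches = sla_breaches(events, max_seconds=2.0)
```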

Real User Monitoring (RUM): Real user monitoring can be leveraged to capture performance metrics from actual user interactions with the SaaS product. By instrumenting client-side code or utilizing RUM tools, you can gather data on page load times, transaction durations, and user experience. RUM helps measure and analyze performance from the end-user perspective, allowing you to identify areas for improvement and align with SLA targets.

New Ways, New Challenges

As the adoption of SaaS continues to evolve, there is a growing need for seamless integration and unified monitoring systems between SaaS vendors and enterprises. The goal is to provide a consolidated view of the enterprise’s health, allowing customers to focus on their core business operations. This level of transparency benefits both SaaS products and enterprises alike. In this article, we have explored various monitoring techniques and their applicable scenarios. By leveraging these techniques, enterprises can streamline operations, drive efficiency, and achieve business success while trusting that their SaaS solutions are well-monitored and optimized.

Editorial reviews by Catherine Heim & Mario Bittencourt

Written by Perumal Babu