What is Low Latency?

A low-latency system, such as a network, executes tasks and returns results with as little delay as possible. Low latency means that dependent systems spend less time waiting, which increases productivity and improves the user experience.

How is Low Latency Measured?

Latency is the delay between the initiation and completion of a task. No system has zero latency, as there must always be some kind of processing overhead, even if this is only an infinitesimal fraction of a second.

Low latency is, therefore, a relative term to some extent. System architects might be willing to accept higher latency in some contexts, but not others. For example, when a user refreshes a web page, they generally won't be able to detect any latency that is substantially less than a second.

In other systems, such as high-speed financial trading systems, a single microsecond could impact the outcome of a deal. These systems have very little tolerance for latency, so results must come back in a far smaller fraction of a second than a web user would ever notice.

There are two main ways of measuring latency:

  • Time to First Byte (TTFB): The time elapsed from when the client sends a request to the moment it receives the first byte of the server's response.
  • Round Trip Time (RTT): The total journey time for a packet to travel to the server and then back to the client. 
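
As a rough illustration of the two measurements above, the following Python sketch times both against a throwaway HTTP server on localhost. Everything here is an assumption made for the example: the helper names (start_demo_server, measure_connect_rtt, measure_ttfb) are hypothetical, RTT is approximated by the time to complete a TCP handshake, and TTFB is approximated by the time until the response status line and headers have been read, excluding connection setup.

```python
import http.client
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class _DemoHandler(BaseHTTPRequestHandler):
    """Tiny handler used only so the example has something to measure."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_demo_server():
    """Start a local HTTP server on a free port; returns (server, port)."""
    server = HTTPServer(("127.0.0.1", 0), _DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


def measure_connect_rtt(host, port):
    """Approximate RTT as the time taken to complete a TCP handshake."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        elapsed = time.perf_counter() - start
    return elapsed


def measure_ttfb(host, port, path="/"):
    """Approximate TTFB: time from sending the request until the
    response status line and headers have been read."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.connect()  # connection setup is excluded from the timing
        start = time.perf_counter()
        conn.request("GET", path)
        response = conn.getresponse()
        elapsed = time.perf_counter() - start
        response.read()  # drain the body so the connection closes cleanly
        return elapsed
    finally:
        conn.close()


if __name__ == "__main__":
    server, port = start_demo_server()
    try:
        print(f"RTT  ~ {measure_connect_rtt('127.0.0.1', port) * 1000:.3f} ms")
        print(f"TTFB ~ {measure_ttfb('127.0.0.1', port) * 1000:.3f} ms")
    finally:
        server.shutdown()
        server.server_close()
```

On localhost both numbers will be tiny. Against a real server, TTFB also includes the server's processing time, which is why it is usually larger than RTT.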

Many engineers judge latency by benchmarking, which means comparing their system against similar systems. A benchmarking test can involve comparisons of millions of queries. The effective latency is the difference in TTFB or RTT between systems. A system counts as low latency when its effective latency is at or below the level required.
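
A benchmark comparison of this kind might be summarized as in the sketch below. It is only an illustration: the function names are hypothetical, the sample values are made up, and the choice of the median and a nearest-rank percentile is an assumption rather than a standard the text prescribes.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_gap(system_ms, baseline_ms):
    """Difference in median latency between a system and a baseline."""
    return statistics.median(system_ms) - statistics.median(baseline_ms)


def meets_target(samples_ms, target_ms, pct=99):
    """True if the chosen percentile is at or below the target."""
    return percentile(samples_ms, pct) <= target_ms


# Made-up RTT samples in milliseconds, for illustration only.
system = [10, 12, 11, 13, 200]
baseline = [9, 10, 11]

print(latency_gap(system, baseline))      # gap between the two medians
print(meets_target(system, target_ms=50)) # judged on the 99th percentile
```

Judging on a high percentile rather than the median is a design choice: a single slow outlier, like the 200 ms sample here, barely moves the median but can dominate what users experience.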

What Hinders Low Latency?

Latency can increase at any point during the execution of a task. For example, consider an application that sends a query to a database. The database processes the query and returns some results. As these two systems communicate, latency can creep in for a number of reasons:

  • Poor configuration: The initial request might not have the correct configuration. The sender could have structured their query badly or made a mistake with their credentials. Even a non-fatal error will result in greater latency.
  • Network traffic: On busy networks, packets take a longer time to reach their destination. This can happen if there is a surge in network traffic, if the network doesn't have sufficient bandwidth, or if the network infrastructure isn't configured correctly.
  • Resource availability: If the server is busy, it will take some time to deal with incoming queries. A database query, like that mentioned in the above example, will have to wait until the server can allocate processing resources.
  • Processing overhead: Complex queries take longer to execute. Query complexity depends in part on the quality of the underlying data, so queries will run faster on data that has passed through a cleansing and integration process. Data repositories such as data warehouses tend to return results faster, as their data has already been processed before it is queried.
  • Outages: In some instances, an infrastructure component may be unavailable. This may be a cloud service that is offline, an on-premise system that has crashed, or a network failure.
  • Security: Data security is non-negotiable, but it can create some latency overheads. The right configurations and software can minimize the additional latency associated with essential security measures.  

Any of these factors can increase latency. If the overall delay is at or below benchmarking targets, then the network is low latency. Otherwise, the organization may need to consider ways to reduce latency.
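
One way to reason about the overall delay is as a simple latency budget: add up the delay from each factor and compare the total with the target. The sketch below assumes that the contributions are additive and uses invented numbers and hypothetical names purely for illustration.

```python
def total_latency_ms(components):
    """Sum the per-factor delays (assumes they simply add up)."""
    return sum(components.values())


def largest_contributor(components):
    """Name of the factor that adds the most delay."""
    return max(components, key=components.get)


def within_target(components, target_ms):
    """True if the overall delay is at or below the target."""
    return total_latency_ms(components) <= target_ms


# Invented figures in milliseconds; real values would come from measurement.
example = {
    "network": 40.0,
    "queueing": 15.0,
    "query_processing": 120.0,
    "security_checks": 10.0,
}

print(total_latency_ms(example))
print(largest_contributor(example))
print(within_target(example, target_ms=200))
```

Breaking the total down this way shows where a reduction would have the most effect before any larger change is considered.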

How Can Data Repositories Achieve Low Latency?

Low latency is often the result of good decisions about network architecture, IT infrastructure, and data governance policy. No single technique guarantees low latency. Instead, organizations try to get the fundamentals right, such as:

Store Data in the Right Place

Data should be close to the processes that rely on it. Closeness here is less about physical location (although geographic distance can increase latency) and more about the network path between the two. For example, if an external data request has to pass through some additional layers of security, this can increase latency.

Integrate Data into a Single Source

One way to reduce overall latency is to combine tasks where possible. For database tasks, this means integrating multiple sources into a single repository. With data in a single source, a single query can yield detailed results. This leads to a faster overall processing time than querying multiple sources.
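
The difference can be sketched with Python's built-in sqlite3 module. The tables, data, and function names below are hypothetical. Two in-memory databases stand in for separate sources, and a third stands in for the integrated repository. With separate sources the application issues one query per customer. With a single source, one JOIN returns the same result.

```python
import sqlite3

CUSTOMERS = [(1, "Ada"), (2, "Grace")]
ORDERS = [(101, 1, 25.0), (102, 1, 40.0), (103, 2, 15.0)]

CUSTOMERS_DDL = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"
ORDERS_DDL = (
    "CREATE TABLE orders "
    "(id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)


def build_separate_sources():
    """Two databases, standing in for two unintegrated sources."""
    customers_db = sqlite3.connect(":memory:")
    customers_db.execute(CUSTOMERS_DDL)
    customers_db.executemany("INSERT INTO customers VALUES (?, ?)", CUSTOMERS)
    orders_db = sqlite3.connect(":memory:")
    orders_db.execute(ORDERS_DDL)
    orders_db.executemany("INSERT INTO orders VALUES (?, ?, ?)", ORDERS)
    return customers_db, orders_db


def build_integrated_source():
    """One database holding both tables."""
    db = sqlite3.connect(":memory:")
    db.execute(CUSTOMERS_DDL)
    db.execute(ORDERS_DDL)
    db.executemany("INSERT INTO customers VALUES (?, ?)", CUSTOMERS)
    db.executemany("INSERT INTO orders VALUES (?, ?, ?)", ORDERS)
    return db


def totals_from_separate_sources(customers_db, orders_db):
    """One query for the customers, then one more query per customer."""
    rows = customers_db.execute(
        "SELECT id, name FROM customers ORDER BY id"
    ).fetchall()
    queries = 1
    results = []
    for customer_id, name in rows:
        (total,) = orders_db.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        queries += 1
        results.append((name, total))
    return results, queries


def totals_from_integrated_source(db):
    """A single JOIN query against the integrated repository."""
    rows = db.execute(
        "SELECT c.name, COALESCE(SUM(o.total), 0) "
        "FROM customers c LEFT JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.id, c.name ORDER BY c.id"
    ).fetchall()
    return rows, 1


separate_results, separate_queries = totals_from_separate_sources(
    *build_separate_sources()
)
integrated_results, integrated_queries = totals_from_integrated_source(
    build_integrated_source()
)
print(separate_results, separate_queries)
print(integrated_results, integrated_queries)
```

In memory the difference is negligible, but when each query is a network round trip, the number of queries tends to dominate the total delay.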

Pre-Process Data Where Possible

Big Data platforms such as Hadoop and Spark have sophisticated tools that can navigate massive unstructured data repositories in a short time. However, this can be slower than working with a database that has been properly transformed and integrated. Transformed data is in a unified schema, and queries generally execute faster on it than they would on disparate schemas.

Ensure that the Network is Fit for Purpose

Network latency is one of the major factors in overall system latency. Often, this is a network engineering issue that requires reconfiguring the network to allocate resources dynamically. In an enterprise setting, consistently high latency may be justification for a substantial network upgrade.

Choose Wisely between Cloud and On-Premise Systems

On-premise systems are not inherently faster than their cloud equivalents. While on-premise systems have some obvious advantages, such as a shorter network journey for data, there are other factors that can impact latency. Cloud servers, for example, often have better underlying hardware than on-premise systems, which means faster processing times. This contributes to lower overall latency.

Benchmarking is essential when comparing options, as the trade-offs between these factors determine which option ultimately delivers the lowest latency.