Editor's note: The following is a guest article from Ed Wright, director of product marketing at Teridion.
We tend to think of the internet as fast, equating abundant bandwidth with speed and performance, but the reality is often otherwise.
The internet is not designed to be a high-performance network; it is designed to be resilient to points of failure.
It's worth noting why internet performance matters so much today. Most web applications, particularly those delivered by SaaS providers, deliver value through dynamic, personalized interactions and bidirectional content flows.
The consequences of poor dynamic internet performance include:
- An inability to enter regions and markets where internet performance is poor, or a prohibitively high cost of doing so.
- High user dissatisfaction, often resulting in high customer turnover.
- Increased infrastructure, headcount and operational cost and complexity as web and SaaS applications are rolled out in regional data centers or points of presence (PoPs).
Consider these top 10 internet performance myths that prevent web applications and SaaS providers from delivering a great experience to global end users.
Myth #1: The internet was designed for optimal performance
The internet was built for resilience, not performance. Border Gateway Protocol (BGP), which handles routing across the ISPs that make up the backbone of the internet, makes routing decisions based on paths, policies or rule sets configured by the ISPs.
Policies and routing rules are typically based on metrics that ensure connectivity and resilience, but also target cost control and deliver the cheapest path. BGP is oblivious to performance around and within the networks it directs traffic towards.
If a particular path or ISP is experiencing loss, delay or jitter, BGP can and will continue to route traffic toward or across that path or ISP.
It should be no surprise that the next "network hop" has much more to do with the cost to the carrier than speed or performance for the user. For this very reason, it is nearly impossible to implement consistent routing policies across ISPs to identify the best-performing path.
Myth #2: Internet performance is good enough
Poor internet performance has a significant negative impact on the performance of web applications and SaaS delivery.
Ilya Grigorik, in his book "High Performance Browser Networking," points to studies that consistently report the same human reactions to interaction delay across all application types and devices.
A response time of 200ms or more registers with users as "lag." Past the 300ms threshold, the application feels "sluggish." At 1 second, the user has mentally checked out and switched context, and at 10 seconds they have given up on the task altogether.
Today, average web page response time is more than 3 seconds, and average mobile response time is more than 8 seconds. This means that most SaaS applications deliver far less productivity value than they could with optimized internet performance.
Enterprises and SaaS vendors spend a lot of time and money on web application performance management, minification and other techniques to increase web and SaaS application performance levels. Yet internet performance levels are passed over with the assumption that performance is good enough.
Poor internet performance is at the core of many SaaS business challenges, including:
- Low user/customer satisfaction scores in the EU, APAC and LATAM markets where application performance is much slower than in the U.S.
- End users who experience poor performance are less likely to purchase and more likely to cancel contracts.
- New interactive features can negatively affect performance, forcing SaaS vendors to choose either attractive features or acceptable performance.
Myth #3: The internet's BGP routing will find the fastest path
The truth is internet routing using BGP does not follow the fastest path. BGP makes one decision — which ISP (called an autonomous system or AS) to hand traffic off to next.
BGP has two drawbacks. First, routes are selected based on cost. ISPs design routing tables based on business relationships negotiated with other ISPs and therefore route traffic based on cost, not speed.
Second, routes don't adapt to congestion or loss. There is no feedback mechanism which allows an ISP to change a route based on path or network performance, so routing traffic into congested networks is a common occurrence. BGP ensures routing resilience, but does not address speed or performance.
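To make this concrete, here is a minimal sketch of BGP-style best-path selection. The route attributes and values are hypothetical illustrations; real BGP implementations work through a longer list of tie-breakers, but none of them is a measured performance metric.

```python
# Minimal sketch of BGP-style best-path selection. The attributes below are
# hypothetical illustrations; the point is that policy values, not measured
# performance, decide which neighboring ISP gets the traffic.

from dataclasses import dataclass

@dataclass
class Route:
    next_hop_as: int            # which neighboring ISP (autonomous system) to hand off to
    local_pref: int             # policy value set by the ISP, usually driven by cost/contracts
    as_path_len: int            # number of ASes the route traverses
    measured_latency_ms: float  # real-world performance -- BGP never looks at this

def best_path(routes):
    # Higher local preference wins first (a business/cost decision),
    # then the shorter AS path. Latency, loss and jitter are ignored.
    return max(routes, key=lambda r: (r.local_pref, -r.as_path_len))

routes = [
    Route(next_hop_as=64500, local_pref=200, as_path_len=4, measured_latency_ms=310.0),
    Route(next_hop_as=64501, local_pref=100, as_path_len=2, measured_latency_ms=45.0),
]

# The congested, high-latency route still wins because policy (cost) outranks performance.
print(best_path(routes))
```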
Myth #4: The speed of light explains poor internet performance
Poor internet performance is rarely due to transmission speed or propagation delay over distance. Light traveling through fiber takes roughly 80ms to get from almost anywhere on the globe to any other point.
Users complaining about slow internet performance are typically experiencing page load delays of 10 seconds or more, meaning propagation delay is a trivial portion of the experienced delay.
Sluggish data-intensive operations such as file transfers or slow web page loads in bandwidth-constrained regions like Asia and Latin America are typically due to traffic congestion, not propagation delay.
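As a back-of-the-envelope check, the sketch below computes fiber propagation delay, assuming light travels through glass at roughly two-thirds of its speed in a vacuum. The 16,000 km distance (roughly New York to Sydney) is an illustrative assumption.

```python
# Back-of-the-envelope propagation delay, to show how small a share of a
# multi-second page load it represents.

SPEED_OF_LIGHT_VACUUM_KM_S = 300_000
FIBER_SPEED_KM_S = SPEED_OF_LIGHT_VACUUM_KM_S * 2 / 3   # light travels ~1/3 slower in glass

distance_km = 16_000                                      # illustrative: roughly NY to Sydney
one_way_ms = distance_km / FIBER_SPEED_KM_S * 1000
round_trip_ms = 2 * one_way_ms

print(f"one-way propagation: {one_way_ms:.0f} ms")                 # ~80 ms
print(f"round trip:          {round_trip_ms:.0f} ms")               # ~160 ms
print(f"share of a 10 s page load: {round_trip_ms / 10_000:.1%}")   # ~1.6%
```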
Myth #5: Building regional points of presence is the best way to improve global internet performance
Regional PoPs don't address the underlying cause of poor internet performance. The original internet vision was for a web application in a single data center to be accessed by users around the globe.
Web sites quickly found that congestion across the internet caused poor performance, particularly in regions with limited internet bandwidth. Website owners and SaaS vendors started using regional PoPs to cache static content like pictures, videos and files for download.
Content Delivery Networks (CDNs) evolved as an efficient way for web applications to improve performance for static content. Web and SaaS applications with dynamic and bidirectional content are unable to take advantage of CDNs, and many companies have gone through the time-consuming and expensive path of building out networks of regional PoPs to address the issue.
It's also important to note that physical proximity doesn't guarantee the best performance. There are instances where "performance" proximity defies logic and routing via a physically more distant PoP provides better overall internet performance.
Myth #6: Performance or Security — You can't have both!
You can optimize internet performance without exposing sensitive data or needing to share SSL certificates with a third party vendor.
CDNs have long been touted as the best way to optimize internet performance. Yet the recent Cloudbleed bug showed the risks of having confidential information stored in caches scattered around the globe.
CDNs improve the user experience by moving static web content closer to the end user, but this requires that cached content be stored unencrypted. Any hacker who gains access to a single cache can compromise the content.
One option to work around this is SSL offload, in which the CDN terminates SSL connections at the edge on behalf of the origin. While this reduces latency, it requires the original web server to share its SSL certificates with the CDN, and sharing SSL credentials may violate corporate policy or privacy regulations.
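To make the trade-off concrete, here is a minimal sketch of what TLS termination at an edge node implies: the edge process must load the origin's certificate and private key before it can decrypt user traffic. The file names and port are placeholders, not a real deployment.

```python
# Minimal sketch of TLS termination at an edge node: to decrypt and serve user
# traffic, the edge must hold the origin's certificate *and* private key.
# File names and the port are placeholder assumptions for illustration only.

import socket
import ssl

context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
# The origin operator has to hand these credentials to the third party running
# the edge -- the policy/compliance risk described above.
context.load_cert_chain(certfile="origin-example-com.crt",
                        keyfile="origin-example-com.key")

with socket.create_server(("0.0.0.0", 8443)) as listener:
    with context.wrap_socket(listener, server_side=True) as tls_listener:
        conn, addr = tls_listener.accept()   # the edge now sees plaintext requests
        print(f"decrypted connection from {addr}")
        conn.close()
```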
Myth #7: Upload performance always stinks!
This myth is the result of two factors:
- Consumer/Residential DSL and Cable typically place caps on upstream bandwidth and forwarding rates.
- Most CDNs only accelerate traffic in the downstream (download) direction.
It is possible to significantly improve web upload performance. Existing approaches such as CDNs provide faster page load times for static content.
As web pages get larger and more dynamic (JavaScript payloads are growing 50% per year), new approaches such as overlay networks are needed to accelerate uploads and dynamic downloads.
Cloud-based internet overlay networks use techniques similar to WAN optimization to provide protocol- and routing-level optimizations, improving performance for bidirectional and dynamic content traffic.
Myth #8: Chinese internet performance problems can't be fixed
Web and SaaS application performance in China can be consistent with U.S. performance. China is a massive internet market for SaaS vendors with more than 700 million users compared to 280 million users in the U.S. Yet SaaS vendors struggle to deliver outstanding user experiences in China.
For example, the network monitoring company ThousandEyes reports 7% packet loss in China vs 0.04% packet loss in the U.S. This is largely due to a series of legislative and technological actions by the Chinese government known as the Great Firewall of China.
The average throughput SaaS vendors deliver to end users in North America is 30 MBps, but these same SaaS vendors deliver only about 8 MBps to users in China. While CDNs improve performance for delivering static content in China, they do not speed dynamic content like SaaS business workflows, which by their nature are customized for each user.
New approaches such as overlay networks address many of the issues surrounding China's internet and the Great Firewall of China.
Myth #9: Monitoring can solve internet backbone problems
Monitoring helps find and diagnose problems, but it doesn't solve them. There's a reason the internet is often represented as a cloud — we can't see what's going on inside this network of networks. The advent of sophisticated monitoring tools gave network operations groups alerts and the data to diagnose what might be causing poor network performance.
The most common culprits include:
- Congestion: Traffic jams on the internet are inevitable. With the volume of rich content growing and the sheer number of internet users (estimated at half the world's population), congestion slows down traffic.
- BGP Convergence: In Myth #3, we explained that BGP only governs which autonomous system (AS) to route to next. While routes are being recomputed after a change (a period known as BGP convergence), traffic can get stuck in an internet "eddy," looping between ASes until it finally times out.
- Outages: No network is infallible, as widely reported outages for Amazon Web Services, Google and Azure confirm. These network failures are disruptive and often bring traffic to a standstill.
- Government regulation: Governments often have an interest in controlling the free flow of information, and that means controlling internet access. As noted in Myth #8, the Great Firewall of China is a prominent example that causes severe packet loss.
Monitoring tools can help pinpoint the cause of an issue and explain delays. They can also assign blame to the ISP, cloud service provider or customer's network.
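As a rough illustration of the kind of active measurement such tools perform, the sketch below times a series of TCP connection attempts and reports latency and failure rate. The target host, port and sample count are placeholder assumptions.

```python
# Minimal sketch of active path measurement: time TCP connection attempts to a
# target and report latency and failure rate. Host, port and sample count are
# placeholders, not a real monitoring configuration.

import socket
import statistics
import time

def probe(host, port=443, samples=10, timeout=2.0):
    rtts_ms, failures = [], 0
    for _ in range(samples):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                rtts_ms.append((time.monotonic() - start) * 1000)
        except OSError:
            failures += 1
    return rtts_ms, failures

rtts, failures = probe("example.com")
if rtts:
    print(f"median RTT: {statistics.median(rtts):.1f} ms, max RTT: {max(rtts):.1f} ms")
print(f"failed probes: {failures}/{failures + len(rtts)}")

# A spike in RTT or failure rate tells you where and when something is wrong,
# but it does not reroute a single packet -- which is the point of this myth.
```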
Monitoring is a good idea, but on its own is not enough. Many times monitoring provides only middle-of-the-night alerts for problems that are out of your control.
Internet issues can be solved through approaches a business controls directly, such as expensive private lines to customers, or through newer technologies such as overlay networks.
Myth #10: Putting servers close to end users solves performance problems
Server proximity can help to some degree, but it is costly and complex. When users in remote or distant regions experience lackluster performance, the first thing that often comes to mind is building or renting a PoP closer to users in that region.
This course of action introduces significant cost and operational challenges, including:
- Code modifications required to support multiple application instances.
- Management and synchronization of duplicated content between multiple data centers.
- Significant headcount and resource increases to plan, execute and maintain the new PoPs.
- Loss of business agility resulting from all of these compromises.
Internet proximity is very different from physical proximity. Even after putting in regional PoPs, the last mile can still be a problem for two reasons.
First, consider congestion. The metric that typically determines data center location is round-trip time (RTT) to a major user base, usually a large city. However, low latency does not guarantee high throughput, and users can still see poor performance on congested routes (see the sketch at the end of this section).
Second, user traffic can still be routed in loops, or even to different PoPs, due to BGP convergence. For example, a New Jersey data center may intend to better serve New York-metro users, but local user traffic could still be routed to London or some other location.
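To illustrate the congestion point above, here is a rough sketch using the well-known Mathis et al. approximation for TCP's throughput ceiling, roughly MSS / (RTT x sqrt(loss)). The RTT and loss figures are illustrative assumptions, not measurements from any particular network.

```python
# Why low latency alone doesn't guarantee throughput: a rough TCP throughput
# ceiling using the Mathis et al. approximation, throughput <= MSS / (RTT * sqrt(loss)).
# The RTT and loss values below are illustrative assumptions.

from math import sqrt

MSS_BYTES = 1460  # typical TCP maximum segment size

def tcp_ceiling_mbps(rtt_ms, loss_rate):
    rtt_s = rtt_ms / 1000
    return (MSS_BYTES / (rtt_s * sqrt(loss_rate))) * 8 / 1_000_000

# A nearby PoP with a congested, lossy last mile...
print(f"20 ms RTT, 1% loss:    {tcp_ceiling_mbps(20, 0.01):.1f} Mbps")    # ~5.8 Mbps
# ...can be slower than a farther path with a clean route.
print(f"80 ms RTT, 0.01% loss: {tcp_ceiling_mbps(80, 0.0001):.1f} Mbps")  # ~14.6 Mbps
```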