
In May 2018, the discussion "Measuring Performance based on location of A-Shell server" took place on the A-Shell forum.

Question

My customer is asking for performance measurement comparisons before migrating A-Shell to a cloud. Specifically they want to know what difference there is (if any) when A-Shell is running on a LAN, the WAN, or in a Cloud.

Since almost everything happens on the server (through AlphaLAN), wouldn't the main difference be how long it takes the user's PC to receive screen displays from the server? Does it make sense to measure that (and if so, how?)

Also, much of the performance differences could be due to the speed of the cloud's processors versus the VM server's, not to mention the strength and bandwidth of the internet connection.

We also use ATE; does that act as a true client, or is it just another terminal emulator? If it's a client, does it bear its own type of measurement? Is there a way to time client-to-server transactions (and vice versa)?

Is this even a realistic thing to measure? If so, has anyone done this and can they share their solution?

If not, what would make more sense to measure?

For example, does it make more sense to measure the time to run a batch job (e.g., reading through a large ISAM file and displaying data on screen) using four variations:

a) access the A-Shell VM from the VM server's location (LAN)
b) access the same VM from a remote location (WAN)
c) access cloud-based A-Shell from the VM server's location
d) access the same cloud-based A-Shell from the above remote location

Are c and d redundant? Should I test from multiple remote locations?

With the batch job approach, should I do each type of run 10 times and then average the results? Is there a standard procedure for this kind of thing?

I'd appreciate any thoughts or suggestions. Thank you.

(Later:) Another thought for the batch job: instead of reading an ISAM file, would it be sufficient to merely count to 100,000 and display the numbers on screen? The display would be stilted unless I only displayed every 100th number. Is this a test that would accurately provide metrics defining the response time differences caused by accessing A-Shell from various locations?

Answer

Trying to come up with a meaningful measure of performance—as opposed to MIPS, i.e. Meaningless Indicator of Performance, Stupid—is not easy.

The first suggestion I would make is to assign relative degrees of importance to the various types of performance that are substantially independent of each other:

1) CPU performance: calculations, ability to handle a lot of users active at once
2) Data I/O performance: memory access, disk access (random or streaming), queue & cache efficiency, etc.
3) Terminal I/O performance: both bandwidth and latency

One thing I can tell you is that parts 1 and 2 are pretty much independent of the network architecture. Your CPU and ISAM tests are going to perform the same regardless of whether the server and client are on the same machine, the same LAN, or on separate planets (assuming the same server capabilities). It's a little more difficult to make blanket statements about a physical vs a virtual server, but I think most of the industry is in agreement that VM technology puts very little extra overhead on the system, especially compared to the advantages it offers, so basically no one chooses physical vs virtual for performance reasons. In fact, it's quite the opposite.

The only situation where the network architecture affects the Data I/O performance would be in the case of NAS or SAN disks, whereby the CPU and disks are separated by some kind of network. That can have a huge but extremely variable effect on performance. At one extreme would be shared directories across a LAN or WAN, which can be quite slow, at least when it comes to multi-user access, due to a combination of the network delay/bottleneck and the need for the file server to coordinate locking with multiple clients on remote machines. At the other would be a dedicated SAN, which might perform nearly as well as a disk local to the server.
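
If you do want to put a rough number on the Data I/O side, a small timing program in the same spirit as the COUNT2 example further down can be pointed at whichever disk (local drive, shared directory, NAS/SAN path) you care about. The sketch below is only a starting point rather than a finished benchmark: the program name and scratch file name are arbitrary, it assumes standard AlphaBASIC-style sequential file statements (OPEN, PRINT #, INPUT LINE #, KILL), and at these small sizes OS caching can mask a lot of the underlying disk speed.

program DSKTST,1.0(100)   ! rough sequential disk I/O timing sketch

SIGNIFICANCE 11

MAP1 MISC
    MAP2 USECS,B,4
    MAP2 I,F
    MAP2 REC$,S,80

    REC$ = "0123456789012345678901234567890123456789"   ! 40-byte test record

    XCALL TIMES, 3, USECS                ! timer reading before the I/O
    OPEN #1, "DSKTST.TMP", OUTPUT        ! write 10,000 records to a scratch file
    FOR I = 1 TO 10000
        PRINT #1, REC$
    NEXT I
    CLOSE #1
    OPEN #1, "DSKTST.TMP", INPUT         ! then read them all back
    FOR I = 1 TO 10000
        INPUT LINE #1, REC$
    NEXT I
    CLOSE #1
    XCALL TIMES, 3, USECS                ! timer reading after the I/O
    ? USECS;"us"                         ! elapsed microseconds for write + read-back
    KILL "DSKTST.TMP"                    ! clean up the scratch file
    end

Running it once with the scratch file on a local disk and again with it on the shared/NAS path should at least give a feel for the relative difference.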

In your case, the choices are mainly going to differ in the area of Terminal I/O performance. If your application is purely text based, and you're connecting via SSH or Telnet, you probably don't need much bandwidth, so that's not likely to be a problem. But the turnaround latency could become a slight annoyance: in a typical input environment, each character typed has to travel to the server and back before it shows up on the screen, and once that round-trip delay reaches a few hundred milliseconds, it starts to annoy some people. That's the one area where the cloud or WAN may be noticeably worse than the LAN.

GUI would typically increase the demand on the bandwidth as there would be a lot more data transferred from the server to the client. But depending on the design, it might decrease the pressure on latency, since much of the UI activity can take place purely locally.

Maybe that's an evasive answer, but I'm not sure what else I can say in general about the issue of how changing environments affects performance. If you have two actual environments that you can access, and want to come up with some metrics to quantify how they compare performance-wise, then we can get into the various kinds of tests that could be run. 

Your test of just counting up to 100,000 in a FOR/NEXT loop might give some indication of pure CPU performance if you didn't output the numbers. Otherwise it will be primarily a measure of terminal I/O performance, although you can achieve some compromise between the two by only outputting every nth value, like you suggest. As an example:

program COUNT2,1.0(100)   ! measure for/next performance

SIGNIFICANCE 11

MAP1 MISC
    MAP2 USECS,B,4
    MAP2 I,F
    MAP2 N,B,4

    INPUT "Enter N to output every Nth value: ",N
    IF N = 0 THEN N = 1
    XCALL TIMES, 3, USECS                ! timer reading before the loop
    FOR I = 1 TO 100000
        IF (I MOD N) = 0 ? I             ! display only every Nth value
    NEXT I
    XCALL TIMES, 3, USECS                ! timer reading after the loop
    ? USECS;"us"                         ! elapsed microseconds for the loop
    end

Here's how that plays out on my Windows laptop (A-Shell/Windows), vs an ATE connection to a local Linux VM on the same laptop, vs an ATE connection to a remote Linux VM across the country via a VPN/WAN:

N      Local Win     Local Linux   Remote Linux
-----  -----------   -----------   ------------
1      11 seconds    12 seconds    8 seconds
100    142 ms        136 ms        27 ms
1000   40 ms         36 ms         24 ms

At first glance, the relative consistency of those values, and especially the remote column, is a bit surprising: how could the remote machine output 100,000 numbers to the screen faster than either of the local ones? The only answer I can come up with is that in the ATE case, the server is responsible for computing and outputting the values, whereas in the local Windows case, the same instance of A-Shell is also responsible for displaying them. Plus, the total amount of I/O is still pretty small, less than 1 MB total (100,000 values averaging about five digits each, plus a CR/LF pair, works out to roughly 700 KB), so we're not anywhere near pushing up against the limits of the network bandwidth, even for the remote WAN.

Another complication is that although the program reported 8 seconds for the remote connection (outputting each of the 100,000 numbers), the actual elapsed time as experienced by the user was over 12 seconds. In other words, due to buffering, the program finished 5 seconds before all of the buffered output had been delivered to my workstation. We should probably also take into consideration that a lot of that buffering may have been local to the server, and/or associated with the server's NICs. That's probably a fourth independent category of performance to be added to the three I listed at the top. And note that there are both hardware and OS-level components to network interface performance; each major Linux release seems to adjust the trade-offs between latency, bandwidth, and CPU efficiency in ways that may appear to favor overall server batch throughput over individual responsiveness.
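
One crude way to capture that user-perceived time, rather than just what the program itself reports, is to keep the timer running until the operator confirms that the last line has actually appeared on the screen. The variant below is only a sketch (the program name, prompt text, and dummy input variable are arbitrary, and the result unavoidably includes the operator's reaction time), but since the operator doesn't press ENTER until the last number is visible, the second timer reading can't be taken until all of the buffered output has actually reached the workstation.

program COUNT3,1.0(100)   ! COUNT2 variant: rough user-perceived elapsed time

SIGNIFICANCE 11

MAP1 MISC
    MAP2 USECS,B,4
    MAP2 I,F
    MAP2 DUMMY,F

    XCALL TIMES, 3, USECS                ! timer reading before the loop
    FOR I = 1 TO 100000
        ? I                              ! output every value, as in the N=1 case
    NEXT I
    INPUT "Press ENTER when the last number appears: ",DUMMY
    XCALL TIMES, 3, USECS                ! timer reading after the output is visible
    ? USECS;"us (includes operator reaction time)"
    end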

So even this brain-dead simple test becomes complicated to interpret! The exception is the last line (where we only output every 1000th value): in that case it becomes more of a pure CPU test, with the results indicating that the remote server is quite a bit faster in raw CPU performance than my local machine, even though the remote physical machine has several VMs running on it, and this particular VM has 100+ users running on it. Which just goes to show that even though a modern PC with an i7 CPU is incredibly fast, real server hardware (in this case with a Xeon processor) can still run circles around it. That's not to say that a PC-grade machine can't perform admirably as a server, but there is a real difference between PC-grade and server-grade hardware.
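
And for a terminal-free baseline, the same loop can be timed with the per-value output removed entirely, so that the only thing printed is the final timing line. Here's a sketch (again just a variation on COUNT2; the program name and the bit of arithmetic inside the loop are arbitrary, there only to give the CPU something to chew on):

program COUNT4,1.0(100)   ! COUNT2 variant: pure CPU timing, no per-value output

SIGNIFICANCE 11

MAP1 MISC
    MAP2 USECS,B,4
    MAP2 I,F
    MAP2 X,F

    XCALL TIMES, 3, USECS                ! timer reading before the loop
    FOR I = 1 TO 100000
        X = X + I * 1.0001               ! arbitrary arithmetic; no terminal output here
    NEXT I
    XCALL TIMES, 3, USECS                ! timer reading after the loop
    ? USECS;"us  (X=";X;")"              ! elapsed microseconds, plus X to confirm the loop ran
    end

Run on each candidate environment, that single number is about as close to a pure CPU comparison as this kind of quick-and-dirty test gets.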
