Monthly Archives: March 2014

How the CMIPS binary and source code work for CPU and RAM benchmarks

I’ve been asked to explain how CMIPS works. Here I explain the basic mechanics of the source code for the CPU and RAM speed benchmarks.

CMIPS draws on much of my knowledge of computers, architecture, virtualization and assembler to prevent hypervisors from distorting the results and providing fake data.

So in the end the program is a very precise one, focused on doing its job in the best way possible.

It uses a very small binary and very little RAM, so that the host hypervisor can neither improve nor worsen the raw results. (Some providers let tenants use more total RAM than the host server actually has, because often only part of the RAM assigned to the instances is really used; when the RAM is really used, the host swaps, just as any computer does.)

Basically, it measures CPU speed by doing simple calculations that involve the hardware registers, together with the speed of read and write access to memory.

Each memory write stores a single byte, and a different one each time, to minimize the effect of hardware and software cache optimizations.

The operations are the simplest possible, as close as they can be to basic assembler instructions.

Operations are:

  • Increase counter
  • Compare if greater
  • Assign var to 0
  • Read a byte from a position of memory (read a char)
  • Write a byte to a char variable

So there are no calls to the operating system that could be tweaked by the hypervisor, guest tools or containers.

Finally, cmips launches 100 threads (void *t_calculations(void *param)) at the same time to stress all the available cores and give a real benchmark of the independent CPU power of the public instance (some host servers isolate or share resources more than others, so cmips claims all the resources to get a real picture of the performance provided).
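The post does not show how main starts the threads, so the following is only a minimal sketch of one way to launch many POSIX threads at once and wait for all of them. It assumes pthread_create and pthread_join; worker and run_workers are hypothetical names, and the small counting loop is a stand-in for the real workload. The listing further down appears to track completion with the i_finished_threads counter instead.

```cpp
#include <pthread.h>
#include <vector>

// Hypothetical stand-in for the real workload: each thread just counts.
static void *worker(void *param)
{
    long *p_result = static_cast<long *>(param);
    long i_counter = 0;
    for (int i = 0; i < 1000; i++) {
        i_counter++;
    }
    *p_result = i_counter;   // each thread writes only to its own slot
    return NULL;
}

// Launches i_threads workers at once and waits for all of them.
// Returns how many threads were started and joined successfully.
int run_workers(int i_threads, std::vector<long> &results)
{
    results.assign(i_threads, 0);
    std::vector<pthread_t> threads(i_threads);
    int i_started = 0;

    for (int i = 0; i < i_threads; i++) {
        if (pthread_create(&threads[i], NULL, worker, &results[i]) != 0) {
            break;   // stop launching if the system refuses another thread
        }
        i_started++;
    }

    int i_joined = 0;
    for (int i = 0; i < i_started; i++) {
        if (pthread_join(threads[i], NULL) == 0) {
            i_joined++;
        }
    }
    return i_joined;
}
```

Calling run_workers(100, results) would mirror the 100 threads described above. Joining the threads avoids polling a shared counter to find out when they are done.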

When we benchmark an instance, we close the firewall so that incoming requests do not waste resources, and we run cmips several times, one after the other, on the same instance to make sure that the results are consistent and reliable.

NetBeans is the IDE used for the cmips source code. (For my Linux C++ GUI apps I use Qt Creator.)
This is the basic code in C++.

It uses these headers:

#include <cstdlib>
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <fstream>
#include <sstream>
#include <cstring>
#include <sys/time.h>
#include <ctime>

using namespace std;

We link the program with the standard POSIX thread library:

-o ${CND_DISTDIR}/${CND_CONF}/${CND_PLATFORM}/cmips -lpthread

Some global variables:

typedef unsigned long long timestamp_t;

char s_cmips[50] = "CMIPS V.1.0.3 by Carles Mateo www.carlesmateo.com";
char s_tmp_copy[1];

int i_max_threads = 100;
int i_finished_threads = 0;

int i_loop1 = 0;
int i_loop_max = 32000;
int i_loop2 = 0;
int i_loop2_max = 32000;
int i_loop3 = 0;
int i_loop3_max = 10;

 

The core is this thread function:

void *t_calculations(void *param)
{

    // current date/time based on current system
    time_t now = time(0);
    int i_counter = 0;
    int i_counter_char = 0;

    // convert now to string form
    char* dt_now = ctime(&now);

    printf("Starting thread ");
    cout << dt_now << "\n";
    for (i_loop1 = 0; i_loop1<i_loop_max; i_loop1++)
    {
        for (i_loop2 = 0; i_loop2<i_loop2_max; i_loop2++) 
        {
            for (i_loop3 = 0; i_loop3<i_loop3_max; i_loop3++) {
                // Increment test
                i_counter++;
                // If test and assignment
                if (i_counter > 32000) {
                    i_counter = 0;
                }
                // Char test
                s_tmp_copy[0] = s_cmips[i_counter_char];

                i_counter_char++;
                if (i_counter_char > 49) {
                    i_counter_char = 0;
                }

            }
        }   
    }

    time_t now_end = time(0);

    // convert now to string form
    char* dt_now_end = ctime(&now_end);

    printf("End thread at ");
    cout << dt_now_end << "\n";

    i_finished_threads++;

    return NULL;
}
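As listed, the loop counters (i_loop1, i_loop2, i_loop3), s_tmp_copy and i_finished_threads are globals, so all the threads share them without synchronization. The following is a minimal sketch of a variant in which every thread keeps its own counters and the finished counter is atomic. It is not the author's code and it would change the workload that CMIPS measures. LoopLimits, t_calculations_local and c_tmp_copy are hypothetical names introduced here, and the printing of start and end times is left out.

```cpp
#include <atomic>
#include <cstddef>

// Same 49-character string as in the post (plus the terminating NUL).
static const char s_cmips[50] = "CMIPS V.1.0.3 by Carles Mateo www.carlesmateo.com";

// Hypothetical parameter block, so the loop limits are passed in, not global.
struct LoopLimits {
    int i_loop_max;
    int i_loop2_max;
    int i_loop3_max;
};

// Atomic, so concurrent increments from many threads are not lost.
std::atomic<int> i_finished_threads(0);

void *t_calculations_local(void *param)
{
    const LoopLimits *limits = static_cast<const LoopLimits *>(param);
    int i_counter = 0;
    int i_counter_char = 0;
    volatile char c_tmp_copy = 0;   // volatile so the byte write is not optimized away

    // All loop counters are local, so each thread runs the full loop nest itself.
    for (int i_loop1 = 0; i_loop1 < limits->i_loop_max; i_loop1++) {
        for (int i_loop2 = 0; i_loop2 < limits->i_loop2_max; i_loop2++) {
            for (int i_loop3 = 0; i_loop3 < limits->i_loop3_max; i_loop3++) {
                // Increment test
                i_counter++;
                // If test and assignment
                if (i_counter > 32000) {
                    i_counter = 0;
                }
                // Char test: read one byte, write one byte
                c_tmp_copy = s_cmips[i_counter_char];

                i_counter_char++;
                if (i_counter_char > 49) {
                    i_counter_char = 0;
                }
            }
        }
    }

    i_finished_threads++;
    return NULL;
}
```

With the limits from the post (32000, 32000 and 10), each thread would run the complete loop nest on its own, so execution times would not be comparable with the published CMIPS results.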

The timestamp is calculated like this:

static timestamp_t get_timestamp ()
{
  struct timeval now;
  gettimeofday (&now, NULL);
  return  now.tv_usec + (timestamp_t)now.tv_sec * 1000000;
}
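gettimeofday reads the wall clock, which can jump if the system time is adjusted while the benchmark is running. As an illustration only, and not what CMIPS does, here is a minimal sketch of the same microsecond timestamp taken from a monotonic clock, assuming a C++11 compiler and std::chrono::steady_clock. The name get_timestamp_monotonic is hypothetical.

```cpp
#include <chrono>

typedef unsigned long long timestamp_t;

// Microseconds read from a monotonic clock.
// Only the difference between two calls is meaningful, not the absolute value.
static timestamp_t get_timestamp_monotonic()
{
    using namespace std::chrono;
    return static_cast<timestamp_t>(
        duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count());
}
```

Two calls, one before and one after the work, give the elapsed microseconds by subtraction, in the same way as with get_timestamp.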

After all the threads finish, main calculates the score:

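    // Process: t0 (taken earlier, not shown here) is the starting timestamp
    timestamp_t t1 = get_timestamp();

    // Elapsed time in seconds (timestamps are in microseconds)
    double secs = (t1 - t0) / 1000000.0L;

    // Score: inversely proportional to the elapsed time, truncated to an integer
    int cmips = (1 / secs) * 1000000;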
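To make the arithmetic concrete, here is a small sketch that wraps the two expressions above in functions. The names cmips_score and elapsed_secs are hypothetical; the formulas are the ones from the listing.

```cpp
// Elapsed seconds between two microsecond timestamps.
double elapsed_secs(unsigned long long t0, unsigned long long t1)
{
    return (t1 - t0) / 1000000.0;
}

// Same arithmetic as in the post: (1 / seconds) * 1,000,000, truncated to int.
int cmips_score(double secs)
{
    return static_cast<int>((1 / secs) * 1000000);
}
```

For example, an execution time of 20,036.7 seconds gives 1,000,000 / 20,036.7 = 49.9, truncated to 49 CMIPS, and 60.21 seconds gives 16,608 CMIPS, which matches the t1.micro and cc2.8xlarge rows in the detailed results on this page. Halving the execution time doubles the score.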

 

CMIPS benchmarks with new Amazon c3.8xlarge and prices updated

The new Amazon c3.8xlarge, based on the Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz and SSD, has been added. It is the most powerful instance tested to date, beating CloudSigma 37 Cores / 80 GHz by a little (although CloudSigma is much cheaper).

The nice thing about c3.8xlarge compared to cc2.8xlarge is that the former is not a Cluster instance like the latter, so you can use a standard Linux distribution rather than the special cluster distributions.

As a chart (more CMIPS means more CPU+RAM speed, so more is better):

[Chart: cmips-net-2014-03-10-cmips-score-benchmarks]

Prices for Amazon have been updated; only m3.xlarge (SSD) changed, going from $0.5 USD/hour to $0.45 USD/hour.

http://aws.amazon.com/ec2/pricing/

Prices for Azure have been reviewed but have not changed since the last update in 2013.

http://www.windowsazure.com/en-us/pricing/calculator/

[Chart: cmips-net-2014-03-11-cups-score-benchmarks]

Detailed results:

| Type of Service | Provider | Name of the product | Codename | Zone | Processor | Processor GHz | Cores (from htop) | RAM (GB) | OS tested | CMIPS | Execution time (seconds) | USD/hour | USD/month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cloud | Amazon | T1 Micro | t1.micro | US East | Intel Xeon E5-2650 | 2 | 1 | 0.613 | Ubuntu Server 13.04 64 bits | 49 | 20,036.7 | $0.02 | $14.40 |
| Cloud | Amazon | M1 Small | m1.small | US East | Intel Xeon E5-2650 | 2 | 1 | 1.6 | Ubuntu Server 13.04 64 bits | 203 | 4,909.89 | $0.06 | $43.20 |
| Cloud | GoGrid | Extra Small (512 MB) | Extra Small | US-East-1 | Intel Xeon E5520 | 2.27 | 1 | 0.5 | Ubuntu Server 12.04 64 bits | 441 | 2,265.14 | $0.04 | $18.13 |
| Physical (laptop) | | | | | Intel SU4100 | 1.4 | 2 | 4 | Ubuntu Desktop 12.04 64 bits | 460 | 2,170.32 | | |
| Cloud | CloudSigma | 1 Core / 1 Ghz | 1 Core / 1 Ghz | Zurich (Europe) | Amd Opteron 6380 | 2.5 Ghz to 3.4 Ghz with Turbo | 1 | 1 | Ubuntu Server 12.04.3 64 bits | 565 to 440 | 1,800 | $0.04475 | $32.22 |
| Cloud | Amazon | M1 Large | m1.large | US East | Intel Xeon E5-2650 | 2 | 2 | 7.5 | Ubuntu 13.04 64 bits | 817 | 1,223.67 | $0.24 | $172.80 |
| Cloud | Linode | 1x priority (smallest) | 1x priority | London | Intel Xeon E5-2670 | 2.6 | 8 | 1 | Ubuntu Server 12.04 64 bits | 1,427 | 700.348 | n/a | $20 |
| Cloud | Amazon | M1 Extra Large | m1.xlarge | US East | Intel Xeon E5-2650 | 2 | 4 | 15 | Ubuntu 13.04 64 bits | 1,635 | 606.6 | $0.48 | $345.60 |
| Cloud | LunaCloud | 8 Core 1.5 Ghz, 512 MB RAM, 10 GB SSD | | CH | | 1.5 | 8 | 0.5 | Ubuntu 13.10 64 bits | 1,859 | 537.64 | $0.0187 | $58.87 |
| Cloud | CloudSigma | 3 Core / 1,667 Ghz each / 5 Ghz Total | 3 Core / 1,667 Ghz each / 5 Ghz Total | Zurich (Europe) | Amd Opteron 6380 | 2.5 Ghz to 3.4 Ghz with Turbo | 3 | 1 | Ubuntu Server 13.10 64 bits | 1928 to 1675 | 518.64 | $0.1875 | $135 |
| Cloud | Amazon | M3 Extra Large | m3.xlarge | US East | Intel Xeon E5-2670 | 2.6 | 4 | 15 | Ubuntu 13.04 64 bits | 2,065 | 484.1 | $0.45 | $324 |
| Cloud | Linode | 2x priority | 2x priority | Dallas, Texas, US | Intel Xeon E5-2670 | 2.6 | | 2 | Ubuntu Server 12.04 64 bits | 2,556 | 391.19 | n/a | $40 |
| Cloud | GoGrid | Extra Large (8GB) | Extra Large | US-East-1 | Intel Xeon E5520 | 2.27 | 8 | 8 | Ubuntu Server 12.04 64 bits | 2,965 | 327.226 | $0.64 | $290 |
| Cloud | Amazon | C1 High CPU Extra Large | c1.xlarge | US East | Intel Xeon E5506 | 2.13 | 8 | 7 | Ubuntu Server 13.04 64 bits | 3,101 | 322.39 | $0.58 | $417.60 |
| Dedicated | OVH | Server EG 24G | EG 24G | France | Intel Xeon W3530 | 2.8 | 8 | 24 | Ubuntu Server 13.04 64 bits | 3,881 | 257.01 | n/a | $99 |
| Cloud | Amazon | M2 High Memory Quadruple Extra Large | m2.4xlarge | US East | Intel Xeon E5-2665 | 2.4 | 8 | 68.4 | Ubuntu Server 13.04 64 bits | 4,281 | 233.545 | $1.64 | $1,180.80 |
| Cloud | Rackspace | RackSpace First Generation | 30 GB RAM – 8 Cores – 1200 GB | US | Quad-Core AMD Opteron(tm) Processor 2374 HE | 2.2 | 8 | 30 | Ubuntu Server 12.04 64 bits | 4,539 | 220.89 | $1.98 | $1,425.60 |
| Physical (desktop workstation) | | | | | Intel Core i7-4770S | 3.1 (to 3.9 with turbo) | 8 | 32 | Ubuntu Desktop 13.04 64 bits | 5,842 | 171.56 | | |
| Cloud | Digital Ocean | Digital Ocean | 48GB RAM – 16 Cores – 480 GB SSD | Amsterdam 1 | QEMU Virtual CPU version 1.0 | | 16 | 48 | Ubuntu Server 13.04 64 bits | 6,172 | 161.996 | $0.705 | $480 |
| Cloud | Amazon | High I/O Quadruple Extra Large | hi1.4xlarge | US East | Intel Xeon E5620 | 2.4 | 16 | 60.5 | Ubuntu Server 13.04 64 bits | 6,263 | 159.65 | $3.1 | $2,232 |
| Cloud | Digital Ocean | Digital Ocean | 64GB RAM – 20 Cores – 640 GB SSD | Amsterdam 1 | QEMU Virtual CPU version 1.0 | | 20 | 64 | Ubuntu Server 13.04 64 bits | 8,116 | 123.2 | $0.941 | $640 |
| Cloud | Digital Ocean | Digital Ocean | 96GB RAM – 24 Cores – 960 GB SSD | New York 2 | QEMU Virtual CPU version 1.0 | | 24 | 96 | Ubuntu Server 13.04 64 bits | 9,733 | 102.743 | $1.411 | $960 |
| Cloud | GoGrid | XXX Large (24GB) | XXX Large | US-East-1 | Intel Xeon X5650 | 2.67 | 32 | 24 | Ubuntu Server 12.04 64 bits | 10,037 | 99.6226 | $1.92 | $870 |
| Cloud | CloudSigma | 24 Core / 52 Ghz Total | 24 Core / 52 Ghz Total | Zurich (Europe) | Amd Opteron 6380 | 2.5 Ghz to 3.4 Ghz with Turbo | 24 | 1 | Ubuntu Server 13.10 64 bits | 10979 to 8530 | 98 | $0.9975 | $718.20 |
| Cloud | Amazon | Memory Optimized CR1 Cluster 8xlarge | cr1.8xlarge | US East | Intel Xeon E5-2670 | 2.6 | 32 | 244 | Ubuntu Server 13.04 64 bits for HVM instances (Cluster) | 16,468 | 60.721 | $3.5 | $2,520 |
| Cloud | Amazon | Compute Optimized CC2 Cluster 8xlarge | cc2.8xlarge | US East | Intel Xeon E5-2670 | 2.6 | 32 | 60.5 | Ubuntu Server 13.04 64 bits for HVM instances (Cluster) | 16,608 | 60.21 | $2.4 | $1,728 |
| Cloud | CloudSigma | 37 Core / 2.16 Ghz each / 80 Ghz Total | 37 Core / 2.16 Ghz each / 80 Ghz Total | Zurich (Europe) | Amd Opteron 6380 | 2.5 Ghz to 3.4 Ghz with Turbo | 37 | 1 | Ubuntu Server 13.10 64 bits | 17136 to 8539 | 58 | $1.5195 | $1,094.10 |
| Cloud | Amazon | Compute Optimized C3 8xlarge | c3.8xlarge | US East | Intel Xeon E5-2680 | 2.8 | 32 | 60 | Ubuntu Server 13.10 64 bits for HVM instances (Cluster) | 17,476 | 57.21 | $2.4 | $1,728 |