Tuesday, April 26, 2016

Testing Angular Directives with Karma, Mocha, and Phantom

All Code on Github

The entire code from this article is on github.

Introduction

Angular directives are part of the web component family of client-side tools that allow quick additions to web pages. The idea is that with a very simple and short addition to an html page, complex functionality and UX are available.

Imagine a user login form with the traditional validation contained in an html template and Angular controller. The calling web page's html could just include <login></login> to bring in that rich validation and display. Directives are wonderful for encapsulating the complexity away from the containing HTML.

Testing the directive is trickier. Granted, the controller isn't difficult to test, but the compiled html and scope are not so easy.

Perhaps it is enough to test the controller, and depend on the browser and the Angular library to manage the rest. You could definitely make that case. Or perhaps you leave the compiled directive testing to the end to end (e2e) tests. That is also fair. If neither of these solutions works for you, this article will explain 3 ways to test the compiled Angular directive.

Template
When you build the directive, you can place static (hopefully short) html in the definition of the directive. Notice that {{data}} will be replaced with the $scope.data value from the directive.
exports.user = function() {
  return {
    controller: 'userController',
    template: '<div class="user">{{data}}</div>'
    };
};
TemplateUrl
If you have more html or want to separate the html from the Angular directive, templateUrl is the way to go.
exports.menu = function() {
  return {
    controller: 'menuController',
    templateUrl: '/templates/menu.html'
  };
};
The html contained in menu.html is below:
<div class="menu">
    {{data}}
</div>

Compilation

Angular compiles the controller and the template in order to replace the html tag. This compilation is necessary to test the directive.
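
Condensed from the full tests later in this article, the core of that compilation step looks roughly like this:

// 'scope' comes from $rootScope.$new() in the test setup
var element = angular.element('<user></user>');
var compiledElement = $compile(element)(scope);
scope.$digest(); // resolves {{data}} against the scope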

Prerequisites

This article assumes you have some understanding of javascript, Angular, and unit testing. I'll focus on how the test is set up and how it compiles the directive so that it can be tested. You need node and npm to install packages and run scripts. The other dependencies are listed in the package.json files.

Karma, Mocha, Chai, Phantom Test System

Since Angular is a client-side framework, Karma acts as the web server and manages the web browser for html and javascript. Instead of running a real browser, I chose Phantom so everything can run from the command line. Mocha and Chai are the test framework and assertion library.

While the Angular code is browserified, that doesn’t have any impact on how the directive is tested.
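
For reference, the relevant pieces of a karma.conf.js for this stack might look like the sketch below. The plugin names (karma-mocha, karma-chai, karma-phantomjs-launcher) are assumptions about the setup rather than an excerpt from the project, so check the repository's karma.conf.js files for the real configuration.

// karma.conf.js (sketch) - assumes karma-mocha, karma-chai, and karma-phantomjs-launcher are installed
module.exports = function(config) {
  config.set({
    frameworks: ['mocha', 'chai'],  // test framework and assert library
    browsers: ['PhantomJS'],        // headless browser, runs from the command line
    singleRun: true                 // run the suite once and exit
  });
};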

The Source Code and Test

Each of the 3 examples is just about the same: a very simple controller that has $scope.data, an html template that uses {{data}} from the scope, and a test that compiles the directive and validates that the {{data}} syntax was replaced with the $scope.data value.

Each project has the same directory layout:

/client - angular files to be browserified into /public/app.js
/public - index.html, app.js, html templates
/test - mocha/chai test file
karma.conf.js - karma configuration file
gulpfile.js - browserify configuration
package.json - list of dependencies and scripts to build and run the test (see the sketch below)
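
As a rough sketch, the package.json scripts tie the build and test steps together. The script contents below are assumptions for illustration; the real ones are in each project's package.json.

{
  "scripts": {
    "build": "gulp build",
    "test": "gulp build && karma start karma.conf.js"
  }
}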

Testing the static template


The first example tests a template using static html. 

The controller:
exports.userController = function ($scope) {
    $scope.data = "user";
};
The directive: 
exports.user = function() {
  return {
    controller: 'userController',
    template: '<div class="user">{{data}}</div>'
    };
};
The calling html page
<html ng-app="app">
  <head>
    <script type="text/javascript"
      src="https://ajax.googleapis.com/ajax/libs/angularjs/1.4.0/angular.js">
    </script>
    <script type="text/javascript"
      src="https://ajax.googleapis.com/ajax/libs/angularjs/1.4.0/angular-route.js">
    </script>
    <script type="text/javascript" src="/app.js"></script>
  </head>
  <body>
      <div>
          <user></user>
      </div> 
  </body>
</html>
The final html page will replace <user></user> with the compiled html from the template, with the string "user" replacing {{data}} in the static template html.

The mocha test
describe('userDirective', function() {
  var scope;

  beforeEach(angular.mock.module("app"));
  
  beforeEach(angular.mock.module('ngMockE2E'));

  beforeEach(inject(function ($rootScope) {
      scope = $rootScope.$new();
    }));
    
  it('should exist', inject(function ($compile) {
    element = angular.element('<user></user>');
    compiledElement = $compile(element)(scope);
 
    scope.$digest();

    var dataFromHtml = compiledElement.find('.user').text().trim();
    var dataFromScope = scope.data;
    
    console.log(dataFromHtml + " == " + dataFromScope);

    expect(dataFromHtml).to.equal(dataFromScope);
  }));
});
As part of the test setup, in the beforeEach functions, the test brings in the angular app, brings in the mock library, and sets the scope. Inside the test (the 'it' function), angular.element defines the directive's html element, and $compile compiles the element against the scope. The compiled html text value is tested against the scope value, which we expect to be the same.

The overall test concept is the same for each of the examples: get the controller and template, compile it, check it. The details differ in exactly how this is done in the 2 remaining examples.

The 2 TemplateUrl Examples

The first example above relied on the directive definition to load the template html. The next 2 examples use the templateUrl property, which stores the html in a separate file, so that method won't work. Both of the next 2 examples use the templateCache to load the template, but each does it in a different way.

The first example loads the template and tests it as though the templateCache isn't used. This is a good approach for apps that don't generally use the templateCache and for developers who don't want to change the app code in order to test the directive. The templateCache is only used as a convenience for testing.

The second example alters the app code by loading the template into the templateCache and browserifying the app with the 'templates' dependency code. This is a good way to learn about the templateCache and test it.

Testing TemplateUrl – least intrusive method

The second example stores the html in a separate file instead of in the directive function. As a result, the test (and the karma config file) needs to bring the separate file in before compiling the element with the scope. The controller:
exports.menuController = function ($scope) {
    $scope.data = "menu";
};
The directive:
exports.menu = function() {
  return {
    controller: 'menuController',
    templateUrl: '/templates/menu.html'
  };
};
The template:
<div class="menu">
    {{data}}
</div>
The calling html page differs only in that the html for the directive is
<menu></menu>
karma-utils.js
function httpGetSync(filePath) {
  var xhr = new XMLHttpRequest();
  
  var finalPath = filePath;
  
  //console.log("finalPath=" + finalPath);
  
  xhr.open("GET", finalPath, false);
  xhr.send();
  return xhr.responseText;
}

function preloadTemplate(path) {
  return inject(function ($templateCache) {
    var response = httpGetSync(path);
    //console.log(response);
    $templateCache.put(path, response);
  });
}
This file, along with the template files, is brought in via the files property of karma.conf.js:
// list of files / patterns to load in the browser
files: [
    'http://code.jquery.com/jquery-1.11.3.js',
    'https://ajax.googleapis.com/ajax/libs/angularjs/1.4.0/angular.js',
   
    // For ngMockE2E
    'https://ajax.googleapis.com/ajax/libs/angularjs/1.4.0/angular-mocks.js',
    'test/karma-utils.js',
    'test/test.directive.*.js',
    'public/app.js',
    'public/templates/*.html'
],
The mocha test
describe('menuDirective', function() {
  var mockScope;
  var compileService;
  var template;

  beforeEach(angular.mock.module("app"));
  
  beforeEach(angular.mock.module('ngMockE2E'));

  beforeEach(preloadTemplate('/templates/menu.html'));

  beforeEach(inject(function ($rootScope) {
      scope = $rootScope.$new();
    }));
    
  it('should exist', inject(function ($compile) {
    element = angular.element('<menu></menu>');
    compiledElement = $compile(element)(scope);
 
    scope.$digest();

    var dataFromHtml = compiledElement.find('.menu').text().trim();
    var dataFromScope = scope.data;
    
    console.log(dataFromHtml + " == " + dataFromScope);

    expect(dataFromHtml).to.equal(dataFromScope);
  }));
});
As part of the test setup, in the beforeEach functions, the test brings in the angular app, brings in the mock library, brings in the html template, and sets the scope. The template is brought in via a helper function called preloadTemplate, in another file in the test folder. Inside the test (the 'it' function), angular.element defines the directive's html element, and $compile compiles the element against the scope. The compiled html text value is tested against the scope value, which we expect to be the same. This is the line of the test file that deals with the templateCache:
beforeEach(preloadTemplate('/templates/menu.html')); 
The app and test code do not use the templateCache in any other way. The test itself is almost identical to the first test that uses the static html in the directive.

Testing TemplateUrl – most intrusive method

In this last example, the app.js code in the /client directory is altered to pull in a new 'templates' dependency module. The templates module is built in the gulpfile.js by grabbing all the html files in the /public/templates directory and wrapping them in javascript that adds the templates to the templateCache. The test explicitly pulls the template from the templateCache before compilation.

gulpfile.js – to build templates dependency into /client/templates.js
var gulp = require('gulp');
var browserify = require('gulp-browserify');

var angularTemplateCache = require('gulp-angular-templatecache');

var concat = require('gulp-concat');
var addStream = require('add-stream');

gulp.task('browserify', function() {
  return gulp.
    src('./client/app.js').
    //pipe(addStream('./client/templates.js')).
    //pipe(concat('app.js')).
    pipe(browserify()).
    pipe(gulp.dest('./public'));
});

gulp.task('templates', function () {
  return gulp.src('./public/templates/*.html')
    .pipe(angularTemplateCache({module:'templates', root: '/templates/'}))
    .pipe(gulp.dest('./client'));
});

gulp.task('build',['templates', 'browserify']);
templates.js – built by gulpfile.js
angular.module("templates").run([
    "$templateCache",  function($templateCache) {
     $templateCache.put("/templates/system.html","<div class=\"menu\">\n    {{data}}\n</div>");
    }
]);
app.js – defines template module and adds it to app list of dependencies
var controllers = require('./controllers');
var directives = require('./directives');
var _ = require('underscore');

// this never changes
angular.module('templates', []);

// templates added as dependency
var components = angular.module('app', ['ng','templates']);

_.each(controllers, function(controller, name)
{ components.controller(name, controller); });

_.each(directives, function(directive, name)
{ components.directive(name, directive); });

require('./templates')
The mocha test
describe('menuDirective', function() {
  var mockScope;
  var compileService;
  var template;

  beforeEach(angular.mock.module("app"));
  
  beforeEach(angular.mock.module('ngMockE2E'));

  beforeEach(inject(function ($rootScope) {
      scope = $rootScope.$new();
    }));
    
  it('should exist', inject(function ($compile, $templateCache) {
    element = angular.element('<system></system>');
    compiledElement = $compile(element)(scope); 

    // APP - app.js used templates dependency which loads template
    // into templateCache
    // TEST - this test pulls template from templateCache 
    template = $templateCache.get('/templates/system.html'); 
 
    scope.$digest();

    var dataFromHtml = compiledElement.find('.menu').text().trim();
    var dataFromScope = scope.data;
    
    console.log(dataFromHtml + " == " + dataFromScope);

    expect(dataFromHtml).to.equal(dataFromScope);
  }));
});

The Test Results

The test results for each of the 3 projects are checked in to the github project. Karma ran with debug turned on so that the http and file requests could be validated. When you look at the testResult.log files (1 in each of the 3 subdirectories), you want to make sure that the http and file requests that karma made were actually successful. The lines with 'Fetching', 'Requesting', and '(cached)' are the important lines before the test runs.

26 04 2016 11:42:09.318:DEBUG [middleware:source-files]: Fetching /home/dina/repos/AngularDirectiveKarma/directiveTemplate/test/test.directive.template.js
26 04 2016 11:42:09.318:DEBUG [middleware:source-files]: Requesting /base/public/app.js?6da99f7db89b4401f7fc5df6e04644e14cbed1f7 /
26 04 2016 11:42:09.318:DEBUG [middleware:source-files]: Fetching /home/dina/repos/AngularDirectiveKarma/directiveTemplate/public/app.js
26 04 2016 11:42:09.319:DEBUG [web-server]: serving (cached): /home/dina/repos/AngularDirectiveKarma/directiveTemplate/node_modules/mocha/mocha.js
26 04 2016 11:42:09.329:DEBUG [web-server]: serving (cached): /home/dina/repos/AngularDirectiveKarma/directiveTemplate/node_modules/karma-mocha/lib/adapter.js
26 04 2016 11:42:09.331:DEBUG [web-server]: serving (cached): /home/dina/repos/AngularDirectiveKarma/directiveTemplate/test/test.directive.template.js
26 04 2016 11:42:09.333:DEBUG [web-server]: serving (cached): /home/dina/repos/AngularDirectiveKarma/directiveTemplate/public/app.js

During the test, you should see that the $scope.data value (different in each of the 3 tests) is equated to the compiled html as expected, for example, "user == user":
LOG: 'user == user'
And finally, that the test passed:
Executed 1 of 1 SUCCESS

Friday, March 4, 2016

Capturing a Stripe Credit Card Charge

In this article, I'll show you the JSON objects and Angular/Node code to capture a credit card charge with the Stripe service. The code for this example project is on GitHub. It is a working project, so the code may not be exactly like this article by the time you find it.

Introduction
Capturing credit card information for products, services, and subscriptions is easy with many tools provided by credit card processing companies. Stripe provides an html button that pops up a form to collect credit card information. It is incredibly simple and straightforward.



You can control the form to collect more data. You don't need to know how to program it, and it works. Yeah! This article doesn't cover the easy route of button and pop up because Stripe did a great job of that on their website.

If you would prefer to control the credit card experience from beginning to end, you may choose to build your own form and process the data you collect to meet your own business needs. If that is the case read on.

The Credit Card Token/Charge Experience
A credit card number should never make it to your server – where you are liable for fraud or theft. Instead, the client-side code sends the credit card information to the processor, in this case Stripe, and Stripe sends back a token to the client-side code.

The token isn't a charge to the card but more a promise to charge the card in the future. For the remainder of the customer's experience, you do not use the credit card information, only the token. The token travels with the rest of the customer information (name, address, items purchased) to the server. The server will make the actual charge and receive the resulting information from Stripe.

With Stripe, all you really need is a way to make an https request to their API server, passing JSON, and then receive the JSON response. That is true for both the token request and the charge request. Stripe responds with JSON.

JSON is independent of the Technology
The rest of this article is broken into two different sections. The first section will review the JSON objects which have some requirements and a lot of optional fields. You can use any technology you want to make these, including curl. The second section will cover a specific code base of Angular on the client and Node/Express on the server. The code is very simple so there is little or no styling, validation, or error management.

The Stripe Key
Stripe needs your Publishable key to create the token. You can find this on the Stripe dashboard in your account settings.



If you are using curl, you can pass the key along with the token request. If you are passing it from code, it will need to be set before the token request. The curl example is:

curl https://api.stripe.com/v1/tokens \
  -u sk_test_WqKypPUzUwdHSgZLT3zWZmBq: \
  -d card[number]=4242424242424242 \
  -d card[exp_month]=12 \
  -d card[exp_year]=2017 \
  -d card[cvc]=123

The key is the value after the -u param: sk_test_… (note that curl authenticates with your secret test key, while client-side code uses the publishable key).

Notice that only the key and card information are passed in the above curl. You can and should capture and pass the billing address. This allows you to see the address in the Stripe Dashboard when the charge is finally made.



Notice that you only see the last 4 digits of the credit card.

JSON to create a Stripe Token
The JSON object to request a token contains the credit card information. It should also contain the billing information for the customer.  A full list of card key/value pairs is listed in the Stripe docs.
client stripe token request =  
{   
   "number":"4242424242424242", 
   "cvc":"123", 
   "exp_month":"11", 
   "exp_year":"2016", 
   "name":"Barbara Jones", 
   "address_city":"Seattle", 
   "address_line1":"5678 Nine Street", 
   "address_line2":"Box 3", 
   "address_country":"USA", 
   "address_state":"WA", 
   "address_zip":"98105" 
}

The response will include a status code; success is 200, with a json object of data including the token.

client stripe token response.response = {   
   "id":"tok_17jylQJklCPSOV9aLkiy5879", 
   "object":"token", 
   "card":{   
      "id":"card_17jylQJklCPSOV9ahtXhPVB8", 
      "object":"card", 
      "address_city":"Seattle", 
      "address_country":"USA", 
      "address_line1":"5678 Nine Street", 
      "address_line1_check":"unchecked", 
      "address_line2":"Box 3", 
      "address_state":"WA", 
      "address_zip":"98105", 
      "address_zip_check":"unchecked", 
      "brand":"Visa", 
      "country":"US", 
      "cvc_check":"unchecked", 
      "dynamic_last4":null, 
      "exp_month":11, 
      "exp_year":2016, 
      "funding":"credit", 
      "last4":"4242", 
      "metadata":{     
      }, 
      "name":"Barbara Jones", 
      "tokenization_method":null 
   }, 
   "client_ip":"73.11.000.147", 
   "created":1456782072, 
   "livemode":false, 
   "type":"card", 
   "used":false 
}

For the rest of the credit card transaction, use the token only. You will need to pass it when you charge the customer's credit card.

JSON for a successful Stripe Charge
Now that you have the token, you can create a JSON object to represent the credit card charge.

client stripe token response.status = 200
server stripe charge request object = {    
   "amount":1000,  
   "currency":"usd",  
   "source":"tok_17jylQJklCPSOV9aLkiy5879",  
   "description":"Donation for XYZ",  
   "metadata":{    
      "ShipTo":"Bob Smith",  
      "BillTo":"Barbara Jones"  
   },  
   "receipt_email":"bob@company.com",  
   "statement_descriptor":"MyStripeStore",  
   "shipping":{    
      "address":{    
         "city":"Seattle",  
         "country":"USA",  
         "line1":"1234 Five Lane",  
         "line2":"Floor 2",  
         "postal_code":"98101",  
         "state":"WA"  
      },  
      "name":"Bob Smith",  
      "phone":""  
   }  
} 

Make sure the statement_descriptor has meaningful information so the customer can tell the charge was valid; it shows up on the customer's bill. If you have information important to the transaction that Stripe doesn't collect, put those values in the metadata key. You can retrieve this information from Stripe to reconcile or fulfill the transaction on your end. Think of it as a backup: if your system goes down, Stripe still has enough information for you to rebuild the transaction. The amount includes dollars and cents but no decimal point, so "1000" is ten dollars, $10.00.
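
As a quick illustration of the amount format, a small helper like this (a sketch, not part of the example project) converts a dollar amount into the integer cents Stripe expects:

// Convert a dollar amount (e.g. 10.00) into the integer cents Stripe expects (e.g. 1000)
function toStripeAmount(dollars) {
  return Math.round(dollars * 100);
}

console.log(toStripeAmount(10.00)); // 1000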

If you are new to Stripe, you may not be using their advanced features but if you collect the data now, converting to and seeding some of the advanced feature objects will be easy.

A successful charge response returns a null error object and a result object.

server stripe charge response.charge = {    
   "id":"ch_17jylQJklCPSOV9aHyof0XV8",  
   "object":"charge",  
   "amount":1000,  
   "amount_refunded":0,  
   "application_fee":null,  
   "balance_transaction":"txn_17jylQJklCPSOV9afi1bpfz5",  
   "captured":true,  
   "created":1456782072,  
   "currency":"usd",  
   "customer":null,  
   "description":"Donation for XYZ",  
   "destination":null,  
   "dispute":null,  
   "failure_code":null,  
   "failure_message":null,  
   "fraud_details":{    
   },  
   "invoice":null,  
   "livemode":false,  
   "metadata":{    
      "ShipTo":"Bob Smith",  
      "BillTo":"Barbara Jones"  
   },  
   "order":null,  
   "paid":true,  
   "receipt_email":"bob@company.com",  
   "receipt_number":null,  
   "refunded":false,  
   "refunds":{    
      "object":"list",  
      "data":[    
      ],  
      "has_more":false,  
      "total_count":0,  
      "url":"/v1/charges/ch_17jylQJklCPSOV9aHyof0XV8/refunds"  
   },  
   "shipping":{    
      "address":{    
         "city":"Seattle",  
         "country":"USA",  
         "line1":"1234 Five Lane",  
         "line2":"Floor 2",  
         "postal_code":"98101",  
         "state":"WA"  
      },  
      "carrier":null,  
      "name":"Bob Smith",  
      "phone":"",  
      "tracking_number":null  
   },  
   "source":{    
      "id":"card_17jylQJklCPSOV9ahtXhPVB8",  
      "object":"card",  
      "address_city":"Seattle",  
      "address_country":"USA",  
      "address_line1":"5678 Nine Street",  
      "address_line1_check":"pass",  
      "address_line2":"Box 3",  
      "address_state":"WA",  
      "address_zip":"98105",  
      "address_zip_check":"pass",  
      "brand":"Visa",  
      "country":"US",  
      "customer":null,  
      "cvc_check":"pass",  
      "dynamic_last4":null,  
      "exp_month":11,  
      "exp_year":2016,  
      "fingerprint":"uOlT1SgxEykd9grd",  
      "funding":"credit",  
      "last4":"4242",  
      "metadata":{    
      },  
      "name":"Barbara Jones",  
      "tokenization_method":null  
   },  
   "source_transfer":null,  
   "statement_descriptor":"MyStripeStore",  
   "status":"succeeded"  
} 

The two items you want to look for are status and paid.

Stripe keeps a log of your transactions as JSON objects including the request and response of both the token and the charge. You may want to store all the information on your server, but if you don't, you can get it from Stripe when you need to.
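
For example, if you only keep the charge id on your side, you can pull the full charge back from Stripe later with the Node library. This is a minimal sketch; the secret key placeholder is an assumption, and the charge id is the one from the response above.

var stripe = require('stripe')('sk_test_YOUR_SECRET_KEY');

// Retrieve a previously created charge by its id
stripe.charges.retrieve('ch_17jylQJklCPSOV9aHyof0XV8', function(err, charge) {
  if (err) return console.log(err);
  console.log(charge.status, charge.paid, charge.metadata);
});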

JSON for a failed Stripe Charge
If your customer's charge fails, the JSON for the charge is a bit different. You will want to present the customer with a meaningful error message to help them correct the problem on their end and complete the transaction successfully.

The charge object will be null and the status object will have the error message.

server stripe charge response.status = {    
   "type":"StripeCardError",  
   "stack":"Error: Your card was declined.\n    at Error._Error (/Users/dfberry/repos/stripe-express-angular/node_modules/stripe/lib/Error.js:12:17)\n    at Error.Constructor (/Users/dfberry/repos/stripe-express-angular/node_modules/stripe/lib/utils.js:105:13)\n    at Error.Constructor (/Users/dfberry/repos/stripe-express-angular/node_modules/stripe/lib/utils.js:105:13)\n    at Function.StripeError.generate (/Users/dfberry/repos/stripe-express-angular/node_modules/stripe/lib/Error.js:54:14)\n    at IncomingMessage.<anonymous> (/Users/dfberry/repos/stripe-express-angular/node_modules/stripe/lib/StripeResource.js:138:39)\n    at emitNone (events.js:72:20)\n    at IncomingMessage.emit (events.js:166:7)\n    at endReadableNT (_stream_readable.js:905:12)\n    at nextTickCallbackWith2Args (node.js:455:9)\n    at process._tickCallback (node.js:369:17)",  
   "rawType":"card_error",  
   "code":"card_declined",  
   "message":"Your card was declined.",  
   "raw":{    
      "message":"Your card was declined.",  
      "type":"card_error",  
      "code":"card_declined",  
      "charge":"ch_17jz4rJklCPSOV9aDPh2eooP",  
      "statusCode":402,  
      "requestId":"req_7zz28B9jlOpH1x"  
   },  
   "requestId":"req_7zz28B9jlOpH1x",  
   "statusCode":402  
} 

The message contains the information to display to the customer. While the transaction didn't complete, the log in the Stripe Dashboard will contain the same information. For a complete list of issues, look at the Stripe api for errors.

The Angular and Node/Express application
In order to use Stripe on the client, you need to pull in the Stripe javascript library.

You can get the Stripe library for the client from stripe

<script type="text/javascript" src="https://js.stripe.com/v2/"></script>

In order to use Stripe on the server, you need to install Stripe from NPM

npm install stripe --save

The Donation Form
The example is a donation form. It collects the customer information and allows the customer to choose the donation amount. A shipping address is also collected, for a thank you card back to the customer.



The web page has very little styling, no validation, and only a single error message if the credit card charge is denied.

Angular Client Code
The Angular code includes a directive to display the html, a controller to collect the customer's information, and a service to post to the server. The token creation and charge are both handled in the service, one call right after the other. This is definitely not ideal for a real-world situation.

The cart, card, and customer information are kept in JSON objects in the models.js file. The Stripe publishable key is in config.js along with the cart item name. When the token and charge are made, the JSON object that Stripe expects for each of these is created.

As all of the Stripe work is in the service, that is the Angular code to review. The complete charge has 3 separate objects: cart, card, and customer.

exports.$myservice = function($http, $myappconfig){
    var commit = function (completeCharge, callback){
        // set the publishable key (from config)
        Stripe.setPublishableKey($myappconfig.stripePublishableKey);
        // credit card info passed
        // billing address passed as part of charge.card
        Stripe.card.createToken(completeCharge.card, function(status, response) {
            if (response.error) {
                console.log("stripe token not created");
                // stop here - don't post a charge without a token
                return callback(response.error);
            }
            var chargeRequestObject = {
                stripeToken: response.id,
                cart: completeCharge.cart,
                customer: completeCharge.customer
            };
            // token (not credit card) passed
            // shipping address passed in charge.customer
            $http.
                post('/api/v1/checkout', chargeRequestObject).
                then(function(data) { //success
                    callback(null, data);
                },
                function(response){ //failure
                    callback(response.data.error, response);
                });
        });
    };
    return {
      commit: commit
    };
};

Node Server Code
The server code is a node app using Express. It only serves the html page above and has an api to process a charge -- very lean in order to drop it into any other web site.  Morgan is used for logging and Wagner is used for dependency injection.  You can start the app with npm start or node server/index.js.

Index.js
var express = require('express');  
var wagner = require('wagner-core');  
var path = require('path');  
require('./dependencies')(wagner);  
var app = express();  
app.use(require('morgan')());  
app.get(['/'], function (req, res) {res.sendFile(path.join(__dirname + '/../public/default.html'));});  
app.use('/api/v1', require('./api')(wagner));  
app.use('/public', express.static(__dirname + '/../public', { maxAge: 4 * 60 * 60 * 1000 /* 4hrs */}));  
app.listen(3000);  
console.log('Listening on port 3000!'); 

The dependencies file creates the wagner dependency injection for the config and stripe objects.

Dependencies.js
var fs = require('fs');  
var Stripe = require('stripe');  
var configFile = require('./config.json');  
module.exports = function(wagner) {  
  wagner.factory('Stripe', function(Config) {  
    return Stripe(Config.stripeKey);  
  });  
  wagner.factory('Config', function() {  
    return configFile;  
  });  
};

The config.json is simply the configuration json object:

{  
  "stripeKey": "sk_test_WqKypPUzUXXXXXXXT3zWZmBq",  
  "stripePublishableClientKey": "pk_test_ArJPMXXXXlF2Ml4m4e8ILmiP",  
  "Characters22_StoreName": "MyStripeStore"  
}

The main Stripe code of the application is in the api.js file to process the Stripe charge.
// required modules
var express = require('express');
var bodyparser = require('body-parser');
var status = require('http-status');

module.exports = function(wagner) {
  var api = express.Router();
  api.use(bodyparser.json());
  /* Stripe Checkout API */
  api.post('/checkout', wagner.invoke(function(Stripe, Config) {
    return function(req, res) {  
        // https://stripe.com/docs/api#capture_charge  
        // shipping name is in the metadata so that it is easily found on stripe's website   
        // statement_descriptor & description will show on credit card bill  
        // receipt_email is sent but stripe isn't sending receipt - you still have to do that  
        // shipping is sent only so you can pull information from stripe   
        // metadata: 20 keys, with key names up to 40 characters long and values up to 500 characters long  
        var stripeCharge = {  
            amount: req.body.cart.totalprice,  
            currency: 'usd',  
            source: req.body.stripeToken,  
            description: req.body.cart.name,  
            metadata: {'ShipTo': req.body.customer.shipping.name, 'BillTo': req.body.customer.billing.name},  
            receipt_email: req.body.customer.email,  
            statement_descriptor: Config.Characters22_StoreName,  
            shipping: req.body.customer.shipping   
        };  
        console.log("server stripe charge request object = " + JSON.stringify(stripeCharge)+ "\n");  
        // Charge the card NOW  
        Stripe.charges.create(stripeCharge,function(err, charge) {  
            console.log("server stripe charge response.err = " + JSON.stringify(err) + "\n");        
            console.log("server stripe charge response.charge = " + JSON.stringify(charge) + "\n");   
            if (err) {  
                return res.  
                status(status.INTERNAL_SERVER_ERROR).  
                json({ error: err.toString(), charge: err.raw.charge, request: err.requestId, type : err.type});  
            }  
            return res.json(charge);  
        });   
     };  
  }));  
  return api;  
}; 

Summary
Capturing credit cards with Stripe is simple. You can use their button and form or create your own system. If you create your own client and server, understanding the JSON objects for the token request and the charge request is important. You can pass just the minimum or all the data you have. Create the token on the client and pass the token and charge details to the server to complete the transaction.

Thursday, February 18, 2016

Extending a linux web dashboard

Adding pm2 status to linux-dash

linux-dash is a light-weight, open-source web dashboard to monitor your linux machine or virtual machine. You can find this package on Github.


The dashboard reports many different aspects of your linux installation via shell scripts (*.sh). This allows the dashboard to be light-weight and work on most linux machines. The website displays running charts and tables. The web server can be node, php, or go. For the node webserver, the only dependencies are express and websocket.

Extending linux-dash

You may have a few extra services or programs running on your linux installation that you would like to display on linux-dash. I use pm2, a process manager. Adding a table to display the pm2 status information was very easy -- even if you are not familiar with client-side Angular directives, server-side Node.js, or server-side shell scripts.

The naming convention and templating allow us to focus on the few components we need to build without struggling with the glue between them.

pm2 Dashboard Design

The 'pm2 list' command shows a table with information on the command line.


We want to show this in the linux-dash website on the applications tab in its own table.


In order to do that, we need:
  1. a new shell script - to capture the results of running "pm2 list" and return json
  2. changes to glue - to find script and display as table

Installing linux-dash

If you do not have linux-dash installed, you will need to get it. Clone it from github. Make sure the scripts have execute permissions and the webserver is started with sudo privileges.

Writing the server-side Shell script

This section applies to services that report a snapshot, a single point in time.

If you have not written a shell script before, no need to worry. There are plenty of examples of shell scripts at /server/modules/shell_files. The final output of the shell script needs to either be an empty json object such as {} or an array of values such as [{},{},{}]. The values will be key/value pairs (1 key, 1 value) which will display as a 2xn grid of information.




The second choice is a table (an array of objects with more columns), which is what we need.

pm2 list output

The command I usually run at the command line is "pm2 list" -- the response shows each process with uptime, status, and other information in a table.




We need to know which lines to ignore (1-3, 6, 7) and which to include (only 4 and 5).
Make sure each line of your output is accounted for as either ignored or parsed. While I ignored the header and footer, perhaps your data should be included.

The shell script needs to be able to read each row into a meaningful json object such as:

[  
   {  
      "appName":"dfberry-8080",
      "id":"0",
      "mode":"fork",
      "pid":"1628",
      "status":"online",
      "restart":"0",
      "uptime":"13D",
      "memory":"20.043MB",
      "watching":"disabled"
   },
   {  
      "appName":"linux-dash",
      "id":"1",
      "mode":"fork",
      "pid":"29868",
      "status":"online",
      "restart":"21",
      "uptime":"7D",
      "memory":"28.293MB",
      "watching":"disabled"
   }
]

pm2.sh

The script has 3 sections. The first section sets the command to the variable 'command'. The second section executes the command and sets the returned text (the command line table) to the variable 'data'. The third section has two branches.

The first branch (a) executes if the 'data' variable has any length. The second branch (b) returns an empty json object if the 'data' variable is empty.

Most of the work is in section 3.a with the 'awk' command. The first line pipes the 'data' variable through tail, only passing lines 4 or greater to the next pipe, which is head. Head takes all the lines except the last 2 and pipes the results to awk.

The rest of 3.a works through each column of each row, getting the values; $6 means the sixth column. The columns include the column break character '|', so make sure to include them in the count.

If you are watching the trailing commas, you may be wondering how the last one is removed. Bash has a couple of different ways; I'm using the older bash syntax, ${json%?}, which strips the final character.

#!/bin/bash

#1: set text of command
command="pm2 list"

#2: execute command
data="$($command)"

#3: only process data if variable has a length 
#this should handle cases where pm2 is not installed
if [ -n "$data" ]; then

    #a: start processing data on line 4
    #don't process last 2 lines
    json=$( echo "$data" | tail -n +4 | head -n -2 \
    | awk   '{print "{"}\
        {print "\"appName\":\"" $2 "\","} \
        {print "\"id\":\"" $4 "\","} \
        {print "\"mode\":\"" $6 "\","} \
        {print "\"pid\":\"" $8 "\","}\
        {print "\"status\":\"" $10 "\","}\
        {print "\"restart\":\"" $12 "\","}\
        {print "\"uptime\":\"" $14 "\","}\
        {print "\"memory\":\"" $16 $17 "\","}\
        {print "\"watching\":\"" $19 "\""}\
        {print "},"}')
    #make sure to remove last comma and print in array
    echo "[" ${json%?} "]"
else
    #b: no data found so return empty json object
    echo "{}"
fi

Make sure the script has execute permissions then try it out on your favorite linux OS. If you have pm2 installed and running, you should get a json object filled in with values.

At this point, we are done with the server-side code. Isn't that amazing?

Naming conventions

The client-side piece of the code is connected to the server-side script via the naming convention. I called this script pm2.sh on the server in the server/modules/shell_files directory. For the client-side/Angular files, you need to use the same name or Angular version of the same name.

Client-side changes for Angular

The Angular directive will be pm2 and used like:

<pm2></pm2>

Add this to the /templates/sections/applications.html so that the entire file looks like:

<common-applications></common-applications>
<memcached></memcached>
<redis></redis>
<pm2></pm2>

Since the pm2 directive is at the end, it will display as the last table. Notice I haven't actually built a table in html, css, or any other method.

I just added a directive using a naming convention tied to the server-side script. Pretty cool, huh?

Routing to the new Angular directive

The last piece is to route the directive 'pm2' to a call to the server for the 'pm2.sh' script.
In the /js/modules.js file, the routing for simple tables is controlled by the 'simpleTableModules' variable. Find that section. We need to add a new json object to the array of name/template sections.

{
    name: 'pm2',
    template: '<table-data heading="P(rocess) M(anager) 2" module-name="pm2" info="pm2 read-out."></table-data>'
}, 

It doesn't matter where in the array the section is added, just that the naming convention is used. Notice the name is 'pm2' and the template's module-name is set to the same value, 'pm2'.

If I wanted a simple table of 2 columns instead of 9 columns, the json object would look like:

{
    name: 'pm2',
    template: '<key-value-list heading="P(rocess) M(anager) 2" module-name="pm2" info="pm2 read-out."></key-value-list>'
},

The key-value-list changes the html display to a 2xN column table.

Summary

With very little code, you can add reports to linux-dash. For the server-side, you need to write a shell script with execute permissions that outputs a json object. For the client-side, you need to create a directive by adding its tag to the appropriate section template, then add a route to the modules.js file. The biggest piece of work is getting the shell script to work.

Now that you know how to create new reporting tables for linux-dash, feel free to add your own code to the project via a pull request.

Friday, February 5, 2016

Prototyping in MongoDB with the Aggregation Pipeline stage operator $sample


The World Map as a visual example

In order to show how the random sampling works in the mongoDB query, this NodeJS Express website will show the world map and display random latitude/longitude points on the map. Each refresh of the page will produce new random points. Below the map, the returned docs are displayed.



Once the website is up and working with data points, we will play with the query to see how the data points change in response.

The demonstration video is available on YouTube.


Setup steps for the website

Setup

This article assumes you have no mongoDB, no website, and no data. It does assume you have an account on Compose. Each step is broken out and explained. If there is a step you already have, such as the mongoDB with latitude/longitude data or a website that displays it, skip to the next.
  1. get website running, display map with no data
  2. setup the mongoDB+ ssl database
  3. get mock data including latitude and longitude
  4. insert the mock data into database
  5. update database data types
  6. verify world map displays data points

Play

When the website works and the world map displays data points, let's play with it to see how $sample impacts the results.
  1. understand the $sample operator
  2. change the row count
  3. change the aggregation pipeline order
  4. prototype with $sample

System architecture

The data import script is /insert.js. It opens and inserts a json file into a mongoDB collection. It doesn't do any transformation.

The data update script is /update.js. It updates the data to numeric and geojson types.

The server is a nodeJs Express website using the native MongoDB driver. The code uses the filesystem, url, and path libraries. This is a bare-bones express website. The /server/server.js file is the web server, with /server/query.js as the database layer. The server runs at http://127.0.0.1:8080/map/. This address is routed to /public/highmap/world.highmap.html. The data query will be made to http://127.0.0.1:8080/map/data/ from the client file /public/highmap/world.highmap.js.

The client files are in the /public directory. The main web file is /highmap/world.highmap.html. It uses jQuery as the javascript framework, and highmap as the mapping library which plots the points on the world map. The size of the map is controlled by the /public/highmap/world.highmap.css stylesheet for the map id.

Step 1: The NodeJS Express Website

In order to get the website up and going, you need to clone this repository, make sure nodeJS is installed, and install the dependency libraries found in the package.json file.

Todo: install dependencies

npm install

Once the dependencies are installed, you can start the web server.

Todo: start website

npm start

Todo: Request the website to see the world map. The map should display successfully with no data points.

http://127.0.0.1:8080/map/


Step 2: Setup the Compose MongoDB+ Deployment and Database

You can move on to the next section if you have a mongoDB deployment with SSL to use and have the following items:
  • deployment public SSL key in the /server/clientcertificate.pem file
  • connection string for that deployment in /server/config.json
Todo: Create a new deployment on Compose for a MongoDB+ database with an SSL connection.




While still on the Compose backoffice, open the new deployment and copy the connection string.

Todo: Copy connection string

You will need the entire connection string in order to insert, update, and query the data. The connection string uses a user and password at the beginning and the database name at the end.




You also need to get the SSL Public key from the Compose Deployment Overview page. You will need to login with your Compose user password in order for the public key to show.



Todo: Save the entire SSL Public key to /server/clientcertificate.pem.

If you save it somewhere else, you need to change the mongodb.certificatefile setting in /server/config.json.

You will also need to create a user in the Deployment's database.




Todo: Create new database user and password. Once you create the user name and user password, edit the connection string for the user, password, and database name.

connection string format

mongodb://USER:PASSWORD@URL:PORT,URL2:PORT2/DATABASENAME?ssl=true

connection string example

mongodb://myname:myuser@aws-us-east-1-portal.2.dblayer.com:10907,aws-us-east-1-portal.3.dblayer.com:10962/mydatabase?ssl=true


Todo: Change the mongodb.url setting in the /server/config.json file to this new connection string.

{
    "mongodb": {
        "data": "/data/mockdata.json",
        "url": "mongodb://DBUSER:DBPASSWORD@aws-us-east-1-portal.2.dblayer.com:10907,aws-us-east-1-portal.3.dblayer.com:10962/DATABASE?ssl=true",
        "collection": "mockdata",
        "certificatefile": "/clientcertificate.pem",
        "sample": {
            "on": true,
            "size": 5,
            "index": 1
        }
    }
}

Step 3: The Prototype Data

If you already have latitude and longitude data, or want to use the mock file included at /data/mockdata.json, you can skip this step.

Use Mockaroo to generate your data. This allows you to get data, including latitude and longitude, quickly and easily. Make sure to add the latitude and longitude data in json format.



Make sure you have at least 1000 records for a good show of randomness and save the file as mockdata.json in the data subdirectory.

Todo: Create mock data and save to /data/mockdata.json.


Step 4: Insert the Mock Data into the mockdata Collection

The insert.js file converts the /data/mockdata.json file into the mockdata collection in the mongoDB database.

Note: This script uses the native MongoDB driver and the filesystem node package. The Mongoose driver can also use the ssl connection and the $sample operator. If you are using any other driver, you will need to check that it supports both ssl and $sample.

The configuration is kept in the /server/config.json file. Make sure it is correct for your mongoDB url, user, password, database name, collection name and mock data file location. The configuration is read in and stored in the privateconfig variable of the insert.js script.

The mongos section of the config variable is for the SSL mongoDB connection. You shouldn't need to change any values.

insert.js


var MongoClient = require('mongodb').MongoClient,  
  fs = require('fs'),
  path = require('path');

var privateconfig = require(path.join(__dirname + '/config.json'));
var ca = [fs.readFileSync(path.join(__dirname + privateconfig.mongodb.certificatefile))];
var data = fs.readFileSync(path.join(__dirname + privateconfig.mongodb.data), 'utf8');
var json = JSON.parse(data);

MongoClient.connect(privateconfig.mongodb.url, {
    mongos: {
        ssl: true,
        sslValidate: true,
        sslCA: ca,
        poolSize: 1,
        reconnectTries: 1
    },
}, function (err, db) {
    if (err) {
        console.log(err);
    } else {
        console.log("connected");
        db.collection(privateconfig.mongodb.collection).insert(json, function (err, collection) {
            if (err) console.log((err));
            db.close();
            console.log('finished');
        }); 
    }  
});



Todo: Run the insert script.

node insert.js
If you create an SSL database but don't pass the certificate, you won't be able to connect to it. You will get a sockets closed error.

Once you run the script, make sure you can see the documents in the database's mockdata collection.


Step 5: Convert latitude and longitude from string to floats

The mock data's latitude and longitude are strings. Use the update.js file to convert the strings to floats as well as create the geojson values.

update.js
var MongoClient = require('mongodb').MongoClient,  
  fs = require('fs'),
  path = require('path');

var privateconfig = require(path.join(__dirname + '/config.json'));
var ca = [fs.readFileSync(path.join(__dirname + privateconfig.mongodb.certificatefile))];

MongoClient.connect(privateconfig.mongodb.url, {
    mongos: {
        ssl: true,
        sslValidate: true,
        sslCA: ca,
        poolSize: 1,
        reconnectTries: 1
    },
}, function (err, db) {
    if (err) console.log(err);
    if (db) console.log("connected");
       
    db.collection(privateconfig.mongodb.collection).find().each(function(err, doc) {       
        if (doc){
            
            console.log(doc.latitude + "," + doc.longitude);
            
            var numericLat = parseFloat(doc.latitude);
            var numericLon = parseFloat(doc.longitude);
            
            doc.latitude = numericLat;
            doc.longitude = numericLon;
            doc.geojson = { location: { type: 'Point', coordinates: [numericLon, numericLat] } }; // GeoJSON points are [longitude, latitude]
            db.collection(privateconfig.mongodb.collection).save(doc);
            
        } else {
            db.close();
        }
    });
    console.log('finished');
});



Todo: Run the insert script
node update.js
Once you run the script, make sure you can see the documents in the database's mockdata collection with the updated values.


Step 6: Verify world map displays points of latitude and longitude

Refresh the website several times. This should show different points each time. The variation of randomness should catch your eye. Is it widely random, or not as widely random as you would like?


Todo: Refresh several times
http://127.0.0.1:8080/map/?rows=5



The warning about the $sample behavior says the data may duplicate within a single query. On this map, that would appear as fewer than the requested number of data points. Did you see that in your tests?

How $sample impacts the results

Now that the website works, let's play with it to see how $sample impacts the results.
  1. understand the $sample code in /server/query.js
  2. change the row count
  3. change the aggregation pipeline order
  4. prototype with $sample

Step 1: Understand the $sample operator in /server/query.js

The $sample operator controls random sampling of the query in the aggregation pipeline.
The pipeline used in this article is a series of array elements in the arrangeAggregationPipeline function in the /server/query.js file. The first array element is the $project section which controls what data to return.

arrangeAggregationPipeline()


  
  
var aggregationPipeItems = [
        { $project: 
            {
                last: "$last_name",
                first: "$first_name",
                lat: "$latitude",
                lon:  "$longitude",
                Location: ["$latitude", "$longitude"],
                _id: 0 
            }
        },
        { $sort: {'last': 1}} // sort by last name
    ];


The next step in the pipeline is the sorting of the data by last name. If the pipeline runs this way (without $sample), all documents are returned and sorted by last name.

The location of $sample is controlled by the pos value in the url. If pos isn't set, the position defaults to 1. If it is set to 1 in the zero-based array, it is applied between $project and $sort, at the second position. If the code runs as supplied, the set of data is randomized, documents are selected, and then the rows are sorted. This is meaningful in that the data is both random and returned sorted.
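
A minimal sketch of that insertion is below. The function and variable names are assumptions for illustration, not the exact code in /server/query.js.

// Insert the $sample stage into the aggregation pipeline at the requested zero-based position.
// 'rows' and 'pos' come from the query string, e.g. /map/data/?rows=10&pos=1
function addSampleStage(pipeline, rows, pos) {
  var sampleStage = { $sample: { size: parseInt(rows, 10) } };
  pipeline.splice(pos, 0, sampleStage); // delete nothing, insert one stage
  return pipeline;
}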

Note: In order for random sampling to work, you must use it in connection with 'rows' in the query string.

We will play with the position in step 3.

Step 2: Change the row count

The count of rows is a parameter in the url to the server, when the data is requested. Change the url to indicate 10 rows returned.

Todo: request 10 rows, with sorting applied after

http://127.0.0.1:8080/map/?rows=10



Step 3: Change the aggregation pipeline order

The aggregation pipeline order is a parameter in the url to the server. You can control it with the 'pos' name/value pair. The following url is the same as Step 2 but the aggregation pipeline index is explicitly set.

Todo: request 10 rows, with sorting applied after
http://127.0.0.1:8080/map/?rows=10&pos=1

Note: Only 0, 1, and 2 are valid values




The results below the map should be sorted.



If the $sample position is moved to the 0 position, still before the sort is applied, the browser shows the same result.

Todo: request 10 rows, with sorting applied after

http://127.0.0.1:8080/map/?rows=10&pos=0

However, if $sample is the last item (pos=2), the entire set is sorted first, and then the requested rows are selected. The results are no longer sorted.

Todo: request 10 rows, with sorting applied before

http://127.0.0.1:8080/map/?rows=10&pos=2



Note that while the documents are returned, they are not in sorted order.

If they are in sorted order, it isn't because they were sorted, but because the random pick happened that way by accident, not on purpose.

Step 4: Prototype with $sample

The mongoDB $sample operator is a great way to try out a visual design without needing all the data. At the early stage of the design, a quick visual can give you an idea if you are on the right path.

The map with data points works well for 5 or 10 points, but what about 500?

Todo: request 500 rows

http://127.0.0.1:8080/map/?rows=500



The visual appeal and much of the meaning of the data is lost in the mess of the map. Change the size of the points on the map.

Todo: request 500 rows, with smaller points on the map using 'radius' name/value pair

http://127.0.0.1:8080/map/?rows=500&radius=2


Summary

The $sample aggregation pipeline operator in mongoDB is a great way to build a prototype and test with random data. Building the page so that the visual design is controlled by the query string works well for quick changes with immediate feedback.

Enjoy the new $sample operator. Leave comments about how you have or would use it.

Thursday, January 21, 2016

Frugal Cloud: The In-Memory versus SSD Paging File

Many people remember the days when you could use a USB memory stick to boost the performance of Windows. That memory trick caused me to ask the question: is there a potential cost saving, with little performance impact, in going sparse on physical memory and configuring a paging file?

For windows folks:

For Linux, see the TechTalk Joe post. Note his point that "I/O requests on instance storage does not incur a cost. Only EBS volumes have I/O request charges." So this is not recommended if you are running with EBS only.

This approach is particularly significant when you are "just over" one of the offering levels.


For some configurations, you will not get a CPU boost by using the paging files. Recent experience with a commercial SaaS showed high memory usage but very low CPU (3-5%, even during peak times!). Having 1/2 or even 1/4 the CPUs would not peg the CPU. The question then becomes whether the paging file on an SSD drive would significantly drop performance (whether you can stripe across multiple SSDs on cloud instances for extra performance is an interesting question). This is a question that can only be determined experimentally.

  • How the paging file is configured and the actual usage of memory by the application is key. Often 80-90% of the usage hits only 10% of the memory (Pareto rule). The result could be that the median (50%ile) time may be unchanged -- and time may increase only along the long tail of the response distribution (say top 3% may be longer).
These factors cannot be academically determined. They need to be determined experimentally.

If performance is acceptable, there is an immediate cost saving because when new instances are created due to load, they are cheaper instances.

Bottom line is always: experiment, stress, time, and compare cost. Between the pricing models, OS behaviors, and application behaviors, there is no safe rule of thumb!

Second Rule: Always define SLA as the median (50%-ile) and never as an average. Web responses are long-tailed, which makes the average (mean) very volatile. The median is usually very stable.

Sunday, January 17, 2016

Sharding Cloud Instances

Database sharding has been with us for many years. The concept of cloud instance sharding has not been discussed much. There is a significant financial incentive to do so.

Consider a component that provides address and/or postal code validation around the world. For the sake of illustration, let us consider 10 regions that each have the same volume of data.

Initial tests found that it took 25 GB of data to load all of them in memory. Working off the AWS EC2 price list, we find that an m4.2xlarge is needed to run it, at $0.479/hr. This gives us 8 CPUs.

If we run with 10 @ 2.5 GB instead, we end up with 10 t2.medium, each with 2 CPUs and a cost of $0.052/hr, or $0.52/hr -- which on first impression is more expensive, except we have 20 CPUs instead of 8. We may have better performance. If one of these instances is a hot spot (like US addresses), then we may end up with 9 instances supporting one region each and perhaps 5 instances supporting the US. With the single-instance model, we may need 5 of the large instances to handle the same load.

In this case, we could end up with

  • Single instance model: 5 * $0.479 = $2.395/hr with 40 CPUs
  • Sharded instances model: (9 + 5) * $0.052 = $0.728/hr with 28 CPUs
We have moved from the sharded model being about 10% more expensive (no hot spot) to the single-instance model costing more than three times as much (with the hot spot).
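As a minimal sketch of that comparison (plain Node, using the on-demand prices quoted above; the instance counts are the illustrative numbers from this example, not measurements):

// Hypothetical cost model for the address-validation example above.
var singleInstance = { name: 'm4.2xlarge', pricePerHour: 0.479, cpus: 8 };
var shardInstance = { name: 't2.medium', pricePerHour: 0.052, cpus: 2 };

// Hot-spot scenario: 5 copies of the single-instance model versus
// 9 regional shards plus 5 shards dedicated to the US hot spot.
function cost(instance, count) {
  return { perHour: instance.pricePerHour * count, cpus: instance.cpus * count };
}

console.log('single: ', cost(singleInstance, 5));      // about $2.40/hr with 40 CPUs
console.log('sharded:', cost(shardInstance, 9 + 5));   // about $0.73/hr with 28 CPUs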

Take Away

Designing from the start so that components can run as independent cloud instances with low resource requirements, and can be sharded, may be a key cost-control mechanism for cloud applications.

It is impossible to determine the optimal design and deployment strategy a priori; it needs to be determined by experiment. Doing experiments cheaply means that the components and architecture must be designed to support experimentation.

In some ways, cloud computing is like many phone plans -- you are forced to pay for a resource level and may not use all of the resources you pay for. Yes, the plans have steps, but if you need 18 GB of memory you may also have to pay for 8 CPUs that will never exceed 5% usage (i.e., a single CPU would be sufficient). Designing for flexibility in cloud instances is essential for cost savings.

Saturday, January 16, 2016

A Financially Frugal Architectural Pattern for the Cloud

I have heard many companies complain about how expensive the cloud is becoming as they move from development to production systems. In theory, the savings from greatly reduced Site Reliability Engineering staffing and reduced hardware costs should compensate -- the key word is should. In reality, the staffing reduction never happens, because those engineers are needed to support other systems that will not be migrated for years.

There is actually another problem: the architecture is not designed for the pricing model.

In the last few years there have been many changes in the application environment, and I suspect many current architectures are locked into past system design patterns. To understand my proposal better, we need to look at the patterns through the decades.

The starting point is the classic client-server pattern: many clients, one server, one database server (possibly many databases).
As application volume grew, we ended up with multiple servers handling the many clients, but still retaining a single database.
Many variations arose, especially around databases: federated, sharded, and so on. The next innovation was remote procedure calls, with many dialects such as SOAP, REST, and AJAX.
When the cloud came along, the above architecture was too often just moved from physical machines to cloud machines without any further examination.

Often there will be minor changes: if a queue service was running alongside the application server on the same machine, it may be spun off to a separate cloud instance. But applications are often designed for the old model of everything on one machine. It is rare for an existing application to be significantly design-refactored when it is moved to the cloud, and I have also seen new cloud-based applications implemented in the classic single-machine pattern.

The Design Problem

The resulting architecture is an application consisting of dozens, often over 100, libraries (for example, C++ DLLs). It is a megalith rooted in the original design being for a single PC.

Consider the following case: suppose that instead of running these 100 libraries on high-end cloud machines with, say, 20 instances, you run each library on its own lightweight machine. Some libraries may only need two or three lightweight machines to handle the load. Others may need 20 instances because they are computationally intense hot spots. If you are doing auto-scaling, the time to spin up a new instance is much shorter when instances are library-based, because each one holds only a single library.

For the sake of argument, suppose that each of the 100 libraries requires 0.4 GB to run. Loading all of them in one instance means 40 GB (100 x 0.4).

Looking at the current AWS EC2 pricing, we could use 100 t2.nano instances at $0.0065 x 100 = $0.65/hour, with 1 CPU each (100 CPUs total). Holding the full 40 GB in one instance would require a c3.8xlarge at $1.68/hour: roughly two and a half times the cost, with only 32 cores instead of 100. About 2.6 times the cost for a third of the cores works out to roughly eight times the price per core.
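A rough per-core comparison of the two options, as a sketch (plain Node; the prices are the 2016 on-demand figures quoted above, and the 100-library decomposition is the hypothetical from this example):

// Megalith: one big instance holds all 100 libraries.
var megalith = { name: 'c3.8xlarge', pricePerHour: 1.68, cores: 32 };

// Decomposed: one t2.nano per library.
var nano = { name: 't2.nano', pricePerHour: 0.0065, cores: 1 };
var libraryCount = 100;

var fleet = { pricePerHour: nano.pricePerHour * libraryCount, cores: nano.cores * libraryCount };

console.log('megalith:   $' + megalith.pricePerHour.toFixed(2) + '/hr, $' +
            (megalith.pricePerHour / megalith.cores).toFixed(4) + ' per core-hour');
console.log('nano fleet: $' + fleet.pricePerHour.toFixed(2) + '/hr, $' +
            (fleet.pricePerHour / fleet.cores).toFixed(4) + ' per core-hour');
// Roughly $0.0525 versus $0.0065 per core-hour -- about an 8x difference.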


What about scaling? With the megalith, you have to spin up a complete new instance. With the decomposition into library components, you only need to spin up new instances of the library that needs them. In other words, scaling up becomes significantly more expensive with the megalith model.

What is another way to describe this? Microservices

This is a constructed example, but it does illustrate that moving an application to the cloud may require appropriate redesign, with a heavy focus on building components that run independently on the cheapest instances. Each swarm of these component instances is load balanced, with very fast creation of new instances.

Faster instance creation actually saves more money because the scaling trigger can be set higher (and thus fires less often, with fewer false positives). You want to create instances so that they are ready when the load builds to require them. The longer it takes to bring up an instance, the longer the lead time you need, which means the lower on the load curve you must set the trigger point.
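A minimal sketch of that lead-time trade-off (plain Node; the 2% per minute load growth and the spin-up times are made-up illustrative numbers, not measurements):

// Assume load grows by at most 2% of total capacity per minute.
var maxLoadGrowthPerMinute = 0.02;

// Trigger scale-up when current utilization plus the growth expected
// during spin-up would reach full capacity.
function scaleUpTriggerPoint(spinUpMinutes) {
  return 1.0 - maxLoadGrowthPerMinute * spinUpMinutes;
}

console.log('megalith image, 15 min spin-up -> trigger at ' +
            (scaleUpTriggerPoint(15) * 100).toFixed(0) + '% utilization');
console.log('single library,  2 min spin-up -> trigger at ' +
            (scaleUpTriggerPoint(2) * 100).toFixed(0) + '% utilization');
// 70% versus 96%: the faster spin-up lets instances run much closer to
// capacity before new ones must be started, so less idle headroom is paid for.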

There are additional savings in deployments, because you can deploy at the library level to specific machines instead of having to deploy one big image. Deploys are faster, and rollbacks are faster.

Amazon actually takes this approach internally, with hundreds of services (each on their own physical or virtual machines) backing their web site. A new feature is rarely integrated into the "stack"; instead it is added as a service that can be turned on or off in production by setting appropriate cookies. There is limited need for a sandbox environment because the new feature is not visible to the public -- only to internal people who know how to turn it on.

What is the key rhetorical question to keep asking?

Why are we putting most of the application on one instance instead of "dividing and saving money"? This question should be asked constantly during design reviews.

In some ways, a design goal would be to build the application so it could run on a room full of Raspberry Pis.

This design approach does increase complexity -- just as multi-threading and/or async operations add complexity but with a significant payback. Designing libraries to minimize the number of inter-instance calls while also minimizing resource requirements is a design challenge that will likely require mathematical / operations research skills.

How to convert an existing application?

A few simple rules to get the little gray cells firing:
  • Identify methods that are static - those are ideal for mini-instances (see the sketch after this list).
  • Backtrack from these methods to their callers and build up clusters of objects that can function independently.
    • There may be refactoring, because designs often go bad under pressure to deliver functionality.
    • You want to minimize external (inter-component-instance) calls from each of these clusters.
  • If the system does not end up consisting of dozens of component-instance deployments, there may be a problem.
    • If changing the internal code of a method requires a full deployment, there is a problem.
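As a sketch of the first rule (all names hypothetical): a static, side-effect-free postal-code check pulled out of a larger library and exposed as its own tiny service with Node's built-in http module, so it can run on the smallest instance type and scale independently.

var http = require('http');
var url = require('url');

// A static, side-effect-free method -- an ideal candidate for a
// mini-instance because it needs no shared state from the megalith.
function isValidUsZip(zip) {
  return /^[0-9]{5}(-[0-9]{4})?$/.test(zip);
}

// Expose it as a one-purpose service, e.g. GET /?zip=90210
http.createServer(function (req, res) {
  var zip = url.parse(req.url, true).query.zip || '';
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ zip: zip, valid: isValidUsZip(zip) }));
}).listen(process.env.PORT || 3000);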
One of the anti-patterns for frugal cloud-based design is actually object-oriented (as opposed to cost-oriented) design. I programmed in Simula and worked in GPSS -- the "Adam and Eve" of object programming. All of the early literature was based on the single-CPU reality of computing at the time. I have often had to go in and totally refactor an academically correct object-oriented system design in order to get performance. Today, a refactor would also need to lower costs.

The worst case of system code that I refactored for performance was implemented as an entity model in C++: a single call from the web front end went through some 20 classes/instances in a beautiful conceptual model, with something like 45 separate calls to the database. My refactoring resulted in one class and a single stored procedure (whose result was cached for 5 minutes before rolling off or being marked stale).
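A minimal sketch of that kind of result cache (plain Node; the five-minute window matches the story above, but the key name and fetch function are placeholders, not the original code):

// Cache a query result for a fixed time-to-live before refetching.
var FIVE_MINUTES = 5 * 60 * 1000;
var cache = {}; // key -> { value: ..., fetchedAt: timestamp }

function cachedQuery(key, fetchFn) {
  var entry = cache[key];
  var now = Date.now();
  if (entry && (now - entry.fetchedAt) < FIVE_MINUTES) {
    return entry.value;               // still fresh -- skip the database
  }
  var value = fetchFn();              // stale or missing -- refetch
  cache[key] = { value: value, fetchedAt: now };
  return value;
}

// Usage with a placeholder for the single stored-procedure round trip.
var result = cachedQuery('dashboard-summary', function () {
  return { rows: [], fetchedFromDatabase: true };
});
console.log(result);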

I believe that similar design inefficiencies are common in cloud architecture.

When you owned the hardware, each machine added labor cost to create, license, update, and support, so there was considerable financial and human pressure to minimize the number of machines. When you move to the cloud with good script automation, having 3 instances or 3,000 instances should be approximately the same amount of work. The financial pressure shifts toward the model that minimizes cost -- and that will often be the one with many, many more machines.