Category: Web Applications

WSO2 ESB Endpoint Error Handling

Introduction

WSO2 ESB can be used as an intermediary component to connect different systems. When connecting such systems, their availability is a common concern, so the ESB has to handle these undesirable situations carefully and take appropriate actions. To cater for that requirement, the outbound endpoints of WSO2 ESB can be configured with error-handling behaviour. In this article I discuss two common ways of configuring endpoints.

Two common approaches to configuring endpoints are:

  1. Configure with just a timeout (without suspending endpoint)
  2. Configure with a suspend state

Configure with just a timeout

This approach is suitable if endpoint failures are not very frequent.

Sample Configuration:

<endpoint name="SimpleTimeoutEP">
    <address uri="http://localhost:9000/StockquoteService">
    <timeout>
        <duration>2000</duration>
        <responseAction>fault</responseAction>
    </timeout>
    <suspendOnFailure>
        <errorCodes>-1</errorCodes>
        <initialDuration>0</initialDuration>
        <progressionFactor>1.0</progressionFactor>
        <maximumDuration>0</maximumDuration>
    </suspendOnFailure>
    <markForSuspension>
        <errorCodes>-1</errorCodes>
    </markForSuspension>
</address>
</endpoint>


In this case we only focus on the timeout of the endpoint. The endpoint stays in the Active state forever. If a response is not received within the specified duration, the responseAction is triggered.

duration – in milliseconds

responseAction – when a response arrives for a request that has already timed out, one of the following actions is triggered:

  • fault – invokes the associated fault sequence
  • discard – discards the response
  • none – takes no specific action on the response (default)

The rest of the configuration prevents the endpoint from going into the suspended state.

If you specify responseAction as “fault”, you can define a customized way of informing the client about the failure in the fault-handling sequence, or store the message and retry it later.
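For example, a minimal fault-handling sequence might just log the failure and park the message in a message store for a later retry. The sequence and store names below are made up for illustration; such a sequence would be attached through the faultSequence of the proxy service or the onError attribute of the calling sequence.

<sequence name="TimeoutFaultSeq">
    <!-- log the error information reported for the endpoint -->
    <log level="custom">
        <property name="STATUS" value="SimpleTimeoutEP did not respond in time"/>
        <property name="ERROR_MESSAGE" expression="get-property('ERROR_MESSAGE')"/>
    </log>
    <!-- keep the message so a message processor can retry it later -->
    <store messageStore="RetryStore"/>
</sequence>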

Configure with a suspend state

This approach is useful when connection failures happen often. By suspending the endpoint, the ESB can save resources instead of unnecessarily waiting for responses.

In this case the endpoint goes through state transitions. The theory behind this behavior is the circuit-breaker pattern. The three states are:

  1. Active – Endpoint sends all requests to backend service
  2. Timeout – Endpoint starts counting failures
  3. Suspend – Endpoint limits sending requests to backend service

Sample Configuration:

<endpoint name="Suspending_EP">
    <address uri="http://localhost:9000/StockquoteServicet">
    <timeout>
        <duration>6000</duration>
    </timeout>
    <markForSuspension>
        <errorCodes>101504, 101505</errorCodes>
        <retriesBeforeSuspension>3</retriesBeforeSuspension>
        <retryDelay>1</retryDelay>
    </markForSuspension>
    <suspendOnFailure>
        <errorCodes>101500, 101501, 101506, 101507, 101508</errorCodes>
        <initialDuration>1000</initialDuration>
        <progressionFactor>2</progressionFactor>
        <maximumDuration>60000</maximumDuration>
    </suspendOnFailure>
</address>
</endpoint>


In the above configuration:

If the endpoint receives error code 101504 or 101505, it is moved from the Active state to the Timeout state.

When the endpoint is in the Timeout state, it makes 3 retry attempts with a 1 millisecond delay between them.

If all those retry attempts fail, the endpoint moves to the Suspend state. If a retry succeeds, the endpoint moves back to the Active state.

If the Active endpoint receives error code 101500, 101501, 101506, 101507 or 101508, it moves directly to the Suspend state.

Once the endpoint has moved to the Suspend state, it waits for initialDuration before attempting any further requests. Thereafter it determines the period between retry attempts according to the following equation:

Min(current suspension duration * progressionFactor, maximumDuration)

In the equation, the “current suspension duration” is updated after each reattempt.
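For example, with the configuration above (initialDuration of 1000 ms, progressionFactor of 2 and maximumDuration of 60000 ms), the suspension periods grow as 1000, 2000, 4000, 8000, … milliseconds until they are capped at 60000 ms.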

Once the endpoint succeeds in getting a response to a request, it goes back to the Active state.

If the endpoint gets any other error code (e.g. 101503), it does not make any state transition and remains in the Active state.

Conclusion

In this article I have shown two basic configurations that are useful for configuring endpoints of WSO2 ESB. You can refer to the WSO2 ESB documentation for implementing more complex patterns with endpoints.

References

WSO2 ESB Documentation: https://docs.wso2.com/display/ESB500/Endpoint+Error+Handling#EndpointErrorHandling-timeoutSettings

Timeout and Circuit Breaker Pattern in WSO2 Way: http://ssagara.blogspot.com/2015/05/timeout-and-circuit-breaker-pattern-in.html

Endpoint Error Codes: https://docs.wso2.com/display/ESB500/Error+Handling#ErrorHandling-codes

Endpoint Error Handling: http://wso2.com/library/articles/wso2-enterprise-service-bus-endpoint-error-handling/

Creating Geometry Compass Using JavaScript

Introduction

A few weeks back, I was asked to look around for existing geometry drawing tools. While searching, I found a number of dynamic geometric construction tools that allow such drawing. Most of those projects are desktop-based, and a few are web-based. Most of the web-based ones reuse existing platforms such as GeoGebra.

In my research I focused on ruler and compass tools that could be reused. Unfortunately I didn’t come across a web-based tool that I could reuse: there was some good work, but it was not reusable, and since most of the other web-based systems build on GeoGebra, they share the limitations that GeoGebra has. Almost all of the web-based tools I found had an issue with their compass, which was not user-friendly.

Problem

The geometry compass in most web-based tools allows you to draw arcs in only one direction, mostly anti-clockwise. These uni-directional compass tools were annoying for me while drawing constructions. Searching further, I found that Geometra (a desktop-based tool) provides a reasonable solution: its compass allows drawing arcs in both directions, but only small arcs (less than 180 degrees). In my implementation I thought of bringing Geometra’s approach to a web-based environment.

Background

For this implementation I used the HTML5 canvas and JavaScript. To make the implementation easier I added the Fabric.js library. Though Fabric.js makes it easy to use the HTML5 canvas, it has limitations in representing arcs, so I extended its Circle class to cater for the requirement. Furthermore, I used a Compass class to handle the properties of the compass, and events.js for handling mouse events.

Implementation

In this section I’ll go through the files mentioned above. The “Arc.class.js” file extends the Circle class of Fabric.js; I’ve overridden its initialize, _render and toSVG methods to suit the requirement.

Arc.class.js


fabric.Arc = fabric.util.createClass(fabric.Circle, {
	type: 'arc',

	counterclockwise: false,

	initialize: function (options) {
		this.counterclockwise = options.counterclockwise;
		this.callSuper('initialize', options);
	},

	_render: function (ctx, noTransform) {
		ctx.beginPath();
		ctx.arc(noTransform ? this.left + this.radius : 0,
		      noTransform ? this.top + this.radius : 0,
		      this.radius,
		      this.startAngle,
		      this.endAngle, this.counterclockwise);
		this._renderFill(ctx);
		this._renderStroke(ctx);
    },

    toSVG: function(reviver) {
    	var markup = [];

		var rx = this.left + this.radius * Math.cos(this.startAngle);
		var ry = this.top + this.radius * Math.sin(this.startAngle);

		var ex = this.left + this.radius * Math.cos(this.endAngle);
		var ey = this.top + this.radius * Math.sin(this.endAngle);

		var svgPath = '';
	    if (!this.counterclockwise) {
	    	svgPath += '<path d=\"M'+rx+','+ry+' A'+this.radius+','+this.radius+' 0 0,1 '+ex+','+ey+'\" style=\"'+this.getSvgStyles()+'\"/>';
	    } else {
	    	// Exchange starting and ending points when it's counterclockwise
	    	svgPath += '<path d=\"M'+ex+','+ey+' A'+this.radius+','+this.radius+' 0 0,1 '+rx+','+ry+'\" style=\"'+this.getSvgStyles()+'\"/>';
	    }

	    markup.push(svgPath);

    	return reviver ? reviver(markup.join('')) : markup.join('');
    }
});

The above Arc class provides generic support for drawing arcs, similar to the Line and Circle classes that already come with Fabric.js. To use the Arc class, I created the Compass JavaScript file. It has three public methods: redraw, which handles mouse movements while drawing is in progress; complete, which concludes the current drawing step; and toSVG, which gives the SVG representation of the arc drawn.

Compass.js


function Compass (mouseStart) {
	// 'c' for center, 'r' for radius, 'e' for end
	this.cx = this.rx = this.ex = mouseStart.x;
	this.cy = this.ry = this.ey = mouseStart.y;
	
	this.radius = 0;

	var points = [this.cx, this.cy, this.rx, this.ry];

	this.radiusLine = new fabric.Line(points, {				
										    strokeWidth: 2,
										    fill: 'black',
										    stroke: 'black',
										    strokeDashArray: [6, 3],
										    selectable: false
										});

	fabricCanvas.add(this.radiusLine);

	this.textObj = new fabric.Text('0', {
									        fontFamily: 'Times_New_Roman',
									        left: this.cx,
									        top: this.cy,
									        fontSize: 20,
									        originX: 'center'
									    });

	fabricCanvas.add(this.textObj);

	this.status = 'radius';
}

Compass.prototype = {
	constructor : Compass,

	redraw : function (mouse) { 

		if (this.status == 'radius') {
			this.rx = mouse.x;
		 	this.ry = mouse.y;

		 	this.radiusLine.set({ x2: this.rx, y2: this.ry });

		 	var tmp = addDistanceLabel (this.textObj, {x: this.cx, y:this.cy}, {x:this.rx, y:this.ry});

		 	fabricCanvas.renderAll();

		} else if (this.status == 'end') {
			this.ex = mouse.x;
			this.ey = mouse.y;

			this.endAngle = this._getAngle({ x:this.ex, y:this.ey })


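			// pick the sweep direction that keeps the drawn arc smaller than 180 degrees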
			var angleDiff = this.endAngle - this.startAngle;

			if ((-Math.PI * 2 < angleDiff) && (angleDiff < -Math.PI)) {
				this.counterclockwise = false;
			} else if ((-Math.PI < angleDiff) && (angleDiff < 0)) {
				this.counterclockwise = true;
			} else if ((0 < angleDiff) && (angleDiff < Math.PI)) {
				this.counterclockwise = false;
			} else if ((Math.PI < angleDiff) && (angleDiff < Math.PI * 2)) {
				this.counterclockwise = true;
			}

			this.fabricObj.set( {endAngle: this.endAngle, counterclockwise: this.counterclockwise} );

			fabricCanvas.renderAll();
		}

	},

	complete : function () {

		if (this.status == 'radius') {

			fabricCanvas.remove(this.radiusLine);

			fabricCanvas.remove(this.textObj);

			fabricCanvas.renderAll();

			this.radius = Math.sqrt( Math.pow((this.rx-this.cx), 2) + Math.pow((this.ry-this.cy), 2) );

			this.startAngle = this._getAngle({ x:this.rx, y:this.ry });

			this.fabricObj = new fabric.Arc({
										left: this.cx,
										top: this.cy,
										radius: this.radius,
										startAngle: this.startAngle,
										endAngle: this.startAngle,
										counterclockwise: false,
										fill: '',
										stroke: 'black',
										originX: 'center',
								        originY: 'center',
								        selectable: false,
								        strokeDashArray: [6, 3]
									});

			fabricCanvas.add(this.fabricObj);

			this.status = 'end';

		} else if (this.status == 'end') {

			this.fabricObj.set({ strokeDashArray: [] });

			fabricCanvas.renderAll();

			drawings.push(this);
		}		
		
	},

	toSVG : function () {
	    return this.fabricObj.toSVG();
	},

	_getAngle : function (point) {
		var angleRequired = 0;

		// gets the actual angle from center
		if ((point.x-this.cx) == 0) {
			// handling special cases
			if (this.cy > point.y) { 
				angleRequired = Math.PI/2;
			} else if (this.cy < point.y) {
				angleRequired = -Math.PI/2;
			}
		} else {
			// in general cases
			angleRequired = Math.atan ((point.y-this.cy) / (point.x-this.cx));

			if ((this.cy < point.y) && (angleRequired < 0)) { // handle 2nd quadrant
				angleRequired = Math.PI - Math.abs(angleRequired);
			} else if ((this.cy > point.y) && (angleRequired > 0)) { // handle 3rd quadrant
				angleRequired = Math.PI + Math.abs(angleRequired);
			} else if ((this.cy > point.y) && (angleRequired < 0)) { // handle 4th quadrant
				angleRequired = 2*Math.PI - Math.abs(angleRequired);
			}
		}

		return angleRequired;
	}
}
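
As a side note, the quadrant handling in _getAngle could arguably be simplified with Math.atan2, which also covers the vertical special cases. A small, untested sketch that normalizes the result to the [0, 2π) range used above:

_getAngle : function (point) {
	// atan2 covers all four quadrants and the vertical cases directly
	var angle = Math.atan2(point.y - this.cy, point.x - this.cx);
	// map (-PI, PI] to [0, 2*PI) to match the range returned elsewhere
	return angle < 0 ? angle + 2 * Math.PI : angle;
}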

For handling mouse events, I’ve created the events JavaScript file. Depending on the mouse events, this file calls the methods belonging to the Compass object.

events.js


var fabricCanvas = new fabric.Canvas('sheet', { selection: false });
var selectedTool = '';
var toolState = '';
var toolPreviousState = '';
var instruction = $('#instructionText');

var currentTool = null;


var compassSettings = $('#compass_settings');
var compassSettingsState = $('input[name=compass-state]');

$('input[name=tool]').click(function() {
   $('input[name=tool]').removeClass('active_tool');
   $(this).addClass('active_tool');
});

fabricCanvas.on('mouse:down', function(e) {

	// Get mouse coordinates
	var mousePointer = getMousePointer(fabricCanvas, e);

	switch(selectedTool) {

		case 'compass' :

			switch(toolState) {
				case 'center' :
					
					currentTool = new Compass(mousePointer); 
					
					instruction.text('Select Radius Point');
					toolPreviousState = toolState;
					toolState = 'radius';

				break;
				case 'radius' :

					// do radius logic here
					currentTool.complete();

					// change to next
					// currentTool.addPoint(mousePointer);

					toolPreviousState = toolState;

					instruction.text('Select Ending Point');						
					toolState = 'end';
					

				break;
				case 'end' :

					// do end logic here
					currentTool.complete();

					instruction.text('Select Center Point');
					toolPreviousState = '';
					toolState = 'center';

				break;
			}

		break;		
	}
}, false);

fabricCanvas.on('mouse:move', function(e) {
	var mousePointer = getMousePointer(fabricCanvas, e);

	switch(selectedTool) {
		case 'compass' :
			switch(toolState) {
				case 'radius':
				case 'end' :
					currentTool.redraw(mousePointer);
				break;
			}

		break;
	}

}, false);

function getMousePointer (canvas, evt) {
    var mouse = canvas.getPointer(evt.e);
    var x = (mouse.x);
    var y = (mouse.y);
    return {
        x: x,
        y: y
    };
}


function addDistanceLabel (lineObj, start, end) {
	// change text label
 	var textX = start.x + ((end.x - start.x) / 2);
 	var textY = start.y + ((end.y - start.y) / 2);

 	var distance = Math.sqrt( Math.pow((end.x-start.x), 2) + Math.pow((end.y-start.y), 2) );
 	distance = (distance / 50.0).toFixed(1); // make it centimeters

 	lineObj.set( {left: textX, top: textY } );
 	lineObj.setText(distance + ' cm');
}


function initTool (toolName) {

	compassSettings.hide();

	switch (toolName) {
		case 'compass' :
			selectedTool = 'compass';
			toolState = 'center';
			instruction.text('Select Center Point');

			compassSettings.show();

			break;
	}
}

Finally, you need to include the dependencies in the index.html file, which is shown below.

index.html


<!DOCTYPE html>
<html>
	<head>
	    <title>Mathematical Constructions</title>
	    <meta charset="utf-8"/>
	</head>

	<body>
		<h3>Geometrical Construction Drawing</h3>
		<canvas id="sheet" style="left:10px;top:10px;bottom:10px; border:1px solid #000000;" height="550" width="1280"></canvas>
		<br/>
		<div id="instructionText">Click on Compass</div>
		<br/>
		<input type="button" name="tool" class="btn btn-default" onclick="initTool('compass')" value="Compass">

		<script type="text/javascript" src="./js/jquery-3.1.1.min.js"></script>
		<script type="text/javascript" src="./js/fabric.js"></script>
		<script type="text/javascript" src="./js/events.js"></script>

		<!-- Extended shapes -->
		<script type="text/javascript" src="./js/Arc.class.js"></script>

		<!-- Tools for drawing -->
		<script type="text/javascript" src="./js/Compass.js"></script>

		<style type="text/css">
			.active_tool {background-color: gray}
		</style>
	</body>
</html>

To use this, open the index.html file in a browser. Click the Compass button, then click the center point and the radius point respectively, and finally click the end point of the arc. It’ll draw arcs as shown below.

compass_1

compass_2

Conclusion

In this post I shared my experience of implementing a bi-directional geometric compass for a web-based environment. I hope this will be helpful to you.

Resources

[1] GeoGebra – https://www.geogebra.org/

[2] Geometra – https://sourceforge.net/projects/geometra/

[3] HTML5 Canvas – http://www.w3schools.com/html/html5_canvas.asp

[4] SVG Path representation for Arcs – https://developer.mozilla.org/en/docs/Web/SVG/Tutorial/Paths#Arcs


Configuring ownCloud with XAMPP in Ubuntu

xampp_owncloud

Introduction

ownCloud is a popular open-source, self-hosted file sync and sharing platform [1]. With ownCloud you can create a cloud storage of your own (a private cloud). The common way to do that is to install ownCloud on top of an existing web server stack. For this base infrastructure I’m using XAMPP [2]; please take note of the product versions below.

Prerequisites

  • Linux machine (I tested with Ubuntu 16.04)
  • XAMPP 7.0.8 (x64)
  • ownCloud 9.1.0

Note: I have experienced issues with XAMPP 7.0.9, which was missing the mod_ssl module [3]. Also, older ownCloud versions may not support the newer PHP version that comes with XAMPP.

Installation

Installing XAMPP is quite a simple task: you need to run the downloaded “.run” executable in sudo mode.


chmod +x xampp-executable-name.run

sudo ./xampp-executable-name.run

You will get a GUI to configure further settings. Once you have finished installing XAMPP, you can move on to installing ownCloud.
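
Before moving on, make sure the XAMPP stack (Apache and MySQL) is actually running. Assuming the default install location of /opt/lampp, it can be started with:

sudo /opt/lampp/lampp start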

Installing ownCloud wasn’t as easy as I thought. You’ll have to deal with several ownership and permission changes to directories.

First you need to download ownCloud and extract the zip file. Secondly, you need to put it into the “htdocs” folder of XAMPP.

Then use the following in a terminal to switch to super-user mode and create the “data” folder inside the owncloud folder you copied. Thereafter, give all privileges to the “data” folder with “chmod”:


sudo su

cd owncloud/

mkdir data

chmod 777 data/

Now you can access the “http://localhost/owncloud” URL from a browser and configure the admin account.

Once you have created the admin account, you need to make sure that the data folder has the right permissions. For that you need to change the user, user-group and permissions of the “data” folder:


chown daemon:daemon -R data/

chmod 770 -R data/

After completion, you’ll be able to see the welcome screen of ownCloud when navigating to the “http://localhost/owncloud” URL.

References

[1] ownCloud – https://owncloud.org/

[2] XAMPP – https://www.apachefriends.org/index.html

[3] XAMPP issue – http://unix.stackexchange.com/questions/307364/xampp-apache-server-wont-start

Experience with Azure Big Data Analyzing

Introduction

This experiment was done in order to gain a first understanding of how to deal with big data on the cloud, so this was my first experience with a lot of new technologies such as Hadoop, Hive etc. Before the experiment I tried a few examples with Hadoop and other related technologies, and found that the approach described here would be a better way to go.

Use case

This experiment is based on a dataset of annual salaries for different people, in this case from San Francisco, California, USA in 2015. The dataset contains names, job titles, base salary, additional benefits and total salary. I’ll concentrate on job titles and their averaged total salaries.

Please note that this is an experiment whose results depend solely on the dataset; actual survey reports may contain values different from the final results.

Prerequisites

Before moving to the rest of the article, you are expected to have an understanding of the following:

  • An understanding of the Microsoft Azure platform is necessary. You can create a free Azure account and play around with the portal. You also need to understand concepts such as Resources, Resource Groups etc.
  • You need some understanding of big data concepts such as what big data is, Hadoop DFS, the Hadoop ecosystem, Hive, SQL etc.
  • It is better to have done some tutorials from the Azure documentation, especially “Analyze flight delay data”, which laid the foundation for this article.
  • An understanding of HTML, PHP and JS with AJAX is required to follow the visualization part.
  • Familiarity with tools like PuTTY and WinSCP if you are on Windows, or with the scp and ssh commands otherwise.

Planning Deployment

The deployment required for this article can be shown as follows.

Azure Deployment

Please note that I shut down the HDInsight cluster once it has completed its job; otherwise you’ll keep losing your credits!

The diagram shows the different steps needed in this experiment.

  • First you need a dataset to examine.
  • Then you have to transfer it to the Hadoop cluster, and into the Hadoop Distributed File System (HDFS) from there.
  • Thereafter you run the required Hive queries to extract the essence of the dataset. After processing, we move that data to a SQL database for ease of access.
  • Finally you develop a PHP application, running on an App Service node, to visualize the results.

Preparing dataset

You can download the San Francisco annual income dataset from the following location:

http://transparentcalifornia.com/salaries/2015/san-francisco/

After downloading, open the file using Excel. I have observed that several job titles contain commas in the text, so use find & replace to replace those commas with a hyphen character. Since this is a CSV file, such commas would negatively affect our analysis.

Put the CSV inside a zip archive to minimize the data transfer, for example as shown below.
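
For example, on Ubuntu the archive can be created with the zip utility (the file name here is just a placeholder):

zip FILENAME.zip FILENAME.csv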

Setting up Azure resources

Now you need to create the required Azure resources, starting with HDInsight. In this case we need to create a Hadoop on Linux cluster with 2 worker nodes. The following guide will help you create such a cluster quickly:

https://azure.microsoft.com/en-in/documentation/articles/hdinsight-hadoop-linux-tutorial-get-started/

For further information about cluster types and concepts, you may look at the following link:

https://azure.microsoft.com/en-in/documentation/articles/hdinsight-hadoop-provision-linux-clusters/

Unfortunately, Azure still has no way to deactivate an HDInsight cluster when it is idle. You need to delete it manually; otherwise you’ll be charged for the idle hours too.

Thereafter you need to create the SQL database. The following tutorial will help with that:

https://azure.microsoft.com/en-us/documentation/articles/sql-database-get-started/#create-a-new-azure-sql-database (Create a new Azure SQL database section)

Finally, you need to create an empty App Service. For further information about App Service you may refer to the following:

https://azure.microsoft.com/en-us/documentation/articles/app-service-web-overview/

This App Service provides the PHP runtime, which will be needed in the last part of the article.

A best practice when creating the above resources is to allocate them all to a single resource group, which makes them easier to manage.

Also make sure to use strong passwords and remember them.

Executing process

First you need to transfer the zip file you created to the Hadoop node’s file system. You can do that using the scp command or any GUI tool that supports scp.

As the host, you need to specify “CLUSTERNAME-ssh.azurehdinsight.net”. Along with that, you need to provide the SSH credentials.


scp FILENAME.zip USERNAME@CLUSTERNAME-ssh.azurehdinsight.net:

Then you need to access that node using SSH. On Windows you can use the PuTTY tool; others may use the terminal.


ssh USERNAME@CLUSTERNAME-ssh.azurehdinsight.net

Unzip the uploaded file:


unzip FILENAME.zip

Next, you have to move the CSV file to the Hadoop Distributed File System. Use the following commands to create a new directory in HDFS and put the CSV file there:


hdfs dfs -mkdir -p /sf/salary/data

hdfs dfs -put FILENAME.csv /sf/salary/data

Now you need to create the Hive queries and execute them. You can execute Hive queries from a file or in an interactive manner. We’ll do both, starting with the file.

Create the following “salaries.hql” using nano

nano salaries.hql

and add the following queries:


DROP TABLE salaries_raw;
-- Creates an external table over the csv file
CREATE EXTERNAL TABLE salaries_raw (
 EMPLOYEE_NAME string,
 JOB_TITLE string,
 BASE_PAY float,
 OVERTIME_PAY float,
 OTHER_PAY float,
 BENEFITS float,
 TOTAL_PAY float,
 TOTAL_PAY_N_BENEFITS float,
 YEAR string,
 NOTES string,
 AGENCY string,
 STATUS string)
-- The following lines describe the format and location of the file
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/sf/salary/data';

-- Drop the salaries table if it exists
DROP TABLE salaries;
-- Create the salaries table and populate it with data
-- pulled in from the CSV file (via the external table defined previously)
CREATE TABLE salaries AS
SELECT
 JOB_TITLE AS job_title,
 BASE_PAY AS base_pay,
 TOTAL_PAY AS total_pay,
 TOTAL_PAY_N_BENEFITS AS total_pay_n_benefits
FROM salaries_raw;

You can also locally create “salaries.hql” and upload via SCP.

The queries are mostly self-explanatory; note that each query ends with a semicolon. The “salaries_raw” table is created to extract values directly from the CSV, so the first query has a one-to-one mapping with the CSV columns, and the table’s data is taken from the HDFS location where we stored the CSV file. Thereafter the “salaries” table is created from the “salaries_raw” table; it keeps only the job_title, base_pay, total_pay and total_pay_n_benefits columns, because only those are necessary for the next level.

To execute the Hive script, use the following command:


beeline -u 'jdbc:hive2://localhost:10001/;transportMode=http' -n admin -f salaries.hql

The next part of the Hive processing we’ll do in an interactive manner. You can open the interactive shell with this command:


beeline -u 'jdbc:hive2://localhost:10001/;transportMode=http' -n admin

and enter the following query:


INSERT OVERWRITE DIRECTORY '/sf/salary/output'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT
job_title,
avg(base_pay),
avg(total_pay),
avg(total_pay_n_benefits)
FROM salaries
WHERE base_pay IS NOT NULL AND total_pay IS NOT NULL AND total_pay_n_benefits IS NOT NULL
GROUP BY job_title;

The above Hive query writes the result to the “/sf/salary/output” folder. It groups by job title and computes the average values of the base_pay, total_pay and total_pay_n_benefits columns.

Use the “!quit” command to exit from the interactive shell.

At this stage, we have successfully extracted the essence of the dataset. Next, we need to make it ready for presentation.

For presentation, we’re going to copy the output data to the SQL database we created.

To create a table and interact with the SQL database in other ways, we need to install FreeTDS on the Hadoop node. Use the following commands to install it and verify the connectivity:


sudo apt-get --assume-yes install freetds-dev freetds-bin

TDSVER=8.0 tsql -H <serverName>.database.windows.net -U <adminLogin> -P <adminPassword> -p 1433 -D <databaseName>

Once you execute the last command, you’ll be taken to another interactive shell where you can interact with the database you created earlier.

Use the following commands to create a table to hold the output we got:


CREATE TABLE [dbo].[salaries](
[job_title] [nvarchar](50) NOT NULL,
[base_pay] float,
[total_pay] float,
[total_pay_n_benefits] float,
CONSTRAINT [PK_delays] PRIMARY KEY CLUSTERED
([job_title] ASC))
GO

Use the “exit” command to exit from the SQL interactive session.

To move the data from HDFS to the SQL database, we are going to use Sqoop. The following Sqoop command will export the output data in HDFS to the SQL database:


sqoop export --connect 'jdbc:sqlserver://<serverName>.database.windows.net:1433;database=<databaseName>' --username <adminLogin> --password <adminPassword> --table 'salaries' --export-dir 'wasbs:///sf/salary/output' --fields-terminated-by '\t' -m 1

Once the task has completed successfully, you can log in to the SQL interactive session again and execute the following to view the table contents:


SELECT * FROM salaries

GO

Finally, you need to use FTP to connect to the App Service node and upload the following PHP files (including the js folder):

https://github.com/Buddhima/AzureSFSalaryApp

You need to change the SQL server host and credentials in the “dbconnect.php” file. I’ll leave the rest of the code in the PHP files as self-explanatory. If you have successfully created the app, you should see something similar to the following:

azure_web_app


Conclusion

In this article I have shown how I did my first experiment with Azure big data analysis. Along the way I had to look at several other related technologies such as Spark and Azure Stream Analytics, each of which has its own pros and cons. For a case like analyzing annual incomes, it is generally accepted to use Hadoop along with Hive, but if you want to do more frequent analysis, you may look into those alternatives.

References

  1. Project sourcecode – https://github.com/Buddhima/AzureSFSalaryApp
  2. Get started with Hadoop in Windows – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-tutorial-get-started-windows
  3. Analyze Flight Delays with HD Insight – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-analyze-flight-delay-data-linux
  4. Use Hive and HiveQL with Hadoop in HDInsight – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-hive/

Architectural Constraints of REST

Introduction

The term REST (Representational State Transfer) was introduced back in the year 2000 by Roy Fielding in his doctoral dissertation at UC Irvine [1]. Though that is about 16 years ago, its importance is more or less the same today. REST introduces an architectural style for web applications.

REST is tightly coupled to HTTP and embraces its full potential, which is why it has become so popular in the web application domain. REST sees a web application as a virtual state machine in which the states are different web pages; the user progresses from one state to another by following links.

rest-state-machine
REST web application with state transitions

For a web application to be called RESTful, it needs to comply with certain constraints. Following those constraints also helps the web application achieve desirable non-functional properties.

Architectural Constraints

Decoupled client-server interaction

There should be a uniform interface separating the client and the server, which helps to achieve separation of concerns. The client does not need to worry about data storage, which keeps the client portable; on the other hand, the server does not keep user state, so the server side can be scaled. In a RESTful architecture, the server and clients can be developed separately using different technologies.

Stateless

Going further with the client-server architecture, the server should not keep client context between requests. Each request from the client should include all the information necessary to serve that request, and session state is held on the client side. Once session data is received, a service can transfer it to another service.

Each application state contains the links from which the client can choose the next state transition.

Layered

The client may connect through intermediate layers, which it might not be aware of beforehand. Those layers can enforce security, do load balancing, provide caching and so on.

Cacheable

Responses from the server can be cached at the client side and at intermediaries, as defined by the HTTP response. The Cache-Control header plays a key role in that behavior [2].

Extensible through code on demand

This is an optional constraint. The server can send executable code to the client side to customize the functionality of the client; client-side scripts such as JavaScript are an example.

Uniform Interface

This is a basic requirement of a RESTful service and helps to achieve decoupled client-server interaction.

  • Identification of resources

The server uses the request to identify individual resources (e.g. via URIs), and returns a representation of the resource to the client (as HTML, XML or JSON).

  • Manipulate the resource through its representation

The representation sent by the server is enough for the client to manipulate (update or delete) that resource.

  • Self-descriptive messages

A message contains adequate information on how to process that message.

  • Hypermedia as the engine of application state (HATEOAS)

“Hypermedia, an extension of the term hypertext, is a nonlinear medium of information which includes graphics, audio, video, plain text and hyperlinks” [3]

So clients can only make transitions through actions that are dynamically identified within the hypermedia returned by the server. There is no single standard format for representing links in hypermedia, but there are a few popular ones [4].
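
For example, a hypothetical JSON response for an order resource might embed the links that the client can follow next; the URIs and relation names here are made up purely for illustration:

{
  "orderId": 42,
  "status": "PROCESSING",
  "links": [
    { "rel": "self", "href": "/orders/42" },
    { "rel": "cancel", "href": "/orders/42/cancel" },
    { "rel": "payment", "href": "/orders/42/payment" }
  ]
}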

Architectural Properties

These constraints help you to achieve the following properties:

  • Performance
  • Scalability
  • Simplicity of interfaces
  • Modifiability
  • Visibility
  • Portability
  • Reliability

Web service APIs that comply with REST architectural constraints are called RESTful APIs [5].

References

[1] http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm

[2] http://www.tutorialspoint.com/restful/restful_caching.htm

[3] https://en.wikipedia.org/wiki/Hypermedia

[4] http://sookocheff.com/post/api/on-choosing-a-hypermedia-format/

[5] http://restfulapi.net/

CAP Theorem

This theorem was presented as a conjecture by Prof. Eric Brewer in his PODC (Principles of Distributed Computing) 2000 keynote talk. The theorem states that it is not possible for a distributed computer system to guarantee all of the following 3 conditions at the same time:

  1. Consistency
  • All nodes see the same data at the same time
  2. Availability
  • Every request receives a response, whether it succeeded or failed
  3. Partition Tolerance
  • The system continues to work despite partitioning of the system

cap_venn

In 2002, Seth Gilbert and Nancy Lynch were able to prove the theorem.

Proving the CAP theorem

To prove the theorem, let’s assume that the theorem is false, which means there is a system in which Consistency, Availability and Partition Tolerance are all satisfied.

CAP Theorem - Writing

So let’s say the system has 2 nodes, N1 and N2.

Also, there is a value v0 in both the N1 and N2 nodes, which is the value the client is interested in.

Since the system can tolerate partitioning, a user can get the same service from either N1 or N2, and according to our assumption this holds even when the link between N1 and N2 is broken.

While the link is broken, the client updates the value v0 in N1 to v1.

Now the client expects to get the new value of ‘v’ from N2, but N2 still returns v0, because the link between N1 and N2 is broken and the update could not be propagated.

Since the client still gets v0 from N2, the system’s consistency is broken.

Therefore our assumption is wrong, and the CAP theorem is proven!

So there can only be systems that satisfy 2 of those 3 properties.

But what about a system that satisfies just Consistency and Availability, without Partition Tolerance?

Let’s get back to our previous system.

So the system is supposed to be Consistent and Available.

Again, the link between N1 and N2 breaks.

The client sends v2 to update the value of v in node N1.

What should the system do?

  1. Accept the request and update the value of ‘v’ in N1 – this leads to an inconsistent system
  2. Reject the client’s request – this leads to an unavailable system

So in practice there is little chance of having a system that satisfies only Consistency and Availability.

What is available out there are systems with:

AP – relax consistency, but not inconsistent

CP – sacrificed availability, but not unavailable.

So real world systems with AP and CP can offer a degree of consistency, availability and partition-tolerance.


PACELC

This was developed to give a more complete description of the space of potential trade-offs for a distributed system.

If there is a Partition (P):

  • the system trades off Availability (A) and Consistency (C)

Else (E):

  • the system trades off Latency (L) and Consistency (C)


Availability and Latency are arguably the same thing:

Unavailable → extremely high latency

PACELC Theorem - Writing
Example Systems:

PA/EL Systems – Dynamo, Cassandra, Riak

PC/EC Systems – BigTable, HBase, VoltDB/H-Store

PA/EC Systems – MongoDB

PC/EL Systems – Yahoo! PNUTS


References

[1] https://en.wikipedia.org/wiki/CAP_theorem

[2] http://mwhittaker.github.io/2014/08/16/illustrated-proof-cap-theorem/

[3] https://www.youtube.com/watch?v=hUd_9FENShA


JDBC Message Store for WSO2 ESB

Introduction

Message stores in WSO2 ESB are important for implementing scenarios like store-and-forward and reliable delivery. They can be used to persist messages during mediation or even afterwards. By default, Synapse uses in-memory message stores, which cannot persist messages across executions and also consume memory. JMS message stores can persist messages, but they need additional resources and are slower. A good alternative to both is a JDBC message store, which has been available from WSO2 ESB 4.9.0 onwards.

JDBC message store

This implementation follows the same message store interface that already exists in the ESB, and its design closely resembles that of the JMS message store implementation in WSO2 ESB. It uses a JDBC connector to connect to external relational databases (MySQL and H2 have been tested).

WSO2_jdbc_store_foward
JDBC Message Store for store and Forward Pattern


From the very beginning of the design phase, the JDBC message store focused on eliminating the difficulties faced when using JMS queues as message stores. The following aims are achieved by the JDBC store:

  • Easy to connect – it’s rather easy to connect to databases compared to JMS queues.
  • More operations on data – JMS queues are not supposed to support operations like randomly selecting messages, but with JDBC this becomes a reality, with operations very close to those of in-memory stores.
  • Fast transactions – in tests I have seen that JDBC stores are capable of handling about 2300 transactions per second, which is 10 times faster than the existing system.
  • Works with high capacity and for a long time – since JDBC stores use databases as the storage medium, they can depend on up to terabytes of data. It is also generally accepted that databases are capable of holding data for a long time compared to JMS queues.

After tests in different environments with different configurations, I have seen that the JDBC message store achieved more than was expected at the initial stages.

In this store implementation, the message is converted to a serializable Java object so that it can be stored as a blob.
Construction of the persisted message has two basic parts: the JDBC Axis2 message and the JDBC Synapse message. The combination of those two produces the Storable Message, which is sent to the database.

wso2_jdbc_store_message
Constructing a Storable Message

Other than the message-constructing classes, the following is an explanation of what the rest of the classes do:

  • JDBCMessageStore – provides the fundamental interface to external parties by encapsulating the underlying implementation. This class exposes store, poll, get, clear, peek and the other generic message-store methods to outside parties.
  • JDBCMessageStoreConstants – defines the constant values related to the JDBC message store. Gathering all the constants into a single place makes the JDBC store implementation easier to maintain.
  • JDBCConfiguration – provides the necessary utility functionality for JDBC operations. Basically it deals with creating and terminating connections, querying database tables etc.
  • JDBCMessageConverter – helps with converting SOAP messages into serializable Java objects, and with the reverse process after querying. It works as an adapter between the database and the ESB.
  • JDBCProducer – produces messages into the store; it is used by the store mediator.
  • JDBCConsumer – consumes messages from the message store; it is used by message processors.

Those classes, along with the classes mentioned previously, make up the JDBC message store.

Configuration of JDBC Message Store

To use the JDBC message store, the user has to add the required JDBC support (the database driver). Thereafter the following configuration allows any message processor to use the JDBC message store just like any other message store. The data source can be specified either inline or by pointing to an external datasource (which gives you additional control over the database).

<store messageStore="MyStore"/>

<messageStore class="org.apache.synapse.message.store.jdbc.JDBCMessageStore" name="MyStore">

(
<parameter name="store.jdbc.driver">com.mysql.jdbc.Driver</parameter>
<parameter name="store.jdbc.connection.url">jdbc:mysql://localhost:3306/mystore</parameter>
<parameter name="store.jdbc.username">root</parameter>
<parameter name="store.jdbc.password"></parameter>
<parameter name="store.jdbc.table">store_table</parameter>

|

<parameter name="store.jdbc.dsName">reportDB</parameter>
<parameter name="store.jdbc.table">store_table</parameter>

)

</messageStore>

In-lined Data Source

  • store.jdbc.driver – database driver class name
  • store.jdbc.connection.url – database URL
  • store.jdbc.username – user name for accessing the database
  • store.jdbc.password – password for accessing the database
  • store.jdbc.table – table name in the database

External Data Source

  • store.jdbc.dsName – the name of the datasource to be looked up
  • store.jdbc.table – table name in the database

Optionally:

  • store.jdbc.icClass – initial context factory class; the corresponding Java environment property is java.naming.factory.initial
  • store.jdbc.connection.url – the naming service provider URL; the corresponding Java environment property is java.naming.provider.url
  • store.jdbc.username – corresponds to the Java environment property java.naming.security.principal
  • store.jdbc.password – corresponds to the Java environment property java.naming.security.credentials

Database script

For creating the database table you can use the following scripts.

MySQL :
CREATE TABLE jdbc_store_table(
indexId BIGINT( 20 ) NOT NULL AUTO_INCREMENT ,
msg_id VARCHAR( 200 ) NOT NULL ,
message BLOB NOT NULL ,
PRIMARY KEY ( indexId )
)

H2 :
CREATE TABLE jdbc_store_table(
indexId BIGINT( 20 ) NOT NULL AUTO_INCREMENT ,
msg_id VARCHAR( 200 ) NOT NULL ,
message BLOB NOT NULL ,
PRIMARY KEY ( indexId )
)

You can create a similar SQL script according to your particular database.

Sample

First you need to put the relevant database driver into the repository/components/lib folder [2].
The following sample configuration is based on a MySQL database named “mystore” and a table named “store_table”.

<proxy xmlns="http://ws.apache.org/ns/synapse"
       name="MessageStoreProxy"
       transports="https http"
       startOnLoad="true"
       trace="disable">
   <description/>
   <target>
      <inSequence>
         <property name="FORCE_SC_ACCEPTED" value="true" scope="axis2"/>
         <property name="OUT_ONLY" value="true"/>
         <property name="target.endpoint" value="StockQuoteServiceEp"/>
         <store messageStore="MyStore"/>
      </inSequence>
   </target>
   <publishWSDL uri="http://localhost:9000/services/SimpleStockQuoteService?wsdl"/>
</proxy>

<messageStore xmlns="http://ws.apache.org/ns/synapse"
              class="org.apache.synapse.message.store.impl.jdbc.JDBCMessageStore"
              name="MyStore">
   <parameter name="store.jdbc.password"/>
   <parameter name="store.jdbc.username">root</parameter>
   <parameter name="store.jdbc.driver">com.mysql.jdbc.Driver</parameter>
   <parameter name="store.jdbc.table">store_table</parameter>
   <parameter name="store.jdbc.connection.url">jdbc:mysql://localhost:3306/mystore</parameter>
</messageStore>

<messageProcessor xmlns="http://ws.apache.org/ns/synapse"
                  class="org.apache.synapse.message.processor.impl.forwarder.ScheduledMessageForwardingProcessor"
                  name="ScheduledProcessor"
                  messageStore="MyStore">
   <parameter name="max.delivery.attempts">5</parameter>
   <parameter name="interval">10</parameter>
   <parameter name="is.active">true</parameter>
</messageProcessor>

The message processor can also be added via the management console through the following section:

wso2_message_store_screen
JDBC Message Processor


Use the Axis2 client as follows to send messages to the proxy:

ant stockquote -Daddurl=http://localhost:8280/services/MessageStoreProxy

Rough comparison between JMS and JDBC message store performance

For the testing, the selected database details are as follows:

Type : MySQL on XAMPP server for database (JMS is just for comparison)
Storage Engine : MyISAM
Machine Details : Ubuntu 14.04 on Intel® Core™ i7-4800MQ CPU @ 2.70GHz × 8 with 16 GB RAM
Messages sent via : JMeter client

| # of Threads | Messages/Thread | Total Messages | JMS Store Producer (throughput/sec) | JDBC Store Producer (throughput/sec) |
|---|---|---|---|---|
| 1 | 100 | 100 | 63.9 | 552.5 |
| 1 | 500 | 500 | 62.8 | 566.3 |
| 1 | 1000 | 1000 | 63.5 | 577.4 |
| 1 | 2000 | 2000 | 62.3 | 629.3 |
| 1 | 3000 | 3000 | 62.8 | 649.9 |
| 10 | 10 | 100 | 73.7 | 108.8 |
| 10 | 50 | 500 | 70.4 | 511.8 |
| 10 | 100 | 1000 | 70.7 | 928.5 |
| 10 | 200 | 2000 | 71.3 | 1537.3 |
| 10 | 300 | 3000 | 72.3 | 1827 |
| 50 | 10 | 500 | 71.6 | 494.1 |
| 50 | 100 | 5000 | 72.5 | 3494.1 |
| 100 | 250 | 25000 | 73.4 | 5529.8 |
| 500 | 100 | 50000 | – | 7055.2 |

(Note: for small numbers of messages, the JDBC figures had not reached a stable throughput.)

References:

[1] Pull request containing the implementation – https://github.com/wso2/wso2-synapse/pull/91

[2] MySQL connector – https://dev.mysql.com/downloads/connector/j/

[3] WSO2 ESB 4.9.0-ALPHA – https://github.com/wso2/product-esb/releases/tag/esb-parent-4.9.0-ALPHA