JSON Enrich Mediator for WSO2 ESB


JSON support for WSO2 ESB [1] was introduced some time back, but only a small number of mediators support manipulating JSON payloads. In this article I introduce a new mediator called JsonEnrichMediator [2], which works quite similarly to the existing Enrich mediator [3], but targets JSON payloads. The specialty of this mediator is that, since it works with the native JSON payload, the JSON payload will not be converted to an XML representation. Hence there won't be any data loss due to transformations.

Please note that this is a custom mediator I have created, and it is not shipped with the WSO2 ESB pack.

Configuring Mediator

  1. Clone the GitHub repository: https://github.com/Buddhima/JsonEnrichMediator
  2. Build the repository using Maven (mvn clean install)
  3. Copy the built artifact in the target folder to ESB_HOME/repository/components/dropins
  4. Download json-path-2.1.0.jar [5] and json-smart-2.2.1.jar [6] and put them into the same folder (dropins)
  5. Start WSO2 ESB (sh bin/wso2server.sh)

Sample Scenario

For this article I am using a sample scenario which moves a JSON property within the payload. For that, you need to add the following API to WSO2 ESB.

<api xmlns="http://ws.apache.org/ns/synapse" name="sampleApi" context="/sample">
    <resource methods="POST" uri-template="/*">
        <inSequence>
            <log level="full"/>
            <jsonEnrich>
                <source type="custom" clone="false" JSONPath="$.me.country"/>
                <target type="custom" action="put" JSONPath="$" property="country"/>
            </jsonEnrich>
            <respond/>
        </inSequence>
    </resource>
</api>

The above configuration takes the value pointed to by the JSONPath "$.me.country" and moves it to the main body. You can find further details about JSONPath at [4].
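To make the move concrete, here is a minimal, hypothetical sketch in plain Java of the transformation the mediator performs on this payload. It uses ordinary maps in place of the actual mediator and JSONPath machinery; the class and method names are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: mimics moving $.me.country to $ as "country",
// the way the source/target configuration above describes.
public class EnrichSketch {

    @SuppressWarnings("unchecked")
    static void moveCountry(Map<String, Object> payload) {
        Map<String, Object> me = (Map<String, Object>) payload.get("me");
        Object value = me.remove("country"); // source: $.me.country; clone="false" removes it
        payload.put("country", value);       // target: put at $ under property "country"
    }

    public static void main(String[] args) {
        Map<String, Object> me = new HashMap<>();
        me.put("country", "Sri Lanka");
        me.put("language", "Sinhala");
        Map<String, Object> payload = new HashMap<>();
        payload.put("me", me);

        moveCountry(payload);
        System.out.println(payload);
    }
}
```

After the call, "country" sits at the top level of the payload and is gone from "me", which is exactly the effect the mediator configuration aims for.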

Once the API is deployed, you need to send the following message to the ESB.

curl -H "Content-Type: application/json" \
-X POST -d '{
  "me": {
    "country": "Sri Lanka",
    "language": "Sinhala"
  }
}' http://localhost:8280/sample

The output from the ESB should look like below:

{
  "me": {
    "language": "Sinhala"
  },
  "country": "Sri Lanka"
}


I have shown a simple use case of the JSON Enrich Mediator. You can find comprehensive documentation at the code repository [2].


[1] WSO2 ESB JSON support : https://docs.wso2.com/display/ESB500/JSON+Support

[2] Code Repository for JSON Enrich Mediator : https://github.com/Buddhima/JsonEnrichMediator

[3] WSO2 ESB Enrich Mediator : https://docs.wso2.com/display/ESB500/Enrich+Mediator

[4] JSON Path documentation : https://github.com/jayway/JsonPath/blob/json-path-2.1.0/README.md

[5] json-path-2.1.0 : https://mvnrepository.com/artifact/com.jayway.jsonpath/json-path/2.1.0

[6] json-smart-2.2.1 : https://mvnrepository.com/artifact/net.minidev/json-smart/2.2.1

WSO2 ESB Endpoint Error Handling


WSO2 ESB can be used as an intermediary component to connect different systems. When connecting those systems, their availability is a common issue. Therefore the ESB has to handle such undesirable situations carefully and take relevant actions. To cater to that requirement, the outbound endpoints of WSO2 ESB can be configured. In this article I discuss two common ways of configuring endpoints.

Two common approaches to configure endpoints are:

  1. Configure with just a timeout (without suspending endpoint)
  2. Configure with a suspend state

Configure with just a timeout

This would be suitable if endpoint failures are not very frequent.

Sample Configuration:

<endpoint name="SimpleTimeoutEP">
    <address uri="http://localhost:9000/StockquoteService">
        <!-- sample values for illustration -->
        <timeout>
            <duration>120000</duration>
            <responseAction>fault</responseAction>
        </timeout>
        <suspendOnFailure>
            <initialDuration>-1</initialDuration>
            <progressionFactor>1.0</progressionFactor>
            <maximumDuration>0</maximumDuration>
        </suspendOnFailure>
        <markForSuspension>
            <retriesBeforeSuspension>0</retriesBeforeSuspension>
        </markForSuspension>
    </address>
</endpoint>


In this case we only focus on the timeout of the endpoint. The endpoint stays Active forever. If a response is not received within duration, the responseAction is triggered.

duration – in milliseconds

responseAction – when a response arrives for a timed-out message, one of the following actions is triggered.

  • fault – calls the fault-sequence associated
  • discard – discards the response
  • none – will not take any specific action on response (default action)

The rest of the configuration prevents the endpoint from going into the suspended state.

If you specify responseAction as "fault", you can define a customized way of informing the client of the failure in the fault-handling sequence, or store the message and retry later.

Configure with a suspend state

This approach is useful when connection failures are frequent. By suspending the endpoint, the ESB can save resources without unnecessarily waiting for responses.

In this case the endpoint goes through state transitions. The theory behind this behavior is the circuit-breaker pattern. Following are the three states:

  1. Active – Endpoint sends all requests to backend service
  2. Timeout – Endpoint starts counting failures
  3. Suspend – Endpoint limits sending requests to backend service

Sample Configuration:

<endpoint name="Suspending_EP">
    <address uri="http://localhost:9000/StockquoteService">
        <!-- sample values for illustration -->
        <markForSuspension>
            <errorCodes>101504, 101505</errorCodes>
            <retriesBeforeSuspension>3</retriesBeforeSuspension>
            <retryDelay>1</retryDelay>
        </markForSuspension>
        <suspendOnFailure>
            <errorCodes>101500, 101501, 101506, 101507, 101508</errorCodes>
            <initialDuration>1000</initialDuration>
            <progressionFactor>2</progressionFactor>
            <maximumDuration>60000</maximumDuration>
        </suspendOnFailure>
    </address>
</endpoint>


In the above configuration:

If the endpoint returns error codes 101504 or 101505, it is moved from the active state to the timeout state.

When the endpoint is in the timeout state, it makes 3 retry attempts with a 1 millisecond delay between them.

If all those retry attempts fail, the endpoint will move to the suspended state. If a retry succeeds, the endpoint will move back to the active state.

If an active endpoint receives error codes 101500, 101501, 101506, 101507 or 101508, it will move directly to the suspended state.

After the endpoint moves to the suspended state, it waits for initialDuration before attempting any further requests. Thereafter it determines the time period between requests according to the following equation:

Min(current suspension duration * progressionFactor, maximumDuration)

In the equation, "current suspension duration" is updated on each reattempt.
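As a quick sketch of this back-off progression, the equation can be computed like this. The concrete values for initialDuration, progressionFactor and maximumDuration below are assumptions for illustration, not ESB defaults:

```java
// Sketch of the suspension back-off:
// next duration = Min(current suspension duration * progressionFactor, maximumDuration)
public class SuspensionBackoff {

    static long next(long currentMs, double progressionFactor, long maximumDurationMs) {
        return Math.min((long) (currentMs * progressionFactor), maximumDurationMs);
    }

    public static void main(String[] args) {
        long duration = 1000;   // initialDuration (ms), assumed
        double factor = 2.0;    // progressionFactor, assumed
        long maximum = 60000;   // maximumDuration (ms), assumed

        // Successive retry gaps grow geometrically until capped by maximumDuration
        for (int attempt = 1; attempt <= 8; attempt++) {
            System.out.println("attempt " + attempt + ": wait " + duration + " ms");
            duration = next(duration, factor, maximum);
        }
    }
}
```

With these assumed values the gaps run 1000, 2000, 4000 ms and so on, flattening out once the cap of 60000 ms is reached.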

Once the endpoint succeeds in getting a response to a request, it will go back to the active state.

If the endpoint gets any other error code (e.g. 101503), it will not make any state transition, and remains in the active state.


In this article I have shown two basic configurations that are useful for configuring endpoints of WSO2 ESB. You can refer to the WSO2 ESB documentation for implementing more complex patterns with endpoints.


WSO2 ESB Documentation: https://docs.wso2.com/display/ESB500/Endpoint+Error+Handling#EndpointErrorHandling-timeoutSettings

Timeout and Circuit Breaker Pattern in WSO2 Way: http://ssagara.blogspot.com/2015/05/timeout-and-circuit-breaker-pattern-in.html

Endpoint Error Codes: https://docs.wso2.com/display/ESB500/Error+Handling#ErrorHandling-codes

Endpoint Error Handling: http://wso2.com/library/articles/wso2-enterprise-service-bus-endpoint-error-handling/

Classic Mistakes in Software Development


Software development is a complicated activity. Hence the people working on a project can make mistakes that affect it. Researchers have reviewed a number of software projects and identified a set of mistakes that can be seen across projects. They note that those mistakes might not be the only causes of slow development: to slip a project into slow development, all you need to do is make one big mistake. However, to achieve efficient development, you need to avoid all of them.

The set of mistakes that researchers have identified is known as the "Classic Mistakes". These bad practices have been chosen so often, by so many people, that their bad results on the development of a project are predictable.

Four categories of classic mistakes:

  1. People related
  2. Process related
  3. Product related
  4. Technology related

People related classic mistakes

These mistakes concern the people working in the team. They directly affect development speed, so it is crucial to rectify them.

Undermined motivation – Studies have shown that giving questionable motivational talks at the beginning, or asking people to work overtime, reduces their motivation. Sometimes team leaders take long vacations while the team is working overnight. The researchers highlight that a team lead working along with the other team members is a positive motivation.

Weak personnel – If a team needs efficient development throughout the project, recruitment needs to hire talented developers, and to carefully select people who can carry most of the work through to the end of the project.

Uncontrolled problem employees – Failure to take action over problems with team members and team leads will eventually affect the development speed. Higher management should actively look into such problems and sort them out.

Heroics – Heroics within the team increase risk and discourage cooperation among the other members of the team.

Adding people to a late project – Adding new people when the project is behind schedule can take more productivity away from the existing team members than it adds.

Noisy, crowded offices

Friction between developers and customers – This calls for increased communication between customers and developers.

Unrealistic expectations – Setting deadlines early without any proper reasoning can lengthen the development schedule.

Process related classic mistakes

These mistakes concern issues that may arise in management and technical methodologies.

Overly optimistic schedules – This sort of scheduling results in failure by under-scoping the project, and hurts the long-term morale and productivity of the developers.

Insufficient risk management – If project risks are not actively managed, the project will slip into slow-development mode.

Contractor failure – A weak relationship with contractors can slow down the project.

Insufficient planning

Short-changed upstream activities – Starting to code without properly designed project plans costs 10 to 100 times more than doing it with properly designed plans.

Short-changed quality assurance – Eliminating design and code reviews, eliminating test planning, and doing only perfunctory testing will slow the development of the project and end up producing major bugs.

Omitting necessary tasks from estimates – People forget about the less visible tasks, and those tasks add up.

Code-like-hell programming – Developers should be sufficiently motivated rather than forced to work hard.

Product related classic mistakes

These mistakes concern characteristics of the product that can affect the outcome of the project.

Requirements gold-plating – Avoid including more requirements than are really necessary, and pay less attention to overly complex features.

Feature creep – On average, about 25% of requirements change during a project, affecting its schedule.

Developer gold-plating – Developers frequently attempt to try new technologies they saw in other projects, even when they are not actually necessary.

Technology related classic mistakes

These mistakes are about the technologies used during the project.

Silver-bullet syndrome – Thinking that a certain approach, one the developers have not used before, will solve every issue (e.g. object-oriented design).

Overestimated savings from new tools or methods – New practices introduce new risk, as the team has to go through a learning curve to become familiar with them.

Switching tools in the middle of a project – Using new tools adds a learning curve, rework and inevitable mistakes to the project schedule.

Lack of automated source-code control – If two or more developers are working on the same part of the project, it is necessary to adhere to source-code control practices. If not, developers have to spend time resolving conflicting changes.


In this article I have mentioned several mistakes that can be made by people during a project's lifetime. There could be many other mistakes which can slow down a project, but at the very least you should avoid these well-known classic mistakes.

Java Virtual Machine


In this post I thought of discussing how the underlying components work together for a successful execution of a Java program. The content consists of a collection of knowledge I gathered after going through several articles on this topic.

Most of us have never bothered about the internals of Java, because IDEs have made our lives a lot easier. But it is worth having some idea about those internals in case you happen to work with an enterprise-level application and face memory issues. In this article, I go from basic concepts to an intermediate level.


In brief, "JVM (Java Virtual Machine) is an abstract machine. It is a specification that provides a runtime environment in which Java bytecode can be executed." This implies the JVM is merely a concept. I'll come back to the JVM later to discuss more.

JRE (Java Runtime Environment) provides the runtime environment (an implementation of the JVM). The JRE contains a concrete implementation of the JVM (e.g. HotSpot JVM, JRockit, IBM J9), the set of libraries and other files needed by the JVM. Different vendors release their own JRE, based on a reference.

The JDK is for developers to create Java programs. The JDK consists of the JRE plus development tools such as 'javac'.

One important thing to remember is that the JVM, JRE and JDK are all platform-dependent.

JVM, JRE, JDK (source: javatpoint.com)


JVM in Detail

It is said that the JVM is:

  1. A specification, where the working of the Java Virtual Machine is specified. The implementation provider is free to choose its own algorithms; implementations have been provided by Sun and other companies.
  2. An implementation, known as a JRE (Java Runtime Environment).
  3. A runtime instance: whenever you run a Java program, an instance of the JVM is created.

JVM is responsible for following operations:

  • Loads code
  • Verifies code
  • Executes code
  • Provides runtime environment

Internal Architecture of JVM

(source: javatpoint.com)
  1. Classloader : The classloader is used to load class files. The classloader embedded in the JVM is also called the "primordial class loader". Depending on the class name, the classloader can search for the .class file in the directory structure. Users can also define their own classloaders (called "non-primordial class loaders") if required.
  2. Method (Class) Area : Also called the non-heap area, it has 2 subsections. "Permanent Generation – This area stores class related data from class definitions, structures, methods, field, method (data and code) and constants. Can be regulated using -XX:PermSize and -XX:MaxPermSize. It can cause java.lang.OutOfMemoryError: PermGen space if it runs out of space". This OutOfMemoryError happens when class definitions accumulate. The other section, the Code Cache, is used by the JIT to store compiled code (hardware-specific native code).
  3. Heap : The area allocated for runtime data. This area is shared by all threads. We can use the -Xms and -Xmx JVM options to tune the heap size. Most of the time, the "java.lang.OutOfMemoryError" error occurs because the heap is exhausted. The heap consists of 3 sub-areas:
    1. Eden (Young) “New object or the ones with short life expectancy exist in this area and it is regulated using the -XX:NewSize and -XX:MaxNewSize parameters. GC (garbage collector) minor sweeps this space”
    2. Survivor – “The objects which are still being referenced manage to survive garbage collection in the Eden space end up in this area. This is regulated via the -XX:SurvivorRatio JVM option”
    3. Old (Tenured) – "This is for objects which survive long garbage collections in both the Eden and Survivor space (due to long-time references of course). A special garbage collector takes care of this space. Object de-allocation in the tenured space is taken care of by GC major"

      Analysis of "java.lang.OutOfMemoryError" can be done by taking a heap dump when the incident occurs. You can refer to a case study of such analysis for a real-world case here: http://javaeesupportpatterns.blogspot.com/2011/11/hprof-memory-leak-analysis-tutorial.html

  4. Stack : This area is specific to a thread. Each thread has its own stack, which is used to store local variables and regulates method invocation, partial results and return values. This space can be tuned with the -Xss JVM option.
  5. Program Counter Register : It contains the address of the Java virtual machine instruction currently being executed.
  6. Native Stack : Used for non-Java (native) code; allocated per thread.
  7. Execution Engine : This contains 3 parts;
    1. A virtual processor
    2. Interpreter, which reads the bytecode and execute the instructions
    3. Just-In-Time (JIT) compiler: "It is used to improve the performance. JIT compiles parts of the byte code that have similar functionality at the same time, and hence reduces the amount of time needed for compilation. Here the term 'compiler' refers to a translator from the instruction set of a Java virtual machine (JVM) to the instruction set of a specific CPU."
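The heap limits discussed in item 3 can also be observed from inside a running program. Here is a small sketch using the standard Runtime API; note that this only reads the current limits, while the -Xms/-Xmx options shown earlier are what actually set them:

```java
// Reads the heap limits of the running JVM via the standard Runtime API.
public class HeapInfo {

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();     // upper bound, controlled by -Xmx (bytes)
        long total = rt.totalMemory(); // heap currently reserved from the OS
        long free = rt.freeMemory();   // unused part of the reserved heap

        System.out.println("max   = " + max + " bytes");
        System.out.println("total = " + total + " bytes");
        System.out.println("used  = " + (total - free) + " bytes");
    }
}
```

Watching "used" grow towards "max" under load is often the first hint of an impending java.lang.OutOfMemoryError.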


In this post I wanted to gather a collection of knowledge I found related to the JVM in various articles. All those articles are mentioned in the References section below. I would like to thank all the original authors, and I hope you will get some knowledge out of this as well.


  1. JVM internals tutorial – http://www.javatpoint.com/internal-details-of-jvm
  2. About JVM Memory – https://abhirockzz.wordpress.com/2014/09/06/jvm-permgen-where-art-thou/
  3. Understanding Java PermGen – http://www.integratingstuff.com/2011/07/24/understanding-and-avoiding-the-java-permgen-space-error/
  4. Java HotSpot VM Options – http://www.oracle.com/technetwork/articles/java/vmoptions-jsp-140102.html
  5. Java Classloaders – http://www.javaworld.com/article/2077260/learn-java/learn-java-the-basics-of-java-class-loaders.html
  6. Heap Dump Analysis with VisualVM – http://www.javaworld.com/article/2072864/heap-dump-and-analysis-with-visualvm.html
  7. Real-world Example for Heap Dump Analysis with MAT – http://javaeesupportpatterns.blogspot.com/2011/11/hprof-memory-leak-analysis-tutorial.html

Creating Geometry Compass Using JavaScript


A few weeks back, I was asked to look around for existing geometry drawing tools. While searching, I found a number of dynamic geometric construction tools which allow drawing. Most of those projects are desktop-based, and a few are web-based. Most of the web-based ones reuse existing platforms such as GeoGebra.

In my research, I was focusing on ruler and compass tools which could be reused. Unfortunately I didn't come across a web-based tool which I could reuse. There was some good stuff, but it was not reusable. Since most of the other web-based systems reuse GeoGebra, they have the same limitations that GeoGebra has. Almost all reusable web-based tools have an issue with their compass, which is not user-friendly.


The geometry compass in most web-based tools allows you to draw arcs in only one direction, mostly the anti-clockwise direction. These uni-directional compass tools were annoying for me while drawing constructions. Searching further, I found that Geometra (a desktop-based tool) provides a reasonable solution for that. The compass in Geometra allows drawing arcs in both directions, but only small arcs (less than 180 degrees). In my implementation, I thought of bringing Geometra's approach to a web-based environment.


For this implementation I used the HTML5 canvas and JavaScript. To make the implementation easier I added the fabric.js library. Though fabric.js makes it easy to use the HTML5 canvas, it lacks a representation for arcs. Therefore I extended its Circle class to cater to the requirement. Furthermore, I used a Compass class to handle the properties of the compass, and events.js for handling mouse events.


In this section, I'll go through the files I mentioned earlier. The "Arc.class.js" file extends the Circle class of fabric.js. I have overridden the initialize, _render and toSVG methods to suit the requirement.


fabric.Arc = fabric.util.createClass(fabric.Circle, {
	type: 'arc',

	counterclockwise: false,

	initialize: function (options) {
		this.counterclockwise = options.counterclockwise;
		this.callSuper('initialize', options);
	},

	_render: function (ctx, noTransform) {
		ctx.arc(noTransform ? this.left + this.radius : 0,
		      noTransform ? this.top + this.radius : 0,
		      this.radius, this.startAngle,
		      this.endAngle, this.counterclockwise);
		// paint the path using fabric's fill/stroke helpers
		this._renderFill(ctx);
		this._renderStroke(ctx);
	},

    toSVG: function(reviver) {
    	var markup = [];

		var rx = this.left + this.radius * Math.cos(this.startAngle);
		var ry = this.top + this.radius * Math.sin(this.startAngle);

		var ex = this.left + this.radius * Math.cos(this.endAngle);
		var ey = this.top + this.radius * Math.sin(this.endAngle);

		var svgPath = '';
	    if (!this.counterclockwise) {
	    	svgPath += '<path d="M'+rx+','+ry+' A'+this.radius+','+this.radius+' 0 0,1 '+ex+','+ey+'" style="'+this.getSvgStyles()+'"/>';
	    } else {
	    	// Exchange starting and ending points when it's counterclockwise
	    	svgPath += '<path d="M'+ex+','+ey+' A'+this.radius+','+this.radius+' 0 0,1 '+rx+','+ry+'" style="'+this.getSvgStyles()+'"/>';
	    }
	    markup.push(svgPath);

    	return reviver ? reviver(markup.join('')) : markup.join('');
    }
});

The above Arc class provides generic support for drawing arcs, similar to the Line and Circle classes that already come with fabric.js. To use the Arc class, I created the compass JavaScript file. It has 3 public methods: redraw, which handles mouse movements while drawing is in progress; complete, which concludes the current drawing step; and toSVG, which gives out the SVG representation of the arc drawn.


function Compass (mouseStart) {
	// 'c' for center, 'r' for radius, 'e' for end
	this.cx = this.rx = this.ex = mouseStart.x;
	this.cy = this.ry = this.ey = mouseStart.y;
	this.radius = 0;

	var points = [this.cx, this.cy, this.rx, this.ry];

	this.radiusLine = new fabric.Line(points, {
	    strokeWidth: 2,
	    fill: 'black',
	    stroke: 'black',
	    strokeDashArray: [6, 3],
	    selectable: false
	});

	this.textObj = new fabric.Text('0', {
	    fontFamily: 'Times_New_Roman',
	    left: this.cx,
	    top: this.cy,
	    fontSize: 20,
	    originX: 'center'
	});

	this.status = 'radius';
}

Compass.prototype = {
	constructor : Compass,

	redraw : function (mouse) {

		if (this.status == 'radius') {
			this.rx = mouse.x;
			this.ry = mouse.y;

			this.radiusLine.set({ x2: this.rx, y2: this.ry });

			addDistanceLabel(this.textObj, {x: this.cx, y: this.cy}, {x: this.rx, y: this.ry});

		} else if (this.status == 'end') {
			this.ex = mouse.x;
			this.ey = mouse.y;

			this.endAngle = this._getAngle({ x: this.ex, y: this.ey });

			var angleDiff = this.endAngle - this.startAngle;

			if ((-Math.PI * 2 < angleDiff) && (angleDiff < -Math.PI)) {
				this.counterclockwise = false;
			} else if ((-Math.PI < angleDiff) && (angleDiff < 0)) {
				this.counterclockwise = true;
			} else if ((0 < angleDiff) && (angleDiff < Math.PI)) {
				this.counterclockwise = false;
			} else if ((Math.PI < angleDiff) && (angleDiff < Math.PI * 2)) {
				this.counterclockwise = true;
			}

			this.fabricObj.set({ endAngle: this.endAngle, counterclockwise: this.counterclockwise });
		}
	},


	complete : function () {

		if (this.status == 'radius') {

			this.radius = Math.sqrt( Math.pow((this.rx-this.cx), 2) + Math.pow((this.ry-this.cy), 2) );

			this.startAngle = this._getAngle({ x: this.rx, y: this.ry });

			this.fabricObj = new fabric.Arc({
				left: this.cx,
				top: this.cy,
				radius: this.radius,
				startAngle: this.startAngle,
				endAngle: this.startAngle,
				counterclockwise: false,
				fill: '',
				stroke: 'black',
				originX: 'center',
				originY: 'center',
				selectable: false,
				strokeDashArray: [6, 3]
			});

			this.status = 'end';

		} else if (this.status == 'end') {

			// finalize the arc with a solid stroke
			this.fabricObj.set({ strokeDashArray: [] });
		}
	},

	toSVG : function () {
		return this.fabricObj.toSVG();
	},

	_getAngle : function (point) {
		var angleRequired = 0;

		// gets the actual angle from center
		if ((point.x - this.cx) == 0) {
			// handling special cases
			if (this.cy > point.y) {
				angleRequired = Math.PI/2;
			} else if (this.cy < point.y) {
				angleRequired = -Math.PI/2;
			}
		} else {
			// in general cases
			angleRequired = Math.atan((point.y - this.cy) / (point.x - this.cx));

			if ((this.cy < point.y) && (angleRequired < 0)) { // handle 2nd quadrant
				angleRequired = Math.PI - Math.abs(angleRequired);
			} else if ((this.cy > point.y) && (angleRequired > 0)) { // handle 3rd quadrant
				angleRequired = Math.PI + Math.abs(angleRequired);
			} else if ((this.cy > point.y) && (angleRequired < 0)) { // handle 4th quadrant
				angleRequired = 2*Math.PI - Math.abs(angleRequired);
			}
		}

		return angleRequired;
	}
};

For handling mouse events, I've created the events JavaScript file. Depending on the mouse events, this file calls the corresponding Compass methods.


var fabricCanvas = new fabric.Canvas('sheet', { selection: false });
var selectedTool = '';
var toolState = '';
var toolPreviousState = '';
var instruction = $('#instructionText');

var currentTool = null;

var compassSettings = $('#compass_settings');
var compassSettingsState = $('input[name=compass-state]');

$('input[name=tool]').click(function() {
	// highlight the selected tool button (assumed behaviour, matching the .active_tool style)
	$('input[name=tool]').removeClass('active_tool');
	$(this).addClass('active_tool');
});

fabricCanvas.on('mouse:down', function(e) {

	// Get mouse coordinates
	var mousePointer = getMousePointer(fabricCanvas, e);

	switch(selectedTool) {

		case 'compass' :

			switch(toolState) {
				case 'center' :
					currentTool = new Compass(mousePointer);
					// show the radius guide line and distance label
					fabricCanvas.add(currentTool.radiusLine, currentTool.textObj);
					instruction.text('Select Radius Point');
					toolPreviousState = toolState;
					toolState = 'radius';
					break;

				case 'radius' :
					// conclude the radius step and add the arc object to the canvas
					currentTool.complete();
					fabricCanvas.add(currentTool.fabricObj);
					toolPreviousState = toolState;
					instruction.text('Select Ending Point');
					toolState = 'end';
					break;

				case 'end' :
					// conclude the arc
					currentTool.complete();
					instruction.text('Select Center Point');
					toolPreviousState = '';
					toolState = 'center';
					break;
			}
			break;
	}
});

fabricCanvas.on('mouse:move', function(e) {
	var mousePointer = getMousePointer(fabricCanvas, e);

	switch(selectedTool) {
		case 'compass' :
			switch(toolState) {
				case 'radius':
				case 'end' :
					if (currentTool) {
						currentTool.redraw(mousePointer);
						fabricCanvas.renderAll();
					}
					break;
			}
			break;
	}
});

function getMousePointer (canvas, evt) {
    var mouse = canvas.getPointer(evt.e);
    return {
        x: mouse.x,
        y: mouse.y
    };
}
function addDistanceLabel (lineObj, start, end) {
	// position the text label at the midpoint of the line
 	var textX = start.x + ((end.x - start.x) / 2);
 	var textY = start.y + ((end.y - start.y) / 2);

 	var distance = Math.sqrt( Math.pow((end.x-start.x), 2) + Math.pow((end.y-start.y), 2) );
 	distance = (distance / 50.0).toFixed(1); // make it centimeters

 	lineObj.set( {left: textX, top: textY } );
 	lineObj.setText(distance + ' cm');
}

function initTool (toolName) {

	switch (toolName) {
		case 'compass' :
			selectedTool = 'compass';
			toolState = 'center';
			instruction.text('Select Center Point');
			break;
	}
}


Finally, you need to include the dependencies in the index.html file, which is shown below.


<!DOCTYPE html>
<html>
	<head>
	    <title>Mathematical Constructions</title>
	    <meta charset="utf-8"/>
	    <style type="text/css">
	        .active_tool {background-color: gray}
	    </style>
	</head>
	<body>
		<h3>Geometrical Construction Drawing</h3>
		<canvas id="sheet" style="left:10px;top:10px;bottom:10px; border:1px solid #000000;" height="550" width="1280"></canvas>
		<div id="instructionText">Click on Compass</div>
		<input type="button" name="tool" class="btn btn-default" onclick="initTool('compass')" value="Compass">

		<script type="text/javascript" src="./js/jquery-3.1.1.min.js"></script>
		<script type="text/javascript" src="./js/fabric.js"></script>
		<script type="text/javascript" src="./js/events.js"></script>

		<!-- Extended shapes -->
		<script type="text/javascript" src="./js/Arc.class.js"></script>

		<!-- Tools for drawing -->
		<script type="text/javascript" src="./js/Compass.js"></script>
	</body>
</html>
To use this, open the index.html file in a browser. Then click the Compass button, click the center and radius points respectively, and finally click the end point of the arc. It will draw arcs as shown below.




In this post I shared my experience of implementing a bi-directional geometric compass for a web-based environment. I hope this will be helpful for you.


[1] GeoGebra – https://www.geogebra.org/

[2] Geometra – https://sourceforge.net/projects/geometra/

[3] HTML5 Canvas – http://www.w3schools.com/html/html5_canvas.asp

[4] SVG Path representation for Arcs – https://developer.mozilla.org/en/docs/Web/SVG/Tutorial/Paths#Arcs



Configuring ownCloud with XAMPP in Ubuntu



ownCloud is a popular open-source, self-hosted file sync and sharing platform [1]. With ownCloud you can create a cloud storage of your own (a private cloud). The common way to do that is to install ownCloud on top of an existing server stack. For this base infrastructure, I'm using XAMPP [2]; please take note of the product versions below.


  • Linux machine (I tested with Ubuntu 16.04)
  • XAMPP 7.0.8 (x64)
  • ownCloud 9.1.0

Note: I have experienced issues with XAMPP 7.0.9 (a missing module, mod_ssl [3]). Also, older ownCloud versions may not support the newer PHP version that ships with XAMPP.


Installing XAMPP is quite a simple task: you need to run the downloaded ".run" executable in sudo mode.

chmod +x xampp-executable-name.run

sudo ./xampp-executable-name.run

You will get a GUI to configure further settings. Once you have finished installing XAMPP, you can move on to installing ownCloud.

Installing ownCloud wasn't as easy as I thought. You'll have to deal with several ownership-level changes to directories.

First you need to download ownCloud and extract the zip file. Secondly, you need to put it into the "htdocs" folder of XAMPP.

Then use the following in a terminal to go to super-user mode and create the "data" folder inside the owncloud folder you copied. Thereafter, give all privileges to the "data" folder with "chmod".

sudo su

cd owncloud/

mkdir data

chmod 777 data/

Now you can access the "http://localhost/owncloud" URL from a browser and configure the admin account.

Once you have created the admin account, you need to make sure that the data folder has the right permissions. For that, change the user, user group and permissions of the "data" folder.

chown daemon:daemon -R data/

chmod 770 -R data/

After completion, you'll be able to see the welcome screen of ownCloud when navigating to the "http://localhost/owncloud" URL.


[1] ownCloud – https://owncloud.org/

[2] XAMPP – https://www.apachefriends.org/index.html

[3] XAMPP issue – http://unix.stackexchange.com/questions/307364/xampp-apache-server-wont-start

Experience with Azure Big Data Analyzing


This experiment was done in order to gain a first understanding of how to deal with big data in the cloud. It was my first experience with a lot of new technologies such as Hadoop, Hive etc. Before the experiment I had tried a few examples with Hadoop and other related technologies, and found that this would be a better way to go.

Use case

This experiment is based on a dataset of annual salaries for different persons; in this case, for San Francisco, California, USA in 2015. The dataset contains names, job titles, basic salary, additional benefits and total salary. I'll concentrate on job titles and their average total salaries.

Please note that this is an experiment which depends solely on the dataset. Actual survey reports may contain values different from the final results.


Before moving to the rest of the article, you are expected to have a good understanding of the following:

  • An understanding of the Microsoft Azure platform is necessary. You can create a free Azure account and play around with the portal. You also need to understand concepts such as Resources, Resource Groups etc.
  • You need some understanding of big data concepts such as what big data is, the Hadoop Distributed File System (HDFS), the Hadoop ecosystem, Hive, SQL etc.
  • It is better to have done some tutorials in the Azure documentation, especially "Analyze flight delay data", which laid the foundation for this article.
  • An understanding of HTML, PHP and JS with AJAX is required to follow the visualization part.
  • Familiarity with tools like PuTTY and WinSCP if you are on Windows; otherwise with the scp and ssh commands.

Planning Deployment

The deployment required for this article can be shown as follows.

Azure Deployment

Please note that I shut down the HD Insight cluster as soon as it completes its job; otherwise you’ll lose your credits!

The diagram shows the different steps needed in this experiment.

  • First you need a dataset to examine.
  • Then you have to transfer it to the Hadoop cluster, and from there into the Hadoop Distributed File System (HDFS).
  • Thereafter you will run the required Hive queries to extract the essence of the dataset. After processing, we’ll move that data to a SQL database for ease of access.
  • Finally you need to develop a PHP application, running on an App Service node, to visualize the results.

Preparing dataset

You can download the San Francisco annual salary dataset from the following location:


After downloading, you need to open the file using Excel. I observed that several job titles contain commas. Since this is a CSV file, those embedded commas would break the column alignment during analysis, so use find & replace to replace them with a hyphen character.

Compress the CSV into a zip file to minimize the data transfer.
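If you prefer to script this clean-up instead of doing it in Excel, here is a minimal Python sketch. The helper name and the sample row are illustrative assumptions, not part of the original workflow:

```python
import csv
import io

def strip_field_commas(csv_text, replacement="-"):
    """Replace commas inside quoted CSV fields so that a naive
    comma-split (as Hive's delimited text format does) keeps
    the columns aligned."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_NONE,
                        escapechar="\\", lineterminator="\n")
    for row in reader:
        writer.writerow([field.replace(",", replacement) for field in row])
    return out.getvalue()

# A job title containing a comma gets the comma replaced with a hyphen:
print(strip_field_commas('John Doe,"Manager, Operations",120000\n'))
# → John Doe,Manager- Operations,120000
```

This mirrors the manual find & replace step; run it over the whole CSV before zipping.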

Setting up Azure resources

Now you need to create the required Azure resources. Let’s start with HD Insight. In this case we need to create a Hadoop-on-Linux cluster with 2 worker nodes. The following guide will help you create such a cluster quickly.


Also, for further information about cluster types and concepts, you may look at the following link:


Unfortunately Azure still has no way to deactivate an HD Insight cluster when it is idle. You need to delete it manually, else you’ll be charged for idle hours too.

Thereafter you need to create a SQL database. The following tutorial will help with that:

https://azure.microsoft.com/en-us/documentation/articles/sql-database-get-started/#create-a-new-azure-sql-database (Create a new Azure SQL database section)

Finally you need to create an empty App Service. For further information about App Service, you may refer to the following:


This App Service will host the PHP runtime, which will be needed in the last part of the article.

A best practice when creating the above resources is to allocate all of them to a single resource group, which makes them easier to manage.

Also make sure to choose strong passwords and remember them.

Executing the process

First you need to transfer the zip file to the Hadoop node’s file system. You can do this using the scp command or any GUI tool that supports scp.

As the host, you need to specify “CLUSTERNAME-ssh.azurehdinsight.net”. Along with that, you need to provide your ssh credentials.

scp FILENAME.zip USERNAME@CLUSTERNAME-ssh.azurehdinsight.net:

Then you need to access that node using SSH. On Windows you can use the Putty tool; others may use the terminal.

ssh USERNAME@CLUSTERNAME-ssh.azurehdinsight.net

Unzip the uploaded file:

unzip FILENAME.zip

Next, you have to move the CSV file into the Hadoop File System. Use the following commands to create a new directory in HDFS and move the CSV file:

hdfs dfs -mkdir -p /sf/salary/data

hdfs dfs -put FILENAME.csv /sf/salary/data

Now you need to create a Hive query and execute it. You can execute a Hive query from a file or in an interactive manner. We’ll do both, starting with a file.

Create the following “salaries.hql” file using nano

nano salaries.hql

and add the following queries:

DROP TABLE salaries_raw;
-- Creates an external table over the csv file
-- (column list reconstructed; adjust it to match your CSV header order)
CREATE EXTERNAL TABLE salaries_raw (
 EMPLOYEE_NAME string,
 JOB_TITLE string,
 BASE_PAY float,
 OTHER_PAY float,
 BENEFITS float,
 TOTAL_PAY float,
 TOTAL_PAY_N_BENEFITS float,
 YEAR string,
 NOTES string,
 AGENCY string,
 STATUS string)
-- The following lines describe the format and location of the file
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE LOCATION '/sf/salary/data';

-- Drop the salaries table if it exists
DROP TABLE salaries;
-- Create the salaries table and populate it with data
-- pulled in from the CSV file (via the external table defined previously)
CREATE TABLE salaries AS
SELECT
 JOB_TITLE AS job_title,
 BASE_PAY AS base_pay,
 TOTAL_PAY AS total_pay,
 TOTAL_PAY_N_BENEFITS AS total_pay_n_benefits
FROM salaries_raw;

You can also locally create “salaries.hql” and upload via SCP.

The queries are self-explanatory; note that each query ends with a semicolon. The “salaries_raw” table is created to directly extract the values from the CSV, so it has a one-to-one mapping with the CSV columns. Its data is taken from where we stored the CSV file. Thereafter the “salaries” table is created from the “salaries_raw” table. The “salaries” table keeps only the job_title, base_pay, total_pay and total_pay_n_benefits columns, because only those are necessary for the next stage.

To execute the Hive query, use the following command:

beeline -u 'jdbc:hive2://localhost:10001/;transportMode=http' -n admin -f salaries.hql

The next part of the Hive query we’ll do in an interactive manner. You can open the interactive shell with the command:

beeline -u 'jdbc:hive2://localhost:10001/;transportMode=http' -n admin

and enter following commands:

INSERT OVERWRITE DIRECTORY '/sf/salary/output'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT job_title, AVG(base_pay), AVG(total_pay), AVG(total_pay_n_benefits)
FROM salaries
WHERE base_pay IS NOT NULL AND total_pay IS NOT NULL AND total_pay_n_benefits IS NOT NULL
GROUP BY job_title;

The above Hive query will write its results to the “/sf/salary/output” folder. It groups the rows by job title and computes the averages of the base_pay, total_pay and total_pay_n_benefits columns.
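To make the semantics of the GROUP BY / AVG aggregation concrete, here is a small Python sketch of what Hive computes here. The sample rows are made up for illustration:

```python
from collections import defaultdict

def average_by_title(rows):
    """Average (base_pay, total_pay, total_pay_n_benefits) per job title,
    skipping rows with missing values - mirroring the Hive query's
    IS NOT NULL filter and AVG() aggregates."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for title, base, total, total_nb in rows:
        if base is None or total is None or total_nb is None:
            continue  # same effect as the IS NOT NULL conditions
        acc = sums[title]
        acc[0] += base
        acc[1] += total
        acc[2] += total_nb
        acc[3] += 1
    return {t: (a[0] / a[3], a[1] / a[3], a[2] / a[3])
            for t, a in sums.items()}

rows = [
    ("Nurse", 100.0, 120.0, 150.0),
    ("Nurse", 200.0, 220.0, 250.0),
    ("Clerk", None, 60.0, 70.0),   # dropped, like a NULL row in Hive
]
print(average_by_title(rows))
# → {'Nurse': (150.0, 170.0, 200.0)}
```

This is only to illustrate the semantics; the actual aggregation is of course done by Hive on the cluster.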

Use “!quit” command to exit from interactive shell.

At this stage, we have successfully extracted the essence of the dataset. Next, we need to make it ready for presentation.

For presentation, we’re going to copy output data to the SQL database created.

To create a table and do other interactions with the SQL database, we need to install FreeTDS on the Hadoop node. Use the following commands to install it and verify the connectivity.

sudo apt-get --assume-yes install freetds-dev freetds-bin

TDSVER=8.0 tsql -H <serverName>.database.windows.net -U <adminLogin> -P <adminPassword> -p 1433 -D <databaseName>

Once you execute the last command, you’ll be taken to another interactive shell where you can interact with the database you created earlier.

Use the following commands to create a table to hold the output we got:

CREATE TABLE [dbo].[salaries](
[job_title] [nvarchar](50) NOT NULL,
[base_pay] float,
[total_pay] float,
[total_pay_n_benefits] float,
CONSTRAINT [PK_salaries] PRIMARY KEY CLUSTERED
([job_title] ASC))
GO

Use “exit” command to exit from SQL interactive session.

To move the data from HDFS to the SQL database, we are going to use Sqoop. The following Sqoop command will put the output data in HDFS into the SQL database.

sqoop export --connect 'jdbc:sqlserver://<serverName>.database.windows.net:1433;database=<databaseName>' --username <adminLogin> --password <adminPassword> --table 'salaries' --export-dir 'wasbs:///sf/salary/output' --fields-terminated-by '\t' -m 1
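The export directory holds plain tab-delimited text, so you can sanity-check a line of it yourself before (or after) running Sqoop. A small Python sketch; the sample line and the column order are assumptions based on the aggregation query:

```python
def parse_output_line(line):
    """Parse one tab-delimited line of the Hive output:
    job_title, avg base_pay, avg total_pay, avg total_pay_n_benefits."""
    title, base, total, total_nb = line.rstrip("\n").split("\t")
    return title, float(base), float(total), float(total_nb)

print(parse_output_line("Registered Nurse\t98000.5\t125000.0\t150000.25\n"))
# → ('Registered Nurse', 98000.5, 125000.0, 150000.25)
```

A line that does not split into exactly four fields would indicate leftover commas or a delimiter mismatch with the Sqoop --fields-terminated-by option.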

Once the task has completed successfully, you can log in to the SQL interactive session again and execute the following to view the table’s contents:

SELECT * FROM salaries


Finally you need to use FTP to connect to the App Service node and upload the following PHP files (including the js folder).


You need to change the SQL server host and credentials in the “dbconnect.php” file. I’ll leave the rest of the PHP code as self-explanatory. If you successfully created the app, you should see something similar to the following:




In this article I have shown you how I did my first experiment with Azure big data analysis. Along the way I came across several other related technologies such as Spark and Azure Stream Analytics, each with its own pros and cons. For cases like analyzing annual income, it’s generally accepted to use Hadoop along with Hive; but if you want to do more frequent analysis, you may look into those alternatives.


  1. Project sourcecode – https://github.com/Buddhima/AzureSFSalaryApp
  2. Get started with Hadoop in Windows – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-tutorial-get-started-windows
  3. Analyze Flight Delays with HD Insight – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-analyze-flight-delay-data-linux
  4. Use Hive and HiveQL with Hadoop in HDInsight – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-hive/