
Start Using Java 8 Lambda Expressions

Introduction

Java 8 comes with a bunch of new features for its developers. One such improvement is lambda expressions. Lambda expressions allow Java developers to use functions as values and pass them as arguments to methods. This might be familiar to people from a functional programming background, but a bit difficult for people who tend to think with an object-oriented mindset. In this article, I am going through a series of examples to show how we can use lambda expressions in coding.

Prerequisites

Before you start, you need to install Java 8 on your machine and create a project using Java 8. Here I'm going to describe things at a more abstract level, so you should know how to use your IDE for basic things such as creating classes, executing Java code, etc.

Example Scenario

For this article I have selected a scenario where you need to go through a set of books and select books based on different criteria. Those selection criteria are: list all the books, list the books which are novels, and list the titles of the books written in the 20th century. So let's start coding.

Creating the entity class

The first step in solving such a problem is to create an entity class called Book which can represent a single instance of a book. So here it is.


public class Book {

    private String name;
    private String author;
    private int year;
    private String language;
    private String category;

    public Book(String name, String author, int year, String language, String category) {
        this.name = name;
        this.author = author;
        this.year = year;
        this.language = language;
        this.category = category;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getAuthor() {
        return author;
    }

    public void setAuthor(String author) {
        this.author = author;
    }

    public int getYear() {
        return year;
    }

    public void setYear(int year) {
        this.year = year;
    }

    public String getLanguage() {
        return language;
    }

    public void setLanguage(String language) {
        this.language = language;
    }

    public String getCategory() {
        return category;
    }

    public void setCategory(String category) {
        this.category = category;
    }

    @Override
    public String toString() {
        return "Book{" +
                "name='" + name + '\'' +
                ", author='" + author + '\'' +
                ", year=" + year +
                ", language='" + language + '\'' +
                ", category='" + category + '\'' +
                '}';
    }
}

I don't think I need to explain the above simple Java class, as it should be familiar to you.

Conventional Method

This is what we usually do to solve this type of problem.

import java.util.Arrays;
import java.util.List;

public class BookFinderExample1 {

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                new Book("Moby-Dick", "Herman Melville", 1851, "EN", "Novel"),
                new Book("War and Peace", "Leo Tolstoy", 1869, "RU", "Novel"),
                new Book("The Three Musketeers", "Alexandre Dumas", 1844, "FR", "Novel"),
                new Book("Les Miserables", "Victor Hugo", 1862, "FR", "Fiction"),
                new Book("Journey to the West", "Wu Cheng'en", 1592, "ZH", "Fiction"),
                new Book("Wild Swans", "Jung Chang", 1991, "ZH", "Biography"),
                new Book("The Reader", "Bernhard Schlink", 1995, "DE", "Novel"),
                new Book("Perfume", "Patrick Suskind", 1985, "DE", "Fiction")
        );

        // 1. print all books
        System.out.println("Print all books");
        printAllBooks(books);

        // 2. print all novels
        System.out.println("Print all novels");
        printAllNovels(books);

        // 3. print all books in 20th century
        System.out.println("Print all books in 20th century");
        printAllIn20thCentury(books);
    }

    private static void printAllBooks(List<Book> books) {
        for (Book book : books) {
            System.out.println(book.toString());
        }
    }

    private static void printAllNovels(List<Book> books) {
        for (Book book : books) {
            if (book.getCategory().equals("Novel"))
                System.out.println(book.toString());
        }
    }

    private static void printAllIn20thCentury(List<Book> books) {
        for (Book book : books) {
            if (book.getYear() > 1900 && book.getYear() < 2001)
                System.out.println(book.getName());
        }
    }
}

First we created a list of books (I won't repeat this step in the next examples). Then we created 3 methods which serve our purpose, and we called those methods one by one. Though this fulfills our requirement, it's not a scalable solution. Each time we get a new requirement, we need to create a new method and call it.

Using Generic Solution

If we look carefully into the methods we have implemented, they all do a common thing. They iterate through a given list of books, check a condition, and perform an action (e.g. print the book). So we can use interfaces for this and stick with just one method. Let's see how.

public class BookFinderExample2 {

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                .......
        );

        // 1. print all books
        System.out.println("Print all books");
        printBooks(books, new Checker() {
            public boolean check(Book book) {
                return true;
            }
        }, new Action() {
            public void perform(Book book) {
                System.out.println(book.toString());
            }
        });

        // 2. print all novels
        System.out.println("Print all novels");
        printBooks(books, new Checker() {
            public boolean check(Book book) {
                return book.getCategory().equals("Novel");
            }
        }, new Action() {
            public void perform(Book book) {
                System.out.println(book.toString());
            }
        });

        // 3. print all books in 20th century
        System.out.println("Print all books in 20th century");
        printBooks(books, new Checker() {
            public boolean check(Book book) {
                return (book.getYear() > 1900 && book.getYear() < 2001);
            }
        }, new Action() {
            public void perform(Book book) {
                System.out.println(book.getName());
            }
        });
    }

    private static void printBooks(List<Book> books, Checker checker, Action action) {
        for (Book book : books) {
            if (checker.check(book)) {
                action.perform(book);
            }
        }
    }

    interface Checker {
        boolean check(Book book);
    }

    interface Action {
        void perform(Book book);
    }
}

In the above solution I have introduced two interfaces, and each of them exposes a single method. With the use of just one method, printBooks, and objects implementing those two interfaces, we have achieved the same results as before. We have taken the check and the action out and injected them into the printBooks method. The instances of Checker and Action are created with anonymous inner classes.

The above seems fine for generalizing the solution, but the syntax is tedious. And so far we have not used anything new in Java 8. Let's use Java 8's new features to ease our coding.

Using Lambda Expressions

Let's have a look at one of our anonymous inner classes:

new Action() {
    public void perform(Book book) {
        System.out.println(book.toString());
    }
}

The above is an anonymous inner class created for the Action interface. The Action interface has only a single method called perform. That method takes one argument and prints it. Such anonymous inner classes can be written as follows.

(Book book) -> System.out.println(book.toString())

Since the Action interface has only one method, we don't need to state its name. Since the body has only one line of code, there is no need for curly brackets. And with a single argument, we don't need parentheses or the argument type.

book -> System.out.println(book.toString())

That’s a lambda expression. So let’s substitute lambda expressions.

public class BookFinderExample3 {

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                ....
        );

        // 1. print all books
        System.out.println("Print all books");
        printBooks(books, book ->  true, book -> System.out.println(book.toString()));

        // 2. print all novels
        System.out.println("Print all novels");
        printBooks(books, book -> book.getCategory().equals("Novel"), book -> System.out.println(book.toString()));

        // 3. print all books in 20th century
        System.out.println("Print all books in 20th century");
        printBooks(books, book ->  (book.getYear() > 1900 && book.getYear() < 2001), book -> System.out.println(book.getName()));
    }

    private static void printBooks(List<Book> books, Checker checker, Action action) {
        for (Book book : books) {
            if (checker.check(book)) {
                action.perform(book);
            }
        }
    }

    @FunctionalInterface
    interface Checker {
        boolean check(Book book);
    }

    @FunctionalInterface
    interface Action {
        void perform(Book book);
    }
}

We have introduced 2 interfaces for our work. But is it necessary? No. The JDK developers have identified this need and provide us a set of predefined interfaces in the java.util.function package. All we have to do is reuse them!

import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class BookFinderExample4 {

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                ...
        );

        // 1. print all books
        System.out.println("Print all books");
        printBooks(books, book ->  true, book -> System.out.println(book.toString()));

        // 2. print all novels
        System.out.println("Print all novels");
        printBooks(books, book -> book.getCategory().equals("Novel"), book -> System.out.println(book.toString()));

        // 3. print all books in 20th century
        System.out.println("Print all books in 20th century");
        printBooks(books, book ->  (book.getYear() > 1900 && book.getYear() < 2001), book -> System.out.println(book.getName()));
    }

    private static void printBooks(List<Book> books, Predicate<Book> checker, Consumer<Book> action) {
        for (Book book : books) {
            if (checker.test(book)) {
                action.accept(book);
            }
        }
    }
}

In the above example we have used two interfaces, Predicate and Consumer. You can find more information about them on Oracle's official website. There are plenty of such interfaces which can help your coding.
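Besides Predicate and Consumer, java.util.function offers interfaces such as Function and Supplier. The following standalone sketch (not part of the book example; the values are purely illustrative) shows a few of them in action:

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class FunctionalInterfacesDemo {
    public static void main(String[] args) {
        // Function<T, R>: takes one argument, returns a result
        Function<String, Integer> length = s -> s.length();
        // Predicate<T>: takes one argument, returns a boolean
        Predicate<String> isLong = s -> s.length() > 10;
        // Supplier<T>: takes no arguments, produces a value
        Supplier<String> greeting = () -> "Hello, Java 8!";

        System.out.println(length.apply("Moby-Dick"));   // 9
        System.out.println(isLong.test("Moby-Dick"));    // false
        System.out.println(greeting.get());              // Hello, Java 8!
    }
}
```

apply, test and get are the single abstract methods of those interfaces, just like check and perform were in our custom ones.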

Using streams

We are not going to stop our simplification there. If we look into the printBooks method, it iterates through the list of books and performs the actions given by the interface implementations. This kind of iteration is called external iteration, because the for-loop iterates the list. In Java 8, streams provide the functionality of internal iteration, which we get by converting the list into a stream. The stream can be thought of as a conveyor belt which brings list items one by one. We can use filters to pick out the required elements and then perform whatever action we like. When using streams, our code looks as follows.

public class BookFinderExample5 {

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                ...
        );

        // 1. print all books
        System.out.println("Print all books");
        books.stream()
                .filter(book ->  true)
                .forEach(book -> System.out.println(book.toString()));

        // 2. print all novels
        System.out.println("Print all novels");
        books.stream()
                .filter(book -> book.getCategory().equals("Novel"))
                .forEach(book -> System.out.println(book.toString()));

        // 3. print all books in 20th century
        System.out.println("Print all books in 20th century");
        books.stream()
                .filter(book ->  (book.getYear() > 1900 && book.getYear() < 2001))
                .forEach(book -> System.out.println(book.getName()));
    }

}
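The stream pipeline is not limited to filter and forEach. As a standalone sketch (using plain strings instead of the Book class to keep it self-contained), map can transform each element and collect can gather the results into a new list:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamCollectDemo {
    public static void main(String[] args) {
        List<String> titles = Arrays.asList("Moby-Dick", "War and Peace", "The Reader");

        // map transforms each element; collect gathers the results into a list
        List<Integer> lengths = titles.stream()
                .map(String::length)
                .collect(Collectors.toList());

        System.out.println(lengths);  // [9, 13, 10]
    }
}
```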

Conclusion

We have started our example from conventional coding and step by step brought in new Java 8 features such as lambda expressions, streams and forEach. Similarly, you can apply Java 8 concepts in your own development. The resources in the References section below will help you find more information about Java 8 lambda expressions.

References

[1] Oracle’s Java 8 website : http://docs.oracle.com/javase/8/docs/

[2] Oracle’s website on util-functions : https://docs.oracle.com/javase/8/docs/api/java/util/function/package-summary.html

[3] Java Brains Tutorial on Lambda Expressions : https://www.youtube.com/watch?v=gpIUfj3KaOc&list=PLqq-6Pq4lTTa9YGfyhyW2CqdtW9RtY-I3

Microservices with Spring Boot

Introduction

Currently, enterprise application development is leaning towards building applications as microservices. This trend started about 2 years back, and some organizations take it as an opportunity to do a complete re-write of their products. To help with developing microservices, several organizations have implemented frameworks. Here I am talking about using Spring Boot to create a very basic microservice.

Use-case

This system is about handling patient records, so it is more like a CRUD application. To persist data, I am using MongoDB (the embedded version). First, let's see what the structure of this project would be.

[Image: project structure]

First you need to create a project with the above structure. You may find Maven archetypes which help you do that. Next, the pom file should be created properly. Here I'm showing the important sections of the pom file.

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>1.4.1.RELEASE</version>
</parent>
...
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-mongodb</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jersey</artifactId>
    </dependency>

    <dependency>
        <groupId>de.flapdoodle.embed</groupId>
        <artifactId>de.flapdoodle.embed.mongo</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>


The Application.java file contains the main method to start the microservice. It should look as follows:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}

The ApplicationConfig.java file is used to provide configuration to the Spring framework. Here we provide the location of the service and a REST template. It should look as follows:


import org.glassfish.jersey.server.ResourceConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

import javax.inject.Named;

@Configuration
public class ApplicationConfig {
    @Named
    static class JerseyConfig extends ResourceConfig {
        public JerseyConfig() {
            this.packages("com.project.capsule.rest");
        }
    }

    @Bean
    public RestTemplate restTemplate() {
        RestTemplate restTemplate = new RestTemplate();
        return restTemplate;
    }
}

Next we can extend MongoRepository and create PatientReportRepository. This is a very interesting capability of the Spring framework, as it can convert method names directly into queries.

import com.project.capsule.bean.PatientReport;
import org.springframework.data.mongodb.repository.MongoRepository;
import java.util.List;

public interface PatientReportRepository extends MongoRepository<PatientReport, String> {

    public List<PatientReport> findByName(String name);

    public List<PatientReport> findByNameLike(String name);

    public List<PatientReport> findByTimeBetween(long from, long to);

}

Now let's create the bean class, PatientReport:


import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import org.springframework.data.annotation.Id;
import java.util.Map;

@JsonIgnoreProperties(ignoreUnknown = true)
public class PatientReport {

    @Id
    public String id;

    public String name;
    public int age;
    public String sex;
    public String doctorName;
    public long time;
    public String reportType;
    public Map<String, Object> reportData;
}

Finally, the service class, PatientReportService. You can define any number of methods here and implement custom logic.

import com.project.capsule.PatientReportRepository;
import com.project.capsule.bean.PatientReport;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestParam;

import javax.inject.Named;
import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import java.util.*;

@Named
@Path("/report")
public class PatientReportService {

    @Autowired
    private PatientReportRepository repository;

    @POST
    @Path("")
    @Consumes(MediaType.APPLICATION_JSON)
    public Response storePatientReport(@RequestBody PatientReport patientReport) {
        repository.save(patientReport);
        return Response.status(201).build();
    }

    @GET
    @Path("{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public PatientReport retrievePatientReport(@PathParam("id") int id) {
        PatientReport patientReport = repository.findOne(String.valueOf(id));
        return patientReport;
    }

    @POST
    @Path("find")
    public List<PatientReport> findReports(@RequestParam Map<String, Object> map) {
        // map keyed by id so duplicate results are returned only once
        Map<String, PatientReport> resultantMap = new HashMap<String, PatientReport>();
        List<PatientReport> resultantReports;

        if (map.containsKey("name") && map.get("name") != null) {
            String patientName = (String) map.get("name");
            if (!patientName.trim().equalsIgnoreCase("")) {
                resultantReports = repository.findByNameLike(patientName);

                for (PatientReport report : resultantReports)
                    resultantMap.put(report.id, report);
            }
        }
        // return the distinct reports collected above
        return new ArrayList<PatientReport>(resultantMap.values());
    }

}

Once you run the Application.java file, the microservice will start on port 8080. You can change the port by giving the argument "-Dserver.port=8090", etc. Thereafter you can use a REST client to send HTTP requests and see how it works!
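For reference, the same setting can also live in src/main/resources/application.properties, which is the standard Spring Boot convention (the port value below is just the example from above):

```properties
# application.properties - overrides the default port 8080
server.port=8090
```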

References

[1] https://spring.io/blog/2015/07/14/microservices-with-spring

[2] http://blog.scottlogic.com/2016/11/22/spring-boot-and-mongodb.html

[3] https://dzone.com/articles/spring-boot-creating

JSON Enrich Mediator for WSO2 ESB

Introduction

JSON support for WSO2 ESB [1] was introduced some time back, but only a small number of mediators support manipulating JSON payloads. In this article I am going to introduce a new mediator called JsonEnrichMediator [2], which works quite similarly to the existing Enrich mediator [3], but targets JSON payloads. The specialty of this mediator is that, since it works with the native JSON payload, the JSON payload is not converted to an XML representation. Hence there won't be any data loss due to transformations.

Please note that this is a custom mediator I have created, and it does not ship with the WSO2 ESB pack.

Configuring Mediator

  1. Clone the GitHub repository: https://github.com/Buddhima/JsonEnrichMediator
  2. Build the repository using Maven (mvn clean install)
  3. Copy the built artifact in the target folder to ESB_HOME/repository/components/dropins
  4. Download json-path-2.1.0.jar [5] and json-smart-2.2.1.jar [6] and put them into the same folder (dropins)
  5. Start WSO2 ESB (sh bin/wso2server.sh)

Sample Scenario

For this article I am using a sample scenario which moves a JSON property within the payload. For that, you need to add the following API to WSO2 ESB.

<api xmlns="http://ws.apache.org/ns/synapse" name="sampleApi" context="/sample">
    <resource methods="POST" uri-template="/*">
        <inSequence>
            <log level="full"/>
            <jsonEnrich>
                <source type="custom" clone="false" JSONPath="$.me.country"/>
                <target type="custom" action="put" JSONPath="$" property="country"/>
            </jsonEnrich>
            <respond/>
        </inSequence>
    </resource>
</api>

The above configuration takes the value pointed to by the JSONPath "$.me.country" and moves it to the main body. You can find further details about JSONPath at [4].

Once the API is deployed, you need to send the following message to the ESB.


curl -H "Content-Type: application/json" \
-X POST -d '{
"me":{
"country": "Sri Lanka",
"language" : "Sinhala"
}
}' \
http://127.0.0.1:8280/sample

The output from the ESB should look as below:


{
    "me": {
        "language": "Sinhala"
    },
    "country": "Sri Lanka"
}

Conclusion

I have shown a simple use-case of using JSON Enrich Mediator. You can see the comprehensive documentation at the code repository [2].

References

[1] WSO2 ESB JSON support : https://docs.wso2.com/display/ESB500/JSON+Support

[2] Code Repository for JSON Enrich Mediator : https://github.com/Buddhima/JsonEnrichMediator

[3] WSO2 ESB Enrich Mediator : https://docs.wso2.com/display/ESB500/Enrich+Mediator

[4] JSON Path documentation : https://github.com/jayway/JsonPath/blob/json-path-2.1.0/README.md

[5] json-path-2.1.0 : https://mvnrepository.com/artifact/com.jayway.jsonpath/json-path/2.1.0

[6] json-smart-2.2.1 : https://mvnrepository.com/artifact/net.minidev/json-smart/2.2.1

WSO2 ESB Endpoint Error Handling

Introduction

WSO2 ESB can be used as an intermediary component to connect different systems. When connecting those systems, their availability is a common issue. Therefore the ESB has to handle such undesirable situations carefully and take relevant actions. To cater to that requirement, the outbound endpoints of WSO2 ESB can be configured. In this article I discuss two common ways of configuring endpoints.

The two common approaches to configuring endpoints are:

  1. Configure with just a timeout (without suspending endpoint)
  2. Configure with a suspend state

Configure with just a timeout

This would be suitable if endpoint failures are not very frequent.

Sample Configuration:

<endpoint name="SimpleTimeoutEP">
    <address uri="http://localhost:9000/StockquoteService">
        <timeout>
            <duration>2000</duration>
            <responseAction>fault</responseAction>
        </timeout>
        <suspendOnFailure>
            <errorCodes>-1</errorCodes>
            <initialDuration>0</initialDuration>
            <progressionFactor>1.0</progressionFactor>
            <maximumDuration>0</maximumDuration>
        </suspendOnFailure>
        <markForSuspension>
            <errorCodes>-1</errorCodes>
        </markForSuspension>
    </address>
</endpoint>


In this case we only focus on the timeout of the endpoint. The endpoint stays Active forever. If a response is not received within the duration, the responseAction is triggered.

duration – in milliseconds

responseAction – when a response arrives for a timed-out request, one of the following actions is triggered.

  • fault – calls the fault-sequence associated
  • discard – discards the response
  • none – will not take any specific action on response (default action)

The rest of the configuration prevents the endpoint from going into the suspend state.

If you specify responseAction as "fault", you can define a customized way of informing the client of the failure in the fault-handling sequence, or store the message and retry later.

Configure with a suspend state

This approach is useful when connection failures are frequent. By suspending the endpoint, the ESB can save resources without unnecessarily waiting for responses.

In this case the endpoint goes through state transitions. The theory behind this behavior is the circuit-breaker pattern. The three states are:

  1. Active – Endpoint sends all requests to backend service
  2. Timeout – Endpoint starts counting failures
  3. Suspend – Endpoint limits sending requests to backend service

Sample Configuration:

<endpoint name="Suspending_EP">
    <address uri="http://localhost:9000/StockquoteService">
        <timeout>
            <duration>6000</duration>
        </timeout>
        <markForSuspension>
            <errorCodes>101504, 101505</errorCodes>
            <retriesBeforeSuspension>3</retriesBeforeSuspension>
            <retryDelay>1</retryDelay>
        </markForSuspension>
        <suspendOnFailure>
            <errorCodes>101500, 101501, 101506, 101507, 101508</errorCodes>
            <initialDuration>1000</initialDuration>
            <progressionFactor>2</progressionFactor>
            <maximumDuration>60000</maximumDuration>
        </suspendOnFailure>
    </address>
</endpoint>


In the above configuration:

If the endpoint receives error codes 101504 or 101505, it moves from the active state to the timeout state.

When the endpoint is in the timeout state, it makes 3 retry attempts with a 1 millisecond delay between them.

If all those retry attempts fail, the endpoint moves to the suspend state. If a retry succeeds, the endpoint moves back to the active state.

If an active endpoint receives error codes 101500, 101501, 101506, 101507 or 101508, it moves directly to the suspend state.

After the endpoint moves to the suspend state, it waits initialDuration before attempting any further requests. Thereafter it determines the time period between retry attempts according to the following equation.

Min(current suspension duration * progressionFactor, maximumDuration)

In the equation, "current suspension duration" is updated on each reattempt.

Once the endpoint succeeds in getting a response to a request, it goes back to the active state.

If the endpoint receives any other error code (e.g. 101503), it does not make any state transition and remains in the active state.
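To make the progression concrete, the following standalone sketch (assuming the sample values above: initialDuration 1000, progressionFactor 2, maximumDuration 60000) computes the successive suspension durations:

```java
public class SuspensionDurations {
    // Min(current suspension duration * progressionFactor, maximumDuration)
    static long next(long current, double factor, long max) {
        return Math.min((long) (current * factor), max);
    }

    public static void main(String[] args) {
        long duration = 1000;    // initialDuration
        double factor = 2.0;     // progressionFactor
        long max = 60000;        // maximumDuration
        for (int i = 0; i < 8; i++) {
            System.out.println(duration);
            duration = next(duration, factor, max);
        }
        // prints 1000, 2000, 4000, 8000, 16000, 32000, 60000, 60000
    }
}
```

So with these values the suspension period doubles on each failed reattempt until it is capped at one minute.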

Conclusion

In this article I have shown two basic configurations that are useful for configuring the endpoints of WSO2 ESB. You can refer to the WSO2 ESB documentation for implementing more complex patterns with endpoints.

References

WSO2 ESB Documentation: https://docs.wso2.com/display/ESB500/Endpoint+Error+Handling#EndpointErrorHandling-timeoutSettings

Timeout and Circuit Breaker Pattern in WSO2 Way: http://ssagara.blogspot.com/2015/05/timeout-and-circuit-breaker-pattern-in.html

Endpoint Error Codes: https://docs.wso2.com/display/ESB500/Error+Handling#ErrorHandling-codes

Endpoint Error Handling: http://wso2.com/library/articles/wso2-enterprise-service-bus-endpoint-error-handling/

Java Virtual Machine

Introduction

In this post I thought of discussing how the underlying components work together for a successful execution of a Java program. The content of this post is a collection of knowledge I gathered after going through several articles on this topic.

Most of us never bother about the internals of Java, because IDEs have made our lives a lot easier. But it's worth having some idea about those internals, especially if you happen to work on an enterprise-level application and face memory issues. In this article, I thought of going from basic concepts to an intermediate level.

JVM, JRE, JDK

In brief, "JVM (Java Virtual Machine) is an abstract machine. It is a specification that provides a runtime environment in which Java bytecode can be executed." This implies that the JVM is merely a concept. I'll come back to the JVM later to discuss more.

JRE (Java Runtime Environment) provides a runtime environment (an implementation of the JVM). The JRE contains a concrete implementation of the JVM (e.g. HotSpot JVM, JRockit, IBM J9), a set of libraries, and other files needed by the JVM. Different vendors release their own JRE, based on a reference.

The JDK is for developers to create Java programs. The JDK consists of the JRE plus development tools such as javac.

One important thing to remember is that the JVM, JRE and JDK are all platform-dependent.

[Image: JVM, JRE, JDK (source: javatpoint.com)]


JVM in Detail

It is said that the JVM is:

  1. A specification that describes the workings of the Java Virtual Machine. Implementation providers are free to choose their own algorithms. Implementations have been provided by Sun and other companies.
  2. An implementation. An implementation of the JVM is known as a JRE (Java Runtime Environment).
  3. A runtime instance. Whenever you run a Java program, an instance of the JVM is created.

The JVM is responsible for the following operations:

  • Loads code
  • Verifies code
  • Executes code
  • Provides runtime environment

Internal Architecture of JVM

[Image: internal architecture of the JVM (source: javatpoint.com)]
  1. Classloader : The classloader is used to load class files. The classloader embedded in the JVM is also called the "primordial classloader". Depending on the class name, the classloader can search for the .class file in the directory structure. Users can also define their own classloaders (called "non-primordial classloaders") if required.
  2. Method (Class) Area : Also called the non-heap area, it has 2 subsections. "Permanent Generation – This area stores class-related data from class definitions: structures, methods, fields, method data and code, and constants. It can be regulated using -XX:PermSize and -XX:MaxPermSize. It can cause java.lang.OutOfMemoryError: PermGen space if it runs out of space". This OutOfMemoryError happens when class definitions accumulate. The other section, the Code Cache, is used by the JIT to store compiled code (hardware-specific native code).
  3. Heap : The area allocated for runtime data. This area is shared by all threads. We can use the -Xms and -Xmx JVM options to tune the heap size. Most of the time, the "java.lang.OutOfMemoryError" error occurs because the heap gets exhausted. The heap consists of 3 sub-areas:
    1. Eden (Young) “New object or the ones with short life expectancy exist in this area and it is regulated using the -XX:NewSize and -XX:MaxNewSize parameters. GC (garbage collector) minor sweeps this space”
    2. Survivor – “The objects which are still being referenced manage to survive garbage collection in the Eden space end up in this area. This is regulated via the -XX:SurvivorRatio JVM option”
    3. Old (Tenured) – "This is for objects which survive long garbage collections in both the Eden and Survivor space (due to long-time references of course). A special garbage collector takes care of this space. Object de-allocation in the tenured space is taken care of by GC major"

      Analysis of a "java.lang.OutOfMemoryError" can be done by taking a heap dump when the incident occurs. You can refer to a case study on analyzing a real-world case here: http://javaeesupportpatterns.blogspot.com/2011/11/hprof-memory-leak-analysis-tutorial.html

  4. Stack : This area is specific to a thread. Each thread has its own stack, which is used to store local variables and to regulate method invocations, partial results and return values. This space can be tuned with the -Xss JVM option.
  5. Program Counter Register : It contains the address of the Java virtual machine instruction currently being executed.
  6. Native Stack : Used for native (non-Java) code; allocated per thread
  7. Execution Engine : This contains 3 parts;
    1. A virtual processor
    2. Interpreter, which reads the bytecode and execute the instructions
    3. Just-In-Time (JIT) compiler: "It is used to improve performance. JIT compiles parts of the bytecode that have similar functionality at the same time, and hence reduces the amount of time needed for compilation. Here the term 'compiler' refers to a translator from the instruction set of a Java virtual machine (JVM) to the instruction set of a specific CPU."
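Some of the areas above can be observed from a running program. The following standalone sketch prints the heap bounds (governed by -Xms/-Xmx) and shows that core classes are loaded by the bootstrap ("primordial") classloader, which is reported as null:

```java
public class JvmProbe {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Heap sizes: max corresponds to -Xmx, total is the currently committed heap
        System.out.println("max heap   : " + rt.maxMemory());
        System.out.println("total heap : " + rt.totalMemory());
        System.out.println("free heap  : " + rt.freeMemory());

        // Core classes come from the bootstrap (primordial) classloader -> null
        System.out.println("String loaded by   : " + String.class.getClassLoader());
        // Application classes come from a non-primordial (application) classloader
        System.out.println("JvmProbe loaded by : " + JvmProbe.class.getClassLoader());
    }
}
```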

Conclusion

In this post I wanted to gather a collection of knowledge I found related to the JVM from various articles. All those articles are mentioned in the References section below. I would like to thank all the original authors, and I hope you will get some knowledge out of this as well.

References

  1. JVM internals tutorial – http://www.javatpoint.com/internal-details-of-jvm
  2. About JVM Memory – https://abhirockzz.wordpress.com/2014/09/06/jvm-permgen-where-art-thou/
  3. Understanding Java PermGen – http://www.integratingstuff.com/2011/07/24/understanding-and-avoiding-the-java-permgen-space-error/
  4. Java HotSpot VM Options – http://www.oracle.com/technetwork/articles/java/vmoptions-jsp-140102.html
  5. Java Classloaders – http://www.javaworld.com/article/2077260/learn-java/learn-java-the-basics-of-java-class-loaders.html
  6. Heap Dump Analysis with VisualVM – http://www.javaworld.com/article/2072864/heap-dump-and-analysis-with-visualvm.html
  7. Real-world Example for Heap Dump Analysis with MAT – http://javaeesupportpatterns.blogspot.com/2011/11/hprof-memory-leak-analysis-tutorial.html

Experience with Azure Big Data Analyzing

Introduction

This experiment was done in order to gain a first understanding of how to deal with big data on the cloud. It was my first hands-on experience with several new technologies such as Hadoop and Hive. Before the experiment I tried a few examples with Hadoop and other related technologies, and found that this would be a better way to go.

Use case

This experiment is based on a dataset of annual salaries for different persons, in this case for San Francisco, California, USA in 2015. The dataset contains names, job titles, basic salary, additional benefits and total salary. I'll concentrate on job titles and their average total salaries.

Please note that this is an experiment which depends solely on the dataset. Actual survey reports may contain values different from the final results.

Prerequisites

Before moving to the rest of the article, you are expected to have a good understanding of the following:

  • Familiarity with the Microsoft Azure platform. You can create a free Azure account and play around with the portal. You also need to understand concepts such as Resources, Resource Groups etc.
  • A basic understanding of big data concepts such as what big data is, the Hadoop Distributed File System (HDFS), the Hadoop ecosystem, Hive, SQL etc.
  • It is better to have done some tutorials from the Azure documentation, especially “Analyze flight delay data”, which laid the foundation for this article.
  • An understanding of HTML, PHP, and JS with AJAX is required for the visualization part.
  • Familiarity with tools like PuTTY and WinSCP if you are on Windows, or with the scp and ssh commands otherwise

Planning Deployment

The deployment required for this article can be shown as follows.

Azure Deployment

Please note that I shut down the HDInsight cluster once it completed its job; otherwise you'll lose your credits!

The diagram shows the different steps needed in this experiment.

  • First you need a dataset to examine.
  • Then you have to transfer it to the Hadoop cluster, and from there to the Hadoop Distributed File System (HDFS).
  • Thereafter you will run the required Hive queries to extract the essence of the dataset. After processing, we'll move that data to a SQL database for ease of access.
  • Finally you need to develop a PHP application, running on an App Service node, to visualize the results.

Preparing dataset

You can download the San Francisco annual income dataset from the following location:

http://transparentcalifornia.com/salaries/2015/san-francisco/

After downloading, you need to open it using Excel. I observed that several job titles contain commas, so use find & replace to replace the commas in them with a hyphen character. Since this is a CSV file, those commas would negatively affect our analysis.

Put the CSV inside a zip archive to minimize data transfer.
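As an alternative to Excel's find & replace, the comma cleanup can also be done from the command line. The sketch below makes some assumptions: the file names are hypothetical, and the sed expression handles at most one comma per quoted field, so verify the result against your copy of the dataset before uploading.

```shell
# Demonstration on a one-line sample; substitute the real dataset's file name.
printf '"Manager, Senior",100\n' > salaries.csv

# Replace a comma inside a double-quoted field with a hyphen
# (handles one comma per quoted field; other lines pass through unchanged).
sed 's/"\([^",]*\),\([^"]*\)"/"\1-\2"/g' salaries.csv > salaries-clean.csv
```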

Setting up Azure resources

Now you need to create the required Azure resources. Let's start with HDInsight. In this case we need to create a Hadoop-on-Linux cluster with 2 worker nodes. The following guide will help you create such a cluster quickly.

https://azure.microsoft.com/en-in/documentation/articles/hdinsight-hadoop-linux-tutorial-get-started/

Also, for further information about cluster types and concepts, you may look at the following link:

https://azure.microsoft.com/en-in/documentation/articles/hdinsight-hadoop-provision-linux-clusters/

Unfortunately, Azure still has no way to deactivate an HDInsight cluster when it is idle. You need to delete it manually, or you'll be charged for idle hours too.

Thereafter you need to create a SQL database. The following tutorial will help with that:

https://azure.microsoft.com/en-us/documentation/articles/sql-database-get-started/#create-a-new-azure-sql-database (Create a new Azure SQL database section)

Finally you need to create an empty App Service. For further information about App Service, you may refer to the following:

https://azure.microsoft.com/en-us/documentation/articles/app-service-web-overview/

This App Service will contain the PHP runtime, which will be needed in the last part of the article.

A best practice when creating the above resources is to allocate them all to a single resource group, which makes them easy to manage.

Also make sure to use strong passwords and remember them.

Executing process

First you need to transfer the zip file you created to the Hadoop node's file system. You can do this using the scp command or any GUI tool that supports SCP.

As the host, you need to specify “CLUSTERNAME-ssh.azurehdinsight.net”. Along with that, you need to provide your SSH credentials.


scp FILENAME.zip USERNAME@CLUSTERNAME-ssh.azurehdinsight.net:

Then you need to access that node using SSH. On Windows you can use the PuTTY tool; others may use the terminal.


ssh USERNAME@CLUSTERNAME-ssh.azurehdinsight.net

Unzip the uploaded file:


unzip FILENAME.zip

Next, you have to move the CSV file to the Hadoop Distributed File System. Use the following commands to create a new directory in HDFS and put the CSV file there:


hdfs dfs -mkdir -p /sf/salary/data

hdfs dfs -put FILENAME.csv /sf/salary/data

Now you need to create and execute Hive queries. You can execute a Hive query from a file or in an interactive manner. We'll do both, but first using a file.

Create the file “salaries.hql” using nano

nano salaries.hql

and add the following queries:


DROP TABLE salaries_raw;
-- Creates an external table over the csv file
CREATE EXTERNAL TABLE salaries_raw (
 EMPLOYEE_NAME string,
 JOB_TITLE string,
 BASE_PAY float,
 OVERTIME_PAY float,
 OTHER_PAY float,
 BENEFITS float,
 TOTAL_PAY float,
 TOTAL_PAY_N_BENEFITS float,
 YEAR string,
 NOTES string,
 AGENCY string,
 STATUS string)
-- The following lines describe the format and location of the file
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/sf/salary/data';

-- Drop the salaries table if it exists
DROP TABLE salaries;
-- Create the salaries table and populate it with data
-- pulled in from the CSV file (via the external table defined previously)
CREATE TABLE salaries AS
SELECT
 JOB_TITLE AS job_title,
 BASE_PAY AS base_pay,
 TOTAL_PAY AS total_pay,
 TOTAL_PAY_N_BENEFITS AS total_pay_n_benefits
FROM salaries_raw;

You can also create “salaries.hql” locally and upload it via SCP.

The queries are self-explanatory, but note that each query ends with a semicolon. The “salaries_raw” table is created to directly extract the values in the CSV, so the first query has a one-to-one mapping with the CSV columns, and the table's data is taken from where we stored the CSV file. Thereafter the “salaries” table is created from the “salaries_raw” table, keeping only the job_title, base_pay, total_pay and total_pay_n_benefits columns, because only those are necessary for the next stage.

To execute the Hive query file, use the following command:


beeline -u 'jdbc:hive2://localhost:10001/;transportMode=http' -n admin -f salaries.hql

The next part of the Hive processing we'll do in an interactive manner. You can open an interactive shell with the command:


beeline -u 'jdbc:hive2://localhost:10001/;transportMode=http' -n admin

and enter the following query:


INSERT OVERWRITE DIRECTORY '/sf/salary/output'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT
job_title,
avg(base_pay),
avg(total_pay),
avg(total_pay_n_benefits)
FROM salaries
WHERE base_pay IS NOT NULL AND total_pay IS NOT NULL AND total_pay_n_benefits IS NOT NULL
GROUP BY job_title;

The above Hive query will write the result to the “/sf/salary/output” folder. It groups by job title and computes the average values of the base_pay, total_pay and total_pay_n_benefits columns.

Use the “!quit” command to exit the interactive shell.

At this stage, we have successfully extracted the essence of the dataset. Next, we need to make it ready for presentation.

For presentation, we're going to copy the output data to the SQL database we created.

To create a table and do other interactions with the SQL database, we need to install FreeTDS on the Hadoop node. Use the following commands to install it and verify the connectivity.


sudo apt-get --assume-yes install freetds-dev freetds-bin

TDSVER=8.0 tsql -H <serverName>.database.windows.net -U <adminLogin> -P <adminPassword> -p 1433 -D <databaseName>

Once you execute the last command, you'll be taken to another interactive shell where you can interact with the database you created when setting up the SQL node.

Use the following statements to create a table to hold the output we got:


CREATE TABLE [dbo].[salaries](
[job_title] [nvarchar](50) NOT NULL,
[base_pay] float,
[total_pay] float,
[total_pay_n_benefits] float,
CONSTRAINT [PK_delays] PRIMARY KEY CLUSTERED
([job_title] ASC))
GO

Use “exit” command to exit from SQL interactive session.

To move the data from HDFS to the SQL database, we are going to use Sqoop. The following Sqoop command will export the output data in HDFS to the SQL database.


sqoop export --connect 'jdbc:sqlserver://<serverName>.database.windows.net:1433;database=<databaseName>' --username <adminLogin> --password <adminPassword> --table 'salaries' --export-dir 'wasbs:///sf/salary/output' --fields-terminated-by '\t' -m 1

Once the task has completed successfully, you can again log in to the SQL interactive session and execute the following to view the table contents:


SELECT * FROM salaries

GO

Finally you need to use FTP to connect to the App Service node and upload the following PHP files (including the js folder).

https://github.com/Buddhima/AzureSFSalaryApp

You need to change the SQL server host and credentials in the “dbconnect.php” file. I'll leave the rest of the code in the PHP files as self-explanatory. If you successfully created the app, you should see something similar to the following:

azure_web_app


Conclusion

In this article I have shown you how I did my first experiment with Azure big data analysis. Along the way I also looked at several other related technologies, such as Spark and Azure Stream Analytics, each with its own pros and cons. For a case like analyzing annual income, it's generally accepted to use Hadoop along with Hive; but if you need to run more frequent analyses, you may look into the alternatives.

References

  1. Project sourcecode – https://github.com/Buddhima/AzureSFSalaryApp
  2. Get started with Hadoop in Windows – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-tutorial-get-started-windows
  3. Analyze Flight Delays with HD Insight – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-analyze-flight-delay-data-linux
  4. Use Hive and HiveQL with Hadoop in HDInsight – https://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-hive/

Reliable Messaging with WSO2 ESB

Introduction

Web-Service Reliable Messaging (WS-ReliableMessaging) is a standard which describes a protocol for delivering messages reliably between distributed applications. Message failures due to software component, system, or network failures can be overcome through this protocol. It describes a transport-independent protocol, such that messages can be exchanged between systems. For further information, please go through the WS-RM specification [1], which describes this topic completely. For WSO2 ESB, WS-RM is not a novel concept, as it was present in previous releases. But with the new release, WSO2 ESB 4.9.0, WSO2 has separated QoS features from the fresh pack; instead, you can install WS-RM as a feature from the p2 repo. Another major change is that WS-RM now operates on top of CXF WS-RM [2], acting as an inbound endpoint [3].

In this post, I'm not going to describe WS-RM comprehensively, but to show how it can be configured in ESB. If you need to read more about the WS-RM protocol, I recommend the WS-RM specification [1], which is a good source for that. Now, let's move on step by step with a sample use case.

Setting up

First you need to understand that the WS-RM inbound is designed to reliably exchange messages between a client and WSO2 ESB. The message flow diagram is as follows:

Sample Setup Diagram

For this example, I'm using the SimpleStockQuote service which comes with WSO2 ESB. You can read more about configuring and starting the service on the default port in the documentation. If you have configured it properly, you should be able to access its WSDL via “http://localhost:9000/services/SimpleStockQuoteService?wsdl“.

Next, you need to install the “CXF WS Reliable Messaging” feature from the p2 repo. For installing features, please go through the Installing Features documentation. With this step, you have completed setting up the infrastructure for the use case. Also note that this feature requires cxf-bundle and jetty-bundle; make sure you have no conflicts when installing those bundles.

In order to configure the CXF server, we need to provide a configuration file. A sample configuration can also be found in the CXF Inbound Endpoint documentation. In that configuration file, you may need to configure the paths to the key stores. A sample can be found at [5]. For this sample, configure its key store paths and place it in the “<ESB_HOME>/repository/conf/cxf” folder.

Now, I'm going to create a new WS-RM inbound endpoint. For that, select “Inbound Endpoints” from the left panel, then click “Add Inbound Endpoint”. You'll get a page to initiate an inbound endpoint. At this stage you need to give the WS-RM inbound endpoint a name, and select the type as “Custom”. You have to do that because, as I mentioned initially, WS-RM does not come along with the fresh ESB pack. Moving to the next step, you get the chance to do the rest of the configuration. The following image depicts the configuration of a sample WS-RM inbound endpoint.

wsrm_sample_config

At this point, you may already have some idea about the inbound endpoint. I have configured it to listen on port 20940 on localhost. The class of the custom inbound should be “org.wso2.carbon.inbound.endpoint.ext.wsrm.InboundRMHttpListener” (without quotes). The “inbound.cxf.rm.config-file” parameter describes where you have placed the CXF server configuration file.

Messages coming into the specified port will go to the specified “sequence”, in this case the RMIn sequence, and faulty messages will go to the “fault” sequence. Other configuration details are described in the official documentation [4].

You can also do the above step directly by adding the inbound configuration to the synapse configuration.

Inbound Endpoint:


<inboundEndpoint xmlns="http://ws.apache.org/ns/synapse" name="RM_INBOUND_NEW_EXT" sequence="RMIn" onError="fault" class="org.wso2.carbon.inbound.endpoint.ext.wsrm.InboundRMHttpListener" suspend="false">
   <parameters>
      <parameter name="inbound.cxf.rm.port">20940</parameter>
      <parameter name="inbound.cxf.rm.config-file">repository/conf/cxf/server.xml</parameter>
      <parameter name="coordination">true</parameter>
      <parameter name="inbound.cxf.rm.host">127.0.0.1</parameter>
      <parameter name="inbound.behavior">listening</parameter>
      <parameter name="sequential">true</parameter>
   </parameters>
</inboundEndpoint>

RMIn sequence:


<sequence xmlns="http://ws.apache.org/ns/synapse" name="RMIn" onError="fault">
   <in>
      <property name="PRESERVE_WS_ADDRESSING" value="true"/>
      <header xmlns:wsrm="http://schemas.xmlsoap.org/ws/2005/02/rm" name="wsrm:Sequence" action="remove"/>
      <header xmlns:wsa="http://www.w3.org/2005/08/addressing" name="wsa:To" action="remove"/>
      <header xmlns:wsa="http://www.w3.org/2005/08/addressing" name="wsa:FaultTo" action="remove"/>
      <log level="full"/>
      <send>
         <endpoint>
            <address uri="http://localhost:9000/services/SimpleStockQuoteService"/>
         </endpoint>
      </send>
   </in>
   <out>
      <send/>
   </out>
</sequence>

Now you have completed the sample setting up. One more step to go. Let’s test this sample.

Running the sample

For that, ESB provides a client which can send reliable messages. Go to the <ESB_HOME>/samples/axis2Client folder from a terminal and run the following command:

ant stockquote -Dsymbol=IBM -Dmode=quote -Daddurl=http://localhost:20940 -Dwsrm=true

The command will send a getQuote request to ESB using WS-RM and print the expected result.

Message flow

As specified in the WS-RM spec [1], several messages are exchanged between the client and ESB in this scenario. If you use a packet-capturing tool like Wireshark, you'll see those messages. I have attached the message flow [6] I observed to make it clearer. In brief, the following messages are exchanged; you can follow them in the attached text file:

  1. “CreateSequence” message to initiate reliable messaging
  2. “CreateSequenceResponse” from ESB to client
  3. The actual message with data, from client to ESB. This is both the first and the last such message in this case. ESB sends this message to the backend server and gets the response.
  4. “SequenceAcknowledgement” message, along with the response from the backend server, sent from ESB to the client
  5. “TerminateSequence” message from client to ESB

Conclusion

Through this post, I wanted to introduce you to the new approach to implementing WS-ReliableMessaging. This implementation comes with the WSO2 ESB 4.9.0 release, and prior releases had a different approach. Therefore this post will help anyone who is interested in doing WS-RM with newer ESB versions.

References

[1] WS-ReliableMessaging spec. – http://specs.xmlsoap.org/ws/2005/02/rm/ws-reliablemessaging.pdf

[2] CXF WS-RM – http://cxf.apache.org/docs/ws-reliablemessaging.html

[3] WSO2 ESB, Inbound Endpoint – https://docs.wso2.com/display/ESB490/Working+with+Inbound+Endpoints

[4] CXF WS-RM Inbound Endpoint – https://docs.wso2.com/display/ESB490/CXF+WS-RM+Inbound+Protocol

[5] Sample CXF configuration – http://www.filedropper.com/server_23

[6] Message flow – link-to-file