SQL-on-Hadoop: Is SQL the next big step for Hadoop?
Since its early days, the Hadoop community has made several attempts to stretch Hadoop beyond its role as a distributed programming framework. The key strength that Hadoop brings to the table is its ability to scale linearly. Can we combine this advantage of Hadoop with the efficiency of databases? What does it take to run SQL over Hadoop?
Running SQL-on-Hadoop implies accessing data from “within” Hadoop using SQL as the interface. Accomplishing this demands a significant re-architecture of the storage and compute infrastructures.
SQL-on-Hadoop also shifts Hadoop’s role from a technology viewed so far as complementary to databases into something that could compete with them. It is perhaps the single most significant feature that will help Hadoop find its way into more enterprises.
This session will highlight some conceptual ideas behind the different ways that SQL processors can be implemented atop Hadoop. I’ll take examples from OSS and research-ware products.
Modelling RESTful applications – Why should I not use verbs in REST url
As per REST, services should be exposed over HTTP using the HTTP verbs (i.e. GET/PUT/POST/DELETE) rather than verbs in the URL. But in a real-life complex application, we have to expose many services such as approve and reject, where adding verbs to the URL seems inevitable. What should we do? Should we just have URLs like ../foods/1/approve?
What would go wrong if we use verbs in a REST URL?
Is there some rationale behind the rule, or is it just REST dogma? Are there any “REST guidelines”?
In this session we will explore how to model our services so that we follow the RESTful way, adhering to the HTTP specifications. But most importantly, we will try to understand why we should do that. What will go wrong if we don't, and what are the benefits if we do?
We will go through the HTTP specifications and browser caching capabilities. I will then discuss a use case: handling the acceptance and rejection of friend requests using pure REST URLs.
Takeaway for the audience: learn how to model services in a RESTful manner and, more importantly, understand why to do so and what would fail otherwise. The session assumes that most of the audience is aware of basic REST theory, so it focuses on how to address real-world problems faced while modelling REST applications.
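As an illustration of the friend-request use case, here is a minimal sketch of one common way to avoid verbs in the URL: treat the request's status as a noun sub-resource and change it with PUT. This is not necessarily the approach the session takes. The URL layout (/friend-requests/<id>/status), the handle function and the in-memory store are all assumptions made up for this example.

```python
# Hypothetical in-memory store standing in for a database.
FRIEND_REQUESTS = {1: {"from": "alice", "to": "bob", "status": "pending"}}

# A pending request may be accepted or rejected; decided requests are final.
ALLOWED_TRANSITIONS = {"pending": {"accepted", "rejected"}}


def handle(method, path, body=None):
    """Handle a request to /friend-requests/<id>/status.

    Returns a (status_code, payload) tuple.
    """
    parts = path.strip("/").split("/")
    if (len(parts) != 3 or parts[0] != "friend-requests"
            or not parts[1].isdigit() or parts[2] != "status"):
        return 404, {"error": "not found"}

    resource = FRIEND_REQUESTS.get(int(parts[1]))
    if resource is None:
        return 404, {"error": "not found"}

    if method == "GET":
        # Safe: reading the status never changes it.
        return 200, {"status": resource["status"]}

    if method == "PUT":
        new_status = (body or {}).get("status")
        current = resource["status"]
        if new_status == current:
            # Idempotent: repeating the same PUT leaves the state unchanged.
            return 200, {"status": current}
        if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
            return 409, {"error": "cannot change %s to %s" % (current, new_status)}
        resource["status"] = new_status
        return 200, {"status": new_status}

    return 405, {"error": "method not allowed"}


# Accepting a friend request is a PUT of the new state, not a call to ".../accept".
print(handle("PUT", "/friend-requests/1/status", {"status": "accepted"}))
```

Because the URL names a thing (the status) and the HTTP method carries the action, a retried PUT is harmless and a GET on the same URL stays safe to cache, which is the kind of benefit the HTTP specification attaches to these methods.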
Building Hadoop Pipelines using Apache Crunch
Most Hadoop processing is composed not of a single job but of a chain of jobs. Building and managing such a chain is quite tricky, which is why people start to look at other MapReduce frameworks like Pig. But then you have to learn new semantics. Apache Crunch aims to change this: why learn new semantics to do the same thing?
For developers this means more focus on solving our actual problems rather than wrestling with MapReduce/Pig/Hive. Crunch is available in Java and Scala and offers a higher level of flexibility than any of the current set of MapReduce tools under the Apache license. I will demonstrate how we can build a chain of jobs in Crunch and perform various operations such as joins and aggregations. Crunch is quite extensible, so I will also showcase how easy it is to write and build a library of reusable custom functions for our pipelines.
Building Single Page Applications with Angular.js
Single Page Applications are all the rage right now. With great frameworks like Backbone.js and Angular.js, it has become easier to organize client-side code and render templates on the client side. The backend serves as a basic data generation and capturing machine, which makes it simple to scale and test while maintaining great performance and usability.
At the end of the session, attendees will understand the feature set of Angular.js and the terms commonly used in the Angular world, and will know how to integrate Angular.js apps with any server-side technology.
Big Data Search Simplified With Elasticsearch
Most modern applications generate large amounts of data in order to understand the needs and likes of their customers. However, finding meaningful information within this data is like finding a needle in a haystack. In this session we will look at some solutions currently used for Big Data search and then take a closer look at one of the frontrunners, Elasticsearch. GitHub, Foursquare, StumbleUpon and SoundCloud all use Elasticsearch to analyze and search through terabytes of data and to serve millions of search requests.
On Elasticsearch, we will discuss:
What Elasticsearch is and how it works.
How Elasticsearch analyzes data, splitting a document into meaningful portions and indexing each of those portions separately, so that whenever a new search request comes in, it knows what to find.
Features and advantages of Elasticsearch, like built-in sharding defaults, maintaining fail-safe node clusters, automatically adding a new node without having to reboot, and so on.
Out-of-the-box features for today’s applications, like faceted search, reverse search using Percolators, and pre-built Analyzers.
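The analysis step in the list above (splitting a document into portions and indexing each portion separately) can be pictured with a toy inverted index. The sketch below only illustrates the general idea; it is not how Elasticsearch is implemented, and the function names, the stop-word list and the sample documents are all assumptions made up for this example.

```python
import re
from collections import defaultdict

# Illustrative stop words only; not the list any real analyzer ships with.
STOP_WORDS = {"a", "an", "the", "and", "of", "in", "to", "is"}


def analyze(text):
    """Lowercase the text, split it into word tokens and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(documents):
    """Map each token to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = analyze(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


docs = {
    1: "Finding a needle in a haystack",
    2: "The haystack is large",
    3: "Search through terabytes of data",
}
index = build_index(docs)
print(search(index, "haystack"))         # documents 1 and 2
print(search(index, "needle haystack"))  # document 1 only
```

The query is run through the same analyze step as the documents, so a search becomes a lookup of tokens in the index rather than a scan of every document.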
Java Clouds – Evaluating and Adopting Java EE PaaS
This session will look at the promise, benefits and challenges of using popular Java PaaS services. It will compare popular cloud services like Jelastic, CloudBees and the Oracle Java Cloud, and then discuss the key points to consider when choosing a Java PaaS vendor. The session will cover the cloud capabilities and limitations of Java EE 6 and 7, and how you would often have to customize your Java EE application development to the cloud vendor’s specifications. It will also look at how add-on cloud services beyond basic Java EE, like a development platform or database support, can often be a crucial factor when deciding on a Java PaaS.
The session will demo deploying Java EE applications on multiple Java PaaS platforms and show the features, reporting and management capabilities these services offer. The takeaway for the audience would be a better understanding of the Java EE PaaS alternatives and how best to build for, utilize and adopt a Java EE cloud service.
Java Garbage Collectors – Moving to Java7 Garbage-First (G1) Collector
One of the key strengths of the JVM is automatic memory management (Garbage Collection). Understanding it can help in writing better applications. This becomes all the more important as enterprise server applications have large amounts of live heap data and a significant number of parallel threads. Until recently, the main collectors were the parallel collector and the concurrent-mark-sweep (CMS) collector. This paper introduces the various Garbage Collectors and compares the CMS collector against its replacement, a new implementation in Java 7, i.e. Garbage-First (G1). G1 is characterized by a single contiguous heap which is split into same-sized regions. In fact, if your application is still running on the 1.5 or 1.6 JVM, a compelling argument to upgrade to Java 7 is to leverage G1.
Building An Enterprise Big Data Platform For 100TB Dataset
In this session I will share my experience of building an enterprise Big Data platform for a 100TB dataset, with a medical sector use case. We will look at how we went about managing the unstructured data (genomics, imaging) on HBase/Hadoop and the structured data (biochemistry, skin tests etc.) on the NoSQL database MongoDB, and the challenges faced along the way.
Design Thinking For Web & Mobile Product Design
In this time of volatility and complexity, the importance of design in driving meaningful innovation and change is ever-growing. While there are multiple factors for a product design to be desirable, feasible and viable, the design thinking process can help achieve these product characteristics.
The talk will cover:
Design thinking and its importance in product design for web & mobile
How experts practice this in their respective job functions while driving the product vision
Bootstrap Strategies and Design Challenges
Design thinking approach to solving problems of product features’ design and uncovering the latent needs, behaviours, and desires of users
This talk will present case studies, methodology and tools to take up Design Thinking and deliver an awesome user experience for web & mobile products.
Designing for Concurrency and Performance
There is a wide variety of patterns and practices available for building scalable and performant systems. This talk will explore different models to get a holistic view of the good, the bad and the ugly. It will cover threads, locking, event handlers, actors and more. We will take a look at how we have been handling concurrency in Java and how we should be handling it. We will also look at how something like this becomes a no-brainer with Scala and Akka.
The talk will discuss how to design applications with concurrency and performance in mind, a way of thinking that differs from how we currently think about methods and functions.
The key takeaway will be practices, principles and patterns to achieve concurrency and performance. The talk will also show how features in Java and Scala make each of these easier.
Using Graph Databases For Insights Into Connected Data
Graph databases address one of the great macroscopic business trends of today: leveraging complex and dynamic relationships in highly connected data to generate insight and competitive advantage. Whether we want to understand relationships between customers, elements in a telephone or data center network, entertainment producers and consumers, or genes and proteins, the ability to understand and analyze vast graphs of highly connected data will be key in determining which companies outperform their competitors over the coming decade. In this session, I will cover graph database concepts, mainly with reference to Neo4j.
UnConference is a participant-driven session for short, open discussions on various topics of interest to delegates. A delegate can present a question or make a point in a couple of minutes, followed by an open discussion in which speakers as well as others from the assembly contribute their views. The UnConference has been working wonderfully well at IndicThreads Conferences, helping delegates crowdsource opinions and answers on various topics of intrigue and interest. The topics for the UnConference are decided via listings on a whiteboard at the venue.