SQL-on-Hadoop: Is SQL the next big step for Hadoop?

Aug 01 2013

Since its early days, the Hadoop community has made several attempts to stretch Hadoop beyond its role as a distributed programming framework. The key strength Hadoop brings to the table is its ability to scale linearly. Can we combine this advantage of Hadoop with the efficiency of databases? What does it take to run SQL over Hadoop?

Running SQL-on-Hadoop implies accessing data from “within” Hadoop using SQL as the interface. Accomplishing this demands a significant re-architecture of the storage and compute infrastructures.

SQL-on-Hadoop also shifts Hadoop’s role from a technology viewed so far as complementary to databases into one that could compete with them. It’s perhaps the single most significant feature that will help Hadoop find its way into more enterprises.

This session will highlight some conceptual ideas behind the different ways SQL processors can be implemented atop Hadoop, with examples drawn from open-source and research-ware products.


Srihari currently heads the technology organization for ThoughtWorks India. He has been a developer and architect for several enterprise applications, with a focus on building large-scale systems based on service-oriented architectures, domain-specific languages etc. He is passionate about distributed systems and databases.

Modelling RESTful applications – Why should I not use verbs in REST URLs?

Jul 29 2013

As per REST, services should be exposed over HTTP using the HTTP verbs (i.e. GET/PUT/POST/DELETE), keeping verbs out of the URLs themselves. But in a real-life complex application, we are faced with exposing operations such as approve and reject, where adding verbs to the URL seems inevitable. What should we do? Should we just have URLs like ../foods/1/approve?

What would go wrong if we used verbs in REST URLs?
Is there some rationale behind the rule, or is it just REST dogma? Are there any “REST guidelines”?

In this session we will explore how to model our services the RESTful way, adhering to the HTTP specifications. Most importantly, we will try to understand why we should do that: what will go wrong otherwise, and what the benefits are.

We will go through the HTTP specifications and browser caching capabilities, and then discuss a use case: handling the accepting/rejecting of friend requests using pure REST URLs.
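One common RESTful alternative to a verb URL like ../friend-requests/1/approve is to treat the request’s status as a resource and change it with PUT. The sketch below illustrates that idea with an in-memory handler; the class, URL shape and status codes are my own assumptions for illustration, not necessarily the approach the session takes:

```java
// Hypothetical sketch: model "approve"/"reject" as a state change on a
// status resource (PUT /friend-requests/{id}/status, body "ACCEPTED" or
// "REJECTED") instead of a verb URL (POST /friend-requests/{id}/approve).
import java.util.HashMap;
import java.util.Map;

class FriendRequests {
    enum Status { PENDING, ACCEPTED, REJECTED }

    private final Map<Integer, Status> requests = new HashMap<>();

    FriendRequests() {
        requests.put(1, Status.PENDING); // seed one pending request
    }

    // Handler for: PUT /friend-requests/{id}/status
    // Returns an HTTP-style status code.
    int putStatus(int id, String body) {
        if (!requests.containsKey(id)) return 404; // unknown request
        try {
            requests.put(id, Status.valueOf(body));
            return 200;
        } catch (IllegalArgumentException e) {
            return 400; // body is not a valid status value
        }
    }

    Status getStatus(int id) {
        return requests.get(id);
    }
}
```

The URL now names a thing (the status) and the verb stays in the HTTP method, keeping the interface uniform and friendly to HTTP intermediaries.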

Takeaway for the audience: learn how to model services in a RESTful manner and, more importantly, understand why to do so and what would fail otherwise. The session is designed mindful of the fact that most of the audience will be aware of basic REST theory, so it focuses on the real-world problems faced while modelling REST applications.


Anirudh Bhatnagar has been working with Java/J2EE-based technologies for the last 6 years. In the initial phase of his career he worked with product development companies like Adobe India, but later moved towards consulting as he found it more exciting. He is always fascinated by new technologies and emerging trends in software development, and has been involved in propagating these changes and new technologies in the projects he works on. He is an avid blogger and agile enthusiast who believes in writing clean and well-tested code. Anirudh has previously presented on topics related to Java, J2EE, Liferay, Hadoop, Cloud and Agile at several occasions, including AgileNCR 2013 and BeingAgile 2013. He has been working as a Senior Consultant with Xebia India for the past 2 years.

Building Hadoop Pipelines using Apache Crunch

Jul 29 2013

Most Hadoop processing is composed not of a single job but of a chain of jobs. Building and managing such a chain is quite tricky, which is why people look at other MapReduce frameworks like Pig. But then you have to learn new semantics. Apache Crunch aims to change this: why learn new semantics to do the same thing?

For developers this means more focus on solving our actual problems rather than wrestling with MapReduce/Pig/Hive. Crunch is available in Java and Scala and offers a higher level of flexibility than any of the current set of Apache-licensed MapReduce tools. I will demonstrate how we can build a chain of jobs in Crunch and perform various operations like joins and aggregations. Crunch is quite extensible, so I will also showcase how easy it is to write and build a library of reusable custom functions for our pipelines.
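The pipeline shape Crunch encourages — read, transform, aggregate, without hand-wiring a chain of jobs — can be sketched very loosely with plain Java streams. The snippet below is a stand-in analogy for that shape, not the actual Crunch API (in Crunch the equivalents would be things like `parallelDo` and `count` over `PCollection`s, running as MapReduce jobs):

```java
// Word count, the canonical two-stage (map + reduce) chain, expressed as a
// single fluent pipeline. Crunch lets you write Hadoop job chains in a
// similarly composable style; java.util.stream is used here only as an
// in-memory stand-in.
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class PipelineSketch {
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                // "map" stage: split each line into words
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                // "reduce" stage: group identical words and count them
                .collect(Collectors.groupingBy(Function.identity(),
                        Collectors.counting()));
    }
}
```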


Rahul Sharma is a Senior Developer at Mettl.com. He has 8 years of experience in the software industry and has worked on several projects using Java/J2EE as the primary technology. He has an inclination towards open-source technologies and likes to explore and delve into new frameworks. He is one of the Apache Crunch developers. He has spoken at the IndicThreads conference (Pune 2012).

Building Single Page Applications with Angular.js

Jul 29 2013

Single Page Applications are all the rage right now. With great frameworks like Backbone.js and Angular.js it has become easier to organize client-side code and render templates on the client. The backend serves as a basic data generation and capturing machine, which makes it simple to scale and test while maintaining great performance and usability.

In this session we will look at Angular.js, an exciting JavaScript client side framework being developed by Google. We will look at the features Angular.js provides and how to build applications using it by working through a live demo.

At the end of the session, attendees will understand the feature set of Angular.js and the commonly used terms in the Angular world, and know how to integrate Angular.js apps with any server-side technology.


Rocky is a software developer with over a decade of experience in software design and programming. He enjoys coding in Ruby, Scala, JavaScript and Java. He loves working on open source projects and tinkers with technology in his free time. He is currently working as a software developer for McKinsey & Company and has also worked as a developer for investment banks like Goldman Sachs and Morgan Stanley.  He has been a speaker at AgileNCR 2010, Agile Tours 2010, IndicThreads Conference on Cloud Computing 2011, Ruby Conf India 2012 & 2013 and IndicThreads Software Development Conference (Delhi 2012).

Big Data Search Simplified With Elasticsearch

Jul 29 2013

Most modern applications generate large amounts of data in order to understand the needs and likes of their customers. However, finding meaningful information within this data is like finding a needle in a haystack. In this session we will look at some solutions currently being used for Big Data search and then take a closer look at one of the frontrunners, Elasticsearch. GitHub, FourSquare, StumbleUpon and SoundCloud all use Elasticsearch to analyze and search through terabytes of data and millions of search requests.

For Elasticsearch, we will discuss:

  • What Elasticsearch is and how it works.
  • How Elasticsearch analyzes data, splitting a document into meaningful portions and indexing each of those portions separately, so that whenever a new search request comes in it knows where to look.
  • Features and advantages of Elasticsearch, such as built-in sharding defaults, maintaining fail-safe node clusters, and automatically adding a new node without having to reboot.
  • Out-of-the-box features for today’s applications, like faceted search, reverse search using percolators, and pre-built analyzers.
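The “split a document and index each portion separately” idea above is essentially an inverted index. Here is a toy sketch of that structure (my own simplification for illustration, not Elasticsearch internals — Elasticsearch layers analyzers, sharding and Lucene segments on top of this basic idea):

```java
// Toy inverted index: each document is split into terms, and each term maps
// back to the set of documents containing it, so a search knows immediately
// which documents to return without scanning them all.
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TinyIndex {
    private final Map<String, Set<Integer>> index = new HashMap<>();

    void addDocument(int docId, String text) {
        // Crude "analysis": lowercase and split on non-word characters.
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    Set<Integer> search(String term) {
        return index.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }
}
```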

Manoj Mohan is a software developer at Intelligrape Software, based in Noida, UP. He has worked on various technologies for applications ranging from custom solutions in Grails to PhoneGap and GXT. He is fastidious about the frameworks he chooses and loves tinkering with tools to get the most productivity with the least hassle.

Java Clouds – Evaluating and Adopting Java EE PaaS

Jul 29 2013

This session will look at the promise, benefits and challenges of using popular Java PaaS services. It will compare popular cloud services like Jelastic, CloudBees and the Oracle Java Cloud, and then discuss the key points to consider while choosing a Java PaaS vendor. The session will talk about the cloud capabilities and limitations of Java EE 6 and 7, and how you would often have to customize your Java EE application development to the cloud vendor’s specifications. It will also look at how add-on cloud services beyond basic Java EE, like a development platform or database support, can often be a crucial factor while deciding on a Java PaaS.

The session will demo deploying Java EE applications on multiple Java PaaS platforms and show the features, reporting and management capabilities of the services. The takeaway for the audience will be a better understanding of the Java EE PaaS alternatives and how to best build for, utilize and adopt a Java EE cloud service.


Harshad Oak is the founder of IndicThreads & Rightrix Solutions. He is the author of the books Oracle JDeveloper 10g and Pro Jakarta Commons, and coauthor of the J2EE 1.4 Bible. For his contributions to technology and the community, he has been recognized as an Oracle ACE Director and a Java Champion. He is currently working on a book about the Oracle Java Cloud Service…

Java Garbage Collectors – Moving to Java7 Garbage-First (G1) Collector

Jul 28 2013

One of the key strengths of the JVM is automatic memory management (garbage collection). Understanding it can help in writing better applications. This becomes all the more important as enterprise server applications have large amounts of live heap data and significant numbers of parallel threads. Until recently, the main collectors were the parallel collector and the concurrent-mark-sweep (CMS) collector. This session introduces the various garbage collectors and compares the CMS collector against its replacement, a new implementation in Java 7: the Garbage-First (G1) collector. G1 is characterized by a single contiguous heap which is split into same-sized regions. In fact, if your application is still running on the 1.5 or 1.6 JVM, a compelling argument for upgrading to Java 7 is to leverage G1.
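Switching a Java 7 application to G1 is mostly a matter of launch flags. A typical starting point might look like the following (the heap sizes, pause goal and jar name are illustrative assumptions, not recommendations for any particular workload):

```sh
# Enable G1 (in place of CMS or the parallel collector) with a soft
# pause-time goal. MaxGCPauseMillis is a target, not a guarantee; G1
# splits the heap given by -Xms/-Xmx into same-sized regions internally.
java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xms4g -Xmx4g -jar myapp.jar
```

Adding `-verbose:gc` while experimenting makes it easy to compare pause behaviour before and after the switch.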


Gurpreet S. Sachdeva is Director – Technology at Aricent Group. He has 16 years of experience working on enterprise computing technologies and communication network applications. A keen Java enthusiast, he has worked with Java EE technology on almost every major application platform, ranging from Tomcat to JBoss, Oracle Application Server and WebLogic.

Building An Enterprise Big Data Platform For 100TB Dataset

Jul 28 2013

In this session I will share my experience building an enterprise Big Data platform for a 100TB dataset, with a medical-sector use case. We will look at how we went about managing the unstructured data (genomics, imaging) on HBase/Hadoop and the structured data (biochemistry, skin tests etc.) in the MongoDB NoSQL database, and the challenges faced along the way.


Harpreet Singh is an accomplished engineer/entrepreneur in the informatics and engineering domains, with a record of achievements in scientific and business leadership roles. He has over 14 years of industry and research experience, including fundamental biotechnology research, in the areas of big data, enterprise platforms, cloud computing and preclinical (toxicology) drug discovery…

Design Thinking For Web & Mobile Product Design

Jul 27 2013

In this time of volatility and complexity, the importance of design in driving meaningful innovation and change is ever-growing. While multiple factors determine whether a product design is desirable, feasible and viable, the design thinking process can help achieve these characteristics.

The talk would cover:

  1. Design thinking and its importance in product design for web & mobile
  2. How experts practice it in their respective job functions while driving the product vision
  3. Bootstrap strategies and design challenges
  4. A design thinking approach to solving problems of product feature design and uncovering the latent needs, behaviours and desires of users

This talk will present case studies, methodology and tools to take up Design Thinking and deliver an awesome user experience for web & mobile products.


Dushyant Arora is Director – User Experience at MakeMyTrip.com. He is a seasoned, energetic and result-oriented UX practitioner with a 14-year proven track record of delivering innovative experiential B2C, B2B & B2E sites, online products and apps for desktop and mobile, for Fortune 500 companies and startups…

Designing for Concurrency and Performance

Jul 18 2013

A wide variety of patterns and practices deal with building scalable and performant systems. This talk will explore the different models, getting a holistic view of the good, the bad and the ugly. It will cover threads, locking, event handlers, actors and more. We take a look at how we have been handling concurrency in Java and how we should be handling it, and at how something like this becomes a no-brainer with Scala and Akka.

The talk will discuss how to design applications with concurrency and performance in mind from the outset, a paradigm different from how we think about methods and functions today.
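As a small, generic taste of what such designs look like in Java (my own illustration, not the talk’s material): a shared counter updated by a thread pool, built on `AtomicLong`’s lock-free compare-and-set instead of a `synchronized` block:

```java
// Four threads each perform 10,000 increments on a shared counter.
// AtomicLong.incrementAndGet() is a lock-free compare-and-set, so there is
// no contended critical section, yet the final count is still exact.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class CounterDemo {
    static long countWithAtomics(int threads, int perThread) throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES); // wait for all increments
        return counter.get();
    }
}
```

The same exactness with a plain `long` would require a lock around every increment; designing for concurrency means choosing structures like this up front rather than bolting locks on later.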

The key takeaway will be practices, principles and patterns to achieve concurrency and performance. The talk will also show how features in Java and Scala make each of these easier.


Vikas is the CTO at Knoldus Software LLP. In his 16+ years of experience he has become a recognized speaker, mentor and practitioner in the software industry. He blogs, has presented at various technology conferences, and has written articles on software development for Agile Journal and The Server Side. He is an Agile editor at InfoQ.com, where he posts weekly about the latest and greatest in the community. His latest passions include Scala and Akka, and using them to build distributed, scalable systems…

Using Graph Databases For Insights Into Connected Data

Jul 16 2013

Graph databases address one of the great macroscopic business trends of today: leveraging complex and dynamic relationships in highly connected data to generate insight and competitive advantage. Whether we want to understand relationships between customers, elements in a telephone or data center network, entertainment producers and consumers, or genes and proteins, the ability to understand and analyze vast graphs of highly connected data will be key in determining which companies outperform their competitors over the coming decade. In this session, I am going to cover graph database concepts, mainly with reference to Neo4j.

  1. High level view of Graph Space
  2. Power of Graph Databases
  3. Data Modeling with Graphs
  4. Cypher : Graph Query language
  5. Building a Graph Database Application
  6. Graphs in Real World / Common Use cases
  7. Predictive Analysis with Graph Theory
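To make “insight from connected data” concrete, here is the classic friends-of-friends question answered imperatively over a toy adjacency map (my own model for illustration; in Neo4j’s Cypher this kind of two-hop traversal is a short declarative MATCH pattern, and a graph database executes it without table joins):

```java
// Toy undirected graph stored as an adjacency map. friendsOfFriends finds
// people exactly two hops away -- the kind of traversal a graph database
// makes first-class.
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FriendGraph {
    private final Map<String, Set<String>> adj = new HashMap<>();

    void addFriendship(String a, String b) { // undirected edge
        adj.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adj.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    Set<String> friendsOfFriends(String person) {
        Set<String> direct = adj.getOrDefault(person, Collections.emptySet());
        Set<String> result = new TreeSet<>();
        for (String friend : direct) {
            result.addAll(adj.getOrDefault(friend, Collections.emptySet()));
        }
        result.remove(person);       // not a friend of yourself
        result.removeAll(direct);    // exactly two hops: exclude direct friends
        return result;
    }
}
```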

Gagan Agarwal is a Sr. Consultant at Xebia IT Architects. He has around 8 years of experience in the software industry and has worked in domains like e-Governance, document and content management, customer communication management, and media buy management. He has mainly worked on Java/J2EE and related technologies. He has been a speaker at the IndicThreads conference and is an active blogger. In his free time he loves to explore new technologies and keep himself updated with the latest trends.


UnConference
Jul 11 2013

UnConference is a participant-driven session for short, open discussions on various topics of interest to delegates. A delegate can present a question / make a point in a couple of minutes followed by an open discussion where speakers as well as others from the assembly contribute their views. The Unconference has been working wonderfully well at IndicThreads Conferences, helping delegates crowd source opinions and answers on various topics of intrigue and interest. The topics for the UnConference are decided via listings on a white board at the venue.