Microsoft has announced the first public preview of HDInsight, its Hadoop-based big data framework that runs on Windows Azure. Microsoft developed it in partnership with Hortonworks. The framework is aimed at helping users handle large amounts of data, both structured and unstructured, across clusters.
Partnership with Hortonworks
Hortonworks is Microsoft's main ally in this Apache Hadoop endeavor and has made key contributions in enabling Windows to handle big data workloads. In fact, Hortonworks Data Platform 1.1 is the only Hadoop-based distribution that can run successfully on both Linux and Windows. Notable names in the industry, such as Google and Yahoo, use MapReduce technology for large data mining operations. Until now, Microsoft relied on tools like PowerPivot and Power View for handling data of such magnitude.
This venture between Microsoft and Hortonworks has created a lot of public anticipation. Hortonworks claims that the product is completely open source and will enable tools like PowerPivot to extract insights not only from large data pools but also from other data sources, including those on Linux.
Features of the Product
Data becomes big data when escalating volumes of information pour into a system. Huge volumes of data in raw form are not useful to organizations. To provide insight, the data must be processed and analyzed so that value can be derived from it, sometimes in combination with data from other sources. This is where the HDInsight big data framework comes into play. It assists in data processing and management, and has the following important features:
The Hadoop Distributed File System (HDFS) facilitates the storage of huge volumes of data. It provides the storage layer that map tasks read from, which minimizes operational delay. It also reduces the hassle usually involved in extracting large amounts of unstructured data from multiple sources.
The MapReduce framework helps in analyzing and extracting results from the data. It interprets data as key-value pairs, so all information fed into or extracted from the system must follow this format.
Other applications such as Pig and Hive are built on top of the aforementioned frameworks and help break large data into manageable proportions. They simplify the configuration process and aid in deploying clusters.
It also provides Open Database Connectivity (ODBC) to integrate other business tools such as SQL Server Analysis Services and Excel.
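To make the key-value model behind MapReduce more concrete, here is a minimal sketch of a word count in plain Python. It runs in a single process and does not use Hadoop or HDInsight at all; the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, not part of any Hadoop API. It only shows how input is turned into key-value pairs, grouped by key, and reduced to a result.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit one (word, 1) key-value pair per word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all values that share the same key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three steps, as a real cluster would do across nodes."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
    # prints {'big': 2, 'data': 1, 'cluster': 1}
```

On a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data between them, but both the input and the output of each step still take the key-value form shown here.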
Key Advantages to Users
This latest venture from Microsoft is a boon to organizations struggling to handle a huge flow of information. It offers users some vital advantages, such as:
Making Hadoop run easily on Windows
Simple and effective management of huge volumes of data
Secure, enterprise-ready Apache Hadoop-based application
Fully open source Hadoop available on Windows
Creating your first HDInsight cluster and running samples
The HDInsight big data framework is easily the best framework available on Windows for handling a huge data flow.