How to start learning hadoop


The easiest way to get started with Hadoop is the Sandbox, run in VMware Player or VirtualBox. It is a personal, portable Hadoop environment that comes with a dozen interactive Hadoop tutorials. The Sandbox includes many of the most exciting developments from the latest CDH/HDP distribution, packaged up in a virtual environment. You can start working in a Hadoop environment within 10 minutes.

Hadoop Sandbox provides:

  1. A virtual machine with Hadoop preconfigured.
  2. A set of hands-on tutorials to get you started with Hadoop.
  3. An environment to help you explore related projects in the Hadoop ecosystem like Apache Pig, Apache Hive, Apache HCatalog and Apache HBase.

Let's download and install Hadoop on a Windows machine in 10 minutes.

System requirements to run VMware Player:

RAM: at least 8 GB (for a 2-node virtual cluster). Processor: i3 or above. Disk: at least 20 GB of free space.
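
The disk-space requirement above can be checked before downloading anything. The following is a minimal sketch using only the Python standard library; the function names and the default path are illustrative assumptions, and RAM and processor checks are left out because the standard library offers no portable way to query them.

```python
import shutil

GIB = 1024 ** 3  # number of bytes in one gibibyte


def free_disk_gb(path="."):
    """Return the free space, in GB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def has_enough_disk(path=".", required_gb=20):
    """Return True if the drive holding `path` has at least `required_gb` GB free."""
    return free_disk_gb(path) >= required_gb
```

On Windows you would typically pass the drive where the virtual machine will be stored, for example has_enough_disk("C:\\").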

Step 1. Download and install VMware Player from one of the following websites:

http://www.vmware.com/in/products/player 

http://filehippo.com/download_vmware_player/

Or download and install VirtualBox from one of the links below.

www.virtualbox.org/wiki/Downloads

http://filehippo.com/download_virtualbox/

Step 2. Now download the Sandbox from the Hortonworks or Cloudera website. Here I will explain the process for the Hortonworks Sandbox:

http://hortonworks.com/products/hortonworks-sandbox/

The Sandbox download is available for both VirtualBox and VMware Fusion/Player environments. Just follow the instructions below to import the Sandbox into VirtualBox.

1. Open the Oracle VM VirtualBox Manager.
You can do so by double-clicking its icon.

2. Open the Preferences dialog window.
Select File‐>Preferences… within the Oracle VM VirtualBox Manager


3. Uncheck Auto‐Capture Keyboard within the Preferences dialog window.
First select the Input icon in the left-hand pane of the window to reach this
setting, then uncheck the box.

Click the OK button once done.  This will close the Preferences window.

4. Open the Import Appliance window.

Select File‐>Import Appliance… within the Oracle VM VirtualBox Manager


A separate dialog window opens in front of the VM VirtualBox Manager
window.

5. Click on the folder icon that will open a file dialog window.  Select the virtual
appliance file that you downloaded as a prerequisite.  After selecting the file click
the Open button.

NOTE:  The name of the file you downloaded depends on the version of the
Hortonworks Sandbox you chose.  This walkthrough is based on
Sandbox HDP version 2.2.

On Windows, after you select the virtual appliance file, you are brought back to the
Import Appliance window.

After clicking Next, the Appliance Settings are displayed.

6. Modify Appliance Settings as needed.
Within the Appliance Settings section you may wish to allocate more RAM to the
virtual appliance.  Allocating 8 GB of RAM to the Hortonworks Sandbox virtual appliance
will improve performance.  Make sure the host machine has enough physical RAM
to support this change. To make the change, click on the value you want to
modify and edit it.  Once you have finished configuring, click Import.
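
The import in steps 4 to 6 can also be done from the command line with VirtualBox's VBoxManage tool. The sketch below only builds the command; it assumes VBoxManage is on your PATH, that the appliance is the first (index 0) virtual system in the file, and it uses a hypothetical file name, sandbox.ova, since the real name depends on the Sandbox version you downloaded.

```python
def build_import_command(ova_path, memory_mb=8192, dry_run=False):
    """Build a VBoxManage command that imports an appliance with a given RAM size.

    ova_path  -- path to the downloaded appliance file (hypothetical name in the example)
    memory_mb -- RAM to allocate to the virtual machine, in megabytes
    dry_run   -- if True, VBoxManage only prints what it would import
    """
    if memory_mb <= 0:
        raise ValueError("memory_mb must be positive")
    cmd = ["VBoxManage", "import", ova_path]
    if dry_run:
        cmd.append("--dry-run")
    # --vsys 0 selects the first virtual system described in the appliance file.
    cmd += ["--vsys", "0", "--memory", str(memory_mb)]
    return cmd
```

To run the import, pass the returned list to subprocess.run(cmd, check=True). Building the command as a list rather than a single string avoids quoting problems when the file path contains spaces, which is common on Windows.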

The progress of the import is displayed while it runs.

7. Once the import finishes, you are brought to the main Oracle VM VirtualBox
Manager screen.  From the left-hand pane, select the appliance you just imported
and click the green Start arrow.

A console window opens and displays the boot-up information.

Once the virtual machine fully boots up, the console displays the login instructions.
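
Starting the imported appliance can likewise be scripted with VBoxManage instead of clicking the Start arrow. This is a rough sketch under the same assumption that VBoxManage is on your PATH; the VM name "Hortonworks Sandbox" used in the usage note is hypothetical, so use whatever name appears in the left-hand pane of the VirtualBox Manager.

```python
def build_start_command(vm_name, headless=False):
    """Build a VBoxManage command that starts a registered virtual machine.

    vm_name  -- name of the VM as listed in the VirtualBox Manager
    headless -- if True, start without opening a console window
    """
    if not vm_name:
        raise ValueError("vm_name must not be empty")
    vm_type = "headless" if headless else "gui"
    return ["VBoxManage", "startvm", vm_name, "--type", vm_type]
```

For example, subprocess.run(build_start_command("Hortonworks Sandbox"), check=True) would start the VM with its console window, which is where the login instructions are shown.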

8. On your host machine, open one of the supported browsers mentioned in the
prerequisites section of this document and enter the URL displayed in the console.
By default it should be http://127.0.0.1:8888.
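
Before opening the browser, you can check from the host that the Sandbox is answering on that URL. The following is a minimal sketch using only the Python standard library; the function name sandbox_is_up and the timeout value are illustrative assumptions, and the default URL is the one given above.

```python
import urllib.error
import urllib.request


def sandbox_is_up(url="http://127.0.0.1:8888", timeout=5.0):
    """Return True if an HTTP server responds at `url`, False otherwise."""
    # An empty ProxyHandler bypasses any system proxy, which should not be
    # used for a server running on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code.
        return True
    except OSError:
        # Connection refused, timeout, or another network-level failure.
        return False
```

The virtual machine can take a few minutes to finish booting, so a False result right after starting it usually just means the services are not ready yet.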