Hadoop interview questions with answers that have been asked in recent years in interviews at many companies.
1) What is meant by Hadoop?
Hadoop is an open-source distributed computing platform written in Java. Its core components, HDFS and MapReduce, are modeled on the Google File System and Google's MapReduce.
2) Describe the platform and Java version required to run Hadoop.
Java 1.6.x or any later version works for Hadoop. Linux and Windows are the supported operating systems, and Mac OS/X, BSD, and Solaris are also known to work.
3) What are the hardware specifications for Hadoop?
Hadoop typically runs on dual-processor or dual-core machines with 4-8 GB of ECC RAM. The exact requirements depend on the workflow design.
4) Describe the most common input formats defined in Hadoop.
The most common input formats in Hadoop are:
- TextInputFormat
- KeyValueTextInputFormat
- SequenceFileInputFormat
TextInputFormat is the default input format.
5) How do you categorize big data?
Big data is categorized by features such as:
- Volume
- Velocity
- Variety
6) What is the use of the .media class?
The .media class belongs to the Bootstrap CSS framework rather than to Hadoop, as do questions 7 to 9. It is used to float a media object to the left or right of a block of content.
7) What is the use of a Bootstrap panel?
A panel puts DOM components in a box. It is created by applying the panel class to a <div> element.
8) Describe the purpose of button groups.
Button groups are used to place more than one button on the same line.
9) Mention the various types of lists supported by Bootstrap.
- Ordered list
- Unordered list
- Definition list
10) Name the command used to retrieve the status of the daemons running in a Hadoop cluster.
The ‘jps’ command lists the running Java processes, which shows which Hadoop daemons are up in the cluster.
11) What is an InputSplit? Explain.
When a Hadoop job runs, the framework splits its input files into chunks and assigns each chunk to a mapper for processing. Each such chunk is called an InputSplit.
12) Explain TextInputFormat.
In TextInputFormat, each line of a text file is a record. The key is the byte offset of the line and the value is the content of the line. For example, Key: LongWritable, Value: Text.
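To make the key/value idea concrete, here is a minimal sketch in plain Java that mimics how a text file turns into (byte offset, line) records. It does not use Hadoop's real classes: LineRecords, Record, and read are hypothetical names made up for this illustration, and a Java long and String stand in for LongWritable and Text.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for TextInputFormat-style records; not Hadoop's own code.
class LineRecords {
    // One record: key = byte offset where the line starts, value = line content.
    static final class Record {
        final long offset;
        final String line;

        Record(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    // Splits the text on '\n' and records the byte offset of each line start.
    static List<Record> read(String text) {
        List<Record> records = new ArrayList<>();
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        int start = 0;
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '\n') {
                records.add(new Record(start,
                        new String(bytes, start, i - start, StandardCharsets.UTF_8)));
                start = i + 1;
            }
        }
        // Last line without a trailing newline.
        if (start < bytes.length) {
            records.add(new Record(start,
                    new String(bytes, start, bytes.length - start, StandardCharsets.UTF_8)));
        }
        return records;
    }

    public static void main(String[] args) {
        // Keys for this input are 0, 12 and 16.
        for (Record r : read("hello world\nfoo\nbar baz\n")) {
            System.out.println(r.offset + "\t" + r.line);
        }
    }
}
```

Note that the key is a byte offset, not a line number, which is why the second line above gets key 12 rather than 1.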
13) What is meant by SequenceFileInputFormat in Hadoop?
SequenceFileInputFormat is used to read sequence files. A sequence file is a binary file format that supports compression and is well suited to passing data from the output of one MapReduce job to the input of another MapReduce job.
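14) How many InputSplits does the Hadoop framework make?
The number of splits depends on the file sizes and the block size. For example, with a block size of 64 MB and three input files of 64 KB, 65 MB, and 127 MB, Hadoop makes 5 splits in total:
- One split for the 64 KB file
- Two splits for the 65 MB file, and
- Two splits for the 127 MB file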
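The arithmetic behind this count can be sketched in a few lines of Java. This is a simplified model, assuming a 64 MB block size and exactly one split per block. Hadoop's actual FileInputFormat also takes the configured minimum and maximum split sizes into account and may let the last split run slightly over a block, so real counts can differ. SplitCounter and its methods are hypothetical names for this illustration.

```java
// Simplified split arithmetic: one split per block, rounded up. Not Hadoop's own code.
class SplitCounter {
    // Number of block-sized chunks needed to cover a file (ceiling division).
    static long splitsForFile(long fileBytes, long blockBytes) {
        if (blockBytes <= 0) {
            throw new IllegalArgumentException("blockBytes must be positive");
        }
        if (fileBytes <= 0) {
            return 0;
        }
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    // Sum of the per-file split counts.
    static long totalSplits(long[] fileSizes, long blockBytes) {
        long total = 0;
        for (long size : fileSizes) {
            total += splitsForFile(size, blockBytes);
        }
        return total;
    }

    public static void main(String[] args) {
        long kb = 1024L;
        long mb = 1024L * kb;
        long block = 64 * mb; // assumed block size
        long[] files = {64 * kb, 65 * mb, 127 * mb};
        System.out.println(totalSplits(files, block)); // prints 5
    }
}
```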
15) Describe the use of RecordReader in Hadoop.
An InputSplit is assigned a unit of work but does not know how to access it. The RecordReader class is responsible for loading the data from its source and converting it into key-value pairs suitable for reading by the Mapper.
16) Describe the JobTracker in Hadoop.
The JobTracker is the service within Hadoop that runs MapReduce jobs on the cluster.
17) Explain WebDAV in Hadoop.
WebDAV is a set of extensions to HTTP that supports editing and uploading files. On most operating systems WebDAV shares can be mounted as filesystems, so it is possible to access HDFS as a standard filesystem by exposing HDFS over WebDAV.
18) What is Sqoop in Hadoop?
Sqoop is a tool for transferring data between HDFS and a relational database management system (RDBMS). Using Sqoop you can import data from an RDBMS such as Oracle or MySQL into HDFS, as well as export data from HDFS files to an RDBMS.
19) List the functionalities of the JobTracker.
These are the main tasks of the JobTracker:
- Accepting jobs from the client.
- Communicating with the NameNode to determine the location of the data.
- Locating TaskTracker nodes with free slots.
- Submitting work to the chosen TaskTracker node and monitoring the progress of each task.
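The slot-matching step in the list above can be pictured with a toy model. The sketch below is not how Hadoop implements the JobTracker; ToyJobTracker, register, and assign are hypothetical names, and the rule of preferring a node that already holds the data is a simplifying assumption added for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model of slot-based task assignment; not Hadoop's real JobTracker.
class ToyJobTracker {
    // TaskTracker name -> number of free task slots, in registration order.
    private final Map<String, Integer> freeSlots = new LinkedHashMap<>();

    void register(String tracker, int slots) {
        freeSlots.put(tracker, slots);
    }

    // Picks a tracker with a free slot, preferring one that holds the data
    // (assumed rule). Returns null when no tracker has a free slot.
    String assign(Set<String> nodesWithData) {
        String chosen = null;
        for (Map.Entry<String, Integer> e : freeSlots.entrySet()) {
            if (e.getValue() <= 0) {
                continue; // no free slot on this tracker
            }
            if (nodesWithData.contains(e.getKey())) {
                chosen = e.getKey(); // data-local tracker wins immediately
                break;
            }
            if (chosen == null) {
                chosen = e.getKey(); // remember the first fallback tracker
            }
        }
        if (chosen != null) {
            freeSlots.put(chosen, freeSlots.get(chosen) - 1); // occupy one slot
        }
        return chosen;
    }

    public static void main(String[] args) {
        ToyJobTracker jt = new ToyJobTracker();
        jt.register("nodeA", 1);
        jt.register("nodeB", 1);
        Set<String> dataNodes = Collections.singleton("nodeB");
        System.out.println(jt.assign(dataNodes)); // nodeB (holds the data)
        System.out.println(jt.assign(dataNodes)); // nodeA (nodeB is now full)
        System.out.println(jt.assign(dataNodes)); // null (no free slots left)
    }
}
```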
20) What is the use of the TaskTracker?
A TaskTracker is a node in the cluster that accepts tasks such as Map, Reduce, and Shuffle operations from the JobTracker.