In this blog you can find Hadoop interview questions and answers that have been asked in recent years at many companies.
1) What is meant by Hadoop?
Hadoop is a distributed computing platform written in Java. It incorporates features modeled on the Google File System and MapReduce.
2) Which platform and Java version are required to run Hadoop?
Java 1.6.x or later versions are suitable for Hadoop work. Linux and Windows are the supported operating systems for a Hadoop environment, but BSD, Mac OS/X, and Solaris are also known to work.
3) What are the hardware specifications for Hadoop?
Hadoop can run on dual-processor/dual-core machines with 4-8 GB of RAM using ECC memory; the exact requirements depend on the workflow design.
4) Describe the most common input formats defined in Hadoop.
The most common input formats in Hadoop are:
- TextInputFormat
- KeyValueInputFormat
- SequenceFileInputFormat
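To make the MapReduce model from question 1 concrete, here is a minimal plain-Java sketch of a word count; it uses no Hadoop APIs, and the class and method names are illustrative only, mimicking the map, shuffle, and reduce phases:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {
    // "Map" phase: emit a (word, 1) pair for every word in every input line.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(Map.entry(word, 1));
                }
            }
        }
        return pairs;
    }

    // "Shuffle" + "Reduce" phases: group the pairs by key and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("hadoop runs jobs", "hadoop stores data");
        System.out.println(reduce(map(input))); // prints the aggregated word counts
    }
}
```

In real Hadoop the map and reduce functions run as distributed tasks and the framework performs the shuffle, but the data flow is the same as in this single-process sketch.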
5) How do you categorize big data?
Big data is categorized using features such as:
- Volume
- Velocity
- Variety
6) What is the use of the .media class?
The .media class is used to float media objects from one side to the other.
7) List the use of the Bootstrap panel.
Panels in Bootstrap are used for boxing DOM components.
8) Describe the purpose of button groups.
Button groups are used to place more than one button on the same line.
9) List the various types of lists supported by Bootstrap.
- Ordered list
- Unordered list
- Definition list
10) Name the command used to retrieve the status of the daemons running in the Hadoop cluster.
The ‘jps’ command is used to retrieve the status of the daemons running in the Hadoop cluster.
11) What is InputSplit? Explain.
When a Hadoop job runs, it splits the input files into chunks and assigns each split to a mapper for processing. Each of these chunks is called an InputSplit.
12) Explain TextInputFormat.
In TextInputFormat, each line of a text file is a record. The value is the content of the line, while the key is the byte offset of the line. For example, Key: LongWritable, Value: Text.
13) What is meant by SequenceFileInputFormat in Hadoop?
SequenceFileInputFormat in Hadoop is used to read files in sequence. It is a specific compressed binary file format that passes data between the output of one MapReduce job and the input of another MapReduce job.
14) How many InputSplits can be made by the Hadoop framework?
Assuming the default 64 MB block size, Hadoop makes a total of 5 splits:
- One split for a 64K file
- Two splits for a 65 MB file, and
- Two splits for a 127 MB file
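The arithmetic behind that answer can be checked with a short plain-Java sketch (assuming a 64 MB block size, the classic Hadoop default): each file contributes ceil(fileSize / blockSize) splits.

```java
public class SplitCount {
    static final long BLOCK_SIZE = 64L * 1024 * 1024; // 64 MB block size (classic default)

    // Number of InputSplits for one file: ceil(fileSize / blockSize), minimum 1.
    static long splitsFor(long fileSizeBytes) {
        if (fileSizeBytes == 0) return 0;
        return (fileSizeBytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long total = splitsFor(64 * 1024)   // 64K file   -> 1 split
                   + splitsFor(65 * mb)     // 65 MB file -> 2 splits
                   + splitsFor(127 * mb);   // 127 MB file -> 2 splits
        System.out.println(total); // prints 5
    }
}
```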
15) Describe the use of RecordReader in Hadoop?
An InputSplit is assigned the work but does not know how to access it. The RecordReader class is responsible for loading the data from its source and converting it into key-value pairs suitable for reading by the Mapper. The RecordReader instance is defined by the InputFormat.
16) Describe JobTracker in Hadoop.
JobTracker is a service within Hadoop that runs MapReduce jobs on the cluster.
17) Explain WebDAV in Hadoop.
WebDAV is a set of extensions to HTTP that supports editing and uploading files. On most operating systems, WebDAV shares can be mounted as filesystems, so it is possible to access HDFS as a standard filesystem by exposing HDFS over WebDAV.
18) What is Sqoop in Hadoop?
Sqoop is a tool used to transfer data between relational database management systems (RDBMS) and Hadoop HDFS. Using Sqoop, one can import data from an RDBMS such as MySQL or Oracle into HDFS, as well as export data from HDFS back to an RDBMS.
19) List the functionalities of JobTracker.
These are the main tasks of JobTracker:
- Accepting jobs from the client.
- Communicating with the NameNode to determine the location of the data.
- Locating TaskTracker nodes with free slots.
- Submitting work to the chosen TaskTracker node and monitoring the progress of each task.
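The last two tasks in that list can be sketched as a toy scheduler in plain Java; the Tracker type and its fields are illustrative inventions, not the Hadoop API: pick a node with a free slot and assign the work to it.

```java
import java.util.List;
import java.util.Optional;

public class JobTrackerSketch {
    // Toy model of a TaskTracker node: a name and a count of free task slots.
    static class Tracker {
        final String name;
        int freeSlots;
        Tracker(String name, int freeSlots) { this.name = name; this.freeSlots = freeSlots; }
    }

    // Locate a TaskTracker node with a free slot and submit work to it.
    static Optional<String> assignTask(List<Tracker> trackers) {
        for (Tracker t : trackers) {
            if (t.freeSlots > 0) {
                t.freeSlots--;          // the slot is now occupied by the task
                return Optional.of(t.name);
            }
        }
        return Optional.empty();        // no free slots: the task must wait
    }

    public static void main(String[] args) {
        List<Tracker> cluster = List.of(new Tracker("node1", 0), new Tracker("node2", 2));
        System.out.println(assignTask(cluster).orElse("none")); // prints node2
    }
}
```

The real JobTracker also weighs data locality (placing tasks near the HDFS blocks they read), which this sketch omits.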
20) What is the use of TaskTracker?
TaskTracker is a node in the cluster that accepts tasks such as MapReduce and Shuffle operations from the JobTracker.