Introduction to Big Data
Introduction
Introduction to Big Data
Three Vs of Big Data
How Big is BIG DATA?
How analysis of Big Data is useful for organizations?
Challenges of Traditional Systems
Big Data Engineering Learning Roadmap
Introduction to HADOOP
What is Hadoop
Why Hadoop and its Use Cases
Different Ecosystems of Hadoop
Structured Unstructured Semi-Structured Data
Relation between Big Data and Hadoop
Future of Hadoop
Challenges with Big Data
Hadoop VS RDBMS
Hadoop VS Data Warehouse
Hadoop VS Teradata
Type of Big Data Projects
What is a Cluster Environment?
What Is a Hadoop Cluster?
Apache Hadoop 3.3.0 Single Node Installation on Windows 10
Download Java 8 and Apache Hadoop 3.3.0
Installing and Configuring Java
Installing and Configuring Hadoop
Starting Apache Hadoop 3.3.0 Single Node Cluster
Stopping Apache Hadoop 3.3.0 Single Node Cluster
Apache Hadoop 3.3.0 Single Node Installation on Ubuntu Linux
Apache Hadoop 3.3.0 installation on Ubuntu Part 1
Apache Hadoop 3.3.0 installation on Ubuntu Part 2
(Ubuntu) Starting Apache Hadoop 3.3.0 Single Node Cluster
(Ubuntu) Stopping Apache Hadoop 3.3.0 Single Node Cluster
HDFS (Hadoop Distributed File System) Commands
Hadoop Distributed File System (HDFS)
File System (FS) shell
(Hands On) FileSystem Shell Command to Check Hadoop version
(Hands On) FileSystem Shell Command to get help for any command
(Hands On) FileSystem Shell Command to Make Directory in HDFS
(Hands On) FileSystem Shell Command to display data [cat]
(Hands On) FileSystem Shell Command [checksum]
(Hands On) FileSystem Shell Command [copyFromLocal]
(Hands On) FileSystem Shell Command [copyToLocal]
(Hands On) FileSystem Shell Command [count]
(Hands On) FileSystem Shell Command [cp]
(Hands On) FileSystem Shell Command [df]
(Hands On) FileSystem Shell Command [du]
(Hands On) FileSystem Shell Command [find]
(Hands On) FileSystem Shell Command [get]
(Hands On) FileSystem Shell Command [getfacl]
(Hands On) FileSystem Shell Command [head]
(Hands On) FileSystem Shell Command [ls]
(Hands On) FileSystem Shell Command [moveFromLocal]
(Hands On) FileSystem Shell Command [mv]
(Hands On) FileSystem Shell Command [put]
(Hands On) FileSystem Shell Command [rm]
(Hands On) FileSystem Shell Command [rmdir]
(Hands On) FileSystem Shell Command [tail]
(Hands On) FileSystem Shell Command [touchz]
(Hands On) FileSystem Shell Command to append data [appendToFile]
(Hands On) FileSystem Shell Command to change group [chgrp]
(Hands On) FileSystem Shell Command to change permission [chmod]
(Hands On) FileSystem Shell Command to change owner [chown]
(Hands On) FileSystem Shell Command to merge files [getmerge]
(Hands On) FileSystem Shell Command to change replication [setrep]
(Hands On) FileSystem Shell Command to view statistics [stat]
(Hands On) FileSystem Shell Command to change modifying timestamp [touch]
(Hands On) FileSystem Shell Command to concat files [concat]
(Hands On) FileSystem Shell Command to display classpath [classpath]
(Hands On) FileSystem Shell Command to display environment variables [envvars]
(Hands On) FileSystem Shell Command fsck [fsck]
(Hands On) FileSystem Shell Command getconf [getconf]
(Hands On) FileSystem Shell Command group [group]
(Hands On) FileSystem Shell Command datanode [datanode]
HDFS and YARN Architecture
HDFS Overview
HDFS Architecture
Storage aspects of HDFS
Hadoop Modes of Installation
NameNode
DataNode
NodeManager
ResourceManager
Secondary NameNode
Data Replication
Rack Awareness
Robustness
HDFS Snapshot
Balancer
YARN
What is YARN?
Difference between Map Reduce & YARN
YARN Architecture
Scheduler for Yarn (CapacityScheduler/Fair Scheduler)
Examples Running MapReduce on YARN
YARN Web UI
MapReduce
Overview
What is MapReduce?
MapReduce Limitation
Mapper
Reduce
Shuffle
Sort
Secondary Sort
How Many Maps?
How Many Reduces?
Reducer NONE
Partitioner
Counter
InputSplit
RecordReader
Example
FAQ in Apache Hadoop and MapReduce Interview
How to unzip .gz files in a new directory in hadoop?
Scenario Based Question
Scenario Based Question
Can multiple files in HDFS use different block sizes?
Do wildcard characters work correctly in FsShell?
How to deal with small files in Hadoop?
What steps do you follow in order to improve the performance of a MapReduce job?
What is the purpose of the shuffling and sorting phase in the reducer in MapReduce?
Is it important for Hadoop MapReduce jobs to be written in Java?
Apache Pig
Introduction to Apache Pig
Map Reduce Vs Apache Pig
Installing Apache Pig
Execution Modes
Batch Mode
Pig Latin Statements
Data types
Example of Simple Data Type
Example of Complex Data Type
Loading Data
Working with Data
FILTER operator (Hands On)
FOREACH operator (Hands On)
GROUP operator (Hands On)
COGROUP operator (Hands On)
JOIN operator (Hands On)
UNION operator (Hands On)
SPLIT operator (Hands On)
Storing Data (Hands On)
Debugging Pig Latin (Hands On)
DUMP operator (Hands On)
DESCRIBE operator (Hands On)
EXPLAIN operator (Hands On)
ILLUSTRATE operator (Hands On)
Comparison Operators (Hands On)
ORDER BY operator (Hands On)
RANK operator (Hands On)
Apache Pig - Built In Functions
AVG - Eval Functions (Hands On)
CONCAT - Eval Functions (Hands On)
COUNT - Eval Functions (Hands On)
MAX - Eval Functions (Hands On)
MIN - Eval Functions (Hands On)
SIZE - Eval Functions (Hands On)
SUM - Eval Functions (Hands On)
IN - Eval Functions (Hands On)
ABS - Math Functions (Hands On)
CBRT - Math Functions (Hands On)
FLOOR - Math Functions (Hands On)
LOG - Math Functions (Hands On)
FAQ in Apache Pig
Scenario Based Question (File modification based)
How to remove single quotes from data using Pig?
How to compute sum of a field in all the rows from an alias?
Is there a way to do this? eg, pass the name of the file to be processed, etc?
Scenario Based Question (Date)
How to do Transpose in corresponding few columns in pig?
Scenario Based Question (Programming)
Write a word count program in pig?
How to load files with different delimiter each time in piglatin?
Apache Hive
Introduction to Apache Hive
Hive Architecture
How a Hive query flows through the system.
Hive Features
Hive Limitation
Installation Steps of Apache Hive
Hive Data Model Diagram
Tables
Partitions
Buckets or Clusters
Hive Data Types
Primitive Type
Complex Type
Create Database
Create Table
Create Table (Hands On)
Managed and External Tables
Managed and External Tables (Hands On)
Storage Formats
LOAD
SELECT
INSERT
UPDATE
DELETE
String Functions
Metastore
Partitions in Detail
Partitions (Hands On)
Bucketing Theory
Bucketing (Hands On)
Frequently Asked Interview Question and Answers (Hive)
How to create HIVE Table with multi character delimiter?
How to load Data from a .txt file to Table Stored as ORC in Hive?
How to skip header rows from a table in Hive?
Create single Hive table for small files without degrading performance in Hive?
How will you consume this CSV file into the Hive warehouse using a built-in SerDe?
Is it possible to change the default location of a managed table?
Can hive queries be executed from script files? How?
Can we run unix shell commands from hive? Give example?
Apache Sqoop
Introduction to Apache Sqoop
Installing Apache Sqoop on Ubuntu
MySQL client and Server Installation
Importing Data with Apache Sqoop
Sqoop-import
Mandatory Steps to be performed at mysql
Mandatory Steps to be performed at sqoop lib Directory
Transferring an Entire Table
Specifying a Target Directory
Importing Only a Subset of Data
Protecting Your Password
Using a File Format
Compressing Imported Data
Speeding Up Transfers
Controlling Parallelism
Importing All Your Tables
Importing Only New Data (Incremental Import)
Free-Form Query Import
Exporting Data with Apache Sqoop
Sqoop-export
Transferring Data from Hadoop
Apache Flume
Introduction to Apache Flume
Installing Apache Flume on Ubuntu
Apache Flume Architecture
Features of Apache Flume
Pros and Cons of Apache Flume
When should you go for Apache Flume?
Apache Flume Applications
Hands on Example
Apache Kafka
What is event streaming?
Introduction to Apache Kafka
How does Kafka work in a nutshell?
Elements of Kafka
Core Component of Apache Kafka
Installing Single Node Kafka Cluster
Sending Data file to Kafka Topic
Reading Kafka Topic
Kafka Command-Line Interface (CLI) Tools
kafka-server-start.sh
kafka-server-stop.sh
zookeeper-server-start.sh
zookeeper-server-stop.sh
kafka-cluster.sh
kafka-broker-api-versions.sh
kafka-topics.sh
kafka-console-producer.sh
kafka-console-consumer.sh
kafka-producer-perf-test
Kafka Topic Operations
Add a Topic
Describe a Topic
Change the retention value for a topic
Increase partitions for a topic
Delete a Topic
Python using Databricks
Getting Started with Python
Variables and Data Types
Conditionals and Loops
Methods Functions and Packages
Collection and Classes
Preview - Learn Big Data Hadoop: Hands-On for Beginner