Introduction to the Course
Welcome to the Course
What You Will Learn
Why Spark MLlib for Machine Learning Projects
Course Workflow & Project Overview
Tools We’ll Use: Apache Spark, Spark ML, Apache Zeppelin
Overview of Employee Dataset
Setting Up the Environment
Requirements
(Hands On) Installing Java
Steps for Installing Java
(Hands On) Setting Up the Java Environment
Steps for Setting Up the Java Environment
(Hands On) Apache Zeppelin Installation Steps on an Ubuntu Machine
Steps for Installing Apache Zeppelin on an Ubuntu Machine
(Hands On) Installing Docker Desktop on Windows 10/11
Steps for Installing Docker on Windows
(Hands On) Running Apache Zeppelin on Docker (Windows)
Steps for Running Apache Zeppelin on Docker
(Hands On) Configuring and Connecting to the Spark Interpreter
Steps for Configuring and Connecting to the Spark Interpreter
Download Resources
Download Resources
Employee Attrition Prediction Zeppelin Notebook
EmployeeAttrition CSV Data
Importing the Zeppelin Notebook File into the Zeppelin Environment
Zeppelin Basics
What is Apache Zeppelin
Features & Benefits
Notebook UI Overview
Markdown and text formatting
Creating and running paragraphs
(Hands On) Creating and Running Paragraphs
Visualization Options (Tables, Bar Chart, Pie Chart, etc.)
(Hands On) Types of Default Charts in Zeppelin
Zeppelin with Apache Spark
Spark interpreter details
Working with RDDs and DataFrames
Spark SQL queries and caching
Visualizing Spark outputs
Job tracking and performance tuning basics
Machine Learning Project
Employee Attrition Prediction with Apache Spark
Understanding Employee Attrition
How This Project Is Structured
Project Roadmap
Understanding Employee Attrition Business Problem & Project Roadmap
Loading Employee Data into Spark DataFrame
Exploratory Data Analysis: Dataset Summary and Statistics
Understanding Dataset Schema Using printSchema
Working with Temporary Views in Spark SQL
Handling Imbalanced Data in Employee Attrition Prediction using Spark SQL
Gender Discrepancy Analysis in Employee Attrition Data
Analyzing Job Satisfaction Distribution Using Spark SQL
Analyzing Salary Distribution and Department-Wise Workforce Insights Using Spark
Generational Workforce Analysis and Education-Level Attrition Insights
Is Money Everything? Income, Satisfaction, Overtime, and Attrition Analysis
Working Environment Analysis & Attrition Insights Using Apache Spark
Manager Experience & Environment Satisfaction Analysis Using Spark SQL
Deep Dive into Attrition: Department, Work-Life Balance & Distance Analysis
Analyzing External Factors Influencing Employee Attrition Using Spark SQL
Building an Employee Attrition Classification Model Using Decision Tree
Preparing Training Data for Attrition Prediction Using VectorAssembler
Building a Machine Learning Pipeline in Apache Spark for Feature Transformation
Splitting Data and Creating Feature Vectors for Machine Learning in Apache Spark
Training a Decision Tree Classification Model in Apache Spark
Preparing Testing Data for Model Evaluation in Apache Spark ML
Testing the Machine Learning Model and Generating Predictions in Apache Spark
Evaluating Classification Model Performance in Apache Spark ML
Introduction
Introduction
Download Resources
Download Data for Employee Attrition Project
Download Source Code for Employee Attrition Project
Project Begins
Introduction to Spark
(Old) Free Account creation in Databricks
(New) Free Account creation in Databricks
Provisioning a Spark Cluster
Introduction to Machine Learning
Basics about notebooks
DataFrames
File Content
Project Explanation Part 1
Project Explanation Part 2
Project Explanation Part 3
Project Explanation Part 4
Project Explanation Part 5
Project Explanation Part 6
Project Explanation Part 7
Project Explanation Part 8
Preview - Employee Attrition Prediction in Apache Spark (ML) Project