Big data with Spark

Content Provider


Course Level


Mode Of Delivery



40 hrs

Certification By


Course Validity

Valid for 6 months post activation

Learning Resources

Videos, Self-paced Learning Material, Certification


Information Technology
  • Price: 1,599/- (regular price 1,800/-)



Apache Spark is an open-source distributed processing engine for large-scale datasets. A key feature is that it can process data both in memory and on disk. Spark runs on large clusters and scales to petabytes of data.


This Course


This Apache Spark Certification Training Course is designed to give you the knowledge and skills to become a successful Big Data and Spark developer. You will learn the difference between Apache Spark and Hadoop, and you will work with Scala classes and pattern matching.


Learning Outcomes

  • Define and explain Spark Streaming

  • Identify RDDs and their operations, and implement Spark algorithms

  • Differentiate between Apache Spark and Hadoop

  • Learn about Scala classes and apply pattern matching


Module 1

  • Understanding Spark basics - Overview of Big Data and Spark

  • Installing Spark

  • Distributed data processing system

  • Spark shell. 

Module 2

  • Writing Spark applications

  • Spark algorithms

  • Spark's core APIs in Scala, Java, or Python

  • Spark's architecture and developer API

  • Predictive analytics based on MLlib

  • Clustering with KMeans

  • Building classifiers

  • Modeling

  • Visualization techniques (matplotlib, ggplot2, D3, etc.).

Module 3

  • Streaming architecture: How DStreams break down into RDD batches

  • Receivers running inside Executor task slots

  • Kafka

  • Multiple receivers

  • Union transformation

  • Sliding window operations on DStreams

  • Stateless transformations

  • Stateful transformations

  • Window transformations

  • Output operations

  • Persistence.
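
As a rough illustration of the streaming topics in this module, the sketch below uses plain Python (not the Spark API) to mimic how a stream is cut into fixed-interval batches and how a sliding window groups the most recent batches. The function names `to_batches` and `sliding_windows` and the sample events are hypothetical and chosen only for illustration; in Spark Streaming each batch would be an RDD.

```python
from collections import Counter


def to_batches(events, batch_interval):
    """Group (timestamp, word) events into consecutive batches by time interval."""
    if not events:
        return []
    last_index = max(int(ts // batch_interval) for ts, _ in events)
    batches = [[] for _ in range(last_index + 1)]
    for ts, word in events:
        batches[int(ts // batch_interval)].append(word)
    return batches


def sliding_windows(batches, window_length, slide_interval):
    """Yield word counts over the last `window_length` batches,
    advancing by `slide_interval` batches each time."""
    for end in range(slide_interval, len(batches) + 1, slide_interval):
        start = max(0, end - window_length)
        counts = Counter()
        for batch in batches[start:end]:
            counts.update(batch)
        yield dict(counts)


# Hypothetical sample stream: (arrival time in seconds, word)
events = [(0.2, "spark"), (0.7, "kafka"), (1.1, "spark"), (2.5, "spark"), (3.9, "kafka")]
batches = to_batches(events, batch_interval=1.0)
windows = list(sliding_windows(batches, window_length=2, slide_interval=1))
```

Here each window covers two batches and slides forward one batch at a time, so consecutive windows overlap by one batch.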

Module 4

  • Resilient Distributed Datasets (RDDs) - Narrow vs. Wide dependencies

  • Types of RDDs (HadoopRDD, MappedRDD, FilteredRDD, CassandraRDD, SchemaRDD, etc.)

  • The preservesPartitioning parameter

  • Broadcast variables

  • Accumulators

  • RDD operations - Transformations in RDD

  • Actions in RDD

  • Loading data in RDD

  • Key-value pair.
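
To make the difference between transformations and actions concrete, here is a minimal plain-Python sketch. It does not use Spark: `TinyRDD` is a hypothetical stand-in class invented for this example. Transformations only record what should happen, and an action such as `collect` runs the recorded steps. As a simplification, `reduce_by_key` is computed eagerly here, whereas in Spark `reduceByKey` is itself a lazy transformation.

```python
class TinyRDD:
    """Hypothetical stand-in for an RDD: records steps, runs them only on an action."""

    def __init__(self, data, steps=()):
        self._data = list(data)
        self._steps = tuple(steps)

    # Transformations: return a new TinyRDD and do no work yet.
    def map(self, fn):
        return TinyRDD(self._data, self._steps + (("map", fn),))

    def filter(self, pred):
        return TinyRDD(self._data, self._steps + (("filter", pred),))

    # Actions: run the recorded steps and return a result to the caller.
    def collect(self):
        items = iter(self._data)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)

    def count(self):
        return len(self.collect())

    # Key-value operation: merge the values that share a key.
    # Simplification: evaluated eagerly, unlike Spark's lazy reduceByKey.
    def reduce_by_key(self, fn):
        merged = {}
        for key, value in self.collect():
            merged[key] = fn(merged[key], value) if key in merged else value
        return TinyRDD(merged.items())


calls = []  # records every time the mapping function actually runs


def tag(word):
    calls.append(word)
    return (word, 1)


words = TinyRDD(["spark", "hadoop", "spark", "scala"])
pairs = words.filter(lambda w: w != "hadoop").map(tag)
ran_before_action = len(calls)  # still 0: nothing has executed yet
counts = pairs.reduce_by_key(lambda a, b: a + b).collect()
```

The `calls` list stays empty until an action runs, which is the behaviour the "Transformations in RDD" and "Actions in RDD" topics describe.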

Module 5

  • Spark SQL - Combining SQL, machine learning, and streaming for unified pipelines

  • Data transformation techniques

  • Loading of data

  • Hive queries through Spark

  • Spark applications

  • SQL library

  • Support for JSON and Parquet file formats.

Key Features

  • 12 Explanatory Videos 

  • Self-paced Learning

  • Anytime, anywhere access

  • Certification

Career Opportunities

  • Senior Database Engineer

  • Spark Programming

  • Data Intelligence – Spark

  • Lead Solution Advisor - Apache Spark


After completing this course and successfully passing the certification examination, the student will be awarded the “Big data with Spark” certification.

If a learner chooses not to take the examination, they will still receive a 'Participation Certificate'.

Who Should Attend

  • Engineering and IT students

  • Graduates with a programming background

Frequently Asked Questions

Where can I find courses on the Platform?

After logging in to KRACKiN, you can find courses on the dashboard, grouped into skill courses, career tracks, and the Annual Membership, which consists of 75,000+ courses. You can enrol in any available course by selecting the ‘Buy Now’ button.

Can I search for courses by domain on the platform?

Yes, skill courses and Annual Membership plans are organized by domain. You can enter your domain and search for the courses you want.

Are online classes available for all the courses on KRACKiN?

No, online classes are not available for all courses on KRACKiN. Courses are designed for self-paced learning and are supported by videos, PPTs, practice quizzes, and similar material to make them easy to follow.

Are the course certificates valid?

Yes, the certificates are valid across all sectors of the industry.

Can I access your courses on a mobile device?

Yes, you can access the courses on your mobile browser.

Can my course access be extended?

We recommend that you complete your assigned courses within the stipulated time, as course validity cannot be extended.

How do I get my certificate after completing the course?

After you complete all the given modules, the system generates the certificate automatically.

Content & Certification Partners