Big data with Spark



40 hrs



Mode Of Delivery

Course Validity: Valid for 6 months post activation




Course Fee: This is a paid course.

  • 1,800/- 599/-


Apache Spark Certification Training Course is designed to provide you with the knowledge and skills to become a successful Big Data & Spark Developer.


Module 1:

Understanding Spark basics - Overview of Big Data and Spark, Installing Spark, Distributed data processing system, Spark shell. 

Module 2:

Writing Spark applications, Spark algorithms, Spark's core APIs in Scala/Java or in Python, Spark's architecture and developer API, Predictive analytics based on MLlib, Clustering with KMeans, Building classifiers, Modeling, Visualization techniques (matplotlib, ggplot2, D3, etc.).
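To give a feel for the "Clustering with KMeans" topic, here is a minimal pure-Python sketch of the k-means idea (Lloyd's algorithm). It is not the MLlib API and does not run on a cluster. The sample points, the choice of starting centers, and the helper names `squared_distance`, `mean_point` and `kmeans` are all illustrative assumptions.

```python
# Pure-Python sketch of k-means (Lloyd's algorithm); NOT the MLlib API.
# All names and data below are hypothetical, chosen only for illustration.

def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, centers, max_iterations=20):
    centers = list(centers)
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: squared_distance(p, centers[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            mean_point(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: centers stopped moving
            break
        centers = new_centers
    return centers, clusters


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, clusters = kmeans(points, centers=[points[0], points[3]])
```

With these six points the two groups separate cleanly, so each cluster ends up with three points. A distributed implementation applies the same assign-then-update loop to data spread over many machines.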

Module 3:

Streaming architecture: How DStreams break down into RDD batches, Receivers running inside Executor task slots, Kafka, Multiple receivers, Union transformation, Sliding window operations on DStreams, Stateless transformations, Stateful transformations, Window transformations, Output operations, Persistence.
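The "Sliding window operations on DStreams" topic can be pictured with a small pure-Python sketch. It models a stream as a list of micro-batches and measures both the window length and the slide interval in batches (in Spark Streaming both are durations that must be multiples of the batch interval). This is a conceptual model only, not the DStream API, and the function name `sliding_window_counts` and the sample words are assumptions.

```python
from collections import Counter, deque

# Conceptual model of a sliding window over micro-batches; NOT the DStream API.
# `sliding_window_counts` is a hypothetical helper written for this sketch.


def sliding_window_counts(batches, window_length, slide_interval):
    """Yield (batch_index, word_counts) each time the window slides.

    window_length  -- how many of the most recent batches a window covers
    slide_interval -- how many batches pass between two emitted results
    """
    window = deque(maxlen=window_length)  # older batches fall out automatically
    for index, batch in enumerate(batches, start=1):
        window.append(batch)
        if index % slide_interval == 0:
            counts = Counter()
            for windowed_batch in window:
                counts.update(windowed_batch)
            yield index, dict(counts)


batches = [
    ["spark", "kafka"],
    ["spark"],
    ["kafka", "kafka"],
    ["spark", "hive"],
]
results = list(sliding_window_counts(batches, window_length=3, slide_interval=2))
```

Here a result is emitted after batches 2 and 4. The second result covers batches 2 to 4 only, because the first batch has already slid out of the three-batch window.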

Module 4:

Resilient Distributed Datasets (RDDs) - Narrow vs. Wide dependencies, Types of RDDs (HadoopRDD, MappedRDD, FilteredRDD, CassandraRDD, SchemaRDD, etc.), the preservesPartitioning parameter, Broadcast variables, Accumulators, RDD operations - Transformations in RDD, Actions in RDD, Loading data in RDD, Key-value pairs.
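The difference between narrow and wide dependencies, and the role of key-value pairs, can be sketched in pure Python by treating a dataset as a list of partitions. This is a teaching model, not the RDD API: `map_partitions`, `reduce_by_key` and `partition_for` are hypothetical helpers, and the toy partitioner is not the one Spark uses.

```python
# Teaching model of narrow vs. wide dependencies; NOT the RDD API.
# A "dataset" here is simply a list of partitions (lists of records).


def map_partitions(partitions, fn):
    """Narrow dependency: each output partition is built from exactly one
    input partition, so no data moves between partitions."""
    return [[fn(record) for record in part] for part in partitions]


def partition_for(key, num_partitions):
    """Hypothetical deterministic partitioner used only in this sketch."""
    return sum(key.encode("utf-8")) % num_partitions


def reduce_by_key(partitions, fn, num_partitions):
    """Wide dependency: records with the same key may sit in different input
    partitions, so they are shuffled to a common output partition first."""
    shuffled = [{} for _ in range(num_partitions)]
    for part in partitions:
        for key, value in part:
            target = shuffled[partition_for(key, num_partitions)]
            target[key] = fn(target[key], value) if key in target else value
    return [sorted(p.items()) for p in shuffled]


partitions = [["spark", "hive"], ["spark", "kafka", "spark"]]

# Transformation 1 (narrow): turn each word into a (word, 1) key-value pair.
pairs = map_partitions(partitions, lambda word: (word, 1))

# Transformation 2 (wide): add up the counts for each key across partitions.
counts = reduce_by_key(pairs, lambda a, b: a + b, num_partitions=2)

# "Action": gather the partitioned result into one local dictionary.
collected = dict(kv for part in counts for kv in part)
```

The word "spark" appears in both input partitions, which is why counting it requires moving data between partitions. The per-record mapping step never needs that.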

Module 5:

Spark SQL - Combining SQL, machine learning, and streaming for unified pipelines; Data transformation techniques, Loading of data, Hive queries through Spark, Spark applications, SQL library, Support for JSON and Parquet file formats.
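The general workflow behind this module (load semi-structured records, expose them as a table, query them with SQL) can be tried without a cluster. The sketch below uses Python's built-in `json` and `sqlite3` modules purely as a stand-in; it is not Spark SQL, and the table name `results` and the sample records are made up for illustration.

```python
import json
import sqlite3

# Stand-in for the load-then-query workflow; this is SQLite, NOT Spark SQL.
# The records and the table name `results` are hypothetical.

json_lines = [
    '{"name": "Asha", "course": "Spark", "score": 82}',
    '{"name": "Ravi", "course": "Spark", "score": 90}',
    '{"name": "Meera", "course": "Hive", "score": 70}',
]

# Step 1: parse one JSON object per line into Python dictionaries.
records = [json.loads(line) for line in json_lines]

# Step 2: load the records into an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (name TEXT, course TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (:name, :course, :score)", records
)

# Step 3: query the table with ordinary SQL.
rows = conn.execute(
    "SELECT course, AVG(score) FROM results GROUP BY course ORDER BY course"
).fetchall()
conn.close()
```

After this runs, `rows` holds one (course, average score) tuple per course. In Spark the same three steps apply, except that the data and the query execution are distributed across a cluster.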

Learning Outcomes

  • Define and explain Spark Streaming
  • Understand RDDs and their operations, along with the implementation of Spark algorithms
  • Understand the difference between Apache Spark and Hadoop
  • Learn the concept of Scala classes and use pattern matching

Who Should Attend?

  • Engineering and IT students
  • Graduates with a programming background

Job Prospects

  • Senior Database Engineer
  • Spark Programming
  • Data Intelligence – Spark
  • Lead Solution Advisor - Apache Spark


After completing this course and successfully passing the certification examination, the student will be awarded the “Big data with Spark” certification.

If a learner chooses not to take the examination, they will still receive a “Participation Certificate”.