Let's use the covid19 dataset to build a small PoC on Spark Streaming with Kafka.
Steps:
- Download the latest Kafka release from https://www.apache.org/dyn/closer.cgi?path=/kafka/2.5.0/kafka_2.13-2.5.0.tgz
- Extract it to a folder, say D:\kafka_2.13-2.5.0
- Navigate to D:\kafka_2.13-2.5.0\kafka_2.13-2.5.0\bin\windows
- Copy D:\kafka_2.13-2.5.0\kafka_2.13-2.5.0\config\zookeeper.properties to D:\kafka_2.13-2.5.0\kafka_2.13-2.5.0\bin\windows
- Copy D:\kafka_2.13-2.5.0\kafka_2.13-2.5.0\config\server.properties to D:\kafka_2.13-2.5.0\kafka_2.13-2.5.0\bin\windows
- Start ZooKeeper:
  $ zookeeper-server-start.bat zookeeper.properties
- Start the Kafka server:
  $ kafka-server-start.bat server.properties
- Create a Kafka topic:
  $ kafka-topics.bat --zookeeper localhost:2181 --create --topic covid19india --partitions 2 --replication-factor 1
- Create a MySQL table to store the data:

  CREATE TABLE `covid19india` (
    `sno` varchar(100) DEFAULT NULL,
    `date_of_identification` varchar(100) DEFAULT NULL,
    `current_status` varchar(100) DEFAULT NULL,
    `state` varchar(100) DEFAULT NULL,
    `num_of_cases` int(11) DEFAULT NULL
  );
- Run the Kafka consumer code.
- Run the Kafka producer code.
- Check that your MySQL database table is getting populated.
- Visualize it using Tableau.
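The consumer step above could look like the following Spark Structured Streaming sketch: it reads JSON records from the covid19india topic and appends each micro-batch to the MySQL table over JDBC. The database name (covid_db), credentials, JSON message format, and column names are assumptions for illustration; you would also need the MySQL Connector/J jar on the Spark classpath.

```python
# Sketch of a Spark Structured Streaming consumer: Kafka -> MySQL.
# Assumes Kafka on localhost:9092 and messages that are JSON objects
# matching the covid19india table columns.

def jdbc_url(host, port, db):
    """Build the MySQL JDBC URL used by the batch writer."""
    return f"jdbc:mysql://{host}:{port}/{db}"

def main():
    # pyspark imports kept inside main() so the module loads without Spark
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StringType, IntegerType

    # Schema mirroring the MySQL table created earlier
    schema = (StructType()
              .add("sno", StringType())
              .add("date_of_identification", StringType())
              .add("current_status", StringType())
              .add("state", StringType())
              .add("num_of_cases", IntegerType()))

    spark = SparkSession.builder.appName("Covid19KafkaConsumer").getOrCreate()

    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", "localhost:9092")
           .option("subscribe", "covid19india")
           .option("startingOffsets", "earliest")
           .load())

    # Kafka values arrive as bytes; cast to string and parse the JSON
    parsed = (raw.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), schema).alias("r"))
              .select("r.*"))

    def write_batch(batch_df, batch_id):
        # Append each micro-batch to the MySQL table via JDBC
        (batch_df.write.format("jdbc")
         .option("url", jdbc_url("localhost", 3306, "covid_db"))  # placeholder DB
         .option("dbtable", "covid19india")
         .option("user", "root")          # placeholder credentials
         .option("password", "password")
         .mode("append")
         .save())

    (parsed.writeStream
     .foreachBatch(write_batch)
     .start()
     .awaitTermination())

if __name__ == "__main__":
    main()
```

foreachBatch is used here (rather than a streaming JDBC sink, which Spark does not provide) so the ordinary batch JDBC writer can handle the insert.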
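The producer step could be sketched as below: read the dataset CSV and publish one JSON record per row to the covid19india topic. The CSV file name, its column order, and the use of the kafka-python package are assumptions; adjust them to match your actual dataset file.

```python
# Sketch of a Kafka producer for the covid19 dataset.
# Assumes `pip install kafka-python` and a broker on localhost:9092.
import csv
import json

def row_to_record(row):
    """Map one CSV row (list of strings) onto the table's columns.
    Column order is an assumption matching the covid19india schema."""
    return {
        "sno": row[0],
        "date_of_identification": row[1],
        "current_status": row[2],
        "state": row[3],
        "num_of_cases": int(row[4]),
    }

def publish(csv_path, bootstrap="localhost:9092", topic="covid19india"):
    from kafka import KafkaProducer  # imported here so parsing works without Kafka
    producer = KafkaProducer(
        bootstrap_servers=bootstrap,
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            producer.send(topic, row_to_record(row))
    producer.flush()

if __name__ == "__main__":
    publish("covid19india.csv")  # hypothetical file name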
Example use case:
https://www.dropbox.com/sh/c8l4uy57wisahub/AAAr9GQGtUb0pXzy5qykr9LZa?dl=0
Published By : Suraj Ghimire