Comments on: Spark Use Case – Youtube Data Analysis
https://acadgild.com/blog/spark-use-case-youtube-data-analysis/

By: Satyam (Wed, 13 Apr 2016)
https://acadgild.com/blog/spark-use-case-youtube-data-analysis/#comment-1748

Hi Raj,

Your approach is also correct; we can use filter as well, but in this post we followed a different approach using conditional statements. In our next use cases, on the Titanic and Olympic datasets, we extract the results using the filter approach. You can have a look at https://acadgild.com/blog/spark-use-case-titanic-data-analysis/
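For illustration only, here is a rough sketch of the two styles being discussed, assuming a tab-separated input where the category sits at column index 3 (the field position and the variable names are placeholders, not the exact code from the post):

// Conditional-statement style: use a mutable var and guard the column access with an if.
val categoriesViaIf = textFile.map(_.split("\t")).map { cols =>
  var category = ""
  if (cols.length > 3) category = cols(3)
  category
}

// Filter style: drop rows that are too short, then read the column directly.
val categoriesViaFilter = textFile.map(_.split("\t")).filter(_.length > 3).map(_(3))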
Please let us know your feedback on the same.

By: Raj (Fri, 08 Apr 2016)
https://acadgild.com/blog/spark-use-case-youtube-data-analysis/#comment-1610

Hi,

In problem statement 1, line #3, you have used ‘var’ to check the column count and extract column #3. Instead of using ‘var’, you can write it like this:

val counts = textFile.map(line => line.split("\t")).filter(columns => columns.length > 3).map(columns => columns(3))

or

val counts = textFile.map(_.split("\t")).filter(_.length > 3).map(_(3))
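To show where this fits end to end, here is a rough, illustrative version of problem statement 1 built on the filter approach; the HDFS path, the column index, and the top-5 step are assumptions for the sketch, not the post's exact code:

val textFile = sc.textFile("hdfs://localhost:9000/youtubedata.txt")  // path is illustrative
// Keep only rows that actually have a fourth column, then extract it as the category.
val categories = textFile.map(_.split("\t")).filter(_.length > 3).map(_(3))
// Count videos per category and take the five largest counts.
val top5 = categories.map(category => (category, 1))
  .reduceByKey(_ + _)
  .sortBy(_._2, ascending = false)
  .take(5)
top5.foreach(println)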
