asked Apr 24, 2022 in Education by JackTerrance
We are building a sentiment analysis application. We converted our tweets DataFrame to an array and created another array of positive words, but we cannot count the number of tweets that contain at least one of those positive words. The code below returns 1, and the real count must be higher, so apparently the matches are not being counted:

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    var tweetDF = sqlContext.read.json("hdfs:///sandbox/tutorial-files/770/tweets_staging/*")
    tweetDF.show()
    var messages = tweetDF.select("msg").collect.map(_.toSeq)
    println("Total messages: " + messages.size)
    val positive = Source.fromFile("/home/teslavm/positive.txt").getLines.toArray
    var happyCount=0
    for (e <- 0 until messages.size) {
      for (f <- 0 until positive.size) {
        if (messages(e).contains(positive(f))){
          happyCount=happyCount+1
        }
      }
    }
    print("\nNumber of happy messages: " +happyCount)

1 Answer

This should work. It has the advantage that you do not have to collect the result, and it is more functional in style.

    import scala.io.Source

    // .as[String] needs the implicits from the session (import spark.implicits._)
    val messages = tweetDF.select("msg").as[String]

    // Lowercase the word list once so matching is case-insensitive
    val positiveWords = Source
      .fromFile("/home/teslavm/positive.txt")
      .getLines
      .toList
      .map(word => word.toLowerCase)

    // True if the message contains at least one positive word as a substring
    def hasPositiveWords(message: String): Boolean = {
      val _message = message.toLowerCase
      positiveWords.exists(word => _message.contains(word))
    }

    val positiveMessages = messages.filter(hasPositiveWords _)
    println(positiveMessages.count())

I tested this code locally with:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.master("local[*]").getOrCreate()
    import spark.implicits._

    val tweetDF = List(
      (1, "Yes I am happy"),
      (2, "Sadness is a way of life"),
      (3, "No, no, no, no, yes")
    ).toDF("id", "msg")

    val positiveWords = List("yes", "happy")

And it worked.
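As for why the original loop returned such a low number, the likely cause is that collect.map(_.toSeq) turns every row into a one-element Seq, and Seq.contains checks element equality rather than substrings. The sketch below illustrates the difference with plain Scala collections instead of Spark, so it runs without a cluster. The object name HappyCountSketch and the sample data are made up for illustration; the counting logic mirrors the exists/contains approach in the answer above.

```scala
// Plain Scala collections stand in for the Spark data; the sample
// messages and word list are hypothetical.
object HappyCountSketch {
  val messages: Seq[String] = Seq(
    "Yes I am happy",
    "Sadness is a way of life",
    "No, no, no, no, yes"
  )
  val positiveWords: Seq[String] = Seq("yes", "happy")

  // Mimics collect.map(_.toSeq): each row becomes a one-element Seq.
  val rows: Seq[Seq[Any]] = messages.map(m => Seq[Any](m))

  // Seq.contains tests element equality, so a row only "contains" a word
  // when the whole message is exactly that word.
  val elementMatches: Int =
    rows.count(row => positiveWords.exists(w => row.contains(w)))

  // String.contains tests substrings; exists stops at the first hit,
  // so each message is counted at most once.
  val happyCount: Int = messages.count { m =>
    val lower = m.toLowerCase
    positiveWords.exists(w => lower.contains(w))
  }

  def main(args: Array[String]): Unit = {
    println(s"Element-equality matches: $elementMatches")
    println(s"Number of happy messages: $happyCount")
  }
}
```

Two things to keep in mind: the nested loop in the question adds one per matching word, so a message with two positive words would be counted twice, whereas exists counts each message once. Also, substring matching will hit words embedded in longer ones (for example "yes" inside "eyes"), so splitting messages into tokens may be preferable if that matters for your data.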
