1. (20pts) Conduct sentiment analysis on “Text 1” and “Text 2”. For each text, create a vector of length 70 containing the daily sentiment scores, with the score set to zero on any day when no news was released. Call these vectors S1 and S2. Use the Python code that I uploaded to Canvas. Note that on days with no news the text is read in as a NaN value, so you need to adjust the code to handle NaN values. Report the sample average of each of the two sentiment score vectors.
(Grading rule: There is no partial credit for this question.)
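The NaN adjustment described in question 1 can be sketched as follows. This is only an illustration under stated assumptions: `score_text` is a hypothetical placeholder for the scoring function in the course code, the word lists are made up, and the four-day input is toy data rather than the 70-day sample.

```python
import math


def score_text(text):
    """Hypothetical placeholder scorer: positive word count minus negative word count.

    Swap in the scoring function from the course code; the word lists here are made up.
    """
    positive = {"gain", "strong", "growth"}
    negative = {"loss", "weak", "decline"}
    words = text.lower().split()
    return float(sum(w in positive for w in words) - sum(w in negative for w in words))


def is_missing(entry):
    """Return True when a day has no news: None, a float NaN, or an empty string."""
    if entry is None:
        return True
    if isinstance(entry, float) and math.isnan(entry):
        return True
    return isinstance(entry, str) and entry.strip() == ""


def sentiment_vector(daily_news):
    """Return one score per day, using 0.0 on days without news."""
    return [0.0 if is_missing(entry) else score_text(entry) for entry in daily_news]


# Toy input (an assumption for illustration): 4 days instead of 70.
news = ["Strong growth reported", float("nan"), "Weak sales and a loss", None]
scores = sentiment_vector(news)
average = sum(scores) / len(scores)
print(scores, average)
```

The check for missing entries runs before the scorer is called, so the scorer never receives a NaN. The average is taken over all days, including the zero-score days, which matches the vector definition in the question.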
2. (20pts) With the two sentiment score vectors in hand, our objective is to explain the returns of B. Regress the B returns on each sentiment vector (S1, S2) and report the coefficient estimates. That is, run two regressions: (i) regress B returns on a constant and S1, and (ii) regress B returns on a constant and S2. Based on the regression results, infer whether B is affiliated with the IT sector or the Biopharma sector.
(Grading rule: You are required to report the coefficient estimate on the sentiment vector and the R² value from each regression. Failure to do so will result in deductions in increments of 10 points.)
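A regression of returns on a constant and a single sentiment vector has a closed-form solution, sketched below in plain Python. The function name `ols_with_constant` and the toy numbers are assumptions for illustration only; they are not the B returns or the actual sentiment scores, and a statistics package may be used instead.

```python
def ols_with_constant(y, x):
    """Regress y on a constant and x; return (intercept, slope, r_squared).

    Assumes x is not constant, otherwise the slope is undefined.
    """
    n = len(y)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Toy data (made up): y is exactly 1 + 2 * x, so the fit is perfect.
sentiment = [0.0, 1.0, 2.0, 3.0]
returns = [1.0, 3.0, 5.0, 7.0]
intercept, slope, r_squared = ols_with_constant(returns, sentiment)
print(intercept, slope, r_squared)
```

Calling the function once with S1 and once with S2 gives the two slope estimates and R² values that the grading rule asks for.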