WallStreetBets is a fast-growing Reddit community with almost 1.5 million members where users post about investing, speculative trading, and the stock market in general. Between memes about never-ending economic stimulus and the recent surge in tech stock prices, you’ll find members posting about their next big trade.
At first glance, it’s obvious that WallStreetBets (WSB for short) is not your usual investor forum. One of WallStreetBets’ mantras is YOLO (or, “you only live once”, for the uninitiated). Like a battle cry rallying soldiers, the amateur traders on WSB use the phrase to declare their commitment to a risky play. Some of these plays end in decisive victory, and some in bloody defeat.
Through these antics, the community has earned its way into popular business news publications such as CNBC, Business Insider, and Businessweek. According to a recent article on Bloomberg, even institutional traders are interested in what WSB has to say. Medium user Mjysong programmed a trading bot which performed trades on SPY based on the daily sentiment of WSB posts, generating an impressive 61.5% annual return. Most of this gain can be attributed to the fact that the trading bot was able to predict the market plummet in March 2020 resulting from the COVID-19 pandemic. In August 2020, Reddit user pdwp90 posted a sentiment analysis on WSB, again confirming that a dip in the community’s sentiment aligned with the market crash in March. It’s clear that WallStreetBets is more than a broken clock.
All this is great, but how does WallStreetBets perform in predicting the performance of individual stocks? After all, in the spirit of YOLO, trading individual stocks is generally riskier than trading the market; and with this increased risk comes the opportunity for greater returns.
In an attempt to answer this question, I examined WSB predictions for individual company earnings during the Q2 2020 earnings season, and whether these predictions paid off.
Earnings season is a fun time for WSB, as earnings releases provide opportunities for short-term trades with large payoffs on good news. Q2 2020 was also quite unusual: under the brunt of the COVID-19 lockdowns, companies such as airlines and retailers struggled to stay afloat, while adaptable internet-based companies saw their valuations climb sharply.
Using the Python library praw, I scraped several hundred top level comments from weekly threads titled “Most Anticipated Earnings Releases” from the weeks beginning July 6th through September 7th, 2020. In these threads, users discuss their plays for the stocks of companies which are expected to announce their quarterly earnings in the next week.
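For readers who want to try the scraping step, here is a minimal sketch of how it might look with praw. It is not the exact script used for this post: the helper names (ScrapedComment, top_level_comments, scrape_threads), the use of subreddit search to find the weekly threads, and the credentials you pass in are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScrapedComment:
    author: str
    body: str
    upvotes: int


def top_level_comments(submission):
    """Return the top-level comments of a praw Submission as plain records."""
    # Drop the "load more comments" placeholders so that only real comments remain.
    submission.comments.replace_more(limit=0)
    records = []
    # Iterating over the comment forest yields top-level comments only.
    for comment in submission.comments:
        author = comment.author.name if comment.author else "[deleted]"
        records.append(ScrapedComment(author, comment.body, comment.score))
    return records


def scrape_threads(client_id, client_secret, user_agent, title_query, limit=10):
    """Search r/wallstreetbets for threads matching title_query (hypothetical helper)."""
    import praw  # imported here so top_level_comments works without praw installed

    reddit = praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )
    results = []
    subreddit = reddit.subreddit("wallstreetbets")
    for submission in subreddit.search(title_query, sort="new", limit=limit):
        results.extend(top_level_comments(submission))
    return results
```

Calling scrape_threads with your own Reddit API credentials and a query such as "Most Anticipated Earnings Releases" should return a list of records ready for labelling. Search results may include unrelated threads, so checking the thread titles before using the comments is a sensible extra step.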
After collecting the comments, I spent about an hour manually labelling each one as bullish or bearish.
(For those cringing at having to manually label hundreds of comments, I have a good reason for doing this. The vocabulary used on WSB is quite different from standard English. For example, the word “tendies” generally indicates bullish sentiment, the idea being that the gains from a profitable trade would go towards purchasing chicken tenders, while the word “print” hints at a trade so profitable that it is akin to printing money. As such, standard NLP libraries such as NLTK would not be helpful without an appropriate training set, which does not exist, to my knowledge.)
At this point, I’d like to share some examples of comments that I especially liked:
LEVI calls — everyone put on some pounds during lockdown and has to buy new jeans — Reddit user lilwombi
High unemployment means many Americans eating PB&J. YOLOing on Smuckers calls. — Reddit user FFaddict13
Money printer go brrrrrr. Needs wd-40. Long $WDFC — Reddit user orochiman
As you can see, some truly top notch analysis.
Each comment is labelled with positive sentiment if it is bullish, and negative sentiment if it is bearish. Every comment is weighted by the log of the number of upvotes it has, since a comment is more likely to get upvoted once it has already picked up some steam.
The sentiment for a particular company’s earnings results is the sum of the sentiment of all relevant comments. One could think of the magnitude of the sentiment as WSB’s confidence in their prediction.
Comment sentiment = log(upvotes) × (+1 if bullish, -1 if bearish)
Stock sentiment = ∑ Comment sentiment of relevant comments
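The two formulas above can be sketched in a few lines of Python. This is only an illustration: the post does not specify the base of the logarithm or how comments with zero or negative scores are treated, so the natural log and the choice to give such comments zero weight are assumptions, and the upvote counts in the example are made up.

```python
import math
from collections import defaultdict


def comment_sentiment(upvotes, bullish):
    """log(upvotes) * (+1 if bullish, -1 if bearish)."""
    # Assumption: comments with fewer than 1 upvote carry no weight,
    # because log is undefined at 0 and for negative scores.
    if upvotes < 1:
        return 0.0
    direction = 1 if bullish else -1
    return direction * math.log(upvotes)


def stock_sentiment(labelled_comments):
    """Sum comment sentiment per ticker.

    labelled_comments is an iterable of (ticker, upvotes, bullish) tuples.
    """
    totals = defaultdict(float)
    for ticker, upvotes, bullish in labelled_comments:
        totals[ticker] += comment_sentiment(upvotes, bullish)
    return dict(totals)


# Made-up upvote counts, for illustration only.
example = [("LEVI", 120, True), ("LEVI", 8, False), ("WDFC", 45, True)]
scores = stock_sentiment(example)
```

Note that under this formula a comment with exactly one upvote contributes nothing, since log(1) = 0.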
After a bit of data wrangling, let’s see how WSB performed.
In the plot above, we can see the top 50 stocks (based on strength of sentiment) picked by WSB. Overall, we can see that WSB is generally quite bullish, with bearish predictions on only 3 stocks.
How did these earnings releases go? In the scatterplot below, WSB’s sentiment is the y-axis, while the earnings (EPS) surprise is the x-axis.
Not too shabby! Most of the points lie in the top right quadrant, where WSB correctly predicted an earnings beat. However, I’d be lying if I said that there was a meaningful correlation between WSB’s confidence level and a given stock’s earnings surprise (for the more quantitatively oriented, Pearson’s R = 0.30).
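For anyone who wants to reproduce a correlation figure like this one on their own data, here is a small sketch of Pearson's R in plain Python. The function name pearson_r is my own choice for illustration; in practice a library routine such as scipy.stats.pearsonr would do the same job.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (numerator) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)
```

Here xs would be the earnings surprise for each stock and ys the matching WSB sentiment score.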
However, earnings surprise is not the only thing that determines a stock’s movement after an earnings release. Other information in the release, such as future guidance, can have big effects on the stock price as well. In the plot below, the y-axis is still WSB’s sentiment, but the x-axis now represents the change in each stock’s price during the trading session following the respective earnings release.
Now, this plot looks a lot different from the previous one, and it reflects a lot more poorly on WSB’s predictions. The movement of WSB’s picks is about as random as bird shot (Pearson’s R = -0.04). You might as well have flipped a coin to decide your next trade.
Suppose, hypothetically, that you had traded earnings announcements using a strategy based on WSB’s advice, weighting your portfolio by our sentiment metric. You would have lost about 1.6% of your money during the Q2 2020 earnings season. Not too great.
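One way such a sentiment-weighted return could be computed is sketched below. The details are assumptions for illustration rather than a description of the exact calculation behind the figure above: each position is sized in proportion to the magnitude of its sentiment, bullish picks are held long and bearish picks short, and the numbers in the example are made up.

```python
def sentiment_weighted_return(positions):
    """Portfolio return for a list of (sentiment, price_change) pairs.

    price_change is the post-earnings move as a fraction (0.05 means +5%).
    """
    total_conviction = sum(abs(sentiment) for sentiment, _ in positions)
    if total_conviction == 0:
        raise ValueError("all sentiment scores are zero, so no weights can be formed")
    portfolio_return = 0.0
    for sentiment, price_change in positions:
        # Assumption: weight is the position's share of total absolute sentiment.
        weight = abs(sentiment) / total_conviction
        # Assumption: go long on bullish sentiment, short on bearish sentiment.
        direction = 1 if sentiment > 0 else -1
        portfolio_return += weight * direction * price_change
    return portfolio_return


# Made-up numbers: a confident bullish pick that rose 10%,
# and a weaker bearish pick that rose 4% (a loss for the short side).
example_return = sentiment_weighted_return([(3.0, 0.10), (-1.0, 0.04)])
```

In this toy example the long position earns 0.75 × 10% and the short position loses 0.25 × 4%, for a net return of 6.5%.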
Some Considerations and Possible Improvements
The process of data collection could obviously be improved. Using a customized dictionary and some NLP techniques would probably allow one to scrape a much larger volume of comments and label the sentiment automatically. One would still need a validation set with ground truth labels, though, to ensure a reasonably high degree of accuracy.
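As a rough idea of what a customized dictionary might look like, here is a minimal keyword-based labeller. The word lists are hypothetical and far from complete, and this approach was not used or validated for the results in this post.

```python
import re

# Hypothetical, deliberately tiny lexicons; a real one would need many more terms.
BULLISH_TERMS = {"calls", "long", "tendies", "print"}
BEARISH_TERMS = {"puts", "short"}


def label_comment(text):
    """Return "bullish", "bearish", or None when the keywords are absent or tied."""
    tokens = re.findall(r"[a-z]+", text.lower())
    bullish_hits = sum(token in BULLISH_TERMS for token in tokens)
    bearish_hits = sum(token in BEARISH_TERMS for token in tokens)
    if bullish_hits > bearish_hits:
        return "bullish"
    if bearish_hits > bullish_hits:
        return "bearish"
    return None  # leave ambiguous comments for manual review
```

Comments that come back as None would still need a human label, and comparing the automatic labels against a hand-labelled sample is the validation step described above.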
As mentioned earlier, Q2 2020 was definitely an unusual earnings season due to COVID-19. Companies may have issued future guidance that ran contrary to their past earnings results, leading to more stock movements that don’t align with the results of the earnings announcements. As such, the results from other quarters may be different.
We haven’t examined the many stocks that had poor short-term results but with which WSB members have found success by holding them for longer periods (TSLA is one that comes to mind).
While WallStreetBets has drawn the attention of professional traders and hedge fund managers, you should take any “investment advice” you find there with a grain of salt.
Code and data can be found here.