
Unless you’ve been under a rock the past few weeks, you’ve probably heard about Elon Musk’s gaggle of DOGE lads who have been let loose inside the US Treasury payments system. Since they have apparently been given read/write access to some highly sensitive databases and are likely pushing hastily coded “efficiency improvements” to main as we speak, it’s in the public interest to understand how good these guys are as engineers.
Ethan Shaotran is a 22-year-old who got involved with Elon through the expedient of achieving second place in an xAI hackathon.¹ He runs a startup called Energize AI, which looks like it makes a nice wrapper for the ChatGPT API. He is also a student at Harvard. This information is all now somewhat difficult to discover, as he has deleted or set to private his Twitter, GitHub, and Harvard profiles, as well as the startup website. So much for government transparency under the new regime! One of Ethan’s claimed achievements in his now-deleted Harvard bio is being “the author of several AI books.” As far as I can tell, however, he has only written a single book² called Stock Prediction with Deep Learning.
In order to better understand the young minds now seemingly in charge of disbursing payments for the world’s largest economy, I have purchased Ethan’s magnum opus for the not-insignificant price of £15.79³ and read all 109 pages.
Ethan’s tender age
It must be noted, in the interest of fairness, that Ethan was 16 when he wrote this book. While for some, this might be a mitigating factor, I believe the book should be assessed as the work of a fully-fledged adult because:
- Prior to recent scrubbing, the book featured prominently on his Harvard bio page, implying that it’s work he’s proud of.
- It’s still for sale on Amazon, where the blurb claims that the book “tackles the common misconception that the stock market cannot be predicted, and builds a stock prediction algorithm to beat the stock market.” What if an unsuspecting layperson picks it up and trades away their life savings?
- Given the unrestricted access he has been granted at DOGE and the responsibility that comes with that, we might surmise that he is academically mature for his age and thus was capable of producing a passable textbook in his mid-teens. Gauss was producing major breakthroughs in number theory at age 16, so why not our Ethan?
With all that being said, what’s this book actually like? Maybe it’s good? Maybe the Treasury is in safe hands?
The Book
The book is not good. In fact, I would say it is some of the worst technical writing I have ever read. The book really has very little to do with stock prediction or deep learning and is basically Ethan’s first-ever Python script, with some added nonsensical explanations about neural networks and financial markets. The proposed system trades based on a naive “sentiment analysis” of a few news articles and is never going to give useful outputs, which is likely why it isn’t evaluated at any point.
Foreword by Dr. J. Mark Munoz
We kick things off with a foreword from Dr. J. Mark Munoz, a professor of Management and International Business at Millikin University. The foreword is seemingly unconnected to the remainder of the text. The strange thing about this section is that it has a real AI slop vibe, despite the fact that the book was published five years before the release of ChatGPT. It includes a list of 12 ways that AI is going to “revolutionize companies,” including gems such as “Reframing of competitive parameters” and “Unprecedented levels of accountability and ethical pressures.” Great stuff! According to Prof. Munoz:
Utilizing the lessons from this book would be like stepping into the AI/DL fast train that is headed towards an amazing and forthcoming technological revolution.
Well, what are we waiting for? Let’s utilize some lessons!
The System
So what exactly is the revolutionary system that Ethan claims “beat the market” in 2017? The text isn’t exactly clear on the details, but as far as I can tell, it goes something like this:
- Choose some stocks, e.g., Tesla, Apple.
- Find some news articles that mention their ticker.
- Take all the words in the article and keep track of the number of positive and negative words (according to a big list he’s got from somewhere); the difference between these counts is the sentiment score.
- Find the slope of the OLS fit to the stock’s closing price over the next five days.
- Fit a very simple neural network classifier to predict the sign of the slope, using two features: the sentiment score and the source of the article.
- ???
- Profit.
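To pin down the sentiment score and the slope label, here is a minimal sketch of my reading of those two steps. It is not the book's code: the word lists are tiny, hypothetical stand-ins for the big list, and the function names `sentiment_score`, `ols_slope`, and `slope_label` are mine. The slope is an ordinary least-squares fit of closing price against day index.

```python
import re

# Hypothetical word lists; the book uses a much larger list of unknown origin.
POSITIVE_WORDS = {"gain", "growth", "beat", "strong", "record"}
NEGATIVE_WORDS = {"loss", "lawsuit", "miss", "weak", "recall"}


def sentiment_score(article_text):
    """Number of positive words minus number of negative words."""
    words = re.findall(r"[a-z']+", article_text.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives - negatives


def ols_slope(closing_prices):
    """Slope of the least-squares line through (0, p0), (1, p1), ..."""
    n = len(closing_prices)
    if n < 2:
        raise ValueError("need at least two prices to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(closing_prices) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in enumerate(closing_prices)
    )
    variance = sum((x - mean_x) ** 2 for x in range(n))
    return covariance / variance


def slope_label(closing_prices):
    """Classifier target: 1 if the fitted trend is upward, else 0."""
    return 1 if ols_slope(closing_prices) > 0 else 0
```

Written out like this, the problem is easy to see: one integer per article, plus a label derived from five noisy prices, is not much for any model to work with.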
Now, on the face of it, this is a bad strategy that isn’t going to work for any number of reasons. That is, perhaps, why the book at no point contains any sort of evaluation of the model, backtest, or description of how the system could be used in practice. There is not even so much as an accuracy score for the classifier. We do get some insight into how the model might perform at the start of Chapter 7:
We just finished creating our prediction system. However, you may quickly notice two things:
- It’s actually not predicting very well.
- The algorithm without our Deep Learning system works better.
This might be surprising – shouldn’t Deep Learning help our system?
After this admission, Ethan goes on to construct some additional features based on historical values of the series, which he claims will improve performance.⁴ He then fits the model again but provides no evaluation or discussion of accuracy, which might lead us to again conclude that the model just doesn’t work. So the book can’t really teach us anything useful about ML or trading, but perhaps it redeems itself as an introduction to Python?
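For what it's worth, the missing evaluation needn't be elaborate. Below is a rough sketch, entirely my own and not taken from the book, of two things the chapter lacks: historical features that only read prices from before the article appeared, and an accuracy check against the dumbest possible baseline on held-out, later data. All function names are hypothetical, and I am assuming the rows are already sorted by date.

```python
from collections import Counter


def lagged_returns(closing_prices, article_day, lags=3):
    """Daily returns for the `lags` days ending the day before the article.

    Only prices with index < article_day are read, so nothing from the
    prediction window can leak into the features.
    """
    if article_day - lags - 1 < 0:
        raise ValueError("not enough price history before the article")
    history = closing_prices[article_day - lags - 1:article_day]
    return [later / earlier - 1 for earlier, later in zip(history, history[1:])]


def chronological_split(rows, train_fraction=0.8):
    """Train on the earliest rows and test on the latest; no shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("need equally sized, non-empty sequences")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always guessing the most common training label."""
    guess = Counter(train_labels).most_common(1)[0][0]
    return accuracy([guess] * len(test_labels), test_labels)
```

If the classifier's `accuracy` on the held-out rows cannot beat `majority_baseline_accuracy`, it has learned nothing worth trading on.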
The Code
The code in the book is littered with errors and demonstrates a poor grasp of the language and of programming generally. I used to teach 10-year-olds introductory Python at an afterschool club, and this makes their code look like it was written by Guido van Rossum. I have included an illustrative snippet in a footnote.⁵ At one point, he seems to leave his private API keys in the code. Good to know he’s got an eye for detail! Let’s hope that’s not your social security number! I cannot stress enough how bitterly disappointed you would be if you bought this book hoping to learn anything about coding in Python. Obviously, everyone has to start somewhere, but the fact he is still promoting this now is worrying.
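On the API keys: the usual way to keep secrets out of printed source code is to read them from the environment at runtime. A minimal sketch, in which the variable name `NEWS_API_KEY` and the helper `load_api_key` are hypothetical and nothing here is from the book:

```python
import os


def load_api_key(variable_name="NEWS_API_KEY"):
    """Read a secret from the environment so it never appears in source."""
    key = os.environ.get(variable_name)
    if not key:
        raise RuntimeError(f"set the {variable_name} environment variable first")
    return key
```

The key is then set in the shell or a deployment config, and the code that ends up in a book (or a repository) contains only the variable's name.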
Final Thoughts
I’ll leave you with a final thought I lifted straight from the chapter entitled “Final Thoughts”:
Learning a completely new concept is hard.
Very true, Ethan. Maybe this book does contain some insights after all. Let’s hope you remember that when you are diving into a decades-old system at the heart of the global economy. Good luck with the COBOL!
Thanks for reading. If you liked this post, you can follow me on Bluesky or subscribe to the RSS feed to get future posts.
1. For which he won cloud credits worth the princely sum of $5000.
2. He has also contributed to a compendium of short stories about AI and the future of work.
3. Don’t feel too bad, I was able to return it for a refund shortly after.
4. He does this in such a way that leaks data from the future, but let’s ignore that for now.
5. This is an actual code snippet lifted verbatim from the book (I have added comments):

    def SourceInt(url):
        url = url.lower()
        if 'cnn' in url:
            sourceint = 1
        elif 'fool.com' in url:
            sourceint = 2
        elif 'finance.yahoo.com' in url:
            sourceint = 3
        elif 'investopedia.com' in url:
            sourceint = 4
        elif 'businessinsider.com' in url:
            sourceint = 5
        elif 'marketwatch.com' in url:
            sourceint = 6
        elif 'forbes.com' in url:
            sourceint = 7
        elif 'thestreet.com' in url:
            sourceint = 8
        elif 'fortune.com' in url:
            sourceint = 9
        elif 'cnbc.com' in url:
            sourceint = 10
        elif 'nasdaq.com' in url:
            sourceint = 11
        elif 'investorplace.com' in url:
            sourceint = 12
        elif 'seekingalpha.com' in url:
            sourceint = 13
        elif 'nytimes' in url:
            sourceint = 14
        elif 'cnbc' in url:
            sourceint = 15
        elif 'apple' in url:
            sourceint = 16
        elif 'techcrunch' in url:
            sourceint = 17
        elif 'huff' in url:
            sourceint = 18
        elif 'fox' in url:
            sourceint = 19
        elif 'usatoday' in url:
            sourceint = 20
        elif 'reuters' in url:
            sourceint = 21
        elif 'npr' in url:
            sourceint = 22
        elif 'nbc' in url:
            sourceint = 23
        elif 'fox' in url:  # Already assigned 19 above?
            sourceint = 24
        elif 'wsj' in url:
            sourceint = 25
        elif 'marketwatch' in url:  # Already assigned 6 above?
            sourceint = 26
        elif 'bloomberg' in url:
            sourceint = 27
        elif 'guardian' in url:
            sourceint = 28
        elif 'usnews' in url:
            sourceint = 29
        elif 'livetradingnews' in url:
            sourceint = 30
        elif 'streetinsider' in url:
            sourceint = 31
        elif 'stocknewsjournal' in url:
            sourceint = 32
        else:
            sourceint = 0
        return sourceint
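For contrast, here is a sketch of how the same lookup might be written with a table instead of an `elif` ladder. This is my own illustration, not a corrected version of the book's function: the names `KNOWN_DOMAINS`, `source_id`, and `one_hot` are hypothetical, the table is a deduplicated subset of the outlets above, and the IDs it produces will not match the book's. It matches on the URL's hostname rather than on any substring, so an article with "apple" or "fox" somewhere in its path is not mistaken for a news source.

```python
from urllib.parse import urlparse

# Hypothetical, deduplicated subset of the outlets in the book's function.
# Each domain appears once, so the order only determines the ID.
KNOWN_DOMAINS = (
    "fool.com",
    "finance.yahoo.com",
    "investopedia.com",
    "businessinsider.com",
    "marketwatch.com",
    "forbes.com",
    "cnbc.com",
    "nasdaq.com",
    "seekingalpha.com",
)


def source_id(url):
    """Return a 1-based ID for a known outlet, or 0 if unrecognised.

    URLs without a scheme (e.g. "fool.com/x") have no parseable hostname
    and therefore also return 0.
    """
    host = (urlparse(url).hostname or "").lower()
    for index, domain in enumerate(KNOWN_DOMAINS, start=1):
        if host == domain or host.endswith("." + domain):
            return index
    return 0


def one_hot(url):
    """Encode the source as a 0/1 vector; position 0 means 'unknown'."""
    vector = [0] * (len(KNOWN_DOMAINS) + 1)
    vector[source_id(url)] = 1
    return vector
```

The `one_hot` helper is there because feeding a raw integer ID into a neural network implies that source 8 is somehow "twice" source 4; a one-hot vector avoids inventing an ordering that doesn't exist.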