Additional Info: I designed a Python workflow to perform OCR on every xkcd comic, feed that text into a large language model, and ask the model whether this comic was about the category named in the title.
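A minimal sketch of what such a workflow could look like, not the actual code behind this page. The function name `classify_comics`, the prompt wording, and the two callables `ocr` and `ask_model` are all assumptions for illustration; in practice they would wrap whichever OCR tool and language model you use.

```python
from typing import Callable, Dict, Iterable


def classify_comics(
    image_paths: Iterable[str],
    category: str,
    ocr: Callable[[str], str],        # hypothetical: image path -> extracted text
    ask_model: Callable[[str], str],  # hypothetical: prompt -> model's reply
) -> Dict[str, bool]:
    """Return {image_path: True/False} for whether each comic matches the category."""
    results = {}
    for path in image_paths:
        # Step 1: extract the text of the comic.
        text = ocr(path)
        # Step 2: ask the language model a yes/no question about it.
        prompt = (
            f"Here is the text of an xkcd comic:\n{text}\n\n"
            f"Is this comic about {category}? Answer yes or no."
        )
        answer = ask_model(prompt)
        # Step 3: reduce the free-text reply to a boolean.
        results[path] = answer.strip().lower().startswith("yes")
    return results


# Stand-in callables so the sketch runs without any external service.
def stub_ocr(path: str) -> str:
    return {"a.png": "sudo make me a sandwich", "b.png": "a tree in a field"}[path]


def stub_model(prompt: str) -> str:
    return "Yes" if "sudo" in prompt else "No"


results = classify_comics(["a.png", "b.png"], "programming", stub_ocr, stub_model)
```

Counting the `True` results per publication year would then give an annual series of the kind being correlated here.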
xkcd comics published about programming correlates with...
Variable | Correlation | Years | Has img? |
Total views on SciShow Space YouTube videos | r=0.94 | 10yrs | No |
Total likes of Computerphile YouTube videos | r=0.93 | 11yrs | No |
How good LockPickingLawyer YouTube video titles are | r=0.93 | 9yrs | No |
Total comments on SciShow Space YouTube videos | r=0.92 | 10yrs | No |
Annual book sales in the US | r=0.91 | 6yrs | No |
Total views on Computerphile YouTube videos | r=0.91 | 11yrs | No |
The number of cashiers in West Virginia | r=0.87 | 16yrs | No |
Popularity of the first name Isabelle | r=0.86 | 16yrs | No |
Popularity of the first name James | r=0.85 | 16yrs | No |
Jet fuel used in Belize | r=0.83 | 15yrs | No |
The number of bellhops in Indiana | r=0.8 | 16yrs | No |
Robberies in Maryland | r=0.76 | 16yrs | No |
The number of psychiatrists in Colorado | r=0.74 | 16yrs | Yes! |
The number of secretaries in Utah | r=0.69 | 13yrs | No |
Petroleum consumption in Portugal | r=0.67 | 16yrs | No |
NASA's budget as a percentage of the total US Federal Budget | r=0.67 | 17yrs | No |
The number of library assistants in North Dakota | r=0.66 | 15yrs | Yes! |
How geeky PBS Space Time YouTube video titles are | r=-0.97 | 9yrs | No |
You caught me! While it would be intuitive to sort only by "correlation," I have a big, weird database. If I sort only by correlation, the top results often all come from one or two very large datasets (like the weather or labor statistics), and they overwhelm the page.
I can't show you *all* the correlations, because my database would get too large and this page would take a very long time to load. Instead I opt to show you a subset, and I sort them by a magic system score. It starts with the correlation, but penalizes variables that repeat from the same dataset. (It also gives a bonus to variables I happen to find interesting.)
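To make the idea concrete, here is a rough sketch of a score of that shape. It is not the real scoring code: the function name `rank_correlations`, the penalty and bonus sizes, the field names, and the example rows are all invented for illustration, and the real system may treat negative correlations differently.

```python
from collections import defaultdict


def rank_correlations(rows, repeat_penalty=0.1, interest_bonus=0.05):
    """Order variable names by a score built from r, a repeat penalty, and a bonus.

    rows: list of dicts with keys 'variable', 'dataset', 'r' and optionally
    'interesting'. The penalty and bonus values are made-up defaults.
    """
    seen = defaultdict(int)  # how many variables each dataset has contributed so far
    scored = []
    # Visit the strongest correlations first, so the first variable from each
    # dataset is never penalized and later ones are penalized progressively.
    for row in sorted(rows, key=lambda r: r["r"], reverse=True):
        score = row["r"]                                 # start with the correlation
        score -= repeat_penalty * seen[row["dataset"]]   # penalize repeated datasets
        if row.get("interesting"):
            score += interest_bonus                      # bonus for "interesting" variables
        seen[row["dataset"]] += 1
        scored.append((score, row["variable"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical rows: three from one large dataset, two from smaller ones.
example_rows = [
    {"variable": "A", "dataset": "labor", "r": 0.95},
    {"variable": "B", "dataset": "labor", "r": 0.94},
    {"variable": "C", "dataset": "labor", "r": 0.93},
    {"variable": "D", "dataset": "youtube", "r": 0.90},
    {"variable": "E", "dataset": "names", "r": 0.84, "interesting": True},
]
ranking = rank_correlations(example_rows)
```

Sorted by correlation alone, the three "labor" variables would fill the top of the list; with the penalty and bonus applied, D and E move ahead of B and C.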