Additional Info: I designed a Python workflow to perform OCR on every xkcd comic, feed that text into a large language model, and ask the model whether each comic was about the category named in the title.
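Below is a minimal sketch of what a workflow like that could look like. It assumes the comics are already saved locally as image files, that Tesseract is installed for pytesseract, and that an OpenAI-style chat API is available; the function name `classify_comic`, the model name, and the prompt wording are illustrative assumptions, not the original code.

```python
# Hedged sketch: OCR each comic, then ask an LLM whether it is about a category.
# Assumptions: local PNGs under xkcd/, Tesseract installed, OPENAI_API_KEY set.
import glob

import pytesseract
from PIL import Image
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def classify_comic(image_path: str, category: str) -> bool:
    """OCR one comic, then ask the model whether it is about `category`."""
    text = pytesseract.image_to_string(Image.open(image_path))
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": (
                f"Here is the OCR'd text of an xkcd comic:\n{text}\n\n"
                f"Is this comic about {category}? Answer only YES or NO."
            ),
        }],
    )
    return response.choices[0].message.content.strip().upper().startswith("YES")


if __name__ == "__main__":
    hits = [p for p in sorted(glob.glob("xkcd/*.png"))
            if classify_comic(p, "existentialism")]
    print(f"{len(hits)} comics flagged as being about existentialism")
```

The yearly counts of flagged comics would then form the time series that gets correlated against the other variables in the table below.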
The number of xkcd comics published about existentialism correlates with...
| Variable | Correlation | Years | Has image? |
|---|---|---|---|
| The number of data entry keyers in Nebraska | r=0.96 | 16 yrs | Yes! |
| Google searches for 'daylight savings time' | r=0.88 | 17 yrs | Yes! |
| Air pollution in Huntington, Indiana | r=0.87 | 10 yrs | Yes! |
| Air pollution in Worcester, Massachusetts | r=0.86 | 10 yrs | Yes! |
| Average number of comments on MinuteEarth YouTube videos | r=0.86 | 11 yrs | Yes! |
| Customer satisfaction with Whirlpool | r=0.86 | 15 yrs | No |
| Air pollution in Mobile, Alabama | r=0.81 | 12 yrs | Yes! |
| Runs scored by the Detroit Tigers | r=0.6 | 17 yrs | No |
You caught me! While it would be intuitive to sort only by "correlation," I have a big, weird database. If I sort only by correlation, the top results often all come from one or two very large datasets (like weather or labor statistics), and they overwhelm the page.
I can't show you *all* the correlations, because my database would get too large and this page would take a very long time to load. Instead I opt to show you a subset, and I sort them by a magic system score. It starts with the correlation, but penalizes variables that repeat from the same dataset. (It also gives a bonus to variables I happen to find interesting.)
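To make that concrete, here is a minimal sketch of how a score like that could be computed. The specific weights (a 0.15 penalty per repeat from the same dataset, a 0.05 bonus for hand-picked variables) and the field names are my assumptions, not the site's actual values.

```python
# Hedged sketch of a "magic system score": start from |r|, penalize variables
# that repeat from the same source dataset, and add a small bonus for
# hand-picked "interesting" variables. Weights and field names are assumed.
from collections import Counter


def magic_score_rank(candidates, interesting_ids=frozenset()):
    """Rank candidate correlations by a score that discourages one dataset
    from dominating the page. Each candidate is a dict with keys
    'id', 'dataset', and 'r' (the correlation coefficient)."""
    seen = Counter()
    scored = []
    # Walk candidates from strongest to weakest correlation so that each
    # additional variable from an already-used dataset pays a larger penalty.
    for c in sorted(candidates, key=lambda x: abs(x["r"]), reverse=True):
        penalty = 0.15 * seen[c["dataset"]]
        bonus = 0.05 if c["id"] in interesting_ids else 0.0
        scored.append((abs(c["r"]) - penalty + bonus, c))
        seen[c["dataset"]] += 1
    return [c for score, c in sorted(scored, key=lambda t: t[0], reverse=True)]
```

Under this kind of scheme, a fourth or fifth variable from the same large dataset can be outranked by a slightly weaker correlation from a fresh source, which is the behavior described above.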