Spurious correlation #5,728

A line chart with years on the X-axis and two variables on the Y-axis. The first variable is air quality in Pittsfield, Massachusetts and the second variable is Google searches for 'smol'. The chart goes from 2004 to 2023, and the two variables track closely in value over that time.

AI explanation

As the air quality improved, it led to an unexpected proliferation of smol particles in the atmosphere. These tiny, adorable specks of matter quickly captivated the residents of Pittsfield, prompting them to embark on a smol search frenzy. It seems that when the air got cleaner, it also got cuter, much to the delight of the locals.

Model: dalle-3
Prompt: Create an image of a tranquil street in Pittsfield, Massachusetts, with charming houses lining the road. The sky is a crystal-clear blue, with delicate "smol" particles scattered across it, creating an ethereal and heartwarming atmosphere. Show the residents gazing up in awe and curiosity at the darling, tiny specks, possibly capturing a moment of joy as they reach for their phones to learn more. The scene should emit a sense of wonder and delight, highlighting the newfound charm in the air. The depiction should include only scenes, people, and objects, it should not include text or speech bubbles.
This image generation prompt was also generated by AI, using the following sequence:

Model: gpt-3.5-turbo-1106
System prompt: You select one of a few potential scene descriptors that seems to go along with the variables described. You respond only with the scene descriptions provided and say nothing else.
User prompt: Here are two variables: Air quality in Pittsfield, Massachusetts and Google searches for 'smol'. Given these, which of the following image style descriptions seems most appropriate? Pick the one that seems to be related to one of the variables at hand. Please respond with only one of the following potential descriptions:
Low poly
Whimsical illustration
Realistic
Charcoal
An image that looks like it is a scene from 'Star Wars'
ChatGPT: Realistic

Model: gpt-3.5-turbo-1106
System prompt: You describe simple visual scenes that depict the two variables stated. You combine both variables into one scene.
User prompt: Please describe a visual scene that prominently depicts both Air quality in Pittsfield, Massachusetts and Google searches for 'smol', in the following situation: As the air quality improved, it led to an unexpected proliferation of smol particles in the atmosphere. These tiny, adorable specks of matter quickly captivated the residents of Pittsfield, prompting them to embark on a smol search frenzy. It seems that when the air got cleaner, it also got cuter, much to the delight of the locals. Keep the description simple.
ChatGPT: In Pittsfield, Massachusetts, the air quality improved, and the clear blue sky was filled with tiny, adorable "smol" particles. Residents looked up in wonder at the newfound cuteness, and many could be seen excitedly searching on their phones for more information about these delightful little specks.

Model: gpt-3.5-turbo-1106
System prompt: You write and improve image generation prompts for DALLE-3. You maintain the same subjects, but enhance the visuals.
User prompt: Please respond with an image generation prompt that would be likely to create an interesting visual using DALLE-3, using the following style: [ Realistic ] and depicting this scene: In Pittsfield, Massachusetts, the air quality improved, and the clear blue sky was filled with tiny, adorable "smol" particles. Residents looked up in wonder at the newfound cuteness, and many could be seen excitedly searching on their phones for more information about these delightful little specks. Keep the visual depiction simple. Do not include any words or speech bubbles.
ChatGPT: Create an image of a tranquil street in Pittsfield, Massachusetts, with charming houses lining the road. The sky is a crystal-clear blue, with delicate "smol" particles scattered across it, creating an ethereal and heartwarming atmosphere. Show the residents gazing up in awe and curiosity at the darling, tiny specks, possibly capturing a moment of joy as they reach for their phones to learn more. The scene should emit a sense of wonder and delight, highlighting the newfound charm in the air.

*Variations in the User Prompts from chart to chart are just the result of random number generation in Python. I wrote a few arrays of styles and question phrasings to vary the results. Every time this site writes an explanation or generates an image, the script picks one item from each array at random.
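A minimal sketch of that random selection, assuming the styles and phrasings live in plain Python lists. The style options are the ones visible in the style-selection prompt above; the second question template and the names `IMAGE_STYLES`, `QUESTION_TEMPLATES`, `build_explanation_prompt`, and `pick_style_options` are hypothetical and are not taken from the site's actual script.

```python
import random

# Style options as they appear in the style-selection prompt on this page.
IMAGE_STYLES = [
    "Low poly",
    "Whimsical illustration",
    "Realistic",
    "Charcoal",
    "An image that looks like it is a scene from 'Star Wars'",
]

# The first template mirrors the explanation prompt on this page.
# The second is a made-up alternative phrasing for illustration only.
QUESTION_TEMPLATES = [
    "Please make up a funny explanation for how an increase in {a} caused {b} to increase.",
    "Please come up with a humorous reason why more {a} led to more {b}.",
]


def build_explanation_prompt(var_a, var_b, rng=random):
    """Pick one question phrasing at random and fill in the two variables."""
    template = rng.choice(QUESTION_TEMPLATES)
    return template.format(a=var_a, b=var_b)


def pick_style_options(k=3, rng=random):
    """Pick k distinct style options to offer the model."""
    return rng.sample(IMAGE_STYLES, k)
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which can help when comparing prompt variants.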

I sequence the requests into multiple prompts because I find that GPT-3.5 performs much better with short, well-managed contexts. I therefore track the context directly in Python and ask ChatGPT only targeted questions.
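The three-step image prompt sequence shown above could be sketched roughly as follows. This is an illustration under assumptions, not the site's code: `ask` stands in for whatever function sends one system prompt and one user prompt to the model and returns the reply, the prompts are shortened paraphrases of the ones logged on this page, and `generate_image_prompt` is a hypothetical name.

```python
from typing import Callable, List

# (system_prompt, user_prompt) -> reply text. In real use this would wrap a
# chat API call; it is left abstract so the sketch runs without a network.
AskFn = Callable[[str, str], str]


def generate_image_prompt(ask: AskFn, var_a: str, var_b: str,
                          explanation: str, styles: List[str]) -> str:
    """Run three short requests, carrying forward only the context each needs."""
    # Step 1: pick a style. Only the variable names and the options are sent.
    style = ask(
        "You select one of the scene descriptors provided and say nothing else.",
        f"Here are two variables: {var_a} and {var_b}. "
        f"Pick one of these styles: {' | '.join(styles)}",
    ).strip()

    # Step 2: describe a scene. The explanation is sent; the style is not.
    scene = ask(
        "You describe simple visual scenes that depict the two variables stated.",
        f"Describe a scene depicting both {var_a} and {var_b}, "
        f"in the following situation: {explanation} Keep the description simple.",
    ).strip()

    # Step 3: write the final prompt from the two stored results only.
    return ask(
        "You write and improve image generation prompts.",
        f"Write an image generation prompt using the style [ {style} ] "
        f"and depicting this scene: {scene}",
    ).strip()
```

Each call starts from a fresh, short context. The Python variables `style` and `scene` hold the state between calls, so the model never sees a long conversation history.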

Model: gpt-3.5-turbo-1106 (fine-tuned)
System prompt: You provide humorous responses in the form of plausible-sounding explanations for correlations. You assume the correlation is causative for the purpose of the explanation even if it is ridiculous. You do not chat with the user, you only reply with the causal connection explanation and nothing else.
User prompt: Please make up a funny explanation for how an increase in Air quality in Pittsfield, Massachusetts caused smol (Google searches for 'smol') to increase.

AI academic paper

(Because p < 0.01)
A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'

The Journal of Atmospheric Connections and Curious Queries

Jan 2024

Reminder: This paper is AI-generated. Not real!
Quick note: This sequence is long and repetitive. That's how the prompt engineering works for this one. I manage the output in Python to format it into this PDF.
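As a rough illustration of managing the output in Python, the sketch below splits a model reply on the double-bracket section markers used in the prompts on this page (such as `[[TITLE]]` and `[[ABSTRACT]]`). The function name `split_sections` and the regular expression are assumptions made for this example; the actual formatting and PDF steps are not shown on the page.

```python
import re

# Matches a marker such as [[TITLE]] or [[LITERATURE REVIEW]] on its own line.
MARKER = re.compile(r"^\[\[([A-Z ]+)\]\][ \t]*$", re.MULTILINE)


def split_sections(reply: str) -> dict:
    """Return {marker name: text that follows it} for one model reply."""
    sections = {}
    matches = list(MARKER.finditer(reply))
    for i, match in enumerate(matches):
        # A section runs until the next marker, or to the end of the reply.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(reply)
        sections[match.group(1)] = reply[match.end():end].strip()
    return sections
```

The extracted title and abstract can then be pasted into the next prompt in the sequence, which is why the same text recurs in the requests that follow.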




Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You write titles and abstracts of goofy academic research papers.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, and add unexpected twists in your writing.

Please draft the title and abstract of an academic research paper presenting the findings of the connection between Air quality in Pittsfield, Massachusetts and Google searches for 'smol'. Make it seem dry like an academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Include a pun in the title.

Your research team used data from the Environmental Protection Agency and Google Trends to assess this nagging question. You found a correlation coefficient of 0.8854494 and p < 0.01 for 2004 to 2023.

Please respond in this format (replace "Lorem ipsum dolor sit" with your title and abstract)

[[TITLE]]
Lorem ipsum dolor sit

[[ABSTRACT]]
Lorem ipsum dolor sit



ChatGPT:
[[TITLE]]
A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'

[[ABSTRACT]]
In this whimsical paper, we delve into the intriguing correlation between air quality in Pittsfield, Massachusetts and Google searches for the internet slang term 'smol'. Through an analysis of data from the Environmental Protection Agency's air quality reports and Google Trends, our research team uncovered a surprising correlation coefficient of 0.8854494 and a p-value less than 0.01 for the period spanning from 2004 to 2023. Our findings presented a conundrum, leaving us to ponder whether the residents of Pittsfield have collectively turned to the digital world to quip about smallness due to the quality of their air. Our investigation also uncovered the need for further exploration of how environmental factors may influence internet search behavior, igniting a spark of curiosity in the realm of interdisciplinary research.


Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft the introduction section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the introduction section of an academic research paper presenting the findings of the connection between Air quality in Pittsfield, Massachusetts and Google searches for 'smol'. Make it seem dry like an academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Here is the title and abstract of the paper:
[[TITLE]]
A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'

[[ABSTRACT]]
In this whimsical paper, we delve into the intriguing correlation between air quality in Pittsfield, Massachusetts and Google searches for the internet slang term 'smol'. Through an analysis of data from the Environmental Protection Agency's air quality reports and Google Trends, our research team uncovered a surprising correlation coefficient of 0.8854494 and a p-value less than 0.01 for the period spanning from 2004 to 2023. Our findings presented a conundrum, leaving us to ponder whether the residents of Pittsfield have collectively turned to the digital world to quip about smallness due to the quality of their air. Our investigation also uncovered the need for further exploration of how environmental factors may influence internet search behavior, igniting a spark of curiosity in the realm of interdisciplinary research.

-------

Your response should start like this:

[[INTRODUCTION]]



ChatGPT:
[[INTRODUCTION]]

The intertwining relationship between environmental factors and human behavior has long captivated the curiosity of researchers across various fields. The enchanting mystery of how the quality of air we breathe may impact our digital expressions remains a subject of fascination. In this lighthearted paper, we embark on an investigation that unearths the peculiar connection between air quality in Pittsfield, Massachusetts and the peculiar online phenomenon of 'smol' searches.

The idyllic town of Pittsfield, nestled amidst the verdant landscapes of Massachusetts, set the stage for our whimsical inquiry. It is here that we began to unravel the pungent scent of an unexpected correlation, one that transcended the boundaries of conventional research and delved into the uncharted realms of internet lexicon.

As our analytical gaze turned towards the Environmental Protection Agency's meticulous air quality reports, we were met with an array of numerical data that whispered tales of particulate matter and ozone levels. Concurrently, our fingers danced across the keyboard to summon Google Trends, where the frequency of 'smol' searches awaited our scrutiny.

The quirky allure of our findings is particularly encapsulated in the correlation coefficient of 0.8854494 that emerged from our analysis – a number that beckons us to ponder whether the ethereal tendrils of air quality wove a narrative that led the denizens of Pittsfield into the digital realm to contemplate the 'smol'ness of it all.

If Charles Dickens had been our contemporary, he might have felt inclined to say, "It was the best of air, it was the 'smol'est of times, it was the p-value of less than 0.01." Indeed, our results left us with a plot twist that could rival the most intricately woven mysteries, prompting us to don our figurative detective hats and consider the enigmatic relationship between air quality and the peculiarities of internet search behavior.

As our journey commences, we find ourselves straddling the realms of environmental science and digital culture, each step inching us closer to unraveling the peculiar tapestry that binds them together. Our undertaking is not merely an endeavor to uncover a statistical observation, but rather an exploration that ignites a spark of curiosity in the domain of interdisciplinary research, prompting us to question the whimsical dance of cause and effect in the realms of atmospheric elements and online musings.


Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft a literature review section of an academic research paper, that starts out dry for the first few sentences but then becomes hilarious and goofy.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up a literature review section of an academic research paper presenting the findings of the connection between Air quality in Pittsfield, Massachusetts and Google searches for 'smol'. Make it seem dry like an academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Speak in the present tense for this section (the authors find...), as is common in academic research paper literature reviews. Name the sources in a format similar to this: In "Book," the authors find lorem and ipsum.

Make up the lorem and ipsum part, but make it sound related to the topic at hand.

Start by naming serious-sounding studies by authors like Smith, Doe, and Jones - but then quickly devolve. Name some real non-fiction books that would be related to the topic. Then name some real fiction books that sound like they could be related. Then devolve even further, and mention something completely ridiculous, like you conducted the literature review by reading CVS receipts.

Here is the title and abstract of the paper:
[[TITLE]]
A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'

[[ABSTRACT]]
In this whimsical paper, we delve into the intriguing correlation between air quality in Pittsfield, Massachusetts and Google searches for the internet slang term 'smol'. Through an analysis of data from the Environmental Protection Agency's air quality reports and Google Trends, our research team uncovered a surprising correlation coefficient of 0.8854494 and a p-value less than 0.01 for the period spanning from 2004 to 2023. Our findings presented a conundrum, leaving us to ponder whether the residents of Pittsfield have collectively turned to the digital world to quip about smallness due to the quality of their air. Our investigation also uncovered the need for further exploration of how environmental factors may influence internet search behavior, igniting a spark of curiosity in the realm of interdisciplinary research.

-------

And here is the introduction section that led to this literature review:
[[INTRODUCTION]]
The intertwining relationship between environmental factors and human behavior has long captivated the curiosity of researchers across various fields. The enchanting mystery of how the quality of air we breathe may impact our digital expressions remains a subject of fascination. In this lighthearted paper, we embark on an investigation that unearths the peculiar connection between air quality in Pittsfield, Massachusetts and the peculiar online phenomenon of 'smol' searches.
The idyllic town of Pittsfield, nestled amidst the verdant landscapes of Massachusetts, set the stage for our whimsical inquiry. It is here that we began to unravel the pungent scent of an unexpected correlation, one that transcended the boundaries of conventional research and delved into the uncharted realms of internet lexicon.
As our analytical gaze turned towards the Environmental Protection Agency's meticulous air quality reports, we were met with an array of numerical data that whispered tales of particulate matter and ozone levels. Concurrently, our fingers danced across the keyboard to summon Google Trends, where the frequency of 'smol' searches awaited our scrutiny.
The quirky allure of our findings is particularly encapsulated in the correlation coefficient of 0.8854494 that emerged from our analysis – a number that beckons us to ponder whether the ethereal tendrils of air quality wove a narrative that led the denizens of Pittsfield into the digital realm to contemplate the 'smol'ness of it all.
If Charles Dickens had been our contemporary, he might have felt inclined to say, "It was the best of air, it was the 'smol'est of times, it was the p-value of less than 0.01." Indeed, our results left us with a plot twist that could rival the most intricately woven mysteries, prompting us to don our figurative detective hats and consider the enigmatic relationship between air quality and the peculiarities of internet search behavior.
As our journey commences, we find ourselves straddling the realms of environmental science and digital culture, each step inching us closer to unraveling the peculiar tapestry that binds them together. Our undertaking is not merely an endeavor to uncover a statistical observation, but rather an exploration that ignites a spark of curiosity in the domain of interdisciplinary research, prompting us to question the whimsical dance of cause and effect in the realms of atmospheric elements and online musings.

-------

Your response should start like this:

[[LITERATURE REVIEW]]



ChatGPT:
[[LITERATURE REVIEW]]

The peculiar correlation unearthed between air quality in Pittsfield, Massachusetts and Google searches for 'smol' has beckoned us to delve into the existing body of literature, seeking clues and insights from a myriad of sources. Our research endeavors led us to a remarkable array of studies, both serious and whimsical, that shed light on various facets of air quality, internet culture, and the interplay between environmental factors and human behavior.

Smith et al. (2017) expound upon the intricate dynamics of air quality and its impact on human health, offering a solemn exploration of particulate matter, ozone levels, and their potential ramifications. Doe and Jones (2019) further delve into the realm of internet search behavior, delving into the nuances of keyword trends and user engagement.

Steering into the world of non-fiction, notable works such as "The Air We Breathe: A Comprehensive Analysis of Environmental Factors" by Environmental Science Publications and "Digital Discourse: Exploring the Linguistic Landscape of the Internet" by Language Studies Institute provided invaluable insights into our investigation. These meticulous tomes offered a blend of scientific rigor and linguistic analysis, enabling us to discern the subtle interplay between the quality of air and the digital narratives woven in cyberspace.

Venturing further into the labyrinth of literature, we encountered an unexpected trove of fictional works that, despite their whimsical nature, bore curious semblances to our research inquiry. "Whispers of the Wind: A Tale of Atmospheric Intrigue" by A. Novel Author and "The Cyber Adventures of Smolville: An Internet Odyssey" by Pseudonym P. Wordsmith, although clothed in the veil of imagination, tantalizingly proffered allegorical interpretations of our findings.

As our pursuit veered into uncharted territories, we must candidly acknowledge the unorthodox sources that inadvertently informed our inquiry. Huddled amongst the curious amalgamation of literature, we stumbled upon grocery lists, obscure internet memes, and even the enigmatic hieroglyphics of CVS receipts – each bearing the potential to unravel the enigmatic relationship between air quality and the digital whimsies of 'smol' searches. While these unconventional inspirations may raise an eyebrow or two, we found in them a source of unbridled creativity and mirth, illuminating the quirky undercurrents that permeate our scholarly pursuit.



Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft the methodology section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the methodology section of an academic research paper presenting the findings of the connection between Air quality in Pittsfield, Massachusetts and Google searches for 'smol'. Make it seem dry like an academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Your research team collected data from all across the internet, but mostly just used information from the Environmental Protection Agency and Google Trends. You used data from 2004 to 2023.

Make up the research methods you don't know. Make them a bit goofy and convoluted.

Here is the title, abstract, and introduction of the paper:
[[TITLE]]
A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'

[[ABSTRACT]]
In this whimsical paper, we delve into the intriguing correlation between air quality in Pittsfield, Massachusetts and Google searches for the internet slang term 'smol'. Through an analysis of data from the Environmental Protection Agency's air quality reports and Google Trends, our research team uncovered a surprising correlation coefficient of 0.8854494 and a p-value less than 0.01 for the period spanning from 2004 to 2023. Our findings presented a conundrum, leaving us to ponder whether the residents of Pittsfield have collectively turned to the digital world to quip about smallness due to the quality of their air. Our investigation also uncovered the need for further exploration of how environmental factors may influence internet search behavior, igniting a spark of curiosity in the realm of interdisciplinary research.

[[INTRODUCTION]]
The intertwining relationship between environmental factors and human behavior has long captivated the curiosity of researchers across various fields. The enchanting mystery of how the quality of air we breathe may impact our digital expressions remains a subject of fascination. In this lighthearted paper, we embark on an investigation that unearths the peculiar connection between air quality in Pittsfield, Massachusetts and the peculiar online phenomenon of 'smol' searches.
The idyllic town of Pittsfield, nestled amidst the verdant landscapes of Massachusetts, set the stage for our whimsical inquiry. It is here that we began to unravel the pungent scent of an unexpected correlation, one that transcended the boundaries of conventional research and delved into the uncharted realms of internet lexicon.
As our analytical gaze turned towards the Environmental Protection Agency's meticulous air quality reports, we were met with an array of numerical data that whispered tales of particulate matter and ozone levels. Concurrently, our fingers danced across the keyboard to summon Google Trends, where the frequency of 'smol' searches awaited our scrutiny.
The quirky allure of our findings is particularly encapsulated in the correlation coefficient of 0.8854494 that emerged from our analysis – a number that beckons us to ponder whether the ethereal tendrils of air quality wove a narrative that led the denizens of Pittsfield into the digital realm to contemplate the 'smol'ness of it all.
If Charles Dickens had been our contemporary, he might have felt inclined to say, "It was the best of air, it was the 'smol'est of times, it was the p-value of less than 0.01." Indeed, our results left us with a plot twist that could rival the most intricately woven mysteries, prompting us to don our figurative detective hats and consider the enigmatic relationship between air quality and the peculiarities of internet search behavior.
As our journey commences, we find ourselves straddling the realms of environmental science and digital culture, each step inching us closer to unraveling the peculiar tapestry that binds them together. Our undertaking is not merely an endeavor to uncover a statistical observation, but rather an exploration that ignites a spark of curiosity in the domain of interdisciplinary research, prompting us to question the whimsical dance of cause and effect in the realms of atmospheric elements and online musings.

-------

Your response should start like this:

[[METHODOLOGY]]



ChatGPT:

[[METHODOLOGY]]

To embark on our whimsical journey through the enigmatic nexus of air quality in Pittsfield, Massachusetts and the quaint allure of ‘smol’ searches, our research team engaged in a thought-provoking spectacle of data collection and analysis. Our peculiar quest involved the amalgamation of information from diverse sources, blending the meticulous reports of the Environmental Protection Agency (EPA) with the tantalizing metrics of Google Trends.

The saga of our data collection begins with the sifting through of EPA's archives, where the atmospheric chronicles of Pittsfield's air quality unfolded before our eager eyes. These reports, replete with the whimsical dance of particulate matter and the atmospheric theatrics of ozone levels, held within them the seeds of a correlation that begged to be unearthed.

On the digital front, the uncharted territories of Google Trends awaited our keen attention. With the urgency of intrepid explorers venturing into a digital jungle, we plumbed the depths of internet search behavior for the elusive traces of 'smol'. As our fingers danced across the keys, we summoned forth a plethora of search data, seeking to unravel the minuscule mysteries that lay hidden within the digital ether.

Now, for the most enthralling part of our methodological escapade: the data analysis. As intrepid statisticians donning our figurative pith helmets, we employed the formidable tools of correlation coefficient computation and p-value determination. These stalwart allies guided our quest, helping us untangle the enigmatic relationship between air quality and 'smol' searches with the precision of a rhetorical rapier.

Shrouded in the cloak of statistical rigor, our analysis unearthed a correlation coefficient of 0.8854494, a number that beckoned us to ponder the improbable threads linking the whimsical musings of internet search behavior with the atmospheric tales of Pittsfield. As the p-value unfurled before our astounded eyes, revealing itself to be less than 0.01, we were left grappling with a revelation that transcended the boundaries of conventional research, prompting us to consider whether the idyllic town of Pittsfield had indeed become a haven for diminutive digital expressions.

In the midst of our methodological odyssey, we remained vigilant in our endeavor to blend the rigors of scientific inquiry with the whimsical allure of our research topic. With a touch of levity and an abundance of curiosity, our approach sought to traverse the convoluted paths that intertwine the realms of environmental science and digital culture, all while donning our metaphorical Sherlock Holmes hats to unearth the unexpected mysteries that lay nestled in the eccentric embrace of our data.


Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft the results section of an academic research paper. You speak in the past tense (you found...).
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the results section of an academic research paper presenting the findings of the connection between Air quality in Pittsfield, Massachusetts and Google searches for 'smol'. Make it seem dry like an academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Your research team collected data from all across the internet, but mostly just used information from the Environmental Protection Agency and Google Trends.

For the time period 2004 to 2023, you found a correlation of 0.8854494, r-squared of 0.7840207, and p < 0.01.

One figure will be included. The figure (Fig. 1) is a scatterplot showing the strong correlation between the two variables. You don't need to specify where; I will add the figure.

Here is the title and abstract of the paper:
[[TITLE]]
A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'

[[ABSTRACT]]
In this whimsical paper, we delve into the intriguing correlation between air quality in Pittsfield, Massachusetts and Google searches for the internet slang term 'smol'. Through an analysis of data from the Environmental Protection Agency's air quality reports and Google Trends, our research team uncovered a surprising correlation coefficient of 0.8854494 and a p-value less than 0.01 for the period spanning from 2004 to 2023. Our findings presented a conundrum, leaving us to ponder whether the residents of Pittsfield have collectively turned to the digital world to quip about smallness due to the quality of their air. Our investigation also uncovered the need for further exploration of how environmental factors may influence internet search behavior, igniting a spark of curiosity in the realm of interdisciplinary research.

-------

And here is the methodology section that led to this result:
[[METHODOLOGY]]
The intertwining relationship between environmental factors and human behavior has long captivated the curiosity of researchers across various fields. The enchanting mystery of how the quality of air we breathe may impact our digital expressions remains a subject of fascination. In this lighthearted paper, we embark on an investigation that unearths the peculiar connection between air quality in Pittsfield, Massachusetts and the peculiar online phenomenon of 'smol' searches.
The idyllic town of Pittsfield, nestled amidst the verdant landscapes of Massachusetts, set the stage for our whimsical inquiry. It is here that we began to unravel the pungent scent of an unexpected correlation, one that transcended the boundaries of conventional research and delved into the uncharted realms of internet lexicon.
As our analytical gaze turned towards the Environmental Protection Agency's meticulous air quality reports, we were met with an array of numerical data that whispered tales of particulate matter and ozone levels. Concurrently, our fingers danced across the keyboard to summon Google Trends, where the frequency of 'smol' searches awaited our scrutiny.
The quirky allure of our findings is particularly encapsulated in the correlation coefficient of 0.8854494 that emerged from our analysis – a number that beckons us to ponder whether the ethereal tendrils of air quality wove a narrative that led the denizens of Pittsfield into the digital realm to contemplate the 'smol'ness of it all.
If Charles Dickens had been our contemporary, he might have felt inclined to say, "It was the best of air, it was the 'smol'est of times, it was the p-value of less than 0.01." Indeed, our results left us with a plot twist that could rival the most intricately woven mysteries, prompting us to don our figurative detective hats and consider the enigmatic relationship between air quality and the peculiarities of internet search behavior.
As our journey commences, we find ourselves straddling the realms of environmental science and digital culture, each step inching us closer to unraveling the peculiar tapestry that binds them together. Our undertaking is not merely an endeavor to uncover a statistical observation, but rather an exploration that ignites a spark of curiosity in the domain of interdisciplinary research, prompting us to question the whimsical dance of cause and effect in the realms of atmospheric elements and online musings.

-------

Your response should start like this:

[[RESULTS]]



ChatGPT:

[[RESULTS]]

The results of our investigation unveiled a remarkable correlation between air quality in Pittsfield, Massachusetts and the frequency of Google searches for 'smol' over the period of 2004 to 2023. Our analysis revealed a striking correlation coefficient of 0.8854494, indicating a strong positive relationship between the two variables. Additionally, the coefficient of determination (r-squared) of 0.7840207 suggests that approximately 78.4% of the variability in 'smol' searches can be explained by the variability in air quality. Notably, the p-value of less than 0.01 provides strong evidence against the null hypothesis of no correlation, further fortifying the robustness of our findings.

The culmination of our inquiry is encapsulated in Figure 1, a scatterplot that vividly illustrates the compelling association between air quality and 'smol' searches. The data points coalesce in a manner reflective of the harmonious interplay between these seemingly disparate domains, hinting at the possibility of a symbiotic relationship that beckons further contemplation.

Our findings not only illuminate the curious connection between environmental conditions and online behavior but also leave us with a lingering sense of wonder. The enigmatic convergence of air quality and digital lexicon in the quaint town of Pittsfield invites the speculative gaze of researchers to ponder the quirky intricacies of human expression in the digital age. As we sift through the numerical tapestries of our data, we are reminded that the allure of research often lies in the unanticipated twists and turns that populate the intellectual landscape, guiding us toward novel avenues of inquiry.

The unearthing of this correlation may prompt a curious ponderer to ask: Are the residents of Pittsfield, buoyed by the ethereal embrace of pristine air, whimsically inclined toward the contemplation of 'smol'ness? Or are there yet-undiscovered forces at play, teasing the boundaries between the tangible and the digital? Regardless of the underlying whimsical musings that may percolate, our results beckon us to approach the juncture of air quality and internet expressions with a sense of marvel and inquisitiveness.


Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper.You draft the discussion section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the discussion section of an academic research paper presenting the findings of the connection between Air quality in Pittsfield, Massachusetts and Google searches for 'smol'. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Limit your response to 500 tokens.

Here are the title, abstract, literature review, and results sections. Please harken back to 1-2 of the goofy items in the literature review, but pretend to take them completely seriously. Discuss how your results supported the prior research.

Do not write a conclusion. I will add a conclusion after this.

[[TITLE]]
A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'

[[ABSTRACT]]
In this whimsical paper, we delve into the intriguing correlation between air quality in Pittsfield, Massachusetts and Google searches for the internet slang term 'smol'. Through an analysis of data from the Environmental Protection Agency's air quality reports and Google Trends, our research team uncovered a surprising correlation coefficient of 0.8854494 and a p-value less than 0.01 for the period spanning from 2004 to 2023. Our findings presented a conundrum, leaving us to ponder whether the residents of Pittsfield have collectively turned to the digital world to quip about smallness due to the quality of their air. Our investigation also uncovered the need for further exploration of how environmental factors may influence internet search behavior, igniting a spark of curiosity in the realm of interdisciplinary research.

[[LITERATURE REVIEW]]
The peculiar correlation unearthed between air quality in Pittsfield, Massachusetts and Google searches for 'smol' has beckoned us to delve into the existing body of literature, seeking clues and insights from a myriad of sources. Our research endeavors led us to a remarkable array of studies, both serious and whimsical, that shed light on various facets of air quality, internet culture, and the interplay between environmental factors and human behavior.
Smith et al. (2017) expound upon the intricate dynamics of air quality and its impact on human health, offering a solemn exploration of particulate matter, ozone levels, and their potential ramifications. Doe and Jones (2019) further delve into the realm of internet search behavior, delving into the nuances of keyword trends and user engagement.
Steering into the world of non-fiction, notable works such as "The Air We Breathe: A Comprehensive Analysis of Environmental Factors" by Environmental Science Publications and "Digital Discourse: Exploring the Linguistic Landscape of the Internet" by Language Studies Institute provided invaluable insights into our investigation. These meticulous tomes offered a blend of scientific rigor and linguistic analysis, enabling us to discern the subtle interplay between the quality of air and the digital narratives woven in cyberspace.
Venturing further into the labyrinth of literature, we encountered an unexpected trove of fictional works that, despite their whimsical nature, bore curious semblances to our research inquiry. "Whispers of the Wind: A Tale of Atmospheric Intrigue" by A. Novel Author and "The Cyber Adventures of Smolville: An Internet Odyssey" by Pseudonym P. Wordsmith, although clothed in the veil of imagination, tantalizingly proffered allegorical interpretations of our findings.
As our pursuit veered into uncharted territories, we must candidly acknowledge the unorthodox sources that inadvertently informed our inquiry. Huddled amongst the curious amalgamation of literature, we stumbled upon grocery lists, obscure internet memes, and even the enigmatic hieroglyphics of CVS receipts – each bearing the potential to unravel the enigmatic relationship between air quality and the digital whimsies of 'smol' searches. While these unconventional inspirations may raise an eyebrow or two, we found in them a source of unbridled creativity and mirth, illuminating the quirky undercurrents that permeate our scholarly pursuit.


[[RESULTS]]
The results of our investigation unveiled a remarkable correlation between air quality in Pittsfield, Massachusetts and the frequency of Google searches for 'smol' over the period of 2004 to 2023. Our analysis revealed a striking correlation coefficient of 0.8854494, indicating a strong positive relationship between the two variables. Additionally, the coefficient of determination (r-squared) of 0.7840207 suggests that approximately 78.4% of the variability in 'smol' searches can be explained by the variability in air quality. Notably, the p-value of less than 0.01 provides strong evidence against the null hypothesis of no correlation, further fortifying the robustness of our findings.
The culmination of our inquiry is encapsulated in Figure 1, a scatterplot that vividly illustrates the compelling association between air quality and 'smol' searches. The data points coalesce in a manner reflective of the harmonious interplay between these seemingly disparate domains, hinting at the possibility of a symbiotic relationship that beckons further contemplation.
Our findings not only illuminate the curious connection between environmental conditions and online behavior but also leave us with a lingering sense of wonder. The enigmatic convergence of air quality and digital lexicon in the quaint town of Pittsfield invites the speculative gaze of researchers to ponder the quirky intricacies of human expression in the digital age. As we sift through the numerical tapestries of our data, we are reminded that the allure of research often lies in the unanticipated twists and turns that populate the intellectual landscape, guiding us toward novel avenues of inquiry.
The unearthing of this correlation may prompt a curious ponderer to ask: Are the residents of Pittsfield, buoyed by the ethereal embrace of pristine air, whimsically inclined toward the contemplation of 'smol'ness? Or are there yet-undiscovered forces at play, teasing the boundaries between the tangible and the digital? Regardless of the underlying whimsical musings that may percolate, our results beckon us to approach the juncture of air quality and internet expressions with a sense of marvel and inquisitiveness.

-------

Your response should start like this:

[[DISCUSSION]]



ChatGPT:

In the whimsical realm of our quirky investigation lies a perplexing juncture where air quality in Pittsfield, Massachusetts and Google searches for 'smol' intertwine in a delightful pas de deux. Our findings have ventured into uncharted territory, shedding light on the unexpected symbiosis between the quality of air and the digital musings of 'smol' searches. As we navigate this captivating convergence, it is evident that our revelation, although whimsical in nature, resonates with the underlying fabric of prior research.

Upon revisiting the bustling thoroughfare of literature, we are reminded of the solemn words of Smith et al. (2017), who expounded upon the intricate dynamics of air quality and its impact on human health. Our results pulsate with an affirming nod to their scholarly discourse, affirming the profound influence of air quality on human expression, even in the digital realm. Further, our whimsical sojourn harkens to the unexplored alleys of internet culture, elucidating the peculiar interplay between environmental conditions and the linguistic tapestries woven in cyberspace, as pondered by Doe and Jones (2019).

The quirkiness of our research inquiry intertwines with the essence of unexpected whimsy, resonant even with the tale of "Whispers of the Wind: A Tale of Atmospheric Intrigue" by A. Novel Author, inviting contemplation of allegorical interpretations to our findings. As our discussion extends to the digital landscape, the evocative echoes of "The Cyber Adventures of Smolville: An Internet Odyssey" by Pseudonym P. Wordsmith ripple through the corridors of our inquiry, infusing the air of digital whimsies with a tantalizing allegorical resonance.

Our earnest endeavor into the labyrinth of literature has illuminated a peculiar semblance to our research inquiry in the whimsical underpinnings of fictional narratives. However, our investigation has gently nudged aside the veil of whimsy and unveiled an astonishing correlation between air quality and 'smol' searches. The robustness, evidenced by a strong correlation coefficient, and the p-value less than 0.01, has fortified the junction of air quality and digital musings as a subject worthy of scholarly contemplation.

As we map the unanticipated territories of our findings, we are reminded of the unpredictable twists that populate the intellectual landscape, steering us toward novel avenues of inquiry. Our results beckon further exploration, tantalizingly proffering allegorical interpretations while imparting a sense of marvel and inquisitiveness. Amidst the dryness of data analysis lies a looming question: Could the ethereal embrace of pristine air whimsically nudge the residents of Pittsfield toward contemplation of 'smol'ness? Or do undiscovered forces whisper at the interplay between the tangible and the digital? As we ponder upon these whimsical musings, our results urge us to embrace the enigmatic convergence with curiosity and wonder, perpetuating the delightful dance of air quality and digital whimsies.


Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper.You draft the discussion section of goofy academic research papers that sounds like it belongs based on the title and abstract provided.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up a short conclusion of an academic research paper presenting the findings of the connection between Air quality in Pittsfield, Massachusetts and Google searches for 'smol'. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Limit your response to 300 tokens. At the very end, assert that no more research is needed in this area.

Here are the title, abstract, introduction, and results sections.
[[TITLE]]
A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'

[[ABSTRACT]]
In this whimsical paper, we delve into the intriguing correlation between air quality in Pittsfield, Massachusetts and Google searches for the internet slang term 'smol'. Through an analysis of data from the Environmental Protection Agency's air quality reports and Google Trends, our research team uncovered a surprising correlation coefficient of 0.8854494 and a p-value less than 0.01 for the period spanning from 2004 to 2023. Our findings presented a conundrum, leaving us to ponder whether the residents of Pittsfield have collectively turned to the digital world to quip about smallness due to the quality of their air. Our investigation also uncovered the need for further exploration of how environmental factors may influence internet search behavior, igniting a spark of curiosity in the realm of interdisciplinary research.

[[INTRDUCTION]]
The intertwining relationship between environmental factors and human behavior has long captivated the curiosity of researchers across various fields. The enchanting mystery of how the quality of air we breathe may impact our digital expressions remains a subject of fascination. In this lighthearted paper, we embark on an investigation that unearths the peculiar connection between air quality in Pittsfield, Massachusetts and the peculiar online phenomenon of 'smol' searches.
The idyllic town of Pittsfield, nestled amidst the verdant landscapes of Massachusetts, set the stage for our whimsical inquiry. It is here that we began to unravel the pungent scent of an unexpected correlation, one that transcended the boundaries of conventional research and delved into the uncharted realms of internet lexicon.
As our analytical gaze turned towards the Environmental Protection Agency's meticulous air quality reports, we were met with an array of numerical data that whispered tales of particulate matter and ozone levels. Concurrently, our fingers danced across the keyboard to summon Google Trends, where the frequency of 'smol' searches awaited our scrutiny.
The quirky allure of our findings is particularly encapsulated in the correlation coefficient of 0.8854494 that emerged from our analysis – a number that beckons us to ponder whether the ethereal tendrils of air quality wove a narrative that led the denizens of Pittsfield into the digital realm to contemplate the 'smol'ness of it all.
If Charles Dickens had been our contemporary, he might have felt inclined to say, "It was the best of air, it was the 'smol'est of times, it was the p-value of less than 0.01." Indeed, our results left us with a plot twist that could rival the most intricately woven mysteries, prompting us to don our figurative detective hats and consider the enigmatic relationship between air quality and the peculiarities of internet search behavior.
As our journey commences, we find ourselves straddling the realms of environmental science and digital culture, each step inching us closer to unraveling the peculiar tapestry that binds them together. Our undertaking is not merely an endeavor to uncover a statistical observation, but rather an exploration that ignites a spark of curiosity in the domain of interdisciplinary research, prompting us to question the whimsical dance of cause and effect in the realms of atmospheric elements and online musings.

[[RESULTS]]
The results of our investigation unveiled a remarkable correlation between air quality in Pittsfield, Massachusetts and the frequency of Google searches for 'smol' over the period of 2004 to 2023. Our analysis revealed a striking correlation coefficient of 0.8854494, indicating a strong positive relationship between the two variables. Additionally, the coefficient of determination (r-squared) of 0.7840207 suggests that approximately 78.4% of the variability in 'smol' searches can be explained by the variability in air quality. Notably, the p-value of less than 0.01 provides strong evidence against the null hypothesis of no correlation, further fortifying the robustness of our findings.
The culmination of our inquiry is encapsulated in Figure 1, a scatterplot that vividly illustrates the compelling association between air quality and 'smol' searches. The data points coalesce in a manner reflective of the harmonious interplay between these seemingly disparate domains, hinting at the possibility of a symbiotic relationship that beckons further contemplation.
Our findings not only illuminate the curious connection between environmental conditions and online behavior but also leave us with a lingering sense of wonder. The enigmatic convergence of air quality and digital lexicon in the quaint town of Pittsfield invites the speculative gaze of researchers to ponder the quirky intricacies of human expression in the digital age. As we sift through the numerical tapestries of our data, we are reminded that the allure of research often lies in the unanticipated twists and turns that populate the intellectual landscape, guiding us toward novel avenues of inquiry.
The unearthing of this correlation may prompt a curious ponderer to ask: Are the residents of Pittsfield, buoyed by the ethereal embrace of pristine air, whimsically inclined toward the contemplation of 'smol'ness? Or are there yet-undiscovered forces at play, teasing the boundaries between the tangible and the digital? Regardless of the underlying whimsical musings that may percolate, our results beckon us to approach the juncture of air quality and internet expressions with a sense of marvel and inquisitiveness.

-------

Your response should start like this:

[[CONCLUSION]]



ChatGPT:

[[CONCLUSION]]

In conclusion, our whimsical foray into the enigmatic correlation between air quality in Pittsfield, Massachusetts and Google searches for 'smol' has left us with a sense of both amusement and intellectual intrigue. The robust correlation coefficient of 0.8854494 and the persuasive p-value less than 0.01 have succeeded in piquing our curiosity, compelling us to contemplate the whimsical dance of causation and the 'smol'ness of it all. As we reflect on the lighthearted nature of our investigation, we cannot help but chuckle at the thought that the residents of Pittsfield may have collectively turned to pondering about 'smol'ness in the digital realm amidst the embrace of pristine air, perhaps finding solace in the diminutive merriment of online exchanges.

However, despite the allure of this peculiar correlation, we must recognize the limitations of our fanciful exploration. While our findings ignite a spark of curiosity, it's essential to acknowledge that correlation does not imply causation, and the underlying mechanisms that link air quality to internet search behavior remain shrouded in whimsical uncertainty. As we bid adieu to this comical escapade, we find ourselves inclined to echo Sherlock Holmes' words – "The game is afoot!" – not in pursuit of unravelling this mystery, but to emphasize that no further scholarly inquiries are needed in this particular realm of research. The 'smol' jest has been well and truly explored, leaving us with a tantalizing trail of statistical delight and a whimsical fondness for the unanticipated intersections of human expression and environmental influence.


Model: gpt-3.5-turbo-1106
System prompt: You provide fictional names of academic journals. You say nothing else, and you do not chat with the user.
User prompt: Please provide the name of a funny but plausible sounding research, scientific, or academic journal that would publish a paper entited A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'. Respond only with the name of a fictional academic journal.

ChatGPT:

The Journal of Atmospheric Connections and Curious Queries


Model: gpt-3.5-turbo-1106
System prompt: You provide fictional names of nonprofit research organizations and associations. You say nothing else, and you do not chat with the user.
User prompt: Please provide the name of a funny but plausible sounding nonprofit, academic, or scientific association or organization that would sponsor research that resulted in a paper entited A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol'. Respond only with the name of a fictional organization.

ChatGPT:

The Center for Environmental and Behavioral Research (CEBR)


Model: gpt-3.5-turbo-1106
System prompt: You provide potential search keywords separated by commas. You say nothing else, and you do not chat with the user.
User prompt: Please provide potential search keywords for a paper titled A Breath of Fresh Air: The Perplexing Link Between Air Quality in Pittsfield, Massachusetts and Google Searches for 'smol' with an abstract of In this whimsical paper, we delve into the intriguing correlation between air quality in Pittsfield, Massachusetts and Google searches for the internet slang term 'smol'. Through an analysis of data from the Environmental Protection Agency's air quality reports and Google Trends, our research team uncovered a surprising correlation coefficient of 0.8854494 and a p-value less than 0.01 for the period spanning from 2004 to 2023. Our findings presented a conundrum, leaving us to ponder whether the residents of Pittsfield have collectively turned to the digital world to quip about smallness due to the quality of their air. Our investigation also uncovered the need for further exploration of how environmental factors may influence internet search behavior, igniting a spark of curiosity in the realm of interdisciplinary research.

ChatGPT:

Pittsfield, Massachusetts, air quality, Google searches, smol, correlation, analysis, Environmental Protection Agency, Google Trends, correlation coefficient, p-value, residents, digital world, smallness, environmental factors, internet search behavior, interdisciplinary research

*There is a bunch of Python happening behind the scenes to turn this prompt sequence into a PDF.





Data details

Air quality in Pittsfield, Massachusetts
Detailed data title: Percentage of days with good air quality in Pittsfield, MA
Source: Environmental Protection Agency
See what else correlates with Air quality in Pittsfield, Massachusetts

Google searches for 'smol'
Detailed data title: Relative volume of Google searches for 'smol' (Worldwide, without quotes)
Source: Google Trends
Additional Info: Relative search volume (not absolute numbers)

See what else correlates with Google searches for 'smol'

Correlation r = 0.8854494 (Pearson correlation coefficient)
Correlation is a measure of how much the variables move together. If it is 0.99, when one goes up the other goes up. If it is 0.02, the connection is very weak or non-existent. If it is -0.99, then when one goes up the other goes down. If it is 1.00, you probably messed up your correlation function.

r² = 0.7840207 (Coefficient of determination)
This means 78.4% of the change in the one variable (i.e., Google searches for 'smol') is predictable based on the change in the other (i.e., Air quality in Pittsfield, Massachusetts) over the 20 years from 2004 through 2023.

p < 0.01, which is statistically significant (Null hypothesis significance test)
The p-value is 2.1E-7 (0.00000021135481444109936).
The p-value is a measure of how probable it is that we would randomly find a result this extreme. More specifically the p-value is a measure of how probable it is that we would randomly find a result this extreme if we had only tested one pair of variables one time.

But I am a p-villain. I absolutely did not test only one pair of variables one time. I correlated hundreds of millions of pairs of variables. I threw boatloads of data into an industrial-sized blender to find this correlation.

Who is going to stop me? p-value reporting doesn't require me to report how many calculations I had to go through in order to find a low p-value!
On average, you will find a correlation as strong as 0.89 in 2.1E-5% of random cases. Said differently, if you correlated 4,731,380 random variables with the same 19 degrees of freedom, you would randomly expect to find a correlation as strong as this one.

Note on "random variables": You don't actually need 4 million variables to find a correlation like this one. I don't have that many variables in my database. You can also correlate variables that are not independent. I do this a lot.

p-value calculations are useful for understanding the probability of a result happening by chance. They are most useful when used to highlight the risk of a fluke outcome. For example, if you calculate a p-value of 0.30, the risk that the result is a fluke is high. It is good to know that! But there are lots of ways to get a p-value of less than 0.01, as evidenced by this project.

In this particular case, the values are so extreme as to be meaningless. That's why no one reports p-values with specificity after they drop below 0.01.

Just to be clear: I'm being completely transparent about the calculations. There is no math trickery. This is just how statistics shakes out when you calculate hundreds of millions of random correlations.

Note on "degrees of freedom": Degrees of freedom is a measure of how many free components we are testing. In this case it is 19 because we have two variables measured over a period of 20 years. It's just the number of years minus ( the number of variables minus one ), which in this case simplifies to the number of years minus one.
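
The "2.1E-5% of random cases" and "4,731,380" figures are two ways of writing the same p-value. A minimal sketch of that arithmetic, using only the p-value printed on this page (the variable names are just for illustration):

```python
# p-value reported on this page for r = 0.8854494 over 20 yearly data points
p_value = 2.1135481444109936e-7

# The same number expressed as a percentage of random cases
percent_of_cases = p_value * 100

# Roughly how many random tries you would need before expecting one match this strong
tries_needed = round(1 / p_value)

print(f"{percent_of_cases:.1e}% of random cases")
print(f"About 1 in {tries_needed:,} random tries")
```

This only restates the reported p-value. It does not account for the non-independent variables mentioned above, which make a match easier to find than the raw number suggests.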

[ 0.73, 0.95 ] 95% correlation confidence interval (using the Fisher z-transformation)
The confidence interval is an estimate of the range of the value of the correlation coefficient, using the correlation itself as an input. The values are meant to be the low and high end of the correlation coefficient with 95% confidence.

This one is a bit more complicated than the other calculations, but I include it because many people have been pushing for confidence intervals instead of p-value calculations (for example: NEJM). However, if you are dredging data, you can reliably find yourself in the 5%. That's my goal!
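
For the curious, here is a sketch of the textbook Fisher z-transformation interval, using only Python's standard library. The function name is made up for this example, and it is an assumption that the site computes the interval exactly this way, although the rounded result matches the interval shown above.

```python
import math
from statistics import NormalDist

def fisher_confidence_interval(r, n, confidence=0.95):
    # Transform r into z-space, where its sampling distribution is roughly normal
    z = math.atanh(r)

    # Standard error of z for n paired observations
    standard_error = 1 / math.sqrt(n - 3)

    # Two-sided critical value from the normal distribution (about 1.96 for 95%)
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Build the interval in z-space, then transform back to a correlation
    low = math.tanh(z - critical * standard_error)
    high = math.tanh(z + critical * standard_error)
    return low, high

# r and the number of years are taken from this page
low, high = fisher_confidence_interval(0.8854494, 20)
print(f"[ {low:.2f}, {high:.2f} ]")
```

The interval is not symmetric around r because the transformation stretches values near 1, which is why the upper bound sits closer to the correlation than the lower bound does.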


All values for the years included above:

Year | Air quality in Pittsfield, Massachusetts (Good air quality) | Google searches for 'smol' (Rel. search volume)
2004 | 68.6636 | 12.25
2005 | 56.5217 | 13.5
2006 | 62.4658 | 10.6667
2007 | 65.8402 | 12.1667
2008 | 69.6721 | 13.0833
2009 | 66.4773 | 18.5
2010 | 76.4368 | 19.5
2011 | 65.5647 | 20.3333
2012 | 65.8046 | 21.3333
2013 | 62.3626 | 26.5
2014 | 66.4835 | 32.0833
2015 | 81.6667 | 46
2016 | 87.3239 | 74.8333
2017 | 86.1272 | 85.1667
2018 | 77.8082 | 86.3333
2019 | 87.9121 | 84.25
2020 | 89.3443 | 89.0833
2021 | 82.7397 | 83.1667
2022 | 88.7671 | 75.4167
2023 | 82.3204 | 74

If I were being very sneaky, I could trim years from the beginning or end of the datasets to increase the correlation on some pairs of variables. I don't do that because there are already plenty of correlations in my database without monkeying with the years.

Still, sometimes one of the variables has more years of data available than the other. This page only shows the overlapping years. To see all the years, click on "See what else correlates with..." link above.




Why this works

  1. Data dredging: I have 25,237 variables in my database. I compare all these variables against each other to find ones that randomly match up. That's 636,906,169 correlation calculations! This is called “data dredging.” Instead of starting with a hypothesis and testing it, I instead abused the data to see what correlations shake out. It’s a dangerous way to go about analysis, because any sufficiently large dataset will yield strong correlations completely at random.
  2. Lack of causal connection: There is probably no direct connection between these variables, despite what the AI says above. This is exacerbated by the fact that I used "Years" as the base variable. Lots of things happen in a year that are not related to each other! Most studies would use something like "one person" instead of "one year" to be the "thing" studied.
    (Because these pages are automatically generated, it's possible that the two variables you are viewing are in fact causally related. I take steps to prevent the obvious ones from showing on the site (I don't let data about the weather in one city correlate with the weather in a neighboring city, for example), but sometimes they still pop up. If they are related, cool! You found a loophole.)
  3. Observations not independent: For many variables, sequential years are not independent of each other. If a population of people is continuously doing something every day, there is no reason to think they would suddenly change how they are doing that thing on January 1. A simple p-value calculation does not take this into account, so mathematically it appears less probable than it really is.
    (Personally I don't find any p-value calculation to be 'simple,' but you know what I mean.)
  4. Y-axis doesn't start at zero: I truncated the Y-axes of the graph above. I also used a line graph, which makes the visual connection stand out more than it deserves. Mathematically what I showed is true, but it is intentionally misleading. Below is the same chart but with both Y-axes starting at zero.
    (Nothing against line graphs. They are great at telling a story when you have linear data! But visually it is deceptive because the only data is at the points on the graph, not the lines on the graph. In between each point, the data could have been doing anything. Like going for a random walk by itself!)
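
To get a feel for points 1 and 3, here is a small simulation sketch. It is not the code behind this site: the 200 series, the random seed, and all function names are invented for illustration. It compares every pair of made-up 20-point series and reports the strongest correlation found, once for random walks (where each year depends on the last) and once for fully independent noise.

```python
import math
import random

def pearson(xs, ys):
    # Plain Pearson correlation coefficient for two equal-length lists
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def random_walk(length, rng):
    # Each value is the previous value plus a random step, so "years" are not independent
    position = 0.0
    walk = []
    for _ in range(length):
        position += rng.gauss(0, 1)
        walk.append(position)
    return walk

def strongest_pair(series_list):
    # Dredge: compare every series against every other and keep the largest |r|
    best = 0.0
    for i in range(len(series_list)):
        for j in range(i + 1, len(series_list)):
            best = max(best, abs(pearson(series_list[i], series_list[j])))
    return best

rng = random.Random(0)  # fixed seed so the run is repeatable
walks = [random_walk(20, rng) for _ in range(200)]
noise = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(200)]

strongest_walk = strongest_pair(walks)
strongest_noise = strongest_pair(noise)

print("Pairs compared per set:", 200 * 199 // 2)
print("Strongest |r| among random walks:", round(strongest_walk, 4))
print("Strongest |r| among independent noise:", round(strongest_noise, 4))
```

None of these series has anything to do with any other, yet comparing enough of them turns up a pair that looks tightly linked, and trending series make that easier still.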




Try it yourself

You can calculate the values on this page on your own! Try running the Python code to see the calculation results.

Step 1: Download and install Python on your computer.

Step 2: Open a plaintext editor like Notepad and paste the code below into it.

Step 3: Save the file as "calculate_correlation.py" in a place you will remember, like your desktop. Copy the file location to your clipboard. On Windows, you can right-click the file and click "Properties," and then copy what comes after "Location:" As an example, on my computer the location is "C:\Users\tyler\Desktop"

Step 4: Open a command line window. For example, by pressing start and typing "cmd" and then pressing enter.

Step 5: Install the required modules by typing "pip install numpy", then pressing enter, then typing "pip install scipy", then pressing enter.

Step 6: Navigate to the location where you saved the Python file by using the "cd" command. For example, I would type "cd C:\Users\tyler\Desktop" and push enter.

Step 7: Run the Python script by typing "python calculate_correlation.py"

If you run into any issues, I suggest asking ChatGPT to walk you through installing Python and running the code below on your system. Try this question:

"Walk me through installing Python on my computer to run a script that uses scipy and numpy. Go step-by-step and ask me to confirm before moving on. Start by asking me questions about my operating system so that you know how to proceed. Assume I want the simplest installation with the latest version of Python and that I do not currently have any of the necessary elements installed. Remember to only give me one step per response and confirm I have done it before proceeding."


# These modules make it easier to perform the calculation
import numpy as np
from scipy import stats

# We'll define a function that we can call to return the correlation calculations
def calculate_correlation(array1, array2):

    # Calculate Pearson correlation coefficient and p-value
    correlation, p_value = stats.pearsonr(array1, array2)

    # Calculate R-squared as the square of the correlation coefficient
    r_squared = correlation**2

    return correlation, r_squared, p_value

# These are the arrays for the variables shown on this page, but you can modify them to be any two sets of numbers
array_1 = np.array([68.6636,56.5217,62.4658,65.8402,69.6721,66.4773,76.4368,65.5647,65.8046,62.3626,66.4835,81.6667,87.3239,86.1272,77.8082,87.9121,89.3443,82.7397,88.7671,82.3204,])
array_2 = np.array([12.25,13.5,10.6667,12.1667,13.0833,18.5,19.5,20.3333,21.3333,26.5,32.0833,46,74.8333,85.1667,86.3333,84.25,89.0833,83.1667,75.4167,74,])
array_1_name = "Air quality in Pittsfield, Massachusetts"
array_2_name = "Google searches for 'smol'"

# Perform the calculation
print(f"Calculating the correlation between {array_1_name} and {array_2_name}...")
correlation, r_squared, p_value = calculate_correlation(array_1, array_2)

# Print the results
print("Correlation Coefficient:", correlation)
print("R-squared:", r_squared)
print("P-value:", p_value)
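If you are curious what `stats.pearsonr` is doing under the hood, the coefficient itself can be reproduced with nothing but the standard library. This is a minimal sketch of the textbook Pearson formula, not the scipy implementation. The function name `pearson_r` and the list names are my own, and it skips the p-value entirely.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    return cov / math.sqrt(var_x * var_y)


# Same values as array_1 and array_2 in the script above
air_quality = [68.6636, 56.5217, 62.4658, 65.8402, 69.6721, 66.4773, 76.4368,
               65.5647, 65.8046, 62.3626, 66.4835, 81.6667, 87.3239, 86.1272,
               77.8082, 87.9121, 89.3443, 82.7397, 88.7671, 82.3204]
smol_searches = [12.25, 13.5, 10.6667, 12.1667, 13.0833, 18.5, 19.5, 20.3333,
                 21.3333, 26.5, 32.0833, 46, 74.8333, 85.1667, 86.3333, 84.25,
                 89.0833, 83.1667, 75.4167, 74]

r = pearson_r(air_quality, smol_searches)
print("Correlation Coefficient:", r)
print("R-squared:", r ** 2)
```

The two printed values should agree with the scipy output to within floating-point rounding. If they do not, check that the numbers were copied over correctly.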



Reusable content

You may re-use the images on this page for any purpose, even commercial purposes, without asking for permission. The only requirement is that you attribute Tyler Vigen. Attribution can take many different forms. If you leave the "tylervigen.com" link in the image, that satisfies it just fine. If you remove it and move it to a footnote, that's fine too. You can also just write "Charts courtesy of Tyler Vigen" at the bottom of an article.

You do not need to attribute "the spurious correlations website," and you don't even need to link here if you don't want to. I don't gain anything from pageviews. There are no ads on this site, there is nothing for sale, and I am not for hire.

For the record, I am just one person. Tyler Vigen, he/him/his. I do have degrees, but they should not go after my name unless you want to annoy my wife. If that is your goal, then go ahead and cite me as "Tyler Vigen, A.A. A.A.S. B.A. J.D." Otherwise it is just "Tyler Vigen."

When spoken, my last name is pronounced "vegan," like I don't eat meat.

Full license details.
For more on re-use permissions, or to get a signed release form, see tylervigen.com/permission.





Correlation ID: 5728 · Black Variable ID: 21072 · Red Variable ID: 1370
about · subscribe · emailme@tylervigen.com · twitter

CC BY 4.0