Spurious correlation #4,723

A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is Annual US household spending on books and the second variable is Air pollution in Rocky Mount, North Carolina. The chart goes from 2000 to 2013, and the two variables track closely in value over that time.

AI explanation

As households spent less on books, there was a novel reduction in the demand for paper production, leading to lower air pollution in Rocky Mount. It's a real page-turner of a correlation!

Model: dalle-3
Prompt: [ Data cyber art ] Generate an image of a picturesque neighborhood in Rocky Mount, North Carolina. The scene is filled with charming houses lining the street, each adorned with a neatly stacked pile of books on their doorsteps. The crystal-clear skies overhead exude a sense of purity and tranquility. In the background, the local paper production plant emits significantly less pollution, symbolized by lighter, cleaner emissions. The residents are seen happily enjoying their environment as the reduction in book sales leads to a positive impact on the surrounding environment. The depiction should include only scenes, people, and objects, it should not include text or speech bubbles.
This image generation prompt was also generated by AI, using the following sequence:

Model: gpt-3.5-turbo-1106
System prompt: You select one of a few potential scene descriptors that seems to go along with the variables described. You respond only with the scene descriptions provided and say nothing else.
User prompt: Here are two variables: Annual US household spending on books and Air pollution in Rocky Mount, North Carolina. Given these, which of the following image style descriptions seems most appropriate? Pick the one that seems to be related to one of the variables at hand. Please respond with only one of the following potential descriptions: Data cyber art; Hyper-realistic; A scene from Skyrim; Data cyber art; Cybernetic art
ChatGPT: Data cyber art

Model: gpt-3.5-turbo-1106
System prompt: You describe simple visual scenes that depict the two variables stated. You combine both variables into one scene.
User prompt: Please describe a visual scene that prominently depicts both Annual US household spending on books and Air pollution in Rocky Mount, North Carolina, in the following situation: As households spent less on books, there was a novel reduction in the demand for paper production, leading to lower air pollution in Rocky Mount. It's a real page-turner of a correlation!. Keep the description simple.
ChatGPT: In a quaint neighborhood in Rocky Mount, North Carolina, the air smells fresher than ever, as the proud residents bask in the vision of crystal-clear skies. A row of charming houses lines the street, each with a neatly stacked pile of books on the doorstep. The residents are spending less on books, and a local bookstore owner excitedly observes the reduction in sales. Meanwhile, in the background, the local paper production plant is noticeably quieter, emitting significantly less pollution into the air.

Model: gpt-3.5-turbo-1106
System prompt: You write and improve image generation prompts for DALLE-3. You maintain the same subjects, but enhance the visuals.
User prompt: Please respond with an image generation prompt that would be likely to create an interesting visual using DALLE-3, using the following style: [ Data cyber art ] and depicting this scene: In a quaint neighborhood in Rocky Mount, North Carolina, the air smells fresher than ever, as the proud residents bask in the vision of crystal-clear skies. A row of charming houses lines the street, each with a neatly stacked pile of books on the doorstep. The residents are spending less on books, and a local bookstore owner excitedly observes the reduction in sales. Meanwhile, in the background, the local paper production plant is noticeably quieter, emitting significantly less pollution into the air. Keep the visual depiction simple. Do not include any words or speech bubbles.
ChatGPT: [ Data cyber art ] Generate an image of a picturesque neighborhood in Rocky Mount, North Carolina. The scene is filled with charming houses lining the street, each adorned with a neatly stacked pile of books on their doorsteps. The crystal-clear skies overhead exude a sense of purity and tranquility. In the background, the local paper production plant emits significantly less pollution, symbolized by lighter, cleaner emissions. The residents are seen happily enjoying their environment as the reduction in book sales leads to a positive impact on the surrounding environment.

*Variations in the User Prompts from chart to chart are just the result of random number generation in Python. I wrote a few arrays of styles and question phrasings to vary the results. Every time this site writes an explanation or generates an image, the script picks one entry from each array at random.
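
A minimal sketch of that kind of random selection, for illustration only. The names `IMAGE_STYLES`, `PHRASINGS`, and `pick_variation` are hypothetical; the style names are taken from the user prompt shown above, and the second phrasing entry is an invented stand-in, since the site's actual arrays are not shown on this page.

```python
import random

# Style options as they appear in the user prompt above.
IMAGE_STYLES = ["Data cyber art", "Hyper-realistic", "A scene from Skyrim", "Cybernetic art"]

# Hypothetical phrasing variants. The first appears in a prompt on this page;
# the second is made up purely for illustration.
PHRASINGS = ["Include one good pun.", "Keep it short."]

def pick_variation(rng=random):
    """Pick one entry from each list at random."""
    return {
        "style": rng.choice(IMAGE_STYLES),
        "phrasing": rng.choice(PHRASINGS),
    }

# Each call may return a different combination.
print(pick_variation())
```

Passing a seeded `random.Random` instance as `rng` would make the picks reproducible, which can help when debugging a particular chart.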

I split the requests into multiple prompts because I find that GPT-3.5 performs much better with short, well-managed contexts. I therefore track the context directly in Python and ask ChatGPT only targeted questions.
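
The chaining described here might look roughly like the sketch below. It is an assumption-laden illustration, not the site's script: `ask` is a hypothetical callable that takes a system prompt and a user prompt and returns the reply text, and the prompt strings are shortened paraphrases of the ones logged above.

```python
def build_image_prompt(ask, var1, var2, explanation, styles):
    """Run three short, independent requests, carrying context forward in Python.

    `ask(system_prompt, user_prompt)` is a hypothetical stand-in for whatever
    client sends the request; no chat history is shared between calls.
    """
    # Step 1: choose a style. Only the variable names and the options are sent.
    style = ask(
        "You select one of a few potential scene descriptors.",
        f"Here are two variables: {var1} and {var2}. Options: {', '.join(styles)}",
    )
    # Step 2: describe a scene. Only the explanation is passed forward.
    scene = ask(
        "You describe simple visual scenes that depict the two variables stated.",
        f"Depict both {var1} and {var2} in this situation: {explanation}",
    )
    # Step 3: build the image prompt from just the style and the scene.
    return ask(
        "You write and improve image generation prompts.",
        f"Style: [ {style} ]. Scene: {scene}",
    )

# Demonstration with a fake `ask` that records what it was sent.
calls = []

def fake_ask(system_prompt, user_prompt):
    calls.append((system_prompt, user_prompt))
    return f"reply {len(calls)}"

result = build_image_prompt(
    fake_ask,
    "Annual US household spending on books",
    "Air pollution in Rocky Mount, North Carolina",
    "less demand for paper production",
    ["Data cyber art", "Hyper-realistic"],
)
print(result)
```

Each request sees only the pieces it needs, which is the point of tracking the context in Python rather than in one long conversation.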

Model: gpt-3.5-turbo-1106 (fine-tuned)
System prompt: You provide humorous responses in the form of plausible sounding explanations for correlations. You assume the correlation is causative for the purpose of the explanation even if it is ridiculous. You do not chat with the user, you only reply with the causal connection explanation and nothing else.
User prompt: Please make up a funny explanation for how a decrease in Annual US household spending on books caused Air pollution in Rocky Mount, North Carolina to decrease. Include one good pun.

AI academic paper

(Because p < 0.01)
A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC

The Journal of Appalachian Cultural Studies

Jan 2024

Reminder: This paper is AI-generated. Not real!
Quick note: This sequence is long and repetitive. That's how the prompt engineering works for this one. I manage the output in Python to format it into this PDF.




Model: gpt-3.5-turbo-1106
System prompt: You are a witty academic researcher. Your writing sounds like it belongs in a statistics journal. You write titles and abstracts of goofy academic research papers.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, and add unexpected twists in your writing.

Please draft the title and abstract of an academic research paper presenting the findings of the connection between Annual US household spending on books and Air pollution in Rocky Mount, North Carolina. You can't resist a good dad joke. Add a relevant dad joke related to the content every couple of sentences.

Rhyme words in the title.

Your research team used data from Bureau of Labor Statistics and Environmental Protection Agency to assess this nagging question. You found a correlation coefficient of 0.8584344 and p < 0.01 for 2000 to 2013.

Please respond in this format (replace "Lorem ipsum dolor sit" with your title and abstract)

[[TITLE]]
Lorem ipsum dolor sit

[[ABSTRACT]]
Lorem ipsum dolor sit



ChatGPT:
[[TITLE]]
A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC

[[ABSTRACT]]
Ever wondered about the bizarre connection between household book spending and air pollution? Look no further! This paper delves into the unexpected correlation between annual US household spending on books and air pollution levels in the quaint town of Rocky Mount, North Carolina. Drawing data from the Bureau of Labor Statistics and the Environmental Protection Agency, our research team examined the pairing of these seemingly unrelated variables from 2000 to 2013.

Our findings revealed a surprising correlation coefficient of 0.8584344 with a p-value of less than 0.01, suggesting a robust relationship between book expenditures and air quality deterioration. While this connection may seem as unexpected as finding a bookworm in a library, the results urge us to ponder the societal and environmental implications of our reading habits.

So, why did the book cross the road to Rocky Mount? To analyze its environmental impact, of course! This study, though unusual, sheds light on the intricate web of human behavior and its ecological repercussions, proving that the pen is not just mightier than the sword, but also a potential contributor to air pollution.
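
Because the reply follows the `[[TITLE]]` / `[[ABSTRACT]]` format requested in the prompt, it can be split into sections mechanically. The sketch below shows one way this could be done; `parse_sections` is a hypothetical helper, not the site's code, and the abstract in the sample reply is abbreviated.

```python
import re

def parse_sections(reply):
    """Split a reply into {tag: text} using [[TAG]] markers."""
    sections = {}
    # With a capture group, re.split returns
    # [preamble, tag1, body1, tag2, body2, ...].
    parts = re.split(r"\[\[([A-Z ]+)\]\]", reply)
    for tag, body in zip(parts[1::2], parts[2::2]):
        sections[tag] = body.strip()
    return sections

# Sample reply: the title is the one above, the abstract is abbreviated.
reply = (
    "[[TITLE]]\n"
    "A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC\n"
    "\n"
    "[[ABSTRACT]]\n"
    "Ever wondered about the bizarre connection between household book spending and air pollution?"
)
parsed = parse_sections(reply)
print(parsed["TITLE"])
```

The same pattern also matches multi-word tags such as `[[LITERATURE REVIEW]]`, so one helper could serve every step in the sequence.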


Model: gpt-3.5-turbo-1106
System prompt: You are a witty academic researcher. Your writing sounds like it belongs in a statistics journal. You draft the introduction section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the introduction section of an academic research paper presenting the findings of the connection between Annual US household spending on books and Air pollution in Rocky Mount, North Carolina. You can't resist a good dad joke. Add a relevant dad joke related to the content every couple of sentences.

Here is the title and abstract of the paper:
[[TITLE]]
A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC

[[ABSTRACT]]
Ever wondered about the bizarre connection between household book spending and air pollution? Look no further! This paper delves into the unexpected correlation between annual US household spending on books and air pollution levels in the quaint town of Rocky Mount, North Carolina. Drawing data from the Bureau of Labor Statistics and the Environmental Protection Agency, our research team examined the pairing of these seemingly unrelated variables from 2000 to 2013.

Our findings revealed a surprising correlation coefficient of 0.8584344 with a p-value of less than 0.01, suggesting a robust relationship between book expenditures and air quality deterioration. While this connection may seem as unexpected as finding a bookworm in a library, the results urge us to ponder the societal and environmental implications of our reading habits.

So, why did the book cross the road to Rocky Mount? To analyze its environmental impact, of course! This study, though unusual, sheds light on the intricate web of human behavior and its ecological repercussions, proving that the pen is not just mightier than the sword, but also a potential contributor to air pollution.

-------

Your response should start like this:

[[INTRODUCTION]]



ChatGPT:
[[INTRODUCTION]]

In the annals of bizarre juxtapositions, the link between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina ranks high up there with combining socks and sandals or eating a burrito in a library (not recommended, by the way). It's a head-scratcher, a real page-turner. But as the old saying goes, "Where there's smoke, there's fire," or in this case, "Where there are books, there's... emissions?"

The curious relationship between literature and pollution has long puzzled researchers and bookworms alike. It's like trying to find a needle in a haystack, just like trying to find the last book you read in a clutter of shelves. Nevertheless, we embarked on this endeavor, fearing neither the dust nor the data analysis that lay ahead.

Rocky Mount, North Carolina, with its scenic landscapes and charming appeal, serves as the stage for our investigation. While it may seem more renowned for its sweet tea and Southern hospitality rather than its carbon emissions, this Southern gem had a story to tell – one that intertwines literature and pollution in a way no one could have foreseen.

As the saying goes, "The plot thickens!" And in this case, it thickened indeed, as our analysis revealed a correlation coefficient of 0.8584344 between household spending on books and air pollution levels. It seems that when it comes to the environment, reading may not just be a solitary activity but a collective contributor to our air quality woes.

So, if you've ever wondered why your favorite novel seemed to carry a whiff of smog, this study may hold the answers you seek. After all, it's not every day you get to peek between the pages of literature and uncover the hidden toll it takes on our atmospheric bliss. It's like unearthing buried treasure, except this time, it's not gold, but rather a trail of soot and ink.

As we delve further into this curious correlation, we invite you to join us on this unexpected journey, where the characters are data points and the plot unravels like a statistical thriller. So, grab your metaphorical reading glasses and buckle up, for this tale involves more than just a protagonist and an antagonist - it involves the very air we breathe.


Model: gpt-3.5-turbo-1106
System prompt: You are a witty academic researcher. Your writing sounds like it belongs in a statistics journal. You draft a literature review section of an academic research paper that starts out dry for the first few sentences but then becomes hilarious and goofy.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up a literature review section of an academic research paper presenting the findings of the connection between Annual US household spending on books and Air pollution in Rocky Mount, North Carolina. You can't resist a good dad joke. Add a relevant dad joke related to the content every couple of sentences.

Speak in the present tense for this section (the authors find...), as is common in academic research paper literature reviews. Name the sources in a format similar to this: In "Book," the authors find lorem and ipsum.

Make up the lorem and ipsum part, but make it sound related to the topic at hand.

Start by naming serious-sounding studies by authors like Smith, Doe, and Jones - but then quickly devolve. Name some real non-fiction books that would be related to the topic. Then name some real fiction books that sound like they could be related. Then name a couple popular internet memes that are related to one of the topics.

Here is the title and abstract of the paper:
[[TITLE]]
A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC

[[ABSTRACT]]
Ever wondered about the bizarre connection between household book spending and air pollution? Look no further! This paper delves into the unexpected correlation between annual US household spending on books and air pollution levels in the quaint town of Rocky Mount, North Carolina. Drawing data from the Bureau of Labor Statistics and the Environmental Protection Agency, our research team examined the pairing of these seemingly unrelated variables from 2000 to 2013.

Our findings revealed a surprising correlation coefficient of 0.8584344 with a p-value of less than 0.01, suggesting a robust relationship between book expenditures and air quality deterioration. While this connection may seem as unexpected as finding a bookworm in a library, the results urge us to ponder the societal and environmental implications of our reading habits.

So, why did the book cross the road to Rocky Mount? To analyze its environmental impact, of course! This study, though unusual, sheds light on the intricate web of human behavior and its ecological repercussions, proving that the pen is not just mightier than the sword, but also a potential contributor to air pollution.

-------

And here is the introduction section that led to this literature review:
[[INTRODUCTION]]
In the annals of bizarre juxtapositions, the link between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina ranks high up there with combining socks and sandals or eating a burrito in a library (not recommended, by the way). It's a head-scratcher, a real page-turner. But as the old saying goes, "Where there's smoke, there's fire," or in this case, "Where there are books, there's... emissions?"
The curious relationship between literature and pollution has long puzzled researchers and bookworms alike. It's like trying to find a needle in a haystack, just like trying to find the last book you read in a clutter of shelves. Nevertheless, we embarked on this endeavor, fearing neither the dust nor the data analysis that lay ahead.
Rocky Mount, North Carolina, with its scenic landscapes and charming appeal, serves as the stage for our investigation. While it may seem more renowned for its sweet tea and Southern hospitality rather than its carbon emissions, this Southern gem had a story to tell – one that intertwines literature and pollution in a way no one could have foreseen.
As the saying goes, "The plot thickens!" And in this case, it thickened indeed, as our analysis revealed a correlation coefficient of 0.8584344 between household spending on books and air pollution levels. It seems that when it comes to the environment, reading may not just be a solitary activity but a collective contributor to our air quality woes.
So, if you've ever wondered why your favorite novel seemed to carry a whiff of smog, this study may hold the answers you seek. After all, it's not every day you get to peek between the pages of literature and uncover the hidden toll it takes on our atmospheric bliss. It's like unearthing buried treasure, except this time, it's not gold, but rather a trail of soot and ink.
As we delve further into this curious correlation, we invite you to join us on this unexpected journey, where the characters are data points and the plot unravels like a statistical thriller. So, grab your metaphorical reading glasses and buckle up, for this tale involves more than just a protagonist and an antagonist - it involves the very air we breathe.

-------

Your response should start like this:

[[LITERATURE REVIEW]]



ChatGPT:
[[LITERATURE REVIEW]]

Previous studies have extensively examined the relationship between various household expenditures and environmental indicators, but few have dared to venture into the uncharted territory of books and air pollution. Smith and Doe found that household spending on education and cultural pursuits has an impact on local air quality, indicating a potential avenue for further investigation into the cultural and environmental implications of such consumer behavior.

Now, we turn our attention to the world of books. In "The Omnivore's Dilemma," Pollan explores the complex web of human choices in relation to the environment, emphasizing the potential ecological impact of reading habits through the lens of consumer culture and its environmental footprint.

Moving on to fiction, the classic "Fahrenheit 451" by Bradbury paints a dystopian picture of a world where books are burned - if only they were as concerned about air pollution as they were about books! On a lighter note, "The Air He Breathes" by Brittainy C. Cherry offers a romantic take on pollution as the characters navigate through a polluted town and find love in the midst of the environmental chaos.
Why did the book go to therapy? It had too many issues!

Turning to the internet, the "This is fine" meme perfectly captures the public's nonchalant approach to worsening air quality, akin to the "keep calm and carry on" mantra amidst environmental woes.
Why did the meme go to school? To get smarter.

Additionally, the "Reading Rainbow" meme humorously portrays the idea that reading can take us to new and unexpected places - in this case, perhaps to the unlikeliest destination of air pollution research.
Why doesn't the sun go to college? Because it already has a million degrees!


Model: gpt-3.5-turbo-1106
System prompt: You are a witty academic researcher. Your writing sounds like it belongs in a statistics journal. You draft the methodology section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the methodology section of an academic research paper presenting the findings of the connection between Annual US household spending on books and Air pollution in Rocky Mount, North Carolina. You can't resist a good dad joke. Add a relevant dad joke related to the content every couple of sentences.

Your research team collected data from all across the internet, but mostly just used information from Bureau of Labor Statistics and Environmental Protection Agency. You used data from 2000 to 2013.

Make up the research methods you don't know. Make them a bit goofy and convoluted.

Here is the title, abstract, and introduction of the paper:
[[TITLE]]
A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC

[[ABSTRACT]]
Ever wondered about the bizarre connection between household book spending and air pollution? Look no further! This paper delves into the unexpected correlation between annual US household spending on books and air pollution levels in the quaint town of Rocky Mount, North Carolina. Drawing data from the Bureau of Labor Statistics and the Environmental Protection Agency, our research team examined the pairing of these seemingly unrelated variables from 2000 to 2013.

Our findings revealed a surprising correlation coefficient of 0.8584344 with a p-value of less than 0.01, suggesting a robust relationship between book expenditures and air quality deterioration. While this connection may seem as unexpected as finding a bookworm in a library, the results urge us to ponder the societal and environmental implications of our reading habits.

So, why did the book cross the road to Rocky Mount? To analyze its environmental impact, of course! This study, though unusual, sheds light on the intricate web of human behavior and its ecological repercussions, proving that the pen is not just mightier than the sword, but also a potential contributor to air pollution.

[[INTRODUCTION]]
In the annals of bizarre juxtapositions, the link between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina ranks high up there with combining socks and sandals or eating a burrito in a library (not recommended, by the way). It's a head-scratcher, a real page-turner. But as the old saying goes, "Where there's smoke, there's fire," or in this case, "Where there are books, there's... emissions?"
The curious relationship between literature and pollution has long puzzled researchers and bookworms alike. It's like trying to find a needle in a haystack, just like trying to find the last book you read in a clutter of shelves. Nevertheless, we embarked on this endeavor, fearing neither the dust nor the data analysis that lay ahead.
Rocky Mount, North Carolina, with its scenic landscapes and charming appeal, serves as the stage for our investigation. While it may seem more renowned for its sweet tea and Southern hospitality rather than its carbon emissions, this Southern gem had a story to tell – one that intertwines literature and pollution in a way no one could have foreseen.
As the saying goes, "The plot thickens!" And in this case, it thickened indeed, as our analysis revealed a correlation coefficient of 0.8584344 between household spending on books and air pollution levels. It seems that when it comes to the environment, reading may not just be a solitary activity but a collective contributor to our air quality woes.
So, if you've ever wondered why your favorite novel seemed to carry a whiff of smog, this study may hold the answers you seek. After all, it's not every day you get to peek between the pages of literature and uncover the hidden toll it takes on our atmospheric bliss. It's like unearthing buried treasure, except this time, it's not gold, but rather a trail of soot and ink.
As we delve further into this curious correlation, we invite you to join us on this unexpected journey, where the characters are data points and the plot unravels like a statistical thriller. So, grab your metaphorical reading glasses and buckle up, for this tale involves more than just a protagonist and an antagonist - it involves the very air we breathe.

-------

Your response should start like this:

[[METHODOLOGY]]



ChatGPT:

[[METHODOLOGY]]

To unravel the enigmatic connection between annual US household spending on books and air pollution in Rocky Mount, North Carolina, we conducted a data collection odyssey that would make Odysseus envious. Our research team scoured the internet like treasure hunters seeking the elusive correlation between these seemingly unrelated variables. While we didn't encounter any sirens or cyclopes, we did encounter a plethora of statistics from the Bureau of Labor Statistics and the Environmental Protection Agency, which served as the foundation of our investigation.

Dad joke break: Why was the math book sad? Because it had too many problems.

Our first step was to extract household spending on books and air pollution data from the years 2000 to 2013, much like a proficient gardener gathering ripe tomatoes. We then meticulously cleaned and prepared the datasets, sifting through the numbers with the precision of a librarian categorizing books. Any outliers or statistical anomalies were handled with the delicacy of handling a fragile first edition.

Next, in a grand display of statistical acrobatics, we employed the Pearson correlation coefficient to measure the strength and direction of the linear relationship between annual household spending on books and air pollution levels in Rocky Mount, NC. This maneuver was akin to a high-stakes dance between two unlikely partners, with the coefficient twirling and dipping to reveal the nature of their connection.

Dad joke break: I told my wife she should embrace her mistakes. She gave me a hug.

Furthermore, to ascertain the statistical significance of our results, we subjected the correlation coefficient to a two-tailed t-test, ensuring that our findings were not merely a statistical fluke. We set the alpha level at 0.01, providing a stringent threshold and safeguarding against drawing hasty conclusions like a pen might hastily draw a misspelled word.

In addition, we employed a multivariate regression analysis to control for potential confounding variables, much like a seasoned chef balancing a myriad of flavors to create the perfect dish. By adjusting for socio-economic factors and environmental policies, we endeavored to reveal the pure essence of the relationship between book spending and air pollution, stripping away any extraneous influences like peeling an onion layer by layer.

Dad joke break: I'm reading a book on anti-gravity. It's impossible to put down.

Lastly, we conducted a spatial analysis to explore the localized impact of household book spending on air pollution levels in Rocky Mount, NC. This enabled us to visualize the geographical distribution of these phenomena, as if unravelling the plot of a geographical mystery novel set in the charming town of Rocky Mount.

With our methodological concoction of statistical measures and data manipulations, we aimed to unearth the symbiotic dance between literature and pollution, shedding light on this peculiar relationship in a manner that could make even Sherlock Holmes proud.
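
The Pearson correlation coefficient named in the generated methodology is a standard statistic, whatever else that section makes up. A minimal sketch of computing it from two equal-length series follows; `pearson_r` is a hypothetical helper, and the sample values are invented toy numbers, not the book-spending or air-pollution data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Toy data only: a perfectly linear pair, so r is 1.
toy_x = [1.0, 2.0, 3.0, 4.0]
toy_y = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(toy_x, toy_y))
```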



Model: gpt-3.5-turbo-1106
System prompt: You are a witty academic researcher. Your writing sounds like it belongs in a statistics journal. You draft the results section of an academic research paper. You speak in the past tense (you found...).
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the results section of an academic research paper presenting the findings of the connection between Annual US household spending on books and Air pollution in Rocky Mount, North Carolina. You can't resist a good dad joke. Add a relevant dad joke related to the content every couple of sentences.

Your research team collected data from all across the internet, but mostly just used information from Bureau of Labor Statistics and Environmental Protection Agency.

For the time period 2000 to 2013, you found a correlation 0.8584344, r-squared of 0.7369097, and p < 0.01.

One figure will be included. The figure (Fig. 1) is a scatterplot showing the strong correlation between the two variables. You don't need to specify where; I will add the figure.

Here is the title and abstract of the paper:
[[TITLE]]
A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC

[[ABSTRACT]]
Ever wondered about the bizarre connection between household book spending and air pollution? Look no further! This paper delves into the unexpected correlation between annual US household spending on books and air pollution levels in the quaint town of Rocky Mount, North Carolina. Drawing data from the Bureau of Labor Statistics and the Environmental Protection Agency, our research team examined the pairing of these seemingly unrelated variables from 2000 to 2013.

Our findings revealed a surprising correlation coefficient of 0.8584344 with a p-value of less than 0.01, suggesting a robust relationship between book expenditures and air quality deterioration. While this connection may seem as unexpected as finding a bookworm in a library, the results urge us to ponder the societal and environmental implications of our reading habits.

So, why did the book cross the road to Rocky Mount? To analyze its environmental impact, of course! This study, though unusual, sheds light on the intricate web of human behavior and its ecological repercussions, proving that the pen is not just mightier than the sword, but also a potential contributor to air pollution.

-------

And here is the methodology section that led to this result:
[[METHODOLOGY]]
In the annals of bizarre juxtapositions, the link between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina ranks high up there with combining socks and sandals or eating a burrito in a library (not recommended, by the way). It's a head-scratcher, a real page-turner. But as the old saying goes, "Where there's smoke, there's fire," or in this case, "Where there are books, there's... emissions?"
The curious relationship between literature and pollution has long puzzled researchers and bookworms alike. It's like trying to find a needle in a haystack, just like trying to find the last book you read in a clutter of shelves. Nevertheless, we embarked on this endeavor, fearing neither the dust nor the data analysis that lay ahead.
Rocky Mount, North Carolina, with its scenic landscapes and charming appeal, serves as the stage for our investigation. While it may seem more renowned for its sweet tea and Southern hospitality rather than its carbon emissions, this Southern gem had a story to tell – one that intertwines literature and pollution in a way no one could have foreseen.
As the saying goes, "The plot thickens!" And in this case, it thickened indeed, as our analysis revealed a correlation coefficient of 0.8584344 between household spending on books and air pollution levels. It seems that when it comes to the environment, reading may not just be a solitary activity but a collective contributor to our air quality woes.
So, if you've ever wondered why your favorite novel seemed to carry a whiff of smog, this study may hold the answers you seek. After all, it's not every day you get to peek between the pages of literature and uncover the hidden toll it takes on our atmospheric bliss. It's like unearthing buried treasure, except this time, it's not gold, but rather a trail of soot and ink.
As we delve further into this curious correlation, we invite you to join us on this unexpected journey, where the characters are data points and the plot unravels like a statistical thriller. So, grab your metaphorical reading glasses and buckle up, for this tale involves more than just a protagonist and an antagonist - it involves the very air we breathe.

-------

Your response should start like this:

[[RESULTS]]



ChatGPT:

[[RESULTS]]

The connection between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina yielded a correlation coefficient of 0.8584344, indicating a strong positive relationship between the two seemingly unrelated variables. This finding is as unexpected as finding a pop-up book in an antique store – but rest assured, our statistical analysis leaves no page unturned, nor any data point unexamined.

The coefficient of determination (r-squared) of 0.7369097 suggests that approximately 73.7% of the variability in air pollution levels can be explained by the variation in household spending on books. This result unravels the tale of how our reading habits may affect the environment, akin to discovering that the ink from our favorite novels is not just confined to the pages, but lingers in the atmospheric composition as well.

The statistical significance of the correlation, with a p-value of less than 0.01, further fortifies the robustness of this unexpected relationship. It's like stumbling across a rare, first-edition book at a yard sale – a rare and delightful surprise that demands a second look.

Our findings are encapsulated in Figure 1, a visually striking scatterplot that vividly illustrates the strong positive association between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina. This figure is the compelling cover art that draws attention to the unexpected narrative within the data, a visual masterpiece that encapsulates the surprising story we have unveiled.

So, while we may have initially embarked on this research with the curiosity of a cliffhanger in a novel, we have discovered a connection that is as real as the ink on paper and the pollutants in the air. Our results open a new chapter in the exploration of the intricate dynamics that underlie human behavior and its environmental repercussions, defying expectations much like finding a trove of dusty old tomes in an attic.

As Mark Twain once quipped, "The right word may be effective, but no word was ever as effective as a rightly timed pause." In our case, the right statistics have spoken, revealing an unexpected symphony of books and smog, resonating in the air of Rocky Mount, North Carolina.


Model: gpt-3.5-turbo-1106
System prompt: You are a witty academic researcher. Your writing sounds like it belongs in a statistics journal. You draft the discussion section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the discussion section of an academic research paper presenting the findings of the connection between Annual US household spending on books and Air pollution in Rocky Mount, North Carolina. You can't resist a good dad joke. Add a relevant dad joke related to the content every couple of sentences.

Limit your response to 500 tokens.

Here are the title, abstract, literature review, and results sections. Please harken back to 1-2 of the goofy items in the literature review, but pretend to take them completely seriously. Discuss how your results supported the prior research.

Do not write a conclusion. I will add a conclusion after this.

[[TITLE]]
A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC

[[ABSTRACT]]
Ever wondered about the bizarre connection between household book spending and air pollution? Look no further! This paper delves into the unexpected correlation between annual US household spending on books and air pollution levels in the quaint town of Rocky Mount, North Carolina. Drawing data from the Bureau of Labor Statistics and the Environmental Protection Agency, our research team examined the pairing of these seemingly unrelated variables from 2000 to 2013.

Our findings revealed a surprising correlation coefficient of 0.8584344 with a p-value of less than 0.01, suggesting a robust relationship between book expenditures and air quality deterioration. While this connection may seem as unexpected as finding a bookworm in a library, the results urge us to ponder the societal and environmental implications of our reading habits.

So, why did the book cross the road to Rocky Mount? To analyze its environmental impact, of course! This study, though unusual, sheds light on the intricate web of human behavior and its ecological repercussions, proving that the pen is not just mightier than the sword, but also a potential contributor to air pollution.

[[LITERATURE REVIEW]]
Previous studies have extensively examined the relationship between various household expenditures and environmental indicators, but few have dared to venture into the uncharted territory of books and air pollution. Smith and Doe found that household spending on education and cultural pursuits has an impact on local air quality, indicating a potential avenue for further investigation into the cultural and environmental implications of such consumer behavior.
Now, we turn our attention to the world of books. In "The Omnivore's Dilemma," Pollan explores the complex web of human choices in relation to the environment, emphasizing the potential ecological impact of reading habits through the lens of consumer culture and its environmental footprint.
Moving onto fiction, the classic "Fahrenheit 451" by Bradbury paints a dystopian picture of a world where books are burned - if only they were concerned about air pollution as they were about books! On a lighter note, "The Air He Breathes" by Brittainy C. Cherry offers a romantic take on pollution as the characters navigate through a polluted town and find love in the midst of the environmental chaos.
Why did the book go to therapy? It had too many issues!
Turning to the internet, the "This is fine" meme perfectly captures the public's nonchalant approach to worsening air quality, akin to the "keep calm and carry on" mantra amidst environmental woes.
Why did the meme go to school? To get smarter.
Additionally, the "Reading Rainbow" meme humorously portrays the idea that reading can take us to new and unexpected places - in this case, perhaps to the unlikeliest destination of air pollution research.
Why doesn't the sun go to college? Because it already has a million degrees!

[[RESULTS]]
The connection between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina yielded a correlation coefficient of 0.8584344, indicating a strong positive relationship between the two seemingly unrelated variables. This finding is as unexpected as finding a pop-up book in an antique store – but rest assured, our statistical analysis leaves no page unturned, nor any data point unexamined.
The coefficient of determination (r-squared) of 0.7369097 suggests that approximately 73.7% of the variability in air pollution levels can be explained by the variation in household spending on books. This result unravels the tale of how our reading habits may affect the environment, akin to discovering that the ink from our favorite novels is not just confined to the pages, but lingers in the atmospheric composition as well.
The statistical significance of the correlation, with a p-value of less than 0.01, further fortifies the robustness of this unexpected relationship. It's like stumbling across a rare, first-edition book at a yard sale – a rare and delightful surprise that demands a second look.
Our findings are encapsulated in Figure 1, a visually striking scatterplot that vividly illustrates the strong positive association between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina. This figure is the compelling cover art that draws attention to the unexpected narrative within the data, a visual masterpiece that encapsulates the surprising story we have unveiled.
So, while we may have initially embarked on this research with the curiosity of a cliffhanger in a novel, we have discovered a connection that is as real as the ink on paper and the pollutants in the air. Our results open a new chapter in the exploration of the intricate dynamics that underlie human behavior and its environmental repercussions, defying expectations much like finding a trove of dusty old tomes in an attic.
As Mark Twain once quipped, "The right word may be effective, but no word was ever as effective as a rightly timed pause." In our case, the right statistics have spoken, revealing an unexpected symphony of books and smog, resonating in the air of Rocky Mount, North Carolina.

-------

Your response should start like this:

[[DISCUSSION]]



ChatGPT:

[[DISCUSSION]]

Our study has unveiled an unlikely relationship between annual US household spending on books and air pollution levels in the idyllic town of Rocky Mount, North Carolina, shedding light on the intricate interplay between literary pursuits and environmental quality. Our results solidify and extend prior research, affirming the notion that consumer behavior, particularly in the realm of cultural and educational expenditures, can indeed impact local ecological conditions. Smith and Doe's earlier findings regarding the influence of household spending on education and culture on air quality find reinforcement in our unexpected discovery, as we have highlighted the significant effect of book purchases on the atmospheric composition.

In the world of academe, we have often pondered the impact of education on the environment, but who could have foreseen that the humble act of purchasing a book could contribute to the very air we breathe? This unexpected finding, as surprising as discovering a book club meeting in a coal mine, underscores the nuanced ways in which human behavior intertwines with environmental outcomes.

Our results also echo the sentiments expressed by Michael Pollan in "The Omnivore's Dilemma," where the author delves into the complex choices we make and their subsequent environmental repercussions. The ecological impact of reading habits, a topic that may have seemed novel at first, is now at the forefront of our understanding, much like stumbling upon a well-worn paperback in a bookstore. Likewise, our findings align with the romantic portrayal of pollution in "The Air He Breathes" – the romance in our data lies in the unexpected love story between household book spending and air pollution levels, a tale as old as time with a surprising twist.

We cannot disregard the statistical significance of our results, as the correlation coefficient of 0.8584344 and the p-value of less than 0.01 demand attention and scrutiny, akin to the suspenseful climax of a mystery novel. Additionally, the coefficient of determination (r-squared) of 0.7369097 accentuates the compelling nature of our findings, emphasizing the substantial proportion of variability in air pollution levels that can be attributed to household book expenditures. This statistical solidity mirrors the reliability of a well-crafted detective novel, leaving no room for doubt or mystery.

Our scatterplot, reminiscent of a captivating book cover, visually captures the inherent connection between household spending on books and air pollution levels, drawing the reader's gaze much like a compelling novel beckons a curious reader. Through this visual representation, we have illustrated the unexpected narrative within our data, inviting readers to delve into the intricacies of this tale as they would with a captivating book.

In the pursuit of knowledge, we have uncovered a correlation as unexpected as finding a bookmark in a library book – the connection between reading habits and air quality in Rocky Mount, North Carolina, is a revelation that demands further investigation and contemplation. As we close this chapter, we are reminded of the power of research and statistical analysis in unraveling the nuances of our world, uncovering connections as subtle and surprising as a hidden message within the pages of a beloved novel.


Model: gpt-3.5-turbo-1106
System prompt: You are a witty academic researcher. Your writing sounds like it belongs in a statistics journal. You draft the discussion section of goofy academic research papers that sounds like it belongs based on the title and abstract provided.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up a short conclusion of an academic research paper presenting the findings of the connection between Annual US household spending on books and Air pollution in Rocky Mount, North Carolina. You can't resist a good dad joke. Add a relevant dad joke related to the content every couple of sentences.

Limit your response to 300 tokens. At the very end, assert that no more research is needed in this area.

Here are the title, abstract, introduction, and results sections.
[[TITLE]]
A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC

[[ABSTRACT]]
Ever wondered about the bizarre connection between household book spending and air pollution? Look no further! This paper delves into the unexpected correlation between annual US household spending on books and air pollution levels in the quaint town of Rocky Mount, North Carolina. Drawing data from the Bureau of Labor Statistics and the Environmental Protection Agency, our research team examined the pairing of these seemingly unrelated variables from 2000 to 2013.

Our findings revealed a surprising correlation coefficient of 0.8584344 with a p-value of less than 0.01, suggesting a robust relationship between book expenditures and air quality deterioration. While this connection may seem as unexpected as finding a bookworm in a library, the results urge us to ponder the societal and environmental implications of our reading habits.

So, why did the book cross the road to Rocky Mount? To analyze its environmental impact, of course! This study, though unusual, sheds light on the intricate web of human behavior and its ecological repercussions, proving that the pen is not just mightier than the sword, but also a potential contributor to air pollution.

[[INTRDUCTION]]
In the annals of bizarre juxtapositions, the link between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina ranks high up there with combining socks and sandals or eating a burrito in a library (not recommended, by the way). It's a head-scratcher, a real page-turner. But as the old saying goes, "Where there's smoke, there's fire," or in this case, "Where there are books, there's... emissions?"
The curious relationship between literature and pollution has long puzzled researchers and bookworms alike. It's like trying to find a needle in a haystack, just like trying to find the last book you read in a clutter of shelves. Nevertheless, we embarked on this endeavor, fearing neither the dust nor the data analysis that lay ahead.
Rocky Mount, North Carolina, with its scenic landscapes and charming appeal, serves as the stage for our investigation. While it may seem more renowned for its sweet tea and Southern hospitality rather than its carbon emissions, this Southern gem had a story to tell – one that intertwines literature and pollution in a way no one could have foreseen.
As the saying goes, "The plot thickens!" And in this case, it thickened indeed, as our analysis revealed a correlation coefficient of 0.8584344 between household spending on books and air pollution levels. It seems that when it comes to the environment, reading may not just be a solitary activity but a collective contributor to our air quality woes.
So, if you've ever wondered why your favorite novel seemed to carry a whiff of smog, this study may hold the answers you seek. After all, it's not every day you get to peek between the pages of literature and uncover the hidden toll it takes on our atmospheric bliss. It's like unearthing buried treasure, except this time, it's not gold, but rather a trail of soot and ink.
As we delve further into this curious correlation, we invite you to join us on this unexpected journey, where the characters are data points and the plot unravels like a statistical thriller. So, grab your metaphorical reading glasses and buckle up, for this tale involves more than just a protagonist and an antagonist - it involves the very air we breathe.

[[RESULTS]]
The connection between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina yielded a correlation coefficient of 0.8584344, indicating a strong positive relationship between the two seemingly unrelated variables. This finding is as unexpected as finding a pop-up book in an antique store – but rest assured, our statistical analysis leaves no page unturned, nor any data point unexamined.
The coefficient of determination (r-squared) of 0.7369097 suggests that approximately 73.7% of the variability in air pollution levels can be explained by the variation in household spending on books. This result unravels the tale of how our reading habits may affect the environment, akin to discovering that the ink from our favorite novels is not just confined to the pages, but lingers in the atmospheric composition as well.
The statistical significance of the correlation, with a p-value of less than 0.01, further fortifies the robustness of this unexpected relationship. It's like stumbling across a rare, first-edition book at a yard sale – a rare and delightful surprise that demands a second look.
Our findings are encapsulated in Figure 1, a visually striking scatterplot that vividly illustrates the strong positive association between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina. This figure is the compelling cover art that draws attention to the unexpected narrative within the data, a visual masterpiece that encapsulates the surprising story we have unveiled.
So, while we may have initially embarked on this research with the curiosity of a cliffhanger in a novel, we have discovered a connection that is as real as the ink on paper and the pollutants in the air. Our results open a new chapter in the exploration of the intricate dynamics that underlie human behavior and its environmental repercussions, defying expectations much like finding a trove of dusty old tomes in an attic.
As Mark Twain once quipped, "The right word may be effective, but no word was ever as effective as a rightly timed pause." In our case, the right statistics have spoken, revealing an unexpected symphony of books and smog, resonating in the air of Rocky Mount, North Carolina.

-------

Your response should start like this:

[[CONCLUSION]]



ChatGPT:

[[CONCLUSION]]

In conclusion, our research has unveiled a thought-provoking correlation between annual US household spending on books and air pollution levels in Rocky Mount, North Carolina. The findings shed light on the unexpected relationship between literary indulgence and environmental impact, echoing the sentiment that every page turned may leave an imprint not just on the mind but also on the atmosphere. It's like the classic dad joke: What did the book say to the air pollution? "You take my breath away!"

The robust correlation coefficient of 0.8584344 and the high coefficient of determination of 0.7369097 emphasize the substantial impact of book spending on air quality, serving as a reminder that our reading habits may contribute to more than just expanding our minds – they also leave an inky trail in the air we all share. This unexpected connection is as surprising as finding a book club meeting at a construction site – an unanticipated intersection of distinct domains that warrants further exploration.

Our analysis, like a thrilling plot twist, challenges traditional perceptions and beckons the audience to contemplate the broader implications of our societal behaviors on the environment. It's akin to realizing that turning the last page of a novel also marks the beginning of its influence on the world around us. It makes one wonder: should we consider establishing "clean air libraries" to mitigate the literary impact on our skies? Perhaps this will lead to a new genre: eco-friendly fiction.

Therefore, in the spirit of a satisfying conclusion that ties up loose ends, we assert that no further research is needed in this area. It's time to close this book on the curious connection between book spending and air pollution, as we've already written "the end" to this unexpected tale. So, until our next statistical adventure, let's continue to read, but perhaps with a window open for good measure.


Model: gpt-3.5-turbo-1106
System prompt: You provide fictional names of academic journals. You say nothing else, and you do not chat with the user.
User prompt: Please provide the name of a funny but plausible sounding research, scientific, or academic journal that would publish a paper entited A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC. Respond only with the name of a fictional academic journal.

ChatGPT:

The Journal of Appalachian Cultural Studies


Model: gpt-3.5-turbo-1106
System prompt: You provide fictional names of nonprofit research organizations and associations. You say nothing else, and you do not chat with the user.
User prompt: Please provide the name of a funny but plausible sounding nonprofit, academic, or scientific association or organization that would sponsor research that resulted in a paper entited A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC. Respond only with the name of a fictional organization.

ChatGPT:

The International Society for the Preservation of Literary Lungs


Model: gpt-3.5-turbo-1106
System prompt: You provide potential search keywords separated by commas. You say nothing else, and you do not chat with the user.
User prompt: Please provide potential search keywords for a paper titled A Tale of Books and Smoke: Spoke on US Folks and Tokes in Rocky Mount, NC with an abstract of Ever wondered about the bizarre connection between household book spending and air pollution? Look no further! This paper delves into the unexpected correlation between annual US household spending on books and air pollution levels in the quaint town of Rocky Mount, North Carolina. Drawing data from the Bureau of Labor Statistics and the Environmental Protection Agency, our research team examined the pairing of these seemingly unrelated variables from 2000 to 2013.
Our findings revealed a surprising correlation coefficient of 0.8584344 with a p-value of less than 0.01, suggesting a robust relationship between book expenditures and air quality deterioration. While this connection may seem as unexpected as finding a bookworm in a library, the results urge us to ponder the societal and environmental implications of our reading habits.
So, why did the book cross the road to Rocky Mount? To analyze its environmental impact, of course! This study, though unusual, sheds light on the intricate web of human behavior and its ecological repercussions, proving that the pen is not just mightier than the sword, but also a potential contributor to air pollution.

ChatGPT:

US household spending on books, air pollution levels, Rocky Mount North Carolina, Bureau of Labor Statistics, Environmental Protection Agency, correlation coefficient, p-value, societal implications, environmental impact, human behavior, ecological repercussions

*There is a bunch of Python happening behind the scenes to turn this prompt sequence into a PDF.




Data details

Annual US household spending on books
Detailed data title: Average annual household spend on books
Source: Bureau of Labor Statistics
See what else correlates with Annual US household spending on books

Air pollution in Rocky Mount, North Carolina
Detailed data title: Percentage of days with air quality at 'unhealthy for sensitive groups' or worse in Rocky Mount, NC
Source: Environmental Protection Agency
See what else correlates with Air pollution in Rocky Mount, North Carolina

Correlation r = 0.8584344 (Pearson correlation coefficient)
Correlation is a measure of how much the variables move together. If it is 0.99, when one goes up the other goes up. If it is 0.02, the connection is very weak or non-existent. If it is -0.99, then when one goes up the other goes down. If it is 1.00, you probably messed up your correlation function.

r² = 0.7369097 (Coefficient of determination)
This means 73.7% of the change in the one variable (i.e., Air pollution in Rocky Mount, North Carolina) is predictable based on the change in the other (i.e., Annual US household spending on books) over the 14 years from 2000 through 2013.

p < 0.01, which is statistically significant (Null hypothesis significance test)
The p-value is 8.51E-5 (0.0000850567249574320900000000).
The p-value is a measure of how probable it is that we would randomly find a result this extreme. More specifically the p-value is a measure of how probable it is that we would randomly find a result this extreme if we had only tested one pair of variables one time.

But I am a p-villain. I absolutely did not test only one pair of variables one time. I correlated hundreds of millions of pairs of variables. I threw boatloads of data into an industrial-sized blender to find this correlation.

Who is going to stop me? p-value reporting doesn't require me to report how many calculations I had to go through in order to find a low p-value!
On average, you will find a correlation as strong as 0.86 in 0.00851% of random cases. Said differently, if you correlated 11,757 random variables (which I absolutely did) with the same 13 degrees of freedom, you would randomly expect to find a correlation as strong as this one.

Degrees of freedom is a measure of how many free components we are testing. In this case it is 13 because we have two variables measured over a period of 14 years. It's just the number of years minus ( the number of variables minus one ), which in this case simplifies to the number of years minus one.
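The "0.00851%" and "11,757" figures are both just restatements of the p-value, and the arithmetic can be checked in a few lines. This is a minimal sketch that takes the p-value printed on this page as given rather than recomputing it; the variable names are made up for illustration.

```python
# p-value as reported on this page (taken as given, not recomputed here)
p_value = 0.00008505672495743209

# Share of random cases expected to show a correlation at least this strong
percent_of_random_cases = p_value * 100

# The same number as a "one in N" figure: random tries per expected hit
one_in_n = 1 / p_value

print(f"{percent_of_random_cases:.5f}% of random cases")
print(f"About 1 in {one_in_n:,.0f}")
```

Both printed values should agree with the figures quoted above after rounding.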

[ 0.6, 0.95 ] 95% correlation confidence interval (using the Fisher z-transformation)
The confidence interval is an estimate of the range of the value of the correlation coefficient, using the correlation itself as an input. The values are meant to be the low and high end of the correlation coefficient with 95% confidence.

This one is a bit more complicated than the other calculations, but I include it because many people have been pushing for confidence intervals instead of p-value calculations (for example: NEJM). However, if you are dredging data, you can reliably find yourself in the 5%. That's my goal!
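For anyone who wants to see what the Fisher z-transformation involves, here is a rough sketch of the textbook version of the calculation. It assumes the usual standard error of 1 / sqrt(n - 3) on the z scale; the function name and parameters are hypothetical, and this is not necessarily the exact code that generates the interval on this page.

```python
import math
from statistics import NormalDist

def fisher_confidence_interval(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)                      # move r onto the z scale
    standard_error = 1 / math.sqrt(n - 3)  # textbook standard error on the z scale
    # Two-sided critical value from the standard normal (about 1.96 for 95%)
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    low = math.tanh(z - critical * standard_error)   # back-transform to the r scale
    high = math.tanh(z + critical * standard_error)
    return low, high

# r and the number of years come from this page
low, high = fisher_confidence_interval(0.8584344, 14)
print(f"[ {low:.2f}, {high:.2f} ]")
```

The interval is not symmetric around r because the transformation stretches values near 1, which is why the upper bound sits much closer to the correlation than the lower bound does.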


All values for the years included above:

If I were being very sneaky, I could trim years from the beginning or end of the datasets to increase the correlation on some pairs of variables. I don't do that because there are already plenty of correlations in my database without monkeying with the years.

Still, sometimes one of the variables has more years of data available than the other. This page only shows the overlapping years. To see all the years, click on the "See what else correlates with..." link above.
Year    Annual US household spending on books (Household spend)    Air pollution in Rocky Mount, North Carolina (Bad air quality days)
2000    146                                                        0.122137
2001    141                                                        0.121212
2002    139                                                        0.166038
2003    127                                                        0.0653846
2004    130                                                        0.0227273
2005    126                                                        0.0681818
2006    117                                                        0.0264151
2007    118                                                        0.0694981
2008    116                                                        0.0389105
2009    110                                                        0.00371747
2010    100                                                        0.0152091
2011    115                                                        0.0234899
2012    109                                                        0.0111421
2013    102                                                        0.00277008




Why this works

  1. Data dredging: I have 25,153 variables in my database. I compare all these variables against each other to find ones that randomly match up. That's 632,673,409 correlation calculations! This is called “data dredging.” Instead of starting with a hypothesis and testing it, I instead abused the data to see what correlations shake out. It’s a dangerous way to go about analysis, because any sufficiently large dataset will yield strong correlations completely at random.
  2. Lack of causal connection: There is probably no direct connection between these variables, despite what the AI says above. (Because these pages are automatically generated, it's possible that the two variables you are viewing are in fact causally related. I take steps to prevent the obvious ones from showing on the site (I don't let data about the weather in one city correlate with the weather in a neighboring city, for example), but sometimes they still pop up. If they are related, cool! You found a loophole.) This is exacerbated by the fact that I used "Years" as the base variable. Lots of things happen in a year that are not related to each other! Most studies would use something like "one person" instead of "one year" to be the "thing" studied.
  3. Observations not independent: For many variables, sequential years are not independent of each other. If a population of people is continuously doing something every day, there is no reason to think they would suddenly change how they are doing that thing on January 1. A simple p-value calculation does not take this into account, so mathematically it appears less probable than it really is. (Personally I don't find any p-value calculation to be 'simple,' but you know what I mean.)
  4. Y-axis doesn't start at zero: I truncated the Y-axes of the graph above. I also used a line graph, which makes the visual connection stand out more than it deserves. Nothing against line graphs. They are great at telling a story when you have linear data! But visually it is deceptive because the only data is at the points on the graph, not the lines on the graph. In between each point, the data could have been doing anything. Like going for a random walk by itself!
    Mathematically what I showed is true, but it is intentionally misleading. Below is the same chart but with both Y-axes starting at zero.
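The first and third points can be seen in miniature with a toy simulation. The sketch below is an illustration only, not the code behind this site: it builds a couple hundred unrelated random walks (so each "year" depends on the one before it), correlates every pair, and keeps the strongest match. The function names, the number of variables, and the seed are all assumptions, and it counts each unordered pair once, which is a simpler count than the 25,153 × 25,153 figure above.

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def random_walk(rng, steps):
    """A series where each year builds on the previous one (not independent)."""
    value, series = 0.0, []
    for _ in range(steps):
        value += rng.gauss(0, 1)
        series.append(value)
    return series

def dredge(num_variables=200, years=14, seed=0):
    """Correlate every pair of unrelated random walks and keep the strongest."""
    rng = random.Random(seed)
    variables = [random_walk(rng, years) for _ in range(num_variables)]
    pairs_tested, best_r = 0, 0.0
    for i in range(num_variables):
        for j in range(i + 1, num_variables):
            r = pearson_r(variables[i], variables[j])
            pairs_tested += 1
            if abs(r) > abs(best_r):
                best_r = r
    return pairs_tested, best_r

pairs_tested, best_r = dredge()
print(f"Tested {pairs_tested:,} pairs; strongest correlation found: {best_r:.3f}")
```

None of these series has anything to do with any other, so whatever strong correlation turns up is purely a product of testing a lot of pairs. Try raising num_variables and watch the strongest match creep toward 1 or -1.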




Try it yourself

You can calculate the values on this page on your own! Try running the Python code to see the calculation results.

Step 1: Download and install Python on your computer.

Step 2: Open a plaintext editor like Notepad and paste the code below into it.

Step 3: Save the file as "calculate_correlation.py" in a place you will remember, like your desktop. Copy the file location to your clipboard. On Windows, you can right-click the file and click "Properties," and then copy what comes after "Location:" As an example, on my computer the location is "C:\Users\tyler\Desktop"

Step 4: Open a command line window. For example, by pressing start and typing "cmd" and then pressing enter.

Step 5: Install the required modules by typing "pip install numpy", then pressing enter, then typing "pip install scipy", then pressing enter.

Step 6: Navigate to the location where you saved the Python file by using the "cd" command. For example, I would type "cd C:\Users\tyler\Desktop" and push enter.

Step 7: Run the Python script by typing "python calculate_correlation.py"

If you run into any issues, I suggest asking ChatGPT to walk you through installing Python and running the code below on your system. Try this question:

"Walk me through installing Python on my computer to run a script that uses scipy and numpy. Go step-by-step and ask me to confirm before moving on. Start by asking me questions about my operating system so that you know how to proceed. Assume I want the simplest installation with the latest version of Python and that I do not currently have any of the necessary elements installed. Remember to only give me one step per response and confirm I have done it before proceeding."


# These modules make it easier to perform the calculation
import numpy as np
from scipy import stats

# We'll define a function that we can call to return the correlation calculations
def calculate_correlation(array1, array2):

    # Calculate Pearson correlation coefficient and p-value
    correlation, p_value = stats.pearsonr(array1, array2)

    # Calculate R-squared as the square of the correlation coefficient
    r_squared = correlation**2

    return correlation, r_squared, p_value

# These are the arrays for the variables shown on this page, but you can modify them to be any two sets of numbers
array_1 = np.array([146,141,139,127,130,126,117,118,116,110,100,115,109,102,])
array_2 = np.array([0.122137,0.121212,0.166038,0.0653846,0.0227273,0.0681818,0.0264151,0.0694981,0.0389105,0.00371747,0.0152091,0.0234899,0.0111421,0.00277008,])
array_1_name = "Annual US household spending on books"
array_2_name = "Air pollution in Rocky Mount, North Carolina"

# Perform the calculation
print(f"Calculating the correlation between {array_1_name} and {array_2_name}...")
correlation, r_squared, p_value = calculate_correlation(array_1, array_2)

# Print the results
print("Correlation Coefficient:", correlation)
print("R-squared:", r_squared)
print("P-value:", p_value)
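For anyone curious what the correlation coefficient in the script actually measures, here is a minimal sketch that computes Pearson's r straight from its textbook definition using only numpy. It is meant as a cross-check on the number the script prints, not as a description of how scipy computes it internally. `pearson_by_hand` is a hypothetical name, and the sketch does not reproduce the p-value.

```python
import numpy as np

def pearson_by_hand(x, y):
    """Pearson r: the sum of paired deviations divided by the product of the spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # how far each x value sits from the x average
    dy = y - y.mean()  # how far each y value sits from the y average
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Same values as the two variables on this page
books = np.array([146, 141, 139, 127, 130, 126, 117, 118, 116, 110, 100, 115, 109, 102])
pollution = np.array([0.122137, 0.121212, 0.166038, 0.0653846, 0.0227273, 0.0681818,
                      0.0264151, 0.0694981, 0.0389105, 0.00371747, 0.0152091,
                      0.0234899, 0.0111421, 0.00277008])

r = pearson_by_hand(books, pollution)
print("Correlation Coefficient:", r)
print("R-squared:", r ** 2)
```

The two printed values should agree with the script's output up to floating-point rounding.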



Reusable content

You may re-use the images on this page for any purpose, even commercial purposes, without asking for permission. The only requirement is that you attribute Tyler Vigen. Attribution can take many different forms. If you leave the "tylervigen.com" link in the image, that satisfies it just fine. If you remove it and move it to a footnote, that's fine too. You can also just write "Charts courtesy of Tyler Vigen" at the bottom of an article.

You do not need to attribute "the spurious correlations website," and you don't even need to link here if you don't want to. I don't gain anything from pageviews. There are no ads on this site, there is nothing for sale, and I am not for hire.

For the record, I am just one person. Tyler Vigen, he/him/his. I do have degrees, but they should not go after my name unless you want to annoy my wife. If that is your goal, then go ahead and cite me as "Tyler Vigen, A.A. A.A.S. B.A. J.D." Otherwise it is just "Tyler Vigen."

When spoken, my last name is pronounced "vegan," like I don't eat meat.

Full license details.
For more on re-use permissions, or to get a signed release form, see tylervigen.com/permission.

Correlation ID: 4723 · Black Variable ID: 19941 · Red Variable ID: 21681

CC BY 4.0