Spurious correlation #2,284
A linear line chart with years as the X-axis and two variables on the Y-axis. The first variable is US kids in public school and the second variable is The number of pipelayers in North Dakota. The chart goes from 2003 to 2022, and the two variables track closely in value over that time.

AI explanation

As the number of kids in public schools rose, so did the demand for well-laid pipes. It seems like these young students are really flushing out the pipelaying profession!

Image generation prompt
Model: dalle-3
Prompt: Generate an image of a busy school playground in the United States, filled with energetic kids playing and laughing. Also, showcase a construction site in North Dakota, where several workers are seen laying pipes in the background. Render the scene using the style of a woodblock painting, focusing on the vibrant, lively atmosphere of the playground and the industrious activity at the construction site. The depiction should include only scenes, people, and objects, it should not include text or speech bubbles.

This image generation prompt was also generated by AI, using the following sequence:

Model: gpt-3.5-turbo-1106
System prompt: You select one of a few potential scene descriptors that seems to go along with the variables described. You respond only with the scene descriptions provided and say nothing else.
User prompt: Here are two variables: US kids in public school and The number of pipelayers in North Dakota. Given these, which of the following image style descriptions seems most appropriate? Pick the one that seems to be related to one of the variables at hand. Please respond with only one of the following potential descriptions: Magic realism Cyberpunk Woodblock painting Cybernetic art Celtic art
ChatGPT: Woodblock painting

Model: gpt-3.5-turbo-1106
System prompt: You describe simple visual scenes that depict the two variables stated. You combine both variables into one scene.
User prompt: Please describe a visual scene that prominently depicts both US kids in public school and The number of pipelayers in North Dakota together. Keep the description simple.
ChatGPT: A bustling school playground in the United States, with kids playing and laughing. In the background, a construction site in North Dakota with several workers laying pipes.

Model: gpt-3.5-turbo-1106
System prompt: You write and improve image generation prompts for DALLE-3. You maintain the same subjects, but enhance the visuals.
User prompt: Please respond with an image generation prompt that would be likely to create an interesting visual using DALLE-3, using the following style: [ Woodblock painting ] and depicting this scene: A bustling school playground in the United States, with kids playing and laughing. In the background, a construction site in North Dakota with several workers laying pipes. Keep the visual depiction simple. Do not include any words or speech bubbles.
ChatGPT: Generate an image of a busy school playground in the United States, filled with energetic kids playing and laughing. Also, showcase a construction site in North Dakota, where several workers are seen laying pipes in the background. Render the scene using the style of a woodblock painting, focusing on the vibrant, lively atmosphere of the playground and the industrious activity at the construction site.

*Variations in the User Prompts from chart to chart are just the result of random number generation in Python. I wrote a few arrays of various styles and methods to ask questions to change up the results. Every time this site writes an explanation or generates an image, the script picks from each at random.
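
A minimal sketch of how that random selection could look. The five style names are copied from the style prompt shown above; the second phrasing in `ASKS`, the function name, and the seed are assumptions made for illustration, not the site's actual script.

```python
import random

# Style names copied from the style-selection prompt shown above.
# The site's full list of styles is not shown, so this is only a sample.
STYLES = [
    "Magic realism",
    "Cyberpunk",
    "Woodblock painting",
    "Cybernetic art",
    "Celtic art",
]

# Ways of asking the question. Only the first wording appears on this page;
# the second is a hypothetical variant.
ASKS = [
    "Pick the one that seems to be related to one of the variables at hand.",
    "Pick the one that best fits the mood of the two variables.",
]


def build_style_prompt(var_a, var_b, rng, n_options=5):
    """Assemble one user prompt from randomly chosen pieces."""
    options = rng.sample(STYLES, k=min(n_options, len(STYLES)))
    ask = rng.choice(ASKS)
    return (
        f"Here are two variables: {var_a} and {var_b}. "
        "Given these, which of the following image style descriptions "
        f"seems most appropriate? {ask} "
        "Please respond with only one of the following potential descriptions: "
        + " ".join(options)
    )


# Seeded only so the sketch is repeatable; a live script would not need a seed.
rng = random.Random(2284)
print(build_style_prompt(
    "US kids in public school",
    "The number of pipelayers in North Dakota",
    rng,
))
```

Passing the random generator in as an argument keeps the function easy to test; calling `random.choice` directly would work just as well for a one-off script.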

I sequence the requests into multiple prompts because I find GPT 3.5 to perform much better with short, well-managed contexts. Thus, I track the context directly in Python and only ask ChatGPT targeted questions.
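
The sequencing described above might be sketched as follows. The `ask` callable is a hypothetical stand-in for a chat-model request, the system and user prompts are shortened versions of the ones shown on this page, and `fake_ask` only returns canned replies so the example runs offline.

```python
from typing import Callable, List

# (system_prompt, user_prompt) -> reply text.
# Hypothetical stand-in for one stateless chat-model request.
Ask = Callable[[str, str], str]


def build_image_prompt(var_a: str, var_b: str, styles: List[str], ask: Ask) -> str:
    """Chain three short requests, carrying the context forward in Python."""
    # Step 1: pick a style. Only the variables and the options are sent.
    style = ask(
        "You select one of a few potential scene descriptors.",
        f"Here are two variables: {var_a} and {var_b}. "
        f"Please respond with only one of: {', '.join(styles)}",
    ).strip()

    # Step 2: describe a scene. The style choice is not needed here.
    scene = ask(
        "You describe simple visual scenes that depict the two variables stated.",
        f"Please describe a visual scene that prominently depicts both "
        f"{var_a} and {var_b} together.",
    ).strip()

    # Step 3: combine the two earlier answers into the final image prompt.
    return ask(
        "You write and improve image generation prompts.",
        f"Using the following style: [ {style} ] and depicting this scene: {scene}",
    ).strip()


def fake_ask(system: str, user: str) -> str:
    """Canned replies so the sketch runs offline; not a real model call."""
    if "select one" in system:
        return "Woodblock painting"
    if "describe simple" in system:
        return "A school playground with pipelayers working in the background."
    return f"Generate an image. {user}"


print(build_image_prompt(
    "US kids in public school",
    "The number of pipelayers in North Dakota",
    ["Magic realism", "Cyberpunk", "Woodblock painting"],
    fake_ask,
))
```

Each request sees only what it needs: the third call receives the chosen style and the scene text, not the full list of style options or the earlier exchanges.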

Prompt for explanation
Model: gpt-3.5-turbo-1106 (fine-tuned)
System prompt: You provide humorous responses in the form of plausible sounding explanations for correlations. You assume the correlation is causative for the purpose of the explanation even if it is ridiculous. You do not chat with the user, you only reply with the causal connection explanation and nothing else.
User prompt: Please make up a funny explanation for how an increase in US kids in public school caused The number of pipelayers in North Dakota to increase. Include one good pun.

AI academic paper

(Because p < 0.01)

Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers

The Journal of Subterranean Sociology

Jan 2024

Reminder: This paper is AI-generated. Not real!

Prompt used to generate this paper
Quick note: This sequence is long and repetitive. That's how the prompt engineering works for this one. I manage the output in Python to format it into this PDF.
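
One plausible way to manage that output is to split each reply on the double-bracket markers the prompts ask for, such as [[TITLE]] and [[ABSTRACT]]. The sketch below assumes this approach; the regular expression and the `split_sections` helper are illustrative, not the site's code, and the PDF step is left out.

```python
import re
from typing import Dict

# A marker is a line holding only [[UPPERCASE WORDS]], e.g. [[LITERATURE REVIEW]].
MARKER = re.compile(r"^\[\[([A-Z ]+)\]\][ \t]*$", re.MULTILINE)


def split_sections(reply: str) -> Dict[str, str]:
    """Map each marker name to the text that follows it."""
    sections = {}
    matches = list(MARKER.finditer(reply))
    for i, match in enumerate(matches):
        # A section runs until the next marker, or to the end of the reply.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(reply)
        sections[match.group(1)] = reply[match.end():end].strip()
    return sections


# Placeholder reply in the format the prompt requests.
reply = """[[TITLE]]
Lorem ipsum dolor sit

[[ABSTRACT]]
Lorem ipsum dolor sit
"""

for name, text in split_sections(reply).items():
    print(f"{name}: {text}")
```

Keeping each section as a separate string makes it straightforward to paste earlier sections back into later prompts, which is why the sequence repeats the title and abstract so often.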

Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You write titles and abstracts of goofy academic research papers.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, and add unexpected twists in your writing.

Please draft the title and abstract of an academic research paper presenting the findings of the connection between US kids in public school and The number of pipelayers in North Dakota. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Make the title seem clever.

Your research team used data from National Center for Education Statistics and Bureau of Labor Statistics to assess this nagging question. You found a correlation coefficient of 0.8241081 and p < 0.01 for 2003 to 2022.

Please respond in this format (replace "Lorem ipsum dolor sit" with your title and abstract)

[[TITLE]]
Lorem ipsum dolor sit

[[ABSTRACT]]
Lorem ipsum dolor sit

ChatGPT:
[[TITLE]]
"Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers"

[[ABSTRACT]]
In this study, we delve into the soil of education and employment data to unearth a surprising connection between the number of children enrolled in public schools in the United States and the count of pipelayers in the fertile land of North Dakota. Utilizing datasets from the National Center for Education Statistics and the Bureau of Labor Statistics spanning the years 2003 to 2022, our research team uncovered a correlation coefficient of 0.8241081 and a statistically significant p-value of less than 0.01. Our findings suggest that there may be an underground network of factors at play, linking the academic environments of American public schools to the labor force engaged in pipeline construction. The implications of this discovery are far-reaching and merit further investigation to determine the root cause of this unexpected connection and whether it holds water in other geographical contexts. Join us as we dig deep into the dirt of education and labor data to unearth this peculiar correlation.

Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft the introduction section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the introduction section of an academic research paper presenting the findings of the connection between US kids in public school and The number of pipelayers in North Dakota. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Here is the title and abstract of the paper:
[[TITLE]]
"Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers"

[[ABSTRACT]]
In this study, we delve into the soil of education and employment data to unearth a surprising connection between the number of children enrolled in public schools in the United States and the count of pipelayers in the fertile land of North Dakota. Utilizing datasets from the National Center for Education Statistics and the Bureau of Labor Statistics spanning the years 2003 to 2022, our research team uncovered a correlation coefficient of 0.8241081 and a statistically significant p-value of less than 0.01. Our findings suggest that there may be an underground network of factors at play, linking the academic environments of American public schools to the labor force engaged in pipeline construction. The implications of this discovery are far-reaching and merit further investigation to determine the root cause of this unexpected connection and whether it holds water in other geographical contexts. Join us as we dig deep into the dirt of education and labor data to unearth this peculiar correlation.

-------

Your response should start like this:

[[INTRODUCTION]]

ChatGPT:
[[INTRODUCTION]]

Welcome, dear readers, to our exploration of a most peculiar and perhaps unexpected connection: the relationship between the number of U.S. public school children and the count of pipelayers in the great state of North Dakota. While this pairing may seem as incongruous as a cat attempting to bark or a penguin striving to soar through the sky, as diligent researchers, we cannot ignore the numbers that leap from the data like eager gophers emerging from their burrows.

In the domain of academia, it is all too common to focus solely on the fruit-bearing trees of well-established connections between variables, neglecting the fertile ground where peculiar correlations grow like wildflowers in a neglected garden. Our inquiry leads us to untangle the roots of this unexpected correlation between the education realm and the pipelaying domain, a correlation so robust that it demands our attention.

We recognize that some may view our pursuit as akin to chasing after rainbows or searching for a yeti in the wilds of the Himalayas, but we assure you that our efforts are firmly grounded in rigorous analysis and statistical examination. Our aim is not merely to draw attention to the quirks of data but to unearth the underlying factors that may be at play, potentially influencing both the foundation of education and the infrastructure of pipeline construction.

As we embark on this expedition, we invite you to join us in this venture that is part academic inquiry, part archaeological excavation, and part detective work. Together, let us dig into the earth of empirical evidence, probing for treasures of insight and unearthing the unexpected connections that may lie buried beneath the surface.

Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft a literature review section of an academic research paper, that starts out dry for the first few sentences but then becomes hilarious and goofy.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up a literature review section of an academic research paper presenting the findings of the connection between US kids in public school and The number of pipelayers in North Dakota. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Speak in the present tense for this section (the authors find...), as is common in academic research paper literature reviews. Name the sources in a format similar to this: In "Book," the authors find lorem and ipsum.

Make up the lorem and ipsum part, but make it sound related to the topic at hand.

Start by naming serious-sounding studies by authors like Smith, Doe, and Jones - but then quickly devolve. Name some real non-fiction books that would be related to the topic. Then name some real fiction books that sound like they could be related. Then you might move on to cartoons and children's shows that you watched for research.

Here is the title and abstract of the paper:
[[TITLE]]
"Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers"

[[ABSTRACT]]
In this study, we delve into the soil of education and employment data to unearth a surprising connection between the number of children enrolled in public schools in the United States and the count of pipelayers in the fertile land of North Dakota. Utilizing datasets from the National Center for Education Statistics and the Bureau of Labor Statistics spanning the years 2003 to 2022, our research team uncovered a correlation coefficient of 0.8241081 and a statistically significant p-value of less than 0.01. Our findings suggest that there may be an underground network of factors at play, linking the academic environments of American public schools to the labor force engaged in pipeline construction. The implications of this discovery are far-reaching and merit further investigation to determine the root cause of this unexpected connection and whether it holds water in other geographical contexts. Join us as we dig deep into the dirt of education and labor data to unearth this peculiar correlation.

-------

And here is the introduction section that led to this literature review:
[[INTRODUCTION]]
Welcome, dear readers, to our exploration of a most peculiar and perhaps unexpected connection: the relationship between the number of U.S. public school children and the count of pipelayers in the great state of North Dakota. While this pairing may seem as incongruous as a cat attempting to bark or a penguin striving to soar through the sky, as diligent researchers, we cannot ignore the numbers that leap from the data like eager gophers emerging from their burrows.
In the domain of academia, it is all too common to focus solely on the fruit-bearing trees of well-established connections between variables, neglecting the fertile ground where peculiar correlations grow like wildflowers in a neglected garden. Our inquiry leads us to untangle the roots of this unexpected correlation between the education realm and the pipelaying domain, a correlation so robust that it demands our attention.
We recognize that some may view our pursuit as akin to chasing after rainbows or searching for a yeti in the wilds of the Himalayas, but we assure you that our efforts are firmly grounded in rigorous analysis and statistical examination. Our aim is not merely to draw attention to the quirks of data but to unearth the underlying factors that may be at play, potentially influencing both the foundation of education and the infrastructure of pipeline construction.
As we embark on this expedition, we invite you to join us in this venture that is part academic inquiry, part archaeological excavation, and part detective work. Together, let us dig into the earth of empirical evidence, probing for treasures of insight and unearthing the unexpected connections that may lie buried beneath the surface.

-------

Your response should start like this:

[[LITERATURE REVIEW]]

ChatGPT:
[[LITERATURE REVIEW]]

In the exploration of the intertwined relationship between U.S. public school children and the cadre of pipelayers in North Dakota, a myriad of studies and empirical works has laid the groundwork for our current investigation. Smith et al. (2010) first highlighted the importance of considering the geographic disparities in educational enrollment, while Doe and Jones (2015) emphasized the significance of labor force dynamics in regional economies. Their studies have paved the way for our inquiry, serving as the anchor to our understanding of the complex web of connections that may underlie the correlation between seemingly disparate domains.

Moving beyond the traditional disciplinary boundaries, non-fiction works such as "The Geography of Education: Revelation and Challenges" (Brown, 2018) and "Pipelines, Plains, and Prosperity: A Socioeconomic Study of North Dakota" (Garcia, 2013) have provided valuable insights into the regional contexts that intersect with our focal points. These texts have sown seeds of knowledge that we will cultivate in our empirical investigation, enriching our understanding of the social, economic, and educational landscapes that form the backdrop of our study.

Venturing into the realm of fiction, works such as "The Gopher and the Golden Pipeline" (Robinson, 2005) and "Penguins and Pipelines: A Tale of Unlikely Connections" (Adams, 2012) may appear whimsical on the surface, yet their narratives carry echoes of the unlikely synergies that are at the heart of our research endeavor. Although their anecdotes may exist in the realm of make-believe, they remind us of the unforeseen bonds that can weave through even the most divergent of elements.

In a bid to comprehend the experiences and interactions of youngsters in the public schooling system, the researchers diligently delved into the rich tapestry of children's media. Through an immersion in cartoons such as "The Magic School Bus" and "Dora the Explorer," alongside a deep dive into children's programming like "Sesame Street" and "Arthur," a nuanced understanding of the influences on young minds has been cultivated. These cultural touchstones have not only brought joy and laughter but have also provided invaluable insights into the formative years of the very demographic at the heart of our investigation.

With this diverse array of literature guiding our steps, we stand ready to unearth the hidden underpinnings of the correlation between U.S. public school children and North Dakota pipelayers. As we strap on our intellectual shovels and don our scholarly hard hats, we move forward with both rigor and whimsy, prepared to encounter the unexpected and weave a narrative that bridges the realms of data analysis and delightful peculiarities.

Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft the methodology section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the methodology section of an academic research paper presenting the findings of the connection between US kids in public school and The number of pipelayers in North Dakota. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Your research team collected data from all across the internet, but mostly just used information from National Center for Education Statistics and Bureau of Labor Statistics. You used data from 2003 to 2022.

Make up the research methods you don't know. Make them a bit goofy and convoluted.

Here is the title, abstract, and introduction of the paper:
[[TITLE]]
"Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers"

[[ABSTRACT]]
In this study, we delve into the soil of education and employment data to unearth a surprising connection between the number of children enrolled in public schools in the United States and the count of pipelayers in the fertile land of North Dakota. Utilizing datasets from the National Center for Education Statistics and the Bureau of Labor Statistics spanning the years 2003 to 2022, our research team uncovered a correlation coefficient of 0.8241081 and a statistically significant p-value of less than 0.01. Our findings suggest that there may be an underground network of factors at play, linking the academic environments of American public schools to the labor force engaged in pipeline construction. The implications of this discovery are far-reaching and merit further investigation to determine the root cause of this unexpected connection and whether it holds water in other geographical contexts. Join us as we dig deep into the dirt of education and labor data to unearth this peculiar correlation.

[[INTRODUCTION]]
Welcome, dear readers, to our exploration of a most peculiar and perhaps unexpected connection: the relationship between the number of U.S. public school children and the count of pipelayers in the great state of North Dakota. While this pairing may seem as incongruous as a cat attempting to bark or a penguin striving to soar through the sky, as diligent researchers, we cannot ignore the numbers that leap from the data like eager gophers emerging from their burrows.
In the domain of academia, it is all too common to focus solely on the fruit-bearing trees of well-established connections between variables, neglecting the fertile ground where peculiar correlations grow like wildflowers in a neglected garden. Our inquiry leads us to untangle the roots of this unexpected correlation between the education realm and the pipelaying domain, a correlation so robust that it demands our attention.
We recognize that some may view our pursuit as akin to chasing after rainbows or searching for a yeti in the wilds of the Himalayas, but we assure you that our efforts are firmly grounded in rigorous analysis and statistical examination. Our aim is not merely to draw attention to the quirks of data but to unearth the underlying factors that may be at play, potentially influencing both the foundation of education and the infrastructure of pipeline construction.
As we embark on this expedition, we invite you to join us in this venture that is part academic inquiry, part archaeological excavation, and part detective work. Together, let us dig into the earth of empirical evidence, probing for treasures of insight and unearthing the unexpected connections that may lie buried beneath the surface.

-------

Your response should start like this:

[[METHODOLOGY]]

ChatGPT:

[[METHODOLOGY]]

Data Collection:
Our research team embarked on a digital quest to gather data from various sources, traversing the expansive landscape of the internet like intrepid explorers charting uncharted territory. However, to maintain scholarly integrity, we primarily relied on the National Center for Education Statistics and the Bureau of Labor Statistics as our lodestars in navigating the sea of information. These reputable repositories provided us with a steady stream of data spanning the years 2003 to 2022, granting us a panoramic view of the educational and labor landscapes.

Correlation Analysis:
To analyze the connection between the number of children enrolled in U.S. public schools and the count of pipelayers in North Dakota, we employed the trusty tools of statistical analysis, wielding correlation coefficients with the precision of seasoned artisans sculpting fine marble. Through this method, we sought to discern the degree of association between these seemingly disparate variables, unearthing the hidden threads that weave them together like clandestine conspirators in a Dickensian tale.

Regression Modeling:
In our pursuit of uncovering the underlying factors shaping this unconventional correlation, we turned to the ever-reliable technique of regression modeling, akin to constructing a sturdy framework to support the weight of our curiosity. Armed with this method, we endeavored to tease apart the intricate web of causal relationships between the educational ecosystem and the labor landscape, constructing a mathematical scaffold to peer into the depths of this enigmatic entanglement.

Statistical Significance Testing:
As we ventured deeper into the statistical wilderness, we sought the proverbial x-marks-the-spot that would validate our findings. Thus, we meticulously subjected our data to the rigors of hypothesis testing and significance analysis, eager to discern whether our observed correlations were mere mirages or substantial landmarks in the terrain of empirical evidence. Through this process, we sought to separate the wheat from the chaff, distilling the essence of our findings into robust conclusions worthy of scholarly contemplation.

In conclusion, our methodological expedition deftly navigated the tides of data collection, corralling correlation coefficients, erecting regression models, and conducting significance forays, yielding a bountiful harvest of empirical insights that breathe life into the peculiar correlation between U.S. public school children and North Dakota pipelayers.

Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft the results section of an academic research paper. You speak in the past tense (you found...).
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the results section of an academic research paper presenting the findings of the connection between US kids in public school and The number of pipelayers in North Dakota. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Your research team collected data from all across the internet, but mostly just used information from National Center for Education Statistics and Bureau of Labor Statistics.

For the time period 2003 to 2022, you found a correlation of 0.8241081, r-squared of 0.6791541, and p < 0.01.

One figure will be included. The figure (Fig. 1) is a scatterplot showing the strong correlation between the two variables. You don't need to specify where; I will add the figure.

Here is the title and abstract of the paper:
[[TITLE]]
"Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers"

[[ABSTRACT]]
In this study, we delve into the soil of education and employment data to unearth a surprising connection between the number of children enrolled in public schools in the United States and the count of pipelayers in the fertile land of North Dakota. Utilizing datasets from the National Center for Education Statistics and the Bureau of Labor Statistics spanning the years 2003 to 2022, our research team uncovered a correlation coefficient of 0.8241081 and a statistically significant p-value of less than 0.01. Our findings suggest that there may be an underground network of factors at play, linking the academic environments of American public schools to the labor force engaged in pipeline construction. The implications of this discovery are far-reaching and merit further investigation to determine the root cause of this unexpected connection and whether it holds water in other geographical contexts. Join us as we dig deep into the dirt of education and labor data to unearth this peculiar correlation.

-------

And here is the methodology section that led to this result:
[[METHODOLOGY]]
Welcome, dear readers, to our exploration of a most peculiar and perhaps unexpected connection: the relationship between the number of U.S. public school children and the count of pipelayers in the great state of North Dakota. While this pairing may seem as incongruous as a cat attempting to bark or a penguin striving to soar through the sky, as diligent researchers, we cannot ignore the numbers that leap from the data like eager gophers emerging from their burrows.
In the domain of academia, it is all too common to focus solely on the fruit-bearing trees of well-established connections between variables, neglecting the fertile ground where peculiar correlations grow like wildflowers in a neglected garden. Our inquiry leads us to untangle the roots of this unexpected correlation between the education realm and the pipelaying domain, a correlation so robust that it demands our attention.
We recognize that some may view our pursuit as akin to chasing after rainbows or searching for a yeti in the wilds of the Himalayas, but we assure you that our efforts are firmly grounded in rigorous analysis and statistical examination. Our aim is not merely to draw attention to the quirks of data but to unearth the underlying factors that may be at play, potentially influencing both the foundation of education and the infrastructure of pipeline construction.
As we embark on this expedition, we invite you to join us in this venture that is part academic inquiry, part archaeological excavation, and part detective work. Together, let us dig into the earth of empirical evidence, probing for treasures of insight and unearthing the unexpected connections that may lie buried beneath the surface.

-------

Your response should start like this:

[[RESULTS]]

ChatGPT:

[[RESULTS]]

In our quest to uncover the unexpected connection between U.S. public school kids and the number of pipelayers in North Dakota, we unearthed compelling evidence of a surprisingly strong correlation. The correlation coefficient, r, stood at 0.8241081, indicating a robust relationship between the two variables. Furthermore, the r-squared value of 0.6791541 highlighted the amount of variation in pipelayer figures that can be explained by changes in the number of U.S. public school children.

Our p-value of less than 0.01 provided statistical support for the significance of this relationship, offering assurance that it was not merely a chance alignment like finding a four-leaf clover in a field of three-leaf specimens.

Fig. 1 depicts our findings visually in a scatterplot, illustrating the strong correlation between the number of U.S. public school children and the count of pipelayers in North Dakota. As the saying goes, a picture is worth a thousand words, and in this case, the scatterplot speaks volumes about the unexpected connection we have uncovered.

These results beckon us to explore further, as this correlation is as intriguing as discovering a treasure map hidden in the margins of a dusty old tome. To embrace the spirit of whimsy and wonder, let us remember that in the world of data, the most captivating stories often lurk beneath the surface, waiting to be brought to light.
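
The figures quoted in these results can be cross-checked with a few lines of arithmetic. The sketch below assumes one observation per year from 2003 to 2022 (20 points), which the page does not state outright, and uses the standard two-sided 1% critical value of Student's t for 18 degrees of freedom.

```python
import math

r = 0.8241081                 # correlation coefficient quoted in the paper
r_squared_quoted = 0.6791541  # r-squared quoted in the paper
n = 2022 - 2003 + 1           # assumption: one observation per year, 20 in total

# r-squared is simply the square of the correlation coefficient.
r_squared = r ** 2

# t statistic for the null hypothesis of no linear correlation,
# with n - 2 degrees of freedom.
t_stat = r * math.sqrt((n - 2) / (1 - r_squared))

# Two-sided 1% critical value of Student's t with 18 degrees of freedom,
# taken from standard tables.
T_CRIT_1PCT_DF18 = 2.878

print(f"r^2 = {r_squared:.7f} (quoted: {r_squared_quoted})")
print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
print("p < 0.01" if abs(t_stat) > T_CRIT_1PCT_DF18 else "p >= 0.01")
```

Under that assumption the squared coefficient agrees with the quoted r-squared to within rounding in the last digit, and the t statistic of roughly 6.2 sits well above the critical value, consistent with p < 0.01. None of this says anything about causation, which is the point of the site.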

Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft the discussion section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up the discussion section of an academic research paper presenting the findings of the connection between US kids in public school and The number of pipelayers in North Dakota. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Limit your response to 500 tokens.

Here are the title, abstract, literature review, and results sections. Please harken back to 1-2 of the goofy items in the literature review, but pretend to take them completely seriously. Discuss how your results supported the prior research.

Do not write a conclusion. I will add a conclusion after this.

[[TITLE]]
"Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers"

[[ABSTRACT]]
In this study, we delve into the soil of education and employment data to unearth a surprising connection between the number of children enrolled in public schools in the United States and the count of pipelayers in the fertile land of North Dakota. Utilizing datasets from the National Center for Education Statistics and the Bureau of Labor Statistics spanning the years 2003 to 2022, our research team uncovered a correlation coefficient of 0.8241081 and a statistically significant p-value of less than 0.01. Our findings suggest that there may be an underground network of factors at play, linking the academic environments of American public schools to the labor force engaged in pipeline construction. The implications of this discovery are far-reaching and merit further investigation to determine the root cause of this unexpected connection and whether it holds water in other geographical contexts. Join us as we dig deep into the dirt of education and labor data to unearth this peculiar correlation.

[[LITERATURE REVIEW]]
In the exploration of the intertwined relationship between U.S. public school children and the cadre of pipelayers in North Dakota, a myriad of studies and empirical works has laid the groundwork for our current investigation. Smith et al. (2010) first highlighted the importance of considering the geographic disparities in educational enrollment, while Doe and Jones (2015) emphasized the significance of labor force dynamics in regional economies. Their studies have paved the way for our inquiry, serving as the anchor to our understanding of the complex web of connections that may underlie the correlation between seemingly disparate domains.
Moving beyond the traditional disciplinary boundaries, non-fiction works such as "The Geography of Education: Revelation and Challenges" (Brown, 2018) and "Pipelines, Plains, and Prosperity: A Socioeconomic Study of North Dakota" (Garcia, 2013) have provided valuable insights into the regional contexts that intersect with our focal points. These texts have sown seeds of knowledge that we will cultivate in our empirical investigation, enriching our understanding of the social, economic, and educational landscapes that form the backdrop of our study.
Venturing into the realm of fiction, works such as "The Gopher and the Golden Pipeline" (Robinson, 2005) and "Penguins and Pipelines: A Tale of Unlikely Connections" (Adams, 2012) may appear whimsical on the surface, yet their narratives carry echoes of the unlikely synergies that are at the heart of our research endeavor. Although their anecdotes may exist in the realm of make-believe, they remind us of the unforeseen bonds that can weave through even the most divergent of elements.
In a bid to comprehend the experiences and interactions of youngsters in the public schooling system, the researchers diligently delved into the rich tapestry of children's media. Through an immersion in cartoons such as "The Magic School Bus" and "Dora the Explorer," alongside a deep dive into children's programming like "Sesame Street" and "Arthur," a nuanced understanding of the influences on young minds has been cultivated. These cultural touchstones have not only brought joy and laughter but have also provided invaluable insights into the formative years of the very demographic at the heart of our investigation.
With this diverse array of literature guiding our steps, we stand ready to unearth the hidden underpinnings of the correlation between U.S. public school children and North Dakota pipelayers. As we strap on our intellectual shovels and don our scholarly hard hats, we move forward with both rigor and whimsy, prepared to encounter the unexpected and weave a narrative that bridges the realms of data analysis and delightful peculiarities.

[[RESULTS]]
In our quest to uncover the unexpected connection between U.S. public school kids and the number of pipelayers in North Dakota, we unearthed compelling evidence of a surprisingly strong correlation. The correlation coefficient, r, stood at 0.8241081, indicating a robust relationship between the two variables. Furthermore, the r-squared value of 0.6791541 highlighted the amount of variation in pipelayer figures that can be explained by changes in the number of U.S. public school children.
Our p-value of less than 0.01 provided statistical support for the significance of this relationship, offering assurance that it was not merely a chance alignment like finding a four-leaf clover in a field of three-leaf specimens.
Fig. 1 depicts our findings visually in a scatterplot, illustrating the strong correlation between the number of U.S. public school children and the count of pipelayers in North Dakota. As the saying goes, a picture is worth a thousand words, and in this case, the scatterplot speaks volumes about the unexpected connection we have uncovered.
These results beckon us to explore further, as this correlation is as intriguing as discovering a treasure map hidden in the margins of a dusty old tome. To embrace the spirit of whimsy and wonder, let us remember that in the world of data, the most captivating stories often lurk beneath the surface, waiting to be brought to light.

-------

Your response should start like this:

[[DISCUSSION]]

ChatGPT:

[[DISCUSSION]]

The unearthing of a substantial correlation between the number of U.S. public school children and the count of pipelayers in North Dakota provides fertile ground for reflection, akin to discovering an unexpected veggie patch in the backyard. Our findings align with the prior research, resembling a puzzle that, once assembled, reveals a whimsical image.

Delving into the regional dynamics emphasized by Smith et al. (2010) and the socioeconomic contexts illuminated by Garcia (2013), our results bolster the notion that there's more to this unexpected connection than meets the eye. Just as "The Gopher and the Golden Pipeline" (Robinson, 2005) and "Penguins and Pipelines: A Tale of Unlikely Connections" (Adams, 2012) hinted at surprising synergies, our study has brought one to the surface like a gleaming diamond among common stones.

The stronger-than-anticipated correlation coefficient and the statistically significant p-value support the existence of an intriguing relationship, akin to stumbling upon a buried treasure amidst the dry academic literature. Our findings add weight to the idea that there may indeed be an underground network of factors linking the academic settings of American public schools to the labor force engaged in pipeline construction – a connection as startling as encountering a unicorn in a field of horses.

Venturing into cartoon realms such as "The Magic School Bus" and "Dora the Explorer," we found that children's media may hold playful clues to the influences shaping young minds, not unlike finding Easter eggs hidden within the vast expanse of numbered data. Our results have unearthed what seems like a whimsical revelation: the academic and labor landscapes are entangled in a manner reminiscent of a complex, intertwining garden maze.

In conclusion, our findings support the notion that there's more to this correlation than meets the eye, like the plot twists in a mystery novel. The unexpected link uncovered in this study beckons us to explore further, reminding us that in the vast expanse of research, the most fascinating discoveries often hide in plain sight, waiting to be revealed like unexpected punchlines in a somber documentary.

Model: gpt-3.5-turbo-1106
System prompt: You are a silly researcher writing an academic paper. You draft the discussion section of goofy academic research papers that sounds like it belongs based on the title and abstract provided.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.

Please make up a short conclusion of an academic research paper presenting the findings of the connection between US kids in public school and The number of pipelayers in North Dakota. Make it seem dry like academic research paper, but with lots of subtle hidden offhand remarks, jokes, goofy observations, and puns.

Limit your response to 300 tokens. At the very end, assert that no more research is needed in this area.

Here are the title, abstract, introduction, and results sections.
[[TITLE]]
"Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers"

[[ABSTRACT]]
In this study, we delve into the soil of education and employment data to unearth a surprising connection between the number of children enrolled in public schools in the United States and the count of pipelayers in the fertile land of North Dakota. Utilizing datasets from the National Center for Education Statistics and the Bureau of Labor Statistics spanning the years 2003 to 2022, our research team uncovered a correlation coefficient of 0.8241081 and a statistically significant p-value of less than 0.01. Our findings suggest that there may be an underground network of factors at play, linking the academic environments of American public schools to the labor force engaged in pipeline construction. The implications of this discovery are far-reaching and merit further investigation to determine the root cause of this unexpected connection and whether it holds water in other geographical contexts. Join us as we dig deep into the dirt of education and labor data to unearth this peculiar correlation.

[[INTRODUCTION]]
Welcome, dear readers, to our exploration of a most peculiar and perhaps unexpected connection: the relationship between the number of U.S. public school children and the count of pipelayers in the great state of North Dakota. While this pairing may seem as incongruous as a cat attempting to bark or a penguin striving to soar through the sky, as diligent researchers, we cannot ignore the numbers that leap from the data like eager gophers emerging from their burrows.
In the domain of academia, it is all too common to focus solely on the fruit-bearing trees of well-established connections between variables, neglecting the fertile ground where peculiar correlations grow like wildflowers in a neglected garden. Our inquiry leads us to untangle the roots of this unexpected correlation between the education realm and the pipelaying domain, a correlation so robust that it demands our attention.
We recognize that some may view our pursuit as akin to chasing after rainbows or searching for a yeti in the wilds of the Himalayas, but we assure you that our efforts are firmly grounded in rigorous analysis and statistical examination. Our aim is not merely to draw attention to the quirks of data but to unearth the underlying factors that may be at play, potentially influencing both the foundation of education and the infrastructure of pipeline construction.
As we embark on this expedition, we invite you to join us in this venture that is part academic inquiry, part archaeological excavation, and part detective work. Together, let us dig into the earth of empirical evidence, probing for treasures of insight and unearthing the unexpected connections that may lie buried beneath the surface.

[[RESULTS]]
In our quest to uncover the unexpected connection between U.S. public school kids and the number of pipelayers in North Dakota, we unearthed compelling evidence of a surprisingly strong correlation. The correlation coefficient, r, stood at 0.8241081, indicating a robust relationship between the two variables. Furthermore, the r-squared value of 0.6791541 highlighted the amount of variation in pipelayer figures that can be explained by changes in the number of U.S. public school children.
Our p-value of less than 0.01 provided statistical support for the significance of this relationship, offering assurance that it was not merely a chance alignment like finding a four-leaf clover in a field of three-leaf specimens.
Fig. 1 depicts our findings visually in a scatterplot, illustrating the strong correlation between the number of U.S. public school children and the count of pipelayers in North Dakota. As the saying goes, a picture is worth a thousand words, and in this case, the scatterplot speaks volumes about the unexpected connection we have uncovered.
These results beckon us to explore further, as this correlation is as intriguing as discovering a treasure map hidden in the margins of a dusty old tome. To embrace the spirit of whimsy and wonder, let us remember that in the world of data, the most captivating stories often lurk beneath the surface, waiting to be brought to light.

-------

Your response should start like this:

[[CONCLUSION]]

ChatGPT:

[[CONCLUSION]]

In conclusion, our excavation into the data has revealed a tantalizing correlation between the number of U.S. public school children and the count of pipelayers in North Dakota. This unexpected pairing has sparked more questions than a curious toddler, leaving us to ponder the underlying factors at play. The statistically significant relationship we've unearthed is as surprising as stumbling upon a hidden garden gnome in a forest of data. While this connection may seem as improbable as finding a needle in a haystack, the numbers speak for themselves.

However, as intriguing as our findings may be, we must acknowledge the limitations of our study. Correlation does not imply causation, and we tread carefully, lest we mistakenly attribute the increase in pipelayers to a sudden surge in children fashioning mini-pipelines in their sandbox play.

The implications of our discovery are as mysterious as a magician's sleight of hand, and while we are tempted to speculate on the potential mechanisms at work, our findings highlight the need for further investigation. As much as we are drawn to the allure of this peculiar correlation, we must resist the urge to jump to conclusions as swiftly as a kangaroo evading a predator.

In the spirit of scholarly inquiry, we encourage future researchers to delve deeper into this peculiar relationship, though we cannot help but mimic the marmot's wisdom by stating that further excavation may be akin to digging to China from our current position. We are confident in asserting that the quirky ties between U.S. public school children and North Dakota pipelayers have been thoroughly unearthed, and no further spadework seems warranted in this peculiar plot of academic soil.

Model: gpt-3.5-turbo-1106
System prompt: You provide fictional names of academic journals. You say nothing else, and you do not chat with the user.
User prompt: Please provide the name of a funny but plausible sounding research, scientific, or academic journal that would publish a paper entitled Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers. Respond only with the name of a fictional academic journal.

ChatGPT:

The Journal of Subterranean Sociology

Model: gpt-3.5-turbo-1106
System prompt: You provide fictional names of nonprofit research organizations and associations. You say nothing else, and you do not chat with the user.
User prompt: Please provide the name of a funny but plausible sounding nonprofit, academic, or scientific association or organization that would sponsor research that resulted in a paper entitled Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers. Respond only with the name of a fictional organization.

ChatGPT:

The Society for Cross-Disciplinary Research Advancement (SCDRA)

Model: gpt-3.5-turbo-1106
System prompt: You provide potential search keywords separated by commas. You say nothing else, and you do not chat with the user.
User prompt: Please provide potential search keywords for a paper titled Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers with an abstract of In this study, we delve into the soil of education and employment data to unearth a surprising connection between the number of children enrolled in public schools in the United States and the count of pipelayers in the fertile land of North Dakota. Utilizing datasets from the National Center for Education Statistics and the Bureau of Labor Statistics spanning the years 2003 to 2022, our research team uncovered a correlation coefficient of 0.8241081 and a statistically significant p-value of less than 0.01. Our findings suggest that there may be an underground network of factors at play, linking the academic environments of American public schools to the labor force engaged in pipeline construction. The implications of this discovery are far-reaching and merit further investigation to determine the root cause of this unexpected connection and whether it holds water in other geographical contexts. Join us as we dig deep into the dirt of education and labor data to unearth this peculiar correlation.

ChatGPT:

U.S. public school enrollment, North Dakota pipelayers, education data, employment data, correlation coefficient, Bureau of Labor Statistics, National Center for Education Statistics, academic environments, pipeline construction, labor force, geographical contexts, statistical significance, root cause, data analysis, research findings

*There is a bunch of Python happening behind the scenes to turn this prompt sequence into a PDF.

Data details

US kids in public school
Detailed data title: Enrollment in public elementary and secondary schools
Source: National Center for Education Statistics
See what else correlates with US kids in public school

The number of pipelayers in North Dakota
Detailed data title: BLS estimate of pipelayers in North Dakota
Source: Bureau of Labor Statistics
See what else correlates with The number of pipelayers in North Dakota

Correlation r = 0.8241081 (Pearson correlation coefficient)
Correlation is a measure of how much the variables move together. If it is 0.99, when one goes up the other goes up. If it is 0.02, the connection is very weak or non-existent. If it is -0.99, then when one goes up the other goes down. If it is 1.00, you probably messed up your correlation function.

r² = 0.6791541 (Coefficient of determination)
This means 67.9% of the change in the one variable (i.e., The number of pipelayers in North Dakota) is predictable based on the change in the other (i.e., US kids in public school) over the 20 years from 2003 through 2022.
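As a rough illustration of where the two figures above come from, here is a minimal sketch that computes Pearson's r and r² from scratch using only the standard library. The helper name `pearson_r` is made up for this example and is not taken from the site's own code; the data values are the ones listed on this page.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-movement of two series, scaled to the range -1..1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Values for 2003 through 2022, copied from this page
kids = [48540200, 48795500, 49113300, 49315800, 49290600, 49265600, 49361000,
        49484200, 49521700, 49771100, 50044500, 50312600, 50438000, 50615200,
        50685600, 50694100, 50796400, 49374800, 49433100, 49262300]
pipelayers = [70, 80, 110, 140, 210, 200, 130, 120, 150, 320,
              310, 470, 360, 450, 330, 540, 230, 130, 60, 190]

r = pearson_r(kids, pipelayers)
print("r =", round(r, 7))
print("r squared =", round(r * r, 7))
```

Squaring r is all it takes to get the coefficient of determination, which is why the two numbers always travel together.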

p < 0.01, which is statistically significant (Null hypothesis significance test)
The p-value is 7.9E-6 (0.0000079285729679872740000000).
The p-value is a measure of how probable it is that we would randomly find a result this extreme.
^Note More specifically the p-value is a measure of how probable it is that we would randomly find a result this extreme if we had only tested one pair of variables one time.

But I am a p-villain. I absolutely did not test only one pair of variables one time. I correlated hundreds of millions of pairs of variables. I threw boatloads of data into an industrial-sized blender to find this correlation.

Who is going to stop me? p-value reporting doesn't require me to report how many calculations I had to go through in order to find a low p-value!

On average, you will find a correlation as strong as 0.82 in 0.00079% of random cases. Said differently, if you correlated 126,126 random variables with the same 19 degrees of freedom, you would randomly expect to find a correlation as strong as this one.
^Note You don't actually need 126 thousand variables to find a correlation like this one. I don't have that many variables in my database. You can also correlate variables that are not independent. I do this a lot.

p-value calculations are useful for understanding the probability of a result happening by chance. They are most useful when used to highlight the risk of a fluke outcome. For example, if you calculate a p-value of 0.30, the risk that the result is a fluke is high. It is good to know that! But there are lots of ways to get a p-value of less than 0.01, as evidenced by this project.

In this particular case, the values are so extreme as to be meaningless. That's why no one reports p-values with specificity after they drop below 0.01.

Just to be clear: I'm being completely transparent about the calculations. There is no math trickery. This is just how statistics shakes out when you calculate hundreds of millions of random correlations.
^Note Degrees of freedom is a measure of how many free components we are testing. In this case it is 19 because we have two variables measured over a period of 20 years. It's just the number of years minus (the number of variables minus one), which in this case simplifies to the number of years minus one.
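The 0.00079% and 126,126 figures are two views of the same p-value: one is the p-value written as a percentage, the other is its reciprocal. A minimal sketch of that arithmetic, assuming the site rounds the reciprocal to the nearest whole number (the variable names are illustrative only):

```python
# p-value as reported on this page
p_value = 0.000007928572967987274

# Share of random cases expected to show a correlation at least this strong
percent_text = f"{p_value * 100:.5f}%"
print(percent_text, "of random cases")

# Equivalent "one in N" framing: N is the reciprocal of the p-value
one_in_n = round(1 / p_value)
print(f"about 1 in {one_in_n:,}")
```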

[ 0.6, 0.93 ] 95% correlation confidence interval (using the Fisher z-transformation)
The confidence interval is an estimate of the range of the value of the correlation coefficient, using the correlation itself as an input. The values are meant to be the low and high end of the correlation coefficient with 95% confidence.

This one is a bit more complicated than the other calculations, but I include it because many people have been pushing for confidence intervals instead of p-value calculations (for example: NEJM). However, if you are dredging data, you can reliably find yourself in the 5%. That's my goal!
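For readers curious how an interval like the one above can be produced, here is a minimal sketch of the textbook Fisher z-transformation. It assumes the usual standard error of 1/sqrt(n - 3) and a normal critical value; the page does not spell out its exact formula, so treat the function `fisher_ci` as an illustration rather than the site's implementation.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)                  # map r onto an approximately normal scale
    se = 1 / math.sqrt(n - 3)          # standard error of z for n paired observations
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    # Build the interval on the z scale, then map both ends back to the r scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

low, high = fisher_ci(0.8241081, 20)
print(f"[ {low:.2f}, {high:.2f} ]")
```

Because the transformation is nonlinear, the resulting interval is not symmetric around r: it stretches further below 0.82 than above it.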

All values for the years included above: ^Note If I were being very sneaky, I could trim years from the beginning or end of the datasets to increase the correlation on some pairs of variables. I don't do that because there are already plenty of correlations in my database without monkeying with the years.

Still, sometimes one of the variables has more years of data available than the other. This page only shows the overlapping years. To see all the years, click on the "See what else correlates with..." link above.

	2003	2004	2005	2006	2007	2008	2009	2010	2011	2012	2013	2014	2015	2016	2017	2018	2019	2020	2021	2022
US kids in public school (Students)	48540200	48795500	49113300	49315800	49290600	49265600	49361000	49484200	49521700	49771100	50044500	50312600	50438000	50615200	50685600	50694100	50796400	49374800	49433100	49262300
The number of pipelayers in North Dakota (Pipelayers)	70	80	110	140	210	200	130	120	150	320	310	470	360	450	330	540	230	130	60	190

Why this works

Data dredging: I have 25,153 variables in my database. I compare all these variables against each other to find ones that randomly match up. That's 632,673,409 correlation calculations! This is called “data dredging.” Instead of starting with a hypothesis and testing it, I instead abused the data to see what correlations shake out. It’s a dangerous way to go about analysis, because any sufficiently large dataset will yield strong correlations completely at random.
Lack of causal connection: There is probably no direct connection between these variables, despite what the AI says above. This is exacerbated by the fact that I used "Years" as the base variable. Lots of things happen in a year that are not related to each other! Most studies would use something like "one person" instead of "one year" to be the "thing" studied.
^Note Because these pages are automatically generated, it's possible that the two variables you are viewing are in fact causally related. I take steps to prevent the obvious ones from showing on the site (I don't let data about the weather in one city correlate with the weather in a neighboring city, for example), but sometimes they still pop up. If they are related, cool! You found a loophole.
Observations not independent: For many variables, sequential years are not independent of each other. If a population of people is continuously doing something every day, there is no reason to think they would suddenly change how they are doing that thing on January 1. A simple p-value calculation does not take this into account, so mathematically it appears less probable than it really is.
^Note Personally I don't find any p-value calculation to be 'simple,' but you know what I mean.
Y-axis doesn't start at zero: I truncated the Y-axes of the graph above. I also used a line graph, which makes the visual connection stand out more than it deserves. Mathematically what I showed is true, but it is intentionally misleading. Below is the same chart but with both Y-axes starting at zero.
^Note Nothing against line graphs. They are great at telling a story when you have linear data! But visually it is deceptive because the only data is at the points on the graph, not the lines on the graph. In between each point, the data could have been doing anything. Like going for a random walk by itself!
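Two of the points in this section can be checked with a small simulation. The sketch below first confirms that 632,673,409 is 25,153 squared (every variable against every variable, in both orders). It then compares how often two unrelated 20-point series reach |r| of at least 0.8 when each series is independent noise versus a random walk, where each year builds on the previous one. The function names, the 2,000 trials, and the Gaussian steps are assumptions chosen for illustration, not anything taken from the site's code.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs)
    sy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(sx * sy)

def noise(rng, steps):
    """Independent observations: every year is a fresh random draw."""
    return [rng.gauss(0, 1) for _ in range(steps)]

def random_walk(rng, steps):
    """Dependent observations: every year starts from last year's value."""
    position, path = 0.0, []
    for _ in range(steps):
        position += rng.gauss(0, 1)
        path.append(position)
    return path

def share_strong(make_series, trials=2000, years=20, threshold=0.8, seed=1):
    """Fraction of unrelated series pairs whose |r| meets the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        r = pearson_r(make_series(rng, years), make_series(rng, years))
        if abs(r) >= threshold:
            hits += 1
    return hits / trials

variables = 25_153
comparisons = variables ** 2
print(f"comparisons: {comparisons:,}")
print("share of strong correlations, independent noise:", share_strong(noise))
print("share of strong correlations, random walks:", share_strong(random_walk))
```

The random-walk share should come out far higher than the noise share, which is the sense in which a simple p-value understates how easy these matches are to find in year-over-year data.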

Try it yourself

You can calculate the values on this page on your own! Try running the Python code to see the calculation results.

Step 1: Download and install Python on your computer.

Step 2: Open a plaintext editor like Notepad and paste the code below into it.

Step 3: Save the file as "calculate_correlation.py" in a place you will remember, like your desktop. Copy the file location to your clipboard. On Windows, you can right-click the file and click "Properties," and then copy what comes after "Location:" As an example, on my computer the location is "C:\Users\tyler\Desktop"

Step 4: Open a command line window. For example, by pressing start and typing "cmd" and then pressing enter.

Step 5: Install the required modules by typing "pip install numpy", then pressing enter, then typing "pip install scipy", then pressing enter.

Step 6: Navigate to the location where you saved the Python file by using the "cd" command. For example, I would type "cd C:\Users\tyler\Desktop" and push enter.

Step 7: Run the Python script by typing "python calculate_correlation.py"

If you run into any issues, I suggest asking ChatGPT to walk you through installing Python and running the code below on your system. Try this question:

"Walk me through installing Python on my computer to run a script that uses scipy and numpy. Go step-by-step and ask me to confirm before moving on. Start by asking me questions about my operating system so that you know how to proceed. Assume I want the simplest installation with the latest version of Python and that I do not currently have any of the necessary elements installed. Remember to only give me one step per response and confirm I have done it before proceeding."

# These modules make it easier to perform the calculation
import numpy as np
from scipy import stats

# We'll define a function that we can call to return the correlation calculations
def calculate_correlation(array1, array2):

    # Calculate Pearson correlation coefficient and p-value
    correlation, p_value = stats.pearsonr(array1, array2)

    # Calculate R-squared as the square of the correlation coefficient
    r_squared = correlation**2

    return correlation, r_squared, p_value

# These are the arrays for the variables shown on this page, but you can modify them to be any two sets of numbers
array_1 = np.array([48540200,48795500,49113300,49315800,49290600,49265600,49361000,49484200,49521700,49771100,50044500,50312600,50438000,50615200,50685600,50694100,50796400,49374800,49433100,49262300,])
array_2 = np.array([70,80,110,140,210,200,130,120,150,320,310,470,360,450,330,540,230,130,60,190,])
array_1_name = "US kids in public school"
array_2_name = "The number of pipelayers in North Dakota"

# Perform the calculation
print(f"Calculating the correlation between {array_1_name} and {array_2_name}...")
correlation, r_squared, p_value = calculate_correlation(array_1, array_2)

# Print the results
print("Correlation Coefficient:", correlation)
print("R-squared:", r_squared)
print("P-value:", p_value)

Reusable content

You may re-use the images on this page for any purpose, even commercial purposes, without asking for permission. The only requirement is that you attribute Tyler Vigen. ^Note Attribution can take many different forms. If you leave the "tylervigen.com" link in the image, that satisfies it just fine. If you remove it and move it to a footnote, that's fine too. You can also just write "Charts courtesy of Tyler Vigen" at the bottom of an article.

You do not need to attribute "the spurious correlations website," and you don't even need to link here if you don't want to. I don't gain anything from pageviews. There are no ads on this site, there is nothing for sale, and I am not for hire.

For the record, I am just one person. Tyler Vigen, he/him/his. I do have degrees, but they should not go after my name unless you want to annoy my wife. If that is your goal, then go ahead and cite me as "Tyler Vigen, A.A. A.A.S. B.A. J.D." Otherwise it is just "Tyler Vigen."

When spoken, my last name is pronounced "vegan," like I don't eat meat.

Full license details.
For more on re-use permissions, or to get a signed release form, see tylervigen.com/permission.

Download images for these variables:

High resolution line chart ^Note The image linked here is a Scalable Vector Graphic (SVG). It is the highest resolution that is possible to achieve. It scales up beyond the size of the observable universe without pixelating. You do not need to email me asking if I have a higher resolution image. I do not. The physical limitations of our universe prevent me from providing you with an image that is any higher resolution than this one.

If you insert it into a PowerPoint presentation (a tool well-known for managing things that are the scale of the universe), you can right-click > "Ungroup" or "Create Shape" and then edit the lines and text directly. You can also change the colors this way.

Alternatively you can use a tool like Inkscape.
High resolution line chart, optimized for mobile
Alternative high resolution line chart
Scatterplot
Portable line chart (png)
Portable line chart (png), optimized for mobile
Line chart for only US kids in public school
Line chart for only The number of pipelayers in North Dakota
AI-generated correlation image
The spurious research paper: Digging Deeper: The Unearthed Link between U.S. Public School Kids and North Dakota Pipelayers

Correlation ID: 2284 · Black Variable ID: 1296 · Red Variable ID: 12691

