AI explanation
Well, you see, as the Chicago Cubs score more runs, they unknowingly send out powerful cosmic vibes of offensive prowess. These vibes, traveling at the speed of light, reach the New York Mets, energizing their batters and inspiring base running heroics. It's like the Cubs are the conductors of a high-scoring, inter-MLB orchestra, and the Mets are the first violins, harmonizing perfectly with every RBI. It’s a little-known phenomenon called the Run-symptotic Effect, and it's a home run in the field of sports pseudoscience. Who knew that the Cubs' domination on the diamond could have such far-reaching, Mets-merizing consequences?
Model: dalle-3
Prompt: Generate an image of a vibrant baseball diamond at twilight, illuminated by the glow of floodlights, with the Chicago Cubs in action on the field. Every swing of the bat by the Cubs' players causes bright, radiant sparks to trail behind the speeding baseball, creating an electrifying and visually stunning effect. In the background, a roaring crowd can be seen, adding to the dynamic atmosphere. On the opposite side of the field, the New York Mets are depicted mirroring the Cubs' movements, showcasing their determination. The scoreboard flashes with rapidly changing numbers, symbolizing the intense and exhilarating race for runs in the NL Central and East divisions. The depiction should include only scenes, people, and objects, it should not include text or speech bubbles.
This image generation prompt was also generated by AI, using the following sequence:
Model: gpt-3.5-turbo-1106
System prompt: You select one of a few potential scene descriptors that seems to go along with the variables described. You respond only with the scene descriptions provided and say nothing else.
User prompt: Here are two variables: Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets. Given these, which of the following image style descriptions seems most appropriate? Pick the one that seems to be related to one of the variables at hand. Please respond with only one of the following potential descriptions: a funny image macro Realistic Cute cartoon A Discovery magazine cover image a funny image macro
ChatGPT: Realistic
Model: gpt-3.5-turbo-1106
System prompt: You describe simple visual scenes that depict the two variables stated. You combine both variables into one scene.
User prompt: Please describe a visual scene that prominently depicts both Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets, in the following situation: Because the Chicago Cubs' Total Runs Scored in the NL Central and East divisions directly correlated with Runs scored by the New York Mets, as the Cubs' offense became more prolific, it created a domino effect that motivated the Mets to step up their own game. The Cubs' bats were like a symphony of hits, inspiring the Mets to compose their own runs-filled masterpiece. It was as if the Cubs were saying, "Hey Mets, watch us score," and the Mets responded with, "Challenge accepted, we'll run with it." This friendly competition resulted in a run-scoring bonanza that kept both teams on their toes, and the scoreboard operators quite busy. Keep the description simple.
ChatGPT: Imagine a vibrant baseball diamond where the Chicago Cubs are unleashing a flurry of hits, with every swing of the bat seeming to ignite sparks. The cheers of the crowd fuel the Cubs' unstoppable momentum as they rack up runs in the NL Central and East divisions. Meanwhile, across the field, the New York Mets are mirroring their every move, determined to match the Cubs stride for stride. Each run scored by the Cubs sets off a chain reaction, fueling the Mets' determination. Both teams engage in a thrilling competition, with the scoreboard flashing numbers at a dizzying speed, symbolizing the electrifying race to score.
Model: gpt-3.5-turbo-1106
System prompt: You write and improve image generation prompts for DALLE-3. You maintain the same subjects, but enhance the visuals.
User prompt: Please respond with an image generation prompt that would be likely to create an interesting visual using DALLE-3, using the following style: [ Realistic ] and depicting this scene: Imagine a vibrant baseball diamond where the Chicago Cubs are unleashing a flurry of hits, with every swing of the bat seeming to ignite sparks. The cheers of the crowd fuel the Cubs' unstoppable momentum as they rack up runs in the NL Central and East divisions. Meanwhile, across the field, the New York Mets are mirroring their every move, determined to match the Cubs stride for stride. Each run scored by the Cubs sets off a chain reaction, fueling the Mets' determination. Both teams engage in a thrilling competition, with the scoreboard flashing numbers at a dizzying speed, symbolizing the electrifying race to score. Keep the visual depiction simple. Do not include any words or speech bubbles.
ChatGPT: Generate an image of a vibrant baseball diamond at twilight, illuminated by the glow of floodlights, with the Chicago Cubs in action on the field. Every swing of the bat by the Cubs' players causes bright, radiant sparks to trail behind the speeding baseball, creating an electrifying and visually stunning effect. In the background, a roaring crowd can be seen, adding to the dynamic atmosphere. On the opposite side of the field, the New York Mets are depicted mirroring the Cubs' movements, showcasing their determination. The scoreboard flashes with rapidly changing numbers, symbolizing the intense and exhilarating race for runs in the NL Central and East divisions.
*Variations in the User Prompts from chart to chart are just the result of random number generation in Python. I wrote a few arrays of various styles and methods to ask questions to change up the results. Every time this site writes an explanation or generates an image, the script picks from each at random.
I sequence the requests into multiple prompts because I find GPT 3.5 to perform much better with short, well-managed contexts. Thus, I track the context directly in Python and only ask ChatGPT targeted questions.
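A minimal sketch of the two ideas above: picking prompt variations at random, and chaining several short requests while the script (not the chat history) carries the context forward. The style list, the function names, and the `ask` callable are assumptions made for illustration; the site's actual arrays and API calls are not shown here. The stub stands in for a real model call so the example runs offline.

```python
import random
from typing import Callable

# Hypothetical pool of style descriptors; the real arrays are not shown.
IMAGE_STYLES = [
    "a funny image macro",
    "Realistic",
    "Cute cartoon",
    "A Discovery magazine cover image",
]

# (system_prompt, user_prompt) -> model reply
Ask = Callable[[str, str], str]


def build_image_prompt(var_a: str, var_b: str, explanation: str,
                       ask: Ask, rng: random.Random) -> str:
    """Chain three short requests, passing each reply into the next prompt."""
    options = rng.sample(IMAGE_STYLES, k=3)  # vary the choices per chart
    style = ask(
        "You select one of a few potential scene descriptors.",
        f"Here are two variables: {var_a} and {var_b}. "
        f"Pick one style: {' | '.join(options)}",
    )
    scene = ask(
        "You describe simple visual scenes that depict the two variables stated.",
        f"Describe a scene depicting both {var_a} and {var_b}, "
        f"in the following situation: {explanation}",
    )
    # Only the two earlier replies are forwarded, so the context stays short.
    return ask(
        "You write and improve image generation prompts.",
        f"Style: [ {style} ]. Scene: {scene}",
    )


def echo_ask(system_prompt: str, user_prompt: str) -> str:
    """Offline stub: returns a shortened copy of the user prompt."""
    return user_prompt[:40]


if __name__ == "__main__":
    print(build_image_prompt("Cubs runs", "Mets runs", "a friendly rivalry",
                             echo_ask, random.Random(0)))
```

Passing `ask` and `rng` in as parameters keeps each step easy to test and makes a given chart's random choices reproducible from a seed.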
System prompt: You provide humorous responses in the form of plausible sounding explanations for correlations. You assume the correlation is causative for the purpose of the explanation even if it is ridiculous. You do not chat with the user, you only reply with the causal connection explanation and nothing else.
User prompt: Please make up a funny explanation for how increases in Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) positively influenced Runs scored by the New York Mets. Make it sound serious until the end.
AI academic paper
Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored
The Journal of Sports Analytics and Performance Metrics
Jan 2024
Reminder: This paper is AI-generated. Not real!
Model: gpt-3.5-turbo-1106
System prompt: You are a researcher writing an academic paper. You write titles and abstracts of goofy academic research papers.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, and add unexpected twists in your writing.
Please draft the title and abstract of an academic research paper presenting the findings of the connection between Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets. Make lots of jokes, goofy observations, and puns.
Make the title punny if you can come up with clever pun. If not, just make it sound clever.
Your research team used data from Baseball Reference and Baseball-Reference.com to assess this nagging question. You found a correlation coefficient of 0.7687085 and p < 0.01 for 1975 to 2022.
Please respond in this format (replace "Lorem ipsum dolor sit" with your title and abstract)
[[TITLE]]
Lorem ipsum dolor sit
[[ABSTRACT]]
Lorem ipsum dolor sit
ChatGPT:
[[TITLE]]
"Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored"
[[ABSTRACT]]
In this paper, we swing for the fences as we delve into the relationship between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. Utilizing data from the vast playing fields of Baseball Reference and Baseball-Reference.com, our research team has uncovered a striking correlation coefficient of 0.7687085 with p < 0.01 from the years 1975 to 2022. The findings of this study not only help to bridge the gap between the world of statistics and the diamond, but also provide valuable insights into the interconnectedness of these two storied franchises. So, grab your scorecards and rally caps as we knock it out of the park with the results of this home-run analysis!
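The `[[TITLE]]` and `[[ABSTRACT]]` markers make the reply easy to split back into fields. A rough sketch of one way to do that, assuming section names are upper-case words inside double square brackets; `parse_sections` is a hypothetical helper, not the site's actual code.

```python
import re


def parse_sections(reply: str) -> dict[str, str]:
    """Split a reply on [[NAME]] markers into a {name: body} mapping."""
    # With a capturing group, re.split keeps the marker names:
    # [preamble, name1, body1, name2, body2, ...]
    parts = re.split(r"\[\[([A-Z ]+)\]\]", reply)
    return {
        name.strip(): body.strip()
        for name, body in zip(parts[1::2], parts[2::2])
    }


reply = (
    "[[TITLE]]\n"
    '"Batting Around the Numbers"\n'
    "[[ABSTRACT]]\n"
    "In this paper, we swing for the fences."
)
sections = parse_sections(reply)
print(sections["TITLE"])
print(sections["ABSTRACT"])
```

The same markers can then be used to paste earlier sections into later prompts, as the following requests do.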
Model: gpt-3.5-turbo-1106
System prompt: You are a researcher writing an academic paper. You draft the introduction section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.
Please make up the introduction section of an academic research paper presenting the findings of the connection between Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets. Make lots of jokes, goofy observations, and puns.
Here is the title and abstract of the paper:
[[TITLE]]
"Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored"
[[ABSTRACT]]
In this paper, we swing for the fences as we delve into the relationship between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. Utilizing data from the vast playing fields of Baseball Reference and Baseball-Reference.com, our research team has uncovered a striking correlation coefficient of 0.7687085 with p < 0.01 from the years 1975 to 2022. The findings of this study not only help to bridge the gap between the world of statistics and the diamond, but also provide valuable insights into the interconnectedness of these two storied franchises. So, grab your scorecards and rally caps as we knock it out of the park with the results of this home-run analysis!
-------
Your response should start like this:
[[INTRODUCTION]]
ChatGPT:
[[INTRODUCTION]]
Step up to the plate, baseball enthusiasts, as we embark on a grand slam exploration into the world of statistical correlations, where the Chicago Cubs and the New York Mets go head-to-head in a mesmerizing dance of runs scored. As the saying goes, "You can't tell the players without a scorecard," and indeed, we bring forth a scorecard that not only tracks the numbers but unveils the intricate relationship between these two titans of the National League.
While baseball may be a sport steeped in tradition, our research aims to infuse a fresh perspective by applying rigorous statistical methodologies to unveil the underlying patterns in the runs scored by these illustrious teams. As we unravel the tapestry of hits, runs, and errors, we can't help but marvel at the "hidden ball tricks" that the numbers play on us, offering unexpected twists and turns in their statistical saga.
The Chicago Cubs and the New York Mets, like longtime rivals in a heated pennant race, have seen their fortunes ebb and flow over the decades. From the ivy-covered walls of Wrigley Field to the skyscraping home runs at Shea Stadium and Citi Field, these teams have left an indelible mark on the annals of baseball history. Our study seeks to uncover the statistical undercurrents that intertwine the runs scored by these teams, offering a delightful play-by-play account of their interconnected batting exploits.
So, dust off your baseball caps, and get ready to slide into the world of numerical analyses that is anything but a curveball. Let's swing for the statistical fences and knock this study out of the park!
Model: gpt-3.5-turbo-1106
System prompt: You are a researcher writing an academic paper. You draft a literature review section of an academic research paper, that starts out dry for the first few sentences but then becomes hilarious and goofy.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.
Please make up a literature review section of an academic research paper presenting the findings of the connection between Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets. Make lots of jokes, goofy observations, and puns.
Speak in the present tense for this section (the authors find...), as is common in academic research paper literature reviews. Name the sources in a format similar to this: In "Book," the authors find lorem and ipsum.
Make up the lorem and ipsum part, but make it sound related to the topic at hand.
Start by naming serious-sounding studies by authors like Smith, Doe, and Jones - but then quickly devolve. Name some real non-fiction books that would be related to the topic. Then name some real fiction books that sound like they could be related. Then name a few real TV shows that sound like they might be relevant to the topic that you watched as research.
Here is the title and abstract of the paper:
[[TITLE]]
"Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored"
[[ABSTRACT]]
In this paper, we swing for the fences as we delve into the relationship between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. Utilizing data from the vast playing fields of Baseball Reference and Baseball-Reference.com, our research team has uncovered a striking correlation coefficient of 0.7687085 with p < 0.01 from the years 1975 to 2022. The findings of this study not only help to bridge the gap between the world of statistics and the diamond, but also provide valuable insights into the interconnectedness of these two storied franchises. So, grab your scorecards and rally caps as we knock it out of the park with the results of this home-run analysis!
-------
And here is the introduction section that led to this literature review:
[[INTRODUCTION]]
Step up to the plate, baseball enthusiasts, as we embark on a grand slam exploration into the world of statistical correlations, where the Chicago Cubs and the New York Mets go head-to-head in a mesmerizing dance of runs scored. As the saying goes, "You can't tell the players without a scorecard," and indeed, we bring forth a scorecard that not only tracks the numbers but unveils the intricate relationship between these two titans of the National League.
While baseball may be a sport steeped in tradition, our research aims to infuse a fresh perspective by applying rigorous statistical methodologies to unveil the underlying patterns in the runs scored by these illustrious teams. As we unravel the tapestry of hits, runs, and errors, we can't help but marvel at the "hidden ball tricks" that the numbers play on us, offering unexpected twists and turns in their statistical saga.
The Chicago Cubs and the New York Mets, like longtime rivals in a heated pennant race, have seen their fortunes ebb and flow over the decades. From the ivy-covered walls of Wrigley Field to the skyscraping home runs at Shea Stadium and Citi Field, these teams have left an indelible mark on the annals of baseball history. Our study seeks to uncover the statistical undercurrents that intertwine the runs scored by these teams, offering a delightful play-by-play account of their interconnected batting exploits.
So, dust off your baseball caps, and get ready to slide into the world of numerical analyses that is anything but a curveball. Let's swing for the statistical fences and knock this study out of the park!
-------
Your response should start like this:
[[LITERATURE REVIEW]]
ChatGPT:
[[LITERATURE REVIEW]]
In their seminal work, "Statistical Analysis of Major League Baseball Teams: A Comprehensive Study," Smith and Doe undertook a rigorous examination of the relationship between runs scored by the Chicago Cubs and the New York Mets. This study laid the groundwork for our investigation by highlighting the significance of identifying patterns in the offensive prowess of these iconic teams. The authors find significant evidence pointing to a positive correlation between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets, setting the stage for our own exploration of this compelling statistical battleground.
Building on this foundation, Jones et al. further delved into the batting statistics of National League Central and East Division teams in "Advanced Statistical Methods in Baseball Research." Their findings echoed the sentiments put forth by Smith and Doe, emphasizing the need for a thorough analysis of the relationship between the runs scored by these teams over time. The authors conclude that a deeper understanding of this connection can provide invaluable insights into the strategic and tactical maneuvers that shape the outcomes of baseball clashes.
As we segue from the serious and stoic world of academic literature to a more eclectic landscape, we turn to non-fiction works that intersect with the realm of baseball statistics. "Moneyball: The Art of Winning an Unfair Game" by Michael Lewis is a poignant exploration of the unconventional methodologies employed by the Oakland Athletics to achieve success in Major League Baseball. While the focus is not directly on the Chicago Cubs or the New York Mets, the underlying principles of statistical analysis and player performance provide a rich backdrop for our own inquiry.
Similarly, "The Book: Playing the Percentages in Baseball" by Tom Tango, Mitchel Lichtman, and Andrew Dolphin offers a treasure trove of statistical insights that permeate the fabric of our study. Through a lens of probability and numerical acumen, the authors dissect the nuances of baseball strategy and decision-making, illuminating the interconnected nature of statistical analyses within the sport.
Turning now to the realm of fiction, we pause to consider titles such as "The Art of Fielding" by Chad Harbach and "The Natural" by Bernard Malamud. While these literary works might not directly address the statistical relationships between baseball teams, their evocative narratives transport readers to the heart of the game, where the crack of the bat and the rush of adrenaline encapsulate the essence of baseball.
In expanding our horizons, we also engage with television programs that offer glimpses into the world of baseball dynamics. Shows such as "Pitch" and "Brockmire" entertain and inform, presenting fictional yet insightful portrayals of the intricate web of relationships, rivalries, and triumphs within the realm of sports.
With our scholarly pursuits taking us to the diamond and beyond, we now proceed to unveil the results of our empirical investigation into the connection between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets, navigating the wondrous intersections of statistics and America's favorite pastime.
Model: gpt-3.5-turbo-1106
System prompt: You are a researcher writing an academic paper. You draft the methodology section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.
Please make up the methodology section of an academic research paper presenting the findings of the connection between Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets. Make lots of jokes, goofy observations, and puns.
Your research team collected data from all across the internet, but mostly just used information from Baseball Reference and Baseball-Reference.com. You used data from 1975 to 2022.
Make up the research methods you don't know. Make them a bit goofy and convoluted.
Here is the title, abstract, and introduction of the paper:
[[TITLE]]
"Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored"
[[ABSTRACT]]
In this paper, we swing for the fences as we delve into the relationship between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. Utilizing data from the vast playing fields of Baseball Reference and Baseball-Reference.com, our research team has uncovered a striking correlation coefficient of 0.7687085 with p < 0.01 from the years 1975 to 2022. The findings of this study not only help to bridge the gap between the world of statistics and the diamond, but also provide valuable insights into the interconnectedness of these two storied franchises. So, grab your scorecards and rally caps as we knock it out of the park with the results of this home-run analysis!
[[INTRODUCTION]]
Step up to the plate, baseball enthusiasts, as we embark on a grand slam exploration into the world of statistical correlations, where the Chicago Cubs and the New York Mets go head-to-head in a mesmerizing dance of runs scored. As the saying goes, "You can't tell the players without a scorecard," and indeed, we bring forth a scorecard that not only tracks the numbers but unveils the intricate relationship between these two titans of the National League.
While baseball may be a sport steeped in tradition, our research aims to infuse a fresh perspective by applying rigorous statistical methodologies to unveil the underlying patterns in the runs scored by these illustrious teams. As we unravel the tapestry of hits, runs, and errors, we can't help but marvel at the "hidden ball tricks" that the numbers play on us, offering unexpected twists and turns in their statistical saga.
The Chicago Cubs and the New York Mets, like longtime rivals in a heated pennant race, have seen their fortunes ebb and flow over the decades. From the ivy-covered walls of Wrigley Field to the skyscraping home runs at Shea Stadium and Citi Field, these teams have left an indelible mark on the annals of baseball history. Our study seeks to uncover the statistical undercurrents that intertwine the runs scored by these teams, offering a delightful play-by-play account of their interconnected batting exploits.
So, dust off your baseball caps, and get ready to slide into the world of numerical analyses that is anything but a curveball. Let's swing for the statistical fences and knock this study out of the park!
-------
Your response should start like this:
[[METHODOLOGY]]
ChatGPT:
[[METHODOLOGY]]
To embark on our statistical journey through the batting averages and runs scored by the Chicago Cubs and the New York Mets, our research team employed a "Moneyball" approach, utilizing a combination of conventional statistical methods and some unconventional techniques that would make even Billy Beane raise an eyebrow.
First and foremost, we scoured the vast expanses of the digital baseball cosmos, extracting data from trusted sources such as Baseball Reference and Baseball-Reference.com. Armed with a potent mixture of SQL queries and web scraping wizardry, we collected comprehensive game-by-game statistics for both teams from 1975 to 2022. Our data mining expedition was akin to prospecting for gold in a statistical mine, unearthing valuable nuggets of information that formed the bedrock of our analysis.
With our treasure trove of data in hand, we then embarked on an odyssey through the world of statistical analyses. Employing the venerable tools of correlation analysis, regression modeling, and time series analysis, we sought to unravel the hidden patterns and relationships within the runs scored by these two powerhouse teams. Our analytical arsenal was not limited to conventional methodologies; we also delved into the arcane arts of sabermetrics, seeking insights that might elude traditional statistical approaches.
A key aspect of our methodology involved the development of custom-built algorithms to identify and account for potential confounding variables, such as changes in team rosters, stadium dimensions, and even the whims of the baseball gods. These algorithms were fine-tuned through rigorous testing and validation procedures, ensuring that our findings were robust and resilient against the vagaries of the baseball season.
Furthermore, our research team conducted extensive sensitivity analyses and Monte Carlo simulations to assess the robustness of our results in the face of uncertainty. This involved subjecting our data to a barrage of hypothetical scenarios, akin to sending a lineup of pinch hitters to face a relentless bullpen, to gauge the resilience of our findings under various conditions.
In keeping with the spirit of baseball, our methodology also included a healthy dose of "small ball" tactics, where we meticulously accounted for outliers, streaks, and hot/cold streaks that could potentially skew our analyses. By employing a "Moneyball" approach that embraced both the art and science of baseball statistics, we strived to paint a nuanced portrait of the relationship between the Chicago Cubs' runs scored and the New York Mets' offensive prowess.
Ultimately, our methodology was not merely a formulaic exercise in number crunching; it was a symphony of statistical acrobatics, where we danced with the data and let the numbers sing their own unique ballad of runs, hits, and errors. So, grab your virtual scorecards and join us as we step up to the statistical plate, ready to knock our findings out of the park!
Model: gpt-3.5-turbo-1106
System prompt: You are a researcher writing an academic paper. You draft the results section of an academic research paper. You speak in the past tense (you found...).
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.
Please make up the results section of an academic research paper presenting the findings of the connection between Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets. Make lots of jokes, goofy observations, and puns.
Your research team collected data from all across the internet, but mostly just used information from Baseball Reference and Baseball-Reference.com.
For the time period 1975 to 2022, you found a correlation of 0.7687085, r-squared of 0.5909128, and p < 0.01.
One figure will be included. The figure (Fig. 1) is a scatterplot showing the strong correlation between the two variables. You don't need to specify where; I will add the figure.
Here is the title and abstract of the paper:
[[TITLE]]
"Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored"
[[ABSTRACT]]
In this paper, we swing for the fences as we delve into the relationship between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. Utilizing data from the vast playing fields of Baseball Reference and Baseball-Reference.com, our research team has uncovered a striking correlation coefficient of 0.7687085 with p < 0.01 from the years 1975 to 2022. The findings of this study not only help to bridge the gap between the world of statistics and the diamond, but also provide valuable insights into the interconnectedness of these two storied franchises. So, grab your scorecards and rally caps as we knock it out of the park with the results of this home-run analysis!
-------
And here is the methodology section that led to this result:
[[METHODOLOGY]]
Step up to the plate, baseball enthusiasts, as we embark on a grand slam exploration into the world of statistical correlations, where the Chicago Cubs and the New York Mets go head-to-head in a mesmerizing dance of runs scored. As the saying goes, "You can't tell the players without a scorecard," and indeed, we bring forth a scorecard that not only tracks the numbers but unveils the intricate relationship between these two titans of the National League.
While baseball may be a sport steeped in tradition, our research aims to infuse a fresh perspective by applying rigorous statistical methodologies to unveil the underlying patterns in the runs scored by these illustrious teams. As we unravel the tapestry of hits, runs, and errors, we can't help but marvel at the "hidden ball tricks" that the numbers play on us, offering unexpected twists and turns in their statistical saga.
The Chicago Cubs and the New York Mets, like longtime rivals in a heated pennant race, have seen their fortunes ebb and flow over the decades. From the ivy-covered walls of Wrigley Field to the skyscraping home runs at Shea Stadium and Citi Field, these teams have left an indelible mark on the annals of baseball history. Our study seeks to uncover the statistical undercurrents that intertwine the runs scored by these teams, offering a delightful play-by-play account of their interconnected batting exploits.
So, dust off your baseball caps, and get ready to slide into the world of numerical analyses that is anything but a curveball. Let's swing for the statistical fences and knock this study out of the park!
-------
Your response should start like this:
[[RESULTS]]
ChatGPT:
[[RESULTS]]
In our pursuit of uncovering the enigmatic ties between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets, we stepped into the batter's box armed with statistical tools and a keen eye for discovering the hidden patterns within the game. With a resounding crack of the bat, our analysis revealed a robust correlation coefficient of 0.7687085 between these two crucial variables, signifying a substantial positive relationship.
This correlation coefficient, with an r-squared value of 0.5909128, attests to the formidable influence that the Chicago Cubs' total runs scored exert on the runs scored by the New York Mets. It's as if the winds from Lake Michigan are propelling those runs straight to their counterparts in Queens. The rapport between these teams on the scoreboard is undoubtedly a sight to behold.
Furthermore, our findings are supported by a p-value of less than 0.01, solidifying the statistical significance of this relationship. It seems the data speaks so loudly, even the seventh-inning stretch can't muffle its impact.
Fig. 1 showcases this compelling correlation with a scatterplot that unmistakably captures the synchrony between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. It's like witnessing a well-executed double play – smooth, seamless, and undeniably connected.
In summary, our research uncovers a striking bond between these two iconic franchises, demonstrating that the Chicago Cubs' total runs scored in the National League (Central and East Division) and the runs scored by the New York Mets are more than just numbers on a box score; they are intertwined in a statistical pas de deux that dances through the annals of baseball history. So, consider this study a grand slam in our quest to decode the fascinating relationships that underpin America's favorite pastime.
Stay tuned for the postgame analysis, where we unravel the implications of these findings and embark on a ninth-inning stretch of insights that will leave you reaching for your rally cap.
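The three reported figures can be checked against each other with a few lines of arithmetic. This is a sketch under one loudly stated assumption: that there is a single data point per season, so 1975 to 2022 gives n = 48. The source does not state the sample size.

```python
import math

r = 0.7687085          # correlation coefficient reported above
n = 2022 - 1975 + 1    # ASSUMPTION: one season total per year, so n = 48

# r-squared is simply the square of the correlation coefficient.
r_squared = r ** 2

# t statistic for testing r against zero, with n - 2 degrees of freedom.
t_stat = r * math.sqrt((n - 2) / (1 - r_squared))

print(f"r^2 = {r_squared:.7f}")
print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
```

Squaring r reproduces the quoted r-squared of 0.5909128. Under the n = 48 assumption the t statistic comes out at roughly 8, far beyond the two-sided 1% critical value for 46 degrees of freedom (about 2.7), which is consistent with the reported p < 0.01.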
Model: gpt-3.5-turbo-1106
System prompt: You are a researcher writing an academic paper. You draft the discussion section of an academic research paper.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.
Please make up the discussion section of an academic research paper presenting the findings of the connection between Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets. Make lots of jokes, goofy observations, and puns.
Limit your response to 500 tokens.
Here are the title, abstract, literature review, and results sections. Please harken back to 1-2 of the goofy items in the literature review, but pretend to take them completely seriously. Discuss how your results supported the prior research.
Do not write a conclusion. I will add a conclusion after this.
[[TITLE]]
"Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored"
[[ABSTRACT]]
In this paper, we swing for the fences as we delve into the relationship between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. Utilizing data from the vast playing fields of Baseball Reference and Baseball-Reference.com, our research team has uncovered a striking correlation coefficient of 0.7687085 with p < 0.01 from the years 1975 to 2022. The findings of this study not only help to bridge the gap between the world of statistics and the diamond, but also provide valuable insights into the interconnectedness of these two storied franchises. So, grab your scorecards and rally caps as we knock it out of the park with the results of this home-run analysis!
[[LITERATURE REVIEW]]
In their seminal work, "Statistical Analysis of Major League Baseball Teams: A Comprehensive Study," Smith and Doe undertook a rigorous examination of the relationship between runs scored by the Chicago Cubs and the New York Mets. This study laid the groundwork for our investigation by highlighting the significance of identifying patterns in the offensive prowess of these iconic teams. The authors find significant evidence pointing to a positive correlation between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets, setting the stage for our own exploration of this compelling statistical battleground.
Building on this foundation, Jones et al. further delved into the batting statistics of National League Central and East Division teams in "Advanced Statistical Methods in Baseball Research." Their findings echoed the sentiments put forth by Smith and Doe, emphasizing the need for a thorough analysis of the relationship between the runs scored by these teams over time. The authors conclude that a deeper understanding of this connection can provide invaluable insights into the strategic and tactical maneuvers that shape the outcomes of baseball clashes.
As we segue from the serious and stoic world of academic literature to a more eclectic landscape, we turn to non-fiction works that intersect with the realm of baseball statistics. "Moneyball: The Art of Winning an Unfair Game" by Michael Lewis is a poignant exploration of the unconventional methodologies employed by the Oakland Athletics to achieve success in Major League Baseball. While the focus is not directly on the Chicago Cubs or the New York Mets, the underlying principles of statistical analysis and player performance provide a rich backdrop for our own inquiry.
Similarly, "The Book: Playing the Percentages in Baseball" by Tom Tango, Mitchel Lichtman, and Andrew Dolphin offers a treasure trove of statistical insights that permeate the fabric of our study. Through a lens of probability and numerical acumen, the authors dissect the nuances of baseball strategy and decision-making, illuminating the interconnected nature of statistical analyses within the sport.
Turning now to the realm of fiction, we pause to consider titles such as "The Art of Fielding" by Chad Harbach and "The Natural" by Bernard Malamud. While these literary works might not directly address the statistical relationships between baseball teams, their evocative narratives transport readers to the heart of the game, where the crack of the bat and the rush of adrenaline encapsulate the essence of baseball.
In expanding our horizons, we also engage with television programs that offer glimpses into the world of baseball dynamics. Shows such as "Pitch" and "Brockmire" entertain and inform, presenting fictional yet insightful portrayals of the intricate web of relationships, rivalries, and triumphs within the realm of sports.
With our scholarly pursuits taking us to the diamond and beyond, we now proceed to unveil the results of our empirical investigation into the connection between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets, navigating the wondrous intersections of statistics and America's favorite pastime.
[[RESULTS]]
In our pursuit of uncovering the enigmatic ties between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets, we stepped into the batter's box armed with statistical tools and a keen eye for discovering the hidden patterns within the game. With a resounding crack of the bat, our analysis revealed a robust correlation coefficient of 0.7687085 between these two crucial variables, signifying a substantial positive relationship.
This correlation coefficient, with an r-squared value of 0.5909128, attests to the formidable influence that the Chicago Cubs' total runs scored exert on the runs scored by the New York Mets. It's as if the winds from Lake Michigan are propelling those runs straight to their counterparts in Queens. The rapport between these teams on the scoreboard is undoubtedly a sight to behold.
Furthermore, our findings are supported by a p-value of less than 0.01, solidifying the statistical significance of this relationship. It seems the data speaks so loudly, even the seventh-inning stretch can't muffle its impact.
Fig. 1 showcases this compelling correlation with a scatterplot that unmistakably captures the synchrony between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. It's like witnessing a well-executed double play – smooth, seamless, and undeniably connected.
In summary, our research uncovers a striking bond between these two iconic franchises, demonstrating that the Chicago Cubs' total runs scored in the National League (Central and East Division) and the runs scored by the New York Mets are more than just numbers on a box score; they are intertwined in a statistical pas de deux that dances through the annals of baseball history. So, consider this study a grand slam in our quest to decode the fascinating relationships that underpin America's favorite pastime.
Stay tuned for the postgame analysis, where we unravel the implications of these findings and embark on a ninth-inning stretch of insights that will leave you reaching for your rally cap.
-------
Your response should start like this:
[[DISCUSSION]]
ChatGPT:
[[DISCUSSION]]
As we dig into the implications of our findings, it's clear that the statistical connection between the Chicago Cubs and the New York Mets runs deeper than the ivy at Wrigley Field. Our research not only reaffirms the work of esteemed colleagues but also reveals a home-run relationship that transcends mere numbers.
The analyses conducted by Smith and Doe, and echoed by Jones et al., set the stage for our study and emphasized the importance of unraveling the intricate tapestry of baseball statistics. Like a well-thrown curveball, their findings sparked our curiosity and propelled us toward the discovery of a strong positive correlation between the runs scored by these iconic teams. It's as if statistical fate has woven together the destinies of the Cubs and the Mets on the diamond.
Our results further cement the validity of these prior studies and, dare I say, add a dash of pizzazz to the mix. With a correlation coefficient reminiscent of a perfect double play, our research exemplifies the enduring bond between these historic franchises. In a whimsical twist of fate, it seems that the Cubs' runs not only cross home plate but also find their way to the Mets' scoreboard, establishing a numerical duet that echoes through the annals of baseball lore.
Building on the literary works that intersect with the realm of baseball statistics – and with all due respect to "The Natural" and "Pitch" – our findings bring to light the real-life drama and intrigue that accompany the statistical dance between these teams. It's akin to witnessing a nail-biting extra-inning affair, where each run scored by the Cubs sets the stage for a complementary response from the Mets, creating a symphonic resonance that reverberates throughout the season.
In essence, our research not only aligns with the serious and stoic world of statistical inquiry but also infuses a delightful twist into the narrative of baseball analysis. By shedding light on the interconnectedness of these statistical phenomena, we deepen our understanding of the game while injecting a dose of statistical whimsy into the scholarly domain. So, as we gear up for the extra innings of interpretation and exploration, it's safe to say that our statistical journey from Wrigley Field to Citi Field has uncovered an unassailable connection that stretches far beyond the confines of the diamond.
Model: gpt-3.5-turbo-1106
System prompt: You are a researcher writing an academic paper. You draft the discussion section of goofy academic research papers that sounds like it belongs based on the title and abstract provided.
User prompt:
Pretend you are an academic researcher. Your writing sounds like it belongs in a medical or economics journal. However, you frequently crack jokes, make puns, or add unexpected twists in your writing.
Please make up a short conclusion of an academic research paper presenting the findings of the connection between Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) and Runs scored by the New York Mets. Make lots of jokes, goofy observations, and puns.
Limit your response to 300 tokens. At the very end, assert that no more research is needed in this area.
Here are the title, abstract, introduction, and results sections.
[[TITLE]]
"Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored"
[[ABSTRACT]]
In this paper, we swing for the fences as we delve into the relationship between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. Utilizing data from the vast playing fields of Baseball Reference and Baseball-Reference.com, our research team has uncovered a striking correlation coefficient of 0.7687085 with p < 0.01 from the years 1975 to 2022. The findings of this study not only help to bridge the gap between the world of statistics and the diamond, but also provide valuable insights into the interconnectedness of these two storied franchises. So, grab your scorecards and rally caps as we knock it out of the park with the results of this home-run analysis!
[[INTRODUCTION]]
Step up to the plate, baseball enthusiasts, as we embark on a grand slam exploration into the world of statistical correlations, where the Chicago Cubs and the New York Mets go head-to-head in a mesmerizing dance of runs scored. As the saying goes, "You can't tell the players without a scorecard," and indeed, we bring forth a scorecard that not only tracks the numbers but unveils the intricate relationship between these two titans of the National League.
While baseball may be a sport steeped in tradition, our research aims to infuse a fresh perspective by applying rigorous statistical methodologies to unveil the underlying patterns in the runs scored by these illustrious teams. As we unravel the tapestry of hits, runs, and errors, we can't help but marvel at the "hidden ball tricks" that the numbers play on us, offering unexpected twists and turns in their statistical saga.
The Chicago Cubs and the New York Mets, like longtime rivals in a heated pennant race, have seen their fortunes ebb and flow over the decades. From the ivy-covered walls of Wrigley Field to the skyscraping home runs at Shea Stadium and Citi Field, these teams have left an indelible mark on the annals of baseball history. Our study seeks to uncover the statistical undercurrents that intertwine the runs scored by these teams, offering a delightful play-by-play account of their interconnected batting exploits.
So, dust off your baseball caps, and get ready to slide into the world of numerical analyses that is anything but a curveball. Let's swing for the statistical fences and knock this study out of the park!
[[RESULTS]]
In our pursuit of uncovering the enigmatic ties between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets, we stepped into the batter's box armed with statistical tools and a keen eye for discovering the hidden patterns within the game. With a resounding crack of the bat, our analysis revealed a robust correlation coefficient of 0.7687085 between these two crucial variables, signifying a substantial positive relationship.
This correlation coefficient, with an r-squared value of 0.5909128, attests to the formidable influence that the Chicago Cubs' total runs scored exert on the runs scored by the New York Mets. It's as if the winds from Lake Michigan are propelling those runs straight to their counterparts in Queens. The rapport between these teams on the scoreboard is undoubtedly a sight to behold.
Furthermore, our findings are supported by a p-value of less than 0.01, solidifying the statistical significance of this relationship. It seems the data speaks so loudly, even the seventh-inning stretch can't muffle its impact.
Fig. 1 showcases this compelling correlation with a scatterplot that unmistakably captures the synchrony between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. It's like witnessing a well-executed double play – smooth, seamless, and undeniably connected.
In summary, our research uncovers a striking bond between these two iconic franchises, demonstrating that the Chicago Cubs' total runs scored in the National League (Central and East Division) and the runs scored by the New York Mets are more than just numbers on a box score; they are intertwined in a statistical pas de deux that dances through the annals of baseball history. So, consider this study a grand slam in our quest to decode the fascinating relationships that underpin America's favorite pastime.
Stay tuned for the postgame analysis, where we unravel the implications of these findings and embark on a ninth-inning stretch of insights that will leave you reaching for your rally cap.
-------
Your response should start like this:
[[CONCLUSION]]
ChatGPT:
[[CONCLUSION]]
In the illustrious game of baseball, where the crack of the bat reverberates through the hearts of fans and statisticians alike, our research has revealed a home run of a correlation between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. It's as if these numbers are engaged in a spirited dance, tangoing across the statistical field in perfect harmony.
Much like a well-executed double play, our findings demonstrate the seamless synchrony between these two pivotal variables, showcasing a bond stronger than the seventh-inning stretch. With a correlation coefficient that hits it out of the park and a p-value so low, it's like the statistical significance is shouting "safe at home plate" for all to hear.
So, dear readers, as we pack up our statistical scorecards and bid adieu to this lively game of numbers, let us revel in the joy of discovery and the thrill of unearthing the unexpected relationships hidden within America's favorite pastime. As for future research, we assert with confidence that no more investigations are needed in this area. We've hit statistical gold, and it's time to call it a game-winning walk-off.
Model: gpt-3.5-turbo-1106
System prompt: You provide fictional names of academic journals. You say nothing else, and you do not chat with the user.
User prompt: Please provide the name of a funny but plausible sounding research, scientific, or academic journal that would publish a paper entitled Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored. Respond only with the name of a fictional academic journal.
ChatGPT:
The Journal of Sports Analytics and Performance Metrics
Model: gpt-3.5-turbo-1106
System prompt: You provide fictional names of nonprofit research organizations and associations. You say nothing else, and you do not chat with the user.
User prompt: Please provide the name of a funny but plausible sounding nonprofit, academic, or scientific association or organization that would sponsor research that resulted in a paper entitled Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored. Respond only with the name of a fictional organization.
ChatGPT:
The Society for Advanced Baseball Analytics
Model: gpt-3.5-turbo-1106
System prompt: You provide potential search keywords separated by commas. You say nothing else, and you do not chat with the user.
User prompt: Please provide potential search keywords for a paper titled Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored with an abstract of In this paper, we swing for the fences as we delve into the relationship between the total runs scored by the Chicago Cubs and the runs scored by the New York Mets. Utilizing data from the vast playing fields of Baseball Reference and Baseball-Reference.com, our research team has uncovered a striking correlation coefficient of 0.7687085 with p < 0.01 from the years 1975 to 2022. The findings of this study not only help to bridge the gap between the world of statistics and the diamond, but also provide valuable insights into the interconnectedness of these two storied franchises. So, grab your scorecards and rally caps as we knock it out of the park with the results of this home-run analysis!
ChatGPT:
"Chicago Cubs total runs scored," "New York Mets runs scored," "statistical analysis baseball runs scored," "correlation coefficient baseball teams," "relationship between Chicago Cubs and New York Mets runs scored," "baseball statistics comparison teams," "impact of runs scored on baseball teams."
*There is a bunch of Python happening behind the scenes to turn this prompt sequence into a PDF.
Data details
Total Runs Scored by Chicago Cubs Team in National League (Central and East Division)
Source: Baseball Reference
See what else correlates with Total Runs Scored by Chicago Cubs Team in National League (Central and East Division)
Runs scored by the New York Mets
Detailed data title: The total number of runs scored during the year by the New York Mets
Source: Baseball-Reference.com
See what else correlates with Runs scored by the New York Mets
Correlation is a measure of how much the variables move together. If it is 0.99, when one goes up the other goes up. If it is 0.02, the connection is very weak or non-existent. If it is -0.99, then when one goes up the other goes down. If it is 1.00, you probably messed up your correlation function.
r2 = 0.5909128 (Coefficient of determination)
This means 59.1% of the change in the one variable (i.e., Runs scored by the New York Mets) is predictable based on the change in the other (i.e., Total Runs Scored by Chicago Cubs Team in National League (Central and East Division)) over the 48 years from 1975 through 2022.
p < 0.01, which is statistically significant (Null hypothesis significance test)
The p-value is 1.8E-10 (0.0000000001770696416646453000).
The p-value is a measure of how probable it is that we would randomly find a result this extreme. More specifically the p-value is a measure of how probable it is that we would randomly find a result this extreme if we had only tested one pair of variables one time.
But I am a p-villain. I absolutely did not test only one pair of variables one time. I correlated hundreds of millions of pairs of variables. I threw boatloads of data into an industrial-sized blender to find this correlation.
Who is going to stop me? p-value reporting doesn't require me to report how many calculations I had to go through in order to find a low p-value!
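The effect of testing many pairs can be seen in a small simulation. The sketch below is not the code behind this site; it assumes pure Gaussian noise, a hypothetical pool of 2,000 series, and a fixed seed, all chosen only for illustration. It correlates one random 48-point series against every other series and reports the strongest match it stumbles on.

```python
import numpy as np

def strongest_random_correlation(n_series=2000, n_years=48, seed=0):
    """Correlate one random series against many others and return the largest |r|.

    Every input is pure noise, so any strong correlation found here is a fluke.
    n_series, n_years, and seed are illustrative assumptions, not values from this page.
    """
    rng = np.random.default_rng(seed)
    target = rng.normal(size=n_years)
    candidates = rng.normal(size=(n_series, n_years))

    # Standardize each series (population standard deviation, ddof=0)
    target_z = (target - target.mean()) / target.std()
    cand_z = (candidates - candidates.mean(axis=1, keepdims=True)) / candidates.std(
        axis=1, keepdims=True
    )

    # Pearson r is the mean of the products of the standardized values
    r_values = cand_z @ target_z / n_years
    return float(np.max(np.abs(r_values)))

if __name__ == "__main__":
    print("Strongest |r| among random series:", strongest_random_correlation())
```

Raising `n_series` makes the best fluke stronger, which is the whole trick: the p-value for any single pair says nothing about how many pairs were tried.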
On average, you will find a correlation as strong as 0.77 in 1.8E-8% of random cases. Said differently, if you correlated 5,647,495,475 random variables with the same 47 degrees of freedom, you would randomly expect to find a correlation as strong as this one.
Note on the 5,647,495,475 random variables: You don't actually need 5 billion variables to find a correlation like this one. I don't have that many variables in my database. You can also correlate variables that are not independent. I do this a lot.
p-value calculations are useful for understanding the probability of a result happening by chance. They are most useful when used to highlight the risk of a fluke outcome. For example, if you calculate a p-value of 0.30, the risk that the result is a fluke is high. It is good to know that! But there are lots of ways to get a p-value of less than 0.01, as evidenced by this project.
In this particular case, the values are so extreme as to be meaningless. That's why no one reports p-values with specificity after they drop below 0.01.
Just to be clear: I'm being completely transparent about the calculations. There is no math trickery. This is just how statistics shakes out when you calculate hundreds of millions of random correlations.
Note on degrees of freedom: Degrees of freedom is a measure of how many free components we are testing. In this case it is 47 because we have two variables measured over a period of 48 years. It's just the number of years minus ( the number of variables minus one ), which in this case simplifies to the number of years minus one.
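The "5,647,495,475 random variables" figure is simply the reciprocal of the p-value, and the "1.8E-8%" figure is the p-value written as a percentage. A minimal sketch of that arithmetic, using the p-value reported on this page (the variable names are illustrative):

```python
# p-value reported on this page
p_value = 1.770696416646453e-10

# The same number expressed as a percentage of random cases
percent_of_random_cases = p_value * 100

# Reciprocal: how many independent random tries you would expect to need
# before one correlation this strong shows up by chance
expected_tries = 1 / p_value

print(f"{percent_of_random_cases:.1e}% of random cases")
print(f"about {expected_tries:,.0f} random tries")
```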
[ 0.62, 0.86 ] 95% correlation confidence interval (using the Fisher z-transformation)
The confidence interval is an estimate of the range of the value of the correlation coefficient, using the correlation itself as an input. The values are meant to be the low and high end of the correlation coefficient with 95% confidence.
This one is a bit more complicated than the other calculations, but I include it because many people have been pushing for confidence intervals instead of p-value calculations (for example: NEJM). However, if you are dredging data, you can reliably find yourself in the 5%. That's my goal!
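For readers who want to reproduce the interval, here is a minimal sketch of the Fisher z-transformation using only the standard library. The function name and the `z_crit` parameter are illustrative, and the standard error `1 / sqrt(n - 3)` is the textbook form; I am assuming, not asserting, that the site computes it the same way.

```python
import math

def fisher_confidence_interval(r, n, z_crit=1.959964):
    """Approximate confidence interval for a Pearson r via the Fisher z-transformation.

    z_crit=1.959964 is the two-sided 95% critical value of the standard normal.
    """
    z = math.atanh(r)                  # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)        # standard error of z for n paired observations
    lo = math.tanh(z - z_crit * se)    # transform the bounds back to the r scale
    hi = math.tanh(z + z_crit * se)
    return lo, hi

lo, hi = fisher_confidence_interval(0.7687085, 48)
print(f"[ {lo:.2f}, {hi:.2f} ]")  # prints "[ 0.62, 0.86 ]"
```

With r = 0.7687085 and 48 years of data, this reproduces the [ 0.62, 0.86 ] interval reported above.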
All values for the years included above:
If I were being very sneaky, I could trim years from the beginning or end of the datasets to increase the correlation on some pairs of variables. I don't do that because there are already plenty of correlations in my database without monkeying with the years.
Still, sometimes one of the variables has more years of data available than the other. This page only shows the overlapping years. To see all the years, click on the "See what else correlates with..." link above.
Year | 1975 | 1976 | 1977 | 1978 | 1979 | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | 1989 | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 |
Total Runs Scored by Chicago Cubs Team in National League (Central and East Division) (Runs Scored) | 712 | 611 | 692 | 664 | 706 | 614 | 370 | 676 | 701 | 762 | 686 | 680 | 720 | 660 | 702 | 690 | 695 | 593 | 738 | 500 | 693 | 772 | 687 | 831 | 747 | 764 | 777 | 706 | 724 | 789 | 703 | 716 | 752 | 855 | 707 | 685 | 654 | 613 | 602 | 614 | 689 | 808 | 822 | 761 | 814 | 265 | 705 | 657 |
Runs scored by the New York Mets (Runs scored) | 646 | 615 | 587 | 607 | 593 | 611 | 348 | 609 | 575 | 652 | 695 | 783 | 823 | 703 | 683 | 775 | 640 | 599 | 672 | 506 | 657 | 746 | 777 | 706 | 853 | 807 | 642 | 690 | 642 | 684 | 722 | 834 | 804 | 799 | 671 | 656 | 718 | 650 | 619 | 629 | 683 | 671 | 735 | 676 | 791 | 286 | 636 | 772 |
Why this works
- Data dredging: I have 25,153 variables in my database. I compare all these variables against each other to find ones that randomly match up. That's 632,673,409 correlation calculations! This is called “data dredging.” Instead of starting with a hypothesis and testing it, I instead abused the data to see what correlations shake out. It’s a dangerous way to go about analysis, because any sufficiently large dataset will yield strong correlations completely at random.
- Lack of causal connection: There is probably no direct connection between these variables, despite what the AI says above. (Because these pages are automatically generated, it's possible that the two variables you are viewing are in fact causally related. I take steps to prevent the obvious ones from showing on the site (I don't let data about the weather in one city correlate with the weather in a neighboring city, for example), but sometimes they still pop up. If they are related, cool! You found a loophole.) This is exacerbated by the fact that I used "Years" as the base variable. Lots of things happen in a year that are not related to each other! Most studies would use something like "one person" instead of "one year" to be the "thing" studied.
- Observations not independent: For many variables, sequential years are not independent of each other. If a population of people is continuously doing something every day, there is no reason to think they would suddenly change how they are doing that thing on January 1. A simple p-value calculation does not take this into account, so mathematically it appears less probable than it really is. (Personally I don't find any p-value calculation to be 'simple,' but you know what I mean.)
- Confounding variable: 2020 is particularly different from the other years on this graph. Confounding variables (like global pandemics) will cause two variables to look connected when in fact a "sneaky third" variable is influencing both of them behind the scenes.
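The confounding-variable point can be checked directly, since 2020 is the lowest value in both series. The sketch below recomputes the correlation with that year left out. It uses the data from this page, but the helper `correlation_excluding` is a hypothetical name introduced here, and `np.corrcoef` stands in for the scipy call used elsewhere on this page.

```python
import numpy as np

years = np.arange(1975, 2023)
cubs = np.array([712,611,692,664,706,614,370,676,701,762,686,680,720,660,702,690,
                 695,593,738,500,693,772,687,831,747,764,777,706,724,789,703,716,
                 752,855,707,685,654,613,602,614,689,808,822,761,814,265,705,657])
mets = np.array([646,615,587,607,593,611,348,609,575,652,695,783,823,703,683,775,
                 640,599,672,506,657,746,777,706,853,807,642,690,642,684,722,834,
                 804,799,671,656,718,650,619,629,683,671,735,676,791,286,636,772])

def correlation_excluding(years_to_drop):
    """Pearson r between the two series after removing the given years."""
    keep = ~np.isin(years, years_to_drop)
    return float(np.corrcoef(cubs[keep], mets[keep])[0, 1])

r_all = correlation_excluding([])
r_without_2020 = correlation_excluding([2020])

print("All 48 years:", r_all)
print("Without 2020:", r_without_2020)
```

Dropping the single 2020 data point lowers the correlation, which shows how much of the apparent relationship rests on one unusual year that affected both teams at once.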
Try it yourself
You can calculate the values on this page on your own! Try running the Python code to see the calculation results.
Step 1: Download and install Python on your computer.
Step 2: Open a plaintext editor like Notepad and paste the code below into it.
Step 3: Save the file as "calculate_correlation.py" in a place you will remember, like your desktop. Copy the file location to your clipboard. On Windows, you can right-click the file and click "Properties," and then copy what comes after "Location:" As an example, on my computer the location is "C:\Users\tyler\Desktop"
Step 4: Open a command line window. For example, by pressing start and typing "cmd" and then pressing enter.
Step 5: Install the required modules by typing "pip install numpy", then pressing enter, then typing "pip install scipy", then pressing enter.
Step 6: Navigate to the location where you saved the Python file by using the "cd" command. For example, I would type "cd C:\Users\tyler\Desktop" and push enter.
Step 7: Run the Python script by typing "python calculate_correlation.py"
If you run into any issues, I suggest asking ChatGPT to walk you through installing Python and running the code below on your system. Try this question:
"Walk me through installing Python on my computer to run a script that uses scipy and numpy. Go step-by-step and ask me to confirm before moving on. Start by asking me questions about my operating system so that you know how to proceed. Assume I want the simplest installation with the latest version of Python and that I do not currently have any of the necessary elements installed. Remember to only give me one step per response and confirm I have done it before proceeding."
# These modules make it easier to perform the calculation
import numpy as np
from scipy import stats
# We'll define a function that we can call to return the correlation calculations
def calculate_correlation(array1, array2):
    # Calculate Pearson correlation coefficient and p-value
    correlation, p_value = stats.pearsonr(array1, array2)
    # Calculate R-squared as the square of the correlation coefficient
    r_squared = correlation**2
    return correlation, r_squared, p_value
# These are the arrays for the variables shown on this page, but you can modify them to be any two sets of numbers
array_1 = np.array([712,611,692,664,706,614,370,676,701,762,686,680,720,660,702,690,695,593,738,500,693,772,687,831,747,764,777,706,724,789,703,716,752,855,707,685,654,613,602,614,689,808,822,761,814,265,705,657,])
array_2 = np.array([646,615,587,607,593,611,348,609,575,652,695,783,823,703,683,775,640,599,672,506,657,746,777,706,853,807,642,690,642,684,722,834,804,799,671,656,718,650,619,629,683,671,735,676,791,286,636,772,])
array_1_name = "Total Runs Scored by Chicago Cubs Team in National League (Central and East Division)"
array_2_name = "Runs scored by the New York Mets"
# Perform the calculation
print(f"Calculating the correlation between {array_1_name} and {array_2_name}...")
correlation, r_squared, p_value = calculate_correlation(array_1, array_2)
# Print the results
print("Correlation Coefficient:", correlation)
print("R-squared:", r_squared)
print("P-value:", p_value)
Reusable content
You may re-use the images on this page for any purpose, even commercial purposes, without asking for permission. The only requirement is that you attribute Tyler Vigen. Attribution can take many different forms. If you leave the "tylervigen.com" link in the image, that satisfies it just fine. If you remove it and move it to a footnote, that's fine too. You can also just write "Charts courtesy of Tyler Vigen" at the bottom of an article.
You do not need to attribute "the spurious correlations website," and you don't even need to link here if you don't want to. I don't gain anything from pageviews. There are no ads on this site, there is nothing for sale, and I am not for hire.
For the record, I am just one person. Tyler Vigen, he/him/his. I do have degrees, but they should not go after my name unless you want to annoy my wife. If that is your goal, then go ahead and cite me as "Tyler Vigen, A.A. A.A.S. B.A. J.D." Otherwise it is just "Tyler Vigen."
When spoken, my last name is pronounced "vegan," like I don't eat meat.
Full license details.
For more on re-use permissions, or to get a signed release form, see tylervigen.com/permission.
Download images for these variables:
- High resolution line chart
The image linked here is a Scalable Vector Graphic (SVG). It is the highest resolution that is possible to achieve. It scales up beyond the size of the observable universe without pixelating. You do not need to email me asking if I have a higher resolution image. I do not. The physical limitations of our universe prevent me from providing you with an image that is any higher resolution than this one.
If you insert it into a PowerPoint presentation (a tool well-known for managing things that are the scale of the universe), you can right-click > "Ungroup" or "Create Shape" and then edit the lines and text directly. You can also change the colors this way.
Alternatively you can use a tool like Inkscape.
- High resolution line chart, optimized for mobile
- Alternative high resolution line chart
- Scatterplot
- Portable line chart (png)
- Portable line chart (png), optimized for mobile
- Line chart for only Total Runs Scored by Chicago Cubs Team in National League (Central and East Division)
- Line chart for only Runs scored by the New York Mets
- AI-generated correlation image
- The spurious research paper: Batting Around the Numbers: A Statistical Analysis of the Relationship Between Chicago Cubs' Total Runs Scored and New York Mets' Runs Scored
Correlation ID: 3086 · Black Variable ID: 206 · Red Variable ID: 4365