Popularity of the first name Colin correlates with Petroleum consumption in Yemen (r=0.782)

	1980	1981	1982	1983	1984	1985	1986	1987	1988	1989	1990	1991	1992	1993	1994	1995	1996	1997	1998	1999	2000	2001	2002	2003	2004	2005	2006	2007	2008	2009	2010	2011	2012	2013	2014	2015	2016	2017	2018	2019	2020	2021
Popularity of the first name Colin (Babies born)	1211	1243	1545	1571	1785	1756	1706	2116	2273	2621	2802	2309	2641	2909	2823	2635	2964	3109	3083	3086	3263	3180	3315	4883	5147	4540	3865	3614	3730	3656	3489	3272	3022	3040	2888	2429	2052	1666	1477	1393	1181	1293
Petroleum consumption in Yemen (Thousand Barrels/Day)	45	46	54	50	62	69	66.4	66.1	67.7	73.4	76.3	78.8	84	70	65.6	73	73.4	78.1	84.9	61	101.316	109.485	111.294	126.873	133.989	141.3	141.277	150.033	156.196	165.801	148.813	127.296	112.804	156	154	73.2607	62.2702	59	57	75.7121	55.4342	62.4283

# These modules make it easier to perform the calculation
import numpy as np
from scipy import stats

# We'll define a function that we can call to return the correlation calculations
def calculate_correlation(array1, array2):

    # Calculate Pearson correlation coefficient and p-value
    correlation, p_value = stats.pearsonr(array1, array2)

    # Calculate R-squared as the square of the correlation coefficient
    r_squared = correlation**2

    return correlation, r_squared, p_value

# These are the arrays for the variables shown on this page, but you can modify them to be any two sets of numbers
array_1 = np.array([1211,1243,1545,1571,1785,1756,1706,2116,2273,2621,2802,2309,2641,2909,2823,2635,2964,3109,3083,3086,3263,3180,3315,4883,5147,4540,3865,3614,3730,3656,3489,3272,3022,3040,2888,2429,2052,1666,1477,1393,1181,1293,])
array_2 = np.array([45,46,54,50,62,69,66.4,66.1,67.7,73.4,76.3,78.8,84,70,65.6,73,73.4,78.1,84.9,61,101.316,109.485,111.294,126.873,133.989,141.3,141.277,150.033,156.196,165.801,148.813,127.296,112.804,156,154,73.2607,62.2702,59,57,75.7121,55.4342,62.4283,])
array_1_name = "Popularity of the first name Colin"
array_2_name = "Petroleum consumption in Yemen"

# Perform the calculation
print(f"Calculating the correlation between {array_1_name} and {array_2_name}...")
correlation, r_squared, p_value = calculate_correlation(array_1, array_2)

# Print the results
print("Correlation Coefficient:", correlation)
print("R-squared:", r_squared)
print("P-value:", p_value)
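
If you'd like to see what the Pearson calculation is doing under the hood, here is a minimal sketch that computes the coefficient directly from its definition: the sum of the products of the deviations from each mean, divided by the square root of the product of the summed squared deviations. The function name pearson_r and the five-value sample (the 1980-1984 columns of the table above) are assumptions made for illustration and are not part of the original script. The last line cross-checks the result against NumPy's built-in np.corrcoef.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient computed directly from its definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.size < 2:
        raise ValueError("inputs must have the same length, with at least two values")

    # Deviations of each value from its series mean
    dx = x - x.mean()
    dy = y - y.mean()

    # Covariation divided by the product of the two spreads
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Illustrative sample only: the first five values of each series (1980-1984)
colin = [1211, 1243, 1545, 1571, 1785]
yemen = [45, 46, 54, 50, 62]

r = pearson_r(colin, yemen)
print("Correlation Coefficient (by hand):", r)
print("Matches np.corrcoef:", bool(np.isclose(r, np.corrcoef(colin, yemen)[0, 1])))
```

Passing the full array_1 and array_2 from the script above to pearson_r should give the same coefficient that stats.pearsonr reports, up to floating-point rounding.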

