Climate change has been an urgent topic this century. New regulations and directives are regularly introduced to reduce CO2 emissions in the hope of slowing the rise in land temperatures. To that end, the researchers analysed two datasets to uncover any relationships that may exist between them.
The researchers proposed two research questions to guide the analysis of these large datasets:
The researchers hypothesise that there is a positive correlation between CO2 emissions and global land warming, so that increases in CO2 emissions are accompanied by increases in land temperature.
The null hypothesis is that there is no correlation between CO2 emissions and land temperature, and that the two are independent of each other.
2.1. Which countries produce high CO2 emissions but are less affected by global land warming? (evil emitters)
2.2. Which countries produce low CO2 emissions but are highly affected by global land warming? (unlucky receivers)
The researchers use a dataset containing global land temperature by country. Source: https://data.world/data-society/global-climate-change-data
## Rows: 577,462
## Columns: 4
## $ dt <date> 1743-11-01, 1743-12-01, 1744-01-01, 174…
## $ AverageTemperature <dbl> 4.384, NA, NA, NA, NA, 1.530, 6.702, 11.…
## $ AverageTemperatureUncertainty <dbl> 2.294, NA, NA, NA, NA, 4.680, 1.789, 1.5…
## $ Country <chr> "Åland", "Åland", "Åland", "Åland", "Åla…
Country is a categorical variable with 243 unique
values, but these also include continents: Africa, Antarctica, Asia,
Europe, North America, Oceania, South America.
Interestingly, some countries are represented by two different values: Denmark (Europe) and Denmark, France (Europe) and France, Netherlands (Europe) and Netherlands, and United Kingdom (Europe) and United Kingdom. The two series in each pair are almost identical, except for Denmark.
dt is a discrete variable representing the date.
The dataset contains one record for the 1st of each month. Here are the
10 latest record dates:
## [1] "2013-09-01" "2013-08-01" "2013-07-01" "2013-06-01" "2013-05-01"
## [6] "2013-04-01" "2013-03-01" "2013-02-01" "2013-01-01" "2012-12-01"
AverageTemperature is a continuous numerical variable
representing a country's land temperature in each month. The value
rises and falls in a regular pattern with the seasons.
We can see that
AverageTemperature contains many NAs,
because data is available for only some countries in the early years.
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -37.66 10.03 20.90 17.19 25.81 38.84 32651
AverageTemperatureUncertainty is also given, but we
do not use it.
The second dataset contains annual CO2 emissions by country. Source: https://ourworldindata.org/co2-dataset-sources
## Rows: 66,984
## Columns: 4
## $ Entity <chr> "Afghanistan", "Afghanistan", "Afghanistan", "A…
## $ Code <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG"…
## $ Year <dbl> 1750, 1751, 1752, 1753, 1754, 1755, 1756, 1757,…
## $ `Annual CO2 emissions` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
Entity is a categorical variable with 247 unique values.
These include countries, but also:
- Continents: Africa, Antarctica, Asia, Europe, North America, Oceania, South America
- Continents with exclusions: Asia (excl. China and India), Europe (excl. EU-27), Europe (excl. EU-28), European Union (27), European Union (28), North America (excl. USA)
- Income groups: Low-income countries, High-income countries, Upper-middle-income countries, etc.
- GCP areas: French Equatorial Africa (GCP), French West Africa (GCP), etc.
- Not a country: International transport
Year is a discrete numerical variable. The dataset
contains annual data. Here are the 10 latest record years:
## [1] 2021 2020 2019 2018 2017 2016 2015 2014 2013 2012
Annual CO2 emissions is a continuous numerical variable
representing CO2 emissions in tons. There are no NAs, and values range from 0 to
3.712e+10.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000e+00 0.000e+00 0.000e+00 1.230e+08 1.205e+06 3.712e+10
Code represents the country code, but we do not
plan to use it.
First, we remove the continent rows and the redundant country entries from the
Country column, as discussed in the previous section. This leaves
232 valid unique country values.
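The removal step can be illustrated with a minimal Python sketch. It is not the code used in the report. The function name `keep_country_rows` is hypothetical, and the choice to drop the "(Europe)" variants is an assumption, because the report does not say which copy of each duplicated country was kept.

```python
# Continent labels listed in the report for the Country column.
CONTINENTS = {
    "Africa", "Antarctica", "Asia", "Europe",
    "North America", "Oceania", "South America",
}

# Assumption: the "(Europe)" variants are the redundant ones that get dropped.
# The report names the duplicated pairs but not which member was removed.
REDUNDANT = {
    "Denmark (Europe)", "France (Europe)",
    "Netherlands (Europe)", "United Kingdom (Europe)",
}


def keep_country_rows(rows):
    """Return only the rows whose Country is neither a continent nor a duplicate."""
    excluded = CONTINENTS | REDUNDANT
    return [row for row in rows if row["Country"] not in excluded]


if __name__ == "__main__":
    sample = [
        {"Country": "Denmark", "dt": "2013-01-01"},
        {"Country": "Denmark (Europe)", "dt": "2013-01-01"},
        {"Country": "Asia", "dt": "2013-01-01"},
    ]
    print(keep_country_rows(sample))
```

Removing the 7 continents and 4 duplicates from the 243 original values is consistent with the 232 values reported.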
Then we calculate the annual average temperature for 1913-2012 and drop rows containing NAs:
## # A tibble: 23,130 × 4
## Country Year AverageTemperature AverageTemperatureUncertainty
## <chr> <int> <dbl> <dbl>
## 1 Afghanistan 1913 13.9 0.580
## 2 Afghanistan 1914 14.3 0.757
## 3 Afghanistan 1915 14.9 0.687
## 4 Afghanistan 1916 13.3 0.717
## 5 Afghanistan 1917 13.9 0.716
## 6 Afghanistan 1918 13.5 0.722
## 7 Afghanistan 1919 13.9 0.767
## 8 Afghanistan 1920 13.0 0.808
## 9 Afghanistan 1921 14.2 0.608
## 10 Afghanistan 1922 14.3 0.539
## # … with 23,120 more rows
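As an illustration of the aggregation above, here is a minimal Python sketch that turns monthly records into annual means. It is not the report's own code. The function name and the row layout (dictionaries with `dt`, `AverageTemperature` and `Country` keys, with `None` standing in for NA) are assumptions, and so is the handling of missing values: missing months are skipped, and a country-year with no valid month is left out entirely.

```python
from collections import defaultdict


def annual_mean_temperature(rows, first_year=1913, last_year=2012):
    """Average monthly temperatures per (country, year) within the year range."""
    sums = defaultdict(lambda: [0.0, 0])  # (country, year) -> [total, count]
    for row in rows:
        year = int(row["dt"][:4])  # dt is an ISO date string such as "1913-01-01"
        temp = row["AverageTemperature"]
        if temp is None or not (first_year <= year <= last_year):
            continue  # assumption: NA months are ignored rather than imputed
        acc = sums[(row["Country"], year)]
        acc[0] += temp
        acc[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


if __name__ == "__main__":
    monthly = [
        {"dt": "1913-01-01", "AverageTemperature": 4.0, "Country": "Estonia"},
        {"dt": "1913-02-01", "AverageTemperature": 6.0, "Country": "Estonia"},
        {"dt": "1913-03-01", "AverageTemperature": None, "Country": "Estonia"},
    ]
    print(annual_mean_temperature(monthly))
```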
First, we remove rows whose Entity value is not a country, as discussed in the previous section. This leaves 221 unique country values.
Then we keep only the records from 1913-2012:
## # A tibble: 22,100 × 4
## Entity Code Year `Annual CO2 emissions`
## <chr> <chr> <dbl> <dbl>
## 1 Afghanistan AFG 1913 0
## 2 Afghanistan AFG 1914 0
## 3 Afghanistan AFG 1915 0
## 4 Afghanistan AFG 1916 0
## 5 Afghanistan AFG 1917 0
## 6 Afghanistan AFG 1918 0
## 7 Afghanistan AFG 1919 0
## 8 Afghanistan AFG 1920 0
## 9 Afghanistan AFG 1921 0
## 10 Afghanistan AFG 1922 0
## # … with 22,090 more rows
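The filtering of the emissions data can be sketched as follows in Python. This is an illustration only, not the report's code. The marker strings are assumptions derived from the non-country categories named earlier in this report, and any other aggregate present in the source data would have to be added to them.

```python
# Substrings that identify the aggregate entities described in the report.
# Assumption: these markers cover only the categories the report mentions.
AGGREGATE_MARKERS = ("(excl.", "(GCP)", "income countries", "European Union")

NON_COUNTRIES = {
    "Africa", "Antarctica", "Asia", "Europe",
    "North America", "Oceania", "South America",
    "International transport",
}


def is_country(entity):
    """True when the entity is not a known continent, aggregate or non-country."""
    if entity in NON_COUNTRIES:
        return False
    return not any(marker in entity for marker in AGGREGATE_MARKERS)


def filter_emissions(rows, first_year=1913, last_year=2012):
    """Keep country rows whose Year lies in the closed range [first_year, last_year]."""
    return [
        row for row in rows
        if is_country(row["Entity"]) and first_year <= row["Year"] <= last_year
    ]
```

With 221 countries and 100 years each, a complete panel has 22,100 rows, which matches the tibble shown above.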
To answer the first research question, the researchers ran a correlation analysis between CO2 emissions and land temperature for the following countries: Estonia, Thailand, South Africa, Australia, United States, China, and India.
All seven of these countries show a positive correlation between CO2 emissions and land temperature, although the coefficient is higher in some countries than in others.
Estonia is the only country whose correlation is not significant (p = 0.21), and its coefficient (r(98) = 0.13) is also smaller than those of the other countries. All other countries show highly significant correlations (p < 0.0001). Australia shows the strongest correlation between CO2 emissions and land temperature, r(98) = 0.69, followed in descending order by China (r(98) = 0.68), India (r(98) = 0.66), South Africa (r(98) = 0.66), Thailand (r(98) = 0.55), and the United States (r(98) = 0.47).
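In the notation r(98), the number in parentheses is the degrees of freedom, n - 2 = 98 for the 100 annual observations from 1913 to 2012. A minimal Python sketch of the Pearson coefficient and the t statistic used to test it is given below. It is an illustration rather than the report's code, the function names are hypothetical, and the p-value itself (which comes from a t distribution with n - 2 degrees of freedom) is left out because it needs a statistics library.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Return (t, degrees of freedom) for testing r against zero; requires |r| < 1."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r)), df


if __name__ == "__main__":
    # With the rounded Estonia coefficient r = 0.13 and n = 100 years:
    t, df = t_statistic(0.13, 100)
    print(f"t({df}) = {t:.3f}")
```

A t value this small on 98 degrees of freedom is in line with Estonia's non-significant result, whereas the coefficients above 0.4 give much larger t values.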
Following the EDA, we revise the two sub-questions as follows:
Because annual temperatures fluctuate from year to year, we estimate each country's temperature change between 1913 and 2012 by fitting a linear regression model to the land temperature dataset.
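One way to turn such a regression into a single temperature change per country is to multiply the fitted slope by the length of the period. The Python sketch below shows this under that assumption; the report does not state exactly how the change was derived from the model, and the function names are hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def temperature_change(years, temps):
    """Fitted change between the first and last year: slope times the year span.

    Assumption: the change is the difference between the regression line's
    values at the last and first year, not the difference of raw observations.
    """
    return ols_slope(years, temps) * (max(years) - min(years))


if __name__ == "__main__":
    years = list(range(1913, 2013))
    # Synthetic series warming by 0.01 degrees per year, for illustration only.
    temps = [10.0 + 0.01 * (year - 1913) for year in years]
    print(round(temperature_change(years, temps), 3))
```

Using the fitted line rather than the raw 1913 and 2012 values keeps a single unusually warm or cold year from dominating the estimate.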
From the model, the temperature change of each country over 1913-2012 is
as follows:
## # A tibble: 232 × 2
## Country `Temperature change`
## <chr> <dbl>
## 1 Afghanistan 1.51
## 2 Åland 1.03
## 3 Albania 0.663
## 4 Algeria 1.20
## 5 American Samoa 0.888
## 6 Andorra 1.20
## 7 Angola 0.859
## 8 Anguilla 1.15
## 9 Antigua And Barbuda 1.16
## 10 Argentina 0.892
## # … with 222 more rows
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4897 0.8779 1.0009 1.0182 1.1577 1.8429
The distribution is, surprisingly, very close to a normal distribution.
Next, we compute each country's total CO2 emissions for 1913-2012 from the CO2 emissions dataset.
## # A tibble: 221 × 2
## Country `CO2 emissions (MT)`
## <chr> <dbl>
## 1 Afghanistan 125.
## 2 Albania 248.
## 3 Algeria 3383.
## 4 Andorra 11.3
## 5 Angola 444.
## 6 Anguilla 2.31
## 7 Antigua and Barbuda 17.7
## 8 Argentina 6880.
## 9 Armenia 327.
## 10 Aruba 69.0
## # … with 211 more rows
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.2 31.1 246.4 11943.9 2327.1 1336251.6
The distribution is extremely skewed to the right: most countries emit relatively little CO2, while a few countries, which can be considered outliers, emit a great deal.
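The totals and the skew can be illustrated with a short Python sketch. It is not the report's code. It assumes that MT means million tons, so that annual values in tons are divided by 1e6, and it uses the gap between mean and median as a quick indicator of right skew.

```python
from collections import defaultdict
from statistics import mean, median


def total_emissions_mt(rows):
    """Sum annual emissions per country and convert tons to million tons (MT).

    Assumption: "MT" in the report's column name stands for million tons.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["Entity"]] += row["Annual CO2 emissions"]
    return {country: tons / 1e6 for country, tons in totals.items()}


def is_right_skewed(values):
    """Rough check: a mean far above the median points to a long right tail."""
    return mean(values) > median(values)


if __name__ == "__main__":
    rows = [
        {"Entity": "A", "Year": 1913, "Annual CO2 emissions": 1_000_000.0},
        {"Entity": "A", "Year": 1914, "Annual CO2 emissions": 2_000_000.0},
        {"Entity": "B", "Year": 1913, "Annual CO2 emissions": 500_000.0},
    ]
    print(total_emissions_mt(rows))
```

In the summary above, the mean (11943.9) is far larger than the median (246.4), which is the pattern this check looks for.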
Then we join the two tibbles (estimated temperature change and total CO2 emissions for each country over 1913-2012) and add two columns, Ratio (temperature change per unit of CO2 emissions) and Group:
## # A tibble: 186 × 5
## Country `Temperature change` `CO2 emissions (MT)` Ratio Group
## <chr> <dbl> <dbl> <dbl> <chr>
## 1 Afghanistan 1.51 125. 0.0120 Neutral
## 2 Albania 0.663 248. 0.00267 Neutral
## 3 Algeria 1.20 3383. 0.000354 Neutral
## 4 Andorra 1.20 11.3 0.107 Unlucky rece…
## 5 Angola 0.859 444. 0.00194 Neutral
## 6 Anguilla 1.15 2.31 0.498 Neutral
## 7 Argentina 0.892 6880. 0.000130 Neutral
## 8 Armenia 1.38 327. 0.00422 Neutral
## 9 Aruba 1.04 69.0 0.0150 Neutral
## 10 Australia 0.954 15053. 0.0000634 Neutral
## # … with 176 more rows
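The join and the two added columns can be sketched in Python as below. This is an illustration, not the report's code. The Ratio follows the tables (temperature change divided by CO2 emissions), but the grouping rule is an assumption: the report does not state its thresholds, so the sketch uses quartiles, labelling a country an evil emitter when its emissions are above the third quartile and its temperature change is below the first, and an unlucky receiver in the opposite case.

```python
from statistics import quantiles


def classify(temp_change, co2_mt):
    """Inner-join two {country: value} dicts and add Ratio and Group.

    Assumption: groups are defined by quartile cut-offs, which the report
    does not confirm. Countries are matched on exactly identical names.
    """
    common = sorted(set(temp_change) & set(co2_mt))  # inner join on name
    t_q1, _, t_q3 = quantiles([temp_change[c] for c in common], n=4, method="inclusive")
    e_q1, _, e_q3 = quantiles([co2_mt[c] for c in common], n=4, method="inclusive")

    rows = []
    for country in common:
        change, emitted = temp_change[country], co2_mt[country]
        if emitted > e_q3 and change < t_q1:
            group = "Evil emitter"
        elif emitted < e_q1 and change > t_q3:
            group = "Unlucky receiver"
        else:
            group = "Neutral"
        rows.append({
            "Country": country,
            "Temperature change": change,
            "CO2 emissions (MT)": emitted,
            "Ratio": change / emitted,
            "Group": group,
        })
    return rows
```

Because the join matches names exactly, spelling differences drop countries. For instance, the temperature table above lists "Antigua And Barbuda" while the emissions table lists "Antigua and Barbuda", and that country does not appear in the joined output, which may partly explain why only 186 countries remain.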
Here are the countries in each group, with their temperature change and CO2 emissions for 1913-2012:
The above chart is difficult to read. Because many countries have
relatively low emissions, and we are not interested in the neutral countries,
here is a plot containing only the evil emitters and unlucky
receivers:
The evil emitters, ranked by the smallest ratio of temperature change to CO2 emissions, are:
## # A tibble: 9 × 4
## Country `Temperature change` `CO2 emissions (MT)` Ratio
## <chr> <dbl> <dbl> <dbl>
## 1 India 0.854 35051. 0.0000244
## 2 Mexico 0.650 16390. 0.0000397
## 3 Turkey 0.793 7696. 0.000103
## 4 Thailand 0.760 5208. 0.000146
## 5 Greece 0.558 3492. 0.000160
## 6 Nigeria 0.553 2931. 0.000189
## 7 Egypt 0.869 4520. 0.000192
## 8 Denmark 0.720 3629. 0.000198
## 9 Bulgaria 0.800 3472. 0.000230
The unlucky receivers, ranked by the highest ratio of temperature change to CO2 emissions, are:
## # A tibble: 6 × 4
## Country `Temperature change` `CO2 emissions (MT)` Ratio
## <chr> <dbl> <dbl> <dbl>
## 1 Montserrat 1.16 1.31 0.887
## 2 Dominica 1.16 3.49 0.333
## 3 Liechtenstein 1.27 4.86 0.261
## 4 Grenada 1.16 5.98 0.194
## 5 Saint Lucia 1.16 10.5 0.111
## 6 Andorra 1.20 11.3 0.107
However, judging CO2 emissions at the whole-country level might not be fair, since smaller countries are likely to produce less CO2. CO2 emissions per capita could be used in a future study.
The data cleaning process in this report gave considerable insight into how annual land temperature has been increasing in our chosen countries: Australia, China, Estonia, India, South Africa, Thailand, and the United States. Similarly, CO2 emissions have been increasing in all of these countries, although more sharply in some than in others. China, the United States and India have seen large rises in CO2 emissions, although the United States appears to be slowing its rise, with a drop in emissions in the last decade.
Looking at the correlation between CO2 emissions and annual land temperature, there is a clear positive correlation between the two variables. However, the correlation is not strong enough to support any clear inference that CO2 emissions affect land temperature. Rather, it seems probable that other variables should also be included in such an analysis. There is an opportunity to build on the current research by integrating further datasets.
Even though climate change is a global disaster, its effect differs from country to country. From the analysis, we found some countries that have high CO2 emissions but low land temperature change and, conversely, some countries that have low CO2 emissions but high land temperature change. A limitation of our methodology is that we used whole-country CO2 emissions without accounting for population size.
By combining historical CO2 emissions and land temperature change data by country, we can draw insights into the relationship between them. The study shows a clear positive relationship between CO2 emissions and land temperature change within each country, but the outcome does not seem fair when we compare the ratio of land temperature change to CO2 emissions across countries around the world. These insights emphasise that climate change is still a global issue that requires serious action from every country.