Image by Kyle Glenn

Analysis and Visualization

A data visualization analysis of the relationships among a country's attributes, such as population growth, birth and mortality rates, and GDP.

Introducing the Problem

Introducing the Data

Pre-Processing Data

Data Visualization

Storytelling
 

A country's GDP can tell us a lot about its progress and financial stability. I wanted to take a closer look at the correlation between GDP and various attributes of a country, such as infant mortality, birth rate, death rate and other factors that could be affected by population growth.

Questions to answer: 

  • What is the median GDP per capita in each region, and how do the regions' birth and death rates compare?

  • Is there a correlation between infant mortality, birth rates, death rates and GDP?

The data I chose is the “Countries of the World” dataset, found on Kaggle (https://www.kaggle.com/datasets/fernandol/countries-of-the-world?datasetId=23752&searchQuery=mortality). The data was compiled between 1970 and 2017 and contains attributes such as population density, birth and death rates, GDP and literacy rates. Although the dataset offers more attributes, for this project I kept only the fields I was interested in analyzing.

The first thing I did to clean this data was eliminate null values within the fields I would use. The dataset used commas instead of decimal points in its numbers, so Python read them as strings; I converted every such value to use a decimal point so that it could be read as a float instead. Beyond handling missing values, I filtered out the fields that I would not be using for this project, as stated before.
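The cleaning steps described above could look roughly like the sketch below. It is not the exact code used for the project: the `clean` helper, the `Country` column and the three sample rows are assumptions made up for illustration, while the other column names are the ones used in the code at the end of this page.

```python
import io
import pandas as pd

# Hypothetical three-row sample in the dataset's style: decimal commas inside
# quoted fields and one missing value. These rows are made up for illustration.
SAMPLE_CSV = '''Country,Region,GDP ($ per capita),Birthrate,Deathrate,Infant mortality (per 1000 births)
A,REGION X,1000,"40,5","15,2","80,1"
B,REGION X,2000,"35,0",,"60,3"
C,REGION Y,30000,"10,2","9,1","4,5"
'''

NUMERIC_COLS = ['GDP ($ per capita)', 'Birthrate', 'Deathrate',
                'Infant mortality (per 1000 births)']

def clean(df):
    """Keep the fields of interest, convert decimal commas, drop missing rows."""
    out = df[['Country', 'Region'] + NUMERIC_COLS].copy()
    for col in NUMERIC_COLS:
        # Cast to str first so already-numeric columns are handled uniformly.
        as_text = out[col].astype(str).str.replace(',', '.', regex=False)
        # Anything that is not a number (including missing values) becomes NaN.
        out[col] = pd.to_numeric(as_text, errors='coerce')
    return out.dropna(subset=NUMERIC_COLS).reset_index(drop=True)

df2 = clean(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(df2.dtypes)
```

An alternative worth considering is passing `decimal=','` to `pd.read_csv`, which asks pandas to parse decimal commas at load time instead of converting the strings afterwards.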


Figure: median values by region (Medians.png)

Figure: birthrates by region (Birthrates.png)

Figure: death rates by region (Deathrates.png)

Based on the data, we can form a clearer picture of how a region's GDP relates to its birth and death rates. Looking first at the regions with the lowest median GDP per capita, Asia, Sub-Saharan Africa and the C.W. of Ind. States (Commonwealth of Independent States) rank lowest of all regions, while North America, Western Europe and the Near East (Middle East) have the three highest medians. Comparing birth and death rates across these regions, the lower-GDP regions have higher birth rates, but they also have a significantly higher median infant mortality rate (per 1,000 births). In North America, Western Europe and the Middle East, by contrast, both birth rates and infant mortality rates are lower. Death rates, on the other hand, do not show a strong relationship to either GDP or birth rates.

For example, Sub-Saharan Africa had the lowest median GDP, at $1,300 per capita, and the highest infant mortality rate, at 76 infant deaths per 1,000 births. North America and Western Europe had the highest median GDPs, at $36,900 and $27,700 per capita respectively, with infant mortality rates of 7.5 and 4.5 deaths per 1,000 births. Infant mortality in Sub-Saharan Africa is therefore roughly 10 times that of North America.

In conclusion, at the regional level it is clear that countries in regions with lower GDP have higher birth rates accompanied by higher infant mortality rates, which I attribute to the lack of economic stability in those regions.
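The "roughly 10 times" comparison can be checked directly from the two medians quoted above, and the second question could be approached with pairwise correlation coefficients. The sketch below is one possible way to do that, not the analysis actually run for this project: `correlation_with_gdp` is a hypothetical helper, and the `toy` rows are made up purely to show the call.

```python
import pandas as pd

# Medians quoted in the text above (infant deaths per 1,000 births).
ssa_infant_mortality = 76.0
north_america_infant_mortality = 7.5
ratio = ssa_infant_mortality / north_america_infant_mortality
print(f"Sub-Saharan Africa / North America: {ratio:.1f}x")

def correlation_with_gdp(df, gdp_col='GDP ($ per capita)'):
    """Pearson correlation of every other numeric column with GDP per capita."""
    corr = df.corr(numeric_only=True)[gdp_col]
    return corr.drop(gdp_col)

# Made-up rows, only to show the call; these are not values from the dataset.
toy = pd.DataFrame({
    'GDP ($ per capita)': [1000, 5000, 20000, 35000],
    'Birthrate': [40.0, 30.0, 15.0, 11.0],
    'Infant mortality (per 1000 births)': [80.0, 40.0, 10.0, 5.0],
})
print(correlation_with_gdp(toy))
```

Run on the cleaned DataFrame, a negative coefficient for birth rate or infant mortality would be consistent with the pattern described above, while a coefficient near zero for death rate would match the weaker relationship noted there.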


Reflecting on the data and its results, I believe several factors could have affected my findings.

One is that I could not examine changes over time: the data was collected over a period of years, but the records are not labeled by year. If the dataset had included years, it would have allowed a better interpretation of how GDP and the other attributes of each country progressed.

Data collected from: https://www.kaggle.com/datasets/fernandol/countries-of-the-world?datasetId=23752&searchQuery=mortality

Code (run on df2, the DataFrame holding the data):

    import matplotlib.pyplot as plt

    # Median of each attribute per region
    df2.groupby('Region')[['GDP ($ per capita)', 'Birthrate', 'Deathrate', 'Population', 'Infant mortality (per 1000 births)']].median()

    # Histograms of each attribute, one panel per region
    df2.hist(column='GDP ($ per capita)', by='Region', figsize=(10,12), bins=5)
    df2.hist(column='Birthrate', by='Region', figsize=(10,11), bins=5)
    df2.hist(column='Deathrate', by='Region', figsize=(10,11), bins=5)
    plt.show()

References & Codes

 

Impact Session
 
