Population Projections using R - Including dynamic graphical visualisations

A textbook on population projections with R.

Farid FLICI

Last Revision: 2020-04-30

Farid FLICI (2020). Populations Projections Using R - Including Dynamic graphical Visualisations. A textbook published with Gitbook, Available at: [https://farid-flici.gitbook.io/pop-proj-dz/], Version of 2020-04-30.

In this textbook, we are going to illustrate how to perform populations projections using the Cohort-Component Method using simple R functions and without using population projections specific Packages such as popdemo. We use the algerian population data for our case study. Then, we are going to show how to carry-out practical plots of population pyramid.

1. Population Projections

1.1. Introduction to population projections

$$a$$can be estimated based on the historical recorded values.

1.2. Data Requirements

All datasets need to be extended on the period going from the reference yars, 2015 in our case, to the horizon of the projection, 2070 in our case.

We associated the datasets to this textbook as embded files which can be downloaded from the links provided below.

  • Baseline population: We use the population of 2015, by detailed ages, as a baseline population.

This population pyramid, for males and females, can also be downloaded fom :[http://www.cread.dz/Actuarial_Demography/Donnees_Site/Detailed_Ages_Proj.xlsx]

  • The projected Age-Specific mortality Rates (ASMRs) from 0 to 120 years for the period from 2015 to 2017 were driven from the coherent mortality foracst of Flici (2016a) following the methodology of Hyndman et al. (2013).

The dataset of ASMRs of males and females can also be downloaded from [ http://www.cread.dz/Actuarial_Demography/Donnees_Site/Projected_ASMRs.xlsx].

  • The immigration sold is supposed to be null.

  • The number of males corresponding to 1 female among newborns is equal to 1.045 according to historical data.

1.3. Data Preparation

1.3.1. Baseline population pyramid

For the needs of this work, we use the population pyramid of Algeria in the begening of 2015 for males and females for the ages 0-99 years old.

We upload the data file:

Baseline_pop_dz_2015 <-
read.table("https://firebasestorage.googleapis.com/v0/b/gitbook-28427.appspot.com/o/assets%2F-M69GC8Q928dOzF6PVLf%2F-M69J6T3gp8XoKgj42Q6%2F-M69JWbNPOq-h1hHgkYv%2Fbaseline_pop_dz_2015.txt?alt=media&token=7735f959-b6c4-43c1-a5c5-294e4825a144", header=T)

We separate males and females into different datasheets, namely PM15 and PF15 as accronyms of "Population of males 2015" and "Population of females 2015".

PM15<-as.matrix(Baseline_pop_dz_2015$males[1:100])
rownames(PM15)<-c(0:99) 

PF15<-as.matrix(Baseline_pop_dz_2015$females[1:100])
rownames(PF15)<-c(0:99)

1.3.2. Mortality forecast

We need to upload the mortality surface for males and females.

Male mortality

MM<- read.table("https://firebasestorage.googleapis.com/v0/b/gitbook-28427.appspot.com/o/assets%2F-M69GC8Q928dOzF6PVLf%2F-M69J6T3gp8XoKgj42Q6%2F-M69J_65GSqJfyB1hQZ4%2FMort_forecast_males.txt?alt=media&token=ce303331-93ca-40ce-b2a3-fb50dfcc0bea", header=T)
MM<-as.matrix(MM[1:121,2:57])
rownames(MM)<-c(0:120)
colnames(MM)<-c(2015:2070)

Female Mortality

MF<- read.table("https://firebasestorage.googleapis.com/v0/b/gitbook-28427.appspot.com/o/assets%2F-M69GC8Q928dOzF6PVLf%2F-M69J6T3gp8XoKgj42Q6%2F-M69JamiW2fVfbZC0gia%2FMort_forecast_females.txt?alt=media&token=8454d5f6-3270-45a0-826c-09de6bfe750d", header=T)
MF<- as.matrix(MF[1:121,2:57])
rownames(MF)<-c(0:120)
colnames(MF)<-c(2015:2070)

1.3.3. Fertility Forecast

Upload the data file, and we name it as (FR) as "Fertility".

FR<-read.table("https://firebasestorage.googleapis.com/v0/b/gitbook-28427.appspot.com/o/assets%2F-M69GC8Q928dOzF6PVLf%2F-M69J6T3gp8XoKgj42Q6%2F-M69JcVOtWsH-GQUmIiu%2FFert_forecast.txt?alt=media&token=cc90ae5b-dcae-49d9-86e6-b9244fd4549d", header=T)
FR<- as.matrix(FR[1:35, 2:57])
rownames(FR)<-c(15:49)
colnames(FR)<-c(2015:2070)

1.4. Projection Results

First, we create an ampty matrix to receive the projection results (PopM for males and PopF for females). This matrix should be of a dimension (n=121 * t = 56).

PopM<-matrix(0,nrow=121,ncol=56)
rownames(PopM)<-c(0:120)
colnames(PopM)<-c(2015:2070)

Then,

PopF<-matrix(0,nrow=121,ncol=56)
rownames(PopF)<-c(0:120)
colnames(PopF)<-c(2015:2070)

Then, we copy the PM15 into the first row in PopM (PF15 into PopF[,1])

PopM[1:121,1]<-rbind(PM15,as.matrix(rep(0,21),ncol=1,nrow=21))
PopF[1:121,1]<-rbind(PF15,as.matrix(rep(0,21),ncol=1,nrow=21))
PopM<-as.matrix(PopM)
PopF<-as.matrix(PopF)

$$B_{t-1}=\sum_{s=15}^{49} PFM_{s,t-1}*f_{s,t-1}$$

We estimate the Mid-year population (PFM as an average of the populations at the begening and the end of the previous year). It makes:

$$B_{t-1}=\sum_{s=15}^{49} \frac{PF_{s,t-1}+PF_{s,t}}{2}*f_{s,t-1}$$

a=1.045
for (i in 2 : 56) {
for (j in 2: 121) {
PopM[j,i]<-PopM[j-1,i-1]*(1-MM[j-1,i-1])
PopF[j,i]<-PopF[j-1,i-1]*(1-MF[j-1,i-1])
}
PopM[1,i]<-as.matrix(t(PopF[16:50,i-1]+PopF[16:50,i])/2)%*%as.matrix(FR[,i-1])*(a/(1+a))
PopF[1,i]<-as.matrix(t(PopF[16:50,i-1]+PopF[16:50,i])/2)%*%as.matrix(FR[,i-1])*(1/(1+a)) 
}

2. Graphical Visualisation

2.1. How to plot a population pyramid in R?

First, we need to call ggplot

library(ggplot2)

A first example

i<-40
# We set 40 just as an example
year1<-as.character(2015+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram, aes(x=age))
B<-A+ geom_bar(aes(y=males),fill="blue",stat="identity",width=0.75)
C<- B+ geom_bar(aes(y=females),fill="red",stat="identity",width=0.75)
print(C)

2.2. How to reverse the axis x and y ?

by adding + coord_flip()

i<-40
year1<-as.character(2015+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram, aes(x=age))
B<-A+ geom_bar(aes(y=males),fill="blue",stat="identity",width=0.75)
C<- B+ geom_bar(aes(y=females),fill="red",stat="identity",width=0.75)
D<- C+coord_flip()
print(D)

2.3. Ho to redefine the axis labels?

by adding scale_y_continuous(breaks = seq(-600000, 600000, 200000), labels = paste(as.character(c(seq(6, 0, -2),seq(2,6,2))), "x105"), limits=c(-650000,650000))

It makes

i<-40
year1<-as.character(2015+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram, aes(x=age))
B<-A+ geom_bar(aes(y=males),fill="blue",stat="identity",width=0.75)
C<- B+ geom_bar(aes(y=females),fill="red",stat="identity",width=0.75)
D<- C+coord_flip()
E<-D+ scale_y_continuous(breaks = seq(-600000, 600000, 200000),
labels = paste(as.character(c(seq(6, 0, -2),seq(2,6,2))), "x105"),
limits=c(-650000,650000))
print(E)

2.4. How to rename Axis titles?

by adding

xlab("Age") + ylab("Population number")

It makes :

i<-40
year1<-as.character(2015+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram, aes(x=age))
B<-A+ geom_bar(aes(y=males),fill="blue",stat="identity",width=0.75)
C<- B+ geom_bar(aes(y=females),fill="red",stat="identity",width=0.75)
D<- C+coord_flip()
E<-D+ scale_y_continuous(breaks = seq(-600000, 600000, 200000),
labels = paste(as.character(c(seq(6, 0, -2),seq(2,6,2))), "x105"),
limits=c(-650000,650000))
F<-E+ xlab("Age") + ylab("Population number")
print(F

2.5. how to add a title?

by adding

ggtitle(paste("Population Pyramid of Algeria:", year1)

It makes :

i<-40
year1<-as.character(2015+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram, aes(x=age))
B<-A+ geom_bar(aes(y=males),fill="blue",stat="identity",width=0.75)
C<- B+ geom_bar(aes(y=females),fill="red",stat="identity",width=0.75)
D<- C+coord_flip()
E<-D+ scale_y_continuous(breaks = seq(-600000, 600000, 200000),
labels = paste(as.character(c(seq(6, 0, -2),seq(2,6,2))), "x105"),
limits=c(-650000,650000))
F<-E+ xlab("Age") + ylab("Population number")
G<-F+ ggtitle(paste("Population Pyramid of Algeria:", year1)))
print(G)

2.6. How to add a legend? Or any text to the plot?

by adding

annotate('text', x = 95, y = 250000, label = 'Females', size = 5, colour="red") + annotate('text', x = 95, y = - 250000, label = 'Males', size = 5, colour="blue")

It makes :

i<-40
year1<-as.character(2015+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram, aes(x=age))
B<-A+ geom_bar(aes(y=males),fill="blue",stat="identity",width=0.75)
C<- B+ geom_bar(aes(y=females),fill="red",stat="identity",width=0.75)
D<- C+coord_flip()
E<-D+ scale_y_continuous(breaks = seq(-600000, 600000, 200000),
labels = paste(as.character(c(seq(6, 0, -2),seq(2,6,2))), "x105"),
limits=c(-650000,650000))
F<-E+ xlab("Age") + ylab("Population number")
G<-F+ ggtitle(paste("Population Pyramid of Algeria:", year1)))
H<-G+ annotate('text', x = 95, y = 250000, label = 'Females', size = 5,
colour="red") + annotate('text', x = 95, y = - 250000, label = 'Males', size = 5,
colour="blue")
print(H)

2.7. Other options

by adding theme(plot.title = element_text(hjust = 0.5,size=12,face="bold"), panel.border = element_rect(colour = "black",fill=NA,size=1), axis.text = element_text(size=12,face="bold"), panel.background = element_rect(fill="white")))

It makes :

i<-40
year1<-as.character(2015+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram, aes(x=age))
B<-A+ geom_bar(aes(y=males),fill="blue",stat="identity",width=0.75)
C<- B+ geom_bar(aes(y=females),fill="red",stat="identity",width=0.75)
D<- C+coord_flip()
E<-D+ scale_y_continuous(breaks = seq(-600000, 600000, 200000),
labels = paste(as.character(c(seq(6, 0, -2),seq(2,6,2))), "x105"),
limits=c(-650000,650000))
F<-E+ xlab("Age") + ylab("Population number")
G<-F+ ggtitle(paste("Population Pyramid of Algeria:", year1)))
H<-G+ annotate('text', x = 95, y = 250000, label = 'Females', size = 5,
colour="red")
I<-H")+theme(plot.title = element_text(hjust = 0.5,size=12,face="bold"),
panel.border = element_rect(colour = "black",fill=NA,size=1), axis.text =
element_text(size=12,face="bold"))
print(H)

2.8. A little bit more

In order to change the background color from gray to white we just need to put:

panel.background = element_rect(fill=NA) into +theme()

It makes :

i<-40
year1<-as.character(2015+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram, aes(x=age))
B<-A+ geom_bar(aes(y=males),fill="blue",stat="identity",width=0.75)
C<- B+ geom_bar(aes(y=females),fill="red",stat="identity",width=0.75)
D<- C+coord_flip()
E<-D+ scale_y_continuous(breaks = seq(-600000, 600000, 200000), 
labels = paste(as.character(c(seq(6, 0, -2),seq(2,6,2))), "x105"),
limits=c(-650000,650000))
F<-E+ xlab("Age") + ylab("Population number")
G<-F+ ggtitle(paste("Population Pyramid of Algeria:", year1)))
H<-G+ annotate('text', x = 95, y = 250000, label = 'Females', size = 5, colour="red")
I<-H+theme(plot.title = element_text(hjust = 0.5,size=12,face="bold"),
panel.border = element_rect(colour = "black",fill=NA,size=1), axis.text =
element_text(size=12,face="bold"), panel.background = element_rect(fill= NA))
print(H)

3. How to Make it moving moving ???

Required packages:

library(animation)
library(dplyr)
library(ggthemes)

The code to run:

saveGIF({
for (i in 1:56)
{
year1<-as.character(2014+i)
pyram<-cbind.data.frame(seq(0,110,1),-PopM[1:111,i],PopF[1:111,i])
colnames(pyram)<-c("age","males","females")
A<-ggplot(pyram,aes(x=age))+geom_bar(aes(y=males),fill="blue",stat=
"identity",width=0.75)+geom_bar(aes(y=females),fill="red",stat="identity",
width=0.75)+ coord_flip()+ scale_y_continuous(breaks= seq(-600000,600000,
200000), labels = paste(as.character(c(seq(6,0,-2),seq(2,6,2))),"x105"),
limits=c(-650000,650000))+xlab("Age")+ylab("Population number")+ggtitle(
paste("Population Pyramid of Algeria:", year1))+annotate('text', x=95, y=
250000,label='Females', size = 5,colour="red")+annotate('text', x=95, y=
250000, label='Females', size = 5, colour="red")+annotate('text', x=95,
y =-250000, label='Males', size = 5, colour="blue")+ theme(plot.title =
element_text(hjust = 0.5, size=12, face="bold"), panel.border =
element_rect(colour = "black", fill=NA,size=1),axis.text = element_text(
size=12, face="bold"))
print(A)
}}
, movie.name = 'pyram.gif', interval = 0.3, ani.width = 1100, ani.height = 820)

After having executed the R-code above, a message about the progression of the GIF creation is posted on the screen:

Once you get this message, the gif file can be found in the same directory as the R-project location. To open it, you should use any internet browser or to put it on power point full screen.

In order to visualize the dynamic pyramid, you can insert the GIF plot into a power point file and to make it on full screen view or you can just open the GIF using any internet browser.

Converter Error Message

Sometimes, when trying to run the code to create the GIF, an error message appears and it concerns the converter ImageMagick. This last is not a part of R, and i is an independent graphical tool which allow the creation of a GIF correctely. What do you need to do in such a case, is to install ImageMagic version 7 or higher because the old versions work with converter.exe which is not adapted to the newest versions of R-studio. The compatible converter is magick . This application can be downloaded from: [www.imagemagick.org/script/download.php].

Then, you need to run the following code in R to update the location of the converter being installed:

ani.options(convert = 'C:/PROGRA~1/ImageMagick-7.0.7-Q8/convert.exe) Before to run again the saveGIF() .

4. Application Examples

4.1. Dynamic Total Fertility Rate projection

  • Now, make it dynami

4.2. Dynamic life expectancy projection:

  • Make it movING

References

  • Flici, F. (2016a). Coherent mortality forecasting for the Algerian population. Presented at Samos Conference in Actuarial Sciences and Finance, Samos, Greece (May).

  • Flici, F. (2016b). Projection des taux de fécondité de la population algérienne á l'horizon 2050. MPRA Paper No. 99077, posted 12 Mar 2020. [https://mpra.ub.uni-muenchen.de/99077/1/MPRA_paper_99077.pdf]

  • Flici, F. (2017). Longevity and pension plan sustainability in Algerie: Taking the re- tirees mortality experience into account. Doctoral dissertation, Higher National School of Statistics and Applied Economics (ENSSEA), Kolea, Algeria.

  • Hyndman, R. J., Booth, H., & Yasmeen, F. (2013). Coherent mortality forecasting: the product-ratio method with functional time series models. Demography, 50 (1), 261-283.

  • Lee, R. D. (1993). Modeling and forecasting the time series of US fertility: Age distribution, range, and ultimate level. International Journal of Forecasting, 9(2), 187-202.

Contact SupportGitHub Status@githubstatus

Last updated