knitr::kable(head(life_table))

Year	Age	mx	qx	ax	lx	dx	Lx	Tx	ex	Country	Gender
1816	0	0.20534	0.17972	0.31	100000	17972	87524	4009912	40.10	France	Both
1816	1	0.04669	0.04562	0.50	82028	3742	80156	3922388	47.82	France	Both
1816	2	0.03412	0.03355	0.50	78285	2626	76972	3842232	49.08	France	Both
1816	3	0.02304	0.02277	0.50	75659	1723	74798	3765260	49.77	France	Both
1816	4	0.01604	0.01591	0.50	73936	1176	73348	3690462	49.91	France	Both
1816	5	0.01373	0.01364	0.50	72760	992	72264	3617114	49.71	France	Both

Western countries in 1948

Plot log-Qx for all countries for Year 1948

life_table48 <- life_table %>% filter( Year==1948) %>%
  ggplot() +
  geom_point(mapping = aes(y= qx, x= Age,col= Country),alpha=0.35) +
  scale_y_log10() +
  ggtitle("Mortality quotients  ")+
  labs(subtitle = "All Countries at all ages for year 1948")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +xlab("Age") +ylab("log mortality")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))

life_table48

Commentaire

Pour l’année 1948, le taux de mortalité est globalement le même pour tout les pays considérés. La mortalité infantile est non négligeable, répartie entre 8% pour l’Espagne à environ 2% en Suède, mais passé les deux premières années de vie, il est très peu probable de mourir jeune. Avant 45 ans, le risque pour tout les pays est inférieur à 1%. Cependant, passé cet âge, le taux de mortalité va commencer à croître de manière importante atteignant 10% vers 75 ans, pour être supérieur à 40% pour les plus de 105 ans, pour l’ensemble des pays.

Plot Eu Countries/USA Qx ratios, for Year 1948

life_table48EUUSA <- life_table %>% filter(Year==1948, Country!="USA") %>%
  ggplot() +
  geom_point(mapping = aes(y= qx/filter(life_table, Year== 1948, Country=="USA")$qx , x= Age ,col= Country), alpha = 0.35) +
  labs(y = "Eu Country/USA Qx Ratio") +
  geom_hline(yintercept = 1)+
  scale_y_log10() +
  ggtitle("Ratios between mortality quotients in European countries")+
  labs(subtitle = "and mortality quotients in the USA in 1948.")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +xlab("Age") +ylab("log mortality ratio USA / European")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="maroon"))

life_table48EUUSA

Commentaire

La mortalité chez les nouveaux nés jusqu’aux trentenaires est beaucoup plus élevée en Europe qu’aux Etats-Unis, en particulier en Italie et en Espagne et dans une certaine mesure en France. Cependant, ce ratio converge assez rapidement vers 1 quand l’Age augmente. Un sursaut est remarquable entre 20-30 ans, mais passé la trentaine, le ratio semble se stabiliser vers 1, D’abord inférieur à 1 de 30 à 80 ans, il est quasiment égal à 1 (légèrement au dessous ou au dessus) à partir des 90 ans, pour l’ensemble des pays d’Europe considérés. Le taux de mortalité chez les jeunes aux Etats-Unis est donc plus faible qu’en Europe, mais il tend à être égal avec les autres pays quand les gens vieillissent.

— Au lendemain de la seconde guerre mondiale, l’Europe est un continent dévaste avec une proportion importante de blesses tandis que les Etats-Unis n’ont pas connu de combats sur leur territoire. Ceci explique la mortalité accrue en Europe par un plus grand nombre de blesses (beaucoup de fronts de combats) entre 0 et 35 ans. — Cependant, les coefficients de mortalité s’harmonisent : on peut y voir plusieurs facteurs explicatifs :

Création de l’OMS le 7 avril 1948 (constitution adoptée en 1946 par 61 états membres) : ! La sante de tous les peuples est une condition fondamentale de la paix du monde et de la sécurité ; elle dépend de la coopération la plus étroite des individus et des Etats. ” (extrait du préambule de sa Constitution).
On note qu’entre 2 ans et 55 ans environ, les quotients de mortalité sont faibles. On peut corréler cela aux progrès de la médecine, aux campagnes de sante publique et a l’absence de conflits armes en Europe Occidentale et aux Etats-Unis.

Death Rate Evoution since WWII

Death Rate Evolution since WWII, using facet

life_tablePer10 <- life_table %>% filter(Gender!="Both", Year %in% seq(1946,2016,by=10))  %>%
   ggplot() +
   geom_point(mapping = aes(x = Age, y = qx, col = as.factor(Year)), alpha = 0.35)+
  ggtitle("Death rates evolution since WW II  ")+
  labs(subtitle = "")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +xlab("Age") +ylab("mortality")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))



life_tablePer10Wrap <- life_tablePer10 + facet_wrap(Country ~ Gender)
#On les affiche
#life_tablePer10
life_tablePer10Wrap

## Death Rate Evolution since WWII throughout time

(ggplot(data = life_table %>%
          filter(Gender!="Both", Year %in% seq(1946,2016,by=10)) ) +
   aes(x = Age, 
      y = qx, 
      text = Country, 
      color = as.factor(Country), 
      frame = Year) +
   geom_point() +
  scale_y_log10() +
  theme(legend.position = "none") +
  labs(title= "Death rates evolution since WW II  ", x = "Age", y = "log mortality")) %>% 
  plotly::ggplotly(height = 500, width=750)

## Warning: Transformation introduced infinite values in continuous y-axis

Sur le graphique animé, il est facile de voir que la mortalité à la naissance diminue avec le temps, ce qui est également le cas pour les âges de 1 à 60 ans.

Mortality Ratio Function

ratio_mortality_rates <- function(df,
                                  reference_year=1946,
                                  target_years=seq(1946, 2016, 10)){
  df1 = df %>%
    filter(Year %in% target_years) %>%
    group_by(Country, Year, Age, Gender) %>%
    select(Year, Age , qx , Country , Gender) 
  
  df2 = df %>%
    filter(Year == reference_year) %>%
    group_by(Country, Year, Age, Gender) %>%
    rename(qx.ref_year = qx )
  
  
  return(select(inner_join(df1,df2, by= c("Age", "Gender", "Country")), Year.x, Age, qx, qx.ref_year, Country, Gender))
}
rm_Table <- ratio_mortality_rates(life_table, reference_year = 1946 , target_years = seq(1946,2016,10))
#On l'affiche un échantilon de la table  
rm_Table %>%
  dplyr::slice_sample(n = 7, replace = FALSE)

## # A tibble: 16,170 x 6
## # Groups:   Country, Age, Gender [2,310]
##    Year.x   Age      qx qx.ref_year Country         Gender
##     <int> <int>   <dbl>       <dbl> <chr>           <chr> 
##  1   1996     0 0.00617      0.0457 England & Wales Both  
##  2   1956     0 0.0240       0.0457 England & Wales Both  
##  3   1966     0 0.0190       0.0457 England & Wales Both  
##  4   2016     0 0.00387      0.0457 England & Wales Both  
##  5   2006     0 0.00511      0.0457 England & Wales Both  
##  6   1976     0 0.0141       0.0457 England & Wales Both  
##  7   1946     0 0.0457       0.0457 England & Wales Both  
##  8   1966     0 0.0163       0.0397 England & Wales Female
##  9   1976     0 0.0120       0.0397 England & Wales Female
## 10   1946     0 0.0397       0.0397 England & Wales Female
## # ... with 16,160 more rows

Plot Mortality Ratios with USA as reference from 1946

rm_TablePerRef <- rm_Table %>% filter(Gender!="Both", Year.x %in% seq(1946,2016,10))  %>%
   ggplot() +
   geom_point(mapping = aes(x = Age, y = qx/qx.ref_year, col = as.factor(Year.x)), alpha = 0.35)+
  scale_y_log10() +
  ggtitle("ratio of mortality rates, reference year 1946")+
  labs(subtitle = "for All ages ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +xlab("Age") +ylab("Ratio of mortality rates, reference year 1946")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))

rm_TablePerRef10Wrap <- rm_TablePerRef + facet_grid(Country ~ Gender)
#On les affiche
rm_TablePerRef

rm_TablePerRef10Wrap

Commentaire

Au premier abord, on pourrait dire que les 14 courbes se ressemblent (7 pays et 1 caractère de genre). Cependant une analyse plus fine en lisant chaque ligne du graphe montre que les coefficients de mortalité augmentent plus tôt chez les hommes (sous-représentativité des femmes dans le marche de l’emploi : rappelons qu’avant 1965 une femme mariée ne pouvait ouvrir un compte bancaire ou signer un contrat de travail sans l’autorisation de son époux). La pénibilité et la dureté de certains métiers masculins est grande (métiers de la sidérurgie, de la pêche et de l’extraction minière,…). Cependant, si l’on compare 1946 et 2016, les statistiques montrent bien évidemment une amélioration de toutes les espérances de vie dont les raisons sont multiples : — amélioration des conditions matérielles de vie dans les pays occidentaux, — le travail est moins pénible physiquement avec une baisse de nombre d’heures de travail, — tertiarisation de l’économie, — meilleur accès aux soins, — progrès de la médecine (en particulier dans les maladies respiratoires et dans le traitement de certains cancers), — plus grande attention des individus à leur sante, hygiène et alimentation,…

Trends

Plot of log-mortality rates at birth and at age 1 & 5

life_table %>%
  filter(Age %in% c(0,1,5), Gender != "Both") %>%
  ggplot(mapping = aes (x = Year , y=qx , col = as.factor(Age))) +
  geom_line()+
  scale_y_log10() +
  ggtitle("Infant & child mortality")+
  labs(subtitle = "Antiboitics , Vaccination , Hygiene ? See the Change ! ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +xlab("Year") +ylab("log Mortality Q_x")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))+
  facet_grid(Country~Gender, scales = 'free')

#### Commentaire Les tendances sont identiques : les quotients de mortalit´e baissent tous en fonction du temps. Pour les enfants de moins 5 ans, la baisse est plus importante (dˆus aux progres de la vaccination et des mesures d’hygiene). Les courbes sont presque identiques avec les mˆemes pics. A l’ˆage de 5 ans, les quotients sont presque nuls. Pour les ˆages 15, 20, 40 et 60, les pentes des droites de regr´ession sont plus faibles que celles des enfants. La baisse est plus importante chez les femmes. Certains pics s’expliquent par les 2 guerres mondiales et les ´epid´emies (par exemple, ´epid´emies de chol´era en Europe en 1831, 1848, 1853, 1865,… et de peste en 1900 `a Marseille et San-Francisco,…)

Plot of log-mortality rates at age 15, 20, 40 & 60

life_table %>%
  filter(Age %in% c(15,20,40,60)) %>% #Exclure Gender Both ?
  ggplot(mapping = aes (x = Year , y=qx , col =as.factor(Age))) +
  geom_line()+
  scale_y_log10() +
  ggtitle("Mortality at Ages 15,20,40 & 60")+
  labs(subtitle = "Total population , Male & Female ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +xlab("Year") +ylab("log Mortality Q_x")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))+
  facet_grid(Country~Gender, scales = 'free')

#### Commentaire

On remarque que le Royaume-Uni, la France et l’Italie ont connu un pic de mortalité au début du 20e siècle pour les personnes étant âgées de 20 ans, cela étant expliqué par la 1ère Guerre Mondiale, qui a ravagé ces pays. On remarque, de plus, qu’en Suède, les personnes de 40 à 60 ans décèdent en grand nombre: cela est dû aux épidémies de variole qui ont sévi au 19e siècle. De plus, l’Espagne connaît une épidémie de grippe espagnole en 1918,ce qui explique le nombre de morts accru pour les personnes de 40 à 60 ans. Il y a un pic de mortalité pour les hommes hollandais : cela s’explique par l’invasion par l’Allemagne des Pays-Bas dans les années 1940. Enfin, on observe un nombre accru de morts pour les hommes âgés de 40 à 60 ans aux USA: cela est dû à l’épidémie de polyomiélite dans ce pays.

Rearrangement

Pivot Life Table

pivot_life <- function(life_table) {
life_table_pivot <- life_table %>%
  select(Country, Gender, Year, Age, qx) %>%
    pivot_wider(
      id_cols = c("Country","Gender","Year"),
      names_from = Age,
      values_from = qx
    )
return (life_table_pivot)
}
life_table_pivot <-pivot_life(life_table)

Calculation of life expectancy at birth of a population by year A

fun <- function(x) {return(1-identity(x))}
life_exp <- life_table_pivot %>%
  select (-Gender, -Year , -Country) %>%
  apply(MARGIN = 1, FUN = fun)%>%
  t() %>%
  apply ( MARGIN = 1 , cumprod) %>%
  t() %>%
  apply(MARGIN =  1, sum) %>%
  cbind(life_table_pivot)
names (life_exp )[1] <- "e0x" 
life_exp[1] <- life_exp[1] + 0.5
life_exp <- life_exp[,-c(5:114)]
life_exp %>%
  glimpse()

## Rows: 3,447
## Columns: 4
## $ e0x     <dbl> 40.13393, 39.26130, 38.63188, 37.35985, 39.31004, 39.91687, 38~
## $ Country <chr> "France", "France", "France", "France", "France", "France", "F~
## $ Gender  <chr> "Both", "Both", "Both", "Both", "Both", "Both", "Both", "Both"~
## $ Year    <int> 1816, 1817, 1818, 1819, 1820, 1821, 1822, 1823, 1824, 1825, 18~

Expectancy

Life Expectancy Function for a given age

Res life expectancy table

ex_res_table <- function(df) {
  tri <- df
  p <- pivot_life(tri)
  res_lex = c()
  for(i in c(1:nrow(p))) {
    for(j in c(0:109)) {
      res_lex <- append(res_lex, ex_res(p[i,4:113], j))
    }
  }
  exp_res_life_table <- tri[c(1,2,11,12)] %>% cbind(res_lex)
  return(exp_res_life_table) 
}

#-----------------------------------------------

#Make sure to have the Residual_Exp_life_table.Rds given in the .rar folder loaded into the path of the project.

#-----------------------------------------------

Residual_Exp_life_table <- readr::read_rds("Residual_Exp_life_table.rds")

Residual_Exp_life_table%>%
  head()

##   Year Age Country Gender  res_lex
## 1 1816   0  France   Both 40.13393
## 2 1816   1  France   Both 47.81757
## 3 1816   2  France   Both 49.07938
## 4 1816   3  France   Both 49.76580
## 5 1816   4  France   Both 49.91372
## 6 1816   5  France   Both 49.71260

Plotting residuals life expectancy at age 60 and 65

Res_Plot_Year_60_65 <- Residual_Exp_life_table %>% filter(Age %in% c(60,65)) %>%
  ggplot(mapping = aes (x = Year , y=res_lex , col =as.factor(Age))) +
  geom_line() +
  ggtitle("RESIDUAL LIFE EXPECTANCY AT AGE 60 AND 65")+
  labs(subtitle = " An evolution in Time ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +xlab("Year") +ylab("Residual Life Expectancy")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))+
  facet_grid(Country ~ Gender)
Res_Plot_Year_60_65

PCA and SVD over Log-Mortality for France, Female

s1 <- fviz_eig(resultsTT) +
  ggtitle("Screeplot Centré normalisé")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1))+
  ylab(" % Explained Variance")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) 
s2 <- fviz_eig(resultsFF) +
  ggtitle("Screeplot ni Centré ni normalisé")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1))+
  ylab(" % Explained Variance")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) 
s3 <- fviz_eig(resultsTF) +
  ggtitle("Screeplot normalisé non centré")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1))+
  ylab(" % Explained Variance")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) 
s4 <- fviz_eig(resultsFT) +
  ggtitle("Screeplot Centré non normalisé")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1))+
  ylab(" % Explained Variance")+
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) 
gridExtra::grid.arrange(s1,s2,s3,s4,ncol=2,nrow=2)

#### Commentaires sur les screeplots

On constate que le premier graphique (le cas : normalisé et centré) qui est réellement intéressant. le PC1 garde la grande majorité de la variance (93% environ) tout en gardant le reste sur les PC2,3 et 4. Les autres configurations (TF,FF,FT)* ont la variance à presque 100% expliquée par le PC1 dans les 3 cas.

TF : Normalisé non centré FT : Non normalisé centré *FF : Ni normalisé ni centré

ss1 <- fviz_pca_var(resultsTT , col.var = "contrib" , gradient.cols = c("red" , "black" , "yellow") , 
                    repel=TRUE) +
  ggtitle("Cercle de corrélation")+
  labs(subtitle = "centré & normalisé ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="blue"))

ss2 <- fviz_pca_var(resultsFF , col.var = "contrib" , gradient.cols = c("red" , "black" , "yellow") , 
                    repel=TRUE) +
  ggtitle("Cercle de corrélation")+
  labs(subtitle = "non centré & non normalisé ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="blue"))

ss3 <- fviz_pca_var(resultsTF , col.var = "contrib" , gradient.cols = c("red" , "black" , "yellow") , 
                    repel=TRUE) +
  ggtitle("Cercle de corrélation")+
  labs(subtitle = "non centré & normalisé ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="blue"))

ss4 <- fviz_pca_var(resultsFT , col.var = "contrib" , gradient.cols = c("red" , "black" , "yellow") , 
                    repel=TRUE) +
  ggtitle("Cercle de corrélation")+
  labs(subtitle = "centré & non normalisé ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="blue"))
gridExtra::grid.arrange(ss1,ss2,ss3,ss4,ncol=2,nrow=2)

#### Commentaires sur les cercles de corrélations

Les 2 graphiques à droite sont les graphiques des PCA non centrées. Ils n’ont pas de grande utilité.

Les 2 graphiques à gauche sont les graphiques des PCA centrées (en haut normalisé, en bas non normalisé). Le graphique normalisé permet de rendre visible la contribution des variables plus clairement. La contribution de Year se démarque clairement des autres (avec des coordonnées positives).

#motivez choix choix des axes => 
#- regarder la repartition de la variance suivant les cas  -> plot du cours 
#- rendre comparable les variables qui ont des variances differnentes ! 
#tous les pays et tous les sexes 
f1<-fviz_pca_biplot(resultsTT,repel = TRUE ,col.var = "Grey" ,col.ind = "Blue")+
  ggtitle("Biplot")+
  labs(subtitle = "centré &  normalisé ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="maroon"))

f2<-fviz_pca_biplot(resultsFF,repel = TRUE ,col.var = "Grey" ,col.ind = "Blue")+
  ggtitle("Biplot")+
  labs(subtitle = "non centré & non normalisé ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="maroon"))

f3<-fviz_pca_biplot(resultsTF,repel = TRUE ,col.var = "Grey" ,col.ind = "Blue")+
  ggtitle("Biplot")+
  labs(subtitle = "non centré &  normalisé ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="maroon"))

f4<-fviz_pca_biplot(resultsFT,repel = TRUE ,col.var = "Grey" ,col.ind = "Blue")+
  ggtitle("Biplot")+
  labs(subtitle = "centré & non normalisé ")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="maroon"))

gridExtra::grid.arrange(f1,f2,f3,f4, ncol= 2, nrow = 2)

#### Commentaires sur les biplots

Dans le cas centré normalisé, on constate que les vecteurs variables dans le biplot contribuent fortement au premier axe, et c’est le meme cas pour les individus. Il existe quelques individus qui sont eloignés des autres, exemple, le n°1 et 3. Une interprétation plutôt simple est qu’importe son âge (entre 0 et 110, on vit toujours mieux en 2010 qu’en 1948, c’est pour ça que la variable Year est la plus grande contribution du PC1, les premières années (1948) considérées contribuent négativement à la PC1 et que les plus récentes y contribuent positivement.

Dans les autres cas, il n’y a pas vraiment d’interprétation à faire.

PCA for all Countries

#female 
#Spain
lfSF = life_table_pivot %>% dplyr::filter(Country=="Spain", Gender=="Female")
lfSF_num <- dplyr::select(lfSF, -Country,-Gender,-Year)%>%
  log()
list_pca_lfSF <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfSF_num, center= .x, scale.=.y))
names(list_pca_lfSF) <- stringr::str_c("pca_lfSF",
    c("", "c", "s", "c_s"),
    sep="_")
SF_pca <- list_pca_lfSF[[4]]
#on recupere les 5 premiers axes 
#SF_pca$x[,1:5]
#SF_pca$sdev[1:5]


var_explained_SF <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(SF_pca$sdev[1:5])^2/sum((SF_pca$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("Spain",5))

#USA
lfUSf = life_table_pivot %>% dplyr::filter(Country=="USA", Gender=="Female")
lfUSf_num <- dplyr::select(lfUSf, -Country,-Gender,-Year)%>%
  log()
list_pca_lfUSf <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfUSf_num, center= .x, scale.=.y))
names(list_pca_lfUSf) <- stringr::str_c("pca_lfUSf",
    c("", "c", "s", "c_s"),
    sep="_")
lfUSf_pca <- list_pca_lfUSf[[4]]

var_explained_USF <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfUSf_pca$sdev[1:5])^2/sum((lfUSf_pca$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("USA",5))

#france
lfFRf = life_table_pivot %>% dplyr::filter(Country=="France", Gender=="Female")
lfFRf_num <- dplyr::select(lfFRf, -Country,-Gender,-Year)%>%
  log()
list_pca_lfFRf <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfFRf_num, center= .x, scale.=.y))
names(list_pca_lfFRf) <- stringr::str_c("pca_lfFRf",
    c("", "c", "s", "c_s"),
    sep="_")
lfFRf_pca <- list_pca_lfFRf[[4]]

var_explained_FRAF <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfFRf_pca$sdev[1:5])^2/sum((lfFRf_pca$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("France",5))

#England & Wales 
lfENGf = life_table_pivot %>% dplyr::filter(Country=="England & Wales", Gender=="Female")
lfENGf_num <- dplyr::select(lfENGf, -Country,-Gender,-Year) %>%
  log()
list_pca_lfENGf <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfENGf_num, center= .x, scale.=.y))
names(list_pca_lfENGf) <- stringr::str_c("pca_lfENGf",
    c("", "c", "s", "c_s"),
    sep="_")
lfENGf_pca <- list_pca_lfENGf[[4]]

var_explained_ENGF <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfENGf_pca$sdev[1:5])^2/sum((lfENGf_pca$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("England & Wales",5))


#Italy
lfITAf = life_table_pivot %>% dplyr::filter(Country=="Italy", Gender=="Female")
lfITAf_num <- dplyr::select(lfITAf, -Country,-Gender,-Year)%>%
  log()
list_pca_lfITAf <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfITAf_num, center= .x, scale.=.y))
names(list_pca_lfITAf) <- stringr::str_c("pca_lfITAf",
    c("", "c", "s", "c_s"),
    sep="_")
lfITAf_pca <- list_pca_lfITAf[[4]]

var_explained_ITAF <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfITAf_pca$sdev[1:5])^2/sum((lfITAf_pca$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("Italy",5))

#Netherlands
lfNETHf = life_table_pivot %>% dplyr::filter(Country=="Netherlands", Gender=="Female")
lfNETHf_num <- dplyr::select(lfNETHf, -Country,-Gender,-Year)%>%
  log()
list_pca_lfNETHf <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfNETHf_num, center= .x, scale.=.y))
names(list_pca_lfNETHf) <- stringr::str_c("pca_lfNETHf",
    c("", "c", "s", "c_s"),
    sep="_")
lfNETHf_pca <- list_pca_lfNETHf[[4]]

var_explained_NETHF <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfNETHf_pca$sdev[1:5])^2/sum((lfNETHf_pca$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("Netherlands",5))


PCA_F=rbind(var_explained_SF,var_explained_USF,var_explained_FRAF,var_explained_ITAF,var_explained_NETHF,var_explained_ENGF)

pcafemme<-PCA_F %>%
  ggplot(aes(x=PC,y=var_explained,fill=Country))+
  geom_col(position = "dodge")+
  ggtitle("Scree plot: avec standarisation ")+
  labs(subtitle = "per Country Female")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +ylab("Percent")+xlab("Principal Componement") +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))


  
### Male
#Spain
lfSM = life_table_pivot %>% dplyr::filter(Country=="Spain", Gender=="Male")
lfSM_num <- dplyr::select(lfSM, -Country,-Gender,-Year)%>%
  log()
list_pca_lfSM <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfSM_num, center= .x, scale.=.y))
names(list_pca_lfSM) <- stringr::str_c("pca_lfSM",
    c("", "c", "s", "c_s"),
    sep="_")
SM_pca <- list_pca_lfSM[[4]]
#on recupere les 5 premiers axes 
#SM_pca$x[,1:5]
#SM_pca$sdev[1:5]


var_explained_SM <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(SM_pca$sdev[1:5])^2/sum((SM_pca$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("Spain",5))

#USA
lfUSM = life_table_pivot %>% dplyr::filter(Country=="USA", Gender=="Male")
lfUSM_num <- dplyr::select(lfUSM, -Country,-Gender,-Year)%>%
  log()
list_pca_lfUSM <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfUSM_num, center= .x, scale.=.y))
names(list_pca_lfUSM) <- stringr::str_c("pca_lfUSM",
    c("", "c", "s", "c_s"),
    sep="_")
lfUSM_pca <- list_pca_lfUSM[[4]]

var_explained_USM <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfUSM_pca$sdev[1:5])^2/sum((lfUSM_pca$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("USA",5))

#france
lfFRM = life_table_pivot %>% dplyr::filter(Country=="France", Gender=="Male")
lfFRM_num <- dplyr::select(lfFRM, -Country,-Gender,-Year)%>%
  log()
list_pca_lfFRM <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfFRM_num, center= .x, scale.=.y))
names(list_pca_lfFRM) <- stringr::str_c("pca_lfFRM",
    c("", "c", "s", "c_s"),
    sep="_")
lfFRM_pca <- list_pca_lfFRM[[4]]

var_explained_FRAM <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfFRM_pca$sdev[1:5])^2/sum((lfFRM_pca$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("France",5))

#England & Wales 
lfENGM = life_table_pivot %>% dplyr::filter(Country=="England & Wales", Gender=="Male")
lfENGM_num <- dplyr::select(lfENGM, -Country,-Gender,-Year) %>%
  log()
list_pca_lfENGM <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfENGM_num, center= .x, scale.=.y))
names(list_pca_lfENGM) <- stringr::str_c("pca_lfENGM",
    c("", "c", "s", "c_s"),
    sep="_")
lfENGM_pca <- list_pca_lfENGM[[4]]

var_explained_ENGM <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfENGM_pca$sdev[1:5])^2/sum((lfENGM_pca$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("England & Wales",5))


#Italy
lfITAM = life_table_pivot %>% dplyr::filter(Country=="Italy", Gender=="Male")
lfITAM_num <- dplyr::select(lfITAM, -Country,-Gender,-Year)%>%
  log()
list_pca_lfITAM <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfITAM_num, center= .x, scale.=.y))
names(list_pca_lfITAM) <- stringr::str_c("pca_lfITAM",
    c("", "c", "s", "c_s"),
    sep="_")
lfITAM_pca <- list_pca_lfITAM[[4]]

var_explained_ITAM <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfITAM_pca$sdev[1:5])^2/sum((lfITAM_pca$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("Italy",5))

#Netherlands
lfNETHM = life_table_pivot %>% dplyr::filter(Country=="Netherlands", Gender=="Male")
lfNETHM_num <- dplyr::select(lfNETHM, -Country,-Gender,-Year)%>%
  log()
list_pca_lfNETHM <-
  map2(.x=rep(c(FALSE, TRUE), 2),
       .y=rep(c(FALSE, TRUE), c(2,2)),
       ~ prcomp(lfNETHM_num, center= .x, scale.=.y))
names(list_pca_lfNETHM) <- stringr::str_c("pca_lfNETHM",
    c("", "c", "s", "c_s"),
    sep="_")
lfNETHM_pca <- list_pca_lfNETHM[[4]]

var_explained_NETHM <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfNETHM_pca$sdev[1:5])^2/sum((lfNETHM_pca$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("Netherlands",5))


PCA_M=rbind(var_explained_SM,var_explained_USM,var_explained_FRAM,var_explained_ITAM,var_explained_NETHM,var_explained_ENGM)

pcahomme<-PCA_M %>%
  ggplot(aes(x=PC,y=var_explained,fill=Country))+
  geom_col(position = "dodge")+
  ggtitle("Scree plot: Avec standarisation")+
  labs(subtitle = "per Country Male")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +ylab("Percent")+xlab("Principal Componement") +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))





######## sans standarization ####################


#female 
#Spain

SF_pca_2 <- list_pca_lfSF[[1]]
#on recupere les 5 premiers axes 



var_explained_SF_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(SF_pca_2$sdev[1:5])^2/sum((SF_pca_2$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("Spain",5))

#USA

lfUSf_pca_2 <- list_pca_lfUSf[[1]]

var_explained_USF_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfUSf_pca_2$sdev[1:5])^2/sum((lfUSf_pca_2$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("USA",5))

#france

lfFRf_pca_2 <- list_pca_lfFRf[[1]]

var_explained_FRAF_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfFRf_pca_2$sdev[1:5])^2/sum((lfFRf_pca_2$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("France",5))

#England & Wales 

lfENGf_pca_2 <- list_pca_lfENGf[[1]]

var_explained_ENGF_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfENGf_pca_2$sdev[1:5])^2/sum((lfENGf_pca_2$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("England & Wales",5))


#Italy

lfITAf_pca_2 <- list_pca_lfITAf[[1]]

var_explained_ITAF_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfITAf_pca_2$sdev[1:5])^2/sum((lfITAf_pca_2$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("Italy",5))

#Netherlands

lfNETHf_pca_2 <- list_pca_lfNETHf[[1]]

var_explained_NETHF_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfNETHf_pca_2$sdev[1:5])^2/sum((lfNETHf_pca_2$sdev[1:5])^2),Gender=rep("Female",5),Country=rep("Netherlands",5))


PCA_F_2=rbind(var_explained_SF_2,var_explained_USF_2,var_explained_FRAF_2,var_explained_ITAF_2,var_explained_NETHF_2,var_explained_ENGF_2)

pcafemme2<-PCA_F_2 %>%
  ggplot(aes(x=PC,y=var_explained,fill=Country))+
  geom_col(position = "dodge")+
  ggtitle("Scree plot:without standarisation ")+
  labs(subtitle = "per Country Female")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +ylab("Percent")+xlab("Principal Componement") +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))


  
### Male
#Spain

SM_pca_2 <- list_pca_lfSM[[1]]
#on recupere les 5 premiers axes 



var_explained_SM_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(SM_pca_2$sdev[1:5])^2/sum((SM_pca_2$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("Spain",5))

#USA

lfUSM_pca_2<- list_pca_lfUSM[[1]]

var_explained_USM_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfUSM_pca_2$sdev[1:5])^2/sum((lfUSM_pca_2$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("USA",5))

#france

lfFRM_pca_2 <- list_pca_lfFRM[[1]]

var_explained_FRAM_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfFRM_pca_2$sdev[1:5])^2/sum((lfFRM_pca_2$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("France",5))

#England & Wales 

lfENGM_pca_2 <- list_pca_lfENGM[[1]]

var_explained_ENGM_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfENGM_pca_2$sdev[1:5])^2/sum((lfENGM_pca_2$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("England & Wales",5))


#Italy

lfITAM_pca_2 <- list_pca_lfITAM[[1]]

var_explained_ITAM_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfITAM_pca_2$sdev[1:5])^2/sum((lfITAM_pca_2$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("Italy",5))

#Netherlands

lfNETHM_pca_2 <- list_pca_lfNETHM[[1]]

var_explained_NETHM_2 <- data.frame(PC= paste0("PC",1:5),
                          var_explained=(lfNETHM_pca_2$sdev[1:5])^2/sum((lfNETHM_pca_2$sdev[1:5])^2),Gender=rep("Male",5),Country=rep("Netherlands",5))


PCA_M_2=rbind(var_explained_SM_2,var_explained_USM_2,var_explained_FRAM_2,var_explained_ITAM_2,var_explained_NETHM_2,var_explained_ENGM_2)

pcahomme2<-PCA_M_2 %>%
  ggplot(aes(x=PC,y=var_explained,fill=Country))+
  geom_col(position = "dodge")+
  ggtitle("Scree plot:without standarisation")+
  labs(subtitle = "per Country Male")+
  theme(axis.text.x=element_text(angle=-45, hjust=0, vjust=1)) +ylab("Percent")+xlab("Principal Componement") +
  theme(plot.title=element_text(size=12, hjust=0.5, face="bold", colour="maroon", vjust=-1)) +
   theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black"))

##plotting 

gridExtra::grid.arrange(pcahomme2,pcafemme2,ncol=2)

Il s’agit d’un screeplot réalisé avec un centrage et une normalisation, Notez que nous avons retiré la Suède des pays sélectionnés pour cette analyse car après les années 2000, il y a tellement de groupes d’âge que lorsque nous prenons les logarithmes de la mortalité, nous rencontrons des problèmes. Le comportement de ce nuage de points est qu’il s’effondre immédiatement après la première composante pour les femmes et qu’il n’y a pas grand chose sur la deuxième composante, sauf pour l’Espagne et un peu pour la Grande-Bretagne, De plus, pour les hommes, c’est un peu plus compliqué car il est moins concentré sur la première composante.

gridExtra::grid.arrange(pcahomme,pcafemme,ncol=2)

Si on ne standardise pas, on a un résultat qui n’est pas inintéressant, en effet, on remarque les effets vus dans les graphiques précédents, c’est-à-dire qu’il y a beaucoup plus de variation sur le quotient de mortalité sur les 70 dernières années en Espagne, en Italie et en France que dans les pays anglo-saxons comme les Pays-Bas (qu’on suppose être un pays proche des pays anglo-saxons). De plus, si l’on regarde les courbes au début de l’année 1945, on constate que l’Espagne, l’Italie et la France rattrapent beaucoup de retard et en particulier l’Espagne, ce qui est visuellement facile à distinguer, surtout pour les femmes et moins important pour les hommes

-> On devrait avoir intérêt à centrer et standardiser lorsqu’on réalise une seule ACP, sauf qu’ici on en réalise plusieurs simultanément, donc on met tout le monde dans le même sac et on perd de vue les pays qui ont bougé plus/moins que les autres

Lee Carter Model

Données US

Lee Carter from SVD parameters for USA Both

Years <- 1933:1995
T <- length(Years)
Age <- 1:110
A_both <- life_table_pivot %>%
  filter(Country == "USA", Gender == 'Both', Year %in% Years) %>%
  select(-Gender, -Country, -Year) %>%
  log() %>%
  as.matrix()
M_usb <- (diag(1,T) - ((1/T)*matrix(1,T,1) %*% t(matrix(1,T,1)))) %*% A_both
svdUS_B <- svd(M_usb,1,1)
s_USB <- svdUS_B$d[1]
u_USB <- -1 *(svdUS_B$u * s_USB[1]) #kappa
v_USB <- -1 *svdUS_B$v #beta
ax_USB <- colMeans(A_both)
#plot(Age, ax)
#lines(Age,ax,col="red")
#plot(Age, v, ylab = "Bx")
#plot(Years, u, ylab = "K")


data <- tibble(age = 0:109, ax_USB)

#alpha
g_1 <- ggplot(data) + geom_point(aes(age,ax_USB)) +
  geom_line(aes(age,ax_USB), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of USA Both") +
  labs(y = bquote(hat(alpha)[x]^"(1)"))
#beta
g_2 <- ggplot(data) + geom_point(aes(age,v_USB)) +
  geom_line(aes(age,v_USB), col = "red") +
  theme_bw() +
  ggtitle("beta estimating of USA Both") +
  labs(y = bquote(hat(beta)[x]^"(1)"))

#kappa
data_period <- tibble(year = 1933:1995, u_USB)

g_3 <- ggplot(data_period) + geom_point(aes(year,u_USB)) +
  geom_line(aes(year,u_USB), col = "red") +
  theme_bw() +
  ggtitle("kappa estimating of USA Both") +
  labs(y = bquote(hat(kappa)[t]^"(1)"))

gridExtra::grid.arrange(g_1,g_2,g_3, ncol = 2)

### Comparing Norms with Truncated Rank 2 SVD for USA Both

res_USB <- svd(A_both)
matUSB <- (res_USB$u[,1:2]) %*% diag(res_USB$d[1:2])%*%t(res_USB$v[,1:2])
norm(matUSB - as.matrix(A_both)) #9.391412

## [1] 9.391412

matLEEUSB = matrix(0,nrow = 63 , ncol = 110)
for (i in 1:110){
  matLEEUSB[,i] = t(ax_USB)[i] +v_USB[i]*u_USB
}
norm(matLEEUSB -as.matrix(A_both)) #7.820316

## [1] 7.820316

Lee Carter from SVD parameters for USA Female

Years <- 1933:1995
T <- length(Years)
Age <- 1:110
A_F <- life_table_pivot %>%
  filter(Country == "USA", Gender == 'Female', Year %in% Years) %>%
  select(-Gender, -Country, -Year) %>%
  log() %>%
  as.matrix()
M_usf <- (diag(1,T) - ((1/T)*matrix(1,T,1) %*% t(matrix(1,T,1)))) %*% A_F
svdUS_F <- svd(M_usf,1,1)
s_USF <- svdUS_F$d[1]
u_USfSVD <- -1 *(svdUS_F$u*s_USF)
v_USfSVD <- -1 *svdUS_F$v
ax_USfSVD <- colMeans(A_F)
#plot(Age, ax_USfSVD)
#plot(Age, v_USfSVD, ylab = 'Bx')
#plot(Years, u_USfSVD, ylab = "K")

data_USF <- tibble(age = 0:109, ax_USfSVD)
#alpha
g_1_USF <- ggplot(data_USF) + geom_point(aes(age,ax_USfSVD)) +
  geom_line(aes(age,ax_USfSVD), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of USA Female") +
  labs(y = bquote(hat(alpha)[x]^"(1)"))
#beta
g_2_USF <- ggplot(data_USF) + geom_point(aes(age,v_USfSVD)) +
  geom_line(aes(age,v_USfSVD), col = "red") +
  theme_bw() +
  ggtitle("beta estimating of USA Female") +
  labs(y = bquote(hat(beta)[x]^"(1)"))

#kappa
data_period_USF <- tibble(year = 1933:1995, u_USfSVD)

g_3_USF <- ggplot(data_period_USF) + geom_point(aes(year,u_USfSVD)) +
  geom_line(aes(year,u_USfSVD), col = "red") +
  theme_bw() +
  ggtitle("kappa estimating of USA Female") +
  labs(y = bquote(hat(kappa)[t]^"(1)"))
gridExtra::grid.arrange(g_1_USF,g_2_USF,g_3_USF, ncol = 2)

### Comparing Norms with Truncated Rank 2 SVD for USA Female

res_USF <- svd(A_F)
matUSF <- (res_USF$u[,1:2]) %*% diag(res_USF$d[1:2])%*%t(res_USF$v[,1:2])
norm(matUSF - as.matrix(A_F) , "F") #5.267916

## [1] 5.267916

matLEEUSF = matrix(0,nrow = 63 , ncol = 110)
for (i in 1:110){
  matLEEUSF[,i] = t(ax_USfSVD)[i] +v_USfSVD[i]*u_USfSVD
}
norm(matLEEUSF -as.matrix(A_F), "F") #5.50236

## [1] 5.50236

#5.5>5.2 donc le SVD rank 1 est faiblement meilleur que celui utilisé dans le modèle de lee Carter 

# ecart de normes 
#norm(matUSF - matLEEUSF , "F") #2.993924

Lee Carter from Demog parameters for USA, Both

Years <- 1933:1995
T <- length(Years)
Age <- 1:110
A_M <- life_table_pivot %>%
  filter(Country == "USA", Gender == 'Male', Year %in% Years) %>%
  select(-Gender, -Country, -Year) %>%
  log() %>%
  as.matrix()
M_usm <- (diag(1,T) - ((1/T)*matrix(1,T,1) %*% t(matrix(1,T,1)))) %*% A_M
svdUS_M <- svd(M_usm,1,1)
s_USM <- svdUS_M$d[1]
u_USmSVD <- -1 *(svdUS_M$u*s_USM)
v_USmSVD <- -1 *svdUS_M$v
ax_USmSVD <- colMeans(A_M)
#plot(Age, ax_USmSVD)
#plot(Age, v_USmSVD, ylab = "Bx")
#plot(Years, u_USmSVD, ylab = "K")

data_USM <- tibble(age = 0:109, ax_USmSVD)
#alpha
g_1_USM <- ggplot(data_USM) + geom_point(aes(age,ax_USmSVD)) +
  geom_line(aes(age,ax_USmSVD), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of USA Male") +
  labs(y = bquote(hat(alpha)[x]^"(1)"))
#beta
g_2_USM <- ggplot(data_USM) + geom_point(aes(age,v_USmSVD)) +
  geom_line(aes(age,v_USmSVD), col = "red") +
  theme_bw() +
  ggtitle("beta estimating of USA Male") +
  labs(y = bquote(hat(beta)[x]^"(1)"))

#kappa
data_period_USM <- tibble(year = 1933:1995, u_USmSVD)

g_3_USM <- ggplot(data_period_USM) + geom_point(aes(year,u_USmSVD)) +
  geom_line(aes(year,u_USmSVD), col = "red") +
  theme_bw() +
  ggtitle("kappa estimating of USA Male") +
  labs(y = bquote(hat(kappa)[t]^"(1)"))
gridExtra::grid.arrange(g_1_USM,g_2_USM,g_3_USM, ncol = 2)

### Comparing Norms with Truncated Rank 2 SVD for USA Male

res_USM <- svd(A_M)
matUSM <- (res_USM$u[,1:2]) %*% diag(res_USM$d[1:2])%*%t(res_USM$v[,1:2])
norm(matUSM - as.matrix(A_M) , "F") #5.749655

## [1] 5.749655

matLEEUSM = matrix(0,nrow = 63 , ncol = 110)
for (i in 1:110){
  matLEEUSM[,i] = t(ax_USmSVD)[i] +v_USmSVD[i]*u_USmSVD
}
norm(matLEEUSM -as.matrix(A_M), "F") #6.271101

## [1] 6.271101

Lee Carter from Demog parameters for USA, Both

a1 <- LCfit1$ax
b1 <- LCfit1$bx
k1 <- LCfit1$kt


data_1 <- tibble(age = 1:110, a1)
#alpha
g11 <- ggplot(data_1) + geom_point(aes(age,a1)) +
  geom_line(aes(age,a1), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of USA Both") +
  labs(y = bquote(hat(alpha)[x]^"(1)"))
#beta
g12 <- ggplot(data_1) + geom_point(aes(age,b1)) +
  geom_line(aes(age,b1), col = "red") +
  theme_bw() +
  ggtitle("Beta estimating of USA Both ") +
  labs(y = bquote(hat(beta)[x]^"(1)"))

#kappa
data_period <- tibble(year = 1933:1995, k1)

g13 <- ggplot(data_period_USM) + geom_point(aes(year,k1)) +
  geom_line(aes(year,k1), col = "red") +
  theme_bw() +
  ggtitle("kappa estimating of USA Both ") +
  labs(y = bquote(hat(kappa)[t]^"(1)"))
gridExtra::grid.arrange(g11,g12,g13, ncol = 2)

### Plotting Residuals for a quick review

#goodness-of-fit:  residuals 
LcresUSB <- residuals(LCfit1)
#plot(LcresUSB, type = "colourmap" )
plot(LcresUSB)
title("Goodness of fit")

#predict mortality from 2000 to 2015 ==> 20 ans apres 1995
#forcasting 
LCforUSB <- forecast(LCfit1, h=20)
plot(LCforUSB, parametricbx = FALSE)

#pour les simulations
#LcsimUSB <- simulate(LCfit1, nsim = 1000 , h= 20)
#plot period index trajectories 
#plot(LCfit1$years, LCfit1$kt[1, ], type="l",xlim = c(2000,2015),ylim = c(-80,20), xlab="year", ylab="kt", main = "Peiod index LC")
#matlines(LcsimUSB$kt.s$years , LcsimUSB$kt.s$sim[1,,1:20], type="l", lty = 1)

Lee Carter from Demog parameters for USA, Female

#forecasting 
LCforUSF <- forecast(LCfit1, h=20)
plot(LCforUSF, parametricbx = FALSE)

Lee Carter from Demog parameters for USA, Male

#forcasting 
LCforUSM <- forecast(LCfit1, h=20)
plot(LCforUSM, parametricbx = FALSE)

Données France

LEE CARTER SVD FRANCE BOTH, MALE, FEMALE

#BOTH

Years <- 1933:1995
T <- length(Years)
Age <- 1:110
A_FRB <- life_table_pivot %>%
  filter(Country == "France", Gender == 'Both', Year %in% Years) %>%
  select(-Gender, -Country, -Year) %>%
  log() %>%
  as.matrix()
M_FRB <- (diag(1,T) - ((1/T)*matrix(1,T,1) %*% t(matrix(1,T,1)))) %*% A_FRB
svdFRB <- svd(M_FRB,1,1)
s_FRB <- svdFRB$d[1]
u_FRB <- -1 *(svdFRB$u*s_FRB)
v_FRB <- -1 *svdFRB$v
ax_FRB <- colMeans(A_FRB)
#plot(Age, ax)
#plot(Age, v, ylab = "Bx")
#plot(Years, u, ylab = "K")

data_FRB <- tibble(age = 0:109, ax_FRB)
#alpha
g_1_FRB <- ggplot(data_FRB) + geom_point(aes(age,ax_FRB)) +
  geom_line(aes(age,ax_FRB), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of France Both") +
  labs(y = bquote(hat(alpha)[x]^"(1)"))
#beta
g_2_FRB <- ggplot(data_FRB) + geom_point(aes(age,v_FRB)) +
  geom_line(aes(age,v_FRB), col = "red") +
  theme_bw() +
  ggtitle("beta estimating of France Both") +
  labs(y = bquote(hat(beta)[x]^"(1)"))

#kappa
data_period_FRB <- tibble(year = 1933:1995, u_FRB)

g_3_FRB<- ggplot(data_period_FRB) + geom_point(aes(year,u_FRB)) +
  geom_line(aes(year,u_FRB), col = "red") +
  theme_bw() +
  ggtitle("kappa estimating of France Both") +
  labs(y = bquote(hat(kappa)[t]^"(1)"))
gridExtra::grid.arrange(g_1_FRB,g_2_FRB,g_3_FRB, ncol = 2)

### Comparing Norms with Truncated Rank 2 SVD for France Both

res_FRB <- svd(A_FRB)
matFRB <- (res_FRB$u[,1:2]) %*% diag(res_FRB$d[1:2])%*%t(res_FRB$v[,1:2])
norm(matFRB - as.matrix(A_FRB) , "F") #10.37236

## [1] 10.37236

matLEEFRB = matrix(0,nrow = 63 , ncol = 110)
for (i in 1:110){
  matLEEFRB[,i] = t(ax_FRB)[i] +v_FRB[i]*u_FRB
}
norm(matLEEFRB -as.matrix(A_FRB), "F") #13.72704

## [1] 10.73291

Lee Carter from SVD parameters for France, Male

#MALE
Years <- 1933:1995
T <- length(Years)
Age <- 1:110
A_FRM <- life_table_pivot %>%
  filter(Country == "France", Gender == 'Male', Year %in% Years) %>%
  select(-Gender, -Country, -Year) %>%
  log() %>%
  as.matrix()
M_FRM <- (diag(1,T) - ((1/T)*matrix(1,T,1) %*% t(matrix(1,T,1)))) %*% A_FRM
svdFRM <- svd(M_FRM,1,1)
sFRM <- svdFRM$d[1]
uFRM <- -1 *(svdFRM$u*sFRM)
vFRM <- -1 *svdFRM$v
axFRM <- colMeans(A_FRM)
#plot(Age, ax)
#plot(Age, v, ylab = "Bx")
#plot(Years, u, ylab = "K")
data_FRM <- tibble(age = 0:109, axFRM)
#alpha
g_1_FRM <- ggplot(data_FRM) + geom_point(aes(age,ax_USmSVD)) +
  geom_line(aes(age,ax_USmSVD), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of France Male") +
  labs(y = bquote(hat(alpha)[x]^"(1)"))
#beta
g_2_FRM <- ggplot(data_FRM) + geom_point(aes(age,vFRM)) +
  geom_line(aes(age,vFRM), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of France Male") +
  labs(y = bquote(hat(beta)[x]^"(1)"))

#kappa
data_period_FRM <- tibble(year = 1933:1995, uFRM)

g_3_FRM <- ggplot(data_period_FRM) + geom_point(aes(year,uFRM)) +
  geom_line(aes(year,uFRM), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of France Male") +
  labs(y = bquote(hat(kappa)[t]^"(1)"))
gridExtra::grid.arrange(g_1_FRM,g_2_FRM,g_3_FRM, ncol = 2)

### Comparing Norms with Truncated Rank 2 SVD for France Male

res_FRM <- svd(A_FRM)
matFRM <- (res_FRM$u[,1:2]) %*% diag(res_FRM$d[1:2])%*%t(res_FRM$v[,1:2])
norm(matFRM - as.matrix(A_FRM) , "F") #11.30824

## [1] 11.30824

matLEEFRM = matrix(0,nrow = 63 , ncol = 110)
for (i in 1:110){
  matLEEFRM[,i] = t(axFRM)[i] +vFRM[i]*uFRM
}
norm(matLEEFRM -as.matrix(A_FRM), "F") #12.07684

## [1] 12.07684

Lee Carter from SVD parameters for France, Female

#FEMALE
Years <- 1933:1995
T <- length(Years)
Age <- 1:110
A_FRF <- life_table_pivot %>%
  filter(Country == "France", Gender == 'Female', Year %in% Years) %>%
  select(-Gender, -Country, -Year) %>%
  log() %>%
  as.matrix()
M_FRF<- (diag(1,T) - ((1/T)*matrix(1,T,1) %*% t(matrix(1,T,1)))) %*% A_FRF
svdFRF <- svd(M_FRF,1,1)
sFRF <- svdFRF$d[1]
uFRF <- -1 *(svdFRF$u*sFRF)
vFRF <- -1 *svdFRF$v
axFRF <- colMeans(A_FRF)
#plot(Age, ax)
#plot(Age, v, ylab = "Bx")
#plot(Years, u, ylab = "K")
data_FRF <- tibble(age = 0:109, axFRF)
#alpha
g_1_FRF <- ggplot(data_FRF) + geom_point(aes(age,axFRF)) +
  geom_line(aes(age,axFRF), col = "red") +
  theme_bw() +
  ggtitle("Alpha estimating of France Female") +
  labs(y = bquote(hat(alpha)[x]^"(1)"))
#beta
g_2_FRF <- ggplot(data_FRF) + geom_point(aes(age,vFRF)) +
  geom_line(aes(age,vFRF), col = "red") +
  theme_bw() +
  ggtitle("beta estimating of France Female") +
  labs(y = bquote(hat(beta)[x]^"(1)"))

#kappa
data_period_FRF <- tibble(year = 1933:1995, uFRF)

g_3_FRF <- ggplot(data_period_FRF) + geom_point(aes(year,uFRF)) +
  geom_line(aes(year,uFRF), col = "red") +
  theme_bw() +
  ggtitle("kappa estimating of France Female") +
  labs(y = bquote(hat(kappa)[t]^"(1)"))
gridExtra::grid.arrange(g_1_FRF,g_2_FRF,g_3_FRF, ncol = 2)

### Comparing Norms with Truncated Rank 2 SVD for France Female

res_FRF <- svd(A_FRF)
matFRF <- (res_FRF$u[,1:2]) %*% diag(res_FRF$d[1:2])%*%t(res_FRF$v[,1:2])
norm(matFRF - as.matrix(A_FRF) , "F") #9.684005

## [1] 9.684005

matLEEFRF = matrix(0,nrow = 63 , ncol = 110)
for (i in 1:110){
  matLEEFRF[,i] = t(axFRF)[i] +vFRF[i]*uFRF
}
norm(matLEEFRF -as.matrix(A_FRF), "F") #10.07679

## [1] 10.07679

Lee Carter from demog parameters for France, Male

#forcasting 
LCforFRM <- forecast(LCfit1, h=20)
plot(LCforFRM, parametricbx = FALSE)

Lee Carter from demog parameters for France, Both

#forcasting 
LCforFRB <- forecast(LCfit1, h=20)
plot(LCforFRB, parametricbx = FALSE)

## Lee Carter from demog parameters for France, Female

#predict mortality from 2000 to 2015 ==> 20 ans apres 1995
#forcasting 
LCforFRF <- forecast(LCfit1, h=20)
plot(LCforFRF, parametricbx = FALSE)

#### Petite Remarque sur la comparaison avec le SVD tronqué de rang 2

En général, la SVD tronquée au rang 2 fournit une meilleure approximation, mais elle n’est pas forcément meilleure de manière décisive, car en termes de stockage, on sera obligé de stocker deux valeurs singulières pour la matrice diagonale, deux vecteurs de longueur le nombre de périodes (2 * 50) et deux correspondant aux âges (2 * 100) plutôt qu’un vecteur correspondant aux âges et un pour la période d’observation, Par ailleurs, peut-être que cela ne fera pas une simple extension dans le temps pour la prévision, Comme l’intérêt de Lee Carter est qu’il semble facile d’étendre et d’extrapoler dans le temps…

Remarque générale sur le forecast de Lee-Carter

D’après les simulations, la modélisation du taux de mortalité pour différents pays à l’aide du modèle de Lee-Carter se poursuit ensuite en utilisant la méthode ARIMA, nous obtenons de bons résultats pour la prévision du taux de mortalité pour plusieurs années ou périodes, en nous basant sur le modèle de mortalité. Dans tous les cas, nous restons largement dans l’intervalle prédit avoisinant le milieu pour toutes les prédictions.

ps : Nous observons sur les graphiques affichant le forecast des k_t : on remarque deux nuances de gris : le plus clair correspondant à l’intervalle de confiance à 80% et le plus foncé celui à 95%

En outre, la fourchette des prédictions pour la France est plus large en raison de la Seconde Guerre mondiale et de son effet sur le k_t de 1940 à 1945.

#prédiction des k_t pour les femmes en France
#LCforFRF$kt.f

Predictions of life expectancies residuals

Predictions of life expectancies at given ages - USA Male and Female

#Getting Ex_pivot_table from life_table

ex_pivot_life <- function(life_table) {
life_table_pivot <- life_table %>%
  select(Country, Gender, Year, Age, ex) %>%
    pivot_wider(
      id_cols = c("Country","Gender","Year"),
      names_from = Age,
      values_from = ex
    )
return (life_table_pivot)
}
ex_life_table_pivot <-ex_pivot_life(life_table)

#GETTING LIFE EXPECTANCY PREDICTIONS IN A DATAFRAME

ex_predicted_us_male <- tibble()
for (a in  0:109){
lf <- flife.expectancy(USmale, series = names(USmale$rate)[1], years = USmale$year,
   age=a+1, max.age = 110)
  lf <- as.vector(t(tibble(lf)))
  ex_predicted_us_male <- rbind(ex_predicted_us_male,lf)
}
colnames(ex_predicted_us_male) <- 1933:2017
#glimpse(ex_predicted_us_male)

ex_predicted_pivot_us_male <- as.data.frame(t(ex_predicted_us_male))
Country <- rep("USA", 85)
Gender <- rep("Male", 85)
Year <- 1933:2017
colnames(ex_predicted_pivot_us_male) <- 0:109


ex_predicted_pivot_us_male <-cbind(Country,Gender, Year,ex_predicted_pivot_us_male) #now we have expectancy_pivot like qx pivot
ex_observed_pivot_us_male <- as.data.frame(ex_life_table_pivot %>% filter(Country == "USA", Gender == "Male")) #and observed !!

ex_predicted_us_female <- tibble()
for (a in  0:109){
lf <- flife.expectancy(USfemale, series = names(USfemale$rate)[1], years = USfemale$year,
   age=a+1, max.age = 110)
  lf <- as.vector(t(tibble(lf)))
  ex_predicted_us_female <- rbind(ex_predicted_us_female,lf)
}
colnames(ex_predicted_us_female) <- 1933:2017
#glimpse(ex_predicted_us_female)

ex_predicted_pivot_us_female <- as.data.frame(t(ex_predicted_us_female))
Country <- rep("USA", 85)
Gender <- rep("Female",85)
Year <- 1933:2017
colnames(ex_predicted_pivot_us_female) <- 0:109


ex_predicted_pivot_us_female <-cbind(Country,Gender, Year,ex_predicted_pivot_us_female) #now we have expectancy_pivot like qx pivot
ex_observed_pivot_us_female <- as.data.frame(ex_life_table_pivot %>% filter(Country == "USA", Gender == "Female")) #and observed !!

ex_observed_us_male <- ex_observed_pivot_us_male %>%
  pivot_longer(
      names_to = "Age",
      cols = -c("Year","Country","Gender"),
      values_to =  "Ex")

ex_observed_us_male <- cbind(rep("Observed",nrow(ex_observed_us_male)),ex_observed_us_male)
colnames(ex_observed_us_male)[1] <-"Type"
colnames(ex_observed_us_male)[2] <-"Country"


ex_predicted_us_male <- ex_predicted_pivot_us_male %>%
  pivot_longer(
      names_to = "Age",
      cols = -c("Year","Country","Gender"),
      values_to =  "Ex")

ex_predicted_us_male <- cbind(rep("Predicted",nrow(ex_predicted_us_male)),ex_predicted_us_male)
colnames(ex_predicted_us_male)[1] <-"Type"
colnames(ex_predicted_us_male)[2] <-"Country"

ex_us_male <- rbind(ex_predicted_us_male, ex_observed_us_male)

ex_us_male %>%
  tibble() %>%
  head()

## # A tibble: 6 x 6
##   Type      Country Gender  Year Age      Ex
##   <chr>     <chr>   <chr>  <int> <chr> <dbl>
## 1 Predicted USA     Male    1933 0      59.6
## 2 Predicted USA     Male    1933 1      62.5
## 3 Predicted USA     Male    1933 2      62.1
## 4 Predicted USA     Male    1933 3      61.4
## 5 Predicted USA     Male    1933 4      60.6
## 6 Predicted USA     Male    1933 5      59.8

ex_observed_us_female <- ex_observed_pivot_us_female %>%
  pivot_longer(
      names_to = "Age",
      cols = -c("Year","Country","Gender"),
      values_to =  "Ex")

ex_observed_us_female <- cbind(rep("Observed",nrow(ex_observed_us_female)),ex_observed_us_female)
colnames(ex_observed_us_female)[1] <-"Type"
colnames(ex_observed_us_female)[2] <-"Country"


ex_predicted_us_female <- ex_predicted_pivot_us_female %>%
  pivot_longer(
      names_to = "Age",
      cols = -c("Year","Country","Gender"),
      values_to =  "Ex")

ex_predicted_us_female <- cbind(rep("Predicted",nrow(ex_predicted_us_female)),ex_predicted_us_female)
colnames(ex_predicted_us_female)[1] <-"Type"
colnames(ex_predicted_us_female)[2] <-"Country"

ex_us_female <- rbind(ex_predicted_us_female, ex_observed_us_female)

ex_us_female %>%
  tibble() %>%
  head()

## # A tibble: 6 x 6
##   Type      Country Gender  Year Age      Ex
##   <chr>     <chr>   <chr>  <int> <chr> <dbl>
## 1 Predicted USA     Female  1933 0      63.1
## 2 Predicted USA     Female  1933 1      65.5
## 3 Predicted USA     Female  1933 2      65.1
## 4 Predicted USA     Female  1933 3      64.3
## 5 Predicted USA     Female  1933 4      63.5
## 6 Predicted USA     Female  1933 5      62.6

ex_us_female$Age <- as.numeric(ex_us_female$Age)
femaleusex<-(ggplot(data = ex_us_female) +
   aes(x = Year, 
      y = Ex) +
    aes(frame = Age,
        color = ex_us_female$Type)+
   geom_line() +
  theme(legend.position = "none") +
  labs(   color = "Type" )) %>% 
  plotly::ggplotly(height = 500, width=750 )

ex_us_male$Age <- as.numeric(ex_us_male$Age)
maleusex<-(ggplot(data = ex_us_male) +
   aes(x = Year, 
      y = Ex) +
    aes(frame = Age,
        color = ex_us_male$Type)+
   geom_line() +
  theme(legend.position = "none") +
  labs(   color = "Type" )) %>% 
  plotly::ggplotly(height = 500, width=750 )

subplot(maleusex, femaleusex ) %>% layout(showlegend = TRUE ,title = list(text = paste0('Life expectancy for Male(Left) and Female(Right) ', '<br>', 'in USA ',
                                    '<br>',
                                    '<sup>',
                                    'at a given Age & Evolution in Time',
                                    '</sup>')))

Predictions of life expectancies at given ages - France Male and Female

#GETTING LIFE EXPECTANCY PREDICTIONS IN A DATAFRAME

ex_predicted_fr_female <- tibble()
for (a in  0:109){
lf <- flife.expectancy(FRfemale, series = names(FRfemale$rate)[1], years = FRfemale$year,
   age=a+1, max.age = 110)
  lf <- as.vector(t(tibble(lf)))
  ex_predicted_fr_female <- rbind(ex_predicted_fr_female,lf)
}
colnames(ex_predicted_fr_female) <- 1816:2017
#glimpse(ex_predicted_fr_female)

ex_predicted_pivot_fr_female <- as.data.frame(t(ex_predicted_fr_female))
Country <- rep("France", 202)
Gender <- rep("Female", 202)
Year <- 1816:2017
colnames(ex_predicted_pivot_fr_female) <- 0:109


ex_predicted_pivot_fr_female <-cbind(Country,Gender, Year,ex_predicted_pivot_fr_female) #now we have expectancy_pivot like qx pivot
ex_observed_pivot_fr_female <- as.data.frame(ex_life_table_pivot %>% filter(Country == "France", Gender == "Female")) #and observed !!

ex_predicted_fr_male <- tibble()
for (a in  0:109){
lf <- flife.expectancy(FRmale, series = names(FRmale$rate)[1], years = FRmale$year,
   age=a+1, max.age = 110)
  lf <- as.vector(t(tibble(lf)))
  ex_predicted_fr_male <- rbind(ex_predicted_fr_male,lf)
}
colnames(ex_predicted_fr_male) <- 1816:2017
#glimpse(ex_predicted_fr_male)

ex_predicted_pivot_fr_male <- as.data.frame(t(ex_predicted_fr_male))
Country <- rep("France", 202)
Gender <- rep("Male", 202)
Year <- 1816:2017
colnames(ex_predicted_pivot_fr_male) <- 0:109


ex_predicted_pivot_fr_male <-cbind(Country,Gender, Year,ex_predicted_pivot_fr_male) #now we have expectancy_pivot like qx pivot
ex_observed_pivot_fr_male <- as.data.frame(ex_life_table_pivot %>% filter(Country == "France", Gender == "Male")) #and observed !!

ex_observed_fr_female <- ex_observed_pivot_fr_female %>%
  pivot_longer(
      names_to = "Age",
      cols = -c("Year","Country","Gender"),
      values_to =  "Ex")

ex_observed_fr_female <- cbind(rep("Observed",nrow(ex_observed_fr_female)),ex_observed_fr_female)
colnames(ex_observed_fr_female)[1] <-"Type"
colnames(ex_observed_fr_female)[2] <-"Country"


ex_predicted_fr_female <- ex_predicted_pivot_fr_female %>%
  pivot_longer(
      names_to = "Age",
      cols = -c("Year","Country","Gender"),
      values_to =  "Ex")

ex_predicted_fr_female <- cbind(rep("Predicted",nrow(ex_predicted_fr_female)),ex_predicted_fr_female)
colnames(ex_predicted_fr_female)[1] <-"Type"
colnames(ex_predicted_fr_female)[2] <-"Country"

ex_fr_female <- rbind(ex_predicted_fr_female, ex_observed_fr_female)

ex_fr_female %>%
  tibble() %>%
  head()

## # A tibble: 6 x 6
##   Type      Country Gender  Year Age      Ex
##   <chr>     <chr>   <chr>  <int> <chr> <dbl>
## 1 Predicted France  Female  1816 0      42.0
## 2 Predicted France  Female  1816 1      48.5
## 3 Predicted France  Female  1816 2      49.8
## 4 Predicted France  Female  1816 3      50.5
## 5 Predicted France  Female  1816 4      50.6
## 6 Predicted France  Female  1816 5      50.4

ex_observed_fr_male <- ex_observed_pivot_fr_male %>%
  pivot_longer(
      names_to = "Age",
      cols = -c("Year","Country","Gender"),
      values_to =  "Ex")

ex_observed_fr_male <- cbind(rep("Observed",nrow(ex_observed_fr_male)),ex_observed_fr_male)
colnames(ex_observed_fr_male)[1] <-"Type"
colnames(ex_observed_fr_male)[2] <-"Country"


ex_predicted_fr_male <- ex_predicted_pivot_fr_male %>%
  pivot_longer(
      names_to = "Age",
      cols = -c("Year","Country","Gender"),
      values_to =  "Ex")

ex_predicted_fr_male <- cbind(rep("Predicted",nrow(ex_predicted_fr_male)),ex_predicted_fr_male)
colnames(ex_predicted_fr_male)[1] <-"Type"
colnames(ex_predicted_fr_male)[2] <-"Country"

ex_fr_male <- rbind(ex_predicted_fr_male, ex_observed_fr_male)

ex_fr_male %>%
  tibble() %>%
  head()

## # A tibble: 6 x 6
##   Type      Country Gender  Year Age      Ex
##   <chr>     <chr>   <chr>  <int> <chr> <dbl>
## 1 Predicted France  Male    1816 0      40.2
## 2 Predicted France  Male    1816 1      47.7
## 3 Predicted France  Male    1816 2      48.9
## 4 Predicted France  Male    1816 3      49.5
## 5 Predicted France  Male    1816 4      49.7
## 6 Predicted France  Male    1816 5      49.5

ex_fr_female$Age <- as.numeric(ex_fr_female$Age)
femalefrex<-(ggplot(data = ex_fr_female) +
   aes(x = Year, 
      y = Ex) +
    aes(frame = Age,
        color = ex_fr_female$Type)+
   geom_line() +
  theme(legend.position = "none") +
  labs(   color = "Type" )) %>% 
  plotly::ggplotly(height = 500, width=750 )

ex_fr_male$Age <- as.numeric(ex_fr_male$Age)
malefrex<-(ggplot(data = ex_fr_male) +
   aes(x = Year, 
      y = Ex) +
    aes(frame = Age,
        color = ex_fr_male$Type)+
   geom_line() +
  theme(legend.position = "none") +
  labs(   color = "Type" )) %>% 
  plotly::ggplotly(height = 700, width=750 )

subplot(malefrex, femalefrex ) %>% layout(showlegend = TRUE ,title = list(text = paste0('Life expectancy for Male(Left) and Female(Right) ', '<br>', 'in France ',
                                    '<br>',
                                    '<sup>',
                                    'at a given Age & Evolution in Time',
                                    '</sup>')))

Commentaire sur les comparaisons prédictions/observations

Pour la France et les Etats-Unis, les prédictions sont presque identiques aux observations. Nous pouvons donc conclure que les prédictions du modèle Lee-Carter sont particulièrement précises, tant pour la France que pour les Etats-Unis.

Projet 2021 Analyse de Données

Western countries in 1948

Plot log-Qx for all countries for Year 1948

Commentaire

Plot Eu Countries/USA Qx ratios, for Year 1948

Commentaire

Death Rate Evoution since WWII

Death Rate Evolution since WWII, using facet

Mortality Ratio Function

Plot Mortality Ratios with USA as reference from 1946

Commentaire

Trends

Plot of log-mortality rates at birth and at age 1 & 5

Plot of log-mortality rates at age 15, 20, 40 & 60

Rearrangement

Pivot Life Table

Calculation of life expectancy at birth of a population by year A

Expectancy

Life Expectancy Function for a given age

Res life expectancy table

Plotting residuals life expectancy at age 60 and 65

PCA and SVD over Log-Mortality for France, Female

PCA for all Countries

Lee Carter Model

Données US

Lee Carter from SVD parameters for USA Both

Lee Carter from SVD parameters for USA Female

Lee Carter from Demog parameters for USA, Both

Lee Carter from Demog parameters for USA, Both

Lee Carter from Demog parameters for USA, Female

Lee Carter from Demog parameters for USA, Male

Données France

LEE CARTER SVD FRANCE BOTH, MALE, FEMALE

Lee Carter from SVD parameters for France, Male

Lee Carter from SVD parameters for France, Female

Lee Carter from demog parameters for France, Male

Lee Carter from demog parameters for France, Both

Remarque générale sur le forecast de Lee-Carter

Predictions of life expectancies residuals

Predictions of life expectancies at given ages - USA Male and Female

Predictions of life expectancies at given ages - France Male and Female

Commentaire sur les comparaisons prédictions/observations