The Generations and Gender Survey (GGS) is a cross-national survey on the life-course and family dynamics of women and men aged 18-79 years. The GGS currently consists of two rounds. Data collection for the first round (GGS-I) took place between 2004 and 2012 and is now complete. The second round of data collection (GGS-II) officially started in 2020 with an enhanced survey design, a refined baseline questionnaire, and refreshed samples.
The GGS-II documentation is available online on the GGP Colectica Portal: https://ggp.colectica.org/
GGS-II data is available for download in the GGP User Space after registration and completion of the application procedure. Once access to one GGS-II dataset has been granted, the datasets of the other GGS-II countries become available immediately after the request for them is filled in.
The data is available in different formats: .dta (Stata), .sav (SPSS), and .xlsx (Excel).
The dataset of each country is processed in such a way that it is harmonized with the other GGS-II datasets to reduce the need for users to post-harmonize the data. As such, it is possible to append all country data for a cross-country comparison.
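As an illustration of appending country data, the following is a minimal sketch using pandas. The two small data frames are made-up stand-ins for harmonized country files, and the variable names, country values, and the function name append_countries are assumptions for illustration only, not part of the GGS-II release.

```python
import pandas as pd

# Hypothetical miniature stand-ins for two harmonized country files.
# With real data, each frame would be loaded from a downloaded file instead.
country_a = pd.DataFrame({
    "country": [24, 24],
    "age": [34, 51],
    "cnt_dem01_2401": [1, 2],  # country-specific variable, present in one file only
})
country_b = pd.DataFrame({
    "country": [29, 29, 29],
    "age": [22, 45, 67],
})


def append_countries(frames):
    """Stack harmonized country datasets row-wise into one pooled frame.

    Shared variables line up by name; variables that exist in only one
    country are kept and left missing for the other countries.
    """
    return pd.concat(frames, ignore_index=True, sort=False)


pooled = append_countries([country_a, country_b])
print(pooled)
```

Because harmonized variables share names across countries, a row-wise append is enough; country-specific variables simply remain missing for respondents from other countries.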
The datasets are prepared based on version 3.1 of the baseline questionnaire. Some countries (e.g., Norway) fielded an earlier version of the questionnaire, so their datasets still include additional variables that are not part of version 3.1. These variables are coded as country-specific, which ensures harmonization between countries.
The variables contain the labels and response options as in the baseline questionnaire. The full question is also stored within the variable.
Any country-specific deviations are systematically coded using four-digit country codes. The variable “country” provides an overview of the country codes.
Country-specific values are added when the question follows the baseline questionnaire but the answers are not compatible, or only partly compatible, with the baseline response options. They consist of the country code plus a number, e.g., 2901.
A country-specific variable is introduced when the question differs from the baseline questionnaire or has been added to it. This kind of variable is identified by the prefix ‘cnt_’ and a suffix consisting of the country code plus a number, e.g., cnt_dem01_2401.
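The naming convention for country-specific variables can be checked programmatically. The sketch below assumes the pattern is the prefix cnt_, a stem, and an optional four-digit suffix; the regular expression and the function name parse_country_variable are illustrative assumptions rather than an official specification.

```python
import re

# Assumed pattern: 'cnt_' + stem + optional '_' followed by four digits.
# The suffix is optional so that names such as 'cnt_weight' are also recognized.
CNT_PATTERN = re.compile(r"^cnt_(?P<stem>.+?)(?:_(?P<suffix>\d{4}))?$")


def parse_country_variable(name):
    """Return the stem and four-digit suffix of a country-specific
    variable name, or None if the name does not start with 'cnt_'."""
    match = CNT_PATTERN.match(name)
    if match is None:
        return None
    return {"stem": match.group("stem"), "suffix": match.group("suffix")}


for name in ["cnt_dem01_2401", "cnt_weight", "dem01"]:
    print(name, parse_country_variable(name))
```

A helper like this makes it easy to list all country-specific variables in a pooled file before deciding which of them to keep for a cross-country analysis.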
A break-off occurs when a respondent quits the survey before reaching the final question. These cases are marked with the missing value “.h incomplete survey”. Respondents who quit the survey in the first two sections, DEM or LHI, are removed from the dataset.
Missing values in the dataset are indicated by system-generated codes. When a value is missing due to specific reasons, it is marked as follows:
.a Don’t know
.c Not applicable
.h Incomplete survey
Certain variables feature special response categories. How these categories are coded depends on whether the variable is continuous or categorical. For continuous variables, the special response categories are also coded as system-missing values so that the variable remains continuous:
.e Mainly work from home
.f Not at all
.g Not working or homemaker
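The missing-value codes listed above can be separated from valid answers before analysis. The sketch below assumes the codes have been exported or converted to their string form (for example ".a"); how they actually appear depends on the file format and the software used to read it. The dictionaries only contain the codes named in this documentation, and the function name split_values is hypothetical.

```python
import pandas as pd

# Codes named in the documentation for values missing for a specific reason.
MISSING_LABELS = {
    ".a": "Don't know",
    ".c": "Not applicable",
    ".h": "Incomplete survey",
}
# Codes named in the documentation for special response categories.
SPECIAL_LABELS = {
    ".e": "Mainly work from home",
    ".f": "Not at all",
    ".g": "Not working or homemaker",
}


def split_values(series):
    """Split a variable into numeric values and missing-code labels.

    Assumes (for illustration) that coded missings are stored as strings
    such as '.a'. Returns a numeric series with all coded entries set to
    NaN, and a series holding the label of each coded entry (else None).
    """
    labels = {**MISSING_LABELS, **SPECIAL_LABELS}
    reasons = series.map(lambda v: labels.get(v) if isinstance(v, str) else None)
    numeric = pd.to_numeric(series.where(reasons.isna()), errors="coerce")
    return numeric, reasons


# Made-up example values for a continuous variable.
raw = pd.Series([3, ".a", 5, ".h", ".f"], dtype="object")
numeric, reasons = split_values(raw)
print(numeric.mean())
print(reasons.value_counts())
```

Keeping the reasons in a separate series preserves the distinction between, say, break-offs and "Don't know" answers while leaving the numeric variable ready for computation.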
Post-stratification weights and design weights are included in the datasets. The post-stratification weights are produced using Iterative Proportional Fitting, based on the most recent and reliable population figures provided by the country teams for five items: age, gender, region, level of education, and marital status. This accounts for selectivity in response, making within-country and cross-country comparative research more reliable. In some countries, more detailed information is available, and the country teams chose to produce additional weights themselves. This weight variable is called cnt_weight and can only be used for within-country analyses.
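To give an intuition for Iterative Proportional Fitting, here is a minimal two-way sketch using NumPy. It is not the procedure used to produce the GGS-II weights, which adjusts on five items; the table, the population targets, and the function name rake are made up for illustration.

```python
import numpy as np


def rake(sample_counts, row_targets, col_targets, max_iter=100, tol=1e-10):
    """Two-way iterative proportional fitting (raking).

    Alternately rescales rows and columns of the sample table until
    both sets of margins match the population targets.
    """
    fitted = np.asarray(sample_counts, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        # Scale each row so the row totals match the row targets.
        fitted *= (row_targets / fitted.sum(axis=1))[:, None]
        # Scale each column so the column totals match the column targets.
        fitted *= (col_targets / fitted.sum(axis=0))[None, :]
        # Columns now match exactly; stop once rows match as well.
        if np.allclose(fitted.sum(axis=1), row_targets, atol=tol):
            break
    return fitted


# Made-up sample counts: rows = two age groups, columns = two genders.
sample = np.array([[30.0, 20.0], [25.0, 25.0]])
row_targets = [40.0, 60.0]  # hypothetical population totals by age group
col_targets = [48.0, 52.0]  # hypothetical population totals by gender

fitted = rake(sample, row_targets, col_targets)
weights = fitted / sample  # weight applied to each respondent in a cell
print(weights)
```

Each respondent receives the weight of the cell they fall into, so that the weighted sample reproduces the population margins on every adjustment variable.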