Data Issues Paper
December 2023
Downloads
Overview
The Data Issues Paper provides a summary of data-related issues that have been identified in the Ten to Men data. It has been designed to assist users of the data as they undertake research and analysis. It should be read in conjunction with the Ten to Men Data User Guide.
The Data Issues Paper provides information to data users on:
- observed inconsistencies and issues that they should be aware of when analysing and interpreting the Ten to Men data
- recommendations and guidance in the management of identified data quality issues in the Ten to Men data.
The Data Issues Paper has been divided into 3 sections:
- a history of the Ten to Men datasets
- changes to the structure of the Ten to Men datasets
- description of identified data quality issues.
Further sections will be added as any data-related issues emerge.
Data Issue Paper Updates
Date | Version | Update | Suggested citation |
---|---|---|---|
September 2019 | 1.0 | Initial version | Howell, L., Bandara, D., Mohal, J., Andalon, M., Silbert, M., Garrard, B., Swami, N., & Daraganova, G. (2019). Ten to Men: The Australian Longitudinal Study on Male Health - Data Issues Paper, Version 1.0, September 2019. Melbourne: Australian Institute of Family Studies. |
September 2021 | 2.0 | Updated for Wave 3 | Howell, L., Silbert, M., & Bandara, D. (2021). Ten to Men: The Australian Longitudinal Study on Male Health - Data Issues Paper, Version 2.0, September 2021. Melbourne: Australian Institute of Family Studies. |
March 2022 | 2.1 | Addition of Section 3.9 Current Occupation | Howell, L., Silbert, M., & Bandara, D. (2022). Ten to Men: The Australian Longitudinal Study on Male Health - Data Issues Paper, Version 2.1, March 2022. Melbourne: Australian Institute of Family Studies. |
September 2023 | 3.0 | Updated for Wave 4 | Volpe, F. Suares, M., Silbert, M., & Martin, S. (2023). Ten to Men: The Australian Longitudinal Study on Male Health - Data Issues Paper, Release 4.0, (Waves 1-4). Melbourne: Australian Institute of Family Studies. |
1. Ten to Men data
Periodically a new release of the Ten to Men datasets will be generated as additional information becomes available after each data collection wave. The releases are numbered in sequential order and a new Digital Object Identifier (DOI) is minted.
There have been 5 releases of data:
- Release 1.0 was issued by the University of Melbourne and contained data from Wave 1 only.
- Release 2.0 was also issued by the University of Melbourne. It contained data from both Wave 1 and Wave 2, as well as a respondent dataset.
- Release 2.1 was issued by the Australian Institute of Family Studies (AIFS) and comprised of updated Wave 1 and Wave 2 datasets. Relevant data from the respondent dataset was included in these datasets and it is no longer available as a separate dataset.
- Release 3.0 was issued by AIFS and contained data from Wave 1, Wave 2 and Wave 3.
- Release 4.0 was also issued by AIFS, and contained data from Wave 1, Wave 2, Wave 3 and Wave 4.
A history of the dataset releases and suggested citations can be found in Appendix A.
2. Ten to Men datasets
This section documents the structural changes that have been applied to the Ten to Men datasets. These structural changes enhance the usability of the datasets, especially as data from additional waves are included. Structural changes include the merging of datasets, resolving data inconsistencies, addressing quality issues, and augmenting data resources with additional information.
Most of the major structural changes were implemented in Release 2.1. Table 1 provides a summary of all structural changes, and further details can be found in the corresponding sections.
Table 1: Summary of changes to the dataset structure
Structural change | Implementation | See section for further details |
---|---|---|
Addition of a data sharing framework | Release 2.1 | 2.1 |
Respondent data added to Wave 1 and Wave 2 datasets, removing the need for a separate dataset | Release 2.1 | 2.2 |
Renaming of variables to indicate wave, thus aligning with the standard naming convention for variables | Release 2.1 | 2.3 |
Renaming of variables to indicate research domain | Release 2.1 | 2.4 |
Addition of a new research domain for linked data | Release 2.1 | 2.5 |
Renaming of census linked variables to include reference year | Release 2.1 | 2.6 |
Renaming of weight variables to include reference to population or sample weights | Release 3.0 | 2.7 |
2.1 Addition of a data sharing framework
To increase the utility of information while minimising disclosure risks, a data sharing framework to differentiate the user's access level was adopted for Release 2.1 of the Ten to Men datasets. This resulted in 2 levels of datasets for each wave being generated - the General Release and the Restricted Release.
A lower level of confidentilisation was applied to the Restricted Release dataset, with all initial information preserved. The only information not included in the dataset are names, addresses and other contact details. Access to the Restricted Release dataset may only granted when data users are able to demonstrate a genuine need for the additional data, and when they also meet the necessary additional security requirements.
The General Release dataset has undergone further data confidentilisation. In addition to the information removed for the Restricted Release dataset, further confidentilisation for the General Release dataset includes supressing variables, aggregating response categories and recoding outlying values to a less extreme value. Users can consult the Ten to Men Data Dictionary for more information on the confidentialised variables.
As access requirements for the General Release dataset are less rigorous than for the Restricted Release dataset, this has improved accessibility for users to the Ten to Men datasets.
For further information about the Ten to Men datasets, including data access procedures, users can refer to Sections 3 and 8 of the Ten to Men Data User Guide.
2.2 Availability of respondent dataset
Release 2.0 comprised of 3 Ten to Men datasets - Respondent, Wave 1 and Wave 2. The Respondent dataset contained key indicator data, such as the unique study identifier, age, household identifier and geographical information. The dataset for each wave contained the responses to the corresponding questionnaires.
In Release 2.1, relevant information from the respondent dataset was included in the Ten to Men Wave 1 and Wave 2 datasets. This has removed the necessity of maintaining a separate respondent dataset, and thus only 2 datasets were released at each level - Wave 1 and Wave 2.
The inclusion of respondent data in the wave datasets has been applied in all subsequent releases.
2.3 Renaming of variables to indicate wave
The standard naming convention of the Ten to Men variables specifies that the first character of the variable should indicate the wave or be a 'z' if the variable is constant across waves.
In Releases 1.0 and 2.0, some variables in the respondent datasets did not follow this standard naming convention. The first character of the variable name was 'z' but they were no consistent across waves. In these cases, the variable label specified whether the variable related to Wave 1 or Wave 2.
It is important that variables conform to the Ten to Men standard naming convention to maintain consistency and uniformity across the data. Therefore, in Release 2.1, these variables were renamed to follow the standard naming convention; that is, the first charact of the variable was changed to indicate the wave if the variable was not constant across waves.
Further details of all variables that were renamed are shown in Appendix B.
2.4 Renaming of variables to indicate research domain
The standard naming convention of the variables in the Ten to Men datasets specifies that the second and third characters of the variable should indicate the research domain. A list of all the research domains can be found in the Ten to Men Data Dictionary and the Ten to Men Data User Guide.
One variable was identified in Releases 1.0 and 2.0 where the second and third characters of the variable did not correspond to a research domain. As it is important to maintain consistency, this variable has been renamed to reflect the correct research domain.
Table 2: List of variables with corrections to the research domain
Variable name | Research domain (2nd and 3rd characters) | Time of correction | |
---|---|---|---|
Original | bhxsex120a | hx is not a research domain | |
Correction | bbxsex120a | Behaviours - sexual behaviour (bx) | Release 2.1 |
2.5 Additional research domain for linked data
In Releases 1.0 and 2.0, the research domain of Data Collection (dc) is comprised of both key indicator variables and linked data. This includes variables such as the unique study ID, participation indicators, household indicators, statistical area codes (SA1, SA2) and numerous socio-economic indexes for areas (SEIFA).
To provide transparency about the data source, these variables were separated into 2 research domains for Release 2.1 The key indicator variables remain in the research domain of Data Collection (dc), while an additional research domain was created for Linked Data (ld).
The standard naming convention for variables specifies that the second and third characters of the variable name should indicate the research domain. Consequently, the creation of a new research domain resulted in the renaming of some variables to conform to this standard. That is, the second and third characters of the variable names in the linked data research domain were changed form 'dc' to 'ld'.
Further details of all variables that were renamed are shown in Appendix B.
2.6 Renaming of census-based variables
In Release 2.0, the respondent dataset contained linked data from the Australian Bureau of Statistics (ABS) 2011 Census. These variables did not contain any information to indicate the census year.
As new census data becomes available, this has been added to the Ten to Men datasets. It therefore became important to include a reference to the census year in the variable name.
In Release 2.1, the eighth and ninth characters of the variable name were changed to represent a year indicator. For example, the variable 'aldieod00i' has been renamed to 'aldieod11i' to indicate that it is based on the 2011 Census data. Linked data from the ABS 2016 Census was also available and added in this release.
Further details of all variables that were renamed are shown in Appendix B.
2.7. Renaming of weight variables
Population weights have been included in all releases of the Ten to Men datasets.
Sample weights were added in Release 3.0, as well as the development of a variable naming framework for the weight variables. Wave 1 and Wave 2 weighting variables were renamed to comply with this framework and to clearly indicate whether the variable referred to a population or sampling weight.
Table 3 indicates the naming convention for weights that has been applied since Release 3.0.
Table 3: Naming convention for weights
Character position in variable name | Description | Variable abbreviation |
---|---|---|
1 | Wave | A, B, C or D |
2,3 | Research Domain | DC |
4 | Initial or Raked | I or R |
5 | Longitudinal or Cross-Sectional | L or C |
6 | Population or Sample | P or S |
7,8,9 | For Wave 1 | WTA |
7,8,9 | For Wave 2 | WTB |
7,8,9 | For Wave 3 | WTC |
7,8,9 | For Wave 4 | WTD |
7,8,9 | Between Waves 1 and 2 | WAB |
7,8,9 | Between Waves 1 and 3 | WAC |
7,8,9 | Between Waves 1 and 4 | WAD |
7,8,9 | Between Waves 2 and 3 | WBC |
7,8,9 | Between Waves 2 and 4 | WBD |
7,8,9 | Between Waves 3 and 4 | WCD |
10 | Derived | D |
Further details of the weighting variables that were renamed are shown in Appendix B.
3. Data quality issues
Data quality is measured by factors such as accuracy, validity, consistency, and completeness. AIFS undertakes validation procedures to ensure that the Ten to Men data quality is of an appropriate standard. However, it is the responsibility of the data user to assess the data quality of the Ten to Men variables before any analysis is undertaken.
This section contains information about data quality issues that have been identified across the waves of Ten to Men. Further information will be added as any additional data quality issues emerge.
Table 4 provides a summary of the identified data quality issues and the wave/s that are affected. Detailed information about the data issue and any recommendations can be found in the corresponding sections of this paper.
Table 4: Summary of data quality issues
Data quality issue | Wave 1 | Wave 2 | Wave 3 | Wave 4 | Section |
---|---|---|---|---|---|
Behaviours - alcohol | |||||
Behaviours - tobacco | |||||
Behaviours - weight | |||||
Data collection indicator | |||||
Health status | |||||
Social determinants - Life Events | |||||
Social determinants - Socioeconomic Status | |||||
Missing data | ● | ● | ● | ● | 3.1 |
Outliers | ● | ● | ● | ● | 3.2 |
Data from Parent Questionnaire | ● | ● | 3.4 | ||
Additional Wave 1 participants | ● | 3.12 | |||
Pilot data for Wave 2 | ● | 3.13 | |||
Variable naming inconsistencies in reference to the Research Domain | ● | 3.15 | |||
Derived variables | ● | ● | 3.16 | ||
Age of first drink of alcohol | ● | ● | 3.7 | ||
Age first smoked cigarette | ● | ● | 3.8 | ||
Height, Weight and Body Mass Index | ● | ● | ● | ● | 3.5 |
Height | ● | ● | 3.18 | ||
Update to weights | ● | ● | ● | 3.14 | |
Obstructive sleep apnoea | ● | 3.15 | |||
Short form 12 (SF-12) Health survey | ● | ● | 3.17 | ||
Other natural disasters | ● | 3.19 | |||
Age of respondents | ● | ● | ● | ● | 3.3 |
Level of education completed | ● | ● | 3.6 | ||
Country of birth | ● | 3.9 | |||
Current occupation | ● | ● | 3.10 | ||
Language spoken at home | ● | 3.11 |
3.1. Missing data
Most variables in the Ten to Men datasets have some proportion of missing data, which has been coded using the Ten to Men standard missing value code frame (see the Ten to Men Data User Guide for more information).
The proportion and reasons for missing data should be considered before drawing any conclusion from the data.
3.2 Outliers
All releases of the Ten to Men datasets contain the raw data, with variables that have not been cleaned for outliers. The exception to this is the categorising of the extreme ends as part of the confidentilisation process for the General Release datasets. The variables where this type of confidentilisation has been applied are indicated in the Ten to Men Data Dictionary.
Data users are advised to take care when using and interpreting the Ten to Men data as the presence of outliers may necessitate excluding values or categorising the extreme ends.
3.3 Age of respondents
Cohort inconsistencies
The scope of Ten to Men was males aged 10-55 years at Wave 1, with 3 cohorts:
- males aged 10-14 years completing a Boys questionnaire
- males aged 15-17 years completing a Young Men questionnaire
- males aged 18 years and over completing an Adult questionnaire.
However, there were a small number of men invited to participate whose age was outside the scope or who completed the incorrect questionnaire for their age. The inconsistency arises with less than 0.5% of respondents and is likely to have occurred due to the difference in time between sending out the hard copy questionnaires and the respondents completing the questionnaires. The survey data for these respondents have been retained in the Ten to Men datasets.
The inconsistencies are present in all releases of the Wave 1 and Wave 2 datasets. From Wave 3, there was only one questionnaire and therefore this inconsistency is not an issue.
Calculation of age in Wave 3
In Wave 3 and all subsequent waves, the age of the respondent was not asked in the questionnaire. For inclusion in the datasets, the age is calculated using the respondent's date of birth.
As part of the respondent validation process for Wave 3 , the date of birth was asked. Therefore, there are 2 sources of the date of birth - the master contact file and Wave 3 survey data. A process was undertaken to compare the date of birth from the 2 sources, and it was the same for 97% of respondents.
Further investigation of the 3% where the date of birth differed showed that many only supplied the birth year for Wave 3 data. An assumption has been made that the birth date on the contact file is correct and this has been used to calculate the age of the respondent in Wave 3 (the Wave 3 survey date was also used in the calculations).
There are 5 observations where no date of birth has been supplied (in either Wave 3 or the master contact file). In these cases, the age at Wave 1 and Wave 2, as well as the survey completion dates have been used to impute an age for Wave 3. The 5 unique study identifiers (zdcid0001d) where this occurred are 5003136, 7006305, 7007404, 8010082 and 9015997.
In Wave 4, the date of birth was asked again as part of the respondent validation process.
A process was undertaken to compare the date of birth against the master contact file, and it was the same for 99% of respondents.
Further investigation of the 1% where the date of birth differed showed that many only supplied the birth year for Wave 3 data. As per Wave 3, an assumption has been made that the birth date on the contact file is correct and this has been used to calculate the age of the respondent in Wave 4 (the Wave 4 survey date was also used in the calculations).
New Inconsistencies have been found since asking date of birth in Wave 4. Differences in Wave 4, Wave 3 and master contact file are now present. To make age consistent in Wave 4, and across subsequent waves, age in Wave 4 has calculated either from date of birth from master contact file or if missing, from a combination of Wave 3 based on Wave 1 age and survey completion dates.
There is one observation where no date of birth has been supplied (in either Wave 4, Wave 3 or the master contact file). In this case, the age at Wave 1, Wave 2, Wave 3 as well as the survey completion dates have been used to impute an age for Wave 4. The unique study identifiers (zdcid0001d) where this occurred is 7007404.
3.4 Parent questionnaire data
For Wave 1 and Wave 2 of Ten to Men, the parents of the males aged 10-14 years also completed a questionnaire. The parent was not assigned an ID and therefore, it cannot be determined if the same parent filled in the questionnaire for both Wave 1 and Wave 2. This is important as some questions were subject to the parent's perception. For example, 'In the past 4 weeks, how often does your child feel happy?'
As a result, data users are advised to take extreme care if comparing responses from the Parent questionnaire across Wave 1 and Wave 2.
3.5 Anthropometric measurements
The Ten to Men questionnaires contain questions about anthropometric measurements. Some of the responses are implausible (e.g. a height of 1 cm).
Data users are advised to clean and make their own decisions when dealing with anthropometric measurements as they may contain erroneous data values that will affect derived values and interpretations.
3.6 Level of education completed
Questions about the completed level of education have been asked in all waves of Ten to Men. However, the response categories for the various questionnaires (Boys, Young Men, Adults, Parents) and waves has not been consistent. Extreme care needs to be taken when using this education data, especially if comparing values across questionnaires and/or waves.
The Australian Standard Classification of Education (ASCED) could be used to further categorise this data. In this case, Primary education should also include Year 7 for South Australia only. More information on the ASCED and how it is structured can be found on the ABS website.
3.7 Age when first drank alcohol
Summary
A data issue has been identified with the responses to the question 'How old were you when you first drank more than just a sip or a taste of alcohol?'. The question was included on 3 questionnaires (Boys, Young Men and Adults) for Waves 1 and 2, and a common variable was created to hold the responses for each wave:
- 'abaalcagem' contains the responses from all questionnaires for Wave 1
- 'bbaalcagem' contains the responses from all questionnaires for Wave 2.
The data issue arose as a format was applied to the responses to this question on the Boys questionnaire. No format, other than the missing value formats, was applied to the responses to this question on the Young Men and Adults questionnaires. When the data from the Boys questionnaire was merged with the data from the Young Men and Adults questionnaires to create the common variable, the format for the responses from the Boys questionnaire was not applied.
As a result, the data from the Boys questionnaire for this question was incorrectly reduced by 4 years. For example, a response of 12 years would be recorded as 8 years in the Ten to Men dataset.
This data issue is present in Releases 1.0 and 2.0 of the Ten to Men datasets, but the raw data has been amended in Release 2.1.
Further details
The data (excluding the missing values) from Release 2.0 of the Ten to Men datasets is shown in Table 5. Responses from both the Boys and Young Men questionnaires are shown for comparison. Each cell in the table is colour coded:
- Grey - representing recorded plausible responses in the Ten to Men dataset
- Green - representing no recorded responses in the Ten to Men dataset
- Black - representing recorded implausible values given the age of the respondent at the time of the survey (e.g. a 10 year old cannot respond that they started drinking at 12 years).
Table 5: Data from Release 2.0 (Waves 1 and 2)
There is an issue with the data from the Boys questionnaire, as there is no recorded response of anyone having their first drink of alcohol after the age of 10 (green cells). There is also a higher than expected number of respondents having their first drink of alcohol before the age of 5 years (grey cells).
This issue is especially evident when compared to the data from the Young Men questionnaire.
Further investigation identified a problem with different formats being applied.
The format applied to the responses to this question on the Boys questionnaire is shown in Table 6. Applying this format meant the if the respondent replied 10 years of age, the data was entered as 6.
Table 6: Format applied to the Boys questionnaire
Code | Format |
---|---|
-8 | No questionnaire or interview completed |
-7 | Unable to determine value |
-6 | Value implausible |
-5 | Invalid multiple response |
-4 | Refused or not answered |
-3 | Don't know |
-2 | Not applicable |
-1 | Not asked |
1 | 5 years old |
2 | 6 years old |
3 | 7 years old |
4 | 8 years old |
5 | 9 years old |
6 | 10 years old |
7 | 11 years old |
8 | 12 years old |
9 | 13 years old |
10 | 14 years old |
The corresponding question in the Young Men and Adults questionnaires only had the missing value formats applied (codes -8 to -1). For example, if the respondent replied 10 years of age, the data was entered as 10.
So in summary, if the respondent replied 10 years of age, the data entered was either 6 (Boys questionnaire) or 10 (Young Men or Adults questionnaires).
The data from all questionnaires was then combined to form the Ten to Men datasets. When the data from the Boys questionnaire was merged with the data from the Young Men and Adults questionnaires, no format other than the missing value formats was applied. The format for the Boys questionnaire was not applied and the formatted age value was replaced with the code. As a result, the age of the first drink of alcohol for the Boys data was reduced by 4 years (with the maximum age possible being 10).
This issue is present in both Release 1.0 and 2.0 of the Ten to Men datasets.
3.8 Age when first smoked cigarettes
A data issue has been identified with the responses to the question 'How old were you when you first smoked your first cigarette?'. The question was included on 3 questionnaires (Boys, Young Men and Adults) for Waves 1 and 2, and a common variable was created to hold the responses for each wave:
- 'abtcigagem' contains the responses from all questionnaires for Wave 1
- 'bbtcigagem ' contains the responses from all questionnaires for Wave 2.
The data issue arose as a format was applied to the responses to this question on the Boys questionnaire. No format, other than the missing value formats, was applied to the responses to this question on the Young Men and Adults questionnaires. When the data from the Boys questionnaire was merged with the data from the Young Men and Adults questionnaires to create the common variable, the format for the responses from the Boys questionnaire was not applied.
As a result, the data from the Boys questionnaire for this question was incorrectly reduced by 4 years. For example, a response of 12 years would be recorded as 8 years in the Ten to Men dataset.
This data issue is present in Releases 1.0 and 2.0 of the Ten to Men datasets, but the raw data has been amended in Release 2.1.
As it is the same data issue as described above, see section 3.7 for further details.
3.9 Country of birth
This section describes a data issue that was present in all Releases prior to Release 4.0. A file containing the raw country of birth data was located in late 2022, and therefore this data issue was corrected for the Wave 1 dataset in Release 4.0.
In Wave 1 of Ten to Men, each questionnaire contained 3 questions about participant's country of birth and their parents' country of birth. The response options included 'Other', where the respondent could specify any country using the free text field.
The data were recorded in the 3 variables:
- participant's country of birth (asecobownm)
- mother's country of birth (asemocob1m)
- father's country of birth (asefacob1m).
This was then re-coded using the Standard Australian Classification of Countries (SACC) and an additional 9 variables at the 1-digit, 2-digit and 4-digit levels were created. These variables contain more detail than the categories provided on the questionnaire, as the 'Other' category has been expanded to include languages specified in the free text field. They are:
- participant's country of birth (asecobow1md, asecobow2md, asecobow4md)
- mother's country of birth (asemocob1md, asemocob2md, asemocob4md)
- father's country of birth (asefacob1md, asefacob2md, asefacob4md).
Although the SACC is a 3-level hierarchical structure, this has not been strictly applied to the data. Small values at the 2-digit and 4-digit levels have been confidentialised by replacing with 99 or 9999 instead of using the supplementary codes (not further defined (nfd)).
Therefore, care should be taken when using the variables at the 2-digit and 4-digit levels, as it will give higher 'Other' results than expected. Further details of the coding are shown in Table 7.
For data users, it is recommended that the variables at the 2-digit and 4-didigt levels are used in conjunction with the 1-digit level variable. The confidentialised variables at the 2-digit and 4-digit levels can then be replaced with the corresponding nfd code.
Table 7: Country of birth codes
Country of birth (1-digit code) | Country of birth (2-digit code) | Suggested replacement country of birth (2-digit code) | Wave 1 frequency |
---|---|---|---|
1 | 99 | 10 Oceania and Antarctica nfd | 46 |
2 | 99 | 20 North-West Europe nfd | 17 |
3 | 99 | 30 Southern and Eastern Europe nfd | 26 |
4 | 99 | 40 North Africa and Middle East nfd | 45 |
5 | 99 | 50 South-East Asia nfd | 0 |
6 | 99 | 60 North-East Asia nfd | 29 |
7 | 99 | 70 Southern and Central Asia nfd | 28 |
8 | 99 | 80 Americas nfd | 10 |
3.10 Current occupation
In Wave 1 and Wave 2 of Ten to Men, the Adult questionnaire contained a question about the participant's current occupation. It was a free text field, requesting both the Job title and the main duties/tasks.
This data was then coded using the Australian and New Zealand Standard Classification of Occupations (ANZSCO). Three variables for the participant's current occupation were created for each wave. These are at the 1-digit, 2-digit and 4-digit levels:
- 1-digit level (aseempoc1ad, bseempoc1ad)
- 2-digit level (aseempoc2ad, bseempoc2ad)
- 4-digit level (aseempoc4ad, bseempoc4ad)
Although ANZSCO is a 3-level hierarchical structure, this has not been strictly applied to the data. Small values at the 2-digit and 4-digit levels have been confidentialised by replacing with 99 or 9999 instead of using the supplementary codes (not further defined (nfd)). Some values at the 2-digit level have been coded as -7 (Unable to determine value) because the 4-digit level has been confidentialised to 9999.
Therefore, care should be taken when using the variables at the 2-digit and 4-digit levels, as it will give higher 'Other' results than expected. Further details are shown below in Table 8.
Table 8: Current occupation codes
Current occupation (1-digit code) | Current occupation (2-digit code) | Suggested replacement current occupation (2-digit code) | Wave 1 frequency | Wave 2 frequency |
---|---|---|---|---|
1 | -7 | 10 Managers nfd | 166 | 154 |
2 | -7 | 20 Professionals nfd | 65 | 55 |
3 | -7 | 30 Technicians and Trades Workers nfd | 70 | 97 |
5 | -7 | 50 Clerical and Administrative Workers nfd | 14 | 12 |
5 | 99 | 50 Clerical and Administrative Workers nfd | 41 | 32 |
6 | -7 | 60 Sales Workers nfd | 44 | 38 |
6 | 99 | 60 Sales Workers nfd | 49 | 38 |
7 | -7 | 70 Machinery Operators and Drivers nfd | 75 | 43 |
8 | -7 | 80 Labourers nfd | 41 | 40 |
8 | 99 | 80 Labourers nfd | 0 | 42 |
For data users, it is recommended that the variables at the 2-digit and 4-digit levels are used in conjunction with the 1-digit level variable. The confidentialised variables at the 2-digit and 4-digit levels can then be replaced with the corresponding nfd code.
The Parent's questionnaire asked the same question about the parent's current occupation. The variables for this are:
- 1-digit level (aseempoc1pd, bseempoc1pd)
- 2-digit level (aseempoc2pd, bseempoc2pd)
- 4-digit level (aseempoc4pd, bseempoc4pd)
This data has the same issue and recommendations as the participant's current occupation.
Table 9: Current occupation codes (Parents)
Current occupation (1-digit code) | Current occupation (2-digit code) | Suggested replacement current occupation (2-digit code) | Wave 1 frequency | Wave 2 frequency |
---|---|---|---|---|
1 | -7 | 10 Managers nfd | 10 | 10 |
1 | 99 | 10 Managers nfd | 103 | 62 |
2 | -7 | 20 Professionals nfd | 5 | 2 |
2 | 99 | 20 Professionals nfd | 65 | 66 |
3 | -7 | 30 Technicians and Trades Workers nfd | 1 | 0 |
3 | 99 | 30 Technicians and Trades Workers nfd | 54 | 0 |
4 | 99 | Community and Personal Service Workers nfd | 46 | 62 |
5 | -7 | 50 Clerical and Administrative Workers nfd | 2 | 4 |
5 | 99 | 50 Clerical and Administrative Workers nfd | 119 | 68 |
8 | -7 | 80 Labourers nfd | 3 | 0 |
8 | 99 | 80 Labourers nfd | 58 | 0 |
9 | -7 | 99 Other | 9 | 3 |
3.11 Language spoken at home
In Wave 1 of Ten to Men, each questionnaire contained a question about the language spoken at home. However, the response categories varied across the 3 different questionnaires.
Adult questionnaire
The Adult questionnaire had 7 options for the response to the question about language, which are shown in Table 10. Once option was 'Other', where the respondent could specify any other language using a free text field.
Table 10: Language codes for Adult questionnaire
Code | Language |
---|---|
1201 | English |
2201 | Greek |
2401 | Italian |
4202 | Arabic |
6302 | Vietnamese |
7104 | Mandarin |
9999 | Other |
This data was then re-coded using the Australian Standard Classification of Languages (ASCL) and 3 variables at the 1-digit, 2-digit and 4-digit levels were created (aselangh1ad, aselangh2ad, aselangh4ad). These variables contain more detail than the categories on the questionnaire, as the 'Other' category has been expanded to include languages specified in the free text field.
Table 11: Language codes for Adult questionnaire
Language (1-digit level) aselangh1ad | Language (2-digit level) aselangh2ad | Suggested replacement language (2-digit level) | Wave 1 frequency |
---|---|---|---|
1 | 99 | 10 Northern European Languages nfd | 30 |
2 | 99 | 20 Southern European Languages nfd | 72 |
3 | 99 | 30 Eastern European Languages nfd | 56 |
4 | 99 | 40 Southwest and Central Asian Languages nfd | 57 |
5 | 99 | 50 Southern Asian Languages nfd | 2 |
6 | 99 | 60 Southeast Asian Languages nfd | 30 |
7 | 99 | 70 Eastern Asian Languages nfd | 20 |
Although detailed information on the language can be obtained, the small values at these levels have resulted in the variables being confidentialised (some values have been replaced by 99 or 9999). Care should be taken when using the variables at the 2-digit and 4-digit levels, as it will give higher 'Other' results than expected. Further details are shown in Table 11.
We recommend that the variables at the 2-digit and 4-digit levels be used in conjunction with the 'aselangh1ad' variable. The confidentialised variables at the 2-digit and 4-digit levels can then be replaced with the corresponding nfd code.
Boys and Young Men questionnaires
The Boys and Young Men questionnaires only had 3 options for the response to this question about language, as shown in Table 12, and recorded as the variable 'aselangh1u'.
Table 12: Language codes for Boys and Young Men questionnaires
Code | Language |
---|---|
1 | English |
2 | Another language |
3 | English and another language about equally |
The respondent could specify the other language using the free text field and this was re-coded using the ASCL. Three variables at the 1-digit, 2-digit and 4-digit levels were created (aselangh1ud, aselangh2ud, aselangh4ud).
However, the small values at this level have resulted in the variables being totally confidentialised (all values have been replaced by 9, 99 or 9999).
Therefore, no information about the other languages spoken at home is available in the Ten to Men datasets for the Boys and Young Men.
3.12 Additional Wave 1 participants
During Wave 2 of Ten to Men, 33 additional participants were identified for Wave 1. They were not included in the original Wave 1 dataset (Release 1.0) as their eligibility and consent status had not been determined at that stage, but this issue was resolved during Wave 2.
In Release 1.0, the sample size for Wave 1 was 15,988. This was comprised of the 3 cohorts:
- 1,087 males aged 10-14 years completing a Boys questionnaire
- 1,017 males aged 15-17 years completing a Young Men questionnaire
- 13,884 males aged 18 years and over completing an Adult questionnaire.
In Releases 2.0 and 2.1, the 33 additional participants have been subsequently included in Wave 1, taking the reconciled sample size for Wave 1 to 16,021. The reconciled cohort sizes are:
- 1,099 males aged 10-14 years completing a Boys questionnaire
- 1,026 males aged 15-17 years completing a Young Men questionnaire
- 13,896 males aged 18 years and over completing an Adult questionnaire.
3.13 Pilot data for Wave 2
Of the reconciled Wave 1 sample, there were 314 respondents who were interviewed in the Ten to Men pilot for Wave 2. These respondents did not complete a questionnaire during the course of the main data collection period for Wave 2.
In Releases 1.0 and 2.0, the pilot data has been included in Wave 2 datasets. The sample size was 12,250 males.
In Release 2.1, the data for these 314 respondents have been removed from the Wave 2 dataset. This has reduced the sample size for Wave 2 to 11,936 males. From this Release onwards, these 314 respondents will remain part of the pilot and not be included in the main sample.
3.14 Update to weights
Release 1.0 and 2.0 only included sample weights for Wave 1.
Upon review of the Ten to Men data, it was decided to include Wave 2 weights in Release 2.1. It was necessary to update the Wave 1 weights to ensure that the weights for Wave 2 were developed using the same approach and references as those used in the calculation of the Wave 1 weights.
Therefore, Release 2.1 of the Ten to Men datasets contains the updated weights for Wave 1 and the new sample weights for Wave 2.
In Release 3.0, population and sample weights have been included for all waves.
3.15 Obstructive sleep apnoea
For Wave 2 of Ten to Men, there were 4 questions asked in the Adult questionnaire relating to obstructive sleep apnoea as part of the STOP-Bang questionnaire screening tool. Further information about this screening tool can be found on the STOP-Bang website.
Four objective measures are also required as part of the STOP-Bang questionnaire screening tool: BMI, age, neck circumference and gender. The responses to these 8 elements are scored, with the result indicating low, medium or high risk of obstructive sleep apnoea.
The resulting score was recorded in the Ten to Men Wave 2 dataset as the derived variable:
- Risk of OSA (STOP-Bang) (bhsosarisad).
The values of this derived variable should be stored as a score (0-8 scale), or as a Low/Medium/High format.
In Release 2.0, this variable had values of 0 or 1.
In Release 2.1, the intention was to recalculate the derived variable. However, only 7 of the 8 elements of the STOP-Bang questionnaire screen were available, as we did not have information about the neck circumference.
As a result, this derived variable (bhsosarisad) has been removed from the datasets in Release 2.1 and all subsequent releases.
3.16 Derived variables
The Ten to Men dataset contains numerous derived variables, including scale and summary scores. The calculation of these derived variables generally require input from multiple raw variables, and it is possible that one or more of these input data values may be missing.
Missing values are given negative numeric values according to the Ten to Men standard missing value code frame. More information about this code frame can be found in the Ten to Men Data User Guide.
A couple of issues have been identified with the calculation of the derived variables in Releases 1.0 and 2.0:
- Any negative data values were replaced with zero in the calculation of the derived variables. This could introduce misinterpretation of data, depending on the derivation of each variable. For example, the mean of individual components may be underestimated when zero is assigned to a missing value.
- Incorrect calculation of some derived variables. For example, the elements of the General Wellbeing Scale were not reversed scored before calculating the mean.
Data users using Release 1.0 or 2.0 are advised to re-check and review the interpretation of the derived variables, as the derived variable values may be underestimated or overestimated.
For Release 2.1, a set of guidelines were developed for the treatment of missing input variables for the calculation of derived variables. These are:
- If all the missing input values had the same code frame, the derived variable was assigned the same missing value as per the code frame. For example, if all input variables were -4, the derived variable was assigned to be -4.
- If the input variables had any combination of missing values and some valid data values, the derived variable was assigned the missing value code of -7 (Unable to determine value).
All subsequent releases follow these guidelines.
3.17 Short Form 12 (SF-12) Health Survey
The Wave 1 and Wave 2 questionnaires for Adults included the SF-12 Health Survey: a licensed scale measuring respondent's health status. An SF-12 scale score was derived and included in the dataset for Release 1.0 and 2.0.
Due to issues relating to SF-12 license approvals, the raw data items and derived scale score have been removed from Release 2.1 and subsequent releases. These items have been redacted in the annotated questionnaires and have been deleted from the Data Dictionary.
3.18 Height
In Wave 3, height was only asked if the respondent was under 23 years. Therefore, 88% were not asked this question and the height variable was initially coded as -2 (Not applicable) in Wave3.
For Wave 4, height was only asked if the respondent was under 25 years. Therefore, 91% were not asked this question and the height variable was initially coded as -2 (Not applicable) in Wave 4.
The decision was made to impute the height at Waves 3 and 4 for all respondents where the question was not asked. There are 2 sources of height - from both the Wave 1 and Wave 2 data collection. Data from both waves were used, as some respondents in Wave 3 may not have participated in Wave 1 and/or Wave 2.
An assumption has been made that the largest height value is the most accurate, and this has been used to populate the height variable in Wave 3 and 4 for those respondents aged 23 years or above.
3.19 Other natural disasters
In Wave 3, the following 2 questions were asked about whether you or a family member had experienced a natural disaster:
- Have you been affected by any of the following natural disasters in the past year?
- Has a close friend or family member been affected by any of the following natural disasters in the past year?
One of the response options was 'Other', where a free text field then allowed more details about the type of natural disaster. A summary of the responses from the free text field is shown in Table 13.
More than 90% of the responses in the free text field specified coronavirus, covid, pandemic or something similar. Although technically correct (the Federal Government considers the COVID-19 pandemic a natural disaster), it was an unexpected response to these 2 questions. The COVID-19 pandemic has affected everyone, yet only some respondents reflected that.
The decision was made to include an additional 2 variables in the Wave 3 dataset. These 2 variables (cslndothc, cslndfmoc) reflect a new categorised other free text variable. The values of these variables are shown in Table 14.
Table 13: Free text responses for other natural disasters
Question | Description | Wave 3 frequency |
---|---|---|
Have you been affected by any of the following natural disasters in the past year? | Coronavirus | 280 |
Earthquake | 8 | |
House fire | 3 | |
Other | 18 | |
Total | 309 | |
Has a close friend or family member been affected by any of the following natural disasters in the past year? | Coronavirus | 166 |
Other | 10 | |
Total | 176 |
Table 14: New categorical variables for other natural disasters
Variable | Label | Value | Description | Wave 3 frequency |
---|---|---|---|---|
cslndothc | Natural Disaster - Other Category | 0 | Coronavirus | 280 |
1 | Other | 29 | ||
-2 | Not Applicable | 7,610 | ||
cslndfmoc | Natural Disaster - Family/friend - Other Category | 0 | Coronavirus | 166 |
1 | Other | 10 | ||
-2 | Not Applicable | 7,743 |
Data users are advised to make their own decisions about whether to include or exclude the COVID-19 pandemic as a natural disaster.
Appendix A
History of dataset releases
Date | Release | Dataset | Suggested citation and DOI |
---|---|---|---|
July 2016 | Release 1.0 | Wave 1 | Pirkis, J., English, D., & Currier, D. (2016). The Australian Longitudinal Study on Male Health (Ten to Men), 2013. [computer file]. Canberra: Australian Data Archive, The Australian National University.
|
August 2017 | Release 2.0 | Respondent Wave 1 Wave 2 | Pirkis, J., English, D., & Currier, D. (2017). The Australian Longitudinal Study on Male Health (Ten to Men), 2013. [computer file]. Canberra: Australian Data Archive, The Australian National University.
|
September 2019 | Release 2.1 | Wave 1 Wave 2 | Bandara, D., Howell, L., & Daraganova, G. (2019). Ten to Men: The Australian Longitudinal Study on Male Health, Release 2.1 (Waves 1-2).
|
September 2021 | Release 3.0 | Wave 1 Wave 2 Wave 3 | Bandara, D., Howell, L., Silbert, M., & Daraganova, G. (2021). Ten to Men: The Australian Longitudinal Study on Male Health, Release 3 (Waves 1-3).
|
September 2023 | Release 4.0 | Wave 1 Wave 2 Wave 3 Wave 4 | Volpe, F. Suares, M., Silbert, M., & Martin, S. (2023). Ten to Men: The Australian Longitudinal Study on Male Health, Release 4.0, (Waves 1-4). |
Appendix B
The tables below show the variables that have been renamed for each release.
Table 15: Variables renamed for Release 2.1
Label | Wave | Old variable name (Release 2.0) | New variable name (Release 2.1) |
---|---|---|---|
SA1 code confidentialised (2011 Census based) | 1 | zdcsa1codmd | aldsa1c11md |
SA1 code confidentialised (2011 Census based) | 2 | zdcsa1codmd | bldsa1c11md |
SA1 code confidentialised (2016 Census based) | 2 | n/a | bldsa1c16md |
SA2 code confidentialised (2011 Census based) | 1 | zdcsa2codmd | aldsa2c11md |
SA2 code confidentialised (2011 Census based) | 2 | zdcsa2codmd | bldsa2c11md |
SA2 code confidentialised (2016 Census based) | 2 | n/a | bldsa2c16md |
SA Modified Monash Model Classification | 1 | zdcmmmcsam | adcmmmcsam |
SA Modified Monash Model Classification | 2 | zdcmmmcsam | bdcmmmcsam |
ASGS Region (2011 Census based) | 1 | zdcremotem | aldremt11m |
ASGS Region (2011 Census based) | 2 | zdcremotem | bldremt11m |
ASGS Region (2016 Census based) | 2 | n/a | bldremt16m |
State (2011 Census based) | 1 | zshstate0id | aldstat11id |
State (2011 Census based) | 2 | zshstate0id | bldstat11id |
State (2016 Census based) | 2 | n/a | bldstat16id |
Number of Household Participants | 1 | zdchmparted | adchmparted |
Sampling Weights (2011 Census based) | 1 | zdcwgt001md | adcwgts11md |
SEIFA Index of Relative Socio-Economic Disadvantage - Rank (2011 Census based) | 1 | zdcirsdr0i | aldirdr11i |
SEIFA Index of Relative Socio-Economic Disadvantage - Rank (2011 Census based) | 2 | zdcirsdr0i | bldirdr11i |
SEIFA Index of Relative Socio-Economic Disadvantage - Rank (2016 Census based) | 2 | n/a | bldirdr16i |
SEIFA Index of Relative Socio-Economic Disadvantage - Percent (2011 Census based) | 1 | zdcirsdp0i | aldirdp11i |
SEIFA Index of Relative Socio-Economic Disadvantage - Percent (2011 Census based) | 2 | zdcirsdp0i | bldirdp11i |
SEIFA Index of Relative Socio-Economic Disadvantage - Percent (2016 Census based) | 2 | n/a | bldirdp16i |
SEIFA Index of Relative Socio-Economic Disadvantage - Decile (2011 Census based) | 1 | zdcirsdd0i | aldirdd11i |
SEIFA Index of Relative Socio-Economic Disadvantage - Decile (2011 Census Based) | 2 | zdcirsdd0i | bldirdd11i |
SEIFA Index of Relative Socio-Economic Disadvantage - Decile (2016 Census based) | 2 | n/a | bldirdd16i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Rank (2011 Census based) | 1 | zdcirsadri | aldiadr11i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Rank (2011 Census based) | 2 | zdcirsadri | bldiadr11i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Rank (2016 Census based) | 2 | n/a | bldiadr16i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Percent (2011 Census based) | 1 | zdcirsadpi | aldiadp11i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Percent (2011 Census based) | 2 | zdcirsadpi | bldiadp11i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Percent (2016 Census based) | 2 | n/a | bldiadp16i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Decile (2011 Census based) | 1 | zdcirsaddi | aldiadd11i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Decile (2011 Census based) | 2 | zdcirsaddi | bldiadd11i |
SEIFA Index of Relative Socio-Economic Advantage and Disadvantage - Decile (2016 Census based) | 2 | n/a | bldiadd16i |
SEIFA Index of Economic Resources - Rank (2011 Census based) | 1 | zdcierr00i | aldierr11i |
SEIFA Index of Economic Resources - Rank (2011 Census based) | 2 | zdcierr00i | bldierr11i |
SEIFA Index of Economic Resources - Rank (2016 Census based) | 2 | n/a | bldierr16i |
SEIFA Index of Economic Resources - Percent (2011 Census based) | 1 | zdcierp00i | aldierp11i |
SEIFA Index of Economic Resources - Percent (2011 Census based) | 2 | zdcierp00i | bldierp11i |
SEIFA Index of Economic Resources - Percent (2016 Census based) | 2 | n/a | bldierp16i |
SEIFA Index of Economic Resources - Decile (2011 Census based) | 1 | zdcierr00i | aldierd11i |
SEIFA Index of Economic Resources - Decile (2011 Census based) | 2 | zdcierr00i | bldierd11i |
SEIFA Index of Economic Resources - Decile (2016 Census based) | 2 | n/a | bldierd16i |
SEIFA Index of Education and Occupation - Rank (2011 Census based) | 1 | zdcieor00i | aldieor11i |
SEIFA Index of Education and Occupation - Rank (2011 Census based) | 2 | zdcieor00i | bldieor11i |
SEIFA Index of Education and Occupation - Rank (2016 Census based) | 2 | n/a | bldieor16i |
SEIFA Index of Education and Occupation - Percent (2011 Census based) | 1 | zdcieop00i | aldieop11i |
SEIFA Index of Education and Occupation - Percent (2011 Census based) | 2 | zdcieop00i | bldieop11i |
SEIFA Index of Education and Occupation - Percent (2016 Census based) | 2 | n/a | bldieop16i |
SEIFA Index of Education and Occupation - Decile (2011 Census based) | 1 | zdcieod00i | aldieod11i |
SEIFA Index of Education and Occupation - Decile (2011 Census based) | 2 | zdcieod00i | bldieod11i |
SEIFA Index of Education and Occupation - Decile (2011 Census based) | 2 | n/a | bldieod16i |
Sex in the past 12 months | 2 | bhxsex120a | bbxsex120a |
Table 16: Variables renamed for Release 3.0
Label | Wave | Old variable name (Release 2.1) | New variable name (Release 3.0) |
---|---|---|---|
Initial cross-sectional population weight for Wave 1 | 1 | adcicswgtmd | adcicpwtad |
Raked cross-sectional population weight for Wave 1 | 1 | adcrcswgtmd | adcrcpwtad |
Initial cross-sectional population weight for Wave 2 | 2 | bdcicswgtmd | bdcicpwtbd |
Raked cross-sectional population weight for Wave 2 | 2 | bdcrcswgtmd | bdcrcpwtbd |
Initial longitudinal population weight between Wave 1 and Wave 2 | 2 | bdcilgwgtmd | bdcilpwabd |
Raked longitudinal population weight between Wave 1 and Wave 2 | 2 | bdcrlgwgtmd | bdcrlpwabd |
Table 17: Variables renamed for Release 4.0
Label | Wave | Old variable name | Label |
---|---|---|---|
Coronavirus – lacked companionship (current) | 3 | cslcvfc01 | csllosc01 |
Coronavirus – felt left out (current) | 3 | cslcvfc02 | csllosc02 |
Coronavirus – felt isolated (current) | 3 | cslcvfc03 | csllosc03 |
Coronavirus – felt lonely (current) | 3 | cslcvfc04 | csllosc04 |
Coronavirus – lacked companionship (during restrictions) | 3 | cslcvfr01 | cslloscr1 |
Coronavirus – felt left out (during restrictions) | 3 | cslcvfr02 | cslloscr2 |
Coronavirus – felt isolated (during restrictions) | 3 | cslcvfr03 | cslloscr3 |
Coronavirus – felt lonely (during restrictions) | 3 | cslcvfr04 | cslloscr4 |
Glossary
Term | Description |
---|---|
ABS | Australian Bureau of Statistics |
ANZSCO | Australian and New Zealand Standard Classification of Occupations |
ASCED | Australian Standard Classification of Education |
ASCL | Australian Standard Classification of Languages |
AIFS | Australian Institute of Family Studies |
ASGS | Australian Statistical Geographic Standards |
BMI | Body Mass Index |
DC | Data Collection |
DOI | Digital Object Identifier |
General Release | This dataset includes data from which the more sensitive information has been removed. Confidentilisation has also been considered for all variables and applied if required |
LD | Linked Data |
NFD | Not further defined |
Respondent Dataset | A dataset containing key indicator data, such as the unique study identifier, age, household identifier and geographical information. |
Restricted Release | This dataset includes information at a more detailed level than the General Release datasets. Items include language, occupation, and country of birth at the 4-digit levels. |
SA1 | Statistical Area 1 |
SA2 | Statistical Area 2 |
SACC | Standard Australian Classification of Countries |
SEIFA | Socio-Economic Indexes for Areas |
SRC | Social Research Centre |
TTM | Ten to Men Study |
UoM | University of Melbourne |
Update | An update occurs when significant changes are made to an existing release. For example, the update to Release 2.0 resulted in it being reissued as Release 2.1. |
Wave dataset | A dataset containing the responses to the corresponding questionnaire of a given wave. |
Acknowledgements
Ten to Men: The Australian Longitudinal Study on Male Health is the first large-scale, nationally representative, longitudinal study to focus exclusively on investigating and improving the health and wellbeing of males in Australia. It is also the largest longitudinal study of male health in the world.
Ten to Men was commissioned and is funded by the Australian Government Department of Health to inform the National Male Health Policy. The study was initially conducted by the University of Melbourne who released datasets, including data documentation, for Wave 1 and Wave 2. Roy Morgan Research undertook the data collection and initial data processing for these 2 waves.
After a competitive tender process in 2017, the Australian Institute of Family Studies (AIFS) was awarded with the responsibility to conduct Waves 3 and 4. Since then, AIFS has updated the Wave 1 and Wave 2 datasets, including data documentation.
In 2020, the study team re-evaluated and revised the survey content and methodology to enable contactless interviewing for Wave 3. New items designed to collect information on the impacts of COVID-19 and the recent effects of natural disasters were also incorporated into the revised survey. The online survey went live at the end of July 2020, with data collection concluding in February 2021.
Minimal changes, both in terms of the survey content and the data collection method, occurred between Wave 3 and Wave 4. The Wave 4 online survey data collection period was from August 2022 to December 2022.
The Social Research Centre (SRC), in collaboration with Ipsos, was contracted to undertake the fieldwork component for Waves 3 and 4 of the study.
Volpe, F. Suares, M., Silbert, M., & Martin, S. (2023). Ten to Men: The Australian Longitudinal Study on Male Health - Data Issues Paper, Release 4.0, (Waves 1-4). Melbourne: Australian Institute of Family Studies.