The Clinical Practice Research Datalink (CPRD) is an ongoing primary care database of anonymised medical records from general practitioners, with coverage of over 11.3 million patients from 674 practices in the UK. With 4.4 million active (alive, currently registered) patients meeting quality criteria, approximately 6.9% of the UK population are included and patients are broadly representative of the UK general population in terms of age, sex and ethnicity. General practitioners are the gatekeepers of primary care and specialist referrals in the UK. The CPRD primary care database is therefore a rich source of health data for research, including data on demographics, symptoms, tests, diagnoses, therapies, health-related behaviours and referrals to secondary care. For over half of patients, linkage with datasets from secondary care, disease-specific cohorts and mortality records enhances the range of data available for research. The CPRD is very widely used internationally for epidemiological research and has been used to produce over 1000 research studies, published in peer-reviewed journals across a broad range of health outcomes. However, researchers must be aware of the complexity of routinely collected electronic health records, including ways to manage variable completeness, misclassification and development of disease definitions for research.
Background: The number of Mendelian randomization analyses including large numbers of genetic variants is rapidly increasing. This is due to the proliferation of genome-wide association studies, and the desire to obtain more precise estimates of causal effects. However, some genetic variants may not be valid instrumental variables, in particular due to them having more than one proximal phenotypic correlate (pleiotropy). Methods: We view Mendelian randomization with multiple instruments as a meta-analysis, and show that bias caused by pleiotropy can be regarded as analogous to small study bias. Causal estimates using each instrument can be displayed visually by a funnel plot to assess potential asymmetry. Egger regression, a tool to detect small study bias in meta-analysis, can be adapted to test for bias from pleiotropy, and the slope coefficient from Egger regression provides an estimate of the causal effect. Under the assumption that the association of each genetic variant with the exposure is independent of the pleiotropic effect of the variant (not via the exposure), Egger's test gives a valid test of the null causal hypothesis and a consistent causal effect estimate even when all the genetic variants are invalid instrumental variables. Results: We illustrate the use of this approach by re-analysing two published Mendelian randomization studies of the causal effect of height on lung function, and the causal effect of blood pressure on coronary artery disease risk. The conservative nature of this approach is illustrated with these examples. Conclusions: An adaption of Egger regression (which we call MR-Egger) can detect some violations of the standard instrumental variable assumptions, and provide an effect estimate which is not subject to these violations. The approach provides a sensitivity analysis for the robustness of the findings from a Mendelian randomization investigation.
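The regression described in this abstract can be sketched in a few lines. The following is a minimal illustration only, assuming summary data per variant (association with the exposure, association with the outcome, and the standard error of the latter) and inverse-variance weights; the function name `mr_egger` and its argument names are hypothetical, and standard errors and hypothesis tests for the coefficients are omitted.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression of variant-outcome on variant-exposure associations.

    Returns (intercept, slope). The intercept reflects average directional
    pleiotropy; the slope is the causal effect estimate.
    Assumption: inverse-variance weights 1 / se_outcome**2.
    """
    xs, ys = [], []
    for bx, by in zip(beta_exposure, beta_outcome):
        # Orient each variant so that its exposure association is positive;
        # unlike a regression through the origin, the intercept depends on this.
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)

    ws = [1.0 / se ** 2 for se in se_outcome]
    total_w = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / total_w
    y_mean = sum(w * y for w, y in zip(ws, ys)) / total_w

    # Closed-form weighted least squares with an intercept term.
    sxy = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope
```

Constraining the intercept to zero in the same weighted regression would correspond to the conventional inverse-variance weighted estimate, so an intercept far from zero is the signal of directional pleiotropy that the funnel plot shows as asymmetry.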
The Avon Longitudinal Study of Parents and Children (ALSPAC) is a transgenerational prospective observational study investigating influences on health and development across the life course. It considers multiple genetic, epigenetic, biological, psychological, social and other environmental exposures in relation to a similarly diverse range of health, social and developmental outcomes. Recruitment sought to enrol pregnant women in the Bristol area of the UK during 1990-92; this was extended to include additional children eligible using the original enrolment definition up to the age of 18 years. The children from 14 541 pregnancies were recruited in 1990-92, increasing to 15 247 pregnancies by the age of 18 years. This cohort profile describes the index children of these pregnancies. Follow-up includes 59 questionnaires (4 weeks-18 years of age) and 9 clinical assessment visits (7-17 years of age). The resource comprises a wide range of phenotypic and environmental measures in addition to biological samples, genetic (DNA on 11 343 children, genome-wide data on 8365 children, complete genome sequencing on 2000 children) and epigenetic (methylation sampling on 1000 children) information and linkage to health and administrative records. Data access is described in this article and is currently set up as a supported access resource. To date, over 700 peer-reviewed articles have been published using ALSPAC data.
The Korea National Health and Nutrition Examination Survey (KNHANES) is a national surveillance system that has been assessing the health and nutritional status of Koreans since 1998. Based on the National Health Promotion Act, the surveys have been conducted by the Korea Centers for Disease Control and Prevention (KCDC). This nationally representative cross-sectional survey includes approximately 10 000 individuals each year as a survey sample and collects information on socioeconomic status, health-related behaviours, quality of life, healthcare utilization, anthropometric measures, biochemical and clinical profiles for non-communicable diseases and dietary intakes with three component surveys: health interview, health examination and nutrition survey. The health interview and health examination are conducted by trained staff members, including physicians, medical technicians and health interviewers, at a mobile examination centre, and are followed by dieticians' visits to the homes of the study participants. KNHANES provides statistics for health-related policies in Korea, which also serve as the research infrastructure for studies on risk factors and diseases by supporting over 500 publications. KCDC has also supported researchers in Korea by providing annual workshops for data users. KCDC has published the Korea Health Statistics each year, and microdata are publicly available through the KNHANES website (http://knhanes.cdc.go.kr).
Background: Since the introduction of specified diagnostic criteria for mental disorders in the 1970s, there has been a rapid expansion in the number of large-scale mental health surveys providing population estimates of the combined prevalence of common mental disorders (most commonly involving mood, anxiety and substance use disorders). In this study we undertake a systematic review and meta-analysis of this literature. Methods: We applied an optimized search strategy across the Medline, PsycINFO, EMBASE and PubMed databases, supplemented by hand searching to identify relevant surveys. We identified 174 surveys across 63 countries providing period prevalence estimates (155 surveys) and lifetime prevalence estimates (85 surveys). Random effects meta-analysis was undertaken on logit-transformed prevalence rates to calculate pooled prevalence estimates, stratified according to methodological and substantive groupings. Results: Pooling across all studies, approximately 1 in 5 respondents (17.6%, 95% confidence interval: 16.3–18.9%) were identified as meeting criteria for a common mental disorder during the 12 months preceding assessment; 29.2% (25.9–32.6%) of respondents were identified as having experienced a common mental disorder at some time during their lifetimes. A consistent gender effect in the prevalence of common mental disorder was evident; women having higher rates of mood (7.3%:4.0%) and anxiety (8.7%:4.3%) disorders during the previous 12 months and men having higher rates of substance use disorders (2.0%:7.5%), with a similar pattern for lifetime prevalence. There was also evidence of consistent regional variation in the prevalence of common mental disorder. Countries within North and South East Asia in particular displayed consistently lower one-year and lifetime prevalence estimates than other regions. One-year prevalence rates were also low among Sub-Saharan Africa, whereas English-speaking countries returned the highest lifetime prevalence estimates.
Conclusions: Despite a substantial degree of inter-survey heterogeneity in the meta-analysis, the findings confirm that common mental disorders are highly prevalent globally, affecting people across all regions of the world. This research provides an important resource for modelling population needs based on global regional estimates of mental disorder. The reasons for regional variation in mental disorder require further investigation.
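The pooling step named in the Methods, random effects meta-analysis on logit-transformed prevalence, can be illustrated with a short sketch. The abstract does not say which between-study variance estimator was used; the DerSimonian-Laird estimator below is an assumption, as are the function name `pooled_prevalence` and the use of a normal-approximation 95% interval. The input counts are supplied by the caller and no survey data are reproduced here.

```python
import math

def pooled_prevalence(cases, totals):
    """Random-effects pooled prevalence on the logit scale.

    Assumption: DerSimonian-Laird estimate of between-study variance.
    Each study must have 0 < cases < total.
    Returns (pooled, lower_95, upper_95) as proportions.
    """
    logits, variances = [], []
    for c, n in zip(cases, totals):
        p = c / n
        logits.append(math.log(p / (1 - p)))
        # Approximate variance of a logit-transformed proportion.
        variances.append(1 / c + 1 / (n - c))

    # Fixed-effect weights and Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # DerSimonian-Laird between-study variance, truncated at zero.
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(logits) - 1)) / scale) if scale > 0 else 0.0

    # Random-effects weights, pooled logit and its standard error.
    w_re = [1 / (v + tau2) for v in variances]
    pooled_logit = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return (inv_logit(pooled_logit),
            inv_logit(pooled_logit - 1.96 * se),
            inv_logit(pooled_logit + 1.96 * se))
```

Working on the logit scale keeps the back-transformed interval inside the range 0 to 1, which a direct pooling of raw proportions does not guarantee.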
Using data from a large European collaborative study, we aimed to identify the circumstances in which treated HIV-infected individuals will experience similar mortality rates to those of the general population. Adults were eligible if they initiated combination anti-retroviral treatment (cART) between 1998 and 2008 and had one prior CD4 measurement within 6 months. Standardized mortality ratios (SMRs) and excess mortality rates compared with the general population were estimated using Poisson regression. Periods of follow-up were classified according to the current CD4 count. Of the 80 642 individuals, 70% were men, 16% were injecting drug users (IDUs), the median age was 37 years, median CD4 count 225/mm³ at cART initiation and median follow-up was 3.5 years. The overall mortality rate was 1.2/100 person-years (PY) (men: 1.3, women: 0.9), 4.2 times as high as that in the general population (SMR for men: 3.8, for women: 7.4). Among 35 316 individuals with a CD4 count ≥500/mm³, the mortality rate was 0.37/100 PY (SMR 1.5); mortality rates were similar to those of the general population in non-IDU men [SMR 0.9, 95% confidence interval (95% CI) 0.7-1.3] and, after 3 years, in women (SMR 1.1, 95% CI 0.7-1.7). Mortality rates in IDUs remained elevated, though a trend to decrease with longer durations with high CD4 count was seen. A prior AIDS diagnosis was associated with higher mortality. Mortality patterns in most non-IDU HIV-infected individuals with high CD4 counts on cART are similar to those in the general population. The persistent role of a prior AIDS diagnosis underlines the importance of early diagnosis of HIV infection.
Background: Previous studies have identified significant variability in attention-deficit / hyperactivity disorder (ADHD) prevalence estimates worldwide, largely explained by methodological procedures. However, increasing rates of ADHD diagnosis and treatment throughout the past few decades have fuelled concerns about whether the true prevalence of the disorder has increased over time. Methods: We updated the two most comprehensive systematic reviews on ADHD prevalence available in the literature. Meta-regression analyses were conducted to test the effect of year of study in the context of both methodological variables that determined variability in ADHD prevalence (diagnostic criteria, impairment criterion and source of information), and the geographical location of studies. Results: We identified 154 original studies and included 135 in the multivariate analysis. Methodological procedures investigated were significantly associated with heterogeneity of studies. Geographical location and year of study were not associated with variability in ADHD prevalence estimates. Conclusions: Confirming previous findings, variability in ADHD prevalence estimates is mostly explained by methodological characteristics of the studies. In the past three decades, there has been no evidence to suggest an increase in the number of children in the community who meet criteria for ADHD when standardized diagnostic procedures are followed.
The Avon Longitudinal Study of Parents and Children (ALSPAC) was established to understand how genetic and environmental characteristics influence health and development in parents and children. All pregnant women resident in a defined area in the South West of England, with an expected date of delivery between 1st April 1991 and 31st December 1992, were eligible and 13 761 women (contributing 13 867 pregnancies) were recruited. These women have been followed over the last 19-22 years and have completed up to 20 questionnaires, have had detailed data abstracted from their medical records and have information on any cancer diagnoses and deaths through record linkage. A follow-up assessment was completed 17-18 years postnatal at which anthropometry, blood pressure, fat, lean and bone mass and carotid intima media thickness were assessed, and a fasting blood sample taken. The second follow-up clinic, which additionally measures cognitive function, physical capability, physical activity (with accelerometer) and wrist bone architecture, is underway and two further assessments with similar measurements will take place over the next 5 years. There is a detailed biobank that includes DNA, with genome-wide data available on >10 000, stored serum and plasma taken repeatedly since pregnancy and other samples; a wide range of data on completed biospecimen assays are available. Details of how to access these data are provided in this cohort profile.
Interrupted time series (ITS) analysis is a valuable study design for evaluating the effectiveness of population-level health interventions that have been implemented at a clearly defined point in time. It is increasingly being used to evaluate the effectiveness of interventions ranging from clinical therapy to national public health legislation. Whereas the design shares many properties of regression-based approaches in other epidemiological studies, there are a range of unique features of time series data that require additional methodological considerations. In this tutorial we use a worked example to demonstrate a robust approach to ITS analysis using segmented regression. We begin by describing the design and considering when ITS is an appropriate design choice. We then discuss the essential, yet often omitted, step of proposing the impact model a priori. Subsequently, we demonstrate the approach to statistical analysis including the main segmented regression model. Finally we describe the main methodological issues associated with ITS analysis: over-dispersion of time series data, autocorrelation, adjusting for seasonal trends and controlling for time-varying confounders, and we also outline some of the more complex design adaptations that can be used to strengthen the basic ITS design.
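A common form of the segmented regression mentioned in this abstract is Y_t = b0 + b1*t + b2*X_t + b3*(t - t0)*X_t, where X_t indicates the post-intervention period and t0 is the first post-intervention time point. The sketch below fits that model by ordinary least squares on a continuous outcome. This is an illustrative assumption, not necessarily the model used in the tutorial's worked example, and it does not address over-dispersion, autocorrelation, seasonality or time-varying confounders. The function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_segmented_regression(outcome, intervention_start):
    """Ordinary least squares fit of the four-parameter ITS impact model.

    outcome: one value per equally spaced time point, indexed t = 0, 1, ...
    intervention_start: index t0 of the first post-intervention time point.
    """
    rows = []
    for t in range(len(outcome)):
        post = 1.0 if t >= intervention_start else 0.0
        # Columns: intercept, time, level change, trend change.
        rows.append([1.0, float(t), post, (t - intervention_start) * post])

    # Normal equations (X'X) b = X'y.
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    b0, b1, b2, b3 = solve_linear_system(xtx, xty)
    return {"baseline_level": b0, "baseline_trend": b1,
            "level_change": b2, "trend_change": b3}
```

Proposing the impact model a priori amounts to deciding which of these terms to include before looking at the data: a level change only (b2), a slope change only (b3), or both as here.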
The English Longitudinal Study of Ageing (ELSA) is a panel study of a representative cohort of men and women living in England aged ≥50 years. It was designed as a sister study to the Health and Retirement Study in the USA and is multidisciplinary in orientation, involving the collection of economic, social, psychological, cognitive, health, biological and genetic data. The study commenced in 2002, and the sample has been followed up every 2 years. Data are collected using computer-assisted personal interviews and self-completion questionnaires, with additional nurse visits for the assessment of biomarkers every 4 years. The original sample consisted of 11 391 members ranging in age from 50 to 100 years. ELSA is harmonized with ageing studies in other countries to facilitate international comparisons, and is linked to financial and health registry data. The data set is openly available to researchers and analysts soon after collection (http://www.esds.ac.uk/longitudinal/access/elsa/l5050.asp).
Background: Questions remain about the strength and shape of the dose-response relationship between fruit and vegetable intake and risk of cardiovascular disease, cancer and mortality, and the effects of specific types of fruit and vegetables. We conducted a systematic review and meta-analysis to clarify these associations. Methods: PubMed and Embase were searched up to 29 September 2016. Prospective studies of fruit and vegetable intake and cardiovascular disease, total cancer and all-cause mortality were included. Summary relative risks (RRs) were calculated using a random effects model, and the mortality burden globally was estimated; 95 studies (142 publications) were included. Results: For fruits and vegetables combined, the summary RR per 200 g/day was 0.92 [95% confidence interval (CI): 0.90-0.94, I² = 0%, n = 15] for coronary heart disease, 0.84 (95% CI: 0.76-0.92, I² = 73%, n = 10) for stroke, 0.92 (95% CI: 0.90-0.95, I² = 31%, n = 13) for cardiovascular disease, 0.97 (95% CI: 0.95-0.99, I² = 49%, n = 12) for total cancer and 0.90 (95% CI: 0.87-0.93, I² = 83%, n = 15) for all-cause mortality. Similar associations were observed for fruits and vegetables separately. Reductions in risk were observed up to 800 g/day for all outcomes except cancer (600 g/day). Inverse associations were observed between the intake of apples and pears, citrus fruits, green leafy vegetables, cruciferous vegetables, and salads and cardiovascular disease and all-cause mortality, and between the intake of green-yellow vegetables and cruciferous vegetables and total cancer risk. An estimated 5.6 and 7.8 million premature deaths worldwide in 2013 may be attributable to a fruit and vegetable intake below 500 and 800 g/day, respectively, if the observed associations are causal. Conclusions: Fruit and vegetable intakes were associated with reduced risk of cardiovascular disease, cancer and all-cause mortality.
These results support public health recommendations to increase fruit and vegetable intake for the prevention of cardiovascular disease, cancer, and premature mortality.
The HUNT Study includes large total population-based cohorts from the 1980s, covering 125 000 Norwegian participants; HUNT1 (1984-86), HUNT2 (1995-97) and HUNT3 (2006-08). The study was primarily set up to address arterial hypertension, diabetes, screening of tuberculosis, and quality of life. However, the scope has expanded over time. In the latest survey a state-of-the-art biobank was established, with availability of biomaterial for decades ahead. The three population-based surveys now contribute to important knowledge regarding health-related lifestyle, prevalence and incidence of somatic and mental illness and disease, health determinants, and associations between disease phenotypes and genotypes. Every citizen of Nord-Trøndelag County in Norway aged 20 years or older has been invited to all the surveys for adults. Participants may be linked in families and followed up longitudinally between the surveys and in several national health and other registers covering the total population. The HUNT Study includes data from questionnaires, interviews, clinical measurements and biological samples (blood and urine). The questionnaires included questions on socioeconomic conditions, health-related behaviours, symptoms, illnesses and diseases. Data from the HUNT Study are available for researchers who satisfy some basic requirements (www.ntnu.edu/hunt), whether affiliated in Norway or abroad.
SHARE is a unique panel database of micro data on health, socio-economic status and social and family networks covering most of the European Union and Israel. To date, SHARE has collected three panel waves (2004, 2006, 2010) of current living circumstances and retrospective life histories (2008, SHARELIFE); 6 additional waves are planned until 2024. The more than 150 000 interviews give a broad picture of life after the age of 50 years, measuring physical and mental health, economic and non-economic activities, income and wealth, transfers of time and money within and outside the family as well as life satisfaction and well-being. The data are available to the scientific community free of charge at www.share-project.org after registration. SHARE is harmonized with the US Health and Retirement Study (HRS) and the English Longitudinal Study of Ageing (ELSA) and has become a role model for several ageing surveys worldwide. SHARE's scientific power is based on its panel design that grasps the dynamic character of the ageing process, its multidisciplinary approach that delivers the full picture of individual and societal ageing, and its cross-nationally ex-ante harmonized design that permits international comparisons of health, economic and social outcomes in Europe and the USA.
Background Quitting tobacco or alcohol use has been reported to reduce the head and neck cancer risk in previous studies. However, it is unclear how many years must pass following cessation of these habits before the risk is reduced, and whether the risk ultimately declines to the level of never smokers or never drinkers. Methods We pooled individual-level data from case-control studies in the International Head and Neck Cancer Epidemiology Consortium. Data were available from 13 studies on drinking cessation (9167 cases and 12 593 controls), and from 17 studies on smoking cessation (12 040 cases and 16 884 controls). We estimated the effect of quitting smoking and drinking on the risk of head and neck cancer and its subsites, by calculating odds ratios (ORs) using logistic regression models. Results Quitting tobacco smoking for 1-4 years resulted in a head and neck cancer risk reduction [OR 0.70, confidence interval (CI) 0.61-0.81 compared with current smoking], with the risk reduction due to smoking cessation after ≥20 years (OR 0.23, CI 0.18-0.31), reaching the level of never smokers. For alcohol use, a beneficial effect on the risk of head and neck cancer was only observed after ≥20 years of quitting (OR 0.60, CI 0.40-0.89 compared with current drinking), reaching the level of never drinkers. Conclusions Our results support that cessation of tobacco smoking and cessation of alcohol drinking protect against the development of head and neck cancer.
Background In studies of all-cause mortality, the fundamental epidemiological concepts of rate and risk are connected through a well-defined one-to-one relation. An important consequence of this relation is that regression models such as the proportional hazards model that are defined through the hazard (the rate) immediately dictate how the covariates relate to the survival function (the risk). Methods This introductory paper reviews the concepts of rate and risk and their one-to-one relation in all-cause mortality studies and introduces the analogous concepts of rate and risk in the context of competing risks, the cause-specific hazard and the cause-specific cumulative incidence function. Results The key feature of competing risks is that the one-to-one correspondence between cause-specific hazard and cumulative incidence, between rate and risk, is lost. This fact has two important implications. First, the naive Kaplan-Meier estimator that takes the competing events as censored observations is biased. Secondly, the way in which covariates are associated with the cause-specific hazards may not coincide with the way these covariates are associated with the cumulative incidence. An example with relapse and non-relapse mortality as competing risks in a stem cell transplantation study is used for illustration. Conclusion The two implications of the loss of one-to-one correspondence between cause-specific hazard and cumulative incidence should be kept in mind when deciding on how to make inference in a competing risks situation.
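The first implication, the bias of the naive Kaplan-Meier approach, can be seen in a small numerical sketch. The code below computes, at the last observed time, both the nonparametric cumulative incidence for one cause (the Aalen-Johansen form, which multiplies the cause-specific hazard by the probability of still being free of any event) and the naive one-minus-Kaplan-Meier value obtained by censoring competing events. The function name and the event coding (0 = censored, 1 = event of interest, any other value = competing event) are assumptions made for illustration.

```python
def cumulative_incidence(times, events, cause=1):
    """Compare the cumulative incidence function with naive 1 - Kaplan-Meier.

    times: follow-up time per subject.
    events: 0 = censored, `cause` = event of interest, other = competing event.
    Returns (cif, naive) evaluated at the largest observed time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    overall_survival = 1.0  # probability of no event of any type so far
    naive_survival = 1.0    # Kaplan-Meier with competing events censored
    cif = 0.0

    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (u, e) in data if u == t]
        d_cause = sum(1 for e in tied if e == cause)
        d_any = sum(1 for e in tied if e != 0)

        # Increment uses event-free probability just before time t.
        cif += overall_survival * d_cause / at_risk
        overall_survival *= 1 - d_any / at_risk
        naive_survival *= 1 - d_cause / at_risk

        at_risk -= len(tied)
        i += len(tied)

    return cif, 1 - naive_survival
```

With four subjects followed without censoring, two of whom have the event of interest (times 2 and 4) and two a competing event (times 1 and 3), the cumulative incidence is 0.5, matching the observed proportion, whereas the naive value reaches 1.0: censoring the competing events treats those subjects as if they could still go on to have the event of interest.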
The Health and Retirement Study (HRS) is a nationally representative longitudinal survey of more than 37 000 individuals over age 50 in 23 000 households in the USA. The survey, which has been fielded every 2 years since 1992, was established to provide a national resource for data on the changing health and economic circumstances associated with ageing at both individual and population levels. Its multidisciplinary approach is focused on four broad topics—income and wealth; health, cognition and use of healthcare services; work and retirement; and family connections. HRS data are also linked at the individual level to administrative records from Social Security and Medicare, Veteran’s Administration, the National Death Index and employer-provided pension plan information. Since 2006, data collection has expanded to include biomarkers and genetics as well as much greater depth in psychology and social context. This blend of economic, health and psychosocial information provides unprecedented potential to study increasingly complex questions about ageing and retirement. The HRS has been a leading force for rapid release of data while simultaneously protecting the confidentiality of respondents. Three categories of data—public, sensitive and restricted—can be accessed through procedures described on the HRS website (hrsonline.isr.umich.edu).
Population ageing is rapidly becoming a global issue and will have a major impact on health policies and programmes. The World Health Organization's Study on global AGEing and adult health (SAGE) aims to address the gap in reliable data and scientific knowledge on ageing and health in low- and middle-income countries. SAGE is a longitudinal study with nationally representative samples of persons aged 50+ years in China, Ghana, India, Mexico, Russia and South Africa, with a smaller sample of adults aged 18-49 years in each country for comparisons. Instruments are compatible with other large high-income country longitudinal ageing studies. Wave 1 was conducted during 2007-2010 and included a total of 34 124 respondents aged 50+ and 8340 aged 18-49. In four countries, a subsample consisting of 8160 respondents participated in Wave 1 and the 2002/04 World Health Survey (referred to as SAGE Wave 0). Wave 2 data collection will start in 2012/13, following up all Wave 1 respondents. Wave 3 is planned for 2014/15. SAGE is committed to the public release of study instruments, protocols and meta- and micro-data: access is provided upon completion of a Users Agreement available through WHO's SAGE website (www.who.int/healthinfo/systems/sage) and WHO's archive using the National Data Archive application (http://apps.who.int/healthinfo/systems/surveydata).