COPD Genetics


Many factors can contribute to the occurrence of COPD. While cigarette smoking is the number one risk factor in the development of COPD, recent research has pointed to the notion that genes may also play a significant role and could be responsible for why some people who smoke will develop COPD while others will not. COPD may be influenced by race, ethnicity, gender, and environmental factors as well as genetic factors. The marked variability in lung function and risk of COPD in people with similar cigarette smoking histories, together with studies of familial aggregation, support an important role for genetics in COPD.


The graph above illustrates variation in self-reported COPD in the United States across race and gender, 1980-2000. As the graph shows, the prevalence of COPD based on self-reports of emphysema or chronic bronchitis has generally been higher among whites and women. Limited data exist that compare COPD in different racial/ethnic groups; however, the available data suggest that differences in COPD may exist. Potential explanations for differences in COPD between racial/ethnic groups include genetic and biological differences; disparities in diagnosis and treatment; increasing exposure to cigarettes in nonwhite populations worldwide; and a lack of enrollment of minorities in epidemiological studies and clinical trials. Gender also appears to influence COPD. Historically, men have had higher prevalence rates of COPD than women, but recent data suggest that women may actually be more susceptible to COPD. COPD in women may have different characteristics than in men, and it may be more severe. Importantly, in the United States, more women than men now die of COPD.




Performing a Genetic Study

In general, doing scientific research involves the use of multiple tools and techniques in order to find out information. Researchers employ different methods to conduct a study depending on what is being studied and what type of information is being sought. In the COPDGene® Study, we are looking for genes that might be involved in the development and disease progression of COPD. Finding genes that are associated with a disease is a tedious and difficult process, and one that involves many steps.

The COPDGene® Study is a type of epidemiological study, called a genetic epidemiology study. Epidemiological research looks at a large population of individuals and tries to understand a disease process within that population. A genetic epidemiological study looks for genes that are suspected to be involved in a disease process for a population. The general goal of a genetic epidemiological study is to identify genes that affect the health and wellness of a population.

Conducting a genetic epidemiological study like COPDGene® requires following a well-defined research method that will help reduce the collection of irrelevant or inaccurate data. The general method used in COPDGene® is outlined below.


1. Identify a Study Population

When scientists look for genes that are associated with a particular trait, they first have to choose a population that expresses that trait. To do this, they define physical or behavioral criteria that separate people into groups whose members share more traits in common. For instance, men and women are often separated into two populations or groups in a study because of differences in physiological traits. People from different ethnic and racial backgrounds are also often studied as separate groups because some genetic variants are more common in particular groups. For instance, people of Northern European descent tend to carry the gene variants that cause alpha-1 antitrypsin deficiency more commonly than people from other racial groups.

Most studies need at least two distinct populations: a control group and a case group. A control group is chosen to serve as a 'normal' or unchanged population, and the case group is the population that shares a particular trait or feature of interest that is being studied. For instance, if researchers wanted to understand how smoking affects the lungs, they would study the lungs of a group of smokers (case) and the lungs of non-smokers (control) looking for differences between the two groups. In a genetic study, researchers look for people who share certain disease features and try to find genes that are common to that group but that may not be found in the control population. The traits that are used to collect a study group have been carefully selected and will help narrow the genetic diversity of the population, thus making finding genetic similarities more likely. The COPDGene® Study looks for people with specific smoking histories, age, race, gender, lung health, and several other factors. By gathering a specific population, genes that may be associated with a disease can be more easily found.


2. Measure Key Characteristics of the Members of the Study Population

The next step after finding a group of people who fit the specified criteria is to measure them. Taking measurements helps to further define a population and helps to outline the specific traits and features that they share. In COPDGene®, measurements that relate to COPD, lung health, and general physical health are taken. These measurements can then be used to assess the severity of disease and place study participants into smaller groups based on their lung health. At a COPDGene® Study visit, a chest CT scan is performed to get a visual assessment of lung disease, a walk test is performed, spirometry data are collected, medical history is recorded, and blood is taken. The blood samples are used to extract DNA and study genes.
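As a rough illustration of how spirometry measurements can sort participants into groups, the sketch below applies the widely used GOLD spirometric grades: airflow obstruction is defined as a post-bronchodilator FEV1/FVC ratio below 0.70, and severity is then graded by FEV1 as a percentage of its predicted value. The function name and the participant values are hypothetical, and this is not necessarily the exact grouping scheme used in COPDGene®.

```python
def gold_grade(fev1_fvc_ratio, fev1_percent_predicted):
    """Return a GOLD spirometric grade (1-4), or 0 when there is no airflow obstruction.

    Hypothetical helper for illustration; inputs are post-bronchodilator values.
    """
    if fev1_fvc_ratio >= 0.70:
        return 0  # ratio at or above 0.70: no obstruction by the fixed-ratio criterion
    if fev1_percent_predicted >= 80:
        return 1  # mild
    if fev1_percent_predicted >= 50:
        return 2  # moderate
    if fev1_percent_predicted >= 30:
        return 3  # severe
    return 4      # very severe


# Made-up participants: (FEV1/FVC ratio, FEV1 % predicted)
participants = {"A": (0.78, 95), "B": (0.65, 72), "C": (0.48, 28)}
grades = {pid: gold_grade(*values) for pid, values in participants.items()}
print(grades)
```

Grouping by a fixed set of thresholds like this keeps the categories comparable across participants, although a real analysis would also use the CT, walk test, and medical history data described above.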

Each of the measurements taken will help scientists study correlations between factors and draw conclusions about disease states. For instance, by measuring how much people smoke and the health of their lungs, it has been concluded that people who smoke cigarettes have a higher risk of developing a lung disease than non-smokers.


3. Look for Genetic Similarities

The next step in a genetic epidemiology study is to look for genetic variants that differ between the case and control populations. This is no doubt the most difficult and tedious aspect of a genetic study and requires the aid of biostatisticians and molecular geneticists to find genetic associations. Most genes associated with disease are not easy to find because each gene may only contribute a small portion of the total genetic component of a disease. Unlike more clear-cut traits for which one primary gene is responsible (such as ABO blood type), many diseases are caused by multiple genes of small effect that function together to create a complex array of physiological processes leading to disease. Finding all the parts and assessing their importance with respect to the disease is difficult and often does not produce definitive results.

Further complicating matters is the problem of genes interacting with the environment, which may change which genes are expressed. 'The environment' refers to factors that a person is exposed to outside of their physical body that can change the way the body functions. For instance, if genetic factors are found for COPD, simply having the genes may not be enough to develop the disease. Smoking cigarettes, however, could interact with those genes, potentially causing COPD.

The main way in which the COPDGene® Study looks for genes within the study population is through the use of a genome-wide association study, or GWAS for short. A genome-wide association study is a scientific approach that involves scanning hundreds of thousands of genetic markers across the complete sets of DNA, or genomes, of many people to find genetic variations associated with a particular disease. GWAS are often used as a tool to find genes involved in complex diseases such as COPD, heart disease, and cancer.
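The core idea of a GWAS can be sketched in a few lines: test each marker separately for a difference in allele counts between cases and controls, then apply a stricter significance threshold because many markers are tested. The example below is a minimal sketch using a chi-square test on a 2x2 table and a Bonferroni correction; the marker names and allele counts are invented, and real analyses (including those in COPDGene®) typically use more elaborate models that adjust for factors such as ancestry and smoking.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square (1 degree of freedom) on a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up allele counts per marker: (case_alt, case_ref, control_alt, control_ref)
markers = {
    "marker_1": (300, 700, 200, 800),
    "marker_2": (255, 745, 250, 750),
}

# Bonferroni correction: divide the overall error rate by the number of tests.
alpha = 0.05
threshold = alpha / len(markers)

results = {}
for name, counts in markers.items():
    chi2, p = allelic_chi_square(*counts)
    results[name] = (chi2, p, p < threshold)
    print(f"{name}: chi2={chi2:.2f}, p={p:.3g}, significant={p < threshold}")
```

With hundreds of thousands of markers the same correction produces a very small threshold, which is why genome-wide studies conventionally require p-values near 5e-8 and therefore need large study populations.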

4. Make Conclusions

After all the information is collected, scientists begin to make conclusions about what was found. In a genetics study, this often involves making conclusions about trends seen in the data. Discovering a trend or an association of a gene with a particular disease or even one aspect of a disease does not mean that every individual who has that gene will develop the disease. This is where the idea of 'risk factors' comes into play. Say, for instance, that a random group of 100 people is studied for genetic similarities and it is found that 50 of the people studied have a particular set of genes in common. Of those 50 people, 40 also have COPD. The conclusion one might make is that these particular genes may be associated with COPD, though it could not be said that they definitively cause COPD, as 10 people in the group who have the genes do not have COPD. Drawing conclusions about genetic similarities works in the same way; genes are often associated with a disease but do not always absolutely cause it.
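The arithmetic behind the 100-person example can be written out as follows. The text gives only the counts for the 50 people who carry the genes; the number of non-carriers with COPD is not stated, so the value used below is purely an assumption added to show how a comparison between the two groups would be made.

```python
# Counts from the example in the text.
carriers = 50
carriers_with_copd = 40

# ASSUMPTION (not from the text): how many of the 50 non-carriers have COPD.
non_carriers = 50
non_carriers_with_copd = 10  # hypothetical value for illustration only

risk_carriers = carriers_with_copd / carriers
risk_non_carriers = non_carriers_with_copd / non_carriers
relative_risk = risk_carriers / risk_non_carriers

print(f"Risk of COPD among carriers: {risk_carriers:.0%}")
print(f"Risk of COPD among non-carriers (assumed): {risk_non_carriers:.0%}")
print(f"Relative risk: {relative_risk:.1f}")
```

A relative risk above 1 would indicate an association, but because the risk among carriers is 80% rather than 100%, the genes alone cannot be said to cause the disease.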

So what accounts for the differences then? While it is not completely understood why some people with a disease-associated gene will develop the disease when others may not, it is widely accepted that environmental factors play an important role. The process of genetic and environmental risk factors working together to cause an outcome is commonly referred to as the gene-environment interaction. In short, genes may set the stage for a disease to occur, but only through environmental exposures acting in concert with the genes will a disease actually develop.
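One simple way to picture a gene-environment interaction is to compare disease risk in four groups defined by whether a person carries a variant and whether they smoke. In the sketch below, all four risk values are invented for illustration and do not come from COPDGene® or any other study; the point is only that the combined effect can exceed the sum of the separate effects.

```python
# Hypothetical COPD risks for four groups; these numbers are invented for illustration.
risk = {
    ("no_variant", "non_smoker"): 0.02,
    ("variant", "non_smoker"): 0.03,
    ("no_variant", "smoker"): 0.10,
    ("variant", "smoker"): 0.30,
}

baseline = risk[("no_variant", "non_smoker")]
excess_gene = risk[("variant", "non_smoker")] - baseline
excess_smoking = risk[("no_variant", "smoker")] - baseline
excess_both = risk[("variant", "smoker")] - baseline

# If the two factors acted independently on the additive scale, the joint excess
# risk would equal the sum of the separate excess risks.
interaction_contrast = excess_both - (excess_gene + excess_smoking)

print(f"Excess risk, variant only: {excess_gene:.2f}")
print(f"Excess risk, smoking only: {excess_smoking:.2f}")
print(f"Excess risk, both: {excess_both:.2f}")
print(f"Interaction contrast: {interaction_contrast:.2f}")
```

A positive interaction contrast means the two factors together raise risk by more than would be expected from adding their individual effects, which is the pattern the paragraph above describes.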


Current Status of COPD Genetics Research

The marked variability in lung function and risk for COPD in people with similar cigarette smoking histories, together with studies of familial aggregation, support an important role for genetic risk factors in COPD. A small but important fraction of COPD cases harbor a major genetic determinant, α1-antitrypsin deficiency (AATD). This condition is most common in populations of Northern European ancestry, although affected individuals in other populations can be found. Despite significant advances in diagnosis and treatment, AATD remains highly under-diagnosed. Manifestations other than classic lower-lobe predominant emphysema can include bronchiectasis, liver disease, panniculitis, and vasculitis. Intravenous augmentation with AAT protein is a commonly used treatment for severe AATD; it may result in improved pulmonary outcomes, although randomized clinical trials have not provided definitive evidence of its efficacy.

The discovery of alpha-1 antitrypsin (AAT) deficiency was a major factor in developing the Protease-Antiprotease Hypothesis for COPD, the prevailing model of disease pathogenesis for over 40 years. Hence, it was natural to hope that the identification of other COPD susceptibility genes would lead to similar novel insights into the causes of COPD. However, the results of many candidate gene association studies have been largely inconsistent [1]. These inconsistencies likely relate to a variety of methodological issues, including small sample sizes, failure to adjust for multiple statistical testing, variability in disease characterization, and inadequate adjustment for population stratification. However, the greatest problem in these studies likely was improper candidate gene selection, reflecting our limited understanding of COPD pathogenesis. By contrast, the application of genome-wide association studies (GWAS), which provide an unbiased and comprehensive search throughout the genome for common susceptibility loci, has changed the landscape of COPD genetics. Based on GWAS, three novel genetic loci have been unequivocally associated with COPD susceptibility. Pillai and colleagues found genome-wide significant associations between COPD and the CHRNA3/CHRNA5/IREB2 region on chromosome 15 [2]. DeMeo and colleagues performed gene expression studies comparing normal and COPD lung tissues followed by genetic association analysis of COPD [3], suggesting that at least one of the key COPD genetic determinants in the chromosome 15 GWAS region is IREB2. In a GWAS from the Framingham Heart Study [4], the HHIP region was associated with FEV1/FVC, and this same region nearly reached genome-wide significance with COPD susceptibility in the Pillai paper [2]. Studies from large general population samples have provided strong support for associations between SNPs near HHIP with FEV1/FVC [5,6].
One of these studies, from the CHARGE Consortium, also found evidence for association between FEV1/FVC and the FAM13A locus [5], which has been strongly associated with COPD susceptibility in multiple populations (including COPDGene) by our research group [7]. Thus, the frustration of inconsistent genetic association results in COPD over the past decade has been replaced by optimism regarding the likely importance of the IREB2, HHIP, and FAM13A loci in COPD susceptibility. GWAS focus on common genetic variants that contribute to disease risk. Rare variants (often identified through DNA sequencing) may also contribute to COPD susceptibility and are becoming an increasingly important focus of COPDGene.





  1. Castaldi PJ, Cho MH, Cohn M, Langerman F, Moran S, Tarragona N, Moukhachen H, Venugopal R, Hasimja D, Kao E, Wallace B, Hersh CP, Bagade S, Bertram L, Silverman EK, Trikalinos TA. The COPD genetic association compendium: a comprehensive online database of COPD genetic associations. Hum Mol Genet 2010; 19:526-34.
  2. Pillai SG, Ge D, Zhu G, Kong X, Shianna KV, Need AC, Feng S, Hersh CP, Bakke P, Gulsvik A, Ruppert A, Lodrup Carlsen KC, Roses A, Anderson W, Rennard SI, Lomas DA, Silverman EK, Goldstein DB. A genome-wide association study in chronic obstructive pulmonary disease (COPD): identification of two major susceptibility loci. PLoS Genet 2009; 5:e1000421.
  3. DeMeo DL, Mariani T, Bhattacharya S, Srisuma S, Lange C, Litonjua A, Bueno R, Pillai SG, Lomas DA, Sparrow D, Shapiro SD, Criner GJ, Kim HP, Chen Z, Choi AM, Reilly J, Silverman EK. Integration of genomic and genetic approaches implicates IREB2 as a COPD susceptibility gene. Am J Hum Genet 2009; 85:493-502.
  4. Wilk JB, Chen TH, Gottlieb DJ, Walter RE, Nagle MW, Brandler BJ, Myers RH, Borecki IB, Silverman EK, Weiss ST, O'Connor GT. A genome-wide association study of pulmonary function measures in the Framingham Heart Study. PLoS Genet 2009; 5:e1000429.
  5. Hancock DB, Eijgelsheim M, Wilk JB, Gharib SA, Loehr LR, Marciante KD, Franceschini N, van Durme YM, Chen TH, Barr RG, Schabath MB, Couper DJ, Brusselle GG, Psaty BM, van Duijn CM, Rotter JI, Uitterlinden AG, Hofman A, Punjabi NM, Rivadeneira F, Morrison AC, Enright PL, North KE, Heckbert SR, Lumley T, Stricker BH, O'Connor GT, London SJ. Meta-analyses of genome-wide association studies identify multiple loci associated with pulmonary function. Nat Genet 2010; 42:45-52.
  6. Repapi E, Sayers I, Wain LV, Burton PR, Johnson T, Obeidat M, Zhao JH, Ramasamy A, Zhai G, Vitart V, Huffman JE, Igl W, Albrecht E, Deloukas P, Henderson J, Granell R, McArdle WL, Rudnicka AR, Barroso I, Loos RJ, Wareham NJ, Mustelin L, Rantanen T, Surakka I, Imboden M, Wichmann HE, Grkovic I, Jankovic S, Zgaga L, Hartikainen AL, Peltonen L, Gyllensten U, Johansson A, Zaboli G, Campbell H, Wild SH, Wilson JF, Glaser S, Homuth G, Volzke H, Mangino M, Soranzo N, Spector TD, Polasek O, Rudan I, Wright AF, Heliovaara M, Ripatti S, Pouta A, Naluai AT, Olin AC, Toren K, Cooper MN, James AL, Palmer LJ, Hingorani AD, Wannamethee SG, Whincup PH, Smith GD, Ebrahim S, McKeever TM, Pavord ID, MacLeod AK, Morris AD, Porteous DJ, Cooper C, Dennison E, Shaheen S, Karrasch S, Schnabel E, Schulz H, Grallert H, Bouatia-Naji N, Delplanque J, Froguel P, Blakey JD, Britton JR, Morris RW, Holloway JW, Lawlor DA, Hui J, Nyberg F, Jarvelin MR, Jackson C, Kahonen M, Kaprio J, Probst-Hensch NM, Koch B, Hayward C, Evans DM, Elliott P, Strachan DP, Hall IP, Tobin MD. Genome-wide association study identifies five loci associated with lung function. Nat Genet 2010; 42:36-44.
  7. Cho MH, Boutaoui N, Klanderman BJ, Sylvia JS, Ziniti JP, Hersh CP, DeMeo DL, Hunninghake GM, Litonjua A, Sparrow D, Lange C, Won S, Murphy J, Beaty T, Regan EA, Make B, Hokanson JE, Crapo JD, Kong XQ, Anderson WH, Tal-Singer R, Lomas DA, Bakke P, Gulsvik A, Pillai SG, Silverman EK. Variants in FAM13A are associated with chronic obstructive pulmonary disease. Nat Genet 2010; 42:200-2.