Discovery of high-density molecular markers in barley and wheat through genotyping by sequencing: an interview with Dr. Jesse Poland (USDA)
Genotyping by sequencing (GBS) using the Ion PGM™ Sequencer facilitates crop breeding programs
Jesse Poland, PhD, is a research geneticist with the United States Department of Agriculture–Agricultural Research Service (USDA-ARS) and an adjunct assistant professor at Kansas State University, Agronomy Department. Dr. Poland’s research focuses on wheat germplasm improvement and developing new breeding technologies. He recently collaborated with Life Technologies and Dr. Nils Stein at The Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Germany, to conduct a barley genotyping by sequencing study on the Ion PGM™ Sequencer. Dr. Poland was gracious enough to answer a few of our questions regarding his work.
What questions are you trying to answer for barley and wheat in particular?
Our primary research is on wheat, but we also work on barley to develop new genotyping approaches. The new genotyping-by-sequencing (GBS) approach will be used in research on multiple crops. Barley is a good model system for exploring the larger polyploid wheat genome.
For barley, we work mainly on mapping populations to develop high-density genetic maps. In collaboration with Dr. Stein at IPK and the barley sequencing consortium, these maps are used to anchor the barley reference genome and develop a physical map.
For wheat, we focus on breeding material, largely using GBS approaches to genotype breeding lines from US programs and the International Maize and Wheat Improvement Center (CIMMYT). The GBS markers are used to characterize the germplasm and develop genomic selection models for breeding programs.
What technologies do you use today for genotyping?
My lab works only with GBS. Other labs use the (Illumina®) 9K SNP array, which was developed collaboratively by many labs in the wheat consortium. The consortium is also working on a 90K SNP array. Also, when researchers target an individual gene, individual TaqMan® or KASPar (by KBioscience) assays are used.
What are the advantages of GBS compared to microarrays?
- First, it is de novo. There is no need to discover SNPs first and then design arrays. This allows you to study new species or populations. SNP array development is very time-consuming and costly, but with GBS there are no upfront efforts.
- Arrays only become cost-effective at scale: a very large number of arrays must be manufactured. GBS costs the same (on a per-sample basis) for hundreds or thousands of samples. This is very important for smaller labs and smaller projects.
- As next-generation sequencing (NGS) becomes cheaper and faster, GBS becomes more accessible.
- GBS is free of the bias associated with a fixed array design. Arrays designed based on one set of populations might not represent the SNPs in a new germplasm set.
- GBS also has advantages when studying polyploid species. Polyploidy is a challenge for any technology. One advantage of GBS is that it relies on secondary genome-specific polymorphisms that are next to the SNP. This allows us to assign a given sequence to a specific genome, so it becomes a single-locus marker. With the array-based method, a probe that targets a SNP will assay all three genomes at the same time. The cross talk between genomes makes SNP calling very difficult.
- Finally, and most importantly, GBS has low cost per sample and per data point.
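The polyploid point in the list above can be illustrated with a toy sketch. It is not Dr. Poland's pipeline: the tag sequences, the positions `FLANK_POS` and `SNP_POS`, and the mapping `FLANK_TO_GENOME` are all hypothetical, chosen only to show how a genome-specific polymorphism next to a SNP lets reads be sorted by subgenome before alleles are counted.

```python
from collections import Counter, defaultdict

# Hypothetical layout of a 12-bp sequence tag (illustration only):
# index 3 holds a genome-specific base, index 8 holds the SNP of interest.
FLANK_POS = 3
SNP_POS = 8

# Hypothetical mapping from the flanking base to a wheat subgenome.
FLANK_TO_GENOME = {"T": "A", "C": "B", "G": "D"}


def alleles_by_genome(reads):
    """Count SNP alleles separately for each subgenome.

    Reads whose flanking base is not in the mapping are skipped.
    """
    counts = defaultdict(Counter)
    for read in reads:
        genome = FLANK_TO_GENOME.get(read[FLANK_POS])
        if genome is not None:
            counts[genome][read[SNP_POS]] += 1
    return counts


# Made-up reads: only the A-genome copy carries two alleles at the SNP.
reads = [
    "ACGTTTGAACCA",  # A genome, allele A
    "ACGTTTGAACCA",  # A genome, allele A
    "ACGTTTGAGCCA",  # A genome, allele G
    "ACGCTTGAACCA",  # B genome, allele A
    "ACGGTTGAACCA",  # D genome, allele A
]

counts = alleles_by_genome(reads)
for genome in sorted(counts):
    print(genome, dict(counts[genome]))
```

Once the reads are separated this way, the polymorphism shows up as a single-locus marker in one subgenome instead of a mixed signal from all three.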
If a sequencing technology allowed you to sequence the entire genome at a low cost, would you sequence it, or does GBS still have advantages in some cases?
GBS can be used as an intermediate solution prior to the availability of the sequence for the entire genome. But it is always cheaper to reduce the complexity of the genome. Cost per sample is very important for breeding. Also, the predictive power provided by 100,000 markers may not be higher than that of 50,000 markers. It makes more sense to increase the sample throughput when the sequencing capacity goes up.
In addition, the wheat genome, for example, is 80% repetitive sequence, so it may never be practical to sequence the entire genome. We use methylation-sensitive restriction enzymes to avoid repetitive sequences. Therefore, it may always be important to reduce the complexity for these types of genomes. On the other hand, for crops like rice, which has a much smaller genome, it might be more practical to do random resequencing of the genome.
What are the key requirements for a sequencing platform to fit those needs?
The key requirement is a platform that can generate a lot of independent sequencing reads. More reads offer a higher chance of hitting different genome regions for genotype calls. For a given GBS library, if a sequencing run can generate 1 Gb of data, it is better to have 10 million reads 100 bp long than 1 million reads 1,000 bp long. The longer reads still target the same regions. Of course, if the reads are too short, there will not be enough sequence to cover the SNPs. Sample multiplexing is also important because the cost per sample depends on the multiplexing ability.
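The arithmetic behind this answer can be sketched in a few lines. The figures for yield and read length come from the answer above (one billion bases per run, 100 bp versus 1,000 bp reads); the 96-sample multiplexing level and the function names are assumptions added purely for illustration.

```python
def read_count(total_bases: int, read_length: int) -> int:
    """Number of independent reads a run yields at a given read length."""
    return total_bases // read_length


def reads_per_sample(total_bases: int, read_length: int, samples: int) -> int:
    """Reads available to each sample when the run is evenly multiplexed."""
    return read_count(total_bases, read_length) // samples


RUN_YIELD = 1_000_000_000  # one billion bases per run, as in the example above
MULTIPLEX = 96             # assumed number of barcoded samples per run

for length in (100, 1_000):
    total = read_count(RUN_YIELD, length)
    each = reads_per_sample(RUN_YIELD, length, MULTIPLEX)
    print(f"{length:>5} bp reads: {total:>10,} total, {each:>7,} per sample")
```

At the same total yield, the shorter reads give ten times as many independent tags per sample, which is what matters for covering more genome regions in GBS.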
Would you foresee GBS replacing approaches such as microarray or other SNP genotyping technologies? How about individual or low multiplexing SNP assays?
Yes, but I’m not sure how quickly it will happen. The main challenge is bioinformatics. It is more difficult to analyze complex sequencing data than to make a clear, easy SNP call using array technology. A lot of good tools for GBS are coming along from different research groups. Given the fast pace at which sequencing technology is developing, I see a lot of movement toward sequencing-based technologies.
Certain breeding or selection programs still assay one or two markers, which can be achieved using individual assays. Researchers may do a preselection on one or two targeted genes and then move to whole-genome GBS. Individual assays are also useful in specific projects when a researcher wants to perform fine mapping on a single locus.
The USDA is turning 150 years old in 2012. How did modern technologies change agriculture in the last 150 years?
We have seen more and bigger changes in agricultural genomics and breeding technologies in the past 10 years than in the entire 150 years prior to that. SNP discovery, genotyping, and NGS have really changed the way we think. We can start asking questions we always had but never had the tools for (i.e., markers that allow us to assay the entire genome in a cost-effective way on very large populations). There are real opportunities to increase the output of breeding programs by putting these new sequencing technologies to work. The $1,000 genome project for human genetics can be leveraged in the crop world for a better understanding of those genomes in order to improve them. When you look globally, the impact on humans of applying those new technologies to improve crop and animal productivity can be much deeper and broader than that coming directly from human genomic research. We have hundreds of millions of children and families in extreme poverty facing food security challenges. Crop improvements that come out of these new technologies have the potential to dramatically improve their standard of living.