Genetic Diversity Analysis of Eritrean Sorghum (Sorghum bicolor (L.) Moench) Germplasm using SSR Markers

Eritrea is considered a center of origin for sorghum, the country's main cereal crop in terms of area under cultivation and production. Very few genetic diversity studies have been carried out on Eritrean sorghum. Improving this crop requires knowledge of the genetic diversity of the available germplasm. The aim of this study was therefore to assess the extent of genetic diversity within and among 98 sorghum genotypes collected from Eritrea, alongside 42 regional reference accessions from the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), using a set of 29 Simple Sequence Repeat (SSR) markers. An average of 4.8 alleles per marker was recorded. The mean polymorphism information content (PIC) value for the SSR loci was 0.52. The Analysis of Molecular Variance (AMOVA) revealed that 12% of the variation resulted from differences among populations, 57% from differences among individual accessions within populations and 31% from variation within individual accessions. A neighbor-joining phylogenetic tree based on the genetic similarity coefficient revealed three distinct clusters, with the Eritrean populations further sub-clustered into three groups. The Eritrean sorghum accessions from the Gash Barka and South regions and the South Sudan accessions recorded the highest numbers of private alleles. Principal Coordinate Analysis (PCoA) also classified the sorghum accessions into three major groups. The genetic distance matrix revealed that the Eritrean accessions are more closely related to each other than to the regional accessions. The high level of allelic richness, the close genetic distances and the isolated clustering of the Eritrean populations indicate that these accessions have not been introgressed with foreign genes and are a valuable resource for future breeding programs of this crop.


Background
Sorghum (2n=20) belongs to the family Poaceae, tribe Andropogoneae, genus Sorghum Moench and species Sorghum bicolor (L.) Moench. This species includes the annual sorghums, namely grain sorghum, sorgos, broomcorn and Sudan grass (Prasad and Stagenborg 2008). Sorghum is the fifth most important cereal crop worldwide after wheat (Triticum aestivum), rice (Oryza sativa), maize (Zea mays) and barley (Hordeum vulgare) (FAOSTAT 2012). Together with maize and pearl millet (Pennisetum glaucum (L.) R. Br.), it is among the most important dryland cereal crops of the semi-arid tropics. It is grown in at least 86 countries on an area of 38 million hectares, with an annual grain production of 58 million tonnes. Average productivity is about 1.5 t ha⁻¹ (FAOSTAT 2012).
Grain sorghum is the most important staple food crop in Eritrea, where the grain is used for human consumption in different forms. Grains are ground into flour and used to make 'Injera', bread and local drinks, while the leaves and stalks are commonly fed to animals. Sorghum is mainly grown under rain-fed conditions by resource-poor subsistence farmers with very little or no capital inputs such as fertilizers, pesticides or irrigation (Tesfamichael et al., 2013). It is widely grown in the lowland and mid-highland regions of the country, where rainfall is too low for the cultivation of other cereals. The crop is cultivated annually in Eritrea on an average area of 230,000 hectares, producing approximately 135,000 tons of grain at a productivity of less than 1 t ha⁻¹, which is below the global average (MoA, 2010). This low productivity is attributed to drought, Striga and a lack of knowledge on the benefits of genetic diversity in the country.
Molecular Plant Breeding 2014, Vol.5, No.13, 1-12 http://mpb.biopublisher.ca
The eastern African region, to which Eritrea belongs, has been described as one of the centers of diversity and a possible area of domestication for sorghum (Ghebru et al., 2002). Although Eritrea is home to a large number of sorghum landraces, very little information on the genetic diversity of these landraces is available. Previous studies of the National Agricultural Research Institute of Eritrea indicated that over the last 15 years the country's sorghum improvement program has relied on adopting exotic improved cultivars. This reliance has brought the risk of eroding the genetic diversity of the local sorghum landraces (Engels and Hawke, 1991). However, small-scale farmers in Eritrea commonly grow sorghum landraces that vary widely in plant structure, panicle orientation, seed color and maturity period (Tesfamichael et al., 2013). Landraces have been selected and maintained for many years by farmers on the basis of their grain and stalk qualities and their adaptation to specific ecologies (Mann et al., 1983). Successful plant-breeding programs depend on the availability of wide crop genetic diversity. In the search for diverse breeding material, farmer cultivars or landraces (locally adapted populations bred through traditional methods of direct selection) are usually the major sources of genetic variation for solving different production constraints (Ghebru et al., 2002).
Different DNA markers have been used for diversity assessment in sorghum and other crops. Among these, simple sequence repeats (SSRs) are the most commonly used because they are hyper-variable, co-dominant, robust and multi-allelic (Rakshit et al., 2012). SSR markers are widely used for diversity assessment in several cultivated crop species, including sorghum (Dje et al., 2000; Ghebru et al., 2002; Agrama and Tuinstra, 2003). The main aim of the current research was to assess the genetic diversity and genetic relationships within and among the Eritrean accessions, together with a reference set of germplasm from eastern and central Africa. The study is of paramount importance in addressing the knowledge gap and in facilitating the documentation and utilization of landrace diversity in sorghum breeding programs in Eritrea, so that small-scale farmers benefit from solutions to the low productivity of this crop.

SSR marker categorization and extent of genetic diversity
The twenty-nine SSR markers generated a total of 140 alleles, which were used to estimate the genetic diversity among the 140 sorghum genotypes. The number of alleles revealed by each marker ranged from two (gbsb123, Xcup61 and mSbCIR262) to eight (mSbCIR283 and Xtxp141), with an average of 4.8 per marker (Table 1). The polymorphism information content (PIC) value for the SSR loci ranged from 0.06 (mSbCIR262) to 0.74 (Xtxp265), with a mean of 0.52. Nineteen SSR markers had PIC values of more than 0.50. The mean level of heterozygosity per SSR marker was 0.22, ranging from 0.02 for marker mSbCIR262 to 0.74 for Xtxp136. Marker Xtxp265 had the highest gene diversity (0.77), while mSbCIR262, with a value of 0.07, had the lowest.
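The per-marker statistics reported above can be illustrated with a minimal sketch. It assumes diploid genotypes scored as pairs of allele sizes and uses the textbook, uncorrected formulas for gene diversity and PIC; the software used in this study may apply sample-size corrections that are not shown here. The function names and the example genotypes are hypothetical and are not taken from the study data.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Allele -> frequency at one locus.

    genotypes: list of (allele1, allele2) tuples; None marks missing data.
    """
    counts = Counter(a for g in genotypes if g is not None for a in g)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def gene_diversity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    p = list(freqs.values())
    homozygous = sum(x * x for x in p)
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1.0 - homozygous - cross


def observed_heterozygosity(genotypes):
    """Proportion of scored individuals carrying two different alleles."""
    scored = [g for g in genotypes if g is not None]
    return sum(1 for a, b in scored if a != b) / len(scored)


# Hypothetical locus with two alleles (sizes in bp) scored on four plants.
example = [(100, 100), (100, 102), (102, 102), (100, 102)]
freqs = allele_frequencies(example)
print(gene_diversity(freqs), pic(freqs), observed_heterozygosity(example))
```

For a locus with two equally frequent alleles, PIC is lower than gene diversity, which is why the two statistics reported for the same marker need not coincide.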

Patterns of genetic differentiation
Cluster analysis was carried out independently for the 11 populations of Eritrean and regional reference accessions. Unweighted neighbor-joining cluster analysis placed the 140 sorghum accessions into three major clusters, 'A', 'B' and 'C' (Figure 1). Cluster 'A' consisted of 66 Eritrean accessions; cluster 'B' consisted of 39 accessions, mainly from Tanzania, Uganda, Kenya, Ethiopia, Sudan and South Sudan; and cluster 'C' consisted of 35 accessions from regional and Eritrean populations. Cluster 'A' was further subdivided into three sub-clusters, I, II and III. Sub-cluster I grouped ten accessions from Gash Barka, South, Anseba and Northern Red Sea. Sub-cluster II comprised 16 accessions from Gash Barka, Anseba, South and Northern Red Sea, while sub-cluster III was the largest, with 40 accessions from the four regions of Eritrea (Figure 1). Cluster 'B' was composed mainly of regional accessions from Uganda (4), Kenya (15), Tanzania (4), Sudan (6), Ethiopia (4) and South Sudan (4), together with one accession from Northern Red Sea and one nationally released cultivar of Eritrea. Cluster 'C' consisted of mixed populations from South Sudan (2), Ethiopia (1), Kenya (1) and Tanzania (1) and from the Anseba (3), Gash Barka (18), South (4) and Northern Red Sea (5) regions. The genetic relationships among the Eritrean and regional accessions were further investigated using principal coordinate analysis (PCoA) (Figure 2). The PCoA classified the 140 accessions into three major groups. The accessions of Eritrean origin and the regional reference accessions from eastern African countries formed groups I and II, respectively. The pattern of clustering was similar to that detected by cluster analysis, except that nine accessions from Uganda and Kenya remained distinct outliers and formed a solitary group, categorized as group III. These accessions clustered far apart from all other germplasm, indicating their dissimilarity to the other groups.

Population structure
All the variance components of the sorghum accessions under study (among populations, among individuals and within individuals) showed clear differences in significance tests based on 1,000 permutations. The Analysis of Molecular Variance (AMOVA) revealed that 12% of the variation resulted from differences among populations, 57% from differences among individual accessions within populations and 31% from variation within accessions (Table 2). The within-population variation was contributed mainly by germplasm from the Eritrean populations of the Gash Barka, South, Anseba and Northern Red Sea regions.

Genetic diversity pattern
The SSR markers used in this study were able to structure both the Eritrean and regional accessions.
A comparison of allelic richness showed considerable variation among the populations. The number of private or rare alleles per population ranged from 0 to 13. The Eritrean Gash Barka population had the highest number of private alleles (13, from 9 accessions), followed by South Sudan with 8, and by the South region of Eritrea and Kenya with 5 each. The populations of the Gash Barka, Anseba and Northern Red Sea regions had the highest percentage of polymorphic loci, at 100% (allelic frequency >5%). The lowest percentage was observed for the Ugandan and National program populations. The mean observed gene diversity of the geographical populations was variable, ranging from 0.093 for the Tanzanian population to 0.253 for the Eritrean South population. In general, the Eritrean populations showed higher observed heterozygosity than the regional populations (Table 3). Population-specific F-statistics showed high values of the inbreeding index (Fis) for all populations, ranging between 0.5 and 0.8 with a mean of 0.65 (Table 3). The Nei unbiased genetic distance matrix was calculated to estimate the relationships among the 11 populations comprising the 140 accessions. The greatest distances were between the accessions from Uganda and those from the South and Anseba regions of Eritrea, with genetic distances of 0.674 and 0.668, respectively. The closest populations were those from Gash Barka, Anseba, South and Northern Red Sea, with genetic distances ranging from 0.042 to 0.085. The Ethiopian, South Sudanese, Kenyan and Tanzanian populations were also close to each other, with genetic distances ranging from 0.068 to 0.171. In general, all the Eritrean germplasm accessions were genetically distant from the regional accessions of Ethiopia, Kenya, Sudan, Tanzania, South Sudan and Uganda. However, the regional accessions had close genetic distances and clustered with each other, with the exception of the Ugandan and some Kenyan accessions (Table 4).
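The inbreeding index and the pairwise population distances discussed above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the software routine used in the study: Fis is taken as 1 - Ho/He, and the distance follows the standard unbiased estimator of Nei (1978), computed from per-locus allele frequencies and the number of diploid individuals sampled per population. Function names and example values are hypothetical.

```python
import math


def fis(h_obs, h_exp):
    """Inbreeding index from observed and expected heterozygosity."""
    return 1.0 - h_obs / h_exp


def nei_unbiased_distance(freqs_x, freqs_y, n_x, n_y):
    """Nei's (1978) unbiased genetic distance between populations X and Y.

    freqs_x, freqs_y: one dict (allele -> frequency) per locus, same locus order.
    n_x, n_y: number of diploid individuals sampled in each population.
    """
    n_loci = len(freqs_x)
    jx = jy = jxy = 0.0
    for fx, fy in zip(freqs_x, freqs_y):
        sum_x2 = sum(p * p for p in fx.values())
        sum_y2 = sum(p * p for p in fy.values())
        # Sample-size-corrected within-population identities.
        jx += (2 * n_x * sum_x2 - 1) / (2 * n_x - 1)
        jy += (2 * n_y * sum_y2 - 1) / (2 * n_y - 1)
        # Between-population identity needs no correction.
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    jx, jy, jxy = jx / n_loci, jy / n_loci, jxy / n_loci
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical single-locus example with strongly diverged allele frequencies.
pop_a = [{"a": 0.9, "b": 0.1}]
pop_b = [{"a": 0.1, "b": 0.9}]
print(fis(0.2, 0.5), nei_unbiased_distance(pop_a, pop_b, 10, 10))
```

Because of the sample-size correction, the unbiased estimator can return small negative values for populations with nearly identical allele frequencies; such values are usually read as a distance of zero.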

Discussion
The mean number of alleles per SSR locus (4.8) detected in the 140 sorghum accessions in the current study was similar to the 4.3 alleles per locus reported for sorghum by Agrama and Tuinstra (2003), who employed 28 SSR primers, but lower than the 5.9 alleles per locus reported by Smith et al. (2000). The gene diversity observed in the populations studied here (0.57) is also very similar to the value (0.58) reported by Smith et al. (2000) in sorghum, but lower than the value (0.62) reported by Agrama and Tuinstra (2003). The high level of gene diversity at the SSR markers observed in this study was probably due to the extensive genetic diversity of these sorghum accessions, which represent different races and geographic regions. Nineteen SSR markers recorded PIC values of more than 0.50, indicating their usefulness in discriminating among the genotypes. PIC values of more than 0.50 were also reported by Smith et al. (2000) and Rakshit et al. (2012) in Sorghum bicolor.
Accessions that originated or were collected from close geographic regions were generally clustered together in the unrooted neighbor-joining tree. The fact that 63% of the Gash Barka, 79% of the Anseba, 83% of the South and 50% of the Northern Red Sea accessions clustered together in 'A', and further sub-clustered into three groups, indicates that the Eritrean populations, though similar, have some degree of variability among each other. When the Eritrean accessions were examined closely, smaller clusters were observed that contained accessions from all four administrative regions of Eritrea. This could be due to the same accession being given different names in different regions, or to a shared gene pool in their ancestry. Two years of field records on the phenotypic and morphological evaluation of the Eritrean sorghum accessions also revealed similar characteristics, which is consistent with the clusters. However, compared with the East African reference sets, the Eritrean populations show a high degree of dissimilarity and are closely related to each other, indicating the uniqueness of the Eritrean accessions. On the other hand, it cannot be ignored that a few accessions from the Eritrean populations clustered closely with the regional references. For instance, 18 accessions of the population from the Gash Barka region, which borders Sudan and Ethiopia, grouped in the major cluster 'C'. This indicates some degree of germplasm exchange and gene pool sharing between this region and the neighboring countries, i.e. Sudan and Ethiopia. The sorghum accessions from Gash Barka grow in an area bordering Sudan and Ethiopia, often near wild sorghums, which provides an additional opportunity for the introgression of foreign genetic material. Similar studies by Epperson (2004) and Ghebru et al. (2002) indicated that seed exchange and pollen dispersal cause similarity between neighboring populations, whereas distant populations differ for the studied autocorrelation. Interestingly, the two cultivars released by the National program in Eritrea, indicated here as the National program population, clustered within the regional set of references 'B'. The main reason is that the pedigrees of the crosses behind these varieties originated from Sudan, Ethiopia and Kenya. The results of the PCoA were also similar to those of the neighbor-joining method. However, four accessions from Uganda, namely IS 8193, Serena, Seredo and 5DX 160, formed a solitary group with five accessions from Kenya: Teso #1, Asinge local, Siaya #42, Siaya #82 and Makueni. These accessions, though clustered on the basis of their geographical regions, show a higher degree of relatedness.
In the present study, the variance components confirmed that there is greater genetic diversity among individual accessions within populations (57%) than among populations (12%) or within individual accessions (31%). In agreement with the current results, Ghebru et al. (2002) reported high genetic diversity in a separate study of 28 Eritrean sorghum landraces. The relatively higher percentage of variation among individual accessions within a population could be due to the selection practice of local farmers, where each farmer keeps and maintains more than one landrace for various uses, as reported by Tesfamichael et al. (2013) and Tiny et al. (2014) in sorghum. This practice of separately maintaining several landraces increases the total sorghum genetic diversity within a given geographic area but does not increase the within-population genetic diversity, due to the inbreeding nature of sorghum. The fairly high inbreeding index values may result from shared common alleles and genetic drift. This is especially true of the Eritrean populations, where farmers have the tradition of selecting panicles while the crop is in the field and retaining their own seeds. Another reason for the high genetic variation among accessions within populations could be the high level of informal seed exchange and open sorghum marketing across localities within the administrative regions (Tesfamichael et al., 2013). The inbreeding index obtained in the current study was slightly lower than that of Dje et al. (2000), with Fis = 0.68, but higher than that obtained by Ghebru et al. (2002), with Fis = 0.45, in Sorghum bicolor.
The most noticeable result of the current study was the high level of private allelic richness observed in the sorghum populations of the Gash Barka and South regions of Eritrea. This could be beneficial to sorghum breeding programs and to further population-genetic studies, as the private alleles may be linked to unique traits. The high allelic richness in the accessions of the Gash Barka and South regions could be due to the fact that most of the sorghum germplasm collected in the country comes from these two geographical areas. Further, specific conservation efforts may be necessary to maintain this unique diversity in its area of cultivation. The Eritrean accessions with higher numbers of private and unique alleles include 'white Bazenay', 'Wedi-ferej', 'Koden short', 'Koden tall', 'Embulbul', 'Ajeb Sidu', 'Kine Dirga' and 'Kine Biba' from the Gash Barka population. Accessions 'Kehi Mashela', 'Anseba', 'Koden loose' and 'Koden compact' are from the Eritrea South population. 'Hijeri' and 'Wedi-Susa' are accessions with private alleles from the Northern Red Sea and Anseba populations, respectively. Accessions from South Sudan rich in private alleles include 'Jeri', 'Medenga', 'Okabir', 'Deri', 'Kodu Kine' and 'Oderi'.
The mean gene diversity (He = 0.50) for the current sorghum populations is slightly lower than the value estimated for cultivated sorghum of Kenya (0.59) by Mutegi et al. (2011). The Ho values were generally lower than the He values, indicating deviations from random mating and a low cross-pollination rate due to the isolation among the different accessions of each population or among the diverse geographical samples. However, the observed heterozygosity (Ho) in the Eritrean sorghum populations of Gash Barka (0.25) and South (0.21) is relatively higher than in the other populations. This higher observed heterozygosity in the landraces of the two Eritrean populations may be because they are in continuous segregation, which could be due to free gene flow and seed exchange across administrative regions. Based on the current results, the most relevant populations for further improvement and selection of this crop are the sorghum populations from the Gash Barka and South regions.
Nei's unbiased pairwise genetic distances between populations were variable. The Eritrean populations showed low genetic distances to each other, indicating that they are both geographically close and genetically similar. In the same way, the regional reference sets showed low genetic distances to each other, except those from Uganda, which showed higher genetic distances to the Eritrean and other regional accessions. This could be due to geographical isolation from the Eritrean accessions, and the Ugandan accessions could have unique characters that the Eritrean accessions lack. Thus, introducing selected Ugandan accessions could be beneficial to the Eritrean sorghum breeding program.

Conclusions
The Eritrean sorghum accessions showed wide genetic diversity and possess unique alleles. The Eritrean accessions grouped in one main cluster and three sub-clusters, demonstrating that Eritrean sorghum germplasm is still isolated and contains a great deal of genetic diversity with a high level of allelic richness. These germplasm accessions can be exploited in breeding programs for the improvement of the sorghum crop in Eritrea. To retain this rich and unique genetic diversity, special conservation efforts may be necessary for all the Eritrean sorghum populations in general, and for the Gash Barka and South region populations in particular, to safeguard the diversity and reap the maximum benefits associated with it.

Plant germplasm
A total of 96 Eritrean landraces (Table 5), along with two released cultivars, were selected based on the Eritrean gene bank characterization information and agro-ecological representation. The seeds of these accessions were obtained from the Plant Genetic Resource unit of Eritrea. In addition, 42 sorghum germplasm accessions (Table 6) from the eastern and central African (ECA) countries were obtained from the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Kenya regional collection, and included as a reference set. All 140 accessions were planted in the greenhouse at the Biosciences eastern and central Africa (BecA)-ILRI Hub, Nairobi, Kenya.

Genomic DNA extraction
The seeds of each selected accession were planted in plastic trays in which each hole had a base diameter of 2.8 cm, a top diameter of 4 cm and a height of 4 cm. Each hole was filled with sterile soil and irrigated, and the seeds were planted in the Biosciences eastern and central Africa (BecA) greenhouse. The seedlings were irrigated as required and maintained at temperatures between 21°C and 25°C. Tender leaf tissues from three plants per accession were harvested from 14-day-old seedlings and bulked for genomic DNA extraction. DNA was extracted using the cetyltrimethylammonium bromide (CTAB) method according to Mace et al. (2004). The quality and concentration of the isolated DNA were determined using 2% agarose gel electrophoresis stained with GelRed™ (Biotium, USA) (2.4 µl/100 ml) and a Nanodrop® 2000c spectrophotometer, respectively. All the DNA samples were diluted to a final concentration of 20 ng/µl.

PCR amplification
A total of 29 labeled SSR markers previously described by Menz et al. (2002) were used for this study (Table 7). PCR was prepared in a 10 µl reaction volume consisting of 2 mM MgCl2, 1x PCR buffer, 0.20 µM reverse primer, 0.20 µM forward primer labeled with either 6-FAM, VIC, PET or NED, 0.04 mM of each of the four dNTPs, 0.2 U Taq DNA polymerase (Sibenzyme®) and 30 ng template DNA, topped up with sterile distilled water. A GeneAmp® PCR system 9700 (PE-Applied Biosystems) was used for temperature cycling as follows: 5 min at 94°C, followed by 35 cycles of 30 s at 94°C, 1 min at 55°C and 2 min at 72°C, with a final extension of 15 min at 72°C. Following PCR, two reaction products from each SSR marker were randomly selected to confirm proper amplification and PCR product concentration on a 2% (w/v) agarose gel. Samples that amplified well were subjected to capillary electrophoresis to determine their sizes.
PCR-amplified products of 3-4 individual primer pairs were co-loaded based on the fluorescent dye, fragment size and dye fluorescence strength, to reduce the unit cost of high-throughput genotyping. Labeled PCR products (2.0 µl) were mixed with 7.85 µl Hi-Di formamide (Applied Biosystems) and 0.15 µl GeneScan LIZ 500 size standard (Applied Biosystems) and denatured at 94°C for 5 min before analysis by capillary electrophoresis using the ABI PRISM 3730 (Applied Biosystems).
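The quantities stated in the protocol above imply a few simple derived values, worked through in the short sketch below. It uses only figures given in the text (30 ng template from 20 ng/µl stocks, the cycling profile, and the loading-mix volumes); the function names are hypothetical, and the programmed time ignores instrument ramping, which is an assumption of this sketch.

```python
def template_volume_ul(mass_ng, conc_ng_per_ul):
    """Volume of DNA stock needed to deliver a given template mass."""
    return mass_ng / conc_ng_per_ul


def programmed_minutes(initial_min, cycles, step_minutes, final_min):
    """Total programmed hold time of a PCR profile, excluding ramp times."""
    return initial_min + cycles * sum(step_minutes) + final_min


# 30 ng of template taken from DNA diluted to 20 ng/ul.
print(template_volume_ul(30, 20))

# 5 min at 94 C; 35 cycles of 0.5 + 1 + 2 min; 15 min final extension.
print(programmed_minutes(5, 35, (0.5, 1, 2), 15))

# Capillary loading mix: PCR product + Hi-Di formamide + size standard (ul).
print(round(2.0 + 7.85 + 0.15, 2))
```

Under these figures each reaction receives 1.5 µl of template, the programmed run time is a little under two and a half hours, and each capillary sample has a total volume of 10 µl.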

Data Analysis
Twenty-nine SSR markers were used for this study. The peaks were sized and the alleles were scored using GeneMapper version 4.1 software (Applied Biosystems). The data were analysed using PowerMarker version 3.25 (Liu and Musa 2005) to calculate PIC, an estimate of the discriminatory power of a locus that takes into account not only the number of alleles expressed but also their relative frequencies. The PowerMarker analysis also provided heterozygosity, the number of alleles identified for each marker, the extent of genetic diversity among the accessions and their genetic distances.
The data generated by GeneMapper were analyzed using GenAlEx version 6.4 (Peakall and Smouse 2012) to produce the Principal Coordinate Analysis, which helped to establish the relationships among individuals of the sorghum populations. GenAlEx was also used for the Analysis of Molecular Variance (AMOVA), which partitions the variance among the genotypes, and for calculating the percentage of polymorphism, the number of private alleles and the genetic distances. Dissimilarity indices were estimated from the allelic data by simple allele matching, and cluster analysis based on unweighted neighbor-joining (Gascuel, 1997) was carried out using the DARwin 5.0 dissimilarity analysis software (Perrier and Jacquemoud 2006).
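The simple allele matching index mentioned above can be sketched as follows. The sketch assumes the usual definition for allelic data, d = 1 - (1/L) Σ m_l/π, where L is the number of loci scored in both individuals, m_l is the number of alleles shared at locus l and π is the ploidy; the handling of missing data (skipping loci not scored in both individuals) is an assumption of this sketch rather than a documented setting of the analysis. Function names and genotypes are hypothetical.

```python
from collections import Counter


def shared_alleles(genotype_a, genotype_b):
    """Number of alleles two genotypes have in common at one locus,
    counted as a multiset intersection so that (100, 100) vs (100, 102) gives 1."""
    return sum((Counter(genotype_a) & Counter(genotype_b)).values())


def simple_matching_dissimilarity(ind_i, ind_j, ploidy=2):
    """Simple matching dissimilarity between two multilocus genotypes.

    ind_i, ind_j: lists of allele tuples, one per locus; None marks missing data.
    """
    loci = [(a, b) for a, b in zip(ind_i, ind_j) if a is not None and b is not None]
    if not loci:
        raise ValueError("no locus scored in both individuals")
    matched = sum(shared_alleles(a, b) / ploidy for a, b in loci)
    return 1.0 - matched / len(loci)


# Two hypothetical accessions scored at two SSR loci (allele sizes in bp).
acc_1 = [(100, 100), (150, 152)]
acc_2 = [(100, 102), (150, 152)]
print(simple_matching_dissimilarity(acc_1, acc_2))
```

A pairwise matrix of such values over all accessions is the kind of input a neighbor-joining algorithm uses to build the tree.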