Development of Genetic Characterization Databases (GC-db) of Cattle Breeds

There is considerable genetic diversity among livestock, but there has been little attempt to exploit this diversity to increase productivity. One reason for this is the lack of coherent, organized breeding strategies. To study a breed, the first step is to characterize it comprehensively, including its history, population size and distribution, the physiological parameters of the animals, and the genetic polymorphism at loci specifically associated with traits of interest. Genetic characterization is a process by which livestock breeds are sampled and genotyped at an established set of loci. By comparing allele frequencies at common loci across breeds, one can determine how genetically similar the breeds are. This matters because breeds with unique genetics are more valuable for conservation purposes. Giving IAEA Member States access to the detailed genetic information and knowledge available in various sources would improve their capacity to realise the genetic potential of their livestock.
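The comparison of allele frequencies at common loci can be illustrated with a minimal Python sketch. It assumes diploid genotypes recorded as pairs of microsatellite allele lengths, and it uses Nei's standard genetic distance as one possible similarity measure. That choice is ours for illustration; the text does not state which measure is used in practice. The locus name and genotype values are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one locus.

    `genotypes` is a list of (allele_a, allele_b) pairs, one per animal,
    where each allele is a fragment length in base pairs.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance between two breeds.

    `freqs_x` and `freqs_y` map locus name -> {allele: frequency}.
    Only loci typed in both breeds are used.
    """
    loci = sorted(set(freqs_x) & set(freqs_y))
    if not loci:
        raise ValueError("the two breeds share no common loci")
    jx = jy = jxy = 0.0
    for locus in loci:
        px, py = freqs_x[locus], freqs_y[locus]
        jx += sum(p * p for p in px.values())    # homozygosity in breed X
        jy += sum(p * p for p in py.values())    # homozygosity in breed Y
        jxy += sum(p * py.get(a, 0.0) for a, p in px.items())  # shared alleles
    if jxy == 0.0:
        return float("inf")  # no allele in common at any shared locus
    return -log(jxy / sqrt(jx * jy))


# Hypothetical data: two breeds typed at one locus (lengths in bp).
breed_x = {"locus_A": allele_frequencies([(100, 102), (100, 100)])}
breed_y = {"locus_A": allele_frequencies([(100, 104), (104, 104)])}
print(nei_standard_distance(breed_x, breed_y))
```

A distance of zero means the two breeds have identical allele frequencies at the shared loci; larger values indicate greater divergence.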

At the Animal Production & Health Laboratory (APHL) at Seibersdorf, Austria, the development of bioinformatics and genomics tools continues to form an integral part of the strategy for providing Member States with the means to utilise the genetic resources within their national herds. Genetic characterization is one of the main areas in which the APHL works to foster collaboration among scientists in Member States. For this reason, an on-line database of microsatellite genotypes for the characterization of cattle breeds has been developed; it can easily be expanded to other species. The database will allow users to compare the characteristics of their breeds with those of breeds from other countries to determine uniqueness. It is also intended to encourage scientists to join efforts to perform a global "meta-analysis" of characterized species, which would be invaluable for conservation activities.

The systematic collection of bovine microsatellite data started with the EU-sponsored AIR concerted action, the project The Analysis of Genetic Diversity in Cattle to Preserve Future Breeding Options (1995-1998, coordinated by J.L. Williams of the Roslin Institute, UK). During this project, 30 microsatellites were selected as a standardized panel of markers, and these markers were subsequently recommended by the FAO for global use. The ResGen project, Strategy towards a Rational Conservation of the Genetic Diversity of European Cattle (1999-2002, coordinated by J.A. Lenstra of Utrecht University, The Netherlands), allowed a considerable expansion of the number of breeds characterized with the 30-microsatellite marker panel. Since then, data collection has continued through contributions from several European institutes, which together constitute the European Cattle Genetic Diversity Consortium. This work has already led to several publications.

The aim of the GC-db project was to develop a web-based interface that allows users to view, download, and upload molecular data (i.e., microsatellite allelic frequencies).

Implementation of GC-db platform

This database is now open to the public and can be accessed at www.cattlegenomic.com; however, it has different levels of access according to the type of user. Some data are visible to all, but access to other data, such as data for internal use, is password-protected. As of August 2010, 72 data sets are freely available to users logged in as guests. Although standardization of allele lengths can be problematic, these data have been checked for consistency. Existing data have been provided by institutions in France, Italy, Spain, Sweden, Switzerland, the United Kingdom, and The Netherlands, among others.

This platform is web-based, requires no special software on users' computers, and works in all web browsers. It has the following features:

- User management and security. Ownership of data and the level of manipulation are assigned to individuals, so each user is identified by his/her personal username and password.
- PID # (Project Identification Number). Each population is represented by a project, which is identified by a number and can be accessed with prior permission. The identification assigned to the project includes: GC - Species - Country abbreviation - Breed abbreviation - Institute/city abbreviation (e.g., GC-cattle-BE-BWB-Malle/Viterbo). This allows the platform to be expanded to other species and additional information from different studies to be integrated. A simple colour guide implemented in the system allows the user to select a background colour for his/her project.
- LRN (Laboratory Reference Number). A unique sample number is generated by the system for each participating laboratory, which includes: Species - Country - Breed abbreviation - Institute/city abbreviation - Sample Original Identification (e.g., cattle-Belgium-BWB-Malle/Viterbo-23).
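The PID and LRN naming schemes described above can be sketched in a few lines of Python. This is only an illustration of the identifier layout given in the text, not the platform's actual code. The function names are hypothetical, and the rule that a component may not itself contain a hyphen is our assumption, made because the hyphen serves as the separator.

```python
def _join(parts):
    """Join identifier components with hyphens after basic validation."""
    cleaned = [str(p).strip() for p in parts]
    for part in cleaned:
        if not part:
            raise ValueError("identifier components must not be empty")
        if "-" in part:
            # Assumption: hyphens are reserved as the separator.
            raise ValueError(f"component {part!r} must not contain '-'")
    return "-".join(cleaned)


def make_pid(species, country_abbr, breed_abbr, institute):
    """Project identifier: GC-Species-Country-Breed-Institute/city."""
    return _join(["GC", species, country_abbr, breed_abbr, institute])


def make_lrn(species, country, breed_abbr, institute, sample_id):
    """Sample identifier: Species-Country-Breed-Institute/city-SampleID."""
    return _join([species, country, breed_abbr, institute, sample_id])


# Reproduces the two examples given in the text.
print(make_pid("cattle", "BE", "BWB", "Malle/Viterbo"))
print(make_lrn("cattle", "Belgium", "BWB", "Malle/Viterbo", 23))
```

Because the species is part of every identifier, the same scheme extends to other species without any change to the format.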

Confidentiality, integrity and availability of data

The sensitivity of the data in terms of confidentiality and integrity differs between the various sections of the application. To ensure the integrity of the data and to respect data ownership and intellectual property rights (IPR), only the data owner is permitted to modify data. Data owners can also attach supporting documents, such as primer information, SOPs (Standard Operating Procedures), or multiplex PCR information, to their projects. Data owners can permit others to access their results and supporting documents, and they are responsible for the integrity and correctness of formula.
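The ownership rules described above can be sketched as follows. This is a minimal illustration in Python under our own assumptions, not a description of how GC-db is implemented: the class name, field names, and usernames are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical model of a project and its access rules."""
    pid: str
    owner: str
    public: bool = False                          # visible to guest users
    viewers: set = field(default_factory=set)     # users granted access
    documents: list = field(default_factory=list)

    def _require_owner(self, user):
        # Only the data owner may modify the project.
        if user != self.owner:
            raise PermissionError(f"{user} does not own project {self.pid}")

    def can_view(self, user):
        """Public data is visible to all; otherwise permission is needed."""
        return self.public or user == self.owner or user in self.viewers

    def grant_access(self, acting_user, other_user):
        """The owner permits another user to see results and documents."""
        self._require_owner(acting_user)
        self.viewers.add(other_user)

    def attach_document(self, acting_user, name):
        """The owner attaches a supporting document, e.g. an SOP."""
        self._require_owner(acting_user)
        self.documents.append(name)


project = Project(pid="GC-cattle-BE-BWB-Malle/Viterbo", owner="owner_user")
project.attach_document("owner_user", "primer_information.pdf")
project.grant_access("owner_user", "partner_user")
print(project.can_view("partner_user"), project.can_view("guest"))
```

Keeping the ownership check in a single helper means every modifying operation applies the same rule, which is one simple way to protect data integrity.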

Take Home Message

- Ready access to genetic data on indigenous breeds is highly useful. This justifies the development of a database for proper information management at all levels (institutional, national, and international).
- GC-db is a user-friendly, web-based platform that presents information in a simple and organized format. The platform may also help to create a network of end-users who exchange information, and to facilitate interactions and collaborations around the world.
- To further develop the application and extend it to other species, we are asking scientists who have carried out molecular characterization of breeds to share their data. The data can be uploaded to the database and made available to interested users. Providers of data would also have free access to the data of others. Would you be interested in contributing your data and participating in any global meta-analyses? If you have genotyped only a few markers in your studies, it may be worthwhile to collect additional data or to share samples with other institutes for further characterization. We would be able to put you in contact with these institutes. We are open to any comments or suggestions for expanding the usage of the platform.