GMO lab report

Genetics Assessment 2

Summative assessment will be via a written report on a human gene/genetic condition. This report should be around 3-4 pages.
Formative feedback will be available on the information collected.

Information & Sequence retrieval exercise

To start collecting information, the first place to look is NCBI:
https://www.ncbi.nlm.nih.gov/
It is recommended that you use a tool like Padlet (www.padlet.com) to collate the information you find.

1. Go to the NCBI web site.
2. Type your gene name from the spreadsheet into the search box, and press Enter.
3. In the results, click on Gene results, then select the human gene.
4. From the gene page and the links to the reference sequences, find the following information:

Full gene name

Chromosomal location

There may be a short description, which may include the function of the protein, what tissue it is expressed in, and whether it causes any disease.

There may be a link to PubMed – take a look at the number and range of articles.

See if you can determine the number of exons/size of the gene.

Is there any evidence of alternative splicing?

Accession Number for the mRNA sequence (just 1 if multiple isoforms)
Accession number for the protein sequence (just 1 if multiple isoforms)
Size of mRNA transcript
Size of the protein (number of amino acids)
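
The search in steps 1-3 can also be written as a URL, which is handy for recording exactly what you searched for on your padlet. The sketch below is a minimal illustration only: it assumes NCBI's E-utilities ESearch endpoint and the [Gene Name] and [Organism] search field tags, the function name gene_search_url is hypothetical, and CFTR is just a placeholder for the gene from your spreadsheet.

```python
from urllib.parse import urlencode

# Assumed endpoint: NCBI E-utilities ESearch.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_search_url(symbol, organism="Homo sapiens"):
    """Build an ESearch URL that looks up a gene symbol in the NCBI Gene database."""
    # Restrict the search to the gene symbol and to one organism.
    term = f"{symbol}[Gene Name] AND {organism}[Organism]"
    params = {"db": "gene", "term": term, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # "CFTR" is a placeholder; substitute your own gene name.
    print(gene_search_url("CFTR"))
```

The function only builds the URL and does not contact NCBI; pasting the printed URL into a browser should return the matching Gene IDs.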

Go back to the search results and look for the OMIM results link – this will take you to pages about the gene, and about any associated diseases. Have a look at that information.

1. What can you determine about the disease, and how does that relate to the function of the protein and the tissues it is expressed in?
2. Is there an animal model for this disease? Find the gene page for the mouse homolog of your gene. You can also look at the Mouse Genome Informatics database at the Jackson Laboratory.

Bioinformatics Session
BLAST searches
1. Go to the NCBI web site.
2. Type your gene name from the spreadsheet into the search box, and press Enter.
3. In the results, click on Gene results, then select the human gene.
4. From the gene page, find the links to the reference sequences.
5. Go to the reference sequences, either by scrolling down or by clicking on the reference sequences link in the right-hand column.
6. Click on the first mRNA nucleotide sequence link (note: not the genomic sequence link). This will take you to the GenBank entry for your sequence.
7. In the right-hand column you will see a link “Run BLAST”. Clicking on this will take you to a BLASTN page.
8. Without changing any of the parameters, check the box for “Show results in new window” and click the BLAST button.
9. While that is running, go back to the BLAST window. In the Program Selection section, select “Somewhat similar sequences (blastn)” and repeat the BLAST. When both analyses have finished, take a look at the results and note the differences between the two.
10. It is likely that all of the results will be human/primates.
11. Repeat the BLAST searches. Under Choose Search Set, in the drop-down menu, select Reference RNA sequences. In the Organism box, type in “Primates” (the option with the taxid should come up), and select it. Then check the “exclude” box. Click the BLAST button. Look at the results, and assess the range of organisms and the level of similarity. Copy some of the results onto your padlet.

Are the sequences all from mammals? If so, this time on the BLAST page, in the Organism box, type in “mammals” (the option with the taxid should come up), and select. Then check the box to exclude.
What kinds of organisms are in the list of hits?
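
As a way of recording which options you chose on the BLAST form, the settings above can be written down as a set of search parameters. The sketch below is an assumption-laden illustration, not the method used in this session: the parameter names (CMD, PROGRAM, DATABASE, QUERY, MEGABLAST, ENTREZ_QUERY) follow NCBI's BLAST URL API as best understood here and should be checked against the current NCBI documentation, the function blast_put_params is hypothetical, and NM_000000.0 is a placeholder accession. The taxonomy IDs are 9443 for Primates and 40674 for Mammalia.

```python
PRIMATES_TAXID = 9443    # NCBI Taxonomy ID for Primates
MAMMALS_TAXID = 40674    # NCBI Taxonomy ID for Mammalia


def blast_put_params(accession, exclude_taxid=None, program="blastn",
                     database="refseq_rna", megablast=True):
    """Return a dict describing a BLAST submission (nothing is sent to NCBI)."""
    params = {
        "CMD": "Put",            # submit a new search
        "PROGRAM": program,      # blastn for nucleotide, blastp for protein
        "DATABASE": database,    # refseq_rna = Reference RNA sequences
        "QUERY": accession,
    }
    if program == "blastn" and megablast:
        # Megablast corresponds to "Highly similar sequences" on the web form.
        params["MEGABLAST"] = "on"
    if exclude_taxid is not None:
        # Equivalent of typing an organism and ticking the "exclude" box.
        params["ENTREZ_QUERY"] = f"all[filter] NOT txid{exclude_taxid}[Organism]"
    return params


if __name__ == "__main__":
    # Placeholder accession; use the mRNA accession you recorded earlier.
    print(blast_put_params("NM_000000.0", exclude_taxid=PRIMATES_TAXID))
```

Passing megablast=False corresponds to choosing “Somewhat similar sequences (blastn)”, and exclude_taxid=MAMMALS_TAXID corresponds to excluding mammals.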

BLASTP
1. Go back to the gene page, and back to the reference sequences.
2. Click on the first protein sequence link. This will take you to the entry for your protein sequence.
3. In the right-hand column you will see a link “Run BLAST”. Clicking on this will take you to a BLASTP page.
4. Without changing any of the parameters, check the box for “Show results in new window” and click the BLAST button.
5. When the analysis has finished, take a look at the results, and note the differences between these and the results of the BLASTN.
6. It is likely that all of the results will be human/primates.

Repeat the BLAST searches. Under Choose Search Set, in the Organism box, type in “Primates” (the option with the taxid should come up), and select it. Then check the “exclude” box. Click the BLAST button. Look at the results, and assess the range of organisms and the level of similarity. Copy some of the results onto your padlet.

Are the sequences all from mammals? If so, this time on the BLAST page, in the Organism box, type in “mammals” (the option with the taxid should come up), and select. Then check the box to exclude.

What kinds of organisms are in the list of hits? What level of similarity (%) between protein sequences do you see?
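
The similarity figure BLAST reports as percent identity is, in essence, the number of identical positions divided by the length of the alignment. The sketch below is a minimal illustration of that calculation for two already-aligned sequences (gaps written as "-"); the function name and the short example sequences are made up for illustration and are not real proteins.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    Columns where only one sequence has a gap count as mismatches;
    columns where both have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no alignment columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Made-up sequences: 4 identical positions out of 6 aligned columns.
    print(round(percent_identity("MKT-LV", "MKSALV"), 1))
```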

Multiple alignment

Select a number of sequences (minimum of 4; choose a diverse set of species) from the results of the mammals BLASTP analysis, click on Download, and select FASTA (complete sequences). The sequences will download as a text file.
Open the file and copy these sequences.
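
A FASTA file is plain text: each record starts with a ">" header line, followed by one or more lines of sequence. The sketch below is a minimal illustration of reading such text and reporting the length of each sequence, which is one way to check that the download contains the species you expected. The function name parse_fasta and the two short records are made up for illustration.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may be wrapped over many lines
    if header is not None:
        records[header] = "".join(chunks)
    return records


if __name__ == "__main__":
    # Made-up example records, not real proteins.
    sample = ">seq1 example species A\nMKT\nLV\n>seq2 example species B\nMKSALV\n"
    for name, seq in parse_fasta(sample).items():
        print(name, len(seq), "residues")
```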

The CLUSTALW tool allows you to carry out a multiple alignment.

Paste the protein sequences into the box, and click Submit.
Look at the results, and copy them into your padlet.

What does this tell you about the conservation of your protein?
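
In CLUSTAL-style output, a "*" under a column marks a position where every sequence has the same residue. The sketch below is a minimal illustration of how that mark could be derived from the aligned sequences, and how the fraction of fully conserved columns gives a rough single-number summary of conservation. The function names and the four short aligned sequences are made up for illustration.

```python
def conserved_columns(aligned_seqs):
    """Return a string with '*' under each fully conserved column, ' ' elsewhere."""
    if not aligned_seqs or len(aligned_seqs[0]) == 0:
        raise ValueError("need at least one non-empty aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    marks = []
    for column in zip(*aligned_seqs):
        residues = {c.upper() for c in column}
        # Conserved only if one residue type is present and it is not a gap.
        conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if conserved else " ")
    return "".join(marks)


def fraction_conserved(aligned_seqs):
    """Fraction of alignment columns that are fully conserved."""
    marks = conserved_columns(aligned_seqs)
    return marks.count("*") / len(marks)


if __name__ == "__main__":
    # Made-up aligned sequences; gaps are written as '-'.
    alignment = ["MKTLV", "MKSLV", "MKT-V", "MRTLV"]
    print(repr(conserved_columns(alignment)))
    print(fraction_conserved(alignment))
```

This counts only identical columns; the ":" and "." marks that CLUSTAL uses for similar residues are not reproduced here.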

Genomic sequence – Gene Identification

Go back to the gene page, this time selecting the genomic reference sequence. Click to download the sequence in FASTA format. Copy this sequence. Go to GENSCAN, and paste in the genomic sequence. Under Print options, select “Predicted CDS and peptides”.

Use the predicted peptide sequence in a BLASTP search (copy and paste the sequence into the BLASTP query box) to determine whether GENSCAN has correctly identified the gene.