This algorithm lets you convert DNA letters into numbers. Below is a link to a government Web site which contains lists of DNA sequences you might find interesting:
Instructions:
Users can input any DNA sequence of A's, T's, G's and C's. Some DNA
sequences contain N's, which must be removed. Users can search for DNA
sequences by keyword at the National Center for Biotechnology
Information site, a public database for molecular biology information. The repository contains nucleotide sequences, protein sequences, protein structures, complete genomes, and
information on taxonomy.
Step 1
Enter your keyword search in the "Entrez" browser. The Nucleotide
GenBank database should provide the most hits.
Step 2
Locate a gene or gene fragment.
Step 3
Copy and paste the sequence into the DNA sequence input
box on this musicalgorithms site. The algorithm reads only the letters
(except N's) and ignores any numbers. If you are entering a long DNA
sequence over a dial-up modem connection, it is advisable to use segments of only 200 letters at a time.
Otherwise, the program may run too slowly.
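The clean-up described in Step 3 can be sketched in Python. This is only an illustration, not the site's actual code: the function names clean_sequence and split_segments are hypothetical, and the sketch assumes that only A, T, G and C count as valid letters and that lowercase input is treated as uppercase.

```python
def clean_sequence(raw):
    """Keep only the letters A, T, G and C; drop N's, digits, spaces, etc."""
    # Assumption: lowercase input counts as the same letters in uppercase.
    return "".join(ch for ch in raw.upper() if ch in "ATGC")


def split_segments(sequence, size=200):
    """Cut a cleaned sequence into pieces of at most `size` letters."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


# Pasted text with position numbers, spaces, and one N.
raw = "1 atgcnatgga 11 ttcag"
cleaned = clean_sequence(raw)
print(cleaned)                     # ATGCATGGATTCAG
print(split_segments(cleaned, 5))  # ['ATGCA', 'TGGAT', 'TCAG']
```

With the default size of 200, split_segments yields the 200-letter segments suggested for slow connections.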
Description: DNA (Genetics)
Deoxyribonucleic acid (DNA) is the fundamental structure of
which genes are composed. A strand of DNA consists of four basic
molecules called nucleotides: A, T, G, C. When combined as a sequence, the
nucleotides can represent a set of genetic instructions for the development
of all cellular forms of life. The molecules are linked as pairs and
entwined in a double helix, forming chains of DNA strands. Triplets of
nucleotides are called codons. Codons, along with enzymes and other
molecules, specify amino acids - the basic building blocks of proteins. The "expression" of these proteins is encoded ("written") in genes; in other words, a gene is a DNA sequence (a string of nucleotides) that generates polypeptides and proteins.
The genes provide a genetic code for proteins that an organism can
"express." Genes are responsible for defining a
species and producing individual traits. The structure of DNA was
discovered in 1953 by Watson and Crick. A working draft of the human genome
(the entire DNA sequence) was announced in 2000, in a project costing roughly
3 billion dollars. Homo sapiens have 23 chromosome pairs. One set of 23
chromosomes contains about 3 x 10 to the power 9 pairs of DNA molecules
(base pairs), so the 46 chromosomes in each human cell hold twice that
number. Only a small portion of this DNA is used to code
for proteins. Early estimates put the number of human genes at approximately
30,000 to 35,000.
DNA sequences vary enormously because of the number of possible combinations.
The number of possible arrangements of the four letters in three spaces, with
repeated letters allowed (e.g., A,A,C...T,C,G), is expressed
mathematically as 4 to the power 3, or 4x4x4, which is 64. How many
possible arrangements of the four letters, with repeated letters allowed,
are there with 4 spaces?
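The counting rule above can be checked with a short Python sketch. It is an illustration only: the names BASES, count_sequences and all_sequences are hypothetical, and the sketch simply lists every arrangement and compares the list length with the power formula.

```python
from itertools import product

BASES = "ATGC"


def count_sequences(length):
    """Number of arrangements: 4 choices for each of `length` spaces."""
    return len(BASES) ** length


def all_sequences(length):
    """List every arrangement of the four letters, repeats allowed."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


codons = all_sequences(3)
print(len(codons), count_sequences(3))  # 64 64
print(codons[:5])                       # ['AAA', 'AAT', 'AAG', 'AAC', 'ATA']

# The question above, for four spaces:
print(count_sequences(4))
```

Listing the arrangements is practical only for short lengths, since the count grows by a factor of four with every added space.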
Have you ever wondered how a missing letter can affect a person's genes? Try playing a melody from codons, and then remove one letter from the input DNA sequence. The shortened sequence will produce a new set of codons, and with it a new melody.
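The effect of a missing letter can be seen in a small Python sketch. It is an illustration under stated assumptions, not the site's own method: the helper names to_codons and delete_letter are hypothetical, codons are read from the first letter onward, and leftover letters at the end are dropped. The example sequence is made up.

```python
def to_codons(sequence):
    """Split a sequence into consecutive three-letter codons.

    Assumption: reading starts at the first letter, and one or two
    leftover letters at the end are dropped.
    """
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def delete_letter(sequence, position):
    """Return the sequence with the letter at `position` (0-based) removed."""
    return sequence[:position] + sequence[position + 1:]


original = "ATGGCCTTAGAC"             # made-up example sequence
mutated = delete_letter(original, 3)  # remove the fourth letter (G)

print(to_codons(original))  # ['ATG', 'GCC', 'TTA', 'GAC']
print(to_codons(mutated))   # ['ATG', 'CCT', 'TAG']
```

Every codon after the deleted letter changes, because all later letters shift one place to the left.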