Margaret Belle (Oakley) Dayhoff (March 11, 1925 – February 5, 1983) was an American biophysicist and a pioneer in the field of bioinformatics. Dayhoff was a professor at Georgetown University Medical Center and a noted research biochemist at the National Biomedical Research Foundation, where she pioneered the application of mathematics and computational methods to the field of biochemistry. She dedicated her career to applying the evolving computational technologies to support advances in biology and medicine, most notably the creation of protein and nucleic acid databases and tools to interrogate the databases. She originated one of the first substitution matrices, point accepted mutations (PAM). The one-letter code used for amino acids was developed by her, reflecting an attempt to reduce the size of the data files used to describe amino acid sequences in an era of punch-card computing.

Her PhD degree was from Columbia University in the department of chemistry, where she devised computational methods to calculate molecular resonance energies of several organic compounds. She did postdoctoral studies at the Rockefeller Institute (now Rockefeller University) and the University of Maryland, and joined the newly established National Biomedical Research Foundation in 1959. She was the first woman to hold office in the Biophysical Society and the first person to serve as both secretary and eventually president.

Early life

[Image: Washington Square Park, near where Dayhoff's undergraduate work was conducted]

Dayhoff was born an only child in Philadelphia, but moved to New York City when she was ten. Her academic promise was evident from the outset – she was valedictorian (class of 1942) at Bayside High School, Bayside, New York, and from there received a scholarship to Washington Square College of New York University, where she graduated magna cum laude in mathematics in 1945 and was elected to Phi Beta Kappa.

Research

Dayhoff began a PhD in quantum chemistry under George Kimball in the Columbia University Department of Chemistry. In her graduate thesis, Dayhoff pioneered the application of computer capabilities – i.e. mass-data processing – to theoretical chemistry; specifically, she devised a method of using punched-card business machines to calculate the resonance energies of several polycyclic organic molecules. Her management of her research data was so impressive that she was awarded a Watson Computing Laboratory Fellowship. As part of this award, she received access to "cutting-edge IBM electronic data processing equipment" at the lab.

[Image: An example of a pre-computer punch card system]

After completing her PhD, Dayhoff studied electrochemistry under Duncan A. MacInnes at the Rockefeller Institute from 1948 to 1951. In 1952, she moved to Maryland with her family and later received research fellowships from the University of Maryland (1957–1959), working on a model of chemical bonding with Ellis Lippincott. At Maryland, she gained her first exposure to a new high-speed computer, the IBM model 7094. When the fellowships ended, she joined the National Biomedical Research Foundation in 1960 as associate director (a position she held for 21 years). There, combining her expertise with that of her colleague Robert Ledley, she published a paper in 1962 entitled "COMPROTEIN: A computer program to aid primary protein structure determination" that described a "completed computer program for the IBM 7090" that aimed to convert peptide digests to protein chain data. They actually began this work in 1958, but were not able to start programming until late 1960.

To produce a Dayhoff matrix, pairs of aligned amino acids in verified alignments are used to build a count matrix, which is then used to estimate a mutation matrix at 1 PAM (considered an evolutionary unit). From this mutation matrix, a Dayhoff scoring matrix may be constructed. Along with a model of indel events, alignments generated by these methods can be used in an iterative process to construct new count matrices until convergence.
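The count-matrix, mutation-matrix and scoring-matrix steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Dayhoff's original procedure: the three-letter alphabet, the toy alignments, the function names, and the choice of 10 × log10 odds scores are all assumptions made for the example, and the indel model and iterative realignment are left out.

```python
import math


def count_matrix(aligned_pairs, alphabet):
    """Count aligned residue pairs symmetrically; gap columns are skipped."""
    idx = {aa: i for i, aa in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[0.0] * n for _ in range(n)]
    for seq_a, seq_b in aligned_pairs:
        for a, b in zip(seq_a, seq_b):
            if a in idx and b in idx:
                # Direction of change is unknown, so count both ways.
                counts[idx[a]][idx[b]] += 1
                counts[idx[b]][idx[a]] += 1
    return counts


def mutation_matrix_1pam(counts):
    """Scale observed substitution rates so 1% of residues change overall."""
    n = len(counts)
    row_totals = [sum(row) for row in counts]
    if min(row_totals) == 0:
        raise ValueError("every residue must occur in the alignments")
    total = sum(row_totals)
    freqs = [r / total for r in row_totals]
    raw = [[counts[i][j] / row_totals[i] for j in range(n)] for i in range(n)]
    observed_change = sum(freqs[i] * (1 - raw[i][i]) for i in range(n))
    if observed_change == 0:
        raise ValueError("alignments contain no substitutions")
    scale = 0.01 / observed_change  # 1 PAM = 1 accepted change per 100 residues
    matrix = [
        [
            1 - scale * (1 - raw[i][i]) if i == j else scale * raw[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]
    return matrix, freqs


def mat_mul(x, y):
    """Plain matrix product for square lists of lists."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def pam_n(m1, n):
    """Extrapolate to n PAM by multiplying the 1 PAM matrix by itself."""
    size = len(m1)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m1)
    return result


def log_odds_scores(matrix, freqs):
    """Score = 10 * log10(P(i -> j) / background frequency of j)."""
    n = len(matrix)
    return [
        [
            10 * math.log10(matrix[i][j] / freqs[j])
            if matrix[i][j] > 0
            else float("-inf")
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy data: a hypothetical 3-residue alphabet and made-up alignments.
ALPHABET = "AGS"
PAIRS = [
    ("AGSAGSAGSA", "AGSAGSAGSS"),
    ("GGSSAAGGSS", "GASSAAGGSA"),
    ("SAGSAGSAGS", "SAGGAGSAGS"),
]

counts = count_matrix(PAIRS, ALPHABET)
m1, freqs = mutation_matrix_1pam(counts)
scores_1pam = log_odds_scores(m1, freqs)
scores_250pam = log_odds_scores(pam_n(m1, 250), freqs)
```

In this sketch each row of `m1` sums to 1, and identical residues score positively at 1 PAM while substitutions score negatively. Raising the matrix to a higher power with `pam_n` models more distant relationships.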

One of Dayhoff's most important contributions to bioinformatics was her Atlas of Protein Sequence and Structure, a book reporting all known protein sequences (totaling 65) that she published in 1965. This book published a degenerate encoding of amino acids. It was subsequently republished in several editions. This led to the Protein Information Resource database of protein sequences, the first online database system that could be accessed by telephone line and was available for interrogation by remote computers. The book has since been cited nearly 4,500 times.

The one-letter code was adopted by IUPAC and remains in general use. Dayhoff's ambiguous one-letter code has been superseded.
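As an illustration of the space saving, the following minimal sketch converts a peptide written in three-letter abbreviations into the standard one-letter code for the 20 common amino acids. The function name and the hyphen-separated input format are assumptions made for the example.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide):
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    letters = []
    for residue in peptide.split("-"):
        key = residue.strip().capitalize()
        if key not in THREE_TO_ONE:
            raise ValueError(f"unknown residue: {residue!r}")
        letters.append(THREE_TO_ONE[key])
    return "".join(letters)


three_letter = "Met-Ala-Gly-Trp"
one_letter = to_one_letter(three_letter)  # "MAGW"
```

Even without the hyphens, the three-letter form needs three characters per residue, so the one-letter form cuts the stored sequence to a third of the size.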

Marriage and family

Dayhoff's husband was Edward S. Dayhoff, an experimental physicist who worked with magnetic resonance and with lasers. They had two daughters who are also academics, Ruth and Judith.

Judith Dayhoff has a PhD in mathematical biophysics from the University of Pennsylvania and is the author of Neural network architectures: An introduction and coauthor of Neural Networks and Pattern Recognition.

Ruth Dayhoff graduated summa cum laude in Mathematics from the University of Maryland and focused on Medical Informatics while doing her MD at Georgetown University School of Medicine.

Despite the success of Margaret Dayhoff's Atlas, experimental scientists and researchers considered their sequence information very valuable and were often reluctant to submit it to such a publicly available database.

During the last few years of her life, she focused on obtaining stable, adequate, long-term funding to support the maintenance and further development of her Protein Information Resource. She envisioned an online system of computer programs and databases, accessible by scientists all over the world, for identifying proteins from sequence or amino acid composition data, for making predictions based on sequences, and for browsing the known information. Less than a week before she died, she submitted a proposal to the Division of Research Resources at NIH for a Protein Identification Resource. After her death, her colleagues worked to make her vision a reality, and the protein database was fully operational by the middle of 1984.

The Margaret Oakley Dayhoff Award, named in her honor, is presented at the annual meeting of the Biophysical Society and includes an honorarium of $2,000.

She was survived by her husband, Edward S. Dayhoff of Silver Spring; two daughters, Ruth E. Dayhoff Brannigan of College Park and Judith E. Dayhoff of Silver Spring; and her father, Kenneth W. Oakley of Silver Spring.

Her seminal contributions as the mother of bioinformatics, a science now routinely used as part of the process for naming bacteria, were acknowledged in 2020 when a bacterium, Enemella dayhoffiae, was named after her.

Dayhoff was inducted into the Maryland Women's Hall of Fame in March 2026.
