# 5. Origins of Life

## 5.1. The Crucial Issues

### 5.1.1. Why is This Question Important?

Our current sample of life in the Universe is limited to the surface (or near-surface) of our home planet. Therefore, one of our first questions is how life first took root here, and whether things that we would consider alive exist elsewhere. The fossil record on Earth appears to extend back about 3.5 Gyr, and isotopic evidence suggests the presence of life several hundred million years before that. However, no hard evidence exists concerning the mechanism by which life first began on Earth.

Every human culture has developed an answer to this question and considered its importance in defining our place in the cosmos. In the absence of firm evidence, the door has been left open to a variety of answers from science, mythology, and religion. Each of these modes approaches our place in the Universe in a different way and with a different purpose. These points of view can be broadly separated into three groups: Biblical-Creationist, Improbable Event, and Cosmic Evolution.

#### 5.1.1.1. Group I: Biblical-Creationist

In the Biblical-Creationist view, life began directly through the actions of a Creator using supernatural means. This view is accepted by a large fraction of our population, although it is not testable and hence is non-scientific. A modified version of this theory is Intelligent Design, which states that life on Earth arose through the actions of an intelligent being and does not identify the creator or specify the means that were used to create life. This addresses the question of what delivered the initial spark, although there is disagreement concerning how much involvement the intelligent being has after the seeds of life are planted.

If we presume that creation is generated supernaturally, then we leave the realm in which science can answer questions and our tools are no longer effective for separating fact from fiction. Francis Crick has suggested that life began on Earth when intelligent beings from a far-off planet delivered it (intentionally or not) by releasing bacteria from a spaceship (i.e., directed panspermia; Crick & Orgel (1973)). This explanation is not very satisfying because it merely shifts the problem from the beginnings of life on Earth to the beginnings of life in the Universe, which is a much less tractable issue to investigate.

#### 5.1.1.2. Group II: Improbable Event

Another point of view is that life began through natural causes, but the events that led to this process were extremely improbable. From this premise, it follows that life anywhere (beyond this planet) may be rare and the remainder of the Universe may be barren. Consider the following:

One has only to contemplate the magnitude of this task to concede that spontaneous generation is impossible. Yet we are here - as a result, I believe, of spontaneous generation.

Time is in fact the hero of the plot. The time with which we have to deal is of the order of two billion years. Given so much time, the ‘impossible’ becomes possible, the possible probable, and the probable virtually certain. One has only to wait: time itself performs the miracles.

—George Wald (1954)

In our current thinking, the amount of time available is only $${\sim}500$$ Myr rather than 2 Gyr, though the maximum amount of space that can be used for random trials is presumably the same (i.e., the surface of the Earth and the areas that exchange material with it). Both could be expanded greatly if the possibility of panspermia were taken into account. Panspermia presumes that life can be spread from world to world by spores (or within meteors) through space. Much of the Universe and much of the time since creation would then be available for trials on the origin of life.

An improbable event requires us to understand a bit about probability itself (i.e., What are the chances?). Given the age of our Universe and that it’s filled with billions upon billions of stars, even very improbable events get an enormous number of chances to occur. For example, how long would it take to generate the word “at” on my computer keyboard by random strokes? Assume that I limit my strokes to the fifty keys that produce characters, and I could carry out one trial per second. My chances would be 1 in 2500 per trial ($$50^2 = 50 \times 50$$), which means that I could expect to produce this word within about 42 minutes (2500 s) on average.

Suppose we raise the difficulty level: *What are the chances that I could type this line at random?* The italicized sentence has 48 characters (ignoring the spaces). The odds of getting the order completely correct would be 1 in $$50^{48}$$, about 1 in $$4\times 10^{81}$$! If we consider 6 billion humans performing one trial per second on an Earth-like planet orbiting every star within every galaxy of the Universe, ever since the Big Bang, then the odds would still be less than 1 in $$10^{30}$$.
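The arithmetic behind the typing example can be checked with a few lines of Python. This is only a sketch: the fifty keys, the one trial per second, and the 48 characters come from the text, while the star count (about $$10^{22}$$) and the age of the Universe (13.8 Gyr) are assumed round numbers chosen for illustration.

```python
import math

KEYS = 50  # character-producing keys, as in the text


def expected_seconds(length, keys=KEYS, trials_per_second=1.0):
    """Average wait (in seconds) before one specific string of `length`
    characters turns up, if every trial is an independent random string."""
    return keys ** length / trials_per_second


# The word "at": 50^2 = 2500 equally likely two-character strings.
minutes_for_at = expected_seconds(2) / 60
print(f"'at': about {minutes_for_at:.1f} minutes on average")

# The 48-character sentence: work in log10, since 50^48 is unwieldy.
log10_outcomes = 48 * math.log10(KEYS)
print(f"possible 48-character strings: about 10^{log10_outcomes:.1f}")

# Assumed round numbers (NOT from the text) for the cosmic typing pool.
TYPISTS_PER_PLANET = 6e9                  # one trial per second each
STARS_IN_UNIVERSE = 1e22                  # assumed order of magnitude
SECONDS_SINCE_BIG_BANG = 13.8e9 * 3.156e7  # assumed age, in seconds

log10_trials = math.log10(
    TYPISTS_PER_PLANET * STARS_IN_UNIVERSE * SECONDS_SINCE_BIG_BANG
)

# Expected number of successes is trials / outcomes, i.e. 1 in 10^log10_odds.
log10_odds = log10_outcomes - log10_trials
print(f"total trials: about 10^{log10_trials:.1f}")
print(f"odds of even one success: about 1 in 10^{log10_odds:.1f}")
```

With these assumed inputs the shortfall stays above thirty orders of magnitude, in line with the bound quoted above; a different star count shifts the exponent by only a few units.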

The incredibly slim chances for success would be masked if one broke up that sentence into small fragments that were considered individually. The chances for getting the first two characters right in a finite time would seem reasonable. If we assume (wrongly) that the whole is the sum (rather than the product) of its parts, the entire sequence seems reasonable. This type of reasoning underlies much of the approach used for what is called prebiotic synthesis. A number of steps are demonstrated separately in a laboratory, each under conditions that may have prevailed in one location or another on the early Earth. The products of the reaction are then termed prebiotic, a distinction that allows the experimenter to use any of the reactions in a subsequent prebiotic transformation. Little consideration is given to the adverse probabilities generated by the numerous processes that are implied by the suggested sequence of steps.

This inevitably leads to odds that make the overall process seem near miraculous. With some good fortune, such an event (i.e., combination of the correct compounds for life) might have occurred within the history of the Universe, and given rise to us. A consequence of this theory would be that separate origins of life would be very rare in the Universe. Panspermia would still be a possibility, but any life that we encountered elsewhere would then be related to our own. Astrobiology would represent a simple extension of Earth biology.

#### 5.1.1.3. Group III: Cosmic Evolution

In this viewpoint, life began as an inevitable consequence of the laws of nature. There exists a principle of self-organization that has guided the development of the Universe from the Big Bang to the present. Life is common in the Universe. It will arise whenever a prerequisite set of materials interacts with a suitable energy source. Life’s origin is not linked to the improbable formation of a particular functioning large biomolecule, and it requires a much simpler set of ingredients:

• a suitable source of free energy,

• a set of chemicals that can utilize energy to increase its own ability to survive, and

• an environment that is sufficiently stable so that the evolving system is not disrupted by extreme changes in conditions.

This theory encourages the expectation that life is likely to arise on other worlds that have enjoyed Earth-like conditions during some part of their history. No recipe for a specific set of ingredients for life is offered. The possibility remains open that alternative lifeforms may arise in environments that are very different from Earth’s but still meet the necessary requirements for life. This stance towards the origin of life is the most optimistic for the field of astrobiology, but it remains to be demonstrated that it is correct.

### 5.1.2. Historical Background

#### 5.1.2.1. Spontaneous Generation

For millennia, the consensus favored spontaneous generation, in which life arises quickly by chance from non-living matter, independent of the action of parents. Aristotle and his followers claimed that fireflies emerged from morning dew, based on many observations in which the insects suddenly appeared. Many kinds of small animals seemed to arise from the mud at the bottom of ponds and streams. The 17th century philosopher René Descartes suggested that the process was driven by heat agitating the subtle and dense particles of putrefying matter.

Through careful experimental tests of these speculative ideas, scientists found negative results for a whole array of lower lifeforms. By the mid-19th century, the spontaneous generation question had contracted to the realm of microorganisms.

The cell theory was modified in the 1860s so that protoplasm came to be considered the structural unit of life. Moreover, this protoplasm was viewed as an essentially simple substance, theoretically capable of being produced directly from inorganic matter. That the simplest organisms were generally regarded merely as naked lumps of protoplasm added credence to the belief that they, too, could be produced spontaneously.

—John Farley (1977)

The paradigm collapsed in the late 19th century as a result of the experiments of Louis Pasteur. Pasteur proclaimed in an 1864 lecture at the Sorbonne, “never will the doctrine of spontaneous generation recover from the mortal blow of this simple experiment.” Pasteur heated and thus sterilized a broth of sugared yeast water, showing that the solution remained devoid of life and did not accumulate microbes until direct contact with the outside air was re-established.

Experiments similar to Pasteur’s demonstrated that all living things arise today from existing life. The results were comforting to those with the Biblical-Creationist view, where a Creator was necessary to bring life to the world for the first time. This included Pasteur himself, quoted here by Farley:

What a victory for materialism if it could be affirmed that it rests on the established fact that matter organizes itself, takes on life itself, matter which has in it already all known forces! … Of what good would it be then to have recourse to the idea of a primordial creation, before which mystery it is necessary to bow? What good then would be the idea of a Creator God?

—John Farley (1977)

A scientific (rather than theological) view needed to push the origin back to an earlier era when conditions on the Earth were very different, and when timespans greater than a few days could be invoked. It was recognized that the generation of a fully equipped modern functioning cell might not be necessary to get life going. Some simpler system that utilized only a portion of a cell might suffice.

#### 5.1.2.2. The Oparin-Haldane Hypothesis

This theory, proposed separately by Alexander Oparin and J.B.S. Haldane, stated that life arose slowly on the early Earth, through a series of steps, in a “prebiotic soup” of chemicals that covered the planet.

1. The early Earth had an atmosphere that was oxygen-free but contained ammonia, hydrogen gas, water, and methane (i.e., reduced carbon bound to hydrogen).

2. This atmosphere was exposed to a variety of energy sources: solar radiation, meteorite and comet infall, volcanic eruptions, lightning, and others, which led to the formation of organic compounds.

3. These substances “must have accumulated until the primitive oceans reached the consistency of hot dilute soup” as stated by Haldane. Harold Urey suggested that the soup components, “would remain for long periods of time in the primitive oceans. … This would provide a very favorable situation for the origin of life.”

4. By unspecified transformations, life developed in this soup.

Some elements of the hypothesis were demonstrated in a famous 1953 experiment by Urey and his student Stanley Miller. They set up a flask of boiling water to represent the oceans of the early Earth. The water vapor passed through a compartment that held two electrodes and supported a spark discharge. The spark served as an energy source and represented lightning. After the vapors passed through the spark, they were condensed to water droplets that returned to the original flask. The entire system was enclosed and filled with a mixture of methane, ammonia, and hydrogen gases to mimic the primitive atmosphere (as it was understood at the time). The experiment was run for a week, during which all of the methane was consumed. What was produced in its place?

The results apparently depended upon the arrangement of the compartments and the nature of the spark. The most exciting result was a mixture containing amino acids, the building blocks of proteins. In a modified design, a hydrocarbon layer was formed that was devoid of amino acids. In a later experiment, an intermittent spark was used, but few organic compounds were produced. Further studies were carried out under the conditions that afforded amino acids.

The prominent product was a tar-like, insoluble substance that coated the walls of the vessel. Fifteen percent of the methane had been converted to a mixture of simple organic compounds. Fourteen of these compounds were present in yields over $$0.1\%$$. All of the compounds were organic acids (i.e., carboxylic acids), which may have been preserved from destruction by their formation of non-volatile ammonium salts. The salts would have remained in the liquid phase and evaded continuous recycling through the spark. Two amino acids produced are used in life’s proteins, while the remainder are not.

The experiment did confirm one tenet of the hypothesis: that an atmosphere of methane, hydrogen, and ammonia, when exposed to certain types of energy, would afford a mixture of organic compounds. It revealed nothing about the processes that produce a living organism from such a mixture. As our understanding of the complexity of life has increased, that problem has only come to look more difficult in the 50+ years since.

#### 5.1.2.3. Even the Simplest Forms of Life are Complex

In the mid-19th century, a common scientific idea was that the building material of life was a pulpy, gel-like, structureless material called protoplasm. In 1857, Thomas Huxley dredged a “transparent, gelatinous matter” from the sea bottom and concluded it was protoplasm. If the construction of life was relatively simple, then it would not be surprising if life could be prepared from simple broths of nutrients. After 150 years of careful study, a very different picture has emerged. A bacterium is far more complex than the most intricate machine constructed by humans.

### 5.1.3. The Architecture of Life

Fig. 5.1 The complexity of life on a logarithmic scale, where lipids and proteins are $${\sim}10-100\times$$ larger than an atom. Most living things are larger than $$1\ {\rm \mu m}$$. Image Credit: Openstax:Biology.

#### 5.1.3.1. Eukaryotic Cells have Complex Internal Structure

The complexity of life is shown over many orders of magnitude in Fig. 5.1. A number of distinct features exist at the human level (e.g., limbs, eyes, and nose). Magnification to the organ or tissue level $$(100\ {\rm \mu m}-1\ {\rm mm})$$ produces some simplification, but the cellular level $$(10-100\ {\rm \mu m})$$ has a wealth of complex structures (e.g., organelles). The mitochondrion is an organelle with intricate detail in its construction (see Fig. 5.2). At the molecular level, individual proteins emerge and chemical analysis shows that the protein is a chain molecule whose links are amino acids. A set of twenty related amino acids is used in Earth biology to construct all proteins. To get a protein with a particular shape and function, it is necessary to connect (encode) the amino acids in a specific order.

Fig. 5.2 An electron micrograph of a mitochondrion. Image Credit: Openstax:Biology.

#### 5.1.3.2. Bacterial (Prokaryotic) Cells are Simpler

For most of Earth’s history, single-celled organisms were the only form of life. In seeking an origin for life, the appearance of these forms must be explained. In a superficial examination, a bacterial cell reveals fewer details than a eukaryotic cell, but a deeper inspection (at higher magnification) reveals many complexities. Bacterial cells are built from large molecules (e.g., proteins and nucleic acids), many of which are also used in the construction of a eukaryotic cell.

Fig. 5.3 The features of a typical prokaryotic cell. Flagella, capsules, and pili are not found in all prokaryotes. Image Credit: Openstax:Biology.

Individual units called nucleotides are strung together in a linear array to construct long chains, where each nucleotide consists of a sugar (deoxyribose in DNA and ribose in RNA), a phosphate group, and a base. Only four different nucleotide units are combined to construct a nucleic acid. But what nucleic acids lack in variety, they make up for in size. In the case of DNA (deoxyribonucleic acid), several million nucleotides are linked in a huge circle to form the bacterial chromosome. RNA (ribonucleic acid) does not attain such lengths, but appears in a greater variety of cellular contexts than DNA, and performs many more functions. Other macromolecules that serve as recognition elements in cellular contacts (signaling), energy storage, and structural materials for the membrane of the cell, which include the

#### 5.1.3.3. Organized at the Molecular Level: Proteins, Nucleic Acids, and Lipids

Historically, biologists used a “top-down” approach to understand life, where anatomical studies of larger (macro-sized) creatures first drew their interest. Chemists worked in a “bottom-up” fashion and studied the molecular components of life before attempting to explore the larger structures. Chemically, a bacterial cell is primarily composed of water (70% by weight), which serves as a solvent. Hundreds of small organic molecules are present, but they make up only a few percent of the cell (by weight).

Note

A percentage by weight refers to the mass ratio of a substance, where the mass of the quantity in question is divided by the total mass. If a bacterial cell is 70% water (by weight), then there are 7 g of water for every 10 g of the total cell mass. The typical total mass of a bacterium is $${\sim}10^{-12}\ {\rm g}=1\ {\rm pg}$$.

Inorganic ions of many kinds provide only about 1% of the cellular mass, and the remainder of a bacterial cell is composed of biopolymers or macromolecules, chiefly proteins (15%) and nucleic acids ($${\sim}7\%$$). Proteins and nucleic acids comprise the bulk of the organic components, and play a number of roles in vital cellular processes. Over 99% of the atoms in a bacterium are $$\rm H$$, $$\rm O$$, $$\rm C$$, and $$\rm N$$.
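To put these percentages into absolute terms, the short Python sketch below converts the mass fractions quoted in the text into masses for a single bacterium of about 1 pg. The catch-all "other" entry is an assumption made for this illustration: it simply collects whatever is left after the four quoted fractions (small organic molecules plus any macromolecules not itemized above).

```python
CELL_MASS_G = 1e-12  # typical bacterial mass of ~1 pg, as quoted in the text

# Mass fractions (by weight) quoted in the text.
fractions = {
    "water": 0.70,
    "proteins": 0.15,
    "nucleic acids": 0.07,
    "inorganic ions": 0.01,
}

# Assumption for this sketch: everything not itemized is lumped together.
fractions["other (small organics, remaining macromolecules)"] = 1.0 - sum(
    fractions.values()
)

# Mass of each component is its fraction times the total cell mass.
masses_g = {name: frac * CELL_MASS_G for name, frac in fractions.items()}

for name, mass in masses_g.items():
    # 1 g = 1e15 fg, so femtograms give convenient numbers at this scale.
    print(f"{name:50s} {mass * 1e15:7.1f} fg")
```

Femtograms are used for the printout only because they give readable numbers at this scale.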

Note

In chemistry, an organic compound is a molecule containing carbon. Organic matter refers to material from a once-living organism that is capable of decay (or is the product of decay), or to material composed of organic compounds.

Proteins control most cellular functions. Our muscles move through the contraction and relaxation of proteins, and the rotation of flagella (a protein structure) allows bacteria to swim. Proteins can perform many more functions, such as

• forming gateways in the cellular membranes that restrict which substances can enter or leave the cell.

• transporting electrons via brigades within cellular organelles.

• functioning as bioweapons and poisons (e.g., defensive antibodies).

• forming a scaffolding for cells (e.g., the cartilage that forms our noses).

Most importantly, they can act as enzymes, which speed up chemical reactions needed for life relative to benign or destructive reactions. Proteins are composed of amino acids, and those used in living organisms are of the (arbitrarily named) left-handed form. The reason for this selection is unclear: it could arise from physical processes with an environmental bias, or it could be simply arbitrary, the result of some accident. When looking for life elsewhere, we must acknowledge the limits of our own perspective. This bias could be a boon, in that a strong signal for panspermia could be identified. On the other hand, it could be our bane, in that we may not initially be equipped to detect such life.

The linear sequence of amino acids in a protein governs the way that it folds in 3D space, a problem where artificial intelligence (AI) has come to the fore in increasing our understanding. Proteins produced by billions of years of evolution fold into distinct shapes, with charges at particular locations, and reactive groups in proximity to each other. This folding will determine the biological function of a protein by:

• whether it will bind certain organic molecules and catalyze their reactions, or

• form a regular structure (e.g., helix) and act as a binding material.

Sometimes proteins are created in the cell in a particular way and are later modified to meet a particular function. In this case a protein may be altered by other proteins or may recruit an additional factor (e.g., organic molecule or inorganic ion).

Nucleic acids store hereditary information and reproduce themselves (with the help of proteins). An organelle called the ribosome can synthesize protein from amino acids, where the bacterial ribosome is made of 50+ proteins and 3 RNA molecules. The RNA catalyzes the central step of the protein synthesis process, the linking of each new amino acid to the growing chain called a polypeptide chain. Individual amino acids are transported to the ribosome for manufacturing purposes by transfer RNA.

A bacterial cell may contain $${\sim}\text{20,000}$$ ribosomes able to produce several thousand proteins, where each ribosome can produce any of the proteins. A ribosome determines which protein to manufacture through an information-bearing “tape” that threads through the ribosome. This tape is made of messenger RNA, which carries information in the sequence of bases (a subunit of a nucleotide) that project from the backbone of the RNA. The bases of RNA are

• adenine (A),

• cytosine (C),

• guanine (G), and

• uracil (U).

The recognition scheme of these bases is the so-called genetic code, and the overall process is called translation.

Messenger RNA is not the original source of the information, which is held by DNA (i.e., the holder of the master blueprint for the cell). A portion of the information in DNA is read into a messenger in a process called transcription. DNA exists as the famous double helix, where two DNA chains (polymers) wrap around one another and are linked through the well-known Watson-Crick base pairs. These pairs are formed from bases similar to those of RNA, except that DNA contains thymine (T) instead of uracil, so that the code is written with combinations of A, T, C, and G.

Each chain contains all the information present in the double helix because the individual chains are complementary. One chain can thus be used as a template for the construction of the other from appropriate nucleotide subunits. This process requires the assistance of a specialized protein enzyme called a polymerase, and other helpful proteins. When a bacterial cell divides to give two daughters, the two DNA strands of its chromosome come apart, and each is copied to provide a complement. Through replication, a complete chromosome for each daughter cell is provided.
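The template idea can be made concrete with a minimal Python sketch. It uses the standard Watson-Crick pairings (A with T, C with G, and A with U when RNA is made) and an invented eight-base sequence. As a simplifying assumption, it ignores strand direction, the polymerase, and every other piece of cellular machinery; the function names are illustrative only.

```python
# Watson-Crick partners when copying DNA into DNA (replication).
DNA_PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Partners when reading DNA into RNA (transcription): uracil replaces thymine.
RNA_PARTNER = {"A": "U", "T": "A", "C": "G", "G": "C"}


def complement_strand(dna):
    """Build the complementary DNA chain, base by base, from a template."""
    return "".join(DNA_PARTNER[base] for base in dna)


def transcribe(dna_template):
    """Build the RNA chain that pairs with a DNA template."""
    return "".join(RNA_PARTNER[base] for base in dna_template)


template = "ATGCCGTA"  # hypothetical sequence, for illustration only
partner = complement_strand(template)
print(partner)                     # TACGGCAT

# Copying the copy regenerates the original: each chain holds all the
# information needed to rebuild the double helix.
print(complement_strand(partner))  # ATGCCGTA

print(transcribe(template))        # UACGGCAU
```

Because the pairing table is its own inverse, applying `complement_strand` twice returns the starting sequence, which is the sense in which either strand alone carries the full information.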

Lipids form a boundary, separating the contents of the cell from the external world. This function prevents the contents of a cell from being dispersed into the broader environment. This boundary must not be impermeable, however: selective admission of nutrients is necessary for metabolic processes, waste needs to be removed, and undesired substances must be ejected. The process for basic membrane construction is far less rigorous and information-rich than those for proteins and nucleic acids.

Fig. 5.4 Cross section of the different structures that phospholipids can take in an aqueous solution. The circles are the hydrophilic heads and the wavy lines are the fatty acyl side chains. Image Credit: Wikipedia:lipid bilayer.

Amphiphiles contain both hydrophilic (water-loving) and hydrophobic (water-shunning) parts. When amphiphiles are placed in water, they arrange themselves so that the hydrophilic parts are exposed to the water and the hydrophobic parts cluster inside. One arrangement is a micelle, which is a sphere with the hydrophilic groups on the surface and the hydrophobic ones within. Another is a bilayer, with two aligned layers of amphiphiles arranged in a tail-to-tail structure that resembles a membrane. A bilayer can also curve around to produce a closed structure that resembles the membrane of a cell. Biological cellular membranes are modified by embedded proteins, which control the access of small molecules into the cell.

### 5.1.4. Central Ideas About the Origin of Life

#### 5.1.4.1. Thermodynamics and Probability

The second law of thermodynamics specifies that spontaneous processes in an isolated system are characterized by the conversion of order to disorder. The entropy of the system (a measure of disorder) must increase, but entropy is also associated with probability (i.e., disordered arrangements correspond to far more possible states and are therefore more probable). The formation of an organized system from disorder is not forbidden, but highly improbable. Living systems “pay” for their existence by creating more disorder elsewhere in the system. For example, they may convert nutrients to simpler chemicals, but the process causes a release of heat to the general environment and increases the disorder of that environment. There’s no such thing as a free lunch.

The spontaneous generation of a living bacterium from small molecules would be near-miraculous due to its interior complexity. The odds that component atoms within a hot gas could self-assemble into a bacterium are more than astronomical ($$1\ \text{in } 10^{10^{11}}$$; Morowitz 1968). Matters might improve by thousands of orders of magnitude if we started with an appropriate mixture of small molecular species, but the overall prospect would still be hopeless. To reduce the improbability, scientists have suggested that life began with only a very limited subset of the components of a bacterial cell. The heart of the origin-of-life problem lies in finding the way in which the initial functioning system was formed from disorganized chemical mixtures produced by the normal processes of abiotic chemistry.
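The link between entropy and probability can be written compactly. The relations below are a standard textbook sketch, not a derivation from this chapter: $$W$$ denotes the number of microscopic arrangements consistent with a given state, $$k_B$$ is Boltzmann's constant, and all arrangements are assumed equally likely (an isolated system at fixed energy).

$$
S = k_B \ln W, \qquad \frac{P_{\rm ordered}}{P_{\rm disordered}} = \frac{W_{\rm ordered}}{W_{\rm disordered}} = e^{\Delta S / k_B}, \qquad \Delta S = S_{\rm ordered} - S_{\rm disordered} < 0
$$

An organism can lower its own entropy only if its surroundings gain at least as much:

$$
\Delta S_{\rm total} = \Delta S_{\rm organism} + \Delta S_{\rm environment} \geq 0
$$

Read this way, odds of $$1\ \text{in } 10^{10^{11}}$$ would correspond to $$\Delta S / k_B = -10^{11} \ln 10 \approx -2.3 \times 10^{11}$$. This is only an order-of-magnitude illustration of how steeply probability falls with an entropy deficit, under the equal-likelihood assumption stated above.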

#### 5.1.4.2. Separate Problems: Origin of Life vs. Origin of Organics

In the early 19th century, many chemists felt that the distinction between life and non-life lay in the very chemicals used to construct them. Living systems contained carbon and the chemicals in them were called “organic,” while substances considered to be inorganic (with some exceptions) had no carbon. This distinction was erased in 1828, when Friedrich Wöhler prepared urea (an organic compound) from another substance that was classified as inorganic.

We now recognize that mixtures of simple organic chemicals are produced in many locations off this planet (e.g., carbonaceous chondrites, interstellar dust clouds, cometary tails, and the atmosphere of Titan). The most informative studies have been performed on meteorites because they represent the products of authentic abiotic processes, and are amenable to detailed analyses in Earth-based laboratories. The Murchison meteorite has been subjected to extensive analysis.

The contents of the meteorite are quite complex: the bulk of the carbon present is in the form of insoluble macromolecules, and soluble organic compounds comprise 10-20%. The most abundant chemical class is organic sulfonic acids, followed by polar aromatic hydrocarbons and carboxylic acids. Amino acids occur at levels between $$10-100$$ ppm (1-2 orders of magnitude lower than the aforementioned compounds), and the base components of nucleic acids at about 1 ppm. Sugars and phosphate esters of all kinds are absent. More than 70 amino acids have been detected in the Murchison meteorite, but only eight of the twenty needed for Earth-life were among them.

If we presume that life on Earth began amidst a chaotic chemical mixture, we must look for a pathway by which the smaller components could organize themselves. Another question arises: How did the components of that mixture come to be present on the early Earth?

#### 5.1.4.3. Possibilities for Reduced Organics

Underlying the Miller-Urey experiment was the assumption that the primitive atmosphere of the Earth was derived from the solar nebula, and was therefore rich in hydrogen. Such an origin explains the atmospheres of the outer Solar System planets, which are reducing and consist of molecular hydrogen $$\rm H_2$$ rather than $$\rm H_2O$$. Subjecting an atmosphere rich in molecular hydrogen to energy from various sources was expected to give rise to a global “prebiotic soup.”

However, the current consensus suggests that any primary atmosphere of hydrogen was lost quickly and replaced by a secondary atmosphere due to outgassing from the planet. While a reducing atmosphere for the early Earth may still be favored by some prebiotic chemists, the weight of geochemical opinion favors an early (at least by the time life arose) atmosphere dominated by heavier gases. This is in accord with current observations of volcanic emissions.

Some organics could be supplied through infall of comets and meteorites, but the total quantity from these sources is expected to be small compared to the total volume required for global life. It is not necessary to supply the entire globe with organics to furnish raw materials for the origin of life. Some very specialized and favorable local environments may have been sufficient for the purpose, which was a view expressed by Darwin in 1871.

Darwin’s “warm little pond” has been joined by several other suitable locations (e.g., hydrothermal vents). Vent conditions have been shown to be suitable for mineral-catalyzed abiotic synthesis of a number of small organic molecules relevant to biochemistry. Abiotic hydrocarbon reservoirs within mineral enclosures deep within Earth’s crust represent another proposed site. Igneous mineral matrices have been demonstrated to be suitable locations for the assembly of carboxylic acids from $$\rm CO$$ and $$\rm CO_2$$.

In our current ignorance about the varieties of life in the Universe, we cannot exclude the alternative explanation that life began on this planet without the assistance of organic molecules. In theory (from Cairns-Smith), layered clay minerals provide all of the necessary qualities needed to support life: information storage, reproductive ability, catalysis, and the capacity to evolve. In this picture, organic molecules were introduced as an evolutionary improvement only after mineral life was well underway.

A central problem for the origin of life remains: identifying the mechanism by which an abiotic chemical mixture absorbs energy, increases in complexity, and evolves. The availability of a suitable energy source, necessary raw materials, and a suitable locale are prerequisites, but not the answer to the riddle of self-organization. In the search for that answer, investigators face a dilemma. Which was first to appear on Earth: replicating molecules or metabolic processes?

### 5.1.5. Replicator-first Theories

The discovery of the structure of DNA showed how hereditary material was stored, and placed the replicator DNA at the center of our understanding of life. The beauty of the double helix, which demonstrated how both reproduction and information storage could be carried out at the molecular level, convinced many in the field that nucleic acids were at the center of life.

#### 5.1.5.1. Advantages of the Replicator Theory#

Darwin’s theory of natural selection provided a path for the evolution of simple cells to humans. It could be extended so that a replicator could also evolve by natural selection, but in an environment where molecules reproduced, mutated, and survived according to their ability to further reproduce in the environment. To explain the origin of life, one only had to account for the origin of the first replicator.

Considerable support for finding the first replicator was provided by using the RNA of a virus called $$\rm Q\beta$$ in experiments by Sol Spiegelman and his group in the 1960s and 1980s. $$\rm Q\beta$$ uses RNA rather than DNA as its genetic material, with replication carried out in a manner described by Watson and Crick. In the presence of the appropriate polymerase enzyme and the four necessary building blocks (nucleoside triphosphates), the RNA can be replicated in a test tube. The synthesized RNA can act as a template for producing more RNA, as long as the supplies of building blocks holds up. Spiegelman’s group was able to follow this process for a time equivalent of over 70 RNA “generations.”

In one thought-provoking experiment, Spiegelman’s group added a drug that bound to certain sites on the RNA, which greatly slowed the copying rate. After several generations, the favored binding place of the drug had been destroyed by three mutations in the product RNA. The product RNA had replicated more rapidly than other competing variations of the RNA, and its descendants soon became an overwhelming majority of the colony. This process represented Darwinian evolution at the molecular level.

#### 5.1.5.2. A Problem with a Solution#

Problem: A naked replicator could carry information, but could not carry out the tasks performed today by proteins. The RNA replicator concept could not be extended immediately to the origin of life because a protein enzyme was needed for the copying to take place. Leslie Orgel described this paradox as “the chicken or the egg” problem. A nucleic acid could be copied, and evolve through mutation, but could not by itself carry out the necessary catalytic functions. Nucleic acids and proteins both appear necessary for the conduct of life, but it is difficult to account for the appearance of either substance on the early Earth. The abiotic formation of both together seemed out of the question.

Solution: Life began with an “RNA World.” Thomas Cech and Sidney Altman showed that RNA molecules could carry out some of the functions of proteins. The term RNA World was applied to describe the vision of Walter Gilbert, where RNA alone performed all the key functions of life before proteins arrived on the scene. The elegant way in which catalytic RNA appeared to solve the “chicken and egg” paradox led many to the conviction that RNA World existed in the earliest days of life on Earth. Over several decades, biochemists developed techniques to demonstrate the ability of RNA to perform the actions of proteins. However, this solution says little about the origin of life on Earth due to the absence of chemists and high-tech labs on the early Earth, and the extreme improbability of the abiotic synthesis of even a single functioning RNA molecule under realistic natural conditions.

#### 5.1.5.3. Just a Problem#

Problem: RNA is too complicated to form spontaneously. Gilbert’s seminal paper described RNA molecules catalyzing their own assembly from a “nucleotide soup.” It was widely presumed that nucleotides were readily available on the early Earth. However, this assumption has not been supported by evidence. The difficulties of synthesizing RNA become apparent when one considers the likely abiotic availability of adenosine. This substance has not been detected in any quantity within a Miller-Urey type experiment or a meteorite. To get the necessary adenosine, it has been suggested that the base adenine and the sugar ribose were formed separately, brought together, and combined. How likely are these events?

Ribose formation: The sugar ribose is the essential backbone component of RNA, and a reaction involving formaldehyde has been generally cited as a plausible source of ribose on the early Earth. However, the ribose soon decomposes. The half-life for decomposition is around 73 minutes at $$100\ {\rm ^\circ C}$$ and a pH of 7, and 44 years at $$0\ {\rm ^\circ C}$$. Ribose yields can be improved by adding specific minerals, carefully controlling the experimental conditions, and using pure reagents. But each such specification escalates the improbability that such conditions could occur naturally.
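To put the quoted half-lives in perspective, here is a minimal sketch that assumes simple first-order (exponential) decay. That is the usual meaning of a half-life, but it is an assumption here rather than something the text states. The function name `fraction_remaining` and the two time spans are illustrative choices, not values from the source.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of an initial ribose stock left after `elapsed` time.

    Assumes first-order decay: N(t) / N0 = 2 ** (-t / t_half).
    `elapsed` and `half_life` must be given in the same time unit.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 2.0 ** (-elapsed / half_life)


# Half-lives quoted in the text (pH 7).
HALF_LIFE_HOT_MIN = 73.0   # minutes, at 100 degrees C
HALF_LIFE_COLD_YR = 44.0   # years, at 0 degrees C

# Illustrative time spans (assumptions, not from the text).
after_one_day_hot = fraction_remaining(24 * 60, HALF_LIFE_HOT_MIN)
after_1000_yr_cold = fraction_remaining(1000.0, HALF_LIFE_COLD_YR)

print(f"100 C, 1 day:    {after_one_day_hot:.2e} of the ribose remains")
print(f"0 C, 1000 years: {after_1000_yr_cold:.2e} of the ribose remains")
```

Under these assumptions, roughly one part in a million of the ribose survives a single day at $$100\ {\rm ^\circ C}$$, and less than one part in a million survives a thousand years at $$0\ {\rm ^\circ C}$$, which is short on geological timescales.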

Adenine formation: Adenine can be prepared as long as the concentration of $$\rm HCN$$ remains sufficiently high (about $$0.1$$ molar, $$\rm M$$). One estimate for the maximum plausible $$\rm HCN$$ concentration in a prebiotic ocean on Earth is $$4\times 10^{-5}\ {\rm M}$$. A remedy proposed for this situation was that the adenine was produced in lakes that froze seasonally, which concentrated the $$\rm HCN$$. However, several other ingredients are necessary (e.g., ammonia), and the introduction of formaldehyde tends to consume the $$\rm HCN$$.
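The size of the concentration gap can be made explicit with a short calculation. The sketch below uses the two concentrations quoted in the text. The freezing model is a deliberate idealization and an assumption of this example: all $$\rm HCN$$ is taken to stay in the remaining liquid, with none lost to ice, air, or side reactions. The function names are hypothetical.

```python
REQUIRED_HCN_M = 0.1     # molarity needed for adenine synthesis (from the text)
OCEAN_HCN_MAX_M = 4e-5   # estimated prebiotic ocean maximum (from the text)


def enrichment_factor(required, available):
    """How many times more concentrated the solution must become."""
    if available <= 0:
        raise ValueError("available concentration must be positive")
    return required / available


def water_fraction_removed(factor):
    """Fraction of the liquid water that must freeze out to reach `factor`.

    Idealized assumption: every HCN molecule stays in the remaining liquid.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return 1.0 - 1.0 / factor


factor = enrichment_factor(REQUIRED_HCN_M, OCEAN_HCN_MAX_M)
removed = water_fraction_removed(factor)

print(f"Required enrichment: about {factor:.0f}x")
print(f"Water that must freeze out: {removed:.2%}")
```

Even in this best case, the lake would need to concentrate its $$\rm HCN$$ by a factor of about 2500, which means freezing out all but about 0.04% of the liquid water.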

Combination of adenine and ribose to form adenosine: Assuming that adenine and ribose formed separately, there are complex reactions that could bring them together to react. However, the product was not the one used in RNA. A mixture containing some adenosine can be produced, but still in the presence of other isomers that would be expected to interfere with RNA synthesis. Experiments could be constructed that violate no natural law, but would fall within the category of an extremely improbable event. Some scientists feel that the current state of prebiotic chemistry reflects the presence of unanswered problems, which can be cured by the application of additional ingenuity.

### 5.1.6. Pre-RNA World#

A number of scientists who felt that the formation of the first replicator was coincident with the origin of life sought an alternative because they were unhappy with the prebiotic synthesis of RNA. The task of the pre-RNA World advocates was to identify another polymeric substance that possessed the catalytic and self-replication abilities of RNA, while being a more plausible product of abiotic synthesis. Ideally this substance would form a stable double helix with an RNA strand to provide a mechanism for heritability (i.e., information transfer) during the course of evolution.

Two candidate structures were synthesized as part of a comprehensive effort to prepare analogs of DNA and RNA. One extensively investigated example was p-RNA (pyranosyl-RNA), which retained the sugar ribose, but replaced the five-membered ring in RNA with a six-membered (pyranosyl) ring (see Fig. 5.5). Since free ribose in solution prefers a six-membered ring, the prospects for the abiotic synthesis of p-RNA would appear marginally better than those for RNA. Another candidate RNA precursor is TNA (threose nucleic acid), whose sugar (threose) contains only four carbons and may be more accessible to abiotic synthesis than ribose (see Robertson & Joyce (2012) for more details). A TNA strand will cross-pair with both RNA and DNA. Despite these advantages, there are problems with sugar instability and other difficulties that apply to the abiotic synthesis of any sugar-based replicator.

Fig. 5.5 Like RNA, both p-RNA and PNA can form double helices through complementary base-pairing, and each could therefore in principle serve as a template for its own synthesis. Image Credit: Alberts et al. (2002).#

A third example of an RNA precursor is PNA (peptide nucleic acid), which escapes the complications of carbohydrate (sugar) chemistry. The structure contains two alternating units and has no chiral center. PNA can form stable double helices with itself, and it can cross-pair with RNA and DNA. The evidence supporting PNA includes the fact that traces of the fundamental repeating unit of the backbone were formed in a Miller-Urey spark discharge experiment, but at $$<0.01\%$$ of the yield of glycine. It seems unlikely that the amount of PNA backbone component on the early Earth would approach that of its much simpler constituent, glycine.

### 5.1.7. Metabolism-first Theories#

If replicator theories are set aside as extremely improbable (not impossible) scenarios for the origin of life, then a surprising conclusion emerges: in the beginning, life functioned without proteins, nucleic acids, and other macromolecules that dominate the activities of present cells. Small molecules carried out the processes of catalysis, information storage, and reproduction using energy-driven cycles of chemical reactions. In metabolism-first theories, a number of alternatives are presented for how the small molecules carry out basic life processes.

Information is stored in the mixture: A list storing all the information like DNA may be unnecessary. A set of molecules within a primitive cellular compartment on the early Earth would carry their own heredity. An informational system of this type has been called a “compositional genome” (Segré, Ben-Eli, & Lancet 2000).

Reproduction is carried out by splitting the compartment: In the absence of a list, a primitive replicator-free cell could reproduce by collecting a duplicate set of its contents from the environment, and then splitting in two. Improved ingredients could also be collected from the environment, which would be the equivalent of a mutation.

Enzymatic catalysis is carried out by monomers: Small molecules cannot approach the speed or specificity of highly evolved enzymes, but it is not clear how much of the higher functions would be needed to drive primitive autocatalytic cycles. As evolution progressed, monomers may have yielded to dimers and trimers, gradually ascending the ladder of complexity.

Energy-driven catalytic cycles perform the essential processes of the cell: Some possible functions of importance are the synthesis of improved catalysts and membrane components that permit improved interaction with the energy supply. Possible energy sources include light, redox reactions involving organic substances, volatile inorganics and minerals, pH gradients, and ionic potentials across membranes. It is not clear (at present) which of these was most suitable on the early Earth.

• The Lipid World: This scenario includes a self-assembling lipid micelle that would catalyze necessary reactions by itself. The name “lipozyme” has been given to such an assembly. Eventually dimers and higher oligomers would form, leading to a nucleic acid “takeover” of the hereditary function.

• The Iron-Sulfur World: The evolving system is segregated from the environment by being constrained to the surface of a positively charged mineral. The candidate of choice is pyrite (iron sulfide, or “fool’s gold”). The formation of pyrite from hydrogen sulfide and ferrous ion provides a source of chemical energy to drive a variety of processes, including the fixation of carbon monoxide and carbon dioxide into small organic molecules. No pre-formed chemical “soup” is necessary and hydrothermal vents are the preferred location.

## 5.2. The Earliest Records of Life on Earth#

### 5.2.1. Problems with the Record#

Astrobiology has only a single successful experiment in planetary life to investigate: Earth life. The history of terrestrial life must act as the archetype for astrobiological models of the appearance and radiation of life anywhere in the Universe, albeit an ever more contingent and unique one. It could be argued that all habitable planets experienced similar environmental constraints and pathways of physical and chemical development. Then, the process of biological initiation elsewhere should be broadly reminiscent of Earth-like life. If so, astrobiology is saved from the challenges of examining things far away, but is instead faced with the difficulties of examining events here long ago (i.e., trading deep space for deep time). Figure 5.6 illustrates two versions of the geologic timeline along with some events from geologic history.

Fig. 5.6 (A) The current Precambrian time scale from the International Commission on Stratigraphy, based on Plumb and James (1986) and Plumb (1991) (and including the 2014/15 revision of the base of the Cryogenian Period to c. 720 Ma); (B) a proposed revised Precambrian time scale using geologic events (Van Kranendonk et al., 2012). Image Credit: Strachan et al. (2020).#

The origin and early evolutionary history of terrestrial life are poorly known, and the environmental conditions of the early Earth are poorly constrained. There are several reasons for this.

1. Ancient rocks are rare: Almost all potential information about the first half of Earth’s history is contained in geological materials. Such rocks have been mostly hidden or destroyed by geological processes (e.g., erosion, burial, or subduction). Some rocks from the early Earth were ejected into space by catastrophic meteorite impacts, of which there were plenty during the first billion years of Earth history. Both geological and impact mechanisms are viable candidates for the destruction of the earliest crust.

2. Metamorphism and/or deformation damage the information content of rocks: Biological signatures are fragile, usually surviving only mild metamorphism and moderate deformation. On a tectonically active planet, the cumulative probability of deep burial or intense folding steadily increases over time. Radioactive heating was greater on the Archean Earth (2.5-4 Gyr ago), and tectonic activity may have been more intense, with post-depositional modification of rocks correspondingly more likely. The effects of shock metamorphism during meteorite impacts could have also been severe, judging by the cratering on the Moon and Mars. Temperatures above $$300\ {\rm ^\circ C}$$ are significantly destructive to most chemical biosignatures. Therefore, ancient rocks, which are already rare, even more rarely contain astrobiologically important information.

Note

Metamorphism refers to mineralogical changes in rocks due to pressure and heat. Deposition is the process of laying down a sedimentary layer, while post-depositional processes can later modify this layer in many ways and thus complicate its interpretation.

3. Surviving rocks are located in unfortunate places: Many occur on continents where recent environmental conditions have been unfriendly to pristine preservation. Some rocks are situated where prolonged weathering under subtropical climates has transformed most surface rocks into varieties of soil. Others have endured glaciation over the past million years, which has covered them with till, shattered them by frost action, swamped them with lakes, or encouraged the overgrowth of lichens and mosses. Fresh samples are obtained by drilling, but most core boring has occurred where the rocks are unrepresentative (e.g., in the vicinity of mines). Mines are situated near the anomalous concentration of a particular element.

4. Not all rocks were formed in settings likely to yield evidence of life: Sedimentary rocks are the only ones from which paleobiological data can be readily obtained. For the early Earth, these are commonly shales, cherts, banded iron formations, and carbonate rocks. A few other rock types can yield important environmental information and biological relics (in exceptional instances), which include sandstones, evaporites, hydrothermal deposits, tuffs, and hyaloclastites. The primordial geological record mainly consists of granites, basalts, and other highly metamorphosed and deformed materials.

5. Very few well-preserved remnants of the early Earth have yet been studied systematically: The geological features of the rocks and their environments must be understood before paleobiological and paleo-environmental interpretations can be made. Moreover, primordial rocks and relics of early life might well have no modern analogs. The introduction of younger material into older rocks (i.e., contamination) has plagued Archean paleontology because the opportunities are common during the long and complex geological history of even the best-preserved Archean rock. Determining whether or not a specimen is of biological origin (i.e., biogenicity) is contentious for many putatively primordial fossils because early organisms are expected to be morphologically simple, chemically unsophisticated, and environmentally benign. Distinguishing inorganic objects from primitive fossils is difficult. Very little research has been performed on abiogenic mimics of possible early biosignatures.

Because of all these shortcomings, every claimed Archean fossil must be subjected to strict scrutiny, if only because of the extreme improbability of its survival. Despite these caveats, accessible outcrops of well-preserved Archean rocks for paleobiological and paleo-environmental study still do exist.

Fig. 5.7 World map showing the distribution of Archean cratons and modern subduction zones. Image Credit: Sotiriou et al. (2022).#

### 5.2.2. Types of Evidence#

#### 5.2.2.1. Microfossils#

Microfossils are the preserved remains of microbial organisms, which are exclusively organic-walled prokaryotes on the early Earth. By contrast, fossils visible to the unaided eye are only evident in the second half of Earth’s geological record. The oldest of these (Grypania) is a carbonaceous compression of a spiral-shaped seaweed about $$1\ {\rm cm}$$ across that appeared $${\sim}1.85\ {\rm Ga}$$. The oldest suspected microfossils were found in the Nuvvuagittuq Supracrustal Belt in Québec and appear in rocks dated to 4.28-3.75 Ga (Papineau et al. (2022)). Discoveries made in ancient rocks typically carry a great deal of controversy, and the sample from Canada is no exception.

Fig. 5.8 The microfossil Grypania from the $${\sim}2.2$$ Ga Negaunee Formation in Michigan, USA. Image Credit: Wikipedia:Grypania.#

Fig. 5.9 The cyanobacterium Eoentophysalis from the $${\sim}1.9$$ Ga Belcher Group, Canada. Scale bar = $$20\ {\rm \mu m}$$. Image Credit: Demoulin et al. (2019).#

The earlier appearance of microorganisms might also be expected from the overall trend within individual evolutionary lineages towards larger sizes over time. However, remnants of very small organisms are most prone to destruction by metamorphic and deformational recrystallization. Microbial fossils are generally only preserved where lithification (entombment in rock) was almost instantaneous upon death, the host sediment is extremely fine in grain size, and the mineralizing material is hostile to microbial activity. The two main rock types that satisfy these requirements are chert and shale.

• Cherts readily precipitated in evaporative carbonate environments during the Proterozoic and Archean before silica-shelled organisms had evolved and thereby lowered dissolved silica concentrations in the oceans to levels far below saturation.

• Shales are widespread in Archean terrains, but are generally recrystallized or sheared along bedding planes to the great detriment of fine-scale fossil preservation.

The search for Archean microfossils has been fraught with controversy. The two key issues that have emerged from the many vigorous debates are those of biogenicity (whether a presumed microfossil is actually biogenic) and syngenicity (whether a presumed microfossil has an ancient origin).

• Biogenicity arises as a problem because of the extreme morphological simplicity of prokaryotic organisms and their likely remains. Virtually all microbial groups (except some cyanobacteria) can be categorized as “balls or sticks”: either simple spheroids or uncomplicated tubular filaments. It is very difficult to distinguish true prokaryotic microfossils from the many abiogenic structures that can mimic similar shapes. The best way of discriminating fossilized simple lifeforms from non-life is to find evidence of past biological behavior that is not explicable in terms of physical or chemical processes (e.g., varying orientations with respect to environmental gradients or indications of reproduction).

• Syngenicity arises as a problem because of the long and complex geological history of even the best-preserved Archean rock. This provides many potential opportunities for younger biological contaminants to infiltrate. Contamination can sometimes only be distinguished by features like slightly discordant host rocks, lack of fossil-sediment interaction, or anomalous geochemistry compared to metamorphic grade.

Note

Metamorphic grade is a qualitative measure of the degree of heating and pressure to which a rock has been exposed. Low-grade means less heat and pressure, while high-grade means more heat and pressure.

The most ancient microfossils that can be confidently assigned to an extant group of organisms are variably pigmented colonies of spheroidal cells showing a distinct division pattern from the Belcher Group of arctic Canada (see Fig. 5.9). The microfossils are so similar to a current genus of morphologically complex cyanobacteria that there can be little doubt that the Proterozoic fossils are also cyanobacteria.

The only convincing Archean microfossils (found in South Africa) are assemblages of ellipsoids ($$0.2-2.5\ {\rm \mu m}$$ in diameter), spheroids ($$1.5-20\ {\rm \mu m}$$ in diameter), tubular filaments ($$0.5-3\ {\rm \mu m}$$ in diameter), and interwoven mats of tubular filaments ($$10-30\ {\rm \mu m}$$ in diameter). They are composed of kerogen (insoluble sedimentary organic matter) that has isotopic values similar to those of younger biogenic carbon. Their complex divisional and matting habits are strongly reminiscent of biological behavior, and their irregular distribution aligned along their sediments’ layering indicates contemporaneous deposition. These objects are clearly syngenetic and biogenic, which makes them incontestable Archean microfossils. All older assemblages are more controversial.

The microfossil record leaves us with no certain members of modern microbial lineages before $${\sim}2.1\ {\rm Ga}$$ and no certain preserved cells before $${\sim}2.55\ {\rm Ga}$$, but plenty of possible and probable microbial remains in some older rocks dating back as far as $${\sim}3.45\ {\rm Ga}$$. The record also shows how easy it is to mistake abiogenic artifacts and younger contaminants for genuine Archean microfossils and illustrates that careful geology and converging lines of evidence are needed in order to be confident of biogenicity and syngenicity. In particular, signs of biological behavior and indications of metamorphic grade have proved to be the most powerful tools for distinguishing reliable results from misleading ones.

#### 5.2.2.2. Stromatolites#

Stromatolites are laminated sedimentary structures accreted as a result of microbial growth, movement, or metabolism. They are trace fossils of microbial activity and are thus less direct evidence of life than microfossils. Their shapes vary, but they often show a predominance of convex-upward flexures forming domes or columns, although some conical forms show the converse.

Several scales of flexuring and layering give them a complex internal structure, and a pronounced irregularity of flexuring and layering makes them look wrinkly. If accreted by photosynthetic or otherwise Sun-loving microbes, the layers show thickening flexure crests because the constructing microbes grow more successfully on topographic highs where there is more light. The constructing microbes accrete sediment by three distinct processes.

1. Trapping occurs when erect microbial filaments baffle passing water currents, causing entrained sediment to be deposited, just as a carpet traps dirt.

2. Binding happens when passively deposited sediment is caught up in and overgrown by a microbial mat, either by lodging in irregularities in the mat or getting stuck in the extracellular mucilage secreted by the microbes.

3. Precipitation results from microbial photosynthesis removing $$\rm CO_2$$ from the surrounding water, causing calcium carbonate to be deposited as the equilibrium of the following reaction is forced towards the right side because of product depletion:

(5.1)#\begin{align} {\rm Ca^{2+}} + 2{\rm HCO_3^-} \leftrightarrow {\rm CaCO_3} + {\rm CO_2\ (gas)} + {\rm H_2O}. \end{align}

There are several classification schemes for stromatolites, with the principal ones being a binomial, quasi-biological taxonomic system like that used for animal trace fossils and a morphological system like that used for sedimentary structures.

• Proponents of the binomial system emphasize its usefulness for biostratigraphy, with characteristic stromatolites classified for various Proterozoic time intervals on several continents.

• Proponents of the morphological system emphasize the involvement of inorganic processes in stromatolite construction and thus regard stromatolites as better environmental markers than time markers.

Disputes have ensued over the relative importance of biotic versus sedimentary controls on their structure and even about the definition of the term “stromatolite.” Stromatolites are generally large and obvious structures with a better preservation potential than microfossils. Before using stromatolites as evidence for early biology, multiple lines of supporting evidence should be sought, preferably including fabric indicative of microbe-sediment interaction.

Many assemblages of stromatolites are now known from rocks older than $${\sim}2.5\ {\rm Ga}$$. Those of late Archean age ($$<3\ {\rm Ga}$$) are widely regarded as showing strong evidence of biological activity during their formation (see Fig. 5.10). As well as the large-scale structural features typical of biogenic accretion, stromatolites can have precipitate-filled gas bubbles produced by metabolic activity, tufted layers formed by gliding filaments congregating on topographic high points, and carpet-like textures of vertically oriented filaments thickening on substrate highs. As a result, there can be little doubt that these structures were accreted by a community of microbes that responded in some way to sunlight.

Fig. 5.10 Stromatolites are formed over the years by mats (1-10 mm in thickness) of microorganisms (cyanobacteria among others) found in shallow, mainly marine waters. The microorganisms precipitate mineral particles, which makes the mat thicken, but only the upper part survives. Most stromatolites display characteristically layered structures. Only the layers are visible to the naked eye. Locality: Strelley Pool Chert (SPC) (Pilbara Craton) - Western Australia. Image Credit: Wikipedia:stromatolite.#

#### 5.2.2.3. Carbon Isotopes#

Both the microfossil and stromatolite records of the Archean are very patchy. A more continuous record of Archean biology is preserved in a less direct way: in isotopic chemofossils, particularly the carbon isotope ratios of sedimentary rocks. Autotrophic metabolisms that use $$\rm CO_2$$ for manufacturing cellular carbon compounds preferentially incorporate the light stable isotope of carbon $$\rm ^{12}C$$ over the heavy $$\rm ^{13}C$$ into their synthesized organic matter. Any sedimentary kerogen (insoluble organic matter) derived from such organisms will inherit this isotopic fractionation.

Reflecting the biological concentration of $$\rm ^{12}C$$, sedimentary carbonate that is deposited from water bodies inhabited by autotrophs will consequently be depleted in $$\rm ^{12}C$$, or relatively enriched in $$\rm ^{13}C$$. On the modern Earth, organic carbon has a carbon isotopic value (called $$\delta \rm ^{13}C_{org}$$) that averages around -22‰ (i.e., depleted in $$\rm ^{13}C$$ by 22 parts per thousand relative to an arbitrary standard), while carbonate has a $$\delta \rm ^{13}C_{carb}$$ of about 0‰ and the entire Earth is near -6‰.
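The per mil values above follow the conventional delta notation, $$\delta {\rm ^{13}C} = (R_{\rm sample}/R_{\rm standard} - 1)\times 1000$$, where $$R$$ is the $$\rm ^{13}C/^{12}C$$ ratio. The sketch below shows how a $$\delta$$ value maps to an isotope ratio and back. The numerical value of `R_STANDARD` is a commonly quoted figure for the carbonate reference standard and should be treated as an assumption of this example. The function names are illustrative.

```python
# Assumed 13C/12C ratio of the reference standard (illustrative value).
R_STANDARD = 0.0112372


def delta13c(r_sample, r_standard=R_STANDARD):
    """Convert a 13C/12C ratio to a delta value in per mil."""
    return (r_sample / r_standard - 1.0) * 1000.0


def ratio_from_delta(delta, r_standard=R_STANDARD):
    """Convert a delta value in per mil back to a 13C/12C ratio."""
    return r_standard * (1.0 + delta / 1000.0)


# Delta values quoted in the text for the modern Earth.
r_org = ratio_from_delta(-22.0)   # average organic carbon
r_carb = ratio_from_delta(0.0)    # average carbonate

print(f"organic carbon 13C/12C ratio: {r_org:.7f}")
print(f"carbonate 13C/12C ratio:      {r_carb:.7f}")
print(f"round trip for organic carbon: {delta13c(r_org):.1f} per mil")
```

The two ratios differ only in the fourth significant figure, which is why isotopic compositions are reported as per mil deviations from a standard rather than as raw ratios.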

Unfortunately, there are some complications to this simple picture. Fluctuations in the fractionation record might be expected to appear at times when the burial ratio of sedimentary kerogen to carbonate varies or when the mean autotrophic fractionation between organic and inorganic carbon $$\Delta_{\rm C}$$ varies. It appears that now is a somewhat unusual time in Earth history with very low $$\Delta_{\rm C}$$. This may have resulted from declining atmospheric $$\rm pCO_2$$ associated with the onset and growth of glaciers. Overall it seems reasonable to assume that a global carbon cycle regulates the available carbon based on autotrophy of both independent microbes and endosymbiotic microbes (e.g., chloroplasts in plants and algae).
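The link between the burial ratio and the isotopic values can be illustrated with a simple two-reservoir mass balance, $$\delta_{\rm in} = f_{\rm org}\,\delta_{\rm org} + (1 - f_{\rm org})\,\delta_{\rm carb}$$. This balance is a standard simplification and is not stated in the text. It assumes a steady state, approximates $$\Delta_{\rm C}$$ as $$\delta_{\rm carb} - \delta_{\rm org}$$, and treats the whole-Earth value of -6‰ as the composition of the carbon entering the system. The function name is hypothetical.

```python
def organic_burial_fraction(delta_input, delta_carb, delta_org):
    """Fraction of buried carbon that is organic, from isotope mass balance.

    Solves delta_input = f * delta_org + (1 - f) * delta_carb for f.
    All arguments are delta-13C values in per mil.
    """
    delta_c = delta_carb - delta_org  # approximate fractionation, Delta_C
    if delta_c == 0:
        raise ValueError("delta_carb and delta_org must differ")
    return (delta_carb - delta_input) / delta_c


# Modern values quoted in the text (per mil).
f_org = organic_burial_fraction(delta_input=-6.0, delta_carb=0.0, delta_org=-22.0)
print(f"Implied organic burial fraction: {f_org:.2f}")

# With a larger fractionation, the same input implies less organic burial.
f_org_large = organic_burial_fraction(delta_input=-6.0, delta_carb=0.0, delta_org=-30.0)
print(f"With delta_org = -30 per mil:    {f_org_large:.2f}")
```

With the values quoted in the text, about a quarter of buried carbon would be organic. Changing either the burial fraction or $$\Delta_{\rm C}$$ shifts the recorded $$\delta$$ values, which is why both must be considered when reading the isotopic record.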

When extrapolating back to more ancient rocks, one must consider a few complicating factors.

1. Older rocks tend to be metamorphosed and therefore tend to have a low $$\Delta_{\rm C}$$ that is indistinguishable from non-biologic fractionation.

2. $$\delta \rm ^{13}C_{org}$$ can shift to heavier values at lower metamorphic grades because of the loss of light hydrocarbons.

3. Isotopically light hydrocarbons can migrate into ancient sedimentary rock and become solidified (i.e., contamination).

4. Metasomatism (i.e., chemical alteration of the rock) and metamorphism can introduce secondary carbonate or graphite into ancient rocks by precipitation from or reaction with migrating fluids rich in dissolved carbonic species.

All of these mechanisms are apt to confound the ancient $$\Delta_{\rm C}$$ record, particularly in deformed and metamorphosed terrains.

Overall, the pre-1.0 Ga carbon isotope record is similar to that of more modern times. $$\delta \rm ^{13}C_{carb}$$ is generally around 0‰ and $$\delta \rm ^{13}C_{org}$$ is $${\sim}{-30}$$‰ in low-grade metasedimentary rocks. Shallow marine carbonates in particular are very close to 0‰ as far back in time as the undisputed sedimentary record goes. Organic carbon isotopic values are similarly consistent, though with greater variability about their mean value. Autotrophic fixation has almost certainly been the predominant process of primary biological productivity on Earth since at least 3.5 Ga.

#### 5.2.2.4. Sulfur Isotopes#

Some microbial metabolic processes also fractionate the stable isotopes of sulfur, $$\rm ^{32}S$$ and $$\rm ^{34}S$$. In particular, sulfate reduction for metabolic purposes rather than for biosynthesis can impart a large ($$-10$$‰ to $$-45$$‰) fractionation in favor of the light isotope. This reaction, which can use either organic carbon:

(5.2)#\begin{align} {\rm SO_4^{2-}} + 2{\rm H^+} + 2{\rm CH_2O} \rightarrow {\rm H_2S} + 2{\rm CO_2} + 2{\rm H_2O}, \end{align}

or molecular hydrogen:

(5.3)#\begin{align} {\rm SO_4^{2-}} + 2{\rm H^+} + 4{\rm H_2} \rightarrow {\rm H_2S} + 4{\rm H_2O}, \end{align}

as an electron donor during the reduction of sulfate, only takes place in anaerobic (oxygen-poor) settings. The fractionation is preserved in the geologic record when the metabolic product hydrogen sulfide reacts with dissolved ferrous ion during early diagenesis to form sedimentary pyrite. Though not particularly abundant or widely distributed in sedimentary rocks, such pyrite is still useful as a tracer of ancient biological activity because it is relatively immune to metamorphic resetting of isotopes. However, evidence of the isotopic composition of seawater sulfate is only rarely preserved in rocks, principally in evaporitic sulfate minerals (e.g., gypsum and anhydrite), or in trace quantities in carbonate minerals. Thus it is usually impossible to determine directly the $$\Delta_{\rm S}$$ (analogous to $$\Delta_{\rm C}$$) of ancient sediments, weakening the biological inferences that can be drawn from this isotopic system.

#### 5.2.2.5. Nitrogen Isotopes#

Nitrogen is an essential nutrient for all living things, as it is a key constituent of the amino acids of proteins and the bases of nucleic acids. It has two stable isotopes, $$\rm ^{14}N$$ and $$\rm ^{15}N$$, and predominantly resides in a single reservoir, the atmosphere, as $${\rm N_2}$$ gas. However, its biogeochemical cycle is more complex than that of either carbon or sulfur and is consequently more difficult to decode from sedimentary isotopic ratios. Isotopic compositions in the two main crustal repositories of nitrogen, organic matter and ammonium ions in clays and micas, probably closely reflect their biological parents, particularly at lower metamorphic grades.

There are four principal steps in the biogeochemical cycle (see Fig. 4.7), but only two of them influence nitrogen isotopic ratios. Atmospheric $$\rm N_2$$ gas is relatively nonreactive and can only be incorporated into organic matter after fixation into ammonia $$\rm NH_3$$. This reaction,

(5.4)#\begin{align} {\rm N_2} + 3{\rm H_2} \rightarrow 2 {\rm NH_3}, \end{align}

can occur only in a reducing environment (i.e., in the presence of $$\rm H_2$$ gas). On the early, abiotic Earth, it could have occurred around hydrothermal vents using dissolved $$\rm N_2$$. It is now performed by a wide range of microorganisms including many cyanobacteria, methanogenic Archaea, the aerobic Azotobacter group, some members of the Clostridium group, and the symbiotic Rhizobium in plant roots. This reaction in its simplest form is

(5.5)#\begin{align} 2{\rm N_2} + 6{\rm H_2O} \rightarrow 4{\rm NH_3} + 3{\rm O_2}, \end{align}

which causes little isotopic fractionation. All ammonia produced within a cell (or cells) is rapidly incorporated into organic matter.

Organic nitrogen is broken down by ammonification, which produces either ammonium ion or ammonia depending on whether the environment is aqueous or not. In aerobic settings, ammonium is oxidized to nitrate by nitrifying microbes. This is done in two steps, from ammonium to nitrite ($$\rm NO_2^-$$), and then to nitrate ($$\rm NO_3^-$$):

(5.6)#\begin{align} 2{\rm NH_4^+} + 3{\rm O_2} &\rightarrow 2{\rm NO_2^-} + 2{\rm H_2O} + 4{\rm H^+}, \\ 2{\rm NO_2^-} + {\rm O_2} &\rightarrow 2{\rm NO_3^-}. \end{align}

Different bacteria undertake the two reactions, with Nitrosomonas principally responsible for the first and Nitrobacter dominant for the second.
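As a quick consistency check, the stoichiometry of reactions (5.4) through (5.6) can be verified by counting atoms and charge on each side. The sketch below is illustrative only: the helper names (`atoms`, `side_totals`, `is_balanced`) and the `(coefficient, formula, charge)` encoding are assumptions made for this example, and the parser handles only simple formulas without parentheses.

```python
import re
from collections import Counter

def atoms(formula):
    """Count atoms in a simple formula such as 'NH4' or 'H2O' (no parentheses)."""
    counts = Counter()
    for element, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(n) if n else 1
    return counts

def side_totals(side):
    """Sum atoms and net charge over (coefficient, formula, charge) terms."""
    total, charge = Counter(), 0
    for coeff, formula, q in side:
        for element, n in atoms(formula).items():
            total[element] += coeff * n
        charge += coeff * q
    return total, charge

def is_balanced(reactants, products):
    """True when both sides carry the same atoms and the same net charge."""
    return side_totals(reactants) == side_totals(products)

# Reactions as written in the text; labels are for display only.
REACTIONS = {
    "5.4": ([(1, "N2", 0), (3, "H2", 0)], [(2, "NH3", 0)]),
    "5.5": ([(2, "N2", 0), (6, "H2O", 0)], [(4, "NH3", 0), (3, "O2", 0)]),
    "5.6 (step 1)": ([(2, "NH4", 1), (3, "O2", 0)],
                     [(2, "NO2", -1), (2, "H2O", 0), (4, "H", 1)]),
    "5.6 (step 2)": ([(2, "NO2", -1), (1, "O2", 0)], [(2, "NO3", -1)]),
}

for label, (lhs, rhs) in REACTIONS.items():
    print(label, "balanced" if is_balanced(lhs, rhs) else "NOT balanced")
```

The same helper can be applied to any other reaction by listing its terms in the same form.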

The breakdown of dissolved nitrate to $$\rm N_2$$ gas (denitrification) removes nitrogen from the biosphere and returns it to the atmosphere, thus completing the biogeochemical nitrogen cycle. The resulting $$\rm N_2$$ is enriched in the light isotope $$\rm ^{14}N$$ by up to 30‰, depending on the degree of denitrification. This light $${\rm N_2}$$ diffuses out of aqueous settings, leaving the residual dissolved nitrate isotopically heavy. Living and sedimentary organic matter strongly reflects the effects of this isotopic fractionation during denitrification. The process is an anaerobic respiration that can be expressed as:

(5.7)#\begin{align} 5{\rm CH_2O} + 4{\rm NO_3^-} + 4{\rm H^+} \rightarrow 5{\rm CO_2} + 2{\rm N_2} + 7{\rm H_2O}. \end{align}

The various reactions in the pathway are performed by a range of bacteria, most notably by some members of the genera Pseudomonas and Thiobacillus.
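The way progressive denitrification leaves the residual nitrate isotopically heavy can be illustrated with a Rayleigh distillation model. This closed-system approximation is an assumption introduced here, not something derived in the text. The fractionation of 30‰ is the upper bound quoted above, and the starting nitrate value of 5‰ is a hypothetical placeholder.

```python
def rayleigh_residual(delta0, epsilon, f):
    """Delta value (per mil) of residual nitrate when a fraction f remains.

    epsilon is how much lighter (per mil) the instantaneous N2 product is
    than the nitrate it forms from, so alpha = 1 - epsilon / 1000.
    """
    if not 0.0 < f <= 1.0:
        raise ValueError("f must be in (0, 1]")
    alpha = 1.0 - epsilon / 1000.0
    return (delta0 + 1000.0) * f ** (alpha - 1.0) - 1000.0

def instantaneous_product(delta_residual, epsilon):
    """Delta value (per mil) of the N2 being released at a given moment."""
    alpha = 1.0 - epsilon / 1000.0
    return alpha * (delta_residual + 1000.0) - 1000.0

DELTA0 = 5.0    # hypothetical starting nitrate value, per mil
EPSILON = 30.0  # upper-bound fractionation quoted in the text, per mil

for f in (1.0, 0.75, 0.5, 0.25, 0.1):
    residual = rayleigh_residual(DELTA0, EPSILON, f)
    product = instantaneous_product(residual, EPSILON)
    print(f"f = {f:4.2f}  nitrate = {residual:6.1f}  N2 = {product:6.1f}")
```

Under these assumptions the residual nitrate grows steadily heavier as the remaining fraction `f` shrinks, while the released $$\rm N_2$$ stays lighter than the nitrate it came from.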

Fig. 5.11 Secular variation of the sedimentary $$\delta \rm ^{15}N$$ record through geological time. Image Credit: Thomazo, Couradeau, & Garcia-Pichel (2018).#

The secular record of sedimentary $$\delta \rm ^{15}N$$ values shows little change from modern ranges back to the end of the Archean. This implies that an active biological nitrogen cycle with all of its modern complexity was operating as early as 2.5 Ga. Moreover, differing Archean environments show varying nitrogen isotope values, indicating that this complex biogeochemical cycle functioned in a similarly diverse geographic pattern.

#### 5.2.2.6. Molecular Biomarkers#

Certain hydrocarbons are recognizable derivatives of biological molecules. In mature sedimentary rocks, the functional groups are removed and multiple bonds are broken, but their carbon skeleton is intact and clearly indicates their parent biomolecule. When these molecules have a specific biological source (e.g., Domain Bacteria or the Phylum Porifera), they are called biomarkers. There are many such biomarker molecules, but most that survive in mature rocks are derived from cellular lipids (i.e., cell membrane and pigments). At high temperatures, the abundance and range of hydrocarbon molecules decrease markedly because long chains break down into methane. But it is now clear that hydrocarbon biomarkers can also be used to inform us about life’s early diversity.

Organic geochemical analysis of ancient rocks is fraught with difficulties.

1. In thermally altered rocks, the concentrations of the relevant compounds diminish to extremely low levels, making it very difficult to distinguish indigenous molecules from laboratory contamination. Usually only differences in isomer ratios and chain lengths serve to differentiate very old from very young molecules.

2. Old samples can be contaminated during the recent geological past by groundwater, subsurface microbes, surface lichens, or endolithic (living within rocks) bacteria. Contaminants can be distinguished because they often show preferences for handedness and detectable levels of radioactive $$\rm ^{14}C$$.

3. There have generally been many opportunities for post-Archean introduction of organic molecules from younger geological fluids. These can be difficult to recognize due to a similar thermal history. However, they can sometimes be identified by subtle differences in carbon isotope ratios or aromatic contents, or by evolutionarily inappropriate biomarker distributions (e.g., plant biomarkers in Archean rocks).
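To see why detectable $$\rm ^{14}C$$ is such a clear flag for young contamination, consider how quickly it decays. The sketch below assumes the conventional half-life of about 5,730 years, which is not stated in the text, and the sample ages are chosen only for illustration.

```python
HALF_LIFE_YR = 5730.0  # assumed half-life of carbon-14, in years

def fraction_remaining(age_yr):
    """Fraction of the original carbon-14 left after age_yr years."""
    return 0.5 ** (age_yr / HALF_LIFE_YR)

# Illustrative ages: one half-life, a late Pleistocene sample,
# and the end of the Archean (2.5 Ga).
for age in (5_730, 50_000, 2_500_000_000):
    print(f"{age:>13,} yr: {fraction_remaining(age):.3e}")
```

After roughly 50,000 years well under 1% of the original $$\rm ^{14}C$$ survives, so any measurable amount in an Archean rock must come from geologically recent carbon.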

Due to these problems, many studies before 1990 are now regarded as dubious, especially those of porous rocks allowing ingress of soluble contaminants. Several discoveries have reinvigorated the field. From the oil-bearing rocks in the $${\sim}1.7\ {\rm Ga}$$ McArthur Group from northern Australia, it is reasonable to suppose that where oil survives, so will biomarkers. The demonstration that oil exists in fluid inclusions in many Archean sandstones shows that it was generated before peak metamorphism of the host rocks, and that this maximum thermal episode occurred in Archean or early Proterozoic time.

Fig. 5.12 Age distribution of sediments from well-preserved organic matter. Image Credit: MIT:The Summons Lab.#

### 5.2.3. The Oldest Evidence of Life#

The most ancient rocks that could possibly contain evidence of life (i.e., the oldest deposited on the Earth’s surface) are within the $$>3.7\ {\rm Ga}$$ supracrustal successions at Isua and Akilia in western Greenland. These controversial outcrops have been both highly metamorphosed and severely deformed to such an extent that their exact age is now difficult to determine because most geochronologically useful isotopic systems have been reset.

Microfossils have been reported from Isua, but these have now been dismissed as either fluid inclusions or weathered carbonate crystals. Biomarker molecules have also been reported from these rocks, but the preservation of such complex chemicals under these severe conditions is even less likely than microfossils and experiments have shown that it is unlikely that they are indigenous. It seems that only isotopic geochemical data are robust enough to provide satisfactory evidence of life and even then, their interpretation can be fraught with problems. In particularly ancient terrains (e.g., early Archean of Greenland) several abiogenic sources could have contributed isotopically light carbon to the metamorphosed sedimentary rocks.

The Akilia rocks have undergone granulite facies metamorphism, which means they have been brought to high temperatures and been intensely deformed. The very high metamorphic grade suggests that if the organic carbon were originally biogenic, then the light hydrocarbon loss and carbonate re-equilibration should have made its initial $$\delta {\rm ^{13}C}$$ value lighter by as much as 25‰. Such extremely light values are only biologically produced by methane-fueled metabolisms that require free oxygen or high sulfate, for which there is no evidence so early in Earth’s history. Inorganic scenarios seem more likely now that it has been shown that the host rock is not a metamorphosed banded iron formation, but is apparently a highly deformed and metasomatized ultramafic igneous rock. Field relations indicate that the rocks may not be as old as previously believed, and apatite grains return $$\rm U-Pb$$ dates of just $${\sim}1.5\ {\rm Ga}$$. It is perhaps premature to regard the Akilia gneisses as the hosts of the oldest evidence of life on Earth.

If neither of the early Archean Greenland successions provides compelling isotopic evidence for the existence of life, what then? The next oldest rocks that might yield biological information are the $${\sim}3.52\ {\rm Ga}$$ Coonterunah Group in the Pilbara Craton of northwestern Australia and the similarly ancient Theespruit Formation in the Swaziland Supergroup of South Africa. These rocks have experienced metamorphic temperatures at about the level at which substantial resetting of carbon isotope values begins. They also contain organic carbon in quantities sufficiently large to analyze by conventional whole-rock methods.

The Coonterunah Group contains sedimentary carbonate with $$\delta {\rm ^{13}C_{carb}}$$ averaging about -2‰, and $$\delta {\rm ^{13}C_{org}}$$ averaging about -24‰. The resulting $$\rm \Delta^{13}C$$ of 22‰ can be compared with Proterozoic banded iron formations, which evidently formed in a similar sedimentary environment. The Coonterunah isotopic data are fully consistent with photoautotrophic fractionation and thus perhaps provide the oldest compelling evidence for life on Earth.
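The arithmetic behind the quoted fractionation is a simple difference between the carbonate and organic carbon values. The short sketch below assumes the definition $$\rm \Delta^{13}C = \delta^{13}C_{carb} - \delta^{13}C_{org}$$, which is inferred from the numbers above; the function name `big_delta` is hypothetical.

```python
def big_delta(delta_carb, delta_org):
    """Return the carbonate-minus-organic difference, all values in per mil."""
    return delta_carb - delta_org

# Average Coonterunah values quoted in the text, in per mil.
coonterunah = big_delta(-2.0, -24.0)
print(f"Coonterunah fractionation is about {coonterunah:.0f} per mil")
```

Because the result depends on both terms, a rock that preserves only one of the two carbon phases cannot yield this value directly.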

#### 5.2.3.1. Physical Characteristics#

Though there is little preserved evidence, it all suggests that the earliest organisms were morphologically simple. All genuine Archean microfossils are of spheroidal or ellipsoidal single cells that reproduced by binary fission, or of cylindrical sheaths that surrounded filamentous chains of linked cells. Usually the cells are aggregated into clusters with many representatives of each morphological type present. Though only a limited range of morphological and size diversity is evident in the Archean microfossil record, this does not necessarily mean taxonomic diversity was similarly restricted. Microbes are in general architecturally conservative, so beneath the disguise of a simple spheroid or filament may have lurked a wide range of metabolic styles and phylogenetic diversity.

Community organization was also apparently simple. Assemblages of microfossils are locally homogeneous, as are stromatolite microstructures. Benthic filaments (that lived on the floors of bodies of water) formed interwoven mats, and spheroids grew in dense clusters that provided protection for individuals against radiation, erosion, and desiccation. Some microbial communities were clearly very robust: many stromatolites in northwestern Australia, for example, were able to accommodate frequent deluges of coarse volcanic ash into their structures.

The largest late Archean filamentous microfossils are no more than a few hundred microns in length. Individual cells must have been much smaller with the greatest possible diameter allowed by the filaments being $${\sim}20\ {\rm \mu m}$$. The layering of stromatolites also provides a proxy gauge for the size of the constructing organisms. Evidently the Archean biota was entirely microbial.

#### 5.2.3.2. Environmental Preferences#

It used to be dogma in geology textbooks that the Archean environment was exclusively hot, volcanic, and deep-water, but this is not so. A wide range of environmental settings is represented in the Archean geological record and life was present in many of them by the end of the eon.

• Benthic organisms (living on the sea floor) clearly flourished in the photic zone, where sunlight is present (e.g., stromatolites). Sediment-accreting microbial mats formed wherever light was available and other conditions (e.g., temperature and pH) were suitable. Periodic exposure and strong currents were not much of a restriction.

• Marine plankton (organisms that float) must also have been abundant, because (kerogenous) shales deposited well below the photic zone are widespread in Archean deep-water environments. The organic material must have been largely produced by organisms living at shallow depths. Some late Archean black shales deposited at considerable depths contain hydrocarbon biomarkers for photosynthetic cyanobacteria, which can only have lived in the photic zone as plankton.

• Both benthic and planktonic microbes also inhabited non-marine aqueous settings (e.g., lakes). A wide range of stromatolite morphologies occurs in lacustrine (deposited in lakes) rocks, occupying similar depth and salinity ranges as comparable marine stromatolites.

Despite much debate, it is still unclear whether land surfaces were inhabited during the Archean. There has been a long-held belief that before the advent of an oxygenated atmosphere with its ozone shield, life on land would have been impossible because the high solar UV flux would have destroyed the DNA of exposed organisms. However, endolithic and soil microbes could have thrived where UV light could not penetrate. The oldest noncontroversial record of life on land dates from only $$1.2\ {\rm Ga}$$, but colonization of the land was undoubtedly earlier because land settings are prone to erosion and thus unlikely to preserve relics of life.

Another widespread belief about early organisms is that they were thermophilic, based on the heat-loving tendencies of many modern microbes that appear to be ancient based on their basal branching on ribosomal RNA phylogenetic trees. It is difficult to determine from the geological and paleontological records whether the Archean biota in fact reflected this tendency. A constraint on temperature is provided by the sequence of sulfate minerals precipitated from evaporating seawater. As relics of halite ($$\rm NaCl$$) are preserved in old marine sediments elsewhere, the evaporite brine was probably rich in chloride. This implies that the water in which the stromatolites formed was cool and thus that the earliest microbial community was not dominated by thermophiles.

#### 5.2.3.3. Phylogenetic Relationships#

To understand how Archean organisms were related to their more modern descendants, we have to use the “Tree of Life,” which is the family tree of all species living today and is derived from the comparative sequences of their ribosomal RNA molecules. This phylogeny works on the assumption that slowly evolving and universally shared molecules will reveal evolutionary relationships through the history of life. Though it is a powerful device for exploring early evolution, it has significant drawbacks.

1. It is necessarily constructed from data derived from modern organisms; nothing extinct appears anywhere on the tree (i.e., survivor bias). In reality, the tree should be a lot bushier than it is.

2. It assumes a linear descent of characters, but it is now known that significant lateral gene transfer has occurred, particularly in prokaryotic organisms. This means that the simple branching tree structure is somewhat unrealistic.

3. It has no temporal scale, only an ordering; the length of branches is related only to evolutionary distance, not geological time.

Despite these shortcomings, the RNA Tree is still the best guide we have to the pattern of early evolution.

Fig. 5.13 A phylogenetic tree of living things, based on RNA data and proposed by Carl Woese, showing the separation of bacteria, archaea, and eukaryotes. Image Credit: Wikipedia:phylogenetic tree.#

The most obvious feature of the RNA Tree of Life is its three main divisions, termed Domains.

• The Bacteria represent most known prokaryotes (i.e., single-celled organisms without a nucleus).

• The Eukarya (eukaryotes) are organisms with complex cells containing organelles.

• The Archaea are prokaryotes with a very different cell wall composition, membrane stiffening molecules, gene structure, DNA binding, and RNA transcription. The only evidence for the existence of the Archaea before $$0.5\ {\rm Ga}$$ comes from extreme depletions in $$\delta \rm ^{13}C_{org}$$ in sediments aged between $$2.8-2.6\ {\rm Ga}$$ imparted by methanogenesis.

Clear evidence for Bacteria in the Archean is provided by the presence of bacterial biomarkers in sediments of $$2.72\ {\rm Ga}$$ and younger. A biomarker for cyanobacteria is also present in these sediments. But an earlier date for the appearance of Bacteria can be obtained from the sulfur isotope systematics of $${\sim}3.47\ {\rm Ga}$$ baritic sediments in the Warrawoona Group. Given the current knowledge of the phylogenetic distribution of thermal adaptations among sulfate reducers, the Warrawoona $$\delta {\rm ^{34}S}$$ data suggest a minimum age of $$3.47\ {\rm Ga}$$ for the appearance of Bacteria.

The origin of the domain Eukarya can also be traced back to the Archean. The diversity and abundance of indigenous sterane biomarkers (steroid precursors) in sediments of the $$2.8-2.5\ {\rm Ga}$$ Groups of northwest Australia indicate the presence of eukaryotes in the Archean ecosystem. Steroids act as rigidifying molecules within the lipid bilayers that make up cell membranes, and provide eukaryotes with the ability to engulf large particles, leading to the possibility of endosymbiosis of organelles. They are intimately involved with the evolution of microstructural complexity within cells, and are necessary precursors to the development of multicellularity, tissue differentiation, and sex.

Some steranes are characteristic of particular subdivisions of eukaryotes, but none of those are abundant in the Archean biomarker suites. It is therefore impossible to determine which types of eukaryotic organisms were present on the early Earth. They may have been single-celled algae, but there is nothing in the biomarker data that precludes large multicellular organisms as complex as red algae or even animals.

### 5.2.4. Astrobiological Implications#

It is clear that this planet was inhabited within a billion years of its formation. Not only that, but the organisms present $$3.5\ {\rm Ga}$$ were as complex and competent in their cellular and biochemical functioning as the bulk of the modern biota. Given that similar conditions may have been present on Mars at this time, it too may have evolved biotas of some complexity over a short habitable window.

All three main lineages of life were present by the end of the Archean at $$2.5\ {\rm Ga}$$. Just what controls how fast intelligent life evolves on a planet is unknown, but it now seems unlikely that it is the acquisition of a complex cell structure. Eukaryotes have been around for at least 60% of Earth’s history, but intelligent technological eukaryotes evolved only in the last 0.1% of that time. Accordingly, it seems that attempts to detect extraterrestrial life directed towards dumb germs and worms are more likely to be successful in the near-Earth region of space.

Any such investigations of planetary bodies in the Solar System will require many of the techniques and strategies used in the search for Archean life here on Earth. Exploring for signs of extinct life on Mars should be quite analogous to searching for biosignatures in old Earth rocks. As suitable Archean rocks for paleobiology tend to occur in remote places and in inhospitable environments, perhaps it would make sense to include an expert in Archean paleobiology in the first Martian exploration party.

## 5.3. The Origin and Diversification of Eukaryotes#

Prokaryotes are microorganisms whose cells have no nucleus, organelles, or tubulin-based cytoskeleton. These microorganisms were the only form of life for at least 80% of our evolutionary history. Multicellular organisms (e.g., plants, animals, and fungi) evolved a mere $$0.5-1.0\ {\rm Ga}$$ from single-celled eukaryotic ancestors. Eukaryotes are organisms whose cells have flexible cytoskeletal structures, organelles, and chromosomes bound within a nuclear membrane. Cyanobacterium-like fossils suggest that life emerged at least $$3.45\ {\rm Ga}$$, but the biogenic origins of these structures are contested. The chemical record documents prokaryotic metabolisms that may have existed $$3.47-3.85\ {\rm Ga}$$ and eukaryotic biosignatures that may be as old as 2.7 Gyr.

Single-cell organisms appear simple, yet they transformed the atmosphere, the waters, and the subsurface of the Earth. Even today, single-cell eukaryotes (i.e., members of the group Protista) and prokaryotes continue to dominate our biosphere in terms of genetic diversity, total biomass, and metabolic activity. Microbes are usually members of complex communities and are directly or indirectly responsible for all of Earth’s biogeochemical cycles. The multicellular world is completely dependent upon diverse microbial species for continued survival. If life exists on a world beyond Earth, it would be microbial, either now or in its earliest days. Understanding Earth’s evolutionary history or the kinds of life most likely to be encountered beyond Earth requires that we study the history, habitats, ecology, and full diversity of microorganisms that shape our biosphere.

### 5.3.1. Microbial Diversity#

Biodiversity experts claim that the number of different plant, animal, and fungal species ranges from $$1-10$$ million. Contrast this with our current census, based on conventional methods (using morphology) of only $${\sim}\text{5,000}$$ species of prokaryotes and $${\sim}\text{200,000}$$ species of protists. But this census is argued to understate the number of microbial species by orders of magnitude. It is very difficult to identify phenotypic traits that can inform us about how to distinguish microbial species and study their interrelationships.

There are no widely accepted standards for interpreting the phylogenetic (i.e., evolutionary) significance of similarities in microbial cell morphology and other traits over large evolutionary distances, and the geological record is silent for most microbial groups. Even if there were a consensus about an effective set of phylogenetic markers, changes in the genomes of species are not necessarily closely linked to phenotypic variation. “Natural systems” based upon resemblance are useful for the taxonomist, but this method alone is not sufficient to reconstruct the evolutionary history of distantly related organisms.

### 5.3.2. Molecular Phylogeny#

Deciphering phylogenetic relationships among microorganisms requires an objective measure of relatedness. Physical appearance is insufficient, but most of these creatures have small genomes that are more amenable to DNA sequence analyses (as compared to the larger genomes of plants and animals). Today molecular data provide a practical metric for assessing biodiversity in the context of evolutionary history. The comparison of DNA sequences that share a common ancestry makes possible the measurement of genetic differences that are common to members of populations, species, or kingdoms.

The ideal strategy for reconstructing evolutionary history would be to sequence the entire genomes of all species that might inform us about microbial phylogeny. Protists can present technological challenges because their chromosomes can be large and a single genome can contain millions (or even billions) of base pairs. If we sequenced entire genomes, adequate sampling of the relevant groups of organisms (i.e., taxa) for molecular evolution studies would require impractically large resources. However, gene sequencing technology has made leaps over the past decade with many medical applications (Koboldt et al. 2013).

To construct evolutionary trees, specific genes are isolated from the genome. Because different genes, domains within a gene, or specific nucleotide positions can display dissimilar rates of evolutionary change, the selection of a particular gene (or suite of genes) for phylogenetic analysis depends upon the evolutionary distances separating the organisms. For example, for closely related species, slowly evolving genes may show an insufficient number of differences to resolve evolutionary patterns. At the opposite end of the spectrum, sequences for rapidly evolving genes are of little value for inferring phylogenies that span the major domains of life. Suitable genes should satisfy several criteria:

• The compared genes must have a common evolutionary history and perform the same role in all organisms in the sample.

• In addition, such genes must change slowly enough to allow us to unambiguously identify the nucleotide positions that share a common ancestry.

• To infer evolutionary relationships, compared sequences must contain a statistically significant number of variable sites.

• Ideally the sequences should also span several functional domains (i.e., regions of the genome that serve different purposes). This condition allows recognition of convergent evolution, which is when structures or molecules come to resemble each other even though they have independent evolutionary sources (e.g., wings of birds, bats, and pterosaurs). If identical trees are found in two sequence domains that can vary independently, we have divergent evolution from a common ancestor.

• The genes defining the macromolecules must not be transferred between species. If lateral transfer has occurred, the inferred phylogeny will be that of the genes, not of the organisms.
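The criteria above can be made concrete with a toy alignment. The sketch below counts variable sites and pairwise differences for a few aligned sequences. The sequences and taxon names are hypothetical, invented purely for illustration, and a real analysis would rely on a proper alignment and dedicated phylogenetics software.

```python
# Hypothetical aligned sequences of equal length (not real data).
ALIGNMENT = {
    "taxon_A": "ACGTACGTACGTACGTACGT",
    "taxon_B": "ACGTACGTACGAACGTACGT",
    "taxon_C": "ACGTTCGTACGAACGTACCT",
    "taxon_D": "ACGATCGTACGAACGTTCCT",
}

def variable_sites(alignment):
    """Return the column indices at which the sequences are not all identical."""
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]

def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

sites = variable_sites(ALIGNMENT)
print(f"{len(sites)} variable sites at columns {sites}")
print("A vs B:", p_distance(ALIGNMENT["taxon_A"], ALIGNMENT["taxon_B"]))
print("A vs D:", p_distance(ALIGNMENT["taxon_A"], ALIGNMENT["taxon_D"]))
```

In this toy case the closely related pair differs at a single position, which illustrates why a slowly evolving gene may carry too few variable sites to resolve relationships among close relatives.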

Most of what we have learned about the evolution of microbes comes from studies of the ribosomal RNA (rRNA) genes. These genes are common to the protein synthesis machinery of all cells. Comparisons of subunit rRNA led to a new understanding of the Universal Tree of Life and its three primary lineages. Understanding how the primary lineages are interrelated is a question of great interest. A simplistic interpretation of the Universal Tree of Life is that the history of life on Earth was a series of bifurcating events leading to millions of terminal nodes. The competing view is that Earth’s current biodiversity was strongly shaped by frequent lateral gene transfer, which would make it difficult to identify the earliest diverging lineages within each of the primary lines of descent. Through exploration of evolutionary history, we can also address other issues that are important to astrobiologists.

### 5.3.3. Molecular Ecology#

Cyanobacteria in the ocean produce more oxygen (via photosynthesis) than do all higher plants. Microorganisms also maintain the global carbon cycle primarily by producing and consuming $$\rm CO_2$$. Single-celled organisms control global utilization of nitrogen through nitrogen fixation, nitrification, and nitrate reduction. Microbes impact every ecosystem on Earth, yet we still know very little about the microbially mediated mechanisms that control key biogeochemical cycles.

Molecular evolution studies have provided new frameworks and tools for deciphering the complexity of microbial ecology. It is now possible to identify all microorganisms within a sample without cultivating them species by species in the laboratory. Surveys targeting natural populations of prokaryotes have revealed largely unexplored microbial diversity in geothermal environments, cold oceans, temperate aquatic and marine environments, soils, and animal digestive tracts. A plethora of novel prokaryotic microorganisms has been revealed by molecular inventories of places such as hydrothermal anoxic sediments, the general ocean, and various acidic/alkaline drainages.

### 5.3.4. Molecular Studies of Protists#

Protists are an eclectic assemblage of predominantly unicellular eukaryotes. Free-living protists thrive in diverse environments, while some species parasitize other protists, fungi, plants, or animals. Success in molecular phylogeny requires that the DNA sequences selected for analysis reflect the historical evolution of the relevant organisms. In 1987, Carl Woese recognized the extraordinary potential of ribosomal RNA for inferring phylogenies that include all cellular life forms.

Ribosomal RNA is ideal because it evolves very slowly as compared to most protein-coding genes, and it is also the functional core of the information translation machinery found in a cell’s ribosomes (or protein synthesis organelles). Yet rRNA sequences can display rapidly evolving regions interrupted by highly conserved (slowly evolving) elements. The rapidly changing sequences are useful for measuring evolutionary distance between closely related species in the same genus, while the highly conserved sequences are used to infer relationships that span large evolutionary distances.
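One way to picture the difference between conserved and rapidly evolving regions is to score each column of an alignment by its Shannon entropy: identical columns score zero, and columns where every sequence differs score highest. The sketch below is only an illustration. The sequences are hypothetical, and both the entropy measure and the 0.5-bit threshold are assumptions chosen for this example rather than a method described in the text.

```python
import math
from collections import Counter

# Hypothetical aligned fragments (not real rRNA data).
SEQS = [
    "GGCTAACTCCGT",
    "GGCTAACTACGA",
    "GGCTAACTTCGC",
    "GGCTAACTGCGG",
]

def column_entropy(column):
    """Shannon entropy (bits) of the characters in one alignment column."""
    n = len(column)
    probs = [count / n for count in Counter(column).values()]
    return sum(-p * math.log2(p) for p in probs)

def classify_columns(seqs, threshold=0.5):
    """Label each column 'conserved' or 'variable' using an arbitrary threshold."""
    return [
        "variable" if column_entropy(col) > threshold else "conserved"
        for col in zip(*seqs)
    ]

labels = classify_columns(SEQS)
variable_columns = [i for i, label in enumerate(labels) if label == "variable"]
print("variable columns:", variable_columns)
```

Columns flagged as variable would be the informative ones for comparing close relatives, while the conserved columns anchor comparisons across large evolutionary distances.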

Early rRNA phylogenies for protists claimed that protists without mitochondria were ancestral to all other eukaryotes. These organisms lack many of the organelles found in other eukaryotes. The molecular trees based upon rRNAs also described new complex evolutionary assemblages. Measured in terms of lifestyle and phenotypic variation, some major protist clades appear to be as complex as plants, animals, or fungi. But the several radiations in the eukaryotic rRNA trees make it difficult to define the major protist clades.

Fig. 5.14 Both of these phylogenetic trees show the relationship of the three domains of life—Bacteria, Archaea, and Eukarya—but the (a) rooted tree attempts to identify when various species diverged from a common ancestor while the (b) unrooted tree does not. Image Credit: OpenStax:Biology.#

#### 5.3.4.1. Multi-gene Studies and the Archezoan Hypothesis#

One of the striking features of the early molecular “trees” was the basal location of amitochondriates (protists without mitochondria), such as microsporidia and Giardia. If these lineages were truly primitive, the transition more than $$2.4\ {\rm Ga}$$ from a reducing atmosphere to one that contained oxygen would not have been a prerequisite for the formation of eukaryotic cells. Thomas Cavalier-Smith introduced the term Archezoa to refer to the apparently most primitive eukaryotes. In one model of how the major eukaryotic organelles were assembled, those lineages appearing before the acquisition of mitochondria occurred prior to the largely unresolved radiation of many eukaryotic lineages.

To address questions about the identity of the earliest eukaryotes and to test the rRNA scheme, several groups of researchers shifted their attention to gene families other than rRNA. In some cases, these multi-gene phylogenies confirmed what we have learned from rRNA comparisons, while in others, consistent disagreements suggest alternative scenarios for the evolution of protists. It is now evident that the simple division of the eukaryotes into two broad territories is incorrect.

• Many of the taxa presumed to have been originally amitochondriate have genes which might be derived from mitochondria.

• The “primitively amitochondriate” taxa are now mostly regarded as derived from crown taxa by loss of mitochondria.

Conflicting molecular trees suggest that deep-branching protists might be artifacts of long branch attraction between rapidly evolving, basal eukaryotic branches and distantly related archaeal and bacterial out-groups. This conflict among genealogies has led to a biological Big Bang Hypothesis (Philippe et al. (2000)), which states that all extant eukaryotes are descendants of a sudden evolutionary radiation that occurred about $$0.6-1.0\ {\rm Ga}$$.

This interpretation has its problems. The paleontological record offers evidence of $$1.8-2.1\ {\rm Ga}$$ eukaryotic microfossils and $$2.7\ {\rm Ga}$$ Archean molecular signatures in the form of steranes associated with eukaryotes. In addition, the biological Big Bang Hypothesis does not account for the consistent hierarchical relationships evident in combined, multi-gene studies.

#### 5.3.4.2. Sampling Bias and Other Problems#

Our models for understanding how complexity evolved in the eukaryotic cell are in flux. For example, interpretations of early molecular trees argued that the most basal protist lineages lacked both mitochondria and introns (i.e., the non-coding segments of genes). The absence of introns in early diverging eukaryotes and most prokaryotes suggested their relatively late introduction into microbial genomes. If mitochondria were not present or required for eukaryotic life, protists could have appeared prior to the advent of significant oxygen within Earth’s atmosphere. However, the discovery of bacterial-like molecular chaperones in putatively deep-branching amitochondriate protists suggests that symbionts ancestral to mitochondria could have been present in the early stages of eukaryote evolution. The absence of mitochondria may be a secondary adaptation to anaerobic environments.

Our molecular perspective of early eukaryote evolution is strongly biased by the limited selection of taxa and gene families available for molecular studies. Broad scale genomic sampling from taxa that might represent basal lineages in the eukaryotic line of descent is very limited. Alternative mechanisms of genome evolution might explain discrepancies between molecular trees inferred from different protist gene families. Conflicting phylogenies could be due to:

• paraphyletic relationships inferred from comparisons of genes that were duplicated in an ancient ancestor,

• horizontal gene transfer mediated by fusion of two or more genomes,

• endosymbioses, or

• viral-mediated, cross-species transfers.

Single-gene phylogenies cannot resolve which of these mechanisms might be major factors in the evolution of eukaryotic genomes. To understand phylogenetic patterns for protists, we must combine more extensive molecular and phenotypic data (such as morphological features and biochemical capabilities) from as many taxa as possible.

Algorithms that inadequately compensate for different rates of evolutionary change can incorrectly cluster together the slowly evolving sequences and describe the recently diverged, rapidly evolving sequences as deep phylogenetic branches. Genes that share a common ancestry might display differential rates of evolution in distinct evolutionary lineages for many reasons:

• environmental shifts requiring a large number of changes at the amino acid level in order to maintain function,

• altered mutation rates occurring in response to increased errors by DNA polymerases, or

• reduced ability to repair DNA replication errors leading to an increase in nucleotide substitution patterns.
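One way to see why rate compensation matters is to compare the raw fraction of differing sites between two aligned sequences with a distance corrected for multiple substitutions at the same site. The sketch below uses the Jukes-Cantor correction, which is not named in the text above; it is assumed here only because it is the simplest standard model. The function names `p_distance` and `jukes_cantor` are illustrative.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites that differ (no correction for multiple hits)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor estimate of substitutions per site from the raw fraction p.

    Assumes equal base frequencies and equal substitution rates, which is a
    strong simplification for real genes.
    """
    if p >= 0.75:
        # Saturation: the observed differences no longer constrain the distance.
        return math.inf
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Compare the raw and corrected distances for slowly and rapidly evolving pairs.
for p in (0.05, 0.30, 0.60):
    print(f"raw p = {p:.2f}  ->  corrected distance = {jukes_cantor(p):.3f}")
```

For slowly evolving pairs the raw and corrected values nearly agree, but for rapidly evolving pairs the raw value increasingly underestimates the true amount of change. A method that relies on under-corrected distances therefore misjudges the lengths of fast-evolving branches, which is one ingredient of the long branch attraction artifact described above.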

## 5.4. The Phylogenetic Tree and LUCA#

All life on Earth shares a common chemical basis and fundamental genetic attributes, which implies a common ancestry. Even the simplest forms of life known are chemically complex, and we descend from a very sophisticated last universal common ancestor (LUCA). LUCA may have been an individual organism or an ecosystem of organisms that shared genes with one another. LUCA is likely to have been quite different from any organism alive today.

Many varieties of animals have descended from the Cambrian explosion, when the ancestral animals evolved into many phyla (the largest category within the animal kingdom, which categorizes basic body structure). A more recent example is that chimpanzees and humans shared a common ancestor several million years ago, but this ancestor was neither like a chimp nor a person. Evolution is not a forced march to increasing complexity, and most paths wind up being evolutionary dead ends.

The known organisms located near the root of the phylogenetic tree are the Archaea and Bacteria that exhibit fewer genetic characteristics exclusive to their domains. These prokaryotes are probably the most similar modern life forms to LUCA. They are thermophiles (i.e., they thrive at high temperatures; $${\sim}330\ {\rm K}$$). This observation suggests that life may have also formed at such high temperatures, where environmental stresses on early Earth may have destroyed nonthermophilic organisms. See this material from Georgia Tech on how to read evolutionary trees.

## 5.5. Young Earth and Early Life#

Planet formation is a violent process, and large environmental variations on a young planet can be very hostile to life. Impact rates were extremely high during the Earth’s early history (see Fig. 5.15), and the substantial heating of the atmosphere, hydrosphere, and upper crust by massive objects may have caused an impact frustration of life (see Fig. 5.16).

Fig. 5.15 Curve showing cratering rate as a function of time (relative to today’s cratering rate) derived from the Morbidelli et al. (2018) accretion tail model. Image Credit: Hartmann & Morbidelli (2020).#

A late large impact may well have heated the upper portions of our planet sufficiently to have killed off all nonthermophilic organisms. Thus, the thermophile root of the phylogenetic tree might be the result of an impact-induced heating event that killed off all life not tolerant of high temperatures. In this case, life could have originated in either a cool or a hot environment. Life and the planet(s) that it occupies can affect each other in many ways, and it is not possible to have a complete understanding of one without knowledge of the other.

Fig. 5.16 The effects of impactors by size (radius) on the Earth and when such impacts have occurred within Earth’s history. Image Credit: Lissauer & de Pater (2013).#

## 5.6. Homework#

Problem 1

Summarize the three points of view for the origins of life on Earth. Evaluate the falsifiability of each point of view in your summary.

Problem 2

Describe the Oparin-Haldane hypothesis in your own words and summarize a famous experiment that sought to test parts of the hypothesis.

Problem 3

What are the two main architectures for cells? Summarize how these cell architectures are organized.

Problem 4

Explain the roles that proteins, nucleic acids, and lipids play for the function of a cell.

Problem 5

Describe the problems concerning the origins of life itself versus the origins of only organics. What do chemists mean by the term organic?

Problem 6

DNA forms the basis for our understanding of modern life. Summarize the theories for a world before DNA when replicator RNA dominated.

Problem 7

Explain how, according to metabolism-first theories, basic life processes could have been carried out.

Problem 8

What are the problems with the earliest life found in the geologic record? What type of evidence do we have of the earliest lifeforms?

Problem 9

How are phylogenetic relationships deduced from the RNA Tree of Life? What are the three main divisions, or Domains, of the tree?

Problem 10

What is meant by a last universal common ancestor (LUCA)?

Problem 11

How could impactors on the early Earth have affected the origins of life?