The Subulate Palpomere

The Passing of the Tree of Life Web Project

Posted on 2 November 2025 by David Maddison

The Tree of Life Web Project began its journey almost 40 years ago, and was formally announced in early 1996. It has served thousands of pages of information about the evolutionary tree of life and the characteristics of organisms that evolved along its branches; this content was contributed and curated by hundreds of biologists who were experts in particular branches of life.

I no longer have the resources to keep the project functioning as it has been, and thus it will be retired soon (by the time you read this, it may have been retired). I have hopes that it will be reborn in a more primitive way as a series of static web pages that capture a snapshot of the content it contained when it ended. If so, you should eventually see, at its old web address http://tolweb.org, a basic version of the content.

Early history

The first glimmerings of the Tree of Life Web Project (ToLWeb) began sometime in the late 1980s, when I was working on the computer program MacClade instead of working on my PhD thesis. MacClade was a phylogenetic analysis program for the Apple Macintosh that displayed interactive phylogenetic trees. Inspired by HyperCard, I planned to add a feature to MacClade that would allow the taxa at the tips of the branches of the tree on the screen to be hypertext links to MacClade files that showed trees for those less-inclusive taxa. The idea was that when the user touched on, say, the terminal taxon “mammal” in a tree of vertebrates, MacClade would close down the current file, and open the mammals file, and present the user with the stored tree for mammals. This seemed like enough of a pain to implement (and really, the thesis was waiting) that the idea was shelved until sometime around 1993, when the idea resurfaced.

The connection of phylogenetic trees with hypertext links is certainly not unique to the Tree of Life Web Project, and a number of other projects have used and use similar formats. Perhaps the two most well-known examples other than ToLWeb are LifeMap, an excellent CDROM produced in 1992 by the California Academy of Sciences, and University of California at Berkeley’s Phylogeny of Life project. But there have been and now are others, and there are several websites that contain classifications connected via hypertext links.

In 1993, I did some more thinking about this. The obvious medium in which to present these linked pages was no longer MacClade, but the youthful World Wide Web. At first, the vision was to have the system just be a way to organize a bit of content and other web sites in a phylogenetic fashion. The vision grew, however, and the idea was formed of having the Tree itself distributed around the world, with different branches residing on different computers, and with a worldwide collection of experts authoring informative pages for their particular branches. I presented to my brother Wayne the idea of having a global Tree of Life on the Web. It wasn’t until the summer of 1994 that he finally convinced me this was worth doing, and we got off our rears (the theses were long finished at that point, which was a shame, as theses are very productive for the displacement activities they inspire).

This would take a long time to do if specialized tools were not available to somewhat automate the process. I suggested adding features to MacClade that would produce formatted HTML pages containing trees. Wayne took the bull by the horns and added the first versions of these tools to MacClade. The first version of the Tree, put online in prototype form on 16 November 1994, was written entirely using this version of MacClade. Later that year I took over the development of these tools.

I sent out some notices on some relevant lists (e.g. entomo-l, TAXACOM, etc.) asking for suggestions and contributors, but we did not announce the project formally to the Internet community as a whole. Over the next few months the Tree went through some major appearance changes, in response in part to these suggestions.

Addition of remote branches and formal announcement of the Tree

In the early days, all pages on the Tree were on the home site in Tucson, Arizona. On 1 June 1995, the first remote branch of the Tree was added, the crayfish pages by Keith Crandall (U. Texas). These files resided on a computer in Austin, Texas. In the following months, a number of remote branches were added. Some of the branches of the Tree that were authored by people other than Wayne or me, and which were attached to the Tree when the Tree was first formally announced, include (among others) Peter Beerli’s Western Palearctic water frogs, Scott Stockwell’s scorpion pages, and John Lundberg’s pages on Chordata, Vertebrata, and various fishes.

On 5 January 1996, when the Tree of Life project was first formally announced, the Tree itself contained 948 pages, housed in seven computers on two continents.

Growth of the Tree, with conversion to dynamic content

After the formal announcement, the project quickly grew, with hundreds of biologists joining the effort. New features were added; one of my favorites was “treehouses”, which were collections of content written by or for children and a more general audience, and which focused on a particular group of life. Treehouses were attached to that group’s branches of the tree.

In February 2002, after a huge amount of work by three talented programmers, a version of the project debuted with a very different internal structure, with the content being produced dynamically by a database, rather than a series of static HTML pages. This change required all content to be moved back to the server in Arizona so that it could reside in a single database. In 2009 the server was moved to Oregon State University, where it has resided ever since. Alas, the software that runs ToLWeb is now so out of date that it poses a security risk, and so it needed to be disconnected from the Internet.

More details of the history of the project between 1996 and 2007 are provided in Maddison et al. (2007).

For the most part new content stopped being added to the project in 2011. There are multiple reasons for this, including a lack of resources to encourage and enable contributions, and to rewrite the underlying code so that it worked on more modern operating systems. Another significant factor was the rise of other projects such as the Encyclopedia of Life, which at the time captured the attention of many of our contributors. Perhaps most importantly, I simply didn’t have the time to champion the project as much as it needed in order to thrive. Contributions slowly petered out.

The last person to contribute content to the project was Dick Young, an expert in cephalopods, who put up many pages filled with the wonders of octopi, squids, and related animals. His enthusiasm over the years and the incredible diversity he showcased inspired me over the last decade to do what little I managed to do with the time I had available, and I wanted to keep it alive for him as much as anyone. And, really, how could you not fall in love with the images and information Dick provided of a little creature like Cranchia scabra, shown below on one of Dick’s pages.

There were many other key contributors to the project. Most significantly there were the members of the “home team”: Wayne, especially in the early days; Travis Wheeler, Danny Mandel, and Andy Lenards, the three excellent programmers who created the underlying code of the database-driven version and made it all work; and especially Katja Schulz, without whose careful managerial and editorial work across many years the project would not have thrived.

In the end there were over 540 biologists from over 35 countries participating, contributing to the groups in which they were experts, and well over 10,000 pages of biological content.

At its peak (about 2006) ToLWeb was visited by over 2,500,000 unique visitors per year, from 198 countries and independent states. By 2023 the number of visitors per year dropped down to about 125,000, but even now I get emails from teachers telling me how important it is for their teaching.

The Future that was not to be

The endpoint of ToLWeb was so far from what our goals had been; it never reached our dreams. We had imagined a more flexible database that could allow alternative hypotheses about the branching pattern of the tree; the content atomized as content units with flexible criteria to map them onto nodes on the chosen tree; automatic inclusion of content from other databases, and numerous methods to map that content to particular nodes; synthesis of this content into a tree decorated with content. We had imagined a vastly better interface, less page-oriented and more seamless, with the ability to wander through the tree more fluidly. We wanted a well-illustrated page for every species (for the Tree is not just branches, it is also “leaves”). There were other dreams, but they all faded under the reality of time.

The Future that may be

There are now several excellent projects that embrace some of the vision of ToLWeb.

The Open Tree of Life (https://opentreeoflife.org) is a well-designed project into which the phylogenetic results from the primary literature are deposited, combined with existing taxonomic classifications, and algorithmically synthesized into a single, browsable tree of life. Although I suspect this way of producing the core tree structure will be the way of the future, and the group of people behind it are superb, its current state leaves a lot to be desired. There are two major problems: (1) the vast majority of phylogenetic knowledge has not yet been entered into their database, and for most species there are no phylogenetic hypotheses; (2) the algorithm used to synthesize the trees and classifications leads to some notably incorrect results. The lack of inclusion of phylogenetic hypotheses has many causes; it’s not an easy problem to solve. The problematic results of the synthesis algorithm are also likely not easy to solve.

As one example of the tree presented in Open Tree, consider the root of arthropods in the tree. This is what that section of the tree looks like:

Note the taxon “Neelida”, toward the bottom. That is a group of springtails, which should not be shown in a basal polytomy with other major groups of arthropods, for it belongs fully within one of the other groups shown at the tips of the tree – Hexapoda (highlighted in blue). Why Neelida, Sminthuridida, etc. are shown outside of Hexapoda I don’t know, but such issues are abundant at least in insects. In some groups (e.g., the ground beetles that I work on) these problems abound, with many species falling not within their groups but seemingly randomly attached to some much deeper node. The tree in those groups is pretty much nonsensical. Thus, using the tree provided by Open Tree in analyses within some groups can lead to inaccurate calculations; for the same reason the tree in portals such as One Zoom (which uses Open Tree as its primary source) should also be viewed with great caution. Again, this is not a fault of the vision of Open Tree, but rather its current status. Perhaps more funding of Open Tree will allow it to live up to its promise.

Other efforts, such as that presented by Del Risco et al. (2024) and embodied in their Tree of Life App for mobile devices, contain curated collections of phylogenetic hypotheses, synthesized into a singular view. Their app’s content is less automated than Open Tree, which means it requires more human effort to add components to the Tree, but has the advantage that the components are more vetted. It is not currently organized as a collaborative project the way ToLWeb was organized, which had hierarchically organized communities of researchers working on ever smaller branches of the Tree.

The Encyclopedia of Life, https://eol.org, although containing a lot of great information, does not fundamentally embrace the phylogenetic vision and navigation that I think is vital.

I still believe there is an important place for a project like ToLWeb and its dreams: a phylogenetic view of all life, curated and synthesized by experts, allowing for alternative hypotheses, displayed and navigated fluidly in a single portal, onto which is layered text (including organismal characteristics), media (photographs, drawings, videos), and other data about the species and clades of the Tree.

A passing

And so the Tree of Life Web Project will transition to another phase. It likely will be offline for a while, as we work on a way to preserve its content for others to use. I should know in the next few months whether this is possible.

In the meantime, it should be available on the Wayback Machine (https://web.archive.org/web/20251003172115/http://tolweb.org/tree/). Let’s hope it eventually has a life outside that archive.

But if not, I hope the project has served as an inspiration to others to create something even better in all ways.

References

Del Risco, A.A., Chacón, D.A., Ángel, L. and García, D.A. (2024), Assembling an illustrated family-level tree of life for exploration in mobile devices. J. Syst. Evol., 62: 993-1008. https://doi.org/10.1111/jse.13053

Maddison, D. R., K.-S. Schulz, and W. P. Maddison. 2007. The Tree of Life Web Project. Pages 19-40 in: Zhang, Z.-Q. & Shear, W.A., eds. Linnaeus Tercentenary: Progress in Invertebrate Taxonomy. Zootaxa 1668:1-766. https://doi.org/10.11646/zootaxa.1668.1.4

Posted in Phylogenetics, Tree of Life | Tagged evolution, nature, phylogeny

A LUCA in our Courtyard

Posted on 18 February 2025 by David Maddison

One of my recent academic journeys led me down an unexpected path, to a celebration of art and science in the building I work in. I was fortunate to follow this path with others, including the artist who created the art that now lives in the building’s courtyard, and forms its core. It was one of the most moving and memorable experiences of my career.

Cordley Hall

Cordley Hall is a large building at Oregon State University, housing the Department of Integrative Biology (IB), the Department of Botany and Plant Pathology (BPP), and a few other units. Originally built in two halves in the 1950s and 1960s, the building internally was a classic institutional building from that era, with dimly lit, indistinguishable corridors that seemed to go on forever, resembling more a minimum-security prison than a modern academic building. But through a multiyear, multimillion-dollar process, the building was resurrected and transformed into a vastly more functional, much more pleasant building, a building infused with natural light.

A Cordley hallway before (2020) and after (2024) the remodel

In 2017 I joined the Cordley team (shout out to John, Blaine, Doug, Mark, and others!) whose task was to provide input about the needs of the building occupants to the project manager for the remodel, the design team, and administration (shout out to Dustin, Anita, Doug, Dawn, Roz, Virginia, Joey, and many others!). This potentially odious task turned into one of the most fascinating detours of my career (architecture is very cool! as is engineering! and construction! and project management!).

Ann Hamilton

But this is not a story of the remodel; it is instead the story of the art created for the building, and the artist who created it. Oregon’s Percent for Art legislation specifies that a percentage of construction funds for public buildings be set aside for the acquisition of art work for the building. I was a member of the committees (shout out to all on both committees, especially Ryan, Dustin, Joan, Julia, Dawn, and Carolina!) that chose and then worked with the artist. Amazing artists applied, and we wanted to choose all of them, but we could only choose one. We had the extraordinary good fortune to have Ann Hamilton agree to provide the art for Cordley Hall.

Ann has created numerous pieces of art in public places, including when an object reaches for your hand and CHORUS; the latter is at the World Trade Center’s Cortlandt Station. My favorite piece of hers (before her Cordley Hall installation) is The Event of a Thread, at the Park Avenue Armory, in New York (Ann talks about this piece in this video). Ann creates art that is meant to be explored and touched; she wants people to participate in the art, and, in some pieces (e.g., when an object reaches for your hand) take parts of it home.

Ann and me. Photo by Maia Insigna, copyright Oregon State University

One of the many things that is wonderful about Ann is that she wanted the residents of Cordley Hall to participate in her process of creation. She visited, and talked to many of us, and really listened – to us, to our science, to our passions. She asked questions – very good questions, thoughtful questions that indicated she was understanding what she was hearing. And through this long process, with several detours, she arrived at a piece of art that now forms the heart of Cordley Hall.

Ann’s art in the Cordley courtyard

Here’s what Cordley Hall looks like from above, with the location of the courtyard shown in green.

Cordley Hall from above, showing the courtyard in green

Ann’s art covers almost the entire courtyard, and consists of a large number of paving stones of three different colors, most of which have a single letter stamped into them, either A, C, G, T, or N. A, C, G, and T are the letters of the DNA alphabet; A stands for the nucleotide adenine, C for the nucleotide cytosine, G for the nucleotide guanine, and T for the nucleotide thymine. N stands for an unspecified nucleotide.

Here’s what a portion of the art looks like:

The paving stones that were used were created by a local company, Western Interlock, who embraced the project as an exciting challenge. (See https://westerninterlock.com/wp-content/uploads/2024/04/WI-Case-Study_Cordley-Hall_Corvallis-Oregon.pdf for more details.) They did a beautiful job making the pavers.

The reddish pavers form large block letters, and from above spell out a poem composed by Ann:

Thus, when one steps back (or goes up in the building), and can see an overview of the whole piece of art, a message appears.

The DNA sequence

The poem in reddish pavers is not the only message in the art, for the nucleotides (A, C, G, and T) present a message too. That message, although cryptic, is especially poetic and profound. The letters in the pavers are arranged in a sequence to match the sequence of nucleotides in the gene called the Large Subunit Ribosomal RNA gene, or LSU rRNA. This is a gene that is present in all living things, and is vital for building proteins, as the gene produces a ribosomal RNA (rRNA) that forms one of the building blocks of the ribosomes. Ribosomes are the structures in the cell where information contained in a gene gets translated into a protein. As proteins are the main catalysts of reactions within the cell, the LSU rRNA gene is a vital component of the information processing architecture in the cell necessary to produce the machinery of the cell.

It took a long time to decide that a sequence of the LSU rRNA gene would be used, but even that didn’t yield the final plan. The exact DNA sequence of this gene varies from species to species, which means there were millions of possible variants of this gene to choose among for Ann’s piece. A group of us (we called ourselves Ann’s Process Team) helped brainstorm about what species to use. At one point we realized it would be ideal to have a sequence that somehow represented all of the diverse forms of life studied by the biologists in the building. The majority of space in Cordley Hall is occupied by two departments, Integrative Biology and Botany and Plant Pathology, which combined study everything from bacteria to plants, fungi, and diverse animals. We considered the possibility of having a sequence that summarized all of the different sequences of the LSU rRNA gene in all of these organisms, perhaps by creating what is called a “consensus sequence” that shows the parts of the LSU rRNA gene sequence that all of the organisms have in common. This was a reasonable possibility, but it didn’t have quite the depth of meaning and poetry Ann was seeking.
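The consensus idea mentioned above can be illustrated with a minimal sketch. It assumes the sequences are already aligned to the same length and uses the strictest possible rule: keep a letter only where every sequence agrees, and write N (an unspecified nucleotide) everywhere else. The three sequences are made up for illustration and are not real LSU rRNA data; real consensus methods often use majority thresholds or ambiguity codes instead.

```python
def strict_consensus(aligned_sequences):
    """Return a consensus string: a base where all sequences agree, 'N' elsewhere."""
    if not aligned_sequences:
        raise ValueError("need at least one sequence")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    # zip(*...) walks through the alignment one column (position) at a time
    for column in zip(*aligned_sequences):
        if len(set(column)) == 1:
            consensus.append(column[0])  # every sequence has the same base here
        else:
            consensus.append("N")        # disagreement: leave it unspecified
    return "".join(consensus)


# Hypothetical toy alignment (assumption: invented for this example)
toy_alignment = ["AATAAAG", "AATCAAG", "AATAAAG"]
print(strict_consensus(toy_alignment))  # AATNAAG
```

With many divergent organisms, a strict consensus like this would be mostly Ns, which hints at why it carries less meaning than a single inferred ancestral sequence.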

It was Joey Spatafora who saw the last piece of the puzzle. Joey suggested that Ann use the sequence from not any living organism, but the sequence inferred to have been present in the Last Universal Common Ancestor (LUCA) of living things (see the image below). Any two species of life share a common ancestor. For example, humans and chimps diverged from one another perhaps 6 to 9 million years ago (mya), which means their last common ancestor lived 6 to 9 mya. This ancestor was a species of great apes that later split into two lineages, one giving rise to humans and one giving rise to chimps. Of course, humans and chimps had many older common ancestors, too, but the last one, that is, the one that lived closest to the present, was 6 to 9 mya. A human and a sponge have a common ancestor that lived about 750 mya; a human and a mushroom have one that lived about 1.3 billion years ago. About 3.6 billion years ago there was a species that eventually gave rise to all of life we have on Earth today; that species was the last universal common ancestor, or LUCA. (The ancestors of LUCA were themselves universal common ancestors, but they were not the last such ancestor. There was also much other life before the LUCA, but any species that was not a direct ancestor of the LUCA has no living descendants, by definition.)
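For readers who think in code, here is a small sketch of what "last common ancestor" means on a tree. The tree is a toy with only the four organisms mentioned above; the names of the internal nodes are illustrative labels of my own, and the vast number of intermediate ancestors is left out.

```python
# Each entry maps a node to its parent in a toy tree of life.
# Internal node names are hypothetical labels chosen for this example.
PARENT = {
    "human": "human_chimp_ancestor",
    "chimp": "human_chimp_ancestor",
    "human_chimp_ancestor": "animal_ancestor",
    "sponge": "animal_ancestor",
    "animal_ancestor": "animal_fungus_ancestor",
    "mushroom": "animal_fungus_ancestor",
    "animal_fungus_ancestor": "LUCA",
    "LUCA": None,  # root of the toy tree
}


def ancestors(node):
    """List the ancestors of a node, from the most recent back to the root."""
    path = []
    node = PARENT[node]
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def last_common_ancestor(a, b):
    """Return the most recent ancestor shared by a and b."""
    ancestors_of_b = set(ancestors(b))
    # Walk back from a; the first shared ancestor is the most recent one.
    for candidate in ancestors(a):
        if candidate in ancestors_of_b:
            return candidate
    return None


print(last_common_ancestor("human", "chimp"))     # human_chimp_ancestor
print(last_common_ancestor("human", "mushroom"))  # animal_fungus_ancestor
```

Every pair in this toy tree also shares LUCA as an ancestor, but LUCA is only the last common ancestor once all living things are considered together.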

Fortunately, a paper had been published just before these discussions that inferred the sequence of LSU rRNA for the LUCA, using in part a phylogenetic analysis (Men et al. 2022). This inferred DNA sequence of a gene from the last universal common ancestor of all living things is now the thread that winds through Ann’s art.

The sequence in the pavers begins at the northwest corner of the courtyard, and proceeds to the east, and then wraps around and heads back west, and so on, snaking south through the courtyard. Here’s a diagram of the very start and the direction the sequence should be read:

and here is a photograph of the northwest corner showing the first three rows of pavers; the darker gray row is the first row, and the start of the sequence:

(This photograph was taken in January 2025, after the pavers had been in place for about 1.5 years.)
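The snaking reading order can be written down as a small sketch. It assumes a simplified courtyard in which every row holds the same number of lettered pavers; the row width used here is a hypothetical value, not a measurement of the real courtyard, whose layout is less regular.

```python
def paver_position(index, pavers_per_row):
    """Map a 0-based position in the sequence to (row, column).

    Row 0 is the northernmost row; column 0 is the westernmost paver.
    Even rows are read west to east, odd rows east to west.
    """
    row, offset = divmod(index, pavers_per_row)
    if row % 2 == 0:
        column = offset                       # heading east
    else:
        column = pavers_per_row - 1 - offset  # heading back west
    return row, column


# Hypothetical row width of 5 pavers, chosen only to keep the example small
for i in range(12):
    print(i, paver_position(i, pavers_per_row=5))
```

With five pavers per row, the fifth letter lands at the east end of the first row, and the sixth sits directly south of it, at the east end of the second row, where the reading direction turns back west.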

However, the sequence of the LUCA’s LSU rRNA gene was not long enough to fill the entire large courtyard, and so Ann repeated the sequence a bit over 10 times. (There are a lot of pavers in the courtyard.) The first copy of the sequence fills more than 17 rows of pavers, and then the next copy begins. The map below shows the extent of the first copy, with the start and end points marked; small black rectangles are the boundaries between the individual copies.

Significance of the sequence from our LUCA

Humans have come to the profound understanding that a Last Universal Common Ancestor of all living things existed, an understanding that shapes our view of who we are and how we came to be. This fact arises from one of the most important discoveries our species has made. Charles Darwin first proposed a theory for which there is now overwhelming evidence: that all living things are connected to each other, as they have each arisen from past species through a process of diversification. This diversification has yielded a tree-like branching of species lineages, often referred to as the phylogeny of the species. (There are regions of the phylogeny that are not simply branchings but are also rejoinings, but overall the phylogeny is tree-like enough in most groups that calling it the tree of life makes sense.) Further, there is only a single tree of life on Earth. A consequence of this unitary history of life on Earth is that there existed, at one point in the past, a single lineage that was a common ancestor of all organisms alive today.

Our understanding of the tree of life also unifies biology. Theodosius Dobzhansky once said “Nothing in biology makes sense except in the light of evolution”, and it has been said that “nothing in evolution makes sense except in the light of phylogeny”. The tree of life provides one of the foundational components that allows us to explain and understand the biological world around us, in a similar way that plate tectonics explains so much about the structure of the Earth’s surface.

Thus, at its core, Ann’s art, in embracing LUCA, celebrates the deep history, evolution, and diversity of all life. In this way it represents the full diversity of organisms studied in Cordley Hall by faculty and students, and thus is a central and beautifully unifying heart of the building.

Interactions among parts

As profound and intriguing as LUCA is, there is another curious aspect of the DNA thread that runs through the courtyard. The thread folds and intertwines, as some parts interact with other parts of the thread.

The product produced by the gene whose sequence adorns the courtyard, LSU rRNA, is a long thin molecule, like a ribbon that folds back on itself, forming a complicated structure. Here’s a diagram of just part of the ribbon:

The areas where the ribbon folds back and sticks to itself are called stems, and are shown in the diagram as two parallel lines; where the ribbon is solitary and does not stick to another piece of ribbon, loops are formed, as shown in the diagram. As a whole, the molecule can be drawn in two dimensions like this:

In reality it is a more complex three-dimensional structure, and the two-dimensional diagram above shows only some of the aspects of that structure. It’s also hard to fit everything close together because it is so complicated, so this is drawn as two separate pieces (connected by dotted lines). In reality the two pieces would be closer, because the dotted line would not be there.

Parts of the ribbon stick together because the nucleotides in one region are complementary to the nucleotides in another region, and complementary sequences stick together. In particular, because of their molecular structure, an A will stick to a T, and a C will stick to a G.

Let’s look at the start and end of the sequence. The start of the sequence is AATAAAG. The end of the sequence is CTTTTTCCTA. If we imagine the last part of the sequence facing the other direction (imagine the ribbon folding over), it reads ATCCTTTTTC. If we line the start up with the reversed end we see this pattern:

The As will stick to the Ts, and the G with the C, as shown by the gray lines. Here is a stretch of six nucleotides where five of them will stick together. This will provide enough adhesion to make this stem stable. (Actually, in the rRNA molecule, the Ts would be Us for uracil, but we will ignore that detail, as it doesn’t change the story being told.)
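The pairing just described can be checked with a few lines of code. This is a sketch under one assumption: the exact alignment drawn in the figure is not reproduced here, so the code lines up the last six letters of the start with the last six letters of the reversed end, which is one alignment consistent with "six nucleotides where five of them will stick together".

```python
# Which letters stick to which (A with T, C with G)
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def count_pairs(strand, facing_strand):
    """Count positions where the two lined-up strands are complementary."""
    return sum(
        1 for x, y in zip(strand, facing_strand) if COMPLEMENT.get(x) == y
    )


start = "AATAAAG"         # start of the sequence
end = "CTTTTTCCTA"        # end of the sequence
reversed_end = end[::-1]  # the ribbon folded over: "ATCCTTTTTC"

# Assumed alignment: six letters from each side, facing each other
stretch_from_start = start[1:]        # "ATAAAG"
stretch_from_end = reversed_end[4:]   # "TTTTTC"

print(reversed_end)
print(count_pairs(stretch_from_start, stretch_from_end), "of",
      len(stretch_from_start), "positions pair up")
```

The one mismatch is the T in the start sequence, which faces another T rather than an A.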

Below is a figure of that stem composed of the start and the end, as well as nearby stems and loops. If you look at nearby regions, you will see that each stem is composed of a strand of nucleotides matching up with a strand of complementary nucleotides:

So let’s see where the stem at the start and end is in the physical courtyard. Let’s color-code that stem with green:

On the ground those sequences are here:

If we color-code the nearby stems:

they appear in these regions of the courtyard:

There are many more stems in rRNA that we could map out, but we’ll leave it at three.

Thus, there is a biological connection between the DNA letters in one part of the courtyard and another part of the courtyard. In the cells of organisms the rRNA would be spooled out as a thread by the process of transcription, and then would naturally fold as the complementary regions encounter each other and stick together as stems, in the process creating a complex three-dimensional structure.

In my mind I imagine coming to work some day, looking out through my lab windows onto the courtyard, and discovering that overnight the courtyard DNA paving thread had arisen from its substrate, snaked through the air in slow, grinding movements, stems forming whenever complementary regions happened to come near each other, in a dance of stone, leading to a giant ribosomal sculpture in the center of the building.

However much one could hope for this, it won’t happen. But one could imagine a human performance, perhaps a dance involving an interplay of people between complementary regions, or an auditory event whereby a musician at one half of a stem (say, one of the green arrows shown above) engaged in a musical dialogue with a musician at the other half (the other green arrow).

Closing remarks

In the end, Cordley Hall is now grounded via a profound, foundational piece of art that ties together all the studies in the building, and we are enriched by the experience of taking a few steps along the journey of an amazing artist and lovely person, Ann Hamilton.

Ann and me. Photos by Maia Insigna, copyright Oregon State University

References

Men, Y.; Lu, G.; Wang, Y.; Lin, J.; Xie, Q. 2022. Reconstruction of the rRNA Sequences of LUCA, with bioinformatic implication of the local similarities shared by them. Biology 11: 837. https://doi.org/10.3390/biology11060837

Posted in Art & Science, Tree of Life | Tagged art, biology, DNA, evolution, genetics, science, Tree of Life

Maddison’s Law of Phylogenetic Analysis

Posted on 20 February 2024 by David Maddison

Sigh.

I was painfully reminded yesterday of Maddison’s Law of Phylogenetic Analysis. It wasn’t the first time over the last few months I have been reminded of that law.

As I documented in my March 2023 blog post Prior and Current Ignorance: Struggles with Bayesian analyses, I have been attempting to do a Bayesian analysis of some beetle genomic data that would yield estimates of the divergence times of various branches in the evolutionary tree of Bembidion ground beetles. We (primarily James Pflug) had already completed numerous other analyses, including maximum likelihood analysis of various combinations of data, SVD Quartets, etc., etc. Only the Bayesian analysis remained.

After I last updated that post, I started the final analysis in the program BEAST following the options outlined in the post. I set the BEAST 2.7 analysis going on 17 April 2023 on a relatively fast Mac Studio (with an M1 Ultra chip with 16 fast cores and 128GB RAM). It happily did the whole MCMC sampling routine until about two months later, 16 June, when I stopped the analysis as the ESS values were all above 200. Here’s what the window in Tracer v1.7.2 looked like, showing on the right the trace for likelihood:

I was extremely pleased that this was done – my final analysis for the paper was done! But then I looked at the resulting trees and realized that the estimated age of Bembidiini was way too old at 175 million years, in the Jurassic. That is way older than makes sense based upon other dating analyses of beetles and the fossil record.

Searching online revealed the problem: the way the taxa were sampled violated the core assumption of randomness in the BEAST Fossilized Birth Death analysis. It turns out that if you sample taxa to maximize sampling of deep phylogenetic lineages, as I did, then using BEAST’s FBD random sampling model will cause overestimation of the ages (Matschiner et al. 2017). In the advice I received about the sampling model, which led to the exclusion of outgroups as they were sampled much less densely, there was no mention of this bias.

So where to go from here? There is a CladeAge package for BEAST (Matschiner et al. 2017) that is specifically built to avoid this bias in FBD analyses in BEAST. However, I didn’t get far with that, as CladeAge requires presumption of diversification rates (if I remember correctly), which might be fine for extremely well-studied groups like vertebrates with dense fossil records, but for my little beetles there simply isn’t the background knowledge to presume diversification rates that I felt comfortable with.

More exploration revealed that, thankfully, MrBayes could do the FBD analysis with the assumption that taxa were sampled across the deepest nodes, rather than randomly. This was great news, not only for me, but for the field, as most people who do phylogenetic studies make a point of sampling one or two species from each group; they do not randomly sample. So off into the MrBayes world I went.

The key commands for MrBayes were:

prset brlenspr=clock:fossilization;  
prset samplestrat=diversity;

These told MrBayes to do an FBD analysis, and assume a sampling strategy that was meant to maximize diversity of deep lineages. Perfect! I did an initial test of how long it would take analyzing all 500,000 amino acids, and it seemed to be too slow, so I asked instead for it to analyze the data at the nucleotide level, but this time excluding 3rd positions (so about one million nucleotides). I also asked it to do 4 MCMC runs with 4 chains each. I began my final Bayesian analysis on 29 July 2023.

However…. things did not turn out as hoped. I just couldn’t get MrBayes to sample sufficiently. Two and a half months later, by 14 October 2023, it had completed nearly 22 million generations on the fast 16-core computer, but the ESS of ln L was only 6. Here’s what the Tracer view looks like of ln L:

As you can see, it looks pretty miserable. All four chains had eventually reached the same level for ln L but it seemed as if it would take an extraordinarily long time to sample sufficiently to complete the analysis. I also needed that computer for something else, so I decided to stop the run. That should be no problem, however, as MrBayes has a checkpointing system that allows a run to be restarted partway through.
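For readers wondering why an ESS of 6 after 22 million generations is so discouraging, here is a minimal sketch of one common way to estimate effective sample size from a trace: divide the number of samples by one plus twice the sum of the positive-lag autocorrelations. This is an illustrative assumption, not the exact estimator Tracer uses, and the function name `effective_sample_size` is hypothetical.

```python
def effective_sample_size(trace):
    """Estimate ESS as N / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(trace)
    if n < 3:
        return float(n)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        # Biased autocorrelation estimate at this lag.
        acf = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n * var)
        if acf <= 0:
            break  # truncate the sum at the first non-positive autocorrelation
        rho_sum += acf
    return n / (1 + 2 * rho_sum)


# A trace that drifts steadily (strongly autocorrelated) versus one that
# jumps back and forth every sample (no positive autocorrelation).
drifting = list(range(100))
alternating = [0, 1] * 50
ess_drifting = effective_sample_size(drifting)
ess_alternating = effective_sample_size(alternating)
print(ess_drifting, ess_alternating)
```

A trace that wanders slowly, as the ln L trace did here, yields very few effectively independent samples no matter how many generations are logged.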

Alas, I couldn’t get the checkpointing system to work. Tried multiple times, asked questions, got nowhere.

Because I suspected I could improve the sampling by altering the options for the MCMC process, I tried again with the MrBayes analysis, this time hoping that different values of the chain temperatures and swap frequency would be better. With hope in hand, I started my final Bayesian analysis on 16 November 2023. However, after 3.5 weeks of sampling, this is what the ln L trace looked like (on 11 December):

As it didn’t seem as if it was doing all that much better than the earlier run, and after much thought, and discussion with my colleague Katie Everson, I decided to abandon that run.

But it turns out I had another computer that was available during that time, and so I had already started to explore dating the branches not with the better method of the Fossilized Birth Death process, but instead a classic method in which nodes were calibrated with the fossils. So while that failed FBD analysis in MrBayes was going on, I was also running a calibrated node analysis in BEAST. This included all three fossil Bembidion, and the run was started on 23 October 2023. It finished, with good ESS values, on 6 December 2023. So when I decided to scratch the MrBayes run on 11 December, all was good, as I already had completed what was now my final Bayesian analysis five days before. Yay!

Except… as I was writing up the methods section, I realized that there was a problem with one of the fossils used to calibrate the minimum age of one clade. The placement of the fossil in that clade was less certain than I had originally thought. And so, with some reluctance, I decided to scrap that analysis, and start again, this time dropping that fossil and its calibration from the analysis. OK, whatever.

So now, at last, with some confidence I could begin my final Bayesian analysis: a calibrated node analysis in BEAST, with only two calibrated nodes (one for each of the two fossils), not three. I began the analysis on 6 December 2023, and one month later, on 6 January 2024, it had sampled enough that I felt comfortable stopping it and harvesting the trees. At last I was done!

Except… a couple of weeks later, while helping a colleague (Kip Will) with setting up an analysis for a joint project, I realized that we had mistakenly set one of the priors for the Optimized Relaxed Clock to something we shouldn’t have used. This meant that I had to start again.

I corrected that prior, and on 23 January 2024 I started my final Bayesian analysis, just like the previous one, but with the better prior. It chugged along for a few weeks, and on 12 February 2024, all ESS values were above 200, and so I stopped the analysis and breathed a huge sigh of relief that it was finally done.

Except… yesterday the author of the Optimized Relaxed Clock package in BEAST that we were using reported to the Google Groups community that there was a bug in the version of ORC that we used, and that it was strongly recommended that users redo any analyses.

And, so. Here we are again. Almost one year later. Yesterday night I started the run again. The seventh final Bayesian analysis. I figure it will be done in three weeks or so.

Will this truly be the final Bayesian analysis? Seventh time’s a charm? Who knows. Somehow I suspect not, unless I am just so sick of the whole thing that I abandon the analysis entirely. If, miraculously, I really complete the final analysis for this paper, I will update this post!

Now you can see why I was reminded yesterday of this law:

Maddison’s Law of Phylogenetic Analysis: You will conduct your final phylogenetic analysis at least three times.

References

Matschiner, M., Musilová, Z., Barth, J.I.M., Starostová, Z., Salzburger, W., Steel, M., Bouckaert, R., 2017. Bayesian Phylogenetic Estimation of Clade Ages Supports Trans-Atlantic Dispersal of Cichlid Fishes. Systematic Biology 66:3–22. https://doi.org/10.1093/sysbio/syw076

Posted in Phylogenetics | Tagged Bembidion | 2 Comments

Truth versus firehoses of falsehoods

Posted on 30 March 2023 by David Maddison

As a result of one of my more traumatic experiences as a graduate student, I became familiar with Knoll’s Law of Media Accuracy, “Everything you read in the newspaper is absolutely true, except for that rare story of which you happen to have firsthand knowledge.”

It was my second year in graduate school at the University of Alberta in Edmonton, and I was spending my time examining adults and larvae of Bembidion subgenus Bracteon in an attempt to infer the phylogeny of the group. I had adults and larvae of most species, but I had no larvae of Bembidion balli, and I really wanted to get them. To do that I needed to capture adults during the breeding season. The best place I knew to find Bembidion balli was a beautiful spot called Paynton Ferry, on the North Saskatchewan River in Saskatchewan, and so on 19 May 1983 Jim O’Hara and I got in his van and drove east from Edmonton. As we were driving, rain clouds were approaching behind us; this was not good, as these beetles like sunshine and warm conditions. Nonetheless, we drove on, and once we arrived I set out walking along the river shore to the sand bank about a kilometer away where I knew B. balli to be relatively common. I never managed to get a female, as two events drastically altered the day: the rain arrived, and I found a body on the beach. It was the body of a trucker whose semi truck had fallen off a bridge in Edmonton the previous November, more than 400 km upstream; he was swept under the ice, and his body was eventually deposited at Paynton Ferry, where I found it in May. There are many more details, of course, but suffice it to say that it was a day I will never forget. It was newsworthy in Edmonton, and two days later the papers there carried an article about my discovery. In one paper I was described as “a man walking along the river”, which was accurate. In the other paper, however, I was “two men in a canoe”. I don’t know how that falsehood arose, but one can imagine that it sounded a bit more poetic or perhaps more believable than “a man walking along the river”. And thus I gained a healthy distrust for the media.

Bembidion has been in the news recently, because of a paper we published this week. In 2021 Kip Will (UC Berkeley) found a specimen in Jerry and Anne Brown’s ranch in California. He didn’t recognize the species, and so sent it to me; I examined it and sequenced its DNA, and I compared it to other Bembidion, and it became evident that it was a new species. John Sproul provided some DNA sequence data of related species. Our paper in ZooKeys (Maddison, Sproul, and Will 2023) describing it as Bembidion brownorum came out earlier this week.

*Bembidion brownorum* adult male. Image copyright David R. Maddison

In part because the species is named after a former Governor and First Lady of California, and in part as the species has likely declined in abundance, the news services picked it up and the discovery has been covered around the world. The first article by Kara Manke is extensive and quite accurate. Most other articles were subsets of this one. Another took a different tack, with more political discussion and humor; that one is in general pretty good, although it states the research was done by “researchers” at UC Berkeley, and there is no mention of me or John or anyone outside of California, which is unfortunate as the bulk of the research for the paper was done at Oregon State University. But after these initial stories, some of the articles started going off the rails.

One published in the Hindustan News Hub is titled “Rare cricket species found in California named after former governor”. I don’t know how the beetle became a cricket. My name is also misspelled as “Madison”, but even better, the article states that we published the description in the journal Nature! Woohoo! My first paper in Nature!

The most amazing I have found is an article in Sasatimes which uses my picture of Bembidion brownorum, and, given the timing, surely is intended to present the same news story, but which states that the new species is a tenebrionid (darkling beetle) that was named “Eleodes jerrybrowni”, and that it was “discovered by a team of entomologists from the University of California at Davis” and that it was found “at the Oakland Zoo”. As far as I can tell, there is no species called Eleodes jerrybrowni, and I don’t know of any entomologists at UC Davis who would describe a tenebrionid beetle. Where could this fictitious account have come from?

One of Kip’s students suggested it was an AI system that wrote the bogus articles; that seems a very likely possibility. With the rise of ChatGPT and similar AI systems in the last few months, it wouldn’t be surprising if they are being used to generate news articles.

Christopher Marshall decided to test one of them out, and put the prompt “Tell me about Bembidion brownorum” into iAsk.ai. Here’s what he got:

Bembidion brownorum is a species of ground beetle that belongs to the family Carabidae. This species is endemic to the United States, specifically found in the state of California. The beetle was first described by Erwin and Kavanaugh in 1981 and was named after the American entomologist Arthur V. Evans Brown.

Bembidion brownorum has a black body with a metallic green or blue sheen on its head and thorax. The elytra, or wing covers, are also black but have a reddish-brown border. The beetle has a length of about 4-5 mm and can be found in sandy areas near streams or rivers.

There is not much information available about the behavior or ecology of Bembidion brownorum as it is a relatively understudied species. However, like other ground beetles, it is likely an important predator in its ecosystem.

Top 3 Authoritative Reference Publications or URLs:

1. “Bembidion brownorum Erwin & Kavanaugh, 1981.” Carabidae of the World. https://carabidae.org/taxa/brownorumerwin-kavanaugh-1981

2. Erwin, T.L., Kavanaugh, D.H. “New species and records of Bembidion from western North America (Coleoptera: Carabidae).” Pan-Pacific Entomologist, vol. 57, no. 2, 1981, pp. 145-158.

3. “Bembidion (Peryphus) brownorum Erwin & Kavanaugh 1981.” Integrated Taxonomic Information System (ITIS). https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=932656#null

This is so bogus it is hard to know where to start. It’s disturbing that it made up the fact that it was named after “American entomologist Arthur V. Evans Brown” (presumably the AI just jammed “Brown” on to the end of Art’s name), but the fact that it created what looks like three valid references is incredibly deceptive. The middle one is especially disturbing. Yes, Erwin and Kavanaugh published a paper on Bembidion in 1981, but it was about a different group of Bembidion (a group now in Lionepha), the paper was called “Systematics and zoogeography of Bembidion Latreille: 1. The carlhi and erasum groups of western North America (Coleoptera: Carabidae, Bembidiini)”, and it was published in the journal Entomologica Scandinavica not Pan-Pacific Entomologist. In Pan-Pacific Entomologist, vol. 57, pages 145-158 are parts of two articles about Hydroporus water beetles by other authors, not ground beetles. Where in the world did the AI come up with that fake reference, and how could it possibly present it as an Authoritative Reference Publication?

So I tried the same prompt in iAsk.ai, and got something equally fake and equally disturbing:

Bembidion brownorum is a species of ground beetle that belongs to the family Carabidae. This species was first discovered in 2008 and is endemic to the state of California in the United States. The beetle is named after Dr. Michael L. Brown, who was a renowned entomologist and professor at the University of California, Riverside.

The Bembidion brownorum species is relatively small, measuring between 4.5 and 5 millimeters in length. It has a dark brown coloration with a metallic sheen and is covered in fine hairs. The beetle’s head is narrower than its thorax, and it has large eyes and long antennae. Like most ground beetles, Bembidion brownorum is a predatory insect that feeds on other small insects.

This species of beetle is known to inhabit sandy beaches along the coast of California, particularly those in the San Francisco Bay Area. It prefers areas with high levels of moisture, such as those near streams or wetlands. Bembidion brownorum is considered a rare species, and its population size and distribution are not well-known.

Conservation efforts for Bembidion brownorum are currently underway to protect its habitat from human encroachment and development. The beetle’s limited range and habitat requirements make it vulnerable to habitat loss and fragmentation, which can lead to declines in population size.

Top 3 Authoritative Reference Publications or URLs:

1. “Bembidion brownorum” – Encyclopedia of Life https://eol.org/pages/3286766

2. “New Species of Bembidion from California (Coleoptera: Carabidae)” – ZooKeys Journal https://zookeys.pensoft.net/article/381/

3. “Bembidion brownorum” – Integrated Taxonomic Information System https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=934618

Now it is named after Michael Brown, and it lives on sandy beaches along the coast. Again, all “Authoritative Reference Publications” are bogus. #1 links to a leaf beetle, #3 links to a lacewing. #2 is closer, but there is no article 381 in ZooKeys.

So I then tried OpenAI’s ChatGPT. When I typed in “Tell me about Bembidion brownorum”, it said it didn’t know anything about it. (This is good, as ignorance is vastly better than making things up out of thin air.) It asked for help, and I told it that it was in the news recently. With that added information, and after an apology (“I apologize for the confusion in my earlier response”), it said it had “since researched and found” the following:

Bembidion brownorum is a species of ground beetle in the family Carabidae. It was first described in 2019 by a team of researchers led by David Maddison, a biologist at Oregon State University. The species is named after Robert and Dee Brown, avid collectors of beetles who discovered the first specimen of Bembidion brownorum in Idaho in 1979.

Bembidion brownorum is a small beetle, measuring only around 5mm in length. It has a brownish-black coloration and a flattened body, which allows it to navigate easily through small crevices and soil. This species is found in mountainous regions of the western United States, such as Idaho, Montana, and Wyoming.

Interestingly, the discovery of Bembidion brownorum sheds light on the history of glaciation in North America. Maddison and his team found that the species has a very narrow distribution range and is only found in areas that were not covered by ice during the Pleistocene glaciations. This suggests that Bembidion brownorum has been isolated in its current habitat for thousands of years.

Oh, wait. Now it is in unglaciated areas of Idaho and named after Robert and Dee Brown! Some other things said are true, others fictitious. The made-up details such as “a flattened body, which allows it to navigate easily through small crevices and soil” lend an air of veracity. What sort of “research” did it do? Throw some dice?

It turns out ChatGPT has a “Regenerate Response” button. I pressed it and got this:

Upon checking my sources, I can confirm that Bembidion brownorum is a newly discovered species of ground beetle that was recently described by a team of entomologists in Germany. Bembidion brownorum was named after Robert Brown, a collector who contributed to the discovery of many species of insects, including beetles. The species is about 5 millimeters long and has a brownish coloration. It is found in the alpine regions of Austria, Italy, and Switzerland, and is adapted to the cold, harsh conditions of its habitat. This discovery highlights the importance of continued exploration and study of biodiversity, especially in remote or poorly explored areas. The identification of new species can provide insights into the evolution and ecology of different groups of organisms and can aid in conservation efforts to protect threatened or endangered species and their habitats.

Germany! Named after Robert Brown! And it is an alpine species! And then again:

Bembidion brownorum is a newly discovered species of ground beetle that was reported in the news in March 2022. It was found in the Bering Sea region of Alaska and was named after the late entomologist Arthur V. Evans, who was known for his contributions to the study of beetles. Bembidion brownorum is about 4 millimeters in length and has a shiny brownish-black coloration. It is believed to be adapted to the harsh Arctic environment, and its discovery may provide insights into how species adapt to changing climates.

In arctic Alaska! Named after Art Evans! In addition to still being alive (I hope), Art’s name isn’t Brown, and so why we would use Bembidion brownorum to honor him is beyond me. And then again:

Bembidion brownorum is a newly discovered species of ground beetle that was named in 2021. It was discovered in Montana, USA, by a team of researchers from the University of Montana and the University of Kansas. The species was named in honor of Dr. Neal Wood Brown, a prominent entomologist who contributed greatly to the study of carabid beetles. Bembidion brownorum is a small, black beetle that belongs to the family Carabidae, and it is likely to have a limited range in Montana.

Montana! Named after Neal Wood Brown!

After repeating this process a few times, and learning that Bembidion brownorum was discovered in the UK by researchers from National Museum of Wales and Cardiff University and named after Professor David Brown, and that it was found in the Appalachian Mountains and named after Dr. Adam Brown, I realized that ChatGPT’s ability to make up entire stories out of thin air and present them as facts vastly outstripped my patience.

If the AI systems had expressed doubt, I would have been a bit less despondent, but ChatGPT’s “I can confirm” is not an appropriate lead-in to flapping at the gums (or sparking in the silicon) with whatever irrelevant words it pulls from its data stores and jams together. That each fictitious story is internally consistent and coherent, peppered with interesting details, makes it worse, as each one sounds plausible and believable.

I fear for the ability of humans to be able to sort truth from bald-faced lies if more and more of the “information” we receive was synthesized by AI systems that, at least now, show no evidence to the reader of being able to judge the trustworthiness of their own output. Hopefully an ability to self-judge is a major focus of AI research, and will be integrated into future systems.

That said, Bembidion brownorum is a very nice beetle. And I did eventually raise larvae of Bembidion balli.

Update: Pam Triplett pointed out that this article will now be grist for the mill of AI, which may lead it to spit out even wilder stories – or perhaps it will realize it shouldn’t believe everything it reads and will express greater skepticism.

Posted in Fieldwork, Revising Bembidiina | Tagged Bembidion | 1 Comment

Prior and Current Ignorance: Struggles with Bayesian analyses

Posted on 9 March 2023 by David Maddison

Many years ago I dressed up for Halloween with a piece of paper taped to my rear with “Pr” on it, in one hand holding a small crowbar, and in the other an engagement ring in a small box (see below). The “Pr” was a posterior probability, the crowbar a prior (a “pryer”), and the engagement ring a proposal mechanism – that is, I was dressed up as a Bayesian analysis.

You might think that would suggest I love Bayesian analyses, but whenever I think of setting up a Bayesian analysis to infer aspects of phylogeny, my heart sinks. I hate wandering into the gauntlet of choices one has to make – it’s as bad as filling out tax forms, or perhaps even worse, as for tax forms there is sufficient documentation available to help me, and I have pieces of paper that record what values I should use.

Among the model-based methods for inferring phylogenetic trees, the two most popular are Maximum Likelihood Estimation (MLE) and Bayesian approaches. The former uses what is sometimes called a “frequentist” approach to probabilities, and the latter a Bayesian approach. Bayesian analyses tend to include a richer and more complete model of the evolutionary process, and they require a specification of our prior beliefs about it. I am someone used to the simplicity of MLE approaches, and the relative lack of choices to be made. Diving into Bayesian analysis requires me to face my ignorance about the evolutionary processes that generated the beetles’ diversity, and the processes we have used to sample the beetles.

I am currently attempting to conduct a Bayesian analysis of some genomic data in order to estimate when some lineages of beetles diverged one from another. This is my first attempt to do such an analysis to date phylogenetic splits, and my goodness I am finding it challenging. I decided to expose my confusions and queries to the world, and hope that a kind soul who knows much more about Bayesian analysis than me answers my call. I’ve already received a bit of feedback from Sean Harrington to an earlier set of questions (thank you, Sean!). I hope that in addition to answering my questions, this current effort might help someone else in the same predicament. My goal is to update this post with notes as I get feedback about how best to make the choices needed.

For me the most difficult part of setting up a Bayesian analysis is specifying the priors used. A prior is a statement of the relative probabilities of the potential values of various components of the model, with those probabilities based upon our prior knowledge (rather than the data at hand). However, so little is known about the beetles and the evolutionary process that to specify these priors seems foolhardy; at times I feel as if I need a magic 8-ball or tea leaves or access to those rarefied few who know the secret chants. But Bayesian analyses require such choices to be made, and so I have to make them, and I would rather not trust magic 8-balls or tasseography.

Bayesian analyses can take a very long time (especially if the model used is complex), and that in itself leads to a desire to make good decisions about these choices from the start, as it is time-costly to start an analysis, only to find out several weeks later that mistakes were made.

There are several programs available for conducting Bayesian analyses of phylogenies, including the venerable MrBayes and its descendant RevBayes, BEAST2, and PhyloBayes. My initial explorations suggest that RevBayes and PhyloBayes might require more computer processing power than I have easily available, and so I have decided to use the popular BEAST2.

For a long time there was very little guidance to the user regarding how the many choices involved in setting up a Bayesian analysis should be made, except for the occasional post here or there on a discussion forum. There is one extremely useful document by Tracy Heath, called Divergence Time Estimation using BEAST v2.*: Dating Species Divergences with the Fossilized Birth-Death Process. However, the ever-evolving suite of models used in BEAST2 often requires a new set of choices when a new version comes out, and that document is now out of date. More recently, the BEAST2 developers have added an extremely useful “Help Me Choose” site. However, many of the choices I need to make aren’t discussed there yet, especially for some of the newer additions to BEAST2.

Before I go through what choices I made (and where my confusions are), I’ll give some background about the data set I happen to be dealing with at the moment.

The taxonomic group and the data

The data matrix consists of about 500,000 amino acids from many genes for 46 species of beetles (see gratuitous beetle picture below). Of those 46 species, four are outgroups (that is, they do not belong to the group I am focusing on, and are included to help root my study group). The age of the clade represented by the entire 46 species is at least 99 million years, based upon some undescribed fossils; the full clade has about 5500 described species, and likely at least that many again undescribed ones. The ingroup (the primary study group) of 42 species has about 1300 described species, but there are likely at least 2,000 species currently living. In addition, there are three well-documented fossils described in the ingroup, all of which are between 34-48 million years old (the fossils haven’t been more accurately dated than that). There are no estimates about the age of the ingroup beyond the realization that the group is at least 34 million years old. I did not partition the 500,000 sites as initial tests suggested the analysis would take much too long if I did.

Setting up an analysis for BEAST2 in BEAUTi

The following analysis was set up for BEAST2 version 2.7.3. In addition to the core BEAST2 packages, I also installed bModelTest and OBAMA. Those packages allowed me to do Bayesian amino-acid model averaging. Here’s what I did:

Open BEAUTi
Choose File > Manage Packages
In the window that appears, select bModelTest, and press Install/Upgrade
Select OBAMA, and press Install/Upgrade
Quit BEAUTi, and restart it

I prepared my NEXUS file by adding three new taxa in Mesquite, one for each of the fossils. Thus, in total the matrix has 49 taxa in it (four outgroups, 42 ingroups, and three ingroup fossils containing no DNA sequence data).

I then created three taxon sets in Mesquite; each taxon set included one of the fossils, and the remaining terminal taxa that form the smallest clade to which the fossil definitively belongs. We know that each of those three clades is at least 34 million years old. After saving the file, I opened up the file in a text editor, stripped out extra NEXUS file commands, and replaced the numbers in the TAXSET commands with the full taxon names. (I’ve written a new Mesquite module to Export files for BEAUTi to take care of these things automatically, and that will come out in the next release of Mesquite.)
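The manual step of replacing numbers in the TAXSET commands with full taxon names could be scripted roughly as follows. This is a minimal sketch that assumes each TAXSET command sits on one line, uses 1-based taxon numbers with optional ranges such as `3-5`, and ends in a semicolon; the function name `expand_taxset`, the set name `cladeA`, and most of the taxon names are hypothetical. It is not the Mesquite export module mentioned above.

```python
import re


def expand_taxset(line, taxon_names):
    """Replace 1-based taxon numbers (and ranges like 3-5) in a TAXSET line with names."""
    match = re.match(r"(\s*TAXSET\s+\S+\s*=\s*)(.*?)(\s*;\s*)$", line, flags=re.IGNORECASE)
    if match is None:
        return line  # not a TAXSET command; leave untouched
    prefix, body, suffix = match.groups()
    body = re.sub(r"\s*-\s*", "-", body)  # normalize "3 - 5" to "3-5"
    names = []
    for token in body.split():
        if re.fullmatch(r"\d+-\d+", token):
            start, end = map(int, token.split("-"))
            names.extend(taxon_names[i - 1] for i in range(start, end + 1))
        elif token.isdigit():
            names.append(taxon_names[int(token) - 1])
        else:
            names.append(token)  # already a name; keep as-is
    return prefix + " ".join(names) + suffix


# Hypothetical four-taxon matrix; only the fossil name comes from the post.
taxa = ["Bembidion_a", "Bembidion_b", "Bembidion_c", "Bembidion_christelae"]
expanded = expand_taxset("TAXSET cladeA = 1-2 4;", taxa)
print(expanded)
```

The taxon list must be in the same order as the TAXA block of the NEXUS file, since the numbers index into that order.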

I then loaded my NEXUS file containing the data into BEAUTi by choosing File > Import Alignment.

Here are the options I chose in BEAUTi, panel by panel. My questions/confusions are shown in blue.

Update: the team at the Centre for Computational Evolution at University of Auckland, keepers of the BEAST, were kind enough to prepare a response. The response was a joint effort by Kylie Chen, Alexei Drummond, Remco Bouckaert, and Walter Xie. Their full response is in their comment, below, but I have also inserted responses to individual questions within the blog post, for easy reading. Where I felt a response was appropriate, I have added them in green. These include additional information about what I chose in response to their answers.

Tip Dates Panel

I feel pretty comfortable about my choices for this panel. In particular, I

Checked “Use Tip Dates”
Switched to Dates specified numerically as year Before the present
For the three fossils (Bembidion_christelae, Bembidion_alekseevi, and Bembidion_bukeisi), changed their “Date (raw value)” to 41. Each of these fossils is from Baltic amber, which is 34–48 million years before the present. I chose 41 as that is the center of that range.

Here’s what the lower part of the Tip Dates panel then looked like:

(1) Bayesian gurus: are these good choices for the Tip Dates panel?

Response: Yes. Please note that if you sample uncertainty in the age of the fossil taxa then the 41 million you specify here is just a starting value and nothing more.

Site Model Panel

In this panel I:

Chose OBAMA Bayesian Aminoacid Model Averaging
Deselected the options that appeared to be specific to other types of sequences, and not to metazoan nuclear protein coding genes

This is what the Site Model Panel then looked like:

(2) Is it OK to leave Mutation Rate as it is?

Response: Yes. There is only one partition and therefore the partition-specific mutation rate should not be estimated or it will be non-identifiable with the clock rate.

(3) Are the seven models selected reasonable ones to include for an analysis of nuclear protein-coding genes in an animal?

Response: Yes, these are reasonable selections. The unselected models are specific for virus/mitochondria/reverse transcriptase. We note that keeping all models selected would also be acceptable as the data should converge on the better fitting model.

Clock Model Panel

In the Clock Model Panel I selected Optimized Relaxed Clock, and didn’t change anything else, so this is what the panel looked like:

(4) Should I leave Clock rate as it is?

Response: Yes. Estimating the clock rate is appropriate because you have calibration information in the form of fossil taxa.

Priors Panel

And now, to the tough part, the priors. Here’s what the panel looked like at the start:

For the Tree.t prior, I chose Fossilized Birth Death Model. I then opened up that prior, and changed only one value, that for Rho. According to Tracy’s tutorial, rho is the probability of sampling a tip in the present. Presuming that there are about 10,000 living species in this whole group, and that we have sampled 46 of them, then rho should be 0.0046. I used the value 0.005. I set Origin to 120, as that is older than the oldest fossil. The Tree.t prior information thus looked like this:

(5) Is this reasonable? Should I change anything else, especially in the face of my almost complete ignorance of the evolutionary process in this group?

Response: The numbers entered as initial values do not affect the meaning of the model, however tweaking them may be needed to ensure the initial state is valid and can improve the time taken to achieve convergence (get through burn-in) during MCMC. You should choose a prior for your parameters centred around your known value. For example, a (Beta) prior on sampling proportion centred around 5E-4. The starting value can then be any value chosen from the prior distribution.
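The rho arithmetic for the Tree.t prior, and the suggestion to centre a Beta prior on a known value, can be sketched as follows. The species counts come from the post; the helper names and the `concentration` value of 100 are purely illustrative assumptions, and reading "centred" as "has this mean" is only one possible interpretation.

```python
def sampling_proportion(n_sampled, n_living):
    """Fraction of living species that were sampled (rho)."""
    return n_sampled / n_living


def beta_shape_for_mean(mean, concentration):
    """Return (alpha, beta) of a Beta distribution with the given mean.

    `concentration` = alpha + beta; larger values give a tighter prior.
    """
    return mean * concentration, (1 - mean) * concentration


# Whole clade: 46 sampled of roughly 10,000 living species.
rho_full = sampling_proportion(46, 10_000)
# Ingroup only: 42 sampled of roughly 2,000 living species.
rho_ingroup = sampling_proportion(42, 2_000)

# Hypothetical Beta prior whose mean equals the ingroup rho.
alpha, beta = beta_shape_for_mean(rho_ingroup, concentration=100)
print(rho_full, rho_ingroup, alpha, beta)
```

How tight such a prior should be is a judgment call that this sketch does not answer.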

My response: Because I am now sampling only the ingroup (see the next question), I’ve reduced the initial values to reflect this smaller group, with 42 species of a possible 2,000 species sampled, and a younger age for the origin. Here are my revised choices:

(6) One thing I really struggle with here. Is it reasonable to use an FBD model when we know the sampling within the whole clade is not uniform? I sampled the ingroup much more heavily than the outgroup. I sampled only 4 species in the outgroups (of over 4,200 known species), but I sampled 42 ingroup species (of 1,300 known species). How can that differential sampling intensity be considered?

Response: The currently available FBD model does not account for variation in sampling rates among lineages. To minimise the effect of variation in sampling intensity we suggest two strategies: (a) remove the outgroup and run analyses using only the ingroup, (b) sample the outgroup as heavily as the ingroup.
In future we anticipate that extending the multistate birth death package MSBD to handle fossil data would provide a good solution to this problem https://taming-the-beast.org/tutorials/MSBD-tutorial/.

My response: I can’t sample the outgroup as heavily as the ingroup (that would require a huge amount of time and money to do the genomic sequencing), which means I will choose (a), removing the outgroup. That’s OK as I don’t really care about the dating of the outgroup, and all of my fossils are within the ingroup anyway. I also have a good idea as to where the ingroup is rooted based upon other analyses. Until the models can cope with differential sampling, this seems like the best option.

I wonder whether or not the FBD model is really the one I should be using anyway. The sampling of the ingroup was not done randomly; I very specifically chose to sample one or two species from each of the major lineages, so it was much more dispersed than equiprobable sampling.

Update February 2024: Alas, as it turns out running the FBD model in BEAST was not a good idea. For details, and an update on the story of this analysis, see my more recent post, Maddison’s Law of Phylogenetic Analysis.

As I could find no guidance and no reason to change the OBAMA priors, I left them the way they were:

(7) Are these reasonable OBAMA priors?

Response: Yes, these defaults are justified in the OBAMA paper:

OBAMA proportion invariant: Beta(1,4) has a mean of 0.2, and favours lower proportion invariable sites, but still allows large proportions.
OBAMA gamma shape: any shape value <0.1 leads to one or more categories effectively being zero, which is what the proportion invariable category already models if your data contains a significant number of invariable sites. The OBAMA paper has more details/graphs on why the 0.1 cutoff is a practical choice.
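As a quick check on the Beta(1,4) statement above, here is a minimal sketch using only standard formulas for the Beta distribution. The function names are illustrative, and the closed-form CDF used here holds only when the first shape parameter is 1.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_cdf_alpha_one(x, beta):
    """CDF of Beta(1, beta), which has the closed form 1 - (1 - x)**beta."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return 1.0 - (1.0 - x) ** beta


print(beta_mean(1, 4))              # 0.2
print(beta_cdf_alpha_one(0.5, 4))   # 0.9375
```

So under Beta(1,4) most of the prior mass sits below 0.5, but a proportion of invariable sites above 0.5 is not ruled out.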

Similarly, I could find no guidance and no reason to change the Optimized Relaxed Clock priors, and so I left them the way they were:

(8) Are these reasonable ORC priors?

Response:

ORCRates: Yes. This is the prior distribution of the branch rates under the relaxed clock model, which are assumed to be drawn from a lognormal distribution with a mean of 1 (in real space) and a standard deviation of ORCsigma, below.
ORCsigma: Yes. This default prior is reasonable for the general case. A standard deviation of 0.1 – 0.6 allows the branch rates to vary somewhat, but if this term exceeds 0.8, then this indicates the data are non-clock like in which case many vastly different trees may explain the data. This prior is centred around a realistic range of values.
ORCucldMean: No, this is not a good prior choice. This is the clock rate prior. It should be informed by known related systems, and in the case of this beetle dataset it will be expressed in units of substitutions per site per million years. Since you have a calibrated analysis, you could assume a relatively uninformative broad log-normal prior on this parameter quite safely.

For the diversificationRateFBD.t prior, Tracy says in her FBD tutorial “Generally, we think that this value is fairly small, particularly since we have few extant species and many fossils. Therefore, an exponential distribution is a reasonable prior for this parameter as it places the highest probability on zero”. In my case there are lots of extant species, and few fossils. I have no idea what to choose, but I was advised “even in large clades, this is probably a relatively small value and an exponential distribution should be fine. If the actual rate is higher, an exponential doesn’t truncate higher values, just puts less weight on them.” I thus chose the default exponential:

(9) For a group like this with many species and few fossils, is a default exponential a reasonable choice for this prior?

Response: To assess whether the choice of prior is appropriate, you can look at how the posterior changes compared to the prior, or conduct a sensitivity analysis using different priors for the diversification rate.

For the originFBD prior, I was advised that a lognormal with a wide variance, with its peak where I guess it should be, would be reasonable. So I chose that, setting the lower bound to 99 (as that is the age of the oldest fossils within the entire group) and the upper bound to 250, as, based on the dating that has been done on Coleoptera diversification, 250 is surely beyond the maximum age of the entire clade (outgroup+ingroup); I set the initial age to 110.

I then set the offset to 99, the M to 120, and the S to 1. That gave a curve that feels reasonable for the origin time of the whole clade. That’s not based on clear evidence, but hopefully the curve is flat enough to be acceptable.

(10) Should I really be using a lognormal here? A uniform would also be possible, set between 99 and 250, but it does seem better to put more of the prior distribution at lower values. But how do I choose details of the lognormal? Does it matter?

Response: Hard boundaries reflect 100% certainty the value will not exceed those boundaries. We suggest using a smoother left skewed distribution with (99, 250) in the 95% interval rather than hard boundaries.
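One way to see what interval a given offset lognormal actually covers is to compute its quantiles directly. The sketch below is an assumption-laden illustration, not BEAST code: it assumes the prior is parameterised by a log-space mean M (or, if `mean_in_real_space` is set, a real-space mean that is converted to log space), a log-space standard deviation S, and an additive offset. The function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def offset_lognormal_quantile(p, m, s, offset=0.0, mean_in_real_space=False):
    """Quantile of offset + LogNormal(mu, s).

    If mean_in_real_space is True, m is the mean of the lognormal itself
    and is converted to the log-space mean mu = ln(m) - s^2 / 2.
    Otherwise m is used directly as the log-space mean (an assumption).
    """
    mu = math.log(m) - 0.5 * s * s if mean_in_real_space else m
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return offset + math.exp(mu + s * z)


def central_interval(m, s, offset=0.0, mean_in_real_space=False, mass=0.95):
    """Lower and upper ends of the central interval holding `mass`."""
    tail = (1.0 - mass) / 2.0
    low = offset_lognormal_quantile(tail, m, s, offset, mean_in_real_space)
    high = offset_lognormal_quantile(1.0 - tail, m, s, offset, mean_in_real_space)
    return low, high


# Hypothetical values, chosen only to show the call.
low, high = central_interval(m=30.0, s=0.5, offset=99.0, mean_in_real_space=True)
print(f"95% of the prior mass lies between {low:.1f} and {high:.1f}")
```

Trying a few values of M and S this way gives a feel for which settings put roughly (99, 250) inside the 95% interval without imposing hard bounds.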

My response: I modified the log normal so that it doesn’t have hard boundaries. Also, because I am now including only the ingroup, the values have changed: the peak of the distribution is now much lower. Here’s what I now have:

I’m rather lost about what to choose for the samplingProportionFBD. Tracy’s tutorial says “The sampling proportion is the probability of observing a lineage as a fossil before that lineage goes extinct.” There are about 10,000 living species, and we have only three fossils. So, as the sampling proportion is very small, I chose an exponential distribution, with an initial value of 0.0005, and a mean of 0.01:

(11) Well? Reasonable choices?

Response: Sampling proportion FBD = the probability of sampling prior to death = sampling rate/(sampling rate + death rate). An uninformative prior on this parameter would be a uniform prior between 0 and 1. An informative prior would be some form of Beta prior that was biassed towards the 0 end, since you only have 3 fossils in your analysis compared to many more unsampled fossil species.
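The definition in this response can be written as a short sketch, together with one possible way of picking a Beta prior that leans towards zero. Both function names are illustrative, and reading “centred” as the prior mean is an assumption made here, not something stated in the response.

```python
def sampling_proportion(sampling_rate, death_rate):
    """Probability a lineage is sampled as a fossil before it goes extinct."""
    return sampling_rate / (sampling_rate + death_rate)


def beta_b_for_mean(mean, alpha=1.0):
    """Second shape parameter b such that Beta(alpha, b) has the given mean.

    Uses mean = alpha / (alpha + b), so b = alpha * (1 - mean) / mean.
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    return alpha * (1.0 - mean) / mean


# A sampling rate one third of the death rate gives a proportion of 0.25.
print(sampling_proportion(1.0, 3.0))  # 0.25

# Shape parameter for a Beta(1, b) prior whose mean is 5E-4.
print(beta_b_for_mean(5e-4))
```

With alpha fixed at 1 the density is highest at zero and falls away from there, which matches the idea of a prior biased towards the 0 end.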

My response: The sampling proportion for the ingroup is 3 fossils in perhaps 2,000 species, so about 0.0015; I used that as my initial value and the center for the prior distribution. In the response to question 5, the Bayesian gurus suggested “a (Beta) prior on sampling proportion centred around 5E-4”. Here’s what I now have (but with a different center, around 0.0015, because of the removal of outgroups):

For turnoverFBD, I used a uniform 0 to 1 distribution, as in Tracy’s tutorial:

Each of the three taxon sets (one for each fossil) contains the set of species that form the smallest clade that we confidently believe contains that fossil (the fossil is also included in the set). Following Tracy’s tutorial, I assigned the prior for each of these taxon sets to follow a uniform distribution with the age range of the fossil contained in that taxon set. So all three taxon set priors then looked like this:

I think we are done with the priors! Here’s an overview of the BEAUTi Prior panel at the end:

At this point I feel about the same as I feel when I have got to the last page of my tax forms…

The MCMC Panel is more in my comfort zone, and I changed only two things: (1) chain length (which I increased to 100M, just in case), and (2) number of initialization attempts (to 100).

So I saved the file in BEAUTi, and tried to execute it in BEAST 2.7.3 with the following command:

[path to beast] -seed 22 -working -threads 8 -instances 8 [path to xml file]

Everything started up fine, but then I got a notice about a failure to find an initial starting point:

===============================================================================
Start likelihood: -Infinity after 100 initialisation attempts
P(posterior) = -Infinity (was -Infinity)
  P(prior) = -Infinity (was -Infinity)
    P(FBD.t:46Taxa_Occ66_AA_AllLoci) = -297.30215228670386 (was -297.30215228670386)
    P(OBAMA_PropInvariablePrior.s:46Taxa_Occ66_AA_AllLoci) = 1.0702128141464131 (was 1.0702128141464131)
    P(OBAMA_freqsPrior.s:46Taxa_Occ66_AA_AllLoci) = 53.71197185321927 (was 53.71197185321927)
    P(OBAMA_GammaShapePrior.s:46Taxa_Occ66_AA_AllLoci) = -1.0 (was -1.0)
    P(ORCRatePriorDistribution.c:46Taxa_Occ66_AA_AllLoci) = -410.92248229518384 (was -410.92248229518384)
    P(ORCsigmaPrior.c:46Taxa_Occ66_AA_AllLoci) = 1.3628558876856076 (was 1.3628558876856076)
    P(ORCucldMeanPrior.c:46Taxa_Occ66_AA_AllLoci) = -2.7232296703330143 (was -2.7232296703330143)
    P(diversificationRatePriorFBD.t:46Taxa_Occ66_AA_AllLoci) = -1.0 (was -1.0)
    P(originPriorFBD.t:46Taxa_Occ66_AA_AllLoci) = -5.102121215690426 (was -5.102121215690426)
    P(samplingProportionPriorFBD.t:46Taxa_Occ66_AA_AllLoci) = 4.555170185988091 (was 4.555170185988091)
    P(turnoverPriorFBD.t:46Taxa_Occ66_AA_AllLoci) = 0.0 (was 0.0)
    P(Eupetedromus.prior) = -Infinity (was -Infinity)
    P(Ocydromus_SuperSeries.prior) = NaN (was NaN)  **
    P(Philochthus.prior) = NaN (was NaN)  **
  P(likelihood) = NaN (was NaN)  **
    P(treeLikelihood.46Taxa_Occ66_AA_AllLoci) = NaN (was NaN)  **

java.lang.RuntimeException: Could not find a proper state to initialise. Perhaps try another seed.
See http://www.beast2.org/2018/07/04/fatal-errors.html for other possible solutions.
  at beast.base.inference.MCMC.run(Unknown Source)
  at beastfx.app.beast.BeastMCMC.run(Unknown Source)
  at beastfx.app.beast.BeastMain.main(Unknown Source)
  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
  at java.base/java.lang.reflect.Method.invoke(Unknown Source)
  at beast.pkgmgmt.launcher.BeastLauncher.run(Unknown Source)
  at beast.pkgmgmt.launcher.BeastLauncher.main(Unknown Source)

ARGGGHHHH!

I presume this is because there is some contradiction in the priors I chose. I’ll work on that issue (but if anyone has suggestions, they would be welcome!).
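When hunting for the culprit in output like this, it can help to list only the terms whose value is not a finite number. The following is a minimal sketch that assumes the log lines have the `P(name) = value (was ...)` shape shown above; the function name and regular expression are my own and are not part of BEAST.

```python
import math
import re

# Matches "P(name) = value" and captures the name and the value token.
TERM = re.compile(r"P\(([^)]+)\) = (\S+)")


def non_finite_terms(log_text):
    """Return (name, value) pairs whose value is -Infinity, Infinity or NaN."""
    bad = []
    for name, value in TERM.findall(log_text):
        # float() accepts "-Infinity" and "NaN" as well as ordinary numbers.
        if not math.isfinite(float(value)):
            bad.append((name, value))
    return bad


sample = """\
P(prior) = -Infinity (was -Infinity)
P(turnoverPriorFBD.t:46Taxa_Occ66_AA_AllLoci) = 0.0 (was 0.0)
P(Eupetedromus.prior) = -Infinity (was -Infinity)
P(Philochthus.prior) = NaN (was NaN)  **
"""
for name, value in non_finite_terms(sample):
    print(name, value)
```

In the output above, the individual terms that are not finite are the three taxon-set priors, which points at the fossil calibrations rather than at the other priors.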

Response: Finally, the error message comes from your set up of the fossil calibrations. There are two logically distinct things that you want to specify for each fossil:

Age uncertainty: specify the geological age uncertainty of the fossil.
Topological constraint: specify the extant taxa that this fossil should be grouped with.

These two things need to be done with two different priors in BEAUti. So for three fossils you will specify 6 priors. For each of the fossils you will need to specify two things:

A ‘Sampled ancestor MRCA Prior’ containing only the fossil taxon (with tipsonly checked) and with a uniform distribution describing the geological age range of the fossil. This prior option is available in BEAUti with the latest version of the sampled-ancestors package when adding priors in the priors panel.
An ‘MRCA Prior’ containing the fossil and the related extant taxa, with tipsonly unchecked and monophyletic checked. There is no age distribution associated with this prior, it is just used to maintain the topological constraint that the fossil must stay within the extant group it is associated with.
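The pairing described above (two priors per fossil, six in total for three fossils) can be laid out as plain data. This is only a bookkeeping sketch in Python dictionaries, not BEAST XML or any BEAST API; the function name, the dictionary keys, and all taxon names and ages in the usage example are hypothetical placeholders.

```python
def fossil_calibration_priors(fossil, extant_relatives, min_age, max_age):
    """Describe the two priors needed for one fossil."""
    # Prior 1: age uncertainty, applied to the fossil tip alone.
    age_prior = {
        "kind": "Sampled ancestor MRCA Prior",
        "taxa": [fossil],
        "tipsonly": True,
        "distribution": ("uniform", min_age, max_age),
    }
    # Prior 2: topological constraint only, with no age distribution.
    clade_prior = {
        "kind": "MRCA Prior",
        "taxa": [fossil] + list(extant_relatives),
        "tipsonly": False,
        "monophyletic": True,
        "distribution": None,
    }
    return [age_prior, clade_prior]


# Placeholder fossils, relatives and age ranges, for illustration only.
fossils = [
    ("fossil_A", ["species_1", "species_2"], 10.0, 20.0),
    ("fossil_B", ["species_3", "species_4"], 30.0, 40.0),
    ("fossil_C", ["species_5", "species_6"], 50.0, 60.0),
]
all_priors = []
for name, relatives, youngest, oldest in fossils:
    all_priors.extend(fossil_calibration_priors(name, relatives, youngest, oldest))

print(len(all_priors))  # 6
```

Keeping the age prior and the clade prior as separate entries mirrors the point of the response: the age range belongs to the fossil tip, while the monophyly constraint belongs to the clade that contains it.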

My response: At first I was a bit confused by this. It appears that the MRCA Prior is the one that I get by default in the BEAUTi interface when I defined the taxon sets in the NEXUS file that I loaded into BEAUTi to begin with. That is, the three priors that appear in BEAUTi (see below) based upon my pre-defined taxon sets are MRCA Priors; I can tell this not because it is indicated in the BEAUTi interface, but because that is what they are called in the XML file that BEAUTi saves.

To add the Sampled ancestor MRCA Prior, I touched on the +Add Priors button at the bottom of the Priors panel, and chose Sampled Ancestors MRCA Prior from the dialog box that appears. (I get this only with the latest version of the SA package.) I then defined the taxon set to include only the fossil for that group, and set the prior to have a uniform distribution with the lower and upper bounds indicating the uncertainty in the fossil age.

For example, for one of the fossils, the following two priors were present, one for delimiting age uncertainty, and one for defining the clade:

The first one is the Sampled Ancestors MRCA Prior; the second one is the MRCA Prior.

There’s one more change I had to make. Because I removed all of the outgroups, I needed to enforce the rooting of the ingroup. Fortunately, from likelihood analyses, I am pretty confident that I know where the root of the ingroup goes: there is one species included among the 42 that is the sister group to all the rest. So I made a taxon set that includes all of the other 41 taxa, and made an MRCA Prior for it:

I made these changes, and ran it again. And now it works!

Any additional suggestions for improving the choices I made in priors and elsewhere would be most appreciated. You can leave a comment here, or you can email me at david.maddison@science.oregonstate.edu. Thanks!

Thanks so much, Kylie Chen, Alexei Drummond, Remco Bouckaert, and Walter Xie, for responding with answers to my queries!

Update February 2024: This analysis proved to be flawed because of the presumption of random sampling by BEAST’s FBD model. For an update on the story of this analysis, see my more recent post, Maddison’s Law of Phylogenetic Analysis.

Posted in Phylogenetics | 2 Comments

Who are those people?

Posted on 4 May 2022 by David Maddison

In 2019, I went to Vancouver, Canada, to spend some time with family. While there, I went into a funky little store called Urban Source. It’s a great place that has all sorts of odds and ends that might be useful in various creative enterprises, including crafts. They had a bin there of old photos – a random collection of various vintages, from film cameras of the past. I presume they came from some estate sale, garage sale, or something of the sort. Many of them were pictures of people on vacation, or at home with friends, or of scenery, or of buildings that meant something to someone.

They were exactly the sort of photograph that someone might put on their fridge. Which got me to thinking… imagine looking at your fridge and seeing photographs of people you didn’t recognize. Wouldn’t that be weird? And maybe a little disturbing? And perhaps… amusing?

So, I purchased a stack of them.

A short while later, I found myself in a friend’s house in Edmonton, Alberta, staying for a few days. For the sake of convenience, let’s call this friend “John”. One evening, when everyone but me was occupied elsewhere, I snuck one of the photographs onto the side of the fridge. Here it is (with a yellow arrow pointing at it), in situ:

“John” also has a music room in his house, with a little corkboard full of pictures, which meant I just had to sneak one on there, too:

(The one I added is the bottom one, with the bells.)

I like to think that one day, several months later, “John”, or someone else in the family, looked at the fridge, or at the corkboard, and scratched their head and thought, “who is that?” Maybe it was a mystery for weeks in the family. I got quite the chuckle out of imagining their confusion.

Alas, I didn’t really plan what to do next. How would I ever find out what happened? What if they never noticed those pictures? Should I spill the beans to “John”? Or should I stay silent, and hope that someday he would notice?

So here I am, nearly three years later, and there is silence. No word at all from Edmonton about the mysterious photos. Have they not noticed? Or did they notice but just never thought it might have been me? I now realize that to some extent the joke is on me, for they might be living their lives normally, without mystery, and instead I am the one who has had to live with the mystery about the fate of those photos, without closure. Or maybe “John” did notice, and knows it was me, and is amused about the fact that I am living without closure. Ha!

I might note, in ending, that I still have many more photographs from Urban Source. Who knows where they might end up?

Posted in fun stuff, Musings | 4 Comments

Pointing and Pinning Block v6

Posted on 21 September 2021 by David Maddison

I’ve finally revised the 3D design of pointing and pinning block mentioned in my previous post. This new version of the block has the standard holes of a pinning block (in the center of each level) as well as holes around the outside.

Pointing and pinning block, printed by Shapeways.

The holes near the edge of each level are for insects glued onto standard points (small paper triangles). By having the holes near the edge, the end of the point with the insect on it can hang over the edge.

This version was designed by me and Julia Amerongen Maddison. I used TinkerCAD for making the 3D file; TinkerCAD was a joy to use. Some test versions of this were printed with the help of Dan Roach – thank you, Dan! And thanks as well to Pamela Triplett for connecting me with Dan.

The 3D files are available at:

TinkerCAD: https://www.tinkercad.com/things/1q0lstotWwP-pointing-and-pinning-block-v6
Thingiverse: https://www.thingiverse.com/thing:4970194 [There is a problem with Thingiverse, and this link is currently broken]

Posted in Miscellany, Morphological Techniques | Tagged entomology, insect collection, specimen preparation, taxonomy | 2 Comments

Permanent male genitalic mini-preps

Posted on 3 February 2021 by David Maddison

For many insects, examination of the genital structures is vital to understand species diversity. The genitalia of each species are often distinct, and species can be easier to distinguish using genitalia than other morphological traits. In carabid beetles, male genitalia tend to have more variation than female genitalia, and so we more often study the former. I’ve shown examples of the differences in male genitalia in my posts about “Morphological subtleties and the value of n>1”, Bembidion subgenus Lindrochthus, and Bembidion kuprianovii.

When one dissects male genitalia out of a small beetle, one needs a place to store the very small genitalia. A common means of storage is to place it in a drop of glycerol within a very small glass or plastic microvial with a cork or rubber stopper. One can then pin the microvial (through its stopper) on the same pin as the beetle. This has the great advantage that the genitalia is stored with the rest of the beetle, and they are less likely to be disconnected in the future. (One can store the genitalia separately, on microscope slides for example, but that risks their being irretrievably disconnected.)

A specimen with male genitalia stored in glycerol in a microvial pinned beneath the specimen.

Storing the genitalia in a microvial under the specimen works well, and allows a great deal of flexibility, but it has several problems. First, it’s a pain to get the very small genitalia of something like a Bembidion out of the microvial, and the very process of taking the genitalia out or putting it back in can lead to damage or loss. Second, every time one wants to compare two specimens, one needs to go through the lengthy process of getting the specimens placed onto microscope slides or the like. Third, given that the specimens are in a fluid, one has to handle them more delicately, and it is harder to compare a large number of specimens at once. Finally, the refractive index of glycerol is far enough away from that of insect cuticle that it is harder to see relevant structures than if the genitalia were in a medium that has a refractive index closer to cuticle.

The genitalia of the carabid beetles I study are relatively flat. We typically study them by looking at their left or right sides, which are the flat sides. Because of this, there’s an alternative to storage in microvials that works quite well: mounting the genitalia in a semi-permanent mounting medium such as Euparal on cards that are pinned with the specimens, as shown in the following picture.

A specimen with male genitalia mounted in Euparal, and pinned beneath the specimen.

Here’s what the card looks like on its own:

Before I outline how these are made, I should point out that there are other ways to store the genitalia, including embedding them in a drop of water-soluble glue on the same point that the beetle is glued to (this was Carl Lindroth’s approach) or embedding them in a drop of Euparal on a clear sheet of acetate pinned along with the specimen. The approach Carl Lindroth used was not all that much different from placing them in a vial, in that the genitalia need to be extracted from the glue before they can be properly studied. I find placing them in a drop of Euparal on an acetate card also less than ideal, in part as it is more difficult to write a voucher code on the card (which means you can’t easily compare many at once under the scope as you might get them mixed up). There are surely improvements that could be made to the way I make the preps, but I find the method works well enough.

Making genitalic mini-preps

The following instructions presume that you already have dissected out the genitalia, and have cleared it appropriately, and that it is now stored in 100% or 95% ethanol.

Materials you will need

heavy archival paper
a printer (if you are going to print a design)
12 mm circular cover slips. I purchase item number CLS-1760-012 from Chemglass. I typically store them in a clean plastic box, for easy access – it will be important to be able to access them quickly, sometimes with forceps that have a bit of Euparal on them.
Euparal. Euparal can be very fluid and not very viscous when brand new. This is not ideal, as that means it contains a lot of solvent. The ideal Euparal should be much more viscous. I have not fully explored how to make the Euparal more viscous quickly.
100% ethanol (ideally), or 95-96% ethanol
very sharp, clean forceps
Kimwipes or other tissue
a slide warmer with a cover
dissecting microscope with light
flat, long “trays” for holding cards as you process them. You can use microscope slides for this, but I prefer longer, less slippery trays. The ones I use are shown below, and are about 12 cm x 2.5 cm. I made mine out of discarded plastics.

12cm x 2.5 cm plastic trays used to carry cards during processing.

Preparing the cards

If you wish to use a card design, print it on archival, heavy paper. Here’s part of one of the designs I use:

I use Arches hot-pressed, 300 g/m² (140 lb), watercolor paper. Below is a link to a template that you can use to print on 23 cm x 31 cm (9×12) sheets of the watercolor paper. (Note that when I print this template, I scale the printing to 103%.) I print on a Canon inkjet printer that has archival ink.

gencardtemplate Download

Cut the cards out. Then use a standard hole punch to punch out the circles.

Now take each individual card, and turn it over such that the printed side is uppermost. This will be the side that will have any writing on it, and it is the side that shows the primary view of the genitalia; it will also be the side that is eventually topmost. Using your sharp forceps, dip into the thick Euparal, and get a drop in between the tongs; use that to apply a ring of Euparal around the hole. Use viscous Euparal; thin Euparal will contract too much, forming bubbles, and eventually no longer stick to the cover slip. Even with fairly viscous Euparal, it is a good idea to let the ring of Euparal dry a bit before putting on the cover slip.

Ring of Euparal on the card that is drying before cover slip is placed on it.

Once the Euparal has dried to be very thick, place the coverslip on the card so that it is glued over the hole by the Euparal.

Here are some things to watch out for:

Make sure the cover slip doesn’t stick out over one edge of the paper – that will make it more likely the cover slip will eventually come off or break
Don’t put so much Euparal on that it covers the regions of the paper you plan to write on. If you do, the ink you write with may not absorb into the paper, and will be more likely to rub off
Again, use viscous Euparal

Now place the card with cover slip onto your slide warmer. Don’t worry if there is a bit of Euparal that leaks down onto the slide warmer. Leave it there for at least 3 or 4 days, then flip over the card so that the cover slip is against the slide warmer. At this point it will be important that the slide warmer is covered, so that dust doesn’t get into the well of the card. Leave the card on the slide warmer for at least a week (ideally longer) to ensure the Euparal is dry.

Cards with one cover slip drying on slide warmer

In the following figure, you can see the difference between cards whose coverslip was connected with different thicknesses of Euparal. On the left is a card for which thin Euparal was used to connect the coverslip; if you look at it at an angle, with light reflecting off the glass, you can see that most of the space under the cover slip is air, not Euparal; the cover slip will surely just fall right off if any force is applied to it. The middle card has mostly Euparal connecting the cover slip to the card; that one should be OK. The one on the right should definitely be good.

With the card completed, if you place it upside down (so that the cover slip is on the bottom), then the card is like a little well slide, with the cover slip forming the bottom of the well, and the sides of the paper hole being the sides of the well. It is into this well that you will place the genitalia.

Adding the genitalia to the card

Once the Euparal on the card is dry, you can add the genitalia to it.

Add the genitalia in standard places and in standard orientations; this is especially important for the parts you will regularly examine. Because the left side of carabid aedeagi is the side most often viewed, I put that side down against the glass. That way, when the preparation is complete, the part we want to view the most will be right up against the glass.

Under the microscope, I extract the genitalia from the ring sclerite, and remove the parameres from the aedeagus in a separate little dish containing 100% ethanol. Once the pieces are all ready, I dip the sharp forceps into a vial of Euparal, and put the drop in the well of the card (which, of course, is now upside down so that the cover slip is against the stage of the microscope). Without cleaning the forceps, I then grab the genitalic parts, ideally all at once, in the forceps (the Euparal on the forceps also helps pick things up), and then put them into the drop of Euparal in the card. I then position all the parts in the well. As you are doing this, consider the following:

Don’t use too big of a drop of Euparal. If you use a lot, then you will not be able to easily position the aedeagus, as it will float around. Use as little as possible so that there is a layer of Euparal over the parts. The parts can stick up; that’s OK.
If you move each piece individually, ethanol will be transferred each time, making the Euparal very liquidy, and potentially dissolving the Euparal that is holding the coverslip onto the paper. This should be avoided
Make sure you use clean forceps.

Filling it with Euparal

Add layers of Euparal, slowly building up the embedding until the well is just full. This may take ten to twenty or more layers. Here are some important things to consider:

As you move slides to and from the slide warmer, clean the sharp forceps. Many surfaces (including those of some slide warmers) will flake off slightly, and those particles can then get into your Euparal and make your preparations dirty.
Make sure you clean off any dust on the surface of the dried Euparal before you add a new layer. You can do this with a clean, small paint brush
Add thin layers. If you add thick layers, then the thicker layers will contain enough solvent to make it more likely the solvent will soften the dried layers the genitalia are in, causing them to float and move.
If the Euparal you are adding is thin, and not viscous (i.e., it contains more solvent), then the layers should be thinner.
If the genitalia are large they may be less likely to float around, and so you may be able to get away with thicker layers.
If the side of the genitalia that is against the glass cover slip is flat, then it is a bit less likely to move around. If the side of the genitalia is rounded, then it can be more likely to move around, and you then need to be even more careful with your layering – make them very thin.
Keep the temperature on the slide warmer low – ideally about 35°C. If it is much hotter than that, the Euparal will become softer, and the genitalia will be more likely to float around.
Place a note next to the cards on the slide warmer saying when the last layer was added.
You want the Euparal to dry between layers, so you will likely want to add new layers no more often than once a week.
Don’t overfill the well. The Euparal in the end should just be flat – not convex. (But note below that sometimes I use much less Euparal – see the next section.)

After adding a layer, put it back onto the slide warmer. Once the well is full (if that is your goal), leave it on the slide warmer for at least a month (I usually leave it for at least 3 months), to ensure the Euparal is very dry.

Note that you can photograph the genitalia well before the preparation is complete.

Adding the second cover slip – or not

Typically I add a second cover slip once the well is full to the brim of dried Euparal. The main point of that second cover slip is both to protect the specimen and to make the preparation easier to clean. If dust gets on an open Euparal surface, it can be hard to clean. A clean dry brush can work, but a brush dipped in ethanol usually makes the surface uneven and cloudy, which can be resolved with another layer of Euparal. If the genitalia are really small (e.g., from a 1.5 mm beetle) or very tubular, the danger of their shifting orientation as you add layers is much higher. You may not want to complete the process, and instead simply add enough layers to have a flat surface above the beetle bits, and let it harden, without ever filling it full enough to add the second cover slip.

If you are going to add the second cover slip, you will need to fill up the depression, and you will first need to prepare the cards. Examine the areas around the edges of the hole under the microscope, and see if there are bumps of dried Euparal or other protuberances that will force the coverslip to be raised off the flat surface. If so, scrape them off as best as you can (this can be done with your sharp forceps). You will then need to brush off the surface of the Euparal with a clean, dry brush, to get rid of any dust and lint that has settled there.

Once the surface is clean, you are ready to add the second cover slip. Place a large drop of viscous Euparal onto the preparation (again, I use clean, sharp forceps to do this). You can then pick up the cover slip with forceps, and gently lower it onto the drop (do this at an angle by first having one edge of the cover slip make contact with the paper). If the Euparal is very viscous, enough to cause bubbles to get trapped as you lower the cover slip onto it, then you may wish to dip your forceps into 95-100% ethanol, and touch them to the surface of the Euparal. This will spread a thin layer of ethanol onto the surface, and allow the coverslip to be placed without bubbles. If the Euparal is quite viscous, you may need to push the coverslip down a bit.

You should then look at the preparation from an angle, with light reflecting off the glass (as you did above when you first made the card), and see if Euparal fills the contact area between the paper and the cover slip. If not, add Euparal drops to the edges of the cover slip, and encourage the Euparal to seep under the coverslip. This should also be relatively viscous Euparal.

Check the preparation every few hours for the first day, and add Euparal to the edges of the cover slip as needed. Then check it once a day for at least three days. Once it seems to stabilize, leave it somewhere at room temperature for at least three months before pinning it beneath the specimen.

What to do if it all goes wrong

Sometimes it goes wrong. The most common problem is that the Euparal the genitalia are in softens after a new layer is added, and the genitalia float around and are no longer in the correct orientation. This is frustratingly common, and seems to be usually caused by adding too much Euparal at one time for one or more of the layers. If so, you can simply start again. Place the card in a little dish with 95-100% ethanol, and after a couple of hours, the Euparal will have dissolved, and it will all come apart, and you can do it all again.

Posted in Morphological Techniques, Revising Bembidiina

Development of the Dicaelus purpuratus drawing

Posted on 1 February 2021 by David Maddison

In 2006, for George Ball’s 80th birthday, I presented him with a drawing of Dicaelus purpuratus. This was the species that captivated his attention and eventually led to his doing a PhD on the carabid beetle tribe Licinini, of which Dicaelus is a member.

Here’s the final drawing:

Here’s where I did the drawing

I’ve written a post about the drawing method I use, which has varied slightly over the years. For the Dicaelus, I began by enlarging a photograph of the beetle, and printing it on paper.

I then used a soft graphite pencil, and rubbed the back of the paper. That turned it into carbon paper that I could use to transfer some aspects of the photograph onto the Arches Hotpress watercolor paper I use.

I did the transfer by drawing over the photograph using a sharpened 6H pencil. Here’s what was transferred. Note that I have also put a mask over much of the paper.

I then refined the pencil sketch.

The inking started, as well as the colored pencil.

More ink outline, and more colored pencil:

Eventually, once the colored pencil layer was refined, I started adding inks on top of it:

Posted in Scientific Illustrations | Tagged colored pencils, Dicaelus, George Ball

Ten years later: the Lionepha paper is out

Posted on 3 April 2020 by David Maddison

In September of 2009, I arrived in Oregon, excited to begin my new position at Oregon State University. I was also excited to live near Marys Peak, as the top of Marys Peak was the locality of capture of the only known specimen of what was then called Bembidion chintimini. (I’ve since moved that and related species out of the genus Bembidion, and into the genus Lionepha.) A few days after I arrived in Corvallis, I drove up to the top of that mountain to see if I could find the species again. I found some little beetles that might or might not be Bembidion chintimini, but I was a novice with that group of bembidiines, and so I wasn’t sure. I also found a larger, related specimen lower down the slopes of Marys Peak, near Alder Creek Falls, which confused me as well. My efforts to discover the identity of those specimens took a long time, and the threads I was tugging on as I explored caused an unravelling in my understanding of that group of beetles, which I would eventually re-weave into a new tapestry. In the end, this path led to the discovery of a total of four new species, as well as a merging of three others.

A specimen from Marys Peak of the same species I found in September 2009

In many ways this project was the heart and soul of my first decade in Oregon. I spent many hundreds of hours on this project, traveling and collecting specimens, extracting and sequencing their DNA, making genitalic preparations, studying their structures, photographing them, examining their chromosomes, doing phylogenetic analyses, and, eventually, creating the figures for the paper and writing the text. Perhaps more emotionally compelling was the discovery and solving of many puzzles along the way, especially that revolving around “Bembidion chintimini”. That some of these puzzles were solved by a combination of old-fashioned morphological studies as well as next-generation genomic sequencing of old type specimens, including a 159-year-old LeConte lectotype, made the tale all the more compelling to me.

John Sproul, a former graduate student of mine, helped by sequencing the DNA I extracted from some pinned type specimens (including that LeConte lectotype), as well as by doing important collecting in the Sierra Nevadas of California; for these efforts, he is a co-author of the paper that has finally resulted.

This paper, titled “Species delimitation, classical taxonomy and genome skimming: a review of the ground beetle genus Lionepha (Coleoptera: Carabidae)”, came out recently in the Zoological Journal of the Linnean Society. The paper can be found at https://doi.org/10.1093/zoolinnean/zlz167; if you would like a PDF, email me.

It turns out that I did find a female of “Bembidion chintimini” on Marys Peak that first trip in 2009. It took until the following summer for me to realize that. It took even longer to eventually come to the realization that this species was widespread, and had a much older name (Lionepha erasa LeConte). Here’s a condensed version of the story, as told in the paper:

Investigation of the rarer species, the one here called Lionepha erasa, began in 2010. Dissection of the first recognized males from Marys Peak, Oregon (type locality of Bembidion chintimini) revealed an aedeagus indistinguishable from those from San Juan Island, Washington (type locality of Bembidion lummi). The female holotype of B. chintimini is wingless and has slightly rounded shoulders. However, the Marys Peak population is wing-dimorphic, and winged individuals are in body form no different from the type series of Bembidion lummi. The elytral microsculpture of the holotype of B. chintimini is perfectly isodiametric (against Erwin & Kavanaugh, 1981), thus matching that of B. lummi. Other characters mentioned by Erwin & Kavanaugh as distinguishing the two populations are not consistent with available specimens. The lack of evident morphological differences, combined with effectively identical DNA sequences in specimens from Oregon, British Columbia and Alaska suggested that the Marys Peak populations are the same species as populations further north, and for this reason, Bembidion chintimini and B. lummi were synonymized by Maddison in Kanda et al. (2015).

This left in question the specimens considered to be Bembidion lindrothellus by Erwin & Kavanaugh, which are at first glance similar to the Marys Peak and other populations of ‘Bembidion chintimini’. Specimens classified as Bembidion lindrothellus are reported to be paler, but all specimens mentioned in Erwin & Kavanaugh (1981) are teneral. The unsclerotized aedeagus of the holotype of Bembidion lindrothellus made comparison of internal sac sclerites difficult. However, the internal sac membrane that rests in the left-most position has a species-specific microsculpture in Lionepha, and the microsculpture scales of the holotype of Bembidion lindrothellus from Alaska match those of Marys Peak specimens. A non-teneral male was also collected by Lindroth at the type locality of B. lindrothellus, but was not included in the type series, perhaps as the specimen was housed in Lindroth’s collection in Lund, Sweden. This specimen is presumably the one whose genitalia Lindroth figured as Bembidion brumale (1963: fig. 127f). We have examined that specimen, and it is indistinguishable from specimens of ‘Bembidion chintimini’ from Alaska, British Columbia, Washington and Oregon, including details of the internal sac. Most critically, DNA sequences of the holotype of Bembidion lindrothellus are identical in eight studied genes to those of other specimens from throughout the range (Figs 5–7). It is thus evident that the holotypes of Bembidion chintimini, B. lindrothellus and B. lummi belong to a single species.

However, there is an older name. The type series of Bembidium erasum consists of four females. These specimens have traditionally been considered to belong to the common, widespread species here called Lionepha probata. Females of these two isodiametrically microsculptured species are difficult to tell apart, especially those with less-extreme prothoracic proportions (neither wide nor narrow). Although there are distinctions in the lobe of the female bursa of fully sclerotized individuals, interpretation of tenerals is more tenuous. Specimens in the type series of Bembidium erasum are all teneral, with prothoraces of moderate width, and thus there is no clear morphological evidence to place them to species. The type series was provided by George Suckley (LeConte, 1859), presumably captured during his travels as naturalist for the governor of Washington Territory during 1853–57 (Cooper & Suckley, 1859). The type series is from ‘Oregon’, which at the time encompassed the current area of Oregon, the southern half of what is now Idaho and some parts of Wyoming and Montana (Barry, 1932). Suckley’s travels in Oregon included areas within the range of both species (Cooper & Suckley, 1859), and thus geography provides no clues about species membership. However, DNA data from the lectotype (and two of the paralectotypes; Sproul & Maddison, 2017) makes it clear that these specimens belong to the current species (Figs 5–7; Supporting Information, Fig. S1). Thus, the valid name of this species is Lionepha erasa, with Bembidion chintimini, B. lindrothellus and B. lummi as junior synonyms.
Maddison and Sproul (2020)

That description of the history does not adequately capture all the many mysteries, proposed and rejected explanations, and confirmed hypotheses along my multi-year path, and the eventual pleasure as the hypotheses became confirmed through the emergence of consistent, repeated patterns. It was one of the more pleasing journeys of discovery I have been on. I’ve previously mentioned a few of the turns and twists in the story, in my posts on Rainy-season beetles and surprises in Lionepha.

In addition to solving the mystery of the small beetles on top of Marys Peak, four new species were discovered along the way and described in the paper:

The larger specimen I discovered on my first trip up Marys Peak belongs to a species that lives along the sides of Alder Creek Falls and on rock seeps in the area, as well as along a creek west of Eugene, Oregon, and in the Trinity Alps of California. It is now officially named Lionepha tuulukwa Maddison. I’ve talked about the naming of this beetle here and here.
A species I first became aware of through a single male that John Sproul found along Bishop Creek is now called Lionepha lindrothi Maddison & Sproul.
The species I found in the central Sierra Nevada of California, and which provided another surprise in the group, is now called Lionepha australerasa Maddison.
A species from western Montana and Wyoming, as well as eastern Oregon and Washington, came to light from specimens collected by my good friend David Kavanaugh. This species is now called Lionepha kavanaughi Maddison.

This paper on Lionepha implicitly tells a love story between me and my beetles. When I look at the final product, I see it as a celebration of the process and fruits of discovery. I also see it as a history of my embracing the Pacific Northwest, and the abundant and diverse life that lives here.

Posted in Academia, Fieldwork, Revising Bembidiina, Taxonomic Process | Tagged Lionepha, revisions
