Darwin-L Message Log 7:95 (March 1994)
Academic Discussion on the History and Theory of the Historical Sciences
This is one message from the Archives of Darwin-L (1993–1997), a professional discussion group on the history and theory of the historical sciences.
<7:95>From DARWIN@iris.uncg.edu Thu Mar 31 21:22:42 1994

Date: Thu, 31 Mar 1994 22:22:30 -0500 (EST)
From: DARWIN@iris.uncg.edu
Subject: Re: cladistics & distance data
To: darwin-l@ukanaix.cc.ukans.edu
Organization: University of NC at Greensboro

A few days ago Paul DeBenedictis asked a couple of questions that deserved answers but didn't elicit any at the time. I thought I'd take a shot at them briefly. Paul asked:

  Can one estimate cladistic relationships from distance data?

As Paul correctly noted for those who aren't familiar with this topic in systematics, there are two basic classes of data that systematists might use: distance data and character data. A character datum would be a statement that taxon x has character state k. A distance datum would be a statement that taxon x and taxon y are (say) 3.2 units apart. Character data is usually arranged in a table of taxa-by-characters, and distance data is usually arranged in a table of taxa-by-taxa.

Paul phrases the question correctly when he asks whether distance data can be used to _estimate_ cladistic relationships (the sequence of branching events in the history of a clade). It is best to think of phylogenetic history as a complex branching sequence of events that we wish to estimate. That being the case, I think the answer is certainly yes, distance data can be used to estimate branching sequences, as can character data. The nuts-and-bolts question in any particular case is how well distance methods or character methods estimate a particular evolutionary chronicle.

I could estimate the phylogeny of a collection of taxa by first joining all those whose names begin with "A", and then all those whose names begin with "B", and so on down the alphabet. The resulting tree would be an estimate of the phylogeny of the taxa in question. It would probably not be a very good one, of course.
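The two kinds of tables described above, and the idea of estimating a branching sequence from distances, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the taxa, the character states, the use of a simple count of differing characters as the distance, and the average-linkage joining rule. In practice distance data may be measured directly (as in DNA hybridization) rather than derived from characters, and the tree produced is only one possible estimate, not necessarily a good one.

```python
from itertools import combinations

# Hypothetical character data: a taxa-by-characters table (states are invented).
characters = {
    "A": [0, 0, 0, 0],
    "B": [0, 0, 0, 1],
    "C": [1, 1, 0, 1],
    "D": [1, 1, 1, 1],
}

def hamming(x, y):
    """Number of characters at which two taxa differ."""
    return sum(a != b for a, b in zip(x, y))

# Distance data: a taxa-by-taxa table, here derived from the character table.
distances = {
    frozenset((s, t)): hamming(characters[s], characters[t])
    for s, t in combinations(characters, 2)
}

def leaves(cluster):
    """Flatten a nested tuple of joins into its leaf taxon names."""
    if isinstance(cluster, str):
        return [cluster]
    return [name for part in cluster for name in leaves(part)]

def average_distance(c1, c2):
    """Mean pairwise distance between the taxa of two clusters."""
    pairs = [(s, t) for s in leaves(c1) for t in leaves(c2)]
    return sum(distances[frozenset(p)] for p in pairs) / len(pairs)

def estimate_branching(taxa):
    """Repeatedly join the two closest clusters (average linkage)."""
    clusters = list(taxa)
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [(c1, c2)]
    return clusters[0]

# The nested tuple records the estimated order of joins.
tree = estimate_branching(sorted(characters))
print(tree)
```

The same `estimate_branching` function would accept any taxa-by-taxa table, whatever its source; how well the result matches the true history is a separate question.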
I have found that thinking about phylogenetic inference as the estimation of a sequence of events -- something that Greg Mayer convinced me to do -- is very liberating, as it allows one to escape some of the polemical rhetoric of the older (1960s and 1970s) systematic literature, which was filled with accusations of "my method generates testable hypotheses and yours doesn't", "distance approaches are worthless", etc. All these various approaches can give estimates of phylogeny; the question is which estimates are better, and that is not easily answered, as it depends not just on theory but also on the practice of each individual investigator.

Since I answered Paul's first question as I did, my answer to his second question -- "What makes a technique cladistic?" -- may not be surprising. I have come around to the view that Greg Mayer has expressed here once or twice (he taught me everything I know), that we should take the terms "cladistic" and "phenetic" to refer to intentions rather than particular procedures, types of data, or algorithms. A technique is cladistic if it is used for the purpose of estimating phylogeny. Sibley's intention in his DNA hybridization work is clearly to estimate phylogeny, and so he is using distance data in a cladistic manner. Now whether the phylogenetic estimates he produces are good ones is a separate issue. I have been critical of them here before. But the fact that they may be poor estimates in some cases does not, in my view, make them non-cladistic.

A follow-up for the linguists: Were the techniques of "lexicostatistics" and "glottochronology" that were popular in the 1960s distance techniques? That is, did the lexicostatisticians calculate "distance" values among languages, and use these to reconstruct language history, rather than using individual lexical items (linguistic character data)? Is that also perhaps what Greenberg does with his technique of "mass comparison"?

Bob O'Hara, Darwin-L list owner

Robert J. O'Hara (darwin@iris.uncg.edu)
Center for Critical Inquiry and Department of Biology
100 Foust Building, University of North Carolina at Greensboro
Greensboro, North Carolina 27412 U.S.A.