In his classic etymological dictionary Shuowen Jiezi, written nearly 2000 years ago, Xu Shen showed how every character can be analyzed by breaking it into component characters, which themselves can be broken down further, so that ultimately only a couple hundred root pictographs and ideographs (wen) generate all of the characters. This website and its associated printed dictionary show this generation process graphically for over 4000 characters using a series of zipu or "character charts/genealogies" that each start with one of the wen from Shuowen Jiezi. Without any system for cross-referencing, Xu Shen had to break his dictionary into manageable sections, starting each one with a bushou or "section heading" (conventionally mistranslated as "radical") that was a component of other characters in that section but not always a root wen. This bushou system has been the organizing principle for almost all subsequent Chinese dictionaries, but it arbitrarily focuses on only a single component of each character. In contrast, the zipu system allows any character to be found if the viewer knows any part of the character or knows any other character that shares a component with it.

Character etymology (ziyuan) is frequently conflated with two other distinct concepts by people who are more familiar with phonetic writing systems. First is the etymology of root words as represented by their pronunciation, which is the main focus of etymology research by modern linguists. Such research allows understanding of how Chinese words evolved even before the introduction of Chinese characters, but is of little practical value to native or foreign Chinese learners. Second is the breaking down of compound words with multiple characters into their component root words/characters, e.g., ziyuan is just "character source". Since most Chinese words are compound words in which each syllable is represented by a character (the thousands of Chinese characters can be combined to make hundreds of thousands of Chinese words), understanding these compound word etymologies is of great practical importance to Chinese learners. Because Chinese has so few foreign loan words, and because Chinese characters carry more detailed information on the component root words than is available from the pronunciation alone, the meaning of multi-character words is usually quite clear as long as one knows the component characters. Indeed, this ability to readily infer the meaning of words from their component characters is probably the greatest strength of Chinese, and is particularly helpful for learning and remembering scientific terms. (In contrast, even the etymology of the English word "etymology" is obscure.) Given the transparency of compound word etymologies when each component character is well understood, traditional Chinese etymology has always focused instead on character etymology as the most important step in learning Chinese.

The zipu follow traditional Chinese etymologies based mainly on the "small seal" characters that were standardized about 2,200 years ago in the Qin Dynasty. Modern researchers have obtained a better understanding of the earlier evolution of characters before they were standardized, but the traditional etymologies are more useful for students and remain the standard reference point for all subsequent research. Moreover, their widespread study over the centuries has meant that the traditional etymologies themselves have affected the usage, survival, and evolution of Chinese characters.

Under the influence of Western linguistics and its focus on spoken language, Chinese authorities in the 1950s and 1960s considered abolishing Chinese characters and ultimately took the less radical approach of "simplifying" several hundred characters. Unlike the last, more systematic simplification in the Qin Dynasty, this simplification focused on reducing the number of strokes in characters rather than on clarifying their semantic and phonetic information. Sometimes older variant forms that were both simpler and had stronger semantic content were adopted, but frequently the semantic and phonetic content of simplified characters was degraded. Moreover, inconsistencies in the way similar characters were simplified broke or weakened the semantic and phonetic links between many characters, thereby degrading this information even in many characters that were left unsimplified. Whether the overall gains and losses made characters easier or harder to learn is unclear, but in any case a rare opportunity to resystematize characters was lost. Because of their clearer etymological relationships, this dictionary focuses on traditional forms. Traditional forms are now permitted in many contexts in mainland China, and are standard in Taiwan, Hong Kong, and many overseas Chinese communities, so serious students of Chinese usually need to learn both forms of characters.

I am an economist who analyzes game theory models of strategic communication. This dictionary is not directly related to my research. Essentially I have just taken the Shuowen's data on the components for each character and run it through a program to generate the trees implied by this data. I have then translated the explanations from the Shuowen and from later commentaries by traditional Chinese sources for each character, and added character and word definitions. The dictionary does not contain original research, but rather it is a demonstration that computerized cross-referencing now makes it possible to more fully implement Xu Shen's original vision for Chinese lexicography. I hope that other printed and electronic dictionaries will similarly be designed to further his vision.
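As a rough illustration of the kind of processing described above (not the program actually used for this dictionary), the following Python sketch turns a table of character components into indented trees. The table `COMPONENTS`, the function names, and the five sample entries are all assumptions chosen for illustration; they are toy data, not records taken from the Shuowen.

```python
from collections import defaultdict

# Toy component table (illustrative only, not the Shuowen's actual data).
# Each character maps to the components it is built from; a character
# with no components is treated as a root (wen).
COMPONENTS = {
    "木": [],
    "人": [],
    "林": ["木"],
    "森": ["林"],
    "休": ["人", "木"],
}


def build_children(components):
    """Invert the table: map each component to the characters that use it."""
    children = defaultdict(list)
    for char, parts in components.items():
        for part in dict.fromkeys(parts):  # drop duplicate parts, keep order
            children[part].append(char)
    return children


def roots(components):
    """Return the characters that have no components of their own."""
    return [char for char, parts in components.items() if not parts]


def render_tree(root, children, depth=0):
    """Return the tree under `root` as indented lines (assumes no cycles)."""
    lines = ["  " * depth + root]
    for child in children.get(root, []):
        lines.extend(render_tree(child, children, depth + 1))
    return lines


if __name__ == "__main__":
    index = build_children(COMPONENTS)
    for root in roots(COMPONENTS):
        print("\n".join(render_tree(root, index)))
```

In this sketch a character with several components appears under each of them, which mirrors the idea that a character can be found from any of its parts.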

Copyright 1996-2017 by Rick Harbaugh. I manage this website in my spare time - please excuse any delays in responding to inquiries. I am currently updating some features on the site (last updated in 2001) so any suggestions are welcome.

