Originally conceived of as a complex digital object comprising audio clips from field dialect recordings coordinated with text files containing analysis on several levels, the Bulgarian Dialectology as Living Tradition project is now being prepared as an interactive database. The central focus remains the collection of interviews, each of which is available both as an audio file and in text format. The texts are currently being transcribed, translated, annotated, and entered into the database. When data entry is completed, tags at both the token level (for linguistic features) and the text level (for discourse and content features) will allow users to extract and compare data on many more levels than has previously been possible.
The database comprises 172 audio clips, excerpted from recordings made by the authors in 66 different Bulgarian villages over the last quarter century. All recordings are of speech in natural conversational settings, and each excerpt is chosen with both form and content in mind. With respect to form, the excerpt illustrates the most salient linguistic features of the particular dialect; with respect to content, it is a well-formed piece of discourse communicating some aspect of village life or of the speaker’s worldview. Wherever possible, excerpts that illustrate methods of field dialectology have also been chosen. Data entry for each excerpt consists of:
- a line-by-line transcription of the text with interlinear translation (“Line view”)
- the same, but with grammatical glosses under each word or token (“Token view”)
- at the token level, identification of the meaning, basic grammatical attributes, and salient linguistic traits
- an index of thematic content according to basic topics
- identification of the gender and age of informants, and the date and location of recording
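The components listed above can be pictured as a small nested record: an excerpt holds lines, and each line holds tokens. What follows is only a minimal sketch of one possible layout, not the project's actual schema. All class names, field names, and sample values (`Token`, `Line`, `Excerpt`, `"form-a"`, `"GLOSS.A"`, and so on) are hypothetical placeholders introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    form: str                        # dialect form as transcribed
    gloss: str                       # grammatical gloss shown in "Token view"
    meaning: str                     # meaning of the token
    traits: frozenset = frozenset()  # salient linguistic traits


@dataclass
class Line:
    tokens: list                     # Token objects, in spoken order
    translation: str                 # interlinear translation of the line

    def text(self):
        """Rebuild the transcribed line from its tokens."""
        return " ".join(t.form for t in self.tokens)


@dataclass
class Excerpt:
    village: str                     # location of recording
    recorded: str                    # date of recording
    speakers: list                   # (gender, age) pairs for informants
    topics: list                     # thematic index entries
    lines: list = field(default_factory=list)

    def line_view(self):
        """Each transcribed line paired with its translation."""
        return [(ln.text(), ln.translation) for ln in self.lines]

    def token_view(self):
        """Each token paired with its gloss, grouped by line."""
        return [[(t.form, t.gloss) for t in ln.tokens] for ln in self.lines]


# Placeholder values only; these are not real project data.
sample = Excerpt(
    village="village-1",
    recorded="date-1",
    speakers=[("f", 70)],
    topics=["topic-1"],
    lines=[
        Line(
            tokens=[
                Token("form-a", "GLOSS.A", "meaning-a", frozenset({"trait-x"})),
                Token("form-b", "GLOSS.B", "meaning-b"),
            ],
            translation="translation-1",
        )
    ],
)
```

In this sketch, “Line view” and “Token view” are two projections of the same stored data rather than separately entered texts, so a correction to a token would appear in both.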
Publishing this material as a database has the following advantages:
- all locations represented are immediately visible as pins on a Google map
- the database can be browsed for any combination of selected phonological, morphological, grammatical, lexical and/or semantic attributes of individual tokens within texts
- all attested representations of any one standard Bulgarian lexeme can be seen at a glance, as well as semantic correspondences between dialect words and their standard equivalents
- the location within a text of any one browsed element or trait can be precisely identified, and the user can then directly access either the text or the audio file in question to see the element in its full context
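Browsing by a combination of token attributes, and reporting where each match occurs, might look roughly like the sketch below. It assumes, purely for illustration, that each token is stored as a dictionary with a dialect `form`, a standard `lexeme`, and a list of `traits`. The function `browse`, the `Hit` record, and every sample value are hypothetical and are not taken from the project's software.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    excerpt_id: str   # which text the match occurs in
    line_no: int      # 1-based line number within the text
    token_no: int     # 1-based token position within the line
    form: str         # the dialect form that matched


def browse(corpus, required_traits=(), lexeme=None):
    """Return the location of every token that meets all given criteria."""
    required = set(required_traits)
    hits = []
    for excerpt_id, lines in corpus.items():
        for line_no, line in enumerate(lines, start=1):
            for token_no, token in enumerate(line, start=1):
                # The token must carry every requested trait.
                if not required <= set(token["traits"]):
                    continue
                # Optionally restrict to one standard lexeme.
                if lexeme is not None and token["lexeme"] != lexeme:
                    continue
                hits.append(Hit(excerpt_id, line_no, token_no, token["form"]))
    return hits


# Placeholder corpus; the values are invented labels, not real data.
corpus = {
    "excerpt-1": [
        [
            {"form": "form-a", "lexeme": "LEX-1", "traits": ["trait-x", "trait-y"]},
            {"form": "form-b", "lexeme": "LEX-2", "traits": ["trait-x"]},
        ],
    ],
    "excerpt-2": [
        [{"form": "form-c", "lexeme": "LEX-1", "traits": ["trait-y"]}],
    ],
}

# Tokens carrying both traits at once.
both = browse(corpus, required_traits=["trait-x", "trait-y"])

# All attested dialect forms of one standard lexeme.
variants = browse(corpus, lexeme="LEX-1")
```

Because each hit records the excerpt, line, and token position, a result could be linked back to the corresponding text or audio excerpt so that the element is seen in context.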
The project's goals are as follows:
- illustrate the diversity of Bulgarian dialects in a more vivid and realistic manner than is currently possible with dialect atlases
- illustrate two important “living traditions” in Bulgaria: that of village life as it maintains its inheritance from the past, and that of Bulgarian dialectology as it documents village speech in its living context
- allow linguistic analysis at a level broader than the lexical or phonetic, by using longer speech samples as the base data, and by cataloguing elements of discourse
- allow comparative access to ethnographic material within the text samples
- ensure a solid representative network of sites, covering all basic Bulgarian dialect types
- allow users direct access to the data, both by making the primary audio files available, and by making the identification of traits and attributes as transparent as possible