A method for measuring verb similarity for two closely related languages with application to Zulu and Xhosa
There are limited computational resources for Nguni languages and when improving availability for one of the languages, bootstrapping from a related language’s resources may be a cost-saving approach. This requires the ability to quantify similarity between any two closely related languages so as to make informed decisions, of which it is unclear how to measure it. We devised a method for quantifying similarity by adapting four extant similar measures, and present a method of quantifying the ratio of verbs that would need phonological conditioning due to consecutive vowels. The verbs selected are those relevant for weather forecasts for Xhosa and Zulu and newly specified as computational grammar rules. The 52 Xhosa and 49 Zulu rules share 42 rules, supporting informal impressions of their similarity. The morphosyntactic similarity reached 59.5% overall on the adapted Driver-Kroeber metric, with past tense rules only at 99.5%. This similarity score is a result of the variation in terminals mainly for the prefix of the verb.
Copyright (c) 2019 Zola Mahlaza, Catharina Maria Keet
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.Copyright of all work published here subsists in the authors. While SACJ retains right of first publication, subsequent re-publication is expressly permitted provided the original SACJ publication is acknowledged and cited, according to the terms detailed below. If plagiarism is detected during review, a paper may be summarily rejected and will not be accepted unless even minor infringements are corrected. Should plagiarism be detected after a paper is published, the Editor reserves the right to withdraw a paper from publication. We expect authors to be honest in representing work as their own, and to respect the time and effort our reviewers put in without an undue burden of policing plagiarism, and hence take violations seriously. SACJ applies the Creative Commons Attribution NonCommercial 4.0 License (CC BY-NC 4.0) to all papers published in this journal. Authors who publish with SACJ agree to the following:
- Authors retain copyright and grant SACJ right of first publication. The work is additionally licensed under a Creative Commons Attribution Non-Commercial License that requires others who share the work to acknowledge the work’s authorship and initial publication in SACJ. Should anyone else wish to make commercial use of the work, SACJ cedes the right to the author to negotiate terms and does not expect to be paid any royalties.
- Authors may enter into additional arrangements for non-exclusive distribution of the SACJ-published version of the work (e.g., post it to a repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are required to refrain from posting their work online prior to completion of reviews so as not to compromise double-blind reviewing or confuse plagiarism checks.