Sesowi Tutorial

Sesowi is a language for everyone, everywhere, to learn. Unlike previous attempts at this, which are largely European languages in disguise, Sesowi is a universal common denominator, equally easy for people of all backgrounds. All Sesowi words are composed of only AAA atomic concepts, and yet Sesowi is a complete language. For instance, "snake" is langboi (long animal), "lizard" is kolangboi (arm snake), and "salamander" is lokolangboi (water lizard). Finally, Sesowi does not take itself too seriously. The unofficial mascot of Sesowi is the opossum, or to be precise, one particular opossum who likes to eat grapes behind my house. Interested? Join our Discord here! Also pls note that this website is still under development. Sesowi currently has AAA atoms, CCCC defined compound words, and PPP proper/phonetic nouns, covering a total of EEEE English words.

A brief intro

Sesowi is meant to be the simplest possible language to learn, that is nonetheless fully capable of expressing any idea. It rides on the principles of simplicity and neutrality: not only is it easy to learn (simplicity), but it is easy to learn for speakers of all languages, and doesn't give an advantage to people with a particular native language (neutrality/universality).

Wait, hasn’t this been tried before? There have been a variety of attempted IALs (International Auxiliary Languages) over the years, most famously Esperanto. However, none of these satisfy the basic criteria of universality and neutrality. Well-known IALs include Esperanto, Volapük, Interlingua, Interlingue, Ido, Novial, Lingua Franca Nova, and Idiom Neutral, all of which are essentially simplified versions of Latin. They have their own beauty, but they are far from universal. Even ones attempting to be more global, like Lingwa de Planeta, still have a fundamentally Indo-European phonology and structure. And Lojban is explicitly stated to be impossible to learn perfectly. All of these leave non-European speakers (namely the majority of the world) at a heavy social disadvantage. Sesowi is a more global middle ground, positioned somewhere between Indo-European and Sinitic languages, with influence from Dravidian languages.

A basic test of neutrality: A Mandarin Chinese speaker, an American English speaker, and a Mexican Spanish speaker all learn a language. After three months, are they all at about the same skill level? For all of the above languages, the answer is clearly: no. The Chinese speaker would have much worse proficiency. Simply from the experience of this one hypothetical Chinese person, we can pull up some simple tests that all these languages fail:

These languages claim to be universal but distinguish between he/him/she/her? That's four pronouns where only one is needed, and only one is used in many major languages outside Europe (Chinese, Hindi, Turkish, Bengali, Indonesian, ...).

There is, however, one language that does satisfy all these criteria, and very well at that: Toki Pona. Toki Pona is a work of art done by someone with a deep and universal understanding of languages. However, it has no intention of being an IAL. The goal of Toki Pona is more personal discovery than communication, and more complex or exact topics, like numbers above five, are difficult to express.

Sesowi grammar is uninflecting and relies on word order and part-of-speech markers. It does away with any grammatical element that is not deemed necessary for communication, e.g. tenses, moods like the conditional, animacy, case systems, and gender. It is phonologically simple, lacking tricky elements like consonant clusters, rhotics, or final consonants (except /ŋ/). Needless to say, there are no grammatical exceptions, or nonsense around spelling.

Much of the beauty of Sesowi is in its lexicon. Sesowi words themselves are more like a vague cloud of meaning, having a wide range of meanings, and a single word is often not clear without context.¹ Importantly,² these words do not have a specific part of speech, functioning as verb, noun, or adjective depending on grammatical role. There are only a few hundred of these atoms, but the language makes extensive use of compounding to make more complex concepts.
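To make the compounding idea concrete, here is a minimal sketch that splits a compound into atoms and glosses it. The four atom meanings are only inferred from the snake/lizard/salamander examples in the introduction, and the function names are made up for illustration; the real dictionary may differ.

```python
# Atom glosses inferred from the three example words in the introduction.
# These are assumptions for illustration, not the official Sesowi dictionary.
ATOMS = {"lang": "long", "boi": "animal", "ko": "arm", "lo": "water"}


def split_compound(word, atoms=ATOMS):
    """Split a compound into known atoms, trying the longest candidate first."""
    parts = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in atoms:
                parts.append(word[i:j])
                i = j
                break
        else:
            # No atom in the table starts at this position.
            raise ValueError(f"no known atom at position {i} in {word!r}")
    return parts


def gloss(word, atoms=ATOMS):
    """Return the atom-by-atom gloss of a compound."""
    return " + ".join(atoms[a] for a in split_compound(word, atoms))


for w in ("langboi", "kolangboi", "lokolangboi"):
    print(w, "=", gloss(w))
```

The gloss is deliberately literal: "water + arm + long + animal" only becomes "salamander" through convention and context, which is the point of the paragraph above.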

Sesowi eschews the complex ontologies common in natural languages, in favor of context and pragmatics. To observe this, let us demonstrate how many different sub-classes of words natural languages have — sub-classes that are logically organized but nonetheless not necessary.³ We will illustrate this by comparison to another natural language that lacks many of these sub-classes: the English pidgin dialect Cronch, spoken by a small community in Central California.

In these examples, Cronch, like Sesowi, uses a single word for each concept, even though the meaning of this word is quite different in each context. For instance, “stress” means either “experiencing stress” or “causing stress”. Nonetheless, the meaning is unambiguous in context, and the “correct” word in standard English does not add to comprehension.

Natural languages usually have such complex ontologies of different types of concepts, and each one is treated differently. Adjectives are neatly bundled into causal adjectives (“stressful”), stateful adjectives (“stressed”) and so on; verbs are marked as to whether they are being performed, have been performed, are intransitive, and so on. However, this logical ontology, which is so natural to the human mind, is not usually necessary, given context! And for a learner, it is just extra burden.

Languages are already full of pragmatic meaning that can only be derived from context, but Sesowi opines that they half-ass their pragmatics. Sesowi fully asses its pragmatics.

Another brief intro

Sesowi is a creole language that evolved in the early 2090s in Nigeria as a trade language, largely between the two largest languages of that country, namely Naijá and Mandarin. However, since the port of Lagos is the largest single hub in the global shipping market, it naturally draws influences from other common languages used by maritime traders, like Spanish, Telugu, and Ilocano. The phonology of this language is more or less that of Naijá, though consonant clusters and finals have been dropped, along with the voiced fricatives and affricates, through contact with Chinese, and [f] has been dropped through contact with Philippine languages.

As is common to creole languages, Sesowi has a relatively small core vocabulary, of only around 200 words. However, its Chinese substratum leads to two interesting results: first, like Chinese, it extensively makes use of compounding, to form many longer words; and second, the core words in Sesowi are all exactly one syllable. Despite these two similarities to Chinese, however, the effect ends up being quite different – unlike Chinese, the core vocabulary is very small, so Sesowi words / sentences tend to be more syllables than Chinese; but since it lacks the extensive use of affricates and triphthongs from Chinese, each syllable is much faster to say. The result is something like Greek — a barrage of quickly uttered syllables.

Like its parents, Sesowi has no inflections, and does not differentiate gender or animacy (so he/she/it/this are all the same). In common with Dravidian languages, Sesowi does not differentiate between tenses and modal verbs.

Sesowi has no adjectives; similar to Tok Pisin, it forms them entirely with relative clauses. In fact, according to a strict interpretation, Sesowi has no parts of speech except for nouns and quantifiers. According to a looser interpretation, it has two prepositions, a handful of modal verbs, and a special verb to introduce relative clauses.

Principles of Sesowi

Cronch	English
He was very stress. It because the meeting was stress.	He was very stressed because the meeting was stressful.
A: Sorry for interromp you but there is cat.	A: Sorry for interrupting you but there is a cat.
B: It ok, cats very interromp.	B: It’s ok, cats are very interruptive.
You want I sombscreen you, or are you already sombscreen?	Do you want me to put sunscreen on you, or do you already have sunscreen on?

Principle of similar proficiency

Speakers should have similar proficiency, regardless of their background, so one group does not have an unfair advantage.

Applied to phonology, this means that no one group should have a much stronger accent than another. For instance, the consonant clusters, rhotics, and final consonants in conlangs like Esperanto would make it so that Vietnamese and Chinese speakers would have strong accents, but Spaniards would have little or no accent. (Esperanto, Volapük, Ido, Interlingua, Interlingue, and Lojban all fail at this.) Note: Lojban has a consonant cluster in every verb, e.g. “cidjrspageti”, suggesting that although they believed they drew from Chinese words, they did not have a Chinese linguist on their team.

Principle of low confusability

When speaking this language, a given word should be as hard to confuse as possible.

Applied to phonology: remove any sounds or words that are easily mistaken. For instance, Hindi, Spanish, Chinese, and Telugu (and more) speakers have trouble differentiating v and w. So do not include both. Similarly, don’t include both /e/ and /ei/, as American speakers will have trouble telling them apart. Omit all homophones. For classes of similar words, make them especially hard to confuse, for instance numbers and tenses. The English fifty/fifteen is entirely unacceptable. Spanish canto/cantó is also pretty egregious.

Principle of reducing memorization

Compounds introduce complexity because they require memorization. Therefore, whenever possible, we should rely on generative processes to express concepts. This has the downside that it becomes harder to express subtleties of meaning, but we can make up for this with a well-developed and well-decomposed system of modification (adjectives etc.). Using this principle, a word like sprint should probably be expressed as run very-fast or similar, rather than having its own bespoke word. We lose the additional connotations, but they are subtle enough that we probably don’t lose very much. But what about a word like run? Is running different enough from walking that it should get its own word, or is it just walk fast? And is walking different enough from going that it should be go footly, or should going and walking be the same (as in German)? These are trickier cases, especially that of running, but I would bias towards adjectival phrases.

subsubprinciple of some redundancy to make things easier

When something can be inferred from context, most natural languages opt for economy and omit extra words. For instance, Chinese does away with the 了 when the past tense is implied by “yesterday”, and Indonesian does the same, implying all tenses. This makes the language fractionally faster to speak, but noticeably harder to learn. Therefore, the present language opts for some redundancy. For instance, all three tenses are marked with a modal, whereas most natural languages would make one of these the default, for instance Nigerian Pidgin, where omitting the “dey” makes it past tense.

subprinciple of The more similar they are, the more different they should be

It’s ok for two words to sound similar. But if they also mean similar things, then this is a recipe for confusion! For instance, when learning Mandarin I would constantly mix up lǜ (green) and lán (blue). These don’t even sound very similar! But they both start in /l/ and a dark vowel, and it was (and tbh is) quite confusing for me. Similarly, the basic numbers liù (6) and jiǔ (9) are still confusing to me, even though they don’t sound that similar, but they both end in -iu. English is no better,⁴ with “fifty” versus “fifteen”, and “can” vs. “can’t”. The conclusion is that words that have similar meanings, e.g. colors, numbers, animals, etc., should also sound as different as possible. This is somewhat unintuitive, because one might have the intuition that similar things should sound similar! But this leads to confusion.

Footnotes

¹ In this way a Sesowi core word (“atom”) is very like a Chinese character.
² And unlike Chinese.
³ Sesowi supposes that natural languages are in many cases too logical, organized, and precise, in ways that do not make communication more clear.
⁴ And this is not even nearly as bad as many other things in Chinese that differ only by tone, like yǎnjìng (glasses) and yǎnjīng (eyes), or mǎi (buy) and mài (sell).

Sesowi Phonology

Consonant Chart

	Labial		Alveolar		Palatal	Velar
	plain	labialized	plain	labialized		plain	labialized
Nasal	m ⟨m⟩	(mʷ ⟨mw⟩)	n ⟨n⟩			ŋ ⟨ng⟩
Plosive (fortis)	pʰ ⟨p⟩	pʰʷ ⟨pw⟩	tʰ ⟨t⟩	tʰʷ ⟨tw⟩		kʰ ⟨k⟩	kʰʷ ⟨kw⟩
Plosive (lenis)	b ⟨b⟩	bʷ ⟨bw⟩	d ⟨d⟩	dʷ ⟨dw⟩		ɡ ⟨g⟩	ɡʷ ⟨gw⟩
Fricative			s ⟨s⟩	sʷ ⟨sw⟩
Approximant	w ⟨w⟩		l ⟨l⟩		j ⟨y⟩

Overview

Each Sesowi syllable has the form CV[N], where C is a consonant, V is a vowel, and N is the sound “ng”.

Vowels

Sesowi has the five cardinal vowels /a/, /i/, /u/, /o/, /e/, as well as three diphthongs, /ai/, /au/, and /oi/, which are pronounced exactly like the combination of the two vowels they are made of.

Consonants

Sesowi has 12 simple consonants, /w/, /y/, /m/, /b/, /g/, /d/, /n/, /s/, /p/, /k/, /t/, /l/, and 9 compound consonants, /ny/, /mw/, /bw/, /gw/, /dw/, /sw/, /pw/, /kw/, /tw/. The compound consonants are pronounced exactly like the combination of their two constituents, e.g. /kik/ sounds like “kick” and /kwik/ sounds like “quick”.
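As a rough illustration of the CV[N] shape, the sketch below builds a regular expression from the vowel and consonant lists in this section and uses it to split a word into syllables. The names `SYLLABLE` and `syllabify` are invented for this example, and the onset list is taken from the consonants spelled out above; it is a sketch, not an official tool.

```python
import re

SIMPLE = ["w", "y", "m", "b", "g", "d", "n", "s", "p", "k", "t", "l"]
COMPOUND = ["ny", "mw", "bw", "gw", "dw", "sw", "pw", "kw", "tw"]
VOWELS = ["a", "i", "u", "o", "e"]
DIPHTHONGS = ["ai", "au", "oi"]

# Longer alternatives come first so "kw" is tried before "k" and "ai" before "a".
ONSET = "|".join(COMPOUND + SIMPLE)
NUCLEUS = "|".join(DIPHTHONGS + VOWELS)
SYLLABLE = re.compile(rf"(?:{ONSET})(?:{NUCLEUS})(?:ng)?")


def syllabify(word):
    """Split a word in Sesowi spelling into CV[N] syllables."""
    syllables = []
    pos = 0
    while pos < len(word):
        m = SYLLABLE.match(word, pos)
        if m is None:
            raise ValueError(f"not a CV[N] sequence at {word[pos:]!r}")
        syllables.append(m.group())
        pos = m.end()
    return syllables


print(syllabify("sesowi"))
print(syllabify("lokolangboi"))
```

Because "ng" is never an onset and every syllable must start with a consonant, the split is unambiguous. Vowel-initial function words, if the language has them, would need a separate rule.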

Phonotactics

A variety of possible syllables are never used by Sesowi, as they are deemed too easy to mistake for other syllables. For example, *nyi and *yi are too easy to mistake for ni and i, respectively. Diphthongs plus nasal coda, like *bwaing or *saung, are forbidden as they are deemed too hard to pronounce.
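These restrictions can be sketched as a small checker. It only encodes the two starred examples and the diphthong-plus-coda rule mentioned here; the full exclusion list is longer, and treating *nying like *nyi (ignoring the coda) is an assumption of this sketch. `is_allowed` is a hypothetical helper name.

```python
import re

SYLLABLE = re.compile(
    r"(?P<onset>ny|[mbgdspkt]w|[wymbgdnspktl])"
    r"(?P<nucleus>ai|au|oi|[aiuoe])"
    r"(?P<coda>ng)?"
)

# Only the two starred examples from the text; the real list has more entries.
TOO_CONFUSABLE = {"nyi", "yi"}


def is_allowed(syllable):
    """Check one syllable against the CV[N] shape and the rules named above."""
    m = SYLLABLE.fullmatch(syllable)
    if m is None:
        return False  # not CV[N] at all
    if m["nucleus"] in ("ai", "au", "oi") and m["coda"]:
        return False  # diphthong plus nasal coda, like *bwaing
    if m["onset"] + m["nucleus"] in TOO_CONFUSABLE:
        return False  # assumption: the coda does not rescue *nyi or *yi
    return True


for s in ("ni", "nyi", "bwai", "bwaing", "lang"):
    print(s, is_allowed(s))
```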

Consonants

According to PHOIBLE, here is a list of the world’s phonemes from most to least common, keeping in mind that this is over almost 3000 languages, so Papua New Guinea will have a disproportionate representation: m k j p w n t l s b ŋ ɡ h d r f ɲ t̠ʃ ʔ ʃ z d̠ʒ v ɾ t̪ ts kʰ pʰ x n̪ ʈ ʒ ɣ d̪ c tʰ ɳ ȵ ɡb kp kʷ ɟ ɭ ȶ dz β ɻ ɓ mb nd
“Bybee explained that the proposed primal consonants come from five sets: the consonants are: the stops made with full closure of the lips (labials), /p b m/; stops made with the tongue creating closure at the teeth or behind the teeth, /t d n/; stops made with the back of the tongue against the soft palate, /k g ŋ/; and fricative /s/ and lateral /l/
“Almost all languages use consonants from these five sets to make words,” she said. “Research on 81 unrelated languages identified sound changes that this small set of consonants undergoes to create new consonants… By contrast, it is very rare for consonants not included in the primal consonant set to change into one of the primal consonants. The discovery of these primal consonants is a major contribution to our understanding of the origins of human language.”
The red ones are ones that Sesowi lacks; the greyed ones are allophones. Should we add back in any of r f ɲ ʔ ?

Not [r], because that is too varied in its realization.
Maybe [f], though I had the impression that a lot of languages used /p/ instead, and that /f/ was just sort of a hard sound to hear?
Possibly ɲ as ⟨ny⟩, but then it's odd that it's the only one that can have its own ⟨y⟩?
Not ʔ, since the CV syllable structure makes this hard to distinguish from a null onset. Unless… nothing can really start in a vowel (as in Arabic)? And "vowel onset" is actually always ʔ-onset? Maybe not written, but maybe? TODO

levels:

m j w p t k
n s
l
b d g ŋ h
ɲ t̠ʃ
f ʃ ɾ ʔ
z d̠ʒ v r

Consonant clusters

There are no consonant clusters, except at syllable boundaries, where there may be an ngC cluster.

Voicing contrast? [b] [d] [g] vs [p] [t] [k]

Please note that what actually matters is the contrasts, not the phonemes themselves. For instance, /b/, /p/, and /pʰ/ are common phonemes, but very few languages contrast all three. In fact, no two of the world’s five most spoken languages have the same set of contrasts:

English contrasts /b/ and /pʰ/
Mandarin contrasts /p/ and /pʰ/
Hindi has all of /b/, /p/, and /pʰ/
Spanish contrasts /b/ and /p/
Arabic has no contrast, and uses only /b/
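The claim that these five languages all differ can be checked mechanically. The sketch below just transcribes the five lines above into sets (labial stops only); `LABIAL_STOPS` and `languages_with` are names made up for illustration.

```python
# Labial stop series per language, transcribed from the list above.
LABIAL_STOPS = {
    "English": {"b", "pʰ"},
    "Mandarin": {"p", "pʰ"},
    "Hindi": {"b", "p", "pʰ"},
    "Spanish": {"b", "p"},
    "Arabic": {"b"},
}

# If every language has its own set, the number of distinct sets
# equals the number of languages.
distinct_sets = {frozenset(stops) for stops in LABIAL_STOPS.values()}
all_differ = len(distinct_sets) == len(LABIAL_STOPS)


def languages_with(phone):
    """List the languages (alphabetically) whose set includes this phone."""
    return sorted(lang for lang, stops in LABIAL_STOPS.items() if phone in stops)


print("all five differ:", all_differ)
for phone in ("b", "p", "pʰ"):
    print(phone, languages_with(phone))
```

Counting native coverage this way is only one input to the decision; how easy two sounds are to tell apart matters as well.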

For Sesowi, we have chosen to contrast the two most distinguishable of these three, namely /b/ and /pʰ/, as English does. This minimizes the possibility of confusion. The same goes for /d/ vs /tʰ/ and /g/ vs /kʰ/. Most of the world’s languages contrast two series of stops (by voicing or by aspiration), including Chinese, all Indo-European languages, and indeed all of the top 30 most spoken languages except Tamil, with the exception that Arabic varieties usually don’t contrast [p]/[b]. Toki Pona lacks this contrast. TODO: the one wrinkle is that …

[b] vs [p]: Spanish, Malay
[p] vs [pʰ]: Chinese
[b] vs [pʰ]: English
[b] [p] [pʰ]: Thai, Hindi
[p] vs [ɓ]: Vietnamese

TODO If Sesowi DID have voicing contrasts,

Specific decisions on different contrasts

On [b], [v], [w]

Many languages lack a v/w distinction, including Mandarin, all Indian languages, and (sort of) German. Thus, clearly only one of these should be included! Some languages, like Spanish, also lack a b/v distinction. These two constraints together lead us to include [b] and [w], but not [v].

On [f]

contrast [f] and [pʰ]: English and Mandarin
lack [pʰ]: Spanish, Arabic, Greek, and Portuguese
lack [f]: Philippine languages, Javanese, Sundanese, Balinese, "Native" Indian languages
lack [f] but have [v]: vi
weird about [f]; use [ɸ]: Japanese, Korean, Finnish

Summary: omit

On [h]/[x]

[h]/[ɦ]: English, Indonesian, Hindi, Swahili, Tagalog, Hausa, Japanese, Turkish, Vietnamese, Korean
[x~χ]: Chinese, Spanish, Portuguese, Russian
[h] and [x~χ]: Arabic, German, Scots, Urdu
as allophone: Tamil
no [h]: French

French lacks this phoneme, and Spanish and Portuguese speakers are likely not to pronounce it because of orthographic confusion. Furthermore, this sound is weak and often goes away. Weak reject.

On [z]/[s]

Languages lacking a z/s distinction: Spanish, all Chinese varieties, Bengali, Yoruba, Tagalog, Korean (?), Thai, Fulfulde, German. Clear reject.

On [ʃ]/[s]

Note: considering ʂ and ʃ the same for now.

Missing ʃ: Southern Chinese (e.g. Wu), Korean, Thai, many Spanish varieties, some Hindi varieties.
No contrast with tʃ: French

On [tʃ]/[dʒ]

English, Mandarin, Hindi, Bengali, Indonesian, Japanese, Nigerian Pidgin, and Turkish have this distinction. Tagalog does too, though [tʃ] "may be pronounced [ts] (or [tj] if spelled ⟨ty⟩), especially by speakers in rural areas." Arabic lacks [tʃ]. Most Spanish varieties lack [dʒ], or when they do have it, don't contrast it with [j]; similarly, most Slavic languages have [tʃ] (well, [t͡ɕ]) but lack [dʒ]. French and Portuguese lack both, collapsing them to [ʃ]/[ʒ], though African French varieties may have them. Vietnamese seems to lack [dʒ]. Conclusion: keep [tʃ] but not [dʒ], for ease on Spanish, Arabic, and Slavic speakers.

On rhotics

Most languages have rhotics. However, they are frequently very different from each other. Consider the "strong rhotics" /ɹ/, /r/, /ʁ/, /ʐ/, and /ɻ/, from English, Spanish, French, Mandarin, and Tamil. Each one of these sounds is exceedingly difficult for speakers of any of the other four to make. Therefore, including any of them in the language would be unwise; and allowing for all would be chaos. On the other hand, the "weak rhotic" /ɾ/ is a much easier sound to deal with. English, Mandarin, and French lack it, and Japanese does not contrast it with /l/, but Spanish, Portuguese, Arabic, Turkish, and all Indian languages have it (sometimes in free variation with /r/). English even has the sound as an allophone of /t/, and it is not a hard sound to master. Nonetheless, it will be an accent-marking phoneme, at least for English, Mandarin, French, and Japanese, who will tend to realize it as their strong rhotic. There is a strong argument for including it (also because rhotics are great), but for the purpose of reducing accent marking, maybe best to exclude.

Labialization

Wikipedia: "Labialization is the most widespread secondary articulation in the world's languages. It is phonemically contrastive in Northwest Caucasian (e.g. Adyghe), Athabaskan, and Salishan language families, among others. This contrast is reconstructed also for Proto-Indo-European, the common ancestor of the Indo-European languages; and it survives in Latin and some Romance languages. It is also found in the Cushitic and Ethio-Semitic languages. American English labializes /r, ʃ, ʒ, tʃ, dʒ/ to various degrees. A few languages, including Arrernte and Mba, have contrastive labialized forms for almost all of their consonants. In many Salishan languages, such as Klallam, velar consonants only occur in their labialized forms (except /k/, which occurs in some loanwords). However, uvular consonants occur abundantly labialized and unrounded."

Vowels

Arabic, Inuktut, etc. have only /a i u/ and lack /e o/. Keep the 5 cardinal vowels.

Vowel Chart

	Front	Central	Back
Close	i		u
Close-mid	e		o
Open		a

Diphthong Chart

IPA
ai
au
oi

Why not rising diphthongs?

They will be confused with a following syllable starting in /j/:
Sia sounds too much like siya.
Conversely, sui is more distinct from suwi.
The diphthongs starting in u are realized more like /w/.

Linguistics Learning Corner: What are falling or rising diphthongs? blahblahblah

Forbidden CV

Several CV clusters are forbidden:

Forbidden Sequences and Potential Confusions

Forbidden	Likely confused with	Example
yi	i	Chinese English "yeast" ("east")
ye	e	Indian English "eight" ("yeight")
wu	u	Chinese wu
wo	o	Indian English "only" ("wonly")
ti	chi	Japanese; -tion
nyi	ni
chw	tw
hw	w
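The table can be read as a lookup from a forbidden sequence to what a listener is likely to hear instead. The sketch below copies the first two columns verbatim; carrying over whatever follows the forbidden sequence (a vowel or a nasal coda) is an assumption of the sketch, and `likely_confusion` is a made-up name.

```python
# (forbidden sequence, likely confused with), copied from the table above.
CONFUSIONS = [
    ("yi", "i"),
    ("ye", "e"),
    ("wu", "u"),
    ("wo", "o"),
    ("ti", "chi"),
    ("nyi", "ni"),
    ("chw", "tw"),
    ("hw", "w"),
]


def likely_confusion(syllable):
    """Return what a forbidden syllable is likely to be heard as, else None."""
    for forbidden, heard_as in CONFUSIONS:
        if syllable.startswith(forbidden):
            # Assumption: the rest of the syllable is carried over unchanged.
            return heard_as + syllable[len(forbidden):]
    return None


for s in ("yi", "wong", "chwa", "ka"):
    print(s, "->", likely_confusion(s))
```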

On u- initial diphthongs and [ui] vs [oi]

The u-initial diphthongs can be more like /wV/. Thus, /ui/ ([wi]) and /oi/ are clearly distinguishable; in the former, the accent is on the /i/, and in the latter, the accent is on the /o/. /uV/ is spelled as ⟨wV⟩ (rather than ⟨uV⟩) to be more recognizable to the world's population.

Syllables

All syllables are CV[N], where the final consonant can only be the velar nasal /ŋ/. A small set of function words can also be vowel-initial.

Nasal coda

Final consonants are hard to say and evolve away in general. However, nasal codas (final consonants that are nasal) tend to remain. But which of the nasals, and how many, should remain?

There are three obvious choices for a final nasal: m, n, and ng. All other nasals are more obscure.
Only one nasal should be a coda. For instance, in Mandarin there are two (n and ng), and they are actually pretty tricky to distinguish as a non-native speaker.
1. Another motivation for only one nasal coda: nasals tend to assimilate. That is, all of bam-ka, ban-ka, and bang-ka will tend to end up being realized as bang-ka.
One problem with final consonants (in a language without geminates) is that if the next syllable starts with the same consonant, you might end up with homophones, which we don’t want. For instance, bam-mi and ba-mi would sound the same. Therefore, ideally, the nasal coda would not occur in syllable-initial position.
Even if the nasal coda does not occur in the initial position, it may still assimilate into initial nasals. For instance, ban-mi would likely be pronounced as bam-mi. However, this is less likely to happen between the bilabial nasal and the velar nasal. For instance, in English, "fan me" goes to "fammi", but "sing me" does not assimilate.
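The geminate argument above can be sketched in a few lines. The toy model below only collapses doubled nasals (no geminates) and ignores assimilation; the onset nasals m and n come from the consonant chart, and `surface` and `collisions` are hypothetical helper names.

```python
def surface(syllables):
    """Join syllables, collapsing a doubled nasal since there are no geminates."""
    out = "".join(syllables)
    for nasal in ("m", "n"):
        out = out.replace(nasal + nasal, nasal)
    return out


def collisions(coda, onsets=("m", "n")):
    """Find pairs like bam-mi / ba-mi that would come out sounding the same."""
    pairs = []
    for onset in onsets:
        with_coda = surface(["ba" + coda, onset + "i"])
        without_coda = surface(["ba", onset + "i"])
        if with_coda == without_coda:
            pairs.append((f"ba{coda}-{onset}i", f"ba-{onset}i"))
    return pairs


for coda in ("m", "n", "ng"):
    print(coda, collisions(coda))
```

In this simplified model only the coda that never appears as an onset comes out with no collisions, which matches the reasoning in the list.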

Given that we want /m/ to be a possible initial nasal, there is only one nasal that satisfies these requirements: /ŋ/. How should this be written? There are four options, none perfect:

Use the conventional digraph ⟨ng⟩.
1. Pro: used by European languages, Pinyin, and Austronesian languages.
2. Con: is a digraph; breaks the 1-letter-1-sound rule
Use an unused letter, like ⟨q⟩.
1. Con: not intuitively easy to read
Use ⟨n⟩, but know that it is pronounced as ŋ when at the end of a syllable
1. Con: breaks 1-letter-1-sound rule
Use the IPA symbol ⟨ŋ⟩.
1. Con: it is hard to type on most keyboards.

I have ordered these from best to worst. Curse you, Latin script, for not giving us an easier letter. Final note: I was just listening to a Mandarin speaker speaking English, and saying things like “pingpoint the problem” and “cleang the data”.

Allophones

Ideally, there are no allophones, since they will tend to make a language harder to learn and understand. However, there may be some that are introduced by speakers. Possible ones I foresee:

1. /t/ and /d/ flapping before an unstressed syllable, as in English and Indian languages
2. nasal coda assimilating into next consonant
   a. nasal coda assimilating into following nasal
3. initial /w/ before a front vowel changing to /v/
4. plosives following nasal coda becoming implosive

Of these, (1) and (2a) are concerns. I would love it if (4) happens…

Orthography

As in English, but eliminating all conjunct consonants. There is a 1:1 correspondence between single letters and sounds. Hence: “x” for “sh”, “c” for “ch”.

	Labial		Alveolar		Palatal	Velar
	plain	labialized	plain	labialized	Palatal	plain	labialized
Nasal	m ⟨m⟩	(mʷ ⟨mw⟩)	n ⟨n⟩			ŋ ⟨ng⟩
Plosive (fortis)	pʰ ⟨p⟩	pʰʷ ⟨pw⟩	tʰ ⟨t⟩	tʰʷ ⟨tw⟩		kʰ ⟨k⟩	kʰʷ ⟨kw⟩
Plosive (lenis)	b ⟨b⟩	bʷ ⟨bw⟩	d ⟨d⟩	dʷ ⟨dw⟩		ɡ ⟨g⟩	ɡʷ ⟨gw⟩
Fricative			s ⟨s⟩	sʷ ⟨sw⟩
Approximant	w ⟨w⟩		l ⟨l⟩		j ⟨y⟩