Firen Word Generator

Word	Gloss
deskta	verb
kos	verb
kuv	verb
kul	verb
ralla	verb
fůṙlla	verb
tita	verb
pisa	verb
nol	verb
saṙsk	verb
faṙna	verb
vůmma	verb
jast	verb
toffa	verb
tit	verb
hel	verb
kil	verb
humma	verb
nat	verb
task	verb
dak	verb
bun	verb
pogga	verb
kos	verb
sunna	verb
bůs	verb
skas	verb
vof	verb
jais	verb
kailla	verb
kim	verb
raof	verb
fůl	verb
vis	verb
toffa	verb
ket	verb
dol	verb
tap	verb
dossa	verb
dem	verb
boṙf	verb
son	verb
sesk	verb
stod	verb
saṙlla	verb
dedda	verb
tol	verb
tita	verb
fel	verb
sut	verb
bid	verb
fůṙn	verb
vija	verb
lubba	verb
vuf	verb
tistta	verb
hoha	verb
nov	verb
git	verb
fůvva	verb
sok	verb
jům	verb
leb	verb
čan	verb
kil	verb
hůkka	verb
staṙvva	verb
tup	verb
tain	verb
lusk	verb
sloṙm	verb
časppa	verb
skenna	verb
lůl	verb
spek	verb
spoṙn	verb
čis	verb
spůk	verb
čaif	verb
hassa	verb
mus	verb
ronna	verb
sais	verb
čum	verb
snan	verb
zinna	verb
čum	verb
sov	verb
jud	verb
zib	verb
son	verb
skad	verb
spait	verb
spastta	verb
dov	verb
vum	verb
daol	verb
kůṙga	verb
ten	verb
čassa	verb

Process returned 0


Noteworthy nodes in each datafile include:

Language	Datafile name	Root nodes (Click a root to generate from it)	Remarks
Firen	syllables.yml	`Sentence`, `Noun`, `Verb`, `NominalRoot`, `VerbalRoot`	More information about Firen can be found on the Wiki.
Sajem Tan	sajemtan.yml	`Word`, `Root`, `Suffix`, `UnlikelyWord`, `UnlikelyRoot`, `UnlikelySuffix`	Sajem Tan is a collaborative conlang. It has a website here.
English	english.yml	`Sentence`	My (possibly poorly-considered) attempt to encode basic English grammar in WordGen. I apologise in advance to anyone who tries to make sense out of it.
Dab vi Suxi Kidap	ffb.yml	`Sentence`, `Word`, `Compound`, `Syllable`	DVSK is a very simple isolating language created as a collaboration between me and 4 other people from the Sajem Tan tribe. It was abandoned after the foundations were worked out.
Xanz	xanz.yml	`word`, `tricons`, `root`, `word1`	Another collaborative language in the Sajem Tan universe. It is the source of triconsonantal roots in Sajem Tan.
Jafren	jafren.yml	`Sentence`, `Word`, `ChordL`, `Chord`, `Chord1`, `Chord2`, `Chord3`, `Chord4`, `Chord5`	A musical language used in the same setting as Firen. It is currently much less well-developed.
Jokes	jokes.yml	`Gender`	Someone on Mastodon posted a silly CFG for making gender jokes, so I encoded it as a WordGen datafile. Nothing more to it.
Numbers	numbers.yml	`number`, `phoneNumber`, `internationalPhoneNumber`	This is one of the first files I ever wrote, and it shows. It uses outdated and deprecated features of WordGen and makes the very questionable choice of using 'val' for a phonetic English reading of the number and 'ipa' for the digits.
Tests	CFGs.yml	`Dyck`, `binPalindrome`, `Node`	This file exists as a testing ground for things that are too simple to need their own files, and for new or experimental features. You will need to increase the recursion depth to use some of these roots, particularly `Node`, or else you will get a million errors.

Note that CFGs.yml is not allowed on this web interface due to higher resource use than the other files and its reliance on WordGen/Cpp features.

Feel free to look at the sources for WordGen/Py and WordGen/Cpp. wordgen.py is the current version of the script, and syllables.yml is the current version of the Firen data file.

This is the web frontend for a Python program that will produce random words using a (rather nifty) weighted-randomized macro expansion approach. IPA transcriptions are generated from the same file, and are not directly attached to the orthography. This means that "digraph recognition" is not even a concept to worry about.
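As a rough illustration of the idea, here is a minimal sketch of weighted-randomized macro expansion. It is not WordGen's actual code or datafile format: the `GRAMMAR` table, the `expand` function, the node names, and the `max_depth` parameter are all hypothetical, and the tiny phoneme inventory is invented rather than taken from Firen. Each terminal carries its own orthography and IPA strings, so the two outputs are built in parallel and the spelling never has to be parsed back into sounds.

```python
import random

# Hypothetical grammar, NOT the real WordGen format.
# Each node maps to a list of (weight, body) alternatives.
# A body is a list of items: a str names another node,
# a (orthography, ipa) tuple is a terminal.
GRAMMAR = {
    "Word": [(3, ["Syllable"]), (1, ["Syllable", "Syllable"])],
    "Syllable": [(2, ["Onset", "Vowel"]), (1, ["Onset", "Vowel", "Coda"])],
    "Onset": [(1, [("k", "k")]), (1, [("t", "t")]), (1, [("č", "tʃ")])],
    "Vowel": [(1, [("a", "a")]), (1, [("i", "i")]), (1, [("u", "u")])],
    "Coda": [(1, [("l", "l")]), (1, [("n", "n")])],
}


def expand(item, rng, depth=0, max_depth=16):
    """Expand one item and return an (orthography, ipa) pair."""
    if isinstance(item, tuple):
        # Terminal: both renderings are stored side by side.
        return item
    if depth >= max_depth:
        # Guard against runaway recursive grammars.
        raise RecursionError(f"recursion depth exceeded at node {item!r}")
    alternatives = GRAMMAR[item]
    weights = [weight for weight, _ in alternatives]
    bodies = [body for _, body in alternatives]
    # Weighted random choice among the node's alternatives.
    body = rng.choices(bodies, weights=weights, k=1)[0]
    parts = [expand(child, rng, depth + 1, max_depth) for child in body]
    return "".join(p[0] for p in parts), "".join(p[1] for p in parts)


if __name__ == "__main__":
    rng = random.Random(12345)
    for _ in range(5):
        word, ipa = expand("Word", rng)
        print(f"{word}\t/{ipa}/")
```

Because the IPA is assembled from the same expansion path as the spelling, a letter like `č` in this sketch never needs to be recognised after the fact.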

In a second phase, regular expressions and Mealy-type finite state machines are applied to transform the output.
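A second phase of this kind might look roughly like the sketch below. The rules are invented for illustration and do not come from any WordGen datafile; `REGEX_RULES`, `step`, `run_mealy`, and `transform` are hypothetical names. The Mealy machine here emits one output per input character, and that output depends on both the current state and the input, which is what distinguishes it from a plain character-for-character substitution.

```python
import re

# Hypothetical rewrite rules, applied in order.
REGEX_RULES = [
    (re.compile(r"n(?=[pb])"), "m"),  # n becomes m before a labial stop
    (re.compile(r"(.)\1"), r"\1"),    # collapse any doubled letter
]

VOWELS = set("aeiou")


def step(state, ch):
    """Mealy transition: return (next_state, output) for one input character.

    Invented rule: 's' is output as 'z' when the previous character
    was a vowel; every other character is passed through unchanged.
    """
    out = "z" if state == "after_vowel" and ch == "s" else ch
    next_state = "after_vowel" if ch in VOWELS else "other"
    return next_state, out


def run_mealy(text, start="other"):
    """Feed text through the machine and join the emitted outputs."""
    state = start
    outputs = []
    for ch in text:
        state, out = step(state, ch)
        outputs.append(out)
    return "".join(outputs)


def transform(word):
    """Apply the regex rules first, then the Mealy machine."""
    for pattern, replacement in REGEX_RULES:
        word = pattern.sub(replacement, word)
    return run_mealy(word)


if __name__ == "__main__":
    for raw in ["kasanpa", "assa", "ssa"]:
        print(raw, "->", transform(raw))
```

With these made-up rules, `kasanpa` comes out as `kazampa`: the regex pass handles the nasal, and the state machine handles the `s` that follows a vowel.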

The Firen datafile is quite well-developed and generally produces good results. The IPA transcriptions are sometimes non-obvious because they include synchronic sound changes, and sometimes unnatural though still correct, as with the overzealous syllabification.

The other datafiles are in various stages of development.

Not that it matters or anything, but unless you provide your own seeds, this web frontend has worse randomness because it is simply using Unix time as the seed. (The server has to generate the seed for the permalink to work, and time is the standard easy choice for these things.) When run from the command line without an explicit seed parameter, the randomness is much better (Python seeds its random generator from the system's main entropy source). Maybe I could make this Base64-encode some bytes from /dev/urandom or something for the seed instead; it wouldn't change too much.
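For comparison, the two seeding approaches could be sketched as follows. This is an assumption about how it might be done, not the frontend's actual code: `time_seed`, `urandom_seed`, and `sample` are hypothetical helpers. `os.urandom` reads from the operating system's entropy source, and the URL-safe Base64 alphabet keeps the seed usable inside a permalink.

```python
import base64
import os
import random
import time


def time_seed():
    """Seed from Unix time: easy, but guessable and coarse-grained."""
    return str(int(time.time()))


def urandom_seed(nbytes=9):
    """Seed from OS entropy, Base64-encoded so it fits in a URL.

    9 bytes encode to exactly 12 characters with no '=' padding.
    """
    return base64.urlsafe_b64encode(os.urandom(nbytes)).decode("ascii")


def sample(seed, count=5):
    """Draw a few numbers from a generator seeded with the given string."""
    rng = random.Random(seed)
    return [rng.randrange(1000) for _ in range(count)]


if __name__ == "__main__":
    seed = urandom_seed()
    print("seed:", seed)
    # The same seed reproduces the same draws, which is what
    # a permalink would rely on.
    print(sample(seed))
    print(sample(seed))
```

Either way the server still picks the seed and can embed it in the permalink; only the source of the seed changes.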

Working-1.py is a less flexible earlier (Python 2 only) draft, which technically knows nothing about words, and only generates syllables. You may find it interesting or even useful. syllables1.yml is the data file for that version. The two versions are not compatible, but are mostly similar and a single file could in theory be agnostic between them.

Once this is "done", my next plan is to implement something with Markov chains, the more classical way to generate natural language.


Words: (Limit 250)
Show IPA:
Show glosses:
Show old orthography: (Sajem Tan only)
Show original IPA: (Sajem Tan only)
Show Ðab Tan: (Sajem Tan only)
Show Ðab Tan IPA: (Sajem Tan only)
Show ABC notation: (Jafren only)
Show alphabetical notation: (Jafren only)
Datafile:
Root Node:

Show paths (debug):
Show regex steps (debug):
Enable seed (debug):
Seed:
Recursion depth (debug):