Firen Word Generator

Words	Gloss
g_s_f	g_s_f
z_w_d	z_w_d
g_s_th	g_s_th
x_th_s	x_th_s
z_k_z	z_k_z
s_b_x	s_b_x
w_f_k	w_f_k
d_d_d	d_d_d
f_s_t	f_s_t
'_th_s	'_th_s
x_th_w	x_th_w
b_s_th	b_s_th
g_l_t	g_l_t
'_d_s	'_d_s
l_x_p	l_x_p
t_w_f	t_w_f
f_d_'	f_d_'
d_j_l	d_j_l
k_'_d	k_'_d
k_th_'	k_th_'
k_th_w	k_th_w
g_t_b	g_t_b
l_b_z	l_b_z
x_th_th	x_th_th
b_t_k	b_t_k
'_t_g	'_t_g
g_w_x	g_w_x
g_l_w	g_l_w
g_t_g	g_t_g
g_f_t	g_f_t
s_f_g	s_f_g
p_s_j	p_s_j
z_d_k	z_d_k
th_d_d	th_d_d
z_d_p	z_d_p
b_z_f	b_z_f
t_p_k	t_p_k
x_z_b	x_z_b
w_f_z	w_f_z
p_s_th	p_s_th
l_d_'	l_d_'
f_z_th	f_z_th
s_b_s	s_b_s
j_k_l	j_k_l
t_p_k	t_p_k
b_s_j	b_s_j
f_d_k	f_d_k
w_x_b	w_x_b
d_'_k	d_'_k
'_t_t	'_t_t
p_w_'	p_w_'
p_z_s	p_z_s
l_z_p	l_z_p
p_th_w	p_th_w
b_x_'	b_x_'
p_x_b	p_x_b
d_k_l	d_k_l
th_th_th	th_th_th
g_k_w	g_k_w
t_f_t	t_f_t
w_f_f	w_f_f
d_d_d	d_d_d
x_g_x	x_g_x
b_x_b	b_x_b
g_x_k	g_x_k
th_d_l	th_d_l
x_g_w	x_g_w
'_x_b	'_x_b
w_j_k	w_j_k
g_k_p	g_k_p
g_l_x	g_l_x
w_'_p	w_'_p
'_x_g	'_x_g
d_t_g	d_t_g
d_l_l	d_l_l
j_d_d	j_d_d
s_'_th	s_'_th
j_'_g	j_'_g
s_d_l	s_d_l
l_w_th	l_w_th
d_t_b	d_t_b
j_w_g	j_w_g
t_g_x	t_g_x
x_d_l	x_d_l
'_g_w	'_g_w
w_j_th	w_j_th
th_th_g	th_th_g
x_j_k	x_j_k
w_l_g	w_l_g
j_p_k	j_p_k
b_f_l	b_f_l
f_'_k	f_'_k
s_w_t	s_w_t
th_p_t	th_p_t
k_s_t	k_s_t
g_w_g	g_w_g
'_x_p	'_x_p
z_k_l	z_k_l
k_g_f	k_g_f
z_l_f	z_l_f

Process returned 0

Noteworthy nodes in each datafile include:

Language	Datafile name	Root nodes (Click a root to generate from it)	Remarks
Firen	syllables.yml	`Sentence`, `Noun`, `Verb`, `NominalRoot`, `VerbalRoot`	More information about Firen can be found on the Wiki.
Sajem Tan	sajemtan.yml	`Word`, `Root`, `Suffix`, `UnlikelyWord`, `UnlikelyRoot`, `UnlikelySuffix`	Sajem Tan is a collaborative conlang. It has a website here.
English	english.yml	`Sentence`	My (possibly poorly-considered) attempt to encode basic English grammar in WordGen. I apologise in advance to anyone who tries to make sense out of it.
Dab vi Suxi Kidap	ffb.yml	`Sentence`, `Word`, `Compound`, `Syllable`	DVSK is a very simple isolating language created as a collaboration between me and 4 other people from the Sajem Tan tribe; however, it was abandoned after the foundations were worked out.
Xanz	xanz.yml	`word`, `tricons`, `root`, `word1`	Another collaborative language in the Sajem Tan universe. It is the source of triconsonantal roots in Sajem Tan.
Jafren	jafren.yml	`Sentence`, `Word`, `ChordL`, `Chord`, `Chord1`, `Chord2`, `Chord3`, `Chord4`, `Chord5`	A musical language used in the same setting as Firen. It is currently much less well-developed.
Jokes	jokes.yml	`Gender`	Someone on Mastodon posted a silly CFG for making gender jokes, so I encoded it as a WordGen datafile. Nothing more to it.
Numbers	numbers.yml	`number`, `phoneNumber`, `internationalPhoneNumber`	This is one of the first files I ever wrote, and it shows. It uses outdated and deprecated features of WordGen and makes the very questionable choice of using 'val' for a phonetic English reading of the number and 'ipa' for the digits.
Tests	CFGs.yml	`Dyck`, `binPalindrome`, `Node`	This file exists as a testing ground for things that are too simple to need their own files, and for new or experimental features. You will need to increase the recursion depth to use some of these roots, particularly `Node`, or else you will get a million errors.

Note that CFGs.yml is not allowed on this web interface because it uses more resources than the other files and relies on WordGen/Cpp features.

Feel free to look at the sources for WordGen/Py and WordGen/Cpp. wordgen.py is the current version of the script, and syllables.yml is the current version of the Firen data file.

This is the web frontend for a Python program that will produce random words using a (rather nifty) weighted-randomized macro expansion approach. IPA transcriptions are generated from the same file, and are not directly attached to the orthography. This means that "digraph recognition" is not even a concept to worry about.
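
To give a rough idea of what "weighted-randomized macro expansion" means, here is a minimal sketch. It is not the actual wordgen.py code and does not use the real datafile format: the `GRAMMAR` dict, the node names in it, and the `expand` function are all made up for illustration.

```python
import random

# Hypothetical grammar (NOT the real syllables.yml format).
# Each node maps to a list of (weight, expansion) pairs. An expansion is a
# list of items: items that name a node are expanded recursively, anything
# else is emitted literally.
GRAMMAR = {
    "Word": [(3, ["Syllable"]), (1, ["Syllable", "Word"])],
    "Syllable": [(1, ["Onset", "Vowel"])],
    "Onset": [(2, ["k"]), (1, ["th"]), (1, ["s"])],
    "Vowel": [(1, ["a"]), (1, ["i"])],
}


def expand(node, rng, depth=0, max_depth=16):
    """Expand `node` into a string by weighted random choice."""
    if node not in GRAMMAR:
        return node  # literal text, nothing to expand
    if depth >= max_depth:
        raise RecursionError("recursion depth exceeded at node " + node)
    options = GRAMMAR[node]
    weights = [weight for weight, _ in options]
    expansions = [expansion for _, expansion in options]
    # Pick one expansion; higher weights are proportionally more likely.
    chosen = rng.choices(expansions, weights=weights, k=1)[0]
    return "".join(expand(item, rng, depth + 1, max_depth) for item in chosen)


# The same seed always gives the same word.
print(expand("Word", random.Random(42)))
```

In this sketch, `Word` can refer to itself, so recursive nodes only terminate by chance, which is why some kind of depth limit is needed.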

In a second phase, regular expressions and Mealy-type finite state machines are applied to transform the output.
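
As an illustration of that second phase, here is a small sketch that runs a regex substitution and then a Mealy-type machine over a word. Both rules (collapsing doubled letters, and voicing `s` to `z` right after a vowel) are invented for this example and are not taken from any of the datafiles; the function names are hypothetical too.

```python
import re

VOWELS = set("aeiou")


def mealy_voice(text):
    """Mealy machine: each output symbol depends on the current state
    and the current input symbol.

    States: "V" (the previous symbol was a vowel) and "C" (anything else).
    Hypothetical rule: in state "V", input "s" is output as "z".
    """
    state = "C"
    out = []
    for ch in text:
        if state == "V" and ch == "s":
            out.append("z")
        else:
            out.append(ch)
        # The next state depends only on the symbol just read.
        state = "V" if ch in VOWELS else "C"
    return "".join(out)


def transform(word):
    # Regex step (hypothetical rule): collapse any doubled character.
    word = re.sub(r"(.)\1", r"\1", word)
    # Finite state machine step.
    return mealy_voice(word)


print(transform("kassa"))  # "kassa" -> "kasa" -> "kaza"
```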

The Firen datafile is quite well-developed and generally produces good results. The IPA transcriptions are sometimes non-obvious because they include synchronic sound changes, and sometimes unnatural but still correct, as with the overzealous syllabification.

The other datafiles are in various stages of development.

Not that it matters or anything, but unless you provide your own seed, this web frontend has worse randomness because it simply uses Unix time as the seed. (The server has to generate the seed for the permalink to work, and time is the standard easy choice for these things.) When run from the command line without an explicit seed parameter, the randomness is much better (Python seeds its random generator from the system's main entropy source). Maybe I could make this Base64-encode some bytes from /dev/urandom or something for the seed instead; it wouldn't change too much.
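
The Base64 idea could look something like the sketch below. This is not what the frontend currently does; `make_seed` is a hypothetical helper, and the choice of 9 bytes is arbitrary (it just happens to encode to 12 characters with no padding).

```python
import base64
import os
import random


def make_seed(n_bytes=9):
    """Return a short text seed built from OS entropy.

    os.urandom reads from the system entropy source (/dev/urandom on Unix).
    The URL-safe Base64 alphabet avoids '+' and '/', so the seed can be
    dropped into a permalink without escaping.
    """
    return base64.urlsafe_b64encode(os.urandom(n_bytes)).decode("ascii")


seed = make_seed()
rng = random.Random(seed)  # random.Random accepts a str as its seed
print(seed, rng.random())
```

Because the seed is plain text, the server can still put it in the permalink, and anyone reusing it gets the same output.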

Working-1.py is an earlier, less flexible draft (Python 2 only) that technically knows nothing about words and only generates syllables. You may find it interesting or even useful. syllables1.yml is the data file for that version. The two versions are not compatible, but they are mostly similar, and a single data file could in theory work with both.

Once this is "done", my next plan is to implement something with Markov chains, the more classical way to generate natural language.

Words: (Limit 250)
Show IPA:
Show glosses:
Show old orthography: (Sajem Tan only)
Show original IPA: (Sajem Tan only)
Show Ðab Tan: (Sajem Tan only)
Show Ðab Tan IPA: (Sajem Tan only)
Show ABC notation: (Jafren only)
Show alphabetical notation: (Jafren only)
Datafile:
Root Node:

Show paths (debug):
Show regex steps (debug):
Enable seed (debug):
Seed:
Recursion depth (debug):