Firen Word Generator

Words	Gloss
s_s_'	s_s_'
d_z_j	d_z_j
x_s_p	x_s_p
p_k_s	p_k_s
x_z_p	x_z_p
k_l_f	k_l_f
b_w_t	b_w_t
z_k_t	z_k_t
j_x_x	j_x_x
p_d_t	p_d_t
g_f_k	g_f_k
s_z_b	s_z_b
x_p_j	x_p_j
g_t_th	g_t_th
th_t_f	th_t_f
d_p_z	d_p_z
d_t_z	d_t_z
b_s_j	b_s_j
p_t_d	p_t_d
z_t_j	z_t_j
l_b_j	l_b_j
g_g_t	g_g_t
z_d_th	z_d_th
g_'_s	g_'_s
f_b_d	f_b_d
x_'_w	x_'_w
g_l_b	g_l_b
d_s_b	d_s_b
w_l_x	w_l_x
t_g_p	t_g_p
l_b_d	l_b_d
k_w_'	k_w_'
l_p_s	l_p_s
l_w_t	l_w_t
w_w_p	w_w_p
x_d_d	x_d_d
'_k_j	'_k_j
p_l_j	p_l_j
g_l_s	g_l_s
t_t_w	t_t_w
th_k_p	th_k_p
s_b_th	s_b_th
b_w_w	b_w_w
th_t_w	th_t_w
k_'_w	k_'_w
l_z_th	l_z_th
p_th_z	p_th_z
z_x_b	z_x_b
x_g_z	x_g_z
j_p_z	j_p_z
th_s_j	th_s_j
th_'_f	th_'_f
th_d_b	th_d_b
z_w_k	z_w_k
g_s_s	g_s_s
th_w_f	th_w_f
b_s_p	b_s_p
g_th_'	g_th_'
x_d_g	x_d_g
b_s_d	b_s_d
f_g_p	f_g_p
x_t_t	x_t_t
j_d_th	j_d_th
th_w_z	th_w_z
b_'_b	b_'_b
z_l_f	z_l_f
f_'_k	f_'_k
x_g_x	x_g_x
j_'_f	j_'_f
f_k_x	f_k_x
f_z_g	f_z_g
l_d_b	l_d_b
z_x_p	z_x_p
t_l_p	t_l_p
p_j_th	p_j_th
b_f_b	b_f_b
f_l_p	f_l_p
w_z_w	w_z_w
x_d_z	x_d_z
d_f_g	d_f_g
x_p_b	x_p_b
b_th_k	b_th_k
w_g_d	w_g_d
k_t_d	k_t_d
k_l_p	k_l_p
k_l_z	k_l_z
w_x_p	w_x_p
z_g_l	z_g_l
p_b_'	p_b_'
g_k_x	g_k_x
l_d_'	l_d_'
z_d_g	z_d_g
t_b_g	t_b_g
b_w_d	b_w_d
'_b_'	'_b_'
f_l_'	f_l_'
d_t_g	d_t_g
x_th_j	x_th_j
l_w_b	l_w_b
z_th_b	z_th_b

Process returned 0


Noteworthy nodes in each datafile include:

Language	Datafile name	Root nodes (Click a root to generate from it)	Remarks
Firen	syllables.yml	`Sentence`, `Noun`, `Verb`, `NominalRoot`, `VerbalRoot`	More information about Firen can be found on the Wiki.
Sajem Tan	sajemtan.yml	`Word`, `Root`, `Suffix`, `UnlikelyWord`, `UnlikelyRoot`, `UnlikelySuffix`	Sajem Tan is a collaborative conlang with its own website.
English	english.yml	`Sentence`	My (possibly poorly-considered) attempt to encode basic English grammar in WordGen. I apologise in advance to anyone who tries to make sense of it.
Dab vi Suxi Kidap	ffb.yml	`Sentence`, `Word`, `Compound`, `Syllable`	DVSK is a very simple isolating language created as a collaboration between me and 4 other people from the Sajem Tan tribe. It was abandoned after the foundations were worked out.
Xanz	xanz.yml	`word`, `tricons`, `root`, `word1`	Another collaborative language in the Sajem Tan universe. It is the source of triconsonantal roots in Sajem Tan.
Jafren	jafren.yml	`Sentence`, `Word`, `ChordL`, `Chord`, `Chord1`, `Chord2`, `Chord3`, `Chord4`, `Chord5`	A musical language used in the same setting as Firen. It is currently much less developed.
Jokes	jokes.yml	`Gender`	Someone on Mastodon posted a silly CFG for making gender jokes, so I encoded it as a WordGen datafile. Nothing more to it.
Numbers	numbers.yml	`number`, `phoneNumber`, `internationalPhoneNumber`	This is one of the first files I ever wrote, and it shows. It uses outdated and deprecated features of WordGen and makes the very questionable choice of using 'val' for a phonetic English reading of the number and 'ipa' for the digits.
Tests	CFGs.yml	`Dyck`, `binPalindrome`, `Node`	This file is a testing ground for things that are too simple to need their own files, and for new or experimental features. You will need to increase the recursion depth to use some of these roots, particularly `Node`, or else you will get a million errors.

Note that CFGs.yml is not allowed on this web interface because it uses more resources than the other files and relies on WordGen/Cpp features.

Feel free to look at the sources for WordGen/Py and WordGen/Cpp. wordgen.py is the current version of the script, and syllables.yml is the current version of the Firen data file.

This is the web frontend for a Python program that will produce random words using a (rather nifty) weighted-randomized macro expansion approach. IPA transcriptions are generated from the same file, and are not directly attached to the orthography. This means that "digraph recognition" is not even a concept to worry about.
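To make the idea concrete, here is a minimal sketch of weighted-randomized macro expansion. It is not the actual wordgen.py code or datafile format: the `GRAMMAR` layout, the node names `Root`, `C` and `V`, the weights, and the `expand` function are all assumptions made up for illustration. The `val` and `ipa` field names are borrowed from the Numbers remark above. Each leaf stores its spelling and its transcription side by side, which is why a spelling like "th" never has to be recognised as a digraph afterwards.

```python
import random

# Hypothetical data, not the real syllables.yml format. A node maps to
# weighted options. An option is either a list of node names (a macro to
# expand) or a leaf holding a spelling ("val") and a transcription ("ipa").
GRAMMAR = {
    "Root": [(1, ["C", "V", "C"])],
    "C": [(3, {"val": "t", "ipa": "t"}),
          (2, {"val": "k", "ipa": "k"}),
          (1, {"val": "th", "ipa": "θ"})],
    "V": [(2, {"val": "a", "ipa": "a"}),
          (1, {"val": "i", "ipa": "i"})],
}

def expand(node, rng, depth=0, max_depth=16):
    """Expand `node` into a (spelling, ipa) pair by weighted random choice."""
    if depth > max_depth:
        raise RecursionError(f"recursion depth exceeded at {node!r}")
    options = GRAMMAR[node]
    choice = rng.choices([opt for _, opt in options],
                         weights=[w for w, _ in options], k=1)[0]
    if isinstance(choice, dict):
        # Leaf: spelling and transcription come from the same entry.
        return choice["val"], choice["ipa"]
    # Macro: expand each child and concatenate both strings in parallel.
    pairs = [expand(child, rng, depth + 1, max_depth) for child in choice]
    return "".join(v for v, _ in pairs), "".join(i for _, i in pairs)

if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed so the run is repeatable
    for _ in range(5):
        print(*expand("Root", rng), sep="\t")
```

The `max_depth` guard plays the role of the "Recursion depth" setting: a recursive grammar can expand indefinitely, so the sketch gives up with an error once the limit is passed.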

In a second phase, regular expressions and Mealy-type finite state machines are applied to transform the output.
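A rough sketch of what such a second phase could look like follows. Both rules are invented for illustration and are not Firen's actual sound changes, and the function names `regex_phase`, `mealy_phase` and `transform` are hypothetical. The regular expression collapses doubled letters, and the Mealy machine emits one output symbol per input symbol, chosen from the current state and the current input.

```python
import re

VOWELS = set("aeiou")

def regex_phase(word):
    # Invented rule: collapse a doubled letter into one ("atta" -> "ata").
    return re.sub(r"(.)\1", r"\1", word)

def mealy_phase(word):
    # Two states: "vowel" (the previous symbol was a vowel) and "other".
    # Invented rule: "t" is written out as "d" when read in state "vowel".
    state = "other"
    out = []
    for ch in word:
        if state == "vowel" and ch == "t":
            out.append("d")
        else:
            out.append(ch)
        # The next state depends only on the symbol just read.
        state = "vowel" if ch in VOWELS else "other"
    return "".join(out)

def transform(word):
    """Apply the regex rewrite first, then the state machine."""
    return mealy_phase(regex_phase(word))

if __name__ == "__main__":
    for w in ("atta", "tat", "kitti"):
        print(w, "->", transform(w))
```

The order of the two steps matters: here the regex runs first, so the machine only ever sees the already-collapsed form.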

The Firen datafile is quite well-developed and generally produces good results. The IPA transcriptions are sometimes non-obvious because they include synchronic sound changes, and sometimes unnatural though still generally correct, as with the overzealous syllabification.

The other datafiles are in various stages of development.

Not that it matters or anything, but unless you provide your own seed, this web frontend has worse randomness because it simply uses Unix time as the seed. (The server has to generate the seed for the permalink to work, and time is the standard easy choice for these things.) When run from the command line without an explicit seed parameter, the randomness is much better, since Python seeds its random generator from the system's main entropy source. Maybe I could make this Base64-encode some bytes from /dev/urandom or something for the seed instead; it wouldn't change too much.
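The Base64 idea could look roughly like this. It is only a sketch: `make_seed` and `make_rng` are hypothetical names, the 12-byte length is an arbitrary choice, and none of this is what the frontend currently does. `os.urandom` reads from the operating system's entropy source, and the URL-safe Base64 alphabet keeps the seed usable inside a permalink.

```python
import base64
import os
import random

def make_seed(num_bytes=12):
    """Return a URL-safe text seed drawn from the OS entropy source."""
    # 12 bytes encode to exactly 16 Base64 characters with no "=" padding.
    return base64.urlsafe_b64encode(os.urandom(num_bytes)).decode("ascii")

def make_rng(seed=None):
    """Return (seed, generator); a fresh seed is created when none is given."""
    if seed is None:
        seed = make_seed()
    return seed, random.Random(seed)

if __name__ == "__main__":
    seed, rng = make_rng()
    print("seed for the permalink:", seed)
    print(rng.random())
```

Because the seed is returned alongside the generator, the server can still embed it in the permalink, and a later request that passes the same string to `make_rng` replays the same sequence.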

Working-1.py is an earlier, less flexible draft (Python 2 only) that technically knows nothing about words and only generates syllables. You may find it interesting or even useful. syllables1.yml is the data file for that version. The two versions are not compatible, but they are mostly similar, and a single file could in theory work with both.

Once this is "done", my next plan is to implement something with Markov chains, the more classical way to generate natural language.


Words: (Limit 250)
Show IPA:
Show glosses:
Show old orthography: (Sajem Tan only)
Show original IPA: (Sajem Tan only)
Show Ðab Tan: (Sajem Tan only)
Show Ðab Tan IPA: (Sajem Tan only)
Show ABC notation: (Jafren only)
Show alphabetical notation: (Jafren only)
Datafile:
Root Node:

Show paths (debug):
Show regex steps (debug):
Enable seed (debug):
Seed:
Recursion depth (debug):