Module:category tree/poscatboiler/data/terms by grammatical category
- The following documentation is located at Module:category tree/poscatboiler/data/documentation.
Introduction
This is the documentation page for the main data module for Module:category tree/poscatboiler, as well as for its submodules. Collectively, these modules handle generating the descriptions and categorization for almost all category pages. The only current exception is topic pages such as Category:en:Birds and Category:zh:State capitals of Germany, and the corresponding non-language-specific pages such as Category:Birds and Category:State capitals of Germany; these are handled by Module:category tree/topic cat.
Originally, there were a large number of Module:category tree implementations, of which Module:category tree/poscatboiler was only one. It originally handled part-of-speech categories like Category:French nouns and Category:German lemmas (and corresponding "umbrella" categories such as Category:Nouns by language and Category:Lemmas by language); hence the name. However, it has long since been generalized, and the name no longer describes its current use.
The main data module at Module:category tree/poscatboiler/data does not contain data itself, but rather imports the data from its submodules, and applies some post-processing.
- To find which submodule implements a specific category, use the search box on the right.
- To add a new data submodule, copy an existing submodule and modify its contents. Then, add its name to the subpages list at the top of Module:category tree/poscatboiler/data.
Concepts
The poscatboiler system internally distinguishes the following types of categories:
- Language categories. These are of the form LANG LABEL (e.g. Category:French lemmas and Category:English learned borrowings from Late Latin). Here, LANG is the name of a language, and LABEL can be anything, but should generally describe a topic that can apply to multiple languages. Note that the language mentioned by LANG must currently be a regular language, not an etymology-only language. (Etymology-only languages include lects such as Provençal, considered a variety of Occitan, and Biblical Hebrew, considered a variety of Hebrew. See here for the list of such lects.) Most language categories have an associated umbrella category; see below.
- Umbrella categories. These are normally of the form LABEL by language, and group all categories with the same label. Examples are Category:Lemmas by language and Category:Learned borrowings from Late Latin by language. Note that the label appears with an initial lowercase letter in a language category, but with an initial uppercase letter in an umbrella category, consistent with the general principle that category names are capitalized. Umbrella categories themselves are grouped into umbrella metacategories, which group related umbrella categories under a given high-level topic. Examples are Category:Lemmas subcategories by language (which groups umbrella categories describing different types of lemmas, such as Category:Nouns by language and Category:Interrogative adverbs by language) and Category:Terms derived from Proto-Indo-European roots (which groups umbrella categories describing terms derived from particular Proto-Indo-European roots, such as Category:Terms derived from the Proto-Indo-European root *preḱ- and Category:Terms derived from the Proto-Indo-European root *bʰeh₂- (speak)). The names of umbrella metacategories are not standardized (although many end in subcategories by language), and internally they are handled as raw categories; see below.
  - Note that umbrella categories are just a special type of parent category with built-in support in the category-handling system. In particular, some types of categories have what is logically an umbrella category but which has a nonstandard name. These are handled as just another parent category, with a separate raw-category entry for the parent itself. An example is categories of the form LANG phrasebook/AREA (e.g. Category:English phrasebook/Health), whose umbrella category has the nonstandard name Phrasebooks by language/AREA (e.g. Category:Phrasebooks by language/Health). Another example is categories of the form LANG terms borrowed back into LANG, with a nonstandard umbrella category Category:Terms borrowed back into the same language. Both of these examples are handled by disabling the standard umbrella category support and listing the nonstandard umbrella category as an additional parent.
  - Some umbrella categories are missing the by language suffix; an example is Category:Terms borrowed from Latin, which groups categories of the form LANG terms borrowed from Latin. There is special support for umbrella categories of this nature, so they do not need to be handled as described above for umbrella categories with nonstandard names.
- Language-specific categories. These are of the same form LANG LABEL as regular language categories, but with the difference that the label in question applies only to a single language, rather than to all or a large group of languages. Examples are Category:Belarusian class 4c verbs, Category:Dutch separable verbs with bloot, and Category:Japanese kanji by kan'yōon reading. For these categories, it does not make sense to have a corresponding umbrella category.
- Raw categories. These can have any form whatsoever, and may or may not have a language name in them. Examples are Category:Requests for images in Korean entries and Category:Rhymes:Polish/ajkɛ (which logically are language categories but do not follow the standard format of a language category); Category:Phrasebooks by language/Health (which is logically an umbrella category, but again with a nonstandard name); Category:Terms by etymology subcategories by language (an umbrella metacategory); and Category:Templates (a miscellaneous high-level category).
Under the hood, the poscatboiler system distinguishes two types of implementations for categories: individual labels (or individual raw categories), and handlers. Individual labels describe a single label, such as nouns or refractory rhymes. Similarly, an individual raw category describes a single raw category. Handlers, on the other hand, describe a whole class of similar labels or raw categories, e.g. labels of the form learned borrowings from SOURCE where SOURCE is any language or etymology language. Handlers are more powerful than individual labels, but require knowledge of Lua to implement.
Adding, removing or modifying categories
A sample entry (in this case, found in Module:category tree/poscatboiler/data/lemmas) defines the label adjectives and sets three fields: description, parents and umbrella_parents.
This generates the description and categorization for all categories of the form "LANG adjectives" (e.g. Category:English adjectives or Category:Norwegian Bokmål adjectives), as well as for the umbrella category Category:Adjectives by language.
The meanings of these fields are as follows:
- The description field gives the description text that will appear when a user visits the category page. Here, {{{langname}}} is automatically replaced with the name of the language in question. The text in this field is also used to generate the description of the umbrella category Category:Adjectives by language, by chopping off the {{{langname}}} and capitalizing the next letter.
- The parents field gives the labels of the parent categories. For example, Category:English adjectives will have Category:English lemmas as its parent category, and Category:Norwegian Bokmål adjectives will have Category:Norwegian Bokmål lemmas as its parent category. The umbrella category Category:Adjectives by language will automatically be added as an additional parent.
- The umbrella_parents field specifies the parent category of the umbrella category Category:Adjectives by language (i.e. the umbrella metacategory which this page belongs to; see #Concepts above).
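As a rough illustration of the three fields just described, an entry for adjectives might look something like the sketch below. This is not copied from the data module: the description wording is a placeholder, the parents and umbrella_parents values are inferred from the category names mentioned above, and umbrella_description is a hypothetical helper that only mimics the "chop off {{{langname}}} and capitalize" derivation.

```lua
-- Hypothetical reconstruction of a label entry; the description text is a
-- placeholder, not the wording used in the real data module.
local labels = {}

labels["adjectives"] = {
	description = "{{{langname}}} terms that describe nouns.",
	parents = {"lemmas"},
	umbrella_parents = "Lemmas subcategories by language",
}

-- Illustrative helper (an assumption, not the module's own code): derive the
-- umbrella description by dropping the leading {{{langname}}} and
-- capitalizing the first remaining letter (ASCII only in this sketch).
local function umbrella_description(desc)
	-- Assigning to a single local keeps only gsub's first return value.
	local rest = desc:gsub("^{{{langname}}}%s*", "")
	return rest:sub(1, 1):upper() .. rest:sub(2)
end

print(umbrella_description(labels["adjectives"].description))
```

The helper handles only a leading {{{langname}}}; descriptions that place the language name elsewhere would need different treatment.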
Category label fields
The following fields are recognized for the object describing a label:
parents
- A table listing one or more parent labels of this label. This controls the parent categories that the category is contained within, as well as the chain of breadcrumbs appearing across the top of the page (see below).
  - An item in the table can be either a single string (the parent label), or a table containing (at least) the two elements name and sort. In the latter case, name specifies the parent label name, while the sort value specifies the sort key to use to sort it in that category. The default sort key is the category's label.
  - The first listed parent controls the category's parent breadcrumb in the chain of breadcrumbs at the top of the page. (The breadcrumb of the category itself is determined by the breadcrumb setting, as described below.)
  - Normally the specified parent label refers to another category of the same type. That is, if the category is a language category, the parent label specifies another language category with the same language, and if the category is a raw category, the parent label specifies another raw category. This can be changed as follows:
    - If the category is a language category:
      - Add the property raw = true to specify that the parent is a raw category.
      - Add the property lang = "code" to specify that the parent is a language category for the language code code instead of the current language. Note that template substitutions happen in the lang field; see #Template substitutions in field values below.
      - Add the property {{{1}}} to specify that the parent is an umbrella category.
    - If the category is a raw category:
      - Add the properties {{{1}}} and lang = "code" to specify that the parent is a language category with the specified language code code. Template substitutions happen in the lang field, as above.
      - Add the properties {{{1}}} and {{{1}}} to specify that the parent is an umbrella category.
  - If a parent label begins with Category: it is interpreted as a category outside the poscatboiler system. It can still have its own sort key as usual.
  - Parent items can also have the properties {{{1}}} to specify a script code script_code for script-specific categories (e.g. Category:Pali nouns in Thai script) and/or {{{1}}} to specify additional arguments, for categories implemented using a handler that accepts or requires additional arguments passed to {{auto cat}} (e.g. a category like Category:Latin terms suffixed with -inus or Category:Okinawan language). Template substitutions happen in the values of both of these properties; see #Template substitutions in field values below.
description
- A plain English description for the label. This should generally be no longer than one sentence. Place additional, longer explanatory text in the additional field described below, and put {{wikipedia}} boxes in the topright field described below so that they are correctly right-aligned with the description. Template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} will be expanded appropriately; see #Template substitutions in field values below.
breadcrumb
- The text of the last breadcrumb that appears at the top of the category page.
- By default, it is the same as the category label, with the first letter capitalized.
- The value can be either a string, or a table containing two elements called name and nocap. In the latter case, name specifies the breadcrumb text, while nocap can be used to disable the automatic capitalization of the breadcrumb text that normally happens.
- Note that the breadcrumbs collectively are the chain of links that serve as a navigation aid for the hierarchical organization of categories. For example, a category like Category:French adjectives will have a breadcrumb chain similar to "Fundamental » All languages » French » Lemmas » Adjectives", where each breadcrumb is a link to a category at the appropriate level. The last breadcrumb here is "Adjectives", and its text is controlled by this field.
displaytitle
- Apply special formatting such as italics to the category page title, as with the {{DISPLAYTITLE:...}} magic word (see mw:Help:Magic words). The value of this is either a string (which should be the formatted category title, without the preceding Category:) or a Lua function to generate the formatted category title. A Lua function is most useful inside of a handler (see #Handlers below). The Lua function is passed two parameters, the raw category title (without the preceding Category:) and the language object of the category's language (or nil for umbrella categories), and should return the formatted category title (again without the preceding Category:). If the value of this field is a string, template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} will be expanded appropriately; see below. See Module:category tree/poscatboiler/data/terms by etymology and Module:category tree/poscatboiler/data/lang-specific/nl for examples of using displaytitle.
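A displaytitle function could be sketched as follows. This is only an illustration under assumptions: the function name and the pattern are invented for the example, the language object is ignored, and wikitext italics (two apostrophes on each side) are used to italicize whatever follows the last " with " in the title.

```lua
-- Hypothetical displaytitle callback. `title` is the category title without
-- the "Category:" prefix; `lang` is the language object (nil for umbrella
-- categories) and is unused in this sketch.
local function italicize_after_with(title, lang)
	-- Wrap the text after the final " with " in wikitext italics.
	-- The outer parentheses drop gsub's second return value (the match count).
	return (title:gsub("^(.* with )(.+)$", "%1''%2''"))
end

print(italicize_after_with("Dutch separable verbs with bloot", nil))
```

Titles that do not contain " with " are returned unchanged, so the function is safe to use for a whole class of categories.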
topright
- Introductory text to display right-aligned, before the edit and recent-entries boxes on the right side. This field should be used for {{wikipedia}} and other similar boxes. Template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} are expanded appropriately, just as with description; see #Template substitutions in field values below. Compare the preceding field, which is similar to topright but used for left-aligned text placed above the description.
preceding
- Introductory text to display directly before the text in the description field. The difference between the two is that description text will also be shown in the list of children categories shown on the parent category's page, while the preceding text will not. For this reason, use preceding instead of description for {{also}} hatnotes and similar text, and keep description relatively short. Template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} are expanded appropriately, just as with description; see #Template substitutions in field values below. Compare the topright field, which is similar to preceding but is right-aligned, placed above the edit and recent-entries boxes.
additional
- Additional text to display directly after the text in the description field. The difference between the two is that description text will also be shown in the list of children categories shown on the parent category's page, while the additional text will not. For this reason, use additional instead of description for long explanatory notes, "See also" references and the like, and keep description relatively short. Template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} are expanded appropriately, just as with description; see #Template substitutions in field values below.
umbrella
- A table describing the umbrella category that collects all language-specific categories associated with this label, or the special value false to indicate that there is no umbrella category. The umbrella category is normally called "LABEL by language". For example, for adjectives, the umbrella category is named Category:Adjectives by language, and is a parent category (in addition to any categories specified using parents) of Category:English adjectives, Category:French adjectives, Category:Norwegian Bokmål adjectives, and all other language-specific categories holding adjectives. This table contains the following fields:
  name
  - The name of the umbrella category. It defaults to "LABEL by language". You should not use this, even if the umbrella category has a nonstandard name, because if you set it, you will have to modify Module:auto cat to recognize the new name of the umbrella category. Instead, set umbrella = false and list the nonstandard umbrella category as an additional parent (and add a raw-category entry for the umbrella category itself; see the implementation of categories like Category:English terms borrowed back into English for an example).
  description
  - A plain English description for the umbrella category. By default, it is derived from the description field of the category itself by removing any {{{langname}}}, {{{langcode}}} or {{{langcat}}} template parameter reference and capitalizing the remainder. Text is automatically added to the end indicating that this category is an umbrella category that only contains other categories, and does not contain pages describing terms.
  parents
  - The parent category or categories of the umbrella category. This can either be a single string specifying a category (with or without the Category: prefix), a table with fields name (the category name) and sort (the sort key, as in the outer parents field described above), or a list of either type of entity.
  breadcrumb
  - The last breadcrumb in the chain of breadcrumbs at the top of the category page; see above. By default, this is the category label (i.e. the same as the umbrella category name, minus the final "by language" text).
  displaytitle
  - Apply special formatting such as italics to the umbrella category page title; see above.
  topright
  - Like the topright field on regular category pages; see above.
  preceding
  - Like the preceding field on regular category pages; see above.
  additional
  - Like the additional field on regular category pages; see above.
  toc_template, toc_template_full
  - Override the table of contents bar used on umbrella pages. See below. It's unlikely you will ever need to set this.
umbrella_parents
- The same as the parents subfield of the umbrella field. This typically specifies a single umbrella metacategory to which the page's corresponding umbrella page belongs (see #Concepts above). A separate field is provided for this because the umbrella's parent or parents always need to be given, whereas other umbrella properties can usually be defaulted. (In practice, you will find that most entries in a subpage of Module:category tree/poscatboiler/data do not explicitly specify the umbrella's parent. This is because a default value is supplied near the end of the "LABELS" section in which the entry is found.)
toc_template
- The template or templates to use to display the "table of contents" bar for easier navigation on categories with multiple pages of entries. By default, categories with more than 200 entries or 200 subcategories display a language-appropriate table of contents bar whose contents are held in a template named CODE-categoryTOC, where CODE is the language code of the category's language. (If no such template exists, no table of contents bar is displayed. If the category has no associated language, as with umbrella pages, the English-language table of contents bar is used.) For example, the category Category:Spanish interjections (and other Spanish-language categories) use {{es-categoryTOC}} to display a Spanish-appropriate table of contents bar. (In the case of Spanish, this includes entries for Ñ and for acute-accented vowels such as Á and Ó.) To override this behavior, specify a template or a list of templates in toc_template. The first template that exists will be used; if none of the specified templates exist, the regular behavior applies, i.e. the language-appropriate table of contents bar is selected.
  - Special strings such as {{{langcode}}} (to specify the language code of the category's language) can be used in the template names; see below.
  - Use the special value false to disable the table of contents bar.
  - An example of a category that uses this property is "LANG romanizations". For example, the category Category:Gothic romanizations would by default use the Gothic-specific template {{got-categoryTOC}} to display a Gothic-script table of contents bar. This is inappropriate for this particular category, which contains Latin-script romanizations of Gothic terms rather than terms written in the Gothic script. To fix this, the "romanizations" label specifies a toc_template value of {"{{{langcode}}}-rom-categoryTOC", "en-categoryTOC"}, which first checks for a special Gothic-romanization-specific template {{got-rom-categoryTOC}} (which in this case does exist), and falls back to the English-language table of contents template.
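The fallback order described for toc_template can be sketched roughly as below. This is not the module's actual code: pick_toc_template is a hypothetical name, and the exists callback is a stand-in for whatever check the real implementation uses to test whether a template page exists.

```lua
-- Sketch of the toc_template fallback order, under the assumptions above.
local function pick_toc_template(candidates, langcode, exists)
	-- Categories without a language use the English table of contents.
	langcode = langcode or "en"
	-- The special value false disables the table of contents bar.
	if candidates == false then
		return nil
	end
	-- Accept either a single template name or a list of names.
	if type(candidates) == "string" then
		candidates = {candidates}
	end
	for _, name in ipairs(candidates or {}) do
		-- Expand the special {{{langcode}}} reference in the template name.
		local expanded = name:gsub("{{{langcode}}}", function() return langcode end)
		if exists(expanded) then
			return expanded
		end
	end
	-- None of the listed templates exist: use the language-appropriate default.
	local default = langcode .. "-categoryTOC"
	if exists(default) then
		return default
	end
	return nil -- no table of contents bar is displayed
end

-- Usage with a made-up set of existing templates.
local existing = {
	["got-rom-categoryTOC"] = true,
	["en-categoryTOC"] = true,
	["es-categoryTOC"] = true,
}
local function exists(name)
	return existing[name] == true
end

print(pick_toc_template({"{{{langcode}}}-rom-categoryTOC", "en-categoryTOC"}, "got", exists))
```

Passing the existence check in as a function keeps the sketch self-contained and makes the fallback order easy to exercise without a wiki.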
toc_template_full
- Similar to toc_template but used for categories with large numbers of entries (specifically, more than 2,500 entries or 2,500 subcategories). If none of the specified templates exist, the templates listed in toc_template are tried, and if none of them exist either, the default behavior applies. In this case, the default behavior is to use a language-appropriate "full" table of contents template named CODE-categoryTOC/full, and if that doesn't exist, fall back to the regular table of contents template named CODE-categoryTOC. An example of a "full" table of contents template is {{es-categoryTOC/full}}, which shows links for all two-letter combinations and appears on pages such as Category:Spanish nouns, with over 50,000 entries.
catfix
- Specifies the language code of the language to use when calling the catfix() function in Module:utilities on this page. The catfix() function is used to ensure that page names in foreign scripts show up in the correct fonts and are linked to the correct language.
  - The default value is the category's language, if any (for example, the language LANG in pages of the form LANG LABEL). If the category has no associated language, or if the setting catfix = false is used, the catfix mechanism is not applied.
  - The setting catfix = false is used, for example, on the romanizations label (which holds Latin-script romanizations of foreign-script terms, rather than terms in the language's native script) and the terms with redundant transliterations label (which holds pages mentioning terms in the language in question with redundant transliterations). If this is omitted, for example, then pages in Category:Manchu romanizations will show up oriented vertically despite being in Latin script, and pages in Category:Cantonese terms with redundant transliterations will show up using a double-width font despite mostly not being Cantonese-language pages.
  - The setting catfix = "en" is used, for example, on categories of the form Requests for translations into LANG (see Module:category tree/poscatboiler/data/entry maintenance) because these categories contain English pages that need translations into a given language, rather than containing pages of that language.
  - Note that setting a particular language for catfix will normally cause that language's table of contents page to display in place of the category's normal language, and setting a value of false will normally cause the English table of contents page to display. In both cases, this behavior can be overridden by specifying the toc_template or toc_template_full fields.
hidden = true
- Specifies that the category is hidden. This should be used for maintenance categories. (Hidden categories do not show up in the list of categories at the bottom of a page, but do show up when searched for in the search box.)
can_be_empty = true
- Specifies that the category should not be deleted when empty. This should be used for maintenance categories.
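Putting several of these fields together, a label entry could look like the sketch below. The label name, description text and parent label are invented for illustration and do not correspond to real entries; only the field names and the shapes of their values follow the descriptions above.

```lua
local labels = {}

-- Hypothetical maintenance label; all names here are made up for the example.
labels["example maintenance terms"] = {
	description = "{{{langname}}} terms flagged for an example maintenance task.",
	additional = "Longer explanatory notes belong here rather than in the description.",
	-- Table form of breadcrumb, with automatic capitalization disabled.
	breadcrumb = {name = "example maintenance", nocap = true},
	parents = {
		-- A table item with an explicit sort key.
		{name = "entry maintenance", sort = "example"},
		-- A raw (non-language) parent category.
		{name = "{{{langcat}}}", raw = true},
	},
	umbrella = false,     -- no "... by language" umbrella category
	toc_template = false, -- suppress the table of contents bar
	catfix = false,       -- do not apply the catfix mechanism
	hidden = true,        -- hide from the category list at the bottom of pages
	can_be_empty = true,  -- do not delete the category when it is empty
}
```

The first listed parent (here the hypothetical entry maintenance label) is the one that would supply the parent breadcrumb.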
Template substitutions in field values
Arbitrary template invocations can be inserted in the text of description, parents (both name and sort key), breadcrumb, toc_template and toc_template_full values, and will be expanded appropriately. In addition, the following special template-like invocations are recognized and replaced by the equivalent text:
{{PAGENAME}}
- The name of the current page. (Note that two braces are used here instead of three, as with the other parameters described below.)
{{{langname}}}
- The name of the language that the category belongs to. Not recognized in umbrella fields.
{{{langcode}}}
- The code of the language that the category belongs to (e.g. en for English, de for German). Not recognized in umbrella fields.
{{{langcat}}}
- The name of the language's main category, which adds "language" to the regular name. Not recognized in umbrella fields.
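The special references can be pictured as simple string replacements, as in the sketch below. The function name expand_specials and the plain lang table with name and code fields are assumptions made for this example (the real module works with language objects), and the sketch does not expand ordinary template invocations.

```lua
-- Illustrative expansion of the special references listed above.
local function expand_specials(text, lang, pagename)
	local values = {
		["{{{langname}}}"] = lang.name,
		["{{{langcode}}}"] = lang.code,
		-- The main category name adds "language" to the regular name.
		["{{{langcat}}}"] = lang.name .. " language",
	}
	-- Three-brace references: unknown names return nil and are left untouched.
	text = text:gsub("{{{%a+}}}", function(ref) return values[ref] end)
	-- {{PAGENAME}} uses two braces rather than three.
	text = text:gsub("{{PAGENAME}}", function() return pagename end)
	return text
end

local french = {name = "French", code = "fr"}
print(expand_specials("{{{langname}}} nouns ({{{langcode}}}), see {{{langcat}}}.", french, "French nouns"))
```

Using a function as the replacement avoids having to escape percent signs that might occur in language or page names.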
Raw categories
Raw categories are treated similarly to regular labels. The main differences are:
- They are stored in a separate raw_categories table. The key is the full category name (rather than the label name, as in the case of language categories), and the value is a structure much like for language categories.
- Raw categories have no corresponding umbrella category, so the umbrella and umbrella_parents fields are unnecessary and do nothing. If you want an umbrella category that groups several related raw categories, you should add the umbrella category yourself as an additional parent (and create a separate entry in the raw_categories table for this umbrella category).
See Module:category tree/poscatboiler/data/modules for an example of a module with several labels and raw categories.
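A pair of raw-category entries might be sketched as follows. Every category name and description here is made up for illustration; the point is only that the key is the full category name and that a grouping category needs its own entry.

```lua
local raw_categories = {}

-- Hypothetical raw category; the key is the full category name.
raw_categories["Example widget modules"] = {
	description = "Modules that implement example widgets.",
	-- The grouping category is listed explicitly as a parent.
	parents = {"Example modules by purpose"},
}

-- The grouping category gets its own entry, since raw categories have no
-- automatic umbrella category.
raw_categories["Example modules by purpose"] = {
	description = "Categories grouping example modules by what they do.",
	-- A "Category:" prefix marks a parent outside the poscatboiler system.
	parents = {"Category:Example root"},
}
```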
Handlers
It is also possible to have handlers that can handle arbitrarily-formed labels and raw categories. There are two types of handlers:
- label handlers handle language categories such as lang ###-syllable words for any lang and ### (e.g. Category:English 3-syllable words), and lang learned borrowings from source for any lang and source (e.g. Category:Spanish learned borrowings from Ancient Greek);
- raw handlers handle raw categories such as Rhymes:lang/rhyme for any lang and rhyme (e.g. Category:Rhymes:Polish/ajkɛ).
Note that the difference between the two is that label handlers are used for categories prefixed with the language name (and associated umbrella categories, such as Category:3-syllable words by language and Category:Learned borrowings from Ancient Greek by language), while raw handlers are used for arbitrarily-named raw categories. Raw categories may have a language name or code in them (as in the example above), but it generally does not occur as a prefix.
As an example, consider the label handler for labels of the form terms coined by coiner (such as Category:English terms coined by Lewis Carroll).
The handler checks if the passed-in label has a recognized form, and if so, returns an object that follows the same format as described above for directly-specified labels. In this case, the handler disables the umbrella category Terms coined by coiner by language because most people coin words in only one language.
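A label handler of this kind might look roughly like the sketch below. It is an illustration rather than the module's actual code: the description wording and the coinages parent label are assumptions, and returning nothing for an unrecognized label is presumed from the description above.

```lua
local handlers = {}

-- Sketch of a label handler for labels of the form "terms coined by COINER".
table.insert(handlers, function(data)
	local coiner = data.label:match("^terms coined by (.+)$")
	if coiner then
		return {
			description = "{{{langname}}} terms coined by " .. coiner .. ".",
			breadcrumb = coiner,
			umbrella = false, -- most people coin words in only one language
			-- "coinages" is a hypothetical parent label.
			parents = {{name = "coinages", sort = coiner}},
		}
	end
	-- Falling through returns nothing, i.e. the label is not recognized here.
end)

local obj = handlers[1]({label = "terms coined by Lewis Carroll"})
print(obj.breadcrumb)
```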
The handler is passed a single argument data, which is an object containing the following fields:
- label: the label;
- lang: the language object of the language at the beginning of the category, or nil for no language (this happens with umbrella categories);
- sc: the script code of the script mentioned in the category, if the category is of the form lang label in script, or nil otherwise;
- args: a table of extra parameters passed to {{auto cat}}.
If the handler interprets the extra parameters passed as data.args, it should return two values: a label object (as described above), and the value true. Otherwise, an error will be thrown if any extra parameters are passed to {{auto cat}}. An example of a handler that interprets the extra parameters is the affix-cat handler in Module:category tree/poscatboiler/data/terms by etymology, which supports {{auto cat}} parameters |alt=, |sort=, |tr= and |sc=. The |alt= parameter in particular is used to specify extra diacritics to display on the affix that forms part of the category name, as in categories such as Category:Latin terms suffixed with -inus (properly -īnus).
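The two-value return convention can be sketched as below. This is a simplified illustration, not the affix-cat handler itself: the function name, the description wording, the terms by suffix parent label and the way |alt= is used are all assumptions.

```lua
-- Sketch of a handler that consumes extra {{auto cat}} parameters.
local function suffix_handler(data)
	local suffix = data.label:match("^terms suffixed with (.+)$")
	if not suffix then
		return nil
	end
	-- Prefer the display form given via |alt=, if any.
	local display = data.args and data.args.alt or suffix
	return {
		description = "{{{langname}}} terms ending with the suffix " .. display .. ".",
		breadcrumb = display,
		-- "terms by suffix" is a hypothetical parent label.
		parents = {{name = "terms by suffix", sort = suffix}},
	}, true -- second value: this handler has dealt with data.args
end

local obj, handled = suffix_handler({
	label = "terms suffixed with -inus",
	args = {alt = "-īnus"},
})
print(obj.breadcrumb, handled)
```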
For further examples, see Module:category tree/poscatboiler/data/terms by lexical property, Module:category tree/poscatboiler/data/terms by script or Module:category tree/poscatboiler/data/terms by etymology.
Raw handlers are similar to label handlers in that they also accept a single argument data, but this object contains only the following fields:
- category: the raw category;
- args: a table of extra parameters passed to {{auto cat}}.
Here, there is no language or script object passed in. If there is a language in the category name, it needs to be handled inside of the handler. An example is the raw handler for categories of the form Varieties of lang, which has to look up the language named in the category itself.
Note that if a handler is specified, the module should return a table holding both the label and handler data; see the above modules.
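A raw handler and the table a data submodule hands back might be sketched as follows. Everything specific here is an assumption: known_languages stands in for a real language lookup, the parent category name is invented, and the key names in the export table are illustrative rather than confirmed by this page.

```lua
local labels, raw_categories = {}, {}
local handlers, raw_handlers = {}, {}

-- Stand-in for a real language lookup; this table is an assumption.
local known_languages = {["French"] = "fr", ["Polish"] = "pl"}

-- Sketch of a raw handler for categories of the form "Varieties of LANG".
table.insert(raw_handlers, function(data)
	local langname = data.category:match("^Varieties of (.+)$")
	-- The language is resolved inside the handler, since none is passed in.
	if langname and known_languages[langname] then
		return {
			description = "Categories for varieties of " .. langname .. ".",
			-- Hypothetical raw parent category.
			parents = {{name = "Example language varieties", sort = langname}},
		}
	end
end)

-- A real module would end with `return export`; the key names below are
-- assumptions made for this sketch.
local export = {
	LABELS = labels,
	RAW_CATEGORIES = raw_categories,
	HANDLERS = handlers,
	RAW_HANDLERS = raw_handlers,
}

print(export.RAW_HANDLERS[1]({category = "Varieties of French"}).description)
```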
Language-specific labels and handlers
Support exists for labels and handlers that are specialized to particular languages. A typical label such as verbs applies to many languages, but some categories have labels that are specialized to a particular language, e.g. Category:Belarusian class 4c verbs or Category:Dutch prefixed verbs with ver-. Here, the label class 4c verbs is specific to Belarusian with a description and other properties only for this particular language, and similarly for the Dutch-specific label prefixed verbs with ver-. Yet, it is desirable to integrate these categories into the poscatboiler hierarchy, so that e.g. breadcrumbs and other features are available. This can be done by creating a module such as Module:category tree/poscatboiler/data/lang-specific/be (for Belarusian) or Module:category tree/poscatboiler/data/lang-specific/nl (for Dutch), and specifying labels and/or handlers in the same fashion as is done for language-agnostic categories. See Module:category tree/poscatboiler/data/lang-specific/documentation for more information. Note that once you create a per-language module, you must add the language code to the langs_with_modules table in Module:category tree/poscatboiler/data/lang-specific listing all the languages with language-specific modules; otherwise, the corresponding categories won't be recognized.
Subpages
local labels = {}
local raw_categories = {}
local handlers = {}
-----------------------------------------------------------------------------
-- --
-- LABELS --
-- --
-----------------------------------------------------------------------------
labels["terms by grammatical category"] = {
	description = "{{{langname}}} terms categorized by their grammatical category.",
	umbrella_parents = "මූලධර්ම",
	parents = {{name = "{{{langcat}}}", raw = true}},
}
------- GENDER -------
for _, pos in ipairs { "නාම පද", "pronouns", "proper nouns", "ප්රත්ය" } do
	labels[pos .. ", ලිංග භේදය අනුව"] = {
		description = "ලිංග භේදය අනුව පෙළ ගසා ඇති {{{langname}}} " .. pos .. " මෙහි දැක්වෙයි.",
		breadcrumb = "ලිංග භේදය අනුව",
		parents = {{name = pos, sort = "gender"}},
	}
	labels[pos .. " with irregular gender"] = {
		description = "{{{langname}}} " .. pos .. " whose ending is not typical for " .. pos .. " of their gender.",
		breadcrumb = "with irregular gender",
		parents = {{name = "irregular " .. pos, sort = "irregular gender"}},
	}
	labels[pos .. " with multiple genders"] = {
		description = "{{{langname}}} " .. pos .. " that belong to more than one gender.",
		breadcrumb = "with multiple genders",
		parents = {{name = pos .. ", ලිංග භේදය අනුව", sort = "multiple genders"}},
	}
	labels["common-gender " .. pos] = {
		description = "{{{langname}}} " .. pos .. " of {{glossary|common gender}}, i.e. belonging to a gender category that combines the function of {{glossary|masculine}} and {{glossary|feminine}} and is opposed to the {{glossary|neuter}} gender.",
		breadcrumb = "common-gender",
		parents = {pos .. ", ලිංග භේදය අනුව"},
	}
	labels["ස්ත්රී ලිංග " .. pos] = {
		description = "{{{langname}}} " .. pos .. " of {{glossary|feminine}} gender, i.e. belonging to a gender category that contains (among other things) female beings.",
		breadcrumb = "ස්ත්රී ලිංග",
		parents = {pos .. ", ලිංග භේදය අනුව"},
	}
	labels["පුරුෂ ලිංග " .. pos] = {
		description = "{{{langname}}} " .. pos .. " of {{glossary|masculine}} gender, i.e. belonging to a gender category that contains (among other things) male beings.",
		breadcrumb = "පුරුෂ ලිංග",
		parents = {pos .. ", ලිංග භේදය අනුව"},
	}
	labels["පුරුෂ ලිංග සහ ස්ත්රී ලිංග " .. pos .. " by sense"] = {
		description = "{{{langname}}} " .. pos .. " that may be either {{glossary|masculine}} or {{glossary|feminine}} depending on whether they refer to male or female beings.",
		breadcrumb = "පුරුෂ ලිංග සහ ස්ත්රී ලිංග by sense",
		parents = {pos .. ", ලිංග භේදය අනුව"},
	}
	labels["නපුංසක ලිංග " .. pos] = {
		description = "{{{langname}}} " .. pos .. " of {{glossary|neuter}} gender, i.e. belonging to a gender category that does not usually contain male or female beings.",
		breadcrumb = "නපුංසක ලිංග",
		parents = {pos .. ", ලිංග භේදය අනුව"},
	}
	labels["gender-neutral " .. pos] = {
		description = "{{{langname}}} " .. pos .. " that are applicable to all people, independent of gender.",
		breadcrumb = "gender-neutral",
		parents = {pos .. ", ලිංග භේදය අනුව", "gender-neutral terms"},
	}
end
for _, pos in ipairs({"adjectives", "ප්රත්ය"}) do
	labels["epicene " .. pos] = {
		description = "{{{langname}}} " .. pos .. " whose form is the same for both {{glossary|masculine}} and {{glossary|feminine}}, in languages whose " .. pos .. " normally distinguish gender.",
		breadcrumb = "epicene",
		parents = {pos .. ", වරනැගීම් වර්ගය අනුව"},
}
end
------- NOUN CLASSES -------
labels["nouns by class"] = {
description = "{{{langname}}} nouns organized by the class they belong to.",
breadcrumb = "by class",
parents = {{name = "නාම පද", sort = "class"}},
}
labels["alienable nouns"] = {
description = "{{{langname}}} nouns that are [[w:Inalienable possession|alienably possessed]].",
breadcrumb = "alienable",
parents = {"නාම පද"},
}
labels["inalienable nouns"] = {
description = "{{{langname}}} nouns that are [[w:Inalienable possession|inalienably possessed]].",
breadcrumb = "inalienable",
parents = {"නාම පද"},
}
------- ANIMACY -------
for _, pos in ipairs({"නාම පද", "ප්රත්ය", "ක්රියා පද"}) do
labels["ප්රාණවාචී " .. pos] = {
description = "{{{langname}}} " .. pos .. " that refer to humans or animals.",
breadcrumb = "animate",
parents = {pos},
}
labels["අප්රාණවාචී " .. pos] = {
description = "{{{langname}}} " .. pos .. " that refer to inanimate objects (not humans or animals).",
breadcrumb = "inanimate",
parents = {pos},
}
labels[pos .. " with multiple animacies"] = {
description = "{{{langname}}} " .. pos .. " that belong to more than one animacy.",
breadcrumb = "with multiple animacies",
parents = {{name = pos, sort = "multiple animacies"}},
}
end
for _, pos in ipairs({"නාම පද", "ප්රත්ය"}) do
-- This category should be used particularly in languages that have
-- grammatical distinctions related to animals, such as Ukrainian.
labels["animal " .. pos] = {
description = "{{{langname}}} " .. pos .. " that refer to animals.",
breadcrumb = "animal",
parents = {"ප්රාණවාචී " .. pos},
}
-- This category should be used particularly in languages that have
-- grammatical distinctions related to men, such as Polish.
labels["nonvirile " .. pos] = {
description = "{{{langname}}} plural " .. pos .. " that refer to a group without male humans.",
breadcrumb = "nonvirile",
parents = {pos, "pluralia tantum"},
}
labels["personal " .. pos] = {
description = "{{{langname}}} " .. pos .. " that refer to humans.",
breadcrumb = "personal",
parents = {"ප්රාණවාචී " .. pos},
}
-- This category should be used particularly in languages that have
-- grammatical distinctions related to men, such as Polish.
labels["virile " .. pos] = {
description = "{{{langname}}} plural " .. pos .. " that refer to a group with at least one male human.",
breadcrumb = "virile",
parents = {pos, "pluralia tantum"},
}
end
------- INFLECTED PARTS OF SPEECH -------
-- Add "POS by inflection type", "irregular POS" and "POS by tone"
-- categories for (potentially) inflected parts of speech.
local inflected_poses = {
"adjectives",
"adverbs",
"determiners",
"නාම පද",
"සංඛ්යාංක",
"participles",
"pronouns",
"proper nouns",
"ප්රත්ය",
"ක්රියා පද",
}
for _, pos in ipairs(inflected_poses) do
labels[pos .. ", වරනැගීම් වර්ගය අනුව"] = {
description = "{{{langname}}} " .. pos .. " organized by the type of inflection they follow.",
breadcrumb = "by inflection type",
parents = {{name = pos, sort = "inflection"}},
}
labels["irregular " .. pos] = {
description = "{{{langname}}} " .. pos .. " that follow non-standard patterns of inflection.",
breadcrumb = "irregular",
parents = {pos .. ", වරනැගීම් වර්ගය අනුව"},
}
labels["defective " .. pos] = {
description = "{{{langname}}} " .. pos .. " that lack one or more forms in their inflections.",
breadcrumb = "defective",
parents = {pos, "irregular " .. pos},
}
labels["suppletive " .. pos] = {
description = "{{{langname}}} " .. pos .. " that have inflected forms from different roots.",
breadcrumb = "suppletive",
umbrella_parents = "Suppletion subcategories by language",
parents = {"irregular " .. pos},
}
if pos ~= "ක්රියා පද" and pos ~= "adverbs" then
labels["අව්යය " .. pos] = {
description = "{{{langname}}} " .. pos .. " that do not display additional grammatical relations by means of declension.",
breadcrumb = "අව්යය",
parents = {pos .. ", වරනැගීම් වර්ගය අනුව"},
}
labels[pos .. " with multiple declensions"] = {
description = "{{{langname}}} " .. pos .. " that follow more than one type of inflection.",
breadcrumb = "with multiple declensions",
parents = {{name = pos .. ", වරනැගීම් වර්ගය අනුව", sort = "multiple declensions"}},
}
labels[pos .. " with multiple plurals"] = {
description = "{{{langname}}} " .. pos .. " that have more than one possible plural (sometimes with distinct meanings).",
breadcrumb = "with multiple plurals",
parents = {{name = pos .. ", වරනැගීම් වර්ගය අනුව", sort = "multiple plurals"}},
}
end
labels[pos .. " by tone"] = {
description = "{{{langname}}} " .. pos .. " organized by the tone they follow.",
breadcrumb = "by tone",
parents = {{name = pos .. ", වරනැගීම් වර්ගය අනුව", sort = "tone"}},
}
labels[pos .. " by vowel harmony"] = {
description = "{{{langname}}} " .. pos .. " organized by the vowel harmony they follow.",
breadcrumb = "by vowel harmony",
parents = {{name = pos .. ", වරනැගීම් වර්ගය අනුව", sort = "vowel harmony"}},
}
end
-- FIXME: Only used currently for Arabic; probably should be removed as a general category.
labels["irregular elative adjectives"] = {
description = "{{{langname}}} elative adjectives that follow non-standard patterns of inflection.",
parents = {"adjectives, වරනැගීම් වර්ගය අනුව"},
}
for _, pos in ipairs { "නාම පද", "proper nouns", "pronouns" } do
labels[pos .. " with unattested plurals"] = {
description = "{{{langname}}} " .. pos .. " with unattested plurals.",
breadcrumb = "with unattested plurals",
parents = {{name = pos, sort = "unattested plurals"}},
}
labels["definite " .. pos] = {
description = "{{{langname}}} " .. pos .. " that are inherently definite and have definite concord.",
breadcrumb = "definite",
parents = {pos .. ", වරනැගීම් වර්ගය අනුව"},
}
end
------- GERMANIC VERB CLASSES -------
-- FIXME: Not clear this belongs among the general categories.
labels["strong verbs"] = {
description = "{{{langname}}} verbs that present different stem vowels in their typically regular conjugated forms.",
breadcrumb = "strong",
parents = {"ක්රියා පද, වරනැගීම් වර්ගය අනුව"},
}
labels["weak verbs"] = {
description = "{{{langname}}} verbs that display dental suffixes in their past tense conjugated forms.",
breadcrumb = "weak",
parents = {"ක්රියා පද, වරනැගීම් වර්ගය අනුව"},
}
labels["preterite-present verbs"] = {
description = "{{{langname}}} verbs that inflect in the present tense like the past tense of strong verbs.",
breadcrumb = "preterite-present",
parents = {"ක්රියා පද, වරනැගීම් වර්ගය අනුව"},
}
labels["class 1 strong verbs"] = {
description = "Verbs where the [[ablaut]] vowel was followed by ''-y-'' in Proto-Indo-European.",
breadcrumb = "class 1",
parents = {{name = "strong verbs", sort = "1"}},
}
labels["class 1 weak verbs"] = {
description = "Weak verbs of the first class.",
breadcrumb = "class 1",
parents = {{name = "weak verbs", sort = "1"}},
}
labels["class 2 strong verbs"] = {
description = "Verbs where the [[ablaut]] vowel was followed by ''-w-'' in Proto-Indo-European.",
breadcrumb = "class 2",
parents = {{name = "strong verbs", sort = "2"}},
}
labels["class 2a strong verbs"] = {
description = "Verbs where the [[ablaut]] vowel was *eu in Proto-Germanic.",
breadcrumb = "class 2a",
parents = {{name = "class 2 strong verbs", sort = "1"}},
}
labels["class 2b strong verbs"] = {
description = "Verbs where the [[ablaut]] vowel was *ū in Proto-Germanic.",
breadcrumb = "class 2b",
parents = {{name = "class 2 strong verbs", sort = "2"}},
}
labels["class 2 weak verbs"] = {
description = "Weak verbs of the second class.",
breadcrumb = "class 2",
parents = {{name = "weak verbs", sort = "2"}},
}
labels["class 3 weak verbs"] = {
description = "Weak verbs of the third class.",
breadcrumb = "class 3",
parents = {{name = "weak verbs", sort = "3"}},
}
labels["class 3 strong verbs"] = {
description = "Verbs where the [[ablaut]] vowel was followed by a [[consonant cluster]] in Proto-Indo-European.",
breadcrumb = "class 3",
parents = {{name = "strong verbs", sort = "3"}},
}
labels["class 3a strong verbs"] = {
description = "Verbs where the [[consonant cluster]] begins with a nasal consonant.",
breadcrumb = "class 3a",
parents = {{name = "class 3 strong verbs", sort = "1"}},
}
labels["class 3b strong verbs"] = {
description = "Verbs where the [[consonant cluster]] begins with a lateral consonant or velar fricative.",
breadcrumb = "class 3b",
parents = {{name = "class 3 strong verbs", sort = "2"}},
}
labels["class 3c strong verbs"] = {
description = "Verbs where the [[consonant cluster]] begins with a rhotic consonant.",
breadcrumb = "class 3c",
parents = {{name = "class 3 strong verbs", sort = "3"}},
}
labels["class 4 strong verbs"] = {
description = "Verbs where the [[ablaut]] vowel was followed by a [[sonorant]] (''m'', ''n'', ''l'', ''r'') but no other consonant in Proto-Indo-European.",
breadcrumb = "class 4",
parents = {{name = "strong verbs", sort = "4"}},
}
labels["class 4 weak verbs"] = {
description = "Weak verbs of the fourth class.",
breadcrumb = "class 4",
parents = {{name = "weak verbs", sort = "4"}},
}
labels["class 5 strong verbs"] = {
description = "Verbs where the [[ablaut]] vowel was followed by a [[consonant]] other than a [[sonorant]] in Proto-Indo-European.",
breadcrumb = "class 5",
parents = {{name = "strong verbs", sort = "5"}},
}
labels["class 6 strong verbs"] = {
description = "The Proto-Indo-European origin of this class is not securely known. It contains verbs with the stem vowel ''-a-'', except those where it is followed by a sonorant and another consonant (this combination was considered a diphthong in PIE and therefore belonged to class 7).",
breadcrumb = "class 6",
parents = {{name = "strong verbs", sort = "6"}},
}
labels["class 7 strong verbs"] = {
description = "Verbs that retained their reduplication in the past tense in Proto-Germanic.",
breadcrumb = "class 7",
parents = {{name = "strong verbs", sort = "7"}},
}
labels["class 7a strong verbs"] = {
description = "Class 7 strong verbs where the root vowel was ''*ai'' in Proto-Germanic, analogous to class 1.",
breadcrumb = "class 7a",
parents = {{name = "class 7 strong verbs", sort = "a"}},
}
labels["class 7b strong verbs"] = {
description = "Class 7 strong verbs where the root vowel was ''*au'' in Proto-Germanic, analogous to class 2.",
breadcrumb = "class 7b",
parents = {{name = "class 7 strong verbs", sort = "b"}},
}
labels["class 7c strong verbs"] = {
description = "Class 7 strong verbs where the root vowel was ''*a'' followed by a [[consonant cluster]] in Proto-Germanic, analogous to class 3.",
breadcrumb = "class 7c",
parents = {{name = "class 7 strong verbs", sort = "c"}},
}
labels["class 7d strong verbs"] = {
description = "Class 7 strong verbs where the root vowel was ''*ē'' in Proto-Germanic.",
breadcrumb = "class 7d",
parents = {{name = "class 7 strong verbs", sort = "d"}},
}
labels["class 7e strong verbs"] = {
description = "Class 7 strong verbs where the root vowel was ''*ō'' in Proto-Germanic.",
breadcrumb = "class 7e",
parents = {{name = "class 7 strong verbs", sort = "e"}},
}
------- TUPIAN LEMMA CLASSES -------
-- FIXME: Present in Old Tupi, Nheengatu, Guaraní and some other Tupian languages; not clear if this belongs among the general categories.
labels["pluriform adjectives"] = {
description = "{{{langname}}} adjectives that have a relational prefix added to their stem.",
breadcrumb = "pluriform",
parents = {"adjectives, වරනැගීම් වර්ගය අනුව"},
}
labels["pluriform nouns"] = {
description = "{{{langname}}} nouns that have a relational prefix added to their stem.",
breadcrumb = "pluriform",
parents = {"නාම පද, වරනැගීම් වර්ගය අනුව"},
}
labels["pluriform postpositions"] = {
description = "{{{langname}}} postpositions that have a relational prefix added to their stem.",
breadcrumb = "pluriform",
parents = {"postpositions by inflection type"},
}
labels["pluriform verbs"] = {
description = "{{{langname}}} verbs that have a relational prefix added to their stem.",
breadcrumb = "pluriform",
parents = {"ක්රියා පද, වරනැගීම් වර්ගය අනුව"},
}
local labels2 = {}
-- Add 'umbrella_parents' key if not already present.
for key, data in pairs(labels) do
labels2[key] = data
if not data.umbrella_parents then
data.umbrella_parents = "Terms by grammatical category subcategories by language"
end
end
-----------------------------------------------------------------------------
-- --
-- RAW CATEGORIES --
-- --
-----------------------------------------------------------------------------
raw_categories["Terms by grammatical category subcategories by language"] = {
description = "Umbrella categories covering topics related to grammatical categories, such as gender, animacy and noun and verb classes.",
additional = "{{{umbrella_meta_msg}}}",
parents = {
"ඡත්ර මෙටා ප්රවර්ග",
{name = "terms by grammatical category", is_label = true, sort = " "},
},
}
raw_categories["Suppletion subcategories by language"] = {
description = "Umbrella categories covering suppletive terms in specific part-of-speech categories.",
additional = "{{{umbrella_meta_msg}}}",
parents = {
"ඡත්ර මෙටා ප්රවර්ග",
"Terms by grammatical category subcategories by language",
},
}
-----------------------------------------------------------------------------
-- --
-- HANDLERS --
-- --
-----------------------------------------------------------------------------
table.insert(handlers, function(data)
local class = data.label:match("^class ([0-9a-z]+) nouns$")
if class then
local classnum, suffix = class:match("^([0-9]+)([a-z]*)$")
return {
description =
"{{{langname}}} nouns that belong to class " .. class .. ".",
breadcrumb = class,
umbrella = false,
parents = {{
name = "nouns by class",
sort = classnum and ("#%02d"):format(classnum) .. suffix or class,
}},
}
end
end)
table.insert(handlers, function(data)
local pos, tone = data.label:match("^(.+) with tone ([^ ]+)$")
if pos then
return {
description = "{{{langname}}} " .. pos .. " with tone " .. tone .. ".",
breadcrumb = tone,
-- FIXME, should there be an umbrella category e.g. 'Adjectives with tone H by language'?
umbrella = false,
parents = {{
name = pos .. " by tone",
sort = "" .. tone:len() .. tone,
}},
}
end
end)
table.insert(handlers, function(data)
local vh, pos = data.label:match("^(.+)-harmonic ([^ ]+)$")
if pos then
return {
description = "{{{langname}}} " .. pos .. " with vowel harmony in " .. vh .. ".",
breadcrumb = vh,
umbrella = false,
parents = {{
name = pos .. " by vowel harmony",
sort = "" .. vh:len() .. vh,
}},
}
end
end)
table.insert(handlers, function(data)
local pos, classifier = data.label:match("^(nouns) classified by (.+)$")
if pos then
local linktext
if data.lang then
-- Chinese classifiers may take the form TRAD/SIMP. This will cause problems if passed directly to [[Module:links]],
-- but the module can accept links of the form TRAD//SIMP and display them correctly.
if data.lang:getCode() == "zh" then
classifier = classifier:gsub("/", "//")
end
linktext = require("Module:links").full_link({ term = classifier, lang = data.lang }, "term")
else
linktext = classifier
end
return {
description = "{{{langname}}} " .. pos .. " using " .. linktext .. " as their classifier.",
breadcrumb = classifier,
umbrella = false,
parents = {{
name = pos .. " by classifier",
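-- data.lang may be nil (see the check above), so fall back to the raw classifier.
sort = data.lang and (data.lang:makeSortKey(classifier)) or classifier,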
}},
}
end
end)
return {LABELS = labels2, RAW_CATEGORIES = raw_categories, HANDLERS = handlers}
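The sort keys built by the noun-class and tone handlers above can be tried out in isolation. The sketch below copies the same expressions into two hypothetical helper functions (class_sort_key and tone_sort_key are not part of the module) to show why class numbers are zero-padded and why the tone key is prefixed with its length.

```lua
-- Same expressions as in the handlers, wrapped in hypothetical helpers.
local function class_sort_key(label)
	local class = label:match("^class ([0-9a-z]+) nouns$")
	if not class then
		return nil
	end
	-- Split e.g. "10a" into the number "10" and the letter suffix "a".
	local classnum, suffix = class:match("^([0-9]+)([a-z]*)$")
	-- Zero-pad the number so that class 2 sorts before class 10.
	return classnum and ("#%02d"):format(classnum) .. suffix or class
end

local function tone_sort_key(tone)
	-- Prefix the length so that shorter tone patterns sort first.
	return "" .. tone:len() .. tone
end

print(class_sort_key("class 1 nouns"))   --> #01
print(class_sort_key("class 10a nouns")) --> #10a
print(class_sort_key("class ka nouns"))  --> ka
print(tone_sort_key("H"))                --> 1H
print(tone_sort_key("LH"))               --> 2LH
```

With plain string comparison, "#02" sorts before "#10", whereas the unpadded "2" would sort after "10". A class name that does not start with digits, such as "ka", is used unchanged as its own sort key.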