Note on the basic concepts of Wnn
This note introduces the general concepts of Wnn.
It focuses mainly on two products: the free software Wnn4.2 and
the proprietary Wnn6.
What is Wnn?
Wnn is a Kana-Kanji translation system developed by a joint project
of Kyoto University, OMRON Corporation [formerly known as Tateishi
Electronics Co.], and ASTEC Inc.
The first public release of Wnn was made in 1987 for the UNIX operating
system. The name "Wnn" is an acronym for the Japanese
sentence "Watashino Namaeha Nakanodesu" (literally, "my name
is Nakano"). It comes from a goal of the project: to develop a
system powerful enough to translate such a whole
sentence at once. In those days, that goal was technically
challenging.
The source code was written in C and
distributed freely. Consequently, Wnn spread widely
across workstation platforms and became a de facto
standard Kana-Kanji translation system for UNIX operating
systems.
One of its most significant features is that Wnn works in a
client-server manner. The server portion of Wnn, jserver,
is used as a Kana-Kanji translation engine by clients such as
"xwnmo", an input system for the X Window System, or
"Egg" of "Nemacs" or "Mule",
which are discussed below.
Figure: The modules that make up Wnn.
Version 4.1, released in 1991, integrated support for Chinese and some
European languages, making Wnn multilingual.
The latest free software line of Wnn, Wnn4.2,
added multi-phrase translation for Korean
as well as support for X11R6.
Figure: Wnn's history
(excerpted from [3] )
A commercial Wnn product is available from Omron Software Corporation
for both UNIX and Microsoft Windows95 operating systems.
Wnn6, released in 1995, is for some major variants
of UNIX. Wnn95, released in 1996, is for Windows95. Wnn95 is released
as a separate product for each supported language: Japanese, Chinese,
and Korean.
The client-server structure of Wnn
The Wnn system consists of a server part, jserver, and client
parts such as uum.
The server and clients communicate with each other
over TCP/IP, so they can be hosted on different machines
as long as they can reach each other over IP.
Figure: The client-server structure of Wnn
(excerpted from [3] )
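The split between server and clients can be illustrated with a minimal
sketch. It does not use jserver's actual wire protocol. It assumes a
made-up protocol in which the client sends one UTF-8 line of Kana and
the server answers with one line of converted text. The names
TOY_TABLE, ToyConversionHandler, start_toy_server, and
convert_via_server are hypothetical and exist only for this
illustration.

```python
import socket
import socketserver
import threading

# Hypothetical one-entry lookup table standing in for the server's dictionaries.
TOY_TABLE = {"なかの": "中野"}


class ToyConversionHandler(socketserver.StreamRequestHandler):
    """Reads one UTF-8 line of Kana and replies with one line of converted text."""

    def handle(self):
        kana = self.rfile.readline().decode("utf-8").strip()
        reply = TOY_TABLE.get(kana, kana)  # unknown input is returned unchanged
        self.wfile.write((reply + "\n").encode("utf-8"))


def start_toy_server(host="127.0.0.1", port=0):
    """Start the toy server in a background thread (port 0: the OS picks a free port)."""
    server = socketserver.TCPServer((host, port), ToyConversionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def convert_via_server(address, kana):
    """Client side: connect, send the Kana line, and read back the converted line."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall((kana + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


if __name__ == "__main__":
    server = start_toy_server()
    try:
        print(convert_via_server(server.server_address, "なかの"))
    finally:
        server.shutdown()
        server.server_close()
```

Because the client only needs an address to connect to, the same client
code would work whether the server runs on the same machine or on
another host reachable over IP.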
The translation process works as follows.
Assume that a user gives the string
"わたしのなまえはなかのです", the pre-translation
sentence (a sentence in Kana characters),
to a Wnn client such as uum or egg.
The client passes the string to its server.
On receiving the request,
the server searches for an appropriate
translation by consulting its dictionaries
together with other information such as the
"frequency database".
It then returns the result, say
"私の名前は中野です", the translated sentence (a sentence
consisting of Kana and Kanji characters),
to the client.
[Because of its subject matter, the paragraph
above contains Japanese characters.]
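The server-side lookup described above can be sketched in a few lines.
This is not Wnn's actual algorithm. It is a toy, assuming a tiny
hand-made dictionary and frequency table (DICTIONARY and FREQUENCY,
both hypothetical), that splits the Kana string into the fewest
dictionary entries and picks the most frequently chosen written form
for each one.

```python
# Hypothetical toy dictionary: Kana reading -> candidate written forms.
DICTIONARY = {
    "わたし": ["私", "渡し"],
    "の": ["の"],
    "なまえ": ["名前"],
    "は": ["は", "葉"],
    "なか": ["中", "仲"],
    "なかの": ["中野", "中の"],
    "です": ["です"],
}

# Hypothetical frequency database: how often each (reading, form) pair was chosen.
FREQUENCY = {
    ("わたし", "私"): 50, ("わたし", "渡し"): 2,
    ("は", "は"): 90, ("は", "葉"): 5,
    ("なか", "中"): 20, ("なか", "仲"): 3,
    ("なかの", "中野"): 8, ("なかの", "中の"): 1,
}


def best_form(reading):
    """Pick the most frequently chosen written form for one reading."""
    return max(DICTIONARY[reading], key=lambda form: FREQUENCY.get((reading, form), 0))


def convert(kana):
    """Split kana into the fewest dictionary entries, then convert each one."""
    # best[i] holds the shortest list of readings that covers kana[:i], or None.
    best = [None] * (len(kana) + 1)
    best[0] = []
    for end in range(1, len(kana) + 1):
        for start in range(end):
            reading = kana[start:end]
            if best[start] is not None and reading in DICTIONARY:
                candidate = best[start] + [reading]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    if best[-1] is None:
        return kana  # no full segmentation found; leave the input unconverted
    return "".join(best_form(reading) for reading in best[-1])


if __name__ == "__main__":
    print(convert("わたしのなまえはなかのです"))
```

Preferring fewer, longer segments is what makes the toy choose
"なかの" as one unit rather than "なか" followed by "の". A real engine
weighs far more information than this.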
Clients of Wnn
The following is a list of known Wnn clients.
- uum
- The program uum is the standard client of the Wnn system.
Note that "uum" is "wnn" rotated by 180 degrees.
- xwnmo
- Xwnmo is another Wnn client, which works on the X Window System.
- egg
- Egg is a part of Nemacs (Nihongo Emacs) and of Mule (Multilingual
Emacs).
Egg provides an Emacs Lisp API for Kana-Kanji translation to these
localized or internationalized Emacses.
Wnn's 6 characteristics
Wnn has kept evolving and gaining features. For
instance, translation servers for Chinese, Korean, and Taiwanese, as
well as their dictionaries, have been integrated. Wnn has also been
adopted as a standard multilingual input system for the X Window System.
Nevertheless, 6 characteristics have been preserved
since the first version of Wnn in 1987.
- (1) Ability for multi-phrase translation
- In Wnn's early development stage in 1985, most
other Kana-Kanji translation systems could provide only
single-phrase translation. A single remark by one of the
project's developers, Mr. Shuji Nakano (*), set the goal of the
project:
"What we shall develop is a system that can translate even
multiple phrases such as 'Watashino Namaeha Nakanodesu'
('my name is Nakano') at once!"
(*) Mr. Shuji Nakano, Omron Co.
- (2) Client-Server scheme
- By centralizing the translation server in a local area network
(LAN) environment, the set of dictionaries and data that the
server consults can also be unified. Consequently, users
can access the same environment even when they are logged on to
another machine.
Moreover, because only one running server is needed in the LAN,
resources such as memory are conserved.
- (3) Enabling migration to other systems, and sharing of data among
users
- The dictionaries and their formats have been kept publicly
available.
- (4) Availability on many platforms
- Wnn has been portable because it is written in C.
- (5) Letting users develop their own applications
- The translation routines are provided as libraries, and their C
API has been kept open. In fact, the entire source has been
freely available. As a result, ETL (Electrotechnical Laboratory)
was able to develop Egg independently of the Wnn consortium.
- (6) Letting as many users as possible use it
- The source code has been distributed free of charge.
Wnn6, on the other hand, is proprietary software that
enhances characteristics 1 to 4 above and abandons 5 and 6.
Wnn6 is developed at the Information Technology Research Center
of OMRON Co. and sold by OMRON Software Corporation.
Functions enhanced in Wnn6
Below we look at the features enhanced in Wnn6, compared against
the list of Wnn's characteristics above.
- (1) Ability for multi-phrase translation
- --> Efficiency has been increased by FI technology.
-
"FI" stands for Flexible Intelligence. Its essential mechanisms
are a) the FI translation mechanism, which takes into account the
connection between a phrase carrying case information and the
predicate phrase, and b) the FI learning mechanism, which learns
the relations between phrases.
To make FI practical, the following databases are used:
the FI system dictionary, which stores 2.7 million
translation patterns, and FI user dictionaries, one
provided for each user. Besides the FI
dictionaries, Wnn6 also refers to a system dictionary with a
vocabulary of 200 thousand words (about 6 times the
35 thousand words of Wnn4's pubdic).
- (2) Client-Server scheme
- --> Taking advantage of the Client-Server scheme
- Nowadays client-server systems are everywhere,
so the question is how much advantage a system can
take of the scheme.
Wnn6 has the following functions, which are
practical because it is a client-server system.
- Offline learning
- To improve translation efficiency and
conserve resources, Wnn6 can rearrange the
translation frequency database with the command
wnnoffline.
- Administrative tools
- With wnnaccess, system administrators can
control users' access to the server by setting permissions
on a per-host and per-user basis.
- Automatic parameter tuning
- 7 of the 17 parameters that are critical for
translation can be optimized automatically.
- More than one jserver can run on a host
- By assigning different port numbers, more than one jserver
can run on a machine.
- (3) Enabling migration to other systems, and sharing of data
among users
--> Enabling migration from other input systems.
- Wnn6 includes a converter that translates
user dictionaries from other Japanese input systems, such as
ATOK7, ATOK8, VJE-Delta, and EGBRIDGE, into a form Wnn6 can
use. Of course, a Wnn4 dictionary can also be migrated to Wnn6.
- (4) Availability on many platforms
--> Windows is supported in addition to Unix workstations.
- Wnn95 for Windows95 has been released. There is also Wnn6 for
Linux/FreeBSD.
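The per-host and per-user access control mentioned under the
administrative tools can be pictured with a small sketch. It does not
reproduce the configuration format or behaviour of wnnaccess, which
this note does not describe. The PERMISSIONS table, the "*" wildcard,
the host and user names, and the function may_connect are all
hypothetical.

```python
# Hypothetical permission table; this is NOT the format wnnaccess uses.
# A user entry of "*" grants access to every user on that host.
PERMISSIONS = {
    "ws01": {"*"},
    "ws02": {"yoshida", "nakano"},
}


def may_connect(host, user):
    """Return True if `user` on `host` is allowed to use the server."""
    allowed_users = PERMISSIONS.get(host, set())
    return "*" in allowed_users or user in allowed_users


if __name__ == "__main__":
    for host, user in [("ws01", "guest"), ("ws02", "guest"), ("ws03", "yoshida")]:
        print(host, user, may_connect(host, user))
```

Keeping such a table on the server side is one way a centralized
translation server could decide who may connect. Hosts not listed in
the table are refused in this sketch.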
References
- [1] "UNIX no NIHONGO SHORI GA WAKARUHON SAISHIN Wnn KATSUYOU GAIDO"
(in Japanese)
- Yoshida, Tomoko et al. Nikkan Kougyou Shinbunsha, 1993.
- [2] "KANA KANJI HENKAN SYSTEM" in "UNIX USER" 1995.7 (in Japanese)
- Yoshida, Tomoko. SOFT BANK, 1995
- [3]
"Maruchi ringal kankyou no kouchiku" (CREATING A
MULTILINGUAL ENVIRONMENT: Multilingualization Using
X Windows, Wnn, Mule, and WWW Browsers) (in Japanese)
- Nishikimi, Mikiko et al. Prentice Hall, 1996
Glossary
- uum
-
uum is a frontend processor of Wnn, which
is included as a part of the Wnn system distribution.
uum works on a [either localized or internationalized] character
terminal.
While uum is running, the bottom line of the terminal is reserved
for displaying the user's input and translation results.
The Japanese runtime module is also named "uum."
The uum programs for Simplified Chinese, Traditional Chinese, and
Korean are named "cuum," "tuum," and "kuum" respectively.
Though the runtime modules of these programs are distinct from each
other, their source code is unified;
compilation options select which runtime module is built.
- xwnmo
- Xwnmo is also a frontend processor of Wnn, and is likewise included
in the Wnn distribution. Xwnmo works on the X Window System
as a client in the X sense.
As of Wnn version 4.1, in which Chinese began to be
supported, xwnmo also gained multilingual functionality.
The xwnmo included in Wnn4.2 supports Chinese, Korean, and some
European languages. Unlike uum, a single runtime module of xwnmo
can connect to different translation servers for
different languages.
- Nemacs
- Nihongo Emacs
-
Nemacs is a localized editor based on GNU Emacs.
Nemacs can handle English and Japanese.
Nemacs was developed at the Electrotechnical Laboratory (ETL) of
the Agency of Industrial Science and Technology (AIST) at
the Ministry of International Trade and Industry (MITI).
As of Nemacs version 2.1, released in June 1988, Egg, a system
that communicates directly with jserver, is
integrated to ease Japanese text input.
The Nemacs development project ended with the release of
version 3.3.2 (codename Fujimusume) in June 1990.
The project then turned to developing the multilingual Mule.
- Mule
- MULtilingual Enhancement to GNU Emacs
- Mule is an internationalized editor based on GNU Emacs, which
was also developed at ETL.
Mule can handle many character sets, mainly but not only those
defined in ISO 2022.
In fact, users can define additional character sets for Mule to
handle.
Development of Mule started in late 1991. In August
1993, version 1.0 (codename Kiritsubo) was
publicly released.
The latest version, 2.3 (codename Suetsumuhana),
released on September 24th 1995, is based on GNU Emacs
19.28.
The work to merge Mule into GNU Emacs
has been completed, and the result is available as emacs20. Its
latest version is 20.2.
[
Fujimusume, Kiritsubo, and Suetsumuhana
are titles of chapters of Genji-Monogatari
(The Tale of Genji) by Murasaki Shikibu.
Genji-Monogatari is known as one of the oldest works of
literature written by a woman.
All releases of Nemacs and Mule are named after these
chapter titles.
]
- Egg
-
Egg (TAMAGO)
is an input method for Mule and Nemacs.
By communicating with a translation server on the
network,
Egg provides input translation for these editors.
The Egg system that is part of Mule2.3 is called the Egg
TAKANA version.
With Mule2.3, Egg serves as an input system for Japanese,
Korean, and Simplified Chinese.
The word Egg is the literal translation of its
Japanese name TAMAGO, which is an acronym for the
Japanese sentence "TAkusan MAtasete GOmennasai" (literally,
"Sorry to have kept you waiting so long"). The word "TAKANA" comes
from the sentence "TAmagoyo KAshikoku NAre" ("TAMAGO, be
smarter").
This page has been created by
Tomoko Yoshida.
Translation by
M. Meiarashi.
Last modified: Wed Aug 25 18:00:41 JST 2004