Provided by: libencode-arabic-perl_14.2-3_all 

NAME
Encode::Arabic::Buckwalter - Tim Buckwalter's transliteration of Arabic
SYNOPSIS
use Encode::Arabic::Buckwalter; # imports just like 'use Encode' would, plus more
while ($line = <>) { # Tim Buckwalter's mapping into the Arabic script
print encode 'utf8', decode 'buckwalter', $line; # 'Buckwalter' alias 'Tim'
}
# shell filter of data, e.g. in *n*x systems instead of viewing the Arabic script proper
% perl -MEncode::Arabic::Buckwalter -pe '$_ = encode "buckwalter", decode "utf8", $_'
# employing the modes of conversion for filtering and trimming
Encode::Arabic::enmode 'buckwalter', 'nosukuun', '>&< xml';
Encode::Arabic::Buckwalter->demode(undef, undef, 'strip _');
$decode = "Aiqora>o h`*aA {l_n~a_S~a bi___{notibaAhK.";
$encode = encode 'buckwalter', decode 'buckwalter', $decode;
# $encode eq "AiqraO h`*aA Aln~aS~a biAntibaAhK."
DESCRIPTION
Tim Buckwalter's notation is a one-to-one transliteration of the Arabic script for Modern Standard
Arabic, using lower ASCII characters to encode the graphemes of the original script. This system has been
very popular in Natural Language Processing. Its applicability is limited, however, by the numerous
non-alphabetic codes involved.
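To make the last point concrete, the following plain Perl sketch counts how many codes on the ASCII side
of the mapping (copied from the IMPLEMENTATION section, digits left out) are neither letters nor digits.
The variable names are illustrative only and are not part of the module.

```perl
use strict;
use warnings;

# ASCII side of the mapping as listed under IMPLEMENTATION, without the digits 0-9.
# q[] does not interpolate, so "$S" stays literal.
my $codes = q[,;?'|>&<}AbptvjHxd*rzs$SDTZEg_fqklmnhwYyFNKaui~o`{PJRVG];

# Codes that are neither ASCII letters nor digits. Many of them are special
# characters in shells, regular expressions, or markup languages.
my @non_alnum = grep { !/[A-Za-z0-9]/ } split //, $codes;

printf "%d of %d codes are non-alphanumeric: %s\n",
    scalar(@non_alnum), length($codes), join(' ', @non_alnum);
# prints "15 of 55 codes are non-alphanumeric: , ; ? ' | > & < } * $ _ ~ ` {"
```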
IMPLEMENTATION
The module takes care of the Encode::Encoding programming interface, while the effective code is Tim
Buckwalter's "tr"ick:
$encode =~ tr[\x{060C}\x{061B}\x{061F}\x{0621}-\x{063A}\x{0640}-\x{0652} # !! no break in true perl !!
\x{0670}\x{0671}\x{067E}\x{0686}\x{0698}\x{06A4}\x{06AF}\x{0660}-\x{0669}]
[,;?'|>&<}AbptvjHxd*rzs$SDTZEg_fqklmnhwYyFNKaui~o`{PJRVG0-9];
$decode =~ tr[,;?'|>&<}AbptvjHxd*rzs$SDTZEg_fqklmnhwYyFNKaui~o`{PJRVG0-9]
[\x{060C}\x{061B}\x{061F}\x{0621}-\x{063A}\x{0640}-\x{0652} # !! no break in true perl !!
\x{0670}\x{0671}\x{067E}\x{0686}\x{0698}\x{06A4}\x{06AF}\x{0660}-\x{0669}];
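Because "tr" takes whitespace literally and has no extended-syntax modifier, the lists above have to be
written without the line break and the comment in real code. A minimal, self-contained sketch of the same
mapping in plain Perl follows. The two subroutine names are hypothetical helpers and not part of the
module's interface; characters outside the lists pass through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Encode::Arabic::Buckwalter.
# Each tr list is kept on a single line, and tr does not interpolate "$S".

# Arabic code points -> Buckwalter ASCII
sub arabic_to_buckwalter {
    my ($text) = @_;
    $text =~ tr[\x{060C}\x{061B}\x{061F}\x{0621}-\x{063A}\x{0640}-\x{0652}\x{0670}\x{0671}\x{067E}\x{0686}\x{0698}\x{06A4}\x{06AF}\x{0660}-\x{0669}][,;?'|>&<}AbptvjHxd*rzs$SDTZEg_fqklmnhwYyFNKaui~o`{PJRVG0-9];
    return $text;
}

# Buckwalter ASCII -> Arabic code points
sub buckwalter_to_arabic {
    my ($text) = @_;
    $text =~ tr[,;?'|>&<}AbptvjHxd*rzs$SDTZEg_fqklmnhwYyFNKaui~o`{PJRVG0-9][\x{060C}\x{061B}\x{061F}\x{0621}-\x{063A}\x{0640}-\x{0652}\x{0670}\x{0671}\x{067E}\x{0686}\x{0698}\x{06A4}\x{06AF}\x{0660}-\x{0669}];
    return $text;
}

my $arabic = buckwalter_to_arabic('ktAb');

# Show the code points rather than the raw characters, to avoid output-layer issues.
print join(' ', map { sprintf('U+%04X', ord($_)) } split //, $arabic), "\n";
# prints "U+0643 U+062A U+0627 U+0628"

print arabic_to_buckwalter($arabic), "\n";
# prints "ktAb"
```

Note that the decoding direction also rewrites ASCII punctuation and digits (",", ";", "?", "0" to "9") into
their Arabic counterparts, so it should only be applied to text that is known to be in Buckwalter notation.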
EXPORTS & MODES
If the first element in the list to "use" is ":xml", an alternative mapping suited to XML conventions is
introduced. This option exists only to replace the reserved characters ">&<" with "OWI" while keeping the
notation one-to-one. No XML parsing is involved, and any markup would be distorted if it were subjected
to "decode"!
$using_xml = eval q { use Encode::Arabic::Buckwalter ':xml'; decode 'buckwalter', 'OWI' };
$classical = eval q { use Encode::Arabic::Buckwalter; decode 'buckwalter', '>&<' };
# $classical eq $using_xml and $classical eq "\x{0623}\x{0624}\x{0625}"
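Text that is already written in the classical notation can be moved to the XML-friendly variant, and
back, with a three-character "tr". The sketch below uses plain Perl and assumes, as described above, that
the two notations differ only in these three codes; the subroutine names are hypothetical and not
provided by the module. The conversion stays one-to-one because "O", "W" and "I" do not occur in the
classical mapping listed under IMPLEMENTATION.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of Encode::Arabic::Buckwalter.

# Classical Buckwalter codes ">&<" -> XML-friendly codes "OWI"
sub classical_to_xml {
    my ($text) = @_;
    $text =~ tr[>&<][OWI];
    return $text;
}

# XML-friendly codes "OWI" -> classical Buckwalter codes ">&<"
sub xml_to_classical {
    my ($text) = @_;
    $text =~ tr[OWI][>&<];
    return $text;
}

my $classical = 'Aiqora>o';
my $xml       = classical_to_xml($classical);

print "$xml\n";
# prints "AiqoraO"

print xml_to_classical($xml) eq $classical ? "round trip ok\n" : "mismatch\n";
# prints "round trip ok"
```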
The module exports as if "use Encode" also appeared in the package. All other "import" options are simply
delegated to Encode, which performs the imports as usual.
The conversion modes of this module allow one to override the setting of the ":xml" option, in addition
to filtering out diacritical marks and stripping off kashida. The modes and aliases relate like this:
our %Encode::Arabic::Buckwalter::modemap = (
'default' => 0, 'undef' => 0,
'fullvocalize' => 0, 'full' => 0,
'nowasla' => 4,
'vocalize' => 3, 'nosukuun' => 3,
'novocalize' => 2, 'novowels' => 2, 'none' => 2,
'noshadda' => 1, 'noneplus' => 1,
);
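As a reading aid, the table can be inverted to list the aliases that share each numeric level. The
sketch copies the hash into a lexical variable so that it runs without the module installed;
"aliases_by_level" is a hypothetical helper, not something the module exports.

```perl
use strict;
use warnings;

# Local copy of the table shown above, so the sketch needs no installed module.
my %modemap = (
    'default'      => 0, 'undef'    => 0,
    'fullvocalize' => 0, 'full'     => 0,
    'nowasla'      => 4,
    'vocalize'     => 3, 'nosukuun' => 3,
    'novocalize'   => 2, 'novowels' => 2, 'none' => 2,
    'noshadda'     => 1, 'noneplus' => 1,
);

# Hypothetical helper: group mode names by the numeric level they map to.
sub aliases_by_level {
    my (%map) = @_;
    my %by_level;
    push @{ $by_level{ $map{$_} } }, $_ for sort keys %map;
    return %by_level;
}

my %by_level = aliases_by_level(%modemap);

for my $level (sort { $a <=> $b } keys %by_level) {
    printf "%d: %s\n", $level, join(', ', @{ $by_level{$level} });
}
# prints:
# 0: default, full, fullvocalize, undef
# 1: noneplus, noshadda
# 2: none, novocalize, novowels
# 3: nosukuun, vocalize
# 4: nowasla
```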
enmode ($obj, $mode, $xml, $kshd)
demode ($obj, $mode, $xml, $kshd)
These methods can be invoked directly or through the respective functions of Encode::Arabic. The
meaning of the extra parameters $xml and $kshd is best seen from the usage examples in the SYNOPSIS.
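Judging from the SYNOPSIS, the settings used there ('nosukuun' with the XML mapping when encoding,
'strip _' when decoding) drop sukuun and kashida, write ">&<" as "OWI", and turn alif wasla into a plain
alif. The sketch below reproduces that documented result directly on the ASCII string with plain Perl.
It is only an illustration inferred from that single example, not the module's implementation, and
"emulate_synopsis_modes" is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: mimics, on Buckwalter ASCII text, the net effect that the
# SYNOPSIS reports for its particular enmode/demode settings.
sub emulate_synopsis_modes {
    my ($text) = @_;
    $text =~ tr/o_//d;          # drop sukuun "o" and kashida "_"
    $text =~ tr[>&<{][OWIA];    # XML-friendly codes; alif wasla "{" becomes alif "A"
    return $text;
}

my $decode = 'Aiqora>o h`*aA {l_n~a_S~a bi___{notibaAhK.';

print emulate_synopsis_modes($decode), "\n";
# prints "AiqraO h`*aA Aln~aS~a biAntibaAhK."
```

How the other mode levels treat vowels, shadda and wasla is not spelled out here, so the sketch should
not be generalized beyond the one combination shown in the SYNOPSIS.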
SEE ALSO
Encode::Arabic, Encode, Encode::Encoding
Tim Buckwalter's Qamus <http://www.qamus.org/>
Buckwalter Arabic Morphological Analyzer
<http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2002L49>
AUTHOR
Otakar Smrz "<otakar-smrz users.sf.net>", <http://otakar-smrz.users.sf.net/>
COPYRIGHT AND LICENSE
Copyright (C) 2003-2016 Otakar Smrz
This library is free software; you can redistribute it and/or modify it under the same terms as Perl
itself.
perl v5.36.0 2022-10-16 Encode::Arabic::Buckwalter(3pm)