\title{Learning meters of Arabic and English poems with Recurrent Neural Networks: a step forward
for language understanding and synthesis}
\author{%
Waleed~A.~Yousef\textsuperscript{a},~\IEEEmembership{Senior Member,~IEEE};~\thanks{Waleed
A. Yousef is an associate professor, \url{[email protected]}}
Omar M. Ibrahime\textsuperscript{a,b};~\thanks{Omar M. Ibrahime, B.Sc., \url{[email protected]}}
Taha M. Madbouly\textsuperscript{a,b};~\thanks{Taha M. Madbouly, B.Sc., \url{[email protected]}} %
Moustafa A. Mahmoud\textsuperscript{a,b};~\thanks{Moustafa A. Mahmoud, B.Sc., Senior Big Data Engineer, \url{[email protected]}}
Ahmed A. Abouelkahire~\thanks{Ahmed A. Abouelkahire, B.Sc., Data Scientist, TeraData, Egypt,
\url{[email protected]}}%
\thanks{\textsuperscript{a}Human Computer Interaction Laboratory (HCILAB:\
\url{www.hciegypt.com}), Egypt.}
\thanks{\textsuperscript{b}These three authors contributed equally
to the manuscript; their names are ordered alphabetically by family name, and each
of them is the second author.}
}
\maketitle
\begin{abstract}
Recognizing a piece of writing as a poem or prose is usually easy for the majority of people; however,
only specialists can determine which meter a poem belongs to. In this paper, we build Recurrent
Neural Network (RNN) models that can classify poems according to their meters from plain text. The
input text is encoded at the character level and directly fed to the models without feature
handcrafting. This is a step forward for machine understanding and synthesis of languages in
general, and of the Arabic language in particular.
Across the 16 poem meters of Arabic and the 4 meters of English, the networks classified poems
with overall accuracies of 96.38\% and 82.31\%, respectively. The poem datasets used to
conduct this research are massive, over 1.5 million verses, and were crawled from different
nontechnical sources, mostly Arabic and English literature sites, in heterogeneous
and unstructured formats. These datasets are now publicly available in a clean, structured, and
documented format for future research.
To the best of the authors' knowledge, this research is the first to address classifying poem
meters with a machine learning approach in general, and with an RNN-based approach without
handcrafted features in particular. In addition, the dataset is the first publicly available
dataset ready for future computational research.
\end{abstract}
\begin{IEEEkeywords}
Poetry, Meters, Al-'arud, Arabic, English, Recurrent Neural Networks, RNN, Deep Learning, Deep Neural
Networks, DNN, Classification, Text Mining.
\end{IEEEkeywords}