6 CONCLUSION
We examined the English translation of South Korean legislation
and its official simplification, and produced a parallel corpus
by aligning the two sources. We then explored the corpus to
investigate how the normal legislation differs from the simple
one. We found that simple legislation generally uses fewer and
shorter sentences. It also favours complete sentences and uses
the passive voice and modal verbs less often. Common readability
measures produced insufficient results and were deemed unusable
for legal texts. State-of-the-art text simplification models
were able to quantitatively reduce the complexity of the normal
legal text. However, they had problems retaining all information
when applied to the normal legal text in our parallel corpus.
Making the models aware of the domain of the words they
paraphrase would improve the results.
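The surface differences summarised above (sentence count, sentence length, modal verbs) can be illustrated with a minimal sketch. It is not the analysis pipeline used in this work: the function name `surface_stats`, the modal-verb list, the naive regex sentence splitter, and the example sentence pair are all assumptions made for illustration.

```python
import re

# Assumed modal-verb list for illustration; not taken from the paper.
MODALS = {"shall", "may", "must", "should", "can", "could", "would", "might", "will"}


def surface_stats(text: str) -> dict:
    """Return sentence count, mean sentence length in words, and modal-verb count."""
    # Naive split after ., ! or ? followed by whitespace. Real legal text
    # (abbreviations, enumerations) would need a proper sentence tokenizer.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "sentences": len(sentences),
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "modal_verbs": sum(w in MODALS for w in words),
    }


# Hypothetical sentence pair, not drawn from the parallel corpus.
normal = "The applicant shall submit the form, which must be signed by the guardian."
simple = "You submit the form. Your guardian signs it."

print(surface_stats(normal))
print(surface_stats(simple))
```

On this toy pair the simple version has no modal verbs and a shorter mean sentence length. It has more sentences only because one long sentence was split in two, so sentence counts are meaningful only when compared over whole documents.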
ACKNOWLEDGEMENTS
This paper is based on the master's thesis “Automatic
English Text Simplification for Statutes” by Akshaya
Muralidharan.¹¹
¹¹ https://wwwmatthes.in.tum.de/pages/1pwcti6a1ymz0/Master-Thesis-Akshaya-Muralidharan
ICAART 2022 - 14th International Conference on Agents and Artificial Intelligence