ALICE – Applying BERT to Italian Emails


Pasquale Restaino and Liliana Saracino, Sogei S.p.A., Italy


ALICE is an artificial intelligence solution that automatically classifies email documents according to their information content, analyzing them in near real time. The emails are written in Italian. There are currently 591 classification classes, and this number is expected to grow as the service is used. This paper explores the implementation of the BERT model in the ALICE email classification system for multiclass classification in Italian. The main objective of ALICE is to automate classification processes that are currently performed manually by dedicated operators. To improve the performance of the BERT model and to allow further classification classes to be added, a transfer learning process has been envisaged. The ALICE service, which uses the BERT model trained on the provided dataset, currently achieves 76% accuracy.


BERT model, multiclass text classification, Italian emails
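
The abstract mentions a transfer learning process that allows further classes to be added to the 591 existing ones. One common way to do this, shown here only as a minimal sketch and not as the authors' method, is to extend the final linear classification layer with new output rows while keeping the rows already learned. The helper names (make_head, extend_head, logits), the toy hidden size of 8, and the 9 added classes are all assumptions made for illustration. The sketch uses plain Python lists in place of a real BERT encoder and its classifier.

```python
import random

NUM_CLASSES = 591  # class count stated in the abstract
HIDDEN_SIZE = 8    # toy value (assumption); a real encoder output is much wider


def make_head(num_classes, hidden_size, rng, scale=0.02):
    """Create a toy linear head: one weight row and one bias per class."""
    weights = [[rng.gauss(0.0, scale) for _ in range(hidden_size)]
               for _ in range(num_classes)]
    biases = [0.0] * num_classes
    return weights, biases


def extend_head(weights, biases, num_new, rng, scale=0.02):
    """Return a copy of the head with num_new extra, freshly initialized rows.

    Existing rows are copied unchanged, so scores for the old classes
    are preserved before any further fine-tuning takes place.
    """
    hidden_size = len(weights[0])
    new_weights = [row[:] for row in weights]
    new_biases = biases[:]
    for _ in range(num_new):
        new_weights.append([rng.gauss(0.0, scale) for _ in range(hidden_size)])
        new_biases.append(0.0)
    return new_weights, new_biases


def logits(weights, biases, features):
    """Compute one score per class for a single feature vector."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


rng = random.Random(0)
old_w, old_b = make_head(NUM_CLASSES, HIDDEN_SIZE, rng)
new_w, new_b = extend_head(old_w, old_b, num_new=9, rng=rng)  # 9 is hypothetical

# Stand-in for the encoder's representation of one email (assumption).
features = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]
old_scores = logits(old_w, old_b, features)
new_scores = logits(new_w, new_b, features)

print(len(old_scores), len(new_scores))
```

In this sketch the scores for the original 591 classes are identical before and after the extension, so only the new rows start from scratch. A real system would still need to fine-tune on examples of the new classes.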