Overview

We present DAPPER, a model that learns to embed personas from natural language and alleviates the task- and domain-specific data sparsity issues associated with personas. To this end, we implement a text encoding strategy that combines a pretrained language model with an external memory to produce domain-adapted persona representations. We further evaluate the transferability of these embeddings by simulating low-resource scenarios. Our comparative study demonstrates the advantage of our method over other approaches in learning rich, transferable persona embeddings. Empirical evidence suggests that the learned persona embeddings can be effective in downstream tasks such as hate speech detection.
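
The idea of pairing an encoder output with an external memory can be illustrated with a small sketch. This is not the DAPPER implementation: the function names (`memory_read`, `persona_embedding`), the dot-product attention, the additive combination, and the toy vectors are all assumptions made for illustration. In practice the text vector would come from a pretrained language model, and the memory slots would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(query, memory):
    """Hypothetical memory read: dot-product attention over memory slots.

    query  -- a text vector (list of floats), standing in for encoder output
    memory -- a list of slot vectors with the same dimension as the query
    Returns the attention-weighted sum of slots and the attention weights.
    """
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(query)
    read = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(dim)]
    return read, weights


def persona_embedding(text_vector, memory):
    """Assumed combination step: add the memory read to the text vector."""
    read, _ = memory_read(text_vector, memory)
    return [t + r for t, r in zip(text_vector, read)]


# Toy inputs (assumptions): a 2-d "encoder output" and two memory slots.
text_vector = [1.0, 0.0]
memory = [[1.0, 0.0], [0.0, 1.0]]

read, weights = memory_read(text_vector, memory)
embedding = persona_embedding(text_vector, memory)
```

In this toy setup the first memory slot matches the query more closely, so it receives the larger attention weight and dominates the resulting embedding.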

Datasets

Movies Dialogue Corpus
Personal Essays Dataset
PersonalityCafe Corpus (Expanded Raw dump)

Contact

Have a question or comment? Please contact me at pralav [at] media [dot] mit [dot] edu

DAPPER:
Domain-Adapted Pretraining-based Persona Representation
