Abstract
This paper employs two major natural language processing techniques, topic
modeling and clustering, to find patterns in folktales and reveal cultural
relationships between regions. In particular, we used Latent Dirichlet
Allocation and BERTopic to extract recurring elements, and K-means
clustering to group folktales. We ask what similarities and differences exist
between folktales and what they reveal about culture. Here we show that the
themes common across folktales are family, food, traditional gender roles,
mythological figures, and animals. Folktale topics also differ by
geographical location: folktales from different regions feature different
animals and environments. We were not
surprised to find that religious figures and animals are some of the common
topics in all cultures. However, we were surprised that European and Asian
folktales were often paired together. Our results demonstrate the prevalence of
certain elements in cultures across the world. We anticipate that our work will
be a resource for future research on folktales and an example of using natural
language processing to analyze documents in specific domains. Furthermore,
since we analyzed the documents only by topic, further work could
examine the structure, sentiment, and characters of these
folktales.