A Complex Network Model of Words to Describe the Dynamics of Text Construction

Abstract Book of the XXIII IUPAP International Conference on Statistical Physics, Genova, Italy, (9-13 July 2007)

Abstract

This work explores an interdisciplinary dialogue between Physics and Psychoanalysis on the dynamics of text construction. Human language must be considered a complex object of knowledge, and Physics offers concepts and instruments that allow the language apparatus to be modeled as a complex network. The complexity of language is evidenced by an intricate system of elements (words) that interact in small groups (sentences) and build up a larger, self-organized organism (text), thus producing emergent order (sense). Each sentence is a conceptual unit, and new sentences are connected to earlier ones by means of shared words, forming a network. In analyzing unconscious phenomena, Freud describes a “reticular fabric”, a network with vertices, edges and interstices, and emphasizes quantitative differences between the tracks along which neuronal information passes, generating preferential paths, where “difference of essence is substituted by one of destination and place”. Freud’s hypothesis is that, in speech and writing, the process of choosing words is unconscious and determined by the ease of connection between representans (words corresponding to objects, not as meaning, but as marks). Using network theory, we analyzed different samples of written texts in search of emergent properties. Before the networks were built, the texts were preprocessed to eliminate grammatical words and to reduce the remaining words to canonical form. The statistical analyses used Degree Distribution, Diameter, Frequency of Pairs, Critical Centrality and Betweenness as measures for identifying the words of greatest importance to the network, as well as the ratio of new words. The dynamics of written text construction was analyzed by adding new sentences and words and measuring these parameters at each stage. All texts presented a redundancy pattern responsible for the topology of the network, but the expanded texts continued to introduce new concepts, suggesting similar behavior among them, including oral discourses. The exception was Joyce’s Ulysses, in which the growth of new words as a function of new sentences showed an extremely high exponent. In line with Freud’s hypothesis, the results indicate that the network topology is determined by the frequency of word repetition and not by the structure of sentences or the use of grammatical words.
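The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the words of each sentence are linked pairwise, that a small hypothetical stop-word list stands in for the removal of grammatical words, and that reduction to canonical form is skipped. The names `STOP_WORDS`, `content_words`, `grow_network` and `degree_distribution` are invented for this illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical stop-word list standing in for the "grammatical words"
# removed during preprocessing (assumption, not from the abstract).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "on"}


def content_words(sentence):
    """Lowercase, strip punctuation and drop stop words (no lemmatization)."""
    tokens = (t.strip(".,;:!?\"'()").lower() for t in sentence.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def grow_network(sentences):
    """Add sentences one at a time.

    Returns the adjacency sets of the word network and the vocabulary
    size after each sentence, so the rate of new words can be tracked.
    """
    adjacency = defaultdict(set)
    vocabulary_sizes = []
    for sentence in sentences:
        words = set(content_words(sentence))
        for w in words:
            adjacency[w]  # register the word even if it has no neighbour
        # Assumption: every pair of words in a sentence shares an edge.
        for u, v in combinations(sorted(words), 2):
            adjacency[u].add(v)
            adjacency[v].add(u)
        vocabulary_sizes.append(len(adjacency))
    return adjacency, vocabulary_sizes


def degree_distribution(adjacency):
    """Map each degree to the number of words having that degree."""
    counts = defaultdict(int)
    for neighbours in adjacency.values():
        counts[len(neighbours)] += 1
    return dict(counts)


if __name__ == "__main__":
    text = ["The dog chased the cat.", "The cat saw a bird.", "A bird sang."]
    network, sizes = grow_network(text)
    print(sizes)
    print(degree_distribution(network))
```

In this toy text the word shared by the first two sentences ends up with the highest degree, which is the kind of repetition-driven hub the abstract attributes the topology to. Measures such as diameter or betweenness would need a shortest-path routine on top of the same adjacency structure.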
