Verb Concept Selection using Co-occurrence Information of Verb Concepts: A Mechanism in an Integrated Narrative Generation System

An integrated narrative generation system (INGS), which we are developing, generates an event concept by referring to noun and verb concepts in a conceptual dictionary. Previously, however, the noun and verb concepts were selected at random. For noun concept selection, we have used frequency information and co-occurrence information. In this paper, we implement a similar method for verb concept selection. The verb concept selection method based on frequency information presupposes that the appearance frequency of every concept is available. In the extreme case where all candidates for a certain verb concept have frequency 0, frequency-based selection does not function and becomes equivalent to the conventional random selection. Therefore, we propose a method that uses the co-occurrence information between concepts. We present the results of calculating the frequency information of verb concepts from the co-occurrence information when the frequency information is 0, in particular in the calculation method of the frequency


Introduction
The authors have developed a narrative generation system known as the integrated narrative generation system (INGS) (1)(2)(3). Fig. 1 shows an overview of the INGS.
The INGS generates narrative events and sentences using two macro-level parts: a generation mechanism and a knowledge mechanism. The former is divided into three main elements: the story generation mechanism, the discourse mechanism, and the surface generation mechanism. The main elements of the latter are the conceptual dictionaries (chiefly for noun concepts and verb concepts), the language notation (letter notation) dictionary, and the state-event transformation knowledge base. This paper mainly concerns the story generation mechanism and the verb conceptual dictionary. The most important element of a story structure generated by the story generation mechanism is an event, which consists of a verb concept and the corresponding noun concepts. Previously, however, the noun and verb concepts were selected at random. For noun concept selection, we have used frequency information and co-occurrence information. In this paper, we implement a similar method for verb concept selection. The verb concept selection method is based on frequency information.
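Frequency-based selection can be illustrated with a minimal Python sketch. It assumes that selection is weighted random sampling over the candidate verb concepts; the paper does not specify the exact sampling rule, and the function name and the toy frequencies are hypothetical. The sketch shows why the all-zero case matters: when every candidate has frequency 0, the weights carry no information and the choice degenerates to uniform random selection.

```python
import random

def select_verb_concept(candidates, freq, rng=random):
    """Pick one candidate, weighted by frequency (assumed sampling rule).

    candidates: list of verb concept labels
    freq: dict mapping label -> appearance frequency (missing means 0)
    """
    weights = [freq.get(c, 0) for c in candidates]
    if sum(weights) == 0:
        # All candidates have frequency 0: same as conventional random selection.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical toy data, not taken from the paper.
rng = random.Random(0)
picked = select_verb_concept(["eat 1", "devour"], {"eat 1": 10, "devour": 0}, rng)
fallback = select_verb_concept(["eat 1", "devour"], {}, rng)
```

A candidate with weight 0 is never drawn while any other candidate has a positive weight, which is what makes a frequency estimate for zero-frequency concepts useful.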

The Architecture of an Integrated Narrative Generation System
The INGS integrates the various narrative generation mechanisms that we have been proposing and developing (1)(2)(3). An overview of the whole system is shown in Fig. 1. The system consists of a concept generation mechanism, which generates a story and edits the structure of its narration (discourse), and a surface representation mechanism, which represents the narrative through text, sound, and pictures. The INGS requires conceptual dictionary knowledge (noun concepts, verb concepts, etc.) to construct an event concept, which refers to an occurrence in the narrative, as well as story content knowledge (causal relationships, scripts, etc.) to constitute the structure of the narrative. The conceptual dictionary (4), the language notation dictionary (5), the story content knowledge base (6) (which stores knowledge of relationships between events, such as causal relationships and scripts), and other resources supply the information required by each generation mechanism. Moreover, each event in the conceptual structure is connected to the states before and after it by means of the state-event transformation knowledge base (3).

Conceptual Dictionaries
An event is generated by instantiating conceptual materials stored in the conceptual dictionaries for noun and verb concepts. Each dictionary has a hierarchical structure, from higher to lower concepts. The noun dictionary currently contains 115,765 terminal concepts and 5,809 intermediate concepts. Each intermediate concept has (1) a list of its hyponym concepts, (2) its depth in the hierarchy, (3) the serial number of its super-ordinate concept, and (4) the range of serial numbers of its hyponym concepts. Fig. 2 shows the description of a noun concept.
The verb concept hierarchy, on the other hand, has 11,951 terminal concepts and 36 intermediate categories. Each terminal verb concept describes the following three elements: (1) a "sentence-pattern" for a sentence that includes the verb, (2) one or more "case-frame(s)" that show the types of noun cases required, and (3) one or more "constraint(s)" that define the range in the noun conceptual dictionary from which each noun concept in the above case-frame(s) must be taken. For example, Fig. 3 shows the description of the verb concept "eat". When the INGS instantiates a case-frame, it selects concepts from the noun conceptual dictionary. The objective of this paper is to revise the selection method.
A story generation process expands or transforms a story structure. Specifically, under parameters given in the first step ("macro-structure", "length", etc.), a story technique uses a variety of story content knowledge to generate a new event, or a sub-structure containing one or more new events, and integrates it into the original structure through various relations. New states are also generated in accordance with each new event. When an event is generated, the story content knowledge provides its basic form based on the case-frame description shown in Fig. 3. Each constraint indicates the range in the noun conceptual dictionary from which a noun concept is chosen. Each noun concept in an event is then transformed into an instance using attribute information if necessary. Finally, each generated event is transformed into a natural language sentence using a specific word representation. Fig. 4 shows the process of generating an event.
Fig. 2. The description of a noun concept
Fig. 3. The description of the verb concept
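The instantiation of a case-frame can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `VerbConcept`, the dictionary `NOUN_HIERARCHY`, the case names, and the toy concepts are all hypothetical and do not reproduce the actual dictionary format shown in Fig. 2 and Fig. 3. Noun concepts are chosen uniformly at random here, which corresponds to the conventional random selection.

```python
import random
from dataclasses import dataclass

@dataclass
class VerbConcept:
    """Hypothetical, simplified stand-in for a terminal verb concept."""
    name: str
    sentence_pattern: str   # e.g. "N1 eats N2"
    case_frame: dict        # case name -> constraint (list of noun categories)

# Hypothetical toy noun hierarchy: intermediate concept -> terminal concepts.
NOUN_HIERARCHY = {
    "human": ["boy", "teacher"],
    "food": ["apple", "bread"],
}

def generate_event(verb, rng):
    """Fill each case of the case-frame with a noun concept allowed by its constraint."""
    event = {"verb": verb.name}
    for case, categories in verb.case_frame.items():
        # The constraint limits the candidates to a range of the noun dictionary.
        candidates = [n for c in categories for n in NOUN_HIERARCHY[c]]
        event[case] = rng.choice(candidates)
    return event

eat = VerbConcept("eat", "N1 eats N2", {"agent": ["human"], "object": ["food"]})
event = generate_event(eat, random.Random(0))
```

Replacing `rng.choice` with a frequency-weighted choice is the point at which frequency and co-occurrence information would enter the process.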

Acquiring the Frequency Information and the Co-occurrence Information
We analyzed the word frequency in 4,980 texts (mainly novels) in Aozora Bunko, dating from 1872 through 1963, in order to use it for verb concept selection in story generation. An outline of the process is shown in Fig. 5. In the first step of the analysis, KH Coder (7) is used to calculate the word frequency. In addition, the mechanism links the acquired frequency information to the co-occurrence information.

Acquiring the Frequency Information
Each terminal concept at the bottom of the hierarchy corresponds to a vocabulary item in the language notation dictionary. We attach the result of the frequency investigation of the verb vocabulary in the target texts to the terminal concepts of the verb conceptual dictionary of the INGS.
Under these conditions, the mechanism first analyzes the word appearance frequency in the target texts by morphological analysis. Next, it relates the verbs and their frequencies to the terminal verb concepts in the verb conceptual dictionary by simple matching. Although each verb concept description is in theory only a label whose linguistic notation is supplied by the language notation dictionary, in practice each description uses an ordinary notation, which makes simple matching possible.
This study investigates the appearance frequency of the verb vocabulary in the texts. Fig. 5 shows (1) the process of acquiring the frequency information and (2) the result for the terminal concepts of the verb conceptual dictionary. Verb concepts are stored hierarchically in the verb conceptual dictionary of the INGS.
However, in the verb conceptual dictionary, verb concepts that share the same notation but differ in meaning are distinguished by a number, for example "eat 1 (食べる 1)" and "eat 2 (食べる 2)".
Therefore, when acquiring the frequency information, we first merge such duplicated concepts. For example, we merge "eat 1 (食べる 1)" and "eat 2 (食べる 2)" into "eat (食べる)". Finally, we restore the number that was removed from each verb concept. The frequency information and co-occurrence information obtained before the numbers are restored are applied unchanged to each numbered verb concept. For example, if the frequency information of "eat" is 10, then "eat 1" and "eat 2" each receive a frequency of 10.
As a result, we acquired word frequencies for 4,885 verbs in the target texts and linked them to 4,886 terminal verb concepts.
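The merge-and-restore step can be sketched in Python as follows. This is a minimal illustration that assumes a numbered concept label is its notation followed by a trailing sense number; the function names are hypothetical, and the word frequencies would in practice come from the KH Coder analysis.

```python
import re

def base_notation(concept_label):
    """Strip a trailing sense number: 'eat 1' -> 'eat', '食べる 2' -> '食べる'."""
    return re.sub(r"\s*\d+$", "", concept_label)

def assign_frequencies(concept_labels, word_freq):
    """Give every numbered concept the frequency of its merged notation.

    Concepts whose notation does not appear in the texts get frequency 0.
    """
    return {label: word_freq.get(base_notation(label), 0) for label in concept_labels}

# Toy data following the example in the text: "eat" has frequency 10.
freqs = assign_frequencies(["eat 1", "eat 2", "run"], {"eat": 10})
# freqs == {"eat 1": 10, "eat 2": 10, "run": 0}
```

Because several numbered concepts can share one notation, the number of linked concepts can exceed the number of distinct verbs found in the texts.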

Acquiring the Co-occurrence Information between Verb Concepts
The method of acquiring the co-occurrence information is approximately the same as that for the frequency information. The differences are that the information obtained in process (1) of Fig. 5 is co-occurrence information, and that the information related to a verb concept in (2) is a verb concept and

Fig. 6 shows the formula for calculating the co-occurrence information. Here, "Sa" is the total number of sentences in the sample text that contain the search target "A"; "Sb" is the total number of sentences in the sample text that contain the search target "B"; "Sa ∩ Sb" is the total number of sentences in the sample text that contain both "A" and "B"; and "Sall" is the total number of sentences in the sample text.
The number of verb concepts for which co-occurrence information could be obtained is the same as the number of verb concepts for which frequency information was obtained.
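The four sentence counts defined above can be computed as in the following Python sketch. The formula of Fig. 6 is not reproduced in this text, so the Jaccard coefficient below is only an assumed stand-in used to show how such counts combine into a co-occurrence strength; it should not be read as the paper's formula. Each sentence is represented as a set of the verbs it contains, and all names and data are hypothetical.

```python
def cooccurrence_counts(sentences, a, b):
    """Return (Sa, Sb, Sa_and_Sb, Sall) for verbs a and b.

    sentences: list of sets, each holding the verbs found in one sentence.
    """
    sa = sum(1 for s in sentences if a in s)
    sb = sum(1 for s in sentences if b in s)
    sab = sum(1 for s in sentences if a in s and b in s)  # both A and B
    return sa, sb, sab, len(sentences)

def jaccard(sa, sb, sab):
    """Assumed example measure: |Sa ∩ Sb| / |Sa ∪ Sb| (not necessarily Fig. 6)."""
    union = sa + sb - sab
    return sab / union if union else 0.0

# Toy sample text of four sentences.
sentences = [{"eat", "drink"}, {"eat"}, {"drink", "sleep"}, {"sleep"}]
sa, sb, sab, sall = cooccurrence_counts(sentences, "eat", "drink")
strength = jaccard(sa, sb, sab)
```

In this toy sample "eat" and "drink" each appear in two sentences and share one, so the assumed measure gives 1/3.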

Calculating the Frequency Information based on the Co-occurrence Information
In this section, we calculate an estimated value from the co-occurrence information between verb concepts, using the values calculated in section 4. When the frequency information of a verb concept is 0, we use the value calculated in section 3.

A Method of Calculating the Frequency Information
For a verb concept whose appearance frequency is 0, we use its co-occurrence information under the following assumption: the appearance frequency of a concept "B" that collocates with verb concept "A" is proportional to the appearance frequency of "A". On this basis, we propose a method that assigns frequency information to a verb concept of frequency 0 using the frequency information of its collocating verb concepts.
The procedure is as follows: (1) Using KH Coder, the system extracts the verb concepts that collocate with verb concept "A", in order of collocation strength.
(2) The system calculates the mean of the frequency information of the extracted verb concepts and takes this value as the frequency information of verb concept "A".
For example, when "to lay in stock" co-occurs with "over" (frequency information 20,711), "to give" (frequency information 66,190), and "to come back" (frequency information 49,435), the frequency information of "to lay in stock" is set to 45,445.33, the average of the frequency information of "over", "to give", and "to come back".
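The two-step procedure reduces to taking the mean of the collocates' frequencies, as the following Python sketch shows with the figures from the example above. The function name is hypothetical, and returning 0 when no collocate is available is an assumption made for the sketch.

```python
from statistics import mean

def estimate_frequency(collocate_freqs):
    """Estimate a zero-frequency concept's frequency as the mean of its collocates'."""
    # Assumption: with no collocates, no estimate is possible and 0 is kept.
    return mean(collocate_freqs) if collocate_freqs else 0

# "over", "to give", "to come back" from the example in the text.
estimate = estimate_frequency([20711, 66190, 49435])
print(round(estimate, 2))  # 45445.33
```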
However, this method alone cannot estimate the frequency information when a verb concept and all of its collocating verb concepts have frequency 0. Therefore, we devised a technique that acquires the frequency information by applying the proposed method repeatedly. For example, in Fig. 7, "(1)" shows the frequencies obtained without the co-occurrence information, and "(2)" shows the frequencies set by the method that uses the co-occurrence information. For concept "E", the frequency remains 0 because all of its co-occurring concepts have frequency 0. "(3)" shows the case in which the frequency information is set by applying the co-occurrence information once more. The frequency information of concept "E" in "(3)" is set to 1 by calculating the average of concepts "A" and "F".
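The repeated application can be sketched in Python as follows. Several details are assumptions made for the sketch rather than facts from the paper: each pass reads only the values of the previous pass, the mean is taken over all listed collocates (including those still at 0), and the loop stops when a pass changes nothing. The concept names echo Fig. 7, but the frequencies are hypothetical and do not reproduce the values in that figure.

```python
def propagate_frequencies(freq, collocates, max_passes=10):
    """Repeatedly fill zero frequencies with the mean of the collocates' frequencies.

    freq: dict concept -> frequency (0 means unknown)
    collocates: dict concept -> list of collocating concepts
    """
    freq = dict(freq)  # do not modify the caller's data
    for _ in range(max_passes):
        updates = {}
        for concept, f in freq.items():
            if f != 0:
                continue
            neighbours = collocates.get(concept, [])
            if not neighbours:
                continue
            estimate = sum(freq.get(n, 0) for n in neighbours) / len(neighbours)
            if estimate > 0:
                updates[concept] = estimate
        if not updates:
            break  # nothing changed in this pass
        freq.update(updates)  # apply after the pass, so passes stay separate
    return freq

# Hypothetical data: E collocates only with A and F, which both start at 0.
initial = {"A": 0, "B": 4, "C": 2, "E": 0, "F": 0}
links = {"A": ["B", "C"], "F": ["B"], "E": ["A", "F"]}
result = propagate_frequencies(initial, links)
```

In the first pass "A" and "F" receive estimates while "E" stays at 0; in the second pass "E" is estimated from the new values of "A" and "F", which mirrors the transition from "(2)" to "(3)" described above.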

The result of Calculating the Frequency Information
As the target for calculating the frequency information, we selected 30 novels by Riichi Yokomitsu from Aozora Bunko, so that the number of verb concepts with frequency 0 would increase sharply. We obtained frequency information for 2,364 verb concepts; the number of verb concepts with frequency 0 was 2,521. After the frequency information was calculated with the above method, 3,672 verb concepts had frequency information, an increase of 1,308 over the initial calculation. Fig. 8 shows this transition as a graph.
We then show generation results using the frequency information. Fig. 9 shows a generation result using the frequency information before the calculation with the co-occurrence information, and Fig. 10 shows a generation result using the frequency information after the calculation. We plan to estimate the frequency information for the works of Riichi Yokomitsu and of other authors in Aozora Bunko under the same conditions. This experiment is intended to examine the validity of the proposed method and of other methods by confirming the tendencies of multiple writers.
Fig. 6. The calculating formula of the co-occurrence information
Fig. 7. A method of calculating the frequency information (f = frequency information)

Conclusions
In this paper, we implemented for verb concept selection a method similar to the one used for noun concept selection. The verb concept selection method is based on frequency information and presupposes that the appearance frequency of every concept is available. Because this does not hold for verb concepts of frequency 0, we proposed a method that uses the co-occurrence information between concepts. We demonstrate a calculation result of the frequency information using the co-occurrence information in the verb

Fig. 1. Overview of the integrated narrative generation system

Fig. 4. The process of generating an event