A Study on the Impression Received from the Response of a Communication Robot

In this study, we examine how conversation with a communication robot affects the user's mental state, using a subjective measurement method called the Visual Analog Scale (VAS). The main purpose of this study is to investigate the possibility of improving mental health using robots and IT. First, we introduce VAS as a subjective evaluation method and “VASpad”, a subjective measurement application for smartphones. Smartphone applications have developed remarkably in recent years, and smartphones are rapidly becoming familiar to us. In consideration of this situation, we propose a transition from conventional VAS measurement to a new VAS measurement that uses a smartphone, which is expected to reduce the burden of tabulation work. Next, we describe an experiment using a communication robot. After each user talked with the robot for a few minutes, the user performed a subjective evaluation of 10 question items using VASpad. This subjective evaluation was carried out for two cases: the case where the robot's conversation was smooth and the case where it was not. Finally, we analyze and discuss the experimental results. The expected results were obtained when the conversation was smooth; however, a different trend was obtained when the conversation was not smooth.


Introduction
In recent years, the number of mental illness cases in Japan has been increasing. In response to this problem, the Ministry of Health, Labor and Welfare in Japan has operated a system for regularly checking the stress levels of workers since December 2015. The main purpose of this system is to inform workers of their results so that they become aware of their own stress state, to reduce the risk of individual mental health disorders, and to analyze the results collectively to improve the workplace environment.
As mental health care has become more important, companies are also working to alleviate mental health problems by utilizing ICT. One such approach is to improve mental health care using robots. The conventional market consisted mostly of industrial robots for corporations; however, many types of robots, from advanced personal robots to relatively inexpensive toy robots, have been developed in recent years.
Pet-type robots are expected to have an effect similar to that of animal therapy. Animal therapy itself has some known problems, such as users who have a phobia of animals and the burden placed on the animals.
Pet-type robot therapy is expected to solve these problems of animal therapy. Assuming that robot therapy can offer the same effect as animal therapy, we focused on the effects that communication robots have on users through interaction.
In this research, we investigate the psychological change that conversation with a communication robot produces in the user, using a subjective evaluation method called VAS. From a medical point of view, it is desirable to use the latest high-performance communication robots; however, we conduct our experiments using an inexpensive toy robot this time. Inexpensive communication robots often fail to hold a smooth conversation because of their limited performance. Psychological evaluation of conversation with such robots is likely to become important in the future. In this study, we examine the psychological effects on users of general conversation with a toy robot. We focus on the differences between users' reactions to conversations with the communication robot when the conversation is smooth and when it is not.

Robot Therapy
In robot therapy, people are expected to feel "fun" and "comfort" by touching the robot and, as a result, to obtain "mental recovery". In fact, some verification results show that robot therapy brings the same effects as animal therapy. These effects are listed below.
1. Psychological effects: cheering and motivating people
2. Physiological effects: reducing stress, stabilizing blood pressure and pulse
3. Social effects: activating communication (e.g., providing a new topic of conversation)
For example, the seal-type robot Paro was used with elderly people at a health service facility for the aged, and the results showed that interaction with Paro improved their moods and depression (4). Paro has also been introduced in several welfare and medical facilities, and many pediatric patients have been observed to become mentally stable after its introduction at the Kobe University Hospital Children Center. Robot therapy has the advantage that it can be used safely with elderly people and toddlers because there is no risk of hygiene problems, biting, or barking.

Communication Robot
Pepper is one of the best-known communication robots. Pepper can behave as if it had feelings of its own by means of an "emotion recognition function", which infers emotion from facial expression and voice, and its original emotion processing function. Pepper is provided by SOFTBANK, and although its price is as high as 198,000 JPY, it is spreading not only among corporate users but also among individuals.
RoBoHoN, developed by Sharp, is a next-generation mobile information communication terminal that combines robot technology and mobile phone technology. Although it is a humanoid robot capable of bipedal walking, it is extremely small (about 19.5 cm in height) and can be carried in a pocket or a bag. It supports mobile communication and has the basic functions of a smartphone, such as voice calls, mail, a camera, and a liquid crystal touch panel.
Because it has a small projector, it can also project photos and images.
In this research, we use the communication robot OHaNAS developed by Takara Tomy. Its distinguishing feature is that it carries on a conversation by using NTT's AI server via a smartphone application, without mounting AI on the main body. Because it is less expensive than Pepper, RoBoHoN, and the like mentioned above, it is comparatively easy to introduce at home.
We assume that a communication robot also has a therapeutic effect, and perform a subjective evaluation experiment using this robot.

Technical Specification of OHaNAS
In this research, OHaNAS (Organized Human Interface and Network Artificial Intelligence System) (5) is adopted as the communication robot for the experiment. OHaNAS is a communication robot jointly developed by NTT DoCoMo and Takara Tomy, and it is the first robot in the world to adopt DoCoMo's natural dialogue platform technology. OHaNAS is compact, about 160 mm square, and can be treated like a pet. It is also light, weighing 590 g with batteries, so it is easy to carry. It operates on three AAA batteries or an external power supply. It can be used for about 2 to 3 weeks when it is used for 20 to 30 minutes a day. To talk with OHaNAS, it is necessary to install a dedicated application on a smartphone and pair it via Bluetooth. After these settings are made, it is possible to talk with OHaNAS while the smartphone is connected to the Internet.
When the user enters the profile requested by the smartphone application, the robot converses with consideration for that user. When OHaNAS is started for the first time, it tells the user the date, along with travel knowledge and popular words related to that date. Because it is linked with DoCoMo's server, OHaNAS can respond using knowledge that can be acquired from the Internet. Figure 1 shows a view of OHaNAS.

Features of VAS
Due to space limitations, we refer the reader to previous studies (1)(2)(3) for the details of VAS. VAS is a method in which a subject marks the degree of pain that he or she perceives on a 100 mm horizontal straight line whose two ends correspond to 0 and 1, and the degree is quantified by the length along the line. It has been reported to be a highly sensitive pain evaluation method, and it is widely used.
However, some problems have also been pointed out. First, some patients cannot understand how to express pain on a line; patients with reduced vision, some elderly people, and some children cannot learn the method; and the values cannot be compared between patients. In addition, subjective assessment measurements are generally carried out on question sheets, so counting the measurement results becomes complicated. To solve this problem, we consider that developing a VAS application (5) that can automatically calculate the measured value is important for smooth subjective evaluation measurement in the future.

Other Subjective Measurement Method
In this section, we also refer to subjective evaluation methods other than VAS. For example, in the measurement method called LS, the subject generally selects, from 5 to 7 levels, the numerical value that best applies to his or her pain. Results on question paper are easier to compile with LS than with VAS, and LS is often used in psychology and related fields.
In a previous study, we confirmed that there is a difference in tendency between LS and VAS. The results of comparative experiments on subjective evaluation with VAS and LS for more than 170 subjects indicate that there is some bias between VAS and LS (1). In addition, we introduce another measurement method called the Faces Pain Scale (FPS) (see Fig. 2). Instead of expressing pain in words, FPS expresses pain by human facial expressions in several stages, from a painless face to a very painful face, and it is frequently used for children because it is easy to understand. There are various studies on the comprehensive evaluation of pain using methods such as VAS and FPS (5).

Development of VAS app

Development Objectives
As previously mentioned, VAS is a very sensitive measurement method, but obtaining numerical values from question paper and aggregating them is costly. Therefore, we have developed an application so that VAS measurement can be performed on a smartphone, which has become the device most familiar to us. We aim to design a simple interface that can be used easily by everyone from children to elderly people, and to develop applications that can be used in various situations, such as medical and educational settings, in the future.

Development Environment of VASpad
Next, we describe "VASpad", the VAS application that we have developed (7). The interface presents a question number and a question. The position obtained by tapping on the straight line for VAS measurement is normalized by the length of the line to give the measured value. Therefore, the measured value falls within the interval [0, 1] (shown as [0-100] in the application and in this paper).
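The normalization described above can be sketched as follows. This is a minimal illustration and not the VASpad source code: the function names, the pixel-coordinate parameters, and the clamping of taps that fall outside the line are all assumptions made for the example.

```python
def vas_value(tap_x: float, line_start_x: float, line_length: float) -> float:
    """Return a VAS reading in [0, 1] for a tap at horizontal position tap_x.

    tap_x, line_start_x and line_length are hypothetical screen coordinates
    (e.g. pixels); only their ratio matters.
    """
    if line_length <= 0:
        raise ValueError("line_length must be positive")
    ratio = (tap_x - line_start_x) / line_length
    # Assumption: taps slightly outside the drawn line are clamped to its ends.
    return min(1.0, max(0.0, ratio))


def to_display_scale(value: float) -> float:
    """Convert a [0, 1] reading to the [0-100] scale, rounded to two decimals."""
    return round(value * 100.0, 2)


if __name__ == "__main__":
    reading = vas_value(tap_x=150.0, line_start_x=100.0, line_length=200.0)
    print(reading, to_display_scale(reading))
```

Because the reading is a ratio, the same value is obtained regardless of the physical length of the line on a particular screen.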

Fig. 3 Screenshots of VASpad (Android Version)
We are developing an iOS version and an Android version. Basically, they have the same functions as the VAS application for iPad programmed in Objective-C. Figure 3 shows screenshots of VASpad (Android version). A previous study (7) confirmed that there is no significant difference in performance between VAS measurement with the application and VAS measurement using question paper.

Experimental Method
We conducted an experiment on conversations with OHaNAS with 10 students selected from our college, using the newly developed VAS application. Each student talked to OHaNAS for about 5 to 10 minutes. Here, we focused on the smoothness of the flow of the conversation, and we instructed the students to answer the 10 questions separately for the case of a smooth conversation and the case of a conversation with inappropriate replies. The speech recognition rate of this robot is not high, so the conversation frequently fails to be established. Subjects can therefore easily experience both smooth conversation and contextless responses.
Figure 4 shows the 10 questions used in this experiment. Q2 and Q3 are questions about paired items such as "Happy-Sad", and Q7 to Q10 ask about the individual items used in Q2 and Q3, each answered as a numerical value. Q1, Q4, Q5 and Q6 are questions about the performance of OHaNAS.

Results and Discussion - From the viewpoint of differences in questions
In this section, we consider the measured values obtained in the experiment described above. We discuss the measurements obtained from the 10 questions from the viewpoint of the smoothness of the conversation. Therefore, two sets of results are shown: one for the "smooth conversation" and the other for the conversation that includes "contextless responses".
We first analyze the results from the viewpoint of differences in questions. Table 1 shows the measured values obtained from the 10 students when the conversation was smooth. Table 2 shows the measured values obtained from the same 10 students when the conversation was not smooth. In this paper, we refer to the former as case 1 and the latter as case 2 to simplify notation. "Q1" to "Q10" in the first row of each table represent the question numbers, and "S1" to "S10" in the first column of the table represent the students. All values are within the interval of 0 to 100, rounded to two decimal places. Table 3 and Tab. 4 show the fundamental statistics of Tab. 1 and Tab. 2, respectively. For most of the questions, the average values of the two cases differ. The standard deviation also varies greatly for some questions.
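As an illustration of how per-question fundamental statistics of this kind can be computed from a students-by-questions table, a minimal sketch is given below. The function name and the toy data are hypothetical, and the use of the sample (n-1) standard deviation is an assumption; the text does not state which definition was used.

```python
from statistics import mean, median, stdev


def column_stats(table):
    """Summarize each question (column) of a students-by-questions table.

    table: list of rows, one row per student, one value per question.
    Returns one dict per question.
    """
    stats = []
    for column in zip(*table):
        stats.append({
            "mean": mean(column),
            "median": median(column),
            "sd": stdev(column),  # assumption: sample standard deviation
            "min": min(column),
            "max": max(column),
        })
    return stats


if __name__ == "__main__":
    # Toy values only, not the measurements of Tab. 1 or Tab. 2.
    toy = [[10.0, 50.0], [20.0, 50.0], [30.0, 80.0]]
    for q, s in enumerate(column_stats(toy), start=1):
        print(f"Q{q}", s)
```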
To understand these features visually, we present box-and-whisker charts. Figure 5 and Fig. 6 show the box-and-whisker charts of Tab. 1 and Tab. 2. The thick line inside the box is the median, the upper side of the box is the upper quartile, and the lower side of the box is the lower quartile. The whiskers above and below the box are dotted lines drawn from the upper (lower) quartile to the maximum (minimum). Small circles represent outliers. We can see outliers at Q1, Q4, Q5, Q7 and Q10 in Fig. 5. We also see outliers at Q4 in Fig. 6.
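The quantities drawn in such a chart can be sketched as below. The text does not state how outliers were identified; the sketch assumes the common convention that points farther than 1.5 times the interquartile range from the box are outliers, with the whiskers ending at the most extreme remaining values. The function name and the toy data are hypothetical.

```python
from statistics import quantiles


def box_summary(values, whisker=1.5):
    """Return quartiles, whisker ends and outliers for one question."""
    data = sorted(values)
    # "inclusive" interpolates between the observed minimum and maximum.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr   # assumption: 1.5 * IQR rule
    high_fence = q3 + whisker * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Toy values only, not taken from Tab. 1 or Tab. 2.
    print(box_summary([10, 20, 30, 40, 50, 60, 70, 80, 90, 200]))
```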
We add a supplementary explanation for some characteristic cases. The greatest outlier is seen in Q10 for the smooth conversation. When we asked the student why his degree of anger was high, he replied, "Although the conversation went well, I could not hear the intonation of the OHaNAS reply." Next, we explain the case of Q1 in Tab. 4, where the standard deviation is the largest (29.25). A student who selected a low value put emphasis on the smoothness of the conversation, so the value is naturally small because the conversation did not go well. On the other hand, a student who selected a high value evaluated the reply itself: "The conversation did not go well, but it responded."
Furthermore, we discuss the differences between the two groups in detail. In Fig. 7, the box-and-whisker charts of Fig. 5 and the measured values in Tab. 1 are drawn in blue, and the box-and-whisker charts of Fig. 6 and the measured values in Tab. 2 are drawn in red. In setting up the 10 questions shown in Fig. 4, we anticipated the trends shown in Fig. 8 with respect to the measurement results. From Fig. 7, we can see that the measured values in this experiment meet these trends. This result indicates that the ratings of the robot's conversation ability, happiness, sadness and anger are susceptible to the smoothness of the conversation. Moreover, it can be confirmed that sadness and anger show higher values when the conversation is not smooth. This is not far from common sense. We can also see that the measured values of Q6 vary widely, as initially envisioned.

Fig. 10 Clustering by K-means method (case 1) (between_SS / total_SS = 74.9%)
For reference, we also show the results of cluster analysis for each question. Figure 9 and Fig. 11 show the cluster dendrograms of case 1 and case 2 obtained by the Ward method. Figure 10 and Fig. 12 show the results of clustering by the K-means method, with 3 clusters for case 1 and 5 clusters for case 2. The number of clusters was determined based on the total variance of each clustering. Figure 10 and Fig. 12 plot the clustering results on the plane of the first and second principal components. (These figures are drawn in a two-dimensional plane only to aid visual understanding, and the two components have no significant meaning.) The clustering results shown in Fig. 9 and Fig. 10 are consistent with the prior expectation shown in Fig. 8. On the other hand, the clustering results shown in Fig. 11 and Fig. 12 do not match the prior expectation well. In this experiment, we obtained the expected result when the conversation was smooth; however, we could not find a similar trend when the conversation was not smooth.

Fig. 12 Clustering by K-means method (case 2) (between_SS / total_SS = 75.4%)
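The ratio between_SS / total_SS reported with the K-means figures is the share of the total variance that is explained by the separation between clusters. A minimal sketch of that calculation for an already given cluster assignment is shown below. It does not run K-means itself, and the function name, the label encoding and the toy points are assumptions made for illustration.

```python
def between_ss_ratio(points, labels):
    """Return between_SS / total_SS for points with given cluster labels.

    points: list of equal-length numeric vectors (one per clustered item).
    labels: cluster label of each point, in the same order.
    """
    n_dims = len(points[0])

    def centroid(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(n_dims)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Total sum of squares: squared distances to the overall centroid.
    grand = centroid(points)
    total_ss = sum(sq_dist(p, grand) for p in points)
    if total_ss == 0:
        raise ValueError("all points are identical; the ratio is undefined")

    # Within sum of squares: squared distances to each cluster's centroid.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    within_ss = sum(
        sq_dist(p, centroid(rows)) for rows in clusters.values() for p in rows
    )

    # between_SS = total_SS - within_SS
    return (total_ss - within_ss) / total_ss


if __name__ == "__main__":
    # Toy points only, not the experimental data.
    pts = [[0, 0], [0, 2], [10, 0], [10, 2]]
    print(between_ss_ratio(pts, [0, 0, 1, 1]))
```

A ratio close to 1 means that most of the variation lies between clusters rather than inside them, so values such as 74.9% and 75.4% indicate reasonably well separated groups.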

Results and Discussion - From the viewpoint of differences in respondents
Next, we analyze the results from the viewpoint of differences among respondents. The data analysis method is as described above. In Fig. 13, the box-and-whisker charts and circle points drawn in light green express the data of case 1, and those drawn in light orange express the data of case 2.
To examine the similarity among students, we used a correlation matrix. Table 7 shows the correlation matrix of students in case 1, and Tab. 8 shows the correlation matrix in case 2. To make the features easier to see, cells with a correlation coefficient of 0.8 or more are painted light pink, and cells with a coefficient less than -0.8 are painted light orange. When the conversation was smooth, more than half of the students were highly correlated; however, the correlation coefficients between students tended to decrease in the other case.
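A correlation matrix of this kind, together with the highlighting thresholds, can be sketched as follows. The sketch assumes the Pearson correlation coefficient computed between the answer vectors of two students; the text does not name the coefficient used, and the function names and toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(vx * vy)


def correlation_matrix(rows):
    """rows: one answer vector per student. Returns a square matrix."""
    return [[pearson(a, b) for b in rows] for a in rows]


def highlighted_pairs(matrix, threshold=0.8):
    """List (i, j, r) below the diagonal with r >= threshold or r < -threshold."""
    pairs = []
    for i in range(len(matrix)):
        for j in range(i):
            r = matrix[i][j]
            if r >= threshold or r < -threshold:
                pairs.append((i, j, r))
    return pairs


if __name__ == "__main__":
    # Toy answers only, not the measurements of the experiment.
    answers = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]
    for i, j, r in highlighted_pairs(correlation_matrix(answers)):
        print(f"S{i + 1}-S{j + 1}: {r:.2f}")
```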

Conclusions
We conducted subjective evaluation experiments with 10 students using VASpad in conversations with a communication robot. We set 10 questions about the performance of the robot and the feelings evoked by the robot, and asked the students to give their evaluations after the conversation with the robot. The VAS measurement was carried out for two cases: a case where the conversation was smooth and a case where it was not so smooth. When the conversation was smooth, we confirmed that the obtained results matched our predicted trends. Cluster analysis using the K-means method also agreed with these trends. However, we also confirmed that the results did not match well when the conversation was not smooth. The difference in the trends of the correlation coefficients between students in the two cases also indicates the possibility that users react differently depending on the smoothness of the conversation.
Of course, we understand that the number of samples in this experiment is not sufficient. It will be necessary to compare these results with experimental results obtained from a larger number of samples. Comparing LS and VAS is also an important issue. To clarify these problems, we will continue our research in the future.

Fig. 2 An image of Faces Pain Scale

Fig. 4 10 questions about conversation with OHaNAS

Figure 7 shows an overlay of Fig. 5, Fig. 6 and all measured values.

Fig. 8 Trends of measurement values assumed before experiment

Fig. 11 Cluster dendrogram on questions (case 2)

Fig. 13 Box-and-whisker charts of Tab. 5 and Tab. 6 and plots of measured values.