Analysis of Betweenness Centrality in Social Network Models

In real social networks, it is often important to identify nodes that have great influence. Betweenness, the fraction of shortest paths that pass through a node, has been proposed as an index for finding such nodes. However, computing Betweenness is very costly in large-scale networks, and an approximation of Betweenness has therefore been proposed in recent years. In this study, we calculate Betweenness on network models that imitate real social networks, and we analyze the results obtained by both this approximate method and the conventional one. We find that Betweenness on these network models tends to concentrate on a few nodes when high-degree nodes tend to connect to each other, and that the precision of the approximation depends on the hop number, which limits the search range.


Introduction
In actual social networks, such as coauthorship relations among scientists, costarring relations among movie stars, acquaintance relations among company managers, and SNS, it is often important in deals, negotiations, and proposals of new plans to identify the persons (nodes) who have great influence on business.
In actual networks such as the above, each node builds links independently, and the whole network is constructed as a result. In other words, such networks are self-organized. A common feature of many self-organized networks is that the degree distribution follows a power law. A power-law degree distribution means that an enormous number of nodes with only a few links coexist with a very small number of hubs.
Recently, Betweenness centrality, the fraction of shortest paths that pass through a node, has been proposed as an index for finding important elements in networks. However, obtaining Betweenness requires searching for the shortest paths between all pairs of nodes. The calculation cost of Betweenness is therefore high in large-scale networks [1].
In recent years, an approximation for obtaining Betweenness in large-scale networks has been proposed. This approximation estimates Betweenness in the whole network from range-limited Betweenness computed within a restricted range, under the assumption that the number of reachable nodes increases exponentially with the hop number.
On the other hand, in social networks hubs (nodes with high degree) tend to connect to each other, whereas in technical networks such as the Internet and power grids, hubs tend to connect to nodes with few links.
Because the number of nodes reachable within the same hop number changes when the relation between the degrees of adjacent nodes differs, this number can influence the approximation of Betweenness.
In this report, we analyze by computer simulation how the properties of Betweenness and of its approximation depend on the relation between the degrees of adjacent nodes. The results show that nodes with high approximate values of Betweenness tend to have high true values. This indicates that our proposed method can discover central elements at a lower calculation cost than computing the true values of Betweenness.
The paper is organized as follows. Section 2 introduces the generation models of the self-organized networks used in this study. Section 3 describes concretely the calculation of approximate Betweenness by extrapolation of Bl(u), which is necessary for implementation, and shows the results of both our proposed method and the original Betweenness in numerical experiments. Finally, Section 4 gives a summary.
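As an illustration of the two quantities discussed above, the following is a minimal sketch of a Brandes-style calculation for an unweighted, undirected network. The function name `betweenness` and the parameter `max_hops` are our own illustrative choices, not notation from Ref. [1]. With `max_hops=None` it counts shortest paths between all pairs; with a hop limit it counts only shortest paths of at most that length, which is one plausible reading of range-limited Betweenness. The values are raw path counts, not normalized to a fraction, and the extrapolation step of the approximation is not included.

```python
from collections import deque

def betweenness(adj, max_hops=None):
    """Betweenness of every node of an unweighted, undirected graph.

    adj: dict mapping node -> iterable of neighbours.
    max_hops: if given, only shortest paths of length <= max_hops are
              counted (range-limited variant, an assumption of this sketch).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {s: 1}      # number of shortest paths from s
        preds = {s: []}     # predecessors on shortest paths from s
        order = []          # nodes in non-decreasing distance from s
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            if max_hops is not None and dist[v] >= max_hops:
                continue    # do not search beyond the hop limit
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # accumulate dependencies from the farthest nodes back to s
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # every unordered pair was visited from both of its end nodes
    return {v: b / 2.0 for v, b in bc.items()}
```

For a chain of five nodes, for example, the middle node lies on four of the shortest paths between other pairs, and on only one of them when the search is limited to two hops.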
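The number of reachable nodes per hop number can be measured directly by breadth-first search. The sketch below is only an illustration: the function name `mean_reachable` is hypothetical, and we assume that the quantity of interest is the cumulative number of other nodes within l hops, averaged over all starting nodes.

```python
from collections import deque

def mean_reachable(adj, max_hops):
    """Return a list whose (l-1)-th entry is the average, over all start
    nodes, of the number of other nodes within l hops (l = 1..max_hops)."""
    totals = [0] * (max_hops + 1)
    for s in adj:
        dist = {s: 0}
        per_hop = [0] * (max_hops + 1)  # per_hop[l]: nodes at exactly l hops
        queue = deque([s])
        while queue:
            v = queue.popleft()
            if dist[v] == max_hops:
                continue                # stop at the hop limit
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    per_hop[dist[w]] += 1
                    queue.append(w)
        running = 0
        for l in range(1, max_hops + 1):
            running += per_hop[l]       # cumulative count up to l hops
            totals[l] += running
    return [totals[l] / len(adj) for l in range(1, max_hops + 1)]
```

Running this on two networks with the same degree distribution but different degree correlations would show how the reachable range at a fixed hop number differs between them.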

Generation model of the network
2.1 Generation model of a self-organized network
The degree distribution of a self-organized network obeys a power law. The BA model has been proposed as a generation model of such networks. The BA model is constructed through two procedures: "addition of a node" and "preferential destination search." In this model, nodes with various degrees are connected to each other. In the next section, we explain the configuration model, which can regulate the relation between the degrees of adjacent nodes.
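The two procedures of the BA model can be sketched as follows. This is a minimal illustration rather than the exact generator used in our experiments: the function name `barabasi_albert`, the initial complete graph on m + 1 nodes, and the parameter names `n`, `m` and `seed` are assumptions of the sketch.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a network to n nodes. Each added node links to m distinct
    existing nodes chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    # assumption: start from a complete graph on m + 1 nodes
    adj = {i: set() for i in range(m + 1)}
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
    # each node appears in `stubs` once per incident link, so a uniform
    # draw from `stubs` is a degree-proportional draw of a node
    stubs = [v for v in adj for _ in adj[v]]
    for new in range(m + 1, n):             # "addition of a node"
        targets = set()
        while len(targets) < m:             # "preferential destination search"
            targets.add(rng.choice(stubs))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend([new, t])
    return adj
```

The degrees of the resulting network, `[len(adj[v]) for v in adj]`, can then serve as the degree sequence that the configuration model starts from.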

The generation method of the configuration model
The degree distribution of the configuration model is determined in the same way as in the BA model. The probability of connection between nodes of the same degree can be regulated by a parameter a. Let N denote the number of nodes. The generation method is as follows:
(1) Determine the degree of each node of the network as in the BA model (do not connect the nodes to each other yet).
(2-1) Randomly select two nodes i, j whose numbers of connected links are less than their degrees.
(2-2) If they are not already connected, connect them with the probability of Eq. (1), which depends on the difference between the ranks of the two nodes' degrees (the node with the highest degree is ranked first).
(3-1) If procedure (2) fails to connect N consecutive times, select a link at random and call it l.
(3-2) Let u and v be the end nodes of l. Disconnect u and v. Randomly select two nodes whose numbers of connected links are less than their degrees, and connect them to u and v, respectively, with the probability of Eq. (1).
(4) Repeat from procedure (2-1) until the degree distribution of the whole network is satisfied.
In this research, we also tested a method that calculates λ(u) for each node u individually in procedure (2), without taking the average over all nodes in procedure (1).

Analysis of our approximation
In this section, we analyze the true and approximate values of Betweenness in the BA model and the configuration model. Tab. 1 shows the parameters of the network models, and Tab. 2 shows the specification of the computer used for the simulations.
Assortativity [3] has been proposed as a measure of the relation between the degrees of adjacent nodes. When assortativity is positive, nodes of similar degree frequently connect to each other. Conversely, when assortativity is negative, nodes of different degree frequently connect. In the BA model and the configuration model, assortativity was 0 and 0.45, respectively. This shows that nodes with various degrees connect to each other in the BA model, whereas nodes of similar degree connect in the configuration model.
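For reference, assortativity can be computed as the Pearson correlation between the degrees found at the two ends of each link. The sketch below assumes this standard definition for an undirected network; the function name `degree_assortativity` is an illustrative choice.

```python
def degree_assortativity(adj):
    """Pearson correlation of the degrees at both ends of every link.

    adj: dict mapping node -> set of neighbours (undirected graph).
    """
    # each undirected link contributes both orientations (u, v) and (v, u)
    xs, ys = [], []
    for u in adj:
        for v in adj[u]:
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    n = len(xs)
    mean = sum(xs) / n      # xs and ys have the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    # var is zero when all nodes have the same degree; the measure is
    # undefined in that case and this sketch does not handle it
    return cov / var
```

A star network, in which the hub connects only to nodes of degree one, gives -1, the most disassortative value.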

Time necessary to calculate and accuracy of approximation
Tab. 3 shows the calculation time and the accuracy of the approximation. In the BA model, our approximation takes more time than Brandes' algorithm, but the approximation appears to be accurate. In the configuration model, on the other hand, the approximation takes less time but is less accurate, because fewer nodes are reachable within M hops in the configuration model than in the BA model. Tab. 4 also shows this fact.

Comparison between approximate and true value of betweenness
We compare the approximate value of betweenness, B*(u), with the true value, B(u), for each model. In the BA model, B*(u) tends to be slightly larger than B(u) with a common λ and smaller with an individual λ, as shown in Fig. 1 and 2. In both cases, however, there appears to be little change in the rank of nodes with high true values of Betweenness.
Furthermore, comparing the two models, the betweenness of high-ranked nodes in the configuration model is twice as large as that in the BA model. In other words, shortest paths pass through specific nodes more frequently in the configuration model than in the BA model. Thus, the link structure of the configuration model can be thought to have sparse and dense regions that are more clearly distinguished than in the BA model.
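The two aspects of this comparison, the size of the error and the preservation of rank, can be quantified along the following lines. The function names are hypothetical, and skipping nodes with B(u) = 0 in the relative error is an assumption of this sketch, since the relative error is undefined for them.

```python
def mean_relative_error(approx, exact):
    """Average of |B*(u) - B(u)| / B(u) over nodes with B(u) > 0."""
    errs = [abs(approx[u] - exact[u]) / exact[u]
            for u in exact if exact[u] > 0]
    return sum(errs) / len(errs)

def top_k_overlap(approx, exact, k):
    """Fraction of the k highest-ranked nodes shared by both rankings."""
    def top(scores):
        return set(sorted(scores, key=scores.get, reverse=True)[:k])
    return len(top(approx) & top(exact)) / k
```

An overlap close to 1 for small k would correspond to the observation that the highest-ranked nodes keep their positions even when the approximate values themselves deviate.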

Comparison between approximate curve and true value
Fig. 5 shows the curves of <Zl> against l, and Fig. 6 and 7 show the curves of Bl(u) against l.
According to Fig. 5, the curve approximating <Zl> is almost consistent with the original <Zl>. However, according to Fig. 7 and 8, the approximate curves of Bl(u) differ slightly from the original Bl(u).
Meanwhile, in Ref. [1], an approximation of Bl(u) using a monotonically increasing function is consistent with the original Bl(u), even though <Zl> and Bl(u) each converge to a certain value.
This fact indicates that factors other than fitting the form of the approximation function to both the original <Zl> and Bl(u) are important for an accurate approximation of BL*(u). Improving the function used for the approximate curve is therefore left as future work.

Conclusions
In this report, we analyzed by computer simulation how the properties of both true and approximate Betweenness depend on the relation between the degrees of adjacent nodes. For the BA model, our method can calculate approximate Betweenness with high accuracy. Moreover, the highest Betweenness in the configuration model is higher than that in the BA model. In addition, the high ranks of approximate Betweenness mostly coincide with the high ranks of the true one.

(1) Calculate the average number of reachable nodes, <Zl>, for hop number l = 1, 2, ..., M.
(2) Approximate <Zl> by the next expression and calculate the parameters A and λ using the least squares method.
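The fitting in step (2) can be sketched as follows. The fitting expression itself is not reproduced in this text, so the form used here, <Zl> ≈ A exp(λ l), is an assumption based on the exponential growth hypothesis stated in the Introduction. The sketch applies ordinary least squares to log <Zl>, which is not identical to least squares on <Zl> itself, and the function name `fit_exponential` is hypothetical.

```python
import math

def fit_exponential(z):
    """Fit z[l-1] ~ A * exp(lam * l) for l = 1..len(z).

    Uses ordinary least squares on log z (assumed form, see lead-in).
    Returns the pair (A, lam).
    """
    ls = list(range(1, len(z) + 1))
    ys = [math.log(v) for v in z]
    n = len(z)
    l_mean = sum(ls) / n
    y_mean = sum(ys) / n
    sxx = sum((l - l_mean) ** 2 for l in ls)
    sxy = sum((l - l_mean) * (y - y_mean) for l, y in zip(ls, ys))
    lam = sxy / sxx                         # slope of log z against l
    A = math.exp(y_mean - lam * l_mean)     # intercept, transformed back
    return A, lam
```

The input `z` would be the list of measured averages for l = 1, ..., M; at least two values are needed for the slope to be defined.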

Table 3: Required calculation time and accuracy
* <|(B*(u) - B(u)) / B(u)|>: average of the relative error over all nodes u

Table 4: Average number of vertices within l hops