Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
R ES EA R CH Open Access
Abstract
The Brazilian Symposium on Databases (SBBD) celebrated its 30th edition in October 2015. As the database
community has evolved over the years, so has the data analysis area. To celebrate such accomplishments, this article
goes over the SBBD history from distinct social perspectives. Overall, we investigate the complete SBBD co-authorship
network built from bibliographic data of SBBDs 30 editions, from 1986 to 2015, and analyze several network metrics,
considering the network evolution over the three decades. In particular, we analyze the progress of the most engaged
SBBD authors, the number of distinct authors, institutions, and published papers, and the evolution of some of the
most frequent terms presented in the titles of the papers, as well as the influence and impact of the most prominent
SBBD authors.
Keywords: Collaboration networks, Social networks, Databases, SBBD
The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made.
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 2 of 16
Finally, the Conclusions section summarizes the main Distributed Systems (SBRC) over its 30 editions. Besides
conclusions and possible future directions. analyzing its co-authorship network, they also consider
several network properties to describe the kinds of col-
Related work laborations found and identified the main communities
Collaboration networks analyses have been well explored within SBRC.
to reveal interesting features of academic communities. Similar to the aforementioned works, here, we analyze
Newman presents the first studies in this area [27, 28]. SBBDs co-authorship network statistics, including aver-
He presents distributions of collaborators and their clus- age of articles by author, articles by edition, and coauthors
ters and characterizes different patterns of collabora- by article. We also evaluate several structural and tem-
tion between distinct fields. Then, Newman [29] answers poral characteristics, such as the main type of co-author
a broad variety of questions about collaboration pat- relationship among authors, the most prominent com-
terns, presenting several structural and topological fea- munities within SBBD, the collaborations and newcomers
tures from bibliographic databases in biology, physics, on the network, and the homophily of SBBD from two
and mathematics. Likewise, other studies build and ana- perspectives: affiliation and gender.
lyze academic networks by countries [16, 24], universities The SBBD community was briefly studied by Procpio
[7, 16, 21], researchers [3, 19], subareas [18, 19], and et al. in a short paper [31] on the occasion of its 25th
venues [2, 5, 22, 26, 31, 32]. anniversary, with an analysis of both its structural char-
There are also studies that regard Computer Science acteristics and temporal evolution. Besides including the
(CS) as a specific area. For example, Freire and Figueiredo data of the last 5 years and contributing to a more thor-
[10] characterize the structural properties of the collabo- ough analysis of the SBBD network, we provide a deeper
ration network in CS by exploring external collaborations investigation underlying social perspectives such as col-
of groups and individuals. Lima et al. [18] propose ranking laborations between authors in this network, communi-
authors across multiple research areas by characteriz- ties assembled inside SBBD, and authors with central roles
ing the profile of top Brazilian researchers. Mena-Chalco in the network. Hence, we considerably expand the previ-
et al. [23] study the Brazilian co-authorship network by ous work, not only in the time interval considered but also
exploring topological metrics. in the many analytical dimensions added.
In this article, we perform a deep analysis of the
Brazilian database research community by evaluating the Background information
collaboration network of the Brazilian Symposium on This section describes background information on data
Databases on the occasion of its 30th edition. Consider- acquisition, network modeling, and key metrics consid-
ing other specific database conferences, Nascimento et al. ered in our study. Given the importance of data acquisi-
[26] analyze the co-authorship network of the ACM SIG- tion in this kind of analysis, we start by describing our
MOD International Conference on Management of Data, dataset building process.
whereas Ameloot et al. [2] go over the 30 years of his-
tory of the ACM Symposium on Principles of Database Dataset
Systems (PODS). For a more complete study, we consider both publi-
Regarding other CS areas, Duarte et al. [8] analyze cation statistics and co-authorship social perspectives.
the works presented at the the Brazilian Sympo- Therefore, our dataset comprises bibliographic data of
sium on Multimedia and the Web in the period SBBDs 30 editions from 1986 to 2015, which includes for
19952012 in order to provide an overview of its each paper: its title, year of publication, list of authors
community and show how its research topics have with their respective affiliations, and the language the
evolved over time. Likewise, Neto et al. [11] inves- paper was written. As each SBBD edition is unique in
tigate the impact of international research in the terms of its program and aiming at a more robust anal-
Brazilian Symposium on Software Engineering (SBES) ysis, we consider only full papers actually presented at
by analyzing its first 24 editions. Moreover, Smeaton the symposium (i.e., our dataset disregards short papers,
et al. [33] analyze the co-authorship network and research demos and tools, tutorials and keynotes, and workshop
topics at the 25-year celebration of the International papers).
ACM SIGIR Conference on Research and Development To ensure continuity, the dataset analyzed by Procpio
in Information Retrieval. Liu et al. [20] focus on the et al. [31], comprising SBBDs first 25 years, was
first decade of the Digital Libraries community by ana- extended with data from the remaining 5 years (2011
lyzing the coauthorship network of past ACM, IEEE, 2015) as collected from BDBComp1 , the Brazilian
and joint ACM/IEEE digital library conferences. Simi- Digital Library of Computing [15]. For consolidating
larly, Maia et al. [22] study the collaboration network our dataset and keeping data consistency, we have
of the Brazilian Symposium on Computer Networks and also performed name disambiguation to avoid splitting
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 3 of 16
the publications of one person under two different The average number of papers per year is 22.47 (with
author names. a standard deviation of 4.86), and the average number of
Note that, from 2010 to 2014, SBBD full papers were authors per year is 60.53 (with a standard deviation of
published as articles in the Journal of Information and 17.53). Finally, the average number of papers per author
Data Management - JIDM2 , whereas the SBBD proceed- is 1.98 (with a standard deviation of 3.21), while the aver-
ings included short papers, demos and workshop papers. age number of authors per paper is 3.04 (with a standard
Then, the 2015 edition published full papers in the pro- deviation of 1.47).
ceedings (mostly in Portuguese) and articles in JIDM as
well. JIDM has also published full versions of invited SBBD Network metrics
short papers, as well as invited papers from other confer- Several metrics characterize and enable to investigate col-
ences such as the Brazilian Symposium on Geoinformatics laboration networks as the one studied here. This section
(GeoInfo), the Symposium on Knowledge Discovery, Min- summarizes the definition and applicability of the metrics
ing and Learning (KDMiLe), and the Brazilian Symposium used in this work.
on Multimedia and the Web (WebMedia). Thus, seek- The degree of a vertex is the number of its adjacent
ing a more round analysis, for the 20102015 period, we edges. Hence, authors who have high number of papers
consider only full papers from SBBD proceedings and all with different co-authors also have high degree. The
SBBD-related papers from JIDM (both regular articles and degree distribution is the probability distribution of these
invited full versions of short papers). Selected papers from degrees over the whole network. The network density is
the other conferences are not considered, as they do not calculated by dividing the number of edges by the number
reflect publications from the SBBD community, the focus of nodes present in the graph. The assortativity measures
of our study. the similarity of connections in the graph with respect
to the node degree. High coefficient means that authors
of high degree tend to connect to authors of high degree
Network model (assortative network).
Following Maia et al. [22], in this study, we also represent A connected component of an undirected graph is a sub-
the SBBD network as a temporal graph Gy = (Vy , Ey ), graph in which any two vertices are connected to each
where Vy is the set of vertices, Ey is the set of edges, other by paths. The size of such a component is given
and y is the year a network refers to. The graph Gy = by dividing the number of its nodes by the total number
(Vy , Ey ) is an undirected weighted graph, where the ver- of vertices of the graph. The average path length is the
tices represent authors, and the edges indicate that two average number of steps along the shortest paths for all
authors have published together in or before the year possible pairs of network nodes. The graph diameter is
y. In addition, each edge has a corresponding weight the length of the longest shortest path between all pair of
based on the weighted collaboration network proposed vertices.
by Newman [27]: the weight for each edge is discounted Betweenness and closeness regard the centrality of nodes
by thesize of the collaboration, according to the formula in the network. The former is equal to the number of
w = p Np11 , where Np is the number of authors of paper shortest paths from all vertices to all others that pass
p. For instance, given a, b, c, d Vy and assuming that a through a given node, whereas the latter measures how
and b wrote a single paper with no other co-author, and b close a vertex is to all other vertices in the graph. The
and c have a joint co-author d. Then, the weight of the edge clustering coefficient CC(x) of a vertex x is the ratio
(a, b) is 1.0, while the weight of the edges (b, c), (c, d), and between the number of edges among the neighbor-set
(b, d) is 0.5, as the weights among nodes are calculated by of x and the total possible number. The clustering coef-
the number of authors for each paper. Figure 1 shows the ficient of the whole network is the average CC(x) over
complete SBBD network as viewed in 2015, representing all x. Finally, homophily is the tendency of authors to
30 years of history. interact with others with similar features. We refer to
The complete SBBD collaboration network, built from Albert and Barabsi [1] and Costa et al. [6] for further
all papers published in its 30 editions, has a total of information and complete definitions of all these network
1034 authors (vertices) and 2299 collaborations (edges), metrics.
comprising a total of 674 papers. The largest connected
component (LCC) (the largest subgraph in which any pair Basic statistics
of nodes is connected by paths) has 781 nodes, repre- We start our study by discussing the SBBD growth.
senting 75.53% of the whole community. Then, 162 nodes Figure 2 presents the evolution of the number of dis-
compose smaller components containing three to seven tinct authors (Fig. 2a), institutions (Fig. 2b), and pub-
authors. Finally, there are 80 nodes that form pairs of lished papers (Fig. 2c) per year. The regression line (which
authors, and 11 nodes that correspond to sole authors. includes a 95% confidence region) of Fig. 2a shows that
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 4 of 16
30
100
Number of institutions
Number of authors
Number of papers
30 25
80
20
60
20
15
40
10
10
20
1990 2000 2010 1990 2000 2010 1990 2000 2010
Year Year Year
(a) (b) (c)
Fig. 2 Evolution of the number of distinct (a) Authors, (b) Institutions, and (c) Papers
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 5 of 16
the number of authors has increased over the years, which but starts growing again from 1996 on. The figure also
suggests that SBBD has been attracting the contribution shows the network density without considering the paper
of new authors. The Pearsons correlation coefficient3 with 18 authors, which in fact represents a collabora-
between the number of authors and institutions per year tion pattern outlier. The density presents some oscillation
indicates a high correlation (0.75, p value = 1.493e-06). in the first decade, but with a clear increasing tendency
Thus, we can go further and say that, besides attracting later, possibly explained by the arrival of new authors that
the participation of new authors, SBBD is also attracting collaborate with others who are already in the network.
authors from new institutions, which contributes largely Tables 1 and 2 display the distributions of authors per
to the symposium scientific strength and reachability. paper and papers per author. Most SBBD authors (71.76%)
On the other hand, the sharp decrease in the number have only one paper published throughout the years, and
of authors and papers at the most recent editions might about 28% have two or more papers. Single-paper authors
be due to different reasons. Most of all, if a researcher comprise students who most likely published with their
wants to publish in a conference, there are many options supervisors, authors who have published just once, and
for possible venues of interest. Specifically in Brazil, there employees from companies, for instance. Despite the high
are at least two clear examples of other conferences whose number of single-paper authors, some of the authors
topics of interest intersect with SBBD. One is WebMedia, are really engaged (70 authors published five or more
which was initially restricted to multimedia and hyper- papers) and have published consecutively over the years.
media systems (in Portuguese: Simpsio Brasileiro de Sis- Currently, the record holder is Marcos Andr Gonalves
temas Multimdia e Hipermdia) and then evolved to with a streak (uninterrupted period of publishing) of 11
Multimedia and the Web in 2003. As databases have also years (20042014). He is followed by three authors with
evolved to the Web, now, SBBD and WebMedia have com- streaks of 9 years: Rubens Nascimento Melo (19922000),
mon topics of interest such as Semantic Web, Linked Data, Alberto H. F. Laender (19942002), and Caetano Traina
and Ontologies. Another is KDMiLe, with its first edition Jr. (20032011). This distribution shows that only 13.25%
in 2013, specialized in topics related to data mining and of the authors have been continuously responsible for the
knowledge discovery that were mostly covered by SBBD. core publications.
Nonetheless, we can only speculate, as a true answer could Using the number of publications, Table 3 shows the
only come from interviewing the authors of papers whose top 20 authors considering the 30 years of the sympo-
topics are related to SBBD but are published elsewhere. sium, which is also illustrated in Fig. 4 by the authors
Figure 3 presents the SBBD network density over the name cloud. Then, Fig. 5 illustrates the most engaged
years. The network density is calculated by dividing the SBBD authors by the number of publications separated by
number of edges by the number of vertices present in the decades. Some authors who had a high number of publi-
graph. It achieves its highest value in the first years of the cations during the first years do not have the same partic-
symposium. This can be explained by an unusual paper ipation in the last years. For example, Rubens Nascimento
with 18 authors published in 1989 [34]. After that, as the Melo is one of the founders of the SBBD community who
number of authors increases, the density starts reducing, is already retired.
SBBD is a national symposium targeted at the Brazilian
database research community. Despite that, more than
half of its papers (50.37%) were published in English in
its 30-year history. Figure 6 shows the number of papers
All papers published in English over the years, compared with the
Minus paper with 18 authors
number of papers published in Portuguese. Note that, over
2.1
Graph density
1.8
Table 2 Distribution of papers per number of authors the evolution of research keywords extracted from both
Number of authors Number of papers Percentage (%) title and abstract of SBBD papers in the first 25 edi-
1 33 4.90
tions. For the sake of completeness, we have analyzed
the terms present in SBBD paper titles for all 30 edi-
2 262 38.87
tions. Specifically, Fig. 7 illustrates the main words found
3 189 28.04 in the titles of the papers in Portuguese (Fig. 7a) and
4 99 14.69 in English (Fig. 7b). Most words, in both languages, are
5 or more 91 13.50 typical database-related terms, which is expected since
SBBD is the official database event of the Brazilian Com-
puter Society. In order to further investigate the evolution
of these terms and visualize how Brazilian data-related
of JIDM, when all SBBD full papers were submitted in research has evolved over the last 30 years, Fig. 8 high-
English and the accepted ones published in this journal. lights the amount of papers per year associated with 15 of
This was an important effort of the SBBD community the most frequent words and terms.
to make its results more internationally visible. With an Some terms, such as databases and data, received a
international Editorial Board and a fast track submission high number of mentions over all the years. Particularly,
scheme, the JIDM effort produced very good results in database studies usually focus on different aspects of the
terms of the quality of the accepted papers, but consid- data life cycle, such as modeling, designing, and manag-
erably reduced the number of submissions to SBBD, par- ing spatial, object, distributed, and relational databases.
ticularly those by younger authors. Thus, in 2014, SBBD The appearance of specific terms such as XML, min-
resumed its traditional submission scheme, but including ing, and Web from 2000 onwards corroborates the rise
in its technical program all JIDM database-related articles of these topics in Computer Science. Figure 8 also
published in the same year. shows that some related terms do not keep high grow-
Regarding the main topics of interest covered by SBBD, ing rates. For example, the term object-oriented achieves
the study reported by Kauer and Moreira [14] presents its highest number of titles in mid 1990s, decreas-
ing its importance afterwards most probably because
the idea of an object-oriented database was replaced
Table 3 Top twenty most prolific SBBD authors ranked by by the more realistic solutions over object-relational
number of papers systems.
Author Number of papers
Collaborations and newcomers
Caetano Traina Jr. 42
In this section, we discuss the SBBD authors collaboration
Alberto H. F. Laender 37 networks from two points of view: existing collaborations
Marta L. de Queirs Mattoso 32 and newcomers (SBBD first time authors).
Marcos A. Gonalves 29 Figure 9 shows the values for mean, skewness, and
Agma J. M. Traina 27 variance of the degree distribution over the years. Both
skewness and mean tend to stabilize, with similar behav-
Wagner Meira Jr. 23
ior from 2000 on. Note that, skewness and variance values
Marco A. Casanova 21
are significantly large, indicating that some authors pos-
Altigran S. da Silva 19 sess a high degree. We note that such trends also happen
Claudia Bauzer Medeiros 17 in SBRC [22].
Rubens Nascimento Melo 17 Using Newmans relation weight scheme, the range of
Ana Carolina Salgado 16 a single collaboration in all editions of SBBD varies from
0.06 to 8.72. The former is due to (again) the unusual
Berthier A. Ribeiro-Neto 16
paper with 18 authors published in 1989, whereas the lat-
ngelo Brayner 15
ter is product of a long-term collaboration between Agma
Ana Maria de Carvalho Moura 14 J. M. Traina and Caetano Traina Jr., who coauthor 26
Valria Cesrio Times 14 papers.
Maria L. M. Campos 13 At SBBD, papers are mostly published by two authors
Ulrich Schiel 13 (38.87%). Publications from single authors took place
from 1986 to 2003, comprising only 4.90% of all papers.
Adriano Veloso 12
Moreover, the number of single-author papers (Fig. 10)
Edleno Silva de Moura 12
has decreased over the years, thus suggesting that SBBD
Jano Moreira Souza 12 authors are collaborating more. Indeed, 53.33% of the
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 7 of 16
authors who published a single-author paper have pub- tend to connect to low degree nodes, i.e., new authors with
lished at least one joint paper after it. few collaborations are increasingly tending to connect
The average degree of the network is 4.45, i.e., on to authors with higher degree. This kind of behavior is
average, an SBBD author collaborates with many distinct usually observed between students and their supervisors.
coauthors. The degree distribution (Fig. 11) follows a We also notice that, at early years, the number of new-
power-law, i.e., there are few authors with high degree (the comers connected to the largest connected component
highest being Marcos A. Gonalves with 68), and most of was close to the number of newcomers connected out-
the authors have only few collaborations. This is a com- side it, probably because this component was not large
mon feature of many complex networks, closely related to enough. Figure 13 shows that SBBD initially received
the rich-gets-richer effect [4]. several authors with no link to its largest connected com-
A further investigation into collaborations shows that ponent, but recently, it has become more common to
the SBBD network is non-assortative, with a value close join the symposium collaborating with someone already
to 0 (-0.06). Figure 12 shows the time evolution of assor- connected to it.
tativity and how predominant this kind of collaboration An important question is whether a newcomer who
is becoming. This indicates that nodes with high degree joins SBBD as a part of the LCC has more chance of
30 Portuguese
English
Number of papers
20
10
0
1990 2000 2010
Year
Fig. 6 Number of papers published in English and in Portuguese over time
reappearing than an author who joins it outside this com- Analysis of the collaboration groups
ponent. Table 4 shows the number of newcomers with at We now take a further look into the SBBD collaboration
least one reappearance in the symposium in the first and groups by using social network metrics and the k-clique
in the last 15 years. Most of the authors who have joined algorithm [30].
the symposium in the first 15 years and reappeared at least
once did not join it directly into the LCC (around 81%). Clustering coefficient
However, this has changed in recent years, as authors who The SBBD network has a high global clustering coefficient
have more than one SBBD paper usually contribute to a of 65.6%, indicating dense groups of authors who pub-
paper with a well-connected SBBD author first (62.18% lish together (PODS, for example, achieved only 35% in its
of reappearances from 2001 to 2015 are from authors 30th edition in 2011 [2]), indicating that there is a stronger
who joined the SBBD community in collaboration with triadic closure effect [12] in SBBD. Even though the num-
an author already in the LCC). In addition, the Pearsons ber of authors in the network has substantially increased,
correlation coefficient between the number of newcom- new authors tend to collaborate with coauthors that main-
ers connected to the LCC and the density of the graph tain previous collaborations, establishing triangle network
(0.46, p value = 0.009) indicates a slight positive correla- motifs. In other words, previous collaborations increase
tion. Such a result suggests that the network is becoming the chance of a pair of authors establishing a collaboration
more dense as the number of newcomers connected to the with a newcomer in the network4 .
LCC increases. This reinforces the increasing tendency of Figure 14 shows the evolution of the average shortest
the density values observed from 2000 onward in Fig. 3. path length and the clustering coefficient of SBBD, as well
(a) (b)
Fig. 7 Word clouds from the paper titles (a) in Portuguese, and (b) in English
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 9 of 16
Database
Data
System 14
Model 12
Temporal
ObjectOriented 10
Query
Relational 8
Management 6
Information
XML 4
Mining
Web 2
DBMS
Application 0
1990 1995 2000 2005 2010 2015
Fig. 8 Evolution of some of the most frequent terms over time, as collected from titles of papers both in Portuguese and English
as their equivalent random networks. A high clustering smaller components and authors who did not collaborate
coefficient (compared to its equivalent random network) with other authors. There are only 11 cases of sole authors
and a small average shortest path (as low as its equiva- (see Fig. 1), whose papers were published between 1986
lent random network) characterize the SBBD network as and 2003 (see Fig. 10). It is worth noticing that most of
a small-world network [25]. This phenomenon has long such papers are short communications describing projects
been subject of scientific studies and is typical in social developed by large technology companies (e.g., Telebrs
networks [35]. and Embratel) or present results of their authors thesis or
Table 5 shows the size (number of vertices) of the two dissertation.
SBBD largest connected components over the years, and
Fig. 15 the evolution of their relative sizes. The LCC for Formation of collaboration groups
the 30 editions has 2004 edges, with a relative size of Decomposing a complex network into groups (sets of
87.17%. It grows with an average of approximately 26 new highly connected nodes) is very important, as it may help
nodes yearly over the 30 years, 32 over the last 20 years, to understand a-priori unknown features and properties
and 41 over the last 10 years, reaching 75.53% of all nodes of the network. In this section, we focus on discovering
in 2015 (13.06% in 1995 and 36.07% in 2005). These val- and analyzing collaboration groups inside SBBD.
ues confirm that most of the SBBD authors are connected Specifically, we use the k-clique algorithm [30] that
through a single component, with the second largest con- relaxes the notion of clique and has shown great success in
nected component (SLCC) having only 22 edges, with a detecting clusters on a large scale. A collaboration group is
relative size of 0.96%. The remaining cases correspond to then defined as the maximal union of k-cliques that can be
5
30
Single author papers
20 3
Value
Mean
Skewness
Variance
2
10
1
0 0
1990 2000 2010 1985 1990 1995 2000
Year Year
Fig. 9 Three moments of the degree values over the years Fig. 10 Number of single-author papers by year
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 10 of 16
LCC
Number of newcomers
200 40
Number of nodes
other
150 30
100 20
50 10
0 0
0 20 40 60 1990 1995 2000 2005 2010 2015
Degree Year
Fig. 11 Degree frequency Fig. 13 Number of newcomers connected to the largest connected
component and outside it with at least one reappearance
6
0.6
Relative size
4 Random 0.4 LCC
Real SLCC
0.2
2
0.0
1990 2000 2010 1990 2000 2010
Year Year
(a) Fig. 15 Relative size of the largest connected component (LCC) and
second largest connected component (SLCC)
Clustering Coefficient
0.6
Table 10 Top ten closeness Table 12 h-index and number of publications of the authors
Name 30 yrs Last 10 yrs who appear in at least one of those top ten ranks: Newmans
weight, betweenness, closeness, and degree
Rubens Nascimento Melo 0.191
Name h-index # pub.
Alberto H. F. Laender 0.188 0.117 (6th)
Marcos A. Gonalves 39 29
Marco A. Casanova 0.183 0.074 (136th)
Wagner Meira Jr. 39 23
Marcos A. Gonalves 0.182 0.159 (1st)
Alberto H. F. Laender 33 37
Jos Lus Braga 0.182
Marta L. de Queirs Mattoso 31 32
Jos A. F. de Macdo 0.181 0.082 (111th)
Agma J. M. Traina 30 27
Ana Carolina Salgado 0.178 0.073 (138th)
Altigran S. da Silva 29 19
Srgio Lifschitz 0.177 0.039 (253rd)
Caetano Traina Jr. 28 42
Altigran S. da Silva 0.176 0.121 (64th)
Marco A. Casanova 28 21
Valria Cesrio Times 0.176 0.096 (2nd)
Ana Carolina Salgado 20 16
Jano Moreira Souza 18 12
Conclusions
Table 11 Most central authors per year In this article, we went over the SBBD history from dis-
Name Count Most central in tinct social perspectives. Specifically, we presented a deep
Rubens Nascimento Melo 13 1993, 19961998, 20062011, analysis of the Brazilian database community based on
20132015 the publications included in the SBBD proceedings and its
Marta Lima de Queirs Mattoso 9 1989, 19992005, 2012 associated journal JIDM. We achieved this by investigat-
Dcio Fonseca 3 1990, 1991, 1992 ing the complete SBBD co-authorship network built from
bibliographic data of SBBDs 30 editions and analyzing
Alberto H. F. Laender 2 1994, 1995
several network metrics (e.g., degree, density, and assor-
Sonia Schechtman Sette 2 1987, 1988
tativity) considering the network evolution over three
N. L. Knauth 1 1986 decades. In particular, we analyzed the involvement of
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 14 of 16
1.5
Mean
6 1.0 Skewness
Variance
Value
Value
0.5
4 Mean
Skewness
Variance 0.0
2
0.5
0 1.0
1990 2000 2010 1990 2000 2010
Year Year
(a) (b)
Fig. 17 Three moments of the (a) betweenness and (b) closeness values
the most engaged SBBD authors, the number of distinct been continuously responsible for the core publications.
authors, institutions, and published papers, and the evo- Besides, one common feature of many complex networks,
lution of some of the most frequent terms presented in closely related to the rich-gets-richer effect [4], can also
the titles of the papers. Finally, we discussed the produc- be observed in the SBBD community, as most authors
tivity, influence, and impact of SBBD authors given by have only few collaborations while there are few that
centrality measures and showed that the SBBD network correspond to high degree nodes.
follows a phenomenon typical in social networks known Our analysis demonstrates that even though the num-
as small-world. ber of authors increased, new authors tend to publish
Among our main findings, we provide evidence that the papers with coauthors that maintain previous collabo-
SBBD community is becoming more collaborative over rations, which can be associated with the high number
the years. Moreover, the increasing number of newcomers of collaboration groups with few authors found in the
is followed by an increasing number of new institutions, network. We showed that geographical location and the
which contribute to the symposium scientific strength affiliation of an author are strong factors to determine
and reachability. In spite of some authors being really which group this author belongs to in the SBBD com-
engaged and publishing consecutively for several years, munity, although collaborations between regions such as
only few authors are key nodes in the network and have Southeast and North, for instance, have many affiliated
authors who publish together in SBBD. Furthermore, the
connection of high degree nodes to low degree ones (usu-
Table 13 Top-10 institutions considering the number of distinct ally observed between students and their supervisors)
authors
is another factor that leads to collaborations between
University # Authors Region authors.
UFMG Universidade Federal 61 Southeast
de Minas Gerais
UFRGS Universidade Federal 54 South 0.9
do Rio Grande do Sul
Affiliation
UFRJ Universidade Federal 51 Southeast
do Rio de Janeiro
Gender
Homophily
Authors contributions
1.00 Male Professors MM and AP designed the social study; students LMdA and GP
performed data collection and basic statistics; GP also defined the analyses of
Female influential authors and homophily; LH was in charge of all other metrics; all
authors contributed differently in writing the paper; specially, professors AHF
0.75
Relative Size
and JP were fundamental for analyzing the SBBD evolution. All authors read
and approved the final manuscript.
Publishers Note
0.25 Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Author details
0.00 1 Universidade Federal de Minas Gerais, Belo Horizonte, Brazil. 2 Universidade
1990 2000 2010 Federal do Rio Grande do Sul, Porto Alegre, Brazil.
Year Received: 7 March 2017 Accepted: 23 June 2017
Fig. 19 Authorss gender distribution over time
References
1. Albert R, Barabsi AL (2002) Statistical mechanics of complex networks.
Despite the deep analyses of the Brazilian Database Rev Mod Phys 74(1):47
2. Ameloot TJ, Marx M, Martens W, Neven F, Van Wees J (2011) 30 years of
community based on its publications, our work could be
PODS in facts and figures. ACM SIGMOD Rec 40(3):5460
improved by further studying the geographic location of 3. Arruda D, de Lima Bezerra F, Neris VA, Toro PRD, Wainer J (2009) Brazilian
the collaborations. It would also be interesting to consider computer science research: gender and regional distributions.
analyses based on the collaboration network of mem- Scientometrics 79(3):651665
4. Barabsi AL, Albert R (1999) Emergence of scaling in random networks.
bers of SBBD program committees, equally studying the Science 286(5439):509512
existing members and appearance of newcomers through 5. Bazzan ALC, Argenta VF (2011) Network of collaboration among PC
the years. Finally, we could also investigate how SBBD members of Brazilian computer science conferences. J Braz Comput Soc
17(2):133139
researchers contribute to other communities, similar to 6. Costa LdF, Rodrigues FA, Travieso G, Villas Boas PR (2007) Characterization
the broader study of Silva et al. [32]. of complex networks: a survey of measurements. Adv Phys 56(1):167242
7. Digiampietri LA, Mena-Chalco JP, Vaz de Melo PO, Malheiro APR, Meira
Endnotes DN, Franco LF, Oliveira LB (2014) BraX-Ray: an x-ray of the brazilian
computer science graduate programs. PLoS ONE 9(4):e94541
1
BDBComp: http://www.lbd.dcc.ufmg.br/bdbcomp 8. Duarte A, Jnior MLdM, Souza J, De Brito AV, Trinta FAM, Viana R (2014)
2 WebMedia XX: who we are and what we have done in the last two
JIDM: https://seer.ufmg.br/index.php/jidm/index decades. In: Kulesza R, Tavares TA (eds). Proceedings of the 20th Brazilian
3
Pearsons correlation coefficient provides a measure Symposium on Multimedia and the Web, WebMedia 2014, Joo Pessoa.
ACM. pp 9198. http://doi.acm.org/10.1145/2664551.2664576
of the strength of a linear association between two 9. Easley DA, Kleinberg JM (2010) Networks, crowds, and markets: reasoning
variables [17]. about a highly connected world. Cambridge University Press, New York.
4 http://www.cambridge.org/gb/knowledge/isbn/item2705443/?
A similar case of the friend of my friend is also my site_locale=en_GB
friend phenomenon found in social networks [28]. 10. Freire VP, Figueiredo DR (2011) Ranking in collaboration networks using a
group based metric. J Braz Comput Soc 17(4):255266
5
The h-index [13] of an author is the highest number of 11. Gomes JS, da Mota Silveira Neto PA, Cruzes DS, de Almeida ES (2011) 25
her publications with at least that many citations, e.g., an years of Software Engineering in Brazil: an analysis of SBES history. In: 25th
Brazilian Symposium on Software Engineering, SBES 2011, So Paulo. IEEE
author with 10 papers with at least 10 citations each has Computer Society. pp 413. doi:10.1109/SBES.2011.11
12. Granovetter MS (1973) The strength of weak ties. Am J Sociol
h-index of 10. 78(6):13601380
6
Google Scholar: https://scholar.google.com.br 13. Hirsch JE (2005) An index to quantify an individuals scientific research
output. Proc Natl Acad Sci USA 102(46):16569
Abbreviations 14. Kauer AU, Moreira VP (2013) Evoluo dos Temas de Interesse do SBBD ao
CS: Computer Science; CSBC: Congress of the Brazilian Computer Society; Longo dos Anos. In: Fileto R, Cristo M (eds). XXVIII Simpsio Brasileiro de
JIDM: Journal of Information and Data Management; KDMiLe: Symposium on Banco de Dados - Short Papers. SBC, Recife. pp 25:125:6. http://www.lbd.
Knowledge Discovery, Mining, and Learning; LCC: Largest connected dcc.ufmg.br/colecoes/sbbd/2013/0025.pdf
component; PODS: Symposium on Principles of Database Systems; SBBD: 15. Laender AHF, Gonalves MA, Roberto PA (2004) BDBComp: building a
Brazilian Symposium on Databases; SBC: Brazilian Computer Society; SBRC: digital library for the Brazilian computer science community. In: Chen H,
Brazilian Symposium on Computer Networks and Distributed Systems; SLCC: Wactlar HD, Chen C-C, Lim E-P, Christel MG (eds). ACM/IEEE Joint
Second largest connected component; SNA: Social networks analysis; Conference on Digital Libraries, JCDL 2004. ACM, Tucson. pp 2324.
WebMedia: Brazilian Symposium on Multimedia and the Web doi:10.1145/996350.996357
16. Laender AHF, de Lucena CJP, Maldonado JC, de Souza e Silva E, Ziviani N
Funding (2008) Assessing the research and education quality of the top Brazilian
The research was partially funded by CNPq, PRPq/UFMG, and FAPEMIG - Brazil. Computer Science graduate programs. SIGCSE Bull 40(2):135145
Costa de Lima et al. Journal of the Brazilian Computer Society (2017) 23:10 Page 16 of 16