Demography of Altmetrics under the Light of Dimensions: Locations, Institutions, Journals, Disciplines and Funding Bodies in the Global Research Framework

The interconnection between the Dimensions database and Altmetric.com provides an opportunity to carry out a worldwide analysis on altmetrics coverage of scientific literature, analyzing the percentage of documents with altmetric mentions not only in general (indexed documents), but also filtered according to different units of analysis. In order to do so, the Dimensions Pro version database was directly used to retrieve 97,531,400 documents, which were subsequently filtered to obtain the top journals, countries, cities, institutions, research fields, and funding bodies according to the total number of publications indexed in the database. For each entity and year of publication (from 2000 to 2017), the corresponding percentage of publications cited and the Altmetric Attention Score (% mentioned) were calculated. The main results indicate that the total number of publications with an Altmetric Attention Score (AAS) of one or over one is low (9.4% out of the total coverage), which has been highly concentrated in recent years, and higher for open access documents (18.9%), showing an open access altmetric advantage. Otherwise, English-speaking universities stand out, which determines an increase in the presence of specific cities from Anglo-Saxon countries, diminishing the presence in Japan, China, Russia, or India, despite their elevated productivity. Multidisciplinary and medicine-related journals are also highlighted, which in turn influences the research disciplines with a higher AAS (% mentioned): genetics, immunology, microbiology, or medical microbiology. However, since the conducted analysis has brought out some inconsistencies in the quality of the data, results must be taken with caution.


Introduction
Identifying, describing, and analyzing the main advantages and disadvantages of social media metrics for research assessment constitutes one of the main activities within the field of altmetrics (Haustein 2016).
While Wouters and Costas (2012) highlight the main advantages of altmetrics through four dimensions (broadness, diversity, speed, openness), Priem (2014) points out a series of disadvantages, including the lack of theory, ease of gaming, and possible biases. Bornmann (2014) expands the taxonomy of limitations to data quality (bias, target, multiple versions, different meanings, measurement standards, mention standards, cross-field and time normalization, and replication), missing evidence, and manipulation. Likewise, Haustein (2014) suggests the representativeness of altmetrics (which might be included within the 'data quality' category) as one of the main limitations of altmetrics.
The representativeness issue can be treated from different perspectives, namely the population of active users using social platforms (number of users), the order of magnitude of data collected by the direct sources of metrics (amount of data generated), the actual coverage of data obtained by altmetrics data providers (amount of data identified), and the coverage of documents indexed by altmetrics providers (percentage of published documents that are both mentioned and identified). This last issue represents the objective of this study. Haustein et al. (2014a) performed one of the first studies aimed at determining the coverage of documents with alternative metrics through Altmetric.com, obtaining a coverage percentage of 45.2% for a sample of 84,374 documents deposited in arXiv.org. Later, Robinson-García et al. (2014) analyzed a corpus consisting of 2,792,706 articles (published between 2011 and 2013 with a digital object identifier (DOI) and indexed in the Web of Science), estimating the coverage at 19%, although with important differences according to the source, Twitter being the main source (87.1% of articles),

Research questions
In relation to the worldwide altmetrics coverage of scientific literature, the following research questions are addressed:

RQ1. What is the total coverage of academic documents with altmetric mentions at present? What is the evolution of this coverage like over time?
RQ2. What places in the world (countries, cities) show a higher percentage of documents with altmetric mentions? What is the evolution of this coverage like over time?
RQ3. What institutions show a higher percentage of documents with altmetric mentions? What is the evolution of this coverage like over time?
RQ4. What journals show a higher percentage of documents with altmetric mentions? What is the evolution of this coverage like over time?
RQ5. What research categories show a higher percentage of documents with altmetric mentions? What is the evolution of this coverage like over time?
RQ6. What funding bodies show a higher percentage of documents with altmetric mentions? What is the evolution of this coverage like over time?
RQ7. Is Dimensions an accurate bibliographic tool to carry out an analysis of the worldwide penetration of altmetrics, according to different units of analysis?

Method
The Dimensions Pro database directly provides structured information about the set of documents that match each specific query, including the number of publications, number of citations, citations per publication, the Relative Citation Ratio (RCR) Mean, the Field Citation Ratio (FCR) Mean, the percentage of articles cited, and the percentage of publications with an Altmetric Attention Score of one or higher, hereinafter referred to as the AAS percentage.
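As an illustration of the definition only (Dimensions reports this percentage directly, so the study did not need to compute it), the AAS percentage of a set of publications could be sketched as follows. The function name, the record layout, and the sample scores are hypothetical.

```python
def aas_percentage(scores):
    """Share (in %) of publications whose Altmetric Attention Score is >= 1.

    `scores` holds one value per publication; None stands for a publication
    without any recorded score (an assumption of this sketch).
    """
    if not scores:
        return 0.0
    mentioned = sum(1 for s in scores if s is not None and s >= 1)
    return 100.0 * mentioned / len(scores)


# Hypothetical scores grouped by (entity, publication year).
records = {
    ("Journal A", 2000): [0, 0, 0, 1, 0],
    ("Journal A", 2017): [0, 3, None, 12],
}
result = {key: aas_percentage(values) for key, values in records.items()}
```

In this toy input, one of five publications from 2000 and two of four from 2017 reach a score of one or higher.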
In order to address RQ1, we defined a global query (97,531,400 documents; time interval: 1665 to 2018). This query was subsequently filtered according to the open access level (all, publisher, and repository). Finally, all publications between 2000 and 2017 (52,048,103 documents) were considered as the study sample.
In regard to RQ2, RQ3, RQ4, RQ5, and RQ6, we selected the top 50 journals, countries, cities, institutions, and funding bodies according to the total number of publications indexed in the database. In the case of disciplines, all of them (152) were gathered in order to minimize potential biases in the results.
Then, for each entity (journals, countries, cities, institutions, funding bodies, and research categories), the annual number of publications from 2000 to 2017 was additionally gathered. Finally, for each entity and year of publication, the corresponding percentage of publications cited and the AAS percentage were calculated.
In the case of documents with multiple authors, binary counting was used to quantify the locations (countries and cities) and institutions. That is, each location and institution was counted once per publication.
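The binary counting rule can be sketched as below. This is a minimal illustration under assumed inputs: each publication is represented as a list of country codes, one per author affiliation, and the sample data are hypothetical.

```python
from collections import Counter


def binary_count(publications):
    """Count each location at most once per publication (binary counting)."""
    counts = Counter()
    for affiliations in publications:
        # set() collapses repeated locations within the same publication.
        counts.update(set(affiliations))
    return counts


# Hypothetical publications; each inner list is one author-affiliation list.
pubs = [
    ["US", "US", "UK"],  # two US authors still count as one US publication
    ["US"],
    ["JP", "UK", "UK"],
]
counts = binary_count(pubs)
```

With this rule a publication co-authored by several researchers from the same country adds one to that country's total, while an internationally co-authored publication adds one to each participating country.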
Lastly, with regard to RQ7, we took into account various procedures to test the reliability of the database:
- Data availability was analyzed by checking the number of records without the institutional or discipline fields. To do this, an SQL query to the database was performed using the in-house version of Dimensions made available by the Centre for Science and Technology Studies (CWTS) as of July 2018.
- Data volatility was tested by performing a retroactive analysis of Dimensions. All the same queries were repeated twice (July and October 2018) and directly compared (top 25 entities per entity type were used for this purpose).
- Data indexing was finally checked by comparing the number of articles indexed for the set of top 50 journals considered in Dimensions, with the number of articles indexed for the same journals (when available) in the Web of Science.
All data were extracted manually and then statistically analyzed with XLStat. The first sample was gathered in July 2018 and the second sample in October 2018. Data were analyzed in November 2018. The raw results per entity are available in supplementary file 1.

Global coverage
Considering the total coverage of Dimensions as of October 2018 (97,531,400 publications), the percentage of documents (all typologies) with an Altmetric Attention Score amounts to 9.4. The percentage of publications with AAS remains stable around 8 from 2000 to 2010, experiencing a notable increase from 2012 (14.8) to 2017 (22.5) (Figure 1). This growth can be directly related to the launch of Altmetric.com, the company that provides Dimensions with altmetric data, as well as an increase in the usage of academic social network platforms by researchers, already foretold by different Nature surveys (Van Noorden 2014; Harseim & Goodey 2017).
If we disaggregate the results according to each of the units of analysis considered (top 50 countries, cities, institutions, journals, disciplines, and funding bodies), we can see that the total AAS percentage (considering the complete coverage in the database) varies depending on the type of entity analyzed (Table 2). While it is more homogeneous for countries and cities (standard deviation of 4.9 and 5.8, respectively), it is more dispersed for funding bodies and, especially, for journals. This effect can be visualized in the box plots produced for each entity (Figure 2).
The global peak achieved in recent years (see Figure 1), together with the growth in raw academic publication output, may distort to some extent the overall percentage of documents with altmetrics, as well as the specific values obtained at the unit level. To check this, for each of the units of analysis considered (top 50 countries, cities, institutions, journals, disciplines, and funding bodies), we calculated the Spearman correlation between the results obtained in 2017 and those obtained both at the beginning of the analyzed period (2000) and in total (from 1665). As we can observe, the results (both for publications and AAS percentage) related to 2017 strongly correlate with the total value for all units except for journals. Conversely, the correlation between 2017 and 2010 is weaker. A reasonable explanation is the striking concentration of publications and altmetrics in 2017.
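For readers who wish to reproduce this kind of comparison, a Spearman correlation can be sketched in plain Python as the Pearson correlation of average ranks. The study itself used XLStat; the function names and the sample percentages below are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical AAS percentages for five entities (2017 vs. whole database).
aas_2017 = [12.0, 30.0, 5.0, 48.0, 21.0]
aas_total = [6.0, 15.0, 2.0, 20.0, 11.0]
rho = spearman(aas_2017, aas_total)
```

Because the two hypothetical series order the entities identically, the coefficient is 1 up to floating-point error, even though the percentages themselves differ.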

Journals
The coverage of journals (and articles published by each journal) constitutes the fundamental piece on which the coverage of any bibliographic database is sustained and, therefore, will determine the coverage of documents with altmetrics (both total and broken down by aggregated entities) that Dimensions will show. PLoS One is the journal with the highest number of articles published with an AAS (approximately 131,508 documents), followed by the Proceedings of the National Academy of Sciences (PNAS) (approximately 74,312 documents). Table 5 contains the journals with the highest and lowest total AAS percentages, the AAS percentage in 2017, the statistical range (AAS percentage in 2017 minus AAS percentage in 2000), the standard deviation (SD) (from 2000 to 2017), the total number of publications with an AAS (P_alt), and the ranking position of the journal according to the total number of articles indexed in the database.
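The range and SD columns described above can be sketched as follows. The text does not state whether the sample or the population standard deviation was used; this sketch assumes the sample version (`statistics.stdev`), and the yearly series is hypothetical.

```python
from statistics import stdev


def range_and_sd(aas_by_year, first=2000, last=2017):
    """Return (AAS% in `last` minus AAS% in `first`, SD over first..last)."""
    values = [aas_by_year[year] for year in range(first, last + 1)]
    # Assumption: sample standard deviation; the source does not specify.
    return aas_by_year[last] - aas_by_year[first], stdev(values)


# Hypothetical journal whose AAS percentage grows 2 points per year.
series = {year: 10.0 + 2.0 * (year - 2000) for year in range(2000, 2018)}
value_range, sd = range_and_sd(series)
```

Note that the range as defined here is a signed difference between the last and first year, not the spread between the maximum and minimum values of the series.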
Among the 50 journals with the most indexed publications, nine obtain a total AAS percentage value less than one, where ChemInform (0 articles with an altmetric attention score) especially stands out for being the journal with the most articles indexed in Dimensions (791,868), though it ceased its publication in 2017.
Apart from the obvious differences among disciplines (see Research Categories section), multidisciplinary journals show higher performance (PLoS One and PNAS), especially in recent years. In particular, Nature (97.8) and Science (95.7) are the journals with the highest AAS percentage in 2017 (Figure 3).

Institutions
Harvard University stands as the institution not only with the highest number of publications with AAS, but also with the highest total AAS percentage (Figure 4). We can also observe a predominance of North American universities (7 out of the 10 institutions with the highest total AAS percentage are from the USA). University College London (UK) should also be pointed out because this institution achieved the highest AAS percentage in 2017 (70.5 of its published documents have an AAS of one or higher), considering the top 50 institutions with the highest productivity in Dimensions.
On the contrary, Japanese institutions achieve low AAS percentages despite their elevated productivity, especially Osaka University (14th position in total productivity, with an AAS percentage of 13.1) and the University of Tokyo (1st position in total productivity, with an AAS percentage of 14.7) (Table 6).
The top 10 universities according to the total AAS percentage are included in Figure 4 so that we can observe the evolution of their AAS percentage over time (2000 to 2017). We can notice a similar pattern as previously observed for the global coverage (see Figure 1), with one notable increase in the AAS percentage located in 2012, which marks a growing trend until 2017.

Geographies
Given the total productivity, it is not surprising to confirm that the United States (2,999,786 documents) and the United Kingdom (907,637 documents) are the countries with the most publications with AAS. However, Australia stands out as the country with the highest total AAS percentage (24.6), followed by Denmark (24.4) (Table 7), considering only the top 50 countries according to the total productivity.
A world map (both for publications with AAS and for total AAS percentage) is offered in Figure 5. Saint Kitts and Nevis (50.6) and  show the highest total AAS percentages in the world due to a statistical artefact (scarce productivity). For this reason, only countries with a minimum threshold in productivity should be considered when analyzing the AAS percentage.
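The effect of a minimum productivity threshold can be sketched as follows. The AAS percentages are those quoted in the text, whereas the publication counts and the threshold value are hypothetical placeholders chosen only to illustrate the filter.

```python
def rank_by_aas(entities, min_publications):
    """Rank entities by AAS percentage, ignoring low-output entities."""
    eligible = [e for e in entities if e["publications"] >= min_publications]
    return sorted(eligible, key=lambda e: e["aas_pct"], reverse=True)


# AAS percentages from the text; publication counts are hypothetical.
countries = [
    {"name": "Saint Kitts and Nevis", "publications": 500, "aas_pct": 50.6},
    {"name": "Australia", "publications": 1_500_000, "aas_pct": 24.6},
    {"name": "Denmark", "publications": 400_000, "aas_pct": 24.4},
]

# Hypothetical threshold; the study instead restricts itself to the top 50.
ranked = rank_by_aas(countries, min_publications=10_000)
names = [entity["name"] for entity in ranked]
```

Without the filter, the low-output country would head the ranking; with it, the ordering matches the one reported for the top 50 countries.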
Data shows China (3rd position in total productivity), Japan (5th position), India (9th position) and Russia (14th position) achieving low AAS percentages (13.3, 11.1, 11.6, and 6.3, respectively). This low impact on social media metrics, not reflected in the number of citations received (Table 8), might be associated with a lower impact of non-English contents on Twitter, the main carrier of altmetric mentions, as well as the usage of Twitter in these countries.
The 50 cities in the world with the highest number of publications can be visualized in Figure 6. Cambridge (Massachusetts, US) achieves the highest AAS percentage in the sample, though there are doubts about whether it should be included as part of Boston (3rd position). As with institutions, cities in the United States take the first positions (Table 9), whereas a lack of visibility is detected in Japan (Tokyo is the most productive city in the sample, yet it shows an AAS percentage of 12.9), Russia (Moscow holds the lowest AAS percentage, 6.9), and China (Beijing is in the 3rd position regarding total productivity, but has a total AAS percentage of 14.6). The presence of other cities (such as Seattle in the United States) can be related to their elevated publication output in highly-cited research disciplines (clinical sciences, public health, and biochemistry).

Research categories
Genetics (31.6) and public health and health services (31) constitute the research categories with the highest total AAS percentages in the sample. On the contrary, we find fields related to mathematics (applied mathematics, 4.8; pure mathematics, 4.6; numerical and computational mathematics, 4.4) in the lowest positions (Table 10). In addition, low values are identified for engineering fields (material engineering, 8.9; electrical and electronic engineering, 7.8; communication technologies, 7.3; interdisciplinary engineering, 6.5) and computer sciences (computer software, 7.5; artificial intelligence and image processing, 8.8), or even combined fields (computation theory and mathematics, 6.8).
As with the previously analyzed remaining entities, an increase in the number of publications with altmetrics occurred in 2012. The evolution of the five research fields with higher and lower AAS percentages in 2017 and the number of publications is offered in Figure 7. As we can observe, the AAS percentage of these disciplines is between 5 and 25 in the period before 2012, while in 2017, the differences among fields have been evidenced, from communication technologies (4.9) to genetics (62.6).
The AAS percentage for each of the 22 fields in which Dimensions integrates research categories is available in Table 11. The results obtained not only reinforce previous findings (high values for medicine and biology-related disciplines; low results for mathematics, engineering, and computer sciences), but also provide an overall picture of all disciplines, locating humanities and social sciences in the global framework, with unexpectedly high values for education (comprising 4 research categories) and studies in human society (covering 9 research categories).
In the case of Europe, we can observe a difference between the European Research Council (second highest total AAS percentage) and the European Commission (26th position). Their evolution (from 2000 to 2017) is available in Figure 8.

Dimensions data
a) Data availability
Although Dimensions offers statistics for any query, including a null query (i.e., the whole database), the AAS percentage (percentage of documents returned by a query that have an Altmetric Attention Score of one or above) for the aggregated entities analyzed depends firstly on the coverage of documents and, secondly, on the information extracted from each document.
In this sense, the in-house version of Dimensions as of July 2018 covers a total of 40,711,747 publications published between 2000 and 2017. Of these, only 55.1% (22,433,285) have an associated research category field, and an affiliation field appears for 53.8% (21,884,709).
In order to delve into this issue, we compared the number of articles indexed by these journals per year both in Dimensions and the Web of Science. Out of the 50 journals in the sample (those with the most total publications indexed in Dimensions), 8 are not indexed in the Web of Science.
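The field availability shares reported in this subsection (55.1% and 53.8%) can be checked directly from the record counts given in the text. The helper below is a minimal sketch; its name is hypothetical and only the three counts come from the source.

```python
def coverage(with_field, total):
    """Return (% of records with the field, % without it), one decimal."""
    present = 100.0 * with_field / total
    return round(present, 1), round(100.0 - present, 1)


TOTAL = 40_711_747  # publications from 2000 to 2017 in the in-house version

category_present, category_missing = coverage(22_433_285, TOTAL)
affiliation_present, affiliation_missing = coverage(21_884_709, TOTAL)
```

The complementary shares (records lacking a research category or an affiliation) are the figures that the Discussion later quotes as 44.9% and 46.2%.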
When a journal changes its name from Name 1 to Name 2, Dimensions merges the articles of both titles under the bibliographic record corresponding to Name 2. For this reason, in order to compare the output in Dimensions with that offered by Web of Science, we need to locate all previous journal names in WoS and then merge their production, so that the total volume of publications indexed in both databases can be compared. This issue was detected in the following journals of the analyzed sample: Biochimica et Biophysica Acta, Angewandte Chemie, The Lancet, Physical Review A-general physics, Physical Review B-solid state, British Medical Journal, and Analytical and Bioanalytical Chemistry.
The Spearman correlation of all articles indexed by each journal is statistically significant but unexpectedly low (0.54; p-value: <0.001) and increases when only the period 2000 to 2017 is considered (0.87; p-value: <0.0001). A scatter plot comparing the ranking position of journals according to the total number of articles indexed in Dimensions with the ranking position that these journals occupy in the Web of Science confirms that the two databases are offering a different coverage of the academic output (Figure 9). As we can observe, it seems the number of articles is somewhat inflated in Dimensions, with three clear outliers (Notes and Lectures, Scientific American, and Journal of Geophysical Research).
Note: Each document can be assigned to more than one research category.
Note: The field average of Documents Cited and Documents with Altmetric Attention Score is calculated through the Cited (%) and AAS (%) of each research category within the field. Therefore, results should be considered as approximate indicative numbers.
The case of the Journal of Geophysical Research highlights an indexing problem. This journal was gradually divided into several sections (each of them with a distinctive ISSN). In Web of Science, there are 17,071 articles under the general "Journal of Geophysical Research" name (stopped at 1985) and 98,472 publications if we consider all the remaining articles in all the current journal sections. In Dimensions, we can also find a record for the general journal, as well as for each of the seven sections. However, the assignment of articles to the general journal shows inconsistencies (see Figure 10), which inflates the number of articles indexed in the old journal.
Finally, a retroactive growth analysis of Dimensions has been carried out. As we can observe in Table 13, the database grows significantly from July to October when we analyze the same years.
This effect is more pronounced in some years (especially 2007), probably due to the indexing of new journals. However, the AAS percentage variation is low and only slightly meaningful in 2017 (global decrease of 0.7 points).
The effect of the retroactive growth per type of entity is low (average variation of countries: 0.24; cities: 0.36; institutions: 0.24; journals: 0.26; disciplines: 0.27). However, some outliers are found. The maximum variation for each entity is the following:
• Countries: maximum variation of 11.3, detected for the USA in 2017 (43.9 in July; 55.2 in October).
• Cities: maximum variation of 2.5, detected for Ann Arbor in 2017 (49.1 in July; 51.6 in October). Moreover, erratic variation throughout the whole period is detected.
• Disciplines: maximum variation of 2, detected for neurosciences in 2016 (59.4 in July; 56.4 in October).
• Journals: maximum variation of 9.2, detected for New England Journal of Medicine in 2011 (58.5 in July; 67.7 in October). Moreover, erratic variation from 2011 to 2017 is detected.
• Universities: maximum variation of 2, detected for University College London in 2017 (68.5 in July; 70.5 in October).
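The volatility check behind these figures amounts to comparing two snapshots of the same (entity, year) values and locating the largest absolute change. The sketch below uses three of the July and October pairs reported above, pooled across entity types purely for illustration (the text reports one maximum per entity type); the function name is hypothetical.

```python
def max_variation(july, october):
    """Return the key with the largest absolute change between snapshots."""
    shared = july.keys() & october.keys()
    diffs = {key: abs(october[key] - july[key]) for key in shared}
    worst = max(diffs, key=diffs.get)
    return worst, diffs[worst]


# AAS percentages for 2017 as reported in the text (July vs. October 2018).
july = {
    ("USA", 2017): 43.9,
    ("Ann Arbor", 2017): 49.1,
    ("University College London", 2017): 68.5,
}
october = {
    ("USA", 2017): 55.2,
    ("Ann Arbor", 2017): 51.6,
    ("University College London", 2017): 70.5,
}
entity, delta = max_variation(july, october)
```

Taking absolute differences treats increases and decreases alike, which suits a stability check where the direction of the change matters less than its size.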

Discussion
The results show certain limitations of Dimensions data that can jeopardize the main purpose of discovering the coverage of documents with altmetrics mentions (measured through the number of documents with an Altmetric Attention Score of one or above one).
The percentage of publications without an affiliation field is high (46.2%). This parameter is important because information about institutions, cities, and countries is extracted precisely from the affiliation field. Moreover, some inconsistencies (inflated numbers of documents indexed per journal, unusual annual indexing rates, errors in assigning articles to the right journal) may change the ranking of the most productive journals in the database. Likewise, the number of publications without a research category assigned is also high (44.9%). Moreover, publication categorization, performed at the article level instead of the journal level, has been shown in the literature to exhibit some inconsistencies (Orduna-Malea and Delgado Lopez-Cozar 2018; Bornmann 2018). Finally, the retroactive growth causes some minor variations in the Altmetric Attention Score percentage, which may affect the results of specific entities depending on when the data were collected.
Nonetheless, despite some particular exceptions, the results offered are plausible and reflect some general well-known patterns (see Results section). Moreover, because a specific period of time (2000 to 2017) and specific entities (top 50 entities per type) are considered, the error rate is minimized and we were able to concentrate specifically on the years where altmetric activity is higher (2012 onwards). In this sense, the research questions established in this work can be answered in a general way, and the answers should be considered with caution.
Apart from Dimensions (the database is continuously growing and improving its functionalities), other external variables may bias the results obtained.
Firstly, the percentage of documents with altmetric mentions is gathered via one specific data provider (Altmetric.com), whose results may differ from those obtained by other data providers, such as PlumX (Zahedi & Costas 2018). Also, Altmetric.com only uses mentions driven by DOI. This method potentially disregards mentions of publications without DOIs, an aspect already discussed in the literature (Weller et al. 2011; Mahrt et al. 2012). Moreover, not all publications have a DOI. Gorráiz et al. (2016) estimate that 10% of articles in Web of Science (2005 to 2014) in sciences and social sciences do not have DOIs, and that the share of articles with DOIs is much lower for humanities (exceeding 50% only since 2013). For this reason, all scores about altmetric mentions via Altmetric.com can be considered an underestimation of the real value.
Secondly, publication coverage in Dimensions is wider than in Web of Science and Scopus. At the time of writing this study, Dimensions includes 10,180,612 book chapters (with an AAS percentage of 1) and 375,080 books (AAS percentage of 8.9). This coverage definitely affects all comparisons with altmetric coverage performed previously. For example, while Torres-Salinas et al. (2018) identified University Pompeu Fabra as the Spanish public university with the highest percentage of documents (from 2014 to 2016) in Altmetric.com (71%), the Altmetric Attention Score percentage in Dimensions for the same period is 62.3%. Despite the different methods used (the percentage of documents included in Altmetric.com does not necessarily coincide with the percentage of documents with an Altmetric Attention Score of one or above), the results would be expected to be closer. Therefore, the wider coverage of Dimensions offers a new perspective on the coverage of documents with altmetrics.
Thirdly, there are external variables that have been shown to bias the reception of altmetrics by publications (Sugimoto et al. 2017). Non-biomedical disciplines (Haustein et al. 2014b; Holmberg & Thelwall 2014; Ortega 2018; Zahedi et al. 2014), disciplinary journals, Latin-American countries (Alperin 2015), and, in general, older publications statistically obtain lower social media metrics than biomedical disciplines, multidisciplinary journals, English-speaking countries, and recent publications. All these previous conclusions are in line with the results obtained via Dimensions, which reinforces its reliability.

RQ1. Total coverage of publications with altmetrics
The total number of publications with an Altmetric Attention Score (AAS) of one or above is low (9,167,952 documents; 9.4% of the total coverage) and highly concentrated in recent years (from 2012 onwards), especially 2017, which contains 10.6% of all the documents with AAS. The percentage of documents with an AAS is higher (18.9%) when only open access documents are considered (an Open Access Altmetric advantage).