Definitions of the e-index

In what follows, we study only the citations received by papers in the h-core, i.e., the researcher's papers having at least h citations [18]. Using the h-index, the only citation information that can be inferred is $\sum_{j=1}^{h} cit_j \ge h^2$, i.e., at least $h^2$ citations have been received; additional citations for papers in the h-core are completely ignored. Here we define the e-index to complement the h-index for the ignored excess citations. The excess citations received by all papers in the h-core, denoted by $e^2$, are

$$e^2 = \sum_{j=1}^{h} (cit_j - h) = \sum_{j=1}^{h} cit_j - h^2,$$

where $cit_j$ are the citations received by the jth paper and $e^2$ denotes the excess citations within the h-core. Taking the non-negative square root, we have

$$e = \sqrt{\sum_{j=1}^{h} cit_j - h^2}.$$

Note that $e \ge 0$, and $e$ is a real number.
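The definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `h_index` and `e_index` and the toy citation list are assumptions made here, not part of the original paper.

```python
import math

def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def e_index(citations):
    """Square root of the excess citations in the h-core: e^2 = sum(cit_j) - h^2."""
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    excess = sum(ranked[:h]) - h * h  # never negative: each h-core paper has >= h citations
    return math.sqrt(excess)

# Hypothetical citation counts for one researcher (illustration only).
toy = [10, 8, 5, 4, 3]
print(h_index(toy))       # 4
print(e_index(toy) ** 2)  # close to 11: (10 + 8 + 5 + 4) - 4^2
```

In this toy case the h-index records only that at least 16 citations exist in the h-core, while the e-index accounts for the remaining 11.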
Accordingly, $\sum_{j=1}^{h} cit_j = h^2 + e^2$.

A geometrical explanation of the e-index

Without loss of generality, we assume that $cit_j$, ![equation e013](picrender.fcgi?artid=2673580&blobname=pone.0005429.e013.jpg), can be represented by a smooth function $cit(x)$, ![equation e015](picrender.fcgi?artid=2673580&blobname=pone.0005429.e015.jpg), where ![equation e016](picrender.fcgi?artid=2673580&blobname=pone.0005429.e016.jpg), and ![equation e017](picrender.fcgi?artid=2673580&blobname=pone.0005429.e017.jpg), ![equation e018](picrender.fcgi?artid=2673580&blobname=pone.0005429.e018.jpg). Based on the function $cit(x)$, we give a geometrical explanation of the above formulas.
Replacing the sum of citations by the area under the smooth citation curve over the h-core and subtracting the $h^2$ citations of the $h \times h$ square leaves $e^2$, i.e., $e^2$ is equal to the area of the dark gray region in Figure 1.

![Figure 1](picrender.fcgi?artid=2673580&blobname=pone.0005429.g001.gif)

Figure 1. A geometrical explanation of the e-index.
I emphasize that $e$ is independent of h, and $e^2$ represents the net excess citations received by all papers in the h-core, in addition to the $h^2$ citations. Note that the larger $e$ is, the larger the net excess citations, and hence the more severe the loss of citation information when the h-index is used alone. In other words, when the h-index is used to evaluate individual scientists, the smaller the e, the more reliable the h-index is. In the extreme case $e = 0$, which is highly unlikely in reality, the h-index completely describes the citation information for papers in the h-core. Otherwise, when $e > 0$, the h-index always loses citation information, which is complemented by the e-index.

Numerical relations between the e-index and some other h-type indices

The relations between the e-index and some other h-type indices, including the a-index [7] and the R-index [14], are presented briefly as follows.
A plane spanned by h and e is called the h-e plane (Figure 2). A point P(e, h) in the h-e plane represents the overall information on citations received by all papers in the h-core. It is interesting to point out that the Euclidean distance between the origin and the point P(e, h) is equal to

$$\sqrt{h^2 + e^2} = \sqrt{\sum_{j=1}^{h} cit_j} = R,$$

so the R-index is here given a geometrical meaning (Figure 2).

![Figure 2](picrender.fcgi?artid=2673580&blobname=pone.0005429.g002.gif)

Figure 2. A Cartesian coordinate system for the h-e plane.
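As a rough illustration of this geometric reading, the sketch below treats (e, h) as a point in the plane and computes R as its distance from the origin. It assumes the usual definition of the R-index as the square root of the total citations in the h-core. The function names and the toy h-core are hypothetical. It also shows that two different (h, e) pairs can share one value of R, so R alone does not determine h and e.

```python
import math

def r_from_plane(h, e):
    """Distance from the origin to the point P(e, h), i.e. sqrt(h^2 + e^2)."""
    return math.hypot(e, h)

def r_from_citations(h_core_citations):
    """R-index taken as the square root of the total citations in the h-core."""
    return math.sqrt(sum(h_core_citations))

# Hypothetical h-core: 4 papers, each with at least 4 citations.
core = [10, 8, 5, 4]
h = len(core)
e = math.sqrt(sum(core) - h * h)  # excess citations: 27 - 16 = 11

# Both routes give the same R (about 5.196, the square root of 27).
print(r_from_plane(h, e), r_from_citations(core))

# Two different points at the same distance from the origin.
print(r_from_plane(3, 4), r_from_plane(4, 3))  # 5.0 5.0
```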
From eq. (3), it is found that

$$a = \frac{1}{h}\sum_{j=1}^{h} cit_j = h + \frac{e^2}{h}.$$

The four indices being discussed, h, e, a and R, can be divided into two types: fundamental and derived. A fundamental index satisfies the following conditions: (i) it is an independent variable; (ii) it can be used to derive other indices. Here h and e are fundamental indices, because they are independent of each other and can be used to derive a and R. In contrast, a and R are derived indices, because they depend on h and e, and neither h nor e can be recovered from a or R alone. Let

$$f = \frac{e^2}{h^2},$$

where f denotes the fold of excess citations over the $h^2$ citations received by papers in the h-core. The total citations received in the h-core are equal to $(1 + f)h^2$, as shown in Figure 1. Therefore, the combination $(h, e)$ provides complete citation information in the h-core. Because a and R are derived indices, they carry information that is redundant with h. When a or R is used together with h, this redundancy masks the f values, i.e., the apparent fold of excess citations over the $h^2$ citations is smaller than the real one, which will be exemplified by comparisons of citations for some scientists in the following section.

Comparison of the academic performance of scientists within an isohindex group

The mapping points P(e, h) can only be situated on the horizontal lines in the h-e plane with h = 1, 2, …, H, where H is the largest value of the h-index in a given group of scientists. All of the points on the same horizontal line have an identical h-index.
For convenience, such a horizontal line is called an isohindex line, where “isohindex” denotes an identical h-index. One such isohindex line is shown in Figure 2. We further define the isohindex group as follows: a group of scientists having an identical h-index is said to be within an isohindex group. To compare the academic performance of scientists belonging to the same isohindex group, the h-index is inadequate, and the e-index becomes especially necessary. The journal Chemistry World published a list of chemists with high h-indices [19]. As an example, we chose from the list two chemists who both have an h-index of 51 (Table 1). Despite the identical h-index, the second researcher in fact had many more citations than the first. The e-indices for the second and first researchers are 54.73 and 31.10, respectively, and $(54.73/31.10)^2 \approx 3.1$, indicating that the citations ignored by the h-index for the second researcher are more than 3 times those for the first researcher.

Table 1. The e-index and some derived h-type indices for three famous chemists.
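The comparisons in this section can be checked numerically from the (h, e) pairs quoted in the text. The sketch below is an illustration under stated assumptions: it takes the a-index to be the mean citations per h-core paper, $(h^2 + e^2)/h$, the R-index to be $\sqrt{h^2 + e^2}$, and $f = e^2/h^2$. The dictionary keys and the helper `describe` are hypothetical names introduced here.

```python
import math

# (h, e) pairs quoted in the text for the three chemists of Table 1.
chemists = {
    "first": (51, 31.10),
    "second": (51, 54.73),
    "Alder": (50, 114.0),
}

def describe(h, e):
    """Derive h-core totals and the a, R and f values from (h, e)."""
    excess = e ** 2              # net excess citations in the h-core
    total = h ** 2 + excess      # all citations in the h-core
    return {
        "excess": excess,
        "total": total,
        "a": total / h,          # assumed: mean citations per h-core paper
        "R": math.sqrt(total),   # assumed: square root of h-core citations
        "f": excess / h ** 2,    # fold of excess citations over h^2
    }

stats = {name: describe(h, e) for name, (h, e) in chemists.items()}
first, second, alder = stats["first"], stats["second"], stats["Alder"]

excess_ratio = second["excess"] / first["excess"]
a_increase = second["a"] / first["a"] - 1
r_increase = second["R"] / first["R"] - 1
total_ratio = alder["total"] / first["total"]

print(f"excess citations, second vs first: {excess_ratio:.2f} times")
print(f"fold increase in a: {a_increase:.2f}, in R: {r_increase:.2f}")
print(f"f for Alder: {alder['f']:.2f}")
print(f"h-core citations, Alder vs first: {total_ratio:.2f} times")
```

Under these assumptions the script reproduces the rounded figures used in the text: a ratio of about 3.1 in excess citations, fold increases of about 0.57 and 0.25 in a and R, and f of about 5.2 for Dr. Alder.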
The merit of using the e-index is that $e^2$ is strictly equal to the net excess citations received by all the papers in the h-core, whereas a and R are not. Both a and R are derived indices: each includes contributions from both the $h^2$ citations and the net excess citations ($e^2$), and each depends on h and e, while e is independent of h. Consequently, using a or R together with the h-index to evaluate the performance of scientists within an isohindex group can lead to unrealistic results. Compared with the first researcher, the second had a more than 2-fold increase in net excess citations; however, a and R increased by only 0.57- and 0.25-fold, respectively. Therefore, the a and R indices mask the real difference in ignored excess citations, and the e-index, when used together with the h-index, is more objective and precise for comparing the citation information of researchers within an isohindex group.

The third researcher listed in Table 1 is the famous chemist Dr. Berni Alder, who pioneered computer simulation. It is noteworthy that the h-index severely underestimates his scientific impact. Although he has an h-index of 50, Dr. Alder's total citations were far more than those of many researchers with an h-index of 50 or higher. For instance, the total citations of Dr. Alder were more than 4 times those of the first researcher, who had a higher h-index, 51. The e-index for Dr.
Alder was 114.0 and f = 5.2, indicating that the excess citations ignored by the h-index were more than 5 times the $h^2$ citations, which highlights the need for the e-index.

Loss of citation information by the g-index

The e-index proposed here is aimed at considering the contributions of excess citations, which come mainly from highly cited papers. It is necessary to mention the g-index, which was proposed as being “sensitive to the level of the highly cited papers” [5]. The g-index is defined as “the highest number of g of papers that together received $g^2$ or more citations” [5]. Although it has some advantages, the g-index also suffers from the loss of citation information in many important cases, especially for distinguished scientists (most of whose papers are highly cited). For instance, with N the total number of papers, for any $k = 1, 2, \ldots, N$, if

$$\sum_{j=1}^{k} cit_j > N^2 \qquad (10)$$

then the g-index has no definition. In fact, under any of the N conditions in eq. (10), the g-index can have no definition. Among the N conditions in eq. (10), the strongest condition is

$$cit_1 > N^2 \qquad (11)$$

and the weakest condition is

$$\sum_{j=1}^{N} cit_j > N^2. \qquad (12)$$

Eqs. (10), (11) and (12) are associated with many important cases. For example, Dr. Frederick Sanger is an outstanding scientist who won the Nobel Prize twice. He has published 30 papers (N = 30), and the citation number for one of his papers is 63781, much more than $30^2 = 900$, indicating that the condition in eq. (11) is satisfied.
Noticing this problem, Egghe later proposed two options [20]: “we can define g = T [T denotes the total number of papers], or better […], we can add […] fictitious articles with zero citations: We add enough of these “articles” so that […] we denote by T the new number of articles (including the fictitious ones)”. By option 1, the g-index for Dr. Sanger is 30, where the total citations are ![equation e057](picrender.fcgi?artid=2673580&blobname=pone.0005429.e057.jpg), and therefore about 99% of the citations are ignored by the g-index ($1 - 30^2/79400 \approx 0.99$). Option 1 can therefore lead to the loss of citation information, especially for distinguished scientists; in such cases, the more highly cited the scientist, the greater the loss of information. By option 2, the g-index is always equal to $\left[\sqrt{\sum_j cit_j}\right]$, where $[x]$ is the integer part of $x$ and the sum runs over all papers. Therefore, for Dr. Sanger, g = 281, suggesting that about 90% of the papers ($1 - 30/281$) are fictitious.
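The two options can be compared on a small example. The sketch below is one possible reading of the quoted definitions, not Egghe's own procedure: `g_index_raw` caps g at the number of real papers (option 1), and `g_index_padded` appends fictitious zero-citation papers before recomputing (option 2). Both function names and the toy citation list are assumptions. The final lines only re-check the arithmetic quoted for N = 30 and 79400 total citations.

```python
import math

def g_index_raw(citations):
    """Largest g <= N such that the top g papers together have >= g^2 citations."""
    ranked = sorted(citations, reverse=True)
    g, running = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        running += cites
        if running >= rank * rank:
            g = rank
    return g

def g_index_padded(citations):
    """Option 2: append fictitious zero-citation papers, then recompute g."""
    total = sum(citations)
    padded = list(citations) + [0] * math.isqrt(total)  # enough fictitious papers
    return g_index_raw(padded)

# Hypothetical researcher with few but highly cited papers.
toy = [100, 50, 3]
print(g_index_raw(toy))     # 3: capped at the number of real papers
print(g_index_padded(toy))  # 12: integer part of sqrt(153)

# Arithmetic quoted in the text for N = 30 and 79400 total citations.
print(math.isqrt(79400))                 # 281
print(round(1 - 30 ** 2 / 79400, 3))     # 0.989, share of citations ignored under option 1
print(round(1 - 30 / 281, 3))            # 0.893, share of fictitious papers under option 2
```

When g is not capped by the number of real papers, padding changes nothing. For nine papers with one citation each, both functions return 1, so the integer part of the square root of the total is reached only in the saturated case discussed here.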
If option 2 is adopted, the g-index alone gives users no way to know, for a scientist being evaluated, how many papers are real and how many are fictitious; this will confuse users, as an old saying goes: “Fiction in fact, then fact becomes fiction”. Therefore, neither option seems ideal. Here I suggest that using an e-like index to denote the loss of citations would be another way to solve the above problem of the g-index.

A simple mathematical model

Based on the citation curve ![equation e061](picrender.fcgi?artid=2673580&blobname=pone.0005429.e061.jpg), the h- and e-indices can be calculated. Here we study only a simple mathematical model. We assume that […], where ![equation e063](picrender.fcgi?artid=2673580&blobname=pone.0005429.e063.jpg) is the maximum number of citations received by a paper in the h-core. First of all, we assume ![equation e064](picrender.fcgi?artid=2673580&blobname=pone.0005429.e064.jpg). According to the definition of the h-index, we have ![equation e065](picrender.fcgi?artid=2673580&blobname=pone.0005429.e065.jpg), leading to the result […]. Based on eq. (6), we find […]. When ![equation e068](picrender.fcgi?artid=2673580&blobname=pone.0005429.e068.jpg), similarly we have […]. The parameters could be estimated from eq.
(14), and it was found that ![equation e071](picrender.fcgi?artid=2673580&blobname=pone.0005429.e071.jpg), ![equation e072](picrender.fcgi?artid=2673580&blobname=pone.0005429.e072.jpg), and ![equation e073](picrender.fcgi?artid=2673580&blobname=pone.0005429.e073.jpg), respectively, for the 3 chemists listed in Table 1. I emphasize that when ![equation e074](picrender.fcgi?artid=2673580&blobname=pone.0005429.e074.jpg), ![equation e075](picrender.fcgi?artid=2673580&blobname=pone.0005429.e075.jpg), and then the h-index becomes unreliable in reflecting academic performance. For example, letting ![equation e076](picrender.fcgi?artid=2673580&blobname=pone.0005429.e076.jpg), and assuming ![equation e077](picrender.fcgi?artid=2673580&blobname=pone.0005429.e077.jpg), we find ![equation e078](picrender.fcgi?artid=2673580&blobname=pone.0005429.e078.jpg), and ![equation e079](picrender.fcgi?artid=2673580&blobname=pone.0005429.e079.jpg). Consequently, ![equation e080](picrender.fcgi?artid=2673580&blobname=pone.0005429.e080.jpg). This result shows that even when ![equation e081](picrender.fcgi?artid=2673580&blobname=pone.0005429.e081.jpg), the ignored excess citations (80000) are much more than the $h^2$ citations (100).

Concluding remarks

The h-index has already been used by major citation databases to evaluate the academic performance of individual scientists. Because of the loss of citation information, comparisons based on the h-index alone can be misleading, as exemplified by Dr. Alder, whose total citations are far more than those of many researchers having higher h-indices; his ignored excess citations ($e^2$) are more than 5 times the $h^2$ citations. Therefore, for accurate and fair comparisons, it is necessary to use the e-index together with the h-index. Some other h-type indices, such as a and R, are h-dependent and carry information that is redundant with h; therefore, when used together with h, they mask the real differences in excess citations between researchers. Although simple, the e-index is a necessary complement to the h-index, especially for evaluating highly cited scientists or for precisely comparing the scientific output of a group of scientists having an identical h-index.