How does the protein tracer plot combine p-values?

Combining p-values is a statistical technique used to summarize the evidence from multiple studies or experiments into a single p-value. The purpose of combining p-values is to make a more informed decision or draw more robust conclusions by integrating the results of multiple sources of evidence.

We adopt this technique in the protein tracer plot, but apply it to the p-values of the proteins within a list rather than to p-values from separate experiments. The resulting single p-value summarises the list, so the overall significance of a list of proteins can be more easily compared across experiments.

The protein tracer plot provides a number of methods for combining the p-values of a protein list, all made available via the SciPy function scipy.stats.combine_pvalues:

Fisher’s combined probability test - Fisher's method sums the logarithms of the individual p-values to form the test statistic -2 Σ ln(p_i). If the k p-values are independent and the null hypothesis holds for each of them, this statistic follows a chi-squared distribution with 2k degrees of freedom, and its upper tail gives the combined p-value. The method emphasises small p-values.
Pearson's Method - Pearson's method mirrors Fisher's method but works with the complements of the p-values. Its statistic is built from Σ ln(1 - p_i), which is equivalent to the product of the values (1 - p_i). If the k p-values are independent and the null hypothesis holds, -2 Σ ln(1 - p_i) follows a chi-squared distribution with 2k degrees of freedom, from which the combined p-value is obtained. Because the statistic is driven by p-values close to 1, the method emphasises large p-values. Despite the name, it is not the Pearson chi-squared test of observed against expected frequencies.
Tippett’s Method - Tippett's method uses the minimum p-value in the set as its test statistic. The minimum is not itself the combined p-value: for k independent p-values that are uniformly distributed under the null hypothesis, the combined p-value is 1 - (1 - p_min)^k. The method therefore responds only to the single smallest p-value.
Stouffer’s Method - Stouffer's method transforms each p-value into a standard normal score (z-score), sums the scores, and divides the sum by the square root of the number of p-values. The result is a combined z-score, which is converted back to a p-value. Weights can optionally be supplied, for example to reflect sample sizes, in which case the weighted sum of the z-scores is divided by the square root of the sum of the squared weights. Stouffer's method assumes that the p-values are independent and, for one-sided tests, that the effects are tested in the same direction.
Mudholkar and George Method - The Mudholkar and George method is a compromise between Fisher's and Pearson's methods. It applies a logit transformation to each p-value, giving the statistic -Σ ln(p_i / (1 - p_i)), which combines the Fisher and Pearson statistics. After scaling, this statistic is approximated by a Student's t distribution to obtain the combined p-value (George and Mudholkar, 1983). The method emphasises extreme p-values, both those close to 0 and those close to 1. Like the other methods listed here, it assumes that the p-values are independent; it does not model dependence between them.
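The methods listed above correspond to the method argument of scipy.stats.combine_pvalues. The following is a minimal sketch of calling each of them on one list; the p-values are made up for illustration, and how the protein tracer plot invokes the function internally is not shown here. The p-value returned for the Pearson method may differ between SciPy releases, so it is worth checking the documentation of the installed version.

```python
from scipy.stats import combine_pvalues

# Hypothetical p-values for the proteins in one list (illustrative only).
protein_pvalues = [0.004, 0.03, 0.2, 0.5, 0.65]

# Method names accepted by scipy.stats.combine_pvalues.
methods = ["fisher", "pearson", "tippett", "stouffer", "mudholkar_george"]

combined = {}
for method in methods:
    # The result unpacks as (statistic, p-value).
    statistic, pvalue = combine_pvalues(protein_pvalues, method=method)
    combined[method] = float(pvalue)

for method, pvalue in combined.items():
    print(f"{method:>17}: combined p = {pvalue:.4g}")
```

Running the same list through every method is a quick way to see how much the choice of method matters for a given set of p-values.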

The differences between the methods can be best highlighted by their statistics and the particular aspects of a combination of p-values they prioritize when considering significance [^2]. For example, methods that emphasise large p-values, such as Pearson's, are more sensitive to strong negatives, whether true or false. Conversely, methods that emphasise small p-values, such as Fisher's and Tippett's, are more sensitive to positives.
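The contrast between emphasising small and large p-values can be sketched with the Fisher and Pearson statistics computed by hand. This is an illustration in plain Python, not the SciPy implementation; the helper names and both example lists are hypothetical, and the closed-form chi-squared tail used here is valid only for even degrees of freedom.

```python
import math

def chi2_sf_even(x, k):
    """Upper tail of a chi-squared distribution with 2*k degrees of freedom.

    For even degrees of freedom the tail has the closed form
    exp(-x/2) * sum over j < k of (x/2)**j / j!.
    """
    half = x / 2.0
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(k))

def fisher(pvalues):
    # Statistic -2 * sum(ln p): large when some p-values are very small,
    # so the evidence sits in the upper tail.
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2_sf_even(stat, len(pvalues))

def pearson(pvalues):
    # Statistic -2 * sum(ln(1 - p)): large when some p-values are close to 1,
    # so evidence against the null sits in the lower tail.
    stat = -2.0 * sum(math.log1p(-p) for p in pvalues)
    return 1.0 - chi2_sf_even(stat, len(pvalues))

# Hypothetical lists: one very small p-value among large ones,
# versus uniformly moderate p-values.
one_strong = [1e-6, 0.6, 0.7, 0.8]
all_moderate = [0.04, 0.05, 0.06, 0.07]

for name, pvals in [("one_strong", one_strong), ("all_moderate", all_moderate)]:
    print(f"{name:>12}: Fisher = {fisher(pvals):.3g}, Pearson = {pearson(pvals):.3g}")
```

In this sketch the single very small p-value dominates Fisher's result for the first list, while Pearson's result is held back by the large p-values beside it. For the second list, which contains no large p-values at all, Pearson's combined p-value is the smaller of the two.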

By combining p-values, researchers can achieve several benefits:

Increased statistical power: Combining p-values from multiple studies can lead to increased statistical power, which improves the ability to detect true effects or relationships. This is particularly useful when individual studies have limited sample sizes or low statistical power on their own.
Enhanced precision: Combining p-values summarises the evidence in a single value that is less sensitive to the noise in any one study. This can be helpful when the results of individual studies are inconsistent. Note that a combined p-value measures the overall evidence against the null hypothesis; it does not by itself estimate the magnitude of an effect.
Improved generalizability: Combining p-values allows for a broader assessment of the research question by incorporating findings from different populations, settings, or methodologies. This can enhance the generalizability of the conclusions and provide a more comprehensive picture.
Reduction of publication bias: Publication bias occurs when studies with statistically significant results are more likely to be published. Combining p-values can help mitigate it only if all available evidence, including non-significant findings, is included; in that case it provides a more balanced assessment than relying on the significant results alone.
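The gain in statistical power can be illustrated with a small worked example: several p-values that each miss the 0.05 threshold can still combine into a clearly significant result. The sketch below uses an unweighted Stouffer combination written with the Python standard library; the function name and the p-values are hypothetical, and the result is only meaningful if the p-values are independent and come from tests in the same direction.

```python
import math
from statistics import NormalDist

def stouffer(pvalues):
    """Unweighted Stouffer combination of one-sided p-values."""
    normal = NormalDist()
    # Convert each p-value to a z-score; small p-values give large positive z.
    z_scores = [normal.inv_cdf(1.0 - p) for p in pvalues]
    combined_z = sum(z_scores) / math.sqrt(len(z_scores))
    # The upper tail of the standard normal gives the combined p-value.
    return 1.0 - normal.cdf(combined_z)

# Five hypothetical results, none significant at 0.05 on its own.
individual = [0.08] * 5
combined_p = stouffer(individual)
print(f"smallest individual p = {min(individual)}, combined p = {combined_p:.3g}")
```

Here the combined p-value falls well below 0.05 even though no individual p-value does, because five consistent, moderately small p-values are unlikely to arise together under the null hypothesis.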

It's important to note that combining p-values should be done carefully, considering the assumptions and limitations of the underlying studies; most combination methods assume that the p-values are independent. Additionally, there are alternative approaches, such as the weighted form of Stouffer's method (the weighted Z-method) or inverse-variance weighting of effect sizes, which may be more appropriate depending on the specific context and characteristics of the data.
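A weighted Z-method can be sketched as follows, under stated assumptions: the p-values are one-sided and independent, and the weights are the square roots of hypothetical sample sizes, which is one commonly suggested choice rather than a requirement. The function name and all numbers are illustrative.

```python
import math
from statistics import NormalDist

def weighted_stouffer(pvalues, weights):
    """Weighted Z-method: sum(w * z) / sqrt(sum(w**2)), converted to a p-value."""
    if len(pvalues) != len(weights):
        raise ValueError("pvalues and weights must have the same length")
    normal = NormalDist()
    # Small p-values map to large positive z-scores.
    z_scores = [normal.inv_cdf(1.0 - p) for p in pvalues]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return 1.0 - normal.cdf(numerator / denominator)

# Hypothetical p-values and sample sizes: the strongest result
# comes from the largest study.
pvalues = [0.01, 0.20, 0.60]
sample_sizes = [400, 50, 20]
weights = [math.sqrt(n) for n in sample_sizes]

unweighted = weighted_stouffer(pvalues, [1.0] * len(pvalues))
weighted = weighted_stouffer(pvalues, weights)
print(f"unweighted p = {unweighted:.3g}, weighted p = {weighted:.3g}")
```

With equal weights the function reduces to the ordinary Stouffer method. In this example the weighting gives the large study more influence, so the weighted combined p-value is smaller than the unweighted one.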

References and Resources

Chen, Z. Optimal Tests for Combining p-Values. Appl. Sci. 2022, 12, 322. https://doi.org/10.3390/app12010322
Heard, N. and Rubin-Delanchy, P. “Choosing between methods of combining p-values.” Biometrika 105.1 (2018): 239-246.
Pauli Virtanen, et al. (2020) SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17(3), 261-272.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.combine_pvalues.html
https://www.nature.com/articles/s41598-021-86465-y
[Fisher's Method - Wikipedia](https://en.wikipedia.org/wiki/Fisher%27s_method)
George, E. O., and G. S. Mudholkar. “On the convolution of logistic random variables.” Metrika 30.1 (1983): 1-13.
[Fisher's Method and Relation to Stouffer’s Z-score method - Wikipedia](https://en.wikipedia.org/wiki/Fisher%27s_method#Relation_to_Stouffer.27s_Z-score_method)
Whitlock, M. C. “Combining probability from independent tests: the weighted Z-method is superior to Fisher’s approach.” Journal of Evolutionary Biology 18, no. 5 (2005): 1368-1373.
Zaykin, Dmitri V. “Optimally weighted Z-test is a powerful method for combining probabilities in meta-analysis.” Journal of Evolutionary Biology 24, no. 8 (2011): 1836-1841.