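Dependent-sample statistical methods after propensity score matching ~ compiled by consultant 林星帆 of 晨晰統計 @ 晨晰統計部落格新站 (a discussion space for statistics, SPSS, and big data)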

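In observational studies, propensity score analysis, and propensity score matching in particular, is now very widely used; see my earlier articles (https://reurl.cc/qd8xg and https://reurl.cc/V6Xr5). Plenty of online resources already introduce the matching technique itself (for example the series written by my colleagues: https://reurl.cc/E7z3R, https://reurl.cc/WdL5D, and https://reurl.cc/O1qlv), but the statistical methods to use after matching are discussed far less often.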

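In theory, the treated and control subjects (or exposed and unexposed subjects) within the same matched pair have very similar propensity scores, that is, similar probabilities of being in the treated/exposed group. They therefore also tend to be similar on the baseline characteristics used to estimate the propensity score (for example age, sex, and comorbidities). As a result, the treated and control groups are no longer "independent samples" but paired samples with within-pair dependence¹.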

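Peter Austin, a leading authority on propensity score methods, published a simulation study in 2011¹ that used dichotomous outcomes as its example. The simulation results indicated that paired statistical methods yield smaller bias, including treatment selection bias and confounding.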
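One way to see why the choice of method matters for a binary outcome: in a 1:1 matched sample the point estimate of the risk difference is the same either way, but its standard error is not. The sketch below uses the standard large-sample formulas for independent and for paired proportions. It is not Austin's simulation code, and the pair counts are made up purely for illustration.

```python
import math

def risk_difference_se(a, b, c, d):
    """Risk difference with unpaired and paired standard errors.

    Matched-pair 2x2 table (counts of pairs, hypothetical data):
        a: event in both the treated and the control subject
        b: event in the treated subject only
        c: event in the control subject only
        d: event in neither subject
    """
    n = a + b + c + d                      # number of matched pairs
    p_treated = (a + b) / n
    p_control = (a + c) / n
    diff = p_treated - p_control           # equals (b - c) / n

    # Unpaired: treats the two groups as independent samples.
    se_unpaired = math.sqrt(
        p_treated * (1 - p_treated) / n + p_control * (1 - p_control) / n
    )
    # Paired: the variance depends only on the discordant pairs (b and c).
    se_paired = math.sqrt((b + c) - (b - c) ** 2 / n) / n
    return diff, se_unpaired, se_paired

diff, se_unpaired, se_paired = risk_difference_se(a=40, b=25, c=10, d=125)
print(f"risk difference = {diff:.3f}")
print(f"unpaired SE = {se_unpaired:.4f}, paired SE = {se_paired:.4f}")
```

With these counts the outcomes within a pair are positively correlated, so the paired standard error is smaller than the unpaired one. With negatively correlated pairs the opposite would hold.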

The table below is my summary of the statistical methods suited to paired (dependent) samples under various conditions.

Table 1. Applicable paired statistical methods by type of outcome variable

Outcome type	Applicable paired statistical methods
Continuous	Paired-sample t-test (1:1 matching only); generalized estimating equation; linear mixed model
Ordinal, or continuous but not normally distributed	Wilcoxon signed-rank test
Binary	McNemar test (1:1 matching only); conditional logistic regression; generalized estimating equation; generalized linear mixed model
Count	Generalized estimating equation; generalized linear mixed model
Time to event (survival data)	Log-rank test stratified on matched pairs; Cox regression stratified on matched pairs; Cox regression with robust standard errors
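As a rough illustration of the two 1:1-only entries in the table, the sketch below computes an exact two-sided McNemar p-value from the discordant pair counts and a paired t statistic from within-pair differences, using only the Python standard library. The function names and the example data are hypothetical. In practice a statistics package would normally be used, and the regression-based methods in the table (GEE, mixed models, conditional logistic regression, Cox models) are not covered here.

```python
import math
import statistics

def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs where only the treated subject had the event
    c: pairs where only the control subject had the event
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def paired_t_statistic(treated, control):
    """t statistic and degrees of freedom from within-pair differences.

    treated[i] and control[i] must belong to the same matched pair.
    """
    diffs = [t - c for t, c in zip(treated, control)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical data, for illustration only.
p_value = mcnemar_exact_p(b=25, c=10)
t_stat, df = paired_t_statistic([2, 4, 6, 8], [1, 2, 3, 4])
print(f"McNemar exact p = {p_value:.4f}")
print(f"paired t = {t_stat:.3f} on {df} degrees of freedom")
```

Both functions rely on the pairing itself: McNemar's test ignores concordant pairs entirely, and the paired t statistic is computed on differences within each pair rather than on the two group means separately.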

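Of the methods above, the generalized estimating equation (GEE) gives a population-level (marginal) comparison, as does Cox regression with robust standard errors; the remaining methods work at the level of the individual matched pair. This touches on the distinction between fixed effects and random effects, which is not covered in detail here.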
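The difference between a population-level and a pair-level estimate can be seen in a single 1:1 matched 2x2 table. In the sketch below, the marginal odds ratio is the quantity that a GEE logistic model with only a treatment indicator targets, while the ratio of discordant pairs is the conditional maximum-likelihood odds ratio that conditional logistic regression gives for 1:1 pairs with no other covariates. The counts are hypothetical, and this is a simplified illustration rather than a substitute for fitting either model.

```python
import math

def odds_ratios(a, b, c, d):
    """Marginal and conditional odds ratios from a matched-pair 2x2 table.

    a: event in both subjects, b: treated only,
    c: control only,          d: neither (all counts of pairs).
    Assumes neither group has an event proportion of exactly 0 or 1.
    """
    n = a + b + c + d
    p_treated = (a + b) / n
    p_control = (a + c) / n
    # Population-level: compares the odds of the two groups as a whole.
    marginal = (p_treated / (1 - p_treated)) / (p_control / (1 - p_control))
    # Pair-level: uses only the discordant pairs.
    conditional = b / c if c > 0 else math.inf
    return marginal, conditional

marginal_or, conditional_or = odds_ratios(a=40, b=25, c=10, d=125)
print(f"marginal OR = {marginal_or:.2f}, conditional OR = {conditional_or:.2f}")
```

For these made-up counts the marginal odds ratio is about 1.44 and the conditional odds ratio is 2.5. The two answer different questions, so they need not agree even when computed from the same matched sample.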

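However, there are also opposing views that paired-sample statistics are unnecessary²⁻³. The main argument is that the individual matched sets produced by propensity score matching (for example, one treated subject matched to two controls) are not necessarily similar on every baseline characteristic; a 26-year-old man (treated) may be matched with a 55-year-old woman (control). Propensity score matching only guarantees that the distribution of propensity scores is similar between all treated subjects and all control subjects.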
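The 26-year-old man and 55-year-old woman example can be made concrete with a toy logistic propensity model. The intercept and coefficients below are hypothetical values chosen so that the two subjects end up with the same score; they are not estimated from any data.

```python
import math

# Hypothetical coefficients of a logistic propensity model (illustration only).
INTERCEPT = -1.0
COEF_AGE = 0.02       # per year of age
COEF_FEMALE = -0.58   # female = 1, male = 0

def propensity(age, female):
    """Probability of being in the treated group under the toy model."""
    logit = INTERCEPT + COEF_AGE * age + COEF_FEMALE * female
    return 1 / (1 + math.exp(-logit))

ps_treated = propensity(age=26, female=0)   # 26-year-old man
ps_control = propensity(age=55, female=1)   # 55-year-old woman
print(f"treated PS = {ps_treated:.4f}, control PS = {ps_control:.4f}")
```

Under this toy model the two subjects are an ideal match on the propensity score even though they differ in both age and sex, which is the situation the opposing argument points to.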

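Nevertheless, review articles on the use of propensity score matching published after Austin (2011) have also generally recommended paired statistical methods⁴⁻⁵. In the peer review of the several dozen papers I have worked on recently, statistical reviewers did ask on two or three occasions that the analysis be changed to paired-sample methods. I therefore recommend that researchers who use propensity score matching compare groups with dependent-sample statistical methods wherever possible.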

References

1. Austin PC. Comparing paired vs non-paired statistical methods of analyses when making inferences about absolute risk reductions in propensity-score matched samples. Statistics in Medicine. 2011;30:1292-1301.
2. Stuart EA. Developing practical recommendations for the use of propensity scores: Discussion of 'A critical appraisal of propensity score matching in the medical literature between 1996 and 2003' by Peter Austin, Statistics in Medicine. Statistics in Medicine. 2008;27(12):2062-2065.
3. Stuart EA. Matching methods for causal inference: A review and a look forward. Statistical Science. 2010;25(1):1.
4. Benedetto U, Head SJ, Angelini GD, Blackstone EH. Statistical primer: propensity score matching and its alternatives. European Journal of Cardio-Thoracic Surgery. 2018;53:1112-1117.
5. Lonjon G, Porcher R, Ergina P, Fouet M, Boutron I. Potential Pitfalls of Reporting and Bias in Observational Studies With Propensity Score Analysis Assessing a Surgical Procedure. Annals of Surgery. 2017;265:901-909.