Estimating the Causal Effect of Early ArXiving on Paper Acceptance

🎯 Pitch

This paper uses ICLR 2018–2022 observational data and a causal-inference pipeline to estimate the effect of posting a preprint before the review deadline on acceptance. The pipeline combines covariate matching with a negative-control outcome (a long-term citation indicator) and difference-in-differences reasoning. Its key finding is that any advantage from early arXiving is small (often <4%) or statistically indistinguishable from zero and does not differ meaningfully across author prestige groups, which challenges the justification for conference anonymity periods and informs policy on preprint timing and review fairness.

1. Executive Summary

This paper estimates the causal effect of posting an ICLR submission on arXiv before the review deadline (“early arXiving”) on the probability of acceptance, using observational data from ICLR 2018–2022 (Abstract; §5.1). Because key drivers of acceptance like “paper quality” are unobserved, it combines covariate matching with a negative control outcome (a binarized long-term citation indicator) and a difference-in-differences (DiD) style estimator to reduce bias. It finds that any acceptance advantage from early arXiving is generally small (often <4%) or statistically indistinguishable from zero, and does not differ meaningfully across author prestige subgroups (Figure 2; Figure 3; §5.3–§5.4).

2. Context and Motivation

Problem / gap addressed.
Double-blind review intends to hide author identities, but widespread preprints make it easier for reviewers to infer authorship, potentially introducing bias (Introduction, §1).
There is no randomized controlled trial on whether early arXiving affects acceptance decisions, so the paper turns to observational causal inference (Abstract; §1).
Why it matters.
If early arXiving increases acceptance probability, it could:
- incentivize strategic timing of preprints, and/or
- undermine fairness if the effect differs by author reputation or institution (Introduction, §1; §5.3).
The work targets two research questions (Introduction, §1):
- RQ1: Does the effect differ across author groups (citations, institution rank)?
- RQ2: What is the overall effect on acceptance?
Prior approaches and shortcomings (as positioned here).
Prior work comparing single- vs double-blind review gives mixed evidence about bias toward prestigious authors/institutions (Introduction, §1 cites Tomkins et al., 2017; Madden & DeWitt, 2006), but does not directly address arXiv timing before review.
Simple observational comparisons confound early arXiving with unobserved factors like quality/novelty/impact, which influence both the decision to preprint early and acceptance (Figure 1; §2; §4.2).
How this paper positions itself.
It formalizes the problem with a causal graph (Figure 1) and estimates the Average Treatment Effect on the Treated (ATET) using:
- rich observed confounders (Table 1),
- matching to balance those confounders (§4.1; Table A.1),
- and a negative control outcome (NCO) based on citations to address unobserved confounding (§4.2; Figure 2).

3. Technical Approach

3.1 Reader orientation (approachable technical breakdown)

The system built here is a causal estimation pipeline that uses ICLR submission metadata and paper content features to estimate how much early arXiving changes the chance of acceptance.
The solution shape is: define a causal estimand (ATET) → construct treated/control groups → match on observed confounders → use an NCO-based DiD-style estimator to reduce bias from unobserved “quality” (Figure 1; Eq. (1); §4.1–§4.2).

3.2 Big-picture architecture (diagram in words)

Data assembly (ICLR + external citation metadata) → produces A (early arXiv), Y (accept), observed covariates C, and citation-derived variable CC(n) (§5.1; Appendix A.2).
Feature construction → builds 18 observed confounders (Table 1; Appendix A.1).
Matching module → pairs each treated paper with a similar untreated paper on C (§4.1; Table A.1).
Effect estimation module:
“Primary” estimator: effect on matched pairs without NCO adjustment (Figure 2, red; §5.2).
“NCO-adjusted” estimator: uses N derived from citation quantiles and a DiD-style contrast (Figure 2, gray/black; §4.1; §5.4).
Subgroup analysis → repeats estimation within bins of institution rank / author citations to test heterogeneity (Figure 3; §5.3; Appendix B).

3.3 Roadmap for the deep dive

Define variables, causal target (ATET), and assumptions (Figure 1; Eq. (1); §2–§3.1).
Explain matching: how treated/control comparability is enforced using observed confounders (Table 1; §4.1; Table A.1).
Explain the negative control outcome idea and why citations are used (Figure 1; §4.2; Appendix A.2).
Derive the NCO/DiD estimator used and what extra assumptions it needs (Assumption 4.1; §4.1).
Walk through evaluation choices (different n and q) and interpret results (Figure 2; Figure 3; §5.2–§5.4; Appendix C/D).

3.4 Detailed, sentence-based technical breakdown

This is an empirical causal inference paper that combines matching with negative control outcomes to estimate an acceptance effect that is otherwise biased by unobserved “quality” (Figure 1; §4).

Variables and causal estimand

The treatment A is binary: A=1 if the paper is posted to arXiv before the review deadline (not the submission deadline), and A=0 otherwise (§2; footnote 5).
The outcome Y is binary: Y=1 if accepted to ICLR, Y=0 if rejected (§2).
Observed confounders C include 18 features covering year, paper structure, text-derived proxies, topic, and author/institution metadata (Table 1; Appendix A.1).
Unobserved confounders U represent hard-to-measure factors such as creativity, originality, novelty, and “paper quality” (Figure 1; §2).
The paper targets the Average Treatment Effect on the Treated (ATET):
It defines potential outcomes \(Y_{A=1}\) and \(Y_{A=0}\) (written Y1 and Y0 for short) as the acceptance outcomes with and without early arXiving (§3).
It estimates
\[ \text{ATET} = \mathbb{E}[Y_{A=1} - Y_{A=0} \mid A=1] \] (Eq. (1); §3; reiterated in §4).

Causal assumptions (what must be true for identification)

The paper lists standard identification assumptions plus an NCO condition (Assumption 3.1; §3.1):

Ignorability given (C,U): after conditioning on observed and unobserved covariates, treatment assignment is as-if random.
Positivity: each covariate profile has nonzero probability of being treated and untreated.
Consistency: observed outcome matches the potential outcome under the observed treatment.
Negative control condition: the chosen negative control outcome N is not causally affected by A (Assumption 3.1(4); also stated as N ⟂ A | C,U in §4.2 footnote 8).

A key practical issue is that U is not observed, so the method must reduce bias without directly conditioning on U (Figure 1; §4.2).

Observed confounders: what is controlled directly

The 18 confounders used for matching and adjustment are enumerated in Table 1 (with more detail in Appendix A.1). They include:

Submission-related / content features:
counts: n_fig, n_ref, n_sec,
length: log_text_length (token count in log scale; Appendix A.1 says measured with a Longformer tokenizer),
a fluency-style text statistic text_ppl derived from token likelihoods under a pretrained RoBERTa model and normalized to (0,1) (Appendix A.1).
Topic:
topic_cluster in 20 clusters; Appendix A.1 describes spectral clustering on SPECTER embeddings of abstracts (with scikit-learn).
Author and institution features:
author count and gender indicators (from OpenReview profiles; Appendix A.1),
an indicator for US-based authorship (no_US_author; the variable name suggests it flags papers with no US-based author),
institution rank summaries (log_inst_rank_min/avg/max), where “institute rank” is computed from counts of accepted ICLR papers in the two prior years (Table 1),
author citation summaries (log_author_cite_min/avg/max), with citations obtained via a Google Scholar API as of Feb 2022 (Appendix A.1).

The intent is: if early arXived and non-early papers are balanced on these variables, differences in acceptance are less likely to be explained by these observed factors (§4.1; Table A.1).

Matching: making treated and control groups comparable on `C`

The treated group contains 1,486 early-arXived submissions (treated units) (§4.1; §5.1).
From the full dataset of 10,297 ICLR 2018–2022 submissions (§5.1), the unmatched control pool includes 7,493 non-early-arXived papers (Table A.1).
The method performs pair matching: for each treated paper, it selects one non-treated paper to form 1,486 matched pairs (§4.1; §5.1).

How matching is implemented (important because it determines what “comparable” means):

It uses the tripartite matching algorithm (Zhang et al., 2021), following Chen et al. (2022) (§4.1; Appendix B).
The matching objective/constraints are described as:
Numerical variables are matched with a penalty on L2 distance (§4.1).
n_author and year are “nearly exactly matched” (§4.1; Appendix B).
The distribution of topic_cluster is made similar between treated and matched controls (“fine-balance”) (§4.1; Table A.1 shows topic_cluster SMD dropping to <0.001).
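
To make the idea of pair matching concrete, here is a minimal sketch of a much simpler scheme than the one the paper uses. It is a greedy 1:1 nearest-neighbor match with exact matching on year and an L2 distance on scaled numerical covariates. It is not the tripartite matching algorithm and it does not enforce fine-balance on topic_cluster. The function name `greedy_pair_match` and the toy covariates are hypothetical.

```python
import numpy as np

def greedy_pair_match(X_treated, X_control, year_treated, year_control):
    """Greedy 1:1 matching: returns {treated_index: control_index}.

    Simplified stand-in for illustration only (NOT tripartite matching).
    Controls must share the treated paper's year and are used at most once.
    """
    X_treated = np.asarray(X_treated, dtype=float)
    X_control = np.asarray(X_control, dtype=float)
    # Scale each covariate by its pooled standard deviation so that no
    # single covariate dominates the L2 distance.
    scale = np.vstack([X_treated, X_control]).std(axis=0)
    scale[scale == 0] = 1.0
    Zt, Zc = X_treated / scale, X_control / scale

    used, pairs = set(), {}
    for i in range(len(Zt)):
        best_j, best_d = None, np.inf
        for j in range(len(Zc)):
            if j in used or year_control[j] != year_treated[i]:
                continue  # enforce exact year match and no control reuse
            d = np.linalg.norm(Zt[i] - Zc[j])
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            pairs[i] = best_j
            used.add(best_j)
    return pairs

# Toy covariates (hypothetical): columns are [n_fig, n_ref].
X_t = [[5.0, 30.0], [8.0, 50.0]]
X_c = [[5.0, 31.0], [9.0, 52.0], [5.0, 30.0], [2.0, 10.0]]
pairs = greedy_pair_match(X_t, X_c, [2020, 2021], [2020, 2021, 2021, 2020])
print(pairs)
```

A greedy pass like this depends on the order of the treated units and can produce worse overall balance than an optimal matching, which is one reason dedicated algorithms are used in practice.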

Balance evidence:

Table A.1 reports standardized mean differences (SMDs) before/after matching and shows large improvements (e.g., year SMD from 0.746 to 0.002, and similarly small post-match SMDs across many variables).
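
As a rough illustration of the balance diagnostic, the sketch below computes a standardized mean difference using one common definition (difference in means divided by the square root of the average of the two group variances). The exact SMD formula behind Table A.1 is not stated here, so this definition is an assumption, and the toy year values are hypothetical.

```python
import numpy as np

def standardized_mean_difference(treated, control):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2.0)
    if pooled_sd == 0.0:
        return 0.0  # both groups are constant; treat as perfectly balanced
    return (treated.mean() - control.mean()) / pooled_sd

# Hypothetical toy data: treated papers skew toward recent years.
year_treated = [2021, 2021, 2022, 2022]
year_all_controls = [2018, 2019, 2019, 2020, 2021, 2022]
year_matched_controls = [2021, 2021, 2022, 2022]

smd_before = standardized_mean_difference(year_treated, year_all_controls)
smd_after = standardized_mean_difference(year_treated, year_matched_controls)
print(f"SMD before matching: {smd_before:.3f}, after: {smd_after:.3f}")
```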

This matching step is crucial because subsequent effect estimates are computed on the matched sample (§4.1; §5.2).

Why unobserved quality remains a problem after matching

Even after matching, early-arXived papers have much higher citation counts than their matched controls over longer windows; for example, §5.2 notes that early-arXived papers are cited “almost 2.13× more” in a 3-year window. The paper interprets this as evidence that a strong unobserved factor such as “quality” still differs between groups (§5.2; Table B.2 supports the pattern of higher average citations for A=1 in several years).

This motivates using a method that attempts to correct for remaining bias from U (§4.2; §5.4).

Negative control outcome (NCO): the mechanism to address unobserved confounding

Idea in plain language: find a variable N that is influenced by the same hidden factors (U, like quality) that influence acceptance, but is not caused by early arXiving. Then differences in N between treated and control help quantify the part of acceptance differences that come from U, allowing partial debiasing (§4.2; Figure 1).

What they choose as N:

They construct N from a paper’s citation count in a fixed time window after its first public appearance:
CC(n) = number of citations in the n years after the paper first becomes public (Appendix A.2 gives examples).
They define a binary NCO, \(N^{(n)}_q\), where \(N^{(n)}_q = 1\) if CC(n) is above the q-th quantile in the relevant sample (§4.2).
They vary:
n ∈ {1,2,3} years,
q ∈ {0.5, 0.75, 0.9} quantile thresholds (§4.2; Figure 2 caption).
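
The binarization step can be sketched as follows. The helper `binarize_citations` is a hypothetical name, the citation counts are made up, and computing the quantile over the whole array passed in is an assumption about what "the relevant sample" means.

```python
import numpy as np

def binarize_citations(cc_n, q):
    """Return (indicator, threshold): indicator is 1 where CC(n) exceeds the q-th quantile."""
    cc_n = np.asarray(cc_n, dtype=float)
    threshold = np.quantile(cc_n, q)
    return (cc_n > threshold).astype(int), threshold

# Hypothetical 3-year citation counts for ten papers.
cc3 = [0, 1, 2, 3, 5, 8, 13, 40, 120, 900]

n_highly_cited = {}
for q in (0.5, 0.75, 0.9):
    indicator, threshold = binarize_citations(cc3, q)
    n_highly_cited[q] = int(indicator.sum())
    print(f"q={q}: threshold={threshold:.2f}, papers flagged={n_highly_cited[q]}")
```

Raising q shrinks the set of papers counted as "highly cited", which is why the choice of q changes how much the NCO subtracts from the unadjusted estimate.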

Why “first public appearance” matters:

Measuring citations from the first online appearance (arXiv, proceedings, etc.) is intended to prevent a mechanical advantage where early-arXived papers accrue more citations simply by being available earlier (§4.2; Appendix A.2).
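
A minimal sketch of the window alignment is shown below, assuming each paper has a first-public date and a list of dates on which it was cited. The function name `citations_in_window`, the half-open window convention, and the dates are hypothetical and only illustrate the alignment idea.

```python
from datetime import date

def citations_in_window(first_public, citation_dates, n_years):
    """Count citations in the n years starting at the paper's first public date.

    The window is [first_public, first_public + n years); whether the
    boundaries are inclusive is an assumption of this sketch.
    """
    try:
        window_end = first_public.replace(year=first_public.year + n_years)
    except ValueError:
        # first_public is Feb 29 and the target year is not a leap year
        window_end = first_public.replace(year=first_public.year + n_years, day=28)
    return sum(first_public <= d < window_end for d in citation_dates)

# Hypothetical paper that first appeared on arXiv on 2020-10-01.
first_public = date(2020, 10, 1)
cited_on = [date(2020, 12, 1), date(2021, 9, 30), date(2021, 10, 1), date(2022, 3, 1)]

cc1 = citations_in_window(first_public, cited_on, 1)
cc2 = citations_in_window(first_public, cited_on, 2)
print(cc1, cc2)
```

Because the clock starts at each paper's own first appearance, an early-arXived paper and a paper first released with the proceedings both get windows of the same length.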

How citations are collected:

They match ICLR submissions to Semantic Scholar (S2) entries via fuzzy title matching, use S2’s “canonicalization” so preprint+proceedings are merged, and use S2’s publication date defined as first availability across sources (Appendix A.2).
Coverage differs by acceptance status (Table A.2):
Accepted: 3,636/3,678 (99%) have S2 IDs; of those, 88% have dates.
Rejected: 5,336/6,619 (81%) have S2 IDs; of those, 91% have dates.
Papers without S2 matches (especially among rejected) are assigned 0 citations because they were likely never posted/published (Appendix A.2).
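
The matching-plus-fallback logic might look roughly like the sketch below. It uses Python's standard-library `difflib` as a stand-in similarity measure; the summary does not say which fuzzy matcher or threshold was used, so the 0.9 cutoff, the helper names, and the citation count are all hypothetical, and no Semantic Scholar API call is shown.

```python
from difflib import SequenceMatcher

def normalize(title):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())

def best_title_match(title, candidates, threshold=0.9):
    """Return the most similar candidate title, or None if below the threshold."""
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, normalize(title), normalize(cand)).ratio()
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None

def citations_for(title, s2_citations):
    """Look up citations by fuzzy title; unmatched papers get 0 citations."""
    match = best_title_match(title, list(s2_citations))
    return s2_citations[match] if match is not None else 0

# Hypothetical index of external records: title -> citation count.
s2_citations = {"Estimating the causal effect of early arXiving on paper acceptance": 12}

matched = citations_for("Estimating the Causal Effect of Early ArXiving on Paper Acceptance", s2_citations)
unmatched = citations_for("A Completely Unrelated Submission Title", s2_citations)
print(matched, unmatched)
```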

Difference-in-difference (DiD) style estimator via NCO

The estimator is motivated by an equivalence between NCO adjustment and DiD (Sofer et al., 2016, cited in §4.1):

Classic DiD uses outcomes at two time periods t=0,1 and computes: \[ \text{ATET} = \mathbb{E}[Y_1(1)-Y_0(1)] - \mathbb{E}[Y_1(0)-Y_0(0)] \] (§4.1).
With an NCO, the role of “pre-period outcome” is played by N, yielding: \[ \text{ATET} = \mathbb{E}[Y_1 - N_1] - \mathbb{E}[Y_0 - N_0] \] (§4.1).

Key additional assumption required:

Additive equi-confounding (Assumption 4.1) states that unobserved confounding shifts the outcome Y and the negative control N by the same amount on the additive scale, so the confounding bias cancels in the DiD contrast (§4.1).
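
A minimal sketch of the NCO contrast on a matched sample, using plain sample means, is given below. The paper may estimate the conditional means with a logistic model instead, so the plain-means version, the function name `nco_did_atet`, and the toy data are assumptions for illustration.

```python
import numpy as np

def nco_did_atet(y_treated, n_treated, y_control, n_control):
    """DiD-style contrast: (E[Y|A=1] - E[N|A=1]) - (E[Y|A=0] - E[N|A=0]).

    Returns (adjusted, unadjusted, nco_gap), where
    adjusted = unadjusted - nco_gap.
    """
    y_t, n_t = np.asarray(y_treated, dtype=float), np.asarray(n_treated, dtype=float)
    y_c, n_c = np.asarray(y_control, dtype=float), np.asarray(n_control, dtype=float)
    unadjusted = y_t.mean() - y_c.mean()   # raw acceptance gap
    nco_gap = n_t.mean() - n_c.mean()      # gap in the negative control
    return unadjusted - nco_gap, unadjusted, nco_gap

# Hypothetical matched sample of 10 treated and 10 control papers.
y_t = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]   # acceptance, treated
n_t = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]   # highly-cited indicator, treated
y_c = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]   # acceptance, control
n_c = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]   # highly-cited indicator, control

adjusted, unadjusted, nco_gap = nco_did_atet(y_t, n_t, y_c, n_c)
print(f"unadjusted={unadjusted:.2f}, NCO gap={nco_gap:.2f}, adjusted={adjusted:.2f}")
```

In this toy sample the acceptance gap and the citation-indicator gap are equal, so the adjusted estimate is zero: the whole raw difference is attributed to confounding shared by Y and N.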

Outcome modeling details (what is specified vs missing):

Because Y is binary, §4.1 says one “may assume” a logistic model for potential outcomes and then estimate ATET via predicted conditional means; confidence intervals come from bootstrapping.
The paper does not provide concrete model training hyperparameters (e.g., optimizer, learning rate, regularization, features used in the logistic regression), so those implementation specifics cannot be summarized beyond what is written in §4.1.
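
Since the bootstrap details are not given, the following is only one plausible way to obtain an interval: a percentile bootstrap that resamples matched pairs. The function name `bootstrap_ci`, the number of resamples, and the per-pair contrasts are hypothetical.

```python
import numpy as np

def bootstrap_ci(pair_effects, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-pair contrasts.

    Each entry of pair_effects is (Y_t - N_t) - (Y_c - N_c) for one matched
    pair; resampling whole pairs keeps the pairing intact.
    """
    rng = np.random.default_rng(seed)
    pair_effects = np.asarray(pair_effects, dtype=float)
    n = len(pair_effects)
    idx = rng.integers(0, n, size=(n_boot, n))      # resample pair indices
    boot_means = pair_effects[idx].mean(axis=1)
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return float(pair_effects.mean()), float(lo), float(hi)

# Hypothetical per-pair contrasts for 16 matched pairs.
pair_effects = [1, 0, 0, -1, 0, 1, 0, 0, 1, 0, -1, 0, 0, 1, 0, 0]
point, ci_low, ci_high = bootstrap_ci(pair_effects)
print(f"estimate={point:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

An estimate would be called non-significant at the 95% level when the interval contains zero, which is the criterion the results section relies on.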

Why they dichotomize citations:

Appendix D explains dichotomization motivations:
It puts N on the same scale as binary Y, aligning with additive equi-confounding for DiD.
It intentionally ignores “minor” citation differences (e.g., 10 vs 11) when interpreting “highly cited” vs “less cited.”

They also explore a relaxation:

Appendix C introduces quantile-quantile equi-confounding (Assumption C.1) which is weaker and avoids requiring the same scale for Y and N, enabling use of continuous CC(3) as NCO (Appendix C.1; Theorem 1).

Subgroup (heterogeneity) analysis for fairness concerns (RQ1)

They stratify matched pairs by:
minimum institution rank among authors, grouped into Top-10, Top-10 to 100, Others (Figure 3a/c/e; §5.3; Appendix B.3), and
maximum author citation count, binned into <500, 500–2000, >2000 (Figure 3b/d/f; §5.3; Appendix B.4).
They estimate effects within each stratum and compare confidence intervals; overlapping intervals are treated as “no statistically significant difference” across groups (§5.3; Figure 3 caption).
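
The stratify-then-compare logic can be sketched as below. The bin labels follow the author-citation bins listed above, but how values exactly at 500 or 2000 are assigned is an assumption, and the interval endpoints are hypothetical numbers rather than values from Figure 3.

```python
def citation_bin(max_author_citations):
    """Assign a paper to an author-citation stratum (boundary handling assumed)."""
    if max_author_citations < 500:
        return "<500"
    if max_author_citations <= 2000:
        return "500-2000"
    return ">2000"

def intervals_overlap(ci_a, ci_b):
    """True if two (low, high) confidence intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Hypothetical subgroup confidence intervals, in percentage points.
subgroup_ci = {
    "<500": (-15.0, 6.0),
    "500-2000": (-7.0, 11.0),
    ">2000": (-4.0, 9.0),
}

labels = list(subgroup_ci)
all_overlap = all(
    intervals_overlap(subgroup_ci[a], subgroup_ci[b])
    for i, a in enumerate(labels)
    for b in labels[i + 1:]
)
print(citation_bin(1200), all_overlap)
```

Overlap is a conservative screen: non-overlapping 95% intervals imply a significant difference, but overlapping intervals do not by themselves rule one out, so a direct test of the difference between strata would be more sensitive.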

4. Key Insights and Innovations

(1) Applying negative control outcomes to peer-review bias with explicit causal structure.
The causal graph in Figure 1 makes unobserved “quality” (U) central and motivates NCO as a correction tool, rather than treating arXiv timing as exogenous.
Significance: it provides a principled way to argue that naive acceptance differences can be explained by unobserved factors (§5.2; Figure 2).
(2) Concrete NCO construction from citations with “first appearance” time alignment.
Instead of counting citations by calendar year, they define CC(n) relative to first public availability to avoid a built-in exposure advantage for early preprints (§4.2; Appendix A.2).
Significance: this design directly targets a specific confounding pathway (earlier availability → more citations) that would otherwise violate the negative control condition.
(3) Matching design tailored to this application (topic fine-balance + near-exact year/author-count matching).
Using tripartite matching with fine-balance for topic_cluster and near exact matching for year/n_author aims to make treated/control papers comparable on key acceptance-relevant dimensions (§4.1; Table A.1).
Significance: reduces reliance on parametric outcome modeling and helps interpret results as comparisons among similar submissions.
(4) Fairness-oriented heterogeneity tests tied to real policy concerns.
The subgroup analysis directly checks whether early arXiving benefits authors differently by institution prestige or author citation counts (RQ1; §5.3; Figure 3).
Significance: connects causal estimation to a concrete governance debate (anonymity periods) while remaining within the paper’s data/assumption boundaries (§6; Ethics Statement).

5. Experimental Analysis

Evaluation setup: data, metrics, baselines

Dataset.
Base corpus: ICLR submissions 2018–2022 (10,297 papers) from Zhang et al. (2022) (§5.1; Appendix A).
Treated units: 1,486 early-arXived papers; matched controls: 1,486 non-early papers (§4.1; §5.1; Table A.1).
Estimand and reported metric.
Primary reported quantity is ATET as a percentage-point change in acceptance probability (Eq. (1); §4; Figure 2).
“Baselines” / comparisons inside the paper.
Unadjusted-for-unobserved-confounding estimate on the matched sample (“Unadj”): controls for observed confounders via matching but does not apply NCO debiasing (§5.2; Figure 2 red).
NCO-adjusted estimates vary n and q defining \(N^{(n)}_q\) (§4.2; Figure 2 gray/black).
Subgroup variants: repeat estimation by institution rank and author citation bins (Figure 3; §5.3).
Sample size changes with n.
Because CC(n) requires enough elapsed years, they drop recent years when computing long windows (Table B.1):
- CC(1): 1,486 matched pairs (2018–2022),
- CC(2): 1,073 pairs (2018–2021),
- CC(3): 570 pairs (2018–2020).
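
The year ranges above follow a simple pattern: an n-year window drops the last n−1 submission years. The sketch below encodes that pattern and computes how much of the matched sample survives; the function name `eligible_years` is hypothetical, and the rule is inferred from the listed ranges rather than quoted from the paper.

```python
def eligible_years(n, first_year=2018, last_year=2022):
    """Submission years kept for an n-year citation window (inferred rule)."""
    return list(range(first_year, last_year - (n - 1) + 1))

# Matched-pair counts as listed above (Table B.1).
pairs_by_window = {1: 1486, 2: 1073, 3: 570}

retained = {n: pairs_by_window[n] / pairs_by_window[1] for n in pairs_by_window}
for n in (1, 2, 3):
    years = eligible_years(n)
    print(f"CC({n}): years {years[0]}-{years[-1]}, share of pairs kept = {retained[n]:.1%}")
```

The 3-year analyses therefore rest on well under half of the matched pairs, which matters when reading the width of the corresponding confidence intervals.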

Main quantitative results (with specific numbers)

Overall effect without NCO adjustment (matched sample)

Figure 2 shows “Unadj” effects (red) computed on the same subset used for each CC(n) panel:

For the 1-year-citation subset: 9.79% (Figure 2, top panel, “Unadj”).
For the 2-year-citation subset: 9.90% (Figure 2, middle panel, “Unadj”).
For the 3-year-citation subset: 10.03% (Figure 2, bottom panel, “Unadj”).

These values are interpreted in §5.2 as a sizeable association, but plausibly confounded by unobserved quality.

NCO-adjusted effects (varying `n` and `q`)

From Figure 2 (gray/black points):

Using 1-year citations as NCO:
\(N^{(1)}_{0.5}\): −0.66%
\(N^{(1)}_{0.75}\): 3.73%
\(N^{(1)}_{0.9}\): 7.56% (Figure 2, top panel)
Using 2-year citations as NCO:
\(N^{(2)}_{0.5}\): −4.06%
\(N^{(2)}_{0.75}\): 0.74%
\(N^{(2)}_{0.9}\): 5.13% (Figure 2, middle panel)
Using 3-year citations as NCO:
\(N^{(3)}_{0.5}\): −9.17%
\(N^{(3)}_{0.75}\): −2.63%
\(N^{(3)}_{0.9}\): 2.16% (Figure 2, bottom panel)
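
To see how the adjustment moves the estimate, the arithmetic below backs out the treated-minus-control gap in the negative control that each 3-year estimate would imply. This assumes the adjusted value equals the unadjusted value minus the NCO gap, as in a plain difference-of-means DiD contrast; if the paper's estimator is model-based, the implied gaps are only approximate. The inputs are the Figure 2 values quoted above.

```python
# Unadjusted and NCO-adjusted estimates for the 3-year subset (percentage points).
unadj_3yr = 10.03
adjusted_3yr = {0.5: -9.17, 0.75: -2.63, 0.9: 2.16}

# Under a plain DiD contrast: adjusted = unadjusted - (E[N | A=1] - E[N | A=0]),
# so the implied gap in the negative control is unadjusted - adjusted.
implied_nco_gap = {q: round(unadj_3yr - est, 2) for q, est in adjusted_3yr.items()}

for q, gap in implied_nco_gap.items():
    print(f"q={q}: implied gap in highly-cited share = {gap} points")
```

Under that assumption, lower thresholds imply a larger citation-indicator gap between treated and control papers, which is why the adjusted estimate falls as q decreases.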

Statistical significance:

Figure 2’s caption notes that several NCO-adjusted effects are not significant at 95% because their confidence intervals include 0 (and §5.4 emphasizes that with “stronger” NCO choices like longer windows and higher quantiles, effects become weak and often insignificant).
Without the numeric CI endpoints printed in the text, the safest grounded statement is: some settings are significant and some are not, and the “stronger” NCOs at n=3 and high q yield small, non-significant effects (Figure 2; §5.4).

Subgroup results (RQ1): no clear heterogeneity by prestige bins

Figure 3 shows subgroup ATET estimates and emphasizes overlapping confidence intervals across strata (§5.3; Figure 3 caption).
Concrete examples quoted in §5.3:
With \(N^{(3)}_q\) and no NCO adjustment (“Unadj”), ATET by institution bin is:
- All institutions: 7.5%
- Top-10: 2.37%
- Top-10 to 100: 12.11%
- Others: 12.17%
The confidence intervals overlap (§5.3; Figure 3(a)).
With \(N^{(3)}_{0.9}\) when stratifying by max author citations, ATET is:
- All authors: 2.03%
- <500: −4.62%
- 500–2000: 2.16%
- >2000: 2.42%
These are again interpreted as showing no statistically significant differences because the intervals overlap (§5.3; Figure 3(b)).

Do experiments support the claims?

The evidence strongly supports the narrower claim that naive estimates shrink substantially after NCO adjustment, consistent with unobserved confounding playing a major role (Figure 2; §5.4).
The evidence supports the RQ1 claim in the sense used here—no detectable subgroup differences under their stratifications—because subgroup confidence intervals overlap in Figure 3 and the text emphasizes wide intervals where sample sizes are small (§5.3; Appendix B; Table B.3–B.4).
The RQ2 claim (“small effect, often <4%”) is supported descriptively by Figure 2 and reiterated in §6, but it is also conditional:
effect direction and magnitude vary with n and q (Figure 2),
and some estimates remain significant, implying that either (a) early arXiving has some residual causal effect, or (b) the NCO assumptions do not fully hold (§5.4; Limitations).

Robustness / ablations / alternate assumptions

Robustness across NCO definitions:
They vary n ∈ {1,2,3} and q ∈ {0.5,0.75,0.9} (Figure 2; §4.2).
Alternate identification assumption:
Appendix C.1 uses QQ equi-confounding with continuous CC(3) and estimates:
- ATET = −4.375% with 95% bootstrap CI (−9.965%, −0.092%) (Appendix C.1).
This result is described as “very weak” evidence and is broadly consistent, in terms of statistical significance, with the main analysis’ conclusion that effects are small or near zero under stronger adjustments (Appendix C.1 discussion).

6. Limitations and Trade-offs

Single-venue scope.
The analysis only uses ICLR because it uniquely releases acceptance outcomes for all submissions, so findings may not generalize to other conferences or fields (Conclusion §6; Limitations).
Strong, partly untestable assumptions.
The NCO strategy hinges on the negative control condition that early arXiving does not causally affect long-term citations in the fixed window (§3.1; §4.2; Limitations).
The paper discusses plausible violations (e.g., “flag-planting”) but notes lack of direct empirical quantification and treats the assumption as debatable (§4.2).
Citations are an imperfect proxy for the unobserved confounder.
The method requires N to share confounders with Y (quality affects both), but citations can also be influenced by other mechanisms; the paper explicitly notes citations are “not to be confused with a measure of paper quality” (§4.2 footnote 9).
Dichotomization introduces sensitivity to threshold choice (Appendix D).
Missing data and selection effects in citation matching.
Rejected papers are less likely to have S2 matches (81% vs 99% for accepted; Table A.2), and unmatched rejected papers are assigned 0 citations (Appendix A.2). This is a reasonable operational choice but could interact with NCO construction, because “unpublished” status is correlated with rejection by definition.
Small strata in subgroup analyses.
Appendix B notes some strata have few samples, producing wide confidence intervals and limiting sensitivity to detect heterogeneity (Appendix B; Tables B.3–B.4).
Model/estimator specification details are incomplete.
While §4.1 references logistic modeling and bootstrap CIs, it does not specify the exact fitted model form, covariates used in the regression step (beyond matching), or implementation hyperparameters, limiting reproducibility from the description alone.

7. Implications and Future Directions

Implications for conference policy debates.
Under the paper’s assumptions and ICLR setting, early arXiving does not show evidence of disproportionately benefiting high-prestige authors/institutions (Figure 3; §5.3; §6; Ethics Statement). This challenges a fairness-based justification for anonymity periods, at least as tested here (§6).
Methodological implication: how to study peer-review causally without experiments.
The paper provides a template: combine rich covariates + matching + a negative control variable to address hidden confounding (Figure 1; §4). This is relevant to many social/organizational processes where “quality” is latent.
Future research suggested by the paper.
A randomized controlled trial or “randomized encouragement” design is advocated as the cleanest way to estimate the effect, despite practical difficulty (§6; Limitations).
Identify or construct alternative negative controls that better satisfy the “not affected by treatment” condition than citations might (§4.2; Limitations; Appendix C.1 discussion).
Extend beyond ICLR to test generality if comparable acceptance-label datasets become available (§6; Limitations).
Repro/Integration Guidance (when to prefer this method).
Prefer this paper’s approach over naive comparisons when:
- treatment assignment is clearly non-random (authors choose whether to preprint early),
- important confounders like “quality” are unobserved,
- and you can propose a plausible negative control outcome that shares confounding but is not causally downstream of treatment (Figure 1; §4.2).
Avoid over-interpreting point estimates when:
- the negative control assumption is questionable or cannot be defended in the domain (Limitations),
- sample sizes shrink substantially for longer-term outcomes (Table B.1),
- or subgroup bins are small (Appendix B).