Prepare for 1.0.0 release

claude · claude · commit 905bf51de789 · 2026-01-04T19:03:21.000Z
- Update version to 1.0.0 in __init__.py and pyproject.toml
- Change development status classifier to Production/Stable
- Add warning when SyntheticDiD bootstrap has >5% failure rate
- Add troubleshooting guide to documentation
- Document standard error computation differences across estimators
- Update CHANGELOG with 1.0.0 release notes
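
For reviewers, the bootstrap failure-rate check can be sketched in isolation as below. This is a minimal illustration, not the code in `diff_diff/estimators.py`: the function name `bootstrap_se_with_warning` and the `max_failure_rate` parameter are hypothetical, the warning message is shortened, and `statistics.stdev` stands in for `np.std(..., ddof=1)` so the snippet needs only the standard library.

```python
import statistics
import warnings


def bootstrap_se_with_warning(estimates, n_bootstrap, max_failure_rate=0.05):
    """Hypothetical stand-alone version of the check added to _bootstrap_se.

    `estimates` holds only the bootstrap draws that succeeded;
    `n_bootstrap` is the number of draws that were attempted.
    """
    estimates = list(estimates)
    n_successful = len(estimates)
    failure_rate = 1 - (n_successful / n_bootstrap)

    # Warn when more than max_failure_rate of the attempted draws failed.
    if failure_rate > max_failure_rate:
        warnings.warn(
            f"Only {n_successful}/{n_bootstrap} bootstrap iterations succeeded "
            f"({failure_rate:.1%} failure rate). Standard errors may be unreliable.",
            UserWarning,
            stacklevel=2,
        )

    # Sample standard deviation (ddof=1); falls back to 0.0 with fewer than 2 draws.
    se = statistics.stdev(estimates) if n_successful > 1 else 0.0
    return se, estimates


# Usage: 3 of 4 attempted draws succeeded, so the 25% failure rate triggers the warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    se, kept = bootstrap_se_with_warning([0.9, 1.1, 1.0], n_bootstrap=4)
```

As in the patch, the warning does not change the returned standard error; it only flags that the estimate rests on fewer draws than requested.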
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,37 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.0.0] - 2026-01-04
+
+### Added
+- **Goodman-Bacon decomposition** for TWFE diagnostics
+  - `BaconDecomposition` class for decomposing TWFE into weighted 2x2 comparisons
+  - `Comparison2x2` dataclass for individual comparisons (treated_vs_never, earlier_vs_later, later_vs_earlier)
+  - `BaconDecompositionResults` with weights and estimates by comparison type
+  - `bacon_decompose()` convenience function
+  - `plot_bacon()` visualization for decomposition results
+  - Integration via `TwoWayFixedEffects.decompose()` method
+- **Power analysis** for study design
+  - `PowerAnalysis` class for analytical power calculations
+  - `PowerResults` and `SimulationPowerResults` dataclasses
+  - `compute_mde()`, `compute_power()`, `compute_sample_size()` convenience functions
+  - `simulate_power()` for Monte Carlo simulation-based power analysis
+  - `plot_power_curve()` visualization for power analysis
+  - Tutorial notebook: `docs/tutorials/06_power_analysis.ipynb`
+- **Callaway-Sant'Anna multiplier bootstrap** for inference
+  - `CSBootstrapResults` with standard errors, confidence intervals, p-values
+  - Rademacher, Mammen, and Webb weight distributions
+  - Bootstrap inference for all aggregation methods
+- **Troubleshooting guide** in documentation
+- **Standard error computation guide** explaining SE differences across estimators
+
+### Changed
+- Updated package status to Production/Stable (was Alpha)
+- SyntheticDiD bootstrap now warns when >5% of iterations fail
+
+### Fixed
+- Silent bootstrap failures in SyntheticDiD now produce warnings
+
 ## [0.6.0]
 
 ### Added
@@ -136,6 +167,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   - `to_dict()` and `to_dataframe()` export methods
   - `is_significant` and `significance_stars` properties
 
+[1.0.0]: https://github.com/igerber/diff-diff/compare/v0.6.0...v1.0.0
 [0.6.0]: https://github.com/igerber/diff-diff/compare/v0.5.0...v0.6.0
 [0.5.0]: https://github.com/igerber/diff-diff/compare/v0.4.0...v0.5.0
 [0.4.0]: https://github.com/igerber/diff-diff/compare/v0.3.0...v0.4.0
diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py
@@ -85,7 +85,7 @@
     plot_sensitivity,
 )
 
-__version__ = "0.9.0"
+__version__ = "1.0.0"
 __all__ = [
     # Estimators
     "DifferenceInDifferences",
diff --git a/diff_diff/estimators.py b/diff_diff/estimators.py
@@ -1774,6 +1774,20 @@ def _bootstrap_se(
                 continue
 
         bootstrap_estimates = np.array(bootstrap_estimates)
+
+        # Warn if too many bootstrap iterations failed
+        n_successful = len(bootstrap_estimates)
+        failure_rate = 1 - (n_successful / self.n_bootstrap)
+        if failure_rate > 0.05:
+            warnings.warn(
+                f"Only {n_successful}/{self.n_bootstrap} bootstrap iterations succeeded "
+                f"({failure_rate:.1%} failure rate). Standard errors may be unreliable. "
+                f"This can occur with small samples, near-singular weight matrices, "
+                f"or insufficient pre-treatment periods.",
+                UserWarning,
+                stacklevel=2,
+            )
+
         se = np.std(bootstrap_estimates, ddof=1) if len(bootstrap_estimates) > 1 else 0.0
 
         return se, bootstrap_estimates
diff --git a/docs/choosing_estimator.rst b/docs/choosing_estimator.rst
@@ -205,6 +205,57 @@ Common Pitfalls
 
    *Solution*: Always specify ``cluster_col`` for panel data.
 
+Standard Error Methods
+----------------------
+
+Different estimators compute standard errors differently. Understanding these
+differences helps interpret results and choose appropriate inference.
+
+.. list-table::
+   :header-rows: 1
+   :widths: 20 25 55
+
+   * - Estimator
+     - Default SE Method
+     - Details
+   * - ``DifferenceInDifferences``
+     - HC1 (heteroskedasticity-robust)
+     - Uses White's robust SEs by default. Specify ``cluster_col`` for cluster-robust SEs. Use ``inference='wild_bootstrap'`` for few clusters (<30).
+   * - ``TwoWayFixedEffects``
+     - Cluster-robust (unit level)
+     - Always clusters at unit level after within-transformation. Specify ``cluster_col`` to override. Use ``inference='wild_bootstrap'`` for few clusters.
+   * - ``MultiPeriodDiD``
+     - HC1 (heteroskedasticity-robust)
+     - Same as basic DiD. Cluster-robust available via ``cluster_col``. Wild bootstrap not yet supported for multi-coefficient inference.
+   * - ``CallawaySantAnna``
+     - Analytical (simple difference)
+     - Uses simple variance of group-time means. Use ``bootstrap()`` method for multiplier bootstrap inference with proper SEs, CIs, and p-values.
+   * - ``SyntheticDiD``
+     - Bootstrap or placebo-based
+     - Default uses bootstrap resampling. Set ``n_bootstrap=0`` for placebo-based inference using pre-treatment residuals.
+
+**Recommendations by sample size:**
+
+- **Large samples (N > 1000, clusters > 50)**: Default analytical SEs are reliable
+- **Medium samples (clusters 30-50)**: Cluster-robust SEs recommended
+- **Small samples (clusters < 30)**: Use wild cluster bootstrap (``inference='wild_bootstrap'``)
+- **Very few clusters (< 10)**: Use Webb 6-point distribution (``weight_type='webb'``)
+
+**Common pitfall:** Forgetting to cluster when units are observed multiple times.
+For panel data, always cluster at the unit level unless you have a strong reason not to.
+
+.. code-block:: python
+
+   # Good: Cluster at unit level for panel data
+   did = DifferenceInDifferences()
+   results = did.fit(data, outcome='y', treated='treated',
+                     post='post', cluster_col='unit_id')
+
+   # Better for few clusters: Wild bootstrap
+   did = DifferenceInDifferences(inference='wild_bootstrap')
+   results = did.fit(data, outcome='y', treated='treated',
+                     post='post', cluster_col='state')
+
 When in Doubt
 -------------
 
diff --git a/docs/index.rst b/docs/index.rst
@@ -40,6 +40,7 @@ Quick Links
 
 - :doc:`quickstart` - Get started with basic examples
 - :doc:`choosing_estimator` - Which estimator should I use?
+- :doc:`troubleshooting` - Common issues and solutions
 - :doc:`r_comparison` - Comparison with R packages
 - :doc:`python_comparison` - Comparison with Python packages
 - :doc:`api/index` - Full API reference
@@ -51,6 +52,7 @@ Quick Links
 
    quickstart
    choosing_estimator
+   troubleshooting
    r_comparison
    python_comparison
 
diff --git a/docs/troubleshooting.rst b/docs/troubleshooting.rst
diff --git a/pyproject.toml b/pyproject.toml
