igerber
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 144 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 20 additions & 3 deletions
diff --git a/TODO.md b/TODO.md
Lines changed: 15 additions & 10 deletions
diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py
Lines changed: 42 additions & 42 deletions
diff --git a/diff_diff/diagnostics.py b/diff_diff/diagnostics.py
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,144 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [0.6.0] - 2025-01-04
+
+### Added
+- **CallawaySantAnna covariate adjustment** for conditional parallel trends
+  - Outcome regression (`estimation_method='reg'`)
+  - Inverse probability weighting (`estimation_method='ipw'`)
+  - Doubly robust estimation (`estimation_method='dr'`)
+  - Pass covariates via `covariates` parameter in `fit()`
+- **Honest DiD sensitivity analysis** (Rambachan & Roth 2023)
+  - `HonestDiD` class for computing bounds under parallel trends violations
+  - Relative magnitudes restriction (`DeltaRM`) - bounds post-treatment violations by pre-treatment
+  - Smoothness restriction (`DeltaSD`) - bounds second differences of trend violations
+  - Combined restrictions (`DeltaSDRM`)
+  - FLCI and C-LF confidence interval methods
+  - Breakdown value computation via `breakdown_value()`
+  - Sensitivity analysis over M grid via `sensitivity_analysis()`
+  - `HonestDiDResults` and `SensitivityResults` dataclasses
+  - `compute_honest_did()` convenience function
+  - `plot_sensitivity()` for sensitivity analysis visualization
+  - `plot_honest_event_study()` for event study with honest CIs
+  - Tutorial notebook: `docs/tutorials/05_honest_did.ipynb`
+- **API documentation site** with Sphinx
+  - Full API reference auto-generated from docstrings
+  - "Which estimator should I use?" decision guide
+  - Comparison with R packages (did, HonestDiD)
+  - Getting started / quickstart guide
+
+### Changed
+- Updated mypy configuration for better numpy type compatibility
+- Modernized ruff configuration to use `[tool.ruff.lint]` section
+
+### Fixed
+- Fixed 21 ruff linting issues (import ordering, unused variables, ambiguous names)
+- Fixed 94 mypy type checking issues (Optional types, numpy type casts, assertions)
+- Added a fallback `raise ValueError` at the end of `run_placebo_test()` so the function can no longer fall through without returning
+
+## [0.5.0] - 2024-12-01
+
+### Added
+- **Wild cluster bootstrap** for valid inference with few clusters
+  - Rademacher weights (default, good for most cases)
+  - Webb's 6-point distribution (recommended for <10 clusters)
+  - Mammen's two-point distribution
+  - `WildBootstrapResults` dataclass
+  - `wild_bootstrap_se()` utility function
+  - Integration with `DifferenceInDifferences` and `TwoWayFixedEffects` via `inference='wild_bootstrap'`
+- **Placebo tests module** (`diff_diff.diagnostics`)
+  - `placebo_timing_test()` - fake treatment timing test
+  - `placebo_group_test()` - fake treatment group test
+  - `permutation_test()` - permutation-based inference
+  - `leave_one_out_test()` - sensitivity to individual treated units
+  - `run_placebo_test()` - unified dispatcher for all test types
+  - `run_all_placebo_tests()` - comprehensive diagnostic suite
+  - `PlaceboTestResults` dataclass
+- **Tutorial notebooks** in `docs/tutorials/`
+  - `01_basic_did.ipynb` - Basic 2x2 DiD, formula interface, covariates, fixed effects, wild bootstrap
+  - `02_staggered_did.ipynb` - Staggered adoption with Callaway-Sant'Anna
+  - `03_synthetic_did.ipynb` - Synthetic DiD with unit/time weights
+  - `04_parallel_trends.ipynb` - Parallel trends testing and diagnostics
+- Comprehensive test coverage (380+ tests)
+
+## [0.4.0] - 2024-11-01
+
+### Added
+- **Callaway-Sant'Anna estimator** for staggered difference-in-differences
+  - `CallawaySantAnna` class with group-time ATT(g,t) estimation
+  - Support for `never_treated` and `not_yet_treated` control groups
+  - Aggregation methods: `simple`, `group`, `calendar`, `event_study`
+  - `CallawaySantAnnaResults` with group-time effects and aggregations
+  - `GroupTimeEffect` dataclass for individual effects
+- **Event study visualization** via `plot_event_study()`
+  - Works with `MultiPeriodDiDResults`, `CallawaySantAnnaResults`, or DataFrames
+  - Publication-ready formatting with customization options
+- **Group effects visualization** via `plot_group_effects()`
+- **Parallel trends testing utilities**
+  - `check_parallel_trends()` - simple slope-based test
+  - `check_parallel_trends_robust()` - Wasserstein distance test
+  - `equivalence_test_trends()` - TOST equivalence test
+
+## [0.3.0] - 2024-10-01
+
+### Added
+- **Synthetic Difference-in-Differences** (`SyntheticDiD`)
+  - Unit weight optimization for synthetic control
+  - Time weight computation for pre-treatment periods
+  - Placebo-based and bootstrap inference
+  - `SyntheticDiDResults` with weight accessors
+- **Multi-period DiD** (`MultiPeriodDiD`)
+  - Event-study style estimation with period-specific effects
+  - `MultiPeriodDiDResults` with `period_effects` dictionary
+  - `PeriodEffect` dataclass for individual period results
+- **Data preparation utilities** (`diff_diff.prep`)
+  - `generate_did_data()` - synthetic data generation
+  - `make_treatment_indicator()` - create treatment from categorical/numeric
+  - `make_post_indicator()` - create post-treatment indicator
+  - `wide_to_long()` - reshape wide to long format
+  - `balance_panel()` - ensure balanced panel data
+  - `validate_did_data()` - data validation
+  - `summarize_did_data()` - summary statistics by group
+  - `create_event_time()` - event time for staggered designs
+  - `aggregate_to_cohorts()` - aggregate to cohort means
+  - `rank_control_units()` - rank controls by similarity
+
+## [0.2.0] - 2024-09-01
+
+### Added
+- **Two-Way Fixed Effects** (`TwoWayFixedEffects`)
+  - Within-transformation for unit and time fixed effects
+  - Efficient handling of high-dimensional fixed effects via `absorb`
+- **Fixed effects support** in base `DifferenceInDifferences`
+  - `fixed_effects` parameter for dummy variable approach
+  - `absorb` parameter for within-transformation approach
+- **Cluster-robust standard errors**
+  - `cluster` parameter for cluster-robust inference
+- **Formula interface**
+  - R-style formulas like `"outcome ~ treated * post"`
+  - Support for covariates in formulas
+
+## [0.1.0] - 2024-08-01
+
+### Added
+- Initial release
+- **Basic Difference-in-Differences** (`DifferenceInDifferences`)
+  - sklearn-like API with `fit()` method
+  - Column name interface for outcome, treatment, time
+  - Heteroskedasticity-robust (HC1) standard errors
+  - `DiDResults` dataclass with ATT, SE, p-value, confidence intervals
+  - `summary()` and `print_summary()` methods
+  - `to_dict()` and `to_dataframe()` export methods
+  - `is_significant` and `significance_stars` properties
+
+[0.6.0]: https://github.com/igerber/diff-diff/compare/v0.5.0...v0.6.0
+[0.5.0]: https://github.com/igerber/diff-diff/compare/v0.4.0...v0.5.0
+[0.4.0]: https://github.com/igerber/diff-diff/compare/v0.3.0...v0.4.0
+[0.3.0]: https://github.com/igerber/diff-diff/compare/v0.2.0...v0.3.0
+[0.2.0]: https://github.com/igerber/diff-diff/compare/v0.1.0...v0.2.0
+[0.1.0]: https://github.com/igerber/diff-diff/releases/tag/v0.1.0
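Reviewer note on the 0.5.0 wild cluster bootstrap entry above: the three weight distributions it names are standard, and a small standard-library sketch may make the difference between them concrete. The names `WEIGHT_DISTRIBUTIONS`, `moments`, and `draw_cluster_weights` are hypothetical and are not part of diff-diff's API; the library's own implementation may look quite different.

```python
import math
import random

SQRT5 = math.sqrt(5.0)

# Each distribution is a list of (value, probability) pairs.
# Illustrative names only; this is not diff_diff code.
WEIGHT_DISTRIBUTIONS = {
    # Rademacher: -1 or +1 with equal probability
    "rademacher": [(-1.0, 0.5), (1.0, 0.5)],
    # Webb: six equally likely points, +/- sqrt(1/2), +/- 1, +/- sqrt(3/2)
    "webb": [
        (sign * math.sqrt(k / 2.0), 1.0 / 6.0)
        for sign in (-1.0, 1.0)
        for k in (1, 2, 3)
    ],
    # Mammen: asymmetric two-point distribution
    "mammen": [
        (-(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / (2.0 * SQRT5)),
        ((SQRT5 + 1.0) / 2.0, (SQRT5 - 1.0) / (2.0 * SQRT5)),
    ],
}


def moments(dist):
    """Return the exact (mean, variance) of a discrete distribution."""
    mean = sum(v * p for v, p in dist)
    var = sum((v - mean) ** 2 * p for v, p in dist)
    return mean, var


def draw_cluster_weights(name, n_clusters, rng):
    """Draw one bootstrap weight per cluster from the named distribution."""
    values, probs = zip(*WEIGHT_DISTRIBUTIONS[name])
    return rng.choices(values, weights=probs, k=n_clusters)


if __name__ == "__main__":
    for name, dist in WEIGHT_DISTRIBUTIONS.items():
        mean, var = moments(dist)
        print(f"{name}: mean={mean:.6f}, variance={var:.6f}")
    print(draw_cluster_weights("webb", 8, random.Random(0)))
```

All three distributions have mean 0 and variance 1. With G clusters, Rademacher weights yield at most 2^G distinct weight vectors while Webb's six points yield 6^G, which is the usual motivation for preferring Webb when there are very few clusters.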
@@ -698,14 +698,31 @@ CallawaySantAnna(
     estimation_method='dr',          # 'dr', 'ipw', or 'reg'
     alpha=0.05,                      # Significance level
     cluster=None,                    # Column for cluster SEs
-    n_bootstrap=0,                   # Must be 0 (bootstrap not yet implemented)
+    n_bootstrap=0,                   # Bootstrap iterations (0 = analytical SEs)
     seed=None                        # Random seed
 )
 ```
 
+**Covariate adjustment for conditional parallel trends:**
+
+When parallel trends only holds conditional on covariates, use the `covariates` parameter:
+
+```python
+# Doubly robust estimation with covariates
+cs = CallawaySantAnna(estimation_method='dr')  # 'dr', 'ipw', or 'reg'
+results = cs.fit(
+    data,
+    outcome='sales',
+    unit='firm_id',
+    time='year',
+    first_treat='first_treat',
+    covariates=['size', 'age', 'industry'],  # Covariates for conditional PT
+    aggregate='event_study'
+)
+```
+
 **Current limitations:**
-- Bootstrap inference (`n_bootstrap > 0`) is not yet implemented
-- Covariate adjustment for conditional parallel trends is not yet implemented
+- Bootstrap inference (`n_bootstrap > 0`) is not yet fully implemented
 
 ### Event Study Visualization
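Reviewer note on the README.md hunk above: a toy example may help show why covariate adjustment matters when parallel trends only holds conditionally. The sketch below uses made-up data and hypothetical helper names, and it is not how `CallawaySantAnna` is implemented. It illustrates the outcome-regression idea in its simplest form: with one binary covariate, the "regression" of control-group changes on the covariate reduces to a mean per stratum.

```python
from collections import defaultdict

# Hypothetical toy data: one row per unit with a binary covariate x,
# a treatment flag d, and the pre-to-post change in the outcome dy.
# The untreated trend is 1 when x == 0 and 3 when x == 1; the true effect is 2.
units = [
    {"x": 1, "d": 1, "dy": 5.0},
    {"x": 1, "d": 1, "dy": 5.0},
    {"x": 1, "d": 1, "dy": 5.0},
    {"x": 0, "d": 1, "dy": 3.0},
    {"x": 1, "d": 0, "dy": 3.0},
    {"x": 0, "d": 0, "dy": 1.0},
    {"x": 0, "d": 0, "dy": 1.0},
    {"x": 0, "d": 0, "dy": 1.0},
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def unconditional_did(rows):
    """Difference in mean outcome changes, ignoring the covariate."""
    treated = mean(r["dy"] for r in rows if r["d"] == 1)
    control = mean(r["dy"] for r in rows if r["d"] == 0)
    return treated - control


def outcome_regression_att(rows):
    """Subtract each treated unit's predicted untreated change given x."""
    control_changes = defaultdict(list)
    for r in rows:
        if r["d"] == 0:
            control_changes[r["x"]].append(r["dy"])
    # Saturated model: the predicted untreated change is the control mean per x
    predicted = {x: mean(v) for x, v in control_changes.items()}
    return mean(r["dy"] - predicted[r["x"]] for r in rows if r["d"] == 1)


if __name__ == "__main__":
    print(unconditional_did(units))       # 3.0 (biased: groups differ in x)
    print(outcome_regression_att(units))  # 2.0 (recovers the true effect)
```

Because treated units are mostly `x == 1` and controls mostly `x == 0`, the unconditional comparison mixes the treatment effect with the difference in trends across strata; conditioning on `x` removes that component.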
 
 
@@ -267,7 +267,16 @@ Beyond the API site:
 
 ## Completed Features
 
-### v0.5.2
+### v0.6.0
+- [x] **All 1.0 Blockers Complete** - Library is now production-ready for core DiD analysis
+- [x] **API Documentation Site** (Sphinx + ReadTheDocs theme)
+  - Full API reference auto-generated from docstrings
+  - "Which estimator should I use?" decision guide
+  - Comparison with R packages
+  - Getting started / quickstart guide
+- [x] **CallawaySantAnna Covariate Adjustment**
+  - Outcome regression, IPW, and doubly robust methods
+  - Conditional parallel trends support
 - [x] **Honest DiD sensitivity analysis** (Rambachan & Roth 2023)
   - Relative magnitudes (ΔRM) and smoothness (ΔSD) restrictions
   - Combined restrictions (ΔSDRM)
@@ -276,22 +285,18 @@ Beyond the API site:
   - Sensitivity analysis over M grid
   - `plot_sensitivity()` and `plot_honest_event_study()` visualization
   - HonestDiD, HonestDiDResults, SensitivityResults classes
-  - DeltaSD, DeltaRM, DeltaSDRM restriction classes
   - Tutorial notebook: `05_honest_did.ipynb`
-  - 49 comprehensive tests
 
-### v0.5.1
-- [x] Comprehensive test coverage for `utils.py` module (72 tests)
+### v0.5.0
+- [x] Wild cluster bootstrap (Rademacher, Webb, Mammen weights)
+- [x] Placebo tests module (fake timing, fake group, permutation, leave-one-out)
+- [x] Comprehensive test coverage (380+ tests)
 - [x] Tutorial notebooks in `docs/tutorials/`
   - Basic DiD, formula interface, covariates, fixed effects, wild bootstrap
   - Staggered adoption with Callaway-Sant'Anna
   - Synthetic DiD with unit/time weights
   - Parallel trends testing and diagnostics
 
-### v0.5.0
-- [x] Wild cluster bootstrap (Rademacher, Webb, Mammen weights)
-- [x] Placebo tests module (fake timing, fake group, permutation, leave-one-out)
-
 ### v0.4.0
 - [x] Callaway-Sant'Anna estimator for staggered DiD
 - [x] Event study visualization
@@ -308,4 +313,4 @@ Beyond the API site:
 4. **Goodman-Bacon Decomposition** - Key diagnostic for TWFE users
 5. **Power Analysis** - Study design tool practitioners need
 
-With items 1-2 complete, diff-diff now has feature parity with R's `did` + `HonestDiD` ecosystem for core sensitivity analysis. The remaining items (3-5) will complete the 1.0 release.
+All 1.0 blockers (items 1-3) are now complete. diff-diff has feature parity with R's `did` + `HonestDiD` ecosystem for core DiD analysis. Items 4-5 would strengthen the 1.0 release but are not strictly required.
@@ -5,69 +5,69 @@
 using the difference-in-differences methodology.
 """
 
+from diff_diff.diagnostics import (
+    PlaceboTestResults,
+    leave_one_out_test,
+    permutation_test,
+    placebo_group_test,
+    placebo_timing_test,
+    run_all_placebo_tests,
+    run_placebo_test,
+)
 from diff_diff.estimators import (
     DifferenceInDifferences,
-    TwoWayFixedEffects,
     MultiPeriodDiD,
     SyntheticDiD,
+    TwoWayFixedEffects,
 )
-from diff_diff.staggered import (
-    CallawaySantAnna,
-    CallawaySantAnnaResults,
-    GroupTimeEffect,
+from diff_diff.honest_did import (
+    DeltaRM,
+    DeltaSD,
+    DeltaSDRM,
+    HonestDiD,
+    HonestDiDResults,
+    SensitivityResults,
+    compute_honest_did,
+    sensitivity_plot,
+)
+from diff_diff.prep import (
+    aggregate_to_cohorts,
+    balance_panel,
+    create_event_time,
+    generate_did_data,
+    make_post_indicator,
+    make_treatment_indicator,
+    rank_control_units,
+    summarize_did_data,
+    validate_did_data,
+    wide_to_long,
 )
 from diff_diff.results import (
     DiDResults,
     MultiPeriodDiDResults,
     PeriodEffect,
     SyntheticDiDResults,
 )
-from diff_diff.visualization import (
-    plot_event_study,
-    plot_group_effects,
-    plot_sensitivity,
-    plot_honest_event_study,
-)
-from diff_diff.prep import (
-    make_treatment_indicator,
-    make_post_indicator,
-    wide_to_long,
-    balance_panel,
-    validate_did_data,
-    summarize_did_data,
-    generate_did_data,
-    create_event_time,
-    aggregate_to_cohorts,
-    rank_control_units,
+from diff_diff.staggered import (
+    CallawaySantAnna,
+    CallawaySantAnnaResults,
+    GroupTimeEffect,
 )
 from diff_diff.utils import (
+    WildBootstrapResults,
     check_parallel_trends,
     check_parallel_trends_robust,
     equivalence_test_trends,
-    WildBootstrapResults,
     wild_bootstrap_se,
 )
-from diff_diff.diagnostics import (
-    PlaceboTestResults,
-    run_placebo_test,
-    placebo_timing_test,
-    placebo_group_test,
-    permutation_test,
-    leave_one_out_test,
-    run_all_placebo_tests,
-)
-from diff_diff.honest_did import (
-    HonestDiD,
-    HonestDiDResults,
-    SensitivityResults,
-    DeltaSD,
-    DeltaRM,
-    DeltaSDRM,
-    compute_honest_did,
-    sensitivity_plot,
+from diff_diff.visualization import (
+    plot_event_study,
+    plot_group_effects,
+    plot_honest_event_study,
+    plot_sensitivity,
 )
 
-__version__ = "0.5.0"
+__version__ = "0.6.0"
 __all__ = [
     # Estimators
     "DifferenceInDifferences",
 
@@ -360,6 +360,9 @@ def run_placebo_test(
             **estimator_kwargs
         )
 
+    # Unreachable if the validation above is correct; raise instead of implicitly returning None
+    raise ValueError(f"Unknown test type: {test_type}")
+
 
 def placebo_timing_test(
     data: pd.DataFrame,
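Reviewer note on the `run_placebo_test()` hunk above: the pattern being fixed is a dispatcher whose `if`/`elif` chain covers every validated option but has no final branch, so a static type checker such as mypy can flag that the function may end without returning. A minimal sketch of the pattern follows; `run_test`, `_fake_timing`, and `_fake_group` are hypothetical stand-ins, not the functions in `diff_diff.diagnostics`.

```python
def _fake_timing() -> str:
    # Hypothetical stand-in for a real placebo test
    return "timing"


def _fake_group() -> str:
    # Hypothetical stand-in for a real placebo test
    return "group"


def run_test(test_type: str) -> str:
    """Dispatch to a test by name, never returning None implicitly."""
    valid_types = ("fake_timing", "fake_group")
    if test_type not in valid_types:
        raise ValueError(
            f"Unknown test type: {test_type}. Valid types: {valid_types}"
        )

    if test_type == "fake_timing":
        return _fake_timing()
    elif test_type == "fake_group":
        return _fake_group()

    # Reached only if valid_types gains an entry without a matching branch.
    # Raising keeps the declared return type honest and fails loudly.
    raise ValueError(f"Unknown test type: {test_type}")


if __name__ == "__main__":
    print(run_test("fake_timing"))
```

The trailing `raise` also guards against a future edit that adds a name to the validation list but forgets the corresponding branch, which would otherwise return `None` silently.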