docs: add utility doctest examples #804

Open
EKtheSage wants to merge 4 commits into casact:main from EKtheSage:docs/704-utility-examples

Conversation

@EKtheSage
Contributor

@EKtheSage EKtheSage commented May 16, 2026

Summary: Add Sphinx doctest examples for the PatsyFormula utility docs. This PR is split from the larger #792 work and intentionally excludes .github/workflows/sync-main-to-docs.yml. Refs #704.


Note

Low Risk
Documentation and doctest strings only; no changes to implementation or behavior.

Overview
Adds Sphinx doctest-backed docstrings to several public utilities in chainladder/utils/utility_functions.py, extending narrative Parameters/Returns text where missing and illustrating real workflows (sample triangles, estimators, round-trips).

Serialization: read_pickle documents dill round-trip fidelity for fitted Development; read_json shows restoring estimator params from to_json output.

Triangle ops: concat demonstrates stacking paid vs incurred along axis=1; minimum / maximum show low- and high-side ultimate scenarios across two chainladder runs.

ML prep: PatsyFormula gains two examples—TweedieGLM with C(development) + C(origin) and a DevelopmentML + sklearn Pipeline using the same R-style formulas.

Reviewed by Cursor Bugbot for commit 7f5c670. Bugbot is set up for automated code reviews on this repo.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 0a7c2f9.

Comment thread chainladder/utils/utility_functions.py
@codecov

codecov Bot commented May 16, 2026

Codecov Report

❌ Patch coverage is 70.96774% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.69%. Comparing base (72b270c) to head (7f5c670).
⚠️ Report is 222 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| chainladder/utils/utility_functions.py | 70.96% | 10 Missing and 8 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #804      +/-   ##
==========================================
+ Coverage   86.23%   88.69%   +2.46%     
==========================================
  Files          86       89       +3     
  Lines        4947     5052     +105     
  Branches      643      645       +2     
==========================================
+ Hits         4266     4481     +215     
+ Misses        484      425      -59     
+ Partials      197      146      -51     
| Flag | Coverage Δ |
|---|---|
| unittests | 88.69% <70.96%> (+2.46%) ⬆️ |
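
The percentages in the report can be cross-checked from the counts in the coverage diff. The sketch below rests on two assumptions that the report does not state: that Codecov truncates to two decimals rather than rounding (inferred from 70.96774% appearing as 70.96% in the table), and that the patch touches 62 lines (back-solved from 18 uncovered lines at 70.96774%).

```python
import math


def pct_truncated(hits, lines, places=2):
    """Coverage percentage, truncated (not rounded) to `places` decimals."""
    factor = 10 ** places
    return math.floor(hits / lines * 100 * factor) / factor


# Hits and Lines exactly as shown in the coverage diff.
base = pct_truncated(4266, 4947)   # main
head = pct_truncated(4481, 5052)   # this PR
delta = round(head - base, 2)

# Patch coverage: 10 missing + 8 partial lines are counted as uncovered.
uncovered = 10 + 8
patch_lines = 62                   # assumption: inferred, not stated in the report
patch = (patch_lines - uncovered) / patch_lines * 100

print(base, head, delta)           # 86.23 88.69 2.46
print(round(patch, 5))             # 70.96774
```

With plain rounding the head figure would come out as 88.70%, which is why truncation is the working assumption here.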

@henrydingliu
Collaborator

please pull main and incorporate recent changes

@EKtheSage EKtheSage force-pushed the docs/704-utility-examples branch from 0a7c2f9 to 9175ae7 on May 16, 2026 20:31
Comment thread chainladder/utils/utility_functions.py Outdated

.. testcode::

    clrd = cl.load_sample("clrd").groupby("LOB").sum().iloc[:2]
Collaborator

test demonstrates that concatting identical columns doesn't do anything, which doesn't match the example text.

def minimum(x1, x2):
    """Element-wise minimum of two triangles (delegates to ``Triangle.minimum``).

    Examples
Collaborator

we need more basic docstring before a doctest. what's x1? what's x2?

Comment thread chainladder/utils/utility_functions.py Outdated

Examples
--------
Cap a triangle cell-by-cell by comparing it with another triangle of limits.
Collaborator

are we certain this is true? can x2 be a scalar?
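
One way to settle the parameter questions in this thread is to describe `x1` and `x2` under Parameters and state the scalar case explicitly before the Examples block. The sketch below is a pure-Python stand-in, not chainladder code: `elementwise_min` is a hypothetical helper that works on nested lists, and whether `Triangle.minimum` itself accepts a scalar still has to be confirmed against the library.

```python
from numbers import Number


def elementwise_min(x1, x2):
    """Element-wise minimum of a 2-D grid and a second operand.

    Hypothetical stand-in for illustration; not the chainladder implementation.

    Parameters
    ----------
    x1 : list of list of float
        Grid of values to compare.
    x2 : list of list of float or float
        Either a grid with the same shape as ``x1`` or a scalar that is
        compared against every cell of ``x1``.

    Returns
    -------
    list of list of float
        Grid holding the smaller value at each position.
    """
    if isinstance(x2, Number):
        # Scalar case: compare every cell against the same constant.
        return [[min(cell, x2) for cell in row] for row in x1]
    if len(x1) != len(x2) or any(len(a) != len(b) for a, b in zip(x1, x2)):
        raise ValueError("x1 and x2 must have the same shape")
    # Grid case: compare cells position by position.
    return [[min(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(x1, x2)]


values = [[100.0, 150.0], [120.0, 180.0]]
limits = [[110.0, 140.0], [130.0, 170.0]]

print(elementwise_min(values, limits))  # [[100.0, 140.0], [120.0, 170.0]]
print(elementwise_min(values, 125.0))   # [[100.0, 125.0], [120.0, 125.0]]
```

Showing both calls in the example makes the docstring's claim about `x2` testable rather than asserted.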

Comment thread chainladder/utils/utility_functions.py
def read_json(json_str, array_backend=None):
    """Deserialize JSON produced by ``to_json`` (triangle, estimator, or pipeline).

    Examples
Collaborator

this example feels empty without seeing the actual json string. please follow the example from pandas
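
A pandas-style example prints the serialized string first and only then shows the restored object. The sketch below uses the standard-library `json` module alone. The payload layout (the `"estimator"` and `"params"` keys) is made up for illustration and is not the schema that chainladder's `to_json` emits.

```python
import json

# Hypothetical payload: the key names are illustrative, not chainladder's schema.
params = {"average": "simple", "n_periods": 4}
payload = json.dumps({"estimator": "Development", "params": params}, sort_keys=True)

# Step 1: show the reader the actual string.
print(payload)
# {"estimator": "Development", "params": {"average": "simple", "n_periods": 4}}

# Step 2: restore from the string and show that the parameters survive.
restored = json.loads(payload)
print(restored["params"])
# {'average': 'simple', 'n_periods': 4}
```

Sorting the keys keeps the printed string stable, which matters when the output is compared verbatim in a doctest.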

    print(round(float(by_dev.ldf_.values[0, 0, 0, 0]), 6))
    print(round(float(by_both.ldf_.values[0, 0, 0, 0]), 6))

.. testoutput::
Collaborator

should we be showing all the numbers?

…henrydingliu

- read_pickle: show fitted Development estimator round-trip via pickle, verify transform works after restore
- read_json: show full Pipeline serialization round-trip with step names and params
- concat: show paid+incurred column join enabling MunichAdjustment directly
- minimum: compare volume vs simple CL ultimates, pick element-wise lower for low-side scenario
- maximum: same comparison, pick element-wise higher for high-side scenario
- PatsyFormula: clarify when to use custom DevelopmentML pipeline vs TweedieGLM; show ldf_ output instead of coefficient count
Comment thread chainladder/utils/utility_functions.py Outdated
import chainladder as cl

tri = cl.load_sample("raa")
dev = cl.Development(average="volume").fit(tri)
Collaborator

to demonstrate that to_pickle does something, we should use non-default parameters. something like avg = simple, n = 4.
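
The reasoning behind this request: if a restore silently fell back to defaults, an example built on default parameters would still pass. The sketch below illustrates that with plain dictionaries and the standard-library `pickle` module. `DEFAULTS` and `lossy_round_trip` are hypothetical, and the default values shown (`average="volume"`, `n_periods=-1`) are assumed rather than taken from this PR.

```python
import pickle

# Assumed defaults, for illustration only.
DEFAULTS = {"average": "volume", "n_periods": -1}


def round_trip(params):
    """Faithful round-trip through pickle."""
    return pickle.loads(pickle.dumps(params))


def lossy_round_trip(params):
    """Deliberately broken restore that discards the input and returns defaults."""
    pickle.dumps(params)
    return dict(DEFAULTS)


custom = {"average": "simple", "n_periods": 4}

# With default parameters the broken restore goes unnoticed.
print(lossy_round_trip(DEFAULTS) == DEFAULTS)  # True
# With non-default parameters the loss is visible.
print(lossy_round_trip(custom) == custom)      # False
# A faithful round-trip preserves the non-default values.
print(round_trip(custom) == custom)            # True
```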

dev.to_pickle(p)
restored = cl.read_pickle(p)
os.remove(p)
print(restored.transform(tri).ldf_.values[0, 0, 0, :4].round(4))
Collaborator

can we print the full ldf_ from both the original and the restored estimators?

Comment thread chainladder/utils/utility_functions.py Outdated
combined = cl.concat([paid, incurred], axis=1)
adj = cl.MunichAdjustment(paid_to_incurred=("CumPaidLoss", "IncurLoss"))
result = adj.fit_transform(combined)
print(result.ldf_["CumPaidLoss"].values[0, 0, 0, :4].round(4))
Collaborator

good use case for concat. can we focus the test output around concat only?

@kennethshsu
Collaborator

@EKtheSage are you interested in finishing up this PR?

- read_pickle: use non-default params (average=simple, n_periods=4),
  print ldf_ from both original and restored estimators, and call
  .transform() on restored to prove it is still functional
- read_json: show the full serialized JSON string before round-tripping,
  following pandas docstring style
- concat: remove MunichAdjustment output; focus on concat result only
  by printing combined.columns
- minimum/maximum: add prose descriptions for x1 and x2 parameters,
  confirming x2 can be a scalar
- maximum: trim testoutput to show only high_side result
@EKtheSage
Contributor Author

@henrydingliu thanks for the detailed review. All comments have been addressed in the latest commit. Summary below:

to_pickle / read_pickle (lines 291, 301, 307)

  • Used a Development transformer with non-default params (average='simple', n_periods=4) to demonstrate pickling does something meaningful
  • Now prints ldf_ from both the original and restored estimators side-by-side to show parameters are preserved
  • Added an explicit restored.transform(tri) call to prove the restored estimator is still functional as a transformer

read_json (line 451)

  • Replaced the Pipeline round-trip with a Development example that prints the full serialized JSON string before reconstructing, following pandas docstring style

concat (lines 678, 696)

  • Removed the MunichAdjustment code and output; the example now focuses on concat itself by printing list(combined.columns) to show the two columns were merged into one triangle

minimum / maximum parameters (lines 793, 795)

  • Added prose descriptions for x1 and x2 in both functions, clarifying that x2 can be a scalar (element-wise comparison against a constant value)

maximum output (line 891)

  • Removed the intermediate ult_vol and ult_sim print lines; testoutput now shows only the high_side result
