feat(data-wizard): add progress_field for the AppliedControlRecordConsumer#4082
feat(data-wizard): add progress_field for the AppliedControlRecordConsumer#4082monsieurswag wants to merge 5 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughA ChangesProgress Field Addition to AppliedControl
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/data_wizard/views.py`:
- Line 910: Normalize and validate the progress_field before assigning it to the
data dict: read record.get("progress_field"), coerce/parsenon-numeric values to
0, convert floats/strings to an int, then clamp the value into the range 0–100
and assign that sanitized int instead of the raw value; update the code path
that builds the data dict (the place using record.get("progress_field") or 0) so
it uses the normalized value to avoid validation failures during spreadsheet
imports.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ef1aa301-cd16-4117-8a10-d1615ab1e2d8
📒 Files selected for processing (2)
backend/data_wizard/views.pydocumentation/data_wizard_analysis.md
| "expiry_date": _parse_date(record.get("expiry_date")), | ||
| "start_date": _parse_date(record.get("start_date")), | ||
| "observation": record.get("observation", ""), | ||
| "progress_field": record.get("progress_field") or 0, |
There was a problem hiding this comment.
Normalize and validate progress_field before assigning it.
At Line 910, the new field is passed raw (record.get("progress_field") or 0). For spreadsheet imports, numeric cells can arrive as float/string forms and cause avoidable validation failures. Parse to int and enforce [0, 100] consistently before building data.
Proposed fix
+ raw_progress = record.get("progress_field")
+ if raw_progress in (None, ""):
+ progress_field = 0
+ else:
+ try:
+ progress_field = int(float(raw_progress))
+ except (TypeError, ValueError):
+ return {}, Error(
+ record=record,
+ error="progress_field must be an integer between 0 and 100",
+ )
+ if not (0 <= progress_field <= 100):
+ return {}, Error(
+ record=record,
+ error="progress_field must be an integer between 0 and 100",
+ )
+
data = {
"ref_id": record.get("ref_id", ""),
"name": name,
"description": record.get("description", ""),
"category": category,
"folder": domain,
"status": record.get("status", "to_do"),
"priority": priority,
"csf_function": csf_function,
"effort": effort,
"control_impact": control_impact,
"link": record.get("link", ""),
"eta": _parse_date(record.get("eta")),
"expiry_date": _parse_date(record.get("expiry_date")),
"start_date": _parse_date(record.get("start_date")),
"observation": record.get("observation", ""),
- "progress_field": record.get("progress_field") or 0,
+ "progress_field": progress_field,
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/data_wizard/views.py` at line 910, Normalize and validate the
progress_field before assigning it to the data dict: read
record.get("progress_field"), coerce/parsenon-numeric values to 0, convert
floats/strings to an int, then clamp the value into the range 0–100 and assign
that sanitized int instead of the raw value; update the code path that builds
the data dict (the place using record.get("progress_field") or 0) so it uses the
normalized value to avoid validation failures during spreadsheet imports.
There was a problem hiding this comment.
It does work, but yes it would be better to make the int conversion explicit.
There was a problem hiding this comment.
@monsieurswag, here's the updated implementation that replaces the current partial handling. The key improvement is using int(float(...)) to safely handle both integer and float values that Excel commonly produces (e.g., 75.0), plus an explicit [0, 100] range guard:
raw_progress = record.get("progress_field")
if raw_progress in (None, ""):
progress_field = 0
else:
try:
progress_field = int(float(raw_progress))
except (TypeError, ValueError):
return {}, Error(
record=record,
error="progress_field must be an integer between 0 and 100",
)
if not (0 <= progress_field <= 100):
return {}, Error(
record=record,
error="progress_field must be an integer between 0 and 100",
)
data = {
# ...
"progress_field": progress_field,
}This handles all the common cases from spreadsheet imports:
None/ empty string → defaults to0- Integer cell value (
75) →int(float(75))→75✓ - Float cell value (
75.0) →int(float(75.0))→75✓ - String
"75"→int(float("75"))→75✓ - Out-of-range (e.g.,
150) → explicit error returned to the caller
🧠 Learnings used
Learnt from: ab-smith
Repo: intuitem/ciso-assistant-community PR: 3546
File: backend/core/serializers.py:3360-3403
Timestamp: 2026-02-25T14:44:51.479Z
Learning: Guideline: In this codebase, Huey tasks decorated with task() from huey.contrib.djhuey are enqueued automatically when called directly because the Huey configuration sets immediate to False. Therefore, direct calls to task()-decorated functions will queue asynchronously; use the .delay() method only when you need to pass scheduling parameters (e.g., delay or eta). This pattern applies across Python files defining Huey tasks.
Learnt from: monsieurswag
Repo: intuitem/ciso-assistant-community PR: 3438
File: backend/core/views.py:6824-6836
Timestamp: 2026-03-02T15:08:25.027Z
Learning: In this repository, when performing bulk_create on ORM models, be aware that certain models may have brittle DB-managed primary keys or defaults that do not get populated reliably in bulk operations. Specifically, the models RequirementMapping, HistoricalMetric, PersonalAccessToken, and Event can experience issues with PK population. If you bulk_create instances of these models, ensure you refresh or re-fetch their PKs after the bulk operation on backends that return PKs (i.e., where PKs are only assigned upon persistence). Apply this check to Python/Django backend code (e.g., files under backend/) and ensure subsequent references to these instances use the refreshed PKs to avoid inconsistent foreign keys or integrity errors.
Learnt from: Mohamed-Hacene
Repo: intuitem/ciso-assistant-community PR: 3891
File: backend/tprm/views.py:701-724
Timestamp: 2026-04-09T06:35:21.376Z
Learning: In backend/tprm/views.py and backend/data_wizard/views.py, the Contract export methods (ContractViewSet.export_config and export_ecosystem) intentionally emit multi-value solution references/names as newline-delimited strings (e.g., "\n".join(...)). This is round-trip compatible: the data wizard importer (_process_contracts in backend/data_wizard/views.py) parses newline/pipe/comma-separated solution_ref_id values into a list of Solution objects and updates the contract's solutions M2M relation accordingly. When reviewing these export/import flows, do NOT flag the multi-value newline-delimited serialization as a potential data-loss issue, since it is expected and has corresponding parsing logic.
Learnt from: ab-smith
Repo: intuitem/ciso-assistant-community PR: 3893
File: backend/sec_intel/views.py:39-42
Timestamp: 2026-04-10T06:55:33.554Z
Learning: When reviewing code that calls `BaseModelViewSet._get_accessible_ids_map(related_models)` (from `backend/core/views.py` in this repo), treat the parameter as an **iterable of model classes** (not a `{field_name: model_class}` dict). The established/expected call pattern is `_get_accessible_ids_map(set(field_models.values()))`, where `field_models` comes from `_get_fieldsrelated_map(serializer)`. Do not flag this usage as a bug in future reviews.
Learnt from: martinzerty
Repo: intuitem/ciso-assistant-community PR: 3924
File: backend/core/serializers.py:1385-1385
Timestamp: 2026-04-15T07:05:44.003Z
Learning: For this repository’s Evidence model, `Evidence.filename` is a regular method (not a `property`). When reviewing code, accept and require the call syntax `evidence.filename()` (with parentheses). Do not flag `evidence.filename()` as incorrect anywhere in the backend Python code that uses this model.
Learnt from: tchoumi313
Repo: intuitem/ciso-assistant-community PR: 3665
File: backend/data_wizard/views.py:447-451
Timestamp: 2026-05-04T14:41:53.700Z
Learning: In this repo, Python version for backend code must be determined from backend/pyproject.toml and backend/Dockerfile (Python >=3.14,<3.15 / python:3.14-slim). Do not infer the backend Python version from automation/.python-version or cli/.python-version. When reviewing backend/ (backend/) code, it is valid to use Python 3.14+ syntax (including PEP 696 default type parameters, e.g. `class Foo[T = None]`).
There was a problem hiding this comment.
The [0,100] should be enforced by the django ORM directly, having duplicated bound checking logic just add useless boilerplate code.
This is okay as this field isn't really security-sensitive in itself.
There was a problem hiding this comment.
@monsieurswag, that's a fair point — the Django model validators are the right layer for range enforcement. The minimal version that only handles type coercion is:
raw_progress = record.get("progress_field")
if raw_progress in (None, ""):
progress_field = 0
else:
try:
progress_field = int(float(raw_progress))
except (TypeError, ValueError):
progress_field = 0
data = {
# ...
"progress_field": progress_field,
}This safely handles None, empty strings, Excel floats (75.0), and strings ("75"), and leaves out-of-range values to Django's model validator to reject during serializer validation — no boilerplate duplication needed.
✏️ Learnings added
Learnt from: monsieurswag
Repo: intuitem/ciso-assistant-community PR: 4082
File: backend/data_wizard/views.py:0-0
Timestamp: 2026-05-06T14:32:09.893Z
Learning: In backend/data_wizard/views.py, do NOT duplicate Django model-level range validators (e.g., MinValueValidator/MaxValueValidator) in RecordConsumer.prepare_create() methods. Type coercion (e.g., int(float(...))) is acceptable for handling spreadsheet input edge cases, but bound checking (e.g., 0 <= value <= 100) should be left to the Django ORM/serializer validation layer. This applies to all fields in data wizard consumer classes.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: ab-smith
Repo: intuitem/ciso-assistant-community PR: 3546
File: backend/core/serializers.py:3360-3403
Timestamp: 2026-02-25T14:44:51.479Z
Learning: Guideline: In this codebase, Huey tasks decorated with task() from huey.contrib.djhuey are enqueued automatically when called directly because the Huey configuration sets immediate to False. Therefore, direct calls to task()-decorated functions will queue asynchronously; use the .delay() method only when you need to pass scheduling parameters (e.g., delay or eta). This pattern applies across Python files defining Huey tasks.
Learnt from: monsieurswag
Repo: intuitem/ciso-assistant-community PR: 3438
File: backend/core/views.py:6824-6836
Timestamp: 2026-03-02T15:08:25.027Z
Learning: In this repository, when performing bulk_create on ORM models, be aware that certain models may have brittle DB-managed primary keys or defaults that do not get populated reliably in bulk operations. Specifically, the models RequirementMapping, HistoricalMetric, PersonalAccessToken, and Event can experience issues with PK population. If you bulk_create instances of these models, ensure you refresh or re-fetch their PKs after the bulk operation on backends that return PKs (i.e., where PKs are only assigned upon persistence). Apply this check to Python/Django backend code (e.g., files under backend/) and ensure subsequent references to these instances use the refreshed PKs to avoid inconsistent foreign keys or integrity errors.
Learnt from: Mohamed-Hacene
Repo: intuitem/ciso-assistant-community PR: 3891
File: backend/tprm/views.py:701-724
Timestamp: 2026-04-09T06:35:21.376Z
Learning: In backend/tprm/views.py and backend/data_wizard/views.py, the Contract export methods (ContractViewSet.export_config and export_ecosystem) intentionally emit multi-value solution references/names as newline-delimited strings (e.g., "\n".join(...)). This is round-trip compatible: the data wizard importer (_process_contracts in backend/data_wizard/views.py) parses newline/pipe/comma-separated solution_ref_id values into a list of Solution objects and updates the contract's solutions M2M relation accordingly. When reviewing these export/import flows, do NOT flag the multi-value newline-delimited serialization as a potential data-loss issue, since it is expected and has corresponding parsing logic.
Learnt from: ab-smith
Repo: intuitem/ciso-assistant-community PR: 3893
File: backend/sec_intel/views.py:39-42
Timestamp: 2026-04-10T06:55:33.554Z
Learning: When reviewing code that calls `BaseModelViewSet._get_accessible_ids_map(related_models)` (from `backend/core/views.py` in this repo), treat the parameter as an **iterable of model classes** (not a `{field_name: model_class}` dict). The established/expected call pattern is `_get_accessible_ids_map(set(field_models.values()))`, where `field_models` comes from `_get_fieldsrelated_map(serializer)`. Do not flag this usage as a bug in future reviews.
Learnt from: martinzerty
Repo: intuitem/ciso-assistant-community PR: 3924
File: backend/core/serializers.py:1385-1385
Timestamp: 2026-04-15T07:05:44.003Z
Learning: For this repository’s Evidence model, `Evidence.filename` is a regular method (not a `property`). When reviewing code, accept and require the call syntax `evidence.filename()` (with parentheses). Do not flag `evidence.filename()` as incorrect anywhere in the backend Python code that uses this model.
Learnt from: tchoumi313
Repo: intuitem/ciso-assistant-community PR: 3665
File: backend/data_wizard/views.py:447-451
Timestamp: 2026-05-04T14:41:53.700Z
Learning: In this repo, Python version for backend code must be determined from backend/pyproject.toml and backend/Dockerfile (Python >=3.14,<3.15 / python:3.14-slim). Do not infer the backend Python version from automation/.python-version or cli/.python-version. When reviewing backend/ (backend/) code, it is valid to use Python 3.14+ syntax (including PEP 696 default type parameters, e.g. `class Foo[T = None]`).
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/data_wizard/views.py (1)
903-907:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
progress_fieldcoercion currently drops valid numeric inputs to0Line 903-907 only accepts digit-only strings. Numeric spreadsheet values that come as
int/float(or strings like"42.0") are silently reset to0, which corrupts imported progress.Proposed fix
- record_progress = record.get("progress_field") - if isinstance(record_progress, str) and record_progress.isdigit(): - progress = int(record_progress) - else: - progress = 0 + record_progress = record.get("progress_field") + if record_progress in (None, ""): + progress = 0 + else: + try: + progress = int(float(str(record_progress).strip())) + except (TypeError, ValueError): + progress = 0Based on learnings: "Do not duplicate Django model-level range validators... allow type coercion (e.g., int(float(value))) for spreadsheet input edge cases... leave bound checking to the ORM/serializer."
Also applies to: 925-925
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/data_wizard/views.py` around lines 903 - 907, The current coercion for record_progress only accepts digit-only strings and resets other valid numerics to 0; update the handling of record_progress (and the similar occurrence around the variable used at line ~925) to try converting numeric inputs more flexibly: if record_progress is an int/float use it, else if it's a string attempt int(float(record_progress)) inside a try/except to handle "42.0" or spreadsheet floats, falling back to 0 only on conversion failure; do not perform any range validation here—leave bounds checking to the ORM/serializer and ensure you update the assignment to the progress variable accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/data_wizard/views.py`:
- Around line 902-908: The file backend/data_wizard/views.py fails the formatter
check; run the project formatter (e.g., `ruff format
backend/data_wizard/views.py` or `ruff format .`) and commit the changes so CI
passes. Locate the area around the record handling code (look for
record_progress = record.get("progress_field") and the subsequent progress
assignment) to ensure only style/formatting changes are applied and no logic
around the record_progress/progress variables is altered.
---
Duplicate comments:
In `@backend/data_wizard/views.py`:
- Around line 903-907: The current coercion for record_progress only accepts
digit-only strings and resets other valid numerics to 0; update the handling of
record_progress (and the similar occurrence around the variable used at line
~925) to try converting numeric inputs more flexibly: if record_progress is an
int/float use it, else if it's a string attempt int(float(record_progress))
inside a try/except to handle "42.0" or spreadsheet floats, falling back to 0
only on conversion failure; do not perform any range validation here—leave
bounds checking to the ORM/serializer and ensure you update the assignment to
the progress variable accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 06224af1-c419-4781-b5f4-a5d405845908
📒 Files selected for processing (1)
backend/data_wizard/views.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
backend/data_wizard/views.py (1)
902-907:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
progress_fieldparsing currently drops valid spreadsheet numeric values to0.This only accepts digit-only strings, so values like
75,75.0, and"75.0"are treated as invalid and silently reset to0, which loses user data during import.Proposed fix
- record_progress = record.get("progress_field") - if isinstance(record_progress, str) and record_progress.isdigit(): - progress = int(record_progress) - else: - progress = 0 + record_progress = record.get("progress_field") + if record_progress in (None, ""): + progress = 0 + else: + try: + progress = int(float(record_progress)) + except (TypeError, ValueError): + progress = 0Based on learnings: do not duplicate Django model-level range validators in
RecordConsumer.prepare_create(); allow type coercion and leave bounds validation to ORM/serializer.Also applies to: 924-924
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/data_wizard/views.py` around lines 902 - 907, The current logic in RecordConsumer.prepare_create() (variable record_progress) only accepts digit-only strings and drops valid numeric inputs; update the parsing to coerce numeric-like values instead: if record_progress is an int or float set progress = int(record_progress); if it's a string attempt float(record_progress) then set progress = int(that_float) on success; otherwise fall back to 0. Do the same replacement for the other occurrence around the later block (the second record_progress check). Leave range/validation to the ORM/serializer and only perform safe type coercion here.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@backend/data_wizard/views.py`:
- Around line 902-907: The current logic in RecordConsumer.prepare_create()
(variable record_progress) only accepts digit-only strings and drops valid
numeric inputs; update the parsing to coerce numeric-like values instead: if
record_progress is an int or float set progress = int(record_progress); if it's
a string attempt float(record_progress) then set progress = int(that_float) on
success; otherwise fall back to 0. Do the same replacement for the other
occurrence around the later block (the second record_progress check). Leave
range/validation to the ORM/serializer and only perform safe type coercion here.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1c3d03e1-d257-40ed-a793-1f7b954576f5
📒 Files selected for processing (1)
backend/data_wizard/views.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
backend/data_wizard/views.py (1)
902-910:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
progress_fieldcoercion drops valid values and includes a debug print in request flowAt Line 906, parsing currently misses common valid inputs (
int,"75.0"), so those records silently import as0. Also, Line 903 prints on every record.Proposed fix
- record_progress = record.get("progress_field") - print(type(record_progress), record_progress) - progress = 0 - - if isinstance(record_progress, str) and record_progress.isdigit(): - progress = int(record_progress) - elif isinstance(record_progress, float): - progress = int(record_progress) + record_progress = record.get("progress_field") + progress = 0 + if record_progress not in (None, ""): + try: + progress = int(float(record_progress)) + except (TypeError, ValueError): + progress = 0Based on learnings: “Do not duplicate Django model-level range validators in
RecordConsumer.prepare_create(); allow type coercion likeint(float(value))and leave bound checks to ORM/serializer validation.Also applies to: 927-927
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/data_wizard/views.py` around lines 902 - 910, In RecordConsumer.prepare_create(), remove the debug print and change the progress_field coercion to accept ints, floats and numeric strings (including "75.0") by attempting a safe numeric cast: if record_progress is int use it, else if it's float or a numeric string try progress = int(float(record_progress)); handle exceptions by leaving progress=0 (or None) and do not perform range validation here—leave bound checks to the ORM/serializer; reference the record_progress variable and the prepare_create method for the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@backend/data_wizard/views.py`:
- Around line 902-910: In RecordConsumer.prepare_create(), remove the debug
print and change the progress_field coercion to accept ints, floats and numeric
strings (including "75.0") by attempting a safe numeric cast: if record_progress
is int use it, else if it's float or a numeric string try progress =
int(float(record_progress)); handle exceptions by leaving progress=0 (or None)
and do not perform range validation here—leave bound checks to the
ORM/serializer; reference the record_progress variable and the prepare_create
method for the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: da51b90f-0612-41c6-abd2-3a8011739459
📒 Files selected for processing (1)
backend/data_wizard/views.py
|
For whatever reason, with the following CSV, the Use the following CSV to import applied controls to recreate the bug: ref_id,name,description,domain,status,category,priority,csf_function,owner,progress_field
CTL-001,Endpoint Protection,Deploy and maintain endpoint protection on all workstations,,to_do,technical,2,protect,Desktop Support,
CTL-002,Access Review Policy,Quarterly review of user access rights,,to_do,policy,3,govern,Identity & Access Management,
CTL-003,Firewall Rule Audit,Annual review of firewall rules,,in_progress,process,2,protect,alice@example.com; Firewall Administration,10
CTL-004,Security Training,Annual security awareness training for all staff,,active,process,2,protect,Security Awareness & Training,10
CTL-005,Incident Response Plan,Maintain and test incident response procedures,,to_do,procedure,1,respond,bob@example.com,100 |
This PR follows is the fix to
SUP-1253support ticket, but hereafter described bug must be fixed before merging it.Summary by CodeRabbit