For now, it looks like many relevant building blocks already exist, but there still seems to be plenty of room for a product like this.
I think this is a strong idea, especially because you are focusing on existing-process transformation rather than generic document generation.
A BRD is usually not hard because of the writing itself. It is hard because the source material is messy, incomplete, inconsistent, and full of implicit operational knowledge:
- meeting notes
- process maps
- SOPs
- current-state walkthroughs
- AHT / touch-time / volume metrics
- stakeholder comments
- old BRDs
- exception cases
- undocumented workarounds
- half-confirmed assumptions
- operational constraints that everyone “knows” but nobody wrote down
So I would avoid framing the product only as a BRD generator.
I would frame it as a requirements discovery workbench.
The BRD can still be the final exported document, but the core product should be the structured requirements package behind it.
1. Why I think the idea is promising
The strongest part of your idea is not “LLM writes a BRD”.
The strongest part is:
messy process-discovery artifacts → reviewable requirements, gaps, edge cases, and BRD sections
That is a much more valuable workflow.
A plain LLM prompt can produce a polished document, but a real BRD needs more than polished prose. It needs:
| Need | Why it matters |
|---|---|
| Confirmed business facts | Avoids turning guesses into requirements |
| Source-backed requirements | Lets reviewers check where each requirement came from |
| Gap detection | Shows what is missing before stakeholders find it later |
| Edge-case discovery | Prevents happy-path-only BRDs |
| Operational metrics | Converts AHT, volume, SLA, rework, etc. into measurable requirements |
| Human review | Keeps SMEs / BAs / PMs accountable for final decisions |
| Traceability | Connects source artifacts → requirements → tests / Jira / implementation |
So the product boundary I would choose is:
A source-backed requirements discovery workbench for existing-process transformation.
Not:
A better BRD writer.
2. Many building blocks already exist
There are already useful pieces in the ecosystem. That is good news. It means this is technically feasible.
But those pieces mostly live at different layers.
| Area | Examples | What they solve | What they do not fully solve |
|---|---|---|---|
| Document ingestion | Docling, Docling docs, Unstructured | Parse PDFs, Office files, OCR, layouts, tables, images, transcripts | Convert artifacts into requirement-level business facts |
| RAG / knowledge layer | RAGFlow, RAGFlow docs, Dify, LlamaIndex | Retrieve relevant context from documents | Decide whether a chunk truly supports a requirement |
| Workflow automation | n8n, Flowise | Connect forms, files, LLM calls, docs, Jira, email, storage | Maintain requirement review state and evidence validity |
| Requirements generation | SpecifAI, BRD/PRD generators | Generate BRD/PRD/NFR drafts | Handle process metrics, gaps, edge cases, and source-backed review as first-class objects |
| Requirements quality / ALM | Jama Connect Advisor, Trace.Space, Visure | Analyze requirements, traceability, quality, compliance | Lightweight upstream discovery from messy process artifacts |
| RAG evaluation | Ragas, RAGAS paper | Evaluate retrieval and generation quality | Evaluate BRD-specific coverage, gaps, NFRs, edge cases, and reviewer effort |
| Spec-driven downstream | GitHub Spec Kit, Spec-driven development post | Move from specifications into AI-assisted implementation workflows | Discover and validate business requirements from messy upstream artifacts |
So I do not think the gap is “there are no tools”.
The gap is more specific:
The missing layer is a requirements layer between document AI and ALM.
In other words, many systems can parse documents, retrieve chunks, run agents, or generate documents. Fewer systems turn messy discovery artifacts into requirement objects with evidence, gaps, assumptions, edge cases, metrics, and review state.
3. The missing layer
A useful product here should not be centered around document -> LLM -> BRD.
It should be centered around:
artifacts
-> business facts
-> process observations
-> requirement candidates
-> source evidence
-> gaps
-> edge cases
-> metric-derived requirements
-> human review
-> BRD / Jira / Confluence / tests / traceability matrix
The missing layer looks like this:
| Missing layer | Why it matters |
|---|---|
| Requirement-level evidence binding | A BRD reviewer needs to know which source artifact supports each requirement. |
| Gap ledger | The most useful output is often what is missing: SLA, owner, approval rule, exception path, volume assumption, data retention, etc. |
| Process-to-requirement conversion | A process map is not yet a requirement model. The system should convert process observations into FRs, NFRs, data requirements, integration requirements, and business rules. |
| Metric-derived requirements | AHT, touch time, volume, SLA breach, rework rate, and exception rate should become requirements and business-case assumptions. |
| Review-first workflow | Human BA/SME review should be a first-class product flow, not an afterthought. |
| BRD-specific evaluation | General RAG metrics help, but BRD quality also needs coverage, grounding, gap quality, NFR coverage, edge-case coverage, and reviewer edit distance. |
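As a rough illustration of the gap ledger row above, here is a minimal Python sketch. The checklist fields are taken from the examples in that row; the `Gap` class, the `build_gap_ledger` function, and the question wording are hypothetical names chosen for illustration, not an existing API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed checklist: field names mirror the gap examples listed in the table.
REQUIRED_FIELDS = [
    "sla", "owner", "approval_rule",
    "exception_path", "volume_assumption", "data_retention",
]


@dataclass
class Gap:
    process: str
    missing_field: str
    question: str


def build_gap_ledger(process: str, known: Dict[str, Optional[str]]) -> List[Gap]:
    """Return one Gap for every checklist field that is absent or blank."""
    gaps = []
    for name in REQUIRED_FIELDS:
        value = known.get(name)
        if value is None or not value.strip():
            label = name.replace("_", " ")
            gaps.append(Gap(process, name, f"What is the {label} for {process}?"))
    return gaps


ledger = build_gap_ledger(
    "exception handling",
    {"owner": "Operations supervisor", "sla": "", "exception_path": "Manual review"},
)
for gap in ledger:
    print(gap.missing_field, "->", gap.question)
```

A real gap engine would likely combine a checklist like this with LLM checks, but even the checklist alone gives reviewers a concrete list of follow-up questions.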
4. Product framing
I would describe the product this way:
A business analysis workbench that turns fragmented process discovery artifacts into source-backed requirements, gaps, edge cases, metrics, and BRD sections.
That framing is stronger than:
Upload docs and generate a BRD.
Because “generate a BRD” can sound like a generic LLM wrapper.
The more defensible product is the structured layer underneath the BRD.
| Weak framing | Stronger framing |
|---|---|
| AI BRD Generator | Requirements Discovery Workbench |
| Upload documents, get BRD | Upload artifacts, get a reviewable requirements package |
| Prompt engineering | Guided elicitation |
| Context engineering | Evidence-bound requirement generation |
| Memory engineering | Provenance-aware project and organization memory |
| Harness engineering | BRD-specific evaluation and review loop |
| BRD document | BRD as one export view of structured requirement objects |
5. Core internal objects
I would make the internal model explicit.
The core objects should not only be documents and chunks. They should be requirement-oriented objects.
| Object | Purpose |
|---|---|
| Artifact | Uploaded source: meeting notes, SOP, process map, KPI table, transcript, old BRD |
| BusinessFact | Atomic fact extracted from an artifact |
| ProcessStep | Current-state or future-state process observation |
| RequirementCandidate | Proposed functional / non-functional / data / reporting / integration requirement |
| Evidence | Source excerpt or process-map reference supporting a requirement |
| Gap | Missing information that blocks confident requirement writing |
| EdgeCase | Exception path, failure path, unusual variant, fallback, escalation, override |
| Metric | AHT, touch time, volume, SLA, error rate, rework rate, exception rate |
| Assumption | A statement that may be useful but is not yet fully supported |
| ReviewDecision | Accept, reject, edit, needs SME review, needs more evidence |
| TraceLink | Source artifact → requirement → test case / Jira item / BRD section |
A useful requirement object might look like this:
{
  "id": "FR-014",
  "type": "functional_requirement",
  "text": "The system shall route exception cases to supervisor review.",
  "source_evidence": [
    {
      "artifact_id": "meeting_notes_2026_05_20",
      "excerpt": "Exceptions are currently reviewed manually by supervisors.",
      "support_level": "partial"
    },
    {
      "artifact_id": "current_process_map_v3",
      "step": "Manual supervisor review",
      "support_level": "strong"
    }
  ],
  "confidence": 0.74,
  "assumptions": [
    "Exception cases are a distinct case category."
  ],
  "open_questions": [
    "What exactly qualifies as an exception?",
    "Does the threshold differ by region or product?"
  ],
  "review_status": "needs_sme_review"
}
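One possible way to give that JSON shape a typed form is sketched below in Python. The field names follow the example object; the class names `Evidence` and `RequirementCandidate` come from the object table, while `from_dict` and `has_strong_evidence` are hypothetical helpers added for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Evidence:
    artifact_id: str
    support_level: str   # "strong" or "partial", as in the example object
    excerpt: str = ""    # quoted text, for notes and documents
    step: str = ""       # process-map step name, for diagrams


@dataclass
class RequirementCandidate:
    id: str
    type: str
    text: str
    source_evidence: List[Evidence] = field(default_factory=list)
    confidence: float = 0.0
    assumptions: List[str] = field(default_factory=list)
    open_questions: List[str] = field(default_factory=list)
    review_status: str = "needs_sme_review"

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "RequirementCandidate":
        """Build a typed object from the JSON shape of the example."""
        fields = dict(data)
        fields["source_evidence"] = [
            Evidence(**item) for item in data.get("source_evidence", [])
        ]
        return cls(**fields)

    def has_strong_evidence(self) -> bool:
        return any(e.support_level == "strong" for e in self.source_evidence)


raw = {
    "id": "FR-014",
    "type": "functional_requirement",
    "text": "The system shall route exception cases to supervisor review.",
    "source_evidence": [
        {"artifact_id": "meeting_notes_2026_05_20",
         "excerpt": "Exceptions are currently reviewed manually by supervisors.",
         "support_level": "partial"},
        {"artifact_id": "current_process_map_v3",
         "step": "Manual supervisor review",
         "support_level": "strong"},
    ],
    "confidence": 0.74,
    "open_questions": ["What exactly qualifies as an exception?"],
    "review_status": "needs_sme_review",
}
req = RequirementCandidate.from_dict(raw)
print(req.id, req.has_strong_evidence(), len(req.open_questions))
```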
This is the important shift.
The BRD is not just generated text. It is a view over structured requirement objects.
6. How I would reinterpret your four engineering levers
Your four levers make sense. I would operationalize them this way.
| Original lever | Practical version | What it should do |
|---|---|---|
| Prompt Engineering | Guided elicitation | Ask the right missing questions instead of relying only on free-form prompts |
| Automated Context Engineering | Evidence binding | Link each requirement to the artifacts that support it |
| Memory Engineering | Provenance-aware memory | Reuse organizational patterns while preserving where each pattern came from |
| Harness Engineering | BRD evaluation harness | Measure coverage, grounding, gaps, edge cases, and reviewer effort |
Prompt Engineering → Guided elicitation
Instead of only asking the user to provide a problem statement, I would build a structured elicitation flow.
Example questions:
| Question | Reason |
|---|---|
| What is the current process? | Establish current state |
| Which roles are involved? | Find stakeholders and permissions |
| Which systems are touched? | Identify integration requirements |
| What are the known exceptions? | Avoid happy-path-only BRDs |
| What metrics exist? | Convert AHT, volume, SLA, rework into measurable requirements |
| What is still unknown? | Create the gap ledger |
| What is the desired future state? | Separate problem statement from solution assumption |
Context Engineering → Evidence binding
RAG is useful, but the important question is not only:
Did we retrieve relevant context?
It is:
Does this context actually support this requirement?
Ragas faithfulness is a useful conceptual reference: it checks whether generated claims are supported by retrieved context. For BRDs, I would apply that idea at the requirement level.
| Requirement | Evidence | Confidence | Open question |
|---|---|---|---|
| System shall route exception cases to supervisor review. | SOP step + workshop note | Medium | What qualifies as an exception? |
| System shall reduce AHT from 12 min to 7 min. | KPI table + project objective | Low-Medium | Is 7 min a target or a hard SLA? |
| System shall validate mandatory fields before submission. | Rework notes + SOP checklist | High | Which fields are mandatory by case type? |
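To make the confidence column less of a gut feeling, one option is to derive it from the support level of each bound piece of evidence. The Python sketch below is only one possible rule set: the thresholds and the `confidence_label` function are assumptions made for illustration, not something prescribed by Ragas or any other tool.

```python
from typing import List


def confidence_label(support_levels: List[str]) -> str:
    """Map evidence support levels to a coarse confidence label.

    Assumed rules: two or more strong sources give "high", exactly one
    strong source gives "medium", partial-only support gives "low", and
    anything else is "unsupported".
    """
    strong = support_levels.count("strong")
    partial = support_levels.count("partial")
    if strong >= 2:
        return "high"
    if strong == 1:
        return "medium"
    if partial >= 1:
        return "low"
    return "unsupported"


# One partial meeting note plus one strong process-map reference.
print(confidence_label(["partial", "strong"]))
# A requirement with no bound evidence at all.
print(confidence_label([]))
```

Whatever rules are chosen, keeping them explicit means a reviewer can see why a requirement was labeled the way it was.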
Memory Engineering → Provenance-aware memory
Memory is useful, but I would avoid “the system remembers it, therefore it is true”.
I would split memory into types:
| Memory type | Example | How to use it |
|---|---|---|
| Project memory | Decisions, unresolved questions, accepted assumptions | Use as current project state |
| Organization memory | Standard BRD template, terminology, approval conventions | Use as default structure |
| Pattern memory | Past requirement patterns from similar BRDs | Use as candidates, not facts |
| Risk memory | Common missing NFRs and edge cases | Use as checklist |
| Reviewer memory | What reviewers often edit or reject | Use to improve drafts and evals |
Every memory item should carry provenance:
memory_item
-> source BRD / template / meeting / reviewer decision
-> confidence
-> last reviewed date
-> applicable domain
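A minimal Python sketch of such a memory item follows. The provenance fields mirror the list above and the usage strings paraphrase the memory-type table; the `MemoryItem` class, the `usage_for` function, and the one-year staleness cutoff are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date

# How each memory type may be used, paraphrasing the table above.
USAGE_BY_TYPE = {
    "project": "current project state",
    "organization": "default structure",
    "pattern": "candidate, not fact",
    "risk": "checklist",
    "reviewer": "draft and eval signal",
}


@dataclass(frozen=True)
class MemoryItem:
    text: str
    memory_type: str
    source: str           # source BRD / template / meeting / reviewer decision
    confidence: float
    last_reviewed: date
    domain: str


def usage_for(item: MemoryItem, today: date, max_age_days: int = 365) -> str:
    """Say how an item may be used; stale items must be re-reviewed first."""
    if (today - item.last_reviewed).days > max_age_days:
        return "stale: re-review before use"
    return USAGE_BY_TYPE[item.memory_type]


pattern = MemoryItem(
    text="Exception cases are routed to a supervisor queue.",
    memory_type="pattern",
    source="historical BRD (hypothetical)",
    confidence=0.6,
    last_reviewed=date(2026, 1, 10),
    domain="claims processing",
)
print(usage_for(pattern, today=date(2026, 5, 20)))
```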
Harness Engineering → BRD-specific evaluation
This is especially important.
A polished BRD is not necessarily a good BRD. It can be clear, fluent, and still wrong.
I would evaluate the system on BRD-specific metrics:
| Metric | What it measures |
|---|---|
| Evidence coverage | Percentage of requirements with usable source evidence |
| Unsupported claim rate | Requirements or claims with weak/no source support |
| Requirement coverage | Whether expected categories are covered |
| NFR coverage | SLA, security, audit, availability, performance, data retention, observability |
| Edge-case coverage | Exception flows, rework, cancellation, fallback, manual override, escalation |
| Gap quality | Whether open questions are specific and actionable |
| Metric conversion quality | Whether AHT, volume, SLA breach, etc. become useful requirements |
| Reviewer edit distance | How much BA/SME editing is needed |
| Time-to-review | How quickly a reviewer can validate the draft |
| Traceability completeness | Source → requirement → test/ticket linkage |
General RAG metrics from Ragas or the RAGAS paper are useful, but BRD quality needs extra domain-specific checks.
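Three of the BRD-specific metrics are simple enough to sketch directly. The Python below is a rough illustration under stated assumptions: "usable" evidence is taken to mean a support level of `strong` or `partial`, and reviewer edit distance is approximated with a word-level similarity ratio from the standard library `difflib` module. None of this is part of Ragas.

```python
from difflib import SequenceMatcher
from typing import List

USABLE = {"strong", "partial"}  # assumed definition of "usable" evidence


def evidence_coverage(requirements: List[List[str]]) -> float:
    """Share of requirements with at least one usable piece of evidence.

    Each requirement is represented by the list of its support levels.
    """
    if not requirements:
        return 0.0
    supported = sum(1 for levels in requirements if USABLE & set(levels))
    return supported / len(requirements)


def unsupported_claim_rate(requirements: List[List[str]]) -> float:
    """Share of requirements with weak or no support (complement of coverage)."""
    if not requirements:
        return 0.0
    return 1.0 - evidence_coverage(requirements)


def reviewer_edit_distance(draft: str, final: str) -> float:
    """0.0 means the reviewer changed nothing; 1.0 means no words survived."""
    return 1.0 - SequenceMatcher(None, draft.split(), final.split()).ratio()


support = [["strong", "partial"], ["weak"], []]
print(evidence_coverage(support), unsupported_claim_rate(support))
print(reviewer_edit_distance(
    "The system shall route cases",
    "The system shall route exception cases",
))
```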
7. Historical BRDs: use them, but not only for fine-tuning
The 100–200 historical BRDs are valuable.
But I would not use them only as fine-tuning data at the beginning.
I would first use them for:
| Use historical BRDs for… | Why |
|---|---|
| Retrieval examples | Find similar past requirements and BRD sections |
| Template mining | Extract organization-specific BRD structure |
| Requirement pattern memory | Reuse common requirement patterns |
| Gap checklist | Detect common omissions from prior projects |
| Evaluation set | Test whether the system produces useful BRD packages |
| Reviewer rubric | Learn what “good” looks like in your organization |
| Terminology normalization | Align language with internal BA / PM / compliance conventions |
Fine-tuning may become useful later, especially for style, extraction consistency, or organization-specific language.
But early on, I would prioritize:
- structured schema
- retrieval
- evidence binding
- review workflow
- evaluation
- then fine-tuning if the data supports it
8. Suggested architecture
I would avoid building everything from scratch.
Reuse commodity layers where possible, but build the requirements layer yourself.
| Layer | Responsibility | Build vs. reuse |
|---|---|---|
| Artifact ingestion | Parse PDFs, DOCX, PPTX, XLSX, transcripts, SOPs, process docs | Reuse tools like Docling or Unstructured where possible |
| Retrieval layer | Retrieve relevant historical BRDs, SOPs, meeting notes, templates | Reuse RAGFlow, Dify, LlamaIndex, or a custom RAG stack |
| Requirements core | Requirement objects, evidence, gaps, metrics, review state | Build this as the differentiated layer |
| Gap / edge-case engine | Detect missing information and likely exception paths | Build domain-specific logic + LLM checks |
| Review UI | Accept, reject, edit, assign owner, check evidence | Build as first-class UX |
| Export layer | BRD, traceability matrix, Jira, Azure DevOps, Confluence, Word, Markdown | Adapter-based |
| Eval harness | Measure grounding, coverage, unsupported claims, reviewer edits | Build BRD-specific metrics on top of general RAG eval ideas |
The product should not depend on one parser, one model, one vector database, or one workflow tool.
I would make it adapter-first:
core/
  requirement_schema
  evidence_model
  gap_ledger
  metric_mapping
  review_state
  traceability_model
  eval_interface
adapters/
  docling
  unstructured
  ragflow
  dify
  llamaindex
  jira
  azure_devops
  confluence
  sharepoint
  google_drive
  word_export
  markdown_export
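The adapter-first idea can be sketched with a small interface that the core depends on, plus interchangeable implementations behind it. In the Python below, `ArtifactParser`, `PlainTextParser`, and `ParserRegistry` are hypothetical names; a Docling or Unstructured adapter would wrap the real library behind the same `parse` method, which is deliberately not shown here.

```python
from typing import Dict, List, Protocol


class ArtifactParser(Protocol):
    """Interface the requirements core depends on (assumed shape)."""

    def parse(self, raw: bytes) -> List[str]:
        ...


class PlainTextParser:
    """Trivial adapter: splits UTF-8 text into paragraph blocks."""

    def parse(self, raw: bytes) -> List[str]:
        text = raw.decode("utf-8")
        return [block.strip() for block in text.split("\n\n") if block.strip()]


class ParserRegistry:
    """Chooses an adapter by file extension so the core never imports one directly."""

    def __init__(self) -> None:
        self._parsers: Dict[str, ArtifactParser] = {}

    def register(self, extension: str, parser: ArtifactParser) -> None:
        self._parsers[extension.lower()] = parser

    def parse(self, filename: str, raw: bytes) -> List[str]:
        extension = filename.rsplit(".", 1)[-1].lower()
        if extension not in self._parsers:
            raise ValueError(f"No parser registered for .{extension}")
        return self._parsers[extension].parse(raw)


registry = ParserRegistry()
registry.register("txt", PlainTextParser())
blocks = registry.parse(
    "workshop_notes.txt",
    b"Exceptions go to supervisors.\n\nSLA is not documented.",
)
print(blocks)
```

Swapping a parser then means registering a different adapter, without touching the requirement schema or the review workflow.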
This matters because many organizations already have their own stack:
- SharePoint
- Confluence
- Jira
- Azure DevOps
- Box
- Google Drive
- internal document stores
- Azure OpenAI
- local LLM endpoints
- internal approval workflows
A useful product should fit into that reality instead of forcing one universal stack.
9. MVP scope
I would keep the MVP narrow.
Do not start with everything: video, BPMN, ALM replacement, Jira round-trip sync, fine-tuning, test generation, and full approval workflow.
Start with a small but valuable workflow.
| Include in MVP | Defer |
|---|---|
| Meeting notes | Full video understanding |
| One SOP or process description | Automatic BPMN generation |
| One process map if available | Full process mining |
| One metrics table: AHT, touch time, volume, SLA breach, rework rate | Full ALM replacement |
| One BRD template | Complex Jira round-trip sync |
| Optional historical BRDs for retrieval | Fine-tuning as the first step |
| Requirement candidates + evidence + gaps + edge cases | Fully automated final BRD approval |
| Markdown / Word / CSV export | Large-scale enterprise workflow orchestration |
A good MVP output would be:
| Output | Purpose |
|---|---|
| BRD skeleton | Gives the expected document shape |
| Requirement candidates | Gives reviewers something structured to inspect |
| Evidence table | Shows where each requirement came from |
| Gap ledger | Shows what still needs SME / stakeholder input |
| Edge-case register | Makes exception handling visible |
| Metric-derived requirements | Converts operational data into measurable requirements |
| Open questions | Drives follow-up workshops |
| Traceability matrix | Prepares downstream test / Jira / implementation work |
10. Metric-derived requirements may be a strong differentiator
This is where the idea can become more than another document generator.
Operational metrics should not only appear in the background section of the BRD. They should drive requirements.
| Input metric | Possible requirement impact |
|---|---|
| AHT | Processing-time target, automation target, efficiency business case |
| Touch time | Manual work reduction, automation candidate selection |
| Monthly volume | Throughput, capacity, scaling, queue design |
| SLA breach rate | Alerts, escalation, reporting, workload balancing |
| Rework rate | Validation rules, data quality requirements |
| Exception rate | Exception queue, routing, supervisor review |
| Handoff count | Role redesign, workflow simplification |
| Peak volume | Capacity planning, batch/real-time processing choices |
Example:
| Observation | Requirement candidate |
|---|---|
| AHT is 12 min; target is 7 min | NFR: redesigned workflow should support average handling time <= 7 min for standard cases |
| 18% of cases are exceptions | FR: system shall support exception classification and supervisor queue routing |
| 11% rework due to missing fields | FR: system shall validate required fields before submission |
| 40k cases/month | NFR: system shall support expected monthly volume plus agreed peak margin |
| SLA breach is tracked manually | Reporting requirement: system shall report SLA breaches by queue, region, and case type |
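The mapping from observations to requirement candidates can start as plain rules before any LLM is involved. The Python sketch below reproduces four rows of the example; the metric key names, the `derive_requirements` function, and the "greater than zero" triggers are assumptions for illustration, and a real mapping would need thresholds agreed with the process owner.

```python
from typing import Dict, List, Tuple


def derive_requirements(metrics: Dict[str, float]) -> List[Tuple[str, str]]:
    """Turn operational metrics into (type, text) requirement candidates."""
    candidates: List[Tuple[str, str]] = []

    aht = metrics.get("aht_min")
    target = metrics.get("aht_target_min")
    if aht is not None and target is not None and aht > target:
        candidates.append((
            "NFR",
            f"Redesigned workflow should support average handling time "
            f"<= {target:g} min for standard cases.",
        ))

    if metrics.get("exception_rate", 0) > 0:
        candidates.append((
            "FR",
            "System shall support exception classification and supervisor queue routing.",
        ))

    if metrics.get("rework_rate", 0) > 0:
        candidates.append((
            "FR",
            "System shall validate required fields before submission.",
        ))

    volume = metrics.get("monthly_volume")
    if volume:
        candidates.append((
            "NFR",
            f"System shall support {volume:,.0f} cases/month plus agreed peak margin.",
        ))

    return candidates


observed = {
    "aht_min": 12, "aht_target_min": 7,
    "exception_rate": 0.18, "rework_rate": 0.11,
    "monthly_volume": 40000,
}
for kind, text in derive_requirements(observed):
    print(kind, "-", text)
```

Each candidate produced this way should still enter the review queue with the KPI table as its evidence, since a target such as 7 min may be a goal rather than a hard SLA.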
This is much more concrete than simply asking an LLM to write a better BRD.
11. Review-first UX
I would not make the first interface a document editor.
I would make the first interface a review queue.
| Type | Item | Evidence | Risk | Action |
|---|---|---|---|---|
| Requirement | Supervisor review for exception cases | SOP + meeting note | Exception definition unclear | Accept / Edit / Reject |
| Gap | SLA target missing | No source | NFR cannot be validated | Assign owner |
| Edge case | Approver absent | Implied in process notes | Workflow may block | Ask SME |
| Assumption | 40k cases/month | KPI table | Seasonal peak unknown | Confirm |
| Metric-derived req | Reduce AHT to 7 min | KPI target | Hard SLA vs goal unclear | Clarify |
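A review queue like the one above implies that each item type accepts only certain actions and that every decision is recorded. The Python sketch below shows one way to enforce that; the action names are derived from the Action column, while `ReviewItem`, `apply_action`, and the history format are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Allowed actions per item type, taken from the Action column of the queue.
ALLOWED_ACTIONS = {
    "requirement": {"accept", "edit", "reject"},
    "gap": {"assign_owner"},
    "edge_case": {"ask_sme"},
    "assumption": {"confirm"},
    "metric_derived_req": {"clarify"},
}


@dataclass
class ReviewItem:
    item_type: str
    summary: str
    status: str = "pending"
    history: List[Tuple[str, str]] = field(default_factory=list)  # (reviewer, action)


def apply_action(item: ReviewItem, action: str, reviewer: str) -> ReviewItem:
    """Apply a reviewer action if it is valid for the item type, and log it."""
    allowed = ALLOWED_ACTIONS.get(item.item_type, set())
    if action not in allowed:
        raise ValueError(f"{action!r} is not valid for {item.item_type!r}")
    item.status = action
    item.history.append((reviewer, action))
    return item


item = ReviewItem("requirement", "Supervisor review for exception cases")
apply_action(item, "accept", reviewer="ba_lead")
print(item.status, item.history)
```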
This makes the tool useful before the BRD is “done”.
In real BA work, the intermediate review artifacts are often more valuable than the first draft.
12. Security and governance should be designed early
BRDs often contain sensitive operational information:
- internal process details
- volumes and SLAs
- customer or employee data in meeting notes
- system names
- manual workarounds
- approval paths
- compliance assumptions
- audit requirements
- integration constraints
So I would design for governance early:
| Area | Practical requirement |
|---|---|
| Data residency | Know where artifacts, embeddings, and generated requirements are stored |
| Tenant isolation | Separate projects, clients, and departments |
| Source permissions | Preserve source document permissions when surfacing evidence |
| Audit logs | Track who accepted, edited, or rejected each requirement |
| PII handling | Detect and redact sensitive information in notes and transcripts |
| Prompt injection defense | Treat uploaded documents as untrusted input |
| BYO model / private deployment | Support Azure OpenAI, private endpoints, or local models where needed |
| Export control | Prevent unsupported assumptions from being exported as confirmed requirements |
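The export control row can be enforced with a small gate at export time. The Python sketch below assumes a requirement counts as confirmed only when a reviewer has accepted it and at least one piece of evidence has strong support; that rule, the `accepted` status value, and the `split_for_export` function are illustrative assumptions.

```python
from typing import Any, Dict, List


def split_for_export(requirements: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Sort requirement IDs into confirmed and unconfirmed export sections."""
    sections: Dict[str, List[str]] = {"confirmed": [], "unconfirmed": []}
    for req in requirements:
        accepted = req.get("review_status") == "accepted"
        grounded = any(
            ev.get("support_level") == "strong"
            for ev in req.get("source_evidence", [])
        )
        key = "confirmed" if accepted and grounded else "unconfirmed"
        sections[key].append(req["id"])
    return sections


package = [
    {"id": "FR-014", "review_status": "needs_sme_review",
     "source_evidence": [{"support_level": "strong"}]},
    {"id": "FR-015", "review_status": "accepted",
     "source_evidence": [{"support_level": "strong"}]},
    {"id": "NFR-003", "review_status": "accepted", "source_evidence": []},
]
print(split_for_export(package))
```

Unconfirmed items can still appear in the BRD, but in a clearly labeled section rather than mixed in with confirmed requirements.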
The OWASP Top 10 for LLM Applications and the OWASP GenAI prompt injection note are useful references here. If you expose the system through tools or MCP-style integrations, the MCP security best practices are also worth reading.
Security should not be treated as a final enterprise checkbox. It affects the architecture.
13. API and integration shape
If this becomes a platform, I would make the requirements core accessible through APIs.
Example API surface:
POST /projects
POST /projects/<project_id>/artifacts
POST /projects/<project_id>/extract-facts
POST /projects/<project_id>/generate-requirements
GET /projects/<project_id>/requirements
GET /projects/<project_id>/gaps
GET /projects/<project_id>/edge-cases
PATCH /requirements/<requirement_id>/review
GET /projects/<project_id>/traceability
POST /projects/<project_id>/export/brd
POST /projects/<project_id>/export/jira
POST /projects/<project_id>/export/azure-devops
Important point:
Exporting a BRD should be one view. Exporting structured requirements should be the real product API.
That makes downstream integration easier:
| Output view | Consumer |
|---|---|
| BRD document | Business stakeholders |
| Requirements table | BA / PM / product owner |
| Gap ledger | SME / process owner |
| Traceability matrix | QA / compliance |
| Jira / Azure DevOps items | Engineering |
| Test case seeds | QA |
| Spec package | Spec-driven development / AI coding workflows |
The GitHub Spec Kit and the broader spec-driven development discussion are useful downstream references. They show why structured specifications matter beyond a static document.
14. What I would not overbuild at first
I would be careful not to make the first version too broad.
| Tempting feature | Why I would defer it |
|---|---|
| Full video understanding | Expensive, noisy, privacy-sensitive, and hard to evaluate |
| Automatic BPMN generation | Useful later, but it can distract from requirement discovery |
| Full ALM replacement | Jama / Trace.Space / Visure-like territory is heavy |
| Fine-tuning-first strategy | Hard to evaluate early; retrieval + rubrics give faster feedback |
| Full Jira round-trip sync | Useful, but export/import can start simpler |
| Fully automated final BRD approval | Risky; human review should remain explicit |
| Test-case generation as primary feature | Valuable later, but only after requirements are grounded |
15. A practical build sequence
If I were building this, I would use a phased approach.
| Phase | Goal | Key output |
|---|---|---|
| Phase 1 | Artifact normalization | Business facts extracted from notes, SOPs, process docs, metrics |
| Phase 2 | Requirement candidates | FR/NFR/data/integration/reporting requirements with evidence |
| Phase 3 | Gap ledger | Missing information and targeted SME questions |
| Phase 4 | Review workflow | Accept/edit/reject/assign owner/check evidence |
| Phase 5 | BRD export | BRD skeleton generated from accepted / pending objects |
| Phase 6 | Eval harness | Coverage, grounding, unsupported claims, reviewer edit distance |
| Phase 7 | Downstream integrations | Jira, Azure DevOps, Confluence, tests, spec packages |
This sequence keeps the first product useful without pretending to solve the entire SDLC on day one.
16. My suggested positioning
I would position it like this:
A requirements discovery workbench for process transformation projects.
Or:
A source-backed BA workbench that turns messy process artifacts into requirements, gaps, edge cases, and BRD sections.
Or more sharply:
Find what your BRD is missing before stakeholders do.
The second and third versions are probably stronger than “AI BRD generator”.
17. Short version
I would keep the core idea, but sharpen the boundary.
Do not make the product only a better BRD writer.
Make it the missing requirements layer between document AI and ALM:
- parse messy artifacts
- extract business facts
- generate source-backed requirement candidates
- maintain a gap ledger
- mine edge cases
- convert operational metrics into requirements
- keep humans in the review loop
- export the BRD as one view of a structured requirements package
That would preserve the strongest part of the idea while making it much harder to dismiss as “just another LLM document generator”.