Creating a New App for Making BRD

I am new to the AI world and, as part of my side gig, I am trying to build an app that helps create BRDs for the digital transformation of existing processes, working from different artifacts such as existing process flow maps, meeting minutes, video recordings of the process, AHT, touch time, free-text notes, etc. I have the following thoughts and would like opinions on how I should go about it. Happy to collaborate.

  1. The conventional method, such as giving a prompt to Claude (upgraded models), does provide a BRD, but it reads more like a research paper than a practical business requirements document. It can easily muddle the actual problem statement and drift toward an ideal-world target state. Also, many edge cases get missed.

  2. I want to design and build a solution that ingests artifacts plus a plain-English problem statement. Then, based on my predefined reference templates and contextual reasoning, it starts building a skeleton or blank schema for the BRD.

  3. I want four engineering levers. First, Prompt Engineering, where the end user prompts and provides all details; this is the only human-in-the-loop-intensive portion of the app. Second, Automated Context Engineering. Third, Memory Engineering. Fourth, Harness Engineering, with a human in the loop wherever required.

I want this to be trained on an existing library of BRDs (5-6 years, 100-200 BRDs), which can help train the second, third, and fourth levers.

Can someone opine?

For now, it looks like many relevant building blocks already exist, but there still seems to be plenty of room.


I think this is a strong idea, especially because you are focusing on existing-process transformation rather than generic document generation.

A BRD is usually not hard because of the writing itself. It is hard because the source material is messy, incomplete, inconsistent, and full of implicit operational knowledge:

  • meeting notes
  • process maps
  • SOPs
  • current-state walkthroughs
  • AHT / touch-time / volume metrics
  • stakeholder comments
  • old BRDs
  • exception cases
  • undocumented workarounds
  • half-confirmed assumptions
  • operational constraints that everyone “knows” but nobody wrote down

So I would avoid framing the product only as a BRD generator.

I would frame it as a requirements discovery workbench.

The BRD can still be the final exported document, but the core product should be the structured requirements package behind it.

1. Why I think the idea is promising

The strongest part of your idea is not “LLM writes a BRD”.

The strongest part is:

messy process-discovery artifacts → reviewable requirements, gaps, edge cases, and BRD sections

That is a much more valuable workflow.

A plain LLM prompt can produce a polished document, but a real BRD needs more than polished prose. It needs:

| Need | Why it matters |
|---|---|
| Confirmed business facts | Avoids turning guesses into requirements |
| Source-backed requirements | Lets reviewers check where each requirement came from |
| Gap detection | Shows what is missing before stakeholders find it later |
| Edge-case discovery | Prevents happy-path-only BRDs |
| Operational metrics | Converts AHT, volume, SLA, rework, etc. into measurable requirements |
| Human review | Keeps SMEs / BAs / PMs accountable for final decisions |
| Traceability | Connects source artifacts → requirements → tests / Jira / implementation |

So the product boundary I would choose is:

A source-backed requirements discovery workbench for existing-process transformation.

Not:

A better BRD writer.

2. Many building blocks already exist

There are already useful pieces in the ecosystem. That is good news. It means this is technically feasible.

But those pieces mostly live at different layers.

| Area | Examples | What they solve | What they do not fully solve |
|---|---|---|---|
| Document ingestion | Docling, Docling docs, Unstructured | Parse PDFs, Office files, OCR, layouts, tables, images, transcripts | Convert artifacts into requirement-level business facts |
| RAG / knowledge layer | RAGFlow, RAGFlow docs, Dify, LlamaIndex | Retrieve relevant context from documents | Decide whether a chunk truly supports a requirement |
| Workflow automation | n8n, Flowise | Connect forms, files, LLM calls, docs, Jira, email, storage | Maintain requirement review state and evidence validity |
| Requirements generation | SpecifAI, BRD/PRD generators | Generate BRD/PRD/NFR drafts | Handle process metrics, gaps, edge cases, and source-backed review as first-class objects |
| Requirements quality / ALM | Jama Connect Advisor, Trace.Space, Visure | Analyze requirements, traceability, quality, compliance | Lightweight upstream discovery from messy process artifacts |
| RAG evaluation | Ragas, RAGAS paper | Evaluate retrieval and generation quality | Evaluate BRD-specific coverage, gaps, NFRs, edge cases, and reviewer effort |
| Spec-driven downstream | GitHub Spec Kit, Spec-driven development post | Move from specifications into AI-assisted implementation workflows | Discover and validate business requirements from messy upstream artifacts |

So I do not think the gap is “there are no tools”.

The gap is more specific:

The missing layer is a requirements layer between document AI and ALM.

In other words, many systems can parse documents, retrieve chunks, run agents, or generate documents. Fewer systems turn messy discovery artifacts into requirement objects with evidence, gaps, assumptions, edge cases, metrics, and review state.

3. The missing layer

A useful product here should not be centered around document -> LLM -> BRD.

It should be centered around:

artifacts
  -> business facts
  -> process observations
  -> requirement candidates
  -> source evidence
  -> gaps
  -> edge cases
  -> metric-derived requirements
  -> human review
  -> BRD / Jira / Confluence / tests / traceability matrix

The missing layer looks like this:

| Missing layer | Why it matters |
|---|---|
| Requirement-level evidence binding | A BRD reviewer needs to know which source artifact supports each requirement. |
| Gap ledger | The most useful output is often what is missing: SLA, owner, approval rule, exception path, volume assumption, data retention, etc. |
| Process-to-requirement conversion | A process map is not yet a requirement model. The system should convert process observations into FRs, NFRs, data requirements, integration requirements, and business rules. |
| Metric-derived requirements | AHT, touch time, volume, SLA breach, rework rate, and exception rate should become requirements and business-case assumptions. |
| Review-first workflow | Human BA/SME review should be a first-class product flow, not an afterthought. |
| BRD-specific evaluation | General RAG metrics help, but BRD quality also needs coverage, grounding, gap quality, NFR coverage, edge-case coverage, and reviewer edit distance. |

4. Product framing

I would describe the product this way:

A business analysis workbench that turns fragmented process discovery artifacts into source-backed requirements, gaps, edge cases, metrics, and BRD sections.

That framing is stronger than:

Upload docs and generate a BRD.

Because “generate a BRD” can sound like a generic LLM wrapper.

The more defensible product is the structured layer underneath the BRD.

| Weak framing | Stronger framing |
|---|---|
| AI BRD Generator | Requirements Discovery Workbench |
| Upload documents, get BRD | Upload artifacts, get a reviewable requirements package |
| Prompt engineering | Guided elicitation |
| Context engineering | Evidence-bound requirement generation |
| Memory engineering | Provenance-aware project and organization memory |
| Harness engineering | BRD-specific evaluation and review loop |
| BRD document | BRD as one export view of structured requirement objects |

5. Core internal objects

I would make the internal model explicit.

The core objects should not only be documents and chunks. They should be requirement-oriented objects.

| Object | Purpose |
|---|---|
| Artifact | Uploaded source: meeting notes, SOP, process map, KPI table, transcript, old BRD |
| BusinessFact | Atomic fact extracted from an artifact |
| ProcessStep | Current-state or future-state process observation |
| RequirementCandidate | Proposed functional / non-functional / data / reporting / integration requirement |
| Evidence | Source excerpt or process-map reference supporting a requirement |
| Gap | Missing information that blocks confident requirement writing |
| EdgeCase | Exception path, failure path, unusual variant, fallback, escalation, override |
| Metric | AHT, touch time, volume, SLA, error rate, rework rate, exception rate |
| Assumption | A statement that may be useful but is not yet fully supported |
| ReviewDecision | Accept, reject, edit, needs SME review, needs more evidence |
| TraceLink | Source artifact → requirement → test case / Jira item / BRD section |

A useful requirement object might look like this:

{
  "id": "FR-014",
  "type": "functional_requirement",
  "text": "The system shall route exception cases to supervisor review.",
  "source_evidence": [
    {
      "artifact_id": "meeting_notes_2026_05_20",
      "excerpt": "Exceptions are currently reviewed manually by supervisors.",
      "support_level": "partial"
    },
    {
      "artifact_id": "current_process_map_v3",
      "step": "Manual supervisor review",
      "support_level": "strong"
    }
  ],
  "confidence": 0.74,
  "assumptions": [
    "Exception cases are a distinct case category."
  ],
  "open_questions": [
    "What exactly qualifies as an exception?",
    "Does the threshold differ by region or product?"
  ],
  "review_status": "needs_sme_review"
}
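One way to hold this object in code is a pair of small dataclasses. The following is a minimal Python sketch that assumes the field names from the JSON above; the class names `Evidence` and `RequirementCandidate` and the helper `requirement_from_dict` are illustrative and do not come from any existing library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    artifact_id: str
    support_level: str              # e.g. "strong" or "partial", as in the JSON above
    excerpt: Optional[str] = None   # quoted text, when the source is prose
    step: Optional[str] = None      # process-map step, when the source is a diagram


@dataclass
class RequirementCandidate:
    id: str
    type: str
    text: str
    source_evidence: List[Evidence] = field(default_factory=list)
    confidence: float = 0.0
    assumptions: List[str] = field(default_factory=list)
    open_questions: List[str] = field(default_factory=list)
    review_status: str = "needs_sme_review"

    def has_strong_evidence(self) -> bool:
        """True if at least one source is marked as strong support."""
        return any(e.support_level == "strong" for e in self.source_evidence)


def requirement_from_dict(data: dict) -> RequirementCandidate:
    """Build a typed object from a JSON-style dict like the one above."""
    evidence = [Evidence(**item) for item in data.get("source_evidence", [])]
    return RequirementCandidate(**{**data, "source_evidence": evidence})
```

Keeping evidence as its own type, rather than a free-text field on the requirement, is what later makes evidence tables and traceability views cheap to produce.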

This is the important shift.

The BRD is not just generated text. It is a view over structured requirement objects.

6. How I would reinterpret your four engineering levers

Your four levers make sense. I would operationalize them this way.

| Original lever | Practical version | What it should do |
|---|---|---|
| Prompt Engineering | Guided elicitation | Ask the right missing questions instead of relying only on free-form prompts |
| Automated Context Engineering | Evidence binding | Link each requirement to the artifacts that support it |
| Memory Engineering | Provenance-aware memory | Reuse organizational patterns while preserving where each pattern came from |
| Harness Engineering | BRD evaluation harness | Measure coverage, grounding, gaps, edge cases, and reviewer effort |

Prompt Engineering → Guided elicitation

Instead of only asking the user to provide a problem statement, I would build a structured elicitation flow.

Example questions:

| Question | Reason |
|---|---|
| What is the current process? | Establish current state |
| Which roles are involved? | Find stakeholders and permissions |
| Which systems are touched? | Identify integration requirements |
| What are the known exceptions? | Avoid happy-path-only BRDs |
| What metrics exist? | Convert AHT, volume, SLA, rework into measurable requirements |
| What is still unknown? | Create the gap ledger |
| What is the desired future state? | Separate problem statement from solution assumption |
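The elicitation questions above can feed the gap ledger directly: anything the user leaves unanswered becomes an open gap. This is only a sketch; the topic keys and the `open_gaps` function are hypothetical names chosen for illustration.

```python
from typing import Dict, List

# Question texts are taken from the table above; the keys are assumed names.
ELICITATION_QUESTIONS: Dict[str, str] = {
    "current_process": "What is the current process?",
    "roles": "Which roles are involved?",
    "systems": "Which systems are touched?",
    "exceptions": "What are the known exceptions?",
    "metrics": "What metrics exist?",
    "unknowns": "What is still unknown?",
    "future_state": "What is the desired future state?",
}


def open_gaps(answers: Dict[str, str]) -> List[dict]:
    """Return one gap entry per question with a missing or blank answer."""
    gaps = []
    for topic, question in ELICITATION_QUESTIONS.items():
        if not (answers.get(topic) or "").strip():
            gaps.append({"topic": topic, "question": question, "status": "open"})
    return gaps
```

Calling `open_gaps({"current_process": "Manual intake by email"})` would leave six open entries, which is a reasonable starting agenda for the first SME workshop.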

Context Engineering → Evidence binding

RAG is useful, but the important question is not only:

Did we retrieve relevant context?

It is:

Does this context actually support this requirement?

Ragas faithfulness is a useful conceptual reference: it checks whether generated claims are supported by retrieved context. For BRDs, I would apply that idea at the requirement level.

| Requirement | Evidence | Confidence | Open question |
|---|---|---|---|
| System shall route exception cases to supervisor review. | SOP step + workshop note | Medium | What qualifies as an exception? |
| System shall reduce AHT from 12 min to 7 min. | KPI table + project objective | Low-Medium | Is 7 min a target or a hard SLA? |
| System shall validate mandatory fields before submission. | Rework notes + SOP checklist | High | Which fields are mandatory by case type? |
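A confidence column like the one above could be derived from the evidence rather than typed in by hand. The rule below is an arbitrary illustration, not a validated scoring method: the thresholds, the label set, and the function name `confidence_label` are all assumptions, and a real system would likely pair this with an LLM or human check of whether each excerpt really supports the requirement.

```python
from typing import List


def confidence_label(evidence: List[dict]) -> str:
    """Map evidence items to a coarse label using distinct supporting artifacts.

    Assumed rule: two or more artifacts with strong support give "High";
    one strong artifact, or two partial ones, give "Medium"; a single
    partial artifact gives "Low"; anything else is "Unsupported".
    """
    strong = {e["artifact_id"] for e in evidence if e.get("support_level") == "strong"}
    partial = {e["artifact_id"] for e in evidence if e.get("support_level") == "partial"}
    if len(strong) >= 2:
        return "High"
    if strong or len(partial) >= 2:
        return "Medium"
    if partial:
        return "Low"
    return "Unsupported"
```

Counting distinct artifacts rather than raw excerpts avoids rewarding a requirement just because the same meeting note was quoted twice.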

Memory Engineering → Provenance-aware memory

Memory is useful, but I would avoid “the system remembers it, therefore it is true”.

I would split memory into types:

| Memory type | Example | How to use it |
|---|---|---|
| Project memory | Decisions, unresolved questions, accepted assumptions | Use as current project state |
| Organization memory | Standard BRD template, terminology, approval conventions | Use as default structure |
| Pattern memory | Past requirement patterns from similar BRDs | Use as candidates, not facts |
| Risk memory | Common missing NFRs and edge cases | Use as checklist |
| Reviewer memory | What reviewers often edit or reject | Use to improve drafts and evals |

Every memory item should carry provenance:

memory_item
  -> source BRD / template / meeting / reviewer decision
  -> confidence
  -> last reviewed date
  -> applicable domain
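As a rough sketch, a memory item with the provenance fields listed above might look like this in Python. The `MemoryItem` class, the `usage_mode` lookup, and the one-year staleness cutoff are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

# How each memory type may be used, following the memory-type table above.
USAGE_BY_TYPE = {
    "project": "current project state",
    "organization": "default structure",
    "pattern": "candidate, not fact",
    "risk": "checklist",
    "reviewer": "draft and eval improvement",
}


@dataclass(frozen=True)
class MemoryItem:
    text: str
    memory_type: str        # one of the keys in USAGE_BY_TYPE
    source: str             # source BRD / template / meeting / reviewer decision
    confidence: float
    last_reviewed: date
    applicable_domain: str


def usage_mode(item: MemoryItem) -> str:
    """Say how an item may be used; unknown types are treated as unverified."""
    return USAGE_BY_TYPE.get(item.memory_type, "unverified")


def is_stale(item: MemoryItem, today: date, max_age_days: int = 365) -> bool:
    """Flag items whose last review is older than the (assumed) cutoff."""
    return (today - item.last_reviewed).days > max_age_days
```

Making the item immutable (`frozen=True`) is a deliberate choice here: a changed memory should be a new item with its own provenance, not a silent overwrite.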

Harness Engineering → BRD-specific evaluation

This is especially important.

A polished BRD is not necessarily a good BRD. It can be clear, fluent, and still wrong.

I would evaluate the system on BRD-specific metrics:

| Metric | What it measures |
|---|---|
| Evidence coverage | Percentage of requirements with usable source evidence |
| Unsupported claim rate | Requirements or claims with weak/no source support |
| Requirement coverage | Whether expected categories are covered |
| NFR coverage | SLA, security, audit, availability, performance, data retention, observability |
| Edge-case coverage | Exception flows, rework, cancellation, fallback, manual override, escalation |
| Gap quality | Whether open questions are specific and actionable |
| Metric conversion quality | Whether AHT, volume, SLA breach, etc. become useful requirements |
| Reviewer edit distance | How much BA/SME editing is needed |
| Time-to-review | How quickly a reviewer can validate the draft |
| Traceability completeness | Source → requirement → test/ticket linkage |

General RAG metrics from Ragas or the RAGAS paper are useful, but BRD quality needs extra domain-specific checks.
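Three of these metrics are simple enough to sketch with the Python standard library. The definitions below are one possible reading, not a standard: "usable evidence" is taken to mean any attached evidence item, "unsupported" means no strong or partial support, and the edit measure uses `difflib.SequenceMatcher` similarity as a stand-in for a true edit distance.

```python
from difflib import SequenceMatcher
from typing import List


def evidence_coverage(requirements: List[dict]) -> float:
    """Share of requirements with at least one evidence item attached."""
    if not requirements:
        return 0.0
    backed = sum(1 for req in requirements if req.get("source_evidence"))
    return backed / len(requirements)


def unsupported_claim_rate(requirements: List[dict]) -> float:
    """Share of requirements with no strong or partial support at all."""
    if not requirements:
        return 0.0

    def unsupported(req: dict) -> bool:
        levels = [e.get("support_level") for e in req.get("source_evidence", [])]
        return not any(level in ("strong", "partial") for level in levels)

    return sum(1 for req in requirements if unsupported(req)) / len(requirements)


def reviewer_edit_ratio(draft: str, final: str) -> float:
    """0.0 means the reviewer changed nothing; 1.0 means nothing survived."""
    return 1.0 - SequenceMatcher(None, draft, final).ratio()
```

Tracking `reviewer_edit_ratio` per requirement over time is one cheap way to see whether drafts are getting closer to what BAs and SMEs actually accept.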

7. Historical BRDs: use them, but not only for fine-tuning

The 100–200 historical BRDs are valuable.

But I would not use them only as fine-tuning data at the beginning.

I would first use them for:

| Use historical BRDs for… | Why |
|---|---|
| Retrieval examples | Find similar past requirements and BRD sections |
| Template mining | Extract organization-specific BRD structure |
| Requirement pattern memory | Reuse common requirement patterns |
| Gap checklist | Detect common omissions from prior projects |
| Evaluation set | Test whether the system produces useful BRD packages |
| Reviewer rubric | Learn what “good” looks like in your organization |
| Terminology normalization | Align language with internal BA / PM / compliance conventions |

Fine-tuning may become useful later, especially for style, extraction consistency, or organization-specific language.

But early on, I would prioritize:

  1. structured schema
  2. retrieval
  3. evidence binding
  4. review workflow
  5. evaluation
  6. then fine-tuning if the data supports it

8. Suggested architecture

I would avoid building everything from scratch.

Reuse commodity layers where possible, but build the requirements layer yourself.

| Layer | Responsibility | Build vs. reuse |
|---|---|---|
| Artifact ingestion | Parse PDFs, DOCX, PPTX, XLSX, transcripts, SOPs, process docs | Reuse tools like Docling or Unstructured where possible |
| Retrieval layer | Retrieve relevant historical BRDs, SOPs, meeting notes, templates | Reuse RAGFlow, Dify, LlamaIndex, or a custom RAG stack |
| Requirements core | Requirement objects, evidence, gaps, metrics, review state | Build this as the differentiated layer |
| Gap / edge-case engine | Detect missing information and likely exception paths | Build domain-specific logic + LLM checks |
| Review UI | Accept, reject, edit, assign owner, check evidence | Build as first-class UX |
| Export layer | BRD, traceability matrix, Jira, Azure DevOps, Confluence, Word, Markdown | Adapter-based |
| Eval harness | Measure grounding, coverage, unsupported claims, reviewer edits | Build BRD-specific metrics on top of general RAG eval ideas |

The product should not depend on one parser, one model, one vector database, or one workflow tool.

I would make it adapter-first:

core/
  requirement_schema
  evidence_model
  gap_ledger
  metric_mapping
  review_state
  traceability_model
  eval_interface

adapters/
  docling
  unstructured
  ragflow
  dify
  llamaindex
  jira
  azure_devops
  confluence
  sharepoint
  google_drive
  word_export
  markdown_export
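In Python, the adapter boundary could be expressed as a `typing.Protocol` that the core depends on, with each tool wrapped behind it. The interface below is a hypothetical sketch: `IngestionAdapter`, `PlainTextAdapter`, and `ingest` are made-up names, and the plain-text adapter stands in for a real Docling or Unstructured wrapper, whose APIs are not shown here.

```python
from typing import List, Protocol


class IngestionAdapter(Protocol):
    """What the core expects from any parser, whichever tool sits behind it."""

    def parse(self, name: str, content: str) -> List[dict]:
        ...


class PlainTextAdapter:
    """Trivial adapter: splits text into blocks on blank lines."""

    def parse(self, name: str, content: str) -> List[dict]:
        blocks = [block.strip() for block in content.split("\n\n") if block.strip()]
        return [
            {"artifact_id": name, "position": index, "text": block}
            for index, block in enumerate(blocks)
        ]


def ingest(adapter: IngestionAdapter, name: str, content: str) -> List[dict]:
    """Core entry point; it never imports a specific parser."""
    return adapter.parse(name, content)
```

Because the core only sees `IngestionAdapter`, swapping the parser later means writing one new class rather than touching the requirement schema.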

This matters because many organizations already have their own stack:

  • SharePoint
  • Confluence
  • Jira
  • Azure DevOps
  • Box
  • Google Drive
  • internal document stores
  • Azure OpenAI
  • local LLM endpoints
  • internal approval workflows

A useful product should fit into that reality instead of forcing one universal stack.

9. MVP scope

I would keep the MVP narrow.

Do not start with everything: video, BPMN, ALM replacement, Jira round-trip sync, fine-tuning, test generation, and full approval workflow.

Start with a small but valuable workflow.

| Include in MVP | Defer |
|---|---|
| Meeting notes | Full video understanding |
| One SOP or process description | Automatic BPMN generation |
| One process map if available | Full process mining |
| One metrics table: AHT, touch time, volume, SLA breach, rework rate | Full ALM replacement |
| One BRD template | Complex Jira round-trip sync |
| Optional historical BRDs for retrieval | Fine-tuning as the first step |
| Requirement candidates + evidence + gaps + edge cases | Fully automated final BRD approval |
| Markdown / Word / CSV export | Large-scale enterprise workflow orchestration |

A good MVP output would be:

| Output | Purpose |
|---|---|
| BRD skeleton | Gives the expected document shape |
| Requirement candidates | Gives reviewers something structured to inspect |
| Evidence table | Shows where each requirement came from |
| Gap ledger | Shows what still needs SME / stakeholder input |
| Edge-case register | Makes exception handling visible |
| Metric-derived requirements | Converts operational data into measurable requirements |
| Open questions | Drives follow-up workshops |
| Traceability matrix | Prepares downstream test / Jira / implementation work |

10. Metric-derived requirements may be a strong differentiator

This is where the idea can become more than another document generator.

Operational metrics should not only appear in the background section of the BRD. They should drive requirements.

| Input metric | Possible requirement impact |
|---|---|
| AHT | Processing-time target, automation target, efficiency business case |
| Touch time | Manual work reduction, automation candidate selection |
| Monthly volume | Throughput, capacity, scaling, queue design |
| SLA breach rate | Alerts, escalation, reporting, workload balancing |
| Rework rate | Validation rules, data quality requirements |
| Exception rate | Exception queue, routing, supervisor review |
| Handoff count | Role redesign, workflow simplification |
| Peak volume | Capacity planning, batch/real-time processing choices |

Example:

| Observation | Requirement candidate |
|---|---|
| AHT is 12 min; target is 7 min | NFR: redesigned workflow should support average handling time <= 7 min for standard cases |
| 18% of cases are exceptions | FR: system shall support exception classification and supervisor queue routing |
| 11% rework due to missing fields | FR: system shall validate required fields before submission |
| 40k cases/month | NFR: system shall support expected monthly volume plus agreed peak margin |
| SLA breach is tracked manually | Reporting requirement: system shall report SLA breaches by queue, region, and case type |
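The mapping from observations to requirement candidates can start as plain rules before any LLM is involved. The sketch below covers four of the example rows; the metric key names, the 5% rate threshold, and the function `derive_requirements` are assumptions for illustration, and every output stays a candidate that still needs SME review.

```python
from typing import List


def derive_requirements(metrics: dict, rate_threshold: float = 0.05) -> List[dict]:
    """Turn a metrics dict into requirement candidates using simple rules."""
    candidates: List[dict] = []

    def add(req_type: str, text: str, sources: List[str]) -> None:
        candidates.append({
            "type": req_type,
            "text": text,
            "derived_from": sources,
            "review_status": "needs_sme_review",  # never auto-confirmed
        })

    aht, target = metrics.get("aht_min"), metrics.get("aht_target_min")
    if aht is not None and target is not None and target < aht:
        add("NFR",
            f"Redesigned workflow should support average handling time "
            f"<= {target} min for standard cases.",
            ["aht_min", "aht_target_min"])
    if metrics.get("exception_rate", 0.0) >= rate_threshold:
        add("FR",
            "System shall support exception classification and supervisor queue routing.",
            ["exception_rate"])
    # Assumes the rework is caused by missing fields, as in the example row.
    if metrics.get("rework_rate", 0.0) >= rate_threshold:
        add("FR", "System shall validate required fields before submission.",
            ["rework_rate"])
    if metrics.get("monthly_volume"):
        add("NFR",
            "System shall support expected monthly volume plus agreed peak margin.",
            ["monthly_volume"])
    return candidates
```

With the example values (AHT 12 min, target 7 min, 18% exceptions, 11% rework, 40k cases/month), all four rules fire; the `derived_from` field keeps each candidate traceable to the metric behind it.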

This is much more concrete than simply asking an LLM to write a better BRD.

11. Review-first UX

I would not make the first interface a document editor.

I would make the first interface a review queue.

| Type | Item | Evidence | Risk | Action |
|---|---|---|---|---|
| Requirement | Supervisor review for exception cases | SOP + meeting note | Exception definition unclear | Accept / Edit / Reject |
| Gap | SLA target missing | No source | NFR cannot be validated | Assign owner |
| Edge case | Approver absent | Implied in process notes | Workflow may block | Ask SME |
| Assumption | 40k cases/month | KPI table | Seasonal peak unknown | Confirm |
| Metric-derived req | Reduce AHT to 7 min | KPI target | Hard SLA vs goal unclear | Clarify |

This makes the tool useful before the BRD is “done”.

In real BA work, the intermediate review artifacts are often more valuable than the first draft.
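Behind a review queue like this, each item needs an explicit review state and a record of who changed it. Here is a small sketch of that idea; the state names loosely follow the ReviewDecision values mentioned earlier, but the transition table itself and the `apply_decision` helper are assumptions, not a prescribed workflow.

```python
from typing import Dict, Set

# Which decisions are allowed from each state (illustrative, not exhaustive).
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "proposed": {"accepted", "rejected", "edited",
                 "needs_sme_review", "needs_more_evidence"},
    "needs_sme_review": {"accepted", "rejected", "edited", "needs_more_evidence"},
    "needs_more_evidence": {"accepted", "rejected", "edited", "needs_sme_review"},
    "edited": {"accepted", "rejected", "needs_sme_review"},
    "accepted": set(),   # terminal in this sketch
    "rejected": set(),   # terminal in this sketch
}


def apply_decision(item: dict, decision: str, reviewer: str) -> dict:
    """Return a copy of the item with the new state and an audit entry."""
    current = item["review_status"]
    if decision not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(
            f"cannot move {item['id']} from {current!r} to {decision!r}"
        )
    entry = {"from": current, "to": decision, "by": reviewer}
    return {
        **item,
        "review_status": decision,
        "history": item.get("history", []) + [entry],
    }
```

Returning a new dict instead of mutating the input keeps the previous state available, and the `history` list doubles as a simple audit trail.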

12. Security and governance should be designed early

BRDs often contain sensitive operational information:

  • internal process details
  • volumes and SLAs
  • customer or employee data in meeting notes
  • system names
  • manual workarounds
  • approval paths
  • compliance assumptions
  • audit requirements
  • integration constraints

So I would design for governance early:

| Area | Practical requirement |
|---|---|
| Data residency | Know where artifacts, embeddings, and generated requirements are stored |
| Tenant isolation | Separate projects, clients, and departments |
| Source permissions | Preserve source document permissions when surfacing evidence |
| Audit logs | Track who accepted, edited, or rejected each requirement |
| PII handling | Detect and redact sensitive information in notes and transcripts |
| Prompt injection defense | Treat uploaded documents as untrusted input |
| BYO model / private deployment | Support Azure OpenAI, private endpoints, or local models where needed |
| Export control | Prevent unsupported assumptions from being exported as confirmed requirements |
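The export-control row is the easiest of these to make concrete. As a hedged sketch, an export step could refuse to label anything as confirmed unless it was accepted by a reviewer and has at least one evidence item; the function name `split_for_export` and the exact rule are assumptions.

```python
from typing import List, Tuple


def split_for_export(requirements: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Separate requirements that may be exported as confirmed from the rest.

    Assumed rule: confirmed means review_status == "accepted" AND at least
    one evidence item is attached. Everything else is exported, if at all,
    under a clearly labelled unconfirmed section.
    """
    confirmed: List[dict] = []
    unconfirmed: List[dict] = []
    for req in requirements:
        accepted = req.get("review_status") == "accepted"
        evidenced = bool(req.get("source_evidence"))
        if accepted and evidenced:
            confirmed.append(req)
        else:
            unconfirmed.append(req)
    return confirmed, unconfirmed
```

Applying the rule at export time, rather than trusting upstream steps, means a reviewer accepting an item by mistake still cannot push an evidence-free requirement into the signed-off BRD.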

The OWASP Top 10 for LLM Applications and the OWASP GenAI prompt injection note are useful references here. If you expose the system through tools or MCP-style integrations, the MCP security best practices are also worth reading.

Security should not be treated as a final enterprise checkbox. It affects the architecture.

13. API and integration shape

If this becomes a platform, I would make the requirements core accessible through APIs.

Example API surface:

POST /projects
POST /projects/<project_id>/artifacts
POST /projects/<project_id>/extract-facts
POST /projects/<project_id>/generate-requirements
GET  /projects/<project_id>/requirements
GET  /projects/<project_id>/gaps
GET  /projects/<project_id>/edge-cases
PATCH /requirements/<requirement_id>/review
GET  /projects/<project_id>/traceability
POST /projects/<project_id>/export/brd
POST /projects/<project_id>/export/jira
POST /projects/<project_id>/export/azure-devops
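To show how this surface might be wired up without committing to a web framework, here is a standard-library sketch that resolves a method and path to a handler name. The handler names are hypothetical placeholders; only the routes themselves come from the list above.

```python
import re
from typing import Dict, List, Optional, Tuple

# (method, path template, hypothetical handler name)
ROUTES: List[Tuple[str, str, str]] = [
    ("POST", "/projects", "create_project"),
    ("POST", "/projects/<project_id>/artifacts", "upload_artifact"),
    ("POST", "/projects/<project_id>/extract-facts", "extract_facts"),
    ("POST", "/projects/<project_id>/generate-requirements", "generate_requirements"),
    ("GET", "/projects/<project_id>/requirements", "list_requirements"),
    ("GET", "/projects/<project_id>/gaps", "list_gaps"),
    ("GET", "/projects/<project_id>/edge-cases", "list_edge_cases"),
    ("PATCH", "/requirements/<requirement_id>/review", "review_requirement"),
    ("GET", "/projects/<project_id>/traceability", "get_traceability"),
    ("POST", "/projects/<project_id>/export/brd", "export_brd"),
    ("POST", "/projects/<project_id>/export/jira", "export_jira"),
    ("POST", "/projects/<project_id>/export/azure-devops", "export_azure_devops"),
]


def _compile(template: str) -> "re.Pattern[str]":
    # Turn "<name>" placeholders into named groups that match one path segment.
    pattern = re.sub(r"<(\w+)>", r"(?P<\1>[^/]+)", template)
    return re.compile(f"^{pattern}$")


_COMPILED = [(method, _compile(template), handler)
             for method, template, handler in ROUTES]


def resolve(method: str, path: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (handler name, path parameters) or None if nothing matches."""
    for route_method, pattern, handler in _COMPILED:
        match = pattern.match(path)
        if route_method == method and match:
            return handler, match.groupdict()
    return None
```

For example, `resolve("PATCH", "/requirements/FR-014/review")` yields the review handler together with `{"requirement_id": "FR-014"}`, which is the call a review UI would make when a BA accepts or edits an item.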

Important point:

Exporting a BRD should be one view. Exporting structured requirements should be the real product API.

That makes downstream integration easier:

| Output view | Consumer |
|---|---|
| BRD document | Business stakeholders |
| Requirements table | BA / PM / product owner |
| Gap ledger | SME / process owner |
| Traceability matrix | QA / compliance |
| Jira / Azure DevOps items | Engineering |
| Test case seeds | QA |
| Spec package | Spec-driven development / AI coding workflows |

The GitHub Spec Kit and the broader spec-driven development discussion are useful downstream references. They show why structured specifications matter beyond a static document.

14. What I would not overbuild at first

I would be careful not to make the first version too broad.

| Tempting feature | Why I would defer it |
|---|---|
| Full video understanding | Expensive, noisy, privacy-sensitive, and hard to evaluate |
| Automatic BPMN generation | Useful later, but it can distract from requirement discovery |
| Full ALM replacement | Jama / Trace.Space / Visure-like territory is heavy |
| Fine-tuning-first strategy | Hard to evaluate early; retrieval + rubrics give faster feedback |
| Full Jira round-trip sync | Useful, but export/import can start simpler |
| Fully automated final BRD approval | Risky; human review should remain explicit |
| Test-case generation as primary feature | Valuable later, but only after requirements are grounded |

15. A practical build sequence

If I were building this, I would use a phased approach.

| Phase | Goal | Key output |
|---|---|---|
| Phase 1 | Artifact normalization | Business facts extracted from notes, SOPs, process docs, metrics |
| Phase 2 | Requirement candidates | FR/NFR/data/integration/reporting requirements with evidence |
| Phase 3 | Gap ledger | Missing information and targeted SME questions |
| Phase 4 | Review workflow | Accept/edit/reject/assign owner/check evidence |
| Phase 5 | BRD export | BRD skeleton generated from accepted / pending objects |
| Phase 6 | Eval harness | Coverage, grounding, unsupported claims, reviewer edit distance |
| Phase 7 | Downstream integrations | Jira, Azure DevOps, Confluence, tests, spec packages |

This sequence keeps the first product useful without pretending to solve the entire SDLC on day one.

16. My suggested positioning

I would position it like this:

A requirements discovery workbench for process transformation projects.

Or:

A source-backed BA workbench that turns messy process artifacts into requirements, gaps, edge cases, and BRD sections.

Or more sharply:

Find what your BRD is missing before stakeholders do.

The second and third versions are probably stronger than “AI BRD generator”.

17. Short version

I would keep the core idea, but sharpen the boundary.

Do not make the product only a better BRD writer.

Make it the missing requirements layer between document AI and ALM:

  • parse messy artifacts
  • extract business facts
  • generate source-backed requirement candidates
  • maintain a gap ledger
  • mine edge cases
  • convert operational metrics into requirements
  • keep humans in the review loop
  • export the BRD as one view of a structured requirements package

That would preserve the strongest part of the idea while making it much harder to dismiss as “just another LLM document generator”.