Safe AI is trusted AI: why ChatGPT needs a library service
Placing large language models in the right role for security, risk and compliance.
ChatGPT is easy and convenient. You open a window, ask a question and within seconds you get an answer.
For security, risk and compliance teams working under time pressure and with limited capacity, that speed is tempting. Ask it to draft an audit report, tidy notes or produce minutes. Used this way, on small, bounded tasks, it can be helpful.
The challenge
Trouble begins when we ask it to do much more than a discrete bundle of work, or when a department comes to rely on it.
Ask ChatGPT to stand in for a library, a filing cabinet or a controlled document system and you run into problems.
Security, risk and compliance departments hold millions of words across policies, contracts, technical specifications, incident reports and regulatory guidance. Much of that material changes, and some of it is sensitive. ChatGPT is very good at reasoning and writing, but it cannot see most of your information, prove where a claim came from or show that the version it used is the one your auditors will accept.
That is not a technical quibble. It is the difference between a useful assistant and an unreliable authority.
Think of ChatGPT as a gifted editor with a small desk. You can place a handful of documents in front of it and it will read them and respond. Stack too many papers on the desk and the ones at the bottom are invisible. This “desk size” is the context window. When the pile exceeds the space, relevant passages drop out of view. In practice, the model may miss the appendix with a crucial definition or a newer policy that replaces last year’s rule.
Connectivity is another issue. The model does not roam the internet or your file shares unless you provide that material. If a correct answer sits outside what you supply or beyond what the model learned during training, it will generalise. Sometimes that is fine. In high-stakes environments it is not.
In risk work we care about the exact clause, the page number and the wording that tells us which button to press on the fire panel. We need to point a colleague or an auditor to the source. An answer without a clear citation is, at best, a draft that must be checked.
Repeatability matters too. Ask the same question twice and you may get two different phrasings, or even two different recommendations, unless both answers are anchored to the same evidence. If a client asks why a decision was taken, “because ChatGPT said so” will not do. We need provenance and a trail: who asked the question, which document was consulted, which version, and what changed since the last review.
Access control is an obvious concern. Security information is sensitive and shared on a need-to-know basis. A general chat interface does not, by itself, enforce who can see which paragraph in which policy, or who is allowed to combine information across departments. Nor does it keep an authoritative log of who saw what and when. You can run ChatGPT in safer configurations, but if you require fine-grained permissions and auditable access, you need more than a clever writer.
None of this means we should avoid ChatGPT. It means we should place it in the right role. For small, low-stakes work, say up to a few hundred pages of material, you can curate the excerpts, give the model a focused brief and ask it to answer in simple terms. It will save time and improve clarity.
The problems arise as soon as the scale increases, the facts change frequently, or the stakes rise. In those cases, we need an information layer between questions and the model’s words.
The solution: add a “library service” (retrieval)
A good librarian does not make you read every book. They help you find the right passages quickly. In the AI world, this is retrieval.
Instead of dumping piles of documents on the model's desk, use a separate system to search your corpus, select the few excerpts most likely to answer the question and pass only those to the model. Then ask the model to write from those excerpts and to cite them. Store the document IDs and page numbers so each citation links back to the relevant section of the source. When the source is updated, the search index is updated too, so freshness improves without changing the model.
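The retrieval step described above can be sketched in a few lines. This is a minimal illustration only: it scores excerpts by simple keyword overlap where a production system would use a proper search index, and the document IDs, pages and text are invented for the example.

```python
# Minimal retrieval sketch: score indexed excerpts against a question by
# keyword overlap, then return the best few with their document IDs and
# page numbers so the answer can cite its sources.
# All documents and IDs below are illustrative, not a real corpus.

def tokenize(text):
    return set(word.strip(".,;:?!").lower() for word in text.split())

def retrieve(question, index, top_k=2):
    q_tokens = tokenize(question)
    scored = []
    for excerpt in index:
        overlap = len(q_tokens & tokenize(excerpt["text"]))
        if overlap:
            scored.append((overlap, excerpt))
    # Highest-overlap excerpts first; keep only the top few.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [excerpt for _, excerpt in scored[:top_k]]

index = [
    {"doc_id": "POL-017", "page": 4,
     "text": "Fire panel reset requires the red button held for five seconds."},
    {"doc_id": "POL-003", "page": 12,
     "text": "Visitor badges must be returned at reception before leaving."},
]

results = retrieve("Which button resets the fire panel?", index)
```

Only the returned excerpts, with their IDs and pages, would be passed to the model, which keeps the "desk" small and every claim traceable.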
This approach pairs ChatGPT’s writing ability with your own library service. Factual accuracy improves and claims become checkable. Audit is easier because citations are recorded. Costs fall because you pass only a small number of targeted excerpts. Permissions can be respected by filtering sources per user and role.
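Filtering sources per user and role, as mentioned above, can sit in front of the retrieval step. A minimal sketch, assuming invented role names and documents:

```python
# Sketch of permission filtering before retrieval: a user can only search
# excerpts whose clearance set includes at least one of their roles.
# Role names, document IDs and text are illustrative assumptions.

def allowed_excerpts(excerpts, user_roles):
    roles = set(user_roles)
    return [e for e in excerpts if e["roles"] & roles]

corpus = [
    {"doc_id": "POL-017", "roles": {"security", "compliance"},
     "text": "Fire panel procedure."},
    {"doc_id": "HR-204", "roles": {"hr"},
     "text": "Disciplinary process."},
]

# A compliance officer never sees the HR-only document.
visible = allowed_excerpts(corpus, ["compliance"])
```

Because the filter runs before search, the model is never shown material the asker is not cleared to see, and the same query log that records citations can also record which filtered view produced the answer.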
Trade-offs
Building a capable library service is real work.
You must prepare documents, scan images where needed, split content into sensible chunks with headings and keep an index up to date as files change. Some formats are easier than others. Simple email text is easy. PDFs with diagrams and sketches are harder. You should monitor search quality and avoid filters that hide useful context.
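Splitting content into sensible chunks with headings, as described above, can be as simple as cutting the document at heading lines so each chunk carries its own context. A rough sketch, using an all-caps line as a crude stand-in for real heading detection and an invented sample policy:

```python
# Sketch of heading-based chunking: split a document at lines that look
# like headings (here, crudely, all-caps lines), keeping each heading
# with its section text so every chunk is self-describing.
# The sample document is illustrative.

def chunk_by_headings(text):
    chunks, heading, body = [], None, []
    for line in text.splitlines():
        if line.strip() and line.isupper():
            if heading is not None:
                chunks.append({"heading": heading, "body": " ".join(body).strip()})
            heading, body = line.strip(), []
        else:
            body.append(line)
    if heading is not None:
        chunks.append({"heading": heading, "body": " ".join(body).strip()})
    return chunks

doc = """SCOPE
This policy applies to all staff.
REVIEW CYCLE
Reviewed annually by the risk committee."""

chunks = chunk_by_headings(doc)
```

Real corpora need more care, in particular PDFs with diagrams, but the principle is the same: chunks should be small enough to retrieve precisely and labelled well enough to cite.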
Measure performance. Can the system find an answer at all? Does the answer include citations? How long does it take? How often is the information out of date?
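The measurements above can be computed from a simple log of question-and-answer records. A sketch, with illustrative field names and invented data:

```python
# Sketch of the suggested measurements, computed over a log of query
# records. Field names ("answered", "cited", "stale") are illustrative
# assumptions; latency would be tracked the same way with timestamps.

def score_log(records):
    total = len(records)
    return {
        "answer_rate":   sum(1 for r in records if r["answered"]) / total,
        "citation_rate": sum(1 for r in records if r["cited"]) / total,
        "stale_rate":    sum(1 for r in records if r["stale"]) / total,
    }

log = [
    {"answered": True,  "cited": True,  "stale": False},
    {"answered": True,  "cited": False, "stale": False},
    {"answered": False, "cited": False, "stale": True},
    {"answered": True,  "cited": True,  "stale": False},
]

metrics = score_log(log)
```

Tracking these rates over time shows whether the library service is improving or quietly decaying as the corpus changes.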
These obstacles are not insurmountable. They are the cost you pay for trust.
When to use ChatGPT alone, and when not to
For a non-technical audience, it helps to think in terms of tasks you already do.
If the task is small and bounded, for example “summarise this intel report for a briefing”, “compare two supplier clauses” or “turn yesterday’s incident record into a clean first draft”, ChatGPT with a carefully chosen set of excerpts is likely to be good enough.
If the task is large, dynamic or regulated, for example “what to do following an acid attack”, “tell me about drone usage outside our building” or “evidence our compliance for an external review”, then you want the library service in the loop.
Decision guide
| Situation | Use ChatGPT | Do not use ChatGPT alone | Why |
|---|---|---|---|
| Scope of material | A few pages to a few hundred pages with curated excerpts | Large or sprawling corpora such as policies, contracts, incident logs and regulations | Context window limits; recall gaps without retrieval |
| Stakes | Low-stakes briefs, drafts and note-tidying | High-consequence decisions or safety-critical guidance | Needs provenance, repeatability and approvals |
| Freshness | Static or rarely changing content | Frequently updated sources; “what changed since last review?” | Risk of stale answers without indexed updates |
| Citations and provenance | Nice-to-have references | Must show clause, page, version and source ID or URL | Audits and reviews require traceability |
| Access control | Open or non-sensitive material | Need-to-know content with fine-grained permissions | Chat alone cannot enforce ACLs |
| Repeatability | Acceptable if wording varies | Same query must yield the same, sourced answer | Determinism requires evidence anchoring |
| Data tasks | Light synthesis and explanation | Cross-document joins, filters and tables at scale | Needs structured tools with retrieval |
| Latency and cost | Short prompts, quick drafts | Long pastes and heavy context | Token bloat slows responses and raises cost |
| Compliance | Internal comms and exploratory analysis | Regulated outputs and external submissions | Controlled documents beat free-form prose |
| Human approval | Reversible edits | Irreversible actions affecting people, assets or posture | Keep a named approver in the loop |
| Training and learning | Table-top scenarios and quizzes | Live SOPs, command guidance or authoritative checklists | Source of truth must be controlled and versioned |
Conclusion
ChatGPT is excellent at reasoning, but blind to most of the large, evolving corpus found in security, risk and compliance. It cannot reliably ground, cite or stay current without help.
Adding your own library service gives you recall, freshness, attribution and policy control. You pay for that with some infrastructure and ongoing care, but you gain trust. In security and risk, that is the point. Safe AI is trusted AI. Trust is earned with governance and evidence, not speed and flattery.
Author bio: Andrew Tollinton
Andrew Tollinton is Co-Founder of SIRV, the UK’s enterprise resilience platform. A leader in risk management technology, he chairs the Institute of Strategic Risk Management’s AI in Risk Management group and regularly speaks on AI and resilience at global conferences. A London Business School alumnus, Andrew brings 20+ years’ experience at the intersection of technology, compliance and security.
"SIRV helped us move beyond basic reporting into a system that actively supports decision-making." Les O'Gorman, Director of Facilities, UCB (Pharma and Life Sciences)