Security Teams Don't Have a Retention Problem. They Have an Accessibility Problem.

Three weeks after an employee leaves the company, something feels wrong.

Their credentials were used to access a contractor portal the day before they resigned. No one knows yet whether anything malicious happened, but the security team is asked a simple question: what did this person actually do in their final 30 days?

At first, that sounds like a narrow investigation. In reality, answering it may require reconstructing activity across identity logs, VPN events, SaaS applications, endpoint telemetry, email systems, firewall data, and cloud activity. The evidence may exist, but it is scattered across systems, stored in different formats, retained for different periods, and accessible only to people who understand where to look and how to query each source.

Now consider a different question.

An auditor asks for evidence that privileged users had MFA enabled during a prior compliance period. They do not want a general statement that the policy exists today. They need historical proof. If some privileged users authenticated without MFA, they need the list, the dates, and the supporting evidence.

This is not an advanced detection problem. It is not a threat hunting exercise. It is a security data access problem.

The organization needs the data to still exist, and it needs the data to be usable when the question arrives.

That distinction is becoming one of the most important architectural questions in modern security operations.

Security Teams Collect More Data Than They Can Use

Most organizations collect more security telemetry than ever before. Identity providers, cloud platforms, endpoints, SaaS applications, email systems, AI tools, firewalls, and security products all generate continuous streams of activity. As environments become more distributed and security programs become more mature, the volume and diversity of telemetry only increase.

At first, this appears to be progress. More telemetry should mean more visibility, and more visibility should mean better security outcomes. But in practice, security data does not become useful simply because it was collected. It becomes useful only when the organization can retrieve it, understand it, and apply it to the question it needs to answer.

That question often arrives long after the activity occurred.

A compliance team needs evidence for an audit period that ended months ago. A legal team needs to understand who accessed a system during a specific window. A CISO is asked whether employees are using unauthorized AI tools with access to sensitive data. A security team needs to reconstruct what a former employee did before leaving the company.

In each case, the team is not asking whether telemetry was generated. It almost certainly was. The real question is whether the telemetry is still available and whether anyone can turn it into an answer.

Security Data Has Two Jobs

Security telemetry has two jobs.

First, it must still exist.

Second, it must be accessible.

The first requirement is retention. The second is usability. They are often discussed together, but they are not the same thing. A company can retain years of logs and still be unable to answer a question quickly. The data may sit in cold storage, exist inside a vendor system, require specialized query expertise, or be stored in a format that only a small number of people understand.

This is why retention alone can create a false sense of readiness. The organization may believe it has the evidence because the data technically exists, but when an auditor, investigator, or security leader asks a specific question, accessing that evidence still becomes a project. Someone has to know which systems matter, which fields to search, how the data was transformed, and whether the right telemetry was retained in the first place.

That gap between stored data and usable evidence is where many security teams get stuck.

Why SIEM Became the Default Answer

Historically, security information and event management (SIEM) platforms became the default way to solve this problem because they offered a centralized place to ingest, retain, search, and analyze security telemetry. They gave security teams a system of record and made it possible to correlate activity across multiple sources.

That was, and still is, valuable.

The challenge is that the SIEM category expanded far beyond retention and search. Modern SIEM platforms often include detection engineering, correlation rules, dashboards, reporting, workflow automation, UEBA, SOAR integrations, and extensive content management. For organizations with mature security engineering teams and complex internal requirements, those capabilities can be essential.

But many organizations are not primarily looking for all of that.

They need to retain security data for compliance, audits, legal requests, and incident response. They need searchable access to historical telemetry. They need a way to answer questions about activity that happened weeks, months, or years ago. They may already rely on a managed detection and response (MDR) provider for detection and response, which changes the role a SIEM needs to play in their environment.

When those organizations evaluate a SIEM, they are often trying to solve a narrower problem than the platform was designed to address. They need the data to be retained and accessible. They do not always need to operate the full analytics, detection, dashboarding, and workflow layer that comes with it.

The Gap Between Log Retention and SIEM

This creates a difficult middle ground.

Basic log retention can answer the question, “Do we keep logs?” It does not always answer the more important question, “Can we use those logs when we need them?” Retention without strong access, search, and investigation workflows often leaves security teams dependent on manual effort, vendor support, or internal specialists.

A full SIEM sits on the other side of the spectrum. It can provide powerful search, detection, analytics, dashboards, and workflow capabilities, but it often introduces significant cost and operational overhead. Teams need to manage ingestion, tune data sources, maintain parsers, write queries, control storage costs, and decide which data is worth indexing.

Many security teams sit between those two models. They need more than basic retention, but they do not necessarily need a full SIEM. They need years of retained telemetry, searchable access to that data, and the ability to answer questions without turning every request into a data engineering project.

This is the space where Security Data Lakes are emerging.

Why Security Data Lakes Are Emerging

Security Data Lakes are emerging because retention and accessibility are becoming distinct requirements.

The core idea is straightforward: keep security telemetry for as long as the organization needs it, make recent data easy to search, preserve historical data economically, and allow older data to be brought back when deeper analysis is required.

This is different from treating every piece of telemetry as something that must remain continuously indexed in the most expensive tier forever. It also differs from treating storage as an archive with no practical investigation layer. A useful Security Data Lake must balance economics with accessibility.

That balance matters because security teams do not know in advance which questions they will need to answer later. The former employee case, the MFA audit request, the board question about AI tools, and the historical incident review all depend on the same underlying principle: the data needs to be there when the question appears, and the organization needs a practical way to access it.

The product team's phrase for this is simple: the problem is not the data; it is the gap between the data and the question.

That gap is what Security Data Lakes are designed to close.
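
To make the tiering idea in this section concrete, here is a minimal sketch of how a retention policy might decide where an event lives and what a historical search would need to touch. The 90-day searchable window, the three-year retention period, and the function names are all hypothetical values chosen for illustration; they do not describe any particular product.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy values; real windows depend on the organization's
# compliance requirements and budget.
HOT_WINDOW = timedelta(days=90)        # recent data kept directly searchable
RETENTION = timedelta(days=3 * 365)    # total time data is kept at all


def storage_tier(event_time: datetime, now: datetime) -> str:
    """Return 'hot', 'cold', or 'expired' for an event based on its age."""
    age = now - event_time
    if age > RETENTION:
        return "expired"
    if age > HOT_WINDOW:
        return "cold"
    return "hot"


def plan_search(start: datetime, end: datetime, now: datetime) -> dict:
    """Describe which tiers a search over [start, end] would need to touch."""
    hot_boundary = now - HOT_WINDOW
    return {
        "search_hot": end >= hot_boundary,       # part of the window is recent
        "rehydrate_cold": start < hot_boundary,  # part of the window is archived
    }


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
recent_plan = plan_search(now - timedelta(days=30), now, now)
audit_plan = plan_search(now - timedelta(days=400), now - timedelta(days=300), now)
```

In this sketch, a question about the last 30 days touches only the searchable tier, while a question about an audit period that ended roughly ten months ago requires bringing archived data back first. The point is that the cost decision and the access decision are made separately.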

Why Raw Telemetry Matters

Traditional security architectures often make data useful by normalizing it. Logs are parsed, fields are mapped, schemas are defined, and raw events are transformed into structures that humans and detection systems can query more easily.

That approach has real advantages. It makes detections easier to write, dashboards easier to build, and correlations easier to maintain. But it also forces decisions at ingestion time about how the data should be structured and which fields will matter later.

Security investigations do not always respect those assumptions.

A question asked six months later may depend on a field that was not normalized, a source-specific detail that was flattened, or a raw event that did not fit neatly into a predefined schema. When that happens, the organization may have retained the data but lost the fidelity needed to answer the question fully.

This is why raw telemetry becomes important in a Security Data Lake model. Retaining data in its original format preserves evidence before anyone knows exactly how it will be used. It allows organizations to ask new questions later without requiring every future investigation to be predicted at the time of ingestion.
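
A small sketch can illustrate the fidelity problem. The event shape, the field names, and the normalized schema below are invented for illustration and do not correspond to any specific vendor's format. The normalizer keeps the fields someone thought would matter at ingestion time; a later question depends on one it dropped.

```python
import json

# Hypothetical raw event from an identity provider; field names are illustrative.
RAW_EVENT = json.dumps({
    "timestamp": "2024-03-02T09:15:00Z",
    "actor": {"email": "alice@example.com"},
    "action": "login",
    "client": {"device_trust": "unmanaged", "ip": "203.0.113.7"},
})


def normalize(raw: str) -> dict:
    """Map a raw event onto a fixed schema chosen at ingestion time (assumed)."""
    event = json.loads(raw)
    return {
        "time": event["timestamp"],
        "user": event["actor"]["email"],
        "action": event["action"],
        "src_ip": event["client"]["ip"],
        # client.device_trust is not part of the schema, so it is dropped here.
    }


normalized = normalize(RAW_EVENT)

# Months later: "Was this login from an unmanaged device?"
answerable_from_normalized = "device_trust" in normalized
answer_from_raw = json.loads(RAW_EVENT)["client"]["device_trust"]
```

The normalized record cannot answer the question because the field was never mapped. The retained raw event still can.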

The tradeoff is that raw telemetry can be harder for humans to access directly. Different systems produce different formats, and reading across them requires expertise.

That is where agents begin to change the model.

Why AI Changes Accessibility

Historically, accessibility required structure. Analysts needed to understand schemas, query languages, field names, source formats, and normalization rules. The more data an organization retained, the more expertise was required to use it effectively.

Agentic systems change part of that tradeoff. They do not eliminate the need for good data architecture, and they do not replace security judgment. But they can reduce the distance between a human question and the underlying telemetry.

A user can ask a question in natural language. The agent can translate that question into a query, search across relevant telemetry, return results, explain how it reached them, and support follow-up questions as the investigation develops. Advanced users can still use KQL directly, but access to the data no longer depends entirely on a small group of people who understand every schema and source.

This matters because many of the most common questions are not sophisticated detection problems. They are data access problems. Who logged in without MFA? What did this employee do before leaving? Which AI tools are employees using? Did this domain appear anywhere else in our environment?

The value of AI in this context is not that it makes storage smarter. It makes retained data more accessible.
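
To show why a question like "Who logged in without MFA?" is a data access problem rather than a detection problem, here is a minimal sketch of the underlying logic once the historical records are in hand. The record shape, the user names, and the privileged-user set are hypothetical. In practice the same filter would be expressed as a query against retained sign-in telemetry, whether written by hand or generated from a natural language question.

```python
from datetime import date

# Hypothetical, already-retrieved sign-in records for illustration only.
SIGN_INS = [
    {"user": "root-admin", "date": date(2024, 2, 10), "mfa": True},
    {"user": "db-admin", "date": date(2024, 2, 11), "mfa": False},
    {"user": "db-admin", "date": date(2024, 5, 1), "mfa": False},
    {"user": "intern", "date": date(2024, 2, 12), "mfa": False},
]

# Assumed list of privileged accounts during the audit period.
PRIVILEGED = {"root-admin", "db-admin"}


def privileged_without_mfa(sign_ins, privileged, start, end):
    """Return sign-ins by privileged users inside [start, end] that lacked MFA."""
    return [
        event
        for event in sign_ins
        if event["user"] in privileged
        and start <= event["date"] <= end
        and not event["mfa"]
    ]


findings = privileged_without_mfa(
    SIGN_INS, PRIVILEGED, date(2024, 1, 1), date(2024, 3, 31)
)
```

The filter itself is trivial. The hard parts are the ones this article describes: the sign-in records from the audit period must still exist, and someone must be able to retrieve them. A real audit would also need to know who was privileged at the time, not only who is privileged today.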

Why We Built Daylight Agentic Security Data Lake

Today we are announcing Daylight Agentic Security Data Lake, a new managed service for Daylight MDR customers that provides long-term security data retention and searchable access to historical telemetry.

We built it because we kept hearing the same question from organizations evaluating MDR: if Daylight handles detection and response, do we still need a SIEM?

The honest answer is that it depends on what the organization needs the SIEM for. Some teams need a full SIEM platform because they rely on custom detections, correlation engineering, dashboards, workflow automation, and internal security engineering. For those teams, a SIEM remains important.

But other organizations are trying to solve a more specific problem. They need to retain security telemetry for years. They need access to that telemetry for compliance, audits, legal requests, internal investigations, and operational questions. They need data to be searchable without hiring a data engineer or operating a complex platform.

Daylight Agentic Security Data Lake is designed for that use case.

The service retains telemetry collected through Daylight MDR integrations. Recent data remains searchable for immediate access. Historical data can be retained in lower-cost storage and rehydrated when needed. Customers can search through an agentic natural language experience or use KQL directly when they want precise control.

The goal is not to replace every SIEM capability.

The goal is to give MDR customers long-term data retention and practical access to the telemetry they need.

What This Means for MDR Customers

Most MDR services focus on detection and response. That is necessary, but it does not solve every data question a security team faces.

A customer may need to answer an audit request that has nothing to do with an active alert. They may need to support a legal inquiry. They may need to explore employee activity, SaaS usage, AI tool adoption, or historical authentication behavior. Those questions often sit outside the normal MDR workflow, but they still require security telemetry.

With Daylight Agentic Security Data Lake, MDR customers get a managed way to retain that telemetry and access it when needed. They do not need to operate a SIEM simply to keep years of data searchable. They do not need to open a ticket every time they need to ask a historical question. And they do not need to decide in advance exactly which future question will matter.

The data remains available.

The access layer is built around questions.

That is the shift.

The Future of Security Data

Security telemetry will continue to grow. Compliance and audit requirements will continue to expand. AI systems, SaaS platforms, cloud services, and identity providers will continue to generate more data that security teams may need later.

The old assumption was that retaining and accessing that data required operating a SIEM. That will remain true for some organizations, especially those that need the full set of SIEM capabilities. But for many teams, retention and accessibility are becoming a distinct operational need in their own right.

That is why Security Data Lakes are emerging.

Not because organizations want another place to store logs, but because they need a way to turn retained telemetry into usable evidence when questions arise.

The future of security data is not just more collection.

It is better access to the data organizations already collect.