Splunk Search

support for fill-forward or "last observation carried forward"

Kemark
Explorer

Does Splunk support fill-forward or "last observation carried forward"?

I want to create daily monitoring. One example is tracking the version of all reported items.
The version is only logged when it changes, but for each day I need the last available version of each item.

How can this be realized in Splunk to produce a line chart?
 
Thank you in advance
Markus


PickleRick
SplunkTrust

There are several possible approaches to such a problem. One is filldown, mentioned already by @richgalloway. Another is streamstats or autoregress. Or you might simply reformulate your problem to avoid this altogether. It all depends on the particular use case.
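For illustration, here is a minimal sketch of the streamstats approach on generated data (all field and value names are invented for the example):

| makeresults count=5
| streamstats count as row
``` only odd rows report a value; even rows are null ```
| eval version=if(row%2==1, "v".row, null())
``` LOCF: last() skips nulls, so each null row inherits the previous value ```
| streamstats last(version) as version_locf

The filldown command achieves the same carry-forward; streamstats is handy when you also need per-group windows (by clauses) or additional running calculations.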

Kemark
Explorer

Hi @PickleRick ,

the use case is as follows...

My input is JSON like the following:
{ "timestamp": "2025-06-01T09:26:00.000Z", "item":"I.1","version":"1.1.0-1"}
{ "timestamp": "2025-06-01T09:26:00.000Z", "item":"I.2","version":"1.1.0-1"}
{ "timestamp": "2025-06-01T09:26:00.000Z", "item":"I.3","version":"1.1.0-1"}
{ "timestamp": "2025-06-01T09:26:00.000Z", "item":"I.4","version":"1.1.0-1"}
{ "timestamp": "2025-08-01T09:26:00.000Z", "item":"I.1","version":"1.1.0-2"}
There are 4 items on 06/01 and one item with a newer version on 08/01.

 

The query just counts the current version per day.
source="..."
| eval day=strftime(_time, "%Y-%m-%d")
| chart count by day, version
 
The actual result is:

| day        | 1.1.0-1 | 1.1.0-2 |
|------------|---------|---------|
| 2025-06-01 | 4       | 0       |
| 2025-08-01 | 0       | 1       |

 

but what I expect is:

| day        | 1.1.0-1 | 1.1.0-2 |
|------------|---------|---------|
| 2025-06-01 | 4       | 0       |
| 2025-07-01 | 4       | 0       |
| 2025-08-01 | 3       | 1       |

 

Another challenge is that I want to spread the result over about 60 days, and there are over 100,000 items.

PickleRick
SplunkTrust

Ok. So it seems you're not just filling down, because at the end you're subtracting from what's already been counted. There is much more logic here. Are there any limitations on the versions per day? What if there are more than two versions? It seems much more complicated.


Kemark
Explorer

There can be a maximum of 3 versions per day.


PickleRick
SplunkTrust

OK. People use their own spare time to help others. Not specifying the problem properly is simply wasting their time. Present your problem as clearly and completely as possible. Sure, there might sometimes be some unclear things, but guessing not only the solution but also the problem itself is not gonna cut it.

So if you want to get some serious help, first invest something into specifying what you're trying to achieve - what the data is, what the relation between the data and the desired output is, and what the logic behind the output is.

Just saying "three versions per day" doesn't tell anything about how the input data corresponds to the output. Maybe some counts should be aggregated, maybe not. How are the versions ordered? Do the counts "spill down"? What is going on with that data?


Kemark
Explorer

First of all, many thanks for your support and sorry if I'm wasting time.

Theoretically, there should only be a maximum of 3 versions. In practice, however, I have seen more versions at the same time.

I had originally hoped that Splunk had native support for LOCF, but I have not found it yet. In my research so far I have only come across complex cross-join solutions.

Perhaps it is the wrong use case for this requirement and I need to proceed differently:

  • daily recording of the version for all items
    Disadvantage: over 100,000 log entries are recorded daily

  • cron job that records the missing versions for all items
    Disadvantage: here too, over 100,000 log entries are recorded daily

  • usage of a different tool for this use case, e.g. InfluxDB,
    which seems to have native LOCF support.
    Disadvantage: several tools must be supported
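If you go the scheduled-search route (your second option), one common Splunk pattern is a daily saved search that writes one snapshot row per item into a summary index, rather than indexing new raw events. A hedged sketch, where the source and index names are placeholders:

source="..." earliest=-30d
``` keep only the latest reported version per item ```
| stats latest(version) as version by item
``` timestamp every snapshot row at the start of today ```
| eval _time=relative_time(now(), "@d")
``` write the snapshot into a summary index ```
| collect index=my_summary

The daily chart then runs against the summary index, where every item has exactly one row per day, so no fill-forward is needed at report time.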

PickleRick
SplunkTrust

No worries. It's just that some assumptions which are obvious to the one writing the question might not be clear at all to the readers.

Again - there are some ways of approaching this problem, but we need to know what you want to do 🙂

Splunk can carry over the value from the previous result row but as for the additional logic - it has to know what to do with it.

Compare this:

| makeresults count=10 
| streamstats count
| eval field{count}=count
| table field*

With this:

| makeresults count=10 
| streamstats count
| eval field{count}=count
| table field*
| streamstats current=f last(*) as *

The values are carried over. But if there is additional logic which should be applied to them - that's another story.


ITWhisperer
SplunkTrust

As @PickleRick says, your problem is (still) not well described. However, based on the limited information, you could try something like this

``` Find latest daily version for each item ```
| timechart span=1d latest(version) as version by item useother=f limit=0
``` Filldown to cover missing intervening days (if any exist) ```
| filldown
``` Fill null to cover days before first report (if any exist) ```
| fillnull value=0
``` Convert to table ```
| untable _time item version
``` Count versions by day ```
| timechart span=1d count by version useother=f limit=0

Here is a simulated version using gentimes to generate some dummy data (which hopefully represents your data closely enough to be valuable)

| gentimes start=-3 increment=1h
| rename starttime as _time
| table _time
| eval item=(random()%4).".".(random()%4)
| eval update=random()%2
| streamstats sum(update) as version by item global=f
``` Find latest daily version for each item ```
| timechart span=1d latest(version) as version by item useother=f limit=0
``` Filldown to cover missing intervening days (if any exist) ```
| filldown
``` Fill null to cover days before first report (if any exist) ```
| fillnull value=0
``` Convert to table ```
| untable _time item version
``` Count versions by day ```
| timechart span=1d count by version useother=f limit=0

Notice that with the simulated data, all the rows add up to 16, which represents the 16 possible item names used in the simulation. Also, note that the counts move towards the bottom right as the versions of the items go up over time.


richgalloway
SplunkTrust

It would help to know more about your use case including any existing SPL you're using.

Have you looked at the filldown command?
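For reference, a minimal example of what filldown does (the field and values here are made up):

| makeresults count=4
| streamstats count as day
``` only the first row has a version; the rest are null ```
| eval version=if(day==1, "1.1.0-1", null())
``` filldown replaces each null with the most recent non-null value ```
| filldown version

After the filldown, every row carries version="1.1.0-1".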

---
If this reply helps you, Karma would be appreciated.

Kemark
Explorer

Hi @richgalloway,
Yes, I already tried the filldown command, but I had no success with it. Probably I used it wrong.
I described my use case in my reply to @PickleRick in this thread.
