This post is about Incremental Publishing using the out-of-the-box publishing mechanism, not the new Sitecore Publishing Service. There may be reasons why you can't use SPS just yet for your project.
Again, if you are using a Smart or Full site publish this article won't apply, but it's recommended to set up a scheduled incremental publish.
This post relates to Sitecore 9.3; I haven't yet reviewed whether it applies to Sitecore 10.
Great, so I know this is about Sitecore OOTB Incremental Publishing, and why it’s recommended to use this if SPS isn’t a good fit, tell me more about this edge case.
Disclaimer: this is an Incremental Publishing edge case where an item which should be published never gets published. Most of the time you probably wouldn't see this issue occur, and if you did spot that something wasn't published as it should have been, you'd just smart publish/republish the unpublished items (and possibly their children) to fix the issue.
You might be thinking this could be user error: something might not have been pushed through workflow, or not created until very recently (always good to explore these before jumping to conclusions about technical issues).
However if you have ruled out any user issues, and reviewed the logs and confirmed that the item should have been picked up in an Incremental Publish, you need to start looking for the Ghost in the Machine.
Working closely with Sitecore Support, and with enough diagnostics data from the rare occurrences when this issue would rear its head, it was spotted that the items which wouldn't get picked up by any incremental publish were saved around the time of the scheduled publishes.
Let’s explore what’s going on, and why an item saved very near an incremental publish operation on a rare occurrence might not get picked up.
The “Properties” table stores the Last Publish date as a “String” including the ticks.
The “PublishQueue” table “Date” column stores the date in a Sql “Datetime” column which isn’t as precise and results in rounding.
The "Item" table "Created" and "Updated" columns use the same Sql "datetime" type, with the same rounding issue.
Due to these differences, say a publishing job was started with a precise From value of 14:00:00.0017545Z: in a datetime column that value would round to 14:00:00.003, and the query would miss an item whose queue entry had been rounded down to 14:00:00.000.
These millisecond rounding issues could cause an item not to get published; however, that's a very small window for the issue to occur.
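To make the rounding concrete, here's a minimal C# sketch. The RoundToSqlDatetime helper is my own illustrative approximation of Sql Server's datetime behaviour, not a Sitecore or SqlClient API:

```csharp
using System;

public static class DatetimeRoundingDemo
{
    // Illustrative approximation only: Sql Server's datetime type is accurate
    // to 1/300th of a second, so fractional seconds land on .000/.003/.007 steps.
    public static DateTime RoundToSqlDatetime(DateTime value)
    {
        long wholeSeconds = value.Ticks - (value.Ticks % TimeSpan.TicksPerSecond);
        double fraction = (value.Ticks - wholeSeconds) / (double)TimeSpan.TicksPerSecond;
        double rounded = Math.Round(fraction * 300) / 300; // snap to nearest 1/300s
        return new DateTime(wholeSeconds + (long)(rounded * TimeSpan.TicksPerSecond), value.Kind);
    }

    public static void Main()
    {
        // A precise value of 14:00:00.0017545 lands on the .003 step,
        // stepping over an entry stored as 14:00:00.000.
        var precise = new DateTime(2020, 8, 13, 14, 0, 0).AddTicks(17545);
        Console.WriteLine(RoundToSqlDatetime(precise).ToString("HH:mm:ss.fff")); // 14:00:00.003
    }
}
```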
When you save an item, the item's statistics get updated with the timestamp from C#/.NET in memory before being persisted to the database.
So the latency to get the record with the timestamp into the database could certainly be an issue/factor.
The datetime value used defaults to not storing the ticks/milliseconds on the item; it just defaults to DateTime.ToString("yyyyMMddTHHmmss"), truncating any milliseconds or ticks.
(See ItemStatistics.UpdateRevision, DateUtil.IsoNow, DateUtil.ToIsoDate.)
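The effect of that format string is easy to demonstrate: a round-trip through "yyyyMMddTHHmmss" silently drops everything below the second (the timestamp below is illustrative):

```csharp
using System;
using System.Globalization;

public static class IsoTruncationDemo
{
    public static void Main()
    {
        // An item saved at 18:00:00.9881779...
        var saved = new DateTime(2020, 8, 13, 18, 0, 0).AddTicks(9881779);

        // ...formatted with the second-precision pattern, loses the fraction entirely.
        string iso = saved.ToString("yyyyMMddTHHmmss");
        Console.WriteLine(iso); // 20200813T180000

        var roundTripped = DateTime.ParseExact(iso, "yyyyMMddTHHmmss", CultureInfo.InvariantCulture);
        Console.WriteLine(roundTripped.ToString("o")); // 2020-08-13T18:00:00.0000000
    }
}
```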
So that could certainly be an issue if saving an item towards the end of the second (with a high millisecond value).
On save, items are added to the PublishQueue, with the date field values Updated, PublishDate, UnpublishDate, ValidFrom and ValidTo.
(See DefaultPublishManager.DataEngine_SavedItem, DefaultPublishManager.AddToPublishQueue, DefaultPublishManager.GetActionDateFields.)
So that truncated-to-the-second value gets copied across to the PublishQueue table.
If I save an item at 2020-08-13T18:00:00.9881779+00:00, it will get saved as being created/modified at 2020-08-13T18:00:00.000, and added to the publish queue with that value.
Say an incremental publish kicks off at 18:00 and, with a slight delay in starting, picks up from the last publish: 17:00:00.103-18:00:00.103. This doesn't pick up the item, which hasn't been created yet at 18:00:00.9881779.
And the future publish operation won't pick it up either, as it will look for items between 18:00:00.103-19:00:00.103, while the item's value is saved as 18:00:00.000.
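Putting the timeline into code makes the miss visible. This sketch hard-codes the timings from the walk-through above and simplifies the window check to an inclusive range comparison (an assumption; the real SQL query shape may differ):

```csharp
using System;

public static class MissedWindowDemo
{
    public static void Main()
    {
        // The item is saved at 18:00:00.9881779, but its publish queue
        // entry stores the truncated value 18:00:00.000.
        var savedAt = new DateTime(2020, 8, 13, 18, 0, 0).AddTicks(9881779);
        var queueEntryDate = new DateTime(2020, 8, 13, 18, 0, 0);

        // The 18:00 incremental publish starts ~103ms late; the 19:00
        // publish then covers 18:00:00.103-19:00:00.103.
        var firstTo = new DateTime(2020, 8, 13, 18, 0, 0, 103);
        var secondFrom = firstTo;
        var secondTo = new DateTime(2020, 8, 13, 19, 0, 0, 103);

        // The item didn't exist yet when the first publish ran...
        bool existedForFirstPublish = savedAt <= firstTo;
        // ...and its truncated queue date falls before the second window.
        bool inSecondWindow = queueEntryDate >= secondFrom && queueEntryDate <= secondTo;

        Console.WriteLine($"{existedForFirstPublish}, {inSecondWindow}");
        // False, False - the item is never picked up.
    }
}
```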
A workaround would be to fix the query to include records affected by this precision/latency issue, at the expense of reprocessing some items more than once.
To do this you can customise the GetPublishQueueEntries method on the SqlDataProvider:
public override List<PublishQueueEntry> GetPublishQueueEntries(DateTime from, DateTime to, CallContext context)
{
    // Widen the window by one second to catch entries whose timestamps
    // were truncated down to the start of the second.
    from = from.AddSeconds(-1);
    return base.GetPublishQueueEntries(from, to, context);
}
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/">
  <sitecore>
    ...
    <dataProviders>
      <main>
        <patch:attribute name="type" value="YourNamespace.CustomSqlServerDataProvider, YourAssembly" />
      </main>
    </dataProviders>
    ...
  </sitecore>
</configuration>
Item statistics get updated at 17:59:59.999 in C#. The item gets saved as 17:59:59.000, but there is some latency before it's added to the database, so it doesn't arrive until after 18:00:00.103. The item gets added to the publish queue with the value 17:59:59.000.
The publish window by default queries for 18:00:00.103-19:00:00.103; with the 1-second fix it would query for 17:59:59.103-19:00:00.103.
Both of these would still miss the item at 17:59:59.000.
To round the start time of the query down to 0 milliseconds and 0 ticks, you can add the following line (credit: Stack Overflow):
public override List<PublishQueueEntry> GetPublishQueueEntries(DateTime from, DateTime to, CallContext context)
{
    // Widen the window by one second, then round down to the whole second.
    from = from.AddSeconds(-1);
    from = from.AddTicks(-(from.Ticks % TimeSpan.TicksPerSecond));
    return base.GetPublishQueueEntries(from, to, context);
}
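As a sanity check, this sketch applies both adjustments to the 18:00:00.103 start time from the walk-through above, then tests the stranded 17:59:59.000 entry against the widened window (assuming an inclusive range comparison):

```csharp
using System;

public static class WindowWideningDemo
{
    public static void Main()
    {
        var from = new DateTime(2020, 8, 13, 18, 0, 0, 103);

        from = from.AddSeconds(-1);                                    // 17:59:59.103
        from = from.AddTicks(-(from.Ticks % TimeSpan.TicksPerSecond)); // 17:59:59.000

        // The queue entry stranded at 17:59:59.000 in the walk-through:
        var entry = new DateTime(2020, 8, 13, 17, 59, 59);
        var to = new DateTime(2020, 8, 13, 19, 0, 0, 103);

        Console.WriteLine($"{from:HH:mm:ss.fff} picked up: {entry >= from && entry <= to}");
        // 17:59:59.000 picked up: True
    }
}
```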
So the new query, 17:59:59.000-19:00:00.103, would pick up even this more extreme edge case.
TL;DR: OOTB, if you save an item at the same time an incremental publish kicks off, depending on the exact timing there is an edge case where it may never get published.
With a small workaround in code you can ensure items saved around the time of a scheduled publish get picked up by the next incremental publish, at the expense of reprocessing some items/events. However, this workaround isn't in the form of an official hotfix and would require extensive testing. No guarantees. Only consider this if you are experiencing the issues described, and of course speak with Sitecore Support first.
Well, firstly you have to be using incremental publishing and scheduled publishing. And secondly, unless you've spotted items not getting published, and have enough editor activity to keep encountering this issue on rare occasions, you probably don't need it.
However, if you are using incremental and scheduled publishing, have spotted odd issues with items not getting published, and can't explain them as user error, this might help explain the Ghost in the Machine and offer a workaround. But don't apply this fix if you don't need it.
And if you do implement this fix you'll have to test the customisation yourself, as it's not an official hotfix and comes with no guarantees. It's just a suggestion that could help if you're suffering from this particular issue, and if implemented incorrectly it could lead to big problems, e.g. re-processing of events. So you've been warned.
Longer term, I'm sure SPS will further improve and become usable by more customers. At the point where Smart Publishing becomes fast enough, we won't need the old incremental publishing.
If there were to be a fix for incremental publishing, the window could be reduced by calculating and storing the last modified time at the Sql Server level with millisecond/timestamp precision, rather than using C# (which truncates down to the second) plus the latency of inserting it into the database; and by using the DateTime2 column type rather than DateTime, with its millisecond rounding precision issues. However, this is quite a schema change for an edge case, so the workaround will suffice for those that are affected.
Sitecore Bug Tracking Reference #397573.
Don't use this workaround unless you are experiencing the issues described, and even then perform your own testing; no guarantees.
These code samples haven’t been tested, and just give the gist of the fix required.
Work with Sitecore Support to confirm you are experiencing the issue described, as this customisation has a cost to it and could have negative implications for your solution even if implemented correctly, and even more if implemented incorrectly.