A theory about a problem that has always existed
Posted: Sat Oct 02, 2021 8:27 am
Hi, ever since I started using PrimoCache many years ago, I have had problems when writing to external hard drives: the L1 and L2 flush to disk slows right down to only a few blocks per second, and the whole system becomes sluggish because of this very slow cache flushing on the external drive. I have mentioned it a few times here.
It tends to happen during backup routines where I write many large files to the external drive. Often it slows down to the point where the backup will not copy a new file until the cache is entirely flushed. It is very frustrating.
In the past few days I had to set up my backup routines again and ran into the problem a lot, and I finally worked out what triggers it. It should be an item you look to address. I don't know if it is a problem with PrimoCache or a fundamental problem with Windows itself.
I found that the problem occurs when there is overlap between files being written to the Windows write buffer. For example, if two 10 GB files are written in sequence, and the second file starts writing while the PrimoCache write cache is still filling from the first, then the two files overlap in the Windows write buffer that feeds PrimoCache's write cache. The slowdown occurs when PrimoCache flushes the part of the write cache where the two files overlapped.
You can reproduce this yourself by copying some large files to an external USB hard drive so that the copies overlap, e.g. when one file is halfway written, start copying the next. If you watch disk writes in Resource Monitor you will see that PrimoCache flushes nicely at around 100 to 200 MB/s (the speed of the drive). When the flushing reaches the point where the original copies overlapped, it slows down a LOT, to just a few blocks per second. Resource Monitor shows writes to the external drive dropping to 1 to 2 MB/s, and the queue on the external drive fluctuates wildly, with queue length climbing as high as 300.
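For anyone who wants to script the reproduction instead of timing it by hand, here is a minimal Python sketch. It starts copying the first file in a background thread and starts the second copy once the first is halfway written. The function names and the paths in the usage note are hypothetical, and the script only generates the overlapping writes; it does not talk to PrimoCache in any way, so you still need to watch Resource Monitor yourself.

```python
import os
import threading


def copy_file(src, dst, chunk_size=1024 * 1024, on_progress=None):
    """Copy src to dst in chunks, reporting bytes copied so far."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
            if on_progress is not None:
                on_progress(copied)
    return copied


def overlapped_copy(src_a, src_b, dest_dir, chunk_size=1024 * 1024):
    """Copy src_a, and start copying src_b once src_a is halfway written."""
    half = os.path.getsize(src_a) // 2
    halfway = threading.Event()

    def progress(copied):
        if copied >= half:
            halfway.set()

    dst_a = os.path.join(dest_dir, os.path.basename(src_a))
    dst_b = os.path.join(dest_dir, os.path.basename(src_b))

    def worker():
        try:
            copy_file(src_a, dst_a, chunk_size, progress)
        finally:
            # Never leave the main thread waiting, even if src_a is
            # empty or the first copy fails.
            halfway.set()

    thread = threading.Thread(target=worker)
    thread.start()
    halfway.wait()                        # first file is about half written
    copy_file(src_b, dst_b, chunk_size)   # second copy now overlaps the first
    thread.join()
    return dst_a, dst_b
```

You would call it with something like overlapped_copy(r"D:\big1.bin", r"D:\big2.bin", "E:\\backup"), where E: is the external drive with the write cache enabled (example paths only).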
I haven't tested on an internal drive, but I expect the problem is there too, to a much lesser degree. You may want to test on an internal drive yourself. So this is why and when the problem occurs. It probably happens with smaller files too. I think it depends on how much overlap there is, and how many files overlap, in the original Windows write buffer that gets written to PrimoCache and is eventually flushed to the hard drive. Maybe recalculate the flush order more regularly? I'm not sure how that would impact performance. I hope this helps to solve a long-standing problem.
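As a rough illustration of the theory, here is a toy Python model. It is not how PrimoCache actually orders its flush (that is unknown to me), and it assumes each file sits in one contiguous range of block addresses on the drive, which is a simplification. The block ranges and function names are made up for the example. It compares how far the drive head would travel if blocks were flushed in the order they arrived (with the two files interleaved during the overlap) against flushing the same blocks sorted by address.

```python
def interleave(a, b):
    """Alternate blocks from two equal-length sequences: a0, b0, a1, b1, ..."""
    out = []
    for x, y in zip(a, b):
        out.extend((x, y))
    return out


def head_travel(order):
    """Total distance between consecutive block addresses in a flush order."""
    return sum(abs(nxt - cur) for cur, nxt in zip(order, order[1:]))


def overlapped_arrival(file_a, file_b, overlap):
    """Arrival order when the last `overlap` blocks of file_a are written
    at the same time as the first `overlap` blocks of file_b."""
    split = len(file_a) - overlap
    return (file_a[:split]
            + interleave(file_a[split:], file_b[:overlap])
            + file_b[overlap:])


# Hypothetical layout: two 1000-block files, far apart on the disk.
file_a = list(range(0, 1000))
file_b = list(range(100_000, 101_000))

arrival = overlapped_arrival(file_a, file_b, overlap=500)

print("arrival order :", head_travel(arrival))
print("sorted by LBA :", head_travel(sorted(arrival)))
```

In this model the arrival-order flush jumps back and forth between the two files for every block in the overlapped region, while the sorted flush crosses the gap only once. On a USB hard drive that kind of seek pattern would match the drop from 100+ MB/s to 1 to 2 MB/s, but again this is only a sketch of the hypothesis, not a measurement.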