Memory usage and performance optimizations #7
This has the potential to drastically reduce the memory usage for large files with many revisions. The text of a commit is no longer needed once its child commit has been processed. The memory usage optimization does not work for branches, as these can't be processed reasonably by rcs-fast-export anyway.
replace will already copy the array contents on its own
flatten will just ignore those empty arrays
This avoids the creation of an intermediate array and speeds up the whole conversion by approx. 30%
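The two array remarks above rely on standard Ruby behaviour, which the following minimal sketch illustrates. The variable names are hypothetical and are not taken from rcs-fast-export; the "instead of" variants in the comments are only assumed examples of the redundant work being removed.

```ruby
# Illustrative only: names are hypothetical, not from rcs-fast-export.

# Array#replace copies the contents itself, so an explicit dup is redundant.
source = ["line 1\n", "line 2\n"]
target = []
target.replace(source)          # instead of target.replace(source.dup)
source << "line 3\n"            # later changes to source do not affect target

# Array#flatten drops empty nested arrays, so they need no prior filtering.
chunks = [["a\n"], [], ["b\n", "c\n"], []]
merged = chunks.flatten         # instead of chunks.reject(&:empty?).flatten
```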
We agree with PR #7, PR #8, PR #10 and have applied all of them to this rcs-fast-export repo: https://github.com/lcn2/rcs-fast-export along with a
@lcn2 Judging from a quick look at https://github.com/lcn2/rcs-fast-export/commits/master/ you didn't include my changes from https://github.com/MichaelEischer/rcs-fast-export/commits/master/ . Either way, from my side this PR only exists in case someone still has a use for it. I no longer have any RCS repositories.
We will update our forked repo later today with any missed mods from https://github.com/MichaelEischer/rcs-fast-export/commits/master/ .. stay tuned ..
We no longer have RCS repositories either. Our repo exists in case anyone discovers an old RCS directory, such as from an old backup. UPDATE 0: Fixed as suggested.
The first commit of this series solves the problem that long RCS histories of large files (nearly 30k revisions resulting in a 4 MB file) require a tremendous amount of memory (200 GB of RAM were not enough...). The solution is to keep only a hash digest for revisions that will no longer be used for diffing. This way commit coalescing is still possible by using the hash, but it requires a lot less memory.
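A minimal sketch of the digest idea follows. The class `RevisionStore`, its method names, and the choice of SHA-1 are assumptions made for illustration; the actual patch may structure this differently and use another digest.

```ruby
require 'digest/sha1'

# Hypothetical store: full text is kept only while a revision may still
# serve as a diff base; the digest is kept for every revision.
class RevisionStore
  def initialize
    @text = {}    # revision id => array of lines (needed for diffing)
    @digest = {}  # revision id => hex digest (cheap to keep around)
  end

  def add(rev, lines)
    @text[rev] = lines
    @digest[rev] = Digest::SHA1.hexdigest(lines.join)
  end

  # Drop the full text once no further diff will be applied against it.
  def release(rev)
    @text.delete(rev)
  end

  def text(rev)
    @text[rev]
  end

  # Content equality can still be checked after the text was released.
  def same_content?(a, b)
    @digest[a] == @digest[b]
  end
end
```

With a store like this, calling `release` on a revision after its child has been processed leaves only a fixed-size digest per revision, so memory no longer grows with revision count times file size.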
The next three changes avoid some unnecessary string and array copies.
This is complemented by applying the diff using a linear scan to avoid lots of small array allocations. This change might be problematic as it introduces the new assumption that the line numbers in a diff are always increasing.
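As a rough illustration of a linear-scan diff application, the sketch below applies an RCS-style diff (`aN M` adds M lines after original line N, `dN M` deletes M lines starting at original line N) in a single pass. The function name `apply_rcs_diff` is hypothetical and this is not the code from the patch; it raises when the line numbers go backwards, which makes the assumption mentioned above explicit.

```ruby
# Hypothetical helper: apply an RCS-style diff to +lines+ in one pass.
# +diff+ is an array of lines; each "aN M" command is followed by M text lines.
def apply_rcs_diff(lines, diff)
  out = []
  pos = 0   # number of original lines already consumed
  i = 0
  while i < diff.length
    cmd = diff[i]
    i += 1
    m = /\A([ad])(\d+) (\d+)/.match(cmd)
    raise ArgumentError, "bad diff command: #{cmd.inspect}" unless m
    start = m[2].to_i
    count = m[3].to_i
    if m[1] == 'd'
      keep_until = start - 1   # original lines before the deleted range
      raise ArgumentError, "line numbers must not decrease" if keep_until < pos
      out.concat(lines[pos...keep_until])
      pos = keep_until + count # skip the deleted lines
    else
      raise ArgumentError, "line numbers must not decrease" if start < pos
      out.concat(lines[pos...start])  # copy up to and including line N
      pos = start
      out.concat(diff[i, count])      # append the added text lines
      i += count
    end
  end
  out.concat(lines[pos..-1] || [])    # copy the untouched tail
  out
end
```

Each original line is copied at most once into a single output array, rather than splicing the array for every command.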