com.sleepycat.je.cleaner
Class UtilizationCalculator

java.lang.Object
  extended by com.sleepycat.je.cleaner.UtilizationCalculator

public class UtilizationCalculator
extends Object

Contains methods for calculating utilization and for selecting files to clean. In most cases the methods in this class are called by FileSelector methods, and synchronization order is always FileSelector then UtilizationCalculator. In some cases methods in this class are called directly by FileProcessor, and such methods must take care not to call FileSelector methods. This class maintains the size correction factor for LNs whose size is not counted, and related info, as explained in detail below. [#18633] (Historical note: Originally in JE 5, the corrected average LN size was used to adjust utilization. This was changed to a correction factor since different log files may have different average LN sizes. [#21106]) LN size correction is necessary because the obsolete size of LNs may be unknown when we delete or update an LN that is not resident in cache. We do not wish to fetch the LN just to obtain the size of the previous version. Therefore we record in the FileSummary that an LN became obsolete (by incrementing obsoleteLNCount) and also that its size was not counted (by not incrementing obsoleteLNSizeCounted and not adding a size to obsoleteLNSize). Later we estimate utilization by assuming that such obsolete LNs have sizes that are roughly the average LN size for obsolete LNs whose size was not counted in the file, as calculated by FileSummary.getObsoleteLNSize and getAverageObsoleteLNSizeNotCounted. This is fairly accurate when the sizes of obsolete and non-obsolete LNs in a file are uniformly distributed. But it can be inaccurate when they are not. If obsolete LNs are generally smaller than non-obsolete LNs, the average will be larger than the true size and estimated utilization will be lower then true utilization; in this case there is danger of over-cleaning and wasted resources. If obsolete LNs are generally larger, the average will be smaller than the true size and estimated utilization will be higher than true utilization; in this case there is danger of unbounded log growth and a "disk space leak". To correct the inaccurate estimate we take advantage of the fact that the true LN sizes and true average LN size can be determined when the FileProcessor reads a log file. We save the true average LN size as the result of a FileProcessor run in the adjustUtilization method. The true average LN size is then used to calculate a correction factor -- a factor that, when multiplied by the estimated average LN size, results in the true average LN size. The correction factor is used when determining the utilization of other log files; this is done by the getBestFile method. To guard against over-cleaning (the first case described above), the FileProcessor will naturally select a file for cleaning because estimated utilization is lower than true utilization. When the file is cleaned, the LN size correction factor will be determined. However, to guard against unbounded log growth (the second case above), this mechanism is not sufficient because estimated utilization is higher than true utilization and a file may not be selected for cleaning. To handle this case, a log file must sometimes be read by the FileProcessor, even though estimated utilization is above the minUtilization threshold, to "probe" the true utilization and determine the true average LN size. Probing is a read-only operation and does not migrate the active LNs in the log file or attempt to delete the file. Probing a log file is extra work and we strive to do it as infrequently as possible. The following algorithm is used. 1) We determine the number of new log files N that have been written since the last time adjustUtilization was called. Recall that adjustUtilization is called when FileProcessor reads a file, for cleaning a file or probing it. N is determined by the shouldPerformProbe method, which is called via FileSelector.selectFileForCorrection by the FileProcessor when the FileProcessor is woken to do cleaning but no file is selected for cleaning. 2) shouldPerformProbe returns true only if N is greater than a threshold value. The threshold has a high value (20 by default) if the minimum number of initial adjustments (files cleaned or probed, 5 by default) has not been reached, and has a lower value (5 by default) after the initial adjustment minimum has been reached. In other words, we do probes more frequently when we haven't collected a lot of input data. 3) If shouldPerformProbe returns true, then selectFileForCorrection in the FileSelector calls the getBestFile method here, passing true for the isProbe argument. In this mode, getBestFile will select a file based on worst case (lowest possible) utilization. The maximum LN size in the file is used, rather than the average LN size, for this worst case calculation. See the FileSummary getMaxObsoleteLNSize method. If no file is selected using worst case utilization, then probing is not needed because total utilization cannot be lower than the configured minUtilization. 4) If FileSelector.selectFileForCorrection returns a non-null file number, then FileProcessor will read the file in calcUtilizationOnly mode. True utilization will be determined and adjustUtilization will be called, but active LNs will not be migrated and the file will not be deleted. To summarize, a file is read in probing mode only if both of the following conditions are satisfied: a) The number of new files written since the last correction, N, is greater than the threshold, which is 5 or 20 depending on the whether the last correction is greater or less than 0.9. b) Using a worst case utilization calculation that uses the maximum obsolete LN size in a file rather than the average, the total log utilization is under the configured minUtilization. To determine the LN size correction factor that is applied when estimating utilization, rather than using the size for a single log file, the sizes for the last several log files are saved and the average value is used. The average over the last 10 adjustments is computed when adjustUtilization is called. If the mix of LN sizes for obsolete and non-obsolete LNs is roughly the same for all log files, and this mix does not change over time, then using a running average would not be necessary and a single size would be sufficient. However, some applications have access patterns that are not uniform over time, and the running average size will help to compensate for this. It is important to avoid premature use of the LN size correction factor in selecting files for cleaning, before enough input data has been collected to make the computed value meaningful. To accomplish this, the correction factor is not computed or used until the average size data for a minimum number of LNs with uncounted sizes (1000 by default) is available. Because it is expensive to compute, the recently computed values are stored persistently as part of the CheckpointEnd log entry. Flushing this info once per checkpoint is sufficient, and no special handling is necessary after a crash; losing the new info since the last checkpoint is acceptable.


Nested Class Summary
(package private) static class UtilizationCalculator.AverageSize
          Bundles a count of LNs and their total size, for use in calculating a running average.
(package private) static class UtilizationCalculator.TestAdjustment
          For passing adjustment information to a test hook.
 
Constructor Summary
UtilizationCalculator(EnvironmentImpl env, Cleaner cleaner)
           
 
Method Summary
(package private)  void adjustUtilization(long fileNum, long endFileNum, FileSummary estimatedFileSummary, FileSummary trueFileSummary)
          Saves the true average LN size for use in calculating utilization.
(package private)  Long getBestFile(SortedMap<Long,FileSummary> fileSummaryMap, boolean forceCleaning, Set<Long> lowUtilizationFiles, boolean isBacklog, boolean isProbe)
          Returns the best file that qualifies for cleaning or probing, or null if no file qualifies.
(package private)  Long getCheapestFileToClean(SortedMap<Long,FileSummary> fileSummaryMap, SortedSet<Long> candidateFiles)
          Returns the cheapest file to clean from the given list of files.
 float getLNSizeCorrectionFactor()
          Returns the factor to be multiplied by the average LN size (for LNs with uncounted sizes) to correct for differences between obsolete and active LN sizes.
 CleanerLogSummary getLogSummary()
          Returns log summary info that should be saved persistently.
(package private)  boolean isCorrectionEstablished()
          Returns whether enough adjustments have been made to conclude that the the LN size correction factor has been established, or at least unnecessary because very few LN sizes are uncounted.
(package private)  void setAdjustmentHook(TestHook<UtilizationCalculator.TestAdjustment> hook)
          See UtilizationCorrectionTest.
 void setLogSummary(CleanerLogSummary logSummary)
          Restores log summary info that was read from persistent storage.
(package private)  void setProtectedFiles()
          Determine which files are protectd from deletion, which influences which files are cleaned.
(package private)  boolean shouldPerformProbe(long endFileNum)
          Returns whether a correction probe should be attempted, if worst case utilization also indicates that cleaning may be needed.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

UtilizationCalculator

UtilizationCalculator(EnvironmentImpl env,
                      Cleaner cleaner)
Method Detail

getLogSummary

public CleanerLogSummary getLogSummary()
Returns log summary info that should be saved persistently.


setLogSummary

public void setLogSummary(CleanerLogSummary logSummary)
Restores log summary info that was read from persistent storage.


setAdjustmentHook

void setAdjustmentHook(TestHook<UtilizationCalculator.TestAdjustment> hook)
See UtilizationCorrectionTest.


setProtectedFiles

void setProtectedFiles()
Determine which files are protectd from deletion, which influences which files are cleaned. This method must not be called while synchronized on FileSelector. RepImpl.getCleanerBarrierStartFile does cursor operations and this would cause a deadlock.


getLNSizeCorrectionFactor

public float getLNSizeCorrectionFactor()
Returns the factor to be multiplied by the average LN size (for LNs with uncounted sizes) to correct for differences between obsolete and active LN sizes. This method is intentionally unsynchronized to provide fast access for stats.


getBestFile

Long getBestFile(SortedMap<Long,FileSummary> fileSummaryMap,
                 boolean forceCleaning,
                 Set<Long> lowUtilizationFiles,
                 boolean isBacklog,
                 boolean isProbe)
Returns the best file that qualifies for cleaning or probing, or null if no file qualifies.

Parameters:
fileSummaryMap - the map containing file summary info.
forceCleaning - is true to always select a file, even if its utilization is above the minimum utilization threshold.
lowUtilizationFiles - is a returned set of files that are below the minimum utilization threshold.
isBacklog - is true if there is currently a backlog, in which case FilesToMigrate won't be used to return a file.
isProbe - is true to use the maximum LN obsolete size to determine file utilization. It should be false when selecting a file cleaning a file normally, to use the average LN size for uncounted sizes along with correction factor. It should be true when selecting a file to calculate utilization without cleaning, to determine the worst case (lowest possible) utilization and to ignore the correction factor.

getCheapestFileToClean

Long getCheapestFileToClean(SortedMap<Long,FileSummary> fileSummaryMap,
                            SortedSet<Long> candidateFiles)
Returns the cheapest file to clean from the given list of files. The cheapest file is considered to be the one with least number of active (non-obsolete) entries, since each active entry requires a Btree lookup and must be logged. This method is used to select the first file to be cleaned in the batch of to-be-cleaned files. If there is no backlog, then this method always returns the first and only file in the candidate set, so the cost algorithm has no impact. Returns null iff the candidate set is empty.


adjustUtilization

void adjustUtilization(long fileNum,
                       long endFileNum,
                       FileSummary estimatedFileSummary,
                       FileSummary trueFileSummary)
Saves the true average LN size for use in calculating utilization. The true FileSummary info is determined by FileProcessor when cleaning or probing a file. The following example shows how the true average LN size (for LNs with their size uncounted) is calculated, given the estimated and the true FileSummary info. Estimated FileSummary --------------------- This calculation is performed by FileSummary getAverageObsoleteLNSizeNotCounted and documented there. estimatedFileSummary.totalLNCount: 1,000 estimatedFileSummary.totalLNSize: 100,000 estimatedFileSummary.obsoleteLNCount: 500 estimatedFileSummary.obsoleteLNSizeCounted: 250 estimatedFileSummary.obsoleteLNSize: 12,500 average counted obsolete LN size: 12,500/250 == 50 remainder LN count: 1,000 - 250 == 750 remainder LN size: 100,000 - 12,500 = 87,500 average uncounted obsolete LN size: 87,500 / 750 == 116.67 estimated obsolete LN size: (116.67 * 250) + 12,500 == 41,667.5 True FileSummary ---------------- This calculation is perform in this method, given the obsoleteLNSize amount in the true FileSummary and the information from the estimated FileSummary above. trueFileSummary.obsoleteLNSize: 60,000 average uncounted obsolete LN size: (60,000 - 12,500) / (500 - 250) == 190 which is: (trueFileSummary.obsoleteLNSize - estimatedFileSummary.obsoleteLNSize) / (estimatedFileSummary.obsoleteLNCount - estimatedFileSummary.obsoleteLNSizeCounted) As a check, if 190 were used in the estimation further above, rather than using 116.67, then the result would be 60,000: estimated obsolete LN size: (190 * 250) + 12,500 == 60,000


isCorrectionEstablished

boolean isCorrectionEstablished()
Returns whether enough adjustments have been made to conclude that the the LN size correction factor has been established, or at least unnecessary because very few LN sizes are uncounted. When false is returned, a backlog of to-be-cleaned files should not be created, because when the correction factor is established we may find no need to clean the files in the backlog. Note that we do not clear the backlog when we establish the size.


shouldPerformProbe

boolean shouldPerformProbe(long endFileNum)
Returns whether a correction probe should be attempted, if worst case utilization also indicates that cleaning may be needed. As a side effect whenever true is returned, we update endFileNumAtLastAdjustment, which is somewhat redundant because adjustUtilization will also update this field after the probe is complete. Updating it here ensures that only a single probe will be performed at the scheduled interval, and that multiple cleaner threads will not do a probe at the same time. This method is synchronized to guarantee that.



Copyright (c) 2004-2012 Oracle. All rights reserved.