Statement : Why does GC take too long to finish?
Solution :
Possible Reasons
- Large repository size: GC duration is proportional to the repository/datastore size
- DataStore GC is being executed for the first time or after a long gap
Identification
- GC writes info-level logs for each phase, so it is easy to identify the phase taking the longest time.
- Empirically, deletion tends to take the most time and can spill over 24 hours if a large number of blobs have to be deleted.
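As an illustration of how the per-phase log entries described above could be used, here is a minimal Python sketch that computes how long each phase ran and picks the slowest one. The log line format ("<timestamp> INFO <phase> started|finished") and the sample data are assumptions made up for this example, not the actual GC log output; the parsing would need to be adapted to the real messages.

```python
from datetime import datetime

# Hypothetical log excerpt; the real GC log format will differ.
SAMPLE_LOG = """\
2024-01-01T01:00:00 INFO mark started
2024-01-01T02:30:00 INFO mark finished
2024-01-01T02:30:00 INFO sweep started
2024-01-02T04:30:00 INFO sweep finished
"""

def phase_durations(log_text):
    """Return {phase: timedelta} for every phase with a start and a finish line."""
    starts = {}
    durations = {}
    for line in log_text.splitlines():
        parts = line.split()
        # Skip anything that does not match the assumed 4-field INFO layout.
        if len(parts) != 4 or parts[1] != "INFO":
            continue
        timestamp = datetime.fromisoformat(parts[0])
        phase, event = parts[2], parts[3]
        if event == "started":
            starts[phase] = timestamp
        elif event == "finished" and phase in starts:
            durations[phase] = timestamp - starts.pop(phase)
    return durations

def slowest_phase(durations):
    """Return the name of the phase with the largest duration."""
    return max(durations, key=durations.get)

if __name__ == "__main__":
    result = phase_durations(SAMPLE_LOG)
    for name, took in result.items():
        print(name, took)
    print("slowest:", slowest_phase(result))
```

In the made-up sample the sweep (deletion) phase dominates, which mirrors the empirical observation above.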
Mitigation
- Run DataStore GC regularly
o Mark phase (collection of the blob references in use) can critically affect general repository performance, so it should be scheduled during off-peak hours.
o Sweep phase should not critically affect system performance, so it is OK to let it continue if it spills over into normal working hours.
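The scheduling rule above can be sketched as a small guard that a job wrapper might consult before starting a phase. This is only an illustration: the function names and the 22:00 to 06:00 off-peak window are assumptions, not part of any GC API, and the window should be set to match the actual load pattern of the system.

```python
from datetime import time

def in_off_peak(now, start=time(22, 0), end=time(6, 0)):
    """Return True if `now` (a datetime.time) lies in the window [start, end)."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end

def may_start_phase(phase, now):
    """Mark is restricted to off-peak hours; sweep may run at any time."""
    if phase == "mark":
        return in_off_peak(now)
    return True

if __name__ == "__main__":
    print(may_start_phase("mark", time(23, 0)))   # inside the assumed window
    print(may_start_phase("mark", time(12, 0)))   # peak hours
    print(may_start_phase("sweep", time(12, 0)))  # sweep is not restricted
```

Only the start of the mark phase is gated here; a sweep that is already running is simply allowed to continue into working hours.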