Size vs Capacity in Java Collections
Collection Size
The size of a collection is the number of elements that are in the collection.
final var list = List.of(1, 2, 3);
This collection contains three elements and has a size of three.
Collection Capacity
Many collections, such as ArrayList, ArrayDeque, and PriorityQueue, store their elements in an internal array. The length of this array is the capacity of the collection. Linked structures such as LinkedList have no backing array and therefore no capacity. A collection's capacity is always greater than or equal to its size. Arrays are fixed in length, but collections generally are not. When the collection outgrows its array, a larger array is created and the elements are copied from the old array into the new one.
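The relationship between size, capacity, and the copy step can be sketched with a stripped-down growable array. This is a minimal illustration, not how any JDK class is actually written: the class name GrowableIntArray is hypothetical, and doubling is an arbitrary growth rule chosen for the sketch.

```java
import java.util.Arrays;

/** A stripped-down growable list of ints; hypothetical, for illustration only. */
final class GrowableIntArray {
    private int[] elements; // backing array; its length is the capacity
    private int size;       // number of elements actually stored

    GrowableIntArray(int initialCapacity) {
        elements = new int[initialCapacity];
    }

    void add(int value) {
        if (size == elements.length) {
            // The backing array is full: allocate a larger one and copy
            // the existing elements across. Doubling is an assumption of
            // this sketch, not a rule every collection follows.
            int newCapacity = Math.max(1, elements.length * 2);
            elements = Arrays.copyOf(elements, newCapacity);
        }
        elements[size++] = value;
    }

    int get(int index) {
        // Slots between size and capacity exist in the array but are not elements.
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return elements[index];
    }

    int size() {
        return size;
    }

    int capacity() {
        return elements.length;
    }
}
```

Starting from a capacity of 2, adding a third element forces a copy into an array of length 4, so the size becomes 3 while the capacity becomes 4.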
Capacity and Performance
When you know how many elements a collection will hold, create the collection with that capacity. The backing array is then allocated once at the right length instead of being reallocated and copied as the collection grows. This saves the time spent copying and avoids allocating intermediate arrays that are immediately discarded. Array-backed collections such as ArrayList, ArrayDeque, and PriorityQueue accept an initial capacity in their constructors.
final List<String> list = new ArrayList<>(50);
This creates an ArrayList with a capacity of 50. Its size is still 0, because no elements have been added yet.
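The difference is easy to confirm in code. The sketch below uses only public ArrayList methods; the class and method names (InitialCapacityDemo and its helpers) are hypothetical. Note that ArrayList does not expose its capacity, so only the size can be observed directly.

```java
import java.util.ArrayList;

final class InitialCapacityDemo {
    /** Size of a freshly presized list: capacity is reserved, but nothing is stored. */
    static int sizeAfterPresizing(int capacity) {
        ArrayList<String> list = new ArrayList<>(capacity);
        return list.size();
    }

    /** Reserved capacity cannot be indexed into; only stored elements can. */
    static boolean getOnEmptyPresizedListThrows() {
        ArrayList<String> list = new ArrayList<>(50);
        try {
            list.get(0);
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    /** Capacity can be adjusted later without affecting the size. */
    static int sizeAfterCapacityAdjustments() {
        ArrayList<String> list = new ArrayList<>(50);
        list.add("a");
        list.add("b");
        list.ensureCapacity(200); // reserve room for at least 200 elements
        list.trimToSize();        // shrink the backing array down to the size
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterPresizing(50));           // prints 0
        System.out.println(getOnEmptyPresizedListThrows());   // prints true
        System.out.println(sizeAfterCapacityAdjustments());   // prints 2
    }
}
```

ensureCapacity() and trimToSize() are declared on ArrayList rather than on the List interface, which is why the variables are typed as ArrayList here.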
Bulk Operations and Performance
When you are adding multiple elements to a collection, use bulk operations. The following example has a collection that contains 100 elements and adds them one by one to a different collection.
final var items = new ArrayList<String>();
for (final var item : collectionOf100Elements) {
    items.add(item);
}
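Each call to add() first checks whether the backing array is full. If it is, the list allocates a larger array and copies the existing elements into it. The growth policy is an implementation detail that the specification does not fix: in OpenJDK, ArrayList grows by about 50% of its current capacity, while some other collections, such as Vector and HashMap, double by default.
An ArrayList created with the no-argument constructor has a default capacity of 10 (OpenJDK allocates that array lazily, on the first add). With the OpenJDK growth rule, the previous example allocates and copies a new array when the 11th, 16th, 23rd, 34th, 50th, and 74th elements are added, stepping through capacities of 15, 22, 33, 49, 73, and 109. After all 100 elements are added, items has a capacity of 109, and six smaller arrays have been allocated and discarded along the way.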
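The sequence of capacities can be worked out with a small simulation. This sketch assumes the growth rule used by OpenJDK's ArrayList, where each resize adds half of the current capacity (at least one slot); other runtimes may grow differently. The class GrowthSimulation and its method are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthSimulation {
    /**
     * Returns every capacity the backing array passes through while
     * elementCount elements are added one at a time, starting from
     * initialCapacity. Assumes growth of capacity + capacity / 2.
     */
    static List<Integer> capacities(int initialCapacity, int elementCount) {
        List<Integer> result = new ArrayList<>();
        int capacity = initialCapacity;
        result.add(capacity);
        for (int size = 0; size < elementCount; size++) {
            if (size == capacity) {
                // Array is full before this add: grow by half, at least by one.
                capacity = Math.max(capacity + 1, capacity + (capacity >> 1));
                result.add(capacity);
            }
            // The element would be stored here; only the capacity matters for the sketch.
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(capacities(10, 100)); // prints [10, 15, 22, 33, 49, 73, 109]
    }
}
```

Under that assumption, adding 100 elements one by one to a list that starts at capacity 10 ends at a capacity of 109 after six resizes.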
items.addAll(collectionOf100Elements);
The bulk addAll() method checks the capacity once and, if the backing array is too small, grows it a single time to fit all the new elements before copying them in. This avoids the intermediate arrays and repeated copying of the one-by-one loop, which saves both time and memory.
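The approaches can be compared side by side. The following is a minimal sketch with hypothetical helper names (BulkAddDemo, copyOneByOne, copyInBulk, copyWithConstructor); it shows that the results are identical and that presizing and bulk copying can be combined. ArrayList also has a constructor that takes a collection and copies it in one step.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class BulkAddDemo {
    /** Adds elements one at a time; the backing array may be regrown several times. */
    static <T> List<T> copyOneByOne(Collection<? extends T> source) {
        List<T> items = new ArrayList<>();
        for (T item : source) {
            items.add(item);
        }
        return items;
    }

    /** Presizes the list, then adds everything with a single bulk call. */
    static <T> List<T> copyInBulk(Collection<? extends T> source) {
        List<T> items = new ArrayList<>(source.size());
        items.addAll(source);
        return items;
    }

    /** Uses the ArrayList(Collection) constructor, which copies in one step. */
    static <T> List<T> copyWithConstructor(Collection<? extends T> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        List<String> source = List.of("a", "b", "c");
        // All three approaches produce equal lists; they differ only in
        // how often the backing array is allocated and copied.
        System.out.println(copyOneByOne(source).equals(copyInBulk(source)));        // prints true
        System.out.println(copyInBulk(source).equals(copyWithConstructor(source))); // prints true
    }
}
```

When the source collection is already available, the constructor form is usually the shortest way to get a correctly sized copy.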
Conclusion
If you know how many elements a collection will hold, specify that number as the initial capacity when creating it. If you don't know, make an educated guess based on the data the collection will be holding and how it is used. Avoiding unnecessary resizes improves performance and reduces the number of short-lived arrays the collection allocates. Also, prefer bulk operations such as addAll() when adding multiple elements to a collection.