Introduction

fastutil extends the Java™ Collections Framework by providing type-specific maps, sets, lists and queues with a small memory footprint and fast access and insertion; provides also big (64-bit) arrays, sets and lists, and fast, practical I/O classes for binary and text files. It is free software distributed under the Apache License 2.0.

The classes implement their standard counterpart interface (e.g., Map for maps) and can be plugged into existing code. Moreover, they provide additional features (such as bidirectional iterators) that are not available in the standard classes.

Besides objects and primitive types, fastutil classes provide support for references, that is, objects that are compared using the equality operator rather than the equals() method.

The sources are generated using a C preprocessor, starting from a set of driver files. You can peek at the javadoc-generated documentation. In particular, the overview explains the design choices used in fastutil.

Core jar

If the standard fastutil jar is too large, there is a core jar containing only data structures specific for integers, longs and doubles. Note that those classes are duplicated in the standard jar, so if you are depending on both (for example, because of transitive dependencies) you should exclude the core jar.

Previous split versions would provide different classes in different jars, but managing sensibly dependencies turned out to be impossible.

Speed

fastutil provides in many cases the fastest implementations available. You can find many other implementations of primitive collections (e.g., HPPC, Koloboke, etc.). Sometimes authors are a little bit quick in defining their implementations the “fastest available“: the truth is, you have to take decisions in any implementation. These decisions make your implementation faster of slower in different scenarios. I suggest to always test speed within your own application, rather than relying on general benchmarks, and ask the authors for suggestions about how to use the libraries in an optimal way. In particular, when testing hash-based data structures you should always set explicitly the load factor, as speed is strongly dependent on the length of collision chains.

Big data structures

With fastutil 6, a new set of classes makes it possible to handle very large collections: in particular, collections whose size exceeds 231. Big arrays are arrays-of-arrays handled by a wealth of static methods that act on them as if they were monodimensional arrays with 64-bit indices, and big lists provide 64-bit list access. The size of a hash big set is limited only by the amount of core memory. The usual methods from java.util.Arrays and similar classes have been extended to big arrays: have a look at the Javadoc documentation of BigArrays and IntBigArrays to get an idea of the generic and type-specific methods available.

History and Motivation

fastutil came up as a necessity during the development of UbiCrawler, as we needed to manage structures with dozens of millions of items very efficiently. The same reasons led to the development of the high-performance classes of dsiutils and MG4J (e.g., MutableString).