Java collection interview questions

What are the basic interfaces of the Java collection framework?

Collection is the root interface at the collection level. A collection represents a set of objects, which are its elements. The Java platform does not provide any direct implementation of this interface.
- Set is a collection that cannot contain duplicate elements. This interface models the mathematical set abstraction and is used to represent the set, like a deck of cards.
- List is an ordered collection that can contain repeating elements. You can access any element through its index. List is more like an array of dynamically transformed lengths.
Map is an object that maps key to value. A map cannot contain duplicate keys, and each key can only map at most one value.
Some other interfaces are Queue, Dequeue, SortedSet, SortedMap and ListIterator.

? Why doesn't Collection inherit from Cloneable and Serializable interfaces?

The Collection interface specifies a group of objects, which are its elements.

How to maintain these elements is determined by the specific implementation of the Collection. For example, some Collection implementations such as List allow duplicate elements, while others like Set do not.
Many Collection implementations have a public clone method. However, it doesn't make sense to put it in all implementations of the set. This is because Collection is an abstract expression, and what is important is implementation.

The semantics and meanings of cloning or serialization only come into play when dealing with specific implementations. So, the specific implementation should determine how to clone or serialize it, or whether it can be clone or serialized. Authorizing cloning and serialization in all implementations ultimately leads to less flexibility and more limitations,A specific implementation should determine whether it can be cloned and serialized。

Why does the Map interface not inherit the Collection interface?

Although the Map interface and its implementation are also part of the collection framework, Map is not a collection, and collection is not a map. Therefore, Map inherits Collection meaningless and vice versa.

If Map inherits the Collection interface, where do the elements go? Map contains key-value pairs, which provide a way to extract key or value list collections ( Collection ), but it is not suitable for the "group of objects" specification.

? Why does the Map interface not inherit the Collection interface?

? What is the difference between Collection and Collections?

Collection is the upper-level interface of the collection class, and the main interfaces inherited and include Set and List.
Collections is a tool class for collection classes. It provides a series of static methods to implement search, sorting, thread-safe operations on various collections.

? What are the general algorithms implemented in the collection framework?

The Java collection framework provides commonly used algorithm implementations, such as sorting and searching.

The Collections class contains these method implementations. Most algorithms operate on List, but some are available for collections of all types. Some algorithms include sorting, searching, mixing, and maximum and minimum values.

? Summary of the underlying data structure of the collection framework

1）List

ArrayList: Object array.
Vector: Object array.
LinkedList: Bidirectional linked list (JDK6 was previously a circular linked list, and JDK7 canceled the loop).

2）Map

HashMap ：
- Before JDK8, HashMap consisted of an array + linked list. The array is the main body of HashMap, and the linked list exists mainly to resolve hash conflicts (the "zipper method" resolves conflicts).
- After JDK8, there are major changes when resolving hash conflicts. When the linked list length is greater than the threshold (default is 8), the linked list is converted into a red and black tree to reduce search time.
LinkedHashMap: LinkedHashMap inherits from HashMap, so its underlying layer is still based on a zippered hash structure, that is, it consists of an array and a linked list or a red and black tree. In addition, LinkedHashMap has added a two-way linked list based on the above structure, so that the above structure can maintain the insertion order of key-value pairs. At the same time, by performing corresponding operations on the linked list, the access sequence related logic is realized. For details, you can view:"LinkedHashMap Source Code Detailed Analysis (JDK1.8)" 。
Hashtable: composed of array + linked list. The array is the main body of HashMap, and linked lists mainly exist to resolve hash conflicts.
TreeMap: Red and black tree (self-balanced sorted binary tree).

3）Set

HashSet: Unordered, unique, based on HashMap, the underlying layer uses HashMap to save elements.
LinkedHashSet: LinkedHashSet inherits from HashSet and is implemented internally through LinkedHashMap. It's a bit similar to what we mentioned earlier. LinkedHashMap is based on HashMap implementation, but there is still a little difference.
TreeSet: Ordered, unique, red and black tree (self-balanced sorted binary tree).

What is an iterator?

The Iterator interface provides many methods to iterate over collection elements. Each collection class contains iterative methods that can return an iterator instance. Iterators can delete elements of the underlying collection during iteration, but they cannot directly call the collection.#remove(Object Obj)Method deletion, can be used by the iterator#remove()Method delete.

? What is the difference between Iterator and ListIterator?

Iterator can be used to traverse Set and List collections, but ListIterator can only be used to traverse List.
Iterator can only traverse the set forward, and ListIterator can be either forward or backward.
ListIterator implements the Iterator interface and contains other functions. For example: add elements, replace elements, get the index of the previous and next elements, etc.

? What is the difference between fast-fast and safe-safe?

The difference is the ConcurrentModification exception:

Fast Failure: When you are iterating over a collection, if another thread is modifying the collection you are accessing, a ConcurrentModification exception will be thrown. existAll the packages fail quickly.
Security failure: When you iterate, you will go to the underlying collection to make a copy, so you will not be affected when modifying the upper collection and will not throw a ConcurrentModification exception. existAll packages failed safely.

? How to delete an element in List?

There are two ways, as follows:

Method 1: Use Iterator, in the order downwards, and if the element is found, use the remove method to remove it.
Method 2: traverse List in reverse order. If an element is found, use the remove method to remove it.

? What is the difference between Enumeration and Iterator interfaces?

Enumeration is twice as fast as Iterator and consumes less memory.
However, Iterator is safer than Enumeration because other threads cannot modify the collection object traversed by the current iterator. At the same time, Iterators allows the caller to remove elements from the underlying collection, and none of these Enumerations can be completed.

For many fat friends, they may not have used the Enumeration class, so you can take a look"Java Enumeration Interface"article.

? Why is there no specific implementation of the Iterator interface?

The Iterator interface defines a method to traverse a collection, but its implementation is the responsibility of the collection implementation class. Each collection class that can return an Iterator used for traversal has its own Iterator implementation inner class.

This allows the collection class to select whether the iterator is fail-fast or fail-safe. For example, the ArrayList iterator is fail-fast, while the CopyOnWriteArrayList iterator is fail-safe.

What is the difference between Comparable and Comparator?

Comparable interface, inUnder the package, it is used for comparison between the current object and other objects, so it has a#compareTo(Object obj)Methods are used to sort, and this method has only one parameter.
Comparator interface, inUnder the package, it is used to compare two objects passed in, so it has one#compare(Object obj1, Object obj2)The method is used to sort, and the method has two parameters.

Details, you can check it out"Java Custom Comparator"Article, the focus is on how to implement Comparable and Comparator yourself.

? What does the return value of the compareTo method mean?

greater than 0 means that the object is greater than the parameter object.
Less than 0 means that the object is smaller than the parameter object
equal to 0 means that the two are equal.

? How to sort the List of Objects?

rightObject[]When sorting the array, we can useArrays#sort(...)method.
rightList<Object>When sorting the array, we can useCollections#sort(...)method.

What are the best practices for Java Collection Framework?

Choosing to use the correct type of collection based on application requirements is very important for performance. For example, if the size of the element is fixed and the priority is known, we will use an Array instead of an ArrayList .
Some collection classes allow us to specify their initial capacity. Therefore, if we know the approximate value of storing data, we can avoid rehashing or size adjustments.
Generics are always used to ensure type safety, reliability and robustness. At the same time, using generics can also avoid ClassCastException exceptions at runtime.
Use the immutable class provided by JDK as a key in Map, which can avoid the hashcode implementation and the equals method of our custom class.
It should be programmed according to the interface rather than the implementation.
Returns a zero-length set or array instead of returning anull, this prevents the underlying collection from being empty.

the difference

What is the difference between List and Set?

List and Set are inherited from the Collection interface.

List Features: Elements have order in which elements can be repeated.
Set Features: There is no order of putting elements into place, elements cannot be repeated, and repeated elements will be overwritten.

Note: Although there is no order of placement of elements, the position of elements in Set is determined by the hashcode of the element, and its position is actually fixed.

In addition, List supportsforLooping, that is, iterators can be used to traverse through subscripts, but Set can only use iteration because it is unordered and cannot use subscripts to get the desired value.

Comparison between Set and List:

Set: Retrieval elements are efficient, deletion and insertion are inefficient, and insertion and deletion will not cause element position changes.
List: Similar to arrays, List can grow dynamically, find elements inefficient, and insert and delete elements inefficient, because it may cause other elements to change their position.

What is the difference between List and Map?

List is a collection of objects that allow objects to be repeated.
Map is a collection of key-value pairs, and key duplication is not allowed.

What is the difference between Array and ArrayList? When is Array more suitable?

Array can hold primitive types and objects, while ArrayList can only hold objects.
Array is of the specified size, while ArrayList is fixed and can be automatically expanded.
Array does not provide as many functions as ArrayList, such as addAll, removeAll, and iterator.

Although ArrayList is obviously a better choice, sometimes Array is more useful, such as the following three situations.

1. If the size of the list is specified, most of the time it is to store and traverse them.
2. For traversing basic data types, although Collections uses autoboxing to ease encoding tasks, working on lists of basic types of specified sizes can also become slow.
3. If you want to use multi-dimensional arrays, use[][]It's more convenient than List.

What is the difference between ArrayList and LinkedList?

? ArrayList

Advantages: ArrayList implements a data structure based on dynamic arrays. Because the address is continuous, once the data is stored, the query operation efficiency will be relatively high (it is placed in memory contiguously).
Disadvantages: Because the addresses are continuous, ArrayList needs to move data, so the insertion and deletion operations are relatively inefficient.

? LinkedList

Advantages: LinkedList is based on the data structure of the linked list, and the address is arbitrary, so you don’t need to wait for a continuous address when opening up memory space. For new and delete operations add and remove, LinedList is more advantageous. LinkedList is suitable for scenes where you want to operate head-to-tail or insert a specified position.
Disadvantages: Because LinkedList needs to move pointers, the query operation performance is relatively low.

? Applicable scenario analysis：

When random access to the data is required, use ArrayList.
When multiple additions, deletion and modifications are required to be made to the data, use LinkedList.

If the capacity is fixed and will only be added to the tail, it will not cause expansion, and ArrayList is preferred.
Of course, in the scenario of the vast number of business, using ArrayList is enough. Mainly, be careful to avoid expansion of ArrayList and indirect insertion.

? How does ArrayList expand?

Watch it directly"Detailed explanation of ArrayList dynamic expansion"The article is very detailed. The main conclusions are as follows:

If constructed through parameterless construction, the initial array capacity is 0, and the capacity is truly allocated when the array is actually added. Every time1.5The ratio of times (bit operation) is expanded by copeOf.
In JKD6, if constructed without parameters, the initial array capacity is 10, and the capacity is the original after each expansion by copeOf.1.5Time.

The focus is on 1.5x expansion, which is different from HashMap's 2x expansion.

How to improve efficiency when an ArrayList collection adds 10,000 pieces of data?

The default initial capacity of ArrayList is 10. When inserting a large amount of data, it needs to be continuously expanded, and expansion is very affecting performance. Therefore, now that 100,000 pieces of data have been clarified, we can set the capacity of ArrayList directly during initialization!

This will improve efficiency ~

What is the difference between ArrayList and Vector?

ArrayList and Vector are both implemented using arrays, and there are three main differences:

1. Vector is multi-threaded and thread-safe. Thread-safe means that multiple threads access the same code will not produce uncertain results, while ArrayList is not. This can be seen from the source code, there are many methods in the Vector classsynchronizedMake modifications, which makes Vector unable to compare with ArrayList in terms of efficiency.

Vector is an old dynamic array, thread-synchronized, very inefficient, and generally not favored.
2. Both use linear continuous space storage elements, but when there is insufficient space, the two classes are added differently.
3. Vector can set the growth factor, but ArrayList cannot.

Applicable scenario analysis:

1. Vector is thread-synchronized, so it is also thread-safe, while ArrayList is thread-free and it is not safe. If thread safety factors are not taken into account, ArrayList is generally more efficient.

In actual scenarios, if you need to access a secure array with multiple threads, use CopyOnWriteArrayList.
2. If the number of elements in the set is greater than the length of the current set array, using data with a relatively large amount of data in the set has certain advantages.

In this case, it is more appropriate to use LinkedList.

What is the difference between HashMap and Hashtable?

Hashtable was created in Java 1.0, and the unified standard naming of collections was agreed upon in Java 2.0 later, and the release of other collection classes at that time constituted a new collection framework.

Hashtable inherits Dictionary, and HashMap inherits the Map interface that appears in Java2.
2. HashMap removes the contains method of Hashtable, but adds containsValue and containsKey methods.
3. HashMap allows empty key values, while Hashtable does not.
[Key points] 4. HashTable is synchronous, while HashMap is asynchronous, which is more efficient than HashTable. Therefore, HashMap is more suitable for single-threaded environments, while HashTable is more suitable for multi-threaded environments.
5. The iterator of HashMap is a fail-fast iterator, and the enumerator iterator of HashTable is not a fail-fast.
6. The default size of the array in HashTable is 11, and the expansion method isold * 2 + 1, the default size of HashMap is 16, and the exponential size is 2 each time.

Generally, HashTable is not recommended now. The main reasons are two points:

First, HashTable is a legacy class, and there are many internal implementations without optimization and redundancy.
Second, even in a multi-threaded environment, there is now a synchronous ConcurrentHashMap replacement, and there is no need to use Hashtable because it is multi-threaded.

? There is obviously only one statement "return count;" in the #size() method of Hashtable, so why do you still need to synchronize?

Only one thread can execute the synchronization method of the fixed class at the same time, but for the asynchronous method of the class, multiple threads can access it at the same time. So, there is a problem. Maybe thread A is adding data when executing the put method of Hashtable, and thread B can call it normally#size()The method reads the number of current elements in the Hashtable, and the value read may not be the latest. It may be that thread A has added the data, but it is not correct.count++, thread B has already readcountThen for thread B,countIt must be inaccurate.

After adding synchronization to the #size() method, it means that thread B calls the #size() method only after thread A calls the put method, which ensures thread safety。

What is the difference between HashSet and HashMap?

Set is a linear structure and the values cannot be repeated. HashSet is a hash implementation of Set. The median value of HashSet cannot be repeated is implemented using the key of HashMap.
Map is a key-value pair map, which can be null keys and null values. HashMap is the hash implementation of Map. The uniqueness of the key is determined by the uniqueness of the key value hashcode, and the value value is the linked list structure.

Because different key values may have the same hashcode, the value value needs to be a linked list structure.

Their common point is the uniqueness of the hash algorithm implementation. None of them can hold basic types, they can only hold objects.

For better performance, Netty implements HashMap with key as a basic type, for exampleIntObjectHashMap 。

What is the difference between HashSet and TreeSet?

HashSet is implemented using a hash table, so its elements are unordered. The duration complexity of the methods included in the add, delete and HashSet isO(1) 。
TreeSet is implemented with a tree structure, so it is ordered. The duration complexity of the methods included in the Add, Remove, and TreeSet areO(logn) 。

? How to decide whether to choose HashMap or TreeMap?

HashMap is the best choice for operations like inserting, deleting, and locating elements in a Map.
However, if you need to traverse an ordered collection of keys, TreeMap is a better choice.

Based on the size of your collection, it may be faster to add elements to the HashMap, and then replace HashMap with TreeMap for an ordered key traversal.

What is the difference between HashMap and ConcurrentHashMap?

ConcurrentHashMap is a thread-safe implementation of HashMap. The main differences are as follows:

1. ConcurrentHashMap segments the entire bucket array (segment), and then uses a lock lock to protect each segment. Compared with Hashtable's syn keyword lock, the granularity is more refined and the concurrency performance is better. HashMap has no lock mechanism and is not thread-safe.

After JDK8, ConcurrentHashMap enabled a completely new way to implement it, using the CAS algorithm.
2. HashMap key-value pairs are allowednull, but ConCurrentHashMap is not allowed.

What are queues and stacks, listing their differences?

Both the stack and the queue are used to pre-store data.

is an interface whose implementation class is in Java concurrent package. Queues allow first-in-first-out (FIFO) to retrieve elements, but this is not always the case. The Deque interface allows elements to be retrieved from both ends.
The stack is similar to a queue, but it allows for last-in-first-out (LIFO) search of elements.
- Stack is a class that extends from Vector, while Queue is an interface.

principle

How does HashMap work?

We know that the two most commonly used structures in Java are arrays and simulated pointers (references). Almost all data structures can be implemented in combination with these two, and the same is true for HashMap. In fact, HashMap is a **"linked list hash"**.

HashMap is based on the principle of hashing.

HashMap illustration

We use#put(key, value)Method to store objects into HashMap, usingget(key)Method gets the object from the HashMap.
When we give#put(key, value)When a method passes keys and values, we first call the keys.#hashCode()Method, the returned hashCode is used to find the bucket location to store the Entry object.

? What happens when the hashCode of two objects is the same?

Because the hashcode is the same, their bucket location is the same and "collision" will occur.

Because HashMap uses a linked list to store objects, this Entry (the object containing key-value pairs) is stored in the linked list.

? What are the importance of hashCode and equals methods?

HashMap uses key object#hashCode()and#equals(Object obj)The method determines the index of the key-value pair. These methods are also used when we try to get values from HashMap.

If these two methods are not implemented correctly, in this case, two different keys may produce the same#hashCode()and#equals(Object obj)Output, HashMap will consider them the same and overwrite them instead of storing them in different places.

Similarly, all collection classes that do not allow storing duplicate data are used#hashCode()and#equals(Object obj)Go find duplicates, so it is very important to implement them correctly.#hashCode()and#equals(Object obj)The implementation of the method should follow the following rules:

if(o2),So() == ()Always fortrueof.
if() == (), does not mean(o2)Will betrue 。

? What is the default capacity of HashMap?

The default capacity is 16 and the load factor is 0.75. That is, when HashMap fills 75% of buskets, the smallest possibility is (16 * 0.75 = 12), generally 2 times the original memory.

? What are the orders of HashMap implementation classes?

LinkedHashMap is based on the order in which elements enter the set or the order in which they are accessed.
TreeMap is based on the inherent order of elements (determined by Comparator or Comparable).

? Can we use any class as the key of the Map?

We can use any class as the key of the Map, however, before using them, we need to consider the following:

1. If the class overrides the equals method, it should also override the hashcode method.
2. All instances of the class need to follow rules related to equals and hashcode.
3. If a class does not use equals, you should not use it in hashcode.

4. The best practice for user-defined key classes is to make it immutable, so that the hashcode value can be cached for better performance. Immutable classes can also ensure that hashcode and equals will not change in the future, which will solve the problem related to mutability.

For example, I have a class MyKey that uses it in a HashMap. The code is as follows:

//The name parameter passed to MyKey is used in equals() and hashCode()
 MyKey key = new MyKey('Pankaj'); //assume hashCode=1234
 (key, 'Value');
 // The following code will change the hashCode() and equals() values of the key
 ('Amit'); //assume new hashCode=7890
 //The following will return null, because HashMap will try to find the key that stores the same index, and the key has been changed, the match fails, and null is returned
 (new MyKey('Pankaj'));

That's why String and Integer are used extensively as keys for HashMap.

? Why is the length of HashMap power to 2?

In order to make HashMap access efficient and try to avoid collisions, that is, try to allocate data evenly, and each linked list/red and black tree length is roughly the same. This implementation is an algorithm to store the data in which linked list/red and black tree.

How should this algorithm be designed? We may first think of adopting%The operation of taking the rest is implemented. But, the point is:

Take the rest (%) If the divisor is a power of 2 in the operation, it is equivalent to the sum of the divisor subtracting one from its divisor (&) Operation (that is,hash % length == hash & (length - 1)The premise is that length is to n power of 2;).
And, use binary bit operations&, relative to%Can improve computing efficiency,

This explains why the length of HashMap is at the power of 2.

How does HashSet work?

HashSet is a Set hashing implementation class built on HashMap. Let's just use the source code, the code is as follows:

// 

private transient HashMap<E,Object> map;

private static final Object PRESENT = new Object();

mapProperty, when we create a HashMap object, it will also create an internalmapObject. All subsequent HashSet operations are actually based on thismapPackage above.
PRESENTStatic properties, allmapThe corresponding values of KEY in the KEY are all it to avoid repeated creation.
OK, take a look at the add method again, the code is as follows:
```
// 

public boolean add(E e) {
    return (e, PRESENT) == null;
}
```
- Is it clear at a glance?

? How to check for duplications in HashSet?

Yang: Just as we saw the implementation principle of HashSet above, we can naturally deduce that HashMap is also how to check duplicates.

The following excerpt from the second edition of "Head First Java":

When you add an object to a HashSet, the HashSet will first calculate the hashcode value of the object to determine the location where the object is added, and will also compare it with the hashcode value of other added objects.

If there is no matching hashcode, HashSet will assume that the object does not appear repeatedly.
However, if an object with the same hashcode value is found, the equals method will be called to check whether the objects with equal hashcode are really the same.
- If both are the same, HashSet will not allow the joining operation to be successful.
- If the two are different, HashSet will make the join operation successful。

[For details, you can view the Java basic series]