Background
In everyday coding we often keep certain objects around, mainly because of the cost of creating them.
For example, thread resources, database connections, or TCP connections: initializing such objects usually takes a long time, and frequently requesting and destroying them consumes a large amount of system resources and causes unnecessary performance loss.
All of these objects share a distinguishing feature: after a lightweight reset they can be recycled and reused over and over again.
At this point we can use a virtual pool to hold these resources, and when one is needed we simply take it from the pool.
Pooling techniques are widely used in Java; common examples are database connection pools and thread pools. This article focuses on connection pooling, and thread pooling will be introduced in a subsequent blog post.
The Common Pooling Package: Commons Pool 2
Let's start with Commons Pool 2, the common pooling package in Java, to get a feel for the general structure of an object pool.
Using this API, we can easily pool objects according to our business requirements.
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-pool2</artifactId>
    <version>2.11.1</version>
</dependency>
GenericObjectPool is the core class of the object pool; a pool can be created quickly by passing in a pool configuration and an object factory.
public GenericObjectPool(
final PooledObjectFactory<T> factory,
final GenericObjectPoolConfig<T> config)
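As a quick illustration, here is a minimal sketch of building a pool for a hypothetical HeavyResource type. The class names, sizes, and the HeavyResourceFactory below are invented for this example; only the Commons Pool 2 API calls themselves are real.

import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;

// A hypothetical resource that is expensive to construct.
class HeavyResource {
    HeavyResource() {
        // imagine a slow handshake or connection setup here
    }
}

// The factory tells the pool how to create a raw object and how to wrap it.
class HeavyResourceFactory extends BasePooledObjectFactory<HeavyResource> {
    @Override
    public HeavyResource create() {
        return new HeavyResource();
    }

    @Override
    public PooledObject<HeavyResource> wrap(HeavyResource resource) {
        return new DefaultPooledObject<>(resource);
    }
}

public class PoolQuickStart {
    public static void main(String[] args) throws Exception {
        GenericObjectPoolConfig<HeavyResource> config = new GenericObjectPoolConfig<>();
        config.setMaxTotal(20);   // illustrative sizes only
        config.setMaxIdle(10);
        config.setMinIdle(2);

        GenericObjectPool<HeavyResource> pool =
                new GenericObjectPool<>(new HeavyResourceFactory(), config);

        HeavyResource resource = pool.borrowObject();   // take one from the pool
        try {
            // ... use the resource ...
        } finally {
            pool.returnObject(resource);                // always give it back
        }
        pool.close();
    }
}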
Example
Jedis, a popular Redis client, uses Commons Pool to manage its connection pool, which is arguably a best practice. The code below is the main block in which Jedis creates an object through the factory.
The main method of the object factory class is makeObject, whose return value is of type PooledObject; a plain object can simply be wrapped with new DefaultPooledObject<>(obj) and returned. This is how Jedis uses the factory to create its connection objects:
@Override
public PooledObject<Jedis> makeObject() throws Exception {
    Jedis jedis = null;
    try {
        jedis = new Jedis(jedisSocketFactory, clientConfig);
        // the major time-consuming operation
        jedis.connect();
        // return the wrapped object
        return new DefaultPooledObject<>(jedis);
    } catch (JedisException je) {
        if (jedis != null) {
            try {
                jedis.quit();
            } catch (RuntimeException e) {
                logger.warn("Error while QUIT", e);
            }
            try {
                jedis.close();
            } catch (RuntimeException e) {
                logger.warn("Error while close", e);
            }
        }
        throw je;
    }
}
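Besides makeObject, a PooledObjectFactory also has destroyObject, validateObject, activateObject, and passivateObject hooks. As an illustration only (a sketch in the style of a Jedis factory, not Jedis's actual source), validation and destruction might look roughly like this:

// Illustrative sketch, assuming the same Jedis-based factory as above.
@Override
public boolean validateObject(PooledObject<Jedis> pooledJedis) {
    final Jedis jedis = pooledJedis.getObject();
    try {
        // a cheap health check: the connection must still answer PING
        return jedis.isConnected() && "PONG".equals(jedis.ping());
    } catch (RuntimeException e) {
        return false;
    }
}

@Override
public void destroyObject(PooledObject<Jedis> pooledJedis) {
    final Jedis jedis = pooledJedis.getObject();
    if (jedis.isConnected()) {
        jedis.close();   // release the underlying socket
    }
}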
Now let's walk through the object acquisition process, shown below. When an object is requested, the pool first tries to take one from its idle objects; if none is free, it uses the factory method to create a new one.
public T borrowObject(final Duration borrowMaxWaitDuration) throws Exception {
    // a number of lines omitted here
    while (p == null) {
        create = false;
        // first try to get one from the idle queue
        p = idleObjects.pollFirst();
        // if nothing is available, call the factory to create a new instance
        if (p == null) {
            p = create();
            if (p != null) {
                create = true;
            }
        }
        // a number of lines omitted here
    }
    // a number of lines omitted here
}
So where do the pooled objects live? That storage duty falls to a structure called LinkedBlockingDeque, a double-ended blocking queue.
Next, let's look at the main properties of GenericObjectPoolConfig:
// Properties of GenericObjectPoolConfig itself
private int maxTotal = DEFAULT_MAX_TOTAL;
private int maxIdle = DEFAULT_MAX_IDLE;
private int minIdle = DEFAULT_MIN_IDLE;
// Properties of its parent class BaseObjectPoolConfig
private boolean lifo = DEFAULT_LIFO;
private boolean fairness = DEFAULT_FAIRNESS;
private long maxWaitMillis = DEFAULT_MAX_WAIT_MILLIS;
private long minEvictableIdleTimeMillis = DEFAULT_MIN_EVICTABLE_IDLE_TIME_MILLIS;
private long evictorShutdownTimeoutMillis = DEFAULT_EVICTOR_SHUTDOWN_TIMEOUT_MILLIS;
private long softMinEvictableIdleTimeMillis = DEFAULT_SOFT_MIN_EVICTABLE_IDLE_TIME_MILLIS;
private int numTestsPerEvictionRun = DEFAULT_NUM_TESTS_PER_EVICTION_RUN;
private EvictionPolicy<T> evictionPolicy = null; // Only 2.6.0 applications set this
private String evictionPolicyClassName = DEFAULT_EVICTION_POLICY_CLASS_NAME;
private boolean testOnCreate = DEFAULT_TEST_ON_CREATE;
private boolean testOnBorrow = DEFAULT_TEST_ON_BORROW;
private boolean testOnReturn = DEFAULT_TEST_ON_RETURN;
private boolean testWhileIdle = DEFAULT_TEST_WHILE_IDLE;
private long timeBetweenEvictionRunsMillis = DEFAULT_TIME_BETWEEN_EVICTION_RUNS_MILLIS;
private boolean blockWhenExhausted = DEFAULT_BLOCK_WHEN_EXHAUSTED;
There are quite a few parameters. To get a sense of what they mean, let's first look at the life cycle of a pooled object inside the pool.
As shown in the figure below, two kinds of threads operate on the pool: business threads and a detection (eviction) thread.
An object pool is initialized with three main parameters:

- maxTotal: the maximum number of objects managed by the pool
- maxIdle: the maximum number of idle objects
- minIdle: the minimum number of idle objects
maxTotal concerns the business threads: when a business thread wants to fetch an object, it first checks whether a free object exists.
If one exists, it is returned; otherwise the pool enters its creation logic. At that point, if the number of objects in the pool has already reached maxTotal, creation fails and a null object is returned.
A very important parameter during acquisition is maxWaitMillis, which has a large impact on the caller's performance. It defaults to -1, meaning the caller never times out and waits until an object becomes free.
As shown in the figure below, if object creation is very slow or the pool is very busy, business threads will keep blocking (blockWhenExhausted defaults to true), and normal service requests will start to fail.
Interview Question
Interviewers often ask: what would you set this timeout parameter to? I usually set the maximum wait time to the maximum latency the interface can tolerate.
For example, if a normal request returns in about 10 ms and users start to feel lag at around 1 second, then setting this parameter to 500~1000 ms is fine.
After the timeout, a NoSuchElementException is thrown, the request fails fast and does not affect other business threads. This fail-fast idea is used very widely on the Internet.
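Continuing the hypothetical HeavyResource sketch from earlier, the fail-fast style looks roughly like this (the 800 ms value is illustrative, and the snippet assumes it runs inside a method declared throws Exception):

// Bound the wait instead of blocking forever (the default maxWait is -1).
GenericObjectPoolConfig<HeavyResource> config = new GenericObjectPoolConfig<>();
config.setMaxWaitMillis(800);
GenericObjectPool<HeavyResource> pool =
        new GenericObjectPool<>(new HeavyResourceFactory(), config);

HeavyResource resource;
try {
    resource = pool.borrowObject();   // throws NoSuchElementException once the wait times out
} catch (NoSuchElementException e) {
    // pool exhausted: fail fast, e.g. return an error to the caller right away
    throw new IllegalStateException("resource pool exhausted", e);
}
try {
    // ... use the resource ...
} finally {
    pool.returnObject(resource);
}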
Parameters containing the word evict mainly deal with object eviction. Besides being expensive to initialize and destroy, pooled objects also occupy system resources while they exist.
For example, a connection pool holds multiple connections and a thread pool adds scheduling overhead. Under bursty traffic, the business will request more objects than in the normal case and put them into the pool; when these objects are no longer used, we need to clean them up.
Idle objects that exceed the time specified by minEvictableIdleTimeMillis are forcibly reclaimed; this value defaults to 30 minutes. softMinEvictableIdleTimeMillis is similar, but it only evicts objects while more than minIdle idle objects remain, so the former acts somewhat more aggressively.
There are also four test parameters: testOnCreate, testOnBorrow, testOnReturn, and testWhileIdle, which specify whether a pooled object is validated when it is created, borrowed, returned, or sitting idle, respectively.
Turning these tests on guarantees the validity of the resources, but it costs performance, so they all default to false.
In production it is recommended to set only testWhileIdle to true, and to balance resource availability against efficiency by tuning timeBetweenEvictionRunsMillis, for example to 1 minute. A sketch of this setup follows below.
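The sketch below shows that recommendation on the config object; the values are illustrative, and testWhileIdle only helps if the factory's validateObject performs a real check, as in the Jedis-style factory sketch earlier.

// Validate only in the background evictor thread, not on every borrow or return.
GenericObjectPoolConfig<HeavyResource> config = new GenericObjectPoolConfig<>();
config.setTestOnBorrow(false);                       // the default, shown for clarity
config.setTestOnReturn(false);
config.setTestWhileIdle(true);                       // check idle objects in the background
config.setTimeBetweenEvictionRunsMillis(60_000);     // run the evictor once a minute
config.setMinEvictableIdleTimeMillis(30 * 60_000);   // the default: evict after 30 minutes idle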
JMH Testing
How big is the performance difference between using a connection pool and not using one?
Here is a simple JMH test example (see the repository) that performs a simple set operation, writing a random value to a Redis key.
@Fork(2)
@State(Scope.Benchmark)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 5, time = 1)
@BenchmarkMode(Mode.Throughput)
public class JedisPoolVSJedisBenchmark {
    JedisPool pool = new JedisPool("localhost", 6379);

    @Benchmark
    public void testPool() {
        Jedis jedis = pool.getResource();
        jedis.set("a", UUID.randomUUID().toString());
        jedis.close();
    }

    @Benchmark
    public void testJedis() {
        Jedis jedis = new Jedis("localhost", 6379);
        jedis.set("a", UUID.randomUUID().toString());
        jedis.close();
    }
    // a few lines omitted here
}
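The omitted lines are typically a main method that launches the benchmark through the JMH runner. A minimal sketch, not necessarily the author's exact code, using org.openjdk.jmh.runner.Runner and OptionsBuilder, might look like this:

public static void main(String[] args) throws RunnerException {
    Options opts = new OptionsBuilder()
            .include(JedisPoolVSJedisBenchmark.class.getSimpleName())
            .build();
    new Runner(opts).run();   // runs both benchmarks and prints throughput numbers
}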
The test results were graphed with meta-chart and are shown in the figure below; you can see that the throughput of the pooled approach is about 5 times that of the non-pooled approach!
Database Connection Pooling: HikariCP
HikariCP takes its name from the Japanese word "Hikari", meaning light, implying that the software runs as fast as light. It is the default database connection pool in Spring Boot.
Databases are components we use constantly in our work, and there are many connection pools designed for database clients. Their design principles are basically the same as described at the beginning of this article: they effectively reduce the resource cost of creating and destroying database connections.
Even among connection pools, performance differs. The figure below is an official HikariCP benchmark chart, where you can see its excellent performance; the official JMH test code is on GitHub.
A common interview question is: why is HikariCP so fast?
There are three main reasons:

- It uses FastList instead of ArrayList, which avoids the overhead of range checking when accessing elements.
- It optimizes and streamlines bytecode, using Javassist to reduce the cost of dynamic proxies, for example by using the invokestatic instruction instead of invokevirtual.
- It implements the lock-free ConcurrentBag collection, reducing lock contention in concurrent scenarios.

Some of HikariCP's performance optimizations are well worth studying, and we will analyze several of them in detail in subsequent blog posts.
A database connection pool also faces the question of a maximum size (maximumPoolSize) and a minimum size (minimumIdle). Here is another very frequent interview question: how big do you normally set the connection pool?
Many people believe the connection pool should be as large as possible, and some even set this value to more than 1,000, which is a misconception.
As a rule of thumb, 20~50 database connections are usually enough. The exact size should be tuned to the business, but an oversized pool is definitely not appropriate.
HikariCP recommends not setting minimumIdle; by default it is set to the same value as maximumPoolSize. If the database server has plenty of spare connection resources, you can simply drop the pool's dynamic resizing this way and run a fixed-size pool.
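A minimal HikariCP configuration along those lines might look like the sketch below; the JDBC URL, credentials, and class name are placeholders.

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class HikariQuickStart {
    public static HikariDataSource buildDataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost:3306/demo");   // placeholder URL
        config.setUsername("demo");                              // placeholder credentials
        config.setPassword("demo");
        config.setMaximumPoolSize(20);   // 20~50 connections are usually plenty
        // minimumIdle is deliberately not set, so it defaults to maximumPoolSize
        return new HikariDataSource(config);
    }
}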
In addition, depending on the query and transaction characteristics, an application can configure multiple database connection pools. This optimization technique is not widely known, so it is briefly described here.
There are usually two types of business: one needs a fast response so that data is returned to the user as quickly as possible; the other can run slowly in the background, takes longer, and is not time-sensitive.
If these two types of business share one connection pool, they can easily contend for resources, which in turn hurts interface response times.
Microservices could address this situation, but most services are not split that way, and this is where splitting the connection pool helps.
As shown in the figure, within the same service we split off two connection pools, based on the business attributes, to handle this situation; a sketch of the split follows below.
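Building on the HikariCP helper above, the split might look like this sketch; the pool names, sizes, timeouts, and URL are invented for illustration.

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class SplitPools {
    static HikariDataSource build(String poolName, int maxSize, long connTimeoutMs) {
        HikariConfig config = new HikariConfig();
        config.setPoolName(poolName);
        config.setJdbcUrl("jdbc:mysql://localhost:3306/demo");   // placeholder URL
        config.setMaximumPoolSize(maxSize);
        config.setConnectionTimeout(connTimeoutMs);
        return new HikariDataSource(config);
    }

    // A small, fail-fast pool for latency-sensitive, user-facing queries...
    static final HikariDataSource FAST_POOL = build("fast", 20, 800);
    // ...and a separate pool for slow background jobs and reports.
    static final HikariDataSource SLOW_POOL = build("slow", 5, 30_000);
}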
HikariCP also highlights another point: with the JDBC4 protocol, the validity of a connection can be checked via Connection.isValid().
This way, we don't have to set a bunch of test parameters, and HikariCP doesn't even provide them.
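For reference, the JDBC4 check looks roughly like the sketch below, assuming a dataSource such as the one built earlier and a method that declares throws SQLException.

try (Connection connection = dataSource.getConnection()) {
    // isValid() pings the database using a driver-specific mechanism
    boolean alive = connection.isValid(1);   // wait at most 1 second for the check
    System.out.println("connection valid: " + alive);
}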
Result Cache Pool
At this point you may have noticed many similarities between a pool and a cache.
One thing they have in common is that objects are processed and then kept in a relatively high-speed area. I habitually think of cached items as data objects and pooled items as execution objects. Cached data has a hit-rate problem, while pooled objects are generally interchangeable with one another.
Consider the following scenarios: JSP provides dynamic functionality for web pages and can be compiled into a class file after execution to speed things up; or some media platforms periodically render popular articles into static HTML pages, so that Nginx load balancing alone can cope with highly concurrent requests (dynamic/static separation).
In such cases it is hard to say whether the optimization is caching or object pooling; in essence, both simply save the result of some execution step so that the next access does not have to start from scratch.
I usually call this technique a result cache pool, a combination of the two optimization approaches.
Wrap-up
Let me briefly summarize the focus of this article: we started from Commons Pool 2, the most common pooling package in Java, introduced some of its implementation details, and explained how to apply its important parameters.
Jedis builds its connection pool on top of Commons Pool 2, and through JMH testing we saw a nearly 5x performance improvement from object pooling.
Next we introduced HikariCP, a very fast database connection pool that starts from pooling techniques and squeezes out further performance through coding tricks. HikariCP is one of the libraries I pay close attention to, and I recommend adding it to your own study list.
Overall, consider using pooling to improve system performance when you encounter the following scenarios:

- Creating or destroying the object consumes a lot of system resources
- Creating or destroying the object is time-consuming, requiring tedious operations and long waits
- Once created, the object can be reused over and over again after a simple state reset
Pooling the objects is only the first step of the optimization. To achieve the best performance, the pool's key parameters have to be tuned: a reasonable pool size combined with a reasonable timeout makes the pool far more valuable. And, similar to cache hit rates, monitoring the pool is also very important.
In the figure below, the number of database connections stays high for a long time without being released while the number of waiting threads surges sharply; this helps us quickly locate a database transaction problem.
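As a sketch of that kind of monitoring with HikariCP, the pool's MXBean can be sampled periodically; the dataSource here is assumed to be the HikariDataSource built earlier, and in practice the numbers would go to a metrics system rather than standard output.

import com.zaxxer.hikari.HikariPoolMXBean;

// A persistently high active count together with a growing number of threads
// awaiting connections usually points at slow queries or long transactions.
HikariPoolMXBean poolStats = dataSource.getHikariPoolMXBean();
System.out.printf("active=%d idle=%d total=%d waiting=%d%n",
        poolStats.getActiveConnections(),
        poolStats.getIdleConnections(),
        poolStats.getTotalConnections(),
        poolStats.getThreadsAwaitingConnection());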
There are many similar scenarios in everyday coding. Take HTTP connection pooling: both OkHttp and HttpClient provide connection pools, which you can analyze by analogy; the focus is again on pool size and timeouts.
In lower-level middleware such as RPC frameworks, connection pooling is also commonly used to speed up resource acquisition, for example Dubbo's connection pool, or Feign switched to an httpclient implementation.
You will find that pooling designs look alike at different resource levels. Thread pools, for example, add a second buffering layer for tasks via queues and offer a variety of rejection policies; we will cover thread pools in a subsequent article.
These characteristics of thread pools can likewise be borrowed into connection pooling technology, to buffer request overflow and build your own overflow strategies.
In practice we do exactly that. So how exactly is it done, and what are the common practices? That part is left for you to ponder.
<?xml version="1.0" encoding="UTF-8"?> <project xmlns="/POM/4.0.0" xmlns:xsi="http:///2001/XMLSchema-instance" xsi:schemaLocation="/POM/4.0.0 /xsd/maven-4.0."> <modelVersion>4.0.0</modelVersion> <parent> <groupId></groupId> <artifactId>spring-boot-starter-parent</artifactId> <version>2.3.</version> <relativePath/> <!-- lookup parent from repository --> </parent> <groupId></groupId> <artifactId>qy151-springboot-vue</artifactId> <version>0.0.1-SNAPSHOT</version> <name>qy151-springboot-vue</name> <description>Demo project for Spring Boot</description> <properties> <>1.8</> </properties> <dependencies> <dependency> <groupId></groupId> <artifactId>mybatis-plus-generator</artifactId> <version>3.5.2</version> </dependency> <dependency> <groupId></groupId> <artifactId>freemarker</artifactId> <version>2.3.31</version> </dependency> <dependency> <groupId></groupId> <artifactId>mybatis-plus-boot-starter</artifactId> <version>3.5.2</version> </dependency> <dependency> <groupId></groupId> <artifactId>druid-spring-boot-starter</artifactId> <version>1.2.8</version> </dependency> <dependency> <groupId></groupId> <artifactId>swagger-bootstrap-ui</artifactId> <version>1.9.6</version> </dependency> <dependency> <groupId>com.spring4all</groupId> <artifactId>swagger-spring-boot-starter</artifactId> <version>1.9.</version> </dependency> <dependency> <groupId></groupId> <artifactId>spring-boot-starter-web</artifactId> </dependency> <dependency> <groupId></groupId> <artifactId>mybatis-spring-boot-starter</artifactId> <version>2.2.2</version> </dependency> <dependency> <groupId>mysql</groupId> <artifactId>mysql-connector-java</artifactId> <scope>runtime</scope> </dependency> <dependency> <groupId></groupId> <artifactId>lombok</artifactId> <optional>true</optional> </dependency> <dependency> <groupId></groupId> <artifactId>spring-boot-starter-test</artifactId> <scope>test</scope> </dependency> </dependencies> <build> <plugins> <plugin> <groupId></groupId> <artifactId>spring-boot-maven-plugin</artifactId> <configuration> <excludes> <exclude> <groupId></groupId> <artifactId>lombok</artifactId> </exclude> </excludes> </configuration> </plugin> </plugins> </build> </project>