
It took seven days to summarize the strongest MySQL optimization in history, and from then on, optimize So Easy!

This article is reproduced from: It took seven days to summarize the strongest MySQL optimization in history, and from then on, optimize So Easy!


1. Overview

1. Why optimize

  • An application's throughput bottleneck often lies in the database's processing speed
  • As the application is used, its data gradually grows and the pressure on the database increases
  • Relational databases store data on disk, so reads and writes are slow compared with data in memory

2. How to optimize

  • At the table and field design stage, plan for better storage and computation
  • Use the optimization features the database itself provides, such as indexes
  • Scale out: master-slave replication, read-write separation, load balancing, and high availability
  • Optimize typical SQL statements (limited effect)

2. Field design

1. Typical solutions

①. When precision is required

  • decimal (an exact fixed-point type)
  • or convert decimals to integers (e.g. store cents instead of yuan)

②. Try to use integers to represent strings (e.g. IP addresses)

  • inet_aton('ip')
  • inet_ntoa(num)
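
A minimal sketch of storing IPv4 addresses as unsigned integers (the table and column names are made up for illustration):

```sql
-- Hypothetical table: an IPv4 address fits in an unsigned 32-bit integer
CREATE TABLE access_log (
  id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  ip INT UNSIGNED NOT NULL
);

INSERT INTO access_log (ip) VALUES (INET_ATON('192.168.1.10'));
SELECT INET_NTOA(ip) FROM access_log;  -- converts back to '192.168.1.10'
```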

③. Use NOT NULL as much as possible

  • The calculation logic around NULL values is relatively complicated

④. Choosing between fixed length and variable length

  • Longer numeric data can use decimal
  • char is fixed length (content exceeding the declared length is truncated); varchar is variable length and stores a length prefix that occupies data space, while text keeps long content out of the row

⑤. Do not define too many fields. Comment every field. Make field names self-explanatory. Reserve fields for future expansion.

2. Normal forms

①. First normal form: field atomicity (relational databases have the concept of columns, so this is satisfied by default)

②. Second normal form: eliminate partial dependence on the primary key (possible when the primary key is composite); use a business-independent field as the primary key

③. Third normal form: eliminate transitive dependence on the primary key; aim for high cohesion. For example, a product table can be split into two tables: a product summary table and a product details table.
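
A hypothetical sketch of that split (names and columns are invented for illustration):

```sql
-- Summary table: small, frequently queried fields
CREATE TABLE product_brief (
  product_id INT UNSIGNED PRIMARY KEY,
  name VARCHAR(64) NOT NULL,
  price DECIMAL(10,2) NOT NULL
);

-- Details table: rows correspond one-to-one with product_brief
CREATE TABLE product_detail (
  product_id INT UNSIGNED PRIMARY KEY,
  description TEXT,
  FOREIGN KEY (product_id) REFERENCES product_brief (product_id)
);
```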

3. Selection of storage engines (MyISAM and InnoDB)

1. Functional Differences

InnoDB supports transactions, row-level locking, and foreign keys

2. Storage Differences

①. Storage layout: MyISAM stores data and indexes separately (.MYD for data, .MYI for indexes), while InnoDB stores them together (.frm holds the table structure; data and indexes share the tablespace file)

②. Table mobility: a MyISAM table can be moved by copying its .MYI and .MYD files, while InnoDB has additional associated files

③. Fragmentation: when MyISAM deletes data it leaves fragmented space (which still occupies table file space) and needs periodic manual cleanup via optimize table table_name; InnoDB does not.

④. Ordered storage: InnoDB inserts data in primary-key order, so table data is ordered by primary key by default (this costs write time, because the insertion point must be found in the B+ tree, but gives high search efficiency)

3. How to choose

①. Read-heavy, write-light workloads suit MyISAM: news and blog websites

②. Read-heavy, write-heavy workloads suit InnoDB:

  • Supports transactions and foreign keys, guaranteeing data consistency and integrity
  • Stronger concurrency (row-level locking)

4. Index

1. What is an index

Keywords extracted from the data, with a mapping back to the corresponding rows

2. Type

①. Primary key index primary key: The keyword is required to be unique and not null

②. Normal index (key): a composite index is ordered only by its first field

③. Unique index unique key: requires unique keywords

④. Full text index fulltext key (Chinese is not supported)

3. Index management syntax

①. View the index

  • show create table student
  • desc student

②. Create an index

  • Specify at creation time, e.g. first_name varchar(16), last_name varchar(16), key name(first_name, last_name)
  • Change the table structure: alter table student add key/unique key/primary key/fulltext key key_name(first_name, last_name)

③. Delete the index

  • alter table student drop key key_name
  • To drop a primary key index whose column is auto-increment, first use alter ... modify to remove the auto-increment, then drop the key.
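
The syntax above, collected into a runnable sketch (table and index names are made up):

```sql
CREATE TABLE student (
  id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  first_name VARCHAR(16),
  last_name VARCHAR(16),
  KEY name_idx (first_name, last_name)
);

SHOW CREATE TABLE student;                                           -- view indexes
ALTER TABLE student ADD UNIQUE KEY uk_name (first_name, last_name);  -- add an index
ALTER TABLE student DROP KEY name_idx;                               -- drop an index

-- Dropping an auto-increment primary key: remove auto_increment first
ALTER TABLE student MODIFY id INT UNSIGNED NOT NULL;
ALTER TABLE student DROP PRIMARY KEY;
```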

4. Execution plan: explain

Analyzes whether a SQL statement uses an index, and which index it uses
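
A minimal example, run against the hypothetical student table from the sketch above:

```sql
EXPLAIN SELECT * FROM student WHERE first_name = 'Alice';
-- The key column shows which index was chosen (e.g. name_idx);
-- type = ALL would indicate a full table scan.
```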

5. Index usage scenarios

  • where: if the condition fields are indexed, and all queried fields are in the index, the index covers the query
  • order by: if the sort field is indexed, the already-ordered index entries can be used to fetch the rows directly, which is far more efficient than re-sorting everything the query found
  • join: if the field in the join on condition is indexed, the lookup becomes efficient
  • Covering index: read the result directly from the index without touching the row data

6. Syntax details

Even when an index exists, some query shapes prevent it from being used (see the sketch after this list).

  • where id+1 = ? should be rewritten as where id = ?-1, i.e. make sure the indexed field appears on its own
  • In like patterns, do not put a wildcard before the keyword: '%keyword' will not use the index, while 'keyword%' will
  • or uses an index only when the condition fields on both sides are indexed; if either side is not, the whole table is scanned
  • Status values: for a field like gender, one keyword maps to many rows, and using the index is considered less efficient than a full table scan
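
A hedged sketch of these cases (tables and values are hypothetical):

```sql
-- Expression on the column: index not usable
SELECT * FROM orders WHERE id + 1 = 100;
-- Column appears on its own: index usable
SELECT * FROM orders WHERE id = 100 - 1;

-- Leading wildcard: full table scan
SELECT * FROM article WHERE title LIKE '%mysql';
-- Prefix match: index usable
SELECT * FROM article WHERE title LIKE 'mysql%';
```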

7. The storage structure of the index

  • btree: a multi-way search tree. Keywords within a node are kept sorted, with child pointers between keywords; search cost is about log_nodeSize(N), where nodeSize is the number of keywords per node (which depends on keyword length and node size)

  • b+ tree: an upgrade of btree in which the data is stored together with the keywords, saving the extra lookup from keyword to data storage location

5. Query cache

1. Caches select query results; the key is the SQL statement and the value is the query result

Even if two statements are functionally identical, extra spaces or other slight changes make the keys differ, so the cache misses

2. Enabling the cache

query_cache_type
  • 0 - off
  • 1 - on; every select is cached by default; to skip the cache for a specific statement: select sql_no_cache
  • 2 - on; nothing is cached by default; use select sql_cache to choose what to cache
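
A sketch assuming MySQL 5.6/5.7 (the query cache was removed entirely in MySQL 8.0); query_cache_type is normally set in my.cnf:

```sql
SHOW VARIABLES LIKE 'query_cache%';               -- check the current settings

SELECT SQL_NO_CACHE * FROM article WHERE id = 1;  -- bypass the cache (type = 1)
SELECT SQL_CACHE * FROM article WHERE id = 1;     -- cache on demand (type = 2)
```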

3. Setting the cache size

query_cache_size

4. Resetting the cache

reset query cache

5. Cache invalidation

Any change to a data table invalidates every cache entry built on that table (the cache is managed at table level)

6. Partition

1. By default a table corresponds to one set of storage files, but when the data volume is large (usually tens of millions of rows) the data needs to be split across multiple storage files to keep single-file processing efficient.

2. partition by partition_function(partition_field) (partition logic); see the sketch after this list

  • hash - the partition field is an integer
  • key - the partition field is a string
  • range - based on comparison; only "less than" is supported
  • list - based on a set of state values
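
A hypothetical range-partitioned table, one partition per year (note that in MySQL the partition column must be part of every unique key):

```sql
CREATE TABLE orders (
  id BIGINT UNSIGNED NOT NULL,
  created DATE NOT NULL,
  PRIMARY KEY (id, created)  -- partition column included in the primary key
)
PARTITION BY RANGE (YEAR(created)) (
  PARTITION p2022 VALUES LESS THAN (2023),
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```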

3. Partition management

  • Partition at creation time: create table article (...) partition by key(title) partitions 10
  • Modify the table structure: alter table article add partition (partition logic)

4. The partition field should be a commonly searched field; otherwise partitioning brings little benefit.

7. Horizontal and vertical segmentation

1. Horizontal

Multiple tables with the same structure store the same type of data

A separate single table can be used to guarantee id uniqueness across the split tables

2. Vertical

Split the fields into multiple tables whose records correspond one-to-one

8. Cluster

1. Master-slave replication

①. First, manually synchronize the slave with the master (see the sketch after these steps)

  • stop slave
  • Export the master's data and execute it once on the slave
  • show master status with read lock; record File and Position
  • On the slave: change master to ...

②. start slave, then check Slave_IO_Running and Slave_SQL_Running; both must be YES

③. The master can read and write, but the slave should only read; otherwise master-slave replication breaks and must be manually re-synchronized.

④. mysqlreplicate can configure master-slave replication quickly
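
A sketch of the manual steps above (hosts, credentials, and binlog coordinates are placeholders):

```sql
-- On the slave:
STOP SLAVE;

-- On the master: lock writes, record the binlog position, dump the data
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;  -- record File (e.g. mysql-bin.000001) and Position (e.g. 154)
-- export with mysqldump, import on the slave, then release the lock: UNLOCK TABLES;

-- On the slave: point at the recorded coordinates
CHANGE MASTER TO
  MASTER_HOST='192.168.0.1', MASTER_USER='repl', MASTER_PASSWORD='repl_pass',
  MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=154;
START SLAVE;
SHOW SLAVE STATUS\G  -- Slave_IO_Running and Slave_SQL_Running must both be YES
```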

2. Read and write separation (based on master-slave replication)

①. Using raw JDBC connections directly

WriteDatabase provides write connections

ReadDatabase provides read connections

②. Dynamic data source switching with Spring AOP and AspectJ

  • RoutingDataSourceImpl extends AbstractRoutingDataSource and overrides determineCurrentLookupKey; inject it into SqlSessionFactory, configuring defaultTargetDataSource and targetDataSources (the concrete data source is selected by value-ref based on the key the override returns)

  • DatasourceAspect, an aspect component: configure the pointcut @Pointcut aspect() (all methods of all DAO classes) and the before advice @Before("aspect()") before(JoinPoint point); get the method name, compare its prefix against a METHOD_TYPE_MAP collection, and bind write/read to the current thread (the same thread that will execute the DAO method, which the before advice intercepts)

  • DatasourceHandler: in the before advice, use a ThreadLocal to bind the data source the method should use to the thread executing it; when the method later asks for a data source, it is looked up by the current thread.

3. Load balancing

Algorithms

  • Round-robin
  • Weighted round-robin
  • Load-based

4. High availability

Provide a redundant standby machine for a single-point service

  • Heartbeat detection
  • Virtual IP
  • Master-slave replication

9. Typical SQL

1. Online DDL

To avoid long table-level locking

  • Copy strategy: copy row by row; SQL logged against the old table during the copy is replayed afterwards
  • MySQL 5.6's online DDL greatly shortens the lock time

2. Batch import

①. Disable indexes and constraints first, then rebuild them in one pass after the import

②. Avoid per-row transactions

To guarantee consistency, InnoDB wraps each SQL statement in its own transaction by default (which is also time-consuming). Start a transaction manually before a batch import and commit manually once the import completes.
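
A sketch under those assumptions (the table name is a placeholder; disable keys applies to MyISAM's non-unique indexes):

```sql
ALTER TABLE big_table DISABLE KEYS;  -- skip index maintenance during the load (MyISAM)

SET autocommit = 0;                  -- avoid one implicit transaction per statement
START TRANSACTION;
-- ... many INSERT statements ...
COMMIT;

ALTER TABLE big_table ENABLE KEYS;   -- rebuild the indexes in one pass
```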

3. limit offset,rows

Avoid large offsets (i.e. large page numbers)

offset skips rows one by one. Filter with a condition (e.g. on an indexed id) instead of fetching rows only to skip them with offset
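
A sketch on a hypothetical article table with an indexed id:

```sql
-- Slow: scans and discards the first 100000 rows
SELECT * FROM article ORDER BY id LIMIT 100000, 10;

-- Fast: seeks to the boundary via the primary key index
SELECT * FROM article WHERE id > 100000 ORDER BY id LIMIT 10;
```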

4. select *

Query only the fields you need, to reduce network transmission delay (the impact is small)

5. order by rand()

Generates a random number for every row and finally sorts by those random numbers. Instead, generate a random primary key in the application; see the sketch below.
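
One hedged alternative, assuming roughly continuous ids in a hypothetical article table:

```sql
-- Avoids sorting the whole table by RAND(); picks one row near a random id
SELECT * FROM article
WHERE id >= (SELECT FLOOR(RAND() * MAX(id)) FROM article)
ORDER BY id LIMIT 1;
```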

6. limit 1

If you know only one row will be retrieved, add limit 1 so the scan can stop at the first match

10. Slow query log

1. Locates SQL with poor query efficiency so it can be optimized in a targeted way

2. Configuration Items

  • Switch it on: slow_query_log
  • Threshold time: long_query_time

3. The slow query log automatically records SQL that exceeds the threshold time, saving it under datadir
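
A sketch of the configuration (the values are illustrative; these are usually set in my.cnf):

```sql
SET GLOBAL slow_query_log = 1;      -- enable the log
SET GLOBAL long_query_time = 1;     -- seconds; slower queries get recorded
SHOW VARIABLES LIKE 'slow_query%';  -- shows the log file path under datadir
```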

11. Profile

1. Automatically records the execution time of each SQL statement and the time spent on each step of a specific SQL statement.

2. Configuration item

Turn on profiling (set profiling = 1)

3. View the log information: show profiles

4. View the detailed steps and timing of a specific SQL statement

show profile for query Query_ID
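
A minimal session sketch (show profile is deprecated in newer versions but still works in MySQL 5.x):

```sql
SET profiling = 1;             -- enable profiling for this session
SELECT COUNT(*) FROM article;  -- run any statement
SHOW PROFILES;                 -- list recent statements with their Query_IDs
SHOW PROFILE FOR QUERY 1;      -- per-step timing for Query_ID 1
```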

12. Typical server configuration

1. max_connections: the maximum number of client connections

2. table_open_cache: the number of cached table file handles, which speeds up reading and writing of table files

3. key_buffer_size: the index cache size (MyISAM)

4. innodb_buffer_pool_size: the size of InnoDB's buffer pool, the foundation for InnoDB's various features

5. innodb_file_per_table: one .ibd file per table; otherwise InnoDB uses the shared tablespace
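
These are set in my.cnf; the current values can be inspected at runtime:

```sql
SHOW VARIABLES
WHERE Variable_name IN ('max_connections', 'table_open_cache',
                        'key_buffer_size', 'innodb_buffer_pool_size',
                        'innodb_file_per_table');
```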

13. Stress testing tool mysqlslap

1. Automatically generates SQL and executes it to test performance

mysqlslap --auto-generate-sql -uroot -proot

2. Concurrent testing

mysqlslap --auto-generate-sql --concurrency=100 -uroot -proot, simulating 100 clients executing the SQL

3. Run multiple rounds of tests and average the results

mysqlslap --auto-generate-sql --concurrency=100 --iterations=3 -uroot -proot, simulating 100 clients executing the SQL, for 3 rounds

4. Storage engine testing

  • --engine=innodb: mysqlslap --auto-generate-sql --concurrency=100 --iterations=3 --engine=innodb -uroot -proot, simulating 100 clients executing the SQL for 3 rounds, to measure InnoDB's processing performance

  • --engine=myisam: mysqlslap --auto-generate-sql --concurrency=100 --iterations=3 --engine=myisam -uroot -proot, simulating 100 clients executing the SQL for 3 rounds, to measure MyISAM's processing performance

