Common MySQL interview questions (Latest in 2024)

Table of contents

  • Preface
  • 1. The difference between char and varchar
  • 2. The three normal forms of a database
  • 3. Do you understand the execution order of SQL?
  • 4. What is an index
  • 5. Advantages and disadvantages of indexing
  • 6. Index type
  • 7. How to design indexes (optimization)
  • 8. How to avoid index failure (also a type of SQL optimization)
  • 9. Index data structures
  • 10. Why indexes use a tree structure
  • 11. Binary search tree, B tree, B+ tree
  • 12. Why not use B-tree
  • 13. Leftmost matching principle
  • 14. How to check whether the index is used or how to check the SQL execution plan
  • 15. A SQL query is very slow. How can we troubleshoot and optimize it?
  • 16. The difference between MyISAM, InnoDB and Memory
  • 17. What is a transaction
  • 18. Four major characteristics of transactions (ACID)
  • 19. Dirty read, non-repeatable read, phantom read
  • 20. The isolation level of transactions?
  • 21. How to optimize the database
  • 22. SQL optimization
  • 23. Commonly used aggregate functions
  • 24. Several types of join queries
  • 25. The difference between in and exists
  • 26. The difference between drop, truncate, delete

Preface

The latest Java interview questions (Java basics, collections, multithreading, JVM, locks, algorithms, CAS, Redis, databases, MyBatis, Spring, Spring MVC, Spring Boot, microservices)

1. The difference between char and varchar

① char has a fixed length, set when the column is defined. varchar has a variable length, so char uses space less efficiently than varchar.
② Because its length is fixed, char is generally faster to access than varchar.
③ char suits fixed-length strings, such as ID numbers and mobile phone numbers. varchar suits variable-length strings.
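
The space trade-off can be estimated with a small sketch. It assumes a single-byte character set and the storage rules described in the MySQL manual: CHAR(n) always occupies n bytes, while VARCHAR(n) occupies the actual length plus a 1-byte length prefix (for columns of up to 255 bytes). The function names are hypothetical, and real row formats add further overhead.

```python
def char_bytes(value: str, n: int) -> int:
    """CHAR(n): the value is right-padded with spaces, so it always takes n bytes."""
    if len(value) > n:
        raise ValueError(f"value too long for CHAR({n})")
    return n


def varchar_bytes(value: str, n: int) -> int:
    """VARCHAR(n) with n <= 255: actual length plus a 1-byte length prefix."""
    if len(value) > n:
        raise ValueError(f"value too long for VARCHAR({n})")
    return len(value) + 1


for v in ["", "ab", "abcd"]:
    print(repr(v), char_bytes(v, 4), varchar_bytes(v, 4))
```

For short values varchar is smaller, but a value that fills the column completely costs one byte more in varchar than in char.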

2. The three normal forms of a database

First normal form (1NF): Every field is atomic and cannot be divided further.
Second normal form (2NF): On top of 1NF, every non-key column must depend on the whole primary key, not just part of it. This eliminates partial dependencies.
Third normal form (3NF): On top of 2NF, every non-key column must depend on the primary key directly, not through another non-key column. This eliminates transitive dependencies.
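
As an illustration of 3NF, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The tables and columns (customers, orders, city) are made up for the example. A customer's city depends on the customer, not on the order, so it is stored once in its own table instead of being repeated on every order row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        city        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Beijing');
    INSERT INTO orders VALUES (100, 1, 30), (101, 1, 45);
""")

# The city lives in exactly one row, so a single UPDATE changes it for every order.
conn.execute("UPDATE customers SET city = 'Shanghai' WHERE customer_id = 1")

rows = conn.execute("""
    SELECT o.order_id, c.city
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)  # [(100, 'Shanghai'), (101, 'Shanghai')]
```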

3. Do you understand the execution order of SQL?

A query is written in the following order:

select distinct(deduplication) aggregate function
from table 1
[inner join | left join | right join](connect) table 2
on(connection condition) table 1.field = table 2.field
where query criteria
group by(group) fields
having group filtering conditions
order by(sort) fields
limit(pagination) 0,10

It is logically executed in a different order: from, on, join, where, group by, having, select, distinct, order by, limit.
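
A small runnable sketch can make the difference between where and having concrete. It uses Python's sqlite3 module as a stand-in for MySQL, and the dept and emp tables are invented for the example. where removes rows before grouping, having removes groups after grouping, and order by runs after select, so it can refer to the alias n.

```python
import sqlite3

# Logical evaluation order of the clauses (not the order they are written in).
LOGICAL_ORDER = ["from", "on", "join", "where", "group by",
                 "having", "select", "distinct", "order by", "limit"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp  (id INTEGER PRIMARY KEY, dept_id INTEGER, salary INTEGER);
    INSERT INTO dept VALUES (1, 'dev'), (2, 'ops');
    INSERT INTO emp VALUES (1, 1, 3000), (2, 1, 2000), (3, 1, 500),
                           (4, 2, 4000), (5, 2, 800);
""")

result = conn.execute("""
    SELECT d.name, COUNT(*) AS n
    FROM emp AS e
    INNER JOIN dept AS d ON e.dept_id = d.id
    WHERE e.salary > 1000
    GROUP BY d.name
    HAVING COUNT(*) >= 2
    ORDER BY n DESC
    LIMIT 0, 10
""").fetchall()
print(result)  # [('dev', 2)]
```

Only dev survives: where leaves two dev rows and one ops row, and having then drops the ops group because it has fewer than two rows.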

4. What is an index

An index is a data structure for retrieving data efficiently. It works like a book's table of contents, so data can be found faster. It is stored as a file and takes up physical space.

5. Advantages and disadvantages of indexing

Advantages:
① Faster lookups.
② An index keeps its columns sorted, which reduces the cost of sorting.
③ MySQL 8 introduced invisible indexes. An invisible index is not used by the optimizer, so you can observe how the database behaves without the index before actually dropping it, which helps with tuning.
Disadvantages:
① An index is also a file, so it takes up space.
② Updates are slower, because both the data and the index have to be updated.

6. Index type

① Normal index: The basic index type. The indexed fields may be null and may contain duplicates.
② Unique index: The indexed values must be unique, but the indexed fields may be null.
③ Primary key index: The indexed values must be unique and cannot be null.
④ Composite index: An index on multiple fields. It follows the leftmost matching rule.
⑤ Full-text index: Originally available only on the MyISAM engine. InnoDB has supported it since MySQL 5.6.

7. How to design indexes (optimization)

① Prefer unique indexes: the values are unique, so lookups are faster.
② Index fields that are often used as query conditions.
③ Index fields that often need sorting, grouping, or union operations: order by, group by, union, distinct, etc.
④ Limit the number of indexes: the more indexes there are, the more disk space they need, and the more work it takes to rebuild and update them whenever the table is updated.
⑤ Indexes are not recommended for small tables (below the million-row level): with too little data, a full scan may be faster than going through the index.
⑥ Delete indexes that are rarely used or no longer used.
⑦ Use small types for indexed columns. For example, if either int or bigint would work, use int. Smaller types are faster to compare and take up less index space.
⑧ Use prefix indexes for long strings. The longer the indexed string, the more space the index takes up and the slower it becomes.

8. How to avoid index failure (also a type of SQL optimization)

① In a composite index, once a column is used in a range condition (>, <, like, between ... and), the index columns to its right can no longer be used.
② Do not perform calculations on indexed fields.
③ Avoid or, !=, <> and null checks in the where clause where possible, because they can prevent the index from being used.
④ Avoid like patterns that start with '%'.
⑤ Comparing a string column with a value that is not in single quotes forces an implicit type conversion, which invalidates the index.

9. Index data structures

Hash: A query calls the hash function to get the address, then goes back to the table to fetch the actual data. (Not supported by InnoDB and MyISAM; supported by Memory.)
B+ tree: Every lookup starts from the root node, finds the address, then goes back to the table to fetch the actual data.

10. Why indexes use a tree structure

Because a tree speeds up queries and keeps the data in order.

11. Binary search tree, B tree, B+ tree

Binary search tree (also called a binary sort tree): Each node has at most two children (smaller on the left, larger on the right), and the number of comparisons per lookup is minimal. But the index lives on disk. When the data set is large, the whole index file cannot be loaded into memory at once, so a lookup needs multiple IOs. In the worst case, the number of IOs equals the height of the tree. To reduce IO, the tree has to become short and wide instead of tall and narrow.
B-tree (B- tree): A multi-way search tree. Each node has up to K children, and the nodes store both index values and data. K is the order of the B-tree (the maximum number of children of a node, not the height of the tree). Each node needs more comparisons, but they happen in memory and are negligible. A B-tree needs fewer IOs than a binary search tree because it can be much shorter.
B+ tree: An upgraded B-tree in which only the leaf nodes store the data that the index values point to.
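
A back-of-the-envelope calculation shows how much shorter a wide tree is. The sketch assumes 16 KB pages (the InnoDB default), an 8-byte key plus a 6-byte child pointer per internal entry, and rows of about 1 KB in the leaves. The entry and row sizes are illustrative assumptions, not measured values, and the function names are hypothetical.

```python
import math


def bplus_tree_height(rows: int, rows_per_leaf: int, keys_per_internal: int) -> int:
    """Number of levels (page reads per lookup) for a fully packed B+ tree."""
    nodes = math.ceil(rows / rows_per_leaf)  # leaf pages
    height = 1
    while nodes > 1:
        nodes = math.ceil(nodes / keys_per_internal)  # one level up
        height += 1
    return height


def balanced_binary_tree_height(rows: int) -> int:
    """Number of levels of a perfectly balanced binary search tree."""
    return math.ceil(math.log2(rows + 1))


PAGE_BYTES = 16 * 1024
KEYS_PER_INTERNAL = PAGE_BYTES // (8 + 6)  # 1170 entries per internal page
ROWS_PER_LEAF = PAGE_BYTES // 1024         # 16 rows per leaf page

rows = 20_000_000
print(bplus_tree_height(rows, ROWS_PER_LEAF, KEYS_PER_INTERNAL))  # 3
print(balanced_binary_tree_height(rows))                          # 25
```

Under these assumptions, 20 million rows fit in a tree of 3 levels, while a balanced binary tree over the same rows has 25 levels.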

12. Why not use B-tree

① A B-tree is only suitable for random search, while a B+ tree supports both random search and sequential search (because its leaf nodes form a linked list in which the index values are stored in order).
Sequential search: Items are compared one by one in sequence order.
Random search: Items are repeatedly picked from the sequence for comparison until the result is found.

② Reduce disk IO and improve space utilization: Because the non-leaf nodes of a B+ tree store only index values and no data, each non-leaf node can hold more index values, so the B+ tree can be shorter and needs fewer IOs.

③ B+ trees suit range searches: This is the key point, because most database queries are range searches. The leaf nodes of a B+ tree form an ordered linked list, so it is enough to traverse them directly. In a B-tree the values in a range may be stored far apart, so they can only be found through an in-order traversal, which makes the B+ tree the better fit.
In-order traversal: The left subtree is visited first, then the root, then the right subtree.
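
The range-search argument can be sketched with a toy leaf level. This is a simplified model, not InnoDB's page format: Leaf, build_leaves, find_start_leaf and range_scan are hypothetical names, and find_start_leaf stands in for the root-to-leaf descent that a real B+ tree would perform.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    keys: List[int]
    next: Optional["Leaf"] = None  # pointer to the right sibling leaf


def build_leaves(sorted_keys: List[int], per_leaf: int) -> List[Leaf]:
    """Split sorted keys into leaves and link each leaf to its right sibling."""
    leaves = [Leaf(sorted_keys[i:i + per_leaf])
              for i in range(0, len(sorted_keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def find_start_leaf(leaves: List[Leaf], lo: int) -> Optional[Leaf]:
    """Stand-in for the tree descent: first leaf that can contain a key >= lo."""
    for leaf in leaves:
        if leaf.keys[-1] >= lo:
            return leaf
    return None


def range_scan(start: Optional[Leaf], lo: int, hi: int) -> List[int]:
    """Collect keys in [lo, hi] by following next pointers, stopping past hi."""
    out = []
    leaf = start
    while leaf is not None:
        for key in leaf.keys:
            if key > hi:
                return out
            if key >= lo:
                out.append(key)
        leaf = leaf.next
    return out


leaves = build_leaves(list(range(1, 21)), per_leaf=4)
result = range_scan(find_start_leaf(leaves, 6), 6, 11)
print(result)  # [6, 7, 8, 9, 10, 11]
```

After one descent to the starting leaf, the scan only moves sideways through sibling leaves and never returns to the upper levels.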

13. Leftmost matching principle

Leftmost first: any run of consecutive index columns starting from the leftmost one can be matched. Matching stops at the first range condition (>, <, between ... and, like).

For example: table Z has a composite index (a,b,c)

-- All three index columns a, b, c are used, because the query follows the leftmost
-- matching principle. The order of the conditions in the where clause does not affect
-- the result, because the MySQL query optimizer reorders them automatically.
select * from Z where a = 1 and b = 2 and c = 3

-- Column a is the starting point. Without column a nothing can be matched, so the index fails.
select * from Z where b = 2 and c = 3

-- Column b is missing, so the columns are not consecutive and only column a of the index is used.
select * from Z where a = 1 and c = 3
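
Why the leftmost column matters can be seen by modelling the composite index as a list of (a, b, c) tuples sorted in that order. This is a toy model, not MySQL's implementation, and seek_prefix and scan_filter are hypothetical helpers. A prefix of the columns maps to one contiguous slice that binary search can locate, while a condition on b and c alone matches rows scattered across the whole list.

```python
from bisect import bisect_left, bisect_right

# Toy composite index on (a, b, c): entries are kept sorted by a, then b, then c.
INDEX = sorted((a, b, c) for a in (1, 2, 3) for b in (1, 2, 3) for c in (1, 2, 3))


def seek_prefix(index, prefix):
    """Binary-search the contiguous slice whose leading columns equal prefix."""
    lo = bisect_left(index, prefix)
    upper = prefix + (float("inf"),) * (3 - len(prefix))
    hi = bisect_right(index, upper)
    return index[lo:hi]


def scan_filter(index, b, c):
    """No leading column given: every entry has to be examined."""
    return [row for row in index if row[1] == b and row[2] == c]


print(seek_prefix(INDEX, (1, 2, 3)))                        # a, b, c: [(1, 2, 3)]
print([r for r in seek_prefix(INDEX, (1,)) if r[2] == 3])   # a only, then filter on c
print(scan_filter(INDEX, 2, 3))                             # b, c: full scan
```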

14. How to check whether the index is used or how to check the SQL execution plan

Use explain.

For example: explain select * from table_name where condition
Result: The key column shows the index that is actually used. The type column shows the access method, such as a full table scan or an index scan.
Performance of the type values, from worst to best: ALL < index < range ~ index_merge < ref < eq_ref < const < system
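
When reviewing many plans, the ordering above can be turned into a small helper. This is only a sketch: the function name and the default threshold are assumptions, and range and index_merge are simply placed next to each other because the text treats them as roughly equal.

```python
# Access types from worst to best, following the ordering given in the text.
ACCESS_TYPES = ["ALL", "index", "range", "index_merge",
                "ref", "eq_ref", "const", "system"]
RANK = {name: position for position, name in enumerate(ACCESS_TYPES)}


def needs_attention(access_type: str, threshold: str = "range") -> bool:
    """True if the access type from explain is worse than the chosen threshold."""
    return RANK[access_type] < RANK[threshold]


for access_type in ["ALL", "index", "range", "ref"]:
    print(access_type, needs_attention(access_type))
```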

15. A SQL query is very slow. How can we troubleshoot and optimize it?

Troubleshooting:
(1) Enable the slow query log.
(2) View the slow query log to locate inefficient SQL. The show processlist command shows the statements that are running right now.
(3) Use explain to view the execution plan of the SQL (check whether the index is not used or performs poorly).

Optimization:
SQL optimization + index + database structure optimization + optimizer optimization

16. The difference between MyISAM, InnoDB and Memory

MyISAM: The default storage engine before MySQL 5.5. It uses table-level locks and does not support transactions or foreign keys.
InnoDB: The default storage engine since MySQL 5.5. It uses row-level locks and supports transactions and foreign keys.
Memory: An in-memory storage engine. Because it works in memory, reads and writes are fast, but restarting the MySQL service loses the data. It does not support transactions or foreign keys.

17. What is a transaction

A transaction groups a series of database operations so that they are committed or rolled back as one unit. It is mainly used to ensure the integrity and consistency of data.

18. Four major characteristics of transactions (ACID)

Atomicity: The operations in a transaction either all succeed or all fail.
Consistency: Data that was consistent before the transaction is still consistent after it.
Isolation: Concurrent transactions do not interfere with each other.
Durability: Once a transaction is committed, its changes to the data in the database are permanent.
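
Atomicity can be tried out with Python's sqlite3 module standing in for MySQL. The accounts table and the transfer function are made up for the example. The second update violates a check constraint, so the whole transaction is rolled back, including the first update that had already succeeded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 500)  # alice cannot afford this
after_failed = balances(conn)
print(ok, after_failed)  # False {'alice': 100, 'bob': 50}
```

The credit to bob was executed first, yet his balance is unchanged after the failure, which is exactly the all-or-nothing behaviour.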

19. Dirty read, non-repeatable read, phantom read

Dirty read: Transaction A reads data that transaction B has not yet committed. It can occur at the "read uncommitted" isolation level.
Non-repeatable read: Within one transaction, the same data is read several times but different results are returned. This happens because another transaction modified the data and committed between the reads.
Phantom read: Within one transaction, the same query is run twice and the second result set contains different rows from the first, because another transaction inserted or deleted rows in between. It is like a hallucination, hence the name.
From the above, dirty reads and non-repeatable reads are errors in data values, while phantom reads are errors in the number of rows.

20. The isolation level of transactions?

① read uncommitted: At this isolation level, a transaction can see the results of other transactions that have not committed yet. Reading uncommitted data is known as a dirty read.
② read committed: This is the default isolation level of most database systems (but not of MySQL). A transaction can only see changes made by committed transactions. It solves dirty reads.
③ repeatable read: This is the default isolation level of MySQL. Reads of the same data within one transaction return the same result, even while other transactions run concurrently. In theory this still allows another tricky problem: phantom reads. It solves dirty reads and non-repeatable reads.
④ serializable: This is the highest isolation level. It solves phantom reads by forcing transactions to execute in order so that they cannot conflict with each other. In short, it places a shared lock on every row that is read. At this level there may be many timeouts and heavy lock contention. It solves dirty reads, non-repeatable reads, and phantom reads.
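
The four levels can be summarised as a lookup table. The sketch encodes the list above as a dictionary. The helper weakest_level_preventing is a hypothetical name, and the table reflects the standard definitions of the levels, not the behaviour of any particular engine.

```python
# Which read anomalies each isolation level prevents, from weakest to strongest.
PREVENTS = {
    "read uncommitted": set(),
    "read committed": {"dirty read"},
    "repeatable read": {"dirty read", "non-repeatable read"},
    "serializable": {"dirty read", "non-repeatable read", "phantom read"},
}


def weakest_level_preventing(required):
    """Return the first (weakest) level that prevents every required anomaly."""
    required = set(required)
    for level, prevented in PREVENTS.items():  # dicts keep insertion order
        if required <= prevented:
            return level
    raise ValueError(f"unknown anomaly in {sorted(required)}")


print(weakest_level_preventing(["dirty read"]))           # read committed
print(weakest_level_preventing(["non-repeatable read"]))  # repeatable read
print(weakest_level_preventing(["phantom read"]))         # serializable
```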

21. How to optimize the database

① SQL optimization
② Add a cache
③ Split databases and tables (sharding)
④ Read/write splitting

22. SQL optimization

① Do not use select *. List the specific fields.
② Use numeric values instead of strings, such as: 0=sing, 1=jump, 2=rap.
③ Avoid returning a large amount of data. It is best to use paging.
④ Use indexes to improve query speed. Do not build too many indexes, and do not build them on fields with many duplicate values.
⑤ Batch inserts are faster than single-row inserts, because the transaction only needs to be opened once. With very small data volumes the difference does not show.
⑥ Avoid subqueries where possible and prefer optimized multi-table joins.
⑦ Prefer union all over union when duplicates are acceptable, because union automatically deduplicates, which costs extra work.
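
The difference between union and union all in the last point is easy to check. The sketch uses Python's sqlite3 module as a stand-in for MySQL, with two throwaway tables t1 and t2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (v INTEGER);
    CREATE TABLE t2 (v INTEGER);
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (2), (3);
""")

# union removes the duplicate 2; union all keeps every row from both sides.
union_rows = conn.execute(
    "SELECT v FROM t1 UNION SELECT v FROM t2 ORDER BY v").fetchall()
union_all_rows = conn.execute(
    "SELECT v FROM t1 UNION ALL SELECT v FROM t2 ORDER BY v").fetchall()

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (2,), (2,), (3,)]
```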

23. Commonly used aggregate functions

① sum(column name): sum
② max(column name): maximum value
③ min(column name): minimum value
④ avg(column name): average
⑤ first(column name): first record (not available in MySQL)
⑥ last(column name): last record (not available in MySQL)
⑦ count(column name): counts the rows where the column is not null. count(*) counts all rows, including rows with null values.
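
The null handling of count can be verified with a quick sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the scores table is invented for the example. One of the three rows has a null score.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("a", 10), ("b", None), ("c", 20)])

row = conn.execute("""
    SELECT COUNT(*), COUNT(score), SUM(score), MAX(score), MIN(score), AVG(score)
    FROM scores
""").fetchone()
print(row)  # (3, 2, 30, 20, 10, 15.0)
```

count(*) sees three rows and count(score) sees two. avg also ignores the null row, so the average is 30 / 2, not 30 / 3.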

24. Several types of join queries

Inner join: Returns only the rows that match in both tables.
Left join: Returns all rows of the left table, plus the matching rows of the right table.
Right join: Returns all rows of the right table, plus the matching rows of the left table.
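
The three joins can be compared on a tiny data set. The sketch uses Python's sqlite3 module as a stand-in for MySQL, with made-up users and orders tables. Because older SQLite versions lack right join, the right join is written as a left join with the tables swapped, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT);
    INSERT INTO users  VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1, 'book'), (11, 3, 'pen');
""")

inner = conn.execute("""
    SELECT u.name, o.item FROM users u
    INNER JOIN orders o ON o.user_id = u.id ORDER BY u.id
""").fetchall()

left = conn.execute("""
    SELECT u.name, o.item FROM users u
    LEFT JOIN orders o ON o.user_id = u.id ORDER BY u.id
""").fetchall()

# Equivalent to: users u RIGHT JOIN orders o ON o.user_id = u.id
right = conn.execute("""
    SELECT u.name, o.item FROM orders o
    LEFT JOIN users u ON o.user_id = u.id ORDER BY o.id
""").fetchall()

print(inner)  # [('ann', 'book')]
print(left)   # [('ann', 'book'), ('bob', None)]
print(right)  # [('ann', 'book'), (None, 'pen')]
```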

25. The difference between in and exists

in(): Suitable when the subquery table is smaller than the main table.
exists(): Suitable when the subquery table is larger than the main table.
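
The two forms are interchangeable in what they return. The sketch below shows this with Python's sqlite3 module as a stand-in for MySQL and made-up users and orders tables. It only demonstrates that the results are equal. Which form is faster depends on the table sizes and on the optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users  VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# in(): the subquery result is built first, then each user id is checked against it.
in_rows = conn.execute("""
    SELECT u.name FROM users u
    WHERE u.id IN (SELECT user_id FROM orders)
    ORDER BY u.id
""").fetchall()

# exists(): for each user, the subquery is probed for at least one matching order.
exists_rows = conn.execute("""
    SELECT u.name FROM users u
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id)
    ORDER BY u.id
""").fetchall()

print(in_rows)      # [('ann',), ('cat',)]
print(exists_rows)  # [('ann',), ('cat',)]
```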

26. The difference between drop, truncate, delete

Speed: drop > truncate > delete.
Rollback: delete can be rolled back. truncate and drop cannot.
What is deleted: delete keeps the table structure, removes some or all of the rows, and does not free up space. truncate keeps the table structure, removes all rows, and frees up space. drop removes both the table structure and the data, including the indexes, and frees up space. In MySQL, privileges granted on the table are not dropped automatically.