MySQL Internals Manual  /  X Protocol  /  Use Cases

15.7 Use Cases

Topics in this section:

Prepared Statements with Single Round-Trip

In the MySQL Client/Server Protocol, a PREPARE/EXECUTE cycle required two round-trips as the COM_STMT_EXECUTE requires data from the COM_STMT_PREPARE-ok packet.

Figure 15.13 Single Round-Trip


The X Protocol is designed in a way that the EXECUTE stage doesn't depend on the response of the PREPARE stage.

Figure 15.14 Without PREPARE Stage Dependency


That allows the client to send both PREPARE and EXECUTE after each other without waiting for the server's response.

Figure 15.15 Without Server Response


Note

See the Implementation Notes about how to efficiently implement pipelining.

Streaming Inserts

When inserting a large set of data (data import), make a trade-off among:

  • memory usage on client and server side

  • network round-trips

  • error reporting

For this example it is assumed that 1 million rows, each 1024 bytes in size have to be transferred to the server.

Optimizing for Network Round-Trips

(Assuming the MySQL Client/Server Protocol in this case) Network round-trips can be minimized by creating a huge SQL statements of up to 1Gbyte in chunks of 16Mbyte (the protocol's maximum frame size) and sending it over the wire and letting the server execute it.

INSERT INTO tbl VALUES
  ( 1, "foo", "bar" ),
  ( 2, "baz", "fuz" ),
  ...;

In this case:

  • the client can generate the SQL statement in chunks of 16Mbyte and write them to the network

  • (memory usage) the server waits until the full 1GByte is received

  • (execution delay) before it can start parsing and executing it

  • (error-reporting) in case an error (parse-error, duplicate key error, ...) the whole 1Gbyte message will be denied without any good way to know where the error in that big message happened

The Execution Time for inserting all rows in one batch is:

1 * RTT +
(num_rows * Row Size / Network Bandwidth) +
num_rows * Row Parse Time +
num_rows * Row Execution Time

Optimizing for Memory Usage and Error-Reporting

The other extreme is using single row INSERT statements:

INSERT INTO tbl VALUES
  ( 1, "foo", "bar" );

INSERT INTO tbl VALUES
  ( 2, "baz", "fuz" );

  ...
  • client can generate statements as it receives data

  • streams it to the server

  • (execution delay) server starts executing statements as soon as it receives the first row

  • (memory usage) server only has to buffer a single row

  • (error-reporting) if inserting one row fails, the client knows about it when it happens

  • as each statement results in its own round-trip, the network-latency is applied for each row instead of once

  • each statement has to be parsed and executed in the server

Using Prepared Statements solves the last bullet point:

Figure 15.16 Optimization for Memory


The Execution Time for inserting all rows using prepared statements and the MySQL Client/Server Protocol is:

num_rows * RTT +
(num_rows * Row Size / Network Bandwidth) +
1 * Row Parse Time +
num_rows * Row Execution Time

Optimizing for Execution Time and Error-Reporting

In the X Protocol, a pipeline can be used to stream messages to the server while the server is executing the previous message.

Figure 15.17 Optimization for Execution


The Execution Time for inserting all rows using prepared statements and using pipelining is (assuming that the network is not saturated):

1 * RTT +
(1 * Row Size / Network Bandwidth) +
1 * Row Parse Time +
num_rows * Row Execution Time
  • one network latency to get the initial prepare/execute across the wire

  • one network bandwith to get the initial prepare/execute across the wire. All further commands arrive at the server before the executor needs them thanks to pipelining.

  • one row parse time to parse the prepare

  • num_rows row execution time stays as before

In case error reporting isn't a major topic one can combine multi-row INSERT with pipelining and reduce the per-row network overhead. This is important in case the network is saturated.

SQL with Multiple Resultsets

Figure 15.18 Multiple Resultsets


Inserting CRUD Data in a Batch

Inserting multiple documents into a collection col1 is a two-step process:.

  1. prepare the insert

  2. pipeline the execute messages

Figure 15.19 Batch


By utilizing pipelining the execute message can be batched without waiting for the corresponding executeOk message to be returned.

Cross-Collection Update and Delete

Deleting documents from collection col2 based on data from collection col1.

Instead of fetching all rows from col1 first to construct a big delete message it can also be run in nested loop:

Crud.PrepareDelete(stmt_id=2, Collection(name="col2"), filter={ id=? })
Crud.PrepareFind(stmt_id=1, Collection(name="col1"), filter={ ... })

Sql.PreparedStmtExecute(stmt_id=1, cursor_id=2)

while ((rows = Sql.CursorFetch(cursor_id=2))):
  Sql.PreparedStmtExecute(stmt_id=2, values = [ rows.col2_id ])

Sql.PreparedStmtClose(stmt_id=2)
Sql.PreparedStmtClose(stmt_id=1)

Figure 15.20 Update and Delete



User Comments
Sign Up Login You must be logged in to post a comment.