Understanding MySQL Client / Server Protocol Using Python & Wireshark: Part 1
MySQL Client / Server protocol is used in many areas. For example:
- MySQL Connectors like ConnectorC, ConnectorJ and etc.
- MySQL proxy
- Between master and slave
What is MySQL Client / Server protocol?
MySQL Client / Server protocol is accepted conventions (rules). Through these rules client and server “talks” and understand each other. Client connects to server through TCP connection with special socket, sends to server special packets and accepts them from server. There are two phases of this connection:
- Connection phase
- Command phase
Next illustration describes phases:
STRUCTURE OF PACKETS
Each packet consists of valuable data types. Maximum length of each packet can be 16MB. If the length of packet is more than 16MB, then it is separated into several chunks (16MB). First of all let’s see the protocol data types. MySQL Client / Server protocol has two data types:
- Integer types
- String types
(See the official documentation: https://dev.mysql.com/doc/internals/en/basic-types.html)
Integer types also separates into two section:
- Fixed length integer types
- Length-encoded integer types
Fixed length integer type consumes 1, 2, 3, 4, 6 or 8 bytes. For example if we want to describe number 2 in int<3> data type then we can write it like this in hex format: 02 00 00. Or if we want to describe number 2 in int<2> then we can write it like this in hex format: 02 00
Length-encoded integer types consumes 1, 3, 4 or 9 bytes. Before length-encoded integer types comes 1 byte. To detect the length of integer we have to check that first byte.
- If the first byte is less than 0xfb ( < 251 ) then next one byte is valuable (it is stored as a 1-byte integer)
- If the first byte is equal to 0xfc ( == 252 ) then it is stored as a 2-byte integer
- If the first byte is equal to 0xfd ( == 253 ) then it is stored as a 3-byte integer
- If the first byte is equal to 0xfe ( == 254 ) then it is stored as a 8-byte integer
But if the first byte is equal to 0xfb there is no need to read next bytes, it is equal to the NULL value of MySQL, and if equal to 0xff it means that it is undefined.
For example to convert fd 03 00 00 … into normal integer we have to read first byte and it is 0xfd. According to the above rules we have to read next 3 bytes and convert it into normal integer, and its value is 2 in decimal number system. So value of length-encoded integer data type is 2.
String types also separates into several sections.
- String – Fixed-length string types. They have a known, hardcoded length
- String – Null terminated string types. These strings end with 0x00 byte
- String – Variable length string types. Before such strings comes fixed-length integer type. According to that integer we can calculate actual length of string
- String – Length-encoded string types. Before such strings comes length-encoded integer type. According to that integer we can calculate actual length of string
- String – If a string is the last component of a packet, its length can be calculated from the overall packet length minus the current position
SNIFF WITH WIRESHARK
Let’s start wireshark to sniff the network, filter MySQL packets by ip (in my case server ip is 188.8.131.52). Then let’s try to connect to MySQL server by MySQL native client on our local machine.
>> mysql -u[username] -p[password] -h[host ip] -P3306
As you can see after TCP connection to the server we several MySQL packets from the server. First of them is greeting packet.
Let’s dig into this packet and describe each field.
First 3 bytes are packet length:
Next 1 byte is packet number:
Rest of bytes are payload of Greeting packet of MySQL Client / Server protocol
Let’s describe each field of greeting packet.
- Protocol number – Int<1>
- Server version – String
- Thread id – Int<4>
- Salt1 – String
- Server capabilities – Int<2>
- Server language – Int<1>
- Server Status – Int<2>
- Extended Server Capabilities – Int<2>
- Authentication plugin length – Int<1>
- Reserved bytes – 10 bytes
- Salt2 – String
- Authentication plugin string – String
Server language is integer, next table will help us to pick appropriate language by integer value:
In my case server language is 0x08 (in decimal number system it is 8 also). From above table we can see that equivalent of 8 is latin1_swedish_ci. Now we know that default language of server is latin1_swedish_ci.
Server capabilities and server status are also integers. But reading each BIT of these integers we can know about server capabilities and status. Next illustration describes server capability and status bits:
Using greeting packet client prepares Login Request Packet to send to the server for authentication. Now let’s research login request packet.
- First 3 bytes describes payload length
- Next 1 byte is packet number
- Client capabilities – Int<2> / Same as Server capabilities
- Extended client capabilities – Int<2> / Same as Server extended capabilities
- Max packet – Int<4> / describes the maximum length of packet
- Charset – Int<1> / in my case it is 0x21 (in decimal number system is 33), from the table we can see that it is utf8_general_ci. We set server’s default charset from latin1_swedish_ci to utf8_general_ci
- Username – String
- Password – String
- Client Auth Plugin string – String
As you can see the password is encrypted. To encrypt a password we will use sha1, md5 algorithms, also salt1 and salt2 strings from previous Greeting Packet sent from server.
Then we get OK packet from the server if we are authenticated successfully. Otherwise we would get ERR packet.
- 3 bytes are packet length
- 1 byte is packet number
- Affected rows – Int<1>
- Server status – Int<2>
- Warnings – Int<2>
That’s all. We have finished the theory. Now it’s time to start the practical part. In the second part of this article we will write our own MySQL native client from scratch using no external module or library.
If you’re a brilliant developer looking for remote software jobs, Turing may be able to help you very quickly. Head over to the Jobs page to know more!
Join a network of the world's best developers and get long-term remote software jobs with better compensation and career growth.