Elshad Agayev

@
MySQL Client / Server Protocol Using Python & Wireshark: P2
Developers Corner

Understanding MySQL Client / Server Protocol Using Python & Wireshark: Part 2

In this blog, we’ll learn how to write our own native MySQL client from scratch without using a connector or external libraries.

In the previous article we researched MySQL Client / Server Protocol using WireShark. Now lets start to write our code in python to simulate MySQL native client. Final codes are here: Github repo

First of all we have to create MYSQL_PACKAGE class. MYSQL_PACKAGE class is the parent of all other package classes (HANDSHAKE_PACKAGE, LOGIN_PACKAGE, OK_PACKAGE and etc.)

It accepts resp parameter on initialization. Resp is the binary response received from the server in bytesarray type. One of the important and interesting method of this class is next method.

Method next reads a portion of the bytes from the binary response. When we call this method, it reads some portion of bytes and puts a pointer to the last position where reading ended (changes a value of self.start and self.end properties). When we call this method again, it starts to read bytes at the point it last stopped.
Method next accepts five parameters: length, type, byteorder, signed, and freeze. If freeze is True it reads some portion of bytes from the binary response but does not change pointer position. Otherwise it reads a portion of bytes with given length and changes the position of pointer. If length is None then method reads bytes until the end of response bytesarray. Parameter type can be int, str, and hex data types. Method next converts a portion of bytes into the appropriate datatype according to the value of type parameter.
Parameter byteorder determines the conversion of bytes to integer type. It is up to the architecture of your computer. If your machine is big-endian, then it stores bytes in memory from the big address to the little. If your machine is little-endian, then it stores bytes in memory from the little address to the big. Thats why we have to know the exact type of our architecture to be able to convert bytes to integer correctly. In my case, it is little-endian, that’s why i’ve set the default value of byteorder parameter to “little”.
Parameter signed is also used in conversion of bytes to integer. We tell the function to consider each integer as unsigned or signed.
A second interesting method of this class is encrypt_password. This method encrypts a password with the given algorithm.

This method accepts two parameters: salt and password. Parameter salt is the concatenation of two salt1 and salt2 strings from the Greeting Packet received from the server. And parameter password is the password string of mysql user.
In the official documentation password encryption algorithm is:
password_encrypt_algorithm
Here “20-bytes random data from server” is concatenation of salt1 and salt2 from the Greeting Packet received from server. To remember what the greeting packet is look at the previous article
Now I want to explain the encrypt_password method line by line.
bytes1 = sha1(password.encode(“utf-8”)).digest()
We are converting password string to bytes, then encrypting it with sha1 function and assigning to bytes1 variable. It is equal to this part of algorithm:
password_encrypt_algorithm1
Then we are converting salt string into bytes and assigning to the concat1 variable.
concat1 = salt.encode(‘utf-8’)
password_encrypt_algorithm5
Third line of the method is:
concat2 = sha1(sha1(password.encode(“utf-8”)).digest()).digest()
password_encrypt_algorithm2
Here we are double-encrypting password string with sha1 function and assign it to the concat2 string.
Now we have two concat1 and concat2 variables. We have to concatenate them into one byte array:
bytes2 = bytearray()
bytes2.extend(concat1)
bytes2.extend(concat2)
password_encrypt_algorithm6
Then we have to encrypt concatenated bytes with sha1 function and assign to the bytes2 variable.
bytes2 = sha1(bytes2).digest()
password_encrypt_algorithm3
So we have two variables with encrypted bytes: bytes1 and bytes2. Now we have to do bitwise XOR operation between these variables and return the obtained hash.
hash=bytearray(x ^ y for x, y in zip(bytes1, bytes2))
return hash
password_encrypt_algorithm4

CLASSES FOR DATATYPES

In the previous article we’ve learned about Int and String data types of MySQL Client / Server protocol. Now we need some classes to be able to read fields from received packets.

INT CLASS

Int class implements INT data type of MySQL Client / Server protocol. It accepts package parameter on initialization. Parameter package should be the instance of any package class inherited from MYSQL_PACKAGE class. Method next detects the type of integer (int<fix> or int<lenenc> (see previous article) and calls the next method of package object to read the byte portion of received response.

STR CLASS

Str class implements STRING data type of MySQL Client / Server protocol. It accepts package parameter on initialization. Parameter package should be the instance of any package class inherited from MYSQL_PACKAGE class. Method next detects the type of String (String<fix>, String<Var>, String<NULL>, String<EOF> or String<lenenc>. See previous article) and calls the next method of package object to read the byte portion of received response.

HANDSHAKE_PACKAGE CLASS

HANDSHAKE_PACKAGE class is used for parsing the Greeting Packet received from server. It is inherited from MYSQL_PACKAGE class and accepts resp parameter on initialization. Parameter resp is the Greeting Packet response in bytes type recieved from the server.

Method parse reading fields from the response using Int and Str classes and puts them into a dictionary and returns.

LOGIN_PACKAGE CLASS

This class is used for create Login Request packet.

OK package and ERR package are the response package of server after authentication or after sending query to server on command phase.

MYSQL CLASS

MYSQL class is the wrapper class which creates TCP connection with server, sends and receives packages from server using above classes.

I think everything is clear in this class. I’ve defined __enter__ and __exit__ to be able to use this class with “with” statement to automatically close TCP connection. In __enter__ method i’m creating TCP connection over socket. And in __exit__ method i’m closing created connection. This class accepts host, port, user and password parameters on initialization.
In the connect method we receive greeting packet from server:
resp = self.client.recv(65536)
return HANDSHAKE_PACKAGE(resp)
In the login method we create Login request package using LOGIN_PACKAGE and HANDSHAKE_PACKAGE classes and sends to the server and gets OK or ERR packages.
That’s all. We’ve implemented the connection phase. To avoid making this article too long I will not explain the command phase. Because the command phase is easier than the connection phase. You can research it yourself with the knowledge you’ve accumulated from this and previous articles.
Demo Video:

If you’re a brilliant developer looking for remote software jobs, Turing may be able to help you very quickly. Head over to the Jobs page to know more!

Join a network of the world's best developers and get long-term remote software jobs with better compensation and career growth.

Apply for Jobs

By Oct 9, 2020
MySQL Client / Server Protocol Using Python & Wireshark: P1
Developers Corner

Understanding MySQL Client/Server Protocol Using Python Wireshark

This blog explains what the MySQL Client / Server protocol is all about, according to the official MySQL documentation.

MySQL Client / Server protocol is used in many areas. For example:

  • MySQL Connectors like ConnectorC, ConnectorJ and etc.
  • MySQL proxy
  • Between master and slave

What is MySQL Client / Server protocol?

MySQL Client / Server protocol is accepted conventions (rules). Through these rules client and server “talks” and understand each other. Client connects to server through TCP connection with special socket, sends to server special packets and accepts them from server. There are two phases of this connection:

  • Connection phase
  • Command phase

Next illustration describes phases:

STRUCTURE OF PACKETS

Each packet consists of valuable data types. Maximum length of each packet can be 16MB. If the length of packet is more than 16MB, then it is separated into several chunks (16MB). First of all let’s see the protocol data types. MySQL Client / Server protocol has two data types:

  • Integer types
  • String types

(See the official documentation: https://dev.mysql.com/doc/internals/en/basic-types.html)

INTEGER TYPES

Integer types also separates into two section:

  • Fixed length integer types
  • Length-encoded integer types

Fixed length integer type consumes 1, 2, 3, 4, 6 or 8 bytes. For example if we want to describe number 2 in int<3> data type then we can write it like this in hex format: 02 00 00. Or if we want to describe number 2 in int<2> then we can write it like this in hex format: 02 00

Length-encoded integer types consumes 1, 3, 4 or 9 bytes. Before length-encoded integer types comes 1 byte. To detect the length of integer we have to check that first byte.

  • If the first byte is less than 0xfb ( < 251 ) then next one byte is valuable (it is stored as a 1-byte integer)
  • If the first byte is equal to 0xfc ( == 252 ) then it is stored as a 2-byte integer
  • If the first byte is equal to 0xfd ( == 253 ) then it is stored as a 3-byte integer
  • If the first byte is equal to 0xfe ( == 254 ) then it is stored as a 8-byte integer

But if the first byte is equal to 0xfb there is no need to read next bytes, it is equal to the NULL value of MySQL, and if equal to 0xff it means that it is undefined.

For example to convert fd 03 00 00 … into normal integer we have to read first byte and it is 0xfd. According to the above rules we have to read next 3 bytes and convert it into normal integer, and its value is 2 in decimal number system. So value of length-encoded integer data type is 2.

STRING TYPES

String types also separates into several sections.

  • String – Fixed-length string types. They have a known, hardcoded length
  • String – Null terminated string types. These strings end with 0x00 byte
  • String – Variable length string types. Before such strings comes fixed-length integer type. According to that integer we can calculate actual length of string
  • String – Length-encoded string types. Before such strings comes length-encoded integer type. According to that integer we can calculate actual length of string
  • String – If a string is the last component of a packet, its length can be calculated from the overall packet length minus the current position

SNIFF WITH WIRESHARK

Let’s start wireshark to sniff the network, filter MySQL packets by ip (in my case server ip is 54.235.111.67). Then let’s try to connect to MySQL server by MySQL native client on our local machine.

>> mysql -u[username] -p[password] -h[host ip] -P3306

As you can see after TCP connection to the server we several MySQL packets from the server. First of them is greeting packet.

picture1

Let’s dig into this packet and describe each field.

First 3 bytes are packet length:

picture2

Next 1 byte is packet number:

picture3

Rest of bytes are payload of Greeting packet of MySQL Client / Server protocol

picture4

Let’s describe each field of greeting packet.

  • Protocol number – Int<1>
  • Server version – String
  • Thread id – Int<4>
  • Salt1 – String
  • Server capabilities – Int<2>
  • Server language – Int<1>
  • Server Status – Int<2>
  • Extended Server Capabilities – Int<2>
  • Authentication plugin length – Int<1>
  • Reserved bytes – 10 bytes
  • Salt2 – String
  • Authentication plugin string – String

Server language is integer, next table will help us to pick appropriate language by integer value:

In my case server language is 0x08 (in decimal number system it is 8 also). From above table we can see that equivalent of 8 is latin1_swedish_ci. Now we know that default language of server is latin1_swedish_ci.

Server capabilities and server status are also integers. But reading each BIT of these integers we can know about server capabilities and status. Next illustration describes server capability and status bits:

Using greeting packet client prepares Login Request Packet to send to the server for authentication. Now let’s research login request packet.

picture5

  • First 3 bytes describes payload length
  • Next 1 byte is packet number
  • Client capabilities – Int<2> / Same as Server capabilities
  • Extended client capabilities – Int<2> / Same as Server extended capabilities
  • Max packet – Int<4> / describes the maximum length of packet
  • Charset – Int<1> / in my case it is 0x21 (in decimal number system is 33), from the table we can see that it is utf8_general_ci. We set server’s default charset from latin1_swedish_ci to utf8_general_ci
  • Username – String
  • Password – String
  • Client Auth Plugin string – String

As you can see the password is encrypted. To encrypt a password we will use sha1, md5 algorithms, also salt1 and salt2 strings from previous Greeting Packet sent from server.

Then we get OK packet from the server if we are authenticated successfully. Otherwise we would get ERR packet.

picture6.png

  • 3 bytes are packet length
  • 1 byte is packet number
  • Affected rows – Int<1>
  • Server status – Int<2>
  • Warnings – Int<2>

That’s all. We have finished the theory. Now it’s time to start the practical part. In the second part of this article we will write our own MySQL native client from scratch using no external module or library.

If you’re a brilliant developer looking for remote software jobs, Turing may be able to help you very quickly. Head over to the Jobs page to know more!

Join a network of the world's best developers and get long-term remote software jobs with better compensation and career growth.

Apply for Jobs

By Oct 2, 2020