Amazon FreeRTOS porting: mbedTLS handshake failure (hang)

lesudu wrote on May 29, 2018:

I am porting AFR to ATSAM4E from Atmel.

I first ran Amazon FreeRTOS in the Windows simulator, using a key-certificate pair generated with AWS IoT. I am using the same key-certificate pair in the application code on the hardware.

During mbedtls_ssl_handshake(), the code hangs in the client_hello() step. Stepping through the code, I can see that it sometimes hangs while selecting a cipher, and sometimes in the MBED_TLS_HELLO case in ssl_cli.c, which basically just points to another state. I have also observed that it sometimes hangs in random functions. I am not able to find the exact cause.

I ran out of ways to debug the code. Also, I have very little experience in embedded security.

FYI, I am using the latest version of the AFR libraries (1.2.6). I am currently using an SD card to read and write credentials, and the EEPROM for entropy generation.

Kindly suggest a solution.

Regards,
Sudarshan Bhat

Edited by: lesudu on May 29, 2018 4:43 PM

SarenaAtAws wrote on May 29, 2018:

Hello Sudarshan Bhat,

I see you mentioned, both here and in your other thread, that your entropy source is the EEPROM. Is your entropy source working as expected?
Each port using pkcs11 with mbedtls needs to have +mbedtls_hardware_poll+ defined correctly. If this function is not set up correctly, we may see undesired behavior. For example, if the same random number is provided on every call, or after every device reset, then the “randomly” generated sequence number for the TCP/IP connection will be reused. This may cause the server to think that the current session is the same as the previous one. The server will then return an error saying that the socket has already been closed.

Let us know if this helps.

Thanks,
Sarena

lesudu wrote on May 30, 2018:

Hello,

When I step through the code in the debugger, I can see that the 32-byte read from and write to the EEPROM happen properly. The ‘have at least one entropy source?’ check also returns positive. I cannot check whether the value is different every time, as some characters are not displayed properly in the debug window (they are not Unicode characters).

However, I have not implemented hardware poll yet. Is it a must even if I have an entropy source already?

FYI, the board is an Atmel ATSAM4E, which does not have a hardware TRNG (true random number generator). So, even if I want to implement the aws_hardware_poll function, is it possible?

Regards,
Sudarshan Bhat

Edited by: lesudu on May 30, 2018 10:13 AM

Edited by: lesudu on May 30, 2018 10:15 AM

Edited by: lesudu on May 30, 2018 1:37 PM
Update: I have checked the generated random numbers that are stored in the EEPROM. The characters look the same every time I reset the board, so I am assuming the same “random” number is generated every time. I have no idea yet how to make it work!

lesudu wrote on May 30, 2018:

Hello,

Thanks for the response.

The mbedtls website mentions that pseudo random number generators cannot be used, as they do not make a strong entropy source.

It says:
"4. How to implement the Non-Volatile seed entropy source
If a hardware platform does not have a hardware entropy source to leverage into the entropy pool, alternatives have to be considered. "

So, can the NV_seed method be used as an alternative to mbedtls_hardware_poll()?
Can you please confirm?

Best regards,
Sudarshan

SarenaAtAws wrote on May 30, 2018:

Hello Sudarshan,

mbedtls_hardware_poll must be implemented for +mbedtls+ to access your device’s entropy source.

The datasheet for the Atmel SMART SAM4E series,
http://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-11157-32-bit-Cortex-M4-Microcontroller-SAM4E16-SAM4E8_Datasheet.pdf
appears to state there is a “10-bit pseudo random number generator”.

The random number must be unique at the startup of the board each time.

Here is more information about porting mbedtls_hardware_poll:
https://docs.mbed.com/docs/mbed-os-handbook/en/5.2/advanced/tls_porting/

Let us know if you have more questions.

Thanks,
Sarena

SarenaAtAws wrote on May 31, 2018:

Hello Sudarshan,

If you are taking your device to production and there is not a TRNG on your device, then implementing the non-volatile seeding method is an alternative.
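
As a rough illustration of the non-volatile seeding approach, the sketch below shows a pair of read/write callbacks shaped like the ones mbedtls accepts through mbedtls_platform_set_nv_seed() when MBEDTLS_ENTROPY_NV_SEED and MBEDTLS_PLATFORM_NV_SEED_ALT are enabled in config.h. The names nv_seed_read, nv_seed_write, NV_SEED_SIZE and the RAM array standing in for the EEPROM are assumptions made for this example; a real port would replace the memcpy calls with its own EEPROM driver calls and would provision a genuinely random seed at manufacturing time.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: 64 bytes, matching MBEDTLS_ENTROPY_BLOCK_SIZE when the entropy
 * accumulator uses SHA-512. Check the value in your own mbedtls build. */
#define NV_SEED_SIZE 64

/* Hypothetical stand-in for the EEPROM region that holds the seed. */
static unsigned char ucFakeEeprom[ NV_SEED_SIZE ];

/* Read the stored seed. Returns the number of bytes read, or -1 on error. */
int nv_seed_read( unsigned char *buf, size_t buf_len )
{
    if( buf == NULL || buf_len > NV_SEED_SIZE )
    {
        return -1;
    }

    memcpy( buf, ucFakeEeprom, buf_len ); /* replace with an EEPROM read */
    return ( int ) buf_len;
}

/* Store a new seed. mbedtls calls this so that the seed changes on every
 * boot. Returns the number of bytes written, or -1 on error. */
int nv_seed_write( unsigned char *buf, size_t buf_len )
{
    if( buf == NULL || buf_len > NV_SEED_SIZE )
    {
        return -1;
    }

    memcpy( ucFakeEeprom, buf, buf_len ); /* replace with an EEPROM write */
    return ( int ) buf_len;
}
```

In an application these two functions would be registered once at startup, before the entropy context is initialised, so that each boot mixes the stored seed into the pool and then overwrites it with a fresh value.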

For lab testing and verifying your code, it is sufficient to use the pseudo random generator for the hardware poll.
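
A minimal lab-only sketch of such a hardware poll might look like the following. The function signature is the one mbedtls expects when MBEDTLS_ENTROPY_HARDWARE_ALT is defined; everything else (the xorshift generator, the names vLabPrngSeed and prvLabPrngNext, the default seed) is a hypothetical placeholder and is not cryptographically secure. The seed passed to vLabPrngSeed must differ on every boot, otherwise the output repeats after each reset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PRNG state. NOT secure: for lab testing only. */
static uint32_t ulLabPrngState = 0x12345678u;

/* Seed the generator. The caller is assumed to supply a value that changes
 * on every boot (for example a counter kept in non-volatile memory). */
void vLabPrngSeed( uint32_t ulSeed )
{
    /* xorshift must never be seeded with zero. */
    ulLabPrngState = ( ulSeed != 0u ) ? ulSeed : 1u;
}

/* One step of a 32-bit xorshift generator. */
static uint32_t prvLabPrngNext( void )
{
    uint32_t x = ulLabPrngState;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    ulLabPrngState = x;
    return x;
}

/* Entry point called by the mbedtls entropy module. Fills 'output' with
 * 'len' bytes, reports the count through 'olen', and returns 0 on success. */
int mbedtls_hardware_poll( void *data,
                           unsigned char *output,
                           size_t len,
                           size_t *olen )
{
    size_t i;

    ( void ) data; /* unused in this sketch */

    for( i = 0; i < len; i++ )
    {
        output[ i ] = ( unsigned char ) ( prvLabPrngNext() & 0xFFu );
    }

    *olen = len;
    return 0;
}
```

Because the generator is deterministic, seeding it with the same value twice produces the same bytes twice, which is exactly the failure mode to avoid across resets.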

Thanks,
-Sarena

lesudu wrote on June 04, 2018:

Hello,

I took your suggestion and implemented the hardware poll using a pseudo random number generator.

However, the program again hangs during the SSL handshake like before, specifically at client_hello().


ssl_tls.c(4433): => handshake
ssl_cli.c(2748): client state: 0
ssl_tls.c(2053): => flush output
ssl_tls.c(2071): <= flush output
ssl_cli.c(2748): client state: 1
ssl_tls.c(2053): => flush output
ssl_tls.c(2071): <= flush output
ssl_cli.c(0515): => write client hello
ssl_cli.c(0551): client hello, max version: [3:3]
ssl_cli.c(0560): client hello, current time: 947335827
ssl_cli.c(0575): dumping 'client hello, random bytes' (32 bytes)
ssl_cli.c(0575): 0000:  38 77 32 93 94 b0 d7 5f 39 f3 aa c5 e3 d6 ee 8e  8w2...._9.......
ssl_cli.c(0575): 0010:  3c ce dd 4e 64 40 30 fa 7a db ae 9d d3 44 20 57  <..Nd@0.z....D W
ssl_cli.c(0625): client hello, session id len.: 0

Hangs here!

FYI, I am seeding the random number generator and can see that a different number is generated every time.

Please help me solve this issue.

Regards,
Sudarshan

Edited by: lesudu on Jun 5, 2018 3:41 PM

lesudu wrote on June 05, 2018:

Update:

  • I found out that heap memory runs out while the program is running: the malloc failed hook is called. (I am using heap_5.)
  • I used static memory for mbedTLS by changing config.h.
  • If I decrease the memory allocated to tasks to make more available to the heap, I get a stack overflow error.
    - The application hangs in a different place each time (most often during server_hello).

Is there anything else I can try?

regards,
Sudarshan

Gaurav-Aggarwal-AWS wrote on June 06, 2018:

That’s a great finding, and it does seem like you are running out of memory. You can try the following to optimize memory usage:

  • Use +uxTaskGetStackHighWaterMark+ to find out the unused stack space for each task and adjust the stack size for each task accordingly.
  • Use +xPortGetMinimumEverFreeHeapSize+ to find out the unused heap space and adjust the heap size accordingly.
  • If you are using FreeRTOS+TCP, you can configure network buffers (only applicable if you are using BufferAllocation_1.c) according to your application needs: https://www.freertos.org/FreeRTOS-Plus/FreeRTOS_Plus_TCP/Embedded_Ethernet_Buffer_Management.html
  • You can adjust the buffers in the bufferpool used by MQTT according to your application needs, using the bufferpoolconfigNUM_BUFFERS and bufferpoolconfigBUFFER_SIZE macros in the aws_bufferpool_config.h file.
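
For the first point, the arithmetic can be sketched as below. uxTaskGetStackHighWaterMark reports the smallest amount of stack that has ever been free for a task, in words, so the peak usage is the configured depth minus that value. The helper name xSuggestStackDepthWords and the idea of a fixed safety margin are assumptions made for illustration, not part of FreeRTOS.

```c
#include <stddef.h>

/* Hypothetical helper: given a task's configured stack depth and the value
 * reported by uxTaskGetStackHighWaterMark (both in words), suggest a new
 * depth that covers the observed peak usage plus a safety margin. */
size_t xSuggestStackDepthWords( size_t xCurrentDepthWords,
                                size_t xHighWaterMarkWords,
                                size_t xMarginWords )
{
    size_t xPeakUsedWords;

    /* Inconsistent input: the free space cannot exceed the total depth.
     * Leave the depth unchanged in that case. */
    if( xHighWaterMarkWords > xCurrentDepthWords )
    {
        return xCurrentDepthWords;
    }

    xPeakUsedWords = xCurrentDepthWords - xHighWaterMarkWords;
    return xPeakUsedWords + xMarginWords;
}
```

For example, a task created with 1024 words that reports a high water mark of 300 has used at most 724 words, so with a 64-word margin the suggestion is 788 words. A high water mark at or near zero means the task has probably already overflowed, so its stack should be grown rather than trimmed. The same reasoning applies to the heap, where xPortGetMinimumEverFreeHeapSize reports the low point in bytes.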

Let us know if that helps.

Thanks.