When I use the code like this, the system ends up in the __prefetch_handler, with very strange values in the LR register.
If I comment out the call to xTaskGetTickCount, all goes fine.
I’ve examined the function xTaskGetTickCount and it seems quite harmless.
I’ve tried to step into xTaskGetTickCount, but it does not crash on the first 10 calls or so; it takes longer than that to fail.
I’m not answering your question, but just raising a warning about your methodology: do not forget that xTaskGetTickCount() only returns a number of ticks, with the granularity of the system tick. If you run at 1000 Hz (which I believe is the case for most if not all demos; see configTICK_RATE_HZ), then your precision is only 1 ms.
If a task records 1 tick with your counting method, it may well have executed for only a few nanoseconds while crossing a system tick increment. Similarly, a task may have executed for 999 microseconds and record 0 ticks because its execution was wholly contained between two system ticks.
If these situations repeat during the lifetime of your application, you may end up with Task1 apparently having consumed all the time and Task2 none, while in fact Task2 used 99% of the CPU time and Task1 only 0.001%.
Because of that, I would suggest using a hardware timer, which will record the amount of time spent in each task much more accurately, especially if you intend to use rate-monotonic scheduling to determine a set of static priorities for your tasks.
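FreeRTOS has built-in support for exactly this: if configGENERATE_RUN_TIME_STATS is enabled and you point the port macros at a free-running timer running roughly 10 to 100 times faster than the tick interrupt, vTaskGetRunTimeStats() will report per-task CPU time from that counter. A sketch of the FreeRTOSConfig.h side follows; the macro names on the left are FreeRTOS's own, but vSetupHighFrequencyTimer() and ulHighFrequencyTimerTicks are hypothetical names you would implement for your hardware.

```c
/* FreeRTOSConfig.h fragment -- a sketch, not a drop-in configuration.
   vSetupHighFrequencyTimer() and ulHighFrequencyTimerTicks must be
   provided by your application for your specific timer peripheral. */
#define configGENERATE_RUN_TIME_STATS           1
#define configUSE_STATS_FORMATTING_FUNCTIONS    1

/* Start a free-running counter ~10-100x faster than the tick. */
#define portCONFIGURE_TIMER_FOR_RUN_TIME_STATS()  vSetupHighFrequencyTimer()

/* Return the current value of that counter. */
#define portGET_RUN_TIME_COUNTER_VALUE()          ulHighFrequencyTimerTicks
```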
I presume this is being called from the tick hook, in which case you cannot use xTaskGetTickCount() because it uses a critical section. Create a version without the critical section, or just read the tick count variable directly.
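If it is the tick hook, the interrupt-safe variant xTaskGetTickCountFromISR() avoids the critical section problem without modifying the kernel. A minimal sketch, assuming configUSE_TICK_HOOK is set to 1 and that the per-task accounting itself lives elsewhere:

```c
/* Sketch of an interrupt-safe tick hook; the accounting is left out. */
#include "FreeRTOS.h"
#include "task.h"

void vApplicationTickHook( void )
{
    /* xTaskGetTickCount() enters a critical section and must not be
       called from the tick interrupt. xTaskGetTickCountFromISR() is
       the interrupt-safe variant. */
    TickType_t xTicks = xTaskGetTickCountFromISR();
    ( void ) xTicks;  /* e.g. accumulate per-task run time here */
}
```

Reading the kernel's tick count variable directly instead would require exposing it, since it is declared static inside tasks.c.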