{"id":4345,"date":"2018-11-28T20:00:09","date_gmt":"2018-11-28T19:00:09","guid":{"rendered":"https:\/\/msalamon.pl\/?p=4345"},"modified":"2025-12-27T20:06:06","modified_gmt":"2025-12-27T19:06:06","slug":"how-much-does-using-floats-cost-and-what-does-the-fpu-offer","status":"publish","type":"post","link":"https:\/\/msalamon.pl\/en\/how-much-does-using-floats-cost-and-what-does-the-fpu-offer\/","title":{"rendered":"How much does using floats cost, and what does the FPU offer?"},"content":{"rendered":"\n<p>Have you ever seen on some forum or social media group how \u201chigh-ranking\u201d programmers forbid using float? Have you noticed that none of them explains why? Just because! That\u2019s it. Why is using float for some people as bad as killing small animals? Let\u2019s find out!<\/p>\n\n\n\n<!--more-->\n\n\n\n<p>This post is based on ST\u2019s application note <a data-e-disable-page-transition=\"true\" class=\"download-link\" title=\"\" href=\"https:\/\/msalamon.pl\/download\/486\/?tmstv=1766861370\" rel=\"nofollow\" id=\"download-link-486\" data-redirect=\"false\">AN4044 &#8211; Floating point unit demonstration on STM32 microcontrollers.pdf<\/a>. 
For some, this document may be incomprehensible, so let me shed some light on floats for STM32 and beyond.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Float representation<\/h1>\n\n\n\n<p>First, it\u2019s worth saying how floating-point numbers are stored by the MCU. As we all know, a bit has only two states and you can\u2019t insert any fractional number between <em>10<\/em> and <em>11<\/em>. So how does it work? There is the IEEE 754 arithmetic standard that defines the encoding and basic operations on <em>floats<\/em>. The most commonly used floating-point numbers are single and double precision \u2013 <em>single<\/em> and <em>double<\/em>. 
Their encoding looks like this:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><a href=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"338\" src=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation-1024x338.jpg\" alt=\"\" class=\"wp-image-488\" srcset=\"https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation-1024x338.jpg 1024w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation-300x99.jpg 300w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation-768x254.jpg 768w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation-24x8.jpg 24w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation-36x12.jpg 36w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation-160x53.jpg 160w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_single_and_double_representation.jpg 1080w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n<\/div>\n\n\n<p><\/p>\n\n\n\n<p>What does this mean?<\/p>\n\n\n\n<p>s \u2013 the sign bit. It decides whether the number is positive or negative. 0 \u2013 positive, 1 \u2013 negative.<\/p>\n\n\n\n<p>e \u2013 8- or 11-bit exponent. The integer to which we raise the base of the number system \u2013 in our case the number 2 because we encode in binary. You must add the so-called <em>bias<\/em> to the exponent, equal to 127 for <em>single<\/em> and 1023 for <em>double<\/em>.<\/p>\n\n\n\n<p>f \u2013 23- or 52-bit mantissa. The number from which we obtain the fraction. 
It\u2019s a normalized number, more on that in a moment.<\/p>\n\n\n\n<p>The formula for a floating-point number therefore looks as follows:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><a href=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_equation_.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"278\" height=\"35\" src=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_equation_.jpg\" alt=\"\" class=\"wp-image-490\" srcset=\"https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_equation_.jpg 278w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_equation_-24x3.jpg 24w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_equation_-36x5.jpg 36w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/float_equation_-160x20.jpg 160w\" sizes=\"auto, (max-width: 278px) 100vw, 278px\" \/><\/a><\/figure>\n<\/div>\n\n\n<p><\/p>\n\n\n\n<p>Looks complicated, right? Let\u2019s try to convert some fraction to the float representation. Let it be the number 243.45.<\/p>\n\n\n\n<p>First, we should handle the integer 243, which is quite simple. The result is <em>11110011<\/em>. It\u2019s worse with the manual conversion of 0.45. I recommend watching a video that explains how to do it (<a href=\"https:\/\/www.youtube.com\/watch?v=kLLF1qAKoFI\" target=\"_blank\" rel=\"noopener\">link<\/a>). 0.45 in binary representation is <em>011100(1100\u2026)<\/em>.<\/p>\n\n\n\n<p>So 243.45 = <em>11110011.011100(1100\u2026)<\/em><\/p>\n\n\n\n<p>Now we need to normalize this number. A normalized number is one that lies in the right-open interval [1, B) where B is the base of the number\u2019s encoding. In our case, the encoding is binary, so the point must be after the first one. Count how many places you shifted the point. 
This is needed for encoding the exponent.<\/p>\n\n\n\n<p>After normalization, our number is <em>1.1110011011100(1100\u2026) x 10^7<\/em> (the base is written in binary here, so 10^7 means 2^7).<\/p>\n\n\n\n<p>Because the integer part is always 1, there is no need to store it. A <em>float<\/em> stores only the fractional part: this is the mantissa, shortened to 23 or 52 bits.<\/p>\n\n\n\n<p><strong>f = 11100110111001100110011<\/strong><\/p>\n\n\n\n<p>Now the exponent. I shifted the point 7 places to the left, so it is 7. Adding the <em>bias<\/em> for the 32-bit representation gives 134, which in binary is <em>10000110<\/em>.<\/p>\n\n\n\n<p><strong>e = 10000110<\/strong><\/p>\n\n\n\n<p>The number is positive, so the sign is zero.<\/p>\n\n\n\n<p><strong>s = 0<\/strong><\/p>\n\n\n\n<p>Now we can assemble our float.<\/p>\n\n\n\n<p><strong>243.45 (dec) = 0 10000110&nbsp;11100110111001100110011 (float)<\/strong><\/p>\n\n\n\n<p>Simple, isn\u2019t it? \ud83d\ude42<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Operations on floating point<\/h2>\n\n\n\n<p>The IEEE 754 standard also defines floating-point arithmetic. It contains 6 operations on numbers:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Addition<\/li>\n\n\n\n<li>Subtraction<\/li>\n\n\n\n<li>Multiplication<\/li>\n\n\n\n<li>Division<\/li>\n\n\n\n<li>Remainder (modulo)<\/li>\n\n\n\n<li>Square root<\/li>\n<\/ul>\n\n\n\n<p>Let\u2019s try something easy, e.g., let\u2019s add 188.1 and 182.69. In your head you can quickly compute that it will be 370.79, but will float give the same result?<\/p>\n\n\n\n<p><strong>188.1<\/strong><\/p>\n\n\n\n<p>188 in binary is <strong>10111100<\/strong> while 0.1 is equivalent to <strong>0(0011\u2026)<\/strong>. Do you see the small danger related to the infinite expansion? If not, don\u2019t worry, it will become visible in a moment. Combining the two components, I get <strong>10111100.0001100110011(0011)<\/strong> and adapting it to the float standard <strong>1.01111000001100110011001|10011 x 10^7<\/strong>. 
What\u2019s after the vertical bar does not fit in float32 and is lost forever. <strong>We have the first loss of information<\/strong>. (For simplicity I just cut the extra bits off; IEEE 754 hardware rounds to nearest by default, so its last bit can differ, but information is lost either way.) The exponent is 7, so after adding the bias we get 134.<\/p>\n\n\n\n<p><strong>188.1<\/strong>(dec)<strong> = 0 10000110 01111000001100110011001<\/strong>(float)<\/p>\n\n\n\n<p>Now <strong>182.69<\/strong><\/p>\n\n\n\n<p>182.69 = <strong>10110110.10110000101000111101 = 1.01101101011000010100011|1101 x 10^7<\/strong><\/p>\n\n\n\n<p>Again, we lose information about the fractional expansion. Converting to the IEEE 754 representation:<\/p>\n\n\n\n<p><strong>182.69<\/strong>(dec) <strong>= 0 10000110 01101101011000010100011<\/strong>(float)<\/p>\n\n\n\n<p>The first thing to do when adding floating-point numbers is to equalize their exponents. In my example the exponents are already the same, so we can skip this step. The next step is to add the mantissas, keeping in mind the one before the point, which is not part of the float encoding. I will skip the manual binary addition process for readability.<\/p>\n\n\n\n<p>1.01111000001100110011001 x 10^7 + 1.01101101011000010100011 x 10^7 = <strong>10.11100101100101000111100 x 10^7<\/strong><\/p>\n\n\n\n<p>The result must be normalized: <strong>1.01110010110010100011110|0 x 10^8<\/strong><\/p>\n\n\n\n<p>In float encoding our result looks like this: <strong>0 10000111&nbsp;01110010110010100011110<\/strong><\/p>\n\n\n\n<p>We should decode this result. The easiest way is to start from the normalized result, truncated to the size of the float mantissa, because that is what actually gets stored.<\/p>\n\n\n\n<p>1.01110010110010100011110 x 10^8 = <strong>101110010.110010100011110<\/strong><\/p>\n\n\n\n<p>I split this into an integer and fractional part:<\/p>\n\n\n\n<p>101110010 = <strong>370<\/strong> \u2013 success. Now the fraction. 
You can find the method of converting a binary fraction to decimal form at this <a href=\"https:\/\/www.youtube.com\/watch?v=iHAKLh7MP_Y\" target=\"_blank\" rel=\"noopener\">link<\/a>.<\/p>\n\n\n\n<p>0.110010100011110 = <strong>0.78997802734375<\/strong><\/p>\n\n\n\n<p>So? It\u2019s a bit less than 0.79. And here lies one of the dangers associated with float. With a small number of operations and low precision requirements, e.g., one or two decimal places, it doesn\u2019t matter that much, but imagine an MCU performing hundreds or thousands of such operations, each introducing a small error like this.<\/p>\n\n\n\n<p>For this reason remember: <strong>never use the == and != operators to compare floating-point numbers.<\/strong> Compare against a small tolerance instead, usually called delta or epsilon. For example: if( fabsf(expected - result) &lt;= 0.01f )\u2026 Note that it has to be fabsf (from math.h), not the integer abs, which would truncate the difference to a whole number.<\/p>\n\n\n\n<p>Why is that?<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Precision<\/h2>\n\n\n\n<p>The result of addition in the example above is a consequence of the precision of numbers stored in a float variable. Do you think you can store all fractions with them? Well, no. 
I prepared in Octave a plot with marked <em>floats<\/em> for an 8-bit mantissa and an exponent from -5 to 5.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><a href=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"326\" src=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe-1024x326.jpg\" alt=\"\" class=\"wp-image-491\" srcset=\"https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe-1024x326.jpg 1024w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe-300x96.jpg 300w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe-768x245.jpg 768w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe-24x8.jpg 24w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe-36x11.jpg 36w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe-160x51.jpg 160w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/floats_on_axe.jpg 1344w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n<\/div>\n\n\n<p><\/p>\n\n\n\n<p>Each circle represents one number. What stands out? What about all the numbers between 0.5 and 1? According to these parameters there are only 4 of them. The rest don\u2019t exist. Of course, with a mantissa and exponent consistent with the IEEE 754 standard there will be more of them, but it still won\u2019t cover all possible numbers.<\/p>\n\n\n\n<p>Also notice that the further from zero, the sparser the points. Do you guess what that means? <strong>Operations on large numbers produce even greater errors<\/strong> resulting from these limitations.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Computational complexity and FPU<\/h2>\n\n\n\n<p>Adding floats seems quite simple, but it includes several operations such as extracting the exponents, equalizing them, adding the mantissas, and transforming the result back into IEEE 754 form. 
It\u2019s similar for subtraction, multiplication, and the rest of the operations. Compared to integer arithmetic, this is considerable overhead for the processor: without hardware support, a single floating-point operation costs the MCU dozens of ordinary integer instructions.<\/p>\n\n\n\n<p>Some microcontrollers are equipped with an additional unit dedicated to floating-point computations. This is the FPU \u2013 Floating Point Unit. It is included in, among others, MCUs based on the Cortex-M4 and Cortex-M7 cores, e.g., the STM32 F4, L4, F7, and H7 series. This unit literally does miracles with floats. In ST\u2019s microcontrollers it provides several hardware operations on single-precision floats:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><a href=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/FPU_cycles.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"682\" height=\"260\" src=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/FPU_cycles.jpg\" alt=\"\" class=\"wp-image-492\" srcset=\"https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/FPU_cycles.jpg 682w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/FPU_cycles-300x114.jpg 300w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/FPU_cycles-24x9.jpg 24w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/FPU_cycles-36x14.jpg 36w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/FPU_cycles-160x61.jpg 160w\" sizes=\"auto, (max-width: 682px) 100vw, 682px\" \/><\/a><\/figure>\n<\/div>\n\n\n<p><\/p>\n\n\n\n<p>As you can see, absolute value, addition, subtraction, and multiplication are each performed <strong>in just one clock cycle<\/strong>. All those operations I did by hand, the FPU does with a snap of a digital finger. Division or square root takes only 14 cycles. The table also shows that type conversions take only one cycle. Magic? 
Kind of \ud83d\ude42<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">So how much does it take then?<\/h2>\n\n\n\n<p>That\u2019s the promise, so now let\u2019s check the most important operations using simulation in the <em>Keil \u00b5Vision 5<\/em> IDE, targeting the STM32F401RE that I have on one of my Nucleo boards. The HAL library I\u2019ll use is version 1.21. In the project settings I enabled the simulator and turned optimization off. In the code, right after initializing HAL and the clocks, I disable SysTick so it does not interfere with the measurements. In the main loop I wrote simple code that performs basic operations on numbers. First I\u2019ll test operations on uint32_t, then on float with the FPU disabled and enabled, and finally I\u2019ll compare them. After each for loop I set a breakpoint and read the clock cycle counter in the simulator. To make the differences easier to see, I perform each operation 100 times. Unfortunately the for loops add some cycles of their own, but the overhead is the same for every variant, so it won\u2019t significantly distort the comparison. The FPU is enabled or disabled in the project settings. 
In Eclipse (or SW4STM32) this will be under <strong>Project &gt; Properties &gt; C\/C++ Build &gt; Settings &gt; MCU Settings<\/strong> in the dropdown named Floating point hardware.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"c\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">\/\/ sqrt, sqrtf and fmod come from math.h\nuint32_t a = 6754;\nuint32_t b = 1267;\nuint32_t result;\n\n\/\/ Float operands: used by the conversion loops and by the float variants of the tests\nfloat a_f = 12.67;\nfloat b_f = 6.754;\nfloat result_f;\n\nwhile (1)\n{\n\tuint16_t i;\n\t\/\/ In each loop below, keep either the uint32_t line or the float line active\n\t\/\/ Add\n\tfor(i=0; i&lt;100; i++)\n\t{\n\t\tresult = a+b;\n\t\t\/\/result_f = a_f+b_f;\n\t}\n\n\t\/\/ Subtract\n\tfor(i=0; i&lt;100; i++)\n\t{\n\t\tresult = a-b;\n\t\t\/\/result_f = a_f-b_f;\n\t}\n\n\t\/\/ Multiply\n\tfor(i=0; i&lt;100; i++)\n\t{\n\t\tresult = a*b;\n\t\t\/\/result_f = a_f*b_f;\n\t}\n\n\t\/\/ Divide\n\tfor(i=0; i&lt;100; i++)\n\t{\n\t\tresult = a\/b;\n\t\t\/\/result_f = a_f\/b_f;\n\t}\n\n\t\/\/ Modulo\n\tfor(i=0; i&lt;100; i++)\n\t{\n\t\tresult = a%b;\n\t\t\/\/result_f = fmod(a_f,b_f);\n\t}\n\n\t\/\/ Square root\n\tfor(i=0; i&lt;100; i++)\n\t{\n\t\tresult = sqrt(a);\n\t\t\/\/result_f = sqrtf(a_f);\n\t}\n\n\t\/\/ int to float\n\tfor(i=0; i&lt;100; i++)\n\t{\n\t\tresult_f = (float)a;\n\t}\n\n\t\/\/ float to int\n\tfor(i=0; i&lt;100; i++)\n\t{\n\t\tresult = (uint32_t)a_f;\n\t}\n\n\tresult = result+1; \/\/ use the variable to silence the compiler warning\n\tresult_f = result_f+1; \/\/ use the variable to silence the compiler warning\n\t\/* USER CODE END WHILE *\/\n\n\t\/* USER CODE BEGIN 3 *\/\n}<\/pre>\n\n\n\n<p>Are you curious about the results? 
They\u2019re interesting.<\/p>\n\n\n\n<p>First up is the size of the resulting code.<\/p>\n\n\n\n<figure class=\"wp-block-table tablepress tablepress-id-7\"><table class=\"has-fixed-layout\"><thead><tr><th>Variable type<\/th><th>Size in bytes<\/th><\/tr><\/thead><tbody><tr><td>uint32_t<\/td><td>3594<\/td><\/tr><tr><td>float without FPU<\/td><td>5032<\/td><\/tr><tr><td>float with FPU<\/td><td>4684<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><a href=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Code_size.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"856\" height=\"446\" src=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Code_size.jpg\" alt=\"\" class=\"wp-image-497\" srcset=\"https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Code_size.jpg 856w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Code_size-300x156.jpg 300w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Code_size-768x400.jpg 768w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Code_size-24x13.jpg 24w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Code_size-36x19.jpg 36w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Code_size-154x80.jpg 154w\" sizes=\"auto, (max-width: 856px) 100vw, 856px\" \/><\/a><\/figure>\n<\/div>\n\n\n<p><\/p>\n\n\n\n<p>The amount of additional code resulting from operations on float increases by just under 1.5 kB. Is that a lot? For small AVR microcontrollers, definitely yes. STM32s have quite a lot of Flash memory and I believe this is not a big problem.<\/p>\n\n\n\n<p>Now the most interesting part: operations on numbers. 
The numbers in the table are the number of clock cycles needed to perform each operation one hundred times.<\/p>\n\n\n\n<figure class=\"wp-block-table tablepress tablepress-id-8\"><table class=\"has-fixed-layout\"><thead><tr><th>Variable type<\/th><th>+<\/th><th>&#8211;<\/th><th>*<\/th><th>\/<\/th><th>%<\/th><th>sqrt<\/th><\/tr><\/thead><tbody><tr><td>uint32_t<\/td><td>706<\/td><td>706<\/td><td>706<\/td><td>1106<\/td><td>1306<\/td><td>211806<\/td><\/tr><tr><td>float without FPU<\/td><td>6706<\/td><td>9706<\/td><td>6306<\/td><td>27906<\/td><td>47106<\/td><td>204306<\/td><\/tr><tr><td>float + FPU<\/td><td>806<\/td><td>806<\/td><td>806<\/td><td>2106<\/td><td>55306<\/td><td>4806<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Interesting, isn\u2019t it? A combined chart is very unreadable due to the huge values for square root, so I\u2019ll split the results.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><a href=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"470\" src=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks-1024x470.jpg\" alt=\"\" class=\"wp-image-500\" srcset=\"https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks-1024x470.jpg 1024w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks-300x138.jpg 300w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks-768x353.jpg 768w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks-24x11.jpg 24w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks-36x17.jpg 36w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks-160x73.jpg 160w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Operations_ticks.jpg 1180w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n<\/div>\n\n\n<p><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure 
class=\"aligncenter\"><a href=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"470\" src=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks-1024x470.jpg\" alt=\"\" class=\"wp-image-501\" srcset=\"https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks-1024x470.jpg 1024w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks-300x138.jpg 300w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks-768x353.jpg 768w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks-24x11.jpg 24w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks-36x17.jpg 36w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks-160x73.jpg 160w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Sqrt_ticks.jpg 1180w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n<\/div>\n\n\n<p><\/p>\n\n\n\n<p>Do you see now why many people on forums get upset when someone uses floats unnecessarily? For now, let\u2019s compare the results of uint32_t vs float without FPU. Not every microcontroller has an FPU. This matters all the more on the popular Arduino, where ready-made libraries use float wherever possible. Without an FPU, float computation is slower than uint32_t by:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>~9.5 times for addition<\/li>\n\n\n\n<li>~13.75 times for subtraction<\/li>\n\n\n\n<li>~8.93 times for multiplication<\/li>\n\n\n\n<li><strong>~25.23<\/strong> times for division!<\/li>\n\n\n\n<li><strong>~36<\/strong> times for modulo!<\/li>\n<\/ul>\n\n\n\n<p>Interestingly, calculating the square root takes a similar number of clock cycles in both cases. I dug a bit into the M4 core documentation. The core has no integer square-root instruction, which probably explains the overhead. The library must handle it using basic operations for both uint32_t and float. Now look at square root using the FPU. Impressive, isn\u2019t it? 
The floating-point unit already has instructions dedicated to this operation. It can perform it lightning fast (~42 times faster) compared to uint32_t and float without FPU.<\/p>\n\n\n\n<p>What did enabling the FPU give?<\/p>\n\n\n\n<p>The number of CPU cycles needed for float calculations almost equalized with those for uint32_t. The exception is the modulo operation. This is for a simple reason. The FPU doesn\u2019t support modulo division, hence the library requires computational overhead using other operations such as addition, subtraction, multiplication, and division. Here we won\u2019t gain anything and, as the chart shows \u2013 we will even lose. I can\u2019t explain why float modulo with FPU enabled needed even more clock cycles than software-only support. I didn\u2019t delve into the CMSIS libraries and CPU instructions. Maybe one of the readers has more knowledge in this area and will share it in the comments?<\/p>\n\n\n\n<p>There\u2019s also type conversion left.<\/p>\n\n\n\n<figure class=\"wp-block-table tablepress tablepress-id-9\"><table class=\"has-fixed-layout\"><thead><tr><th>Operation type<\/th><th>Without FPU<\/th><th>FPU<\/th><\/tr><\/thead><tbody><tr><td>uint32_t to float<\/td><td>4606<\/td><td>1006<\/td><\/tr><tr><td>float to uint32_t<\/td><td>2706<\/td><td>906<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><a href=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Type_conversion_ticks.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"820\" height=\"492\" src=\"http:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Type_conversion_ticks.jpg\" alt=\"\" class=\"wp-image-506\" srcset=\"https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Type_conversion_ticks.jpg 820w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Type_conversion_ticks-300x180.jpg 300w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Type_conversion_ticks-768x461.jpg 768w, 
https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Type_conversion_ticks-24x14.jpg 24w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Type_conversion_ticks-36x22.jpg 36w, https:\/\/msalamon.pl\/wp-content\/uploads\/2018\/11\/Type_conversion_ticks-133x80.jpg 133w\" sizes=\"auto, (max-width: 820px) 100vw, 820px\" \/><\/a><\/figure>\n<\/div>\n\n\n<p><\/p>\n\n\n\n<p>The gain from using the FPU for type conversions is indisputable and needs no comment. For square roots, with the FPU enabled it pays to convert the integer to float, compute, and convert back, provided the precision of the float representation is sufficient for the application.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How to live?<\/h2>\n\n\n\n<p>When writing programs, it\u2019s worth considering whether floating-point numbers are really indispensable. A common claim, when a beginner\u2019s small program is being discussed, is that \u201cit doesn\u2019t matter here\u201d. The discussion sometimes grows to the scale of \u201cPC vs console\u201d. My opinion is that float should be avoided wherever possible. You never know when a bulky float-based library you wrote will end up being called hundreds of times per second. Then the MCU may choke. Of course, there are applications where floating-point numbers are indispensable; there you should at least be aware of what frequent float use costs.<\/p>\n\n\n\n<p>You can often manage without float. Example: <strong>a \/= 2.55 is equivalent, up to integer rounding, to a = (a * 100)\/255 performed on uints<\/strong>. The intermediate product a * 100 may not fit in 8 bits, so sometimes you have to compute on 16-, 32-, or even 64-bit integers. It will still be much faster. The only exception is numbers so large that even the 64-bit range is too small.<\/p>\n\n\n\n<p>I would avoid modulo division on floats. Fortunately, it\u2019s rarely used. I myself have never used this type of operation. 
Can anyone give a practical use case?<\/p>\n\n\n\n<p>You can, however, gain on square roots with floats once the FPU is enabled. Type conversion then takes one cycle, and square root a dozen or so. This will be decidedly faster than computing the square root with library code on uints.<\/p>\n\n\n\n<p>And what is your opinion?<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>I hope you now understand at least a bit why using floats on microcontrollers evokes so many emotions. The topic is interesting and I hope I\u2019ve inspired you to read more, e.g., about float division, <em>overflow<\/em> or <em>underflow<\/em>, or other floating-point pitfalls. If you\u2019d like to learn more, I recommend for example the presentation <a href=\"http:\/\/asawicki.info\/Download\/Productions\/Lectures\/Adam%20Sawicki%20-%20Pulapki%20liczb%20zmiennoprzecinkowych.pdf\" target=\"_blank\" rel=\"noopener\">Pu\u0142apki liczb zmiennoprzecinkowych<\/a> where the author described more topics related to floats. There are many great publications on the Internet on this topic.<\/p>\n\n\n\n<p>Thank you for reading this post. If you like this kind of content, let me know in the comments. I will also be grateful for topic suggestions you would like me to cover.<\/p>\n\n\n\n<p><span>If you noticed any mistake, disagree with something, would like to add something important, or just feel like discussing this topic, write a comment. 
Remember that the discussion should be polite and in accordance with the rules of the Polish language.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Have you ever seen on some forum or social media group how \u201chigh-ranking\u201d programmers forbid using float? Have you noticed that none of them explains why? Just because! That\u2019s it. Why is using float for some people as bad as killing small animals? 
Let\u2019s find out!<\/p>\n","protected":false},"author":1,"featured_media":3041,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","_links_to":"","_links_to_target":""},"categories":[160],"tags":[176,174],"class_list":["post-4345","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-stm32","tag-programming","tag-stm32"],"_links":{"self":[{"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/posts\/4345","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/comments?post=4345"}],"version-history":[{"count":3,"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/posts\/4345\/revisions"}],"predecessor-version":[{"id":4454,"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/posts\/4345\/revisions\/4454"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/media\/3041"}],"wp:attachment":[{"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/media?parent=4345"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/categories?post=4345"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/msalamon.pl\/en\/wp-json\/wp\/v2\/tags?post=4345"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}