{"id":4327,"date":"2018-10-09T22:33:08","date_gmt":"2018-10-09T20:33:08","guid":{"rendered":"https:\/\/engel-wolf.com\/?p=4327"},"modified":"2019-10-26T23:13:14","modified_gmt":"2019-10-26T21:13:14","slug":"why-do-we-need-human-readable-tests-for-a-programming-language-why-do-we-need-human-readable-tests-for-a-programming-language","status":"publish","type":"post","link":"https:\/\/engel-wolf.com\/?p=4327","title":{"rendered":"Why do we need human readable tests for a programming language?"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\" id=\"aa75\"><em>Software can save lives!<\/em> The <em>R<\/em> and <em>Python<\/em> programming languages rank in the&nbsp;<a href=\"https:\/\/stackify.com\/popular-programming-languages-2018\/\" rel=\"noreferrer noopener\" target=\"_blank\">top 10 programming languages<\/a>&nbsp;today. Both languages come out of open-source and research environments and are now moving into industry. Testing software is essential in industry. So why talk about human-readable&nbsp;tests?<\/h3>\n\n\n\n<p>Let me introduce you to my working environment. I write code in&nbsp;<em>R<\/em> every day. From this code we build software that is used in projects every day. A lot of people agree on the&nbsp;<a href=\"https:\/\/www.amazon.de\/gp\/product\/B07DGLPGZN\/ref=as_li_tl?ie=UTF8&amp;camp=1638&amp;creative=6742&amp;creativeASIN=B07DGLPGZN&amp;linkCode=as2&amp;tag=zappingseb-21&amp;linkId=153c0bb6ba370bc1de87c825dd7edb31\" rel=\"noreferrer noopener\" target=\"_blank\">fact that software should be tested<\/a>, even tested thoroughly until 100% code coverage is reached. If you do not agree, you can stop reading now.<\/p>\n\n\n\n<p>In addition, my work directly influences people\u2019s lives: not only how they live, but whether they stay alive. I write code in a clinical environment. 
So an untested feature in my software could produce an outcome that your doctor misinterprets, which can cause you harm because the doctor makes the wrong treatment decision. The same goes for whoever coded the microcontroller software for your car\u2019s steering. If you steer left because you do not want to hit a wall, you do not want your car to steer right and leave you flat and dead. Such software must be tested thoroughly, too.<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"cb7b\">The clinical environment and regulatory authorities<\/h3>\n\n\n\n<p>What is special about software in a clinical environment is that every application has to be checked, whether it is a medical device, a drug itself, or even just the process used to produce that drug. This checking process is governed by the state. If you are taking a drug or your blood gets analyzed by a medical device, you want authorities to make sure that it saves your life or at least makes you healthier. If you\u2019re in the U.S., the authority is called the&nbsp;<a href=\"http:\/\/www.fda.org\" rel=\"noreferrer noopener\" target=\"_blank\">FDA<\/a>&nbsp;and will do that for you. If you live in Europe, find out which of our 26 different authorities is responsible for you: in Germany it\u2019s the&nbsp;<a href=\"https:\/\/www.tuev-sued.de\/plants-buildings-technical-facilities\/fields-of-engineering\/cleanroom-technology\/pharma-life-sciences\" rel=\"noreferrer noopener\" target=\"_blank\">T\u00dcV<\/a>, in France it\u2019s the&nbsp;<a href=\"https:\/\/ansm.sante.fr\/\" rel=\"noreferrer noopener\" target=\"_blank\">ANSM<\/a>.<\/p>\n\n\n\n<p>Let\u2019s narrow the scope a bit. I would like to talk about a classic example from clinical applications: a urine test strip. Everybody has seen such a thing. It can test, for example, for sugar in your urine. 
If there is sugar in your urine, you should be checked for diabetes. So the authority has to make sure that the whole process around the test strip improves your diagnosis. It makes sure the test strip can tell you whether you should be examined further for diabetes. If you have diabetes, the test strip should at least give a hint.<\/p>\n\n\n\n<p>Let\u2019s make the scope smaller still. To evaluate the test strip after it has been dipped into your urine, a doctor inserts it into a test strip reader. The reader gives the result of your test as a number. How does it do that? Software evaluates the sensor measurements according to a specific algorithm and prints the test outcome to a display. Regulatory authorities must be able to check that:<\/p>\n\n\n\n<ol><li>the algorithm is correct.<\/li><li>the algorithm was implemented correctly.<\/li><li>the software takes the right input from the right sensor.<\/li><li>the display of the device receives the right output from the software.<\/li><\/ol>\n\n\n\n<p>Step 1 basically requires good documentation of the software and the algorithm. This is a different topic. Steps 2\u20134 can be covered by software testing, ideally automated testing. Let\u2019s assume the sensor and the display work fine and have been tested already.<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"81ce\">The test&nbsp;case<\/h3>\n\n\n\n<p>The device I just made up now needs to be tested. 
It consists of a sensor and a display, and in between sits a chip that runs the software.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"416\" height=\"150\" src=\"https:\/\/engel-wolf.com\/wp-content\/uploads\/1_hKUKyd9KDomgw_trcbliMA.png\" alt=\"\" class=\"wp-image-4329\" srcset=\"https:\/\/engel-wolf.com\/wp-content\/uploads\/1_hKUKyd9KDomgw_trcbliMA.png 416w, https:\/\/engel-wolf.com\/wp-content\/uploads\/1_hKUKyd9KDomgw_trcbliMA-300x108.png 300w\" sizes=\"(max-width: 416px) 100vw, 416px\" \/><\/figure><\/div>\n\n\n\n<p>We now need a set of sensor values, each paired with the value that should be displayed on the screen of the device:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Sensor Value    Display Value<br>3               12<br>5               15<br>5.5             15<br>8               24<br>1               <em>too low to evaluate<\/em><br>3               12<br>2.9             12<br>24.2            <em>too high to evaluate<\/em><\/pre>\n\n\n\n<p>The algorithm might not be clear to you yet, but suppose a detailed description is available:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>The algorithm of&nbsp;<em>device123<\/em>&nbsp;shall evaluate values smaller than 2 as \u201ctoo low to evaluate\u201d, values smaller than 5 as \u201c12\u201d, values smaller than 7 as \u201c15\u201d, values smaller than 15 as \u201c24\u201d and values above 15 as \u201ctoo high to evaluate\u201d.<\/p><\/blockquote>\n\n\n\n<p>When approving&nbsp;<em>device123<\/em>, the task of the regulatory authority is not to test the algorithm itself. Their job is to check that the manufacturer of the device tested the algorithm and its software implementation. 
Therefore the following two things have to exist:<\/p>\n\n\n\n<ol><li>Test cases<\/li><li>A test report stating how the test cases were evaluated<\/li><\/ol>\n\n\n\n<p>Test cases in the programming language&nbsp;<em>R<\/em>&nbsp;can be written with the packages&nbsp;<a href=\"https:\/\/cran.rstudio.com\/web\/packages\/RUnit\/index.html\" rel=\"noreferrer noopener\" target=\"_blank\"><em>RUnit<\/em><\/a>&nbsp;or&nbsp;<a href=\"https:\/\/cran.r-project.org\/web\/packages\/testthat\/index.html\" rel=\"noreferrer noopener\" target=\"_blank\"><em>testthat<\/em><\/a>. Both allow developers and testers to check the software. In&nbsp;<a href=\"https:\/\/cran.r-project.org\/web\/packages\/testthat\/index.html\" rel=\"noreferrer noopener\" target=\"_blank\"><em>testthat<\/em><\/a>, for example, the test cases from the table above could look like this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code><a href=\"http:\/\/testthat.r-lib.org\/reference\/test_that.html\" rel=\"noreferrer noopener\" target=\"_blank\">test_that<\/a>(\"1 is interpreted correctly\", {<br>  <a href=\"http:\/\/testthat.r-lib.org\/reference\/expect_success.html\" rel=\"noreferrer noopener\" target=\"_blank\">expect_equal<\/a>(device123(sensor=1),\"<\/code><em>too low to evaluate<\/em><code>\")<br>})<\/code><\/pre>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\ntest_that(&quot;8 is interpreted correctly&quot;, {\n  expect_equal(device123(sensor=8), 24)\n})\n<\/pre><\/div>\n\n\n<pre class=\"wp-block-preformatted\">...<\/pre>\n\n\n\n<p>For people who read&nbsp;<em>R<\/em>&nbsp;code every day, this seems great. The&nbsp;<a href=\"https:\/\/cran.r-project.org\/web\/packages\/testthat\/index.html\" rel=\"noreferrer noopener\" target=\"_blank\"><em>testthat<\/em><\/a>&nbsp;package will tell you whether your function <code>device123<\/code>, when called with 1 or 8, returns the expected value. 
The only problem is that the default <em>testthat<\/em> output does not show, for each test, what the input was, what value was expected and whether the test passed. It just tells you how many tests were run and which ones failed. See this example from&nbsp;<a href=\"http:\/\/r-pkgs.had.co.nz\/tests.html\" rel=\"noreferrer noopener\" target=\"_blank\">Hadley Wickham<\/a>\u2019s book <em>R Packages<\/em>:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\nExpectation : ...........\nrv : ...\nVariance : .72....\n<\/pre><\/div>\n\n\n<p>Each line represents a test file. Each&nbsp;<code>.<\/code>&nbsp;represents a passed test. Each number represents a failed test. The numbers index into a list of failures that provides more details:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n1. Failure(@test-device123.R#5): 8 is interpreted correctly -----\ndevice123(8) not equal to 24\nMean relative difference: 3\n<\/pre><\/div>\n\n\n<p>From this you can infer that all tests ran, and you can check which ones passed. But the outcome is plain command-line output, readable only for people who are used to&nbsp;<em>R<\/em>.<\/p>\n\n\n\n<p>Let me come back to you as a&nbsp;<strong>patient<\/strong>. You pay the regulatory authority with your taxes. Do you expect the person at the regulatory authority to know&nbsp;<em>R<\/em>&nbsp;or to understand the cryptic output it produces? 
Do you really expect to walk into your doctor\u2019s office and see a \u201cproof\u201d sign on each device, telling you that someone at the authority looked into the code of this device?<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"alignleft\"><img loading=\"lazy\" decoding=\"async\" width=\"373\" height=\"543\" src=\"https:\/\/engel-wolf.com\/wp-content\/uploads\/1_lkOytZZeivWW6250zmJ6Cw.png\" alt=\"\" class=\"wp-image-4330\" srcset=\"https:\/\/engel-wolf.com\/wp-content\/uploads\/1_lkOytZZeivWW6250zmJ6Cw.png 373w, https:\/\/engel-wolf.com\/wp-content\/uploads\/1_lkOytZZeivWW6250zmJ6Cw-206x300.png 206w\" sizes=\"(max-width: 373px) 100vw, 373px\" \/><figcaption> Icons from&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/www.flaticon.com\/authors\/ddara\" target=\"_blank\">dDara<\/a>,&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/www.flaticon.com\/authors\/monkik\" target=\"_blank\">monkik<\/a>,&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/www.flaticon.com\/authors\/prosymbols\" target=\"_blank\">prosymbols<\/a><br><br> <\/figcaption><\/figure><\/div>\n\n\n\n<p>My answer is&nbsp;<strong>no<\/strong>. I want the regulatory authority to focus on the values the device gives to the doctor, and maybe on the chemistry of the test strip. The software should simply work and do the job it is described to do. If it is well documented, it should behave as documented. The responsibility for the test cases therefore lies with the company writing the software. This company has to show the authority that the software was tested. The authority just has to make sure that this process was valid.<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"4c8b\">How do we allow regulatory authorities to understand test&nbsp;cases?<\/h3>\n\n\n\n<p>We now know that automated software testing lets us check whether&nbsp;<em>device123<\/em>&nbsp;implements the right algorithm. 
The major problem is that someone has to read the code, read the tests and check whether the tests were valid. Testing code with code does not seem to be the right way to demonstrate validity. A company will find it hard to tell the authority: \u201cLook, we tested code with code. We have a bunch of cryptic command-line outputs you can read that prove it.\u201d<\/p>\n\n\n\n<p>No, you want something nice.<\/p>\n\n\n\n<p>If you\u2019re a&nbsp;<em>.NET<\/em>&nbsp;developer, there is a really simple solution for this. It is called&nbsp;<a href=\"https:\/\/specflow.org\/getting-started\/\" rel=\"noreferrer noopener\" target=\"_blank\"><em>SpecFlow<\/em><\/a>. It lets you write human-readable test cases that are really easy to interpret:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Feature: Device123. We prepare a device that can use our algorithm to get a screen value out of a sensor value.<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">Scenario: Check number 1<br>Given the sensor measures <em>1 <\/em>in the device<br>Then the result should be \"<em>too low to evaluate<\/em>\" on the screen<\/pre>\n\n\n\n<pre class=\"wp-block-preformatted\">Scenario: Check number 8<br>Given the sensor measures 8 in the device<br>Then the result should be <em>24<\/em> on the screen<\/pre>\n\n\n\n<p>The outcome of those tests is presented in pretty reports:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"440\" src=\"https:\/\/engel-wolf.com\/wp-content\/uploads\/1_NknslRHfhqE_fPo71Soy8A.png\" alt=\"\" class=\"wp-image-4331\" srcset=\"https:\/\/engel-wolf.com\/wp-content\/uploads\/1_NknslRHfhqE_fPo71Soy8A.png 800w, https:\/\/engel-wolf.com\/wp-content\/uploads\/1_NknslRHfhqE_fPo71Soy8A-300x165.png 300w, https:\/\/engel-wolf.com\/wp-content\/uploads\/1_NknslRHfhqE_fPo71Soy8A-768x422.png 768w, https:\/\/engel-wolf.com\/wp-content\/uploads\/1_NknslRHfhqE_fPo71Soy8A-500x275.png 500w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><figcaption> Image 
taken from:&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/specflow.org\/getting-started\/\" target=\"_blank\">https:\/\/specflow.org\/getting-started\/<\/a><\/figcaption><\/figure>\n\n\n\n<p>But I\u2019m not a&nbsp;<em>.NET<\/em>&nbsp;developer, and using&nbsp;<em>SpecFlow<\/em>&nbsp;to write tests for&nbsp;<em>R<\/em>&nbsp;or&nbsp;<em>Python<\/em>&nbsp;is rather hard.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"f264\">A solution for human readable tests in&nbsp;R<\/h3>\n\n\n\n<p>Our team came up with a solution for human-readable tests called&nbsp;<a href=\"https:\/\/cran.r-project.org\/web\/packages\/RTest\/index.html\" rel=\"noreferrer noopener\" target=\"_blank\"><em>RTest<\/em><\/a>. It\u2019s an R package that lets you use XML files to test other R packages and produces reports in the form of documents. We know that XML is not as nice as a pseudo language, but I think it\u2019s a great way to start. Our XML file for the example would look like this:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">&lt;device.TestCase&gt;<br>&lt;ID&gt;Test Case1&lt;\/ID&gt;<br>&lt;synopsis&gt;<br>    &lt;author&gt;Sebastian Wolf&lt;\/author&gt;<br>    &lt;date&gt;2018-05-25&lt;\/date&gt;<br>    &lt;desc&gt;Test device123&lt;\/desc&gt;<br>&lt;\/synopsis&gt;<br>&lt;tests&gt;<br>    &lt;device test-desc=\"Test return value of input 1\"&gt;<br>     &lt;params&gt;&lt;sensor value=\"1\" type=\"numeric\"\/&gt;&lt;\/params&gt;<br>     &lt;reference&gt;<br>      &lt;variable value=\"<em>too low to evaluate<\/em>\" type=\"character\"\/&gt;<br>     &lt;\/reference&gt;<br>    &lt;\/device&gt;<br>    &lt;device test-desc=\"Test return value of input 8\"&gt;<br>     &lt;params&gt;&lt;sensor value=\"8\" type=\"numeric\"\/&gt;&lt;\/params&gt;<br>     &lt;reference&gt;<br>      &lt;variable value=\"24\" type=\"numeric\"\/&gt;<br>     &lt;\/reference&gt;<br>    &lt;\/device&gt;<br>&lt;\/tests&gt;<br>&lt;\/device.TestCase&gt;<\/pre>\n\n\n\n<p>This setup not only 
lets you define the numerical checks that test the algorithm inside the device but also lets you record some basic context information, such as who wrote the test and when. This information should of course be verified through source control and by a co-developer.<\/p>\n\n\n\n<p>The outcome using&nbsp;<em>RTest<\/em>&nbsp;would look similar to this:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"847\" src=\"https:\/\/engel-wolf.com\/wp-content\/uploads\/1_paJ26bNWiUxavC2kSKTiZQ.jpeg\" alt=\"\" class=\"wp-image-4332\" srcset=\"https:\/\/engel-wolf.com\/wp-content\/uploads\/1_paJ26bNWiUxavC2kSKTiZQ.jpeg 800w, https:\/\/engel-wolf.com\/wp-content\/uploads\/1_paJ26bNWiUxavC2kSKTiZQ-283x300.jpeg 283w, https:\/\/engel-wolf.com\/wp-content\/uploads\/1_paJ26bNWiUxavC2kSKTiZQ-768x813.jpeg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/figure>\n\n\n\n<p>For each test, the report shows not only how it was executed but also the execution time, whether it succeeded, the reference value and the actual outcome. Someone who knows from the&nbsp;<a href=\"#87af\">algorithm description<\/a>&nbsp;what the software should do can now read the test case and the test report, see what was tested and judge whether it makes sense. For co-workers who are new to the project, it is also much easier to find their way in. Reading test cases and reports lets them see within minutes which parts of the project still have problems or which functions are not yet tested.<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"9180\">Summary<\/h3>\n\n\n\n<p>Understanding how&nbsp;<em>R<\/em>&nbsp;software was validated no longer requires an&nbsp;<em>R<\/em>&nbsp;programmer. The environment presented here lets people see how the software was tested. 
I think that human-readable tests will make statistical software more fail-proof, easier to understand and more sophisticated. Although&nbsp;R&nbsp;has already started moving out of research environments into clinical environments and even the car industry, the process is not finished yet. Many more tools will be needed before regulatory authorities can trust such a big open-source project. Human-readable test cases are a first step in helping companies demonstrate the validity of their open-source solutions. Using&nbsp;R&nbsp;and a good testing framework will make people\u2019s lives safer, because you\u2019ll have not only great statistical tools but great&nbsp;<strong>validated<\/strong>&nbsp;statistical tools.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>The ideas and opinions expressed in this post are those of the author alone, and are not to be construed as representing the opinions of his employer or anyone&nbsp;else.<\/p><\/blockquote>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>This article was previously published in the <a href=\"https:\/\/medium.com\/datadriveninvestor\/why-do-we-need-human-readable-tests-for-a-programming-language-1786d552f450\">Data Driven Investor<\/a><\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Software can save lives! The R and Python&nbsp;programming languages rank in the&nbsp;top 10 programming languages&nbsp;today. Both languages come out of open-source and research environments and are now moving into industry. Testing 
Testing [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4183,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[1],"tags":[398,397,399,384,394,396,381,395],"_links":{"self":[{"href":"https:\/\/engel-wolf.com\/index.php?rest_route=\/wp\/v2\/posts\/4327"}],"collection":[{"href":"https:\/\/engel-wolf.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/engel-wolf.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/engel-wolf.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/engel-wolf.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4327"}],"version-history":[{"count":5,"href":"https:\/\/engel-wolf.com\/index.php?rest_route=\/wp\/v2\/posts\/4327\/revisions"}],"predecessor-version":[{"id":4336,"href":"https:\/\/engel-wolf.com\/index.php?rest_route=\/wp\/v2\/posts\/4327\/revisions\/4336"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/engel-wolf.com\/index.php?rest_route=\/wp\/v2\/media\/4183"}],"wp:attachment":[{"href":"https:\/\/engel-wolf.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4327"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/engel-wolf.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4327"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/engel-wolf.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4327"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}