Sunday, 9 November 2014

The Importance of Bug Reports

Fixing your own bugs is all fine and dandy, but what about fixing other people's problems? One of the quintessential components of debugging is knowing how to write a proper bug report. We were shown how to write our own bug reports in class, but in today's post I want to stress the importance of including key sections in a bug report to make it as clear and concise as possible.

The way I view bug reports is this: if you spend an extra 5 minutes making one, you may save someone an hour of their time. Detailing the report with as much evidence as you can find, using concise text and images, may take longer, but it will greatly help whoever views the report. The key sections include:
  • Technical details (OS, compiler, third-party libraries, etc)
  • Summary of the program or section of the program the bug exists in.
  • Severity
  • Summary of the problem.
  • How to recreate the bug.
  • Past attempts and results.
  • Any debug info (call stack, debugging log files, console output)
I will be using a program I created in class and purposely broke for homework. I will build a bug report out of it for you to follow along with.
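To make the sections above concrete, here is a minimal C++ sketch of a bug report as a plain data structure. The names (`BugReport`, `Severity`, `formatReport`) are hypothetical and chosen only for illustration; the A/B/C levels mirror the severity scale described later in this post.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Severity levels as described in this post: A blocks release,
// B is serious but the game is playable, C is minor.
enum class Severity { A, B, C };

// Hypothetical container for the sections of a bug report.
struct BugReport {
    std::string technicalDetails;              // OS, compiler, libraries
    std::string programSummary;                // what the program or section should do
    Severity severity = Severity::C;
    std::string problemSummary;                // expected vs. actual behavior
    std::vector<std::string> stepsToReproduce; // one entry per step
    std::string pastAttempts;                  // optional; may be empty
    std::string debugInfo;                     // call stack, logs, console output
};

inline char severityLetter(Severity s) {
    switch (s) {
        case Severity::A: return 'A';
        case Severity::B: return 'B';
        default:          return 'C';
    }
}

// Render the report as plain text, numbering the reproduction steps.
inline std::string formatReport(const BugReport& r) {
    std::ostringstream out;
    out << "Technical details: " << r.technicalDetails << '\n'
        << "Program summary: " << r.programSummary << '\n'
        << "Severity: " << severityLetter(r.severity) << '\n'
        << "Problem summary: " << r.problemSummary << '\n'
        << "Steps to reproduce:\n";
    for (std::size_t i = 0; i < r.stepsToReproduce.size(); ++i) {
        out << "  " << (i + 1) << ". " << r.stepsToReproduce[i] << '\n';
    }
    if (!r.pastAttempts.empty()) {
        out << "Past attempts: " << r.pastAttempts << '\n';
    }
    out << "Debug info: " << r.debugInfo << '\n';
    return out.str();
}
```

Keeping the reproduction steps as a list rather than one block of text nudges you toward writing them one action at a time.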

Technical Details

Rather self-explanatory: technical details is a list of your hardware and software. This includes the OS, third-party libraries, and any APIs or SDKs. If your program runs on a PC, include your specs if you truly feel it's necessary. For AAA PC games, specs are certainly useful, but for a small project between buddies at university they may not be as critical.



Summary of the program or section.

Rather self-explanatory: this section details what your program should do. Feel free to summarize your program as a whole if it is small enough (for example, a basic addition/subtraction program), or describe which particular section is breaking (for example, when you try to grab loot after defeating a particular boss). If the program is small enough, you can put the summary of the program and the summary of the problem in the same section, which is what I do in this case.



Severity

How severe is the bug in relation to the final release of the game? My professor gave us three levels of severity: A, B, and C. A means the product cannot be released (for example, it crashes on start up). B is still severe, but the game is playable (for example, a lag spike when loading level 3). Several B bugs could be as serious as an A bug. C level bugs are small and don't really affect the grand scheme of things. They are still important to squash, but won't be a big hindrance to your final project (for example, if you unplug the controller 10 times really fast, the game crashes).


Summary of the problem 

The summary of the problem should be a quick, clear, and concise description of the expected behavior and the actual behavior. A bad example of a summary would be, "it breaks when it compiles". Receiving this on a bug report makes me immediately send it back, because nothing is really stated. What is the error? Is it a buffer overrun or a linker error? Do the textures flicker, and if so, how frequently and at what position? Summarizing the problem in a clear manner is critical to getting other people to understand it.

How to recreate the bug

Arguably the most important section: knowing how to recreate the bug is critical. I stress heavily, and I mean HEAVILY, taking the extra time to provide as much detail as possible on recreating the bug. Write a step-by-step process as if a 10-year-old were trying to recreate it. Provide images, even a video if you feel it necessary; just make sure the person on the other side can recreate your bug the first time. Spending an extra five or ten minutes making sure you provide as much detail as possible can save someone else hours.




Past Attempts and Results

While not necessary for all bug reports, this section can come in handy if your team has already tried some different tests and could not determine the problem. If you are going to include past attempts and results, then treat it like a "How to recreate the bug" section and provide all the necessary details of what you did and what the results were. This can be useful in smaller group projects, when your partner programmer details the different things they tried and the results, then brings it to you saying they are stuck.

Note: I did not include this section in my built bug report.

Debug Info

Providing debug info can make or break finding a bug. Debug info can include the call stack, any program-specific log files, console output, debug symbols, etc. This information acts as a breadcrumb trail that can help put programmers on the right track to solving a bug. You may want to include this in a separate text file. Not all bugs need you to provide all of this information, but it can really go a long way toward helping solve the problem.


My debugging section is rather lackluster as it could include more information, but it gets the point across that the console is not displaying any issues and that we are including debugging symbols.

Summary

That concludes this week's post on bug reports. I hope you learned something new about the importance of bug reports.

Saturday, 1 November 2014

How not to debug, Part II

In my last post, I gave some general hints and tips on how not to set yourself up for debugging. This would include practicing ergonomics, getting into the right mindset, proper coding practices, documentation, commenting code, and more. Today, I'll be taking all of those and actually debugging some real problems I encountered over the years, and how concepts from class lectures, the textbook, and my previous blog all apply. Without further ado, let's get started.

Example problem

Last month I was working on an assignment for my artificial intelligence class. In this assignment, the user was placed in a room filled with 20 doors and each door was assigned three properties:
  1. Cold or hot.
  2. Noisy or quiet. 
  3. Safe or unsafe.
Each door would be assigned one from each pair, so a door could be cold, noisy, and safe, or hot, quiet, and safe, and so on. I ran into two significant bugs while working on this project.
  1. Noisy doors should play their sound when within close proximity. Regardless of proximity, the doors were playing their sound.
  2. Doors were not being assigned the variables correctly. 
I will analyze number 1 first. As mentioned, each noisy door should play its sound only when the listener is close. However, the sound plays regardless of proximity. This is my current scene:


Noisy doors problem

Four doors are cold, one is hot. They all default to safe, and only the hot one is supposed to be noisy. From where I stand in the scene currently, a lion roar sound plays. This should only play when I am directly in front of the door. Because the title of today's post is how not to debug, let's go over some ways not to debug this particular problem.

What not to do: Start changing code.

While it may seem very, very attractive to just start playing with values, checking initialization, moving stuff around, etc., this is not how you should start on this problem. When you immediately start changing code without looking into how or why it fails, you create a habit of not thinking logically about your problem. This in turn can waste your time and create artificial stress. You could easily play with little segments of code for hours before actually figuring out the problem, at which point you may not understand how you fixed it, or you may have accidentally created even more problems.

What to do instead


Recreate the problem, or at least try to. Try to remember exactly how you got to the problem and recreate it. And when you do recreate it, either write the steps down or remember them, because this will help you develop a hypothesis later.

After you recreate the problem, you can do one of two things, and I believe both are valid depending on the circumstances: Google (research) the problem, or use all the tools available (breakpoints, the call stack, and so on) to narrow down the problem so you at least get an idea of where it lies. I only advocate researching first when, based on the defect and how you produced it, the problem could plausibly be fixed by a quick search of the documentation or Google.

Actually Solving it

So instead of just changing variables, I look into the problem. I place breakpoints at initialization, at the play function, and at other points of interest; view the call stack; and write a few lines of code that use cout to check whether information is initialized correctly, a practice I should have followed from the start.
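One concrete check along these lines is printing the volume you would expect at a given distance and comparing it with what you actually hear. The sketch below assumes the volume-factor formula given in SFML's audio spatialization tutorial, as best recalled here, so check it against the documentation for your SFML version. The function name `expectedVolumeFactor` is hypothetical, and the code deliberately does not call SFML so it can be run on its own.

```cpp
#include <algorithm>

// Expected volume factor (0..1) for a spatialized sound at `distance`,
// assuming the formula from SFML's spatialization tutorial:
//   minDistance / (minDistance + attenuation * (max(distance, minDistance) - minDistance))
// Assumes minDistance > 0 and attenuation >= 0.
inline float expectedVolumeFactor(float distance, float minDistance, float attenuation) {
    const float d = std::max(distance, minDistance);
    return minDistance / (minDistance + attenuation * (d - minDistance));
}
```

If the printed factor falls off with distance but the sound you hear does not, the falloff settings are probably not where the problem lies.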

I see that the sound is playing from exactly where it should be, the listener is exactly where it should be, and the sound is set up correctly in terms of how it resonates, how it falls off, and its volume. So if it is not an apparent problem with the code I wrote, it could be a problem elsewhere, such as:
  • SFML
  • I screwed up on the audio file export
  • Could still be a problem with my code, but it's looking very unlikely
If it is a problem with SFML, then I can look for it in two places: Google and the documentation. I resort to Google first, and after scanning through a few different links I find that SFML only spatializes mono sounds; stereo sounds play at the same volume regardless of distance. My audio file was exported as stereo. After a quick re-export of the file in mono, I load it in, test, and bam, it works. While the way I fixed it was valid, another way would have been to load up the documentation for SFML's sound class and simply skim it. It is stated plainly there.
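Re-exporting the file is the simplest fix. If that is not an option, the stereo samples could also be mixed down to mono in code before they are handed to the audio library. The sketch below is an illustration under stated assumptions, not what was done for this assignment: it assumes interleaved 16-bit samples (left, right, left, right, ...), and the function name `downmixToMono` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Average each left/right pair of an interleaved stereo buffer into one mono sample.
// A trailing unpaired sample, if any, is ignored.
inline std::vector<std::int16_t> downmixToMono(const std::vector<std::int16_t>& stereo) {
    std::vector<std::int16_t> mono;
    mono.reserve(stereo.size() / 2);
    for (std::size_t i = 0; i + 1 < stereo.size(); i += 2) {
        // Widen to int so the sum cannot overflow 16 bits.
        const int sum = static_cast<int>(stereo[i]) + static_cast<int>(stereo[i + 1]);
        mono.push_back(static_cast<std::int16_t>(sum / 2));
    }
    return mono;
}
```

The resulting buffer has half as many samples and should be loaded with a channel count of 1 and the original sample rate.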

The problem was relatively simple, but a lot of programmers fall into the trap of debugging incorrectly by hunting for a problem that they don't understand.

Doors not being assigned truth table correctly

So I had spent hours into the night coding the assignment of variables for the doors. Essentially, the doors are assigned variables based on a probability. See the chart below.

Hot   Noisy   Safe   Percentage of Doors
Y     Y       Y      0.05
Y     Y       N      0.10
Y     N       Y      0.03
Y     N       N      0.21
N     Y       Y      0.06
N     Y       N      0.11
N     N       Y      0.40
N     N       N      0.04

The problem I encountered was that the doors were not randomizing properly. A quick rundown of the program: upon initialization, it creates the room and loads the door property text file. For each set of properties, it uses the percentage value to determine what the door should be. I have two std::vectors: one contains the ID values for each set of properties (values 1, 2, 3, etc.), and the other holds the probabilities (represented as int values, so I multiply the percentages by 100).
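One thing to watch when turning the percentages into integers: multiplying a floating-point value by 100 and truncating can land one below the intended number (0.29 * 100 evaluates to slightly under 29 in double precision). A minimal sketch of a rounding conversion follows; `toPercent` and `toPercents` are hypothetical helpers, not code from the assignment.

```cpp
#include <cmath>
#include <vector>

// Convert a probability such as 0.21 to a whole-number percentage (21),
// rounding to the nearest integer instead of truncating.
inline int toPercent(double probability) {
    return static_cast<int>(std::lround(probability * 100.0));
}

// Convert a whole column of probabilities.
inline std::vector<int> toPercents(const std::vector<double>& probabilities) {
    std::vector<int> out;
    out.reserve(probabilities.size());
    for (double p : probabilities) {
        out.push_back(toPercent(p));
    }
    return out;
}
```

For the chart above, the eight values convert to 5, 10, 3, 21, 6, 11, 40, and 4, which add up to 100.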

The program sorts the percentages from lowest to highest, sorts the IDs so that each ID still matches its sorted value (see below), and generates a random value; if the value lands between any of the percentages, it selects a set of properties to give to a door. An example of what was happening: if I set the probability of a door being hot, noisy, and safe to 100%, the program would not land on that set; instead, it would land on a different set of properties.

Before Sort:

Hot   Noisy   Safe   Percentage of Doors   ID
Y     Y       Y      0.05                  1
Y     Y       N      0.10                  2
Y     N       Y      0.03                  3
Y     N       N      0.21                  4
N     Y       Y      0.06                  5
N     Y       N      0.11                  6
N     N       Y      0.40                  7
N     N       N      0.04                  8

After Sort:

Hot   Noisy   Safe   Percentage of Doors   ID
Y     N       Y      3                     3
N     N       N      4                     8
Y     Y       Y      5                     1
N     Y       Y      6                     5
Y     Y       N      10                    2
N     Y       N      11                    6
Y     N       N      21                    4
N     N       Y      40                    7
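Keeping two parallel vectors in step during a sort is easy to get wrong. One way to sidestep that, sketched here under the assumption that each entry is just an integer percentage plus its ID, is to sort a single vector of pairs so the ID always travels with its value. The names `Entry` and `sortByPercent` are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// One row of the table: the integer percentage and the ID of its property set.
struct Entry {
    int percent;
    int id;
};

// Sort ascending by percentage. std::stable_sort keeps entries with equal
// percentages in their original relative order.
inline void sortByPercent(std::vector<Entry>& entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const Entry& a, const Entry& b) { return a.percent < b.percent; });
}
```

Run on the "Before Sort" rows, this yields the ID order 3, 8, 1, 5, 2, 6, 4, 7, matching the "After Sort" table.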


What not to do: Panic, immediately change code, blame yourself.

Changing code, panicking, stressing, etc, are the last thing you want to do because you're wasting your time. In fact, if you really have your heart set on being a bad debugger, make sure in your critical functions, such as setting door probabilities, you have zero (or meaningless) comments, poor naming conventions, and no plan either. When it comes to a bug like this, if you panic, try to change some code, slap some stuff together, or try to hack it, you'll waste a lot of time and not realize what you're doing.

What to do: Recreate the problem, search for patterns, and hypothesize.

When it comes to a problem like this, where probabilities are not being set correctly, things become very tricky because you most likely have a logical error somewhere. The best thing to do is to make sure, if you haven't already, that you have a plan for how your program should flow and what each block of code should do. Describe and justify your code! Try to break it by thinking of cases under which it could not work. This can really help you wrap your head around the problem.

Next, play with your text file a bit. Try a few different cases and see if it works perfectly in some cases and not so perfectly in others. After that, check the door setting function using breakpoints. In my program, I print out the initially loaded text file, the result of the door sorting algorithm, and which random value is generated and where it lands. When trying to solve the problem, this is probably a good place to start.
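A quick way to look for patterns is to run the selection many times with a fixed seed and count how often each ID comes up. The harness below is only a sketch: `tallySelections` and the `picker` callback are hypothetical stand-ins for whatever function chooses a property set in your program, and the roll range of 0 to 99 assumes percentages that add up to 100.

```cpp
#include <functional>
#include <map>
#include <random>

// Call `picker` `trials` times with rolls in [0, 99] and count the IDs it returns.
// A fixed seed makes a failing run repeatable.
inline std::map<int, int> tallySelections(const std::function<int(int)>& picker,
                                          int trials, unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> roll(0, 99);
    std::map<int, int> counts;
    for (int i = 0; i < trials; ++i) {
        ++counts[picker(roll(rng))];
    }
    return counts;
}
```

If one property set is configured at 100% and the tally shows any other ID, the selection logic is at fault rather than the random number generator.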

Actually Solving it

This bug gave me a headache because sometimes logical errors just screw you up big time. The first thing I did was want to understand the problem in greater depth. Here is a general run down of what I did to try and understand said problem:

  • Recreated the bug several times to see whether it broke in multiple cases, with or without a relationship (meaning, were the cases where it broke related to each other, or did they appear to break arbitrarily?).
  • Analyzed the program's output to see whether it was landing on the incorrect set function.
  • Went back to my original plan, tried to determine cases where my code would break, and checked whether my case fit.
  • Analyzed the function that loads the doors.
What was important about solving this bug was enumerating and eliminating possibilities. Very quickly I was able to determine that my script loader was not to blame. The program was determining the random value correctly, so it's not that. It sorted correctly, but I was able to determine through my output and analysis at the breakpoints that the final value it used to pick the set function was incorrect. If the value used for selecting which door to use was incorrect, then the problem most likely lies within the block of code where that value is produced.

As a hypothesis I state that the value used to select the appropriate ID is randomizing correctly, but finding the incorrect value. Here is the block of code associated with ID selection.


Essentially, it generates a random value, creates a new door, selects the random number, then tries to place it within the sorted IDs. After viewing this function several times, I conclude that it works as intended. The problem doesn't appear to be here. What about the sort function? If the problem is that the ID is not selecting the right door, what if I'm not sorting correctly?


Again, after thorough analysis, the sort function appears to be sorting things as intended. So where could this problem exist?

Taking a step back

A good way to debug poorly is to never step back from your code to clear your mind. If you keep pushing yourself harder and harder to understand a problem that you simply aren't understanding at that moment, you'll doom yourself to stress and anxiety. This was one of those bugs that I needed to step back from and take back to the drawing board. I thought very carefully about the sorting function and began to ask myself questions such as: why am I even sorting in the first place, what does it get me, why am I using an ID value, and so on.

What I came to realize about the sorting was that I had an inherent flaw in the organization of my code. Essentially, I had a block of code like this:


This was responsible for setting the properties of the door. This wasn't the problem, but it was a clue as to what my problem was. Essentially, I was sorting the doors and NOT sorting these if statements to be recognized by my new sort. Which begs the question, why am I even sorting in the first place? Was there an easier way to get the random number generation I needed without having to write a convoluted mess of code just to make sure the if statements lined up?
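The original code is not reproduced here, so the following is only a hypothetical reconstruction of how this kind of mismatch can arise: after sorting, a position in the vector no longer equals an ID, so any branch keyed on position picks the wrong property set. All names (`Entry`, `sortedByPercent`, `idAssumingPosition`, `idAtPosition`) are illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Entry {
    int percent;
    int id;  // 1-based ID of a property set in the original table order
};

// Return a copy of the entries sorted ascending by percentage.
inline std::vector<Entry> sortedByPercent(std::vector<Entry> entries) {
    std::sort(entries.begin(), entries.end(),
              [](const Entry& a, const Entry& b) { return a.percent < b.percent; });
    return entries;
}

// Mistaken assumption: treat the sorted position itself as the ID.
inline int idAssumingPosition(std::size_t position) {
    return static_cast<int>(position) + 1;
}

// Correct lookup: read the ID stored alongside the sorted value.
inline int idAtPosition(const std::vector<Entry>& sorted, std::size_t position) {
    return sorted.at(position).id;
}
```

Under this reconstruction, setting ID 1 to 100 and every other entry to 0 moves that entry to the last position after an ascending sort, so position-keyed branches would hand out property set 8 instead of set 1.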

The idea then became a mindset of doubt, which is healthy. I doubted my very sorting function, asking: what if you're wrong? How can I prove you wrong? Going back to the drawing board, I knew that the total of all the probabilities should equal 100%. So why not leave the values unsorted and use them as is? For example, if I have three values, 0.2, 0.5, and 0.3, why sort them? If I multiply by 100 I get 20, 50, and 30, and they add up to 100.

I tested several scenarios on paper and determined that by adding value 0 to value 1 in the vector, then adding the sum of values 0 and 1 to value 2, and so on, I get a correct scale on which to generate numbers that will, hopefully, lead to correct random selection. With 20, 50, and 30, the running totals are 20, 70, and 100, so a roll below 20 picks the first entry, a roll from 20 to 69 picks the second, and a roll from 70 to 99 picks the third. Time to comment out my sort function, write the new code, and test that hypothesis.
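The running-total idea can be sketched in a few lines of C++. This is a minimal illustration under stated assumptions, not the code from the assignment: the weights are non-negative integers, the roll is drawn from 0 up to (but not including) the grand total, and the names `cumulativeTotals` and `pickIndex` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Turn weights such as {20, 50, 30} into running totals {20, 70, 100}.
inline std::vector<int> cumulativeTotals(const std::vector<int>& weights) {
    std::vector<int> totals(weights.size());
    std::partial_sum(weights.begin(), weights.end(), totals.begin());
    return totals;
}

// Given a roll in [0, totals.back()), return the index of the first running
// total that is strictly greater than the roll. Entries with a weight of
// zero are never chosen.
inline std::size_t pickIndex(const std::vector<int>& totals, int roll) {
    assert(!totals.empty() && roll >= 0 && roll < totals.back());
    auto it = std::upper_bound(totals.begin(), totals.end(), roll);
    return static_cast<std::size_t>(it - totals.begin());
}
```

The roll itself could come from `std::uniform_int_distribution<int>(0, totals.back() - 1)`. Because the index returned is a position in the original, unsorted order, it lines up directly with the ID order of the table. The standard library's `std::discrete_distribution` offers the same kind of weighted choice without any hand-written bookkeeping.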

It worked!

Long story short, the program worked as intended and held up over several tests just to make sure. Bugs fixed. Hooray!

In conclusion

I hope you learned something about what not to do when debugging. When coding, remember: don't panic, don't cry, don't pull your hair out. Relax, think logically, and do your best. Beyond that, read the documentation, write your own documentation, comment your code, and follow good naming conventions. Thank you for reading!