Tuesday, May 21, 2019

Check the boundaries of your function output

I recently spent about six hours trying to figure out why my answer to the Count Triplets problem on HackerRank was failing 4 of their 12 test cases.

After failing to create a test case that could fail my solution, I decided to spend five Hackos to reveal one of the failing test cases.

It turned out that my return type wasn't large enough to accommodate the correct answer. I should have known based on the problem statement that the output would exceed what int32 can hold since I was calculating combinations of values where the count of values was almost touching the boundary of int32. I also ignored a clue given to me by the skeleton code HackerRank provided.

The skeleton code they provide for the C# solution uses the long type everywhere instead of int, even though all the input values fit within the range of an int: all the input constraints were below 10^9 (1 billion), and int supports roughly up to 2x10^9 (2 billion). I changed every instance of long to int because I thought that would save considerable memory and some execution time. However, I should have noticed that the output could be much larger than an int can hold because of what is being calculated.

The problem was basically to find all sequential sets of three values within an array that could have up to 100,000 (10^5) values in it. That means an upper limit on the answer is roughly (10^5)^3 = 10^15. An int can only hold about 2x10^9, but a long can hold about 9x10^18. So a long is the proper data type to use.
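To make the arithmetic concrete, here is a minimal C# sketch. It assumes the worst case is one where every choice of three positions in the array counts, which gives "n choose 3" results. The class and method names are hypothetical and are not taken from the HackerRank skeleton.

```csharp
public static class TripletBounds
{
    // Number of ways to choose 3 positions out of n, computed in 64-bit arithmetic.
    public static long ChooseThreeLong(long n)
    {
        return n * (n - 1) * (n - 2) / 6;
    }

    // Same formula in 32-bit arithmetic. The unchecked block makes the
    // silent wraparound explicit: no exception, just a wrong number.
    public static int ChooseThreeInt(int n)
    {
        unchecked
        {
            return n * (n - 1) * (n - 2) / 6;
        }
    }
}
```

For n = 100,000, ChooseThreeLong returns 166,661,666,700,000, which is far above int.MaxValue (2,147,483,647) and still under the rough 10^15 bound. ChooseThreeInt cannot return that value, so it quietly produces a wrong answer.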

All I had to do to get my solution to pass the remaining four test cases was to change the type of any variable holding the return value or an intermediate value to long.

What I learned from this issue is that I need to make sure I've definitely figured out the output bounds for a function I'm testing, based on the most extreme situation possible for the code. I've known this for many years as it applies to input values, but I forgot to apply it to the output.
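One way to have the runtime flag a misjudged bound is C#'s checked context, which turns integer overflow into an OverflowException instead of a silently wrong result. This is only a sketch of that idea, not what I used in my solution, and the class and method names are hypothetical.

```csharp
using System;

public static class OverflowGuard
{
    // Multiplies in 32-bit arithmetic inside a checked block, so a result
    // outside the int range throws OverflowException instead of wrapping.
    public static int CheckedProduct(int a, int b)
    {
        checked
        {
            return a * b;
        }
    }

    // Reports whether a * b falls outside the int range.
    public static bool Overflows(int a, int b)
    {
        try
        {
            CheckedProduct(a, b);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }
}
```

With this in place, Overflows(100000, 100000) is true because 10^10 does not fit in an int, while Overflows(1000, 1000) is false.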

Also, my reasoning behind "optimizing" the code was wrong. An int is 32 bits on both 32-bit and 64-bit processors, but a 64-bit processor performs operations using 64 bits of precision. So using an int rather than a long saves memory but not execution time.
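To put a number on the memory side, here is a small sketch of the payload size of an array of each type. It ignores the array object's own header, and the names are hypothetical.

```csharp
public static class ElementSizes
{
    // sizeof(int) is 4 and sizeof(long) is 8 in C#, regardless of processor.
    public static long IntArrayBytes(int count)
    {
        return (long)count * sizeof(int);
    }

    public static long LongArrayBytes(int count)
    {
        return (long)count * sizeof(long);
    }
}
```

For the problem's maximum of 100,000 values, that is 400,000 bytes for int versus 800,000 bytes for long, so the saving is real but small.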



Tuesday, April 23, 2019

Subscribe Android phone to Facebook calendar/events

What you'll need:
  1. On a non-mobile device, open a browser that supports sending links to your mobile device.
    • I use Firefox, but Chrome and probably Opera have this feature as well.
  2. Navigate to https://www.facebook.com/events/
  3. In the lower right corner of the page you'll see a box with links for "Upcoming Events" and "Birthdays".
  4. Right click on the "Upcoming Events" link and choose the "Send to device" option.
  5. On your phone you should have a notification to open the link. Use it.
  6. Tap and hold on the address bar to copy the address.
  7. Open the ICSx⁵ app.
  8. Tap the + button.
  9. Paste the link into the URL field.
  10. You don't need authentication, as Facebook provided an API key in the link for you.
  11. Finish the process.
  12. Depending on the calendar app you use, you may need to enable showing the Facebook Events calendar. 
    • I use Simple Calendar, which required me opening the app's settings to enable the calendar.
  13. You may need to manually refresh the calendar after enabling it. I had to refresh it several times before it showed my events.
Note that if you change your Facebook password, you will have to redo this process, as the API key provided in the link from Facebook will become invalid.

Programming: Sanitizing your inputs

Sometimes the users of our applications manage to enter invalid data. Other times, we create bugs that introduce invalid data. However it was introduced, there is a series of precautionary measures we can use to prevent invalid data from affecting application functionality and performance.

The first line of defense is of course "form validation". Ideally, all user entry mistakes are caught at this stage. Form validation involves configuring rules for your UI architecture (Angular, React, etc.) to interpret, or writing your own validation functions. Form validation should always describe the issue to the user so that they can fix their mistake. If it doesn't do this well, then expect to receive many support phone calls and have your manager breathing down your neck about unhappy customers and expensive customer support costs.
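As a rough illustration of a hand-written validation function, here is a minimal sketch in C#. The form fields (an email and an age) and all names are made up for the example. The point is that each failure yields a message the user can act on.

```csharp
using System.Collections.Generic;

public static class SignupFormValidator
{
    // Returns one human-readable message per problem.
    // An empty list means the form is valid.
    public static List<string> Validate(string email, string ageText)
    {
        var errors = new List<string>();

        if (string.IsNullOrWhiteSpace(email))
            errors.Add("Email is required.");
        else if (!email.Contains("@"))
            errors.Add("Email must contain an '@', for example name@example.com.");

        if (!int.TryParse(ageText, out int age))
            errors.Add("Age must be a whole number, for example 42.");
        else if (age < 0 || age > 150)
            errors.Add("Age must be between 0 and 150.");

        return errors;
    }
}
```

Calling Validate("", "abc") returns two messages, one per field, rather than a generic "invalid input" that leaves the user guessing.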

The second line of defense is "backend validation". This should include all security-focused frontend validation, plus any additional validation the backend can do. The backend has access to more information about the state of the system, such as other data records that can inform further validation of the entered data. Your service architecture should provide a framework for this type of validation, but you may also end up writing your own code if your framework doesn't provide it, or if it is not capable of handling certain types of validation, such as cross-referencing other records in the database.
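Here is a minimal sketch of the cross-referencing case, assuming a hypothetical order that must point at an existing customer. The set of customer IDs stands in for a real database lookup, and every name here is invented for the example.

```csharp
using System.Collections.Generic;

public sealed class OrderRequest
{
    public int CustomerId { get; set; }
    public int Quantity { get; set; }
}

public static class OrderBackendValidator
{
    // existingCustomerIds stands in for a database query (an assumption of this sketch).
    public static List<string> Validate(OrderRequest request, ISet<int> existingCustomerIds)
    {
        var errors = new List<string>();

        // Repeat the frontend rule: the backend cannot trust that the client enforced it.
        if (request.Quantity <= 0)
            errors.Add("Quantity must be greater than zero.");

        // Cross-reference another record, which only the backend can do.
        if (!existingCustomerIds.Contains(request.CustomerId))
            errors.Add($"Customer {request.CustomerId} does not exist.");

        return errors;
    }
}
```

A request for customer 7 checked against a set containing only 1 and 2 fails with "Customer 7 does not exist.", even though the form itself may have looked perfectly valid in the browser.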

The final line of defense is "data access layer validation". This type of validation occurs right before writing a record or records to the database. It is the lightest and most rudimentary form of validation. The only concern at this layer is whether the fields required for properly storing the record are present and valid. The errors caught at this stage are always dev team errors. This is because either the earlier validation layers failed to catch a user error, or a developer made some other mistake earlier in the call stack.
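A sketch of what such a check might look like, assuming a hypothetical customer record with two required fields. Because a failure here is a developer error, the guard throws instead of returning user-facing messages.

```csharp
using System;

public sealed class CustomerRecord
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public static class CustomerWriteGuard
{
    // Call immediately before the insert or update.
    // Only checks what is needed to store the record correctly.
    public static void EnsureWritable(CustomerRecord record)
    {
        if (record == null)
            throw new ArgumentNullException(nameof(record));
        if (string.IsNullOrWhiteSpace(record.Name))
            throw new ArgumentException("Name is required to store a customer.", nameof(record));
        if (string.IsNullOrWhiteSpace(record.Email))
            throw new ArgumentException("Email is required to store a customer.", nameof(record));
    }
}
```

The exception should end up in your logs and bug tracker, since it means one of the earlier layers let something through.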

You may have noticed that I made no mention of data validation-on-read. This is because you shouldn't do this. You should catch bad data before it reaches your database, or else you can expect a costly customer support incident that requires a developer to fix. Also, fixing data in place is a delicate procedure that may result in further damage to the data in the database.

But don't we want to know about bad data in the database? Yes, we do. However, if you perform data validation-on-read you will prevent your users from being able to use the system or fix the issue themselves. Yes, your users are intelligent humans and might be able to fix the problem entirely on their own, but only if you let them. Also, customer support may be able to fix the issue, but only if they can retrieve the data to update it. Finally, if you have a way to detect the issue on read, then why can't you detect it on write instead? So put that data validation logic before writing to the db so that someone besides a developer can fix the problem if and when it arises.
An example of validation on read that I've seen in C# code is the use of the LINQ methods Single() and First(). Don't use these methods when reading or returning data to the end user. They throw exceptions when your assumption about the data turns out to be wrong, which prevents the data from making it to the end user. It would be better to send the user incomplete data than no data at all. They will know that there's a problem if some data is missing, and either re-enter it or call customer support to fix the issue. So use SingleOrDefault() or FirstOrDefault() instead and smooth over any null reference issues that might arise from that. Keep in mind that SingleOrDefault() still throws when more than one element matches, so FirstOrDefault() is the safer choice wherever duplicates are possible.
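A minimal sketch of that substitution, using a hypothetical list of phone numbers attached to a customer record. The class name, method name, and placeholder text are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PhoneLookup
{
    // Returns the first stored phone number, or a placeholder when none exists,
    // so the rest of the record can still be shown to the user.
    public static string PrimaryPhoneOrPlaceholder(IEnumerable<string> phoneNumbers)
    {
        // First() would throw InvalidOperationException on an empty sequence.
        // FirstOrDefault() returns null for an empty sequence of strings instead.
        var phone = phoneNumbers.FirstOrDefault();
        return phone ?? "(no phone number on file)";
    }
}
```

An empty list yields the placeholder instead of an error page, and the visible gap tells the user or customer support exactly what needs to be re-entered.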
It is my hope that this article will lead to fewer hot database fixes and less system downtime. Maybe it will also get software developers thinking a little more in terms of how their users might be able to dig their way out of their own messes, or perhaps even your mess.