Sunday, April 20, 2014

Buffer Cache Makes Slow Disks Seem Fast, Till You Need Them.

Linux has this wonderful thing called the buffer cache (for more detail, read here). In summary, it uses all of your free RAM as a cache for file access. Because of the buffer cache you can easily get response times under 1 millisecond.
However, this sets a lot of people up for a trap. Imagine you buy a “database server” with a 5400 RPM hard disk at Best Buy, and while you’re there you pick up an extra 8 gigs of RAM. You load the latest Ubuntu, restore a 1 gig customer database backup, and check your memory usage: 2 gigs free. You test the new server out and records are coming off it at an unbelievable speed. You’re happy, your boss is happy, you look like a genius.
Later that month your company acquires a new firm, and you write a script to load their 3 gigs’ worth of customer data into the same customer database. The next thing you know your server is behind, your website is timing out, and your data takes forever to come back. Yet when you access a common page you use for testing, the responses are instant. What gives?
Before, your files were all effectively being accessed in RAM. Things may have been slow right after a restart, but eventually all of your data was cached and the workload never exceeded the latencies RAM could provide. Once you could only cache part of your frequently accessed data, you entered a weird world where some data comes back in 1 millisecond and some comes back in 2 seconds, because the disk can never catch up, and under this workload it never could have caught up without the buffer cache. You had simply never encountered that scenario before, so you never realized your server couldn’t hope to keep up without a ton of RAM to throw at the problem. You call your original Linux mentor, he has you buy some good SSDs, and once you install them and restore from backup everything runs fine. Not as fast as before, but no one can tell the difference. Why the big difference?
Because the buffer cache was hiding the bad disk configuration from the get go; once that little 5400 RPM hard disk had to do some real work, it quickly fell behind and was never able to catch up.
This happens more frequently than you’d think. I’ve seen people who were super happy with their 6 figure SANs until an application exceeded their buffer cache, and they quickly found, to their horror, that their expensive SAN is really terrible for the latency sensitive workloads most databases are (a good background lesson on the importance of latency is here).


The lesson here is: if you ever want to benchmark how a system will do under stress, start with a dataset that can’t fit into the buffer cache, so you’ll know how the system performs when it’s using the disk directly.
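On Linux you can watch how much RAM is going to cache, and flush it before a cold run. A minimal sketch (drop_caches only discards clean cache pages, but don’t do this on a production box):

# see how much RAM is currently being used as cache
free -h
# flush the page cache (and dentries/inodes) before a cold-read benchmark
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches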

Reads and the perils of index tables.

I frequently see index tables in Cassandra being used to maintain a single source of truth. It’s important to remember that in a truly distributed system relational algebra really doesn’t scale, and in-memory joins will only get you so far (very little, really). So, as we often do in relational systems when a dataset is too expensive to calculate on the fly, we create a materialized view. In Cassandra it’s helpful to think this way about every dataset that would require joins in a relational system.
Since Cassandra is write optimized, let’s take a typical social networking pattern, a “user stream”, and see how we’d model it with traditional normalization of data versus how it looks in Cassandra with denormalized data.
[diagram: user stream modeled as normalized tables versus one denormalized Cassandra table]
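The original diagram didn’t survive; a minimal CQL sketch of the denormalized side might look like this (table and column names here are illustrative, not from the original post):

CREATE TABLE user_stream (
    username   text,       -- the user whose stream this is (partition key)
    post_date  timestamp,
    post_id    uuid,
    author     text,
    post_title text,
    post_body  text,
    PRIMARY KEY (username, post_date, post_id)
) WITH CLUSTERING ORDER BY (post_date DESC, post_id ASC);

-- one partition read returns the newest posts first:
SELECT * FROM user_stream WHERE username = 'rsvihla' LIMIT 50;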
Instead of many, many round trips to the database, querying each index table, then querying the results of that index, then pulling data across many nodes into one result (which could take on the order of seconds), we can pull from one table and get results back in under 1 millisecond. Furthermore, we get optimized ordering on post date within that user’s partition, so things come back very quickly indeed. We’d have to order the comments client side, but even on the biggest comment threads that would be a fast operation (and if it won’t work, there are other modeling options from here).
So about now you’re probably screaming bloody murder about how much work this is on updates, and that updating a post with 1,000 followers will result in 1,000 writes to the database. I say: so what. Cassandra is extremely write optimized, and while there is a level at which this modeling becomes expensive and requires other concessions, it will get you light years farther than the normalized approach, where the cost of reads will drag you down long before writing that much will (remember that most workloads read far more than they write). But what about consistency? What if my update process writes the post to only 990 followers and then fails? Do I need batch processes to run consistency checks later?

Consistency through BATCH.

Cassandra offers the BATCH keyword. Logged batches are atomic: the batch is recorded in a batchlog on other nodes before it is applied, so if the coordinator or the client falls over mid update (or both), those nodes will replay the batch and finish applying it. Note this is not a relational rollback; rather than undoing a partial batch, Cassandra guarantees the whole batch eventually gets applied.
Assume a blog model where I want to display posts by tag and by username. I can update both tables in one go every time a post_title changes, assuming I have the full post information, which is why the save path for posts is the perfect place for this to live.
[diagram: posts_by_username and posts_by_tag table layouts]
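That diagram is also missing; the two tables presumably looked something like this (column types are inferred from the batch below):

CREATE TABLE posts_by_username (
    username   text,
    post_id    uuid,
    post_title text,
    post_body  text,
    PRIMARY KEY (username, post_id)
);

CREATE TABLE posts_by_tag (
    tag        text,
    post_id    uuid,
    post_title text,
    post_body  text,
    PRIMARY KEY (tag, post_id)
);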
BEGIN BATCH
  UPDATE posts_by_username SET post_title = 'Cassandra Consistency' WHERE username = 'rsvihla' AND post_id = 5f148a02-ccec-4c98-944d-6431cd86ad5c;
  UPDATE posts_by_tag SET post_title = 'Cassandra Consistency' WHERE tag = 'Scaling' AND post_id = 5f148a02-ccec-4c98-944d-6431cd86ad5c;
  UPDATE posts_by_tag SET post_title = 'Cassandra Consistency' WHERE tag = 'Cassandra' AND post_id = 5f148a02-ccec-4c98-944d-6431cd86ad5c;
APPLY BATCH;
There is a practical size limit on batches, so once you start having to update more than around 100 rows at a time you’ll want to consider batches of batches. You lose the easy consistency of one atomic batch across the whole update, but you keep it within each individual batch.

Summary

This was a quick tour of data modeling at scale with Cassandra. There are a lot more use cases and variants of this, but that’s the basic idea.

How to become a polyglot in 5 hard steps.

In today’s world of programming languages, where many languages are better at certain tasks than others, you’ll find it useful to learn multiple languages over the course of your career (as well as to keep your skill set current).
Here are some tips I’ve had to learn the hard way:

Step 1: Your second language should be similar to your first

Your brain will confuse a lot of language design decisions with computational necessities. Jump too far on your second language and the warts from your first will show more, and the new syntax will make your head explode.

Step 2: Compare and contrast common task based libraries to get the gist of the differences.

Namely: ORMs, web frameworks, unit testing libraries, XML reading/writing, CSV reading/writing, HTTP/REST clients, templating languages, email sending, and I’m sure a few more I’m forgetting. Doing this will not only teach you language specific idioms quickly, it’ll also give your brain a chance to see the similarities and which bits of data are really necessary to do task X or Y.

Step 3: Get involved with the community

Get on mailing lists, or better still go to user groups in your area. See what the programmers in that community are obsessed with (I’m looking at you, Python, and your giant PEP 8 discussions about style), and ask foolish questions. Learning a language is in large part about fitting into a community. You may be determined to bring in some concepts from your mother tongue, but learn to treat your new language as a second mother tongue first.

Step 4: Write several simplistic projects that replace the common big frameworks.

This is an extension of the last couple of steps. Write a unit test library, an ORM, a web framework, a REST library, etc. Make them simple enough to just barely work, but focus heavily on what you think is a good client API. They will suck; have someone who’s good at that language tell you why they suck.

Step 5: Learn a third language and a fourth, fifth…

The third should be totally, earth shatteringly different, and ideally it should solve some task for you that is hard in your other languages (this is a great time to learn functional programming). You’ll learn a lot more about programming in general this way. Repeat steps 2 through 4.

Monday, May 20, 2013

Web API FromUri and FromBody

I've been using ASP.NET Web API for a new service for my lone .NET customer. They needed to build a mostly HTML/JS application into the company intranet, and since they're only interested in Microsoft technologies, I figured this would give me something similar to RESTful resources in Rails.

While Web API certainly seems modeled after that, and it's a noble effort, I've been struggling to get routes working the way I want, running into issues with nested resources and with parameters in the URL. The preferred solution seems to be IQueryable and OData, but I don't use an ORM that implements IQueryable and I wasn't willing to switch. So I have stumbled onto the pattern of using FromUri to at least pass parameters in a more complex way than the standard one.
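The post's code sample didn't survive; a minimal sketch of the pattern, with hypothetical PostsController and PostQuery types (and the usual System.Web.Http and System.Collections.Generic usings), might look like this:

public class PostQuery
{
    public string Tag { get; set; }
    public int Page { get; set; }
    public int PageSize { get; set; }
}

public class PostsController : ApiController
{
    // [FromUri] binds the whole complex type from query string values,
    // e.g. GET /api/posts?tag=cassandra&page=2&pageSize=25
    public IEnumerable<string> Get([FromUri] PostQuery query)
    {
        // placeholder body; a real action would hit the data layer here
        return new[] { query.Tag + " page " + query.Page };
    }
}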

Sunday, April 7, 2013

An Evernote backed Journal using Vim/Emacs

I journal quite a bit, and my holy grail has been using my favorite text editor (Vim, or Vim bindings) with Evernote to store everything in a smart, searchable format. Today I stumbled onto a neat little tool that makes this all happen called Geeknote http://www.geeknote.me/. It's written in Python and works fine on my Mac.

Installing Geeknote

In the directory of your choosing, run the following script. This will check out the latest copy from source and let you log in:
#!/bin/sh
# Download the repository.
git clone git://github.com/VitaliyRodnenko/geeknote.git
 
cd geeknote
 
# Launch Geeknote and go through login procedure.
python geeknote.py login
#change vim to whatever you want it to be
python geeknote.py settings --editor vim
 
Then create the following script somewhere on your PATH (mine is named journal), adjusting checkout_dir to point at the directory where you checked out Geeknote:
 
#!/bin/sh

# change checkout_dir to match where you've checked out Geeknote
checkout_dir=~/Documents/geeknote
# change notebook to whatever notebook you use as your journal
notebook=Journal

# default the title to today's date if none was given
title=$1
if [ -z "$1" ]
then
    title=$(date +%Y-%m-%d)
fi
echo "creating a note named $title in the $notebook notebook"

# create the note, then reopen it in your editor ("WRITE" as the
# content tells geeknote to launch the editor configured above)
python "$checkout_dir/geeknote.py" create --title "$title" --notebook "$notebook" --content "test"
python "$checkout_dir/geeknote.py" edit --note "$title" --notebook "$notebook" --content "WRITE"

Writing Journal Entries

My script is named journal, so I just type either of the following:
journal  #creates a note in the Journal notebook with today's date as the title
or
journal "custom title"  #uses the title specified (quote it if it contains spaces)
 

Summary

With a few simple scripts, in moments you too can be writing notes from the command prompt. I highly recommend you extend these scripts to your needs, or just use the Geeknote command prompt as you see fit.


Monday, February 18, 2013

Ruth’s Story


Ruth Ann Svihla came into this world screaming and angry on October 30, 2011 at 6:57 am. She was and always has been a beautiful, intelligent child who brought us a great deal of joy, but she was born with many challenges to overcome.

Cleft

The first such challenge we faced was a pretty severe cleft lip and palate. Although this turned out to be the most minimal of her problems, it was significant at the time. She was unable to eat on her own that first day, and we had to feed her through a nose tube. Attempts to teach her how to eat were met with frustration, from my daughter and from us as well. I went to bed that night distraught and worried about the challenges of being a new father of a “special needs” child. Would my wife and I be able to cope?
The next day brought more of the same: frustrated parents and an angry, frustrated baby. But somewhere along the way I realized that she was frustrated not so much with attempting to eat as with being helped to eat. So we stopped trying to help her, and she began to eat on her own instantly. We were told most cleft lip and palate babies take 2 weeks to learn to eat as well as Ruth did that second day. Only two days old and already adaptable and fiercely independent!

A “Normal” Life

After that we settled into being mostly normal parents. My wife made weekly trips from San Antonio to Houston for her plastic surgery consults, and we looked forward to the day when she would just be a “normal” kid. However, around Christmas time she became extra colicky and struggled to hold down any food, though she otherwise remained largely consolable. Eventually she was down to a quarter of what she’d been able to eat before, and we ended up in the hospital for a week just to get her some nutrition. She did well once we started getting her fed, and we were ready to go home. About an hour before leaving, the doctor pulled us aside in a very concerned tone. Within the hour we were in an ambulance on our way to Texas Children’s Hospital in Houston, and so began our next chapter.

Pompe Disease

We went through so many hills and valleys with our subsequent trip to Houston, which would become our home, that it would take a book to tell it all. Let me summarize and tell you all that I have watched my daughter lose the ability to move entire limbs, to smile, to otherwise become inert for long periods of time, but I’ve also had the opportunity to watch her go through a lifetime’s worth of conquered challenges.
Pompe disease can be quite deadly and horrible to go through. The key metrics for survivability are catching it quickly (which we did) and having a type of the disease that responds to treatment (which Ruth had). The journey to finding out those two realities unfortunately took a month and a half, and the damage in the meantime was quite severe. Pompe disease is exceedingly rare (1 in 40k births, depending on population) and is rarely caught when it occurs. Most cases are believed to be misdiagnosed as SIDS, as the infantile form of Pompe disease typically has a 6-8 month life span. Pompe patients are unable to fully process glycogen, so it stores in their muscles. Once the levels become toxic the muscle tissue becomes damaged, sometimes permanently. Despite this it can often be managed quite well, and at the time of her diagnosis we incorrectly believed we could return her to a “normal” life with some small challenges.
Ruth, to her credit, treated life as “normal”. Despite having a heart far weaker and larger than many kids on the transplant list, and a muscular dystrophy on top of that, she managed to be a constant ball of motion and troublemaking, whereas many kids on the transplant list are suffering.

Home

After 2 1/2 months in TCH, thanks to some amazing doctors, nurses, and therapists (OT/PT and RT), we were able to move to our new home in Houston, within a mile of TCH so we could make her multiple appointments per week, and manage her disease from there. Initially everyone was pleased with Ruth’s progress and we were filled with hope. However, after 4 months her heart remained stubbornly weak, and it had actually grown slightly over the preceding months. Most kids with Pompe disease who can handle its primary treatment, Myozyme, get fully healthy, functional hearts in time, though their other physical attributes often never fully recover. In Ruth’s case she was able to move physically as if the Myozyme was working exceptionally well, but her heart was telling us it was not working at all.

Routine checkups can be so very non-routine

On September 18 we checked back into the hospital after a routine checkup told us her measurements for heart failure had suddenly gotten very bad. Within the day we were told she was going to have to be intubated, and that at her level of heart failure it was extremely risky and likely lethal. Within 5 hours one change led to my daughter sitting up on my lap and giggling as if nothing wrong had ever happened. By 3 am that night she’d woken up half the infant pod on the CVICU floor with her squeals of delight. Looking back now, I realize this was the moment Ruth proved she was just content to find what joy she could at any time. She’d had at least 2 near death experiences before her first birthday, and I think she knew at some level that she could feel terrible one minute and great the next, so why not just enjoy the good ones.
I wish I could tell you it was an amazing recovery after that. We tried many times to get her onto “normal” support, and we failed many times to do so. Her patience with hospital care basically became zilch; otherwise, however, she continued to be a very happy baby. She learned a great deal, started to sit up on her own for up to a minute at a time, and in general was a complete joy in our life. She became very sneaky. She would spend hours trying to pull bandages off, or pull out NG tubes, and do “fake” coughs to somehow get more attention. She did all of this in between the constant attention and support she received from her mother, her aunt, and me, not to mention a whole hospital of people trying to help us.

Christmas

Nevertheless her heart continued to be stubborn. We tried to get the board to consider a heart transplant, but as there were many complicating medical factors, which I choose not to go into, that could lead to the new heart being destroyed in a matter of months, they were obviously unwilling to take a heart away from another kid waiting for a transplant who would benefit. By Christmas time the side effects of heart failure led to her lungs filling up with fluid, and she became oxygen dependent. We slowly watched her fade away by inches, and we were convinced that we’d lost her, despite our constant efforts and the level of care she was receiving. A preliminary genetic test came back indicating she might have a chromosomal defect related to dilated cardiomyopathies. This could explain how her heart was so much weaker than is normally seen in Pompe patients, and how the Myozyme was not remodeling her heart. However, this along with several other factors made her case completely unique in the truest sense (she was the only child on record with her genetic variant of Pompe), and so the path became much murkier for us and for her team of doctors. It also probably meant that not only would we never be able to give her a “normal” life, we would be in for the fight of our lives to keep her alive. She could have had not one but two commonly fatal conditions with expected lifespans shorter than she had already lived.
Despite all of the bad news and the despair her parents were feeling, Ruthie decided it was time to get better. So much so that we steadily reduced support until we were able to get home January 18th, four months to the day after checking into the hospital this last time.

Joy and Heartache

My daughter was probably happier than all of us to be home. She was surrounded by those she loved the most and was no longer being poked and prodded by strangers. We began to believe again that she might yet surprise us and recover completely. Then one morning she became very clingy. I had to hold her that entire weekend, as she seemed unable to be away from me for any length of time. After what we had been through, I was completely content with the task of holding my daughter close, until Sunday afternoon, when she became very uncomfortable. I wrote in my journal that day that I was suddenly fearful she did not have nearly as long as we’d hoped, and I was afraid that this might be it. We talked to the on-call medical staff; her recent labs had been a bit elevated but otherwise in range for her. They were concerned, but they knew we didn’t want to go back into the hospital, since everyone knew we might never get back out. So we decided to make adjustments in her care to compensate (lay off some diuretics and add some pain meds) and stay home until her appointment the next morning.
Unfortunately, at that appointment her labs came back horrifyingly bad. Her kidneys were in the process of completely shutting down. We made one last ditch attempt to save her, checked her into the hospital, and tried IV therapy. You see, if you have an extremely bad heart and your kidneys shut down, you’re in a very bad place. Most of the interventions you’d use for one problem are blocked by the other. Over the next 16 hours my daughter degraded quickly. We got her into hospice care, and at 9 am on January 30, 2013, bathed in sunlight and surrounded by family, my daughter finally rested a rest more complete than her mother and I could ever give her.

A Lesson Of Resilience

It’s easy to see nothing but heartache and tragedy in Ruth’s story, and my humble writing skills and lack of eloquence do her an injustice. Despite this I hope you were able to see that no matter what happened to my daughter, she was happy every moment she could be. She got better every chance she got.
Through all of her tragedy and setbacks, my daughter played video games meant for 7 year olds on her iPad, got to see her first birthday, learned how to smile not once but twice, had favorite movies, danced to Miles Davis and Louis Armstrong, and screamed for joy an untold number of times. I’ve never loved anything more on this earth, and I’ve never been so proud of anyone as I am of my daughter. She had a mountain sized heart, physically and in spirit, and she never, ever gave up.
At this horrible time it’s easy to want to stop trying at all, but she was unrelenting in trying, no matter the difficulty. It’s easy to be sad all the time, but she was happy every chance she got. It’s so incredibly easy to not want to live anymore, but that would do the ultimate disservice to her repeated attempts to live.

Sunday, September 25, 2011

Rails 3.1 CI setup with Jenkins, Test::Unit & SimpleCov on OS X Lion.

I recently had to set up a build server for some Rails work I'm doing. Still wanting to support my other projects, I set up Jenkins. I ran into several issues.

Running Jenkins as a hidden user

First I noticed that Jenkins was running as the "daemon" user, which obviously wasn't going to work for my GitHub and RVM needs. So I did some googling and found some guides for getting Jenkins to run as a specific user (sourced from http://colonelpanic.net/2011/06/jenkins-on-mac-os-x-git-w-ssh-public-key/).
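The original command listing was lost somewhere along the way; based on the linked guide, it was roughly the following (a sketch, details may differ):

# create a real "jenkins" user for the service to run as
sudo dscl . -create /Users/jenkins
sudo dscl . -create /Users/jenkins UserShell /bin/bash
sudo dscl . -create /Users/jenkins NFSHomeDirectory /Users/Shared/Jenkins/Home
sudo dscl . -passwd /Users/jenkins $PASSWORD

Note: that really is a literal $PASSWORD above; dscl then gives you a prompt to enter the actual password. Next you'll need to stop the Jenkins service, edit the plist, and start the service back up.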
Your plist file should end up looking like this.
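The plist itself is also gone from the page; the relevant change is just making the daemon run as the new user, i.e. the stock /Library/LaunchDaemons/org.jenkins-ci.plist plus a UserName key, roughly:

<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.jenkins-ci</string>
    <!-- the key addition: run Jenkins as the jenkins user -->
    <key>UserName</key>
    <string>jenkins</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/java</string>
        <string>-jar</string>
        <string>/Applications/Jenkins/jenkins.war</string>
    </array>
</dict>
</plist>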


RVM issues

My next issue: despite what I'd read elsewhere, I was unable to get Jenkins to use the default Ruby provided by RVM. So I just pasted the commands I would run anyway into the "Execute Shell" build step.
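The original screenshot didn't survive; the build step amounted to something like this (paths and task names are illustrative):

#!/bin/bash -l
# load RVM so the job uses my Ruby instead of the system one
source "$HOME/.rvm/scripts/rvm"
rvm use default
bundle install
bundle exec rake ci:setup:testunit test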


Getting Jenkins to see tests

I've been using Test::Unit/MiniTest lately, just to stay consistent with my day to day work. However, I haven't found a way to get my tests to show up when using the "Execute Shell" task. I found a little gem called ci_reporter that exports to the standard JUnit format; unfortunately it doesn't work with MiniTest yet. That's OK, I haven't done anything that Test::Unit doesn't support so far, so I added the following to my Gemfile (note the part about test-unit 2.0):
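The snippet itself was lost; it presumably looked something like this (ci_reporter's Test::Unit hooks need the test-unit 2.x gem, and its rake tasks get pulled in via the Rakefile):

# Gemfile
group :test do
  gem 'test-unit', '~> 2.0'
  gem 'ci_reporter'
end

# Rakefile: provides the ci:setup:testunit task
require 'ci/reporter/rake/test_unit'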
Running "rake ci:setup:testunit test" should give you a bunch of xml files in tests/reports. Now we need to tell Jenkins where to find those reports so add a post build action to pick them up as junit reports.



Rcov reports

This was pretty easy.
  1. Install Jenkins plugin for RCov (it's in the plugin list in the admin section).
  2. add simplecov and simplecov-rcov to your Gemfile.
  3. configure Jenkins rcov plugin to look in coverage/rcov
  4. add the following 4 lines to the TOP of your test/test_helper.rb file:
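Those four lines were also lost, but with simplecov-rcov they are almost certainly the standard setup:

require 'simplecov'
require 'simplecov-rcov'
SimpleCov.formatter = SimpleCov::Formatter::RcovFormatter
SimpleCov.start 'rails'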


In closing

This took a fair amount of time, but the end result was quite satisfying. I now have CI with tests, test coverage, RVM to target different versions of Ruby, and Bundler to make sure my gem environment is sane.