Saturday, October 6, 2012

Zoom and center with d3

I've been spending lots of time with Data Driven Documents (d3) for building data visualizations.  At its simplest, it's a bare bones library for binding data to SVG to create visualizations.  For anyone who has used it, the warnings are usually the same, "It's a steep learning curve".  The funny thing is that d3 is fairly simple.  The problem is that you need to be fairly intimate with drawing and transforming shapes, lines and nodes using SVG.  That was the most difficult part for me and it's nothing that d3 has brought to the table.  If your work already involves things like transformation matrices, you'd be in familiar territory and d3 would be a pushover.

One of the first things I needed to do was select a circle and have it "fit to screen" which was essentially zooming and centering the circle.  There were a couple gotchas that I had to hammer through before I figured it out.

  1. Scale the view THEN translate.  It fits my mental model better to make the view the size I need via scale and then to move around it with a translation.  I'm sure you could translate first and then scale if you want.
  2. Translations performed on a scaled shape do not need to be scaled themselves.  So, if I scale a circle with a diameter of 40 by 2 then now the coordinate system has doubled and its diameter is 80.  If I wanted to translate the shape by (10,20) I wouldn't have to scale it first and make it (20, 40).
  3. The top left of your viewport is your origin (0,0) and its positive expansion is down and right.
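
To make the first two points concrete, here's a minimal sketch of my own (plain JavaScript, no d3 needed) of what happens to a single point under each ordering.  It assumes the SVG rule that in a transform list like "scale(2) translate(10,20)" the rightmost transform is applied to the point first.  The function names are made up for illustration.

```javascript
// Hypothetical helpers, not part of d3 or SVG.

// Equivalent of transform="scale(s) translate(tx,ty)":
// the point is translated first, then the result is scaled.
function scaleThenTranslate(p, s, tx, ty) {
  return { x: s * (p.x + tx), y: s * (p.y + ty) };
}

// Equivalent of transform="translate(tx,ty) scale(s)":
// the point is scaled first, then the result is translated.
function translateThenScale(p, s, tx, ty) {
  return { x: s * p.x + tx, y: s * p.y + ty };
}

var origin = { x: 0, y: 0 };
var a = scaleThenTranslate(origin, 2, 10, 20); // lands at (20, 40) on screen
var b = translateThenScale(origin, 2, 10, 20); // lands at (10, 20) on screen
```

With the scale written first, the translation of (10,20) is given in unscaled units and the scale doubles it on screen for you, which is why you don't multiply it yourself.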
Here's a quick example I worked up:

First, we need to know how much we need to scale the view.
var scale = height / (radius * 2);
We divide the height of the viewport by the diameter of the circle so we know how many times we need to magnify the circle to make it fit the screen.  In the case of my example, we're scaling based on the height of the viewport since it's shorter than the width.  We can only zoom as far as the shortest dimension so as not to cut off the shape.
var scaledCenterX = (width / scale) / 2;
var scaledCenterY = (height / scale) / 2;
When you scale, the view expands beyond the visible boundaries, downwards and to the right.  If you only scaled, what you'd see in the viewport would be the top left of the newly scaled view.  You need to find the X and Y coordinates for the center of your smaller "window" since that's where you want to put your zoomed in shape.
var x = -(node.x - scaledCenterX);
var y = -(node.y - scaledCenterY);

var transform = "scale(" + scale + ")";
transform += " translate(" + x + "," + y + ")";

vis.transition().duration(500).attr("transform", transform);
The rest is simple.  Just subtract the shape's position from the center of your window to get your translation.  Please note again, the scaling comes before the translation.
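
If you want all of that in one place, here's a rough sketch that bundles the steps above into a single function.  The function name and its arguments are my own invention for illustration, and I've swapped in Math.min so it picks the shorter dimension for you instead of assuming the height is the shorter one as in my example.

```javascript
// Hypothetical helper: builds the "fit to screen" transform string for a circle.
// width/height are the viewport size, node is {x, y} (the circle's center),
// and radius is the circle's radius, all in unscaled units.
function fitCircleTransform(width, height, node, radius) {
  // Scale by the shorter viewport dimension so the circle isn't cut off.
  var scale = Math.min(width, height) / (radius * 2);

  // Center of the visible window, measured in the scaled coordinate system.
  var scaledCenterX = (width / scale) / 2;
  var scaledCenterY = (height / scale) / 2;

  // Move the circle's center onto the window's center.
  var x = scaledCenterX - node.x;
  var y = scaledCenterY - node.y;

  // Scale comes before translate, as described above.
  return "scale(" + scale + ") translate(" + x + "," + y + ")";
}

var transform = fitCircleTransform(800, 400, { x: 100, y: 50 }, 20);
```

For an 800x400 viewport and a circle of radius 20 centered at (100,50), this produces "scale(10) translate(-60,-30)": the circle's center moves to (40,20) and the scale of 10 then puts it at (400,200), the middle of the viewport.  You'd hand the string to the same vis.transition().duration(500).attr("transform", transform) call as before.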

Tuesday, September 13, 2011

Stop thinking Agile, and start thinking agile

There's been plenty of sentiment about Agile (capital A as my buddy Ed pointed out to me) being dead or that it's a failure or that it's in some state of decay.  In a sense, I completely understand why people would think that.  It may also be some of what is holding back the free thinking in the process of software development.  There are few big, new ideas (outside of lean but then that's not new) that are really providing direction for the software development process.

From my point of view, this wound is self-inflicted.  Agile has become the hammer to nail all process problems.  It was championed and supported by some really smart guys in the software dev space.  Popularity grew, Agile became all the rage and, as with all fads in the space, it became a personal point of pride to declare that you or your company "went Agile".  There were books written about it that contained all the activities and routines you must partake in in order to be considered Agile.  Training classes and certifications inevitably popped up that claimed that with 3 days of training you too could be a scrum master and Agile wizard.  Before anyone knew it, Agile was a defined creature with a strict set of rules (standup meetings in the AM, planning poker, TDD, pair programming, retrospectives, etc.).  It was no longer the organic process that it was supposed to be.

I certainly wasn't there in the early days of Agile but my company did "go Agile" almost 3 years ago.  With it, I've found some fantastic practices that are now part of my software development six-demon bag and I've chronicled many of them on this blog.  It's truly been a great addition to my craft and I would certainly advocate others to try it as well.  GO AGILE!  You may thank me later.

So what's the problem here?  I just said Agile prevents free thinking but then I advocate that you try it.  The problem isn't the Agile movement, it's how it is adopted.  People want to add it to their process without appreciating where it came from and why your company should care to use it.  It's the same behavior I allude to in my "The Goal" entry.  Blind adoption of the Agile process isn't going to be to your benefit.

When I talk to people about a good software development practice, I no longer think in terms of the canonical Agile.  Every company, every product, every team has its own unique environment and set of problems.  It would be a mistake to not take those things into account before you start throwing the Agile bible at everyone.  My goal is to streamline the process and use the best tools I can to reach the coveted hyper productivity.  Agile is comprised of many different practices and procedures.  Take what you need from it and recognize what works best for you.

Remember, Agile wasn't created from thin air.  It was coined by a bunch of guys that saw waste, friction and deficiencies in the process.  None of this is unfamiliar to any of us.  These are things we all see in our days, aren't they?  "Wow, that meeting was useless."  "The number of regressions grows on each release!"  "We build what they ask for but then they never like it!"  "These manual deployments take hours and they're error prone!"

LISTEN to those thoughts and react to them.  As a developer I know how easy it is to identify what is going wrong and grouse about it.  It feels great to vent.  If I asked you what pissed you off, you'd ramble about it for hours, eventually repeating yourself but then you'd dive back into it anyways because you're that passionate about it.  Take that energy and think of how you would fix things.  If that meeting was useless, how would you make it better?  Should you even have been there?  Were there too many people talking at once?  Was there one loudmouth that dominated the whole discussion?

Don't accept this reality.  Change it.  Make it better.  Be organic in your process.  Evolve.  Mature.  You don't need a certification to understand where things are going wrong for you.  Fix it.  You have manual builds today.  Automate them tomorrow.  You have 6 month projects with unhappy customers.  Decompose them into 1 month deliverables to get quicker feedback.  You have too many bugs.  Increase test coverage and improve your QA.  You have a schizophrenic product organization that pulls you in 8 directions?  Stop the assembly line and demand consensus and priority.

That's all Agile ever was.

...but before you go, buy a book, ask questions on forums and learn about all the wonderful practices that fall under the Agile umbrella because we've learned a lot in the last decade about good processes.

Sunday, July 31, 2011

From a Java/C# world to Ruby's: Writing modular code

For the past handful of months I've been dabbling in Rails in my free time to play with some projects.  It's a fun platform for creating applications and I (as has been mentioned) dig Ruby.  I'm starting to actually build up enough code to warrant a little more discipline than what I've had so far.  For the most part, I've been plowing through just trying to experience all the tools and tricks the platform has to offer.  I couldn't really be effective otherwise if I hadn't given myself any time to play in the sandbox.

One thing I'm trying to learn is how to modularize code in Ruby.  How and where do you draw lines of responsibilities?  Where should code live?  Should I use a module or should I use a class?  There hasn't been a straightforward answer.  Ruby being dynamic, with open classes and the ability to intercept/modify just about any behavior in the language, feels like my first year away at college (well, not really but you catch my drift).  While it's a lot of fun, you have to make sure you don't run wild with your freedoms.  There's a limit and you have to govern yourself.

Again, where are the lines drawn?  What are the right things to do?  It's not apparent.  Look at models as they're known in Rails.  Models use the Active Record pattern where each instance constitutes a record in a persistent store.  So what is the typical Rails model responsible for?  Reading and writing itself from a persistent store at the least (along with any of its child models).  This alone will make members of the CQRS Illuminati grow faint.  But it doesn't stop there.  Also, you need to perform your validation there which makes sense for any stateful, data-driven creature to do.  

At first, I let everything pile into my models.  If I had 3 ways I wanted to query things from the DB?  Oh, hey, I need to query my associations too, what should I do with all that logic?  Put it in the model.  If I had extended validations dependent on certain conditions?  It undoubtedly went in the model.  If I added an authentication framework that needed to decorate the client classes?  Hey, I'll just add it to the model!

It never felt right and as I piled more functionality on, things became especially itchy.  Instead of trying to foresee how this would all pan out and applying some half-brained pattern of my own, I just went for it and made things a sloppy mess.  I really wanted to see what the wrong way to do things was so that the answer would be more apparent.  Just like my early days when I realized how tests benefited my code, I could learn from it.  Why?  Because Ruby isn't Java or C# and I'm a Ruby part-timer.  I've seen and read about attempts to apply patterns from those languages in places where they're inappropriate.  I decided to let mother nature dictate how I should proceed.

In the case of the rogue models, I found what makes me most comfortable.  First, anything related to the data and validation of a model stays in its class definition.  Second, any associations defined stay in the model's class definition.  Third, anything that demonstrates how that model behaves in its domain should stick around (if possible).  I want to be able to see and quickly digest what the model is and what it's related to.  

Last, everything else, provided that it's a significant amount of code, is placed into modules.  Modules allow me to create meaningful, cohesive groups of methods and constants.  For example, the code to query the DB (in any number of ways) is pulled out and placed in some sort of data access module.  Modules, while not being the same as classes, act very much like a class in most senses.  What I don't have a feel for is how many includes is too many includes.  The models get this very facade-like feeling.  They do a lot.  It's still something I haven't quite gotten used to yet.

So, the short of the long, modules are nifty and I can draw parallels with how I used interfaces (and ultimately their implementations) in Java/C#.  It's the same song, just a different dance.  Use them to decompose the larger objects and group logically related functionality.  

Tuesday, July 26, 2011

MongoDB repair on bad shutdown

I've been working on a little side project and I've been using MongoDB.  One insanely annoying thing I've run into a couple times is that if Mongo doesn't shut down cleanly then you have to repair it.  The next time you try to start it up you get errors.  You'd hope that these things would happen automagically but oh well...I'm just starting to learn it so I may be missing out on the larger reason.

You may end up seeing something in the console like this when you try to connect to your local server,

:~$ mongo
MongoDB shell version: 1.8.2
connecting to: test
Tue Jul 26 20:45:17 Error: couldn't connect to server shell/mongo.js:79
exception: connect failed
or this (this one was just me putzing around trying to get it going),

:~$ sudo mongod
mongod --help for help and startup options
Tue Jul 26 20:47:35 [initandlisten] MongoDB starting : pid=3358 port=27017 dbpath=/data/db/ 32-bit
** NOTE: when using MongoDB 32 bit, you are limited to about 2 gigabytes of data
**       see
**       with --dur, the limit is lower
Tue Jul 26 20:47:35 [initandlisten] db version v1.8.2, pdfile version 4.5
Tue Jul 26 20:47:35 [initandlisten] git version: 433bbaa14aaba6860da15bd4de8edf600f56501b
Tue Jul 26 20:47:35 [initandlisten] build sys info: Linux #1 SMP Fri Feb 15 12:39:36 EST 2008 i686 BOOST_LIB_VERSION=1_37
Tue Jul 26 20:47:35 [initandlisten] exception in initAndListen std::exception: dbpath (/data/db/) does not exist, terminating
Tue Jul 26 20:47:35 dbexit:
Tue Jul 26 20:47:35 [initandlisten] shutdown: going to close listening sockets...
Tue Jul 26 20:47:35 [initandlisten] shutdown: going to flush diaglog...
Tue Jul 26 20:47:35 [initandlisten] shutdown: going to close sockets...
Tue Jul 26 20:47:35 [initandlisten] shutdown: waiting for fs preallocator...
Tue Jul 26 20:47:35 [initandlisten] shutdown: closing all files...
Tue Jul 26 20:47:35 closeAllFiles() finished
Tue Jul 26 20:47:35 dbexit: really exiting now
I found the answer in the comments of this blog.  Run the following commands and all will be well,

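# Remove the stale lock file left behind by the unclean shutdown
sudo rm /var/lib/mongodb/mongod.lock
# Make sure the mongodb user owns everything in the data directory
sudo chown -R mongodb:mongodb /var/lib/mongodb/
# Run the repair as the mongodb user, using the installed config file
sudo -u mongodb mongod -f /etc/mongodb.conf --repair
# Start the service back up
sudo service mongodb start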

Monday, June 20, 2011

The Goal

The only right way to write code is to use TDD!

Scrum should only take 5 minutes!!

You should ALWAYS use DDD to develop your model!!!

NOSQL is the only way!  Down with SQL!

That's the sentiment you can find in a lot of arguments between developers.  It's too black and white, too absolute.  The issue is that the focus leaves the goal.  It artificially limits the solutions you can come up with.

In the case of TDD, you can most certainly write code before tests.  What do all software development practices boil down to?  Building easily maintained software.  It isn't about creating beautiful code, it isn't about designing a family of classes, it isn't about creating a model that sounds like a spoken language.  Building easily maintained software can be done in any number of ways.  Something like TDD helps to reinforce proven practices for maintainable software but it's not the only solution.  I'm a TDD practitioner but you know what?  I've got no beef with writing code first.  You don't always need to TDD.

In the case of agile development, for anyone adopting agile, there's a LOT of focus on the religious practices that come with it.  You will, for a long time, follow these practices blindly, even when they don't improve your process and become a nuisance.  The answer is that not all practices may make sense.  Take a step back and look at what they are intending to achieve.  A 5 minute scrum doesn't mean you're doing a good scrum.  You may find that, for your group, having something like 8-10 minutes is more comfortable.  Don't hate yourself because you're not following it to a T.  You've done exactly what these practices were intended to do: maximize your productivity.  That's the goal of this whole thing.

In the case of NOSQL, it's a poor musician who blames his instrument.  People finally realized that they were using SQL as the hammer to nail everyone's persistence problems.  With the host of new applications we build, SQL isn't always the answer.  Maybe it's a document database.  Maybe it's a key-value store.  Maybe you need something that natively supports object graphs.  Instead of recognizing the fact that applications and their persistence mechanisms come in all shapes and sizes, the sentiment has become that SQL is the scourge of software development.  We have ourselves to blame for that one, trying to force an object model into a system that poorly represented it.  Somehow this is SQL's fault?  We stopped questioning why we did the things we did.  Take a step back, ask yourself what you're really trying to do and then it becomes a no-brainer.  That's what gave birth to all these new forms of persistence.

For some, it may take a while to see it but you'll get there.  Don't fall into the trap of blindly doing things without re-evaluating constantly.  There's more than one way to do something and you shouldn't let software purity or zealotry prevent you from considering it.  It leads to the innovations that improve our lives as developers and the people who use our software.

Tuesday, March 8, 2011

Thoughts on the Test Pyramid

About a half year ago, one of my coworkers sparked a discussion on the test pyramid.  Specifically, he referenced a link and asked our thoughts on it.  Since I like to geek out, I had to respond.  I actually spent some time creating what I thought was a reasonable analogy so I figured I'd share it.  I'm also offering this now as I have another post I really want to discuss in regards to the boundaries in testing between a QA group and developers.  Is there one?  Should there be one?  How do we deal with test overlap?  It's something I've been mulling around and I need to think out loud.

So, here's how the original thread went,

Some interesting posts on testing pyramids and automation. I'm a proponent of using this pyramid metaphor for organizing our test strategy. Thoughts?

It's an effective way to describe a proven testing strategy. The simple law here is that tests are all about feedback and how valuable that feedback is. The value is tied directly to their maintenance cost. If you look at it this way, you stop thinking in terms of layers. For example, brittle unit tests are high cost and thus less valuable than verifiable, repeatable, solid unit tests. This is the mindset all test implementers need to have in order to build the best test suite.  
In terms of the test pyramid, we can certainly assume that there are responsibly written tests. What's illustrated there is how the test maintenance cost rises in parallel to the growing level of component interaction and complexity. The more complex the interactions in the test then the more likely it is to break. Broken tests need to be maintained. Tests of complex interactions take longer to repair.  
At each level of the pyramid, we have a set budget on how much we can spend on maintenance. Unit tests are cheap therefore we buy lots. Integration tests are a little more expensive so we can't afford as much. Functional tests carry the heaviest price tag so we just buy a handful.  
Tests at all layers are valuable and provide a distinctly different type of feedback. That's why it isn't a good idea to simply forego testing a layer at all. Having a codebase with only unit tests doesn't prove that the system actually runs.  
There are a few more factors in what defines the value of a test but I'm glossing over that by calling them "responsible" tests. If you had two integration tests that exercised the same grouping of components then you should probably entertain getting rid of one. We want clean, clear feedback and having multiple test breaks for a single issue will obscure the root of the problem. Given that we're trying to be as frugal as possible we'd probably identify tests like that early on.  
The one caveat/tangent I might mention is that the pyramid metaphor (and how I commonly see it described) doesn't account for things like capacity testing: the throughput of the application, the transactions/second it can service, its memory footprint and resource consumption and other issues that can potentially derail a release.  
Application deployment is an implicit test to make sure that you can actually install and run the application. There would never be an NUnit test for that but there is value in knowing that the application can be reliably deployed. We should try doing that earlier to get that feedback when the information is more fresh in our minds. 
And that was about it.  I had a few more words but they dealt with the internals of our specific situation.  Not worth repeating for obvious reasons.

Sunday, February 27, 2011

On the fence about Java, open source is the shiznit and .NET build automation sucks

I moved to a new team more than half a year ago and at first things were rough.  The team itself was fine but I was giving up my precious C# for Java and I was entering a whole universe of new software.  My technology stack changed from Microsoft branded IDEs to that of the open source crowd.  I was out of my element but I thought, "Hey, time to try something new!".

Let's just get this out of the way; as a language, Java is not aging well.  Certainly not in comparison to C# which is getting updated more frequently with a lot better language niceties.  When .NET rev'ed itself up to v3 it brought along lambdas and extension methods which make for some really nice, clean code.  These are noticeably absent from Java.  Even the simple act of passing around functions is nowhere to be seen in Java.  I have to create a full-blown class that implements one method.  It feels like Java is more worried about being pulled from the holy scripture of Design Patterns than it is about making a language that is enjoyable.  Ah, you can pass an anonymous class but it's still waaaay too verbose.

But that's the thing with Java.  It seems like it tries to be as horrifically verbose as possible.  One of the common Java practices is to mark everything final.  You see final everywhere.  Final member variables, final arguments, final variable declarations, final classes, final final final.  You'd wish that they marked things NON-final because there are far fewer instances of that.  Better yet, choose a language where variables are immutable by default.  You end up with a lot of chatty code, which is unfortunate since it's already a static language and you don't want to be known for pushing the envelope on verbosity.  Here's my favorite quote about it,
Whenever I write code in Java I feel like I'm filling out endless forms in triplicate. 
-- Joe Marshall
I can forgive these sins because I've finally been exposed to open source software.  Software built by the community.  For us, by us.  Open source software has a distinctly different flavor to Microsoft software.  Tools load faster.  There's less fluff.  It's more of what you need and less of what you don't need.  If something has a bug, it's fixed quickly and you can get the patch as soon as someone posts it.  You don't have to live with the same crap for years because it just doesn't show up as a priority on Microsoft's radar.  You know, like the "add reference dialog" that has been the bane of our existence since early 2000 that they only recently addressed (sort of).

I was introduced to Maven which is build and dependency management for Java.  For anyone who cares about build automation (and that had better be ALL of you), Maven beats the ever living snot out of anything you can find in .NET.  .NET's build automation story is absolutely abysmal in comparison.  I know some people think Maven is kinda crappy but it's still light years beyond what the .NET crowd has going on.  The only thing that seems to be standardized in .NET is using NAnt (and possibly but incredibly doubtfully, Albacore/Rake).  Here's a simple scenario that is more difficult to do in .NET,

  1. Gather dependencies
  2. Compile code
  3. Run Unit tests
  4. Run Integration tests
  5. Deploy the application
  6. Run smoke tests

Step 1 was always something that was custom to whatever shop you worked at.  This is where someone had to copy third party libraries somewhere your build process could access them (probably a lame network share).  This is a problem that has been solved for years with things like the Maven repository in Java and the Gems repository in Ruby.  Maven just does this out of the box.  When you run a build, it will retrieve your dependencies prior to compilation.  There's a project for .NET, NuGet, that is supposed to alleviate this problem and I admit I haven't tried using it.  I can only hope it gets integrated into another project like Maven for the .NET community.

Steps 2-3 are straightforward and (usually) don't introduce too much pain.  Step 4 usually requires partially constructing the environment of your application which includes databases, web applications and other such "heavy" dependencies.  This is where the open source community trounces .NET.  When I tried automating testing of web applications it required that IIS be installed, that I had the correct users/authentication, that I figure out the mystery of the Web Deploy tools that are anything but intuitive and extremely prejudiced to IIS 6, blah blah blah.  It was a process that required massive manual intervention and implementation.  It's brittle and no company wants to pay for the time it would take to implement that.

With Maven, I just reference the Tomcat web server plugin, tell it to start a web application on the fly, install it, and run it.  It's disgustingly simple to do and it's actually running a very good analog of my production environment.  You don't get Tomcat Home Edition or Tomcat Developer Edition or some other lame neutered version of the web server product (I'm looking at you Cassini web server).  You get the real deal and you get it with only a handful of XML declarations.

Step 5 in .NET has no real public support or method.  It sucks.  Using the same Maven Tomcat plugin I mentioned above, deploying your web application is a breeze.

I don't think I care to carry on describing how immature the state of build automation is in .NET.  Suffice it to say, we .NET devs should stop waiting for Microsoft to make it for us.  It's a glacial development.  There's too much to be learned from the open source software projects.  First lesson, not all .NET projects need to be monetized.  Yes, you can be compensated for your hard work but there are tons of opportunities to do that outside of selling a product.  Embrace your community, have them push the feature list.  We can go from there.

So while I get a lot better developer tools in open source software, it doesn't come without a price.  I'm not impressed with the documentation and it's difficult to find the answers you need.  Unless you subscribe to a project's IRC channel or mailing list, you aren't going to find much help from available online content.  Sure, it can get you going but once the rubber meets the road you will inevitably hit an issue that doesn't seem to be covered in anything you can find with conventional Google searches.  I've come to loathe all the websites that index mailing lists, JIRA and Java documentation only to provide you with advertisement-laden dead ends.  You'll have to subscribe to a mailing list, post your question and pray that somebody wants to respond to a question they've probably seen a dozen times before.

The open source world isn't perfect but I love the tooling and the low barrier for entry.  If you've been living in the .NET bubble for some time, take a break and try something new.  The worst case is that you recognize ways to make your life easier.