Two weeks ago, the number one user request for Kicksend had nothing to do with filesharing (because we were already awesome at that). It was for clustered email notifications. Kicksend users often send files in small batches separated by short intervals. For example, they drag in a batch of photos from iPhoto, select the next batch, and drag those in a few moments later. The first round of email notifications for our MVP just sent out an email for each file, which, as you can imagine, was pretty annoying if someone sent you twenty files at once. Time for an upgrade.
What we really needed was a way to group file upload events together so that we could send a digest email instead of multiple separate emails. Since we were a weeknight-only startup with limited time and resources, we needed to do it in a way that was simple, quick and scalable.
We were already using Chris Wanstrath’s most excellent Resque for our background jobs, including sending out emails with resque_mailer. Ideally, any solution we came up with would leverage the existing background job stack instead of adding a ton more dependencies.
Thinking it through on paper, we got a basic flow down that solved 80% of our problem. First, we needed the following pieces:
- The ability to schedule Resque jobs to be run in the future
- A special notification event class to handle the clustering logic
- A clean way to update scheduled notifications with new information
We found resque_scheduler to be perfect: its Resque.enqueue_at method schedules Resque jobs to be run in the future (we believe DJ has the same feature), and it also provides a way to remove already scheduled jobs. Next, we created a NotificationCluster class that encapsulates all of the logic required to create, group and schedule email notifications. Here’s what that class looks like:
Now, let’s trace through a typical ‘event’. Bear with us here. When the user drags in the first file, a new NotificationCluster object A is created in the data store and its email job is scheduled to be sent out in CLUSTER_TIMEOUT seconds. Every subsequent file that’s uploaded within CLUSTER_TIMEOUT seconds of the previous one updates object A by appending the file id to A’s serialized associated_files array. Along with the append, it removes any existing job scheduled for A and creates a new email-sending job scheduled CLUSTER_TIMEOUT seconds in the future.
This mechanism allows for rolling events. When the user is done uploading files and a full CLUSTER_TIMEOUT period has passed without any new files, A’s final email is sent out to the recipient with a list of all the files (from the serialized associated_files array) that were uploaded.
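The rolling mechanism described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the actual Kicksend class: FakeScheduler is a hypothetical in-memory stand-in whose enqueue_at and remove_delayed methods mirror the names of the resque_scheduler calls, ClusterNotifier and its file_uploaded method are made-up names, the CLUSTER_TIMEOUT value is arbitrary, and timestamps are plain integers so the example runs without Redis. In a real Resque setup the job would likely be a class and the argument a cluster id.

```ruby
CLUSTER_TIMEOUT = 60 # seconds; arbitrary value chosen for this sketch

# Hypothetical in-memory stand-in for the scheduler (not resque_scheduler itself).
class FakeScheduler
  attr_reader :jobs

  def initialize
    @jobs = [] # each entry is [run_at, job, args]
  end

  # Schedule a job to run at the given timestamp.
  def enqueue_at(run_at, job, *args)
    @jobs << [run_at, job, args]
  end

  # Remove every scheduled entry for this job with these arguments.
  def remove_delayed(job, *args)
    @jobs.reject! { |_, j, a| j == job && a == args }
  end

  # Run and remove every job due at or before `now`; returns their results.
  def run_due(now)
    due, @jobs = @jobs.partition { |run_at, _, _| run_at <= now }
    due.map { |_, job, args| job.call(*args) }
  end
end

# One open cluster of uploads destined for a single recipient.
class NotificationCluster
  attr_reader :recipient, :associated_files

  def initialize(recipient)
    @recipient = recipient
    @associated_files = []
  end
end

# Hypothetical coordinator: appends to the open cluster and resets its timer.
class ClusterNotifier
  def initialize(scheduler)
    @scheduler = scheduler
    @clusters = {} # open clusters keyed by recipient
    # The "email job": closes the cluster and returns the digest text.
    @send_digest = lambda do |recipient|
      cluster = @clusters.delete(recipient)
      "To #{recipient}: #{cluster.associated_files.join(', ')}"
    end
  end

  def file_uploaded(recipient, file_id, now)
    cluster = (@clusters[recipient] ||= NotificationCluster.new(recipient))
    cluster.associated_files << file_id
    # Drop the pending digest, then schedule a fresh one CLUSTER_TIMEOUT away.
    @scheduler.remove_delayed(@send_digest, recipient)
    @scheduler.enqueue_at(now + CLUSTER_TIMEOUT, @send_digest, recipient)
  end
end

scheduler = FakeScheduler.new
notifier = ClusterNotifier.new(scheduler)
notifier.file_uploaded("bob@example.com", "file-1", 0)
notifier.file_uploaded("bob@example.com", "file-2", 30) # timer resets to t=90
early = scheduler.run_due(60)  # nothing is due yet
digest = scheduler.run_due(90) # a single digest listing both files
```

Each upload pushes the send time back, so a burst of uploads collapses into one email that goes out only after the sender has been quiet for a full CLUSTER_TIMEOUT.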
The solution we’ve outlined is a very simple way to do clustered events without having to resort to K-Means clustering of timestamps, which would definitely have taken us a lot longer than the hour and a half it took to implement and test this. When time and money are limited, resourceful solutions built on pre-existing, well-tested libraries that solve 80% of the problem trump hours spent coding up new solutions to get the remaining 20%.
Ultimately, all of this engineering just comes down to making your users happy. People have a very low tolerance for badly written emails. Doing whatever you can to convert annoying emails into useful ones is, in our opinion, totally worth it.
Kicksend is effortless realtime filesharing with your family and friends.