Software Engineer and Startup Founder
Bio for technologists: Software engineer with a broad background in languages, toolkits, and platforms. Special depth of knowledge in Java, Python, JavaScript and C. The main enemy of my productivity is multi-tasking; my tools for fighting the enemy are GTD and Agile project management. My allies in productivity are dynamic languages, frameworks, idioms, and UNIX tools. My favorite design pattern is "Interpreter." I believe in thinking before coding. You might call it thought-driven development.
Bio for mere mortals: Software engineer with over a decade of experience. CTO and President at Parse.ly, a service that connects users with the content they'll love through personalization and recommendations. Also CEO at Aleph Point, a small software consultancy with a focus on large-scale data processing applications (leveraging information retrieval or natural language processing techniques) and web application development. Specialty in the rapid development of prototypes. Prior, was a technical lead and agile project manager for a small, super-bright team within a top-tier investment bank.
See My LinkedIn profile for more information.
It has become increasingly common for technology companies to run as Fully Distributed teams. That is, teams that collaborate primarily over the web rather than using informal, face-to-face communication as the main means of collaborating.
This has only become viable recently due to a mix of factors, including:
Thanks to these factors, we can now run Fully Distributed teams without a loss in general productivity for many (though not all) roles.
In my mind, there are three models for scaling the number of employees in a growing company in the wild today. These are:
Choice 1 is the most “traditional” scaling approach. The theory behind this approach is that face-to-face communication is the most efficient collaboration method. Nice offices are the best way to build team camaraderie and attract the best talent. It can be tempting when growing to switch from a Vertically Scaled team to a Horizontally Scaled one, by simply adding a second office location.
Choice 2 is typically how small, colocated teams end up becoming “semi distributed” teams via multiple offices connected via web-based collaboration tools. In other words, some companies that are going through growing pains under a Vertical model may switch to Horizontal as a coping mechanism. If there is a talent shortage in a Vertical team’s original headquarters location, or if enough office space cannot be procured to sustain employee growth, Choice 2 may be seen as the only option on the table.
Choice 3 is probably the rarest among growing technology companies today, but is, as I mentioned, increasingly common. In Fully Distributed teams, there is no real “headquarters” or “central office”. All communication is digital-preferred, which means employees can work from anywhere and not be fearful about out-of-band communication. Typically, employees will set up home offices or work out of coworking spaces in their home area. Occasionally, the team will still have a “headquarters” office, though this office will often be reserved for things like management, sales, marketing, business development, or other operational roles that do not need to be scaled as heavily.
Choice 3 tends to work extremely well for technology-heavy teams due to a few factors:
Related Reading
Github’s fully distributed team is described well in this GigaOM piece, Tales from the Trenches: GitHub. Zach Holman’s presentation, How Github Works, also provides an insider’s take. Github is interesting because though there are 77 employees and the team operates in “fully distributed” mode, there are a significant number of engineers at the company’s headquarters in San Francisco (perhaps as many as 30 by some reports).
The CEO of Automattic, the company behind WordPress.com, describes 5 reasons why your team should be (fully) distributed. They also discussed their distributed culture and collaboration environment on their team blog. This AllThingsD article points out that the company grew to over 100 employees and over $45 million/year in revenue with no central office.
Basho, the creators of the distributed database Riak, are a fully distributed team of nearly 75 employees, millions per year in revenue. An excellent blog post on Basho’s blog dated last year describes their distributed culture. The person who wrote that post also provided a breakdown of headcount by job function: Engineering/Support: ~20, Executive: ~6, Sales/Marketing: ~7, Admin: 1.
37 Signals founder DHH wrote about the technology “talent shortage” in “Stop whining and start hiring remote workers”. In it, he indicated that 37 Signals’ distributed team includes employees from: Fenwick (Canada), Phoenix (Arizona), Caldwell (Idaho), Romiley (UK), Jefferson Hills (Pennsylvania), Ann Arbor (Michigan), Boulder (Colorado), and Tampa (Florida). Another great post from their blog talks about Equality and Remote Teams, noting a negative behavior that happens in Semi Distributed teams where remote workers are seen as “second-class” team members to those in the “primary” office, and what they did to eliminate this behavior.
Fully Distributed: is it fully viable?
I have heard many VCs and business school academics describe distributed teams as a big mistake. To me, this makes perfect sense. VCs and academics are themselves only in the early stages of seeing their own business models disrupted by distributed financing and distributed education. So long as there is still a strong premium placed on face-to-face interaction in their own fields, they will assume it holds true for other fields, as well.
Vertical/Horizontal Scaling are “proven” Web 1.0 techniques for growing companies (think of Google or Amazon), while Fully Distributed is as of yet unproven. VCs in particular like to remove whatever risk factors they can from deals — the Fully Distributed model seems like a risk with no corresponding upside, especially if capital is cheap and talent is widely available.
But perhaps the biggest reason for distrust of Fully Distributed teams is simply past (and out-dated) experience. VCs and business school academics are typically older. They have lived through an age where corporate America became obsessed with Information Technology and its hyped ability to virtualize and distribute all communications, even when Internet connections were slow/unreliable and collaboration software was beta-stage, at best. Thus, they associate digital collaboration tools with the slow/frustrating kludges of yesteryear. They have also not yet fully internalized the various ways in which Internet distributed models have beaten centralized/co-located models in advanced collaboration — some big examples here include Wikipedia and the Linux kernel.
Fully Distributed’s viability is a recent development — I suspect we only reached the threshold about 3 or 4 years ago. But believe me: we have reached it. It is now a fully viable model alongside the other two more traditional organization scaling models. And it will become increasingly common in the next few years as the tools get better and the option becomes accepted by the wider ecosystem.
Last week, I decided to give myself the present of a 512GB SSD drive, which was available at a nearly 25% discount on NewEgg for a limited time.
The price-per-gigabyte for SSDs has finally fallen to nearly $1/GB, and the rewrite cycle problems that used to afflict these drives are now becoming a non-issue with the Linux kernel’s TRIM implementation and the updated firmware on these drives.
So, I took the plunge. My main development workstation was a Thinkpad T400, maxed out to 8GB of RAM, and with dual 500GB platter drives (via Thinkpad’s excellent Ultrabay extension). I was running Ubuntu 10.10 for a long time. I timed the SSD purchase with the release of Ubuntu 12.04 LTS — 10.10 no longer being supported, I figured I’d do a clean install on the new SSD and clean up my development workstation for the first time in a couple of years.
A couple of things occurred to me in this process. First of all, since 2009, I have moved more and more of my data into cloud services. I have moved the lion’s share of my “business and personal documents”, including photos, into Dropbox with my 50GB account. I have moved my Music into Ubuntu One Cloud, due to its excellent streaming service for iPad, iPhone, and Android (which I also use from my Macbook Air via a Fluid App). And I have moved my truly old files and digital keepsakes into NAS drives that I host in my little server room at home.
One aspect that I didn’t move into the “cloud” was my custom UNIX configuration. But last year, I started using Github, so I took this transition as an opportunity to remedy this, by open sourcing my dotfiles on Github.
The only remaining “non-cloud” files are:
As a result of this, “migrating” to a clean install of Ubuntu 12.04 was actually pretty easy. I had to spend a Saturday getting the keyboard shortcuts and environment set up the way I like, and installing some workarounds for hardware-specific problems. For the most part, I was able to get “productive” very quickly.
I reflect on my new workstation and what is amazing to me is its speed and lightness. Thanks to the SSD and Ubuntu’s recent focus on performance, it boots in less than 10 seconds. Hibernating the machine with TuxOnIce takes less than 10 seconds, as well. Sleeping the machine takes 1-2 seconds.
The SSD is blazing fast. Even with all my personal files now consolidated in my home directory, running UNIX’s find command to enumerate every file in my home takes only a couple of seconds. Booting up my VMWare instances takes seconds, as well. Launching browsers, terminals, and vim sessions each take less than a second.
Further, in the last year, I have decided to drastically simplify my development environment. I have tossed aside all IDEs that I used to tout (such as Eclipse or WingIDE), and instead use plain old UNIX and a vim text editor. This software, designed to run fast on the machines available in the 80s and 90s, simply screams on my modern hardware. And thanks to the SSD, grep’ing through hundreds of megabytes of code and configuration can happen within seconds. The fancy search indexes that memory-hogging IDEs like Eclipse and PyCharm maintain start to seem clunky and antiquated vs. my good old UNIX tools from the 70s. The speed of our hardware has made the simplicity of our UNIX tools seem wise in their old age.
There is another opportunity here, too. The only pieces of my old workstation that are not reproduced here are the various databases and data services that I used to run to test the software I build on a daily basis. Systems like Postgres, Redis, Solr, and MongoDB.
But I’m realizing that thanks to my fast hard drive and excessive memory — along with significant strides in developer operations tooling, such as Chef and Vagrant — I can now make these systems self-contained and thus not suffer the overhead of running them on my main box all the time. So, I’m now taking the time to check out my code but run all of the dependent services inside Virtualbox VMs created by vagrant. I should have done this earlier; this is providing an excellent opportunity to do it right.
Ubuntu’s 12.04 release — featuring the controversial Unity interface — has also gotten me thinking about desktop environments in general. Despite all the flak the Ubuntu folks are getting from the community for this — much of it deserved — I think the general direction they have identified is a good one.
One aspect of Unity is that it is focused on making the window environment and filesystem access of your desktop as speedy and lightweight as we have become accustomed to with our web applications.
Unity’s Dash launcher and HUD display both leverage an important user interface concept that has always been underutilized on desktops but heavily exploited on the web: search.
It’s not perfect yet, but the concepts behind Zeitgeist, Lenses, and Scopes are some of the best I’ve seen in achieving web/desktop parity. The goal is the admirable thing: rather than guiding the users with interactive actions, let the user simply express what he or she wants. And then pull up the answer in less than a second. I am already brainstorming lenses and scopes that could help me during development — e.g. git commit log searches.
Thanks to virtualization, cheap memory, solid state disks, and cloud storage, my main development workstation is becoming notable for its speed and lightness. I really can’t imagine things being much faster or more optimized, though I am sure this trend will continue.
One thing’s for certain: it’s a great time to be a software engineer.
I read an excellent debrief on a startup’s experience with MongoDB, called “A Year with MongoDB”.
It was excellent due to its level of detail. Some of its points are important, particularly the global write lock and uncompressed field names, both issues that needlessly afflict large MongoDB clusters and will likely be fixed eventually.
However, it’s also pretty clear from this post that they were not using MongoDB in the best way. For example, in a small part of their criticism of “safe off by default”, they write:
We lost a sizable amount of data at Kiip for some time before realizing what was happening and using safe saves where they made sense (user accounts, billing, etc.).
You shouldn’t be storing user accounts and billing information in MongoDB. Perhaps MongoDB’s marketing made you believe you should store everything in MongoDB, but you should know better.
In addition to that data being highly relational, it also requires the transactional semantics present in mature relational databases. When I read “user accounts, billing” here, I cringed.
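To make “transactional semantics” concrete, here is a minimal sketch of the all-or-nothing behavior that billing data needs. It uses Python’s built-in sqlite3 module purely as a stand-in for a mature relational database such as Postgres; the accounts table, the transfer_credit function, and the sample balances are all hypothetical names invented for this illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " owner TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
        [("alice", 100), ("bob", 20)],
    )
    conn.commit()
    return conn


def transfer_credit(conn, from_owner, to_owner, amount):
    """Move `amount` between two accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failed debit has something to roll back.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
                (amount, to_owner),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (amount, from_owner),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was undone too.
        return False


def balances(conn):
    """Return {owner: balance} for every account."""
    return dict(conn.execute("SELECT owner, balance FROM accounts ORDER BY owner"))
```

If the debit would overdraw the sender, the credit that already ran is rolled back with it, so the books never show money appearing from nowhere. That guarantee is what you give up when this kind of data lives in a store without multi-statement transactions.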
Things that it makes total sense to use MongoDB for:
Analytics Systems: where server write throughput, client-side async (unsafe) upserts/inserts, and the atomic $inc operator become very valuable tools. See this post for one example.
Content Management Systems: Here, schema-free design, avoidance of joins, its query language, and support for arbitrary metadata become an excellent set of tradeoffs vs. tabular storage in an RDBMS. MongoDB’s website has some nice examples of uses in media.
Document Management Systems: MongoDB can be used to great success as the canonical store of documents which are then indexed in a full-text search engine like Solr. You can do this kind of storage in an RDBMS, but MongoDB has less administrative overhead, a simpler development workflow, and less impedance mismatch with document-based stores like Solr. Further, with GridFS, you can even use MongoDB as a store for actual files, and leverage MongoDB’s replica sets for spreading those files across machines.
So, when evaluating MongoDB for your project, look at these use cases and see if they match yours. Because these are some of MongoDB’s “sweet spots”, and where you will likely get the most benefit out of its design.
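For the analytics use case above, a rough sketch of what an $inc-based upsert might look like follows. The $inc operator is real MongoDB; everything else is an assumption made for illustration, including the one-document-per-URL-per-hour schema, the field names, and the build_pageview_upsert helper. The apply_inc function is only a tiny in-memory imitation of $inc so the resulting document shape can be seen without a running server.

```python
from datetime import datetime, timezone


def build_pageview_upsert(url, when):
    """Return (filter, update) documents that count one pageview.

    Hypothetical schema: one document per URL per hour, with a total
    counter plus per-minute counters nested under `by_minute`.
    """
    hour = when.replace(minute=0, second=0, microsecond=0)
    filter_doc = {"url": url, "hour": hour}
    update_doc = {"$inc": {"views": 1, "by_minute.%d" % when.minute: 1}}
    return filter_doc, update_doc


def apply_inc(doc, update_doc):
    """In-memory imitation of `$inc` with dotted paths (illustration only)."""
    for dotted_path, delta in update_doc["$inc"].items():
        *parents, leaf = dotted_path.split(".")
        target = doc
        for key in parents:
            target = target.setdefault(key, {})  # create nested dicts as needed
        target[leaf] = target.get(leaf, 0) + delta
    return doc
```

With PyMongo, the two documents would be passed along the lines of collection.update_one(filter_doc, update_doc, upsert=True). Because the increment happens on the server, many writers can bump the same counters without a read-modify-write cycle on the client, which is why this pattern suits write-heavy analytics.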
However, SQL databases were developed over the course of decades because of patterns of software data storage requirements. Therefore, before choosing MongoDB, you shouldn’t flush all of this industry knowledge and learning down the toilet. You need to ask yourself: Is my data relational? Can I benefit from transactional semantics? Can I benefit from on-the-fly data aggregation (SQL aggregates)?
Answered “yes” to these questions? Then, by all means, use a relational database. Just because a technology isn’t brand new doesn’t mean it isn’t right. You should also check out this video by Brandon Rhodes at PyCon to get a better appreciation of SQL databases: Flexing SQLAlchemy’s Relational Power.
Using multiple data stores is a reality of all large-scale technology companies. Pick the right tool for the right job. At my company, we use MongoDB, Postgres, Redis, and Solr — and we use them each on the part of our stack where we leverage their strengths and avoid their weaknesses.
The original article reads to me like someone who decided to store all of their canonical data for an e-commerce site in Solr, and then complained when they realized that re-indexing their documents takes a long time, index corruption occurs upon Solr/Lucene upgrades, or that referential integrity is not supported. Solr gives you excellent full-text search, and makes a lot of architectural trade-offs to achieve this. Such is the reality of technology tools. What, were you expecting Solr to make your coffee, too?
Likewise, MongoDB made a lot of architectural tradeoffs to achieve the goals it set out in its vision, as described in their Philosophy document.
It may be a cool technology, but no, it won’t make your coffee, too.
In the end, the author writes, “Over the past 6 months, we’ve scaled MongoDB by moving data off of it. [...] we looked at our data access patterns and chose the right tool for the job. For key-value data, we switched to Riak, which provides predictable read/write latencies and is completely horizontally scalable. For smaller sets of relational data where we wanted a rich query layer, we moved to PostgreSQL.”
That’s the best lesson. They ended up in the right place — storing/indexing their big, multi-form data in multiple, purpose-built data stores.
New Things and Terrible Ideas
Here’s a terrible idea: implementing full text search with MongoDB. Instead, you should use Solr, Sphinx, or ElasticSearch.
Here’s another terrible idea: implementing grouping and aggregation using MongoDB’s map/reduce support. Though this “works”, you should just use Postgres or any other database that supports these operations out of the box.
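As a small illustration of what “out of the box” means here, the sketch below groups and aggregates with a single SQL statement. It uses Python’s built-in sqlite3 module as a stand-in for Postgres, and the pageviews table, its columns, and the views_per_section helper are hypothetical names chosen for the example.

```python
import sqlite3


def views_per_section(rows):
    """Group (section, url, views) rows by section using plain SQL aggregates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pageviews (section TEXT, url TEXT, views INTEGER)")
    conn.executemany("INSERT INTO pageviews VALUES (?, ?, ?)", rows)
    query = (
        "SELECT section, COUNT(*) AS pages, SUM(views) AS total_views "
        "FROM pageviews GROUP BY section ORDER BY total_views DESC"
    )
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

The whole aggregation is one declarative query. There are no map and reduce functions to write, test, and keep in sync.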
Sometimes it’s good to share the worst resources on the web along with the best.
Any new, shiny technology brings with it a bunch of terrible ideas. I actually think it’s kind of funny to see the ways people on the web contort perfectly good technologies in terrible ways. For example, I regularly see people trying to implement relations in Solr, implement document storage in SQL, implement transactions with MongoDB… the list goes on and on.
In my work, we try to fit the right tool to the job. This can be challenging and lead to technology fragmentation, but I think it’s a reality one simply must navigate these days. It’s not good enough for a startup CTO to say “We’re a MongoDB company” or “We’re a Postgres company”, as if picking a data store is a cultural statement. You need to have reasons to pick one data storage approach over another.
MongoDB happens to be a really good fit for analytics applications. We’re not the only ones who think so: Chartbeat has also implemented their real-time analytics system on MongoDB, Gilt Group implemented theirs (Hummingbird) on it, and Square implemented their analytics system with it and open sourced some underlying functionality in Cube. There are lots of other examples.
I have heard that Cassandra might be an even better fit for analytics systems than MongoDB. It’s how Twitter implemented its own internal analytics (which it now provides to advertising customers) — but my team hasn’t had the cycles to evaluate it yet. Plus, we’re pretty happy with the architecture we landed on.
We still store customer data (e.g. names, billing addresses, credit card tokens, API keys, etc.) in Postgres. I love Postgres, especially for this kind of data. And there are times I wish I had some subset of my data in Postgres so I could use aggregates and views.
We use Redis for ephemeral, real-time data where we have to sustain higher write throughput than even MongoDB will muster. Having a data store that can automatically expire keys can simplify some use cases drastically. We also use Redis as a simple queuing mechanism in some systems.
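To show why expiring keys and a simple queue are handy primitives, here is a toy in-memory model of both ideas. It is not Redis and does not talk to Redis; it only imitates the semantics that Redis exposes through commands such as SETEX/EXPIRE (time-to-live on keys) and LPUSH/RPOP (a list used as a FIFO queue). The class names ExpiringStore and WorkQueue are made up for this sketch.

```python
import time
from collections import deque


class ExpiringStore:
    """Toy model of a key-value store whose keys vanish after a time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._data = {}      # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop the expired key on read
            return None
        return value


class WorkQueue:
    """FIFO queue: push on the left, pop from the right."""

    def __init__(self):
        self._items = deque()

    def push(self, item):
        self._items.appendleft(item)

    def pop(self):
        return self._items.pop() if self._items else None
```

The point of the expiry model is that the application never has to run its own cleanup job for stale real-time data: you write a key with a lifetime and forget about it.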
Finally, we use Solr for search — because Solr is awesome at it and we leverage features like TermVectorComponent, MoreLikeThisHandler, and boolean query syntax in a lot of places.
You could implement key expiration or queuing in MongoDB, you could implement full text search in Postgres, you could implement relational data in Redis. But just because you can doesn’t mean you should.
The reason NoSQL has become such a terrible moniker is because it suggests that perhaps SQL “was a mistake”. It wasn’t. It just represents one set of tradeoffs for data storage. The only mistake that happened with SQL is the dogma that it is the “only” way to store data. There were many years that even I can remember where “knowing data storage” simply meant “knowing SQL”.
We don’t live in that world anymore, and I am glad we don’t. There is more choice and diversity in data stores today, and that is A Good Thing, because there is bigger data, taking more diverse forms, than ever before.
Let’s embrace the new things, while denouncing the terrible ideas of how to use them!
—
Interested in Dev Ops? Are you located in Central Virginia? Want to work at a healthy, growing technology startup? Parse.ly is hiring for a Dev Ops role. See our jobs page for more information. If you apply based on this blog post, simply write an email to hello@parsely.com with “Dev Ops role, referred by pixelmonkey” as subject line.
My friend Jennifer Anyaegbunam (@JenniferAdaeze) has published a new piece on HuffingtonPost about the role of humanities in medical education.
Matriculating into medical school, we were proud of our humanities roots and felt it made us uniquely poised to become great clinicians. Yet, we have often found that we have had to defend our educational choices to interviewers, advisors and even our peers– something science majors rarely, if ever, have to do. This is because the medical humanities is often regarded as a “second tier” or an extracurricular interest and not something that is fundamental to the practice of medicine.
She finds that the humanities are derided in a classroom setting, as well:
Courses on ethics and social science are few and far between. To make matters worse, students often do not take these exercises seriously, and these courses are often the ones with the poorest attendance, for example
Here, I’ll offer a parallel from a different field: computer science.
As a computer science major at NYU, I too encountered hostility and a dismissive attitude toward the humanities and other “softer” fields from my peers.
A traditional computer science curriculum consists of mathematics, algorithms, and theory. These are important areas of academic interest, and provide a good foundation for thinking about the deepest problems surrounding computation. But the vast majority of computer science majors don’t go on to research computation. They go on to practice it — by becoming software professionals (programmers), writing applications used by real people.
It turns out that to be a successful software professional, you need much more than a computer science background. Indeed, many of the world’s most successful programmers have no computer science background at all. My father was a software professional, but when he graduated from college, computer science did not even exist as a field of study!
You need software design skills, which are often not taught except in a trivial way in traditional curriculums. It is considered “vocational”. You need communication, management, and product design skills. These are too “soft” to be taken seriously.
The industry suffers from a widespread lack of these skills.
For example, though hiring a good backend programmer is relatively easy in our economy, frontend programmers (programmers who work on user interfaces and communicate directly with customers to iterate on a product design) are extremely scarce. This makes sense: the typical computer scientist considers these vital skills “trivial” in comparison to large-scale data processing challenges. I’ll let you take a guess at which role is more important to ensuring users have a good experience with software.
Some corporate recruiters report interviewing seniors straight out of computer science programs who are unable to write trivial programs that would be the first exercise in any programming book. And I haven’t met a single computer science major who can design and implement the user interface for a real piece of software using just the skills they learned in school.
It gets worse, though. Due to the emphasis of the curriculum, computer science scares away many potentially brilliant software practitioners, because the kind of people likely to enjoy deep mathematics and theory are often not the kind of people who tend to enjoy product design and usability. If computer science curricula were about making beautiful, interconnected, evolving applications that millions of people can use on a daily basis (rather than about flexing math theory muscles), perhaps we could combat the flight from the field by women (see this NYTimes story). This flight is probably also driven by a repulsion to the stereotype of the male-dominated, antisocial culture as displayed in movies like Office Space. The stereotype has a basis in reality, but I sometimes reflect on how this reality might be connected to this major’s trivialization of creativity, craft, and personal connection.
Another theory is that we simply need to make a newer curriculum that is more inter-disciplinary to capture this kind of real-world practice. My alma mater has put out a new program called Tisch ITP which mashes up computer science, communications, art/design, and social behavior. It has some nice early successes — the former CTO of Huffington Post and the founder of Foursquare are among its alumni.
You are right to wonder about the absence of humanities in medical education, and what that might mean for us as patients. With human lives and emotions at stake (rather than slipped project schedules and user frustration), your question seems even more vital in your field.
Note: some have asked me whether I regret majoring in computer science given the content of this post. Not at all. For me personally, I always had the “programmer spirit” in me. I had been programming since I was ~14 years old, and when I was in high school, I had already developed experience working with customers, designing interfaces, practicing design, and building things. I describe this a bit in What One Does. When I was in college, I debated between majoring in computer science (which I perceived as an opportunity to “peel the onion” on computation for my own curiosity) or other areas in the humanities. I compromised by majoring in computer science but still taking lots of other classes ranging from math to journalism to philosophy to history to creative writing. But I consider myself a bit of an odd duck; most people entering college have not already supplemented their four years of study with four years of “pre-study”.
So, here’s the deal. Some startup founders at Curebit.com decided to copy a design used by 37signals’ Highrise product for their own app. They did this in a less-than-gracious way, by simply copy/pasting the code and even leaving in some hard links to the original code. The story on VentureBeat tells the full story.
The founder of Curebit responded on HackerNews with this:
We had a different homepage, were a/b testing different pages, came across the 37signals post and were like ‘wow we should see how that converts!’ We are big fans of rails and what 37signals is doing and did not really think through the implications of what we were doing. We just kind of thought about it as a fun test to run.
Clearly it was stupid. It was not meant to offend anyone and we are adding credit where due.
As I pointed out to @dhh on Twitter, it’s unlikely this explanation is actually valid, given that their pricing page is also basically identical to Basecamp’s.
Clearly, @dhh isn’t amused with the founder “digging deep” for excuses. He wrote:
@allangrant THERE IS NO VALID WAY TO RIP-OFF PEOPLE’S DESIGNS AND HAVE IT BE OK. Not we’re small, not we’re a/b testing.
I think @dhh’s real frustration is that the founder isn’t admitting what is obvious to everyone else. He liked 37signals’ design. He thought it was good. And rather than get inspired by it and design something derived from the good concepts in the original, he and his team simply ripped off the original.
I think what this whole argument is missing is a little honesty. The truth is, no one on the web designs in a vacuum. We are all continually inspired / deriving from each other. If we were to believe that every marketing page and product homepage were designed by an obsessed designer living in an ivory tower, we would be in a total fantasy land. That’s not the web. Even designers are borrowing from, and getting inspired by, each other. Hell, that’s half the point of a site like Dribbble or Forrst.
Some honesty, for once: Parse.ly’s recent design
For our Parse.ly Dash public launch, we recently redesigned Parse.ly’s homepage and Dash’s marketing materials. I’m going to try, in this post, to outline all the sources of inspiration we drew on — some more directly than others — to land at the current site design (which is still a work-in-progress).
We started with the homepage
So, first some caveats. Parse.ly’s current branding and website design were put together by a sort of informal collaboration between two designers we work with. We found those designers on Dribbble and we paid them retainers to work through the designs. Our last homepage was basically a “v1” of this design. In between our prior homepage launch and the latest one, our original designer left us and so we hired someone with similar sensibilities to inherit his work and improve upon it. This landed us at our “v2” homepage design, which is the current one. For it, we mainly improved our color palette, branding elements, and clarity of presentation. Our awesome engineering intern, Emmett Butler, also took the opportunity to convert the design over to Hyde, a Python static website generator, so it would be easier to maintain.
For our recent launch, we also hired a firm, LessFilms, to produce our overview video. We wanted this video to be a mere click away on our homepage, so the new design needed to offer a way to launch the video.
As a team, we believed our homepage should be simple, drive people to watch our video, and drive people toward our signup form where we could then follow-up with them and evaluate if they might be good potential customers. We also wanted the page to be infused with the Parse.ly brand, but showcase the Dash product in a bold, but clean, way.
Our basic homepage design came straight from wireframes from our designers. I was the one who produced the initial HTML/CSS markup; this then became a shared project with my other colleagues at Parse.ly.
We used good, open source components
I quickly realized the topbar could be modeled well by Twitter’s Bootstrap framework. We ended up using modal dialogs from that framework for the video display, their grid framework for the footer, and popovers for our pricing page, as well. I also liked the idea of showing some screenshots right on the homepage. So I integrated our wireframe with the NivoSlider jQuery Plugin, so we could conserve space while having nice animation effects, to boot.
We solved problems with simple — and tested — elements
We wanted our tagline, “Insights for the web’s best publishers”, and our intricate Dash logo masthead, to stand out on the page. But we faced a clear problem: how do we get people to realize there was a video to watch?
We chose to go more abstract than explicit. We used a “play” button very much inspired by Dropbox’s play button on their homepage. Ours is a bit smaller and tucked in the corner of our product pane, but serves the same purpose: nearly 50% of our homepage visitors click this main area as a result.
We were able to reproduce something that looked like Dropbox’s abstract play button in Photoshop and made it match our color scheme and branding.
This was clearly a case of us borrowing a design element. But I don’t think the Dropbox guys will mind — how many different ways are there to design a “play video” button, after all? We recognized a play button was needed, and we looked toward a use “in the wild” to get the right idea across.
We tried not to sweat the small stuff
A similar example of a small problem we solved by looking elsewhere is our homepage footer. Our friends at SeatGeek have had a nicely-designed homepage for a long time. I follow the founder Jack on Twitter and we talk about design occasionally. Heck, I was in the same room when Jack and Russ were brainstorming the name for SeatGeek. Likewise, they were in the same room as us when we came up with the name for Parse.ly.
So when we needed to think about how to design our footer, we took a look at a few sites on the web, including those of our friends, and found that SeatGeek’s seemed like a no-nonsense, minimal fuss footer that was clean and crisp. We simply sent our designer a note that said, “let’s make the footer have similar proportions/look-n-feel to seat geek.com.”
His first revision came back pretty ugly, and we were confused. Then we realized the space in the original email — he thought we meant geek.com (which does have a pretty ugly footer), and indeed, he was “confused why we were inspired by it”.
At that point we corrected the error and decided to get more explicit about what we wanted. The typo was a good opportunity to reflect a bit. We liked these attributes of SeatGeek’s footer design: First, it inverted the color scheme (white-on-black rather than black-on-white) to create a dark break from the rest of the site. Second, it had a call-out to the logo/brand. Third, it had plenty of room for columns and columns of links. We sent out these instructions with a screenshot. We also indicated not to go too crazy/ornate on this — it’s just a footer, after all. As much as I love Vimeo’s footer, we had a website to launch, damnit! (If you haven’t seen Vimeo’s footer, you need to take a look!)
In the end, our footer design ended up pretty close to SeatGeek’s. We obviously have a different highlight color (due to our brand), different fonts (also due to our brand), and a different logo/tagline. But pretty much the same from a proportions / spacing / concept standpoint.
Is this design “stolen”? No, I’d say this is a pretty simple example of “recognizing good stuff” and “doing similarly good stuff”. Our footer is our own design, inspired by SeatGeek’s. Who knows who SeatGeek was inspired by in theirs? Thus is the constant process of remixing and derivation that happens in a creative world, especially on the web.
Sometimes, good design is “obvious”
We wanted to have a nice product tour page that showed screenshots of our product in an easy, visual, clickable layout. We wanted a little more detail to be revealed here as compared to the video, which gave a more high-level overview of the problem the product is solving. This would be a page to recognize the live product’s existence and its design.
I also didn’t want the design of this page to be intricate, because I thought our product screenshots would do a good job speaking for themselves. I solicited opinions from my team about product tour pages that met these criteria. The most-liked one was a pretty “obvious” grid layout of screenshot thumbs and captions put together by the creators of the Flow task management app that one of my colleagues uses.
It is a simple grid with embossed screenshot frames and 100% zoomed detail crops of the textured user interface. Each thumb has a caption and a link to the full-size version. It seemed perfect and obvious.
So, we whipped together some screenshot thumbs of our own, integrated it again into our product and company branding, and ended up with our version.
Did we steal Flow’s design? Hardly. A grid of product screenshots and captions is about as straightforward as web design gets. But we thought some of the other choices — 100% zoom, embossed, captions, clickable — fit our needs well.
Some inspiration is less direct
I know I was indirectly inspired by ShowMe’s jobs page, which I think is just downright pretty.
Our jobs page only shared some conceptual similarity: big “Apply” buttons, a two-column layout, some iconography to break up the dry wall of text.
Likewise, our pricing page is extremely barebones at the moment (we have a few designs floating around internally). The one that provided the most root inspiration to me was Tinderbox’s:
In the end, we landed on something quite a bit less ornate, a bit easier to read, and with a bit more detail available via anchor links.
So, what’s the point?
I’m a little disappointed how dishonest everyone in the startup world seems to be about design. I think one form of dishonesty is to outright steal someone else’s, like the Curebit guys did, via a crude copy/paste. But another form of dishonesty is to pretend as though all web designs need to be completely and 100% original. We all know that’s impossible.
Here’s my key takeaway: It’s OK to be inspired by other designs, but make sure to understand the source of your inspiration, and apply that in the context of your own brand, product, and company needs.
I can see why DHH is frustrated. For the last 5 years, he has watched copycat startups rip his products off left and right. The copycat culture on the web is a problem. But the reason it’s a problem isn’t because every design should be uniquely crafted from a raw block of marble.
It’s a problem because people copy without thinking, instead of letting inspirational sources cause moments of design reflection.
The web is a remix culture. What the Curebit.com guys did was wrong, but not because every website deserves a unique design. It was wrong because it showed that they were lazy toward their customers and disrespectful toward their inspirations.
Rather than remix the best design ideas they’ve seen into something that worked for their brand and customers, they decided to steal someone else’s and cross their fingers that it would work for them, too. And it backfired. But let’s not draw the wrong lesson. Design on the web is about remixing from your inspirations. There’s nothing wrong with that. It just needs to be done thoughtfully and respectfully.
Acknowledgements: thanks to Jack + Russ from SeatGeek, San from ShowMe, and Kyle from Forrst for reviewing and providing feedback on a draft of this article.
On my LinkedIn profile, I list one of my skills as “thought-driven development”. This is a little tongue-in-cheek; software engineering over the last few years has developed a lot of “XDDs,” such as test-driven development, behavior-driven development, model-driven development, etc. etc.
“Thought-driven development” doesn’t actually exist, but by it, I simply mean: perhaps we should think about what we’re doing, rather than reaching for a nearby methodology du jour.
In my last job, a colleague of mine used to also joke about “design-driven design” — perhaps the ultimate play on the XDDs since it is also a strange loop.
All this is not to say the XDDs aren’t useful — they definitely are. A lot of them have spawned entire groups of cross-platform open source projects. I am all for anything that makes the adoption of XYZ best practice easier for my team. But these techniques often require some lateral thinking to get to any real benefit.
When evaluating technologies like this, you have to take each little community with a grain of salt. Almost every programming framework / methodology / etc. that exists purports to offer some order of magnitude increase in software reliability / developer productivity / whatever else. And almost all, if not all, fail to do so, in practice.
Here is one anecdote to illustrate the point. From 2006-2008 at Morgan Stanley, the entire corporation was obsessed with Java’s Spring framework and its core “architectural pattern”, Inversion of Control. I can’t even begin to explain to you the number of man-hours that were wasted re-architecting existing, working software to meet this chimerical conception of component decoupling. I even contributed to this, urged by the zealots and their blind faithful. All of the reasons seemed great: decoupling code, using more interfaces, allowing for easier unit testing, being able to “rewire dependencies” and use fancy technologies like “aspect-oriented programming”.
Even Google got swept up in the madness and developed their own, competing framework called Guice. And in the end — after all that work — my diagnosis is that IoC is basically a non-starter, a complete waste of time.
It is a complicated framework that morphed into a programming methodology, developed exclusively to work around some annoying limitations of the Java language. Since it was applied without thinking, now everyone’s Java code has to suffer, and you can hardly pick up a Java web application today without being crushed by the weight of its IoC container’s XML configuration files. (Never mind that most other communities, such as Python’s and Ruby’s, have hardly a clue what IoC is all about, which is a good enough indication that it is a waste of time.)
Every framework and approach should be judged on its true merits, that is, on a true cost/benefit analysis of applying that particular technology. Will it save us time? Will it simplify, not complicate, our code? Will it make our code more flexible / adaptable? Will it let us serve our users and customers better?
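One way to see why dynamic-language communities rarely need an IoC container is the minimal Python sketch below. Every name in it (`SmtpMailer`, `FakeMailer`, `SignupService`) is hypothetical and invented for illustration. The decoupling and "rewiring" that a container promises come down to passing the dependency as an ordinary constructor argument.

```python
class SmtpMailer:
    """Hypothetical production dependency (never called in this sketch)."""

    def send(self, to, body):
        raise RuntimeError("would talk to a real mail server")


class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


class SignupService:
    """Depends on 'anything with a send() method', not on a concrete class."""

    def __init__(self, mailer):
        # The dependency arrives as a plain argument: no container, no XML.
        self.mailer = mailer

    def register(self, email):
        self.mailer.send(email, "Welcome!")
        return email.lower()


# "Rewiring" for a test is just passing a different object.
fake = FakeMailer()
service = SignupService(mailer=fake)
result = service.register("Ada@Example.com")
```

In production the same class would be built as `SignupService(mailer=SmtpMailer())`. Duck typing stands in for the interface declarations that Java would require.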
I regularly go back to old classics like the Mythical Man-Month to remember that nothing we do in software is truly new. I highly recommend you read it, and also its most famous essay, “No Silver Bullet”.
tl;dr stay healthily skeptical, don’t drink the kool-aid.
Me: Thanks so much for your fix to my issue. My friend, who majors in business, once told me that I should no longer major in Computer Science because “programming is like banging your head against the wall repeatedly, but with less reward.” I find that to be a rather rash dramatization, but I know in dealing with bugs as subtle as these it may feel that way. I hope at least the end-result is rewarding for you.
Programmer: Are you a Computer Science major? If so, don’t let your friend discourage you. Just ask him about “head banging” when those business majors find that their product development and marketing efforts fail to work after spending millions of dollars.
Me: Yes, I’m a CS major. And you’re right: the reward is great in software, and the cost of building a useful product is relatively minimal. That is one of the reasons I chose this path. It’s why I love helping out honest, intelligent developers such as yourself in any way I can. I have found that hardworking CS majors are not only better programmers, but more often than not better thinkers and better managers, if you’d give them the chance. I am happy in my decision, and still have that naivete that perhaps I can change the industry a bit, shake things up, come up with an idea that changes everything, innovate in whatever way I can. Big aspirations; we’ll see what happens. For now, I’ll just keep respecting the good software I find in the world, such as yours.
I recently watched Oliver Stone’s Wall Street again. It really is amazing how relevant this movie is in 2011, ~25 years after its original release in 1987.
This speech, in particular, is a knockout, given the recent Occupy Wall Street movement:
Bud: How much is enough, Gordon? When does it all end, huh? How many yachts can you water-ski behind? How much is enough, huh?
Gekko: It’s not a question of enough, pal. It’s a Zero Sum game – somebody wins, somebody loses. Money itself isn’t lost or made, it’s simply transferred – from one perception to another. Like magic. This painting here? I bought it ten years ago for sixty thousand dollars. I could sell it today for six hundred. The illusion has become real, and the more real it becomes, the more desperately they want it. Capitalism at its finest.
Bud: How much is enough, Gordon?
Gekko: The richest one percent of this country owns half our country’s wealth, five trillion dollars. One third of that comes from hard work, two thirds comes from inheritance, interest on interest accumulating to widows and idiot sons – and what I do, stock and real estate speculation. It’s bullshit. You got ninety percent of the American public out there with little or no net worth. I create nothing. I own. We make the rules, pal. The news, war, peace, famine, upheaval, the price per paper clip. We pick that rabbit out of the hat while everybody sits out there wondering how the hell we did it. Now, you’re not naive enough to think we’re living in a democracy, are you, buddy? It’s the free market. And you’re a part of it.
On Fred Wilson’s blog post about Raise Cache and HackNY, someone asked a very legitimate question:
Why are we raising money to benefit CS students from top programs around the country? Why are we raising money to help companies like Business Insider and bit.ly hire interns?
The event looks like fun but I’ve been trying to understand hackNY and don’t understand why it’s a charitable cause.
And, I wrote a response — providing some anecdotal information about how Parse.ly benefited from HackNY, and why it matters in a city flush with gold-plated Wall St. internships.
Let me give you a hint why HackNY is somewhat charitable: Wall Street firms pay between $15-$30k/summer for technical interns in NYC. Most startups — especially early-stage ones — simply can’t compete with that.
You may not think startups like bit.ly and Business Insider need any help (and with their $13-15M in capital funding raised, maybe you are right), but in 2010, when Parse.ly sought a summer intern, we had no funding and a
That said, it'd probably be good to have more truly early-stage (pre-funding) companies on the roster -- one of the problems with this, though, is that a lot of pre-funding companies are in such a fragile state that the internship might be over a couple weeks into the summer. Also, pre-funding companies aren't typically "buzzworthy". The first two years of HackNY have been partially about creating some buzz about NYC tech among university students, something at which it has succeeded spectacularly.
Both of our HackNY interns (2010 + 2011) have commented about how one of the interesting parts of the program is that it simply raises awareness of the different stages of companies. Since the HackNY interns all live together in university housing, they share stories -- and you'll have folks working on >100-person teams at places like Gilt Group and folks working on <10-person teams at places like Parse.ly. Also, the program’s lecture series does a good job of encouraging students to consider entrepreneurship or startup work as a post-graduation option. This is a countervailing force to the professional HR/recruiting teams employed by Wall Street and other Fortune 500s to market their positions to top students.
from Raise Cache at AVC.com.
I came across this wonderful piece of historical retelling by David Beazley, one of my favorite Pythonistas and the author of Python Essential Reference. Here is a man who conquered C++ in just about every way, but ultimately found himself trapped in its byzantine complexity, only to escape by way of Python.
Swig grew a fully compatible C++ preprocessor that fully supported macros. A complete C++ type system was implemented including support for namespaces, templates, and even such things as template partial specialization. Swig evolved into a multi-pass compiler that was doing all sorts of global analysis of the interface. Just to give you an idea, Swig would do things such as automatically detect/wrap C++ smart pointers. It could wrap overloaded C++ methods/function. Also, if you had a C++ class with virtual methods, it would only make one Python wrapper function and then reuse across all wrapped subclasses.
Under the covers of all of this, the implementation basically evolved into a sophisticated macro preprocessor coupled with a pattern matching engine built on top of the C++ type system [...] This whole pattern matching approach had a huge power if you knew what you were doing [...]
In hindsight however, I think the complexity of Swig has exceeded anyone’s ability to fully understand it (including my own). For example, to even make sense of what’s happening, you have to have a pretty solid grasp of the C/C++ type system (easier said than done). Couple that with all sorts of crazy pattern matching, low-level code fragments, and a ton of macro definitions, your head will literally explode if you try to figure out what’s happening. So far as I know, recent versions of Swig have even combined all of this type-pattern matching with regular expressions. I can’t even fathom it.
Sadly, my involvement was Swig was an unfortunate casualty of my academic career biting the dust. By 2005, I was so burned out of working on it and so sick of what I was doing, I quite literally put all of my computer stuff aside to go play in a band for a few years. After a few years, I came back to programming (obviously), but not to keep working on the same stuff. In particularly, I will die quite happy if I never have to look at another line of C++ code ever again. No, I would much rather fling my toddlers, ride my bike, play piano, or do just about anything than ever do that again.
From this Swig mailing list entry.
It’s hard to find me gushing more unapologetically than when I talk about the virtues of my favorite programming language, Python.
Indeed, my life for the last 3 years has been dominated by the language. In many ways, pursuing a startup and enduring the associated financial hardship was partially because I had become frustrated with using Java in my full-time work and wanted to convert hobby projects I was building outside of work hours into full-fledged projects.
Something else I noticed in the last three years is that my programming life has become very zen-like. I now rarely discuss or debate things like programming language features, strange constructs like generics, or which framework to use or ignore. Instead, I spend most of my time building a product that people love. My colleagues and I communicate with code. And what better language to communicate in than arguably the world’s most readable? What better language to deliver value in than one that simply gets out of your way?
I therefore get a great amount of joy from showing other people what the Zen of Python can mean in their lives. Last year, I gave a training course to a 20-person team of government employees who were using dated languages like Fortran and COBOL to build important government systems. A bright manager in the organization realized how much more productive their team could be if they stopped worrying about compiler versions and IDEs and started thinking in code. But the key was to understand the value of Python, not necessarily as a language with a certain set of features, but as a way of doing things, as a cultural influence. This is a culture that says, “the language should fade away”, similarly to how Edward Tufte argues that the chrome and administrative debris should fade away when displaying content.
Since then, I have used these slides countless times to espouse the virtues of my favorite interpreter. This has included giving short seminars in NYC, training Parse.ly interns and new hires (here’s a shot of me and two Parse.ly engineers, Michael and Zach), holding Python office hours at HackNY’s hackathon (indeed, here’s an action shot on Flickr to prove it!), and introducing my friends to the joy of building software.
Code and Slides: The Zen of Python in 3 days
It’s therefore with elation that today I am able to provide the slides and materials for this talk free to the world on my Github account. Simply click here:
https://github.com/Parsely/python-adv-slides
Or, if you’re not interested in the code behind the slides (because the slides are actually created with the help of Python itself), you can go here to simply view them.
http://pixelmonkey.org/pub/python-training/
(Note: a modern web browser like Chrome or Firefox is recommended for speed and smoothness.)
This may be my first “open source presentation”, in that the slide presentation itself is provided as code that is freely available, and reproducing the slide presentation is a matter of running a code generator.
A touch of NLP with Python
I mentioned that one of the reasons I love Python is because it gets out of your way. Nowhere is this more evident than in Python’s ability to prototype algorithms for natural language processing and corpus linguistics problems, something I do every day. The Natural Language Toolkit (NLTK) provides an excellent starting point for learning about this aspect of computer science.
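As a rough illustration of how little ceremony such a prototype needs, here is a minimal sketch that counts words and bigrams in a toy text. To stay self-contained it uses only the standard library, not NLTK. The regular-expression tokenizer is a crude stand-in for a real one, and the sample sentence and function names are made up for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def bigrams(tokens):
    """Pair each token with the token that follows it."""
    return list(zip(tokens, tokens[1:]))


text = "The cat sat on the mat. The cat slept."
tokens = tokenize(text)

# Counter turns any iterable into a frequency table.
word_counts = Counter(tokens)
bigram_counts = Counter(bigrams(tokens))

print(word_counts.most_common(2))
print(bigram_counts.most_common(1))
```

A real project would likely swap `tokenize` for a proper tokenizer from a library such as NLTK, but the shape of the experiment stays the same.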
Recently, Parse.ly has gotten some new engineers on the team who are great Python programmers but without any background in NLP. To give them a taste of how well-suited Python is for this task, I gave a seminar entitled, “Just Enough NLP with Python”. Just as with the prior slides, you can view these online on Github:
https://github.com/Parsely/python-nlp-slides
And you can view the compiled slides here:
http://pixelmonkey.org/pub/nlp-training/
Between these two pieces, you can learn the Zen of Python (PEP 20) very rapidly, and incorporate it into your everyday software life.
If you have any comments or feedback, feel free to reach out to me on Twitter. I’d love to hear what you think. And if you, too, are bitten by the Python bug (as one of our Parse.ly interns recently was in a big, almost romantic way) then you should reach out to us, since we’re hiring.
And, sadly, our top engineering graduates don’t always become engineers. They move into finance or management consulting, both of which pay far higher salaries than engineering. I have seen the dilemma that my engineering students at Duke University have faced. Do they take a job in civil engineering that pays $70,000, or join a big Wall Street financial firm and make $120,000? With the hefty student loans that hang over their heads, most have made the financially sensible decision. In some years, half of our graduates have ended up taking jobs outside of engineering. Instead of developing new types of medical devices, renewable energy sources and ways to sustain the environment, my most brilliant students are designing new ways to help our investment banks engineer the financial system.
[...] We also need to make the engineering profession “cool” again, with the same sense of excitement and urgency in engineering and science that we saw during the Sputnik days. Back then, engineering was considered essential to the nation’s survival. Engineers and scientists were national heroes. It’s not that we don’t have problems to solve. The economy is in dire straits. Natural resources such as food, water, and crude oil are becoming scarce. Drug-resistant bacteria threaten us with doomsday plagues. But we’re not offering our best minds incentive to solve them.
From Mr. President, there is no engineer shortage.
Luckily this is happening already in high tech in NYC, thanks to awesome programs like HackNY and collabraCode (both of which my startup Parse.ly formally supported). As much as it pains me to say it, I also think The Social Network may be seen as a cultural catalyst for software engineers becoming “cool”.
But high tech is only a small piece of the puzzle — we need the same active marketing for students’ minds in biotech, education, medical research, civil engineering, etc.
Today, I turn 27. Even though I was deep in the middle of a project late last night, I peeled myself away from my monitors, went to sleep, and woke up late to enjoy a day of reading outside.
Parse.ly has an official “take your birthday off” policy, so I made sure to set a good example.
I remember when I was younger, I used to look forward to birthdays very eagerly. Birthdays were when I got a new videogame or programming book. Birthdays were about stuff, and taking the day to play with new toys.
Now, over a decade later, my birthday is much less about stuff. I don’t play videogames anymore, and I already know how to program. I am fortunate to live comfortably and don’t long for stuff any longer. My Nintendo Wii gathers dust (like everyone else’s, it seems). My computer is no longer used to amuse me, but to allow me to work on my passions — building software, building a company, staying informed, informing others. I have a seemingly endless queue of books I’d like to read, movies I’d like to watch, things I’d like to write, software I’d like to build. I’ve come to realize that birthdays, at my age, are more about time.
In my ruthlessly efficient worldview — where I regularly talk of cost-benefit analysis, backlog prioritization, and productivity — my birthday has become about taking a moment to flip my prioritized world on its head. Let’s not pick an item from the top of the prioritized backlog. Instead, let me take something from the backburner, for once. Let me behave — if only for a day — as if I had all the time in the world.
I don’t need stuff. I just need time. Of course, that’s the bittersweet part of one’s birthday. That even as you come to realize the importance of time, the day acts as a reminder of how our time on this earth is limited. 1 day passes, and only n-1 left to make a difference.
I’ve been quite busy with work lately, so haven’t had time to send a few posts toward my blog. However, I have been working on some spare time and work-related projects that I’d love to share with everyone here.
Among them:
Stay tuned!
I was a bona fide Java programmer for 5 years before I started working on Aleph Point and Parse.ly. I truly believe that Python and JavaScript are fundamentally better languages than Java for a variety of reasons born out of experience with each of them. (Note: Before this gets marked as flamebait, please notice that not only was I a Java programmer for more than 5 years, but I was also a Java open source contributor!) I have enormous respect for the Java open source community, which has produced some of the highest quality modules available anywhere.
Now, don’t get me wrong — Python also has batteries included, and usually, when I think that I’m missing a great module I used to use in Java, it already exists in a much more powerful form in Python’s Standard Library or the wealth of modules on PyPI, GitHub, and Bitbucket. However, I believe in not reinventing the wheel, and so if a great open source tool exists in Java, I will want to interact with it.
One of these modules which we use extensively at Parse.ly is Apache Solr, and its surrounding Lucene project modules. Lucene is an extremely mature framework for document indexing, and Solr is a powerful server-ization of that technology that fits well into complex, mixed language distributed systems. I know there are efforts — like Whoosh — to build fast search engines atop the Python language. And I applaud these efforts — more projects means more competition, and more competition means better products. However, I still believe that you go with the best of breed tools available for production software, and you try not to let religious arguments about programming language get in the way.
Lately, I have come across more and more Java open source projects that have no equivalent in Python, and which I would like to access. Knowing that I wanted to feel comfortable incorporating Java open source projects — beyond Solr, which was already nicely wrapped as a web service — I, at first, thought that I’d be forced to still live among the weeds of complex class and interface definitions, cumbersome Java IDEs, XML configuration files, and (IMO) time-wasting rabbit holes like dependency injection, configuration management, and classpath hell. And then I found Groovy.
The most under-rated language ever
Groovy was released as one of the earliest dynamic languages written atop the JVM, back in 2003. I remember hearing about it back then and not “taking it seriously”, as we programmers often do with important, new technologies. I had the same ill feelings toward JavaScript until as late as 2007, but now consider myself more a JavaScript programmer than a Java one!
In truth, had I read James Strachan’s blog post discussing why he created Groovy, I would have probably paid more attention:
So I’ve been musing a little while if its time the Java platform had its own dynamic language designed from the ground up to work real nice with existing code; creating/extending objects normal Java can use and vice versa. Python/Jython’s a pretty good base – add the nice stuff from Ruby and maybe sprinkle on some AOP features and we could have a really Groovy new language for scripting Java objects, writing test cases and who knows, even doing real development in it.
I had thoughts like these back in my Java programming days, too. I would see features that were standard in other languages — concise list/map syntax, “bare” functions (that could live outside of class definitions), optional/dynamic typing, metaprogramming, a proper REPL, first-class functions — and think to myself, “Aren’t we missing a whole lot by lacking these features in Java?” The more I started to code Python and Java side-by-side, the more I started to realize that many “core” Java technologies were basically created to cope with inadequacies of the language itself.
For example, unit testing “frameworks” and formal IDE/debuggers are more popular in Java than other languages because of the lack of a REPL: dynamic language programmers tend to test their code from an interactive shell as a regular part of programming. Utility libraries like Apache Commons were created to cope with syntactic deficiencies around core types like lists and maps. And on and on.
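For readers who have only worked in Java, here is a short Python sketch of a few of the features listed above: concise list/map literals, a "bare" function, first-class functions, and a touch of metaprogramming. The variable and class names are invented for illustration. The release years for Python and Java are the commonly cited ones, and the Groovy year is the one given in this post.

```python
# Concise list and map (dict) literals: no builders, no utility library.
langs = ["python", "groovy", "java"]
released = {"python": 1991, "java": 1995, "groovy": 2003}


# A "bare" function that lives outside any class definition.
def age_in(year, lang):
    return year - released[lang]


# First-class functions: the bound method released.get is passed as a value.
oldest_first = sorted(langs, key=released.get)


# A little metaprogramming: attach a method to a class at runtime.
class Greeter:
    pass


Greeter.hello = lambda self, name: "hello, " + name

print(oldest_first)
print(age_in(2011, "groovy"))
print(Greeter().hello("world"))
```

Each of these lines can also be typed straight into the interactive interpreter, which is the REPL-style workflow described above.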
Groovy is a “post-modern” dynamic language. By that, I mean it is clearly inspired by and learns from the lessons of both Python and Ruby. It is also the world’s first programming language that was written to target an existing community of programmers. Let me explain. Python was not written “to target” C programmers or Visual Basic programmers. It was written as a new language altogether, not targeted at anyone in particular. However, Groovy was written “to target” Java programmers: it is the dynamic language Sun should have built for the JVM as dynamic languages gained popularity and the Java language stagnated.
Groovy is respectful of and cooperative with Java itself. One of its primary design goals is to live alongside existing Java code, even while Groovy’s syntax far surpasses that of Java. In this respect, Groovy plays a very similar role in the Java ecosystem that Python plays in the C ecosystem. However, unlike Python and C’s relationship, Groovy’s relationship with Java is bidirectional. What that means is that one can easily write Java libraries that execute Groovy code (provided the Groovy Language is on the classpath as a JAR), and Groovy libraries can easily execute Java code. In fact, you can even have complex relationships among these two languages, such as Groovy classes serving as base classes for Java classes and vice versa.
However, what’s perhaps most astonishing about Groovy is that it truly got a whole lot of things right. That is, the language simply is extremely well designed, and specifically greased for programmer productivity. It learned from the mistakes of existing dynamic languages, and made a dynamic language that is truly the “best of both worlds”. When I introduced the language to an engineer on my team at Parse.ly, he uttered some words that really stuck with me: “This language was clearly designed by very, very lazy programmers.”
So true. Typing is a chore — let’s do less of it to code without sacrificing readability. That’s the Groovy way.
Over the last several years, it has matured to the point where I feel extremely comfortable writing production code in Groovy, especially thanks to the surrounding ecosystem of the Grails project.
Python is still my muse, but Groovy is my Winston Wolfe
A famous character in the movie Pulp Fiction is Harvey Keitel’s “Winston Wolfe”. He’s the underworld problem solver, a guy who gets things done with a no-nonsense attitude.
Winston Wolfe: If I’m curt with you, it’s because time is a factor. I think fast, I talk fast, and I need you two guys to act fast if you want to get out of this.
Python is a beautiful language and will continue to be my language of choice for production code. However, I simply can’t ignore the enormous wealth of production-quality Java code that has been written for which there is no good Python equivalent. I also can’t ignore the unbelievable engineering effort that went into making the JVM a powerful, scalable, and stable platform for production software. I thought I’d have to throw this community away (for myself and my companies) due to the deficiencies of the Java language. However, I have come to realize that Groovy has proven itself the worthy successor to Java. (Yes, successor — as in, you would be a fool to write your code directly in Java these days.) And personally, I will never work on a Java project again without Groovy. Java is a mess, but Groovy is my clean-up man.
I have also observed that the “ported” JVM languages have failed. This was a surprise to me as it was happening, but in retrospect, is not as surprising. Jython and JRuby are simply not viable long-term open source projects. The target community is only those programmers who know both Python and Java or both Ruby and Java. And there was no great impetus to truly “learn” those environments, because Python programmers felt that they already “knew” Jython, since the language was the same. Is the benefit of having access to Java libraries worth the downside of using a non-standard VM for my primary programming language? Usually not.
Now, while I wasn’t even paying attention — slowly learning Groovy in my spare time and using it for small projects dependent on Java — it seems that the world of independent dynamic JVM languages has taken off. Groovy has not had nearly as much popularity as languages like Scala (used at Twitter and Foursquare) or Clojure. And Groovy lacks “sex appeal”, possibly because no significant large-scale projects have been documented atop Groovy and because the language is an incremental improvement on successful mixed-paradigm languages.
However, unlike Clojure and Scala, the Groovy language can be learned in a weekend by an existing Java programmer with dynamic language experience, and can be learned in a couple weeks by anyone else. And unlike those other languages, it does not try to be a fundamentally new programming language with new programming models. Instead, it simply dramatically improves the Java language itself using a syntax that becomes quickly familiar to dynamic programmers.
When you consider the cost/benefit analysis of learning Groovy — for a couple weeks of my time, I gain access to being able to easily script — and, with Grails, wrap as web services — an almost limitless supply of Java open source code, you realize you’d be a fool not to learn the language.
I like new stuff as much as the next guy, but Java is a language that is still used by millions of professional programmers across the world, and has proven its ability to be the basis for large projects. And Groovy is a 2X or 3X productivity improvement over Java, with full bi-directional compatibility with existing Java code? I’ll take that over a new-fangled “revolutionary” language any day!
So, what’s so groovy about Groovy?
Oh, nothing… just things like this:
["Rob", "Christopher", "Joe", "John"].findAll { it.size() <= 4 }.each { println it }
// ==> prints "Rob", "Joe" and "John"
Yes, we’re really not in Java anymore. Let’s walk through the pain that is Java step by step to arrive at our Groovy salvation.
First, Java loves boilerplate. It really is an affliction of the language and the community. A great example of this is Java’s take on module imports.
Even though Java has a great standard library that is extremely well-designed and complete (for the most part), it fails to make this standard library easily accessible to you. That’s because all of Java’s best libraries are hidden in obtuse, deeply-nested library locations, and one can never remember where they are. IDEs solve this problem by providing an “Organize Imports” facility, which expert Java programmers use instead of managing imports themselves. This problem, alone, puts Java IDEs on significantly more productive footing than simple text editors like vim.
Groovy says, “this is nonsense — stop the madness”. In Groovy, a slew of commonly-used Java modules are simply imported automatically. If you run into a namespace conflict, that’s OK — we’re all adults here.
// these are all automatic in every Groovy module you write
import java.lang.*;
import java.util.*;
import java.net.*;
import java.io.*;
import java.math.BigInteger;
import java.math.BigDecimal;
import groovy.lang.*;
import groovy.util.*;
Note that groovy.lang and groovy.util add significant utilities above and beyond Java.
Next up are the major data structures used by any modern programmer: lists and maps (aka dictionaries). If you haven’t been doing Java in a while, you won’t miss this kind of code.
List<Integer> someItems = Arrays.asList(new Integer[] {1, 2, 3, 4});
for (Integer item : someItems) {
    System.out.println(item);
}
Map<String, String> someMapping = new HashMap<String, String>() {{
    put("ST", "started");
    put("IP", "in progress");
    put("DN", "done");
}};
for (Map.Entry<String, String> entry : someMapping.entrySet()) {
    String key = entry.getKey();
    String value = entry.getValue();
    System.out.println(key + " => " + value);
}
Note that the above is about the most concise possible code you can write for these operations using standard Java facilities. In fact, I have even tapped into some obscure features, using Arrays.asList to initialize the List (leveraging the fact that arrays, but not lists, have a concise initialization syntax) and leveraging double curly brace initialization for the Map, a non-standard technique that creates an anonymous subclass of HashMap just to run the put calls, and so is arguably wasteful. (The alternative, though, is to write out someMapping.put(key, val) for each entry in the Map. Ick.)
Let’s look at this same syntax, but in Groovy.
someItems = [1, 2, 3, 4]
someItems.each { println it }
someMapping = ["ST": "started", "IP": "in progress", "DN": "done"]
someMapping.each { key, val -> println "${key} => ${val}" }
Not only is this code more concise, it is also much more readable. The signal-to-noise ratio is much better.
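Since Groovy borrows so openly from Python, it may help to see the same two loops in Python as a point of comparison. This is only an illustrative sketch: the helper names `describe_items` and `describe_mapping` are made up for this example, and they return the output lines instead of printing them directly so the result is easy to check.

```python
def describe_items(items):
    # One output line per list element, mirroring someItems.each { println it }
    return [str(item) for item in items]


def describe_mapping(mapping):
    # One "key => value" line per entry, mirroring the Groovy map loop
    return ["%s => %s" % (key, val) for key, val in mapping.items()]


some_items = [1, 2, 3, 4]
some_mapping = {"ST": "started", "IP": "in progress", "DN": "done"}

for line in describe_items(some_items) + describe_mapping(some_mapping):
    print(line)
```

The literal syntax for lists and maps is nearly interchangeable between the two languages, which is a large part of why Groovy feels familiar to Python programmers.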
Groovy has great support for Functional Programming and Domain-Specific Languages.
// first-class functions/methods and DSL syntax
clean = { text -> text.replaceAll("[^A-Za-z]*", "") }
clean "my-text-*foo"
remove = someItems.&remove
remove 3
someItems
It retains support for loved Java features like switch statements, but these become much more powerful and concise, e.g. in this factory function that actually returns cleaner functions depending on the passed-in type.
// proper closure, switches and variable interpolation
cleanerFactory = { type ->
    switch (type) {
        case "text":
            return clean
        case "numbers":
            return { text -> text.replaceAll("[^0-9]*", "") }
        default:
            throw new IllegalArgumentException("invalid type ${type}")
    }
}
Using the cleanerFactory is a breeze: simply call it and you get back a function which you may then call. Yes, this is basically impossible in Java. To get something close, you’d have to make the factory return “Runnable” instances and then call .run() on the results.
// try/catch and bean syntax shorthand
test = "my-text-*foo-1234"
println cleanerFactory("text")(test)
println cleanerFactory("numbers")(test)
try {
    println cleanerFactory("foo")(test)
} catch (IllegalArgumentException ex) {
    println "cleanerFactory returned exception: ${ex.class.simpleName}, ${ex.message}"
}
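For readers who think in Python, here is a rough equivalent of the factory pattern above. It is a sketch rather than a translation: the names `clean` and `cleaner_factory` simply mirror the Groovy ones, and `ValueError` stands in for Java's IllegalArgumentException.

```python
import re


def clean(text):
    # Keep only ASCII letters, like the Groovy "clean" closure
    return re.sub(r"[^A-Za-z]+", "", text)


def cleaner_factory(kind):
    # Return a function chosen by the requested cleaner type
    if kind == "text":
        return clean
    if kind == "numbers":
        return lambda text: re.sub(r"[^0-9]+", "", text)
    raise ValueError("invalid type %s" % kind)


test = "my-text-*foo-1234"
print(cleaner_factory("text")(test))     # mytextfoo
print(cleaner_factory("numbers")(test))  # 1234
try:
    cleaner_factory("foo")(test)
except ValueError as ex:
    print("cleaner_factory raised: %s" % ex)
```

In both languages the factory hands back a plain callable, so the caller never needs to know whether it received a named function or an anonymous one.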
Just like Java, Groovy supports classes. But it improves them by adding a concise syntax for declaring proper JavaBeans. The following class has getters, setters, and a keyword constructor autogenerated for you.
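// classes and types, if you need them
class Article {
    String title
    String summary
    String content
    Date date
}
article = new Article(
    title: "Parse.ly uses Groovy, oh noes!",
    summary: "Didier resigns in protest",
    content: "Keith goes back to .Net, Toms returns to Ireland. Ahhhh!",
    // new Date(2011, 3, 7) would land in the year 3911: that deprecated
    // constructor adds 1900 to the year and counts months from 0.
    // Assuming March 7, 2011 was meant:
    date: new GregorianCalendar(2011, Calendar.MARCH, 7).time
)
article.properties.each { println it }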
Also notice the special “.properties” property, which allows you to iterate over a JavaBean’s properties without using Java’s obtuse reflection API.
// inheritance if you need it
class WSJArticle extends Article {
    List topics
}
wsj = new WSJArticle(topics: ["US", "Finance", "World News"])
wsj.topics.each { println it }
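Inheritance works exactly as expected, and follows Java rules. That includes extending classes from the Java standard library and bolting bean-style properties onto them.
// and sure, even call and inherit from Java code
class BetterDate extends Date {
    // a real getter method, so the bean shorthand below resolves to it
    String getTimeString() { return "${this.time}" }
}
println new BetterDate().timeString
Here a Groovy class extends java.util.Date and picks up a new timeString property in a single line. (Date is not actually declared final, so Java can subclass it too, and truly final classes such as String stay off-limits in Groovy as well, because the JVM enforces that rule.) The win is how little ceremony the subclass needs.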
// and sure, use Groovy scripting constructs with existing Java APIs!
lotsaItems = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
println Collections.frequency(lotsaItems, 3)
Collections.shuffle(lotsaItems)
println lotsaItems
The above is an example of using existing Java APIs. Collections is surely a nice static class. Groovy lets you use it to your heart’s content.
The API for the Collections.binarySearch method is written as:
static <T> int binarySearch(List<? extends T> list, T key, Comparator<? super T> c)
Yes, that is “documentation”. So basically, I need to create a class that implements the Comparator interface, and then I need to pass that comparator to this method along with my list. That will require a lot of angle brackets to get the generics right. Hmm, who cares, let’s just do it the easy way: in Groovy.
// even existing, complex APIs work fine in Groovy, and with less code!
comparator = [
    compare: { a, b -> a.equals(b) ? 0 : Math.abs(a) < Math.abs(b) ? -1 : 1 }
] as Comparator
println Collections.binarySearch([0, 2, 5, 7, 8], 5, comparator)
This shows off the ability of Groovy to easily coerce values using some very intelligent rules. Comparator is an interface that expects a “compare” method — therefore, we can coerce a map with a “compare” key and a function value into an implementation of that interface.
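To make it clearer what the coerced comparator is being asked to do, here is a small Python sketch of a binary search driven by a plain compare function. The functions `compare_abs` and `binary_search` are hypothetical helpers written for this illustration, not part of any library; the not-found return value follows the convention documented for Java's Collections.binarySearch, which is -(insertion point) - 1.

```python
def compare_abs(a, b):
    # Same ordering as the Groovy closure: equal values compare as 0,
    # otherwise order by absolute value
    if a == b:
        return 0
    return -1 if abs(a) < abs(b) else 1


def binary_search(items, key, compare):
    # items must already be sorted according to compare
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        order = compare(items[mid], key)
        if order == 0:
            return mid
        if order < 0:
            lo = mid + 1
        else:
            hi = mid - 1
    return -(lo + 1)  # not found: encode the insertion point


print(binary_search([0, 2, 5, 7, 8], 5, compare_abs))  # 2
```

The compare function is the only part the caller supplies, which is exactly the role the map-coerced Comparator plays in the Groovy example.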
// convenience features abound, like heredocs and safe pointer access
myDocument = """
This is a long document...
"""
foo = [myDoc: null]
println foo.myDoc?.size() // returns null rather than an exception
If I had a dime for every time I noticed Java programmers using StringBuilder to work around its lack of multi-line strings, or adding extra methods just to do null checks... well, I'd be very rich indeed. Groovy says -- no need to complicate your life with that noise.
// and type coercion that actually works!
set = [1, 2, 3, 4, 5] as Set
5 in set // ==> true!
Yes, the in keyword. Yes, lists can be converted to sets. Because that makes sense!
// this is a feature I wish Python had -- JavaScript-like map syntax
map = [:]
map.item1 = "foo"
map.item2 = "bar"
map.each { println it } // ==> "item1"="foo", "item2"="bar"
println "done"
Groovy also learns a little from JavaScript -- maps are accessible using either key index or dot syntax. I personally like this, though I know some Python people won't.
// nice range and slice syntax
items = [1, 2, 3, 4, 5]
println items[0..2]
for (i in 0..5) {
    println i
}
Similar to Python's slice syntax, Groovy supports a "Range" type using two dots. It works where you expect it to: array indexing, for loops.
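One difference is worth a quick sketch for Python programmers: a Groovy range written with two dots includes both endpoints, while a Python slice or range excludes its stop value. The variable names below are chosen only for this comparison.

```python
items = [1, 2, 3, 4, 5]

# Groovy's items[0..2] includes index 2, so the Python slice stops at 3
first_three = items[0:3]

# Groovy's "for (i in 0..5)" visits 0 through 5, so Python's range stops at 6
counted = list(range(0, 6))

print(first_three)  # [1, 2, 3]
print(counted)      # [0, 1, 2, 3, 4, 5]
```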
Using Grails, the standard web framework for Groovy, you can write simple RESTful, JSON web services in just tens of lines of code. For example, here's a Grails controller that renders the result of a function as JSON.
package service

import com.thirdparty.library.ContentExtractor
import grails.converters.JSON

class ParseController {
    def contentExtractor = new ContentExtractor()
    def index = {
        def article = contentExtractor.extractContent(params.url)
        render article as JSON
    }
}
This will respond to a service like http://service:8080/parse?url=http://google.com with a JSON document containing the Java properties from the article object.
Seriously, how much easier can it get to wrap up a Java library as a JSON web service? Not much.
// in short, a pretty language worth looking at println "DONE."
Early on during this startup adventure, a person I trust told me, “Watch out — startups aren’t for the faint of heart.” Looking back on my personal net income graph from 2009 to present, I can see what he meant.
May 2009 is when I entered Dreamit Ventures to begin working on what would become Parse.ly. That’s when I plunged my “savings buffer” into the company. The few months after that had me frantically trying to recover from the realization that startup progress is measured in months and years, not days and weeks.
Sachin and I switched gears from targeting consumers with a free product to targeting large online content properties with a paid product, and bootstrapped the company with side consulting gigs. We didn’t tell anyone we did the side consulting work (unless they specifically asked). We watched other entrepreneurs go into credit card debt and borrow money from trusting friends and relatives. We didn’t believe in that, so we took the hard road of “earning our survival”.
However, our costs were going up, not down, as we pursued a more ambitious product with more demanding clients. Also, my expenses skyrocketed as COBRA disappeared for my health insurance and I had to pay for horribly overpriced sole proprietorship plans. (Fact: America’s broken healthcare system is harmful to entrepreneurs.) I knew I needed to do something to “stop the bleeding” on my financial situation — so, I took on more consulting gigs…
My consulting gigs only started earning me enough to stabilize my personal expenses late last year. That’s when you see my net income turn into a very nice flat line. Those months felt more secure than any of the preceding ones. It was around then that I realized a lot of people took advantage of my desperate position and underpaid me. I raised my prices and started to round up talented engineers to work with me as a distributed team. That got things under control.
Mid-to-end of 2010, I felt like I could conquer the world. My finances stable, I didn’t need me no stinkin’ investors. We could bootstrap this company to success ourselves. After a year of hard work, I had plentiful consulting clients, and more knocking down my door. There was a stable 50/50 split between Parse.ly time and paid consulting time. We were also on the verge of closing big contracts for Parse.ly. I could cover my expenses well, and saw lots of personal gain in my future. My consulting company even seemed like a nicely profitable and happy company in its own right.
Of course, it was around this time that investors finally decided to notice Parse.ly and that we closed financing. Note to entrepreneurs — no one ever gives you money when you actually need it!
Mid-January was the first time in 2 years that I had health insurance paid by someone else, and it brought my first “real paycheck” in 2 years. January and February have been amazing, productive, ambitious, and life-changing months. So proud of the Parse.ly team and what’s coming up next for us.
In all, the past 18 months have been a crazy ride, but I survived! Personal relationships were strained, but not soured. Financial pressure was applied, but not to the breaking point.
This graph above is also quite an interesting side effect of this process. It shows the net income not for me personally, but for my consulting company, Aleph Point, Inc. In the span of 10 months, I took my consulting company from a 1-man shop to a profitable, legitimate company. That required me to wear a “CEO hat” and focus on things like profit margins, contract negotiation, and customer acquisition. What I like about this graph is that it’s a hockey stick profit chart. True, consulting work is not scalable, but it sure is profitable. Of course, due to Parse.ly’s funding round, I now need a bigger hockey stick.
“Don’t do other things?”
I wanted to write about this experience because it directly contradicts some advice one of my online mentors, Paul Graham, gave in his essay, “How Not to Die”. He wrote:
The number one thing not to do is other things. If you find yourself saying a sentence that ends with “but we’re going to keep working on the startup,” you are in big trouble. Bob’s going to grad school, but we’re going to keep working on the startup. We’re moving back to Minnesota, but we’re going to keep working on the startup. We’re taking on some consulting projects, but we’re going to keep working on the startup. You may as well just translate these to “we’re giving up on the startup, but we’re not willing to admit that to ourselves,” because that’s what it means most of the time. A startup is so hard that working on it can’t be preceded by “but.”
If I hadn’t worked on my consulting projects last year, Parse.ly would have died. PERIOD. PG is often right, but he’s wrong on this one. I agree that startups require your undivided attention. But surviving isn’t a compromise — it’s a necessity. Do it however you possibly can.
In any event, I learned a lot from this experience. On to the next! It’s time to pump some green for Parse.ly!
Hacker News
Oh, wow. I ended up on the front page of Hacker News. Check out the great comments and discussion going on there. Also, feel free to check out some other posts I’ve written about startups, such as:
I made an interesting discovery today.
“Free Market” vs. “Labor Union” in Google Ngram Book Viewer
Explains why no one has heard of labor unions and everyone is raving about the free market
(by the way, you can download the entire dataset behind this neat little Google Labs project)
Let’s say that you want to make some money at the horse tracks. The guy behind the booth gives you the following choice:
Most people, even if they don’t know much about horse betting, will rightfully think The Option is the way to go. No matter how much information you might have about each of the horses and riders before the race, the information you gather during the race is much more valuable. A top-ranked horse might break a leg, or its rider might trip up on the start. Even if you pick a favorite before the race begins, once it gets going and you see him lagging behind, the economic reality put in front of you will override your speculative capacities.
In short, it’s easier to play the option than the bet. It’s less risky. It’s more fun. There are fewer ways to lose. So, it’s no surprise most horse betting does not work this way.
However, let’s analyze the interesting crowd behavior that emerges during a race where The Option is available.
Some Examples
Let’s say one horse suddenly pulls ahead of the pack. Those who are paying attention and who hold the option will quickly exercise it. Many other gamblers will miss their opportunity, creating an oversubscribed horse.
Take another example. Since the bets are public, other gamblers will be inclined to pay attention to signals from the gamblers with the best past record of picking horses. If a well-known gambler with an excellent record picks a horse in the middle of the pack, its two other spots will likely fill up very quickly.
Some gamblers will avoid making picks based on sudden movements by the horses or other gamblers. Instead, these less emotional players will wait until a reasonable time period before the end of the race — say 30 or 45 seconds. They will wait until they have all the trending information necessary to make a call. They won’t mind picking a #2 or #3 winner — they just want minimal risk of picking a loser.
Finally, some horses will go unpicked altogether. That’s not to say they aren’t strong horses with a lot of potential. They just have a lot of perceived risk. And gamblers don’t like taking on risk when The Option gives them a way to altogether mitigate it.
Changing the Rules
A horse race with The Option at play seems complicated enough, but let’s change the rules a little more. Let’s say the race doesn’t end — doesn’t have a set start time and end time. Let’s say that once a horse gets 3 options exercised on it, it’s removed from the race. Let’s say new horses enter the race every day, and that horses only tend to run for between 2-5 minutes each. After 5 minutes, most riders remove their horse from the race on their own.
In this new race, small signals have a big impact. Charge ahead suddenly and you might get your 3 spots filled. Convince a top-tier gambler to go to bat for you, and you’re all set — your other two spots will fill up quickly. If you’ve been in the race before and had your spots filled up quickly, you’ll likely get them filled up quickly when you enter the race again.
Managing which options to exercise also becomes a little trickier. With new horses entering all the time, the biggest gamblers will start to have a horse flow management problem. They’ll delegate some of the initial analysis and tracking work to junior gamblers. Riders and horse owners will start to do outrageous things to get noticed by the gambling ecosystem.
As but one horse of hundreds or thousands, it will therefore become possible for you to run a steady race without any of these signals, and basically go unnoticed in the crowd. Occasionally, you will get considered by the unemotional, data-oriented gamblers. However, they will wait until the last possible minute before pulling the trigger on your option, because it is in their interest to do so.
Do the Gamblers Change the Game?
One has to wonder what effect The Option has on the race. For example, if the goal is to leave the race via The Option — rather than to cross the finish line — will some riders start slow and then sprint quickly ahead to attract the fast-acting trend-followers? Will some riders try to network with the top-tier gamblers before they enter the race, to win their favor outside the game itself?
Would any gambler even choose The Old-Fashioned Bet anymore? It seems to me the only time someone would is if the rider and horse had such a great past record that they were a favorite to win. No-name riders and newly-trained horses will never garner Bets, only Options.
And what about those hundreds of mid-range horses that never make sudden forward motions? Instead, they slowly and steadily continue to beat their best lap time. They aren’t flashy and never catch a glance from the top-tier gamblers, or even the stadium audience. For them, there is but one choice: Keep riding. Keep pushing. A gambler may notice you at the last moment, just when you think you haven’t any stamina or wherewithal left.
Sure, it won’t seem fair, to see all these other horses leaving the race early. But don’t look left and right — just look ahead. Before you know it, you may not even care about The Option, since you will have crossed the finish line all on your own.
So, next time you hear a “maybe” when you ask someone to bet on your horse, remember: it’s easier to play the option than the bet. The reason you entered the race in the first place was to ride like hell and win for yourself, not to be someone else’s winning bet.
Pythonic isn’t just idiomatic Python — it’s tasteful Python. It’s less an objective property of code, more a compliment bestowed onto especially nice Python code. The reason Pythonistas have their own word for this is because Python is a language that encourages good taste; Python programmers with poor taste tend to write un-Pythonic code.
This is highly subjective, but can be easily understood by Pythonistas who have been with the language for a while.
Here’s some un-Pythonic code:
def xform(item):
    data = {}
    data["title"] = item["title"].encode("utf-8", "ignore")
    data["summary"] = item["summary"].encode("utf-8", "ignore")
    return data
This code is both un-Pythonic and unidiomatic. There’s some code duplication which can very easily be factored out. The programmer hasn’t used concise, readability-enhancing facilities that are available to him by the language. Even lazy programmers will recognize this code’s clear downsides.
Here’s another version that is more idiomatic but is nonetheless still un-Pythonic:
class ItemTransformer(object):
    def __init__(self, item):
        self.item = item

    def encode(self, key):
        return self.item[key].encode("utf-8", "ignore")

    def transform(self):
        return dict(
            title=self.encode("title"),
            summary=self.encode("summary"))
Nothing about this code is particularly unidiomatic. I might even see code like this in many popular open source projects. But it’s in poor taste. It’s un-Pythonic.
What is the code doing? It’s just taking an incoming dictionary, encoding its values using utf-8, and returning a new dictionary with those encoded values. There is no need to introduce an ItemTransformer object — it’s an extra abstraction and is just making the signal-to-noise ratio poorer. People coming from Java often write un-Pythonic code because Java is a language that does not reward good taste. The Pythonic view is: programming is hard enough — let’s not make it harder for ourselves.
Here’s a more Pythonic version:
def encode_news_item(item):
    def encoded(*keys):
        for key in keys:
            yield (key, item[key].encode("utf-8", "ignore"))
    return dict(encoded("title", "summary"))
This code shows comfort with Python’s features, but does not abuse this comfort by obfuscating the code with mind-bending constructions. The programmer has reduced the problem to two comprehensible subproblems: creating a stream of tuples (key, encoded_value) and constructing the new dictionary from that stream. This leverages the elegant fact that in Python, dictionaries can be easily constructed from (key, value) tuples.
This version avoids code duplication while also making the last line (the return statement) a rough description of the entire function: “return a dictionary of the encoded values for the keys title and summary”. Idiomatic, yes. But also tasteful, and thus Pythonic.
Even Pythonic code can be improved:
def encode_news_item(item):
    encoded = lambda key: item[key].encode("utf-8", "ignore")
    keys = ("title", "summary")
    return dict((key, encoded(key)) for key in keys)
And though Pythonic code is often smaller than its un-Pythonic counterparts, the experienced Pythonista knows the road to hell is paved with good intentions:
def encode_news_item(item):
    keys = ("title", "summary")
    return dict(zip(keys, map(lambda key: item[key].encode("utf-8", "ignore"), keys)))
Ick… time to hg revert this idiomatic Python code to the more Pythonic version.
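As a footnote, versions of Python with dict comprehensions (2.7 and 3.x) allow one more variation on the same function. This is a sketch, not a verdict on taste: the `keys` default argument is an addition made for this example, and on Python 3 the encoded values come back as bytes rather than str.

```python
def encode_news_item(item, keys=("title", "summary")):
    # Build the new dictionary directly from the keys we care about
    return {key: item[key].encode("utf-8", "ignore") for key in keys}


item = {"title": "Groovy", "summary": "A dynamic JVM language", "extra": "ignored"}
print(encode_news_item(item))
```

Whether this reads better than the generator version is a judgment call, which is rather the point of the word "Pythonic".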
Last summer, we got our company, Parse.ly, off the ground at DreamIt Ventures incubator program in Philadelphia. Since then, we’ve talked to a lot of founders about our experience in the program. Many founders are data-driven people who are looking for concrete advice about how to optimize their experience at these programs. One of the most successful runway-extending pieces of advice we have given has been to keep food costs low. We were able to get our food cost down to $4/person/day through some simple planning during that summer, and each of us also lost 10-15 pounds in the process. We felt great, were productive, and made our DreamIt investment last. I think this might be one of the core reasons for our company’s survival and success. This is the story behind “The Startup Diet”.
DreamIt Ventures had just cut us a check for $20K to get our startup off the ground. But my cofounder Sachin and I were worried. $20K seems like a lot of money, but it’s actually not that much. Not when you’re using it for both living expenses and to hire other people to get your company off the ground. So we started planning our spend and rationing the money immediately.
We knew we’d use some of the money for our living expenses. We had just arrived in Philadelphia, and we were living in a startup house with Matt and Burak, the founders of Tidal Labs, and Jack, one of the founders of SeatGeek. It turns out that rent wasn’t that expensive in Philly, especially in this arrangement. Instead, our number one cost, we determined, was going to be food.
The Next Meal
One of the founders of modern management, Peter F. Drucker, once wrote, “In all recorded history there has not been one economist who has had to worry about where the next meal would come from.” How true. But every startup entrepreneur I talked to was obsessed with one thing and one thing only: “runway”. In other words, where the next meal would come from once the small amount of funding we got ran out.
With some good take-out and food truck options in University City (where DreamIt’s offices were), we were already spending as little as $10/person/day. But actually getting this food wasted valuable time, and it usually wasn’t very healthy. We heard reports of the “Startup 15”: the 15 pounds many founders gain when beginning work on their companies. We also knew that when you eat like crap, it affects your productivity negatively. Finally, we knew healthier takeout options were prohibitively expensive. This presented a serious problem. Thinking like founders, we decided to hack this situation to our advantage.
Most companies at DreamIt figured it couldn’t get any better than the food trucks, and stuck to low-cost takeout most every day. We wondered — was there a way we could cut our food cost down, eat healthily, have positive productivity, save time, and hell, even lose weight in the process?
The Birth of a Good Idea
Thus was born The Startup Diet. What we needed was a diet with the following properties:
Sachin went online researching diet options and came across Tim Ferriss’s blog. Ferriss takes a scientific approach to living, and is best known for his popular book, The 4-Hour Workweek. The relevant article Sachin found was called How to Lose 20 lbs. of Fat in 30 Days… Without Doing Any Exercise. This provided an excellent starting point.
The Four Rules
I encourage you to read his post, but for the impatient, here are his rules:
The diet we came up with was basically a fork of Ferriss’s. We followed his four rules. The most important of these ended up being rules #2 and #4. By turning our meals into a routine, it became much easier to make them efficient and reduce the time spent buying, prepping, and cooking food. By taking one day off per week, we didn’t feel that we were “missing out” on important business opportunities as a result of our diet — e.g., lunch appointments with mentors or a night out at the bar to blow off steam with the team and other founders. Our “cheat day” also had an interesting other efficiency effect: we were just forced to schedule all of our breakfast/lunch/dinner meetings for a single day, and thus didn’t interrupt our work week with distractions.
The Core Insight: Beans as a Base
The goals of our diet were different than Ferriss’s. But like him, we used beans/legumes as a base. Beans are an excellent base food for a startup diet because they meet all of our core requirements above. But the best thing about beans is their startup economics. If you consider that our goal is to ingest the cheapest/healthiest possible calories, you can’t find a more flexible foodstuff than canned black beans. You can buy organic canned black beans for as little as $1.50/can. One can of beans is around 600 calories.
If you look at black beans from an efficiency standpoint, you get 4 calories per $0.01. Compare a slice of pizza which is 200 calories and will usually run you $2.50; that’s only about 0.8 calories per $0.01, making beans about 5X more efficient. But beans aren’t just more energy-efficient, they’re also loaded with protein and fiber (about 7g of each per serving). Most of the cost in a diet comes from eating expensive proteins like chicken and beef, and beans have the ability to provide both efficient calories and efficient proteins at the same time. Finally, you can buy canned beans in bulk and store them.
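The arithmetic behind that comparison can be sketched in a few lines of Python, using only the prices and calorie counts quoted above. The function name `calories_per_cent` is made up for this example.

```python
def calories_per_cent(calories, price_dollars):
    # Calories bought by each $0.01 spent
    return calories / (price_dollars * 100.0)


beans = calories_per_cent(600, 1.50)  # one can of black beans
pizza = calories_per_cent(200, 2.50)  # one slice of pizza
ratio = beans / pizza

print("beans: %.1f cal per cent" % beans)   # beans: 4.0 cal per cent
print("pizza: %.1f cal per cent" % pizza)   # pizza: 0.8 cal per cent
print("beans are %.0fX more efficient" % ratio)  # beans are 5X more efficient
```

Plugging in your own local prices is a quick way to rank candidate staples before committing to a meal plan.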
For other nutrition, we would lean on four more sources: eggs; salad (mixed greens, romaine lettuce, tomatoes, and peppers); prepackaged vegetarian protein sources like tofu, seitan, and wheat gluten which we would use sparingly; and canned soups.
The Trial Run
We went to the supermarket to run our first trial. We knew this would need some coordination. Our first trick was to make an up-front investment in lots of differently-sized Ziploc containers and canvas carrying bags, so that we could flexibly relocate food where we needed it (home, office).
We wanted to prep food for the week ahead of time and bring it into the office to ensure we didn’t cheat on the diet. The next trick was to come up with a meal plan using the four rules and our beans-as-a-base theory. Here is the meal plan we came up with:
An Agile-Friendly Diet
I mentioned Matt from Tidal Labs. He was also our landlord. His Spruce Hill House was the perfect atmosphere for getting some serious summer hacking done. But one of the nicest (and eventually, most useful) things about the house was the huge kitchen and huge Sub Zero refrigerator. Matt later told me he bought the refrigerator off eBay for $1 (it wasn’t functional and the owner just wanted to get rid of it), and then fixed it himself. He is scrappy and awesome that way.
Sachin and I treated this Sub Zero fridge as our diet command center. We would prep our meals before and after our long days at the office. One of us would prep breakfast for the day while the other prepped lunch/dinner materials for the next 2 days. We’d collaborate on prepping dinner. Every couple days, we’d swap.
We cherished this time for its management value to the company. In the morning, we’d have a founder standup: we’d figure out the things we wanted to accomplish for the day while prepping and eating breakfast. We’d evaluate the day over dinner, a kind of mini retrospective. The Startup Diet became a kind of management technique.
Making It Work
Perhaps by now you’re convinced that this diet has a lot of cost and efficiency benefits. You can’t beat living on $4/person/day for food. You can’t beat saving (by our estimate) 1-2 hours a day on prep and meal time.
But the main question we got about this was, didn’t that taste horrible? Surprisingly, no. We used a lot of obvious flavor enhancers to keep things tasting pretty good. We invested in good olive oil, sea salt (in a grinder), black peppercorns (in a grinder), and threw in a liberal helping of spices, hot sauce, and herbs.
The second common question we got was, wasn’t it mind-numbingly boring to keep eating the same thing over and over again? Again, surprisingly, no. The first week you wonder how you’ll go a whole summer like this. The second week, it just becomes the routine.
Remember, startups are hard in the early stages. Every day requires your fullest attention. Therefore, turning meals into a routine actually made a lot of sense: it reduced the degree to which we were interrupted in getting the company off the ground. Our bodies got used to it, and we hardly even noticed that we were eating the same meals week by week. Rule #4, the cheat day, was the real savior in this respect. We used to look forward to our cheat days, and we made them count. Cheat days also acted as a kind of reward for a week of focused work.
Making It Your Own
I am not a doctor or a nutrition expert, and so all the regular caveats apply. But this was how we were able to keep our costs down while Parse.ly was being incubated at DreamIt Ventures.
Learn to love black beans and other legumes — they are a healthy, flexible and efficient food. Plan your approach. Turn eating into a routine, and prepare most of your meals ahead of time. Cut out the carbs, beer and soda — drink lots of water instead. Use meal prep time to manage your company. Use your cheat day to manage external relationships and reward yourself.
Stay focused on your startup. Lose weight. Gain productivity. Extend your runway.
The rest is up to you.
One of America’s greatest strengths is social mobility. There are several cases of an individual starting with nothing and persevering to become rich, powerful, and influential. Success stories of this kind have become an important part of American business mythology, especially in the world of entrepreneurship. They are strong motivators for individuals embarking on companies of their own.
For those of us who start companies, we see the company as a vehicle to creating something valuable and lasting for society, while also advancing our personal goals. This isn’t usually hubris or ego, though sometimes it may be. Instead, it’s usually an attempt to make your time worthwhile: to yourself, to those close to you, and — if you’re lucky and persistent enough — to the entire world.
The problem with social mobility is that not every individual begins at the same starting line. In fact, the range is huge. Those who start with an influential family or significant capital resources have a much easier time getting to the top. For those who don’t have this head start, things are a lot harder.
Though America is not entirely merit-based, it can reward individuals for hard work. I’ve experienced the benefit both of an advantageous starting point and hard work in my 26 years on this earth. I also believe that with each step and milestone in my life, my potential to create enduring value for society has increased significantly.
Beginnings
The inspiration for this essay was a comment I read online about a successful young businessman who was the son of a successful businessman. “I’d [like to] read a story about a 25-yo [who] made good on the same scale[,] who went to a state college, had screwed up parents who were too busy fighting with each other or getting drunk to even have a clue what he was doing, isn’t childhood friends with a celebrity… Just happened to be smart and hardworking and optimistic even despite all those factors. That would actually be an interesting story.”
My parents weren’t screwed up, but they did fight a lot — my Mom and Dad separated when I was in elementary school and divorced shortly after that. I’ve not been childhood friends with a celebrity and I don’t have a trust fund.
I’m not saying I came from a disadvantaged upbringing — in fact, quite the opposite. I went to public high school in New York. To a New Yorker, that may not sound like a huge step up in the world, but I recognize that public school in New York represents one of the top educations you can get.
I grew up in a nice house in a quiet suburban neighborhood. I had good, encouraging teachers. My parents were liberal and a positive influence. I didn’t have a silver spoon in my mouth, but I also didn’t have any serious handicap in my upbringing. Probably my biggest step up in the world, given my current trajectory, was that when I was 10 or 11 years old, my Dad noticed an interest I had in computers. And so, he bought me a PC (running DOS / Windows 3.1) and set it up for me on Christmas Day. From that point forward, I was enchanted by the machine. And once I got a 28K modem and dial-up access to the web (on one of the first ISPs, The Pipeline), I became a citizen of the world before I had even hit puberty.
This I do know — though I had a head start, I also worked hard. I was a geek — as I got older, I built my own servers in my basement, taught myself to program, and discovered Linux and Free Software. But I also kept ahead in “the real world”. I did feel a little disconnected from my peers in my private pursuits; reflecting on my childhood, I realize I “grew up” a little more quickly than my peers.
My First “Startups”
At 15, I had built and launched somewhat popular websites for a couple of personal projects. Nothing you would remember — each one was related in some way to an interest I had in multiplayer online video games, an interest I aged out of a year later. Each website had between a few hundred and several thousand visitors a week.
But the sites looked nice. When I told my family friends and teachers in my local suburb that “I built websites”, they were all interested in having me work for them. Recognizing the opportunity, I turned this into a business.
By 2010 standards, the websites I built were boring and simple — static designs, gaudy flash “intros”, contact forms and quickly-outdated information portals. Yet in 1999, this was cutting-edge. So cutting-edge that people were willing to pay to be on the web. I printed up business cards and called my company a “digital online identity firm”. Might seem like an obvious thing now, but I felt proud to deliver that value to my early-adopter customers. This was my first “B2B startup”. During this period, my friends would ask me to hang out on the weekends, but I’d be working on HTML/CSS, JavaScript and Macromedia Flash, building up my online portfolio.
At 16, I was approached by the editors of a technology publishing company called Friends of ED to write a couple chapters for a book of theirs called Flash 5 Dynamic Content Studio. They found me on the web — my first “digital introduction”, before the days of LinkedIn. On the weekends, I regularly answered people’s Actionscript programming questions in a listserv called flashpro. They liked that I had both a Flash and programming background, so they contacted me to contribute to their book — one of the first that discussed building backend systems and tying them to Flash frontends.
I corresponded with them entirely via e-mail. I wrote the chapters and edited about 80% of the book during my junior year of high school, while also prepping for SATs and acting as editor of my school newspaper. I probably got 3 hours of sleep a night. My friends thought I was crazy.
When the book finally went to press and started showing up on Amazon.com and in Barnes & Noble stores, it was one of my proudest moments in my early adulthood. I just thought about how the work I did was being seen and read by thousands of people across the world. Each of those individuals would go on and use that knowledge to build web applications for other individuals. There was seemingly no limit to the impact of my work. I may have been naive, but I thought, what could be better than this?
I still remember when the editors in the UK found out I was only 16 (it was while we were negotiating payment). They were shocked; they actually didn’t believe me. They had to check with lawyers to see if they could even pay me. Then they paid me the money they owed me for my hourly editing rate and IP rights, and I had all the savings I needed to live throughout my first couple college years without asking my parents for support.
By 16, I had some unique experiences. I created a market in my skills. I built working software for companies small and large. I contributed to a project that had global impact. I made some real money. And, in my junior and senior years of high school, I got a taste of the competing demands of personal and work life.
College: Creating Value for Myself
When I got into NYU on scholarship, I was thrilled. My grades in high school let me go to a good college without breaking the bank. I would be able to attend a top-tier school without the financial stress typically associated with private schools. I’d get to stay in New York, one of my true loves. I had also been rejected by just about every other top-tier university in the country. Come to think of it, the ups and downs of the college application process were good training for startup life, especially the rejections.
I worked my butt off in college. I really got into it. Not just the coursework, but also just pursuing my passions within the student body. I held talks that tried to stoke up my peers about technology, economics and open source software, all passions of mine. My friends rarely saw me because I was often working on these and other spare time projects.
College was interesting. I worked hard, but rarely got paid for my work. In some sense, working hard in college is good preparation for a startup lifestyle. You work hard because you believe in yourself and in your own potential, not because you have a paycheck coming.
It was also in college where I came to an important realization: I am a software engineer and computer scientist. Though there are many ways for me to add value, my preparation and passions make me especially well-suited to building software. And so, that’s where I should have an impact.
Internship: Creating Value for Others
I took an internship in summer of my sophomore year where I got paid a flat $3000 to build a web application for a NYC-based non-profit, The Unemployment Action Center of NY. I probably spent 750 hours on the thing (60 hours a week), not making it a very profitable proposition. My friends again thought I was crazy — lots of them had taken well-paying and prestigious internships at software companies and banks, but I didn’t even apply to those.
This was my first experience with building a real product that people actually used. I worked entirely from home. I used my own development environment and hosting environment. I gathered requirements directly from the end-users. I made all the technical decisions and implemented all the code. That was good experience for startup life. Built useful stuff, solved real problems. Made lots of little decisions.
The case management application I built is still running today, relatively bug-free, and has helped thousands of unemployed people get legal representation for free, so that they can get justice for wrongful termination and other cases. I’m really proud of that application. I didn’t care that I got paid $4/hour for it. Looking back, I would have done it for free. What a phenomenal organization and what a great cause to apply technology toward.
Post-College: Creating Value for Money
The rest of my story is told pretty well by this NY Observer article.
I worked at Morgan Stanley for a couple of years while living at home on Long Island to save money. Working at the bank was a huge change from the rest of my life. The primary focus of my job became my paycheck. Technology decisions were made for me, by external committees. I was expected to follow all sorts of processes and procedures. I was expected to become a conventional programmer. I found myself building products that had an unclear end-user.
I did learn things there, too. I learned what it was like to work with engineers on a daily basis. I learned a lot from my managers and colleagues, many of whom were just astoundingly intelligent individuals. I learned how big companies operated. But mostly, I learned a lot about myself.
I learned that work meant more to me than a paycheck. Work should be about solving problems, helping people, and creating enduring value. Money is important, but it wasn’t what attracted me to technology in the first place. And that’s why I knew I had to leave that firm.
Startup Life: Creating Lasting Value for Myself, for Others, and for Money
I accepted this compromise: I would work at the bank to save the money I needed to build my own company. I was “realistic”, and this seemed a fair compromise.
I anticipated that I’d need at least a $20K buffer to start my startup (it ended up being more than that). And in March of last year, I quit Morgan Stanley to embark on this path. My awesome, like-minded friend from NYU, Sachin Kamdar, quit his job too. Together, we tried to build something of value, even though we didn’t know what that would be initially. We worked out of cafes, brainstorming ideas and building prototypes. We consulted on the side — I, through the boutique software engineering firm I founded, Aleph Point. Early on, my girlfriend prophetically referred to my financial strategy in all this as “putting a bandaid on a hemorrhaging femoral artery”.
Our friends thought we were crazy. Chuck a steady paycheck to work on an ill-defined “startup”? For some reason, all we could think was, hell yes we would.
Our first stroke of luck was getting into DreamIt Ventures. NYC didn’t have seed programs like Tech Stars or Seed Start yet. We moved to Philadelphia for the summer, leaving our NYC apartment leases and girlfriends behind. They thought we were crazy. The $20K from DreamIt basically covered our living expenses in Philly, but it did lift some financial pressure for a few months. We came up with a $4/person/day diet — which we called “the startup diet” — that involved eating beans, soup, romaine lettuce, and veggie burgers basically every day. We lived in a “startup house” with the founders of SeatGeek and Tidal Labs. We worked in an office with the founders of Postling, NoteHall, and other fledgling companies. We started hacking on what would eventually become Parse.ly. And for the first time in our lives, we were surrounded by individuals who also felt that what one does is more important than how one benefits.
Our second stroke of luck was finding our exceptionally talented lead engineer, Didier Deshommes, who has been with us ever since we incubated Parse.ly in Philadelphia. Ever since DreamIt, we operate on this maxim: we got our share of luck — now, let’s make our own.
Creating Your Own Luck
Perhaps our startup founding story isn’t remarkable. But lately, reflecting on the last year — where we bootstrapped and self-financed this startup, overworking ourselves and straining our personal relationships — I’ve started to feel like it is pretty remarkable.
We made our own luck. And we’ve brought all our skills and experience to bear. I’ve been preparing for running Parse.ly since the age of 15. I’ve been preparing, basically, my whole life. And I’m so glad we did this together. With all the ups and downs, if I had done this on my own, I would have given up. But we just kept pushing each other. Every time we hit a roadblock, we just said, “we can still do this.” And we did.
I’m really proud of this company and our team. We’ve built a real technology of lasting value. We’ve learned so much. And we’ve done it all against the odds. We are a small NYC startup with two no-name founders. Neither of us has “prior exits” or “a solid track record”. Neither of us has built and launched startups before. But none of that matters. We discovered that people want their content filtered and prioritized and that online publishers want their content optimized, and we used elbow grease, hustle, and dedicated hacking to make the rest happen.
Our startup is three hard-working and optimistic guys who believe that we can change the world. We put it all on the line to create something from nothing. Our work is still in progress, and there are still challenges ahead. There are easier things we could be doing, but we don’t care. You may think we’re crazy, and you may be right.
But that’s the final lesson of startups, isn’t it? You have to be a little crazy.
—
Thanks to Sachin Kamdar for editing this essay. And thanks to our awesome friends, significant others, and families for still cheering us on even as they rightfully thought we were crazy.
Nobody moves to New York because they think they’re just like everybody else. A young kid, fueled by a toxic blend of bravado and wicked insecurity, can expend a truly terrifying amount of energy trying to prove her exceptionalism, prove that she is different (read: better) than the dull hometown peers she left behind, who go to uncreative jobs in uncreative clothes, eat at Chili’s, practice monogamy.
Describes many non-native New Yorkers really well. No offense, non-natives, but some of you are definitely trying to prove something! From Dirty pictures I didn’t want taken
My good friends at HiiDef just launched a new app that has been in beta for a while, Flavors.me. This is an excellent tool that has a great, simple, and usable design.
What’s the value proposition of Flavors.me? It’s to unify your various “online identities” into a single, dynamic, automatically-updated, and elegant website.
What do I mean by that? OK — so, like most people on the web, you spread public information about yourself in multiple places. You might run one or two blogs (personal and work?). You might have a Facebook account, a Twitter account. You may share your favorite books at GoodReads, your favorite movies at Netflix, and your favorite music on Last.fm.
Flavors.me lets you take all that information and put it together in a single website to serve as your “online identity”. All your publicly shared information, aggregated in one place, and displayed beautifully.
I’ve been running a Flavors.me site for some time that you can see here: http://flavors.me/pixelmonkey
Now, that’s the end product. All the content gets pulled dynamically from your various online feeds. The real magic with Flavors.me is how easy it is to get there. You can drastically change the look and feel of this site using a dynamic, “WYSIWYG” interface. You can do one or two clicks to add a service, reorder it, rename it. Another couple of clicks and you change font sizes, colors, and even the overall layout.
As I was writing this post, I noticed that Flavors.me had added LinkedIn support, which wasn’t available earlier in the beta. So I went ahead and added it. I simply logged in and pulled up the design panel.
From there, I could navigate over to “Services”, click the “LinkedIn” logo, and Flavors.me would guide me through the authorization process to make LinkedIn data available to them for display on my page. You simply get redirected to LinkedIn.com, log in there (if not already logged in), and then get redirected back to the Flavors control panel.
Next, I could simply drag-and-drop the service box in the panel in order to reorder it on the page. Yay, drag-and-drop! There is also in-line editing, and editing the title of the LinkedIn “popout” dynamically updates my profile page in real-time! Nice touch.
Finally, I can go to the “Design” panel to tweak fonts, layout, sizes, etc. Look at how Flavors.me displays fonts. It’s amazing — this is the web, but it’s easier to change my Flavors.me web design font styles than it is to change font styles in a Microsoft Word document!
Now, depending on where, exactly, you store the bits that make up your online identity, you may find yourself disappointed by a lack of this-or-that service. They already support quite a few, and they are adding more in the future, I hear. The one I sorely missed was Delicious, so I could share my bookmarks with the world.
But wait! For those services that still don’t have first-class support, Flavors.me gracefully supports RSS feeds of any variety. I simply popped the RSS feed for my Delicious bookmarks into Flavors.me services panel, and, voila! — my bookmarks are now publicly shared!
It’s this kind of simplicity, design sense, and user-centric approach that makes me love the web as a place to develop, deploy, and use software. So, what are you waiting for? Sign up for Flavors.me today: it’s free!
Andrew Leonard of Salon.com has written an article about switching from Chase to a local community bank, in response to HuffPo’s MoveYourMoney campaign.
I’ve written on this blog multiple times about my frustration with Chase bank, but it’s interesting to see someone with as big a readership as Andrew Leonard writing about it. Are commercial big banks’ days numbered?
I recently re-read Douglas Crockford’s JavaScript: The Good Parts. I have been writing more and more JavaScript lately, especially object-oriented JavaScript plugging into existing frameworks. Re-reading the book has definitely been a useful exercise — I think when I first read it approximately 6 months ago, I didn’t fully understand it. But now, I do.
I also found it very interesting to hear Crockford wax poetic about the virtue of simplicity in all forms of software design. The following passage concludes the book.
When I started thinking about this[...], I wanted to take the subset idea further, to show how to take an existing [product] and make significant improvements to it by making no changes except to exclude the low-value features.
We see a lot of feature-driven product design in which the cost of features is not properly accounted. Features can have a negative value to consumers because they make the products more difficult to understand and use. We are finding that people like products that just work. It turns out that designs that just work are much harder to produce than designs that assemble long lists of features.
Features have a specification cost, a design cost, and a development cost. There is a testing cost and a reliability cost. The more features there are, the more likely one will develop problems or will interact badly with another. In software systems, there is a storage cost, which was becoming negligible, but in mobile applications is becoming significant again. There are ascending performance costs because Moore’s Law doesn’t apply to batteries.
Features have a documentation cost. Every feature adds pages to the manual, increasing training costs. Features that offer value to a minority of users impose a cost on all users. So, in designing products[...], we want to get the core features—the good parts—right because that is where we create most of the value.
We all find the good parts in the products that we use. We value simplicity, and when simplicity isn’t offered to us, we make it ourselves. My microwave oven has tons of features, but the only ones I use are cook and the clock. And setting the clock is a struggle. We cope with the complexity of feature-driven design by finding and sticking with the good parts.
It would be nice if products[...] were designed to have only good parts.
I removed direct references to the main subject of Crockford’s discussion — namely, the JavaScript language itself. The truth is, this advice is much more valuable for the design of all software products. Perhaps one day someone will write the much needed book, Startups: The Good Parts.
I’ll start off this post with a somewhat controversial claim: I invented Dropbox.
I’ll show why this claim doesn’t matter later, but for now, I’ll assure you that it’s true.
How many of you out there use Dropbox? If you don’t, you should — it’s an excellent tool. In its free version, it provides you with 2GB of storage “in the cloud”, using a new kind of folder called a “Dropbox”. What distinguishes a Dropbox from other folders on your computer? The following:
Dropbox is supported on Windows, Mac OS X, and Linux, and now even has mobile applications, as well. Further, I have a special place in my heart for this service because I started using it almost 2 years ago, and it has acted as a file sharing and project management tool for my own startup’s internal operations at Parse.ly. I was therefore more than ecstatic to discover that this excellent tool and its smart founders had also made it through all of the hurdles necessary to get an early-stage company the financing it needs: they’ve raised over $7 million in financing and have over 3 million users.
But there is another reason I absolutely love Dropbox: because it was my idea. I invented it.
In the summer of 2004, I was really itching to get into Google’s Summer of Code competition. This was the summer I had taken a job working from home as the lead web developer at the Unemployment Action Center of NY. Though the job was great experience — letting me build my first full web application for a real client — I was itching to work on a technically juicy problem, something that affected me in my daily computer use.
And so, I sat down for a day and wrote up a Google Summer of Code proposal for a new system I had invented called Persistent Folders. It wasn’t exactly like Dropbox, but damn close. Even the implementation is close: Dropbox and my system both sync files using rsync, and both use a Python daemon process. The main difference is that since my system was meant to be open source, it did not require the use of a company-maintained service; instead, I proposed that users piggyback existing storage they have via web hosting providers.
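To make the rsync-plus-Python idea concrete, here is a minimal sketch of a single sync pass. It is not the code from my proposal and not how Dropbox actually works internally; the function names, the example folder, and the remote host `joe@example.com` are all hypothetical, and it assumes `rsync` is installed and the remote host accepts SSH logins. It only pushes in one direction, whereas a real daemon would also watch for changes and handle conflicts.

```python
import subprocess
from typing import List


def build_rsync_command(local_dir: str, remote: str) -> List[str]:
    """Build an rsync command that mirrors local_dir onto remote.

    -a preserves permissions and timestamps, -z compresses data in transit,
    and --delete removes remote files that no longer exist locally.
    """
    # A trailing slash tells rsync to copy the folder's contents
    # rather than nesting the folder itself inside the destination.
    source = local_dir.rstrip("/") + "/"
    return ["rsync", "-az", "--delete", source, remote]


def sync_once(local_dir: str, remote: str) -> int:
    """Run one sync pass and return rsync's exit code (0 means success)."""
    return subprocess.run(build_rsync_command(local_dir, remote)).returncode


# Hypothetical example: Joe's folder pushed to storage on his web host.
command = build_rsync_command(
    "/home/joe/Personal Accounting",
    "joe@example.com:persistent/accounting/",
)
print(command)
```

A daemon would call something like `sync_once` on a timer or whenever the folder changes; that loop is left out here to keep the sketch short.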
Unfortunately, my project wasn’t selected.
Why am I posting this? I recently had a discussion with another engineer after I had discussed some of the technology behind Parse.ly with him. He was surprised at how liberal I was with explaining our internal implementation, architecture, and algorithms. He asked me, “Aren’t you worried that I could steal your idea?”
I responded, “You can steal it all you want; I dare you to try and implement it!” I then explained that to me, ideas don’t matter. I had the idea for a hundred startups that now exist before they started. I know from talking to users and customers of Parse.ly that they had our idea before we implemented it. What matters in software is not an idea, but execution of that idea. Ideas are a dime a dozen.
I began this post with the statement, I invented Dropbox. And now I’m here to tell you that it doesn’t matter one bit, because I never implemented Dropbox. And you can’t own ideas…
If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it. Its peculiar character, too, is that no one possesses the less, because every other possesses the whole of it. He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me.
-Thomas Jefferson
Imagine if when I had come up with this idea, I had patented it. Would it really have been fair to force Dropbox to license my technology? To sue them for “infringing” on my patent?
No way — these guys deserve the success they have. How stifling it would have been for me to put any roadblocks in their way because “I had the idea first”! The world is better because Dropbox exists. And, these guys have had — in my opinion — near-flawless execution. So, kudos to them.
Consider the following matrix:
| | Mediocre Idea | Great Idea |
| No implementation | | |
| Mediocre Implementation | | |
| Great Implementation | | |
A mediocre idea with a good implementation is worth infinitely more than a carefully-guarded, good idea with no implementation. Of course, the best products are both great ideas and great implementations. And I think my proposal for “Persistent Folders” — written three years before Dropbox even started — proves this to me in a very personal way.
Now, I’m certain the Dropbox guys never read my little proposal, because it was only sent to Google and otherwise sat on my hard drive for years, not viewed by anyone. I had a great idea, but no implementation. And the Dropbox guys took a great idea (one they arrived at on their own, I’m sure), and gave it the implementation it deserved.
For those of you who guard your ideas carefully, I’d suggest you stop wasting your time, get off your butt, and focus on actually building stuff. Because if you don’t, someone else will!
For the curious, below is my proposal to Google Summer of Code 2004, unaltered from its original draft. This is just a relic; for now, I’m glad to keep hacking on my own little idea, hoping one day I can look back and say I executed as well as Dropbox did…
PERSISTENT FOLDERS: A New Metaphor for Data Synchronization
May 13, 2004
The Problem
Computer users are more and more finding themselves with a serious problem: the fragmentation of personal data across multiple physical machines, and even multiple operating systems. Users who find it most comfortable to have a desktop machine at home and a mobile laptop computer “on-the-go” (for business or trips) have to deal with chaotic and often frustrating manual methods to copy that data to the needed places. Some users carry USB memory sticks — which hold their important “working set” of files — and either work directly off those disks (suffering reduced speeds) or make copies of the folders therein on their actual machines, as they become needed, spreading yet more copies and compounding the problem of synchronization. Others abuse ubiquitous technologies for accessing important files, by either e-mailing the files to themselves or by uploading them temporarily to web or FTP servers.
The end result of any of these methods is fragmentation of personal data, with error-prone manual processes for replication, and the inability for a consistent way to search, backup or even just keep track of all the important files and folders spread among the PCs in question.
What is needed to solve this problem, or at least make the problem more manageable, is an intelligent, user-friendly, customizable cross-platform program that allows for transparent synchronization of personal data across multiple computers and systems, either via a LAN or via the Internet.
Introducing the Persistent Folder
Up to this point, most users are used to understanding folders as existing solely in one location. The only way to get a folder’s data onto another machine is to copy that folder, thus making a duplicate. The goal of this project will be to introduce a new high-level data storage mechanism known as a Persistent Folder. Persistent Folders differ from regular folders in that they are meant to be transparently persistent, or synchronized, across different computers.
To explore a hypothetical example, imagine user Joe sits down to do some personal accounting work on Friday. He creates a folder on his desktop called “Personal Accounting,” and begins working on some OpenOffice spreadsheets in that folder. A few minutes later, Joe realizes that some of his accounting work will have to be done while he is away during the weekend, on his laptop computer.
To enable this, he simply right-clicks the folder on his desktop and says “Make this folder persistent across multiple computers.” When he does this, a dialog comes up asking him to type in a description of the persistent folder, and to select computers on which he wishes to make this folder persistent. Joe is presented with a dialog of computers currently available on the LAN. He sees his laptop, “MobileJoe”, and selects it. He then presses OK.
Within seconds, Joe sees the folder “Personal Accounting” appear on his mobile computer’s desktop, even though he hasn’t even touched his mobile computer yet. When he enters that folder, he sees the same spreadsheets that are available on his main PC’s desktop.
These two folders are now treated by Joe as “one persistent folder available across two computers.” He can add files to one folder and they will automatically be propagated to the other. He can modify files and the new versions will then exist in the other. Joe no longer has to worry about pushing his data back and forth across the computers.
Internet Synchronization by way of Ubiquitous Services
One of the major questions one may ask at this point, “that is all well and good for persistence of a folder on a LAN, but what about when I leave my home/office, and need to synchronize over the Internet?”
One approach would be to have a special server application with which all computers that wish to share persistent folders could synchronize. But then the user needs his own server, and needs to install my little server daemon on it. And not many users have their own servers on which they can just install any old application. So for normal users, this, in fact, is no approach at all.
Although most users don’t run their own servers, many modern users do have _server access_. That is, many users do have Web/FTP hosting providers to whom they pay subscriptions, and these servers power their blogs, personal photo stores, etc. The goal of Persistent Folders would be to allow users to piggyback their existing web services to synchronize their files, essentially turning a folder on that server into a Persistent Folder Repository.
A dialog in the properties of a persistent folder would include a checkbox that allows the user to “make this folder available over the Internet.” It would then allow a user to choose a method for making this folder available via the Internet, asking the user to provide a “persistent folder repository” via FTP by default (due to commonality), but equally possible would be scp, rsync, NFS, or even something like cvs/svn (the goal, of course, would be to make the design modular enough that it could support any server type with basic file system operations).
Then, every other visible machine on the LAN that shares that persistent folder would have this option enabled automatically. If the machines are offline, the option can be entered manually by the same method explained above.
From that point on, the folders become persistently available and transparently usable, just as before. If the other computers are available on the LAN, then the program utilizes LAN speeds and synchronizes directly. Otherwise, it synchronizes with the Internet Repository.
Lucky Joe is now able to work on his files in the office, leave his laptop there (“the darn thing is so heavy to carry around!”), return home, and continue right on working on those files which are now found in the “same folder” on his desktop machine at home.
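The fallback rule described above (sync directly over the LAN when peers are visible, otherwise go through the Internet repository) can be sketched in a few lines of Python. This is only an illustration: the `Peer` class, the `choose_sync_targets` function, and the repository URL are hypothetical names that do not appear in the proposal.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Peer:
    """A computer sharing the Persistent Folder (hypothetical model)."""
    name: str
    reachable_on_lan: bool


def choose_sync_targets(peers: List[Peer], repository_url: str) -> List[str]:
    """Prefer direct LAN sync; fall back to the Internet repository."""
    lan_peers = [peer.name for peer in peers if peer.reachable_on_lan]
    if lan_peers:
        # At least one peer is visible on the LAN, so sync at LAN speeds.
        return lan_peers
    # No peer is visible, so use the user's Web/FTP-hosted repository.
    return [repository_url]
```

With Joe's laptop on the same network, `choose_sync_targets([Peer("MobileJoe", True)], url)` returns the laptop; once he leaves the office, the same call returns the repository URL instead.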
Implementation Ideas
I plan on implementing this idea in Python, since one of the major goals of the project is to allow Persistent Folders to exist not just across computers, but also across operating systems (so that a Windows desktop could have a persistent folder that also exists on a Linux laptop, for example). For a user interface, I plan to use PyGTK, since that’s what I know, and since it is also cross-platform. I need something powerful like GTK since there will be times when user intervention will be necessary, but since it is a goal of this project to reduce the number of times the user must intervene, I want to make sure that when he does, he is presented with sane, human-readable, and user-friendly dialogs.
I don’t plan to reinvent the wheel. Most of the magic of Persistent Folders is just making existing synchronization tools work relatively silently and in a way that makes sense with a user’s workflow. The main tool I am thinking of using is rsync, the relatively ubiquitous UNIX utility for incremental synchronization of directory trees. I considered using unison, as recommended by Ubuntu’s Wiki entry, but I saw two major problems: (1) unison is no longer under active development and (2) it is written in OCaml, a relatively obscure language which I don’t know. Therefore, any bugs I discover in unison would not be fixed in a timely fashion by its developers, and any features I’d like to add to support my idea would be quite difficult to implement.
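A minimal sketch of what "wrapping rsync" might look like in Python follows. It only builds the command line and does not run it. The function name `build_rsync_command` and the example paths are hypothetical. The flags used (`-a` for archive mode, `-z` for compression, `-e ssh` to run over SSH) are standard rsync options. Note that a single rsync invocation copies in one direction only, so a real two-way Persistent Folder would need more logic than this.

```python
from typing import List


def build_rsync_command(local_folder: str, remote_host: str,
                        remote_folder: str, use_ssh: bool = True) -> List[str]:
    """Build an rsync command that pushes a local folder to a remote host."""
    # A trailing slash on the source tells rsync to copy the folder's
    # contents rather than the folder itself.
    source = local_folder.rstrip("/") + "/"
    destination = f"{remote_host}:{remote_folder.rstrip('/')}/"
    command = ["rsync", "-a", "-z"]
    if use_ssh:
        # Tunnel the transfer through SSH.
        command += ["-e", "ssh"]
    return command + [source, destination]
```

The resulting list could then be handed to `subprocess.run(command, check=True)` from the daemon whenever a change is detected.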
As for monitoring folders for changes, I imagine the most elegant solution would be to take advantage of inotify under Linux (like Beagle does) and perhaps handle ChangeNotify events under Windows. I’d really like to avoid polling, since polling is just plain evil for something as potentially neat as this. Regardless, I’ll probably need to code Yet Another Daemon (or Windows service) to keep track of Persistent Folders on the local machine and their equivalents across the network.
Finally, for secure synchronization of files, I plan to use SSH tunneling wherever possible, which is available under Windows under the OpenSSH for Windows Sourceforge project, http://sshwindows.sf.net. Linux distros like Ubuntu, of course, have all the ssh support one needs.
Conclusion
This project aims to introduce a new metaphor users may utilize to share important files across multiple computers: the Persistent Folder. A Persistent Folder is not a shared folder; rather, it is seen as a single folder that exists locally across multiple computers, and can be treated by the user as such. Changes at any one folder rapidly propagate to the others. Properly implemented, this may provide a better way for users to manage important data that might otherwise be scattered, fragmented, and even lost through the daily shuffle of file transfers across networked PCs.
Appendix A: Project Roadmap
o June 24: Begin Work, with ideas now fully developed in the form of documents. Post these documents to Ubuntu’s Wiki to encourage ideas from community.
o July 5: Have a console synchronization wrapper and some network discovery stuff in Python written, and have UI concepts designed in Glade.
o July 20: Make 0.1 (GUI and basic features working) release, so that Google can show it off at OSCON?
o August 1: Have other great features, like the Internet synchronization and multi-protocol support, in 0.2 release.
o August 10: Consider working on features to allow transparent backup and versioning, to include with bug fixes in a 0.3 release.
o August 20: Make simultaneous 0.4 releases on Linux and Windows, hammering out as many cross-platform issues as possible. Add inotify/ChangeNotify support if not already there.
o August 30: The big 0.5 release, finished for Google’s deadline. Let Google/Ubuntu make the decision if it’s worthy of being renamed a “1.0” release.
o September 1: Live a less stressful life, since my files are now neatly synchronized among my PCs! But I’ll keep making it better.
Appendix B: Hold On, Isn’t Samba Good Enough?
One common response to this project may be, “Aren’t Samba and SMB shared folders good enough?” Though Samba is good, I do not believe it is good enough. Here’s why:
(1) Samba does not aim to present the user with a metaphor of a folder existing in multiple locations at the same time. I believe this metaphor would be appreciated as powerful by longtime computer users and subconsciously acknowledged as highly usable by novices.
(2) Samba only allows users to share folders on their drives. Other Samba users may then mount those folders via the LAN. This two-step, asymmetric process already seems complicated and convoluted to end users. But more importantly, synchronization is left up to the user. In theory, users could avoid synchronization altogether and work on the files directly, via the share. In practice, LAN speeds are not adequate, and other issues are raised (such as file locking and write conflicts). This forces users to home-brew their own synchronization protocol to make sure duplicate files at different versions don’t result in an accidental loss of data.
(3) Samba does not provide any easy method to access personal files once one leaves the LAN and enters the WAN, short of opening up a bunch of ports on your router and trying to connect in from outside (a very slow and insecure method).
(4) Samba provides no recourse when a computer is no longer available. Persistent Folders, on the other hand, make data available “offline” by design.
(5) Samba does not know whether two duplicate files exist across the network. Therefore, Samba cannot be immediately utilized as a transparent form of backup.
Is the Persistent Folder meant to replace Samba? Of course not. I believe Samba has specific purposes: to allow for easy, fast, one-time transfers of files across a LAN, and to allow networked printer and device sharing. I do not believe Samba is an adequate solution to allow a user to treat a folder as if it existed on multiple computers at once.
Appendix C: Versioning’s the Thing
Upon further contemplation of this project, I realized I had left a question unanswered in this document. What is the right way to deal with version conflicts during the silent synchronization phase among Persistent Folders?
Imagine user Joe works on a file in a Persistent Folder which exists on his laptop and desktop computers. The first version (let’s call it version 0.1) was created on his desktop, and Joe had the laptop on the same network, allowing automatic synchronization to occur. But then Joe’s laptop was disconnected from his desktop, as Joe was without network or even Internet access. He works on the document a bit, which can now be called version 0.2. But then Joe forgets that he worked on the document on his laptop, and when he gets home, he works on his desktop computer, creating a version 0.2 there too. Which version 0.2 should be synchronized?
Normally, the answer would be to pick the most recent one. But we don’t want Joe’s laptop work to be lost, just because Joe forgot about his prior work, do we? Well, what needs to happen is a bit of smart conflict resolution with no data loss.
I propose that Persistent Folders should also be versioned to a sane degree. Perhaps by default 3-5 versions of files are always kept, with the most recent versions visible directly in the persistent folder, and other versions buried (in .dotfiles or hidden folders) behind there. Then, the Persistent Folder monitor should inform Joe (“whisper” to Joe via the notification area/systray) that there was a conflict, and the newest file was chosen. But at any point, Joe can choose to revert a file back to an older version. The interface should be such that Joe can even see where the file was edited, for example:
Personal Accounts.xls – Prior Versions
o version 0.2: edited on MobileJoe
o version 0.2: edited on DesktopJoe
o version 0.1: edited on DesktopJoe
Of course, versioning could be disabled (at risk of data loss to the user), or could be enabled as “smart versioning” to only version files when conflicts occur during synchronization. Some may say that versioning is dangerous because it can drastically increase disk usage, but as I mentioned above, the other benefit is that the user gets redundant network backup of files for free. In higher versions of Persistent Folders, things could get more sophisticated by allowing users to turn off versioning even at the file-level, but I don’t think that’d be necessary in the first few releases.
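The conflict rule proposed in this appendix (show the newest version, bury the rest, lose nothing) can be sketched as follows. This is an illustration under stated assumptions, not the proposal's actual design: `FileVersion`, `resolve_conflict`, and the timestamps are hypothetical, and `max_versions` is assumed to count the visible version plus the buried ones.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FileVersion:
    """One saved version of a file (hypothetical model)."""
    label: str          # e.g. "0.2"
    edited_on: str      # machine name, e.g. "MobileJoe"
    modified_at: float  # modification time, seconds since the epoch


def resolve_conflict(candidates: List[FileVersion],
                     history: List[FileVersion],
                     max_versions: int = 5) -> Tuple[FileVersion, List[FileVersion]]:
    """Return (visible version, buried prior versions), newest first."""
    if not candidates:
        raise ValueError("at least one candidate version is required")
    # Newest first across both the conflicting edits and the older history.
    ordered = sorted(candidates + history,
                     key=lambda version: version.modified_at, reverse=True)
    kept = ordered[:max_versions]
    # The newest version stays visible; the others are kept but hidden.
    return kept[0], kept[1:]
```

In Joe's scenario, the later desktop edit becomes the visible file, while the laptop's version 0.2 and the original 0.1 remain available as prior versions he can revert to.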
So, for those not keeping score, Dropbox supports:
Go Dropbox!
I recently did some web design work in collaboration with a graphic designer. She introduced me to what has become my latest favorite piece of CSS code: 960.gs.
960.gs is a CSS grid framework, similar in spirit to Blueprint CSS and YUI Grid. However, 960.gs is at once more minimalist than these approaches, and more thorough.
The author has a detailed blog post explaining his motivations for working on 960.gs, so I won’t rehash each of those. Instead, I’ll just dive into what I liked about it.
Finally, the entire project is hosted on bitbucket and developed in the open. What more could you ask for? This has really simplified my approach to standard CSS designs.
I presented Parse.ly at the NYC Search & Discovery Meetup on Thurs, Oct. 29. The meetup is organized by Otis Gospodnetic (blog), who is one of the authors of Lucene in Action and the author of the upcoming book, Solr in Action. We make heavy use of Lucene and Solr on Parse.ly, so it was exciting to get an opportunity to present to a community of fellow technologists building systems with these excellent technologies.
In addition to running Parse.ly, I also run a small consulting business, Aleph Point, Inc. In the course of working on client jobs, I sometimes have to make business purchases, which I always pay in full at the end of every month. I have never carried a balance on my credit card and I never intend to.
When I signed up for a business checking account at Chase, the branch manager who I worked with (and who now no longer works there) encouraged me to sign up for a business credit card, as well. I thought, hey, why not — I’m just going to use it for small purchases like monthly hosting fees and the like.
Recently, I made a relatively large purchase at Best Buy for a client, which I was going to be reimbursed for. It was about $200. I already had a balance of $350 on my account, and a few days later my account was closing for the month.
When I looked over my account information a few days later, I found a strange charge. $39 OVERLIMIT FEE. What’s that, I thought?
Well, I went to my branch to find out. The branch manager explained that my credit card had a credit limit of $500. Wow. That’s a low credit limit. I explained that I sometimes make reimbursed purchases of more than $500, so this was quite a strange value to pick. Further, I already had multiple other credit cards with much higher limits, so I didn’t get why they would choose such a small limit for me. “Oh,” the branch manager said, “that’s because we’re picking $500 limits for all new business customers.” Hmph, fine… seems strange, but fine. (The engineer in me thinks, “Couldn’t they have done an analysis of my cash flow to figure out a more reasonable limit? Of course not; this is a bank, after all. That’s expecting them to be smart.”)
“OK,” I said. “So I have a $500 limit on the account. But why is there an OVERLIMIT FEE? Shouldn’t it just be declined? What is that about?” She says, “Oh, unfortunately, I can’t explain why that was charged. You’ll have to call the number on the back of the card.” I say, “Really? Why?” She says, “Unfortunately, we can’t discuss credit card fees at the branch.” This strikes me as a very strange policy, probably just put in place as a deterrent for people actually contesting their fees. Very sneaky little bastards, these Chase guys. But fine, for now I’ll follow the policy.
A couple days later, I call up the number on the back of the card. I ask about the fee. He says, “Yes, sir, that is a completely valid fee.” I reply, “Valid? Who cares if it is valid? Any fee you guys put on my account is ‘valid’. The question is, why was it put there? And is it justified?” He replied, “It was put there because you went over your $500 limit.” I asked, “Why did you let me go over the $500 limit? If I have a limit, shouldn’t I get TRANSACTION DECLINED when I go over? Isn’t that the whole point of a limit?”
The guy on the phone laughs. Literally, he laughed at me. “No, Chase provides overlimit protection as a convenience to our customers. So that if, for example, you’re taking a client out to dinner, you won’t be embarrassed by going over your limit.”
I didn’t respond for a couple of seconds, because I was parsing his sentence. “So, this isn’t so much a fee, as much as a convenient service Chase is providing me. You guys are saving my embarrassment for the mere cost of $39. I get it now.”
“Yes, sir.”
Apparently, the guy didn’t get my sarcasm. “Here’s what I think,” I continued. “I think you guys are just ripping me off. There is absolutely no reason not to decline the transaction, except that in allowing the transaction to go through, you now assess a fee. It doesn’t cost you anything for me to go $50 over limit, as I did. And I paid down the account in full through my automatic payment system just a few days later. So, I think you guys just figured — hey, here’s an easy way to make free money off our consumers. Here’s another fee we can invent because we are greedy bastards.”
He seemed a little taken aback, and then said, “No, sir. Chase does not rip off its customers. Chase is here to serve its customers.”
“OK,” I say. “I’ve been your customer for more than a decade. So since you’re in the business of serving customers, I suppose you’ll have no problem removing this fee, which is completely unjustified.”
“Unfortunately, sir,” he responds, “I cannot remove the fee.”
“What do you mean? Of course you can remove it,” I say, incredulous. “Just pull up my account and press delete on the line that says, OVERLIMIT FEE.”
“No, sir, I cannot. It is a valid fee. Valid fees cannot be removed.”
“There’s that word again. Here is another example of a valid fee: $500 for transferring money between my checking and savings account on a Wednesday. If you guys had that policy, that would be a valid fee.”
“We don’t have that policy.”
“Listen,” I’m revving now. “It would be one thing if you guys charged $1 for going over my credit limit. But you charge $39. You know how you guys came up with the number $39?”
“This fee is set by the executives of the banks.”
“Wow, I wish I had just recorded that to send to Congress! No, really –”, I interrupt. “I don’t care who came up with the number. But I’ll tell you how $39 was picked. It was picked because after rigorous market testing, they found that if the fee were $40, people would grab their kitchen knife, run out in the street, and kill every fucking Chase banker in sight.”
He was silent.
“So, the executives decided — better make it $39.”
He didn’t laugh this time. No sense of humor. “Listen, just let me speak to a supervisor to get this issue resolved.”
He insists, “Sir, no supervisor can remove this fee. The executives mandate that we cannot remove it.”
“Who can I contact to appeal the fee, AND the policy that does not allow it to be removed, AND the appointment of the executive who instituted it?”
“You can write a letter to business services in Delaware,” the representative says.
Frustrated, I end with a rant. It was long-winded and I won’t produce the whole thing here. It talked about my prior blog post, which pointed out how Chase ran an insecure document exchange system. It talked about how there are currently 21 Chase customers who are “mad as hell and not gonna take it anymore”, ranting on my site about how crappy Chase is. And I discussed how even after contacting Chase about this issue, even after pointing them to a significant problem with numerous angry customers, they do nothing. Just like they are doing nothing now. A company that is like a black hole.
“And as a systems engineer,” I conclude, “the fact that you guys can’t remove this fee just makes me so god-damn depressed, I can’t even express it to you. We have managed to put a man on the moon, but JPMorgan Chase bank cannot hit delete on a fucking spreadsheet. How pathetic is that?”
I resignedly ask for the complaint address. I write it down. And I end the conversation with, “Listen, I know you’re just doing your job. But I want to make it clear to you that your company does not deserve a single god-damn dime of my money. Neither the money in my personal checking account, nor in my business checking account, nor any of the money my representatives awarded you in that $25 billion bailout that saved your company’s greedy ass. It’s not your fault — it’s your company’s fault. But for the love of everything good and just in this world, man, why the hell do you work for these clowns?”
That bit of humanity connected with him, just a little. I could hear it in his voice. “I hope things turn around for you with Chase. I really do. But there’s nothing I can do for you at this time.”
I hope things turn around, too. In the meanwhile, I’ll sharpen my kitchen knife.
Mike Singleton of FourSquare recently wrote a blog post entitled, “Algorithms as a Service”:
I think there’s a market opportunity to create an AAS (algorithms as a service) company which provides simple APIs to implementations of common algorithms… Algorithms as a service would give you development efficiency, problem scalability (access to CPU farms), and confidence in the results.
Andrew chimed in with this:
I think what you’ve identified is that some APIs are about getting data into and out of an existing system that sort of lives on its own — e.g., Twitter’s, FourSquare’s, Flickr’s.
Then, other APIs are about abstracting certain problems and simplifying them to a simple API call. These are “algorithms as a service”.
So, in this category I put things like OpenCalais.com (entity extraction algorithms) and SimpleGeo.com (geolocation algorithms). I also put my own startup, Parse.ly, in this category; see http://parse.ly/p3 and http://parse.ly/api. For Parse.ly, what we’re doing is simplifying the following painful steps:
1) parsing and cleaning RSS/Atom feeds and other content sources in near-real-time
2) building personalized “resonance profiles” for different users that can be trained and queried
3) delivering personalized recommendations (Amazon/Netflix-style) of content to users, that can be listed, searched, and filtered
Our whole value proposition is that, yes, you could build algorithms to do personalized recommendations yourself and in-house, but it’s hard. There’s a lot of infrastructure that goes along with it. Your engineering team will spend months, not days, getting it right. So, why not just plug into our nice API instead?
I don’t think it needs a new name — it’s just an evolution of APIs and SaaS given the growing needs of developers to build more complex, dynamic applications and their increasing willingness to license best-of-breed 3rd-party platforms to do so.
January was an exciting month for Parse.ly. At the end of 2009, we were heads-down, polishing our own “algorithms-as-a-service” offering. We aligned our development around a public launch of it at the SIIA Information Industry Summit in NYC, where we were invited to present. Sachin gave a great presentation; here’s what one blogger had to say about it:
Parse.ly, a semantic tool that recommends content, steers users towards content towards personalization and recommendation through their licensed content. When and how [do] personalization really happen? [...] Parse.ly collects a little personal interest information from users, “listens” to their content habits and provides recommendations that can be embedded in any number of content applications. Market segmentation data and other demographics fall out of this information naturally. Parse.ly is available to publishers now for integration via their new P3 platform.
At the same time as launching the Parse.ly Publisher Platform (P3), we also put our API docs online and made it possible for you to get an API key. Then, we started conversations with some great brands in online / digital publishing (household names, even) about using our platform. These conversations have been going really well, almost too well! These companies know how much more valuable their online properties would be if they were built around engaging, personalized recommendations in the Amazon/Netflix style. And they have a lot of ideas about how to use the data and recommendations P3 will give them. We’ve already started to mock up new user interfaces for our API to make the integration with publishers as smooth as possible.
We’re excited for this new direction for Parse.ly. We agree with Mike that there are opportunities all around us to simplify algorithmically-tough problems to simple and highly-usable APIs. This will not only make web developers more productive, but it will also make the websites we use daily more useful and powerful!
Our good friends at HiiDef just launched a new app that has been in beta for a while, Flavors.me. This is an excellent tool that has a great, simple, and usable design.
What’s the value proposition of Flavors.me? It’s to unify your various “online identities” into a single, dynamic, automatically-updated, and elegant website.
From the article:
Flavors.me lets you take all that information and put it together in a single website to serve as your “online identity”. All your publicly shared information, aggregated in one place, and displayed beautifully. [...] It’s this kind of simplicity, design sense, and user-centric approach that makes me love the web as a place to develop, deploy, and use software.
Sorry for the lack of posts recently, but we’ve been busy changing and improving Parse.ly for the better!
We did, though, get picked up by a couple of popular blogs in the past few weeks. Here are a few snippets from both ReadWriteWeb and ZDNet.
Bloggers, muckrakers and news fanatics, lend me your ears. It’s entirely possible that we’ve discovered one of the best approaches to media monitoring since RSS itself. My mother always said, “You’ll never get what you want unless you ask.” But with adaptive feed application Parse.ly, that simply isn’t true. Rather than forcing us to abandon our overflowing feed readers, Parse.ly records our preferences and learns to work with us.
I haven’t figured out a way to manage Google Reader. I tried using Fever, but it doesn’t find news that matters to me… and it cost $30. Techmeme is my home page, but I think it needs an upgrade. I would like a feed reader that saves favorite feeds for me, and finds other content that is similar and interesting.
A new product called Parse.ly caught my eye that makes content discovery a painless process.
Check out our press page for more articles written about Parse.ly. We’ll update you soon about what we have in store for the future!
Hi Parse.ly fans. Andrew here. I just wanted to let you know that I presented Parse.ly at the NYC Search & Discovery Meetup on Thurs, Oct. 29. The meetup is organized by Otis Gospodnetic (blog), who is one of the authors of Lucene in Action and the author of the forthcoming Solr in Action book. It was graciously hosted at kgbweb (thanks for making that happen, Joe West!).
We make heavy use of Lucene and Solr on Parse.ly, so it was exciting to get an opportunity to present to a community of fellow technologists building systems with these excellent technologies.
Here is the abstract from the talk:
Parse.ly: Inside a modern RIA built with Solr
Andrew Montalenti
—
Parse.ly is a rich, adaptive web application that discovers your unique interests to filter and prioritize content from countless news and blog sources on the web. This talk will introduce Parse.ly with a quick demo and then delve right into how the Parse.ly engineering team makes use of the Solr open source search engine. This will include discussion of initial design mistakes that were later revised and “real world issues” that were overcome in scaling a system that currently processes millions of articles per week. Finally, we will discuss the existing Solr and Python landscape, and how we at Parse.ly aim to help the Solr community with the open source release of high-quality, Pythonic components for doing common Solr tasks.
Otis has written about the talk, and the slides are online, as well. Special thanks to my kickass Parse.ly colleague Didier for setting up our BitBucket repository and starting to tease the code out that is ready for the community.
Thanks also to everyone who attended, and if you have any questions about it, feel free to contact us.
Hi Parse.ly users,
Around 12:16am this morning, Parse.ly’s main database server that powers the Parse.ly news reading interface went down. Unfortunately, our system administrator is on vacation in Greece at the moment — or, should we say, fortunately for him!
I’ve successfully rolled our backups to the failover Parse.ly database server. However, since our last complete backup happened before the start of the weekend, approximately 5% of our total users may be affected by some strange behavior until we can recover the data off the original server that went down today.
ALL USERS will notice a “gap” in their articles for the weekend days, since any articles our system processed over the weekend haven’t yet been reflected in this restored database. Tomorrow, we are going to work hard to recover that last bit of data so that these 5% of directly affected users no longer have the strange experience, and so that the rest have no other issues. But if you notice anything strange that isn’t covered in this blog post, please do contact us to let us know.
Finally, the engineering team here at Parse.ly apologizes for any interruption of service you may have experienced. We are a young service, and though we are careful to ensure our systems are backed up and that we have failover servers available for each of our production servers, we do not yet have automated failover (aka high availability) built into our free web application. Tonight’s experience is one of our first lessons about how useful it would be to build out this system. We were fortunate in that today’s outage happened at a relatively off-peak hour, but there is no guarantee this will be the case in the future. Once our sysadmin is back in the States, we’re going to work on making Parse.ly’s disaster recovery more robust, so that users don’t have to suffer the downtime they did tonight. Thanks for bearing with us! And we’ll update you on this blog once we’re totally done with the remaining data recovery tasks.
Sorry Parse.ly users, but one of our servers unexpectedly failed on us. As a result, we took Parse.ly down. We’re working hard now to roll our backups over to our failsafe and get Parse.ly back up and running. We should be back to normal in a few hours. We’ll update this post when we’re all good to go.
Andrew’s a lifetime New Yorker. He grew up in the city and Long Island (and now lives in Astoria, Queens), and has always been a fan of New York. I’ve been in NYC for the past eight years bouncing between Manhattan and Brooklyn (I just moved to Ft. Greene this past weekend), and have grown to love the city.
Well, this past week we had the opportunity, and the privilege, to present Parse.ly (our baby) at NYC Demo Day, an event focused exclusively on early-stage New York City startups. You can check out some photos of Andrew and me presenting here.
The event was a blast, as evidenced by Andrew’s inexplicably happy face after it was over (see the photos to see what I mean). Good luck to all the companies: Renthop, Trendsta, SECWatch, LegalRiver, Sensobi, Localytics, Seatgeek, Postling … and, of course, Parse.ly!
If you were trying to log into Parse.ly between 11pm and 1am this Sunday, you may have noticed that it was intermittently down for maintenance. Over the last several weeks, we’ve been working hard to roll out some new features, polish some rough edges, and improve our infrastructure after our launch last month. Our first beta users have been amazing in providing us with detailed and specific feedback on what works and doesn’t work well within Parse.ly. We’ve diligently addressed many of the issues raised by these users and rolled out a new version of Parse.ly this weekend.
So, what’s new in Parse.ly?
We also made some improvements to our Interest Setup Wizard, but this will mostly affect users who first sign up for the system. (You did remember to invite your friends to join Parse.ly, right?) These include:
We are also working on some big changes within the Parse.ly engineering team to take our product to the next level. We have partnered with our excellent hosting company, The New York NOC (NYNoc), to scale out our infrastructure and process more content than ever before. We are also planning our future iterations where we hope to innovate and deliver more features to save you time and let you discover the content that best matches your interests.
A Personal Note from Andrew
As the lead developer for Parse.ly, I just want to say “Thanks!” to all our awesome users. The thing that has impressed me most about Parse.ly’s users so far is how detailed, intelligent and thoughtful their feedback has been. We want our company to be driven by your feedback, so please, do not hesitate to provide more of it on our Cog Tree Get Satisfaction Page, on Twitter, or directly via e-mail.
These last few weeks remind me of something Jim Young, the founder of HotOrNot.com, said to us at DreamIt this summer: “when the site became successful, we were left having to figure out how to change the engine while the car was still running.” It’s going to be a lot of work evolving Parse.ly even while providing the service to our existing users, but I’m looking forward to the challenge.
Thanks again to all our users for their feedback so far, and I hope you enjoy the new version of Parse.ly! (Also, don’t forget to report any bugs or tell us what is working well for you.) And if you haven’t signed up for Parse.ly yet, what are you waiting for? Do it now at http://parse.ly
We’re rolling out a new version of Parse.ly this weekend, so the service may be up and down intermittently while that happens. We’ll update our blog once the new version is rolled out to let everyone know what has changed. Stay tuned!
Update (9/16/2009): after some additional testing, we’ve decided to delay the production rollout of the new version of Parse.ly until later this week. As promised, we’ll write up a full blog post about it once it’s online.
Check out our feature on Thrillist! Here’s a little snippet:
Nothing beats having someone else do all the legwork, like your sister inviting her sorority friends home for the holidays, or accepting Michael Flatley’s lifetime achievement award for him, and keeping it. Get all the online content you’re seeking sans effort, with Parse.ly.
Don’t forget to check out Thrillist Philadelphia! And if you’re just signing up for http://parse.ly, read our feature and use the promo code “thrillist” to jump to the top of the beta accounts we’re handing out!
Who has one of the best teams on the web? @parsely does: http://t.co/p06UsfQq (yes, even though I need a shave)
Great tutorial on Clojure for non-lisp programmers here: http://t.co/C2cqDo0a (via @emmett9001)
These @parsely team members are all new in 2012: http://t.co/7uHZ2pb7 -- new team page uploaded at http://t.co/aEyLAcsd
mrjob is adding support for pig scripts, still in review stage but almost in
Some of the most epic moments from the time at @parsely office http://t.co/Xq53gqxA
Great piece on the viability of fully distributed teams: http://t.co/gc0uDv13 (by @amontalenti).