Starting Points in Distributed Systems, Pt. III: Epicyclic Explorations

This is the third and final part of a three-part miniseries about building a conceptual foundation in distributed systems through independent study. In this series, I sketch out the map that I wish I’d had when I started studying last year, drawing from my own experience and currently available resources. My hope is that this informal guide will assist fellow students on their own journeys.

Epicyclic Explorations

Once you’ve made it this far, you should have at least a broad sense of the complexity and richness of distributed systems as a field of research and practice, and a basic ability to locate yourself and navigate within those spaces. From here, your options open up and things really start to get interesting.

I’ve tried to organize my own studies from this point in terms of deep dives into either a particular concept or system. In my experience, the closer I come to grokking a concept, the less conceptual or topical divisions make sense and the more interconnected everything appears. That said, as an initial approach, this divide-and-conquer strategy has worked well for me.

The basic procedure is to begin with a foundational exposition or two, and then follow connections to other work as either your curiosity leads or comprehension demands. Sometimes this will mean following explicit citations or hyperlinks. Other times, it will mean trying to better understand a key term (e.g., linearizability) or context (e.g., Google’s architecture at a particular point) in a given discussion by independently searching for more information.

At the risk of both mixing and straining metaphors, I think of this kind of reading a bit like tracing a variety of nearer and more distant orbits around your starting document-sun, jumping from paper-planet to paper-planet, but always circling back and changing your viewing perspectives in the process. The mental pathways might resemble a rough sketch of Ptolemaic epicycles or even epicyclic gearing. You’re drafting and re-drafting a model of a system of knowledge.

Two essential aspects of this activity are the cyclical movement and ongoing addition of new information and perspectives. The goal is not to totally understand everything you read the first time, but to keep returning with a deeper understanding, or, to borrow a phrase from one of my own best teachers, to be “confused at a higher level.”

I do realize that that was very abstract. Let’s look at some concrete examples of how you could begin a concept-focused exploration and a system-focused exploration. For the first case, we’ll take a look at the ubiquitous but still frequently misunderstood CAP theorem. For the second, let’s look at Riak as a Dynamo system.

Example start for a concept- or topic-focused exploration: the CAP theorem

“CAP,” as Michael Bernstein reminds us, “is an acronym, which is a super easy thing to make sh*t up about.” Let’s try to avoid that temptation through education.

Either Henry Robinson’s FAQ or Mikito Takada’s treatment, both cited in an earlier post, would make a good starting point.

You could then proceed directly from the latter to Seth Gilbert and Nancy Lynch’s 2012 paper, “Perspectives on the CAP theorem”, and/or to Eric Brewer’s own “CAP Twelve Years Later: How the ‘Rules’ Have Changed”.

From there, you might…

  • …go to the proof itself: Seth Gilbert and Nancy Lynch’s “Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services.”
  • …notice that both the “Perspectives on the CAP theorem” and “Twelve Years Later” pieces initially appeared in the same 2012 issue of IEEE Computer, which, in fact, presents a CAP retrospective. A little digging might lead you to Daniel Abadi’s blog post on that issue and, from there, his work on PACELC.
  • …better contextualize CAP consistency (i.e., “atomic,” or “linearizable” consistency) by reading more about consistency models in Doug Terry’s “Replicated Data Consistency Explained Through Baseball.”
  • …learn more about how systems respond to network partitions by watching and/or reading Kyle Kingsbury’s Jepsen series of talks and blog posts. A brief, more formal presentation of the first part of the Jepsen project is also available here.
  • …or follow any other path as your interest leads.

Notice that any of these options would help to contextualize the others and help improve your mental model of the distributed systems space. Also notice that any of these choices would present further connections to follow and open up still further perspectives. For instance, to follow up on CAP consistency, you might go read the original paper that defined linearizability, and that paper in turn might be a starting point for a new exploration.

As an exercise, try reading either Gilbert and Lynch’s “Perspectives on the CAP theorem” or Brewer’s “Twelve Years Later” a second time after reading three or four other articles, or even a good blog series, and see how much more you can get out of a second reading. Now we’re confused at a higher level!

Example start for a system-focused exploration: Riak as a Dynamo System

In this type of exploration, you’d pick a system of interest to you and work through the most relevant formal description(s) and the best documentation for at least one implementation. If possible, you’d do some work with the implementation yourself as part of the learning process. I’ve found this type of exploration invaluable in building an understanding of how algorithms and concepts come together in real systems.

We’ll take Riak as an example here. Riak is a highly available, eventually consistent distributed database descended from the influential Dynamo paper. Among Dynamo systems, Riak is a good choice for an independent study due to the relatively clear lines of development from Dynamo to Riak, the especially high quality of Basho’s documentation, and the open-source availability of the software.

In the context of these “starting points” posts, Riak is also a good choice since Mikito Takada’s fifth chapter provides an introduction to Dynamo as well as the CRDT research supporting the new data types soon to be available in Riak 2.0.

Here, you might start historically, with the Dynamo paper itself:

Giuseppe DeCandia et al., “Dynamo: Amazon’s Highly Available Key-Value Store,” in Proceedings of the 21st ACM Symposium on Operating Systems Principles, Stevenson, WA, October 2007.

You might also start with Eric Redmond’s excellent Little Riak Book.

After those, I’d move to Basho’s Riak documentation, with special attention to the “Theory and Concepts” section.

I’d cap off my start by setting up a Riak cluster and client on my local machine.

Once you’ve made it this far, you’d be well positioned to either branch off into papers on topics like consistent hashing and CRDTs with a deeper sense of context, or move to a comparative study of another system. You could also dig into the Riak code itself.
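
To give a taste of one of those topics, here is a minimal sketch of consistent hashing in Python. It is an illustration under stated assumptions, not Riak’s or Dynamo’s actual implementation: the `HashRing` class, its method names, the use of SHA-1, and the default of 64 virtual nodes per physical node are all hypothetical choices made for this example.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Map a string to an integer ring position (SHA-1 is an illustrative choice)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes      # virtual nodes per physical node (assumed value)
        self._positions = []      # sorted ring positions
        self._owners = {}         # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            pos = _position(f"{node}#{i}")
            self._owners[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node):
        """Drop every point that belongs to this node."""
        self._positions = [p for p in self._positions if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first point clockwise from the key."""
        if not self._positions:
            raise ValueError("ring is empty")
        idx = bisect.bisect_right(self._positions, _position(key))
        return self._owners[self._positions[idx % len(self._positions)]]
```

The property worth noticing is that adding a node only reassigns keys to the new node; every other key stays where it was. That is the behavior the papers on consistent hashing set out to achieve, and it is easy to check by comparing `lookup` results before and after calling `add`.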

Series Conclusion

“We have not succeeded in answering all our problems—indeed we sometimes feel we have not completely answered any of them. The answers we have found have only served to raise a whole set of new questions. In some ways we feel that we are as confused as ever, but we think we are confused on a higher level, and about more important things.”
Earl C. Kelley, The Workshop Way of Learning (1951)

In these last few posts, I’ve described the primary processes through which I’ve built my own conceptual foundation in distributed systems. I know that I still have a lot to learn. While I certainly don’t claim to have mastered this material yet, I will claim to have achieved confusion at a higher level. The resources and model I’ve described here have helped me advance. I’ve taken the time to share them in the hope that they can help other independent learners too.

I encourage anyone reading this to do your own exploring and let one topic lead to another. Keep reading and thinking. You can source topics, systems, and articles from Mikito Takada’s book, Chris Meiklejohn’s “Readings in Distributed Systems” list or Think Distributed podcasts, and groups like the Distributed Systems Reading Group at MIT.

As a final exhortation, don’t forget the human element! Note the people writing the blog posts, books, articles, and code you read and follow them on Twitter. Join a mailing list or two. See if you can attend a relevant conference or meet-up and make some contacts in person. Reach out to people in general. I’ve personally found that most people in the distributed systems community can be quite kind and helpful to newcomers, especially if you’ve bothered to do some homework ahead of time. Distributed systems are hard, but you might be surprised by how many people out there are willing to help you find your way.

Starting Points in Distributed Systems, Pt. II: Orientation and a Survey

This is the second part of a three-part miniseries about building a conceptual foundation in distributed systems through independent study. In this series, I sketch out the map that I wish I’d had when I started studying last year, drawing from my own experience and currently available resources. My hope is that this informal guide will assist fellow students on their own journeys.

Orientation

To begin the next part of your study, I’d start with some definitions, commonly cited reference points, and a bit of history. This might seem like a lot to start with, but each of the following pieces is fairly brief and accessible.

For definitions, you might simply start with the Wikipedia entry on “Distributed Computing” and follow links from there as interest or comprehension demands. Note that the scope of distributed computing as a topic exceeds the scope of these “Starting Points” posts, as well as most resources that you’ll probably use in your self-education.

Continue building your working definition of a distributed system with the first chapter of Mikito Takada’s book, Distributed Systems for Fun and Profit. We’ll return to this book later.

To prime your thinking as you go deeper into this space, familiarize yourself with the “Fallacies of Distributed Computing” as Takada suggests in his book, and check out a few different treatments like those by Arnon Rotem-Gal-Oz (“Fallacies of Distributed Computing Explained”) or Brian Doll (“The Fallacies of Distributed Computing Reborn: The Cloud Era”).

Read Jeff Hodges’ emerging classic, “Notes on Distributed Systems for Young Bloods.”

Watch and/or read Michael Bernstein’s “Distributed Systems Archaeology” to help gain an appreciation for the history of the field, as well as inspiration for taking on your own extended reading project.

Surveying the Landscape

At this point, you should be sufficiently acclimatized to the definition, practice, and history of distributed computing to really benefit from a survey of the landscape. This next step will give you a framework with which to relate your subsequent, deeper trips into more specific topics and studies.

Here, I highly recommend reading through the entirety of Mikito Takada’s Distributed Systems for Fun and Profit. This is the integrative, overarching survey that I most wish I’d had when starting out. I strongly recommend reading (and re-reading) this book. Takada has many excellent citations and links to follow, but before diving into too many of those, I’d try to get a broad view by reading the book once through.

If, like me, you’re particularly interested in distributed databases, you might also like to read Pramod J. Sadalage and Martin Fowler’s helpful book, NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence (ISBN 978-0321826626). Like Takada’s work, this is a high-level survey that can help you organize and relate the next phases of your learning. Try not to get caught up in the often-unhelpful “NoSQL” label and be sure to remember that a system composed of clients communicating with even a single database server over a network is still a distributed system, whether that database is relational or not.

In the next post, I introduce more resources for continuing your journey.

Starting Points in Distributed Systems, Pt. I: Background

This is the first part of a three-part miniseries about building a conceptual foundation in distributed systems through independent study. In this series, I sketch out the map that I wish I’d had when I started studying last year, drawing from my own experience and currently available resources. My hope is that this informal guide will assist fellow students on their own journeys.

Background: my experience

During the second half of 2013 I undertook an independent study of distributed computer systems, with a particular focus on distributed databases. Like many others–or so I imagine–I started with a few conversations with a knowledgeable friend and a few citations to follow: Leslie Lamport’s “Time, Clocks, and the Ordering of Events in a Distributed System”, the Dynamo paper, and a recommended FAQ or two about something called the CAP theorem. Over the following months, I expanded my knowledge by branching outward from those starting points. I built a conceptual foundation in the field primarily through a process of reading papers, articles, blog posts, and product documentation, and by following the citation chains between sources. In addition to internal citations, I also found more personal direction through helpful fellow attendees of conferences and meet-ups, and more programmatic direction through university course websites and reading lists.

Although my academic training in another field prepared me to approach a new field by tracing webs of citations and joining networks of students, researchers, and teachers, I still found this to be an especially challenging process. In particular, I often felt a need for more integrative and intermediate resources for students. By volume, most of the resources on distributed systems that I found fell into one of two categories: i) documentation and marketing presentations of a particular software product or service, or ii) formal papers by academic researchers and/or industry practitioners. Working with resources of the first type, I had difficulty identifying broader contexts for particular systems, their relationships and interconnections, and their conceptual foundations. In papers of the second type, I struggled with unfamiliar formalisms and the high level of assumed background in computer science and mathematics. I found myself wanting an overarching structure to help integrate concepts and approaches across systems.

Since I started learning in the spring of 2013, new resources have been created to help students, both by simplifying the task of assembling and navigating the relevant literature, and by offering an accessible introduction to distributed systems as a field. To really engage the distributed systems literature, you’ll still need to venture into the deep, dark forest of formal papers for and by specialists, but a decent map or two can dramatically facilitate your progress.

In that spirit, the rest of this series describes where I would tell a friend–or my past self–to start today. I begin here, with a few prerequisites, continue next time with an orientation in, and survey of, part of the distributed systems landscape, and conclude with a third post, in which I sketch out an exploratory method for continuing the journey.

Prerequisites, or a prelude

This may go without saying, but to make substantial progress towards understanding distributed systems, you’ll need at least a working knowledge of computer programming, systems, and networking.

For help building that knowledge in computer networking, I can personally recommend the Computer Networks course on Coursera offered by the University of Washington’s David Wetherall, Arvind Krishnamurthy, and John Zahorjan. This free course is taught “at the upper-undergraduate level.” I found the lectures, problem sets, and exams to be challenging but accessible. Since this course focuses in large part on the Internet and web applications, you’ll actually be learning quite a bit about distributed systems and some key algorithms already. In that sense, you might think of this, or a similar course, as a prelude to your more extended study.

If you push through to formal computer science papers in later parts of your studies, you’ll also want to be able to read and think with formal proofs and descriptions of algorithms. When I started, I wasn’t anywhere near the level I needed to be at to work through papers by researchers like Nancy Lynch and Leslie Lamport. I found that I could learn quite a bit from the introductory and concluding sections of papers, but I needed to do some substantial catch-up work in mathematics to be able to (start to) follow the meatier sections.

If you’re in a similar position, I can personally recommend Daniel J. Velleman’s How to Prove It: A Structured Approach (2nd ed., ISBN: 978-0521675994) on proofs and Tom Jenkyns and Ben Stephenson’s Fundamentals of Discrete Math for Computer Science: A Problem-Solving Primer (ISBN: 978-1447140689) on algorithms. Unlike the overwhelming majority of resources I’ll be mentioning in these posts, these aren’t free, but they are relatively inexpensive paperbacks that I consider to have been excellent investments.

In the next post, we’ll start to better orient ourselves in the distributed systems landscape and take on a survey.

Current Projects (January 2014)

By way of a brief update, here are my current main projects for the new year:

  • I’ve invested in and have started working through Peter Van Roy and Seif Haridi’s textbook Concepts, Techniques, and Models of Computer Programming, a.k.a. CTM. As Professor Van Roy explains here, this book “brings the computer science student a comprehensive and up-to-date presentation of all major programming concepts, techniques, and paradigms in a unified framework.”
  • I’ve started learning functional programming in Clojure through a combination of Aphyr’s “Clojure from the Ground Up” posts and Chas Emerick, Brian Carper, and Christophe Grand’s book Clojure Programming: Practical Lisp for the Java World.
  • As a project to tie in with both my continuing computer science education and Clojure study, I am working on implementing a series of common data structures using Clojure. I will be pushing my work to a GitHub repo as I progress.

On (re-)learning to type

I’ve been using a QWERTY keyboard more or less daily for decades, but I didn’t get serious about typing well until March. (Re-)learning this basic skill has led me to reflect on the process of learning itself, and the importance of admitting and remedying gaps in my knowledge and abilities, however basic or embarrassing they might seem. I’ve come to locate a truer embarrassment in letting ego or fear hold us back from confronting and overcoming our own ignorance.

I came to practice typing technique this year, when I realized just how poor my abilities were. Learning to program showed me just how much of an impediment my idiosyncratic technique was to my ability to write and edit text quickly and accurately. Technical deficiencies that hadn’t noticeably held me back from years of writing became painfully obvious in the face of unfamiliar character sequences and combinations. Just as an example, I think I needed to type several years’ worth of {}s, =s, and #s in the first few weeks of programming basics alone. As I write and revise this now in VIM, I remember just how awkward the command combinations felt at first. The time had come to finally learn proper typing technique.

This deficiency was totally my responsibility. As a child, I was lucky enough to be encouraged, and even required, to learn and practice proper typing technique, but, at that time, I just didn’t put in the effort to master the skill and keep it up. As a student in high school, college, and even graduate school, I never reached a point where I needed to have better technique than I already did. I became faster and more accurate haphazardly, in unstable and contextually-dependent patterns. It now would be my responsibility to retrace my steps, go back to the basics, and get it right.

I started working through the lessons on typingweb.com from the very beginning, cycling back through the beginning and intermediate sections over and over. To paraphrase Master Yoda, I had to unlearn what I had learned. I slowed down significantly before I started speeding up. By the end of March, I had started to plateau around 65 words per minute with 98-99% accuracy. Since then, progress has been steady but slower, and each small improvement has taken more work to realize. I am now working towards bridging the gap between 75 and 80 wpm, with the goal of breaking the 80 wpm barrier. I can feel the muscle memory building. I catch my fingers moving to the keys far faster than I track them as a conscious mental process. I can feel the movements becoming natural.

Reflecting on the process and practice it took to reach this point, I’m struck by the wider resonances of this process with my larger project of learning and re-invention. On a small scale–small enough for me to notice and feel it daily–(re-)learning to type has been like learning programming or mathematics over the same period. I’ve needed a similar attitude towards admitting my own deficiencies, going (back) to basics, and frequent practice. In each case, I couldn’t let ego, and/or a fear of exposing my own ignorance, get in the way of learning. I couldn’t allow myself to cover a lack of knowledge with a thin veneer of competence and miss out on the foundations.

In my experience as a researcher, student, and educator, I’ve seen the opposite approach over and over. From time to time, I’ve been tempted by it myself. This approach is the way of the student who, afraid of appearing ignorant, holds back a question, and in so doing, loses the chance to learn. It is the way of the scholar who assumes an analytical concept uncritically, and without really knowing what it means to zir. It is the way of the person who excuses zir inability, however accurately, as the fault of poor teachers or circumstances, and chooses not to act to remedy it later in life. It is also the way of the person who chooses not to learn now because such knowledge would be “kids’ stuff.” These attitudes impede our learning by leading us to perform competence rather than building its underlying substance.

As a student of programming and software development, I don’t want to fall into that trap. I don’t want to fall prey to fears and insecurities and miss out on the chance to learn. I don’t want to skip or b.s. my way through foundational skills, even those as basic as typing well. With experience, I’ve learned to consider it far worse to remain ignorant than to admit my deficiencies, however potentially embarrassing, and overcome them. I’m reminded of this as I type.

A busy July

My journey has continued at an accelerated pace this July.

As an update on my progress, here’s a little about what I’ve been working on in the last few weeks.

Programming

At present, my short-term challenge goal is to contribute additional tests and data types to Aphyr’s meangirls library of CRDTs (Convergent Replicated Data Types). I am working to meet this challenge by solidifying the conceptual and practical prerequisites I will need. Conceptually, I have been learning more about CRDTs through studying some of the key research and mathematics behind them. Practically, I have been improving my Ruby skills through developing less complex projects.
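
For readers who haven’t met CRDTs yet, a grow-only counter (often called a G-Counter) is one of the simplest examples. The sketch below is in Python for illustration only and is not the meangirls API; the `GCounter` class and its method names are hypothetical. Each replica increments only its own slot, and merging takes the per-replica maximum, so replicas converge no matter the order in which they exchange state.

```python
class GCounter:
    """Hypothetical grow-only counter: one non-decreasing count per replica."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})   # replica id -> count

    def increment(self, replica_id, amount=1):
        """Record `amount` new increments observed at `replica_id`."""
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self):
        """The counter's value is the sum over all replicas."""
        return sum(self.counts.values())

    def merge(self, other):
        """Return a new counter holding the per-replica maximum of both states."""
        merged = dict(self.counts)
        for replica_id, count in other.counts.items():
            merged[replica_id] = max(merged.get(replica_id, 0), count)
        return GCounter(merged)
```

Because `max` is commutative, associative, and idempotent, so is `merge`, which is the mathematical property that makes this kind of data type converge.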

My current practice project is a command-line implementation of a hangman game, which will feature inter-operating files and classes, tests, and documentation. I chose to assign myself a simpler problem so I can focus more directly on learning the mechanics of structuring projects and writing tests in the context of a problem that I can easily reason about. I’ve been discovering just how important that choice was as I investigate and correct unexpected results from processes that had previously seemed straightforward. As part of this exercise, I am also making the transition to writing with the VIM text editor.

I am still working on this project now, but I intend to publish a version of it or a similar project on GitHub soon, as well as sharing some of my more interesting work from Project Euler thus far.

Mathematics

I am also continuing my study of discrete mathematics. Most recently, I’ve been working on Tom Jenkyns and Ben Stephenson’s Fundamentals of Discrete Math for Computer Science: A Problem-Solving Primer (ISBN: 978-1447140689). I highly recommend it to other independent learners.

Computer Networks

In addition, I have branched out into learning more about the core technologies and history of the Internet through University of Washington and Michigan courses on Coursera. I believe that investing the time to learn about network engineering now will help me to communicate more effectively with specialists in future software jobs. Similarly, learning more about the history of computing will enable me to ask more incisive questions of current contacts and future colleagues.

I’ve been doing this all part-time while finishing my last university job. It has been a very busy July indeed! This weekend, I’m planning to take a little time off and visit my family. I am so thankful for their patience and support in this time of transition, and I am really looking forward to relaxing with them for a few days before diving back in.

Building a Foundation in Programming

How does a novice start building a credible foundation in programming without the benefit of a formal curriculum or training? This post offers one possible answer and some recommended resources from the perspective of a current student.

Introduction: My Story

I started studying programming part-time in January 2013 with no prior experience. After several months of exploring programming as one of several possibilities leading towards a new career path, I decided to pursue a career in the software industry with a focus on development for the web. In order to overcome my lack of both a formal background in computer science and prior work history in an immediately related field, land a job, and do it well over the long term, I knew that it would be vitally important for me to invest serious, sustained effort in building a strong practical and conceptual foundation. Although I knew that in a year or so of part-time independent study I could not expect to match the depth of knowledge and experience gained by formally-trained computer science majors, I was also convinced that I could at least ground myself more firmly than many other self-taught programmers. So, despite temporal and financial pressure to move as quickly as possible to building sample projects with popular libraries and web frameworks, I’ve focused most of my attention so far on building a deeper understanding of the underlying concepts and languages upon which current and future software development will depend. For me, these foundations most definitely include significant forays into mathematics, databases, and networking, which I will leave to other posts.

Today, as I am starting to move towards developing my own first projects for public release, I want to pause to reflect and share part of what I’ve already learned and how I went about learning it. I know that there are many other people interested in learning some programming skills without the benefit of formal classes or the ability to study full-time, and that a fellow learner’s perspective can be a big help in pointing the way, or at least in offering some encouragement. To that end, this post offers references to resources that I have found particularly helpful in building the start of my own foundation in programming with a focus on web applications, beginning with conceptual basics, and moving on to more specific language learning at the beginning-to-intermediate level in Ruby, JavaScript, HTML, and CSS.

Please note that I am offering this post as a contribution back to the larger community of students and professionals whose work has helped me to learn, rather than as a “how-to guide” or curriculum for a transition I’ve yet to complete. I similarly make no claims to exclusivity or comprehensiveness, either as a description of my own learning, or as an account of helpful resources or necessary concepts. There are surely other good resources that I do not mention here and other approaches to learning that could work both for me and for others.

1. Basic Concepts and Exposure

I had to start at the very beginning. I needed a basic, working concept of both software development and computer science as fields, and I needed to learn to recognize and use the basic building blocks of both programming and markup languages in a manageable, controlled way. I needed exposure. Luckily for me, all of this is possible with nothing more than Internet access and a modern browser. In fact, thanks to free, self-paced courses and tutorials available online and the availability of tutorial programming environments in web browsers, it has probably never been easier to gain the familiarity I needed to get started.

Resources for Getting Started:

If you are also interested in starting from the very beginning, I can recommend the following free resources from my own experience.

A. For background and context:

Nick Parlante of Stanford’s Computer Science 101 Course on Coursera. This is a MOOC (Massive Open Online Course) available as a “self study” without waiting for a new session to be offered. I found the “self study” option, in which all of the course materials are immediately available, to be particularly helpful for picking up an accessible introduction to topics like information storage in bits and bytes or Ethernet. There are also basic introductions to topics like Boolean logic that you will quickly encounter in programming tutorials. In some cases, I have found it can be helpful to have access to another explanation. The beauty of the “self study” option lies in your ability to get what you need and come back later.

Blog Posts by Kyle Kingsbury (Aphyr): “Getting Started in Software” and “Core Language Concepts.” My friend and mentor Aphyr wrote these posts to help me get oriented, and he is sharing them with you too on his blog. The first, “Getting Started in Software,” should be accessible almost right away. The second, “Core Language Concepts,” will probably make more sense once you’ve had a chance to work with some programming tutorials.

B. Early exposure

Codecademy can be a good resource for interactive lessons and tutorials to help you get started and prepare you for further learning. It’s not perfect. You will probably run into a variety of criticisms of Codecademy and its competitors, some more tempered and well-intentioned than others. Look for a few that seem reasonable to you and read them. Think about the points they raise. Try out a few lessons, and see what you think. I’m not going to add my own proper review of Codecademy here, but I will offer a few comments to frame my recommendation.

For someone with little or no prior experience, Codecademy courses can be great for basic exposure to some very simple programming. As with most things in life, in order to get the maximum benefit from the courses, you will need to have reasonable expectations and a positive attitude. If you are a total novice as I was, resources like those above in section ‘A’ will help provide some valuable additional context.

Be patient and don’t expect too much of a free resource still under development. For instance, you will almost certainly run into an exercise validation issue that you don’t immediately understand. It may or may not actually be a problem with the site. Sometimes it will be part of the site or a browser behavior that you have no control over as a user. Sometimes you will have a perfectly good–or even better–solution to an exercise and it will not validate. Still other times it will be entirely your own mistake. Be patient. Try to figure out what is happening. Read your own code very carefully first, and then check the forums. If it still doesn’t make sense, check back later and try again. If/when you are able, try to reason about why the error is happening and learn something more in addition to resolving the immediate issue. Problems with exercise validation can be teaching moments about testing and writing your own tests in the future. Similarly, browser issues with the interface are teaching you about the very real challenges of making highly interactive applications behave well and predictably across the spectrum of browsers. To give one final example, in February, there was a database migration problem on Codecademy’s backend. Some of my user data was permanently lost. This was also a teaching moment–an opportunity to start thinking and learning about databases. With the right attitude, you can make problems part of your learning process rather than an impediment to it. In my experience, it will be vital to adopt this attitude as a learner whether or not you are working with Codecademy.

Furthermore, remember that, although they can be valuable, these courses are just a start, and treat them as such. By itself, working through Codecademy (or something similar) is not going to turn you into a credible developer. The courses will also not necessarily teach you the best way to write or think about even a basic concept or problem. In fact, there can be content problems on a variety of levels that someone with more training and experience will notice.[1] But here’s the thing: working through the lessons can start you on a path to become a person who can point out those problems yourself. This has worked for me. At least in my experience, Codecademy courses are enough to give a beginner just enough exposure and practice to take on more complex tutorials and resources, and just enough of a taste to pique a learner’s interest and help build zir confidence to make that sound both possible and fun. As such, I think Codecademy can be a valuable resource for beginners and I can recommend the site.

That said, you could also choose not to use Codecademy at all. You might try something different to get comparable exposure, or even skip right ahead to something like Chris Pine or Zed Shaw’s tutorials (see below). My guess is that it would take a little extra momentum and encouragement to launch most total novices right into those, but it could certainly work.

2. Next Steps:

From here, things get more difficult. I’ve personally focused on learning Ruby as well as possible as a first programming language and enough about JavaScript, HTML, CSS, and the DOM to be a competent junior participant in conversations with front-end developers and eventually build simple sites myself.

I’ve generally been learning these subjects in parallel, mostly because I get excited to know how the pieces will fit together in the end, and I have a lot of experience managing that kind of parallel learning successfully. I don’t regret learning this way, but it has probably made things more complicated than they otherwise would need to be. If you are learning too, it might be easier to handle one thing at a time.

My general strategy here was to make sure that I had the basics down, and then move on to more intermediate-level resources. I deliberately sought and invested time in second and even third presentations of the basics I had seen before on Codecademy or other tutorials. I wanted to be sure that I was learning best practices, and furthermore, in my experience, repetition and practice, especially in variations, powerfully promote retention of new information.

Resources for your next steps:

The following is a kind of annotated outline of resources that I have personally used and would recommend to another learner at this stage. Some of these are books that you would need to borrow from a friend or buy. Although money can be tight for those making a career change, I think these are still worth acquiring because of their quality and ongoing utility to a learner.

A. Ruby

Beginning:

First, you’ll want to solidify the basics with some resources aimed at newcomers to both programming and Ruby. In this respect, I found Chris Pine’s Learn to Program to be helpful. I used the free online version[2], but there is also a revised and expanded second edition available in paperback (ISBN: 978-1934356364).

Beginning-to-Intermediate:

Zed Shaw’s Learn Ruby The Hard Way, available in a free online version, is also very helpful. Shaw will take you back through the basics and then push you on to a more intermediate level. Start with the same author’s “Command Line Crash Course” if you need to learn (or re-learn) the basics of the command line interface as I did.

I recently acquired a copy of Peter Cooper’s Beginning Ruby: From Novice to Professional (2nd ed., ISBN: 978-1430223634), skimmed the earlier portions, and started working through some of the later chapters, which cover some more advanced topics at a greater depth than Shaw’s book. It also has the advantage of being originally written about Ruby, whereas Shaw’s work is a translation of his Python book. Cooper’s book is helping me to solidify my grasp of a number of things that I didn’t quite master the first time around by jumping from Shaw to Olsen’s Eloquent Ruby (see below), so I’m also recommending it here.[3]

I personally like Shaw’s Hard Way a lot and think it is worth working through in addition to Cooper’s Beginning Ruby, especially because Shaw’s format deliberately emphasizes practice in reading and writing as well as introducing concepts. However, in retrospect, I think it might also work to move from the Pine book or a similar introductory resource (even if in a different language) right into and through Cooper without Shaw. I think the choice would come down to your own learning style and the time you can make available to study.

Around this point, you may also find the Ruby Koans to be useful as extra practice and reinforcement. I know that I benefitted from working through the exercises. You will read a lot of Ruby carefully, find and fix problems, and get more practice using the command line and your favorite text editor.

Also at this stage, start working through the Project Euler problems. These will make you apply your knowledge in new ways. You can also learn an enormous amount about efficient algorithms and writing eloquent code by reading a variety of programmers’ solutions to the same problems. It may also drive you to learn more mathematics, which is definitely a good thing.
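To give a sense of what comparing solutions can teach, here is a minimal sketch in Ruby of two approaches to a problem in the style of the earliest Project Euler exercises: summing the multiples of 3 or 5 below a limit. The method names are my own invention for illustration, and neither version is meant as the “right” answer. The first checks every number in turn; the second uses a little arithmetic (the formula for 1 + 2 + ... + m) to skip the loop entirely.

```ruby
# Approach 1: brute force. Look at every number below the limit
# and keep the ones divisible by 3 or 5.
def sum_multiples_brute(limit)
  (1...limit).select { |n| (n % 3).zero? || (n % 5).zero? }.sum
end

# Helper (hypothetical name): the sum of all positive multiples of k
# below limit, using k * (1 + 2 + ... + m) = k * m * (m + 1) / 2,
# where m is how many multiples of k fall below limit.
def sum_of_multiples_of(k, limit)
  m = (limit - 1) / k
  k * m * (m + 1) / 2
end

# Approach 2: add the multiples of 3 and of 5, then subtract the
# multiples of 15, which would otherwise be counted twice.
def sum_multiples_formula(limit)
  sum_of_multiples_of(3, limit) +
    sum_of_multiples_of(5, limit) -
    sum_of_multiples_of(15, limit)
end

puts sum_multiples_brute(10)    # 3 + 5 + 6 + 9 = 23
puts sum_multiples_formula(10)  # same answer, no loop
```

Both methods agree, but the second does a fixed amount of work no matter how large the limit gets. Noticing that kind of difference between two correct solutions is exactly where a lot of the learning happens.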

Intermediate:

If and when you feel ready to dig a little deeper, you might move on to Russ Olsen’s Eloquent Ruby (ISBN: 978-0321584106). Olsen will help you to understand and think in the language at a deeper level. My personal copy of this is joyfully and extensively annotated. I keep coming back to it and learning more every time. Try to find ways to apply what you learn here to your work in Project Euler or building the practice projects described by Shaw and Cooper.

For other ideas about books, you might also consider Olsen’s own recommendations. You can check out his “Ruby Reading List” here.

B. Front-End Web Development with HTML, CSS, and the DOM

To start really working with JavaScript and its libraries, you first need to understand its context within web browsers, which means learning your way around the DOM and, along with that, HTML and CSS.

In my own work towards that goal, I have personally learned a lot through the companion books Designing with Web Standards, 3rd ed., by Jeffrey Zeldman with Ethan Marcotte (ISBN: 978-0321616951) and Developing with Web Standards, by John Allsopp (ISBN: 978-0321646927).

These books are written for professionals and serious students, with a strong emphasis on understanding and applying web standards and best practices. Working through and referring back to these really helped me to be able to open up Firebug or Chrome DevTools and grasp a lot of what I see, even if I can’t yet build it all myself. Thanks to these, I can also read and understand a lot of the conversations and arguments that front-end developers and web designers are having, which opens up a whole universe of professional discourse to an advancing student.

C. JavaScript

For learning JavaScript itself, I can personally recommend Eloquent JavaScript: A Modern Introduction to Programming, by Marijn Haverbeke, which is available in both an original online version and a revised paperback version (ISBN: 978-1593272821), as well as JavaScript: The Good Parts, by Douglas Crockford (ISBN: 978-0596517748). Neither of these books is likely to be very accessible to a total beginner, so I would recommend picking them up only after gaining some more experience.

Although the author labels it an “introduction to programming,” Eloquent JavaScript past Chapter 4 can be very tough for a beginner. I picked it up very early on, just after completing Codecademy’s JavaScript course, and I was stumped by the time I hit Chapter 6 (“Functional Programming”). I came back to it a month or two later, after I had learned the rudiments of programming in Ruby (which was more accessible to me as a novice) and reviewed some mathematics that I had mostly forgotten. I was then able to work through the rest of the book.

On the first page of his preface, Crockford explicitly writes that The Good Parts is “not a book for beginners” and warns that “the book is small, but it is dense” (xi). He deserves to be taken at his word. I’ve learned a great deal from this book, but only by working slowly and carefully, and only because I had already built a substantial conceptual foundation in programming through another language. In my case, this was Ruby, but for you, it could be something else.

Conclusion:

There are, of course, many paths to building a foundation in programming. In this post, I’ve described how I started building my own from the ground up, and I’ve offered references to some resources that have been helpful to me as an independent learner. No two students are exactly alike. You may not like these same resources, or you may prefer others. You might prefer to learn Ruby, for example, from the ever-popular and quite entertaining Why’s (Poignant) Guide to Ruby, by Why the Lucky Stiff. I think that’s great.

Remember, I never meant this to be comprehensive, exclusive, or read as some sort of a curriculum. Frankly, I’m nowhere close to having earned the right to make a curriculum. At this point, I’m still working at the level of the “intermediate” Ruby books I describe above, and consider myself something like an “advanced beginner.” I’m not yet an expert by any means. I simply wrote this in order to share part of my own story and help give back to the larger community of teachers and learners making this all possible. I hope that this post proves helpful to you in some way. The most important things to take away may simply be that it is possible to learn and that good help is available.

Notes:

  1. To be fair, I understand that the Codecademy team is actively working to improve their content all the time. I hope that the larger context of my comments here will make it clear that I do not intend to carp at Codecademy or its employees, but rather to recommend their product, which, like all products, has its limitations. I also realize that it may seem somewhat unfair of me to suggest that there are problems without citing specific examples and evidence. In this case I disagree. Citing and arguing about specific lessons would take us too far afield from the main topics of this post and would also require me to put considerable time and effort into an unsolicited and unpaid work of criticism, which would be unlikely to directly improve the content itself.
  2. The server seems to be inconsistently available at the time of this writing. Don’t be surprised if you get a 503 error.
  3. I certainly don’t intend this as a criticism of Shaw’s or Olsen’s work. My need to go back has much more to do with my own learning process and growing levels of understanding and ability at different times in my training than any flaw in the materials. A big part of my journey has been circling back to pick up bits and pieces of knowledge that I skipped over, forgot, or didn’t quite understand at an earlier point. Even the most diligent student attempting this kind of learning at speed will not be able to retain all of the details or fully work through each and every example or exercise in this many resources. I certainly don’t claim to have managed this myself. I don’t see that as a failure, but rather as utterly normal and nothing to be worried or embarrassed about in itself. Gaps in the foundation become a problem when you ignore them and fail to address them when they appear. So I go back. I put in my time. This is why review is important. This is why learning takes practice. If you want to build a solid foundation, you have to fill in the gaps. To paraphrase the great poet Kanye West, you must crawl before you ball. Maybe that should have been the subtitle of this post: “Programming Foundations: Crawl Before You Ball”?