Documentation and how it rots
If you’ve ever had to manage documentation in a large codebase, you know how quickly it “rots”. Who takes the time to check that the README.md still makes sense 6 months into a project, that your database migrations run without errors and your seeds load correctly, or that people have been adding new API routes and example requests to tools like Postman or Insomnia?
Usually, unless there is a customer constantly reading and using your documentation (e.g. you are an API provider), engineering teams are neither incentivized to keep documentation current nor held accountable when it lapses (it’s always a nice-to-have, not a need-to-have). As a result, outside of customer-facing APIs, documentation mostly gets updated by new hires: they follow the README to set up a project, immediately run into trouble, and decide to fix the documentation, both as an easy way to show a contribution early on and because they don’t want the next person onboarding to run into the same challenges that frustrated them. So, if you’ve read this far, an easy solution to documentation debt might be to just always be hiring for your project!
Kidding aside, documentation debt isn’t (really) anyone’s fault. While it would be great to say that all companies should make addressing documentation a core value, usually it’s one of the first “values” that gets deprioritized when crunch time happens (how many times have you heard “I’ll come back after the feature launch to fix the documentation and tests, I swear!”?). No matter what processes, values, and workflows you instill at your company, if they are human-driven, they will eventually fail to scale and be applied inconsistently, leading to continued gaps.
Solving for documentation debt
An existing solution: automation and careful choice of technology
While there are many ways to help address documentation debt, one solution is to require developers (through the programming language, tests, etc.) to write code that helps the documentation stay up to date. E.g. developers will add TypeScript types to their JavaScript methods / classes / etc. if they know their code won’t otherwise pass the tests / GitHub Actions / other automated tasks they’ve added to their repo. This should then incentivize engineering leaders to choose technologies or programming languages that can enforce good practices: if adding TypeScript to JavaScript results in better documentation and an easier developer experience at the cost of just a little extra work while writing new code, it seems like a no-brainer to choose it.
The challenge then becomes what other “no-brainers” we can use to help with documentation debt problems, and unfortunately the answer isn’t straightforward for Ruby / Rails. Sure, you can adopt tools like RuboCop to keep the code cleaner and consistent with a shared set of rules, but adopting other libraries like the type-checking library Sorbet seems potentially risky for any engineering team, as it’s not really endorsed by Ruby or Rails as a solution, and doesn’t have a ton of community support.
This became evident when I set out to document my team’s Rails API. At the start, I had been using Postman and adding routes / params / etc. there, but just as I mentioned above, over time I failed to keep it updated or complete. Instead, I thought that it might make sense to build API documentation right into the app itself, and found rswag, a great library that makes adding, updating, and even generating your OpenAPI / Swagger documentation as seamless as writing tests. Sounds like a great find, right?
Well, if we had adopted this library right at the start of the project perhaps we’d be fine, but after writing more than 1100 tests for our app, the idea of going back and rewriting a large portion of them to match the format rswag expects was very unappealing. I had hoped there was a more automated way, and while I did find a very interesting talk and a library that seemed very promising, it still fell short of making it “easy” to add automated documentation.
A potential Rails solution, at least for API documentation?
I believe Rails should manage this aspect for you, and I don’t think it would be that “hard”[0] to add into the existing Rails framework:
- Rails already generates your routes (via rails routes)
- Rails also recognizes the params / strong params you are using in each controller method
- Rails finally knows what potential render actions you’ll take
Why couldn’t it just complete the circle and give you a way to define some simple types for your params, and then automatically generate documentation for your REST methods?
E.g. let’s imagine a slightly-modified Rails scaffolded controller:
def update(params: post_params)
  respond_to do |format|
    if @post.update(params)
      format.html { redirect_to post_url(@post), notice: "Post was successfully updated." }
      format.json { render :show, status: :ok, location: @post }
    else
      format.html { render :edit, status: :unprocessable_entity }
      format.json { render json: @post.errors, status: :unprocessable_entity }
    end
  end
end

# ....

private

def post_params
  params.require(:post).permit(title: String, body: String, publish_at: DateTime)
end
Above, we’ve added a couple of things:
- A method argument for our controller action, (params: post_params)
- Some classes representing types in our post_params (e.g. title being a String, publish_at being a DateTime, etc.)
But now, our controller should possess all the necessary information to generate its API documentation:
- rails routes can determine how the path to that action will appear, along with whether it’s a PUT / PATCH / POST, etc.
- post_params defines the parameters to be used in the controller method, and can exclude any parameters that don’t match the listed param attributes or the types given for them.
- The respond_to block recognizes which format types it supports (json and html), along with what types of responses it will give (:unprocessable_entity and :ok, along with the standard :not_found and 500 errors Rails will give you out of the box)
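To make the post_params idea a little more concrete, here is a minimal plain-Ruby sketch of what a type-aware permit could do. Nothing in it is Rails API: typed_permit and coerce are hypothetical names, the input is an ordinary Hash with string keys rather than ActionController::Parameters, and only String and DateTime are handled.

```ruby
require "date"

# Hypothetical stand-in for a typed `permit`; this is NOT Rails API.
# `schema` maps attribute names to the class each value must match.
def typed_permit(raw, schema)
  schema.each_with_object({}) do |(key, type), permitted|
    next unless raw.key?(key.to_s) # attributes missing from the request are skipped

    value = coerce(raw[key.to_s], type)
    permitted[key] = value unless value.nil? # drop values of the wrong type
  end
end

# Returns the coerced value, or nil when the value does not fit the type.
def coerce(value, type)
  if type == String
    value if value.is_a?(String)
  elsif type == DateTime
    DateTime.iso8601(value.to_s)
  end
rescue ArgumentError
  nil # unparseable timestamps are treated as a type mismatch
end

raw = {
  "title" => "Hello",
  "body" => 42, # wrong type, will be dropped
  "publish_at" => "2024-01-15T09:30:00+00:00",
  "admin" => true # not listed in the schema, will be dropped
}

permitted = typed_permit(raw, { title: String, body: String, publish_at: DateTime })
p permitted.keys # prints [:title, :publish_at]
```

Silently dropping mismatched values mirrors how permit already ignores unlisted keys; a real implementation might prefer to raise or to return a 422 instead.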
This approach seems like a potential easy win for Rails, offering significant value without substantial overhead. A comparable analogy might be how Rails integrates with GraphQL to auto-generate your schema, but without all the hassle and pain that comes with supporting GraphQL.
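As a rough illustration of how those three pieces of information could be combined, the sketch below turns a route, a typed param list, and a set of response codes into a fragment shaped like an OpenAPI paths entry. It is plain Ruby under stated assumptions: the Action struct, openapi_operation, and the OPENAPI_TYPES mapping are all hypothetical, and a real generator would also need to describe path parameters such as id.

```ruby
require "date"
require "json"

# Assumed mapping from Ruby classes to OpenAPI primitive types.
OPENAPI_TYPES = {
  String => { type: "string" },
  Integer => { type: "integer" },
  DateTime => { type: "string", format: "date-time" }
}.freeze

# Hypothetical record of what the framework already knows about one action.
Action = Struct.new(:verb, :path, :root_key, :params, :responses, keyword_init: true)

# Builds a hash shaped like one entry of an OpenAPI "paths" object.
def openapi_operation(action)
  properties = action.params.transform_values { |klass| OPENAPI_TYPES.fetch(klass) }
  body_schema = {
    type: "object",
    properties: { action.root_key => { type: "object", properties: properties } }
  }

  {
    action.path => {
      action.verb.downcase => {
        requestBody: { content: { "application/json" => { schema: body_schema } } },
        responses: action.responses.to_h { |code, text| [code.to_s, { description: text }] }
      }
    }
  }
end

update = Action.new(
  verb: "PATCH",
  path: "/posts/{id}",
  root_key: :post,
  params: { title: String, body: String, publish_at: DateTime },
  responses: { 200 => "OK", 404 => "Not Found", 422 => "Unprocessable Entity" }
)

puts JSON.pretty_generate(openapi_operation(update))
```

The point of the sketch is only that no extra annotations are needed beyond the typed params: the verb and path would come from the router, and the status codes from the respond_to block.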
…But what about response typings?
One large piece missing from my “simple” example above is that we still don’t know what the response object will look like. For HTML responses this isn’t a concern, but for JSON / other response types, it would be important for any auto-generating API tool to know what the response object could be.
To solve for this, I think the same concept as above could apply, but we’d need to do it at the ActiveModel::Serializer, jbuilder, or blueprinter level. E.g. if we were using blueprinter, the code could look something like:
class PostBlueprint < Blueprinter::Base
  field :id, type: Integer
  field :title, type: String
  field :body, type: String
  # etc etc
end
The idea would be that typing could be added in gradually, as TypeScript does for JS, mypy does for Python, Hack does for PHP, etc. (or perhaps this could become a part of Sorbet for Ruby).
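A small sketch of what “gradual” could mean here: the type on a field is optional, typed fields contribute to a response schema, and untyped fields fall back to “any value”. TypedSerializer and response_schema are hypothetical names that only mimic the shape of a field DSL; this is not Blueprinter’s API.

```ruby
# Hypothetical mini-serializer base class; it only imitates a `field` DSL.
class TypedSerializer
  JSON_TYPES = { Integer => "integer", String => "string", Float => "number" }.freeze

  # Each subclass keeps its own field list in a class-level instance variable.
  def self.fields
    @fields ||= {}
  end

  # `type:` is optional, so existing untyped declarations keep working.
  def self.field(name, type: nil)
    fields[name] = type
  end

  # Typed fields get a concrete schema; untyped ones get the empty schema,
  # which JSON Schema treats as "any value".
  def self.response_schema
    properties = fields.transform_values do |type|
      type ? { type: JSON_TYPES.fetch(type) } : {}
    end
    { type: "object", properties: properties }
  end
end

class PostSerializer < TypedSerializer
  field :id, type: Integer
  field :title, type: String
  field :body # not typed yet
end

p PostSerializer.response_schema
```

A team could then add types one field at a time and watch the generated documentation become more precise, without a big-bang migration.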
Footnotes
- [0] When I say it wouldn’t be that “hard,” I’m referring to the concept itself, not the extensive effort needed to navigate all the edge cases and implement solutions in a “Rails-esque” manner. I understand implementing this in a framework that needs to support millions of developers will take a lot of work!