Why is language documentation still so terrible? (walnut356.github.io)
94 points by commandersaki on Sept 12, 2024 | 81 comments


To me MDN is the standard bearer for good language/library documentation. It has example uses. A description of what the function/class/object is meant for. Tells you information about how/if you can use it.

Rust's is pretty good too. I also like Python's. I agree cppreference is very very bad and unreadable.


> I agree cppreference is very very bad and unreadable.

The layout could definitely be better, but at least the information density is high. My take is that any attempt at C++ docs just kinda really does need all of the information that cppreference gives you.

I laughed in despair at their operator==,!=,<,<=,>,>=,<=>(std::unique_ptr) picture but the thing is, C++ is just kinda like that, and I don't think any other way of organizing that information would be clearer.


A good alternative to cppreference is https://devdocs.io/cpp

Same content, better search.


In my experience all three you mentioned have their own excellent areas.

MDN covers a vast range of subjects and knowledge levels but is most laborious to maintain and I have seen some gaps due to that.

Rust is also similar in those aspects, but has different "documentations" that don't fully interconnect to each other yet. Rustdoc's automatic nature is great to have a complete and working documentation, but also means that API documentations are often weaker than those of MDN as the documentation should be embedded into the source code. Python has the same issues and also inconsistencies due to the lack of language-wide automatic documentation system (Sphinx autodoc is close but still annoying).
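For anyone who hasn't seen it, here is a minimal sketch of what "embedded into the source code" looks like with Rustdoc. The `///` comments are what gets rendered as the API page. The function `clamp_percent` is made up purely for illustration.

```rust
/// Clamps `value` to the range `0..=100`.
///
/// Values below 0 become 0 and values above 100 become 100;
/// anything in between is returned unchanged.
pub fn clamp_percent(value: i32) -> i32 {
    // `clamp` comes from the standard `Ord` trait.
    value.clamp(0, 100)
}

fn main() {
    println!("{}", clamp_percent(140));
}
```

Whoever edits the function body sees the user-facing text right above it, which is both the strength and the cost being discussed here.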

Cppreference (perhaps not intentionally) covers only higher knowledge levels and absolutely standard matters. It is of course very clear that you can't really use cppreference for learning C++ from scratch, but it is well written as long as you fit with those assumptions and its writing quality is comparable to MDN in my opinion.


Why does embedding the documentation into the source code make it weaker? The author can put as much or as little effort into the documentation as they want either way.


A documentation is written for particular audiences, not always but typically users. And a user-directed documentation is largely a hindrance for whoever is actually updating a source code adjacent to that documentation, because it acts like a boilerplate template. An embedded documentation does make the absence of documentation obvious and helps towards the completeness, but not without a compromise. Ideally it should then be reordered and further elaborated externally so that it can be maximally useful.

Rust has good documentation, but the general lack of such reorganization often makes it hard to navigate. For example, `Vec` [1] has a lot of inherent methods with two separate orderings: one is alphabetical in the sidebar and the other is the declaration order. Neither is actually logical, and both would benefit from reordering. Of course Rust folks are aware of this issue and have put a lengthy introduction with the most important methods for each category, but that's less flexible and prone to going stale with future updates.

[1] https://doc.rust-lang.org/stable/std/vec/struct.Vec.html


I’m not so sure that the order of methods matters very much in most cases. In Rust you can at least present them in any order you like, or you can use the module or trait documentation to explain how the order matters.


C++'s canonical reference is a very thick book. I figure there is a conflict of interest between the published version and presenting an equivalent online.

It's worth noting there is no good C documentation in one place, either.

Python dumped the book quickly and went online, but that was probably more for the publisher not being interested in releasing new versions every time something changed. It's also a little younger, but still appeared before the widespread use of broadband internet.

I think Perl was probably the first one to make documentation a real first class citizen, with perhaps Emacs shining a light in that direction.


Personally, I’ve often found cppreference has a bit of a learning curve to it.

Or, at first it seems overly verbose and hard to parse. As you get more familiarized with C++’s semantics though, it feels very useful for covering a lot of ground quickly.

That said, it feels like something that comes with time.


I disagree, cppreference is pretty good. The information is very complete and useful examples abound. And just because you may not understand a template operator definition does not make it useless documentation.


Yes 100%.

I'll expand by saying I think there's two types of documentation: references and guides. cppreference is a really, really good reference. It's complete, highly specific, and well formatted.

But it's an awful guide. If I was trying to learn C++ or its standard lib, I would want to kill myself.

The problem is these two types of documentation are almost perfectly perpendicular in my mind. Meaning, a good guide is a poor reference, and a good reference is a poor guide.

Hand-holding step by step instructions, popular with many js frameworks now, are great guides. But when I want to know what function X returns and under what circumstances, I don't want to prune through a guide that starts at the beginning of the universe. And often those guides will be missing huge amounts of detail to lower the complexity.

So you need one kind of documentation when you start, and then another 5 years later.


That “python float” example is a hall of shame for Google, not python. I searched the same term using kagi.com and first results are exactly what the author expects.

Link: https://kagi.com/search?q=python+float&r=us&sh=cDUw0QAZf3DUs...


Even searching 'float' on docs.python.org brings decent results.

At first the author complains about bad search implementations on the documentation sites and then judges Python by the bad google results.


Python stdlib docs are totally fine, too.


Good language documentation requires a collection of skills:

* writing

* super high empathy

* a deep understanding of the language in both substance and form

* persistence through adversity in a challenging effort

Most developers I have worked with in the past struggle in all those bullet points except possibly that last one. Everybody in software likes to think they are awesome, but tell them to write an essay or formal documentation and that confidence is immediately shattered.


The empathy part I don’t understand. I don’t even need to write good docs for other people. I can do it for myself in 6 months. Anyone who has looked back at their old code should understand that unless you’re doing something entirely trivial a comment or two will be really useful.


> The empathy part I don’t understand.

This is what you need when writing documentation covering...

1. Where should someone new to the codebase start? Literally, which line in which file?

2. How does the code "fit together"? What is the call-graph of its top use-cases?

3. What are its compile-time and run-time dependencies? Which exact versions of those dependencies were used during development?

4. When something isn't working, how can someone replicate your development environment EXACTLY in order to determine what's different about theirs?

5. How would you approach debugging on this codebase? A detailed write-up or even a video of every step you took could help someone new gain a deep understanding quickly.


Yes, those are the main things I would also probably not remember in 6 months, and tend to write down in a readme.


That's often not enough when your audience is much more clueless than you assume. You would have to repeat examples over and over and throw some more just in case.


If there was a contest, then PHP takes first prize. Whatever they're doing with community-driven examples is a mess and should be removed permanently.

For those unaware, you can submit example suggestions for functions in PHP. The majority are being down voted for being garbage.


Community-driven suggestions are great when they are correct and/or useful enough, which sadly they haven't been for years in the case of PHP. I believe the official documentation should have been periodically updated to assimilate suggestions somehow, so that it covers common use cases and questions but also remains correct and useful through curation and possible corrections.


That's not even the worst of its sins. The official part of the page is often incomplete, doesn't tell about important edge cases or differences between win and *nix. Also it is regularly flat out wrong.

Sometimes the comments are all screaming about how bad the doc is, then all give an inconsistent variant of the truth.


PHP has the best docs available.

When I started using Python, oh man, I immediately noticed how much time I wasted because of the horribly structured, all-over-the-place docs.

Google also has some of the worst documentation, like on browser extensions.


When I was learning php as a teenager in the early 2000s, those community examples were the best.

I haven't touched php in forever, but I imagine that the spammers and idiots would ruin that feature.


These append-only community-driven docs were, like PHP itself, an amazing feature in 2001.

The massive amount of garbage and the downvoting wasn't a thing then.

They came when the Internet got populated by normies.

StackOverflow stopped being a fun place to be, too.


I believe that the documentation is a huge reason why php became so dominant on the net. Human curated, many real-world examples and helpful comments to get things done.

It's all I needed to learn web dev.


I completely forgot StackOverflow was even a thing. Haven’t used it since ChatGPT 3.5 was released


It was declining years before LLMs.


Rust is this way because Mozilla decided to prioritize documentation early, and documentation was taken seriously by the folks involved. With that in place, it also takes a tremendous amount of work by many people over years.


Everyone who thinks that they know how their documentation should be indexed should take a look at the Permuted Symbol Index <https://www.lispworks.com/documentation/HyperSpec/Front/X_Sy...> from the Common Lisp Hyper Spec. <https://www.lispworks.com/documentation/HyperSpec/Front/inde...>. The rest of the CLHS is very… old–school. Very complete and useful, but definitely from an earlier age.


I just want to keep slamming ^ to keep this permanently at the top of this site. In the course of my day I have the occasion to visit the documentation of many software projects. Some created by mega corporations (AWS), some more humble open source projects (Apache Spark) and uniformly they are all terrible. Most are clearly created with automated tools which seem to have been designed to save the developers time at the expense of anyone actually trying to use what they have created.


Apple's Swift documentation is killing my enthusiasm for learning the language. Their search is so terrible I wish it just redirected to duckduckgo.

When I look up a class, I can't see all of its methods on a single page because it doesn't show methods inherited by the Protocols it conforms to.

Also, Apple's license for the documentation means that sites like devdocs can't put the docs on their website in a better format.


If you're on macOS, Apple's docs are best viewed through Dash. While it can't overcome the documentation's fundamental lack of visibility into inherited API, its search and presentation are fantastic.

If you're not on macOS and just searching open source Swift docs, try swiftpackageindex.com and swiftinit.org. Swift Init especially has fast rendering and better search, though ultimately is based off the same inline docs as Apple's.


I have written several pieces of API documentation to various extents and want to stress this: please use or invest in superior tooling, but also understand that it takes a lot of time and effort regardless. Rust documentation is great partly because its main "book" (The Rust Programming Language) was initially written by paid and thus dedicated authors. Rust would still have good documentation without that investment, but probably its book would've been much weaker or even replaced by the Language Reference. If you feel you can write much better than what is already in place, and of course if no other people seem to already be working on that (because consistency matters in documentation), then please give a hand to maintainers!


The author rants about Java a lot, when they admit they've never even used it. It shows. People view documentation in their IDEs. Generated javadoc is almost not a thing nowadays. I suspect this is the same in similarly statically typed languages with good IDE support.


First time I saw an experienced Java programmer at work was a revelation. One hand on the keyboard, one hand on the mouse. Hardly ever actually typed anything. Line width set to 200 characters. Cranking out boilerplate, 5 lines with a single mouse click. Measured in lines of code, his productivity was off the charts.


And still, I switched to C# now and I have to say, my overall Java productivity was massively better. Better tooling, better code analysis, better literally everything.


IDEs will show you the docs for whatever method you're using… but their support for navigation or exploration tends to be somewhere between "absent" and "severely lacking".


Navigation is quite good (at least in Visual Studio C#). You can see usages and go up and down the class hierarchy quite easily.

Exploration, though, not so much. All you have is typing "object." and then reading the methods and properties. It's not awful, but if you don't know the object first you can't do it.


Do you have specific examples in mind?


I still use javadoc a lot. I was shocked to see the author complain about it. I have always found javadoc to be one of the best language docs to date. I find myself missing it constantly when I am using other less documented languages and libraries. It is well structured and exhaustive, the only confusing parts of it are where the actual underlying class is poorly structured.

I would highly suggest using javadoc before something like SO when you are confused about how to use a class. The vast majority of SO's java help is frozen in time due to SO's 'no duplicate questions' policy. Java has improved a lot since java 7.


Java IDEs still rely on Javadoc behind the scenes, to my knowledge, so you might be correct that some UIs have been replaced by IDEs, but its basic format never changed. Also such documentation does not account for introductional or topical content that can't be reached from source code.


Right, I was careful to say "generated javadoc", but that nuance might not have been obvious. The author's point is not about the syntax you use to write documentation, it's about the experience you have consuming it. (And FWIW, Java 23 supports Markdown comments[0]).

Re: "introductional/topical contents", this is what package-info.java files are for.

0) https://openjdk.org/jeps/467


Do you look up symbols in the IDE? What about finding all symbols in a package? I use inline IDE documentation a lot but still also use the online or offline copy of the HTML reference manual.


> Do you look up symbols in the IDE?

Yes.

> What about finding all symbols in a package?

I do not know what exactly you mean here, but I look for stuff like "all methods" or "implementations" in IDE.

> I use inline IDE documentation a lot but still also use the online or offline copy of the HTML reference manual.

I literally never use an offline copy of the HTML manual, nor feel a need for it. I use the online copy when Google lands me there, but typically I then find the same thing in the IDE, because then I also see the source code and have a generally great experience.

To me, having to read manual online is a fail of documentation.


Someone show this to Linux kernel maintainers


Kernel maintainers are so far into believing that every aspect of their job is unique (some aspects undoubtedly are!) that I don’t think any of them could be convinced to learn anything from any other software development effort.

They’re like the Americans of tech. Truly.


I want to start by saying the author's right about wanting clarity in documentation and actually having documentation that's easy to navigate. golang, for example, has pretty good documentation, but whenever I visit the official docs I find it difficult to find my way to the information I'm looking for.

Having said that, it's worth asking if some of these asks are orthogonal to each other. For example:

>That page must contain (not link to) every method, and the descriptions of those methods, that can be called by that class, preferably including all inherited functions.

>That page must be as uncluttered as possible

"Including all inherited functions" is a pretty deep stack pretty quickly in a lot of languages. I'm entirely willing to acknowledge that maybe that means the page being "as uncluttered as possible" is to be read in the same vein as "the design should be as simple as possible, _and no simpler_".

>Seriously, cppreference straight up taking you to duckduckgo when using the search box is fucked.

In this case, we're seeing the tension between "The official docs should be great" and "I've mistaken a community project for the official docs" (cf. https://en.cppreference.com/w/Cppreference:FAQ the question "Who is behind this site?")

This sort of conversation naturally invites the more subtle conversations around who funds/maintains open/libre projects and whether those in the community who aren't actively working to improve the situation should follow ESR's wonderful advice, "Every good work of software starts by scratching a developer's personal itch."


> In this case, we're seeing the tension between "The official docs should be great" and "I've mistaken a community project for the official docs"

What official documentation? The standards? C and C++ are weird languages from a modern perspective, and that includes their cultures: They don't have a single blessed implementation, they have a standards body and a community. The standards body issues standards, the community does everything else. Both C and C++ come from a time when all "serious" languages worked like that. Yes, even BASIC. There's an ANSI standard for BASIC.

It would be great if the FSF and/or the LLVM people wrote documentation for C and C++ and it would be even better if they collaborated on it. But it would be no more "official" than cppreference is, because they don't write the standards.


I intend to sound like I'm agreeing with you here, not disagreeing.

The standard is what the committee publishes, yeah. For C++ it's https://www.iso.org/standard/79358.html ; I didn't look up the corresponding C one, but I have a friend (hi, Kate!) who maintains a C compiler who's pretty comfortable with the latest standard. The standard is the official documentation.

There's drawbacks having whitepaper standards that cost money, but this is what those languages have.

For what it's worth, it used to bug me that part of the standard is that there's intentionally undefined behaviour, but I went to BoostCon and heard some of the standard body talk and they impressed me as being thoughtful about leaving areas for implementers to innovate specific optimisations, so as not to restrict the potential of the community.

If I'm reading you right we agree that there's a difference between the standard and the implementation that's practical and real, but I'm not certain there's much to be done about that.


The C and C++ draft standards are publicly and freely available, and are generally better quality than the actual ISO standards. You can get the C standard here https://www9.open-std.org/JTC1/SC22/WG14/www/projects.html, and the C++ standard is also available in HTML form at https://eel.is/c++draft/ (having hyperlinks to individual sections is actually quite useful).


As a tangent to the primary example in the article (Rust docs), I'll vent about a pet peeve: if traits are used instead of concrete types, the elegant stream of links the article mentions terminates. Unless there are manual examples (in the docs, repo, lib website, etc.), it's time to plug in arbitrary types (i8 is nice!) and see what the compiler message shows.


cppreference is the best you can do for C++. It isn't the documentation's fault that the language got so big and frequently introduces or deprecates features.

cppreference is best used for the normal workflow:

Search with Google, instantly find the correct page among the first 10 results, go there, skim the page, find what you need.

cppreference is more like man pages. You slowly build up your own mental map rather than having a structure enforced on you by hyperlinks and a directory hierarchy.

I find this model superior.


The main trouble I have with Rust as someone who is unfamiliar with the language is all the traits and trait implementations. I’d rather just see all the instance methods available on a string (for example) regardless of what protocol they implement.

But these days you can ask AI chat, so it’s not a big deal.


>But these days you can ask AI chat, so it’s not a big deal.

Yesterday I asked chatGPT about OnceCell. It said "SyncOnceCell<T> in Rust is the thread-safe version of OnceCell<T>". This is incorrect, the thread safe version is OnceLock.

When confronted with this it went on to say "What I was actually referring to is OnceCell<T>, which is thread-safe when used as a static/global value" which is also not true.
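For reference, a minimal sketch of the distinction, assuming a recent stable toolchain where `std::sync::OnceLock` is available. The `GREETING` static and `greeting()` helper are made up for illustration. As far as I know, `SyncOnceCell` was the pre-stabilization nightly name for what is now `OnceLock`, which may be where the chatbot's answer came from.

```rust
use std::sync::OnceLock;
use std::thread;

// Hypothetical global, used only for this illustration.
// `OnceLock<T>` is `Sync` (for suitable `T`), so it can live in a `static`.
static GREETING: OnceLock<String> = OnceLock::new();

/// Returns the shared greeting, initializing it on first use.
fn greeting() -> &'static str {
    GREETING.get_or_init(|| String::from("hello"))
}

fn main() {
    // Several threads race to initialize; the value is set exactly once
    // and every thread observes the same contents.
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| greeting().len()))
        .collect();
    for handle in handles {
        assert_eq!(handle.join().unwrap(), 5);
    }
    println!("{}", greeting());
}
```

Swapping the static's type to `std::cell::OnceCell<String>` should fail to compile, because `OnceCell` is not `Sync` and statics must be. So "thread-safe when used as a static" does not hold for it.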


I can vouch for the quality of the Java documentation. Having learned Java primarily from the online javadoc pages, the Sun Java Tutorial, and Java in a Nutshell, I think the Java documentation really set the bar high.


I work with elixir and the docs are really nice, I encourage other languages to take inspiration.


Yep, some of the nicest docs for a language I've seen.


"Ick", "Ow", what is this word salad.

C# documentation is pretty decent, exhaustive and comprehensive. How would you even "improve" table of contents anyway? It just says what a type has. You can't add/remove much from it. Are they really complaining that a website they haven't used much is different to a website they have used a lot?

There are plenty examples of bad documentation but for simple type info navigation all C#, Go and C++ are totally fine - I only had to peruse the C++ one for example and found everything quickly and in sufficiently great detail (it did not help with abrasive nature of the language for general purpose coding but I digress). It's the additional information/context that is often lacking. Moreover in C# you can use your IDE or a VS Code plugin to see all types present in a namespace or a package. Most of the time, their type and method names are self-descriptive - what matters more is hand-written guidance to using the library the right way, which is extremely hit and miss with numerous Rust crates.

I think the author just wanted to make a complaint for its own sake and is not looking at documentation from perspective of being productive with a particular language.


> How would you even "improve" table of contents anyway?

See the Rust example at the top of the post.


I appreciate the author’s angle. Developers are given thousands of dollars in equipment and software licenses. But then language reference pages are insanely convoluted. I don’t know why people put up with it.

The only part I don’t agree with is the author’s assessment of SEO. Search algorithms aren’t handed down by God on stone tablets. Google decides how to rank search results. They can figure out how to give sane results for common programming language queries.


Lack of positive feedback, hard to measure the value. So people tend to invest little effort in documentation. That’s my guess.


What I like the most about go is that it has a unified way to write both unit tests and examples, which get integrated in the autogenerated documentation.

Extra quirks and edge cases are just comments in the implementation code.

In my opinion that's exactly how it should be. The documentation should be as close to the codebase as possible to avoid redundancies and an out of date documentation.

Only the go documentation proxies are a little messed up, see sourcehut blog posts about it. But the huge advantage is that you can selfhost them [1] and create your own automated documentation base for your company, for example.

[1] https://github.com/golang/pkgsite


I've had better experience reading Go docs than Rust.

Though it's been a year since I was dabbling with Rust, I had a hard time finding specific functions/methods in different crate docs. For example tokio_tungstenite::accept_async returns a WebSocketStream<S> and a lot of the code snippet examples show use of a split() method, but I couldn't find it in the docs or any examples.

Go shows prominent examples for almost all things in stdlib, but couldn't say the same for Rust.
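On the `split()` mystery: as far as I can tell, it is not an inherent method of `WebSocketStream` but comes from an extension trait (I believe `futures_util::StreamExt`), so it is not listed among the type's own methods. Below is a minimal sketch of that pattern. It uses a made-up `DoubledExt` trait over plain iterators, not the real crates.

```rust
/// Hypothetical extension trait, standing in for something like `StreamExt`.
trait DoubledExt: Iterator<Item = i32> + Sized {
    /// Provided method: collects the iterator with every item doubled.
    fn doubled(self) -> Vec<i32> {
        self.map(|x| x * 2).collect()
    }
}

// Blanket impl: every iterator over `i32` gets `doubled()` for free,
// even though none of those iterator types define it themselves.
impl<I: Iterator<Item = i32>> DoubledExt for I {}

fn main() {
    let values = vec![1, 2, 3];
    // `doubled` resolves through the trait, not through `IntoIter` itself.
    println!("{:?}", values.into_iter().doubled());
}
```

In a real project the trait lives in another crate and has to be imported (something like `use some_crate::DoubledExt;`) before the method resolves, which is the other half of why it is easy to miss.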


Evangelism around a framework trumps good documentation.

Just today we were talking about how dull and overly technical AWS documentation is, and how Amazon Q is justified in its existence just to make the eye-bleeding-like experience of configuring a policy slightly more tolerable.

If you want just-terribly-awful-embarrassing documentation, just try doing anything in the KDE Plasma stack without a search engine. Try building a kirigami app and looking up class definitions. The easiest path is to install the Kirigami Gallery app and click on the examples of what you want, which opens a web browser to the source code of the app. Written in QML resembling the jQuery mega scripts of yesteryear.


C man pages are great because you don't need internet access or to leave your terminal. Just open up a tmux pane next to your vim ...

If I have to switch to my web browser there's a reasonable chance I get distracted.


For bare C, man pages or opengroup.org (e.g. [0]) are often quite nice. Linux's (GNU's really) man pages can be sometimes noisy though.

[0]: https://pubs.opengroup.org/onlinepubs/9799919799/functions/o...


I understand that some people prefer a dark theme. What I don't understand is why do some people think that everyone must prefer a dark theme, so they set up their websites to always use a dark theme, regardless of the browser preference?

At least, this website lets you switch to a light theme. But if you do, inline code fragments still use a background color from the dark theme, which makes them practically unreadable. Looks like the author doesn't care about people who prefer a light theme. Then why do they complain about websites made by people who don't care about people who prefer a dark theme?


I agree that implementing two themes but not following browser preference is odd, although I'm a dark mode user. The weird thing is that the Hugo theme used (PaperMod) follows browser preference by default. The blog author had to explicitly turn that off.


It’s a consequence of thinking light/dark modes are personal preference when they’re accessibility features.

And when viewed from that lens you can see how off-putting it is when it’s a paid feature, or only adjustable when logged in, or a user-setting that seemingly always defaults to light.


Does anyone remember Codeigniter docs? Was quite good as far as I can remember


What I personally hate in documentation is raw English. Give me a synopsis, a bunch of examples, a dump of typedefs, a tree/graph of connections between concept names. I can understand it immediately as long as you write it in latin script.

But if you only blah blah, blah blah blah blah <optionname> blah blah blah, blah blah, blah blah blah <value1> blah, blah blah blah blah blah blah <value3> blah blah <value1> blah blah, then screw your manual and you too.

Also, “documentation” in README.md and in markdown in general. The laziest form, especially unreadable on 4+ level headers which are indistinguishable from just text.


cppreference is a Mediawiki site. It absolutely could be a better experience, but it's currently running an ancient version that makes things like responsiveness and theme improvements harder to pull off. https://en.cppreference.com/w/Special:Version


C# makes it so painful to find what I want. Last time I wanted to write a quick GUI, I gave up and went to YouTube because the videos were simply easier.


WinForms is the easiest gui platform I've ever used.

Still supported in .net 8. Windows only though.

Beyond that, I have not bothered.


I have the opposite experience, I've found the MSDN docs for dotnet are very helpful both at the conceptual and implementation level.

GUI might be a special case because there are so many flavors of it, and half of them are abandoned dead-ends.


Was that something specific you had trouble with? You can v. quickly start with e.g.

  dotnet new --install Avalonia.Templates
  dotnet new avalonia.app
  dotnet run

Will work on any platform.


It’s almost like the people who undervalue people who can communicate well end up reaping what they have sown.


Honorable mention: CRAN & RStudio. Just excellent overall.


> That page must contain (not link to) every method, and the descriptions of those methods, that can be called by that class, preferably including all inherited functions

For the love of god, before thinking about that be sure that at least a version of the site loads on any device.

On developer.android.com many pages are impossible to open on lower-end phones.


"You're forgetting Rust!" ... oh wait, you're not. :-D



