Rust solves the problem of incomplete Kernel Linux API docs

dagmx · on Aug 31, 2024

This is one of the biggest advantages of the newer wave of more expressively typed languages like Rust and Swift.

They remove a lot of ambiguity in how something should be held.

Is this data type or method thread safe? Well I don’t need to go look up the docs only to find it’s not mentioned anywhere but in some community discussion. The compiler tells me.

Reviewing code? I don’t need to verify every use of a pointer is safe because the code tells me itself at that exact local point.

This isn’t unique to the Linux kernel. This is every codebase that doesn’t use a memory safe language.

With memory safe languages you can focus so much more on the implementation of your business logic than making sure all your codebases invariants are in your head at a given time.

GolDDranks · on Aug 31, 2024

It's not _just_ memory safety. In my experience, Rust is also liberating in the sense of mutation safety. With memory safe languages such as Java or Python or JavaScript, I must paranoidly clone stuff when passing stuff to various functions whose behaviour I don't intimately know of, and that is a source of constant stress for me.

miki123211 · on Aug 31, 2024

Also newtype wrappers.

If you have code that deals e.g. with pounds and kilograms, Dollars and Euros, screen coordinates and window coordinates, plain text and HTML and so on, those values are usually encapsulated in safe wrapper structs instead of being passed as raw ints or floats.

This prevents you from accidentally passing the wrong kind of value into a function, and potentially blowing up your $125 million spacecraft[1].

I also find that such wrappers also make the code far more readable, as there's no confusion exactly what kind of value is expected.

[1] https://www.simscale.com/blog/nasa-mars-climate-orbiter-metr...

db48x · on Aug 31, 2024

As much as I like Rust, I don’t think that it would have solved the Mars Climate Orbiter problem. That was caused by one party writing numbers out into a CSV file in one unit, and a different party reading the CSV file but assuming that the numbers were in a different unit. Both parties could have been using Rust, and using types to encode physical units, and the problem could still have happened.

westurner · on Aug 31, 2024

W3C CSVW supports per-column schema.

Serialize a dict containing a value with uncertainties and/or Pint (or astropy.units) and complex values to JSON, then read it from JSON back to the same types. Handle datetimes, complex values, and categoricals

"CSVW: CSV on the Web" https://github.com/jazzband/tablib/issues/305

7 Columnar metadata header rows for a CSV: column label, property URI, datatype, quantity/unit, accuracy, precision, significant figures https://wrdrd.github.io/docs/consulting/linkedreproducibilit...

CSV on the Web: A Primer > 6. Advanced Use > 6.1 How do you support units of measure? https://www.w3.org/TR/tabular-data-primer/#units-of-measure

You can specify units in a CSVW file with QUDT, an RDFS schema and vocabulary for Quantities, Units, Dimensions, and Types

Schema.org has StructuredValue and rdfs:subPropertyOf like QuantitativeValue and QuantitativeValueDistribution: https://schema.org/StructuredValue

There are linked data schema for units, and there are various in-code in-RAM typed primitive and compound type serialization libraries for various programming languages; but they're not integrated, so we're unable to share data with units between apps and fall back to CSVW.

There are zero-copy solutions for sharing variables between cells of e.g. a polyglot notebook with input cells written in more than one programing language without reshaping or serializing.

pessimizer · on Aug 31, 2024

Unless the unit was part of the type, and why wouldn't it be? A unit error is a type error. You don't have to say that something is 5 long, you can say that it's 5 inches long. And the function can be made only to accept a length in inches, and you can implement :Into<Inches> on your length type for an automagic conversion.

db48x · on Aug 31, 2024

Of course, but that doesn’t solve the problem. The problem here is that two separate organizations with completely different software were communicating via CSV file. There was a contractual obligation between them that required them to use specific units, but no technical collaboration. The problem was not in their software, but in the management and engineering discipline of the two groups.

nurettin · on Aug 31, 2024

That wouldn't be enough unless the csv somehow kept serialized column type in the header and the deserializer knew how to interpret it.

tialaramex · on Aug 31, 2024

Yeah, even an extremely powerful units type system like C++ mp-units doesn't fix "I assumed this number was feet and you assumed it was millimetres".

But both the mp-units and this lighter weight Rust approach would at least make you write that down in the software. There's some chance the mistake is realised, whereas when it's just an organisational assumption that just doesn't happen.

kwhitefoot · on Aug 31, 2024

> prevents you from accidentally passing the wrong kind of value into a function,

You can do this in other languages such as Pascal with subtypes and something functionally equivalent in C# and other object oriented languages.

0cf8612b2e1e · on Aug 31, 2024

I do not think the claim was that this is a unique construct to Rust.

the__alchemist · on Aug 31, 2024

IMO, this is one of the most underrated advantages of Rust over most (all I've used) alternatives. Mutation is explicit/principle-of-least-surprise.

layer8 · on Aug 31, 2024

> This is every codebase that doesn’t use a memory safe language.

Many of the same problems — the mentioned underdocumented API assumptions — exist in memory-safe languages like Java and C#. Memory safety is just one among many aspects in interface contracts, and is actually often one of the better documented ones in non-memory-safe languages.

Type-based checking is of course very beneficial, but TFA specifically focuses on the documentation aspect, not on the enforcement, which are in principle orthogonal to each other. TFA draws attention to the fact that the enforcement here also entails better self-documentation. However, it’s limited to whatever aspects the type system enforces.

misiek08 · on Aug 31, 2024

For project like this one - it would be cumbersome, but proper Java is also solving the problem. Currently we write code without all throws definitions, because those are handled by Spring-like monsters or other libraries, but having method definied with LockedEx, NullPointerEx etc. is even more friendly as doc generation. Funny, but I don’t like Java that much, but it’s one of those languages also solving documentation problems here.

Yes I know it’s "faster-development" to have type instead of additional definition attributes, but I think the problem would be also solved.

dagmx · on Aug 31, 2024

The posted link talks about how the types act as documentation, and in turn enforce it. I’d recommend reading the whole thread that someone else posted below.

layer8 · on Aug 31, 2024

I was criticizing your incorrect inference “memory-safe language => better-enforced API contract”. The only thing memory-safe languages do in that context is to remove a single aspect from the universe of things that can be ambiguous in an API, and they also don’t cause anything else to be enforced. (Rust does more, but that’s because of properties beyond mere memory-safety.) In other words, “memory-safe language” doesn’t mean what you want it to mean.

dagmx · on Aug 31, 2024

Jeez, there’s a whole comment above that explaining the context and caveating it with the needs of robust type system too.

If you’re pedantically holding on to one sentence as a gotcha, because I didn’t caveat everything, then fine. But seems a bit much to then say I’m forcing a definition just because you ignore the rest.

brigadier132 · on Aug 31, 2024

The way I like to think of it is as follows:

There is a space of all possible programs that can be written. Every time you add a restriction like the borrow checker, more expressive data types, or immutability, you are shrinking the space of all possible programs you can write. Sure, some of those programs are correct but 99.9999% repeating are incorrect.

tialaramex · on Aug 31, 2024

Right, we can and in some domains we absolutely should, prefer languages which are specialised to only be able to express programs which match some simple model of what we intended. You cannot invent a wholly new idea this way, but you often don't want new ideas.

WUFFS is an example of this, the WUFFS JPEG decoder decodes JPEGs, duh, but even if you deliberately sabotaged it, your sabotaged decoder just... decodes JPEGs wrong, maybe now they all decode to the Goatse image. It not only can't escape to execute arbitrary code, it can't do much of anything outside of the task we intended, all the resulting runtime bugs (for which you need unit tests) are things like oops the grass is blue, or the decoder rejects this 1x1 image even though it's valid. No "Segfault" no "Ooh it truncated the file instead of reading it", those aren't things we want, so no need to be able to express them.

For example I think a lot of big data work could use such a language. Data goes in, processing happens, data comes out. No "Oops, it wiped the hard disk on nodes 6, 18 and 24, I wonder why". No weird interactions with the OS that weren't catered for, just data in, processing, data out, which is what the people using these things always want.

helltone · on Aug 31, 2024

I didn't know rust was able to tell when a method is thread safe. Where can I read more about that/try it out?

dagmx · on Aug 31, 2024

It’s probably more correct if I say that data types themselves are thread safe or not.

https://doc.rust-lang.org/nomicon/send-and-sync.html

So therefore a function can’t do things that invalidate that type level contract.

remexre · on Aug 31, 2024

Send and Sync are the keywords to search for.

metaltyphoon · on Aug 31, 2024

Isn’t anything with Arc thread safe?

ninkendo · on Aug 31, 2024

No. Just because something’s behind an Arc<T> doesn’t mean two threads can hold it at once. Arc just ensures that it is freed (dropped) when no threads are using it any more. T itself has to be Send+Sync for Arc to be.

To a first approximation, Arc<Mutex<T>> is what you want when you want a blob of mutable stuff to just work between threads, but you still have to worry about terrible performance if you hold the mutexes too long, etc. (Not to mention mutexes are just slower in general than having types which are natively Send+Sync.)

jpc0 · on Aug 31, 2024

> Not to mention mutexes are just slower in general than having types which are natively Send+Sync

Without knowing the implementation, how would you know this statement is true?

I could just stick the Arc<Mutex<T>> within my abstraction and say it is send+sync and it would have the exact same performance, and that might be the most efficient solution possible on the given hardware.

ab5tract · on Aug 31, 2024

Telling when a thing is thread safe is hardly the same as telling when something isn’t.

aaomidi · on Aug 31, 2024

Race conditions within rust are technically impossible :)

jrpelkonen · on Aug 31, 2024

Safe Rust prevents data races, but not race conditions in general.

CryZe · on Aug 31, 2024

Though arguably, the fact that in order to mutate data from multiple threads, the compiler,forces you to use some synchronization primitive, results in you having to think about how you are going to synchronize access, which should lower the chance of race condition bugs overall.

tialaramex · on Aug 31, 2024

The analogy I find helpful for this is, you let the cat out your front door and close it. You walk to the back door and close that too.

A race condition (which safe Rust can have) is a normal phenemenon in our world. If we put the cat out the front door, then walk to the kitchen and close that door by the time we reached the back door maybe the cat had run around the outside of the building and come inside again, closing the doors in the wrong order introduced the opportunity for a race - be very careful.

A data race (which safe Rust does not have) is a weird thing caused by the mismatch between how you think the computer works and how it actually works. What happens if Alice tries the put the cat out the front door at the same time Bob tries to put it out the back door? Well, in the real world that cannot happen, either Alice has the cat or Bob does. But for many languages† this sort of nonsense can happen and when it does that's a disaster

† In safe Rust it can't happen, in Java, OCaml and Go it can happen, but it may not be a disaster (the details are different for each, it is a bug though).

irundebian · on Aug 31, 2024

Interesting analogy.

aaomidi · on Aug 31, 2024

That’s why I said inside rust. If you bring in external factors then of course it can/will happen.

sixfiveotwo · on Aug 31, 2024

Hmmm, no.

https://stackoverflow.com/questions/31912781/why-doesnt-the-...

devnull3 · on Aug 31, 2024

In the beginning, I was trying to use Rust like C (which I think lot of people do) and was struggling with the borrow checker. I had a light bulb in my head one day to read the function signature and see if its '&self' or '&mut self'.

If any one API of a struct/enum has '&mut self' then its instance cannot be shared across threads without using a mutex (which brings its own friend along i.e Arc). This led me to structure the code in my mind much in advance to prevent borrow checker issues.

Another use of this is while embedding an object in a struct. When I realized that '&mut self' infects everything up the call chain it was a big learning moment. That is the embedding struct also needs a mutable borrow as well.

chungy · on Aug 31, 2024

There are tons of underdocumented Rust libraries, where I've had to pore over it for very long periods of time because the author probably thought the code would speak for itself and not bother writing any documentation strings on any of the functions.

No, Rust doesn't solve the problem of incomplete API docs. Developers being diligent enough to document the APIs solves the problems, and that solution is entirely language-independent.

conradludgate · on Aug 31, 2024

I feel you misunderstood the thread if that's your take away.

The argument is that properly utilising a stronger type system is more self-documenting. And quite often to achieve safe abstractions, you have to lean into this paradigm more. It of course still requires discipline, but in my experience it works well in practice.

Since C has no generics/type templates, you cannot achieve the same level of type based abstractions, thus you cannot rely on the types semantics alone to suggest it's own use.

stefan_ · on Aug 31, 2024

More self-documenting for whom, the code generating documentation? Because for using it as a programmer I've found the code that has been Rust-ified to normalize everything into 1000 traits of which the given type you are using might implement 100 you can't see without going into more documentation to be a nightmare.

estebank · on Aug 31, 2024

I've seen crates like that: APIs with very complex trait bounds that require one to sit down with a cup of tea to pour over before you can understand what they do. I don't particularly like those. What APIs written that way enable is that you're forced to use them correctly, you have a bunch of type ceremony that you need to follow before the compiler will accept a simple method call. That means that the API is resistant to misuse. What you are asking for, reasonably, is for the opposite situation: give me enough information that if I'm accidentally misusing the API, which will be rejected, I will be able to understand what's missing so that I can use it correctly.

I'm personally of the opinion that crate author should provide explanations, examples, and take into account the kind of compiler errors people will get when they inevitably try to misuse the APIs.

db48x · on Aug 31, 2024

We're talking about C code that has refcounted objects, where you have to know ahead of time to add or remove a refcount before you call specific C functions. In Rust that would just be an Rc or an Arc, and the programmer would not need specific knowledge.

Or a C function that either returns an existing object, or a new object that is incompletely initialized. If it is an existing object, you are required to call a second function to finish the initialization. And you’re just supposed to know about that requirement; it’s not documented anywhere. On the Rust side they decided to return an Either<EmptyObject, InitializedObject> instead, so that the programmer can directly see what is going on.

They’re not saying that all Rust APIs are immediately easy to understand, only that they require less manually–written documentation because the constraints can be encoded into the type system.

Ygg2 · on Aug 31, 2024

Just use basic types?

Sure you can over engineer Rust, for dubious gains. But that's true for any language. *cough* MetaAbstractFactoryProxyFactoryFacadeBuilderProxySingletonWebViewFactory *cough*

> More self-documenting for whom

Anyone using basic Rust types.

Is type `Sync`? It's safe to share between threads.

Is type `Option` its value can be missing i.e. None.

Is type moved? Yes, if it isn't borrowed, and not Copy.

db48x · on Aug 31, 2024

Isn’t that supposed to be a Bean?

Certhas · on Sept 1, 2024

That is not the point of the author. They clarify on reddit:

https://www.reddit.com/r/programming/comments/1f5qbhv/rust_s...

> This isn't a great title for the submission. Rust doesn't solve incomplete/missing docs in general (that is still a major problem when it comes to things like how subsystems are engineered and designed, and how they're meant to be used, including rules and patterns that are not encodable in the Rust type system and not related to soundness but rather correctness in other ways). What I meant is that kernel docs are specifically very often (almost always) incomplete in ways that relate to lifetimes, safety, borrowing, object states, error handling, optionality, etc., and Rust solves that.

Blahah · on Aug 31, 2024

"pour over" -> "pore over" btw

chungy · on Aug 31, 2024

Thanks, I didn't know this. I suppose it's like writing "could of" instead of "could've" :)

ilyagr · on Aug 31, 2024

One great advantage of replacing comments with language constructs is that if the compiler checks the "docs", people don't forget to update it and can actually trust what's written down

bobajeff · on Aug 31, 2024

>But the end result of all this is that you CAN, in fact, just look a the Rust API and know how to use it correctly for the most part.

For this I have to bring up that there are indeed rust API's that I have to Google around a bit because I have no way of knowing just by looking at the function signature how to create the type that it expects. Because some types you can't just make from scratch and have to use a combination of other functions to create.

pornel · on Aug 31, 2024

It's still an improvement, because you can find what makes that MysteryObject and you know how to pass it to a function that takes a &MysteryObject or Rc<MysteryObject> or other variation.

In C, it's all MysteryObject* and you have no clue if that is a Box, Rc/Arc, &mut, Cow, MutexGuard<MysteryObject>, maybe a slice of &[MysteryObject].

And if you're unlucky you get functions taking or returning void*.

bobajeff · on Aug 31, 2024

Well sometimes it's not actually MysteryObject. It's actually some special type like SpecificCaseMysteryObject that's been unified inside of MysteryObject but only MysteryObject is in the signature.

tafia · on Sept 4, 2024

I usually go to docs.rs (or quickly generate it on my pc) then I just search for the functions which returns that object (not a reference of it) or Self.

bobbylarrybobby · on Aug 31, 2024

I've seen this in Haskell too. “The type signature is self documenting!” Unless you're talking about the most trivial functions, that's almost never the case.

grimgrin · on Aug 31, 2024

IDK if this is bad practice but I've rolled up the messages if you wanna read this in one view:

https://gist.github.com/shmup/d6290988c876c303663d56563bc570...

faitswulff · on Aug 31, 2024

You can also click the eye icon in the top right to auto expand all the comments in the thread.

KTibow · on Aug 31, 2024

It's a weird icon choice, I would've used something like `expand-all`

coder_san · on Aug 31, 2024

Thank you. That made it easier to read.

bobajeff · on Aug 31, 2024

Thanks, it's impossible to read if you're not logged in to x.

faitswulff · on Aug 31, 2024

It's a Mastodon post

bobajeff · on Aug 31, 2024

Oops. My mistake. Also it turns out I just had to click on the little expand button on the posts to read them.

cesarb · on Aug 31, 2024

That is not Twitter, vt.social is a Mastodon instance.

raggi · on Aug 31, 2024

You can also state this without the Rust involvement: adding a duplicate of the APIs in any other dialect, going through that exercise pushes implementors to reach clarity and understanding that is more detailed and complete than any incremental code review, and you can benefit greatly if you utilize that feedback in a timely fashion to improve the code and/or docs. There's an interesting parallel here to one of the common rust feedback tropes: until rust has two implementations and a specification it's not ready; this is interesting to reflect on in context.

This also came up in the video referenced in the step-down announcement, specifically just before Ted jumped in all hot, they were in the middle of about to propose to take some of what had been learned and perhaps rename or even document the C side once it was understood. Amusingly/disappointingly that suggestion which was coming was also counter to what Ted assumed they were saying.

A recent personal experience example down this same path happened in the GSO paths in the network stack, where a well intentioned incremental linux change made its way through review, landed not only in head but also into stable and LTS before anyone noticed that it was broken. It had actually broken kselftests (noted by a maintainer after the report it was broken), but kselftests is quite a mess to use on any given day, so it's hard to "spot" regressions and demonstrably isn't used as a tool to guard against regressions consistently. Reviewers missed the semantics, and i wouldn't blame them because it turns out the semantics in question aren't explicitly or clearly written down anywhere. Last year we wrote against this interface at work, and gained substantial performance improvements, but to do so we had to read the docs, read the source, read the review threads, make assumptions, try stuff, and ultimately settle on the observed semantics more than anything else. This is a userspace interface, so it should be well defined and stable following the mantra, and it's mostly ossified now due to having real users, but it's not well documented or well understood. We follow along some of our fellow offload implementors and have seen a number struggle trying to work solely off of the docs and looking at our implementation, as they miss minor semantics - you can only get those by reading the source and patches very closely. The point of this story is that these problems of understandability cause regressions inside the kernel too, all the time. Reaching for tools to avoid this is a good idea.

oconnor663 · on Aug 31, 2024

> Does a ref counted arg transfer the ref or does it take its own ref?

This sort of thing also comes up a lot when you write Python extensions in C. You have to know the "calling convention" when you pass object refs in and out of functions. This entire category of problems goes away when you write extensions in Rust with PyO3, and the barrier to entry for new folks is much lower. C++ can do similar things (std::shared_ptr), but it tends to introduce new footguns in a way that Rust really doesn't.

yencabulator · on Aug 31, 2024

Concrete example for those not familiar with Rust:

In most languages, you have a lock on the side that you need to remember to take to protect some data. In Rust, the lock wraps the guarded data, you can't access the data without holding the lock, you can't forget to unlock, and you can't keep a reference alive while releasing the lock.

https://doc.rust-lang.org/std/sync/struct.Mutex.html

https://doc.rust-lang.org/std/sync/struct.MutexGuard.html

irundebian · on Aug 31, 2024

These people are heros and do awesome work. Really appreciate it!

edfletcher_t137 · on Aug 31, 2024

Another perspective which makes it clear this is The Way: it brings the promise of fully self-documenting code much closer to fruition, if not all the way!

The_Colonel · on Aug 31, 2024

It's a false promise I stopped believing many years ago. No type system will replace docs or even comments. No type signature will give you the why, the big picture or the context. The day I stopped believing in the possibility of "self-documenting code" was the day I started writing more readable code.

edfletcher_t137 · on Aug 31, 2024

> The day I stopped believing in the possibility of "self-documenting code" was the day I started writing more readable code.

"More readable" is self-documenting! Doesn't matter how you get there (though it's a bummer you've lost faith), just matters that you did.

> No type system will replace docs or even comments. No type signature will give you the why, the big picture or the context.

Types are just one small part of the picture: did you read even the first full message, much less the full post? Are you aware of the other non-type-related benefits Rust provides?

OP helpfully gave an illustrative list I'll reproduce here for you since I'm not sure you saw it:

> When a callback is called are any locks held or do you need to acquire your own? What about free callbacks, are they special? What's the intended locking order? Are there special cases where some operations might take locks in some cases but not others?

> Is a NULL argument allowed and valid usage, or not? What happens to reference counts in the error case? Is a returned ref counted pointer already incremented, or is it an implied borrow from a reference owned by a passed argument?

> Is the return value always a valid pointer? Can it be NULL? Or maybe it's an ERR_PTR? Maybe both? What about pointers returned via indirect arguments, are those cleared to NULL on error or left alone? Is it valid to pass a NULL * if you don't need that return pointer?

Only a few of those are concerned with type safety. You really should read the whole thing.

The_Colonel · on Sept 3, 2024

> Doesn't matter how you get there (though it's a bummer you've lost faith), just matters that you did.

What bothers me is exactly this attitude that all you need is Rust:

> The solution is called Rust. Encode all the rules in the code and type system once, and never have to worry about them again.

My point is that you can encode all important information in a formal language. It does not solve the documentation problems, it can mitigate them. That's not nitpicking, because I've met plenty of people thinking the former and their code is usually difficult to read as a consequence.