Rewrite Tor in Rust (torproject.org)
94 points by Jonhoo on May 3, 2016 | 27 comments


The OP is likely nothing more than wishful thinking, but I can see why people are intrigued by the prospect. An important thing to note is that it's entirely possible to perform a slow, gradual integration of Rust code into a C/C++ codebase (as Mozilla is doing with Firefox), but only if your codebase is sufficiently modular, with well-defined interfaces for Rust-based libs to slot into. If all you've got is a big ball of mud, then your first step is to work on the modularization of your codebase (which, IMO, has a good chance of increasing your code quality regardless of whether you eventually decide to introduce any Rust).


>If all you've got is a big ball of mud, then your first step is to work on the modularization of your codebase (which, IMO, has a good chance of increasing your code quality regardless of whether you eventually decide to introduce any Rust).

I wanted to up-vote that bit.


Since Rust and C can agree on ABI and memory layout issues, you can move the code over function-by-function if you were so inclined, right? Declare the structs in the headers and auto-gen the Rust versions (is there a tool for the inverse?).
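
A minimal sketch of the layout side of this, assuming a hypothetical C declaration `struct header { uint32_t id; uint8_t kind; };`. The struct, its fields, and `header_is_kind` are made up for illustration and are not taken from Tor. `#[repr(C)]` asks Rust to lay the struct out the way the platform's C compiler would, which is what lets a single function be moved over without touching its callers:

```rust
use std::mem::{align_of, size_of};

/// Mirrors the hypothetical C declaration
/// `struct header { uint32_t id; uint8_t kind; };`
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Header {
    pub id: u32,
    pub kind: u8,
}

/// Hypothetical helper with a C-compatible signature.
/// Returns false for a null pointer.
pub unsafe extern "C" fn header_is_kind(h: *const Header, kind: u8) -> bool {
    // SAFETY: the caller promises `h` is either null or points to a valid Header.
    match unsafe { h.as_ref() } {
        Some(header) => header.kind == kind,
        None => false,
    }
}

/// Size and alignment of `Header` as Rust sees them, for comparison
/// with `sizeof` and `_Alignof` on the C side.
pub fn header_layout() -> (usize, usize) {
    (size_of::<Header>(), align_of::<Header>())
}
```

To actually link this from C, the function would also need an unmangled name (see the `no_mangle` discussion further down the thread).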


You can, but idiomatic Rust code looks pretty different from idiomatic C code - to give a concrete example,

    int get_a_string(char **out) // returns error code
could be translated literally as

    fn get_a_string(out: *mut *mut c_char) -> c_int
but much preferable would be

    fn get_a_string() -> Result<String, Errcode>
[where Result and String are standard library types; Errcode would be a custom enum.]

I suppose you could go from the first to the second incrementally, then refactor to the third. But I'm not sure it'd work well.
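
A rough sketch of that incremental path, under some loud assumptions: `get_a_string_raw` is a stand-in for the C function, written in Rust here only so the example is self-contained, and the `Errcode` variants are invented. The idea is to keep the literal signature at the boundary and put the idiomatic one on top as a thin wrapper:

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_int};

/// Hypothetical error type; the variants are made up for illustration.
#[derive(Debug, PartialEq)]
pub enum Errcode {
    Failed(c_int),
    NotUtf8,
}

/// Stand-in for the C function (assumption: it stores a heap-allocated,
/// NUL-terminated string in `*out` and returns 0 on success).
unsafe extern "C" fn get_a_string_raw(out: *mut *mut c_char) -> c_int {
    if out.is_null() {
        return -1;
    }
    match CString::new("hello") {
        Ok(s) => {
            // SAFETY: `out` was checked for null above.
            unsafe { *out = s.into_raw() };
            0
        }
        Err(_) => -1,
    }
}

/// Idiomatic wrapper: no out-parameter, no integer error code.
pub fn get_a_string() -> Result<String, Errcode> {
    let mut raw: *mut c_char = std::ptr::null_mut();
    // SAFETY: `raw` is a valid location for the callee to write a pointer into.
    let rc = unsafe { get_a_string_raw(&mut raw) };
    if rc != 0 {
        return Err(Errcode::Failed(rc));
    }
    // SAFETY: the stand-in allocated the buffer with CString::into_raw, so
    // CString::from_raw may reclaim it. A buffer allocated by real C code
    // would have to be copied and released with the C side's own free function.
    let owned = unsafe { CString::from_raw(raw) };
    owned.into_string().map_err(|_| Errcode::NotUtf8)
}
```

Callers written in Rust only ever see `get_a_string()`, so the raw function can later be replaced by a native implementation without changing them.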


Is there a solution yet in Rust to the problem of having monadic-ish constructs like Result and Option, but no nice way to chain them together without hideous nesting, like Haskell's "do" or Scala's "for"?


In the special case of Result, the new ? operator is going to subsume try! and offer more flexibility:

https://github.com/rust-lang/rfcs/pull/243 (note that what was actually accepted differs from the text of the RFC; you need to scroll to the bottom)

As steveklabnik says, there's nothing resembling generalized monadic do notation for now.
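
For a feel of the difference, here is a small sketch comparing explicit nesting with `?`. The function names are made up for the example, and it assumes a toolchain where `?` is available (it was stabilized in Rust 1.13, after this thread was written):

```rust
use std::num::ParseIntError;

/// Explicit matching: every fallible step adds a level of nesting.
pub fn sum_nested(a: &str, b: &str) -> Result<i32, ParseIntError> {
    match a.trim().parse::<i32>() {
        Ok(x) => match b.trim().parse::<i32>() {
            Ok(y) => Ok(x + y),
            Err(e) => Err(e),
        },
        Err(e) => Err(e),
    }
}

/// Same logic with `?`: an Err returns early, an Ok is unwrapped in place.
pub fn sum_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}
```

Both functions return the same values for the same inputs; only the shape of the code differs.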


We don't have HKT and therefore we can't have real do notation. The mdo crate provides a duck-typed macro.

.and_then calls themselves aren't too bad, with the right whitespace.
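
Something like the following is what I have in mind, as a sketch only. `parse_port` and the "unprivileged port" rule are invented for the example:

```rust
/// Extracts the port from a "host:port" string, keeping it only if it
/// is an unprivileged port (1024 or above). Each step may produce None,
/// and `and_then` short-circuits the rest of the chain when it does.
pub fn parse_port(addr: &str) -> Option<u16> {
    addr.split(':')
        .nth(1)
        .and_then(|p| p.parse::<u16>().ok())
        .and_then(|p| if p >= 1024 { Some(p) } else { None })
}
```

With one call per line the chain reads top to bottom, and there is no nesting even though three steps can fail.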


This presumes that functions are well-defined boundaries in your C codebase, which may not be true if one is leaning heavily on global state.


You end up having to toss a layer of C around your C++, last time I checked. Does C++ even have an ABI yet?


C never had one standard ABI, at least not one that I can find in the ANSI C document.

What happens is that C ABI == OS ABI, when the OS is written in C. This is not the case in the mainframe OSes that are still alive, for example.

So people have come to expect C ABI as being some kind of standard.

On an OS written in pure C++, the OS vendor's C++ compiler would define the ABI.

Having said this, there are efforts to partially standardize the C++ ABI:

https://isocpp.org/blog/2014/05/n4028

Also, many vendors use the Itanium C++ ABI as a reference.


Note that the standardization effort is not what most people expect. It really is just a call for:

a) each platform to document its C++ ABI.
b) each platform to offer an ABI-stable variant of the standard library as an option.

Both points are already the norm and the default on many platforms. For example, GCC on most OSes follows the documented Itanium ABI, while libstdc++ has been ABI stable for a long time, at least on Linux.

The notable exception is Windows and MSVC. While the C ABI is documented, I believe that the C++ ABI is pretty much "whatever MSVC does" and had to be reverse engineered. Additionally, MSVC reserves the right to break its library ABI at every major release (and in fact it does).


Sure, hence why I said partially.

However, being mainly a .NET/JVM developer nowadays, I just follow C++ standardization from the sidelines.

Thanks for the correction.


You can call into Rust directly from C++ (with a header file).

But yeah, the reverse requires a C layer. rust-bindgen has experimental C++ support, which we use for SpiderMonkey, but it isn't perfect.


> You can call into Rust directly from C++ (with a header file).

Can you tell me more about this?


This is C rather than C++, and I tossed it together VERY quickly, but:

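        $ cat hello.rs
        use std::os::raw::c_int;

        // Export an unmangled symbol with the C calling convention.
        // c_int matches the `int` in the C declaration below.
        #[no_mangle]
        pub extern "C" fn plus_one(x: c_int) -> c_int {
            x + 1
        }
        $ cat hello.c
        #include <stdio.h>

        /* Implemented in hello.rs */
        extern int plus_one(int);

        int main(void) {
                int result = plus_one(5);

                printf("result is %d\n", result);

                return 0;
        }
        $ rustc hello.rs --crate-type=dylib
        $ gcc hello.c -lhello -L .
        $ LD_LIBRARY_PATH=. ./a.out
        result is 6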


I get the calling into Rust from C via no_mangle + extern. I was looking for the header-only C++ to Rust path.


It's the same way you call into C from C++. When Rust exposes a no_mangle extern "C" function, to the rest of the universe it's just a C function. Your C++ header just needs to mark the fn as extern "C" and you're done; you can call it as you would call any other C/C++ function.
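
Roughly, the Rust side looks like this, with the matching C++ declaration shown only as a comment. This is a sketch: it uses the `#[unsafe(no_mangle)]` spelling that recent toolchains accept (Rust 1.82 and later), while older ones spell it `#[no_mangle]`.

```rust
use std::os::raw::c_int;

/// Exported with an unmangled name and the C calling convention.
///
/// The C++ header would declare it as:
///
///     extern "C" int plus_one(int x);
///
/// after which C++ code calls `plus_one(5)` like any other function.
#[unsafe(no_mangle)]
pub extern "C" fn plus_one(x: c_int) -> c_int {
    // wrapping_add avoids a Rust panic unwinding across the FFI
    // boundary if the caller passes INT_MAX.
    x.wrapping_add(1)
}
```

There is nothing C++-specific on the Rust side; the `extern "C"` in the header is what tells the C++ compiler not to apply its own name mangling to the declaration.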


Oh, I thought there was a magical way to call from C++ into Rust using only a C++ header w/o no_mangle/extern.


Hm, I might have misunderstood either you or the OP, then. A C++ header would include the same extern declaration, though it would mark it with the C ABI as well.


Well, not a standard one. That's what extern "C" is for: if you want a standardized ABI, C always had one.


Neither C nor C++ has a standardized ABI, if by standardized you mean part of their respective ISO standards. Nor could they, as many ABI properties (names, calling conventions, layout, object and executable file formats) are intimately platform-specific (as in OS and CPU).

Most platforms officially specify their C ABI, and many also specify their C++ ABI (usually in terms of the C one). The Itanium C++ ABI (plus the platform-specific psABIs) has become the de facto C++ ABI for many Unix and Unix-like platforms.


As a side note, Galois have written an implementation of Tor [1] which happily interacts with the official Tor, for use in HalVM-based unikernels. So, if you want ephemeral Tor nodes, HalVM is a pretty great way to get them working.

Edit: Video about it [2]

[1] https://github.com/GaloisInc/haskell-tor
[2] https://www.youtube.com/watch?v=oHcHTFleNtg


Is there an anti-pattern for this kind of thing? It's a two-line bug report suggesting years of development effort. There is a huge multiplication factor between the time invested by the bug reporter and the time required to do what he asks.


Only if your goal is to doom the existing Tor project. Lots of great research projects start this way. Oftentimes they influence the main project or get merged into it.


What good would come of this, to be honest?


The discussion is 10 months old and never went anywhere. What makes this relevant now?


It would be a bit embarrassing to rewrite in Rust and then have the network penetrated because they screwed up the crypto during the rewrite...



