Hacker News

A weakly stationary series has no trend, no seasonality, and no changes in variance. With these properties, predictions are independent of the absolute point in time. The transformations that turn non-stationary series into stationary ones (usually differencing, log transformations, and the like) are reversible, so the predictions can be mapped back onto the original time series.
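A minimal sketch of that reversibility, on a hypothetical multiplicative series (the series and its parameters are made up for illustration): the log stabilises the variance, the first difference removes the trend, and composing the inverses recovers the original values exactly.

```python
import numpy as np

# Hypothetical series: exponential trend times multiplicative noise.
rng = np.random.default_rng(42)
y = np.exp(0.01 * np.arange(100)) * rng.lognormal(0.0, 0.05, 100)

# Log-transform stabilises the variance; first difference removes the trend.
z = np.diff(np.log(y))

# Both steps are reversible: cumulative sum undoes the difference,
# exp undoes the log, so forecasts of z map back to the original scale.
y_back = np.exp(np.log(y[0]) + np.concatenate([[0.0], np.cumsum(z)]))
assert np.allclose(y_back, y)
```

A model would forecast in the stationary z-space; the same inverse chain then carries those forecasts back to the original scale.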

Treating time as a first-class component really just means factoring the absolute point in time into the model at training time. This only makes sense if the absolute time changes properties of the distribution that cannot be accounted for with regular transformations. If that's the case, then we assume these changes cannot be modeled, and are thus either random or follow a complicated systematic pattern we can't grasp. In the first case, an NN wouldn't improve things either; in the second case, we either need to always use the full history of the time series to make a prediction, or hope that a complex NN such as an LSTM might capture the pattern.
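To make "factoring the absolute point in time into the model" concrete, here is one hedged sketch: the raw time index enters the design matrix directly as a regressor, alongside a seasonal lag, instead of being removed by a stationarising transform. The series, the lag of 12, and the least-squares model are all hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical series: linear trend plus yearly seasonality plus noise.
rng = np.random.default_rng(0)
t = np.arange(200)
y = 0.05 * t + np.sin(2 * np.pi * t / 12) + rng.normal(0.0, 0.1, 200)

# Absolute time index t as a first-class feature, next to a seasonal lag.
X = np.column_stack([np.ones_like(t[12:]), t[12:], y[:-12]])
coef, *_ = np.linalg.lstsq(X, y[12:], rcond=None)

resid = y[12:] - X @ coef  # small residuals: time-as-feature fits this series
```

In this toy case the trend could just as well have been differenced away; the feature approach only earns its keep when the dependence on absolute time is not removable by such a transformation.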

In any case, I think one of the more compelling reasons to use NNs is to avoid doing preprocessing. The trade-off is that you end up with a complicated solution compared to the six or so easy-to-understand parameters a SARIMA model might give you. And the latter might even give you some interpretable intuition for the behavior of the process.


