I’m doing a bunch of AI stuff that needs compiling to try various unrelated apps. I’m making a mess of config files and extras. I’ve been using distrobox and conda. How could I do this better? Chroot? Different user logins for extra home directories? Groups? Most of the packages need access to CUDA and localhost. I would like to keep them out of my main home directory.
I did Linux From Scratch recently and they have a brilliant solution. Here’s the full text but it’s a long read so I’ll briefly explain it. https://www.linuxfromscratch.org/hints/downloads/files/more_control_and_pkg_man.txt
Basically you make a new user with the name of the package you want to install. Log in as that user, then compile and install the package.
Now when you search for files owned by the user with the same name as the package you will find every file that package installed.
You can document that somewhere or just use the find command when you are ready to remove all files related to the package.
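The "search for files owned by the package user" step can be sketched in Python, roughly what find with -user does. This is only a minimal sketch: the function names are made up for illustration, and it assumes each package really was installed by its own dedicated user as the hint describes.

```python
import os
import pwd
from pathlib import Path
from typing import Iterator, List


def files_owned_by(root: str, uid: int) -> Iterator[Path]:
    """Yield every path under `root` owned by `uid` (symlinks are not followed)."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            try:
                if path.lstat().st_uid == uid:
                    yield path
            except OSError:
                # The file vanished or cannot be inspected; skip it.
                continue


def files_for_package_user(root: str, username: str) -> List[Path]:
    """Resolve a (hypothetical) package user by name and list what it owns."""
    uid = pwd.getpwnam(username).pw_uid  # raises KeyError if there is no such user
    return sorted(files_owned_by(root, uid))
```

Saving the result of files_for_package_user("/usr/local", "somepackage") to a text file gives you the manifest to review before removing anything; "somepackage" is a placeholder user name.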
I didn’t actually do this for my own LFS build so I have no further experience on the matter. I think it will eventually lead to dependency hell when two packages want to install the same file.
I guess flatpaks are better about keeping libraries separate but I’m not sure if they leave random files all over your hard drive the way apt remove/apt purge does. (Getting really annoyed about all the crud left in my home dir)
That’s clever. It should work on any system, shouldn’t it?
Any POSIX compliant system as far as I know.
Thanks. I’ll keep that in mind for next time.
deleted by creator
Thanks for the info! I’m gonna look into flatpak.
I built nodejs from source yesterday and it took forever. I’d definitely prefer something huge like that in a flatpak.
Thanks for the read. This is what I was thinking about trying but hadn’t quite fleshed out yet. It is right on the edge of where I’m at in my learning curve. Perfect timing, thanks.
Do you have any advice when the packages are mostly python based instead of makefiles?
for python, a bunch of venvs should do it
This method should work with any command that’s installing files on your disk but it’s probably not worth the headache when virtual environments exist for python.
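A minimal sketch of the "bunch of venvs" idea using only the standard library's venv module. The function name and the one-directory-per-project layout are assumptions for illustration, not something from the thread.

```python
import venv
from pathlib import Path


def make_project_env(base_dir: str, project: str, with_pip: bool = True) -> Path:
    """Create an isolated virtual environment at base_dir/project and return its path."""
    env_dir = Path(base_dir) / project
    builder = venv.EnvBuilder(
        with_pip=with_pip,           # bootstrap pip into the new environment
        system_site_packages=False,  # do not expose the system's site-packages
        clear=False,                 # keep existing contents if the directory exists
    )
    builder.create(str(env_dir))
    return env_dir
```

Keeping every environment under one parent directory (say, a hypothetical ~/ai-envs) means removing a tool's Python packages is a single directory delete. It does not capture files the tool writes elsewhere in your home directory.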
Python, in these instances, is being used as the installer script. As far as I can tell it involves all of the same packaging and directory issues as what make is doing. Like, most of the packages have a Python startup script that takes a text file and installs everything from it. This usually includes a pip git+address or two. So far, just getting my feet wet to try out AI has been enough for me to overlook what all is happening behind the curtain. The machine is behind an external whitelist firewall all by itself. I am just starting to get to the point where I want to dial everything in so I know exactly what is happening.
I’ve noticed a few oddball times during installations pip said something like “package unavailable; reverting to base system.” This was while it was inside conda, which itself is inside a distrobox container. I’m not sure what “base system” it might be referring to here or if this is something normal. I am probing for any potential gotchas revolving around python and containers. I imagine it is still just a matter of reading a lot of code in the installation path.
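This won't explain that particular pip message, but when environments are nested (venv inside conda inside a container) it can help to ask the interpreter which environment it believes it is running in. A small diagnostic sketch; the function name is made up, while CONDA_PREFIX and VIRTUAL_ENV are the variables conda and venv activation scripts normally set.

```python
import os
import sys


def describe_python_env() -> dict:
    """Report which interpreter and environment this Python process is using."""
    return {
        "executable": sys.executable,               # the interpreter binary in use
        "prefix": sys.prefix,                       # root of the active environment
        "base_prefix": sys.base_prefix,             # root of the underlying installation
        "in_venv": sys.prefix != sys.base_prefix,   # True inside a venv
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        "virtual_env": os.environ.get("VIRTUAL_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_python_env().items():
        print(f"{key}: {value}")
```

Running it with the same python that pip uses (python -m pip rather than a bare pip) shows whether packages are going where you expect.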
I hope someone who has more info comes along. It might be time for you to make a new post though since we’re getting to the heart of the problem now.
Also it will be a lot easier for people to diagnose if you are specific about which programs you are failing to install.
I’ve only experimented with Python in docker and it gave me a lot of headaches.
That’s why I prefer to pip install things inside venvs because I can just tar them myself and have decent portability.
But since you’re installing files across the system I’m not sure what the best solution is.
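The "tar the venv myself" step could look something like this. It is a sketch with a made-up function name. One caveat to assume: the scripts inside a venv embed the absolute path it was created at, so an archive like this is most reliable when unpacked to the same location.

```python
import tarfile
from pathlib import Path


def archive_env(env_dir: str, archive_path: str) -> Path:
    """Pack a virtual environment directory into a gzip-compressed tarball."""
    env = Path(env_dir)
    out = Path(archive_path)
    with tarfile.open(str(out), "w:gz") as tar:
        # Store paths relative to the environment's own directory name.
        tar.add(str(env), arcname=env.name)
    return out
```

Write the archive somewhere outside env_dir, otherwise the tarball would try to include itself.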
Nix
NixOS containers could do what OP’s asking for, but it’ll be trickier with just nix (on other distro). It’ll handle build dependencies and such, but you’ll still need to keep your home or other directories clean some other way.
OP could use flakes to create these dev environments and clean them up without a trace once done.
Any files created by programs running in the dev environments will remain.
nix-collect-garbage does NOT delete any files that were written to, for example, ~/.local or ~/.config from a dev shell.
One of OP’s problems was: “I’m making a mess of config files and extras.”
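One hedged way to keep those stray dotfiles contained, whatever tool provides the build environment, is to launch the program with HOME and the XDG base-directory variables pointed at a throwaway directory. A minimal sketch; the function name and directory layout are assumptions. It only affects programs that honour these variables, and it is not a security boundary.

```python
import os
import subprocess
from pathlib import Path
from typing import Sequence


def run_with_home(cmd: Sequence[str], home: str) -> int:
    """Run `cmd` with HOME and the XDG directories redirected into `home`."""
    home_dir = Path(home)
    home_dir.mkdir(parents=True, exist_ok=True)

    env = dict(os.environ)
    env["HOME"] = str(home_dir)
    env["XDG_CONFIG_HOME"] = str(home_dir / ".config")
    env["XDG_DATA_HOME"] = str(home_dir / ".local" / "share")
    env["XDG_CACHE_HOME"] = str(home_dir / ".cache")

    # check=False: return the exit code and let the caller decide what to do.
    return subprocess.run(list(cmd), env=env, check=False).returncode
```

Deleting the throwaway directory afterwards removes whatever the program wrote under ~/.config or ~/.local, provided it looked those locations up through the environment.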
I use a mixture of systemd-nspawn and different user logins. This is sufficient for experimentation, for actual use I try to package (makepkg) those tools to have them organized by my package manager.
Also LVM thinpools with snapshots are a great tool. You can mount a dedicated LV to each single user home to keep everything separated.
Qubes: you can install software inside of its own disposable VM. Or it can be a persistent VM where only the data in home persists. Or it can be a VM where the root persists. You have a ton of control. And it’s really useful to see what’s changed in the system.
All the other solutions here work inside the operating system; Qubes does it from outside the operating system.
I use Gentoo where builds from source are supported by the package manager. ;)
Overall though, any containerisation option such as Docker / Podman or Singularity is what I would typically do to put things in boxes.
For semi-persistent envs a chroot is fine, and I have a nice Gentoo-specific chroot script that makes my life easier when reproing bugs or testing software.
Wait. Does emerge support building packages natively when they are not from Gentoo?
Most of the stuff I’m messing with is mixed repos with entire projects that include binaries for the LLMs, weights, and such. Most of the “build” is just setting up the python environment with the right dependency versions for each tool. The main issues are the tools and libraries like transformers, pytorch, and anything that interacts with CUDA. These get placed all over the file system for each build.
Ebuilds (Gentoo packages) are trivial to create for almost anything, so while the answer is ‘no the package manager doesn’t manage non PM packages’, typically you’ll make an ebuild (or two or three) to handle that because it’s (typically) as easy as running make yourself. :)
For “desktop” stuff (gaming, office etc.) I just install bare-metal, for “server” stuff I basically only look for containerisation in the form of Podman (Docker compatible). If it doesn’t exist as a compose file it isn’t worth my time.
deleted by creator
I have read up on it some, but Fedora does UEFI, secure boot, and a self compiling Nvidia driver that gets built for each kernel update so well that I hesitate to leave. I tried installing the Nix package manager on Fedora, but having a user owned directory folder mounted in root is the ugliest thing I’ve ever seen and I immediately removed it.
Secure Boot can use your own keys, which any distro can do regardless. Nix essentially (simplified) rebuilds your whole rootfs every time you do a config change so that every change is fully reproducible.
having a user owned directory folder mounted in root
Do you mean the /nix directory? NixOS doesn’t use the same FHS as most distros, so of course it would save its own data somewhere on root, since your actual rootfs is rebuilt declaratively when you request it. IMO it’s really elegant, and it’s hard to go back to files strewn all over the place and the headache of trying to have multiple versions of the same library installed.
Unfortunately, the UEFI on my laptop doesn’t allow custom keys. I can disable secure boot. I can make and place custom keys, but it never switches over from the initial unprotected state to whatever they call that transition state after the new custom PK key is added. Once all the custom keys are configured and I try to reinstate secure boot in the BIOS, it flushes the custom keys and recreates a new set automatically using the secret Trusted Platform Module key(s) built into the hardware.
Since my initial failure trying to add custom keys, I’ve come across Sakaki’s old UEFI guide on Gentoo and noted the possibility of maybe installing keys with the EFI KeyTool to boot into EFI directly, but I have not tried it.
Do you mean the /nix directory?
Yeah. I wanted to try a flake I came across for an AI app I was having trouble compiling on my own. The flake was set up for a Linux/Windows subsystem though. I tried to install the Nix package manager as a single user because I don’t want some extra daemon running or anything that has such an elaborate uninstall as the multiuser Nix lists. At least it is too much to deal with for a short term goal of installing a single app. After installing the single user Nix pm, the flake I was trying to use wasn’t listed in the Nix repo and reconfiguring what was already set up looked like a waste of time.
In general I would rather have my entire root directory locked down. I don’t really know the real world implications of having a user owned directory in my root file system. It just struck me as too strange to overlook, and it is far too deep into a rabbit hole for my goal that had proved fruitless already. I searched for several of the tools I’ve had to compile on my own, and none of them were listed in Nix. There are a couple on the AUR but no distro seems to do FOSS AI yet.
I’ve already been burned by running Arch natively years ago. I dropped it and installed Gentoo which I ran for a few months before switching to Silverblue because I didn’t really have the scripting skills to make Gentoo work for me at the time. I’m very wary of any elitist rhetoric about any distro now. When I see stuff like ‘Nix is a language you just learn/only for power users,’ I have flashbacks of dozens of tabs open in the arch wiki, back when I learned what a fractal link chasm is, and all those times I had to actually use backups to function with Arch; the only time I’ve ever needed to restore backups in my life. At this point, I think I’m on the edge of transitioning from an intermediate to “power user” after ten years on Linux exclusively, but I know there is a lot of stuff I do not grasp yet. An operating system shouldn’t be a project I need to actively manage, maintain, or stop what I am working on for a random tangential deep dive just to use. I can’t tell what Nix is like in practice. The oddity of the package manager does not inspire confidence, but I’m admittedly skeptical by default. I see how dependencies are handled better in Nix in some ways, but I do not have infinite storage for a bunch of redundant copies of everything just for every obscure package on my base system. I’m not clear about how Nix does configs and dot files “better” than my present situation. Given the lack of a deep dive into UEFI security and of details about the ability of Nix to coexist in a system with a separate Windows drive (because my laptop has a few configuration elements only available in Windows), I haven’t tried NixOS.
No worries, UEFI is definitely a “mileage may vary” kinda standard.
I’ve personally only used NixOS and not Nix on a different distro by itself so I’m less familiar with that setup. No system is perfect for all use cases, but that’s sorta the point in Unix-land. I personally have been gripped by NixOS and having to go back to Fedora for some of my old servers has been a pain. They use it like a buzzword all the time, but declarative administration is so awesome. It does have a heck of a learning curve though.
Is the Nix learning curve like Arch’s f.u. user-go cry to rsync/CS Masters expectations, or like Gentoo’s tl;dr approach of “our packagers know how to create sane defaults and include key info you need for sound decisions”? I never want to deal with another distro that randomly dumps me into an enormous subject to read because they made a change in a dependency that requires me to manually intervene in a system update, or any OS that makes basic FOSS tools like gimp and FreeCAD tedious.
In my experience it’s mostly sane defaults and a mixed bag in terms of documentation. For anyone else reading this, searching https://search.nixos.org/options for the built-in options is usually a good enough starting point for installing something.
Nix does dependencies very differently, since every program and everything it needs are put into their own checksummed directory, then linked into your PATH as requested in your config. So far I’ve never needed to do anything other than nixos-rebuild --upgrade switch and only needed to reboot for kernel updates.
I mostly work in container spaces, so building things from source, or out-of-repo pkgs, while rare, is done in containers with podman. For example, running Automatic1111’s stable diffusion works perfectly for me in a container, with an AMD GPU no less. Eventually I’d like to get into flakes, but they’re still marked experimental so I haven’t looked too much into it.
Overall the learning experience is figuring out the overall structure of the system, then taking advantage of all the super powerful tooling and consistency those tools offer.
I think Podman should do a good job but I never used it myself. Distrobox is built on it and a lot easier to use, so that’s what I would recommend!
Not sure if that’s a good idea but if you use Fedora, you also have your root on a BTRFS partition after a default installation. You could utilize the snapshot features of BTRFS to roll back after testing.
I need to explore this BTRFS feature, I just don’t have a good place or reason to start down that path yet. I’ve been on Silverblue for years, but decided to try Workstation for now. Someone in the past told me I should have been using BTRFS for FreeCAD saves, but I never got around to trying it.
Have an lxc config that enables glx on x11 in the container, spin one up and throw stuff in there, temp zfs volume.
Lxc-rm when done.
software like stow keeps track of files installed, and helps you remove it later
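To illustrate the idea behind stow-style tools: each package is installed into its own directory, and only symlinks are placed in the shared tree, so removing a package means removing its links. The sketch below is a simplified illustration of that technique, not how GNU Stow itself is implemented; the function names and paths are hypothetical.

```python
import os
from pathlib import Path
from typing import List


def link_package(package_dir: str, target_dir: str) -> List[Path]:
    """Symlink every file under package_dir to the same relative path in target_dir."""
    package = Path(package_dir).resolve()
    target = Path(target_dir)
    created = []
    for dirpath, _dirs, files in os.walk(package):
        for name in files:
            src = Path(dirpath) / name
            dest = target / src.relative_to(package)
            dest.parent.mkdir(parents=True, exist_ok=True)
            if dest.exists() or dest.is_symlink():
                # Two packages want the same file; stop rather than overwrite.
                raise FileExistsError(f"refusing to overwrite {dest}")
            dest.symlink_to(src)
            created.append(dest)
    return created


def unlink_package(package_dir: str, target_dir: str) -> List[Path]:
    """Remove only the symlinks in target_dir that point into package_dir."""
    package = Path(package_dir).resolve()
    removed = []
    for dirpath, _dirs, files in os.walk(target_dir):
        for name in files:
            path = Path(dirpath) / name
            if path.is_symlink() and package in path.resolve().parents:
                path.unlink()
                removed.append(path)
    return removed
```

The FileExistsError branch is where the "two packages want to install the same file" conflict shows up. Links created before the conflict are left in place, so a real tool would check for conflicts first.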
I’ve never worried about this but I’d use Flatpak. The whole install goes in a specific directory and the metadata/config/data files go in their own specific directory.
Those Flatpak configs are not quite as scattered, most are in .config .var or .local. Most Flatpaks leave junk behind in these directories. I just deleted a few today. A lot of the problems start happening when you need to compile stuff where each package has the same dependency but a different version of the dep in each one. Then you have a problem and need to track down some related library that is not in the execution path and suddenly there are 10 copies of a dozen files all related to the stupid thing on your system and scattered all over the place. It becomes nearly impossible to track down which file is related to the container with the problem.
This is only an issue if you find yourself playing in software that is not yet supported directly by any packagers for Linux distros; stuff like FOSS AI right now.
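For tracking down which files an install scattered around, one low-tech approach is to snapshot a directory tree before and after and compare the two. A minimal sketch with made-up function names; it compares modification times only, so it will not notice a file rewritten with an identical timestamp.

```python
import os
from typing import Dict, Set, Tuple


def snapshot(root: str) -> Dict[str, float]:
    """Map every file path under `root` to its modification time."""
    seen: Dict[str, float] = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                seen[path] = os.lstat(path).st_mtime
            except OSError:
                # The file disappeared while walking; ignore it.
                continue
    return seen


def diff_snapshots(
    before: Dict[str, float], after: Dict[str, float]
) -> Tuple[Set[str], Set[str]]:
    """Return (created, modified) paths between two snapshots."""
    created = set(after) - set(before)
    modified = {p for p in after if p in before and after[p] != before[p]}
    return created, modified
```

Taking snapshot(os.path.expanduser("~")) before an install, another afterwards, and passing both to diff_snapshots gives a list of candidate files to clean up later.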
If it does not need a GUI, use docker and log into it. Do the stuff and when you are done, docker rm and everything disappears.
you can enable cuda inside the container, follow the docs for that.
bonus point, vs code can open itself inside a container.
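As a rough sketch of what that throwaway container invocation might look like, here is a small helper that only assembles the docker run command line without running anything. The helper name, the /work mount point, and the image name are hypothetical. --gpus all assumes the NVIDIA container tooling is installed on the host, and --network host is one way to give the container access to services on localhost.

```python
import shlex
from typing import List, Optional


def docker_run_argv(
    image: str,
    workdir: Optional[str] = None,
    gpus: bool = True,
    host_network: bool = False,
) -> List[str]:
    """Build a `docker run` command for a disposable, interactive container."""
    argv = ["docker", "run", "--rm", "-it"]  # --rm deletes the container on exit
    if gpus:
        argv += ["--gpus", "all"]            # expose the host's NVIDIA GPUs
    if host_network:
        argv += ["--network", "host"]        # share the host's network stack
    if workdir:
        # Bind-mount a host directory so results survive the container.
        argv += ["-v", f"{workdir}:/work", "-w", "/work"]
    argv.append(image)
    return argv


if __name__ == "__main__":
    # "example/cuda-image:latest" is a placeholder, not a real image.
    print(shlex.join(docker_run_argv("example/cuda-image:latest", workdir="/tmp/scratch")))
```

With --rm the container is removed on exit, so the separate docker rm step is only needed when the container was started without it. The image itself stays on disk until it is removed too.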
You can use GUI stuff in docker as well, though it can be a bit fiddly to set up.
Haven’t tried it (and don’t use docker), so a wild shot: https://github.com/jupyterhub/repo2docker
‘repo2docker fetches a repository (from GitHub, GitLab, Zenodo, Figshare, Dataverse installations, a Git repository or a local directory) and builds a container image in which the code can be executed. The image build process is based on the configuration files found in the repository.’
That way you can perhaps just delete the docker image and everything is gone. Doesn’t seem to depend on jupyter…