I am looking to replace my old PC, and wondering what other people use.
Do you use your own hardware? If so, what do you have? What do you think gives you the most bang for your buck at the moment?
Do you use the cloud instead? If so, why? Which service(s) do you use?
Thank you!
Cloud is really cheap. Lambda Labs is great, but I mostly use my university-supplied compute.
I think major training should just be done on dedicated servers or in the cloud. That being said, it is very helpful to test locally: if you are planning on using Nvidia-equipped servers, just get any somewhat recent consumer Nvidia card, and you can always run locally on some sample data and debug much more easily.
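The "test locally on sample data" idea is framework-agnostic: run a few training steps on a tiny batch and check that the loss actually drops before shipping the job to a server. Here's a minimal sketch using a toy pure-Python logistic regression as a stand-in for the real model (the function names are mine, just for illustration):

```python
import math
import random

def toy_train_step(w, b, xs, ys, lr=0.1):
    """One full-batch gradient step of logistic regression."""
    gw = [0.0] * len(w)
    gb = 0.0
    for x, y in zip(xs, ys):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        err = p - y
        for i, xi in enumerate(x):
            gw[i] += err * xi
        gb += err
    n = len(xs)
    w = [wi - lr * gi / n for wi, gi in zip(w, gw)]
    b = b - lr * gb / n
    return w, b

def loss(w, b, xs, ys):
    """Mean binary cross-entropy on the batch."""
    eps = 1e-9
    total = 0.0
    for x, y in zip(xs, ys):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(xs)

# Smoke test: on a tiny sample batch, loss should drop after a few steps.
random.seed(0)
xs = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(8)]
ys = [1 if x[0] + x[1] > 0 else 0 for x in xs]
w, b = [0.0, 0.0], 0.0
before = loss(w, b, xs, ys)
for _ in range(50):
    w, b = toy_train_step(w, b, xs, ys)
after = loss(w, b, xs, ys)
assert after < before
```

Swap in your real model and a handful of real samples; if the loss doesn't move on a tiny batch locally, there's no point burning server hours.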
I second that. Being able to test medium-sized models locally can make debugging much easier.
I have a 3070 with 8GB VRAM, which can train e.g. GPT-2 with a batch size of 1 in full precision.
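For a rough sense of why that fits in 8GB: a common back-of-envelope rule (my assumption here, not the poster's numbers) is that full-precision training with Adam needs fp32 weights, gradients, and two optimizer moment buffers, i.e. about 16 bytes per parameter, plus activations on top:

```python
def training_memory_gb(n_params, bytes_per_param=16):
    """Rough steady-state memory for weights + grads + Adam state.

    16 bytes/param = 4 (fp32 weights) + 4 (grads) + 8 (two Adam moments).
    Activations come on top and scale with batch size and sequence length.
    """
    return n_params * bytes_per_param / 1024**3

# GPT-2 small has ~124M parameters.
print(round(training_memory_gb(124e6), 2))  # ~1.85 GB before activations
```

That leaves a few GB of the 3070's 8GB for activations, which is why batch size 1 works but larger batches quickly don't.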
SO Edge by Pine! Fantastic performance!
Interesting! How do you use it? Do you connect it to your main PC? How?
Also, what RAM does it use? Does it use the main system RAM?
I have a mixed approach: I have a laptop with an RTX 3060 (surprisingly good for small models and dev work). There are a few beefy servers I have access to through work/school, which I leverage when I need more resources.
If I didn’t have access to work/school servers, I would likely go with cloud or build a new desktop for myself. Most of my work has been with time series forecasting and anomaly detection, so the models tend to be smaller. If you need bigger models, this wouldn’t work well for you.
I would suggest just getting a laptop and nice external peripherals (keyboard/mouse if you prefer, a nice monitor) + a remote server. I bought a desktop + GPU setup back when I started my master’s, but I use it shockingly little for work. The type of work that a single GPU + local machine incentivizes is usually against good scientific and experimental practice. You don’t really want that machine running jobs during the day.
As for specific cloud recommendations, I have none. I just use what is available at my institution.
The type of work that a single GPU + local machine incentivizes is usually against good scientific and experimental practice
TFW working on models for small embedded systems :( Honestly though, I’d love to have a cluster to remote into just to train even more at once, but I can’t for privacy reasons.
Depends on the use case, I guess. If any larger-scale deep learning is going on, you can’t afford to buy all the required GPUs anyway.
However, I found myself using my tower PC quite a lot during my master’s. Especially for uni projects, my GPU came in very handy and was much appreciated by group members. Having your own GPU was often more convenient than using the resources provided by the lab.
Also, while relying mostly on cloud resources in my last job, I would have found having a GPU available on my work machine very convenient at certain times. It’s very nice for EDA and playing with models during the early phase of a project.
Beyond that, IMO a good CPU and >32 GB of RAM on your own machine are sufficient for EDA and related tasks, while I would rely on cloud resources for everything else, e.g., model training and large-scale analyses.
I do 95% of my personal stuff on a desktop with a GTX 1070, often remoting into it from a laptop. Someday soon I’ll throw a bigger GPU in, but the 1070 has served me well for years.
I find the sunk cost of building a machine encourages me to use it more. I don’t mind running something for a week even if I have no idea if it’ll work or not.
Same deal at work, but with much beefier hardware. In both cases, I’ll spin up a cloud instance if I want some results faster.
Thank you! How much does the GTX 1070 help, compared to simply running your training runs on a recent CPU?
Yes, the 1070 is substantially faster than a CPU. Without benchmarking, I would guess 10-20x faster than a recent consumer CPU. In reality, unless you’re interested in big NLP tasks or big computer vision models, a 1070 works just fine.
A 4090 might be 10x faster, so it turns a weekend job into an afternoon, or a month into a weekend, but plenty of real work can be done with a modest setup.
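To make the "weekend into an afternoon" arithmetic concrete (the specific job lengths are my illustrative assumptions):

```python
def sped_up_hours(hours, speedup=10):
    """Wall-clock time of a job after a constant speedup factor."""
    return hours / speedup

print(sped_up_hours(48))   # 48 h weekend job -> 4.8 h, an afternoon
print(sped_up_hours(720))  # 30-day job -> 72 h, a long weekend
```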
If I were building something from scratch on a budget, I’d look at the best 30-series Nvidia card you can afford. If you’re using TensorFlow, TF32 is usually a basically free speedup; with PyTorch it’s a bit less stable. You should be able to build a full system with a 3060 12GB for under $1000, or with a 3090 for under $2000.
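For the PyTorch side, TF32 is opt-in via two backend flags (these are real PyTorch settings; whether they help and how much depends on your model and requires an Ampere-or-newer card like the 30-series):

```python
import torch

# TF32 trades a little mantissa precision for tensor-core speed on
# Ampere+ GPUs. Recent PyTorch versions leave matmul TF32 off by
# default, so you opt in explicitly:
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True
```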