fusrodaftpunk

  • 0 Posts
  • 8 Comments
Joined 2 years ago
Cake day: June 10th, 2023

  • Re: c) I will be a dirty shill for VSCode and R lol, example here. I find it much better for R Shiny development, projects with multiple people and projects with multiple languages. Notebook support is less good out of the box, and you will have to get a Jupyter kernel set up - but I use scripts more than notebooks anyway.

    Anyway, onto the question! Base R. Yeah, I said it! Whenever I hit a situation weird enough that tidyverse functions won’t work because of poor-quality data, I shed a single solemn tear and quietly wish I had done the project in Python as I start writing a for loop in what will no doubt be the most hacky solution ever.



  • +1 for Parquet and Arrow. If you’re pushing the limits of memory, it’s better to just treat it as a completely out-of-memory problem. If you can split the data into multiple Parquet files with hive-style or directory partitioning, it will be more efficient. You don’t want the Parquet files to be too small though (I’ve heard people say 1 GB per file is ideal, and colleagues at work like 512 MB per file - but that’s on an AWS setup).

    The bonus is that once you’ve learned the packages, the workflow is the same for any big dataset that doesn’t fit in memory.
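
To make the partitioning and file-size advice a bit more concrete, here is a rough Python sketch using only the standard library. It doesn't read or write any Parquet data. It only shows what a hive-style directory layout looks like and how many files a dataset would need at a given target size. The function names, the `data/events` root, the `year`/`month` keys and the 512 MB default are all made up for illustration, so treat it as a sketch rather than how Arrow itself does it.

```python
import math
from pathlib import PurePosixPath


def hive_partition_path(root, partitions, filename="part-0.parquet"):
    """Build a hive-style path, e.g. root/year=2023/month=6/part-0.parquet.

    `partitions` is a dict of partition column -> value (hypothetical keys).
    Each key/value pair becomes one `key=value` directory level.
    """
    path = PurePosixPath(root)
    for key, value in partitions.items():
        path = path / f"{key}={value}"
    return str(path / filename)


def files_needed(total_bytes, target_file_bytes=512 * 1024**2):
    """Number of files so that no file exceeds the target size (at least 1)."""
    return max(1, math.ceil(total_bytes / target_file_bytes))


# Hypothetical example: events data partitioned by year and month.
print(hive_partition_path("data/events", {"year": 2023, "month": 6}))
# A 10 GiB dataset split into files of roughly 512 MiB each.
print(files_needed(10 * 1024**3))
```

With a layout like this, a query that filters on `year` or `month` only has to touch the matching directories, which is where the efficiency gain comes from.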






  • Just getting the hang of it. The Jerboa app for Android is very nice to have - I sometimes find it hard to search for a community on another Lemmy server even when I know the name (I think the search is case sensitive?). That’s on the mobile web interface… Haven’t figured out how to find a new community on Jerboa itself yet.