I like this idea, but I do not really understand who it is aimed at. The focus seems to be on reproducibility of scientific experiments and code, which is great! Many existing code artifacts are WOGSL (Works On Graduate Student's Laptop), which is the CS equivalent of "runs when parked".
So, let's break down the fields of CS for which this should be applicable:
* Systems: This won't work except for the few systems projects that are entirely in-RAM AND will work on tinycore's kernel version
* ML: This, I can see, especially with the seeming focus on dataset management. Much ML is compute-bound, and the overhead of using the FUSE filesystems is hopefully negligible.
So, is this focused on ML and ML-using code and experiments? If so, I think that should be clarified. I think a lot of systems folk will be (rightly or wrongly) turned away from it due to the seeming overhead of the various hyper* extensions. Not to mention that they are all written in Node/JS (again, rightly or wrongly, many systems folk will not want to run their stuff on platforms written in JS).
I like the direction this project can go, but there seems to be a lack of focus or direction in your mission right now.
> So, let's break down the fields of CS for which this should be applicable:
> So, is this focused on ML and ML-using code and experiments?
You're completely missing the point. Please look into 'Computational Science' (or Scientific Computing, or Numerical Analysis), which applies to 80%+ of all disciplines that exist today (e.g., computational physics, comp. biology, comp. economics, comp. aspects of engineering disciplines; the list goes on).
Yup, I can see it for computational experiments or "applied CS" fields. I realized this soon after I posted the comment, but I didn't bother to update my comment.
However, this still isn't clear on their website. I will give them the benefit of the doubt since they are early in their project, but I think it would behoove them to nail down their mission sooner rather than later.
This is probably what I get for being in the CS bubble. =)
Well, to their credit, they did mention "scientific research reproducibility", which is a very well-known phrase in computational circles.
But I agree, it would help if they expand on this from the pure CS point of view. Especially if they mention things like containers, CS people would be interested in finding out what they're up to.
I guess you could also say that CS is one of those "applied math" fields :)
Seriously though, this kind of platform is a critical component in scientific reproducibility. The dream is that we can have code, data, and the results of the composition of the two in the same revision control system. A minimal layer to allow the execution of Linux software would support the use of legacy code and binaries in this new platform. JavaScript has its advantages, but it's a waste to build a data RCS and require all functions on the data to be written in it.
And to go a bit further, it's not just for science. For example, you could write a HN clone in dat. I could fork it and get both your code and all the posts.