Great stuff, good catch on the syscall! I've been tinkering with C coroutines as well recently but since our critical code runs in kernel space they ended up being more hassle than they're worth. I'm still using them in our (user space) testing sandbox.
Since you mention malloc()ing stacks: you might want to allocate them with mmap() and MAP_ANONYMOUS instead. You can then map the adjacent memory pages as inaccessible (PROT_NONE) guard pages, so a stack overflow faults immediately instead of silently corrupting neighbouring memory. I'm not aware of any drawbacks (malloc itself typically uses mmap above a certain size), and it certainly beats hoping your stacks are big enough. Less of an issue on 64-bit environments of course.
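For what it's worth, here is a rough sketch of the guard-page idea, not a drop-in implementation. The names coro_stack, coro_stack_alloc and coro_stack_free are made up for illustration, and it assumes a POSIX system with MAP_ANONYMOUS and a stack that grows downward (so a single guard page below the stack is enough).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical descriptor for one coroutine stack plus its guard page. */
struct coro_stack {
    void  *map;      /* start of the whole mapping (this is the guard page) */
    size_t map_len;  /* guard page + usable stack, in bytes */
    void  *base;     /* lowest usable stack address */
    size_t size;     /* usable bytes, rounded up to whole pages */
};

/* Returns 0 on success, -1 on failure. */
int coro_stack_alloc(struct coro_stack *s, size_t min_size)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || min_size == 0)
        return -1;

    size_t page = (size_t)ps;
    if (min_size > SIZE_MAX - 2 * page)   /* avoid overflow when rounding */
        return -1;

    size_t size    = (min_size + page - 1) / page * page;
    size_t map_len = size + page;         /* one extra page for the guard */

    /* Reserve everything as inaccessible first. */
    void *map = mmap(NULL, map_len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;

    /* Open up the pages above the guard; the lowest page stays PROT_NONE,
     * so running off the bottom of the stack raises SIGSEGV. */
    if (mprotect((char *)map + page, size, PROT_READ | PROT_WRITE) != 0) {
        munmap(map, map_len);
        return -1;
    }

    s->map     = map;
    s->map_len = map_len;
    s->base    = (char *)map + page;
    s->size    = size;
    return 0;
}

void coro_stack_free(struct coro_stack *s)
{
    munmap(s->map, s->map_len);
}
```

With a downward-growing stack the initial stack pointer would be (char *)s.base + s.size. A single guard page only catches overflows that actually touch it; one large local array can jump straight over it, so this is a safety net rather than a guarantee.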