kalap_ur's comments | Hacker News

How do you diversify now? I presume you don't mean a stock portfolio, do you?

Stocks are fine for diversification, just stocks with different risk factors. Back in the '90s I had been working at Sun, then did a couple of startups, and all of my 'investment' savings (which I started with stock from the employee purchase plan at Sun) were in tech of one kind or another. No banking stocks, no pharmaceutical stocks, no manufacturing-sector stocks. Just tech, and more precisely Internet technology stocks. So when the Internet bubble burst, every stock I owned dropped rapidly in price.

One of the reasons I told myself I "couldn't" diversify was that if I sold any of the stock to buy different stock, I'd pay a lot of capital gains tax, the IRS would take half, and I'd only be half as wealthy.

Another reason was my management telling me I couldn't sell my stock during "quiet" periods (even though they seemed to), so sometimes when I felt like selling it I "couldn't."

These days, especially with companies that do not have publicly traded stock, it is harder than ever to diversify. The cynic in me says they are structured that way so that employees are always the last to get paid. It can still work, though: you just have to find a way to option the stock you are owed on a secondary market. Not surprisingly, there are MBA types who really want a piece of an AI company and will help you do that.

So now I make sure that not everything I own is in one area. One can do that with mutual funds, and to some extent with index funds.

But the message is: if you're feeling "wealthy" and maybe paying your mortgage by selling some stock every month, you are much more at risk than you might realize. One friend who worked at JDS Uniphase back in the day simply sold their stock and bought their house; another kept their stock so that it could "keep growing" while selling it off in bits to pay their mortgage. When JDSU died, the second friend had to sell their house and move because they couldn't afford the mortgage payments on their salary alone. Now a new generation is getting to make these choices, and I encourage people in this situation to be open to the lesson.


Yeah. Hyperscalers building out compute capacity have become asset-heavy businesses. Today's Google, MSFT, and META are completely different from what they were 10 years ago, and the market has not repriced that yet. These are no longer asset-light businesses.

I wonder if there is somebody here high up in the MSFT stack who understands the tech/code but also has enough of a view over the rest of the business to be able to opine.

Great perspective and wisdom nuggets.

Well, this sounds like a "no shit Sherlock" statement:

> Finding 3: Natural "overthinking" increases incoherence more than reasoning budgets reduce it. We find that when models spontaneously reason longer on a problem (compared to their median), incoherence spikes dramatically. Meanwhile, deliberately increasing reasoning budgets through API settings provides only modest coherence improvements. The natural variation dominates.

Language models are probabilistic, not deterministic. Therefore incoherence increases, by definition, as a response becomes lengthier. This is not true for humans, who tend to act and communicate deterministically. If I ask a human to read a PDF and then ask, "is the word 'paperclip' in the PDF?", the human will deterministically give a yes/no answer, and no matter how many times we repeat the process, they will give the same answer consistently (and not due to autocorrelation, because this can be done across different humans). LMs will give a probabilistic response, dependent on the training itself: with a very well-trained model we might get a 99% probability of the correct outcome, which means that out of 100 runs, it will give the wrong answer once. We have no real handle on this "probabilistic" component of LMs, but simulations could be done to research it. I would also be very curious about autocorrelation in models: if a human did a task and came to the conclusion "yes", they will always respond to the same task, with an increasing amount of eye-rolling, "yes."
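A minimal sketch of the simulation I mean, in Python. `ask_model` is a hypothetical stand-in for whatever LLM API you use (not a real library call), and the question is just an example:

    # Estimate how deterministic a model is on a fixed yes/no question by
    # repeating the identical query and measuring majority-answer agreement.
    from collections import Counter

    def ask_model(question: str) -> str:
        # Hypothetical: call your LLM of choice, normalize reply to "yes"/"no".
        raise NotImplementedError

    def consistency(question: str, n_trials: int = 100) -> float:
        answers = [ask_model(question) for _ in range(n_trials)]
        _, count = Counter(answers).most_common(1)[0]
        return count / n_trials  # 1.0 = perfectly deterministic

    # consistency('Is the word "paperclip" in this PDF?')
    # A well-trained model might score ~0.99, i.e. one wrong answer in 100.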

Also, imagine the question "is the sky blue?" Answer 1: "Yes." This has zero incoherence. Answer 2: "Yes, but sometimes it looks black, sometimes blue." While this answer also seemingly has zero incoherence, its probability of incoherence is greater than zero, because answer generation itself is probabilistic and the answer is longer. Answer generation by humans is not probabilistic.
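To make the length point concrete: if each generated step stays coherent with some independent probability p (the numbers below are illustrative, not measured), the chance that the whole answer stays coherent decays geometrically with length:

    # Illustrative only: assume each generated step is coherent with
    # independent probability p; a whole n-step answer is then fully
    # coherent with probability p**n.
    for p in (0.999, 0.99):
        for n in (10, 100, 1000):
            print(f"p={p}, length={n}: P(fully coherent) = {p**n:.3f}")

    # p=0.99 at length 100 already gives ~0.366: longer answers degrade fast.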

Therefore, probability-driven LMs (and all LMs today are probability-driven) will always exhibit higher incoherence than humans.

I wonder if anybody would disagree with the above.


Well, Linux reached ~5% market share in 2025. Imagine the incremental market share they could still capture. https://www.reddit.com/r/linux/comments/1lpepvq/linux_breaks...

My only issue is that I am not a developer. I am heavily reliant on Excel, I know it inside and out, and I'm just not sure whether OpenOffice supports Excel files. In the past it barely did.


LibreOffice does fine, though you’ll probably be unhappy. What is more important to you? Freedom, privacy, consent, or spreadsheet features?

VMs are an option to partition your life as well.


There are many Excel features that LibreOffice Calc doesn't support, most importantly structured references, VBA, and Power Query. Not to mention its UI is very laggy even on powerful machines.

For real financial/business work, Calc is just not a serious player.


I even had to switch my reading-list spreadsheet over from LibreOffice to Excel when the former started seriously lagging at about 250 rows total.

I have a spreadsheet I've been using since 2017 to track all my spending and savings accounts on a weekly basis, plus some trend analytics and some simple graphs across multiple sheets. A few hundred rows and columns, both entered and calculated values (simple formulas, nothing fancy), and I haven't noticed any slowness. When I have some data to look at (a .csv or even .xlsx), I always use Calc. I work with Excel at my job all the time; it might be faster on larger data sets, but LibreOffice Calc is more than enough for many use cases.

I think there is a recent performance regression, but hopefully it will be fixed soon. It hasn't affected me. Learn Python, it's much better than VBA.
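As a minimal sketch of the kind of task people usually reach for VBA macros for, here it is in Python with openpyxl; the filename and sheet layout are made up for illustration:

    # Open a workbook, total the amounts in column B, write the result back.
    # Assumes openpyxl is installed (pip install openpyxl).
    from openpyxl import load_workbook

    wb = load_workbook("spending.xlsx")   # hypothetical file
    ws = wb.active

    # Sum column B, skipping the header row and any non-numeric cells.
    total = sum(
        row[1] for row in ws.iter_rows(min_row=2, values_only=True)
        if isinstance(row[1], (int, float))
    )
    ws["D1"] = "Total"
    ws["D2"] = total
    wb.save("spending.xlsx")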

Fedora + Google's Office Suite is the best way.

Don't bother with LibreOffice. It's trash. I'm convinced that Microsoft is deliberately sabotaging the project.


It is not the scale that matters in your example, but intent. With 1 joint, you want to smoke it yourself. With 400, you very possibly want to sell to others. Scale in itself doesn't matter; scale matters only to the extent it changes what your intention may be.


> It is not the scale that matters in your example, but intent. With 1 joint, you want to smoke it yourself. With 400, you very possibly want to sell to others. Scale in itself doesn't matter; scale matters only to the extent it changes what your intention may be.

It sounds, then, like you're saying that scale does indeed matter in this context, as every single piece of writing in existence isn't being slurped up purely to learn; it's being slurped up to make a profit.

Do you think they'd be able to offer a useful LLM if the model was trained only on what an average person could read in a lifetime?


It's common knowledge among LLM experts that the current capabilities of LLMs arise as emergent properties of training transformers on reams and reams of data.

That is the intent of scale: to push LLMs to this point of "emergence". Whether or not it's AGI is a debate I'm not willing to entertain, but pretty much everyone agrees there's a point where the scale flips a transformer from being an autocomplete machine to something more than that.

That is the legal basis for why companies go for scale with LLMs. It's the same reason people are allowed to own knives even though knives are known to be useful for murder (as a side effect).

So technically speaking, these companies have legal runway in terms of intent. Making an emergent and helpful AI assistant is not illegal, and making a profit isn't illegal either.


Right, but in the weed analogy, the scale is used as a proxy to assume intent. When someone is caught with those 400 joints, the prosecution doesn't have to prove intent, because the law has that baked in already.

You could say the same about LLM training: that doing so at scale implies the intent to commit copyright infringement, whereas reading a single book does not. (I don't believe our current law sees it this way, but it wouldn't be inconsistent if it did, or if new law were written to make it so.)


It's clear that Nvidia and every single one of these big AI corps do not want their AIs to violate the law. The intent is clear as day here.

Scale is only used for emergence: OpenAI found that training transformers on the entire internet would make them more than just next-token predictors, and that is the intent everyone is going for when building these things.


I don't think that's clear at all. Businesses routinely break the law if they believe the benefits in doing so will outweigh the consequences.

I think this is even more common and more brazen when it comes to "disruptive" businesses and technologies.


> Businesses routinely break the law if they believe the benefits in doing so will outweigh the consequences.

I'm saying there's a collective incentive among businesses to restrict LLMs from producing illegal output. That is aligned and ultra clear. THAT was my point.

But if LLMs produce illegal output as a side effect and it can't be controlled, then your point comes into play, because now they have to weigh the cost and benefit, as they don't have a choice in the matter. But that wasn't what I was getting at; that's a new point, which you introduced here.

In short, it is clear that all corporations do not want LLMs to produce illegal content and are actively trying to restrict it.


If even one exact sentence is taken out of a book and not referenced with quotes and the exact source, that triggers copyright law. So the model doesn't have to reproduce the entire book; it is only required to reproduce one specific sentence (which may be a sentence characteristic of that author or that book).


> If even one exact sentence is taken out of a book and not referenced with quotes and the exact source, that triggers copyright law.

Yes, and that's stupid, and will need to be changed.


Sure, but that use would easily pass a fair use test, at least in the US.


You can only read the book if you purchased it. Even if you don't have the intent to reproduce it, you must purchase it. So I guess NVDA should just purchase all those books, no?


Yep, I agree. That's the part that's clearly illegal. They should have purchased the books, but they didn't.


This is the bit an author friend of mine really hates. They didn’t even buy a copy.

And now AI has killed his day job writing legal summaries. So they took his words without a license and used them to put him out of a job.

Really rubs in that “shit on the little guy” vibe.


Obviously not; one can borrow books from libraries and read them as well.


That's true. But the book itself was legally purchased. So if Nvidia went to the library and trained its AI on borrowed books, that should technically be legal.


Do you have the same legal rights to something that you've borrowed as you do with something you've purchased, though?

Would it be legal for me to borrow a book from the library, then scan and OCR every page and create an EPUB file of the result? Even if I didn't distribute it, that sounds questionable to me. Whereas if I had purchased the book and done the same, I believe that might be ok (format shifting for personal use).

Back when VHS and video rental was a thing, my parents would routinely copy rented VHS tapes if we liked the movie (camcorder connected to VCR with composite video and audio cables, worked great if there wasn't Macrovision copy protection on the source). I don't think they were under any illusions that what they were doing was ok.


Well, if I copied it word for word, maybe. But if I read it and "trained" it into my brain, then it's clearly not illegal.

So the grey area here is: if I "trained" an LLM in a similar way and didn't copy it word for word, is that legal? Because fundamentally speaking, it's literally the same action.


I paid $150 for a 64GB DDR5 kit in Jan 2025. Today it is $830, which is 5.5x.


What are the specs of the kit?

