When you can't have Apache Lucene, because you are using PHP/RoR/Django/node.js/whatever, you can always have "basic text indexing and search".
That's bad news for users of all those PHP/RoR/Django/node etc. apps, who will never get proper on-site search functionality.
The majority of lazy devs won't go for a Solr-like solution.
My environment of choice is Python. I've used Solr before, but for a project a while back I used the pure-Python Whoosh: http://packages.python.org/Whoosh/
The intention was to quickly develop the extra search pieces we needed in Python and then port them to Solr. (For example, we needed a custom scoring mechanism, and needed to experiment with spelling errors, pronunciation equivalency, etc.) However, Whoosh turned out to be performant enough that I didn't need to touch Solr again (XML config files always make me judder!).
So if you use Python, I strongly recommend giving Whoosh a go, especially when starting out on a project, as you'll be more productive.
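For what it's worth, the kind of spelling-error experiment mentioned above is easy to prototype in plain Python before committing to any engine. This is only a rough sketch using the standard library's difflib, not how Whoosh or Solr do it; the vocabulary, the function names and the 0.75 cutoff are all made up for illustration.

```python
import difflib

# Hypothetical vocabulary; in a real system this would come from the index's term list.
VOCABULARY = ["search", "index", "query", "scoring", "stemming", "analyzer"]


def suggest(term, vocabulary=VOCABULARY, cutoff=0.75):
    """Return the closest known term for a possibly misspelled one, or None."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def correct_query(query):
    """Lowercase the query and swap unknown words for their closest known term."""
    corrected = []
    for word in query.split():
        lowered = word.lower()
        if lowered in VOCABULARY:
            corrected.append(lowered)
        else:
            # Fall back to the original word when nothing is close enough.
            corrected.append(suggest(lowered) or lowered)
    return " ".join(corrected)


if __name__ == "__main__":
    print(correct_query("serch indx"))
```

Tuning the cutoff is the experiment: a lower value corrects more typos but also starts rewriting words that were fine.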
This is just flat-out misinformation. You can use Lucene from pretty much any environment. In Rails it is utterly trivial to integrate Solr/Lucene; it's probably about 2 or 3 lines of code. I assume it's similar for other frameworks.
What host doesn't let you run Java software? I did a dry run a long time ago with Solr and Ubuntu under VMware Fusion in a half-gig VM (or maybe 684 MB), and my impression is that Solr won't run well in limited memory (Sphinx works fine), but it's been a while.
You can't run Solr/Lucene properly on Google App Engine (just an example). Other software may run better in such circumstances, but its quality is questionable.
We used it only for Spanish, and it worked well enough (and fast enough). We deployed on bare metal, so no hosting provider (apart from the rack space) was in the middle. If you are doing things this "difficult", you'll need at least a VPS, of course.
Thanks to rake, brew, etc., Thinking Sphinx, Sunspot and Elasticsearch/Tire are all pretty easy with Rails if you want default indexes on English-language docs. It all gets complex quickly when you start layering on multiple search strategies and indexes, n-gram search, ISO Latin to ASCII conversion, etc., not to mention the S word, "scaling", and anything near real-time index updates.
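To make two of those items concrete, here is a small stdlib-only Python sketch of accent folding (the "Latin to ASCII" step) and character n-gram matching. It is an illustration under simplifying assumptions, not what Sphinx, Solr or Elasticsearch actually do internally; the function names and the `$` padding character are my own choices.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters (NFKD) and drop the combining marks.

    Note: letters without a decomposition (e.g. 'ß' or 'ø') pass through unchanged,
    so this is accent folding rather than a full ASCII transliteration.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def char_ngrams(token, n=3):
    """Return the set of character n-grams of a token padded with '$' on both sides."""
    padded = "$" + token + "$"
    if len(padded) <= n:
        return {padded}
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of the n-gram sets of two accent-folded, lowercased strings."""
    grams_a = char_ngrams(strip_accents(a).lower(), n)
    grams_b = char_ngrams(strip_accents(b).lower(), n)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    print(strip_accents("canción"))
    print(ngram_similarity("Canción", "cancoin"))
```

Even this toy version shows where the complexity comes from: folding has to happen identically at index time and query time, and the n-gram size trades index size against tolerance for typos.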
As long as there is a better one, yes. Why not? There are Elasticsearch clients that pretty much plug into ActiveRecord, and Elasticsearch is pretty darn good.
>Personally, for a lot of use cases I prefer exact string matches over BS stem indexing.
Really? I've worked on a few search projects in different spaces in the past (venues, aka places/stores; source code; and products), and while exact string matches are often a good sign of quality, stemming and other analyzers make huge improvements in recall (and when measuring transaction volume in A/B testing, strict string matching performed substantially worse). Certainly, if you throw out the exact-match signal (i.e. only index stemmed terms), I've seen that result in a deterioration of quality. What sort of data do you work with?
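A toy Python sketch of what I mean by keeping both signals: index each word under its exact form and its stemmed form, let the stemmed match provide recall, and give exact matches a boost. The stemmer here is a deliberately crude suffix stripper (nothing like Porter or the Lucene analyzers), and the class name and the 2.0 boost are invented for illustration.

```python
from collections import defaultdict


def toy_stem(word):
    """Crude suffix stripping, for illustration only (not a real stemmer)."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


class TwoFieldIndex:
    """Inverted index that stores every word under both its exact and stemmed form."""

    def __init__(self, exact_boost=2.0):
        self.exact = defaultdict(set)
        self.stemmed = defaultdict(set)
        self.exact_boost = exact_boost

    def add(self, doc_id, text):
        for word in text.lower().split():
            self.exact[word].add(doc_id)
            self.stemmed[toy_stem(word)].add(doc_id)

    def search(self, query):
        """Return (doc_id, score) pairs, best first; ties are broken by doc_id."""
        scores = defaultdict(float)
        for word in query.lower().split():
            # A stemmed match gives recall: 'boots' also finds documents with 'boot'.
            for doc_id in self.stemmed.get(toy_stem(word), ()):
                scores[doc_id] += 1.0
            # An exact match is an extra quality signal on top of the stemmed one.
            for doc_id in self.exact.get(word, ()):
                scores[doc_id] += self.exact_boost
        return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    index = TwoFieldIndex()
    index.add(1, "hiking boots")
    index.add(2, "boot sale")
    print(index.search("boots"))
```

With exact matching alone, a query for "boots" would never see the "boot sale" document; with stemming alone, the two documents would tie and the exact hit would lose its edge.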